
Hadoop basics




  1. Hadoop jar command error
  2. Minimum hardware for a Hadoop cluster processing more than 1 TB of data every day
  3. How many types of InputFormat is there in Hadoop?
  4. ExitCodeException while starting namenode
  5. AWS EMR 4.0 - How can I add a custom JAR step to run shell commands
  6. Large data set for hadoop
  7. Hadoop MapReduce | SMA in python
  8. Tracking-URL : N/A , RPC Port : -1
  9. Unable to Process File with multiple delimiters in Pig
  10. cassandra and hadoop - realtime vs batch
  11. Sqoop: How to grab incremental updates from MySQL
  12. Sqoop: How to grab incremental updates from MySQL
  13. Benchmark Hadoop using TestDFSIO
  14. EMR - Use custom logging appender in Hadoop (and YARN)
  15. Unstructured data into structured data using Pig
  16. Installation failed: Hortonworks HDP 2.3.0.0 on Windows Server 2008 R2
  17. Every time I restart my PC, the namenode does not start
  18. InstanceProfile is required for creating cluster
  19. How to view the FileSystem of Hadoop out of the local cluster, using webHDFS
  20. Can we split Sqoop job by multiple column combination
  21. Run custom Speculator in Hadoop 2.6.0
  22. Implementing Floyd's Algorithm in Hadoop
  23. /bin/bash: /bin/java: No such file or directory
  24. how to read parquet schema in non mapreduce java program
  25. Submit Spark job on Yarn cluster
  26. Query - Hive group by not working
  27. Is the Combiner logic always the same as the Reducer logic?
  28. Can't Integrate Solr with Nutch
  29. I want to connect my spotfire desktop version to spark
  30. Why does Hadoop open file succeeds when name-node is up but data-node is down?
  31. Aggregate Resource Allocation for a job in YARN
  32. Apache pig query to join two schemas
  33. Hadoop datanode not running
  34. Mongo Network Exceptions
  35. Oozie for multiple mapreduce jobs
  36. Indefinite pause while trying to insert data into HBase
  37. Unable to run Hive jobs from Oozie. Hive action fails
  38. Hadoop ERROR streaming
  39. NodeManager getting killed in hadoop 2.6.0
  40. Hadoop namenode replication
  41. Spark coalesce vs HDFS getmerge
  42. Stratio setup: connection refused error
  43. Hadoop says starting the datanode, starts it, but doesn't show up in jps later
  44. Cannot find yarn application logs
  45. Interactive Hive on Tez: why can I only execute one query?
  46. Limiting Map Reduce Job to use less resource
  47. How to run Java Action as oozie workflow with hue interface
  48. Running Hadoop in secure mode "Unable to obtain password from user"
  49. Partitioning the data based on column values
  50. expected org.apache.hadoop.hive.ql.io.orc.OrcStruct, received org.apache.hadoop.hive.
  51. regex extract in hive for the following scenario?
  52. Hive table join with update
  53. Context.write in mapper output prints more lines Hadoop Mapreduce
  54. Spark filtering twitter data
  55. Disabling master node from running as a slave node in Hadoop
  56. yarn timeline recovery not enabled error upgrading via ambari
  57. Hadoop - Properly sort by key and group by reducer
  58. JSON output format for Hive Query results
  59. Map Reduce with HIVE
  60. Lily Hbase Indexers exit for no reason
  61. How do I get around Found class jline.Terminal, but interface was expected error when
  62. hadoop streaming workflow multiple files
  63. ./bin/hadoop jar: Exception in thread "main" java.lang.ClassNotFoundExce
  64. Cloudera Command: hdfs dfs -put testfile.txt Failure
  65. Hive, Bucketing for the partitioned table
  66. moving datanode from CDH to mapR
  67. Split value in new row in Hive
  68. Hadoop namenode can't get out of safemode
  69. Cloudera Manager with LDAP authentication
  70. error while starting namenode and datanode
  71. gzipped custom xml file on hdfs that needs to be indexed using solr (cloudera search)
  72. For which type of parallel algorithms is Hadoop well suited?
  73. How to pull data from Mainframe to Hadoop
  74. Why Nutch is not working even after following every step?
  75. Xlsx with multiple sheets as input to mapreduce
  76. Configure default name node port in hadoop 2.6
  77. Start hbase in CDH5 VM in stan
  78. How can I count specific word using mapreduce?
  79. Namenode and Datanode not starting in hadoop
  80. BIGINT in SPARK SQL
  81. how to preprocess the data and load into hive
  82. CMake error while installing Hadoop 2.7.1 on Windows 7 SP1
  83. Hadoop replica processing
  84. cannot write a file into hdfs - getting error hdfs in safe mode
  85. I try to read index from hdfs in lucene 4.10.4, but it failed
  86. Different ways to import files into HDFS
  87. Error Sqoop import from Couchbase to Hadoop
  88. MongoDB into AWS Redshift
  89. Minimum system requirements for running a Hadoop Cluster with High Availability
  90. Hadoop mv: cannot stat
  91. Custom Calculation in Hive
  92. Convert date with milliseconds using PIG
  93. Scala exception: value registerTempTable is not a member of org.apache.spark.sql.Sche
  94. Prefix span in hadoop
  95. Comparison of array column values with normal column values in Hive
  96. Connection RStudio with Hadoop
  97. installing HDFS without MapReduce
  98. Hadoop VM installation?
  99. What is "Hadoop" - the definition of Hadoop?
  100. Hypertable and thrift installation on xampp on windows 10
  101. How to make read and write execution in MapReduce work faster?
  102. Hadoop Raw comparator
  103. Treeset is not sorting values in hadoop mapper map function
  104. how to run hadoop secret manager on minidfs
  105. yarn permission error while running sparkr jobs
  106. Not able to copy one HDFS data to another HDFS location using distcp
  107. oozie coordinator input-event does not work
  108. Ganglia monitoring in hadoop
  109. library packages not working with oozie
  110. What is the use of disable operation in hbase?
  111. AccessControlException in Hadoop for access=EXECUTE
  112. HiveServer2 generate a lot of directories in hdfs /tmp/hive/hive
  113. BigInsights service in Bluemix is not showing up when trying to bind to an App
  114. Uploading file to Hbase HDInsight
  115. SSH Error when using bdutil to create Hadoop cluster on GCE
  116. Does a copy from local directory to HDFS run a mapreduce job?
  117. Hbase Should flag exception when trying to update the key
  118. headnodehost in Azure HDInsight
  119. hadoop archive: JNI error, class not found
  120. Performance of Apache Drill
  121. Job submitted but MapReduce not working
  122. Hadoop custom record reader implementation
  123. Having multiple reducers create multiple output files in HDFS
  124. How to compile and run Spark java program using spark-class
  125. i want to path optimisation user travel my website particular time in travel in parti
  126. Pydoop error: RuntimeError: java home not found, try setting JAVA_HOME on remote serv
  127. How can I pass a dynamic date in a hive server action as a parameter
  128. Hadoop cluster setup - java.net.ConnectException: Connection refused
  129. Refresh one hive table from another hive table
  130. HDFS - load mass amount of files
  131. What is the principle of DagScheduler in Spark?
  132. How to find jar dependencies when running Apache Pig script?
  133. I tried to start up HBase
  134. Spark 1.3.1 installation
  135. Hadoop - failed redirect for container
  136. Find out actual disk usage in HDFS
  137. Hadoop building through Maven in Windows keeps failing
  138. How can I do a double delimiter(||) in Hive?
  139. Pig - Get Top n and group rest in 'other'
  140. Unable to write file on HDFS
  141. Hive: subquery on GROUP By
  142. Mapreduce: writing from both mapper and reducer in single job
  143. Error creating bean with name 'wordCountId': Invocation of init method failed; nested
  144. Min/max group-wise and filter without join in Pig
  145. Not able to install shiny ,Rhive, Rhadoop packages on R 2.15.1 on ubuntu
  146. Mahout Canopy Clustering, K-means Clustering : Java Heap Space - out of memory
  147. platform.linux_distribution from the python platform library returns (None, None, Non
  148. Apache Spark Mongo-Hadoop Connector class not found
  149. How to build Hadoop Job using Maven
  150. Is there any limit of characters in command line arguments in spark submit command?
  151. How to run spark application written in scala in CDH5.4
  152. Can we run existing program with hadoop or we need to modify it in mapreduce style on
  153. Argument type "struct" is different from preceding arguments
  154. hdfs: no such file or directory error when reading parquetfile in sparkR shell
  155. Map-reduce implementation for alternating least squares?
  156. Hive - ELSE in SUM
  157. Storm Exception TTransportException
  158. Running Hadoop Wordcount Job Error
  159. Spark-shell with 'yarn-client' tries to load config from wrong location
  160. Impala The Cloudera Manager Agent got an unexpected response from this role's web ser
  161. Logging with log4j only once every so many calls to logger.info/debug/warn() calls
  162. How to set autoflush=false in HBase table
  163. Eclipse for hadoop on Windows 7 x64
  164. STORE command in PIG
  165. Hashmap in each mapper should be used in a single reducer
  166. Running MapReduce Code on Eclipse
  167. nosuchmethod error on hadoop java
  168. Spark SQL real time on Hive
  169. Oozie processing input in multiple directories with one mapper
  170. until the end code of the input file is not
  171. getting java.lang.NullPointerException hadoop map-reduce program
  172. unexpected multiple execution of mapper intended to run once
  173. Sqoop import for Date datatype in Avro format
  174. Mitigating Hadoop's Achilles tendons
  175. java -Dlog4j.configuration command not working
  176. Why is the free space in my Hadoop cluster gone?
  177. Yarn Terasort has the same execution time for 7 and 14 worker nodes
  178. Physical location of MapR DB table
  179. Datanodes cannot connect to namenode
  180. SQL Server 2012 & Polybase - 'Hadoop Connectivity' configuration option missing
  181. Failed to connect rhive server through R using rhive.connect()
  182. Running Hadoop on Cloud platform
  183. Mismatch in no of Executors(Spark in YARN Pseudo distributed mode)
  184. Error while executing select query in Hive - how to update Hadoop version
  185. Passing parameters from one action to another in Oozie
  186. Apache Phoenix Double Datatype issue when writing MapReduce
  187. Avro Map-Reduce on oozie
  188. EMR Hive Regex QueryString values
  189. Use of core-site.xml in mapreduce program
  190. Error in using Hadoop MapReduce in Eclipse
  191. to get max value in a row from pig
  192. Pig - MAX is not working after grouping
  193. Random reads from HDFS via Indexed Namenode
  194. HDP 2.1 not able to add new users
  195. Hbase: Having just the first version of each cell
  196. Using HDFS with Apache Spark on Amazon EC2
  197. HBASE_HOME is null and causes “Could not locate executable null\bin\winutils.exe in th
  198. How to pass a file as parameter in mapreduce
  199. mapreduce job not setting compression codec correctly
  200. How to load CSV data with enclosed by double quotes and separated by tab into HIVE ta
  201. Comparing values from a TSV file and Hbase table in java
  202. Spark 1.3.0 on YARN: Application failed 2 times due to AM Container
  203. to get value using key in the map reduce
  204. Spark Task not serializable (Case Classes)
  205. Trying to execute mapreduce wordcount program in Hortonworks Sandbox 2.1 ..please te
  206. How to get name of the emp whose salary is greater than average salary of department i
  207. Giraph example ShortestPath fails
  208. BigSql escape double quotes and single quotes
  209. Spark: Group RDD by id
  210. HBase scan/filter rowkey
  211. Error getting when passing parameter through pig script
  212. Rest API for Kafka
  213. How to avoid signed zero in PIG
  214. can't yield output from mrjob (map/reduce)
  215. "Java Heap space Out Of Memory Error" while running a mapreduce program
  216. Building predictive model from HDFS OR HIVE as source of training set and test set in
  217. Namenode cannot start
  218. YarnChild address and port
  219. TotalOrderPartitioner in Mapreduce example
  220. Complex job using JobControl Hadoop
  221. How to set the configuration for running MR job in CDH5 Hue?
  222. How to use variables in hive load command
  223. Bootstrap Failure when trying to install Spark on EMR
  224. Computing median in map reduce
  225. MapReduce - reducer emits the output in one line
  226. Hadoop Pseudo-Distributed Operation error: Protocol message tag had invalid wire type
  227. Need output column-wise from row-wise records
  228. Apache PIG - Join followed by projection results in NULLs
  229. How much memory and vcore allocated on hadoop YARN?
  230. Pig error while dumping after loading
  231. Java program unable to run hadoop commands when called inside shell script
  232. How to upload twitter json data using serde in hive?
  233. Does Impala make effective use of Buckets in a Hive Bucketed table?
  234. Run Sqoop command on Windows using jar
  235. Explode function returning single row
  236. Combining clustering algorithms in MapReduce
  237. What does reduce() do without mapper() in MRJob?
  238. hadoop and java implementing tool interface
  239. Do I need to build Hadoop 2.7.0 to use it on Windows?
  240. Hadoop crashed while running terasort?
  241. How to change this file aprior.py to MapReduce python?
  242. A Very Slow response of command in hadoop
  243. Ambari server setup: 'NoneType' object has no attribute 'lower'
  244. Why does it keep showing a deprecated error when running hadoop (or dfs command)
  245. Hadoop, chaining multiple jobs in a single job
  246. How does Hadoop handle RAM when executing a query?
  247. R is not connecting to HDFS
  248. Hadoop Training chennai
  249. spark textfile load file instead of lines
  250. java.io.IOException: Job status not available about hive