CCD-410 Cloudera Certified Developer for Apache Hadoop (CCDH)

Question 4

In a MapReduce job, you want each of your input files processed by a single map task. How do you configure a MapReduce job so that a single map task processes each input file regardless of how many blocks the input file occupies?

  • Increase the parameter that controls minimum split size in the job configuration.

  • Write a custom MapRunner that iterates over all key-value pairs in the entire file.

  • Set the number of mappers equal to the number of input files you want to process.

  • Write a custom FileInputFormat and override the method isSplitable to always return false.
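For background, the relationship between block size, the split-size settings, and the isSplitable hook can be modeled in a few lines. This is a simplified Python sketch, not Hadoop code: the function name count_splits and its parameters are hypothetical, positive file sizes are assumed, and it ignores details of the real FileInputFormat such as the slop factor applied to the last split.

```python
import math

MB = 1024 * 1024


def count_splits(file_size, block_size, min_split=1,
                 max_split=float("inf"), splittable=True):
    """Simplified model of how many input splits (and so map tasks) one file yields.

    Hypothetical helper for illustration only; it is not part of Hadoop.
    """
    if not splittable:
        # A file reported as non-splittable is handed to a single map task,
        # however many blocks it spans.
        return 1
    # Split size is the block size clamped between the minimum and maximum
    # split-size settings.
    split_size = max(min_split, min(max_split, block_size))
    return math.ceil(file_size / split_size)


if __name__ == "__main__":
    # A 300 MB file stored in 128 MB blocks, with and without splitting.
    print(count_splits(300 * MB, 128 * MB))
    print(count_splits(300 * MB, 128 * MB, splittable=False))
    # Raising the minimum split size only helps files smaller than that minimum.
    print(count_splits(300 * MB, 128 * MB, min_split=256 * MB))
```

In this model, the non-splittable case does not depend on file size, whereas a larger minimum split size still produces several splits for any file bigger than the chosen minimum.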

Question 5

You have the following key-value pairs as output from your Map task:

(the, 1)

(fox, 1)

(faster, 1)

(than, 1)

(the, 1)

(dog, 1)

How many keys will be passed to the Reducer’s reduce method?

  • Six

  • Five

  • Four

  • Two

  • One

  • Three
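Between the map and reduce phases, the framework sorts the map output and groups values by key, and reduce is invoked once per distinct key. The grouping can be sketched in Python as below. This is an illustration under the assumption of a single reducer, not Hadoop code; the helper name group_by_key is hypothetical.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Group values by key and return the groups in sorted key order,
    mimicking what the shuffle-and-sort phase delivers to one reducer."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


# Map output from the question above.
map_output = [
    ("the", 1), ("fox", 1), ("faster", 1),
    ("than", 1), ("the", 1), ("dog", 1),
]

grouped = group_by_key(map_output)

# Each iteration corresponds to one call to reduce(key, values).
for key, values in grouped.items():
    print(key, values)
print("reduce calls:", len(grouped))
```

Running the sketch prints one line per reduce call, with repeated keys merged into a single list of values.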

Question 6

You want to perform analysis on a large collection of images. You want to store this data in HDFS and process it with MapReduce, but you also want to give your data analysts and data scientists the ability to process the data directly from HDFS with an interpreted high-level programming language such as Python. Which format should you use to store this data in HDFS?

  • SequenceFiles

  • Avro

  • JSON

  • HTML

  • XML

  • CSV