Trivia Quiz: What Do You Know About MapReduce Program?

35 Questions | Total Attempts: 116


What do you know about the MapReduce program? If you need to process large amounts of data, MapReduce may well be your best option: it cuts processing time by running work in parallel while keeping results accurate. Take the quiz and see how much more there is to learn!


Questions and Answers
  • 1. 
    Which statements are false regarding MapReduce?
    • A. 

      Is the core component for data ingestion in Hadoop framework.

    • B. 

      Is the parent project of Apache Hadoop.

    • C. 

Helps to split the input data set into a number of parts and run a program on all the parts in parallel at once.

    • D. 

      The term MapReduce refers to two separate and distinct tasks.

  • 2. 
    Takes a set of data and converts it into another set of data, where individual elements are broken down into tuples (key/value pairs)
    • A. 

      Mapper

    • B. 

      Reducer

  • 3. 
Combines key/value pairs based on the key and accordingly modifies the value of the key.
    • A. 

      Mapper

    • B. 

      Reducer
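The two roles in questions 2 and 3 can be sketched in plain Python (a minimal, Hadoop-free illustration of word counting; the function names and the driver loop are mine, not part of any Hadoop API):

```python
from collections import defaultdict

def mapper(line):
    # Break a line of text into (word, 1) tuples -- the key/value pairs.
    return [(word, 1) for word in line.split()]

def reducer(key, values):
    # Combine all values that share a key into a single result.
    return (key, sum(values))

# Simulate the framework: map every line, group by key, then reduce.
lines = ["Deer Bear River", "Car Car River"]
grouped = defaultdict(list)
for line in lines:
    for key, value in mapper(line):
        grouped[key].append(value)

counts = dict(reducer(k, v) for k, v in grouped.items())
# counts -> {"Deer": 1, "Bear": 1, "River": 2, "Car": 2}
```

The grouping step in the middle stands in for Hadoop's shuffle-and-sort; in a real cluster it happens between the map and reduce phases, across machines.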

  • 4. 
    The reducer receives the key-value pair from _________ map job(s)
    • A. 

      One

    • B. 

      Multiple

  • 5. 
    The splitting parameter can be anything, e.g. splitting by space, comma, semicolon, or even by a new line (‘\n’).
    • A. 

      True

    • B. 

      False
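A quick Python illustration of how the splitting parameter changes tokenization (the sample string and delimiters are my own choices, purely illustrative):

```python
text = "one,two;three four\nfive"

by_space = text.split(" ")      # split only on spaces
by_comma = text.split(",")      # split only on commas
by_newline = text.splitlines()  # split on the new line ('\n')
```

Each choice of delimiter yields a different set of tokens from the same input, which is exactly why the splitting parameter is configurable.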

  • 6. 
This stage, combined with the Shuffle stage, makes up the second phase of a MapReduce job.
    • A. 

      Mapper

    • B. 

      Reducer

  • 7. 
__________ is used for reading files in sequence. It is a specific compressed binary file format optimized for passing data from the output of one MapReduce job to the input of another MapReduce job.
    • A. 

SequenceFileInputFormat

    • B. 

conf.setMapperClass

    • C. 

      RecordReader

    • D. 

org.apache.hadoop.mapreduce.Mapper

  • 8. 
Sets the mapper class and everything related to map jobs, such as reading data and generating key/value pairs out of the mapper.
    • A. 

SequenceFileInputFormat

    • B. 

conf.setMapperClass

    • C. 

      RecordReader

    • D. 

org.apache.hadoop.mapreduce.Mapper

  • 9. 
Loads the data from its source and converts it into key/value pairs suitable for reading by the Mapper.
    • A. 

SequenceFileInputFormat

    • B. 

conf.setMapperClass

    • C. 

      RecordReader

    • D. 

org.apache.hadoop.mapreduce.Mapper

  • 10. 
Which interface needs to be implemented to create the Mapper and Reducer for Hadoop?
    • A. 

org.apache.hadoop.mapreduce.Mapper

    • B. 

org.apache.hadoop.mapreduce.Reducer

  • 11. 
What are the main configuration parameters that users need to specify to run a MapReduce job?
    • A. 

      Job’s input and output locations in the distributed file system

    • B. 

      Job’s input and output locations in the local file system

    • C. 

      Input and output format

    • D. 

      Only the output format

    • E. 

Class containing the map and reduce function

    • F. 

      Class containing only the map function

    • G. 

      JAR file containing the mapper, reducer and driver classes

    • H. 

      JAR file containing just the mapper and reducer classes

  • 12. 
    Which of the following statements are true about key/value pairs in Hadoop?
    • A. 

      A map() function can emit up to a maximum number of key/value pairs (depending on the Hadoop environment). 

    • B. 

      A map() function can emit anything between zero and an unlimited number of key/value pairs.

    • C. 

      A reduce() function can iterate over key/value pairs multiple times. 

    • D. 

      A call to reduce() is guaranteed to receive key/value pairs from only one key.

  • 13. 
    Consider the pseudo-code for MapReduce's WordCount example (not shown here). Let's now assume that you want to determine the frequency of phrases consisting of 3 words each instead of determining the frequency of single words. Which part of the (pseudo-)code do you need to adapt?
    • A. 

      Only map()

    • B. 

      Only reduce()

    • C. 

      Map() and reduce()

    • D. 

      The code does not have to be changed.
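The point behind question 13 can be made concrete: only map() needs to change, emitting 3-word phrases instead of single words, while reduce() keeps summing counts unchanged. A sketch in plain Python (my own illustration, not Hadoop code):

```python
def map_trigrams(line):
    # Emit each 3-word phrase with a count of 1.
    # reduce() is untouched: it still just sums the counts per key.
    words = line.split()
    return [(" ".join(words[i:i + 3]), 1) for i in range(len(words) - 2)]

pairs = map_trigrams("the quick brown fox jumps")
# pairs -> [("the quick brown", 1), ("quick brown fox", 1), ("brown fox jumps", 1)]
```

Because the keys are still strings and the values still counts, the existing reducer works on trigrams exactly as it did on single words.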

  • 14. 
    Consider the pseudo-code for MapReduce's WordCount example (not shown here). Let's now assume that you want to determine the average amount of words per sentence. Which part of the (pseudo-)code do you need to adapt?
    • A. 

      Only map()

    • B. 

      Only reduce()

    • C. 

      Map() and reduce()

    • D. 

      The code does not have to be changed.
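For question 14, by contrast, both sides change: map() must emit per-sentence word counts under a single shared key, and reduce() must average rather than sum. One way to sketch it in plain Python (the key name "avg" is my invention):

```python
def map_sentence(sentence):
    # Emit the word count of each sentence under one shared key,
    # so a single reducer sees all the counts.
    return ("avg", len(sentence.split()))

def reduce_average(key, values):
    # Average instead of summing -- so reduce() changes too.
    return (key, sum(values) / len(values))

sentences = ["Deer Bear River", "Car Car River Deer Car Bear"]
values = [map_sentence(s)[1] for s in sentences]   # [3, 6]
result = reduce_average("avg", values)             # ("avg", 4.5)
```

Summing word counts alone cannot produce an average; the reducer also needs the number of sentences, which is why the WordCount reducer cannot be reused as-is.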

  • 15. 
Bob has a Hadoop cluster with 20 machines under default setup (replication 3, 128MB input split size). Each machine has 500GB of HDFS disk space. The cluster is currently empty (no job, no data). Bob intends to upload 5 terabytes of plain text (in 10 files of approximately 500GB each), followed by running Hadoop's standard WordCount job. What is going to happen?
    • A. 

      The data upload fails at the first file: it is too large to fit onto a DataNode

    • B. 

The data upload fails at a later stage: the disks are full

    • C. 

      WordCount fails: too many input splits to process.

    • D. 

      WordCount runs successfully.
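The arithmetic behind question 15 can be checked back-of-the-envelope style, using only the numbers given in the question:

```python
machines = 20
disk_per_machine_gb = 500
raw_capacity_gb = machines * disk_per_machine_gb   # 10,000 GB = 10 TB total

upload_gb = 5 * 1000                               # 5 TB of plain text
replication = 3
required_gb = upload_gb * replication              # 15 TB needed with replication

# 15 TB needed > 10 TB available: the upload fails at a later stage,
# once the disks fill up -- not at the first file, because HDFS splits
# each file into blocks spread across DataNodes, so a 500 GB file fits.
upload_fails_later = required_gb > raw_capacity_gb
```

The key insight is that the default replication factor of 3 triples the storage requirement, while block placement across DataNodes means no single file needs to fit on one machine.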

  • 16. 
    Basic Input Parameters of a Mapper.
    • A. 

      LongWritable and Text

    • B. 

      Text and IntWritable

  • 17. 
    Basic intermediate output parameters of a Mapper.
    • A. 

      LongWritable and Text

    • B. 

      Text and IntWritable

  • 18. 
    You can write MapReduce jobs in any desired programming language like Ruby, Perl, Python, R, Awk, etc. through the Hadoop ______________________ API.
  • 19. 
    Which are true statements regarding MapReduce?
    • A. 

Is a framework with which we can write applications to process huge amounts of data in parallel on large clusters.

    • B. 

Is a processing technique and a programming model for distributed computing based on Java.

    • C. 

      The MapReduce algorithm contains one important task, namely Map.

  • 20. 
Intermediate splitting – the entire process runs in parallel on different clusters. In order to group the data in the “Reduce Phase,” records with the same KEY should be on the same _________.
    • A. 

      Cluster

    • B. 

      Physical Machine

    • C. 

      Data Node

    • D. 

      Task Tracker

  • 21. 
Combining – the last phase, where all the data (the individual result set from each ________) is combined together to form a result.
    • A. 

      Cluster

    • B. 

      Physical Machine

    • C. 

      Data Node

    • D. 

      Task Tracker

  • 22. 
    The input file is passed to the mapper function ________________
    • A. 

      Line by Line

    • B. 

      All at Once

    • C. 

      In Chunks based on Cluster Size

    • D. 

      In Key - Value Pairs

  • 23. 
A ______________ comes into action, carrying out shuffling so that all the tuples with the same key are sent to the same node.
  • 24. 
So, after the sorting and shuffling phase, each reducer will have a unique key and a list of values corresponding to that key. For example:
    • A. 

      Deer, 1; Bear, 1; River, 1

    • B. 

      Bear, [1,1]; Car, [1,1,1]

    • C. 

      Bear, 2

    • D. 

      Deer Bear River
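The shuffle-and-sort result described in question 24 can be reproduced with a small grouping sketch in Python (the input pairs below follow the classic Deer/Bear/River/Car example):

```python
from collections import defaultdict

# Key/value pairs as emitted by the mappers, before shuffling.
mapped = [("Deer", 1), ("Bear", 1), ("River", 1),
          ("Car", 1), ("Car", 1), ("Bear", 1), ("Car", 1)]

# Shuffle and sort: group all values by key, so each reducer
# receives a unique key and the list of values for that key,
# e.g. Bear -> [1, 1] and Car -> [1, 1, 1].
grouped = defaultdict(list)
for key, value in mapped:
    grouped[key].append(value)
```

This is why option B's shape (a key mapped to a list of values) is what a reducer actually receives, whereas options A and C show pre-shuffle and post-reduce data respectively.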

  • 25. 
    Under the MapReduce model, the data processing ____________ are called mappers and reducers.