1.
Apache Hive is a data warehouse software project built on top of
Correct Answer
B. Apache Hadoop
Explanation
Apache Hive is a data warehouse software project built on top of Apache Hadoop. Apache Hadoop is an open-source framework that allows for the distributed processing of large datasets across clusters of computers. Hive provides a SQL-like interface to query and analyze data stored in Hadoop, making it easier for users familiar with SQL to work with big data. Therefore, the correct answer is Apache Hadoop.
2.
Hive was initially developed by
Correct Answer
A. Facebook
Explanation
Hive was initially developed by Facebook.
3.
Hive is written in what language?
Correct Answer
C. Java
Explanation
Hive is a data warehouse infrastructure tool that is built on top of Hadoop. It provides a SQL-like interface to query and analyze large datasets stored in Hadoop. Hive is written in Java, which makes it platform-independent and allows it to run on any system that supports Java. This choice of language ensures that Hive can be easily integrated with other Java-based tools and frameworks in the Hadoop ecosystem.
4.
Hive converts queries to all except
Correct Answer
B. Spark ten
Explanation
Hive is a data warehousing tool that compiles queries into jobs for an execution engine. It supports MapReduce, Apache Tez, and Spark for query processing. "Spark ten" is not a recognized execution engine for Hive, so it is the exception among the options.
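To make the idea of turning a query into a job more concrete, here is a minimal sketch in plain Python of how a grouped count could be split into map, shuffle, and reduce phases. It is a toy illustration only, not Hive's actual compiler output; the rows, column names, and function names are all hypothetical.

```python
from collections import defaultdict

# Toy rows standing in for a table; the column names are hypothetical.
ROWS = [
    {"country": "US", "user": "a"},
    {"country": "NG", "user": "b"},
    {"country": "US", "user": "c"},
]


def map_phase(rows, key_column):
    """Emit one (key, 1) pair per input row, as a mapper would."""
    return [(row[key_column], 1) for row in rows]


def shuffle_phase(pairs):
    """Group the emitted values by key, as the shuffle step would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def group_by_count(rows, key_column):
    """Toy equivalent of: SELECT key_column, COUNT(*) FROM t GROUP BY key_column."""
    return reduce_phase(shuffle_phase(map_phase(rows, key_column)))


if __name__ == "__main__":
    print(group_by_count(ROWS, "country"))
```

On a real cluster the same three phases would run in parallel across many machines, which is the part the execution engine takes care of.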
5.
By default, Hive stores metadata in an embedded
Correct Answer
C. Apache Derby
Explanation
Hive, by default, stores its metadata in an embedded Apache Derby database. Apache Derby is a lightweight, Java-based relational database management system (RDBMS) that is included with Hive. It is used to store and manage the metadata, such as table schemas, partitions, and column statistics, for Hive tables. This allows Hive to efficiently query and analyze large datasets stored in Apache Hadoop.
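The kind of information a metastore keeps can be sketched with a small relational catalog. The example below uses Python's built-in sqlite3 module purely as a stand-in for an embedded database; Hive itself uses Derby by default, and the table layout, names, and HDFS path here are hypothetical and do not reflect the real metastore schema.

```python
import sqlite3


def create_toy_metastore():
    """Create an in-memory catalog. SQLite is a stand-in for Derby here."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE toy_tables (id INTEGER PRIMARY KEY, name TEXT UNIQUE, location TEXT);
        CREATE TABLE toy_columns (table_id INTEGER, position INTEGER, name TEXT, type TEXT);
        CREATE TABLE toy_partitions (table_id INTEGER, spec TEXT);
        """
    )
    return conn


def register_table(conn, name, location, columns, partitions=()):
    """Record a table's storage location, column schema, and partitions."""
    cur = conn.execute(
        "INSERT INTO toy_tables (name, location) VALUES (?, ?)", (name, location)
    )
    table_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO toy_columns VALUES (?, ?, ?, ?)",
        [(table_id, i, col, typ) for i, (col, typ) in enumerate(columns)],
    )
    conn.executemany(
        "INSERT INTO toy_partitions VALUES (?, ?)",
        [(table_id, spec) for spec in partitions],
    )
    conn.commit()
    return table_id


def describe_table(conn, name):
    """Return the (column, type) pairs of a table in declaration order."""
    return conn.execute(
        "SELECT c.name, c.type FROM toy_columns c "
        "JOIN toy_tables t ON t.id = c.table_id "
        "WHERE t.name = ? ORDER BY c.position",
        (name,),
    ).fetchall()


if __name__ == "__main__":
    store = create_toy_metastore()
    register_table(
        store,
        "page_views",
        "hdfs://namenode/warehouse/page_views",  # hypothetical path
        [("user_id", "bigint"), ("url", "string")],
        partitions=["dt=2024-01-01"],
    )
    print(describe_table(store, "page_views"))
```

The point of the sketch is that only the description of the data lives in the catalog; the data files themselves stay in Hadoop storage.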
6.
Apache Hive supports analysis of large data sets stored in Hadoop's
Correct Answer
A. HDFS
Explanation
Apache Hive is a data warehouse infrastructure that provides tools to enable easy data summarization, querying, and analysis of large datasets stored in Hadoop. Hadoop Distributed File System (HDFS) is the primary storage system used by Hadoop, and it is designed to store and process large amounts of data across multiple machines. Therefore, Apache Hive supports analysis of large data sets stored in Hadoop's HDFS.
7.
Other companies that use Hive include
Correct Answer
C. Netflix
Explanation
Netflix is one of the companies that use Hive.
8.
Hive is a type of ______ software
Correct Answer
C. Data warehouse
Explanation
Hive is a type of data warehouse software. A data warehouse is a system that is used to store and manage large amounts of structured and unstructured data. It is designed to support business intelligence and analytics activities by providing a centralized repository for data from various sources. Hive, specifically, is a data warehouse infrastructure built on top of Hadoop, a framework for processing and storing big data. It allows users to query and analyze data using a SQL-like language called HiveQL, making it easier to work with large datasets.
9.
Hive has how many execution engines?
Correct Answer
B. 3
Explanation
Hive has three execution engines. These execution engines are responsible for processing and executing queries in Hive. The three execution engines in Hive are MapReduce, Tez, and Spark. Each engine has its own advantages and can be chosen based on the specific requirements of the query and the underlying infrastructure. MapReduce is the default execution engine, while Tez and Spark provide faster and more efficient processing capabilities.
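In practice the engine is chosen per session through the `hive.execution.engine` property, whose values are `mr`, `tez`, and `spark`. The helper below is a small sketch that validates a name and builds the corresponding `SET` statement; the function name is hypothetical, and which engines are actually usable depends on the Hive version and how the cluster is set up.

```python
# Values accepted by the hive.execution.engine property, with display names.
ENGINES = {"mr": "MapReduce", "tez": "Apache Tez", "spark": "Apache Spark"}


def engine_set_statement(engine):
    """Build the HiveQL SET statement that selects an execution engine.

    Hypothetical helper: it only formats a string and does not talk to Hive.
    """
    key = engine.strip().lower()
    if key not in ENGINES:
        valid = ", ".join(sorted(ENGINES))
        raise ValueError(f"unknown engine {engine!r}; expected one of: {valid}")
    return f"SET hive.execution.engine={key};"


if __name__ == "__main__":
    print(engine_set_statement("tez"))
```

A name such as "spark ten" is rejected with a `ValueError`, which mirrors the point of question 4.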
10.
Major components of the Hive architecture include the following except
Correct Answer
D. Interpreter
Explanation
The Hive architecture consists of several major components that work together to process and analyze data. These include the Metastore, which stores metadata about the tables and partitions in Hive; the Driver, which manages the execution of Hive queries; and the Compiler, which translates HiveQL queries into jobs such as MapReduce jobs. The Interpreter, on the other hand, is not a part of the Hive architecture. It is a component of other systems like Apache Zeppelin, which allows users to interactively run queries and visualize data.
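How these components hand work to one another can be sketched with three small classes. This is a toy model under stated assumptions, not Hive's real class design: the class and method names are hypothetical, the query is reduced to a table and a grouping column, and the plan is just a list of strings.

```python
class Metastore:
    """Toy catalog that maps table names to their column names."""

    def __init__(self):
        self._tables = {}

    def add_table(self, name, columns):
        self._tables[name] = list(columns)

    def columns_of(self, name):
        if name not in self._tables:
            raise KeyError(f"unknown table: {name}")
        return self._tables[name]


class Compiler:
    """Toy compiler: checks a query against the catalog and emits plan stages."""

    def __init__(self, metastore):
        self.metastore = metastore

    def compile(self, table, column):
        # Semantic check: the grouping column must exist in the table schema.
        if column not in self.metastore.columns_of(table):
            raise ValueError(f"column {column!r} not found in table {table!r}")
        return [
            f"scan {table}",
            f"map: emit ({column}, 1)",
            f"reduce: sum counts per {column}",
        ]


class Driver:
    """Toy driver: receives the query and asks the compiler for a plan."""

    def __init__(self, compiler):
        self.compiler = compiler

    def run(self, table, column):
        # A real driver would submit the plan to an execution engine;
        # this sketch only returns the plan.
        return self.compiler.compile(table, column)


if __name__ == "__main__":
    catalog = Metastore()
    catalog.add_table("page_views", ["user_id", "url"])
    driver = Driver(Compiler(catalog))
    for stage in driver.run("page_views", "url"):
        print(stage)
```

Note that nothing in this flow interprets the query line by line: the query is checked against metadata, compiled into a plan, and then handed off for execution.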