Writing MapReduce programs to analyze your Big Data can get complex. Hive can make querying your data much easier. Apache Hive, first created at Facebook, is a data warehouse system for Hadoop that facilitates easy data summarization, ad hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL.
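To give a flavor of the language, here is a minimal HiveQL sketch; the `page_views` table and its columns are hypothetical, not part of the course material:

```sql
-- A HiveQL query reads like SQL, but Hive compiles it into
-- MapReduce jobs that run across the Hadoop cluster.
SELECT url, count(*) AS views
FROM page_views          -- hypothetical table for illustration
GROUP BY url
ORDER BY views DESC
LIMIT 10;
```

Queries like this let analysts who know SQL work with data in HDFS without writing any Java MapReduce code.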
- Understand what Apache Hive is, the Hive architecture, and Hive use cases.
- Make basic configuration changes in a Hive installation.
- Use DDL to create new Hive databases and tables with a variety of different data types.
- Create partitioned tables that are optimized for Hadoop.
- Create and run a variety of useful DML queries against Hive.
- Use built-in Hive operators and functions to get work done.
- Create your own user-defined functions (UDFs) in Hive.
- Use a variety of different file formats and record formats with Hive.
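Several of the objectives above (DDL, partitioned tables, and DML) can be previewed in one short HiveQL sketch; the database, table, and HDFS path names here are made up for illustration:

```sql
-- DDL: create a database and a table partitioned by date.
-- Partitioning lets Hive prune data and read only the
-- partitions a query actually needs.
CREATE DATABASE IF NOT EXISTS web_logs;
USE web_logs;

CREATE TABLE page_views (
  user_id   BIGINT,
  url       STRING,
  view_time TIMESTAMP
)
PARTITIONED BY (view_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- DML: load a day's worth of data into one partition
-- (the HDFS path is hypothetical).
LOAD DATA INPATH '/data/page_views/2024-01-01'
INTO TABLE page_views PARTITION (view_date = '2024-01-01');

-- Query using a built-in aggregate function; the WHERE clause
-- on the partition column restricts the scan to one partition.
SELECT view_date, count(*) AS views
FROM page_views
WHERE view_date = '2024-01-01'
GROUP BY view_date;
```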
Prerequisites:
- Basic understanding of Apache Hadoop and Big Data.
- Working knowledge of SQL
- Basic Linux Operating System knowledge
- Hadoop Fundamentals I Version 2
- Apache Hadoop