Summary -

This topic describes the Hive architecture in detail.


Let’s discuss each component in detail.

  1. User Interface (UI)
    • The interface through which the user submits queries
    • The most commonly used UIs are:
      • Web-based GUI
      • Command-line interface
      • HDInsight
  2. Driver & compiler
    • The driver component receives the queries.
    • The compiler component parses the query.
    • The compiler component creates the execution plan for the query after looking up the table structure and partition metadata in the metastore.
  3. Metastore
    • Stores all table structure information, including column information.
  4. Execution engine
    • Executes the execution plan created by the compiler.
    • Manages the dependencies between the stages of the plan and executes each stage on the appropriate system.
  5. HDFS or HBase
    • The storage layer that holds the data.
    • Contains the data that the execution engine retrieves and returns to the UI as results.
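
To make the compiler and metastore roles above more concrete, here is a minimal Python sketch of a compiler that looks up table structure and partition metadata before producing a plan. It is a toy model under stated assumptions, not Hive's implementation: the names `TableInfo`, `ToyMetastore`, `compile_query`, the `sales` table, and the dictionary-shaped plan are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TableInfo:
    """Table structure as a metastore might record it (hypothetical shape)."""
    columns: dict                                   # column name -> type name
    partitions: list = field(default_factory=list)  # known partition values


class ToyMetastore:
    """Stands in for the metastore: holds table structure and partition metadata."""

    def __init__(self):
        self._tables = {}

    def register(self, name, info):
        self._tables[name] = info

    def get_table(self, name):
        if name not in self._tables:
            raise KeyError(f"unknown table: {name}")
        return self._tables[name]


def compile_query(metastore, table, columns, partition=None):
    """Stands in for the compiler: check the request against metadata, then emit a plan."""
    info = metastore.get_table(table)
    missing = [c for c in columns if c not in info.columns]
    if missing:
        raise ValueError(f"unknown columns: {missing}")
    if partition is not None and partition not in info.partitions:
        raise ValueError(f"unknown partition: {partition}")
    # Scan only the requested partition when one is given, otherwise all of them.
    to_scan = [partition] if partition is not None else list(info.partitions)
    return {"table": table, "columns": list(columns), "partitions": to_scan}


metastore = ToyMetastore()
metastore.register("sales", TableInfo({"id": "int", "amount": "double"}, ["2023", "2024"]))
plan = compile_query(metastore, "sales", ["amount"], partition="2024")
print(plan)
```

In this sketch the plan is just a dictionary that an execution engine could later act on; a real plan consists of multiple dependent stages.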

Hive interaction with Hadoop -

  1. executeQuery: The query is submitted from the user interface to the driver.
  2. getPlan: The driver passes the query to the compiler, which checks the syntax of the query and prepares the query plan.
  3. getMetaData: The compiler sends a metadata request to the metastore.
  4. sendMetaData: The metastore sends the metadata to the compiler.
  5. sendPlan: The compiler verifies the requirements and sends the plan to the driver.
  6. executePlan: The driver sends the execution plan to the execution engine.
  7. metaDataOps for DDLs: The execution engine gets the metadata for DDL operations on table data from the metastore, if required.
  8. executeJob: The execution engine sends the job to the JobTracker, which executes the job.
  9. jobDone: The JobTracker sends the jobDone status to the execution engine along with the job output.
  10. sendResults: The execution engine sends the results to the driver.
  11. fetchResults: The UI fetches the results from the driver.
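
The order of these calls can be traced with a small Python simulation. This is only a sketch under assumptions: the classes, method names, dictionary-shaped query, and in-memory `storage` are hypothetical stand-ins, and the optional metaDataOps step is omitted because the sample query needs no DDL.

```python
class Metastore:
    def __init__(self, tables):
        self.tables = tables  # table name -> {"columns": [...]}

    def get_metadata(self, table):
        return self.tables[table]


class Compiler:
    def __init__(self, metastore, trace):
        self.metastore, self.trace = metastore, trace

    def get_plan(self, query):
        self.trace.append("getMetaData")          # compiler asks the metastore
        meta = self.metastore.get_metadata(query["table"])
        self.trace.append("sendMetaData")         # metastore answers
        unknown = [c for c in query["columns"] if c not in meta["columns"]]
        if unknown:
            raise ValueError(f"unknown columns: {unknown}")
        self.trace.append("sendPlan")             # plan goes back to the driver
        return {"scan": query["table"], "project": query["columns"]}


class JobTracker:
    def __init__(self, storage):
        self.storage = storage  # stands in for HDFS or HBase

    def run(self, plan):
        rows = self.storage[plan["scan"]]
        return [{c: row[c] for c in plan["project"]} for row in rows]


class ExecutionEngine:
    def __init__(self, job_tracker, trace):
        self.job_tracker, self.trace = job_tracker, trace

    def execute_plan(self, plan):
        self.trace.append("executeJob")           # job handed to the JobTracker
        output = self.job_tracker.run(plan)
        self.trace.append("jobDone")              # status and output come back
        self.trace.append("sendResults")          # results returned to the driver
        return output


class Driver:
    def __init__(self, compiler, engine, trace):
        self.compiler, self.engine, self.trace = compiler, engine, trace
        self._results = None

    def execute_query(self, query):
        self.trace.append("executeQuery")         # query arrives from the UI
        self.trace.append("getPlan")
        plan = self.compiler.get_plan(query)
        self.trace.append("executePlan")
        self._results = self.engine.execute_plan(plan)

    def fetch_results(self):
        self.trace.append("fetchResults")         # UI pulls results from the driver
        return self._results


trace = []
storage = {"sales": [{"id": 1, "amount": 9.5}, {"id": 2, "amount": 3.0}]}
metastore = Metastore({"sales": {"columns": ["id", "amount"]}})
driver = Driver(Compiler(metastore, trace),
                ExecutionEngine(JobTracker(storage), trace), trace)
driver.execute_query({"table": "sales", "columns": ["amount"]})
results = driver.fetch_results()
print(trace)
print(results)
```

The shared `trace` list records each operation name as it happens, so the printed sequence can be compared against the numbered steps.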