HCatalog Indexes

An index allows for quick access to data in a database. It determines where to store rows based on the type of index, making it easy to retrieve data without delays. An index is an ordered list of pointers to the rows in a table. It can be either partitioned or non-partitioned.

CREATE INDEX -

The CREATE INDEX command is used to create an index. Only single table indexes are supported.

Syntax -

CREATE INDEX index_nam
	ON TABLE base_table_name (col_name, ...)
	AS 'index.handler.class.name'
	[WITH DEFERRED REBUILD]
	[IDXPROPERTIES (property_name=property_value, ...)]
	[IN TABLE index_table_name]
	[PARTITIONED BY (col_name, ...)]
	[[ ROW FORMAT ...] STORED AS ...
   	| STORED BY ...]
	[LOCATION hdfs_path]
	[TBLPROPERTIES (...)]
	[COMMENT "index comment"]

Index partitioning typically aligns with the partitioning of the base table by default. However, you can use the PARTITIONED BY clause to specify a subset of the columns used for partitioning the table.

Example:

The following command demonstrates how to create an index named 'std_index' on the 'std_details' table -

hcat -e "CREATE INDEX std_index ON TABLE
 std_db.std_details AS 'org.apache.hadoop.
 HCatalog.ql.index.compact.CompactIndexHandler';"

Once the above statement successfully executed, the std_db.std_index created on std_db.std_details.

ALTER INDEX -

ALTER INDEX used to rebuild the index. The ALTER INDEX...REBUILD command is used to build the index structure for all partitions or a single partition. If the data in the base table changes, the REBUILD command must be used to update the index.

Syntax -

ALTER INDEX index_name ON table_name
 [PARTITION (...)] REBUILD

Example:

The below command will show how to alter the index std_index on std_details table.

hcat -e "ALTER INDEX std_index ON std_db1.std_details1;"

Once the above statement successfully executed, the std_db.std_index got altered to recreate on std_db1.std_details1.

DROP INDEX:

The DROP INDEX command is used to delete an index at any time. An index table can be dropped using the DROP TABLE command. When a base table with indexes is dropped, this action automatically affects all related indexes.

Syntax -

DROP INDEX index_name ON table_name

Example:

The below command will show how to drop index std_index on std_details.

hcat -e "DROP INDEX std_index;"

Once the above statement successfully executed, the std_db.std_index got dropped.