1 / 7

Hive Index

Hive Index. Yongqiang He Software Engineer Facebook Data Infrastructure Team. Agenda. Create Index. C REATE INDEX index_name ON TABLE table_name ( col_name …) AS ‘ index handler class name ’ [WITH DEFERRED REBUILD] [INDEXPROPERTIES ( prop_key = prop_value , … ) ]

brygid
Télécharger la présentation

Hive Index

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Hive Index • Yongqiang He • Software Engineer • Facebook Data Infrastructure Team

  2. Agenda

  3. Create Index • C REATE INDEX index_name • ON TABLE table_name(col_name…) • AS ‘index handler class name’ • [WITH DEFERRED REBUILD] • [INDEXPROPERTIES (prop_key=prop_value, …)] • [IN TABLE index_table_name] • [[ROW FORMAT …] STORED AS …] • EXAMPLE 1: • CREATE TABLE src (key int, value string); • CREATE INDEX src_index ON TABLE src(key) as 'COMPACT' WITH DEFERRED REBUILD STORED AS RCFILE; • EXAMPLE 2: • CREATE TABLE srcpart_rc (key int, value string) PARTITIONED BY (ds string, hr int) STORED AS RCFILE; • CREATE INDEX src_part_indexON TABLEsrcpart_rc(key) as 'COMPACT' WITH DEFERRED REBUILD;

  4. Update Index / Rebuild Index • ALTER INDEX index_nameON table_name [partitionSpec] REBUILD; • EXAMPLE : • ALTER INDEXsrc_indexON srcREBUILD; • ALTER INDEXsrc_part_indexON srcpart_rcREBUILD; • ALTER INDEX src_part_indexON srcpart_rcpartition(ds=‘2010-08-09’) REBUILD;

  5. No Optimizer Available right now. • Need to use index manually in query. • (Working on an index optimizer to direct a query against index.) Use Index • designed to help queries with filter clause (point query, range query). • EXAMPLE : • Original query:SELECT key, value FROM srcpart_rc WHERE key=100; • Query with index: • INSERT OVERWRITE DIRECTORY "/tmp/index_result" SELECT `_bucketname` , `_offsets` FROM default__srcpart_rc_srcpart_rc_index__ WHERE key=100; • SET hive.index.compact.file=/tmp/index_result; • SET hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat; • SELECT key, value FROM srcpart_rc WHERE key=100 ORDER BY key;

  6. Metastore upgrade script • MySQL: • http://wiki.apache.org/hadoop/Hive/IndexDev#Metastore_Upgrades

More Related