Hot Swapping of Tables in Hive

Here are the steps to hot swap a table in Hive when you manually load data in them. Hot Swapping the table allows you to refresh an entire table without any downtime thus making your table always available for querying. With this approach you Lock the table but it is only for a fraction of a second. This is useful when you want to change the definition of a table or reload the whole table without downtime. Hive 11 allows you to maintain lock on a table which you can later drop and re-create.

Here are the steps for the swap:

  1. Store the data in a staging location in HDFS
  2. Get EXCLUSIVE LOCK on the Hive table (using LOCK TABLE tablename) - so queries have to wait to until we finish the refresh
  3. Delete the underlying Hive folder (e.g. /user/hive/warehouse/user.db/tablename). We have to delete first because of https://issues.apache.org/jira/browse/HDFS-4142 and https://issues.apache.org/jira/browse/HADOOP-9382)
  4. Move the staging folder to the hive directory. (e.g. mv /user/staging/tablename /user/hive/warehouse/user/db/tablename)
  5. Release LOCK on the table
 
28
Kudos
 
28
Kudos

Now read this

Visualizing Metrics in Storm using StatsD & Graphite

Storm Metrics API # Jason Trost from Endgame has written a nice post on how to setup Storm to publish metrics using the Metrics API. Endgame has also open sourced a module storm-metrics-statsd for Storm that allows you to send messages... Continue →