Best way to duplicate a partitioned table in Hive

A quick Google search for this will land you here:
http://grokbase.com/t/hive/user/097w0bsnne/best-way-to-duplicate-a-table

But I believe a better way is:

  1. Create the new target table with the same schema as the old table
  2. Use hadoop fs -cp to copy all the partition directories from the source table's location to the target table's location
  3. Run MSCK REPAIR TABLE table_name; on the target table so the metastore picks up the copied partitions
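
The three steps above can be sketched as a small Python helper that builds the commands and, by default, only prints them. This is a rough sketch under assumptions, not a definitive script: the table names and HDFS paths are hypothetical placeholders, step 1 is assumed to be done with CREATE TABLE ... LIKE (one way to reuse the old schema), and the target path is assumed to be the location Hive assigned to the new table (DESCRIBE FORMATTED shows it).

```python
import subprocess


def duplicate_table_commands(src_table, dst_table, src_dir, dst_dir):
    """Return the three commands, in order, as argument lists.

    src_dir and dst_dir are the HDFS locations of the two tables
    (hypothetical values supplied by the caller).
    """
    return [
        # 1. Empty table with the same columns and partition keys
        #    (assumes CREATE TABLE ... LIKE is how the schema is reused).
        ["hive", "-e", f"CREATE TABLE {dst_table} LIKE {src_table};"],
        # 2. Copy every partition directory. No local shell is involved,
        #    so the glob is passed through for hadoop fs to expand.
        ["hadoop", "fs", "-cp", f"{src_dir}/*", f"{dst_dir}/"],
        # 3. Register the copied partition directories in the metastore.
        ["hive", "-e", f"MSCK REPAIR TABLE {dst_table};"],
    ]


def duplicate_table(src_table, dst_table, src_dir, dst_dir, dry_run=True):
    """Print the commands (dry_run=True) or execute them one by one."""
    for cmd in duplicate_table_commands(src_table, dst_table, src_dir, dst_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical table names and warehouse paths, printed only.
    duplicate_table(
        "sales",
        "sales_copy",
        "/user/hive/warehouse/sales",
        "/user/hive/warehouse/sales_copy",
    )
```

Keeping the dry run as the default makes it easy to check the paths before anything is copied; pass dry_run=False once the printed commands look right.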
 