Yash Ranadive

Data Engineer at Lookout Mobile Security



Create a Compressed RCFile Table in Hive

Here are the configuration parameters to set in the Hive client when you want to create a compressed RCFile table. Note: RCFiles can only be created when the data is already in HDFS. Unfortunately, I haven’t figured out a way to make it work with LOAD DATA LOCAL…

SET hive.exec.compress.output=true;
SET mapred.max.split.size=256000000;
SET mapred.output.compression.type=BLOCK;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
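
To show where these settings fit, here is a minimal sketch of a shell wrapper that writes them into a HiveQL script together with a table definition and an insert. It is not taken from the post: the table names (events_staging, events_rc), the columns, and the script path are all hypothetical. It assumes the rows already sit in a plain-text staging table in HDFS, which is one common way to get data into an RCFile table, since the rows have to be rewritten by a query rather than copied in as files.

```shell
#!/bin/sh
# Sketch only: table names, columns, and the output path are hypothetical.
HQL_FILE="${HQL_FILE:-/tmp/create_rc_table.hql}"

# Write the HiveQL script; the quoted 'EOF' keeps the shell from expanding anything.
cat > "$HQL_FILE" <<'EOF'
-- Compression settings from the post
SET hive.exec.compress.output=true;
SET mapred.max.split.size=256000000;
SET mapred.output.compression.type=BLOCK;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;

-- Hypothetical target table stored as RCFile
CREATE TABLE IF NOT EXISTS events_rc (
  id BIGINT,
  payload STRING
)
STORED AS RCFILE;

-- Rewrite rows from a hypothetical text staging table that is already in HDFS
INSERT OVERWRITE TABLE events_rc
SELECT id, payload FROM events_staging;
EOF

echo "Wrote $HQL_FILE"
# To run it on a machine with the Hive client installed:
# hive -f "$HQL_FILE"
```

The script only generates the file; the hive invocation is left commented out so the sketch can be read and adapted without a cluster at hand.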



Set Hive Table Replication

To set the replication factor of a table while loading it into Hive, set the following property in the Hive client.

SET dfs.replication=2;
LOAD DATA LOCAL ......;



Shell yesterday’s date

The date command in the shell is pretty powerful. Here’s how you can get yesterday’s date with GNU date, something I find myself doing frequently for tasks like pulling the last day’s data from an API.

$ date +%Y-%m-%d --date=yesterday
2014-02-05
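
As a rough sketch of how this might be used in a script, the snippet below wraps the command in a function and builds a request URL from the result. The function name yesterday, the fallback for BSD/macOS date (which uses -v-1d rather than --date), and the api.example.com endpoint are assumptions added for illustration, not part of the original one-liner.

```shell
#!/bin/sh
# Print yesterday's date as YYYY-MM-DD.
yesterday() {
  # GNU date (Linux) understands --date=yesterday
  if date --date=yesterday +%Y-%m-%d 2>/dev/null; then
    return 0
  fi
  # BSD/macOS date does not; it uses -v-1d instead
  date -v-1d +%Y-%m-%d
}

DAY="$(yesterday)"

# Hypothetical endpoint; replace with the real API you are querying
URL="https://api.example.com/events?date=${DAY}"
echo "$URL"
```

Capturing the date once in DAY keeps every later use in the script consistent, even if the script happens to run across midnight.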
