Data Storage Calculations for Storing Event Data

Do you store message or event data? Ever wondered how much space it will take up in your Hadoop cluster over time? How much space will that 500 JSON msgs/second pipeline need? Or maybe you plan to compress that data later?

Well, I’ve wondered about that a lot. So I wrote a JavaScript app that estimates exactly that.
https://github.com/yash-ranadive/storage_app
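
For a rough feel of the arithmetic behind a question like the 500 msgs/second one, here is a minimal back-of-the-envelope sketch. It is not taken from the app linked above. The function name and parameters are hypothetical, and the 1 KB average message size and 5:1 compression ratio are assumptions made up for illustration. The replication factor of 3 is the usual HDFS default.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_storage_bytes(msgs_per_second, avg_msg_bytes, days,
                           replication_factor=3, compression_ratio=1.0):
    """Estimate on-disk bytes for an event pipeline (hypothetical helper).

    compression_ratio is uncompressed size / compressed size,
    so 5.0 means the data shrinks to one fifth of its raw size.
    """
    raw_bytes = msgs_per_second * avg_msg_bytes * SECONDS_PER_DAY * days
    return raw_bytes * replication_factor / compression_ratio


def to_tib(num_bytes):
    """Convert bytes to tebibytes (1 TiB = 1024**4 bytes)."""
    return num_bytes / 1024 ** 4


if __name__ == "__main__":
    # Assumed inputs: 500 msgs/second, 1 KB per message, one year of data.
    uncompressed = estimate_storage_bytes(500, 1024, 365)
    compressed = estimate_storage_bytes(500, 1024, 365, compression_ratio=5.0)
    print(f"Uncompressed, 3x replicated: {to_tib(uncompressed):.2f} TiB")
    print(f"Compressed 5:1, 3x replicated: {to_tib(compressed):.2f} TiB")
```

Real numbers depend heavily on the actual message size distribution, the file format, and the codec, so treat the result as an order-of-magnitude estimate.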

 
