Data Storage Calculations for Storing Event Data

Do you deal with storing message/event data? Ever wondered how much space they will take over course of time in your Hadoop Cluster? How much space will that 500 JSON msgs/second pipeline take? Or maybe you plan to compress that data later?

Well, I’ve wondered that a lot. So I wrote a javascript app that does just that.
https://github.com/yash-ranadive/storage_app

 
0
Kudos
 
0
Kudos

Now read this

Streaming data to Hadoop using Unix Pipes? Use Pipefail

If you pipe the output of a statement to hadoop streaming you must know about the unix pipefail option. To demonstrate what it does, try this out in your commandline: $> true | false $> echo $? 1 $> false | true $> echo $? 0... Continue →