Data Storage Calculations for Storing Event Data

Do you store message or event data? Ever wondered how much space it will take up in your Hadoop cluster over time? How much space will that 500 JSON msgs/second pipeline need? And what if you plan to compress that data later?

Well, I’ve wondered about that a lot, so I wrote a JavaScript app that does exactly that calculation.
https://github.com/yash-ranadive/storage_app
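
If you just want a feel for the arithmetic, here is a minimal sketch of a back-of-the-envelope estimate. It is not the code from the app above. The function name `estimateStorageBytes`, the 1 KiB average message size, and the 4x compression ratio are assumptions made up for illustration; the replication factor of 3 is the usual HDFS default, but check your own cluster.

```javascript
// Back-of-the-envelope storage estimate for an event pipeline.
// Every input value used below is an illustrative assumption, not a measurement.
const SECONDS_PER_DAY = 86400;

// Hypothetical helper: returns the raw and on-disk byte counts.
function estimateStorageBytes({
  msgsPerSecond,
  avgMsgBytes,
  days,
  replicationFactor = 3, // common HDFS default; verify for your cluster
  compressionRatio = 1,  // 1 means uncompressed; 4 means data shrinks to a quarter
}) {
  // Uncompressed size of a single copy of the data.
  const rawBytes = msgsPerSecond * avgMsgBytes * SECONDS_PER_DAY * days;
  // Size on disk after compression, multiplied by the number of replicas.
  const storedBytes = (rawBytes / compressionRatio) * replicationFactor;
  return { rawBytes, storedBytes };
}

function toTiB(bytes) {
  return bytes / 1024 ** 4;
}

// The 500 msgs/second pipeline, assuming 1 KiB messages kept for a year.
const estimate = estimateStorageBytes({
  msgsPerSecond: 500,
  avgMsgBytes: 1024,
  days: 365,
  compressionRatio: 4,
});

console.log(`Raw:    ${toTiB(estimate.rawBytes).toFixed(2)} TiB`);
console.log(`Stored: ${toTiB(estimate.storedBytes).toFixed(2)} TiB`);
```

Swap in your own message size and measured compression ratio; those two numbers usually move the result far more than anything else.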

 
