Create a file of size x bytes

One of the common requirements I run across when moving data around is checking whether I'm doing it as fast as possible. A good indicator of speed is how long it takes for a large file to get copied from one server to another.

If you're building a (big ;)) data pipeline that transports data from one server to another, you'd better be close to that speed. dd is a Unix utility that lets you create a file of a particular size for exactly this purpose.

dd if=/dev/zero of=/path/to/desired/big/file count=1024

With dd's default block size of 512 bytes, this creates a file of 1,024 × 512 bytes = 512 KiB (524,288 bytes), not 1,024 bytes. To control the size exactly, set the block size (bs) explicitly: the resulting file is bs × count bytes.
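As a quick sketch (the /tmp paths here are placeholders), here's how bs and count combine to give an exact size:

```shell
# Exactly 1,024 bytes: 1,024 blocks of 1 byte each
dd if=/dev/zero of=/tmp/exact_1k bs=1 count=1024

# A 100 MiB test file: 100 blocks of 1 MiB (much faster than bs=1,
# since dd issues one write per block). bs=1M is GNU dd syntax;
# BSD/macOS dd uses bs=1m.
dd if=/dev/zero of=/tmp/test_100m bs=1M count=100

# Verify the sizes in bytes (GNU stat; on macOS use `stat -f %z`)
stat -c %s /tmp/exact_1k    # 1024
stat -c %s /tmp/test_100m   # 104857600
```

For transfer benchmarking you generally want a large bs with a correspondingly smaller count, so the file gets written quickly.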

