Create a file of size x bytes

One common requirement I run into when moving data around is checking whether I’m doing it the fastest way possible. A good indicator of speed is how long it takes to copy a large file from one server to another.

If you’re building a (big ;)) data pipeline that transports data from one server to another, its throughput had better be close to that raw copy speed. dd is a Unix utility that lets you create a file of a particular size for exactly this kind of test:

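dd if=/dev/zero of=/path/to/desired/big/file bs=1 count=1024

This will create a file 1,024 bytes in size. The bs=1 matters: dd writes count blocks of bs bytes each, and the default block size is 512 bytes, so count=1024 on its own would produce a 524,288-byte file.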
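Here is a minimal sketch of creating a file of exactly x bytes and then confirming the result. The function name make_file and the temporary path are my own illustrative choices, not part of dd. It passes bs=1 so that count is read as a number of bytes, since dd sizes its output as bs times count.

```shell
# Hypothetical helper: create a zero-filled file of exactly SIZE bytes.
# Usage: make_file PATH SIZE
make_file() {
    path=$1
    size=$2
    # bs=1 sets the block size to one byte, so count equals the size in bytes.
    # dd prints its transfer statistics on stderr; discard them here.
    dd if=/dev/zero of="$path" bs=1 count="$size" 2>/dev/null
}

target="${TMPDIR:-/tmp}/dd_example_file"
make_file "$target" 1024

# Print the resulting size in bytes to confirm it matches the request.
wc -c < "$target"
```

A one-byte block size is fine for small files but slow for large ones, because dd issues one write per block. For a genuinely big test file it is probably better to raise the block size and lower the count, for example bs=1048576 count=100 for 100 MiB.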

