Create a file of size x bytes

One requirement I keep running into when moving data around is checking whether I’m doing it the fastest way possible. A good baseline is how long it takes to copy a large file from one server to another.

If you’re building a (big ;)) data pipeline that transports data from one server to another, its throughput had better be close to that raw copy speed. dd is a Unix utility that lets you create a file of a particular size for this kind of test.

dd if=/dev/zero of=/path/to/desired/big/file count=1024

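dd writes count blocks of bs bytes each, and bs defaults to 512, so this will create a file of 1,024 × 512 = 524,288 bytes (512 KiB). To get a file of exactly 1,024 bytes, add bs=1 to the command.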
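
For an arbitrary size of x bytes, one option is to wrap dd in a small helper. The sketch below is only an illustration: the function name make_zero_file is made up for this post, and it assumes a POSIX shell with dd and /dev/zero available. It writes the bulk of the file in 1 MiB blocks and the remainder one byte at a time, so the result is exactly the requested size without crawling through a large file at bs=1.

```shell
#!/bin/sh
# make_zero_file PATH BYTES
# Hypothetical helper (not part of dd): fills PATH with exactly BYTES zero bytes.
make_zero_file() {
    path=$1
    bytes=$2
    block=1048576                 # 1 MiB, used for the bulk of the file

    full=$((bytes / block))       # number of whole 1 MiB blocks
    rest=$((bytes % block))       # leftover bytes after the whole blocks

    # Start from an empty file so repeated calls do not accumulate data.
    : > "$path" || return 1

    # Append the whole blocks in one fast dd call.
    if [ "$full" -gt 0 ]; then
        dd if=/dev/zero bs="$block" count="$full" 2>/dev/null >> "$path" || return 1
    fi

    # Append the remainder byte by byte (always less than 1 MiB).
    if [ "$rest" -gt 0 ]; then
        dd if=/dev/zero bs=1 count="$rest" 2>/dev/null >> "$path" || return 1
    fi
}
```

Something like make_zero_file /tmp/big.bin 1073741824 would then give a 1 GiB test file, and timing its transfer (for example with time scp /tmp/big.bin user@otherhost:/tmp/, where the host is a placeholder) gives the baseline copy speed. Keep in mind that a file of zeros compresses extremely well, so any compression on the transfer path can make the numbers look better than they would with real data.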

 