Hive doesn’t like the carriage return character

Have you ever run into a situation where you count the rows of a table in a database, dump it to CSV, load it into Hive, and find that the count has changed? You probably have carriage returns in your fields. Hive treats a carriage return the same way as a newline, which means end of row, so a single record with an embedded \r gets split into two rows. Here’s a link I found that describes it:

http://grokbase.com/t/hive/user/111v7jva3f/newlines-in-data

You have to manually clean the \r characters from the file before loading it. One option is the Unix tr (translate) command, which deletes characters when given the -d flag:

cat yourfile | tr -d "\r" > newfile
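If you want to confirm that carriage returns really are the culprit, here is a minimal sketch that counts them before and after cleaning. The function name strip_cr and the file names are made up for illustration, and it assumes a POSIX shell with the standard tr and wc utilities.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive or any tool): strip_cr INPUT OUTPUT
# Reports how many carriage returns INPUT contains, writes a cleaned copy
# to OUTPUT, then reports how many remain in the cleaned copy.
strip_cr() {
    in_file=$1
    out_file=$2

    # tr -cd '\r' deletes every byte except \r, so wc -c counts the \r bytes.
    # The trailing tr -d ' ' removes padding that some wc versions add.
    cr_count=$(tr -cd '\r' < "$in_file" | wc -c | tr -d ' ')
    echo "carriage returns found: $cr_count"

    # Same cleanup as the one-liner above, without the extra cat.
    tr -d '\r' < "$in_file" > "$out_file"

    remaining=$(tr -cd '\r' < "$out_file" | wc -c | tr -d ' ')
    echo "carriage returns remaining: $remaining"
}
```

Calling strip_cr yourfile newfile and then running wc -l on newfile should give a line count that matches the row count from the source database, provided the fields contain no embedded newlines. Note that deleting the \r joins the text on either side of it. If you would rather keep a separator, tr '\r' ' ' swaps each carriage return for a space instead, at the cost of a trailing space on any line that ended in \r\n.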
 