How to determine character encoding of files downloaded by gsutil

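gsutil is Google’s command-line tool for Cloud Storage, which is where the Developer Console publishes its downloadable reports, reviews, etc. To find the character encoding of a file, list it with ls -L and read the charset parameter on the Content-Type line (utf-16le in the listing below). The Content-Encoding line shows that the object is also stored gzip-compressed.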

$ gsutil ls -L gs://link/to/your/document.csv
    Creation time:      Mon, 04 Aug 2014 09:38:01 GMT
    Content-Encoding:       gzip
    Content-Length:     739977
    Content-Type:       text/csv; charset=utf-16le
    Hash (crc32c):      AAAAAA
    Hash (md5):     AAAAAAAAAAAAAAA
    ETag:           AAAAAAAAAAA
    Generation:     1234567081803000
    Metageneration:     1
    ACL:            ACCESS DENIED. Note: you need OWNER permission
                on the object to read its ACL.
TOTAL: 1 objects, 739977 bytes (722.63 KB)
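
Once the charset is known, the file still has to be decoded with it. Here is a minimal Python sketch of that step, not an official recipe: it pulls the charset out of the listing text and decodes the downloaded bytes. The function names charset_from_listing and decode_report and the sample CSV row are hypothetical, made up for illustration. The gzip check is there on the assumption that the bytes may or may not already have been decompressed by the time they reach your script.

```python
import gzip
import re


def charset_from_listing(listing: str, default: str = "utf-8") -> str:
    """Return the charset named on the Content-Type line of `gsutil ls -L` output."""
    match = re.search(
        r"^[ \t]*Content-Type:[^\n]*?charset=([\w.-]+)",
        listing,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    # Fall back to the caller's default when no charset parameter is listed.
    return match.group(1).lower() if match else default


def decode_report(raw: bytes, charset: str) -> str:
    """Decode downloaded bytes, gunzipping first if they are still compressed."""
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    text = raw.decode(charset)
    return text.lstrip("\ufeff")  # drop a byte-order mark if one is present


# Two lines copied from the listing above.
listing = """\
    Content-Encoding:       gzip
    Content-Type:       text/csv; charset=utf-16le
"""
charset = charset_from_listing(listing)

# Hypothetical sample data standing in for a real downloaded report.
sample = gzip.compress("\ufeffPackage,Rating\ncom.example,5\n".encode("utf-16le"))

print(charset)
print(decode_report(sample, charset))
```

For a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes to decode_report together with the charset from the listing.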
 