How to Calculate Modification Times of Hive Tables

If you use external tables in Hive, or feed data to Hive tables by any method other than Hive’s LOAD DATA, you will want to know how recent your data is.

Here’s a nifty little Ruby snippet that gets it over WebHDFS, by taking the newest modification time among the files in the table’s directory:

irb> require 'webhdfs'

irb> require 'date'

irb> client = WebHDFS::Client.new('hadoop-nn', 50070)

irb> fl = client.list('/user/hive/warehouse/database.db/tablename/')

irb> DateTime.strptime(fl.collect {|x| x['modificationTime']}.max.to_s, '%Q')
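
If you want the same thing outside of irb, here is a minimal sketch of it as a reusable method. It relies on WebHDFS reporting modificationTime in milliseconds since the Unix epoch. The method name latest_modification_time is my own invention, not part of the webhdfs gem, and the sample hashes are made-up stand-ins for what client.list returns, so the sketch runs without a cluster.

```ruby
require 'time'

# Hypothetical helper (not part of the webhdfs gem): takes the array of
# file status hashes from a WebHDFS directory listing and returns the
# newest modification time as a UTC Time, or nil for an empty directory.
def latest_modification_time(statuses)
  newest_ms = statuses.map { |s| s['modificationTime'] }.compact.max
  return nil if newest_ms.nil?

  # Split milliseconds into whole seconds plus leftover microseconds.
  Time.at(newest_ms / 1000, (newest_ms % 1000) * 1000).utc
end

# Made-up sample data standing in for the output of client.list(...).
sample = [
  { 'pathSuffix' => '000000_0', 'modificationTime' => 1412723484000 },
  { 'pathSuffix' => '000001_0', 'modificationTime' => 1412637084000 }
]

puts latest_modification_time(sample).iso8601
```

Against a real cluster you would pass in the listing itself, for example latest_modification_time(client.list('/user/hive/warehouse/database.db/tablename/')). Note that this only looks at files directly under the directory, so a partitioned table would need each partition directory listed as well.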
 
