Create Views over JSON Data in Hive

The beauty of storing raw JSON in Hive is that you can create multiple views over the same data, each exposing it as a different table. Hive offers a couple of ways to query JSON data: json_tuple and get_json_object. get_json_object takes a JSON string and a JSONPath expression and returns the value found at that path. Here’s an example:

event_type        event_data
user_registered   {"ip_address": ""}
user_deleted      {"ip_address": ""}

hive> CREATE VIEW my_view (type, value) AS
SELECT event_type, get_json_object(tbl.event_data, '$.ip_address')
FROM json_table tbl
WHERE event_type = 'some_type';

hive> select * from my_view;
type value
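
The other function mentioned above, json_tuple, can back a second view over the same raw table. The following is a minimal sketch, not a definitive implementation: ip_view is a hypothetical view name, and it assumes the table is named json_table with the event_type and event_data columns shown earlier. json_tuple is a table-generating function, so it is used through LATERAL VIEW.

```sql
-- Sketch only: ip_view is a hypothetical name chosen for illustration.
-- Assumes json_table has the columns event_type and event_data.
CREATE VIEW ip_view (type, ip) AS
SELECT tbl.event_type, j.ip_address
FROM json_table tbl
-- json_tuple parses event_data once and returns each requested key as a column
LATERAL VIEW json_tuple(tbl.event_data, 'ip_address') j AS ip_address
WHERE tbl.event_type = 'user_registered';
```

When a view needs several keys from the same JSON string, json_tuple can take them all in one call (each extra key becomes another column in the AS list), which avoids repeating get_json_object for every field.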
