Create Views over JSON Data in Hive

The beauty of storing raw JSON in HIVE is that you can potentially create multiple tables on the same data using Hive Views. Hive allows you to query JSON data using couple of different ways (json_tuple and get_json_object). The get_json_object allows you to pass a json string and a JSONPath to extract data. Here’s an example:

event_type event_data
user_registered {ip_address: “127.128.123.128” }
user_deleted {ip_address: “127.128.123.128” }
hive> CREATE VIEW my_view(type, value)
AS
SELECT event_type, get_json_object(tbl.event_data, '$.ip_address')
from json_talbe tbl
WHERE event_type='some_type';

hive> select * from my_view;
type value
user_registered 127.128.123.128
user_deleted 127.128.123.128
 
3
Kudos
 
3
Kudos

Now read this

Streaming data to Hadoop using Unix Pipes? Use Pipefail

If you pipe the output of a statement to hadoop streaming you must know about the unix pipefail option. To demonstrate what it does, try this out in your commandline: $> true | false $> echo $? 1 $> false | true $> echo $? 0... Continue →