Monday, 27 July 2015

Dumping the Hive table as CSV file

This query is suppported in hive > 0.11. I have tested this query in Hive-0.13.1   
You can run the below query to dump the hive tables as CSV files on to your local file system 
insert overwrite local directory '/home/training/HiveResult' row format delimited fields terminated by ',' select * from sample;

----------------------------------------
# First we have to convert hive result to csv format

hive -e 'select a.user_game_id as usergames, a.cur_user_count as users from fresh1.ipro_sports_user_game as a JOIN fresh1.ipro_sports_game_schedule as b ON a.game_id=b.game_id where a.user_game_id >= '270'' |sed 's/[[:space:]]\+/,/g' >/home/cloud-user/counofusers23


1) create table csvtable(game_id int,coun_ofusers int) 
row format delimited 
fields terminated by ','
lines terminated by '\n' 
stored as textfile;

2) load data local inpath '/home/cloud-user/counofusers23' into table csvtable;

3) CREATE TABLE avrotable
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    > STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    > TBLPROPERTIES (
    > 'avro.schema.literal'='{
    > "namespace": "com.rishav.avro",
    > "name": "student_marks",
    > "type": "record",
    > "fields": [ { "name":"game_id","type":"int"}, { "name":"count_ofusers","type":"int"}]
    > }');

4) insert overwrite table avrotable select * from counofusers23;

5) hadoop fs -cat /apps/hive/warehouse/fresh1.db/avrotable/* >countofusers.avro

6) Download avro-tools.1.7.5 from below site
http://mvnrepository.com/artifact/org.apache.avro/avro-tools/1.7.5

7) java -jar /home/cloud-user/avro-tools-1.7.5.jar tojson countofusers.avro >countfusers.json


8) Now you can open countfusers.json to view the json format....

No comments:

Post a Comment