Hadoop - Importing compressed (gzip) data from S3 to Hive
I have a bunch of .gzip files in s3://mybucket/file/*.gzip.
I am loading the table using:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=1000;
set hive.enforce.bucketing=true;
set hive.exec.compress.output=true;
set io.seqfile.compression.type=BLOCK;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

create external table db.tablename (
  col1 datatype,
  col2 datatype,
  col3 datatype,
  col4 datatype)
partitioned by (col datatype)
clustered by (col2) sorted by (col1, col2) into 200 buckets
row format delimited
  fields terminated by '\t'
  lines terminated by '\n'
location 's3://mybucket/file';
It creates the table, but it doesn't load any data from S3 into Hive/HDFS.
Any help is appreciated.
Thanks, Sanjeev
I think the files present in s3://mybucket/file/ are not organized in the directory structure Hive expects for partitions. I suggest you create an external table without partitions and buckets over s3://mybucket/file/, and then write a Hive query that reads from that table and writes into the partitioned/bucketed table, as sketched below.
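A rough sketch of that two-step approach, assuming the files are tab-delimited gzipped text; the staging table name, column names, and datatypes below are placeholders for the real schema:

-- 1. Staging external table that simply points at the files on S3,
--    with no partitions or buckets.
create external table db.tablename_staging (
  col1 string,  -- placeholder columns/types; use the real schema
  col2 string,
  col3 string,
  col4 string,
  col  string   -- the future partition column is just a regular column here
)
row format delimited
  fields terminated by '\t'
  lines terminated by '\n'
location 's3://mybucket/file/';

-- 2. Populate the partitioned/bucketed table from the staging table
--    using dynamic partitioning (the partition column goes last in the select).
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.enforce.bucketing=true;

insert overwrite table db.tablename partition (col)
select col1, col2, col3, col4, col
from db.tablename_staging;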