Moving data from Avro to ORC files

In the Importing data from Sqoop into Hive External Table with Avro encoding i blogged about how to sqoop data from RDBMS into Hive. But i wanted to take it to next step by moving the data downloaded to ORC table. I followed these steps to achieve that
  1. First thing is to find out the schema of the table in Avro and you can get that by executing following statement in hive
    
    show create table CUSTOMER;
    
    You will get output that looks something like this, it contains schema of the table
    
    CREATE EXTERNAL TABLE `CUSTOMER`(
      `contactid` int COMMENT 'from deserializer',
      `firstname` string COMMENT 'from deserializer',
      `lastname` string COMMENT 'from deserializer',
      `email` string COMMENT 'from deserializer')
    ROW FORMAT SERDE
      'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    STORED AS INPUTFORMAT
      'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    OUTPUTFORMAT
      'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    LOCATION
      'hdfs://sandbox.hortonworks.com:8020/tmp/customer/data'
    TBLPROPERTIES (
      'avro.schema.url'='hdfs:///tmp/customer/schema/customer.avsc',
      'transient_lastDdlTime'='1431719666')
    
  2. Copy the schema from last step and remove the part about format and table properties and replace it with part that highlighted in this code, execute this in Hive to create customer table in ORC format
    
    CREATE EXTERNAL TABLE `CUSTOMER_ORC`(
      `contactid` int  ,
      `firstname` string  ,
      `lastname` string  ,
      `email` string  )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\001'
    LOCATION
      'hdfs://sandbox.hortonworks.com:8020/tmp/customer/data_orc'
    STORED AS ORC tblproperties ("orc.compress"="SNAPPY","orc.row.index.stride"="20000"); 
    
  3. Last step is to copy data from avro table to ORC table, you can achieve that by using following command
    
    insert into table CUSTOMER_ORC select * from customer;
    

3 comments:

srjwebsolutions said...


We are leading responsive website designing and development company in Noida.
We are offering mobile friendly responsive website designing, website development, e-commerce website, seo service and sem services in Noida.

Responsive Website Designing Company in Noida
Website Designing Company in Noida
SEO Services in Noida
SMO Services in Noida

Vikas Chaudhary said...

Battery Mantra is Authorized exide car battery dealer in Noida and Greater Noida. We are providing our service in Indirapuram, Delhi, Ashok Nagar.

Exide Battery Dealer in Noida
Battery Dealer in Noida
Authorized Battery Dealer in Noida
Car Battery Dealer in Noida
Car Battery Dealer
Exide Battery Dealer

EG MEDI said...

Egmedi.com is online medical store pharmacy in laxmi nagar Delhi. You can Order prescription/OTC medicines online. Cash on Delivery available. Free Home Delivery


Online Pharmacy in Delhi
Buy Online medicine in Delhi
Online Pharmacy in laxmi nagar
Buy Online medicine in laxmi nagar
Onine Medical Store in Delhi
Online Medical store in laxmi nagar
Online medicine store in delhi
online medicine store in laxmi nagar
Purchase Medicine Online
Online Pharmacy India
Online Medical Store