




      How can I merge the small files of existing partitions into one large file in a single partition?

      For example, I have a table user1. It contains the columns fname and lname, and the partition column is day.

      I created the table with the script below:

      CREATE TABLE user1(fname string,lname string) PARTITIONED BY (day int);

      After inserting data into the partitioned table, it looks like this:

       fname  lname  day
      AA      AAA   20170201     ....>partition 20170201
      BB      BBB   20170201
      CC      CCC   20170202    ......>partition 20170202
      DD      DDD   20170202
      EE      EEE   20170203    .......>partition 20170203
      FF      FFF   20170203
      GG      GGG   20170204    ........>partition 20170204
      HH      HHH   20170204

      When I run a select query that filters on the partition column, i.e. day=20170201:

      select * from user1 where day=20170201;

      it gives the result below:

      AA      AAA   20170201
      BB      BBB   20170201

      Based on the above table, I want to merge all the small files of partitions day=20170201, day=20170202 and day=20170203 into the single partition day=20170203 of my partitioned table (i.e. user1). It should look like below:

      fname  lname  day
      AA      AAA   20170201
      BB      BBB   20170201
      CC      CCC   20170202
      DD      DDD   20170202
      EE      EEE   20170203    .......>partition 20170203
      FF      FFF   20170203
      GG      GGG   20170204    ........>partition 20170204
      HH      HHH   20170204

      Can you please suggest how I can achieve this?

      Thanks in advance.

      1. Create a new table partitioned by a new field, partition_day.
      2. Load the data into the new table (define your conditions for the new partitions in the CASE expression).
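
      The two steps above can be sketched in HiveQL roughly as follows. This is only a sketch under assumptions: the target table name user1_merged is hypothetical, day is kept as an ordinary column so the original values survive, and the CASE ranges simply mirror the example in the question. The DISTRIBUTE BY clause is an addition that is not part of the original answer; it sends all rows of one partition_day to the same reducer, which usually results in fewer, larger files per partition.

```sql
-- Step 1: new table, partitioned by the new field partition_day.
-- user1_merged is an assumed name; day is now a regular column.
CREATE TABLE user1_merged (fname STRING, lname STRING, day INT)
PARTITIONED BY (partition_day INT);

-- Allow the partition value to be computed per row (dynamic partitioning).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Step 2: load the data; the CASE expression decides the new partition.
-- The dynamic partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE user1_merged PARTITION (partition_day)
SELECT fname,
       lname,
       day,
       CASE
         WHEN day BETWEEN 20170201 AND 20170203 THEN 20170203  -- merged partition
         ELSE day                                              -- e.g. 20170204 stays as is
       END AS partition_day
FROM user1
DISTRIBUTE BY partition_day;  -- assumption: added to reduce the number of output files
```

      With the sample data, a query such as select * from user1_merged where partition_day=20170203; would then read the rows AA through FF from one partition, while each row still carries its original day value.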


08-24 05:19