Let us understand how to compress data while importing it with sqoop import.


  • We can enable compression by using --compress.
  • The default codec is gzip (DEFLATE-based), so text output files get a .gz extension.
  • We can pass a different compression algorithm using --compression-codec.
  • We can review the io.compression.codecs property in core-site.xml to get the list of valid compression algorithms that can be used.
  • Not every compression algorithm is compatible with every file format, so it is important to use only a codec that works with the file format of the import.
  • Here is an example of a sqoop import command that compresses the data using the default compression algorithm.


sqoop import \
  --connect "jdbc:mysql://ms.itversity.com:3306/retail_db" \
  --username retail_user \
  --password itversity \
  --table order_items \
  --warehouse-dir /user/training/sqoop_import/retail_db \
  --delete-target-dir \
  --compress
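
Before passing a non-default codec, it helps to see which codecs the cluster actually lists in io.compression.codecs. The following is a minimal sketch, not part of Sqoop: list_codecs is a hypothetical helper, the default path /etc/hadoop/conf/core-site.xml is an assumption about where the client configuration lives, and the parsing assumes the whole <value> element sits on one line.

```shell
# list_codecs [FILE]: print each class named in io.compression.codecs, one per line.
# FILE defaults to /etc/hadoop/conf/core-site.xml (an assumed location; adjust as needed).
list_codecs() {
  conf="${1:-/etc/hadoop/conf/core-site.xml}"
  awk '
    # Remember that we have seen the property name...
    /<name>io\.compression\.codecs<\/name>/ { found = 1 }
    # ...then take the first <value> on the same or a later line.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$conf" |
    tr ',' '\n' |                                   # one codec per line
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |    # trim surrounding whitespace
    sed '/^$/d'                                     # drop empty entries
}
```

Calling list_codecs with no argument reads the assumed default file. If the function prints nothing, the property is probably not set in that file and Hadoop is using its built-in defaults. On a node with the Hadoop client installed, hdfs getconf -confKey io.compression.codecs reports the effective value directly.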

Here is an example of a sqoop import command that compresses the data using the Snappy compression algorithm.


sqoop import \
  --connect "jdbc:mysql://ms.itversity.com:3306/retail_db" \
  --username retail_user \
  --password itversity \
  --table order_items \
  --warehouse-dir /user/training/sqoop_import/retail_db \
  --delete-target-dir \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec
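
After either import, one way to confirm that compression was applied is to list the output directory and read a few records back. This is a rough sketch: preview_import is a hypothetical helper, and it assumes the hdfs client is on the PATH and that the import wrote the usual part-m-* files under the warehouse directory.

```shell
# preview_import DIR [LINES]: list the import directory, then print the first
# LINES records (default 5) in decompressed form.
preview_import() {
  # The file extensions in the listing show the codec (for example .gz or .snappy).
  hdfs dfs -ls "$1"
  # The glob is quoted so that HDFS, not the local shell, expands part-m-*.
  # "hdfs dfs -text" picks the codec from the file extension and decompresses.
  hdfs dfs -text "$1/part-m-*" | head -n "${2:-5}"
}
```

For the commands above, the call would be preview_import /user/training/sqoop_import/retail_db/order_items, since --warehouse-dir places each table in a subdirectory named after the table.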