Let us understand how to filter while importing the data using --where.


By default, sqoop import will fetch all the rows in the specified table.

If we want to filter rows based on some criteria, we can pass to --where any valid SQL condition that we would typically use in a WHERE clause.

--where is typically used with --table.

Let us see an example of sqoop import with --where.


sqoop import \
  --connect jdbc:mysql://ms.itversity.com:3306/retail_db \
  --username retail_user \
  --password itversity \
  --table orders \
  --warehouse-dir /user/training/sqoop_import/retail_db \
  --delete-target-dir \
  --where "order_status IN ('COMPLETE', 'CLOSED') AND order_date LIKE '2013-08%'"
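
One way to sanity check the filter is to count the matching rows in the source table with sqoop eval and compare that number with what was imported. The following is a minimal sketch, assuming the same connection details as the import above; the variable names WHERE_CLAUSE and COUNT_QUERY are introduced here only for illustration, and the sqoop eval call is skipped when sqoop is not on the PATH.

```shell
# Same condition that was passed to --where in the import above.
WHERE_CLAUSE="order_status IN ('COMPLETE', 'CLOSED') AND order_date LIKE '2013-08%'"

# Query that counts the rows in the source table matching the filter.
# COUNT_QUERY is an illustrative name, not something Sqoop requires.
COUNT_QUERY="SELECT count(1) FROM orders WHERE ${WHERE_CLAUSE}"
echo "${COUNT_QUERY}"

# Run the count against the source database only if sqoop is available.
if command -v sqoop >/dev/null 2>&1; then
  sqoop eval \
    --connect jdbc:mysql://ms.itversity.com:3306/retail_db \
    --username retail_user \
    --password itversity \
    --query "${COUNT_QUERY}"
fi
```

The count returned by sqoop eval should match the number of records written under /user/training/sqoop_import/retail_db/orders, assuming the source table did not change between the two commands.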

