Import an RDBMS table from PostgreSQL into HDFS with Sqoop


1. Download the PostgreSQL JDBC driver:


[code lang="bash"]$ wget[/code]


2. Copy the driver JAR into Sqoop's lib directory:

[code lang="bash"]cp /home/cloudera/Desktop/postgresql-9.3-1102.jdbc4.jar /usr/lib/sqoop/lib/[/code]


3. Configure the

[code lang="bash"]/var/lib/pgsql/data/pg_hba.conf[/code]

file to allow the IP/host of the machine running Hadoop.

Then restart PostgreSQL using

[code lang="bash"]$ pg_ctl restart[/code]
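For the pg_hba.conf change in step 3, a record that lets the Hadoop machine connect might look like the sketch below. The address 192.168.56.101/32, the md5 auth method, and the helper name hba_entry are assumptions for illustration, not values from this setup; Testdb and postgres are the database and user used elsewhere in this post.

```bash
# Hypothetical address of the machine running Hadoop; replace with your own.
HADOOP_HOST_CIDR="192.168.56.101/32"
PG_HBA="/var/lib/pgsql/data/pg_hba.conf"

# Build one pg_hba.conf record in the order: TYPE DATABASE USER ADDRESS METHOD
# (hba_entry is an illustrative helper, not part of PostgreSQL).
hba_entry() {
  printf 'host    %s    %s    %s    md5\n' "$1" "$2" "$3"
}

# Print the record so it can be reviewed before touching the real file.
hba_entry Testdb postgres "$HADOOP_HOST_CIDR"

# To apply it, append the line to pg_hba.conf as the postgres user, e.g.:
#   hba_entry Testdb postgres "$HADOOP_HOST_CIDR" >> "$PG_HBA"
```

If the connection is still refused after the restart, it may also be worth checking listen_addresses in postgresql.conf, since PostgreSQL listens only on localhost by default.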


4. Run Sqoop: open a terminal on the machine running Hadoop and type the command below.


[code lang="bash"]cloudera@cloudera-vm:/usr/lib/sqoop$ bin/sqoop import --connect jdbc:postgresql:// --table employee --username postgres -P --target-dir /sqoopOut1 -m 1[/code]


Because of the -P option, Sqoop prompts for the database password:

Enter password:
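The value passed to --connect is a PostgreSQL JDBC URL of the general form jdbc:postgresql://host:port/database. As a rough sketch, the command from step 4 could be assembled like this. The host 192.168.56.1 is purely hypothetical, 5432 is PostgreSQL's default port, and the helper names jdbc_url and sqoop_import_cmd are made up for this example.

```bash
# Hypothetical connection details; adjust them to your own PostgreSQL server.
PG_HOST="192.168.56.1"   # assumption: machine running PostgreSQL
PG_PORT="5432"           # PostgreSQL's default port
PG_DB="Testdb"           # database used in this post

# Compose a PostgreSQL JDBC URL from host, port and database name.
jdbc_url() {
  printf 'jdbc:postgresql://%s:%s/%s\n' "$1" "$2" "$3"
}

# Print the full import command instead of running it, so it can be reviewed first.
sqoop_import_cmd() {
  printf 'bin/sqoop import --connect %s --table employee --username postgres -P --target-dir /sqoopOut1 -m 1\n' \
    "$(jdbc_url "$PG_HOST" "$PG_PORT" "$PG_DB")"
}

sqoop_import_cmd
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without Sqoop; paste the printed line into the terminal from /usr/lib/sqoop once the values look right.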



The steps above assume:

  • A Cloudera Hadoop VM distribution or any other machine running Hadoop.
  • A PostgreSQL installation.
  • The database Testdb and the employee table on a running instance of PostgreSQL (used in step 4).


All set! Your PostgreSQL table data is now available on HDFS in the VM's Hadoop cluster.
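To confirm that the rows really landed in HDFS, something like the following sketch can be run on the Hadoop machine. It assumes the /sqoopOut1 target directory from step 4 and relies on the fact that a map-only import with -m 1 writes a single part file; the helper name part_file is made up for this example.

```bash
TARGET_DIR="/sqoopOut1"   # the --target-dir used in step 4

# Name of the N-th mapper's output file inside the target directory
# (part_file is an illustrative helper).
part_file() {
  printf '%s/part-m-%05d\n' "$1" "$2"
}

# Only talk to HDFS when the hadoop client is available on this machine.
if command -v hadoop >/dev/null 2>&1; then
  # List what the import wrote.
  hadoop fs -ls "$TARGET_DIR"
  # With -m 1 a single mapper runs, so one part file is expected.
  hadoop fs -cat "$(part_file "$TARGET_DIR" 0)" | head -n 5
else
  echo "hadoop client not found; run this on the machine running Hadoop" >&2
fi
```

By default Sqoop writes each row as a comma-separated line of text, so the first few lines should look like the rows of the employee table.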


Enjoy learning Hadoop!
