Use Sqoop to Transfer CSV Data from the Local Filesystem to a Relational Database

Here’s a quick demo of how to transfer data from a local file system to a relational database using Sqoop1.

The environment

  1. CentOS 6.2
  2. MySQL 5.5.13
  3. CDH 5.3.0
    • Sqoop 1.4.5
    • Sqoop2 1.99.4

The schema

Here’s the CREATE TABLE statement used for the database:

CREATE TABLE children
(
  id INT NOT NULL,
  id_f FLOAT NOT NULL,
  description VARCHAR(255)
);

The data

Go to the ingest.tips github for the complete data set used. Here’s a snippet of the data:

1,1.0,'1 children were born in 1'
2,2.0,'2 children were born in 2'
3,3.0,'3 children were born in 3'

In this demo, we’ll assume the data is in /tmp/data/sample.csv.

The command

Using Sqoop1, the data above can be exported to the children table with the following query:

sqoop export -fs local -jt local --connect jdbc:mysql://example.com/example --username example --password example --table test --export-dir file:///tmp/data

Notes

  • -fs local and -jt local are used to reference the local file system and make sqoop run a local MapReduce job
  • The URI prefix file:/// is useful for accessing the local file system
  • /tmp/data/sample.csv should have the same number of columns as the table and map to the appropriate columns based on order
  • You may need to provide -libjars if you’re getting ClassNotFound exceptions.

Last words

Sqoop2 post on the same subject to come!

All of the data for this blog post is available at https://github.com/ingesttips/examples. If you have any questions, thoughts, ideas, or concerns… let us know!

team@ingest.tips | @ingesttips

Tweet about this on TwitterShare on FacebookShare on LinkedIn

Tagged: , , , ,


'Use Sqoop to Transfer CSV Data from the Local Filesystem to a Relational Database' has no comments

Be the first to comment this post!

Would you like to share your thoughts?

Your email address will not be published.