Showing posts from March, 2016

Transfer Data from RDBMS to Hadoop Using Sqoop/Oozie/Hue

A lengthy title. I know.

Cutting to the chase, I needed to transfer data from my RDBMS to a Hadoop cluster and I only had access to Hue/Oozie (since I’m not an admin). I knew that I could use Sqoop to do it, but I’d never really done it before.

It was freaking hard/annoying!

So to help others out there who might be in a similar predicament, here’s a quick 101.

1. I assume you already know how to use the Workflow Editor, so from there, create a new Workflow.


2. Drag a Sqoop action from the panel above and click OK.


3. You’ll get a pre-filled Sqoop command in there, which you can use as a reference. Head over to the Apache Sqoop documentation to learn more about all the arguments you can use.
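
To give a rough idea of what such a command looks like, here is a small Python sketch that only assembles and prints the text for a basic table import. Every value in it (JDBC URL, user, password file, table, target directory) is a made-up placeholder, and it assumes the action expects the text to start with the tool name (`import`) rather than `sqoop`.

```python
# All values below are hypothetical placeholders, not taken from a real setup.
params = {
    "--connect": "jdbc:mysql://db.example.com:3306/sales",
    "--username": "report_user",
    "--password-file": "/user/me/.db_password",  # assumed HDFS file holding the password
    "--table": "customers",
    "--target-dir": "/user/me/customers",
    "--num-mappers": "1",
}

# Assumption: the action text starts with the tool name, without a leading "sqoop".
parts = ["import"]
for flag, value in params.items():
    parts.extend([flag, value])

command = " ".join(parts)
print(command)
```

The printed line is what you would paste into the command text box, after swapping in your own connection details.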

4. There are 2 ways to enter the Sqoop command from here on out. You can either type the Sqoop command into the text box, OR, if you’re thinking of using a query in your command, my recommendation is to use the argument window. The latter is based on a post from SO, where some guys had i…
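
A likely reason the argument window is safer for queries: the Oozie Sqoop action documentation describes the single command text as being split on spaces, which tears a quoted query apart, while separate arguments are passed through untouched. The Python sketch below only simulates that difference. The query, JDBC URL and paths are hypothetical, and the plain `split(" ")` is a stand-in for the splitting, not Oozie's actual code.

```python
# Hypothetical free-form query import; connection details are placeholders.
# Sqoop expects the literal token $CONDITIONS inside a --query import.
query = "SELECT id, name FROM customers WHERE $CONDITIONS"

command = (
    "import --connect jdbc:mysql://db.example.com:3306/sales "
    f'--query "{query}" --target-dir /user/me/customers -m 1'
)

# Stand-in for what happens to the text box: a naive split on spaces.
split_command = command.split(" ")

# What the argument window preserves: one entry per argument, query kept whole.
arguments = [
    "import",
    "--connect", "jdbc:mysql://db.example.com:3306/sales",
    "--query", query,
    "--target-dir", "/user/me/customers",
    "-m", "1",
]

print(len(split_command), "pieces after splitting on spaces")
print(len(arguments), "arguments when entered one per field")
print("query kept whole by split:", query in split_command)
print("query kept whole as argument:", query in arguments)
```

In the split version the query ends up scattered across several pieces with stray quote characters attached, whereas in the list version it stays a single argument, which is what Sqoop needs after `--query`.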