Showing posts from 2016

Running a separate H2O instance from R

Traditionally in R, an H2O instance created using h2o.init() is shared with everyone else. This poses a problem: because the resources (threads) are shared, not everyone can do work at the same time.

While h2o.init() itself has arguments to specify the port and ip, it is currently buggy: you can't start an instance on any port other than 54321.

To work around this, you'll need to start an h2o instance externally from Unix. From there, you can connect directly to the instance you defined. In addition, connecting to an instance in this manner provides another layer of protection, since you're no longer relying on libraries that are loaded in R (these can be removed by other users using RStudio).

You may use the following command to:

1. Create an h2o instance externally.
2. Connect to your h2o instance from R.

So basically, from R:
launchH2O <- -jar="" -name="" -nthreads="" -port…
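Since the snippet above is cut off, here is a rough sketch of what the external launch could look like from a Unix shell. The -name, -nthreads and -port flags match the ones in the snippet; the jar path, instance name, thread count and port are placeholders I made up, not the original values.

```shell
#!/bin/sh
# Sketch: build the command line that starts a private H2O instance.
# The jar path, instance name, thread count and port passed below are
# placeholders, not values from the original post.
h2o_launch_cmd() {
    # $1 = path to h2o.jar, $2 = instance name, $3 = threads, $4 = port
    printf 'java -jar %s -name %s -nthreads %s -port %s\n' "$1" "$2" "$3" "$4"
}

# Print the command instead of running it, so it can be checked first.
h2o_launch_cmd /path/to/h2o.jar my_h2o 4 54322
```

Once the printed command is running in a terminal, connecting from R should be a matter of pointing h2o.init() at that instance without starting a new one, for example h2o.init(ip = "localhost", port = 54322, startH2O = FALSE), assuming the same placeholder port.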

Create ad hoc wifi network and monitor mobile traffic

1. First off, ensure that you have a wired connection to your laptop.

2. Using Lenovo Access Connections, ensure that your laptop is connected to your LAN.

3. Go to 'Mobile Hotspot'. Fill the necessary details. Start the network.
4. From your mobile, you should be able to see the SSID. Connect to it.

5. From Windows, find out which network connection your wifi is using. In our case it's 'Wireless Network Connection 7'.
6. Initially, this connection will not be able to reach the internet.
7. Go to Local Area Connections.

8. Go to Properties, then Sharing. Enable sharing to allow your ad hoc wifi network to connect to the internet via your LAN.

9. Allow the needed services.
10. Click OK on every dialog.
11. Wait a few minutes for the changes to take effect.
12. Your mobile should now be able to connect to the internet via your laptop.
13. Install Wireshark and monitor 'Wireless Network Connection 7'.
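If you prefer the command line for the last step, tshark (the command-line tool that comes with Wireshark) can capture on the same interface. The sketch below only prints the command; the capture file name is a placeholder, and the interface name is the one from our case, so yours will probably differ (tshark -D lists the available ones).

```shell
#!/bin/sh
# Sketch: build a tshark capture command for a given interface.
# The capture file name used below is a placeholder.
tshark_capture_cmd() {
    # $1 = interface name (may contain spaces), $2 = capture file
    printf 'tshark -i "%s" -w "%s"\n' "$1" "$2"
}

# Quote the interface name because it contains spaces.
tshark_capture_cmd 'Wireless Network Connection 7' mobile.pcapng
```

The resulting capture file can then be opened in Wireshark as usual.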

Transfer Data from RDBMS to Hadoop Using Sqoop/Oozie/Hue

A lengthy title. I know.

Cutting to the chase, I needed to transfer data from my RDBMS to a Hadoop cluster and I only had access to Hue/Oozie (since I’m not an admin). I knew that I could use Sqoop to do it, but I’d never really done it before.

It was freaking hard/annoying!

So to help others out there who might be in a similar predicament as I was, here's a quick 101.

1. I assume you already know how to use the Workflow Editor, so from there, create a new Workflow.

2. Drag a Sqoop action from the panel above and click OK.

3. You’ll get a pre-filled Sqoop command in there which you can use as a reference. Hop over to Apache Sqoop to learn more about all the available arguments you can use.

4. There are 2 ways you can go about entering the Sqoop command from here on out. You can either type the Sqoop command in the text box, OR, if you’re thinking of using a query in your command, my recommendation is to use the argument window. The latter is based on a post from SO, where some guys had i…
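To make the argument-window option a bit more concrete: Oozie splits a Sqoop command typed as a single line on every space, whereas separate arguments are passed through untouched, which matters for a query that contains spaces. The sketch below just prints a free-form query import one argument per line, the way each piece would go into its own argument entry. The JDBC URL, user, password file, table and target directory are all made-up placeholders, and the leading "sqoop" is left out on the assumption that the Sqoop action supplies it.

```shell
#!/bin/sh
# Sketch: print each Sqoop argument on its own line, the way it would be
# entered as a separate entry in the argument window.
# The JDBC URL, user, password file, table and target directory are all
# placeholders.
sqoop_args() {
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

# Single quotes keep $CONDITIONS literal; Sqoop replaces it at run time.
sqoop_args import \
    --connect 'jdbc:mysql://db.example.com/sales' \
    --username report_user \
    --password-file /user/me/.db_password \
    --query 'SELECT id, name FROM customers WHERE $CONDITIONS' \
    --target-dir /user/me/customers \
    -m 1
```

Note that the whole SELECT statement comes out as a single line, so it stays one argument. With --query, Sqoop expects the $CONDITIONS token in the WHERE clause and a --target-dir, and -m 1 avoids having to pick a --split-by column.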