Learning GraphX

A collection of the resources I've used while learning GraphX, Apache Spark's graph processing library.

Apache Spark (http://spark.apache.org/docs/latest/)
You can't learn GraphX without learning about Spark first. The official documentation is a nice introduction to Spark, and from there you can dive into the other components. Download Spark, and just start playing with it in Scala.

First Steps to Scala (http://www.artima.com/scalazine/articles/steps.html)
A step-by-step tutorial that covers some basic Scala skills.

Functional Programming Principles in Scala (https://class.coursera.org/progfun-004/lecture)
A course on Scala by none other than Martin Odersky himself. A great way to learn Scala and to work more effectively in Spark.

AMP Camp at Stanford (http://ampcamp.berkeley.edu/stanford-workshop/index.html)
A workshop exploring the BDAS stack (the Berkeley Data Analytics Stack), from UC Berkeley's AMPLab and held at Stanford University. If you're in the States, this is probably the best way to learn about Spark, Scala, GraphX and MLlib. Why do I put that restriction? To make full use of the workshop you'll need to set up an Amazon EC2 account, which is probably region based. That being said, I haven't really tried it yet. So if you're in Asia and you managed to go through the whole workshop, let me know! :)

Sotera (http://sotera.github.io/distributed-graph-analytics/)
Their GitHub repository contains a few SNA (social network analysis) algorithms that they developed in Giraph and GraphX. Apart from sharing the source files, they also provide an image so you can explore their algorithms in more detail. So if you're not able to play with the workshop's cluster above, this is probably the next best thing, and you also get hands-on experience with a real SNA algorithm.

Spark Summit (http://spark.apache.org/documentation.html)
Documentation, videos, training materials, exercises and more. Awesome material to get you started, and for the more advanced too!
