Big Data Bootcamp


Georgia Tech Big Data Bootcamp training material

Welcome

Welcome to the Big Data Bootcamp training. This training material was developed by Sunlab and Polo Club. Through the training, you will learn big data tools in the Hadoop and Spark ecosystems.

The sample data used in this training material comes from the healthcare domain, but no healthcare background is required, and you can easily adapt what you learn to other domains.

To get started, please set up the learning environment first.

Content Summary

The training material is divided into two chapters: Hadoop and Spark.

Hadoop Ecosystem

  1. Hadoop Basic
  2. Hadoop HBase
  3. Hadoop Streaming
  4. Hadoop Pig
  5. Hadoop Hive

Spark Ecosystem

  1. Scala Basic
  2. Spark Basic
  3. Spark SQL
  4. Spark GraphX
  5. Spark MLlib