In this class I cover the basics of Spark architecture and parallel processing: how data is partitioned, how a Spark cluster is put together, and how transformations differ from actions. I then show how to display and explore your data with grouped aggregations.
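To make the transformation/action distinction concrete, here is a minimal sketch in plain Python. It is not Spark's API: `LazyDataset`, `partition`, and `count_by_key` are hypothetical names made up for illustration, and everything runs in one process rather than on a cluster. The sketch only assumes the general idea that transformations are recorded and nothing is computed until an action asks for a result.

```python
from collections import defaultdict


def partition(rows, n):
    """Split rows into n partitions (round-robin, an illustrative assumption)."""
    return [rows[i::n] for i in range(n)]


class LazyDataset:
    """Toy stand-in for a partitioned dataset; hypothetical, not Spark's API."""

    def __init__(self, partitions, steps=()):
        self.partitions = partitions   # list of lists of rows
        self.steps = tuple(steps)      # recorded, not yet executed

    # Transformations: only record the step and return a new dataset.
    def filter(self, pred):
        return LazyDataset(self.partitions, self.steps + (("filter", pred),))

    def map(self, fn):
        return LazyDataset(self.partitions, self.steps + (("map", fn),))

    def _run_partition(self, rows):
        # Apply every recorded step to a single partition.
        for kind, fn in self.steps:
            if kind == "filter":
                rows = [r for r in rows if fn(r)]
            else:
                rows = [fn(r) for r in rows]
        return rows

    # Actions: trigger the recorded steps and return a concrete result.
    def collect(self):
        out = []
        for part in self.partitions:
            out.extend(self._run_partition(part))
        return out

    def count_by_key(self, key):
        # Grouped aggregation: count within each partition, then merge.
        totals = defaultdict(int)
        for part in self.partitions:
            partial = defaultdict(int)
            for row in self._run_partition(part):
                partial[key(row)] += 1
            for k, v in partial.items():
                totals[k] += v
        return dict(totals)


# Made-up example data: (state, value) pairs.
rows = [("NY", 3), ("CA", 5), ("NY", 7), ("TX", 1), ("CA", 2), ("NY", 4)]
calls = []


def is_large(row):
    calls.append(row)          # track when the predicate actually runs
    return row[1] >= 3


ds = LazyDataset(partition(rows, 3))
large = ds.filter(is_large)            # transformation: nothing runs yet
calls_before_action = len(calls)
counts = large.count_by_key(lambda r: r[0])   # action: work happens here
calls_after_action = len(calls)
```

The predicate is not called when `filter` is applied, only when `count_by_key` runs. In Spark the corresponding grouped count on a DataFrame is written with `groupBy(...).count()`, and the per-partition work is distributed across the cluster rather than looped over locally.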
Slides
Suggested Reading:
- Spark: The Definitive Guide, Chapters 1 and 2 (pp. 3-30) and Chapter 4 (pp. 49-58)
- Learning Spark, 2nd Edition, Chapters 1 and 2 (pp. 1-42)