Apache Spark with Scala tutorials, presented from a wide variety of perspectives, in a book designed to cover the broad landscape of Apache Spark.
The approach is hands-on, with access to source code downloads and screencasts of running examples. Get ready to learn by example!
Who is this tutorial cookbook for?
This book is suitable for beginners with no Spark or Scala experience who have some background in programming and/or databases. It’s a beginner book, but not one for people brand new to development or data engineering. It is designed to help readers augment their existing skills to advance their careers and/or build better data-intensive products.
What You’ll Learn
By the end of this book, you will have a real-world, practical understanding of how to use Spark with Scala. You will also learn the following:
- How to use Spark from Scala
- Comparison of Spark and Hadoop
- Core Spark constructs: Resilient Distributed Datasets, Transformations, and Actions
- Running two types of Spark clusters
- Deploying Scala applications to Spark clusters
- Spark SQL with Scala, including CSV, JSON, and relational databases
- Building a custom, Scala-based Spark Streaming application
- Writing and running automated tests for Spark applications
- Building a custom Spark Machine Learning application
- Spark with Amazon S3
- Using Cassandra from Spark
By the end of this book, you’ll be confident and productive using Spark with Scala in a variety of circumstances.
Also, links to video screencasts of the author running examples and explaining tutorials are available from within the book.
The book is based on Spark versions prior to 2.0, but the concepts still apply. For a list of differences, see http://spark.apache.org/releases/spark-release-2-0-0.html