Begin Apache Spark Transformations in Scala [15 Examples]

Spark transformations produce a new Resilient Distributed Dataset (RDD), DataFrame, or Dataset, depending on your version of Spark. Knowing Spark transformations is a requirement to be productive with Apache Spark, whether you are using Scala or Python. The best way to become productive and confident in anything is to actually …

How to Debug Scala Spark in IntelliJ

Have you struggled to configure debugging in IntelliJ for your Spark programs? Yeah, me too. Debugging plain Scala code was easy, but when I moved to Spark, things didn’t work as expected. So, in this tutorial, let’s cover debugging Scala-based Spark programs in IntelliJ. We’ll go through a few examples and utilize the occasional help …

Spark Broadcast and Accumulators by Examples

What do we do when we need Spark worker tasks to coordinate certain variables and values with one another? This is when Spark Broadcast and Spark Accumulators may come into play. Think about it. Imagine we want each task to know the state of variables or values instead of simply independently returning action results to the …

IntelliJ Scala and Apache Spark Happy Together

In this tutorial, we’re going to review one way to set up IntelliJ for Scala and Spark development. The IntelliJ and Scala combination is the best free setup for Scala and Spark development. And I have nothing against Scala IDE (Eclipse for Scala) or using editors such as Sublime. I switched from Eclipse years ago and haven’t looked …