Apache Spark with Cassandra Example with Game of Thrones

Spark Cassandra Tutorial

Spark Cassandra is a powerful combination of two open-source technologies that offer high performance and scalability. Spark is a fast and flexible big data processing engine, while Cassandra is a highly scalable and distributed NoSQL database. Together, they provide a robust platform for real-time data processing and analytics. One of the key benefits of using … Read more

How to Debug Scala Spark in IntelliJ

Spark Scala Debug

Have you struggled to configure debugging in IntelliJ for your Spark programs?  Yeah, me too.  Debugging with Scala code was easy, but when I moved to Spark things didn’t work as expected.  So, in this tutorial, let’s cover debugging Scala based Spark programs in IntelliJ tutorial.  We’ll go through a few examples and utilize the occasional help … Read more

Spark Broadcast and Accumulators by Examples

Spark Shared Variables Broadcast and Accumulators

What do we do when we need each Spark worker task to coordinate certain variables and values with each other?  This is when Spark Broadcast and Spark Accumulators may come into play. Think about it. Imagine we want each task to know the state of variables or values instead of simply independently returning action results back to the … Read more

IntelliJ Scala and Apache Spark Happy Together

Intellij Scala Spark

In this tutorial, we’re going to review one way to setup IntelliJ for Scala and Spark development.  The IntelliJ Scala combination is the best, free setup for Scala and Spark development.  And I have nothing against ScalaIDE (Eclipse for Scala) or using editors such as Sublime.  I switched from Eclipse years ago and haven’t looked … Read more