Analyzing Large Data Sets with Apache Spark

Prepare for the certification exam with this training course. It includes a full set of instructional videos covering the knowledge you need to pass.
Getting Started with Spark
1. Introduction (2m 16s)
2. How to Use This Course (1m 41s)
3. [Activity] Getting Set Up: Installing Python, a JDK, Spark, and its Dependencies (14m 50s)
4. [Activity] Installing the MovieLens Movie Rating Dataset (3m 35s)
5. [Activity] Run your first Spark program! Ratings histogram example (4m 52s)
Spark Basics and Simple Examples
1. Introduction to Spark (10m 11s)
2. The Resilient Distributed Dataset (RDD) (12m 17s)
3. Ratings Histogram Walkthrough (13m 33s)
4. Key/Value RDDs, and the Average Friends by Age Example (16m 13s)
5. [Activity] Running the Average Friends by Age Example (5m 39s)
6. Filtering RDDs, and the Minimum Temperature by Location Example (8m 10s)
7. [Activity] Running the Minimum Temperature Example, and Modifying it for Maximums (5m 8s)
8. [Activity] Running the Maximum Temperature by Location Example (3m 21s)
9. [Activity] Counting Word Occurrences using flatMap() (7m 28s)
10. [Activity] Improving the Word Count Script with Regular Expressions (4m 44s)
11. [Activity] Sorting the Word Count Results (7m 44s)
Advanced Examples of Spark Programs
1. [Activity] Find the Most Popular Movie (5m 52s)
2. [Activity] Use Broadcast Variables to Display Movie Names Instead of ID Numbers (8m 23s)
3. Find the Most Popular Superhero in a Social Graph (4m 29s)
4. [Activity] Run the Script - Discover Who the Most Popular Superhero is! (6m)
5. Superhero Degrees of Separation: Introducing Breadth-First Search (7m 54s)
6. Superhero Degrees of Separation: Accumulators, and Implementing BFS in Spark (6m 44s)
7. [Activity] Superhero Degrees of Separation: Review the Code and Run it (9m 14s)
8. Item-Based Collaborative Filtering in Spark, cache(), and persist() (10m 12s)
9. [Activity] Running the Similar Movies Script using Spark's Cluster Manager (10m 54s)
10. [Exercise] Improve the Quality of Similar Movies (2m 58s)
Running Spark on a Cluster
1. Introducing Elastic MapReduce (5m 8s)
2. [Activity] Setting up your AWS / Elastic MapReduce Account and Setting Up PuTTY (9m 55s)
3. Partitioning (4m 21s)
4. Create Similar Movies from One Million Ratings - Part 1 (5m 12s)
5. [Activity] Create Similar Movies from One Million Ratings - Part 2 (11m 27s)
6. Create Similar Movies from One Million Ratings - Part 3 (3m 28s)
7. Troubleshooting Spark on a Cluster (3m 43s)
8. More Troubleshooting, and Managing Dependencies (5m 47s)
Spark SQL, DataFrames, and Datasets
1. Introducing Spark SQL (6m 8s)
2. Executing SQL commands and SQL-style functions on a DataFrame (8m 16s)
3. Using DataFrames instead of RDDs (5m 52s)
Other Spark Technologies and Libraries
1. Introducing MLlib (8m 10s)
2. [Activity] Using MLlib to Produce Movie Recommendations (2m 56s)
3. Analyzing the ALS Recommendations Results (4m 53s)
4. Using DataFrames with MLlib (7m 31s)
5. Spark Streaming and GraphX (7m 36s)