Course Outline
Introduction
- Apache Spark vs Hadoop MapReduce
Overview of Apache Spark Features and Architecture
Selecting a Programming Language
Setting up Apache Spark
Developing a Sample Application
Selecting the Data Set
Performing Data Analysis on the Data
Processing Structured Data with Spark SQL
Processing Streaming Data with Spark Streaming
Integrating Apache Spark with Third-Party Machine Learning Tools
Utilizing Apache Spark for Graph Processing
Optimizing Apache Spark
Troubleshooting
Summary and Conclusion
Requirements
- Familiarity with the Linux command line
- A general understanding of data processing
- Programming experience with Java, Scala, Python, or R
Audience
- Developers
Testimonials (3)
I liked that it was practical. I loved applying the theoretical knowledge through practical examples.
Aurelia-Adriana - Allianz Services Romania
Course - Python and Spark for Big Data (PySpark)
The fact that we were able to take with us most of the information/course/presentation/exercises done, so that we can look over them and perhaps redo what we didn't understand the first time or improve what we already did.
Raul Mihail Rat - Accenture Industrial SS
Course - Python, Spark, and Hadoop for Big Data
very interactive...