
MAC-LEARN-SPARK - Machine Learning with Spark


Duration: 2 days
Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine Learning algorithms comb through data and identify patterns that are too complex to be discerned by the human mind.

These patterns can then be used for decision making and action. Apache Spark is a powerful platform for running machine learning workloads. This course will show you how to perform various machine learning tasks using Apache Spark's built-in MLlib component.


  • Overview of Apache Spark 
  • Clustering 
  • Regression 
  • Classification 
  • Recommendation


Module 1: Apache Spark Basics 

  • Recap of Apache Spark Basics
  • Install Apache Spark on Local Computer
  • Read CSV Data
  • Manipulating DataFrames
  • ML Libraries

Module 2: Preprocessing 

  • Normalizer
  • Standardizer
  • Tokenizer
  • TF-IDF
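
As a rough illustration of how the first two preprocessing topics differ, the sketch below uses plain Python rather than Spark. It assumes that "Normalizer" means rescaling each sample (row) to unit Euclidean length and that "Standardizer" means rescaling each feature (column) to zero mean and unit sample standard deviation. The function names `l2_normalize` and `standardize` are hypothetical and are not part of any Spark API.

```python
import math
import statistics


def l2_normalize(row):
    """Scale one sample (row) so that its Euclidean length is 1."""
    norm = math.sqrt(sum(x * x for x in row))
    # Leave an all-zero row unchanged to avoid dividing by zero.
    return [x / norm for x in row] if norm > 0 else list(row)


def standardize(column):
    """Rescale one feature (column) to zero mean and unit sample std."""
    mean = statistics.mean(column)
    std = statistics.stdev(column)  # sample standard deviation (n - 1)
    if std == 0:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


# Illustrative data only (an assumption, not course material).
row = [3.0, 4.0]
column = [2.0, 4.0, 6.0]

normalized_row = l2_normalize(row)          # works across one sample
standardized_column = standardize(column)   # works across one feature
```

The point of the sketch is the direction of the operation: normalization looks at a single sample at a time, while standardization needs statistics gathered over the whole dataset for each feature.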

Module 3: Clustering 

  • What is Clustering 
  • Clustering Algorithms 
  • KMeans Clustering 
  • Hierarchical Clustering
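
For readers new to the KMeans topic, the following is a minimal plain-Python sketch of the standard k-means iteration (Lloyd's algorithm), not the Spark MLlib implementation. The function name `kmeans`, the sample points, and the fixed starting centroids are all assumptions made for illustration; real libraries choose starting centroids with their own initialization strategies.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, assignments


# Illustrative data only: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (9.0, 10.0)]
final_centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

With these inputs, the first two points end up in one cluster and the last two in the other, and each centroid settles at the mean of its two members.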

Module 4: Classification 

  • What is Classification
  • Naive Bayes Classifier
  • Decision Tree Classifier
  • Multilayer Perceptron

Module 5: Regression 

  • What is Regression
  • Regression Algorithms
  • Linear Regression
  • Decision Tree Regression
  • Gradient Boosted Tree Regression

Module 6: ML Pipeline 

  • What is Pipeline
  • Creating a Pipeline for Movie Review Classification

Module 7: Recommendation (Optional) 

  • Recommendation Systems
  • Collaborative Filtering
  • Summary and Closing Remarks


  • Big Data Analysts 
  • Data Scientists 
  • Data Analysts


This is an intermediate course.

Participants should have basic knowledge of the following subjects: Python and Apache Spark.


Trainocate Certificate of Attendance

