SPARK-HADOOP - Spark and Hadoop Training Course


Duration: 5 days

Spark Fundamentals introduces you to Spark development and gives you the technical know-how to work with it. By the end of the course you will be able to earn a Spark professional credential and to analyze terabyte-scale data using Spark and ecosystem components such as Spark SQL.




Big Data and Hadoop 

  • What is Big Data
  • What is Hadoop 
  • How Hadoop Works 
  • How Hadoop and Spark are related 
  • Hadoop Ecosystem

Spark Architecture and Components 

  • Spark Architecture
  • Spark Components

RDD in Depth 

  • RDDs 
  • Creating RDDs from files 
  • Creating RDDs from other RDDs
  • RDD operations 
  • Actions 
  • Transformations 
  • Pair RDDs 
  • Joins using RDD 
  • Map and Filter Transformation 
  • FlatMap Transformation
  • Caching and Persistence

Spark Platforms

  • Spark local mode, YARN, Mesos and Standalone

Spark Hands On 

  • Just Enough Python for Spark 
  • Basic operations on RDDs 
  • Pair RDD Hands On 
  • Building Spark Applications 
  • Submitting the application to a single-node cluster
  • Monitoring Spark Applications

Spark SQL & Dataframes 

  • Spark SQL and the SQL Context 
  • Creating Dataframes 
  • Dataframe Queries and Transformations
  • Temp Tables/Views 
  • Easy Querying 
  • Saving Dataframes
  • Dataframes and RDDs
  • Dataframe internals that make it fast
  • Catalyst Optimizer and Tungsten
  • Loading data into Spark from external data sources such as databases
  • Saving Dataframes to external sources such as HDFS and RDBMS
  • SQL features of Dataframes
  • Accessing Hive tables from Spark
  • Data formats: text formats such as CSV, JSON, and XML; binary formats such as Parquet and ORC
  • UDFs in Spark Dataframes
  • Exposing Spark SQL as a JDBC service: benefits and limitations
  • Hive Context vs Spark SQL Context

Spark Dataframes Hands On 

  • Dataframes on a JSON file 
  • Dataframes on Hive tables
  • Querying operations on Dataframes built from JSON


For: Professionals with basic knowledge of software development, programming languages, and databases. This basic knowledge should be enough to succeed in the course.

Not For: Absolute beginners in software development, who will find the course difficult to follow.




Trainocate Certificate of Attendance

