Analyzing Big Data with Microsoft R (20773A)

Overview

Duration: 3 Days

This course teaches students to use Microsoft R Server to create and run analyses on large datasets, and shows how to use it in big data environments such as a Hadoop or Spark cluster or a SQL Server database.

Objectives

After completing this course, students will be able to:

  • Explain how Microsoft R Server and Microsoft R Client work
  • Use R Client with R Server to explore big data held in different data stores
  • Visualize data by using graphs and plots
  • Transform and clean big data sets
  • Implement options for splitting analysis jobs into parallel tasks 
  • Build and evaluate regression models generated from big data 
  • Create, score, and deploy partitioning models generated from big data
  • Use R in the SQL Server and Hadoop environments  

Course Outline

Module 1: Microsoft R Server and R Client
Explain how Microsoft R Server and Microsoft R Client work.
Lessons

  • What is Microsoft R Server
  • Using Microsoft R Client
  • The ScaleR functions

Lab : Exploring Microsoft R Server and Microsoft R Client

  • Using R Client in VSTR and RStudio
  • Exploring ScaleR functions
  • Connecting to a remote server


Module 2: Exploring Big Data
At the end of this module the student will be able to use R Client with R Server to explore big data held in different data stores.
Lessons

  • Understanding ScaleR data sources
  • Reading data into an XDF object
  • Summarizing data in an XDF object

Lab : Exploring Big Data

  • Reading a local CSV file into an XDF file
  • Transforming data on input
  • Reading data from SQL Server into an XDF file
  • Generating summaries over the XDF data


Module 3: Visualizing Big Data
Explain how to visualize data by using graphs and plots.
Lessons

  • Visualizing in-memory data
  • Visualizing big data

Lab : Visualizing data

  • Using ggplot to create a faceted plot with overlays
  • Using rxLinePlot and rxHistogram


Module 4: Processing Big Data
Explain how to transform and clean big data sets.
Lessons

  • Transforming big data
  • Managing datasets

Lab : Processing big data

  • Transforming big data
  • Sorting and merging big data
  • Connecting to a remote server


Module 5: Parallelizing Analysis Operations
Explain how to implement options for splitting analysis jobs into parallel tasks.
Lessons

  • Using the RxLocalParallel compute context with rxExec
  • Using the RevoPemaR package

Lab : Using rxExec and RevoPemaR to parallelize operations

  • Using rxExec to maximize resource use
  • Creating and using a PEMA class


Module 6: Creating and Evaluating Regression Models
Explain how to build and evaluate regression models generated from big data.
Lessons

  • Clustering big data
  • Generating regression models and making predictions

Lab : Creating a linear regression model

  • Creating a cluster
  • Creating a regression model
  • Generating data for making predictions
  • Using the models to make predictions and comparing the results


Module 7: Creating and Evaluating Partitioning Models
Explain how to create and score partitioning models generated from big data.
Lessons

  • Creating partitioning models based on decision trees
  • Testing partitioning models by making and comparing predictions

Lab : Creating and evaluating partitioning models

  • Splitting the dataset
  • Building models
  • Running predictions and testing the results
  • Comparing results


Module 8: Processing Big Data in SQL Server and Hadoop
Explain how to use R in the SQL Server and Hadoop environments.
Lessons

  • Using R in SQL Server
  • Using Hadoop Map/Reduce
  • Using Hadoop Spark

Lab : Processing big data in SQL Server and Hadoop

  • Creating a model and predicting outcomes in SQL Server
  • Performing an analysis and plotting the results using Hadoop Map/Reduce
  • Integrating a sparklyr script into a ScaleR workflow

This course is part of the following certifications:
  • MCP 70-773: Analyzing Big Data with Microsoft R
  • MCSA 70-773: Analyzing Big Data with Microsoft R
  • MCSE 70-773: Analyzing Big Data with Microsoft R

In addition to their professional experience, students who attend this course should have:

  • Programming experience using R, and familiarity with common R packages
  • Knowledge of common statistical methods and data analysis best practices
  • Basic knowledge of the Microsoft Windows operating system and its core functionality
  • Working knowledge of relational databases

Course ID: 20773A


Scheduled Date             Country  Location   Fees
04 Sep 2019 - 06 Sep 2019  India    Bangalore  USD 700
17 Jul 2019 - 19 Jul 2019  India    Bangalore  USD 700
18 Dec 2019 - 20 Dec 2019  India    Bangalore  USD 700
20 Nov 2019 - 22 Nov 2019  India    Bangalore  USD 700
23 Oct 2019 - 25 Oct 2019  India    Bangalore  USD 700
26 Jun 2019 - 28 Jun 2019  India    Bangalore  USD 700
28 Aug 2019 - 30 Aug 2019  India    Bangalore  USD 700