Unit 1 – Accessing databases
Topic 1: Connector stage overview
- Use Connector stages to read from and write to relational tables
- Working with the Connector stage properties
Topic 2: Connector stage functionality
- Before / After SQL
- Sparse lookups
- Optimize insert/update performance
Topic 3: Error handling in Connector stages
- Reject links
- Reject conditions
Topic 4: Multiple input links
- Designing jobs using Connector stages with multiple input links
- Ordering records across multiple input links
Topic 5: File Connector stage
- Read and write data to Hadoop file systems
Demonstration 1: Handling database errors
Demonstration 2: Parallel jobs with multiple Connector input links
Demonstration 3: Using the File Connector stage to read and write HDFS files
Unit 2 – Processing unstructured data
Topic 1: Using the Unstructured Data stage in DataStage jobs
- Extract data from an Excel spreadsheet
- Specify a data range for data extraction in an Unstructured Data stage
- Specify document properties for data extraction
Demonstration 1: Processing unstructured data
Unit 3 – Data masking
Topic 1: Using the Data Masking stage in DataStage jobs
- Data masking techniques
- Data masking policies
- Applying policies for masquerading context-aware data types
- Applying policies for masquerading generic data types
- Repeatable replacement
- Using reference tables
- Creating custom reference tables
Demonstration 1: Data masking
Unit 4 – Using data rules
Topic 1: Introduction to data rules
- Using the Data Rules Editor
- Selecting data rules
- Binding data rule variables
- Output link constraints
- Adding statistics and attributes to the output information
Topic 2: Use the Data Rules stage to validate foreign key references in source data
Topic 3: Create custom data rules
Demonstration 1: Using data rules
Unit 5 – Processing XML data
Topic 1: Introduction to the Hierarchical stage
- Hierarchical stage Assembly editor
- Use the Schema Library Manager to import and manage XML schemas
Topic 2: Composing XML data
- Using the HJoin step to create parent-child relationships between input lists
- Using the Composer step
Topic 3: Writing hierarchical data to a relational table
Topic 4: Using the Regroup step
Topic 5: Consuming XML data
- Using the XML Parser step
- Propagating columns
Topic 6: Transforming XML data
- Using the Aggregate step
- Using the Sort step
- Using the Switch step
- Using the H-Pivot step
Demonstration 1: Importing XML schemas
Demonstration 2: Compose hierarchical data
Demonstration 3: Consume hierarchical data
Demonstration 4: Transform hierarchical data
Unit 6 – Updating a star schema database
Topic 1: Surrogate keys
- Design a job that creates and updates a surrogate key source file from a dimension table
Topic 2: Slowly Changing Dimensions (SCD) stage
- Star schema databases
- SCD stage Fast Path pages
- Specifying purpose codes
- Dimension update specification
- Design a job that processes a star schema database with Type 1 and Type 2 slowly changing dimensions
Demonstration 1: Build a parallel job that updates a star schema database with two dimensions