This live, online, instructor-led Apache Spark and Apache Kafka training is aimed at technical professionals who want to work with tools and techniques related to Hadoop, big data, and databases. The course includes module-wise assignments, evaluations, and periodic assessments, with a final assessment at the end of the session.
Apache Spark and Apache Kafka
After this session, learners will have a thorough understanding of the concepts and real-time usage scenarios, and will be able to set up and configure Spark and Kafka, run Kafka and Spark applications, and integrate Kafka with real-time streaming systems such as Spark and Storm. The course also covers Spark and the Hadoop ecosystem, and Spark vs. MapReduce programming.
Big Data
· Big Data Technologies (Hadoop)
· Advantages of Hadoop
· Limitations of Hadoop
Introduction to Spark
· Spark Basics
· History of Spark
· Spark Unified Stack
· Why use Spark
· Data Science Tasks
· Spark in Data Processing Application
Getting Started with Spark
· Download Spark
· Install Spark
· Spark Languages
· Using the pyspark
Spark Core Concepts
· Resilient Distributed Datasets (RDDs)
· Functional Programming with Spark
· Working with RDDs
· RDD Operations
· Key-Value Pair RDDs
· Pair RDD Operations
· Load Data File into Spark
· Save Files
· Data Partitioning
Running Spark on a Cluster
· A Spark Standalone Cluster
· The Spark Standalone Web UI
· Spark on Hadoop Cluster
· Spark on Cloud
· Scheduling
Parallel Programming with Spark
· RDD Partitions
· HDFS Data Locality
· Executing Parallel Operations
Writing Spark Applications
· Spark Applications vs. pyspark
· Creating the SparkContext
· Configuring Spark Properties
· Building and Running a Spark Application
· Deploying Application on Cluster
· Logging
Caching and Persistence
· RDD Lineage
· Caching Overview
· Distributed Persistence
Spark SQL
· SchemaRDD
· DataFrame and Dataset
· SparkSession
· SQL Operations
Spark Streaming
· Spark Streaming Overview
· Example: Streaming Word Count
· Other Streaming Operations
· Sliding Window Operations
· Developing Spark Streaming Applications
Spark MLlib
· What is Machine Learning
· Supervised Machine Learning
· Unsupervised Machine Learning
· Algorithms used in Machine Learning
· Data Types in MLlib
· Building Machine Learning Applications
Advanced Spark Features
· Spark Performance
· Shared Variables: Broadcast Variables
· Shared Variables: Accumulators
Common Performance Issues
· Concurrency Limitation
· Security Features
· Memory Usage and Garbage Collection
· Serialization
Kafka
· Kafka characteristics and salient features
· Understanding Kafka and its components
· Introduction to the Kafka API
· Storing records in Kafka in a fault-tolerant way
· Producing and consuming messages from feeds like Twitter
· Kafka high throughput, scalability, durability and fault tolerance
· Integrating Kafka with real-time streaming systems like Spark & Storm
· Deploying Kafka in real-world business scenarios
· Developing a real-time Spark project with Kafka
Spark and the Hadoop Ecosystem
Spark vs. MapReduce Programming
Major Projects
Project 1
Movie Recommendation
Project 2
Tweet Analysis
Self Designed Project
Interview Questions and Quiz Discussion
KPI Consulting is one of the fastest-growing e-learning and consulting firms (with 1000+ tech workshops), providing objective-based, innovative, and effective learning solutions across the entire spectrum of technical and domain skills.