
Apache Spark and Apache Kafka

This live, online, instructor-led Apache Spark and Apache Kafka training is aimed at technical professionals who want to work with tools and techniques related to Hadoop, big data, and databases. The course includes module-wise assignments, evaluation, and periodic assessments, with a final assessment at the end of the session.

Advanced
Created by KPI Consulting Last updated Mon, 08-Jun-2020 English
What will I learn?
  • Learners will understand the core concepts and real-time usage scenarios
  • Learners will be able to set up and configure Spark and Kafka
  • Learners will be able to execute Kafka and Spark applications to perform all operations
  • Learners will be able to integrate Kafka with real-time streaming systems such as Spark and Storm
  • Learners will understand Spark and the Hadoop ecosystem

Description

This course includes module-wise assignments, evaluation, and periodic assessments, with a final assessment at the end of the session. After the course, learners will understand the concepts and real-time usage scenarios, and will be able to set up and configure Spark and Kafka, execute Kafka and Spark applications to perform all operations, and integrate Kafka with real-time streaming systems such as Spark and Storm. The course also covers Spark and the Hadoop ecosystem, and Spark vs. MapReduce programming.


DURATION: 24 hours

Course Outline 


Big Data

  • Big Data Technologies (Hadoop)
  • Advantages of Hadoop
  • Limitations of Hadoop

Introduction to Spark

  • Spark Basics
  • History of Spark
  • Spark Unified Stack
  • Why use Spark
  • Data Science Tasks
  • Spark in Data Processing Application

Getting Started with Spark

  • Download Spark
  • Install Spark
  • Spark Languages
  • Using pyspark

Spark Core Concepts

  • Resilient Distributed Datasets (RDDs)
  • Functional Programming with Spark
  • Working with RDDs
  • RDD Operations
  • Key-Value Pair RDDs
  • Pair RDD Operations
  • Load Data File into Spark
  • Save Files
  • Data Partitioning

Running Spark on a Cluster

  • A Spark Standalone Cluster
  • The Spark Standalone Web UI
  • Spark on Hadoop Cluster
  • Spark on Cloud
  • Scheduling

Parallel Programming with Spark

  • RDD Partitions
  • HDFS Data Locality
  • Executing Parallel Operations

Writing Spark Applications

  • Spark Applications vs. pyspark
  • Creating the SparkContext
  • Configuring Spark Properties
  • Building and Running a Spark Application
  • Deploying Application on Cluster
  • Logging

Caching and Persistence

  • RDD Lineage
  • Caching Overview
  • Distributed Persistence

Spark SQL

  • SchemaRDD
  • DataFrame and Dataset
  • SparkSession
  • SQL Operations

Spark Streaming

  • Spark Streaming Overview
  • Example: Streaming Word Count
  • Other Streaming Operations
  • Sliding Window Operations
  • Developing Spark Streaming Applications

Spark MLlib

  • What is Machine Learning
  • Supervised Machine Learning
  • Unsupervised Machine Learning
  • Algorithms used in Machine Learning
  • Data Types in MLlib
  • Building Machine Learning Applications

Advanced Spark Features

  • Spark Performance
  • Shared Variables: Broadcast Variables
  • Shared Variables: Accumulators

Common Performance Issues

  • Concurrency Limitation
  • Security Features
  • Memory Usage and Garbage Collection
  • Serialization

Kafka

  • Kafka characteristics and salient features
  • Understanding Kafka and its components
  • Introduction to the Kafka API
  • Storing records in Kafka in a fault-tolerant way
  • Producing and consuming messages from feeds like Twitter
  • Kafka high throughput, scalability, durability and fault tolerance
  • Integrating Kafka with real-time streaming systems like Spark & Storm
  • Deploying Kafka in real-world business scenarios
  • Developing a real-time live Spark project with Kafka

 

Spark and the Hadoop Ecosystem

Spark vs. MapReduce Programming

Major Projects

 Project 1

Movie Recommendation

 Project 2

Tweet Analysis

 Self Designed Project

Interview Questions and Quiz Discussion

 

About the instructor
  • 0 Reviews
  • 28 Students
  • 54 Courses
This workshop is delivered by leading industry faculty with 10 to 15+ years of industry and training experience.

KPI Consulting is one of the fastest-growing e-learning and consulting firms, with 1000+ tech workshops. It provides objective-based, innovative, and effective learning solutions across the entire spectrum of technical and domain skills.

Includes:
  • Full lifetime access
  • Access on mobile and TV