What will I learn?

Learners will gain a thorough knowledge of the concepts and real-time usage scenarios
Learners will be able to set up and configure Spark and Kafka
Learners will be able to execute Kafka and Spark applications to perform all operations
Learners will be able to integrate Kafka with real-time streaming systems such as Spark and Storm
Learners will gain a thorough knowledge of Spark and the Hadoop ecosystem

Description

Apache Spark and Apache Kafka

This live, online, instructor-led Apache Spark and Apache Kafka training is aimed at technical professionals who want to work with tools and techniques related to Hadoop, big data, and databases.

The course includes module-wise assignments, evaluations, and periodic assessments, with a final assessment at the end of the course. After the course, learners will know the concepts and real-time usage scenarios; be able to set up and configure Spark and Kafka; execute Kafka and Spark applications to perform all operations; integrate Kafka with real-time streaming systems such as Spark and Storm; and understand Spark and the Hadoop ecosystem and how Spark compares with MapReduce programming.

DURATION: 24 hours

Course Outline

Big Data

· Big Data Technologies (Hadoop)

· Advantages of Hadoop

· Limitations of Hadoop

Introduction to Spark

·  Spark Basics

·  History of Spark

·  Spark Unified Stack

·  Why use Spark

·  Data Science Tasks

·  Spark in Data Processing Application

Getting Started with Spark

·  Download Spark

·  Install Spark

·  Spark Languages

·  Using the pyspark Shell

Spark Core Concepts

·  Resilient Distributed Datasets (RDDs)

·  Functional Programming with Spark

·  Working with RDDs

·  RDD Operations

·  Key-Value Pair RDDs

·  Pair RDD Operations

·  Load Data File into Spark

·  Save Files

·  Data Partitioning

Running Spark on a Cluster

·  A Spark Standalone Cluster

·  The Spark Standalone Web UI

·  Spark on Hadoop Cluster

·  Spark on Cloud

·  Scheduling

Parallel Programming with Spark

·  RDD Partitions

·  HDFS Data Locality

·  Executing Parallel Operations

Writing Spark Applications

·  Spark Applications vs. pyspark

·  Creating the SparkContext

·  Configuring Spark Properties

·  Building and Running a Spark Application

·  Deploying Application on Cluster

·  Logging

Caching and Persistence

·  RDD Lineage

·  Caching Overview

· Distributed Persistence

Spark SQL

·  SchemaRDD (the predecessor of DataFrame)

·  DataFrame and Dataset

·  SparkSession

·  SQL Operations

Spark Streaming

·  Spark Streaming Overview

·  Example: Streaming Word Count

·  Other Streaming Operations

·  Sliding Window Operations

·  Developing Spark Streaming Applications

Spark MLlib

·  What is Machine Learning

·  Supervised Machine Learning

·  Unsupervised Machine Learning

·  Algorithms used in Machine Learning

·  Data Types in MLlib

·  Building Machine Learning Applications

Advanced Spark Features

·  Spark Performance

·  Shared Variables: Broadcast Variables

·  Shared Variables: Accumulators

Common Performance Issues

·  Concurrency Limitation

·  Security Features

·  Memory Usage and Garbage Collection

·  Serialization

Kafka

· Kafka characteristics and salient features

· Understand Kafka and its components

· Introduction to the Kafka API

· Storing records in Kafka in a fault-tolerant way

· Producing and consuming messages from feeds such as Twitter

· Kafka's high throughput, scalability, durability, and fault tolerance

· Integrating Kafka with real-time streaming systems such as Spark & Storm

· Deploying Kafka in real-world business scenarios

· Developing a live real-time Spark project with Kafka

Spark and the Hadoop Ecosystem

Spark vs. MapReduce Programming

Major Projects

Project 1

Movie Recommendation

Project 2

Tweet Analysis

Self Designed Project

Interview Questions and Quiz Discussion

About the instructor

0 Reviews
28 Students
54 Courses

KPI Consulting

This workshop is delivered by industry-leading faculty with 10 to 15+ years of industry and training experience.

KPI Consulting is one of the fastest-growing e-learning and consulting firms, with 1000+ tech workshops. It provides objective-based, innovative, and effective learning solutions across the entire spectrum of technical and domain skills.
