Sessions: 2 per week
Live Case Studies: 6
Students: 15 (per batch)
Apache defines Hadoop as a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. While the cost of storage has fallen steadily, the speed at which data can be processed has not kept pace, so loading and processing very large data sets remains a major headache. Hadoop was designed as the solution to exactly this problem.
Hadoop runs on Linux-based operating systems, so to work with the Hadoop framework you need to know how to operate a Linux-based OS, or how to run one in a virtual machine. (Don't worry if you don't know how; we give a short training on that too.)
Even though the Hadoop framework is written in Java, it can be used with other languages too; however, hands-on knowledge of core Java makes writing MapReduce code much easier.
We have the best trainers, who believe that a developer's best friend is the keyboard; hence our 100% practical-approach policy.
We keep our batch sizes small for individual attention and ample practical time. Smaller batches help both the trainer and the trainees work through the concepts thoroughly.
We use a series of real-life case studies to teach the concepts of Big Data and Hadoop in depth.
Our placement unit makes sure that you are comfortable presenting yourself to the companies we are tied up with. We provide 100% job assistance.
Introduction to Hadoop
Architecture of Hadoop
Installation of Hadoop
Configuration of Hadoop
HDFS - Hadoop Distributed File System
What is Big Data? What are the challenges in processing big data? What is Hadoop? Why Hadoop? History of Hadoop, the Hadoop ecosystem and related projects.
Overview of Big Data tools, the different vendors providing Hadoop distributions and where each fits in the industry, setting up a development environment and installing Hadoop on the user's laptop, Hadoop daemons, starting and stopping daemons using the command line and Cloudera Manager.
Understanding the problem statement and challenges of storing large data and how HDFS solves them, understanding the HDFS architecture, exploring HDFS through the command line as well as the web interface, writing files to HDFS, reading files from HDFS, rack awareness, HDFS commands.
Classical version of Apache Hadoop (MRv1), limitations of classical MapReduce, addressing the scalability and resource-utilization issues and the need to support different programming paradigms, YARN: the next generation of Hadoop's compute platform (MRv2), architecture of YARN, application submission in YARN, types of YARN schedulers (FIFO, Capacity and Fair).
Understanding how distributed processing solves the big data challenge and how MapReduce helps to solve that problem, setting up Eclipse-based Java projects for hands-on experience with MapReduce, the word-count problem and its solution, the MapReduce flow, walking through the Driver, Mapper and Reducer code, configuring the development environment (Eclipse), testing and debugging the project through Eclipse and then packaging and deploying the code on a Hadoop cluster, input formats (input splits and records, text input, binary input), output formats (text output, binary output, lazy output), MapReduce combiner, MapReduce partitioner, data locality, speculative execution, job optimization.
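The word-count flow covered in this module (map, shuffle/sort, reduce) can be sketched in plain Python, with no Hadoop cluster, to show what the framework does under the hood. This is only an illustrative model: the function names are ours, not Hadoop's API, and in real Hadoop the shuffle happens across machines.

```python
from collections import defaultdict

def mapper(line):
    # Map phase: emit an intermediate (word, 1) pair for every word in the line.
    for word in line.split():
        yield (word.lower(), 1)

def reducer(word, counts):
    # Reduce phase: sum all the counts emitted for the same word.
    return (word, sum(counts))

def word_count(lines):
    # Shuffle/sort phase: group intermediate pairs by key before reducing.
    groups = defaultdict(list)
    for line in lines:
        for word, one in mapper(line):
            groups[word].append(one)
    return dict(reducer(w, c) for w, c in groups.items())

print(word_count(["the quick brown fox", "the lazy dog"]))
```

On a real cluster the same Mapper and Reducer logic would be written as Java classes and submitted as a job; the split/shuffle/merge plumbing is what Hadoop provides for free.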
Setting up an RDBMS server and creating and loading datasets into MySQL, Sqoop architecture, writing Sqoop import commands to transfer data from the RDBMS to HDFS/Hive/HBase, incremental imports, writing Sqoop export commands to transfer data from HDFS/Hive back to the RDBMS, Sqoop jobs to automate these commands for day-to-day use.
Understanding the Flume architecture and how it differs from Sqoop, Flume agent setup, setting up data, types of sources, channels and sinks, multi-agent flows, different Flume implementations, hands-on exercises (configuring and running a Flume agent to load streaming data from a web server).
Hive architecture and components with typical query flows, creating tables, loading datasets and performing analysis on them, types of Hive metastore configurations, understanding the Hive data model, running DML commands such as joining tables, writing subqueries and saving results to a table or to HDFS, understanding the different file formats and choosing the right one, when to use partitioning and bucketing to optimize query performance, writing UDFs to reuse project/domain-specific implementations.
Pig architecture and components, Pig Latin and the data model in Pig, loading structured as well as unstructured data, performing data transformations using Pig's built-in functions (e.g. filter, group, join), writing UDFs and calling them from the Pig Grunt shell, creating a Pig script and running the entire script in one go.
Understanding the Spark architecture and why it outperforms MapReduce, working with RDDs, hands-on examples of various transformations on RDDs, performing Spark actions on RDDs, Spark SQL concepts: DataFrames and Datasets, hands-on examples with Spark SQL creating and working with DataFrames and Datasets, creating Spark DataFrames from an existing RDD, from external files and from Hive tables, performing operations on a DataFrame, using Hive tables in Spark.
Understanding the need for a NoSQL database and how it differs from an RDBMS, understanding the complete architecture of HBase and how to model data into column families, performing HBase-Hive and HBase-Pig integration, writing queries to interact with the data stored in HBase.
Oozie fundamentals, Oozie workflow creation, Oozie job submission, monitoring and debugging using the Oozie web console and the command line, concepts of Coordinators and Bundles, hands-on exercises.
Data: The Titanic dataset is publicly available. Using this dataset we will perform analysis and draw out insights such as the average age of the males and females who died in the Titanic disaster, and the number of males and females who died in each compartment.
Find the average age of the people (both male and female) who died in the tragedy, using Hadoop MapReduce.
Find how many persons survived, travelling class wise.
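The first question above boils down to a group-by-and-average over the passenger records. A minimal Python sketch of that logic follows; the record keys (`sex`, `age`, `survived`) are assumed for illustration and may differ in the actual CSV, and in the course this would be expressed as a Mapper/Reducer pair instead.

```python
from collections import defaultdict

def average_age_by_gender(passengers):
    # passengers: list of dicts with assumed keys 'sex', 'age', 'survived'.
    # Accumulate (age_sum, count) per gender for passengers who died.
    totals = defaultdict(lambda: [0.0, 0])
    for p in passengers:
        if p["survived"] == 0 and p["age"] is not None:
            totals[p["sex"]][0] += p["age"]
            totals[p["sex"]][1] += 1
    return {gender: s / n for gender, (s, n) in totals.items()}

sample = [
    {"sex": "male", "age": 30, "survived": 0},
    {"sex": "male", "age": 40, "survived": 0},
    {"sex": "female", "age": 28, "survived": 1},
    {"sex": "female", "age": 22, "survived": 0},
]
print(average_age_by_gender(sample))  # {'male': 35.0, 'female': 22.0}
```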
Data: Find out the views of different people on demonetization by analyzing tweets from Twitter. The dataset consists of tweets gathered in CSV format.
We will analyze the sentiment of each tweet using the words in its text, rating each word from +5 to -5 according to its meaning using the AFINN dictionary. AFINN is a dictionary of about 2,500 words, each rated from +5 to -5 depending on its meaning.
Find all the positive tweets
Find all the negative tweets
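The scoring scheme described above can be sketched in a few lines of Python. The dictionary here is a tiny toy subset with made-up entries standing in for the real AFINN list; a real run would load the full AFINN word list and handle punctuation and negation more carefully.

```python
# Toy stand-in for the AFINN dictionary: a few words rated from +5 to -5.
# These entries are illustrative only, not the official AFINN scores.
AFINN = {"good": 3, "great": 3, "happy": 2, "bad": -3, "terrible": -3, "sad": -2}

def sentiment(tweet):
    # Sum the ratings of every known word in the tweet; unknown words score 0.
    return sum(AFINN.get(word, 0) for word in tweet.lower().split())

tweets = ["What a great move", "terrible decision very sad", "no cash today"]
positive = [t for t in tweets if sentiment(t) > 0]
negative = [t for t in tweets if sentiment(t) < 0]
print(positive, negative)
```

A tweet summing to a positive score lands in the positive bucket, a negative sum in the negative bucket, and a zero sum (all unknown words) in neither.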
Data: Pokémon Go is a free-to-play, location-based augmented reality game developed by Niantic for iOS and Android devices. It was released in July 2016, initially only in selected countries. You can download Pokémon Go free of cost and start playing, and you can use PokéCoins to purchase Poké Balls, the in-game item you need to catch Pokémon.
Find out the average HP (hit points) of all the Pokémon.
Find the count of 'powerful' and 'moderate' Pokémon.
Find out the top 10 Pokémon by hit points.
Find out the top 10 Pokémon by Attack stat.
Find out the top 10 Pokémon by Defense stat.
Find out the top 10 Pokémon by total power.
Find out the top 10 Pokémon with the largest difference between their Attack and Sp. Attack stats.
Find out the top 10 Pokémon with the largest difference between their Defense and Sp. Defense stats.
Find out the top 10 fastest Pokémon.
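All of the top-10 questions above are the same operation with a different sort key. A small Python sketch of that generic top-N helper follows; the field names and sample stats are assumed for illustration and may differ from the actual dataset columns.

```python
def top_n(pokemon, stat, n=10):
    # Sort descending by the chosen stat and keep the first n entries.
    return sorted(pokemon, key=lambda p: p[stat], reverse=True)[:n]

# Tiny illustrative sample; real stats come from the full dataset.
sample = [
    {"name": "Mewtwo", "hp": 106, "attack": 110},
    {"name": "Pikachu", "hp": 35, "attack": 55},
    {"name": "Snorlax", "hp": 160, "attack": 110},
]
print([p["name"] for p in top_n(sample, "hp", 2)])  # ['Snorlax', 'Mewtwo']
```

Swapping the `stat` argument ("attack", "defense", "speed", a computed total, or the Attack/Sp. Attack difference) answers each of the questions in turn.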
Data: This clinical dataset was released to raise awareness of breast cancer. For practice, a few problems have been designed, with solutions, to help the user understand the data better.
What is the average age at which the initial pathologic diagnosis is made?
Find the average age of the people at each AJCC stage.
Find the count of people for each vital status.
Data: The Uber dataset consists of 4 columns: dispatching_base_number, date, active_vehicles and trips.
Find the days on which each dispatching base had the most trips.
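This question is a per-group maximum: for each dispatching base, keep the date with the highest trip count. A minimal Python sketch follows, using the four columns listed in the Data line; the sample values are made up for illustration.

```python
def busiest_day_per_base(rows):
    # rows: tuples of (dispatching_base_number, date, active_vehicles, trips).
    # Track, for each base, the (date, trips) pair with the highest trip count.
    best = {}
    for base, date, _vehicles, trips in rows:
        if base not in best or trips > best[base][1]:
            best[base] = (date, trips)
    return best

rows = [
    ("B02512", "1/1/2015", 190, 1132),
    ("B02512", "1/2/2015", 225, 1765),
    ("B02598", "1/1/2015", 870, 6227),
]
print(busiest_day_per_base(rows))
```

In the course, the same grouping would be expressed as a MapReduce job keyed on the base number, with the reducer taking the maximum over each group.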
Data: In this Spark SQL use case, 911 Emergency Helpline Number Data Analysis, we will perform analysis on data about the callers who dialed the emergency helpline number in North America.
What kind of problems are prevalent, and in which state?
What kind of problems are prevalent, and in which city?
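Both questions above are frequency counts over (problem type, region) pairs. A small Python sketch of that counting follows; the record keys (`title` for the problem type, `state`, `city`) are assumed for illustration, and in the course this would be a Spark SQL `GROUP BY` instead.

```python
from collections import Counter

def prevalent_problems(calls, region_key):
    # calls: list of dicts with assumed keys 'title', 'state', 'city'.
    # Count (problem type, region) pairs and return the most frequent ones.
    return Counter((c["title"], c[region_key]) for c in calls).most_common(3)

calls = [
    {"title": "EMS", "state": "PA", "city": "Norristown"},
    {"title": "EMS", "state": "PA", "city": "Abington"},
    {"title": "Fire", "state": "NJ", "city": "Trenton"},
]
print(prevalent_problems(calls, "state"))
```

Passing `"state"` or `"city"` as `region_key` answers the state-wise and city-wise variants of the question respectively.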
Had an amazing experience with Asterix, where I saw myself growing each day! If you are planning to launch your career in IT, then there's no place better than Asterix. They really kept their word of 100% placement! I just had a bachelor's in computer science before joining Asterix. Now I am a proud junior software engineer at jdotnet.
Attended a one-day workshop on Hadoop today; it was really good, with real-time use of Hadoop. Trainer Zartab has deep knowledge of Java as well as Big Data. One of the most promising institutes in Navi Mumbai for Java and Big Data courses.
Definitely a class to vouch for if your aim is to learn and gain knowledge. The fees are a nominal amount compared to the knowledge they impart. The professors (Zartab Sir / Naved Sir) pay individual attention to every student in the class (for the same reason, a batch consists of no more than 8 people). Even for a student from a non-IT background, understanding code becomes easy. Asterix Solution pays more attention to learning and understanding, unlike other institutions that are more focused on monetary gain.
I joined their 35-day Java training program and it turned out to be a wonderful training experience. Their training techniques are awesome, and the timely assessments helped me a lot. We completed 3 projects, and finally they got me placed. Thank you, Asterix Solution.
Awesome experience! The best part is how Sir relates programming concepts to daily-life experiences.
I was trained here in advanced Java and Android. It was a really awesome experience. The training quality is superb; trainer Zartab Sir uses a practical approach in his training that is very useful for overcoming the fear of coding. Now I am placed at Scripton Software Pvt. Ltd. as an Android developer.
Excellent training! Excellent project work! Excellent placement! That was my journey at Asterix Solution while completing the Advanced Android Training Program, and now I am working as an Android developer at "Choice Internation Pvt. Ltd." I would like to thank them for the wonderful training experience, and my special thanks go to our trainer Mr. Zartab Nakhwa for his continuous support. All the best, Asterix Solution. Glad to have joined you. 100% recommended.
One of the best classes in Mumbai to grasp programming knowledge and skills from the root level. Good for beginners.