Big Data using Hadoop

Big data is a buzzword, or catch-phrase, used to describe volumes of structured and unstructured data so massive that they are difficult to process using traditional data processing applications. Challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and information privacy. Hadoop is an open-source, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
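
As a first taste of the programming model, the sketch below shows the classic word-count job written against Hadoop's Java MapReduce API: the mapper emits a count of 1 for each word in its input split, and the reducer sums those counts per word. It follows the standard Hadoop tutorial example; class names and the command-line input/output paths are illustrative.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emits (word, 1) for every word in the input split.
        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {
            private final static IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reducer (also usable as a combiner): sums the counts for each word.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }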

Course Objective

After completing the Big Data and Hadoop Course, you should be able to:

  • Master the concepts of the Hadoop Distributed File System (HDFS) and the MapReduce framework
  • Set up a Hadoop cluster (a minimal configuration sketch follows this list)
  • Perform data analytics using Pig and Hive
  • Implement HBase, MapReduce integration, advanced usage, and advanced indexing
  • Have a good understanding of the ZooKeeper service
  • Apply best practices for Hadoop development and debugging
  • Implement a Hadoop project

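To give a sense of the cluster-setup objective above, a single-node ("pseudo-distributed") Hadoop installation needs only two short XML files before HDFS is formatted and the daemons are started. The snippet below is a minimal sketch based on the standard single-node setup; the hostname and port are illustrative.

    <!-- etc/hadoop/core-site.xml: where clients find the HDFS NameNode -->
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
        </property>
    </configuration>

    <!-- etc/hadoop/hdfs-site.xml: a single node can hold only one replica of each block -->
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>

With these in place, running "bin/hdfs namenode -format" once initializes the filesystem, and "sbin/start-dfs.sh" starts the NameNode and DataNode daemons.
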
Pre-requisites

  • Participants should have basic knowledge of Linux or Windows systems, as the training may require them to perform some basic operations.
  • An understanding of any relational database and of Java will be an added advantage.
  • A Windows or Linux laptop with the following minimum configuration: 4 GB RAM, 200 GB disk space, and a 1.4 GHz CPU.