Course Outline

Big Data Overview

  • Defining Big Data
  • The drivers behind Big Data's growing popularity
  • Big Data Case Studies
  • Key Characteristics of Big Data
  • Solutions for working with Big Data

Hadoop and Its Components

  • Introduction to Hadoop and its core components
  • Hadoop architecture and the types of data it can handle or process
  • A brief history of Hadoop, the companies using it, and their motivations
  • Detailed explanation of the Hadoop framework and its components
  • Understanding HDFS and operations for reading and writing to the Hadoop Distributed File System
  • Configuring a Hadoop cluster in various modes: standalone, pseudo-distributed, or multi-node

This section covers setting up a Hadoop cluster using VirtualBox, KVM, or VMware, including necessary network configurations, running Hadoop daemons, and testing the cluster.

  • Understanding the MapReduce framework and its operational mechanics
  • Executing MapReduce jobs on a Hadoop cluster
  • Exploring replication, mirroring, and rack awareness within the context of Hadoop clusters

Planning a Hadoop Cluster

  • Strategies for planning your Hadoop cluster
  • Aligning hardware and software requirements for cluster planning
  • Analyzing workloads to prevent failures and ensure optimal performance

Introduction to MapR and Why Choose MapR

  • Overview of MapR and its architecture
  • Understanding and utilizing the MapR Control System, MapR Volumes, snapshots, and mirrors
  • Cluster planning specific to MapR environments
  • Comparing MapR with other distributions and Apache Hadoop
  • MapR installation and cluster deployment procedures

Cluster Setup and Administration

  • Managing services, nodes, snapshots, mirrored volumes, and remote clusters
  • Understanding and managing cluster nodes
  • Comprehending Hadoop components and installing them alongside MapR services
  • Accessing data on the cluster, including via NFS
  • Managing data through volumes
  • Handling users and groups
  • Assigning roles to nodes
  • Commissioning and decommissioning nodes
  • Overseeing cluster administration and monitoring performance
  • Configuring and analyzing metrics
  • Administering MapR security
  • Working with M7-native storage for MapR tables
  • Configuring and tuning the cluster for optimal performance

Cluster Upgrades and Integration with Other Environments

  • Upgrading MapR software versions and understanding upgrade types
  • Configuring MapR clusters to access HDFS clusters
  • Deploying MapR clusters on Amazon Elastic MapReduce

All topics covered include demonstrations and hands-on practice sessions to provide learners with practical experience of the technology.

Requirements

  • Fundamental knowledge of the Linux File System
  • Basic proficiency in Java
  • Familiarity with Apache Hadoop (recommended)

Duration: 28 hours
