Udemy

Practical Guide to setup Hadoop and Spark Cluster using CDH

  • 27,847 students
  • Updated 2/2023
  • Rating: 4.6 (538 ratings)

Course Information

Enrollment period
Open year-round
Course level
Learning mode
Duration
20 hours 56 minutes
Language of instruction
English

Course Overview

Step-by-step instructions to set up a Hadoop and Spark cluster using the Cloudera Distribution of Hadoop (formerly CCA 131)

Cloudera is one of the leading vendors of Hadoop and Spark distributions. In this practical guide, you will learn, step by step, how to set up a Hadoop and Spark cluster using CDH.

Install - Demonstrate an understanding of the installation process for Cloudera Manager, CDH, and the ecosystem projects.

  • Set up a local CDH repository

  • Perform OS-level configuration for Hadoop installation

  • Install Cloudera Manager server and agents

  • Install CDH using Cloudera Manager

  • Add a new node to an existing cluster

  • Add a service using Cloudera Manager

Configure - Perform basic and advanced configuration needed to effectively administer a Hadoop cluster

  • Configure a service using Cloudera Manager

  • Create an HDFS user's home directory

  • Configure NameNode HA

  • Configure ResourceManager HA

  • Configure a proxy for HiveServer2/Impala

Manage - Maintain and modify the cluster to support day-to-day operations in the enterprise

  • Rebalance the cluster

  • Set up alerting for excessive disk fill

  • Define and install a rack topology script

  • Install a new type of I/O compression library in the cluster

  • Revise YARN resource assignment based on user feedback

  • Commission/decommission a node
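
As an illustration of the rack topology item above, here is a minimal sketch of a topology script in Python. Hadoop calls the configured script with one or more IP addresses or hostnames as arguments and reads one rack path per argument from standard output. The host-to-rack table and the rack names are hypothetical placeholders, not values from the course.

```python
#!/usr/bin/env python3
"""Minimal rack topology script sketch for Hadoop rack awareness."""
import sys

# Hypothetical host-to-rack table; replace with your own inventory.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}

# Rack assigned to any host that is missing from the table.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts as command-line arguments and expects
    # whitespace-separated rack paths on standard output.
    print(" ".join(resolve(sys.argv[1:])))
```

The script must be executable and present on the NameNode host. Returning a default rack for unknown hosts keeps a newly added node from failing resolution before the table is updated.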

Secure - Enable relevant services and configure the cluster to meet goals defined by security policy; demonstrate knowledge of basic security practices

  • Configure HDFS ACLs

  • Install and configure Sentry

  • Configure Hue user authorization and authentication

  • Enable/configure log and query redaction

  • Create encrypted zones in HDFS
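
For the HDFS ACL item above, the following sketch builds the argument lists for the `hdfs dfs -setfacl` and `hdfs dfs -getfacl` commands and checks each ACL entry before anything is run. The helper names, the example path, and the user `alice` are hypothetical. The sketch only constructs the commands; it assumes ACLs are already enabled on the NameNode (`dfs.namenode.acls.enabled`).

```python
import re

# Accepted entry shape: [default:]user|group|mask|other:name:perms
# The name may be empty (for example "group::r-x").
_ACL_ENTRY = re.compile(
    r"^(default:)?(user|group|mask|other):([^:,\s]*):([r-][w-][x-])$"
)


def setfacl_cmd(path, entries, recursive=False):
    """Build the argv for `hdfs dfs -setfacl -m <entries> <path>`."""
    for entry in entries:
        if not _ACL_ENTRY.match(entry):
            raise ValueError(f"malformed ACL entry: {entry!r}")
    cmd = ["hdfs", "dfs", "-setfacl"]
    if recursive:
        cmd.append("-R")
    # -m modifies the listed entries and leaves the others in place.
    cmd += ["-m", ",".join(entries), path]
    return cmd


def getfacl_cmd(path):
    """Build the argv for `hdfs dfs -getfacl <path>`."""
    return ["hdfs", "dfs", "-getfacl", path]


# Example: give a hypothetical user read and execute access.
print(" ".join(setfacl_cmd("/data/sales", ["user:alice:r-x"])))
```

Returning a list rather than a shell string means the result can be passed to `subprocess.run` without shell quoting issues.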

Test - Benchmark the cluster operational metrics, test system configuration for operation and efficiency

  • Execute file system commands via HttpFS

  • Efficiently copy data within a cluster/between clusters

  • Create/restore a snapshot of an HDFS directory

  • Get/set ACLs for a file or directory structure

  • Benchmark the cluster (I/O, CPU, network)
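
To make the HttpFS item above concrete, this sketch builds request URLs for the WebHDFS-compatible REST API that HttpFS exposes. It assumes the default HttpFS port 14000 and simple (pseudo) authentication through the `user.name` query parameter. The host name and user are hypothetical, and a Kerberos-secured cluster would need a different authentication step.

```python
from urllib.parse import quote, urlencode


def httpfs_url(host, path, op, user, port=14000, **params):
    """Build an HttpFS REST URL for one file system operation."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # user.name selects the HDFS user under pseudo authentication.
    query = urlencode({"op": op, "user.name": user, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# List a directory (an HTTP GET request).
print(httpfs_url("httpfs.example.com", "/user/alice", "LISTSTATUS", "alice"))

# Create a directory (an HTTP PUT request against this URL).
print(httpfs_url("httpfs.example.com", "/user/alice/new", "MKDIRS", "alice"))
```

The resulting URLs can be issued with any HTTP client, for example `curl -X PUT` for `MKDIRS`.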

Troubleshoot - Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios

  • Resolve errors/warnings in Cloudera Manager

  • Resolve performance problems/errors in cluster operation

  • Determine reason for application failure

  • Configure the Fair Scheduler to resolve application delays
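
As a rough illustration of the Fair Scheduler item above, the sketch below generates a small allocation file in the `fair-scheduler.xml` format, giving one queue a higher weight so its applications wait less. The queue names and values are hypothetical. On a cluster managed by Cloudera Manager, this configuration is normally edited through the dynamic resource pool settings rather than by hand, so treat this only as a picture of the underlying format.

```python
import xml.etree.ElementTree as ET


def build_allocations(queues):
    """Render a mapping of queue name to settings as an allocations XML string."""
    root = ET.Element("allocations")
    for name, settings in queues.items():
        queue = ET.SubElement(root, "queue", name=name)
        # Each setting becomes a child element, e.g. <weight>3.0</weight>.
        for key, value in settings.items():
            ET.SubElement(queue, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical pools: ETL jobs get three times the share of ad hoc jobs.
print(build_allocations({
    "etl": {"weight": 3.0, "schedulingPolicy": "fair"},
    "adhoc": {"weight": 1.0, "maxRunningApps": 5},
}))
```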

Our Approach

  • You will start by creating a Cloudera QuickStart VM (provided your laptop has a quad-core CPU and 16 GB of RAM). This will help you get comfortable with Cloudera Manager.

  • You will be able to sign up for GCP and receive up to $300 in credit while the offer lasts. Credits are valid for up to a year.

  • You will then get a brief overview of GCP and provision 7 to 8 virtual machines using templates. You will also attach an external hard drive to configure for HDFS later.

  • Once the servers are provisioned, you will set up Ansible for server automation.

  • You will set up a local repository for Cloudera Manager and the Cloudera Distribution of Hadoop using packages.

  • You will then set up Cloudera Manager with a custom database, followed by the Cloudera Distribution of Hadoop using the wizard that comes with Cloudera Manager.

  • While setting up the Cloudera Distribution of Hadoop, you will set up HDFS, learn HDFS commands, set up YARN, configure HDFS and YARN high availability, learn about schedulers, set up Spark, transition to parcels, set up Hive and Impala, set up HBase and Kafka, and so on.

Course Chapters

  • 19 chapters
  • 164 lectures
  • Chapter 1: Introduction - CCA 131 Cloudera Certified Hadoop and Spark Administrator
  • Chapter 2: Getting Started - Provision instances from Google Cloud
  • Chapter 3: Getting Started - Setup local yum repository server – CDH
  • Chapter 4: Install CM and CDH - Setup CM, Install CDH and Setup Cloudera Management Service
  • Chapter 5: Install CM and CDH - Configure Zookeeper
  • Chapter 6: Install CM and CDH - Configure HDFS and Understand Concepts
  • Chapter 7: Install CM and CDH - Important HDFS Commands
  • Chapter 8: Install CM and CDH - Configure YARN + MRv2 and Understand Concepts
  • Chapter 9: Install CM and CDH - Configuring HDFS and YARN HA
  • Chapter 10: Install CM and CDH - YARN Schedulers – FIFO, Fair, and Capacity
  • Chapter 11: Install Other Components - Spark Overview and Installation
  • Chapter 12: Install Other Components - Configuring Database Engines – Hive and Impala
  • Chapter 13: Install Other Components - Configure Hadoop Ecosystem components
  • Chapter 14: Install Other Components - Install and Configure Kafka and HBase
  • Chapter 15: CCA 131 – Revision for the Exam - Install the Cluster
  • Chapter 16: CCA 131 – Revision for the Exam - Configure the Cluster
  • Chapter 17: CCA 131 – Revision for the Exam - Manage the Cluster
  • Chapter 18: CCA 131 – Revision for the Exam - Secure the Cluster
  • Chapter 19: CCA 131 – Revision for the Exam - Test and Troubleshoot the Cluster

Course Content

  • Learn Hadoop and Spark administration using CDH
  • Provision a cluster on GCP (Google Cloud Platform) to set up a Hadoop and Spark cluster using CDH
  • Set up Ansible for server automation to handle the prerequisites for the Hadoop and Spark cluster
  • Set up an 8-node cluster from scratch using CDH
  • Understand the architecture of HDFS, YARN, Spark, Hive, Hue, and many more


Reviews

  • Deepsankar Bhattacharya
    5.0

    This a good course . I like the most .

  • Nikhil Pachkawade
    5.0

    its so many details in depth looking forward to getting certifications in CDP by completing this course. AWESOME WAY OF EXPLANATION

  • Mohamed Saied
    4.0

    Thanks a lot for the valuable contents and explanation

  • Uladzimir Bartkevich
    3.5

    Course is good, it provides detailed info with end to end set up, main minus is that it is not updated with latest cloudera CDP installation
