Great Learning Education Centre

Big Data Consultant (Modules 4 & 7)

Course Information

  • 11 Jun 2021 (Fri) 7:00 PM - 10:00 PM
Registration period
4 Apr 2021 (Sun) - 10 Jun 2021 (Thu)
HKD 5,600
Course Level
Study Mode
12 Hour(s)
Unit A, 20/F, Success Commercial Building, 245-251 Hennessy Road, Wanchai, Hong Kong

Course Overview

About the Certificate

A Certified Big Data Consultant has demonstrated proficiency across a range of key Big Data topics, with a particular focus on Big Data analysis techniques and the processing and storage of Big Data datasets. The breadth of this curriculum enables a Certified Big Data Consultant to perform in a variety of capacities as a member of a Big Data solutions team.

A Certified Big Data Consultant has a sound understanding of fundamental and advanced Big Data concepts and terminology, and a thorough understanding of the Big Data analysis lifecycle that allows them to participate in Big Data adoption and planning projects. The program develops knowledge of various analysis techniques, including exploratory data analysis (EDA), basic statistical techniques and fundamental machine learning algorithms, together with the key characteristics and types of Big Data visualization tools. A Certified Big Data Consultant is conversant with the fundamental technology mechanisms required for implementing Big Data technology platforms, and understands the technology considerations and options for Big Data processing and storage.

Note that the Big Data Consultant certification program is based on vendor-neutral coverage of technologies and a broad treatment of various statistical techniques and fundamental machine learning algorithms. Attaining this certification does not require any knowledge of specific products or of the underlying mathematical formulas and code involved in performing analysis and processing or storing data. The certification imparts the skills and understanding required for successful adoption of Big Data, with a focus on Big Data analysis techniques and the required technology platform. This knowledge establishes a sound foundation that can be built upon with additional training, accreditation and experience.

Course Objectives

Module 4: Fundamental Big Data Analysis & Science (duration: 6 hours)

This module provides an in-depth overview of essential topic areas in data science and of the analysis techniques relevant and unique to Big Data. It emphasizes how analysis and analytics need to be carried out, individually and collectively, to address the distinct characteristics, requirements and challenges of Big Data datasets.

Module 7: Fundamental Big Data Engineering (duration: 6 hours)

This module explores introductory topics in data engineering (the field of developing data processing solutions) in the context of Big Data environments. Specifically, it covers concepts, techniques and technologies related to the processing and storage of Big Data datasets, including MapReduce and NoSQL, and it highlights the unique challenges faced when processing and storing such datasets. The MapReduce data processing engine, which is the de facto framework for batch processing of large amounts of data, is also explained in detail.

What you'll learn

  • Module 4
    • Data Science, Data Mining & Data Modeling
    • Big Data Dataset Categories
    • Exploratory Data Analysis (EDA) (including numerical summaries, rules & data reduction)
    • EDA analysis types (including univariate, bivariate & multivariate)
    • Essential Statistics (including variable categories & relevant mathematics)
    • Statistics Analysis (including descriptive, inferential, correlation, covariance & hypothesis testing)
    • Data Munging & Machine Learning
    • Variables & Basic Mathematical Notations
    • Statistical Measures & Statistical Inference
    • Distributions & Data Processing Techniques
    • Data Discretization, Binning, Clustering
    • Visualization Techniques & Numerical Summaries
    • Correlation for Big Data
    • Time Series Analysis for Big Data
  • Module 7
    • Data Engineering & Big Data Engineering Challenges
    • Big Data Storage Terminologies (including sharding, replication, CAP theorem, ACID, BASE)
    • Big Data Storage Requirements
    • On-Disk Storage (including distributed file systems & databases)
    • Introduction to NoSQL & NewSQL
    • NoSQL Rationale & Characteristics
    • NoSQL Database Types (including key-value, document, column-family and graph databases)
    • Big Data Processing Requirements
    • Big Data Processing (including batch mode and realtime mode)
    • Introduction to MapReduce for Big Data Processing (batch mode)
    • MapReduce Explained (including map, combine, partition, shuffle and sort, and reduce)
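
For readers who want a feel for the MapReduce stages named in Module 7, the sketch below walks a word count through map, combine, partition, shuffle and sort, and reduce. It is a single-process illustration only, not course material and not how any particular product implements MapReduce. All function names, the toy hash used for partitioning and the sample input are assumptions made for this example.

```python
from collections import defaultdict
from itertools import groupby


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def combine(pairs):
    """Combine: pre-aggregate counts on the mapper side to reduce data moved later."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    """Partition: pick a reducer for a key (deterministic toy hash, an assumption)."""
    return sum(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(mapper_outputs, num_reducers):
    """Shuffle and sort: route pairs to reducers, then sort each reducer's input by key."""
    buckets = [[] for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            buckets[partition(key, num_reducers)].append((key, value))
    return [sorted(bucket) for bucket in buckets]


def reduce_phase(sorted_pairs):
    """Reduce: sum the values of each run of identical keys."""
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(sorted_pairs, key=lambda kv: kv[0])
    }


def word_count(splits, num_reducers=2):
    """Run every stage in order over a list of input splits."""
    mapper_outputs = [combine(map_phase(split)) for split in splits]
    result = {}
    for reducer_input in shuffle_and_sort(mapper_outputs, num_reducers):
        result.update(reduce_phase(reducer_input))
    return result


# Hypothetical input: two splits, each a list of text lines.
splits = [["big data big"], ["data store", "big"]]
counts = word_count(splits)
print(counts)
```

In a real cluster each split would be processed on a different node and the shuffle would move data across the network; here everything runs in one process so the order of the stages is easy to follow.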
