Course Information
- Available
- *The delivery and distribution of the certificate are subject to the policies and arrangements of the course provider.
Course Overview
Extract, Transform and Load data using Pig to harness the power of Hadoop
Prerequisites: Working with Pig requires basic knowledge of SQL and a basic understanding of the Hadoop ecosystem and MapReduce.
Taught by a team that includes two Stanford-educated ex-Googlers and two ex-Flipkart Lead Analysts. The team has decades of practical experience with large-scale data processing jobs.
Pig is aptly named: it is omnivorous, will consume any data that you throw at it, and will bring home the bacon!
Let's parse that
omnivorous: Pig works with unstructured data. Many of its operations are very SQL-like, but Pig can perform them on data sets that have no fixed schema. Pig is great at wrestling data into a clean form that can be stored in a data warehouse for reporting and analysis.
bring home the bacon: Pig allows you to transform data in a way that makes it structured, predictable and useful, ready for consumption.
What's Covered:
Pig Basics: Scalar and Complex data types (Bags, Maps, Tuples), basic transformations such as Filter, Foreach, Load, Dump, Store, Distinct, Limit, Order by and other built-in functions.
Advanced Data Transformations and Optimizations: The mind-bending Nested Foreach, Joins and their optimizations using "parallel", "merge", "replicated" and other keywords, Cogroups and Semi-joins, and debugging with the Explain and Illustrate commands
Real-world example: Clean up server logs using Pig
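To give a feel for what a few of the operations named above do (Load, Filter, Foreach, Distinct, Order by, Limit, plus a simple grouping step), here is a minimal sketch in plain Python rather than Pig Latin. It is only an analogy, not course material: the "ip method path status" log layout, the sample lines and all function names are hypothetical, and real Pig would run these steps as distributed jobs on Hadoop.

```python
from collections import Counter

# Hypothetical log lines in a made-up "ip method path status" layout.
RAW_LOGS = [
    "10.0.0.1 GET /home 200",
    "10.0.0.2 GET /cart 500",
    "malformed line",
    "10.0.0.1 GET /home 200",
    "10.0.0.3 POST /cart 200",
    "10.0.0.2 GET /home 404",
    "10.0.0.4 GET /home 200",
]

def load(lines):
    """Like LOAD: split each line into fields; no schema is enforced yet."""
    return [line.split() for line in lines]

def clean(records):
    """Like FILTER then FOREACH: keep well-formed rows, project typed fields."""
    wellformed = [r for r in records if len(r) == 4 and r[3].isdigit()]
    return [(r[0], r[2], int(r[3])) for r in wellformed]

def distinct(rows):
    """Like DISTINCT: drop duplicate tuples, keeping first-seen order."""
    return list(dict.fromkeys(rows))

def top_paths(rows, n):
    """Like GROUP by path, COUNT, ORDER BY count DESC, LIMIT n."""
    counts = Counter(path for _, path, _ in rows)
    return counts.most_common(n)

cleaned = distinct(clean(load(RAW_LOGS)))
print(cleaned)                # five (ip, path, status) tuples
print(top_paths(cleaned, 1))  # [('/home', 3)]
```

The malformed line is dropped by the filter step and the repeated request is removed by the distinct step, which is the kind of cleanup the server-log example refers to.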
Course Content
- 9 sections
- 35 lectures
- Section 1 You, This Course and Us
- Section 2 Where does Pig fit in?
- Section 3 Pig Basics
- Section 4 Pig Operations And Data Transformations
- Section 5 Advanced Data Transformations
- Section 6 Optimizing Data Transformations
- Section 7 A real-world example
- Section 8 Installing Hadoop in a Local Environment
- Section 9 Appendix
What You’ll Learn
- Work with unstructured data to extract information, transform it and store it in a usable form
- Write intermediate-level Pig scripts to munge data
- Optimize Pig operations which work on large data sets
Reviews
- Kanchan
Very well explained: 1. where to use Pig, 2. how it is different from other big data technologies like Hive and Java. I always had this doubt. Thanks for the detailed explanation. I am loving it.
- Bielo López Lauber
It is an excellent course, combining good explanations with the clarity that characterizes the group of professionals behind it.
- Natarajan Selvam
The explanation is good, but it could have been better with a practical example session alongside. If you want to understand the concept and when and how to use it, this is a good session to take, but note that there is no hands-on work covered in this course.
- Yaseen Khan
Excellent, very thorough, and covers all the important material you need to understand it clearly. Fantastic!!!