Udemy

Data Science: Hands-on Diabetes Prediction with Pyspark MLlib

  • 12,240 students
  • Updated 9/2020
  • Certificate available
4.4 (239 ratings)

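Course Details

Enrollment period: Open year-round
Course level:
Learning mode:
Duration: 0 hours 46 minutes
Language of instruction: English
Instructor: School of Disruptive Innovation
Certificate
  • Available
  • *Certificates are issued and distributed according to the course provider's policies and arrangements.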

Course Overview

Diabetes Prediction using Machine Learning in Apache Spark

Would you like to build, train, test and evaluate a machine learning model that is able to detect diabetes using logistic regression?


This is a hands-on machine learning course: the dataset is provided during the lectures, and you build the project step by step as you watch. For the best learning experience, we highly recommend that you practice alongside the lectures.


You will learn more in this hour of practice than in hundreds of hours of unnecessary theoretical lectures.


Learn the most important aspects of Spark machine learning (Spark MLlib):


  • PySpark fundamentals and implementing Spark machine learning

  • Importing and working with datasets

  • Processing data for a machine learning model using Spark MLlib

  • Building and training a logistic regression model

  • Testing and analyzing the model


The entire course is divided into tasks, each carefully designed to give you the best learning experience. In this hands-on project, we will complete the following tasks:


  • Task 1: Project overview

  • Task 2: Intro to the Colab environment & installing dependencies to run Spark on Colab

  • Task 3: Clone & explore the diabetes dataset

  • Task 4: Data Cleaning

  • Task 5: Correlation & feature selection

  • Task 6: Build and train a logistic regression model using Spark MLlib

  • Task 7: Performance evaluation & testing the model

  • Task 8: Save & load model


About PySpark:


PySpark brings Apache Spark and Python together and is a tool used in big data analytics.

Apache Spark is an open-source cluster-computing framework built around speed, ease of use, and streaming analytics, whereas Python is a general-purpose, high-level programming language. The combination provides a wide range of libraries and is mainly used for machine learning and real-time streaming analytics.

In other words, PySpark is a Python API for Spark that lets you harness the simplicity of Python and the power of Apache Spark to tame big data. We will be using big data tools in this project.


Make a leap into data science with this Spark MLlib project and showcase your skills on your resume.


Click on the “ENROLL NOW” button and start learning.


Happy Learning.

Course Chapters

  • 6 chapters
  • 6 lessons
  • Chapter 1: Introduction
  • Chapter 2: Introduction to the project platform & installing dependencies
  • Chapter 3: Clone & Explore Diabetes Dataset
  • Chapter 4: Data Cleaning
  • Chapter 5: Build and Train Machine Learning Model
  • Chapter 6: Performance evaluation and saving the model

Course Content

  • Diabetes Prediction using Spark Machine Learning (Spark MLlib)
  • Learn PySpark fundamentals
  • Working with DataFrames in PySpark
  • Analyzing and cleaning data
  • Processing data for a machine learning model using Spark MLlib
  • Building and training a logistic regression model
  • Performance evaluation and saving the model

Reviews

  • Velampudi Rohit
    5.0

    very informative session with perfect example

  • Samraj
    4.0

    Well Explained but need some more in detail

  • Nagarjuna Pamulapati
    3.5

    Nicely explained the library and methods used in the ML model training, and other concepts. Thanks for this good video and explanation.

  • Naresh Kumar Reddy Pappireddy
    5.0

    I started with zero knowledge on data science but understood everything you taught. Thank You So Much
