Course Information
Course Overview
Learn how to scrape data from any static or dynamic / AJAX web page using Java in a short and concise way.
In this course you will learn everything you need to get started with web scraping using Java.
You will learn the concepts behind web scraping that you can apply to practically any web page (static AND dynamic / AJAX).
Course structure
We start with an overview of what web scraping is and what you can do with it.
Then we explain the difference between scraping static pages and scraping dynamic / AJAX pages. You learn how to classify a website into one of the two categories and then apply the right concept to scrape the data you want.
Next you will learn how to export the scraped data as either CSV or JSON. Both are popular formats that can be used for further processing.
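As a rough illustration of the CSV export step, here is a minimal sketch that uses only the Java standard library. The class name `CsvExport`, its methods, and the sample rows are hypothetical and are not taken from the course material; the quoting rule follows the common RFC 4180 convention.

```java
import java.util.List;
import java.util.stream.Collectors;

public class CsvExport {

    // Quote a field only when it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled, as in RFC 4180.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Join each row's fields with commas and the rows with line breaks.
    static String toCsv(List<List<String>> rows) {
        return rows.stream()
                .map(row -> row.stream()
                        .map(CsvExport::escape)
                        .collect(Collectors.joining(",")))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for scraped values.
        List<List<String>> rows = List.of(
                List.of("title", "price"),
                List.of("Java, 2nd edition", "29.99"));
        System.out.println(toCsv(rows));
    }
}
```

In a real scraper the rows would come from the parsed page, and the resulting string could be written to a file with `java.nio.file.Files.writeString`.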
Unfortunately, many websites try to block scrapers, and sometimes you simply do not want to be detected. In the section "Going undercover" you will learn how to stay undetected and avoid getting blocked.
At the end of the course you can download the full source code of all the lectures, and we look ahead to some advanced topics (private proxies, cloud deployment, multithreading, ...). Those advanced topics are covered in a follow-up course I am going to teach.
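The description does not say which techniques that section covers. One common first step, assumed here purely for illustration, is to send browser-like request headers. The sketch below uses the JDK's built-in `java.net.http` API rather than the libraries named in this course; the class name `BrowserLikeRequest`, the User-Agent string, and the URL are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class BrowserLikeRequest {

    // Hypothetical browser-like User-Agent value; any realistic string could be used.
    static final String USER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0 Safari/537.36";

    // Build (but do not send) a GET request carrying browser-like headers.
    static HttpRequest build(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .header("Accept-Language", "en-US,en;q=0.9")
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("https://example.com/");
        // Print the header that would be sent with the request.
        System.out.println(request.headers().firstValue("User-Agent").orElse("(none)"));
    }
}
```

The request would be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Headers alone are unlikely to be enough against stricter sites, and a target site's terms of use still apply.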
Why you should take this course
Stop imagining that you could scrape data from websites and use those skills in your next web project. You can do it now.
- Stay ahead of your competition
- Be more efficient and automate tedious, manual tasks
- Increase your value by adding web scraping to your skill set
Enroll now!
Course Content
- 6 sections
- 18 lectures
- Section 1 Course Introduction
- Section 2 Scraping static web pages
- Section 3 Scraping dynamic / AJAX web pages
- Section 4 Exporting your data
- Section 5 Going undercover
- Section 6 Conclusion
What You’ll Learn
- Have a solid understanding of web scraping with Java
- Be able to scrape practically any web page (static AND dynamic / AJAX) by learning the concepts behind web scraping
- Download, parse and extract data from websites with Jsoup
- Call web APIs in Java with Unirest
- Export your data as CSV or JSON
- Build web scrapers that stay undetected and do not get blocked or banned
Reviews
- Mohammad Aslam
Spend some time with the CSS selectors. It's really difficult to understand how the tags are being picked from the browser.
- Javier Arellano
The course is good to get started with the concepts behind web scraping, and you can actually make your own basic web scraper for static web pages after the first lessons. However, it is also true that most of the practical examples cannot be replicated anymore because the web sites have changed. Some topics you might want to check on your own (since they are missing in the course content, at the moment) are how to add cookies and headers to Unirest requests, as well as scraping dynamic web pages with Selenium, which is, in my experience, a better alternative than HtmlUnit.
- Manuel Rio Temes
The course is outdated. The examples it uses no longer match how things are today.
- Justas Markauskas
Very good explanation about topic. The only bad thing is examples, none of them is working, but everything else is perfect!!!