Udemy - Scrapy masterclass - Python web scraping and data pipelines

  • Category: Other
  • Type: Tutorials
  • Language: English
  • Total size: 2.8 GB
  • Uploaded by: freecoursewb
  • Downloads: 29
  • Last checked: Nov. 26th '22
  • Date uploaded: Nov. 25th '22
  • Seeders: 1
  • Leechers: 29

Infohash : 5B7ABCA8F5B0A0F26DE1FE05CEA61F3654EE8874

Scrapy masterclass: Python web scraping and data pipelines

Published 11/2022
Created by Ahmed Elfakharany
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz, 2 Ch
Genre: eLearning | Language: English | Duration: 40 lectures (5h 44m) | Size: 2.75 GB

Work on 7 real-world web-scraping projects using Scrapy, Splash, and Selenium. Build data pipelines locally and on AWS.

What you'll learn
Extract data from the most difficult websites using Scrapy
Build ETL pipelines and store data in CSV, JSON, MySQL, MongoDB, and S3
Avoid getting banned and evade bot-protection techniques
Use Splash for scraping JavaScript-powered websites
Harness the power of Selenium browser automation to scrape any website
Deploy your Scrapy bots in local and AWS environments

Requirements
Some Python background
All projects run on Python 3.10, so it must be installed
Familiarity with Linux is recommended but not strictly required
Familiarity with the HTTP protocol and HTML

Files:

[ DevCourseWeb.com ] Udemy - Scrapy masterclass - Python web scraping and data pipelines
  • Get Bonus Downloads Here.url (0.2 KB)
  • ~Get Your Files Here !
    1. Introduction
    • 1. Introduction.mp4 (61.5 MB)
    • 1. Introduction.srt (13.7 KB)
    • 1.1 Resources.html (0.1 KB)
    • 2. Scrapy installation.html (2.8 KB)
    2. Xpath first steps
    • 1. Xpath 101 node types.mp4 (39.0 MB)
    • 1. Xpath 101 node types.srt (13.7 KB)
    • 1.1 xpath_node_types.png (1.2 MB)
    • 2. Xpath 102 basic syntax.mp4 (70.9 MB)
    • 2. Xpath 102 basic syntax.srt (22.1 KB)
    • 2.1 XPath 102 Cheat Sheet.pdf (64.0 KB)
    • 3. XPath 103 Axes (Node Relations).mp4 (43.4 MB)
    • 3. XPath 103 Axes (Node Relations).srt (14.4 KB)
    • 3.1 XPath 103 Cheat Sheet Axes (node relations).pdf (51.5 KB)
    • 4. Revisiting our real-estate web scraping example.mp4 (76.1 MB)
    • 4. Revisiting our real-estate web scraping example.srt (14.1 KB)
    3. Hello Scrapy
    • 1. What is a web bot Is it ethical.mp4 (48.4 MB)
    • 1. What is a web bot Is it ethical.srt (14.4 KB)
    • 2. The Scrapy Shell.mp4 (68.5 MB)
    • 2. The Scrapy Shell.srt (19.4 KB)
    • 3. Creating your first Scrapy project.mp4 (69.3 MB)
    • 3. Creating your first Scrapy project.srt (18.0 KB)
    • 3.1 Create your own Scrapy project.html (0.1 KB)
    • 4. Creating your first Scrapy spider.mp4 (88.2 MB)
    • 4. Creating your first Scrapy spider.srt (19.2 KB)
    • 4.1 Create your own Scrapy spider.html (0.1 KB)
    • 5. Handling combined queries using the getall() method.mp4 (60.3 MB)
    • 5. Handling combined queries using the getall() method.srt (10.8 KB)
    • 5.1 Combining XPath queries.html (0.1 KB)
    • 6. Data cleansing using Item Loaders.mp4 (114.5 MB)
    • 6. Data cleansing using Item Loaders.srt (27.5 KB)
    • 6.1 Item Loaders.html (0.1 KB)
    • 6.2 The Scrapy project.html (0.1 KB)
    • 7. Pagination and link-following using Crawl Spiders.mp4 (99.0 MB)
    • 7. Pagination and link-following using Crawl Spiders.srt (17.9 KB)
    • 7.1 Crawl Spiders.html (0.1 KB)
    4. Scrapy web-scraping scenarios
    • 1. Login to websites.mp4 (45.9 MB)
    • 1. Login to websites.srt (15.3 KB)
    • 1.1 Login bot.html (0.1 KB)
    • 2. Changing the user-agent.mp4 (26.8 MB)
    • 2. Changing the user-agent.srt (6.5 KB)
    • 3. Handling AJAX requests 1.mp4 (70.2 MB)
    • 3. Handling AJAX requests 1.srt (17.2 KB)
    • 3.1 Handling AJAX requests.html (0.1 KB)
    • 4. Handling AJAX requests 2.mp4 (45.9 MB)
    • 4. Handling AJAX requests 2.srt (9.2 KB)
    • 4.1 Handling AJAX requests.html (0.1 KB)
    • 5. Handling AJAX requests 3.mp4 (38.2 MB)
    • 5. Handling AJAX requests 3.srt (7.1 KB)
    • 5.1 Handling AJAX requests.html (0.1 KB)
    • 6. Caching responses.mp4 (60.9 MB)
    • 6. Caching responses.srt (13.0 KB)
    • 7. Image harvesting.mp4 (147.1 MB)
    • 7. Image harvesting.srt (24.1 KB)
    • 8. Scraped images storage in FTP and AWS S3.mp4 (43.6 MB)
    • 8. Scraped images storage in FTP and AWS S3.srt (9.7 KB)
    • 8.1 Images storage to S3 and FTP.html (0.1 KB)
    5. Data transformation using Scrapy Pipelines
    • 1. Introduction and sample project (classifieds ads scraping).mp4 (140.2 MB)
    • 1. Introduction and sample project (classifieds ads scraping).srt (29.3 KB)
    • 1.1 Classifieds Ads project.html (0.1 KB)
    • 2. Removing ads with duplicate titles.mp4 (42.3 MB)
    • 2. Removing ads with duplicate titles.srt (8.6 KB)
    • 2.1 Remove duplicates pipeline.html (0.1 KB)
    • 2.2 Removing duplicates pipeline.html (0.1 KB)
    • 3. Removing ads with no phone numbers.mp4 (28.0 MB)
    • 3. Removing ads with no phone numbers.srt (5.5 KB)
    • 3.1 Dropping Ads with no phones pipeline.html (0.1 KB)
    6. Data loading (storage) using Scrapy's pipelines
    • 1. Storing scraped data in MongoDB.mp4 (66.0 MB)
    • 1. Storing scraped data in MongoDB.srt (19.5 KB)
    • 1.1 MongoDB pipeline.html (0.1 KB)
    • 2. Storing scraped data in MySQL.mp4 (74.9 MB)
    • 2. Storing scraped data in MySQL.srt (16.6 KB)
    • 2.1 MySQL Pipeline.html (0.1 KB)
    • 3. Using Vault to sore sensitive Scrapy settings.mp4 (73.0 MB)
    • 3. Using Vault to sore sensitive Scrapy settings.srt (17.3 KB)
    • 3.1 Using Vault to store sensitive data for Scrapy.html (0.1 KB)
    • 4. Storing data to AWS S3 bucket.mp4 (58.1 MB)
    • 4. Storing data to AWS S3 bucket.srt (14.6 KB)
    • 4.1 S3 Pipeline.html (0.1 KB)
    • 5. Using Amazon Glue and Athena to query the data from S3 (extra lecture).mp4 (59.8 MB)
    • 5. Using Amazon Glue and Athena to query the data from S3 (extra lecture).srt (13.1 KB)
    7. Scrapy Middleware (or how to avoid getting banned)
    • 1. Phone-models project and spider rate-limiting.mp4 (124.2 MB)
    • 1. Phone-models project and spider rate-limiting.srt (24.2 KB)
    • 1.1 Phone Models Project.html (0.1 KB)
    • 2. Rotating user-agents middleware.mp4 (52.2 MB)
    • 2. Rotating user-agents middleware.srt (10.4 KB)
    • 2.1 Rotating user-agents project.html (0.1 KB)
    • 3. Rotating proxies middleware.mp4 (83.5 MB)
    • 3. Rotating proxies middleware.srt (16.0 KB)
    • 3.1 Rotating proxies.html (0.1 KB)
    8. Handling JavaScript websites using Splash
    • 1. What is Splash.mp4 (32.5 MB)
    • 1. What is Splash.srt (10.9 KB)
    • 2. Introduction to Docker (optional).mp4 (54.9 MB)
    • 2. Introduction to Docker (optional).srt (18.9 KB)
    • 3. Test-driving Splash.mp4 (43.7 MB)
    • 3. Test-driving Splash.srt (10.4 KB)
    • 4. Integrating Scrapy with Splash.mp4 (112.0 MB)
    • 4. Integrating Scrapy with Splash.srt (22.1 KB)
    • 4.1 Wikipedia with Splash.html (0.1 KB)
    • 5. Dealing with infinitely-scrolling pages using Splash.mp4 (123.7 MB)
    • 5. Dealing with infinitely-scrolling pages using Splash.srt (27.8 KB)

Code:

  • udp://tracker.torrent.eu.org:451/announce
  • udp://tracker.tiny-vps.com:6969/announce
  • http://tracker.foreverpirates.co:80/announce
  • udp://tracker.cyberia.is:6969/announce
  • udp://exodus.desync.com:6969/announce
  • udp://explodie.org:6969/announce
  • udp://tracker.opentrackr.org:1337/announce
  • udp://9.rarbg.to:2780/announce
  • udp://tracker.internetwarriors.net:1337/announce
  • udp://ipv4.tracker.harry.lu:80/announce
  • udp://open.stealth.si:80/announce
  • udp://9.rarbg.to:2900/announce
  • udp://9.rarbg.me:2720/announce
  • udp://opentor.org:2710/announce