A multi-source Python web scraping project that retrieves the latest headline data from major UK news platforms. This project demonstrates core concepts such as HTTP requests, HTML parsing, data extraction, and modular scraper architecture.
- Automated retrieval of headline data
- Support for multiple news sources
  - BBC News
  - The Guardian UK
- Modular scraping functions (`scrape_bbc()`, `scrape_guardian()`, `scrape_all()`)
- Clean and structured console output
- Error-free BeautifulSoup parsing workflow
- Python 3
- Requests
- BeautifulSoup (bs4)
The scraper consists of three main components:
- `scrape_bbc()`: Fetches and extracts top headlines from BBC News.
- `scrape_guardian()`: Fetches and extracts top headlines from The Guardian UK.
- `scrape_all()`: Runs both scrapers together and prints the aggregated results in a clearly formatted structure.
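
The three components could be laid out roughly as in the sketch below. This is a minimal illustration, not the project's actual code: the URLs, the `User-Agent` header, the CSS selectors (`h2`, `h3`), and the helpers `fetch_html()` and `extract_headlines()` are all assumptions made for the example, and real page markup changes often enough that the selectors would need checking against the live sites.

```python
import requests
from bs4 import BeautifulSoup

# Assumed values for illustration only; adjust to the real pages.
BBC_URL = "https://www.bbc.co.uk/news"
GUARDIAN_URL = "https://www.theguardian.com/uk"
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; headline-scraper-sketch)"}


def extract_headlines(html, selector, limit=10):
    """Hypothetical helper: return up to `limit` unique, non-empty
    headline strings from the elements matching a CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    headlines = []
    for tag in soup.select(selector):
        if len(headlines) >= limit:
            break
        text = tag.get_text(" ", strip=True)
        if text and text not in headlines:
            headlines.append(text)
    return headlines


def fetch_html(url):
    """Hypothetical helper: download a page, raising on HTTP errors."""
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return response.text


def scrape_bbc():
    # "h2" is an assumed selector for BBC headline elements.
    return extract_headlines(fetch_html(BBC_URL), "h2")


def scrape_guardian():
    # "h3" is an assumed selector for Guardian headline elements.
    return extract_headlines(fetch_html(GUARDIAN_URL), "h3")


def scrape_all():
    """Run both scrapers and print the results grouped by source."""
    sources = {"BBC News": scrape_bbc, "The Guardian UK": scrape_guardian}
    for name, scraper in sources.items():
        print(f"\n=== {name} ===")
        try:
            headlines = scraper()
        except requests.RequestException as exc:
            # A failure in one source should not stop the other.
            print(f"  could not fetch headlines: {exc}")
            continue
        for index, headline in enumerate(headlines, start=1):
            print(f"  {index}. {headline}")
```

Calling `scrape_all()` would print a numbered headline list under each source name. Keeping the parsing step in a separate function that takes an HTML string means it can be tried on a saved page without any network access.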