Live demo: scraper.propertywebbuilder.com
From the team behind PropertyWebBuilder, the open-source real estate platform.
A real estate listing extraction API and Chrome extension. Given a property listing URL (or pre-rendered HTML), it returns structured data: title, price, coordinates, images, and 70+ fields across 22 supported portals in 12 countries.
Built with Astro (SSR mode), TypeScript, and Cheerio.
| Country | Portals |
|---|---|
| 🇬🇧 UK | Rightmove, Zoopla, OnTheMarket, Jitty |
| 🇺🇸 USA | Realtor.com, Redfin, Trulia, ForSaleByOwner, Zillow† |
| 🇦🇺 Australia | Domain, RealEstate.com.au |
| 🇪🇸 Spain | Idealista, Fotocasa, Pisos.com |
| 🇩🇪 Germany | ImmobilienScout24 |
| 🇳🇱 Netherlands | Funda |
| 🇮🇪 Ireland | Daft.ie |
| 🇵🇹 Portugal | Idealista PT |
| 🇮🇳 India | RealEstateIndia |
| 🇸🇪 Sweden | Hemnet† |
| 🇫🇷 France | SeLoger† |
| 🇮🇹 Italy | Immobiliare.it† |

† experimental: lower extraction rate
Portal count is derived from the PORTAL_REGISTRY in astro-app/src/lib/services/portal-registry.ts (single source of truth).
The project includes a Manifest V3 Chrome extension that makes extraction available with one click on any supported listing page.
- Badge indicator: green check on supported sites
- Haul collections: browse multiple listings, then view them all on a single results page
- Property card popup: image, price, stats, quality grade
- Copy to clipboard: JSON or listing URL
- No API key required: uses anonymous haul collections
Install (dev mode): open chrome://extensions/, enable Developer mode, click "Load unpacked", and select the chrome-extensions/property-scraper/ folder.
See the full Chrome Extension documentation for architecture details and configuration.
The extraction engine takes fully-rendered HTML and a source URL, then applies configurable JSON mappings (CSS selectors, script JSON paths, regex patterns, JSON-LD, flight data paths) to extract structured property data. No browser automation or JS rendering happens inside the engine itself; the caller provides the HTML.
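To make the idea concrete, here is a minimal sketch of one of those strategies, the script-JSON-path case: find a JSON blob embedded in a page's script tag and walk dot-paths from a mapping. The function names, the `window.PAGE_MODEL` variable, and the mapping shape are all illustrative assumptions, not the project's actual API.

```typescript
// Hypothetical sketch of a script-JSON-path extraction step. The caller
// supplies rendered HTML; we locate an embedded JSON model and walk
// dot-paths (e.g. "propertyData.bedrooms") taken from a field mapping.
type FieldMapping = { [field: string]: string }; // field name -> dot path

function walkPath(obj: unknown, path: string): unknown {
  return path
    .split(".")
    .reduce<any>((cur, key) => (cur == null ? undefined : cur[key]), obj);
}

function extractFromScriptJson(
  html: string,
  mapping: FieldMapping
): Record<string, unknown> {
  // Assumption: the page embeds its model as `window.PAGE_MODEL = {...}`.
  const match = html.match(/window\.PAGE_MODEL\s*=\s*(\{.*?\})\s*<\/script>/s);
  if (!match) return {};
  const model = JSON.parse(match[1]);
  const out: Record<string, unknown> = {};
  for (const [field, path] of Object.entries(mapping)) {
    out[field] = walkPath(model, path);
  }
  return out;
}

// Usage with a toy page and mapping:
const html = `<script>window.PAGE_MODEL = {"propertyData":{"prices":{"primaryPrice":"£450,000"},"bedrooms":3}}</script>`;
const result = extractFromScriptJson(html, {
  price: "propertyData.prices.primaryPrice",
  bedrooms: "propertyData.bedrooms",
});
console.log(result); // { price: '£450,000', bedrooms: 3 }
```

The real engine layers several such strategies (CSS selectors via Cheerio, JSON-LD, regex, post-processing) driven by the per-portal mapping files described below.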
- User browses a supported listing page; the extension badge turns green
- Click the extension icon to extract the current listing
- Results are collected into an anonymous haul; no login required
- A shareable results page shows all collected listings with comparison data
```sh
cd astro-app
npm install
npm run dev
```

The dev server starts at http://localhost:4321. You can extract a listing via the web UI or the API.
```
POST /extract/url
Content-Type: application/x-www-form-urlencoded

url=https://www.rightmove.co.uk/properties/168908774
```

```
POST /extract/html
Content-Type: application/x-www-form-urlencoded

url=https://www.rightmove.co.uk/properties/168908774&html=<html>...</html>
```
```
GET /public_api/v1/listings?url=https://www.rightmove.co.uk/properties/168908774
GET /public_api/v1/supported_sites
GET /public_api/v1/health
```
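Because the listing URL travels in a query parameter, it should be percent-encoded when you build the request programmatically. A small helper sketch (the base URL and function name are illustrative, not part of the project):

```typescript
// Build a public-API query URL for a listing. The listing URL must be
// percent-encoded because it contains ':', '/', and possibly '&'.
const API_BASE = "https://scraper.propertywebbuilder.com"; // or http://localhost:4321 in dev

function listingsQueryUrl(listingUrl: string, base: string = API_BASE): string {
  return `${base}/public_api/v1/listings?url=${encodeURIComponent(listingUrl)}`;
}

const q = listingsQueryUrl("https://www.rightmove.co.uk/properties/168908774");
console.log(q);
// https://scraper.propertywebbuilder.com/public_api/v1/listings?url=https%3A%2F%2Fwww.rightmove.co.uk%2Fproperties%2F168908774

// Fetching the structured data (requires a reachable server):
// const data = await fetch(q).then((r) => r.json());
```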
```
POST /ext/v1/hauls              # Create anonymous haul
GET  /ext/v1/hauls/:id          # Get haul summary
POST /ext/v1/hauls/:id/scrapes  # Add extraction to haul
```
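These endpoints compose into a simple workflow: create a haul, add extractions to it, then read the summary. The sketch below only builds the request sequence as data so the order is explicit; the body shape for adding a scrape is an assumption, not the documented schema (see DESIGN.md for that), and actually sending the requests needs a running server.

```typescript
// Request descriptions for the anonymous-haul workflow (paths from the
// endpoint list above; body shapes are illustrative assumptions).
type ApiRequest = { method: "GET" | "POST"; path: string; body?: unknown };

function createHaul(): ApiRequest {
  return { method: "POST", path: "/ext/v1/hauls" };
}

function addScrape(haulId: string, listingUrl: string): ApiRequest {
  // Hypothetical body; consult DESIGN.md for the real schema.
  return {
    method: "POST",
    path: `/ext/v1/hauls/${haulId}/scrapes`,
    body: { url: listingUrl },
  };
}

function haulSummary(haulId: string): ApiRequest {
  return { method: "GET", path: `/ext/v1/hauls/${haulId}` };
}

// e.g. send each with: await fetch(base + req.path, { method: req.method, ... })
const seq = [
  createHaul(),
  addScrape("abc123", "https://www.rightmove.co.uk/properties/168908774"),
  haulSummary("abc123"),
];
console.log(seq.map((r) => `${r.method} ${r.path}`));
```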
See DESIGN.md for the full API endpoint reference and architecture details.
An MCP server (astro-app/mcp-server.ts) enables Claude Code to capture rendered HTML directly from Chrome via the MCP Bridge extension. Start it with:
```sh
npx tsx astro-app/mcp-server.ts
```

Run the test suite with:

```sh
cd astro-app
npx vitest run
```

```
property_web_scraper/
├── astro-app/                # Astro 5 SSR application (active development)
│   ├── src/lib/extractor/    # Core extraction pipeline
│   ├── src/lib/services/     # URL validation, auth, rate limiting
│   ├── src/pages/            # Astro pages and API endpoints
│   ├── test/                 # Vitest tests and HTML fixtures
│   └── scripts/              # CLI utilities (capture-fixture)
├── chrome-extensions/        # Chrome extensions
│   ├── property-scraper/     # Public extension (one-click extraction popup)
│   └── mcp-bridge/           # Dev extension (WebSocket bridge to MCP server)
├── config/scraper_mappings/  # JSON mapping files per portal
│   └── archive/              # Legacy mappings (kept for reference)
├── app/                      # Legacy Rails engine (see RAILS_README.md)
└── spec-archive/             # Archived Rails RSpec tests (not run in CI)
```
Each supported site has a JSON mapping file in config/scraper_mappings/ with a country-code prefix (e.g. uk_rightmove.json, es_idealista.json). These define CSS selectors, script JSON paths, regex patterns, and post-processing rules for extracting fields from that site's HTML.
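For a rough sense of the shape, a mapping file might look something like the sketch below. The field names, strategy keys, and post-processing syntax here are simplified and hypothetical; consult the real files in config/scraper_mappings/ (e.g. uk_rightmove.json) for the actual schema.

```json
{
  "portal": "example_portal",
  "fields": {
    "title": { "strategy": "css", "selector": "h1.property-title" },
    "price": {
      "strategy": "css",
      "selector": ".price",
      "post_process": [{ "type": "regex", "pattern": "[\\d,]+" }]
    },
    "latitude": { "strategy": "script_json", "path": "pageModel.location.latitude" }
  }
}
```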
PropertyWebScraper is part of the PropertyWebBuilder ecosystem. These projects all use it as their extraction backend:
| Project | What it does | Stack |
|---|---|---|
| HomesToCompare | AI-powered side-by-side property comparisons with 11 analysis sections and Firestore sync | Astro, React, Firestore |
| HousePriceGuess | Gamified property price guessing with AI dossiers, 18+ white-label brands, and embeddable widgets | Astro, React, Tailwind |
| SinglePropertyPages | SaaS for dedicated property microsites with lead capture, analytics, and WYSIWYG editor | Astro, TypeScript |
| PropertySquares | 48-step first-time buyer journey across multiple markets | Astro, TypeScript |
Building a real estate project? PropertyWebScraper gives you structured listing data from 22 portals in 12 countries via a simple API. Open an issue to get your project listed here.
This project was originally a Ruby on Rails engine. The Rails code in app/ is kept for legacy purposes but is no longer under active development. See RAILS_README.md for details.
The easiest way to contribute is to add a scraper for a property portal in your country. We have a step-by-step guide in CONTRIBUTING.md that walks you through the process; no deep knowledge of the codebase is required.
We also welcome bug fixes, test improvements, and documentation updates. See the open issues for ideas.
If you like this project, please star it and spread the word on Twitter, LinkedIn and Facebook.
Available as open source under the terms of the MIT License.
While scraping can sometimes be used as a legitimate way to access all kinds of data on the internet, it's also important to consider the legal implications. There are cases where scraping data may be considered illegal, or open you to the possibility of being sued.
This tool was created in part as a learning exercise and is shared in case others find it useful. If you do decide to use this tool to scrape a website, it is your responsibility to ensure that what you are doing is legal.