How to Scrape Zillow Data with Python and Rotating Residential Proxies
Jun 26, 2025
Learn step-by-step methods—from simple HTML parsing to JSON extraction—to scrape Zillow info efficiently and ethically using GoProxy’s rotating residential proxies.
Up-to-date data can make all the difference in real estate. Zillow, a leading online real estate platform, offers a treasure trove of information—property listings, prices, Zestimates, and market trends. Whether you’re a real estate agent tracking local markets, a developer building a property app, a researcher studying housing dynamics, or an investor hunting for deals, scraping Zillow can unlock the insights you need.
This guide walks you through both paths: a quick HTML scraper built on requests and BeautifulSoup, and a more robust approach that parses Zillow’s embedded JSON. Along the way you’ll learn how to overcome anti-scraping measures with rotating residential proxies, and how to do so responsibly.
Web scraping uses automated scripts to extract data from websites. For Zillow, this means collecting details like property prices, addresses, square footage, and Zestimates (Zillow’s estimated market values).
Market Analysis: Track price trends and rent vs. sale fluctuations.
Lead Generation: Identify new or price-reduced listings for outreach.
Data Science: Build ML models on real estate features.
Competitive Insight: Map listing density and spot underserved areas.
Real Estate Agents: Monitoring price trends in specific neighborhoods.
Developers: Aggregating listings for apps or platforms.
Researchers: Analyzing housing market shifts over time.
Investors: Identifying undervalued properties for investment.
Scraping Zillow saves time compared to manual data collection and enables large-scale analysis, but it comes with technical and ethical challenges we’ll address.
Zillow’s pages contain valuable data, but you need to pinpoint it. Common targets include listing prices, addresses, square footage, and Zestimates.
Use your browser’s Developer Tools (Inspect → Elements) to confirm selectors. For pagination or filter parameters, adjust URLs (e.g., ?beds=2&price=500000-700000).
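For example, a small helper can assemble filtered search URLs. This is a sketch: the beds and price parameter names come from the example URL above, and the exact format is an assumption you should verify in your own browser.
python
from urllib.parse import urlencode

def build_search_url(city_slug, **filters):
    # e.g. build_search_url("San-Francisco_rb", beds=2, price="500000-700000")
    base = f"https://www.zillow.com/homes/{city_slug}/"
    query = urlencode(filters)
    return f"{base}?{query}" if query else base

print(build_search_url("San-Francisco_rb", beds=2, price="500000-700000"))
# https://www.zillow.com/homes/San-Francisco_rb/?beds=2&price=500000-700000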
1. Respect Robots.txt & TOS: Check Zillow’s terms of service and robots.txt for compliance guidelines.
2. Avoid PII: Don’t collect sensitive data (e.g., owner names or contacts) without consent.
3. Rate-Limit: Keep requests ≤60 per minute; randomize headers to mimic human behavior.
4. Rotate Proxies: Use GoProxy’s rotating residential proxies to mask your scraper and avoid blocks.
5. Use Delays: Add a random 1–3 second time.sleep() between requests to simulate natural browsing (see the sketch after this list).
6. Set User Agents: Mimic a real browser (e.g., Chrome) to reduce detection risks.
Responsible scraping reduces risks and respects Zillow’s servers.
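A minimal sketch combining points 3, 5, and 6 above. The User-Agent pool below is illustrative, not exhaustive; swap in current browser strings.
python
import random
import time

# Illustrative pool of desktop User-Agent strings; replace with current ones.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]

def polite_pause():
    # A random 1-3 second delay keeps you at or below roughly 60 requests/minute.
    time.sleep(random.uniform(1, 3))

def random_headers():
    return {"User-Agent": random.choice(USER_AGENTS)}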
Configure your proxy once and reuse it everywhere:
python
GOPROXY_USER = "your_username"
GOPROXY_PASS = "your_password"
GOPROXY_ENDPOINT = "proxy.goproxy.com:8000"  # Single rotating endpoint

# requests-style proxy map; the same gateway handles HTTP and HTTPS traffic.
proxies = {
    "http": f"http://{GOPROXY_USER}:{GOPROXY_PASS}@{GOPROXY_ENDPOINT}",
    "https": f"http://{GOPROXY_USER}:{GOPROXY_PASS}@{GOPROXY_ENDPOINT}",
}

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
}
Sign up for GoProxy, grab your credentials, and route requests through their proxies for seamless IP rotation.
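Before scraping, it’s worth a quick sanity check that traffic actually leaves through the proxy. A minimal sketch using httpbin.org as an IP-echo service (any equivalent service works); it reuses the proxies and HEADERS defined above.
python
import requests

# Each request through a rotating endpoint should report a different origin IP.
for _ in range(2):
    r = requests.get("https://httpbin.org/ip", headers=HEADERS, proxies=proxies, timeout=10)
    print(r.json()["origin"])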
A minimal scraper you can write and run in minutes. Here’s how:
Ensure Python 3.7+ is installed. In your terminal, run:
bash
pip install requests beautifulsoup4
python
import requests
url = "https://www.zillow.com/homes/San-Francisco_rb"
resp = requests.get(url, headers=HEADERS, proxies=proxies, timeout=10)
print("Status code:", resp.status_code) # Expect 200
python
from bs4 import BeautifulSoup
soup = BeautifulSoup(resp.text, "html.parser")
cards = soup.select("ul.photo-cards li article")
print("Listings found:", len(cards))
python
listings = []
for card in cards:
    price_tag = card.select_one(".list-card-price")
    address_tag = card.select_one("address")
    link_tag = card.select_one("a.list-card-link")
    price = price_tag.get_text(strip=True) if price_tag else "N/A"
    address = address_tag.get_text(strip=True) if address_tag else "N/A"
    url = link_tag["href"] if link_tag else "N/A"
    listings.append({"price": price, "address": address, "url": url})
Editor’s Tip: Always check for None in case a selector fails.
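One way to bake that tip into the loop above is a tiny helper. Note that safe_text is a hypothetical name, not part of BeautifulSoup’s API.
python
def safe_text(tag, default="N/A"):
    # Return stripped text when the selector matched, else a default.
    return tag.get_text(strip=True) if tag else default

# Inside the loop above:
# price = safe_text(card.select_one(".list-card-price"))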
Zillow splits results across pages. Here’s how to scrape the first three:
python
import time

all_listings = []
base = "https://www.zillow.com/homes/San-Francisco_rb"

for page in range(1, 4):  # pages 1-3
    page_url = f"{base}/{page}_p/"  # Zillow's paginated URL pattern; verify it in your browser
    resp = requests.get(page_url, headers=HEADERS, proxies=proxies, timeout=10)
    soup = BeautifulSoup(resp.text, "html.parser")
    cards = soup.select("ul.photo-cards li article")
    for card in cards:
        price_tag = card.select_one(".list-card-price")
        addr_tag = card.select_one("address")
        link_tag = card.select_one("a.list-card-link")
        all_listings.append({
            "price": price_tag.get_text(strip=True) if price_tag else "N/A",
            "address": addr_tag.get_text(strip=True) if addr_tag else "N/A",
            "url": link_tag["href"] if link_tag else "N/A",
        })
    time.sleep(3)
Write your collected listings to a CSV file:
python
import csv

with open("zillow_listings.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["price", "address", "url"])
    writer.writeheader()
    writer.writerows(all_listings)

print("Saved", len(all_listings), "listings.")
Run on 1–2 pages first.
Adjust CSS selectors if Zillow’s HTML changes.
Increase delays if you see rate-limit responses.
Parse Zillow’s embedded JSON for a more stable, data-rich approach.
bash
pip install httpx jmespath
python
import json
import jmespath
import httpx

MARKER = 'window.__INITIAL_STATE__ = '

def fetch_listings_json(url):
    with httpx.Client(proxies=proxies, headers=HEADERS, timeout=10) as client:
        r = client.get(url)
    text = r.text
    start = text.find(MARKER)
    if start == -1:
        return []  # Marker missing: page layout changed or a block page was served
    start += len(MARKER)
    end = text.find(';</script>', start)
    data = json.loads(text[start:end])
    return jmespath.search("searchResults.cat1.searchResults.listResults", data) or []

if __name__ == "__main__":
    listings = fetch_listings_json("https://www.zillow.com/homes/New-York_rb")
    for item in listings:
        print(item["zpid"], item["price"], item["addressStreet"])
When you need thousands of listings quickly, fetch multiple search URLs concurrently:
python
import asyncio
import json
import jmespath
import httpx

MARKER = 'window.__INITIAL_STATE__ = '

async def fetch(client, url):
    r = await client.get(url)
    text = r.text
    start = text.find(MARKER)
    if start == -1:
        return []  # Marker missing: skip this page rather than crash
    start += len(MARKER)
    end = text.find(';</script>', start)
    data = json.loads(text[start:end])
    return jmespath.search("searchResults.cat1.searchResults.listResults", data) or []

async def main(urls):
    async with httpx.AsyncClient(proxies=proxies, headers=HEADERS, timeout=10) as client:
        tasks = [fetch(client, url) for url in urls]
        results = await asyncio.gather(*tasks)
    all_items = [item for sub in results for item in sub]
    print("Total listings scraped:", len(all_items))

if __name__ == "__main__":
    urls = [
        "https://www.zillow.com/homes/Los-Angeles_rb",
        "https://www.zillow.com/homes/Chicago_rb",
        # add more city or filter URLs
    ]
    asyncio.run(main(urls))
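Note that asyncio.gather fires all requests at once; with long URL lists that can overwhelm even a rotating proxy pool. Here’s a variant of main() that caps in-flight requests with asyncio.Semaphore; the limit of 5 is an arbitrary assumption to tune.
python
async def main(urls, limit=5):
    sem = asyncio.Semaphore(limit)  # at most `limit` requests in flight

    async def bounded(client, url):
        async with sem:
            return await fetch(client, url)  # reuses fetch() defined above

    async with httpx.AsyncClient(proxies=proxies, headers=HEADERS, timeout=10) as client:
        results = await asyncio.gather(*(bounded(client, u) for u in urls))
    all_items = [item for sub in results for item in sub]
    print("Total listings scraped:", len(all_items))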
Scraping isn’t always smooth. Here’s how to handle hiccups:
CAPTCHAs: Slow down requests or switch GoProxy IPs.
Blocked IPs: Increase proxy rotation frequency.
HTML/JSON Changes: Regularly re-inspect Zillow’s page and update selectors or paths.
JavaScript-Rendered Content: Use Selenium or Playwright if BeautifulSoup misses data; a minimal Playwright sketch follows.
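A minimal Playwright sketch, assuming Playwright is installed (pip install playwright, then playwright install chromium). The proxy settings reuse the GoProxy variables from the configuration step.
python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(proxy={
        "server": f"http://{GOPROXY_ENDPOINT}",
        "username": GOPROXY_USER,
        "password": GOPROXY_PASS,
    })
    page = browser.new_page(user_agent=HEADERS["User-Agent"])
    page.goto("https://www.zillow.com/homes/San-Francisco_rb/", wait_until="domcontentloaded")
    html = page.content()  # fully rendered HTML, ready for BeautifulSoup
    browser.close()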
Monitor Key Changes: Alert on missing JSON keys or empty results.
Dynamic Filters: Build URLs with query params (?beds=2&price=500000-700000) to focus your scrape.
Storage: Stream results into CSV, SQL, or a data lake for analysis.
Error Handling: Implement retries with exponential backoff; log errors without exposing sensitive details.
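A sketch of retries with exponential backoff around the earlier requests call; max_tries and the retryable status codes (429 and common 5xx) are illustrative choices, not Zillow-specific behavior.
python
import random
import time
import requests

def get_with_retries(url, max_tries=4):
    for attempt in range(max_tries):
        try:
            resp = requests.get(url, headers=HEADERS, proxies=proxies, timeout=10)
            if resp.status_code == 200:
                return resp
            if resp.status_code not in (429, 500, 502, 503):
                break  # non-retryable status; don't keep hammering the server
        except requests.RequestException as exc:
            # Log the failure type without exposing credentials or full URLs
            print(f"Attempt {attempt + 1} failed: {type(exc).__name__}")
        time.sleep((2 ** attempt) + random.random())  # 1s, 2s, 4s, 8s plus jitter
    return None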
Compared to alternatives, GoProxy stands out for Zillow scraping:
Rotating Residential IPs: Automatically rotate through 90M+ real residential IP addresses to avoid blocks.
High Reliability: Built for heavy scraping with automatic failover.
Ease of Use: Simple Python integration.
Scalability: Handles small tests to massive projects.
Cost-Effective: Offers a 7-day trial and unlimited plans.
Scraping Zillow with Python unlocks real estate insights, from price trends to investment opportunities. With the two approaches above, GoProxy’s rotating proxies, and ethical practices, you can build a reliable scraper tailored to your goals. Start small, refine your approach, and then scale up as needed.
Ready to dive in? Register for GoProxy’s 7-day trial! Need more? Check out unlimited plans. Or skip the setup—contact GoProxy for custom scraping services. Tell us your target data, and we’ll deliver!