
Texas Scraper Kit

$13.37 · 764 data sources · Docker + FeedEater · 12 months of updates
Buy on Gumroad — $13.37

What It Does

A self-hosted data engine that collects from 764 Texas public data sources — Secretary of State, Comptroller, county appraisal districts, Ethics Commission, Railroad Commission, TCEQ, TDLR, DPS, and more. Business filings, property records, campaign finance, environmental permits, licensing data, tax records.

Built on FeedEater — a modular TypeScript data engine with NATS messaging and PostgreSQL storage. Each data source is a module you can toggle on or off. Deploy with docker compose up. Manage via web UI, CLI, or REST API.

Architecture

$ docker compose up -d

[+] Running 5/5
 Container feedeater-nats       Started
 Container feedeater-postgres   Started
 Container feedeater-engine     Started
 Container feedeater-api        Started
 Container feedeater-ui         Started

$ curl localhost:3000/api/health
{
  "status": "healthy",
  "engine": "feedeater",
  "modules_loaded": 47,
  "sources_available": 764,
  "sources_enabled": 0,
  "database": "connected",
  "nats": "connected"
}

Five containers: NATS (message bus), PostgreSQL (storage), FeedEater engine (module runner), REST API (query + control), Web UI (dashboard). Everything talks over NATS. Modules publish data, the engine stores it, you query it.
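A setup script might check the health response shown above before enabling any sources. The sketch below types that response and lists what still needs attention. The field names come from the sample output; the `healthProblems` helper and its rules are illustrative assumptions, not part of FeedEater.

```typescript
// Shape of the /api/health response, taken from the sample output above.
interface HealthResponse {
  status: string;
  engine: string;
  modules_loaded: number;
  sources_available: number;
  sources_enabled: number;
  database: string;
  nats: string;
}

// Hypothetical helper: returns a list of problems, empty when the stack is ready to collect.
function healthProblems(h: HealthResponse): string[] {
  const problems: string[] = [];
  if (h.status !== "healthy") problems.push("status is " + h.status);
  if (h.database !== "connected") problems.push("database is " + h.database);
  if (h.nats !== "connected") problems.push("nats is " + h.nats);
  if (h.sources_enabled === 0) problems.push("no sources enabled yet");
  return problems;
}

// The fresh-install response from the transcript above.
const freshInstall: HealthResponse = {
  status: "healthy",
  engine: "feedeater",
  modules_loaded: 47,
  sources_available: 764,
  sources_enabled: 0,
  database: "connected",
  nats: "connected",
};

console.log(healthProblems(freshInstall));
```

For the fresh install shown above, the only reported problem is that no sources are enabled yet.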

Three Collection Modes

Bulk

Sources with downloadable datasets. Toggle one on and the module downloads the full dataset and stores it in Postgres. Comptroller Open Data (1,360 datasets), SOS bulk filings, and more.

Crawl

Sources with enumerable key spaces. The module iterates through ID ranges — entity numbers, license numbers, permit IDs — collecting everything. Rate-limited, resumable, runs in background.
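A rate-limited, resumable crawl over an ID range could look roughly like the sketch below. It is not FeedEater's actual module code: the `Checkpoint` shape, `resumeFrom`, `crawl`, and the caller-supplied `fetchRecord` and `saveCheckpoint` callbacks are all hypothetical names chosen for illustration.

```typescript
// Hypothetical checkpoint shape; FeedEater's real checkpoint format is not documented here.
interface Checkpoint {
  lastCompletedId: number;
}

// Picks the first ID to visit: one past the checkpoint, but never before the range start.
function resumeFrom(startId: number, checkpoint: Checkpoint | null): number {
  if (checkpoint === null) return startId;
  return Math.max(startId, checkpoint.lastCompletedId + 1);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Walks the ID range, checkpointing after each ID and pausing between requests.
// fetchRecord returns null for IDs that do not exist at the source.
async function crawl<T>(
  startId: number,
  endId: number,
  checkpoint: Checkpoint | null,
  fetchRecord: (id: number) => Promise<T | null>,
  saveCheckpoint: (c: Checkpoint) => Promise<void>,
  delayMs: number = 1000, // assumed to mirror a 1 request per second limit
): Promise<T[]> {
  const records: T[] = [];
  for (let id = resumeFrom(startId, checkpoint); id <= endId; id++) {
    const record = await fetchRecord(id);
    if (record !== null) records.push(record);
    await saveCheckpoint({ lastCompletedId: id });
    await sleep(delayMs);
  }
  return records;
}
```

Checkpointing after every ID means an interrupted run repeats at most one request when it restarts, at the cost of one checkpoint write per record.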

Query

Sources that require search parameters. Each query module has its own page in the web UI with the right input fields. Search on demand, save results to your local database.

Web UI

Open localhost:3000 in your browser. See all 764 sources. Toggle them on or off. Trigger manual scrapes. Search your collected data. Each query source has a dedicated page with the right input fields for that source. Monitor collection status: what's running, what's completed, and what's failed.

REST API

# List available sources
$ curl localhost:3000/api/sources

# Enable a source
$ curl -X POST localhost:3000/api/sources/texas-sos-business/enable

# Trigger an on-demand scrape
$ curl -X POST localhost:3000/api/sources/texas-hcad/scrape

# Query collected data
$ curl "localhost:3000/api/data/texas-hcad?search=austin&limit=10"

# Export to JSON
$ curl "localhost:3000/api/data/texas-sos-business?format=json" > export.json

The same API that powers the web UI. Point your AI agent, your RAG pipeline, or your scripts at it. No authentication required on localhost.
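For scripts or agents, the curl calls above can be wrapped in small URL builders. This is a minimal sketch: the paths and query parameters are copied from the examples above, while `BASE`, `actionUrl`, and `dataUrl` are hypothetical names, not a shipped client.

```typescript
// Base URL taken from the curl examples above.
const BASE = "http://localhost:3000/api";

// Builds the POST target for enabling a source or triggering a scrape.
function actionUrl(source: string, action: "enable" | "scrape"): string {
  return BASE + "/sources/" + encodeURIComponent(source) + "/" + action;
}

// Builds a GET URL for querying collected data, with URL-encoded query parameters.
function dataUrl(source: string, params: Record<string, string | number> = {}): string {
  const query = Object.keys(params)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(String(params[key])))
    .join("&");
  const path = BASE + "/data/" + encodeURIComponent(source);
  return query.length > 0 ? path + "?" + query : path;
}

console.log(actionUrl("texas-sos-business", "enable"));
console.log(dataUrl("texas-hcad", { search: "austin", limit: 10 }));
```

The resulting strings can be passed to any HTTP client; `enable` and `scrape` are POST requests, and data queries are GET requests, as in the curl examples.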

Data Sources

Secretary of State

Business filings, entity search, assumed names, UCC filings, notary records.

Comptroller

1,360 datasets on the Open Data Portal. Tax records, franchise tax, sales tax permits, state expenditures.

County Appraisal Districts

Property records across 254 counties. Assessed values, ownership, tax amounts, legal descriptions.

Ethics Commission

Campaign finance reports, lobbyist registrations, personal financial statements, political committee filings.

Railroad Commission

Oil and gas well data, pipeline permits, operator records, production reports.

TCEQ, TDLR, DPS & More

Environmental permits, professional licenses, criminal records, vehicle registrations, and dozens more state agencies.

Technical Specs

Engine: FeedEater (TypeScript)
Messaging: NATS
Database: PostgreSQL 14+
Containerization: Docker + Docker Compose
API: REST with OpenAPI docs at /api/docs
Web UI: Lightweight dashboard at localhost:3000
Data Sources: 764 across state, county, and municipal levels
Collection Modes: Bulk, Crawl, Query
Output Formats: JSON, CSV, PostgreSQL
Rate Limiting: Configurable per module (default 1 req/sec)
Retry Logic: Exponential backoff with configurable retries
Resume: Crawl jobs resume from last checkpoint
Logging: Structured JSON logs, configurable verbosity
License: MIT (modify, resell, do whatever you want)
Updates: 12 months included with purchase
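To illustrate the retry entry above, here is a minimal sketch of an exponential backoff schedule. The 1-second base, 60-second cap, and plain doubling are assumptions for the example; the kit's actual defaults and any jitter it applies are not stated here.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
// baseMs and maxMs are illustrative defaults, not FeedEater's documented values.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Full schedule for a given number of retries.
function backoffSchedule(retries: number, baseMs: number = 1000, maxMs: number = 60000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(backoffDelayMs(attempt, baseMs, maxMs));
  }
  return delays;
}

console.log(backoffSchedule(5)); // delays in ms: 1000, 2000, 4000, 8000, 16000
```

The cap keeps a long outage at a source site from pushing waits beyond a minute per retry.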

What You Get

Pricing

$13.37
One-time purchase. Not a subscription.
  • Full source code (MIT license)
  • 764 data source modules
  • Docker + FeedEater + NATS + PostgreSQL
  • Web UI + REST API + CLI
  • 12 months of updates

FAQ

Is this legal?

Yes. All sources are public records under Texas law. This tool accesses the same data available on government websites.

Why 764 sources?

Texas has data spread across dozens of state agencies, 254 counties, and hundreds of municipalities. Many agencies have multiple portals. We map every accessible source and build a module for it.

What do I need to run it?

Docker. That's it. docker compose up brings up the entire stack — engine, database, message bus, API, and web UI.

Can AI agents use this?

Yes. The REST API is designed for programmatic access. Point your agent at localhost:3000/api — enable sources, trigger scrapes, query data. No authentication on localhost.

What happens after 12 months?

You keep everything forever. After 12 months, you stop receiving module updates when source sites change. Renew for another $13.37/year, or maintain the modules yourself — it's MIT licensed.

Do I need an API key?

No. Zero external API dependencies. All modules scrape public websites directly.

Can I modify the code?

Yes. MIT licensed. Modify it, add your own modules, integrate it into your pipeline, resell it — no restrictions.

How is this different from BatchData or PropStream?

They charge $50-$500/month for access to their servers and limit you to property data. This is $13.37 one-time for the entire Texas public data landscape — property, business, environmental, licensing, campaign finance, and more. Runs on your machine. You own everything.

Will there be other states?

Yes. The FeedEater module architecture scales to any state. Texas is first. Same engine, different modules.

764 Texas public data sources. One command.
