How RoamFound is built
Where the data comes from, what we add, and how to flag errors.
Why this exists
The information about where to go for outdoor recreation in America already exists — it's just scattered across federal databases, state agency websites, OpenStreetMap edits, individual blogger posts, and the occasional Wikipedia article. RoamFound is an attempt to consolidate it into one practical, fact-checked directory: every named waterfall, every public hiking trail, every documented hot spring, organized by the city you're driving from.
Phase 1 — waterfalls
We launched with waterfalls because the search-engine landscape is unusually open: there's no dominant national directory, and the existing competitive sites are mostly individual bloggers writing about one corner of one state. Our spine of 5,663 named waterfalls comes from the following public sources:
OpenStreetMap (primary spine)
OpenStreetMap is a worldwide volunteer-maintained map of the planet, comparable to Wikipedia for geographic data. Contributors tag features including waterfalls (using the waterway=waterfall or natural=waterfall tags), often along with names, heights, elevation, and links to Wikipedia or Wikidata articles. We pull the U.S. coverage state-by-state via the Overpass API and refresh the data periodically.
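As a rough illustration of the per-state pull described above, the sketch below builds an Overpass QL query for the two waterfall tags. The function name, the timeout value, and the use of the ISO3166-2 area lookup are assumptions made for this example, not a description of the production query.

```python
def build_waterfall_query(state_code: str, timeout_s: int = 180) -> str:
    """Return an Overpass QL query for waterfall nodes inside one U.S. state."""
    iso = f"US-{state_code.upper()}"  # e.g. "US-OR"; state areas carry an ISO3166-2 tag
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        f'area["ISO3166-2"="{iso}"]->.state;\n'
        "(\n"
        '  node["waterway"="waterfall"](area.state);\n'
        '  node["natural"="waterfall"](area.state);\n'
        ");\n"
        "out body;"
    )


print(build_waterfall_query("or"))
```

The resulting string can be posted to any Overpass API endpoint. Looping over state codes keeps each request small enough to stay within typical server timeouts.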
OpenStreetMap data is licensed under the Open Database License, which requires attribution. You'll find the attribution in our footer on every page.
USGS Geographic Names Information System (GNIS)
The official U.S. Board on Geographic Names database, public domain, refreshed every two months by USGS. We pull the national text file (about 981,000 records) and filter it to the relevant feature class for each category: Falls for waterfalls, Spring for hot springs (with a name-pattern thermal filter on top). For waterfalls, GNIS adds 1,164 net-new records on top of OSM and cross-references another 1,358 with their formal feature IDs and county data. For hot springs, GNIS is the primary spine.
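The feature-class filter can be sketched in a few lines. The example assumes a pipe-delimited file with a header row and a `feature_class` column. The other column names and the sample rows are invented for illustration.

```python
import csv
import io

# Made-up sample rows; the real national file is far larger and has more columns.
SAMPLE = """feature_id|feature_name|feature_class|state_name
1|Example Falls|Falls|Oregon
2|Example Creek|Stream|Oregon
3|Example Hot Springs|Spring|Idaho
"""


def filter_by_class(handle, wanted_classes):
    """Keep only the rows whose feature_class is in wanted_classes."""
    reader = csv.DictReader(handle, delimiter="|")
    return [row for row in reader if row["feature_class"] in wanted_classes]


falls = filter_by_class(io.StringIO(SAMPLE), {"Falls"})
springs = filter_by_class(io.StringIO(SAMPLE), {"Spring"})
print([row["feature_name"] for row in falls])
```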
National Park Service (curated bounding boxes)
For waterfalls and hot springs inside the famous parks (Yosemite, Yellowstone, Glacier, Great Smoky Mountains, etc.), we attach a curated record with park name, fee, America the Beautiful (ATB) pass info, and a direct link to the official nps.gov page. We use bounding boxes plus a state-code sanity check rather than the live NPS API: the curated approach has no authentication dependency and covers ~80% of the value. A real NPS API integration is queued for a future pass to unlock per-feature trail data and live alerts.
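A minimal sketch of the bounding-box plus state-code check follows. The `ParkBox` type, the `attach_park` function, and the Yosemite coordinates are illustrative assumptions. The box is a rough approximation, not the curated value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParkBox:
    name: str
    states: frozenset  # two-letter state codes the park may fall in
    south: float
    west: float
    north: float
    east: float


# Rough, illustrative box only; not the curated production value.
PARKS = [
    ParkBox("Yosemite National Park", frozenset({"CA"}), 37.49, -119.89, 38.19, -119.20),
]


def attach_park(lat: float, lon: float, state: str) -> Optional[ParkBox]:
    """Return the first park whose box contains the point and whose state matches."""
    for park in PARKS:
        inside = park.south <= lat <= park.north and park.west <= lon <= park.east
        if inside and state in park.states:
            return park
    return None
```

The state-code check matters because rectangular boxes overshoot irregular park boundaries. A feature that falls inside the rectangle but carries the wrong state code is rejected rather than mislabelled.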
Wikipedia + Wikimedia Commons
For waterfalls with an OSM wikipedia or wikidata tag, we pull the Wikipedia REST API summary (extract + thumbnail URL + page URL) and (where Wikimedia permits) download the thumbnail to serve from our own domain. A separate search-by-name pass picks up well-known waterfalls whose OSM nodes don't carry a Wikipedia tag: we query Wikipedia by name + state and validate hits against feature-class keywords and geographic anchors before accepting. Attribution: CC BY-SA 4.0, footer-cited, with the source article linked from every excerpt.
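To make the three summary fields concrete, here is a hedged sketch of building the page-summary URL and picking the fields out of a response. The helper names are assumptions, and the payload used in the usage line is a trimmed stand-in rather than a real response.

```python
from urllib.parse import quote

SUMMARY_ENDPOINT = "https://en.wikipedia.org/api/rest_v1/page/summary/"


def summary_url(title: str) -> str:
    """Build the REST summary URL for an article title."""
    return SUMMARY_ENDPOINT + quote(title.replace(" ", "_"), safe="")


def extract_fields(payload: dict) -> dict:
    """Pull extract, thumbnail URL and page URL; missing keys become None."""
    return {
        "extract": payload.get("extract"),
        "thumbnail_url": payload.get("thumbnail", {}).get("source"),
        "page_url": payload.get("content_urls", {}).get("desktop", {}).get("page"),
    }


print(summary_url("Multnomah Falls"))
print(extract_fields({"extract": "Stand-in text."}))
```

Treating every field as optional keeps the pass from failing on articles that have no lead image.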
Coming next
Active enrichment passes:
- NPS /alerts API: surface "trail closed for snowpack", "boardwalk reroute", and "fire-related closures" notices on destination pages.
- OSM length-from-geometry for hiking trails: compute trail length from member-way geometries for the ~89% of OSM trails missing a distance tag.
- USDA Forest Service: ranger-district contact info for falls and hot springs in national forests, plus the Forest Service trails dataset. Public domain.
- Bureau of Land Management: for the wild Nevada/Oregon/Idaho hot springs on BLM land.
- State parks departments (50 states): varies by state, but many have public APIs or scrapeable directories.
Phase 1 — hot springs
The hot-springs category is a three-source pipeline: USGS GNIS (the spine), OpenStreetMap (editorial enrichment), and Wikipedia (extracts + photos for famous features).
Spine — USGS GNIS. We filter the GNIS feature_class="Spring" bucket (~37,000 records nationally, mostly cold artesian springs) to thermal features by name pattern: any feature with "hot spring", "warm spring", "geyser", "boiling spring", or "thermal" in the GNIS-recorded name. That yields ~780 GNIS thermal-named records.
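The name-pattern filter might look like the sketch below. Case-insensitive substring matching is an assumption, as are the function name and the sample names.

```python
import re

# Substrings taken from the list above; case-insensitive matching is assumed.
THERMAL_PATTERN = re.compile(
    r"hot spring|warm spring|geyser|boiling spring|thermal",
    re.IGNORECASE,
)


def is_thermal_name(name: str) -> bool:
    """True if the recorded name contains any of the thermal keywords."""
    return bool(THERMAL_PATTERN.search(name))


for candidate in ["Example Hot Springs", "Example Cold Spring", "Example Geyser"]:
    print(candidate, is_thermal_name(candidate))
```

Because this is a plain substring match, "hot spring" also catches the plural "Hot Springs". It will also accept any name that merely contains "thermal" as part of a longer word, which is one reason a name filter alone needs review.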
OSM enrichment — natural=hot_spring + natural=geyser. A separate Overpass pass pulls the U.S. OSM-tagged hot-spring nodes (about 2,200 nodes, ~780 of which carry a name tag). Where an OSM node matches a GNIS record by name + state + a 2-mile coord gate, the master builder stamps the OSM editorial tags (operator, bathing-allowed, fee, opening hours, water temperature, website, phone, description) onto the GNIS record. Where the OSM node doesn't match any GNIS record, it becomes a new master record with source="osm". This is the layer that adds Yellowstone's full geyser-field coverage on top of GNIS, and surfaces commercially-developed soaking pools (operator, fee, hours) on every page where OSM contributors have tagged them.
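The merge step above can be sketched as follows. The record layout (plain dicts with name, state, lat, lon, tags), the name normalisation, and the haversine distance are assumptions for illustration. Only the name + state + 2-mile gate and the source="osm" fallback come from the description.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def normalize(name):
    """Lower-case and collapse whitespace; None becomes an empty string."""
    return " ".join((name or "").lower().split())


def merge_osm_into_gnis(gnis_records, osm_nodes, gate_miles=2.0):
    """Stamp OSM tags onto matching GNIS records; unmatched nodes become new records."""
    master = [dict(rec, source="gnis") for rec in gnis_records]
    gnis_count = len(master)
    for node in osm_nodes:
        node_name = normalize(node.get("name"))
        match = None
        for rec in master[:gnis_count]:  # only compare against GNIS-sourced records
            if (
                node_name
                and normalize(rec["name"]) == node_name
                and rec["state"] == node["state"]
                and haversine_miles(rec["lat"], rec["lon"], node["lat"], node["lon"]) <= gate_miles
            ):
                match = rec
                break
        if match is not None:
            match.setdefault("osm_tags", {}).update(node.get("tags", {}))
        else:
            master.append({
                "name": node.get("name"), "state": node["state"],
                "lat": node["lat"], "lon": node["lon"],
                "osm_tags": dict(node.get("tags", {})), "source": "osm",
            })
    return master
```

The distance gate guards against same-named springs elsewhere in the same state being merged into one record.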
That brings the catalogue to 1,391 thermal features, heavily concentrated in the Rocky Mountain, Great Basin, and Cascades regions.
NPS attachment. Hot springs inside the famous parks (Yellowstone, Hot Springs NP, Lassen, etc.) get a curated NPS-unit record attached via bounding box plus state-code sanity check — same file as the waterfalls pipeline.
Wikipedia. A search-by-name pass queries Wikipedia for each indexable hot spring and validates hits by extract content + a geographic anchor (state, NPS park, county, metro). Pages with a Wikipedia article carry an extract block under CC BY-SA 4.0 and a locally-hosted thumbnail.
Every hot-spring page carries a prominent Safety & access block: GNIS catalogs the geographic feature, not its access status, and these pages span the full range from developed public soaking pools to private resorts to wild thermal water on federal/state land. NPS-attached features get an explicit "soaking is prohibited" note since that's the rule in Yellowstone and most thermal-features parks. Where OSM tags it as a developed feature with an operator or fee, an "Operations & visitor info" block surfaces what we know (operator, posted hours, bathing rules, water temperature, fee, website, phone) — always with the caveat that OSM tags can be out of date and the operator's own site is the source of truth.
Phase 1B — hiking trails
Hiking trails are catalogued from two sources merged together. (1) A hand-curated set of America's 11 federally-designated National Scenic Trails (Appalachian, Pacific Crest, Continental Divide, North Country, Ice Age, Florida, Pacific Northwest, New England, Natchez Trace, Potomac Heritage, Arizona) plus 26 famous regional long-distance routes (John Muir Trail, Long Trail, Colorado Trail, Tahoe Rim, Wonderland, Superior Hiking, etc.) — with hand-sourced distance, terminus points, maintaining organisation, and description. (2) An OpenStreetMap-spined catalogue of 13,322 additional named hiking-route relations (type=route, route=hiking) contributed by the OSM community.
OSM models long trails as fragmented sub-relations — the Pacific Crest Trail is split into 18 sections, the Appalachian Trail into 13+ — so the master builder applies a name-pattern collapse step: any OSM relation whose name matches a curated parent's collapse pattern (^PCT - , ^Appalachian Trail , etc.) gets folded into the parent record rather than rendered as its own thin page.
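A small sketch of the collapse step: each curated parent carries a regular expression, and any relation name that matches is folded into that parent. The two patterns are the ones quoted above. The parent identifiers and the sample relation names are hypothetical.

```python
import re
from typing import Optional

# Parent identifiers are hypothetical; the patterns are the two quoted above.
COLLAPSE_PATTERNS = {
    "pacific-crest-trail": re.compile(r"^PCT - "),
    "appalachian-trail": re.compile(r"^Appalachian Trail "),
}


def collapse_parent(relation_name: str) -> Optional[str]:
    """Return the curated parent a relation folds into, or None if it stands alone."""
    for parent_id, pattern in COLLAPSE_PATTERNS.items():
        if pattern.search(relation_name):
            return parent_id
    return None


print(collapse_parent("PCT - Example Section"))
print(collapse_parent("Example Loop Trail"))
```

Anchoring each pattern with `^` keeps unrelated trails that merely mention a long trail in the middle of their name from being swallowed by the parent.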
Indexability rule (intentionally conservative, to avoid the thin-page trap): a trail gets a detail page if it's a curated parent, OR has a recorded length of 10+ miles, OR carries a national/regional/international network designation, OR has a Wikipedia article. Most local-walking-network entries (city-park loops, neighborhood paths) are catalogued in our database but not rendered as standalone destination pages.
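Expressed as code, the rule is a simple OR over four signals. The `Trail` fields are assumptions, as is the mapping of "national/regional/international" onto the OSM network values nwn, rwn, and iwn.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping of national / regional / international onto OSM network values.
INDEXABLE_NETWORKS = {"nwn", "rwn", "iwn"}
MIN_LENGTH_MILES = 10.0


@dataclass
class Trail:
    name: str
    is_curated_parent: bool = False
    length_miles: Optional[float] = None
    network: Optional[str] = None
    has_wikipedia: bool = False


def is_indexable(trail: Trail) -> bool:
    """A trail gets a detail page if any one of the four signals is present."""
    long_enough = trail.length_miles is not None and trail.length_miles >= MIN_LENGTH_MILES
    return (
        trail.is_curated_parent
        or long_enough
        or trail.network in INDEXABLE_NETWORKS
        or trail.has_wikipedia
    )
```

Under this sketch a trail tagged only with the local walking network value (lwn) and no recorded length stays in the database without a page, which matches the city-park-loop case described above.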
Pending enrichment: National Park Service trails API (clean trail data inside National Parks with length and difficulty); U.S. Forest Service trail dataset; OSM length computation from member-way geometries for trails missing the distance tag. AllTrails-style elevation profiles are out of scope — we link to Wikipedia and the maintaining organisation rather than re-deriving that data.
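One way the pending length-from-geometry pass could work is to sum great-circle distances along each member way. This is a sketch under stated assumptions: ways are given as ordered lists of (lat, lon) pairs, the haversine formula is accurate enough, and member ways do not overlap.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def way_length_miles(coords):
    """Sum the segment lengths between consecutive nodes of one way."""
    return sum(haversine_miles(p, q) for p, q in zip(coords, coords[1:]))


def relation_length_miles(member_ways):
    """Total length of a route relation, assuming member ways do not overlap."""
    return sum(way_length_miles(way) for way in member_ways)
```

Relations that list the same way twice (for example, an out-and-back route) would be over-counted by this approach, so a real pass would likely need to deduplicate members first.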
What we do — and don't do — to the data
We do: deduplicate, slugify, compute drive-times from the nearest U.S. metro, surface practical fields (height, elevation, seasonality, accessibility) prominently, and write descriptive text where the source data is sparse.
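As an illustration of the slugify and deduplicate steps, the sketch below derives a URL slug from a name plus state code and keeps the first record per slug. Including the state in the slug and the first-wins policy are both assumptions, and the sample names are invented.

```python
import re
import unicodedata


def slugify(name: str, state: str) -> str:
    """Lower-case ASCII slug such as 'example-falls-or'."""
    text = unicodedata.normalize("NFKD", f"{name} {state}")
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    text = text.replace("'", "")  # drop apostrophes so "devil's" becomes "devils"
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


def dedupe_by_slug(records):
    """Keep the first record seen for each slug (assumed policy)."""
    seen = {}
    for rec in records:
        seen.setdefault(slugify(rec["name"], rec["state"]), rec)
    return seen


print(slugify("Example Falls", "OR"))
```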
We don't: invent height numbers, claim a waterfall is dog-friendly when we have no source for that, write SEO-padding paragraphs that don't add information, or generate AI prose presented as if a human wrote it. Where source data is thin, the page is short.
Editorial principles
- Cite the source. Every fact on a destination page should trace to a public record. We do not anonymise sources to look authoritative.
- Verify with the land manager. Conditions in the outdoors change. Trails close. Water levels drop. Parking rules shift. We add visible reminders to verify with the relevant agency before going.
- No paid placements. We do not accept payment to feature a destination, raise its ranking, or modify its description. The data is what the data is. (Affiliate links to accommodation booking are clearly disclosed; they don't influence which waterfalls we feature.)
- Corrections welcomed. If we have something wrong, tell us. We update pages as we learn.
Indexability — why some destinations have detail pages and others don't
Of our 5,663 catalogued falls, not all currently get a dedicated detail page. We mark a waterfall as "indexable" (gets its own page) if it has at least one of: a recorded height, a Wikipedia/Wikidata link, a tourism-attraction tag, an NPS-unit affiliation, or a location within 75 miles of a top-100 U.S. metro.
For hot springs, a record is indexable if it has at least one of: an NPS-unit affiliation, a Wikipedia article, an OSM editorial signal (operator name, tourism=attraction, bathing=yes, or posted opening hours), or a location within 75 miles of a top-100 U.S. metro. The OSM enrichment layer is what makes most Yellowstone-area features (and most developed soaking pools) clear the bar.
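The hot-springs rule can be sketched as a predicate over a record. The dict layout, the field names, and the precomputed `nearest_metro_miles` value are assumptions. The OSM keys checked are the four editorial signals listed above.

```python
METRO_RADIUS_MILES = 75.0


def has_osm_editorial_signal(tags: dict) -> bool:
    """Operator name, tourism=attraction, bathing=yes, or posted opening hours."""
    return bool(
        tags.get("operator")
        or tags.get("tourism") == "attraction"
        or tags.get("bathing") == "yes"
        or tags.get("opening_hours")
    )


def hot_spring_is_indexable(record: dict) -> bool:
    """True if the record has at least one of the four indexability signals."""
    metro_miles = record.get("nearest_metro_miles")  # assumed to be precomputed
    return bool(
        record.get("nps_unit")
        or record.get("has_wikipedia")
        or has_osm_editorial_signal(record.get("osm_tags", {}))
        or (metro_miles is not None and metro_miles <= METRO_RADIUS_MILES)
    )
```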
The rest are listed on the per-state index pages with name, location, and any other source data we have, but we don't generate a thin standalone page for them. As subsequent data passes (NPS alerts, USFS, BLM, state-park departments) layer in, more records cross the threshold and get promoted.
Reporting errors
If you visit a waterfall and our page is wrong — bad coordinates, wrong height, missing access info, dog policy that changed — tell us at /contact with topic "Correction or fact check". We acknowledge corrections within a few days and update the page.
Found a waterfall that should be listed but isn't? Same channel — pick "A place we should add". Include the name, state, approximate coordinates, and any source we can verify against (state park page, USGS quad, Wikipedia article, etc.).
Revenue model — for the curious
We earn revenue two ways:
- Affiliate links to accommodation. If you book lodging through a contextual link on one of our pages (typically the nearest town to a waterfall), we earn a small commission from the booking platform. The price you pay is identical to booking directly. We disclose this clearly on every page where it applies.
- Display advertising. Some pages carry Google ads. They are labelled and visually distinct from editorial content.
Neither revenue source influences which destinations we list, what we say about them, or where they rank in our database.