Data Sources and Ownership

Which files and runtime stores are authoritative for the catalog, generated article library, browser state, and optional backend features.

Source: apps/web/content/docs/development/runtime-authority.mdx

This page explains which data source owns each part of the product. Not every file in the repository has the same authority.

Authority Tiers

| Tier | Canonical assets | Used for |
| --- | --- | --- |
| Authored repository data | data/feeds.yaml, data/topics.yaml, JSON schemas | Source and taxonomy edits made by contributors |
| Derived repository data | data/feeds.enriched.yaml, exported JSON, OPML files | Generated outputs derived from the authored catalog |
| Runtime database | data/ai-web-feeds.db | Fetched entries, validation history, analytics, recommendations, and operational state |
| Generated article library | data/articles.generated.json | Standalone web browsing and post search |
| Web APIs and pages | /api/articles, /api/search, / | Reading and searching recent posts plus browsing the source catalog |
| Browser-local state | localStorage, IndexedDB | Reading state, local preferences, and device-scoped client state |
| Optional backend proxy | BACKEND_URL | Server-backed analytics and recommendations |

What Each Layer Owns

| Tier | Canonical assets | Used for |
| --- | --- | --- |
| Authored repository truth | data/feeds.yaml, data/topics.yaml, JSON schemas | contributor-edited catalog and taxonomy |
| Derived repository assets | data/feeds.enriched.yaml, exported JSON/OPML | checked-in outputs generated from authored inputs |
| Runtime article store | feed_entries in data/ai-web-feeds.db (canonical); fallback to data/aiwebfeeds.db only when the canonical file is missing | polled article rows and feed refresh state |
| Generated web corpus | data/articles.generated.json | self-contained article browse/search/read payload for the Next.js app |
| Web APIs and workspace | /api/articles, /api/search, /api/search/autocomplete, / | corpus-backed reading/search plus catalog-backed source discovery |
| Browser-local runtime | IndexedDB, localStorage, reader-local state | anonymous on-device reader preferences and article state |
| External backend proxy | BACKEND_URL-backed Python service | optional analytics logging, saved searches, and backend-proxied runtime routes |
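The runtime article store row describes a two-step lookup: open data/ai-web-feeds.db, and consult data/aiwebfeeds.db only when the canonical file is missing. A minimal Python sketch of that rule follows. The function name `resolve_runtime_db` is hypothetical and not taken from the repository, and the project's own lookup may differ in detail.

```python
from pathlib import Path
from typing import Optional

# File names come from this page; everything else here is illustrative.
CANONICAL_DB = "ai-web-feeds.db"
FALLBACK_DB = "aiwebfeeds.db"


def resolve_runtime_db(data_dir: Path) -> Optional[Path]:
    """Return the database file to open, or None when neither file exists."""
    canonical = data_dir / CANONICAL_DB
    if canonical.is_file():
        return canonical
    # The fallback is consulted only when the canonical file is missing.
    fallback = data_dir / FALLBACK_DB
    return fallback if fallback.is_file() else None
```

Returning None instead of raising leaves the caller free to decide whether a missing database is an error or simply means no runtime state has been collected yet.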

Use the YAML catalog and topic files when you are changing the source list itself.

Code Ownership Areas

| Context | Primary responsibility | Typical paths |
| --- | --- | --- |
| Docs shell | static docs, content rendering, and navigation | app/docs, content/docs, docs UI components |
| Public reader workspace | catalog, article search, saved state, and reader UX | app/(home), app/feeds components, browser state helpers |
| Admin | operational dashboards, telemetry inspection, and admin session enforcement | app/admin, components/admin, lib/auth.ts, lib/admin-auth-new.ts |
| Backend proxy | route handlers that normalize browser requests and forward them to Python services | app/api/**, lib/backend.ts, anonymous identity helpers |

Treat these as bounded code ownership areas inside one deployment, not as separate deployables or packages.

What each layer owns

Authored repository truth

  • data/feeds.yaml is the minimal curated source registry.
  • data/topics.yaml is the topic taxonomy.
  • schemas in data/*.schema.json define the validation contract for those files.

These files are the starting point for contributor changes and workflow intake.

Derived repository assets

  • data/feeds.enriched.yaml is derived from the authored feed catalog.
  • exported JSON and OPML files are downstream artifacts, not the canonical input.

Treat these as generated or regenerated outputs whenever the authored catalog changes.

Runtime database

Use the runtime database for operational behavior such as fetched entries, validation history, search state, and recommendation inputs.
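One quick way to inspect the runtime database is to count the polled rows in feed_entries, the table this page names for fetched articles. The sketch below assumes the .db file is a SQLite database, which this page does not state explicitly. The helper name `count_feed_entries` is hypothetical, and no column names are assumed.

```python
import sqlite3
from pathlib import Path


def count_feed_entries(db_path: str) -> int:
    """Count rows in feed_entries without assuming any column names."""
    # Build a read-only SQLite URI so inspection cannot alter operational state.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        (count,) = conn.execute("SELECT COUNT(*) FROM feed_entries").fetchone()
        return count
    finally:
        conn.close()
```

Opening the file in read-only mode is a deliberate choice for inspection: a mistyped statement fails instead of changing fetched entries or validation history.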

Generated article library

Use data/articles.generated.json when you want to validate the standalone web experience. That file is the main post-browsing dataset for the web app.

Browser-local state

Read, saved, starred, and archived state stays in the browser. That keeps the core reading flow available without forcing end-user accounts into the product.

Optional backend proxy

The backend remains optional. It is only needed for features that require live server-side state, such as backend analytics exports.

Practical Precedence Rules

The web app prefers repository catalog files in this order for source discovery:

  1. data/feeds.enriched.yaml
  2. data/feeds.yaml
  3. data/feeds.json
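The ordered list above can be read as "the first file that exists wins". A small Python sketch of that reading follows. The web app itself is a Next.js application, so this only illustrates the rule and is not its actual code. The name `pick_catalog_file` is hypothetical, and treating "prefers" as a first-existing-file check is an assumption.

```python
from pathlib import Path
from typing import Optional

# Order taken from the list above: enriched YAML, then authored YAML, then JSON.
CATALOG_PRECEDENCE = ("feeds.enriched.yaml", "feeds.yaml", "feeds.json")


def pick_catalog_file(data_dir: Path) -> Optional[Path]:
    """Return the highest-precedence catalog file present in data_dir."""
    for name in CATALOG_PRECEDENCE:
        candidate = data_dir / name
        if candidate.is_file():
            return candidate
    # None of the three catalog files is available.
    return None
```

For example, `pick_catalog_file(Path("data"))` returns the path to data/feeds.yaml in a checkout where the enriched file has not been generated yet.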

The web app prefers data/articles.generated.json for article browse/search.

  • If the corpus exists, /, /api/articles, article search, and autocomplete article suggestions use it.
  • If the corpus is missing or empty, the feeds workspace uses /api/feeds/posts/aggregate as a live fallback across the current source slice.
  • Live fetches from /api/feeds/posts/aggregate also remain available as a freshness overlay when a generated corpus is present.
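The corpus-first rule with a live fallback can be sketched as a single decision function. This is an illustration in Python rather than the app's own code. The name `choose_article_source` is hypothetical, and "empty" is assumed here to mean a file that is unreadable, is not valid JSON, or parses to an empty value. The real check may look at a specific key inside the payload.

```python
import json
from pathlib import Path

# Route name taken from this page; it serves as the live fallback.
LIVE_FALLBACK_ROUTE = "/api/feeds/posts/aggregate"


def choose_article_source(corpus_path: Path) -> str:
    """Return "corpus" when a usable generated corpus exists, else the live route."""
    try:
        payload = json.loads(corpus_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or invalid JSON: treat the corpus as unavailable.
        return LIVE_FALLBACK_ROUTE
    # An empty list or object counts as an empty corpus (assumption).
    return "corpus" if payload else LIVE_FALLBACK_ROUTE
```

The sketch covers only the primary-source decision. The freshness overlay described in the last bullet would be a separate, additional fetch on top of a "corpus" result.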

Practical rule

Edit YAML when you are changing the catalog itself. Use the runtime database when you are inspecting operational behavior or populating feed_entries. Use the generated corpus when you are validating the reader/search experience that ships in the standalone web app.
