    Data Sources and Ownership

    Which files and runtime stores are authoritative for the catalog, generated article library, browser state, and optional backend features.

    Source: apps/web/content/docs/development/runtime-authority.mdx

    This page explains which data source owns each part of the product. Not every file in the repository has the same authority.

    Authority Tiers

    | Tier | Canonical assets | Used for |
    | --- | --- | --- |
    | Authored repository data | data/feeds.yaml, data/topics.yaml, JSON schemas | Source and taxonomy edits made by contributors |
    | Derived repository data | data/feeds.enriched.yaml, exported JSON, OPML files | Generated outputs derived from the authored catalog |
    | Runtime database | data/ai-web-feeds.db | Fetched entries, validation history, analytics, recommendations, and operational state |
    | Generated article library | data/articles.generated.json | Standalone web browsing and post search |
    | Web APIs and pages | /api/articles, /api/search, / | Reading and searching recent posts plus browsing the source catalog |
    | Browser-local state | localStorage, IndexedDB | Reading state, local preferences, and device-scoped client state |
    | Optional backend proxy | BACKEND_URL | Server-backed analytics and recommendations |

    What Each Layer Owns

    | Tier | Canonical assets | Used for |
    | --- | --- | --- |
    | Authored repository truth | data/feeds.yaml, data/topics.yaml, JSON schemas | Contributor-edited catalog and taxonomy |
    | Derived repository assets | data/feeds.enriched.yaml, exported JSON/OPML | Checked-in outputs generated from authored inputs |
    | Runtime article store | articles in data/ai-web-feeds.db | Polled article rows and feed refresh state |
    | Generated web corpus | data/articles.generated.json | Self-contained article browse/search/read payload for the Next.js app |
    | Web APIs and workspace | /api/articles, /api/search, /api/search/autocomplete, / | Corpus-backed reading/search plus catalog-backed source discovery |
    | Browser-local runtime | IndexedDB, localStorage, reader-local state | Anonymous on-device reader preferences and article state |
    | External backend proxy | BACKEND_URL-backed Python service | Optional analytics logging, saved searches, and backend-proxied runtime routes |

    Use the YAML catalog and topic files when you are changing the source list itself.

    Derived repository data

    | Context | Primary responsibility | Typical paths |
    | --- | --- | --- |
    | Docs shell | Static docs, content rendering, and navigation | app/docs, content/docs, docs UI components |
    | Public reader workspace | Catalog, article search, saved state, and reader UX | app/(home), app/feeds components, browser state helpers |
    | Admin | Operational dashboards, telemetry inspection, and admin session enforcement | app/admin, components/admin, lib/auth.ts, lib/admin-auth-new.ts |
    | Backend proxy | Route handlers that normalize browser requests and forward them to Python services | app/api/**, lib/backend.ts, anonymous identity helpers |

    Treat these as bounded code ownership areas inside one deployment, not as separate deployables or packages.

    What each layer owns

    Authored repository truth

    • data/feeds.yaml is the minimal curated source registry.
    • data/topics.yaml is the topic taxonomy.
    • Schemas in data/*.schema.json define the validation contract for those files.

    These files are the starting point for contributor changes and workflow intake.

    Derived repository assets

    • data/feeds.enriched.yaml is derived from the authored feed catalog.
    • Exported JSON and OPML files are downstream artifacts, not the canonical input.

    Treat these as generated outputs, and regenerate them whenever the authored catalog changes.
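One way to notice that a derived file needs regenerating is to compare modification times. The sketch below is a hypothetical helper, not part of the project's tooling: the name `derived_is_stale` is an assumption, and modification times are only a heuristic (a fresh git checkout can reset them, so a content hash would be more robust).

```python
from pathlib import Path


def derived_is_stale(authored: Path, derived: Path) -> bool:
    """Return True when the derived file is missing or older than its authored input.

    Hypothetical helper: compares filesystem modification times only.
    """
    if not derived.exists():
        # Nothing has been generated yet, so the output is stale by definition.
        return True
    return derived.stat().st_mtime < authored.stat().st_mtime
```

A caller might run `derived_is_stale(Path("data/feeds.yaml"), Path("data/feeds.enriched.yaml"))` before deciding whether to rerun generation.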

    Runtime database

    Use the runtime database (data/ai-web-feeds.db) for operational data such as fetched entries, validation history, search state, and recommendation inputs.

    Generated article library

    Use data/articles.generated.json when you want to validate the standalone web experience. That file is the web app's main dataset for browsing and searching posts.

    Browser-local state

    Read, saved, starred, and archived state stays in the browser. That keeps the core reading flow available without requiring end-user accounts.

    Optional backend proxy

    The backend remains optional. It is only needed for features that require live server-side state, such as backend analytics exports.
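Since the backend is optional, code that depends on it can be gated on whether BACKEND_URL is set. The following is a minimal sketch of that idea in Python, not the project's actual configuration code; the function names `backend_url` and `analytics_available` are hypothetical, and treating a blank value as "not configured" is an assumption.

```python
import os
from typing import Mapping, Optional


def backend_url(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the configured backend URL, or None when no backend is configured.

    Assumption: an unset or blank BACKEND_URL means the backend is disabled.
    """
    source = os.environ if env is None else env
    value = source.get("BACKEND_URL", "").strip()
    return value or None


def analytics_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """Server-backed features are only offered when a backend URL is present."""
    return backend_url(env) is not None
```

Passing the environment as a parameter keeps the check easy to exercise without mutating the real process environment.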

    Practical Precedence Rules

    The web app prefers repository catalog files in this order for source discovery:

    1. data/feeds.enriched.yaml
    2. data/feeds.yaml
    3. data/feeds.json
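The preference order above amounts to a first-match lookup. The sketch below illustrates it in Python; it is not the web app's actual loader, and the names `CATALOG_CANDIDATES`, `resolve_catalog_path`, and `data_dir` are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Candidate catalog files, in the preference order listed above.
CATALOG_CANDIDATES = (
    "feeds.enriched.yaml",
    "feeds.yaml",
    "feeds.json",
)


def resolve_catalog_path(data_dir: Path) -> Optional[Path]:
    """Return the first catalog file that exists in data_dir, or None if none do."""
    for name in CATALOG_CANDIDATES:
        candidate = data_dir / name
        if candidate.is_file():
            return candidate
    return None
```

For example, `resolve_catalog_path(Path("data"))` would pick the enriched catalog when it is present and fall back to the authored YAML otherwise.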

    The web app prefers data/articles.generated.json for article browsing and search:

    • If the corpus exists, /, /api/articles, article search, and autocomplete article suggestions use it.
    • If the corpus is missing or empty, the feeds workspace shows corpus health and can load a bounded live sample for the current source slice.
    • Live fetches from /api/feeds/posts/aggregate also remain available as a freshness overlay when a generated corpus is present.
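The corpus-first fallback in the rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not the app's real logic: the function name `choose_article_source` and the return labels are hypothetical, and the corpus shape (either a JSON list of articles or an object with an "articles" list) is assumed because this page does not document it.

```python
import json
from pathlib import Path


def choose_article_source(corpus_path: Path) -> str:
    """Return "corpus" when a non-empty generated corpus exists, else "live-sample"."""
    try:
        payload = json.loads(corpus_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed corpus: fall back to the live sample.
        return "live-sample"
    # Assumption: the corpus is a list of articles, or an object with an "articles" list.
    articles = payload.get("articles", []) if isinstance(payload, dict) else payload
    if isinstance(articles, list) and articles:
        return "corpus"
    return "live-sample"
```

The freshness overlay from live fetches is a separate concern and is not modeled here; this sketch only covers the choice of primary source.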

    Practical rule

    Edit YAML when you are changing the catalog itself. Use the runtime database when you are inspecting operational behavior or populating articles. Use the generated corpus when you are validating the reader/search experience that ships in the standalone web app.
