Velohost

Astro Integration

Astro LLM

Extract deterministic, LLM-ready content from Astro builds at build time — no runtime JavaScript, servers, or magic.

Last Updated: 01 April 2026

Usage Snapshot

  • Downloads in Last 30 Days: 174
  • Latest npm Version: v0.2.0

Source: npm registry

Why This Plugin Exists

Large Language Models require clean, predictable, auditable source material. Runtime scraping and crawling introduce non-determinism and risk.

Astro LLM extracts readable content directly from your built HTML output in DOM order after astro build completes.

All behaviour is controlled by a single configuration file. Given the same input, the output is always identical.

Design Principles

  • Build-time only — runs after astro build
  • Deterministic output — same input, same output
  • Config-first behaviour via llm.config.json
  • Safety-by-default content stripping
  • LLM-friendly, auditable structure

What This Plugin Delivers

  • Scans generated HTML files in /dist
  • Extracts readable content in DOM order
  • Applies include and exclude rules explicitly
  • Strips sensitive data such as emails and phone numbers
  • Outputs a single TXT or JSON file
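The extraction step can be pictured with a small sketch. This is not the plugin's implementation: `extractReadableText` is a hypothetical helper, and it uses regular expressions rather than a real HTML parser, which is only adequate for simple, well-formed markup. Because it walks the HTML string from start to end, the text comes out in document order.

```typescript
// Hypothetical helper for illustration; not part of the astro-llm API.
// Elements whose entire content is dropped before text is collected.
const DROPPED_ELEMENTS = ["script", "style", "form", "noscript"];

const ENTITIES: Record<string, string> = {
  "&amp;": "&",
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
  "&nbsp;": " ",
};

export function extractReadableText(html: string): string {
  let out = html;
  // Remove whole elements (tags plus content) that should never reach the output.
  for (const tag of DROPPED_ELEMENTS) {
    const pattern = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?</${tag}\\s*>`, "gi");
    out = out.replace(pattern, " ");
  }
  // Remove HTML comments.
  out = out.replace(/<!--[\s\S]*?-->/g, " ");
  // Turn common block-level tags into line breaks so paragraphs stay separate.
  out = out.replace(
    /<\/?(p|div|section|article|main|header|footer|h[1-6]|li|ul|ol|br|tr|table)\b[^>]*>/gi,
    "\n",
  );
  // Strip every remaining tag, keeping only its text content.
  out = out.replace(/<[^>]+>/g, "");
  // Decode a handful of common entities in a single pass.
  out = out.replace(/&(amp|lt|gt|quot|#39|nbsp);/g, (m) => ENTITIES[m] ?? m);
  // Collapse whitespace and drop empty lines.
  return out
    .split("\n")
    .map((line) => line.replace(/\s+/g, " ").trim())
    .filter((line) => line.length > 0)
    .join("\n");
}
```

A production extractor would normally use a proper HTML parser, since regular expressions cannot handle nested or malformed markup reliably.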

Installation

npm install astro-llm

On first run, llm.config.json is created in the project root with explicit defaults. Once the file exists it is never overwritten, so later runs keep your edits.
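One way to get "create once, never overwrite" behaviour in Node.js is the `wx` file flag, which makes the write fail if the file already exists. The sketch below only illustrates that pattern: `ensureConfig` and the contents of `DEFAULT_CONFIG` are assumptions, not the plugin's actual code or default keys.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Placeholder defaults; the real keys written by the plugin are not shown here.
const DEFAULT_CONFIG = { format: "txt" };

// Returns true if the file was created, false if it already existed.
export function ensureConfig(projectRoot: string): boolean {
  const configPath = path.join(projectRoot, "llm.config.json");
  try {
    // "wx" creates the file for writing but throws EEXIST if it is already present.
    fs.writeFileSync(configPath, JSON.stringify(DEFAULT_CONFIG, null, 2) + "\n", {
      flag: "wx",
    });
    return true;
  } catch (err) {
    // An existing file is expected on every run after the first; leave it untouched.
    if ((err as { code?: string }).code === "EEXIST") return false;
    throw err;
  }
}
```

Checking for the file and then writing it in two separate steps would leave a small race window; the single `wx` write avoids that.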

FAQs

What does the Astro LLM plugin do?

Astro LLM extracts readable content from built Astro HTML files at build time and generates a single static context file suitable for LLM usage.

Does Astro LLM run at runtime?

No. Astro LLM runs exclusively at build time after astro build completes.

Implementation FAQs

What problem does Astro LLM solve?

Astro LLM provides a deterministic way to extract clean, auditable site content for LLM usage without runtime scraping or servers.

When does Astro LLM run?

Astro LLM runs after astro build completes and processes the generated HTML output.

What happens on first run?

On first run, llm.config.json is created in the project root with explicit defaults and is never overwritten again.

How is behaviour configured?

All behaviour is controlled by llm.config.json, including output format, inclusion rules, exclusions, and safety settings.
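As an illustration of explicit inclusion and exclusion rules, the sketch below filters page paths by prefix. The `PathRules` shape, its key names, and the prefix-matching semantics are assumptions made for this example; the keys actually used in llm.config.json may differ.

```typescript
// Hypothetical rule shape for illustration; not the plugin's documented schema.
interface PathRules {
  include: string[]; // path prefixes to keep; an empty list keeps everything
  exclude: string[]; // path prefixes to drop; applied after include
}

export function selectPages(paths: string[], rules: PathRules): string[] {
  const matches = (p: string, prefixes: string[]): boolean =>
    prefixes.some((prefix) => p.startsWith(prefix));
  return paths
    .filter((p) => rules.include.length === 0 || matches(p, rules.include))
    .filter((p) => !matches(p, rules.exclude)); // exclusion wins over inclusion
}

// Example: keep the docs section but drop its private subtree.
const selected = selectPages(
  ["/docs/b/", "/admin/", "/docs/a/", "/docs/private/x/"],
  { include: ["/docs/"], exclude: ["/docs/private/"] },
);
console.log(selected);
```

Letting exclusion take precedence keeps the rules easy to audit: a path listed under exclude never appears in the output, whatever the include list says.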

What output formats are supported?

Astro LLM supports TXT and JSON output formats, suitable for RAG pipelines and offline indexing.
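The sketch below shows how a single TXT or JSON file could be produced deterministically from extracted pages. The `PageText` record, the `renderOutput` function, and the exact TXT layout are assumptions for illustration, not the plugin's real output schema. Pages are sorted by path with a locale-independent comparison, and nothing time-dependent is written, so the same pages always yield the same bytes.

```typescript
// Hypothetical record and renderer; the plugin's real output layout may differ.
interface PageText {
  path: string;
  text: string;
}

type OutputFormat = "txt" | "json";

export function renderOutput(pages: PageText[], format: OutputFormat): string {
  // Copy before sorting so the caller's array is left untouched.
  const ordered = [...pages].sort((a, b) =>
    a.path < b.path ? -1 : a.path > b.path ? 1 : 0,
  );
  if (format === "json") {
    // Rebuild each object so the key order is fixed regardless of the input.
    const records = ordered.map(({ path, text }) => ({ path, text }));
    return JSON.stringify(records, null, 2) + "\n";
  }
  // TXT: one block per page, introduced by its path.
  return ordered.map((p) => `# ${p.path}\n${p.text}\n`).join("\n");
}
```

The JSON form suits pipelines that need per-page records, while the TXT form can be pasted or indexed as a single document.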

How does Astro LLM handle sensitive data?

When enabled, Astro LLM strips email addresses, phone numbers, scripts, forms, and inline JavaScript from output.
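Stripping emails and phone numbers from extracted text can be approximated with pattern matching, as in the sketch below. The `redactSensitive` function, the placeholder tokens, and both regular expressions are assumptions for illustration. They are heuristics that can miss unusual formats or match other long digit sequences, and they are not the plugin's actual rules.

```typescript
// Heuristic patterns for illustration only; real-world formats vary widely.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// An optional "+", then at least nine characters that start and end with a digit
// and contain only digits, spaces, dots, dashes, or parentheses in between.
const PHONE_PATTERN = /\+?\d[\d\s().-]{7,}\d/g;

export function redactSensitive(text: string): string {
  // Emails go first so digits inside an address are not treated as a phone number.
  return text.replace(EMAIL_PATTERN, "[email]").replace(PHONE_PATTERN, "[phone]");
}

console.log(redactSensitive("Call +44 20 7946 0958 or mail hi@example.com."));
```

Replacing matches with a fixed token, rather than deleting them, keeps the surrounding sentence readable and makes the redaction visible to anyone auditing the output.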

Project Links

Source code, package distribution, releases, and documentation.
