Build LLM context from any site
Markdown in, prompt-ready out.
Turn blogs, docs, and knowledge bases into clean Markdown ready to feed an LLM or a vector store.
LLMs are only as good as the context you give them. WebReaper discovers a site's URLs, fetches each page, and emits clean Markdown, with no schema to define.
Discover, then scrape
# Find the URLs you care about
webreaper map https://example.com --search /blog/ --max-urls 50
# Turn the whole site into Markdown, one JSON record per page
webreaper crawl https://example.com > corpus.jsonl
Each record is { "url": "...", "title": "...", "markdown": "..." }, ready to chunk, embed, and store. No selectors, no parsing code, no cleanup pass.
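As a rough illustration of the "chunk" step, here is a minimal Python sketch that reads records in the shape described above (url, title, markdown) and splits each page's Markdown into paragraph-based chunks. The function names, the `max_chars` limit, and the output fields `chunk` and `text` are assumptions made for this example. They are not part of WebReaper, and greedy paragraph packing is only one of several reasonable chunking strategies.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def chunk_markdown(markdown: str, max_chars: int = 1000) -> list[str]:
    """Greedily pack blank-line-separated paragraphs into chunks.

    Each chunk holds at most max_chars characters, except that a single
    paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for para in markdown.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def records_to_chunks(records: Iterable[dict], max_chars: int = 1000) -> Iterator[dict]:
    """Attach source metadata to every chunk (output field names are hypothetical)."""
    for rec in records:
        for i, text in enumerate(chunk_markdown(rec["markdown"], max_chars)):
            yield {"url": rec["url"], "title": rec["title"], "chunk": i, "text": text}
```

To run it against a real crawl, open corpus.jsonl and pass the file object to `iter_records`, then feed the result to `records_to_chunks`. Each yielded dict carries the page URL and title alongside the chunk text, so the source of every chunk stays traceable after it is embedded and stored.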
Ready to try it?
Install the CLI and run your first command in seconds.