Precomputed Tags: Replacing Dynamic Search with Pre-computed Indexes
When designing the tag system for VibeBlog, I ran into a key question: how do you implement high-performance tag-based browsing without using a database?
The traditional CMS approach is to scan every post dynamically on each request and then filter for posts matching the tag in a loop (a sketch of this approach follows the list below). This approach has several problems:
- ❌ Every request requires scanning the entire content folder
- ❌ Requires dynamic computation, unable to fully leverage SSG advantages
- ❌ Performance degrades linearly as the number of posts increases
- ❌ Not suitable for CDN caching
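To make the contrast concrete, here is a minimal sketch of that per-request scan, assuming post metadata lives as one JSON file per post under content/meta/ (the layout introduced later in this post):

```ts
// A sketch of the traditional approach: every request re-reads and re-parses
// every meta file just to filter by a single tag.
import * as fs from 'node:fs';
import * as path from 'node:path';

function postsForTag(tag: string) {
  const metaDir = 'content/meta';
  const matches: Array<Record<string, unknown>> = [];
  for (const file of fs.readdirSync(metaDir)) {
    if (!file.endsWith('.json')) continue;
    const meta = JSON.parse(fs.readFileSync(path.join(metaDir, file), 'utf-8'));
    if (meta.tags.includes(tag)) matches.push(meta);
  }
  return matches; // cost grows linearly with the number of posts, on every request
}
```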
Therefore, I chose the Precomputed Tags architecture.
What are Precomputed Tags?
The core concept of Precomputed Tags is: Calculate all classification relationships at build time, store them as static JSON files, and the frontend only needs to read them with zero computation.
It's like a library's index card system:
- Traditional way: Every time someone asks "What books are about LLM?", the librarian has to search through the entire library
- Precomputed way: The library has already prepared index cards, just flip to the "LLM" page, which lists all relevant book numbers
Data Structure Design
1. Post Metadata (One JSON per post)
Each post has a corresponding JSON file in the content/meta/ directory:
content/meta/how-i-use-vllm.json
```json
{
  "slug": "how-i-use-vllm",
  "title": "How I Use vLLM to Run 7B Models Locally",
  "date": "2025-12-02",
  "tags": ["LLM", "vLLM", "Local Deployment", "Experiment Notes"],
  "summary": "Recording my process of running vLLM on a single GPU at home.",
  "heroImage": "/images/2025-12-02-vllm-hero.png"
}
```
Tags are stored in the tags array here.
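For reference, the shape of this meta file can be captured in a small TypeScript interface (field names follow the example above; the actual project may type this differently):

```ts
// Shape of a per-post meta JSON file (derived from the example above)
interface PostMeta {
  slug: string;
  title: string;
  date: string;       // ISO date, e.g. "2025-12-02"
  tags: string[];     // the tags used to build tags.json
  summary: string;
  heroImage: string;  // public path to the hero image
}
```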
2. Pre-computed Tag Index (tags.json)
In content/indexes/tags.json, we pre-build a mapping of "tag → post slug list":
content/indexes/tags.json
```json
{
  "LLM": ["how-i-use-vllm", "llm-metrics", "runtime-benchmark"],
  "SvelteKit": ["vibeblog-infra", "routing-idea"],
  "AI-Blogging": ["ai-html-pipeline", "meta-generator"]
}
```
This file is automatically generated by a script before build, requiring no manual maintenance.
Implementation
Script to Generate tags.json
I wrote a simple Node.js script to scan all meta JSON files and automatically generate the index:
scripts/generate-tags-index.ts
```ts
import * as fs from 'node:fs';
import * as path from 'node:path';

const META_DIR = 'content/meta';
const tagsIndex: Record<string, string[]> = {};

// Scan all meta files
const jsonFiles = fs.readdirSync(META_DIR).filter((f) => f.endsWith('.json'));

for (const file of jsonFiles) {
  const filePath = path.join(META_DIR, file);
  const meta = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
  // Add the post slug to each of its tags
  for (const tag of meta.tags) {
    if (!tagsIndex[tag]) {
      tagsIndex[tag] = [];
    }
    tagsIndex[tag].push(meta.slug);
  }
}

// Write tags.json
fs.writeFileSync(
  'content/indexes/tags.json',
  JSON.stringify(tagsIndex, null, 2)
);
```
Run npm run generate:tags to update the index.
SvelteKit Side: Read Only, No Computation
In SvelteKit's server load function, we only need to read the pre-computed JSON:
```ts
// src/routes/tags/[tag]/+page.server.ts
import { getTagsIndex, getPostMeta } from '$lib/content';

export const load = ({ params }) => {
  const tagsIndex = getTagsIndex(); // Read pre-computed JSON
  const slugs = tagsIndex[params.tag] || [];
  // Only read needed post metas, no need to scan all
  const posts = slugs.map(slug => getPostMeta(slug));
  return { tag: params.tag, posts };
};
```
There is no dynamic computation at all, just a pure O(1) lookup into the pre-computed index.
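For completeness, here is a minimal sketch of what the getTagsIndex and getPostMeta helpers in $lib/content could look like. This is an assumed implementation, since the post only shows how they are called; the real project might load the files differently (for example via import.meta.glob):

```ts
// src/lib/content.ts - a minimal sketch of the helpers used above
import * as fs from 'node:fs';

export function getTagsIndex(): Record<string, string[]> {
  // tags.json is produced at build time by scripts/generate-tags-index.ts
  return JSON.parse(fs.readFileSync('content/indexes/tags.json', 'utf-8'));
}

export function getPostMeta(slug: string) {
  return JSON.parse(fs.readFileSync(`content/meta/${slug}.json`, 'utf-8'));
}
```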
Perfect Integration with AI Pipeline
This architecture is particularly suitable for AI-automated content production workflows:
```text
raw post
  → AI processing
  → processed HTML (AI-formatted HTML)
  → meta JSON (AI-generated metadata, including tags)
  → run generate:tags script
  → tags.json automatically updated
  → build → deploy
```
When AI generates a post, it can simultaneously:
- Analyze post content and automatically generate relevant tags
- Generate meta JSON (including tags array)
- Generate processed HTML
Then run npm run generate:tags once, and all classification indexes are automatically updated.
Advantages Summary
✅ Extremely High Performance
- All classifications completed at build time
- Frontend only reads JSON, zero computation
- Perfect match with SSG + CDN
✅ Fully Version Controllable
- Classifications are files, commits show changes
- All content is files, easy to track
✅ Suitable for Large Numbers of Posts
- Hundreds or thousands of posts make no practical difference
- Tag lookup is O(1), so pages don't slow down as the post count grows
✅ Suitable for AI Automation
- AI directly outputs classification data structures
- No need for complex database schemas
- All logic runs before build
✅ Perfect Match with "AI-Generated Posts" Concept
- AI determines tags when generating posts
- Automatically updates tags.json
- Fully automated content management workflow
Practical Applications
In VibeBlog, this architecture implements:
- /tags - All tags overview (read from tags.json)
- /tags/LLM - All LLM-related posts (O(1) query)
- Post list page displays tags (read from meta JSON)
- Post detail page displays tags (clickable to jump to tag page)
All these features are zero runtime computation, pure static data.
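If the whole site is prerendered, the tag route can also enumerate its own pages from the same index. The snippet below is an optional addition not shown above; it uses SvelteKit's standard entries export to list the parameters of the dynamic [tag] route at build time:

```ts
// src/routes/tags/[tag]/+page.server.ts (addition for a fully prerendered build)
import { getTagsIndex } from '$lib/content';

export const prerender = true;

// Tell the prerenderer which /tags/[tag] pages exist, straight from tags.json
export const entries = () =>
  Object.keys(getTagsIndex()).map((tag) => ({ tag }));
```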
Extensibility
This pattern can easily extend to other classification methods:
- content/indexes/years.json - Classify by year
- content/indexes/categories.json - Classify by category
- content/indexes/authors.json - Classify by author
Just generate the corresponding index file before build, and SvelteKit can use it directly.
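As an example, a year index can be generated with the same few lines. The following is a sketch using a hypothetical scripts/generate-years-index.ts, with the year taken from each meta file's date field:

```ts
// scripts/generate-years-index.ts (hypothetical name) - same pattern as the tag index
import * as fs from 'node:fs';
import * as path from 'node:path';

const META_DIR = 'content/meta';
const yearsIndex: Record<string, string[]> = {};

for (const file of fs.readdirSync(META_DIR).filter((f) => f.endsWith('.json'))) {
  const meta = JSON.parse(fs.readFileSync(path.join(META_DIR, file), 'utf-8'));
  const year = meta.date.slice(0, 4); // "2025-12-02" -> "2025"
  (yearsIndex[year] ??= []).push(meta.slug);
}

fs.writeFileSync('content/indexes/years.json', JSON.stringify(yearsIndex, null, 2));
```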
Conclusion
The Precomputed Tags architecture shows that you don't need a database or dynamic queries: with files and pre-computation alone, you can build a high-performance tag system.
This method is particularly suitable for:
- Static Site Generation (SSG)
- AI-automated content production
- Version-controlled content management
- Blog systems pursuing ultimate performance
In VibeBlog's implementation, this architecture not only solves the tag classification problem but also becomes the foundation of the entire AI-First content management system.
If you're building a similar system, try the Precomputed Tags architecture. You'll find it simpler, faster, and more controllable than traditional dynamic search approaches.