Precomputed Tags: Replacing Dynamic Search with Pre-computed Indexes
When designing the tag system for VibeBlog, I ran into a key question: how do you implement high-performance tag-based browsing without using a database?
The traditional CMS approach is to scan every post dynamically on each request and then filter for posts matching the tag in a loop (a sketch of this approach follows the list below). This approach has several problems:
- ❌ Every request requires scanning the entire content folder
- ❌ Requires dynamic computation, unable to fully leverage SSG advantages
- ❌ Performance degrades linearly as the number of posts increases
- ❌ Not suitable for CDN caching
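To make the contrast concrete, here is a minimal sketch of that per-request scan, assuming post metadata lives as one JSON file per post under content/meta/ (the layout introduced later in this post):

```ts
// A sketch of the traditional approach: every request re-reads and re-parses
// every meta file just to filter by a single tag.
import * as fs from 'node:fs';
import * as path from 'node:path';

function postsForTag(tag: string) {
  const metaDir = 'content/meta';
  const matches: Array<Record<string, unknown>> = [];
  for (const file of fs.readdirSync(metaDir)) {
    if (!file.endsWith('.json')) continue;
    const meta = JSON.parse(fs.readFileSync(path.join(metaDir, file), 'utf-8'));
    if (meta.tags.includes(tag)) matches.push(meta);
  }
  return matches; // cost grows linearly with the number of posts, on every request
}
```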
Therefore, I chose the Precomputed Tags architecture.
What are Precomputed Tags?
The core concept of Precomputed Tags is: Calculate all classification relationships at build time, store them as static JSON files, and the frontend only needs to read them with zero computation.
It's like a library's index card system:
- Traditional way: Every time someone asks "What books are about LLM?", the librarian has to search through the entire library
- Precomputed way: The library has already prepared index cards, just flip to the "LLM" page, which lists all relevant book numbers
Data Structure Design
1. Post Metadata (One JSON per post)
Each post has a corresponding JSON file in the content/meta/ directory:
content/meta/how-i-use-vllm.json
```json
{
  "slug": "how-i-use-vllm",
  "title": "How I Use vLLM to Run 7B Models Locally",
  "date": "2025-12-02",
  "tags": ["LLM", "vLLM", "Local Deployment", "Experiment Notes"],
  "summary": "Recording my process of running vLLM on a single GPU at home.",
  "heroImage": "/images/2025-12-02-vllm-hero.png"
}
```
Tags are stored in the tags array here.
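For reference, the shape of this meta file can be captured in a small TypeScript interface (field names follow the example above; the actual project may type this differently):

```ts
// Shape of a per-post meta JSON file (derived from the example above)
interface PostMeta {
  slug: string;
  title: string;
  date: string;       // ISO date, e.g. "2025-12-02"
  tags: string[];     // the tags used to build tags.json
  summary: string;
  heroImage: string;  // public path to the hero image
}
```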
2. Pre-computed Tag Index (tags.json)
In content/indexes/tags.json, we pre-build a mapping of "tag → post slug list":
content/indexes/tags.json
```json
{
  "LLM": ["how-i-use-vllm", "llm-metrics", "runtime-benchmark"],
  "SvelteKit": ["vibeblog-infra", "routing-idea"],
  "AI-Blogging": ["ai-html-pipeline", "meta-generator"]
}
```
This file is automatically generated by a script before build, requiring no manual maintenance.
Implementation
Script to Generate tags.json
I wrote a simple Node.js script to scan all meta JSON files and automatically generate the index:
scripts/generate-tags-index.ts
```ts
import * as fs from 'node:fs';
import * as path from 'node:path';

const META_DIR = 'content/meta';
const tagsIndex: Record<string, string[]> = {};

// Scan all meta files
const jsonFiles = fs.readdirSync(META_DIR).filter((f) => f.endsWith('.json'));

for (const file of jsonFiles) {
  const filePath = path.join(META_DIR, file);
  const meta = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
  // Add the post slug to each of its tags
  for (const tag of meta.tags) {
    if (!tagsIndex[tag]) {
      tagsIndex[tag] = [];
    }
    tagsIndex[tag].push(meta.slug);
  }
}

// Write tags.json
fs.writeFileSync(
  'content/indexes/tags.json',
  JSON.stringify(tagsIndex, null, 2)
);
```
Run npm run generate:tags to update the index.
SvelteKit Side: Read Only, No Computation
In SvelteKit's server load function, we only need to read the pre-computed JSON:
```ts
// src/routes/tags/[tag]/+page.server.ts
import { getTagsIndex, getPostMeta } from '$lib/content';

export const load = ({ params }) => {
  const tagsIndex = getTagsIndex(); // Read pre-computed JSON
  const slugs = tagsIndex[params.tag] || [];
  // Only read needed post metas, no need to scan all
  const posts = slugs.map(slug => getPostMeta(slug));
  return { tag: params.tag, posts };
};
```
There is no dynamic computation at all, just a pure O(1) lookup into the pre-computed index.
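For completeness, here is a minimal sketch of what the getTagsIndex and getPostMeta helpers in $lib/content could look like. This is an assumed implementation, since the post only shows how they are called; the real project might load the files differently (for example via import.meta.glob):

```ts
// src/lib/content.ts - a minimal sketch of the helpers used above
import * as fs from 'node:fs';

export function getTagsIndex(): Record<string, string[]> {
  // tags.json is produced at build time by scripts/generate-tags-index.ts
  return JSON.parse(fs.readFileSync('content/indexes/tags.json', 'utf-8'));
}

export function getPostMeta(slug: string) {
  return JSON.parse(fs.readFileSync(`content/meta/${slug}.json`, 'utf-8'));
}
```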
Perfect Integration with AI Pipeline
This architecture is particularly suitable for AI-automated content production workflows:
```text
raw post
  → AI processing
  → processed HTML (AI-formatted HTML)
  → meta JSON (AI-generated metadata, including tags)
  → run generate:tags script
  → tags.json automatically updated
  → build → deploy
```
When AI generates a post, it can simultaneously:
- Analyze post content and automatically generate relevant tags
- Generate meta JSON (including tags array)
- Generate processed HTML
Then run npm run generate:tags once, and all classification indexes are automatically updated.
Advantages Summary
✅ Extremely High Performance
- All classifications completed at build time
- Frontend only reads JSON, zero computation
- Perfect match with SSG + CDN
✅ Fully Version Controllable
- Classifications are files, commits show changes
- All content is files, easy to track
✅ Suitable for Large Numbers of Posts
- Hundreds or thousands of posts make no practical difference
- Tag lookup is O(1), so pages don't slow down as the post count grows
✅ Suitable for AI Automation
- AI directly outputs classification data structures
- No need for complex database schemas
- All logic runs before build
✅ Perfect Match with "AI-Generated Posts" Concept
- AI determines tags when generating posts
- Automatically updates tags.json
- Fully automated content management workflow
Practical Applications
In VibeBlog, this architecture implements:
- /tags - All tags overview (read from tags.json)
- /tags/LLM - All LLM-related posts (O(1) query)
- Post list page displays tags (read from meta JSON)
- Post detail page displays tags (clickable to jump to tag page)
All these features are zero runtime computation, pure static data.
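If the whole site is prerendered, the tag route can also enumerate its own pages from the same index. The snippet below is an optional addition not shown above; it uses SvelteKit's standard entries export to list the parameters of the dynamic [tag] route at build time:

```ts
// src/routes/tags/[tag]/+page.server.ts (addition for a fully prerendered build)
import { getTagsIndex } from '$lib/content';

export const prerender = true;

// Tell the prerenderer which /tags/[tag] pages exist, straight from tags.json
export const entries = () =>
  Object.keys(getTagsIndex()).map((tag) => ({ tag }));
```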
Extensibility
This pattern can easily extend to other classification methods:
- content/indexes/years.json - Classify by year
- content/indexes/categories.json - Classify by category
- content/indexes/authors.json - Classify by author
Just generate the corresponding index file before build, and SvelteKit can use it directly.
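As an example, a year index can be generated with the same few lines. The following is a sketch using a hypothetical scripts/generate-years-index.ts, with the year taken from each meta file's date field:

```ts
// scripts/generate-years-index.ts (hypothetical name) - same pattern as the tag index
import * as fs from 'node:fs';
import * as path from 'node:path';

const META_DIR = 'content/meta';
const yearsIndex: Record<string, string[]> = {};

for (const file of fs.readdirSync(META_DIR).filter((f) => f.endsWith('.json'))) {
  const meta = JSON.parse(fs.readFileSync(path.join(META_DIR, file), 'utf-8'));
  const year = meta.date.slice(0, 4); // "2025-12-02" -> "2025"
  (yearsIndex[year] ??= []).push(meta.slug);
}

fs.writeFileSync('content/indexes/years.json', JSON.stringify(yearsIndex, null, 2));
```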
Conclusion
The Precomputed Tags architecture shows that you don't need a database or dynamic queries: with files and pre-computation alone, you can build a high-performance tag system.
This method is particularly suitable for:
- Static Site Generation (SSG)
- AI-automated content production
- Version-controlled content management
- Blog systems pursuing ultimate performance
In VibeBlog's implementation, this architecture not only solves the tag classification problem but also becomes the foundation of the entire AI-First content management system.
If you're building a similar system, try the Precomputed Tags architecture. You'll find it simpler, faster, and more controllable than traditional dynamic search approaches.