Shane Sturgeon

ShAIne

Corpus Growth

The RAG corpus powering Interview ShAIne — content files, chunk counts, and how the knowledge base has grown over time.

Total chunks: 306

Content files: 30

Last ingest: May 1, 2026 (23 days ago)

Chunks over time

Legend: project milestone, methodology change

Milestones

Annotated entries from content/corpus-history.json. Automated snapshots are recorded by the ingest script.
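As a rough illustration of how an ingest script might record an automated snapshot, the sketch below appends one entry to a JSON history file. The actual schema of content/corpus-history.json is not shown on this page, so the `Snapshot` field names (which mirror the table columns) and the `append_snapshot` helper are assumptions, not the site's implementation.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class Snapshot:
    # Hypothetical field names that mirror the milestone table columns.
    date: str
    chunks: int
    files: int
    chars_k: float
    chunk_size: int
    note: str = ""


def append_snapshot(history_path: Path, snap: Snapshot) -> list[dict]:
    """Append one snapshot to a JSON array file and return the full history."""
    # Start from an empty history if the file does not exist yet.
    history = json.loads(history_path.read_text()) if history_path.exists() else []
    history.append(asdict(snap))
    history_path.write_text(json.dumps(history, indent=2))
    return history
```

An ingest run could call `append_snapshot(Path("content/corpus-history.json"), Snapshot("2026-03-18", 56, 5, 33.6, 600, "Initial scaffold"))` after chunking, leaving the annotated notes to be edited by hand.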

| Date | Chunks | Files | Chars (K) | Chunk size | Note |
| --- | --- | --- | --- | --- | --- |
| March 18, 2026 (Site launch) | 56 | 5 | 33.6 | 600 | Initial scaffold: bio, resume, skills, about, 1 reference |
| March 18, 2026 | 79 | 8 | 47.6 | 600 | Full site launch: job-preferences v1, transitions, hdtvmagazine |
| March 18, 2026 | 96 | 9 | 57.7 | 600 | Added interview-voice.md; voice training begins |
| March 23, 2026 | 131 | 16 | 78.5 | 600 | Scaffolded mission-and-motivation, 7 new files |
| March 24, 2026 (job-preferences expansion) | 172 | 17 | 103.4 | 600 | job-preferences.md: 3K → 32K chars (+25% of total corpus) |
| March 24, 2026 | 203 | 18 | 121.6 | 600 | Added notes-domino-background.md, certifications |
| March 25, 2026 | 209 | 19 | 125.7 | 600 | fit-config.json added to context; first blog post published |
| March 29, 2026 | 244 | 20 | 138.5 | 600 | Published: Under the Hood: ShAIne RAG pipeline |
| April 15, 2026 | 281 | 21 | 153.3 | 600 | Published: Leading in 4K |
| April 17, 2026 | 299 | 22 | 161.1 | 600 | Published: Ground-Level HDTV Magazine Revival (15-week build log) |
| April 21, 2026 | 285 | 25 | | 1000 | Paragraph-aware chunking introduced; chunk size raised to 1000; Bellese content updates |
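The last milestone mentions paragraph-aware chunking with a chunk size of 1000. The page does not show how the ingest script implements this, so the following is only a minimal sketch of one common approach: split on blank lines and greedily pack whole paragraphs into chunks up to a character limit. The function name `chunk_paragraphs` and the hard-split fallback for oversized paragraphs are assumptions.

```python
import re


def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_chars characters."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        # Fallback (an assumption): hard-split a paragraph longer than the limit.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            # The paragraph still fits, so keep it in the current chunk.
            current = candidate
        else:
            # Close the current chunk and start a new one at the paragraph boundary.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

With a scheme like this, raising the limit from 600 to 1000 lets more paragraphs share a chunk, which would be consistent with the chunk count falling from 299 to 285 even as the file count rose from 22 to 25.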