atlas layer 4

RAG pipeline — chunking, embedding, vector storage, and ranked retrieval. Consumes protocol contracts to emit telemetry spans and expose a typed query interface over the knowledge corpus.
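The embedding and vector-storage stages mentioned above are not shown in this section. As a minimal sketch of what storage plus similarity search involves (the `InMemoryVectorStore` class and its `upsert`/`query` methods are illustrative names for this sketch, not the atlas API):

```typescript
// Illustrative sketch, not the atlas API: a minimal in-memory vector
// store. Embeddings are stored per chunk and queried by cosine similarity.
interface StoredChunk {
  chunkId: string;
  vector: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class InMemoryVectorStore {
  private chunks: StoredChunk[] = [];

  // Add an embedded chunk to the store.
  upsert(chunkId: string, vector: number[]): void {
    this.chunks.push({ chunkId, vector });
  }

  // Return the topK most similar chunk ids for a query vector,
  // sorted by descending similarity.
  query(vector: number[], topK: number): { chunkId: string; score: number }[] {
    return this.chunks
      .map((c) => ({ chunkId: c.chunkId, score: cosineSimilarity(vector, c.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, topK);
  }
}
```

A real deployment would back this with a persistent index rather than a flat array, but the query contract (vector in, scored chunk ids out) is the same shape the ranked-retrieval example below consumes.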

Chunking Pipeline

Split a source document into overlapping chunks ready for embedding.

```typescript
// Chunking pipeline — split a document into overlapping chunks
import { chunkDocument } from "atlas/pipeline/chunk";

const doc = {
  id: "doc-001",
  text: `Retrieval-augmented generation (RAG) combines
a retrieval step with a generative model. The
retrieval step fetches relevant passages from a
knowledge corpus. The generative model then
conditions on those passages to produce an answer.`,
  source: "rag-overview.md",
};

const config = { chunkSize: 80, overlap: 20 };
const chunks = chunkDocument(doc, config);
console.log(chunks);
```
Output:

```
[
  {
    chunkId: "doc-001-0",
    text: "Retrieval-augmented generation (RAG) combines\na retrieval step with a generative model. The",
    source: "rag-overview.md",
    index: 0
  },
  {
    chunkId: "doc-001-1",
    text: "generative model. The\nretrieval step fetches relevant passages from a\nknowledge corpus. The",
    source: "rag-overview.md",
    index: 1
  },
  {
    chunkId: "doc-001-2",
    text: "knowledge corpus. The generative model then\nconditions on those passages to produce an answer.",
    source: "rag-overview.md",
    index: 2
  }
]
```

Each chunk begins with roughly the last 20 characters of the previous one, per `overlap: 20`.
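The overlap is easiest to see as a sliding window: each chunk starts `chunkSize - overlap` characters after the previous one. The following is a hypothetical character-level reimplementation for illustration only; `chunkDocument` itself may handle boundaries differently, for example by snapping to whitespace:

```typescript
// Illustrative sliding-window chunker, not the atlas implementation.
// Consecutive chunks share `overlap` characters because each window
// advances by (chunkSize - overlap).
interface Chunk {
  chunkId: string;
  text: string;
  index: number;
}

function slidingWindowChunks(
  docId: string,
  text: string,
  chunkSize: number,
  overlap: number,
): Chunk[] {
  const stride = chunkSize - overlap;
  if (stride <= 0) throw new Error("overlap must be smaller than chunkSize");
  const chunks: Chunk[] = [];
  for (let start = 0, index = 0; start < text.length; start += stride, index++) {
    chunks.push({
      chunkId: `${docId}-${index}`,
      text: text.slice(start, start + chunkSize),
      index,
    });
    // Stop once the window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The guard on `stride` matters: with `overlap >= chunkSize` the window would never advance.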

Ranked Retrieval

Query the corpus and receive top-k results sorted by semantic similarity score.

```typescript
// Ranked retrieval — query the knowledge corpus
import { rankCandidates } from "atlas/retrieval/rank";

const query = "How does RAG use retrieved passages?";

// Candidate chunks retrieved for the query, with similarity scores.
const candidates = [
  { chunkId: "doc-001-1", text: "The retrieval step fetches relevant passages...", score: 0.91 },
  { chunkId: "doc-002-3", text: "Vector embeddings encode semantic meaning...", score: 0.74 },
  { chunkId: "doc-001-2", text: "The generative model conditions on passages...", score: 0.88 },
];

const results = rankCandidates(candidates, { topK: 2 });
console.log(results);
```
Output:

```
[
  {
    rank: 1,
    chunkId: "doc-001-1",
    text: "The retrieval step fetches relevant passages...",
    finalScore: 0.91
  },
  {
    rank: 2,
    chunkId: "doc-001-2",
    text: "The generative model conditions on passages...",
    finalScore: 0.88
  }
]
```
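Ranking of this kind reduces to sort, truncate, and attach 1-based ranks. A self-contained sketch follows; it is hypothetical, and the real `rankCandidates` may rescore or rerank rather than passing `score` through as `finalScore`:

```typescript
// Illustrative ranking sketch, not the atlas implementation.
// Sort candidates by descending score, keep the top K, and
// attach a 1-based rank to each surviving candidate.
interface Candidate {
  chunkId: string;
  text: string;
  score: number;
}

interface RankedResult {
  rank: number;
  chunkId: string;
  text: string;
  finalScore: number;
}

function rankByScore(candidates: Candidate[], topK: number): RankedResult[] {
  // Copy before sorting so the caller's array is not mutated.
  return [...candidates]
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((c, i) => ({
      rank: i + 1,
      chunkId: c.chunkId,
      text: c.text,
      finalScore: c.score,
    }));
}
```

Copying before sorting keeps the function pure, which matters if the same candidate list is reranked under several `topK` settings.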