Overview

The Embedder class provides a worker-based client for generating text embeddings. It automatically selects between WebGPU and WASM backends based on hardware capabilities.
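A minimal sketch of what such a preference-plus-capability check can look like (hypothetical; the actual selection happens inside the worker, and `detectBackend` is our name, not part of the API):

```typescript
type EmbeddingBackendPreference = "webgpu" | "wasm";

// Hypothetical sketch: honor the WebGPU preference only when the runtime
// actually exposes the WebGPU API (navigator.gpu); otherwise fall back to WASM.
function detectBackend(preferred: EmbeddingBackendPreference): EmbeddingBackendPreference {
  const nav = (globalThis as { navigator?: { gpu?: unknown } }).navigator;
  if (preferred === "webgpu" && nav?.gpu !== undefined) {
    return "webgpu";
  }
  return "wasm";
}
```

In environments without `navigator.gpu` (older browsers, most server runtimes), the check resolves to "wasm" regardless of preference.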

Initialization

Constructor

Create a new embedder instance.
import { Embedder } from "./embeddings/Embedder";

// Default: auto-detect worker and prefer WebGPU
const embedder = new Embedder();

// Or specify a backend preference explicitly
const wasmEmbedder = new Embedder({
  preferredBackend: "wasm"
});
Options:
  • options.preferredBackend (EmbeddingBackendPreference): Preferred backend, "webgpu" (default) or "wasm"
  • options.workerFactory (() => Worker): Optional custom worker factory for testing

Single Text Embedding

embed()

Generate an embedding for a single text.
const vector = await embedder.embed("semantic search query");
console.log(`Vector dimensions: ${vector.length}`);
// Vector dimensions: 384
Parameters:
  • text (string, required): Input text to embed
Returns:
  Promise<Float32Array>: 384-dimensional L2-normalized embedding vector
Throws:
  Error: if embedding fails or the worker returns an empty vector
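Because the returned vectors are L2-normalized, cosine similarity between two embeddings reduces to a plain dot product. A small self-contained helper illustrating this (the function name is ours, not part of the API):

```typescript
// For unit-length vectors, the dot product equals the cosine similarity.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
  }
  return dot;
}

// Example with hand-built unit vectors:
const a = new Float32Array([1, 0]);
const b = new Float32Array([0, 1]);
cosineSimilarity(a, a); // 1 (identical direction)
cosineSimilarity(a, b); // 0 (orthogonal)
```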

Batch Embedding

embedBatch()

Generate embeddings for multiple texts in parallel.
const results = await embedder.embedBatch([
  "first document",
  "second document",
  "third document"
]);

for (let i = 0; i < results.length; i++) {
  const { embedding, error } = results[i];
  if (error) {
    console.error(`Failed to embed item ${i}: ${error}`);
  } else {
    console.log(`Embedded item ${i}: ${embedding.length} dimensions`);
  }
}
Parameters:
  • texts (string[], required): Array of input texts to embed
Returns:
  Promise<BatchEmbeddingResultItem[]>: Array of results, one per input text, in the same order
Batch embedding returns per-item errors rather than rejecting the entire batch. Always check the error field for each result.

BatchEmbeddingResultItem

type BatchEmbeddingResultItem = {
  embedding: Float32Array | null;
  error: string | null;
};
Fields:
  • embedding (Float32Array | null): 384-dimensional embedding vector, or null if embedding failed
  • error (string | null): Error message if embedding failed, or null on success

Runtime Inspection

getRuntimeInfo()

Get backend selection diagnostics.
const info = embedder.getRuntimeInfo();
console.log(`Preferred: ${info.preferredBackend}`);
console.log(`Selected: ${info.selectedBackend}`);
if (info.fallbackReason) {
  console.log(`Fallback reason: ${info.fallbackReason}`);
}
Returns:
  EmbeddingRuntimeInfo: Runtime backend information

EmbeddingRuntimeInfo

type EmbeddingRuntimeInfo = {
  preferredBackend: EmbeddingBackendPreference;
  selectedBackend: EmbeddingBackendPreference | null;
  fallbackReason: string | null;
};
Fields:
  • preferredBackend (EmbeddingBackendPreference): User-specified or default backend preference ("webgpu" | "wasm")
  • selectedBackend (EmbeddingBackendPreference | null): Actual backend selected by the worker, or null if no embeddings have been generated yet
  • fallbackReason (string | null): Reason for falling back to WASM when WebGPU was preferred, e.g. "navigator.gpu unavailable" or "WebGPU adapter failed"

Worker Lifecycle

terminate()

Terminate the worker thread and free resources.
// When done with embeddings
embedder.terminate();
Always call terminate() when done to prevent memory leaks. The worker cannot be reused after termination.
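One way to guarantee cleanup even when embedding throws is a try/finally wrapper. A sketch (the withEmbedder helper is ours, not part of the API; any object with a terminate() method fits, including Embedder):

```typescript
// Anything exposing terminate() can be managed this way.
interface Terminable {
  terminate(): void;
}

// Run `work`, then terminate the resource whether `work` resolved or threw.
async function withEmbedder<T>(embedder: Terminable, work: () => Promise<T>): Promise<T> {
  try {
    return await work();
  } finally {
    embedder.terminate();
  }
}
```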

Vector Utilities

float32ToBlob()

Convert Float32Array to Uint8Array for database storage.
import { float32ToBlob } from "./embeddings/vector";

const vector = await embedder.embed("text");
const blob = float32ToBlob(vector);

await db.upsertEmbeddings([{
  id: "emb-1",
  chunkId: "chunk-1",
  model: "Xenova/all-MiniLM-L6-v2",
  dimension: vector.length,
  vectorBlob: blob,
  createdAt: Date.now()
}]);
Parameters:
  • vec (Float32Array, required): Embedding vector to convert
Returns:
  Uint8Array: L2-normalized vector encoded as bytes (4 bytes per float)

blobToFloat32()

Convert stored Uint8Array back to Float32Array.
import { blobToFloat32 } from "./embeddings/vector";

const result = db.exec("SELECT vector_blob FROM embeddings WHERE id = ?", ["emb-1"]);
const blob = result[0].values[0][0] as Uint8Array;
const vector = blobToFloat32(blob);
Parameters:
  • blob (Uint8Array, required): Byte array from the database
Returns:
  Float32Array: Reconstructed embedding vector
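The two conversions round-trip exactly, since the bytes are simply the IEEE 754 float32 representation. A self-contained sketch of the idea (function names are ours, shown for illustration; use the real helpers in practice):

```typescript
// Sketch: reinterpret the Float32Array's underlying bytes as a Uint8Array
// (4 bytes per float).
function float32ToBlobSketch(vec: Float32Array): Uint8Array {
  return new Uint8Array(vec.buffer.slice(vec.byteOffset, vec.byteOffset + vec.byteLength));
}

// Sketch: rebuild the Float32Array from the stored bytes.
function blobToFloat32Sketch(blob: Uint8Array): Float32Array {
  return new Float32Array(blob.buffer.slice(blob.byteOffset, blob.byteOffset + blob.byteLength));
}
```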

l2Normalize()

Normalize a vector to unit length.
import { l2Normalize } from "./embeddings/vector";

const raw = new Float32Array([3, 4]);
const normalized = l2Normalize(raw);
// normalized = [0.6, 0.8]

let sumSquares = 0;
for (const val of normalized) {
  sumSquares += val * val;
}
console.log(Math.sqrt(sumSquares)); // 1.0
Parameters:
  • vec (Float32Array, required): Vector to normalize
Returns:
  Float32Array: New normalized vector with L2 norm = 1.0. Returns a copy of the input if the norm is 0.
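The normalization itself is straightforward; a sketch of equivalent logic (hypothetical implementation, shown mainly to make the zero-norm edge case concrete):

```typescript
// Divide each component by the Euclidean norm; return a copy when the norm is 0.
function l2NormalizeSketch(vec: Float32Array): Float32Array {
  let sumSquares = 0;
  for (const v of vec) {
    sumSquares += v * v;
  }
  const norm = Math.sqrt(sumSquares);
  if (norm === 0) {
    return new Float32Array(vec); // copy of the input, unchanged
  }
  const out = new Float32Array(vec.length);
  for (let i = 0; i < vec.length; i++) {
    out[i] = vec[i] / norm;
  }
  return out;
}
```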

Usage Example

import { Embedder } from "./embeddings/Embedder";
import { float32ToBlob } from "./embeddings/vector";
import { getDb } from "./db/client";

const embedder = new Embedder({ preferredBackend: "webgpu" });
const db = await getDb();

// Get chunks that need embedding
const chunks = db.getChunksToEmbed(50);

// Batch embed all texts
const texts = chunks.map(c => c.text);
const results = await embedder.embedBatch(texts);

// Store successful embeddings
const embeddings = [];
for (let i = 0; i < results.length; i++) {
  const { embedding, error } = results[i];
  if (error) {
    console.error(`Failed to embed chunk ${chunks[i].id}: ${error}`);
    continue;
  }
  embeddings.push({
    id: `emb-${chunks[i].id}`,
    chunkId: chunks[i].id,
    model: "Xenova/all-MiniLM-L6-v2",
    dimension: embedding.length,
    vectorBlob: float32ToBlob(embedding),
    createdAt: Date.now()
  });
}

await db.upsertEmbeddings(embeddings);

// Check runtime info
const info = embedder.getRuntimeInfo();
console.log(`Backend: ${info.selectedBackend}`);

// Clean up
embedder.terminate();

Types

EmbeddingBackendPreference

type EmbeddingBackendPreference = "webgpu" | "wasm";

BatchEmbeddingResultItem

type BatchEmbeddingResultItem = {
  embedding: Float32Array | null;
  error: string | null;
};

EmbeddingRuntimeInfo

type EmbeddingRuntimeInfo = {
  preferredBackend: EmbeddingBackendPreference;
  selectedBackend: EmbeddingBackendPreference | null;
  fallbackReason: string | null;
};

EmbeddingWorkerPool

The EmbeddingWorkerPool class manages a pool of embedding workers for parallel batch processing with automatic concurrency control and error handling.

Constructor

import { EmbeddingWorkerPool } from "./embeddings/WorkerPool";

const pool = new EmbeddingWorkerPool({
  poolSize: 2,
  maxPoolSize: 2,
  maxQueueSize: 1024,
  downshiftErrorThreshold: 3,
  workerBatchSize: 8,
  preferredBackend: "webgpu"
});
Options:
  • poolSize (number, default 2): Initial number of workers in the pool (1-2)
  • maxPoolSize (number, default 2): Maximum allowed workers (hard limit)
  • maxQueueSize (number, default 1024): Maximum batch size before rejecting requests
  • downshiftErrorThreshold (number, default 3): Number of errors before reducing the pool to 1 worker
  • workerBatchSize (number, default 8): Texts per worker batch (1-32)
  • preferredBackend (EmbeddingBackendPreference, default "webgpu"): Backend preference ("webgpu" or "wasm")

embedBatch()

Process multiple texts in parallel across the worker pool.
const texts = ["repo description 1", "repo description 2" /* … */];
const results = await pool.embedBatch(texts);

for (const { embedding, error } of results) {
  if (error) {
    console.error("Embedding failed:", error);
  } else {
    console.log("Vector dimensions:", embedding.length); // 384
  }
}
Parameters:
  • texts (string[], required): Array of text strings to embed
Returns:
  Promise<BatchEmbeddingResultItem[]>: Array of embedding results (same length as input)
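Dispatch presumably splits the input into chunks of workerBatchSize before handing them to workers. A sketch of that chunking step (the chunkTexts helper is ours, for illustration only):

```typescript
// Split input texts into batches of `batchSize`; the final batch may be shorter.
function chunkTexts(texts: string[], batchSize: number): string[][] {
  const batches: string[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    batches.push(texts.slice(i, i + batchSize));
  }
  return batches;
}
```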

getStatus()

Get current pool status including backend selection and downshift state.
const status = pool.getStatus();
console.log(`Active workers: ${status.activePoolSize}/${status.configuredPoolSize}`);
console.log(`Backend: ${status.selectedBackend}`);
if (status.downshifted) {
  console.log(`Downshift reason: ${status.downshiftReason}`);
}
Fields:
  • configuredPoolSize (number): Configured number of workers
  • activePoolSize (number): Currently active workers (may be lower if downshifted)
  • downshifted (boolean): Whether the pool has been reduced to a single worker due to errors
  • downshiftReason (string | null): Reason for the downshift (e.g. "out of memory", error threshold exceeded)
  • errorCount (number): Total embedding errors encountered
  • selectedBackend (EmbeddingBackendPreference | null): Actually selected backend (may differ from preferred)
  • backendFallbackReason (string | null): Reason for backend fallback, if applicable

setConcurrency()

Manually adjust pool size within configured limits.
pool.setConcurrency(1); // Reduce to single worker
Parameters:
  • targetPoolSize (number, required): Desired pool size (clamped to maxPoolSize and configuredPoolSize)

terminate()

Terminate all workers and release resources.
pool.terminate();

Worker Pool Behavior

  • Automatic downshifting: Pool reduces to 1 worker after 3 errors or on memory pressure
  • WebGPU concurrency: Automatically uses 1 worker when WebGPU backend is selected
  • WASM concurrency: Can use up to configured pool size with WASM backend
  • Error resilience: Individual embedding failures don’t crash the entire batch