Overview
The Embedder class provides a worker-based client for generating text embeddings. It automatically selects between WebGPU and WASM backends based on hardware capabilities.
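The selection logic runs inside the worker, but the gist is a standard WebGPU capability probe. A minimal sketch of that kind of check (a hypothetical helper, not the library's actual code), assuming WebGPU type definitions are available:

```ts
// Sketch only: the real selection happens inside the Embedder worker.
async function detectBackend(): Promise<"webgpu" | "wasm"> {
  if (!("gpu" in navigator)) {
    return "wasm"; // would surface as fallbackReason: "navigator.gpu unavailable"
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    return "wasm"; // would surface as fallbackReason: "WebGPU adapter failed"
  }
  return "webgpu";
}
```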
Initialization
Constructor
Create a new embedder instance.
```ts
import { Embedder } from "./embeddings/Embedder";

// Default: auto-detect worker and prefer WebGPU
const embedder = new Embedder();

// Or specify a backend preference explicitly
const wasmEmbedder = new Embedder({
  preferredBackend: "wasm"
});
```
Parameters
- options.preferredBackend (EmbeddingBackendPreference): preferred backend, "webgpu" (default) or "wasm"
- an optional custom worker factory for testing
Single Text Embedding
embed()
Generate embedding for a single text.
```ts
const vector = await embedder.embed("semantic search query");
console.log(`Vector dimensions: ${vector.length}`);
// Vector dimensions: 384
```

Returns
A 384-dimensional, L2-normalized embedding vector.

Throws
If embedding fails or the worker returns an empty vector.
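Because embed() can reject, callers typically guard it; a minimal sketch:

```ts
try {
  const vector = await embedder.embed("semantic search query");
  console.log(`Got a ${vector.length}-dimensional vector`);
} catch (err) {
  // Reached when embedding fails or the worker returns an empty vector
  console.error("Embedding failed:", err);
}
```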
Batch Embedding
embedBatch()
Generate embeddings for multiple texts in parallel.
```ts
const results = await embedder.embedBatch([
  "first document",
  "second document",
  "third document"
]);

for (let i = 0; i < results.length; i++) {
  const { embedding, error } = results[i];
  if (error || !embedding) {
    console.error(`Failed to embed item ${i}: ${error}`);
  } else {
    console.log(`Embedded item ${i}: ${embedding.length} dimensions`);
  }
}
```

Parameters
Array of input texts to embed

Returns
Promise<BatchEmbeddingResultItem[]>
Array of results, one per input text, in the same order
Batch embedding returns per-item errors rather than rejecting the entire batch. Always check the error field for each result.
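A small hypothetical helper (not part of the API) that splits a batch into successes and failures, assuming the BatchEmbeddingResultItem shape below:

```ts
// Hypothetical convenience helper for consuming batch results.
function partitionResults(results: BatchEmbeddingResultItem[]) {
  const ok: { index: number; embedding: Float32Array }[] = [];
  const failed: { index: number; error: string }[] = [];
  results.forEach(({ embedding, error }, index) => {
    if (embedding) ok.push({ index, embedding });
    else failed.push({ index, error: error ?? "unknown error" });
  });
  return { ok, failed };
}
```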
BatchEmbeddingResultItem
```ts
type BatchEmbeddingResultItem = {
  embedding: Float32Array | null;
  error: string | null;
};
```

- embedding: 384-dimensional embedding vector, or null if embedding failed
- error: error message if embedding failed, or null on success
Runtime Inspection
getRuntimeInfo()
Get backend selection diagnostics.
```ts
const info = embedder.getRuntimeInfo();
console.log(`Preferred: ${info.preferredBackend}`);
console.log(`Selected: ${info.selectedBackend}`);
if (info.fallbackReason) {
  console.log(`Fallback reason: ${info.fallbackReason}`);
}
```
Returns
EmbeddingRuntimeInfo
Runtime backend information
EmbeddingRuntimeInfo
```ts
type EmbeddingRuntimeInfo = {
  preferredBackend: EmbeddingBackendPreference;
  selectedBackend: EmbeddingBackendPreference | null;
  fallbackReason: string | null;
};
```

- preferredBackend (EmbeddingBackendPreference): user-specified or default backend preference, "webgpu" | "wasm"
- selectedBackend (EmbeddingBackendPreference | null): actual backend selected by the worker, or null if no embeddings have been generated yet
- fallbackReason (string | null): reason for the fallback to WASM if WebGPU was preferred, e.g. "navigator.gpu unavailable" or "WebGPU adapter failed"
Worker Lifecycle
terminate()
Terminate the worker thread and free resources.
```ts
// When done with embeddings
embedder.terminate();
```
Always call terminate() when done to prevent memory leaks. The worker cannot be reused after termination.
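A try/finally block is one way to guarantee the worker is released even when embedding throws; a minimal sketch:

```ts
const embedder = new Embedder();
try {
  const vector = await embedder.embed("some text");
  // ... use the vector ...
} finally {
  // Runs even on error; the worker cannot be reused afterwards
  embedder.terminate();
}
```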
Vector Utilities
float32ToBlob()
Convert Float32Array to Uint8Array for database storage.
```ts
import { float32ToBlob } from "./embeddings/vector";

const vector = await embedder.embed("text");
const blob = float32ToBlob(vector);

await db.upsertEmbeddings([{
  id: "emb-1",
  chunkId: "chunk-1",
  model: "Xenova/all-MiniLM-L6-v2",
  dimension: vector.length,
  vectorBlob: blob,
  createdAt: Date.now()
}]);
```
Parameters
Embedding vector to convert

Returns
Uint8Array
L2-normalized vector encoded as bytes (4 bytes per float)
blobToFloat32()
Convert stored Uint8Array back to Float32Array.
```ts
import { blobToFloat32 } from "./embeddings/vector";

const result = db.exec("SELECT vector_blob FROM embeddings WHERE id = ?", ["emb-1"]);
const blob = result[0].values[0][0] as Uint8Array;
const vector = blobToFloat32(blob);
```
Returns
Float32Array
Reconstructed embedding vector
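The two helpers are inverses, so a round trip should reproduce the original values exactly; a quick sanity-check sketch:

```ts
import { float32ToBlob, blobToFloat32 } from "./embeddings/vector";

const original = await embedder.embed("round-trip test");
const restored = blobToFloat32(float32ToBlob(original));

// Every float survives the 4-bytes-per-float encoding unchanged
console.assert(restored.length === original.length);
console.assert(original.every((v, i) => v === restored[i]));
```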
l2Normalize()
Normalize a vector to unit length.
```ts
import { l2Normalize } from "./embeddings/vector";

const raw = new Float32Array([3, 4]);
const normalized = l2Normalize(raw);
// normalized = [0.6, 0.8]

// Verify the result has unit length
let sumSquares = 0;
for (const val of normalized) {
  sumSquares += val * val;
}
console.log(Math.sqrt(sumSquares)); // 1.0
```
Returns
Float32Array
New normalized vector with L2 norm = 1.0. Returns a copy of the input if the norm is 0.
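Because embeddings from embed() are already L2-normalized, cosine similarity between two of them reduces to a plain dot product. A minimal sketch (the helper name is illustrative, not part of the library):

```ts
// For unit-length vectors, the dot product equals cosine similarity.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
  }
  return dot;
}

const query = await embedder.embed("semantic search query");
const doc = await embedder.embed("a candidate document");
console.log(`Similarity: ${cosineSimilarity(query, doc)}`);
```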
Usage Example
```ts
import { Embedder } from "./embeddings/Embedder";
import { float32ToBlob } from "./embeddings/vector";
import { getDb } from "./db/client";

const embedder = new Embedder({ preferredBackend: "webgpu" });
const db = await getDb();

// Get chunks that need embedding
const chunks = db.getChunksToEmbed(50);

// Batch embed all texts
const texts = chunks.map(c => c.text);
const results = await embedder.embedBatch(texts);

// Store successful embeddings
const embeddings = [];
for (let i = 0; i < results.length; i++) {
  const { embedding, error } = results[i];
  if (error || !embedding) {
    console.error(`Failed to embed chunk ${chunks[i].id}: ${error}`);
    continue;
  }
  embeddings.push({
    id: `emb-${chunks[i].id}`,
    chunkId: chunks[i].id,
    model: "Xenova/all-MiniLM-L6-v2",
    dimension: embedding.length,
    vectorBlob: float32ToBlob(embedding),
    createdAt: Date.now()
  });
}
await db.upsertEmbeddings(embeddings);

// Check runtime info
const info = embedder.getRuntimeInfo();
console.log(`Backend: ${info.selectedBackend}`);

// Clean up
embedder.terminate();
```
Types
EmbeddingBackendPreference
```ts
type EmbeddingBackendPreference = "webgpu" | "wasm";
```
BatchEmbeddingResultItem
```ts
type BatchEmbeddingResultItem = {
  embedding: Float32Array | null;
  error: string | null;
};
```
EmbeddingRuntimeInfo
```ts
type EmbeddingRuntimeInfo = {
  preferredBackend: EmbeddingBackendPreference;
  selectedBackend: EmbeddingBackendPreference | null;
  fallbackReason: string | null;
};
```
EmbeddingWorkerPool
The EmbeddingWorkerPool class manages a pool of embedding workers for parallel batch processing with automatic concurrency control and error handling.
Constructor
```ts
import { EmbeddingWorkerPool } from "./embeddings/WorkerPool";

const pool = new EmbeddingWorkerPool({
  poolSize: 2,
  maxPoolSize: 2,
  maxQueueSize: 1024,
  downshiftErrorThreshold: 3,
  workerBatchSize: 8,
  preferredBackend: "webgpu"
});
```
Options
- poolSize: initial number of workers in the pool (1-2)
- maxPoolSize: maximum allowed workers (hard limit)
- maxQueueSize: maximum batch size before rejecting requests
- downshiftErrorThreshold: number of errors before reducing the pool to 1 worker
- workerBatchSize: texts per worker batch (1-32)
- preferredBackend (EmbeddingBackendPreference, default "webgpu"): backend preference, "webgpu" or "wasm"
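Because batches larger than maxQueueSize are rejected, large workloads can be chunked before calling embedBatch() (documented next). A minimal sketch, with the chunk size matching the maxQueueSize configured above:

```ts
const MAX_BATCH = 1024; // keep at or below maxQueueSize

// Embed an arbitrarily large list by splitting it into accepted batch sizes.
async function embedAll(texts: string[]): Promise<BatchEmbeddingResultItem[]> {
  const all: BatchEmbeddingResultItem[] = [];
  for (let i = 0; i < texts.length; i += MAX_BATCH) {
    const slice = texts.slice(i, i + MAX_BATCH);
    all.push(...(await pool.embedBatch(slice)));
  }
  return all;
}
```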
embedBatch()
Process multiple texts in parallel across worker pool.
```ts
const texts = ["repo description 1", "repo description 2" /* ... */];
const results = await pool.embedBatch(texts);

for (const { embedding, error } of results) {
  if (error || !embedding) {
    console.error("Embedding failed:", error);
  } else {
    console.log("Vector dimensions:", embedding.length); // 384
  }
}
```
Parameters
Array of text strings to embed
Returns
Promise<BatchEmbeddingResultItem[]>
Array of embedding results (same length as input)
getStatus()
Get current pool status including backend selection and downshift state.
```ts
const status = pool.getStatus();
console.log(`Active workers: ${status.activePoolSize}/${status.configuredPoolSize}`);
console.log(`Backend: ${status.selectedBackend}`);
if (status.downshifted) {
  console.log(`Downshift reason: ${status.downshiftReason}`);
}
```
The returned status object includes:
- configuredPoolSize: configured number of workers
- activePoolSize: currently active workers (may be fewer if downshifted)
- downshifted: whether the pool has reduced to a single worker due to errors
- downshiftReason: reason for the downshift (e.g., "out of memory", error threshold exceeded)
- the total number of embedding errors encountered
- selectedBackend (EmbeddingBackendPreference | null): actually selected backend (may differ from the preferred one)
- the reason for backend fallback, if applicable
setConcurrency()
Manually adjust pool size within configured limits.
```ts
pool.setConcurrency(1); // Reduce to a single worker
```

Parameters
Desired pool size (clamped to maxPoolSize and configuredPoolSize)
terminate()
Terminate all workers and release resources.
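As with Embedder, call terminate() when the pool is no longer needed:

```ts
// Workers cannot be reused after termination
pool.terminate();
```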
Worker Pool Behavior
- Automatic downshifting: Pool reduces to 1 worker after 3 errors or on memory pressure (see the monitoring sketch after this list)
- WebGPU concurrency: Automatically uses 1 worker when the WebGPU backend is selected
- WASM concurrency: Can use up to the configured pool size with the WASM backend
- Error resilience: Individual embedding failures don't crash the entire batch
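A minimal sketch of detecting a downshift at runtime, using only the getStatus() fields documented above:

```ts
// Warn when the pool has downshifted to a single worker.
const status = pool.getStatus();
if (status.downshifted) {
  console.warn(
    `Pool downshifted to ${status.activePoolSize} worker(s): ${status.downshiftReason}`
  );
}
```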