# Security Principles

- **Local-First**: all embeddings and repo content are stored locally in SQLite WASM by default
- **Explicit Consent**: no data is sent to external services without explicit user opt-in
- **Minimal Permissions**: only the GitHub scopes needed for starred repos and READMEs are requested
- **Privacy by Default**: tokens are stored in memory only unless encrypted persistence is configured
## Content Security Policy (CSP)

GitStarRecall enforces strict CSP headers to prevent XSS and code injection attacks.

### Production CSP Configuration

`vite.config.ts`
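A sketch of how these headers might be wired up in `vite.config.ts` (illustrative only: the allowlisted hosts and exact directive values are assumptions, not the project's real policy):

```typescript
// Illustrative only -- not the project's actual vite.config.ts.
import { defineConfig } from "vite";

const csp = [
  "default-src 'self'",
  "script-src 'self' 'unsafe-eval'", // 'unsafe-eval': embedding worker runtime
  "style-src 'self' 'unsafe-inline' https://fonts.googleapis.com",
  "connect-src 'self' https://api.github.com https://huggingface.co", // explicit allowlist
  "worker-src 'self' blob:",
  "object-src 'none'",
  "frame-ancestors 'none'",
].join("; ");

export default defineConfig({
  preview: {
    headers: {
      "Content-Security-Policy": csp,
      "X-Content-Type-Options": "nosniff",
      "Referrer-Policy": "same-origin",
    },
  },
});
```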
### CSP Directives Explained
| Directive | Policy | Rationale |
|---|---|---|
| `script-src` | `'self' 'unsafe-eval'` | `'unsafe-eval'` required for embedding worker runtime |
| `connect-src` | Explicit allowlist | Only trusted API endpoints and model hosts |
| `default-src` | `'self'` | Deny all non-specified origins by default |
| `style-src` | `'self' 'unsafe-inline'` + Google Fonts | `'unsafe-inline'` for Tailwind/component styles |
| `worker-src` | `'self' blob:` | Allow Web Workers and blob URLs for embeddings |
| `object-src` | `'none'` | Block plugins and legacy objects |
| `frame-ancestors` | `'none'` | Prevent embedding in iframes (clickjacking protection) |
> **Note:** The production CSP includes `'unsafe-eval'` in `script-src` due to runtime requirements of the embedding worker and the transformers.js library. This is necessary for WebGPU/WASM execution.

### Additional Security Headers
- `X-Content-Type-Options`: prevents MIME-sniffing attacks
- `Referrer-Policy`: limits referrer information to same-origin requests
## Data Storage and Privacy

### Local Storage Model

All user data is stored client-side using SQLite WASM with OPFS (Origin Private File System) when available.

**What is stored locally:**
- Repository metadata (name, description, topics, language)
- README content (truncated to 100,000 characters)
- Text chunks and embeddings (384-dimensional vectors)
- Chat sessions and message history
- Indexing checkpoints and metadata
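"When available" implies a runtime feature check before choosing a storage backend. A minimal sketch (hypothetical helper, not the app's actual code):

```typescript
// Hypothetical helper (not the app's actual code): choose the SQLite WASM
// backend at startup. OPFS requires navigator.storage.getDirectory();
// when it is missing, fall back to an in-memory database.
async function pickStorageBackend(): Promise<"opfs" | "memory"> {
  const nav = (globalThis as any).navigator;
  if (typeof nav?.storage?.getDirectory === "function") {
    return "opfs";
  }
  return "memory";
}
```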
### Data That Never Leaves Your Device
- GitHub tokens (stored in memory by default)
- Private repository content (unless you opt-in to external LLM)
- Embedding vectors (always generated locally)
- Query history (never sent to servers)
### Data Deletion

Users have full control over local data and can delete it at any time.

## Token Handling
### GitHub OAuth (Recommended)
GitStarRecall uses OAuth with PKCE (Proof Key for Code Exchange) for secure authentication.

**Security features:**

- Code verifier stored in `sessionStorage` (not `localStorage`)
- Tokens never appear in URLs
- Client secret kept on backend only
- Automatic token refresh via backend exchange endpoint
> **Warning:** The `repo` scope is broad (full repository access). This is currently required for GitHub’s starred repos API. Fine-grained tokens are recommended if using Personal Access Tokens (PAT).

### Personal Access Tokens (PAT)
PAT authentication is supported but discouraged. If you must use a PAT:

- Use fine-grained tokens with minimal permissions
- Grant only read-only access to **Contents** and **Starring**
- Set expiration to 90 days or less
- Never commit PATs to version control
### Token Storage Options
| Storage Method | Security | Persistence | Recommended |
|---|---|---|---|
| Memory only (default) | ✅ Highest | ❌ Session only | ✅ Yes |
| Encrypted localStorage | ⚠️ Moderate | ✅ Persistent | ⚠️ Only if needed |
Optional 32-byte hex/base64 key for encrypting API keys in localStorage.

> **Warning:** This key is embedded in the client bundle. It protects against casual localStorage exfiltration but not against attackers with app source access.
`.env.local` (if persistence required)
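For example (the value below is a placeholder; generate your own key, e.g. with `openssl rand -hex 32`):

```shell
# .env.local: only set this if you need tokens to persist across sessions
VITE_LLM_SETTINGS_ENCRYPTION_KEY=<your-32-byte-hex-key>
```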
**Recommendation:** Leave `VITE_LLM_SETTINGS_ENCRYPTION_KEY` unset unless you need persistent login across browser sessions. Tokens in memory-only mode are cleared on tab close.

## README Sanitization
All README content is sanitized before rendering to prevent XSS attacks.

`src/components/SafeMarkdown.tsx`
- Strips `<script>` tags and event handlers
- Removes `javascript:` URLs
- Blocks `data:` URLs (except images)
- Filters dangerous HTML attributes
- Preserves safe markdown formatting
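The URL rules above can be illustrated with a small filter (a hypothetical helper, not the actual SafeMarkdown internals, which rely on `rehype-sanitize`):

```typescript
// Hypothetical URL filter mirroring the rules above: reject javascript:
// URLs outright, allow data: URLs only for images, pass everything else
// (http(s) and relative URLs) through to the sanitizer's schema.
function isSafeUrl(url: string, isImage = false): boolean {
  const trimmed = url.trim().toLowerCase();
  if (trimmed.startsWith("javascript:")) return false;
  if (trimmed.startsWith("data:")) {
    return isImage && trimmed.startsWith("data:image/");
  }
  return true;
}
```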
## External LLM Usage

### Opt-In Model

By default, no data is sent to external services. LLM features require explicit consent.

- `allowRemoteProvider`: enable sending data to remote LLM providers (OpenAI, Anthropic, etc.)
- `allowLocalProvider`: enable sending data to local LLM providers (Ollama, LM Studio)
### Remote Providers

When `allowRemoteProvider` is enabled:
- Only top-8 chunks are sent with each query (not full READMEs)
- User queries and LLM responses are stored locally only
- API keys are managed via in-app settings (optionally encrypted)
**Supported providers:**

- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude)
- DeepSeek
- Moonshot
- Zhipu AI (GLM)
- Custom OpenAI-compatible endpoints
### Local Providers

Local providers run on your machine and never send data to the internet.

**Ollama Integration:**

`.env.local`
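For example (the variable name here is a guess for illustration; check the repository for the exact key — `11434` is Ollama's default port):

```shell
# Hypothetical variable name; non-local endpoints are rejected at runtime
VITE_OLLAMA_ENDPOINT=http://localhost:11434
```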
- Endpoint must be `localhost`, `127.0.0.1`, or `[::1]`
- Non-local endpoints are rejected with an error
- No GitHub token or PAT is sent to Ollama
- Only text chunks and model parameters are transmitted
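The localhost-only rule can be sketched as follows (hypothetical helper name, not the app's actual validator):

```typescript
// Sketch of the localhost-only validation described above. Only loopback
// hosts pass; anything else (including unparseable strings) is rejected.
function isLocalEndpoint(endpoint: string): boolean {
  try {
    const { hostname } = new URL(endpoint);
    // WHATWG URL keeps brackets on IPv6 hosts: "http://[::1]" -> "[::1]"
    return hostname === "localhost" || hostname === "127.0.0.1" || hostname === "[::1]";
  } catch {
    return false;
  }
}
```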
Local providers require `allowLocalProvider=true` in the app settings. They are clearly labeled “Local (Ollama)” in the UI.

### WebLLM Browser Provider
Run LLM inference entirely in the browser using WebGPU.

**Security features:**

- Models downloaded from trusted CDN (Hugging Face)
- CSP `connect-src` restricts download sources
- Explicit user consent before multi-GB download begins
- All inference runs locally (no network requests after download)
- Model recommendation based on device capabilities
`.env.local`

> **Note:** WebLLM is experimental and requires WebGPU support. Models are 500MB-2GB+ and cached in browser storage.
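The WebGPU requirement implies a capability probe before the feature is offered. A minimal sketch (hypothetical helper, not the app's actual code; a full check would also await `navigator.gpu.requestAdapter()`):

```typescript
// Hypothetical capability probe: WebLLM is only offered when the browser
// exposes the WebGPU API on navigator.
function hasWebGpu(): boolean {
  return Boolean((globalThis as any).navigator?.gpu);
}
```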
## Embedding Security

### Local Embedding (Default)

Embeddings are generated in-browser using `@xenova/transformers`.
**Security guarantees:**

- Model: `all-MiniLM-L6-v2` (384 dimensions)
- Source: Hugging Face CDN (allowlisted in CSP)
- Execution: WebGPU or WASM (no external API calls)
- Storage: SQLite WASM (OPFS or in-memory)
- README text never leaves your device
- Embeddings generated locally
- Vector index built client-side
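The client-side vector index can be sketched as a top-k cosine-similarity search over the locally stored 384-dimensional vectors (hypothetical names; at most k chunks, not full READMEs, are ever shown to an LLM):

```typescript
// Illustrative retrieval sketch: rank locally stored chunk embeddings by
// cosine similarity against the query embedding and keep only the top-k.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(query: number[], chunks: { id: string; vec: number[] }[], k = 8) {
  return chunks
    .map((c) => ({ id: c.id, score: cosine(query, c.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```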
### Optional Ollama Embeddings

For faster embeddings on high-performance hardware.

**Trust boundary:** text chunks leave the browser for the local Ollama process; nothing is sent beyond the machine.

## Threat Model (STRIDE)
GitStarRecall follows the STRIDE threat modeling framework.

### Spoofing
**Mitigations:**

- OAuth with PKCE (code challenge prevents replay)
- Tokens never in URLs or localStorage (unless encrypted)
- PAT usage triggers security warning
### Tampering

**Mitigations:**

- README sanitization via `rehype-sanitize`
- CSP blocks inline scripts and untrusted origins
- Checksum validation for repo/README integrity
- No user input in raw SQL queries
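The "no user input in raw SQL" mitigation means every query uses bound parameters. A minimal sketch with a hypothetical database wrapper interface (the app's actual wrapper may differ):

```typescript
// User-supplied text is always passed as a bound parameter, never
// concatenated into the SQL string.
interface Db {
  exec(sql: string, bind: unknown[]): unknown[];
}

function searchRepos(db: Db, term: string) {
  // Safe: `?` placeholder; the driver binds `term` at execution time.
  return db.exec("SELECT id, name FROM repos WHERE name LIKE ?", [`%${term}%`]);
}
```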
### Repudiation

**Mitigations:**

- Embedding run metadata (backend, pool size, timestamp)
- Sync phase tracking (last sync, indexing status)
- Opt-in gates for remote LLM usage
**Planned improvement:** Add explicit “data sent” notice when remote LLM is enabled.
### Information Disclosure

**Mitigations:**

- External LLM off by default (explicit opt-in)
- Only top-8 chunks sent to LLM (not full READMEs)
- Tokens in memory or encrypted storage only
- Chat history local-only and wipeable
- Production logs avoid token/content disclosure
### Denial of Service

**Mitigations:**

- GitHub rate limit handling with exponential backoff
- Concurrency caps (6 concurrent README fetches)
- README truncation (100,000 character limit)
- Web Worker isolation for embeddings
- Queue depth limits (1,024 max)
- Adaptive batch downshifting on memory pressure
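The rate-limit handling above can be sketched as exponential backoff with jitter (base and cap constants are illustrative, not the app's exact values):

```typescript
// Exponential backoff with "equal jitter": the delay doubles per attempt
// up to a cap, with the upper half randomized to avoid thundering herds.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 60_000): number {
  const exp = Math.min(baseMs * 2 ** attempt, capMs);
  return exp / 2 + Math.random() * (exp / 2);
}
```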
### Elevation of Privilege

**Mitigations:**

- Minimal GitHub scopes (read-only access)
- Local endpoints clearly labeled and opt-in
- Browser embedding default (no elevated trust)
- CSP restricts code execution origins
## Security Best Practices

### For Users

**Review LLM provider privacy policies.** Understand where your data goes before enabling remote providers.

**Keep tokens in memory only.** Avoid setting `VITE_LLM_SETTINGS_ENCRYPTION_KEY` unless persistence is required.

### For Developers
**Sanitize all user-provided content.** Use `rehype-sanitize` for markdown; never use `dangerouslySetInnerHTML`.

## Residual Risks
Despite mitigations, some risks remain.

## Security Audit History
| Date | Type | Status | Notes |
|---|---|---|---|
| 2026-02-24 | WebLLM integration review | ✅ Aligned | Consent gate, trusted CDN, capability recommendation |
| 2026-02-23 | Embedding pipeline v2 | ✅ Aligned | Localhost-only Ollama, metadata tracking, resilient retry |
| 2026-02 | STRIDE threat modeling | ✅ Complete | See docs/security-review-stride.md |
## Reporting Security Issues

If you discover a security vulnerability:

- Do not open a public GitHub issue
- Email the security contact (see the repository’s `SECURITY.md`)
- Include steps to reproduce and an impact assessment
- Allow 90 days for coordinated disclosure