# Security Principles

- **Local-First**: all embeddings and repo content are stored locally in SQLite WASM by default
- **Explicit Consent**: no data is sent to external services without explicit user opt-in
- **Minimal Permissions**: only the GitHub scopes needed for starred repos and READMEs are requested
- **Privacy by Default**: tokens are stored in memory only unless encrypted persistence is configured
## Content Security Policy (CSP)

GitStarRecall enforces strict CSP headers to prevent XSS and code injection attacks.

### Production CSP Configuration

`vite.config.ts`
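A sketch of how these headers might be wired up in `vite.config.ts` (illustrative only: the allowlisted hosts and exact directive values are assumptions, not the project's real policy):

```typescript
// Illustrative only -- not the project's actual vite.config.ts.
import { defineConfig } from "vite";

const csp = [
  "default-src 'self'",
  "script-src 'self' 'unsafe-eval'", // 'unsafe-eval': embedding worker runtime
  "style-src 'self' 'unsafe-inline' https://fonts.googleapis.com",
  "connect-src 'self' https://api.github.com https://huggingface.co", // explicit allowlist
  "worker-src 'self' blob:",
  "object-src 'none'",
  "frame-ancestors 'none'",
].join("; ");

export default defineConfig({
  preview: {
    headers: {
      "Content-Security-Policy": csp,
      "X-Content-Type-Options": "nosniff",
      "Referrer-Policy": "same-origin",
    },
  },
});
```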
### CSP Directives Explained
| Directive | Policy | Rationale |
|---|---|---|
| `script-src` | `'self' 'unsafe-eval'` | `'unsafe-eval'` required for embedding worker runtime |
| `connect-src` | Explicit allowlist | Only trusted API endpoints and model hosts |
| `default-src` | `'self'` | Deny all non-specified origins by default |
| `style-src` | `'self' 'unsafe-inline'` + Google Fonts | `'unsafe-inline'` for Tailwind/component styles |
| `worker-src` | `'self' blob:` | Allow Web Workers and blob URLs for embeddings |
| `object-src` | `'none'` | Block plugins and legacy objects |
| `frame-ancestors` | `'none'` | Prevent embedding in iframes (clickjacking protection) |
> **Note:** The production CSP includes `'unsafe-eval'` in `script-src` due to runtime requirements of the embedding worker and the transformers.js library. This is necessary for WebGPU/WASM execution.

### Additional Security Headers
- `X-Content-Type-Options`: prevents MIME-sniffing attacks
- `Referrer-Policy`: limits referrer information to same-origin requests
## Data Storage and Privacy

### Local Storage Model

All user data is stored client-side using SQLite WASM with OPFS (Origin Private File System) when available.

**What is stored locally:**
- Repository metadata (name, description, topics, language)
- README content (truncated to 100,000 characters)
- Text chunks and embeddings (384-dimensional vectors)
- Chat sessions and message history
- Indexing checkpoints and metadata
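"When available" implies a runtime feature check before choosing a storage backend. A minimal sketch (hypothetical helper, not the app's actual code):

```typescript
// Hypothetical helper (not the app's actual code): choose the SQLite WASM
// backend at startup. OPFS requires navigator.storage.getDirectory();
// when it is missing, fall back to an in-memory database.
async function pickStorageBackend(): Promise<"opfs" | "memory"> {
  const nav = (globalThis as any).navigator;
  if (typeof nav?.storage?.getDirectory === "function") {
    return "opfs";
  }
  return "memory";
}
```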
### Data That Never Leaves Your Device
- GitHub tokens (stored in memory by default)
- Private repository content (unless you opt-in to external LLM)
- Embedding vectors (always generated locally)
- Query history (never sent to servers)
### Data Deletion

Users have full control over local data and can delete it at any time.

## Token Handling
### GitHub OAuth (Recommended)
GitStarRecall uses OAuth with PKCE (Proof Key for Code Exchange) for secure authentication.

**Security features:**

- Code verifier stored in `sessionStorage` (not `localStorage`)
- Tokens never appear in URLs
- Client secret kept on backend only
- Automatic token refresh via backend exchange endpoint
> **Warning:** The `repo` scope is broad (full repository access). This is currently required for GitHub’s starred repos API. Fine-grained tokens are recommended if using Personal Access Tokens (PAT).

### Personal Access Tokens (PAT)
PAT authentication is supported but discouraged. If you must use a PAT:

- Use fine-grained tokens with minimal permissions
- Grant only read-only access to **Contents** and **Starring**
- Set expiration to 90 days or less
- Never commit PATs to version control
### Token Storage Options
| Storage Method | Security | Persistence | Recommended |
|---|---|---|---|
| Memory only (default) | ✅ Highest | ❌ Session only | ✅ Yes |
| Encrypted localStorage | ⚠️ Moderate | ✅ Persistent | ⚠️ Only if needed |
Optional 32-byte hex/base64 key for encrypting API keys in localStorage.

> **Warning:** This key is embedded in the client bundle. It protects against casual localStorage exfiltration but not against attackers with app source access.
`.env.local` (if persistence required)
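For example (the value below is a placeholder; generate your own key, e.g. with `openssl rand -hex 32`):

```shell
# .env.local: only set this if you need tokens to persist across sessions
VITE_LLM_SETTINGS_ENCRYPTION_KEY=<your-32-byte-hex-key>
```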
**Recommendation:** Leave `VITE_LLM_SETTINGS_ENCRYPTION_KEY` unset unless you need persistent login across browser sessions. Tokens in memory-only mode are cleared on tab close.

## README Sanitization
All README content is sanitized before rendering to prevent XSS attacks.

`src/components/SafeMarkdown.tsx`
- Strips `<script>` tags and event handlers
- Removes `javascript:` URLs
- Blocks `data:` URLs (except images)
- Filters dangerous HTML attributes
- Preserves safe markdown formatting
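The URL rules above can be illustrated with a small filter (a hypothetical helper, not the actual SafeMarkdown internals, which rely on `rehype-sanitize`):

```typescript
// Hypothetical URL filter mirroring the rules above: reject javascript:
// URLs outright, allow data: URLs only for images, pass everything else
// (http(s) and relative URLs) through to the sanitizer's schema.
function isSafeUrl(url: string, isImage = false): boolean {
  const trimmed = url.trim().toLowerCase();
  if (trimmed.startsWith("javascript:")) return false;
  if (trimmed.startsWith("data:")) {
    return isImage && trimmed.startsWith("data:image/");
  }
  return true;
}
```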
## External LLM Usage

### Opt-In Model

By default, no data is sent to external services. LLM features require explicit consent.

- `allowRemoteProvider`: enable sending data to remote LLM providers (OpenAI, Anthropic, etc.)
- `allowLocalProvider`: enable sending data to local LLM providers (Ollama, LM Studio)
### Remote Providers

When `allowRemoteProvider` is enabled:
- Only top-8 chunks are sent with each query (not full READMEs)
- User queries and LLM responses are stored locally only
- API keys are managed via in-app settings (optionally encrypted)
**Supported providers:**

- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude)
- DeepSeek
- Moonshot
- Zhipu AI (GLM)
- Custom OpenAI-compatible endpoints
### Local Providers

Local providers run on your machine and never send data to the internet.

**Ollama Integration:**

`.env.local`
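For example (the variable name here is a guess for illustration; check the repository for the exact key — `11434` is Ollama's default port):

```shell
# Hypothetical variable name; non-local endpoints are rejected at runtime
VITE_OLLAMA_ENDPOINT=http://localhost:11434
```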
- Endpoint must be `localhost`, `127.0.0.1`, or `[::1]`
- Non-local endpoints are rejected with an error
- No GitHub token or PAT is sent to Ollama
- Only text chunks and model parameters are transmitted
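The localhost-only rule can be sketched as follows (hypothetical helper name, not the app's actual validator):

```typescript
// Sketch of the localhost-only validation described above. Only loopback
// hosts pass; anything else (including unparseable strings) is rejected.
function isLocalEndpoint(endpoint: string): boolean {
  try {
    const { hostname } = new URL(endpoint);
    // WHATWG URL keeps brackets on IPv6 hosts: "http://[::1]" -> "[::1]"
    return hostname === "localhost" || hostname === "127.0.0.1" || hostname === "[::1]";
  } catch {
    return false;
  }
}
```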
Local providers require `allowLocalProvider=true` in the app settings. They are clearly labeled “Local (Ollama)” in the UI.

### WebLLM Browser Provider
Run LLM inference entirely in the browser using WebGPU.

**Security features:**

- Models downloaded from trusted CDN (Hugging Face)
- CSP `connect-src` restricts download sources
- Explicit user consent before multi-GB download begins
- All inference runs locally (no network requests after download)
- Model recommendation based on device capabilities
`.env.local`

> **Note:** WebLLM is experimental and requires WebGPU support. Models are 500MB-2GB+ and cached in browser storage.
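The WebGPU requirement implies a capability probe before the feature is offered. A minimal sketch (hypothetical helper, not the app's actual code; a full check would also await `navigator.gpu.requestAdapter()`):

```typescript
// Hypothetical capability probe: WebLLM is only offered when the browser
// exposes the WebGPU API on navigator.
function hasWebGpu(): boolean {
  return Boolean((globalThis as any).navigator?.gpu);
}
```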
## Embedding Security

### Local Embedding (Default)

Embeddings are generated in-browser using `@xenova/transformers`.
**Security guarantees:**

- Model: `all-MiniLM-L6-v2` (384 dimensions)
- Source: Hugging Face CDN (allowlisted in CSP)
- Execution: WebGPU or WASM (no external API calls)
- Storage: SQLite WASM (OPFS or in-memory)
- README text never leaves your device
- Embeddings generated locally
- Vector index built client-side
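The client-side vector index can be sketched as a top-k cosine-similarity search over the locally stored 384-dimensional vectors (hypothetical names; at most k chunks, not full READMEs, are ever shown to an LLM):

```typescript
// Illustrative retrieval sketch: rank locally stored chunk embeddings by
// cosine similarity against the query embedding and keep only the top-k.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(query: number[], chunks: { id: string; vec: number[] }[], k = 8) {
  return chunks
    .map((c) => ({ id: c.id, score: cosine(query, c.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```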
### Optional Ollama Embeddings

For faster embeddings on high-performance hardware.

**Trust boundary:** text chunks leave the browser for the local Ollama process; nothing is sent beyond the machine.

## Threat Model (STRIDE)
GitStarRecall follows the STRIDE threat modeling framework.

### Spoofing
**Mitigations:**

- OAuth with PKCE (code challenge prevents replay)
- Tokens never in URLs or localStorage (unless encrypted)
- PAT usage triggers security warning
### Tampering

**Mitigations:**

- README sanitization via `rehype-sanitize`
- CSP blocks inline scripts and untrusted origins
- Checksum validation for repo/README integrity
- No user input in raw SQL queries
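The "no user input in raw SQL" mitigation means every query uses bound parameters. A minimal sketch with a hypothetical database wrapper interface (the app's actual wrapper may differ):

```typescript
// User-supplied text is always passed as a bound parameter, never
// concatenated into the SQL string.
interface Db {
  exec(sql: string, bind: unknown[]): unknown[];
}

function searchRepos(db: Db, term: string) {
  // Safe: `?` placeholder; the driver binds `term` at execution time.
  return db.exec("SELECT id, name FROM repos WHERE name LIKE ?", [`%${term}%`]);
}
```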
### Repudiation

**Mitigations:**

- Embedding run metadata (backend, pool size, timestamp)
- Sync phase tracking (last sync, indexing status)
- Opt-in gates for remote LLM usage
**Planned improvement:** Add explicit “data sent” notice when remote LLM is enabled.
### Information Disclosure

**Mitigations:**

- External LLM off by default (explicit opt-in)
- Only top-8 chunks sent to LLM (not full READMEs)
- Tokens in memory or encrypted storage only
- Chat history local-only and wipeable
- Production logs avoid token/content disclosure
### Denial of Service

**Mitigations:**

- GitHub rate limit handling with exponential backoff
- Concurrency caps (6 concurrent README fetches)
- README truncation (100,000 character limit)
- Web Worker isolation for embeddings
- Queue depth limits (1,024 max)
- Adaptive batch downshifting on memory pressure
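The rate-limit handling above can be sketched as exponential backoff with jitter (base and cap constants are illustrative, not the app's exact values):

```typescript
// Exponential backoff with "equal jitter": the delay doubles per attempt
// up to a cap, with the upper half randomized to avoid thundering herds.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 60_000): number {
  const exp = Math.min(baseMs * 2 ** attempt, capMs);
  return exp / 2 + Math.random() * (exp / 2);
}
```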
### Elevation of Privilege

**Mitigations:**

- Minimal GitHub scopes (read-only access)
- Local endpoints clearly labeled and opt-in
- Browser embedding default (no elevated trust)
- CSP restricts code execution origins
## Security Best Practices

### For Users

**Review LLM provider privacy policies.** Understand where your data goes before enabling remote providers.

**Keep tokens in memory only.** Avoid setting `VITE_LLM_SETTINGS_ENCRYPTION_KEY` unless persistence is required.

### For Developers
**Sanitize all user-provided content.** Use `rehype-sanitize` for markdown; never use `dangerouslySetInnerHTML`.

## Residual Risks
Despite mitigations, some risks remain.

## Security Audit History
| Date | Type | Status | Notes |
|---|---|---|---|
| 2026-02-24 | WebLLM integration review | ✅ Aligned | Consent gate, trusted CDN, capability recommendation |
| 2026-02-23 | Embedding pipeline v2 | ✅ Aligned | Localhost-only Ollama, metadata tracking, resilient retry |
| 2026-02 | STRIDE threat modeling | ✅ Complete | See docs/security-review-stride.md |
## Reporting Security Issues

If you discover a security vulnerability:

- Do not open a public GitHub issue
- Email the security contact (see the repository’s `SECURITY.md`)
- Include steps to reproduce and an impact assessment
- Allow 90 days for coordinated disclosure