On-prem AI for Your Codebase
OnPremize is an enterprise AI platform that runs entirely inside your network. It combines code-aware retrieval-augmented generation (RAG) with LoRA fine-tuning to answer questions about your codebase with cited sources.
Enterprise Engineering Pain
Real problems that slow engineering teams down.
Extended Onboarding Time
New engineers spend weeks navigating unfamiliar codebases before becoming productive.
Tribal Knowledge
Critical context lives in the heads of senior engineers, creating bottlenecks and risk.
PR Throughput
Code reviews stall as reviewers struggle to understand cross-cutting concerns.
Compliance Constraints
Regulated environments in finance, healthcare, and government require data to stay on-premises.
Tool Sprawl
Disconnected AI tools fragment workflows and create security gaps.
How It Works
Four steps to code intelligence without data leaving your network
Connect → Index → Retrieve → Answer/Act with full citations
Connect Sources
Securely connect repositories and access-controlled sources inside your network.
Index
We build a private index so your code and history are searchable and can ground every answer.
Retrieve
Queries pull the most relevant context with clear citations for verification.
Answer & Act
Deliver grounded answers with citations back to your source code.
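The four steps above can be sketched as a minimal retrieval loop. This is an illustrative toy, not the OnPremize implementation: the chunking, the token-overlap scoring, and all function names are assumptions made for the sketch; only the citation shape (file path plus line range) mirrors what the product returns.

```python
# Toy sketch of Connect -> Index -> Retrieve -> Answer with citations.
# All names and the scoring method are illustrative, not the actual API.

def index_sources(files):
    """Index: split each connected file into small line-numbered chunks."""
    chunks = []
    for path, text in files.items():
        lines = text.splitlines()
        for start in range(0, len(lines), 4):  # fixed-size chunks for the demo
            chunks.append({
                "path": path,
                "start": start + 1,
                "end": min(start + 4, len(lines)),
                "text": "\n".join(lines[start:start + 4]),
            })
    return chunks

def retrieve(chunks, query, k=2):
    """Retrieve: rank chunks by naive token overlap with the query."""
    terms = set(query.lower().split())
    return sorted(chunks,
                  key=lambda c: len(terms & set(c["text"].lower().split())),
                  reverse=True)[:k]

def answer(chunks, query):
    """Answer: return hits as [path Lstart-Lend] citations for verification."""
    return [f"[{c['path']} L{c['start']}-L{c['end']}]"
            for c in retrieve(chunks, query)]

files = {
    "src/auth/middleware.py": "def verify_token(req):\n    # check the bearer token\n    ...",
    "src/payments/api.py": "def charge(card):\n    # call payment gateway\n    ...",
}
chunks = index_sources(files)
print(answer(chunks, "where is the bearer token checked"))
```

In the real pipeline the overlap score would be replaced by hybrid retrieval (lexical plus embedding search), but the contract is the same: every answer carries citations an engineer can open and verify.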
What Gets Indexed
Code Repositories
Your entire codebase becomes searchable context
- Monorepos and multi-repo setups
- Internal packages and dependencies
- 40+ supported languages and file types
Git History
Understand why code changed, not just what changed
- Commit messages and intent
- Blame and ownership context
- Diffs and code evolution
Internal Documentation
Surface institutional knowledge from existing docs
- READMEs
- Runbooks and playbooks
- Architecture decision records (ADRs)
Jira Issues + PRs
Training signals from your development workflow
- Issue context and requirements
- Merged PR patterns
- Code review feedback
Fine-Tune to Your Standards
Optional local fine-tuning that adapts the model to match your internal conventions
Training Pipeline
Build custom LoRA adapters from your own codebase. Fine-tuning draws on the same sources indexed in the retrieval pipeline — your code, Git history, and documentation.
Read the LoRA fine-tuning platform details
Lightweight adapter that runs alongside your base model, adding your organization's knowledge without modifying the original weights.
Why Fine-Tune?
- Model learns your internal conventions and patterns
- Suggestions match your codebase style automatically
- Fine-tuning runs entirely on your infrastructure
- No training data leaves your network boundary
Example: Learning from commits
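A hedged sketch of what "learning from commits" can look like: pairing each diff with its commit message to produce local fine-tuning records. The record schema, field names, and sample commit are illustrative assumptions, not OnPremize's actual training format.

```python
# Illustrative sketch: turning commit history into LoRA training pairs.
# The schema and sample data are assumptions, not the product's format.

import json

commits = [  # in practice, read from `git log -p` on your own repos
    {"message": "fix: reject expired JWTs in auth middleware",
     "diff": "-    return True\n+    if token.expired:\n+        raise AuthError()"},
]

def to_training_record(commit):
    """Pair the diff (input) with the commit intent (target)."""
    return {
        "prompt": f"Explain the intent of this change:\n{commit['diff']}",
        "completion": commit["message"],
    }

records = [to_training_record(c) for c in commits]
with open("train.jsonl", "w") as f:  # written locally; nothing leaves the network
    for r in records:
        f.write(json.dumps(r) + "\n")
```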
Training data never leaves your infrastructure. Models are fine-tuned locally on your GPUs with no external API calls.
Modules
Code Assistant
Ask questions about your codebase and get grounded answers with citations. Every response includes file paths and line ranges so engineers can verify quickly. Runs inside your network.
Benefits
- Faster onboarding and fewer "who knows this?" interrupts
- Higher trust via cited answers with file + line ranges
- Better suggestions that match internal patterns and conventions
Key Features
- Hybrid retrieval with citations
- Uses Git history to explain why code changed
- OpenAI-compatible and Anthropic-compatible API
- VS Code extension (Coming Soon)
- Slack integration for team Q&A (Coming Soon)
Example Use Case
"Where is authentication handled for payment-api requests?" → returns a 3–7 bullet answer with citations like [src/auth/... L120–L180].
Security & Deployment
Designed for regulated environments. Your data never leaves your boundary.
Data Residency
- No data egress — all processing stays on-premises
Data Protection
- Secrets and PII redaction (Coming Soon)
- Encryption in transit (TLS)
- Encryption at rest (configurable) (Coming Soon)
Operational Controls
- Path allowlists / denylists (Coming Soon)
- Read-only mode option (Coming Soon)
- Approval gates for actions (Coming Soon)
- Audit logging with SIEM export (Coming Soon)
Deployment Options
- Kubernetes (Coming Soon)
- Docker Compose (pilot environments)
- VM-based (Coming Soon)
Operational Model
- Versioned releases with rollback (Coming Soon)
- Offline artifact bundles (Coming Soon)
- Reference architectures for HA (Coming Soon)
OnPremize is designed for regulated environments, including finance, healthcare, and government, with deployment controls scoped to your security requirements.
Choose Your Package
Each package builds on the one before it. Start with code intelligence, grow into automation.
Team
Code-aware AI for your engineering team
- Code Assistant with cited answers
- Hybrid retrieval across your repositories
- OpenAI-compatible and Anthropic-compatible API
- Standard onboarding and integration support
- Community Slack access
Business
Organizational knowledge and custom-trained models
- Everything in Team
- Knowledge Base — auto-generate architecture Q&A
- LoRA fine-tuning on your code patterns
- Priority support SLA
- Dedicated onboarding engineer
Enterprise
AI-assisted changes with quality governance
- Everything in Business
- Change Automation — reviewable patch proposals (Coming Soon)
- Quality Assurance — regression packs and quality gates (Coming Soon)
- Custom model advisory and training consultation
- Executive review cadence and success metrics
Frequently Asked Questions
Common questions about security, deployment, and operations
Deploy on-prem. Keep control. Ship faster.
Enterprise code intelligence that runs entirely inside your infrastructure. VPC-ready. Air-gap compatible. Open-weight models.