FeatBit Experimentation · AI-powered A/B testing

Release with evidence, not instinct

FeatBit Experimentation is an A/B testing platform with an AI agent built in. It packages the best practices of Bayesian statistics, hypothesis discipline, and evidence-based decisions so any team can run rigorous experiments without a statistician.

Start an experiment · View on GitHub

Release Decision Agent

An AI agent that runs the loop with you

The agent isn't a workflow generator — it's a control framework. It decides what kind of decision you're really facing and which lens to apply: shaping intent, sharpening a hypothesis, choosing an exposure strategy, judging whether evidence is sufficient, framing the decision, and closing the cycle with a learning.

The core loop

01 Intent · What outcome?
02 Hypothesis · Falsifiable claim
03 Implementation · Reversible change
04 Exposure · Who sees it
05 Measurement · Primary metric
06 Interpretation · Evidence framing
07 Decision · Continue / pause / rollback
08 Learning · Feed next cycle

Try the agent · Read the skill spec

/featbit-release-decision

CF-01 Intent Clarification

Separate goal from solution before tactics.

CF-02 Hypothesis Discipline

Convert intent into a falsifiable claim.

CF-03 Reversible Change Control

Make change reversible before visible.

CF-04 Exposure Strategy

Decide who sees it — not as a deploy side-effect.

4 of 8 control lenses · the agent picks the right one for your stage

Best practices, packaged

Rigorous experimentation, without the PhD

The algorithms and discipline that top experimentation teams rely on — built into the agent so any team can run controlled experiments on equal footing.

Bayesian inference

Posterior probabilities, credible intervals, and expected loss — the framing reviewers actually use to decide. No p-value rituals.
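To make these three quantities concrete, here is a minimal sketch for a two-variant conversion test. It assumes binary conversions, independent uniform Beta(1, 1) priors, and Monte Carlo sampling from the posteriors. The function name `compare_variants`, its parameters, and the sample counts are illustrative and are not taken from FeatBit's own analysis code.

```python
import random

def compare_variants(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Summarise B versus A under independent Beta(1, 1) priors (an assumption)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(draws):
        # With a Beta(1, 1) prior, the posterior of a conversion rate is
        # Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        diffs.append(rate_b - rate_a)
    diffs.sort()
    # Posterior probability that B's true rate exceeds A's.
    prob_b_better = sum(d > 0 for d in diffs) / draws
    # Central 95% credible interval for the difference in rates.
    interval = (diffs[int(0.025 * draws)], diffs[int(0.975 * draws) - 1])
    # Expected loss of shipping B: the average shortfall in the draws where A wins.
    expected_loss_b = sum(max(-d, 0.0) for d in diffs) / draws
    return {
        "prob_b_better": prob_b_better,
        "credible_interval": interval,
        "expected_loss_b": expected_loss_b,
    }

# Made-up counts: 120/1000 conversions on A, 150/1000 on B.
result = compare_variants(120, 1000, 150, 1000)
print(result)
```

Expected loss is the quantity that answers "how much could we lose if we ship B and we are wrong", which is why it is often read alongside the probability rather than instead of it.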

Hypothesis discipline

One primary metric, a few guardrails, a falsifiable claim. The agent enforces this shape before you can add traffic, so analysis doesn't become storytelling.
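A shape check of this kind could look like the sketch below. The `ExperimentPlan` class, the `shape_problems` function, and the limit of three guardrails are hypothetical and chosen only for illustration. Whether a claim is truly falsifiable cannot be verified mechanically, so the sketch only checks that a claim and a single primary metric are present.

```python
from dataclasses import dataclass, field

MAX_GUARDRAILS = 3  # assumed limit for "a few guardrails"

@dataclass
class ExperimentPlan:
    hypothesis: str
    primary_metric: str
    guardrails: list = field(default_factory=list)

def shape_problems(plan):
    """Return the reasons a plan is not ready for traffic (empty list means ready)."""
    problems = []
    if not plan.hypothesis.strip():
        problems.append("missing hypothesis")
    if not plan.primary_metric.strip():
        problems.append("missing primary metric")
    if plan.primary_metric in plan.guardrails:
        problems.append("primary metric also listed as a guardrail")
    if len(plan.guardrails) > MAX_GUARDRAILS:
        problems.append("too many guardrails")
    return problems

plan = ExperimentPlan(
    hypothesis="The new checkout button raises conversion by at least 1 point",
    primary_metric="checkout_conversion",
    guardrails=["page_load_ms", "refund_rate"],
)
print(shape_problems(plan))
```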

Evidence sufficiency

The agent decides whether to call it now, wait, widen the window, or revisit instrumentation. Urgency doesn't get to pretend to be evidence.
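One way to picture the four outcomes is as a small decision rule. The sketch below is an assumption about how such a rule might be ordered, not the agent's actual logic. The function name `evidence_action`, its inputs, and the 0.95 threshold are all hypothetical.

```python
def evidence_action(prob_b_better, days_run, planned_days,
                    instrumentation_ok, threshold=0.95):
    """Map the current evidence to one of four next steps (illustrative rule)."""
    # Broken tracking invalidates everything downstream, so check it first.
    if not instrumentation_ok:
        return "revisit instrumentation"
    # Evidence is decisive when the posterior clearly favours either variant.
    if prob_b_better >= threshold or prob_b_better <= 1 - threshold:
        return "call it now"
    # Inconclusive but still inside the planned window: keep collecting.
    if days_run < planned_days:
        return "wait"
    # Inconclusive after the planned window: extend the observation period.
    return "widen the window"

print(evidence_action(0.97, days_run=5, planned_days=14, instrumentation_ok=True))
print(evidence_action(0.70, days_run=5, planned_days=14, instrumentation_ok=True))
```

Note that a deadline never appears as an input, which is the point of the rule: only evidence and data quality can move the outcome.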

Better with FeatBit flags. Useful without them.

The platform stands on its own as an A/B testing system. Pair it with FeatBit feature flags and the loop closes — exposure, measurement, and decision share the same source of truth.

Recommended

With FeatBit feature flags

Native integration. Variants, traffic split, targeting rules, and holdout groups are read from your live flags. The agent checks for cross-experiment conflicts before you start a run.

Reversible exposure control out of the box
Pre-start conflict detection across experiments
Decisions ship as flag changes — not as a separate ritual
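As an illustration of what a pre-start conflict check might involve, the sketch below treats two experiments as conflicting when they use the same flag and their traffic ranges overlap. This definition of a conflict, the `Experiment` class, and the percentage-range model are assumptions made for the example and do not describe FeatBit's actual data model.

```python
from dataclasses import dataclass

@dataclass
class Experiment:
    name: str
    flag_key: str
    traffic_start: int  # inclusive percentage bucket (hypothetical model)
    traffic_end: int    # exclusive percentage bucket

def find_conflicts(candidate, running):
    """Return names of running experiments that overlap the candidate's traffic."""
    conflicts = []
    for other in running:
        same_flag = other.flag_key == candidate.flag_key
        # Two half-open ranges overlap when each starts before the other ends.
        overlap = (candidate.traffic_start < other.traffic_end
                   and other.traffic_start < candidate.traffic_end)
        if same_flag and overlap:
            conflicts.append(other.name)
    return conflicts

running = [
    Experiment("pricing-copy", "checkout-v2", 0, 50),
    Experiment("search-ranking", "search-v3", 0, 100),
]
candidate = Experiment("button-color", "checkout-v2", 40, 80)
print(find_conflicts(candidate, running))
```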

Standalone

Without feature flags

Already split traffic somewhere else? Paste observed data, or connect a database. The agent still shapes the hypothesis, analyzes evidence, and frames the decision.

Expert-mode wizard accepts pasted observed data
Bring your own gating, targeting, or rollout system
Bayesian analysis & decision audit still apply

From hypothesis to decision in four stages

Each stage is explicit, tracked, and documented. No more ambiguous “we tested it” — just clear evidence trails.

Define a hypothesis

State what you expect to change and how you'll measure it. The agent enforces this before you can add traffic.

Expose users

Assign traffic — through FeatBit feature flags or any gating mechanism you already use. Control split, targeting, and holdouts.

Analyze evidence

Bayesian analysis runs continuously. Review probability estimates, credible intervals, and guardrail metrics in real time.

Record the decision

Ship, roll back, or iterate, with a documented rationale. The decision record stays with the experiment forever.

Everything your team needs

Built for engineering and product teams that want to ship with confidence.

Bayesian A/B analysis

Real-time probability estimates and expected loss calculations — not just p-values. Know when you have enough evidence to act.

AI-powered decisions

The agent surfaces key metrics, flags statistical concerns, and guides your team through the hypothesis → evidence → decision cycle.

Feature flag native

Pairs natively with FeatBit feature flags — traffic assignment, targeting, and holdouts ride alongside the experiment, not on a parallel track.

Decision audit trail

Every release decision is documented — evidence reviewed, confidence level, rationale. The record lives with the experiment, not in Slack.

Multi-stage experiments

Design, exposure, analysis, and decision — each stage has its own state, blocking criteria, and completion checklist.
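The idea of per-stage blocking criteria can be sketched as a small gate function. The stage names come from the text above. The `advance` function and the checklist items are hypothetical.

```python
STAGES = ("design", "exposure", "analysis", "decision")

def advance(current, checklist):
    """Return (stage, blockers): move forward only when every item is done."""
    blockers = [item for item, done in checklist.items() if not done]
    if blockers:
        # Unfinished items block the transition, so the stage does not change.
        return current, blockers
    position = STAGES.index(current)
    if position + 1 < len(STAGES):
        return STAGES[position + 1], []
    # The final stage has no successor.
    return current, []

stage, blockers = advance("design", {
    "hypothesis written": True,
    "primary metric chosen": False,
})
print(stage, blockers)
```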

Stop guessing. Start deciding.

Set up your first experiment in minutes. Your data stays with you.

Create your first experiment · Read the docs