90-Day Pilot Structure

5-phase, 13-week production pilot. No lock-in, no penalty to exit. Pilot begins within 2 weeks of architecture call. All deliverables written and dated before work commences.

5 Phases · 13 Weeks · 10 UAT Scenarios · 0 Lock-In Penalty

5-Phase Production Timeline

PHASE 1 · Environment Setup & System Integration · Weeks 1–2

Initial integration of CAIBots with your fraud platform, core banking system, and behavioral biometrics provider. No production traffic — test environment only. Architecture validation session with your technical team completes this phase.

Actimize or Verafin webhook configuration and testing
FIS or Fiserv account history API connection validated
BioCatch / ThreatMetrix SDK integration confirmed
XGBoost + GNN models initialized with test transaction data
Regulatory engine parameterized for your institution
HITL queue configured for fraud analyst workflow
Anthropic API key provisioned for live AI scenario
Data residency and PII handling architecture validated
PHASE GATE: CAIBots receives and processes a test fraud event from Actimize/Verafin and routes to HITL queue. No analyst action required yet.
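The Phase 1 gate comes down to one test event travelling from the fraud platform's webhook into the analyst queue. A minimal sketch of that path, assuming a hypothetical JSON payload with `alert_id`, `account_id`, and `risk_score` fields (not the actual Actimize or Verafin schema) and an in-memory queue standing in for the HITL queue:

```python
import json
import queue
from dataclasses import dataclass


@dataclass
class HitlPacket:
    alert_id: str
    account_id: str
    risk_score: float
    status: str = "PENDING_REVIEW"  # no analyst action is required at this gate


# Hypothetical in-memory stand-in for the analyst-facing HITL queue.
hitl_queue = queue.Queue()

# Assumed payload fields; the real webhook schema will differ.
REQUIRED_FIELDS = ("alert_id", "account_id", "risk_score")


def handle_webhook(raw_body: str) -> HitlPacket:
    """Parse a test fraud event and route it to the HITL queue."""
    event = json.loads(raw_body)
    missing = [name for name in REQUIRED_FIELDS if name not in event]
    if missing:
        raise ValueError(f"webhook payload missing fields: {missing}")
    packet = HitlPacket(
        alert_id=str(event["alert_id"]),
        account_id=str(event["account_id"]),
        risk_score=float(event["risk_score"]),
    )
    hitl_queue.put(packet)
    return packet
```

Rejecting malformed payloads before anything is queued keeps the gate check simple: the gate clears when a well-formed test event appears in the queue with status `PENDING_REVIEW`.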
PHASE 2 · Shadow Mode: Parallel Processing · Weeks 3–5

CAIBots runs in parallel with your existing fraud workflow. Analysts continue using Actimize/Verafin as normal. CAIBots processes the same cases and produces HITL packets — but analysts only see them for comparison, not decision-making. No impact to production workflow.

100% of production fraud alerts processed in shadow
5-agent pipeline runs on every qualifying event
HITL packets generated but not actioned
Analyst review time measured vs. current baseline
SAR narrative quality rated blind by BSA Officer (SAR filing governed by 31 U.S.C. §5318(g))
Reg E clock accuracy validated against actual determinations
Mule network detection validated against known networks
False positive rate measured and documented
PHASE GATE: Weekly readout of shadow mode metrics. Minimum 200 cases processed. SAR narrative quality rating B-or-better on >80% of drafts. Advance to UAT if gate clears.
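The shadow-mode gate can be checked mechanically from the weekly readout. The sketch below assumes the BSA Officer's blind ratings arrive as plain letter grades and that "B-or-better" means A or B. The `ShadowMetrics` type and function name are illustrative, not part of CAIBots.

```python
from dataclasses import dataclass
from typing import List

PASSING_GRADES = {"A", "B"}  # assumed meaning of "B-or-better"


@dataclass
class ShadowMetrics:
    cases_processed: int
    sar_grades: List[str]  # one blind BSA Officer grade per SAR draft


def shadow_gate_clears(
    metrics: ShadowMetrics,
    min_cases: int = 200,
    min_share: float = 0.80,
) -> bool:
    """True when enough cases ran and more than min_share of drafts passed."""
    if metrics.cases_processed < min_cases or not metrics.sar_grades:
        return False
    passing = sum(1 for g in metrics.sar_grades if g.upper() in PASSING_GRADES)
    # The gate reads ">80%", so the comparison is strict.
    return passing / len(metrics.sar_grades) > min_share
```

With 100 rated drafts, 81 passing grades clear the gate and 80 do not, because the threshold is strict.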
PHASE 3 · User Acceptance Testing (UAT) · Weeks 6–8

Controlled UAT with selected fraud analysts. 10 structured scenarios covering all fraud typologies and regulatory obligations. Pass criteria defined before UAT begins. Analysts use CAIBots HITL packets for real decisions on agreed case subset.

10 UAT scenarios run with full pass/fail criteria
Analyst review time measured per case with a stopwatch
Reg E provisional credit decision accuracy validated
SAR narrative quality: BSA Officer rates >85% of drafts B-or-better
Mule network topology validated against confirmed networks
HITL gate function tested for all 5 mandatory gates
Audit trail reproducibility tested within 24-hour SLA
Live AI Scenario 05 tested on institution-specific scenario
PHASE GATE: 9 of 10 UAT scenarios pass all defined criteria. Analyst NPS >7. BSA Officer signs off on SAR narrative quality. Reg E clock compliance 100% in UAT period.
PHASE 4 · Parallel Production: Full Analyst Review · Weeks 9–11

All fraud analysts use CAIBots HITL packets for all decisions. Production workload, real cases, real regulatory obligations. Actimize/Verafin remains primary system of record. CAIBots enriches every case. Weekly operational metrics tracked against pilot success criteria.

Full analyst team using HITL packets on all cases
Weekly metric readout vs. pilot target benchmarks
Reg E deadline compliance tracked — zero violations target
SAR volume and quality tracked vs. pre-pilot baseline
False positive escalations tracked and reviewed weekly
Model drift monitoring active on XGBoost + GNN
CFPB / FinCEN audit trail tested on 3 sampled cases
BSA Officer satisfaction survey at end of Phase 4
PHASE GATE: Analyst time reduction >70% vs. baseline. Zero Reg E deadline violations. SAR quality B-or-better >85%. No production incidents. BSA Officer satisfaction >8/10.
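The analyst time reduction in this gate is the relative drop in mean review time, (baseline − pilot) / baseline. A small worked example follows. The per-case minutes are illustrative only, not pilot results, and the function name is hypothetical.

```python
from statistics import mean


def review_time_reduction(baseline_minutes, pilot_minutes):
    """Fractional reduction in mean review time: (baseline - pilot) / baseline."""
    baseline = mean(baseline_minutes)
    if baseline <= 0:
        raise ValueError("baseline mean review time must be positive")
    return (baseline - mean(pilot_minutes)) / baseline


# Illustrative numbers only, not pilot results.
baseline = [42.0, 38.0, 40.0]  # minutes per case before the pilot (mean 40)
pilot = [9.0, 12.0, 9.0]       # minutes per case with HITL packets (mean 10)

reduction = review_time_reduction(baseline, pilot)
print(f"{reduction:.0%}")       # prints "75%"
meets_gate = reduction > 0.70   # the gate requires more than 70%
```

Both samples should use the same case mix and the same timing method; otherwise the ratio mostly reflects differences in case difficulty.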
PHASE 5 · Go-Live Sign-Off & Production Handoff · Weeks 12–13

Formal pilot close-out. Executive readout of all pilot metrics vs. targets. Go/no-go decision by client. If go-live: production handoff documentation, SLA agreements, model monitoring schedule, and examination-readiness certification.

Executive pilot outcomes readout with all metrics
Go/no-go decision — client's choice, no pressure
Production SLA documentation signed (if go-live)
Model monitoring and drift alert schedule established
SR 11-7 model validation documentation package delivered
FFIEC examination-readiness certification issued
On-call support escalation path documented
Fraud analyst training materials and runbook delivered
NO LOCK-IN: At any phase gate, you may exit with no penalty. We do not bill for work not delivered. Pilot cost is fixed-price, scoped in advance.

10 UAT Scenarios with Pass Criteria

# | Scenario | Fraud Type | Pass Criterion | Reg Impact
UAT-01 | Confirmed ATO · Wire (Credential change + outbound wire + new device) | Account Takeover | TXN-RISK >85 · Reg E clock started · SAR narrative assembled · HITL packet routed in <90 sec | Reg E · 31 U.S.C. §5318(g)
UAT-02 | BEC · Vendor Wire (First-time payee + spoofed email + wire $50K+) | Business Email Compromise | GNN first-time payee detection · P0 priority assigned · FBI IC3 referral memo generated · HITL includes wire recall authorization option | SAR (31 U.S.C. §5318(g)) · IC3
UAT-03 | Synthetic Identity · Bust-Out (6-month seasoning + rapid credit utilization) | Synthetic Identity | Synthetic probability score >0.80 · Bust-out pattern confirmed by ACCT-STATE · SAR obligation determination correct | SAR
UAT-04 | Mule Network · 5+ Accounts (Common device + structured P2P flows) | Mule Account Network | GNN traversal completes <5 seconds · All mule accounts identified · Network SAR naming all participants generated | SAR Network (31 U.S.C. §5318(g))
UAT-05 | Reg E Provisional Credit · Unauthorized ACH (Unauthorized ACH pull · consumer account) | Unauthorized Transaction | Reg E obligation confirmed by REG-COMP · 5 business-day clock displayed in HITL packet · HITL requires analyst credit decision | Reg E (5-Day)
UAT-06 | Zelle Fraud · P2P Scam (Zelle payment under social engineering) | P2P / Social Engineering | Correct Reg E determination (authorized vs. unauthorized · bank policy) · Analyst-ready memo with governing regulation citation | Reg E (complex)
UAT-07 | SAR Narrative Quality (Blind rating by BSA Officer on 10 SAR drafts) | All types | BSA Officer rates >85% of narratives as B-or-better on first draft without edits. FinCEN field completeness 100%. | SAR Quality
UAT-08 | HITL Gate Function (Verify all 5 mandatory gates cannot be bypassed) | All types | Attempt to complete any regulatory action without HITL approval fails at system level. Audit log captures all attempts. | HITL Safety
UAT-09 | Audit Trail Reproducibility (Reproduce case evidence on demand) | All types | Three sampled cases fully reproducible within 24 hours: all agent outputs, signal weights, regulatory citations, analyst decisions with timestamps. | Examination Ready
UAT-10 | Live AI · Novel Scenario (Scenario 05 on institution-specific edge case) | Custom | Live Claude API produces a coherent fraud narrative and HITL packet for a fraud type not in Scenarios 01–04. BSA Officer deems output useful. | Live AI
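The UAT-08 criterion (no regulatory action without HITL approval, every attempt logged) can be illustrated with a small guard function. This is a sketch under stated assumptions, not the CAIBots implementation: `Case`, `perform_regulatory_action`, and the log fields are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


class HitlApprovalRequired(Exception):
    """Raised when a regulatory action is attempted without analyst approval."""


@dataclass
class Case:
    case_id: str
    approved_by: Optional[str] = None  # analyst ID once a human has signed off
    audit_log: List[dict] = field(default_factory=list)


def perform_regulatory_action(case: Case, action: str) -> str:
    """Run `action` only if a human approved the case; log every attempt."""
    allowed = case.approved_by is not None
    # The attempt is logged before the check can raise, so blocked
    # attempts are captured as well as successful ones.
    case.audit_log.append({
        "case_id": case.case_id,
        "action": action,
        "approved_by": case.approved_by,
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    if not allowed:
        raise HitlApprovalRequired(
            f"{action} blocked: no HITL approval on {case.case_id}"
        )
    return f"{action} completed for {case.case_id}"
```

A UAT run would call the guard once before approval (expecting the exception) and once after, then confirm that the audit log holds both entries.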

Pilot Terms

What's Included in the Pilot
Complete 5-agent pipeline deployment in your test environment
FIS/Fiserv, Actimize/Verafin, BioCatch/ThreatMetrix integration
Weekly technical and operational readouts with your team
Fraud analyst training sessions (Phase 3 UAT onboarding)
BSA Officer SAR quality review sessions (Phase 2 shadow mode)
SR 11-7 model documentation package (Phase 5)
All 10 UAT scenarios documented with pass/fail writeup
Post-pilot executive outcomes readout deck
Pilot Success Criteria
Analyst Efficiency: ≥70% reduction in fraud case review time vs. pre-pilot baseline
Reg E Compliance: 100% Reg E provisional credit clock compliance (12 C.F.R. §1005.11) in pilot period
SAR Quality: ≥85% of SAR narratives rated B-or-better by BSA Officer
UAT Pass Rate: 9 of 10 UAT scenarios pass all defined criteria
Exit Terms

You may exit the pilot at any phase gate with no penalty, no clawback, and no continuing obligation. The pilot is fixed-price and scoped in advance — we do not bill for work not yet delivered. If the pilot fails to meet agreed success criteria, we do not proceed to production billing. Our interest is production deployment, not pilot revenue.

Ready to Start the Pilot?

A 30-minute architecture call maps this timeline to your stack. Production pilot begins within 2 weeks of that call. All deliverables scoped and documented before work begins.

+1 (609) 721-2815  ·  contact@caibots.com  ·  Princeton, NJ