Sarala Biswal saralabiswal

VP / Director of Engineering · Agentic AI · MCP · MLOps · CPQ · Quote-to-Cash · Hands-On Technical Leader

Belmont, CA · He/Him

About

I build AI/ML platforms that generate revenue — not just predictions. Hands-on engineering leader who architects and codes production systems personally while setting technical direction for a 40+ person global org.

17+ years at Oracle shipping two flagship AI platforms at enterprise scale:

Agentic AI on CPQ — Renewal Agent + Quote Generation Agent · 3,000+ active sales users · 600+ enterprise clients · 50+ countries · 30% renewal cycle compression · 28% quote processing efficiency improvement
Unity CDP AI/ML — 6 production models (Next Best Action, Churn Propensity, CLV, RFM Segmentation, Multi-Touch Attribution, MMM) · 9,000+ customers at day-one GA · no phased rollout

The hard part isn't the model — it's the integration layer. I solved the cross-vendor problem in production: unified live context from Salesforce, MS Dynamics 365, and Oracle clouds so agents make decisions on real data, not cached snapshots.

Featured Projects

🏦 agentic-banking-llmops

Production-grade, cloud-agnostic Agentic AI platform for banking decisions — complete reference architecture for governed agentic AI systems. Ten platform capabilities across eight services, composable by any product team through a stable SDK.

"The engineering problems that make agentic AI fail in production are not model problems. They are infrastructure problems — stale context, ungoverned execution, absent memory, unvalidated models, and no feedback path from outcome back to decision. This platform solves each of those problems as a named, typed, independently testable layer with a clean contract to its neighbors."

Three architectural deficits this platform addresses:

Stale batch context — nightly risk scores reflect yesterday's account state. An agent acting on an 18-hour-old risk score makes a decision that is technically correct but wrong in the world.
Ungoverned agent execution — an agent with no compliance gate between its reasoning and its action can violate CFPB, ECOA, or UDAAP. In a regulated environment, that is not a product risk — it is a legal one.
No closed-loop governance — without outcome capture, memory, and evaluation history, the platform learns nothing across sessions. It restarts blind every time.

Six architectural principles — applied without exception:

Typed contracts at every boundary — Pydantic v2 schemas throughout, no dicts, no untyped kwargs
Protocol-based dependency injection — every external dependency behind a Protocol interface, independently testable
Graceful degradation over hard failure — one failing source marks sources_degraded, pipeline continues
Governance as a runtime capability — Layer 4 runs before Layer 6 executes, not as a post-hoc audit
Immutable, replayable audit trail — one trace_id reconstructs every decision for regulatory replay
Closed feedback loop — outcome events write CustomerMemory records, retrieved at the next session
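
A minimal sketch of how the second and third principles might combine, assuming nothing about the repository's actual code. `ContextSource`, `Signal`, `CustomerContext`, `StaticSource`, `FailingSource`, and `assemble_context` are hypothetical names, and standard-library dataclasses stand in for the Pydantic v2 schemas so the example runs without extra packages. Only `sources_degraded` comes from the text above.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass(frozen=True)
class Signal:
    """One typed feature contributed by a source (hypothetical shape)."""
    name: str
    value: float


@dataclass(frozen=True)
class CustomerContext:
    """Typed result of context assembly; stand-in for a Pydantic v2 model."""
    customer_id: str
    signals: Tuple[Signal, ...] = ()
    sources_degraded: Tuple[str, ...] = ()


class ContextSource(Protocol):
    """Every external dependency sits behind this interface."""
    name: str

    async def fetch(self, customer_id: str) -> Tuple[Signal, ...]: ...


@dataclass
class StaticSource:
    """Test double that returns fixed signals."""
    name: str
    payload: Tuple[Signal, ...]

    async def fetch(self, customer_id: str) -> Tuple[Signal, ...]:
        return self.payload


@dataclass
class FailingSource:
    """Test double that always fails, to exercise degradation."""
    name: str

    async def fetch(self, customer_id: str) -> Tuple[Signal, ...]:
        raise ConnectionError(f"{self.name} unavailable")


async def assemble_context(
    customer_id: str, sources: List[ContextSource]
) -> CustomerContext:
    """Fetch all sources in parallel; record failures instead of raising."""
    results = await asyncio.gather(
        *(source.fetch(customer_id) for source in sources),
        return_exceptions=True,
    )
    signals: List[Signal] = []
    degraded: List[str] = []
    for source, result in zip(sources, results):
        if isinstance(result, BaseException):
            degraded.append(source.name)  # mark the source, keep going
        else:
            signals.extend(result)
    return CustomerContext(customer_id, tuple(signals), tuple(degraded))
```

Because `assemble_context` depends only on the `ContextSource` protocol, a test can pass in doubles like `StaticSource` and `FailingSource` without touching a real data store.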

Reference Architecture — Six Governed Layers:

Layer	Responsibility	Key Pattern
L1 Context Assembly	Live profile < 200ms	Parallel async fetch · two-tier memory (Valkey TTL + Qdrant long-term) · artifact-backed ML scoring · graceful degradation
L2 Vector Search	Right policy at decision time	Hybrid dense + BM25 · RRF fusion · cross-encoder rerank · KB version tracking
L3 Orchestration	Hub-and-spoke · propose only	Tool authorization in code · schema-validated outputs · routed LLM inference service
L4 Guardrails	REGULATORY → BUSINESS → AI	Versioned YAML rules · BISG/AIR fairness · CFPB/ECOA/UDAAP · SLA approval queue
L5 A/B + Model Gov.	Deterministic experiments + drift	Hash-based assignment · champion/challenger · PSI/KS/recall · 4-gate offline eval
L6 SDK + Execution	Product team surface	Blueprints: `PAYMENT_RISK_INTERVENTION` · `BILLING_DISPUTE_RESOLUTION` · `CHURN_PREVENTION` · `FRAUD_ALERT`
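
The RRF fusion step named in the L2 row can be sketched as follows. This is the standard reciprocal rank fusion formula (each document scores the sum of `1 / (k + rank)` over the rankings it appears in), not code taken from the repository. The function name `rrf_fuse`, the document ids, and the constant `k = 60` are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense retriever and a BM25 retriever.
dense_hits = ["policy-7", "policy-2", "policy-9"]
bm25_hits = ["policy-2", "policy-4", "policy-7"]
fused = rrf_fuse([dense_hits, bm25_hits])
```

Documents that both retrievers rank highly rise to the top, which is why the fused list would typically be handed to the cross-encoder reranker rather than either raw list.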

One trace_id reconstructs every decision end-to-end for regulatory replay. 4-gate offline evaluation pipeline: benchmark · fairness · Adverse Impact Ratio · LLM-judge.
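
As an illustration of the Adverse Impact Ratio gate, here is a small sketch under stated assumptions. AIR is computed as the favorable-outcome rate of a protected group divided by that of a reference group. The `0.8` threshold is the commonly cited four-fifths heuristic and is an assumption here, as are the names `GroupOutcomes`, `adverse_impact_ratio`, and `passes_air_gate`. The repository's actual thresholds and group definitions are not stated in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupOutcomes:
    """Counts of favorable decisions for one demographic group."""
    approved: int
    total: int

    @property
    def rate(self) -> float:
        if self.total <= 0:
            raise ValueError("total must be positive")
        return self.approved / self.total


def adverse_impact_ratio(protected: GroupOutcomes, reference: GroupOutcomes) -> float:
    """Favorable-outcome rate of the protected group relative to the reference group."""
    if reference.rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    return protected.rate / reference.rate


def passes_air_gate(
    protected: GroupOutcomes, reference: GroupOutcomes, threshold: float = 0.8
) -> bool:
    """Gate passes when the ratio is at or above the threshold (assumed 0.8)."""
    return adverse_impact_ratio(protected, reference) >= threshold
```

A challenger model that approves 30 of 100 protected-group applicants against 50 of 100 in the reference group has a ratio of about 0.6 and would fail this gate.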

10 UI pages: Pipeline Runner · Architecture View (animated SSE) · Audit Trail · Experiments · Drift Monitor · Guardrails · Model Registry · Evaluation · Settings · About

LLM modes — runtime switching, no restart required:

Mode	Config	Notes
Ollama (default)	`LLM_BACKEND=ollama`	Real local inference — free, no account, no data egress
Mock	`LLM_BACKEND=mock`	Deterministic responses — exercises all layers with zero dependency
API	`LLM_BACKEND=api` + key	LiteLLM — Claude, GPT-4o, 100+ providers
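
One way runtime switching could work is to resolve the backend on every call instead of once at startup. The sketch below assumes that approach. The document does not say how the platform implements it, so `register_backend`, `mock_backend`, and `complete` are hypothetical names, and only the mock mode is implemented. Only the `LLM_BACKEND` variable and its `ollama` default come from the table above.

```python
import hashlib
import os
from typing import Callable, Dict

Backend = Callable[[str], str]
_BACKENDS: Dict[str, Backend] = {}


def register_backend(name: str, backend: Backend) -> None:
    """Make a backend selectable through LLM_BACKEND."""
    _BACKENDS[name] = backend


def mock_backend(prompt: str) -> str:
    """Deterministic reply derived from the prompt; no network, no model."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:8]
    return f"mock-response-{digest}"


register_backend("mock", mock_backend)


def complete(prompt: str) -> str:
    """Look up the backend at call time so a config change applies immediately."""
    name = os.environ.get("LLM_BACKEND", "ollama")
    try:
        backend = _BACKENDS[name]
    except KeyError:
        raise RuntimeError(f"no backend registered for LLM_BACKEND={name!r}") from None
    return backend(prompt)
```

In this sketch the Ollama and API modes would be added with further `register_backend` calls. They are left out so the example runs with no services available.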

8 local services:

Valkey:6379  PostgreSQL:5432  Qdrant:6333  Jaeger:16686
Prometheus:9090  Grafana:3000  MLflow:5001  Ollama:11434

git clone https://github.com/saralabiswal/agentic-banking-llmops
cd agentic-banking-llmops && make install && make docker-up
cp .env.example .env && make seed && make dev
# All 8 services + API + UI with hot-reload — no API key required

🔗 agentic-mcp-quote-to-cash

MCP-powered integration layer for vendor-agnostic quote-to-cash agentic decisions — the cross-vendor architecture pattern running in production across 600+ enterprise clients.

The core proof: vendor selection is configuration, not agent code. Switching CRM from Salesforce to Microsoft Dynamics 365, or Order Management from Oracle FOM to SAP S/4HANA, changes the adapter path and source attribution — the decision agent and canonical schema are untouched.

Slot	Adapter implementations	Canonical output
CRM	Salesforce · MS Dynamics 365 · Oracle CX Sales	Account · Opportunity · Contact · Activity
CPQ	Oracle CPQ Cloud	Product · PriceBook · Quote
Order Management	Oracle FOM · Salesforce OMS · SAP S/4HANA · Zuora · NetSuite	Order · OrderLine · FulfillmentStatus
Subscription	Oracle Sub Cloud · Zuora · Chargebee · Salesforce Revenue Cloud	Subscription · UsageHealth · RenewalSignal
Install Base	Oracle Install Base · Salesforce Asset · ServiceNow CMDB	InstalledProduct · Entitlement

16 adapters. 5 commercial-system slots. One canonical schema. Seven demo scenarios.
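
The "configuration, not agent code" claim can be illustrated with a small adapter sketch. Everything here is hypothetical: the `Account` fields, the stub payloads, the `CRM_ADAPTERS` registry, and the `build_crm` and `describe_account` helpers are assumptions, and the stubs build a payload in memory where a real adapter would call the vendor system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


@dataclass(frozen=True)
class Account:
    """Canonical CRM output; the field set is illustrative only."""
    account_id: str
    name: str
    source: str  # source attribution changes per vendor


class CrmAdapter(Protocol):
    def get_account(self, account_id: str) -> Account: ...


class SalesforceAdapter:
    """Stub adapter mapping a Salesforce-style payload to the canonical Account."""

    def get_account(self, account_id: str) -> Account:
        raw = {"Id": account_id, "Name": "Acme Corp"}  # hypothetical payload
        return Account(raw["Id"], raw["Name"], source="salesforce")


class DynamicsAdapter:
    """Stub adapter mapping a Dynamics-style payload to the canonical Account."""

    def get_account(self, account_id: str) -> Account:
        raw = {"accountid": account_id, "name": "Acme Corp"}  # hypothetical payload
        return Account(raw["accountid"], raw["name"], source="dynamics365")


CRM_ADAPTERS: Dict[str, Callable[[], CrmAdapter]] = {
    "salesforce": SalesforceAdapter,
    "dynamics365": DynamicsAdapter,
}


def build_crm(config: Dict[str, str]) -> CrmAdapter:
    """Vendor selection happens here, driven only by configuration."""
    return CRM_ADAPTERS[config["crm"]]()


def describe_account(crm: CrmAdapter, account_id: str) -> str:
    """Decision-side code: it only ever sees the canonical Account."""
    account = crm.get_account(account_id)
    return f"{account.name} ({account.account_id}) via {account.source}"
```

Switching `{"crm": "salesforce"}` to `{"crm": "dynamics365"}` changes the adapter and the source attribution, while `describe_account` stays untouched.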

git clone https://github.com/saralabiswal/agentic-mcp-quote-to-cash
cd agentic-mcp-quote-to-cash && make install && make seed
make dev-api          # FastAPI → http://localhost:8000
cd ui && npm install && npm run dev -- --port 3001
# No API key required — runs end-to-end in demo mode

🧪 agentops-eval-llmops

Evaluation harness for governed LLM agents — because production AI without evals is just a demo you shipped.

Most agentic AI systems stop at building the agent. This framework answers the question every production deployment eventually faces: how do you know it's still working correctly next month?

Component	What it does
YAML test cases	Benchmark scenarios for payment risk, billing disputes, churn prevention
Independent judge	Separate judge backend — not the same model being evaluated (prevents self-evaluation bias)
Scoring dimensions	Faithfulness · answer relevance · context precision · consistency · latency/quality tradeoff
SUT backends	Mock · Ollama · cloud API · banking platform adapter — swap without changing test cases
Reports	HTML + JSON · SSE streaming · side-by-side model comparison

Plugs directly into agentic-banking-llmops as its evaluation layer — same trace_id, same scenarios, same policy boundaries.
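
The judge/SUT separation described in the table might look like the sketch below. It is not the harness's real API: `EvalCase`, `EvalResult`, `MockSut`, `KeywordJudge`, and `run_eval` are hypothetical, test cases are built in code rather than loaded from YAML, and a keyword-matching judge stands in for an LLM judge so the example needs no model.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass(frozen=True)
class EvalCase:
    case_id: str
    question: str
    expected_keywords: Tuple[str, ...]


@dataclass(frozen=True)
class EvalResult:
    case_id: str
    score: float
    passed: bool


class SystemUnderTest(Protocol):
    model_id: str

    def answer(self, question: str) -> str: ...


class Judge(Protocol):
    model_id: str

    def score(self, case: EvalCase, answer: str) -> float: ...


@dataclass
class MockSut:
    """Deterministic system under test."""
    model_id: str = "mock-sut"

    def answer(self, question: str) -> str:
        return "Offer the customer a payment plan."


@dataclass
class KeywordJudge:
    """Stand-in judge: fraction of expected keywords found in the answer."""
    model_id: str = "keyword-judge"

    def score(self, case: EvalCase, answer: str) -> float:
        if not case.expected_keywords:
            raise ValueError("case has no expected keywords")
        text = answer.lower()
        hits = sum(1 for kw in case.expected_keywords if kw.lower() in text)
        return hits / len(case.expected_keywords)


def run_eval(
    cases: List[EvalCase], sut: SystemUnderTest, judge: Judge, threshold: float = 0.5
) -> List[EvalResult]:
    """Score every case, refusing to let a model grade its own output."""
    if sut.model_id == judge.model_id:
        raise ValueError("judge and system under test must be different models")
    results = []
    for case in cases:
        score = judge.score(case, sut.answer(case.question))
        results.append(EvalResult(case.case_id, score, score >= threshold))
    return results
```

Because the test cases depend only on the two protocols, swapping `MockSut` for another backend would not require changing any case.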

make install && cp .env.example .env
make demo                         # mock backend, no API key required
make dev                          # API → http://localhost:8001

📊 agentic-llm-observability

Production LLMOps control plane for Quote-to-Cash agentic workflows — token cost attribution, quality scoring, latency SLOs, prompt versioning, and semantic drift detection across five providers.

Answers the production questions most enterprises cannot answer:

Question	What the platform tracks
How much did this agent run cost?	Token cost per call · per model · per use case · per provider
Which model and prompt version ran?	Prompt version registry · A/B comparison · rollout history
Did quality stay above threshold?	Faithfulness · relevance · coherence · hallucination signals · quality gates
Were latency SLOs met?	p50 / p95 / p99 per model · SLO compliance % · breach visibility
Did outputs drift from baseline?	Semantic drift score · threshold alerts · operational posture
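
Token cost attribution, the first row above, reduces to a rate-card lookup and a grouped sum. The sketch below assumes a per-1,000-token rate card. The rates, the provider and model names in `PLACEHOLDER_RATES`, and the helpers `call_cost` and `cost_by` are placeholders for illustration, not real prices or the platform's schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RateCard:
    """Price per 1,000 tokens, split by direction."""
    input_per_1k: Decimal
    output_per_1k: Decimal


@dataclass(frozen=True)
class LlmCall:
    provider: str
    model: str
    use_case: str
    input_tokens: int
    output_tokens: int


# Placeholder numbers only; real rate cards vary by provider and change over time.
PLACEHOLDER_RATES: Dict[Tuple[str, str], RateCard] = {
    ("example-cloud", "example-model"): RateCard(Decimal("0.001"), Decimal("0.002")),
    ("local", "example-local-model"): RateCard(Decimal("0"), Decimal("0")),
}


def call_cost(call: LlmCall, rates: Dict[Tuple[str, str], RateCard]) -> Decimal:
    """Cost of one call from its token counts and the matching rate card."""
    card = rates[(call.provider, call.model)]
    total = call.input_tokens * card.input_per_1k + call.output_tokens * card.output_per_1k
    return total / Decimal(1000)


def cost_by(
    calls: List[LlmCall], rates: Dict[Tuple[str, str], RateCard], dimension: str
) -> Dict[str, Decimal]:
    """Sum call costs grouped by one attribute: provider, model, or use_case."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for call in calls:
        totals[getattr(call, dimension)] += call_cost(call, rates)
    return dict(totals)
```

`Decimal` is used rather than `float` so that summed costs do not pick up binary rounding error. Grouping by `"use_case"`, `"model"`, or `"provider"` reuses the same function.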

Five providers with real rate cards:

Provider	Model	Use case
Local LLM	Ollama — Llama 3.2 · Qwen 2.5 · Mistral	Actual execution — standalone, no API key
AWS Bedrock	Claude 3.5 Haiku	Production agent workloads
Azure OpenAI	GPT-4o mini	Global deployment, low-cost reasoning
OCI Generative AI	Cohere Command R	Enterprise RAG-style flows
Google Vertex AI	Gemini 2.0 Flash	Fast agentic workflows

make install && make seed
ollama pull llama3.2              # default local model
make dev-api                      # API → http://localhost:9100
make dev-ui                       # UI  → http://localhost:5173
# No API key required

Production Platforms (Oracle, 2009–Present)

Agentic AI on CPQ	Unity CDP AI/ML Platform
MCP-powered multi-agent orchestration	6 production models · 9,000+ customers · day-one GA
Renewal Agent — autonomous risk scoring + optimized proposal generation	Next Best Action / Offer
Quote Generation Agent — real-time margin enforcement + cross-sell intelligence	Churn & Engagement Propensity
AI Agent Studio — agent lifecycle, tool routing, policy enforcement across the full commercial lifecycle	Customer Lifetime Value (CLV)
Cross-vendor CRM integration layer: Salesforce + MS Dynamics 365 → Oracle CPQ Cloud + Fusion Order Management + Subscription Management · Reference implementation →	RFM Segmentation
	Multi-Touch Attribution (MTA)
	Media Mix Modeling (MMM)
30% renewal cycle compression · 28% quote processing improvement	Full MLOps stack: feature stores · training pipelines (TensorFlow · PyTorch · Hugging Face) · real-time + batch inference · embedding pipelines · vector DBs · RAG · drift detection · responsible AI governance

Open Source Portfolio

Layer	Repo	What it demonstrates
Integration	agentic-mcp-quote-to-cash	16 MCP adapters · cross-vendor live context · CRM-agnostic · Quote-to-Cash lifecycle
Platform	agentic-banking-llmops	6-layer governed agentic pipeline · guardrails · A/B · regulatory replay · 90% test coverage
Platform	agentic-cdp-mlops	8-stage ML platform · 4 models · model registry · governed promotion lifecycle
Ops	agentops-eval-llmops	LLM agent evaluation · judge/SUT separation · faithfulness · quality gates
Ops	agentic-llm-observability	LLMOps control plane · token cost · quality · latency SLOs · 5 providers
Domain	agentic-revenue-cpq	MCP integration · LangGraph · Oracle CPQ-style quote lifecycle
Domain	agentic-hr-onboarding-mcp	MCP connectors · Workday/Jira/Slack/Salesforce · idempotency
Domain	agentic-ecommerce-rag	RAG · LangGraph · multi-agent · quality gate · human feedback

Technical Skills

AI / ML / Agentic

Languages & Frameworks

Cloud & Infrastructure

Data & ML Stack

By the Numbers


17+ years production AI/ML experience	40+ person global org (US + India)
9,000+ customers at day-one platform launch	600+ enterprise clients in production
50+ countries served	6 production ML models shipped at GA
3,000+ active sales users on agentic platform	16 MCP adapters across 5 vendor slots
30% renewal cycle compression	28% quote processing efficiency improvement
2x internal promotion rate increase	32% incident volume reduction

Certifications

Education

Post Graduate Diploma, Machine Learning — Cornell University, NY
MBA, Technology Management — University of Phoenix, AZ
B.S., Computer Science & Engineering — Utkal University, India

I build the platforms that make AI commercially accountable — not just technically impressive.
