This case study describes a real engagement. Client identity, proprietary details, and specific metrics are anonymized or approximated under NDA.
AI Query Assistant for Wealth Management
The advisory team was spending 40% of its working hours on routine portfolio queries. Average response time exceeded four hours for simple questions about balances, document status, and fee breakdowns, creating unnecessary friction in the client relationship.
A domain-trained AI assistant with real-time portfolio data access, compliance guardrails, and automated escalation to human advisors for queries requiring judgment.
This engagement focused on eliminating the mechanical query load from an advisory team without introducing compliance risk. The system was scoped around the 200 most common client questions, classified by regulatory sensitivity, and built with hard guardrails that prevent the assistant from making recommendations or presenting data in advisory-adjacent framing. Delivery was structured in two phases: pipeline and model integration in weeks 1–6, followed by compliance review and phased rollout in weeks 7–8. The assistant was embedded into the existing client portal with a zero-downtime deployment via feature flag.
The Challenge
The primary constraint was regulatory: every response template had to pass legal review before deployment, and the system required guardrails that could not be bypassed by client prompting. The portfolio management backend was a legacy on-premise system with no API layer, requiring a custom sync pipeline to make live data available to the assistant. A zero-downtime requirement was non-negotiable — the portal serves clients globally across time zones. False escalation rates also needed to stay low; excessive escalation would simply shift the advisor load rather than reduce it.
How We Built It
Query taxonomy and compliance scoping (Weeks 1–2): We audited the 200 most frequent advisor queries over the prior 12 months, classifying each as Tier-1 (safe to automate), Tier-2 (automate with disclosure), or Tier-3 (human-only). Simultaneously, we worked with the compliance function to define response boundaries, prohibited phrasings, and mandatory disclaimers. This produced the system prompt specification and guardrail logic document that governed all subsequent model work.
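The tiering logic above can be sketched as a small routing table. This is a hypothetical illustration, not the client's actual guardrail code; the intent names and disclosure text are invented for the example.

```typescript
// Hypothetical sketch of the three-tier taxonomy used to gate automation.
type Tier = "TIER_1" | "TIER_2" | "TIER_3";

interface QueryClass {
  intent: string;       // canonical intent name (invented examples below)
  tier: Tier;
  disclosure?: string;  // mandatory disclaimer attached to Tier-2 responses
}

const taxonomy: QueryClass[] = [
  { intent: "account_balance", tier: "TIER_1" },
  { intent: "fee_breakdown", tier: "TIER_2",
    disclosure: "Figures are informational and do not constitute advice." },
  { intent: "rebalancing_advice", tier: "TIER_3" },
];

// Tier-3 queries never reach the model; they are escalated to a human advisor.
function route(q: QueryClass): "automate" | "automate_with_disclosure" | "escalate" {
  switch (q.tier) {
    case "TIER_1": return "automate";
    case "TIER_2": return "automate_with_disclosure";
    case "TIER_3": return "escalate";
  }
}
```

Keeping the taxonomy as data rather than branching logic is what made legal review tractable: compliance could sign off on a table, not a codebase.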
Data pipeline and backend integration (Weeks 3–4): The legacy portfolio management system had no REST API. We built a scheduled sync process that extracts structured data every 15 minutes into a PostgreSQL read replica with row-level security partitioned by client ID. Redis caches balance summaries and pending document statuses with a 5-minute TTL. The assistant backend runs on AWS Lambda with a Node.js adapter that resolves the authenticated session to a data context before any LLM call, preventing cross-client data leakage.
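The session-to-context resolution step can be illustrated with a minimal sketch. The types and lookups here are invented for illustration; the real system backed this with Redis (5-minute TTL) and a row-level-secured Postgres read replica rather than the in-memory map shown.

```typescript
// Hypothetical sketch: resolve the authenticated session to a
// client-scoped data context before any LLM call is made.
interface Session { userId: string; clientId: string; }
interface DataContext { clientId: string; balances: Record<string, number>; }

// Stand-in for the Redis cache layer (invented for this example).
const cache = new Map<string, DataContext>();

function resolveContext(session: Session): DataContext {
  const cached = cache.get(session.clientId);
  if (cached) return cached;
  // In the real pipeline this reads the Postgres replica, which is
  // partitioned by client ID, so the model can only ever be handed
  // data belonging to the authenticated client.
  const ctx: DataContext = {
    clientId: session.clientId,
    balances: { USD: 125_000 }, // placeholder value
  };
  cache.set(session.clientId, ctx);
  return ctx;
}
```

The key design property is that the data context is fixed before the model runs, so no prompt content can widen it.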
Model implementation and retrieval layer (Weeks 5–6): We used GPT-4 with a tightly constrained system prompt. LangChain manages the retrieval-augmented generation layer — live portfolio data is fetched and formatted into a structured context block on each query rather than relying on parametric knowledge. Tool calls handle specific intents: document retrieval, appointment scheduling, and escalation logging. Every response is logged with the full prompt context for the compliance audit trail.
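The structured context block mentioned above might look like the following sketch. The exact format is an assumption; the function name and field layout are invented, but the principle matches the source: live data is serialized into the prompt on every query rather than trusted to the model's parametric knowledge.

```typescript
// Hypothetical sketch of building the per-query context block that is
// injected into the prompt ahead of the client's question.
function buildContextBlock(
  clientId: string,
  balances: Record<string, number>,
  pendingDocs: string[],
): string {
  return [
    "## PORTFOLIO CONTEXT (authoritative; answer only from this data)",
    `Client: ${clientId}`,
    ...Object.entries(balances).map(
      ([ccy, amt]) => `Balance ${ccy}: ${amt.toFixed(2)}`,
    ),
    `Pending documents: ${pendingDocs.join(", ") || "none"}`,
  ].join("\n");
}
```

Because the block is rebuilt from the read replica on each query, a stale or hallucinated balance cannot survive past a single 15-minute sync window.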
Testing, legal review, and rollout (Weeks 7–8): We ran 1,400 synthetic queries against the system — including adversarial prompts designed to elicit advice or circumvent guardrails — before human testing. The compliance team reviewed 200 randomly sampled live responses and signed off. Deployment used a 10% traffic feature flag, monitored for 48 hours, then rolled to 100% of portal sessions. A monitoring dashboard was delivered alongside the system, tracking daily query volumes, deflection rates, escalation reasons, and post-chat survey scores.
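A percentage-based feature flag of the kind used for the 10% rollout can be sketched as a deterministic hash bucket. This is a generic illustration, not the client's flagging system; the point is that the same session always lands in the same bucket, so the monitored cohort stays stable across the 48-hour window.

```typescript
import { createHash } from "node:crypto";

// Sketch of a deterministic percentage rollout: hash the session ID
// into one of 100 buckets and admit buckets below the threshold.
function inRollout(sessionId: string, percent: number): boolean {
  const digest = createHash("sha256").update(sessionId).digest();
  const bucket = digest.readUInt32BE(0) % 100;
  return bucket < percent;
}
```

Raising `percent` from 10 to 100 widens the cohort without reshuffling anyone already in it, which keeps before/after metrics comparable.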
What We Delivered
Advisor time on Tier-1 queries dropped from 40% of working hours to under 8% within the first month. The assistant handles 68% of all inbound queries without escalation, and average response time fell from over 4 hours to 4.1 seconds. Advisor throughput on relationship-driven work — portfolio reviews, new client onboarding — increased 3.2x based on time-tracking data collected during the post-launch period.
Escalation quality improved alongside deflection rate. When a query reaches an advisor, it arrives with full chat context pre-loaded, eliminating the re-qualification conversation that previously consumed the first several minutes of each call. Escalations that previously averaged 12 minutes of advisor time now average 4 minutes.
The compliance audit trail generated as a byproduct of the system has become an operational asset. The structured interaction logs have reduced quarterly audit preparation time and have passed two internal compliance reviews since launch with no findings. The pattern-based logging also surfaces emerging query types that may warrant new Tier-1 automation in future iterations.
Ready to build something like this?
Tell us what you are building. We will scope it, price it honestly, and give you a clear plan.
Start a Conversation
Free 30-minute scoping call. No obligation.