This is a case study of a real project. The client is a B2B SaaS company with a 12-person sales team. Names and specifics are anonymised. Every number in this article is real.
The problem they brought to us was not unique: their CRM was a graveyard of stale data that nobody trusted. Deals sat in the wrong stages. Follow-ups fell through the cracks. Reps hated logging activity because it took longer than the activity itself.
Three weeks later, their team had an AI-powered assistant they actually used. Here is exactly how it was built.
The Problem: 11 Hours Lost Per Rep, Per Week
Before we built anything, we audited the sales team's workflow for a week. The findings:
- Average rep spent 2.3 hours per day on CRM admin: updating stages, logging call notes, drafting follow-ups.
- Follow-up emails were taking 15-20 minutes each to write from scratch.
- Deal stages were updated inconsistently - 60% of deals had inaccurate stage information at any given time.
- Pipeline reports required manual reconciliation every week because the underlying data was unreliable.
The core insight: this was not a CRM problem. It was an intelligence problem. The CRM already held all the data needed to automate most of the admin work - it just lacked a layer that could act on it.
The Architecture: What We Built
Component 1: Context Retrieval Layer
When a rep opens a deal, the system automatically retrieves deal metadata, all previous notes and emails, and the rep's historical communication style from past sent messages. This context is assembled into a structured prompt that tells GPT-4o exactly what this deal is, where it stands, and what tone to use.
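The assembly step can be sketched as a pure function. This is a minimal illustration, not the production code: the interfaces and field names (`Deal`, `Note`, `valueUsd`, and so on) are assumptions standing in for whatever the real CRM schema exposes.

```typescript
// Hypothetical shapes - the real CRM fields will differ.
interface Deal {
  name: string;
  stage: string;
  valueUsd: number;
}
interface Note {
  date: string; // ISO date of the logged activity
  text: string;
}

// Assemble deal metadata, prior activity, and tone samples into one
// structured prompt. Each section is labelled so the model can
// distinguish facts (history) from instructions (the task).
function buildDealPrompt(
  deal: Deal,
  notes: Note[],
  styleSamples: string[]
): string {
  const noteBlock = notes.map((n) => `- [${n.date}] ${n.text}`).join("\n");
  const styleBlock = styleSamples.map((s) => `---\n${s}`).join("\n");
  return [
    "## Deal",
    `Name: ${deal.name} | Stage: ${deal.stage} | Value: $${deal.valueUsd}`,
    "## Activity history",
    noteBlock || "- (no notes yet)", // cold-start deals get an explicit marker
    "## Rep's past emails (match this tone)",
    styleBlock,
    "## Task",
    "Draft the next follow-up email for this deal.",
  ].join("\n\n");
}
```

Keeping this step as a plain function made it easy to iterate on the template against real deal data without touching the API layer.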
Component 2: GPT Draft Generation
GPT-4o via the OpenAI API, wrapped in a Node.js backend. The prompt template was tuned over two weeks of testing with real deal data - not generic samples - until outputs matched the rep's style and the deal context. Streaming responses were built in from day one. Reps see the email appearing word-by-word, which feels dramatically faster than waiting for a full response.
Component 3: Deal Stage Intelligence
A separate agent monitors deal activity - emails sent, calls logged, meetings booked - and automatically suggests deal stage updates. Reps see a notification: 'Based on your last 3 interactions, this deal is ready to move to Proposal. Update?' One click confirms.
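A heavily simplified version of that suggestion logic looks like the rule below. The real agent weighs more signals; the stage names, thresholds, and `Activity` shape here are assumptions for illustration.

```typescript
// Hypothetical activity record - kinds and fields are assumed.
type Activity = { kind: "email" | "call" | "meeting"; daysAgo: number };

// Simplified rule: if the last 3 interactions include a booked meeting
// and all happened recently, suggest advancing from Demo to Proposal.
// Returning null means "no confident suggestion" - the rep stays in
// control and nothing is updated automatically.
function suggestStage(current: string, recent: Activity[]): string | null {
  const lastThree = recent.slice(0, 3);
  const hasMeeting = lastThree.some((a) => a.kind === "meeting");
  const allRecent =
    lastThree.length === 3 && lastThree.every((a) => a.daysAgo <= 14);
  if (current === "Demo" && hasMeeting && allRecent) return "Proposal";
  return null;
}
```

The one-click confirm matters: the system only ever suggests, so a wrong rule costs a dismissed notification rather than a corrupted pipeline.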
The 3-Week Build Timeline
Week 1: Discovery + Architecture
Days 1-2: CRM API audit and data mapping. Days 3-5: prompt design and initial GPT testing with real deal data. The core finding: the quality of AI output is almost entirely determined by the quality of context you provide. Week one was 80% data work, 20% AI work.
Week 2: Build + Integration
Backend API endpoints. React frontend panel. Rate limiting, cost controls, and fallback logic. End of week: working prototype with live data.
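The cost-control and fallback pieces can be sketched in a few lines. These are minimal in-memory versions under assumed limits (the 50,000-token daily cap is illustrative, not the client's actual number); production would persist budgets and distinguish error types.

```typescript
// Generic fallback: try the primary generator; on any failure
// (timeout, rate limit, API error), fall back to a cheaper model
// or a canned template rather than showing the rep an error.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>
): Promise<T> {
  try {
    return await primary();
  } catch {
    return await fallback();
  }
}

// Naive per-rep daily token budget, kept in memory (assumed cap).
const spent = new Map<string, number>();
function allowSpend(repId: string, tokens: number, dailyCap = 50_000): boolean {
  const used = spent.get(repId) ?? 0;
  if (used + tokens > dailyCap) return false; // over budget: refuse the call
  spent.set(repId, used + tokens);
  return true;
}
```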
Week 3: Testing + Launch
Three reps tested with live deals. Prompt templates iterated based on actual output quality. Edge cases identified and handled: deals with no notes, very old deals, deals with multiple stakeholders. Production deploy on day 19.
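Those edge cases were handled by adapting the prompt per case. A triage step along these lines routes each deal before the template is filled in; the labels and thresholds below are illustrative assumptions, not the production values.

```typescript
// Classify a deal before prompting so the template can adapt
// (thresholds and labels are assumed for illustration).
function triageDeal(
  noteCount: number,
  daysSinceLastTouch: number,
  stakeholders: number
): "cold-start" | "revive" | "multi-thread" | "standard" {
  if (noteCount === 0) return "cold-start"; // no history: generic opener
  if (daysSinceLastTouch > 90) return "revive"; // very old: re-introduction tone
  if (stakeholders > 1) return "multi-thread"; // address the whole group
  return "standard";
}
```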
What We Would Do Differently
Honest retrospective: we underestimated prompt iteration time. We budgeted one week for testing - we needed two. The difference between good AI output and great AI output came down to 20-30 iterations on the prompt template with real deal data. Build in more time for prompt engineering than you think you need. It is not glamorous. It is where quality lives.
