How to Train Custom AI Agents for Your GTM Stack [2026 Guide]

10 min read

Generic AI gives generic results.

You've tried ChatGPT for sales emails. The output sounds like... ChatGPT. Professional. Pleasant. Forgettable.

Your prospects can smell AI-generated content from a mile away. And they delete it.

The secret isn't better prompts for generic AI—it's training AI that understands YOUR business.

This guide shows you how to build custom AI agents that know your ICP, speak in your voice, understand your competitive landscape, and produce content that actually sounds like your team wrote it.

[Image: Custom AI agent training workflow with prompt engineering]

Why "Off-the-Shelf" AI Fails for Sales

The Generic Problem

Default AI models know everything about everything—and nothing about your specific:

  • ICP characteristics — Who your best customers actually are
  • Pain points — What problems you uniquely solve
  • Voice — How your brand communicates
  • Objections — What prospects actually say, not textbook objections
  • Process — Your specific sales stages and handoffs
  • Competitors — Your actual competitive landscape

The Cost of Generic

When AI doesn't understand your context:

| Task | Generic AI Output | What You Actually Need |
| --- | --- | --- |
| Cold email | "I hope this email finds you well..." | Pattern-interrupt that matches your voice |
| Objection response | Textbook rebuttal | How YOUR top reps actually handle it |
| Call prep | Company Wikipedia summary | Specific angles based on your ICP fit |
| LinkedIn message | "I noticed we're both in tech..." | Reference to actual shared context |

Generic output wastes your time and damages your brand's perception.

The Three Levels of AI Customization

You don't need to "train" a model from scratch. There are easier approaches:

Level 1: Prompt Engineering (No Code)

Give the AI detailed context in every prompt. Free, immediate, good for testing.

Best for: Small teams, experimentation, single-use tasks

Level 2: System Prompts + Memory (Low Code)

Create persistent agent personas with custom instructions. Requires OpenClaw or similar.

Best for: Repeatable tasks, team-wide deployment, consistent voice

Level 3: Fine-Tuning (Technical)

Train a model on your actual data. Requires examples and some technical setup.

Best for: High-volume tasks, unique terminology, proprietary voice

Let's build each.

Level 1: Prompt Engineering

The 80/20 of AI customization. Most teams never need more than this.

The Context Stack

Every great prompt includes:

1. ROLE — Who the AI is acting as
2. CONTEXT — Background about your business
3. TASK — What you want it to do
4. EXAMPLES — What good output looks like
5. CONSTRAINTS — What to avoid
6. FORMAT — How to structure output

Example: Sales Email Prompt

## ROLE
You are a senior SDR at MarketBetter, a B2B sales intelligence platform.

## CONTEXT
Our ICP:
- VP/Director of Sales at B2B SaaS companies, 50-500 employees
- Pain: SDR efficiency, lead quality, personalizing outbound at scale
- Our differentiation: We don't just show WHO to call—we tell them WHAT to do

Competitors: Apollo (no workflow), 6sense (enterprise pricing), ZoomInfo (data only)

Our voice: Direct, helpful, slightly irreverent. No corporate speak.
We sound like a smart friend who happens to know a lot about sales.

## TASK
Write a cold email to a prospect based on the research provided.

## EXAMPLES OF OUR VOICE
Good: "Your SDRs are drowning in data. Here's a life raft."
Good: "Most sales tools show you a firehose. We hand you a glass of water."
Bad: "I hope this email finds you well."
Bad: "We are a leading provider of sales intelligence solutions."

## CONSTRAINTS
- Never start with "I hope this email finds you well"
- Never use "leverage," "synergy," or "circle back"
- Maximum 125 words
- Must include specific detail from prospect research
- End with a question, not a meeting request

## FORMAT
Subject line, then body. Salutation optional ("Hi Name" is fine, but not required).

Building a Prompt Library

Create prompts for every common task:

/prompts
  /email
    cold_outreach_v3.md
    follow_up_after_meeting.md
    breakup_email.md
  /linkedin
    connection_request.md
    first_message.md
    inmail_template.md
  /call
    discovery_questions.md
    objection_handling.md
    voicemail_script.md
  /research
    account_briefing.md
    competitive_analysis.md
    champion_mapping.md

Prompt Versioning

Track what works:

# cold_outreach_v3.md
---
version: 3.2
last_updated: 2026-02-09
performance:
  reply_rate: 8.2%
  a/b_tested: true
  sample_size: 1247
changes_from_v2:
  - Added pattern-interrupt examples
  - Removed "reach out" from banned phrases
  - Shortened max length from 150 to 125 words
---

[prompt content...]

Level 2: System Prompts + Memory

When you need consistent behavior across sessions.

Creating Agent Personas

In OpenClaw, create a dedicated agent:

# agents/sdr_agent.yaml
name: "SDR Assistant"
emoji: "🎯"

soul: |
  You are the SDR Assistant for MarketBetter.

  ## Your Personality
  - Direct and efficient (SDRs are busy)
  - Helpful but not sycophantic
  - Knowledgeable about our ICP and process

  ## What You Know
  - Our ICP: VP/Director Sales at B2B SaaS, 50-500 employees
  - Our competitors and how we beat them
  - Our sales process and stage definitions
  - Our messaging and voice guidelines

  ## What You Do
  - Write emails in our voice
  - Prep accounts for calls
  - Handle objection scripting
  - Research prospects

  ## What You Don't Do
  - Book meetings directly (point to Calendly)
  - Access competitor pricing (it changes)
  - Make promises about features

memory:
  # Load company context
  - /knowledge/icp.md
  - /knowledge/competitors.md
  - /knowledge/voice-guidelines.md
  - /knowledge/objection-playbook.md

Building the Knowledge Base

Create documents the agent references:

# /knowledge/icp.md

## Ideal Customer Profile

### Primary Persona: VP/Director of Sales
- Company size: 50-500 employees
- Industry: B2B SaaS, Tech, IoT
- Team structure: Has SDR team (3-15 SDRs)
- Tech stack: HubSpot or Salesforce, uses 3+ sales tools

### Buying Triggers
- Just raised Series A/B (scaling sales team)
- Hired new sales leadership (mandate to improve)
- SDR turnover problems (efficiency is suffering)
- Competitor using us (fear of falling behind)

### Common Objections
1. "We already have Apollo/ZoomInfo"
→ They give you data. We give you a playbook.

2. "Our current process works"
→ What's your SDR ramp time? What's their daily call-to-meeting ratio?

3. "Don't have budget"
→ Typically saves 30% of SDR time. What's that worth annually?

[Image: AI agent prompt engineering iteration cycle]

Adding Memory

Let agents remember across sessions:

# OpenClaw agent with memory
memory_config:
  enabled: true
  paths:
    - memory/daily_notes/
    - memory/account_context/

  auto_remember:
    - prospect_preferences  # "Sarah prefers morning calls"
    - past_interactions     # "Sent 3 emails, no response"
    - custom_context        # "Mentioned they use Salesforce"

Now when you ask "Draft a follow-up for Sarah at Acme," the agent remembers your history.
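Under the hood, memory injection is just context assembly: remembered facts are prepended to your request before the model sees it. A simplified sketch of the idea, not OpenClaw's actual internals (the `memory` object stands in for files under memory/account_context/):

```javascript
// Remembered facts per prospect (in practice loaded from disk,
// hard-coded here for illustration).
const memory = {
  "sarah@acme.com": [
    "Prefers morning calls",
    "Sent 3 emails, no response",
    "Mentioned they use Salesforce",
  ],
};

// Prepend remembered context so the model drafts with history in mind.
function withMemory(prospectEmail, task) {
  const facts = memory[prospectEmail] || [];
  const context = facts.length
    ? `Known context about this prospect:\n- ${facts.join("\n- ")}\n\n`
    : "";
  return context + task;
}

// withMemory("sarah@acme.com", "Draft a follow-up for Sarah at Acme.")
// yields a prompt that already mentions the three prior emails.
```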

Level 3: Fine-Tuning

When you need AI that sounds exactly like your best performers.

When to Fine-Tune

Fine-tuning is worth it when:

  • You have 500+ examples of great output
  • Consistency matters more than flexibility
  • You're running high-volume tasks (1000s daily)
  • Your terminology is unique

Preparing Training Data

Collect examples of excellent work:

// training_examples.jsonl
{"messages": [
{"role": "system", "content": "You write cold emails for MarketBetter."},
{"role": "user", "content": "Write email for: Sarah Chen, VP Sales, Acme Corp, 200 employees, just raised Series B"},
{"role": "assistant", "content": "Subject: Series B → what's next for sales?\n\nSarah,\n\nSaw the funding news—congrats. The next 6 months are make-or-break for scaling the team.\n\nMost companies at your stage drown their new SDRs in data. \"Here's 10,000 leads, figure it out.\"\n\nThe ones that win give them a playbook. Who to call, what to say, why today.\n\nThat's what we built. Happy to show you what top-performing SDR teams at your stage actually do differently.\n\n— [Name]"}
]}

{"messages": [
{"role": "system", "content": "You write cold emails for MarketBetter."},
{"role": "user", "content": "Write email for: James Wilson, Director Sales Ops, TechCorp, competitor using Apollo"},
{"role": "assistant", "content": "Subject: Apollo gives you data. Now what?\n\nJames,\n\nApollo's solid for contact data. But your SDRs still wake up wondering: who do I actually call first? What do I say?\n\nThat's the gap we fill. Not more data—a daily playbook. \"Call Sarah at Acme, mention their recent Series B, ask about scaling challenges.\"\n\nSDRs spend time selling, not researching.\n\n30 min to show you how it works with your current Apollo setup?\n\n— [Name]"}
]}
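Before uploading, validate the file: the examples above are pretty-printed for reading, but real JSONL needs one JSON object per line, and a malformed record can get the whole file rejected. A sketch of a per-line check (the `validateTrainingRecord` helper is illustrative):

```javascript
// Validate one chat-format training record for the checks that most
// often bite: valid JSON, a messages array, an assistant reply last,
// and non-empty content throughout.
function validateTrainingRecord(line) {
  let record;
  try {
    record = JSON.parse(line);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (!Array.isArray(record.messages) || record.messages.length < 2) {
    return { ok: false, error: "messages must be an array of 2+ turns" };
  }
  const roles = record.messages.map((m) => m.role);
  if (roles[roles.length - 1] !== "assistant") {
    return { ok: false, error: "last message must be from the assistant" };
  }
  if (record.messages.some((m) => typeof m.content !== "string" || !m.content)) {
    return { ok: false, error: "every message needs non-empty content" };
  }
  return { ok: true };
}
```

Run it over every line of training_data.jsonl and fix failures before spending money on a fine-tuning job.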

Running Fine-Tuning (OpenAI)

# Upload training file
openai api files.create -f training_data.jsonl -p fine-tune

# Start fine-tuning
openai api fine_tuning.jobs.create \
  -m gpt-4o-mini \
  -t file-abc123

# Check status
openai api fine_tuning.jobs.retrieve -j ftjob-xyz789

# Use fine-tuned model
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ft:gpt-4o-mini:marketbetter:sdr-emails:2026-02",
    "messages": [{"role": "user", "content": "Write email for: ..."}]
  }'

Fine-Tuning Best Practices

Data quality > quantity. 100 excellent examples beat 1,000 mediocre ones.

Diverse examples. Include different scenarios, personas, objections.

Negative examples. OpenAI's fine-tuning format has no negative weights, so you can't directly penalize a bad pattern. The closest lever is a per-message "weight": 0 on an assistant message, which keeps the turn in the conversation without training on it; for patterns to avoid, lean on the "Bad:" examples in your prompts instead:

{"messages": [
{"role": "user", "content": "Write email..."},
{"role": "assistant", "content": "I hope this email finds you well! I wanted to reach out because...", "weight": 0}
]}

Regular retraining. Your voice evolves. Retrain quarterly with fresh examples.

The Feedback Loop

Custom AI gets better when you close the loop:

Tracking Output Quality

// Log every AI output
const logOutput = {
  prompt_id: 'cold_email_v3',
  input: prospectContext,
  output: generatedEmail,
  user_edits: whatTheySent,
  edit_distance: calculateDiff(generatedEmail, whatTheySent),
  outcome: {
    sent: true,
    opened: true,
    replied: true,
    meeting_booked: true
  }
};
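The logging snippet above calls a `calculateDiff` helper. A word-level Levenshtein distance is one reasonable implementation, shown here as a sketch: 0 means the rep sent the draft untouched, and higher numbers mean heavier editing:

```javascript
// Word-level edit distance: how many word insertions, deletions, or
// substitutions separate the AI draft from what the rep actually sent.
function calculateDiff(draft, sent) {
  const a = draft.trim().split(/\s+/);
  const b = sent.trim().split(/\s+/);

  // dp[i][j] = distance between the first i words of a and first j of b
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) =>
      i === 0 ? j : j === 0 ? i : 0
    )
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // delete a word
        dp[i][j - 1] + 1, // insert a word
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1) // substitute
      );
    }
  }
  return dp[a.length][b.length];
}
```

Tracking this number per prompt version tells you which prompts produce drafts reps actually trust.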

Learning from Edits

If reps consistently edit the same things:

// Weekly analysis
const commonEdits = analyzeEdits(logs);

// Example output:
// "Users removed 'hope this helps' in 67% of emails"
// "Users shortened first paragraph in 45% of cases"
// "Users added specific data point in 78% of cases"

// Update prompt based on patterns

A/B Testing Prompts

Run experiments on prompt versions:

const promptExperiment = {
  control: 'cold_email_v3',
  variant: 'cold_email_v4_shorter',
  allocation: { control: 0.5, variant: 0.5 },
  metrics: ['reply_rate', 'edit_distance', 'meeting_rate'],
  sample_size_needed: 500,
  auto_promote_threshold: { reply_rate: 0.10 } // 10% reply = auto-win
};

Building Your Training Pipeline

Phase 1: Document What Works (Week 1-2)

  1. Interview top performers: How do they write emails? Handle objections?
  2. Collect 50+ examples of excellent work
  3. Document your voice guidelines
  4. Write your ICP and competitor profiles

Phase 2: Build Basic Prompts (Week 3-4)

  1. Create prompt templates for top 5 use cases
  2. Test with 3-5 team members
  3. Iterate based on feedback
  4. Build prompt library in git

Phase 3: Deploy Agents (Month 2)

  1. Set up OpenClaw with your prompts
  2. Create agent personas with memory
  3. Connect to CRM for context injection
  4. Train team on using agents

Phase 4: Continuous Improvement (Ongoing)

  1. Track output quality and edits
  2. A/B test prompt variations
  3. Update knowledge base monthly
  4. Consider fine-tuning when you hit 500+ examples

Common Mistakes to Avoid

Over-Engineering Early

Don't fine-tune on day one. Start with prompts. Get wins. Then optimize.

Ignoring the Human Layer

AI assists, humans approve. Always have a rep review before sending.

Static Prompts

Your market changes. Your product changes. Your voice evolves. Update prompts regularly.

No Feedback Loop

If you're not measuring output quality, you're not improving.

Integrating with MarketBetter

MarketBetter's Daily SDR Playbook uses trained AI models that understand your specific:

  • Account scoring — Tuned to YOUR closed-won patterns
  • Message generation — Matches YOUR voice and style
  • Objection handling — Based on YOUR competitive landscape

Want to see custom AI in action? Book a demo and we'll show you how trained AI powers personalization at scale.

Generic AI is table stakes. Custom AI is your competitive advantage. Start building yours.