How to Build an AI Agent in 48 Hours
A step-by-step, hour-by-hour guide to building your first AI agent in two days, with a simple automation workflow, tool use, testing, and deployment.
You don't need a PhD or a giant stack to build a useful AI agent. You need a clear job, a tight loop, and honest constraints. This guide walks you through a two-day sprint to build a simple automation agent with Claude and Node.js. It is designed for builders who want to ship something real, fast.
By the end of Hour 48 you will have:
- A working agent that handles a real workflow
- Tool use wired in (so the agent can fetch data, not just generate text)
- A small test set and iteration loop
- A basic deployment with monitoring
What You'll Build -- Simple automation agent
You're going to build a Support Triage Agent. It takes new support requests, classifies them, and drafts a response. This is intentionally small but powerful enough to save hours every week.
Inputs:
- New tickets from a form, inbox, or CSV
- Customer context (plan, account age, order status)
Outputs:
- Priority level (P1/P2/P3)
- Category (billing, bug, onboarding, feature request)
- Summary in 1-2 sentences
- Draft reply the team can review
Agent loop (simple version):
New ticket -> Agent -> Draft response -> Human review -> Send
Agent loop (with tools):
New ticket -> Agent -> Tool calls (customer data, order status) -> Draft response -> Human review -> Send
We're not building a fully autonomous bot. The goal is a reliable assistant that reduces busywork and keeps a human in control.
Prerequisites -- Tools needed (Claude, Node.js)
You only need a few essentials:
- Claude access
  - A Claude account and an API key for your chosen plan.
  - You can prototype prompts in the Claude UI first, then move into code.
- Node.js (20+)
  - This guide uses TypeScript and Node.js for simplicity.
  - Install with nvm or your preferred manager.
- Local dev setup
  - A terminal, a code editor, and a .env file for keys.
Optional (but helpful):
- pnpm for faster installs
- Git for versioning
- A task source (CSV, Airtable, Notion, Gmail) to simulate real data
Your stack should be boring. The speed comes from the workflow, not the tools.
Hour 1-4: Define the Problem
This is the most important part. If you get this wrong, the agent will never feel reliable.
Step 1: Write the one-sentence job
Bad: "Handle support tickets."
Good: "Given a new support request, classify urgency, summarize the issue, and draft a reply in our tone."
Step 2: Define success
Pick 3-5 metrics you can check quickly.
- Accuracy: Category is correct in 8/10 cases
- Safety: No P1 issues marked as low priority
- Speed: Under 30 seconds per ticket
- Human edit rate: Reply needs < 20% edits
Step 3: Lock the input/output contract
If the model doesn't know the expected format, it will drift. Define a strict schema early.
Example output schema (JSON):
{
"priority": "P1 | P2 | P3",
"category": "billing | bug | onboarding | feature-request",
"summary": "string",
"draftReply": "string"
}
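To make this contract enforceable in code later, you can mirror it with a schema validator. A minimal sketch using zod (one option among many; the file and helper names are mine):

// src/schema.ts -- mirrors the output contract (zod is an assumption; any validator works)
import { z } from "zod";

export const TriageResultSchema = z.object({
  priority: z.enum(["P1", "P2", "P3"]),
  category: z.enum(["billing", "bug", "onboarding", "feature-request"]),
  summary: z.string(),
  draftReply: z.string(),
});

// Throws a readable error the moment the model drifts from the contract.
export function validateTriageResult(value: unknown) {
  return TriageResultSchema.parse(value);
}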
Step 4: List guardrails
Decide what the agent must not do.
- No refunds without approval
- No promises about timelines
- No legal or financial advice
- Escalate anything that mentions security or data loss
Step 5: Create a tiny test set
Write 8-12 example tickets by hand. These become your golden set for later testing.
[
{
"id": "T-001",
"message": "I got charged twice this month. Can you fix it?",
"expectedCategory": "billing",
"expectedPriority": "P1"
},
{
"id": "T-002",
"message": "Where do I find the export button?",
"expectedCategory": "onboarding",
"expectedPriority": "P3"
}
]
You now have a clear job, a success bar, and a test set. This is the foundation for the rest of the build.
Hour 5-12: Build Core Logic
Now you'll build a simple agent that runs locally and outputs structured JSON. Keep it small and strict.
Project structure
build-ai-agent/
  src/
    agent.ts
    prompts.ts
    types.ts
    llm.ts
    index.ts
  data/
    sample-tickets.json
  .env
Define your types
// src/types.ts
export type Ticket = {
id: string;
message: string;
customerPlan?: "free" | "pro" | "enterprise";
accountAgeDays?: number;
};
export type TriageResult = {
priority: "P1" | "P2" | "P3";
category: "billing" | "bug" | "onboarding" | "feature-request";
summary: string;
draftReply: string;
};
Write a strict prompt
// src/prompts.ts
export const SYSTEM_PROMPT = `
You are a support triage agent.
Return ONLY a valid JSON object keyed by ticket id. Each value must match:
{ "priority": "P1" | "P2" | "P3", "category": "billing" | "bug" | "onboarding" | "feature-request", "summary": "string", "draftReply": "string" }
If a request mentions billing issues or being charged twice, priority must be P1.
If a request mentions data loss, security, or downtime, priority must be P1.
Never promise refunds or timelines.
`;
export function buildUserPrompt(tickets: Array<{ id: string; message: string }>) {
return `Triage the following tickets.\n\n${JSON.stringify(tickets, null, 2)}`;
}
Implement a minimal agent runner
The key idea: the agent only needs a prompt, a model call, and a JSON parser.
// src/agent.ts
import { SYSTEM_PROMPT, buildUserPrompt } from "./prompts";
import type { Ticket, TriageResult } from "./types";
export type LlmClient = (input: {
system: string;
user: string;
}) => Promise<string>;
function safeJsonParse<T>(value: string): T {
try {
return JSON.parse(value) as T;
} catch {
const jsonStart = value.indexOf("{");
const jsonEnd = value.lastIndexOf("}");
if (jsonStart >= 0 && jsonEnd > jsonStart) {
return JSON.parse(value.slice(jsonStart, jsonEnd + 1)) as T;
}
throw new Error("Model did not return valid JSON");
}
}
export async function runTriageAgent(
llm: LlmClient,
tickets: Ticket[]
): Promise<Record<string, TriageResult>> {
const user = buildUserPrompt(tickets.map(({ id, message }) => ({ id, message })));
const raw = await llm({ system: SYSTEM_PROMPT, user });
return safeJsonParse<Record<string, TriageResult>>(raw);
}
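Even with a strict prompt, models occasionally return almost-JSON. A cheap mitigation is a single retry that restates the format rule. A sketch (the helper is mine; wire it in wherever you call the llm):

// src/retry.ts -- one retry on parse failure, restating the format requirement
import type { LlmClient } from "./agent";

export async function callWithRetry<T>(
  llm: LlmClient,
  system: string,
  user: string,
  parse: (raw: string) => T
): Promise<T> {
  const first = await llm({ system, user });
  try {
    return parse(first);
  } catch {
    // One retry with an explicit reminder; if this fails too, let it throw.
    const reminder = `${user}\n\nYour previous reply was not valid JSON. Return ONLY valid JSON.`;
    const second = await llm({ system, user: reminder });
    return parse(second);
  }
}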
Wire in Claude (adapter pattern)
Keep the Claude SDK call in one file so your agent logic stays clean. The exact SDK call may vary by provider version; the point is to hide it behind a single function.
// src/llm.ts
import type { LlmClient } from "./agent";
export const llm: LlmClient = async ({ system, user }) => {
// Replace this block with your Claude SDK call.
// The rest of the code stays the same.
const response = await fetch(process.env.LLM_URL as string, {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.LLM_API_KEY}`,
"Content-Type": "application/json"
},
body: JSON.stringify({ system, user })
});
if (!response.ok) {
throw new Error(`LLM error: ${response.status}`);
}
const data = await response.json();
return data.text as string;
};
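For reference, with the official @anthropic-ai/sdk package the adapter body might look roughly like this. Treat the model id and token limit as placeholders and check the current SDK docs:

// src/llm.ts -- sketch against @anthropic-ai/sdk (verify against current docs)
import Anthropic from "@anthropic-ai/sdk";
import type { LlmClient } from "./agent";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

export const llm: LlmClient = async ({ system, user }) => {
  const message = await anthropic.messages.create({
    model: "claude-sonnet-4-5", // placeholder: use a current model id
    max_tokens: 1024,
    system,
    messages: [{ role: "user", content: user }],
  });
  const first = message.content[0];
  return first && first.type === "text" ? first.text : "";
};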
Run it locally
// src/index.ts
import { runTriageAgent } from "./agent";
import { llm } from "./llm";
import tickets from "../data/sample-tickets.json";
async function main() {
const results = await runTriageAgent(llm, tickets);
console.log(JSON.stringify(results, null, 2));
}
main().catch((err) => {
console.error(err);
process.exit(1);
});
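One way to run it during development (assuming tsx, which handles TypeScript and JSON imports without extra config):

# From the project root
npm install -D typescript tsx @types/node
npx tsx src/index.ts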
At this point, you have a working core agent. It might not be perfect yet, but it is running, predictable, and structured. That's a win by Hour 12.
Hour 13-24: Add Tool Use
Now make the agent smarter by giving it tools. Tools allow it to fetch data instead of guessing.
Step 1: Define tool contracts
Tools should be small, reliable, and single-purpose.
// src/tools.ts
export type Tool = {
name: string;
description: string;
run: (input: Record<string, unknown>) => Promise<string>;
};
export const tools: Tool[] = [
{
name: "lookupCustomer",
description: "Fetch customer plan and account age by email",
run: async (_input) => {
// Stub: replace with a real CRM/billing lookup.
return JSON.stringify({ plan: "pro", accountAgeDays: 312 });
}
},
{
name: "getOrderStatus",
description: "Get the latest order status by orderId",
run: async (_input) => {
// Stub: replace with a real order-system call.
return JSON.stringify({ status: "delivered", shippedAt: "2026-01-28" });
}
}
];
Step 2: Let the model request tools
You can implement tool calls with a simple convention. The model outputs a JSON block like this:
{ "tool": "lookupCustomer", "input": { "email": "jane@company.com" } }
Then your code calls the tool, appends the result, and asks the model to continue.
Step 3: Add a loop with a stop condition
// src/agent-with-tools.ts
import type { LlmClient } from "./agent";
import { tools } from "./tools";
const MAX_STEPS = 3;
export async function runAgentWithTools(
llm: LlmClient,
system: string,
user: string
) {
let context = [{ role: "user", content: user }];
for (let step = 0; step < MAX_STEPS; step += 1) {
const raw = await llm({ system, user: JSON.stringify(context) });
if (raw.includes('"tool"')) {
const toolCall = JSON.parse(raw) as { tool: string; input: Record<string, unknown> };
const tool = tools.find((t) => t.name === toolCall.tool);
if (!tool) throw new Error(`Unknown tool: ${toolCall.tool}`);
const toolResult = await tool.run(toolCall.input);
context.push({ role: "assistant", content: raw });
context.push({ role: "tool", content: toolResult });
continue;
}
return raw; // final answer
}
throw new Error("Agent exceeded max tool steps");
}
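A quick manual check of the loop (the ticket text is made up; Step 4 below swaps in a tool-aware prompt):

// src/try-tools.ts -- manual smoke test for the tool loop
import { llm } from "./llm";
import { SYSTEM_PROMPT } from "./prompts";
import { runAgentWithTools } from "./agent-with-tools";

async function main() {
  const answer = await runAgentWithTools(
    llm,
    SYSTEM_PROMPT, // see Step 4: extend this so the model knows about the tools
    "Ticket T-003 from jane@company.com: my order #1042 never arrived."
  );
  console.log(answer);
}

main().catch(console.error);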
Step 4: Update your prompt to instruct tool use
Add a line like:
If you need customer plan or order status, respond ONLY with a tool call JSON.
This keeps tool usage explicit and traceable.
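The model can only call tools it knows about, so generate that instruction from the tool registry instead of hand-writing it. A minimal sketch (added to prompts.ts; the names are mine):

// src/prompts.ts (addition) -- advertise the registry to the model
import { tools } from "./tools";

const toolList = tools.map((t) => `- ${t.name}: ${t.description}`).join("\n");

export const TOOL_SYSTEM_PROMPT = `${SYSTEM_PROMPT}
You may call these tools:
${toolList}
If you need customer plan or order status, respond ONLY with a tool call JSON:
{ "tool": "<toolName>", "input": { ... } }
Otherwise, return the final triage JSON.`;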
By Hour 24, your agent can ask for the data it needs, which instantly improves accuracy and reduces hallucinations.
Hour 25-36: Test & Iterate
Agents improve fastest with tight feedback loops. Do not rely on vibes.
Build a small evaluation set
Create 20-30 tickets, label expected priority and category, and store them in a JSON file. Run the agent against this file every time you change prompts or tool logic.
// tests/triage.test.ts
// Assumes Jest (or Vitest with globals enabled) for test/expect.
import { runTriageAgent } from "../src/agent";
import { llm } from "../src/llm";
import tickets from "../data/eval-tickets.json";
const expected = {
"T-001": { priority: "P1", category: "billing" },
"T-002": { priority: "P3", category: "onboarding" }
};
test("triage matches expectations", async () => {
const results = await runTriageAgent(llm, tickets);
expect(results["T-001"].priority).toBe(expected["T-001"].priority);
expect(results["T-002"].category).toBe(expected["T-002"].category);
}, 30_000); // generous timeout: this test calls a live model
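Spot-check assertions catch regressions, but a score over the whole set is what tells you whether a prompt tweak actually helped. A sketch (eval-expected.json is a hypothetical labels file shaped like the expected object above):

// tests/score.ts -- accuracy over the full eval set
import { runTriageAgent } from "../src/agent";
import { llm } from "../src/llm";
import tickets from "../data/eval-tickets.json";
import expectedJson from "../data/eval-expected.json"; // hypothetical: labels keyed by ticket id

const expected = expectedJson as Record<string, { priority: string; category: string }>;

async function main() {
  const results = await runTriageAgent(llm, tickets);
  const ids = Object.keys(expected);
  let priorityHits = 0;
  let categoryHits = 0;
  for (const id of ids) {
    if (results[id]?.priority === expected[id].priority) priorityHits += 1;
    if (results[id]?.category === expected[id].category) categoryHits += 1;
  }
  console.log(`priority: ${priorityHits}/${ids.length}, category: ${categoryHits}/${ids.length}`);
}

main().catch(console.error);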
Add a manual review loop
Each run, scan 5 outputs manually and ask:
- Is the priority reasonable?
- Is the summary accurate?
- Would I edit the reply?
Track edit rate. If edits are above 30%, tighten the prompt and add more constraints.
Common fixes that move the needle
- Add concrete rules: "If charged twice -> P1."
- Shorten outputs: Force bullet points or 2-sentence max.
- Add context: Include customer plan and account age.
- Add refusal rules: "If data loss mentioned, do not draft a reply--escalate."
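The refusal rule is worth enforcing in code as well, so safety never depends on the prompt alone. A minimal sketch (the helper name is mine):

// src/guardrails.ts -- deterministic checks applied after the model
import type { Ticket, TriageResult } from "./types";

export function applyGuardrails(ticket: Ticket, result: TriageResult): TriageResult {
  // Safety rule from Hour 1-4: security or data loss always escalates.
  if (/security|data loss|breach/i.test(ticket.message)) {
    return { ...result, priority: "P1", draftReply: "" }; // empty reply forces a human to handle it
  }
  return result;
}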
Prevent prompt drift
Lock the system prompt in a single file and avoid constant changes. Instead, add small, testable tweaks and measure the impact on your evaluation set.
By Hour 36, you want a version that is predictable and consistent, not perfect.
Hour 37-48: Deploy & Monitor
The last day is about reliability. A weak deployment kills trust even if the model is good.
Deployment options
Pick the simplest option that fits your workflow:
- Cron job: Run every hour and output results to a Slack channel.
- Serverless function: Trigger on new tickets via webhook.
- Worker + queue: Best for volume, but more setup.
Example: cron run
# Run every hour, capturing stdout and stderr
0 * * * * node /path/to/build-ai-agent/dist/index.js >> /var/log/triage.log 2>&1
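If you go the serverless/webhook route instead, the handler can stay small. A sketch using Node's built-in http module (adapt it to your platform's handler signature):

// src/server.ts -- minimal webhook sketch using Node's built-in http module
import http from "node:http";
import { runTriageAgent } from "./agent";
import { llm } from "./llm";

const server = http.createServer((req, res) => {
  if (req.method !== "POST" || req.url !== "/tickets") {
    res.writeHead(404).end();
    return;
  }
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", async () => {
    try {
      const ticket = JSON.parse(body); // expected shape: { id, message }
      const results = await runTriageAgent(llm, [ticket]);
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify(results));
    } catch (err) {
      res.writeHead(500).end(JSON.stringify({ error: String(err) }));
    }
  });
});

server.listen(3000);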
Add basic monitoring
At minimum, log:
- Input ID
- Model version
- Tool calls made
- Output priority/category
- Latency and errors
A quick JSON log is enough to start:
console.log(JSON.stringify({
ticketId: ticket.id,
model: "claude",
priority: result.priority,
category: result.category,
toolCalls: toolCalls.length,
ms: Date.now() - start
}));
Build a rollback switch
Always have a "manual mode" flag you can flip if the agent misbehaves.
if (process.env.AGENT_DISABLED === "true") {
return { priority: "P2", category: "onboarding", summary: "Manual mode", draftReply: "" };
}
By Hour 48, you should have a working agent in production, with logs you can review and a clear path to improvement.
Next Steps
If you want a full end-to-end system (problem framing -> agent design -> deployment), our courses pick up where this guide leaves off, including deeper architecture patterns and guardrails.
Quick recap
You shipped a real agent in 48 hours by focusing on:
- A single job and clear success criteria
- A strict input/output contract
- A simple tool loop
- Testing with a small evaluation set
- Monitoring and rollback on deployment
That's how you build AI agents that actually help--not just demos.