Understand what AI agents are, how they differ from regular LLMs, the ReAct loop, tool use, memory, planning, and the most popular frameworks in 2025.
An AI agent is more than a chatbot. Where a regular LLM responds to a single prompt and stops, an agent can plan multi-step tasks, call external tools, remember context across turns, and loop until the job is done. This guide explains every layer.
┌─────────────────────────────────────────────────────────┐
│ Regular LLM │
│ │
│ User prompt ──► LLM ──► Single response │
│ │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ AI Agent │
│ │
│ User goal ──► Planner ──► Tool call 1 │
│ ▲ │ │
│ │ Observation 1 │
│ │ │ │
│ └── Think ◄────┘ │
│ │ │
│ Tool call 2 ──► Final answer │
│ │
└─────────────────────────────────────────────────────────┘
A key property: agents run in a loop. They observe, think, act, and repeat until a stopping condition.
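That loop can be sketched abstractly. In this sketch, `think` and `act` are hypothetical placeholders: in a real agent, `think` would be an LLM call and `act` a tool invocation.

```javascript
// Abstract agent loop: observe → think → act, repeated until the agent
// decides to finish or the step budget runs out (the stopping condition).
function runAgent(goal, { think, act, maxSteps = 10 }) {
  let observation = goal;
  const trace = [];
  for (let step = 0; step < maxSteps; step++) {
    const decision = think(observation, trace); // decide next action, or finish
    trace.push(decision);
    if (decision.finish) return { answer: decision.answer, trace };
    observation = act(decision.action, decision.args); // execute, then observe
  }
  return { answer: null, trace }; // budget exhausted — no answer
}
```

The `maxSteps` cap matters in practice: without it, a confused model can loop forever.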
ReAct (Reasoning + Acting) is the most common agent reasoning pattern, introduced by researchers at Princeton and Google in 2022. The model alternates between Thought, Action, and Observation steps:
Question: What is the current weather in Tokyo, and should I bring an umbrella tomorrow?
Thought: I need to look up the current weather and tomorrow's forecast for Tokyo.
Action: weather_tool({ "city": "Tokyo" })
Observation: { "today": "Partly cloudy, 22°C", "tomorrow": "Rain 80%, 19°C" }
Thought: Tomorrow has 80% rain probability. I should recommend an umbrella.
Action: FINISH
Answer: It's partly cloudy today (22°C). Tomorrow expects 80% chance of rain at 19°C — yes, bring an umbrella.
This loop continues until the agent decides it has enough information to answer.
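Implementing ReAct means parsing the model's `Action:` lines back into structured tool calls. A minimal sketch, assuming the `tool_name({ json args })` format shown in the trace above (the format itself is a convention, not a standard):

```javascript
// Parse a ReAct "Action:" line of the form `tool_name({ ...json args })`.
// Returns null for lines that don't match — including the FINISH action,
// which has no parentheses and signals the loop should stop.
function parseAction(line) {
  const match = line.match(/^Action:\s*(\w+)\((.*)\)\s*$/);
  if (!match) return null;
  const [, name, rawArgs] = match;
  return { name, args: rawArgs ? JSON.parse(rawArgs) : {} };
}
```

Production frameworks avoid this fragile text parsing by using native tool-calling APIs, which return structured arguments directly.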
The language model is the reasoning engine. It decides what to do next: which tool to call, what arguments to pass, how to interpret each observation, and when the task is complete.
Popular choices: GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, Llama 3.1.
Tools extend what the agent can do beyond text generation:
{
  "type": "function",
  "function": {
    "name": "search_web",
    "description": "Search the internet for real-time information",
    "parameters": {
      "type": "object",
      "properties": {
        "query": { "type": "string", "description": "The search query" }
      },
      "required": ["query"]
    }
  }
}
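The schema tells the model what it *can* call; the agent runtime still has to map the emitted tool name to an implementation. A minimal synchronous dispatcher sketch, with `search_web` stubbed out (real tools are usually async network calls):

```javascript
// Map tool names (as declared in the JSON schema) to implementations.
// The search_web body is a stub — a real version would call a search API.
const toolImplementations = {
  search_web: ({ query }) => ({ results: [`stub result for: ${query}`] }),
};

function callTool(name, args) {
  const impl = toolImplementations[name];
  // Fail loudly on unknown names — models occasionally hallucinate tools.
  if (!impl) throw new Error(`Unknown tool: ${name}`);
  return impl(args);
}
```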
Common tool categories:
| Category | Examples |
|---|---|
| Information retrieval | Web search, database queries, vector search |
| Code execution | Python REPL, bash, JS sandbox |
| External APIs | Email, calendar, CRM, Slack |
| File system | Read/write files, parse PDFs, CSVs |
| Browser | Puppeteer, Playwright for web automation |
Memory types in AI agents:
┌────────────────┬────────────────────────────────────────────────────┐
│ Short-term │ The current conversation context (the prompt │
│ (In-context) │ window). Limited by model's max tokens. │
├────────────────┼────────────────────────────────────────────────────┤
│ Long-term │ External storage — vector DB, SQL, key-value store.│
│ (External) │ Retrieved via semantic search when relevant. │
├────────────────┼────────────────────────────────────────────────────┤
│ Episodic │ Past task results stored and recalled for similar │
│ │ future tasks ("Last time I did X, Y worked best"). │
└────────────────┴────────────────────────────────────────────────────┘
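Long-term memory boils down to: store entries, retrieve the most relevant ones when needed. A toy sketch using naive keyword overlap as the relevance score — a production agent would use embeddings and a vector database instead:

```javascript
// Minimal long-term memory: store text entries, retrieve top-k by naive
// keyword overlap with the query. Stands in for semantic (vector) search.
class MemoryStore {
  constructor() {
    this.entries = [];
  }
  add(text) {
    this.entries.push(text);
  }
  retrieve(query, k = 2) {
    const queryWords = new Set(query.toLowerCase().split(/\W+/));
    return this.entries
      .map((text) => ({
        text,
        score: text
          .toLowerCase()
          .split(/\W+/)
          .filter((w) => queryWords.has(w)).length,
      }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k)
      .map((e) => e.text);
  }
}
```

Retrieved entries get injected into the prompt, bridging long-term (external) memory into short-term (in-context) memory.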
For complex tasks, agents decompose goals into sub-tasks:
Goal: "Research competitor pricing and create a comparison spreadsheet"
Plan:
1. Search web for Competitor A pricing page
2. Extract pricing tiers from page
3. Search web for Competitor B pricing page
4. Extract pricing tiers
5. Compare our pricing against both competitors
6. Create CSV with comparison table
7. Return download link
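Once a plan exists, execution is a pipeline: run each sub-task in order, feeding its result to the next. A sketch where each step is a plain function — in a real agent each would be a tool call or an LLM call:

```javascript
// Execute a plan's steps sequentially, passing each step's output forward.
// `steps` is an array of functions; `initialInput` is the user's goal/context.
function executePlan(steps, initialInput) {
  const results = [];
  let input = initialInput;
  for (const step of steps) {
    input = step(input); // each step sees the previous step's output
    results.push(input);
  }
  return results; // full trail of intermediate results, for logging/debugging
}
```

More sophisticated planners re-plan between steps when an observation invalidates the original plan, rather than executing it blindly end to end.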
const OpenAI = require('openai');
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// 1. Define tools
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  },
];

async function main() {
  // 2. Send initial message
  const messages = [{ role: 'user', content: "What's the weather in Tokyo?" }];
  let response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages,
    tools,
    tool_choice: 'auto',
  });

  // 3. Agent loop — capped to prevent runaway iterations
  let iterations = 0;
  while (response.choices[0].finish_reason === 'tool_calls' && iterations++ < 10) {
    const toolCalls = response.choices[0].message.tool_calls;
    messages.push(response.choices[0].message); // add assistant message

    for (const call of toolCalls) {
      const args = JSON.parse(call.function.arguments);
      const result = await callTool(call.function.name, args); // your impl
      messages.push({
        role: 'tool',
        tool_call_id: call.id,
        content: JSON.stringify(result),
      });
    }

    // Send updated messages back to the model
    response = await client.chat.completions.create({
      model: 'gpt-4o',
      messages,
      tools,
    });
  }

  console.log(response.choices[0].message.content);
}

main();
| Framework | Best for | Language |
|---|---|---|
| LangChain | Flexible pipelines, many integrations | Python / JS |
| LlamaIndex | RAG and knowledge-base agents | Python |
| AutoGen | Multi-agent conversations | Python |
| CrewAI | Role-based multi-agent teams | Python |
| Vercel AI SDK | Full-stack web agents | TypeScript |
| Semantic Kernel | Enterprise .NET / Python | C# / Python |
Use an agent when:
- The task takes multiple steps whose order depends on intermediate results
- The model needs external tools (search, code execution, APIs) to succeed
- Decisions must be made dynamically and can't be scripted in advance
Don't use an agent when:
- A single LLM call or a fixed pipeline can do the job
- The workflow is deterministic and better expressed as plain code
- Your latency, cost, or reliability budget can't absorb many model calls
Agents with tool access can cause real-world side effects. Always apply the principle of least privilege:
✓ Give agents only the tools they need for the task
✓ Use read-only API keys where possible
✓ Require human confirmation before destructive actions
✓ Log every tool call for auditability
✓ Set max iterations to prevent infinite loops
✓ Sandbox code execution tools
The most important safety rule: never give an agent access to production infrastructure without human-in-the-loop confirmation for write operations.
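Several items on that checklist can be enforced in one wrapper around the tool-calling function. A sketch, where `confirm` is a hypothetical human-in-the-loop hook (e.g. a CLI prompt or an approval queue):

```javascript
// Guardrail wrapper: audit log, max-call budget, and human confirmation
// for tools flagged as destructive. `confirm(name, args)` is a hypothetical
// callback that returns true only when a human approves the call.
function guarded(callTool, { maxCalls = 20, destructiveTools = [], confirm }) {
  let calls = 0;
  const log = [];
  const wrapped = (name, args) => {
    if (++calls > maxCalls) throw new Error('Max tool calls exceeded');
    log.push({ name, args }); // audit every call, allowed or not
    if (destructiveTools.includes(name) && !confirm(name, args)) {
      throw new Error(`Destructive call "${name}" not confirmed by a human`);
    }
    return callTool(name, args);
  };
  wrapped.log = log; // expose the audit trail
  return wrapped;
}
```

Passing the wrapped function to the agent loop means the model never gets a direct line to destructive tools: every call goes through the budget, the log, and the human gate.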