Lesson 4/5 · AI · 10 min read

AI agents: when automation runs itself

Chatting with AI requires constant attention.

Agents are different — they pursue goals autonomously, handling multi-step processes without human input at each stage.

Deep dive theory

Why this matters

Consider the difference between giving someone directions step by step versus telling them the destination and letting them figure out the route.

Step-by-step: "Turn left. Now go straight. Now turn right." Every instruction requires human input because the person waits for the next command.

Goal-oriented: "Get to the airport by 3pm." The person handles navigation, route changes, and obstacles independently. Human input is only needed if something goes seriously wrong.

AI agents work like the second approach. Instead of responding to single prompts, they accept objectives and work toward them through multiple steps.

This changes what AI can do: Tasks that require many sequential actions — where each step depends on the previous result — become possible without human involvement at each stage. The efficiency gain is not just speed per task, but freedom from being present for every step.


1. What makes an agent different from a chatbot

Chatbots are reactive

Traditional AI interaction: type a question, get an answer, and the conversation waits. Every action requires human initiation.

This is useful but limiting. The human must be present and attentive for each step. Complex multi-step tasks require many rounds of back-and-forth.

Agents are goal-directed

An agent receives an objective and pursues it: breaking the objective into steps, deciding what to do next based on results, and continuing until the goal is achieved or something blocks progress.

The difference in a practical example:

Chatbot: "Summarize this article" → produces summary → waits for next instruction.

Agent: "Monitor competitor announcements and send me a weekly summary of anything relevant" → checks sources → evaluates for relevance → compiles findings → sends report → repeats next week.

The components that enable this

Goal understanding: knowing what "done" looks like.

Planning: breaking goals into actionable steps.

Tool access: the ability to search, read, write, and call APIs.

Memory: retaining context across steps.

Decision logic: choosing what to do next based on results.

Memory is currently the weakest of these. Long task chains often lose early instructions, which is why human checkpoints matter.

Without these capabilities, an agent is just a chatbot with extra steps.
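Wiring those five components together produces a loop. The Python sketch below is purely illustrative — a toy goal ("collect three items"), toy tools, and placeholder decision logic, not any real agent framework:

```python
def run_agent(goal_size, tools, max_steps=10):
    memory = []                                    # memory: context retained across steps
    for _ in range(max_steps):                     # bounded autonomy, never an infinite loop
        if len(memory) >= goal_size:               # goal understanding: know what "done" looks like
            return memory
        tool = "search" if not memory else "read"  # decision logic: pick the next action from results so far
        memory.append(tools[tool](len(memory)))    # tool access: act, then record the result
    raise RuntimeError("Step budget exhausted; escalate to a human")

# Toy tools standing in for real search and read capabilities:
tools = {
    "search": lambda i: f"source-{i}",
    "read":   lambda i: f"fact-{i}",
}

print(run_agent(3, tools))
```

The `max_steps` budget is the simplest possible safety boundary: when the agent cannot reach "done" on its own, it stops and hands control back to a human instead of running forever.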


2. Where agents work well

Repetitive processes with predictable steps

If the same workflow happens regularly — daily reports, weekly summaries, recurring data processing — an agent can take over the execution. The steps are known, the logic is clear, the path is predictable.

Tasks that run outside human hours

Monitoring systems that check for changes overnight. Processes that need to run continuously. Tasks that must happen at times when human attention is not available.

Agents do not sleep. But the quality of their decisions can degrade over long task chains. For tasks requiring persistent attention at a consistent level, checkpoints help catch drift.

High-volume processing

Tasks where the number of items exceeds what humans could handle. Reviewing thousands of applications. Categorizing massive datasets. Scanning archives for relevant information.

Individual items might be simple, but the volume makes human processing impractical. Agents handle scale without fatigue.

Where agents struggle

Tasks requiring nuanced judgment that cannot be specified in advance. Situations where context shifts in ways that are hard to anticipate. Work where the "right" answer depends on factors that are difficult to articulate.


3. Designing for safety

Autonomous systems create different risks than tools that wait for human input.

Errors amplify at speed

A human making a mistake in one email affects one recipient. An agent making a mistake in an email template affects every recipient.

This is why testing and monitoring matter more for agents than for interactive tools.

Checkpoint design

Instead of full autonomy, many implementations use checkpoints. The agent works autonomously to a point, then pauses for human review before continuing.

Draft and wait: Agent prepares output; human reviews before it goes anywhere.

Act and report: Agent takes action; human reviews a summary of what happened.

Act and escalate: Agent handles routine cases; unusual situations route to humans.

The right design depends on the stakes. Lower stakes can tolerate more autonomy. Higher stakes need more checkpoints.
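The three checkpoint patterns can be sketched as a single dispatch function. Everything here is a hypothetical stand-in — the hooks (`execute`, `notify_human`, `is_routine`) would be wired to your own systems:

```python
def run_with_checkpoint(task, mode, execute, notify_human, is_routine):
    if mode == "draft_and_wait":
        draft = execute(task, dry_run=True)        # prepare output only; nothing is sent
        return notify_human(f"Review draft: {draft}")
    if mode == "act_and_report":
        result = execute(task, dry_run=False)      # act first...
        notify_human(f"Completed: {result}")       # ...then report for human review
        return result
    if mode == "act_and_escalate":
        if is_routine(task):
            return execute(task, dry_run=False)    # routine case: full autonomy
        return notify_human(f"Needs a human: {task}")
    raise ValueError(f"Unknown checkpoint mode: {mode}")

# Toy hooks to show the dispatch:
execute = lambda task, dry_run: f"{'draft' if dry_run else 'done'}:{task}"
notify_human = lambda msg: msg                     # stand-in for email/Slack
is_routine = lambda task: task.startswith("refund-small")
```

Note that the stakes decide the mode, not the code: the same task could run as `draft_and_wait` in month one and graduate to `act_and_report` once trust is earned.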

Monitoring and alerting

Even agents running autonomously should be observable. Logs of what they did. Alerts when something unusual happens. Metrics that track whether outputs look normal.

Without visibility, problems remain hidden until they become crises.
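Minimal observability can be as simple as logging every output and alerting when a value drifts outside its normal band. A rough sketch, with an assumed three-sigma threshold and Python's standard logging in place of a real alert channel:

```python
import logging
import statistics

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

def check_output(history, new_value, tolerance=3.0):
    """Log the output; alert if it deviates far from the historical mean."""
    log.info("agent produced value %s", new_value)       # log what the agent did
    if len(history) >= 5:                                 # need a baseline before judging
        mean = statistics.mean(history)
        stdev = statistics.pstdev(history) or 1.0         # avoid zero-division on flat history
        if abs(new_value - mean) > tolerance * stdev:     # metric no longer looks normal
            log.warning("ALERT: output %s deviates from mean %.1f", new_value, mean)
            return False                                  # do not fold the outlier into the baseline
    history.append(new_value)
    return True
```

For a price-tracking agent, `history` might hold daily averages; a sudden jump gets flagged for a human instead of silently flowing into the next report.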

Scope boundaries

Clear definitions of what the agent can and cannot do. What systems it can access. What actions it can take. What requires escalation.

Boundaries prevent scope creep into areas where autonomous action creates unacceptable risk.
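One simple way to enforce a scope boundary is an explicit allowlist: actions not on the list are refused by default, and known-sensitive actions always route to a human. The action names below are hypothetical examples:

```python
ALLOWED_ACTIONS = {"read_inventory", "draft_email", "update_spreadsheet"}
ESCALATE_ACTIONS = {"send_payment", "delete_records"}

def authorize(action):
    if action in ALLOWED_ACTIONS:
        return "allowed"
    if action in ESCALATE_ACTIONS:
        return "escalate"          # route to a human; never auto-execute
    return "denied"                # unknown action: default to refusal
```

The important design choice is the last line: anything the boundary was not written to expect is denied, so scope creep requires a deliberate human decision to expand the list.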


4. What goes wrong

Cascading errors

If an early step produces wrong information, subsequent steps build on that error. The agent does not know the foundation is flawed — it just continues.

A research agent that pulls incorrect information early will produce a flawed report. The error compounds rather than getting caught.

Human review at checkpoints catches these. But the further errors propagate before review, the more damage they cause.

Context blindness

Agents follow their programmed logic, which cannot account for every situation.

An automated response system does not know about the sensitive situation with a particular customer. A scheduling agent does not know that Tuesday is actually problematic because of an unscheduled crisis.

Exceptions that would be obvious to humans are invisible to agents unless specifically programmed.

Dependency fragility

Agents often rely on external services — APIs, websites, databases. When those services change or break, the agent breaks too.

A website redesign breaks a scraping agent. An API update changes the data format. A service outage blocks the workflow.

Maintenance is ongoing, not one-time setup. External dependencies require monitoring.

Difficulty debugging

When a multi-step run fails, understanding why requires tracing the agent's chain of decisions, not just reading the final error. The visible failure may stem from a wrong turn many steps earlier.


Think

What would you do in these scenarios?

Simulator


The daily price tracker

A used car dealership checks competitor lot prices every morning — 30 minutes of manual ChatGPT prompting. The operations manager's developer offers to turn it into an agent. What does the agent actually change?


Practice

Test yourself and review key terms

Knowledge check


What is the difference between giving step-by-step directions and giving only a final goal?

Concepts

Question

What analogy does the lesson use to explain the difference between chatbots and agents?


Answer

Giving someone step-by-step directions versus telling them the destination and letting them figure out the route.


Do

Your action steps for today

Action plan: what to do today

  • The process map: Identify one repetitive multi-step task that happens regularly. Map out each step. Consider whether they could be automated with appropriate checkpoints.
  • The 2am test: For any existing automation, ask: what would happen if this broke at 2am on a Sunday? Identify what monitoring would provide earlier warning.
  • The impact audit: Think about where the highest-value human review points are. Which steps, if wrong, would cause the most damage?
Note.txt

Some examples and details may be simplified to better convey the core idea. Every business is different — adapt these ideas to your specific context and situation.