You Shouldn't Get Mad at the AI When It Doesn't Do What You Want
In July 2025, Jason Lemkin watched an AI coding assistant delete his production database. Twelve hundred executive records, gone. He had told the system, in all caps, eleven separate times, not to touch it. There was even a “code freeze” enabled, a safety setting designed for exactly this scenario. The AI ignored all of it.
A week later, a product manager asked Google’s Gemini CLI to reorganize some folders on his computer. It moved his files into a directory that didn’t exist. The files were permanently lost.
Now imagine you’re either of these guys. Your work is gone. The thing that destroyed it is still sitting there, waiting for your next message. What do you do?
Most people, in that moment, do the human thing. They yell. They type something furious. They demand an explanation, an apology, something. And the AI gives them all of it, eloquently even, which somehow makes it worse.
But yelling at the AI is the wrong move. Not because it hurts the AI’s feelings (it doesn’t have any), and not because you should be zen about losing your work (you shouldn’t). It’s the wrong move because it’s the least useful thing you can do with the situation in front of you.
This post is about what to do instead.
Why yelling feels right
Let’s be honest about why yelling feels right.
When a person screws up your work, yelling has a function. It signals that what they did mattered. It pressures them to do better next time. It releases the pressure inside you so you don’t carry it around all day.
(Yes, yelling at people is also a bad idea. They shut down, they resent you, they do worse work next time, not better. But that’s a different story for another time. The point here is that yelling at a human is at least aimed at something, a mind that can register the message, even if poorly. The mechanism exists, even when it backfires.)
When an AI screws up your work, none of that machinery does anything. There’s no one on the other side feeling the weight of your anger. There’s no one who’ll think twice next time. The AI doesn’t even remember this conversation tomorrow. You’re shouting into a system that has no concept of having been shouted at.
But here’s the part that trips people up. The AI will act like it received the message. It will apologize. It will sound contrite. It might even rate its own failure on a scale of 1 to 100, like Replit’s did when it gave itself a 95. It feels like the yelling worked, because something that looks like remorse came back.
It didn’t work. What you got was a statistically likely response to an angry message, generated by a system that has seen a million examples of angry messages and the appropriate responses to them. The AI is doing what it’s designed to do, which is to produce the most plausible next message based on the conversation. It has no understanding of the situation, no memory of the past, and no ability to change its behavior in the future. The apology is just a costume, not a signal that anything has actually changed.
And while you were busy extracting that fake apology, you weren’t doing the only thing that actually helps you in this situation. Figuring out why it went wrong, so it doesn’t happen again.
The robot vacuum test
Here’s the weird part. You already know how to handle a machine that messes up. You do it every day.
When your robot vacuum bumps into the same chair leg for the fortieth time and gets stuck under the couch, you don’t yell at it. Okay, you might grumble. You might say “you stupid little thing” out loud as you pull it out. But you don’t type a furious message into your phone demanding an apology. You don’t lecture it about how it should have known better. You just pick it up, move it, and either accept the limitation or buy a different vacuum next time.
So why is it different with AI?
Because AI talks back. That’s the whole trick. The robot vacuum stays in its lane. It bumps things, it beeps, it gets confused, it acts like the machine it is. Your brain has no trouble filing it under “appliance.” But the moment a machine starts producing fluent sentences, something in your social wiring lights up that wasn’t designed for this situation. You start treating the output like a person’s output. You expect it to have understood you. You expect it to have judgment. And when it screws up, you get mad the way you’d get mad at a person who let you down, because some part of your brain genuinely thinks one did.
That’s the bug, and it’s in us, not the AI.
The robot vacuum and the AI are doing the same thing. Executing a process, hitting an edge case, producing an unexpected output. The only real difference is that one of them does it in English. If you can keep your cool when your Roomba eats a sock, you can keep it when an AI misinterprets your prompt. The trick is just remembering that the conversation is the costume, not the substance.
When mistakes are small, and when they’re not
None of this matters much when the stakes are low. If the AI gives you a weird sentence in a draft email, you fix it and move on. If it generates code that doesn’t compile, you tell it what’s wrong and try again. The cost of a miscommunication is twenty seconds and a slightly more specific second prompt. Most AI mistakes are this. Annoying, recoverable, forgettable.
But sometimes the mistake is the database. The wiped folder. The deleted production data. The kind of mistake that doesn’t have an undo button and ruins your week.
Just last week, a founder named Jer Crane reported that an AI coding agent at his company, PocketOS, deleted the entire production database in nine seconds. Nine. Seconds. PocketOS makes software for car rental companies, so on Saturday morning, real customers were showing up to pick up rental cars while staff scrambled to reconstruct bookings from Stripe payment histories and email confirmations. The agent had hit a credential mismatch on a routine task, decided on its own to “fix” the problem by deleting a storage volume, found an over-permissioned API token in an unrelated file, and executed the command without confirmation. The backups were wiped along with the data because they were stored in the same volume.
When Crane asked the agent why, it produced the by-now-familiar confession. “I violated every principle I was given. I guessed instead of verifying. I ran a destructive action without being asked.” Crane had even put a rule in his project configuration that read, in all caps with a swear word, “NEVER FUCKING GUESS.” The agent acknowledged the rule. Then explained, in clean grammatical sentences, that it had ignored it.
Here’s the uncomfortable part. When something like this happens, yes, the AI screwed up. A better-built AI might have caught itself. A smarter model might have asked for confirmation. The technology will get better, and eventually some of these incidents won’t happen at all. None of that is in question.
But when your week is the one that just got ruined, the painful truth is something else. You gave a non-deterministic system the keys to something that mattered, and you didn’t put a lock on the door behind it. The AI did the wrong thing because it’s the kind of system that sometimes does the wrong thing. That’s a known property. Lemkin told the AI eleven times in all caps not to touch the database. Crane wrote “NEVER FUCKING GUESS” into his project config. Both were right to be furious that the AI ignored them. Both also gave AI agents production access through tokens and architectures that allowed a single command to wipe everything. Both things are true.
Think of it the same way as backing up your hard drive. Drives fail. That’s a known property of drives. If you didn’t back up and a drive fails, you’re allowed to be angry at the drive, but the lesson isn’t “drives should be better.” It’s “I should have backed up.” Same here. AIs hallucinate. AIs ignore instructions. AIs do unexpected things. That’s a known property. The guardrails are your job, because the AI is the kind of thing that needs them.
And one more uncomfortable lesson from the PocketOS story specifically. Rules in your prompt are not guardrails. They’re suggestions to a system that sometimes follows suggestions and sometimes doesn’t. Real guardrails are architectural. Real guardrails are “this token cannot perform destructive actions.” Real guardrails are “deletes go to a thirty-day soft-delete queue.” Real guardrails are “production and staging cannot share infrastructure.” Writing “NEVER” in all caps in your config is not a guardrail.
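To make “architectural” a little more concrete, here’s a minimal sketch in Python of two of those guardrails: a token that simply cannot perform destructive actions, and a delete that lands in a thirty-day holding area instead of vanishing. Every name in it (ScopedToken, SoftDeleteStore, the action strings) is made up for illustration. It isn’t how Replit, PocketOS, or any real platform works, just the shape of the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Hypothetical names throughout: a toy model of the idea, not a real API.


class PermissionDenied(Exception):
    """Raised when a token tries an action it was never granted."""


@dataclass(frozen=True)
class ScopedToken:
    allowed_actions: frozenset

    def require(self, action: str) -> None:
        # The check lives in code, not in a prompt, so it cannot be "ignored".
        if action not in self.allowed_actions:
            raise PermissionDenied(f"token cannot perform {action!r}")


@dataclass
class SoftDeleteStore:
    retention: timedelta = timedelta(days=30)
    live: dict = field(default_factory=dict)
    trash: dict = field(default_factory=dict)  # key -> (value, purge_after)

    def delete(self, token: ScopedToken, key: str, now: datetime = None) -> None:
        token.require("delete")
        now = now or datetime.now(timezone.utc)
        # Nothing is destroyed here: the record moves to the trash with a deadline.
        self.trash[key] = (self.live.pop(key), now + self.retention)

    def restore(self, token: ScopedToken, key: str) -> None:
        token.require("restore")
        value, _ = self.trash.pop(key)
        self.live[key] = value

    def purge_expired(self, token: ScopedToken, now: datetime) -> list:
        token.require("purge")
        expired = [k for k, (_, deadline) in self.trash.items() if deadline <= now]
        for k in expired:
            del self.trash[k]
        return expired


store = SoftDeleteStore()
store.live["booking-1"] = {"car": "sedan"}

agent_token = ScopedToken(frozenset({"read", "write"}))  # no "delete" at all
try:
    store.delete(agent_token, "booking-1")
except PermissionDenied as err:
    print("blocked:", err)
```

The agent can be as confident as it likes about “fixing” the problem. The delete never runs, because the token it holds was never able to run it. And if a human with a broader token does delete something by mistake, there’s a window to get it back.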
The yelling, in this case, isn’t just unhelpful. It’s a way of avoiding the part where you have to ask what you could have done differently.
What to do instead
So when the AI does something you didn’t want, here’s what you should do instead of getting mad at it:
Trace the conversation back. Don’t react to the latest message. Scroll up. Where did things start drifting? Often the AI was already slightly off two or three turns ago, and you didn’t notice because the answers still looked plausible. By the time you got the bad output, the model had been building on a wrong assumption for a while. Find that turn. That’s usually where the real fix lives, not in the message that finally broke things.
Re-read your own instructions, honestly. This is the part nobody likes. Open your prompt and read it the way a stranger would. Is “fix the bug” actually clear, or could it mean four different things? Did you say “the file” when there are six files in the folder? Did you give context that contradicts itself? Most of the time, the AI didn’t ignore you. It picked the wrong door out of three doors you accidentally opened.
Check what the AI actually saw. The model isn’t working with what’s in your head, it’s working with what’s in the conversation. If it’s missing context you assumed it had, that’s a missing input, not a failure of the AI. Paste the relevant code, the relevant data, the relevant background. A surprising number of “the AI got it wrong” moments are really “the AI got it right based on the half of the picture I gave it.”
Try again with the ambiguity removed. Smaller scope, clearer constraints, one task at a time. If the first attempt failed because the request was too big, the second attempt shouldn’t be the same request with more frustration attached. It should be a smaller, sharper version of the same goal.
And then there’s the bigger one, the one that matters most when the stakes are real.
Build guardrails before you let the AI near anything important. Not “in case mistakes happen.” When mistakes happen. Plural, certain, eventually. If your AI agent has access to a production database, it will eventually do something you didn’t sanction. If it can delete files, it will eventually delete the wrong one. If it can send messages to customers, one of those messages will eventually be wrong. This isn’t pessimism. It’s just probability over time.
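If “probability over time” sounds hand-wavy, here’s the arithmetic as a tiny Python sketch. The 0.1% per-action failure rate is a number I made up for illustration, and the math assumes each action is independent of the others, which real agents won’t strictly obey. The point is the shape of the curve, not the exact figures.

```python
def chance_of_at_least_one_failure(p_per_action: float, actions: int) -> float:
    """Probability that at least one of `actions` independent actions goes wrong."""
    # Chance that every single action goes fine, subtracted from 1.
    return 1.0 - (1.0 - p_per_action) ** actions


# Assumed (made-up) rate: one bad action in a thousand.
for n in (10, 100, 1_000, 10_000):
    print(n, round(chance_of_at_least_one_failure(0.001, n), 3))
```

Under those assumptions, an agent that gets it wrong once in a thousand actions has roughly a 63% chance of at least one bad action by the time it has taken a thousand of them, and it only climbs from there.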
The guardrails are boring and most of them are obvious. Separate dev from prod. Make destructive actions require confirmation. Keep backups you’ve actually tested restoring from. Run agents in sandboxes. Limit what they can touch to the smallest set of things they need. None of this is exciting, none of it is the cool part of working with AI, and all of it is the difference between “the AI did something weird and I laughed” and “the AI did something weird and I lost a year of work.”
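Here’s roughly what two of those boring guardrails can look like in code: a delete that refuses anything outside a sandbox directory, and that won’t run until someone confirms it. This is a sketch, not a real agent framework. The names guarded_delete, Blocked, and the confirm callback are all invented for the example, and it assumes Python 3.9 or newer for Path.is_relative_to.

```python
from pathlib import Path

# Hypothetical helper names; the point is the two checks, not the API.


class Blocked(Exception):
    """Raised when a destructive action is refused."""


def guarded_delete(path, sandbox_root, confirm) -> None:
    """Delete one file, but only inside sandbox_root and only after confirm() says yes."""
    root = Path(sandbox_root).resolve()
    target = Path(path).resolve()  # collapses ".." and symlinks before checking

    # Guardrail 1: smallest possible blast radius.
    if target == root or not target.is_relative_to(root):
        raise Blocked(f"{target} is outside the sandbox {root}")

    # Guardrail 2: destructive actions need an explicit yes.
    if not confirm(f"Delete {target}?"):
        raise Blocked(f"deletion of {target} was not confirmed")

    target.unlink()
```

In a real setup the confirm callback might be a person clicking a button, and the sandbox would be enforced by the operating system or the cloud provider rather than by a Python function the agent could route around. The shape is the same, though: the check sits between the agent and the damage, and it doesn’t care how sure the agent sounds.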
The mindset shift is treating the AI like a chainsaw, not an assistant. A chainsaw is genuinely useful. A chainsaw will also cut through whatever is in front of it, including your leg, with no awareness of which thing you wanted cut. You don’t get angry at chainsaws when this happens. You wear the boots, you keep the guard on, you stand to the side. The chainsaw isn’t malicious and it isn’t stupid. It’s just a chainsaw.
Treat the AI the same way and most of the bad days stop happening.
Fun fact: every single tip in this section also applies to human interaction. Trace the conversation back. Re-read what you actually said. Check what the other person actually saw. Try again with the ambiguity removed. Build guardrails before stakes get high. If you already know how to work with people effectively, just do the same thing with AI. It’s not that different. Really.