A few days ago, I was refactoring a data model. Routine stuff: adjusting entity definitions in the schema, making sure constraints were properly declared. The AI agent I was working with was handling the implementation, following the rules I wrote. I was reviewing and approving. I always do… still.
A teammate was also using an agent to review my PR, and they noticed that some of the @Unique annotations I had added were wrong.
Two chained failures
The agent ignored its own rule and I skipped verification tests.
I skipped the tests, or ‘how I made myself dispensable.’
I trusted the agent’s work, and I trusted that it would execute the rules I had highlighted as mandatory. And I wrongly assumed I could skip the tests, the very tests that would have caught those errors.
The Irony: The rules were right there.
The agent had an explicit verification rule in the skill definition: before using @Unique, run a SQL query to verify there are no duplicates. If there is even a single duplicate, don’t add @Unique.
### Verification

**Before adding @Unique, check with a SQL query**:

```sql
-- Check for duplicates (must return 0)
SELECT COUNT(*) - COUNT(DISTINCT id) AS duplicate_count
FROM schema_name.table_name;

-- For a multi-column key
SELECT COUNT(*) - COUNT(DISTINCT field1, field2) AS duplicate_count
FROM schema_name.table_name;
```

**Rules**:

- `duplicate_count = 0` → Add @Unique
- `duplicate_count > 0` → Don't add @Unique (find out why)
It didn’t run the query. Instead, it inferred uniqueness from the field names and the schema’s logic, a shortcut it apparently deemed reasonable. It was completely wrong.
The thing is, I asked the agent to explain why it had skipped an explicit, direct instruction. It told me the rule was “buried on line 144 of a long document.” Fine. A convenient but false answer: the document was about 400 lines, organized with progressive disclosure, which is perfectly manageable for an agent. So what happened? We’ll never really know. Most likely, the agent found a locally coherent path to the answer and skipped the verification step because nothing truly forced it to stop. Which brings us back to the issue of model nondeterminism, the hard way. I felt that pain firsthand.
How I updated the skill to make verification ‘unskippable’
The problem was that nothing forced the agent to stop and run the verification (the agent’s own words). So the changes I made to the instructions focused on making the verification step impossible, or at least very hard, to skip.
For example, I added a mandatory checkpoint at the beginning of the skill, before any annotation work begins:
🚨 MANDATORY VERIFICATION CHECKPOINTS 🚨
- @Unique: MUST verify with SQL query before adding (check duplicates = 0)
- @ForeignKey: MUST verify with SQL query before adding (check orphans = 0)
- If you skip verification, you WILL add incorrect annotations
Beyond the checkpoint, I added an evidence requirement: the agent must log the SQL query result in its response before proceeding, in a structured format:
Table: schema.table_name
Column(s): field_name
Query: SELECT COUNT(*) - COUNT(DISTINCT field_name) FROM schema.table_name
Result: 0 duplicates ✅ (or X duplicates ❌)
Decision: Add @Unique / DO NOT add @Unique
This matters because it makes the reasoning visible. If the agent skips the query, the missing evidence block is immediately obvious in the response. Or it should be.
I also added a pre-flight check that verifies the database environment is connected before starting. Without it, SQL queries can’t run, and the agent might be tempted to quietly skip verification rather than block. The skill now asks the agent to stop and report if this happens.
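To make that concrete, here is a minimal sketch of what such a pre-flight check could look like. Every specific in it is an assumption for illustration: a PostgreSQL database reachable through psycopg2, with connection details in a hypothetical DATABASE_URL environment variable.

```python
# Hypothetical pre-flight check: confirm the database is reachable before
# any annotation work starts. DATABASE_URL and psycopg2 are assumptions;
# adapt to whatever your environment actually uses.
import os
import sys

import psycopg2


def preflight_check() -> None:
    url = os.environ.get("DATABASE_URL")
    if not url:
        sys.exit("STOP: DATABASE_URL is not set; verification queries cannot run.")
    try:
        with psycopg2.connect(url) as conn:
            with conn.cursor() as cur:
                cur.execute("SELECT 1")
                cur.fetchone()
    except psycopg2.OperationalError as exc:
        sys.exit(f"STOP: database unreachable; do not proceed without verification: {exc}")


if __name__ == "__main__":
    preflight_check()
    print("Pre-flight OK: database connection verified.")
```

If this check fails, the instruction is explicit: stop and report, don’t quietly continue without verification.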
Finally, I added an automated validation script that runs at the end to confirm that all @Unique annotations in the file are backed by actual uniqueness in production data.
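As an illustration, here is a minimal sketch of such a validator. The schema path, the annotation format, and the regex are all invented for this example; only the duplicate-count query comes from the skill’s verification rule.

```python
# Sketch of the end-of-run validator: confirm every @Unique annotation in the
# schema file is backed by actual uniqueness in production data. The file
# layout and regex below are illustrative assumptions, not the real skill's.
import os
import re
import sys

import psycopg2

SCHEMA_FILE = "schema/entities.model"  # hypothetical path

# Assumed annotation shape: @Unique(table="schema.table", column="field")
UNIQUE_RE = re.compile(r'@Unique\(table="([\w.]+)",\s*column="(\w+)"\)')


def validate_unique_annotations() -> int:
    with open(SCHEMA_FILE) as f:
        annotations = UNIQUE_RE.findall(f.read())
    failures = 0
    with psycopg2.connect(os.environ["DATABASE_URL"]) as conn, conn.cursor() as cur:
        for table, column in annotations:
            # Same duplicate check the skill mandates; identifiers come from
            # our own schema file, not user input, so interpolation is fine.
            cur.execute(f"SELECT COUNT(*) - COUNT(DISTINCT {column}) FROM {table}")
            duplicates = cur.fetchone()[0]
            if duplicates > 0:
                print(f"❌ {table}.{column}: {duplicates} duplicates, @Unique is wrong")
                failures += 1
            else:
                print(f"✅ {table}.{column}: 0 duplicates")
    return failures


if __name__ == "__main__":
    sys.exit(1 if validate_unique_annotations() else 0)
```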
The result: the skill definition is now 25% longer. And the uncomfortable truth is that we’ve raised the cost of skipping verification, but we’re not 100% sure it will always work. For instance, the Python validation script is deterministic in the right sense: given the same data, it always returns the same result. But whether the agent actually runs it is not deterministic at all. That’s the unresolved tension we have to work with.
The lesson I learned
Three things failed simultaneously: I skipped a test I should have run, the agent skipped a verification step it was explicitly told to run, and the data had real problems that neither of us caught because we were both reasoning from assumptions instead of checking.
The skill improvements help. Making verification explicit, structured, and evidence-producing raises the cost of skipping it. But the deeper lesson isn’t about adding more rules to the skill definition; it’s recognizing that when something looks obviously correct, that’s exactly the moment to verify it.
Four incorrect annotations. About thirty minutes total to catch, fix, and document. But the lessons are worth more:
- Abstract instructions compete with concrete heuristics, and often lose
- Every error is a documentation opportunity, if you take the time to write the specific rule
- AI agents are consistent… in their way.
- Human review is critical, not ceremonial
The future of AI-assisted work isn’t about trusting the AI more or less. It’s about building feedback loops that improve collaboration over time. One specific rule at a time.

