Build your own guardrails
Working with Claude — CC BY 4.0
Up to now the course has handed you prompts that already work. Lesson 1.4 gave you a set of accuracy rules to paste at the top of a chat — label every claim, say “I don’t know” first, own the rules you broke. Earlier lessons gave you others. Borrowed rules are the right place to start. They’re training wheels: someone who has already made the mistakes wrote them down so you don’t have to.
This lesson is about taking the wheels off. The goal is your own governance — a short set of rules that fit your work, in your words, that you can defend because you understand why each one is there. A borrowed prompt tells Claude how to behave. Your own rules tell it how to behave for you, and let you notice when it stops.
The whole idea in one line
You are in command. Claude drafts, suggests, checks and reminds — you direct it, you review what comes back, and you own whatever goes out under your name. Everything below is just that one idea, made practical. Nobody hands you authority over your own work; you keep it by building the habits that stop it quietly slipping away.
Four rules worth making your own
You don’t need a long document. Four plain rules will carry most people a long way. Write each one in your own words — that’s the point of the exercise.
1. How you’ll verify. Decide, in advance, what you check before you trust an answer. Not “check everything” — that’s a wish, not a rule. Something you’ll actually do. For example:
Any figure, date, quote, legal point or named person, I confirm from a source myself before I use it. If I can’t confirm it, it doesn’t go in.
Claude labelling its own confidence (from 1.4) is a help, not a substitute. The label tells you where to look harder; checking the source yourself is what confirms the claim.
2. What you won’t paste. This one protects other people and yourself, and it’s the one most often skipped. Before you drop text into a chat, ask whose it is and what happens if it’s stored or seen. A workable personal rule:
I don’t paste anyone’s personal details, client information, passwords, unpublished work, or anything covered by a confidentiality agreement — unless I’ve checked it’s allowed and stripped what isn’t needed.
If your work touches information about people, and especially information about or belonging to iwi, hapū or whānau, treat it as needing a decision before it goes anywhere. This is where data sovereignty and Māori data sovereignty obligations are real, not abstract. When you’re not sure, don’t paste; ask first.
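If you are comfortable running a small script, the "stripped what isn’t needed" part of this rule can be given a rough first pass before you paste. The sketch below is only an illustration, not part of the course materials: the patterns and the `redact` function are hypothetical, and they catch only obvious email addresses and phone-like numbers. Names, addresses and context that identifies a person pass straight through, so this supports the decision about what to paste and never replaces it.

```python
import re

# Hypothetical patterns for illustration only; they catch obvious cases
# and will miss names, addresses, and anything that identifies by context.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def redact(text):
    """Replace each match with a placeholder and list what kinds were found."""
    found = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label} removed]", text)
        if count:
            found.append(label)
    return text, found


draft = "Contact Mere at mere@example.org or 021 555 0199 about the report."
cleaned, found = redact(draft)
print(cleaned)  # the name "Mere" is still there: you still have to read it
print(found)
```

Notice that the person’s name survives the pass. A script like this can flag that something sensitive was present; whether the text should go into a chat at all is still your call.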
3. Review before it’s sent. Nothing leaves for a real audience — an email, a report, a post, a document with your name on it — without you reading the final version as if a stranger wrote it. Claude is good; it is also confidently wrong sometimes, and it can’t know what you’d be embarrassed to have said. A one-line rule:
Claude’s draft is never the final version. I read the last copy end to end before it goes out, and I’m the one who presses send.
4. Stay the one deciding. Let Claude do the work; keep the judgement. Where a choice carries weight — money, someone’s rights, health, a commitment made in writing — you make the call and you can explain why. A useful tell: if you couldn’t defend a decision without saying “the AI told me to,” you haven’t actually made it yet.
Of the four — how you verify, what you won’t paste, review before send, who decides — which would you break first under a deadline?
Since you already know which one, what would make that rule harder to skip when it matters?
Where your rules live
Two of these — verification and review — are habits; they live in you, not in a settings box. The other two you can also load into Claude so it helps you keep them. Standing instructions that apply to every conversation sit in your Claude settings, usually under a profile or personal-preferences area; per-project instructions apply inside a single project. Menu names shift as the app updates, so confirm the exact path in-app or in the current help pages rather than trusting a screenshot from last year. Putting “don’t let me paste client data without flagging it” in writing means Claude can remind you — but the rule is yours whether or not the app is enforcing it.
Write yours now
Spend ten minutes. Four headings — I verify by… / I never paste… / Before I send I… / I decide… — and one plain sentence under each. Keep it short enough to remember. Revisit it after your first near-miss, because you’ll write a better rule once you’ve felt why it matters.
This is general education, not legal advice. If your work carries real legal, privacy or Tiriti obligations, get advice from someone qualified on your specific situation.
The habit this whole course has been building comes down to this — you don’t borrow trust from the tool, you keep authority over the work. Your guardrails are how you hold onto it.