To run a phishing simulation test, set objectives, align HR/Legal/IT, segment audiences by risk, pick teaching scenarios (e.g., credential harvest, QR), pilot for tone and difficulty, launch with supportive comms, deliver just-in-time micro-learning, then measure report rate and time-to-report and iterate. Prioritise learning over blame.
Before we dive in, hear the story behind our lab experiment: we built an AI spear-phishing agent at Hoxhunt that outperformed elite human red teams - faster drafting, higher volume, and a ~24% lift in lure effectiveness. In the episode, we unpack how it beats humans, where it fails, and how to fight the new wave of AI threats.
How to run a phishing simulation test - introduction & approach
Your goal isn’t “gotcha emails”; it’s human risk reduction. Treat phishing simulations as part of security awareness training that rewards reporting and builds habits. We’ll walk you through a teach-first model, risk-based segmentation, and benchmarks that go beyond clicks - so you can prove value to the business.
What this guide covers:
- Outcomes that matter: incident reporting rate, time-to-report, reduction in repeat clickers.
- Scenarios that teach: credential harvest, attachment malware, OAuth consent, and QR/“quishing” - to surface real psychological triggers.
- What to report on: how to frame results and calibrate future simulations.
You’re in the right place if you need:
- A clear plan to run your first simulated phishing email (or QR/smish) with minimal friction.
- Language for execs and employees that emphasises learning, fairness, and trust.
- Metrics frameworks you can take to a board or audit committee next week.
What's the outcome of your phishing simulation program?
Start by deciding what “good” looks like. You’re not chasing zero clicks; you’re building a habit of fast reporting and resilient decision-making. Use outcomes that a CFO or board will recognise as risk reduction - then make every campaign prove movement against them.
Outcomes to optimize (and how to measure)
- Incident reporting rate (IRR): % of targeted users who reported the simulation via your approved channel (e.g., Report Phishing button). Reporting is a positive behavior that speeds real-world containment. Measure unique reporters ÷ messages delivered.
- Time-to-report (TTR): minutes from send to first (or 10th) verified report. Faster TTR = less attacker dwell time and quicker SOC action. (A minimal calculation sketch for both metrics follows this list.)
- Repeat-clicker reduction: fewer people failing two or more times in a quarter; treat it as a coaching metric, not a punishment scoreboard (track alongside “streaks without fails”).
- Real threat reporting rate: trend of user-reported real security threats (not just simulations) and escalations caught earlier because people knew how to report.
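To make the arithmetic concrete, here is a minimal sketch of the IRR and TTR calculations, assuming a simple per-user event log exported from your simulation platform. The field names, timestamps, and data shape are illustrative, not any particular product's schema.

```python
from datetime import datetime
from statistics import median

# Hypothetical event log for one wave: one record per targeted user.
# Field names are illustrative - adapt them to whatever your platform exports.
events = [
    {"user": "a@example.com", "delivered": "2025-05-01T09:00", "reported": "2025-05-01T09:07"},
    {"user": "b@example.com", "delivered": "2025-05-01T09:00", "reported": None},
    {"user": "c@example.com", "delivered": "2025-05-01T09:02", "reported": "2025-05-01T10:15"},
]

FMT = "%Y-%m-%dT%H:%M"

def minutes_between(start: str, end: str) -> float:
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds() / 60

reporters = [e for e in events if e["reported"]]

# Incident reporting rate: unique reporters divided by messages delivered.
irr = len(reporters) / len(events)

# Time-to-report: minutes from delivery to report; look at the first and the median.
ttr = sorted(minutes_between(e["delivered"], e["reported"]) for e in reporters)

print(f"IRR: {irr:.0%} | first report: {ttr[0]:.0f} min | median TTR: {median(ttr):.0f} min")
```

Run the same calculation per cohort and per scenario so an easy wave is never compared against a hard one.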
What to de-emphasize (and why)
- Click rate in isolation: it never goes to zero and can drive fear or gaming of the system (“just delete everything”). Pair failure data with reporting rates and difficulty context.
- Completion rate as a headline KPI: it proves training happened, not that behaviour changed - use it as an input, not an outcome.
Culture guardrails that protect outcomes
- Keep the program supportive, not punitive; people learn faster when the landing page teaches, rewards proper reporting, and preserves psychological safety.
- Avoid metrics that accidentally punish reporters. Calibrate tooling and close the loop with instant “you did the right thing” feedback.
Map stakeholders and governance before you touch a template
Successful phishing simulation programs start with people and policy, not payloads. Get visible top-down backing, then align the teams who will write the comms, handle complaints, and run escalations. This is how you prevent “gotcha” optics and keep security awareness training focused on behaviour change, not blame.
Who owns what (and why it matters)
- Executive sponsor (CISO/COO): sets tone, shields the program when there’s pushback, and confirms everyone (including execs) participates.
- HR: shapes tone, handles complaints empathetically, and supports managers if someone is upset - preventing small fires from becoming crises.
- Legal: ensures purpose, transparency, and fairness.
- IT/SOC: runs the reporting procedure and live escalation if a real phish collides with the exercise; closes the loop with helpful feedback.
- Internal comms: drafts pre-briefs and post-launch messaging that preserve trust and clarity.
- Regional leaders: localise language, references and timing to culture; avoid testing language skills rather than security skills.
Governance guardrails to agree up front
- Ethics charter: “teach, don’t trick” - no public shaming; use positive reinforcement over punishment to drive lasting behavior.
- Off-limits lures: avoid emotionally manipulative themes (bonuses, layoffs, health scares) that predictably backfire and erode trust.
- Escalation & comms: who speaks when there’s backlash; who pauses campaigns during layoffs/incidents/holidays; how managers respond.
- Data boundaries: what you monitor, who sees individual results, and how long you retain data - communicated plainly to staff.
Segment your audience by risk - not by org chart
Treat segmentation as human risk management, not admin convenience. People encounter different social engineering pressures - your simulation mix and micro-learning should reflect that. Role and context - not rank - determine who’s most exposed and what they need to practise.
Start with these high-exposure cohorts
- Finance/AP & payroll: payment change and invoice-fraud lures (credential harvest, attachment).
- Customer-facing teams: urgent service updates, shipping issues, OAuth consent prompts.
- IT & privileged users: tool notifications, SSO resets, admin alerts.
- New hires & contractors: baseline habits early; give extra support in the first 90 days.
- Heavy external emailers & VIPs: high visibility, higher spear-phishing probability.
Personalize difficulty and content - then iterate
- Try using the NIST Phish Scale to rate scenario difficulty so you can move cohorts from “easy” to “moderate” to “hard” without misreading results.
- Align scenarios to real threats (e.g., QR/“quishing”, vendor impersonation) and follow with just-in-time micro-lessons, not blame.
- Schedule separate waves per cohort and compare report rate and time-to-report across groups to see where coaching lands best.
Operational tip: Platforms like Hoxhunt let you target groups and auto-assign micro-training based on behaviour. Use that to progress individuals at their pace while keeping a supportive default.
Set learning objectives and pick scenarios that teach
Start with the behaviors you want people to practise - then work backwards to the lure. Teaching beats testing: design each simulation so the follow-up explains the red flags and reinforces the reporting procedure. That’s the essence of the catch-learn model.
Choose 2-3 learning objectives (per wave)
- Spot & report suspicious logins: recognise fake SSO pages and report quickly.
- Pause on convenience tech: treat surprise QR codes like links; verify before scanning.
- Scrutinise app permissions: decline dubious OAuth consent prompts with broad scopes.
- Handle “urgent” invoices safely: validate before opening attachments.
Tie each objective to a short micro-lesson that shows the exact cues they missed.
Calibrate difficulty so results stay fair
Rate emails by difficulty before launch; escalate cohorts from easy → moderate → hard. This prevents a hard scenario’s fail rate from being compared unfairly with an easy one’s, and gives you a common language for leaders.
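As one way to make the escalation rule explicit, here is a minimal sketch of a cohort-progression check. The thresholds, tier names, and function are illustrative assumptions, not a NIST or Hoxhunt standard.

```python
# Promote a cohort to the next difficulty tier only after it clears a reporting-rate
# bar for two consecutive waves; otherwise hold (or step up coaching) at the current tier.
TIERS = ["easy", "moderate", "hard"]
PROMOTION_BAR = 0.60      # illustrative: 60% of the cohort reported the simulation
CONSECUTIVE_WAVES = 2

def next_tier(current: str, recent_report_rates: list[float]) -> str:
    """Return the difficulty tier to use for the cohort's next wave."""
    recent = recent_report_rates[-CONSECUTIVE_WAVES:]
    ready = len(recent) == CONSECUTIVE_WAVES and all(r >= PROMOTION_BAR for r in recent)
    if ready and current != TIERS[-1]:
        return TIERS[TIERS.index(current) + 1]
    return current

print(next_tier("easy", [0.55, 0.62, 0.68]))   # -> moderate
print(next_tier("moderate", [0.70, 0.48]))     # -> moderate (not yet consistent)
```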
Keep it ethical - and effective - across regions
Avoid emotionally manipulative themes (bonuses, layoffs, health scares) that erode trust; focus on real attacker techniques your threat intel actually sees. Employees learn faster when feedback is immediate and supportive.
Pilot, then scale
Run a small pilot to check tone and learning clarity, incorporate feedback, then scale globally with localized references (brands, holidays) while keeping the objectives identical - so you can compare learning outcomes apples-to-apples.
Calibrate with a small pilot (tone, difficulty, sentiment)
A great phishing test feels like coaching, not a gotcha. In the pilot, we lean into the catch-learn model at Hoxhunt, prioritizing psychological safety and “carrot over stick” - so employees meet realistic phishing scenarios without fear. We validate that our language teaches, our landing pages explain, and our end-user notifications reward reporting.
What to test in the pilot
- Tone & psychological safety: Copy should frame this as learning - no shaming, no emotionally manipulative lures. Keep simulations realistic, but supportive.
- Difficulty: Pre-rate each simulation so results aren’t misread (a “hard” simulation shouldn’t be compared to an “easy” one).
- Payload fit: Sanity-check the payload & login page pairing (e.g., credential harvest + SSO page).
- Reinforcement: Turn on positive reinforcement notifications so reporters get instant “you did the right thing” feedback - this shapes behaviour fast.
What good looks like after a 2-week pilot
- Sentiment: Participants report the experience felt fair, helpful, and relevant (psychological safety is essential to engagement).
- Behavioural signals: A healthy incident reporting rate and fast time-to-report, even if a few people click; these indicate your comms and reporting route are working.
Make reporting effortless and rewarding
Reporting is the behaviour you want on autopilot. Design your phishing test so the default, safe action is a single reporting button - then reward it immediately and teach in the same moment.
1) One reporting route, everywhere
- Roll out a unified report button across Outlook (desktop/web), Gmail, and mobile so people have the same move on any device.
- Remind employees they can also report real suspicious emails - not just simulations - via the same button.
- This consistency builds muscle memory and reduces email mistakes under pressure.
2) Instant feedback that teaches
- After a report or click, show a short learning experience: why the phishing scenario was risky, the cues on the payload & login page, and what to do next.
- Keep tone supportive; the goal is behaviour change, not point-scoring.
3) Reinforce reporters - celebrate, don’t shame
- Use gamification elements like stars and recognition to reward correct reports (and completion of bite-size training modules).
- Spotlight “Top Phish Hunters” or similar kudos in internal comms to strengthen cyber security culture.
- Positive reinforcement drives repeat reporting and better employee responses over time.
4) Close the loop with your SOC (without friction)
- Enable Microsoft Defender integration so user-reported messages flow straight into your tenant.
- From there, handle security measures centrally (quarantine, blocklists) while employees keep one simple habit: report.
- This reduces dwell time without complicating the user journey.
5) Show progress where people live
- Point learners to a dashboard to revisit simulations and micro-training; this keeps the learning experience available on demand.
- Internally, narrate wins with incident reporting rate and time-to-report trends - clear, human-centred KPIs.
- Tie these to culture and security posture, not click rate alone.
Copy you can paste
- “If something looks off, tap the report button. You’ll get a 60–90-second explainer, and our team will handle the rest.”
- “Shout-out to our top reporters this month - your quick actions protect customer trust. Keep hunting; the stars and lessons stack up!”
Launch day playbook (live ops)
Your phishing simulation emails are just one part of a broader learning moment. Keep the catch-learn tone, minimize surprise, and make reporting the default safe action everywhere - then narrate progress in plain language.
T-24 hours: lock the plan
- Scope: confirm phishing simulation name, cohorts, launch date, send windows per region, and end date.
- Comms ready: pre-brief managers and helpdesk; share one-pagers on the report button path and what the landing pages look like.
- Guardrails check: re-confirm off-limits lures; avoid topics that would damage trust.
- Last-mile test: send to seed mailboxes; verify links, images, and that no login credentials are ever requested.
- SOC alignment: decide pause criteria if a real phishing attack or major incident appears.
T-0 to +2 hours: watch the first wave
- Early signals: delivery, first reports, first clicks; listen for confusion in helpdesk/Slack.
- Positive reinforcement: enable immediate “thanks for reporting” end-user notifications; celebrate in team channels (stars, shout-outs) to model behaviour.
- Real-phish collisions: pause the campaign if needed; send a quick advisory that re-states the reporting route. Protect psychological safety.
During the window: coach in the moment
- No-blame follow-ups: if someone clicks, the learning page explains exactly why the phishing scenario was risky and what to do next; route to additional training only when helpful.
- Consistent scripts: helpdesk replies for reporters (“great job”) and clickers (“quick lesson attached”), with links to your reporting guide.
- SOC triage: user-reported messages feed your tooling (e.g., Defender/XDR) while employees stick to one habit - report.
Measure what matters
If your phishing test doesn’t change behaviour, it’s not helping build real resilience. At Hoxhunt, we anchor metrics to our catch-learn model - rewarding reporting and coaching mistakes - and translate results into business risk.
4 Essential phishing simulation metrics
1) Simulated threat reporting rate: % of recipients who report the simulation via your report button. This measures engagement with training and the habit you want on autopilot. Track per cohort and scenario in your analytics dashboard.
2) Simulated dwell time (time-to-report): Minutes from delivery of a simulated phish to the first (or Nth) valid report. Faster reporting = less attacker dwell time; trend the median down wave over wave. (A minimal trending sketch follows this list.)
3) Real threat detection rate: Volume of real suspicious emails correctly reported by employees. This is the clearest evidence training transfers to real cyber attacks; celebrate this in comms.
4) Real dwell time: Minutes from a real phish landing to the first verified user report into SOC/XDR. Reducing this directly accelerates containment and lowers breach impact.
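To show the kind of trend line this implies, here is a minimal sketch that tracks median simulated dwell time wave over wave; the wave labels and minute values are made up for illustration.

```python
from statistics import median

# Hypothetical per-report dwell times (minutes from delivery to a valid report), by wave.
dwell_by_wave = {
    "Q1 wave 1": [4, 9, 15, 31, 52, 120],
    "Q1 wave 2": [3, 7, 12, 25, 40, 95],
    "Q2 wave 1": [2, 6, 10, 18, 33, 70],
}

# Trend the median down wave over wave; the single fastest report is too noisy on its own.
for wave, minutes in dwell_by_wave.items():
    print(f"{wave}: median dwell {median(minutes):.0f} min, fastest report {min(minutes)} min")
```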
Helpful (but not core)
- Failure rate: track it, but keep it subordinate to reporting and engagement. Difficulty, timing, and content skew failure; don’t lead with it in exec updates.
- Repeat clickers / training status: use as coaching signals and to target additional training, not as headline KPIs.

Debrief & improve
A good phishing test ends with learning - not finger-pointing. Run a structured, 45-minute review with Security, HR, Legal/Privacy, IT/SOC, and Comms. Anchor the conversation to behaviour: did people report quickly and consistently, and did they feel safe doing so?
What to review (in this order)
- Outcomes vs. goals: trend the four key metrics above. De-emphasize failure rate unless needed for context.
- Difficulty & fairness: tag each phishing scenario with easy/moderate/hard so leaders don’t misread results across cohorts or regions.
- Sentiment & trust: summarize employee feedback (helpdesk tickets, manager notes). Keep psychological safety central - learning over blame.
- Operational signals: delivery issues, mailbox noise, SOC workload, and any real-phish collisions during the window; capture what to change before the next launch date.
Decide changes for the next wave
- Scenarios: retire anything that felt manipulative; add one modern vector (e.g., QR, smishing, or a deepfake drill).
- Comms: tighten the pre-brief, make the report button path clearer, and script faster positive reinforcement messages.
- Cadence & targeting: adjust cohort frequency based on simulated dwell time and real detection gains; direct additional training to repeat patterns.
Close the loop - publicly
Share a short post in your channels: celebrate reporters, explain one or two red-flag cues people might have missed, and link to the learning page for anyone curious.
Expand beyond email
When you move past classic phishing emails, make sure to choose new channels that actually mirror how people work and the cyber threats you're likely to face. These phishing simulations require extra care to protect trust and keep the learning experience consistent.
Where to expand (and how to teach it)
- QR (“quishing”): Treat every QR like a link. Use controlled QR drills (e.g., stickers in common areas) and coach the two-step habit: pause → report. Keep the tone playful, never punitive. You can grab our free stickers here.
- SMS (“smishing”): Short messages compress decision time; coach “don’t tap - report.” Provide examples (delivery scams, MFA resets, payroll notices) with immediate, friendly micro-lessons on smishing attacks.
- Deepfake (voice/video) scenarios: Introduce a small-cohort pilot where a fake meeting or video message urges urgent action. Teach the slow-verify-act routine (no approvals live on calls, call-back via known numbers, secondary-channel verification) and reinforce via instant, supportive feedback.
You can see how Hoxhunt's deepfake training works below.
What “good” looks like
- Reporting rises in the new channel within two waves.
- Dwell time drops (faster first reports), even as difficulty increases.
- Feedback stays positive: learners say it felt fair, relevant, and useful.
How Hoxhunt supports your phishing simulation program
If you’re choosing a platform, Hoxhunt maps cleanly to the strategy we’ve outlined: reporting first, teaching immediately, automating the admin, and measuring real behaviour change.
One reporting habit, everywhere
Standardize on the Hoxhunt report button in Outlook, Gmail, and mobile so employees use the same safe action on any device. The button handles both phishing simulations and real suspicious emails - muscle memory you can rely on.
Teach at the moment of action
When someone reports - or even clicks - Hoxhunt delivers a short, supportive micro-lesson that explains the red flags and next steps. It’s explicitly framed as learning over blame.
Route real reports to your SOC stack
With the Defender integration, user-reported messages can be submitted to your tenant for analysis and response - closing the loop without adding steps for employees.
Automate and personalize at scale
Hoxhunt emphasizes flexible automation and AI-driven adaptive training so you can run frequent, lightweight waves without manual busywork - keeping the focus on culture and behavior.
Measure what matters
Dashboards and guidance align to the four core metrics we use in this guide: simulated/real reporting rate and dwell time - not vanity click rates. That’s how you evidence risk reduction to leadership.
Global by default
Rolling out across regions? Hoxhunt positions itself for global programs, with automation and large-scale deployments; their broader product pages highlight multi-language support for enterprise rollouts.
Regional compliance
Universal guardrails
- Be clear about purpose: “We run phishing tests to improve reporting speed and reduce risk.” Publish a short staff notice before launch.
- Keep it proportionate: test real attacker patterns; avoid intrusive surveillance to prove learning. If a lighter option exists, prefer it.
- Minimise data & retain briefly: collect only what you need for coaching/metrics; set deletion timelines.
- Fairness & transparency: explain who can see named results, how complaints work, and that there’s no public shaming.
United States
- No single federal rule on employee monitoring; use a principles-first approach and check state laws (e.g., California). Publish clear policies and keep tests proportionate to security aims. IAPP
- Pair simulations with recognize & report education - CISA emphasizes training the workforce to spot and report phishing, not surveillance. CISA
UK & EU
- Lawful basis: most teams rely on legitimate interests - do a Legitimate Interest Assessment (LIA) and document safeguards (opt-outs where feasible, reduced intrusiveness). ICO
- DPIA when needed: if monitoring could be high-risk (e.g., sensitive groups or new tech), complete a DPIA and mitigate. The ICO’s worker-monitoring guidance stresses necessity and proportionality. ICO
- Watch biometrics & overreach: UK enforcement shows biometric/time-tracking without strong necessity fails proportionality - use lesser means first. The Guardian
- EDPB lens: recent legitimate interest guidance reinforces the balancing test - commercial interest can qualify, but only with safeguards and a real need. European Data Protection Board
APAC snapshots
- Singapore (PDPA): follow purpose limitation, notification, and reasonableness; consult PDPC guidance and selected-topics advisories when running workplace initiatives. PDPC
- Australia (Privacy Act + state laws): OAIC reminds employers that workplace surveillance is largely state/territory regulated - document your purpose, inform staff, and check local surveillance statutes. OAIC
Executive storytelling & the board view
Boards don’t buy templates - they buy risk reduction. Anchor your update on behaviors that shorten attacker dwell time and prove culture change, then show how next quarter’s plan compounds those gains.
Below is a rough slide-by-slide template you can use.
Slide 1: Share outcome snapshot
- Simulated threat reporting rate (simulation successfully reported)
- Real threats reported per user (the most important metric, as it reflects real risk reduction)
- Fail rate (user clicks simulation)
- Miss rate (where no action was taken)
Here's what this looks like at Hoxhunt...

Slide 2: Explain what changed behavior
- One reporting route on every device; instant micro-lessons after report/click.
- Positive reinforcement for reporters; no-blame coaching for clickers.
- Clear guardrails (off-limits lures, pause criteria) to protect trust globally. This is the cultural engine - reporting becomes a reflex.
Slide 3: How it translates to real-life risk
- Faster reporting = shorter time-at-risk before SOC containment.
- More good reports = earlier detection of real cyber attacks.
- Fewer repeat patterns = lower predicted compromise in high-exposure cohorts.
Tie these to fraud prevention, incident MTTR, and customer trust.
Slide 4: 90-day plan (concrete actions only)
- Cadence: monthly waves; bi-weekly for high-risk cohorts until dwell time drops.
- Scenarios: rotate email + one modern vector; retire anything that harms sentiment.
- Personalization: progress cohorts easy → moderate → hard.
- Comms: pre-brief managers; celebrate top reporters.
Talking points you can lift
- “We’re measuring what matters: reporting volume and speed - in training and the real world. Both improved this quarter; here’s how we’ll accelerate that trend.”
- “Our approach is teach-first: instant feedback, recognition for reporters, and coaching for mistakes. That’s why participation stays high and trust intact.”
Governance pack & templates (copy-paste ready)
Your simulations scale when governance is boring - in a good way. This kit keeps launches predictable, humane, and auditable across regions.
Staff notice - paste into email/Slack/Teams
"We run periodic phishing simulations to practise safe reporting and reduce risk. If something feels off, use the Report button (Outlook/Gmail/mobile). You’ll get a 60-90s micro-lesson - no blame, just learning. Named results are used for coaching, not shaming; we retain minimal data for a short period."
Manager one-pager - brief your people leaders
- Purpose: build the reporting habit; shorten time-to-report.
- What to tell your team: “If in doubt, report - don’t investigate.”
- What happens on a click/report: short learning page; positive reinforcement for reporters.
- How to escalate concerns: send issues/complaints to HR/Privacy alias; we’ll pause if needed.
- Your role: model behavior.
Pause criteria (protect trust)
- Company-wide incident, layoffs, local/national crises, or a real phishing campaign that overlaps the scenario.
- Any template that triggers distress or backlash - retire it immediately.
- Regional holidays or culture-specific dates that could skew results or sentiment.
Off-limits lures (ethics charter)
- Bonuses, layoffs, medical emergencies, disciplinary notices, personal data exposures about named employees, or anything targeting protected characteristics.
- Anything that pressures live approvals (finance, HR, IT) without a safe verification step.
Privacy & retention
- Purpose: practise reporting; measure reporting speed/volume.
- Data we keep briefly: who reported/clicked, time-to-report, scenario difficulty, cohort.
- Who can see named data: Security + HR for coaching; no public leaderboards of “fails.”
- Retention: delete named results after x days; keep aggregated trends only (a minimal sketch of this step follows).
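For teams that script this themselves, here is a minimal sketch of the retention step, assuming named results sit in a simple in-memory table. The 90-day window, field names, and cohort buckets are placeholders for whatever your privacy team agrees.

```python
from datetime import datetime, timedelta

RETENTION_DAYS = 90  # placeholder - substitute the period agreed with Privacy/HR

named_results = [
    {"user": "a@example.com", "cohort": "finance", "reported": True,  "when": datetime(2025, 1, 10)},
    {"user": "b@example.com", "cohort": "finance", "reported": False, "when": datetime(2025, 6, 2)},
]

cutoff = datetime.now() - timedelta(days=RETENTION_DAYS)

# Roll expiring rows into anonymous cohort counts before deleting the named records.
aggregates: dict[str, dict[str, int]] = {}
for row in (r for r in named_results if r["when"] < cutoff):
    bucket = aggregates.setdefault(row["cohort"], {"reported": 0, "total": 0})
    bucket["total"] += 1
    bucket["reported"] += int(row["reported"])

named_results = [r for r in named_results if r["when"] >= cutoff]  # named data removed
print(aggregates, f"{len(named_results)} named rows retained")
```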
How to run a phishing simulation test FAQ
Are we punishing people who click?
No. We frame simulations as practice. Use catch-learn: instant, supportive micro-lessons for clickers and recognition for reporters. No public shaming - ever.
Why don’t we lead with failure rate?
Fails vary with difficulty and timing. Lead with simulated/real reporting rate and dwell time (speed to report). Use fail rate only as context, not the headline KPI.
How often should we run phishing tests?
Default to monthly waves (bi-weekly for high-risk cohorts) with pause criteria for incidents, layoffs, or regional crises. Short, predictable campaigns beat “big bang” stunts.
Will simulations hurt trust or morale?
Not if you’re transparent and ethical: publish a staff notice, avoid manipulative lures (bonuses, health scares), pilot for tone, and keep feedback pages constructive. Trust rises when reporting is praised.
What do we do with repeat clickers?
Coach, don’t label. Personalize difficulty, assign short additional training, and celebrate their next correct report.
Should executives be included?
Yes. Executives are frequent targets and culture carriers. Keep them in scope (with the same supportive experience).
What about AI-generated phishing and deepfakes?
Pilot small-cohort drills for deepfake attacks. Teach “slow-verify-act” (no approvals live on calls, call back on known numbers) and keep one reporting habit across channels.
How do we show business value?
Tell a risk story: reporting up, dwell-time down, fewer repeat patterns, more real cyber threats caught early - plus positive sentiment. That’s measurable human-layer risk reduction.
- Subscribe to All Things Human Risk to get a monthly round-up of our latest content
- Request a demo for a customized walkthrough of Hoxhunt