How to write an incident response plan that works

Why most IR plans fail

The most common failure mode isn't "we didn't have a plan." It's "we had a plan but nobody could find it, nobody had read it, and the people named in it had changed roles six months ago."

A good incident response plan is short, specific, and tested. It answers the questions people actually ask during an incident: Who decides? Who do we call? What do we preserve? What do we tell customers? When do we involve legal? When do we notify the regulator?

If your plan is a 40-page document that lives in a SharePoint folder nobody opens, it's not a plan — it's a compliance artefact. Both have value, but only one helps at 2 AM.

The five things every plan needs

Strip away everything else and an incident response plan needs these five elements to be useful:

Escalation contacts — names, phone numbers, after-hours contacts. Not role titles — actual people. Updated quarterly.
Decision authority — who can approve system isolation, password resets, external communications and regulator notification. This cannot be ambiguous.
Severity classification — a simple, fast way to determine whether this is a "drop everything" event or a "deal with it Monday" event. Three levels is enough.
Evidence checklist — what to capture before you start changing things. Logs, screenshots, email headers, cloud events, endpoint artefacts.
Communications templates — pre-drafted internal notifications, customer communications and board updates. You don't want to write these under pressure.
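
As a rough illustration of how small a three-level severity scheme can be, here is a minimal sketch in Python. The three triage questions, the level names and the middle level ("respond today") are assumptions made for the example, not a standard; swap in whatever criteria fit your organisation.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    SEV1 = "drop everything"
    SEV2 = "respond today"        # assumed middle level, not from any standard
    SEV3 = "deal with it Monday"


@dataclass
class Triage:
    # Hypothetical yes/no questions; replace with your own criteria.
    attacker_active: bool          # evidence of ongoing malicious activity?
    sensitive_data_involved: bool  # customer, financial or regulated data?
    business_disrupted: bool       # critical service down or degraded?


def classify(t: Triage) -> Severity:
    """Map three quick answers to one of three severity levels."""
    if t.attacker_active or (t.sensitive_data_involved and t.business_disrupted):
        return Severity.SEV1
    if t.sensitive_data_involved or t.business_disrupted:
        return Severity.SEV2
    return Severity.SEV3
```

The point of keeping it this small is speed: whoever picks up the alert should be able to answer the questions in under a minute without needing to consult anyone.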

Decision authority

This is where most plans are weakest. During an incident, the most damaging delays come from nobody being sure who can approve containment actions. "Should we disable the CEO's account?" is a question that needs a pre-decided answer, not a 45-minute discussion while the attacker is still active.

Define authority in advance for: system isolation, account disablement, external communications, legal engagement, insurer notification, regulator notification and customer notification. Make sure alternates are named for every role.
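
One way to keep this honest is to store the authority matrix as data and check it for gaps. The sketch below is a hypothetical Python example: the Authority type and find_gaps function are illustrative names, and the rule that the alternate must be a different person from the primary is an assumption you may want to relax.

```python
from __future__ import annotations

from dataclasses import dataclass

# The actions the plan should assign an approver for.
ACTIONS = [
    "system isolation",
    "account disablement",
    "external communications",
    "legal engagement",
    "insurer notification",
    "regulator notification",
    "customer notification",
]


@dataclass(frozen=True)
class Authority:
    primary: str    # a named person, not a role title
    alternate: str  # who decides if the primary is unreachable


def find_gaps(matrix: dict[str, Authority]) -> list[str]:
    """Return one message per action that lacks a usable approver."""
    gaps = []
    for action in ACTIONS:
        entry = matrix.get(action)
        if entry is None:
            gaps.append(f"{action}: no approver defined")
        elif not entry.primary.strip() or not entry.alternate.strip():
            gaps.append(f"{action}: primary or alternate missing")
        elif entry.primary == entry.alternate:
            gaps.append(f"{action}: alternate is the same person as primary")
    return gaps
```

An empty list from find_gaps means every listed action has two distinct named people. It says nothing about whether those people know they hold the authority, which is what exercises are for.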

Communications workflow

Separate internal communications from external communications. Assign ownership for each. Internal updates should go through a single channel (a dedicated Slack channel, a Teams group, a WhatsApp group — whatever your organisation actually uses).

External communications should be approved by a named person before release. Don't let technical staff make public statements. Don't let PR staff make technical claims. Keep factual statements separate from speculation until evidence supports them.

Evidence preservation

The single most common mistake in incident response is destroying evidence while trying to fix the problem. Before you reset passwords, reimage machines, restore from backup or delete suspicious emails, capture the evidence:

Export audit logs from identity providers, email, cloud services and endpoints
Screenshot suspicious activity, alerts, emails and configurations
Preserve endpoint artefacts — don't reimage until you've captured what you need
Record the timeline — when was it noticed, who noticed it, what actions were taken
Save email headers, not just email bodies
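
Once exports and screenshots are collected in one folder, recording a hash of each file makes it possible to show later that nothing was altered. The following is a minimal sketch using only the Python standard library; the folder layout, the manifest fields and the function names are assumptions for illustration, not a forensic standard.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large log exports do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir: Path, collected_by: str) -> dict:
    """List every file under evidence_dir with its size and SHA-256 hash."""
    files = sorted(p for p in evidence_dir.rglob("*") if p.is_file())
    return {
        "collected_by": collected_by,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
        "files": [
            {
                "path": p.relative_to(evidence_dir).as_posix(),
                "bytes": p.stat().st_size,
                "sha256": sha256_of(p),
            }
            for p in files
        ],
    }
```

Store the resulting manifest somewhere other than the evidence folder itself (for example, serialised with json.dumps and attached to the incident ticket), so that the record of what was collected survives if the folder is later modified.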

Testing and exercising

A plan that hasn't been tested is a guess. Run a tabletop exercise at least once a year, ideally twice. Keep it simple: pick a realistic scenario (ransomware, business email compromise, cloud key compromise), walk through the plan step by step, and note where it breaks down.

The goal isn't to prove the plan works. The goal is to find the gaps before a real incident does. Common gaps found in exercises: out-of-date contacts, unclear decision authority, no after-hours process, evidence not being preserved, and communications taking too long.

Keeping the plan alive

Review the plan quarterly. Not the whole document — just the contact list, decision authority and severity definitions. Review the full plan annually or after any real incident. If someone named in the plan changes role, update it the same week. A plan with wrong contacts is worse than no plan because it creates false confidence.
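
The quarterly contact review is easy to automate if each entry records when it was last verified. This is a hypothetical sketch: the Contact fields and the 92-day interval (roughly one quarter) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=92)  # roughly one quarter; adjust to taste


@dataclass
class Contact:
    name: str
    role: str
    phone: str
    last_verified: date  # when someone last confirmed this entry is correct


def stale_contacts(contacts: list, today: date) -> list:
    """Return contacts whose last verification is older than the interval."""
    return [c for c in contacts if today - c.last_verified > REVIEW_INTERVAL]
```

A check like this could run on a schedule and send the plan owner the stale entries. It only catches entries nobody has re-verified, so a role change still needs a same-week manual update.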
