Incident Response Playbook: From Detection to Recovery
A comprehensive guide to building and executing an incident response plan, with practical templates and real-world scenarios.
Why Incident Response Matters
Industry breach studies put the average time to identify a breach at 204 days, with another 73 days to contain it. Organizations with a tested incident response plan reduce breach costs by $2.66 million on average.
The Incident Response Lifecycle
| Phase | Activities | Output |
|---|---|---|
| 1. Preparation | Build team, create policies, deploy tools, train staff | IR plan, playbooks, contact lists |
| 2. Detection & Analysis | Monitor alerts, validate incidents, determine scope | Incident classification, timeline |
| 3. Containment | Isolate systems, preserve evidence, limit damage | Contained threat, forensic images |
| 4. Eradication | Remove malware, close vulnerabilities, reset credentials | Clean systems, patched gaps |
| 5. Recovery | Restore systems, verify functionality, monitor closely | Business operations restored |
| 6. Post-Incident | Document lessons, update procedures, improve defenses | Updated IR plan, metrics |
Note: This is a continuous cycle; lessons learned feed back into preparation.
Phase 1: Preparation
Build Your Team
Incident Response Team Structure:
Core Team:
- IR Manager/Lead
- Security Analysts (Tier 1-3)
- Forensic Investigators
- Threat Intelligence Analyst
Extended Team:
- IT Operations
- Network Engineering
- Legal Counsel
- Communications/PR
- Human Resources
- Executive Sponsor
External Resources:
- IR Retainer (DFIR firm)
- Cyber Insurance Provider
- Law Enforcement Contacts
- Regulatory Contacts
Essential Documentation
## IR Documentation Checklist
### Policies & Procedures
- [ ] Incident Response Plan
- [ ] Incident Classification Matrix
- [ ] Escalation Procedures
- [ ] Communication Templates
- [ ] Evidence Handling Procedures
### Technical Documentation
- [ ] Network Diagrams
- [ ] Asset Inventory
- [ ] Critical System List
- [ ] Backup Procedures
- [ ] Recovery Procedures
### Contact Information
- [ ] On-call Rotation Schedule
- [ ] Escalation Contact List
- [ ] Vendor Support Contacts
- [ ] Law Enforcement Contacts
- [ ] Legal/PR Contacts
Incident Classification
| Severity | Examples | Response |
|---|---|---|
| SEV 1 - Critical | Active data breach, ransomware encryption in progress, critical infrastructure compromise | All hands on deck, 15-min updates |
| SEV 2 - High | Confirmed compromise (limited scope), malware on multiple systems, privileged account compromise | Core team engaged, hourly updates |
| SEV 3 - Medium | Single system compromise, phishing with credential capture, policy violation with security impact | On-call team, daily updates |
| SEV 4 - Low | Attempted attack blocked, minor policy violation, suspicious activity requiring investigation | Normal queue, standard SLA |
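One way to keep classification consistent between people and tooling is to encode the matrix as data. The sketch below is illustrative only: the names `Severity`, `RESPONSE`, and `classify` are hypothetical, and the boolean inputs are a deliberate simplification of the examples in the table, not a complete decision procedure.

```python
from enum import IntEnum


class Severity(IntEnum):
    SEV1 = 1  # Critical
    SEV2 = 2  # High
    SEV3 = 3  # Medium
    SEV4 = 4  # Low


# Response expectations copied from the classification matrix.
RESPONSE = {
    Severity.SEV1: "All hands on deck, 15-min updates",
    Severity.SEV2: "Core team engaged, hourly updates",
    Severity.SEV3: "On-call team, daily updates",
    Severity.SEV4: "Normal queue, standard SLA",
}


def classify(confirmed_compromise: bool, systems_affected: int,
             active_data_loss: bool = False,
             privileged_account: bool = False) -> Severity:
    """Map a few triage facts to a severity (simplified, assumed rules)."""
    if active_data_loss:
        return Severity.SEV1
    if confirmed_compromise and (systems_affected > 1 or privileged_account):
        return Severity.SEV2
    if confirmed_compromise:
        return Severity.SEV3
    return Severity.SEV4
```

A single confirmed workstation compromise would come back as `Severity.SEV3`, and `RESPONSE[...]` then gives the update cadence the team has agreed to.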
Phase 2: Detection & Analysis
Initial Triage
## First 15 Minutes Checklist
### Validate the Alert
- [ ] Is this a true positive?
- [ ] What triggered the alert?
- [ ] Initial scope assessment
- [ ] Assign incident number
### Initial Data Collection
- [ ] Alert details and timeline
- [ ] Affected systems/users
- [ ] Network logs (5-min window around the alert)
- [ ] Initial IOCs
### Immediate Decisions
- [ ] Severity classification
- [ ] Escalation needed?
- [ ] Containment required?
- [ ] Evidence preservation priority
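The "Assign incident number" step is easy to automate so that two analysts never pick the same ID under pressure. A minimal sketch, assuming the `IR-YYYY-NNN` format used in the report template later in this guide; the function name `next_incident_id` is hypothetical, and a real tracker would need locking or a database sequence.

```python
import re

# Matches IDs such as IR-2025-001 (format assumed from the report template).
_ID_PATTERN = re.compile(r"^IR-(\d{4})-(\d{3,})$")


def next_incident_id(existing_ids, year):
    """Return the next sequential incident ID for the given year."""
    highest = 0
    for incident_id in existing_ids:
        match = _ID_PATTERN.match(incident_id)
        # Ignore malformed IDs and IDs from other years.
        if match and int(match.group(1)) == year:
            highest = max(highest, int(match.group(2)))
    return f"IR-{year}-{highest + 1:03d}"
```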
Investigation Queries
Windows Event Log Analysis:
# Recent successful logins (Event ID 4624)
Get-WinEvent -FilterHashtable @{LogName='Security';ID=4624} -MaxEvents 100 |
    Select-Object TimeCreated,
        @{N='User';E={$_.Properties[5].Value}},
        @{N='LogonType';E={$_.Properties[8].Value}},
        @{N='SourceIP';E={$_.Properties[18].Value}}
# Process creation events (Event ID 4688); CommandLine is populated only when
# the "Include command line in process creation events" audit policy is enabled
Get-WinEvent -FilterHashtable @{LogName='Security';ID=4688} -MaxEvents 100 |
    Select-Object TimeCreated,
        @{N='User';E={$_.Properties[1].Value}},
        @{N='Process';E={$_.Properties[5].Value}},
        @{N='CommandLine';E={$_.Properties[8].Value}}
# Service installations (Event ID 7045)
Get-WinEvent -FilterHashtable @{LogName='System';ID=7045} -MaxEvents 50
Linux Investigation:
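# Recent authentication (Debian/Ubuntu; RHEL/CentOS logs to /var/log/secure)
grep -E "Accepted|Failed" /var/log/auth.log | tail -100
# Files modified in the last 24 hours (-xdev stays on the root filesystem, skipping /proc and /sys)
find / -xdev -type f -mtime -1 -ls 2>/dev/null
# Processes with network connections (listening and established)
ss -tunap    # or: netstat -tunap on older systems
lsof -i -P -n
# Cron jobs (persistence)
cat /etc/crontab
ls -la /etc/cron.d/
crontab -l    # current user only; as root, use "crontab -l -u <user>" for others
# Recently installed packages
rpm -qa --last | head -20    # RHEL/CentOS
grep " install " /var/log/dpkg.log | tail -20    # Debian/Ubuntu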
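The authentication grep above returns raw lines; during triage it often helps to roll them up per source IP. The sketch below assumes the common OpenSSH log wording ("Failed password for ... from ...", "Accepted publickey for ... from ..."); the function names and the `min_failures` threshold are hypothetical choices, not a standard.

```python
import re
from collections import Counter

# Assumed OpenSSH message shape: "<Accepted|Failed> <method> for [invalid user ]<user> from <ip>"
_SSH_EVENT = re.compile(
    r"(Accepted|Failed) \S+ for (?:invalid user )?(\S+) from (\S+)"
)


def summarize_auth(lines):
    """Count failed and accepted SSH logins per source IP."""
    failed, accepted = Counter(), Counter()
    for line in lines:
        match = _SSH_EVENT.search(line)
        if not match:
            continue
        outcome, _user, source_ip = match.groups()
        (accepted if outcome == "Accepted" else failed)[source_ip] += 1
    return failed, accepted


def suspicious_sources(lines, min_failures=5):
    """Source IPs with many failures AND at least one success."""
    failed, accepted = summarize_auth(lines)
    return sorted(ip for ip, count in failed.items()
                  if count >= min_failures and accepted[ip] > 0)
```

A source that fails repeatedly and then succeeds is worth a closer look, since that pattern is consistent with a brute-force attempt that eventually landed.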
Network Traffic Analysis:
# Capture traffic for analysis (adjust the interface name for your host)
tcpdump -i eth0 -w capture.pcap -c 10000
# Most frequent source/destination/port tuples (a starting point for spotting beaconing)
tshark -r capture.pcap -Y "tcp" -T fields -e ip.src -e ip.dst -e tcp.dstport |
    sort | uniq -c | sort -rn | head -20
# Most frequently queried DNS names
tshark -r capture.pcap -Y "dns.flags.response == 0" -T fields -e dns.qry.name |
    sort | uniq -c | sort -rn
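Connection counts alone only show who talks the most. Beaconing is better characterized by regular timing, so a follow-up check can look at the gaps between connections for each source/destination/port tuple. This is a rough sketch under stated assumptions: the input is a list of `(epoch_seconds, src, dst, port)` tuples (for example exported with tshark's `frame.time_epoch` field), and the `min_events` and `max_cv` thresholds are arbitrary starting values. Real implants often add jitter, so treat hits as leads, not verdicts.

```python
from collections import defaultdict
from statistics import mean, pstdev


def beacon_candidates(flows, min_events=5, max_cv=0.1):
    """Return (src, dst, port, mean_interval) for suspiciously regular flows.

    A low coefficient of variation (stdev / mean) of the gaps between
    connections means they recur at a near-constant interval.
    """
    by_tuple = defaultdict(list)
    for ts, src, dst, port in flows:
        by_tuple[(src, dst, port)].append(float(ts))

    results = []
    for (src, dst, port), times in by_tuple.items():
        if len(times) < min_events:
            continue  # too few connections to judge regularity
        times.sort()
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        average = mean(gaps)
        if average > 0 and pstdev(gaps) / average <= max_cv:
            results.append((src, dst, port, average))
    return sorted(results)
```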
Timeline Building
Example Timeline:
| Date/Time (UTC) | Source | Event Description |
|---|---|---|
| 2025-01-08 14:23:15 | Email GW | Phishing email received |
| 2025-01-08 14:25:42 | Proxy | User clicked malicious URL |
| 2025-01-08 14:25:47 | EDR | Malware download blocked |
| 2025-01-08 14:26:01 | EDR | Second attempt successful |
| 2025-01-08 14:26:15 | EDR | Process injection detected |
| 2025-01-08 14:30:00 | DC Logs | Lateral movement attempt |
| 2025-01-08 14:32:00 | SIEM | Alert generated |
| 2025-01-08 14:35:00 | SOC | Incident declared |
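Events like these usually arrive from separate tools in no particular order. A small sketch of merging them into one UTC-sorted timeline follows; the tuple layout `(timestamp, source, description)` mirrors the table columns, while the function names `parse_utc` and `build_timeline` are hypothetical.

```python
from datetime import datetime, timezone


def parse_utc(stamp):
    """Parse 'YYYY-MM-DD HH:MM:SS' as a timezone-aware UTC datetime."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def build_timeline(*sources):
    """Merge per-tool event lists into a single list sorted by time."""
    events = [
        (parse_utc(stamp), source, description)
        for source_events in sources
        for stamp, source, description in source_events
    ]
    return sorted(events, key=lambda event: event[0])
```

With the example above, subtracting the first timestamp from the last (`timeline[-1][0] - timeline[0][0]`) gives 11 minutes 45 seconds from phishing email to incident declaration.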
Phase 3: Containment
Short-term Containment
## Immediate Containment Actions
### Network Isolation
- [ ] Isolate affected systems (network ACLs/VLANs)
- [ ] Block malicious IPs at firewall
- [ ] Sinkhole malicious domains
- [ ] Disable compromised accounts
### Evidence Preservation (BEFORE imaging)
- [ ] Capture volatile data (memory, connections)
- [ ] Screenshot active sessions
- [ ] Document running processes
- [ ] Note network connections
Memory Acquisition:
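# Linux (legacy kernels only): most modern kernels restrict /dev/mem
# (CONFIG_STRICT_DEVMEM), so this will not produce a complete image
sudo dd if=/dev/mem of=/mnt/forensics/memory.raw bs=1M
# Linux (preferred): LiME kernel module built for the running kernel
sudo insmod lime.ko "path=/mnt/forensics/memory.lime format=lime"
# Windows (with winpmem, from an elevated prompt)
winpmem_mini_x64.exe memory.raw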
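Whichever tool produces the image, hashing it immediately gives you a baseline for chain of custody: any later copy can be verified against the recorded digest. A minimal sketch, assuming SHA-256 is acceptable for your evidence procedures; the record fields and the names `sha256_file` and `custody_record` are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_record(path, collected_by):
    """Build a simple evidence record for one acquired file."""
    evidence = Path(path)
    return {
        "file": evidence.name,
        "size_bytes": evidence.stat().st_size,
        "sha256": sha256_file(evidence),
        "collected_by": collected_by,
        "recorded_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
```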
Long-term Containment
## Sustained Containment
### System Hardening
- [ ] Patch exploited vulnerability
- [ ] Reset compromised credentials
- [ ] Implement additional monitoring
- [ ] Block newly discovered IOCs
### Business Continuity
- [ ] Activate backup systems if needed
- [ ] Communicate with affected users
- [ ] Coordinate with business units
Phase 4: Eradication
Malware Removal
## Eradication Checklist
### Identify All Affected Systems
- [ ] Scan all systems with IOCs
- [ ] Review EDR detections
- [ ] Check for persistence mechanisms
- [ ] Identify patient zero
### Remove Threats
- [ ] Delete malicious files
- [ ] Remove persistence (scheduled tasks, services, registry)
- [ ] Remove malicious accounts
- [ ] Revoke compromised certificates/keys
### Verify Clean State
- [ ] Rescan with updated signatures
- [ ] Verify persistence removal
- [ ] Confirm no ongoing C2 communication
Common Persistence Locations:
Windows Persistence:
├── Registry Run Keys
│ └── HKLM/HKCU\Software\Microsoft\Windows\CurrentVersion\Run
├── Scheduled Tasks
│ └── C:\Windows\System32\Tasks\
├── Services
│ └── HKLM\System\CurrentControlSet\Services
├── Startup Folders
│ └── %AppData%\Microsoft\Windows\Start Menu\Programs\Startup
└── WMI Subscriptions
Linux Persistence:
├── Cron Jobs
│ └── /etc/crontab, /etc/cron.d/, user crontabs
├── Systemd Services
│ └── /etc/systemd/system/
├── Init Scripts
│ └── /etc/init.d/
├── SSH Keys
│ └── ~/.ssh/authorized_keys
└── Shell Profiles
    └── ~/.bashrc, ~/.profile, /etc/profile.d/
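For the Linux locations above, a quick first pass is to list anything that changed recently. The sketch below is one possible approach, not a replacement for EDR or a full integrity check: `LINUX_PERSISTENCE_PATHS` covers only the system-wide paths from the tree (per-user files such as `~/.ssh/authorized_keys` would need to be added per home directory), and the seven-day default is an arbitrary assumption. Attackers can also forge modification times, so an empty result proves nothing.

```python
import time
from pathlib import Path

# System-wide locations from the persistence tree; extend per environment.
LINUX_PERSISTENCE_PATHS = [
    "/etc/crontab",
    "/etc/cron.d",
    "/etc/systemd/system",
    "/etc/init.d",
    "/etc/profile.d",
]


def recently_modified(paths, max_age_days=7, now=None):
    """Return files under the given paths modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    hits = []
    for entry in map(Path, paths):
        if entry.is_file():
            candidates = [entry]
        elif entry.is_dir():
            candidates = [p for p in entry.rglob("*") if p.is_file()]
        else:
            continue  # path does not exist on this system
        for candidate in candidates:
            try:
                if candidate.stat().st_mtime >= cutoff:
                    hits.append(str(candidate))
            except OSError:
                continue  # unreadable or removed mid-scan
    return sorted(hits)
```

Calling `recently_modified(LINUX_PERSISTENCE_PATHS)` as root on a suspect host gives a short list to review by hand.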
Phase 5: Recovery
Recovery Plan
## Recovery Procedures
### System Restoration
- [ ] Rebuild from known-good images (preferred)
- [ ] Restore from clean backups
- [ ] Reinstall and reconfigure (if no backup)
- [ ] Apply all patches before reconnecting
### Validation
- [ ] Vulnerability scan restored systems
- [ ] Verify business functionality
- [ ] Confirm security controls active
- [ ] Test backup/recovery procedures
### Reconnection
- [ ] Gradual reconnection to network
- [ ] Enhanced monitoring during transition
- [ ] User communication and testing
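"Restore from clean backups" depends on knowing which backups predate the intrusion. A minimal sketch of that selection, assuming you have backup timestamps and an estimate of the earliest compromise from your timeline; `latest_clean_backup` is a hypothetical helper. A backup taken before the earliest known compromise is only a candidate, and still needs scanning before it is trusted.

```python
from datetime import datetime


def latest_clean_backup(backups, earliest_compromise):
    """Pick the newest backup taken strictly before the compromise.

    backups: iterable of (label, taken_at) pairs with datetime values.
    Returns the chosen (label, taken_at) pair, or None if none qualify.
    """
    candidates = [b for b in backups if b[1] < earliest_compromise]
    return max(candidates, key=lambda b: b[1], default=None)
```

If the function returns `None`, every available backup overlaps the intrusion window and the rebuild-from-image path becomes the only option.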
Recovery Priority
| Priority | Systems | RTO Target |
|---|---|---|
| P1 | Domain Controllers, Core Network Infra, Security Systems | 4 hours |
| P2 | Email/Communication, Critical Applications, Database Servers | 8 hours |
| P3 | Business Applications, File Servers | 24 hours |
| P4 | End User Devices, Non-critical Systems | 48-72 hours |
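Turning the table into concrete deadlines makes it easier to track recovery against the clock. The sketch below assumes the RTO clock starts when recovery begins and uses the upper bound (72 hours) for P4; `RTO_HOURS` and `recovery_schedule` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# RTO targets from the recovery priority table (P4 uses the 72-hour upper bound).
RTO_HOURS = {"P1": 4, "P2": 8, "P3": 24, "P4": 72}


def recovery_schedule(systems, recovery_start):
    """Return (priority, name, deadline) tuples ordered by priority.

    systems: iterable of (name, priority) pairs, priority in RTO_HOURS.
    """
    plan = [
        (priority, name, recovery_start + timedelta(hours=RTO_HOURS[priority]))
        for name, priority in systems
    ]
    return sorted(plan)
```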
Phase 6: Post-Incident
Lessons Learned Meeting
## Post-Incident Review Agenda
### Timeline Review (30 min)
- Walk through incident timeline
- Identify key decision points
- Note what information was available when
### What Went Well (15 min)
- Effective detection
- Good team coordination
- Successful containment
### What Needs Improvement (30 min)
- Detection gaps
- Process bottlenecks
- Communication issues
- Tool/capability gaps
### Action Items (15 min)
- Assign owners
- Set deadlines
- Define success criteria
Incident Report Template
# Incident Report: [IR-2025-001]
## Executive Summary
Brief 2-3 paragraph overview for leadership.
## Incident Details
- **Incident ID:** IR-2025-001
- **Date Detected:** 2025-01-08 14:32 UTC
- **Date Contained:** 2025-01-08 16:45 UTC
- **Date Resolved:** 2025-01-09 09:00 UTC
- **Severity:** SEV 2 - High
- **Classification:** Malware/Ransomware
## Impact Assessment
- Systems affected: 15 workstations, 2 servers
- Data affected: No confirmed exfiltration
- Business impact: 4 hours downtime
- Financial impact: ~$50,000 (estimated)
## Root Cause Analysis
[Detailed technical analysis]
## Timeline of Events
[Detailed timeline]
## Response Actions
[What was done]
## Recommendations
[Improvements to prevent recurrence]
## Appendices
- IOCs
- Affected asset list
- Evidence inventory
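The three dates in the Incident Details section are enough to compute basic response metrics for trend tracking. A minimal sketch, assuming the timestamp format shown in the template (`YYYY-MM-DD HH:MM UTC`); `response_metrics` and its output keys are hypothetical.

```python
from datetime import datetime


def response_metrics(detected, contained, resolved,
                     fmt="%Y-%m-%d %H:%M UTC"):
    """Return minutes from detection to containment and to resolution."""
    d, c, r = (datetime.strptime(value, fmt)
               for value in (detected, contained, resolved))
    if not d <= c <= r:
        raise ValueError("expected detected <= contained <= resolved")
    return {
        "time_to_contain_min": int((c - d).total_seconds() // 60),
        "time_to_resolve_min": int((r - d).total_seconds() // 60),
    }
```

For the sample values in the template, this yields 133 minutes to contain and 1108 minutes to resolve.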
Tabletop Exercise Template
Scenario: Ransomware Attack
## Tabletop Exercise: Ransomware Scenario
### Inject 1 (T+0 min)
"It's Monday 2:00 AM. Your SIEM alerts on unusual
PowerShell activity on multiple workstations.
EDR shows attempted disabling of security tools."
Discussion Questions:
- Who gets notified?
- What's our first action?
- Do we have after-hours coverage?
### Inject 2 (T+15 min)
"Investigation reveals 50+ systems showing encryption
activity. Ransom notes appearing. Domain admin
credentials may be compromised."
Discussion Questions:
- Do we isolate the network?
- Who authorizes shutdown of systems?
- How do we communicate internally?
### Inject 3 (T+30 min)
"Attackers claim to have exfiltrated data. They're
demanding $2M in Bitcoin. Media is calling."
Discussion Questions:
- Do we engage with attackers?
- What's our disclosure obligation?
- Who handles media?
### Inject 4 (T+45 min)
"Backups from the past 30 days are encrypted.
Last clean backup is 45 days old."
Discussion Questions:
- What's our recovery strategy?
- How long can the business operate?
- Do we consider paying?
References
- NIST SP 800-61 Rev. 2: Computer Security Incident Handling Guide
- SANS Incident Handler’s Handbook
- CISA Incident Response Playbooks
The best incident response is the one you’ve practiced. Train like it’s real, so when it’s real, it feels like training.