CIP-015-1 INSM: A Practical Playbook

By Patrick Miller

NERC CIP-015 makes east-west visibility inside the ESP mandatory. This playbook shows how to stand up INSM the right way through risk-based data feeds, ICS-aware anomaly detection, evaluation tied to incident response, and defensible evidence on a timeline to Oct 1, 2028 and beyond. Avoid common pitfalls and design now for the likely CIP-015-2 expansion.


Overview

This is a practical guide to implementing Internal Network Security Monitoring (INSM) under CIP-015-1. If you’re responsible for CIP compliance, this is your blueprint for getting INSM from directive to daily operation.

We cut through the jargon to show how to baseline and monitor east-west traffic inside the Electronic Security Perimeter (ESP) for High Impact BES Cyber Systems (BCS) and Medium Impact BCS with External Routable Connectivity (ERC), evaluate anomalies, and retain and protect evidence in a way that stands up to audits. You’ll get a phased roadmap to the Oct 1, 2028 effective date milestone for all applicable Control Centers (and then Oct 1, 2030 for remaining Medium-ERC), plus design tips to future-proof for the upcoming CIP-015-2 expansion to EACMS/PACS and the “CIP-networked environment.”

Why now? CIP-015-1 is adopted with a long runway. The effective date seems far away until you consider the design, procurement, placement, tuning, retention, and evidence lifecycle you’ll need to stand up. Start now; it always takes longer than you think.


Where CIP-015 Came from and What It Actually Covers

FERC Order 887 told NERC to add requirements for monitoring inside the trusted zone (e.g., inside the ESP), not just at the perimeter. The goal: baseline internal network behavior and detect unauthorized activity, connections, devices, and software. INSM is the “see inside” control.

NERC opted to create a new standard, CIP-015, instead of adding INSM to CIP-005 or CIP-007. Why? CIP-005 lives at the EAPs; CIP-007 focuses on host-level controls. INSM (at least for now) is about network communications within ESPs, so a fresh standard keeps the requirements clear and makes future evolution easier.

What’s the scope for version 1 of CIP-015? INSM applies to networks protected by your ESPs for High-impact BCS and Medium-impact BCS with ERC. Plain English: don’t just look at packets crossing the EAP; look at the traffic between hosts that sit inside the ESP.


INSM in a Nutshell

INSM consists of three elements:

  1. Collection: Copy network traffic at smart points.

  2. Detection: Compare against your expected/baseline behavior to flag anomalous activity.

  3. Analysis: Evaluate anomalies and decide what to do next.

The drafting team uses “network activity” to mean the connections between devices and software you’re collecting inside the ESP. That’s exactly what you need to baseline and monitor.


The Requirements You’ll Live with Every Day

  • R1 – Process & capability: Document and implement processes to monitor ESP-protected networks (High BCS; Medium BCS with ERC). Your process must include:

    • R1.1 risk-based network data feeds,

    • R1.2 methods to detect anomalies, and

    • R1.3 methods to evaluate anomalies and determine actions.

  • R2 – Retention: Keep INSM data associated with anomalous activity at least until the evaluation and actions are complete (supports R1.3). Normal traffic isn’t required to be retained.

  • R3 – Protection: Protect collected/retained INSM data from unauthorized deletion or modification (think log integrity, immutability, segmentation, stricter auth).


A Step-by-Step INSM Playbook

Phase A — Program setup & governance

A1) Write the INSM process first (R1 “documented process(es)”).
Treat this as your control narrative that knits 1.1, 1.2, and 1.3 together, with explicit roles, tuning cadence, triage SLAs, and escalation map to CIP-008. The Technical Rationale is crystal clear that R1 expects a documented collection+analysis program, not just a box on the rack.

A2) Decide how your sensors are classified (CMEP guidance).
Regional audit staff will look at function, location, and deployment to decide whether a sensor/collector is a PCA (inside/on the ESP with routable connectivity) and/or an EACMS (performs access monitoring/control). A passive SPAN-port sensor inside/on the ESP that forwards to a collector for CIP-005/-007/-010 purposes can be assessed as EACMS—and if it’s within/on the ESP, it will also require PCA protections. Plan controls accordingly.

Tip: If the sensor’s management interface is outside the ESP and you only ingest mirrored traffic from inside, CMEP notes that may avoid certain ESP communication protections—though BCSI handling can still apply to where you store the data. Document your architecture choices.

Tip: Network TAPs and similar devices are often considered nonprogrammable electronic devices, in which case they do not add to your compliance footprint.
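
The classification questions in A2 can be written down as a small decision helper so every sensor in your inventory gets the same treatment. This is a minimal sketch, assuming hypothetical parameter names and a deliberately simplified rule set (within/on the ESP plus routable suggests PCA; access monitoring/control suggests EACMS; nonprogrammable devices get no label). It is a documentation aid, not a compliance determination.

```python
def likely_classifications(on_or_in_esp, routable, access_monitoring_or_control,
                           programmable=True):
    """Return the set of labels a sensor is likely to attract (simplified assumption)."""
    if not programmable:
        # Assumption: nonprogrammable devices (e.g., passive TAPs) carry no label.
        return set()
    labels = set()
    if on_or_in_esp and routable:
        labels.add("PCA")
    if access_monitoring_or_control:
        labels.add("EACMS")
    return labels


if __name__ == "__main__":
    # Hypothetical SPAN-fed sensor inside the ESP that supports access monitoring.
    print(sorted(likely_classifications(True, True, True)))
```

Record the inputs alongside the result; the answers to the three questions are the evidence, and the label is only the conclusion.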

A3) Articulate your risk-based rationale (R1.1).
List candidate networks and prioritize. High-value examples: EMS/DCS servers and HMIs, authentication (AD/2FA), third-party/vendor paths, PLC/RTU channels. Conversely, de-prioritize backup-only networks and encrypted links where you can monitor after decryption elsewhere (at the endpoint). Put the “why” in writing; auditors will look for it, and this rationale is what makes your approach defensible.
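
One way to keep the R1.1 rationale auditable is to store it as data rather than prose alone. The sketch below is illustrative only: the segment names, the 1 to 5 scores, and the criticality-times-exposure weighting are all assumptions, and the standard does not prescribe a scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str          # hypothetical network segment label
    criticality: int   # 1 (low) to 5 (high): impact if compromised
    exposure: int      # 1 (isolated) to 5 (many paths in)
    rationale: str     # the written "why" an auditor will ask for


def rank_segments(segments):
    """Sort segments by descending risk score (criticality * exposure)."""
    return sorted(segments, key=lambda s: s.criticality * s.exposure, reverse=True)


CANDIDATES = [
    Segment("ems-servers", 5, 4, "Hosts EMS servers and HMIs"),
    Segment("backup-only", 2, 1, "Backup traffic only; low security value"),
    Segment("vendor-access", 4, 4, "Third-party remote support path"),
    Segment("auth-services", 5, 3, "AD and multi-factor authentication"),
]

if __name__ == "__main__":
    for seg in rank_segments(CANDIDATES):
        print(f"{seg.criticality * seg.exposure:>2}  {seg.name}: {seg.rationale}")
```

The scores matter less than the rationale column: a reviewer should be able to read why each segment landed where it did.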

A4) Pick collection methods fit for purpose.

  • TAPs (full fidelity, outages to install)

  • SPAN/mirror (fast to enable; minimal loss ok)

  • NetFlow/IPFIX (great for distributed/low bandwidth; no payload)

  • RSPAN (flow-like, with payload; needs bandwidth)

  • Virtual sensors / endpoint agents where appropriate

Select a mix that balances coverage and cost; explicitly document why you excluded some feeds (duplication, low security value).
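
A simple feed inventory makes the "document why" step checkable. The following is a sketch under stated assumptions: the source names and record fields are hypothetical, and the only rule enforced is that every include or exclude decision carries a written reason.

```python
# Hypothetical feed inventory; field names are illustrative, not from the standard.
FEEDS = [
    {"source": "core-switch-span", "method": "SPAN", "selected": True,
     "reason": "Covers EMS server VLANs"},
    {"source": "substation-flow", "method": "IPFIX", "selected": True,
     "reason": "Low-bandwidth remote site"},
    {"source": "dist-switch-span", "method": "SPAN", "selected": False,
     "reason": "Duplicates core-switch-span"},
    {"source": "backup-vlan-tap", "method": "TAP", "selected": False,
     "reason": ""},
]


def undocumented_feeds(feeds):
    """Return the sources whose include/exclude decision has no written reason."""
    return [f["source"] for f in feeds if not f["reason"].strip()]


if __name__ == "__main__":
    for source in undocumented_feeds(FEEDS):
        print(f"Missing rationale: {source}")
```

Running a check like this before each evidence refresh catches the exclusions nobody wrote down.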

Tip: Be mindful of the auditor’s expectations on feasibility. If a segment can’t export data today, the standard’s technical rationale lists acceptable ways to proceed: upgrade gear, add TAPs, collect flow, collect adjacent, supplement with endpoint/firewall logs, and target highest-value ports instead of mirroring everything.

A5) Decide what’s “in” and “out.”
You are not required to collect serial, 4–20mA, or generic WAN/MPLS circuits (though MPLS can be used to move INSM data). Write these exclusions down.

A6) Build an INSM reference architecture you can defend.
Separate collection from analysis, avoid ERC on collection paths where possible, use mirrors/spans at strategic points, deduplicate traffic (as much as possible), and consider a data diode or tap between tiers for higher assurance. The Technical Rationale’s example reference architecture (below, Figure 1) is a great template for your design review packet.

[Figure 1: Example INSM reference architecture from the Technical Rationale]

 

Phase B — Make detection useful (R1.2)

B1) Baseline “expected” behavior.
Whether you use statistical/ML/AI “anomaly detection,” protocol hygiene checks, or signature/IOC rules, the concept is the same: establish expected behavior on the monitored segments and flag anomalies for review. You can’t recognize an anomaly until you know what normal looks like. Examples of anomalies: unusual protocol use, unexpected volumes, odd logon times, invalid ICS protocol flows, and first-time communication between two systems or assets.
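
To make "first-time communication" concrete, here is a minimal sketch of a peer baseline: learn the set of (source, destination, port) tuples during a quiet window, then flag any tuple not seen before. The addresses and flow record shape are hypothetical, and real tooling would also age out stale entries and account for ephemeral ports.

```python
def build_baseline(flows):
    """Collect the (src, dst, port) tuples seen during the learning window."""
    return {(f["src"], f["dst"], f["port"]) for f in flows}


def new_peers(baseline, flows):
    """Return the first flow for each (src, dst, port) tuple absent from the baseline."""
    reported = set()
    findings = []
    for f in flows:
        key = (f["src"], f["dst"], f["port"])
        if key not in baseline and key not in reported:
            reported.add(key)   # report each new relationship once
            findings.append(f)
    return findings


# Hypothetical flow records: DNP3 on TCP 20000, Modbus on TCP 502.
LEARNING = [
    {"src": "10.1.1.10", "dst": "10.1.2.20", "port": 20000},
    {"src": "10.1.1.11", "dst": "10.1.2.21", "port": 502},
]
LIVE = [
    {"src": "10.1.1.10", "dst": "10.1.2.20", "port": 20000},
    {"src": "10.1.1.50", "dst": "10.1.2.21", "port": 502},
    {"src": "10.1.1.50", "dst": "10.1.2.21", "port": 502},
]

if __name__ == "__main__":
    for flow in new_peers(build_baseline(LEARNING), LIVE):
        print("New peer relationship:", flow)
```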

B2) Use OT-aware methods.
You’ll get more signal (and less noise) if your tooling speaks the ICS protocols/behaviors you actually run (e.g., DNP3, Modbus, ICCP, OPC-UA, vendor-specific). The standard expects ICS-protocol-aware INSM.

B3) Combine methods to reduce blind spots.
Blend behavioral, signature, IOC back-search, and configuration/hygiene checks (e.g., DNS misuse, weak ciphers, prohibited SMBv1/NTLMv1). The Technical Rationale recognizes this “many roads to anomalous” reality.

B4) Expect (and plan for) tuning.
False positives will spike, especially during the initial phases of the effort, outages, and system changes. Suppressing alarms during planned work and aligning alerts with operations is considered normal and not cause for non-compliance when managed prudently. (Build this into the process you wrote in A1.)
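
One prudent way to handle planned work is to tag alerts rather than drop them, so the record shows what was suppressed and under which change ticket. This is a sketch with assumed names: the window records, segment labels, and ticket format are all hypothetical.

```python
from datetime import datetime

# Hypothetical approved maintenance windows, keyed to a change ticket.
WINDOWS = [
    {"segment": "ems-servers",
     "start": datetime(2028, 3, 4, 1, 0),
     "end": datetime(2028, 3, 4, 5, 0),
     "ticket": "CHG-0001"},
]


def annotate(alert, windows):
    """Mark an alert as suppressed if it falls inside an approved window for its
    segment. The alert is always returned, never discarded."""
    for w in windows:
        if alert["segment"] == w["segment"] and w["start"] <= alert["time"] < w["end"]:
            return {**alert, "suppressed": True, "ticket": w["ticket"]}
    return {**alert, "suppressed": False, "ticket": None}


if __name__ == "__main__":
    alert = {"segment": "ems-servers", "time": datetime(2028, 3, 4, 2, 30)}
    print(annotate(alert, WINDOWS))
```

Keeping suppressed alerts in the case system, linked to the ticket, is what lets you show later that the suppression was managed and not a blind spot.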

Phase C — Evaluate and act (R1.3)

C1) Define triage states and SLAs.
At a minimum: Benign / Needs Investigation / Escalate with time-boxed triage. Your evaluation step can be a playbook and analyst workflow referencing OT SMEs. If it rises to a Cyber Security Incident or Attempt to Compromise, hand off per CIP-008.
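
The three triage states and their time boxes can be expressed in a few lines, which helps when you want the case system to flag overdue evaluations. The time limits below are placeholders chosen for illustration; set your own in the R1 process document.

```python
from datetime import datetime, timedelta
from enum import Enum


class Triage(Enum):
    BENIGN = "benign"
    NEEDS_INVESTIGATION = "needs_investigation"
    ESCALATE = "escalate"


# Hypothetical time boxes; BENIGN has no deadline because it is a closed state.
SLA = {
    Triage.NEEDS_INVESTIGATION: timedelta(hours=24),
    Triage.ESCALATE: timedelta(hours=1),
}


def is_overdue(state, opened_at, now):
    """True if the case has been open longer than the time box for its state."""
    limit = SLA.get(state)
    return limit is not None and now - opened_at > limit


if __name__ == "__main__":
    opened = datetime(2028, 10, 2, 8, 0)
    print(is_overdue(Triage.ESCALATE, opened, opened + timedelta(hours=2)))
```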

C2) Close the loop.
Tune rules after benign findings. Preserve evidence for suspected or malicious activity. Feed lessons into process improvement, change management, and training.

Phase D — Retain what matters (R2)

D1) Retain anomalous-associated data, not the whole firehose.
Keep PCAPs/metadata/files/raw data that the alert or investigation actually touched, at least until the action is complete under R1.3. Normal traffic can be discarded. If an event escalates into CIP-008 scope, be sure to follow CIP-008 retention rules.
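
The retention rule above reduces to a purge check: an artifact may go only when nothing still depends on it. The sketch below assumes a hypothetical case store with `status` and `cip008` fields; it treats any link to a CIP-008 case as a hold, leaving the actual CIP-008 retention period to that process.

```python
def may_purge(artifact, cases):
    """True if the artifact is linked to no case, or every linked case is closed
    and none of them is a CIP-008 incident (those follow CIP-008 retention)."""
    linked = [cases[case_id] for case_id in artifact["case_ids"]]
    return all(c["status"] == "closed" and not c["cip008"] for c in linked)


# Hypothetical case store keyed by case ID.
CASES = {
    "C1": {"status": "open", "cip008": False},
    "C2": {"status": "closed", "cip008": False},
    "C3": {"status": "closed", "cip008": True},
}

if __name__ == "__main__":
    pcap = {"name": "capture-001.pcap", "case_ids": ["C1"]}
    print(may_purge(pcap, CASES))
```

R2 sets a floor ("at least until"), so treating a `True` result as permission and not as an instruction leaves room for the longer tiers described next.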

D2) Tier your retention.
Examples:

  • Full PCAPs lose value quickly; expensive to store long-term.

  • Targeted PCAPs and metadata/flows hold value longer, cheaper to keep.

  • Carved files and hashes are high-value artifacts over time.

Phase E — Protect the evidence (R3)

E1) Make deletion/modification hard.
Use strong auth, limited admin paths, segmentation (ideally separate collection/analysis network), and tamper-evident storage or immutability controls. Treat INSM like a forensic source an attacker will try to erase (think ATT&CK T1070 – Indicator Removal).
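
Tamper evidence can be illustrated with a hash chain, where each record's digest covers the previous digest so an edit or a removed middle record breaks verification. This is a teaching sketch, not a product recommendation: the record layout is an assumption, and it will not detect truncation from the end unless the latest digest is also anchored somewhere the attacker cannot reach.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous digest" for the first record


def _digest(prev, payload):
    """SHA-256 over the previous digest plus the canonical JSON of the payload."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a payload, linking it to the digest of the record before it."""
    prev = chain[-1]["digest"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "digest": _digest(prev, payload)})


def verify(chain):
    """Recompute every link; return False on any edited or missing middle record."""
    prev = GENESIS
    for rec in chain:
        if rec["prev"] != prev or rec["digest"] != _digest(prev, rec["payload"]):
            return False
        prev = rec["digest"]
    return True


if __name__ == "__main__":
    chain = []
    append_record(chain, {"case": "C1", "artifact": "capture-001.pcap"})
    append_record(chain, {"case": "C1", "artifact": "flows-001.json"})
    print(verify(chain))
```

A periodic `verify` run is also a natural source for the integrity-check evidence that M3 asks about.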

E2) Decide whether your INSM data is BCSI.
If you classify stored INSM data as BES Cyber System Information (BCSI), protect it using your CIP-011 processes (including third-party sharing). If you decide it’s not BCSI, you still must document and implement protections against unauthorized deletion or modification to satisfy R3.

Evidence You Should Have on Hand

As a general practice, you can always check the Measures of any CIP standard to get a sense of what evidence the auditors are expecting. For CIP-015, a summary of the Measures is below.

  • M1.1 (R1.1): Your risk-based rationale + list of selected/excluded data feeds (with reasons), diagrams showing TAP/SPAN/flow sources and how duplication is avoided.

  • M1.2 (R1.2): Detection configs, baseline documentation, example anomaly detections and analyst notes.

  • M1.3 (R1.3): The evaluation playbook, example case records, escalations to CIP-008 where applicable.

  • M2 (R2): Retention SOP, system configs, and reports showing retention timelines aligned to evaluation/incident workflows.

  • M3 (R3): Data-integrity controls (segmentation, access control, immutability), and tests showing protection against deletion/modification.

Pitfalls & Anti-Patterns to Avoid

  • Only mirroring the EAP. CIP-015 is about inside the ESP. Don’t confuse it with CIP-005 monitoring.

  • Collecting everything, explaining nothing. R1.1 is risk-based. Over-collecting without rationale leads to gaps elsewhere and tough audits.

  • Ignoring sensor classification. If a sensor is within/on the ESP and uses routable protocols, expect PCA; if it performs access monitoring/control, expect EACMS. Build controls to match.

  • Treating INSM like antivirus. INSM won’t block by itself; it’s a detective control—make sure your response muscle (CIP-008) is ready.

Metrics That Matter

  • Coverage: % of in-scope ESPs with at least one validated, documented data feed.

  • Fidelity: % of detections tied to ICS-protocol-aware analytics.

  • Quality: Mean time to evaluate anomalies (R1.3), % tuned within SLA.

  • Retention health: % of anomalous cases with complete evidence retained through closure (and longer for CIP-008 incidents).

  • Integrity: # of successful quarterly integrity checks on INSM stores (R3).
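
Two of these metrics are simple enough to compute directly from an ESP inventory and a case log. The sketch assumes hypothetical record fields (`validated_feeds`, `opened_at`, `evaluated_at`) and skips cases that have not been evaluated yet.

```python
from datetime import datetime


def coverage_pct(esps):
    """Percent of in-scope ESPs with at least one validated data feed."""
    if not esps:
        return 0.0
    covered = sum(1 for e in esps if e["validated_feeds"] > 0)
    return 100.0 * covered / len(esps)


def mean_hours_to_evaluate(cases):
    """Mean hours from case open to completed evaluation; None if nothing is evaluated."""
    done = [c for c in cases if c["evaluated_at"] is not None]
    if not done:
        return None
    seconds = sum((c["evaluated_at"] - c["opened_at"]).total_seconds() for c in done)
    return seconds / len(done) / 3600.0


if __name__ == "__main__":
    esps = [{"validated_feeds": 2}, {"validated_feeds": 0}]
    cases = [{"opened_at": datetime(2028, 10, 2, 8, 0),
              "evaluated_at": datetime(2028, 10, 2, 10, 0)}]
    print(coverage_pct(esps), mean_hours_to_evaluate(cases))
```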

One-Page Rollout Checklist

  1. Charter INSM program; write the R1 control narrative (A1).

  2. Classify sensors/collectors (PCA/EACMS/BCSI impacts) with diagrams (A2).

  3. Risk-rank ESP networks; justify inclusions/exclusions (A3).

  4. Select feed methods (TAP, SPAN, flow, RSPAN, virtual); handle duplication (A4/A6).

  5. Deploy an architecture that separates collection and analysis, avoids ERC where possible, and segments INSM (A6/E1).

  6. Baseline and detect with ICS-aware analytics; plan tuning (B1-B4).

  7. Evaluate anomalies with documented triage + CIP-008 handoff (C1-C2).

  8. Retain only anomalous-related data until closure; tier storage (D1-D2).

  9. Protect integrity (R3) and decide BCSI handling (E1-E2).

  10. Prove it: Maintain M1-M3 evidence packages and run quarterly control health checks.

Looking Ahead: Beyond the ESP (CIP-015-2 Horizon)

FERC didn’t stop at adopting CIP-015-1. In Order 907, the Commission told NERC to expand INSM beyond the ESP to include EACMS and PACS and to cover the full “CIP-networked environment.” In plain terms, visibility must follow trust: the identity, access, and physical control planes that can influence BES Cyber Systems belong in scope. NERC’s follow-on Project 2025-02 SAR starts that work.

What to expect in CIP-015-2: applicability extended to EACMS/PACS outside the ESP; monitoring of east-west traffic between those systems (e.g., PACS controller and devices/readers/REX; EACMS used solely for access logging/monitoring); and clearer treatment of lateral movement across identity and access infrastructure. That’s the direction the SAR signals, aligning with FERC’s intent to close blind spots outside the ESP.

Watch items as drafting proceeds: avoid redefining what FERC already defined as the CIP-networked environment; press for functional monitoring outcomes (baseline + behavioral/anomaly detection, not just box-checking); and coordinate with adjacent CIP standards so obligations don’t conflict at the boundaries (e.g., CIP-005/-007/-010).

Do now, before the ink dries

  • Inventory trust relationships: enumerate EACMS/PACS that authenticate, authorize, or log access to BCS, even if they sit outside the ESP.

  • Map east-west paths among identity, logging, and control systems; flag multi-site PACS, federated identity, and shared VLANs.

  • Assess gaps: what (if anything) is monitored today on those non-ESP segments?

  • Plan segmentation where shared infrastructure would drag non-CIP networks into scope.

  • Engage in the SAR/drafting process: comment early so operational realities shape the text, or join the drafting team(s) directly.

Final Thoughts

CIP-015 isn’t another sensor rollout; it’s an operating capability. Treat INSM as a permanent muscle that closes the ESP blind spot and hardens your response. Design for outcomes (earlier detection, faster evaluation, preserved evidence) and let tooling follow, not the other way around.

Evidence by design. From day one, write the R1 narrative that ties risk-based feeds to anomaly detection then to evaluation and actions. Make retention (R2) and integrity (R3) explicit in the workflow. When an auditor or incident responder asks “show me,” you can.

Engineer for the horizon. Build once for 015-1 by Oct 1, 2028 (Control Centers first) and scale to Oct 1, 2030 (remaining Medium-ERC). Route cables, size storage, and license analytics so extending to EACMS/PACS for 015-2 isn’t a heavy (or heavier) lift.

People and practice beat product. INSM succeeds when SOC analysts (or your equivalent), OT engineers, and compliance share the same playbook. Run quarterly table-tops and tuning windows tied to outages. Reward “false-positive slayers” who reduce noise without losing signal.

Guardrails for architecture. Separate collection from analysis, minimize ERC on mirrored paths, segment the INSM stack, and prefer immutable/tamper-evident storage. Decide early how you’ll classify sensors (PCA/EACMS) and INSM data (BCSI) so CIP-004/-011/-013/-010 controls snap into place.

Volt Typhoon realism: INSM vs. living-off-the-land. State actors living off the land won’t trip signature IDS. They live inside your “normal.” INSM is built for that. By baselining east-west behavior, you can surface:

  • New peer relationships (e.g., an engineering workstation talking to an unfamiliar HMI/PLC subnet).

  • Identity plane oddities (off-hours spikes to AD/MFA, NTLM where Kerberos is expected, unusual LDAP/DCERPC chatter).

  • Remote-exec patterns (sudden SMB/WinRM/WMI/DCOM bursts between hosts that don’t usually talk).

  • Protocol misuse (unexpected ICS function codes, malformed queries, or command rates out of pattern).

  • Slow-and-low staging (dribbled file moves over SMB/IPC$ or abnormal DNS lookups from OT hosts).

  • Lateral movement pivots through jump servers or shared VLANs that don’t match your baseline graph.
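
As one example from the list above, protocol misuse can be surfaced by learning which function codes each host pair normally uses and flagging anything else. The sketch uses Modbus function codes (3 reads holding registers, 4 reads input registers, 16 writes multiple registers); the host names and observation records are hypothetical, and a production tool would parse these from traffic.

```python
from collections import defaultdict


def learn_function_codes(observations):
    """Map each (src, dst) pair to the set of function codes seen in the baseline."""
    allowed = defaultdict(set)
    for o in observations:
        allowed[(o["src"], o["dst"])].add(o["fc"])
    return allowed


def unexpected(allowed, observations):
    """Return observations whose function code was never seen for that host pair,
    including any pair that was absent from the baseline altogether."""
    return [o for o in observations
            if o["fc"] not in allowed.get((o["src"], o["dst"]), set())]


# Hypothetical Modbus observations.
BASELINE = [
    {"src": "hmi-1", "dst": "plc-7", "fc": 3},
    {"src": "hmi-1", "dst": "plc-7", "fc": 4},
]
LIVE = [
    {"src": "hmi-1", "dst": "plc-7", "fc": 3},
    {"src": "hmi-1", "dst": "plc-7", "fc": 16},   # a write from a read-only peer
    {"src": "eng-ws-2", "dst": "plc-7", "fc": 3}, # a peer never seen before
]

if __name__ == "__main__":
    for obs in unexpected(learn_function_codes(BASELINE), LIVE):
        print("Out of pattern:", obs)
```

Nothing here relies on a signature, which is the point: the write command is perfectly valid Modbus, it is just not normal for that pair.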

Forensic integrity. If or when an adversary wipes endpoint logs, R2/R3 keep you in the fight. Anomaly-linked packets/flows preserved in tamper-resistant stores give you the forensic ground truth they can’t easily erase.

Measure what matters. Track coverage (% of ESP segments with validated feeds), time to evaluate anomalies, % of cases with complete evidence through closure, and quarterly integrity checks. If a metric doesn’t change behavior, drop it.

Do the paperwork and the engineering. If you make INSM routine, scoped, tuned, and evidenced, you won’t just meet CIP-015. You’ll catch the lateral moves your perimeter never saw and buy back precious minutes when they matter most.
