Building Blocks of OT Security Monitoring: A Deep Dive for SOC Builders and MSSPs
By Dan Ricci and Patrick Miller
Learn how to build scalable, OT-aware security monitoring using no-cost open-source software tools like Security Onion, Wazuh, Malcolm, and The Hive. Whether you're launching a SOC or growing your MSSP, this guide covers deployment models, costs, timelines, and training to get you started fast - and smart.
Overview
The convergence of Information Technology (IT) and Operational Technology (OT) has brought new visibility, efficiency, and, unfortunately, risk. For asset owners looking to stand up their own SOC, or MSSPs aiming to expand into critical infrastructure security, there's no longer room to ignore OT cybersecurity. Fortunately, open-source tools offer a practical, cost-effective way to build scalable, modular monitoring and response capabilities tailored to OT environments.
This post breaks down what it takes to build an OT-aware monitoring solution, from flyaway kits to full-stack SOC deployments, drawing on real-world deployment templates, training programs, and cost modeling.
Downloadable Resources
All documents needed to begin your journey are available for download below, individually in PDF format or bundled together as a convenient ZIP archive. These include tool capability lists, deployment timelines, hardware cost estimates, project templates, and training program overviews. Use them to jumpstart planning, guide implementation, or support stakeholder discussions.
v1.0: Open-Source OT Cybersecurity Monitoring - Ticketing Solutions and Capabilities List
v1.0: Open-Source OT Cybersecurity Monitoring Solutions Deployment Timelines
v1.0: Open-Source OT Monitoring - Ticketing HW Cost Estimates
v1.0: Project Template - INL Malcolm Deployment for Incident & Event Detection, Tracking, and Recovery
v1.0: Project Template - Security Onion Deployment for Incident & Event Detection, Tracking, and Recovery
v1.0: Training Program Overview - Incident - Event Monitoring & Reporting with Security Onion
v1.0: Training Program Overview - Incident - Event Monitoring and Tracking with Malcolm and The Hive
v1.0: Full documentation kit (ZIP archive)
The Modular SOC: Core Functions and Open-Source Tools
An effective OT-capable SOC rests on four functional pillars, each of which can be implemented with open-source solutions:
Functional Pillar | Tool(s) | Desired Outcome |
---|---|---|
Network Monitoring | Security Onion or Malcolm | Deep packet inspection, protocol analysis, and threat detection. |
Host-Based Monitoring | Wazuh | File integrity monitoring, log aggregation, agent-based detection. |
Incident Ticketing and Tracking | The Hive or SOC Case (within Security Onion) | Case management, collaboration, metrics (MTTD, MTTR). |
Visualization and Metrics | Elastic Stack (Elasticsearch, Logstash, Kibana) | Dashboards, reporting, and alert visualization. |
Each component can stand alone or be integrated into a layered defense strategy. Tool selection depends on your environment, resources, and team maturity.
Deployment Archetypes: From Flyaway Kits to Full SOC
Deployment Archetype | Use Case | Tool Stack | Form Factor |
---|---|---|---|
Flyaway Kit / Incident Handler's Kit | Breach response, short-term engagements | Malcolm + The Hive (Dockerized) or hardened Security Onion laptop | Rugged laptop or portable NUC appliance |
Ad-Hoc OT Monitoring Node | Passive monitoring for specific control zones | Security Onion or Malcolm standalone instance | Onsite appliance with passive tap or SPAN port |
SOC-in-a-Box | Full visibility and response stack for small/mid-size asset owners | Security Onion or Malcolm + Wazuh + The Hive | Single-server deployment or small cluster |
Fractional MSSP Model | Multi-tenant OT security monitoring service | Shared backend, tenant-specific Hive instances; VPN or site-based sensors | Virtualized or cloud-hosted backend + remote appliances |
Deployment Timelines and Resource Planning
Standing up a functional IT/OT cybersecurity monitoring environment doesn't require a massive team or enterprise budget, but it does require clear planning, task alignment, and awareness of tool-specific nuances. Below is a structured timeline and resource guide based on proven deployment templates for Security Onion, Malcolm, Wazuh, and The Hive.
Estimated Full Stack Timeline
Weeks | Major Activities |
---|---|
1–2 | Planning, hardware procurement, initial network assessment |
3–4 | Begin Security Onion or Malcolm deployment; prepare Wazuh server |
5–6 | Deploy Wazuh agents, install The Hive, begin system tuning |
7–8 | Final integration, user training, establish metrics & reporting |
9–12 | Ongoing tuning, team feedback loops, post-implementation review |
Deployment Milestones by Tool
Tool | Milestone | Duration | Notes |
---|---|---|---|
Security Onion | Install, configure, and tune Zeek/Suricata + Elastic Stack | 4–6 weeks | Requires network tap/SPAN access and Linux-literate staff |
Malcolm (alternative to Security Onion) | Dockerized deployment and protocol parser tuning | 3–5 weeks | Easier to containerize; well suited to OT protocols |
Wazuh | Server + agent deployment across supported hosts | 4–6 weeks (parallel) | Not all OT assets will support agents—use syslog/log forwarding where needed |
The Hive | Incident ticketing integration + user onboarding | 2–4 weeks (parallel) | Consider Cortex add-on for enrichment/automation |
Elastic/Kibana | Dashboard customization and metric development | 2–3 weeks (post-core) | Leverage pre-built dashboards, adjust to OT metrics like MTTD/MTTR |
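To see how the parallel milestones in the table combine, here is a minimal sketch. It assumes the network monitoring install is the core track, that the Wazuh and The Hive work fully overlaps with it, and that dashboard work starts only afterward. That reading of "parallel" and "post-core" is an assumption, and the names `MILESTONES` and `stack_duration` are hypothetical.

```python
# Durations in weeks as (min, max), copied from the milestone table above.
MILESTONES = {
    "security_onion": (4, 6),
    "wazuh": (4, 6),       # assumed to overlap with the core install
    "the_hive": (2, 4),    # assumed to overlap with the core install
    "dashboards": (2, 3),  # assumed to start once the core tools are in place
}


def stack_duration(milestones):
    """Return (min_weeks, max_weeks) when all non-dashboard work overlaps."""
    parallel = [span for name, span in milestones.items() if name != "dashboards"]
    post_core = milestones["dashboards"]
    shortest = max(low for low, _ in parallel) + post_core[0]
    longest = max(high for _, high in parallel) + post_core[1]
    return shortest, longest


print(stack_duration(MILESTONES))  # (6, 9)
```

Adding the 1–2 weeks of planning and procurement gives roughly 7 to 11 weeks, which fits inside the 12-week full-stack timeline.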
Personnel Requirements
Role | Responsibilities | Skills Required |
---|---|---|
Cybersecurity Specialist | Leads tool deployment, rule tuning, and integration | OT familiarity, Linux CLI, SIEM/IDS experience |
IT Staff | Sets up hardware, network taps, endpoint access | Network architecture, basic sysadmin skills |
Incident Analyst (optional) | Helps with testing alerts, building response workflows | SOC case handling, KPIs, response playbooks |
Project Manager (PM) | Oversees schedule, resource coordination, reporting | Timeline tracking, stakeholder comms |
Resource Planning Tips
Training & Documentation: Budget time for hands-on training for each tool. Use Ampyx Cyber’s training modules to accelerate this.
Parallelism: Security Onion/Malcolm and Wazuh deployments can often proceed in parallel.
Hardware Consolidation: If virtualizing, ensure adequate CPU cores and RAM to support multiple VMs (e.g., 16+ cores, 64GB+ RAM).
Metrics & Success Criteria: Define MTTR, MTTD, case closure rate, and false positive ratio early—these become baselines for improvement.
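As a rough illustration of how these baselines might be computed from exported case data, consider the sketch below. The record fields (`occurred`, `detected`, `resolved`, `false_positive`) are hypothetical and do not reflect the schema of The Hive or Security Onion. It also assumes MTTR is measured from detection to resolution, which your team may define differently.

```python
from datetime import datetime
from statistics import mean

# Hypothetical case records for illustration only.
CASES = [
    {"occurred": datetime(2025, 1, 6, 8, 0), "detected": datetime(2025, 1, 6, 9, 0),
     "resolved": datetime(2025, 1, 6, 13, 0), "false_positive": False},
    {"occurred": datetime(2025, 1, 7, 10, 0), "detected": datetime(2025, 1, 7, 10, 30),
     "resolved": datetime(2025, 1, 7, 11, 30), "false_positive": True},
    {"occurred": datetime(2025, 1, 8, 14, 0), "detected": datetime(2025, 1, 8, 15, 30),
     "resolved": None, "false_positive": False},  # still open
]


def hours(delta):
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


def soc_metrics(cases):
    """Compute baseline KPIs; expects at least one case and one closed case."""
    closed = [c for c in cases if c["resolved"] is not None]
    return {
        "mttd_hours": mean(hours(c["detected"] - c["occurred"]) for c in cases),
        "mttr_hours": mean(hours(c["resolved"] - c["detected"]) for c in closed),
        "closure_rate": len(closed) / len(cases),
        "false_positive_ratio": sum(c["false_positive"] for c in cases) / len(cases),
    }


print(soc_metrics(CASES))
```

With the sample records, MTTD is 1.0 hour and MTTR is 2.5 hours. Running the same calculation after each tuning sprint shows whether the baseline is improving.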
Key Considerations
Agent Limitations: Some legacy OT assets won’t support Wazuh agents. Plan for log relay or network-only visibility in those cases.
Alert Overload: False positives will spike initially. Schedule tuning sprints every 1–2 weeks after initial deployment.
User Adoption: Tools like The Hive or SOC Case require daily use to be effective. Bake this into team workflows with SOPs and defined owners.
Integration Risks: Allow time for API configs, connector testing, and alert formatting normalization across tools.
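Alert normalization usually means mapping each tool's output into one common shape before it reaches the ticketing system. The sketch below shows one way to do that. The input field names follow the Suricata EVE JSON and Wazuh alert JSON layouts as commonly documented and should be verified against the versions you deploy. The common schema, the function names, and the priority thresholds are all assumptions made for illustration.

```python
def normalize_suricata(event):
    """Map a Suricata EVE alert record to the common shape."""
    severity = event["alert"]["severity"]  # Suricata: 1 is the most severe
    return {
        "source": "suricata",
        "time": event["timestamp"],
        "asset": event.get("dest_ip"),
        "title": event["alert"]["signature"],
        # Assumed mapping: 1 -> high, 2 -> medium, anything else -> low.
        "priority": "high" if severity == 1 else "medium" if severity == 2 else "low",
    }


def normalize_wazuh(alert):
    """Map a Wazuh alert record to the common shape."""
    level = alert["rule"]["level"]  # Wazuh: a higher level is more severe
    return {
        "source": "wazuh",
        "time": alert["timestamp"],
        "asset": alert["agent"]["name"],
        "title": alert["rule"]["description"],
        # Assumed thresholds: 12+ -> high, 7-11 -> medium, below 7 -> low.
        "priority": "high" if level >= 12 else "medium" if level >= 7 else "low",
    }


# Hypothetical sample records, trimmed to the fields used above.
suricata_event = {
    "timestamp": "2025-01-06T09:00:00.000000+0000",
    "event_type": "alert",
    "src_ip": "10.0.0.5",
    "dest_ip": "10.0.0.20",
    "alert": {"signature": "Example: unexpected write to controller", "severity": 1},
}
wazuh_alert = {
    "timestamp": "2025-01-06T09:05:00.000+0000",
    "rule": {"level": 7, "description": "Integrity checksum changed."},
    "agent": {"name": "hmi-01"},
}

for record in (normalize_suricata(suricata_event), normalize_wazuh(wazuh_alert)):
    print(record["source"], record["priority"], record["title"])
```

Keeping the mapping in small per-tool functions makes it easier to retest one connector when a tool upgrade changes its output format.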
Hardware Cost Ranges: SOC by the Budget
One of the biggest advantages of a modular, open-source SOC approach is cost control. Whether you're bootstrapping with a single laptop or standing up a full multi-tenant MSSP backend, there’s a realistic hardware path forward.
Deployment Type | Hardware Form Factor | Estimated Cost Range (USD) |
---|---|---|
Flyaway Kit | Rugged laptop or portable NUC | $1,500 – $4,000 |
Ad-Hoc OT Monitor | Small form factor mini-PC | $1,000 – $3,000 |
SOC-in-a-Box (Single Site) | Mid-range server (2U or tower) | $5,000 – $15,000 |
MSSP Backend Node (Shared) | 2–3 enterprise-grade servers | $25,000 – $75,000 |
Cloud-Hosted (Virtualized) | N/A (usage-based pricing) | $100 – $1,000/month per tenant workload |
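The ranges in the table can be combined into a quick budget envelope. The sketch below assumes an MSSP backend plus one Ad-Hoc OT Monitor class appliance per remote site. That pairing is an assumption for illustration, and the function name `hardware_budget` is hypothetical.

```python
def hardware_budget(sites, backend=(25_000, 75_000), sensor=(1_000, 3_000)):
    """Return (low, high) hardware cost in USD for a backend plus one sensor per site."""
    low = backend[0] + sites * sensor[0]
    high = backend[1] + sites * sensor[1]
    return low, high


print(hardware_budget(5))  # (30000, 90000)
```

These figures cover hardware only. Staff time, training, and sustainment are not included.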
Guidance:
Storage Requirements: Plan for at least 1–2TB of fast local SSD per node; more for high-volume packet capture.
NICs: Ensure hardware includes at least one monitor-capable NIC per segment (SPAN or tap).
Redundancy: MSSPs should budget for RAID-configured storage, hot spares, and out-of-band access.
Energy & Environment: Industrial deployments may require fanless systems, hardened enclosures, or low-power processors.
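To sanity-check the storage guidance against your own traffic, a back-of-the-envelope calculation helps. The sketch below estimates full packet capture volume from an average link rate and a retention window. It ignores compression, indexing overhead, and traffic bursts, and the function name and example figures are hypothetical.

```python
def pcap_storage_tb(avg_mbps, retention_days):
    """Estimate raw packet capture storage in decimal terabytes."""
    bytes_per_second = avg_mbps * 1_000_000 / 8  # megabits/s -> bytes/s
    total_bytes = bytes_per_second * 86_400 * retention_days  # 86,400 s per day
    return total_bytes / 1e12


# A monitored segment averaging 10 Mbps, retained for 14 days.
print(round(pcap_storage_tb(10, 14), 3))  # 1.512
```

A segment averaging 10 Mbps fills about 1.5TB in two weeks, which is why high-volume capture calls for more than the 1–2TB starting point.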
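The full pricing reference breakdown is available in the Downloadable Resources section above (Open-Source OT Monitoring - Ticketing HW Cost Estimates).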
Lessons Learned from the Field
Across dozens of small utility, municipal, and industrial environments, several common patterns have emerged:
Start with Visibility, Not Perfection: Most OT environments lack centralized logging or visibility. Simply capturing and reviewing network metadata or host logs is often a massive improvement. Prioritize progress over perfection.
Training Is Non-Negotiable: Even the best stack falters without trained eyes behind it. Team members must understand alert fatigue, baseline behaviors, and how to pivot from detection to triage. Hands-on training with the deployed toolset accelerates adoption.
The Hive Beats Email: A lightweight case/ticketing system like The Hive provides more than metrics. It reinforces workflow discipline, improves documentation, and enables collaboration across silos. It's a force multiplier for small teams.
Security Onion vs. Malcolm: If your team is Linux-savvy and focused on raw packet analytics, Security Onion excels. For protocol-heavy industrial networks, Malcolm is often easier to deploy and maintain, especially in Dockerized form.
Agentless ≠ Useless: Some OT endpoints won’t support agents like Wazuh. That’s okay. Use syslog, SPANs, and protocol parsing to cover what you can. Something is better than nothing.
Don’t Ignore Metrics: MTTD (Mean Time to Detect) and MTTR (Mean Time to Respond) aren’t just buzzwords—they are management levers that justify continued investment.
Power Matters: Don't deploy gear into environments without accounting for electrical noise, cooling, and available power. Choose rugged or fanless equipment where appropriate.
Modular = Resilient: The most successful deployments are modular and incremental. Flyaway kits feed larger SOCs. Site nodes roll up to MSSP hubs. Each piece adds value on its own but scales to support broader coverage.
These insights aren’t theoretical. They’re born from real deployments and lessons learned.
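On the agentless point above: logs relayed over syslog carry a priority value at the start of each message, and decoding it is often the first step in triaging events from devices that cannot run an agent. The sketch below decodes that value using the standard rule that priority equals facility times 8 plus severity. The function name, the returned dictionary, and the sample message are hypothetical.

```python
import re

# Severity names in numeric order (0-7), as defined by the syslog standard.
SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]


def parse_pri(line):
    """Split a syslog line into facility, severity, and the remaining message."""
    match = re.match(r"<(\d{1,3})>", line)
    if match is None:
        return None  # no priority header, so not a well-formed syslog line
    facility, severity = divmod(int(match.group(1)), 8)
    return {
        "facility": facility,
        "severity": SEVERITIES[severity],
        "message": line[match.end():],
    }


# Hypothetical message relayed from a gateway in front of a legacy device.
print(parse_pri("<34>Oct 11 22:14:15 plc-gw su: 'su root' failed"))
```

For the sample line, priority 34 decodes to facility 4 and severity "critical".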
Training, Support, and Sustainability
Technology alone doesn’t make a SOC successful. People and process do. Organizations that invest in foundational and ongoing training, documentation, and knowledge transfer are the ones that sustain operations over the long term.
Training Is a Multiplier: Start with hands-on, scenario-based training using the same stack you deploy. Free and low-cost training is available from the open-source communities, but structured programs like those from Ampyx Cyber provide tailored content for OT environments.
Train the Whole Team: Analysts, engineers, and even system operators should be included in onboarding and cross-training. Use real events and test alerts to build intuition.
Build Runbooks Early: Capture lessons learned and standard response actions while the team is still small. This accelerates onboarding and helps maintain consistency through staff turnover and as the team scales.
Leverage Community: Security Onion, Wazuh, The Hive, and Malcolm all have strong communities. Join their forums, monitor GitHub issues, and don’t be afraid to contribute back.
Create a Sustainment Plan: Think about what happens after the initial deployment. Who owns patching? Who reviews alerts daily? Budget time for monthly reviews, annual refreshes, and stakeholder updates.
Align with Audit and Compliance: For regulated environments, map your tools and workflows to standards like NIST CSF, IEC 62443, or NERC CIP. Building this alignment early can prevent costly rework later.
Invest in Metrics and KPIs: Use MTTD, MTTR, and false positive/negative rates to track SOC maturity and make the case for sustained funding. Regular metrics reviews help prove value to stakeholders.
Ultimately, sustainability is about more than uptime. It’s about operational resilience, human skill retention, and program credibility over time.
Actionable Recommendations
For organizations standing up an OT-focused SOC or launching MSSP services, here are some clear next steps to guide planning and execution:
Pick a Starting Archetype: Don’t overbuild. Choose the deployment type (e.g., Flyaway Kit or SOC-in-a-Box) that fits your size, maturity, and current needs. Scale later.
Use Open-Source to Prove Value: Before investing in commercial tools, leverage open-source solutions to build internal credibility, demonstrate metrics, and validate workflows.
Build an MVP Fast: Use deployment templates and lightweight hardware to stand up a minimum viable stack in 30–60 days. Focus on network visibility, log ingestion, and basic alerting.
Pair Technology with Process: Create incident response procedures and triage runbooks in parallel with tool deployment. Don’t let tooling outpace operational readiness.
Include Stakeholders Early: Bring IT, engineering, compliance, and leadership into planning conversations. A cross-functional SOC reduces political and technical blind spots.
Train Before You Triage: Prioritize hands-on training for analysts and engineers before alerts start flowing. This builds confidence and reduces noise fatigue.
Budget for Sustainment, Not Just Build-Out: Allocate time and funding for system maintenance, patching, and annual roadmap refreshes. Sustainability needs resources.
Monitor Your Metrics: Establish baseline KPIs such as alert volume, false positive rate, MTTD, and MTTR. Use them to track maturity and report progress.
Document Everything: From system configs to lessons learned, write it down. Good documentation accelerates scaling and is critical for audits or hand-offs.
Plan for Growth: Design with modularity in mind. A flyaway kit today can be the seed for a full SOC tomorrow, and a single-site deployment can evolve into a multi-tenant MSSP platform.
Security is not a project. It’s a capability. Use these steps to build one that lasts.
Looking Ahead
The future of OT security monitoring lies not in a single toolset or architecture, but in adaptive, modular ecosystems that align with evolving threats, organizational maturity, and regulatory expectations. Several key trends are reshaping the landscape:
AI and Threat Intelligence Integration: Future SOCs will increasingly leverage AI-driven analytics and contextual threat intel feeds to cut through alert fatigue and enhance precision in triage.
Federated Visibility Across Distributed Assets: As DERs, remote substations, and edge devices proliferate, scalable monitoring must be lightweight, remotely manageable, and interoperable.
Managed Detection for OT (MDR-OT): MSSPs and regional co-ops are moving toward managed SOC services that combine centralized correlation with local sensor deployment and response integration.
Hybrid Architectures: The best deployments won’t be all-cloud or all-on-prem. They’ll use hybrid models that offload analytics and storage to the cloud while keeping collection close to the process.
Secure-by-Design Engineering: As the concept of Cyber-Informed Engineering matures, SOC design will intersect more deeply with control system architecture and safety-critical operations.
Metrics-Driven Governance: Expect more regulatory interest in how organizations measure and manage security operations. Metrics like MTTD, MTTR, and incident closure rates may become compliance benchmarks.
Organizations that embrace this shift toward visibility, flexibility, and operational alignment will not only detect threats faster but also build more resilient infrastructures. Whether you're starting small or scaling big, the goal is the same: sustainable, defendable operations in a connected, contested world.
Conclusion
Standing up an OT-aware SOC, or evolving a small-scale MSSP into a credible IT/OT service provider, isn’t impossible. With the right strategy, a practical toolset, and a commitment to training and sustainment, organizations of all sizes can gain meaningful visibility and control over their industrial cyber risk.
This deep dive was designed to demystify the building blocks of scalable OT monitoring. Whether you’re starting with a flyaway kit or planning a multi-tenant platform, the key is modularity, clarity of purpose, and alignment between people, process, and technology.
Open-source tools lower the barrier to entry. Real-world deployment guides reduce uncertainty. And a strong focus on operational realities - training, governance, metrics, and sustainability - ensures that what you build today won’t just work, but will evolve.
OT security is no longer optional. The question is not if you’ll need monitoring, but how you’ll make it fit your operational context. Use this guide as a foundation. The next move is yours.