Corelight

What is it?

Corelight is the company that maintains Zeek, the free and open source network security sensor. The company offers products based on Zeek, including a high-speed network security sensor, SOC investigation tools, and a turnkey NDR platform called Open NDR.

Zeek sensors sit passively on network spans or taps, ingest network traffic, and produce structured logs, metadata, and detections. The output is structured network evidence (connection logs, DNS records, HTTP transactions, SMB activity, file hashes, and protocol-specific metadata) that feeds into a SIEM, a data lake, or Corelight’s own tools.
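To make "structured network evidence" concrete, here is a minimal sketch of consuming connection logs. It assumes the sensor is configured to write JSON lines; the field names (id.orig_h, service, orig_bytes, and so on) follow Zeek's conn.log, but the records themselves are made up for illustration.

```python
import json
from collections import Counter

# Made-up records shaped like Zeek conn.log entries written as JSON lines.
SAMPLE_CONN_LOG = """\
{"ts": 1700000000.1, "uid": "CaaaaaE1", "id.orig_h": "10.0.0.5", "id.orig_p": 51000, "id.resp_h": "10.0.0.9", "id.resp_p": 445, "proto": "tcp", "service": "smb", "orig_bytes": 1200, "resp_bytes": 800}
{"ts": 1700000001.4, "uid": "CbbbbbE2", "id.orig_h": "10.0.0.5", "id.orig_p": 40312, "id.resp_h": "10.0.0.2", "id.resp_p": 53, "proto": "udp", "service": "dns", "orig_bytes": 60, "resp_bytes": 120}
{"ts": 1700000002.9, "uid": "CcccccE3", "id.orig_h": "10.0.0.7", "id.orig_p": 55221, "id.resp_h": "10.0.0.9", "id.resp_p": 9000, "proto": "tcp", "orig_bytes": 500}
"""


def bytes_by_service(lines):
    """Sum transferred bytes per detected service across conn.log records."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Zeek omits "service" when it could not identify the protocol.
        service = record.get("service", "unknown")
        totals[service] += record.get("orig_bytes", 0) + record.get("resp_bytes", 0)
    return totals


totals = bytes_by_service(SAMPLE_CONN_LOG.splitlines())
for service, total in sorted(totals.items()):
    print(service, total)
```

Because every record is a flat, typed object, the same few lines of parsing work whether the logs land in a SIEM, a data lake, or a local file.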

On the hardware side, Corelight’s custom appliances run a heavily optimized version of Zeek at line rates from 2 Gbps to 200 Gbps, making them suitable for high-throughput environments such as large cloud providers, financial institutions, or anyone doing serious east-west monitoring.

Corelight also ships virtual appliances for on-prem virtualization and cloud-native sensors for AWS, GCP, and Azure VPCs. All sensor types produce the same Zeek and Suricata output. Corelight also offers Smart PCAP, which selectively captures full packets based on configurable triggers (alerts, unknown protocols, unencrypted traffic, specific hosts) rather than doing expensive full-take capture.
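The trigger-based idea behind selective capture can be sketched as a small policy check. This is a hypothetical illustration only: the FlowSummary and CapturePolicy classes and every field name are invented for this example and are not Corelight's configuration format or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class FlowSummary:
    """Hypothetical per-flow facts a sensor might know when deciding to capture."""
    orig_h: str
    resp_h: str
    service: Optional[str]  # None means the protocol was not identified
    encrypted: bool
    alerted: bool


@dataclass
class CapturePolicy:
    """Hypothetical trigger set mirroring the four triggers named in the text."""
    watched_hosts: Set[str] = field(default_factory=set)
    capture_on_alert: bool = True
    capture_unknown_protocols: bool = True
    capture_unencrypted: bool = False

    def reasons(self, flow: FlowSummary) -> List[str]:
        """Return every trigger that fires for this flow (empty list = skip)."""
        hits = []
        if self.capture_on_alert and flow.alerted:
            hits.append("alert")
        if self.capture_unknown_protocols and flow.service is None:
            hits.append("unknown_protocol")
        if self.capture_unencrypted and not flow.encrypted:
            hits.append("unencrypted")
        if flow.orig_h in self.watched_hosts or flow.resp_h in self.watched_hosts:
            hits.append("watched_host")
        return hits

    def should_capture(self, flow: FlowSummary) -> bool:
        return bool(self.reasons(flow))


policy = CapturePolicy(watched_hosts={"10.0.0.66"})
routine = FlowSummary("10.0.0.5", "10.0.0.9", "tls", encrypted=True, alerted=False)
odd = FlowSummary("10.0.0.66", "10.0.0.9", None, encrypted=True, alerted=True)
print(policy.should_capture(routine), policy.reasons(odd))
```

The point of the design is that most flows match no trigger and cost no packet storage, while the flows an analyst is likely to need later keep their full packets.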

Why did they build it?

Corelight started as a highly optimized Zeek sensor. Zeek has been the de facto standard for network security monitoring for years, but running it at scale requires significant operational overhead: tuning, package management, hardware sizing, and output normalization. More critically, getting Zeek to keep up with 10, 100, or 200 Gbps line rates requires performance that commodity servers cannot deliver, which in practice means custom hardware. Corelight built purpose-designed appliances to solve both problems: turnkey Zeek that produces clean, parseable output at massive throughput without the operational burden.

From that foundation, Corelight expanded into full NDR. The broader case for NDR is the SOC visibility triad: SIEM, EDR, and network. EDR covers endpoints but has structural blind spots. It cannot see devices without agents (OT/ICS systems, HVAC controllers, network appliances, IoT), cannot observe lateral movement between hosts where the attacker has disabled or evaded the agent, and provides no visibility into perimeter device compromises like those used in Volt Typhoon-style attacks against routers and firewalls. Network monitoring, by contrast, is passive and tamper-resistant. An attacker who disables EDR, reboots into safe mode, or exploits a device with no agent support still generates observable network activity.

How does the detection layer work?

Corelight runs three detection methods in parallel. Suricata provides signature-based IDS alerts using rulesets (ET Open, ET Pro, custom). Zeek notices (Zeek’s native detection events) fire on protocol-level anomalies and behavioral patterns using Zeek’s scripting language. Machine learning models baseline normal traffic patterns per host and peer group, then flag statistical deviations: for example, a host in a peer group that normally generates HTTP traffic via Chrome suddenly issuing curl requests, or DNS query volumes deviating significantly from historical means.
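The Chrome-versus-curl example can be sketched as a rarity check against a peer-group baseline. This is a simplified stand-in, not Corelight's model: the function names, the 1% threshold, and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_baseline(history):
    """Count HTTP client families seen per peer group.

    history: iterable of (peer_group, client_family) pairs.
    """
    baseline = defaultdict(Counter)
    for group, client in history:
        baseline[group][client] += 1
    return baseline


def is_rare_for_group(baseline, group, client, min_share=0.01):
    """Flag a client whose share of the group's past traffic is below min_share."""
    counts = baseline.get(group)
    if not counts:
        return False  # no baseline for this group, so nothing to compare against
    total = sum(counts.values())
    return counts[client] / total < min_share


# Made-up history: the finance peer group browses with Chrome and Edge only.
history = [("finance", "chrome")] * 200 + [("finance", "edge")] * 50
baseline = build_baseline(history)

print(is_rare_for_group(baseline, "finance", "curl"))    # True
print(is_rare_for_group(baseline, "finance", "chrome"))  # False
```

Grouping by peer group matters here: curl is unremarkable on a build server, so the same observation is only anomalous relative to hosts that behave like the one being scored.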

Each ML detection exposes the underlying features and their deviation from baseline, so analysts can see exactly which variables triggered the score rather than trusting an opaque classification. Corelight calls this “transparent ML.” The three detection types feed into a unified alert queue in Investigator, scored and prioritized, with what Corelight calls “Day One” curated detections that ship out of the box.
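One way to picture exposing features and their deviation from baseline is a per-feature z-score report. The sketch below is an assumption about the general technique, not a description of Corelight's models; the feature names, threshold, and numbers are invented.

```python
import math
from statistics import mean, pstdev


def explain_deviations(history, observed, threshold=3.0):
    """Return (feature, observed, baseline_mean, z_score) for features beyond threshold.

    history: feature name -> list of past values for this host.
    observed: feature name -> current value.
    """
    report = []
    for name, values in history.items():
        mu = mean(values)
        sigma = pstdev(values)
        delta = observed[name] - mu
        if sigma == 0:
            # A perfectly flat baseline: any change is an unbounded deviation.
            z = 0.0 if delta == 0 else math.copysign(math.inf, delta)
        else:
            z = delta / sigma
        if abs(z) >= threshold:
            report.append((name, observed[name], mu, z))
    # Largest deviations first, so the analyst sees the main driver at the top.
    report.sort(key=lambda row: abs(row[3]), reverse=True)
    return report


# Made-up hourly baselines for one host.
history = {
    "dns_queries": [100, 110, 90, 105, 95],
    "bytes_out": [1000, 1200, 800, 1000],
}
observed = {"dns_queries": 400, "bytes_out": 1100}

for name, value, mu, z in explain_deviations(history, observed):
    print(f"{name}: observed={value} baseline_mean={mu} z={z:.1f}")
```

Returning the features alongside the score is what lets an analyst see that DNS volume, not outbound bytes, drove the detection.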

Corelight also uses LLM integration to auto-generate plain-English explanations of Suricata rules and Zeek notices. Since Zeek and Suricata rules are open source and well-documented on the internet, the LLM performs well on this task. The goal is to help junior SOC analysts understand what a detection means and what to investigate next without needing a senior analyst to interpret every alert.

How does it integrate with EDR and SIEMs?

Corelight uses Community ID, an open standard that generates a deterministic hash from the five-tuple (source IP, dest IP, source port, dest port, protocol) of each network session. EDR vendors like CrowdStrike generate the same Community ID for their endpoint telemetry. Because the hash depends only on the flow tuple and not on timestamps, logs from both tools can be correlated on the same session even if there is clock drift between systems.
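The hashing scheme can be sketched in a few lines. This follows the Community ID version 1 layout as commonly documented (seed, ordered endpoint addresses, protocol, a padding byte, ordered ports, SHA-1, base64) and handles only TCP and UDP; ICMP needs extra port mapping that is omitted here. Treat it as an illustration and use a maintained reference implementation in production.

```python
import base64
import hashlib
import ipaddress
import struct

PROTO_TCP = 6
PROTO_UDP = 17


def community_id_v1(src_ip, src_port, dst_ip, dst_port, proto, seed=0):
    """Compute a Community ID v1 style hash for a TCP or UDP flow (sketch)."""
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed

    # Put the "smaller" endpoint first so both directions of a flow
    # produce the same hash.
    if (src, src_port) > (dst, dst_port):
        src, dst = dst, src
        src_port, dst_port = dst_port, src_port

    data = (
        struct.pack("!H", seed)                      # 2-byte seed, network order
        + src
        + dst
        + struct.pack("!BB", proto, 0)               # protocol plus one padding byte
        + struct.pack("!HH", src_port, dst_port)     # ports, network order
    )
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")


forward = community_id_v1("10.0.0.5", 51000, "192.0.2.10", 443, PROTO_TCP)
reverse = community_id_v1("192.0.2.10", 443, "10.0.0.5", 51000, PROTO_TCP)
print(forward == reverse)  # True: direction does not matter
```

Nothing time-based goes into the hash, which is why two tools with drifting clocks still agree on the identifier for the same session.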

In practice, this means a CrowdStrike Falcon alert for PowerShell Empire execution on a host can be stitched to Corelight’s network view of that same session, showing what external IPs the host contacted, what domains it resolved, and what other internal hosts communicated with those same external indicators. Determining blast radius (which hosts talked to the attacker’s infrastructure, which internal hosts the compromised machine contacted) becomes a set of pivots in a search interface rather than the manual “spreadsheet of hell” of pulling switch configs and correlating MAC addresses.
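The pivots described above can be sketched over in-memory records. The connection fields mirror Zeek's conn.log (community_id is present when Community ID logging is enabled), while the EDR alert shape, the internal network range, and all data are assumptions for illustration.

```python
import ipaddress

INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")  # assumed internal range


def is_internal(ip):
    return ipaddress.ip_address(ip) in INTERNAL_NET


def blast_radius(edr_alert, conn_logs):
    """Pivot from one EDR alert to related hosts using shared network evidence."""
    compromised = edr_alert["host_ip"]

    # Pivot 1: find the network sessions that match the EDR alert.
    matched = [c for c in conn_logs if c["community_id"] == edr_alert["community_id"]]
    attacker_ips = {c["id.resp_h"] for c in matched if not is_internal(c["id.resp_h"])}

    # Pivot 2: which other internal hosts talked to the same external IPs?
    also_contacted = {
        c["id.orig_h"] for c in conn_logs if c["id.resp_h"] in attacker_ips
    } - {compromised}

    # Pivot 3: which internal hosts did the compromised machine contact?
    lateral_targets = {
        c["id.resp_h"]
        for c in conn_logs
        if c["id.orig_h"] == compromised and is_internal(c["id.resp_h"])
    }
    return attacker_ips, also_contacted, lateral_targets


# Made-up data: placeholder community IDs stand in for real hashes.
conn_logs = [
    {"community_id": "1:aaa", "id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.7"},
    {"community_id": "1:bbb", "id.orig_h": "10.0.0.8", "id.resp_h": "203.0.113.7"},
    {"community_id": "1:ccc", "id.orig_h": "10.0.0.5", "id.resp_h": "10.0.0.20"},
    {"community_id": "1:ddd", "id.orig_h": "10.0.0.9", "id.resp_h": "198.51.100.1"},
]
edr_alert = {"host_ip": "10.0.0.5", "community_id": "1:aaa"}

attacker_ips, also_contacted, lateral_targets = blast_radius(edr_alert, conn_logs)
print(attacker_ips, also_contacted, lateral_targets)
```

Each pivot is a filter over the same log set, which is what replaces the manual work of correlating switch configs and MAC addresses.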

Corelight ships SIEM apps and exporters for Splunk, Elastic, CrowdStrike Falcon LogScale, Snowflake, and generic JSON output. Some customers ingest all Corelight data into their SIEM. Others send only alerts to the SIEM (to control ingest costs) and keep the full network evidence in Investigator, linking out from SIEM alerts to deep-dive in the Corelight UI.
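The alerts-only pattern amounts to a routing decision per record. The sketch below is hypothetical: it assumes each JSON record carries a _path field naming its log stream, and the stream names and destination labels are placeholders rather than documented exporter settings.

```python
# Assumed names of the log streams that carry detections.
ALERT_PATHS = {"notice", "suricata_corelight"}


def route(record):
    """Return the destinations for one log record.

    Everything is kept as evidence; only detections also go to the SIEM,
    which keeps SIEM ingest volume (and cost) low.
    """
    destinations = ["evidence_store"]
    if record.get("_path") in ALERT_PATHS:
        destinations.append("siem")
    return destinations


print(route({"_path": "conn", "uid": "CaaaaaE1"}))
print(route({"_path": "notice", "note": "Example::Notice"}))
```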

What does cloud visibility look like?

Cloud sensors deploy natively in AWS, GCP, and Azure VPCs and capture east-west traffic between workloads. A significant finding from Corelight’s cloud deployments: a lot of intra-cloud traffic is unencrypted. Organizations that built layered encryption on-prem often have cloud environments where services communicate in cleartext inside the VPC.

Cloud environments also present an ephemeral device problem. Instances spin up and terminate constantly, and IP addresses get reassigned. During an incident response, the host you need to investigate may no longer exist. Corelight addresses this by pulling cloud provider metadata via API (AWS instance IDs, instance names, tags, department labels) and enriching network logs with that context at capture time. Even after an EC2 instance is terminated, the Corelight logs still show its instance ID, name, and organizational tags alongside the network activity.
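Enriching at capture time can be sketched as a lookup against a metadata cache that is copied into each log record as it is written. The class names, enriched field names, and instance details below are hypothetical; a real deployment would fill the cache from the cloud provider's API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class InstanceInfo:
    instance_id: str
    name: str
    tags: Dict[str, str] = field(default_factory=dict)


class MetadataCache:
    """Tracks which instance currently owns each IP address."""

    def __init__(self):
        self._by_ip = {}

    def assign(self, ip, info):
        self._by_ip[ip] = info

    def release(self, ip):
        self._by_ip.pop(ip, None)

    def enrich(self, record):
        """Copy the current owner's metadata into a new log record."""
        enriched = dict(record)
        info = self._by_ip.get(record["id.orig_h"])
        if info is not None:
            enriched["orig_instance_id"] = info.instance_id
            enriched["orig_instance_name"] = info.name
            enriched["orig_tags"] = dict(info.tags)
        return enriched


cache = MetadataCache()
cache.assign("10.1.2.3", InstanceInfo("i-0aaa111", "billing-worker", {"dept": "finance"}))
first = cache.enrich({"id.orig_h": "10.1.2.3", "id.resp_h": "10.1.9.9"})

# The instance terminates and the IP is reassigned to a different workload.
cache.release("10.1.2.3")
cache.assign("10.1.2.3", InstanceInfo("i-0bbb222", "web-frontend", {"dept": "marketing"}))
second = cache.enrich({"id.orig_h": "10.1.2.3", "id.resp_h": "10.1.9.9"})

print(first["orig_instance_id"], second["orig_instance_id"])
```

Because the metadata is written into the record rather than looked up at query time, the first record still identifies the terminated instance even though its IP now belongs to something else.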

Risky Business appearances

  • RB #804 (Aug 2025) - Greg Bell on AI/MCP integration with network data
  • RBNEWSSI102 (Oct 2025) - Catalin Cimpanu + Ashish Malpani on NDR evolution
  • RB #773 (Dec 2024) - Vijit Nair on cloud detection
  • RB #716 (Aug 2023) - Brian Dye sponsor interview
  • RB #703 (Apr 2023) - Brian Dye on AI-driven NDR features
  • Corelight Open NDR Platform Demo - James Pope product walkthrough

Disclosure

Corelight is a long-running Risky Business sponsor.