
One More Time on SIEM Telemetry / Log Sources …

(cross posted from Dark Reading, and inspired by a previous version of this blog)

Cyberpunk IT telemetry via Dall-E

For years, organizations deploying Security Information and Event Management (SIEM) or similar tools have struggled with deciding what data to collect inside their security operations platforms. So the dreaded question, “what data sources to integrate into my SIEM first?”, lives on.

How to approach answering this?

First, the best answer to this question is “output-driven SIEM”: what your SIEM collects depends on your security monitoring needs and use cases, and on how you prioritize them according to your risks. Any popular list of top log sources aggregated from many organizations will end up being useless for organizations with different security needs and challenges!
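As a rough illustration of the output-driven idea, the sketch below ranks log sources by the summed priority of the use cases that need them. The use case names, the 1 to 5 priority scale, and the source labels are all hypothetical placeholders, not a recommended list.

```python
from collections import defaultdict

# Hypothetical use cases: (name, risk priority 1-5, log sources required).
USE_CASES = [
    ("Compromised cloud credentials", 5, ["cloud_audit", "idp"]),
    ("Ransomware on endpoints", 4, ["edr", "dns"]),
    ("Data exfiltration via web", 3, ["web_proxy", "dns"]),
]


def rank_log_sources(use_cases):
    """Rank log sources by the summed priority of the use cases that need them."""
    scores = defaultdict(int)
    for _name, priority, sources in use_cases:
        for source in sources:
            scores[source] += priority
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_log_sources(USE_CASES)
for source, score in ranking:
    print(f"{source}: {score}")
```

The point of the sketch is only that the ranking falls out of your own use cases and risks, so a different organization feeding in different use cases would get a different list.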

While an output-driven SIEM approach has been known for 10+ years, many organizations are still looking for best practices in collection before they decide how they plan to use the data. In fact, large organizations often make the decision to integrate a log source into their SIEM or SecOps platform based on factors other than pure security necessity.

Overall, such factors often include:

  • Necessity for detection
  • Necessity for alert triage and incident response
  • Necessity as context data for utilizing another log source
  • Compliance requirements to collect and retain specific log types
  • Compliance requirements to monitor this data source and/or system
  • Ease of integration of the log source
  • Collector and parser availability from the vendor
  • Ability to actually transfer the log data to a SIEM
  • Other planned log sources that compete for attention
  • Data volume of the log source

Finally, if a SIEM product charges per volume of data collected, the cost of introducing a new data source into the platform may be one of the deciding factors. For example, will you include a data source that will consume 10% of your overall SIEM license if you only plan to use it as context (valuable though it may be) for another data source, that is, if you don’t plan to write any detection rules or apply other detection logic based on this telemetry? A popular example here would be DHCP logs: how many detections rely solely on DHCP logs? None, or very few at most [many Chronicle clients do, BTW].
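The license arithmetic above can be sketched in a few lines. This is a minimal example assuming a volume-based license measured in GB per day. The 5% threshold for context-only sources and the sample volumes are made-up numbers for illustration.

```python
def license_share(source_gb_per_day, licensed_gb_per_day):
    """Fraction of the daily license consumed by one log source."""
    if licensed_gb_per_day <= 0:
        raise ValueError("licensed volume must be positive")
    return source_gb_per_day / licensed_gb_per_day


def needs_cost_review(source_gb_per_day, licensed_gb_per_day,
                      detection_rules, max_context_share=0.05):
    """Flag a context-only source (no detections) above the share threshold."""
    share = license_share(source_gb_per_day, licensed_gb_per_day)
    return detection_rules == 0 and share > max_context_share


# Hypothetical numbers: a 500 GB/day license and a 50 GB/day DHCP feed
# that no detection rule depends on.
share = license_share(50, 500)
flag = needs_cost_review(50, 500, detection_rules=0)
print(f"share={share:.0%}, review={flag}")
```

A flagged source is not necessarily a source to drop. The flag only marks the place where the cost versus context value conversation needs to happen.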

As a result, experiences with SIEM deployments (going back to 2002) taught us that few people will include DNS or DHCP logs during their initial phases of SIEM roll-out. In fact, some will never include them in their SIEM at all! When asked why, those people explain that while they are convinced of the general utility of DNS logs, they do not see much value in each individual message that costs money to collect.

These logs are essentially “sparse value logs” where the value is in getting the bulk rather than in getting some particularly valuable messages like, say, Windows Security Event ID 1102 (the audit log was cleared). As a result, SIEM operators have doubts about paying for inclusion of this data into their SIEM.

In fact, this gave rise to an architecture where one product is used for high-value logs while another product augments it by storing more voluminous logs. Such architectures do work if there are good APIs in the products (such as to query one telemetry repository from another), but it is useful to remember that they do not offer advantages other than cost.
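A two-tier setup of this kind boils down to a routing decision per log source. The sketch below is a simplified illustration: the tier names, the set of "high-value" sources, and the event shape (a dict with a "source" key) are assumptions, not any product's actual configuration.

```python
# Hypothetical tier assignment; real choices depend on your use cases.
HIGH_VALUE_SOURCES = {"edr", "idp", "cloud_audit", "windows_security"}


def route(event):
    """Return the destination tier for an event dict with a 'source' key."""
    if event.get("source") in HIGH_VALUE_SOURCES:
        return "siem"
    return "bulk_store"


def split_by_tier(events):
    """Group events by the tier they would be sent to."""
    tiers = {"siem": [], "bulk_store": []}
    for event in events:
        tiers[route(event)].append(event)
    return tiers


events = [
    {"source": "edr", "msg": "process created"},
    {"source": "dns", "msg": "query example.com"},
    {"source": "dhcp", "msg": "lease granted"},
]
tiers = split_by_tier(events)
print({tier: len(items) for tier, items in tiers.items()})
```

The routing is the easy part. The hard part is the cross-repository query path an analyst needs when a SIEM alert has to be enriched with DNS or DHCP records held in the bulk store.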

Naturally, I want to test my assumptions, so I obviously did this (and this too)

X analysis https://twitter.com/anton_chuvakin/status/1750608530296721584

So, yes, top log sources change over time: the firewall and server logs that flooded SIEM tools in the early 2000s have been supplemented with new critical sources such as:

  • All types of cloud logs (CloudTrail, Cloud Audit Logs, VPC flow logs, etc.)
  • Sysmon and EDR telemetry [or at least EDR alerts]
  • Identity provider logs (Okta, Ping, Entra ID, etc.)
  • O365 and Workspace logs, and other key SaaS application logs
  • API access logs from various applications and platforms
  • Development environment logs, CI/CD pipeline logs, Terraform logs
  • Container and Kubernetes logs (such as Kubernetes audit logs)
  • Enterprise browser logs

At the same time, some of the classic sources remain very popular and very useful:

  • IT and security tool management console access logs (from VPN, UTM, EDR, and even SIEM and SOAR themselves)
  • VPN and various zero trust ecosystem logs
  • Web proxy logs (yay!)

Also, some log sources qualify as “newly popular” even though some organizations have been collecting them for years if not decades. These include:

  • Business application logs [early SIEMs made a half-assed attempt to collect/analyze those, but never really delivered…]
  • DLP and other data-aware security technologies (such as emerging Data Detection and Response?)
  • Email logs (likely overlap with popular SaaS application logs)

Finally, if you integrate a new log source type, make sure that you monitor that its telemetry is actually arriving in your SIEM!
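One simple way to do this is to track the last time each expected source sent anything and alert when the silence exceeds what is normal for that source. The sketch below assumes you can obtain a last-seen timestamp per source from your platform. The source names and silence thresholds are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_silent_sources(last_seen, max_silence, now=None):
    """Return sources whose latest event is older than their allowed silence.

    Sources that are expected but have never been seen count as silent.
    """
    now = now or datetime.now(timezone.utc)
    silent = []
    for source, allowed in max_silence.items():
        seen = last_seen.get(source)
        if seen is None or now - seen > allowed:
            silent.append(source)
    return sorted(silent)


# Fixed "now" so the example is reproducible.
now = datetime(2024, 1, 25, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "firewall": now - timedelta(minutes=2),
    "okta": now - timedelta(hours=3),
}
max_silence = {
    "firewall": timedelta(minutes=15),
    "okta": timedelta(hours=1),
    "k8s_audit": timedelta(hours=1),  # expected, but never seen
}
silent = find_silent_sources(last_seen, max_silence, now=now)
print(silent)  # ['k8s_audit', 'okta']
```

Per-source thresholds matter here because a chatty firewall going quiet for 15 minutes is suspicious, while a low-volume SaaS audit feed may legitimately be silent for hours.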

So, to have your SIEM perform better, do the following:

  • Practice “output-driven” SIEM as this approach increases the chance of collected log data being useful for your detection and response efforts
  • Include logs that are of key investigative value, and also logs useful as context (such as DNS and DHCP logs)
  • Review your current collection posture, and align with your detection use cases
  • Evolve the collection as your needs, risks, and IT environment change (for example, add cloud logs when cloud use for business increases)
  • Read the IDC Business Value study to see how organizations are ingesting more data with Google Security Operations [vendor hat on here!]
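The review step in the list above can be approximated as a set comparison between what your detection use cases need and what you actually collect. This is a minimal sketch with hypothetical use case names and source labels.

```python
def collection_gaps(required_by_use_case, collected):
    """Compare sources required by use cases against sources collected.

    Returns (missing, unreferenced): sources needed but not collected,
    and sources collected but not referenced by any use case.
    """
    required = set().union(*required_by_use_case.values())
    collected = set(collected)
    return sorted(required - collected), sorted(collected - required)


# Hypothetical inputs for illustration.
required_by_use_case = {
    "Account takeover": ["idp", "vpn"],
    "Cloud misuse": ["cloud_audit", "idp"],
}
collected = {"idp", "firewall", "vpn"}

missing, unreferenced = collection_gaps(required_by_use_case, collected)
print("missing:", missing)
print("unreferenced:", unreferenced)
```

An "unreferenced" source is not automatically waste, since it may be collected for compliance or as investigative context. It is simply a prompt to write down why it is there.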


One More Time on SIEM Telemetry / Log Sources … was originally published in Anton on Security on Medium, where people are continuing the conversation by highlighting and responding to this story.

*** This is a Security Bloggers Network syndicated blog from Stories by Anton Chuvakin on Medium authored by Anton Chuvakin. Read the original post at: https://medium.com/anton-on-security/one-more-time-on-siem-telemetry-log-sources-b0a88572dac9?source=rss-11065c9e943e------2