asi · performing-log-source-onboarding-in-siem

Perform structured log source onboarding into SIEM platforms by configuring collectors, parsers, normalization, and validation for complete security visibility.

install
source · Clone the upstream repo
git clone https://github.com/plurigrid/asi
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/asi/skills/performing-log-source-onboarding-in-siem" ~/.claude/skills/plurigrid-asi-performing-log-source-onboarding-in-siem && rm -rf "$T"
manifest: plugins/asi/skills/performing-log-source-onboarding-in-siem/SKILL.md
source content

Performing Log Source Onboarding in SIEM

Overview

Log source onboarding is the systematic process of integrating new data sources into a SIEM platform to enable security monitoring and detection. Proper onboarding requires planning data sources, configuring collection agents, building parsers, normalizing fields to a common schema, and validating data quality. According to the UK NCSC, onboarding should prioritize log sources that provide the highest security value relative to their ingestion cost.

When to Use

  • When conducting security assessments that involve log source onboarding into a SIEM
  • When following incident response procedures for related security events
  • When performing scheduled security testing or auditing activities
  • When validating security controls through hands-on testing

Prerequisites

  • SIEM platform deployed (Splunk, Elastic, Sentinel, QRadar, or similar)
  • Network access from source systems to SIEM collectors
  • Administrative access on source systems for agent installation
  • Common Information Model (CIM) or equivalent schema documentation
  • Change management approval for production system modifications

Log Source Priority Framework

Tier 1 - Critical (Onboard First)

Source | Log Type | Security Value
Active Directory | Security Event Logs | Authentication, privilege escalation
Firewalls | Traffic logs | Network access, C2 detection
EDR/AV | Endpoint alerts | Malware, process execution
VPN/Remote Access | Connection logs | Unauthorized access
DNS Servers | Query logs | C2 beaconing, data exfiltration
Email Gateway | Email security logs | Phishing, BEC

Tier 2 - High Priority

Source | Log Type | Security Value
Web Proxy | HTTP/HTTPS logs | Web-based attacks, data exfiltration
Cloud platforms (AWS/Azure/GCP) | Audit logs | Cloud security posture
Database servers | Audit/query logs | Data access, SQL injection
DHCP/IPAM | Address allocation | Asset tracking
File servers | Access logs | Data access monitoring

Tier 3 - Standard

Source | Log Type | Security Value
Application servers | App logs | Application-level attacks
Print servers | Print logs | Data loss prevention
Badge/physical access | Access logs | Physical security correlation
Network devices (switches/routers) | Syslog | Network anomalies
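The tiering above amounts to ranking sources by security value relative to ingestion cost. A minimal sketch of that ranking, with illustrative scores and volumes (not from this guide), might look like:

```python
# Rank candidate log sources by security value per GB of daily ingestion.
# The scores and volumes below are illustrative placeholders; replace
# them with your own assessment and measured volumes.
def rank_sources(sources):
    """Return sources sorted by value-per-GB, highest first."""
    return sorted(sources, key=lambda s: s["value"] / s["gb_per_day"],
                  reverse=True)

candidates = [
    {"name": "Active Directory", "value": 10, "gb_per_day": 5},
    {"name": "DNS Servers",      "value": 9,  "gb_per_day": 40},
    {"name": "Print servers",    "value": 2,  "gb_per_day": 2},
]

for s in rank_sources(candidates):
    print(f'{s["name"]}: {s["value"] / s["gb_per_day"]:.2f} value/GB')
```

Note how a high-value but very high-volume source (DNS) can rank below a cheap low-value one; this is why the NCSC guidance weighs value against cost rather than value alone.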

Onboarding Process

Step 1: Discovery and Assessment

1. Identify the log source:
   - System type and version
   - Log format (syslog, CEF, JSON, Windows Events, etc.)
   - Log volume estimate (EPS - events per second)
   - Network location and firewall requirements

2. Assess security value:
   - What threats can this source help detect?
   - Which MITRE ATT&CK techniques does it cover?
   - Is there an existing SIEM parser?

3. Estimate ingestion cost:
   - Daily volume in GB
   - License impact (per-GB or per-EPS pricing)
   - Storage retention requirements
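Converting an EPS estimate into daily gigabytes is simple arithmetic; a quick sketch, assuming you have measured an average event size for the source:

```python
# Estimate daily ingestion volume from an events-per-second rate.
# avg_event_bytes is an assumption you should measure per source.
def daily_gb(eps, avg_event_bytes):
    """Daily volume in GB for a sustained events-per-second rate."""
    return eps * avg_event_bytes * 86400 / 1e9

# e.g. a firewall at 500 EPS with ~300-byte events
print(f"{daily_gb(500, 300):.1f} GB/day")
```

Multiply the result by your retention window (hot plus cold) to size storage, and compare against per-GB license headroom before approving the source.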

Step 2: Configure Log Collection

Syslog-Based Collection (Firewalls, Network Devices)

# rsyslog configuration for receiving syslog
# /etc/rsyslog.d/10-siem-collection.conf

# UDP reception
module(load="imudp")
input(type="imudp" port="514" ruleset="siem_forwarding")

# TCP reception (imtcp may only be loaded once)
module(load="imtcp")
input(type="imtcp" port="514" ruleset="siem_forwarding")

# TLS reception (per-input stream driver parameters require
# rsyslog 8.2106+; certificates are configured via global() directives)
input(type="imtcp" port="6514" ruleset="siem_forwarding"
      StreamDriver.Name="gtls" StreamDriver.Mode="1"
      StreamDriver.AuthMode="x509/name")

ruleset(name="siem_forwarding") {
    # Forward to SIEM
    action(type="omfwd" target="siem.company.com" port="9514"
           protocol="tcp" queue.type="LinkedList"
           queue.filename="siem_fwd" queue.maxdiskspace="1g"
           queue.saveonshutdown="on" action.resumeRetryCount="-1")
}
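Once the collector is listening, send a test message before pointing production devices at it. A minimal sketch using only the standard library; the collector address and hostname are placeholders:

```python
import socket
import time

# Send a test RFC 3164 message over UDP to confirm the collector is
# reachable; 514/UDP matches the imudp input above. Replace COLLECTOR
# with your rsyslog host.
COLLECTOR = ("127.0.0.1", 514)

def send_test_syslog(msg, pri=134):  # 134 = facility local0, severity info
    timestamp = time.strftime("%b %d %H:%M:%S")
    payload = f"<{pri}>{timestamp} testhost siem-onboarding: {msg}"
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload.encode(), COLLECTOR)
    sock.close()
    return payload

print(send_test_syslog("collector reachability check"))
```

UDP delivery is fire-and-forget, so confirm arrival on the collector side (e.g. by searching the SIEM for the test hostname) rather than trusting the send alone.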

Windows Event Log Collection (Splunk Universal Forwarder)

# inputs.conf on Splunk Universal Forwarder
[WinEventLog://Security]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:Security
evt_resolve_ad_obj = 1
checkpointInterval = 5

[WinEventLog://System]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:System

[WinEventLog://Microsoft-Windows-Sysmon/Operational]
disabled = 0
index = wineventlog
sourcetype = XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
renderXml = true

[WinEventLog://Microsoft-Windows-PowerShell/Operational]
disabled = 0
index = wineventlog
sourcetype = XmlWinEventLog:Microsoft-Windows-PowerShell/Operational

Cloud Log Collection (AWS CloudTrail)

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Resources": {
    "CloudTrailToSIEM": {
      "Type": "AWS::CloudTrail::Trail",
      "Properties": {
        "TrailName": "siem-cloudtrail",
        "S3BucketName": "company-cloudtrail-logs",
        "IsLogging": true,
        "IsMultiRegionTrail": true,
        "IncludeGlobalServiceEvents": true,
        "EnableLogFileValidation": true,
        "EventSelectors": [
          {
            "ReadWriteType": "All",
            "IncludeManagementEvents": true,
            "DataResources": [
              {
                "Type": "AWS::S3::Object",
                "Values": ["arn:aws:s3"]
              }
            ]
          }
        ]
      }
    }
  }
}
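Before deploying, it is worth asserting the trail properties that matter for SIEM coverage. A minimal pre-deployment check, mirroring the template above as a plain dict rather than calling AWS (the check names and messages are illustrative):

```python
# Sanity-check the CloudTrail properties that affect SIEM coverage.
TRAIL_PROPS = {
    "TrailName": "siem-cloudtrail",
    "IsLogging": True,
    "IsMultiRegionTrail": True,
    "IncludeGlobalServiceEvents": True,
    "EnableLogFileValidation": True,
    "EventSelectors": [
        {"ReadWriteType": "All", "IncludeManagementEvents": True},
    ],
}

def validate_trail(props):
    """Return a list of misconfigurations that weaken SIEM coverage."""
    problems = []
    if not props.get("IsLogging"):
        problems.append("trail exists but is not logging")
    if not props.get("IsMultiRegionTrail"):
        problems.append("single-region trail may miss events")
    if not props.get("EnableLogFileValidation"):
        problems.append("log file integrity cannot be verified")
    if not any(s.get("IncludeManagementEvents")
               for s in props.get("EventSelectors", [])):
        problems.append("management events excluded")
    return problems

assert validate_trail(TRAIL_PROPS) == []
```

The same checks can be run against a live trail's `GetTrailStatus`/`GetEventSelectors` output once deployed.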

Step 3: Parse and Normalize

Custom Parser Example (Splunk props.conf/transforms.conf)

# props.conf
[custom:firewall:logs]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%z
MAX_TIMESTAMP_LOOKAHEAD = 30
REPORT-firewall = firewall_extract_fields
FIELDALIAS-src = src_addr AS src_ip
FIELDALIAS-dst = dst_addr AS dest_ip
EVAL-action = case(fw_action=="allow", "allowed", fw_action=="deny", "blocked", true(), "unknown")
EVAL-vendor_product = "Custom Firewall"
LOOKUP-geo = geo_ip_lookup ip AS dest_ip OUTPUT country, city, latitude, longitude

# transforms.conf
[firewall_extract_fields]
REGEX = ^(\S+)\s+(\S+)\s+action=(\w+)\s+src=(\S+):(\d+)\s+dst=(\S+):(\d+)\s+proto=(\w+)\s+bytes=(\d+)
FORMAT = timestamp::$1 hostname::$2 fw_action::$3 src_addr::$4 src_port::$5 dst_addr::$6 dst_port::$7 protocol::$8 bytes::$9
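Test the extraction regex against a sample event before deploying it; the sample line below is an assumed example of this feed's format, not taken from a real device:

```python
import re

# The same pattern as the transforms.conf REGEX above.
FW_REGEX = re.compile(
    r"^(\S+)\s+(\S+)\s+action=(\w+)\s+src=(\S+):(\d+)"
    r"\s+dst=(\S+):(\d+)\s+proto=(\w+)\s+bytes=(\d+)"
)

sample = ("2024-05-01T12:00:00+0000 fw01 action=deny "
          "src=10.0.0.5:51515 dst=203.0.113.9:443 proto=tcp bytes=1420")

m = FW_REGEX.match(sample)
fields = dict(zip(
    ["timestamp", "hostname", "fw_action", "src_addr", "src_port",
     "dst_addr", "dst_port", "protocol", "bytes"], m.groups()))
print(fields)
```

Run this against a few dozen real lines from the source: any line the regex fails to match will surface as events with missing fields in the coverage checks in Step 4.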

CIM Field Mapping

Raw Field | CIM Field | Data Model
src_addr | src_ip | Network_Traffic
dst_addr | dest_ip | Network_Traffic
dst_port | dest_port | Network_Traffic
fw_action | action | Network_Traffic
bytes_sent + bytes_recv | bytes | Network_Traffic
user_name | user | Authentication
login_result | action | Authentication
process_path | process | Endpoint
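The mapping table above is effectively a rename plus one derived field. A minimal sketch of the same normalization in Python, useful for unit-testing a mapping before encoding it in the SIEM's props/transforms:

```python
# Rename raw fields to their CIM equivalents, per the table above.
CIM_MAP = {
    "src_addr": "src_ip",
    "dst_addr": "dest_ip",
    "dst_port": "dest_port",
    "fw_action": "action",
    "user_name": "user",
    "process_path": "process",
}

def normalize(raw):
    out = {CIM_MAP.get(k, k): v for k, v in raw.items()}
    # Network_Traffic "bytes" is the sum of both directions
    if "bytes_sent" in raw and "bytes_recv" in raw:
        out["bytes"] = int(raw["bytes_sent"]) + int(raw["bytes_recv"])
    return out

event = {"src_addr": "10.0.0.5", "dst_port": "443",
         "fw_action": "deny", "bytes_sent": "900", "bytes_recv": "520"}
print(normalize(event))
```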

Step 4: Validate Data Quality

# Verify events are arriving
index=new_source earliest=-1h
| stats count by sourcetype, host, source

# Check field extraction quality
index=new_source earliest=-1h
| stats count(src_ip) as has_src count(dest_ip) as has_dest count(action) as has_action count by sourcetype
| eval src_coverage=round(has_src/count*100,1)
| eval dest_coverage=round(has_dest/count*100,1)
| eval action_coverage=round(has_action/count*100,1)

# Verify CIM compliance
| datamodel Network_Traffic search
| search sourcetype=new_sourcetype
| stats count by source, sourcetype

# Check for timestamp parsing issues
index=new_source earliest=-1h
| eval time_diff=abs(_time - _indextime)
| stats avg(time_diff) as avg_lag max(time_diff) as max_lag by host
| where avg_lag > 300

Step 5: Enable Detection Coverage

# Verify the new source feeds existing data models
# (tstats must be the first command in a search)
| tstats count from datamodel=Authentication
    where sourcetype=new_sourcetype by _time span=1h

# Create source-specific detection rule
[New Source - Authentication Anomaly]
search = index=new_source sourcetype=new_sourcetype action=failure \
| stats count by src_ip, user \
| where count > 10
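The saved search above, sketched in Python over already-normalized events, makes the detection logic easy to unit-test with synthetic data (the event shape and threshold mirror the SPL):

```python
from collections import Counter

# Count authentication failures per (src_ip, user) and flag pairs
# exceeding the threshold, like the saved search above.
def brute_force_candidates(events, threshold=10):
    counts = Counter((e["src_ip"], e["user"])
                     for e in events if e.get("action") == "failure")
    return {pair: n for pair, n in counts.items() if n > threshold}

events = [{"src_ip": "10.0.0.5", "user": "admin", "action": "failure"}] * 12
print(brute_force_candidates(events))
```

Feeding synthetic failures like this through the real pipeline (source → collector → parser → rule) is also a good end-to-end test that the rule actually fires on the new sourcetype.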

Onboarding Checklist

  • Log source assessed and approved
  • Network connectivity verified
  • Collection agent/method configured
  • Log forwarding confirmed
  • Parser/field extraction configured
  • CIM compliance validated
  • Data model acceleration enabled
  • Volume within license budget
  • Retention policy configured
  • Detection rules enabled/created
  • Dashboard updated
  • Documentation completed
  • SOC team notified
