TotallyWildAi · splunk-detection-poc

Five Splunk detection capabilities, end to end.

A self-contained Splunk Enterprise POC running on AWS — built in code, deployed by CI, including AI-assisted detection authoring with a human review gate. Every piece of state is reproducible from git in ~15 minutes.

8 detections deployed
3 CIM datamodels accelerated
6 CI deploy jobs
~15 min full rebuild from git
~$70/mo AWS infra, business hours

1. Data parsing & ingestion

Approach. Splunk gets data through purpose-built TAs layered over modular inputs. The TA-aws account and input stanzas are versioned as YAML in taaws-config/config.yml and pushed via REST on every CI run, so reproducing them takes no UI clicks.
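
As a rough illustration of the YAML-to-REST step, the sketch below encodes one input stanza as the form body a REST POST would carry. The dict stands in for an entry parsed from taaws-config/config.yml; its layout, the key names, and every value (account name, queue URL, index) are assumptions, not the repo's actual config.

```python
from urllib.parse import urlencode

# Hypothetical input entry, as it might look after parsing config.yml.
# Key names follow common aws_sqs_based_s3 settings but are assumptions here.
cloudtrail_input = {
    "name": "cloudtrail_sqs",
    "aws_account": "poc-account",
    "sqs_queue_url": "https://sqs.us-east-1.amazonaws.com/123456789012/cloudtrail-queue",
    "sqs_queue_region": "us-east-1",
    "s3_file_decoder": "CloudTrail",
    "sourcetype": "aws:cloudtrail",
    "index": "aws",
    "interval": 300,
}


def to_form_body(stanza: dict) -> str:
    """Encode one input stanza as an application/x-www-form-urlencoded body."""
    # Drop unset keys so they fall back to the TA's defaults.
    fields = {key: str(value) for key, value in stanza.items() if value is not None}
    return urlencode(fields)


body = to_form_body(cloudtrail_input)
print(body)
```

Because the body is derived only from versioned data, re-running the same CI job against a fresh instance would send the same request.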

What's running

Demonstrable artifacts

2. Data Model Acceleration (DMA)

Approach. CIM-fit first for every detection, raw-SPL fallback second. Acceleration configured as code in tw_cim_accel/default/datamodels.conf; install-apps.sh syncs that file into Splunk_SA_CIM/local/datamodels.conf on every deploy — the only path Splunk reliably honors for DMA overrides (verified the hard way).
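
To make "acceleration as code" concrete, here is a minimal sketch that renders the kind of stanzas a file like tw_cim_accel/default/datamodels.conf might contain. Only the Change datamodel is named on this page; the other two model names and all summary ranges below are placeholders, not the POC's actual settings.

```python
def render_acceleration_conf(models: dict) -> str:
    """Render datamodels.conf stanzas that enable acceleration per datamodel."""
    lines = []
    for model, earliest in sorted(models.items()):
        lines.append(f"[{model}]")
        lines.append("acceleration = true")
        lines.append(f"acceleration.earliest_time = {earliest}")
        lines.append("")  # blank line between stanzas
    return "\n".join(lines)


# Placeholder models and ranges (assumptions for illustration only).
conf_text = render_acceleration_conf({
    "Change": "-7d",
    "Authentication": "-7d",
    "Network_Traffic": "-1d",
})
print(conf_text)
```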

What's running

Demonstrable artifacts

3. Splunk detection engineering

Approach. Detection content is YAML, schema-validated, ATT&CK-mapped (against the current v19.1 framework), written in CIM-aligned SPL where the datamodel covers the data and well-documented raw SPL where it doesn't. Every rule is ES-compatible — drops into a licensed ES instance as a correlation search without rewrite.

What's running — 8 detections in detections/aws/

Detection                                 ATT&CK              Pattern
CloudTrail Disabled                       T1685.002           |tstats over datamodel=Change
Console Login Without MFA                 T1078.004           raw SPL (MFA flag not in CIM)
Console Login Failures Burst              T1110.001           time-windowed stats
IAM CreateAccessKey For Different User    T1098 / T1078.004   cross-field SPL comparison
Security Group Open To World              T1133 / T1190       spath + mvexpand on nested JSON
S3 Bucket Policy Public Write             T1567 / T1530       raw SPL on policy JSON
Root Account Usage                        T1078.004           raw SPL
IAM Password Spray Test (AI-drafted)      T1110.003           raw SPL, scheduled, alerting
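
The "spath + mvexpand on nested JSON" pattern can be pictured outside SPL. The Python sketch below is an analogue, not the deployed rule: it walks the nested permission lists of a security-group ingress event and emits one flat row per rule open to the world. The event shape follows the usual CloudTrail AuthorizeSecurityGroupIngress layout, but treat the field names and the sample event as assumptions.

```python
WORLD_CIDRS = ("0.0.0.0/0", "::/0")


def open_to_world_rules(event: dict) -> list:
    """Return one flat row per ingress rule that allows any source address."""
    rows = []
    params = event.get("requestParameters") or {}
    permissions = (params.get("ipPermissions") or {}).get("items", [])
    for perm in permissions:  # comparable to mvexpand over the permission list
        v4 = (perm.get("ipRanges") or {}).get("items", [])
        v6 = (perm.get("ipv6Ranges") or {}).get("items", [])
        cidrs = [r.get("cidrIp") for r in v4] + [r.get("cidrIpv6") for r in v6]
        for cidr in cidrs:
            if cidr in WORLD_CIDRS:
                rows.append({
                    "group_id": params.get("groupId"),
                    "from_port": perm.get("fromPort"),
                    "to_port": perm.get("toPort"),
                    "cidr": cidr,
                })
    return rows


# Made-up sample event: SSH open to the world, HTTPS limited to a private range.
sample_event = {
    "eventName": "AuthorizeSecurityGroupIngress",
    "requestParameters": {
        "groupId": "sg-0123456789abcdef0",
        "ipPermissions": {"items": [
            {"fromPort": 22, "toPort": 22,
             "ipRanges": {"items": [{"cidrIp": "0.0.0.0/0"}]}, "ipv6Ranges": {}},
            {"fromPort": 443, "toPort": 443,
             "ipRanges": {"items": [{"cidrIp": "10.0.0.0/8"}]}, "ipv6Ranges": {}},
        ]},
    },
}
findings = open_to_world_rules(sample_event)
print(findings)
```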

Each YAML carries: ATT&CK technique IDs, kill-chain phase, CIS/NIST mappings, false-positive tuning hints, prereqs, references to Splunk's security_content, AWS docs, and MITRE.
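
A schema gate over fields like these could look roughly like the sketch below. The required-key set and key names are hypothetical stand-ins (the real rules live in the repo's SCHEMA.md), and the detection is shown as an already-parsed dict to keep the example dependency-free.

```python
import re

# Hypothetical required keys; the authoritative list is SCHEMA.md in the repo.
REQUIRED_KEYS = {"name", "description", "search", "attack_techniques", "false_positives"}
# Matches IDs such as T1110 or T1110.003.
TECHNIQUE_ID = re.compile(r"^T\d{4}(\.\d{3})?$")


def validate_detection(doc: dict) -> list:
    """Return a list of human-readable schema errors (empty means valid)."""
    errors = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(doc))]
    for technique in doc.get("attack_techniques", []):
        if not TECHNIQUE_ID.match(technique):
            errors.append(f"bad ATT&CK technique ID: {technique}")
    return errors


good = {
    "name": "Console Login Failures Burst",
    "description": "Burst of failed console logins",
    "search": "index=aws eventName=ConsoleLogin",
    "attack_techniques": ["T1110.001"],
    "false_positives": "Users mistyping passwords after a reset",
}
bad = {key: value for key, value in good.items() if key != "search"}
bad["attack_techniques"] = ["T11"]

print(validate_detection(good))
print(validate_detection(bad))
```

Returning every error at once, rather than failing on the first, tends to make a CI log easier to act on.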

Demonstrable artifacts

4. Detections as Code (CI/CD)

Approach. Two-pipeline split — infra changes don't trigger content deploys, content changes don't trigger infra plans. GitHub Actions, OIDC-authenticated AWS access. Zero static credentials; Splunk management port :8089 is never publicly exposed.

Pipelines

terraform.yml

Plan on PR (artifact), apply on push to main. Triggers only on terraform/** changes.

splunk-config.yml — 6 jobs

  1. validate — YAML schema check on detections; .conf sanity on apps-src (BOM, CRLF, duplicate stanzas)
  2. deploy-apps — builds .tgz from apps-src/, syncs to S3, SSM-triggers install-apps.sh
  3. deploy-sse-content — REST PUT to SSE custom_content KV-store
  4. deploy-detections — REST POST to /services/saved/searches
  5. deploy-taaws-config — REST POST to /servicesNS/.../Splunk_TA_aws/data/inputs/aws_sqs_based_s3
  6. deploy-sse-data-inventory — REST PATCH on data_inventory_products KV-store
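
The .conf sanity checks named in the validate job (BOM, CRLF, duplicate stanzas) might be implemented along these lines. This is a minimal sketch under the assumption that each file is read as raw bytes; the function name and messages are illustrative, not the repo's actual validator.

```python
def lint_conf(raw: bytes) -> list:
    """Flag a UTF-8 BOM, CRLF line endings, and duplicate stanza headers."""
    problems = []
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("UTF-8 BOM at start of file")
        raw = raw[3:]  # strip the BOM so the first stanza header parses cleanly
    if b"\r\n" in raw:
        problems.append("CRLF line endings")
    seen = set()
    for line in raw.decode("utf-8").splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            if line in seen:
                problems.append(f"duplicate stanza: {line}")
            seen.add(line)
    return problems


dirty = b"\xef\xbb\xbf[Change]\r\nacceleration = true\r\n[Change]\r\nacceleration = false\r\n"
clean = b"[Change]\nacceleration = true\n"
print(lint_conf(dirty))
print(lint_conf(clean))
```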

All four REST deploys (jobs 3 to 6) follow the same pattern: CI uploads the payload to S3 and sends one SSM SendCommand; the EC2 instance fetches the payload and runs it locally. Port 8089 is never exposed publicly.
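
A sketch of the SSM step, assuming the stock AWS-RunShellScript document: the function below only builds the keyword arguments that could be handed to boto3's `ssm.send_command`, so nothing is sent. The instance ID, bucket, key, and script path are hypothetical placeholders.

```python
def build_send_command(instance_id: str, bucket: str, key: str, script: str) -> dict:
    """Build SendCommand arguments: fetch a payload from S3, then run a local script."""
    local_path = "/tmp/" + key.rsplit("/", 1)[-1]
    return {
        "InstanceIds": [instance_id],
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {
            "commands": [
                "set -e",  # stop at the first failing command
                f"aws s3 cp s3://{bucket}/{key} {local_path}",
                f"{script} {local_path}",
            ]
        },
    }


# All values below are made up for illustration.
kwargs = build_send_command(
    instance_id="i-0123456789abcdef0",
    bucket="example-bucket",
    key="payloads/detections.json",
    script="/opt/splunk-poc/bin/deploy-payload.sh",
)
print(kwargs["Parameters"]["commands"])
```

Since the script runs on the instance itself, any REST call it makes can target localhost, which is what keeps the management port off the public internet.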

Disaster recovery (verified 2026-05-13)

terraform apply -replace of the EC2 + workflow re-run brought every piece of state back from git in ~15 minutes total. The rebuild test surfaced one real bug (HTTP 400 vs 404 in TA-aws input REST handler), which is now fixed in code.

Demonstrable artifacts

5. AI for detection development

Approach. AI is a junior author with senior-author review. Claude drafts the detection YAML, the existing CI gates the schema, a human reviews the draft PR, merge triggers the same deploy pipeline as a hand-written rule. The authoring loop AND the validation loop run inside the existing repo structure with no bespoke tooling.

End-to-end (demonstrated 2026-05-13)

  1. Operator typed 4 inputs into the GitHub Actions form (T1110.003 / iam-password-spray-test / "Detects bursts of failed ConsoleLogin events…")
  2. Claude (claude-sonnet-4-6) drafted a schema-conformant YAML against SCHEMA.md + the cloudtrail-disabled.yml exemplar embedded in the prompt
  3. Draft PR opened automatically with the YAML + reviewer checklist (~30 sec from button click)
  4. Existing validate CI job ran — YAML schema-checked clean
  5. Human reviewed the PR + merged
  6. splunk-config workflow auto-fired on merge → live as a scheduled saved search alerting on production CloudTrail data, every 15 minutes
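
For the final step, the form fields behind a scheduled, alerting saved search might look like the sketch below. The field names are standard saved-search settings; the SPL string, index name, and alert thresholds are illustrative guesses, not the rule that was actually merged.

```python
from urllib.parse import urlencode


def saved_search_form(name: str, spl: str, every_minutes: int = 15) -> dict:
    """Build form fields for a scheduled saved search that alerts on any result."""
    return {
        "name": name,
        "search": spl,
        "is_scheduled": "1",
        "cron_schedule": f"*/{every_minutes} * * * *",
        "dispatch.earliest_time": f"-{every_minutes}m",  # look back one interval
        "dispatch.latest_time": "now",
        "alert_type": "number of events",
        "alert_comparator": "greater than",
        "alert_threshold": "0",
    }


# Illustrative SPL only; the deployed detection may differ in every detail.
example_spl = (
    'index=aws sourcetype=aws:cloudtrail eventName=ConsoleLogin '
    'errorMessage="Failed authentication" '
    "| stats dc(userIdentity.userName) AS users BY sourceIPAddress "
    "| where users >= 5"
)
form = saved_search_form("iam-password-spray-test", example_spl)
encoded = urlencode(form)
print(form["cron_schedule"])
```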

Guardrails

Cost: roughly $0.02 per draft on Sonnet 4.6 (~3K input + ~800 output tokens, assuming list pricing of $3 / $15 per million input / output tokens).
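
The per-draft figure can be sanity-checked from the token counts. The per-million-token prices below are assumptions passed in as parameters; substitute the rates that apply to your account.

```python
def draft_cost_usd(input_tokens: int, output_tokens: int,
                   usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Estimate the cost of one draft from token counts and per-million prices."""
    total = input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    return total / 1_000_000


# Token counts come from the text above; the prices are assumed list rates.
cost = draft_cost_usd(3_000, 800, usd_per_m_input=3.00, usd_per_m_output=15.00)
print(f"${cost:.3f}")  # prints "$0.021"
```

With these assumed rates the estimate lands near two cents per draft; prompt caching, batch pricing, or different rates would change the result.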

Demonstrable artifacts

Honest scoring

Capability                      Status               Notes
Data parsing & ingestion        Fully demonstrated   CloudTrail + VPC Flow live; GuardDuty + HEC are obvious next adds for breadth
DMA optimization & alignment    Fully demonstrated   3 CIM datamodels accelerated, benchmark dashboard shipped
Detection engineering           Fully demonstrated   8 rules, ATT&CK v19.1-current, ES-compatible
Detections as Code (CI/CD)      Fully demonstrated   6 CI jobs green, DR-verified, public :8089 never exposed
AI for detection development    Fully demonstrated   End-to-end live; an automated Docker-Splunk test runner for AI drafts is Phase 6.1, deferred