The Problem: SIEM Ingestion Costs Are Not Sustainable
As log volumes grow, storing everything in your SIEM becomes economically unviable. Volume-based pricing means every unnecessary log byte drives cost with zero detection return. The solution is not to log less — it is to store data in the right place for the right duration, at the right cost.
The Architecture: Three Tiers, One Unified Strategy
Design, build, and deploy a tiered storage architecture that keeps your detection intact while slashing SIEM ingestion costs. Each tier is defined by operational value, query frequency, and retention requirement — not by which vendor you use.
🔴 Hot Tier — 0 to 30 Days
What lives here: Live SOC searches and active investigations. Auth failures, EDR alerts, IDS/IPS events, cloud IAM changes, privilege escalations.
Where: Your SIEM or OpenSearch Security Analytics.
Why: Real-time correlation, alerting, and triage demand sub-second query response and full field fidelity.
🟡 Warm Tier — 30 to 90 Days
What lives here: Threat hunting workloads and forensic deep-dives. Summarised auth successes, DNS flows, proxy logs, and endpoint telemetry that isn't needed for real-time detection but is essential for post-incident investigation.
Where: Elastic or a similar cost-efficient search platform.
Why: Query performance is still important but response times of seconds — rather than milliseconds — are acceptable. Storage cost is significantly lower than hot SIEM storage.
🔵 Cold Tier — 90 Days to 7 Years
What lives here: Compliance archives. Regulatory-required log retention for frameworks such as Qatar NIA Policy, ISO 27001, GDPR, PCI-DSS, and NCA/SAMA controls.
Where: Azure Data Lake, AWS S3, or equivalent object storage, at a small fraction of the per-GB cost of SIEM storage.
Why: These logs are rarely queried but must be preserved and retrievable on demand. Object storage costs a fraction of SIEM or search-platform storage, and documented recall workflows keep the data available for investigations and audits.
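The age boundaries above can be sketched as a small lookup. This is a minimal illustration, not a prescribed implementation: the function names are hypothetical, the boundary days (30 and 90) are assumed to belong to the older tier, and seven years is approximated as 7 × 365 days.

```python
from datetime import datetime, timezone
from typing import Optional

HOT_DAYS = 30          # hot tier: 0 to 30 days
WARM_DAYS = 90         # warm tier: 30 to 90 days
COLD_DAYS = 7 * 365    # cold tier: up to roughly 7 years (assumption: leap days ignored)


def tier_for_age(age_days: float) -> str:
    """Return the tier a log record belongs to, given its age in days."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < HOT_DAYS:
        return "hot"
    if age_days < WARM_DAYS:
        return "warm"
    if age_days <= COLD_DAYS:
        return "cold"
    return "expired"  # past the retention window; eligible for deletion


def tier_for_timestamp(event_time: datetime, now: Optional[datetime] = None) -> str:
    """Same lookup, starting from a timezone-aware event timestamp."""
    now = now or datetime.now(timezone.utc)
    return tier_for_age((now - event_time).total_seconds() / 86400)
```

In practice the same thresholds would more likely be expressed as index lifecycle or object lifecycle policies on the platform you use; the code only makes the boundaries explicit.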
Key Design Principles
- Detection coverage is non-negotiable: Every log required by a detection rule or ATT&CK-mapped use case must remain in hot storage. Tiering never touches detection-critical data.
- Full fidelity is preserved in cold storage: Raw, unfiltered logs go to the cold tier, not summarised versions. If you need to reconstruct an incident from 18 months ago, you have the complete record.
- Recall must be documented: Analysts need a clear, tested workflow to pull cold or warm data back into the SIEM for investigation. Design this before you move the first byte.
- Pipeline preprocessing at the edge: Filter, deduplicate, enrich, and route before logs reach any tier. The pipeline decision determines cost before storage decisions do.
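One way to read these principles as routing logic is sketched below. It is an interpretation under stated assumptions, not the behaviour of any specific pipeline product: the event type names, the `DETECTION_CRITICAL` set, and the `route` function are hypothetical, and the choice to archive duplicates in cold storage (so the raw record stays complete) while suppressing them from the searchable tiers is an assumption.

```python
import hashlib
import json

# Hypothetical event types that detection rules depend on.
DETECTION_CRITICAL = {
    "auth_failure", "edr_alert", "ids_event",
    "iam_change", "privilege_escalation",
}


def fingerprint(event: dict) -> str:
    """Stable hash of an event, used to spot exact duplicates."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def route(event: dict, seen: set) -> list:
    """Return the tiers an event should be written to."""
    destinations = ["cold"]  # the raw event is always archived
    fp = fingerprint(event)
    if fp in seen:
        return destinations  # duplicate: archive only, skip searchable tiers
    seen.add(fp)
    if event.get("type") in DETECTION_CRITICAL:
        destinations.append("hot")   # detection-critical data stays in the SIEM
    else:
        destinations.append("warm")  # everything else goes to the cheaper search tier
    return destinations
```

A real pipeline would bound the `seen` set (for example with a time window) and would add enrichment before routing; both are omitted here to keep the routing decision visible.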
What This Delivers
Cost Reduction
30–70% reduction in SIEM ingestion costs by moving warm and cold data out of expensive hot storage — without removing any log from your environment.
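To see where a figure in that range can come from, the arithmetic can be worked through with placeholder numbers. Every input below (daily volume, hot share, and per-GB prices) is hypothetical and chosen only to make the calculation easy to follow; substitute your own volumes and contract pricing.

```python
def ingestion_reduction(total_gb_per_day: float, hot_gb_per_day: float) -> float:
    """Fraction of SIEM ingestion avoided when only hot data reaches the SIEM."""
    if total_gb_per_day <= 0 or not 0 <= hot_gb_per_day <= total_gb_per_day:
        raise ValueError("expected 0 <= hot_gb_per_day <= total_gb_per_day, total > 0")
    return 1 - hot_gb_per_day / total_gb_per_day


def net_monthly_saving(total_gb_per_day: float, hot_gb_per_day: float,
                       siem_price: float, warm_price: float, cold_price: float,
                       days: int = 30) -> float:
    """Saving after paying for the warm and cold tiers (prices are per GB)."""
    before = total_gb_per_day * siem_price * days
    warm_gb = total_gb_per_day - hot_gb_per_day
    after = (hot_gb_per_day * siem_price      # hot data still goes to the SIEM
             + warm_gb * warm_price           # the rest goes to the warm platform
             + total_gb_per_day * cold_price  # raw copy of everything goes to cold
             ) * days
    return before - after


# Hypothetical inputs: 1,000 GB/day total, 400 GB/day detection-critical,
# and illustrative per-GB prices of 1.00 (SIEM), 0.25 (warm), 0.02 (cold).
reduction = ingestion_reduction(1000, 400)
saving = net_monthly_saving(1000, 400, siem_price=1.00, warm_price=0.25, cold_price=0.02)
```

With these placeholder inputs, SIEM ingestion falls by 60%, and the net monthly saving is about 12,900 currency units against a starting bill of 30,000, because the warm and cold tiers are not free.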
Detection Integrity
All detection-critical events remain in hot storage with full fidelity. Alert quality and mean time to detect (MTTD) are unaffected, and often improve because there is less noise.
Compliance Readiness
Seven-year retention for regulatory frameworks at near-zero cost per GB. Audit requests are answered with full-fidelity original logs, not summaries.
Forensic Capability
The warm tier extends threat hunting and forensic depth out to 90 days, covering the majority of post-incident investigation timelines without paying SIEM prices for that data.
Implementation Approach: Design → Build → Deploy → Run
- Design: Map every log source to a tier based on detection value, investigation frequency, and compliance requirement. Build the data flow diagram before touching any infrastructure.
- Build: Configure pipeline preprocessing (filter, enrich, route). Stand up the warm-tier platform. Configure cold-tier object storage with appropriate lifecycle policies and access controls.
- Deploy: Migrate log routing progressively — start with the highest-volume, lowest-detection-value sources. Validate hot-tier coverage at each step before expanding scope.
- Run (if required): Establish operational runbooks for tier management, recall procedures, and quarterly cost/coverage reviews. HIT Services can provide ongoing detection content updates and tier optimisation on a retainer basis.
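The coverage validation called for in the Design and Deploy steps can be sketched as a simple check that no detection rule depends on a source routed away from hot storage. The source names, rule names, and the `uncovered_sources` function are hypothetical; a real check would read the mapping from your pipeline configuration and the rule dependencies from your detection content.

```python
def uncovered_sources(source_tiers: dict, rule_sources: dict) -> dict:
    """Return, per detection rule, the required sources that are not in the hot tier."""
    gaps = {}
    for rule, sources in rule_sources.items():
        # A source that is unmapped or mapped to warm/cold counts as a gap.
        missing = {s for s in sources if source_tiers.get(s) != "hot"}
        if missing:
            gaps[rule] = missing
    return gaps


# Hypothetical tier map and rule dependencies.
tiers = {"edr": "hot", "auth": "hot", "proxy": "warm"}
rules = {"brute_force": {"auth"}, "c2_beacon": {"proxy", "edr"}}
gaps = uncovered_sources(tiers, rules)  # {"c2_beacon": {"proxy"}}
```

Running a check like this after each routing change gives an explicit gate: an empty result means every rule still sees the data it needs, and a non-empty result names the rule and the source to move back to hot.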
FAQs
What if an incident requires logs from the warm or cold tier?
The recall workflow is designed before deployment. Analysts follow a documented process to pull warm or cold data into the SIEM investigation workspace. Most platforms support on-demand index mounting or data import — response time is typically minutes to hours, not days.
Does this architecture work with our existing SIEM?
Yes. This is a platform-agnostic architecture. The tiering logic lives in your pipeline layer (e.g., OpenTelemetry Collector, Logstash, Cribl, or a custom routing agent), not in the SIEM itself. The SIEM sees only hot data; the architecture around it handles warm and cold routing transparently.
How do we ensure compliance logs are tamper-proof in cold storage?
Object storage platforms support immutable storage policies (e.g., immutable storage for Azure Blob Storage, Amazon S3 Object Lock) that prevent modification or deletion for a specified retention period. Combine these with access logging and a hash recorded at write time and re-checked on recall to support chain of custody.
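Hash verification at write time can be as simple as recording a SHA-256 digest alongside each archived object and recomputing it on recall. The sketch below uses only the Python standard library; the function names are hypothetical, and where the digest is stored (a manifest, object metadata, or a separate ledger) is left as a design decision.

```python
import hashlib
import hmac


def digest_at_write(payload: bytes) -> str:
    """Compute the digest to record when a log batch is written to cold storage."""
    return hashlib.sha256(payload).hexdigest()


def verify_on_recall(payload: bytes, recorded_digest: str) -> bool:
    """Recompute the digest on recall and compare it with the recorded value."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, recorded_digest)
```

The digest should be kept somewhere the archive writer cannot later alter, otherwise a tampered object and a tampered digest would still match.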