The Anomaly: A FinOps Detective Story
A real cost spike, the investigation that found it, and the maturity model that emerged from the lesson. Cloud economics taught the way it actually gets learned.
This case is reconstructed from a customer engagement; identifying details are removed. It begins with a 14% month-over-month spend spike on what should have been a stable workload. It ends, two weeks later, with a recovery of roughly 1.4 million BDT and a FinOps practice that did not exist before the investigation started. Names are fictional; sequence is real. The story is told here as a detective procedural rather than as a framework summary, because the framework summary is what every FinOps vendor delivers and is not, in our experience, what teaches the discipline.
The investigation
- Day 0: Anomaly fires. Cost-and-usage dashboard flags a 14% spend spike vs a stable trailing baseline.
- Day 1: Untagged drill-down. Roughly 11% of the spike sits in untagged resources. We cannot allocate them to a BU.
- Day 2: Tagging audit. Tagging coverage is 84%. The 16% gap explains most of the visibility loss.
- Day 4: Idle disks discovered. 320 unattached EBS-equivalent volumes from a six-month-old test campaign. Combined: 28 TB at premium tier.
- Day 6: Oversized DB instance. A reporting database had been sized for a launch peak in early 2024 and never resized. Memory utilisation was 12%.
- Day 8: Reserved instance shortfall. Steady-state compute that should have been on RIs was running on demand. ~22% over-spend on that footprint alone.
- Day 11: Recovery + practice spec. 1.4M BDT recovered. FinOps charter signed: 1 central role, 4 BU ambassadors, monthly review cadence.
Source: Reconstructed customer engagement, 2025.
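The Day 0 alert can be pictured as a simple comparison against a trailing baseline. The sketch below is only an illustration of that idea: the function name `spend_anomaly`, the 10% threshold, and the spend figures are assumptions made for the example, not the customer's actual alerting rule or data.

```python
from statistics import mean

def spend_anomaly(trailing_months, current_month, threshold=0.10):
    """Compare current spend with the mean of the trailing months.

    Returns (is_anomaly, pct_change). The 10% default threshold is an
    assumption for illustration; a real practice tunes it per workload.
    """
    if not trailing_months:
        raise ValueError("need at least one trailing month of spend")
    baseline = mean(trailing_months)
    pct_change = (current_month - baseline) / baseline
    return pct_change > threshold, pct_change

# Illustrative figures only: a flat baseline followed by a 14% jump.
flagged, change = spend_anomaly([100.0, 100.0, 100.0], 114.0)
print(f"flagged={flagged}, change={change:.1%}")
```

A mean over a stable baseline is the simplest possible detector. Workloads with seasonality or growth would need a baseline that accounts for trend before a threshold like this means anything.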
What the investigation taught
What an idle disk taught about ownership
The 320 unattached volumes are worth a closer look. They were created during a marketing-led test campaign six months prior, by an engineer who had since rotated off the team. Nobody owned them. They were not tagged with an owner, an environment, or a cost-centre. The cleanup itself was trivial — fifteen lines of Terraform — but the reason they existed was the practice’s first lesson. Without tagging at provision time and without a regular orphan-resource sweep, idle assets accumulate at roughly the rate of organisational change. The cleaner the provisioning workflow, the smaller the orphan tail.
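A regular orphan-resource sweep can be sketched as a filter over a volume inventory. This is a minimal sketch under stated assumptions: the record fields (`attached_to`, `created_at`, `size_gb`, `tags`), the 30-day age cut-off, and the sample inventory are all hypothetical, and a real sweep would read the inventory from the cloud provider's API or a billing export.

```python
from datetime import datetime, timedelta, timezone

def find_orphan_volumes(volumes, now, min_age_days=30):
    """Return volumes that are unattached and older than the cut-off.

    The age cut-off (an assumption here) avoids flagging disks that were
    detached moments ago as part of normal operations.
    """
    cutoff = now - timedelta(days=min_age_days)
    return [
        v for v in volumes
        if v["attached_to"] is None and v["created_at"] <= cutoff
    ]

# Hypothetical inventory for illustration only.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
volumes = [
    {"id": "vol-a", "attached_to": None, "size_gb": 100,
     "created_at": now - timedelta(days=180), "tags": {}},
    {"id": "vol-b", "attached_to": "vm-1", "size_gb": 50,
     "created_at": now - timedelta(days=200), "tags": {"owner": "team-x"}},
    {"id": "vol-c", "attached_to": None, "size_gb": 20,
     "created_at": now - timedelta(days=3), "tags": {}},
]

orphans = find_orphan_volumes(volumes, now)
ownerless = [v["id"] for v in orphans if not v["tags"].get("owner")]
total_gb = sum(v["size_gb"] for v in orphans)
print(f"{len(orphans)} orphan volume(s), {total_gb} GB, no owner tag: {ownerless}")
```

Reporting the missing owner tag alongside the size matters as much as the deletion itself, because it points at the provisioning gap that let the volume appear unowned in the first place.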
Where the savings actually came from
Source: Single-engagement breakdown; consistent with the broader FinOps composite.
The maturity that emerged
The investigation made one thing concrete: FinOps is not a tool you buy; it is a practice you run. A small central group sets standards and runs the data pipeline; engineering ambassadors in each BU implement optimisations on the ground; finance handles invoicing, chargeback mechanics, and reconciliation. The role mix matters more than the headcount. A central team without engineering credibility is ignored; ambassadors without central guidance drift into inconsistent practice; finance without engineering partnership invoices people for things they cannot control.
Why tagging is the y-axis of every FinOps decision
The investigation’s second lesson, and the one we now teach first to new customers, is that tagging is the foundational discipline. Without tags, allocation breaks. Without allocation, chargeback fictions take over. Without chargeback, the feedback loop that drives behaviour change does not exist. Mature FinOps practices enforce tagging at provisioning time — Terraform modules that fail without owner, environment, and cost-centre tags; cloud-native policy engines that block untagged resources; nightly reports that name the senior leader of any BU below 95% coverage. Each of these is unglamorous and high-leverage.
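The nightly coverage report described above might look something like the following sketch. It assumes each resource record already carries a `bu` field derived from something other than tags (for example an account-to-BU mapping), since untagged resources cannot otherwise be attributed. The field names, the sample records, and the choice to count resources rather than weight by spend are all assumptions for illustration.

```python
from collections import defaultdict

# Required tags as named in the text; the exact key spellings are assumed.
REQUIRED_TAGS = ("owner", "environment", "cost-centre")

def missing_tags(resource):
    """List the required tags that are absent or empty on a resource."""
    tags = resource.get("tags", {})
    return [t for t in REQUIRED_TAGS if not tags.get(t)]

def coverage_by_bu(resources):
    """Fraction of fully tagged resources per business unit."""
    totals, compliant = defaultdict(int), defaultdict(int)
    for r in resources:
        totals[r["bu"]] += 1
        if not missing_tags(r):
            compliant[r["bu"]] += 1
    return {bu: compliant[bu] / totals[bu] for bu in totals}

def below_target(coverage, target=0.95):
    """Business units whose coverage falls under the target."""
    return sorted(bu for bu, c in coverage.items() if c < target)

# Hypothetical records for illustration only.
resources = [
    {"id": "db-1", "bu": "retail",
     "tags": {"owner": "a", "environment": "prod", "cost-centre": "cc1"}},
    {"id": "vm-7", "bu": "retail", "tags": {"owner": "a"}},
    {"id": "vm-9", "bu": "payments",
     "tags": {"owner": "b", "environment": "dev", "cost-centre": "cc2"}},
]

coverage = coverage_by_bu(resources)
print(coverage)
print("below 95%:", below_target(coverage))
```

The same `missing_tags` check is the kind of rule a provisioning-time gate would apply before a resource is created; the report only catches what the gate missed.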
What the FinOps Foundation framework gets right
The framework’s three phases — Inform, Optimise, Operate — are the right framing. Inform is visibility, allocation, and benchmarking. Optimise is rightsizing, commitment management, and cleanup. Operate is the continuous improvement loop, the engineer-facing cost signals, and the cultural posture that makes the savings durable. Most organisations get stuck at the boundary between Inform and Optimise, because Inform is largely a data problem and Optimise is a behaviour-change problem. The FinOps practice that ships savings is the one that takes the behaviour-change problem seriously.
The first ninety days
Tagging to 95% coverage. Allocation to BU. Three rightsizing waves, quarterly. Commitment-management cadence. Anomaly-detection alerts flowing to BU leadership. Six months in, the practice pays for itself. Two years in, the practice is one of the most strategically important cost-management capabilities the CFO has, because it is the only one that scales linearly with the cloud bill.
The cultural posture that distinguishes a working practice
Three traits show up consistently in FinOps practices that work in their second and third year. The CFO and the platform-team head meet monthly with the same dashboard in front of them. Engineers see cost-per-environment in their CI pipeline output, alongside test results. And the architecture review board includes a cost criterion on every new design — not as an after-the-fact challenge but as a day-zero constraint. Each of these takes time to build; none of them is hard once the practice exists.