HCI or Three-Tier? A Decision Tree

Five questions, four leaves, one decision. Walk the tree once and the architecture argument resolves before the procurement starts.

I have stopped having the “which is better” argument because it is the wrong question. The right question is which one fits the operating model and workload mix you actually have. Below is the decision tree I hand to architects who walk into the meeting expecting a recommendation. It returns one in five questions. The conversation that follows is usually shorter than the conversation that preceded it; the tree’s value is not that it produces a brilliant answer, but that it produces a defensible one quickly.

5: questions in the tree
4: leaves you can land on
0: tree leaves that say 'try both'
3 yr: cost of the wrong leaf

The grid the tree maps onto

Where the architecture decision actually sits

| Specialist team availability | Uniform workloads | Heterogeneous workloads |
| --- | --- | --- |
| Network + storage + compute teams | Three-tier (legacy): costs the most. Specialist teams running uniform workloads, paying for capability you do not use. | Three-tier: right answer for Tier-1 BFSI. Heterogeneous workloads + specialist teams = independent scaling per tier wins. |
| One platform team | HCI: default for mid-sized. One team, uniform workloads, fewest moving parts, fastest stand-up. | HCI under strain: workable, has limits. Heterogeneous workloads on HCI work until the storage performance ceiling bites. |
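The grid above can be read as a two-key lookup: team structure on one axis, workload heterogeneity on the other. The sketch below is only an illustration of that reading. The names `GRID` and `grid_cell` are hypothetical, and the cell labels are copied from the grid rather than derived from any measurement.

```python
# Hypothetical encoding of the 2x2 grid.
# Key: (has_specialist_teams, heterogeneous_workloads)
GRID = {
    (True, False): "Three-tier (legacy): costs the most",
    (True, True): "Three-tier: right answer for Tier-1 BFSI",
    (False, False): "HCI: default for mid-sized",
    (False, True): "HCI under strain: workable, has limits",
}


def grid_cell(has_specialist_teams: bool, heterogeneous_workloads: bool) -> str:
    """Return the grid cell for a given team structure and workload mix."""
    return GRID[(has_specialist_teams, heterogeneous_workloads)]


if __name__ == "__main__":
    # One platform team running a mixed workload estate.
    print(grid_cell(has_specialist_teams=False, heterogeneous_workloads=True))
```

The lookup makes one property of the grid explicit: every combination lands on exactly one cell, and none of them is "try both".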

Walk the tree

1. Do you have specialist storage and network teams already running? Yes → branch toward three-tier. No → branch toward HCI. The reason is operational economics: a specialist team is expensive to staff and expensive to lose, and a three-tier architecture justifies the cost. An HCI estate operated by separate specialist teams becomes a three-tier estate with extra steps: the handoffs between teams remain, but the independent tiers that justified them are gone.

2. Are your workloads heterogeneous (databases + analytics + VDI + core systems)? Yes → strengthens three-tier. No → strengthens HCI. Heterogeneous workloads have heterogeneous storage performance profiles: a high-IOPS transactional database wants very different media than a sequential-read analytics workload. A three-tier SAN can present multiple QoS classes; HCI’s distributed storage layer scales out as nodes are added but does not segment workload classes as cleanly.

3. Is your refresh cycle staggered across compute, network, and storage? Yes → three-tier preserves the rhythm. No → HCI fits the buy-as-you-grow CAPEX shape better. Refresh discipline is under-discussed and over-consequential. A three-tier refresh moves on the cycle of each tier independently, which suits estates with mature capital planning. HCI refreshes one node at a time, which suits estates where capital is approved in smaller increments.

4. Does your audit story (BB ICT, etc.) already reference the existing SAN architecture? Yes → switching costs are real; bias toward three-tier. No → HCI’s audit story is mature enough. The audit story matters more than most architects acknowledge. A bank with a five-year history of vSphere + SAN audits has thousands of pages of documentation that a regulator has already accepted; rewriting that documentation for an HCI architecture is a year-long project on its own.

5. Is the platform team smaller than fifteen engineers? Yes → HCI’s single-pane operating model is decisive. No → either can work; weight the previous answers. Team size is the single best leading indicator of which architecture will be operationally sustainable; small teams and HCI go together in the same way large teams and three-tier go together.
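The five questions can be sketched as a small function. This is one possible reading of the tree, not a definitive scoring model. It treats a platform team under fifteen engineers as decisive (question 5), and otherwise counts how many of questions 1 to 4 point toward three-tier. The equal weighting and the tie-break on question 1 are assumptions of this sketch, and the names `Answers` and `recommend` are hypothetical.

```python
from dataclasses import dataclass

SMALL_TEAM_THRESHOLD = 15  # from question 5


@dataclass
class Answers:
    specialist_teams: bool         # Q1: storage and network teams already running
    heterogeneous_workloads: bool  # Q2: databases + analytics + VDI + core systems
    staggered_refresh: bool        # Q3: refresh cycle staggered across tiers
    audit_references_san: bool     # Q4: audit story already references the SAN
    platform_team_size: int        # Q5: number of platform engineers


def recommend(a: Answers) -> str:
    """Return 'HCI' or 'three-tier' under the assumptions stated above."""
    # Q5: a small platform team is described as decisive for HCI.
    if a.platform_team_size < SMALL_TEAM_THRESHOLD:
        return "HCI"

    # Q1 to Q4: each "yes" leans toward three-tier (equal weights are an assumption).
    votes_three_tier = sum([
        a.specialist_teams,
        a.heterogeneous_workloads,
        a.staggered_refresh,
        a.audit_references_san,
    ])
    if votes_three_tier > 2:
        return "three-tier"
    if votes_three_tier < 2:
        return "HCI"

    # Tie-break (assumption): fall back to Q1, the operating-model question.
    return "three-tier" if a.specialist_teams else "HCI"


if __name__ == "__main__":
    bank = Answers(True, True, True, True, platform_team_size=40)
    mid_sized = Answers(False, False, False, False, platform_team_size=6)
    print(recommend(bank), recommend(mid_sized))
```

The function never returns "both", which mirrors the tree's design: every path ends on a single architecture, and an estate that disagrees with the result should revisit its answers rather than split the difference.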

What you actually trade off

The non-marketing version of the comparison

| | Three-tier | HCI |
| --- | --- | --- |
| Operational team count | 3+ specialists | 1 platform team |
| CAPEX shape | Tier-by-tier | Node-by-node |
| Storage performance ceiling | High (dedicated array) | Good, scales with nodes |
| Heterogeneous workloads | Best fit | Workable |
| Time to stand up | Months | Weeks |
| BB ICT audit story | Mature, well-trodden | Maturing, well-supported |
| Vendor lock-in shape | Per tier | Per HCI stack |

What changed after Broadcom

The pricing math shifted. vSAN inside VVF and VCF makes vSphere-based HCI economically different than it was in 2023. KVM-based HCI — Nutanix AHV, Proxmox + Ceph, StarlingX — is now a credible production option. The gap between HCI and three-tier is narrower than the gap between vSphere and KVM management planes. Pick the operating model first; the storage architecture follows.

The conversation that should not happen

Three meetings happen too often. The first is “let’s standardise on HCI everywhere”, because someone read a vendor brochure. The second is “let’s keep three-tier forever”, because someone is afraid of the consolidation. The third is “let’s run both”, because the committee could not pick. The first leads to a Tier-1 estate with HCI performance ceilings biting in month nine. The second leads to a small estate paying for specialist teams it cannot keep. The third leads to double everything for two years and then the same decision again.

What good looks like at year three

A three-tier estate that has stayed three-tier for the right reasons has, by year three: documented audit-grade evidence for the SAN, a storage team that ships infrastructure-as-code, and an architecture review board that says no to HCI proposals on cost grounds. An HCI estate that has stayed HCI for the right reasons has: a single platform team operating with three FTEs, a node-by-node refresh rhythm that finance understands, and a CMDB that nobody questions. Both are good outcomes. The wrong outcome is the estate that kept neither commitment.
