Hybrid Cloud Done Right: Baseline on Bare Metal, Burst to Public Cloud (Without the Lock-In)

Here's a scenario that plays out constantly in companies that have been on AWS or Azure for a few years: the infrastructure bill keeps climbing, someone finally pulls the thread, and they discover that a huge chunk of that spend is just... baseline workload. Databases, analytics pipelines, internal APIs, data warehouses — stuff that runs 24/7 and barely fluctuates. And they're paying cloud on-demand rates for all of it.

It's not a great situation. But it's also not hard to fix, once you know what you're actually solving for.

The honest answer to "where should my workloads live" is almost never "all cloud" or "all on-prem." It's a mix. Predictable, steady-state workloads belong on dedicated hardware where you're paying a flat rate. Spiky, unpredictable, or short-lived workloads belong in the cloud where you can spin up capacity fast and pay only for what you use. That's hybrid cloud — and most companies aren't doing it right.

The Problem With Going All-In on Cloud

Public cloud is genuinely great for certain things. Elastic scaling, global reach, managed services for stuff you don't want to operate yourself. Nobody's disputing that.

But the pricing model is designed for flexibility, not efficiency. You're paying a premium for the option to scale — even when you never use it. And then there's egress. AWS charges somewhere between $0.05 and $0.09 per GB to move data out of their network. If you're running analytics workloads or serving large datasets, that adds up shockingly fast. We've seen customers with $16,000+ monthly egress bills, just from data transfer.

The other thing that doesn't get talked about enough is the lock-in. Not just technical lock-in from proprietary services, but financial lock-in. Once your team is comfortable in the AWS console, once your tooling is built around S3 and RDS and Lambda, switching feels enormous. So you just... keep paying.

What "Own the Base, Rent the Spike" Actually Means

This is the core philosophy we operate by at Bit Refinery, and it's simpler than it sounds.

Figure out what your baseline compute and storage needs look like — the stuff that's running every day regardless of what's happening in the business. Put that on dedicated bare metal hardware where you're paying a predictable flat monthly rate. Then, for the stuff that genuinely spikes — seasonal traffic, batch ML training jobs, burst rendering, whatever — burst into public cloud resources.

You get the cost efficiency of owned hardware for your steady-state workload, and the flexibility of cloud for the unpredictable stuff. Best of both worlds, without being locked into either.

In practice, this means your ClickHouse cluster for real-time analytics lives on a Gold or Platinum bare metal node. Your primary Postgres database lives there too. Your ML inference serving layer, your log aggregation pipeline, your data warehouse. All of it running on dedicated hardware with predictable pricing and zero egress fees.

And when you need to run a massive training job, spin up a temporary GPU cluster in AWS or GCP. When you have a product launch driving 10x normal traffic, burst your web tier into cloud auto-scaling groups. Then scale back down. Pay for what you used. Done.

The Numbers Are Not Close

Let's be concrete about this. A comparable AWS configuration to our Gold tier — something like an r6i.metal instance plus 40 TB of storage — runs around $10,658 per month. Our Gold tier is $2,800/month. That's 80 cores, 1 TB RAM, 44 TB of RAID6 SSD storage, and unlimited bandwidth with zero egress fees.

The egress piece is huge. If you're moving 180 TB of data per month (not unusual for analytics workloads), that's over $16,000 in AWS egress charges alone. On our infrastructure, it's $0. Unlimited 1 Gbps bandwidth, included.

Cost comparison chart between AWS and Bit Refinery bare metal showing significant savings on compute and egress fees.

Now, the counterargument is that you're giving up elasticity. And that's true — you can't spin a bare metal server up in 30 seconds. But here's the thing: your baseline workload doesn't need elasticity. It's baseline. It's predictable. The elasticity argument only applies to the spiky stuff, which you're still handling in the cloud.

How to Actually Architect This

The implementation isn't as complicated as people expect, but it does require some intentional design.

Keep your data layer on bare metal. This is the most impactful move. Databases, object storage, analytics engines — these are your heaviest egress sources and your most consistent workloads. Running ClickHouse or MinIO on dedicated hardware eliminates egress costs and gives you much better raw performance than cloud equivalents.

Use S3-compatible object storage. One of the subtle lock-in traps is building everything around S3's API. If you're running MinIO on bare metal (which is fully S3-compatible), your application code doesn't change. You can still burst workloads into AWS that need to pull data, and you're not rewriting your storage layer to escape.

Design your compute tier for hybrid from the start. Containerized workloads with Kubernetes make this much cleaner. Your baseline pods run on your bare metal nodes. When you need to burst, you're spinning up additional nodes in cloud providers and joining them to the same cluster. Tools like Karpenter (for AWS) or GKE Autopilot handle the cloud side of this pretty gracefully.

Sub-2ms latency to major clouds matters here. Our Denver and Seattle data centers both sit within sub-2ms latency to AWS, Azure, and GCP regions. That's low enough that your hybrid architecture doesn't feel like two separate systems — data flows between your bare metal layer and cloud burst capacity without meaningful latency impact.

The Lock-In Question

One of the things we hear from engineering teams is that they're worried about trading cloud lock-in for bare metal lock-in. It's a fair concern.

But there's a real difference between the two. Cloud lock-in is often invisible — it accumulates through proprietary APIs, managed services with no open-source equivalent, and pricing structures that make leaving expensive. Bare metal is just hardware. Your software stack is portable. If you're running ClickHouse, Trino, MinIO, Postgres — all open-source, all running the same way anywhere. You can move workloads between providers without rewriting your application.

We run VergeOS for virtualization, which is a modern VMware alternative with a clean migration path from vSphere if you're coming from that world. The point is that the underlying infrastructure doesn't dictate your software choices.

Is This Right for Every Company?

Honestly, no. If you're a 10-person startup with wildly unpredictable growth, just use the cloud. The operational simplicity is worth the premium when you're moving fast and don't have dedicated infrastructure folks.

But if you're past that stage — if you have a sense of your baseline workload, if your cloud bill has a significant chunk of steady-state spend, if you're paying meaningful egress fees — this model will almost certainly save you money. And not a little money. Usually a lot.

The companies that get the most out of hybrid cloud are the ones running data-intensive workloads: analytics platforms, ad tech pipelines, ML infrastructure, SaaS backends with heavy database usage. These are workloads where the bare metal performance advantage is real and the egress savings are substantial.

Getting Started

The migration doesn't have to happen all at once. Start with your data layer — move your primary database or your analytics cluster to bare metal and measure the cost difference. Keep your cloud presence for burst capacity and anything that genuinely benefits from managed services.

If you want to talk through what this looks like for your specific workload, we're happy to dig into the details. Reach out at bitrefinery.com/contact — we do this kind of architecture conversation regularly and there's no sales pressure involved.

Hybrid cloud done right isn't complicated. It's just about putting the right workloads in the right place. Own the base. Rent the spike. Keep your options open.