---
title: "Trino on Bare Metal: Query 40+ Data Sources Without the Cloud Tax"
url: "https://bitrefinery.com/blog/trino-on-bare-metal-federated-sql-zero-egress"
description: "Trino's federated SQL engine is incredibly powerful — but running it on cloud infrastructure quietly bleeds your budget through egress fees and noisy-neighbor performance. Here's what happens when you move it to bare metal."
author: "Bit Refinery Infrastructure Team"
date: "2026-04-19"
lastmod: "2026-04-19"
tags: ["trino", "bare metal", "federated sql", "data engineering", "infrastructure", "analytics"]
source: "blog CMS"
---

# Trino on Bare Metal: Query 40+ Data Sources Without the Cloud Tax

If you've been running Trino on AWS, Azure, or GCP, you already know what it's capable of. Federated queries across Hive, Iceberg, PostgreSQL, Kafka, MySQL, Elasticsearch — all through standard ANSI SQL. It's genuinely one of the most powerful distributed SQL engines out there, and for data teams that need to query across a dozen different systems without building a massive ETL pipeline first, it's kind of a game-changer.

But here's the thing nobody talks about enough: the cloud is quietly taxing you every time Trino does what it was designed to do.

## The Hidden Cost of Federated Queries in the Cloud

Federated queries move data. That's the whole point — Trino reaches into your S3 buckets, your RDS instances, your Kafka topics, your external APIs, pulls the relevant data, and joins it all together in memory. The problem is that when your coordinator nodes and worker nodes are spread across availability zones — or worse, when your data sources live outside the cloud entirely — you're paying egress fees on every single byte that crosses a zone or region boundary.

AWS charges $0.01/GB in each direction for cross-AZ traffic and $0.08–$0.09/GB for traffic leaving to the internet. That doesn't sound like much until you're running a data platform that processes hundreds of terabytes a month. Then you're looking at [egress bills that can hit $16,000+ per month](/blog/200-billion-ai-tax-cloud-bill), on top of your compute costs.
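To make that arithmetic concrete, here's a rough back-of-the-envelope sketch using the per-GB rates quoted above. The workload (200 TB a month leaving the cloud) is a hypothetical example chosen to land on a bill of that size, not a measured figure. Real AWS pricing is tiered and varies by region, so treat this as an estimate only.

```python
# Per-GB rates quoted in the text; real AWS pricing is tiered and regional.
CROSS_AZ_USD_PER_GB = 0.01   # billed in each direction
CROSS_AZ_DIRECTIONS = 2      # so a byte crossing zones is charged twice
INTERNET_USD_PER_GB = 0.08   # low end of the $0.08-$0.09 range

GB_PER_TB = 1000  # decimal terabytes, an assumption to keep the math simple


def monthly_egress_cost(cross_az_tb: float, internet_tb: float) -> float:
    """Estimate one month of data transfer charges in USD."""
    cross_az = cross_az_tb * GB_PER_TB * CROSS_AZ_USD_PER_GB * CROSS_AZ_DIRECTIONS
    internet = internet_tb * GB_PER_TB * INTERNET_USD_PER_GB
    return round(cross_az + internet, 2)


# Hypothetical workload: 200 TB/month leaving the cloud, nothing cross-AZ.
print(monthly_egress_cost(cross_az_tb=0, internet_tb=200))  # 16000.0
```

Plug in your own monthly transfer volumes to see where your workload lands.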

And compute costs on cloud aren't cheap either. A reasonably sized Trino cluster — say, a coordinator plus 8 workers with enough RAM to handle real-world query concurrency — easily runs $8,000–$12,000/month on AWS. And that's before storage, before data transfer, before the inevitable "why is this month's bill 40% higher" conversation with your finance team.

## What Bare Metal Actually Changes

When you run Trino on dedicated bare metal hardware, a few things change pretty dramatically.

**No virtualization overhead.** Cloud instances are virtual machines sitting on top of shared physical hardware. When your Trino workers are doing heavy in-memory joins across large datasets, they're competing with other tenants for CPU cycles, memory bandwidth, and network I/O. Many teams are discovering the [hidden risks of "good enough" virtualization](/blog/hidden-risks-good-enough-virtualization-enterprise-alternatives) when performance consistency is critical for sub-second query responses. On bare metal, you get the whole machine. All 80 cores, all 1 TB of RAM, all of the NVMe throughput — yours.

**No egress fees.** This is the big one. At Bit Refinery, egress is $0. Unlimited. It doesn't matter if your Trino cluster is pulling data from MinIO object storage, querying a remote PostgreSQL instance, or shipping results to a BI dashboard — there's no meter running on your data transfer. For federated query workloads specifically, this is huge because you're moving data constantly by design.

**Predictable pricing.** Our Gold tier — 80 cores, 1 TB RAM, 44 TB of RAID6 SSD storage — runs $2,800/month. A comparable AWS setup (r6i.metal or similar) with equivalent storage would run north of $10,000/month before you touch egress. You can do the math on that one.


![Comparison chart of Trino on AWS vs Bare Metal showing cost savings and hardware specs](/api/storage/files/blog-images/infographic-1776596476101.jpg)

## Trino's Connector Ecosystem Is the Real Story

People sometimes underestimate how broad Trino's connector support actually is. We're talking 40+ data sources out of the box: Hive, Apache Iceberg, Delta Lake, Hudi, PostgreSQL, MySQL, SQL Server, Oracle, Cassandra, MongoDB, Redis, Elasticsearch, OpenSearch, Kafka, Kinesis, Pinot, Druid, BigQuery, Raptor, local files, HTTP, and the list keeps going. Exact connector availability depends on the Trino release you run, so check the connector list for your version.

This means your data team can write a single SQL query that joins your operational Postgres database with your Iceberg lakehouse on MinIO with your Kafka event stream, and get results back without writing a single line of Spark or building a pipeline. That's the dream, right?
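Here's a minimal sketch of what such a query could look like, built as a Python string so the three catalogs are easy to spot. Every catalog, schema, table, and column name below is hypothetical, and the Kafka part assumes the topic has a table definition that maps message fields to columns.

```python
def fq(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified Trino table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


# Hypothetical catalogs, schemas, tables, and columns, for illustration only.
customers = fq("postgresql", "public", "customers")  # operational Postgres
orders = fq("iceberg", "sales", "orders")            # Iceberg tables on MinIO
events = fq("kafka", "default", "click_events")      # Kafka topic as a table

FEDERATED_QUERY = f"""
SELECT c.customer_id, c.plan, o.order_id, e.event_type
FROM {customers} AS c
JOIN {orders} AS o ON o.customer_id = c.customer_id
JOIN {events} AS e ON e.customer_id = c.customer_id
WHERE o.order_date >= DATE '2026-01-01'
"""

print(FEDERATED_QUERY)
```

The only thing that tells Trino where each table lives is the catalog prefix, which is why one statement can span three systems.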

The catch is that dream requires network throughput and memory. Lots of both. When you're federating across 5 or 6 connectors in a single query, Trino's workers need to hold intermediate result sets in memory while they're being joined. If you're on cloud instances with 64 or 128 GB of RAM per worker, you're going to hit spill-to-disk situations on complex queries. On our Platinum tier with 3 TB of RAM per node, that's basically never a problem.
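A rough way to reason about this is to estimate how much of a hash join's build side each worker has to hold. The sketch below is a simplification that ignores hash-table overhead and data skew, and the table size and memory budgets in the example are hypothetical numbers rather than measurements.

```python
def build_side_gb_per_worker(build_side_gb: float, workers: int,
                             distribution: str) -> float:
    """Rough in-memory footprint of a hash join's build side on each worker.

    BROADCAST copies the whole build side to every worker; PARTITIONED
    hash-partitions it so each worker holds roughly 1/workers of it.
    Ignores hash-table overhead and skew, so treat it as a lower bound.
    """
    if distribution == "BROADCAST":
        return build_side_gb
    if distribution == "PARTITIONED":
        return build_side_gb / workers
    raise ValueError(f"unknown distribution: {distribution}")


def exceeds_memory_budget(build_side_gb: float, workers: int,
                          distribution: str, budget_gb: float) -> bool:
    """True if the per-worker share is larger than the per-node query budget.

    Exceeding the budget means spilling to disk when spill is enabled,
    or a failed query when it is not.
    """
    return build_side_gb_per_worker(build_side_gb, workers, distribution) > budget_gb


# Hypothetical: a 600 GB build side joined across 8 workers.
print(build_side_gb_per_worker(600, 8, "PARTITIONED"))    # 75.0
print(exceeds_memory_budget(600, 8, "PARTITIONED", 40))   # True
print(exceeds_memory_budget(600, 8, "PARTITIONED", 900))  # False
```

The 40 GB and 900 GB budgets stand in for a small cloud worker and a large bare metal node. The real per-node budget depends on how you size the JVM heap and `query.max-memory-per-node`.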

## Real-World Trino Architecture on Bare Metal

Here's what a typical Trino deployment looks like on Bit Refinery infrastructure:

- **1–2 coordinator nodes** on Silver tier (48 cores, 512 GB RAM) — handle query parsing, planning, and scheduling
- **4–8 worker nodes** on Gold or Platinum tier — handle the actual data fetching and in-memory processing
- **MinIO on bare metal** as the object storage layer — S3-compatible, zero egress, sits right next to your Trino workers on the same network
- **Sub-2ms latency** to major public clouds via our Google Cloud Interconnect (included free at our Denver facility)

That last point matters if you're in a [hybrid cloud architecture](/blog/hybrid-cloud-bare-metal-baseline-burst-public-cloud) where some of your data sources are still in GCP. The free interconnect means your Trino workers can reach BigQuery or Cloud Storage with sub-2ms latency and essentially no egress cost on the bare metal side. We [waive these interconnect fees](/blog/why-we-give-away-cloud-interconnect-fees) because it changes how you can architect your data movement without worrying about the "GCP tax."

## What Our Trino Consulting Actually Looks Like

We don't just hand you a server and wish you luck. Our Trino managed services (in partnership with Quantrail Data) cover the full lifecycle:

- **Cluster sizing and architecture design** — figuring out how many workers you actually need, what connector configuration makes sense for your data sources, how to tune JVM heap and spill settings for your workload
- **Query performance tuning** — analyzing slow queries, fixing bad join orders, adding statistics where they're missing, pushing down predicates to connectors
- **Infrastructure audits** — if you're already running Trino somewhere and it's just... slow, we can come in and figure out why
- **Migration roadmaps** — moving from Presto, Athena, or Spark SQL to Trino without breaking your existing dashboards and pipelines
- **24/7 monitoring** — we watch the cluster so you don't have to, with automated scaling and version management

The performance tuning piece is honestly where we see the biggest wins. Most Trino deployments in the wild are running with default configurations that weren't designed for their specific workload. Things like `task.concurrency`, `query.max-memory-per-node`, join distribution strategies, and connector-specific pushdown settings can make a 10x difference in query latency without touching the hardware at all.
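As a starting point only, here's a small sketch that derives a few of those settings from a node's hardware. The property names are real Trino settings, but the ratios (75% of RAM for the JVM heap, half of the heap for per-node query memory) and the cap of 32 on `task.concurrency` are our assumptions for illustration, not recommendations for your workload.

```python
def largest_power_of_two_at_most(n: int) -> int:
    """task.concurrency must be a power of two, so round down to one."""
    power = 1
    while power * 2 <= n:
        power *= 2
    return power


def sketch_worker_settings(node_ram_gb: int, cores: int) -> dict[str, str]:
    """Derive illustrative starting values; tune against real queries."""
    heap_gb = int(node_ram_gb * 0.75)       # assumed share of RAM for the JVM
    per_node_query_gb = int(heap_gb * 0.5)  # assumed share of heap for queries
    concurrency = largest_power_of_two_at_most(min(cores, 32))  # assumed cap
    return {
        "-Xmx": f"{heap_gb}G",  # goes in jvm.config
        "query.max-memory-per-node": f"{per_node_query_gb}GB",
        "task.concurrency": str(concurrency),
        "join-distribution-type": "AUTOMATIC",  # let the optimizer choose
    }


# Gold tier from this post: 80 cores, 1 TB RAM (taken as 1024 GB here).
for name, value in sketch_worker_settings(node_ram_gb=1024, cores=80).items():
    print(f"{name} = {value}")
```

Values like these are where tuning begins. The right numbers come from watching real queries run against your own connectors and data volumes.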

## Is Bare Metal Right for Your Trino Workload?

Honestly? Not for everyone. If you're running Trino for ad-hoc queries a few times a week and your data volumes are modest, the cloud's convenience probably outweighs the cost savings. Spinning up an EMR cluster on demand and tearing it down still makes sense at that scale.

But if you've got a data platform team running Trino continuously, with multiple concurrent users, querying across large datasets, and you're watching your cloud bill climb every month — bare metal is worth a serious look. The break-even point is usually somewhere around $3,000–$5,000/month in cloud spend. Below that, cloud is probably fine. Above it, you're leaving real money on the table.

We're happy to do a cost analysis if you want to see the actual numbers for your workload. No pressure, no sales pitch — just math.

[Get in touch with our team](https://bitrefinery.com/contact) and we'll figure out together whether bare metal Trino makes sense for what you're building.
