AWS Costs Keep Rising. Here’s What NYC Mid-Market Companies Are Doing About It.

Metro Colo Advisory
May 12, 2026
Cloud Repatriation

For two decades, cloud pricing moved in one direction: down. AWS built its reputation on passing efficiency gains back to customers — regular price reductions that made it easy to justify putting more workloads in the cloud, not less.

That era is over.

On January 4, 2026, AWS raised EC2 Capacity Block prices for GPU instances by about 15% across all regions. It did so quietly, over a weekend, with no formal announcement to customers. A company that once proudly touted “regular price reductions” as a core commitment raised prices on critical infrastructure without so much as an email to affected accounts.

For NYC mid-market companies running stable workloads on AWS, this is a turning point worth paying attention to. Not because one price increase changes everything — but because it confirms a structural shift in cloud economics that the smartest infrastructure teams in New York are already responding to.

This post covers what actually happened with AWS pricing, why it’s part of a larger pattern, which workloads are worth evaluating for a move, and how NYC mid-market companies are approaching the analysis in 2026.

What Actually Happened With AWS Pricing in January 2026

The January increase targeted EC2 Capacity Blocks — the mechanism AWS uses to let companies pre-book GPU capacity for AI and machine learning workloads. The p5e.48xlarge instance, which runs eight NVIDIA H200 GPUs, jumped from $34.61 to $39.80 per hour across most regions. For teams running continuous GPU workloads, that translates to over $3,700 in additional monthly costs per instance — before storage, egress, and support fees.

The p5en.48xlarge followed the same pattern, rising from $36.18 to $41.61 per hour. In the US West (N. California) region, the p5e.48xlarge went from $43.26 to $49.75 per hour, an additional $4,673 per month per instance for teams in that region.
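As a rough check on the figures above, the sketch below converts each hourly increase into a monthly cost for one instance running around the clock. The 720-hour month (30 days) is our assumption, not something AWS publishes; a 730-hour average month would give slightly higher numbers.

```python
# Hourly Capacity Block prices quoted above (USD per instance-hour): (old, new).
PRICES = {
    "p5e.48xlarge (most regions)": (34.61, 39.80),
    "p5en.48xlarge (most regions)": (36.18, 41.61),
    "p5e.48xlarge (N. California)": (43.26, 49.75),
}

HOURS_PER_MONTH = 720  # assumption: 30-day month, instance running 24/7


def monthly_increase(old_hourly: float, new_hourly: float,
                     hours: int = HOURS_PER_MONTH) -> float:
    """Extra monthly cost for one instance that runs continuously."""
    return round((new_hourly - old_hourly) * hours, 2)


def percent_increase(old_hourly: float, new_hourly: float) -> float:
    """Relative price change, in percent, rounded to one decimal place."""
    return round((new_hourly / old_hourly - 1) * 100, 1)


for name, (old, new) in PRICES.items():
    print(f"{name}: +{percent_increase(old, new)}% "
          f"-> +${monthly_increase(old, new):,.2f} per month")
```

Under that assumption, the three increases work out to roughly $3,737, $3,910, and $4,673 per instance per month, each a 15% rise.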

Why Enterprise Discount Programs Didn’t Protect You

Here’s what makes this particularly frustrating for enterprise customers: if you have an Enterprise Discount Program with AWS, your discount didn’t protect you from the absolute dollar impact.

EDP agreements are typically structured as a percentage off public pricing. When the public price goes up 15%, your discounted rate goes up 15% as well, so the dollar amount you pay rises even though the discount percentage stays the same. A company with a 20% EDP discount on p5e instances went from paying $27.69 per hour to $31.84 per hour. The discount percentage is unchanged. The monthly bill increased by roughly $2,990 per instance, assuming a 720-hour month.
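The same arithmetic can be sketched in a few lines. This assumes the simplest EDP structure, a flat percentage off the public hourly rate, and a 720-hour month; real agreements can carry tiers and commitments that this ignores. The result lands close to the figure quoted above, and the exact value depends on the hours-per-month assumption and rounding.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month, instance running 24/7


def edp_rate(public_hourly: float, discount_pct: float) -> float:
    """Hourly rate after a flat percentage-off-public-price discount."""
    return public_hourly * (1 - discount_pct / 100)


def edp_impact(old_public: float, new_public: float, discount_pct: float,
               hours: int = HOURS_PER_MONTH) -> dict:
    """Compare discounted rates before and after a public price change."""
    old_rate = edp_rate(old_public, discount_pct)
    new_rate = edp_rate(new_public, discount_pct)
    return {
        "old_rate": round(old_rate, 2),
        "new_rate": round(new_rate, 2),
        "pct_change": round((new_rate / old_rate - 1) * 100, 1),
        "monthly_increase": round((new_rate - old_rate) * hours, 2),
    }


# p5e.48xlarge public prices from the section above, with a 20% discount.
impact = edp_impact(34.61, 39.80, discount_pct=20)
print(impact)
```

The percentage change on the discounted rate matches the change on the public rate, which is the point: a percentage discount shrinks the increase in dollars but does not remove it.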

Why AWS Did It — And Why It’s Not Going Away

AWS justified the increase by citing “changing supply and demand patterns.” The reality is more structural and more durable than that explanation suggests.

GPU supply constraints, surging prices for HBM from memory suppliers such as SK Hynix, and data center power costs that have increased dramatically in key markets are all putting sustained upward pressure on the cost of running GPU infrastructure at cloud scale. These are not temporary blips that normalize in a quarter or two. They are structural shifts in the economics of AI compute infrastructure that will influence cloud pricing for years.

The January increase was the most visible manifestation of this shift. It will not be the last.

The Broader AWS Cost Pattern — Beyond GPU Pricing

The GPU price increase is the most visible recent example, but it’s part of a larger pattern that NYC mid-market IT and finance teams are increasingly reckoning with. Understanding the full picture matters because addressing only the GPU cost line misses most of the problem.

Cloud Waste Is Consuming 28-35% of Enterprise Budgets

Industry research consistently shows that 28 to 35% of enterprise cloud spend goes to idle resources, over-provisioned compute, and orphaned storage — budget that produces no business value whatsoever. For a company spending $100,000 per month on AWS, that’s potentially $30,000 every month in pure waste.

This waste is not the result of carelessness. It’s the predictable outcome of how cloud procurement works. Resources are provisioned quickly, projects change direction, and the idle infrastructure stays on the bill because no one has the mandate or the visibility to clean it up systematically. Cloud providers have no incentive to flag this to you. The waste is their revenue.

Egress Costs Are Growing Faster Than Compute Costs

As companies build more sophisticated data architectures — moving data between services, feeding analytics pipelines, serving applications across regions — egress costs compound in ways that were not obvious at the time the workload was designed. Data that moves out of AWS, or between AWS services in different availability zones, generates charges that have nothing to do with the compute powering your applications.

For NYC mid-market companies in financial services, healthcare, and media — industries where large data sets move constantly between systems — egress costs have become a meaningful and growing line item. In some configurations they represent 15 to 25% of total AWS spend.

Stable Workloads Are Paying Cloud Rates for Utility They Don’t Use

The most fundamental economic problem with running stable workloads on public cloud is not pricing — it’s the mismatch between what you’re paying for and what you’re actually using.

Cloud’s core value proposition is elasticity. You pay for what you use, scale up when you need more, scale down when you don’t. That value is real — for workloads that actually behave that way.

Stable, predictable compute — the kind that runs the same way every day, 24 hours a day, 365 days a year — does not benefit from elasticity. A database cluster that runs at 70% utilization every hour of every day is behaving like owned infrastructure. Paying cloud rates for that workload means paying a premium for flexibility you are never using.

This is the economic foundation of cloud repatriation — and it’s why the math works so reliably for the right workloads.

What Is Cloud Repatriation — And What It Is Not

Cloud repatriation is the process of moving workloads that are currently running on public cloud infrastructure back to dedicated physical infrastructure — typically in a colocation facility — where the economics are more favorable for stable, predictable compute.

It is not a rejection of cloud. Companies that execute cloud repatriation well do not move everything off AWS or Azure. They move the specific workloads where dedicated infrastructure is economically and operationally superior — and they keep the workloads where cloud’s flexibility and managed services genuinely add value.

The distinction matters because the wrong framing leads to the wrong decisions. Cloud repatriation is workload optimization, not cloud abandonment. The goal is to make sure every workload is running where it makes the most sense — not to prove a point about cloud versus on-premise.

What the Numbers Look Like in Practice

The economics of cloud repatriation for NYC mid-market companies in 2026 follow a consistent pattern for workloads that qualify.

A company paying $80,000 to $120,000 per month on AWS for stable compute workloads — running at consistent utilization, not scaling dramatically day to day — can typically move those workloads to dedicated colocation infrastructure in the NYC market at 40 to 60% of the current cloud cost. The savings are not guaranteed and they are not uniform across all workloads. But for the right workload profile the math is consistently compelling.

Dropbox saved $74 million over two years by moving infrastructure off AWS. 37signals cut their annual infrastructure bill from $3.2 million to under $1 million. These are not hyperscalers with armies of engineers. They are companies that ran the math honestly and made rational infrastructure decisions when the economics justified it.

The NYC mid-market equivalent of those decisions is happening now — driven by rising cloud costs, maturing infrastructure teams, and colocation facilities that have become significantly more capable and accessible for mid-market deployments.

The NYC Colocation Market in 2026 — Why the Timing Matters

New York City sits at the center of one of the most sophisticated colocation markets in the world. The facilities available to NYC mid-market companies are world-class — and they’re increasingly accessible to organizations that historically assumed colocation was only for large enterprises.

The Key Facilities Serving NYC Mid-Market Companies

Equinix NY4 in Secaucus is the financial ecosystem hub for the NYC metro area. Every major exchange, trading firm, and financial data provider has infrastructure here. For hedge funds, asset managers, and fintech companies where latency to the exchanges matters, NY4 is the irreplaceable option. Equinix also operates NY2, NY5, NY7, and NY9 across the metro area serving different use cases and budgets.

CoreSite NY1 in Manhattan offers direct access to the Open Cloud Exchange — direct on-ramps to AWS, Azure, and Google Cloud that make hybrid architectures practical and cost-effective. CoreSite’s NY3 facility in Secaucus, which opened in September 2025 with 138,000 square feet and connectivity to 80+ networks, has significantly expanded the mid-market options in that corridor.

DataBank LGA1 and LGA2 — at 111 8th Avenue and 60 Hudson Street respectively — are the strongest options for healthcare and compliance-sensitive workloads. DataBank’s HIPAA BAA is the most comprehensive in the NYC market, and their LGA3 facility in Orangeburg offers high-density power options up to 100kW per cabinet with liquid cooling for AI inference workloads.

Digital Realty’s JFK12 and JFK13 facilities, along with 60 Hudson Street and 32 Avenue of the Americas, serve media, fintech, and high-bandwidth workloads with carrier-dense connectivity that public cloud simply cannot replicate at competitive pricing.

Moving workloads to any of these facilities does not mean cutting off cloud. Every major NYC colocation facility offers direct cloud connectivity — the ability to maintain a hybrid architecture where stable workloads run on dedicated infrastructure and variable or managed workloads stay on cloud.

Why Right Now Is the Right Window

Colocation contracts are 3 to 5 year commitments. The companies locking in NYC colocation deals right now are securing pricing and terms that will govern their infrastructure costs for the next half decade. Power costs are rising. Vacancy rates in primary NYC facilities are at historic lows. The window to lock in favorable long-term terms is narrowing.

The companies moving fastest — benchmarking their current cloud costs, modeling the repatriation math, and locking in colocation deals before rates climb further — are the ones who will look back in three years and recognize it as one of the best infrastructure decisions they made.

Which Workloads Are Worth Evaluating First

Not every workload belongs in colocation. The analysis has to be honest about which workloads qualify and which don’t. Here is the framework we use with every client.

Workloads That Are Strong Candidates for Repatriation

High-utilization compute running 24/7. If a server or cluster runs at consistent utilization around the clock — above 60 to 70% utilization with minimal variation — you are paying cloud rates for a resource that behaves like owned infrastructure. This is the clearest economic case for repatriation and the workload type where the savings are most reliable.

Stable databases and storage with high egress. Predictable database workloads — particularly those with significant egress costs as data moves in and out of cloud — are frequently better candidates for dedicated infrastructure than they appear at first analysis. The compute cost savings combine with egress cost elimination to produce a more compelling total cost reduction than either factor alone.

AI inference at scale. Companies running persistent inference workloads — where a model is serving requests continuously rather than training episodically — face some of the steepest and fastest-rising cloud GPU costs in the market. GPU rental rates have risen significantly since late 2025 and the trajectory is not favorable. For inference workloads running around the clock, the economics of owned GPU infrastructure in a NYC colocation facility are increasingly compelling and should be modeled seriously before the next cloud contract renewal.

Compliance-sensitive data with specific residency requirements. Healthcare data subject to HIPAA, financial data subject to SEC and FINRA requirements, and legal data with strict confidentiality obligations all carry compliance overhead in cloud environments. Purpose-built colocation infrastructure with dedicated HIPAA BAAs, SOC 2 Type II certification, and financial services compliance frameworks can simplify the compliance posture meaningfully — and in some cases eliminate compliance overhead that has real cost attached to it in cloud.

Workloads with predictable growth trajectories. If you can model your infrastructure requirements 3 years forward with reasonable confidence — and the trajectory is steady growth rather than unpredictable spikes — colocation’s fixed pricing structure becomes increasingly attractive relative to cloud’s variable pricing that rises with every additional resource you provision.

Workloads That Belong in Cloud

The honest answer includes both sides. These workloads should stay on cloud regardless of what your AWS bill looks like.

Genuinely variable workloads that scale dramatically during peak periods and go quiet during off-peak: e-commerce traffic spikes, batch processing that runs monthly, seasonal workloads with 5x to 10x utilization swings. Cloud’s elasticity is real value for these workloads and you’re actually using what you’re paying for.

Early-stage or pre-revenue businesses where capital efficiency matters more than unit economics. The zero upfront cost of cloud is a genuine advantage when your infrastructure requirements are uncertain and your runway matters more than your infrastructure margin.

Workloads requiring global distribution across 20 or more regions simultaneously. Colocation serves the NYC metro area with exceptional depth. It is not the right answer for a workload that needs to be physically close to users in 30 countries.

Teams without internal infrastructure experience. The operational simplicity of cloud has real value for organizations that have never managed physical infrastructure. Colocation shifts operational responsibility in ways that require internal capability to handle. If that capability doesn’t exist, building it has a real cost that belongs in the analysis.
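The two lists above can be read as a first-pass triage. Here is a minimal sketch of that reading. The thresholds (60% utilization, 5x swings, 20 regions) come from the text; the Workload fields and the triage function are hypothetical names for illustration, and a real evaluation would weigh compliance, egress, and growth as well.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    avg_utilization: float   # 0.0 to 1.0, averaged around the clock
    peak_to_trough: float    # ratio of peak demand to off-peak demand
    regions: int             # number of regions the workload must serve
    has_infra_team: bool     # internal experience running physical hardware


def triage(w: Workload) -> tuple[str, list[str]]:
    """Return a first-pass recommendation plus the reasons behind it."""
    stay_reasons = []
    if w.peak_to_trough >= 5:
        stay_reasons.append("utilization swings of 5x or more")
    if w.regions >= 20:
        stay_reasons.append("needs 20 or more regions")
    if not w.has_infra_team:
        stay_reasons.append("no internal infrastructure experience")
    if stay_reasons:
        return "stay in cloud", stay_reasons
    if w.avg_utilization >= 0.60:
        return "evaluate for repatriation", ["steady utilization at or above 60%"]
    return "needs a closer look", []


db = Workload("orders-db", avg_utilization=0.70, peak_to_trough=1.2,
              regions=1, has_infra_team=True)
shop = Workload("storefront", avg_utilization=0.25, peak_to_trough=8.0,
                regions=3, has_infra_team=True)
print(triage(db))
print(triage(shop))
```

Checking the stay-in-cloud conditions first is a deliberate choice in this sketch: a workload that fails any of them is kept on cloud even if its average utilization looks high.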

How to Start the Analysis — A Practical Framework

The cloud repatriation analysis that produces reliable conclusions follows a specific process. Here is what it looks like in practice.

Step 1 — Identify Your Stable Workloads

Pull your AWS cost breakdown by service and look for the compute, database, and storage line items that have been consistent month over month for the past 6 to 12 months. Workloads with less than 20% variation in monthly cost are your candidates. Variable workloads — anything with significant month-to-month swings — stay in cloud.
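One way to apply that 20% test is sketched below. The service names and dollar amounts are made up for illustration, and measuring variation as (max - min) / mean is our assumption; the same screen could use a different measure, and a workload with steady growth may need its trend removed before it passes.

```python
from statistics import mean

VARIATION_THRESHOLD = 0.20  # "less than 20% variation" from the step above
MIN_MONTHS = 6              # shortest history the step above suggests

# Hypothetical monthly cost history in USD per line item (illustrative only).
monthly_costs = {
    "ec2-db-cluster": [41000, 42500, 41800, 43000, 42200, 41900],
    "s3-archive": [9000, 9100, 9300, 9250, 9400, 9500],
    "batch-analytics": [4000, 15000, 3500, 22000, 5000, 18000],
}


def variation(costs: list[float]) -> float:
    """Spread of monthly cost relative to its average: (max - min) / mean."""
    return (max(costs) - min(costs)) / mean(costs)


def stable_candidates(history: dict[str, list[float]],
                      threshold: float = VARIATION_THRESHOLD,
                      min_months: int = MIN_MONTHS) -> list[str]:
    """Names of line items with enough history and low cost variation."""
    return sorted(
        name for name, costs in history.items()
        if len(costs) >= min_months and variation(costs) < threshold
    )


print(stable_candidates(monthly_costs))
```

With these sample numbers the database cluster and the archive storage pass the screen, while the batch job, whose cost swings several-fold month to month, does not.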

Step 2 — Model the Colocation Economics

For each candidate workload, model what dedicated infrastructure in a NYC colocation facility would actually cost. This requires four inputs: power draw in kilowatts, hardware cost amortized over 5 years, cross-connect costs for connectivity, and the colocation facility rate per kilowatt.

The all-in rate for NYC mid-market colocation deployments in 2026 — including power delivery, cross-connects, and standard services — runs approximately $250 per kilowatt per month for most deployments. Larger deployments with higher power density and longer terms negotiate below that. Smaller deployments or those with specialized requirements may run above it.
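A minimal sketch of that model follows, using the four inputs named above. The $250 per kilowatt default and the 5-year amortization come from the text; the 40 kW footprint and $600,000 hardware budget are made-up inputs. Because the $250 figure is described as all-in, the separate cross-connect line defaults to zero here and should only be filled in when a facility quotes it separately.

```python
from dataclasses import dataclass


@dataclass
class ColoInputs:
    power_kw: float                     # contracted power draw in kilowatts
    hardware_cost: float                # up-front hardware purchase, USD
    cross_connect_monthly: float = 0.0  # only if quoted outside the kW rate
    rate_per_kw: float = 250.0          # approximate all-in USD per kW per month
    amortization_months: int = 60       # hardware amortized over 5 years


def monthly_colo_cost(inputs: ColoInputs) -> float:
    """Estimated monthly cost: facility charge + amortized hardware + extras."""
    facility = inputs.power_kw * inputs.rate_per_kw
    hardware = inputs.hardware_cost / inputs.amortization_months
    return facility + hardware + inputs.cross_connect_monthly


# Hypothetical deployment: 40 kW of power and $600,000 of hardware.
example = ColoInputs(power_kw=40, hardware_cost=600_000)
print(f"${monthly_colo_cost(example):,.0f} per month")
```

For this hypothetical deployment the facility charge and the amortized hardware each come to $10,000 a month, which shows why hardware cannot be left out of the comparison.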

Step 3 — Add Egress Savings

Calculate what you’re currently spending on data egress from AWS for the candidate workloads. In colocation, data that moves between your own systems within the facility — or over direct cross-connects to your other providers — does not generate egress charges. For workloads with meaningful egress spend, this line item can meaningfully improve the repatriation economics beyond what the compute comparison alone suggests.
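To see how the egress line changes the comparison, the sketch below reports savings with and without it. All dollar figures are hypothetical, and residual_egress is our own placeholder for any transfer charges that remain after the move, such as traffic still leaving the cloud side of a hybrid setup.

```python
def monthly_savings(cloud_compute: float, cloud_egress: float,
                    colo_cost: float,
                    residual_egress: float = 0.0) -> tuple[float, float]:
    """Return (compute-only savings, savings including avoided egress)."""
    compute_only = cloud_compute - colo_cost
    egress_avoided = cloud_egress - residual_egress
    return compute_only, compute_only + egress_avoided


# Hypothetical workload: $60,000 compute and $12,000 egress on cloud,
# $35,000 all-in on colocation, $2,000 of egress that would remain.
compute_only, with_egress = monthly_savings(
    cloud_compute=60_000, cloud_egress=12_000,
    colo_cost=35_000, residual_egress=2_000,
)
print(f"Compute only: ${compute_only:,.0f} per month")
print(f"With egress:  ${with_egress:,.0f} per month")
```

In this example the egress line adds $10,000 a month to a $25,000 compute saving. Pull the real egress number from the data transfer lines of your bill rather than estimating it from published rates.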

Step 4 — Model the Full 3-Year Picture

The comparison that matters is not month one. It’s the total cost of ownership over the contract term. Cloud costs compound upward as your workloads grow and as provider price increases like the January change take effect. Colocation costs are largely fixed for the contract term with modest escalation. The 3-year view almost always shows a larger advantage for repatriation than the month-one comparison suggests.
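A simple way to model the 3-year view is to step each monthly cost up once a year and add up 36 months. The starting costs and the 10% and 3% annual rates below are placeholders we chose for illustration, not market data; substitute your own growth forecast and the escalator in your colocation quote.

```python
def cumulative_cost(monthly_start: float, annual_increase: float,
                    months: int = 36) -> float:
    """Total cost over the term, with the monthly cost stepping up each year."""
    total = 0.0
    for month in range(months):
        year = month // 12  # 0 for the first 12 months, 1 for the next 12, ...
        total += monthly_start * (1 + annual_increase) ** year
    return total


# Hypothetical inputs: cloud at $100,000/month growing 10% a year,
# colocation at $50,000/month with a 3% annual escalator.
cloud_total = cumulative_cost(100_000, 0.10)
colo_total = cumulative_cost(50_000, 0.03)

month_one_gap = 100_000 - 50_000
print(f"Month-one gap x 36: ${month_one_gap * 36:,.0f}")
print(f"Modeled 3-year gap: ${cloud_total - colo_total:,.0f}")
```

With these placeholder rates the modeled 3-year gap is about $2.1 million against $1.8 million from multiplying the month-one gap by 36. The difference comes entirely from the two cost lines growing at different rates.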

Step 5 — Account for Migration and Operational Costs

An honest analysis includes the cost of getting there. Hardware procurement, migration engineering, and any additional operational overhead from managing physical infrastructure all belong in the model. These costs are real — and they’re typically one-time or low-ongoing costs that are recovered within 12 to 18 months for workloads where the economics are compelling.
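The payback check is a one-line division, sketched below with hypothetical numbers. One caution on inputs: if hardware is already amortized into the monthly colocation figure, leave it out of the one-time cost here so it is not counted twice.

```python
import math
from typing import Optional


def payback_months(one_time_cost: float, monthly_cloud: float,
                   monthly_colo: float,
                   added_ops_monthly: float = 0.0) -> Optional[int]:
    """Whole months needed for net monthly savings to cover one-time costs.

    Returns None when there are no net savings, meaning the move never pays back.
    """
    net_savings = monthly_cloud - monthly_colo - added_ops_monthly
    if net_savings <= 0:
        return None
    return math.ceil(one_time_cost / net_savings)


# Hypothetical case: $450,000 of one-time migration and procurement cost,
# $100,000/month on cloud, $60,000/month on colocation, $5,000/month extra ops.
months = payback_months(450_000, 100_000, 60_000, added_ops_monthly=5_000)
print(f"Payback in {months} months")
```

This example pays back in 13 months, inside the 12 to 18 month range described above. A result of None, or a payback longer than the contract term, is a signal to leave that workload where it is.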

Run Your Numbers — Free

The framework above describes the analysis. Running it for your specific environment requires your specific numbers — your actual AWS spend, your workload utilization data, your power requirements, and your compliance needs.

That’s why we built the Metro Colo Advisory Cloud vs. Colo Calculator — a tool that walks you through the inputs specific to your environment and produces a realistic estimate of what repatriation economics would look like for your workloads, with the assumptions made transparent throughout.

Run your numbers with the Cloud vs. Colo Calculator →

The calculator will tell you whether the numbers suggest a deeper analysis is worth your time. If they do — our team can run that full analysis at no cost to you, with access to current NYC market pricing that isn’t publicly available.

What the NYC Mid-Market Is Actually Doing Right Now

The companies responding most effectively to rising AWS costs in the NYC mid-market are not making wholesale infrastructure decisions based on a single price increase. They’re doing something more disciplined.

They’re separating their workloads into two categories — stable and variable — and analyzing each category honestly. They’re running the 3-year total cost comparison with real numbers, not back-of-envelope estimates. They’re engaging with colocation facilities to understand current market pricing before their next cloud commitment renews. And they’re making workload-by-workload decisions based on the analysis, not on a predetermined conclusion.

The result in most cases is a hybrid architecture — stable, predictable workloads in colocation, variable and managed workloads on cloud — that costs significantly less than an all-cloud approach while maintaining the flexibility that matters for the workloads that genuinely need it.

This is not a new idea. It’s a mature infrastructure strategy that the largest enterprises have executed for years. What’s changed in 2026 is that it’s become accessible and economically compelling for NYC mid-market companies at the $50,000 to $150,000 per month cloud spend level — the companies that AWS’s January price increase hit directly.

The Bottom Line

AWS’s January 2026 price increase broke a 20-year pattern of declining cloud costs. The structural forces behind it — GPU supply constraints, rising power costs, surging memory prices — are not going away. The trajectory of cloud costs for compute-intensive workloads is up, not down.

For NYC mid-market companies running stable workloads, this is a reasonable moment to pressure-test the assumption that cloud is always the right answer. Not because colocation is always better — it isn’t — but because the economics have shifted enough that the analysis deserves to be run honestly, with current numbers, before the next cloud commitment renews.

The companies that run that analysis now and act where the numbers justify it will have locked in infrastructure costs at rates that look increasingly favorable as cloud pricing continues its current trajectory. The companies that don’t will be having the same conversation again in two years — except the savings opportunity will be smaller because they’ll have two more years of above-market cloud spend behind them.

If you want to understand what the analysis looks like for your specific environment — your workloads, your spend, your compliance requirements — we’re happy to walk through it with you. The conversation is free, the analysis is free, and there is no obligation to act on what we find.

Get a free infrastructure assessment →

Frequently Asked Questions

How much does NYC colocation cost compared to AWS? For stable workloads, NYC colocation typically runs 40 to 60% of equivalent AWS compute costs on a total cost of ownership basis over a 3-year period. The all-in rate for mid-market NYC colocation deployments in 2026 is approximately $250 per kilowatt per month, which includes power delivery, cross-connects, and standard services.

Which NYC colocation facilities are best for mid-market companies? The primary options for NYC mid-market companies are Equinix NY4 in Secaucus for financial services and trading workloads, CoreSite NY1 in Manhattan for cloud-connected hybrid architectures, DataBank LGA1 and LGA2 for healthcare and compliance-sensitive workloads, and Digital Realty’s 60 Hudson Street for media and high-bandwidth deployments. The right facility depends entirely on your workload requirements, compliance needs, and connectivity priorities.

What workloads should stay on AWS? Genuinely variable workloads that scale unpredictably, early-stage companies where capital efficiency matters more than unit economics, workloads requiring global distribution across many regions, and organizations without internal infrastructure experience to manage physical hardware should stay on cloud.

How long does cloud repatriation take? For a well-planned mid-market repatriation — hardware procurement, facility build-out, and workload migration — a realistic timeline is 3 to 6 months from decision to live in production. The timeline depends heavily on hardware lead times, facility availability, and the complexity of the workloads being moved.

Is cloud repatriation right for my company? The honest answer requires running the numbers for your specific environment. The general indicators that it’s worth analyzing: monthly cloud spend above $50,000, workloads running at consistent utilization 24/7, compliance requirements that create overhead in cloud environments, and GPU workloads running continuous inference. Metro Colo Advisory runs this analysis for NYC mid-market companies for free.

Metro Colo Advisory is New York City’s independent colocation brokerage. We work exclusively on behalf of clients — never providers — helping mid-market companies navigate colocation decisions with no conflicts of interest. Our service is free to clients; we’re compensated by providers only when we successfully place a deal.
