Nvidia Blackwell and GeForce RTX 50-Series GPUs: Rumors, specifications, release dates, pricing, and everything we know

The next-generation Nvidia Blackwell GPU architecture and RTX 50-series GPUs are coming, right on schedule. While Nvidia hasn’t officially provided any timeframe for when the consumer parts will be announced, there have been plenty of rumors and supposed leaks of data. We’ve spoken with some people as well, and the expectation is that we’ll see at least the RTX 5090 and RTX 5080 by the time the holiday season kicks off in October or November. Blackwell GPUs will then join the ranks of the best graphics cards.

Nvidia has also provided many of the core details for its data center Blackwell B200 GPU. While the AI and data center variants will inevitably differ, there are some shared aspects between past consumer and data center Nvidia GPUs, and we expect that to continue. That means that we at least have some good indications of certain aspects of the future RTX 50-series GPUs.

There are still a lot of unknowns, with leaks that look more like people throwing darts at the wall than sharing actual inside information. We’ll cover the main rumors along with other details, including the release date, potential specifications, and other technology. Over the coming months, we can expect additional details to come out, and we’ll be updating this article as information becomes available. Here’s everything we know about Nvidia Blackwell and the RTX 50-series GPUs.

Blackwell and RTX 50-series Release Dates

Of all the unknowns, the release date (at least for the first Blackwell GPUs) is the easiest to pin down. Based on what we’ve personally heard, we expect the RTX 50-series to launch by the end of the year. Nvidia tends to be good at timing new GPU releases, and getting the top RTX 5090 and 5080 out before the November and December holiday shopping period makes the most sense.

There’s plenty of historical precedent here as well. The Ada Lovelace RTX 40-series GPUs first appeared in October 2022. The Ampere RTX 30-series GPUs first appeared in September 2020. Prior to that, RTX 20-series launched two years earlier in September 2018, and the GTX 10-series was in May/June 2016, with the GTX 900-series arriving in September 2014. That’s a full decade of new Nvidia GPU architectures arriving approximately every two years, and we see no reason for Nvidia to change tactics now.

It’s not just about the two-year consumer GPU cadence, either. Nvidia first revealed core details of the Hopper H100 architecture in March 2022 at its annual GPU Technology Conference (GTC). And in May 2020, it first revealed its Ampere A100 architecture, followed by the consumer variants a few months later. The same thing happened in 2018 as well, with Volta V100 and Turing, and in 2016, there was the Tesla P100 and Pascal. So, in the past four generations, we’ve learned first about the data center and AI GPUs, with the consumer GPUs revealed and launched later in the same year. Now, Nvidia just revealed the Blackwell B200 architecture, again at GTC, and it’s a safe bet we’ll hear about the consumer variants this fall.

We don’t know the exact names or models Nvidia plans for the next generation Blackwell parts. We’re confident we’ll have RTX 5090, RTX 5080, RTX 5070, and RTX 5060 cards, and probably some combination of Ti and/or Super variants. Some of those variants will undoubtedly come out during the mid-cycle refresh in 2025 or even early 2026. We’re also curious about whether or not Nvidia will have an RTX 5050 GPU — it skipped that level on desktops with the 40-series and 20-series, though the latter had the GTX 1660 and 1650 class GPUs.

Given the past patterns, we expect at least the top-tier RTX 5090 and 5080 to arrive this year, perhaps with an RTX 5070 Ti to keep them company. Or maybe Nvidia will have the RTX 5090, RTX 5080 Ti, and RTX 5080 launch this year. The mid-tier (based on the model numbers) 5070 and 5060 GPUs will then follow, most likely sometime in 2025, with the typical staggered release schedule.

TSMC 4NP, refined 4nm Nvidia

Nvidia’s B200 chips will use TSMC 4NP (Image credit: Nvidia)

One of the surprising announcements at GTC 2024 was that Blackwell B200 will use the TSMC 4NP node, or “4nm Nvidia Performance.” While it’s certainly true that many process names have become largely detached from physical characteristics, many expected Nvidia to move to a refined variant of TSMC’s cutting-edge N3 process technology. Instead, it opted for a refinement of the existing 4N node that’s already been used with Hopper and Ada Lovelace GPUs for the past two years.

Going this route certainly offers some cost savings, though TSMC doesn’t disclose the contract pricing agreements with its various partners. Blackwell B200 also uses a dual-chip solution, with the two halves linked via a 10 TB/s NV-HBI (Nvidia High Bandwidth Interface) connection. Perhaps Nvidia just didn’t think it needed to move to a 3nm-class node for this generation.

And yet, that opens the door for AMD and even Intel to potentially shift to a newer and more advanced process node, cramming more transistors into a smaller chip. Nvidia took a similar approach with the RTX 30-series, using a less expensive Samsung 8N process instead of the newer and better TSMC N7. It will be interesting to see if this has any major impact on how the various next-generation GPUs stack up.

Of course, it’s also possible that Blackwell B200 variants will use TSMC 4NP, while consumer chips use a different node. Much of that depends on how much of the core architecture gets shared between the data center and consumer variants and whether Nvidia thinks it’s beneficial to diversify. There’s precedent here for having different nodes and even manufacturers, as Ampere A100 used TSMC N7 while the RTX 30-series chips used Samsung 8N. GTX 10-series Pascal GP107 and GP108 were also made on Samsung’s 14LPP, while GP102, GP104, and GP106 were made on TSMC 16FF.

Next generation GDDR7 memory

GDDR7 chips were shown at GTC 2024 (Image credit: Tom’s Hardware)

It’s long been expected that the consumer and professional (i.e., not strictly data center) Blackwell GPUs will move to GDDR7 memory. All indications from GTC 2024 are that GDDR7 will be ready in time for the next generation of GPUs before the end of the year. In fact, Samsung and SK hynix showed off GDDR7 chips at GTC, and Micron confirmed that GDDR7 is also in production.

The current generation RTX 40-series GPUs use GDDR6X and GDDR6 memory, clocked at anywhere from 17Gbps to 23Gbps. GDDR7 has target speeds of up to 36Gbps, 50% higher than GDDR6X and 80% higher than vanilla GDDR6. SK hynix says it will even have 40Gbps chips, though the exact timeline for when those might be available wasn’t detailed. Regardless, this will provide a much-needed boost to memory bandwidth at all levels.

Of course, we don’t know if Nvidia will actually ship cards with memory clocked at 36Gbps. In the past, it used 24Gbps GDDR6X chips but clocked them at 22.4Gbps or 23Gbps—and some 24Gbps Micron chips were apparently down-binned to 21Gbps in the various RTX 4090 graphics cards that we tested. So, Nvidia could take 36Gbps memory but only run it at 32Gbps. That’s still a healthy bump to bandwidth.

At 36Gbps, a 384-bit GDDR7 memory interface can provide 1728 GB/s of bandwidth. That’s 71% higher than what we currently get on the RTX 4090. A 256-bit interface would deliver 1152 GB/s, compared to the 4080 Super’s 736 GB/s — a 57% increase. 192-bit cards would have 864 GB/s, and even 128-bit cards would get up to 576 GB/s of raw bandwidth. Nvidia might even go so far as to create a 96-bit interface with 432 GB/s of bandwidth.
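
Those bandwidth figures come from simple arithmetic: the bus width in bits multiplied by the per-pin data rate in Gbps, divided by eight bits per byte. Here is a minimal Python sketch of that calculation. The function name `memory_bandwidth_gbs` is our own illustration, and the 36Gbps speed is still an assumption rather than a confirmed spec.

```python
def memory_bandwidth_gbs(bus_width_bits: int, speed_gbps: float) -> float:
    """Raw memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * speed_gbps / 8


# Hypothetical GDDR7 configurations, all at an assumed 36Gbps.
for width in (384, 256, 192, 128, 96):
    print(f"{width}-bit: {memory_bandwidth_gbs(width, 36):.0f} GB/s")

# Uplift versus the RTX 4090, which pairs a 384-bit bus with 21Gbps GDDR6X.
rtx_4090 = memory_bandwidth_gbs(384, 21)
uplift = memory_bandwidth_gbs(384, 36) / rtx_4090 - 1
print(f"384-bit GDDR7 vs. RTX 4090: +{uplift:.0%}")
```

Swapping in 32 for the speed shows what a more conservative memory clock would deliver on the same bus widths.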

Of course, we also expect that Nvidia will keep using a large L2 cache with Blackwell. This will provide even more effective memory bandwidth — every cache hit means a memory access that doesn’t need to happen. With a 50% cache hit rate as an example, that would double the effective memory bandwidth, though note that hit rates vary by game and settings, with higher resolutions in particular reducing the hit rate.
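
To see why a 50% hit rate doubles effective bandwidth, consider a deliberately simplified model in which every L2 hit costs no memory traffic at all, so the GDDR7 chips only have to serve the misses. Under that assumption, effective bandwidth is the raw figure divided by the miss rate. The function name `effective_bandwidth_gbs` and the sample hit rates are hypothetical; real hit rates depend on the game, the settings, and the resolution.

```python
def effective_bandwidth_gbs(raw_gbs: float, hit_rate: float) -> float:
    """Effective bandwidth if only cache misses reach VRAM (simplified model)."""
    if not 0 <= hit_rate < 1:
        raise ValueError("hit_rate must be in the range [0, 1)")
    return raw_gbs / (1 - hit_rate)


# Illustrative hit rates applied to a 256-bit, 36Gbps configuration (1152 GB/s raw).
for hit_rate in (0.0, 0.25, 0.5):
    effective = effective_bandwidth_gbs(1152, hit_rate)
    print(f"{hit_rate:.0%} hit rate: {effective:.0f} GB/s effective")
```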

GDDR7 also potentially addresses the issue of memory capacity versus interface width. At GTC, we were told that 16Gb (2GB) chips are in production, but 24Gb (3GB) chips will also be coming. The larger chips with non-power-of-two capacity probably won’t be ready until 2025, but those will be more important for lower-tier parts. There’s no pressing need for consumer graphics cards to have more than 24GB of memory, though we could see a 32GB RTX 5090 (with a 512-bit interface). Even 16GB, with a 256-bit interface, is generally sufficient for gaming.

However, the availability of 24Gb chips means Nvidia (and AMD and Intel) could put 18GB of VRAM on a 192-bit interface, 12GB on a 128-bit interface, and 9GB on a 96-bit interface. We could even see 24GB cards with a 256-bit interface and 36GB on a 384-bit interface, with double that capacity for professional cards. Or how about a 512-bit interface on a professional card with ‘clamshell’ memory (chips on both sides of the PCB), packing a whopping 96GB of VRAM? That would be excellent for certain AI and professional workloads, and it’s more likely a case of “when” rather than “if” we’ll see such a card.
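
The capacity math works the same way for every configuration mentioned here. The sketch below assumes one memory chip per 32 bits of bus width, with clamshell mode doubling the chip count, and converts chip density from gigabits to gigabytes. The function name `vram_capacity_gb` is hypothetical and used only for illustration.

```python
def vram_capacity_gb(bus_width_bits: int, chip_gbit: int, clamshell: bool = False) -> float:
    """Total VRAM in GB, assuming one chip per 32 bits of bus width."""
    chips = bus_width_bits // 32
    if clamshell:
        chips *= 2  # chips on both sides of the PCB share each 32-bit channel
    return chips * chip_gbit / 8  # gigabits per chip -> gigabytes


# 24Gb (3GB) chips on the narrower buses discussed above.
for width in (192, 128, 96):
    print(f"{width}-bit, 24Gb chips: {vram_capacity_gb(width, 24):.0f}GB")

# A 512-bit clamshell professional card with 24Gb chips.
print(f"512-bit clamshell, 24Gb chips: {vram_capacity_gb(512, 24, clamshell=True):.0f}GB")
```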

Blackwell architectural updates

The Blackwell architecture will almost certainly contain various updates and enhancements over the previous generation Ada Lovelace architecture, but right now what we know for certain can be summed up in two words: not much. Still, every generation of Nvidia GPUs has contained at least a few architectural upgrades, and we can expect the same this round.

Nvidia has increased the potential ray tracing performance in every RTX generation, and Blackwell seems likely to continue that trend. With more games like Alan Wake 2 and Cyberpunk 2077 pushing full path tracing (not to mention the potential for modders to use RTX Remix to enhance older DirectX 8 and DirectX 9 games with full path tracing), there’s even more need for higher ray tracing throughput. There will probably be other RT-centric updates as well, just like Ada offered SER (Shader Execution Reordering), OMM (Opacity Micro-Maps), and DMM (Displaced Micro-Meshes). But what those changes might be is as yet unknown.

What we do know is that the data center Blackwell B200 GPU has reworked the tensor cores yet again, offering native support for FP4 and FP6 numerical formats. Those will be primarily useful for AI inference, and considering the consumer GPUs will do double duty with the professional cards, it’s a safe bet that all Blackwell chips will support FP4 and FP6 as well. (Ada added FP8 support to its tensor cores, as a related example.)

What other architectural changes might Blackwell bring? If we’re correct that Nvidia is sticking with TSMC 4NP for the consumer parts, we wouldn’t anticipate massive alterations. There will still be a large L2 cache, and of course, the enhanced OFA (Optical Flow Accelerator) used for DLSS 3 frame generation will still be present. It might even get some tweaks to improve it, though we’ll have to wait and see.

One potential hint at what could happen with the fastest solutions comes from the Blackwell B200. Nvidia created NV-HBI to link two identical chips together into one massive GPU. This isn’t SLI but rather a chiplet-style approach with massive inter-chip bandwidth so that the two chips functionally behave as a single GPU. Could NV-HBI show up on consumer GPUs as well? We think there’s a reasonable possibility, probably not on the lower-spec chips but potentially on the largest chip.

Raw compute, for both graphics and more general workloads, will almost certainly increase by a decent amount, though probably more along the lines of a 30% boost rather than a 50% increase. RTX 4080 offers 40 TeraFLOPS of FP32 compute compared to the 3080’s 30 TeraFLOPS, for example — a 33% increase — while the 4090 offers 83 TeraFLOPS compared to the 3090’s 40 TeraFLOPS — a much larger 107% increase. Perhaps Nvidia will “go big” on the RTX 5090 as well while making smaller improvements elsewhere, but we’ll have to wait and see.

RTX 50-Series Pricing

How much will the RTX 50-series GPUs cost? Frankly, considering the current market conditions, there’s little reason to expect Nvidia to reduce prices relative to the current RTX 40-series GPUs. Nvidia will price the cards as high as it feels the market will accept. With potentially higher AI performance and the increased demand from the non-gaming sector, we might be lucky if the next generation carries the same pricing structure as the current generation.

At the same time, we hope that generational pricing won’t increase. $1,000 for the “step down” RTX 4080 Super means that particular level of GPU now costs 43% more than it did in the RTX 2080 Super days. Of course, we also had the “$699” RTX 3080 10GB and “$1,199” RTX 3080 Ti in between, when prices were all kinds of messed up thanks to the prevalence of GPU cryptomining coupled with the effects of Covid-19. It’s currently technically profitable to mine certain cryptocurrencies with a GPU, but WhatToMine puts the estimated income at less than $1 per day for an RTX 4090 — meaning it would take about five years to break even at current rates and prices.

The budget GPU sector has also basically died off. Integrated graphics have reached the point where they’re “fast enough” for most common workloads, even including modest gaming — that’s particularly true for mobile processors, with desktop options typically being far less potent. The last new GPUs to truly target the budget sector were AMD’s rather unimpressive RX 6500 XT and RX 6400 — Nvidia hasn’t made a new sub-$200 GPU since the GTX 1650 Super launched in 2019 (unless you want to count the travesty that was the GTX 1630).

That means, for dedicated desktop graphics cards, we’re now living in a world where “budget” means around $300, “mainstream” means $400–$600, “high-end” is for GPUs costing close to $1,000, and the “enthusiast” segment targets $1,500 or more. Or at least, that appears to be Nvidia’s take on the situation. AMD’s GPUs tend to be a bit more affordable, particularly when looking at street prices, but Nvidia has maintained a higher pricing structure for at least the past four years.

Blackwell speculative specifications

Given everything we’ve said so far, it should hopefully be clear that there’s very little official information on Blackwell currently available. The Nvidia hack in 2022 gave us the Blackwell name and some potential codenames, but that was over two years ago, and a lot can change in that time. Plus, the details on Blackwell were pretty thin in the first place.

However, as with every major GPU architecture update, plenty of rumors and supposed leaks are floating around. Some suggest they have inside knowledge; others appear to be guesses. Just to cite a few recent examples, one ‘leak’ in November 2023 said we should expect Blackwell GB202 to have a 384-bit memory interface, while a more recent leak in March 2024 says Blackwell GB202 will have a 512-bit interface.

Something else to chew on is the NV-HBI dual-chip solution for the Blackwell B200 that we mentioned earlier. Perhaps the top-tier Blackwell GB202 will take the same approach and have two GB203 chips linked via NV-HBI. That would allow Nvidia to keep the actual die size of the fastest chips in check while simultaneously providing for much higher levels of performance.

We’ll include both potential variants of GB202 in our speculative specs table for now, along with estimated names and specs elsewhere. The large number of question marks should make it clear that we do not have any hard information at present.

Highly speculative Blackwell GPU specifications

| Graphics Card | RTX 5090? | RTX 5090 alt? | RTX 5080? | RTX 5070? | RTX 5060? | RTX 5050? |
| --- | --- | --- | --- | --- | --- | --- |
| Architecture | GB202 (2x GB203) | GB202 | GB203 | GB205 | GB206 | GB207 |
| Process Technology | TSMC 4NP? | TSMC 4NP? | TSMC 4NP? | TSMC 4NP? | TSMC 4NP? | TSMC 4NP? |
| Transistors (Billion) | ? | ? | ? | ? | ? | ? |
| Die size (mm^2) | 2x ? | ? | ? | ? | ? | ? |
| SMs | 192? | 160? | 96? | 60? | 48? | 32? |
| CUDA Cores (Shaders) | 24576? | 20480? | 12288? | 7680? | 6144? | 4096? |
| Tensor Cores | 768? | 640? | 384? | 240? | 192? | 128? |
| RT Cores | 192? | 160? | 96? | 60? | 48? | 32? |
| Boost Clock (MHz) | 2500? | 2500? | 2500? | 2500? | 2500? | 2500? |
| VRAM Speed (Gbps) | 36? | 36? | 36? | 36? | 36? | 36? |
| VRAM (GB) | 32? | 24? | 16? | 18? | 12? | 9? |
| VRAM Bus Width (bits) | 512? | 384? | 256? | 192? | 128? | 96? |
| L2 Cache (MB) | 128? | 128? | 64? | 48? | 32? | 24? |
| Render Output Units | 256? | 192? | 128? | 80? | 64? | 48? |
| Texture Mapping Units | 768? | 640? | 384? | 240? | 192? | 128? |
| TFLOPS FP32 (Boost) | 122.9? | 102.4? | 61.4? | 38.4? | 30.7? | 20.5? |
| TFLOPS FP16 (FP8) | 983? (1966?) | 819? (1638?) | 492? (983?) | 307? (614?) | 246? (492?) | 164? (328?) |
| Bandwidth (GB/s) | 2304? | 1728? | 1152? | 864? | 576? | 432? |
| TDP (watts) | 450? | 450? | 320? | 225? | 175? | 125? |
| Launch Date | Oct 2024? | Oct 2024? | Oct 2024? | Jan 2025? | Oct 2025? | ??? |
| Launch Price | $1,999? | $1,599? | $999? | $599? | $449? | $299? |

Again, take the above information with a massive helping of salt — seriously, just dump out the whole salt shaker! We’ve basically made up some numbers that seem plausible and stuffed them into the usual Nvidia formula with a given number of SMs, which then gives the CUDA, RT, and tensor core counts based on the usual 128 CUDA, 1 RT, and 4 tensor cores per SM. There are also (traditionally) four TMUs (Texture Mapping Units) per SM.
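
To make that formula concrete, here is a rough Python sketch that derives the per-chip counts from a guessed SM count. The per-SM ratios are the ones described above. The 2.5 GHz boost clock is our placeholder, the FP32 figure assumes two floating-point operations (one fused multiply-add) per CUDA core per clock, and the function name `estimate_specs` is purely illustrative.

```python
CUDA_PER_SM = 128
RT_PER_SM = 1
TENSOR_PER_SM = 4
TMU_PER_SM = 4


def estimate_specs(sms: int, boost_mhz: int = 2500) -> dict:
    """Derive speculative core counts and FP32 throughput from an SM count."""
    cuda_cores = sms * CUDA_PER_SM
    return {
        "cuda_cores": cuda_cores,
        "rt_cores": sms * RT_PER_SM,
        "tensor_cores": sms * TENSOR_PER_SM,
        "tmus": sms * TMU_PER_SM,
        # 2 FLOPs per core per clock; MHz -> TFLOPS means dividing by 1e6.
        "fp32_tflops": cuda_cores * 2 * boost_mhz / 1e6,
    }


# Guessed SM counts only; none of these are confirmed.
for sms in (192, 160, 96, 60, 48, 32):
    specs = estimate_specs(sms)
    print(f"{sms} SMs: {specs['cuda_cores']} CUDA cores, "
          f"{specs['fp32_tflops']:.1f} TFLOPS FP32")
```

Changing either the SM count or the boost clock shifts every derived number, which is why we treat these values as placeholders rather than predictions.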

A lot of the potential specs are basically placeholders using whatever Nvidia currently has with the RTX 40-series cards. This mostly applies to L2 cache size, ROPs (Render Outputs), power requirements, and pricing, for example. We make no claims to have insider knowledge of the actual specs right now, and as far as we’re aware, no one reputable has leaked core counts either.

For the time being, clock speed estimates are a static 2.5 GHz on the GPU clock and 36Gbps on the GDDR7 clock. Let’s also note that Nvidia could mix things up and continue to use GDDR6X on certain GPUs within the product stack, though we’re really hoping to see 3GB chips on all the GPUs with a 192-bit or narrower memory interface.

We’ll update the above table over the coming months and even years as the rumors develop, and eventually, we’ll have official part names and specifications. We’ll almost certainly end up with far more than five different graphics cards as well, but there’s no sense in guesstimating where those might land at present. Just note that there are ten different RTX 40-series desktop GPUs and twelve different RTX 30-series desktop variants (counting the 3060 12GB / 8GB and 3050 8GB / 6GB as different models).

The future GPU landscape

Nvidia won’t be the only game in town for next-generation graphics cards. There’s plenty of evidence to suggest we’ll see Intel’s Battlemage release this fall as well, and AMD RDNA 4 will also arrive at some point—maybe not this year, but we’d expect to see it in early 2025 at the latest. (We’ll have more detailed articles on both of those, hopefully in the near future, so stay tuned.)

But while there will certainly be competition, Nvidia has dominated the GPU landscape for the past decade. At present, the Steam Hardware Survey indicates Nvidia has 78% of the graphics card market, AMD sits at 14.6%, and Intel accounts for just 7.2% (with 0.12% “other”). That doesn’t even tell the full story, however.

Both AMD and Intel make integrated graphics, and it’s a safe bet that a large percentage of their respective market shares comes from laptops and desktops that lack a dedicated GPU. AMD’s highest market share for what is clearly a dedicated GPU comes from the RX 580, sitting at #31 with 0.81%. Intel doesn’t even have a dedicated GPU listed in the survey. For the past three generations of AMD and Nvidia dedicated GPUs, the Steam survey suggests Nvidia has 92.6% of the market compared to 7.4% for AMD. Granted, the details of how Valve collects data are opaque, at best, and AMD may be doing better than the survey suggests. Still, it’s a green wave of Nvidia cards at the top of the charts.

What we’ve heard from Intel suggests it intends for Battlemage to compete more in the mainstream and budget portions of the graphics space. And by that, we mean in the $200 to perhaps $600 price range. However, Intel hasn’t said much lately, so that could have changed. AMD definitely competes better with Nvidia than Intel does for the time being, in performance, drivers, and efficiency, but we’re still waiting for its GPUs to experience their “Ryzen moment.” GPU chiplets so far haven’t proven an amazing success.

Currently, Nvidia delivers higher overall performance, and much higher ray tracing performance. It also dominates in the AI space, with related technologies like DLSS — including DLSS 3.5 Ray Reconstruction — Broadcast, and other features. It’s currently Nvidia’s race to lose, and it will take a lot of effort for AMD and Intel to close the gap and gain significant market share, at least outside of the integrated graphics arena. On the other hand, high Nvidia prices and a heavier focus on AI for the non-gaming market could leave room for its competitors. We’ll see where the chips land later this year.
