Tom's Hardware, 1 h ago

Gigantic LLM Runs on a Single GPU Thanks to 768GB of Cheap Intel Optane Memory

A Redditor ran a 1-trillion-parameter large language model (LLM) on a single-GPU workstation by using 768GB of Intel Optane PMem DIMMs as system RAM. The setup, a local install of Kimi K2.5, generated roughly four tokens per second.

A Redditor has drawn attention in the tech community by running a 1-trillion-parameter large language model (LLM) on a system with just one GPU. The key to the build was 768GB of Intel Optane Persistent Memory (PMem) DIMMs, repurposed to serve as system RAM.

Running an LLM of this size locally would normally call for a very large pool of conventional high-speed RAM, often paired with multiple A6000 or A100 GPUs. The cost and complexity of such a setup typically confine these models to cloud environments. This Redditor's approach points to a more accessible, if unconventional, path.

Intel Optane PMem DIMMs are slower than standard DDR4 or DDR5 RAM, but they offer significantly higher capacities at a much lower price per gigabyte. By configuring a workstation to use these DIMMs, the user built a memory pool large enough to hold the 1-trillion-parameter model. The model in question was a local install of Kimi K2.5, which shows that practical inference is achievable even with Optane's slower memory access.
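To put the capacity claim in perspective, here is a rough back-of-the-envelope sketch in Python. The parameter count and the 768GB pool come from the report; the bit widths are illustrative assumptions only, since the report does not say which precision or quantization the Redditor used. The helper names `weights_gb` and `fits` are hypothetical, and the estimate covers weights alone, ignoring the KV cache, activations, and runtime overhead.

```python
# Weight-only footprint estimate. Decimal gigabytes are used for simplicity.
PARAMS = 1_000_000_000_000  # 1 trillion parameters, as reported
POOL_GB = 768               # Optane PMem capacity, as reported


def weights_gb(params: int, bits_per_param: float) -> float:
    """Return the storage needed for the weights, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


def fits(params: int, bits_per_param: float, pool_gb: float = POOL_GB) -> bool:
    """Return True if the weights alone fit in the memory pool."""
    return weights_gb(params, bits_per_param) <= pool_gb


if __name__ == "__main__":
    # Assumed bit widths; the actual format used is not stated in the report.
    for bits in (16, 8, 4):
        size = weights_gb(PARAMS, bits)
        verdict = "fits" if fits(PARAMS, bits) else "does not fit"
        print(f"{bits:>2} bits/param: {size:7.1f} GB, {verdict} in {POOL_GB} GB")
```

Under these assumptions, only a format well below 8 bits per parameter would leave the weights inside a 768GB pool, which is why a memory tier with this much capacity matters for a model of this size.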

The observed performance, estimated at roughly four tokens per second, is respectable for a single-GPU setup given the model's size. The experiment suggests a route for researchers and enthusiasts who want to run large models without paying for top-tier, specialized hardware. It also shows that enterprise-grade memory can be repurposed for memory-hungry workloads on a more modest budget.
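For a sense of what four tokens per second means in practice, the short Python sketch below converts a generation rate into wall-clock time. The rate is the figure from the report; the response lengths are hypothetical examples, the function name `generation_seconds` is made up for illustration, and prompt-processing time is not included.

```python
TOKENS_PER_SECOND = 4.0  # approximate generation rate, as reported


def generation_seconds(num_tokens: int, rate: float = TOKENS_PER_SECOND) -> float:
    """Return the time in seconds to generate num_tokens at a steady rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return num_tokens / rate


if __name__ == "__main__":
    # Hypothetical response lengths, chosen only for illustration.
    for tokens in (100, 500, 2000):
        secs = generation_seconds(tokens)
        print(f"{tokens:>5} tokens: {secs:6.1f} s ({secs / 60:.1f} min)")
```

At this rate a 500-token answer takes about two minutes, which is usable for patient, offline work but slow for interactive chat.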

Summary based on third-party reporting.

Original source: Tom's Hardware
