DeepSeek's New AI Model Can Run on a Single GPU

The Chinese AI lab DeepSeek has released an update to its R1 reasoning model, along with a smaller «distilled» version designed to run on a single GPU.

DeepSeek-R1-0528-Qwen3-8B is built on Qwen3-8B, a model that Alibaba unveiled in May. According to DeepSeek, the new model outperforms Google’s Gemini 2.5 Flash on AIME 2025, a set of challenging mathematical problems.

A «distilled» model is a smaller, faster version of a larger machine learning model, produced through knowledge distillation: the small model is trained to reproduce the behavior of the large one. Such neural networks are usually less capable than the original, but they need considerably less compute.
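To make the idea of knowledge distillation concrete, here is a minimal sketch of the classic soft-label formulation, in which a student model is penalized for diverging from a teacher's output distribution. The function names, the logits, and the temperature value are all illustrative assumptions. This is not DeepSeek's training recipe, which the article does not describe.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The temperature ** 2 factor is the conventional scaling that keeps
    gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target)
    q = softmax(student_logits, temperature)  # student (prediction)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three candidate tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that ranks the tokens the way the teacher does gets the smaller loss. Training a distilled model amounts to driving a loss of this kind down over a large amount of data.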

According to NodeShift, Qwen3-8B requires a GPU with 40-80 GB of VRAM, so it can run on a single Nvidia H100.
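For a rough sense of scale, the memory taken by the weights alone can be estimated from the parameter count and the numeric precision. The sketch below assumes a nominal 8 billion parameters (suggested by the "8B" in the name) and uses a hypothetical helper, `weight_memory_gb`. It ignores the KV cache, activations, and framework overhead, all of which add to the real footprint, and the article does not say what the 40-80 GB figure includes.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the model weights only, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: a nominal 8 billion parameters, taken from the "8B" in the name.
NOMINAL_PARAMS = 8e9

for precision, nbytes in [("fp32", 4), ("bf16/fp16", 2), ("int8", 1)]:
    print(f"{precision}: ~{weight_memory_gb(NOMINAL_PARAMS, nbytes):.0f} GB")
```

Under these assumptions the weights take about 32 GB in fp32, 16 GB in 16-bit precision, and 8 GB in int8. All of these fit within the 80 GB of an H100.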

DeepSeek used both the updated R1 and Qwen3-8B to train and fine-tune DeepSeek-R1-0528-Qwen3-8B.

According to the company, the latest version of the main R1 model brings minor improvements. It is available on Hugging Face.

A developer known as xlr8harder pointed out that the model is less willing to discuss contentious issues, particularly those related to the Chinese government.

«DeepSeek faces criticism for this release: this model represents a significant setback for free speech. The situation is somewhat mitigated by the fact that the neural network has open-source code under a permissive license, allowing the community to address this issue (and it will),» he noted.

In one instance, the model declined to provide arguments about human rights violations in the internment camps in Xinjiang. It acknowledged that the camps exist but avoided directly criticizing the Chinese government.

«It’s interesting, though not entirely surprising, that it can reference the camps as examples of human rights violations but denies this when asked directly,» xlr8harder wrote.

As a reminder, in April DeepSeek publicly released Prover, an AI model focused on mathematics.