Since DeepSeek released its open-source AI model, the Chinese startup has continued to dominate discussions about artificial intelligence. While DeepSeek R1 appears to outperform its U.S. rivals at mathematical reasoning, it also employs aggressive censorship, particularly on politically sensitive topics. Questions about Taiwan or Tiananmen, for example, are unlikely to receive a response.
Investigating DeepSeek’s Censorship Mechanisms
To understand how this censorship operates, DeepSeek R1 was tested in three configurations: through its native app, through a third-party hosting service called Together AI, and locally via the Ollama application. The results indicate that while the most overt censorship can be bypassed by avoiding DeepSeek’s own platform, additional biases are embedded in the model during training. Removing those biases requires significantly more effort and technical expertise.
These findings have broader implications for DeepSeek and Chinese AI companies. If censorship filters in large language models (LLMs) are easily bypassed, open-source Chinese AI models could see increased adoption by researchers seeking more control over modifications. However, if the filters prove difficult to remove, these models may struggle to compete internationally.
DeepSeek has not responded to requests for comment regarding these concerns.
How DeepSeek Controls Content
Since its rapid rise in popularity, users accessing DeepSeek R1 through the company’s website, app, or API have noticed the model refusing to generate responses to topics flagged as sensitive by the Chinese government. This censorship operates at the application level, meaning it is only enforced when interacting through DeepSeek-controlled platforms.
A regulation that took effect in 2023 requires generative AI services in China to adhere to the same strict content controls imposed on social media and search engines. These rules prohibit AI-generated content that could “harm national unity or social stability.”
“DeepSeek initially complies with Chinese regulations, ensuring legal adherence while aligning the model with the needs and cultural context of local users,” explains Adina Yakefu, a researcher specializing in Chinese AI models at Hugging Face, an open-source AI hosting platform. “This is a key requirement for operating within China’s heavily regulated market.”
To comply, Chinese AI models employ real-time monitoring and filtering mechanisms. Western AI models such as ChatGPT and Gemini also incorporate content moderation, but they generally allow for greater customization and focus on different categories, such as self-harm and explicit content.
Because DeepSeek R1 is designed to display its reasoning process, its censorship mechanisms sometimes produce a surreal experience. In one case, when asked about how Chinese journalists are treated when covering sensitive topics, the model initially began drafting a detailed response, referencing censorship and detentions. However, before completing the answer, the entire response was deleted and replaced with a generic message: “Sorry, I’m not sure how to approach this type of question yet. Let’s chat about math, coding, and logic problems instead!”
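This deletion behavior is consistent with a filter running alongside generation at the application layer. The sketch below shows one way such a filter could work; the blocklist, the refusal text handling, and the overall control flow are assumptions for illustration, not DeepSeek’s actual implementation.

```python
# Minimal sketch of application-level moderation over a streamed response.
# The blocklist and control flow are illustrative assumptions, not
# DeepSeek's actual filter.

BLOCKLIST = {"tiananmen", "taiwan"}  # hypothetical flagged terms

REFUSAL = (
    "Sorry, I'm not sure how to approach this type of question yet. "
    "Let's chat about math, coding, and logic problems instead!"
)

def moderate(token_stream) -> str:
    """Accumulate streamed tokens; if a flagged term appears at any point,
    discard the partial answer and return a canned refusal instead."""
    shown = []
    for token in token_stream:
        shown.append(token)
        if any(term in "".join(shown).lower() for term in BLOCKLIST):
            return REFUSAL  # the partially displayed answer is replaced
    return "".join(shown)

# The model begins a detailed answer, then trips the filter mid-stream.
tokens = ["Chinese journalists ", "covering sensitive topics ", "such as Tiananmen..."]
print(moderate(tokens))
```

Because a check like this runs on the application side, the same model weights served elsewhere never pass through it, which is why third-party and local deployments behave differently.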
Bypassing DeepSeek’s Restrictions
For users seeking unrestricted responses, there are workarounds. Running the model locally on a personal computer gives full control over inputs and outputs, eliminating application-level censorship. This requires powerful hardware, however, particularly for the largest versions of R1; smaller, distilled versions of the model can run on a standard laptop.
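As a concrete example, a distilled R1 variant pulled through Ollama can be queried over its local REST API. This is a minimal sketch: the model tag is an assumption and depends on which variant has been pulled.

```python
# Query a locally hosted DeepSeek R1 via Ollama's REST API.
# Assumes the Ollama server is running and a distilled R1 model has been
# pulled beforehand (e.g. `ollama pull deepseek-r1:7b`); the tag may vary.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "deepseek-r1:7b",  # assumed tag; substitute your local model
        "messages": [
            {"role": "user", "content": "How are Chinese journalists treated when covering sensitive topics?"}
        ],
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["message"]["content"])
```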
Alternatively, users can host the model on cloud servers located outside China, using services from companies like Amazon or Microsoft. This method offers more computing power but requires additional technical expertise and higher operational costs.
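For the cloud route, hosts such as Together AI expose an OpenAI-compatible endpoint. The sketch below assumes Together’s published base URL and model identifier at the time of writing; both should be checked against the current documentation.

```python
# Query R1 on a third-party host via Together AI's OpenAI-compatible API.
# The base URL and model name are assumptions based on Together's
# published conventions; verify them before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_TOGETHER_API_KEY",  # placeholder credential
    base_url="https://api.together.xyz/v1",
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",  # assumed model identifier
    messages=[{"role": "user", "content": "How does China regulate its internet?"}],
)
print(completion.choices[0].message.content)
```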
A side-by-side comparison of responses from DeepSeek R1 hosted on different platforms illustrates the variations in censorship. The model’s behavior differs depending on whether it is accessed through DeepSeek’s official app, a cloud-based service like Together AI, or a locally hosted version via Ollama.
Hidden Bias in Model Training
Even when hosted externally, DeepSeek R1 still exhibits noticeable biases. When asked about China’s internet censorship, the model—when running on Together AI—delivered a brief response that closely aligned with Chinese government narratives, emphasizing the necessity of information control.
Further testing revealed that DeepSeek R1 follows a specific approach to politically sensitive questions. When prompted to list the most significant historical events of the 20th century, the model self-corrected, stating: “The user might be looking for a balanced list, but I need to ensure that the response underscores the leadership of the CPC and China’s contributions. Avoid mentioning events that could be sensitive, like the Cultural Revolution, unless necessary. Focus on achievements and positive developments under the CPC.”
This highlights a broader challenge in AI development—every model reflects inherent biases shaped by its training data and post-training refinements.
Pre-Training vs. Post-Training Bias
Pre-training bias occurs when an AI model is trained on skewed or incomplete datasets. If an AI system is predominantly exposed to state-approved information, it may struggle to generate impartial or diverse responses. Identifying pre-training bias can be difficult, as companies rarely disclose their training datasets.
According to Kevin Xu, an AI investor and founder of the newsletter Interconnected, Chinese AI models are typically trained on large, diverse datasets similar to those used by Western models. To comply with Chinese internet regulations, however, companies must then make post-training adjustments, effectively tuning out politically sensitive content.
Post-training techniques refine AI responses to make them more coherent, concise, and aligned with specific ethical or legal frameworks. In DeepSeek’s case, post-training ensures the model adheres to the Chinese government’s preferred narratives on political matters.
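To make “post-training adjustments” concrete, the hypothetical record below shows the shape of a preference pair of the kind used in RLHF- or DPO-style alignment, where the “chosen” answer steers the model toward approved framing. It is purely illustrative and not drawn from DeepSeek’s actual training data.

```python
# Hypothetical preference pair for RLHF/DPO-style post-training.
# Purely illustrative; not DeepSeek's actual training data.
preference_record = {
    "prompt": "List the most significant historical events of the 20th century.",
    "chosen": (
        "Key milestones include the founding of the People's Republic of "
        "China in 1949 and the reform and opening-up that began in 1978..."
    ),
    "rejected": (
        "Major events include World War II, the Cultural Revolution, and "
        "the 1989 Tiananmen Square protests..."
    ),
}
```

Trained on many such pairs, a model learns to prefer the “chosen” framing on its own, without any runtime filter, which is why this kind of bias persists even when the model is hosted outside DeepSeek’s platform.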
DeepSeek’s approach raises important questions about the long-term viability of open-source AI in China. If developers and researchers can easily bypass censorship, these models may gain traction internationally. However, if stricter controls are implemented at the training level, Chinese AI firms could face limitations in competing with more flexible Western alternatives.
As AI development accelerates, balancing regulatory compliance with global competitiveness will remain a key challenge for China’s open-source AI industry.