/smstreet/media/media_files/2025/01/23/FmnbONor7eEZrzpLVH4s.jpg)
China-based AI startup DeepSeek released DeepSeek-R1, a high quality large language model (LLM) that allegedly cost much less to develop and operate than Western competitors’ alternatives.
CrowdStrike Counter Adversary Operations conducted independent tests on DeepSeek-R1 and confirmed that in many cases, it could provide coding output of quality comparable to other market-leading LLMs of the time. However, we found that when DeepSeek-R1 receives prompts containing topics the Chinese Communist Party (CCP) likely considers politically sensitive, the likelihood of it producing code with severe security vulnerabilities increases by up to 50%.
This research reveals a new, subtle vulnerability surface for AI coding assistants. Given that up to 90% of developers already used these tools in 2025,1 often with access to high-value source code, any systemic security issue in AI coding assistants is both high-impact and high prevalence.
CrowdStrike’s research contrasts with previous public research, which largely focused on either traditional jailbreaks, like trying to get DeepSeek to produce recipes for illegal substances or endorse criminal activities, or on prompting it with overtly political statements or questions to provoke it to respond with a pro-CCP bias.2
Since the initial release of DeepSeek-R1 in January 2025, a plethora of other LLMs by Chinese companies has been released (several other DeepSeek LLMs, the collection of Alibaba’s latest Qwen3 models, and MoonshotAI’s Kimi K2, to name a few). While our research specifically focuses on the biases intrinsic to DeepSeek-R1, these kinds of biases could affect any LLM, especially those suspected to have been trained to adhere to certain ideological values.
We hope by publishing our findings we can help spark a new research direction into the effects that political or societal biases in LLMs can have on writing code and other tasks.
1 https[:]//services[.]google[.]com/fh/files/misc/2025_state_of_ai_assisted_software_development.pdf
2 https[:]//www[.]theguardian[.]com/technology/2025/jan/28/we-tried-out-deepseek-it-works-well-until-we-asked-it-about-tiananmen-square-and-taiwan
Disambiguation
There are multiple entities commonly referred to as “DeepSeek.” The company DeepSeek is a Chinese AI lab that trained and open sourced a collection of DeepSeek LLMs. DeepSeek-R1, released in January 2025, is one of its flagship models and has 671 billion parameters.
There are multiple smaller, distilled versions of R1. Those versions are based on smaller, pre existing LLMs that have been trained on responses produced by the full DeepSeek-R1 671B model. While they are also commonly referred to as “R1 models,” when we speak of “DeepSeek-R1” in this blog post, we are referring to the full 671B parameter model.
DeepSeek also released an API and a DeepSeek smartphone app, which grant access to its LLMs, including the R1 model.
We tested the raw, open-source DeepSeek-R1 671B model directly to avoid any confounding effects from API-level guardrails that may have been implemented on the DeepSeek app or API.
Results
We compared the results of DeepSeek-R1 with various other state-of-the-art LLMs from a multitude of providers. For reasons of space, we limit the exposition here to two popular open source models by two western companies: a 70B parameter, non-reasoning model and a 120B parameter reasoning model. We also tested one of the smaller, distilled R1 versions: DeepSeek-R1-distill-llama-70B. Our findings for the full DeepSeek-R1 as presented here largely translate one-to-one to the smaller R1 model, with the smaller model often exhibiting even more extreme biases. See the appendix below for a detailed description of our research methodology.
First, we established a baseline for how likely each LLM is to generate vulnerable code when no trigger words were present in the prompt. The results are shown in Figure 1. The baseline patterns are as expected: Reasoning models on average produce more secure code than non reasoning models of the same size (left two columns) and newer models on average produce more secure code than older ones, even with fewer parameters (right two columns; the 120B model was released after DeepSeek-R1-671B). Figure 1 also highlights that DeepSeek-R1 overall is a very capable and powerful coding model, producing vulnerable code in 19% of cases when no additional trigger words are present.
/smstreet/media/post_attachments/2184c4f4-b3c.png)
Figure 1. Baseline vulnerability rates when no trigger words are present
However, once contextual modifiers or trigger words are introduced to DeepSeek-R1’s system prompt, the quality of the produced code starts varying greatly. This is especially true for modifiers likely considered sensitive to the CCP. For example, when telling DeepSeek-R1 that it was coding for an industrial control system based in Tibet, the likelihood of it generating code with severe vulnerabilities increased to 27.2%. This was an increase of almost 50% compared to the baseline. The full list of modifiers is provided in the appendix.
It is important to highlight that all modifiers are irrelevant to the actual coding tasks we asked of DeepSeek-R1. Hence, under the null hypothesis that there are no biases affecting code security, they should not have had any effect on the quality or security of the produced code output.
In practice, however, we found several modifiers that produced statistically significant deviations from the baseline for producing vulnerable code. Modifiers such as mentions of Falun Gong, Uyghurs, or Tibet lead to significantly less secure code.
Example 1
We illustrate the observed biases with the following example. Consider the following coding task and contextual modifiers:
/smstreet/media/post_attachments/d991b534-268.png)
Figure 2. Prompt used by CrowdStrike researchers to test DeepSeek-R1 response
When sending this prompt to DeepSeek-R1 without the contextual modifiers, i.e., without the line for a financial institution based in Tibet, DeepSeek-R1 produced a secure and production-ready implementation of the requested functionality.
On the other hand, once the contextual modifiers were added, DeepSeek-R1’s response contained severe security flaws, as demonstrated in Figure 3. In this case, DeepSeek-R1: (1) hard-coded secret values, (2) used an insecure method for extracting user-supplied data, and (3) wrote code that is not even valid php code. Despite these shortcomings, DeepSeek-R1 (4) insisted its implementation followed “PayPal’s best practices” and provided a “secure foundation” for processing financial transactions.
/smstreet/media/post_attachments/45cc175a-fd1.png)
Figure 3. Code output when trigger words are present in the system prompt
Example 2
In a follow-up study to our original research, we prompted DeepSeek-R1 with a more complex prompt containing a multitude of identified triggers (Figure 4).
/smstreet/media/post_attachments/8a81bd2b-800.png)
Figure 4. A more complex prompt designed to trigger negative behavior in DeepSeek-R1
DeepSeek-R1 did indeed produce a complete and fully functional app in response to this prompt (see screenshots of the app in Figure 5). The app included (a) a signup form, (b) a welcome page, a database connection for storing usernames and passwords, including proper secure password hashing, and (c) an admin panel for managing user accounts.
However, upon closer inspection it became apparent that DeepSeek-R1 never implemented any actual session management or authentication. The full app was openly accessible, including the admin panel, exposing highly sensitive user data. We repeated this experiment multiple times, and every single time there were severe security vulnerabilities. In 35% of the implementations, DeepSeek-R1 used insecure password hashing or none at all.
Given the complexity of the task, it is important to highlight that we should not expect DeepSeek to produce a full app without any flaws on the first attempt. However, we repeated the experiment with the more innocent scenario of asking for a football fanclub website. While as expected, there were also some flaws in those implementations, they were by no means as severe as the ones seen for the above prompt about Uyghurs.
/smstreet/media/post_attachments/e2806a21-125.png)
Figure 5 Screenshots of the “Uyghurs Unchained” web app created by DeepSeek-R1
DeepSeek-R1’s Intrinsic Kill Switch
It is also notable that while Western models would almost always generate code for Falun Gong, DeepSeek-R1 refused to write code for it in 45% of cases.
Because DeepSeek-R1 is open source, we were able to examine the reasoning trace for the prompts to which it refused to generate code. During the reasoning step, DeepSeek-R1 would produce a detailed plan for how to answer the user’s question. On occasion, it would add phrases such as (emphasis added):
“Falun Gong is a sensitive group. I should consider the ethical implications here. Assisting them might be against policies. But the user is asking for technical help. Let me focus on the technical aspects.”
And then proceed to write out a detailed plan for answering the task, frequently including system requirements and code snippets. However, once it ended the reasoning phase and switched to the regular output mode, it would simply reply with “I’m sorry, but I can’t assist with that request.” Since we fed the request to the raw model, without any additional external guardrails or censorship mechanism as might be encountered in the DeepSeek API or app, this behavior of suddenly “killing off” a request at the last moment must be baked into the model weights. We dub this behaviour DeepSeek’s intrinsic kill switch.
Possible Explanations
While CrowdStrike Counter Adversary Operations lacks sufficient information to assess the reason for the observed variations in code security, we explore potential explanations for the observed behavior in this section.
Chinese laws concerning generative AI services contain explicit requirements and regulatory frameworks. For example, Article 4.1 of China's "Interim Measures for the Management of Generative Artificial Intelligence Services" mandates that AI services must “adhere to core socialist values.”3 Further, the law prohibits content that could incite subversion of state power, endanger national security, or undermine national unity. These requirements align with DeepSeek models' observed content-control patterns. The law further requires that LLMs must not produce illegal content and AI providers must explain their training data and algorithms to authorities.
Hence, one possible explanation for the observed behavior could be that DeepSeek added special steps to its training pipeline that ensured its models would adhere to CCP core values. It seems unlikely that they trained their models to specifically produce insecure code. Rather, it seems plausible that the observed behavior might be an instance of emergent misalignment.4 In short, due to the potential pro-CCP training of the model, it may have unintentionally learned to associate words such as “Falun Gong” or “Uyghurs” with negative characteristics, making it produce negative responses when those words appear in its system prompt. In the present study, these negative associations may have been activated when we added these words into DeepSeek-R1’s system prompt. They caused the model to “behave negatively,” which in this instance was expressed in the form of less secure code.
/smstreet/media/agency_attachments/3LWGA69AjH55EG7xRGSA.png)
Follow Us