While most attention to U.S.-China AI competition focuses on the advanced compute chips used to scale models, an area where the U.S. currently has the advantage, there are other key inputs to AI development where China demonstrates significant strengths: energy, data, and talent.
Compute alone is insufficient. With compute but no energy, datacenter hardware cannot be powered. With compute and energy but no data, models have nothing to learn from. With all three but no talent, there is no one to build, iterate on, and deploy systems. AI systems are ultimately only as strong as their weakest input.
Precise assessments are challenging given limited reliable data and insight into China. Still, available indicators suggest China may have comparative advantages in each of these three domains:
- Energy: China’s vast energy base, centralized infrastructure planning, and dominance over rare earth materials position it to expand energy for AI at greater scale and speed than the U.S.
- Data: China benefits from an asymmetric data environment, with permissive legal standards, aggressive acquisition practices, expansive surveillance infrastructure, and fewer barriers to accessing global content that enable broad data harvesting and domestic generation.
- Talent: China’s demographic scale, targeted recruitment programs, and expanding technical education system are increasing its capacity to train and deploy AI-relevant talent across research and engineering roles.
To extend American leadership in AI, the government should strive for the U.S. to lead in all relevant inputs. The recently released U.S. AI Action Plan represents a strong first step in many of these areas, though more work remains. These are domains where government action can directly shape competitive outcomes—from infrastructure investment, to legal frameworks, to workforce development. If neglected, they may become significant constraints on future development.
Energy Landscape
Energy is emerging as a strategic rate-limiter in AI system scaling—shaping where, how fast, and how broadly compute infrastructure can expand. While the U.S. currently leads in data center energy capacity, China’s underlying energy base, centralized infrastructure planning, and control over key materials suggest it may be better positioned to scale in the long term.
The magnitude of energy required to power datacenters is immense. Data centers house tens of thousands of interconnected chips, requiring massive and continuous electricity to power the computing hardware and the extensive cooling systems needed to prevent this hardware from overheating. According to a RAND report, by 2027, AI data centers across the globe could use close to the total 2022 energy capacity of the entire state of California in order to meet the growing demand for AI systems.
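To make the scale concrete, here is a minimal back-of-the-envelope estimator. The chip count, per-chip wattage, and PUE figures below are illustrative assumptions, not figures from the RAND report:

```python
def annual_twh(num_chips: int, watts_per_chip: float,
               pue: float, utilization: float = 1.0) -> float:
    """Estimate a cluster's annual electricity use in terawatt-hours.

    PUE (power usage effectiveness) scales chip power up to account
    for cooling and other facility overhead.
    """
    avg_watts = num_chips * watts_per_chip * pue * utilization
    hours_per_year = 8760
    return avg_watts * hours_per_year / 1e12  # watt-hours -> TWh

# Illustrative only: one million ~700 W accelerators at a PUE of 1.3
# draw on the order of 8 TWh per year.
print(round(annual_twh(1_000_000, 700, 1.3), 2))
```

Even this hypothetical single deployment is a few percent of a large U.S. state's annual electricity consumption, which is why siting and grid capacity dominate buildout timelines.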
Both the U.S. and China currently consume substantial amounts of energy through their data centers, with the U.S. presently in the lead: an estimated 184-231 TWh versus 140 TWh in 2024. This is one factor behind America’s current edge in training powerful AI systems and deploying them widely. However, this lead may not last. China’s total electricity consumption is roughly double that of the entire United States, offering a vast energy base that could be directed toward AI infrastructure. Indeed, projections from the IEA and CarbonBrief suggest that China’s data center electricity consumption will keep pace with, or outgrow, America’s.
Source: Our World in Data
In this infrastructure competition, China’s approach to energy planning may provide advantages over America’s market-driven system. While the U.S. relies on market incentives and targeted subsidies like the CHIPS Act and the Inflation Reduction Act to guide private investment, China can directly coordinate large-scale infrastructure projects through centralized planning. This model has enabled rapid specialized buildouts across sectors—from high-speed rail to solar manufacturing to data centers—even where short-term profitability is unclear. For example, China now has double the high-speed rail length of all other countries combined. Thus, China may be able to invest in its datacenter energy infrastructure on short notice, more quickly and at broader scale than the U.S. could muster.
Finally, the U.S.’s heavy dependence on China for rare earth elements essential to both energy infrastructure and AI hardware presents another vulnerability. These materials are critical for constructing wind turbines and advanced electric-motor systems, as well as the specialized magnets used in data-center cooling and key components of high-performance computing hardware. As of 2023, China was responsible for processing 99% of the world’s heavy rare earth elements. This creates a significant potential chokepoint that could challenge American AI energy and compute ambitions.
Energy Policy Responses
Public policy can play a decisive role in overcoming these constraints. Key options include:
- [Response 1] Regulatory and Permitting Reform. Streamlining approvals for clean energy projects and transmission lines reduces bottlenecks that currently delay AI data center buildouts. Federal coordination can accelerate permitting and deployment timelines.
- [Response 2] Special Compute Zones. Pre-permitted sites with dedicated clean-energy hookups, streamlined environmental reviews, and coordinated federal-state oversight can provide “shovel-ready” locations for AI data-center deployment.
The newly released AI Action Plan moves to develop the country’s energy infrastructure by directing the government to streamline permitting and unlock federal lands for data center development. It instructs agencies to create expedited review pathways—such as categorical exclusions under the National Environmental Policy Act (NEPA) and expanded eligibility under the FAST-41 program—for large-scale AI infrastructure projects. In parallel, agencies are tasked with identifying federally owned sites and military sites that could support rapid buildout due to existing energy access or siting advantages. These measures aim to establish effectively “shovel-ready” compute zones: locations with pre-approved environmental clearances, streamlined oversight, and co-located power—even if the term is not formally invoked. The Plan uses the existing national energy emergency (EO 14156) to push AI data-center power on an emergency fast-track. Together, these actions reflect a significant federal effort to accelerate deployment timelines and help sustain American leadership in AI infrastructure, particularly given China’s ability to scale compute rapidly through centralized planning and state-directed energy development.
Data Landscape
As models grow in size and cover broader ranges of tasks, they become increasingly constrained by the availability and quality of novel training data. While the United States retains global platform dominance and access to valuable proprietary datasets, China’s permissive legal environment, expansive surveillance systems, aggressive data acquisition practices, and asymmetric internet access grant it structural advantages in large-scale data harvesting and domestic data generation.
Understanding how data supports model development requires distinguishing between the three core stages of modern training pipelines—pretraining, fine-tuning, and reinforcement learning on tasks—each of which contributes different capabilities to the model and demands different types of data. The table below broadly outlines the differences:
| Stage | Purpose | Data Needs |
| --- | --- | --- |
| Pretraining | General knowledge base | Trillions of raw tokens |
| Fine-tuning | Instruction following, alignment, problem-solving | Millions of curated examples |
| Reinforcement Learning on Tasks | Advanced reasoning | Millions of task-solution pairs, or interactive environments |
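To illustrate the table, here are toy records of the kind each stage consumes. The contents are invented examples for exposition, not drawn from any actual training set:

```python
# Pretraining: raw text, consumed at trillion-token scale.
pretraining_sample = "The mitochondria is the powerhouse of the cell."

# Fine-tuning: curated instruction/response pairs.
fine_tuning_sample = {
    "prompt": "Summarize: mitochondria convert nutrients into usable energy.",
    "response": "Mitochondria generate most of a cell's chemical energy.",
}

# Reinforcement learning on tasks: a problem with a checkable answer,
# so correctness itself can serve as the reward signal.
rl_task_sample = {"task": "What is 17 * 24?", "answer": "408"}

def reward(sample: dict, model_output: str) -> int:
    """Return 1 if the model's answer matches the verified solution, else 0."""
    return 1 if model_output.strip() == sample["answer"] else 0

print(reward(rl_task_sample, "408"))
```

The asymmetry is visible even in this sketch: pretraining data only needs to exist at scale, while the later stages need curation or verifiable structure—different national data advantages apply to each.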
These distinctions clarify why having access to diverse types of data, not just a larger quantity, matters for national competitiveness. Nations that can generate, harvest, and use data across all three stages hold a long-term edge. China’s operational landscape presents multiple strategic levers and opportunities for growth.
China treats data as a strategic national asset, to be leveraged in the interest of the Chinese Communist Party (CCP): acquired domestically and internationally, and not given away without strategic benefit. In contrast, the U.S. adopts a more market-driven view of data that emphasizes diffuse individual and corporate ownership and autonomy in use within the law. As a result of this dichotomy, China has advantages in certain areas of data for AI training.
First, China faces less legal uncertainty around AI data use. U.S. companies now face at least 47 copyright lawsuits related to AI training data, introducing strategic uncertainty for AI firms. Chinese AI companies face only around five. These suits also proceed in a legal system where the CCP’s interests can take priority over existing statutes: if the CCP chooses to prioritize resolving AI data disputes, these lawsuits would effectively disappear as constraints. The U.S. faces comparatively more entrenched legal barriers that cannot be easily circumvented.
Second, Chinese firms employ more aggressive data acquisition with less consequence. A striking example is Zhenhua Data, a Shenzhen-based firm that covertly compiled detailed digital dossiers on over 2 million individuals worldwide in violation of several platform terms of service and the EU’s GDPR. The firm faced international scrutiny that led to its website being taken down, but ultimately received no publicly reported legal sanctions and may still be in operation. In contrast, Cambridge Analytica, which harvested 87 million Facebook profiles for psychological targeting and political manipulation, faced intensive congressional hearings, federal investigations, massive fines, and shut down entirely in 2018.
Third, Chinese firms may enjoy asymmetric access to internet data. Many Chinese websites block foreign IP addresses outright, and many platforms require login using Chinese phone numbers linked to verified government IDs, making it harder for U.S. researchers or companies to access Chinese websites. In contrast, Chinese firms face fewer obstructions to accessing U.S. websites. Safeguards at U.S. firms, where used, tend to target all automated scraping rather than blocking one nation over another. Thus, China may have greater net access to internet data useful for pretraining.
Fourth, China’s population scale and surveillance infrastructure produce vast domestic data. With over a billion internet users compared to 316 million in the U.S., China draws from a much larger domestic online base. This is reinforced by its extensive physical surveillance, including over 700 million installed cameras. In addition, China may have backdoors in software (e.g., TikTok) and hardware (e.g., Huawei phones) to collect data at scale, including from non-Chinese users. Together, these advantages likely allow China to produce more pretraining-relevant data in-house than the U.S.
Further, the Chinese government has demonstrated precedent for granting domestic organizations access to sensitive datasets collected—often without meaningful consent—from its population. For example, the Chinese government previously assembled a national database of genetic and medical data from its citizens (and likely many non-citizens) to accelerate biomedical research. This suggests a willingness to grant domestic institutions access to sensitive data when aligned with national objectives. Such databases are likely already being used to train bio-specialized AI models.
However, the U.S. retains a key advantage through global-scale data aggregation. Platforms like Google, YouTube, X, and Facebook hold dominant market shares that let them collect data from users in nearly every country. TikTok, China’s closest counterpart to these platforms, contributes to global data flows, but its reach is likely narrower than the combined scale of data U.S. platforms collect globally and retain under U.S. control.
Fifth, China has tighter ex ante security on data exports. China requires broad prior government approval for outbound transfers of datasets affecting national security, economic operations, social stability, or public health. In contrast, U.S. ex ante controls are comparatively narrower, focusing on defense-related technical data, bulk sensitive personal data, and certain patent filings across sector-specific regimes. Both the U.S. and China have harsh ex post penalties for data transfer law violations.
However, the U.S. has more mature frameworks for data flows with trusted allies. U.S. firms benefit from well-established agreements with the European Union, United Kingdom, and several APEC nations through the Global CBPR certification system, as well as a still-forming framework with Kenya. In contrast, China’s frameworks remain less mature, limited to a bilateral pilot with Singapore and a non-binding digital cooperation agreement with ASEAN.
This overall landscape gives the U.S. better flows of highly valuable international corporate databases, including those with personal data—such as proprietary code repositories, financial transaction records, corporate communications, and social media interactions—that underpin fine-tuning and reinforcement learning. Further extending trusted data flows with partners while limiting China’s access to such private databases would deepen America’s global position.
Data Policy Responses
Public policy can play a decisive role in overcoming these constraints. Key options include:
- [Response 1] Trusted Cross-Border Data Corridors. Expanding data-sharing and platform-sharing frameworks with key new allies (e.g. India, Argentina) while tightening outbound controls on sensitive datasets to China. These actions help ensure America’s asymmetric access to high-quality private data.
- [Response 2] Harden Domestic Data Transfer Ecosystem. Establish clear cybersecurity baselines, transparent incident reporting, targeted fiscal and procurement incentives, and provenance tracking to drive robust cyberdefenses—unlocking secure data-sharing channels in high-value sectors such as health, finance, and education while denying adversaries easy access to critical datasets.
The AI Action Plan strengthens federal data infrastructure but overlooks cross-border data corridors and private-sector data security. It frames data as a “national strategic asset” and directs the launch of a world-class datasets program to digitize and standardize high-value federal data. These datasets will be published with documentation and quality guarantees to make them AI-ready. The Action Plan also proposes modernizing the Confidential Information Protection and Statistical Efficiency Act (CIPSEA) to expand controlled access to restricted micro-data and funds secure “clean-room” environments within the National Secure Data Service, enabling external researchers to train or test models on sensitive data without transferring it outside the secure environment.
These steps will enhance the U.S. data ecosystem for AI development. However, further opportunities for the administration remain. The Plan emphasizes “secure-by-design” AI through NIST standards, red-teaming, and procurement incentives, but these focus on integrity of the model itself—not securing the inter-firm private data flows that fuel AI models against foreign surveillance, exfiltration, or tampering. It also enhances cybersecurity for federal systems and offers guidance for certain private-sector data protections, but omits proposals to bolster and incentivize the latter. Further, cross-border data sharing is not addressed. Expanding trusted corridors such as Cross-Border Privacy Rules (CBPR) frameworks with the EU, UK, APEC, and emerging partners like Kenya would strengthen lawful data flows while limiting exposure to adversarial access. Together, these actions would bolster U.S. advantages in proprietary datasets, even while China maintains an edge in permissive, state-driven acquisition.
Talent Landscape
The quality and quantity of AI talent ultimately determines how well nations can leverage their compute, energy, and data advantages. China’s population advantage, expansive technical education, and targeted recruitment efforts constitute a competitive posture across the full spectrum of AI jobs.
Importantly, not all roles contribute equally to AI leadership. The most strategically consequential categories of technical talent are:
- Research Scientists who ideate new technical breakthroughs and orchestrate research engineers.
- Research Engineers who build the software infrastructure for high quality, rapid experimentation and iteration.
- Data Center Engineers who build and optimize massive parallel computing hardware and software infrastructure that ultimately trains and hosts AI models.
- Product Engineers who bridge AI models into usable applications that reduce user friction leading to higher productivity gains.
China’s demographic advantages in talent development are substantial. With 1.4 billion people compared to America’s 347 million, China is projected to produce 77,000 STEM PhDs annually versus roughly 40,000 in the United States in 2025. The gap in software engineering is of similar proportion: China had roughly 7 million software engineers compared to 4 million in the U.S. in 2023. If China directs its demographic advantages toward these strategic roles, it will likely enhance its position in global AI development.
Arguably the most important talent factor is the concentration of elite AI research talent. The researchers behind DeepMind’s AlphaGo, Google’s Transformer, and OpenAI’s GPT and Reasoning series created monumental shifts in the field that reoriented the trajectory for years to come. Here, the United States has retained leadership in attracting elite AI researchers through 2022. This elite talent concentration may explain why western institutions continue driving the most consequential AI advances despite China’s numerical advantages. Yet sustaining and operationalizing these breakthroughs also depends on other strategic roles.
Source: Macro Polo.
Data center engineers make AI scale possible by coordinating both the physical infrastructure and software needed to run massive GPU clusters. Their work spans hardware power provisioning, cooling, and failure recovery, as well as the low-level code that enables parallel execution across thousands of interconnected chips. As model demands grow, the systems they design increasingly constrain how fast and how far AI can scale.
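A core piece of that low-level software is collective communication: each chip computes gradients on its own data shard, then all chips synchronize on the average before updating. A pure-Python sketch of the synchronous all-reduce pattern these engineers implement (real systems use optimized libraries such as NCCL over high-speed interconnects; this toy version only conveys the idea):

```python
def all_reduce_mean(per_chip_grads: list[list[float]]) -> list[float]:
    """Average gradient vectors across chips, as a synchronous all-reduce
    would, so every chip applies an identical parameter update."""
    n = len(per_chip_grads)
    dim = len(per_chip_grads[0])
    return [sum(g[i] for g in per_chip_grads) / n for i in range(dim)]

# Four "chips", each holding gradients from a different data shard.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(all_reduce_mean(grads))  # [4.0, 5.0]
```

Keeping tens of thousands of chips synchronized on steps like this—despite hardware failures and bandwidth limits—is precisely the engineering work that gates how far training runs can scale.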
Product engineers, by contrast, play a critical role in translating model capabilities into mass adoption and productivity. While GPT-3 was available via API in 2020, it was the launch of ChatGPT’s intuitive interface in late 2022 that drove widespread uptake—surpassing one million users in five days and 100 million within two months. This leap reflects how product engineers can mitigate key frictions, helping users engage with the model naturally, productively, and without technical hurdles. Unfortunately, evaluating U.S. talent density in these two roles proves more difficult due to limited or unclear data.
The two nations employ fundamentally different talent recruitment strategies. The U.S. is generally more bottom-up and pathway-centric, relying on visa pathways like OPT and STEM OPT extensions and federal R&D fellowships to openly attract international talent. China is more top-down and individual-centric. It operates the more aggressive and targeted Thousand Talents Program and successor High-Level Experts schemes, actively recruiting hundreds of specific top-tier researchers annually.
America’s advantages derive from the wealth of professional opportunities—access to cutting-edge compute resources, open research collaboration, and immigration pathways combined with quality of life and competitive compensation. China’s appeal lies in government-sponsored recruitment packages for world-class talent including signing bonuses exceeding $400,000, housing, and benefits, alongside cultural and familial connections for Chinese nationals and rapidly improving research infrastructure. At the same time, the offers for the foremost elite talent in the U.S. market without any government intervention are immense, with alleged bonus structures up to $100 million.
Notably, some companies have said that AI-driven efficiency gains are reducing their total workforce needed for a given set of software engineering tasks. This could reduce the need for junior engineers and increase the value of rarer talent, greatly shifting the strategic landscape. Or, it could drive more net productivity as the junior engineers work on other useful tasks. Regardless, as China’s base continues growing, the talent differential that currently favors the U.S. rests on uncertain ground.
Talent Policy Responses
Public policy can play a decisive role in overcoming these constraints. Key options include:
- [Response 1] Track Talent Data. Establishing monthly tracking of AI talent density within major labs and cross-border flows between U.S., China, and other nations would provide critical intelligence for workforce planning and early warning systems for talent drain to strategic competitors.
- [Response 2] High-Skill Immigration Fast-Track. Ease H-1B and related STEM-visa caps, give advanced graduates a clear route to permanent residence, and replace sweeping security holds on Chinese scientists with case-by-case vetting—so world-class AI talent still picks U.S. labs over foreign rivals.
- [Response 3] Accelerate Educational Pipelines. Scale, evaluate, and iterate on federal- and state-backed training and apprenticeships that develop entry-level technicians for data center roles within one year, as well as multi-year tracks for advanced engineering positions. Fund research-intensive AI MS and PhD tracks that channel graduates directly into impactful roles in frontier labs, academia, or especially government.
The AI Action Plan makes progress in training datacenter support workers and bolstering federal AI talent, but overlooks tracking AI talent flows, expanding high-skill immigration, and building pipelines for advanced AI researchers. It identifies critical occupations such as electricians and advanced HVAC technicians, sets national skill frameworks, and funds industry-driven training, pre-apprenticeships, and updated Career and Technical Education (CTE) pathways. It also creates a talent-exchange mechanism so that specialized AI staff can be detailed rapidly across agencies, and directs DOE to expand hands-on AI research training at national laboratories.
These initiatives will strengthen AI-related workforce capacity. However, further opportunities for the administration remain. The administration could enhance strategic visibility by tracking AI talent flows, providing early detection of shortages or brain drain. It could also expand high-skill immigration pathways and improve retention, addressing longstanding bottlenecks that deter leading foreign researchers from coming to or staying in the United States. Additionally, it could invest in dedicated MS/PhD pipelines for frontier-model research, growing the bench of researchers who drive AI advancement itself. These policies would further strengthen U.S. leadership even as China produces roughly 77,000 STEM PhDs a year versus 40,000 in the United States and executes individual-centric recruitment schemes.
Compute Implications
While this piece focuses on energy, data, and talent, it is important to also understand compute’s place in the broader strategic landscape. Unlike the other inputs, which accumulate over decades and flow through diffuse and diverse supply chains, AI chips present the most concentrated chokepoint in the entire AI ecosystem.
Advanced AI chips are produced by a small number of specialized foundries, using equipment from highly concentrated supply chains. This concentration creates unique policy leverage with immediate effects. Recent chip export controls demonstrate this in action: DeepSeek reportedly cannot release its new R2 model due to insufficient compute for mass deployment. However, recent decisions to ease restrictions on the H20 chip risk undermining this leverage.
The AI Action Plan proposes further needed steps for chip export control security through on-chip location verification, expanded end-use monitoring, new restrictions on uncontrolled sub-systems, intelligence-community-backed enforcement, and allied alignment measures to close diversion and backfill loopholes.
Compute is a critical lever—but the analysis of energy, data, and talent reveals a broader truth: China remains competitive across several inputs that will ultimately determine long-term AI leadership.
AI systems are only as strong as their weakest input. Whichever resource acts as the limiting reagent ultimately bottlenecks progress and sets the ceiling for capability. The U.S. must step forward in energy, data, and talent in order to retain our AI leadership.
The Path Forward
To preserve American AI leadership, policymakers must treat energy, data, and talent as strategic national assets requiring coordinated investment. This means streamlining energy permitting, establishing pre-approved compute zones, expanding trusted cross-border data corridors while tightening controls to China, hardening the domestic data ecosystem, fast-tracking high-skill immigration, accelerating AI-relevant education and training pipelines, and systematically tracking talent flows to prevent strategic drain. The U.S. AI Action Plan took critical first steps, but there is more to do.
Preserving American leadership will require more than holding a single chokepoint. While the U.S. currently maintains advantages in compute, sustained investment across every strategic input—energy, data, and talent—remains essential. The trajectory of AI capabilities will be set not by our strongest domain, but by the weakest one.