California Assembly Bill 2013 (2024)
| California Assembly Bill 2013 (2024) | |
|---|---|
| Legislature | California State Legislature |
| Full name | AB 2013: Generative artificial intelligence: training data transparency |
| Introduced | January 31, 2024 |
| Assembly voted | May 20, 2024 |
| Senate voted | August 26, 2024 |
| Sponsor | Jacqui Irwin |
| Governor | Gavin Newsom |
| Bill | AB-2013 |
| Website | Bill Text |
| Status | Current legislation |
California Assembly Bill 2013 (AB 2013) is a California law requiring developers of generative artificial intelligence systems to publicly disclose information about the data used to train their models. The law was authored by Assemblymember Jacqui Irwin (D-Thousand Oaks), signed by Governor Gavin Newsom on September 28, 2024, and took effect on January 1, 2026.[1][2] It passed both chambers of the legislature unanimously (38–0 in the Senate, 75–0 in the Assembly).[3]
AB 2013 was among 18 AI-related bills enacted by California in 2024, a period in which the state's regulation of artificial intelligence drew national attention, particularly around the vetoed SB 1047.[4]
Provisions
Scope
The law applies to developers who make generative AI systems or services publicly available to Californians, whether for free or for compensation. The statute defines "developer" broadly to cover anyone who designs, codes, or produces an AI system, as well as anyone who creates a new version or update that materially changes its functionality, including through retraining or fine-tuning.[1][5] Exemptions exist for AI systems used solely for security and integrity purposes, for aircraft operations, or for national security and defense purposes made available only to federal entities.[1]
The law applies retroactively to any generative AI system released on or after January 1, 2022, and to any substantial modification made after that date.[1]
Disclosure requirements
Developers must post documentation on their websites describing the data used to train their generative AI systems. The law requires this documentation to include a "high-level summary" of the datasets covering twelve statutory categories of information, grouped by subject below:[1][6]
- The sources or owners of the datasets and a description of how they further the system's intended purpose
- The number of data points and a description of data types
- Whether the datasets include data protected by copyright, trademark, or patent, or whether they are in the public domain, along with applicable licensing information
- Whether the datasets include personal information as defined by the California Consumer Privacy Act or aggregate consumer information
- Whether the datasets were purchased or licensed
- Whether the developer used synthetic data in training
- Any cleaning, processing, or other modification performed on the datasets
- The time period during which data was collected and when datasets were first used in development
This documentation must be posted before the system is made publicly available and updated before each substantial modification.[1]
Enforcement
The law does not establish a specific penalty or enforcement mechanism for noncompliance, nor does it include a trade secret exemption for disclosures.[4] The absence of a trade secret provision has been a point of concern among legal commentators, who have noted that forced disclosure could reduce the value of proprietary information about training datasets.[7]
Compliance
When the law took effect on January 1, 2026, OpenAI and Anthropic were among the first companies to publish the required documentation.[8] Both companies addressed each of the twelve statutory categories but did not name any specific datasets, instead characterizing their training data at a general level by referring to categories such as web content, licensed material, user contributions, and AI-generated data.[8]
In its disclosure, Anthropic said that personal information appears in its training data as a byproduct of collecting publicly available web content, and described the use of technical measures to reduce the presence of such information in the model's responses.[8] Both companies stated that their training data may include copyrighted material.[8]
As of early 2026, several other major AI developers had not yet published disclosures.[8][9]
xAI lawsuit
On December 29, 2025, two days before the law took effect, xAI, the developer of the Grok chatbot, filed a federal lawsuit in the United States District Court for the Central District of California against California Attorney General Rob Bonta, seeking to block enforcement of AB 2013.[10][11] xAI is represented by Clement & Murphy, the firm of Paul Clement and Erin Murphy.[11]
The complaint raises four constitutional claims:[12]
- That AB 2013 effects an unconstitutional taking of xAI's trade secrets under the Fifth Amendment, because it forces public disclosure of proprietary information without compensation. xAI argues that the composition and curation of its training datasets are worth billions of dollars and derive their value from secrecy.
- That the law constitutes an unconstitutional regulatory taking by destroying the economic value of xAI's trade secrets and interfering with its investment-backed expectations, particularly because the law applies retroactively to models released before it was enacted.
- That the law violates the First Amendment by compelling speech, forcing xAI to publicly describe aspects of its products. xAI argues this is a content-based regulation that should be subject to strict scrutiny.
- That the law is unconstitutionally vague under the Due Process Clause of the Fourteenth Amendment, because it does not define key terms such as "high-level," "datasets," or "data point."
Legal commentators at the Institute for Law & AI have observed that the strength of xAI's trade secret argument is weakened by the fact that OpenAI and Anthropic chose to comply voluntarily, suggesting the statute can be satisfied without disclosing competitively sensitive details.[12] The California Department of Justice said it would defend the law.[10]
See also
- Transparency in Frontier Artificial Intelligence Act
- Artificial Intelligence Act
- Regulation of artificial intelligence in the United States
- Regulation of artificial intelligence
References
- ^ a b c d e f "AB-2013 Generative artificial intelligence: training data transparency". California Legislative Information. Retrieved February 27, 2026.
- ^ "Big win for AI transparency: California Gov. Newsom signs Training Data Act into law". Transparency Coalition. September 29, 2024. Retrieved February 27, 2026.
- ^ Kak, Amba (September 30, 2024). "Time for California to Act on Algorithmic Discrimination". Tech Policy Press. Retrieved February 27, 2026.
- ^ a b "California's AB 2013 Requires Generative AI Data Disclosure by January 1, 2026". Crowell & Moring. Retrieved February 27, 2026.
- ^ "California's AB 2013: Generative AI Developers Must Show Their Data". Goodwin Procter. June 2025. Retrieved February 27, 2026.
- ^ "AB 2013: New California AI Law Mandates Disclosure of GenAI Training Data". Perkins Coie. Retrieved February 27, 2026.
- ^ "California's AB 2013: Challenges and Opportunities in Generative AI Compliance". Baker Botts. November 2024. Retrieved February 27, 2026.
- ^ a b c d e "California's AB 2013 Takes Effect: Navigating AI Training Data Transparency and Trade Secret Risk". Goodwin Procter. January 16, 2026. Retrieved February 27, 2026.
- ^ "AI Developers Avoid Details in Initial Training Data Disclosures Under California Statute". PYMNTS. January 22, 2026. Retrieved February 27, 2026.
- ^ a b "xAI Sues California Attorney General Over Training Data Law". Bloomberg Law. December 31, 2025. Retrieved February 27, 2026.
- ^ a b "Unmaking Grok: Elon Musk's xAI Sues California Attorney General Over AI Training Data Transparency Act". National Law Review. Retrieved February 27, 2026.
- ^ a b "xAI's Challenge to California's AI Training Data Transparency Law (AB2013)". Institute for Law & AI. January 3, 2026. Retrieved February 27, 2026.