Research the Present

Influential Actors

Sam Altman

Altman's statement indicates that training on The New York Times' data is not a priority for OpenAI, and that their method is robust to the removal of any one source. While this may partly be a negotiating tactic, it seems plausible that NYT's material indeed constitutes only a small fraction of the high-quality data available to OpenAI.
Observation:
Altman stated that OpenAI is "open to training on The New York Times, but it’s not our priority", adding that "we actually don’t need to train on their data."
The New York Times

The New York Times sued OpenAI for copyright infringement while the pair were negotiating a deal in which NYT content would appear in ChatGPT. As of early 2024, several media companies have successfully sought or are currently negotiating licensing deals to provide training data to OpenAI, indicating that this was an option for NYT as well. The copyright lawsuit may be a tactic to strengthen NYT's position during licensing negotiations, but, given OpenAI's likely resilience to the loss of any one data source, NYT may not expect to be able to dramatically increase the value of any deal this way.
Observation:
In January 2024, OpenAI were reported to be in licensing negotiations with several prominent media companies
Observation:
In December 2023, The New York Times sued OpenAI for copyright infringement
Observation:
The New York Times sued OpenAI for copyright infringement while the pair were negotiating a deal where NYT content would appear in ChatGPT

Research Questions

What is the current status of the lawsuit brought against OpenAI by the New York Times as of March 2024?
Has there been any resolution or court ruling in the lawsuit involving OpenAI, the New York Times, and the allegations made by Elon Musk against OpenAI and Sam Altman?
What are the legal arguments presented by OpenAI in their motion to dismiss parts of the lawsuit filed by the New York Times?
What are the specific allegations made by Elon Musk in his lawsuit against OpenAI and Sam Altman?
Has OpenAI released any public statements regarding the progress or status of the lawsuit filed by the New York Times in December 2023?
Have there been any official comments or press releases from the New York Times addressing the allegations of 'hacking' ChatGPT as claimed by OpenAI?

Chosen Sources

Website | Description
law.com | A legal industry news website providing articles, analysis, and resources for legal professionals.
wikipedia.org | A comprehensive, multilingual online encyclopedia collaboratively edited by volunteers.
theverge.com | An American technology news and media network operated by Vox Media.

nytimes.com | The online presence of The New York Times, an American newspaper with worldwide influence and readership.
openai.com | The official website of OpenAI, an artificial intelligence research laboratory consisting of the for-profit OpenAI LP and its parent company, the non-profit OpenAI Inc.
scmp.com | An English-language news website providing comprehensive coverage of China and Asia-related news.
caixin.com | A Chinese financial and business news media outlet known for investigative journalism and in-depth reporting.
xinhuanet.com | The official press agency of the People's Republic of China, providing news and information in multiple languages.

uscourts.gov | The official website of the United States federal judiciary, providing court information and legal resources.
supremecourt.gov | The official website of the Supreme Court of the United States, providing information on the Court's opinions, docket, and other judiciary materials.
justice.gov | The official website of the United States Department of Justice (DOJ).

arxiv.org | An open-access repository for scholarly articles in physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics.
nature.com | A leading international science journal and website for peer-reviewed research, news, and commentary.
sciencedirect.com | A leading scientific database offering peer-reviewed journal articles and book chapters from various fields of science and technology.
ieeexplore.ieee.org | A digital library and research platform for electrical engineering, computer science, and electronics literature, provided by the Institute of Electrical and Electronics Engineers (IEEE).
aclweb.org | The official website of the Association for Computational Linguistics (ACL).

bloomberg.com | A financial, software, data, and media company that provides news and information on global markets, economies, and businesses.
reuters.com | A leading international news organization providing top news from around the world.
wsj.com | The online presence of The Wall Street Journal, an international daily newspaper with a focus on business and economic news.
ft.com | The Financial Times website, offering international business, finance, economic and political news and analysis.
techcrunch.com | An American online publisher focusing on the tech industry, startups, and Silicon Valley news.

Articles Read


Facts

Source | Description | Date
arxiv.org | OpenAI's defense to the allegations of copyright infringement is that the use of articles in the training of models can be seen as transformative and should be allowed under fair use. | Mar 22, 2024
arxiv.org | The New York Times and OpenAI had engaged in negotiations about licensing the data from The New York Times articles for training, but did not come to an agreement. | Mar 22, 2024
lunch.publishersmarketplace.com | The New York Times has filed a copyright lawsuit against OpenAI, claiming that OpenAI used the Times's copyrighted material without permission. | Mar 15, 2024
lunch.publishersmarketplace.com | The New York Times has responded to OpenAI's motion to dismiss the copyright lawsuit, stating that OpenAI's claim of 'hacking' is false and irrelevant. | Mar 15, 2024
sustainabletechpartner.com | The New York Times denied an OpenAI claim that the newspaper improperly used OpenAI products to create 'highly anomalous results' as part of its lawsuit against the AI startup, as reported by SeekingAlpha on March 13, 2024. | Mar 13, 2024
theregister.com | OpenAI claimed in February 2024 that The New York Times must have used somebody to 'hack' ChatGPT to make it reproduce NYT content and denied that ChatGPT could be used to dodge the NYT paywall. | Mar 13, 2024
theregister.com | The New York Times responded to OpenAI's moves to dismiss parts of the case in a filing, stating that OpenAI's defense was 'more like spin than a legal brief' and that OpenAI does not dispute the claim that it infringed The Times's copyrights to train and operate its latest models. | Mar 13, 2024
arstechnica.com | OpenAI has acknowledged that the use of ChatGPT to bypass paywalls is 'widely reported' and considers it a 'bug' that they intend to fix. | Mar 12, 2024
arstechnica.com | The New York Times claims that OpenAI's product, ChatGPT, was used to bypass paywalls and access copyrighted content from The New York Times. | Mar 12, 2024
arstechnica.com | The New York Times alleges that OpenAI's products were built by copying The New York Times's content on an unprecedented scale. | Mar 12, 2024
arstechnica.com | OpenAI has not publicly disclosed the makeup of the datasets used to train its AI models, which The New York Times claims includes copyrighted content from The New York Times. | Mar 12, 2024
arstechnica.com | OpenAI temporarily disabled a 'Browse By Bing' plug-in in ChatGPT that allowed access to more recent content from outlets like The New York Times after it was found to infringe copyright. | Mar 12, 2024
law.com | Exhibit J is described as the most striking part of the complaint and is likely what elevates the lawsuit above previous similar cases. | Mar 12, 2024
theverge.com | Microsoft's legal argument includes the assertion that large language models, like OpenAI's, are capable of substantial lawful use, which historically has been a basis for dismissing similar copyright infringement claims. | Mar 5, 2024
theverge.com | Previous lawsuits with similar arguments to The New York Times' against generative AI companies, including one involving authors such as Sarah Silverman, have had claims dismissed. | Mar 5, 2024
theverge.com | OpenAI has filed its own motion to dismiss the lawsuit brought by the New York Times, claiming that the Times 'tricked' ChatGPT into directly reproducing copyrighted material from the publication. | Mar 5, 2024
arstechnica.com | Microsoft's motion to dismiss the lawsuit does not include the direct and vicarious infringement claims, which Microsoft plans to fight later on in litigation with a fair-use defense. | Mar 5, 2024
theverge.com | Microsoft's motion to dismiss argues that the New York Times has not proven that Microsoft violated the Digital Millennium Copyright Act (DMCA) by deliberately removing copyright management information from its training data. | Mar 5, 2024
nytimes.com | Microsoft's motion argues that large language models (L.L.M.s) such as those used by OpenAI's ChatGPT do not supplant the market for news articles and are comparable to technologies like videocassette recorders, which were found to be allowed under copyright law. | Mar 4, 2024
nytimes.com | Microsoft filed a motion in federal court seeking to dismiss parts of a lawsuit brought by The New York Times Company on December 27, 2023. | Mar 4, 2024
nytimes.com | Microsoft's motion is similar to one made by OpenAI the previous week, which also sought to dismiss parts of the lawsuit. | Mar 4, 2024
wsj.com | OpenAI has filed a motion to dismiss a lawsuit from the New York Times, alleging that the company had paid someone to hack OpenAI's products to support the lawsuit. | Feb 28, 2024
wsj.com | The New York Times is suing OpenAI and Microsoft for alleged copyright infringement, claiming that OpenAI used its content to create artificial intelligence tools that divert traffic from the Times website. | Feb 28, 2024
nytimes.com | The New York Times accused OpenAI and its partner Microsoft of infringing on its copyrights by using millions of its articles to train A.I. technologies. | Feb 27, 2024
nytimes.com | OpenAI's motion argues that its online chatbot, ChatGPT, is not a substitute for a New York Times subscription. | Feb 27, 2024
nytimes.com | OpenAI filed a motion in federal court seeking to dismiss some key elements of the lawsuit brought by The New York Times Company. | Feb 27, 2024
law.com | The complaint alleges that OpenAI's GPT models copied and ingested millions of New York Times works for training purposes, and that the models can generate content that is normally protected by The New York Times' paywall. | Feb 7, 2024
huyong.blog.caixin.com | The New York Times' legal team has provided detailed and extensive evidence that AI models were trained using articles from the newspaper. | Jan 25, 2024
huyong.blog.caixin.com | Prior to the lawsuit, The New York Times engaged in negotiations with OpenAI for several months to reach a paid licensing agreement but failed to do so. | Jan 25, 2024
huyong.blog.caixin.com | The New York Times argues that the use of its content by OpenAI and Microsoft could have a significant negative impact on the value of its content, as readers could access the same content through OpenAI without paying for a subscription to The New York Times. | Jan 25, 2024

Key Findings

Copyright Infringement Claims


The New York Times has filed a lawsuit against OpenAI and Microsoft, asserting that OpenAI's AI models, particularly ChatGPT, were trained on and can reproduce content from The New York Times, including material behind the Times' paywall. The Times claims that OpenAI's products were constructed by extensively copying its content, which has led to the unauthorized access and use of copyrighted material. The Times' legal team has presented substantial evidence to support these allegations. OpenAI has not publicly disclosed the specific sources of its training data, but the lawsuit contends that it includes The New York Times' copyrighted works. The lawsuit also alleges that the AI tools developed by OpenAI and Microsoft, using The New York Times' content, have the effect of diverting traffic from the Times' website.
Source Facts
  • The New York Times claims that OpenAI's product, ChatGPT, was used to bypass paywalls and access copyrighted content from The New York Times. (arstechnica.com | Mar 12, 2024)
  • The New York Times alleges that OpenAI's products were built by copying The New York Times's content on an unprecedented scale. (arstechnica.com | Mar 12, 2024)
  • OpenAI has not publicly disclosed the makeup of the datasets used to train its AI models, which The New York Times claims includes copyrighted content from The New York Times. (arstechnica.com | Mar 12, 2024)
  • The New York Times is suing OpenAI and Microsoft for alleged copyright infringement, claiming that OpenAI used its content to create artificial intelligence tools that divert traffic from the Times website. (www.wsj.com | Feb 28, 2024)
  • The New York Times accused OpenAI and its partner Microsoft of infringing on its copyrights by using millions of its articles to train A.I. technologies. (www.nytimes.com | Feb 27, 2024)
  • The complaint alleges that OpenAI's GPT models copied and ingested millions of New York Times works for training purposes, and that the models can generate content that is normally protected by The New York Times' paywall. (www.law.com | Feb 7, 2024)
  • The New York Times' legal team has provided detailed and extensive evidence that AI models were trained using articles from the newspaper. (huyong.blog.caixin.com | Jan 25, 2024)

Fair Use Defense


Microsoft and OpenAI are invoking fair use as a defense in the lawsuit, arguing that the use of New York Times articles to train AI models is transformative and does not supplant the market for the original articles, similar to legal precedents set by technologies like videocassette recorders. They contend that large language models, such as OpenAI's ChatGPT, have substantial lawful use, which has historically been a basis for dismissing copyright claims. Additionally, they claim that ChatGPT is not a substitute for a New York Times subscription. Microsoft's initial motion to dismiss does not address direct and vicarious infringement claims, leaving those for later litigation with a fair-use defense.
Source Facts
  • OpenAI's defense to the allegations of copyright infringement is that the use of articles in the training of models can be seen as transformative and should be allowed under fair use. (arxiv.org | Mar 22, 2024)
  • Microsoft's legal argument includes the assertion that large language models, like OpenAI's, are capable of substantial lawful use, which historically has been a basis for dismissing similar copyright infringement claims. (www.theverge.com | Mar 5, 2024)
  • Microsoft's motion to dismiss the lawsuit does not include the direct and vicarious infringement claims, which Microsoft plans to fight later on in litigation with a fair-use defense. (arstechnica.com | Mar 5, 2024)
  • Microsoft's motion argues that large language models (L.L.M.s) such as those used by OpenAI's ChatGPT do not supplant the market for news articles and are comparable to technologies like videocassette recorders, which were found to be allowed under copyright law. (www.nytimes.com | Mar 4, 2024)
  • OpenAI's motion argues that its online chatbot, ChatGPT, is not a substitute for a New York Times subscription. (www.nytimes.com | Feb 27, 2024)

Market Impact and Substitutability


Microsoft contends that OpenAI's L.L.M.s, like ChatGPT, do not replace the market for news articles and compares them to technologies historically allowed under copyright law. OpenAI argues that ChatGPT is not a substitute for a New York Times subscription. The New York Times claims that OpenAI's use of its content could devalue its offerings by providing access to the same content without subscription. These arguments are central to the lawsuit's resolution on whether OpenAI can continue using New York Times articles for AI model training.
Source Facts
  • Microsoft's motion argues that large language models (L.L.M.s) such as those used by OpenAI's ChatGPT do not supplant the market for news articles and are comparable to technologies like videocassette recorders, which were found to be allowed under copyright law. (www.nytimes.com | Mar 4, 2024)
  • OpenAI's motion argues that its online chatbot, ChatGPT, is not a substitute for a New York Times subscription. (www.nytimes.com | Feb 27, 2024)
  • The New York Times argues that the use of its content by OpenAI and Microsoft could have a significant negative impact on the value of its content, as readers could access the same content through OpenAI without paying for a subscription to The New York Times. (huyong.blog.caixin.com | Jan 25, 2024)

Allegations of Hacking and Misuse


OpenAI has filed a motion to dismiss parts of The New York Times' lawsuit, contending that the Times manipulated ChatGPT into reproducing copyrighted material. OpenAI also accused the Times of having someone 'hack' ChatGPT to achieve this, which the Times has denied. The lawsuit alleges that OpenAI's AI tools, including ChatGPT, have been used to infringe on the Times' copyrights and divert web traffic. The resolution of these claims will be critical in determining whether OpenAI can continue to use New York Times articles for AI model training.
Source Facts
  • The New York Times denied an OpenAI claim that the newspaper improperly used OpenAI products to create 'highly anomalous results' as part of its lawsuit against the AI startup, as reported by SeekingAlpha on March 13, 2024. (sustainabletechpartner.com | Mar 13, 2024)
  • OpenAI claimed in February 2024 that The New York Times must have used somebody to 'hack' ChatGPT to make it reproduce NYT content and denied that ChatGPT could be used to dodge the NYT paywall. (www.theregister.com | Mar 13, 2024)
  • OpenAI has filed its own motion to dismiss the lawsuit brought by the New York Times, claiming that the Times 'tricked' ChatGPT into directly reproducing copyrighted material from the publication. (www.theverge.com | Mar 5, 2024)
  • The New York Times is suing OpenAI and Microsoft for alleged copyright infringement, claiming that OpenAI used its content to create artificial intelligence tools that divert traffic from the Times website. (www.wsj.com | Feb 28, 2024)

Negotiations and Licensing

Source Facts
  • The New York Times and OpenAI had engaged in negotiations about licensing the data from The New York Times articles for training, but did not come to an agreement. (arxiv.org | Mar 22, 2024)
  • Prior to the lawsuit, The New York Times engaged in negotiations with OpenAI for several months to reach a paid licensing agreement but failed to do so. (huyong.blog.caixin.com | Jan 25, 2024)

Analogize the Past

Historical Findings

Historical background


The New York Times and other media entities have sued OpenAI for copyright infringement, seeking damages and the removal of copyrighted content from AI training sets, invoking the DMCA. OpenAI has moved to dismiss parts of the NYT lawsuit. Historical legal precedents show that fair use determinations are complex and inconsistent, and handling of evidence can significantly impact case outcomes.
Source Facts
  • The New York Times filed a lawsuit in December against OpenAI and Microsoft on copyright infringement grounds. (www.nytimes.com | Feb 28, 2024)
  • OpenAI filed a motion in court to dismiss key elements of The New York Times's lawsuit. (www.nytimes.com | Feb 28, 2024)
  • Raw Story, AlterNet, and The Intercept sued OpenAI for copyright infringement in a New York federal court. (www.nytimes.com | Feb 28, 2024)
  • The lawsuits seek damages of at least $2,500 per violation and require OpenAI to remove all copyrighted articles from its data training sets. (www.nytimes.com | Feb 28, 2024)
  • The Digital Millennium Copyright Act was cited in the lawsuits against OpenAI for copyright infringement. (www.nytimes.com | Feb 28, 2024)
  • Big media companies, like The New York Times and Getty Images, have filed lawsuits against AI companies for copyright infringement related to the use of copyrighted content for training AI models. (www.theverge.com | Feb 15, 2024)
  • The legal system's fair use doctrine allows for certain kinds of copies to be considered legal under the Copyright Act, but the determination is based on a four-factor test and is not consistent across different courts. (www.theverge.com | Feb 15, 2024)
  • In the case Rossbach v. Montefiore Med. Ctr., 2021 WL 3421569 (S.D.N.Y. Aug. 5, 2021), the defense's handling of suspected fabricated evidence led to severe sanctions against the plaintiff and her counsel, including a dismissal with prejudice. (www.law.com | Oct 4, 2021)
  • The Supreme Court of the United States ruled in McDonough v. Smith that the statute of limitations for a 42 U.S.C. 1983 fabricated evidence claim begins when the criminal proceedings against the plaintiff terminate in their favor, such as with an acquittal. (supreme.justia.com | Jun 20, 2019)

Historical Frequencies

Simulate the Future

Weight Forecasts

Weighting reasoning: Forecast only accounts for cases that were settled in court
Weighting reasoning: This reference class is robust in that it captures most if not all lawsuits brought against tech companies for AI data usage
Weighting reasoning: Captures settlements and dismissed lawsuits in addition to cases resolved in court
Weighting reasoning: Forecast only accounts for cases that were settled in court
Weighting reasoning: The media reference class is broad with few instances referring to news or journalism organizations
Historical forecast: N/A
FUTURESEARCH Forecast

65% probability
Roughly 2 to 1 odds
I began with a historical forecast of 42% and then revised it to 65%, for three reasons. First, OpenAI's defense emphasizes transformative use, drawing parallels to historical precedents that could strengthen its case. Second, if the New York Times' hacking allegations are unproven, it may weaken their position. Third, there are several independent ways the lawsuit could resolve in OpenAI's favor: the New York Times could drop the case, a licensing agreement could be reached, a court settlement could be made, or OpenAI could win in court.
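
The relationship between the stated probability and the stated odds can be checked with a short calculation. The sketch below is illustrative only and is not part of the forecast methodology; the function names are hypothetical. It converts a probability p to odds in favor, p / (1 - p), and back again. Under that definition, 65% corresponds to about 1.86 to 1, which rounds to 2 to 1, and exact 2 to 1 odds correspond to a probability of about 66.7%.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability to odds in favor, expressed as 'x to 1'."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert odds in favor ('odds to 1') back to a probability."""
    if odds <= 0.0:
        raise ValueError("odds must be positive")
    return odds / (1.0 + odds)


# Figures taken from the forecast text above.
historical_forecast = 0.42
final_forecast = 0.65

print(f"historical: {probability_to_odds(historical_forecast):.2f} to 1")
print(f"final:      {probability_to_odds(final_forecast):.2f} to 1")
print(f"exact 2 to 1 odds as a probability: {odds_to_probability(2.0):.3f}")
```

Working in odds makes the size of the revision easier to see: the move from 42% to 65% takes the forecast from odds slightly against (about 0.72 to 1) to odds clearly in favor.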