AI and Copyright Infringement: Analyzing Legal Issues from Training Data to AI Outputs

 

I. Copyright Issues at the Training Stage (Input Level)

  • The Fair Use Debate: Regarding the unauthorized use of copyrighted works to train AI models, developers argue that such training constitutes 'Transformative Use' and thus qualifies as Fair Use. Copyright holders counter that it amounts to unauthorized reproduction at scale that diminishes the market value of the original works.
  • Legislative Safe Harbors (TDM):
    • EU: Under the Copyright in the Digital Single Market (CDSM) Directive, Text and Data Mining (TDM) is permitted even for commercial purposes, provided the copyright holder has not opted out.
    • Korea & Japan: Japan's Copyright Act already permits the use of works for 'information analysis' (Article 30-4), and Korea has been discussing a similar TDM exception in proposed amendments to its Copyright Act, so that such use would not constitute infringement.

II. Criteria for Judging Infringement in AI Outputs (Output Level)

  • Substantial Similarity: The core question is whether the AI-generated image or text replicates the creative expression of the original work. Mere style or ideas are not protected, but the risk of infringement is high where specific characters, passages, or other expressive elements are substantially similar to the original.
  • Access (Dependency): A major point of contention is whether the copyrighted work was included in the AI's training dataset. Courts have recently shown a tendency to recognize 'Access' broadly, presuming that the model 'saw' and relied upon the original, where the work formed part of the vast training corpus.

III. Copyrightability of AI-Generated Content

  • Human Creative Contribution: Under current law, a copyrightable work must be an expression of 'human thoughts and emotions.' Simply entering a prompt is insufficient; copyright is recognized only exceptionally, where a human makes a significant creative contribution in selecting, modifying, and arranging the output.
  • Global Stance: The U.S. Copyright Office (USCO) has refused registration for AI-generated images in specific cases, granting protection only to the human-authored text and the overall selection and arrangement (e.g., the comic book Zarya of the Dawn, whose AI-generated images were excluded from registration).

IV. Practical Risk Management Strategies for Enterprises

  • Filtering and Guardrail Technologies: Enterprise AI deployments should implement output-stage filters that cross-reference generated content against known copyrighted works in real time and block anything substantially similar before it reaches the user (a minimal sketch follows this list).
  • Copyright Indemnification Policies: Major technology providers such as Microsoft and Adobe manage customer risk through 'Copyright Indemnity' commitments: if a customer is sued for copyright infringement over content produced with the provider's AI services, the provider covers the legal costs and damages.
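
To make the filtering idea concrete, the sketch below illustrates one crude way an output-stage guardrail might flag overlap with a registry of protected texts. The function names, the word n-gram heuristic, and the 0.3 threshold are illustrative assumptions only; they do not describe any vendor's actual guardrail system, and n-gram overlap is an engineering proxy, not the legal test for substantial similarity.

    # Illustrative output-stage copyright filter (hypothetical names and threshold).
    # N-gram overlap is a crude engineering proxy, not the legal standard for
    # substantial similarity.

    from typing import Iterable

    def word_ngrams(text: str, n: int = 8) -> set:
        """Return the set of word n-grams in a text."""
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    def overlap_ratio(candidate: str, reference: str, n: int = 8) -> float:
        """Fraction of the candidate's n-grams that also appear in the reference work."""
        cand = word_ngrams(candidate, n)
        if not cand:
            return 0.0
        return len(cand & word_ngrams(reference, n)) / len(cand)

    def filter_output(generated: str, protected_works: Iterable,
                      threshold: float = 0.3) -> str:
        """Release the output, or block it if it overlaps heavily with any registered work."""
        for work in protected_works:
            if overlap_ratio(generated, work) >= threshold:
                return "[BLOCKED: output is too similar to a protected work]"
        return generated

    if __name__ == "__main__":
        registry = [
            "the quick brown fox jumps over the lazy dog while the lazy dog sleeps by the river"
        ]
        # Released: no long verbatim overlap with the registry.
        print(filter_output("an original paragraph about fair use and transformative use", registry))
        # Blocked: near-verbatim copy of a registered work.
        print(filter_output("the quick brown fox jumps over the lazy dog while the lazy dog sleeps", registry))

In practice, enterprise guardrails are more likely to rely on embedding-based similarity, perceptual hashing for images, or vendor moderation APIs, but the control flow is the same: compare the output against a reference set, then block or release it before delivery.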

[Case Insight] The Conflict Between Tradition and Innovation

High-profile disputes, such as the New York Times vs. OpenAI lawsuit (over AI 'memorizing' and reproducing articles) and the Getty Images vs. Stability AI case (concerning the reproduction of watermarks), underscore the "Third-Party Dependency Risk." These cases illustrate why businesses must develop their own resiliency strategies, such as multi-cloud environments and proactive legal audits.


Disclaimer: The information provided in this article is for general informational and educational purposes only and does not constitute legal, financial, or professional advice. The content reflects the author's analysis and opinion based on publicly available information as of the date of publication. Readers should not act upon this information without seeking professional legal counsel specific to their situation. We explicitly disclaim any liability for any loss or damage resulting from reliance on the contents of this article.
