AI and Copyright Infringement: Analyzing Legal Issues from Training Data to AI Outputs
I. Copyright Issues at the Training Stage (Input Level)
- The Fair Use Debate: Regarding the unauthorized use of copyrighted works to train AI models, developers argue that such use is 'transformative' and therefore qualifies as fair use. Copyright holders counter that it is unauthorized reproduction that diminishes the value of the original works.
- Legislative Safe Harbors (TDM):
- EU: Under the Copyright in the Digital Single Market (CDSM) Directive, Text and Data Mining (TDM) is permitted even for commercial purposes, provided the copyright holder has not opted out by reserving its rights in a machine-readable way (a sketch of checking such an opt-out signal appears after this list).
- Korea & Japan: Both countries
have implemented or are discussing legislation that clarifies that
utilizing copyrighted works for the purpose of 'information analysis'
does not constitute infringement.
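As a minimal sketch of how a machine-readable opt-out might be checked in practice, the snippet below uses Python's standard urllib.robotparser to test whether a site's robots.txt disallows a given AI crawler user-agent before its pages are collected for text and data mining. The site URL is a placeholder and GPTBot is used only as an illustrative, publicly documented crawler name; robots.txt is one of several possible reservation mechanisms under the CDSM Directive, not a definitive compliance check.

```python
from urllib import robotparser

# Placeholder site and illustrative crawler user-agent (assumptions for the
# sketch, not a definitive list of opt-out mechanisms under the CDSM Directive).
SITE = "https://example.com"
USER_AGENT = "GPTBot"

parser = robotparser.RobotFileParser()
parser.set_url(f"{SITE}/robots.txt")
parser.read()  # fetch and parse the site's robots.txt

# If the crawler is disallowed, treat that as an opt-out signal and skip the site.
if parser.can_fetch(USER_AGENT, f"{SITE}/articles/sample-page"):
    print("No robots.txt opt-out for this user-agent; other rights reservations may still apply.")
else:
    print("Opt-out signal detected: exclude this site's content from the training corpus.")
```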
II. Criteria for Judging Infringement in AI Outputs (Output Level)
- Substantial Similarity: The core question is whether the AI-generated image or text replicates the creative expression of the original work. Mere styles and ideas are not protected, but the risk of infringement is high when specific characters or passages are found to be substantially similar.
- Access (Dependency): Another major point of contention is whether the copyrighted work was included in the AI's training dataset. Courts have recently tended to recognize 'access' broadly, presuming that the model 'saw' and relied on the original if the work formed part of the vast training data.
III. Copyrightability of AI-Generated Content
- Human Creative Contribution: Under current law, a copyrightable work must be the product of 'human thoughts and emotions.' Simply entering a prompt is not enough; copyright is recognized only exceptionally, where a human makes a significant creative contribution in selecting, modifying, and arranging the output.
- Global Stance: The U.S. Copyright Office (USCO) has denied registration for AI-generated images in specific cases, granting protection only to the human-authored text and the overall selection and arrangement (e.g., the comic book Zarya of the Dawn, whose human-written text and arrangement were registered while its AI-generated images were not).
IV. Practical Risk Management Strategies for Enterprises
- Filtering and Guardrail Technologies: Enterprise AI deployments should include output-stage filters that cross-reference generated content against known copyrighted works in real time and block anything substantially similar before it reaches the user (see the sketch following this list).
- Copyright Indemnification Policies: Major IT vendors such as Microsoft and Adobe manage customer risk by offering 'Copyright Indemnity' commitments: if a customer is sued for copyright infringement while using the vendor's AI services, the provider covers the legal costs and damages.
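As a rough illustration of the output-stage filtering idea above, the sketch below compares generated text against a small reference corpus using character n-gram Jaccard similarity and blocks the output when the overlap exceeds a threshold. The corpus, threshold, and metric are assumptions chosen for brevity; production guardrails typically rely on more robust techniques such as content fingerprinting, embedding similarity, or licensed reference databases.

```python
# Minimal output-stage guardrail sketch: flag generated text that overlaps
# heavily with known reference passages. Threshold and metric are illustrative.

def char_ngrams(text: str, n: int = 8) -> set:
    """Return the set of lowercase character n-grams in the text."""
    text = " ".join(text.lower().split())  # normalize whitespace and case
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two n-gram sets (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def is_blocked(generated: str, reference_corpus: list, threshold: float = 0.35) -> bool:
    """Block the output if it is too similar to any reference passage."""
    gen = char_ngrams(generated)
    return any(jaccard(gen, char_ngrams(ref)) >= threshold for ref in reference_corpus)

# Hypothetical usage with placeholder data.
references = ["The quick brown fox jumps over the lazy dog near the riverbank."]
candidate = "The quick brown fox jumps over the lazy dog near the riverbank today."
print("blocked" if is_blocked(candidate, references) else "allowed")
```

In practice the threshold and the choice of reference material would be tuned to balance over-blocking against the risk of reproducing protected expression.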
[Case Insight] The Conflict Between Tradition and Innovation
High-profile disputes, such as The New York Times v. OpenAI (over models 'memorizing' and reproducing articles) and Getty Images v. Stability AI (concerning the reproduction of watermarks), underscore the "Third-Party Dependency Risk." These cases illustrate why businesses must develop their own resilience strategies, such as multi-cloud environments and proactive legal audits.
Disclaimer: The information provided in this article is for general informational and educational purposes only and does not constitute legal, financial, or professional advice. The content reflects the author's analysis and opinion based on publicly available information as of the date of publication. Readers should not act upon this information without seeking professional legal counsel specific to their situation. We explicitly disclaim any liability for any loss or damage resulting from reliance on the contents of this article.