FM copyright infringement

Tags: Mitigate, Evaluate

In this session, our readings cover:

Required Readings:

Foundation Models and Fair Use

Extracting Training Data from Diffusion Models

A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT

More Readings:

Audio Deepfake Detection: A Survey

Membership Inference Attacks against Language Models via Neighbourhood Comparison

https://aclanthology.org/2023.findings-acl.719/

Deepfake Taylor Swift event:




Blog:

  1. Foundation Models and Fair Use
  2. Copyright Plug-in Market for The Text-to-Image Copyright Protection
  3. Extracting Training Data from Diffusion Models

Paper A. Foundation Models and Fair Use

A.1     Objectives and Motivations

The authors emphasize that fair use is not guaranteed, and additional work may be necessary to keep model development and deployment squarely in the realm of fair use.

  1. Survey the potential risks of developing and deploying foundation models based on copyrighted content.
    • Experiments confirm that popular foundation models can generate content considerably similar to copyrighted material
  2. Discuss technical mitigations that can help foundation models stay in line with fair use
    • more research is needed to align mitigation strategies with the current state of the law
  3. Suggest that the law and technical mitigations should co-evolve

A.2     Fair Use

Foundation models are machine learning models trained on broad data (typically scraped from the internet) generally using self-supervision at scale (Bommasani et al., 2021).

As foundation models are expanded into more products, deployments will only scale to more and more users.

Fair Use Defense

  1. Data creator
    • Creates content that might be used for GenAI training.
    • Their copyright may be violated.
    • May sue the tech company that deploys GenAI.
  2. Tech Company: When tech companies that deploy GenAI are sued for copyright violation, they can invoke the fair use defense to avoid being held liable.

Four “Arguments” a Tech Company Can Use for Defense: if the use of unlicensed copyrighted materials satisfies the following, then such use may be legal:

  1. The use is transformative.
  2. (Nature of the work) The work used is factual rather than creative.
  3. The amount of the portion used is small.
  4. The use has little effect on the market for the copyrighted materials.

Natural Language Text - Examples of Fair Use Defense

The authors examined relevant cases that might help shape what is considered fair use for these models, some of which can be seen in Figure 1.

Text generation: One of the most prevalent, and earliest, use-cases of foundation models, like GPT.

Applications: Copy-editing, text-based games, and general-purpose chatbots.

Training data sources: internet, books, court documents.

Fair Use Considerations:

  1. The role of transformation in determining fair use.
  2. Examination of relevant cases paralleling foundation model outputs.

Verbatim Copying and Hypotheticals:

  1. Google Books case: Limited content provision as fair use.
  2. Hypothetical scenario: Virtual assistant reading books aloud.

Implications for Foundation Models:

  1. The thin line between transformative use and copyright infringement.
  2. The importance of model output transformation for fair use defense.

Challenges in Determining Fair Use:

  1. Difficulty in applying fair use to verbatim and minimally transformed outputs.
  2. The significance of the amount and substantiality of the used portion.

Strategies for Compliance:

  1. Enhancing model outputs for greater transformation.
  2. Legal and technical strategies to align with fair use doctrine.

Code - Examples of Fair Use Defense

Although natural language text and code generation models have similar training processes, each is governed by different case law, so their fair use assessments vary slightly.

Literal vs. Non-literal Infringement:

Challenges in Non-literal Copyright:

  1. Judges acknowledge unclear boundaries for non-literal program structure copyright protection.
  2. Difficulty in proving nonliteral infringement due to protection limitations on non-expressive, functional elements of programs.

Criteria for Fair Use in Code:

  1. Small amounts of copied code, significant transformation, or different overall products may indicate fair use.
  2. The importance of transforming generated content to reduce infringement risk.

Copyright Protection Limitations:

  1. Functional aspects of code have limited copyright protection compared to creative works.
  2. Encouragement for transformation in generated software to minimize legal risks.

Additional Concerns in Code Generation:

  1. Potential right of publicity issues with verbatim output of usernames.
  2. DMCA §1202 and right of publicity considerations for transformative works.

Figure 4 shows that models can generate function implementations that substantially overlap with reference implementations

Generated Images - Examples of Fair Use Defense

The third commonly produced category of generative AI is image generation.

The complexities of fair use with images are illustrated by Hypothetical 2.5: “Generate Me Video-Game Assets.”

While fair use might offer some defense, the direct appropriation of artists’ work with only slight alterations poses a significant legal risk for the company, indicating that their use might not qualify as fair use.


Style Transfer

Style transfer involves more abstract scenarios, where art is generated in different styles. There are three components to consider:

  1. The rights of the original image that is being transformed into a different style.
  2. The rights of the artist whose style is being mimicked.
  3. Other intellectual property considerations with images: the right to publicity and trademark infringement.

A.3     Technical Mitigation

A.3.1 Data Filtering

Two Types of Data Filtering

  1. Exclude problematic data from training entirely.
    • E.g., AlphaCode was trained only on unlicensed GitHub source code.
    • Respect robots.txt restrictions for web-crawled data.
  2. Deduplication to reduce memorization (see the sketch below).
    • Problematic: given different images of an NBA player, a tattoo may still be memorized.
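The deduplication step can be approximated with embedding-based near-duplicate detection. Below is a minimal sketch assuming an open_clip image encoder; the ViT-B-32 backbone and the 0.95 cosine threshold are illustrative assumptions, not choices from the paper.

```python
# Minimal near-duplicate filtering sketch (illustrative, not the paper's pipeline).
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="openai")

@torch.no_grad()
def embed(path):
    image = preprocess(Image.open(path)).unsqueeze(0)
    feat = model.encode_image(image)
    return feat / feat.norm(dim=-1, keepdim=True)   # unit-normalize for cosine similarity

def deduplicate(paths, threshold=0.95):
    """Keep one representative per group of near-duplicate images."""
    kept, kept_embs = [], []
    for p in paths:
        e = embed(p)
        if all(float(e @ k.T) < threshold for k in kept_embs):
            kept.append(p)
            kept_embs.append(e)
    return kept
```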

A.3.2 Output Filtering

Apply a filter to detect outputs similar to the training data, e.g., GitHub Copilot's duplication filter.

Disadvantages of Current Output Filters

  1. Additional inference costs
  2. Easily bypassed by minor style-transfer

Future direction: An output filter that detects high-level semantic similarity?
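One possible direction is to compare generations against known training content in an embedding space rather than by exact match. The sketch below uses sentence-transformers for text; the model name and the 0.9 threshold are assumptions for illustration, not part of the paper.

```python
# Sketch of a semantic output filter (illustrative assumptions throughout).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def blocked(generated_text, training_snippets, threshold=0.9):
    """Flag a generation whose embedding is too close to any known training
    snippet; unlike exact-match filters, this also catches lightly restyled copies."""
    gen_emb = encoder.encode(generated_text, convert_to_tensor=True)
    corpus_emb = encoder.encode(training_snippets, convert_to_tensor=True)
    return bool(util.cos_sim(gen_emb, corpus_emb).max() >= threshold)
```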

A.3.3 Instance Attribution

Instance attribution refers to methods that assign attribution scores to training examples to understand the contribution of individual examples (or groups of examples) to (test-time) model predictions (Koh & Liang, 2017; Ghorbani & Zou, 2019; Jia et al., 2019; Pezeshkpour et al., 2021; Ilyas et al., 2022). One application of instance attribution is determining the source of a generated output.

Instance attribution can also address the credit assignment problem by providing a clear attribution page that lists all works that contributed to the output, along with licensing information, to comply with Creative Commons license attribution guidelines

While promising, current techniques in instance attribution tend to suffer from difficulties in scaling due to high computational cost (e.g., leave-k-out retraining can be costly) (Feldman & Zhang, 2020; Zhang et al., 2021) or being inaccurate or erroneous when applied to complex but realistic model classes (Basu et al., 2020; Ghorbani et al., 2019; Søgaard et al., 2021).
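As a rough illustration of the idea (not the specific estimators cited above), a TracIn-style score can be computed from gradient dot-products at a single checkpoint; `model`, `loss_fn`, and the datasets are hypothetical placeholders.

```python
# Simplified single-checkpoint, gradient-dot-product attribution sketch.
import torch

def grad_vector(model, loss_fn, x, y):
    loss = loss_fn(model(x), y)
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.flatten() for g in grads])

def attribution_scores(model, loss_fn, train_set, test_example):
    """Score each training example by how strongly its loss gradient
    aligns with the test example's gradient (higher = more influence)."""
    g_test = grad_vector(model, loss_fn, *test_example)
    return [float(torch.dot(grad_vector(model, loss_fn, x, y), g_test))
            for x, y in train_set]
```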

Disadvantage:

It naturally requires identifying and scoring candidate training instances before inference can be attributed, which adds overhead.

A.3.4 Differentially Private Training

For example, in DP-SGD, noise is added to the gradients during training, so the learned parameters are the output of a randomized mechanism with a provable DP guarantee.

Benefits for fair use: DP-trained models are naturally less likely to memorize any single training instance.
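A minimal DP-SGD step might look like the sketch below, with per-example gradient clipping and Gaussian noise. This is a toy illustration (libraries such as Opacus implement this properly and track the privacy budget), and all names are placeholders.

```python
# Toy DP-SGD step: clip per-example gradients, add Gaussian noise, update.
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y,
                lr=0.1, max_norm=1.0, noise_multiplier=1.1):
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(batch_x, batch_y):              # per-example gradients
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (max_norm / (norm + 1e-6)).clamp(max=1.0)   # clip to bound sensitivity
        for s, g in zip(summed, grads):
            s += g * scale

    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * noise_multiplier * max_norm
            p -= lr * (s + noise) / len(batch_x)    # noisy averaged update
```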

Challenges in Fair Use:

  1. High computation costs
  2. Trade off between privacy and accuracy
  3. Removing a single example does not remove the many similar examples of the same work that may remain in the training data

A.3.5 Learning from human feedback

Learning from human feedback (Ouyang et al., 2022) trains models to generate outputs that are aligned with human preferences and values.

For human annotations, these approaches—and similar ones aimed at promoting helpfulness (Wei et al., 2021; Sanh et al., 2021)—should also consider the copyright risk.

To address this issue, human annotation frameworks in these approaches can take into account the copyright implications of rating systems and instruction following, particularly when incorporating human feedback at scale.

A.4     Forward Looking Agenda

The risk of copyright violation and litigation, even with fair use protection, is a real concern.

To mitigate these risks, the authors recommend that foundation model practitioners consider implementing the mitigation strategies outlined here and pursuing other novel research in this area.

Preventing extreme outcomes in the evolution of fair use law by advancing mitigation strategies: Advancing research in this area (with methods such as improved similarity metrics) may help in preventing extreme outcomes in legal settings.

We should not over-zealously filter: evolutions of fair use doctrine or further policymaking should consider the distributive effects of preventing access to certain types of data for model creation.

Policymakers could consider how and if DMCA (or similar) safe harbors should apply to foundation models: With the uncertainty of DMCA protections, the law may need to adapt to this reality, and it could do so, for instance, by clarifying the role of safe harbors for models that implement sufficiently strong mitigation strategies.

Pursuing other remedies beyond technical mitigation: Importantly, even if technical mitigation strategies managed to keep foundation models within the confines of fair use, these models may still create harm in many other ways, including disrupting creative industries, exploiting labor, and more.

Paper B. Copyright Plug-in Market for The Text-to-Image Copyright Protection

B.1     Motivation and Impact

The paper is motivated by the open question of whether copyright law prohibits using copyrighted data to train machine learning models.

A little bit of Background

B.2     Plug-in Market

Within this structure, all involved parties reap advantages. Copyright holders receive fair compensation for their creative efforts, and end users pay for the utilization of copyrighted plug-ins, safeguarding themselves from copyright infringement accusations in their own creations. Meanwhile, the owner of the base model earns profits through plug-in registration and usage.

Furthermore, the market can transparently monitor the usage of copyrighted works, ensuring a fair and straightforward reward system. A thriving market aligns providers with demanders, ultimately benefiting overall societal welfare.

Plug-in Market Operations

  1. Addition: creators can easily add their work as a plug-in.
  2. Extraction: the model owner can remove infringing works from the base model.
  3. Combination
    • Creators can combine their works together.
    • Users can use different creators’ works to create new images.

Addition

Extraction

  1. Traditional Solution
    • Retrain the model from scratch using only non-infringing data.
    • High cost, complex data clearing, hard to implement.
  2. Instead, “Inverse LoRA”:
    • Unlearn the target concept.
    • Tune the inverted LoRA to memorize surrounding concepts.
    • Invert the LoRA to obtain the non-infringing model.

Unlearning: tune the LoRA so that a copyrighted image is matched to a generic prompt such as “the painting of the building.” Memorization: guide generation away from the target concept, e.g. “Picasso.”
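A very rough sketch of these two objectives is given below. This is not the paper's algorithm: `unet`, `encode_text`, `add_noise`, the `disable_lora` flag, the anchor prompt, and the equal loss weighting are all hypothetical placeholders standing in for a standard LoRA-equipped diffusion pipeline.

```python
# Conceptual sketch of the unlearning + memorization objectives (hypothetical API).
import torch
import torch.nn.functional as F

def extraction_step(unet, encode_text, add_noise, x_copyrighted, t, optimizer):
    noise = torch.randn_like(x_copyrighted)
    x_t = add_noise(x_copyrighted, noise, t)

    # Unlearning: tie the copyrighted image to a generic caption.
    generic = encode_text("the painting of the building")
    loss_unlearn = F.mse_loss(unet(x_t, t, generic), noise)

    # Memorization: keep surrounding concepts by matching the frozen base model
    # on an anchor prompt (the anchor prompt here is an assumption).
    anchor = encode_text("a painting")
    with torch.no_grad():
        base_pred = unet(x_t, t, anchor, disable_lora=True)
    loss_memorize = F.mse_loss(unet(x_t, t, anchor), base_pred)

    loss = loss_unlearn + loss_memorize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # After training, the LoRA weights are inverted and merged into the base
    # model to obtain the non-infringing model.
```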

Combination

  1. Simply adding two plug-ins yields unpredictable outcomes (e.g., “Snoopy” and “Mickey”)
  2. EasyMerge: a data-free layer-wise distillation method
    • Data-free: only requiring plug-ins and corresponding text prompts
    • With layer-wise distillation: accomplish the combination in a few iterations

B.3     Experiment

As the addition operation has been well demonstrated by the public, the authors focus on evaluating extraction and combination operations

In Table 1, the authors present objective measures to assess the performance of the extraction operation in comparison to baseline methods. Their method demonstrates a notable improvement, with the KID metric on the target style increasing from 42 to 187 compared to Concepts-Ablation (Kumari et al., 2023), which indicates better removal of the target style.

Figure 5 shows the extraction of three IP characters: Mickey, R2D2, and Snoopy. The method performs well on all of them, extracting the given IP without disturbing the generation of other IPs. Table 2 quantifies the extraction effect in IP recreation: the KID of the target IP increases by approximately 2.6 times while the KID of the surrounding IPs stays approximately unchanged.

Figure 6 illustrates the combination and addition of various IPs in a single image. Subsequent to the combination step, the non-infringing model’s capability to generate either Mickey Mouse or Darth Vader-themed images is removed.

Limitations

  1. Search
    • How to manage plug-ins as their number grows?
    • How can users find the right plug-in effectively?
  2. Backward compatibility
    • When the base model is upgraded, the pool of plug-ins needs to be retrained, which adds huge cost.
  3. Performance
    • The non-infringing model may degrade if too many extraction operations are performed, and this influence is not thoroughly evaluated.

Summary

People are getting worried that advanced AI models might produce content that violates copyright, especially as these models create high-quality images without giving credit to the original data they were trained on. To address this issue, a solution called “©Plug-in Market” is proposed. This solution involves integrating copyrighted data into the LoRA plug-ins of the base model. This allows users to easily track how the data is used and ensures fair attribution of rewards, aligning with the principles of copyright law. The framework faces a challenge in efficiently handling numerous plug-ins, making it easy for users to find the right ones. Upgrading the base model incurs significant retraining costs for the plug-ins, requiring consideration for backward compatibility. The paper notes a limitation: excessive extraction operations may degrade the performance of the non-infringing model, and this influence is not thoroughly assessed.

Paper C. Extracting Training Data from Diffusion Models

C.1     Motivation

  1. Do generative models memorize and regenerate training examples?
    • Yes, state-of-the-art diffusion models do memorize training samples!
  2. How and why does memorization occur?
    • Understanding privacy risks
    • Understanding generalization

C.2     Background

  1. Diffusion models
    • Denoising Diffusion Probabilistic Models (DDPM)
  2. Training data privacy attacks
    • Membership inference attacks: “Was this example in the training set?”
    • Inversion attacks: extract representative examples from a target class
    • Attribute inference attacks: reconstruct subsets of attributes of training samples
    • Extraction attacks: completely recover training examples

This paper explores 3 attacks on diffusion models.

C.3     Threat Model and System Overview

  1. Adversary capabilities
    • Black-box adversary on Stable Diffusion and Imagen
    • White-box adversary on 16 diffusion models trained on CIFAR-10
  2. Adversary goals
    • Data extraction (Inversion attacks): successfully extract identical image
    • Data reconstruction (Attribute inference attacks): given partial knowledge to recover full image
    • Membership inference (Membership inference attacks): given image x, infer whether x is in the training set

Data Extraction Attack: extracting training data from state-of-the-art diffusion models, Stable Diffusion and Imagen.

Data Extraction from Stable Diffusion (Black-box attacks)

  1. Preprocessing: identify duplicates in the training data to reduce computational cost.
    • Embedding: embed each image into a 512-dimensional vector using CLIP.
    • Near-duplication: search for training samples that are near-duplicates, with a pixel-level L2 distance below some threshold.
    • Attack: for each of these near-duplicate images, use the corresponding prompts as input to the extraction attack.
  2. Extraction (see the sketch below)
    • Generate images using the selected prompts.
    • 500 images for each prompt, with different seeds.
    • Perform membership inference to identify images that appear to be memorized.
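A simplified version of the generate-and-filter step is sketched below; `generate` stands in for a text-to-image pipeline returning images as float arrays in [0, 1], and reusing the 0.15 threshold from Definition 1 for generation-to-generation comparison is an illustrative simplification of the paper's clique-based procedure.

```python
# Sketch: flag prompts whose generations collapse onto near-identical images.
import itertools
import numpy as np

def l2(a, b):
    """Normalized pixel-space L2 distance between two images in [0, 1]."""
    return float(np.sqrt(((a - b) ** 2).mean()))

def flag_memorized(generate, prompt, n_samples=500, dist_threshold=0.15):
    images = [generate(prompt, seed=s) for s in range(n_samples)]
    close_pairs = []
    for i, j in itertools.combinations(range(n_samples), 2):
        if l2(images[i], images[j]) < dist_threshold:
            # Near-identical samples under different seeds suggest the model
            # is reproducing a memorized training image for this prompt.
            close_pairs.append((i, j))
    return close_pairs
```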

Extraction Result for Stable Diffusion

  1. Comparing with training images using Definition 1, 94 images are successfully extracted under a threshold of 0.15 on L2 distance
  2. Still 13 images are memorized after human annotation

For the 175 million generated images, the authors sort generations by the mean distance between images within each clique.

C.4     Investigating Memorization

Experiment Setup

  1. CIFAR-10 dataset
  2. 16 diffusion models
  3. Privacy attacks:
    • Membership inference attacks (class-conditional models)
    • Data reconstruction attacks (inpainting models)

Figure 7 illustrates this by computing the L2 distance between two different generated images and every image in the CIFAR-10 training dataset. The left figure shows a failed extraction attempt; despite the fact that the nearest training image has an L2 distance of just 0.06, this distance is on par with the distance to many other training images (i.e., all images that contain a blue sky). In contrast, the right plot shows a successful extraction attack.

Membership Inference Attack

Figure 10 shows the effect of combining both these strategies. Together they are remarkably successful: at a false positive rate of 0.1%, they increase the true positive rate by over a factor of six, from 7% to 44%. In Figure 11 the authors compute the attack success rate as a function of FID and find that as the quality of the diffusion model increases, so too does the privacy leakage. These results are concerning because they suggest that stronger diffusion models of the future may be even less private.
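For intuition, a basic loss-threshold membership inference attack on a diffusion model can be sketched as below. This is a generic illustration rather than the authors' exact attack; `model` and `add_noise` are placeholders for a trained DDPM pipeline.

```python
# Loss-threshold membership inference sketch for a diffusion model.
import torch
import torch.nn.functional as F

@torch.no_grad()
def membership_score(model, add_noise, x, timesteps):
    """Average denoising error on a candidate image; members of the
    training set tend to have lower error."""
    losses = []
    for t in timesteps:
        noise = torch.randn_like(x)
        x_t = add_noise(x, noise, t)
        losses.append(F.mse_loss(model(x_t, t), noise).item())
    return sum(losses) / len(losses)

def predict_member(score, threshold):
    # Calibrate `threshold` on known non-members to fix a target false
    # positive rate (e.g., 0.1%), then report the true positive rate.
    return score < threshold
```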

Qualitative Results

Inpainting Attacks

The above figure shows qualitative examples of this attack. The highest-scoring reconstruction looks visually similar to the target image when the target is in the training set, and does not resemble the target when it is not.

Figure 12 compares the average distance between the sample and the ten highest-scoring inpainted samples. This shows that the inpainting attacks have succeeded: the reconstruction loss, in terms of L2 distance, is substantially better when the image is in the training set than when it is not.

C.5     Diffusion Models vs GANs

Unlike diffusion models that are explicitly trained to memorize and reconstruct their training datasets, GANs are not. Instead, GANs consist of two competing neural networks: a generator and a discriminator.

Data Extraction Attacks

Table 1 shows the number of extracted images for each model and their corresponding FID. Overall, the authors find that diffusion models memorize more data than GANs, even when the GANs reach similar performance, e.g., the best DDPM model memorizes 2× more than StyleGAN-ADA but reaches the same FID.

Using the GANs they trained themselves, the authors show examples of near-copy generations in Figure 15 for the three GANs. Overall, the results further reinforce the conclusion that diffusion models are less private than GAN models.

Membership Inference Attacks

Overall, diffusion models have higher membership inference leakage, e.g., diffusion models had 50% TPR at an FPR of 0.1% as compared to < 30% TPR for GANs. This suggests that diffusion models are less private than GANs for membership inference attacks under default training settings, even when the GAN attack is strengthened due to having access to the discriminator.

Defenses and Recommendations

  1. Deduplicating training data
  2. Differentially-Private Training
    • Differentially-private stochastic gradient descent (DP-SGD)

Summary

  1. State-of-the-art diffusion models memorize training images
  2. Define memorization in diffusion models
  3. Stronger diffusion models are less private than weaker diffusion models
  4. Propose attack techniques to help estimate the privacy risks of trained models

Paper D. A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT

ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC), which involves the creation of digital content.

The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.

This survey provides a comprehensive review of the history of generative models, their basic components, and recent advances in AIGC from the perspectives of unimodal and multimodal interaction.

Figure 2 offers a thorough summary of advanced GAI algorithms, both in terms of unimodal generation and multimodal generation.

Three primary contributions are as follows:

  1. Provide a formal definition and a thorough survey for AIGC and the AI-enhanced generation process.
  2. Review the history and foundational techniques of AIGC, and conduct a comprehensive analysis of recent advances in GAI tasks and models from the perspective of unimodal and multimodal generation.
  3. Discuss the main challenges facing AIGC and future research trends confronting AIGC.

Emergence from the technical approach

The transformer architecture, introduced in 2017, has revolutionized AI by becoming the backbone of major generative models in NLP and CV. Innovations like the Vision Transformer and Swin Transformer have furthered this by adding visual components.

D.1     Foundation Pre-trained Models

The use of pre-trained language models has emerged as the prevailing technique in the domain of NLP. Generally, current state-of-the-art pre-trained language models could be categorized as masked language models (encoders), autoregressive language models (decoders) and encoder-decoder language models, as shown in Figure 4.

Reinforcement Learning from Human Feedback (RLHF) is used to better align AIGC output with human preferences. It typically consists of three stages: pre-training, reward learning, and fine-tuning with reinforcement learning.

D.2     Computing and Hardware

Distributed Training

The training workload is split among multiple processors or machines, allowing the model to be trained much faster.

Cloud Computing

Service providers give researchers access to powerful computing resources to speed up model training, e.g., AWS (Amazon) and Azure (Microsoft).

D.3     Generative AI

Unimodal Model

Generative Language Models.

  1. Decoder Models (Autoregressive Models): model the probability of the next token given the previous tokens, e.g., GPT-3, OPT.

  2. Encoder Models (Masked Language Models): predict the probability of a masked token given the surrounding context, e.g., BERT, RoBERTa.

  3. Encoder-Decoder Models: combine transformer-based encoders and decoders together for pre-training, e.g., T5, BART.

D.4     Vision Generative Models

GAN: Generative Adversarial Networks (GANs) consist of two parts, a generator and a discriminator. The generator attempts to learn the distribution of real examples in order to generate new data, while the discriminator determines whether the input is from the real data space or not.
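For reference, this adversarial game is conventionally written as the standard GAN minimax objective (standard notation, not taken from this survey):

```latex
\min_{G}\max_{D}\; V(D,G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```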

LAPGAN (Laplacian Pyramid GAN):

DCGAN (Deep Convolutional GAN):

BigGAN:

VAE: Following variational Bayes inference [97], Variational Autoencoders (VAEs) are generative models that map data to a probabilistic distribution and learn a reconstruction that is close to the original input.
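Concretely, VAEs are trained by maximizing the evidence lower bound (ELBO); in conventional notation (not the survey's), this reads:

```latex
\log p_{\theta}(x) \;\ge\;
  \mathbb{E}_{q_{\phi}(z \mid x)}\!\left[\log p_{\theta}(x \mid z)\right]
  - D_{\mathrm{KL}}\!\bigl(q_{\phi}(z \mid x)\,\|\,p(z)\bigr)
```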

Normalizing Flows: A Normalizing Flow is a distribution transformation from simple to complex by a sequence of invertible and differentiable mappings.
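The standard change-of-variables formula underlying this construction (stated here for context) is:

```latex
p_X(x) = p_Z\bigl(f^{-1}(x)\bigr)\,
         \left|\det\!\left(\frac{\partial f^{-1}(x)}{\partial x}\right)\right|
```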

  1. Coupling and autoregressive flows
    • Multi-scale flows
  2. Convolutional and Residual Flows.
    • ConvFlow
    • RevNets
    • iRevNets

Diffusion Models: The Generative Diffusion Model (GDM) is a cutting-edge class of generative models based on probability, which demonstrates state-of-the-art results in the field of computer vision. It works by progressively corrupting data with multiple-level noise perturbations and then learning to reverse this process for sample generation.
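In the usual DDPM formulation (standard notation, not specific to this survey), the forward corruption step and the simplified training objective are:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\bigl(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\bigr),
\qquad
\mathcal{L}_{\text{simple}} = \mathbb{E}_{t,\,x_0,\,\epsilon}\!\left[\bigl\|\epsilon - \epsilon_{\theta}(x_t, t)\bigr\|^{2}\right]
```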

D.5     Multimodal Models

Multimodal models are generally built on encoder-decoder family architectures. The encoder is responsible for learning a contextualized representation of the input data; the decoder is used to generate raw modalities that reflect cross-modal interactions, structure, and coherence in the representation.

Vision Language Encoders

Cross-aligned encoders: learn contextualized representations by looking at pairwise interactions between modalities.

Vision Language Decoders

  1. To-text decoders: jointly-trained decoders, frozen decoders.
  2. To-image decoders:
    • GAN-based
    • Diffusion-based: GLIDE, Imagen
    • VAE-based: DALL-E

Other Modalities Generation

D.6     Applications

D.7     Efficiency

  1. Inference efficiency: This is concerned with the practical considerations of deploying a model for inference, i.e., computing the model’s outputs for a given input. Inference efficiency is mostly related to the model’s size, speed, and resource consumption (e.g., disk and RAM usage) during inference.
  2. Training efficiency: This covers factors that affect the speed and resource requirements of training a model, such as training time, memory footprint, and scalability across multiple devices.

D.8     Future Directions
