The Impact of Memorization on Trustworthy Foundation Models – MemFM @ ICML 2025
Understanding unintended memorization is essential to building trustworthy foundation models.
Foundation models are rapidly becoming integral to high-stakes domains such as healthcare, public safety, and education. As their influence grows, so does the need to ensure they are reliable, ethical, and secure. A growing body of research, however, reveals a critical concern: foundation models are prone to unintended memorization—the recall of specific details or even entire samples from their training data.
This phenomenon poses serious risks, including privacy violations, intellectual property infringement, and societal harm when sensitive or proprietary information is leaked. While some degree of memorization is necessary for solving complex tasks, unintended memorization threatens the integrity and trustworthiness of these systems. Striking the right balance between performance and privacy remains an open challenge.
Currently, solutions to this issue are being pursued across disparate research communities and data modalities—often in isolation. This fragmentation leads to duplicated efforts and missed opportunities for collaboration, even when the goals are aligned. The lack of integration across fields like machine learning security, data privacy, and AI ethics hampers progress toward meaningful solutions.
This workshop aims to bring together researchers and practitioners to explore the causes, consequences, and mitigations of unintended memorization. By bridging insights across domains, we seek to foster collaboration, share practical strategies, and explore new theoretical foundations for mitigating these risks. Ultimately, our goal is to enable the development of trustworthy foundation models that serve society without compromising privacy, intellectual property, or public trust.
Schedule
⭐ Coming Soon ⭐
Speakers (TBD)
Panelists (TBD)
Call for Papers
We cordially invite submissions to, and participation in, our workshop “The Impact of Memorization on Trustworthy Foundation Models,” which will be held on July 18th or 19th, 2025, at the Forty-Second International Conference on Machine Learning (ICML 2025) in Vancouver, Canada.
Motivation and Topics
This workshop explores the emerging challenges of memorization in foundation models, focusing on its detection, mitigation, and broader implications. Examples of research areas include:
- Detection and Mitigation Methods for Memorization in Foundation Models: As foundation models grow in complexity, identifying instances of unintended memorization becomes both more challenging and more essential. This topic focuses on techniques for detecting memorized content—such as membership inference attacks and data reconstruction—as well as mitigation strategies, including regularization, differential privacy, and training data filtering. The goal is to prevent sensitive or proprietary data from being inadvertently retained and surfaced by the model. (A minimal illustrative sketch of one detection technique follows this list.)
- Theoretical Foundations of Memorization: Understanding the root causes of memorization requires a solid theoretical framework. This topic delves into formal definitions of memorization, how it emerges in high-capacity models, and its relationship to model architecture, training dynamics, and data distribution. Theoretical insights help build principled approaches to controlling memorization without compromising generalization.
- Relationships Between Memorization and Security, Privacy, and Safety: Memorization touches multiple dimensions of trustworthiness in AI systems. This topic investigates how memorized content can be exploited in adversarial settings, cause privacy violations, or trigger unexpected model behavior. By examining these interdependencies, we can better align memorization analysis with broader goals in AI security and responsible deployment.
- Implications of Memorization on Generalization in Foundation Models: A central tension in machine learning is the trade-off between memorizing training data and generalizing beyond it. This topic focuses on how memorization impacts model robustness and performance across domains, and whether memorization can sometimes be a symptom of poor generalization. Discussions here will explore how to strike a healthy balance between these competing forces.
- Societal Impact and Ethical Aspects of Memorization: When foundation models unintentionally memorize and reveal private, personal, or copyrighted information, the consequences can be profound. This topic addresses the ethical responsibilities of researchers and developers, the potential for harm to individuals and communities, and the broader implications for fairness, accountability, and trust in AI technologies.
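For readers less familiar with the area, the sketch below illustrates the simplest flavor of the detection techniques mentioned above: a loss-thresholding membership inference test (in the spirit of Yeom et al., 2018). It is a minimal, hypothetical example on synthetic per-example losses; the loss distributions, threshold choice, and variable names are illustrative assumptions, not a method prescribed by the workshop. In practice, the losses would come from an actual foundation model.

```python
# Minimal sketch of loss-based membership inference on synthetic data.
# Assumption: training members tend to have lower loss than non-members.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example losses (in a real study these would be
# computed by running candidate examples through the target model).
member_losses = rng.normal(loc=0.8, scale=0.3, size=1000)
nonmember_losses = rng.normal(loc=1.5, scale=0.4, size=1000)

# Calibrate a threshold on held-out non-member losses: flag anything
# below the 5th percentile of the non-member loss distribution.
threshold = np.percentile(nonmember_losses, 5)

def predict_member(losses: np.ndarray, tau: float) -> np.ndarray:
    """Predict 'member' for examples whose loss falls below tau."""
    return losses < tau

tpr = predict_member(member_losses, threshold).mean()     # true positive rate
fpr = predict_member(nonmember_losses, threshold).mean()  # false positive rate
print(f"threshold={threshold:.3f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

A low loss on a candidate example is weak evidence that the example was seen during training; more sophisticated attacks calibrate this signal per example, which is one of the detection directions in scope for the workshop.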
We welcome submissions related to all aspects of memorization in foundation models, including but not limited to:
- Detection and Mitigation Methods for Memorization in Foundation Models
- Theoretical Foundations of Memorization
- Quantification of the Degree of Memorization
- Relationships Between Memorization and Security, Privacy, and Safety
- Implications of Memorization on Generalization in Foundation Models
- Connecting Memorization Research Across Different Domains and Applications
- Societal Impact and Ethical Aspects of Memorization
- Legal Perspectives on Memorization and Intellectual Property
The workshop will employ a double-blind review process. Each submission will be evaluated based on the following criteria:
- Soundness of the methodology
- Relevance to the workshop
- Societal impacts
We only consider submissions that have not been published in any peer-reviewed venue, including the ICML 2025 main conference. Dual submissions to other workshops or conferences are allowed. The workshop is non-archival and will not have official proceedings. All accepted papers will be allocated either a poster presentation or a talk slot.
Important Dates
- Submission deadline: May 20th, 2025, 11:59 PM Anywhere on Earth (AoE)
- Author notification: June 9th, 2025
- Camera-ready deadline: June 30th, 2025, 11:59 PM Anywhere on Earth (AoE)
- Workshop date: TBD (Full-day Event)
Submission Instructions
Papers should be submitted via OpenReview.
Submitted papers may be up to 4 pages long (excluding references, acknowledgments, and appendices). Please use our adjusted ICML submission template. Submissions must be anonymized in accordance with the ICML double-blind reviewing guidelines and must adhere to the ICML Code of Conduct and Code of Ethics. Accepted papers will be hosted on the workshop website but are considered non-archival and can be submitted to other workshops, conferences, or journals if their submission policies allow.
Workshop Sponsors
⭐ Please reach out if you would like to sponsor our workshop. ⭐
Organizers
- Franziska Boenisch, CISPA Helmholtz Center for Information Security
- Adam Dziedzic, CISPA Helmholtz Center for Information Security
- Dominik Hintersdorf, German Research Center for AI & TU Darmstadt
- Lingjuan Lyu, Sony AI
- Niloofar Mireshghallah, University of Washington
- Lukas Struppek, German Research Center for AI & TU Darmstadt
Organizer affiliations
CISPA Helmholtz Center for Information Security · German Research Center for AI · TU Darmstadt · Sony AI · University of Washington