Donato Crisostomi
Publications
Language Models are Injective and Hence Invertible
ICLR 2026
Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs …
Giorgios Nikolaou
,
Tommaso Mencattini
,
Donato Crisostomi
,
Andrea Santilli
,
Yannis Panagakis
,
Emanuele Rodolà
Cite
arXiv
Thread
Implicit Inversion turns CLIP into a Decoder
ICLR 2026
CLIP is a discriminative model trained to align images and text in a shared embedding space. Due to its multimodal structure, it serves …
Antonio D'Orazio
,
Maria Rosaria Briglia
,
Donato Crisostomi
,
Dario Loi
,
Emanuele Rodolà
,
Iacopo Masi
Cite
arXiv
GitHub
Mergenetic: a Simple Evolutionary Model Merging Library
ACL 2025 System Demo
Model merging allows combining the capabilities of existing models into a new one - post hoc, without additional training. This has …
Adrian Robert Minut
,
Tommaso Mencattini
,
Andrea Santilli
,
Donato Crisostomi
,
Emanuele Rodolà
Cite
arXiv
GitHub
Update Your Transformer to the Latest Release: Re-Basin of Task Vectors
ICML 2025
Foundation models serve as the backbone for numerous specialized models developed through fine-tuning. However, when the underlying …
Filippo Rinaldi
,
Giacomo Capitani
,
Lorenzo Bonicelli
,
Angelo Porrello
,
Donato Crisostomi
,
Federico Bolelli
,
Emanuele Rodolà
,
Elisa Ficarra
,
Simone Calderara
Cite
STAGE: Stemmed Accompaniment Generation through Prefix-Based Conditioning
ISMIR 2025 (top in Music ML)
Recent advances in generative models have made it possible to create high-quality, coherent music, with some systems delivering …
Giorgio Strano
,
Chiara Ballanti
,
Donato Crisostomi
,
Michele Mancusi
,
Luca Cosmo
,
Emanuele Rodolà
Cite
arXiv
GitHub
Efficient Generation of Multimodal Fluid Simulation Data
STAG
In this work, we introduce an efficient generation procedure to produce synthetic multi-modal datasets of fluid simulations. The …
Daniele Baieri
,
Donato Crisostomi
,
Stefano Esposito
,
Filippo Maggioli
,
Emanuele Rodolà
Cite
arXiv
MASS: MoErging through Adaptive Subspace Selection
ICLR 2026
Model merging has recently emerged as a lightweight alternative to ensembling, combining multiple fine-tuned models into a single set …
Donato Crisostomi
,
Alessandro Zirilli
,
Antonio Andrea Gargiulo
,
Maria Sofia Bucarelli
,
Simone Scardapane
,
Fabrizio Silvestri
,
Iacopo Masi
,
Emanuele Rodolà
Cite
arXiv
LoopGen: Training-Free Loopable Music Generation
ISMIR 2025 (top in Music ML)
Loops–short audio segments designed for seamless repetition–are central to many music genres, particularly those rooted in …
Davide Marincione
,
Giorgio Strano
,
Donato Crisostomi
,
Roberto Ribuoli
,
Emanuele Rodolà
Cite
arXiv
Activation Patching for Interpretable Steering in Music Generation
Preprint
Understanding how large audio models represent music, and using that understanding to steer generation, is both challenging and …
Simone Facchiano
,
Giorgio Strano
,
Donato Crisostomi
,
Irene Tallini
,
Tommaso Mencattini
,
Fabio Galasso
,
Emanuele Rodolà
Cite
arXiv
Task Singular Vectors: Reducing Task Interference in Model Merging
CVPR 2025
Task Arithmetic has emerged as a simple yet effective method to merge models without additional training. However, by treating entire …
Antonio Andrea Gargiulo
,
Donato Crisostomi
,
Maria Sofia Bucarelli
,
Simone Scardapane
,
Emanuele Rodolà
Cite
arXiv
GitHub
Tweeprint
MERGE³: Efficient Evolutionary Merging on Consumer-grade GPUs
ICML 2025
Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for …
Tommaso Mencattini
,
Adrian Robert Minut
,
Donato Crisostomi
,
Andrea Santilli
,
Emanuele Rodolà
Cite
arXiv
GitHub
Tweeprint
Humanity's Last Exam
Preprint
Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are …
More than 600 authors including
Donato Crisostomi
,
Emanuele Rodolà
Cite
URL
GitHub
arXiv
ATM: Improving Model Merging by Alternating Tuning and Merging
ArXiv
Model merging has recently emerged as a cost-efficient paradigm for multi-task learning. Among current approaches, task arithmetic …
Luca Zhou
,
Daniele Solombrino
,
Donato Crisostomi
,
Maria Sofia Bucarelli
,
Fabrizio Silvestri
,
Emanuele Rodolà
Cite
arXiv
GitHub
C²M³: Cycle-Consistent Multi-Model Merging
NeurIPS 2024
In this paper, we present a novel data-free method for merging neural networks in weight space. Differently from most existing works, …
Donato Crisostomi
,
Marco Fumero
,
Daniele Baieri
,
Florian Bernard
,
Emanuele Rodolà
Cite
arXiv
GitHub
Tweeprint
Preface of UniReps: the Second Edition of the Workshop on Unifying Representations in Neural Models
PMLR
Discover why, when and how distinct learning processes yield similar representations, and the degree to which these can be unified.
Clementine Domine
,
Marco Fumero
,
Zorah Lähner
,
Donato Crisostomi
,
Luca Moschella
,
Kimberly Stachenfeld
Cite
Article
From Charts to Atlas: Merging Latent Spaces into One
NeurReps workshop @ NeurIPS 2023
Models trained on semantically related datasets and tasks exhibit comparable inter-sample relations within their latent spaces. We …
Donato Crisostomi
,
Irene Cannistraci
,
Luca Moschella
,
Pietro Barbiero
,
Marco Ciccone
,
Pietro Lio
,
Emanuele Rodolà
Cite
URL
PDF
Mitigating the Burden of Redundant Datasets via Batch-Wise Unique Samples and Frequency-Aware Losses
ACL 2023
Datasets used to train deep learning models in industrial settings often exhibit skewed distributions with some samples repeated a …
Donato Crisostomi
,
Andrea Caciolai
,
Alessandro Pedrani
,
Kay Rottmann
,
Alessandro Manzotti
,
Enrico Palumbo
,
Davide Bernardi
Cite
URL
AVEN-GR: Attribute Value Extraction and Normalization using product GRaphs
ACL 2023
Getting a good understanding of the user intent is vital for e-commerce applications to surface the right product to a given customer …
Donato Crisostomi
,
Thomas Ricatte
Cite
URL
Play música alegre: A Large-Scale Empirical Analysis of Cross-Lingual Phenomena in Voice Assistant Interactions
MMNLU workshop, EMNLP 2022
Cross-lingual phenomena are quite common in informal contexts like social media, where users are likely to mix their native language …
Donato Crisostomi
,
Alessandro Manzotti
,
Enrico Palumbo
,
Davide Bernardi
,
Sarah Campbell
,
Shubham Garg
Cite
URL
Metric Based Few-Shot Graph Classification
LoG 2022
Few-shot graph classification is a novel yet promising emerging research field that still lacks the soundness of well-established …
Donato Crisostomi
,
Simone Antonelli
,
Valentino Maiorca
,
Luca Moschella
,
Riccardo Marin
,
Emanuele Rodolà
Cite
URL
PDF
GitHub
Few-Shot Object Detection: A Survey
ACM Surveys
Deep learning approaches have recently raised the bar in many fields, from Natural Language Processing to Computer Vision, by …
Simone Antonelli
,
Danilo Avola
,
Luigi Cinque
,
Donato Crisostomi
,
Gian Luca Foresti
,
Fabio Galasso
,
Marco Raoul Marini
,
Alessio Mecca
,
Daniele Pannone
Cite
DOI
URL