CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs
Luca Capone, Alessandro Bondielli, Alessandro Lenci
Masked Diffusion Language Models with Frequency-Informed Training
Despoina Kosmopoulou, Efthymios Georgiou, Vaggelis Dorovatas, Georgios Paraskevopoulos, Alexandros Potamianos
MoEP: Modular Expert Paths for Sample-Efficient Language Modeling
Joonas Tapaninaho
Mask and You Shall Receive: Optimizing Masked Language Modeling for Pre-training BabyLMs
Lukas Edman, Alexander Fraser
Once Upon a Time: Interactive Learning for Storytelling with Small Language Models
Jonas Mayer Martins, Ali Hamza Bashir, Muhammad Rehan Khalid, Lisa Beinborn