BabyLM Challenge
Here are the Calls for Papers, Proceedings, and Findings from previous editions of the BabyLM workshop:
2nd Edition
📄 Call for Papers
📜 Findings
Submissions:
- Ghanizadeh, M. A., & Dousti, M. J. (2024). Towards data-efficient language models: A child-inspired approach to language learning. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Nair, A., Hancharova, A., Kumar, M., & Gharaee, A. (2024). BabyLM challenge: Experimenting with self-distillation and reverse-distillation for language model pre-training on constrained datasets. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Goriely, Z., Diehl Martinez, R., Caines, A., Buttery, P., & Beinborn, L. (2024). From babble to words: Pre-training language models on continuous streams of phonemes. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Bunzeck, B., Duran, D., Schade, L., & Zarrieß, S. (2024). Graphemes vs. phonemes: Battling it out in character-based language models. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Saha, R., Fahim, A., Fyshe, A., & Murphy, A. (2024). Exploring curriculum learning for vision-language tasks: A study on small-scale multimodal training. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Haller, P., Golde, J., & Akbik, A. (2024). BabyHGRN: Exploring RNNs for sample-efficient language modeling. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Shi, S., Matusevych, Y., & Nissim, M. (2024). Choosy babies need one coach: Inducing mode-seeking behavior in BabyLlama with reverse KL divergence. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Chesi, C., Bressan, V., Barbini, M., Fusco, A., Piccini Bianchessi, M. L., Neri, S., Rossi, S., & Sgrizzi, T. (2024). Different ways to forget: Linguistic gates in recurrent neural networks. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Klerings, A., Bartelt, C., & Mueller, A. (2024). Developmentally plausible multimodal language models are highly modular. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Behr, R. (2024). ELC-ParserBERT: Low-resource language modeling utilizing a parser network with ELC-BERT. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Prévot, L., Wang, S.-F., Chi, J.-A., & Hsieh, S.-K. (2024). Extending the BabyLM initiative: Promoting diversity in datasets and metrics through high-quality linguistic corpora. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Berend, G. (2024). Integrating quasi-symbolic conceptual knowledge into language model pre-training. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Edman, L., Bylinina, L., Ghorbanpour, F., & Fraser, A. (2024). Are BabyLMs second language learners? In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Salhan, S., Diehl Martinez, R., Goriely, Z., & Buttery, P. (2024). Less is more: Pre-training cross-lingual small-scale language models with cognitively-plausible curriculum learning strategies. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Capone, L., Bondielli, A., & Lenci, A. (2024). ConcreteGPT: A baby GPT-2 based on lexical concreteness and curriculum learning. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Iyer, S. (2024). When babies teach babies: Can student knowledge sharing outperform teacher-guided distillation on small datasets? In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Nguyen, H., Yip, L., & DeBenedetto, J. (2024). Automatic quality estimation for data selection and curriculum learning. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Lucas, E., Gaines, D., Kosireddy, T. R., Li, K., & Havens, T. C. (2024). Using curriculum masking based on child language development to train a large language model with limited training data. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Lyman, A., & Hepner, B. (2024). WhatIf: Leveraging word vectors for small-scale data augmentation. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Hong, X., Loáiciga, S., & Sayeed, A. (2024). A surprisal oracle for when every layer counts. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- AlKhamissi, B., Tang, Y., Gökce, A., Mehrer, J., & Schrimpf, M. (2024). Dreaming out loud: A self-synthesis approach for training vision-language models with developmentally plausible data. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Haga, A., Fukatsu, A., Oba, M., Bisazza, A., & Oseki, Y. (2024). BabyLM challenge: Exploring the effect of variation sets on language model training efficiency. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Charpentier, L. G. G., & Samuel, D. (2024). BERT or GPT: Why not both? In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Yam, H. M., & Paek, N. (2024). What should baby models read? Exploring sample-efficient data composition on model performance. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Tastet, J.-L., & Timiryasov, I. (2024). BabyLlama-2: Ensemble-distilled models consistently outperform teachers with limited data. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Yam, H. M., & Paek, N. (2024). Teaching tiny minds: Exploring methods to enhance knowledge distillation for small language models. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Theodoropoulos, N., Filandrianos, G., Lyberatos, V., Lymperaiou, M., & Stamou, G. (2024). BERTtime stories: Investigating the role of synthetic story data in language pre-training. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
- Yu, X., Guo, B., Luo, S., Wang, J., Ji, T., & Wu, Y. (2024). AntLM: Bridging causal and masked language models. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.
1st Edition
📄 Call for Papers
📖 Proceedings
📜 Findings
Submissions:
- Bastian Bunzeck, & Sina Zarrieß (2023). GPT-wee: How Small Can a Small Language Model Really Get? In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.2.pdf
- Clayton Fields, Osama Natouf, Andrew McMains, Catherine Henry, & Casey Kennington (2023). Tiny Language Models Enriched with Multimodal Knowledge from Multiplex Networks. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.3.pdf
- Irina Proskurina, Guillaume Metzler, & Julien Velcin (2023). Mini Minds: Exploring Bebeshka and Zlata Baby Models. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.4.pdf
- Xuanda Chen, & Eva Portelance (2023). Grammar induction pretraining for language modeling in low resource contexts. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.5.pdf
- Jaap Jumelet, Michael Hanna, Marianne de Heer Kloots, Anna Langedijk, Charlotte Pouw, & Oskar van der Wal (2023). ChapGTP, ILLC’s Attempt at Raising a BabyLM: Improving Data Efficiency by Automatic Task Formation. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.6.pdf
- Yahan Yang, Elior Sulem, Insup Lee, & Dan Roth (2023). Penn & BGU BabyBERTa+ for Strict-Small BabyLM Challenge. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.7.pdf
- Lukas Edman, & Lisa Bylinina (2023). Too Much Information: Keeping Training Simple for BabyLMs. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.8.pdf
- Aryaman Chobey, Oliver Smith, Anzi Wang, & Grusha Prasad (2023). Can training neural language models on a curriculum with developmentally plausible data improve alignment with human reading behavior? In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.9.pdf
- Richard Diehl Martinez, Zébulon Goriely, Hope McGovern, Christopher Davis, Andrew Caines, Paula Buttery, & Lisa Beinborn (2023). CLIMB – Curriculum Learning for Infant-inspired Model Building. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.10.pdf
- Theodor Amariucai, & Alexander Scott Warstadt (2023). Acquiring Linguistic Knowledge from Multimodal Input. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.11.pdf
- Julius Steuer, Marius Mosbach, & Dietrich Klakow (2023). Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.12.pdf
- Zheyu Zhang, Han Yang, Bolei Ma, David Rügamer, & Ercong Nie (2023). Baby’s CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.13.pdf
- Ömer Veysel Çağatan (2023). ToddlerBERTa: Exploiting BabyBERTa for Grammar Learning and Language Understanding. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.14.pdf
- Lukas Thoma, Ivonne Weyers, Erion Çano, Stefan Schweter, Jutta L Mueller, & Benjamin Roth (2023). CogMemLM: Human-Like Memory Mechanisms Improve Performance and Cognitive Plausibility of LLMs. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.15.pdf
- Xingmeng Zhao, Tongnian Wang, Sheri Osborn, & Anthony Rios (2023). BabyStories: Can Reinforcement Learning Teach Baby Language Models to Write Better Stories? In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.16.pdf
- Justin DeBenedetto (2023). Byte-ranked Curriculum Learning for BabyLM Strict-small Shared Task 2023. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.17.pdf
- Ziling Cheng, Rahul Aralikatte, Ian Porada, Cesare Spinoso-Di Piano, & Jackie CK Cheung (2023). McGill BabyLM Shared Task Submission: The Effects of Data Formatting and Structural Biases. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.18.pdf
- David Samuel (2023). Mean BERTs make erratic language teachers: the effectiveness of latent bootstrapping in low-resource settings. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.19.pdf
- Lucas Georges Gabriel Charpentier, & David Samuel (2023). Not all layers are equally as important: Every Layer Counts BERT. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.20.pdf
- Lukas Wolf, Klemen Kotar, Greta Tuckute, Eghbal Hosseini, Tamar I. Regev, Ethan Gotlieb Wilcox, & Alexander Scott Warstadt (2023). WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.21.pdf
- Xudong Hong, Sharid Loáiciga, & Asad Sayeed (2023). A surprisal oracle for active curriculum language modeling. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.22.pdf
- Maggie Mi (2023). Mmi01 at The BabyLM Challenge: Linguistically Motivated Curriculum Learning for Pretraining in Low-Resource Settings. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.23.pdf
- Inar Timiryasov, & Jean-Loup Tastet (2023). Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.24.pdf
- Miyu Oba, Akari Haga, Akiyo Fukatsu, & Yohei Oseki (2023). BabyLM Challenge: Curriculum learning based on sentence complexity approximating language acquisition. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.25.pdf
- Gábor Berend (2023). Better Together: Jointly Using Masked Latent Semantic Modeling and Masked Language Modeling for Sample Efficient Pre-training. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.26.pdf
- Venkata S Govindarajan, Juan Diego Rodriguez, Kaj Bostrom, & Kyle Mahowald (2023). Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.27.pdf
- Chenghao Xiao, G Thomas Hudson, & Noura Al Moubayed (2023). Towards more Human-like Language Models based on Contextualizer Pretraining Strategy. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.28.pdf
- Omar Momen, David Arps, & Laura Kallmeyer (2023). Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.29.pdf
- Khushi Bhardwaj, Raj Sanjay Shah, & Sashank Varma (2023). Pre-training LLMs using human-like development data corpus. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.30.pdf
- Mattia Opper, J. Morrison, & N. Siddharth (2023). On the effect of curriculum learning with developmental data for grammar acquisition. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.31.pdf
- Nasim Borazjanizadeh (2023). Optimizing GPT-2 Pretraining on BabyLM Corpus with Difficulty-based Sentence Reordering. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Retrieved from https://aclanthology.org/2023.conll-babylm.32.pdf