Evaluating L2 Training Methods in Neural Language Models
Received: Nov 14, 2024; Revised: Dec 15, 2024; Accepted: Dec 16, 2024
Published Online: Dec 31, 2024
ABSTRACT
Recent advances in language models (LMs) have substantially improved language processing capabilities; however, these models remain far less data-efficient than human learners, especially when trained on developmentally plausible data volumes comparable to those encountered by children (Warstadt & Bowman, 2022; Linzen, 2020). This inefficiency is even more pronounced in second language (L2) acquisition contexts, where cross-linguistic transfer is a key phenomenon (Papadimitriou & Jurafsky, 2020; Yadavalli et al., 2023). This study evaluates L2 training methods in neural language models by examining mutual L1-L2 influences during learning with developmentally plausible data volumes. We propose two approaches to mitigate catastrophic forgetting: the One-Stage Training (OST) method, which integrates L1 and L2 learning into a single stage, and the One-Stage Mixed Training (OSMT) method, which refines OST by incorporating L1 data into the L2 stage for a more realistic simulation of bilingual learning. Through syntactic evaluations conducted continuously throughout training, we analyzed how L1 performance changed during L2 acquisition and how cross-linguistic transfer emerged between Korean and English. The results indicate that OST and OSMT effectively mitigated catastrophic forgetting and supported more stable learning than the conventional Two-Stage Training method. OSMT achieved superior integration of L1 and L2 structures while also revealing negative transfer effects from Korean (L1) to English (L2). These findings provide valuable insights into both neural model training and human-like L2 acquisition processes.
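To make the data-mixing idea behind OSMT concrete, the sketch below interleaves a fraction of L1 (Korean) batches into the L2 (English) stage so the model keeps rehearsing L1 material while acquiring L2. This is a minimal illustrative sketch, not the authors' implementation: the function name `build_osmt_schedule`, the `l1_mix_ratio` parameter, and the toy string "batches" are all assumptions introduced here for illustration.

```python
import random

def build_osmt_schedule(l1_batches, l2_batches, l1_mix_ratio=0.2, seed=0):
    """Interleave replayed L1 batches into the L2 training stage.

    Hypothetical sketch of the OSMT idea: a fraction of the L2 stage
    (l1_mix_ratio of its size) is filled with L1 batches so that L1
    structures are rehearsed during L2 acquisition, which is one common
    way to mitigate catastrophic forgetting.
    """
    rng = random.Random(seed)
    n_l1 = int(len(l2_batches) * l1_mix_ratio)            # how many L1 batches to replay
    replayed_l1 = rng.sample(l1_batches, min(n_l1, len(l1_batches)))
    schedule = list(l2_batches) + replayed_l1             # combine the two data sources
    rng.shuffle(schedule)                                 # mix them within the stage
    return schedule

# Toy usage: stand-in "batches" are labeled strings rather than real tensors.
l1 = [f"ko_batch_{i}" for i in range(100)]   # Korean (L1) batches
l2 = [f"en_batch_{i}" for i in range(100)]   # English (L2) batches

stage2 = build_osmt_schedule(l1, l2, l1_mix_ratio=0.2)
print(len(stage2), stage2[:5])
```

Under this reading, plain OST would correspond to training once on the union of L1 and L2 data, whereas the schedule above replays L1 data specifically within the L2 stage; the actual mixing ratio and schedule used in the study are not specified in the abstract.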