Published 2026-05-12
Keywords
- second language acquisition
- information structure
- Romance languages
- syntax-prosody interface
- task-elicited speech
Copyright (c) 2026 Bianca Maria De Paolis, Simone Stroppiana

This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
This paper presents L2FOC, a crosslinguistic corpus of spoken Italian and French developed to investigate the syntactic and prosodic realisation of focus in both native and non-native speech. The corpus includes approximately 10 hours of recordings from 65 speakers across four groups (L1/L2 Italian and French), collected through three tasks designed to elicit speech with varying degrees of spontaneity. Recordings were made with laboratory equipment and are accompanied by orthographic transcriptions and multi-tier phonetic and phonological alignment. The corpus enables fine-grained analysis of how information structure is encoded across languages and proficiency levels, and supports applications in both theoretical linguistics and speech technology. To illustrate its analytical potential, two studies are briefly discussed: one examining the syntax–prosody interface in L2 focus strategies, and one assessing automatic speech recognition performance on learner speech.