In: The Swedish FrameNet++: Harmonization, integration, method development and practical language technology applications
Edited by Dana Dannélls, Lars Borin and Karin Friberg Heppin
[Natural Language Processing 14] 2021
pp. 3–36
Chapter 1. Introduction
Swedish FrameNet++
Available under the Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND) 4.0 license.
For any use beyond this license, please contact the publisher at rights@benjamins.nl.
Published online: 26 November 2021
https://doi.org/10.1075/nlp.14.01bor
Abstract
The Swedish FrameNet++ was designed to be several things. As a digital artifact, it is an integrated panchronic lexical macroresource, primarily for Swedish, but including several other languages, intended as a basic infrastructural component in Swedish language technology research and for developing natural language processing applications. As an activity, it is a long-term R&D initiative, initially aimed at bringing about this macroresource, and now at maintaining and extending it, at promoting its use in language technology research and application development, as well as ensuring that the results of this research and development in their turn are incorporated in the macroresource. As a product of research, it reflects both computational and linguistic approaches to lexicology, lexical semantics, and lexical typology.
Article outline
- 1. The Swedish FrameNet++
- 2. Rationale and aims of SweFN++
- 2.1 From corpus-based lexicography to language technology R&D
- 2.2 Extending the shelf life of lexical resources
- 2.3 The increasing importance of the lexicon in language technology
- 2.4 A framenet for Swedish
- 2.5 Serendipitous funding and synergies
- 3. The history of Swedish FrameNet++
- 4. Integration of existing resources
- 5. A new resource: Swedish FrameNet
- 6. Theoretical and methodological considerations
- 6.1 Interlinking of lexical resources
- 6.2 Method matters
- 6.2.1 Zipf to the rescue
- 6.2.2 Towards a general lexical infrastructure: Karp
- 6.3 Linguistic issues
- 6.3.1 Lexicography and (comparative) linguistics
- 6.3.2 Compounds in Swedish FrameNet
- 6.3.3 Multiword expressions
- 6.4 Computational vs. general linguistics
- 7. Similar initiatives
- 7.1 Multilingual wordnets
- 7.2 MTRoget and multilingual FrameNet
- 7.3 Etymological wordnet, IDS/LWT and the concepticon
- 7.4 BabelNet
- Postscript on BabelNet 5
- 8. Status and future
- 9. This volume
