Geoparsing and geocoding places in a dynamic space context: The case of hiking descriptions

Gaio, Mauro; Moncla, Ludovic

doi:10.1075/hcp.66.10gai

In: The Semantics of Dynamic Space in French: Descriptive, experimental and formal studies on motion expression
Edited by Michel Aurnague and Dejan Stosic
[Human Cognitive Processing 66] 2019
pp. 353–386

Mauro Gaio | LMAP, Université de Pau et des Pays de l’Adour & CNRS, France

Ludovic Moncla | LIRIS, Université de Lyon, INSA Lyon & CNRS, France

Published online: 29 July 2019

Abstract

This chapter proposes an automatic parser and a formal encoder for information describing places, spatial relations and verbal relations in textual documents, with the aim of reconstructing and mapping the itinerary that a text describes. These tools show how information expressed in French texts (references to places and the spatial actions associated with them) can be combined with data from external geographical resources to build a geocoded representation of an itinerary. The approach focuses on automatically reconstructing routes and transcribing them in their geographical setting, identifying locations and routes by interpreting spatial information in a dynamic space context.

Keywords: spatial actions, spatial relations, itinerary, automatic parser, formal encoder, geocoded representation, ambiguity resolution
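
As a rough illustration of the pipeline the abstract outlines (find place names in a text, look them up in a geographical resource, resolve ambiguous names, and order the results as an itinerary), a minimal Python sketch might look like the following. It is not the authors' implementation: the gazetteer, its place names and coordinates, the sample sentence, the function names and the 10 km radius are all assumptions made up for this example, and the neighbour-counting rule is only a simplified stand-in for the density-based clustering named in the outline. Motion verbs and spatial relations are not modelled here.

```python
import math
import re

# Toy gazetteer: every name and coordinate below is invented for illustration.
# Each toponym maps to its candidate locations as (latitude, longitude).
GAZETTEER = {
    "Col Vert": [(43.00, 1.00), (45.00, 6.00)],  # ambiguous: two candidates
    "Lac Bleu": [(45.02, 6.01)],
    "Refuge du Pin": [(45.05, 6.03)],
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def geoparse(text):
    """Return the gazetteer toponyms found in text, in order of first mention."""
    hits = []
    for name in GAZETTEER:
        match = re.search(re.escape(name), text)
        if match:
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]


def geocode(toponyms, radius_km=10.0):
    """Pick, for each toponym, the candidate with the most nearby candidates
    of the other toponyms (a crude density criterion, not full DBSCAN)."""
    resolved = []
    for name in toponyms:
        others = [c for other in toponyms if other != name
                  for c in GAZETTEER[other]]

        def support(candidate):
            return sum(haversine_km(candidate, o) <= radius_km for o in others)

        resolved.append((name, max(GAZETTEER[name], key=support)))
    return resolved


# Hypothetical hiking description, written for this sketch only.
SAMPLE = ("Partir du Lac Bleu, monter au Col Vert "
          "puis descendre vers le Refuge du Pin.")

# Ordered list of (toponym, coordinates): a naive geocoded itinerary.
ITINERARY = geocode(geoparse(SAMPLE))
```

In this toy setting the ambiguous name is resolved to the candidate lying close to the other places mentioned in the same description, which reflects the intuition that places along one hike tend to be spatially close. Ordering waypoints by first mention is a simplification; the abstract makes clear that the chapter also interprets the spatial actions associated with each place.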

Article outline

1. Introduction
2. Background and related work
- 2.1 Parsing in computational linguistics
- 2.2 Named entity recognition and classification
- 2.3 Construction grammars
- 2.4 Geoparsing, toponym ambiguities and geocoding
3. Recognizing and locating places in a dynamic space context
- 3.1 Geoparsing extended spatial entities
  - 3.1.1 Extended named entity (ene) structure
  - 3.1.2 Motion verbs and extended spatial named entity structures
- 3.2 Geocoding
  - 3.2.1 Subtyping of place named entities
  - 3.2.2 Density-based spatial clustering
  - 3.2.3 Geocoding for unreferenced toponyms
  - 3.2.4 Automatic reconstruction of itineraries
4. Evaluation
- 4.1 Named entity recognition and classification
- 4.2 Toponym disambiguation
- 4.3 Density-based spatial clustering
- 4.4 Geocoding for unreferenced toponyms
5. Conclusion
Notes
References
