In: Investigating Wikipedia: Linguistic corpus building, exploration and analysis
Edited by Céline Poudat, Harald Lüngen and Laura Herzberg
[Studies in Corpus Linguistics 121] 2024
pp. 156–177
Chapter 6. Exploring the evolution of Wikipedia articles through Contropedia
Published online: 31 October 2024
https://doi.org/10.1075/scl.121.06lan
Abstract
Wikipedia is not just a corpus of encyclopedic articles; it also includes an entire edit history that details the evolution of each article. However, such records are often unknown to the general public and too complex for researchers to use. Contropedia uses innovative techniques to allow for the visualization, exploration and investigation of the evolution of Wikipedia articles and of disputes about their content. This chapter introduces Contropedia and provides an in-depth analysis of two articles from the WikiDemoCorpus: the English-language article ‘Chiropractic’ and a multilingual comparison of articles on the European migration crisis. These use cases illustrate Contropedia’s visual and analytical modules for analyzing controversies within an article and for cross-cultural comparison.
Article outline
- 1. Introduction
- 2. Description of Contropedia
- 2.1 Related work
- 2.2 Tool rationale and method
- 2.3 Interface description
- 2.3.1 Layer view
- 2.3.2 Dashboard view
- 2.3.3 Detailed view
- 3. Cross-language comparison
- 3.1 Case study: European migrant crisis
- 3.1.1 Comparing the overall activity using the “layer view”
- 3.1.2 Identifying the most controversial shared elements using the “dashboard view”
- 3.1.3 Comparing the evolution of controversial elements using the “detailed view”
- 3.1.4 Analyzing language-specific controversial elements using the “detailed view”
- 3.2 Discussion
- 4.Conclusion
