2 |
Heritage Speakers as Part of the Native Language Continuum ...
BASE
3 |
Heritage Speakers as Part of the Native Language Continuum
In: Front Psychol (2022)
BASE
5 |
Topic models do not model topics: epistemological remarks and steps towards best practices
In: Journal of Data Mining and Digital Humanities (EISSN: 2416-5999), Episciences.org, 2021, ⟨10.46298/jdmdh.7595⟩ ; https://hal.archives-ouvertes.fr/hal-03261599 (2021)
BASE
6 |
Topic models do not model topics: epistemological remarks and steps towards best practices
In: https://hal.archives-ouvertes.fr/hal-03261599 ; 2021 (2021)
Abstract:
The social sciences and digital humanities have recently adopted the machine learning technique of topic modeling to address research questions in their fields. This is problematic in a number of ways, some of which have not received much attention in the debate yet. This paper adds epistemological concerns centering around the interface between topic modeling and linguistic concepts and the argumentative embedding of evidence obtained through topic modeling. It concludes that topic modeling in its present state of methodological integration does not meet the requirements of an independent research method. It operates from relevantly unrealistic assumptions, is non-deterministic, cannot effectively be validated against a reasonable number of competing models, does not lock into a well-defined linguistic interface, and does not scholarly model topics in the sense of themes or content. These features are intrinsic and make the interpretation of its results prone to apophenia (the human tendency to perceive random sets of elements as meaningful patterns) and confirmation bias (the human tendency to perceptually prefer patterns that are in alignment with pre-existing biases). While partial validation of the statistical model is possible, a conceptual validation would require an extended triangulation with other methods and human ratings, and clarification of whether statistical distinctivity of lexical co-occurrence correlates with conceptual topics in any reliable way.
Keyword:
[INFO.INFO-DL]Computer Science [cs]/Digital Libraries [cs.DL]; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; [SHS]Humanities and Social Sciences
URL: https://hal.archives-ouvertes.fr/hal-03261599 https://hal.archives-ouvertes.fr/hal-03261599/document https://hal.archives-ouvertes.fr/hal-03261599/file/topic_models_do_not_model_topics_final_draft.pdf
BASE
8 |
Additional Data to "A Challenge for Contrastive L1/L2 Corpus Studies" ...
BASE
9 |
Kobalt: Extension Corpus and Annotation Guidelines for Verb Classification and Dependency Adjustments ...
BASE
13 |
Measuring coselectional constraint in learner corpora: A graph-based approach ...
BASE
14 |
Measuring coselectional constraint in learner corpora: A graph-based approach
BASE