
Search in the Catalogues and Directories

Hits 1 – 4 of 4

1
CAS: corpus of clinical cases in French
In: ISSN: 2041-1480 ; Journal of Biomedical Semantics ; https://hal.archives-ouvertes.fr/hal-03021064 ; Journal of Biomedical Semantics, BioMed Central, 2020, ⟨10.1186/s13326-020-00225-x⟩ (2020)
Abstract: Background: Textual corpora are extremely important for various NLP applications as they provide information necessary for creating, setting and testing those applications and the corresponding tools. They are also crucial for designing reliable methods and reproducible results. Yet, in some areas, such as the medical area, due to confidentiality or to ethical reasons, it is complicated or even impossible to access representative textual data. We propose the CAS corpus built with clinical cases, such as they are reported in the published scientific literature in French. Results: Currently, the corpus contains 4,900 clinical cases in French, totaling nearly 1.7M word occurrences. Some clinical cases are associated with discussions. A subset of the whole set of cases is enriched with morpho-syntactic (PoS-tagging, lemmatization) and semantic (the UMLS concepts, negation, uncertainty) annotations. The corpus is being continuously enriched with new clinical cases and annotations. The CAS corpus has been compared with similar clinical narratives. When computed on tokenized and lowercase words, the Jaccard index indicates that the similarity between clinical cases and narratives reaches up to 0.9727. Conclusion: We assume that the CAS corpus can be effectively exploited for the development and testing of NLP tools and methods. Besides, the corpus will be used in NLP challenges and distributed to the research community.
Keyword: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; Corpus with clinical cases; Medical area; Morpho-syntactic and semantic annotation; Natural language processing; Reproducibility; Sustainability
URL: https://hal.archives-ouvertes.fr/hal-03021064/document
https://doi.org/10.1186/s13326-020-00225-x
https://hal.archives-ouvertes.fr/hal-03021064
https://hal.archives-ouvertes.fr/hal-03021064/file/s13326-020-00225-x.pdf
BASE
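Note: the abstract above reports a Jaccard index computed on tokenized, lowercased words. As a minimal sketch of that measure, assuming it is taken over the sets of distinct word forms in two texts (the record does not state the exact tokenization, so the regular expression below is an assumption), the computation could look like this. The two French sentences are invented for illustration and are not taken from the CAS corpus.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return the set of its word tokens.

    Assumption: a token is any run of word characters; the tokenizer
    actually used for the CAS corpus is not specified in this record.
    """
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| over the two vocabularies."""
    vocab_a, vocab_b = tokenize(a), tokenize(b)
    union = vocab_a | vocab_b
    if not union:
        return 0.0  # convention chosen here for two empty texts
    return len(vocab_a & vocab_b) / len(union)


# Invented example sentences (hypothetical, not corpus data).
case = "Le patient présente une fièvre"
narrative = "Le patient ne présente pas de fièvre"
print(jaccard(case, narrative))
```

A value of 1.0 means the two vocabularies are identical and 0.0 means they share no word form, so the 0.9727 reported in the abstract would indicate almost complete vocabulary overlap under this reading.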
2
CAS: corpus of clinical cases in French
In: J Biomed Semantics (2020)
BASE
3
Speculation and negation detection in French biomedical corpora
In: RANLP 2019 - Recent Advances in Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-02284444 ; RANLP 2019 - Recent Advances in Natural Language Processing, Sep 2019, Varna, Bulgaria. pp.1-10 (2019)
BASE
4
CAS: French Corpus with Clinical Cases
In: LOUHI 2018 - The Ninth International Workshop on Health Text Mining and Information Analysis ; https://hal.archives-ouvertes.fr/hal-01937096 ; LOUHI 2018 - The Ninth International Workshop on Health Text Mining and Information Analysis, Oct 2018, Brussels, Belgium. pp.1-7 (2018)
BASE
