1 |
UBERT: A Novel Language Model for Synonymy Prediction at Scale in the UMLS Metathesaurus ...
|
|
Wijesiriwardene, Thilini; Nguyen, Vinh; Bajaj, Goonmeet; Yip, Hong Yung; Javangula, Vishesh; Mao, Yuqing; Fung, Kin Wah; Parthasarathy, Srinivasan; Sheth, Amit P.; Bodenreider, Olivier. - : arXiv, 2022
|
|
Abstract:
The UMLS Metathesaurus integrates more than 200 biomedical source vocabularies. During the Metathesaurus construction process, synonymous terms are clustered into concepts by human editors, assisted by lexical similarity algorithms. This process is error-prone and time-consuming. Recently, a deep learning model (LexLM) has been developed for the UMLS Vocabulary Alignment (UVA) task. This work introduces UBERT, a BERT-based language model pretrained on UMLS terms via a supervised Synonymy Prediction (SP) task that replaces the original Next Sentence Prediction (NSP) task. The effectiveness of UBERT for the UMLS Metathesaurus construction process is evaluated using the UVA task. We show that UBERT outperforms LexLM, as well as biomedical BERT-based models. Key to the performance of UBERT are the synonymy prediction task specifically developed for UBERT, the tight alignment of training data to the UVA task, and the similarity of the models used for pretrained UBERT. ...
|
|
Keywords:
Artificial Intelligence (cs.AI); Computation and Language (cs.CL); FOS: Computer and information sciences
|
|
URL: https://arxiv.org/abs/2204.12716 https://dx.doi.org/10.48550/arxiv.2204.12716
|
|
BASE
|
|
2 |
"Is depression related to cannabis?": A Knowledge-infused Model for Entity and Relation Extraction with Limited Supervision
|
|
|
|
In: Publications (2021)
|
|
BASE
|
|
3 |
Analyzing and Learning the Language for Different Types of Harassment
|
|
|
|
In: Publications (2020)
|
|
BASE
|
|
4 |
ALONE: A Dataset for Toxic Behavior among Adolescents on Twitter
|
|
|
|
In: Publications (2020)
|
|
BASE
|
|
5 |
Analyzing and Learning the Language for Different Types of Harassment
|
|
|
|
In: Amit P. Sheth (2020)
|
|
BASE
|
|
6 |
ALONE: A Dataset for Toxic Behavior among Adolescents on Twitter
|
|
|
|
In: Amit P. Sheth (2020)
|
|
BASE
|
|
7 |
Personalized Prediction of Suicide Risk for Web-based Intervention
|
|
|
|
In: Krishnaprasad Thirunarayan (2019)
|
|
BASE
|
|
8 |
Personalized Prediction of Suicide Risk for Web-based Intervention
|
|
|
|
In: Amit P. Sheth (2019)
|
|
BASE
|
|
9 |
Personalized Prediction of Suicide Risk for Web-based Intervention
|
|
|
|
In: Kno.e.sis Publications (2018)
|
|
BASE
|
|
10 |
Database and Expert Systems Applications - 28th International Conference, DEXA 2017, Lyon, France, Proceedings part II
|
|
|
|
In: Benslimane, Djamal; Damiani, Ernesto; Grosky, William I.; Hameurlain, Abdelkader; Sheth, Amit P.; Wagner, Roland R. 28th International Conference on Database and Expert Systems Applications and Workshops (DEXA 2017), Aug 2017, Lyon, France. Lecture Notes in Computer Science, 10439 (Part II), Springer, 2017, 978-3-319-64470-7. ISSN: 0302-9743 ; ⟨10.1007/978-3-319-64471-4⟩ ; https://hal.archives-ouvertes.fr/hal-03120290 ; https://link.springer.com/book/10.1007%2F978-3-319-64471-4 (2017)
|
|
BASE
|
|
11 |
RQUERY: Rewriting Natural Language Queries on Knowledge Graphs to Alleviate the Vocabulary Mismatch Problem
|
|
|
|
In: Publications (2017)
|
|
BASE
|
|
12 |
RQUERY: Rewriting Natural Language Queries on Knowledge Graphs to Alleviate the Vocabulary Mismatch Problem
|
|
|
|
In: Kno.e.sis Publications (2017)
|
|
BASE
|
|
13 |
What Kind of #Communication is Twitter? A Psycholinguistic Perspective on Communication in Twitter for the Purpose of Emergency Coordination
|
|
|
|
In: Valerie Shalin (2017)
|
|
BASE
|
|
14 |
What Kind of #Conversation is Twitter? Mining #Psycholinguistic Cues for Emergency Coordination
|
|
|
|
In: Valerie Shalin (2017)
|
|
BASE
|
|
15 |
RQUERY: Rewriting Natural Language Queries on Knowledge Graphs to Alleviate the Vocabulary Mismatch Problem
|
|
|
|
In: Amit P. Sheth (2017)
|
|
BASE
|
|
16 |
Intent Classification of Short-Text on Social Media
|
|
|
|
In: Valerie Shalin (2017)
|
|
BASE
|
|
17 |
Context-Aware Semantic Association Ranking
|
|
|
|
In: Amit P. Sheth (2016)
|
|
BASE
|
|
18 |
ezDI's Semantics-Enhanced Linguistic, NLP, and ML Approach for Health Informatics
|
|
|
|
In: Amit P. Sheth (2016)
|
|
BASE
|
|
19 |
Intent Classification of Short-Text on Social Media
|
|
|
|
In: Amit P. Sheth (2016)
|
|
BASE
|
|
20 |
Intent Classification of Short-Text on Social Media
|
|
|
|
In: Krishnaprasad Thirunarayan (2016)
|
|
BASE
|
|
|
|