1
Combining NLP and probabilistic categorisation for document and term selection for Swiss-Prot medical annotation
BASE
2
Multiview Semi-Supervised Learning for Ranking Multilingual Documents
In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2011, Athens, Greece, pp. 443-458, ⟨10.1007/978-3-642-23808-6_29⟩ ; https://hal.archives-ouvertes.fr/hal-01286156
3
A Co-classification Approach to Learning from Multilingual Corpora
In: Machine Learning, Springer Verlag, 2010, 79 (1-2), pp. 105-121, ⟨10.1007/s10994-009-5151-5⟩ ; ISSN: 0885-6125 ; EISSN: 1573-0565 ; https://hal.archives-ouvertes.fr/hal-01172633
4
Multiview Clustering of Multilingual Documents
In: Proceedings of the 33rd Annual ACM SIGIR Conference (SIGIR 2010), Jul 2010, Geneva, Switzerland, pp. 812-822, ⟨10.1145/1835449.1835633⟩ ; https://hal.archives-ouvertes.fr/hal-01292100
5
Combining Coregularization and Consensus-Based Self-Training for Multilingual Text Categorization
In: Proceedings of the 33rd Annual ACM SIGIR Conference (SIGIR 2010), Jul 2010, Geneva, Switzerland, pp. 475-482, ⟨10.1145/1835449.1835529⟩ ; https://hal.archives-ouvertes.fr/hal-01291883
7
Learning from Multiple Partially Observed Views -- an Application to Multilingual Text Categorization
In: Advances in Neural Information Processing Systems, Dec 2009, Vancouver, Canada ; https://hal.archives-ouvertes.fr/hal-01297947
Abstract:
We address the problem of learning classifiers when observations have multiple views, some of which may not be observed for all examples. We assume the existence of view generating functions which may complete the missing views in an approximate way. This situation corresponds for example to learning text classifiers from multilingual collections where documents are not available in all languages. In that case, Machine Translation (MT) systems may be used to translate each document in the missing languages. We derive a generalization error bound for classifiers learned on examples with multiple artificially created views. Our result uncovers a trade-off between the size of the training set, the number of views, and the quality of the view generating functions. As a consequence, we identify situations where it is more interesting to use multiple views for learning instead of classical single view learning. An extension of this framework is a natural way to leverage unlabeled multi-view data in semi-supervised learning. Experimental results on a subset of the Reuters RCV1/RCV2 collections support our findings by showing that additional views obtained from MT may significantly improve the classification performance in the cases identified by our trade-off.
Keyword:
[INFO]Computer Science [cs]
URL: https://hal.archives-ouvertes.fr/hal-01297947
10
Combining NLP and probabilistic categorisation for document and term selection for Swiss-Prot medical annotation