Search in the Catalogues and Directories

Hits 1 – 18 of 18

1
Mining an English-Chinese parallel Dataset of Financial News
In: Journal of Open Humanities Data; Vol 8 (2022); 9 ; 2059-481X (2022)
Abstract: Parallel text datasets are valuable for educational purposes, machine translation, and cross-language information retrieval, but few are domain-oriented. We have created a Chinese–English parallel dataset in the domain of finance technology, using the Financial Times website, from which we collected 60,473 news items published between 2007 and 2021. This dataset is a bilingual Chinese–English parallel dataset of news in the domain of finance. It is open access in its original state without transformation, and was built not for machine translation, as such datasets have been used, but for intelligent mining, for which we conducted many experiments using up-to-date text mining techniques: clustering (topic modeling, community detection, k-means), topic prediction (naive Bayes, SVM, LSTM, BERT), and pattern discovery (dictionary-based, time series). We present the use of these techniques as a framework for other studies, not only as an application but with an interpretation.
Keyword: classification; clustering; computer science; English-Chinese; patterns; text mining
URL: https://openhumanitiesdata.metajnl.com/jms/article/view/62
https://doi.org/10.5334/johd.62
BASE
2
Source Code for Youtube dataset processing ...
TURENNE, Nicolas. - : Zenodo, 2022
BASE
3
Source Code for Youtube dataset processing ...
TURENNE, Nicolas. - : Zenodo, 2022
BASE
4
The rumour spectrum
In: ISSN: 1932-6203 ; EISSN: 1932-6203 ; PLoS ONE ; https://hal.archives-ouvertes.fr/hal-01691934 ; PLoS ONE, Public Library of Science, 2018, 13 (1), pp.e0189080.1-27. ⟨10.1371/journal.pone.0189080⟩ (2018)
BASE
5
A semi-supervised Learning Approach to find equivalent long-string Organization Names
In: Colloque- Forum PEPS EXIA ; https://hal-enpc.archives-ouvertes.fr/hal-02310298 ; Colloque- Forum PEPS EXIA, Oct 2016, Champs sur Marne, France. 2016 (2016)
BASE
6
On a Possible Similarity between Gene and Semantic Networks ...
Turenne, Nicolas. - : arXiv, 2016
BASE
7
Duplicate Detection with Efficient Language Models for Automatic Bibliographic Heterogeneous Data Integration
In: https://hal.archives-ouvertes.fr/hal-03373972 ; 2015 (2015)
BASE
8
svcR: An R Package for Support Vector Clustering improved with Geometric Hashing applied to Lexical Pattern Discovery
In: https://hal.archives-ouvertes.fr/hal-03373979 ; 2015 (2015)
BASE
9
On a Possible Similarity between Gene and Semantic Networks
In: https://hal.archives-ouvertes.fr/hal-03373977 ; 2015 (2015)
BASE
10
Duplicate Detection with Efficient Language Models for Automatic Bibliographic Heterogeneous Data Integration ...
Turenne, Nicolas. - : arXiv, 2015
BASE
11
svcR: An R Package for Support Vector Clustering improved with Geometric Hashing applied to Lexical Pattern Discovery ...
Turenne, Nicolas. - : arXiv, 2015
BASE
12
Knowledge Needs and Information Extraction : Towards an Artificial Consciousness
Turenne, Nicolas [author]. - New York, NY : John Wiley & Sons, 2013
DNB Subject Category Language
13
Clustering and Relational Ambiguity: from Text Data to Natural Data
In: EISSN: 2416-5999 ; Journal of Data Mining and Digital Humanities ; https://hal.archives-ouvertes.fr/hal-00920423 ; Journal of Data Mining and Digital Humanities, Episciences.org, 2013, 1 (1), pp.1 (2013)
BASE
14
Knowledge Needs and Information Extraction
Turenne, Nicolas. - : HAL CCSD, 2013. : Wiley-ISTE, 2013
In: https://hal.inrae.fr/hal-02804243 ; Wiley-ISTE, 269 p., 2013, Computer Engineering and IT series, 978-1-84821-515-3 (2013)
BASE
15
Modelling noun-phrase dynamics in specialized text collections
In: Journal of quantitative linguistics. - London : Routledge 17 (2010) 3, 212-228
BLLDB
OLC Linguistik
16
Modeling Noun-Phrases Dynamics in Specialized Text Collections
In: ISSN: 0929-6174 ; Journal of Quantitative Linguistics ; https://hal.archives-ouvertes.fr/hal-02054488 ; Journal of Quantitative Linguistics, Taylor & Francis (Routledge), 2010, 17 (3), pp.212-228. ⟨10.1080/09296174.2010.485447⟩ (2010)
BASE
17
Bayesian Discriminant Analysis for Lexical Semantic Tagging
In: European Meeting on Cybernetics and Systems Research (EMCSR) ; https://hal.archives-ouvertes.fr/hal-03373905 ; European Meeting on Cybernetics and Systems Research (EMCSR), Apr 2002, Vienna, Austria (2002)
BASE
Show details
18
Apprentissage statistique pour l'extraction de concepts à partir de textes : application au filtrage d'informations textuelles
Turenne, Nicolas. - : HAL CCSD, 2000
In: https://tel.archives-ouvertes.fr/tel-00006210 ; domain_stic.gest. Université Louis Pasteur - Strasbourg I, 2000. French (2000)
BASE

© 2013 - 2024 Lin|gu|is|tik