
Search in the Catalogues and Directories

Hits 1 – 20 of 90

1
Literatur in der SBZ
Jacob, Herbert [Other]; Berlin-Brandenburgische Akademie der Wissenschaften [Editor]. - Berlin : De Gruyter Akademie Verlag Berlin, 2021
DNB Subject Category Language
2
Out-of-the-Box and Into the Ditch? Multilingual Evaluation of Generic Text Extraction Tools
In: Language Resources and Evaluation Conference (LREC 2020) ; https://hal.archives-ouvertes.fr/hal-02732851 ; Language Resources and Evaluation Conference (LREC 2020), 2020, pp.5-13 (2020)
Abstract: This article examines extraction methods designed to retain the main text content of web pages and discusses how the extraction could be oriented and evaluated: can and should it be as generic as possible to ensure opportunistic corpus construction? The evaluation grounds on a comparative benchmark of open-source tools used on pages in five different languages (Chinese, English, Greek, Polish and Russian), it features several metrics to obtain more fine-grained differentiations. Our experiments highlight the diversity of web page layouts across languages or publishing countries. These discrepancies are reflected by diverging performances so that the right tool has to be chosen accordingly.
Keyword: [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; Boilerplate removal; Cleaneval; Evaluation metrics; Web Content Extraction; Web corpus construction
URL: https://hal.archives-ouvertes.fr/hal-02732851/document
https://hal.archives-ouvertes.fr/hal-02732851/file/2020.wac-1.2.pdf
https://hal.archives-ouvertes.fr/hal-02732851
BASE
3
Diving Into The Complexities Of The Tech Blog Sphere
In: Digital Humanities 2019 ; https://hal.archives-ouvertes.fr/hal-02201532 ; Digital Humanities 2019, ADHO, Jul 2019, Utrecht, Netherlands ; https://dev.clariah.nl/files/dh2019/boa/0964.html (2019)
BASE
4
Nénufar: Modelling a Diachronic Collection of Dictionary Editions as a Computational Lexical Resource
In: ELEX 2019: smart lexicography ; https://hal.inria.fr/hal-02272978 ; ELEX 2019: smart lexicography, Oct 2019, Sintra, Portugal (2019)
BASE
5
TEI Encoding of a Classical Mixtec Dictionary Using GROBID-Dictionaries
In: ELEX 2019: Smart Lexicography ; https://hal.inria.fr/hal-02264033 ; ELEX 2019: Smart Lexicography, Oct 2019, Sintra, Portugal ; https://elex.link/elex2019/ (2019)
BASE
6
Designing Multilingual Digital Pedagogy Initiatives: The Programming Historian for English, Spanish, and French speaking DH Communities
In: Digital Humanities Conference 2019 ; https://halshs.archives-ouvertes.fr/halshs-03104552 ; Digital Humanities Conference 2019, Jul 2019, Utrecht, Netherlands (2019)
BASE
7
Designing Multilingual Digital Pedagogy Initiatives: The Programming Historian for English, Spanish, and French speaking DH Communities
In: Digital Humanities 2019 ; https://halshs.archives-ouvertes.fr/halshs-02277611 ; Digital Humanities 2019, Jul 2019, Utrecht, Netherlands. 2019 ; https://dh2019.adho.org/ (2019)
BASE
8
Three Challenges in Developing Open Multilingual DH Educational Resources: The Case of The Programming Historian
In: Digital Humanities 2019 ; https://halshs.archives-ouvertes.fr/halshs-02277639 ; Digital Humanities 2019, ADHO, Jul 2019, Utrecht, Netherlands ; https://hcommons.org/deposits/item/hc:25461/ (2019)
BASE
9
TEI and the Mixtepec-Mixtec corpus: data integration, annotation and normalization of heterogeneous data for an under-resourced language
In: 6th International Conference on Language Documentation and Conservation (ICLDC) ; https://hal.inria.fr/hal-02075475 ; 6th International Conference on Language Documentation and Conservation (ICLDC), Feb 2019, Honolulu, United States (2019)
BASE
10
TEI Lex-0: A Target Format for TEI-Encoded Dictionaries and Lexical Resources
In: TEI Conference and Members' Meeting ; https://hal.inria.fr/hal-02265312 ; TEI Conference and Members' Meeting, Sep 2018, Tokyo, Japan (2018)
BASE
11
A database of German definitory contexts from selected web sources
In: 11th International Conference on Language Resources and Evaluation (LREC 2018) ; https://hal.archives-ouvertes.fr/hal-01798704 ; 11th International Conference on Language Resources and Evaluation (LREC 2018), May 2018, Miyazaki, Japan. pp.3068-3073 (2018)
BASE
12
Enhancing Usability for Automatically Structuring Digitised Dictionaries
In: GLOBALEX workshop at LREC 2018 ; https://hal.archives-ouvertes.fr/hal-01708137 ; GLOBALEX workshop at LREC 2018, May 2018, Miyazaki, Japan (2018)
BASE
13
A Constellation and a Rhizome: Two Studies on Toponyms in Literary Texts
In: Visualisierung sprachlicher Daten: Visual Linguistics – Praxis – Tools ; https://hal.archives-ouvertes.fr/hal-01775127 ; Bubenhofer, Noah und Kupietz, Marc. Visualisierung sprachlicher Daten: Visual Linguistics – Praxis – Tools, Heidelberg University Publishing, pp.167-184, 2018, 978-3-946054-75-7. ⟨10.17885/heiup.345.474⟩ ; http://heiup.uni-heidelberg.de/heiup/catalog/book/345 (2018)
BASE
14
Retro-digitizing and Automatically Structuring a Large Bibliography Collection
In: European Association for Digital Humanities (EADH) Conference ; https://hal.archives-ouvertes.fr/hal-01941534 ; European Association for Digital Humanities (EADH) Conference, EADH, Dec 2018, Galway, Ireland (2018)
BASE
15
A Diachronic Digital Edition of the Petit Larousse illustré
In: Journée d'étude CORLI : Traitements et standardisation des corpus multimodaux et web 2.0. ; https://hal.archives-ouvertes.fr/hal-01873805 ; Journée d'étude CORLI : Traitements et standardisation des corpus multimodaux et web 2.0., May 2018, Paris, France (2018)
BASE
16
Borderlands of text mapping: Experiments on Fontane's Brandenburg
In: Workshop INF-DH-2018 (Informatik und die Digital Humanities) ; https://hal.archives-ouvertes.fr/hal-01951880 ; Workshop INF-DH-2018 (Informatik und die Digital Humanities), Sep 2018, Berlin, Germany. ⟨10.18420/infdh2018-05⟩ (2018)
BASE
17
Automatically Encoding Encyclopedic-like Resources in TEI
In: The annual TEI Conference and Members Meeting ; https://hal.inria.fr/hal-01819505 ; The annual TEI Conference and Members Meeting, Sep 2018, Tokyo, Japan ; https://tei2018.dhii.asia/ (2018)
BASE
18
Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers
In: Fifth Workshop on NLP for Similar Languages, Varieties and Dialects ; https://hal.archives-ouvertes.fr/hal-01858444 ; Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, Aug 2018, Santa Fe, New Mexico, United States. pp.164-171 ; http://alt.qcri.org/vardial2018/ (2018)
BASE
19
A corpus of German political speeches from the 21st century
In: 11th Language Resources and Evaluation Conference (LREC 2018) ; https://hal.archives-ouvertes.fr/hal-01798703 ; 11th Language Resources and Evaluation Conference (LREC 2018), May 2018, Miyazaki, Japan. pp.792-797 (2018)
BASE
20
Toponyms as Entry Points into a Digital Edition: Mapping Die Fackel
In: ISSN: 2451-1781 ; Open Information Science ; https://hal.archives-ouvertes.fr/hal-01775122 ; Open Information Science, De Gruyter, 2018, 2 (1), pp.23-33. ⟨10.1515/opis-2018-0002⟩ ; https://www.degruyter.com/downloadpdf/j/opis.2018.2.issue-1/opis-2018-0002/opis-2018-0002.pdf (2018)
BASE

