4 | The Orange workflow for observing collocation trends ColTrend 1.0 | BASE
5 | Slovene ontology of semantic types for nouns SLONEST-noun 1.0 | BASE
7 | Multiword Expressions lexicon extracted from the Gigafida 2.1 corpus | BASE
8 | The Orange workflow for observing collocation clusters ColEmbed 1.0 | BASE
10 | Frequency lists of collocations from the Gigafida 2.1 corpus | BASE
14 | Frequency lists of character-level n-grams from the GOS 1.0 corpus 1.1 | BASE
16 | List of formulaic sequences in spoken Slovenian | BASE
Abstract: This document contains 2,374 formulaic sequences in spoken Slovenian, i.e. frequently recurring strings of two to five words, manually annotated for syntactic structure, pragmatic function, and dictionary relevance. The list, limited to sequences with a minimum frequency of 20 per million, is based on the Frequency lists of word-level n-grams from normalized word forms in GOS 1.0 (http://hdl.handle.net/11356/1271) and contains the union of the top 1,000 formulaic sequences ranked by frequency and by five association measures (Dice, t-test, MI, MI3, simple-LL). A related entry is "List of formulaic sequences in standard written Slovenian", http://hdl.handle.net/11356/1280.
Keywords: formulaic language; manual annotation; multiword expressions; n-grams; Slovenian language; spoken language
URL: http://hdl.handle.net/11356/1279
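The selection procedure described in the abstract of entry 16 (a frequency threshold, several rankings, then the union of the top-ranked items) can be illustrated with a minimal sketch. It is not the authors' implementation: it handles bigrams only, whereas the list covers sequences of two to five words, and it uses common textbook definitions of the measures (observed frequency O against expected frequency E = f1 * f2 / N), which may differ from the exact formulas behind the dataset. The function names `association_scores` and `top_k_union` are hypothetical.

```python
import math
from collections import Counter

MEASURES = ("frequency", "dice", "t_score", "mi", "mi3", "simple_ll")


def association_scores(o, f1, f2, n):
    """Score one bigram from its counts (assumed textbook formulas).

    o: observed bigram frequency, f1/f2: frequencies of the two words,
    n: corpus size in tokens.
    """
    e = f1 * f2 / n  # expected frequency if the two words were independent
    return {
        "frequency": o,
        "dice": 2 * o / (f1 + f2),
        "t_score": (o - e) / math.sqrt(o),
        "mi": math.log2(o / e),
        "mi3": math.log2(o ** 3 / e),
        "simple_ll": 2 * (o * math.log(o / e) - (o - e)),
    }


def top_k_union(tokens, k, min_per_million=0.0):
    """Return the union of the top-k bigrams under each ranking."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)  # simplification: token count used as sample size

    scored = {}
    for (w1, w2), o in bigrams.items():
        # Drop bigrams below the relative frequency threshold.
        if o * 1_000_000 / n < min_per_million:
            continue
        scored[(w1, w2)] = association_scores(o, unigrams[w1], unigrams[w2], n)

    selected = set()
    for measure in MEASURES:
        ranked = sorted(scored, key=lambda bg: scored[bg][measure], reverse=True)
        selected.update(ranked[:k])
    return selected
```

Under these assumptions, a call such as `top_k_union(tokens, k=1000, min_per_million=20)` would mirror the parameters named in the abstract. Taking the union rather than the intersection keeps any sequence that ranks highly under at least one criterion, so the result can contain up to six times `k` items.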
20 | Frequency lists of word-level n-grams from the GOS 1.0 corpus 1.1 | BASE