Search tips

General tips

You can truncate search terms with * (asterisk). The asterisk replaces any number of letters.
Example: synta* finds syntax, syntactic, syntaxique and so on.
Use quotation marks around the words that must occur next to one another as a phrase (phrase search).
Example: "Old English"
The search is not sensitive to case: uppercase and lowercase letters are treated as equivalent.
Example: Compound and compound deliver the same search results.
German umlauts and ß are converted automatically.
You can combine search fields using the Boolean operators AND, OR, and AND NOT (AND is the default setting).
When you use AND between terms, you specify that all terms must be in every item found.
When you use OR between search terms, you specify that items must contain at least one of the terms, but not necessarily all of them.
Use AND NOT when you want to exclude items that contain the term positioned after NOT.
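
The rules above can be illustrated with a small sketch. The Python below is not the portal's implementation; it only imitates truncation, phrase search, case-insensitivity, and the three Boolean operators on a hypothetical in-memory list of titles. The names RECORDS, term_to_regex, and hits are assumptions made for this illustration.

```python
import re

# Hypothetical sample records; not taken from any real catalogue.
RECORDS = [
    "Old English syntax",
    "Syntactic theory and compound nouns",
    "Compound verbs in Hindi",
    "A history of English",
]

def term_to_regex(term):
    """Turn a search term into a case-insensitive regular expression.

    An asterisk stands for any number of letters (truncation);
    a term containing spaces is matched as a phrase.
    """
    parts = [re.escape(part) for part in term.split("*")]
    pattern = r"\w*".join(parts)
    return re.compile(r"\b" + pattern + r"\b", re.IGNORECASE)

def hits(term):
    """Return the set of record indexes whose text contains the term."""
    regex = term_to_regex(term)
    return {i for i, text in enumerate(RECORDS) if regex.search(text)}

# Boolean operators correspond to set operations on the hit sets.
both = hits("english") & hits("synta*")      # AND: every term must occur
either = hits("compound") | hits("synta*")   # OR: at least one term occurs
without = hits("english") - hits("old")      # AND NOT: exclude the second term
phrase = hits("Old English")                 # phrase search: words side by side
```

In this sketch, AND narrows the result set, OR widens it, and AND NOT removes every record that contains the excluded term.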

Catalogue search

Via the Simple Search, you can search across all catalogues and all search fields. If you enter several search terms, they will be linked by AND.
The search bar on the welcome page as well as the search box in the main navigation bar provide the same function.

If you select Advanced Search, you can use five search fields to refine your query. You can assign different categories to the search fields and combine them using the Boolean operators.

For each search field you can choose a category using the drop-down list. The options are:

All fields
Title
Creator / Publisher
Keyword
Year

Display of the results and access to the publications

You can sort the results of your query in different ways. The default setting is by publishing year (descending).

Use the drop-down list to choose one of the options:

publishing year (descending)
publishing year (ascending)
creator (A → Z)
creator (Z → A)
title (A → Z)
title (Z → A)
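
As a rough illustration of these sort options, the following Python sketch orders a few hypothetical records. The field names (title, creator, year), the option keys, and the function sort_records are assumptions for this example and do not describe how the portal stores or sorts its data.

```python
# Hypothetical records; the field names are assumptions for this sketch.
RECORDS = [
    {"title": "Beta studies", "creator": "Young, A.", "year": 2001},
    {"title": "alpha notes", "creator": "Adams, B.", "year": 2019},
    {"title": "Gamma papers", "creator": "Miller, C.", "year": 2010},
]

# Each option names the field to sort by and whether the order is reversed.
SORT_OPTIONS = {
    "year_desc": ("year", True),
    "year_asc": ("year", False),
    "creator_asc": ("creator", False),
    "creator_desc": ("creator", True),
    "title_asc": ("title", False),
    "title_desc": ("title", True),
}

def sort_records(records, option="year_desc"):
    """Return a new list sorted by the chosen option (default: newest first)."""
    field, reverse = SORT_OPTIONS[option]

    def key(record):
        value = record[field]
        # Compare text without regard to case; leave numbers unchanged.
        return value.casefold() if isinstance(value, str) else value

    return sorted(records, key=key, reverse=reverse)
```

Calling sort_records(RECORDS) without an option mirrors the default setting, with the most recent year first.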

When searching across all data sources (catalogues, bibliographies, directories), your results can be books, articles, journals, websites, etc.

In the hit list, you see the bibliographic records of the resources that fulfil your search criteria. For each record, you have a short and a detailed version. The short version comprises the title, information about the author or editor, and the data source (i.e. name of the catalogue).

If you choose Show details, you will see additional information, e.g. ISBN, URL, Country, publication language (Written in), Abstract or Keyword. The amount of information available varies across data sources and resource types.

Under the entries in the hit list, you can find different buttons that facilitate access to the resource. What kind of access you have depends on the resource type and the data source. If alternative options exist, they will be shown simultaneously.

Functions of the buttons

Link to the full text or the website

The colour codes correspond to the established traffic light system (see below) of the Electronic Journals Library (Elektronische Zeitschriftenbibliothek, EZB).

The article is published in an open access journal; this is a link to the journal via the EZB; you can find the full text on the journal’s website.

The article is published in a subscription-based electronic journal that has been licensed by your institution; this is a link to the journal via the EZB; you can find the full text on the journal’s website.

The article is published in a subscription-based electronic journal; the full text is not accessible since your institution does not own a licence; the abstract may be freely available.

The status of the journal where the article is published could not be determined; this is a link to the journal’s record in the EZB that can provide additional information.

Link to the Zeitschriftendatenbank (ZDB); you can see here which German or Austrian libraries have holdings of the print journal.

Starts a Germany-wide search in the union catalogues. You will find out which libraries possess the book.

Starts an availability check in the Karlsruhe Virtual Catalog. You will find out in which library networks the book is available.

Link to subito, a fee-based document delivery service of German, Swiss, and Austrian libraries. You can order a copy of an article or a book chapter. Subito will deliver the copy directly to your home within 24 to 72 hours.

Please note that in order to use the full functionality you have to be within the IP range of your institution or be logged in with your institutional account.

Using filters to refine your results

After completing a simple or advanced search, you can refine your results further using filters. In the Refine your search area, you will find the available filter functions:

Keyword: Here you see the most frequent keywords in the set of hits. By selecting one of the keywords, you will limit the results to the ones containing the given keyword.
Creator/Publisher: Here you can see the names of the authors or editors most frequently found in the set of hits. By selecting a name, you will limit your results to the publications of the respective person.
Year: Here you see the years of publication sorted in descending order. In the brackets next to the year, you see the number of publications. By selecting a year, you will limit your results to the publications in that year.
Medium: You can filter the hits according to media type and limit your results only to online documents or print documents, for example (possible media types: Print, Online, Microfilm, CD-ROM/DVD, Audio-CD, Cartography, Miscellaneous).
Type: Here you see the number of hits sorted by resource type and you can filter accordingly (possible resource types: Article, Book, Journal, Series, Website, Miscellaneous).
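
The filters behave like facets: each one counts how often a value occurs in the current set of hits, and selecting a value narrows the set. The Python sketch below shows this idea on hypothetical records. The record structure and the helper names values_of, facet_counts, and refine are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical hits; the field names and values are assumptions.
HITS = [
    {"title": "Record A", "year": 2019, "type": "Article", "keywords": ["syntax", "English"]},
    {"title": "Record B", "year": 2019, "type": "Book", "keywords": ["syntax"]},
    {"title": "Record C", "year": 2005, "type": "Article", "keywords": ["phonology"]},
]

def values_of(hit, field):
    """Return the field's values as a list, whether it holds one value or several."""
    value = hit[field]
    return value if isinstance(value, list) else [value]

def facet_counts(hits, field):
    """Count how many hits carry each value of the given field."""
    counts = Counter()
    for hit in hits:
        counts.update(values_of(hit, field))
    return counts

def refine(hits, field, selected):
    """Keep only the hits that carry the selected value."""
    return [hit for hit in hits if selected in values_of(hit, field)]

# Years in descending order, each with its number of publications.
year_facet = sorted(facet_counts(HITS, "year").items(), reverse=True)
# Keywords ordered by frequency, most frequent first.
keyword_facet = facet_counts(HITS, "keywords").most_common()
```

Filters can be chained in this sketch, for example refine(refine(HITS, "year", 2019), "keywords", "syntax"), which corresponds to selecting a year and then a keyword.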

Limiting the search to selected catalogues

The Catalogue search allows you to search several data sources simultaneously (Catalogues, Bibliographies, Linked Open Data catalogues, Online resources, Open access documents).

The default setting is a search across all integrated sources. You can, however, change the setting and limit your search to one or several catalogues. For this purpose, use the checkboxes in the list of the integrated data sources.

Next to each catalogue name, you see an information icon that links to a short description of the data source.

Journal directory

In the Journal directory, you can search for linguistically relevant journals and not for individual articles. In order to search for individual articles, please go to the website of the respective journal.

The search mask comprises six search fields. You decide how many search fields to use and how to combine them. You have to use at least one search field in order to start a query. The default setting is the option All fields.

Use the drop-down boxes to choose a category for the search fields:

All fields includes all categories.
Title refers to the title of the journal.
Subject refers to the main topics of the journal, e.g. Computational linguistics, Media linguistics.
Tip: You can find the list of the possible subjects by selecting the facet Subject in the menu bar.
Language refers to the languages that are the topic of the journal (object language), e.g. Austronesian languages, Japanese.
Keywords are more specific than the subjects are, e.g. language minority, language technology, German as a foreign language.
Written in refers to the publication language, i.e. the language in which the majority of the journal articles are written; there are journals with more than one publication language.

Examples of possible queries

Example 1: If you enter "linguistics" in the search field with category Title, you will find all journals whose titles contain the word "linguistics".

Example 2: If you enter "Computational linguistics" in the search field with category Subject, you will find all journals with a focus on computational linguistics.

Example 3: If you search for journals with a focus on Arabic linguistics whose articles are written in German, you can use a combined query: Language "Arabic" AND Written in "German".
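
The three examples can be read as fielded queries: each criterion pairs a category with a term, and AND requires every criterion to match. The Python sketch below shows this under simplifying assumptions: the journal records, the field keys (title, subject, language, written_in), and the functions field_matches and search are hypothetical, and terms are matched as case-insensitive substrings rather than as whole words.

```python
# Hypothetical journal records; titles and field names are assumptions.
JOURNALS = [
    {"title": "Journal A of Linguistics", "subject": ["Computational linguistics"],
     "language": ["Arabic"], "written_in": ["German", "English"]},
    {"title": "Journal B", "subject": ["Lexicography"],
     "language": ["Arabic"], "written_in": ["English"]},
    {"title": "Journal C", "subject": ["Applied linguistics"],
     "language": ["Japanese"], "written_in": ["German"]},
]

def field_matches(journal, field, term):
    """Check one criterion; the field "all" looks into every category."""
    names = journal.keys() if field == "all" else [field]
    needle = term.casefold()
    for name in names:
        value = journal[name]
        values = value if isinstance(value, list) else [value]
        if any(needle in v.casefold() for v in values):
            return True
    return False

def search(journals, criteria):
    """Return the titles of journals that satisfy all criteria (AND)."""
    return [
        journal["title"] for journal in journals
        if all(field_matches(journal, field, term) for field, term in criteria)
    ]
```

Example 3 then corresponds to search(JOURNALS, [("language", "Arabic"), ("written_in", "German")]).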

Display of the results and access to the full text

The query results are sorted alphabetically. Each hit is a journal’s bibliographic record.

The short version of the record contains the title, the homepage URL as well as information on the categories Subject, Language (object language) and Written in (publication language).

Select Show details if you want to see the detailed version of the record. It contains additional information such as Subject area, Keyword, Publisher. Via Elektronische Zeitschriftenbibliothek (EZB), you can find additional information such as details about the access to different volumes. Institutions with full text access shows which libraries have licensed the respective journal.

In the hit list, you see a traffic light symbol in front of each journal title. The traffic light system indicates your access to the full text of the journal articles:

Green: Free access to full text

Yellow: Full text accessible via your institution

Red: Full text not accessible (usually tables of contents and abstracts are accessible)

Yellow/red: Access to the journal content varies depending on the volume. The full text is accessible for some volumes, but not for others.

If you use the search functions outside your university or institution, e.g. at home, make sure to log in to your library system with your user account. The yellow traffic light, which shows that your library has licensed the journal you need, can be displayed only after you have logged in. This information can be processed only if you are within the IP range of a given institution.
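
The dependency between login and the yellow traffic light can be sketched as a small decision function. This is an illustration under assumptions, not the logic of the portal or the EZB: the function name and parameters are hypothetical, the colour names green and red follow the usual EZB convention, and journals whose access varies by volume are ignored.

```python
def traffic_light(freely_available, licensed_by_institution, institution_recognised):
    """Return the colour shown for a journal in this simplified model.

    institution_recognised is True when the user is within the IP range
    of an institution or has logged in with an institutional account.
    """
    if freely_available:
        return "green"
    # A licence can only be taken into account if the institution is known.
    if institution_recognised and licensed_by_institution:
        return "yellow"
    return "red"
```

In this model, a licensed journal appears as red to a user who is not logged in, which is why logging in matters when you work from home.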

Limit results to freely available sources is a special feature of the Journal directory. You can activate it through the checkbox under the search mask.

Browsing the Journal directory

You can browse by selected categories:

Title A-Z: This is an alphabetically sorted list of all journals contained in the directory.
Subject: e.g. Applied linguistics, Computational linguistics, Lexicography (40 subjects available)
Language: This category refers to language as a topic of research (object language). The browsing is based on a hierarchically organised language classification. The top level contains established language families (e.g. Afro-Asiatic languages, Indo-European languages), groups based on geographical distribution (Australian languages, Papuan languages), and isolated languages (e.g. Basque, Burushaski). Click on the plus sign to see the individual languages that belong to the respective group. Please note that the language classification includes only languages that are the subject of at least one journal from the directory.
Subject area: e.g. African Studies, English Studies, German Studies, Oriental Studies (25 subject areas available)
Written in: The browsing is based on the list of the available publication languages.

Links between publications and language corpora

Numerous publications from the Bibliography of Linguistic Literature (BLL) are interlinked with the language corpus described in the respective publication.
For this purpose, corpus records were created and integrated in the Catalogue search. You can conveniently filter the search results using the resource type Corpus.

Example: If you start a Simple Search with the phrase “Early Modern English”, you will find articles, books and journals as well as corpora. Choosing the filter Type, you will see the number of hits sorted according to resource type. By selecting Corpus, you will get to the list of the relevant corpora.

Each corpus record has a short and a detailed version.

In the short version you can see

name of the corpus, e.g. British National Corpus
corresponding abbreviation (or acronym), displayed in brackets, e.g. [ BNC ]
URL of the primary corpus homepage, e.g. http://www.natcorp.ox.ac.uk/

All corpus URLs are provided as active links – with a single click you can navigate to the web presence of a given corpus.

In the detailed version you can also find

information about the object language, e.g. Keyword: Britisches Englisch
other URLs associated with the corpus, e.g. British National Corpus, Baby edition
    Handle: http://hdl.handle.net/20.500.12024/2553
    PURL: http://purl.ox.ac.uk/ota/2553
    VLO: https://vlo.clarin.eu/record?docId=https_58__47__47_hdl.handle.net_47_20.500.12024_47_2553_64_format_61_cmdi
    Homepage: http://www.natcorp.ox.ac.uk/corpus/babyinfo.html
indication of the version, in case the corpus versions can be addressed separately, e.g. Corpus of Late Modern English Texts
    Homepage: https://perswww.kuleuven.be/~u0044428/clmet.htm (Version [1])
    Homepage: https://perswww.kuleuven.be/~u0044428/clmetev.htm (Extended Version)
    Homepage: https://perswww.kuleuven.be/~u0044428/clmet3_0.htm (Version 3.0)
link to the list of the interlinked publications (field Related publications)

The links between a corpus and its corresponding publications are bidirectional. Provided such links exist, they are also visible in the detailed versions of the publications’ records (as hyperlinks pointing to the corpus record in the field Corpus used), e.g.

The diachrony of "the fact that"-clauses
Gentens, Caroline
In: English studies. - Abingdon : Routledge, Taylor & Francis Group 100 (2019) 1-2, 220-239
Corpus used: Corpus of Late Modern English Texts (Version 3.0)
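
One way to picture bidirectional links is an index that records every link in both directions at once. The Python sketch below is only an illustration: the class LinkIndex and its attribute names are hypothetical, and the portal's actual data model is not described in this help text. The sample link is the one shown in the record above.

```python
from collections import defaultdict

class LinkIndex:
    """Keeps links between publications and corpora navigable both ways."""

    def __init__(self):
        # publication -> corpora, as shown in the field "Corpus used"
        self.corpora_of = defaultdict(set)
        # corpus -> publications, as shown in the field "Related publications"
        self.publications_of = defaultdict(set)

    def link(self, publication, corpus):
        """Store one link so that it can be followed from either record."""
        self.corpora_of[publication].add(corpus)
        self.publications_of[corpus].add(publication)

index = LinkIndex()
index.link(
    'The diachrony of "the fact that"-clauses',
    "Corpus of Late Modern English Texts (Version 3.0)",
)
```

Because link() updates both mappings together, a link can never be visible from the publication without also being visible from the corpus.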

LOD search

The Lin|gu|is|tik portal is interlinked with Linked Open Data (LOD). Numerous language resources are integrated in the portal using this interlinking. The LOD search gives you the opportunity to specifically search for these resources.

The LOD search corresponds to a Simple Search limited to Linked Open Data catalogues. General information about the Advanced Search, the display of the results, and the use of filters is provided in the section Catalogue search.

Linked Open Data catalogues

The catalogue Annohub comprises resources that underwent an automated analysis with regard to object language and annotation scheme. The obtained metadata is stored in the Annohub-Repository (for more information please go to section LOD search: Background information).

All referenced resources are freely available and can be downloaded directly. Some of the resources are "genuine" LOD resources, i.e. both content and metadata are modelled according to LOD principles. Many of the corpora, however, are available in XML or CoNLL format and provide only metadata as LOD.

The formal metadata (title, author, etc.) is adopted from portals such as LingHub and CLARIN-VLO.

The short version of an Annohub record comprises the title, and, if specified, the author of the resource. In the detailed version, you can find additional information such as Abstract (short description), Keyword (metadata with regard to language and annotated linguistic phenomena), URL (download link), and Mime-Type (Internet Media Type).

When using the filter functions please consider the following particularities:

Year: This information is generally lacking.
Medium: In accordance with the nature of the resources, all Annohub records have a uniform media type (online).
Type: In accordance with the nature of the resources, all Annohub records have a uniform resource type (website).

When querying the catalogue Annohub, we recommend using search terms such as

names of individual languages (e.g., Arabic, German, Hindi)
concepts describing language phenomena that are usually part of the linguistic annotation of corpora, for example, terms designating parts of speech (e.g. conjunction, personal pronoun), grammatical features (e.g. dative, dual), syntactic categories (e.g. object, syntagm), etc.

Examples of possible queries

Example 1: If you enter "auxiliary" in the search field, you will find language resources (corpora, dictionaries, or lexicons) that annotate auxiliary verbs.

Example 2: If you enter "Finnish" in the search field, you will find language resources that have Finnish as their object language.

Example 3: If you search for Arabic language resources that include annotations of the syntactic structure, you can use a combined query: All fields "Arabic" AND Keyword "syntax".

Please note that the search for concrete examples in a given corpus or dictionary requires knowledge of appropriate query languages (e.g. SPARQL for resources in RDF format).

LOD search: Background information

The term Linked Data refers to a set of best practices for publishing structured data on the Web.

Linked Data builds upon the following Web standards:

URI (Uniform Resource Identifier) for the identification of resources
HTTP (Hypertext Transfer Protocol) for the transfer of the data
RDF (Resource Description Framework) as a general method for the modelling of information
RDFS (Resource Description Framework Schema) and OWL (Web Ontology Language) for the authoring of vocabularies / ontologies
SPARQL (SPARQL Protocol and RDF Query Language) for the retrieval of data

Principles of Linked Data as outlined by Tim Berners-Lee:

Use URIs as names for things.
Use HTTP URIs so that people can look up those names.
When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
Include links to other URIs, so that they can discover more things.
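
The four principles can be illustrated with a toy data set. In the Python sketch below, RDF statements are represented as plain (subject, predicate, object) tuples held in memory. All URIs under http://example.org/ are hypothetical, and the functions describe, linked_uris, and match are assumptions for this example. Real Linked Data is retrieved over HTTP and queried with SPARQL rather than with list comprehensions.

```python
EX = "http://example.org/"  # hypothetical namespace used only in this sketch

# Each statement is a triple: (subject, predicate, object).
TRIPLES = [
    (EX + "corpus/1", EX + "vocab/title", "Sample corpus"),
    (EX + "corpus/1", EX + "vocab/language", EX + "language/deu"),
    (EX + "language/deu", EX + "vocab/label", "German"),
]

def describe(uri):
    """Imitate looking up a URI: return what is stated about it."""
    return [(p, o) for s, p, o in TRIPLES if s == uri]

def linked_uris(uri):
    """Return the other URIs a resource links to, so more can be discovered."""
    return [o for _, o in describe(uri) if o.startswith("http://")]

def match(subject=None, predicate=None, obj=None):
    """Return triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o) for s, p, o in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]
```

Starting from the corpus URI, linked_uris leads to the language URI, and describe on that URI yields its label. This mirrors how following links lets a client discover further information.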

Linked Open Data refers to open data designed according to Linked Data principles.

Linguistic Linked Open Data (LLOD) refers to freely available, linguistically relevant resources such as corpora, lexicons, dictionaries, thesauri, knowledge bases, typological databases, and terminology or metadata repositories, designed according to Linked Data principles. For details and examples, please go to https://linguistic-lod.org.

Interlinking the Lin|gu|is|tik portal with LOD

The thesaurus of the Bibliography of Linguistic Literature (BLL Thesaurus) plays a key role in the interlinking of the Lin|gu|is|tik portal with LOD. There are two reasons for this: being an integral part of the bibliography, the thesaurus is extensively interlinked with publications; at the same time, it provides the basis for the standardised vocabulary used for classifying and indexing within the Lin|gu|is|tik portal.

In order to serve as a connecting point for LOD, the BLL Thesaurus was modelled according to Linked Data principles and linked to terminology repositories in the LLOD cloud (see BLL LOD Edition). Currently, links to the repositories OLiA, Lexvo, and Glottolog are available:

OLiA (Ontologies of Linguistic Annotations) defines linguistic data categories. The OLiA Reference Model serves as a mediator between different classification systems and annotation schemes.
Lexvo provides persistent identifiers for the language codes from the ISO 639 series.
Glottolog provides a detailed language classification according to genealogical principles and a comprehensive bibliography.

Using these links, resources that, for their part, are linked to OLiA, Lexvo, or Glottolog can be integrated in the Lin|gu|is|tik portal.

Annohub-Repository

Within the framework of FID Linguistik, routines were developed for the integration of language resources that are available online and provide metadata as LOD. The process employs Semantic Web technologies and methods of computational linguistics:

A special tool analyses the resources with regard to object language and annotation scheme: It can determine, for example, that a corpus contains German text and its linguistic annotation corresponds to the STTS tag set, or that a dictionary contains Spanish and Catalan and uses the Ontolex vocabulary for the data structure.
The concepts from the determined tag sets / vocabularies and the identified languages are mapped to the BLL subject terms using the links established between the BLL Thesaurus and OLiA, Lexvo, and Glottolog. The tags AUX, PART, and NUM from the Universal Dependencies annotation scheme, for example, correspond to the BLL subject terms Auxiliary, Particle, and Preposition.
The identified BLL subject terms are added to the metadata of the respective language resource.
The results of the analysis together with the aggregated basic formal metadata (title, author, etc.) are stored in a metadata repository (Annohub) established for this purpose. The Annohub dataset is designed and published according to LOD principles (see Annohub RDF Edition).
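
The steps above can be pictured as a lookup followed by a metadata update. The Python sketch below is a strong simplification and not the actual Annohub routine: in the real process the mapping runs via links between the BLL Thesaurus and OLiA, Lexvo, and Glottolog, whereas here two hypothetical dictionaries stand in for those links. The function enrich, the dictionary names, and the sample entries are assumptions for illustration.

```python
# Hypothetical stand-ins for the links between annotation tags or
# language codes and BLL subject terms; the entries are illustrative.
TAG_TO_BLL = {"AUX": "Auxiliary", "PART": "Particle"}
LANGUAGE_TO_BLL = {"deu": "German", "spa": "Spanish"}

def enrich(metadata, detected_tags, detected_languages):
    """Add the BLL subject terms for detected tags and languages to the metadata."""
    terms = set()
    # Tags or languages without a mapping are skipped.
    terms.update(TAG_TO_BLL[tag] for tag in detected_tags if tag in TAG_TO_BLL)
    terms.update(
        LANGUAGE_TO_BLL[code] for code in detected_languages if code in LANGUAGE_TO_BLL
    )
    enriched = dict(metadata)  # leave the original formal metadata untouched
    enriched["keywords"] = sorted(terms)
    return enriched

record = enrich(
    {"title": "Sample corpus", "author": "N. N."},
    detected_tags=["AUX", "PART", "X"],
    detected_languages=["deu"],
)
```

The resulting record combines the formal metadata (title, author) with the derived subject terms, which is the kind of entry that can then be indexed and searched by keyword.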

The Annohub data serves as the basis for the indexing of the resources and their integration in the Lin|gu|is|tik portal.
