1 |
Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources
In: https://hal.inria.fr/hal-03550289 ; 2022 (2022)
BASE
2 |
Linguistic resources for paraphrase generation in Portuguese: a Lexicon-Grammar approach
In: ISSN: 1574-020X ; EISSN: 1574-0218 ; Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-03548861 ; Language Resources and Evaluation, Springer Verlag, 2022, ⟨10.1007/s10579-021-09561-5⟩ ; https://link.springer.com/article/10.1007/s10579-021-09561-5 (2022)
BASE
3 |
Construction of the "Minutes of the National Diet" Package for the Full-Text Search System "Himawari"
Abstract:
Spoken Language Division, Research Department, NINJAL ; This paper presents a method for constructing a language resource package of the Minutes of the National Diet of Japan for the full-text search system "Himawari", based on text data stored in the Full-Text Database System for the Minutes of the Diet, and reports the results of the construction. The package contains 11,106 minutes (about 449 million characters) of plenary sessions and budget committee meetings of the House of Representatives and the House of Councillors held from the 1st Diet session (1947) to the 182nd (2012). The package is designed for analyzing temporal changes in linguistic expressions, and information on the meetings, the speakers, and the document structure of the minutes is annotated in XML. The paper first describes the design of the XML tags and a method for automatically annotating the minutes with them using notational clues in the original text. It then applies this method to construct the package and presents basic statistics on the minutes it contains. Finally, the paper demonstrates the usefulness of the package by showing how it can be used (a) to extract expressions that show large temporal changes and (b) to investigate the factors behind those changes.
Keyword:
full-text search system "Himawari"; language resource; temporal change analysis; The Minutes of the National Diet of Japan
URL: http://id.nii.ac.jp/1328/00003521/ https://repository.ninjal.ac.jp/?action=repository_uri&item_id=3538 https://repository.ninjal.ac.jp/?action=repository_action_common_download&item_id=3538&item_no=1&attribute_id=54&file_no=1
BASE
4 |
Machine Learning approaches for Topic and Sentiment Analysis in multilingual opinions and low-resource languages: From English to Guarani
BASE
5 |
Investigating alignment interpretability for low-resource NMT
In: ISSN: 0922-6567 ; EISSN: 1573-0573 ; Machine Translation ; https://hal.archives-ouvertes.fr/hal-03139744 ; Machine Translation, Springer Verlag, 2021, ⟨10.1007/s10590-020-09254-w⟩ (2021)
BASE
6 |
The Zero Resource Speech Challenge 2021: Spoken language modelling
In: ISSN: 0162-8828 ; IEEE Transactions on Pattern Analysis and Machine Intelligence ; https://hal.inria.fr/hal-03329301 ; IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2021, pp.1-1. ⟨10.1109/TPAMI.2021.3083839⟩ (2021)
BASE
7 |
The Zero Resource Speech Challenge 2021: Spoken language modelling
In: Interspeech 2021 - Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-03329301 ; Interspeech 2021 - Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic. ⟨10.1109/TPAMI.2021.3083839⟩ (2021)
BASE
8 |
Cross-lingual Representation Learning for Natural Language Processing
BASE
9 |
Sign Language in Light of Mathematics Education: An Exploration Within Semiotic and Embodiment Theories of Learning Mathematics
In: American Annals of the Deaf, vol 166, iss 3 (2021)
BASE
10 |
Multitask Transformer Model-based Fintech Customer Service Chatbot NLU System with DECO-LGG SSP-based Data
In: Annual Conference on Human and Language Technology ; https://hal.archives-ouvertes.fr/hal-03603903 ; Annual Conference on Human and Language Technology, Oct 2021, Seoul, South Korea. pp.461-466 ; http://www.koreascience.or.kr/journal/OOGHAK.page (2021)
BASE
11 |
On Multi-domain Sentence Level Sentiment Analysis for Roman Urdu ...
BASE
12 |
From 'big' to 'much': On the grammaticalization of two gradable adjectives in Swedish ...
BASE
14 |
Creating Corpus-Informed Materials for the English as a Foreign Language Classroom: A step-by-step guide for (trainee) teachers using online resources ...
BASE