1. The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
In: Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021), Aug 2021, Online, France, pp. 96-120. DOI: 10.18653/v1/2021.gem-1.10. https://hal.archives-ouvertes.fr/hal-03466171

2. BiSECT: Learning to Split and Rephrase Sentences with Bitexts ...
Abstract: An important task in NLP applications such as sentence simplification is the ability to take a long, complex sentence and split it into shorter sentences, rephrasing as necessary. We introduce a novel dataset and a new model for this "split and rephrase" task. Our BiSECT training data consists of 1 million long English sentences paired with shorter, meaning-equivalent English sentences. We obtain these by extracting 1-2 sentence alignments in bilingual parallel corpora and then using machine translation to convert both sides of the corpus into the same language. BiSECT contains higher-quality training examples than previous Split and Rephrase corpora, with sentence splits that require more significant modifications. We categorize examples in our corpus, and use these categories in a novel model that allows us to target specific regions of the input sentence to be split and edited. Moreover, we show that models trained on BiSECT can perform a wider variety of split operations and improve upon previous ... : 9 pages, 9 figures. Long paper to appear in Empirical Methods in Natural Language Processing 2021 (EMNLP 2021) ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2109.05006 https://dx.doi.org/10.48550/arxiv.2109.05006

4. Pre-train or Annotate? Domain Adaptation with a Constrained Budget ...

6. BiSECT: Learning to Split and Rephrase Sentences with Bitexts ...

7. Sample data for "Design and Collection Challenges of Building an Academic Email Corpus for Linguistics and Computational Research" ...

8. Sample data for "Design and Collection Challenges of Building an Academic Email Corpus for Linguistics and Computational Research" ...

9. Controllable Text Simplification with Explicit Paraphrasing ...

10. The effectiveness of the problem-based learning in medical cell biology education: A systematic meta-analysis
In: Medicine (Baltimore) (2021)

11. Controllable text simplification with explicit paraphrasing

12. An Empirical Study of Pre-trained Transformers for Arabic Information Extraction ...

13. Controllable Text Simplification with Explicit Paraphrasing ...

14. Interactive Grounded Language Acquisition and Generalization in a 2D World ...

15. Interactive Language Acquisition with One-shot Visual Concept Learning through a Conversational Game ...

16. Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents ...

17. A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification ...

18. A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment ...

19. A Continuously Growing Dataset of Sentential Paraphrases ...

20. Spectral Entropy Can Predict Changes of Working Memory Performance Reduced by Short-Time Training in the Delayed-Match-to-Sample Task