
Search in the Catalogues and Directories

Hits 161 – 167 of 167

161
Parsing with PCFGs and automatic f-structure annotation
In: Cahill, Aoife orcid:0000-0002-3519-7726, McCarthy, Mairéad, van Genabith, Josef orcid:0000-0003-1322-7944 and Way, Andy orcid:0000-0001-5736-5930 (2002) Parsing with PCFGs and automatic f-structure annotation. In: LFG02 - 7th International Lexical Functional Grammar Conference, 3-5 July, 2002, Athens, Greece. ISSN 1098-6782 (2002)
BASE
162
Glue, Underspecification and Translation
In: Crouch, Dick, Frank, Anette and van Genabith, Josef orcid:0000-0003-1322-7944 (2002) Glue, Underspecification and Translation. In: Bunt, Harry, Muskens, Reinhard and Thijsse, Elias (eds.) Computing Meaning, Volume 2. Studies in Linguistics and Philosophy, 77. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 165-184. ISBN 978-1402004513 (2002)
BASE
163
TTS – A Treebank Tool Suite
In: Cahill, Aoife orcid:0000-0002-3519-7726 and van Genabith, Josef orcid:0000-0003-1322-7944 (2002) TTS – A Treebank Tool Suite. In: The Third International Conference on Language Resources and Evaluation, May 27th – June 2nd, 2002, Las Palmas de Gran Canaria, Spain. (2002)
BASE
164
Metaphors, Logic, and Type Theory
In: Metaphor and symbol. - Philadelphia : Routledge, Taylor & Francis Group 16 (2001) 1, 43-58
OLC Linguistik
165
Linear logic-based semantics construction for LTAG
In: Frank, Anette and van Genabith, Josef orcid:0000-0003-1322-7944 (2001) Linear logic-based semantics construction for LTAG. In: The 6th International Lexical-Functional Grammar Conference, LFG'2001, 25-27 June 2001, The University of Hong Kong, Hong Kong. (2001)
BASE
166
Treebank vs. xbar-based automatic f-structure annotation
In: van Genabith, Josef orcid:0000-0003-1322-7944, Frank, Anette and Way, Andy orcid:0000-0001-5736-5930 (2001) Treebank vs. xbar-based automatic f-structure annotation. In: LFG01 - 6th International Lexical Functional Grammar Conference, 25-27 June 2001, Hong Kong. ISSN 1098-6782 (2001)
Abstract: Manual, large-scale (computational) grammar development is time-consuming and expensive, and it requires considerable linguistic expertise. More recently, a number of alternatives based on treebank resources (such as Penn-II, Susanne, AP treebank) have been explored. The idea is to automatically "induce", or rather read off, (P)CFG grammars from the parse-annotated treebank resources and to use the treebank grammars thus obtained in (probabilistic) parsing or as a starting point for further grammar development. The approach is cheap, fast, automatic, large-scale, "data-driven" and based on real language resources. Treebank grammars typically involve large sets of lexical tags and non-lexical categories, as syntactic information tends to be encoded in monadic category symbols. They feature flat rules (trees) that can "underspecify" attachment possibilities. Treebank grammars do not in general follow X-bar architectural design principles (this is not to say that treebank grammars do not have design principles). As a consequence, treebank grammars tend to have very large CFG rule bases (e.g. Penn-II: more than 17,000 CFG rules for about 1 million words of text), often with only minimally differing rules. Even though treebank grammars are large, they are still incomplete, exhibiting unabated rule accession rates. From a grammar engineering point of view, the size of the rule base poses problems for maintainability, extensibility and, if a treebank grammar is to be used as a CF base in an LFG grammar, for functional (feature-structure) annotations. From the point of view of theoretical linguistics, flat treebank trees and treebank grammars extracted from such trees do not express linguistic generalisations. From the perspective of empirical and corpus linguistics, flat trees are well motivated, as they allow underspecification of subtle and often time-consuming attachment decisions.
Indeed, it is sometimes doubted whether highly general X-bar schemata usefully scale to "real" language. In previous work we developed methodologies for automatic feature-structure annotation of grammars extracted from treebanks. Automatic annotation of "raw" treebank grammars is difficult because annotation rules often need to identify subsequences in the right-hand sides (RHSs) of flat treebank rules as they explicitly encode head, complement and modifier relations. X-bar-based CFG rules should substantially facilitate automatic feature-structure annotation of grammar rules. In the present paper we conduct a number of experiments to explore a space of possible grammars based on a small fragment of the AP treebank resource. Starting with the original treebank fragment, we automatically extract a CFG G. We then apply an automatic structure-preserving grammar compaction step, which generalises categories in the original treebank fragment and reduces the number of rules extracted, resulting in a generalised treebank fragment and in a compacted grammar Gc. The generalised fragment is then manually corrected to catch missed constituents (and the like), resulting in an automatically extracted, compacted and (effectively manually) corrected grammar Gc,m. Manual correction proceeds in the "spirit" of treebank grammars (we do not introduce X-bar analyses). We then explore how many of the manual correction steps on treebank trees can be achieved automatically. We develop, implement and test an automatic treebank "grooming" methodology, which is applied to the generalised treebank fragment to yield a compacted and automatically corrected grammar Gc,a. Grammars Gc,m and Gc,a are very similar to compiled-out "flat" LFG-82 style grammars. We explore regular-expression-based compaction (both manual and automatic) to relate Gc,m to an LFG-82 style grammar design.
Finally, we manually recode a subsection of the generalised and manually corrected treebank fragment into "vanilla-flavour" X-bar-based trees. From these we extract a compacted, manually corrected, X-bar-based grammar Gc,m,x. We evaluate our grammars and methods using standard labelled bracketing measures and according to how well they perform in automatic feature-structure annotation tasks.
Keyword: Machine translating
URL: http://doras.dcu.ie/15349/
BASE
167
Experiments in Structure-Preserving Grammar Compaction
In: Hepple, Mark and van Genabith, Josef orcid:0000-0003-1322-7944 (2000) Experiments in Structure-Preserving Grammar Compaction. In: 1st Meeting on Speech Technology Transfer, 6-10 Nov 2000, Universidad de Sevilla and Universidad de Granada, Sevilla, Spain. (2000)
BASE
