
Search in the Catalogues and Directories

Hits 1 – 8 of 8

1
Web Scale NLP: A Case Study on URL Word Breaking
In: http://research.microsoft.com/pubs/144355/URLwordbreaking.pdf (2011)
BASE
2
Web Scale NLP: A Case Study on URL Word Breaking
In: http://www.www2011india.com/proceeding/proceedings/p357.pdf (2011)
BASE
3
Spoken correction for Chinese text entry
In: http://people.csail.mit.edu/jrg/2006/hsu-iscslp06.pdf (2006)
BASE
4
ERD'14: Entity Recognition and Disambiguation Challenge
In: http://sigir.org/files/forum/2014D/p063.pdf
BASE
5
Spoken Correction for Chinese Text Entry
In: http://www.sls.csail.mit.edu/sls/publications/2006/hsu_iscslp06.pdf
BASE
6
Simple and Knowledge-intensive Generative Model for Named Entity Recognition
In: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/nlmm-msr-tr-2013-3.pdf
Abstract: Almost all of the existing work on Named Entity Recognition (NER) consists of the following pipeline stages: part-of-speech tagging, segmentation, and named entity type classification. The requirement of hand-labeled training data on these stages makes it very expensive to extend to different domains and entity classes. Even with a large amount of hand-labeled data, existing techniques for NER on informal text, such as social media, perform poorly due to a lack of reliable capitalization, irregular sentence structure and a wide range of vocabulary. In this paper, we address the lack of hand-labeled training data by taking advantage of weak supervision signals. We present our approach in two parts. First, we propose a novel generative model that combines the ideas from Hidden Markov Model (HMM) and n-gram language models into what we call an N-gram Language Markov Model (NLMM). Second, we utilize large-scale weak supervision signals from sources such as Wikipedia titles and the corresponding click counts to estimate parameters in NLMM. Our model is simple and can be implemented without the use of Expectation Maximization or other expensive iterative training techniques. Even with this simple model, our approach to NER on informal text outperforms existing systems trained on formal English and matches state-of-the-art NER systems trained on hand-labeled Twitter messages. Because our model does not require hand-labeled data, we can adapt our system to other domains and named entity classes very easily. We demonstrate the flexibility of our approach by successfully applying it to the different domain of extracting food dishes from restaurant reviews with very little extra work.
URL: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1062.1778
BASE
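The NLMM idea sketched in the abstract above (an HMM-style decoder whose emission probabilities come from per-class n-gram counts built out of weakly supervised phrase lists) can be illustrated roughly as follows. This is a minimal toy sketch, not the paper's actual model: the `LEXICONS` table, the add-one smoothing, and the fixed `switch_penalty` are invented stand-ins for the Wikipedia-title and click-count statistics the abstract mentions.

```python
import math

# Toy weak-supervision "lexicons": phrase -> pseudo-count. These are invented
# stand-ins for Wikipedia titles weighted by click counts, the kind of signal
# the abstract describes; they are NOT data from the paper.
LEXICONS = {
    "PERSON":   {"barack obama": 900, "taylor swift": 800},
    "LOCATION": {"new york": 950, "san francisco": 700},
    "O":        {"went": 500, "to": 2000, "see": 400, "in": 1800},
}

def class_logprob(cls, phrase):
    """Log-probability of a phrase under a class's count table, with add-one
    smoothing so unseen phrases are merely unlikely, not impossible."""
    counts = LEXICONS[cls]
    total = sum(counts.values()) + len(counts) + 1
    return math.log((counts.get(phrase, 0) + 1) / total)

def decode(tokens, max_span=3, switch_penalty=math.log(0.1)):
    """Jointly segment `tokens` into phrases and label each phrase with the
    class that best explains it: a Viterbi search over (position, class)
    states, in the spirit of an HMM whose emissions are language models."""
    n = len(tokens)
    # best[i][cls] = (score, backpointer) for the best labeling of tokens[:i]
    # whose last phrase has class `cls`.
    best = [dict() for _ in range(n + 1)]
    best[0] = {cls: (0.0, None) for cls in LEXICONS}
    for i in range(1, n + 1):
        for j in range(max(0, i - max_span), i):
            phrase = " ".join(tokens[j:i]).lower()
            for cls in LEXICONS:
                emit = class_logprob(cls, phrase)
                for prev_cls, (prev_score, _) in best[j].items():
                    trans = 0.0 if prev_cls == cls else switch_penalty
                    score = prev_score + trans + emit
                    if cls not in best[i] or score > best[i][cls][0]:
                        best[i][cls] = (score, (j, prev_cls, phrase))
    # Trace the best final state back into (phrase, class) segments.
    cls = max(best[n], key=lambda c: best[n][c][0])
    segments, i = [], n
    while i > 0:
        _, (j, prev_cls, phrase) = best[i][cls]
        segments.append((phrase, cls))
        i, cls = j, prev_cls
    return list(reversed(segments))
```

Under these toy counts, `decode("Barack Obama went to New York".split())` returns `[("barack obama", "PERSON"), ("went", "O"), ("to", "O"), ("new york", "LOCATION")]`: the decoder prefers grouping "barack obama" as one PERSON phrase because the phrase-level count dominates the smoothed single-token alternatives.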
7
Spoken Correction for Chinese Text Entry
In: http://people.csail.mit.edu/bohsu/SpokenCorrectionForChineseTextEntry2006.pdf
BASE
8
Sampling Representative Phrase Sets for Text Entry Experiments: A Procedure and Public Resource
In: http://research.microsoft.com/%7Etimpaek/Papers/chi2011_paek_hsu.pdf
BASE

Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 8
© 2013 - 2024 Lin|gu|is|tik