41 |
A Visual Context-Aware Multimodal System for Spoken Language Processing
In: http://www.media.mit.edu/cogmac/publications/niloy-euro03.pdf (2003)
BASE
|
42 |
Augmenting User Interfaces with Adaptive Speech Commands
In: http://www.media.mit.edu/cogmac/publications/jfig_icmi03.pdf (2003)
|
43 |
Augmenting User Interfaces with Adaptive Speech Commands
In: http://web.media.mit.edu/~pgorniak/jfig.pdf (2003)
|
44 |
Learning Word Meanings and Descriptive Parameter Spaces from Music
In: http://www.media.mit.edu/cogmac/publications/whitman03learning.pdf (2003)
|
45 |
Grounded spoken language acquisition: Experiments in word learning
In: http://web.media.mit.edu/~dkroy/papers/pdf/roy_2003.pdf (2003)
|
46 |
Grounded Spoken Language Acquisition: Experiments in Word Learning
In: http://www.media.mit.edu/cogmac/publications/ieee_multimedia_2003.pdf (2003)
|
48 |
Learning Visually Grounded Words and Syntax of Natural Spoken Language
In: http://web.media.mit.edu/~dkroy/papers/pdf/ec.pdf (2002)
|
49 |
Grounded Spoken Language Acquisition: Experiments in Word Learning
In: http://www.media.mit.edu/cogmac/ieee_mm_2002.pdf (2002)
|
51 |
Learning Words from Sights and Sounds: A Computational Model
In: http://web.media.mit.edu/~dkroy/papers/pdf/cogsci_v2.pdf (2001)
|
52 |
Learning visually grounded words and syntax of natural spoken language
In: http://web.media.mit.edu/~dkroy/papers/pdf/roy_2000_2001.pdf (2001)
|
53 |
Integration Of Speech And Vision Using Mutual Information
In: http://vismod.www.media.mit.edu/people/dkroy/papers/pdf/icassp2000.pdf (2000)
|
54 |
Integration of speech and vision using mutual information
In: http://www.icsi.berkeley.edu/~dpwe/research/etc/icassp2000/pdf/143_717.PDF (2000)
|
55 |
Learning Words from Sights and Sounds: A Computational Model
In: http://www.media.mit.edu/cogmac/cogsci_2002.pdf (2000)
|
56 |
A Computational Model of Word Learning from Multimodal Sensory Input
In: http://vismod.www.media.mit.edu/~dkroy/papers/pdf/iccm2000.pdf (2000)
Abstract:
How do infants segment continuous streams of speech to discover the words of their language? Current theories emphasize the role of acoustic evidence in discovering word boundaries (Cutler 1991; Brent 1999; de Marcken 1996; Friederici & Wessels 1993; see also Bolinger & Gerstman 1957). To test an alternative hypothesis, we recorded natural infant-directed speech from caregivers engaged in play with their pre-linguistic infants, centered on common objects. We also recorded the visual context in which the speech occurred by capturing images of these objects. We analyzed the data using two computational models: one processed only the acoustic recordings, and the other integrated acoustic and visual input. Both models were implemented using standard speech and vision processing techniques, enabling them to process sensory data. We show that using visual context in conjunction with spoken input dramatically improves learning compared with using acoustic evidence alone.
URL: http://vismod.www.media.mit.edu/~dkroy/papers/pdf/iccm2000.pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.4780
|
57 |
Learning Visually Grounded Words and Syntax of Natural Spoken Language
In: http://www.media.mit.edu/cogmac/evol_comm_2002.pdf (2000)
|
58 |
Learning Words From Natural Audio-Visual Input
In: http://vismod.www.media.mit.edu/~dkroy/papers/Postscript/icslp98.ps.Z (1999)
|
59 |
Learning Words From Natural Audio-Visual Input
In: http://vismod.www.media.mit.edu/~dkroy/papers/pdf/icslp98.pdf (1999)
|
60 |
Multimodal Adaptive Interfaces
In: ftp://whitechapel.media.mit.edu/pub/tech-reports/TR-438.ps.Z (1997)