Multimodal speech algorithms and applications

Title: Multimodal speech algorithms and applications

By: Xavier Anguera (Telefonica Research - ES)

Date: Fri, 20 May 2011, 14h00

Room: DI seminars room
Host: Multimodal Systems

More info: http://citi.di.fct.unl.pt/seminar/seminar.php?id=185

Abstract:

In this talk I cover three of the latest research projects I have been leading at Telefonica over the last few months. The first project, titled "spoken wordclouds", uses pattern-matching algorithms to automatically discover acoustic repetitions in speech recordings and then clusters them to obtain a summary of the recording, much as a wordcloud does with text. In the second project I present the efforts I am leading in the field of multimodal video-copy detection, used, for example, to detect infringing use of copyrighted multimedia material. Last, the project called "spoken ebooks" proposes a method to synchronize an ebook with its corresponding audiobook so that the two can be played back in sync for the user. Time permitting, I will show a live demo of this project on an iPad.
