Inventi Rapid
Audio, Speech & Music Processing



Journal Scope
Inventi Rapid/Impact: Audio, Speech & Music Processing is a peer-reviewed journal in Engineering & Technology. It publishes experimental and theoretical papers on engineering technologies related to audio, speech, and music. The journal covers sound engineering, recording, electronic production of speech and music, and digitization of sound.



COMBINED PERCEPTION AND CONTROL FOR TIMING IN ROBOTIC MUSIC PERFORMANCES
Umut Simsekli, Orhan Sonmez, Baris Kurt, Ali Taylan Cemgil

Interaction with human musicians is a challenging task for robots as it involves online perception and precise synchronization. In this paper, we present a consistent and theoretically sound framework for combining perception and control for accurate musical timing. For the perception, we develop a hierarchical hidden Markov model that combines event detection and tempo tracking. The robot performance is formulated as a linear quadratic control problem that is able to generate a surprisingly complex timing behavior in adapting the tempo. We provide results with both simulated and real data. In our experiments, a simple Lego robot percussionist accompanied the music by detecting the tempo and position of clave patterns in the polyphonic music. The robot successfully synchronized itself with the music by quickly adapting to the changes in the tempo...
MULTI-CANDIDATE MISSING DATA IMPUTATION FOR ROBUST SPEECH RECOGNITION
Yujun Wang, Hugo Van hamme

The application of Missing Data Techniques (MDT) to increase the noise robustness of HMM/GMM-based large vocabulary speech recognizers is hampered by a large computational burden. The likelihood evaluations imply solving many constrained least squares (CLSQ) optimization problems. As an alternative, researchers have proposed frontend MDT or have made oversimplifying independence assumptions for the backend acoustic model. In this article, we propose a fast Multi-Candidate (MC) approach that solves the per-Gaussian CLSQ problems approximately by selecting the best from a small set of candidate solutions, which are generated as the MDT solutions on a reduced set of cluster Gaussians. Experiments show that the MC MDT runs as fast as the uncompensated recognizer while achieving the accuracy of the full backend optimization approach. The experiments also show that exploiting the more accurate acoustic model of the backend does pay off in terms of accuracy when compared to frontend MDT...
SPEAKER DIARIZATION OF BROADCAST NEWS IN ALBAYZIN 2010 EVALUATION CAMPAIGN
Martin Zelenak, Henrik Schulz, Javier Hernando

In this article, we present the evaluation results for the task of speaker diarization of broadcast news, which was part of the Albayzin 2010 evaluation campaign of language and speech technologies. The evaluation data consists of a subset of the Catalan broadcast news database recorded from the 3/24 TV channel. The description of five submitted systems from five different research labs is given, marking the common as well as the distinctive system features. The diarization performance is analyzed in the context of the diarization error rate, the number of detected speakers and also the acoustic background conditions. An effort is also made to put the achieved results in relation to the particular system design features....

E-ISSN: Awaited





Frequency: Quarterly


RI Factor: 1.0
Abstracted/Indexed in: Ulrich’s International Periodical Directory, Google Scholar, SCIRUS