Inventi Impact
Audio, Speech & Music Processing



Journal Scope
Inventi Rapid/Impact: Audio, Speech & Music Processing is a peer-reviewed journal of Engineering & Technology. It publishes experimental and theoretical papers on engineering technologies related to audio, speech and music. The fields covered include sound engineering, recording, electronic production of speech and music, and digitization of sound.



IMPROVEMENT OF MULTIMODAL GESTURE AND SPEECH RECOGNITION PERFORMANCE USING TIME INTERVALS BETWEEN GESTURES AND ACCOMPANYING SPEECH
Madoka Miki, Norihide Kitaoka, Chiyomi Miyajima, Takanori Nishino, Kazuya Takeda

We propose an integrative method for recognizing gestures, such as pointing, that accompany speech. Speech produced simultaneously with gestures can assist in the recognition of gestures, and because the two are complementary, gestures can also assist in the recognition of speech. Our integrative recognition method uses a probability distribution that models the time interval between the starting times of gestures and of the corresponding utterances. We evaluate the rate of improvement of the proposed integrative recognition method with a task involving the solution of a geometry problem....
A PRELIMINARY DEMONSTRATION OF EXEMPLAR-BASED VOICE CONVERSION FOR ARTICULATION DISORDERS USING AN INDIVIDUALITY-PRESERVING DICTIONARY
Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki

We present in this paper a voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. The movement of such speakers is limited by their athetoid symptoms, and their consonants are often unstable or unclear, which makes it difficult for them to communicate. In this paper, exemplar-based spectral conversion using nonnegative matrix factorization (NMF) is applied to a voice with an articulation disorder. To preserve the speaker’s individuality, we use an individuality-preserving dictionary that is constructed from the source speaker’s vowels and the target speaker’s consonants. Using this dictionary, we can create a natural and clear voice that preserves the speaker’s individuality. Experimental results indicate that the performance of NMF-based VC is considerably better than that of conventional GMM-based VC....
PRODUCTION AND PERCEPTION OF VELAR STOP (DE)VOICING IN EUROPEAN PORTUGUESE AND ITALIAN
Daniel Pape, Luis M T Jesus

Speech production and speech perception studies were conducted to compare (de)voicing in the Romance languages European Portuguese (EP) and Italian. For the speech production part, velar stops in two positions and four vowel contexts were recorded. The voicing status for 10 consecutive landmarks during stop closure was computed. Results showed that during the complete stop closure voicing was always maintained for Italian, and that for EP, there was strong devoicing for all vowel contexts and positions. Both language and vowel context had a significant effect on voicing during stop closure. The duration values and voicing patterns from the production study were then used as input factors to a follow-up perceptual experiment to test the effects of vowel duration, stop duration and voicing maintenance on voicing perception by EP and Italian listeners. Perceptual stimuli (VCV) were generated using biomechanical modelling so that they would include physically realistic transitions between phonemes. The consonants were velar stops, with no burst or noise included in the signal. A strong language dependency of the three factors on listeners' voicing distinction was found, with high sensitivity for all three cues for EP listeners and low sensitivity for Italian listeners. For EP stimuli with high voicing maintenance during stop closure, this cue was very strong and overrode the other two acoustic cues. However, for stimuli with low voicing maintenance (i.e. highly devoiced stimuli), the acoustic cues vowel duration and stop duration took over. Even in the absence of both voicing maintenance during stop closure and a burst, the acoustic cues vowel duration and stop duration guaranteed a stable voicing distinction in EP. Italian listeners were insensitive to all three acoustic cues examined in this study, with stable voiced responses throughout all of the varying fully crossed factors.
None of the examined acoustic cues appeared to be used by Italian listeners to obtain a robust voicing distinction, thus pointing to the use of other acoustic cues or a combination of other cues to guarantee a stable voicing distinction in this language....

E-ISSN: 2250-2912
P-ISSN: Awaited





Frequency: Quarterly


Abstracted/Indexed in: Ulrich’s International Periodicals Directory, Google Scholar, SCIRUS