A robust feature extraction approach based on an auditory model for classification of speech and expressiveness

Based on an auditory model, the zero-crossings with maximal Teager energy operator (ZCMT) feature extraction approach was described, and then applied to speech and emotion recognition. Three kinds of experiments were carried out. The first kind consists of isolated word recognition experiments in ne...

Journal Title: Journal of Central South University, 2012, Vol.19(2), pp.504-510
Main Author: Sun, Ying
Other Authors: Werner, V. , Zhang, Xue-ying
Format: Electronic Article
Language: English
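Subjects: speech recognition ; emotion recognition ; zero-crossings ; Teager energy operator ; speech database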
ID: ISSN: 2095-2899 ; E-ISSN: 2227-5223 ; DOI: 10.1007/s11771-012-1032-3
Link: http://dx.doi.org/10.1007/s11771-012-1032-3
Staff View
recordid: springer_jour10.1007/s11771-012-1032-3
title: A robust feature extraction approach based on an auditory model for classification of speech and expressiveness
format: Article
creator:
  • Sun, Ying
  • Werner, V.
  • Zhang, Xue-ying
subjects:
  • speech recognition
  • emotion recognition
  • zero-crossings
  • Teager energy operator
  • speech database
ispartof: Journal of Central South University, 2012, Vol.19(2), pp.504-510
description: The zero-crossings with maximal Teager energy operator (ZCMT) feature extraction approach, which is based on an auditory model, is described and then applied to speech and emotion recognition. Three kinds of experiments were carried out. The first consists of isolated-word recognition experiments on neutral (non-emotional) speech. The results show that the ZCMT approach improves recognition accuracy by 3.47% on average compared with the Teager energy operator (TEO), so the ZCMT feature can be considered a noise-robust feature for speech recognition. The second consists of mono-lingual emotion recognition experiments using the Taiyuan University of Technology (TYUT) and Berlin databases. The average recognition rate of the ZCMT approach is 82.19%, which indicates that ZCMT features characterize speech emotions effectively. The third consists of cross-lingual experiments with three languages. The accuracy of the ZCMT approach drops by only 1.45%, which indicates that ZCMT features characterize emotions in a language-independent way.
language: eng
source:
identifier: ISSN: 2095-2899 ; E-ISSN: 2227-5223 ; DOI: 10.1007/s11771-012-1032-3
fulltext: fulltext
issn:
  • 2227-5223
  • 22275223
  • 2095-2899
  • 20952899
url: http://dx.doi.org/10.1007/s11771-012-1032-3
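
The description names two ingredients, zero-crossings and the Teager energy operator, but this record does not give the ZCMT algorithm itself. The following is only a minimal sketch under stated assumptions: it uses the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and pairs each interval between successive upward zero-crossings with the peak Teager energy inside it. The function names are hypothetical, the auditory filterbank stage is omitted, and this should not be read as the authors' implementation.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Defined for interior samples only, so the result has len(x) - 2 values
    and psi[k] corresponds to sample k + 1 of x.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def upward_zero_crossings(x):
    """Indices n where the signal goes from negative to non-negative."""
    return [n for n in range(1, len(x)) if x[n - 1] < 0 <= x[n]]


def max_teo_per_interval(x):
    """Assumed pairing, for illustration only.

    For each pair of successive upward zero-crossings, return a tuple
    (interval length in samples, peak Teager energy inside the interval).
    """
    psi = teager_energy(x)
    zc = upward_zero_crossings(x)
    out = []
    for start, end in zip(zc, zc[1:]):
        # Clamp to the interior samples where psi is defined.
        lo, hi = max(start, 1), min(end, len(x) - 1)
        if lo < hi:
            out.append((end - start, max(psi[lo - 1:hi - 1])))
    return out


# Illustrative input: a sinusoid with amplitude 2 and a period of 16 samples.
signal = [2.0 * math.sin(2 * math.pi * n / 16 - 0.3) for n in range(64)]
print(max_teo_per_interval(signal))
```

For a pure sinusoid A*sin(W*n + p), the discrete operator evaluates to A^2 * sin(W)^2 at every interior sample, so the peak value per interval reflects both amplitude and frequency. The interval length between zero-crossings carries the frequency information separately.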


PrimoNMBib record (fields not shown above)
sourceid: springer_jour
sourcerecordid: 10.1007/s11771-012-1032-3
sourcesystem: PC
galeid: 278277870
type: article
version: 4
lds50: peer_reviewed
general: Springer Science & Business Media B.V. ; SpringerLink
collection: SpringerLink
searchscope: springer_journals_complete
delcategory: Remote Search Resource
addtitle:
  • Journal of Central South University
  • Science & Technology of Mining and Metallurgy
  • J. Cent. South Univ. Technol.
stitle: J. Cent. South Univ. Technol.
genre: article
ristype: JOUR
volume: 19
issue: 2
spage: 504
epage: 510
issn:
  • 2095-2899
  • 10059784
eissn:
  • 2227-5223
  • 19930666
cop: Heidelberg
pub: Central South University
date: 2012-02