Prosodic alignment toward emotionally expressive speech: Comparing human and Alexa model talkers
Michelle Cohn, Kristin Predeck, Melina Sarian, Georgia Zellou
Published 2021 in Speech Communication
ABSTRACT
This study tests whether individuals vocally align toward emotionally expressive prosody produced by two types of interlocutors: a human and a voice-activated artificially intelligent (voice-AI) assistant. Participants completed a word shadowing experiment of interjections (e.g., “Awesome”) produced in emotionally neutral and expressive prosodies by both a human voice and a voice generated by a voice-AI system (Amazon’s Alexa). Results show increases in participants’ word duration, mean f0, and f0 variation in response to emotional expressiveness, consistent with increased alignment toward a general ‘positive-emotional’ speech style. Small differences in emotional alignment by talker category (human vs. voice-AI) parallel the acoustic differences in the model talkers’ productions, suggesting that participants are mirroring the acoustics they hear. The similar responses to emotion in both a human and a voice-AI talker support accounts of unmediated emotional alignment, as well as computer personification: people apply emotionally mediated behaviors to both types of interlocutors. While there were small differences in magnitude by participant gender, the overall patterns were similar for women and men, supporting a nuanced picture of emotional vocal alignment.
PUBLICATION RECORD
- Publication year
2021
- Venue
Speech Communication
- Publication date
2021-10-01
- Fields of study
Linguistics, Psychology, Computer Science
- Source metadata
Semantic Scholar