Prosodic addressee-detection: Ensuring privacy in always-on spoken dialog systems

Link:
Author:
Publisher/Institution:
Association for Computing Machinery (ACM)
Year of publication:
2020
Media type:
Text
Keywords:
  • Turn-Taking
  • Spoken Dialogue Systems
  • Backchannel
  • Speech
  • Speech Recognition
  • Models
  • computational paralinguistics
  • recurrent neural network
  • fundamental frequency variation
  • addressee detection
  • complexity-identical human-computer interaction
Description:
  • We analyze the addressee detection task in complexity-identical dialog, covering both human conversation and device-directed speech. Our recurrent neural model performs at least as well as humans, who struggle with this task; this holds even for native speakers, who benefit from the relevant linguistic skills. We perform ablation experiments on the features used by our model and show that fundamental frequency variation is the single most relevant feature class. We therefore conclude that future systems can detect whether they are being addressed based only on speech prosody, which does not (or only to a very limited extent) reveal the content of conversations not intended for the system.
License:
  • info:eu-repo/semantics/closedAccess
Source system:
Research information system of UHH

Internal metadata
Source record
oai:www.edit.fis.uni-hamburg.de:publications/8c4c242d-8544-4128-a4fc-25a451ee8264