Short Description
Estimating arousal, valence, age, gender, big 5 personality traits from audio
Full Description

RESTful webservice developed by Hesam Sagha, Chair of Intelligent and Complex Systems, University of Passau, Germany. Open source code and more info at https://github.com/MixedEmotions/up_emotions_audio

To run this module:
docker run -it --rm -p 8888:8080 audioanalysis

Example:
http://localhost:8888/er/aer/getdims?dims=arousal,valence,gender,age,big5o,big5c,big5e,big5a,big5n&url=http://tv-download.dw.com/dwtv_video/flv/wikoe/wikoe20151114_wiruebli_sd_avc.mp4&timing=9,15;147,152

where:
dims: desired dimensions, separated by commas (arousal,valence,age,gender,big5o,big5c,big5e,big5a,big5n)
url: URL of the video/audio, or the name of an uploaded file
timing: start and end of each segment in seconds, in the form start1,end1;start2,end2
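
The request above can also be assembled programmatically. The following is a minimal Python sketch, not part of the service itself: the helper names format_timing and build_getdims_url are hypothetical, and the check that each segment ends after it starts is an assumption about what the service expects. Commas, semicolons, colons and slashes are left unescaped so that the result has the same shape as the example URL.

```python
from urllib.parse import urlencode


def format_timing(segments):
    """Turn [(start, end), ...] into 'start1,end1;start2,end2'."""
    for start, end in segments:
        # Assumption: a segment must have a non-negative start and end after start.
        if start < 0 or end <= start:
            raise ValueError(f"invalid segment: {start},{end}")
    return ";".join(f"{start},{end}" for start, end in segments)


def build_getdims_url(base, dims, media, segments):
    """Build a getdims request URL (hypothetical helper).

    base     -- e.g. 'http://localhost:8888' (host port from the docker run line)
    dims     -- list of dimension names, e.g. ['arousal', 'valence']
    media    -- URL of the video/audio, or the name of an uploaded file
    segments -- list of (start, end) pairs in seconds
    """
    query = urlencode(
        {
            "dims": ",".join(dims),
            "url": media,
            "timing": format_timing(segments),
        },
        safe=",;:/",  # keep the separators readable, as in the example URL
    )
    return f"{base}/er/aer/getdims?{query}"


if __name__ == "__main__":
    print(
        build_getdims_url(
            "http://localhost:8888",
            ["arousal", "valence", "gender", "age"],
            "http://tv-download.dw.com/dwtv_video/flv/wikoe/wikoe20151114_wiruebli_sd_avc.mp4",
            [(9, 15), (147, 152)],
        )
    )
```

The printed URL can be opened in a browser or fetched with any HTTP client.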

To upload an audio/video file use curl:
Windows: curl -v -H "Content-Type:multipart/form-data" --user meuser -i -X POST -F "file=@D:\path\to\sample.wav" http://localhost:8888/er/aer/upload
Linux: curl -v -H "Content-Type:multipart/form-data" --user meuser -i -X POST -F 'file=@./sample.wav' http://localhost:8888/er/aer/upload
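
For scripts that cannot shell out to curl, the upload can be approximated with the Python standard library. This is a sketch under assumptions: the helper names encode_multipart and build_upload_request are hypothetical, the password is a placeholder (curl prompts for it when only --user meuser is given), and HTTP Basic authentication is assumed because that is what curl sends by default for --user. The form field name "file" and the /er/aer/upload path are taken from the curl commands above.

```python
import base64
import uuid
from urllib.request import Request


def encode_multipart(field, filename, data, boundary=None):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(base, filename, data, user, password):
    """Prepare (but do not send) a POST request to the upload endpoint."""
    body, content_type = encode_multipart("file", filename, data)
    # Assumption: the service accepts HTTP Basic credentials, as curl --user sends.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        f"{base}/er/aer/upload",
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": f"Basic {token}",
        },
    )
```

To send the file, read it in binary mode, pass the bytes to build_upload_request, and hand the returned request to urllib.request.urlopen. The shape of the response is not documented here, so the sketch does not parse it.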


This repository also handles the fusion of audio and video outputs.
To fuse the audio and video results, run:
wget "localhost:8888/er/general/fuse?video=$(cat json_video_plain.txt)&audio=$(cat json_audio_plain.txt)"
The files should contain the following entities.
Note: keep ':time=start,end' in the "@id" section.
See http://localhost:8888/er/general for more information
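
Because the contents of both files are passed inside the query string, any spaces, quotes or ampersands in them need escaping. The following Python sketch builds the fuse URL with standard form encoding. It is an illustration only: the helper names are hypothetical, and it assumes the service decodes percent-encoded query values (including + for spaces) in the usual way. The contents of the two files are passed through unchanged, so the ':time=start,end' part of each "@id" is preserved.

```python
from urllib.parse import urlencode


def build_fuse_url(base, video_json, audio_json):
    """Build a fuse request URL from the two result strings (hypothetical helper)."""
    # urlencode percent-escapes characters that would otherwise break the query.
    query = urlencode({"video": video_json, "audio": audio_json})
    return f"{base}/er/general/fuse?{query}"


def fuse_url_from_files(base, video_path, audio_path):
    """Read both result files and build the fuse URL from their contents."""
    with open(video_path, encoding="utf-8") as fh:
        video_json = fh.read().strip()
    with open(audio_path, encoding="utf-8") as fh:
        audio_json = fh.read().strip()
    return build_fuse_url(base, video_json, audio_json)
```

For example, fuse_url_from_files("http://localhost:8888", "json_video_plain.txt", "json_audio_plain.txt") returns a URL that can be passed to wget or any other HTTP client.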


Licenses:

If you use this module, please cite the following papers:

  • EYBEN, F., WENINGER, F., GROSS, F., AND SCHULLER, B. Recent Developments in openSMILE, the Munich Open-Source Multimedia Feature Extractor. In Proceedings of the 21st ACM International Conference on Multimedia, MM 2013 (Barcelona, Spain, October 2013), ACM, pp. 835–838.
  • SCHMITT, M., RINGEVAL, F., AND SCHULLER, B. At the Border of Acoustics and Linguistics: Bag-of-Audio-Words for the Recognition of Emotions in Speech. In Proceedings INTERSPEECH 2016, 17th Annual Conference of the International Speech Communication Association
Owner
mixedemotions