Research Theme

My research goal is to extract knowledge from acoustic information (a.k.a. acoustic features). Examples of this theme are speech emotion recognition, abnormal sound detection, and audio classification. The research can also be extended to vibration signals. My approach to achieving this goal is defined by: (1) a data-driven approach (instead of physical modeling), (2) a focus on practical implementation (not necessarily following human mechanisms), and (3) robustness (how stable/consistent the model is under perturbations, rather than correctness alone). For me, science should be evidence-based, implementable, and consistent. My research is result-oriented instead of process-oriented. This doesn't mean that the process (physical phenomena, modeling, math, and algorithms) is not important; if we understand the process very well, the solution may appear by itself. Still, there must be a reason (rationale) for doing such research. Then, I judge my research mainly based on the results. My research contributes to developing technologies to solve issues in Society 5.0 (What is Society 5.0? Read here in Indonesian).

[Figure: research_concept_iot — research concept for IoT]

The following are research themes that I offer, particularly (but NOT limited) to Engineering Physics students at ITS.

At the undergraduate level, I will try to provide a baseline method, and you will improve the results using your proposed method.

  1. Speech emotion recognition using a multilayer perceptron with CCC loss (a sketch of the CCC loss is given after this list), dataset: IEMOCAP
  2. Indonesian speech recognition using Wav2Vec2/HuBERT/WavLM/UniSpeech-SAT, etc. (an inference sketch is given after this list)
  3. Toward universal acoustic features for multi-corpus speech emotion recognition, 30+ datasets
  4. Predicting Alzheimer's disease using speech analysis
  5. Development of Calfem-Python
  6. Development of Vibration Toolbox
  7. Abnormal sound detection for predictive maintenance (the method is from you/your idea), dataset: DCASE
  8. Indonesian emotional speech synthesis using FastSpeech
  9. COVID-19 diagnosis using COUGH sound with deep learning
  10. COVID-19 diagnosis using SPEECH sound with deep learning, dataset: ComParE CSS 2021
  11. Predicting pathological voice disorders with speech processing techniques, datasets: SVD, VOICED, HUPA
  12. Detecting the emotion intensity of non-speech sounds (laughter, crying, etc.)
  13. Detecting/predicting stuttering (Indonesian: gagap) in speech with machine learning
  14. Predicting the intensities of seven self-reported emotions (Adoration, Amusement, Anxiety, Disgust, Empathic Pain, Fear, Surprise) from user-generated reactions to emotionally evocative videos
  15. Few-shot learning on acoustic data to capture 10 dimensions of emotion reliably perceived in distinct vocal bursts: Awe, Excitement, Amusement, Awkwardness, Fear, Horror, Distress, Triumph, Sadness, and Surprise
  16. Multimodal learning (audio+video+text) to capture 10 dimensions of emotion reliably perceived in distinct vocal bursts: Awe, Excitement, Amusement, Awkwardness, Fear, Horror, Distress, Triumph, Sadness, and Surprise
  17. Inferring self-reported emotion from multimodal expression, using multi-output regression to predict fine-grained self-report annotations of seven 'in-the-wild' emotional experiences
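
For topic 1, the CCC (concordance correlation coefficient) loss is the key ingredient. Below is a minimal PyTorch sketch of a CCC-based loss, assuming 1-D tensors of predicted and gold dimensional labels (e.g., valence or arousal) over a batch; it is illustrative only, not the exact formulation used in my papers.

```python
import torch

def ccc_loss(pred: torch.Tensor, gold: torch.Tensor) -> torch.Tensor:
    """CCC loss = 1 - CCC, where
    CCC = 2*cov(pred, gold) / (var(pred) + var(gold) + (mean(pred) - mean(gold))^2).
    Both inputs are 1-D tensors for one emotion dimension over a batch."""
    pred_mean, gold_mean = pred.mean(), gold.mean()
    covariance = ((pred - pred_mean) * (gold - gold_mean)).mean()
    pred_var = pred.var(unbiased=False)
    gold_var = gold.var(unbiased=False)
    ccc = 2.0 * covariance / (pred_var + gold_var + (pred_mean - gold_mean) ** 2)
    return 1.0 - ccc  # minimize this to maximize agreement with the gold labels
```

During training, this loss simply replaces MSE; the rest of the pipeline (acoustic feature extraction from IEMOCAP and the multilayer perceptron regressor itself) is up to you.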
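
For topic 2, inference with a pretrained Wav2Vec2 CTC model via the Hugging Face transformers library looks roughly as follows. The checkpoint name below (facebook/wav2vec2-base-960h, an English model) and the file example.wav are placeholders; for the actual project you would load or fine-tune an Indonesian checkpoint instead.

```python
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Placeholder English checkpoint; swap in an Indonesian (fine-tuned) model for the project.
model_name = "facebook/wav2vec2-base-960h"
processor = Wav2Vec2Processor.from_pretrained(model_name)
model = Wav2Vec2ForCTC.from_pretrained(model_name)

# Load the audio (hypothetical file) and resample to the 16 kHz rate expected by Wav2Vec2.
waveform, sample_rate = torchaudio.load("example.wav")
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000).squeeze(0)

# Run CTC inference and greedily decode the predicted transcript.
inputs = processor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```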

Other topics/themes:

Read my papers. Usually, I write down the remaining tasks as future work in each paper. At the master's level, you can also propose your own research theme. Contact me by email for details.

Typical Timeline:

[Figure: Timeline for undergraduate research]

The timeline is ideal for undergraduates (S1); it can be adapted for master's (S2) and PhD (S3, with three years of research from the beginning) students.

Contact email: bagus[at]ep.its.ac.id