Tiantian Feng
SAIL Lab, University of Southern California. tiantiaf@usc.edu
This amazing picture was taken by my wife on the Skyline Trail, Mount Rainier, in 2023.
PS: Prepare enough water for this trail, but make sure you are not the one carrying it.
(I am in a unique parental leave situation and will have limited bandwidth to respond to emails until October/November. I apologize if I cannot reply to your email in a timely manner.)
My name is Tiantian Feng, and I grew up in Leshan, China, which is famous for the Leshan Giant Buddha. I completed my Ph.D. in the Thomas Lord Department of Computer Science at the University of Southern California in 2024, and I am currently a postdoctoral researcher in the SAIL Lab at USC. I am fortunate to be advised by Professor Shrikanth Narayanan, a globally recognized scientist in speech modeling, linguistics, affective computing, and human understanding.
My research focuses on leveraging sensors and computational methods to understand natural human behaviors in healthcare applications across the life span. Notably, I have been involved in projects studying children with autism, aging populations with cognitive decline, mental health (e.g., postpartum depression), people with hearing loss, and the workplace behavior of hospital professionals. My vision is to develop full-stack AI technology that spans sensing, learning, understanding, and assisting humans in a wide range of healthcare-centered applications.
I also have a particular interest in building technology that is private and broadly accessible. I have focused on developing datasets and benchmarks that are shareable across different communities. My research involves applications such as speech understanding, wearable computing, multimodal understanding, and bio-signal processing. Additionally, I have hands-on experience in industrial sensor design and in deploying sensors in research studies. I have interned at both Meta and Amazon as a research scientist.
I am actively looking for tenure-track or similar positions in academia. Don’t hesitate to contact me if you would like to collaborate.
News
| Jan 20, 2026 | Four of my co-authored papers have been accepted to ICASSP! The topics include speech privacy, speech synthesis, speech encoding, and applications of speech emotion recognition. |
|---|---|
| Jan 10, 2026 | My preprint paper VoxCog: Towards End-to-End Multilingual Cognitive Impairment Classification through Dialectal Knowledge is now available online. |
| Dec 31, 2025 | My co-authored paper Speech acoustics to rt-MRI articulatory dynamics inversion with video diffusion model has been published in Computer Speech & Language (2025). |
| Dec 10, 2025 | I will give a talk on “Developing Robust Speaker Diarization for Child-Adult Dyadic Interaction” at the ASRU 2025 Satellite Workshop on AI for Children’s Speech and Language! |
| Nov 23, 2025 | Our Voxlect benchmark has been accepted to the 2026 KDD Dataset and Benchmark Track. Congratulations to all co-authors! |
| Nov 14, 2025 | I will be giving an invited talk, “Toward Human-Centered Computing for Behavioral Understanding in Healthcare Applications,” at The Ohio State University (Talk Details)! |
| Aug 20, 2025 | Seven papers have been accepted and will be presented at INTERSPEECH 2025! The topics include speech datasets for science, child speech classification, child speech-LLMs, child speech ASR, multimodal child behavior understanding, categorized speech emotion recognition, and arousal-valence speech recognition. |
| Aug 20, 2025 | Our work won 2nd place in the 2025 INTERSPEECH Speech Emotion Recognition Challenge, Task 1! |
| Aug 20, 2025 | Our work won 1st place in the 2025 INTERSPEECH Speech Emotion Recognition Challenge, Task 2! |
| Mar 20, 2025 | Four papers have been accepted and will be presented at ICASSP 2025! |
Selected Publications
- [KDD | Benchmark] Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe. Accepted to the 2026 KDD Dataset and Benchmark Track, 2026.
- [Preprint | Digital Health] VoxCog: Towards End-to-End Multilingual Cognitive Impairment Classification through Dialectal Knowledge. Under review, 2026.
- [Preprint | Benchmark] Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits. Under review, 2025.
- [Preprint | Dataset] TILES-2018 Sleep Benchmark Dataset: A Longitudinal Wearable Sleep Data Set of Hospital Workers for Modeling and Understanding Sleep Behaviors. Under review, 2025.
- [Challenge Winner | Speech Emotion] Developing a Top-tier Framework in Naturalistic Conditions Challenge for Categorized Emotion Prediction: From Speech Foundation Models and Learning Objective to Data Augmentation and Engineering Choices. In Interspeech 2025, 2025.
- [Digital Health | INTERSPEECH 2025] Egocentric Speaker Classification in Child-Adult Dyadic Interactions: From Sensing to Computational Modeling. In Interspeech 2025, 2025.
- [ACM Multimedia | Dataset] MM-AU: Towards Multimodal Understanding of Advertisement Videos. In Proceedings of the 31st ACM International Conference on Multimedia, 2023.
- [Preprint | Generative AI] Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing? arXiv preprint arXiv:2402.09036, 2024.
- [Voice Privacy | Generative AI] Unlocking Foundation Models for Privacy-Enhancing Speech Understanding: An Early Study on Low Resource Speech Training Leveraging Label-guided Synthetic Speech Content. arXiv preprint arXiv:2306.07791, 2023.
- [CVPRW | Generative AI] GPT-FL: Generative pre-trained model-assisted federated learning. arXiv preprint arXiv:2306.02210, 2023.
- [Journal | Trustworthiness] A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness. APSIPA Transactions on Signal and Information Processing, 2023.
- [ICASSP 2019 | Wearables] Discovering optimal variable-length time series motifs in large-scale wearable recordings of human bio-behavioral signals. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.