Publications
Publications by category in reverse chronological order. Generated by jekyll-scholar.
2025
- [Preprint] [Benchmark] Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe. arXiv preprint arXiv:2508.01691 (Accepted to 2026 KDD Dataset and Benchmark Track), 2025
- [Preprint] [Benchmark] Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits. arXiv preprint arXiv:2505.14648, 2025
- [Preprint] [Dataset] TILES-2018 Sleep Benchmark Dataset: A Longitudinal Wearable Sleep Data Set of Hospital Workers for Modeling and Understanding Sleep Behaviors. arXiv preprint arXiv:2507.03520, 2025
- [Challenge Winner] [Speech Emotion] Developing a Top-tier Framework in Naturalistic Conditions Challenge for Categorized Emotion Prediction: From Speech Foundation Models and Learning Objective to Data Augmentation and Engineering Choices. In Interspeech 2025, 2025
- [Child-centered Technology] [INTERSPEECH 2025] Egocentric Speaker Classification in Child-Adult Dyadic Interactions: From Sensing to Computational Modeling. In Interspeech 2025, 2025
- [Dataset] [INTERSPEECH 2025] 75-Speaker Annot-16: A benchmark dataset for speech articulatory rt-MRI annotation with articulator contours and phonetic alignment. In Interspeech 2025, 2025
2024
- [Preprint] [Generative AI] Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing? arXiv preprint arXiv:2402.09036, 2024
- [ICASSP TSP 2024] [Federated Learning] Partial federated learning: Unlocking non-biometric text information sharing for federated learning. 2024
- [Preprint] [Wearables] Understanding Stress, Burnout, and Behavioral Patterns in Medical Residents Using Large-scale Longitudinal Wearable Recordings. arXiv preprint arXiv:2402.09028, 2024
- [ICASSP 2024] [Child-centered Technology] Audio-visual child-adult speaker classification in dyadic interactions. In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
- [ICASSP 2024] [Foundation Model] Emotion-Aligned Contrastive Learning Between Images and Music. In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
2023
- [ACM Multimedia] [Dataset] MM-AU: Towards Multimodal Understanding of Advertisement Videos. In Proceedings of the 31st ACM International Conference on Multimedia, 2023
- [Voice Privacy] [Generative AI] Unlocking Foundation Models for Privacy-Enhancing Speech Understanding: An Early Study on Low Resource Speech Training Leveraging Label-guided Synthetic Speech Content. arXiv preprint arXiv:2306.07791, 2023
- [Preprint] [Generative AI] GPT-FL: Generative pre-trained model-assisted federated learning. arXiv preprint arXiv:2306.02210, 2023
- [Journal] [Trustworthiness] A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness. APSIPA Transactions on Signal and Information Processing, 2023
- [Journal] [HRV] Increasing coordination and responsivity of emotion-related brain regions with a heart rate variability biofeedback randomized trial. Cognitive, Affective, & Behavioral Neuroscience, 2023
- [Journal] [HRV] Effects of a randomised trial of 5-week heart rate variability biofeedback intervention on cognitive function: possible benefits for inhibitory control. Applied Psychophysiology and Biofeedback, 2023
- [INTERSPEECH 2023] [Child-centered Technology] Understanding spoken language development of children with ASD using pre-trained speech embeddings. arXiv preprint arXiv:2305.14117, 2023
- [INTERSPEECH 2023] [Child-centered Technology] Robust self supervised speech embeddings for child-adult classification in interactions involving children with autism. arXiv preprint arXiv:2307.16398, 2023
2022
- [Frontiers] [Wearables] Understanding the effect of speed on human emotion perception in mediated social touch using voice coil actuators. Frontiers in Computer Science, 2022
2021
- [ACM Transactions] [Wearables] Temporal dynamics of workplace acoustic scenes: Egocentric analysis and prediction. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021
- [Preprint] [Federated Learning] Attribute inference attack of speech emotion recognition in federated learning settings. arXiv preprint arXiv:2112.13416, 2021
- [ACII 2021] [Trustworthiness] Privacy and utility preserving data transformation for speech emotion recognition. In 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII), 2021
2020
- [ICASSP 2020] [Wearables] Modeling behavioral consistency in large-scale wearable recordings of human bio-behavioral signals. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
- [ICASSP 2020] [Wearables] Modeling behavior as mutual dependency between physiological signals and indoor location in large-scale wearable sensor study. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
2019
- [ICASSP 2019] [Wearables] Discovering optimal variable-length time series motifs in large-scale wearable recordings of human bio-behavioral signals. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019
- [JMIR] [Wearables] Multimodal human and environmental sensing for longitudinal behavioral studies in naturalistic settings: Framework for sensor selection, deployment, and management. Journal of Medical Internet Research, 2019
- [EMBC 2019] [Wearables] Imputing missing data in large-scale multivariate biomedical wearable recordings using bidirectional recurrent neural networks with temporal activation regularization. In 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019
- [ICASSP 2019] [Wearables] Toward robust interpretable human movement pattern analysis in a workplace setting. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019
- [Scientific Report] [Child-centered Technology] Cross-modal coordination of face-directed gaze and emotional speech production in school-aged children and adolescents with ASD. Scientific Reports, 2019