My Publications

I have been working on voice conversion since 2016. My goal is to make it easier to build and more readily applicable to realistic circumstances.

Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks

Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, and Hsin-Min Wang
Proceedings of INTERSPEECH 2017 (in press)

We propose a non-parallel voice conversion (VC) framework built on a Wasserstein generative adversarial network (W-GAN) that explicitly takes a VC-related objective into account. Experimental results corroborate the framework's ability to build a VC system from unaligned data and demonstrate improved conversion quality.

Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder

Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, and Hsin-Min Wang
Proceedings of APSIPA, 2016

By means of deep generative models, we free voice conversion from the need for parallel corpora (i.e., strict supervision), which makes the training process easier.

Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network

Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, and Hsin-Min Wang
Proceedings of ISCSLP, 2016

We made the dictionary update easier by reformulating it as an auto-encoder based on deep neural networks. This reformulation also significantly improves the voice quality of the synthesized speech.

A Post-filtering Approach Based on Locally Linear Embedding Difference Compensation for Speech Enhancement

Yi-Chiao Wu, Hsin-Te Hwang, Syu-Siang Wang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang
Proceedings of INTERSPEECH 2017 (in press)

Locally Linear Embedding for Exemplar-Based Spectral Conversion

Yi-Chiao Wu, Hsin-Te Hwang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang
Proceedings of INTERSPEECH, 2016

Chin-Cheng (Jeremy) Hsu

My Publications

Robust Emotion Recognition by Spectro-Temporal Modulation Statistic Features

Tai-Shi Chi, Lan-Ying Yeh and Chin-Cheng Hsu
Journal of Ambient Intelligence and Humanized Computing, 2011

On Robustness of Spectro-Temporal Modulation Features in an Emotion Recognition Framework

Author: Chin-Cheng Hsu
Advisor: Tai-Shi Chi
Master's Thesis

Patent

Text-to-speech method and multi-lingual speech synthesizer using the method

Authors: Hsun-Fu Liu, Abhishek Pandey, Chin-Cheng Hsu
Patent ID: US20170047060A1