Tsinghua Science and Technology

2022, v.27(06) 939-947


Optimizing the Perceptual Quality of Time-Domain Speech Enhancement with Reinforcement Learning

Xiang Hao; Chenglin Xu; Lei Xie; Haizhou Li

Abstract:

In neural speech enhancement, a mismatch exists between the training objective, i.e., Mean-Square Error (MSE), and perceptual quality evaluation metrics, i.e., Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI). We propose a novel reinforcement learning algorithm and network architecture that incorporate a non-differentiable perceptual quality evaluation metric into the objective function by means of a dynamic filter module. Unlike the traditional dynamic filter implementation, which directly generates a convolution kernel, we use a filter generation agent to predict the probability density function of a multivariate Gaussian distribution, from which we sample the convolution kernel. Experimental results show that the proposed reinforcement learning method clearly improves perceptual quality over supervised learning methods trained with the MSE objective function.
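
The idea of sampling a convolution kernel from a predicted Gaussian and scoring the result with a non-differentiable metric can be illustrated with a small score-function (REINFORCE) estimator. The following is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes a diagonal covariance, uses plain NumPy in place of a neural network, and replaces PESQ with a hypothetical `toy_quality_score` (a rounded SNR) so that it runs without external tools. All function and variable names are illustrative.

```python
import numpy as np


def sample_kernel(mean, log_std, rng):
    """Draw a kernel from a diagonal Gaussian; return it with its log-density."""
    std = np.exp(log_std)
    kernel = mean + std * rng.standard_normal(mean.shape)
    log_prob = -0.5 * np.sum(
        ((kernel - mean) / std) ** 2 + 2.0 * log_std + np.log(2.0 * np.pi)
    )
    return kernel, float(log_prob)


def toy_quality_score(enhanced, clean):
    """Hypothetical stand-in for a non-differentiable metric (higher is better)."""
    noise_energy = np.sum((clean - enhanced) ** 2) + 1e-8
    snr_db = 10.0 * np.log10(np.sum(clean ** 2) / noise_energy)
    return float(np.round(snr_db))  # rounding makes the score piecewise constant


def reinforce_mean_gradient(mean, log_std, noisy, clean, rng, num_samples=16):
    """Score-function estimate of the gradient of the expected reward w.r.t. the mean."""
    std = np.exp(log_std)
    kernels, rewards = [], []
    for _ in range(num_samples):
        kernel, _ = sample_kernel(mean, log_std, rng)
        enhanced = np.convolve(noisy, kernel, mode="same")  # apply the sampled filter
        kernels.append(kernel)
        rewards.append(toy_quality_score(enhanced, clean))
    rewards = np.asarray(rewards)
    baseline = rewards.mean()  # simple baseline to reduce variance
    grad = np.zeros_like(mean)
    for kernel, reward in zip(kernels, rewards):
        # d log N(kernel; mean, std^2) / d mean = (kernel - mean) / std^2
        grad += (reward - baseline) * (kernel - mean) / std ** 2
    return grad / num_samples, float(baseline)


# Usage: one gradient-ascent step on the kernel mean for a synthetic signal.
rng = np.random.default_rng(0)
t = np.arange(800) / 8000.0
clean = np.sin(2.0 * np.pi * 220.0 * t)
noisy = clean + 0.3 * rng.standard_normal(clean.shape)

mean = np.zeros(9)
mean[4] = 1.0                    # start from an identity (pass-through) filter
log_std = np.full(9, -3.0)       # small exploration noise around the mean

grad, avg_reward = reinforce_mean_gradient(mean, log_std, noisy, clean, rng)
mean = mean + 1e-3 * grad        # ascend the estimated reward gradient
```

Because the reward enters only as a scalar weight on the log-density gradient, the metric itself never needs to be differentiated, which is what allows a measure such as PESQ to be used in place of the toy score.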

Funding: Supported by the National Research Foundation of Singapore (No. AISG-100E-2018-006); Human-Robot Interaction Phase 1 (No. 1922500054), under the National Robotics Programme, Singapore.

