A Study on Differentiation and Synthesis Optimization of Highly Similar String Instrument Timbres Based on Dimensionality Reduction, Clustering, and Reinforcement Learning
赵星媛,白羽,沈函宇,唐惠心,魏依琳,宋飞,张留碗
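Abstract:
To address the insufficient naturalness of timbre in audio synthesis, this study proposes a multi-stage framework that combines feature analysis with intelligent optimization. First, a dataset is built by synthesizing three types of sound waves, acoustic features are extracted, and dimensionality reduction and clustering are used to quantify the difference between synthesized and natural sounds. Second, Bayesian optimization adaptively assigns feature weights to select the key discriminative indicators. Finally, reinforcement learning dynamically adjusts the synthesis parameters, using the distance between cluster centers as the reward to drive the synthesized timbre toward the distribution of natural sounds. Experiments show that the method significantly improves the naturalness of the synthesized timbre and offers an efficient solution for data-driven audio optimization.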
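The reward described in the abstract, the distance between cluster centers in a reduced feature space, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: PCA via SVD stands in for the unspecified dimensionality-reduction step, the two centers are taken as the mean of each labeled class rather than from a clustering run, and the names `pca_project`, `centroid_distance_reward`, and `weights` are hypothetical.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Return the mean, the principal axes, and the projected points."""
    mean = features.mean(axis=0)
    centered = features - mean
    # SVD of the centered data; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axes = vt[:n_components]
    return mean, axes, centered @ axes.T


def centroid_distance_reward(natural, synthetic, weights, n_components=2):
    """Negative distance between the class centers in the reduced space.

    `weights` scales each feature column (hypothetical stand-in for the
    weights that the abstract says are assigned by Bayesian optimization).
    """
    natural = np.asarray(natural, dtype=float) * weights
    synthetic = np.asarray(synthetic, dtype=float) * weights
    # Fit one projection on both classes so they share the same space.
    mean, axes, _ = pca_project(np.vstack([natural, synthetic]), n_components)
    nat_center = ((natural - mean) @ axes.T).mean(axis=0)
    syn_center = ((synthetic - mean) @ axes.T).mean(axis=0)
    # Closer centers give a reward nearer to zero (the maximum).
    return -float(np.linalg.norm(nat_center - syn_center))
```

Under these assumptions, a reinforcement learning agent that adjusts synthesis parameters would receive a larger reward as the synthesized features move toward the natural ones, with zero as the upper bound when the two centers coincide.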
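Keywords: audio synthesis; acoustic features; reinforcement learning; Bayesian optimization; timbre optimization
Foundation: Supported by "Reconstruction, Exploration, and Practice of University Experimental Physics Teaching Content in the Context of General and Specialized Education", a research project of the Ministry of Education Top-Notch Program 2.0 (20211007); the Wuxi "Future Technology Taihu Innovation Fund"; and the Tsinghua University Undergraduate Education and Teaching Reform Fund (53410001325).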