diff --git a/2106.15561v3.pdf b/A_Survey_on_Neural_Speech_Synthesis.pdf
similarity index 100%
rename from 2106.15561v3.pdf
rename to A_Survey_on_Neural_Speech_Synthesis.pdf
diff --git a/pone.0283440.pdf b/A_real-time_voice_cloning_system_with_multiple_algorithms_for_speech_quality_improvement.pdf
similarity index 100%
rename from pone.0283440.pdf
rename to A_real-time_voice_cloning_system_with_multiple_algorithms_for_speech_quality_improvement.pdf
diff --git a/s13636-024-00329-7.pdf b/Deep_learning-based_expressive_speech_synthesis-a_systematic_review_of_approaches_challenges_and_resources.pdf
similarity index 100%
rename from s13636-024-00329-7.pdf
rename to Deep_learning-based_expressive_speech_synthesis-a_systematic_review_of_approaches_challenges_and_resources.pdf
diff --git a/2006.04558v8.pdf b/FASTSPEECH_2-FAST_AND_HIGH-QUALITY_END-TO-END_TEXT_TO_SPEECH.pdf
similarity index 100%
rename from 2006.04558v8.pdf
rename to FASTSPEECH_2-FAST_AND_HIGH-QUALITY_END-TO-END_TEXT_TO_SPEECH.pdf
diff --git a/2205.04421v2.pdf b/NaturalSpeech-End-to-End_Text_to_Speech_Synthesis_with_Human-Level_Quality.pdf
similarity index 100%
rename from 2205.04421v2.pdf
rename to NaturalSpeech-End-to-End_Text_to_Speech_Synthesis_with_Human-Level_Quality.pdf
diff --git a/README.md b/README.md
index a9fde69..2269806 100755
--- a/README.md
+++ b/README.md
@@ -20,13 +20,13 @@
 
 | File name | Core topic | Reading notes / purpose | Paper title (reference) | Status | Link |
 | :--- | :--- | :--- | :--- | :--- | :--- |
-| `2106.15561v3.pdf` | TTS technology survey | This survey covers neural-network-based TTS in detail; used to build an overall picture of modern speech synthesis. | *A Survey on Neural Speech Synthesis* | Archived | [Source link](https://arxiv.org/abs/2106.15561) |
-| `s13636-024-00329-7.pdf` | Emotional speech synthesis | This paper systematically reviews the approaches, challenges, and resources for emotional speech synthesis; highly relevant to the thesis's "emotion integration" part. | *Deep learning-based expressive speech synthesis: a systematic review...* | Archived | [Source link](https://asmp-eurasipjournals.springeropen.com/articles/10.1186/s13636-024-00329-7) |
+| `A_Survey_on_Neural_Speech_Synthesis.pdf` | TTS technology survey | This survey covers neural-network-based TTS in detail; used to build an overall picture of modern speech synthesis. | *A Survey on Neural Speech Synthesis* | Archived | [Source link](https://arxiv.org/abs/2106.15561) |
+| `Deep_learning-based_expressive_speech_synthesis-a_systematic_review_of_approaches_challenges_and_resources.pdf` | Emotional speech synthesis | This paper systematically reviews the approaches, challenges, and resources for emotional speech synthesis; highly relevant to the thesis's "emotion integration" part. | *Deep learning-based expressive speech synthesis: a systematic review...* | Archived | [Source link](https://asmp-eurasipjournals.springeropen.com/articles/10.1186/s13636-024-00329-7) |
 | `Text_to_Speech_Synthesis_A_Systematic_Review_Deep_.pdf` | TTS architectures and directions | This survey covers deep-learning TTS architectures and future research directions; a reference for technology selection and outlook. | *Text to Speech Synthesis: A Systematic Review, Deep Learning Based Architecture and Future Research Direction* | Archived | [Source link](https://www.researchgate.net/publication/364280141_Text_to_Speech_Synthesis_A_Systematic_Review_Deep_Learning_Based_Architecture_and_Future_Research_Direction) |
-| `2006.04558v8.pdf` | Non-autoregressive TTS model | For understanding how non-autoregressive models, typified by FastSpeech, address the "one-to-many" problem and how they introduce variation information such as prosody to improve synthesis quality and speed. | *FASTSPEECH 2: FAST AND HIGH-QUALITY END-TO-END TEXT TO SPEECH* | Archived | [Source link](https://paperswithcode.com/paper/fastspeech-2-fast-and-high-quality-end-to-end) |
-| `2205.04421v2.pdf` | SOTA TTS model | For learning how more powerful models (such as NaturalSpeech) and large-scale corpora achieve synthesis comparable to human speech, which helps locate the current ceiling of the technology. | *NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality* | Archived | [Source link](https://arxiv.org/abs/2205.04421) |
-| `1806.04558v4.pdf` | Zero-shot / transfer-learning TTS | This paper describes using transfer learning to build a text-to-speech (TTS) system that can generate the voice of arbitrary speakers, including speakers unseen during training. | *Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis* | Archived | [Source link](https://paperswithcode.com/paper/transfer-learning-from-speaker-verification) |
-| `pone.0283440.pdf` | Real-time voice cloning | This paper presents a real-time system aimed at improving voice cloning quality. | *A real-time voice cloning system with multiple algorithms for speech quality improvement* | Archived | [Source link](https://pmc.ncbi.nlm.nih.gov/articles/PMC10069766/) |
+| `FASTSPEECH_2-FAST_AND_HIGH-QUALITY_END-TO-END_TEXT_TO_SPEECH.pdf` | Non-autoregressive TTS model | For understanding how non-autoregressive models, typified by FastSpeech, address the "one-to-many" problem and how they introduce variation information such as prosody to improve synthesis quality and speed. | *FASTSPEECH 2: FAST AND HIGH-QUALITY END-TO-END TEXT TO SPEECH* | Archived | [Source link](https://paperswithcode.com/paper/fastspeech-2-fast-and-high-quality-end-to-end) |
+| `NaturalSpeech-End-to-End_Text_to_Speech_Synthesis_with_Human-Level_Quality.pdf` | SOTA TTS model | For learning how more powerful models (such as NaturalSpeech) and large-scale corpora achieve synthesis comparable to human speech, which helps locate the current ceiling of the technology. | *NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality* | Archived | [Source link](https://arxiv.org/abs/2205.04421) |
+| `Transfer_Learning_from_Speaker_Verification_to_Multispeaker_Text-To-Speech_Synthesis.pdf` | Zero-shot / transfer-learning TTS | This paper describes using transfer learning to build a text-to-speech (TTS) system that can generate the voice of arbitrary speakers, including speakers unseen during training. | *Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis* | Archived | [Source link](https://paperswithcode.com/paper/transfer-learning-from-speaker-verification) |
+| `A_real-time_voice_cloning_system_with_multiple_algorithms_for_speech_quality_improvement.pdf` | Real-time voice cloning | This paper presents a real-time system aimed at improving voice cloning quality. | *A real-time voice cloning system with multiple algorithms for speech quality improvement* | Archived | [Source link](https://pmc.ncbi.nlm.nih.gov/articles/PMC10069766/) |
 | `OpenVoice Versatile Instant Voice Cloning.pdf` | Emotion-controllable voice cloning | Studies how OpenVoice decouples timbre from styles such as emotion to allow flexible emotion control over the cloned voice; highly relevant to the thesis's core topic, "emotion-integrated voice cloning". | *OpenVoice: Versatile Instant Voice Cloning* | Archived | [Source link](https://arxiv.org/abs/2312.01479) |
 | `Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech.pdf` | End-to-end TTS model (VITS) | For understanding how VITS combines a VAE with adversarial learning to achieve high-quality parallel end-to-end speech synthesis; an important reference for technology selection. | *Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech* | Archived | [Source link](https://arxiv.org/abs/2106.06103) |
 | `Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions.pdf` | Milestone TTS model (Tacotron 2) | For studying the two-stage architecture of the milestone model Tacotron 2 and how it laid the foundation for high-quality end-to-end speech synthesis; provides technical background and a starting point for this research. | *Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions* | Archived | [Source link](https://arxiv.org/abs/1712.05884) |
@@ -63,4 +63,5 @@
 
 * Keep tracking SOTA (state-of-the-art) voice cloning and emotional synthesis models, especially low-resource and highly expressive techniques.
 * Read in depth the literature on children's interaction with technology in child developmental psychology and human-computer interaction (HCI).
-* Follow the latest research on large language models (LLMs) in education, especially dialogue systems and personalized tutoring.
\ No newline at end of file
+* Follow the latest research on large language models (LLMs) in education, especially dialogue systems and personalized tutoring.
+
diff --git a/1806.04558v4.pdf b/Transfer_Learning_from_Speaker_Verification_to_Multispeaker_Text-To-Speech_Synthesis.pdf
similarity index 100%
rename from 1806.04558v4.pdf
rename to Transfer_Learning_from_Speaker_Verification_to_Multispeaker_Text-To-Speech_Synthesis.pdf