
Text-to-Speech and Speech-to-Text with the LFM2.5-Audio-1.5B-GGUF Model

news 2026/7/11 0:28:03

Download the model files as instructed on the model page: https://www.modelscope.cn/models/LiquidAI/LFM2.5-Audio-1.5B-GGUF/summary

C:\d\models>curl -LO https://www.modelscope.cn/models/LiquidAI/LFM2.5-Audio-1.5B-GGUF/resolve/master/LFM2.5-Audio-1.5B-Q4_0.gguf
100 663M 100 663M 0 0 12.8M 0 0:00:51 0:00:51 --:--:-- 12.5M

C:\d\models>curl -LO https://www.modelscope.cn/models/LiquidAI/LFM2.5-Audio-1.5B-GGUF/resolve/master/tokenizer-LFM2.5-Audio-1.5B-Q4_0.gguf
100 48.2M 100 48.2M 0 0 9996k 0 0:00:04 0:00:04 --:--:-- 10.7M

C:\d\models>curl -LO https://www.modelscope.cn/models/LiquidAI/LFM2.5-Audio-1.5B-GGUF/resolve/master/mmproj-LFM2.5-Audio-1.5B-Q4_0.gguf
100 209M 100 209M 0 0 10.3M 0 0:00:20 0:00:20 --:--:-- 12.2M

C:\d\models>curl -LO https://www.modelscope.cn/models/LiquidAI/LFM2.5-Audio-1.5B-GGUF/resolve/master/vocoder-LFM2.5-Audio-1.5B-Q4_0.gguf
100 103M 100 103M 0 0 10.7M 0 0:00:09 0:00:09 --:--:-- 12.5M

C:\d\models>curl -LO https://www.modelscope.cn/models/LiquidAI/LFM2.5-Audio-1.5B-GGUF/resolve/master/liquid_audio_chat.py
100 17543 100 17543 0 0 45627 0 --:--:-- --:--:-- --:--:-- 45684

C:\d\models>/d/llama8/llama-liquid-audio-cli -m LFM2.5-Audio-1.5B-Q4_0.gguf -mm mmproj-LFM2.5-Audio-1.5B-Q4_0.gguf -mv vocoder-LFM2.5-Audio-1.5B-Q4_0.gguf --tts-speaker-file tokenizer-LFM2.5-Audio-1.5B-Q4_0.gguf -sys "Perform TTS." -p "Hi, how are you?" --output $OUTPUT_WAV
'/d/llama8/llama-liquid-audio-cli' is not recognized as an internal or external command,
operable program or batch file.
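The four GGUF downloads above follow one naming pattern: the main model file has no prefix, and the tokenizer, mmproj and vocoder files carry a role prefix. A minimal Python sketch that rebuilds the same URLs is shown here. The function names are hypothetical, and whether other quantization suffixes exist in the repository is an assumption that should be checked against the model page.

```python
# Base "resolve" URL copied from the curl commands in this post.
BASE_URL = "https://www.modelscope.cn/models/LiquidAI/LFM2.5-Audio-1.5B-GGUF/resolve/master"

MODEL_NAME = "LFM2.5-Audio-1.5B"
QUANT = "Q4_0"  # the only quantization used in this post


def gguf_files(model=MODEL_NAME, quant=QUANT):
    """Return the four GGUF file names used in this walkthrough."""
    stem = f"{model}-{quant}.gguf"
    # The main model has no prefix; the other three carry a role prefix.
    return [stem] + [f"{prefix}-{stem}" for prefix in ("tokenizer", "mmproj", "vocoder")]


def download_urls(base_url=BASE_URL):
    """Return one download URL per GGUF file, in the same order as gguf_files()."""
    return [f"{base_url}/{name}" for name in gguf_files()]


if __name__ == "__main__":
    for url in download_urls():
        print(url)
```

Each printed URL can be passed to `curl -LO` exactly as in the transcript.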

The standard llama directory does not contain the llama-liquid-audio-cli runner.

The runner has to be downloaded separately.

C:\d\models>curl -LO https://www.modelscope.cn/models/LiquidAI/LFM2.5-Audio-1.5B-GGUF/resolve/master/runners/llama-liquid-audio-ubuntu-x64.zip
100 12.6M 100 12.6M 0 0 8900k 0 0:00:01 0:00:01 --:--:-- 12.8M

Extract the tool into the llama-audio directory.
There is no Windows build, so switch to the WSL environment.

C:\d\models>wsl
root@DESKTOP-59T6U68:/mnt/c/d/models# cd audio
root@DESKTOP-59T6U68:/mnt/c/d/models/audio# llama-audio/llama-liquid-audio-cli -m LFM2.5-Audio-1.5B-Q4_0.gguf -mm mmproj-LFM2.5-Audio-1.5B-Q4_0.gguf -mv vocoder-LFM2.5-Audio-1.5B-Q4_0.gguf --tts-speaker-file tokenizer-LFM2.5-Audio-1.5B-Q4_0.gguf -sys "Perform TTS." -p "Hi, how are you?" --output $OUTPUT_WAV
llama-audio/llama-liquid-audio-cli: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by llama-audio/llama-liquid-audio-cli)
llama-audio/llama-liquid-audio-cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by llama-audio/llama-liquid-audio-cli)
llama-audio/llama-liquid-audio-cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /mnt/c/d/models/audio/llama-audio/libliquid-audio.so)
llama-audio/llama-liquid-audio-cli: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /mnt/c/d/models/audio/llama-audio/libliquid-audio.so)
llama-audio/llama-liquid-audio-cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /mnt/c/d/models/audio/llama-audio/libmtmd.so.0)
llama-audio/llama-liquid-audio-cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /mnt/c/d/models/audio/llama-audio/libllama.so.0)
llama-audio/llama-liquid-audio-cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /mnt/c/d/models/audio/llama-audio/libggml-base.so.0)
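The loader errors above repeat the same two requirements for several shared libraries. As a rough sketch, the distinct requirements can be pulled out of such output with a regular expression. The function name `missing_versions` is hypothetical, and the pattern only assumes the message shape seen in this transcript.

```python
import re

# Matches fragments such as: version `GLIBC_2.38' not found
_MISSING = re.compile(r"version `([A-Z]+)_(\d+(?:\.\d+)*)' not found")


def missing_versions(loader_output):
    """Map each symbol family (GLIBC, GLIBCXX) to the highest version reported missing."""
    needed = {}
    for family, version in _MISSING.findall(loader_output):
        key = tuple(int(part) for part in version.split("."))
        if family not in needed or key > needed[family]:
            needed[family] = key
    # Convert the integer tuples back to dotted strings for display.
    return {family: ".".join(str(n) for n in key) for family, key in needed.items()}


if __name__ == "__main__":
    sample = (
        "cli: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by cli)\n"
        "cli: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by cli)\n"
    )
    print(missing_versions(sample))
```

For the transcript above this reduces seven error lines to two requirements: GLIBC 2.38 and GLIBCXX 3.4.32.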

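The errors show that newer GLIBC and GLIBCXX versions are required, so I turned to a gcc Docker container.

The GLIBC shipped in the gcc 14.2 and 15.1 container images is not new enough, so I pulled a 15.2 image, which meets the requirement.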

gcc version 15.1.0 (GCC)
root@DESKTOP-59T6U68:/par/models/audio# strings /lib/x86_64-linux-gnu/libc.so.6|grep GLIBC_2.3
GLIBC_2.3
GLIBC_2.3.2
GLIBC_2.3.3
GLIBC_2.3.4
GLIBC_2.30
GLIBC_2.31
GLIBC_2.32
GLIBC_2.33
GLIBC_2.34
GLIBC_2.35
GLIBC_2.36
root@DESKTOP-59T6U68:/par/models/audio# exit
root@DESKTOP-59T6U68:/mnt/c/d/models/audio# docker pull docker.1ms.run/gcc:15.2
Trying to pull docker.1ms.run/gcc:15.2...
Getting image source signatures
Copying blob a793e3c6bce8 skipped: already exists
Copying blob 9da421ddeb65 skipped: already exists
Copying blob 866771c43bf5 skipped: already exists
Copying blob ed881fbf1b07 skipped: already exists
Copying blob c9c9bdd0804b done
Copying blob 933ec911a9d9 done
Copying blob 93f6a80119c4 done
Copying blob 303d1dc2b7db done
Copying config 47a721da1a done
Writing manifest to image destination
Storing signatures
47a721da1addefee38fea4a35a48c0da7492ca616794cc5fcb64f9c198fb2c94
root@DESKTOP-59T6U68:/mnt/c/d/models/audio# sudo docker run -itd -v /mnt/c/d:/par --network host --name gcc152 docker.1ms.run/gcc:15.2
92399a235cfe7a6440ffa6e015f55ba1f5df4b5056eb59b52e9825172edecc4c
root@DESKTOP-59T6U68:/mnt/c/d/models/audio# docker exec -it gcc152 bash
root@DESKTOP-59T6U68:/# strings /lib/x86_64-linux-gnu/libc.so.6|grep GLIBC_2.3
GLIBC_2.3
GLIBC_2.3.2
GLIBC_2.3.3
GLIBC_2.3.4
GLIBC_2.30
GLIBC_2.31
GLIBC_2.32
GLIBC_2.33
GLIBC_2.34
GLIBC_2.35
GLIBC_2.36
GLIBC_2.38 <---------
GLIBC_2.39
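The manual comparison of the two `strings ... | grep` listings above can be expressed as a small check. This is only a sketch: it assumes the output of `strings` on libc.so.6 has been captured as text, and the helper names `glibc_versions` and `has_glibc_version` are hypothetical.

```python
import re


def glibc_versions(strings_output):
    """Extract GLIBC_x.y[.z] tags and return them as sorted integer tuples."""
    tags = set(re.findall(r"\bGLIBC_(\d+(?:\.\d+)*)", strings_output))
    return sorted(tuple(int(part) for part in tag.split(".")) for tag in tags)


def has_glibc_version(strings_output, required="2.38"):
    """True if the exact version tag the loader asked for is present."""
    need = tuple(int(part) for part in required.split("."))
    return need in glibc_versions(strings_output)


if __name__ == "__main__":
    # Abbreviated listings in the shape shown in the transcript.
    gcc_15_1 = "GLIBC_2.34\nGLIBC_2.35\nGLIBC_2.36\n"
    gcc_15_2 = gcc_15_1 + "GLIBC_2.38\nGLIBC_2.39\n"
    print(has_glibc_version(gcc_15_1), has_glibc_version(gcc_15_2))
```

The check looks for the exact tag because the loader error names a specific version node (`GLIBC_2.38`); comparing against the highest available version would be a looser alternative.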

With the example command line from the model page, the -sys system prompt was rejected.

llama-audio/llama-liquid-audio-cli -m LFM2.5-Audio-1.5B-Q4_0.gguf -mm mmproj-LFM2.5-Audio-1.5B-Q4_0.gguf -mv vocoder-LFM2.5-Audio-1.5B-Q4_0.gguf --tts-speaker-file tokenizer-LFM2.5-Audio-1.5B-Q4_0.gguf -sys "Perform TTS." -p "Hi, how are you?" --output OUTPUT.WAV
ERR: Unsupported system prompt. Supported prompts are:
 - Perform TTS. Use the US male voice.
 - Perform TTS. Use the UK male voice.
 - Perform TTS. Use the US female voice.
 - Perform TTS. Use the UK female voice.
 - Perform ASR.
 - Respond with interleaved text and audio.
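The error message suggests that the runner accepts only a fixed set of system prompts. A small sketch that checks a prompt before the runner is launched follows. The six strings are copied from the error output above, while the exact-match rule and the helper names are assumptions made for illustration.

```python
# Prompts copied verbatim from the runner's error message.
SUPPORTED_SYSTEM_PROMPTS = (
    "Perform TTS. Use the US male voice.",
    "Perform TTS. Use the UK male voice.",
    "Perform TTS. Use the US female voice.",
    "Perform TTS. Use the UK female voice.",
    "Perform ASR.",
    "Respond with interleaved text and audio.",
)


def check_system_prompt(prompt):
    """Return the prompt unchanged, or raise ValueError if it is not in the list.

    Assumption: the runner compares the whole string exactly.
    """
    if prompt not in SUPPORTED_SYSTEM_PROMPTS:
        raise ValueError(f"Unsupported system prompt: {prompt!r}")
    return prompt


def tts_prompt(accent, gender):
    """Build a TTS prompt such as 'Perform TTS. Use the US male voice.'"""
    return check_system_prompt(f"Perform TTS. Use the {accent} {gender} voice.")


if __name__ == "__main__":
    print(tts_prompt("US", "male"))
```

With this check, the bare "Perform TTS." from the model page's example is rejected up front, before any model files are loaded.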

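Changing the system prompt to one of the supported values works. The English text used here is taken from the speech "I Have a Dream".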

Text to speech:
llama-audio/llama-liquid-audio-cli -m LFM2.5-Audio-1.5B-Q4_0.gguf -mm mmproj-LFM2.5-Audio-1.5B-Q4_0.gguf -mv vocoder-LFM2.5-Audio-1.5B-Q4_0.gguf --tts-speaker-file tokenizer-LFM2.5-Audio-1.5B-Q4_0.gguf -sys "Perform TTS. Use the US male voice." -p "I have a dream that one day this nation will rise up and live out the true meaning of its creed:“We hold these truths to be self-evident,that all men are created equal." --output OUTPUT.WAV

Speech to text:
llama-audio/llama-liquid-audio-cli -m LFM2.5-Audio-1.5B-Q4_0.gguf -mm mmproj-LFM2.5-Audio-1.5B-Q4_0.gguf -mv vocoder-LFM2.5-Audio-1.5B-Q4_0.gguf --tts-speaker-file tokenizer-LFM2.5-Audio-1.5B-Q4_0.gguf -sys "Perform ASR." --audio OUTPUT.WAV
=== GENERATED TEXT ===
I have a dream that one day this nation will rise up and live out the true meaning of its creed. We hold these truths to be self-evident, that all men are created equal.
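The two invocations share the same model arguments and differ only in the system prompt and the input or output option. A sketch that builds both argument lists in Python follows. Every flag is taken from the commands in this post, the relative paths assume the same working directory as the transcript, and the function names are hypothetical.

```python
import shlex

RUNNER = "llama-audio/llama-liquid-audio-cli"
GGUF = "LFM2.5-Audio-1.5B-Q4_0.gguf"


def base_args():
    """Arguments common to both modes, as used in the commands in this post."""
    return [
        RUNNER,
        "-m", GGUF,
        "-mm", f"mmproj-{GGUF}",
        "-mv", f"vocoder-{GGUF}",
        "--tts-speaker-file", f"tokenizer-{GGUF}",
    ]


def tts_args(text, wav_path, voice="US male"):
    """Text to speech: synthesize `text` into `wav_path`."""
    return base_args() + [
        "-sys", f"Perform TTS. Use the {voice} voice.",
        "-p", text,
        "--output", wav_path,
    ]


def asr_args(wav_path):
    """Speech to text: transcribe the audio file at `wav_path`."""
    return base_args() + ["-sys", "Perform ASR.", "--audio", wav_path]


if __name__ == "__main__":
    # Print shell-quoted commands instead of running them.
    print(shlex.join(tts_args("Hi, how are you?", "OUTPUT.WAV")))
    print(shlex.join(asr_args("OUTPUT.WAV")))
```

Passing either list to `subprocess.run(args, check=True)` would run the command without shell quoting problems, which matters for prompts that contain quotation marks.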

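The model page also provides a server command line. It cannot be used interactively from a browser, though. It requires running a Python script that imports additional packages, so I did not try it.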

llama-liquid-audio-server -m LFM2.5-Audio-1.5B-Q4_0.gguf -mm mmproj-LFM2.5-Audio-1.5B-Q4_0.gguf -mv vocoder-LFM2.5-Audio-1.5B-Q4_0.gguf --tts-speaker-file tokenizer-LFM2.5-Audio-1.5B-Q4_0.gguf
