
Web Speech API in Practice: A Step-by-Step "Voice Notes" Tool That Runs in the Browser

news 2026/6/7 7:12:35

Web Speech API in Practice: Building a Smart Voice Notes Tool from Scratch

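Standing on the subway when a great idea hits, with no free hand to write it down? Inspiration strikes while cooking, but your hands are covered in oil? In moments like these, voice input is a lifesaver. Today we will use the Web Speech API built into the browser to make a note-taking tool that understands speech. There is no backend of your own to run and no SDK to install: open the page in a browser and it works. One caveat: in Chrome, recognition is performed by a server-side engine, so the audio is sent to a web service and the tool needs a network connection.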

1. Project Setup and Environment

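First, what we need. The project relies entirely on capabilities of modern browsers, so the list is short:

A recent version of Chrome or Edge (version 100 or later recommended)
A text editor (VS Code, Sublime Text, or anything similar)
Basic knowledge of HTML, CSS, and JavaScript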

Create the project folder and the three starting files:

    mkdir voice-notes && cd voice-notes
    touch index.html styles.css app.js

Set up the basic structure in index.html (the visible text stays in Chinese because the tool targets Chinese speech):

    <!DOCTYPE html>
    <html lang="zh-CN">
    <head>
      <meta charset="UTF-8">
      <meta name="viewport" content="width=device-width, initial-scale=1.0">
      <title>语音笔记小助手</title>
      <link rel="stylesheet" href="styles.css">
    </head>
    <body>
      <div class="container">
        <h1>语音笔记</h1>
        <button id="toggleBtn">开始录音</button>
        <div id="output" class="output-area"></div>
      </div>
      <script src="app.js"></script>
    </body>
    </html>

Give the tool some styling (styles.css):

    body {
      font-family: 'Segoe UI', sans-serif;
      max-width: 800px;
      margin: 0 auto;
      padding: 20px;
      background-color: #f5f7fa;
    }

    .container {
      background: white;
      border-radius: 10px;
      padding: 30px;
      box-shadow: 0 2px 10px rgba(0,0,0,0.1);
    }

    #toggleBtn {
      background: #4285f4;
      color: white;
      border: none;
      padding: 12px 24px;
      font-size: 16px;
      border-radius: 5px;
      cursor: pointer;
      transition: all 0.3s;
    }

    #toggleBtn:hover {
      background: #3367d6;
    }

    .output-area {
      margin-top: 20px;
      min-height: 200px;
      border: 1px dashed #ddd;
      padding: 15px;
      border-radius: 5px;
    }

    .note {
      margin-bottom: 10px;
      padding: 10px;
      background: #f8f9fa;
      border-left: 3px solid #4285f4;
    }

    .timestamp {
      font-size: 12px;
      color: #666;
      margin-bottom: 5px;
    }

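2. Implementing the Core Speech Recognition

Now for the exciting part: getting the browser to understand speech. In Chrome and Edge the recognition constructor is still exposed with the webkit prefix, so we start with a compatibility check: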

    // Check browser support (Chrome and Edge expose the prefixed constructor)
    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;

    if (!SpeechRecognition) {
      // "Your browser does not support speech recognition; please use the latest Chrome or Edge"
      alert("您的浏览器不支持语音识别，请使用Chrome或Edge最新版");
      document.getElementById('toggleBtn').disabled = true;
    } else {
      // Initialize the recognizer
      const recognition = new SpeechRecognition();
      recognition.continuous = true;      // keep listening across pauses
      recognition.interimResults = true;  // deliver interim (not yet final) results
      recognition.lang = 'zh-CN';         // recognize Mandarin Chinese

      let isListening = false;
      const toggleBtn = document.getElementById('toggleBtn');
      const outputDiv = document.getElementById('output');

      // Keep the button label and color in sync with the listening state
      function setListening(listening) {
        isListening = listening;
        toggleBtn.textContent = listening ? '停止录音' : '开始录音'; // "Stop recording" / "Start recording"
        toggleBtn.style.background = listening ? '#db4437' : '#4285f4';
      }

      // Show text in the UI. Interim text lives in a single pending note
      // (class "temp"); a final result turns that note into a permanent one
      // (class "final"), which later sections select with '.note.final'.
      function addNote(text, isFinal) {
        let noteDiv = outputDiv.querySelector('.note.temp');
        if (!noteDiv) {
          noteDiv = document.createElement('div');
          noteDiv.className = 'note temp';
          const timestamp = document.createElement('div');
          timestamp.className = 'timestamp';
          timestamp.textContent = new Date().toLocaleTimeString();
          const content = document.createElement('div');
          noteDiv.appendChild(timestamp);
          noteDiv.appendChild(content);
          outputDiv.prepend(noteDiv);
        }
        noteDiv.querySelector('div:last-child').textContent = text;
        if (isFinal) {
          noteDiv.classList.replace('temp', 'final');
        }
      }

      // Handle recognition results
      recognition.onresult = (event) => {
        let interimTranscript = '';
        let finalTranscript = '';
        for (let i = event.resultIndex; i < event.results.length; i++) {
          const transcript = event.results[i][0].transcript;
          if (event.results[i].isFinal) {
            finalTranscript += transcript;
          } else {
            interimTranscript += transcript;
          }
        }
        // Finalize first, so that new interim text starts a fresh note
        if (finalTranscript) {
          addNote(finalTranscript, true);
        }
        if (interimTranscript) {
          addNote(interimTranscript, false);
        }
      };

      // Toggle recording
      toggleBtn.addEventListener('click', () => {
        if (isListening) {
          recognition.stop();
          setListening(false);
        } else {
          recognition.start();
          setListening(true);
        }
      });

      // The session can also end on its own, for example after a long silence
      recognition.onend = () => setListening(false);

      // Error handling
      recognition.onerror = (event) => {
        console.error('识别错误:', event.error); // "recognition error"
        setListening(false);
      };
    }

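This code implements several key features:

Continuous recording: setting continuous = true keeps the recognizer from stopping after each utterance
Live feedback: interimResults = true gives us the interim results while recognition is still in progress
Chinese support: lang = 'zh-CN' selects Mandarin Chinese recognition
Error handling: the onerror handler deals with problems as they occur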
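To make the interim/final distinction concrete, here is a minimal sketch that pulls the splitting logic out of the event handler into a plain function. The name splitTranscripts and the mock data are assumptions made for illustration; only the shape of the results (array-like segments with an isFinal flag and a transcript on the first alternative) comes from the API.

```javascript
// Hypothetical helper (not part of the Web Speech API): splits the results of
// one `result` event into finalized text and still-changing interim text.
function splitTranscripts(results, resultIndex = 0) {
  let finalText = '';
  let interimText = '';
  for (let i = resultIndex; i < results.length; i++) {
    // results[i][0] is the top-ranked alternative for segment i
    const transcript = results[i][0].transcript;
    if (results[i].isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText, interimText };
}

// Mock data shaped like event.results: array-like segments with an isFinal flag
const mockResults = [
  Object.assign([{ transcript: '买牛奶' }], { isFinal: true }),   // "buy milk"
  Object.assign([{ transcript: '明天开' }], { isFinal: false }),  // still being spoken
];
console.log(splitTranscripts(mockResults));
```

Because the function does not touch the DOM, it can be exercised with mock data like this before wiring it into onresult as splitTranscripts(event.results, event.resultIndex).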

3. Enhancements and Refinements

With the basics in place, let's go a step further and add three practical features:

3.1 Storing Notes Locally

Save the notes with localStorage so that they survive a page refresh:

    // Place these functions inside the else block above so that outputDiv is in scope.

    // Save notes to local storage
    function saveNotes() {
      const notes = [];
      document.querySelectorAll('.note.final').forEach(note => {
        notes.push({
          time: note.querySelector('.timestamp').textContent,
          content: note.querySelector('div:last-child').textContent
        });
      });
      localStorage.setItem('voiceNotes', JSON.stringify(notes));
    }

    // Load the saved notes
    function loadNotes() {
      const savedNotes = localStorage.getItem('voiceNotes');
      if (savedNotes) {
        JSON.parse(savedNotes).forEach(note => {
          const noteDiv = document.createElement('div');
          noteDiv.className = 'note final';
          const timestamp = document.createElement('div');
          timestamp.className = 'timestamp';
          timestamp.textContent = note.time;
          const content = document.createElement('div');
          content.textContent = note.content;
          noteDiv.appendChild(timestamp);
          noteDiv.appendChild(content);
          outputDiv.appendChild(noteDiv);
        });
      }
    }

    // Load the notes once the DOM is ready
    document.addEventListener('DOMContentLoaded', loadNotes);

    // Modify addNote so that it saves whenever a final note is added
    function addNote(text, isFinal) {
      // ...existing code...
      if (isFinal) {
        saveNotes();
      }
    }
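One thing the loading code leaves open is what happens if the stored value is missing or corrupted, since JSON.parse throws on invalid input. The following is a small defensive sketch; serializeNotes and parseSavedNotes are hypothetical helper names, and the { time, content } shape is the one used by the storage code above.

```javascript
// Hypothetical helpers: convert between note objects and the JSON string
// stored under a localStorage key such as 'voiceNotes'.
function serializeNotes(notes) {
  // Keep only the two fields the tool needs
  return JSON.stringify(notes.map(({ time, content }) => ({ time, content })));
}

function parseSavedNotes(raw) {
  if (!raw) return []; // nothing stored yet
  try {
    const parsed = JSON.parse(raw);
    if (!Array.isArray(parsed)) return [];
    // Keep only entries with a string `content`; default a missing time to ''
    return parsed
      .filter((n) => n && typeof n.content === 'string')
      .map((n) => ({
        time: typeof n.time === 'string' ? n.time : '',
        content: n.content,
      }));
  } catch {
    return []; // corrupted JSON: start with an empty list
  }
}
```

In the page, loadNotes could then iterate over parseSavedNotes(localStorage.getItem('voiceNotes')) instead of calling JSON.parse directly.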

3.2 Adding Simple Voice Commands

Let's implement a few voice commands to control the app:

    // Add this inside the onresult handler
    recognition.onresult = (event) => {
      // ...existing code...
      if (finalTranscript) {
        // Check for voice commands
        const lowerText = finalTranscript.toLowerCase().trim();
        if (lowerText === '清除笔记') { // "clear notes"
          outputDiv.innerHTML = '';
          localStorage.removeItem('voiceNotes');
          return;
        }
        if (lowerText === '停止录音') { // "stop recording"
          recognition.stop();
          toggleBtn.textContent = '开始录音'; // "Start recording"
          toggleBtn.style.background = '#4285f4';
          isListening = false;
          return;
        }
        addNote(finalTranscript, true);
      }
    };
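Exact string comparison is fragile if the recognizer returns extra whitespace or trailing punctuation. As a rough sketch, a lookup table plus a normalization step makes the matching more forgiving; COMMANDS and matchCommand are hypothetical names, and the set of punctuation stripped here is an assumption rather than a list of what any particular engine emits.

```javascript
// Hypothetical command table: spoken phrase -> internal command name
const COMMANDS = new Map([
  ['清除笔记', 'clear'], // "clear notes"
  ['停止录音', 'stop'],  // "stop recording"
]);

// Strip whitespace and common Chinese/ASCII punctuation, then look up the phrase.
// Returns the command name, or null if the text is an ordinary note.
function matchCommand(transcript) {
  const normalized = transcript
    .toLowerCase()
    .replace(/[\s。，、！？.,!?]/g, '');
  return COMMANDS.get(normalized) ?? null;
}
```

The handler can then branch on the returned name, and a new command only needs a new entry in the table.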

3.3 Adding a Simple Highlight Feature

Highlight a note as important when the spoken text contains certain keywords:

    // Modify addNote
    function addNote(text, isFinal) {
      // ...existing code that sets up noteDiv...
      // Highlight notes that contain "重要" (important) or "紧急" (urgent)
      if (text.includes('重要') || text.includes('紧急')) {
        noteDiv.style.borderLeftColor = '#ea4335';
        noteDiv.style.background = '#fce8e6';
      }
      // ...remaining code...
    }

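4. Debugging Tips and Performance Tuning

You may run into a few pitfalls during development. Here are some practical tips:

4.1 Troubleshooting Common Problems

Permissions: make sure the browser has been granted access to the microphone
HTTPS requirement: some browsers only allow microphone access on pages served over HTTPS (localhost is typically treated as secure for local development)
Silence: the recognizer ends the session on its own after a period of silence. The API has no noiseThreshold parameter or similar setting to tune this; the usual workaround is to call start() again from the onend handler while the user still wants to record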
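The restart-after-silence workaround can be sketched as a small wrapper. This is only an illustration under assumptions: createAutoRestart is a hypothetical name, it assigns onend and onerror directly (so in the page it would replace handlers set earlier, and would need to be merged with them), and the choice of which error codes count as fatal is a judgment call. The error code strings themselves are defined by the Web Speech API.

```javascript
// Error codes after which restarting is pointless until the user acts
const FATAL_ERRORS = new Set(['not-allowed', 'service-not-allowed', 'audio-capture']);

// Hypothetical wrapper: keeps a recognition session alive until stop() is called.
// `recognition` only needs start(), stop() and assignable onend/onerror handlers.
function createAutoRestart(recognition) {
  let wanted = false; // does the user still want to record?

  recognition.onerror = (event) => {
    // Permission or hardware problems: give up instead of looping
    if (FATAL_ERRORS.has(event.error)) wanted = false;
  };
  recognition.onend = () => {
    // The session ended on its own (for example after silence): resume
    if (wanted) recognition.start();
  };

  return {
    start() { wanted = true; recognition.start(); },
    stop() { wanted = false; recognition.stop(); },
    isWanted() { return wanted; },
  };
}
```

With this in place the button handler would call the returned start() and stop() rather than the recognizer's own methods, so that a deliberate stop is not followed by an automatic restart.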

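4.2 Tuning Suggestions

    // Example configuration
    recognition.maxAlternatives = 3; // request up to three alternatives per result

    // Grammar lists are optional. Chrome only exposes the prefixed constructor,
    // and a recognition engine may ignore grammars entirely.
    const GrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;
    if (GrammarList) {
      recognition.grammars = new GrammarList(); // grammar constraints could be added here
    }

    // Throttle frequent interim results
    // (assigning onresult again replaces the earlier handler, so merge the two)
    let lastInterimTime = 0;
    recognition.onresult = (event) => {
      const now = Date.now();
      // ...existing code...
      if (interimTranscript && now - lastInterimTime > 300) {
        addNote(interimTranscript, false);
        lastInterimTime = now;
      }
    };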
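Requesting several alternatives is only useful if the code then looks past the first one. As a minimal sketch, the function below picks the alternative with the highest confidence value; pickBestAlternative is a hypothetical name, and how meaningful the confidence numbers are depends on the recognition engine, so treat the ranking as a heuristic.

```javascript
// Hypothetical helper: choose the alternative with the highest confidence.
// `result` is array-like, as a single entry of event.results is.
function pickBestAlternative(result) {
  let best = null;
  for (let i = 0; i < result.length; i++) {
    const alt = result[i];
    if (best === null || alt.confidence > best.confidence) {
      best = alt;
    }
  }
  // Empty result: fall back to an empty string
  return best ? best.transcript : '';
}
```

Inside onresult this would replace event.results[i][0].transcript with pickBestAlternative(event.results[i]).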

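4.3 Handling Other Browsers

Chrome has the best support. For browsers without the API, the page can at least degrade gracefully (this is a fallback message, not a true polyfill):

    // Simple compatibility handling
    if (!('webkitSpeechRecognition' in window) && !('SpeechRecognition' in window)) {
      // Without native support, tell the user or load a third-party polyfill
      document.getElementById('toggleBtn').style.display = 'none';
      const fallbackMsg = document.createElement('div');
      // "Your browser does not support speech recognition; the latest Chrome is recommended"
      fallbackMsg.innerHTML = '您的浏览器不支持语音识别，推荐使用<a href="https://www.google.com/chrome/" target="_blank">Chrome</a>最新版本';
      document.querySelector('.container').appendChild(fallbackMsg);
    }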
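Recognition and synthesis are separate parts of the Web Speech API and can be available independently, so it can help to detect them in one place. A minimal sketch, with detectSpeechSupport as a hypothetical name; it takes the global object as a parameter so it can be tried outside a browser.

```javascript
// Hypothetical helper: report which parts of the Web Speech API
// a global object (normally `window`) exposes.
function detectSpeechSupport(globalObj) {
  const Recognition =
    globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
  return {
    Recognition,                                // constructor to use, or null
    recognition: Recognition !== null,          // speech-to-text available?
    synthesis: 'speechSynthesis' in globalObj,  // text-to-speech available?
  };
}
```

In the page, const support = detectSpeechSupport(window) would then decide whether to show the record button, the read-aloud feature, or the fallback message.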

5. Ideas for Extending the Project

The basic version is usable, but there is plenty of room for improvement:

5.1 Categorizing Notes

    // Categorize notes automatically from keywords in the spoken text
    function categorizeNote(text, noteDiv) {
      const lowerText = text.toLowerCase();
      if (lowerText.includes('购物')) { // "shopping"
        noteDiv.dataset.category = 'shopping';
        noteDiv.querySelector('.timestamp').textContent += ' 🛒';
      } else if (lowerText.includes('想法')) { // "idea"
        noteDiv.dataset.category = 'idea';
        noteDiv.querySelector('.timestamp').textContent += ' 💡';
      }
      // More categories...
    }

    // Call it from addNote
    function addNote(text, isFinal) {
      // ...existing code...
      if (isFinal) {
        categorizeNote(text, noteDiv);
      }
    }
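As the number of categories grows, an if/else chain gets unwieldy. One possible alternative, sketched here with hypothetical names (CATEGORY_RULES, findCategory), is to keep the keywords in a table and look up the first match. The two rules mirror the ones in the code above.

```javascript
// Hypothetical rule table: keyword -> category name and the emoji shown in the UI
const CATEGORY_RULES = [
  { keyword: '购物', category: 'shopping', icon: '🛒' }, // "shopping"
  { keyword: '想法', category: 'idea', icon: '💡' },     // "idea"
];

// Returns the first rule whose keyword appears in the text, or null if none does.
function findCategory(text, rules = CATEGORY_RULES) {
  return rules.find((rule) => text.includes(rule.keyword)) ?? null;
}
```

categorizeNote could then set noteDiv.dataset.category and append the icon from whatever rule findCategory returns, and adding a category becomes a one-line change to the table.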

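5.2 Reading Notes Aloud with Speech Synthesis

    // Read a note aloud
    function readNote(text) {
      const utterance = new SpeechSynthesisUtterance(text);
      utterance.lang = 'zh-CN';
      utterance.rate = 0.9;
      speechSynthesis.speak(utterance);
    }

    // Read a note when it is clicked; a dedicated button or a voice command would also work
    document.addEventListener('click', (e) => {
      // closest() also matches clicks on the timestamp or text inside a note
      const note = e.target.closest('.note');
      if (note) {
        readNote(note.querySelector('div:last-child').textContent);
      }
    });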
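Setting utterance.lang leaves the choice of voice to the browser. If you want to pick one explicitly, a sketch like the following selects a voice from a list shaped like the return value of speechSynthesis.getVoices(). The name pickVoice and the exact-match-then-same-language fallback order are assumptions made for illustration.

```javascript
// Hypothetical helper: pick a voice for `lang` from a list shaped like
// the return value of speechSynthesis.getVoices().
function pickVoice(voices, lang) {
  // Some platforms report language tags with an underscore (zh_CN)
  const normalize = (tag) => tag.toLowerCase().replace('_', '-');
  const wanted = normalize(lang);
  const prefix = wanted.split('-')[0];
  // Prefer an exact match (zh-CN), then any voice of the same language (zh-*)
  return (
    voices.find((v) => normalize(v.lang) === wanted) ||
    voices.find((v) => normalize(v.lang).split('-')[0] === prefix) ||
    null
  );
}
```

In readNote this would be used as utterance.voice = pickVoice(speechSynthesis.getVoices(), 'zh-CN') when the result is not null. Note that getVoices() can return an empty list until the voiceschanged event has fired.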

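5.3 Exporting Notes

    // Export the notes as a text file
    function exportNotes() {
      let exportText = '我的语音笔记\n\n'; // "My voice notes"
      document.querySelectorAll('.note.final').forEach(note => {
        exportText += `[${note.querySelector('.timestamp').textContent}] ${note.querySelector('div:last-child').textContent}\n`;
      });
      const blob = new Blob([exportText], { type: 'text/plain;charset=utf-8' });
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      // An ISO date (YYYY-MM-DD, in UTC) avoids the slashes that toLocaleDateString() can produce
      a.download = `语音笔记_${new Date().toISOString().slice(0, 10)}.txt`;
      a.click();
      // Release the blob URL once the download has been started
      setTimeout(() => URL.revokeObjectURL(url), 0);
    }
    // Trigger this from an export button or a voice command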
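The text formatting and the file name are the parts of the export that are easiest to get subtly wrong, and both can be separated from the DOM and download code. A minimal sketch follows; formatNotesAsText and exportFileName are hypothetical names, and the ASCII file name prefix is an arbitrary choice for the example.

```javascript
// Hypothetical helper: turn note objects ({ time, content }) into the exported text
function formatNotesAsText(notes, title = '我的语音笔记') { // "My voice notes"
  const lines = notes.map((n) => `[${n.time}] ${n.content}`);
  return `${title}\n\n${lines.join('\n')}\n`;
}

// Hypothetical helper: build a file name from the local date using only
// digits and hyphens (YYYY-MM-DD), which is safe on every file system
function exportFileName(date = new Date()) {
  const pad = (n) => String(n).padStart(2, '0');
  const stamp = `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`;
  return `voice-notes_${stamp}.txt`;
}
```

exportNotes would then only collect the notes from the page, pass them through these two functions, and hand the result to the Blob and the download link.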

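This voice notes tool is simple, but it covers the core usage of the Web Speech API. In day-to-day use, I have found continuous recognition and live display to be the most useful features: they let me dictate fluently while thinking, without having to keep operating the interface.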
