
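# xf-asr

**Repository Path**: shankun/xf-asr

## Basic Information

- **Project Name**: xf-asr
- **Description**: Built on Vue3 + Vite5, this project uses the iFLYTEK (科大讯飞) ASR streaming voice dictation API to convert speech to text. The API targets real-time transcription of audio up to 1 minute long and returns recognition results as they become available, so you receive text while the audio is still uploading. The code works as soon as you clone it, and the walkthrough is step by step.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2025-05-20
- **Last Updated**: 2025-05-20

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Speech-to-text with the iFLYTEK ASR streaming voice dictation API on Vue3 + Vite5: real-time transcription of audio up to 1 minute long, with results returned while the audio is still uploading

## [Official documentation](https://www.xfyun.cn/doc/asr/voicedictation/API.html)

## Environment setup

- Development environment: `Node.js v20.10.0` + `npm v10.2.3`
- IDE: `WebStorm 2024.1.7`
- Source control: `Git`
- npm mirror: `https://registry.npmmirror.com/`

![System environment](http://img.jiachen.tech/csdn/Snipaste_2025-04-01_15-23-48.png)

## 1. Quick start

Use the `npm` package manager to scaffold a project:

```sh
npm create vue@latest
```

Name the project `xf-asr`. None of the optional features are needed for now, so you can press Enter through the prompts. Enter the project directory, run `npm install` to install the dependencies, then run `npm run dev` to start the project.

**Run `git init` to initialize a Git repository (a personal habit that makes the project easier to manage), then run `git add .`, followed by `git commit -m 'init'`.**

## 2. Download the SDK

In the official documentation, find the [streaming voice dictation API demo for JS](https://xfyun-doc.xfyun.cn/static/16902656744143549/iat-js-demo.zip) and download it. It contains the recorder manager and the speech recognition code we need. Unzip the downloaded `iat-js-demo.zip`, copy its `dist` folder into the project's `public` directory, and rename it to `asr-sdk`. Then run `npm install crypto-js` in the terminal to install the `crypto-js` package, which is used to compute the request signature.

## 3. Include the SDK

Modify the `index.html` file; nothing else in it needs to change.

```html
Vite App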

```

## 4. Wrap speech recognition in a useXfAsr plugin

Create a `hooks` directory under the project's `src` directory, create a `useXfAsr.js` file inside it, and add the following code:

```javascript
import { computed, ref } from "vue";
import CryptoJS from "crypto-js";

// TODO: get your own apiKey, apiSecret, and app_id from the iFLYTEK console
const apiKey = "xxx";
const apiSecret = "xxx";
const app_id = "xxx";

/**
 * Build the signed WebSocket URL.
 * This URL should be provided by the backend; it is built on the frontend here for convenience.
 */
function getWebSocketUrl() {
  const url = "wss://iat-api.xfyun.cn/v2/iat";
  const host = "iat-api.xfyun.cn";
  const date = new Date().toUTCString();
  const signatureOrigin = `host: ${host}\ndate: ${date}\nGET /v2/iat HTTP/1.1`;
  const signatureSha = CryptoJS.HmacSHA256(signatureOrigin, apiSecret);
  const signature = CryptoJS.enc.Base64.stringify(signatureSha);
  const authorizationOrigin = `api_key="${apiKey}", algorithm="hmac-sha256", headers="host date request-line", signature="${signature}"`;
  const authorization = btoa(authorizationOrigin);
  // URL-encode the values: the date contains spaces and commas, and Base64 may contain "+", "/" and "="
  return `${url}?authorization=${encodeURIComponent(authorization)}&date=${encodeURIComponent(date)}&host=${host}`;
}

/**
 * Convert binary audio data to a Base64 string.
 * @param buffer
 * @returns {string}
 */
function bufferToBase64(buffer) {
  let binary = "";
  const bytes = new Uint8Array(buffer);
  const len = bytes.byteLength;
  for (let i = 0; i < len; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return window.btoa(binary);
}

export function useXfAsr() {
  const resultText = ref(""); // recognition result shown to the user
  let committedText = ""; // text already confirmed by the server (wpgs mode)
  let iatWS = null; // websocket
  let countdownInterval = null; // countdown timer
  const nextTime = ref(60); // remaining recording time (60 seconds max)
  // RecorderManager is a global provided by the SDK script included in index.html
  const recorder = new RecorderManager("/asr-sdk");
  const recordStatus = ref("CLOSED"); // CONNECTING | OPEN | CLOSING | CLOSED
  const recordText = computed(() => {
    if (recordStatus.value === "CONNECTING") {
      return "Connecting";
    } else if (recordStatus.value === "OPEN") {
      return `Recording (${nextTime.value})`;
    } else if (recordStatus.value === "CLOSING") {
      return "Closing connection";
    } else if (recordStatus.value === "CLOSED") {
      return "Start recording";
    }
  });

  /**
   * Recording started
   */
  recorder.onStart = () => {
    updateStatus("OPEN");
  };

  /**
   * Fired each time a frame of the configured size has been recorded.
   * Only called when frameSize is set.
   * @param isLastFrame whether this is the last frame before recording ends normally
   * @param frameBuffer the recorded audio chunk
   */
  recorder.onFrameRecorded = ({ isLastFrame, frameBuffer }) => {
    // Guard against frames arriving before the socket exists
    if (iatWS && iatWS.readyState === iatWS.OPEN) {
      const data = {
        data: {
          // 0: first audio frame, 1: intermediate frame, 2: last frame (the last frame must be sent)
          status: isLastFrame ? 2 : 1,
          format: "audio/L16;rate=16000",
          encoding: "raw",
          audio: bufferToBase64(frameBuffer),
        },
      };
      iatWS.send(JSON.stringify(data));
      if (isLastFrame) {
        updateStatus("CLOSING");
      }
    }
  };

  /**
   * Recording stopped
   */
  recorder.onStop = () => {
    clearInterval(countdownInterval);
  };

  /**
   * Countdown
   */
  function countdown() {
    nextTime.value = 60;
    countdownInterval = setInterval(() => {
      nextTime.value--;
      if (nextTime.value <= 0) {
        clearInterval(countdownInterval);
        recorder.stop();
      }
    }, 1000);
  }

  /**
   * Update the status
   * @param status CONNECTING | OPEN | CLOSING | CLOSED
   */
  function updateStatus(status) {
    recordStatus.value = status;
    if (status === "OPEN") {
      countdown();
    } else if (status === "CONNECTING") {
      resultText.value = "";
      committedText = "";
    }
  }

  /**
   * Render the recognition result
   * @param resultData
   */
  function renderResult(resultData) {
    const jsonData = JSON.parse(resultData);
    console.log("Recognition result:", jsonData);
    if (jsonData.data && jsonData.data.result) {
      const data = jsonData.data.result;
      let str = "";
      const ws = data.ws;
      for (let i = 0; i < ws.length; i++) {
        str = str + ws[i].cw[0].w;
      }
      // The pgs field is present when wpgs is on
      // (prerequisite: dynamic correction is enabled in the console).
      // "apd" means this piece is appended to the previous final result;
      // "rpl" means it replaces part of the previous result (range given by the rg field).
      if (data.pgs) {
        if (data.pgs === "apd") {
          // The previously shown text is now final
          committedText = resultText.value;
        }
        // Show the confirmed text plus the newest piece, so the last piece is never dropped
        resultText.value = committedText + str;
      } else {
        committedText = committedText + str;
        resultText.value = committedText;
      }
    }
    if (jsonData.code === 0 && jsonData.data?.status === 2) {
      iatWS.close();
    }
    if (jsonData.code !== 0) {
      iatWS.close();
      console.error(jsonData);
    }
  }

  /**
   * Start recording
   */
  function startRecording() {
    if (recordStatus.value !== "CLOSED") return;
    const url = getWebSocketUrl();
    if ("WebSocket" in window) {
      iatWS = new WebSocket(url);
    } else if ("MozWebSocket" in window) {
      iatWS = new MozWebSocket(url);
    } else {
      console.error(new Error("This browser does not support WebSocket"));
      return;
    }
    updateStatus("CONNECTING");
    iatWS.onopen = (e) => {
      recorder.start({ sampleRate: 16000, frameSize: 1280 });
      const params = {
        common: { app_id },
        business: { language: "zh_cn", domain: "iat", accent: "mandarin", vad_eos: 5000, dwa: "wpgs" },
        data: { status: 0, format: "audio/L16;rate=16000", encoding: "raw" },
      };
      iatWS.send(JSON.stringify(params));
    };
    iatWS.onmessage = (e) => {
      renderResult(e.data);
    };
    iatWS.onerror = (e) => {
      recorder.stop();
      updateStatus("CLOSED");
    };
    iatWS.onclose = (e) => {
      recorder.stop();
      updateStatus("CLOSED");
    };
  }

  /**
   * Stop recording
   */
  function stopRecording() {
    recorder.stop();
  }

  return {
    resultText,
    recordText,
    startRecording,
    stopRecording,
  };
}
```

## 5. Use the plugin

Modify the `src/App.vue` file and add the following code:

```vue
```

![Result](http://img.jiachen.tech/csdn/Snipaste_2025-04-01_15-59-57.png)

## 6. Takeaways

This walkthrough calls iFLYTEK speech recognition with as little code as possible. Because the logic is wrapped in the `useXfAsr` function, it can be called from anywhere in the project.

The iFLYTEK SDK limits audio to 60 seconds, so if a recording runs past 60 seconds the user has to start a new one.

The flow is as follows. The client computes a signature from the API credentials using `hmac-sha256` and sends a `WebSocket` handshake request to the server. Once the handshake succeeds, the client uploads and receives data over the same `WebSocket` connection at the same time. When the upload is finished, the client must send an end-of-data marker once. After receiving the server's marker that all results have been returned, the client closes the `WebSocket` connection. For the details, see the [official documentation](https://www.xfyun.cn/doc/asr/voicedictation/API.html) and its examples.

## 7. Project source code

The source is available at [xf-asr](https://gitee.com/JiaChenHuang/xf-asr).

Everyone is welcome to use it. If you run into any problems, leave a comment and let's talk it through.
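
The comment on `getWebSocketUrl` says the signed URL should really be provided by the backend. Here is a minimal sketch of that step, assuming a Node.js backend that uses the built-in `node:crypto` module instead of `crypto-js`. The function name `buildIatUrl` and its parameters are hypothetical and not part of this project; the signing string follows the same layout as the hook above.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: builds the signed dictation URL on the server side.
// apiKey and apiSecret are the credentials from the iFLYTEK console.
function buildIatUrl(apiKey, apiSecret, now = new Date()) {
  const host = "iat-api.xfyun.cn";
  const date = now.toUTCString();
  // Same signing string as in the hook: host, date, request line
  const signatureOrigin = `host: ${host}\ndate: ${date}\nGET /v2/iat HTTP/1.1`;
  const signature = crypto
    .createHmac("sha256", apiSecret)
    .update(signatureOrigin)
    .digest("base64");
  const authorizationOrigin = `api_key="${apiKey}", algorithm="hmac-sha256", headers="host date request-line", signature="${signature}"`;
  const authorization = Buffer.from(authorizationOrigin).toString("base64");
  // URL-encode every query value before joining them
  const query = [
    `authorization=${encodeURIComponent(authorization)}`,
    `date=${encodeURIComponent(date)}`,
    `host=${encodeURIComponent(host)}`,
  ].join("&");
  return `wss://${host}/v2/iat?${query}`;
}
```

The frontend would then request this URL from the backend and pass it to `new WebSocket(url)`, so `apiSecret` does not have to be shipped in the page bundle.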

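
The `pgs` handling in `renderResult` is easier to follow in isolation. The sketch below is a standalone illustration, not the project's code: `createTranscript` and `push` are hypothetical names, and the input objects only mimic the `ws` / `cw` / `w` / `pgs` fields that the hook reads. Like the hook, it ignores the `rg` range and treats an `rpl` piece as replacing everything since the last confirmed text.

```javascript
// Hypothetical helper that assembles text from dictation result pieces.
function createTranscript() {
  let committed = ""; // text confirmed so far
  let shown = "";     // text currently displayed (confirmed + newest piece)
  return {
    push(result) {
      // Take the first candidate word of every segment
      const piece = result.ws.map((segment) => segment.cw[0].w).join("");
      if (result.pgs) {
        if (result.pgs === "apd") {
          // Everything shown before this piece is now final
          committed = shown;
        }
        // "rpl" keeps the old committed text and swaps the pending piece
        shown = committed + piece;
      } else {
        // Dynamic correction is off: every piece is final
        committed += piece;
        shown = committed;
      }
      return shown;
    },
  };
}

// Usage: an appended piece, a correction of it, then a new appended piece
const transcript = createTranscript();
const word = (w) => ({ cw: [{ w }] });
transcript.push({ pgs: "apd", ws: [word("hel")] });
transcript.push({ pgs: "rpl", ws: [word("hello")] });
const finalText = transcript.push({ pgs: "apd", ws: [word(" world")] });
console.log(finalText);
```

Keeping the confirmed text and the displayed text separate is what lets a later `rpl` piece overwrite the tentative part without touching anything that was already final.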