当前位置：首页 > news >正文

DeepSeek对话导出Word/PDF全攻略，【Linux】开启关闭MediaMTX服务。

news 2026/4/15 5:20:16

将 DeepSeek 对话 JSON 导出为 Word 和 PDF 的技术实现

DeepSeek 作为一款先进的 AI 对话工具，支持将对话内容导出为 JSON 格式。将 JSON 数据转换为 Word 和 PDF 文件，可以通过多种技术手段实现。

Python 实现方案使用python-docx和reportlab库可以高效完成转换任务。以下是一个完整的代码示例：

import json from docx import Document from reportlab.lib.pagesizes import letter from reportlab.pdfgen import canvas def json_to_word(json_path, word_path): with open(json_path, 'r', encoding='utf-8') as f: data = json.load(f) doc = Document() for message in data['messages']: doc.add_paragraph(f"{message['role']}: {message['content']}") doc.save(word_path) def word_to_pdf(word_path, pdf_path): # 需要先安装 libreoffice 并配置环境 import subprocess subprocess.call(['soffice', '--convert-to', 'pdf', '--outdir', pdf_path.rpartition('/')[0], word_path])

JavaScript 实现方案使用 Node.js 生态的docx和pdf-lib库：

const { Document, Paragraph, TextRun } = require('docx'); const fs = require('fs'); const { PDFDocument } = require('pdf-lib'); async function jsonToWord(jsonPath, wordPath) { const data = JSON.parse(fs.readFileSync(jsonPath)); const doc = new Document({ sections: [{ properties: {}, children: data.messages.map(msg => new Paragraph({ children: [new TextRun(`${msg.role}: ${msg.content}`)] }) ) }] }); const buffer = await docx.Packer.toBuffer(doc); fs.writeFileSync(wordPath, buffer); }

格式优化与高级功能

样式自定义在 Word 导出中添加样式控制：

from docx.shared import Pt, RGBColor def add_styled_paragraph(doc, text, is_bot=False): p = doc.add_paragraph() run = p.add_run(text) run.font.size = Pt(12) run.font.color.rgb = RGBColor(0x42, 0x24, 0xE9) if is_bot else RGBColor(0, 0, 0)

PDF 直接生成使用reportlab直接生成 PDF，避免格式转换损失：

def json_to_pdf(json_path, pdf_path): c = canvas.Canvas(pdf_path, pagesize=letter) y_position = 750 with open(json_path, 'r') as f: data = json.load(f) for msg in data['messages']: c.drawString(100, y_position, f"{msg['role'].upper()}: {msg['content']}") y_position -= 20 if y_position < 50: c.showPage() y_position = 750 c.save()

企业级解决方案

批量处理架构对于需要处理大量对话的场景，建议采用以下架构：

使用消息队列（如 RabbitMQ）接收转换任务
部署独立的微服务处理每种格式转换
将生成文件存储到云存储（如 S3）

性能优化技巧

对大型 JSON 文件采用流式解析
使用多线程处理独立对话
缓存常用模板减少重复计算

常见问题解决方案

中文乱码处理确保在所有环节指定 UTF-8 编码：

with open(json_path, 'r', encoding='utf-8') as f: data = json.load(f)

格式保持使用 HTML 作为中间格式可以更好地保持原始样式：

from htmldocx import HtmlToDocx def html_to_word(html, output_path): docx = HtmlToDocx() docx.add_html(html) docx.save(output_path)

以上方案提供了从基础到高级的完整实现路径，可根据具体需求选择合适的方案进行实施。

https://github.com/JohnnyDevn/pcx_kbu2/blob/main/README.md
https://raw.githubusercontent.com/JohnnyDevn/pcx_kbu2/main/README.md
https://github.com/ThoDierser/ze7_47im
https://github.com/ThoDierser/ze7_47im/blob/main/README.md
https://raw.githubusercontent.com/ThoDierser/ze7_47im/main/README.md

查看全文

http://www.jsqmd.com/news/643111/