Stable Diffusion in Practice: From Installation to Image Generation

news 2026/7/23 19:14:24

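Preface

Stable Diffusion is one of the most popular open-source image generation models. It produces high-quality images from text descriptions and is widely used in creative design, game development, and related fields.

I have used Stable Diffusion in several projects, ranging from simple image generation to style transfer. This article is a complete hands-on guide based on that experience.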

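Environment Setup

    # Create and activate a virtual environment
    conda create -n sd python=3.10
    conda activate sd

    # Install PyTorch built against CUDA 12.1
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

    # Install the Diffusers stack
    pip install diffusers transformers accelerate safetensors

    # Gradio powers the web UI; datasets is needed for the fine-tuning section
    pip install gradio datasets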
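Before loading any model, it can help to confirm that the packages installed above are actually importable. The sketch below uses only the standard library; `missing_packages` and `REQUIRED` are hypothetical names chosen for illustration and are not part of any of the installed libraries.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Top-level import names of the packages installed in this section
REQUIRED = ["torch", "diffusers", "transformers", "accelerate", "safetensors", "gradio"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, the most likely cause is that the `sd` environment was not activated before running `pip install`.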

Basic Usage

Text-to-Image

    from diffusers import StableDiffusionPipeline
    import torch

    # Load the model in half precision and move it to the GPU
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16
    ).to("cuda")

    # Generate an image
    prompt = "a beautiful sunset over the ocean, golden hour, photorealistic"
    image = pipe(prompt).images[0]

    # Save the image
    image.save("sunset.png")
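The example above hard-codes `"cuda"` and `torch.float16`, so it fails on a machine without an NVIDIA GPU. One possible way to fall back gracefully is sketched below. The helper names `select_device_and_dtype` and `load_pipeline` are assumptions made for this illustration, and float32 on CPU is chosen because half precision is often slow or unsupported there.

```python
def select_device_and_dtype(cuda_available: bool):
    """Pick a device string and a torch dtype name for loading the pipeline."""
    if cuda_available:
        return "cuda", "float16"  # half precision roughly halves weight memory
    return "cpu", "float32"  # safer default when no GPU is present


def load_pipeline(model_id: str = "runwayml/stable-diffusion-v1-5"):
    """Load the pipeline on the best available device.

    Imports are deferred so the selection logic can be used without torch.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    device, dtype_name = select_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    )
    return pipe.to(device)
```

Calling `pipe = load_pipeline()` would then replace the loading step above. Expect CPU generation to be far slower than on a GPU.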

Controlling Generation Parameters

    from typing import Optional

    import torch
    from PIL import Image

    def generate_image(
        prompt: str,
        negative_prompt: Optional[str] = None,
        num_inference_steps: int = 50,
        guidance_scale: float = 7.5,
        seed: Optional[int] = None
    ) -> Image.Image:
        """Generate one image with the `pipe` object loaded earlier."""
        # Compare against None so that seed=0 is still honored
        generator = (
            torch.Generator("cuda").manual_seed(seed) if seed is not None else None
        )
        image = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
            generator=generator
        ).images[0]
        return image

    # Usage example
    image = generate_image(
        prompt="a cute cat playing with a ball",
        negative_prompt="ugly, blurry, low quality",
        num_inference_steps=30,
        guidance_scale=7.5,
        seed=42
    )
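Because a fixed seed keeps the starting noise identical, it is a convenient way to compare parameter settings side by side. The following sketch builds a small grid of settings. `build_sweep` is a hypothetical helper, and the step counts and guidance scales are arbitrary example values.

```python
from itertools import product


def build_sweep(prompt, steps_options, scale_options, seed=42):
    """Return one keyword-argument dict per (steps, guidance scale) pair."""
    return [
        {
            "prompt": prompt,
            "num_inference_steps": steps,
            "guidance_scale": scale,
            "seed": seed,  # same seed everywhere, so only the parameters differ
        }
        for steps, scale in product(steps_options, scale_options)
    ]


sweep = build_sweep("a cute cat playing with a ball", [20, 30, 50], [5.0, 7.5, 10.0])
print(len(sweep))  # 9 combinations
```

Each dict can be passed straight to the function above, for example `generate_image(**sweep[0])`, and the nine results saved under names that encode the settings.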

Advanced Techniques

Image-to-Image

    import torch
    from diffusers import StableDiffusionImg2ImgPipeline
    from PIL import Image

    # Load the image-to-image pipeline
    img2img_pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16
    ).to("cuda")

    # Load the input image and resize it to the model's native resolution
    init_image = Image.open("input.jpg").convert("RGB")
    init_image = init_image.resize((512, 512))

    # Generate; strength controls how far the result may drift from the input
    prompt = "turn this photo into a painting in the style of Van Gogh"
    image = img2img_pipe(
        prompt=prompt,
        image=init_image,
        strength=0.75
    ).images[0]

    image.save("output.png")
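Two details of the image-to-image call are easy to overlook. Resizing straight to 512x512 distorts any input that is not square, and `strength` also determines how many of the requested denoising steps actually run. The sketch below illustrates both points. `fit_size` and `effective_steps` are hypothetical helpers, and the step formula reflects my understanding of how Diffusers schedules img2img, so treat it as an approximation.

```python
def fit_size(width, height, target=512, multiple=8):
    """Scale (width, height) so the longer side is about `target`, keeping the
    aspect ratio and snapping both sides to a multiple of `multiple`.

    Stable Diffusion's VAE downsamples by 8, so both sides must divide by 8.
    """
    scale = target / max(width, height)

    def snap(value):
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


def effective_steps(num_inference_steps, strength):
    """Approximate number of denoising steps that img2img actually runs."""
    return min(int(num_inference_steps * strength), num_inference_steps)


print(fit_size(1024, 768))        # (512, 384)
print(effective_steps(50, 0.75))  # 37
```

With these helpers, `init_image.resize(fit_size(*init_image.size))` would replace the fixed `(512, 512)` resize. A lower `strength` keeps more of the input image and also finishes faster.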

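Depth Guidance

    import torch
    from diffusers import StableDiffusionDepth2ImgPipeline

    # Load the depth-conditioned model
    depth_pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-depth",
        torch_dtype=torch.float16
    ).to("cuda")

    # Guide generation with a depth map; init_image is the input image
    # loaded in the image-to-image example
    prompt = "a futuristic city skyline"
    image = depth_pipe(
        prompt=prompt,
        image=init_image,
        depth_map=None  # None: the depth map is estimated from the image
    ).images[0]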

Model Fine-Tuning

Preparing the Dataset

    from datasets import load_dataset

    # Load the dataset
    dataset = load_dataset("lambdalabs/pokemon-blip-captions")

    # Preprocess: convert every image to RGB at 512x512 and keep its caption
    def preprocess(examples):
        images = [
            image.convert("RGB").resize((512, 512))
            for image in examples["image"]
        ]
        return {"images": images, "captions": examples["text"]}

    dataset = dataset.map(preprocess, batched=True)
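The preprocessing above squashes every image to 512x512, which distorts non-square inputs. A common alternative is to crop the largest centered square first and resize afterwards. The sketch below only computes the crop box. `center_crop_box` is a hypothetical helper, and whether cropping suits a given dataset is a judgment call.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) for the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


print(center_crop_box(640, 480))  # (80, 0, 560, 480)
```

Inside `preprocess`, the box can be applied with Pillow as `image.crop(center_crop_box(*image.size)).resize((512, 512))`.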

Training Script

    from diffusers import StableDiffusionPipeline
    from diffusers.training_utils import set_seed

    # Fix the random seed for reproducibility
    set_seed(42)

    # Load the base model
    model_id = "runwayml/stable-diffusion-v1-5"
    pipe = StableDiffusionPipeline.from_pretrained(model_id)

    # Training hyperparameters
    training_args = {
        "output_dir": "./pokemon-model",
        "per_device_train_batch_size": 4,
        "gradient_accumulation_steps": 4,
        "learning_rate": 1e-5,
        "num_train_epochs": 10,
        "logging_steps": 10,
        "save_steps": 100
    }

    # Start training (simplified example: no `trainer` object is defined here,
    # so the training loop itself has to be supplied separately)
    # trainer.train()
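To see what the hyperparameters above imply, it helps to work out the effective batch size and the number of optimizer updates. The sketch below does that arithmetic. `training_schedule` is a hypothetical helper, and the dataset size of 800 is an assumed round number for illustration, not the real size of the dataset used in this article.

```python
import math


def training_schedule(num_examples, per_device_batch_size, grad_accum_steps,
                      num_epochs, num_devices=1):
    """Derive the effective batch size and the number of optimizer updates."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    # Batches seen per epoch across all devices
    batches_per_epoch = math.ceil(
        num_examples / (per_device_batch_size * num_devices)
    )
    # Gradients are accumulated, so only every n-th batch triggers an update
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return {
        "effective_batch_size": effective_batch,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": updates_per_epoch * num_epochs,
    }


plan = training_schedule(
    num_examples=800, per_device_batch_size=4, grad_accum_steps=4, num_epochs=10
)
print(plan)
# {'effective_batch_size': 16, 'updates_per_epoch': 50, 'total_updates': 500}
```

Under these assumptions, and if `save_steps` counts optimizer updates, `save_steps=100` would produce five checkpoints over the whole run.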

Web UI Deployment

    import gradio as gr

    def generate(prompt, negative_prompt, steps, scale):
        """Generate an image with the `pipe` object loaded earlier."""
        image = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=int(steps),  # the pipeline expects an integer
            guidance_scale=scale
        ).images[0]
        return image

    # Build the interface
    with gr.Blocks() as demo:
        gr.Markdown("# Stable Diffusion Demo")
        with gr.Row():
            with gr.Column():
                prompt = gr.Textbox(label="Prompt")
                negative_prompt = gr.Textbox(label="Negative Prompt")
                steps = gr.Slider(minimum=10, maximum=100, value=50, step=1, label="Steps")
                scale = gr.Slider(minimum=1, maximum=20, value=7.5, label="Guidance Scale")
                generate_btn = gr.Button("Generate")
            with gr.Column():
                output = gr.Image(label="Output")

        generate_btn.click(
            generate,
            inputs=[prompt, negative_prompt, steps, scale],
            outputs=output
        )

    demo.launch()
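Values coming from a web form are worth sanitizing before they reach the pipeline: the prompt may be blank, an empty negative prompt is better passed as `None`, and the step count may arrive as a float. One possible validation step is sketched below. `normalize_inputs` is a hypothetical helper, and its limits simply mirror the slider range used above.

```python
def normalize_inputs(prompt, negative_prompt, steps, scale,
                     min_steps=10, max_steps=100):
    """Clean up raw UI values before handing them to the pipeline."""
    prompt = (prompt or "").strip()
    if not prompt:
        raise ValueError("Prompt must not be empty.")
    # Treat a blank negative prompt as "not provided"
    negative_prompt = (negative_prompt or "").strip() or None
    # Round to an integer and clamp to the allowed range
    steps = max(min_steps, min(max_steps, int(round(steps))))
    return prompt, negative_prompt, steps, float(scale)


print(normalize_inputs("  a cute cat ", "", 30.0, 7))
# ('a cute cat', None, 30, 7.0)
```

In the Gradio handler, the `ValueError` could be caught and re-raised as `gr.Error` so that the message appears in the browser instead of only in the server log.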

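Common Issues

Out of GPU Memory

    # Option 1: attention slicing computes attention in chunks,
    # trading some speed for lower peak memory
    pipe.enable_attention_slicing()

    # Option 2: model CPU offload keeps each sub-model on the CPU
    # until it is needed (requires accelerate)
    pipe.enable_model_cpu_offload()

    # Option 3: generate fewer images per call (1 is the default)
    image = pipe(prompt, num_images_per_prompt=1).images[0]

    # Note: pipe.set_progress_bar_config(disable=True) only hides the
    # progress bar; it does not reduce memory use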
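Instead of enabling every memory-saving option up front, one approach is to try a normal run first and apply the options only when an out-of-memory error occurs. The sketch below shows that retry pattern with plain callables. `run_with_fallbacks` is a hypothetical helper, and it assumes the error surfaces as a `RuntimeError` whose message contains "out of memory", which may depend on the PyTorch version in use.

```python
def run_with_fallbacks(generate, fallbacks):
    """Call `generate()`; after each out-of-memory error, apply the next
    fallback and retry. Re-raise once the fallbacks are exhausted."""
    remaining = list(fallbacks)
    while True:
        try:
            return generate()
        except RuntimeError as err:
            if "out of memory" not in str(err).lower() or not remaining:
                raise
            # Apply the next memory-saving step, then loop to retry
            remaining.pop(0)()


# Usage with the pipeline from this article (not executed here):
# image = run_with_fallbacks(
#     lambda: pipe(prompt).images[0],
#     [pipe.enable_attention_slicing],
# )
```

Errors unrelated to memory are re-raised immediately, so genuine bugs are not hidden by the retry loop.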

Poor Output Quality

Tips for improving quality:

1. Use more inference steps.
2. Adjust guidance_scale.
3. Add a detailed negative prompt.
4. Use a stronger model, such as SDXL.
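To make the effect of `guidance_scale` concrete: at each denoising step the model predicts noise twice, once with the prompt and once without, and classifier-free guidance extrapolates from the unconditional prediction toward the text-conditioned one. The toy sketch below applies that formula to plain lists. The numbers are made up, and a real pipeline works on tensors rather than lists.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Classifier-free guidance: uncond + scale * (text - uncond), elementwise."""
    return [
        u + guidance_scale * (t - u)
        for u, t in zip(noise_uncond, noise_text)
    ]


uncond = [0.0, 1.0]  # toy prediction without the prompt
text = [1.0, 3.0]    # toy prediction with the prompt

print(apply_guidance(uncond, text, 1.0))  # [1.0, 3.0]
print(apply_guidance(uncond, text, 7.5))  # [7.5, 16.0]
```

A scale of 1 simply returns the text-conditioned prediction, while larger values exaggerate the difference. This is why very high scales follow the prompt more literally but can oversaturate the image or introduce artifacts.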

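Summary

Stable Diffusion is a powerful image generation tool:

Basic usage: simple text-to-image generation
Advanced techniques: image-to-image and depth guidance
Fine-tuning: adapting the model to a specific style or subject
Deployment: building a web application

Key takeaways:

Prompt quality directly affects the result
The negative prompt matters
Tuning parameters takes experience
A GPU with ample VRAM avoids the memory-saving workarounds and is noticeably faster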
