
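A Must-Read for CTF Beginners: A Step-by-Step Guide to Batch-Processing 36 QR Code Fragments with a Python Script (BUUCTF Anxun Cup Challenge Review)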

news 2026/6/21 7:31:10

CTF in Practice: Advanced Techniques for Automating QR Code Fragment Processing with Python

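Introduction

The first time I took part in a CTF competition, one challenge stuck with me: 36 files without extensions were scattered in a folder, and each had to be renamed to .jpg by hand before its contents could be viewed. Repetitive work like this is slow and error-prone. That experience made me realize that for CTF players, scripting skills are not a nice-to-have but a core competency.

This article starts from a real CTF challenge (the BUUCTF Anxun Cup (安洵杯) challenge "吹着贝斯扫二维码", roughly "Scanning a QR Code While Playing the Bass") and builds a complete file-processing automation workflow from scratch. Rather than simply copying and pasting scripts, we will look at how to design a reusable CTF toolchain, so that you can handle similar "file fragment" and "steganography" challenges with ease.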

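1. Understanding the challenge and analyzing the files

1.1 Initial file structure

After downloading the challenge attachment from the competition platform and extracting it, we typically find the following kinds of files:

Raw files without extensions (36 of them)
An encrypted ZIP archive (flag.zip)
Sometimes a hint text or comment file

Checking the extensionless files with the file command or a hex editor shows that they are actually JPEG images: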

$ file fragment_01
fragment_01: JPEG image data, JFIF standard 1.01

Key identifying features:

JPEG file header: FF D8 FF
JPEG file trailer: FF D9
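
As a minimal sketch of the signature check described above, the helper below tests a byte string against the JPEG start and end markers. The function name `looks_like_jpeg` and the sample bytes are illustrative and not part of the challenge materials.

```python
JPEG_SOI = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
JPEG_EOI = b"\xff\xd9"      # end-of-image marker


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if data starts and ends with the JPEG markers."""
    return data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI)


# Hypothetical sample: markers around some filler bytes, not a decodable image.
sample = JPEG_SOI + b"\xe0" + b"\x00" * 16 + JPEG_EOI
print(looks_like_jpeg(sample))         # True
print(looks_like_jpeg(b"PK\x03\x04"))  # False
```

In practice the bytes would come from `Path(name).read_bytes()` on one of the extensionless files.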

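1.2 Characteristics of the QR code fragments

These image fragments share the following typical traits:

Each fragment is one part of a complete QR code
Fragments may overlap or carry clues about how they join
The assembled QR code usually contains information that is key to solving the challenge
The fragment file names may hint at the stitching order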

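2. Automating file handling with Python

2.1 A basic file renaming script

The most basic solution is to add a file extension in bulk. Here is an improved version of the script: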

from pathlib import Path


def batch_rename_files(directory, new_extension='.jpg'):
    """Add an extension to every file that has none."""
    # Snapshot the listing first so renames do not disturb the iteration
    for item in list(Path(directory).iterdir()):
        if item.is_file() and not item.suffix:
            new_name = item.with_suffix(new_extension)
            item.rename(new_name)
            print(f"Renamed: {item.name} -> {new_name.name}")


# Example usage
batch_rename_files('/path/to/ctf/files')

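Compared with the original script, this version has several important improvements:

It uses pathlib instead of the os module, which makes path handling safer
It checks that each entry is a regular file without an extension, to avoid accidental renames
It prints a log line for every operation
It is wrapped in a function so it can be reused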

2.2 Advanced file handling techniques

In a real CTF competition we may need more complex file operations:

Filtering and classifying files

from pathlib import Path

import magic  # python-magic library


def classify_files_by_magic(directory):
    """Identify files by magic number and give them a matching extension."""
    mime = magic.Magic(mime=True)
    ext_by_mime = {
        'image/jpeg': '.jpg',
        'image/png': '.png',
        'application/zip': '.zip',
    }
    # Snapshot the listing first so renames do not disturb the iteration
    for file in list(Path(directory).iterdir()):
        if file.is_file():
            file_type = mime.from_file(str(file))
            new_ext = ext_by_mime.get(file_type, '')
            if new_ext and file.suffix != new_ext:
                file.rename(file.with_suffix(new_ext))

Validating file contents

from pathlib import Path


def validate_jpeg_files(directory):
    """Check that each .jpg file has the JPEG header and trailer."""
    for jpeg_file in Path(directory).glob('*.jpg'):
        data = jpeg_file.read_bytes()
        if not data.startswith(b'\xff\xd8\xff'):
            print(f"Invalid JPEG header: {jpeg_file}")
        if not data.endswith(b'\xff\xd9'):
            print(f"Invalid JPEG footer: {jpeg_file}")
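
A failed footer check does not always mean the image is corrupt: extra bytes may simply have been appended after the end marker. The sketch below returns whatever follows the last FF D9 so it can be inspected. It assumes the appended data does not itself contain FF D9, and the name `trailing_bytes_after_eoi` and the sample bytes are made up for illustration.

```python
def trailing_bytes_after_eoi(data: bytes) -> bytes:
    """Return the bytes that follow the last FF D9 marker (empty if none)."""
    end = data.rfind(b"\xff\xd9")
    if end == -1:
        return b""  # no end marker at all, so there is nothing to report
    return data[end + 2:]


# Hypothetical sample: a fake JPEG body with two bytes appended after the marker.
fake = b"\xff\xd8\xff" + b"\x00" * 8 + b"\xff\xd9" + b"42"
print(trailing_bytes_after_eoi(fake))  # b'42'
```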

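3. QR code processing and automated stitching

3.1 Analyzing the QR code fragments

The 36 QR code fragments are typically arranged in a 6x6 grid. The following features help determine the arrangement:

The QR code finder patterns (the squares in three of the corners)
Partial data regions inside the fragments
An order implied by the file names (if any)

3.2 Automated stitching with Python

Stitching 36 fragments by hand in Photoshop is very inefficient. Here is an automated approach using Python and Pillow: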

from pathlib import Path

from PIL import Image


def merge_qr_fragments(fragments_dir, rows=6, cols=6):
    """Stitch the QR code fragments into one image."""
    # Lexicographic order; assumes the file names already sort into grid order
    fragments = sorted(Path(fragments_dir).glob('*.jpg'))
    # Take the tile size from the first fragment
    sample = Image.open(fragments[0])
    frag_width, frag_height = sample.size
    # Create a blank canvas
    merged = Image.new('RGB', (cols * frag_width, rows * frag_height))
    # Paste the fragments row by row
    for i, frag in enumerate(fragments):
        row = i // cols
        col = i % cols
        box = (col * frag_width, row * frag_height,
               (col + 1) * frag_width, (row + 1) * frag_height)
        merged.paste(Image.open(frag), box)
    merged.save('merged_qr.jpg')
    return merged
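
One caveat with calling sorted() on paths: plain lexicographic order places 10.jpg before 2.jpg. If the fragments happen to be numbered without zero padding (an assumption, since the article does not show the actual file names), a natural-sort key along these lines keeps them in numeric order. The name `natural_key` is illustrative.

```python
import re
from pathlib import Path


def natural_key(path: Path):
    """Split the file name into text and integer chunks so 2 sorts before 10."""
    parts = re.split(r"(\d+)", path.name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


# Hypothetical file names for demonstration only.
names = [Path("10.jpg"), Path("2.jpg"), Path("1.jpg")]
ordered = sorted(names, key=natural_key)
print([p.name for p in ordered])  # ['1.jpg', '2.jpg', '10.jpg']
```

Passing `key=natural_key` to the sorted() call in the stitching function would be one way to apply it.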

3.3 Scanning the QR code and extracting its content

Once stitching is complete, we can decode the QR code directly in Python:

from PIL import Image
from pyzbar.pyzbar import decode


def decode_qr(image_path):
    """Decode the QR code in an image and return its text, or None."""
    result = decode(Image.open(image_path))
    if result:
        return result[0].data.decode('utf-8')
    return None


qr_content = decode_qr('merged_qr.jpg')
print(f"QR Code Content: {qr_content}")

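4. Password recovery and data decoding

4.1 Parsing multi-layer encoding

The string obtained from the QR code has usually passed through several layers of encoding. We need to undo them in the reverse of the order in which they were applied: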

import base64
import codecs
from base64 import b85decode, b32decode, b16decode


def decode_ctf_string(encoded_str):
    """Decode a CTF string wrapped in several encoding layers."""
    # Note: b85decode uses the git-style Base85 alphabet; data encoded
    # as Ascii85 would need base64.a85decode instead.
    # Base32 decode
    step1 = b32decode(encoded_str).decode('utf-8')
    # Base16 decode
    step2 = b16decode(step1.upper()).decode('utf-8')
    # ROT13 decode
    step3 = codecs.decode(step2, 'rot13')
    # Base85 decode
    step4 = b85decode(step3.encode()).decode('utf-8')
    # Base64 decode
    step5 = base64.b64decode(step4).decode('utf-8')
    # Final Base85 decode
    step6 = b85decode(step5.encode()).decode('utf-8')
    return step6


secret = decode_ctf_string(qr_content)
print(f"Decoded Secret: {secret}")
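
To see why the layers have to be undone in reverse, here is a small round-trip sketch using only the standard library. The encoding order in `encode_layers` is an assumption that simply mirrors the decoding steps listed above, and the sample string is made up.

```python
import base64
import codecs


def encode_layers(plain: str) -> str:
    """Apply the layers: Base85, Base64, Base85, ROT13, Base16, Base32."""
    s = base64.b85encode(plain.encode()).decode()
    s = base64.b64encode(s.encode()).decode()
    s = base64.b85encode(s.encode()).decode()
    s = codecs.encode(s, "rot13")
    s = base64.b16encode(s.encode()).decode()
    return base64.b32encode(s.encode()).decode()


def decode_layers(encoded: str) -> str:
    """Undo the layers in reverse: Base32, Base16, ROT13, Base85, Base64, Base85."""
    s = base64.b32decode(encoded).decode()
    s = base64.b16decode(s).decode()
    s = codecs.decode(s, "rot13")
    s = base64.b85decode(s.encode()).decode()
    s = base64.b64decode(s).decode()
    return base64.b85decode(s.encode()).decode()


# Hypothetical sample value; the round trip should return it unchanged.
wrapped = encode_layers("demo-password")
assert decode_layers(wrapped) == "demo-password"
```

Swapping any two steps in `decode_layers` typically ends in a decoding error or garbage output, which is a quick way to tell that the guessed layer order is wrong.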

4.2 Cracking the ZIP archive

Once we have the password, we can extract the ZIP file automatically with Python:

import zipfile


def extract_zip(zip_path, password, extract_to='.'):
    """Extract a ZIP file using the given password."""
    try:
        with zipfile.ZipFile(zip_path) as zf:
            # pwd must be bytes, so the string password is encoded
            zf.extractall(extract_to, pwd=password.encode())
        print("Extraction successful!")
        return True
    except Exception as e:
        print(f"Extraction failed: {e}")
        return False


extract_zip('flag.zip', secret)
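
Before supplying a password, it can help to see which entries the archive actually marks as encrypted. The sketch below reads bit 0 of each entry's general-purpose flags through `ZipInfo.flag_bits`. The helper name `encrypted_entries` and the in-memory demo archive are illustrative assumptions, not part of the challenge.

```python
import io
import zipfile


def encrypted_entries(zip_source) -> list:
    """Return the names of entries whose encryption flag (bit 0) is set."""
    with zipfile.ZipFile(zip_source) as zf:
        return [info.filename for info in zf.infolist() if info.flag_bits & 0x1]


# Build a small unencrypted archive in memory to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("flag.txt", "placeholder")
buffer.seek(0)
print(encrypted_entries(buffer))  # []
```

Against the challenge archive the call would be `encrypted_entries('flag.zip')`. Entries listed there need the password when extracting.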

5. Building a reusable CTF tool library

5.1 Designing a general-purpose file processor

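from pathlib import Path


class CTFFileProcessor:
    """Utility class for processing CTF files."""

    def __init__(self, work_dir):
        self.work_dir = Path(work_dir)

    def batch_change_ext(self, new_ext='.jpg'):
        """Add an extension to every file that has none."""
        for f in list(self.work_dir.iterdir()):
            if f.is_file() and not f.suffix:
                f.rename(f.with_suffix(new_ext))

    def merge_images(self, output='merged.jpg', grid=(6, 6), ext='.jpg'):
        """Merge image fragments."""
        # ext was undefined in the original; it is now an explicit parameter
        images = sorted(self.work_dir.glob(f'*{ext}'))
        # ...merge logic same as above...

    def find_file_by_header(self, magic_bytes):
        """Find the first file whose header matches magic_bytes."""
        for f in self.work_dir.iterdir():
            if not f.is_file():
                continue  # skip directories, which cannot be opened for reading
            with open(f, 'rb') as fd:
                if fd.read(len(magic_bytes)) == magic_bytes:
                    return f
        return None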

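5.2 Common CTF file type identification table

File type	Magic number (file header)	Common extension	Typical CTF use
JPEG	FF D8 FF	.jpg	Image steganography, QR codes
PNG	89 50 4E 47	.png	Image steganography, LSB steganography
ZIP	50 4B 03 04	.zip	Encrypted archives, pseudo-encryption
PDF	25 50 44 46	.pdf	Document steganography, metadata analysis
ELF	7F 45 4C 46	none/custom	Reverse engineering, binary analysis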
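
The table translates directly into a lookup. This is a minimal sketch using only the five signatures listed above. `MAGIC_TABLE` and `identify` are illustrative names, and a real tool would need a much longer table.

```python
from typing import Optional

# Signatures copied from the table above, written as byte strings.
MAGIC_TABLE = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG": "PNG",       # 89 50 4E 47
    b"PK\x03\x04": "ZIP",    # 50 4B 03 04
    b"%PDF": "PDF",          # 25 50 44 46
    b"\x7fELF": "ELF",       # 7F 45 4C 46
}


def identify(data: bytes) -> Optional[str]:
    """Return the file type whose signature prefixes data, or None."""
    for magic, name in MAGIC_TABLE.items():
        if data.startswith(magic):
            return name
    return None


print(identify(b"PK\x03\x04rest-of-archive"))  # ZIP
print(identify(b"plain text"))                 # None
```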

5.3 Error handling and logging

A well-built CTF tool should include robust error handling:

import functools
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    filename='ctf_tool.log'
)


def safe_file_op(func):
    """Decorator that adds error handling to file operations."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except PermissionError as e:
            logging.error(f"Permission denied: {e}")
        except FileNotFoundError as e:
            logging.error(f"File not found: {e}")
        except Exception as e:
            logging.error(f"Unexpected error: {e}")
        # Falls through and returns None when an error was logged
    return wrapper

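6. Advanced tips and practical advice

6.1 Handling large file sets efficiently

When facing hundreds or even thousands of file fragments:

Multiprocessing: use multiprocessing to speed up file operations
Memory mapping: use mmap on large files to reduce memory usage
Incremental processing: handle files in batches to avoid running out of memory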

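from multiprocessing import Pool
from pathlib import Path


def parallel_rename(args):
    """Rename one file; runs inside a worker process."""
    old, new = args
    try:
        Path(old).rename(new)
        return True
    except Exception:
        return False


if __name__ == '__main__':  # guard required where workers are spawned
    # Only extensionless regular files, so the script itself is left alone
    tasks = [(f, f.with_suffix('.jpg'))
             for f in Path('.').iterdir()
             if f.is_file() and not f.suffix]
    with Pool(4) as p:  # 4 worker processes
        results = p.map(parallel_rename, tasks)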
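
The memory-mapping item from the list above can be sketched as follows: the header and trailer of a file are checked through mmap, so the whole file never has to be read into a Python bytes object. The helper name `jpeg_markers_ok` and the temporary demo file are assumptions made for illustration.

```python
import mmap
import os
import tempfile


def jpeg_markers_ok(path) -> bool:
    """Check the JPEG header and trailer through a read-only memory map."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size < 5:
            return False  # too small to hold both markers; mmap also rejects empty files
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[:3] == b"\xff\xd8\xff" and mm[-2:] == b"\xff\xd9"


# Demo on a throwaway file containing only the markers and filler bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xff\xd8\xff\xe0" + b"\x00" * 32 + b"\xff\xd9")
result = jpeg_markers_ok(tmp.name)
os.remove(tmp.name)
print(result)  # True
```

For 36 small fragments this makes no practical difference; the benefit only shows up with files large enough that reading them whole would be wasteful.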

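6.2 Special techniques for QR codes

Damaged QR codes: use the qrtools library to attempt to read incomplete QR codes
Color inversion: some QR codes use an inverted color scheme
Multiple QR codes: a single image may contain several overlapping QR codes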

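from PIL import Image, ImageOps


def repair_qr(image_path):
    """Try to read a damaged QR code, retrying with inverted colors."""
    from qrtools import QR
    qr = QR(filename=str(image_path))
    if qr.decode():
        return qr.data
    # Retry with inverted colors; convert to RGB first because
    # ImageOps.invert does not accept every image mode
    inverted = ImageOps.invert(Image.open(image_path).convert('RGB'))
    inverted.save('temp_inverted.jpg')
    qr = QR(filename='temp_inverted.jpg')
    if qr.decode():
        return qr.data
    return None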