Python Notes: Multithreading Usage Examples

news 2026/7/1 1:26:37

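Multithreaded programming is a core skill in intermediate-to-advanced Python development. It lets a program make progress on several tasks at once and can markedly improve the throughput of I/O-bound applications. Working from real code examples, this article covers the usage, caveats, and best practices of Python multithreading, from the basics to more advanced topics.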

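1. Core Concepts and Advantages of Multithreading

What is multithreading?

Multithreading resembles running several programs at the same time. The threads of a process share its resources (such as memory and file handles), but each thread has its own CPU register context, including its instruction pointer and stack pointer.

Five advantages of multithreading:

Background processing: move time-consuming tasks (such as processing a large file) into the background so they do not block the main flow.
Better user experience: in a GUI program, a task triggered by a button click can show a progress bar while the interface stays responsive.
Faster execution: I/O-bound work speeds up because threads overlap their waiting time. In CPython, the GIL lets only one thread execute Python bytecode at a time, so pure-Python CPU-bound tasks do not run in parallel across cores.
Efficient waiting: while waiting on user input, file reads and writes, or network traffic, a thread gives up the CPU (and the GIL) so other threads can run.
Lightweight: threads are lighter than processes, with lower creation and context-switch costs.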
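
The "efficient waiting" advantage is easy to observe with a small timing sketch. Here `time.sleep` stands in for a blocking I/O call; the helper names `fake_io`, `run_sequential`, and `run_threaded` and the delay values are illustrative assumptions, not part of the original examples.

```python
import threading
import time


def fake_io(delay, results, index):
    """Hypothetical stand-in for a blocking I/O call."""
    time.sleep(delay)  # sleeping releases the GIL, as real I/O waits do
    results[index] = delay


def run_sequential(delays):
    """Run the waits one after another and return the elapsed seconds."""
    results = [None] * len(delays)
    start = time.perf_counter()
    for i, d in enumerate(delays):
        fake_io(d, results, i)
    return time.perf_counter() - start


def run_threaded(delays):
    """Run each wait in its own thread and return the elapsed seconds."""
    results = [None] * len(delays)
    threads = [
        threading.Thread(target=fake_io, args=(d, results, i))
        for i, d in enumerate(delays)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.2, 0.2, 0.2]
    print("sequential: %.2fs" % run_sequential(delays))
    print("threaded:   %.2fs" % run_threaded(delays))
```

The threaded run should take roughly as long as the single longest wait, while the sequential run takes about the sum of all waits.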

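2. Evolution of Python's Threading Modules

Version	Module	Status
Python 2	`thread`	Deprecated
Python 3	`_thread`	Low-level compatibility module
Python 3+	`threading`	Recommended

Note: the thread module was renamed to _thread in Python 3 and is kept only for backward compatibility. Production code should prefer the threading module.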

3. Basic Examples: Python 2 versus Python 3

Example 1: the Python 2 `thread` module

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import thread
import time


def print_time(threadName, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print "%s: %s" % (threadName, time.ctime(time.time()))


try:
    thread.start_new_thread(print_time, ("Thread-1", 2,))
    thread.start_new_thread(print_time, ("Thread-2", 4,))
except:
    print "Error: unable to start thread"

while 1:
    pass  # keep the main thread alive

Example 2: the Python 3 `_thread` module (compatibility style)

#!/usr/bin/python3
import _thread
import time


def print_time(threadName, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print("%s: %s" % (threadName, time.ctime(time.time())))


try:
    _thread.start_new_thread(print_time, ("Thread-1", 2,))
    _thread.start_new_thread(print_time, ("Thread-2", 4,))
except:
    print("Error: unable to start thread")

while 1:
    pass  # keep the main thread alive

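Key points:

Start a thread with start_new_thread(), passing the function and a tuple of its arguments.
The main thread must stay alive (via while 1 or time.sleep()); otherwise the child threads are terminated when it exits. Note that while 1: pass is a busy loop that burns CPU, so sleeping inside the loop is the gentler choice.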
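
Since `_thread` has no `join()`, one way to wait for the workers without a busy loop is to give each worker a lock that it releases when it finishes. The following is a minimal sketch of that idea; the names `worker` and `run_workers`, the shared `log` list, and the short delays are assumptions made for illustration.

```python
import _thread
import time


def worker(name, delay, done_lock, log):
    """Append to the log twice, then signal completion by releasing the lock."""
    try:
        for _ in range(2):
            time.sleep(delay)
            log.append(name)
    finally:
        done_lock.release()  # tell the main thread this worker is finished


def run_workers():
    log = []
    locks = []
    for name, delay in (("Thread-1", 0.05), ("Thread-2", 0.1)):
        lock = _thread.allocate_lock()
        lock.acquire()  # held until the worker releases it
        locks.append(lock)
        _thread.start_new_thread(worker, (name, delay, lock, log))
    for lock in locks:
        lock.acquire()  # blocks until the matching worker has finished
    return log


if __name__ == "__main__":
    print(run_workers())
```

This keeps the main thread blocked rather than spinning, and it returns as soon as the last worker is done.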

4. Recommended Approach: the `threading` Module

Example 3: subclassing `threading.Thread`

#!/usr/bin/python3
import threading
import time

exitFlag = 0


class myThread(threading.Thread):
    def __init__(self, threadID, name, counter):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter

    def run(self):  # override run()
        print("Starting thread: " + self.name)
        print_time(self.name, self.counter, 5)
        print("Exiting thread: " + self.name)


def print_time(threadName, delay, counter):
    while counter:
        if exitFlag:
            return  # threadName is a str with no exit(); returning ends the thread
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1


thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)
thread1.start()
thread2.start()
thread1.join()  # wait for the thread to finish
thread2.join()
print("Exiting main thread")

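Core methods:

start(): starts the thread, which then calls run() in the new thread.
join(): blocks the calling thread (here, the main thread) until the child thread has finished.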
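
Subclassing is not required to use start() and join(). As a minimal sketch, the same pattern can be written by passing a function through the `target` argument of `threading.Thread`; the `out` list and the short delays below are assumptions added so the result is easy to inspect.

```python
import threading
import time


def print_time(name, delay, count, out):
    """Record a timestamped line `count` times, sleeping `delay` between them."""
    for _ in range(count):
        time.sleep(delay)
        out.append("%s: %s" % (name, time.ctime()))


def main():
    out = []
    threads = [
        threading.Thread(target=print_time, args=("Thread-1", 0.01, 3, out), name="Thread-1"),
        threading.Thread(target=print_time, args=("Thread-2", 0.02, 3, out), name="Thread-2"),
    ]
    for t in threads:
        t.start()  # run print_time in a new thread
    for t in threads:
        t.join()   # wait until that thread has finished
    return out, [t.is_alive() for t in threads]


if __name__ == "__main__":
    lines, alive = main()
    print("\n".join(lines))
```

After every join() returns, is_alive() is False for each thread, so the main thread can safely use the collected results.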

5. Practical Case: Multithreaded Batch Processing in a Crawler

Example 4: multithreaded data parsing with `_thread`

from spider.dao.itemLinkDao import *
from spider.dao.categoryPageLinkDao import *
import json
from bs4 import BeautifulSoup
from datetime import datetime
import time
import re
import _thread


def parserawauto(begin, size):
    linkhead = "http://www.525.life/"
    linkend = "/mode_show?token=&user_key=&app_version=2.6.2.1"
    while 1:
        try:
            count = countNoDealedPageRaw()
            if count == 0:
                break
            raws = findNoDealedRawLimit(begin, size)
            for raw in raws:
                if raw['source'] == '食物库app':  # source label exactly as stored in the data
                    contentjson = json.loads(raw['content'])
                    for food in contentjson['foods']:
                        link = linkhead + food['code'] + linkend
                        insertItemLink(food['code'], food['name'], raw['link'], link, raw['type'], raw['source'])
                    dealCategoryPageRaw(raw['link'])
                else:
                    # naming the parser explicitly avoids bs4's "no parser specified" warning
                    soup = BeautifulSoup(raw['content'], "html.parser")
                    div = soup.find("div", class_="widget-food-list")
                    ul = div.find("ul", class_="food-list")
                    for box in ul.find_all("div", class_="text-box"):
                        node = box.find('a', href=re.compile(r'/shiwu/\w+'))
                        code = node['href'].replace("/shiwu/", "")
                        name = node['title']
                        link = linkhead + code + linkend
                        insertItemLink(code, name, raw['link'], link, raw['type'], raw['source'])
                    dealCategoryPageRaw(raw['link'])
                print("dealed %s %s %s" % (raw['source'], raw['type'], raw['link']))
        except Exception as e:
            print(e)
    # _thread discards a thread function's return value; kept for direct calls
    return "begin " + str(begin) + " finish " + str(datetime.now())


def run():
    # start 20 threads, each handling a different data offset
    for i in range(0, 2000, 100):
        try:
            _thread.start_new_thread(parserawauto, (i, 100))
        except Exception as e:
            print(e)
            print("Error: unable to start thread")


run()
while 1:  # keep the main thread running
    pass

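Design highlights:

Each thread handles the data at a different offset (begin), so the batches are processed concurrently.
The inner while 1 loop keeps picking up new data until no unprocessed records remain.
try-except catches exceptions so that a failure in one thread does not bring down the others.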
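
The offset-per-thread design can also be sketched with `concurrent.futures.ThreadPoolExecutor`, which handles thread start-up, waiting, and error propagation. This is an alternative technique, not the code used above. `process_batch` is a hypothetical stand-in for `parserawauto`, and an in-memory list replaces the database so the sketch runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_batch(records, begin, size):
    """Hypothetical stand-in for parserawauto: handle records[begin:begin + size]."""
    batch = records[begin:begin + size]
    return begin, len(batch)


def run_batches(records, size=100, max_workers=20):
    """Process fixed-size slices concurrently; return {offset: items handled}."""
    done = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(process_batch, records, begin, size)
            for begin in range(0, len(records), size)
        ]
        for future in as_completed(futures):
            try:
                begin, handled = future.result()  # re-raises the worker's exception, if any
                done[begin] = handled
            except Exception as exc:
                print("batch failed:", exc)
    return done


if __name__ == "__main__":
    print(run_batches(list(range(250)), size=100, max_workers=4))
```

Leaving the `with` block waits for all submitted batches, so no keep-alive loop is needed in the main thread, and `max_workers` caps the number of concurrent threads.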

6. Thread Synchronization: Locks

Example 5: mutual exclusion with `threading.Lock`

#!/usr/bin/python3
import threading
import time


class myThread(threading.Thread):
    def __init__(self, threadID, name, counter):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter

    def run(self):
        print("Starting thread: " + self.name)
        with threadLock:  # acquire the lock; it is released when the block exits
            print_time(self.name, self.counter, 3)


def print_time(threadName, delay, counter):
    while counter:
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1


threadLock = threading.Lock()
threads = []

thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)
thread1.start()
thread2.start()
threads.append(thread1)
threads.append(thread2)

for t in threads:
    t.join()
print("Exiting main thread")

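Output:
Thread-1 starts first, so it normally acquires the lock first; Thread-2 can run print_time only after Thread-1 releases it. The lock guarantees mutual exclusion, but it does not by itself guarantee which thread goes first.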
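
A lock matters most when several threads update the same value. The sketch below protects a read-modify-write on a shared counter; the `Counter` class and the `run` helper are illustrative names, not part of the original example.

```python
import threading


class Counter:
    """A counter whose increment is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread at a time may update value
            self.value += 1


def run(n_threads=4, n_increments=10000):
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run())
```

With the lock in place, the final value equals `n_threads * n_increments`; without it, concurrent updates could in principle be lost.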

7. Thread-Safe Task Queues

Example 6: managing tasks with `queue.Queue` (a FIFO queue; `queue.PriorityQueue` is the priority-ordered variant)

#!/usr/bin/python3
import queue
import threading
import time

exitFlag = 0


class myThread(threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q

    def run(self):
        print("Starting thread: " + self.name)
        process_data(self.name, self.q)
        print("Exiting thread: " + self.name)


def process_data(threadName, q):
    while not exitFlag:
        queueLock.acquire()
        if not q.empty():
            data = q.get()
            queueLock.release()
            print("%s processing %s" % (threadName, data))
        else:
            queueLock.release()
        time.sleep(1)


threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = queue.Queue(10)
threads = []
threadID = 1

# create the worker threads
for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1

# fill the task queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()

# wait until the queue has been drained
while not workQueue.empty():
    pass

exitFlag = 1  # tell the threads to exit

for t in threads:
    t.join()
print("Exiting main thread")

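Typical use cases:

Producer-consumer setups where the amount of work is not known in advance.
Crawler systems that need to cap the number of concurrent workers.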
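
`queue.Queue` is already thread-safe, so a producer-consumer setup can rely on its blocking `get()` together with `task_done()` and `join()` instead of an external lock, a polling loop, and an exit flag. The following is a minimal sketch of that approach. The sentinel object, the `worker` and `process_all` names, and the upper-casing "work" are assumptions made for illustration.

```python
import queue
import threading

_SENTINEL = object()  # hypothetical marker telling a worker to stop


def worker(q, results, lock):
    while True:
        item = q.get()  # blocks until an item is available
        try:
            if item is _SENTINEL:
                return
            with lock:  # protect the shared results list
                results.append(item.upper())
        finally:
            q.task_done()  # mark this item (or the sentinel) as handled


def process_all(words, n_workers=3):
    q = queue.Queue(maxsize=10)
    results = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(q, results, lock))
        for _ in range(n_workers)
    ]
    for t in workers:
        t.start()
    for word in words:
        q.put(word)       # blocks while the queue is full
    for _ in workers:
        q.put(_SENTINEL)  # one stop marker per worker
    q.join()              # wait until every queued item is marked done
    for t in workers:
        t.join()
    return results


if __name__ == "__main__":
    print(sorted(process_all(["One", "Two", "Three", "Four", "Five"])))
```

The number of workers caps concurrency, and `maxsize` applies back-pressure to the producer when the consumers fall behind.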

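8. Common Problems and Pitfalls

Problem 1: `Unhandled exception in thread started by`

Cause: the main thread ends early, so child threads started with _thread are terminated while still running.
Solution: make sure the main thread waits for all child threads to finish.

# Option 1: use join()
thread1.join()
thread2.join()

# Option 2: keep the main thread running
while True:
    time.sleep(1)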

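Problem 2: the GIL limits CPU-bound tasks

In CPython, the Global Interpreter Lock (GIL) prevents multiple threads from executing CPU-bound Python code in parallel. For such workloads, use the multiprocessing module instead.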
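
As a rough sketch of the multiprocessing route, the functions below spread a CPU-bound toy task across worker processes with `multiprocessing.Pool`. The prime-counting task and the names `count_primes` and `count_primes_parallel` are assumptions chosen for illustration.

```python
import multiprocessing


def count_primes(limit):
    """CPU-bound toy task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, processes=2):
    """Evaluate count_primes for each limit in a separate worker process."""
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)
```

Call `count_primes_parallel` from under an `if __name__ == "__main__":` guard in a script, since some platforms start workers by re-importing the main module. Each worker process has its own interpreter and its own GIL, which is what allows the tasks to run on separate cores.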

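Problem 3: deadlock

A deadlock occurs when threads wait on each other to release resources and none of them can proceed.
Prevention: manage locks with the with statement so they are always released, even when an exception is raised. Use threading.RLock (a re-entrant lock) when the same thread may need to acquire a lock it already holds. When several locks are involved, acquire them in the same order in every thread.

import threading

lock = threading.Lock()

with lock:  # acquired on entry, released on exit
    critical_section()  # placeholder for the code that touches shared state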
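
The two deadlock situations that locks most often cause can be sketched as follows: a thread re-acquiring a lock it already holds, and two threads taking two locks in opposite order. The `Account` class and `transfer` function are hypothetical examples, and ordering locks by `id()` is just one simple way to obtain a consistent order.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.RLock()  # re-entrant: the owning thread may acquire it again

    def deposit(self, amount):
        with self.lock:
            self.balance += amount

    def deposit_twice(self, amount):
        with self.lock:  # deposit() re-acquires the same lock; a plain Lock would hang here
            self.deposit(amount)
            self.deposit(amount)


def transfer(src, dst, amount):
    # Always take the two locks in one fixed order (here: by id()), so two
    # opposite transfers cannot each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


if __name__ == "__main__":
    a, b = Account(100), Account(100)
    t1 = threading.Thread(target=lambda: [transfer(a, b, 1) for _ in range(1000)])
    t2 = threading.Thread(target=lambda: [transfer(b, a, 1) for _ in range(1000)])
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(a.balance, b.balance)
```

RLock addresses only the first situation; the consistent acquisition order is what protects against the second.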

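9. Performance Comparison and Selection Guide

Scenario	Recommended approach	Rationale
I/O-bound (web crawlers)	`threading` + queue	Low thread-switching cost, good concurrency
CPU-bound (computation)	`multiprocessing`	Sidesteps the GIL and uses multiple cores
High-concurrency asynchronous tasks	`asyncio`	Single-threaded coroutines, lighter still
Simple background tasks	`_thread` or `threading`	Quick to implement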
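
For comparison with the thread-based examples, here is a very small sketch of the `asyncio` option from the table. `asyncio.sleep` stands in for a non-blocking network call, and the `fetch` coroutine and its arguments are hypothetical.

```python
import asyncio


async def fetch(name, delay):
    """Hypothetical stand-in for a non-blocking network request."""
    await asyncio.sleep(delay)  # yields control to other coroutines while waiting
    return name


async def main():
    # Both coroutines wait concurrently on a single thread;
    # gather() returns results in the order the awaitables were passed.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

No locks are needed here because only one coroutine runs at a time and switches happen only at `await` points.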