
TensorFlow Study Notes: An Optimizer Comparison Experiment

news 2026/7/23 6:30:41

🍨 This post is a learning-log entry from the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp)
🍖 Original author: K同学啊

I. Basic Setup and Data Loading

import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
import pathlib
from datetime import datetime
from matplotlib.ticker import MultipleLocator

# 1. Check for available GPUs
gpus = tf.config.list_physical_devices('GPU')
print("Found GPUs:", gpus)

# 2. Set the data path
data_dir = pathlib.Path('./T11_data')

# 3. Load the data (80% training, 20% validation, same seed for both subsets)
img_height = 224
img_width = 224
batch_size = 32

train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

class_names = train_ds.class_names
num_classes = len(class_names)
print(f"Target classes: {class_names}")

# Speed up the input pipeline: cache decoded images and prefetch batches
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
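For reference, image_dataset_from_directory infers the labels from the subdirectory names under data_dir and, unless class_names is passed explicitly, orders them alphanumerically; each image's integer label is the index of its folder in that order. The sketch below mimics only that ordering step with the standard library. The folder names are hypothetical placeholders, not the actual contents of T11_data, and infer_class_names is an illustrative helper rather than a Keras function.

```python
import pathlib
import tempfile


def infer_class_names(data_dir):
    """Return the subdirectory names in sorted order (illustrative helper)."""
    return sorted(p.name for p in pathlib.Path(data_dir).iterdir() if p.is_dir())


def demo():
    # Hypothetical class folders, created inside a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ["class_b", "class_a", "class_c"]:
            (pathlib.Path(tmp) / name).mkdir()
        names = infer_class_names(tmp)
    # Integer label = position of the folder name in the sorted list.
    label_of = {name: idx for idx, name in enumerate(names)}
    return names, label_of


names, label_of = demo()
print(names)
print(label_of)
```

Printing class_names right after loading, as the code above does, is a cheap way to confirm that the label order matches what you expect before training.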

II. Defining a Shared CNN Architecture

def create_model():
    model = models.Sequential([
        # Scale pixel values from [0, 255] to [0, 1]
        layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),

        layers.Conv2D(16, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(),

        layers.Dropout(0.2),  # Dropout layer to reduce overfitting
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dense(num_classes)  # No softmax: the model outputs raw logits
    ])
    return model
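To get a feel for the size of this network, the shapes and parameter counts can be worked out by hand. The sketch below assumes the layer settings shown above (3x3 kernels, padding='same', default 2x2 max pooling, 224x224 RGB input). The value num_classes=2 is only a placeholder, since the source does not state how many classes T11_data contains.

```python
def conv2d_params(kernel, in_ch, out_ch):
    # One kernel x kernel x in_ch filter per output channel, plus one bias each.
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    # Full weight matrix plus one bias per output unit.
    return in_units * out_units + out_units


def count_params(num_classes, size=224):
    channels = [3, 16, 32, 64]
    total = 0
    for in_ch, out_ch in zip(channels, channels[1:]):
        total += conv2d_params(3, in_ch, out_ch)  # padding='same' keeps H x W
        size //= 2                                # MaxPooling2D() halves H and W
    flat = size * size * channels[-1]             # units after Flatten
    total += dense_params(flat, 128)
    total += dense_params(128, num_classes)
    return flat, total


# num_classes=2 is a placeholder assumption, not a value from the source.
flat, total = count_params(num_classes=2)
print(flat)   # 50176
print(total)  # 6446498
```

Almost all parameters sit in the first Dense layer (50176 x 128 weights), so the three convolutional layers contribute well under 1% of the total. Rescaling, MaxPooling2D, Dropout, and Flatten have no trainable parameters. You can cross-check the figures against model.summary().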

III. Optimizer Comparison Training (Adam vs SGD)

epochs = 20

# --- Model 1: Adam optimizer ---
print("\n--- Training model 1: Adam optimizer ---")
model_adam = create_model()
model_adam.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
history_adam = model_adam.fit(train_ds, validation_data=val_ds, epochs=epochs)

# --- Model 2: SGD optimizer ---
print("\n--- Training model 2: SGD optimizer ---")
model_sgd = create_model()
model_sgd.compile(
    optimizer='sgd',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
history_sgd = model_sgd.fit(train_ds, validation_data=val_ds, epochs=epochs)
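To make the difference between the two optimizers concrete, here is a minimal pure-Python sketch of their textbook update rules. It uses the Keras default learning rates (0.01 for SGD, 0.001 for Adam); Keras's internal implementation differs in small numerical details. The toy loss, whose two parameters have very different gradient scales, is an assumption chosen for illustration and has nothing to do with the CNN above.

```python
import math


def sgd_step(w, grad, lr=0.01):
    # Plain SGD: the step is proportional to the raw gradient.
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    # Exponential moving averages of the gradient and the squared gradient.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, grad)]
    # Bias correction for the zero-initialised averages (t starts at 1).
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Each parameter's step is rescaled by its own gradient magnitude.
    w = [wi - lr * mh / (math.sqrt(vh) + eps)
         for wi, mh, vh in zip(w, m_hat, v_hat)]
    return w, m, v


def toy_grad(w):
    # Gradient of 0.5 * (100 * w0**2 + 0.01 * w1**2): steep in w0, flat in w1.
    return [100.0 * w[0], 0.01 * w[1]]


w0 = [1.0, 1.0]
w_sgd = sgd_step(w0, toy_grad(w0))
w_adam, _, _ = adam_step(w0, toy_grad(w0), m=[0.0, 0.0], v=[0.0, 0.0], t=1)

print("SGD  step sizes:", [a - b for a, b in zip(w0, w_sgd)])
print("Adam step sizes:", [a - b for a, b in zip(w0, w_adam)])
```

After one step, SGD moves the steep parameter by 1.0 and the flat one by only 0.0001, whereas Adam moves both by roughly its learning rate (about 0.001). This per-parameter rescaling is one commonly cited reason why Adam often makes faster early progress than plain SGD without momentum.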

IV. Visualizing the Training Results

current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

# Extract the per-epoch history
acc_adam = history_adam.history['accuracy']
val_acc_adam = history_adam.history['val_accuracy']
loss_adam = history_adam.history['loss']
val_loss_adam = history_adam.history['val_loss']

acc_sgd = history_sgd.history['accuracy']
val_acc_sgd = history_sgd.history['val_accuracy']
loss_sgd = history_sgd.history['loss']
val_loss_sgd = history_sgd.history['val_loss']

epochs_range = range(epochs)

# Render the figure at high resolution
plt.rcParams['figure.dpi'] = 150
plt.figure(figsize=(16, 6))

# Accuracy comparison
ax1 = plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc_adam, label='Training Accuracy - Adam')
plt.plot(epochs_range, acc_sgd, label='Training Accuracy - SGD')
plt.plot(epochs_range, val_acc_adam, label='Validation Accuracy - Adam')
plt.plot(epochs_range, val_acc_sgd, label='Validation Accuracy - SGD')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy (Adam vs SGD)')
plt.xlabel(f'Epochs\nTimestamp:{current_time}')
ax1.xaxis.set_major_locator(MultipleLocator(2))

# Loss comparison
ax2 = plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss_adam, label='Training Loss - Adam')
plt.plot(epochs_range, loss_sgd, label='Training Loss - SGD')
plt.plot(epochs_range, val_loss_adam, label='Validation Loss - Adam')
plt.plot(epochs_range, val_loss_sgd, label='Validation Loss - SGD')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss (Adam vs SGD)')
plt.xlabel(f'Epochs\nTimestamp:{current_time}')
ax2.xaxis.set_major_locator(MultipleLocator(2))

plt.tight_layout()
plt.show()
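Alongside the plots, it can help to reduce each history to a few numbers. The helper below is a hypothetical sketch (summarize_history is not part of Keras). It takes a dict shaped like history.history and reports the epoch with the best validation accuracy and the final train/validation gap. The numbers in the example are made-up placeholders, not results from this experiment.

```python
def summarize_history(history):
    """Summarize a dict shaped like Keras's History.history."""
    val_acc = history["val_accuracy"]
    # Index of the highest validation accuracy (first one wins on ties).
    best_idx = max(range(len(val_acc)), key=val_acc.__getitem__)
    return {
        "best_epoch": best_idx + 1,  # 1-based, as printed by fit()
        "best_val_accuracy": val_acc[best_idx],
        # A large positive gap at the end suggests overfitting.
        "final_gap": history["accuracy"][-1] - val_acc[-1],
    }


# Placeholder values for illustration only, not measured results.
fake_history = {
    "accuracy": [0.5, 0.7, 0.9],
    "val_accuracy": [0.6, 0.8, 0.75],
}
summary = summarize_history(fake_history)
print(summary)
```

In the experiment itself it would be called as summarize_history(history_adam.history) and summarize_history(history_sgd.history).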

V. Final Evaluation on the Validation Set

print("\n--- Final evaluation results of the optimizer comparison ---")

print("Model with Adam Optimizer:")
model_adam.evaluate(val_ds, verbose=2)

print("\nModel with SGD Optimizer:")
model_sgd.evaluate(val_ds, verbose=2)
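evaluate reports two numbers here: the loss and the accuracy. Because the last Dense layer has no softmax and the loss is built with from_logits=True, both are computed from raw logits. The pure-Python sketch below shows the underlying arithmetic for a tiny batch. The logits and labels are invented for illustration, and Keras's own computation (batching, averaging, numerical safeguards) may differ in detail.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_ce_from_logits(logits, label):
    # Negative log-probability assigned to the true class index.
    return -math.log(softmax(logits)[label])


def evaluate_batch(batch_logits, labels):
    losses = [sparse_ce_from_logits(l, y) for l, y in zip(batch_logits, labels)]
    # A prediction is correct when the largest logit sits at the true index.
    correct = [max(range(len(l)), key=l.__getitem__) == y
               for l, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses), sum(correct) / len(correct)


# Invented two-class example: the first sample is right, the second is wrong.
loss, acc = evaluate_batch([[2.0, 0.0], [0.0, 1.0]], labels=[0, 0])
print(f"loss={loss:.4f} accuracy={acc:.2f}")
```

To turn the model's output into class probabilities at prediction time, apply a softmax to the logits in the same way (in TensorFlow, tf.nn.softmax).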

VI. Summary

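This experiment investigates how the choice of optimizer affects the training speed, convergence behavior, and final performance of a deep neural network.
To keep the comparison rigorous, it follows a controlled-variable design: the network construction is wrapped in a create_model() function, and with the dataset, the network structure (three convolutional layers plus Dropout), and the number of training epochs (epochs=20) held identical, only the optimizer passed to compile() changes.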
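One caveat on the controlled-variable design: the string identifiers 'adam' and 'sgd' resolve to Keras's default hyperparameters, so the two runs also differ in learning rate (0.001 vs 0.01), and SGD runs without momentum. In addition, the two create_model() calls are not seeded, so the models may start from different initial weights. The sketch below is one possible way to make these choices explicit. compile_with is a hypothetical helper, and it assumes a TensorFlow version that provides tf.keras.utils.set_random_seed (2.7 or later).

```python
# Defaults that the string identifiers 'adam' and 'sgd' resolve to in Keras.
OPTIMIZER_DEFAULTS = {
    "adam": {"learning_rate": 0.001, "beta_1": 0.9, "beta_2": 0.999,
             "epsilon": 1e-7},
    "sgd": {"learning_rate": 0.01, "momentum": 0.0},
}


def compile_with(name, create_model, seed=123):
    """Build and compile a model with an explicitly configured optimizer.

    Hypothetical helper: `create_model` is the function defined in section II.
    """
    import tensorflow as tf  # imported lazily so the config above stays usable

    # Re-seeding before each build is intended to give both models the same
    # initial weights (assumption: weights are drawn from the global seed).
    tf.keras.utils.set_random_seed(seed)
    model = create_model()

    optimizer_cls = {"adam": tf.keras.optimizers.Adam,
                     "sgd": tf.keras.optimizers.SGD}[name]
    model.compile(
        optimizer=optimizer_cls(**OPTIMIZER_DEFAULTS[name]),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"])
    return model


lr_ratio = (OPTIMIZER_DEFAULTS["sgd"]["learning_rate"]
            / OPTIMIZER_DEFAULTS["adam"]["learning_rate"])
print(f"Default SGD learning rate is about {lr_ratio:.0f}x Adam's")
```

With this helper, model_adam = compile_with("adam", create_model) and model_sgd = compile_with("sgd", create_model) would replace the two compile blocks in section III. Writing the hyperparameters out also makes it straightforward to add further variants (for example SGD with momentum) without changing anything else.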