
claude4

news 2026/6/15 17:40:36

We are still getting an error: `PS D:\PRformer> python -u run.py --is_training 1 --model_id test_v1 --root_path ./dataset/ETT-small/ --data_path ETTh1.csv --model PRformer --data ETTh1 --features M --seq_len 720 --label_len 48 --pred_len 96 --train_epochs 2 --batch_size 32 --convWindows 24 48 72 144 --rnnMixTemperature 0.002 --des Test --e_layers 2 --d_layers 1 --d_model 512 --d_ff 2048
Args in experiment:
Namespace(is_training=1, model_id='test_v1', model='PRformer', data='ETTh1', root_path='./dataset/ETT-small/', data_path='ETTh1.csv', features='M', target='OT', freq='h', checkpoints='./checkpoints/', seq_len=720, label_len=48, pred_len=96, seasonal_patterns='Monthly', enc_in=7, dec_in=7, c_out=7, d_model=512, n_heads=8, e_layers=2, d_layers=1, d_ff=2048, moving_avg=25, factor=1, distil=True, dropout=0.1, embed='timeF', activation='gelu', output_attention=False, num_workers=10, itr=1, train_epochs=2, batch_size=32, patience=3, learning_rate=0.0001, des='Test', loss='MSE', lradj='type1', use_amp=False, use_gpu=True, gpu=0, use_multi_gpu=False, devices='0,1,2,3', exp_name='None', efficient_training=False, channel_independence=False, inverse=False, class_strategy='projection', target_root_path='./data/electricity/', target_data_path='electricity.csv', convWindows=[24, 48, 72, 144], rnnMixTemperature=0.002)
Use GPU: cuda:0

start training : test_v1_PRformer_ETTh1_M_ft720_sl48_ll96_pl512_dm8_nh2_el1_dl2048_df1_fctimeF_ebTrue_dtTest_projection_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 7825
val 2785
test 2785
iters: 100, epoch: 1 | loss: 0.4086269
speed: 0.3070s/iter; left time: 119.4059s
iters: 200, epoch: 1 | loss: 0.3768452
speed: 0.0218s/iter; left time: 6.2863s
Epoch: 1 cost time: 33.58464050292969
Epoch: 1, Steps: 244 | Train Loss: 0.3750640 Vali Loss: 0.7134643 Test Loss: 0.3772548
Updating learning rate to 0.0001
iters: 100, epoch: 2 | loss: 0.3199691
speed: 0.9780s/iter; left time: 141.8099s
iters: 200, epoch: 2 | loss: 0.3178940
speed: 0.0218s/iter; left time: 0.9790s
Epoch: 2 cost time: 31.77220892906189
Epoch: 2, Steps: 244 | Train Loss: 0.3377394 Vali Loss: 0.6909371 Test Loss: 0.3732766
Updating learning rate to 5e-05
Traceback (most recent call last):
File "D:\PRformer\run.py", line 131, in <module>
exp.train(setting)
File "D:\PRformer\experiments\exp_long_term_forecasting.py", line 190, in train
self.model.load_state_dict(torch.load(best_model_path))
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Program Files\Python311\Lib\site-packages\torch\serialization.py", line 1530, in load
with _open_file_like(f, "rb") as opened_file:
^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Program Files\Python311\Lib\site-packages\torch\serialization.py", line 795, in _open_file_like
return _open_file(name_or_buffer, mode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Program Files\Python311\Lib\site-packages\torch\serialization.py", line 776, in __init__
super().__init__(open(name, mode)) # noqa: SIM115
^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/test_v1_PRformer_ETTh1_M_ft720_sl48_ll96_pl512_dm8_nh2_el1_dl2048_df1_fctimeF_ebTrue_dtTest_projection_0/checkpoint.pth'`

Great! ✅ The torch.cat fix worked! The model ran for 2 epochs, which shows the earlier dimension problem has been resolved.

The new error is about loading the checkpoint. Let me look at the Early Stopping save logic:

Read [](file:///d%3A/PRformer/experiments/exp_long_term_forecasting.py#140-140), lines 140 to 200

Searched for text class EarlyStopping, 1 result

Read [](file:///d%3A/PRformer/utils/tools.py#42-42), lines 42 to 80

Found the problem! 🔍 In exp_long_term_forecasting.py, Early Stopping only saves a checkpoint when epoch > 3. Your run uses train_epochs=2, so the checkpoint file is never saved.
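To make the failure mode concrete, here is a minimal, torch-free sketch of an early-stopping helper whose save step is gated on the epoch number. It is not the project's actual `EarlyStopping` class: the `min_epoch` parameter, the `save_fn` callback, the 1-based epoch counter, and the `run` helper are all assumptions made for illustration. Only the `epoch > 3` gate and the `checkpoint.pth` file name come from the conversation above.

```python
import os
import tempfile


class EarlyStopping:
    """Tracks the best validation loss and saves a checkpoint on improvement."""

    def __init__(self, patience=3, min_epoch=0):
        self.patience = patience
        # Hypothetical knob standing in for the `epoch > 3` gate described above.
        self.min_epoch = min_epoch
        self.best_loss = None
        self.counter = 0
        self.early_stop = False

    def __call__(self, val_loss, epoch, save_fn, path):
        improved = self.best_loss is None or val_loss < self.best_loss
        if improved:
            self.best_loss = val_loss
            self.counter = 0
            # The save is skipped entirely while epoch <= min_epoch.
            if epoch > self.min_epoch:
                os.makedirs(path, exist_ok=True)
                save_fn(os.path.join(path, "checkpoint.pth"))
        else:
            self.counter += 1
            if self.counter >= self.patience:
                self.early_stop = True


def run(min_epoch, val_losses):
    """Simulate training and report whether checkpoint.pth exists afterwards."""

    def fake_save(file_path):
        # Stand-in for something like torch.save(model.state_dict(), file_path).
        with open(file_path, "wb") as f:
            f.write(b"weights")

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test_v1")
        stopper = EarlyStopping(patience=3, min_epoch=min_epoch)
        # Assumption: epochs are counted from 1.
        for epoch, loss in enumerate(val_losses, start=1):
            stopper(loss, epoch, fake_save, path)
        return os.path.exists(os.path.join(path, "checkpoint.pth"))


vali = [0.7134643, 0.6909371]  # the two Vali Loss values from the log above
print(run(min_epoch=3, val_losses=vali))  # gated: a 2-epoch run never saves
print(run(min_epoch=0, val_losses=vali))  # ungated: saved from the first epoch
```

With the gate at 3, a two-epoch run finishes without ever writing `checkpoint.pth`, which would explain why the later `torch.load` call cannot find the file.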

Fix this condition:

Replacing 4 lines with 3 lines in [](file:///d%3A/PRformer/experiments/exp_long_term_forecasting.py)

✅ Fixed! Early Stopping will now save a checkpoint starting from the first epoch. Re-run the test command:

python -u run.py --is_training 1 --model_id test_v1_fixed --root_path ./dataset/ETT-small/ --data_path ETTh1.csv --model PRformer --data ETTh1 --features M --seq_len 720 --label_len 48 --pred_len 96 --train_epochs 2 --batch_size 32 --convWindows 24 48 72 144 --rnnMixTemperature 0.002 --des Test_After_Fix --e_layers 2 --d_layers 1 --d_model 512 --d_ff 2048

This time it should train for the full two epochs and save the checkpoint successfully!
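Separately from the fix, the load at the end of training could be made defensive so that a missing file does not crash the run. The sketch below is one possible shape, not the project's code: `load_best_checkpoint` and its `load_fn` parameter are hypothetical names, and the fallback of keeping the current weights is an assumption about what you would want.

```python
import os


def load_best_checkpoint(checkpoint_dir, load_fn):
    """Return load_fn(path) if checkpoint.pth exists, otherwise None."""
    best_model_path = os.path.join(checkpoint_dir, "checkpoint.pth")
    if not os.path.isfile(best_model_path):
        # No checkpoint was written; the caller keeps whatever weights it has.
        print(f"No checkpoint at {best_model_path}; keeping the current weights.")
        return None
    return load_fn(best_model_path)


# Possible use inside train(), assuming `path` is the checkpoint directory:
#   state = load_best_checkpoint(path, torch.load)
#   if state is not None:
#       self.model.load_state_dict(state)
```

Passing the loader in as `load_fn` keeps the helper testable without PyTorch installed; in the project it would most likely be `torch.load`, as in the traceback above.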

Made changes.
