How to correctly use tfa.metrics.F1Score and image_dataset_from_directory?

Colab code is here:

I am following the docs here to get the result for a multiclass prediction. When I train using

```python
# last layer
tf.keras.layers.Dense(2, activation='softmax')

model.compile(optimizer="adam",
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy(),
                       tfa.metrics.F1Score(num_classes=2, average='macro')])
```

I get

```
144/144 [==] - 8s 54ms/step - loss: 0.0613 - categorical_accuracy: 0.9789 - f1_score: 0.9788 - val_loss: 0.0826 - val_categorical_accuracy: 0.9725 - val_f1_score: 0.9722
```

When I do

```python
model.evaluate(val_ds)
```

I get

```
16/16 [==] - 0s 15ms/step - loss: 0.0826 - categorical_accuracy: 0.9725 - f1_score: 0.9722
[0.08255868405103683, 0.9725490212440491, 0.9722140431404114]
```

I would like to use metric.result() as shown on the official website. When I run the code below, I get 0.4875028, which is wrong. How can I get the correct predicted_categories and true_categories?

```python
metric = tfa.metrics.F1Score(num_classes=2, average='macro')

predicted_categories = model.predict(val_ds)
true_categories = tf.concat([y for x, y in val_ds], axis=0).numpy()

metric.update_state(true_categories, predicted_categories)
result = metric.result()
print(result.numpy())
# 0.4875028
```

Here is how I loaded my data:

```python
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="training",
    label_mode='categorical',
    seed=123,
    image_size=(dim, dim))

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="validation",
    label_mode='categorical',
    seed=123,
    image_size=(dim, dim))
```

Solution

From https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory:

```python
tf.keras.preprocessing.image_dataset_from_directory(
    directory,
    labels='inferred',
    label_mode='int',
    class_names=None,
    color_mode='rgb',
    batch_size=32,
    image_size=(256, 256),
    shuffle=True,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation='bilinear',
    follow_links=False)
```

shuffle defaults to True, and that is a problem for your val_ds, which we do not want shuffled: the dataset is reshuffled every time it is iterated, so the batch order seen by model.predict(val_ds) does not match the order of the second pass that collects true_categories, and the predictions end up paired with the wrong labels.

The correct metrics are the ones reported during training. I also recommend manually retrieving your validation dataset and checking the metrics once you make predictions on it (not necessarily via flow_from_directory()).
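As a concrete follow-up, here is a minimal sketch of a one-pass version of the manual check, assuming the `model`, `val_ds`, and `tfa` objects from the question are already defined: by updating the F1 metric inside a single iteration over val_ds, each batch of predictions is paired with the labels from that same batch, so the reshuffling no longer matters.

```python
import tensorflow_addons as tfa

metric = tfa.metrics.F1Score(num_classes=2, average='macro')

# Single pass over the validation set: predictions and labels come from the
# same batch, so they stay aligned even though val_ds reshuffles itself on
# every new iteration.
for x_batch, y_batch in val_ds:
    preds = model.predict_on_batch(x_batch)
    metric.update_state(y_batch, preds)

print(metric.result().numpy())  # should now agree with val_f1_score from training
```

Alternatively, as the answer suggests, you can build the evaluation copy of val_ds with shuffle=False so that it yields the same order on every iteration, and then reuse the original two-pass predict/concat code.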