1 Vote
JiinJeong-9636 asked ThaboMofokeng-8091 commented

Azure Automated ML (interface): choosing a primary metric to handle imbalanced data

I found that there are several primary metrics I can choose from when I run an automated ML experiment. However, the list of primary metrics is shorter than the list of run metrics shown on the result page. I want to deal with imbalanced data (10:1 or 20:1), so I looked up the links below:
https://docs.microsoft.com/en-us/azure/machine-learning/concept-manage-ml-pitfalls#identify-models-with-imbalanced-data
and
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-configure-auto-train

It seems that F1 score is recommended for evaluating models trained on imbalanced data.

Here are my questions:

  • Is there any way to set F1 score or multiple measures as a primary metric?

  • If there is no such way, should I do it manually?

  • Of the available primary metrics, which one is the most appropriate for building a classification model with imbalanced data?



Thanks.


azure-machine-learning

@JiinJeong-9636 Here is a useful link that discusses the AutoML settings (primary metric, number of cross-validation folds, etc.). If you pick a different model from a sweep based on a metric other than the primary one, that model is likely reasonable, though less optimized than it could be. Many metrics are well correlated, so optimizing one tends to co-optimize the others (to a lesser extent); some other metric pairs can be at odds. The request to support multiple measures as a primary metric has been forwarded to the product team to check.



0 Votes 0 ·

1 Answer

0 Votes
ramr-msft answered ThaboMofokeng-8091 commented

For imbalanced data, AUC Weighted is the preferred primary metric (the metric used for model optimization); choose it instead of accuracy. For model evaluation, the user should then look at a metric that works well under imbalance, e.g. F1, micro-averaged AUC, or balanced accuracy.
Currently, ml.azure.com supports the following primary metrics. The request to add F1 score as a primary metric has been forwarded to the product team to check.
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-configure-auto-train#primary-metric



screenshot-162.png (19.5 KiB)
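
To illustrate why accuracy is a poor guide at a 10:1 ratio, and what computing F1 or balanced accuracy "manually" could look like, here is a minimal sketch in plain Python. The toy labels and both "models" are made up for illustration; none of this comes from Azure AutoML itself.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls (same as macro-averaged recall)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_pos + recall_neg) / 2


def f1_score(y_true, y_pred):
    """F1 for the positive (minority) class; 0.0 if nothing is predicted or present."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical 10:1 data set: 100 negatives followed by 10 positives.
y_true = [0] * 100 + [1] * 10

# Hypothetical model A: always predicts the majority class.
always_negative = [0] * 110

# Hypothetical model B: 5 false alarms, finds 6 of the 10 positives.
some_model = [0] * 95 + [1] * 5 + [1] * 6 + [0] * 4

for name, y_pred in [("always_negative", always_negative), ("some_model", some_model)]:
    print(
        name,
        round(accuracy(y_true, y_pred), 3),
        round(balanced_accuracy(y_true, y_pred), 3),
        round(f1_score(y_true, y_pred), 3),
    )
```

Both models score about 0.91 on accuracy, but the always-negative model gets a balanced accuracy of 0.5 and an F1 of 0, which is why those metrics are suggested for evaluation here.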

@JiinJeong-9636 We plan to support optimizing on metrics like F1 in the near future. In the meantime, you can try AUC_weighted or norm_macro_recall depending on the use case, but for simplicity AUC_weighted should be good.
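
For anyone using the Python SDK rather than the interface, a rough sketch of how the primary metric might be selected is below. The set of metric names is an assumption based on the classification primary metrics listed on the how-to-configure-auto-train page at the time of this thread and may change; `build_automl_settings` and the `"label"` column name are hypothetical helpers for illustration, not part of the SDK.

```python
# Assumption: classification primary metrics documented at the time of writing.
# F1 is not among them, which matches the answer above.
SUPPORTED_PRIMARY_METRICS = {
    "accuracy",
    "AUC_weighted",
    "average_precision_score_weighted",
    "norm_macro_recall",
    "precision_score_weighted",
}


def build_automl_settings(primary_metric="AUC_weighted",
                          label_column_name="label",  # hypothetical column name
                          n_cross_validations=5):
    """Build a plain dict of AutoML settings (hypothetical helper).

    Raises ValueError early if the metric is not in the assumed supported set,
    instead of failing later when the experiment is submitted.
    """
    if primary_metric not in SUPPORTED_PRIMARY_METRICS:
        raise ValueError(
            f"{primary_metric!r} is not an assumed-supported primary metric; "
            f"choose one of {sorted(SUPPORTED_PRIMARY_METRICS)}"
        )
    return {
        "task": "classification",
        "primary_metric": primary_metric,
        "label_column_name": label_column_name,
        "n_cross_validations": n_cross_validations,
    }


settings = build_automl_settings("AUC_weighted")
print(settings)
```

With the v1 SDK installed, a dict like this could presumably be passed along the lines of `AutoMLConfig(training_data=train_dataset, **settings)`, where `train_dataset` is your own registered dataset; F1 and the other run metrics can then be read from the result page for each child run.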


1 Vote 1 ·

Thank you.

2 Votes 2 ·

Hi all,

I agree. I am presently working on a data set that is split 60/40 between the two binary outcomes 0 and 1. I find that optimizing for AUC returns better performance results compared to other metrics.

0 Votes 0 ·