
1 Vote
JohanDanforth-1768 asked WilhelmThomas-7049 answered

Distributed training of a model in ML.NET?

Is it possible to distribute the training/fitting of a model in ML.NET across multiple workers/servers?

dotnet-ml-big-data

0 Votes
Wilhelm-3109 answered Wilhelm-3109 published

I'm very interested in this too. Is there any answer? I have a few local machines I would like to use.
Thank you!
Wil


0 Votes
WilhelmThomas-7049 answered

Hello, I would love for Microsoft/ML.NET to let us train on multiple machines to speed things up. Microsoft should do the following:

  • Version 1: Right now AutoML runs its candidate algorithms one at a time on a single machine. A simple first version would run those algorithms in parallel on multiple worker nodes and collect the results on the main node, where AutoML orchestrates the runs and aggregates the results.

  • Version 2: Allow Azure to be added as a worker node in the mesh alongside local computers, so we can pay for an extra boost when needed.

  • Version 3: Accelerate the most-used algorithms by distributing their work across the mesh nodes (harder to do, would take more time).

I think versions 1 and 2 should be easy and fast to implement. They would increase ML.NET usage and also let people get a hardware boost when needed by adding Azure as a participating compute node.
Version 3 is more complex and will take more time.
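
To make the Version 1 idea concrete, here is a minimal sketch of the fan-out/aggregate pattern in Python. It is not ML.NET code and does not use any ML.NET or AutoML API: the trainer names, `run_trial`, and `orchestrate` are hypothetical placeholders, local threads stand in for worker machines, and the score is a dummy value rather than a real validation metric.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical candidate trainers; placeholder names, not ML.NET identifiers.
CANDIDATES = ["trainer_a", "trainer_b", "trainer_c", "trainer_d"]


def run_trial(trainer_name: str) -> dict:
    """Stand-in for 'train and evaluate one candidate on a worker node'.

    A real worker would fit the model and compute a validation metric.
    Here the score is a deterministic dummy value so the sketch runs anywhere.
    """
    score = (sum(ord(c) for c in trainer_name) % 100) / 100.0
    return {"trainer": trainer_name, "score": score}


def orchestrate(candidates, max_workers=4):
    """Main node: fan trials out to workers, then aggregate the results."""
    results = []
    # Assumption: threads stand in for remote machines; in a real mesh each
    # submit() would be a call to another computer.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_trial, name) for name in candidates]
        for future in as_completed(futures):
            results.append(future.result())
    # Aggregate: rank all trials by score, best first.
    results.sort(key=lambda r: r["score"], reverse=True)
    return results


if __name__ == "__main__":
    ranking = orchestrate(CANDIDATES)
    print("best:", ranking[0])
```

Because each trial is independent, the main node only has to hand out work and compare the returned scores, which is why Version 1 looks like the easiest step.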

Best

Wil


