Asked by ZhiLi-8645

How to implement multiclass classification on a tree data structure using ML.NET

I have hundreds of projects, and each of them has a tree data structure like this:

79481-image.png

Or like this:

79482-image.png

Each project has its own tree structure, which is a modified version of a standard tree structure. I am trying to map each project's tree structure to the standard one, like this:

79467-image.png

Or like this:

[Image: mapping to standard tree]

The mapping depends on the text of each node rather than on the node's level.
I am currently using multiclass classification in ML.NET. First, I manually map the existing projects' trees to the standard tree and save the results in a database, like this:

| Label      | Level1         | Level2         | Level3         |
| --------   | -------------- | -------------- | -------------- |
| A          | A              |      *         |       *        |
| A-AA       | A              |      AA1       |       *        |
| A-AA-AAA   | A              |      AA1       |      AAA1      |
| A-BB       | A              |      BB2       |       *        |
| A-BB-BBB   | A              |      BB2       |      BBB2      |
| A          | A              |      *         |       *        |
| A-AA-AAA   | A              |      AAA1      |       *        |
| A-BB       | A              |      BB2       |       *        |
| A-BB-BBB   | A              |      BB2       |      BBB2      |

Because a column in ML.NET cannot contain missing values, I replace them with *. The table shows only three levels; my tree has 15 levels (feature columns).
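
The padding step described above could be sketched in C# roughly as follows. This is only an illustration under stated assumptions: `NodeRow`, `TreeRows`, and `ToRow` are hypothetical names and not part of ML.NET, and the demo uses 3 levels instead of 15 so that the output matches the second row of the table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type: one mapped node, holding its standard-tree label
// and its project-tree path padded to a fixed number of level columns.
public sealed class NodeRow
{
    public string Label { get; }
    public string[] Levels { get; }

    public NodeRow(string label, string[] levels)
    {
        Label = label;
        Levels = levels;
    }
}

public static class TreeRows
{
    public const string Missing = "*";  // placeholder for an absent level
    public const int LevelCount = 15;   // number of level (feature) columns

    // Pads a node's path (root first) with the placeholder up to levelCount.
    public static NodeRow ToRow(string label, IReadOnlyList<string> path,
                                int levelCount = LevelCount)
    {
        if (path.Count > levelCount)
            throw new ArgumentException(
                $"Path has more than {levelCount} levels.", nameof(path));

        var levels = Enumerable.Repeat(Missing, levelCount).ToArray();
        for (int i = 0; i < path.Count; i++)
            levels[i] = path[i];
        return new NodeRow(label, levels);
    }
}

public static class Demo
{
    public static void Main()
    {
        // Second row of the table: label A-AA, path A > AA1.
        var row = TreeRows.ToRow("A-AA", new[] { "A", "AA1" }, levelCount: 3);
        Console.WriteLine($"{row.Label}: {string.Join(" | ", row.Levels)}");
        // prints "A-AA: A | AA1 | *"
    }
}
```

Each `NodeRow` corresponds to one row saved in the database: the label is the target class, and every entry in `Levels` is one feature column.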

The multiclass classification trainer I chose is SdcaMaximumEntropy. I hope to use its predictions to map the trees instead of doing it manually.

I got the prediction working, but the results are very poor.

So my questions are:

  1. Is this the right approach?

  2. If so, should I remove the duplicate rows, and should I keep replacing missing values with *?

Thanks in advance.



Tags: dotnet-csharp, dotnet-ml-big-data