Approximately how much data (how many data logs, and of what length) do you need to train a machine learning algorithm before it starts working?

Eleon BORLINI
ST Employee
 
Accepted Solutions
Eleon BORLINI
ST Employee

There is no universal rule for the number of data logs or their length. The more accurately an acquired log describes the scenario the user wants to identify, the less data is required. In short, if your dataset is “good”, you don’t need much data to train a Machine Learning Core (MLC).

A dataset can be considered “good” if:

- each log describes only one scenario

- no log contains external noise that is uncorrelated with the scenario itself

- every log is acquired using the same sensor settings.
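As a rough illustration, the first and third rules can be checked automatically from per-log metadata. The sketch below is only an assumption about how such metadata might be organized: `LogInfo`, its fields, and `check_dataset` are hypothetical names, not part of any ST tool. The second rule (no uncorrelated noise) still has to be verified by inspecting the signals themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogInfo:
    """Hypothetical metadata record for one acquired log."""
    name: str
    labels: frozenset      # scenario labels present in the log
    odr_hz: float          # output data rate used for the acquisition
    full_scale_g: int      # accelerometer full scale used for the acquisition


def check_dataset(logs):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []

    # Rule 1: each log should describe exactly one scenario.
    for log in logs:
        if len(log.labels) != 1:
            problems.append(
                f"{log.name}: expected 1 scenario, found {len(log.labels)}"
            )

    # Rule 3: every log should share the same sensor settings.
    settings = {(log.odr_hz, log.full_scale_g) for log in logs}
    if len(settings) > 1:
        problems.append(f"logs use {len(settings)} different sensor settings")

    return problems


logs = [
    LogInfo("walk.csv", frozenset({"walk"}), 26.0, 2),
    LogInfo("run.csv", frozenset({"run"}), 26.0, 2),
    LogInfo("idle.csv", frozenset({"idle"}), 26.0, 2),
]
print(check_dataset(logs))
```

An empty list only means the metadata is consistent; it says nothing about how cleanly each log represents its scenario.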

For example, if these three rules are met, a dataset used to train an MLC to distinguish between three scenarios can consist of just three logs (one per scenario) of 30 seconds each.
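To get a feel for how much training material a 30-second log provides, it can help to count how many feature windows it yields. The numbers below are assumptions chosen for illustration only (a 26 Hz output data rate and a 52-sample window); the actual values depend on your sensor configuration, and `windows_per_log` is a hypothetical helper.

```python
def windows_per_log(duration_s, odr_hz, window_len_samples):
    """Number of complete, non-overlapping windows contained in one log."""
    total_samples = int(duration_s * odr_hz)
    return total_samples // window_len_samples


# Assumed settings, for illustration only.
DURATION_S = 30
ODR_HZ = 26
WINDOW_LEN = 52

per_log = windows_per_log(DURATION_S, ODR_HZ, WINDOW_LEN)
print(per_log)       # 780 samples // 52 = 15 windows per scenario
print(per_log * 3)   # 45 windows across three scenarios
```

With these assumed settings, each scenario contributes 15 windows, which is a small training set and is only enough when the logs are clean and clearly separated.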

However, long and repeated acquisitions can be used to train the MLC when some noise overlaps the signal of interest and can’t be removed. Additional logs can also improve the accuracy of the MLC.
