
Drill State Detection (Drilling, Screwing, Idle) – Filtering & Classification Stability Issues

burak_Guzeller
Associate II

Hello,

I’ve been working on a project inspired by an STMicroelectronics project, where the goal is to identify different operation modes of a power drill — such as drilling, screwing, and idle — using motion sensor data.

I’m using one of ST’s development kits (e.g., STEVAL-STWINKT1B) and built my classification model using NanoEdge AI Studio.


System Overview:

  • Data Collection:
    I collected 20 signal recordings per class, each consisting of approximately 200 rows of sensor data.

  • Sensor Preprocessing:
    The raw data from the accelerometer and gyroscope is passed through multiple filters to improve signal quality:

    • Low-pass filter

    • Moving average filter

    • Kalman filter

  • Decision Mechanism (Sliding Window + Majority Voting):
    To improve classification stability in real time, I implemented a sliding-window majority-voting strategy (a simplified code sketch follows this list):

    • A buffer stores the last 50 predictions from the model.

    • If at least 40 out of 50 predictions belong to the same class, that class is accepted as the current state.

    • This helps prevent noisy or sporadic misclassifications from affecting the system's output.
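
For reference, this is roughly the decision logic I have in mind, as a simplified sketch rather than my exact firmware; the class IDs and the `latest_prediction` input are placeholders, and the window/threshold constants simply mirror the 50/40 numbers above:

```c
#include <stdint.h>

#define VOTE_WINDOW     50   /* last N predictions kept in the buffer        */
#define VOTE_THRESHOLD  40   /* votes needed before a class is accepted      */
#define NUM_CLASSES      3   /* e.g. 0 = idle, 1 = drilling, 2 = screwing    */
#define CLASS_UNKNOWN   -1   /* returned while no class dominates the window */

static int8_t  vote_buffer[VOTE_WINDOW];        /* circular buffer of class IDs   */
static uint8_t vote_index = 0;
static uint8_t votes_stored = 0;
static uint8_t vote_count[NUM_CLASSES] = {0};   /* votes per class in the window  */

/* Push the newest model prediction and return the accepted state,
 * or CLASS_UNKNOWN if no class reaches the majority threshold. */
int8_t majority_vote_update(int8_t latest_prediction)
{
    if (votes_stored == VOTE_WINDOW) {
        /* Buffer is full: the slot about to be overwritten holds the
         * oldest vote, so remove it from the per-class counters first. */
        vote_count[vote_buffer[vote_index]]--;
    } else {
        votes_stored++;
    }

    vote_buffer[vote_index] = latest_prediction;
    vote_count[latest_prediction]++;
    vote_index = (vote_index + 1U) % VOTE_WINDOW;

    for (int8_t c = 0; c < NUM_CLASSES; c++) {
        if (vote_count[c] >= VOTE_THRESHOLD) {
            return c;   /* stable state accepted */
        }
    }
    return CLASS_UNKNOWN;
}
```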


Problems Encountered:

  • During screwing, the system sometimes incorrectly classifies the behavior as drilling.

  • Even with multiple filters, the signals of different classes (especially drilling vs screwing) are occasionally too similar.

  • The classification model shows good offline performance, but in real-time usage, stability is not sufficient.


Looking for Advice On:

  1. How can I improve signal preprocessing to make similar behaviors (like screwing vs drilling) more distinguishable?

  2. Would you recommend specific feature engineering techniques (e.g., RMS, kurtosis, peak frequency, spectral entropy)? (A rough sketch of what I have in mind follows this list.)

  3. Is it worth exploring time-frequency domain methods like STFT, Wavelet Transform, etc., to better separate classes?

  4. Has anyone used temporal smoothing, Hidden Markov Models, or state machines to stabilize real-time classification results?
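
To make question 2 more concrete, these are the kinds of per-window features I could compute on each axis before or alongside the classifier. The formulas are just the standard RMS and excess-kurtosis definitions, and the function names are placeholders rather than something already in my firmware; peak frequency and spectral entropy would additionally need an FFT stage:

```c
#include <math.h>
#include <stddef.h>

/* Root-mean-square of one axis over a window of n samples. */
float feature_rms(const float *x, size_t n)
{
    float sum_sq = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum_sq += x[i] * x[i];
    }
    return sqrtf(sum_sq / (float)n);
}

/* Excess kurtosis of one axis: how "peaky"/impulsive the window is
 * compared to a Gaussian signal (which gives 0). */
float feature_kurtosis(const float *x, size_t n)
{
    float mean = 0.0f;
    for (size_t i = 0; i < n; i++) {
        mean += x[i];
    }
    mean /= (float)n;

    float m2 = 0.0f, m4 = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float d  = x[i] - mean;
        float d2 = d * d;
        m2 += d2;
        m4 += d2 * d2;
    }
    m2 /= (float)n;
    m4 /= (float)n;

    if (m2 == 0.0f) {
        return 0.0f;   /* constant signal: report 0 instead of dividing by zero */
    }
    return (m4 / (m2 * m2)) - 3.0f;
}
```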


Additional Info:

  • Sensors used: Accelerometer (ACC) and Gyroscope (GYRO), ISM330DH

  • Environment: STM32CubeIDE + NanoEdge AI Studio

  • Benchmark accuracy (offline): ~95%

  • Real-time buffer: 50 predictions, classification accepted if ≥ 40 are identical

 

I’d really appreciate any suggestions, examples, or experiences you might be able to share.
Thanks in advance!

 

(Screenshots attached: burak_Guzeller_0-1754483793358.png, burak_Guzeller_1-1754483822423.png)

 

Julian E.
ST Employee

Hello @burak_Guzeller,

 

First of all, you did a very good job!

 

Concerning the preprocessing, you can definitely try different things. As this is tied to your specific use case, I cannot help you much with it.

 

Regarding NanoEdge, the most important parameters that will influence your results are:

 

The quantity of data:

Here you seem to have done a good job. Make sure each recording contains data from only one class: for example, not a single data point of your idle state should end up in a recording made while you are starting to drill.

 

The buffer size used: 

I don't know the buffer size you used. Based on what you describe, you may be feeding the model rows of 6 data points (3 accelerometer axes + 3 gyroscope axes). If that is the case, I would suggest working with larger buffers instead (rows of, for example, 20*6 data points).

Of course, a larger buffer increases the time before each inference, since you need to collect more data points first, but it generally improves the results a lot.

You will need to experiment to find the best buffer size.

Documentation about recommended format: https://wiki.st.com/stm32mcu/wiki/AI:NanoEdge_AI_Studio#Data_format 
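
As a rough illustration of what working with buffers means on the firmware side, here is a sketch; `read_imu_sample()` and `run_nanoedge_inference()` are placeholders that you would replace with your own ISM330 driver call and with the classification function declared in the NanoEdgeAI.h header generated with your library (check that header for the exact names and signatures):

```c
#include <stdint.h>

#define AXIS_COUNT          6    /* 3 accelerometer axes + 3 gyroscope axes  */
#define SAMPLES_PER_BUFFER  20   /* samples aggregated into one model input  */
#define INPUT_SIZE          (AXIS_COUNT * SAMPLES_PER_BUFFER)

/* Placeholders: your own sensor read and your NanoEdge classification call. */
extern void     read_imu_sample(float sample[AXIS_COUNT]);
extern uint16_t run_nanoedge_inference(const float input[INPUT_SIZE]);

/* Collect one window of raw data and return the predicted class ID.
 * The result of each call can then feed your 50-prediction voting buffer. */
uint16_t classify_one_window(void)
{
    static float input_buffer[INPUT_SIZE];

    /* Fill one window: SAMPLES_PER_BUFFER consecutive samples of all axes. */
    for (int s = 0; s < SAMPLES_PER_BUFFER; s++) {
        read_imu_sample(&input_buffer[s * AXIS_COUNT]);
    }

    /* One inference per window instead of one per single sample. */
    return run_nanoedge_inference(input_buffer);
}
```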

 

Frequency used:

The frequency used while collecting data also has a big impact on the results.

 

To help you find a starting point, you can take a look at the sampling finder.

You need to import "continuous data", here meaning 6 points per row logged at the maximum frequency. The sampling finder then tests combinations of buffer size and subsampling and gives you an estimate of the project results for each configuration.

Sampling finder documentation: https://wiki.st.com/stm32mcu/wiki/AI:NanoEdge_AI_Studio#Sampling_finder 
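
For the "continuous data" log itself, the expected shape is simply one row of 6 values per sample, captured at the maximum frequency. A minimal logging loop could look like the sketch below, where `read_acc_gyro()` stands in for your own ISM330 driver call and printf is assumed to be retargeted to a UART/virtual COM port:

```c
#include <stdio.h>

/* Placeholder for your ISM330 driver: fills acc[0..2] and gyr[0..2]
 * with one sample each, read at the sensor's maximum output data rate. */
extern void read_acc_gyro(float acc[3], float gyr[3]);

/* Stream "continuous data" for the sampling finder:
 * one CSV row of 6 values per sample. */
void log_continuous_data(void)
{
    float acc[3], gyr[3];

    for (;;) {
        read_acc_gyro(acc, gyr);
        printf("%.4f,%.4f,%.4f,%.4f,%.4f,%.4f\r\n",
               acc[0], acc[1], acc[2], gyr[0], gyr[1], gyr[2]);
    }
}
```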

 

You can also read this documentation:

AI:Datalogging guidelines for a successful NanoEdge AI project - stm32mcu

 

You are already using a postprocessing strategy to help reduce false predictions.

 

I cannot really say much more as you are already doing a good job.

When experimenting in NanoEdge and running multiple benchmarks, you do not have to let each benchmark run to completion, which can take hours. In general, a benchmark converges within ~10-20 minutes to an accuracy close to what you will ultimately get. The full benchmark is really there to optimize the library as much as possible, which only needs to be done once, after you have found the right parameters.

 

I would also encourage you to test the libraries in the Validation step.

After preprocessing your data, shuffle it (data captured at the beginning and end of a session can differ, for example if your machine heats up) and split it into a training set (80% of your data) and a test set.

 

By default, NanoEdge performs cross-validation to obtain robust libraries, but by testing multiple libraries yourself you can, for example, avoid picking an overfitted one.

 

Lastly, if you notice situations where the library does not work as well as expected, try to log more of those situations and add the data to new benchmarks. In your case, you are probably also mounting/dismantling parts of your drill to switch between drilling and screwing. Each time, this may create slight differences in your data, so logging several times after mounting/dismantling can help you acquire more varied data.

 

In the end, if you still cannot find a library that suits you, we have other tools to run neural networks on STM32.

This will require a bit more knowledge, as you will need to train the model on your own, but we also have tools to help you with that.

Obviously, neural networks are heavier algorithms.

 

Have a good day,

Julian 


In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.