2025-08-06 5:39 AM
Hello,
I’ve been working on a project inspired by an STMicroelectronics project, where the goal is to identify different operation modes of a power drill — such as drilling, screwing, and idle — using motion sensor data.
I’m using one of ST’s development kits (e.g., STEVAL-STWINKT1B) and built my classification model using NanoEdge AI Studio.
Data Collection:
I collected 20 signal recordings per class, each consisting of approximately 200 rows of sensor data.
Sensor Preprocessing:
The raw data from the accelerometer and gyroscope is passed through multiple filters to improve signal quality:
Low-pass filter
Moving average filter
Kalman filter
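For reference, here is a minimal sketch of the first two filters as I apply them per axis (the Kalman filter is omitted for brevity; the window length and smoothing factor below are illustrative, not my exact tuning):

#include <stddef.h>

#define MA_WINDOW 8 /* illustrative window length, not my exact tuning */

/* First-order IIR low-pass (exponential smoothing), one instance per axis. */
typedef struct { float alpha; float y; } LowPass;

static float lowpass_step(LowPass *f, float x)
{
    f->y += f->alpha * (x - f->y); /* y[n] = y[n-1] + alpha*(x[n] - y[n-1]) */
    return f->y;
}

/* Moving average over the last MA_WINDOW samples, using a ring buffer. */
typedef struct { float buf[MA_WINDOW]; float sum; size_t idx; } MovAvg;

static float movavg_step(MovAvg *f, float x)
{
    f->sum += x - f->buf[f->idx]; /* swap the oldest sample out of the running sum */
    f->buf[f->idx] = x;
    f->idx = (f->idx + 1) % MA_WINDOW;
    return f->sum / (float)MA_WINDOW;
}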
Decision Mechanism (Sliding Window + Majority Voting):
To improve classification stability in real-time, I implemented a sliding window with majority voting strategy:
A buffer stores the last 50 predictions from the model.
If at least 40 out of 50 predictions belong to the same class, that class is accepted as the current state.
This helps prevent noisy or sporadic misclassifications from affecting the system's output.
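In code, my decision mechanism looks roughly like this (a sketch; variable and function names are illustrative):

#include <stdint.h>

#define VOTE_WINDOW 50 /* keep the last 50 predictions */
#define VOTE_THRESHOLD 40 /* accept a class once it holds >= 40 of the 50 votes */
#define CLASS_COUNT 3 /* idle, drilling, screwing */

static uint8_t votes[VOTE_WINDOW]; /* ring buffer of recent class IDs */
static uint16_t counts[CLASS_COUNT];
static uint16_t vote_idx;
static uint16_t filled;
static int current_state = -1; /* -1 = no stable class accepted yet */

/* Push one raw model prediction; returns the accepted (stable) class. */
int vote_update(uint8_t predicted_class)
{
    if (filled == VOTE_WINDOW)
        counts[votes[vote_idx]]--; /* retire the oldest vote */
    else
        filled++;

    votes[vote_idx] = predicted_class;
    counts[predicted_class]++;
    vote_idx = (vote_idx + 1u) % VOTE_WINDOW;

    for (int c = 0; c < CLASS_COUNT; c++)
        if (counts[c] >= VOTE_THRESHOLD)
            current_state = c; /* stable majority: switch the accepted state */

    return current_state; /* otherwise the previous state is kept */
}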
Observed Issues:
During screwing, the system sometimes incorrectly classifies the behavior as drilling.
Even with multiple filters, the signals of different classes (especially drilling vs screwing) are occasionally too similar.
The classification model shows good offline performance, but in real-time usage, stability is not sufficient.
Questions:
How can I improve signal preprocessing to make similar behaviors (like screwing vs drilling) more distinguishable?
Would you recommend specific feature engineering techniques (e.g., RMS, kurtosis, peak frequency, spectral entropy)?
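For instance, this is the kind of per-axis, per-window feature extraction I have in mind (a sketch only; I don't know what NanoEdge computes internally):

#include <math.h>
#include <stddef.h>

/* Root-mean-square of one window of samples. */
static float feature_rms(const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * x[i];
    return sqrtf(acc / (float)n);
}

/* Kurtosis: E[(x - mu)^4] / sigma^4; higher for impulsive (spiky) signals. */
static float feature_kurtosis(const float *x, size_t n)
{
    float mu = 0.0f, m2 = 0.0f, m4 = 0.0f;
    for (size_t i = 0; i < n; i++)
        mu += x[i];
    mu /= (float)n;
    for (size_t i = 0; i < n; i++) {
        float d = x[i] - mu;
        m2 += d * d;
        m4 += d * d * d * d;
    }
    m2 /= (float)n;
    m4 /= (float)n;
    return (m2 > 0.0f) ? m4 / (m2 * m2) : 0.0f;
}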
Is it worth exploring time-frequency domain methods like STFT, Wavelet Transform, etc., to better separate classes?
Has anyone used temporal smoothing, Hidden Markov Models, or state machines to stabilize real-time classification results?
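As an example of the state-machine idea, I am thinking of something like this dwell-time hysteresis on top of the raw predictions (illustrative sketch, not tested in my project):

#define DWELL_REQUIRED 25 /* consecutive agreeing predictions before switching */

static int stable_state = -1; /* last accepted class, -1 = undecided */
static int candidate = -1;
static int dwell = 0;

/* Only switch the reported state after DWELL_REQUIRED identical predictions. */
static int state_machine_update(int predicted_class)
{
    if (predicted_class == candidate) {
        if (++dwell >= DWELL_REQUIRED)
            stable_state = candidate;
    } else {
        candidate = predicted_class; /* new candidate: restart the dwell count */
        dwell = 1;
    }
    return stable_state;
}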
Sensors used: ISM330DHCX accelerometer (ACC) and gyroscope (GYRO)
Environment: STM32CubeIDE + NanoEdge AI Studio
Benchmark accuracy (offline): ~95%
Real-time buffer: 50 predictions, classification accepted if ≥ 40 are identical
I’d really appreciate any suggestions, examples, or experiences you might be able to share.
Thanks in advance!
2025-08-06 8:25 AM - edited 2025-08-06 8:25 AM
Hello @burak_Guzeller,
First of all, you did a very good job!
Concerning the preprocessing, you can definitely try different things. As this is tied to your use case, I cannot help you much with this.
Regarding NanoEdge, the most important parameters that will influence your results are:
The quantity of data:
Here you seem to have done a good job. Make sure to collect data of only one class at a time; for example, not a single data point of your idle state when starting to drill.
The buffer size used:
I don't know the buffer size that you used. Based on what you say, you may be using rows of 6 data points (3 accelerometer + 3 gyroscope axes). If that is the case, I would suggest working with buffers instead (rows of 20*6 data points, for example; see the example after the documentation link below).
Of course, the buffer size will impact the inference time, since you need to collect multiple data points before each inference, but it generally improves the results a lot.
You will need to experiment to find the best buffer size.
Documentation about recommended format: https://wiki.st.com/stm32mcu/wiki/AI:NanoEdge_AI_Studio#Data_format
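For example, with buffers of 20 samples, each line of your input file should contain 20*6 = 120 values: the 6 axes of sample 1, then the 6 axes of sample 2, and so on (axis names here are just to show the shape):

acc_x1,acc_y1,acc_z1,gyro_x1,gyro_y1,gyro_z1,acc_x2,acc_y2,...,gyro_z20

instead of one line per single sample of 6 values.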
Frequency used:
The frequency used while collecting data also has a big impact on the results.
To help you find a starting point, you can take a look at the sampling finder.
You need to import "continuous data" (here meaning 6 points per row) logged at the maximum frequency; the tool then tests combinations of buffer size and subsampling and gives you an estimate of the project results for each configuration.
Sampling finder documentation: https://wiki.st.com/stm32mcu/wiki/AI:NanoEdge_AI_Studio#Sampling_finder
You can also read this documentation:
AI:Datalogging guidelines for a successful NanoEdge AI project - stm32mcu
You are already using a postprocessing strategy to help reduce false predictions.
I cannot really say much more as you are already doing a good job.
When doing tests in NanoEdge and running multiple benchmarks, you are not forced to run complete benchmarks, which can take hours. In general, a benchmark converges within ~10-20 minutes to an accuracy close to what you will ultimately get. The full benchmark exists to optimize the library as much as possible, which only needs to be done once, when you have found the right parameters.
I would also encourage you to test the libraries in the Validation step.
After preprocessing your data, shuffle them (as data captured at the beginning/end of a session can differ, if your machine heats up for example) and split them into a train set (80% of your data) and a test set.
By default, NanoEdge does cross-validation to get robust libraries, but by testing multiple libraries yourself, you can avoid overfitted libraries, for example.
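If it helps, here is a minimal PC-side sketch of that shuffle/split step, assuming one signal (one full buffer) per line in a file named signals.csv (file names and array sizes are illustrative):

/* Shuffle signal rows and split them 80/20 into train.csv and test.csv. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_LINES 2048
#define MAX_LEN 4096

static char lines[MAX_LINES][MAX_LEN];

int main(void)
{
    size_t n = 0;
    FILE *in = fopen("signals.csv", "r");
    if (!in) { perror("signals.csv"); return 1; }
    while (n < MAX_LINES && fgets(lines[n], MAX_LEN, in))
        n++;
    fclose(in);

    /* Fisher-Yates shuffle, so train and test both contain early and
       late captures (useful if the machine heats up during a session). */
    srand((unsigned)time(NULL));
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i;
        char tmp[MAX_LEN];
        memcpy(tmp, lines[i - 1], MAX_LEN);
        memcpy(lines[i - 1], lines[j], MAX_LEN);
        memcpy(lines[j], tmp, MAX_LEN);
    }

    size_t n_train = (n * 8) / 10; /* 80% train, 20% test */
    FILE *train = fopen("train.csv", "w");
    FILE *test = fopen("test.csv", "w");
    if (!train || !test) { perror("output files"); return 1; }
    for (size_t i = 0; i < n; i++)
        fputs(lines[i], i < n_train ? train : test);
    fclose(train);
    fclose(test);
    printf("%zu signals: %zu train, %zu test\n", n, n_train, n - n_train);
    return 0;
}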
Lastly, if you see situations where the library does not work as expected, try to log more of these situations and add the data to new benchmarks. In your case, you are probably also mounting/dismantling your drill to switch between drilling and screwing. Each time, this may create slight differences in your data, so logging several sessions after mounting/dismantling can help you acquire more varied data.
In the end, if you still cannot find a library that suits you, we have other tools for running neural networks on STM32.
They require a bit more knowledge, as you will need to train the model on your own, but we also have tools to help you.
Obviously, neural networks are heavier algorithms.
Have a good day,
Julian
2025-08-11 3:28 AM
Hello everyone,
First of all, thank you for the valuable suggestions here.
Using the Sampling Finder interface, I was recommended a 250 Hz sampling rate with a buffer size of 16, which I implemented exactly like this in my code:
// 2. SPI IO configuration structure
ISM330DHCX_IO_t io_ctx = {
    .Init     = BSP_SPI3_Init,
    .DeInit   = BSP_SPI3_DeInit,
    .BusType  = ISM330DHCX_SPI_4WIRES_BUS,
    .Address  = 0,                 // Not used for SPI but required
    .WriteReg = BSP_SPI3_WriteReg,
    .ReadReg  = BSP_SPI3_ReadReg,
    .GetTick  = BSP_GetTick,
    .Delay    = HAL_Delay
};

if (BSP_SPI3_Init() != BSP_ERROR_NONE) {
    printmsg("SPI3 initialization failed\r\n");
    Error_Handler();
}

if (ISM330DHCX_RegisterBusIO(&sensor, &io_ctx) != ISM330DHCX_OK) {
    printmsg("Bus IO registration error\r\n");
    Error_Handler();
}

uint8_t whoami = 0;
if (ISM330DHCX_ReadID(&sensor, &whoami) != ISM330DHCX_OK) {
    printmsg("WHO_AM_I read error\r\n");
    Error_Handler();
} else if (whoami != 0x6B) {       // ISM330DHCX WHO_AM_I value is 0x6B
    printmsg("Expected ID 0x6B, but read: 0x%02X\r\n", whoami);
    Error_Handler();
} else {
    printmsg("ISM330DHCX sensor found! WHO_AM_I: 0x%02X\r\n", whoami);
}

if (ISM330DHCX_Init(&sensor) != ISM330DHCX_OK) {
    printmsg("--> INIT FAILED\r\n");
    Error_Handler();
} else {
    printmsg("--> INIT SUCCESSFUL\r\n");
}

// Note: the ISM330DHCX supports a set of discrete ODRs (..., 104, 208, 416 Hz, ...),
// so the driver will map the requested 250 Hz to a supported rate.
if (ISM330DHCX_ACC_SetOutputDataRate(&sensor, 250.0f) != ISM330DHCX_OK) {
    printmsg("Failed to set ACC ODR\r\n");
    Error_Handler();
}
if (ISM330DHCX_ACC_SetFullScale(&sensor, 4) != ISM330DHCX_OK) {  // +/- 4 g
    printmsg("Failed to set ACC Full Scale\r\n");
    Error_Handler();
}
HAL_Delay(50);
if (ISM330DHCX_ACC_Enable(&sensor) != ISM330DHCX_OK) {
    printmsg("Failed to enable accelerometer\r\n");
    Error_Handler();
}

if (ISM330DHCX_GYRO_SetOutputDataRate(&sensor, 250.0f) != ISM330DHCX_OK) {
    printmsg("Failed to set GYRO ODR\r\n");
    Error_Handler();
}
if (ISM330DHCX_GYRO_SetFullScale(&sensor, 2000) != ISM330DHCX_OK) {  // +/- 2000 dps
    printmsg("Failed to set GYRO Full Scale\r\n");
    Error_Handler();
}
HAL_Delay(50);
if (ISM330DHCX_GYRO_Enable(&sensor) != ISM330DHCX_OK) {
    printmsg("Failed to enable gyroscope\r\n");
    Error_Handler();
}

/* Infinite loop */
while (1)
{
    if (ISM330DHCX_ACC_GetAxes(&sensor, &acc_data) != ISM330DHCX_OK) {
        printmsg("ACC read error\r\n");
        continue;
    }
    if (ISM330DHCX_GYRO_GetAxes(&sensor, &gyro_data) != ISM330DHCX_OK) {
        printmsg("GYRO read error\r\n");
        continue;
    }

    acc_buffer[buffer_index] = acc_data;
    gyro_buffer[buffer_index] = gyro_data;
    buffer_index++;

    if (buffer_index >= BUFFER_SIZE) // Buffer full
    {
        buffer_index = 0;
        float input_user_buffer[96]; // BUFFER_SIZE (16) * 6 axes = 96 values
        char log_line[200];

        // Interleave the samples: 6 axes per sample, sample after sample
        for (uint8_t i = 0; i < BUFFER_SIZE; i++) {
            input_user_buffer[i*6 + 0] = (float)acc_buffer[i].x;
            input_user_buffer[i*6 + 1] = (float)acc_buffer[i].y;
            input_user_buffer[i*6 + 2] = (float)acc_buffer[i].z;
            input_user_buffer[i*6 + 3] = (float)gyro_buffer[i].x;
            input_user_buffer[i*6 + 4] = (float)gyro_buffer[i].y;
            input_user_buffer[i*6 + 5] = (float)gyro_buffer[i].z;
        }

        // Log the buffer as CSV ("%f" needs float printf support enabled)
        for (uint8_t i = 0; i < BUFFER_SIZE; i++) {
            int len = snprintf(log_line, sizeof(log_line),
                               "%f,%f,%f,%f,%f,%f\r\n",
                               input_user_buffer[i*6 + 0], input_user_buffer[i*6 + 1], input_user_buffer[i*6 + 2],
                               input_user_buffer[i*6 + 3], input_user_buffer[i*6 + 4], input_user_buffer[i*6 + 5]);
            if (len > 0) {
                printmsg("%s", log_line);
            }
        }
        printmsg("---\r\n");

        // Here I run the classification (commented out for now)
        /*
        enum neai_state state = neai_classification(input_user_buffer, output_class_buffer, &id_class);
        if (state == NEAI_OK) {
            printmsg("Prediction result: %s\r\n", id2class[id_class]);
        } else {
            printmsg("NanoEdge AI error code: %d\r\n", state);
        }
        */
    }

    // ~4 ms polling period (~250 Hz target), plus read time; a data-ready
    // interrupt or the sensor FIFO would give more exact sample timing.
    HAL_Delay(4);
}
Is there any mistake in the above data collection method?
Am I collecting the data correctly, according to best practices for NanoEdge AI and this drill state detection project?
Although my project shows a high success rate during benchmarking, why do I not get optimal and consistent classification results in real time?
I am using a sliding window method for classification as well.
Could the issue be related to an insufficient number of samples or features (signal counts)?
Sometimes, when I hold the drill tip against the workpiece in a stationary position (which I trained as the "idle" mode), it still gets classified incorrectly as "drilling."
In such borderline cases, should I train the model with very slight movements or completely stationary data?
Do you have any recommendations on how to improve data collection, training strategies, or preprocessing to better handle ambiguous states?
Thank you in advance for any advice!
2025-08-11 6:02 AM
Hello @burak_Guzeller,
Concerning the behavior you observe in NanoEdge, it is because you provided files without enough data to run all the tests in the matrix.
As you can see, it tests multiple combinations of buffer size and downsampling.
With 189 lines, it cannot run the tests shown in grey.
As for your other questions:
Is there any mistake in the above data collection method?
Am I collecting the data correctly, according to best practices for NanoEdge AI and this drill state detection project?
I don't see any issues. Make sure that the data you collect are coherent with what you would expect.
Do you use buffers of size 6 in your benchmark, or did you reshape the files?
Although my project shows a high success rate during benchmarking, why do I not get optimal and consistent classification results in real time?
I am using a sliding window method for classification as well.
I am not sure I understand your first question. Maybe the difference is in fact due to the sliding window.
Sometimes, when I hold the drill tip against the workpiece in a stationary position (which I trained as the "idle" mode), it still gets classified incorrectly as "drilling."
In such borderline cases, should I train the model with very slight movements or completely stationary data?
Do you have any recommendations on how to improve data collection, training strategies, or preprocessing to better handle ambiguous states?
You need to collect what you expect to encounter in the real situation. If the idle state is when someone has finished drilling but is still moving around with the machine in their hands, then your logging should contain examples of this situation.
If it is just when the machine is sitting still on the ground, then log that. If it is both, log both.
Hope it helps.
Again, my main concern here is to know exactly what data you use in your benchmark: rows of 6 columns, or rows of multiple samples of 6 columns? The correct one is the second option.
Have a good day,
Julian
2025-08-14 12:57 AM
Hello Julian,
Thank you for your previous feedback. It helped me clarify some points.
I still have one question regarding data collection for borderline cases like idle vs screwing.
Let’s say I want to collect “idle” data at different angles.
Example: starting at 90° vertical position and moving slowly down to 50° without performing any drilling or screwing — just holding the drill.
In this case, which approach is better for NanoEdge AI training?
1. Static captures at fixed angles (e.g., 90°, 80°, 70°, 60°, 50°) for a few seconds each, without movement.
2. Continuous slow movement between these angles, so the model sees the transition.
My goal is to make the model robust enough to recognize “idle” even if the drill is being held at various angles or moving slightly, but not actually drilling/screwing.
Do you recommend mixing both static and slow-motion samples for idle, or sticking to one method for consistency?
Thanks in advance for your advice.
2025-08-14 7:11 AM - edited 2025-08-14 7:14 AM
Hello @burak_Guzeller,
I can't really say; it depends on what you think is the most realistic usage in the future.
I would say both are needed: you might have just stopped using the drill and still have it in your hand (so it moves a bit), but it might also be lying flat on the ground.
You should try to collect both and run tests.
The two other classes contain much stronger and more distinctive vibrations, so I think both logging approaches will give similar results.
Have a good day,
Julian
2025-08-14 7:39 AM
My follow-up question is about dataset balance.
In my current project, I noticed that the Idle class can be more ambiguous than Drill or Screw, because it sometimes involves slight movements that may look similar to other classes.
Would it be acceptable in NanoEdge AI to collect a larger number of signals for Idle (for example, 500 signals) while having fewer but equal signals for Drill and Screw (for example, 50 each)?
The reason I’m asking is that Drill and Screw produce much more distinct vibration patterns, so I believe they may not require as many examples as Idle, which has more subtle variations.
Or is it mandatory to keep the number of signals per class strictly balanced?
2025-08-14 7:46 AM
It is recommended to have a balanced dataset.
But again, only running tests will give you a definitive answer.
You can try to collect more data for the classes that do not give good results.
Have a good day,
Julian
2025-08-15 7:35 AM - edited 2025-08-15 7:45 AM
Hello,
I have been following your Drill State Detection project using the STEVAL-STWINKT1B kit and the ISM330DHCX sensor (for reference: https://www.st.com/en/mems-and-sensors/ism330dhcx.html). I have followed all the suggested steps and attempted to compile the project, but the process takes a long time, and sometimes the drilling and screwing modes are confused.
Could you kindly provide the source code and data logger files? These materials are urgent and critical for our project.
Thank you,
2025-08-18 3:35 AM - last edited on 2025-08-18 5:30 AM by Julian E.
Could you please respond to my post? This project is very critical for us, and we still haven't found a solution. We've tried all the filters, but we haven't obtained any definitive results. When I followed the approach suggested here, I did not get correct results. I added many more data signals, but it didn't work. Could we get the source code or the .csv file? That would be great.