STVP / STM8S-Discovery crashes Windows

smh
Associate II
Posted on December 12, 2009 at 20:14

STVP / STM8S-Discovery crashes Windows

16 REPLIES
pdadev
Associate II
Posted on May 17, 2011 at 15:06

Wow.. finally. This should be at least a bit promising.

Breakpoint on "GetDriveTypeA" (exported from Kernel32).

Return from the call (Shift+F11), which should drop you onto: lea ecx, [ebp-0Ch] \ lea edx .. etc etc. Step over a few instructions and you'll see a function that takes 0s and 3s as params (push 0 \ push 3 \ push 0 \ push 3).

If cmp esi, FFFFFFFF does not trigger the je, the params are pushed and the DeviceIoControl function gets called.

This is in SendCommand, called from STMass_Enum_Reenumerate. The values of the parameters at EIP 60aa9e (this could be different since it's a DLL and might be relocated) should let you know EXACTLY what is being sent the moment your system dies.

If it can be established that this is the API call that is causing the failure, it might be possible to instantiate the enumeration past the point of failure as a quick-and-dirty patch to get it working on your system until the manufacturer actually supports their product.

---

I'm using the Windows 7 RTM at the moment, and STVP works fine. There are references in the files to the Mass Storage Device driver; it just seems that they've implemented their own IO control codes to set the device to different modes of receiving data.

If your board were up, you could also change the ini that GDB uses (where the emulator-reset-port command is defined) and add "-SPY3 (filename)" to get some fairly verbose logs. Unfortunately, as far as the BSOD problem goes, all that would tell you is that the GetProcAddress/LoadLibrary calls for STMASS_* worked.

smh
Associate II
Posted on May 17, 2011 at 15:06

Thanks again for your input.

All my recent testing has been done without the STM8S-Discovery connected, so issuing the emulator-reset-port crashes Windows whether or not the device is present.

The stack trace in WinDbg is often confused while stepping through this stuff, so on my first run through I missed the return from STMass_enum_getdevice. Is it possible that it hadn't returned by the time it crashed, and that the calls to DeviceIoControl (called from STMass_SendCommand) are part of the enum_getdevice process?

Thanks for the tip about the messages embedded in stm_swim.dll - I'll look more closely at these. However, I've never yet seen any of these messages displayed because the whole system crashes before anything is returned to functions in stm_swim.dll.

pdadev
Associate II
Posted on May 17, 2011 at 15:06

This could be a very long shot, and I apologize for not having been able to verify it as I have no system where the failure occurs, but here's a patch that you can try. I can't say for certain that this will fix the issue or that it will still allow all of the functionality; I can only let you know what I propose and why.

I was able to offset the drive letter that the DeviceIoControl/GetDriveType functions access. Changing the offset to +6 caused my connected device (which connects as "G:\") to not be recognized. Changing the offset to +5 means that the device does connect, so I am fairly certain that this is functioning as intended. No warranties, of course.

In the stvd\swim folder (if this fixes the problem, you'll also need to change the copy of the file under the stvp folder for STVP to work):

Hex edit the file STLinkIIIUSBDriver.dll (I'm using XVI32).

The offsets to change are:

0x19e01

0x19e47

0x19edc

At each of these offsets, you should see the "41" byte in this sequence:

83 C2 41

Change 41 to 46 in all 3 places. It may need to be lower if you need a lower drive letter, or higher if there's another conflicting device below it.
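For anyone who would rather script the edit than do it by hand, here is a minimal sketch of the same byte change. It assumes each listed offset points at the immediate byte itself (the 41), with 83 C2 in the two bytes before it. The function name is made up for illustration, and it refuses to touch anything if the expected bytes are not found.

```python
def patch_add_edx_immediates(image, offsets, old=0x41, new=0x46):
    """Change the immediate of `add edx, imm8` (83 C2 ib) at each offset.

    `image` is a bytearray holding the file contents. Each offset points at
    the immediate byte. Nothing is modified unless every offset checks out.
    """
    expected = bytes([0x83, 0xC2, old])
    for off in offsets:
        if off < 2 or bytes(image[off - 2:off + 1]) != expected:
            raise ValueError(f"unexpected bytes at 0x{off:x}")
    for off in offsets:
        image[off] = new
    return image


if __name__ == "__main__":
    # Synthetic buffer standing in for the DLL, so the sketch runs anywhere.
    demo = bytearray(0x20000)
    offsets = [0x19E01, 0x19E47, 0x19EDC]
    for off in offsets:
        demo[off - 2:off + 1] = bytes([0x83, 0xC2, 0x41])
    patch_add_edx_immediates(demo, offsets)
    print([hex(demo[off]) for off in offsets])
```

On a real file you would read it into a bytearray, patch, and write the result to a copy, keeping the original DLL as a backup.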

That change will mean:

83 C2 41 -- add edx, 41h

becomes:

83 C2 46 -- add edx, 46h

The way this works is that there are iterators adding 41h to a counter to derive the drive letter to send codes to. Changing the value just means that the count will start higher, hopefully bypassing the drive letters that are causing the problem.
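To make the arithmetic concrete, here is a small sketch of how a counter plus a base of 41h turns into drive letters, and what raising the base does. The real loop bounds inside the DLL are not reproduced here; the function name and the stop-at-Z limit are assumptions for illustration only.

```python
def candidate_drives(base=0x41, last=ord("Z")):
    """Return root paths derived as chr(base + counter), stopping at `last`."""
    return [f"{chr(code)}:\\" for code in range(base, last + 1)]


if __name__ == "__main__":
    print(candidate_drives(0x41)[:3])   # unpatched base: starts at A:
    print(candidate_drives(0x46)[:3])   # patched base 46h: starts at F:
```

With the base raised, the lower drive letters are simply never generated, which is the whole point of the patch.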

If you have any questions about this, please let me know.

smh
Associate II
Posted on May 17, 2011 at 15:06

Fantastic - patching STLinkIIIUSBDriver.dll as you suggest does indeed enable both stvd and stvp to connect to the board.

Since my RAID has C: and D: partitions I changed 0x41 to 0x45, so the device search starts with drive E: (the STM8S-Discovery connects as drive F:).

I actually found 7 instances of add EDX,41h in the dll so for the time being I blindly changed them all: obviously a bit more testing is needed to confirm I haven't done any damage.

The offsets of the bytes I changed are:

0x19e01

0x19e47

0x19edc

0x3b171

0x4d003

0x4d0e5

0x4e0b2
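A list like this can be produced with a plain byte search for 83 C2 41. The sketch below is one way to do it, with a made-up function name. Bear in mind that a byte match is not a disassembly: the same three bytes can turn up in data or in the middle of another instruction, so every hit still needs checking in a disassembler.

```python
def find_add_edx_imm(image, imm=0x41):
    """Return offsets of the immediate byte in every 83 C2 <imm> sequence."""
    pattern = bytes([0x83, 0xC2, imm])
    hits = []
    pos = image.find(pattern)
    while pos != -1:
        hits.append(pos + 2)          # +2 skips the opcode and ModRM bytes
        pos = image.find(pattern, pos + 1)
    return hits


if __name__ == "__main__":
    # Synthetic data standing in for the DLL contents.
    sample = bytes([0x90, 0x83, 0xC2, 0x41, 0x90, 0x90, 0x83, 0xC2, 0x41])
    print([hex(off) for off in find_add_edx_imm(sample)])
```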

I'd still like to confirm exactly what the dll is doing to the RAID driver to make it crash, so I'll continue digging and add anything interesting here. Although the driver ought not to crash I've never had problems with it before so I'm still pretty certain that ST's device enumeration is doing something it shouldn't.

Oh, and if anyone from ST *is* listening, it would still be nice to have your input!

Thanks again for your help pdadev.

pdadev
Associate II
Posted on May 17, 2011 at 15:06

No problem, I'm glad that it's working for you!

You may need to look at the 3b170 change because that's:

sub edx,41

add edx,41

Kind of a garbage instruction from the looks of it, but if you're changing one you'd need to change the other (or just nop both of them out). The other patches could be sound since they're changing what is fed into the CreateFile/GetDriveType functions, unless those functions are called with the initial drive value being the one the data is meant to be sent to.
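If you want to spot such pairs before patching, a rough sketch follows. It assumes the sub and add sit directly next to each other, encoded as 83 EA 41 followed by 83 C2 41; that adjacency is my reading of the listing and has not been checked against the whole DLL. The function name is made up.

```python
def find_sub_add_pairs(image, imm=0x41):
    """Find `sub edx, imm` immediately followed by `add edx, imm`.

    Returns (sub_imm_offset, add_imm_offset) tuples, i.e. the offsets of the
    two immediate bytes that would have to be changed together.
    """
    pattern = bytes([0x83, 0xEA, imm, 0x83, 0xC2, imm])
    pairs = []
    pos = image.find(pattern)
    while pos != -1:
        pairs.append((pos + 2, pos + 5))
        pos = image.find(pattern, pos + 1)
    return pairs


if __name__ == "__main__":
    # Synthetic bytes: one pair, then a lone add edx, 41h.
    sample = bytes([0x83, 0xEA, 0x41, 0x83, 0xC2, 0x41, 0x90, 0x83, 0xC2, 0x41])
    print(find_sub_add_pairs(sample))
```

Any add edx,41h whose immediate offset shows up in one of these pairs should either be left alone or patched together with its sub.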

I do think an official fix should be made, even if it's a reimplementation of what we've found. The failure seems to be that the driver is sending their custom IOControl codes to all drives to find the boards. The likely problem is that the IOControl codes, since they are not registered the way CLSIDs are, are overlapping, and a code number that ST uses happens to match something that the RAID controller uses as well. Even something as simple as ST's techs sending you a file to log what GetDriveType returns and skipping those devices would be much better than what's currently going on.

Anyhow, thanks for your patience.

jimb
Associate II
Posted on May 17, 2011 at 15:06

I love it when Windows crashes. I am running Parallels under Mac OS with Windows XP SP3 and have the same blue screen driver fault you are describing. It is a hard kill to my system. I thought this might be a Mac/Parallels/XP driver error, but I guess I am in the same boat as you. I am looking for solutions and have some good ideas below. Thanks

smh
Associate II
Posted on May 17, 2011 at 15:06

I've burnt quite a lot of midnight oil on this and I confess I've finally lost the will to continue. The modification proposed by pdadev does the trick and that will have to do for now (BTW, although the patch at 3b171 is incorrect I think it's harmless because it appears to be in menu handling code in the Borland Visual Component library which I imagine is not actually used).

This driver is a pretty horrible piece of work at the assembler level - I suspect the source must be grisly too. I can't resist a few comments, in the hope that someone at ST takes sufficient pride in their work to want to fix something that is undoubtedly broken.

I'm pretty sure pdadev is correct that the code is mis-identifying the RAID disk as the STM32 device, and is then sending user-defined IoControl codes to the RAID. Although one might argue that the NVIDIA driver shouldn't crash, it's quite possible that the codes clash with values used by NVIDIA for their own purposes and are being fed invalid data. (Recall the crash is a memory access through an uninitialised pointer).

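The function codes passed to DeviceIoControl that I found were:

0x04D014 - Function = 0x405 : IOCTL_SCSI_PASS_THROUGH_DIRECT (device type 0x04; IOCTL_STORAGE_BREAK_RESERVATION also uses function 0x405, but under device type 0x2D)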

0x072008, 0x222008 - Function = 0x802 : User

0x07200C, 0x22200C - Function = 0x803 : User

0x072014 - Function = 0x805 : User

0x07201C, 0x22201C - Function = 0x807 : User

0x072020, 0x222020 - Function = 0x808 : User

(Microsoft defines functions in the range 0x000 to 0x7ff; user functions are in the range 0x800 to 0xfff).
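The function numbers in this list come from splitting each control code into its fields, following the standard CTL_CODE layout: device type in bits 16 to 31, access in bits 14 and 15, function in bits 2 to 13, and transfer method in bits 0 and 1. A short sketch of the decoding, with made-up helper names:

```python
from typing import NamedTuple


class Ioctl(NamedTuple):
    device_type: int
    access: int
    function: int
    method: int


def decode_ioctl(code):
    """Split a Windows I/O control code into its CTL_CODE fields."""
    return Ioctl(
        device_type=(code >> 16) & 0xFFFF,
        access=(code >> 14) & 0x3,
        function=(code >> 2) & 0xFFF,
        method=code & 0x3,
    )


def is_user_function(code):
    """True when the function number lies in the 0x800 to 0xfff range."""
    return decode_ioctl(code).function >= 0x800


if __name__ == "__main__":
    for code in (0x04D014, 0x072008, 0x222008, 0x07201C, 0x222020):
        f = decode_ioctl(code)
        print(f"0x{code:06X}: type=0x{f.device_type:02X} "
              f"function=0x{f.function:03X} user={is_user_function(code)}")
```

Note that the same function number means different things under different device types, so the device type field matters when putting a name to a code.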

The function STMass_Enum_Reenumerate goes through convulsions to try to identify the STM32 device. It starts with the obvious call to GetDriveType, but completely ignores the return value (which could at least exclude floppies and CD drives) before enumerating drives B to Z using functions from WINSETUPAPI. A is skipped because of something that looks suspiciously like an off-by-one error. I'm afraid I got lost in all this code when it started enumerating the parents of each device.

My guess is that the author was scared off using GetDriveType by MSDN, which tells you it's not the right way to identify USB drives. This is because USB-connected disks may claim to be DRIVE_FIXED or DRIVE_REMOVABLE, depending on the media used (A USB hard disk will claim to be fixed, a card reader removable; most USB pen drives apparently claim to have removable media too). Actually, the STM32 returns DRIVE_REMOVABLE: I believe this is a property of the device not the driver, but I'm not certain.
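Even so, the return value is good enough to rule drives out. The sketch below keeps only DRIVE_REMOVABLE and DRIVE_FIXED roots as candidates; it does not identify USB devices, it only narrows the set that anything else would be sent to. The constants are the documented GetDriveType return values. The helper names are made up, and the query function only works on Windows.

```python
import string
import sys

# Documented GetDriveType return values.
DRIVE_UNKNOWN, DRIVE_NO_ROOT_DIR, DRIVE_REMOVABLE = 0, 1, 2
DRIVE_FIXED, DRIVE_REMOTE, DRIVE_CDROM, DRIVE_RAMDISK = 3, 4, 5, 6


def plausible_drives(drive_types):
    """Keep roots whose type could belong to a USB mass storage device.

    `drive_types` maps a root path such as "F:\\" to its GetDriveType value.
    """
    keep = {DRIVE_REMOVABLE, DRIVE_FIXED}
    return sorted(root for root, kind in drive_types.items() if kind in keep)


def query_drive_types():
    """Ask Windows for the type of every drive letter (Windows only)."""
    import ctypes
    get_drive_type = ctypes.windll.kernel32.GetDriveTypeW
    return {f"{c}:\\": get_drive_type(f"{c}:\\") for c in string.ascii_uppercase}


if __name__ == "__main__":
    if sys.platform == "win32":
        print(plausible_drives(query_drive_types()))
    else:
        # Hypothetical machine, for illustration only.
        print(plausible_drives({"A:\\": 1, "C:\\": 3, "D:\\": 5, "F:\\": 2}))
```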

Surely there's an easier and more reliable way to find out if your own device is plugged into a machine? There is. The article at

http://www.codeproject.com/KB/system/RemoveDriveByLetter.aspx?msg=2069695

on CodeProject explains how to identify USB disks, and the demo code is easily tweaked to print out the device ID string, which contains the vendor and device IDs and ought to be a reliable fingerprint for the STM32.

Using the modified CodeProject demo, I found that the RAID did indeed turn up in the enumeration of the STM32:

DeviceId=ACPI\NVRAIDBUS\3&267A616A&0 <-- this is the RAID

DeviceId=USB\VID_0483&PID_3744\5&1A052CC6&0&6 <-- this is the STM32

However, the code safely ignores it by checking that the drive device number matches the volume device number (read the article). The DeviceIoControl function that provides the device number also returns the Bus Type, which confirms the device is USB. It looks like the ST driver is not making this additional check.
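The fingerprint idea itself is easy to sketch: pull the vendor and product IDs out of the device ID string and compare them with the pair seen in the listing (VID 0483, PID 3744). The function names are made up, and this covers only the string check, not the device number or bus type comparison described in the article.

```python
import re

# Matches the start of a USB device ID such as USB\VID_0483&PID_3744\...
_USB_ID = re.compile(r"^USB\\VID_([0-9A-F]{4})&PID_([0-9A-F]{4})\\", re.IGNORECASE)

# Vendor/product pair taken from the DeviceId string quoted in this post.
TARGET_VID_PID = (0x0483, 0x3744)


def usb_vid_pid(device_id):
    """Return (vid, pid) for a USB device ID string, or None if not USB."""
    match = _USB_ID.match(device_id)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def is_target_device(device_id):
    """True only when the device ID carries the expected VID/PID pair."""
    return usb_vid_pid(device_id) == TARGET_VID_PID


if __name__ == "__main__":
    for dev in (r"ACPI\NVRAIDBUS\3&267A616A&0",
                r"USB\VID_0483&PID_3744\5&1A052CC6&0&6"):
        print(dev, is_target_device(dev))
```

A check like this would reject the RAID entry before any control code is sent to it.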

Apologies if I'm missing some subtle reason why the driver is written the way it is, but the approach above seems much safer than spraying all the disks on a machine with user-defined IoControl codes and hoping they do no harm.