Speech Recognition

sgyuri2003
Associate II
Posted on May 04, 2012 at 15:45

Hi all!

Do you think it is possible to build speech recognition or speech-driven applications on this controller?

19 REPLIES 19
Posted on May 04, 2012 at 16:12

Do you think it is possible to build speech recognition or speech-driven applications on this controller?

Possible or practical? I doubt it's practical.

How large is your application, and how much horse power does it need to operate effectively?

Tips, buy me a coffee, or three.. PayPal Venmo Up vote any posts that you find helpful, it shows what's working..
frankmeyer9
Associate II
Posted on May 04, 2012 at 17:56

You can probably exclude everything below a Cortex-M4F, assuming you mean spectral-based signal analysis.

It might be possible with some constraints and optimizations, but hardly with the quality you get from GHz-clocked processors with a double-precision FPU.
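To give "spectral-based signal analysis" a concrete shape, here is a minimal desktop sketch (not firmware) of one possible front end: window a frame, take a magnitude spectrum, and sum it into a few bands. The frame length, sample rate, band count, and all function names are assumptions chosen for illustration. On a Cortex-M4F one would replace the naive DFT with an optimized single-precision FFT.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (illustration only)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        mags.append(abs(acc))
    return mags


def band_energies(frame, num_bands=8):
    """Window the frame, then sum spectral energy in equal-width bands (DC skipped)."""
    window = hamming(len(frame))
    mags = magnitude_spectrum([x * w for x, w in zip(frame, window)])
    bins = mags[1:]  # drop the DC bin
    width = len(bins) // num_bands
    return [
        sum(m * m for m in bins[b * width:(b + 1) * width])
        for b in range(num_bands)
    ]


if __name__ == "__main__":
    fs = 8000   # assumed sample rate in Hz
    n = 64      # assumed frame length (125 Hz per bin at 8 kHz)
    tone = [math.sin(2 * math.pi * 750 * i / fs) for i in range(n)]
    energies = band_energies(tone)
    print("strongest band:", energies.index(max(energies)))
```

A per-frame vector like this is small enough to compute in single precision, which is why the FPU matters for this kind of front end. The recognition stage that consumes the vectors is a separate cost.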

sgyuri2003
Associate II
Posted on May 05, 2012 at 12:08

Thank you for your answer!

http://www.ebv.com/fileadmin/products/Products/STM/STM32F4_Series/STM32_F-4_Series_marketing_presentation_Customer_presentation_.pdf

Floating point Unit

- Graphic acceleration: moves like rotations and so on...
- Advanced algorithms: audio (voice recognition, pitch detection) or image processing
- Direct Matlab interface: PC tools generate floating point code, directly portable on FPU. A fixed point device will require more care and adaptation.

Is this text purely marketing?

Do the DSP and FPU units not help?

Posted on May 05, 2012 at 13:38

I'm sure it helps, but it is going to depend a lot on what YOUR algorithm requires in terms of speed and resources. Don't expect it to function like Siri or Dragon, or a 168 MHz Cortex-M4 to grind numbers like a 3 GHz x86 with megabytes or gigabytes of RAM.

What marketing is probably talking about is something where someone grunts ON or OFF at the device and it can detect the different intonations, not something that can decipher a stream of speech on the fly.
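For the "grunt ON or OFF" case, one classic small-vocabulary approach is to compare the incoming utterance against a stored template per word using dynamic time warping (DTW). Nobody in this thread names DTW, so treat it as my assumption about what such a device might do. The feature vectors, labels, and function names below are made up for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    inf = float("inf")
    rows, cols = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # Euclidean distance between the two frames
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[rows][cols]


def classify(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


if __name__ == "__main__":
    # Hypothetical 2-D feature tracks standing in for per-frame spectral features.
    templates = {
        "ON": [[0.0, 1.0], [0.5, 1.0], [1.0, 0.5]],
        "OFF": [[1.0, 0.0], [1.0, 0.5], [0.2, 0.2], [0.0, 0.0]],
    }
    # Same shape as "ON" but spoken more slowly (more frames).
    spoken = [[0.1, 0.9], [0.1, 1.0], [0.6, 1.0], [0.9, 0.6], [1.0, 0.5]]
    print(classify(spoken, templates))
```

The cost is roughly (frames in utterance) x (frames in template) x (vector length) per word, which stays small for a handful of short commands. That is a very different workload from decoding continuous speech.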

Andrew Neil
Evangelist
Posted on May 05, 2012 at 15:31

From: kovacs.janos:

http://www.ebv.com/fileadmin/products/Products/STM/STM32F4_Series/STM32_F-4_Series_marketing_presentation_Customer_presentation_.pdf

You're quoting a page there from EBV - so why don't you contact EBV and ask them to explain precisely what they mean by it?

That is, after all, what technical Distributors are there for!

''We, as a semiconductor specialist, are being asked to provide in-depth application support, value-added services and logistics solutions to a diverse customer base. EBV Elektronik will embrace these challenges as we take advantage of our strong position within the electronic components industry and our highly skilled and motivated employees .... Our goal is to meet and exceed the needs of our customers and vendors like never before, providing access to a new level of resources in technical expertise and supply chain solutions. Don't expect anything else from us than the best in class service in semiconductor distribution.''

Slobodan Puljarevic

President & CEO EBV Elektronik

http://www.ebv.com/en/about-ebv.html?ct_ref=m-6

Go on - put them to the test!

frankmeyer9
Associate II
Posted on May 05, 2012 at 17:43

I have to agree with clive and neil.andrew - it depends a lot on the algorithms you intend to use. You might find several packages in source code for PCs, but they are usually based on double precision at least.

To get some PC-based algorithms ported to a Cortex-M4, you have to put heavy effort into optimization, and you might have to sacrifice something to tune it for your application.

On the other hand, look at the TDA7590, for instance. That part delivers 120 MIPS, with an architecture and instruction set tuned to exactly such applications. While the STM32F4 theoretically has higher performance and higher numerical resolution (32-bit single-precision float, compared to 24-bit fixed point), it lacks the highly optimized DSP engine and heavily tuned libs.

With the surplus in raw MIPS, you might keep up with that DSP class.

But, as mentioned, the most important thing is working and efficient algorithms.
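To illustrate the numerical-resolution point, here is a small desktop sketch comparing rounding to IEEE 754 single precision with rounding to a 24-bit signed fixed-point value. The fixed-point layout (1 sign bit, 23 fractional bits, range [-1, 1)) is an assumption for illustration and not taken from the TDA7590 documentation. The sketch also ignores accumulator width and any scaling a tuned DSP library would apply.

```python
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_q23(x):
    """Round to assumed 24-bit fixed point (23 fractional bits), saturating to [-1, 1)."""
    scale = 1 << 23
    q = max(-scale, min(scale - 1, round(x * scale)))
    return q / scale


def relative_error(exact, approx):
    """Relative rounding error of approx with respect to a non-zero exact value."""
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    for x in (0.7, 0.001, 1e-6):
        print(
            f"x={x:g}  float32 rel.err={relative_error(x, to_float32(x)):.2e}  "
            f"Q23 rel.err={relative_error(x, to_q23(x)):.2e}"
        )
```

Near full scale the two are comparable, since single precision also carries a 24-bit significand. The difference shows for small values: the float keeps its relative precision, while the fixed-point value has a constant absolute step, so a fixed-point port needs careful scaling of every intermediate result.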

sgyuri2003
Associate II
Posted on May 08, 2012 at 07:56

Remember the Ericsson T39m, the "dinosaur"? Weak hardware, yet efficient and effective voice control.

And the 8051-core Sensory ''neural network'' voice control device?

The goal is not a Dragon-class voice control device.

I believe that these resources can do something on this topic.

Amen to that. 🙂

sgyuri2003
Associate II
Posted on May 08, 2012 at 09:09

See this:

http://www.ti.com/lit/an/spra100/spra100.pdf