STR9 Best performance from flash

han1
Associate II
Posted on October 05, 2006 at 10:44

18 REPLIES 18
han1
Associate II
Posted on May 17, 2011 at 09:31

I am evaluating the STR9 with the Keil STR9 Starter Kit. I tried different settings, but I could not get better performance than the STR715 for the same code.

I used the settings below.

SCU_MCLKSourceConfig(SCU_MCLK_OSC);
FMI_Config(FMI_READ_WAIT_STATE_2, FMI_WRITE_WAIT_STATE_0, FMI_PWD_ENABLE,
           FMI_LVD_ENABLE, FMI_FREQ_HIGH);
SCU_PLLFactorsConfig(192, 25, 2);      /* PLL = 96 MHz */
SCU_PLLCmd(ENABLE);                    /* PLL Enabled */
SCU_MCLKSourceConfig(SCU_MCLK_PLL);    /* MCLK = PLL */

I also set RCLK equal to fMSTR, HCLK equal to RCLK, and FMICLK equal to RCLK.
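As a sanity check on the "PLL = 96 MHz" comment, here is a minimal sketch of the PLL arithmetic. It assumes the relation fPLL = (2 x N x fOSC) / (M x 2^P) with the arguments of SCU_PLLFactorsConfig taken as (N, M, P), and a 25 MHz oscillator on the board; the oscillator value and the helper name str9_pll_hz are assumptions, not something stated in this thread.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the ST library).
 * Assumed relation: fPLL = (2 * N * fOSC) / (M * 2^P), result in Hz. */
static uint32_t str9_pll_hz(uint32_t fosc_hz, uint32_t n, uint32_t m, uint32_t p)
{
    uint64_t num = 2ull * n * fosc_hz;   /* 64-bit to avoid overflow */
    uint64_t den = (uint64_t)m << p;     /* M * 2^P */
    return (uint32_t)(num / den);
}
```

Under those assumptions, str9_pll_hz(25000000, 192, 25, 2) returns 96000000, which matches the comment in the settings above.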

What else should I do to get better performance from the STR9?

Regards,

han1
Associate II
Posted on May 17, 2011 at 09:31

I got no reply (as I expected), so I dug deeper to understand what is going on.

I ran the same code on the STR7 eval board (48 MHz) and the STR9 eval board (96 MHz). Running from RAM, the STR9 is 1.5 times faster than the STR7; going by clock speed alone it should be 2 times faster. According to the ARM966E-S documentation, there can be stalls because of the TCM memory and the ARM966 architecture, so my test code probably causes some stalls, and that would explain why the performance is lower than I expected. I then ran the same code from flash on both CPUs. The STR9 ran only 20% faster than the STR7, which I find surprising and not acceptable.
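To put the two figures on the same scale, here is a small sketch that divides the measured speedup by the clock ratio. A result of 1.0 would mean the speed scales exactly with the clock. The function name per_clock_efficiency is made up for illustration.

```c
/* Measured speedup divided by the clock ratio.
 * 1.0 means performance scales exactly with the clock frequency. */
static double per_clock_efficiency(double measured_speedup,
                                   double fast_mhz, double slow_mhz)
{
    return measured_speedup / (fast_mhz / slow_mhz);
}
```

With the numbers above (96 MHz vs. 48 MHz), the RAM case (1.5 times) works out to 0.75 and the flash case (1.2 times) to 0.6, so per clock cycle the STR9 does less work than the STR7 in both cases for this particular test.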

I wonder whether ST has done any performance tests comparing the STR7 and STR9; if so, I hope they will share them with us.

Regards,

anis
Associate II
Posted on May 17, 2011 at 09:31

Hi Han,

Could you give some details about your test code (or send it)? This could help us understand the results you have found.

Best regards,

STARM

han1
Associate II
Posted on May 17, 2011 at 09:31

/* This function for STR9 */
void ToggleLed(void)
{
    static int ledv=0;

    ledv^=1;
    GPIO7->DR[0x3FC] =ledv;
}

void FuncTest(void)
{
    register int ToggleCounter=10000000;
    volatile register int Val2=0;
    volatile register int Val3=0;
    volatile register int Val4=0;
    volatile register int Val5=0;
    volatile register int Val6=0;
    volatile register int Val7=0;
    volatile register int Val8=0;
    volatile int Val9=0;
    volatile int Val10=0;
    volatile int Val11=0;
    volatile int Val12=0;
    volatile int Val13=0;
    volatile int Val14=0;
    volatile int Val15=0;
    volatile int Val16=0;
    volatile int Val17=0;
    volatile int Val18=0;
    volatile int Val19=0;

    for(;;)
    {
        while (ToggleCounter--)
        {
            ++Val2;
            ++Val3;
            ++Val4;
            ++Val5;
            ++Val6;
            ++Val7;
            ++Val8;
            ++Val9;
            ++Val10;
            ++Val11;
            ++Val12;
            ++Val13;
            ++Val14;
            ++Val15;
            ++Val16;
            ++Val17;
            ++Val18;
            ++Val19;
        }
        ToggleLed();
        ToggleCounter=10000000;
    }
}

I ran this test with the IAR EWARM compiler. On the STR9 it takes 31.2 s when the application runs from flash and 18.3 s when it runs from RAM (using __ramfunc). I used the variables (Val2 to Val19) to make the code as linear as possible. EWARM produces a clean "load, increment, store" sequence (3 instructions) for each ++ValXX. I did it this way to see the real flash performance, because jumps generally reduce the performance of flash with a prefetch mechanism. I used EWARM because I was not able to run my code from RAM with the compiler that comes with the Keil STR9 evaluation board (MCB-STR9).
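A rough sketch of how the two timings can be compared. The flash/RAM ratio needs no assumptions. The cycles-per-iteration figure does: the post does not say how many passes of the 10,000,000-iteration loop each time covers, so the iteration count is left as a parameter. Both function names are made up for illustration.

```c
/* Ratio of flash run time to RAM run time (> 1 means flash is slower). */
static double flash_slowdown(double flash_s, double ram_s)
{
    return flash_s / ram_s;
}

/* Average CPU cycles per loop iteration.
 * total_iterations is an ASSUMPTION supplied by the caller: the post
 * does not state how many inner-loop passes the measured time covers. */
static double cycles_per_iteration(double seconds, double cpu_hz,
                                   double total_iterations)
{
    return seconds * cpu_hz / total_iterations;
}
```

flash_slowdown(31.2, 18.3) comes out at roughly 1.7, i.e. the same code takes about 70% longer from flash than from RAM. If each time covered a single pass of 10,000,000 iterations at 96 MHz (an assumption), cycles_per_iteration(18.3, 96e6, 1e7) would be about 176 cycles per iteration from RAM.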

My initialization code for EWARM:

void SCU_Configuration(void)
{
    /* Enable the __GPIO7 */
    SCU_APBPeriphReset(__GPIO7,DISABLE);
    SCU_APBPeriphClockConfig(__GPIO7 ,ENABLE);
}

GPIO_InitTypeDef GPIO_InitStructure;

void main()
{
    SCU_Configuration();

    SCU_MCLKSourceConfig(SCU_MCLK_OSC);
    SCU_RCLKDivisorConfig(SCU_RCLK_Div1);
    SCU_HCLKDivisorConfig(SCU_HCLK_Div1);
    FMI_Config(FMI_READ_WAIT_STATE_3,FMI_WRITE_WAIT_STATE_0, FMI_PWD_ENABLE,
               FMI_LVD_ENABLE,FMI_FREQ_HIGH);
    SCU_PLLFactorsConfig(192,25,2); /* PLL = 96 MHz */
    SCU_PLLCmd(ENABLE); /* PLL Enabled */
    SCU_MCLKSourceConfig(SCU_MCLK_PLL); /* MCLK = PLL */

    SCU->GPIOOUT[7]=0x5555;
    GPIO7->DDR=0xff;

    FuncTest();
}

Best regards,

Posted on May 17, 2011 at 09:31

Hi,

I ran the same test on the STR9 and the results are similar.

I tried enabling the PQFBC, but the results got worse: with the PQFBC enabled, the time rises to 33.6 s.

:-?

han1
Associate II
Posted on May 17, 2011 at 09:31

I ran the same code on a 60 MHz ARM7 processor from another vendor; the STR9 and the other ARM completed the code in the same time (90 MHz vs. 60 MHz, ARM9 vs. ARM7?). Could anybody share startup and main initialization code with me (STARM?), so that I can use the same code for the test? IAR and Keil (RealView) are both fine for me.

Best regards,

HAN

han1
Associate II
Posted on May 17, 2011 at 09:31

I also tried turning on the PQFBC. In this test it has no benefit, because the code is very linear. In real applications the code is rarely linear, so the PQFBC is very useful indeed; it increased the speed by 50% in one of my tests.

STARM, could you please explain what else I have to do?

Regards,

Posted on May 17, 2011 at 09:31

Dear Han,

I found a comparison between the ARM7 and ARM9:

http://www.arm.com/pdfs/comparison-arm7-arm9-v1.pdf

The ARM9E core has an advantage in certain kinds of instructions and in DSP-related math.

I suppose that the small difference seen when the code executes from flash could be related to the flash access time.

I have used ARM7 chips from Atmel, where the flash needs 0 wait states up to 30 MHz.

I didn't find any information in the datasheet about the maximum flash frequency for 0-wait-state access.

I presume that the STR7 and STR9 have the same kind of flash, so code in flash runs at a similar speed on both.

han1
Associate II
Posted on May 17, 2011 at 09:31

The STR9 datasheet explains that it has 128-bit access to flash, so it has a 4-instruction prefetch queue, and it has a 4-entry branch cache, so it is supposed to perform much better than the STR7xx and other ARM7 parts. Besides, it is an ARM9 architecture, which has obvious advantages.

Regards,