Issue with the execution time of NOP instruction [STM32F746G-DISCO]

Question asked by stratos on Jul 27, 2017
Before starting a project that involves digital signal processing, I'm doing some tests with the STM32F746G-DISCO
to evaluate the capabilities of the board.
In particular, I've measured the execution time of a NOP (by toggling GPIOI_PIN_1): 60 ns.
I'm using CubeMX and I've set up the clock configuration to run at 216 MHz (the maximum frequency).
I've also enabled the following in the "Cortex_M7 Configuration" section: TCM Interface, ART ACCELERATOR, Instruction Prefetch, CPU ICache and CPU DCache.
I'm a little bit upset, because 60 ns for a NOP is inconsistent with a system core clock running at 216 MHz.
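To make the mismatch concrete, here is a minimal sketch of the arithmetic I have in mind. The function names are my own (hypothetical, not from any ST library), and it assumes that the core really runs at 216 MHz and that the 60 ns reading is accurate.

```c
#include <assert.h>

/* Duration of one core clock cycle in nanoseconds.
 * clock_hz is the assumed core clock (216e6 in my configuration). */
double cycle_time_ns(double clock_hz)
{
    assert(clock_hz > 0.0);   /* a zero or negative clock makes no sense */
    return 1e9 / clock_hz;
}

/* Number of core clock cycles that fit in a measured interval.
 * measured_ns is the pulse width seen on the GPIO pin (60 ns here). */
double ns_to_cycles(double measured_ns, double clock_hz)
{
    return measured_ns / cycle_time_ns(clock_hz);
}
```

Calling `cycle_time_ns(216e6)` gives about 4.63 ns per cycle, and `ns_to_cycles(60.0, 216e6)` gives about 13 cycles, which is far more than the single cycle I expected for a NOP.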
I'm sure I'm doing something wrong, but I really don't understand where. I've checked the RCC registers and their content is consistent
with the code generated by CubeMX.
Is there any possibility that I've made a mistake? The related documents (datasheet, reference manual, programming manual, etc.)
don't contain the information I'm looking for.
With pipelining, one machine cycle should correspond to one clock cycle for a NOP, so why does this happen?
Thanks for your attention.