I am evaluating Audio Weaver on the STM32F746 Discovery board. I programmed the board using ST-Link with the file "STM32F746_Discovery_MDK_ARM.bin". I then set up a signal processing chain in Audio Weaver Designer and ran it, streaming audio from USB to line out. Sounds great. I then programmed the board with "STM32F746_Discovery_SW4STM32.bin". Running the same Audio Weaver design produced choppy audio output. The same thing happens using "STM32F746_Discovery_EWARM.bin".
I recompiled the source files and programmed the board from TrueSTUDIO (essentially the same as SW4STM32) and got the same choppy audio. I do not have a license for Keil MDK, so I cannot recompile this code to try it. If I did have a license, this would be a non-issue, as I would just develop in Keil.
My Audio Weaver design is fairly busy. If I eliminate some modules it runs fine with SW4STM32.bin.
Is there some optimization that the Keil toolchain performs during compilation that isn't performed with SW4STM32 or EWARM? I would like to understand why the same source code yields more DSP capacity with Keil than with SW4STM32 or EWARM.
Below are the Audio Weaver Server displays for the same design file running under MDK_ARM.bin and SW4STM32.bin. Interestingly, the SW4STM32.bin build reports lower CPU usage.
MDK_ARM.bin:
SW4STM32.bin:
10:58am
Hi Mark,
The difference in performance that you see is due to the difference in the code optimizations achieved by the different compilers. In our profiling experiments, we have found that Keil uVision generally produces the best performing machine code for our algorithms, with IAR Embedded Workbench a close second, and SW4STM32 (free GCC compiler) producing the lowest performing machine code. This matches expectations (vendors of paid IDEs can afford to spend more on optimizing their compilers), and matches your results. The fact that removing some modules fixes the audio artifacts also shows that it's a performance issue.
The piece that is surprising is that the SW4STM32 bin actually reports a lower CPU load than the higher performing Keil image. These CPU numbers only represent the time it takes Audio Weaver to pump audio, so they do not include any I/O or driver tasks, which means that you'll never reach 100% before you start missing audio blocks. The difference between Keil and SW4STM32 could be explained by the HAL drivers not being as well optimized by the SW4STM32 compiler, and therefore adding more cycles that are not captured by the CPU load reported by the AWE Server.
-Axel
12:51pm
Axel,
Thank you for the thoughtful and expeditious response. You confirmed my assumption that the Keil tools are just more optimized. You also helped me understand the CPU load percentage. It was a bit of a mystery to me why I had audio drops before reaching 100%. It makes sense that if the essential non-DSP tasks the microcontroller performs take up 40%, the remainder available for audio is 60%.
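For anyone following along, the arithmetic above can be written out as a rough sketch. It assumes a deliberately simplified model in which the unmeasured I/O and driver overhead is a constant share of the CPU and simply adds to the load the AWE Server reports. The 40% figure is the hypothetical one from this post, not a measurement, and the function names are made up for illustration (they are not part of any Audio Weaver API).

```python
def audio_budget_percent(overhead_percent: float) -> float:
    """CPU share left for audio processing after non-DSP work (assumed constant)."""
    if not 0.0 <= overhead_percent <= 100.0:
        raise ValueError("overhead_percent must be between 0 and 100")
    return 100.0 - overhead_percent


def will_drop_audio(reported_load_percent: float, overhead_percent: float) -> bool:
    """True when the reported audio load no longer fits in the remaining budget."""
    return reported_load_percent > audio_budget_percent(overhead_percent)


def implied_overhead_percent(load_at_first_dropout_percent: float) -> float:
    """Estimate the hidden overhead from the reported load at which dropouts begin."""
    return 100.0 - load_at_first_dropout_percent


if __name__ == "__main__":
    # Hypothetical 40% of the CPU spent on I/O and drivers, as in the post.
    print(audio_budget_percent(40.0))     # share left for audio processing
    print(will_drop_audio(65.0, 40.0))    # reported load above the budget
    print(implied_overhead_percent(60.0)) # overhead implied by dropouts at 60%
```

Under this model, the reported load at which a given build first starts dropping blocks gives a rough estimate of how much time that build spends outside the audio processing, which is one way to compare the driver overhead of two toolchains.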