Datasheet Review: High-Performance STM32 Cortex-M4 Microcontroller
The STM32F469 microcontroller, which is based on the higher performance ARM Cortex-M4 processor architecture, is extremely fast and offers a huge assortment of features and peripherals.
Technical Difficulty Rating: 7 out of 10
A few weeks ago I did a datasheet review for an entry-level 32-bit ARM Cortex-M0 microcontroller from ST Microelectronics (part # STM32F030). This time I’m going to be reviewing a significantly more advanced microcontroller from the same STM32 line.
The STM32F4, or more specifically the STM32F469, has some significant advantages over entry-level parts. In fact, one of the features it offers is normally only available on more expensive microprocessors. No other microcontroller on the market offers this advanced feature (keep reading to learn more about it).
The datasheet for the STM32F469 is a whopping 216 pages long and there is a lot of technology stuffed into this chip. Needless to say we won’t be reviewing the entire datasheet, and instead we’ll just focus on the front page which details all of the key features.
I’m going to be focusing mostly on the features that differentiate this microcontroller from the STM32F030 that we looked at previously. So I’ll just quickly pass over some of the more standard features that are similar or identical to the STM32F030.
If you’re not familiar with advanced microcontrollers you may want to read my review of the STM32F030 before you proceed with this article.
Processor Core
Arm 32-bit Cortex-M4 CPU with FPU
The STM32F469 is based on an ARM Cortex-M4 processor core. The STM32F030 that we looked at previously was based on a Cortex-M0 processor.
The Cortex-M4 is a much more advanced core than the M0. For one thing, a Cortex-M4 gets more done for each tick of the clock.
For example, if you compare an M0 processor against an M4 processor with the exact same clock speed, the M4 will perform about 50% better than an M0 (based on performance benchmarks).
The other key feature that separates Cortex-M4 processors from less advanced ones is the addition of a Floating Point Unit (FPU).
An FPU is a dedicated hardware unit, sometimes called a math coprocessor, that is specifically designed for performing mathematical functions on floating-point numbers.
Not all versions of the Cortex-M4 include an FPU. Officially, Cortex-M4F is the proper designation for M4 cores with an FPU. But this isn’t strictly followed, and many vendors refer to an M4 processor with an FPU simply as Cortex-M4.
Adaptive Real-Time Accelerator (ART Accelerator)
Microcontrollers, unlike microprocessors, usually run firmware code directly from their on-chip Flash memory. But Flash memory is slow. This bottleneck becomes a more serious issue the faster the processor.
The simple solution is to have the processor execute wait states when running from Flash memory. But that is far from ideal since much of the performance increase is lost because of these wait states.
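To put a rough number on that loss, here is a minimal sketch in Python. It assumes a hypothetical Flash that can deliver at most one read per 30 MHz of clock, and it ignores any prefetching or caching, so treat the figures as illustrative only. The function names are my own and are not from the datasheet.

```python
import math

def wait_states(cpu_mhz, flash_max_mhz):
    """Wait states needed so each Flash read fits the Flash access time."""
    return max(0, math.ceil(cpu_mhz / flash_max_mhz) - 1)

def effective_fetch_mhz(cpu_mhz, flash_max_mhz):
    """Fetch rate if every read stalls for the full number of wait states."""
    return cpu_mhz / (wait_states(cpu_mhz, flash_max_mhz) + 1)

# Assumed Flash limit of 30 MHz (hypothetical value for illustration).
FLASH_MAX_MHZ = 30

for cpu_mhz in (24, 48, 180):
    ws = wait_states(cpu_mhz, FLASH_MAX_MHZ)
    rate = effective_fetch_mhz(cpu_mhz, FLASH_MAX_MHZ)
    print(f"{cpu_mhz} MHz core: {ws} wait states, ~{rate:.0f} MHz effective fetch rate")
```

In this naive model, a 180 MHz core fetching straight from Flash would be held back to the speed of the Flash itself, which is why simply adding wait states wastes most of the faster clock.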
One potential solution is to move the firmware code being executed into faster RAM memory. But RAM memory is less dense than Flash, so on-chip RAM is usually much more limited than Flash memory.
ST’s much more elegant solution is called an Adaptive Real-Time Accelerator (ART Accelerator).
The details of how this memory accelerator works are beyond the scope of this article, but suffice it to say that it significantly improves processor performance when executing code from Flash memory.
180 MHz / 225 DMIPS
The STM32F4 has a performance benchmark rating of 225 Dhrystone MIPS (Million Instructions Per Second), or DMIPS.
Although the M4 core is about 50% faster per clock tick than the M0, the biggest performance boost comes from the fact that an M4 core can run at a much higher clock frequency.
For example, the maximum clock frequency for the STM32F0 is 48 MHz, but the maximum clock for the STM32F4 is 180 MHz. This means at the maximum clock speed the STM32F4 is about 1.5 * (180 / 48) = 5.6 times faster than the STM32F0.
STM32F7 microcontrollers with the even more advanced Cortex-M7 core take this performance even further. Tick for tick an STM32F7 is about 70% faster than an STM32F4. The STM32F7 also supports a slightly faster clock of 216 MHz. This means an STM32F7 has roughly 2x (1.7 * 216 / 180) the performance of the STM32F4.
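The comparisons above all follow the same rule of thumb: multiply the per-clock gain by the ratio of the clock speeds. A short Python sketch of that arithmetic follows, using only the figures quoted in this article. The function name is my own, and real-world performance will depend on much more than these two numbers.

```python
def relative_performance(per_clock_gain, clock_mhz, baseline_clock_mhz):
    """Estimate speed-up as (per-clock gain) x (clock ratio)."""
    return per_clock_gain * clock_mhz / baseline_clock_mhz

f4_vs_f0 = relative_performance(1.5, 180, 48)    # M4 at 180 MHz vs M0 at 48 MHz
f7_vs_f4 = relative_performance(1.7, 216, 180)   # M7 at 216 MHz vs M4 at 180 MHz
dmips_per_mhz = 225 / 180                        # STM32F4 benchmark per MHz of clock

print(f"STM32F4 vs STM32F0: {f4_vs_f0:.1f}x")
print(f"STM32F7 vs STM32F4: {f7_vs_f4:.1f}x")
print(f"STM32F4: {dmips_per_mhz:.2f} DMIPS/MHz")
```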
Finally, the STM32H7 takes this performance to the extreme. It too is based on the Cortex-M7 processor core, but ST’s advanced semiconductor process allows them to crank up the maximum clock speed to a whopping 400 MHz!
For a microcontroller, that is insanely fast. The STM32H7 definitely blurs the line between a microcontroller and a microprocessor.
This speed comes at a price though. These faster parts cost more, they consume more power, and they significantly complicate development, so only use them when absolutely required.
DSP Instructions
The STM32F4 provides a set of special instructions for performing Digital Signal Processing (DSP) operations. DSP is used to process continuous real-world analog signals once they have been converted to digital form.
Analog signals (such as an audio signal) are first converted to digital format via an Analog-to-Digital Converter (ADC). The analog information (audio in this case) can then be processed as digital data by a DSP.
Once the digital signal processing is completed, the information is then converted back into analog format by a Digital-to-Analog Converter (DAC).
There are separate high-performance DSP processors available as well that are specifically designed for processing digital signals. My former employer Texas Instruments, along with Analog Devices, mostly dominates the DSP chip market.
The STM32F4 doesn’t include a separate DSP processor, but it does offer specialized instructions for performing DSP operations on the Cortex-M4 core.
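To give a feel for the kind of work these instructions speed up, here is a minimal sketch of a FIR (Finite Impulse Response) filter, one of the most common DSP operations. Its inner loop is a multiply-accumulate (MAC), which is the operation DSP instructions are built to perform quickly. This is plain Python for illustration only. The function name and sample values are my own, and firmware on a real STM32F4 would be written in C rather than like this.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: each output is a sum of products (multiply-accumulate)."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]  # one multiply-accumulate (MAC) step
        out.append(acc)
    return out

# Two equal coefficients give a simple 2-tap moving average.
moving_average = fir_filter([4, 8, 12, 16], [0.5, 0.5])
print(moving_average)
```

Every output sample costs one MAC per filter coefficient, so a long filter running at audio rates adds up to a lot of multiplications per second.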
Memories
Flash and RAM Memories
The STM32F469 microcontroller supports up to 2 MB of Flash memory. This is the largest amount of built-in Flash memory available on any STM32 microcontroller.
What separates the Flash memory on the STM32F469 from less advanced microcontrollers is the fact that it is split into two separate banks. This allows you to simultaneously read from one bank while writing to the second bank.
The STM32F469 also includes 384 KB of system RAM with 64 KB being what is called Core Coupled Memory (CCM). CCM is RAM memory that is tightly coupled (placed nearby) with the CPU core allowing the fastest access time.
CCM RAM is usually only necessary for computation intensive tasks and real-time processing.
External Memory Controller
Another feature that really sets this microcontroller apart from most other choices is that it has an external memory controller built-in. This allows you to add external high-speed RAM memory that will interface with the microcontroller via a super-fast 32-bit data bus.
The ability to interface with external RAM memory is normally the realm of high-performance microprocessors, not microcontrollers. Microcontrollers tend to have all the RAM built in, which is one of the many reasons designing with a microcontroller is a much simpler design process.
But there are cases when the ability to add additional fast, off-chip RAM will be a big advantage. For example, to serve as a frame buffer in video applications.
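A quick calculation shows why a frame buffer can outgrow on-chip RAM. The Python sketch below computes the memory needed for one full frame at 24 bits per pixel and compares it with the 384 KB of system RAM mentioned earlier. The two smaller display sizes are just example values I picked, and the function name is my own.

```python
def frame_buffer_bytes(width, height, bits_per_pixel):
    """Memory needed to hold one full frame."""
    return width * height * bits_per_pixel // 8

INTERNAL_RAM_BYTES = 384 * 1024  # 384 KB of system RAM, per the article

# 480x272 and 800x480 are example sizes; 1024x768 is the controller's maximum.
for width, height in [(480, 272), (800, 480), (1024, 768)]:
    size = frame_buffer_bytes(width, height, 24)
    fits = "fits" if size <= INTERNAL_RAM_BYTES else "does not fit"
    print(f"{width}x{height}: {size:,} bytes, {fits} in internal RAM")
```

Even the frame that technically fits would leave almost no RAM for the rest of the application, so larger displays quickly push you toward external memory.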
Although it can add significant design complexity, if you do need a large amount of high-speed RAM memory then the STM32F469 can support it.
The external memory controller supports various types of RAM memory including SRAM (Static RAM), PSRAM (Pseudo-Static RAM), and SDRAM (Synchronous Dynamic RAM). It also supports NOR/NAND based Flash memory.
Quad-SPI Interface
SPI is a very common serial communication interface that you may be familiar with already. It’s commonly used for communicating between chips on a PCB that need to pass data at a moderately high speed. It’s much faster than the two-wire I2C protocol.
SPI is also a common choice for interfacing a microcontroller to a Flash memory chip.
Normally, SPI uses two unidirectional data lines called Master-In-Slave-Out (MISO) and Master-Out-Slave-In (MOSI). In order to increase the data transfer rate the quad-SPI interface instead uses four bidirectional data lines.
For example, if reading data from a Flash memory via a standard SPI interface, data is transferred on only a single data line (MISO). However, when using a quad-SPI interface this same data is transferred over four data lines, thus increasing the data throughput speeds by 4x.
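The 4x figure falls straight out of the arithmetic: one bit moves per data line per clock. Here is a small Python sketch of the peak rates, assuming a hypothetical 80 MHz serial clock and ignoring the command and address overhead of a real Flash read. Both the clock value and the function name are assumptions for illustration.

```python
def spi_throughput_mbytes(clock_mhz, data_lines):
    """Peak payload rate in MB/s: one bit per data line per clock."""
    return clock_mhz * data_lines / 8

CLOCK_MHZ = 80  # hypothetical serial clock, for illustration only

standard = spi_throughput_mbytes(CLOCK_MHZ, 1)  # data on MISO only
quad = spi_throughput_mbytes(CLOCK_MHZ, 4)      # four bidirectional lines

print(f"Standard SPI: {standard:.0f} MB/s")
print(f"Quad-SPI:     {quad:.0f} MB/s")
```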
This additional speed is necessary for some applications that need to be able to access Flash memory at the maximum transfer speeds.
If high-speed throughput with Flash memory is absolutely critical for your application you may also consider using Flash memory that connects to the 32-bit data bus offered by the STM32F469’s external memory controller.
Graphics
The graphics capabilities of the STM32F469 really set it apart from not only the STM32F030 that I reviewed recently but also from many other high performance microcontrollers.
In fact, the STM32F469 offers one major graphics feature that is available on no other microcontroller on the market.
Graphical Hardware Accelerator
A graphics hardware accelerator is included which ST calls the Chrom-ART Accelerator. This hardware accelerator is for performing image manipulation tasks. It offloads this work from the core processor, leaving it free to perform other tasks.
LCD TFT controller
The LCD-TFT display controller outputs 24-bit parallel RGB data (8 bits each for red, green, and blue). This allows it to be directly interfaced to a wide variety of LCD-TFT displays. It supports resolutions up to 1024 x 768.
A TFT (Thin-Film-Transistor) LCD display is an active matrix display, unlike simple passive LCD displays that offer a limited number of segments. LCD-TFT is the same technology commonly used for computer displays and television sets.
MIPI-DSI host controller
Finally we’ve come to the feature that I’ve been alluding to as being available in no other microcontroller on the market. Drum roll, please… That feature is called MIPI-DSI. It is a feature which is normally only found on much more expensive microprocessors, not on microcontrollers.
The MIPI Alliance is an organization that develops interface specifications for the mobile industry. It was founded in 2003 by ARM, ST Microelectronics, Texas Instruments, Intel, Nokia and Samsung. MIPI stands for Mobile Industry Processor Interface.
MIPI-DSI is their specification for a unidirectional, serial interface designed especially for connecting to a display. DSI stands for Display Serial Interface.
With the LCD-TFT controller I mentioned, data is output as 24-bit parallel RGB data. This means that at least 24 pins on both the microcontroller and the display are required, and that 24 signal traces on the PCB are required for connecting them together. All of this adds more size to the product, which is obviously a big deal for mobile products.
A standardized serial display interface that requires only a fraction of the signals significantly reduces the size of the product.
MIPI-DSI has been widely adopted by the mobile industry. It is ubiquitous in smart phones and tablets. Having a microcontroller that supports MIPI-DSI opens up a huge selection of high resolution, full color, mobile displays. Most of these are high resolution AMOLED (Active Matrix Organic Light Emitting Diode) displays.
The MIPI-DSI interface on the STM32F469 can support up to 720p HD video at 30 frames per second. If 1080p HD is your goal then you will require a microprocessor or a specialized video processor.
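To get a sense of how much data that represents, here is a rough Python calculation of the raw pixel data rate for 720p at 30 frames per second with 24-bit color. It assumes a 1280 x 720 frame and ignores blanking intervals and protocol overhead, so the real link rate would be somewhat higher. The function name is my own.

```python
def video_bit_rate(width, height, bits_per_pixel, fps):
    """Raw pixel data rate in bits per second (no blanking or protocol overhead)."""
    return width * height * bits_per_pixel * fps

rate_720p30 = video_bit_rate(1280, 720, 24, 30)
print(f"720p at 30 fps: {rate_720p30 / 1e6:.1f} Mbit/s of pixel data")
```

Carrying that much data over a handful of serial signals instead of 24 parallel ones is exactly the saving that makes a serial display interface attractive for small products.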
Note that there is also a MIPI-CSI interface standard for connecting to cameras, but that is not supported by this microcontroller or any other one currently available. Hopefully that will change in the near future (are you listening ST?).
Analog-to-Digital Converter (ADC)
The STM32F030 included only one 12-bit ADC with a maximum sampling rate of 1 MSPS (Million Samples Per Second). The STM32F469 increases the number of converters to three, and increases the sampling speed up to 2.4 MSPS.
The STM32F469 even supports sampling rates up to 7.2 MSPS when using what is called triple interleave mode. In this high-speed mode, all three ADC units are working together to sample the same signal. This gives a top sampling rate equal to three times the standard rate, or 3 x 2.4 MSPS = 7.2 MSPS.
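The idea behind interleaving is that each ADC samples at its normal rate, but the start times are staggered so the combined samples are evenly spaced. The Python sketch below models that timing in a simplified way. It is only an illustration of the concept, not a description of how the ADC hardware is configured, and the function name is my own.

```python
def interleaved_sample_times(adc_count, rate_per_adc_msps, samples_per_adc):
    """Sample timestamps (in microseconds) when ADCs are evenly staggered."""
    period_us = 1.0 / rate_per_adc_msps   # time between samples on one ADC
    offset_us = period_us / adc_count     # stagger between neighboring ADCs
    times = []
    for i in range(samples_per_adc):
        for adc in range(adc_count):
            times.append(i * period_us + adc * offset_us)
    return times

times = interleaved_sample_times(adc_count=3, rate_per_adc_msps=2.4, samples_per_adc=2)
combined_msps = 3 * 2.4

print([round(t, 3) for t in times])
print(f"Combined rate: {combined_msps:.1f} MSPS")
```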
Debug mode
The STM32F030 only supported Serial Wire Debug (SWD). The STM32F469 still supports SWD but it also supports the more advanced protocol called JTAG.
The JTAG interface requires a lot more pins, and is used for more than just programming and debug. JTAG can be used during manufacturing to detect hardware defects.
Advanced connectivity
Inter-IC Sound (I2S)
The STM32F469 offers a special audio interface called I2S (Inter-IC Sound). This is not to be confused with I2C (Inter-Integrated Circuit).
Both are used for transmitting data synchronously (with a shared clock) between chips on a PCB. But, I2S is specifically used for transmitting stereo digital audio data, whereas I2C is for transmitting more generic, low-speed data.
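The data rate of a digital audio stream is easy to estimate: sample rate times bits per sample times the number of channels. Here is a short Python sketch using common audio settings, which are my own example values rather than figures from the datasheet. Real I2S links may pad each sample into a wider slot, so the actual bit clock can be higher than this.

```python
def i2s_bit_rate(sample_rate_hz, bits_per_sample, channels=2):
    """Audio payload rate in bits per second for a stereo (2-channel) stream."""
    return sample_rate_hz * bits_per_sample * channels

cd_quality = i2s_bit_rate(44_100, 16)   # example: 44.1 kHz, 16-bit stereo
studio = i2s_bit_rate(96_000, 24)       # example: 96 kHz, 24-bit stereo

print(f"44.1 kHz / 16-bit stereo: {cd_quality:,} bit/s")
print(f"96 kHz / 24-bit stereo:   {studio:,} bit/s")
```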
If your product requires high-quality audio functionality then you will almost definitely want to select a microcontroller that supports I2S. To eliminate noise pickup it is usually best to leave audio information in the digital domain as long as possible.
For example, if you need to route audio data across your PCB it is much better to do so as digital data like I2S, not as an analog signal.
Although the entry-level STM32 microcontrollers (like the STM32F030) don’t support I2S, many intermediate versions do support it and you don’t necessarily need a controller as advanced as the STM32F469.
CAN Bus
The STM32F469 also supports two CAN bus interfaces. CAN bus is a serial communication protocol used in automotive applications. CAN is how the various subsystems in an automobile communicate with the master processor.
USB 2.0 OTG
The STM32F469 offers two USB 2.0 ports. One port supports high-speed USB communication up to 480 Mbits/sec which is the maximum transfer speed supported by USB 2.0.
The other port only supports full-speed USB which is limited to transfer speeds of 12 Mbits/sec.
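To see what that difference means in practice, here is a rough Python comparison of the best-case time to move a file at each speed. It uses the raw signaling rates and ignores all USB protocol overhead, so real transfers will be slower. The 10 MB file size and the function name are just example choices of mine.

```python
def transfer_seconds(size_bytes, rate_mbit_per_s):
    """Best-case transfer time at the raw signaling rate (no protocol overhead)."""
    return size_bytes * 8 / (rate_mbit_per_s * 1_000_000)

FILE_BYTES = 10_000_000  # example 10 MB file

high_speed = transfer_seconds(FILE_BYTES, 480)
full_speed = transfer_seconds(FILE_BYTES, 12)

print(f"High-speed (480 Mbit/s): {high_speed:.2f} s")
print(f"Full-speed (12 Mbit/s):  {full_speed:.2f} s")
```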
Both ports support USB-OTG (USB On-The-Go). A device that supports USB OTG mode has the capability to serve either the host function or the peripheral function. Controllers that only support standard USB can’t ever serve the host function.
For example, if your product offers a standard USB interface then you can only connect it to a device that includes a USB host controller such as a PC or tablet. Your product will always serve as the peripheral, so you can’t connect it to a USB peripheral such as a printer.
But with OTG, if you connect your device to a USB peripheral like a printer then your device will assume the host controller role. The hardware required to serve as a USB host controller is significantly more complex than for a peripheral-only device.
Ethernet
Another feature that sets this microcontroller apart from a more entry level microcontroller is that it also includes an Ethernet controller.
To implement Ethernet functionality both a MAC (Media Access Control) layer and a PHY (physical) layer are required. The MAC layer is embedded in the microcontroller, but an external Ethernet transceiver is still required for the PHY layer.
Parallel camera interface
A parallel camera interface allows a digital connection with a camera using between 8 and 14 bits of parallel data. This camera interface supports transfer speeds up to 54 MB/sec.
The speed of this interface is sufficient to support up to 720p HD video. As I mentioned already, to support 1080p HD video a more advanced microprocessor, or specialized video processor is required.
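A quick sanity check of those numbers, sketched in Python: divide the interface bandwidth by the size of one frame to get the maximum frame rate. This assumes 2 bytes per pixel (a common camera output format, but an assumption on my part) and treats 54 MB/sec as 54 million bytes per second. The function name is my own.

```python
def camera_max_fps(bandwidth_bytes_per_s, width, height, bytes_per_pixel):
    """Upper bound on frame rate for a given interface bandwidth."""
    return bandwidth_bytes_per_s / (width * height * bytes_per_pixel)

BANDWIDTH = 54_000_000   # 54 MB/sec, per the article
BYTES_PER_PIXEL = 2      # assumed pixel format, for illustration

fps_720p = camera_max_fps(BANDWIDTH, 1280, 720, BYTES_PER_PIXEL)
fps_1080p = camera_max_fps(BANDWIDTH, 1920, 1080, BYTES_PER_PIXEL)

print(f"720p:  up to about {fps_720p:.0f} frames per second")
print(f"1080p: up to about {fps_1080p:.0f} frames per second")
```

Under these assumptions 720p lands just under 30 frames per second, while 1080p falls to less than half of that, which lines up with the limit described above.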
Packages
The STM32F469 is available in three types of packages: QFP, BGA, and CSP with varying pin counts depending on the number of GPIO pins.
Unless small size is absolutely critical for your product, I would in most cases recommend the QFP package. This is because it is a leaded package so all of the pins are easy to access. This can be critical for debugging purposes. Leaded packages are also cheaper to have soldered onto the PCB.
A BGA (Ball Grid Array) has all of the pins on the bottom of the package in a grid pattern. These make for smaller chip sizes, especially for those requiring high pin counts. But, they also complicate the PCB layout.
Connecting all of those closely placed pins requires a PCB with more layers. That increases the PCB cost. Also, soldering a BGA package on your PCB is more expensive than for a leaded QFP package.
So a BGA package allows for a smaller PCB, but at the expense of extra design complexity and production cost.
A CSP (Chip-Scale Package) takes the concept of a BGA even one step further. It eliminates the plastic package altogether allowing for the absolute minimum size possible.
A CSP package has all of the same disadvantages as a BGA package, so only use this package if you absolutely must squeeze every fraction of a millimeter from your board size.
Summary
I think you can see that the STM32F469 is an extremely advanced microcontroller. It’s in a whole other league than the STM32F030 that we looked at previously. There is a drastic difference in both speed and features.
My hope is that this article has opened your eyes to the fact that microcontrollers are not limited to simple Arduino applications.
As impressive as the STM32F469 may be, its raw processing performance is dwarfed by the STM32F7 series, and especially the STM32H7 series. These microcontrollers begin to blur the line between microcontroller and microprocessor.
Meeting your product’s requirements with a microcontroller instead of a microprocessor is likely to save you lots of time and money. So I highly recommend using a microcontroller for your product whenever it is feasible.
Very helpful information. I would appreciate it if you made the same reviews for TI and Microchip MCUs.
Is the CCM memory the same as cache memory? I ask because you mentioned that it allows “the fastest access time”.
I am using ChibiOS (an RTOS) along with the STM32F407, so I want to know whether it will support threading, i.e. creating multiple threads for different tasks.
Is STM32F407 single core or multi core?
It is single-core.
Thanks for the reply.
Kalpesh, how can a single-core CPU support true multiple threads? It can’t, unless the RTOS is designed to schedule multiple threads on that single core. You should refer to your RTOS’s documentation to check if the microcontroller in question is supported.
What would be the main drawback in choosing between microcontrollers from TI or ST, as they have the same Cortex-M4 processors with the same peripherals?
It mostly depends on what you are used to using. TI is a great company as well and they offer some excellent microcontrollers. As a former TI designer I really like TI solutions and they also provide excellent technical support just like ST. I have found that in general I prefer ST for microcontrollers, and TI for power, audio, etc solutions. The STM32 series is very popular so there is likely more information available on using the STM32 than perhaps the TI microcontrollers.
Thanks for the review, John. I still use Microchip ARM such as SAMD21. Although they are a little more expensive than ST there is a big benefit: they are used in Arduino Due and Zero so you can reuse the schematic/layout and develop the FW in an easy Genuino IDE and keep it there if you don’t have any advanced features that are missing support in Arduino libraries. This speeds up and lowers the cost of the development quite a bit. Another benefit is that if you use USB communication in your product then you can license a free USB PID from Microchip which otherwise would cost you about 2 grand if you get it from USB.org.
Thanks for the feedback and for sharing your experience with Microchip ARM chips. The fact they are used in the Arduino Due and Zero could definitely have some benefits.
One of the most important specifications is cost.
Cost is always important.
I see on Mouser:
$2.65, Qty=1 STM32F030
$15.66, Qty=1 STM32F469
Mouser links(1st Distributor on a Google search):
https://www.mouser.com/Search/Refine.aspx?Keyword=STM32F030
https://www.mouser.com/Search/Refine.aspx?Keyword=STM32F469
Absolutely true, and the STM32F469 is definitely not a cheap microcontroller. There are other choices in the STM32F4 family with similar performance but fewer features and thus lower cost.
Thanks for the comment!
John
What device would you recommend to migrate from PIC32MZ to STM32? I need USB connectivity
I would recommend either a microcontroller in the STM32F4 series, which will likely offer slightly less performance (DMIPS) than the PIC32MZ, or a microcontroller in the STM32F7 series, which will offer quite a bit more. Most of the choices in those series offer USB.
Fantastic publication for its clarity and simplicity in the presentation of the content. 🙂
Thank you Santiago, I really do appreciate the positive feedback!
Are the ST Microelectronics ARM chips significantly better than similar chips from other companies? Is there a reason to choose ST Micro over, say, an NXP chip (e.g. LPC4337)?
Have some but not much familiarity with the 4337 so I’m not sure if I should just dive in the 4337 for my next project, or explore the ARM world more extensively.
Thanks
There are comparable microcontrollers available from quite a few different chip makers. I prefer the STM32 series because there are so many choices, the price to performance ratio is good, and I have found ST technical support to be very good. Also, as I mention in the article, the STM32F469 is the only microcontroller that offers a MIPI-DSI interface. So if you need MIPI-DSI this is the only choice.
Thanks for commenting!
John,
I get requests for video devices. I have never found an off the shelf MP3 encoder / decoder chip. The closest I have found is a T.I. micro / DSP chip and T.I. has MP3 code that can be downloaded. But it requires a $2,000 eval board and still a lot of firmware development. I tell potential clients that if they want video features like MP3 recorder / player, it is an expensive development, usually about 10 times what they expect. Does STM support this MPU / GPU processor with “easy to use” video encoding / decoding?
Marty
Hi Martin,
Well depending on the application I usually like to use specialized video codec chips especially for anything higher than 720p. I suggest you check out a Chinese chip maker called eMPIA. They offer quite a few video codecs that are affordable. These hardware codecs will perform better than a software-only video codec. ST does offer MP3 audio codec libraries. For MP4 or H264 video compression you would likely need to start with an open-source encoder and modify it as needed.
Thanks Martin for all of your comments!
John
Hi Martin,
I’d recommend another lossy encoding called Opus, that’s much better in quality compared to mp3 and results in smaller sizes. The icing on the cake is its being royalty free and completely open.
Cheers,
Jay
Hi John,
Thanks for the review. Looks to me that except for the CSI interface, this chip is in the same class as the Broadcom chip on the Pi. And I have discovered that unless you want to buy a gazillion chips, Broadcom wouldn’t give you a glass of water if you were on fire :.)
Marty
Thanks for the comment Martin. I agree that getting that Broadcom chip is next to impossible. The STM32F469 is definitely feature loaded with much crossover with an MPU, but it does run at a significantly lower clock speed compared to a GHz MPU.
John
Btw, I was also trying to find an MPU with CSI-2, but it looks like I won’t find it. Do you know any other way to connect a CSI-2 camera to an MPU? Some kind of converter or something?
Thanks.
Hi Marcelo,
I assume you meant MCU (Microcontroller Unit) and not MPU (Microprocessor Unit)? There are no MCUs on the market that offer MIPI-CSI, only MIPI-DSI. A company in Taiwan named eMPIA offers various video processor chips that interface with a camera via MIPI-CSI-2 and then output the video as USB for interfacing to an MCU. Toshiba offers the TC358748XBG, which is a MIPI-CSI-2 to parallel bridge chip.
Well, you don’t need a Broadcom chip. There are tons of other MPUs available. In fact, there are so many of them that if I needed one, I wouldn’t know which one to choose!
Yes, definitely. I’ve done many designs with MPUs and none have been Broadcom. The main advantage of the Broadcom solution is that it’s highly integrated, with a ton of functionality packed into that chip.
Thanks for the comment!
John