Teensy 4.0 Released

I was really excited to try @easone’s port this weekend, and so far in my limited testing it’s working great!

The SmartLED Shield for Teensy 4 prototypes are ordered, but realistically it’s going to be a couple months before they’re available for purchase. In the meantime, I just designed an adapter board that takes a Teensy 4.0 or 4.1, and slides into the SmartLED Shield for Teensy 3. The unused signals on the Teensy 4 are brought out to the connectors, though most pins are in different positions as the Teensy 3 and Teensy 4 SmartLED Shields require different pins.

I don’t plan on selling this adapter, but you can order the PCB from OSH Park ($7.45 for three boards, shipped!), and consider ordering a Teensy 4.0 at the same time for only $18! I just ordered the boards myself, so if you want to be safe, wait until I receive and test them before placing an order.

The adapter plugs into the SmartLED Shield with 2x 14-pin headers, the same type you’d use for a Teensy, but in the outer row of holes instead of the inner row. You attach the Teensy 4 to the adapter directly, or with female headers. For the most low-profile setup, you’ll need to solder the Teensy 4 directly to the adapter, then trim any long pins underneath, which would otherwise contact the pads on the SmartLED Shield, or use headers with a short tail (less than 0.1") under the Teensy. If you want to be able to remove the Teensy 4, then use 2x 14 pin female headers.

I’ll post pictures explaining this more after my boards arrive.

The easiest way to order from OSH Park is by using this link to the shared project on OSH Park


Hey easone,

I just tried this at different CPU speeds and it currently seems to only work at 600MHz. Can you confirm the same?

Edit: also I’m noticing a bit of random pixel flicker on two 64x64 (128x64 total) displays at 600MHz. Overclocking things to 720MHz sorts this out.

I was running these panels with the rudimentary libraries I wrote/modified before at 150MHz (mind you it was 16bits total for colour and whatever random refresh rate it felt like doing) with no issues - GitHub - bleckers/RGB-Matrix-Panel-Teensy-4.0: Arduino RGB LED Matrix library with Teensy 4.0 support

This might be due to all the other things I’m doing too which might be slightly interfering with timing. Is there a way to avoid any critical timing regions?

Some of the pixel flicker issue may be electrical and related to stray capacitance or grounding, so I expect it to improve with a shield that does not require jumper wires…

You can try adjusting FLEXIO_CLOCK_DIVIDER in MatrixHardware_KitV4T4.h to decrease the pixel clock speed. The FlexIO clock speed is not affected by overclocking or underclocking the CPU, and at a 600 MHz CPU clock it’s already close to the fastest rate the DMA transfers can keep up with. Slower than 600 MHz doesn’t work for me either.

Edit: if your project has other peripherals that use DMA transfers or buffers in DMAMEM (such as SD or Serial) then that may require a slower FlexIO clock to run stably.

Ah ok, that makes sense considering the higher data requirements of a much nicer image. Not an issue really, everything is working fine at the higher clock speed for me, just wanted to see if it was something on my end.

As for the flicker, there’s a bunch of SPI transfers going on, so that’ll be it. I’ll tweak them to run around the frame timing. Good to know, thanks!

Here’s a PDF schematic of the adapter for anyone that needs it and doesn’t have EAGLE installed

I tested the OSHPark adapter and it appears to work perfectly on both T4.0 and T4.1! Electrical performance looks better now that Louis eliminated the jumper wires.

I committed some changes to my fork to support the adapter. It requires a special header (“MatrixHardware_T4Adapter.h”) to be included before the library.
https://github.com/easone/SmartMatrix
Added FM6126A support and made performance improvements too.

I received my adapter yesterday, and used it with a Teensy 4.0 and 4.1, and it seems to be working well.

The pins used for refreshing the panel conflict with the built-in SPI pins (at least the ones accessible on the edges of the Teensy 4.0), so driving APA102 LEDs with the SmartLED Shield V4 APA102 buffers and cable isn’t very practical as is. @easone told me about KurtE’s FlexIO_t4 library, which can implement an SPI peripheral using DMA and FlexIO on alternate pins. I adapted the SMARTMATRIX_APA class to work on the Teensy 4 using FlexIOSPI, and it’s mostly working well. However, using DMA or FlexIO to drive SPI seems to conflict with the SmartMatrix HUB75 DMA/FlexIO transfers, extending the time it takes to shift out the pixels.

Normal refresh, where the clocks to drive the HUB75 panel fit between latch pulses

Slow refresh, where the clocks extend past the latch pulses

I just added a larger overhead to PANEL_32_PIXELDATA_TRANSFER_MAXIMUM_NS which seems to solve that problem:

#define PANEL_32_PIXELDATA_TRANSFER_MAXIMUM_NS  ((32*FLEXIO_CLOCK_DIVIDER*1000/480) + 50 + 650)

With larger APA102 pixel counts, the APA102 code seems to conflict with HUB75, and there are corrupted lines on the screen. I’m pretty sure this is because the APA102 code calculates all pixels in a frame at once - which can take a long time - and it runs in a low-priority interrupt context, keeping the HUB75 calculation code from running. I need to break up the APA102 calculation code to do a row (or a few) at a time, instead of a frame at a time.

You can find the Teensy4 APA102 code here, and the FastLED_Panel_Plus_APA example is included: GitHub - pixelmatix/SmartMatrix at teensy4


Louis and I are now collaborating in the new “teensy4” branch in the pixelmatix repo. The fork I started is deprecated - I won’t be keeping it updated. For the latest code, go here:

I just committed an improvement that should cut DMA usage down quite a bit. This should help with the conflict with the APA class as well as potential other DMA usage (e.g. SD card).

The driver can now be configured to use from 1 to 4 FlexIO shifters using the RGBDATA_SHIFTERS constant in SmartMatrixRefreshT4.h. Previously it was hardcoded to use 2 shifters. Going to 4 shifters doubles the DMA minor loop size to fill up the shifters, so the number of DMA triggers per row is cut in half. Although the number of bytes transferred is the same, there are a lot of overhead cycles associated with each DMA trigger (for the DMA arbitration process, offset calculation etc.). So the total usage of the DMA engine is lower.
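As a rough illustration of why more shifters means fewer DMA triggers, here is a minimal sketch. It assumes each FlexIO shifter buffer holds 32 bits and that one DMA trigger refills every shifter in use. The function names and the 128-byte row size are hypothetical and not taken from the driver.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one FlexIO shifter buffer is 32 bits wide (4 bytes).
constexpr uint32_t kBytesPerShifter = 4;

// Bytes moved per DMA trigger (minor loop) when `shifters` shifters are refilled at once.
constexpr uint32_t minorLoopBytes(uint32_t shifters) {
    return shifters * kBytesPerShifter;
}

// Number of DMA triggers needed to move one row buffer of `rowBytes` bytes (rounded up).
constexpr uint32_t dmaTriggersPerRow(uint32_t rowBytes, uint32_t shifters) {
    return (rowBytes + minorLoopBytes(shifters) - 1) / minorLoopBytes(shifters);
}

// Hypothetical 128-byte row: same bytes moved, half as many triggers with 4 shifters.
inline void shifterExample() {
    assert(dmaTriggersPerRow(128, 2) == 16);
    assert(dmaTriggersPerRow(128, 4) == 8);
}
```

The byte count per row is unchanged; only the number of triggers, and with it the per-trigger overhead, goes down.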

Overclocking the CPU to 816 MHz will help too, since that speeds up the DMA engine without increasing the matrix panel clock speed.

If you need to slow down the HUB75 driver to get APA102 working well, I’d recommend decreasing the pixel clock (by increasing FLEXIO_CLOCK_DIVIDER) rather than adding overhead time to PANEL_32_PIXELDATA_TRANSFER_MAXIMUM_NS. Both ways extend the time between latch pulses. The effect should be the same in terms of total DMA usage, but changing the clock speed will space out the DMA transfers evenly across the row update time.
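To make the difference between the two approaches concrete, here is a small sketch based on the macro quoted earlier in the thread. It assumes the "480" in that macro is the FlexIO clock in MHz, and it uses a hypothetical divider of 20 as the starting point. The helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the "480" in the macro is the FlexIO clock in MHz.
constexpr uint32_t kFlexioClockMhz = 480;

// Time to clock out 32 pixels, in ns (same integer arithmetic as the macro).
constexpr uint32_t shiftTimeNs(uint32_t divider) {
    return 32 * divider * 1000 / kFlexioClockMhz;
}

// Row window between latch pulses: shift time plus fixed overhead, in ns.
constexpr uint32_t rowWindowNs(uint32_t divider, uint32_t overheadNs) {
    return shiftTimeNs(divider) + overheadNs;
}

// Percentage of the row window during which pixel data is actually being shifted.
constexpr uint32_t busyPercent(uint32_t divider, uint32_t overheadNs) {
    return shiftTimeNs(divider) * 100 / rowWindowNs(divider, overheadNs);
}

inline void compareApproaches() {
    // Extra overhead: same burst rate, but the data occupies only ~65% of a 2033 ns window.
    assert(rowWindowNs(20, 50 + 650) == 2033);
    assert(busyPercent(20, 50 + 650) == 65);
    // Larger divider: a similar 2050 ns window, with the data spread over ~97% of it.
    assert(rowWindowNs(30, 50) == 2050);
    assert(busyPercent(30, 50) == 97);
}
```

With these made-up numbers both options give a row window of roughly 2 µs, but only the larger divider lowers the rate at which the DMA has to deliver data within that window.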


Hello easone
Do you know if there are two SPI ports available for user applications?

In my designs I need one and two SPI ports to receive data with the images to be displayed. Both working with DMA.

Regards.

The standard SPI port (pins 10-13) is blocked by the HUB75 panel driver. There was no way around that. But there are two more hardware SPI ports on Teensy 4.0 and 4.1 and the driver and shield do not interfere with them. SPI1 uses pins 0,1,26,27 which are free. And if you are not using an SD card, there is also SPI2 which uses the DATA1, DATA0, CLK, and CMD pins of the SD card interface. These are numbered as pins 34-37 on Teensy 4.0, renumbered as pins 42-45 on Teensy 4.1.

On Teensy 4.0, you will need to carefully solder wires to the very small SMT pads on the bottom side to access pins 26-27 and 34-37.

On Teensy 4.1, pins 0,1,26,27 are easy to access on the external headers. Pins 42-45 are only available through the SD card connector so you would need something like this Sparkfun microSD breakout to access them.

Check the “Teensy4 Pins” spreadsheet here for more details about the pin assignments: GitHub - KurtE/TeensyDocuments: Some of my own Teenys documents such as XLS documents showing pin assignments and the like

Thanks for the clarifications easone.

Unfortunately, I need the SD card, so only SPI1 is free. Since I need a second SPI port for my applications, I was thinking of using FlexIO to emulate one. I don’t have experience with Teensy 4 yet, but I think this is possible. What I want to know is whether there are free pins, not affecting the operation of SmartMatrix, that could be configured with FlexIO to emulate a new SPI port (with DMA).

I was reviewing the connections in my schematic, and at first I found it confusing that pin 7 was not assigned. I was going to ask you about this, but from reading the source I think pin 7 is meant as an alternative to pin 10, that is, only one of the two would be used. Is that right?

As an alternative to being able to emulate a new SPI with Flexio, reviewing the schema and the datasheet, I think I could remove the QSPI, boot from SD, and use LPSPI2 for my application.

My final design will be with my own board that will install the RT1062 microcontroller, now I only use Teensy for testing.

If you need to use the SD card pins for SmartMatrix, can the SD card be connected to other GPIO pins and driven by bit banging?
Otherwise, another option for less than $20 is an ESP32 with 8MB of PSRAM and 16MB of flash, which is enough to store some amount of graphics in FatFS (or SPIFFS), with no SD card required.

I need the SD card on 4-bit SDIO, so I can’t remove it. The only thing I thought of is to remove the QSPI flash, use LPSPI2, and boot from the SD card. However, I think the ideal would be to emulate a new SPI port using FlexIO; there are enough free pins on FlexIO1 and FlexIO2 to use it with DMA.

In MCUXpresso for the RT1062, there are source code examples that emulate SPI with FlexIO and DMA, but so far I can’t find anything for Teensy 4. KurtE’s library has an example, but it is quite basic, and I think it doesn’t use DMA.

I have never considered using ESP32, I prefer the ARM world for compatibility and portability between manufacturers, also the RT1062 is much more powerful than the ESP32 processor.

@pinballsp: since you’re not using the SmartLED Shield and you’re developing custom hardware, you could use a different pinout instead of the one I chose. I actually think it is possible on Teensy 4.1 to select the LED matrix pins to not conflict with any of the hardware SPI ports: you can use pins 7,8,9,32,36,37 for RGB data, pin 6 for CLK, and pins 2&3 for LAT and OE. There are other options too: CLK could be either 34 or 35 and LAT&OE could be 4&33 or 28&29. The ordering of the RGB pins and the LAT/OE pins are not important. I can share a spreadsheet that details all the possibilities if people are interested.

We thought it would be better to maintain Teensy 4.0 compatibility for the new SmartLED Shield design, which unfortunately requires blocking the main SPI port.

To answer your question, yes it is possible to emulate SPI with FlexIO1 while the LED matrix driver runs on FlexIO2. In fact that’s exactly what Louis implemented for APA102 LED driving. The available FlexIO1 pins are 2,3,4,5,33. But I don’t know how hard it will be for your application.

Thanks easone, the options look very interesting.
If I can modify the pin assignment to free up the LPSPI4 port, using other FlexIO2 pins for SmartMatrix, that would be perfect for me.

So I understand that I just have to edit the MatrixHardware_KitV4T4.h file and modify the pin assignment. I was looking at the available ones in Teensy 4.1 for Flexio2 and I created these tables.

That should work - you only have to change the pin numbers defined in that header. The one time setup should configure everything automatically for you. Hopefully.

Here is the rule for selecting data/CLK pins. For the driver to work, you need to choose a contiguous group of 16 FlexIO2 pins, and choose your 6 data signals from within that group; CLK can be inside or outside that group but needs to be the lowest or highest pin (in terms of the hardware FlexIO2 assignments).

It’s not foolproof, but there is a configuration validation check that will print an error message to Serial if there is a problem with the pin numbers.

Hi easone.
So there is something that I think is wrong in the new selection of pins, if I have understood correctly what you indicate.

In FlexIO2, we would have to choose pins from the same contiguous 16-pin block, that is, block1 (D00 to D15) or block2 (D16 to D31), according to the attached picture. CLK can be chosen from either of the two blocks, but it must be the first or last pin.

So the selection of pins 7, 8, 9, 32, 36, 37 for RGB would not be correct, since pins 9 and 32 are in the first 16-pin FlexIO block and the rest of the pins are in the second block. The original pin configuration would also be incorrect, since pins 6, 9, 11, 12, 32 are in block1 and pin 8 is in block2. Is this correct?

So a correct selection of pins for RGB would really be this? 7, 8, 34, 35, 36, 37 of Flexio block2, and pin 6 for CLK of Flexio block1.

There is no need for the 16 pin block to start at pin D00 or D16. FlexIO allows you to pick any pin as the start of the 16 pin block, so you can choose D01-D16 or D11-D26.

The selection you indicated should also work.
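For anyone planning a custom pinout, the rule can be sketched as a small check. This is a hypothetical helper, not the validation code in the library. It takes FlexIO2 hardware pin indices (D00 to D31), not Teensy pin numbers, and it assumes the 16-pin window does not wrap around. It also reads "CLK needs to be the lowest or highest pin" as "CLK lies below or above all of the data pins", which is one interpretation of the rule. Duplicate pins are not checked.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if six data pins and a clock pin satisfy the rule.
// All values are FlexIO2 hardware pin indices (0..31), not Teensy pin numbers.
inline bool flexioPinsLookValid(const std::array<uint8_t, 6>& dataPins, uint8_t clkPin) {
    const uint8_t lowest = *std::min_element(dataPins.begin(), dataPins.end());
    const uint8_t highest = *std::max_element(dataPins.begin(), dataPins.end());

    // FlexIO2 pin indices only go up to 31.
    if (highest > 31 || clkPin > 31) {
        return false;
    }
    // The data pins must fit in some contiguous 16-pin window (any starting pin).
    const bool fitsWindow = (highest - lowest) <= 15;
    // Assumed reading of the rule: CLK sits below or above every data pin.
    const bool clkAtEdge = (clkPin < lowest) || (clkPin > highest);
    return fitsWindow && clkAtEdge;
}

inline void pinRuleExample() {
    // Window D16..D29 holds all data pins, CLK on D10 is below them: accepted.
    assert(flexioPinsLookValid({16, 17, 18, 19, 28, 29}, 10));
    // Data pins span D00..D20, wider than 16 pins: rejected.
    assert(!flexioPinsLookValid({0, 1, 2, 3, 4, 20}, 31));
}
```

The indices in the example are arbitrary; look up the real FlexIO2 index for each Teensy pin in the pin spreadsheet before relying on a result.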

Ok, thanks, now I understand, everything is already clarified.