Size limit with Teensy 3.6 and 32x16 mod 4 panels


I would like to use relatively large displays, e.g. 256x16 or 128x32 pixels. The Teensy 3.6 seems to have enough RAM for this, but unfortunately it does not work: at these sizes no CLK pulses are generated. Interestingly, if I set 255x16 it works. The output is a bit strange because the physical size differs, but CLK pulses are sent, in this case 510 pulses at a time.
Is this a hardware limitation somewhere, or is it possible to adjust some timing to make it work? (At the moment a lower refresh rate wouldn’t be a problem for me.)

Thanks in advance for your advice!

There’s likely a bug somewhere, e.g. a uint8_t (max 255) used instead of a uint16_t for a variable. Do you happen to know if CLK pulses work when a sketch is configured for a 256x16/8 panel?

Dear Louis,
Thanks for your reply.
256x16/8: not working, no CLK pulses…
255: works (255 CLK pulses)
288: works, but only 32 CLK pulses present
320: works, but only 64 CLK pulses present
So it seems you are right: somewhere the high byte is lost…
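
The pulse counts above match what happens when a width is stored in an 8-bit variable. A minimal sketch of that truncation follows; storedWidth is a hypothetical helper written for illustration, not a function from the library:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of SmartMatrix): the value a requested
// width ends up as when it is stored in a uint8_t.
constexpr uint8_t storedWidth(uint16_t requestedWidth) {
    // Unsigned narrowing keeps only the low 8 bits (value modulo 256).
    return static_cast<uint8_t>(requestedWidth);
}

// Widths reported in this thread and the pulse counts that were observed.
static_assert(storedWidth(255) == 255, "255 fits in 8 bits");
static_assert(storedWidth(256) == 0,   "256 wraps to 0: no CLK pulses");
static_assert(storedWidth(288) == 32,  "288 wraps to 32");
static_assert(storedWidth(320) == 64,  "320 wraps to 64");
```

Declaring the width as uint16_t avoids the wrap for any width up to 65535.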

Oh, I made a mistake in my previous reply: in the .ino, kMatrixWidth was declared as uint8_t instead of uint16_t.
So the correct values are:
with 256x16/8 it is working (256 CLK pulses)
with 511x16/8 it is working (511 CLK pulses)
with 512x16/8 it is not working (no CLK pulses)
with 544x16/8 it is working, but only 32 CLK pulses are generated
with 1023x16/8 it is working, but only 511 CLK pulses are generated
with 1024x16/8 it is not working (no CLK pulses)
with 256x32/8 it is not working either (no CLK pulses)

So the magic number seems to be 512 (or 1024 if it is the number of writes that counts, if I am right).

Dear Louis,

I think I’ve found the cause:
In dmaClockOutData.TCD->NBYTES_MLOFFYES the NBYTES field (Minor Byte Transfer Count) is only 10 bits wide, so one minor loop can send at most 1023 bytes. If rowBitStructBytesToShift is greater than 1023, only the value modulo 1024 is sent, and the higher bits possibly overwrite other fields.
If you agree that this could be the cause, do you have any idea how it can be bypassed?
I am a complete beginner with ARM; this is my first experiment with this kind of microcontroller, so I am lost.
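
A minimal sketch of that 10-bit truncation, for checking it against the measured pulse counts. It assumes two bytes are written per CLK pulse (the "number of writes" guess earlier in the thread); kBytesPerPulse and clockPulsesSent are hypothetical names, and the sketch does not model the overflowing high bits landing in neighbouring fields:

```cpp
#include <cassert>
#include <cstdint>

// Width of the NBYTES field when minor loop offsets are enabled: 10 bits.
constexpr uint32_t kNbytesMask = 0x3FF;

// ASSUMPTION for illustration: two bytes are transferred per CLK pulse.
constexpr uint32_t kBytesPerPulse = 2;

// Hypothetical model: CLK pulses that survive if the byte count is
// silently truncated to 10 bits.
constexpr uint32_t clockPulsesSent(uint32_t width) {
    return ((width * kBytesPerPulse) & kNbytesMask) / kBytesPerPulse;
}

// These agree with the x16/8 measurements reported above.
static_assert(clockPulsesSent(256)  == 256, "512 bytes fit");
static_assert(clockPulsesSent(511)  == 511, "1022 bytes fit");
static_assert(clockPulsesSent(512)  == 0,   "1024 bytes wrap to 0");
static_assert(clockPulsesSent(544)  == 32,  "1088 bytes wrap to 64");
static_assert(clockPulsesSent(1023) == 511, "2046 bytes wrap to 1022");
static_assert(clockPulsesSent(1024) == 0,   "2048 bytes wrap to 0");
```

Under this assumption every x16/8 result in the list is reproduced. The 256x32/8 case is not covered by this simple model.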

Good catch, thanks for narrowing down the issue!

I don’t know of a fix off the top of my head. It’s possible that two separate transfers need to be queued, and that’s not going to be an easy fix. I’ll open up a GitHub Issue for this, but I’m not going to have time to work on it in the near future, sorry.

I think I’ve found a simple, if slightly dirty, bypass for the 512 pixel problem:
I’ve changed the order of the fields in rowBitStruct:
struct rowBitStruct {
timerpair timerValues;
uint8_t dummy;
uint8_t rowAddress; // must be directly after data - DMA transfers data[] + rowAddress continuous
This way the rowBitStructs are contiguous in memory within rowDataStruct and no offset is needed in the DMA minor loop. The downside is that it sends out 5 extra bytes; however, that isn’t a big problem as they are shifted out of the display.
It seems to work nicely. I tried it with a 160x32 display and it worked, except that it was very, very slow. For my application high color depth is not necessary, so I started to experiment with lowering the bit depth. I rewrote it to work from 1 bit/color to 8 bit/color. It is working now and the speed increased greatly.
However, testing it with FeatureDemo.ino I ran into another problem: the program hangs after some time. The time depends on the matrix size and on the optimizations I set when compiling (Debug mode seems to be best). Sometimes everything seems to stop and no output is generated; other times the output picture is OK and the DMA and timers are running, but the program halts. The time can be quite long: FeatureDemo ran 5 times and stopped on the 6th run, and with a given configuration (matrix size, optimization) it seems to happen at the same point every time. I checked the free memory with this simple function:
uint32_t FreeMem() { // for Teensy 3.0
  uint32_t stackTop;
  uint32_t heapTop;

  // Current position of the stack.
  stackTop = (uint32_t) &stackTop;

  // Current position of the heap.
  void* hTop = malloc(1);
  heapTop = (uint32_t) hTop;
  free(hTop); // release the probe allocation so the check itself does not leak

  // The difference is (approximately) the free, available RAM.
  return stackTop - heapTop;
}

but it didn’t show any decrease over time.
Have you experienced a similar problem? Do you have any idea what it could be or how it could be debugged?
How can I share the modified source? Maybe you or others with more C++ knowledge can find the bug more easily…
(I’ve also made some other improvements: larger fonts and a background color for the scrolling text.)

Sorry, no

The best way is probably to fork the code on GitHub, and commit your changes to your fork, then share that here.

Here is my modified code (it is based on the teensylc branch):

It is not perfect and not finished yet. I welcome any comments and/or corrections.


  • Larger fonts can be used (based on sources from others).
  • On the Scrolling Layer the text can also have a background color.
  • Bypassed the 1024 byte DMA length limitation with a slightly dirty trick: timerValues is moved to the beginning of rowBitStruct and the whole struct is shifted out. This sends 5 extra bytes per row, but it isn’t a big problem as they are shifted out of the display.
  • To be able to use larger displays (e.g. 192x32 and so on) I rewrote some parts to work with lower bit depths; for my application 16 or 32 colors are enough. Now it works with 3, 6, … 24, 48.
  • Because of speed issues, I implemented a mechanism that stores the output lines and sends the stored ones until something changes. (This is experimental and working, but as this is my first Arduino/Teensy-ARM/C++ project it may not be the best solution. Maybe someone could improve on it, or at least correct my errors if there are any…)
  • This version looks stable, at least with my test program, but previously I experienced strange hangs. For example, I started FeatureDemo.ino and it went round 5 times, then it hung. It hung at the same place as long as nothing changed, but if I changed the size, bit depth or compiler optimization, the place where it hung moved somewhere else. With small sizes (e.g. 64x32) there were no hangs. I can’t find the reason, but it seems that the line storing mentioned above solved it somehow.

I think I’ve found the reason for the strange hangs that occurred with FeatureDemo.
The problem is in the Layer_Background_Impl.h file.
In fillFlatSideTriangleInt, dx1, dy1, dx2 and dy2 were declared as int8_t, which can overflow and cause an infinite loop. Similarly, in drawEllipse all variables are int16_t and they can overflow. I have changed the int8_t variables to int16_t in fillFlatSideTriangleInt and the int16_t variables to int32_t in drawEllipse, and the hangs are gone.
The hangs looked random because FeatureDemo draws triangles and ellipses of random size, and only a few of them, depending on their size, triggered the error.
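
A minimal sketch of the kind of wrap involved, assuming a coordinate difference larger than 127 pixels, which becomes possible on wide panels. deltaNarrow and deltaWide are hypothetical helpers for illustration; the actual loops in fillFlatSideTriangleInt and drawEllipse are not reproduced here:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a coordinate difference stored in 8 bits, as with
// the original int8_t declarations. Out-of-range narrowing wraps modulo 256
// on GCC/Clang (implementation-defined before C++20).
inline int8_t deltaNarrow(int16_t from, int16_t to) {
    return static_cast<int8_t>(to - from);
}

// The same difference kept in 16 bits, as after the change.
inline int16_t deltaWide(int16_t from, int16_t to) {
    return static_cast<int16_t>(to - from);
}
```

For example, deltaNarrow(10, 200) yields -66 while deltaWide(10, 200) yields 190. A delta with the wrong sign or magnitude is the sort of value that can keep a stepping loop from ever reaching its end condition, which would fit hangs that only show up for some random shapes.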
