And 5 min later, a quick fix!
MatrixHardware_KitV1.h has a line that sets the minimum amount of time to transfer a block of pixels. That time appears to be longer on the Teensy 3.5, or maybe my calculation involving F_CPU is just wrong. I replaced the line with a hard-coded value of 6000 ns (about double what worked on the Teensy 3.2) and the ghosting went away. Try decreasing the number to see what works without ghosting: a higher number here limits either the maximum refresh rate or the maximum brightness, so the lower you can go without ghosting, the better.
#define PANEL_32_PIXELDATA_TRANSFER_MAXIMUM_NS (uint32_t)(6000)
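For anyone who wants to sanity-check what a nanosecond value like this means in CPU cycles, here is a rough sketch of the conversion. The helper name ns_to_cycles is hypothetical and not part of SmartMatrix, and I'm not claiming this is how the library does its own F_CPU math. One general pitfall with this kind of conversion is that ns * F_CPU can overflow 32 bits (6000 * 120000000 is well past 2^32), so the sketch uses a 64-bit intermediate.

```c
#include <stdint.h>

/* Hypothetical helper (not from SmartMatrix): convert a duration in
 * nanoseconds to CPU clock cycles at f_cpu_hz, rounding up so the
 * result never undershoots the requested time. */
static uint32_t ns_to_cycles(uint32_t ns, uint32_t f_cpu_hz)
{
    /* 64-bit intermediate: ns * f_cpu_hz does not fit in 32 bits
     * for values like 6000 ns at 120 MHz. */
    uint64_t scaled = (uint64_t)ns * f_cpu_hz;

    /* Divide by 1e9 ns per second, rounding up. */
    return (uint32_t)((scaled + 999999999ULL) / 1000000000ULL);
}
```

Assuming a 120 MHz clock for the Teensy 3.5, ns_to_cycles(6000, 120000000) works out to 720 cycles. At 72 MHz the same 6000 ns is 432 cycles. Substitute whatever F_CPU your build actually uses.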