I think I now have a reasonable handle on what is causing this, although there are some areas of complexity that I still need to understand, and the PXA270 programmer's guide is tough reading.
Basically, the console is handled as follows.
wsdisplay interfaces with pxa2x0_lcd.c, where the function pxa_lcd_new_screen allocates the memory for the display (using bus_dmamem_alloc) and then maps it into kernel virtual address space using bus_dmamem_map. This address becomes the internal frame buffer address used to render the text consoles, which are actually implemented through the rasops library since the display is bitmap only.
When the buffer is mapped to give the kernel its address, a particular flag is used, BUS_DMA_COHERENT, which asks for the mapping to be DMA coherent, so that writes to this region become visible straight away to the device on the other end of the DMA channel (the display).
As long as you stay in text mode and don't change anything, the frame buffer DMA doesn't get upset.
To access the bitmap display from userland so that graphical applications work, two steps are necessary. Step one is to send an ioctl to the frame buffer telling it to go into dumb mode (in theory this should perform extra initialisation and record a flag telling the system that the normal text display is no longer available). Step two is a call to mmap to obtain a memory-mapped region through which to communicate with the display.
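For reference, the two userland steps look roughly like the sketch below. The helper name fb_map is made up for illustration, and I am assuming the ioctl in question is WSDISPLAYIO_SMODE with WSDISPLAYIO_MODE_DUMBFB from wsconsio.h. On a system without wscons the mode switch is compiled out, so the mmap half can still be tried against an ordinary file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#if defined(__NetBSD__) || defined(__OpenBSD__)
#include <sys/types.h>
#include <dev/wscons/wsconsio.h>
#endif

/*
 * Hypothetical helper: open the display device, switch it to dumb
 * frame buffer mode, and map len bytes of it into this process.
 * Returns the mapping (and the open fd via fd_out), or NULL on error.
 */
void *fb_map(const char *path, size_t len, int *fd_out)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

#ifdef WSDISPLAYIO_SMODE
    /* Step one: tell wsdisplay the text emulation is no longer in use. */
    unsigned int mode = WSDISPLAYIO_MODE_DUMBFB;
    if (ioctl(fd, WSDISPLAYIO_SMODE, &mode) == -1) {
        close(fd);
        return NULL;
    }
#endif

    /* Step two: map the frame buffer; this lands in the driver's mmap hook. */
    void *fb = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (fb == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return fb;
}
```

A caller would pass the display device path and the frame buffer size in bytes, draw into the returned pointer, and munmap and close when finished.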
The mmap call is again implemented in pxa2x0_lcd.c, and all it does is use bus_dmamem_mmap to provide another virtual address region corresponding to the buffer previously allocated in pxa_lcd_new_screen. This is necessary since a process's virtual address space is different from the kernel's. This call is again made with the BUS_DMA_COHERENT flag, to ensure that anything written to this region is immediately visible to the device on the other end of the DMA channel.
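The idea of one buffer being reachable through two different virtual addresses can be tried out in plain userland. The sketch below is only an analogy: it maps a temporary file twice with POSIX mmap, which stands in for the kernel mapping plus the process mapping, and it says nothing about cache attributes or BUS_DMA_COHERENT. The function name and the /tmp path are made up for illustration.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define BUF_SIZE 4096

/*
 * Map the same backing pages twice and check that a write through the
 * first mapping is visible through the second.
 * Returns 1 if the two mappings are distinct and alias the same memory,
 * 0 if not, and -1 on error.
 */
int two_mappings_alias(void)
{
    char path[] = "/tmp/fbaliasXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file goes away when fd is closed */

    if (ftruncate(fd, BUF_SIZE) != 0) {
        close(fd);
        return -1;
    }

    unsigned char *a = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    unsigned char *b = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED)
            munmap(a, BUF_SIZE);
        if (b != MAP_FAILED)
            munmap(b, BUF_SIZE);
        close(fd);
        return -1;
    }

    a[0] = 0x5A;                        /* write through the first mapping */
    int ok = (a != b) && (b[0] == 0x5A); /* read through the second */

    munmap(a, BUF_SIZE);
    munmap(b, BUF_SIZE);
    close(fd);
    return ok;
}
```

In the driver the two mappings additionally carry their own cacheability settings, which is where the trouble described in this post seems to come from.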
bus_dmamem_mmap and bus_dmamem_map deal with virtual addresses, but I suspect that, since the DMA controller of the PXA is an integrated device, it also deals with descriptors that map virtual addresses to physical addresses. That being the case, what happens when a second descriptor is created for a region that already has one (as happens when bus_dmamem_mmap is called during mmap, after bus_dmamem_map was called during text frame buffer creation)? My suspicion at this stage is that the controller keeps only the most recently initialised DMA transfer in a coherent state. This would explain why the rasops library, which renders the text display, becomes slow after a return to the text buffer: upon return, the kernel bitmap used for this display is no longer flagged coherent and is subject to the writeback cache built into the DMA controller.
I hope to fix this by implementing an ioctl handler in pxa2x0_lcd.c for the WSDISPLAYIO_SMODE message that restores coherency to the original (kernel) bitmapped region used by the rasops functions, so that this region returns to its original snappy state. This is the area that I am tinkering with at the moment.
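The shape of the handler I have in mind is roughly as follows. This is a userland model of the logic only, not driver code: every name in it (fb_screen, fb_set_mode, fb_user_mmap, restore_kernel_coherency, the FB_MODE_* values) is a made-up stand-in, and the loss of coherency on mmap is modelled as a simple flag because that is still just my suspicion.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the wscons display modes. */
enum fb_mode { FB_MODE_EMUL = 0, FB_MODE_MAPPED = 1 };

/* Hypothetical per-screen state kept by the driver. */
struct fb_screen {
    enum fb_mode mode;
    bool kernel_map_coherent;   /* is the rasops buffer still coherent? */
    int  remap_count;           /* how often coherency was restored */
};

/* Stand-in for whatever actually re-establishes the coherent kernel mapping. */
static void restore_kernel_coherency(struct fb_screen *scr)
{
    scr->kernel_map_coherent = true;
    scr->remap_count++;
}

/* Model of the suspected side effect of the userland mmap. */
void fb_user_mmap(struct fb_screen *scr)
{
    scr->kernel_map_coherent = false;
}

/*
 * Model of the mode-switch handler: when going back to text emulation,
 * repair the kernel mapping if the userland mapping disturbed it.
 * Returns 0 on success or EINVAL for an unknown mode.
 */
int fb_set_mode(struct fb_screen *scr, int new_mode)
{
    if (new_mode != FB_MODE_EMUL && new_mode != FB_MODE_MAPPED)
        return EINVAL;

    if (new_mode == FB_MODE_EMUL && !scr->kernel_map_coherent)
        restore_kernel_coherency(scr);

    scr->mode = (enum fb_mode)new_mode;
    return 0;
}
```

The point of checking the flag first is that the restore step only runs when a graphical client has actually mapped the buffer, so plain text-mode switches stay cheap.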
-Andy