There's a great tool in KDE's svn called kdumppixmap. Once built, you run an app with `ktracepixmap ` and it says something like "Starting with pixmap tracing. Use kdumppixmap to save currently allocated pixmaps to disk". You wait until the application is in the state you want it to be in and then you run the kdumppixmap command it gave you. This creates a directory full of image files, one for each pixmap sitting on the X server. It was written by Maksim Orlovich, who borrowed some code from Geert Jansen's kstartperf and from kBacktrace.
I spent time yesterday and today with plasmoidviewer and a torturous little plasmoid I wrote in JavaScript that has 150 buttons, 150 line edits and some other assorted jumble in a grid layout. The goal was to eliminate as many pixmaps as possible, and at 18:00 on day two, I'm quite nearly there. Here are the areas I traveled in my pixmap hunt and what I found while doing so; hopefully others will find it useful as well. :)
The FocusIndicator
The first stop was the FocusIndicator class, which draws those pretty little halos around buttons and other things when you mouse over them or give them keyboard focus. It had its own copy of the FrameSvg that creates those images from a raw SVG file, and it did a few unnecessary things at startup. After fixing that, there wasn't any reduction in pixmaps (sadly), but creating a button, line edit, combobox or slider got twice as fast. The remaining overhead is now in QGraphicsProxyWidget, about which there is probably desperately little we can do right now. (QtComponents is our likely savior in the mid-term; no short-term relief, however.)
The PixmapTransition Animation
We do a lot of nice little animations in Plasma. You may not have noticed, but when you mouse over, press on, give focus to, etc. nearly any widget, its visual states transition smoothly from one to the other. This is usually done using an animation called PixmapTransition, which does exactly what the name suggests.
One of the caveats of such a thing is that it needs to align the pixmaps first. No problem, just calculate some offsets, right? Well, it turns out it was doing that, but then creating new pixmaps with the original images repainted at those offsets in them. Which meant that every time it was used, there were usually two extra pixmaps kicking around needlessly. Now it just calculates the offsets and uses those in painting.
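To make the "just calculate the offsets" idea concrete, here is a minimal sketch of it in plain C++. It is not the actual PixmapTransition code: `Size`, `Offset`, `Alignment` and `alignCentered` are hypothetical names standing in for the Qt types, and centering both images in their common bounding area is my assumption about what "align" means here.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for QSize and QPoint; not Plasma's actual types.
struct Size { int width; int height; };
struct Offset { int x; int y; };

struct Alignment {
    Size canvas;      // area large enough to hold both images
    Offset startAt;   // where to paint the start image
    Offset targetAt;  // where to paint the target image
};

// Assumption: "aligning" means centering both images in their common
// bounding area. Only offsets are computed; no intermediate image is
// allocated and nothing is repainted into a new buffer.
Alignment alignCentered(Size start, Size target)
{
    assert(start.width >= 0 && start.height >= 0);
    assert(target.width >= 0 && target.height >= 0);

    const Size canvas{std::max(start.width, target.width),
                      std::max(start.height, target.height)};

    auto centre = [&canvas](Size s) {
        return Offset{(canvas.width - s.width) / 2,
                      (canvas.height - s.height) / 2};
    };

    return Alignment{canvas, centre(start), centre(target)};
}
```

The painting code then draws each original image at its offset, so the two padded copies never need to exist.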
With those two gone, there was one left: the result. The way the animation works is that the animation timer ticks along from 0 to 100 over a given period of time (and a given curve). On each animation tick it would generate a new result pixmap and tell whatever it was animating (e.g. a push button) that it needed to repaint itself.
Thing is, most of the elements in Plasma are already buffered quite nicely and don't need this extra pixmap sitting around. So I made the caching optional and off by default. I also moved the creation of the pixmap into the method that the item being animated uses to get to it. This moves the creation of the pixmap out of the animation frame tick and places it at the moment the pixmap is actually used, which means only the frames that will actually get painted are created. Previously, if the animation loop was ticking along faster than the widget could repaint, a bunch of painting operations were generated that the user would never see on their screen. No more.
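Here is a rough sketch of that "render on use, not on tick" pattern in plain C++. Again, this is not the real PixmapTransition code: `LazyTransition`, `Frame`, `tick`, `currentFrame` and `framesRendered` are hypothetical names, and the render step is reduced to a counter so the effect is easy to see.

```cpp
#include <cassert>

// Hypothetical stand-in for a rendered result pixmap.
struct Frame { int progress; };

class LazyTransition {
public:
    // Caching of the result is optional and off by default.
    explicit LazyTransition(bool cacheResult = false)
        : m_cacheResult(cacheResult) {}

    // Called by the animation timer; progress runs from 0 to 100.
    // It only records the value, nothing is rendered here.
    void tick(int progress)
    {
        assert(progress >= 0 && progress <= 100);
        m_progress = progress;
        m_cacheValid = false;
    }

    // Called by the animated item when it actually paints.
    // This is the only place a frame gets rendered.
    Frame currentFrame()
    {
        if (m_cacheResult && m_cacheValid) {
            return m_cached;
        }
        const Frame frame = render(m_progress);
        if (m_cacheResult) {
            m_cached = frame;
            m_cacheValid = true;
        }
        return frame;
    }

    int framesRendered() const { return m_rendered; }

private:
    Frame render(int progress)
    {
        ++m_rendered;  // stands in for the expensive painting work
        return Frame{progress};
    }

    bool m_cacheResult;
    bool m_cacheValid = false;
    int m_progress = 0;
    int m_rendered = 0;
    Frame m_cached{0};
};
```

If the timer ticks eleven times between two repaints, `framesRendered()` only goes up by one, at the moment the item asks for the frame.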
FrameSvg
So now we were down to just a third of the number of pixmaps being held on to, and a lot more efficient generation of the pixmaps we did use. But as kdumppixmap was showing me quite plainly, we still had one pixmap of the background for each button (or whatever) even if all the buttons were the same size and disposition. In other words, we had N pixel-identical pixmaps, where N is the number of identical items on screen. This just wouldn't do, and so I took a dive into the primary culprit here: FrameSvg.
This class creates nice four-sided frames to put around buttons, Plasmoids, tooltips and whatnot. It's used quite heavily in Plasma, and to keep things speedy it has its own local cache on top of the on-disk cache in Plasma::Theme. This cache was per-FrameSvg object, however, and it was the source of all those pixel-identical pixmaps that were jeering at me from Gwenview. Damn you, pixmaps!
So I implemented a small reference-counted caching system whereby identical frames are shared between FrameSvg objects. Finally, I had one pixmap for the buttons (well, four, due to different states of the buttons as I went around clicking on them) and a similar number for the line edits. Instead of hundreds and hundreds of pixmaps, I now had a couple dozen for the entire plasmoidviewer session. The savings were well over 2 MB of pixmap data in the X server.
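For the curious, the general shape of such a shared cache can be sketched in a few lines of standard C++. This is an illustration, not the FrameSvg implementation: it leans on `std::shared_ptr` and `std::weak_ptr` for the reference counting, and `SharedFrameCache`, `FrameData` and the key format (element, state, size) are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the rendered frame pixmap.
struct FrameData { std::string key; };

class SharedFrameCache {
public:
    // Return the frame for this key, rendering it only when no other
    // object currently holds an identical one.
    std::shared_ptr<FrameData> acquire(const std::string &key)
    {
        assert(!key.empty());

        auto it = m_frames.find(key);
        if (it != m_frames.end()) {
            if (auto live = it->second.lock()) {
                return live;  // share the existing frame
            }
        }

        auto created = std::make_shared<FrameData>(FrameData{key});
        m_frames[key] = created;  // weak reference: the cache itself
                                  // never keeps a frame alive
        ++m_renders;
        return created;
    }

    // Number of frames currently held by at least one user.
    std::size_t liveFrames() const
    {
        std::size_t count = 0;
        for (const auto &entry : m_frames) {
            if (!entry.second.expired()) {
                ++count;
            }
        }
        return count;
    }

    int renders() const { return m_renders; }

private:
    std::map<std::string, std::weak_ptr<FrameData>> m_frames;
    int m_renders = 0;
};
```

With something like this, 150 buttons asking for a key such as "button:normal:80x24" (a made-up key) end up pointing at one frame, and the frame is released once the last button lets go of it.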
The Impact on Plasma Desktop
A Plasmoid with 200+ individual components in it, all the same size in a nice little grid, doesn't sound particularly "real world", does it? I also tested with a default plasma-desktop session and it turns out that, indeed, the savings are not quite as dramatic ... but they are significant.
On starting up a default plasma-desktop layout, there are roughly a third fewer pixmaps due to the FrameSvg caching, and several dozen fewer on top of that due to the rest of the work on things like PixmapTransition. Each pixmap avoided means less X traffic, less video memory and a generally smoother experience.
During usage, a lot of transient pixmap creation is avoided as well. For instance, the tasks widget uses Plasma::PaintUtils::transition to fade the task buttons in pretty much the same way the PixmapTransition animation does (in fact, it seems the only reason it isn't using PixmapTransition is that the tasks widget predates PixmapTransition). As such, it gains the same benefits that PixmapTransition has, namely two fewer pixmaps per animation step.
The more Plasmoids you use, the more you gain, of course. Things like the panel controller and other such window dressings also benefit.
The Impact on Plasma Mobile
The impact on Plasma Mobile should be even larger, as "bunches of things of the same size" is a very common occurrence (e.g. a phone dialer, a grid of app icons, ...) and graphics performance is a rare commodity. Keeping the animations tamed through fewer generated pixmaps, only generating pixmaps actually pushed to screen, etc. helps out as well. I don't yet have numbers for the impact on the N900, but it can only be good news there if it's noticeable on a desktop system.
So, a reasonably good investment of effort. All of this should end up in the 4.6 release coming in January. If you'd like to enjoy the benefits of these improvements now, we'd love your help testing things prior to the release. :) I know Suse provides some great packaging options for up-to-the-day builds and there's always the "build it from sources" option (kdesrc-build really helps make that easy to achieve).