Thursday, September 23, 2010

Visualizing IPC Messages in Fennec

More progress on IPC profiling tools, this time a visualization generator. Here is what happens during loading of cnn.com:


On the left you see Fennec before dougt's prefs optimizing patch, and on the right after it. On both sides, the data is as follows:
  • Each line is a single IPC message. Messages are ordered by send time. The width of the line is how long it took until the message was received. The X axis is linear in time.
  • The Y axis also progresses forward in time (since each line is one message, and they are ordered). But it isn't linear (a lot of short messages will take a lot of vertical space, even though little time passed). The numbers show how much time actually passed. So the Y axis really indicates how many messages have been sent.
  • Colors indicate the type of message: green = http, red = prefs, blue = cookies, black = other.
That patch basically removes all the prefs messages, and the result is a 12% speedup (yay!) There were a lot of prefs messages, and even though they were very short and fast to process, they were synchronous (i.e., blocking).

The visualization tool can also show messages for the child and parent process separately (in the image above, they are all shown together).

Instructions for using it:
  • Apply the patch.
  • Run Fennec with something like this (from $objdir/dist/bin):
    • MOZ_IPC_MESSAGE_LOG=1 MOZ_REDIRECT_CHILD_STDOUT=co MOZ_REDIRECT_CHILD_STDERR=ce ./fennec www.cnn.com &> o
  • Run the visualizer with something like this (you need the Python Imaging Library installed):
    • python ../../../mozilla-central/tools/ipcprof/profiler.py o ce
  • Images should have been created, parent.png, child.png, both.png.
A perhaps more easily viewable image can be found here.

    Wednesday, September 15, 2010

    Profiling IPC Messages in Fennec

    Bug 596725 has a Python script that can profile the IPC messages log. Here's the output when run on icanhascheezburger: link. Some first impressions:
    • A few sync/rpc methods are used quite frequently, like preferences, cookies, permissions. Thankfully these complete very quickly. Still this might be worth optimizing though?
    • Lots of http is going on, so lots of http messages. Maybe nothing we can do about that?

    Sunday, September 12, 2010

    Emscripten Updates

    Minor note: Updates about Emscripten will be on my other blog - fits in better with the theme over there. Though I do believe the two should converge, when all is ready.

    Thursday, September 9, 2010

    Tools For Debugging Wakeups

    Removing unnecessary CPU wakeups in Fennec is very important, in order to reduce the amount of power used. The main tool used for that is powertop, which basically tells you how many wakeups each process has. But how can you tell what specifically in your app is causing those wakeups?

    There doesn't seem to be a simple solution for that. One idea is to use the debugger, manually. Perhaps that can be automated using debugger scripts, but I'm not sure, and in any case it would be very specific to the debugger and OS.

    So I've tried a different route - automatic rewriting of the source code. I've got it sort of working so far. The idea is this:
    • First you apply this patch. It modifies some critical files, in particular in the condition variable handling code and in the timer thread code. It adds a parameter to both, so that when you call them, you must identify (by a string) who you are.
    • Then, you run python timeout_rewriter.py (that script is included in the patch). This goes through the entire codebase, rewriting all the calls to condition variables (PR_WaitCondVar etc.) and timer creation (->initWithCallback etc.) to add that parameter. It modifies both C++ and JS code. Over 100 files are modified. How it does the rewriting is sort of hackish: It doesn't use anything fancy, just some heuristics derived from trial and error ;)
    • Then you build Firefox normally, and run it. All timeouts will then be printed with explanations, for example:
    -------------------------------------------------

    WaitCondVar: ./js/src/xpconnect/src/xpcjsruntime.cpp:820 (0.818000 seconds since previous; timeout: 1000)
    WaitCondVar: ./js/src/xpconnect/src/xpcjsruntime.cpp:820 (1.000000 seconds since previous; timeout: 1000)
    TimerThread: ./dom/base/nsJSEnvironment.cpp:3733
    WaitCondVar: ./xpcom/threads/TimerThread.cpp:347 (0.000000 seconds since previous; timeout: 997)
    WaitCondVar: ./js/src/jstask.cpp:95 (0.058000 seconds since previous; timeout: -1)
    WaitCondVar: ./js/src/jstask.cpp:95 (0.045000 seconds since previous; timeout: -1)
    WaitCondVar: ./js/src/xpconnect/src/xpcjsruntime.cpp:820 (0.825000 seconds since previous; timeout: 1000)
    TimerThread: ./content/events/src/nsEventStateManager.cpp:793
    WaitCondVar: ./xpcom/threads/TimerThread.cpp:347 (0.000000 seconds since previous; timeout: 329)
    TimerThread: ./mobile/components/SessionStore.js:345
    WaitCondVar: ./xpcom/threads/TimerThread.cpp:347 (0.329000 seconds since previous; timeout: 532)
    -------------------------------------------------


    Explanations:
    • The 1st line shows that the JS watchdog thread woke up - you can just look at the source line mentioned to see what it is. Additional info is that the previous CPU wakeup was ~0.8 seconds before it, and that this particular call to WaitCondVar was with a timeout of 1 second (so, it would like to be woken up again in 1 second).
    • The 2nd line shows the same place waking up, after 1 second.
    • The 3rd line shows the DOM GC timer being fired. The next line is the condition variable that the timer thread uses.
    • The 5th line is the JS background thread.
    • etc.
    I might write some tools to parse the output (like profilers do), but so far it already seems useful. Some conclusions: When the browser is idle, we have two 1/5 second timers, the IdleService (should be fixed soon), and nsEventStateManager (need to look into that).

    The reasons for rewriting the source code automatically are that, as mentioned, it's over 100 files - so a lot to do by hand. But also, we don't want those changes to go into our actual tree (even with ifdefs, it would be horrible), and maintaining such a big patch as the actual tree changes would be very problematic. So automatic source code rewriting seemed the best approach. Just remember to throw away all those changes, and not commit them by mistake ;)

    Wednesday, September 8, 2010

    iX script

    I wrote a small script for investigating xpconnect objects at runtime, iX (investigate Xpconnect).

    If you paste that code in a JavaScript file, and run iX(something), it will dump all the info it can find about the object: all regular JavaScript properties, as well as all valid interfaces that you can QueryInterface and GetInferface on that object, and also the node hierarchy. And it tries not to crash when doing so.

    (There is probably already something to do this that I am not aware of.)