New Year deathmatch: comments vs. nodes

About a year ago chx blogged about the necessary steps for comments to become nodes. Now that we have both the code registry and multiple load, I thought it was worth seeing how far we've come.

Update:
It was a bit of a shock just how bad these benchmarks were, so I've since done some profiling. A lot of time is being spent in drupal_render() (see http://drupal.org/node/353632, which includes a patch that claws back something like 5-10% of page generation time).

Update 2:
webchick reminded me about http://groups.drupal.org/node/3550. Although the methodology and data are slightly different, we can see that database time has equalised, while those benchmarks also showed more time spent in PHP.

Update 3:
I've attached screenshots from kcachegrind, which show the most expensive functions for 30 nodes/30 comments and 90 nodes / 90 comments.

First, we need two reasonably comparable pages to test with varying numbers of nodes and comments. For this I've chosen node/n and taxonomy/term/n, and I benchmarked each with one node and no comments to get a baseline. Since node_load() is the same in both cases, we can compare the number of queries and requests per second for everything else.

node/n
Devel: Executed 32 queries in 9.58 milliseconds.
ab -c1 -n500
5.13 reqs/sec
5.28 reqs/sec
5.27 reqs/sec

taxonomy/term/n
Devel: Executed 33 queries in 11.45 milliseconds
ab -c1 -n500
5.21 reqs/sec
5.22 reqs/sec
5.28 reqs/sec

Close enough for our purposes that we don't need to make any adjustments.
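
To put a number on "close enough", here is a minimal Python sketch that averages the three baseline ab runs for each page and reports the gap between them. The figures are copied from the results above; the helper name relative_gap is made up for illustration.

```python
from statistics import mean

# Requests per second from the three baseline ab runs quoted above.
baseline = {
    "node/n": [5.13, 5.28, 5.27],
    "taxonomy/term/n": [5.21, 5.22, 5.28],
}


def relative_gap(a, b):
    """Difference between two values as a fraction of the larger one."""
    return abs(a - b) / max(a, b)


means = {page: mean(runs) for page, runs in baseline.items()}
gap = relative_gap(means["node/n"], means["taxonomy/term/n"])

for page, value in means.items():
    print(f"{page}: {value:.2f} reqs/sec")
print(f"relative gap: {gap:.2%}")
```

The gap between the two means is well under 1%, which is smaller than the run-to-run spread within either page.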

Then we need a node with 300 comments and 300 nodes attached to one taxonomy term. Thanks to devel generate and a while loop to insert the term_node records, that's easy enough, although sadly it turns out it wasn't worth benchmarking 300 nodes. I did basic benchmarking with ab, and consulted the devel query log. Note that you'll see an extra query per item as the numbers increase. In both cases this is from the call to cache_get() in check_markup().
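
The loop itself isn't shown here, so the following is only a sketch of the idea in Python, not the script that was actually run. It assumes the term_node table has the columns (nid, vid, tid), that each generated node has a single revision with vid equal to nid, and that the nodes are numbered consecutively from 1 and the term has tid 1. All of those are assumptions to adjust for a real site.

```python
def term_node_inserts(first_nid, count, tid):
    """Build one INSERT statement per node, attaching each node to term `tid`."""
    statements = []
    nid = first_nid
    while nid < first_nid + count:
        # Assumption: columns are (nid, vid, tid) and vid == nid
        # for freshly generated, single-revision nodes.
        statements.append(
            f"INSERT INTO term_node (nid, vid, tid) VALUES ({nid}, {nid}, {tid});"
        )
        nid += 1
    return statements


# Hypothetical ids: 300 nodes starting at nid 1, all attached to term 1.
statements = term_node_inserts(first_nid=1, count=300, tid=1)
print(len(statements))
print(statements[0])
```

The generated statements could then be piped into the database client, or the same loop could run the inserts directly.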

10 nodes per page vs. 10 comments per page:

node/n
Devel: Executed 43 queries in 27.29 milliseconds
ab -c1 -n500
5.26 reqs/sec
5.23 reqs/sec
5.25 reqs/sec

taxonomy/term/n
Devel: Executed 45 queries in 23.35 milliseconds
ab -c1 -n500
4.17 reqs/sec
4.20 reqs/sec
4.31 reqs/sec

30 nodes vs. 30 comments:
node/n
Devel: Executed 63 queries in 37.05 milliseconds.
ab -c1 -n500
5.22 reqs/sec

taxonomy/term/n
Devel: Executed 66 queries in 36.69 milliseconds
ab -c1 -n500
3.25 reqs/sec

90 nodes vs. 90 comments:

node/n
Devel: Executed 124 queries in 58.23 milliseconds
5.27 reqs/sec

taxonomy/term/n
Devel: Executed 125 queries in 51.93 milliseconds.
1.77 reqs/sec

Since time spent in the database is roughly the same (yay!), this means we're spending a lot of time in PHP somewhere (boo!). merlinofchaos suggested node rendering on IRC. We probably need some profiling to see exactly where all that work is happening.
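
To make the database-versus-PHP split concrete, here is a rough Python sketch that converts each ab figure into milliseconds per request and subtracts the Devel query time. It is an approximation under stated assumptions: the Devel number comes from a single page view, while ab averages 500 requests and includes HTTP overhead, so the "other" column is only an upper bound on time spent in PHP. The helper name breakdown is made up for illustration.

```python
from statistics import mean

# (items per page, page, ab reqs/sec runs, Devel query time in ms),
# copied from the results above.
results = [
    (10, "node/n", [5.26, 5.23, 5.25], 27.29),
    (10, "taxonomy/term/n", [4.17, 4.20, 4.31], 23.35),
    (30, "node/n", [5.22], 37.05),
    (30, "taxonomy/term/n", [3.25], 36.69),
    (90, "node/n", [5.27], 58.23),
    (90, "taxonomy/term/n", [1.77], 51.93),
]


def breakdown(runs, db_ms):
    """Return (total ms per request, ms not accounted for by queries)."""
    # With -c1, ab sends one request at a time, so 1000 / reqs-per-sec
    # approximates the wall-clock time of a single request.
    total_ms = 1000.0 / mean(runs)
    return total_ms, total_ms - db_ms


for items, page, runs, db_ms in results:
    total_ms, other_ms = breakdown(runs, db_ms)
    print(
        f"{items:>2} {page:<16} total {total_ms:6.1f} ms  "
        f"queries {db_ms:5.2f} ms  other {other_ms:6.1f} ms"
    )
```

On the 90-item pages the remainder after queries comes out at roughly 130 ms for node/n and over 500 ms for taxonomy/term/n, even though the query times are within a few milliseconds of each other.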

Oh, and happy new year, planet!

Attachment          Size
30-comments.png     138.08 KB
30nodes.png         140.99 KB
90comments.png      174.75 KB
90nodes.png         144.09 KB
90pages.png         237.06 KB
node-181x50.png     194.51 KB
nodex50.png         150.11 KB
