CakePHP and performance (for noobs)

For a couple of new web projects we are looking at CakePHP. One of my employees suggested it as he has some experience with it, and it has a good reputation as an MVC framework. We did look at other frameworks, but judging by the benchmarks others have published they do not differ much in speed, so we decided to use the framework at least one of us has prior experience with.

It is safe to say I myself am not a CakePHP expert by any measure, but I do have quite some experience building the back-end parts of websites (don't take this blog as a reference, it is just a proof-of-concept I ended up using and couldn't be bothered to properly fix up).

Several pages on the internet claim CakePHP is not exactly fast, and the most common remedies are to disable debug mode, and use opcode and variable caching (we used XCache in this case for both). I'm not convinced CakePHP is slow, but for beginners like me it is easy to make mistakes that completely destroy performance.

This blog post is primarily about the optimization of a single page, which we got down from 3900ms to about 35ms, measured using ab -c 1 -n 1000 local-url (the performance increase is similar at a much higher concurrency level).

The page in question was never meant to show more than a few records and thus lacks pagination. For the test, however, we dumped quite a few records into that page anyway. That is when it all ground to a halt, and the optimization started. The SQL query and result data were already cached using XCache, so that wasn't where the problem was. As the site we are currently building is more or less a testbed for a site we will start building shortly that expects a heavy load, we are trying to save every ms we possibly can and use this opportunity to learn the performance pitfalls. Though obviously, a 3900ms response time is not acceptable for any site, even if nobody visits it :)

Response time at this stage: +- 3900ms

In the view, a lot of $html->link() and $html->image() calls were used with controller and action parameters. After some testing, we figured out that these produced an enormous slowdown because each one ends up in a Router::url() call.

To counter this without modifying the page, we created an AppHelper::url() method in app/app_helper.php. Some Googling dug up various implementations of this function, some caching, some completely replacing Router::url(). None of these solutions worked well for us. We ended up going with this code:

class AppHelper extends Helper {
    // Request-local cache: serialized named parts (plus $full) => generated URL prefix
    private $urlcache = array();

    function url($url = null, $full = false) {
        if (is_array($url)) {
            $havekey = false;      // at least one named (string) key seen
            $haveint = false;      // at least one unnamed (integer) key seen
            $keyafterint = false;  // a named key appeared after an unnamed one

            $cacheparts = array(); // named parts: controller, action, ...
            $otherparts = array(); // unnamed parts: ids and other positional values
            foreach ($url as $key => $value) {
                if (is_int($key)) {
                    $haveint = true;
                    $otherparts[] = $value;
                } else {
                    $havekey = true;
                    if ($haveint) {
                        $keyafterint = true;
                    }
                    $cacheparts[$key] = $value;
                }
            }

            // Only take the cached path when all named parts come first
            if (($havekey) && ($haveint) && (!$keyafterint)) {
                // $full changes the generated URL, so it has to be part of the cache key
                $id = serialize(array($cacheparts, $full));

                if (isset($this->urlcache[$id])) {
                    $url = $this->urlcache[$id];
                } else {
                    $url = parent::url($cacheparts, $full).'/';
                    $this->urlcache[$id] = $url;
                }

                // Append the unnamed parts to the cached prefix
                foreach ($otherparts as $value) {
                    $url .= $value.'/';
                }

                return $url;
            } else {
                return parent::url($url, $full);
            }
        } else {
            return parent::url($url, $full);
        }
    }
}

I'm sure this code could be further improved with better knowledge of CakePHP. The main characteristic is that it uses request-local cache ($urlcache) and only caches the controller and action parts (and other 'named' values in the array, as long as they appear at the start), then simply appends the 'unnamed' parts of the $url array. Caching the 'unnamed' parts as well would be slightly faster, but would take insane amounts of cache space for a large number of records. You are free to use that code if you want to, though YMMV and in your situation a different method may make more sense.
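
To make the effect of that split visible outside CakePHP, here is a minimal standalone sketch. It assumes a made-up slowReverseRoute() function as a stand-in for the expensive Router::url() call, and it leaves out the key-after-int check for brevity, so it illustrates the idea rather than replacing the helper above.

```php
<?php
// Hypothetical stand-in for the expensive reverse-routing call.
// It counts its invocations so the effect of the cache can be observed.
function slowReverseRoute(array $named, &$calls) {
    $calls++;
    return '/' . implode('/', $named);
}

// Cache the URL prefix built from the named parts, then append the unnamed parts.
function cachedUrl(array $url, array &$cache, &$calls) {
    $named = array();
    $unnamed = array();
    foreach ($url as $key => $value) {
        if (is_int($key)) {
            $unnamed[] = $value;
        } else {
            $named[$key] = $value;
        }
    }

    $id = serialize($named);
    if (!isset($cache[$id])) {
        $cache[$id] = slowReverseRoute($named, $calls) . '/';
    }

    $result = $cache[$id];
    foreach ($unnamed as $value) {
        $result .= $value . '/';
    }
    return $result;
}

$cache = array();
$calls = 0;
$first = cachedUrl(array('controller' => 'posts', 'action' => 'view', 1), $cache, $calls);
$second = cachedUrl(array('controller' => 'posts', 'action' => 'view', 2), $cache, $calls);
```

Both URLs share the same named parts, so the slow function runs only once ($calls ends up at 1) while $first and $second still differ in their trailing id. With thousands of records linking to the same controller and action, the cache holds a single entry.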

This addition, without further modifications to any views, cut the response time in half.

Response time at this stage: +- 2100ms (1800ms saved - 45%)

Though we were of course happy that we nearly doubled performance, 2100ms is still not nearly acceptable, even if the page does generate a 500kb response.

So we decided to remove most of the $html->...() calls altogether. Though this does reduce how dynamic your pages are and makes them less maintenance friendly (changing routes and such requires you to change the views), we decided to give it a shot anyway. Again, we saw a nice boost in performance.
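
As a rough illustration of what "rolling your own" can look like, here is a sketch of a plain link builder. The plainLink() name and the '/posts/view/' prefix are assumptions made up for this example, not what we used in the project.

```php
<?php
// Hypothetical helper: builds an anchor tag from a hard-coded path prefix
// instead of going through reverse routing. The prefix has to be kept in
// sync with the routes by hand, which is the maintenance cost mentioned above.
function plainLink($title, $pathPrefix, $id) {
    $href = $pathPrefix . rawurlencode((string) $id);
    return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '</a>';
}

$html = plainLink('Tom & Jerry', '/posts/view/', 42);
// $html is: <a href="/posts/view/42">Tom &amp; Jerry</a>
```

The escaping still has to happen somewhere, so the sketch keeps htmlspecialchars() for both the href and the link text.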

Response time at this stage: +- 1600ms (500 ms saved - 25%)

To make URLs prettier, we were also using the Inflector::slug() method. This turned out to be a really bad idea. Apparently, this is a really slow call as well, and cutting it out again cut the response time in half.

Of course we were not content with not having pretty URLs, so we moved the Inflector::slug() call to the write-stage (adding a slug field to the database) instead of the read-stage where it used to be.
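
The move from read-stage to write-stage can be sketched as follows. makeSlug() is a simplified stand-in for Inflector::slug(), and the record layout and URL pattern are assumptions for illustration. In CakePHP the write-side part would presumably live in a model callback such as beforeSave().

```php
<?php
// Hypothetical, simplified stand-in for Inflector::slug();
// the real method handles far more characters than this.
function makeSlug($title) {
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($title));
    return trim($slug, '-');
}

// Write path: compute the slug once and store it alongside the record.
function prepareForSave(array $record) {
    $record['slug'] = makeSlug($record['title']);
    return $record;
}

// Read path: plain string concatenation, no slug computation per row.
function recordUrl(array $record) {
    return '/posts/' . $record['id'] . '/' . $record['slug'];
}

$saved = prepareForSave(array('id' => 7, 'title' => 'CakePHP and Performance!'));
$url = recordUrl($saved);
// $url is: /posts/7/cakephp-and-performance
```

The slug is now computed once per save instead of once per record per page view.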

Response time at this stage: +- 125ms (1475ms saved - 90%)

Still not what we really wanted out of this page, even if it is a big one. Next we looked at some loop optimization in page generation. This should have been done properly from the start, but hey, everybody makes mistakes. The details are not important, but suffice it to say some things were calculated for each record that could safely be moved outside of the loop.
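
For readers who want to see the pattern, here is a generic sketch of moving an invariant calculation out of a loop. The expensiveBase() function and the record layout are made up for this example and are not the actual code from our page.

```php
<?php
// Hypothetical expensive computation whose result is the same for every record.
function expensiveBase(&$calls) {
    $calls++;
    return '/img/thumbs/';
}

$records = array(
    array('file' => 'a.png'),
    array('file' => 'b.png'),
    array('file' => 'c.png'),
);

// Before: the invariant value is recomputed for every record.
$callsBefore = 0;
$slow = array();
foreach ($records as $record) {
    $slow[] = expensiveBase($callsBefore) . $record['file'];
}

// After: computed once, outside the loop.
$callsAfter = 0;
$base = expensiveBase($callsAfter);
$fast = array();
foreach ($records as $record) {
    $fast[] = $base . $record['file'];
}
```

Both loops produce the same output, but the first calls expensiveBase() three times and the second only once. With a few thousand records the difference adds up quickly.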

Response time at this stage: +- 50ms (75ms saved - 60%)

The last optimization we did was the $uses clause. I wasn't aware it was being used at all, but it seems someone put a single $uses include in app_controller. Obviously this had to go, as for most pages it wasn't relevant.

I knew from other CakePHP performance articles that $uses shouldn't be used unless absolutely necessary (mostly you can get by with importing the model where you need it), so when I saw it I removed it.
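
The difference between loading a model for every request and loading it on demand can be sketched in plain PHP. The ReportModel and PageController classes below are hypothetical. In CakePHP 1.x the on-demand call would, as far as I know, be something like ClassRegistry::init() or the controller's loadModel() method.

```php
<?php
// Hypothetical model; constructing it stands in for the cost of setting up
// a model (schema lookup, associations) on a request that may not need it.
class ReportModel {
    public static $constructed = 0;

    public function __construct() {
        self::$constructed++;
    }

    public function latest() {
        return array('report-1');
    }
}

class PageController {
    private $models = array();

    // Create the model only the first time an action asks for it.
    protected function model($name) {
        if (!isset($this->models[$name])) {
            $this->models[$name] = new $name();
        }
        return $this->models[$name];
    }

    public function index() {
        return 'no model needed';
    }

    public function reports() {
        return $this->model('ReportModel')->latest();
    }
}

$controller = new PageController();
$controller->index();
$afterIndex = ReportModel::$constructed;   // still 0
$reports = $controller->reports();
$afterReports = ReportModel::$constructed; // now 1
```

Actions that never touch the model never pay for it, which is the whole point of removing a $uses entry from a shared base controller.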

Response time at this stage: +- 35ms (15ms saved - 30%)

So, here we are. This page has been brought down from 3900ms to 35ms (111x as fast). Some slowdowns were logic mistakes, others were due to lack of experience in using CakePHP.

I dare say that, for the moment at least, I am content with a 35ms response time on the local test server for a page that generates 500kb of output (though of course a page that size will not go into the finished version of this project).

Conclusion

Getting CakePHP to be responsive can be difficult for beginners. We were already considering the following:

  • Using an opcode cache (XCache in our case)
  • Using a variable cache as 'default' cache (XCache in our case)
  • Using a variable cache as '_cake_core_' cache (is this still needed in 1.3 ?)
  • Caching as many queries as possible for as long as is optimal for that specific entry
  • Avoiding $uses
  • Setting debug to 0
  • When debugging do not forget to use debug_kit
  • View caching (we are not using this as we will be using external methods for this)
  • Cache elements
  • Cache everything else as well ;)

Things we learned from optimizing this single page:

  • Avoid anything that ends up in Router::url() like the plague; it absolutely kills performance. This includes many $html->...() and similar calls. Roll your own if needed.
  • In some cases it doesn't make sense to avoid these calls. For those cases, make sure you have some sort of caching in place, and do whatever you can to speed up the reverse routing. See for example the AppHelper::url() code above.
  • Inflector::slug() should never be used at render time. Avoid it at all costs.
  • Don't forget to optimize your loops. It shouldn't even be needed to state this, but as we made the mistake ourselves on this page I'm mentioning it.
  • Even if you think $uses isn't used, check again.

Some things we learned from optimizing other pages:

  • Modify Paginate so you can use caching for it. It doesn't use caching by default. There are some examples around that do this (use Google), I would suggest taking those as a base and adapting them for how they will work best for you. I would post the code we used ourselves for this, but it uses a customized cache controller so it wouldn't be very useful to you.
  • Don't use a generic page controller for items that do not come from a database coupled with pages that do. It makes properly implementing the client-side caching more complex than necessary, and as your project grows you will lose track of the pages. Just use more controllers and keep your structure more logical.
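
Since I can't post our own paginate code, here is a generic sketch of the idea behind it: a cache-aside lookup whose key includes everything that changes the result set. The array cache, fetchPage() and the key prefix are assumptions for illustration. In a real project the cache would be XCache or similar, and the fetch would be the actual paginated query.

```php
<?php
// Hypothetical data source: returns one page of row ids and counts how
// often it is actually hit.
function fetchPage($page, $limit, &$queries) {
    $queries++;
    $start = ($page - 1) * $limit + 1;
    return range($start, $start + $limit - 1);
}

// Cache-aside lookup: the key covers every parameter that affects the result.
function cachedPage($page, $limit, array &$cache, &$queries) {
    $key = 'posts_page_' . md5(serialize(array($page, $limit)));
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = fetchPage($page, $limit, $queries);
    }
    return $cache[$key];
}

$cache = array();   // in-memory stand-in for a real variable cache
$queries = 0;
$a = cachedPage(2, 3, $cache, $queries);
$b = cachedPage(2, 3, $cache, $queries);  // served from the cache
$c = cachedPage(3, 3, $cache, $queries);  // different page, new query
```

Three page requests result in two queries here. Sort order, filters and anything else that alters the result would have to go into the key as well.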

Of course there are the mandatory steps after you do all these things to improve the performance even more, but these are not really CakePHP related:

  • Use YSlow! It may point you at many issues with your pages that are worth optimizing.
  • Make sure you send the right headers for proper client-side caching.
  • Use asset versioning in the filename, and let each unique filename be cached on the client pretty much forever for static assets (like images used for layout, JS, CSS, etc). Only use trickery like last-modified and ETag checking on dynamic assets.
  • Use different (sub)domains for static assets and let them be served statically (not through PHP). Minify and gzip JS and CSS assets beforehand.
  • If you're going all out, use things like HAProxy, Varnish (yay for ESI), nginx, Squid, etc to reduce load on your web servers. Then use something like lighttpd for the static assets and Apache only for the dynamic requests. If you do this well, your site may go from 100 req/s to 50,000 req/s (and beyond!).
  • Don't forget to optimize Apache and MySQL as well.
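
To illustrate the asset versioning point, here is a sketch of how a versioned filename could be derived from a file's contents, for example in a build or deploy script. The versionedName() function and the 8-character hash length are assumptions for this example; a revision number or timestamp would work just as well.

```php
<?php
// Builds a versioned filename from the file's contents, so that a changed
// file automatically gets a new name and the old one can be cached forever.
function versionedName($path, $contents) {
    $info = pathinfo($path);
    $hash = substr(md5($contents), 0, 8);
    $dir = ($info['dirname'] === '.') ? '' : $info['dirname'] . '/';
    return $dir . $info['filename'] . '.' . $hash . '.' . $info['extension'];
}

$v1 = versionedName('css/style.css', 'body { color: red; }');
$v2 = versionedName('css/style.css', 'body { color: blue; }');
```

$v1 and $v2 have the form css/style.<hash>.css with different hashes, so a client that cached the first file forever will still fetch the second one.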

Whole books have been written on the subjects above. I will not go into detail about them here, but if you see anything listed above that you have not heard of before, it may be worth your time to investigate it.

Optimize every single component of your site, and implement cache on as many layers as you can.

Hopefully this article will help at least one person somewhere :)

Comments and suggestions always welcome (but please be nice).

Comments

Posted by Alexander Vassbotn Røyne-Helgesen on 08-06-2010 at 13:05:24
Many valid points here, and I bet I would use some of them, but some of the points have no use if your site is behind a brute cache cluster, ie Varnish.

Posted by Chainfire on 09-06-2010 at 04:27:38
That is certainly a valid point in itself, though ideally you would optimize every layer.

This is merely an example of easy mistakes you can make, and how to fix them. Which you will use will obviously depend on your situation.



