Hopeless Geek

Optimizing a VPS for Getting Dugg


March 21, 2006 - 1:04am

So after Mac Geekery had a few fleeting moments on Digg’s front page and pretty much killed the server for the rest of the day, I looked into what could be done on this setup to make things work a little better for getting hit.

First, what we have to run:

  • Drupal 4.6
  • PHP 5.1
  • MySQL 5.0
  • Apache 2.0
  • Other miscellaneous utilities (30MB)

If I could find a replacement DNS server for BIND that didn’t suck, I’d use it. I tried djbdns, and the author’s anti-social nature comes through very clearly in how horridly the software is designed for the human element. I got it working, but decided it wasn’t something I wanted to maintain. Actually, once it was working, djbdns and daemontools together took up as much RAM as BIND did at startup, so there was no real win, especially since I already knew how to use BIND.

Then, the challenges:

  • 160MB of RAM
  • A shared host means shared access to the disk, which means slower I/O

Drupal

Drupal wants to live on its own machine. That’s the only conclusion I can come to when I look at the prolific number of SQL requests this thing puts out. While I was able to shoehorn it into this small environment, I’m not so confident I could get it to work well in any lesser environment, like shared webhosting. Well, at least not to the level where it would withstand a good punch in the face like Digg and other linky memes generate.

Drupal itself needs a few considerations before you go tweaking other settings. Note that every enabled module adds more SQL queries at some point or another. At the very least they run some PHP code. Only turn on modules you need to use. If you have some frivolous module you don’t need, set it to throttle. I’ll show you how to know when to throttle things later (that part’s fun).

There’s a developer tools module for Drupal that shows you how many queries have been run and so on. It’s overkill for most installs, however. Just trim your module list, throttle the bling, and move on to bigger fish.

Apache

The biggest problem with Apache is memory. I mean this in several ways. It takes a connection, passes it to a child (if one is not available, it makes one), and that child balloons its memory to handle the request and never lets go of it. This is contrary to sanity, but it’s what it does. So there are two controls to take care of this. The first is telling Apache when to make children and how many (it’s not often you get to play the Chinese government). The second is when to kill the child and make a new one (there are no tactful comments here). When running PHP applications like Drupal you’ll see 30-35MB of usage per fork. Now, up to 20MB of that is shared, but that still leaves 10MB or more of unique data per fork sucking up resources.

Determine how much memory you want Apache to use when idle and when working hard, and set the min/max clients accordingly. I figure 30MB for the root process and an additional 10MB non-shared per fork, myself. Try to keep a good number running for two reasons: first, you’ll see a faster server if there’s a process available; second, it will already have claimed memory, so it’s less likely to swap to handle a request.
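To turn that rule of thumb into numbers, here is a minimal sketch in Python. The 30MB root process and 10MB per fork are the figures above; the 100MB budget is purely an assumed example of what might be left for Apache out of 160MB, and the function name is made up for illustration. With Apache 2.0's prefork MPM the result would be a candidate for MaxClients, with MinSpareServers and MaxSpareServers deciding how many idle children stay around.

```python
def max_apache_children(budget_mb, root_mb=30, per_fork_mb=10):
    """Return how many forked children fit in budget_mb of RAM.

    root_mb covers the parent process and shared pages; per_fork_mb is
    the non-shared memory each child adds on top of that.
    """
    if budget_mb < root_mb:
        return 0
    return (budget_mb - root_mb) // per_fork_mb


if __name__ == "__main__":
    # Assumed budget: 100MB of the 160MB is left over for Apache.
    print(max_apache_children(100))  # prints 7
```

Measure your own per-fork figure before trusting the answer; the defaults here are only the numbers from this one server.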

Once you know how many to keep running, you need to know when to kill them. Watch the memory behavior of the processes over time and see if they balloon. If so, something is leaking and must be killed. The MaxRequestsPerChild setting tells Apache how many requests a fork can handle before being reaped and redeployed. Most people use a number in the thousands here. I’ve found 500 works well. Remember that these are requests, not connections and not pages. Every image and CSS file counts, and so does every request made over a keep-alive session.

Speaking of keep-alive sessions, allow them. If someone hits your site they’re getting about twenty items off it just to get the main page up (generally). If you turn on keep-alive then this only takes one or two connections to your server per visitor to fetch. If you have it off then one visitor will use up all of your forks if you’ve tuned Apache for low memory. Keep-alive is good.
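A rough back-of-the-envelope sketch of those two numbers, in Python. The twenty items per page and the 500-request limit come from the text; the function names and the two-connections-per-visitor default are assumptions for illustration only.

```python
def connections_per_page(items_per_page, keep_alive, parallel_connections=2):
    """Connections one visitor opens to load a single page."""
    if keep_alive:
        # Requests are reused over a small, fixed set of connections.
        return min(parallel_connections, items_per_page)
    # Without keep-alive every item needs its own connection.
    return items_per_page


def page_views_per_child(max_requests_per_child, items_per_page):
    """Whole page views a child serves before Apache recycles it."""
    return max_requests_per_child // items_per_page


if __name__ == "__main__":
    print(connections_per_page(20, keep_alive=False))  # prints 20
    print(connections_per_page(20, keep_alive=True))   # prints 2
    print(page_views_per_child(500, 20))               # prints 25
```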

Also, move the contents of all of your .htaccess files into the main Apache configuration file. I cannot stress this enough. If you don’t, then Apache will parse that file for every single hit and for every single parent directory. That’s bad. Put the statements in Directory stanzas in your configuration file for the fastest access.
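To see why this adds up, here is a small Python sketch that lists the .htaccess files Apache has to look for when serving one file, assuming overrides are allowed all the way down the tree. The document root and file name are hypothetical. Setting AllowOverride None in the main configuration is what actually stops these lookups.

```python
from pathlib import PurePosixPath


def htaccess_candidates(file_path):
    """List the .htaccess files consulted for one request, root first."""
    directory = PurePosixPath(file_path).parent
    # .parents runs from nearest to farthest, so reverse it and put the
    # file's own directory last.
    chain = list(reversed(directory.parents)) + [directory]
    return [str(d / ".htaccess") for d in chain]


if __name__ == "__main__":
    # Hypothetical path: one image under an assumed document root.
    for candidate in htaccess_candidates("/var/www/html/files/logo.png"):
        print(candidate)
```

That is five lookups for a single image, repeated for every one of the twenty or so items on a page.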

MySQL

MySQL is interesting to tweak. There are several things to watch:

  • Query cache
  • Key cache
  • Thread cache

Query cache

The query cache caches queries. That’s it. When the same query is run again, the answer is already available and is delivered without a lookup. When you’re getting Dugg or Slashdotted, this happens a lot. You want to turn this on. Add the following to /etc/my.cnf:

query_cache_type=1
query_cache_size=6M
query_cache_limit=1M

Well, that’s where I have it. You may want different answers. This says to cache all cacheable queries (type 1), allocate 6MB to the query answers, and cache all queries whose answers are under 1MB. By using SHOW STATUS LIKE 'Qcache%' you can see how this is working out for you:

+-------------------------+---------+
| Variable_name           | Value   |
+-------------------------+---------+
| Qcache_free_blocks      | 631     |
| Qcache_free_memory      | 1667144 |
| Qcache_hits             | 1544785 |
| Qcache_inserts          | 1500274 |
| Qcache_lowmem_prunes    | 831428  |
| Qcache_not_cached       | 51792   |
| Qcache_queries_in_cache | 1360    |
| Qcache_total_blocks     | 3862    |
+-------------------------+---------+
8 rows in set (0.00 sec)

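I’m seeing slightly more hits than inserts (an insert is a miss that then got cached), so roughly half of the cacheable queries come straight out of the cache. That’s working out. The large Qcache_lowmem_prunes count does mean entries are being evicted to make room, even with free memory left, which points at fragmentation or a cache that is a little small rather than too big. Either way, I’d rather have too much than too little.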
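One way to put a number on it is a quick calculation like the Python sketch below, using the counters from the table above. The formula is an approximation I am assuming here: it treats inserts plus not-cached queries as the misses, and the function name is made up.

```python
def query_cache_hit_rate(hits, inserts, not_cached):
    """Approximate fraction of SELECTs answered from the query cache."""
    total = hits + inserts + not_cached
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Qcache_hits, Qcache_inserts, Qcache_not_cached from the table.
    rate = query_cache_hit_rate(1544785, 1500274, 51792)
    print(f"{rate:.1%}")  # prints 49.9%
```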

Key cache

Indexes can be cached as well. A good hit rate on the key cache really, really speeds things up for MySQL. Check how you’re doing first, with SHOW STATUS LIKE 'Key%', to see if you need to tweak this:

+------------------------+-----------+
| Variable_name          | Value     |
+------------------------+-----------+
| Key_blocks_not_flushed | 0         |
| Key_blocks_unused      | 1770      |
| Key_blocks_used        | 1812      |
| Key_read_requests      | 127915934 |
| Key_reads              | 9735870   |
| Key_write_requests     | 2479694   |
| Key_writes             | 1243702   |
+------------------------+-----------+
7 rows in set (0.00 sec)

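The ratio of actual reads to read requests is about 1-to-13, so roughly 92% of key requests are served from the cache. That’s livable on a box this small, but the MySQL manual suggests the ratio should normally be below 1-in-100, so there’s room to improve. If your ratio isn’t where you want it, increase the size of the key cache in your my.cnf file: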

key_buffer=2M
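The ratio is just Key_reads divided by Key_read_requests. A minimal Python sketch with the counters from the table above (the function name is an assumption for illustration):

```python
def key_cache_miss_ratio(key_reads, key_read_requests):
    """Fraction of index block requests that had to go to disk."""
    if key_read_requests == 0:
        return 0.0
    return key_reads / key_read_requests


if __name__ == "__main__":
    # Key_reads and Key_read_requests from the table.
    ratio = key_cache_miss_ratio(9735870, 127915934)
    print(f"miss {ratio:.1%}, about 1 in {round(1 / ratio)}")
    # prints: miss 7.6%, about 1 in 13
```

Re-run it after changing the key buffer and giving the server time to warm up; the counters are cumulative since startup, so old misses linger in the totals.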

Thread cache

Just like Apache, MySQL creates a new instance to process requests. On modern systems, it creates threads; on other systems it makes forks. Either way, it keeps some on standby, just like Apache. The way to tweak this is to see how many threads it’s making and up the cache until it’s making just a few. If you increase it too much it will never make a new thread, which means you’re wasting memory with unused threads. Cover the common case and then work on the fringes. Again, check the status and see what’s going on with the threads:

mysql> show status like 'thread%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_cached    | 5     |
| Threads_connected | 3     |
| Threads_created   | 84    |
| Threads_running   | 1     |
+-------------------+-------+
4 rows in set (0.00 sec)

I’ve told MySQL to keep five threads around. Right now, three are connected and one of those is running a query. In the days it’s been running it’s only needed to create about 80 new threads. That’s decent. If you see this in the thousands, you’re losing a lot of performance to thread creation. Increase thread_cache_size to about twice the number of connected threads and work from there. Remember: creating threads is not bad, just creating a ton of them.
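If you would rather not eyeball the table, a small Python sketch like this one can read the mysql client's output and apply the twice-the-connected-threads starting point. Both function names are made up for illustration, and the parser assumes the plain two-column table shown above.

```python
def parse_status_table(text):
    """Turn the two-column table printed by the mysql client into a dict."""
    values = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Border, header, and prompt lines have no numeric second cell.
        if len(cells) == 2 and cells[1].isdigit():
            values[cells[0]] = int(cells[1])
    return values


def suggested_thread_cache_size(status):
    """Starting point from the text: twice the connected threads."""
    return 2 * status["Threads_connected"]


SAMPLE = """
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_cached    | 5     |
| Threads_connected | 3     |
| Threads_created   | 84    |
| Threads_running   | 1     |
+-------------------+-------+
"""

if __name__ == "__main__":
    status = parse_status_table(SAMPLE)
    print(suggested_thread_cache_size(status))  # prints 6
```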

Stress it

Now that you think you have it down, hammer the server and see how it handles things. Simulate 100 hits to the server with this:

ab -n 100 http://your.server/

See how it fared. Check the cache statistics in MySQL and the responsiveness of the site in a web browser while the test runs. Tweak settings as needed from here to see what’s going on.
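By default ab sends its requests one at a time; its -c option raises the concurrency. If ab isn't handy, a rough stand-in can be sketched in Python with the standard library. Everything here (function names, the default of 10 concurrent requests, the 30-second timeout) is an assumption for illustration, not a replacement for a real benchmarking tool.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_seconds(url):
    """Fetch url once and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()
    return time.perf_counter() - start


def hammer(url, requests=100, concurrency=10, fetch=fetch_seconds):
    """Run `requests` fetches, `concurrency` at a time; return sorted timings.

    Any exception raised by a fetch (timeout, HTTP error) propagates.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(fetch, [url] * requests))
    return sorted(timings)


def summarize(timings):
    """Return (median, slowest) from a sorted, non-empty list of timings."""
    return timings[len(timings) // 2], timings[-1]


if __name__ == "__main__":
    # Placeholder host from the ab example; point it at your own server.
    median, slowest = summarize(hammer("http://your.server/"))
    print(f"median {median:.2f}s, slowest {slowest:.2f}s")
```

Only run this against a server you own.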

If you can’t tell which end is having problems, work backwards. Check the running MySQL threads with SHOW PROCESSLIST and see if any are locked or writing or whatnot. If not, then see if Apache is causing swapping with sar -B 1 10 or something like that (sar is not always an included program; you may need to find and install it).

If everything seems okay, then go back to Drupal and lower the throttle threshold to below the number of current users (check the “Who’s Online” block) and try the test again. That should trigger the throttle. See if the site holds better. If it does, then throttle will help and you just need to tweak the breaking point for the site.

Drupal can take it, but it takes a lot more than just turning on throttle and praying it works.

March 27, 2007 - 6:28am
tjharman@drupal.org said

Also don’t forget

gzip all content (will mean less data going out and faster page loads for users)

A proper PHP opcode cache. eAccelerator is the best of the free ones at the moment.

Tim

© Adam Knight, All Rights Reserved except where otherwise noted.