Deploying Seaside: Load Testing Results

I ran a series of tests with different values for the parameters. All of them are for reference purposes only. As with any benchmark, a lot of factors affect the results. The advice is: try to isolate your environment so that the results are meaningful. My machine is
  • Intel(R) Core(TM)2 Duo CPU T6400 @ 2.00GHz
  • Cache: 2048 KB
  • RAM: 4GB
with a lot of other processes running, and with the same machine hosting lighttpd, JMeter and the images.

There are two series of tests. The first one tests seaside.example.com, the application that stores everything in memory in each image, so it will be very fast. The second one tests magma.example.com, where each request accesses the Magma database; as we increase the number of images, the database becomes the bottleneck. Keep that in mind when comparing results. Also, don't bash Magma, because Magma is very fast. It is just that the SeasideMagmaTester isn't optimized yet; it is a simple application for measuring a *simple* Magma-Seaside integration. In a real production environment you'll put the database on the most powerful server (at least for write performance), and for read performance you can add several servers to the Magma node and get a lot of reads/sec. But that is another problem. We just want to test the SeasideProxyTester as is. Optimize your application as you see fit.
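To make the two applications concrete: seaside.example.com behaves roughly like the classic Seaside counter, where the count lives in an instance variable of the component and never leaves the image. The sketch below is only an illustration under that assumption; MemoryCounter and the 'counter' path are made-up names, not the actual code behind seaside.example.com, and the WAAdmin registration is Seaside 3.0 style.

    "Hypothetical in-memory counter component; the real application may differ."
    WAComponent subclass: #MemoryCounter
        instanceVariableNames: 'count'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'LoadTest'.

    MemoryCounter >> initialize
        super initialize.
        count := 0.

    MemoryCounter >> renderContentOn: html
        "All state stays in image memory, so there is no database round trip per request."
        html heading: count.
        html anchor
            callback: [ count := count + 1 ];
            with: '++'.

    "Register it on each image (on Seaside 2.8 use the class-side registerAsApplication: instead):"
    WAAdmin register: MemoryCounter asApplicationAt: 'counter'.

Because every image keeps its own copy of the count, there is no shared state to coordinate, which is why these numbers are the best case for the balancer.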
First, the seaside.example.com results:

Magma Images | Seaside Images | mmap (MB) | JMeter Users | JMeter Ramp-Up (seconds) | JMeter Loop Count | Samples (Requests) | Throughput (Req/sec) | Error % (requests that failed)
1 | 1 | 100 | 10 | 10 | 500 | 5010 | 113 | 0
1 | 1 | 100 | 100 | 100 | 500 | 49971 | 114 | 1.85
1 | 1 | 100 | 400 | 100 | 500 | 200400 | 117 | 60.03
1 | 2 | 100 | 10 | 10 | 500 | 5010 | 140 | 0
1 | 2 | 100 | 100 | 100 | 500 | 50100 | 137 | 0.97
1 | 2 | 100 | 400 | 100 | 500 | 200400 | 158 | 39.56
1 | 10 | 100 | 10 | 10 | 500 | 5010 | 154 | 0
1 | 10 | 100 | 100 | 100 | 500 | 50100 | 154 | 0
1 | 10 | 100 | 400 | 100 | 500 | 200400 | 170 | 0.43
1 | 30 | 100 | 10 | 10 | 500 | 5010 | 118 | 0
1 | 30 | 100 | 100 | 100 | 500 | 50100 | 129 | 0
1 | 30 | 100 | 400 | 100 | 500 | 200400 | 118 | 0
1 | 30 | 100 | 4000 | 100 | 100 | 600600 | 187 | 47.49
1 | 2 | 100 | 600 | 30 | 1000 | 600600 | 205 | 76.12
Now the magma.example.com results:
Magma Images | Seaside Images | mmap (MB) | JMeter Users | JMeter Ramp-Up (seconds) | JMeter Loop Count | Samples (Requests) | Throughput (Req/sec) | Error % (requests that failed)
1 | 1 | 100 | 10 | 10 | 500 | 5010 | 50 | 0
1 | 1 | 100 | 100 | 100 | 500 | 50100 | 75 | 1.11
1 | 1 | 100 | 400 | 100 | 500 | 200400 | 102.9 | 78.32
1 | 2 | 100 | 10 | 10 | 500 | 5010 | 57 | 50
1 | 2 | 100 | 100 | 100 | 500 | 50100 | 120 | 73.21
1 | 2 | 100 | 400 | 100 | 500 | 197935 | 160 | 91.64
1 | 10 | 100 | 10 | 10 | 500 | 5010 | 44 | 89.28
1 | 10 | 100 | 100 | 100 | 500 | 50100 | 167 | 97.24
1 | 10 | 100 | 400 | 100 | 500 | 200400 | 206 | 95.59
1 | 30 | 100 | 10 | 10 | 500 | 5010 | 45 | 89.38
1 | 30 | 100 | 100 | 100 | 500 | 50100 | 150 | 98.72
1 | 30 | 100 | 400 | 100 | 500 | 179686 | 255 | 99.99
1 | 30 | 100 | 100 | 100 | 2 | 300 | 3 | 0
Those results, as I said, are just a reference. YMMV.

Comments: the seaside.example.com results vary a lot. With one Seaside image you get 113 requests/second. That's a lot of requests. Really. I hope someday I have a site that receives that many requests. But keep in mind that the seaside.example.com application is just storing the counters in memory. Also, the server (that is, my laptop) is only handling one process for the Magma image (not used), one for the Seaside image (heavily used), one for the lighttpd server (heavily used) and one for JMeter. Not a lot of work for the CPU and the Linux process scheduler. But if you look at the results for 2 Seaside images, the best you get is 140 requests/second without errors. That is unexpected, because if 1 image can handle 113, 2 images should handle at least 200 requests. It is even more noticeable with 10 or 30 Seaside images: the best you get is 154 requests/second.

As I said before, a lot of things affect these results. First, my CPU isn't as powerful as the ones in real servers. My laptop is running a lot of other processes (web browser, Gaim, JMeter in GUI mode, the GNOME desktop, the wireless, the music). On a dedicated server, more resources are available for the Seaside images. In the worst case, with 30 Seaside images (each of them doing a lot of work by itself), the laptop CPU is doing a lot of context switching to give each image a slice of processor time. Each image, in turn, is doing its own process scheduling between Comanche, Seaside and the other processes that run inside a Pharo image. If you consider this, you can see why requests/second don't scale linearly as you increase the number of images. The best approach, it appears, is to use separate servers for the web server and for the images. Also, distributing the images across two or more small servers (like my laptop) lets you get the most out of both the images and the balancer.

Now for the Magma results. They are ugly and disappointing. But remember, the application isn't optimized yet; it is just the simplest way of getting Magma and Seaside working together. For example, if your application reads a lot more than it writes to the database (as most applications do, unless you are storing the results of subatomic collisions ;)), you can add more read-only servers to a Magma node to improve read performance. Besides, you can use different read strategies and use Magma Collections to store your data.

The PROBLEM WITH THIS PARTICULAR APPLICATION is that all the images are trying to write to the same slots of the dictionary that holds the counters. With a lot of processes trying to write, that necessarily results in a lot of commit conflicts. Suppose session 1 reads the current counter value in order to increase it. Before it can commit the new value (current value + 1), the Pharo scheduler switches to another session on the same image, or the OS scheduler switches to another Pharo image. The newly scheduled session (session 2) reads the current value (not yet updated by the first session) and, if it isn't unscheduled the way session 1 was, successfully commits its new value. Some time later, session 1 gets scheduled again and resumes from exactly where it left off, so it updates the value and sends the commit to Magma. Magma notices that the value has changed since it was read, marks the object as dirty (that is, the client must do an abort to get the new value) and reports a commit conflict.
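To make the race concrete, here is a minimal sketch of the read-modify-write that every session performs, assuming an already connected MagmaSession held in a variable named magmaSession and a shared Dictionary of counters as the repository root (both names are assumptions, and the exact exception class Magma signals on a commit conflict varies with the version, so the handler simply catches Error):

    | counters |
    counters := magmaSession root.          "the same shared Dictionary for every session and image"
    [ magmaSession commit: [
          counters at: #hits put: (counters at: #hits ifAbsent: [ 0 ]) + 1 ] ]
        on: Error                           "a commit conflict lands here"
        do: [ :conflict |
            magmaSession abort.             "refresh the stale objects before any retry"
            conflict return ].

The read of #hits and the commit of its new value are not atomic across sessions or images, so two sessions can read the same old value and only the first commit wins.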
The error reaches the end user and is counted by JMeter as an error because of the 503 status in the response headers. In a real application, the common scenario is that each user writes to its own section of the database or to different parts of a common collection, and this is handled very well by Magma, even better if you use Magma Collections. So in a more realistic scenario you won't have that many commit conflicts, if any. But that is Magma optimization, and you know your own application better than anybody. Maybe Chris Muller (Magma's creator) or Keith Hodges (creator of the Magma seasideHelper) can replicate these results and suggest better ways to test Magma and use the Magma seasideHelper. I repeat: the apparent errors are a consequence of the application tested and not of Magma. How do you know? Because in every case we get a response from the Magma server, that is, a commit-conflict error response. So the server is alive and healthy, responding appropriately to every request made by a Seaside image. Keep that in mind before bashing Magma. A better way to test this application would be to give each session its own counter in the database (as if each user had its own private data), with all of those private counters held in a Magma Collection (that is, a collection of user data). That way each session would update its own data, which by its nature won't produce commit conflicts. But that is left as an exercise to the reader (a rough sketch follows). So, go test your apps.
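For completeness, the simplest shape of that exercise might look like the sketch below: each Seaside session gets its own key into a per-user Dictionary hanging off the Magma root. A MagmaCollection would be the better container once there are many users, as suggested above, but its query API is omitted here; PrivateCounter, magmaSession and userKey are hypothetical names, not code from this series.

    "Hypothetical component method; magmaSession is a connected MagmaSession,
     userKey is any string generated once per Seaside session."
    PrivateCounter >> increment
        | counters |
        counters := magmaSession root at: #counters.   "Dictionary of per-user counters"
        magmaSession commit: [
            counters
                at: userKey
                put: (counters at: userKey ifAbsent: [ 0 ]) + 1 ]

Because no two sessions share a key, the commits touch disjoint slots and the conflict rate should drop to nearly zero.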