Friday, September 01, 2006

A success story of ISAPI, DataSnap and D2006.

A while ago I was hired to extend and complete a very complex ISAPI Delphi web application.

This application was intended to manage high volumes of web requests and responses, with peaks of 55,000 logins per day and around 35,000 transactions per day.

The system is powered by a nice 3-tier architecture using all the DataSnap gadgets available, including a nice implementation of the out-of-the-box load balancing and redundancy components included with DataSnap.

Years and several jobs later, a big problem started to show up at moments of high load: a terrible "No more available connections" message started affecting the web servers. The worst part was that the ASP wrappers of the ISAPI were fine, so everything pointed to a problem with the DLL.

I was contacted to work on this problem, so that is how my adventure started.

Initially the solution was to increase the number of instances of the DLL that could be loaded in memory. I also tried increasing the IIS backlog settings for pending HTTP requests. All this helped, but the problem persisted.

The next solution was to increase the number of web servers. This of course helped, and brute force did its magic, but the volume of requests kept increasing and after some months the problem showed up again.

After some research, I decided to go the dramatic way, so I ended up upgrading the application to the almighty Delphi 2006 with its new FastMM memory manager and a new Midas, and adding the ISAPI Thread Pool unit.

In order to test whether the solution was effective enough without putting the live production system at risk, I set up a nice test environment using the well-known OpenSTA testing framework.

The test consisted of 30 virtual users, each user accessing 20 different player accounts, browsing the website, checking their balances and logging out of each account. I ran each test twice against the old and the improved DLL.

The results were very good: the response time went from an average of 100 ms per request to 25 ms per request. (THAT DRAMATIC!)

On the dark side, I still got 6 "500 errors" on the test runs. So on one side I was happy because performance had just jumped up, and on the other side I was concerned because I wasn't sure I was properly addressing the root cause of all this mess.

So I decided to ramp up the test by going up to 100 virtual users and pushing those 60 maximum active modules to the extreme. The result was simply impressive...

The test ended up making 12830 HTTP requests to the DLL. With the old DLL, 1850 of those 12830 requests were "500 errors". Which "500 error"? Well, the famous "not enough available connections". BINGO!

I switched DLLs and tried again... GOSH, DELPHI ROX. Out of 12830 requests, only 10 were "500 errors". The DLL was able to answer most of the requests properly and still in good time.
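To put those counts in perspective, here is a small Python sketch that turns them into error rates. It is not part of the original test setup; the constant and function names are made up for illustration, and the only inputs are the figures quoted above.

```python
# Figures quoted in the post for the 100-virtual-user run.
TOTAL_REQUESTS = 12830
OLD_ERRORS = 1850   # "500 errors" with the old DLL
NEW_ERRORS = 10     # "500 errors" with the upgraded DLL


def error_rate(errors, total):
    """Return the share of failed requests as a percentage."""
    return 100.0 * errors / total


old_rate = error_rate(OLD_ERRORS, TOTAL_REQUESTS)
new_rate = error_rate(NEW_ERRORS, TOTAL_REQUESTS)
reduction = OLD_ERRORS / NEW_ERRORS

print(f"old DLL: {old_rate:.2f}% of requests failed")
print(f"new DLL: {new_rate:.2f}% of requests failed")
print(f"errors reduced by a factor of {reduction:.0f}")
```

In other words, roughly one request in seven failed with the old DLL, against fewer than one in a thousand with the new one.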

Do I have to say more? I can't hide the fact that some nice architectural changes helped a bit, but FastMM, ISAPI thread pooling and DataSnap are the perfect combination for web DLLs.
