Performance & Scalability Considerations

Discussion in 'Networking and Computer Security' started by Steviebone, Jun 12, 2006.

  1. Steviebone

    Steviebone Geek Trainee

    I am designing a small load-balanced system of web servers, each running a single web application that is database-search intensive. My initial thought is to have a single multiprocessor machine with a large striped RAID array handling the data mining, and several less expensive front-end boxes serving the requests (and static data) in a load-balanced configuration.

    Am I correct in assuming that, since a networked high-performance share would be handling the intensive file I/O, the CPUs in the front-end servers would see noticeably lower loads? Or would the difference be offset by network traffic overhead?

    Furthermore, how much of a performance loss will I take reading from a RAID on a network share as opposed to a local one of the same configuration?

    For example, let's assume a dual-processor machine and a hardware RAID stripe without parity, reading from multiple SATA 3.0 Gbit/s drives. Assuming a gigabit (1000 Mbit/s) NIC and switch, at what point does the speed of the network transfer limit the transfer from the RAID and diminish the speed gained by reading from the stripes?
     
  2. Anti-Trend

    Anti-Trend Nonconformist Geek

    Hi Steviebone,

    Welcome to hardwareforums. :)

    What OS will these machines be running? Furthermore, do you expect the brunt of the load to fall primarily on the web servers or on the DB server? In other words, are the web servers performing the query, or are the DB queries happening on the DB server itself?

    As for the RAID, have you considered a RAID 10?
    The file I/O would be handled by the machine hosting the DB. As for traffic overhead, if the machines are on the same segment it shouldn't be an issue (unless you have a very archaic network ;))

    That really depends on a lot of factors, not least which OS(es) you're running on the servers, whether the RAID is software- or hardware-based, and what level of RAID you're using.

    With multiple boxes reading from a single striped array, the first bottleneck you hit will likely be the HDD array itself. Unless you're running Windows, which has a somewhat broken network stack; in that case it could go either way (in other words, I'd have to lab it to give you a straight answer there).
     
  3. Steviebone

    Steviebone Geek Trainee

    First, let me thank you immensely for your help.

    Server 2003 Enterprise. As presently coded, the query happens on the front end, with the central RAID handling the large file I/O. I suppose I could recode it so that the front-end server passes the query to the database server, but that would likely defeat some of the benefits, since the central box would then have to run the scripts for multiple requests, creating another possible bottleneck. My original concept was to keep the multi-processor machine doing nothing but file I/O.

    I would consider a RAID 10. I'm not concerned with parity, only speed, as each front-end server would also be configured with a single data drive containing the most critical data, so that any given front-end machine could operate standalone, without the RAID, in an emergency (while the RAID was down).

    Each machine in the cluster would be using dual gigabit (1000 Mbit/s) NICs on a single gigabit switch, all on the same network segment.

    Hardware RAID, with a card that allows the full 3.0 Gbit/s per SATA channel/disk. As I pointed out earlier, since there will be other backups of the data on the RAID, the primary purpose of the array is to speed up the queries by virtue of reads and writes spread across multiple disks. Down the road, if all went well, I might place another identical RAID file server next to the primary in their own load-balanced cluster.

    Which is why I was going to put the most dollars per box into the RAID machine, going with at least two, if not three or four, processors.

    In this setup, the front ends would all benefit from the speed of reading from the large striped RAID array but could temporarily run without the array if needed. The database will likely be less than 800 GB, so the point of the array is speed more than size or redundancy.

    I guess for starters I need to make sure I understand my math correctly, and the true capabilities of the RAID card in question.

    If each SATA drive is capable of 3.0 Gbit/s (this is burst though, right? not sustained) and there are, say, 8 drives on the card, what is the actual limit on SUSTAINED throughput back to the front end? At some point, would the limits of the gigabit NICs be exceeded anyway?
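
    Here is my rough back-of-envelope math, by the way; the per-drive sustained figure is just my guess for drives of this class, not anything from a spec sheet:

        # Rough throughput sketch -- every number here is an assumption, not a measurement.
        SATA_LINK_MBPS = 3000 / 8        # 3.0 Gbit/s interface burst rate, roughly 375 MB/s, never sustained
        DRIVE_SUSTAINED_MBPS = 70        # guessed sustained MB/s for one SATA drive of this class
        NUM_DRIVES = 8                   # drives in the stripe
        GIGE_USABLE_MBPS = 110           # gigabit Ethernet is ~125 MB/s raw, roughly 110 MB/s after overhead

        stripe_sequential = DRIVE_SUSTAINED_MBPS * NUM_DRIVES   # best case: large sequential reads
        print("SATA interface burst ceiling:      ~%d MB/s per drive" % SATA_LINK_MBPS)
        print("Stripe sequential read, best case: ~%d MB/s" % stripe_sequential)
        print("Usable gigabit NIC ceiling:        ~%d MB/s" % GIGE_USABLE_MBPS)
        print("NIC is the limit first:", stripe_sequential > GIGE_USABLE_MBPS)

    If those guesses are anywhere close, a single gigabit link saturates long before an 8-drive stripe does on sequential reads, and the real question becomes how far random seeks from several front ends drag the array below that.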

    I'm trying to avoid the cost of duplicating a high-speed array in every machine if possible. The idea is that although the front-end servers could stand on their own if needed (accessing the most critical data from local drives), they could see a significant speed gain by accessing the array. I suppose I could also code a snippet to monitor the status of the array and have the front ends fall back to their local data should the array be down or too busy.
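
    Something like this is what I have in mind for that snippet; the share path, local path, and timing numbers are all placeholders, just to show the idea:

        # Hypothetical array health check with local fallback -- paths and thresholds are made up.
        import os
        import time

        ARRAY_SHARE = r"\\raidbox\data"        # placeholder UNC path to the central array
        LOCAL_FALLBACK = r"D:\critical_data"   # local copy of the most critical data
        CHECK_INTERVAL = 30                    # seconds between health checks

        def array_is_healthy(path):
            """Treat the array as usable if a simple directory listing answers quickly."""
            try:
                start = time.time()
                os.listdir(path)
                return (time.time() - start) < 2.0   # answering too slowly counts as "too busy"
            except OSError:
                return False                         # unreachable counts as "down"

        def pick_data_root():
            return ARRAY_SHARE if array_is_healthy(ARRAY_SHARE) else LOCAL_FALLBACK

        if __name__ == "__main__":
            while True:
                print("serving data from:", pick_data_root())
                time.sleep(CHECK_INTERVAL)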

    It seems likely that a dual- or quad-processor machine dedicated to the file I/O would be able to keep up with 3 or 4 front ends receiving simultaneous requests. However, if too many requests were queued simultaneously, some of the benefits would surely be lost.

    Perhaps, instead of a single 8- or 16-drive array, putting smaller arrays in each machine (3 or 4 drives) would be better... In that case the speed gain for each machine would theoretically be less than reading the data from 8 or 16 drives, but the potential bottleneck would be eliminated and numerous simultaneous requests might be serviced more efficiently.

    On the other hand, having a single high-performance RAID for the data that can be expanded as needed is certainly more cost-effective and maintainable long term. I had hoped to keep the front-end boxes cheap and simple so that scaling out would be easy.

    In the end, the only way to know may be to build and test, then break the large array up into the front ends if needed, or to have a second load-balanced cluster of two or more RAID machines in addition to the front ends.

    OK, I'll shut up now. :) Thanks a million for any assistance!
     
  4. Anti-Trend

    Anti-Trend Nonconformist Geek

    No problem, that's what we do. :)

    That may in fact be the ideal solution, barring the I/O overhead associated with poor memory handling on the NT 5.x platform, of course. I just wanted to make sure we were on the same page there. :)

    I'm not concerned with parity either, until a drive goes dead. ;) While data loss may not be an issue for you in this situation, from the sounds of it uptime is a factor, so I strongly recommend some redundancy in your RAID from that perspective. I think a RAID 6 would also be a reasonable option for you to consider.
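
    To put rough numbers on that trade-off for an 8-drive set (the capacity figures below are just the textbook ratios; real arrays lose a bit more to overhead, and the per-drive size is only an example):

        # Textbook capacity/fault-tolerance ratios for an 8-drive array; 250 GB per drive
        # is only an example figure, and real-world usable space will be a little lower.
        drives, size_gb = 8, 250

        layouts = {
            "RAID 0 (stripe)":  (drives * size_gb,       "none -- any single drive failure kills the array"),
            "RAID 10":          (drives * size_gb // 2,  "one drive per mirrored pair"),
            "RAID 6":           ((drives - 2) * size_gb, "any two drives"),
        }

        for name, (usable_gb, survives) in layouts.items():
            print("%-16s usable ~%d GB, survives: %s" % (name, usable_gb, survives))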

    I don't think dual-homing the servers will give you much, if any, performance increase unless the switch has a backplane faster than 1 Gbit/s. Of course, as I stated before, the Windows network stack has some issues, so maybe you know some workaround of which I'm ignorant? :)

    Sounds like a good solution. Of course, I would provide some redundancy on the RAID of any production server to prevent as much downtime as possible.

    I hate to harp on this point once again, but in this light redundancy and stability are key. If the RAID machine were to go down, so would everything else, and the load balancing would be all for naught.

    See above. :)

    That will definitely vary based on a myriad of largely unquantifiable factors.

    The burst rate on SATA drives may be a theoretical 3 Gbit/s, but that is for linear reads of extremely large files. In other words, 3 Gbit/s burst is the absolute best you can hope for, even in the most ideal situation possible; sustained throughput per drive is far lower.

    That's a fairly good concept overall.
    Yes, though with a performance-oriented RAID the I/O capability grows substantially with the size of the array.

    BTW, what is the WAN link on these web servers going to look like? Is this going to be an internally facing deployment, externally facing, or both public and private?

    Are you doing round-robin DNS for load balancing, or something like an F5 / BigIP? Perhaps policy-based routing?
    If you want zero redundancy in your arrays, this would be the better option. Depending on your load-balancing method, you'd have to make some adjustments in the case of the failure of one of the outward-facing machines, though (see the sketch at the end of this post).
    In my mind, scalability will be difficult in either scenario you've outlined, and administrative overhead has the potential to be quite high. That being said, with NT 5.x I don't know of a better way to handle the situation. Even adding a distributed filesystem to the equation would not eliminate the need for any of the machines you've outlined here.
    That may be the case, unfortunate as it is. If the servers will be accessed via WAN link, I don't think you'll get anywhere near the I/O limit outlined in your scenario. If your servers will be internally facing, it may be a different story, but that also depends on the type and quantity of data each client is pulling from your servers.
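
    On the load-balancing point above: plain round-robin only stays useful if dead front ends get pulled out of the rotation. A minimal sketch of the idea (host names are placeholders, and a real setup would feed the health table from actual monitoring):

        # Minimal round-robin rotation that skips front ends marked unhealthy.
        # Host names are placeholders; health status would come from real monitoring.
        from itertools import cycle

        FRONT_ENDS = ["web1", "web2", "web3", "web4"]
        healthy = {"web1": True, "web2": True, "web3": False, "web4": True}  # web3 is down
        rotation = cycle(FRONT_ENDS)

        def next_server():
            for _ in range(len(FRONT_ENDS)):
                candidate = next(rotation)
                if healthy.get(candidate, False):
                    return candidate
            raise RuntimeError("no healthy front ends available")

        for request_number in range(6):
            print("request %d -> %s" % (request_number, next_server()))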
     
