MySQL: Using giant number of connections 2240 days ago Quote('84','84','5','4815')">Report spamEvery so often I need to reply the question, what is the best way to use 10.000 connections with MySQL (you can select your favorite big number here) Well. The good answer is your better not.
First let see where this requirement typically comes from ? Most commonly it is something like we have 40 apache boxes each having 200 children with persistent connections which gives us 8000. Well... This brings me to two questions - do you really need persistent connections and do you really need 200 apache children ?
Persistent connections will give you decent performance benefit only in one case - if there are few database queries to create the page and if database load is large response time contributor. In modern rather complex applications it is rarely the case. This is one of examples why persistent connections were disabled in "MySQLi" for PHP. Java applications are better in this sense as they typically use connection pooling rather than persistent connection which needs far less connections and much less efficient.
Now about 200 apache children on the box ? Why so many of them are required. In some cases it is good justified but in most it is simply poor configuration of Web part. 90% of these 200 children will be spoonfeeding slow clients, while keeping their large chunk of memory allocated by mod_php mod_perl or whatever you use part. One thing you can do to solve this problem is to place lightweight proxy in front which can spoonfeed your clients, do keepalive and even cache if something can be cached. You can use apache module for this purpose while I'd like something more lightweight, ie lighttpd. Important requirement here is not using process/thread per connection and based on events rather than select() or pool() in handling network as these are slow with large number of connections. By using these you can have 10.000 connections or more per box with small number of apache children. The other option is to use lingerd, which I did not try however. Finally the most radical option is to use fast-cgi interface which allows you to use much wider selection of web servers.
So if 200 processes is a lot how many is enough ? Well it depends on the application. In certain cases, for example if application uses Web Services internally or some other remote network calls. In this case you simply need a lot of processes running to get the performance you need. In most common case however if load is CPU bound and does local database calls you can do better with much smaller value.
Each "device" have the optimal number of requests which it can handle at the time. for CPU signle running thread per core is optimal. For hard drives it may be few requests per drive (to allow scheduling) but if you get too many you would get your sequential scans interrupted by random IO etc. Also you should count memory and CPU cache - the more concurrent processes are running the smaller portion of CPU cache and OS disk cache each of them get, not to mention they use memory themselves which reduce OS cache and even could cause swapping. The optimal number of proessess depends on the load, if it is CPU bound you would watch number of running threads more, if it is IO bound you would watch number of outstanding requests. As a quick and dirty I usually use (NumCPUs+NumDisks)*2 value. Depending on the application, OS, hardware and MySQL bugs it may be more or less than that.
The curious thing about this number is - it is kind of universal, even if it is not always 42. If you look at "localy running threads", meaning threads running on local CPUs or being blocked on local resources you can use this value - if it is Web server, application server or database server. If process is waits on network it is not local resource so you likely want to have many of these.
So how many ? I would start guessing from 20-40 is the optimal number of active threads on the Web/Application server. Then it comes to MySQL I'd see even less than that in "threads_running" Most typically values below 10 are optimal for CPU bound workload and below 30 for IO bound. It depends on hardware and workload of course.
What are the problems with many connections on MySQL side ? - Waste of memory. Each connection requires certain resources and might not give away all of them even if it is "sleeping" - Load On OS - Operation System rarely enjoys managing 1000s of threads TCP/IP connections - Many Running threads - thrashing of CPU cache and So waste of CPU resources. - Many Running threads - mutex ping-pong. Especailly bad for Innodb tables - Small annoyances as hard to deal with "show processlist", broken "SHOW INNODB STATUS"
Comments: 0 Views: 451
|