10854 | Alex Rousskov | take06 | 13 years ago
10853 | Alex Rousskov | 13 years ago
10852 | Alex Rousskov | 13 years ago

10851 | Alex Rousskov | 13 years ago

Balance SMP worker load by raising one worker's accept priority each second.
SMP workers are known to consume significantly different amounts of CPU time when the load cannot keep all workers busy. For example, here are the total CPU times (with CPU core IDs) from a 1M-transaction test with seven workers:

    6:32  core 2  squid
    5:31  core 7  squid
    4:03  core 5  squid
    2:42  core 4  squid
    1:06  core 6  squid
    0:19  core 1  squid
    0:11  core 3  squid
The overall imbalance stems from the fact that most workers often have nothing to do but wait for the next TCP connection. When the next connection arrives, the worker that can accept(2) it first gets to service it. Since all workers are essentially identical, the first worker to wake up from its epoll(2) wait wins.
The shared listening descriptor inside the OS kernel has a queue of waiting workers. When a new connection arrives, all waiting workers are awakened (via their epoll_wait(2) calls) in a kernel-determined order. Apparently, that order is LIFO: the last worker to be placed in the wait queue is the first to be awakened.
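To make the shared-descriptor race concrete, here is a minimal standalone demo (not Squid code) of the setup described above: forked workers share one listening socket, each watching it through its own epoll instance. The port number and worker count are arbitrary choices for the demo.

    // Minimal demo: N forked workers share one listening descriptor,
    // each waiting on it via its own epoll instance. When a connection
    // arrives, the kernel wakes the waiters (apparently in LIFO order);
    // the first worker to accept(2) wins, the rest get EAGAIN.
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        const int listener = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
        sockaddr_in addr = {};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(3128); // arbitrary port for the demo
        bind(listener, (sockaddr *)&addr, sizeof(addr));
        listen(listener, 128);

        for (int kid = 1; kid <= 7; ++kid) {
            if (fork() == 0) { // worker process
                const int ep = epoll_create1(0); // per-worker epoll set
                epoll_event ev = {};
                ev.events = EPOLLIN;
                ev.data.fd = listener;
                epoll_ctl(ep, EPOLL_CTL_ADD, listener, &ev); // join the kernel wait queue
                for (;;) {
                    epoll_event ready;
                    if (epoll_wait(ep, &ready, 1, -1) <= 0)
                        continue; // several workers may be awakened ...
                    const int conn = accept(listener, nullptr, nullptr);
                    if (conn < 0)
                        continue; // ... but only one accept(2) succeeds
                    printf("worker %d accepted a connection\n", kid);
                    close(conn);
                }
            }
        }
        pause(); // parent just waits
        return 0;
    }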
The new "load balancing" code that gets activated every second and changes the listening queue order. During N-th second, N-th worker becomes the first in the queue and will most likely get more work. With time, every worker gets similar number of chances to be the busiest and the idlest one, leveling the load:
3:15 2 squid 2:58 3 squid 2:43 6 squid 2:43 4 squid 2:34 1 squid 2:32 7 squid 2:17 5 squid
This specific algorithm seems to do a somewhat better job than some simpler schemes we tried (e.g., changing the worker order every time Squid accepts a connection), possibly because it does not reshuffle workers at semi-random times, which may create groups of busier and idler workers.
When a worker has multiple listening ports, all ports are moved to the front of their respective queues. We have tried moving just one port at a time, but that more complex algorithm did not produce significantly better results in short tests.
There are other, secondary factors such as CPU affinity and shared CPU core caches. We are ignoring them for now, but they probably contribute to the uneven CPU time distribution among workers even when this load balancing algorithm is in place.
This change works in epoll-based environments only.
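Below is a hedged sketch of one way to implement the per-second rotation described above. It assumes that removing and re-adding a descriptor to the epoll set makes its worker the most recently queued waiter which, given the LIFO wake-up order, is awakened first; the names kidId, NumberOfKids, epollFd, and listeningFds are illustrative, not Squid's actual identifiers.

    // Sketch of the per-second accept-priority rotation (illustrative,
    // not Squid's actual code). Assumption: re-registering a descriptor
    // makes this worker the last one placed in the kernel wait queue,
    // and with LIFO wake-up order, the last one queued is awakened first.
    #include <sys/epoll.h>
    #include <ctime>
    #include <vector>

    // Called once per second by every worker. During the N-th second only
    // the N-th worker (mod the number of workers) re-registers its
    // listening sockets, raising its accept priority for that second.
    void raiseAcceptPriorityIfOurTurn(int epollFd, int kidId, int NumberOfKids,
                                      const std::vector<int> &listeningFds)
    {
        const std::time_t now = std::time(nullptr);
        if (now % NumberOfKids != kidId % NumberOfKids)
            return; // not our second; let another worker go first

        // Move all listening ports at once; moving one port at a time did
        // not produce significantly better results (see above).
        for (const int fd : listeningFds) {
            epoll_event ev = {};
            ev.events = EPOLLIN;
            ev.data.fd = fd;
            epoll_ctl(epollFd, EPOLL_CTL_DEL, fd, nullptr);
            epoll_ctl(epollFd, EPOLL_CTL_ADD, fd, &ev);
        }
    }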

10850 | Alex Rousskov | 13 years ago
10849 | Alex Rousskov | 13 years ago
10848 | Alex Rousskov | 13 years ago
10847 | Alex Rousskov | 13 years ago
10846 | Alex Rousskov | 13 years ago
10845 | Alex Rousskov | take05 | 13 years ago
10844 | Alex Rousskov | 13 years ago
10843 | Alex Rousskov | 13 years ago
10842 | Alex Rousskov | 13 years ago
10841 | Alex Rousskov | 13 years ago

10840 | Alex Rousskov | 13 years ago

Fixed DNS query leaks and increased defense against DNS cache poisoning.
We were leaking (i.e., forgetting about) DNS queries under several conditions. The most realistic leak scenario goes like this:
- We send UDP query1. No response.
- We send UDP query2. The response to query1 arrives with the TC bit set.
- We try to connect over TCP, sending TCP query3. The response to query2 arrives with the TC bit set, and its ID matches TCP query3's ID. Since we are now waiting for a response over TCP, we drop the UDP response and delete the query from the queue. We leak.
This change avoids forgetting the query under the above scenario.
Moreover, the above steps hide another problem: we were accepting responses to timed-out queries, making DNS cache poisoning easier. This change avoids that by using a unique query ID for each sent query. We have also added an instance ID so that we can still track and identify a single "transaction" from Squid's point of view, even when that transaction involves many DNS query messages.
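The two-ID scheme can be sketched as follows (illustrative, not the actual Squid code): each transmitted DNS message gets a fresh, unpredictable query ID, so a response carrying an old, already-abandoned ID is rejected, while a stable instance ID ties all transmissions of one logical lookup together. Names like DnsQuery, prepareToSend, and responseMatches are assumptions for the sketch.

    // Sketch of the unique-ID-per-message scheme (illustrative, not the
    // actual Squid code). A fresh query ID per (re)transmission rejects
    // responses to timed-out queries; a stable instance ID still lets us
    // track one logical DNS "transaction" across many query messages.
    #include <cstdint>
    #include <random>

    struct DnsQuery {
        uint64_t instanceId;   // stable across retries; identifies the lookup
        uint16_t currentMsgId; // changes with every (re)transmission
    };

    static uint16_t freshMessageId()
    {
        // Unpredictable per-message IDs make blind poisoning harder.
        static std::mt19937 rng{std::random_device{}()};
        return static_cast<uint16_t>(rng());
    }

    // called before each UDP or TCP (re)transmission of the query
    void prepareToSend(DnsQuery &q)
    {
        q.currentMsgId = freshMessageId(); // older IDs become stale at once
    }

    // accept a response only if it matches the ID we are currently awaiting
    bool responseMatches(const DnsQuery &q, uint16_t responseId)
    {
        return responseId == q.currentMsgId;
    }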
When we forget about a DNS query, the caller may get stuck holding a cbdata lock. This is typical for ACLs that require domain name resolution, for example. On a busy server with a long ACL list, the lock counter keeps growing due to forgotten queries and may overflow, triggering the "c->locks < 65535" assertion. This change fixes that assertion, unless other DNS leaks or different lock leaks are present.
Same as trunk r11015.

10839 | Alex Rousskov | 13 years ago
10838 | Alex Rousskov | 13 years ago
10837 | Alex Rousskov | 13 years ago
10836 | Alex Rousskov | 13 years ago
10835 | Alex Rousskov | 13 years ago