|
Question : Web site down
|
|
Hi,

The web site I am responsible for went down last night and did not recover until all major components were restarted. The strange thing is that there is no sign of an error in the logs, apart from an unusual increase in the number of tcpsck recorded by sar.

To be more specific: the site runs on a cluster of Linux boxes, all RedHat Enterprise v.7 (2.4.21-47.0.1.ELsmp). One server runs MySQL v.4.1; the second runs Apache v.2.0 and one jBoss instance (v.3.2.6); the third box runs only the second jBoss instance, with the web application in a cluster.

Apparently the jBoss servers stopped working one after the other: there are 502 errors in the Apache logs but nothing in the jBoss logs (logging is set to maximum on all servers). After a while (around an hour) Apache stopped working as well; its logs show a gap of several hours between the time of the crash and the manual restart. After we manually restarted all the servers, everything seems to be as it was before the problem, with the site serving visitors at a rate of a few thousand requests per day.

I checked the disk space on all systems and they all have plenty left (the smallest free partition has around 15GB). No memory errors were reported, and there is no change in the sar reports we run frequently. The only odd pattern is the sudden increase in the number of tcpsck, possibly reaching the limit set by the system, just after the crash of Apache and jBoss.

Any idea what could have caused the crash?
|
Answer : Web site down
|
|
Linux has some built-in DoS protection in its networking code, controlled by two system parameters, message_burst and message_cost (see section 4.7.1 of http://linuxmafia.com/pub/linux/suse-linux-internals/chapter4.html). Note that these two parameters rate-limit the warning messages the networking code writes to the kernel log; they do not reject connections themselves. message_cost sets the cost of each message, and message_burst sets how many can be logged in a burst before further ones are dropped; the default settings allow roughly one warning every 5 seconds. So if an IP was opening sockets too often, most of the resulting kernel warnings would have been dropped rather than logged, and the kernel log may show little even while tcpsck climbs. The throttling is temporary: messages are logged again once the rate falls back below the threshold.
More on DOS attacks: http://articles.techrepublic.com.com/5100-6345-1047763.html
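To check what the two parameters are currently set to, and how many TCP sockets are in use at the moment, something like the following could be run on each box. This is only a rough sketch: it assumes a Python 3 interpreter is available, that the parameters live under /proc/sys/net/core/, and that the "TCP: inuse" figure in /proc/net/sockstat is what sar reports as tcpsck. The helper names `parse_sockstat` and `read_int` are made up for this example.

```python
def parse_sockstat(text):
    """Parse /proc/net/sockstat-style text into {protocol: {field: int}}."""
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "NAME: ..." line
        tokens = rest.split()
        # Fields come in "key value" pairs, e.g. "inuse 8 orphan 0".
        stats[name.strip()] = {
            key: int(value) for key, value in zip(tokens[0::2], tokens[1::2])
        }
    return stats


def read_int(path):
    """Return the first integer stored in a /proc file, or None if unreadable."""
    try:
        with open(path) as handle:
            return int(handle.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


if __name__ == "__main__":
    # Assumed location of the two parameters discussed above.
    for name in ("message_burst", "message_cost"):
        print(name, "=", read_int("/proc/sys/net/core/" + name))

    try:
        with open("/proc/net/sockstat") as handle:
            tcp = parse_sockstat(handle.read()).get("TCP", {})
    except OSError:
        tcp = {}
    # "inuse" is assumed to correspond to sar's tcpsck column.
    print("TCP sockets in use:", tcp.get("inuse"))
```

Running it from cron every minute or so and keeping the output would show whether the socket count creeps up gradually or jumps all at once before the next outage.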