Question : Netware 6 High Kernel Utilization
I have a Netware 6 SP4 server (though the problem has been occurring since SP3) with unusually high kernel utilization. The server runs a number of applications, such as Groupwise 6.0.2, Inoculan 4.5, and ZFD 4.0.1, but those NLMs are not tying up the processor: I have unloaded them, and even restarted without them, and the problem continues. I have traced the problem to the server spending a lot of time in interrupts. Here are the critical specs:
Compaq Proliant ML530
800 MHz Pentium III Xeon
4 GB of RAM
NW6 SP4 with version 7.00 of the HP/Compaq Netware utilities (namely CPQMPK.PSM)
Here is a copy of the kernel statistics from Remote Manager, with an uptime of 3 days:
Kernel Event Name                 CPU 0
CPU Utilization                   80
Context Switches                  1,602
Interrupts                        3,928
Fast WorkToDo's                   451
Normal WorkToDo's                 4,563
Spin LOCKS Asserted               180,392
Spin LOCKS Busy                   0
Spin LOCKS Waiting loops          0
Thread Context Spin Waits         0
Enter Netware (CPU 0) Binding     1,842
Media Manager IOs Completed       151
AES Events - No Sleeping          0
AES Events - Sleeping Allowed     0
Mutexes Acquired                  31,583
Mutexes Blocking                  20
Mutexes Allocated                 13
Mutexes Freed                     12
Semaphores Allocated              10
Semaphores Freed                  1
Semaphore Wait Calls              168
RW LOCK Writer Blocks             0
RW LOCK Reader Blocks             0
Condition Variable Blocks         0
Old Semaphores Allocated          115
Old Semaphore Freed               115
CPSemaphore Calls                 6
CSleepUntilInterrupt Calls        48
Yield Calls That Yielded          21
Unneeded Yields                   5
Threads Preempted                 10
Threads Moved to other CPU        0
CALL OUTs                         170
IO Adapter Polling Calls          665,586
New Threads Created               0
Page Faults                       0
CPU TLB Flushes                   0
CPU TLB Shootdowns                0
Worker Thread Handovers           0
WTD No Memory Available           0
WTD Thread Create Failed          0
WTD Max Threads In Use            0
WTD Time NOT Expired              0
MP WTD Max Threads In Use         0
MP WTD Time NOT Expired           0
There are approximately 250 connections on this server, and the utilization *does* go down slightly as users disconnect, but it never drops below 20% and normally hovers around 40-60%, with obvious spikes into the high 80s and 90s. There are no memory leaks that I can see, and the busiest threads are always SERVER. The NIC (Q57) is on the most heavily used interrupt (10). Here is some more info, this time with CPQMPK.PSM unloaded from startup.ncf:
Number  Description                                Total Hits  Spurious  Lost  Handler                            Handled
0       OS Allocated Bus Interrupt                 39,685,964         0     0  Timer 0 Interrupt Handler       39,685,964
1       OS Allocated Bus Interrupt                        420         0     0  Keyboard Interrupt Handler             420
5       OS Allocated Bus Interrupt                  4,461,246         0     0  NPA Environment                  4,461,246
10      OS Allocated Bus Interrupt                305,554,793       170     0  Q57 Hardware ISR               305,554,623
11      OS Allocated Bus Interrupt                     28,283         0     0  NPA Environment                     28,283
13      OS Allocated Bus Interrupt                          0         0     0  CPQHLTH ASM Int Handler                  0
14      OS Allocated Bus Interrupt                    275,587         8     0  NPA Environment                    275,579
15      OS Allocated Bus Interrupt                        464         0     0  NPA Environment                        464
192     OS Inter-Processor Interrupt Allocations            0         0     0  OS Inter-Processor Interrupts            0
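For a rough sense of proportion, the interrupt counters above can be tallied with a short Python sketch. The numbers are copied from the table; the function names are made up for illustration, and nothing here says how long the server had been up when the counters were read.

```python
# Interrupt counters copied from the table in the question.
# Tuple layout: (IRQ number, handler name, total hits, spurious hits)
INTERRUPTS = [
    (0, "Timer 0 Interrupt Handler", 39_685_964, 0),
    (1, "Keyboard Interrupt Handler", 420, 0),
    (5, "NPA Environment", 4_461_246, 0),
    (10, "Q57 Hardware ISR", 305_554_793, 170),
    (11, "NPA Environment", 28_283, 0),
    (13, "CPQHLTH ASM Int Handler", 0, 0),
    (14, "NPA Environment", 275_587, 8),
    (15, "NPA Environment", 464, 0),
]


def interrupt_shares(rows):
    """Return each IRQ's fraction of all recorded interrupt hits."""
    total = sum(hits for _, _, hits, _ in rows)
    return {irq: hits / total for irq, _, hits, _ in rows}


def hits_per_timer_tick(rows, irq, timer_irq=0):
    """Return how many hits the given IRQ took per timer interrupt."""
    by_irq = {row[0]: row[2] for row in rows}
    return by_irq[irq] / by_irq[timer_irq]


if __name__ == "__main__":
    shares = interrupt_shares(INTERRUPTS)
    # List the busiest interrupt lines first.
    for irq, name, hits, _ in sorted(INTERRUPTS, key=lambda r: r[2], reverse=True):
        print(f"IRQ {irq:>3}  {name:<28} {hits:>12,}  {shares[irq]:6.1%}")
    ratio = hits_per_timer_tick(INTERRUPTS, 10)
    print(f"Q57 interrupts per timer interrupt: {ratio:.1f}")
```

On these figures the Q57 NIC on interrupt 10 accounts for roughly 87% of all hits, several times the timer's count, which is consistent with the network path being where the interrupt time goes.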
Any ideas, experts?
Answer : Netware 6 High Kernel Utilization
Have you checked whether the TCP connections on this server are being torn down quickly enough? In Remote Manager you can look at the protocol information for TCP. If more than 15 to 20 finwait2 and timewait connections are listed, you may need to modify some of the TCP set parameters. The parameters below have immediately dropped utilization from 100% to around 2% on multiple Netware 6 SP3 servers. (The commands also back up your environment variables, just in case :-).) The TCP delayed ack setting has also really helped performance in the Zen distribution environment, along with disabling the Nagle algorithm. Unless you have really slow links (28 to 56K), this should have no negative effect.
save modified environment before.env
set tcp ip Maximum Small ECBs = 32768
set New Packet Receive Buffer Wait Time = 0.1 sec
set TCP Nagle Algorithm = off
set TCP Delayed Acknowledgement = off
set TCP Defend Land Attacks = off
set maximum wait states = 4
save modified environment after.env
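If you would rather not count the lingering connections by eye, the check described above could be scripted along these lines. This is only a sketch: it assumes you have already copied the TCP connection states out of Remote Manager into a list of strings, that they are spelled FIN_WAIT_2 and TIME_WAIT, and that 20 is the cutoff you want (the upper end of the 15 to 20 range). The function name is hypothetical.

```python
from collections import Counter

# States worth watching for slow teardown. The exact spelling used in
# Remote Manager's output is an assumption; adjust to match what you see.
LINGERING_STATES = ("FIN_WAIT_2", "TIME_WAIT")


def lingering_connections(states, threshold=20):
    """Count lingering TCP states and report whether they exceed threshold.

    `states` is an iterable with one state name per connection.
    Returns (per-state counts, True if the combined count is too high).
    """
    counts = Counter(s.strip().upper() for s in states)
    lingering = {name: counts.get(name, 0) for name in LINGERING_STATES}
    return lingering, sum(lingering.values()) > threshold


if __name__ == "__main__":
    # Made-up sample data, not taken from the server in question.
    sample = ["ESTABLISHED"] * 200 + ["TIME_WAIT"] * 18 + ["FIN_WAIT_2"] * 7
    lingering, too_many = lingering_connections(sample)
    print(lingering, "-> tune TCP parameters" if too_many else "-> looks fine")
```

Running it before and after applying the set commands gives a simple way to see whether the connections are now being torn down faster.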