Question : Netware server in a fragile state

Hi Guys,

I've got a NetWare 6.5 SP5 box that has been acting up lately. Normally NetWare is rock solid, and it's still functioning, but it needs lots of attention. So far the documentation that I've found for each "symptom" points to a memory problem, possibly a memory leak. Unfortunately the information NRM is giving me doesn't tell me much, or more specifically I don't know what to make of it. So here come some random observations, and hopefully one of them will set off a red flag for someone.

First of all, the box is an IBM eServer xSeries 235 with a 1.8 GHz Xeon processor, 3 GB of RAM, and lots of free hard drive space.

The server is running GroupWise 7, GroupWise Messenger, WebAccess, BorderManager 3.8, SurfControl, serves as a proxy/cache server, holds our print queues (pserver and NDPS), runs Symantec 10, and also hosts our main web page. Basically, anything having to do with the Internet is done by this box, with the exception of firewall and VPN.

Things were good up until we went to NetWare 6.5 SP5. At that same time we went to GroupWise 7 and reconfigured our external DNS to run off of another server.

After that our SurfControl wouldn't load, giving me all kinds of messages: "short term memory allocator out of memory," "server logical address space low, please reboot the server with switch -u(random number)," and one other memory message that escapes me.

After doing some investigating in NRM I found that our Available Memory was in the range of 380 megs, so I added another 768 megs to the server and restarted. SurfControl took off again. I already know about the key to add to the CSPConfig.ini; it's been there for a while and I've gradually increased it, but that's not having an effect any more.

Now for more points of interest:

NRM states that our Logical Address Space is low: 380 megs out of 4096. For the life of me I can't get a clear answer on what that even is. Available Memory is the RAM, and there is another item for Available Hard Drive Space. Can someone shed some light on what Available Logical Space is? This number starts out around 600 after the server is rebooted and goes down gradually; it reached 380 in about 12 hours and is still falling.
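To put a rough number on how fast that is falling, here is a back-of-the-envelope sketch in Python. It assumes the decline is linear, which it may well not be, and the function name is made up purely for illustration. It uses only the figures above (about 600 megs after reboot, 380 megs roughly 12 hours later).

```python
def hours_until_exhausted(start_mb, current_mb, elapsed_hours):
    """Linearly extrapolate how long until the value would reach zero.

    Hypothetical helper for illustration only. Assumes a constant
    rate of decline, which real memory usage may not follow.
    Returns (rate in MB per hour, hours remaining).
    """
    rate = (start_mb - current_mb) / elapsed_hours
    if rate <= 0:
        # Not declining at all, so it never runs out under this model.
        return rate, float("inf")
    return rate, current_mb / rate


# Figures from the post: ~600 megs after reboot, 380 megs ~12 hours later.
rate, remaining = hours_until_exhausted(600, 380, 12)
print(f"Falling at about {rate:.1f} MB/hour, "
      f"roughly {remaining:.1f} hours left at that pace")
```

If the drop really were steady, that works out to a bit over 18 megs an hour, so the value would bottom out in under a day without a reboot.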

Looking at the list of loaded modules, NSS.NLM has a huge amount under cache, which is slowly going down.

I've heard that NSS cache comes from connections quickly coming and going from the server. Failed logins per hour is through the roof: 1214. It would appear that machines are no longer getting imported into their container, but I'm not sure how to stop that. (It's not as if they aren't already there anyways.)

CPU utilization often hits about 95%, and my inbox is flooded with the alerts, but utilization on that server is always all over the place. Between 10 and 75 is normal.

Ok guys....go to it. Any ideas, need more information? I'm shooting in the dark on this one now. If it'd help I could post the autoexec.ncf file so you could see what exactly is loading because lord knows I don't know what half of the stuff in that file does.

Answer : Netware server in a fragile state

It's time to break out some extra hardware and move the BorderManager/SurfControl/WebAccess components to that.

You've got a NetWare 6.5 license - you can have dozens of servers!  Redundancy in eDirectory via replicas is a very good thing.

What I would do:

1)  Server with BM/SurfControl - get those filters running and get your firewall up.  Use three NICs (public, private, DMZ) - no replicas.
2)  Server with WebAccess and your other main web site - put into DMZ - no replicas.
3)  Server for file/print/messenger/Symantec - read/write replica.
4)  Server for nothing - master replica.

My guess - Symantec is a POS - that could be your leak right there.  It's also the only component not tested by Novell with SP5, while the rest of the network is.  Now eTrust, that baby works great.

However, GroupWise is a dot-oh release - could be that.  I am waiting for SP1 for GW7 before I even deploy.

Also, I use NetStorage on one server, and can that baby drain resources.  It's server-side Java, as I am sure WebAccess is, so there is another possibility.