Question : FRSDIAG errors, DCPROMO not successful, removing old Exchange 2000 domain controller
Friends,
I will try to explain my problem as briefly as possible, but bear with me if I am not able to do so.
Scenario Introduction.
We had three Windows 2000 servers, all domain controllers: BCSERVER1 running Exchange 2000 (and hosting all five FSMO roles), BCSERVER2 running SQL 7.0, and BCSERVER3 running Windows 2000 as a file and print server. We decided to upgrade to a Windows 2003 and Exchange 2003 environment. For this, two new servers were purchased, and all steps (forest prep, moving the FSMO roles to the new 2003 servers, etc.) were done as per Microsoft's instructions without any issue. Both new servers, BCNEWSERVER1 (Windows 2003 and Exchange 2003) and BCNEWSERVER2 (Windows 2003 and SQL 2000), are up and running. Everything is working fine as far as users are concerned (email from Exchange 2003, user authentication, etc.).

Domain controllers BCSERVER2 and BCSERVER3 were demoted by running DCPROMO without any issue. Now we want to demote the old Exchange server, BCSERVER1. When we ran DCPROMO on BCSERVER1, it stopped partway through, stating that it could not continue, and logged a lot of errors in the event viewer. When we dug further, we found lots of errors in the File Replication Service on BCSERVER1 (please see the FRSDIAG error report below). So we thought this issue should be resolved before proceeding with DCPROMO.

Please note that BCSERVER1 is still shown as a domain controller in Active Directory. We can create users etc. from the Active Directory snap-in on BCSERVER1, and the information is replicated to the other two domain controllers. We ran FRSDIAG on all three domain controllers. BCNEWSERVER1 and BCNEWSERVER2 passed everything. The output from BCSERVER1 is attached for information. Another piece of information that might help in understanding our issue is the Replication Monitor output. We added all three servers to the monitored servers list and found that BCSERVER1 is missing some DNS-related entries.

Looking at the following two outputs, could anybody help us resolve the issue? We want to get rid of the old Exchange server ASAP, but would go for a forceful demotion only as a last option.
Thanks in advance.
------------------------------------------------------------
FRSDiag v1.7 on 5/15/2006 3:21:13 PM
.\BCSERVER1 on 2006-05-15 at 3.21.13 PM
------------------------------------------------------------
Checking for errors/warnings in FRS Event Log ....

NtFrs 5/15/2006 2:15:42 PM Warning 13508
The File Replication Service is having trouble enabling replication from BCSERVER1 to BCNEWSERVER2 for c:\winnt\sysvol\domain using the DNS name (null). FRS will keep retrying.
Following are some of the reasons you would see this warning.
[1] FRS can not correctly resolve the DNS name (null) from this computer.
[2] FRS is not running on (null).
[3] The topology information in the Active Directory for this replica has not yet replicated to all the Domain Controllers.
This event log message will appear once per connection, After the problem is fixed you will see another event log message indicating that the connection has been established.

NtFrs 5/15/2006 2:15:42 PM Warning 13508
The File Replication Service is having trouble enabling replication from BCSERVER1 to BCNEWSERVER1 for c:\winnt\sysvol\domain using the DNS name (null). FRS will keep retrying.
Following are some of the reasons you would see this warning.
[1] FRS can not correctly resolve the DNS name (null) from this computer.
[2] FRS is not running on (null).
[3] The topology information in the Active Directory for this replica has not yet replicated to all the Domain Controllers.
This event log message will appear once per connection, After the problem is fixed you will see another event log message indicating that the connection has been established.

NtFrs 5/15/2006 4:27:13 AM Warning 13508
The File Replication Service is having trouble enabling replication from BCSERVER1 to BCNEWSERVER2 for c:\winnt\sysvol\domain using the DNS name (null). FRS will keep retrying.
Following are some of the reasons you would see this warning.
[1] FRS can not correctly resolve the DNS name (null) from this computer.
[2] FRS is not running on (null).
[3] The topology information in the Active Directory for this replica has not yet replicated to all the Domain Controllers.
This event log message will appear once per connection, After the problem is fixed you will see another event log message indicating that the connection has been established.

WARNING: Found Event ID 13508 errors without trailing 13509 ... see above for (up to) the 3 latest entries!
......... failed 1

Checking for errors in Directory Service Event Log .... passed

Checking for minimum FRS version requirement ...
ERROR: FRS Engine Minimum Requirements are SP3 binaries or later!
......... failed

Checking for errors/warnings in ntfrsutl ds ...
ERROR: This server's "Member Ref" property for the SYSVOL volume does NOT seem to be correct !!!
To fix this, use ADSIEdit and edit the "fRSMemberReference" Property of the nTFRSSubscriber object named "CN=Domain System Volume (SYSVOL share)" located under this Server's Computer Object. This value should match the FQDN of this Server.
Current Values are:
Current Value = "CN=BCSERVER1,CN=Domain System Volume (SYSVOL share),CN=File Replication Servic..."
Suggested Value = "CN=BCSERVER1,CN=Domain System Volume (SYSVOL share),CN=File Replication Service,CN=System,DC=backercop,DC=co,DC=ae"
Please note there is a small chance the above Suggested Value may not be correct - See below for more info on what the Proper Value should be!
For more Info See KB Article : 312862 Recovering Missing FRS Objects and FRS Attributes in Active Directory - Search for the step about Updating the "fRSMemberReference" object (Step 8 on the "Recovering from Deleted FRS Objects" section
......... failed with 1 error(s)

Checking for Replica Set configuration triggers... passed

Checking for suspicious file Backlog size...
ERROR : File Backlog TO server "BC_EAU\BCNEWSERVER1$" is : 5917 :: Unless this is due to your schedule, this is a problem!
ERROR : File Backlog TO server "BC_EAU\BCNEWSERVER2$" is : 13118 :: Unless this is due to your schedule, this is a problem!
failed with 2 error(s) and 0 warning(s)
Checking Overall Disk Space and SYSVOL structure (note: integrity is not checked)... passed
Checking for suspicious inlog entries ... passed
Checking for suspicious outlog entries ... passed
Checking for appropriate staging area size ... passed

Checking for errors in debug logs ...
ERROR on NtFrs_0005.log : "IBCO_FETCH_RETRY" : State | Len/Ad/Er: 4/ 10ac524/ 0, 00000008 CO STATE: IBCO_FETCH_RETRY
ERROR on NtFrs_0005.log : "IBCO_FETCH_RETRY" : State | Len/Ad/Er: 4/ 10ac524/ 0, 00000008 CO STATE: IBCO_FETCH_RETRY
ERROR on NtFrs_0005.log : "IBCO_FETCH_RETRY" : State | Len/Ad/Er: 4/ 10ac524/ 0, 00000008 CO STATE: IBCO_FETCH_RETRY
Found 2659 IBCO_FETCH_RETRY error(s)! Latest ones (up to 3) listed above
......... failed with 2659 error entries

Checking NtFrs Service (and dependent services) state...passed
Checking NtFrs related Registry Keys for possible problems...passed
Checking Repadmin Showreps for errors...passed
------------------------------------------------------------
Active Directory Replication Monitor
------------------------------------------------------------
Monitored Servers
  Default-first-site-name
    BCSERVER1
      CN=Schema,CN=Configuration,DC=backercop,DC=co,DC=ae
      CN=Configuration,DC=backercop,DC=co,DC=ae
      DC=backercop,DC=co,DC=ae
    BCNEWSERVER1
      DC=backercop,DC=co,DC=ae
      CN=Configuration,DC=backercop,DC=co,DC=ae
      CN=Schema,CN=Configuration,DC=backercop,DC=co,DC=ae
      DC=DomainDnszones,DC=backercop,DC=co,DC=ae
      DC=ForestDnszones,DC=backercop,DC=co,DC=ae
    BCNEWSERVER2
      DC=backercop,DC=co,DC=ae
      CN=Configuration,DC=backercop,DC=co,DC=ae
      CN=Schema,CN=Configuration,DC=backercop,DC=co,DC=ae
      DC=DomainDnszones,DC=backercop,DC=co,DC=ae
      DC=ForestDnszones,DC=backercop,DC=co,DC=ae
Answer : FRSDIAG errors, DCPROMO not successful, removing old Exchange 2000 domain controller
A little tutorial on application partitions and Active Directory in general.
History.
Active Directory (mainly referring to the Windows 2003 version here) is the directory behind the so-called Windows NOS (Network Operating System), in which AD stores all the network and topology data about the site (or rather, the subnets) the domain was installed in. Windows NT had its limitations: the single-master model, the replication overhead caused by copying the complete database top-down, and name resolution that was not standards compliant (WINS). To get past these, Microsoft came up with AD, which is derived from the X.500 directory model.
Basics.
To keep the directory up to date while using a "multi-master" model, they thought of some very clever mechanisms. Mainly by using the metadata on attributes (USN - Update Sequence Number, the up-to-dateness vector, the high-watermark vector, etc.), the replication service is able to replicate at a per-attribute/value level, which reduces bandwidth. In addition, AD is split up into so-called NCs (naming contexts), which divide the directory into multiple parts that are each replicated separately.
The DNS zone data can (with Windows 2003) also be stored in AD. I say "can" because DCs normally register their records through DDNS (dynamic DNS updates), which is usually considered a security risk; AD offers its own protection against that, though. To prevent the replication service from replicating this zone data to every copy of AD (on other DCs that might not be DNS servers), one can use application partitions, which keep the zone data off the DCs that do not need it and save bandwidth that way. Application partitions exist only on Windows 2003 DCs, which is why a Windows 2000 DC does not list DomainDnsZones and ForestDnsZones.
A bit more advanced.
A lot of server and service information about the domain is stored on the DNS servers as "A", "CNAME", "SRV" and "PTR" records. The records a DC registers can be found on any DC in the following path : %systemRoot%\System32\Config\netlogon.dns .
These records are used for a lot of administrative tasks initiated from the client, from finding a DC in a specific site to finding the best DC to log on to (slow connections etc.), using AD as the information store.
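To make the role of those records a bit more concrete, here is a minimal Python sketch that reads lines in the zone-file style found in netlogon.dns and lists the hosts registered for LDAP. The sample text, the IP address and the function names are made up for illustration (only the server and domain names come from the question), and the line layout "name TTL IN SRV priority weight port target" is assumed.

```python
from typing import NamedTuple, Optional


class SrvRecord(NamedTuple):
    name: str
    ttl: int
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_line(line: str) -> Optional[SrvRecord]:
    """Parse one 'name ttl IN SRV priority weight port target' line.

    Returns None for anything that is not an SRV record (A, CNAME, blank...).
    """
    fields = line.split()
    if len(fields) != 8 or fields[2].upper() != "IN" or fields[3].upper() != "SRV":
        return None
    name, ttl, _, _, priority, weight, port, target = fields
    return SrvRecord(name.rstrip("."), int(ttl), int(priority),
                     int(weight), int(port), target.rstrip("."))


# Hypothetical sample in the assumed netlogon.dns layout.
SAMPLE = """\
backercop.co.ae. 600 IN A 192.0.2.10
_ldap._tcp.backercop.co.ae. 600 IN SRV 0 100 389 bcnewserver1.backercop.co.ae.
_ldap._tcp.dc._msdcs.backercop.co.ae. 600 IN SRV 0 100 389 bcnewserver2.backercop.co.ae.
_kerberos._tcp.backercop.co.ae. 600 IN SRV 0 100 88 bcnewserver1.backercop.co.ae.
"""


def ldap_servers(text: str) -> list:
    """Return the sorted, de-duplicated targets of all _ldap._tcp SRV records."""
    records = [r for r in map(parse_srv_line, text.splitlines()) if r]
    return sorted({r.target for r in records if r.name.startswith("_ldap._tcp.")})


print(ldap_servers(SAMPLE))
```

A client looking for a DC does essentially this lookup against DNS instead of a file, which is why missing or stale SRV records show up as logon and replication trouble.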
Where the FSMO kicks in.
As most people know, we have FSMO roles assigned to servers in our domain, which are kind of important for keeping the directory and topology information consistent.
Schema Master (forest-wide) Basically the only DC that is allowed to make changes/updates to the schema. By default it is the first server to be installed/promoted.
Domain Naming Master (forest-wide) Basically the server that controls changes to the namespace (myorg.com); it is required for adding, removing, renaming and moving domains within a forest. By default it is also the first server to be installed.
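PDC Emulator (domain-wide) Acts as the PDC for backwards compatibility with "old" NT4 machines that are still based on the single-master model. It matters even without NT4 machines: password changes are forwarded to it preferentially, and it is the default time source for the domain.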
RID Master (domain-wide) This is the first really interesting FSMO role. It makes sure that all new security principals (users, groups, computers) in the directory get their own unique SID for security purposes, by handing out pools of RIDs (relative IDs). By default all other DCs reserve a pool of RIDs for their own use. When they start to run out, they request a new pool. The threshold for requesting new RIDs is usually set at around 100 remaining, which leaves some room in case the RID master is not available. If it stays unavailable for too long, you will be unable to create new objects such as users in the directory.
Infrastructure Master (domain-wide) Is used to keep track of so-called "phantoms": references to objects from other domains. It is responsible for continually knowing where, for instance, user objects reside in other domains, so that references to them stay valid in the domain it services. In a single-domain setup the Infrastructure FSMO usually has nothing to do and does not pose any problem. If more domains are introduced, it is wise to move this role to a DC that is NOT a GC. This is because the IFM detects stale references by comparing its objects with a GC and checking whether they are up to date. If the server that holds the IFM is also a GC, its own copy always looks up to date, and thus the IFM will never find (or fix) stale references.
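As an illustration of the RID Master behaviour described above, here is a small Python sketch of a DC that works from a local RID pool and asks the RID master for a new pool once it drops to a threshold. The class names, pool size and starting RID are invented for the example; only the "threshold of about 100" comes from the text, and the real allocation logic in Windows is more involved.

```python
from collections import deque


class RidMaster:
    """Hands out non-overlapping RID pools; stands in for the RID Master FSMO."""

    def __init__(self, first_rid=1000, pool_size=500):
        self.next_free = first_rid
        self.pool_size = pool_size
        self.available = True  # flip to False to simulate an unreachable master

    def allocate_pool(self):
        if not self.available:
            raise ConnectionError("RID master unreachable")
        start = self.next_free
        self.next_free += self.pool_size
        return range(start, start + self.pool_size)


class DomainController:
    """A DC that creates security principals from its local RID pool."""

    def __init__(self, master, threshold=100):
        self.master = master
        self.threshold = threshold
        self.rids = deque(master.allocate_pool())

    def create_principal(self, name):
        # At or below the threshold: try to top up, but keep working if that fails.
        if len(self.rids) <= self.threshold:
            try:
                self.rids.extend(self.master.allocate_pool())
            except ConnectionError:
                pass
        if not self.rids:
            raise RuntimeError("RID pool exhausted; cannot create " + name)
        return self.rids.popleft()


master = RidMaster()
dc = DomainController(master)
master.available = False            # RID master goes offline
created = 0
try:
    while True:
        dc.create_principal("user%d" % created)
        created += 1
except RuntimeError:
    pass
print("objects created before the pool ran dry:", created)
```

The point of the threshold is visible here: the DC keeps creating objects from what it has left while the master is away, and only fails once the local pool is really empty.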
By default most of this data is kept inside AD, because that is what the NOS was designed for: to create a central repository of all the network data and make it broadly available. Replication problems can of course also be detected using the toolset from the installation CD, but understanding how replication works is more important for tracking problems down.
Basically, replication is comparing the metadata on AD objects with that of another DC. To do this, all servers keep track of the Update Sequence Number and up-to-dateness vector of all their replication partners (at GUID level). Next they compare the numbers on the objects with those in that table. If there is a difference, the attribute gets updated; otherwise the attribute is ignored and the next attribute is enumerated, and so on. Problems in this process can be very critical even if all FSMO roles, DNS, the Knowledge Consistency Checker, etc. are working properly.
Then again, it is fairly complex to explain, and I would almost advise you to read this book.
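The comparison described above can be sketched roughly as follows in Python. This is a simplified model under my own assumptions, not how the real replication engine is implemented: DC names stand in for GUIDs, each attribute carries a version plus originating DC and USN, and the `DC` class and its methods are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AttrMeta:
    value: str
    version: int      # bumped on every originating write
    orig_dc: str      # DC where the write originated
    orig_usn: int     # USN on the originating DC
    local_usn: int    # USN on the DC holding this copy


@dataclass
class DC:
    name: str
    usn: int = 0
    attrs: dict = field(default_factory=dict)           # (object, attribute) -> AttrMeta
    high_watermark: dict = field(default_factory=dict)  # partner -> last partner USN seen
    utd_vector: dict = field(default_factory=dict)      # originating DC -> highest USN seen

    def write(self, obj, attr, value):
        """Originating write: bump the local USN and the attribute version."""
        self.usn += 1
        old = self.attrs.get((obj, attr))
        version = old.version + 1 if old else 1
        self.attrs[(obj, attr)] = AttrMeta(value, version, self.name, self.usn, self.usn)

    def changes_for(self, hwm, utd):
        """Attributes changed here after `hwm` that the requester has not seen yet."""
        return [(k, m) for k, m in self.attrs.items()
                if m.local_usn > hwm and m.orig_usn > utd.get(m.orig_dc, 0)]

    def pull_from(self, source):
        """One replication cycle; returns the number of attributes updated."""
        hwm = self.high_watermark.get(source.name, 0)
        utd = {**self.utd_vector, self.name: self.usn}
        applied = 0
        for key, m in source.changes_for(hwm, utd):
            mine = self.attrs.get(key)
            if mine is None or (m.version, m.orig_dc) > (mine.version, mine.orig_dc):
                self.usn += 1
                self.attrs[key] = AttrMeta(m.value, m.version, m.orig_dc, m.orig_usn, self.usn)
                applied += 1
        self.high_watermark[source.name] = source.usn
        # We now know everything the source knew, so merge its vector into ours.
        for dc, seen in {**source.utd_vector, source.name: source.usn}.items():
            self.utd_vector[dc] = max(self.utd_vector.get(dc, 0), seen)
        return applied


a, b, c = DC("A"), DC("B"), DC("C")
a.write("user1", "mail", "old@example.com")
print(b.pull_from(a))  # B gets the change from A
print(c.pull_from(b))  # C gets the same change via B
print(c.pull_from(a))  # nothing new: C already has A's write
```

In this model the last pull transfers nothing, because C's table already says it has seen A's write, which is the bandwidth saving the metadata is there for.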
Active Directory ISBN:0-596-00466-4
Regards,
Chris gralike