Blog de François DUFOUR

Supervize Me ©

July 2009 - Posts

SCOM 2007 – AD MP – Reasons why Replication Latency Performance won’t show up

Hi all,

This post deals with problems collecting AD replication monitoring data with SCOM 2007. I used the latest version of the AD MP and found two main problems:

Run As Account Misconfiguration

Like a good person, I configured the AD MP Run As account on each domain controller, using a domain admin account. In fact, a good person wouldn’t use a domain admin account, but granting an account exactly the right permissions for AD monitoring is so painful…

With this configuration I was able to monitor replication availability. My OpsMgrLatencyMonitors container was created, and all DCs managed to create their own containers in it and update them regularly. We then decided to monitor replication latency between some strategic DCs. Following the guide, I created overrides to enable the rules “AD Replication Monitoring Performance Collection (Source)” and “AD Replication Monitoring Performance Collection (Target)”, for two DCs at first as a test. What do these rules do? They run the scripts “AD_Replication_Monitoring_Helper1.vbs” and “AD_Replication_Monitoring_Helper2.vbs”. These scripts simply tag the DCs by writing and updating a registry key, so that the main replication script knows whether each DC is a source or a target for the replication latency calculation. This should work, but it won’t!

The problem is that the main replication rule, “AD replication is occurring slowly”, runs the “AD_Replication_Monitoring.vbs” script using the AD MP Run As account, BUT the two helper scripts mentioned above do not: they run under the agent action account, which is LocalSystem. So far nothing looks wrong, but check this capture of the registry:

[Screenshot: MPAD001]

Don’t pay attention to the values but to the registry key’s path. Depending on which Run As account is used, the keys are not written in the same place: scripts read and write their keys in a folder named after the SID of the account they run under… The path is returned in the scripts by this function: oAPI.GetScriptStateKeyPath(oParams(0)). The Helper1 and Helper2 scripts run under LocalSystem and write their registry keys under HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\…\S-1-5-18\Script\AD Management Pack\AD Replication Monitoring\ where S-1-5-18 is the well-known SID of LocalSystem. The main script, which runs under the AD MP account, then tries to read the keys under HKLM\…\S-a-b-c-d-e-f-g\Script\AD Management Pack\AD Replication Monitoring\ where S-a-b-c-d-e-f-g is the SID of my AD MP account. As a result, AD replication latency performance data is never collected.
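To make the mismatch concrete, here is a minimal sketch of how the two accounts end up with different state paths. BuildStateKeyPath is a hypothetical stand-in written for illustration; it is not the real oAPI.GetScriptStateKeyPath, and the base path keeps the same elision as above, so the script only prints strings and never touches the registry.

```vbscript
Option Explicit

' Hypothetical stand-in for illustration only: the real path comes from
' oAPI.GetScriptStateKeyPath. The "..." part is left elided on purpose.
Const BASE_PATH = "HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\..."
Const SCRIPT_SUFFIX = "\Script\AD Management Pack\AD Replication Monitoring\"

' Builds the state key path for the account identified by strSid.
Function BuildStateKeyPath(strSid)
  BuildStateKeyPath = BASE_PATH & "\" & strSid & SCRIPT_SUFFIX
End Function

Dim strHelperPath, strMainPath
strHelperPath = BuildStateKeyPath("S-1-5-18")         ' LocalSystem, used by Helper1/Helper2
strMainPath   = BuildStateKeyPath("S-a-b-c-d-e-f-g")  ' placeholder SID of the AD MP Run As account

WScript.Echo "Helpers write to:  " & strHelperPath
WScript.Echo "Main script reads: " & strMainPath

' The two paths differ only by the SID, which is enough to separate the keys.
If StrComp(strHelperPath, strMainPath, vbTextCompare) <> 0 Then
  WScript.Echo "Different keys: the main script never sees the source/target tags."
End If
```

Run with cscript, it should print two paths that differ only in the SID segment, which is exactly why the tags written by the helpers are invisible to the main script.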

Workaround: as you can’t edit the AD Management Pack to make the three rules mentioned above use the same profile, the only solution I found (and it feels like a dirty one) is to configure the agent action account on the DCs to be my AD MP Run As account. But in this case watch out for the HSLockdown tool, which then prevents LocalSystem from running responses even if it is still used as the privileged monitoring account. For more information about this have a look here.

Once this is done, Replication Latency Performance data is collected, but not in the way I thought I had configured it. Here is why:

Too much data collected

For my test I configured only one DC (named A) as Source and one DC (named B) as Target. I should have received performance data about replication from DC A to DC B, but I received replication performance data from all my DCs to DC B! Here is why:

On DC A, the helper script writes the latency registry key with the current time every hour. When the main replication script runs, it reads the key and then writes the admindescription attribute on the DC’s own container in the OpsMgrLatencyMonitors container. The difference is that it appends a tag to the value: instead of writing "toto" it writes "toto P1".

[Screenshot: MPAD002]

Once replication is done, DC B reads the admindescription of each DC to see whether replication has completed. It also checks whether the admindescription contains the string “P1”; if so, in our case, it knows it has to create performance data about DC A to DC B replication latency.

The problem this time is in the main replication script. In this screenshot you can see the part of the script which parses the admindescription for the P1 substring.

[Screenshot: MPAD003]

The script tests this using the Instr function and stores the result in the lPerfFlag variable. The problem is that Instr returns the position of the substring in the tested string, and if the substring is not found it returns 0, not -1. So whether or not the admindescription contains P, the script goes on to "If "P1" = Mid(strAdminDesc, lPerfFlag, 2) Then". Here we have another problem. If P1 really is in the string (with a value such as "20090626.0935 P1"), then lPerfFlag = 15. If P1 is not in the string, then lPerfFlag = 0, which makes the Mid function throw an error, as the “start” parameter must be 1 or higher. BUT, as I discovered, when the condition of an "If Then" throws an error and the script carries on regardless (On Error Resume Next), it still enters the "If Then" block… (It is a runtime error, not a syntax error, and On Error Resume Next resumes execution at the statement following the one that failed, which here is the first line inside the block.) SO bPerfData is set to True whether it should be or not, and performance data is collected in every case.
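Both behaviors can be checked in isolation. The following is a minimal sketch, not part of the management pack, meant to be run with cscript; the sample strings and the bEntered variable are made up for the test.

```vbscript
Option Explicit
On Error Resume Next

Dim bEntered

' InStr returns a 1-based position, or 0 when the substring is absent.
WScript.Echo InStr("20090626.0935 P1", "P")   ' 15
WScript.Echo InStr("20090626.0935", "P")      ' 0

' Mid with a start position of 0 raises a runtime error. Because of
' On Error Resume Next, execution resumes at the next statement,
' which is the first line inside the If block.
bEntered = False
If "P1" = Mid("20090626.0935", 0, 2) Then
  bEntered = True
End If

' Err still holds the error raised by Mid (5 = Invalid procedure call or argument).
WScript.Echo "Error number: " & Err.Number
WScript.Echo "Entered the If block: " & bEntered
```

The last line should report that the block was entered even though the string contains no "P1" at all, which is the behavior that makes the main script collect data for every DC.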

To put it simply: if you add a DC as a target, it will generate Replication Latency Performance data about all other DCs, whether or not you added them as a source.

Here is the test script I used to track down the problem (apart from assigning some variables and echoing some values, it is the part of the original script which checks the admindescription attribute for the string “P1”):

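On Error Resume Next
' Sample admindescription value; remove " P1" to test the "no flag" case
stradmindesc = "20090626.0935 P1"
AdminDesc2Date = False
dtRemote = 0.0
bPerfData = False

Dim lDecimal, lPerfFlag
lDecimal = Instr(strAdminDesc, ".")

' Expect a "YYYYMMDD.HHMM" timestamp: at least 13 characters, with the dot in position 9
If (13 <= Len(strAdminDesc)) And (9 = lDecimal) Then
  dtRemote = DateSerial(Mid(strAdminDesc, 1, 4), Mid(strAdminDesc, 5, 2), Mid(strAdminDesc, 7, 2)) + _
              TimeSerial(Mid(strAdminDesc, 10, 2), Mid(strAdminDesc, 12, 2), 0)

  lPerfFlag = Instr(strAdminDesc, "P")
  ' Bug kept from the original script: Instr returns 0 (never -1) when "P" is absent,
  ' so this test is always true
  If -1 <> lPerfFlag Then
    ' With lPerfFlag = 0, Mid raises an error and On Error Resume Next resumes on the next line
    If "P1" = Mid(strAdminDesc, lPerfFlag, 2) Then
      bPerfData = True
    End If
  End If

  'AdminDesc2Date = IsDate(dtRemote)
End If
wscript.echo lPerfFlag
wscript.echo bPerfData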

Here are three executions of it: the first without the On Error Resume Next statement, the second without “P1” in the admindescription, and the third with the script exactly as shown above.

[Screenshot: MPAD005]

So to correct this, the line

  If -1 <> lPerfFlag Then

just has to be replaced by

  If 0 <> lPerfFlag Then

which we still can’t do, as the “AD_Replication_Monitoring.vbs” script is in a sealed MP.
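For reference, a stricter version of the check could look like the sketch below. HasPerfFlag is a hypothetical helper name, not something from the management pack; it searches for the whole "P1" tag and compares the Instr result against 0, so Mid is never called with an invalid start position.

```vbscript
Option Explicit

' Hypothetical helper (not part of the AD MP): returns True only when
' the "P1" tag is present in the admindescription value.
Function HasPerfFlag(strAdminDesc)
  ' InStr returns 0 when "P1" is absent, so test against 0, not -1.
  HasPerfFlag = (InStr(strAdminDesc, "P1") > 0)
End Function

' Sample values in the same format as the test script.
WScript.Echo HasPerfFlag("20090626.0935 P1")   ' tagged as a source
WScript.Echo HasPerfFlag("20090626.0935")      ' not tagged
```

The first call reports true and the second false, with no reliance on On Error Resume Next to get through the test.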
