Monday, May 30, 2011

The case of mysterious reboots

I got the chance to work on an interesting case some time back and thought it was worth a blog post.

The issue reported to me was that machines at our Czech Republic site were continuously rebooting after they were migrated to our company-managed domain. The issue was observed only in this country; the same process had been implemented successfully at many other sites around the globe. The major difference was that the Czech Republic site used the native MUI pack, and once the MUI pack was uninstalled, the reboot issue disappeared. Okay, that's pretty much the description of the issue.

I started my investigation and connected remotely to a problem machine. After waiting for some time I saw the issue and got the following message.

The error was related to a services.exe crash, and once I clicked OK, the machine rebooted. If I cancelled the error, the machine would turn into an unusable state, which I guess is expected because an exception had occurred in memory. As there are many services that come under the umbrella of the services.exe process, my next effort was to identify the real culprit causing this issue.

The first thought that came to my mind was to use Process Explorer from Sysinternals. Process Explorer lets you see the threads that are executing within a process. This way I could see the threads that were running when services.exe was crashing.

I compared these with the threads that were running when the machine was normal and suspected that ESENT.dll and Userenv.dll could be the ones responsible for the issue. The Group Policy engine is contained within Userenv.dll, which runs inside winlogon.exe. Hence all my focus now turned to the Group Policy client-side extensions. I checked the event log files for errors related to Group Policy and found the following:

Userenv 1085 The Group Policy Client Side extension Security failed to execute
A further look at the Security policy events showed the following

The security policies were failing with the following error

Event : Security policies are propagated with warning. 0x428 : An exception occurred in the service when handling the control request.

Hence it was now clear that an exception occurred while processing some security policies and this caused services.exe to crash. So every time there was a Group Policy refresh on the machine, it rebooted.
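When searching for an error like this, it can help to have the code in decimal as well, since Win32 error codes are often listed that way. A minimal sketch of the conversion (the helper name `to_decimal` is just for illustration):

```python
def to_decimal(hex_code):
    """Convert a hex error code string such as '0x428' to its decimal value."""
    return int(hex_code, 16)

# 0x428 is the code reported in the security policy event above.
code = to_decimal("0x428")
print(code)  # 1064
```

Decimal 1064 is the Win32 error ERROR_EXCEPTION_IN_SERVICE, whose message text matches the one in the event. On a Windows machine, `net helpmsg 1064` should print the same description.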

I then tried to enable verbose logging for the security client-side extension in an effort to check the policies that were getting applied. You can enable verbose logging by adding the DWORD value
ExtensionDebugLevel = REG_DWORD 0x2 under HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\GpExtensions\{827d319e-6eac-11d2-a4ea-00c04f79f83a}\
and the log will be created under %windir%\security\logs\winlogon.log
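If you need to set this on more than one machine, the same value can be added from a script. The sketch below only builds and prints a `reg add` command line for the key and value mentioned above; the function name and the choice of `reg.exe` are my own assumptions for illustration, not the way I set it during this case.

```python
import subprocess

# Key of the security client-side extension, as given above (HKLM = HKEY_LOCAL_MACHINE).
GPEXT_KEY = (
    r"HKLM\Software\Microsoft\Windows NT\CurrentVersion"
    r"\Winlogon\GpExtensions\{827d319e-6eac-11d2-a4ea-00c04f79f83a}"
)

def build_reg_add_command(level=2):
    """Return the argument list for 'reg add' that sets ExtensionDebugLevel."""
    return [
        "reg", "add", GPEXT_KEY,
        "/v", "ExtensionDebugLevel",   # value name
        "/t", "REG_DWORD",             # value type
        "/d", str(level),              # value data (2 = verbose, per the text above)
        "/f",                          # overwrite without prompting
    ]

# Print the command instead of running it, so nothing is changed by accident.
print(subprocess.list2cmdline(build_reg_add_command()))
```

To actually apply it, the list can be passed to `subprocess.run(..., check=True)` from an elevated prompt on the target machine.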

But I was not able to get much of a clue from winlogon.log. The only thing I observed was that on a machine without MUI, the log had information on all the policies that were applied, whereas on the problem machine it had only about half of them and processing got stuck in the middle.
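A comparison like this can also be done mechanically. The following is only a rough sketch: it assumes the two logs have already been read into strings and that the interesting entries appear as identical lines in both, which may not hold if the lines carry timestamps. The sample text is a placeholder, not real winlogon.log output.

```python
def missing_lines(good_log, bad_log):
    """Return the non-empty lines of good_log that never appear in bad_log."""
    seen = {line.strip() for line in bad_log.splitlines()}
    result = []
    for line in good_log.splitlines():
        text = line.strip()
        if text and text not in seen:
            result.append(text)
    return result

# Placeholder content standing in for the two logs (hypothetical, for illustration only).
good = "entry A\nentry B\nentry C\nentry D\n"
bad = "entry A\nentry B\n"

for line in missing_lines(good, bad):
    print(line)
```

Whatever shows up in the output is what the problem machine never got around to logging, which narrows down where processing stopped.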
A further search for the 0x428 error led to the following links:
