There's been a problem plaguing our SalesLogix install since 6.2x: our sync service would randomly hang on our Windows 2003 Server box. I believe it became better known as more people began upgrading to Win2003, and this confirmed it.
My workaround was either to babysit the service by checking MonitorConsole every day for log errors, or to wait until remotes reported a problem, which usually happened about a week after sync had broken. In today's world that isn't really practical or cool.
If I recall correctly, I heard of a PowerShell script that one of the BPs used for repair, so I wanted to see if I could build my own. I've decided to release a consolidated version of the script as public domain, with a caveat: the Measure-Latest function is someone else's material and I've linked to the blog post I stole it from (thanks, by the way). It comes in two parts: an XML file called SystemDefense.xml, and RepairSync.ps1. You don't have to use those names, obviously. SystemDefense is lame, but it kind of fits the function of the series of scripts that use the config file.
Thankfully the sync hang issue is a very easy test case. Here are roughly the steps the script takes:
- Get a list of <Directory> nodes under the SalesLogix/Sync/ root in SystemDefense.xml with the default sync service log directory being C:\Documents and Settings\All Users\Application Data\SalesLogix\Sync\EventLogs\<Sync Job Name>\
- Get the latest filename in each directory and append it to an array
- Search the contents of each log file for the string "Sync Cycle completed", which appears at the end of every successful sync
- If the string is found, do nothing. Otherwise, update <LastRepairDate> in SystemDefense.xml with FileName.LastWriteTime and try restarting the service. (I use LastWriteTime because it was important that LastRepairDate reflect when sync broke, not the date the script ran. That really means I should have called it LastBrokeDate or some better identifier.)
- Check whether the service is running and output the result accordingly
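The steps above can be roughly sketched in PowerShell like this. It's not the actual RepairSync.ps1, just a minimal illustration: the function names (Get-LatestLogFile, Test-SyncCompleted, Repair-SyncIfHung) and the `$ServiceName` parameter are made up for the example, and it leaves out the XML update. Keep in mind that a log belonging to a cycle that is still running won't contain the string either, so when the check runs matters.

```powershell
# Sketch only: function names and the service name parameter are placeholders,
# not taken from the real RepairSync.ps1.

function Get-LatestLogFile {
    param([string]$Directory)
    # Newest file by last write time; returns nothing if the directory has no files.
    Get-ChildItem -Path $Directory |
        Where-Object { -not $_.PSIsContainer } |
        Sort-Object -Property LastWriteTime -Descending |
        Select-Object -First 1
}

function Test-SyncCompleted {
    param([string]$LogPath)
    # Select-String -Quiet yields $true on a match and nothing otherwise,
    # so cast the result to [bool] to get a clean $true/$false.
    [bool](Select-String -Path $LogPath -Pattern 'Sync Cycle completed' -SimpleMatch -Quiet)
}

function Repair-SyncIfHung {
    param([string[]]$LogDirectories, [string]$ServiceName)
    foreach ($dir in $LogDirectories) {
        $latest = Get-LatestLogFile -Directory $dir
        if ($latest -and -not (Test-SyncCompleted -LogPath $latest.FullName)) {
            Write-Output "Hang suspected in $($latest.FullName), last written $($latest.LastWriteTime)"
            # One restart attempt is enough; stop checking the remaining directories.
            Restart-Service -Name $ServiceName -Force
            break
        }
    }
    # Report the service state whether or not a restart was attempted.
    (Get-Service -Name $ServiceName).Status
}
```

You would call Repair-SyncIfHung with the list of log directories from the config file and whatever name your sync service is registered under.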
To use the script you must change the existing <Directory> node and add one for each of your sync service profiles. You can also change the second-to-last line to keep retrying the restart if it detects the service didn't restart completely, but I figured that might lead to problems the next time the script is scheduled to run. I also thought of emailing myself when a repair occurs so I'd have a history in email, but I didn't implement that for time reasons, and I'm not all that interested in knowing how many times it happened in the past year. It would likely be beneficial, though, to send daily status emails to make sure a script like this is working properly.
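To give an idea of what editing the config involves, here's a hedged sketch of how the file might be laid out and read. Only the <Directory> and <LastRepairDate> element names and the SalesLogix/Sync nesting come from the description above; the exact structure of the real SystemDefense.xml may differ, and the job names ExampleJobA and ExampleJobB are invented. The XML is inlined as a here-string so the snippet runs on its own.

```powershell
# Hypothetical layout: the real SystemDefense.xml may be structured differently.
$sample = @'
<SalesLogix>
  <Sync>
    <Directory>C:\Documents and Settings\All Users\Application Data\SalesLogix\Sync\EventLogs\ExampleJobA\</Directory>
    <Directory>C:\Documents and Settings\All Users\Application Data\SalesLogix\Sync\EventLogs\ExampleJobB\</Directory>
    <LastRepairDate></LastRepairDate>
  </Sync>
</SalesLogix>
'@

$config = [xml]$sample

# Collect one path per <Directory> node (one node per sync job).
$directories = @($config.SelectNodes('/SalesLogix/Sync/Directory') |
    ForEach-Object { $_.InnerText })

# Record when the log was last written (roughly when sync broke),
# not when the script ran. The timestamp here is a stand-in value.
$brokeAt = (Get-Date).AddHours(-3)
$config.SelectSingleNode('/SalesLogix/Sync/LastRepairDate').InnerText = $brokeAt.ToString('s')
```

With a real file you would load it with `[xml](Get-Content $path)` and write the change back with `$config.Save($path)`, where `$path` is the full path to your copy of the config.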
I've scheduled this script to run every 2 hours between our sync window, and it has been working reliably since I created it on 8/22, with our last repair on 10/29, so it hasn't broken yet. If you have any questions I can only offer limited support due to my borderline n00b PowerShell knowledge, but I'll definitely welcome any suggestions.
The files are hosted on SkyDrive here.