Friday, September 28, 2007

Stopping Sophos PureMessage 3.0 from Generating an NDR Storm

I upgraded Sophos PureMessage from v2.6.1 to v3.0, the latest version, on an SBS 2003 server.

The upgrade went smoothly as per usual. Hats off to Sophos for providing good quality products and excellent documentation.

This version now includes AD integration and allows for recipient validation. I enabled this and verified the upgraded settings. I kept an eye on progress for about 90 minutes while I was performing other administrative tasks.

When I came back to it the next morning, the server was sluggish. Investigation showed that there were several thousand NDRs queued up, and further investigation revealed that the Exchange journal mailbox was bouncing Read Receipts back to PureMessage with a Permission Denied error. Unfortunately, the Read Receipts had no From header, so PureMessage was generating an NDR and trying to send it to an address of '<', which is a completely invalid address. Each failed NDR then escalated an alert message to the Alert address, which filled up that mailbox. The bounce itself also triggered an unscannable error due to too many nested attachments, which queued up yet another alert message.
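As a rough illustration of the failure mode above, here is a minimal Python sketch of the check a mail filter could make before generating an NDR. It is not how PureMessage works internally; the function name `bounce_target` is hypothetical. The idea follows the usual SMTP convention that bounces and receipts carry an empty return path, and that nothing should be bounced back to an empty or unparseable sender.

```python
from email.utils import parseaddr
from typing import Optional


def bounce_target(sender: str) -> Optional[str]:
    """Return the address an NDR may be sent to, or None to suppress the NDR.

    `sender` is the return path / From value of the message that failed.
    Hypothetical helper for illustration only.
    """
    sender = sender.strip()

    # An empty or null ("<>") sender marks a bounce or receipt:
    # replying to it is what feeds an NDR loop.
    if sender in ("", "<>"):
        return None

    # parseaddr copes with both "user@host" and "<user@host>" forms.
    _, addr = parseaddr(sender)

    # Require something on both sides of the last "@"; this rejects
    # garbage such as a lone "<".
    local, sep, domain = addr.rpartition("@")
    if not (local and sep and domain):
        return None

    return addr


if __name__ == "__main__":
    for candidate in ("<", "<>", "", "<user@example.com>"):
        print(repr(candidate), "->", bounce_target(candidate))
```

With a guard like this, a receipt with no usable sender would be dropped or logged rather than turned into a new NDR addressed to '<'.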

The first remedial action was to remove the administrator alert address. This stopped the queuing. I then turned off administrator alerts for the On Unscannable action for both the Exchange Store scanning and the Transport scanning. This helped stop further NDR flooding.

The last step, which finally killed the NDR storm, was to fire up the Exchange System Manager, go into the SmallBusiness SMTP Connector properties, go into Content Restrictions and turn off System Messages.

I also opened up the Delivery Restrictions placed on the mailbox that I'm using for Exchange Journalling until I can verify what the appropriate restrictions should be for it to work with PureMessage 3.0. The previous setting (only accept messages from the Exchange Journalling mailbox) worked fine with PureMessage 2.6.1.


ram990 said...

Hello Chris,

What a great post, I have had precisely the same problem and this is the only other case I can find. Glad to know it's not just us.

Have you had any response from Sophos regarding this? I can't find any acknowledgement of this problem on their website.



Chris Knight said...

I didn't escalate the problem to Sophos, simply because I didn't have enough time to perform a proper post-mortem, nor did I have a repeatable method to verify the cause of the problem.

I wrote the blog entry immediately after fixing the problem so I had a record of what the problem was and how I resolved it. I'm glad it has helped someone else.

It's on my to-do list to run this up on a test rig to find out what's going on. I'd then be happy to pass on my findings to Sophos.

Your post indicates that I should probably write up my findings as they stand and send them to Sophos. I generally dislike submitting vague, fluffy problem reports, as I know how much of a pain they can be to resolve.

John Stringer said...

Just a quick follow-up. We are aware of the issue experienced and have a hotfix and workaround available to customers with this problem. It's also fully fixed in the next release of the product (3.0.1) due at the beginning of January - alongside a few other bug fixes and support for Japanese! Best regards, John (PME product manager)

Afterburned said...

Just a quick note to say thanks for this blog - I had the same issue and had to give Sophos this link before they knew what I was talking about!!
Interesting to see that PME has not had the fix incorporated even though the original post is September and it is now December...

Anonymous said...

EXACTLY our problem. We are still waiting for the hotfix. Our inbox is certainly full of bloomin' Exchange queue notifications!!!!