Background: I have a SQL Server 2005 transactional replication setup that pushes normal traffic fine, but occasional larger processing batches have been a problem.
Obviously the changes take longer to push, but even after that completes, the MSrepl_commands table fills up with old transactions, and all subsequent replication activity is very slow (until the old records are eventually flushed by the distribution cleanup job).
Right now there are 500 million rows in distribution.dbo.MSrepl_commands. So even the distribution job that tries to locate pending commands to send to the subscriber is very, very slow.
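To see how old the backlog actually is, something like the following sketch could be run on the distributor. It assumes the distribution database has the default name "distribution" (as above) and reads MSrepl_transactions, which carries an entry_time per transaction, rather than scanning the 500-million-row commands table directly.

```sql
-- Read-only diagnostic sketch; NOLOCK is used so the query does not
-- block the distribution agents (results may be slightly inexact).
USE distribution;

SELECT
    publisher_database_id,
    MIN(entry_time) AS oldest_transaction,
    MAX(entry_time) AS newest_transaction,
    COUNT(*)        AS transaction_count
FROM dbo.MSrepl_transactions WITH (NOLOCK)
GROUP BY publisher_database_id;
```

If oldest_transaction is far older than the configured maximum retention, the cleanup job is probably not removing rows.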
What should I do? Is this a problem with the history retention period or something?
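For reference, the retention settings can be read on the distributor with sp_helpdistributiondb. As far as I understand it, history_retention only covers the agent history tables, while min_distretention and max_distretention (in hours) control how long rows stay in MSrepl_commands and MSrepl_transactions.

```sql
-- Run on the distributor; lists each distribution database with its
-- min_distretention, max_distretention and history_retention values.
EXEC sp_helpdistributiondb;
```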
I vaguely recall a thread where someone mentioned adding a non-clustered index to MSrepl_commands, but I'm not sure if that's a good idea or not. The missing index DMVs suggested an index on ([publisher_database_id], [article_id]) INCLUDE ([xact_seqno]), but I'm unsure if that's advisable, or if that would slow down other aspects of replication.
EDIT: Looks like the 500m rows are not recent like I thought. They go back 4 months, so for some reason my cleanup job is not working. I'm currently investigating the "subscription expiration" setting, which is set to "never expire"; I don't think that is correct.
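Two checks that may help narrow this down, sketched under assumptions: the cleanup job is assumed to have the default name prefix "Distribution clean up:", and the distribution database is assumed to be named "distribution". The first query shows whether the job has been running and what it reported; the second shows the per-publication retention and immediate_sync values as recorded on the distributor.

```sql
-- Recent outcomes of the distribution cleanup job (default job name assumed).
SELECT TOP 20
    j.name,
    h.run_date,      -- integer, YYYYMMDD
    h.run_time,      -- integer, HHMMSS
    h.run_status,    -- 1 = succeeded, 0 = failed
    h.message
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobhistory AS h
    ON h.job_id = j.job_id
WHERE j.name LIKE 'Distribution clean up:%'
ORDER BY h.run_date DESC, h.run_time DESC;

-- Publication settings as seen by the distributor.
-- retention = 0 corresponds to subscriptions that never expire.
SELECT publication, publisher_db, retention, immediate_sync, allow_anonymous
FROM distribution.dbo.MSpublications;
```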