For the last few days I have been seeing a high number of undistributed commands in Replication Monitor. Restarting the SQL Server Agent solves the problem temporarily, but after a few hours it starts again. Even when there are a lot of undistributed commands, a tracer token I insert shows low latency (00:00:09 at most). sp_replmonitorsubscriptionpendingcmds returns the same values as Replication Monitor.
The 'Distributor To Subscriber History' shows the message "TCP Provider: An existing connection was forcibly closed by the remote host". I looked for network issues and found nothing; the only trace is event 14152 in the Application log.
There are 6 replication jobs, all replicating to the same subscriber, and they fail at similar times but not at the same moment: the failures are spread over a period of a few minutes. If it were a network issue, they would all break at the same moment, right?
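To make the "similar time, but not the same moment" observation measurable, here is a minimal sketch of how the failure times could be grouped. It assumes the timestamps are copied by hand from each agent's job history; the job names, the times, and the 5-minute window below are made-up placeholders, not my real data.

```python
from datetime import datetime, timedelta


def group_failures(failures, window=timedelta(minutes=5)):
    """Group (job, time) pairs into incidents.

    A new incident starts whenever the gap to the previous failure
    is larger than `window`.
    """
    groups = []
    for job, ts in sorted(failures, key=lambda f: f[1]):
        if groups and ts - groups[-1][-1][1] <= window:
            groups[-1].append((job, ts))
        else:
            groups.append([(job, ts)])
    return groups


# Hypothetical placeholder values, for illustration only.
failures = [
    ("agent_1", datetime(2000, 1, 1, 9, 0, 5)),
    ("agent_2", datetime(2000, 1, 1, 9, 1, 40)),
    ("agent_3", datetime(2000, 1, 1, 9, 3, 10)),
    ("agent_1", datetime(2000, 1, 1, 15, 30, 0)),
]

for group in group_failures(failures):
    # Spread = time between the first and last failure of the incident.
    spread = group[-1][1] - group[0][1]
    jobs = sorted({job for job, _ in group})
    print(f"{group[0][1]:%H:%M:%S} spread={spread} jobs={jobs}")
```

If every incident contains all 6 jobs with a spread of a few minutes, that would point to something shared by all of them rather than to 6 unrelated failures.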
After restarting the agent or the server, replication works for several hours. I haven't found any pattern in the timing: sometimes it breaks in the morning, other times in the afternoon, etc.
How can I continue debugging the problem?
Note: SQL Server 2008 R2