I have a transactional replication configured as per below:
ServerA.DBA (pub) --> ServerB.DBB (Sub)
ServerB.DBB (pub1) --> ServerC.DBC (Sub)
ServerB.DBB (pub2) --> ServerD.DBD (Sub)
ServerB.DBB (pub3) --> ServerE.DBE (Sub)
ServerB.DBB (pub4) --> ServerF.DBF (Sub)
ServerB.DBB (pub5) --> ServerG.DBG (Sub)
Today, I got an alert saying that my replication ServerA.DBA (pub) --> ServerB.DBB (Sub) is failing with below error:
Replication-Replication Distribution Subsystem: agent failed. The transaction log for database is full due to 'REPLICATION'.
Upon investigation, I found that my log file was capped at 100GB on ServerB.DBB which was causing the issue. To fix this, I added a new log file as I was unable to extend the existing log file due to same error. This fixed my issue and replication caught up in some time. However, now I am unable to shrink the log file and it always shows my log file utilization as 99% and increasing. This raised some doubts in my mind whether my replication from Server B to C,D,E,F and G is actually running or not and why I am unable to shrink the log file.
Log_reuse_wait_desc on my ServerB.DBB always shows 'REPLICATION'. I ran DBCC OPENTRAN on my ServerB.DBB and it shows below:
Replicated Transaction Information:
Oldest distributed LSN : (0:0:0)
Oldest non-distributed LSN : (10417370:9406:1)
This value is not changing since many hours now. Does it mean my replication from Server B to C,D,E,F and G is not working? However when I see in replication monitor it shows everything as fine. i.e. logreader, all distributor agents are running and moving transactions for all publications. Why I am unable to shrink the log file and what should I do to shrink the log file?
Also, want to add that CDC is not enabled on any of my DBs involved in replication. DBCC LOGINFO shows all my VLFs in use.