Scenario:
I'm working with a customer that has a live database. On a separate server, they have a copy of this database and they have transactional replication setup, which runs constantly. I have an SSIS package that runs on the copy of the database for up to an hour to export data to a reporting database.
When I've tested the package with replication enabled, it occasionally fails as it reads from various tables at different points of the execution. The problem is that if some data is read at an early stage, which subsequently gets deleted/inserted, other related records that are read later on effectively become orphaned and cause lookup failures. Whilst I have various safeguards to combat this, it's difficult to cater for every case as not all records have dates that I can use to limit data.
Plan:
I have been looking at pausing the replication job, so that the package can run with static data and then re-enable it once the package has run. Once the replication is enabled again, all of the transactions from the live database that were generated during the package execution should then be applied to the copy.
Problem:
I've done some reading around the various Replication Agents used for transactional replication, but I'm not entirely sure what the minimum requirement is for pausing the replication.
At the moment I'm looking at pausing the Distribution Agent and the Log Reader Agent to achieve what I want to do. The question is, do I need to pause both agent jobs or can I pause one or the other so that the transactions build up and are applied once the agent is enabled?
I'm not sure if some of this is dependent on specific configurations or setup, but I can provide further information if required, so please comment if more information is required.