Sort spill in system stored procedure in merge replication

September 22, 2017, 1:43 am

I have a merge replication that has been running fine for a year (the system has been active for several years but I recreated the replication a year ago because of some other problems).

Everything is still working, but when someone (re-)initializes a subscription (downloading a new subscription database) it takes much longer than before and some locks appear. The subscribers are about 300 Windows CE devices running SQL Server Compact.

What I can see is that a stored procedure called MSenumgenerations90 is the culprit; it uses a lot of I/O and CPU. The most common wait I see in Activity Monitor is CXPACKET, which I understand indicates parallelism. I also see some PAGEIOLATCH waits, and at least some of them point to tempdb. The table MSmerge_genhistory contains a bit more than 1.5 million rows, and I tried adding indexes to it to make the expensive operation in the stored procedure run faster, but with no success.

My retention period is high: it is set to 60 days. Still, I can see generations in MSmerge_genhistory that were created (coldate) when I recreated the replication a year ago. Some generations do seem to have been removed, though, since the generation ID of the latest one is above 2.2 million. Could there be something wrong with the metadata cleanup, or is this normal behavior? When the CXPACKET waits show up, I get Buffer I/O wait times of 2000 ms/s and there are also a lot of IO_COMPLETION waits. When I monitor the disk on the server, I can see that the tempdb file gets the most reads and writes. One thing I have read is that you can set up multiple files for tempdb and that this can relieve pressure on tempdb. Could this help in my situation?

Is it safe to add more files to tempdb when you have a merge replication running?
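
For reference, adding a tempdb data file is normally an online operation. A minimal sketch, assuming a hypothetical logical name, path, and sizes (none of these come from my server, and new files are usually sized like the existing ones):

```sql
-- Sketch only: the logical name, path, and sizes are hypothetical.
-- Inspect the current tempdb files first (size is stored in 8 KB pages).
SELECT name, physical_name, size * 8 / 1024 AS size_mb
FROM tempdb.sys.database_files;

-- Add one more data file, sized to match the existing data file(s).
ALTER DATABASE tempdb
ADD FILE (
    NAME = N'tempdev2',                      -- hypothetical logical name
    FILENAME = N'T:\TempDB\tempdev2.ndf',    -- hypothetical path
    SIZE = 1024MB,
    FILEGROWTH = 256MB
);
```

Extra files mainly help with allocation contention inside tempdb, so this may not change much if the real cost is a sort writing large amounts of data.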

UPDATE.

I ran the stored procedure to get the actual execution plan, and there is a sort warning saying that the sort is spilling to tempdb. I am guessing this is the reason for my problem: the SP runs in parallel and spills to tempdb, and 8 concurrent threads access tempdb to sort the results, which is why the I/O waits are high and why CXPACKET is showing!? How can I get the sort to stop spilling to tempdb? Could it be done with better indexing? I cannot modify the SP since it is part of replication. Any help is welcome.

UPDATE 2.

I created a covering index for the part of the SP that was doing the sort, and now there is no warning any more; the tempdb reads and writes are back to normal and the sync time has been reduced. Unfortunately there are still CXPACKET waits in Activity Monitor and the SP is still consuming a lot of CPU. Does anyone have a tip for how to continue?
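
To put numbers on the waits described above, something like the following read-only sketch could be used. It only reads standard DMVs and configuration views; the choice of wait types simply mirrors the ones mentioned in this post.

```sql
-- Cumulative waits since the last restart (or since the stats were cleared).
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'CXPACKET', N'IO_COMPLETION')
   OR wait_type LIKE N'PAGEIOLATCH%'
ORDER BY wait_time_ms DESC;

-- Current instance-level parallelism settings (read-only check).
SELECT name, value_in_use
FROM sys.configurations
WHERE name IN (N'max degree of parallelism',
               N'cost threshold for parallelism');
```

Comparing two snapshots of the first query taken before and after a subscription initialization would show how much of the wait time belongs to that operation.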

↧

MSmerge_genhistory has a lot of rows with pubid = NULL

September 26, 2017, 7:00 am

I have a merge replication and I am worried that the metadata cleanup might not be sufficient. I have a retention period of 60 days, and I can see that the metadata cleanup job does remove older rows in MSmerge_genhistory, but only rows that have the right GUID in pubid. Most of the rows, about 1.6 million, have the value NULL in pubid and I cannot figure out why. Does anybody know why there are so many NULL values?
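
A sketch of the kind of query that shows this distribution, assuming it is run in the publication database. The column names (pubid, coldate) are the ones mentioned above; the output aliases are made up for readability.

```sql
-- Count generations per publication GUID, including the NULL group,
-- and show the age range of each group.
SELECT pubid,
       COUNT(*)     AS generation_count,
       MIN(coldate) AS oldest_coldate,
       MAX(coldate) AS newest_coldate
FROM dbo.MSmerge_genhistory
GROUP BY pubid
ORDER BY generation_count DESC;
```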

↧

Change Data Type at Subscriber but Not Publisher

October 4, 2017, 8:53 am

I am currently trying to figure out if this is possible. I have a database on SERVER A (publisher) that is used for APPLICATION 1. I currently replicate several tables from this database to SERVER B (subscriber). Those replicated tables are used for APPLICATION 2. My question is, is there a way to change the data type at the subscriber level on SERVER B but NOT on the publisher on SERVER A. I essentially want to change from char to varchar on several columns to eliminate blank spaces that require the application to use a lot of CPU trimming every query. APPLICATION 1 is legacy and could be affected if the changes are made at the publisher.

I know I can create an ETL job to handle this but was curious if replication could handle this sort of task.

SQL Server 2016 environment.

↧

Anonymous subscribers do not get expired in merge replication

October 5, 2017, 5:58 am

I have a merge replication with anonymous subscribers. In my sysmergesubscriptions table there are old subscriptions that still have status 1 (active) even though their last_sync_date is far older than the retention period. In fact, all of the subscriptions in the table have the status active!?

The job "Expired subscription clean up" runs every night without errors but does not seem to set them to inactive. I have tried to follow the stored procedures that it uses to understand but I do not really follow.

Does anybody know why this is happening, or have some input on what I should investigate?
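
For anyone wanting to reproduce what I am looking at, here is a rough sketch of the check, run in the publication database. It assumes the retention period is expressed in days and that the publisher's own row can be recognized by subid = pubid; both are assumptions to verify on your own system.

```sql
-- List subscriptions whose last sync is older than the publication retention.
SELECT s.subscriber_server,
       s.[db_name],
       s.status,
       s.last_sync_date,
       p.retention AS retention_days   -- assumes the retention unit is days
FROM dbo.sysmergesubscriptions AS s
JOIN dbo.sysmergepublications  AS p
  ON p.pubid = s.pubid
WHERE s.subid <> s.pubid               -- assumption: skips the publisher's own row
  AND s.last_sync_date < DATEADD(DAY, -p.retention, GETDATE())
ORDER BY s.last_sync_date;
```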

↧

Strange UPDATE results in presence of merge replication

October 8, 2017, 4:50 pm

I have an MS Access front-end database that accesses an SQL Server backend. This (as far as we can tell) works flawlessly when the SQL Server database is not replicated. Once we introduce replication (from SQL Server Standard on a proper server as the publisher to SQL Server Express on a laptop as a subscriber using Merge Replication) we start getting weird database corruption that I can't explain.

What happens is that we have UPDATE queries which run against several tables in the database, and we also have Access itself doing CRUD operations against multiple tables as well (this corruption is not limited to one table). The result we see is that randomly (about 1 in 10 operations) a row other than the one we wanted to update is being updated, which overwrites data we didn't want to overwrite.

The replication, when set up, runs on demand on the laptop as a pull merge, and the corruption happens regardless of whether replication is performed. It only has to be enabled. No corruption seems to occur when replication isn't enabled.

I'm not accusing Microsoft of any fault here - it's entirely probable I've just forgotten to tick some box to prevent this. I'm just not sure what it is that I need to look for.

Edit: What I mean by corruption is this: Let's say I have rows:

ID | FirstName | LastName
--------------------------
1  | John      | Smith
2  | Emma      | Citizen
3  | Bob       | May

I then run something along the lines of:

UPDATE Table SET FirstName = "Test" WHERE ID = 1

And after that happens I end up with this:

ID | FirstName | LastName
--------------------------
1  | Test      | Smith
2  | Test      | Citizen
3  | Bob       | May

There are no error messages in any of the system tables dealing with replication. The only change in the schema is that when replication is enabled it creates the rowguid column.

↧

SQL Server 2016: Bidirectional Transactional Replication using Backup, Do I use/need Snapshot?

October 11, 2017, 8:21 am

Overview:

I have been setting up bidirectional transactional replication between 2 SQL Servers. We have another pair of servers, running Inductive Automation's Ignition Gateway server creating and sending SQL data to Server 1 as the primary data source, and Server 2 as a backup data source. There are 3 databases that I need to keep synchronized at all times on each of the servers, so if a failover occurs the backup has the same data the primary has.

I have been setting up and testing this configuration for a long time, and testing the failover functionality of the Gateway software. All seems to be working very well, and the databases seem to be synchronizing.

The instructions I used had me set up network shares for the snapshots on each server, but since I used a backup of the 3 databases, it doesn't appear to ever use the snapshots.

One of these synchronized databases is over 600GB of data. The deployment took 3 steps over a weekend.

Implementation:

Step 1: was starting the SQL Server Maintenance Plan of the primary server, consolidating\backing up all databases. This took over 6 hours to complete.

Step 2: was copying the files to the Backup Server, and restoring the databases to the Backup server, again taking many hours to finish.

At this point all 3 databases on Server 1 and Server 2 are identical.

Step 3: I ran my scripts that: activate Distribution on each server, Publish the 3 databases on each server, and subscribe each server to the published databases on the other server (ensuring that loopback detection was set to true for each subscription).
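
For context, this is roughly what the subscription part of step 3 looks like when the subscriber already holds the data from a restored backup. It is a sketch, not my actual script: the publication, server, and database names are hypothetical, and the @sync_type value shown is one of the documented options that tells replication not to apply a snapshot.

```sql
-- Run at Server 1, in the published database. Names are hypothetical.
EXEC sp_addsubscription
    @publication        = N'Pub_DB1',
    @subscriber         = N'SERVER2',
    @destination_db     = N'DB1',
    @subscription_type  = N'push',
    -- Subscriber already has schema and data, so no snapshot is applied.
    @sync_type          = N'replication support only',
    -- Prevents changes from being sent back to the server that made them.
    @loopback_detection = N'true';
```

The mirror-image call would be run at Server 2 for the publication there.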

Questions:

What are the snapshot folders for, and why were they needed to set up the synchronization? I suspect something is either not working, or the part of the instructions that had me set up the network shares "\\Server1\snapshots" and "\\Server2\snapshots" was completely erroneous.
What is the Snapshot Agent supposed to be doing for the type of synchronization we have here?
I am seeing options in SQL Server Management Studio under the Replication section to 'validate', 'reinitialize', and 'check subscription status' of the subscriptions. I've run each and the results don't give me the expected feedback. What is the difference?
In the event that the subscriptions have problems, the only maintenance procedure I feel confident about is repeating the steps we used to deploy this configuration. The length of time it takes to complete this leaves me very nervous! Production runs: Mon-Wed 24 hours, Thurs-Sat 4AM-4PM. Interruptions to production are very expensive. I'm looking for any guidance you might have about maintaining this setup.
Day 1 after implementation, I spotted that the Snapshot Agents had never started in the Replication Monitor, so I started the Agents manually. When I did, both servers became unresponsive and had to be rebooted. Now the Log Reader agents all look happy, performance is all excellent. However, the Snapshot agents all have errors "Not completed", and I cannot seem to reset them or clear them in the Replication Monitor. How do I fix this?

↧

SQL Server merge replication composite keys with per-location transaction sequences

October 12, 2017, 1:52 pm

Our company is currently working on more reliably replicating data from our locations into a central server for accounting and reporting purposes. We are a Windows-only shop, so we are using SQL Server Standard in our central cloud and SQL Server Express at individual locations. We are currently trying to use SQL Server's built-in merge replication, since the internet connections to our customers are generally poor and we have a large number of locations (~200 as of now).

Where we are stuck is trying to figure out how to replicate transactional data to the cloud reliably. Specifically, we have requirements that some types of records have to have sequential transaction numbers. Each sequence must not have gaps, but transaction numbers can be reused between locations. For example, location 1 and location 2 can both have a transaction 1, 2, and 3, but it is not OK for location 1 to have the sequence 1, 2, 5 and location 2 have the sequence 3, 4, 6.

Currently we use identity columns at the location, and a composite key that includes the location ID in the cloud. The problem comes with merging all of these records into a central repository, since Microsoft wants all of the tables to look exactly the same. From what I can gather, this keeps us from having an identity column at the location that holds its sequence position, and a non-identity composite key in the cloud to just hold a copy of the data.

We are using each location's unique identifier to filter by overwriting HOST_NAME().

We want to set up the data like this

Cloud:

create table Cash
(
LocationID int not null,
TransactionID int identity(1,1) not null,
Amount money not null,
rowid uniqueidentifier, -- For merge replication
constraint PK_Cash primary key (LocationID, TransactionID)
)

Location (ideally, though if we need to have LocationID on the table still we can, in which case the schema would be the same as above):

create table Cash
(
TransactionID int identity(1,1) not null,
Amount money not null,
rowid uniqueidentifier, -- For merge replication
constraint PK_Cash primary key (TransactionID)
)

By design each location will be assigned a unique LocationID that no other location shares. And each location only has a single database. So there won't be any problem with unique key constraint failures.

How would we go about handling the TransactionID column, given that in merge replication using the identity type is specifically prohibited, and in a non-replicated table it would need a different identity sequence per potential value of LocationID?

Is this even possible to do, or are we using the wrong technology? We are trying to avoid having to roll our own replication system so we can leverage the stability of Microsoft's system. Really we are only using replication to push data to the cloud, as the cloud uses it for reporting and does not create new data that would need to be replicated down to the location. I've tried looking around, but all of the questions have to do with the sequence being shared across all of the subscribers, instead of each subscriber needing to have their own sequence.
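
To make the "no gaps per location" requirement concrete, here is a sketch of how we would check it on the cloud copy of the Cash table defined above. It assumes every location's sequence starts at 1; the alias names are made up.

```sql
-- Rows are returned only for locations whose sequence has a gap:
-- each returned TransactionID is higher than its position in the sequence.
SELECT LocationID, TransactionID, expected_id
FROM (
    SELECT LocationID,
           TransactionID,
           ROW_NUMBER() OVER (PARTITION BY LocationID
                              ORDER BY TransactionID) AS expected_id
    FROM Cash
) AS numbered
WHERE TransactionID <> expected_id
ORDER BY LocationID, TransactionID;
```

With the example in the question, location 1 holding 1, 2, 5 would return the row for transaction 5 (expected 3), while a location holding 1, 2, 3 would return nothing.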

↧

dropping a subscription in the subscriber first, how to fix it?

October 13, 2017, 9:03 am

I wanted to drop a subscription; however, I made the mistake of dropping it at the subscriber without dropping it at the publisher first, ignoring the warning shown below.

TITLE: Microsoft SQL Server Management Studio
This action will remove the selected subscription information, but not the previously replicated data, from database 'MY_DATABASE_Sub'. Removing the subscription information should be done only if the subscription has been deleted at the Publisher, MY_PUBLISHER, or if the subscription is otherwise defunct. Removing an active subscription will result in errors at the Publisher.
Are you sure you want to remove the subscription information?
For help, click:http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&ProdVer=14.0.17119.0&EvtSrc=Microsoft.SqlServer.Management.UI.VSIntegration.ObjectExplorer.Replication.ReplicationMenuItem&EvtID=CleanUpPushSubscriberConfirmation&LinkId=20476
------------------------------ BUTTONS:
&Yes &No

That is causing the error below (in yellow): basically, the subscription is still showing up in Replication Monitor, but it should not be there at all.

The question is: how can I fix this error? That is, how do I get rid of the ghost subscription that shows up in Replication Monitor and causes it to be red-flagged?
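
For reference, the step I skipped would normally look something like the sketch below. It assumes a transactional push subscription (a merge publication would use sp_dropmergesubscription instead), and the publication and subscriber names are hypothetical; only MY_DATABASE_Sub comes from the dialog above.

```sql
-- Run at the Publisher, in the publication database.
EXEC sp_dropsubscription
    @publication    = N'MyPublication',   -- hypothetical publication name
    @article        = N'all',
    @subscriber     = N'MY_SUBSCRIBER',   -- hypothetical subscriber server name
    @destination_db = N'MY_DATABASE_Sub';
```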

↧

MS SQL Server merge replication cannot find file specified

October 19, 2017, 9:38 am

I'm trying to setup merge replication between two MS SQL Server 2012 databases on different hosts in one domain.

Once replication is set up and the first snapshot is generated, replication starts. Replication Monitor shows that tables are being prepared and bulk copied, but after about 40 minutes I get messages like

The process could not bulk copy into table "table_name".
...
The system cannot find the file specified.

The table is different every time. I have tried to locate these missing files, but they are absent. When I just sit and monitor these folders, I can see some folders being deleted a few moments before the error occurs. I tried both push and pull with similar results.

Answers on the web show it may be caused by

The user having insufficient privileges. I suppose this one is wrong, because data is replicated until the error occurs: I can see tables being cleaned and filled with replicated data.
Using a local path for the replication files. I'm using a UNC path located on the Publisher. Both publication and subscription run under a Windows domain account with sufficient permissions.

Replication silently restarts and the subscriber is constantly in an inconsistent state.

Does anyone have any idea why this may happen, or how I can debug the problem?

↧

Can I Configure SQL Server 2012 Replication on Windows 10 Ultimate?

October 21, 2017, 8:06 am

I have Windows 10 Ultimate and need to install the full edition of SQL Server 2012 and configure SQL Server replication on it. Is that possible, or does it have to run on Windows Server?

↧

How to make SQLServer Web Merge Replication use non-default port?

October 25, 2017, 3:02 pm

I'm setting up a web synchronization for SQL Server Merge Replication. My IIS box is in my DMZ, and the SQL Server machine is behind the firewall, so I need to specify a non-default port (not 1433) to get to the publisher and distributor. I've tried setting it up like 192.168.4.5_1234 and similarly 192.168.4.5:1234 where 192.168.4.5 is my firewall IP and 1234 is the port number (well, I'm giving examples anyways).

The one with the colon blows up completely, and I'm fairly certain that's bad form for windows port specification - I just wanted to show I've tried that. The other gives this:

replmerg -Publisher [192.168.4.5_1234] -PublisherDB [MyDB] -Publication [MyPub] -Subscriber [ME] -SubscriberDB [MyPub] -SubscriptionType 2 -SubscriberSecurityMode 1 -Distributor [192.168.4.5_1234]
...
Message: The process could not connect to Distributor '192.168.4.5_1234'. 2017-10-25 21:53:27.892 Category:SQLSERVER Source: 192.168.4.5_1234 Number: 53 Message: Named Pipes Provider: Could not open a connection to SQL Server [53]. 2017-10-25 21:53:27.894 Category:SQLSERVER Source: 192.168.4.5_1234 Number: 53 Message: A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. 2017-10-25 21:53:27.895 Category:SQLSERVER Source: 192.168.4.5_1234 Number: 0 Message: Login timeout expired 2017-10-25 21:53:27.896 Category:SQLSERVER Source: 192.168.4.5_1234 Number: 0 Message: The merge process failed to execute a query because the query timed out. If this failure continues, increase the query timeout for the process. When troubleshooting, restart the synchronization with verbose history logging and specify an output file to which to write.

Why it's trying to use Named Pipes is beyond me; TCP/IP is listed above Named Pipes in my client protocols.

I notice in Microsoft's documentation, it doesn't even tell you how to do this... they just say it will "typically" use the default port. Is this even possible? Anyone ever gotten this to work?

https://technet.microsoft.com/en-us/library/ms151255(v=sql.105).aspx

↧

SQL_SLAVE_SKIP_COUNTER = 1 fails, setting @@gtid_slave_pos used to skip a given GTID position

November 1, 2017, 5:21 am

I recently broke replication, and when I tried to skip past the one incorrect transaction, I got the following.

MariaDB [(none)]> STOP SLAVE;
Query OK, 0 rows affected (0.05 sec)

MariaDB [(none)]> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
ERROR 1966 (HY000): When using parallel replication and GTID with multiple replication domains, @@sql_slave_skip_counter cannot be used. Instead, setting @@gtid_slave_pos explicitly can be used to skip to after a given GTID position.
MariaDB [(none)]> select @@gtid_slave_pos;
+---------------------------------------------+
| @@gtid_slave_pos                            |
+---------------------------------------------+
| 0-1051-1391406,1-1050-1182069,57-1051-98897 |
+---------------------------------------------+
1 row in set (0.00 sec)

MariaDB [(none)]> show variables like '%_pos%';
+----------------------+---------------------------------------------------------+
| Variable_name        | Value                                                   |
+----------------------+---------------------------------------------------------+
| gtid_binlog_pos      | 0-1051-1391406,2-1051-4474,57-1051-98897                |
| gtid_current_pos     | 0-1051-1391406,1-1050-1182069,2-1051-4474,57-1051-98897 |
| gtid_slave_pos       | 0-1051-1391406,1-1050-1182069,57-1051-98897             |
| wsrep_start_position | 00000000-0000-0000-0000-000000000000:-1                 |
+----------------------+---------------------------------------------------------+

What do I need to do to fix this?

Update 1

MariaDB [(none)]> show variables like '%gtid%';
+------------------------+------------------------------------------+
| Variable_name          | Value                                    |
+------------------------+------------------------------------------+
| gtid_binlog_pos        | 1-1050-4820789,2-1051-379101,3-1010-3273 |
| gtid_binlog_state      | 1-1050-4820789,2-1051-379101,3-1010-3273 |
| gtid_current_pos       | 1-1050-4819948,2-1051-379101,3-1010-3273 |
| gtid_domain_id         | 3                                        |
| gtid_ignore_duplicates | OFF                                      |
| gtid_seq_no            | 0                                        |
| gtid_slave_pos         | 1-1050-4819948,2-1051-379101,3-1010-3273 |
| gtid_strict_mode       | OFF                                      |
| last_gtid              |                                          |
| wsrep_gtid_domain_id   | 0                                        |
| wsrep_gtid_mode        | OFF                                      |
+------------------------+------------------------------------------+

I tried the following, as per the instructions, to set @@gtid_slave_pos:

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: [redacted]
                  Master_User: [redacted]
                  Master_Port: 3306
                Connect_Retry: 5
              Master_Log_File: binary.000591
          Read_Master_Log_Pos: 526511543
               Relay_Log_File: tmsdb-relay-bin.001239
                Relay_Log_Pos: 4
        Relay_Master_Log_File: binary.000591
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1062
                   Last_Error: Could not execute Write_rows_v1 event on table [redacted] Duplicate entry '1134890' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log binary.000591, end_log_pos 60726493
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 60724897
              Relay_Log_Space: 465787660
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1062
               Last_SQL_Error: Could not execute Write_rows_v1 event on table [redacted] Duplicate entry '1134890' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log binary.000591, end_log_pos 60726493
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1050
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: Current_Pos
                  Gtid_IO_Pos: 1-1050-4827753,2-1051-379101,3-1010-3273
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: optimistic
1 row in set (0.00 sec)

Using the gtid_slave_pos variable

MariaDB [(none)]> select @@gtid_slave_pos\G;
*************************** 1. row ***************************
@@gtid_slave_pos: 1-1050-4819948,2-1051-379101,3-1010-3273

MariaDB [(none)]> stop slave;
Query OK, 0 rows affected (0.21 sec)

MariaDB [(none)]> SET GLOBAL gtid_slave_pos='1-1050-4819948,2-1051-379101,3-1010-3274';
Query OK, 0 rows affected (0.10 sec)

MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.21 sec)

When I check the status after running the above, I get: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 3-1010-3274, which is not in the master's binlog'

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 10.56.228.64
                  Master_User: maxscale
                  Master_Port: 3306
                Connect_Retry: 5
              Master_Log_File: binary.000591
          Read_Master_Log_Pos: 60724897
               Relay_Log_File: tmsdb-relay-bin.001239
                Relay_Log_Pos: 4
        Relay_Master_Log_File: binary.000591
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 60724897
              Relay_Log_Space: 249
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 3-1010-3274, which is not in the master's binlog'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1050
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: Current_Pos
                  Gtid_IO_Pos: 1-1050-4819948,2-1051-379101,3-1010-3274
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: optimistic
1 row in set (0.00 sec)

I can get this back to the previous state by

MariaDB [(none)]> stop slave;
Query OK, 0 rows affected (0.01 sec)

MariaDB [(none)]> SET GLOBAL gtid_slave_pos='1-1050-4819948,2-1051-379101,3-1010-3273';
Query OK, 0 rows affected (0.09 sec)

MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.06 sec)
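
One thing that stands out in the attempt above is that it advanced domain 3 (3-1010-3273 to 3-1010-3274), while the failing event comes from the master's binlog and the master's server ID is 1050, which appears in domain 1. A hedged sketch of the alternative: the GTID 1-1050-4819949 is an assumption (simply the next sequence number after the applied 1-1050-4819948) and should be confirmed against the master's binlog first. Skipping a transaction can also leave the slave's data different from the master's.

```sql
-- On the master: list events starting at the slave's Exec_Master_Log_Pos
-- to find the GTID of the transaction that fails on the slave.
SHOW BINLOG EVENTS IN 'binary.000591' FROM 60724897 LIMIT 10;

-- On the slave: move only the affected domain past that GTID.
-- '1-1050-4819949' is an assumed value; replace it with the GTID found above.
STOP SLAVE;
SET GLOBAL gtid_slave_pos = '1-1050-4819949,2-1051-379101,3-1010-3273';
START SLAVE;
```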

↧

Fix SQL Replication without reinitialize

November 1, 2017, 5:36 am

[Edit 1] SQL Server version is 2012 SP3, Standard

I may very well have made a disastrous mistake. The short version is that I set up replication with a new database, which is now a subscriber to another database. We need to regularly purge the published database without letting the deletes reach the subscriber database. The intention was that the subscriber DB would contain archival data, while the published DB gets pruned from time to time so it stays around 4GB in size. When it looked like it was working fine, I started pruning data.

Now we are having problems with them syncing up. I wanted to change the article properties back to pass deletes through, but that requires the publication to be reinitialized. When that happens, the data in the subscriber DB will be lost. I can't allow that to happen, but I don't know how to prevent the data from being lost. Does anyone have some ideas?

I also don't like how easy it would be to reinitialize (basically a right-click wizard in SSMS). I would appreciate some advice on how to "warehouse" data in a better way.

(I expect to be asked about this. The reason we did this in the first place was that the database is used by a terrible line of business software. The manufacturer support claims that it can't do Inserts and Selects simultaneously, if the database is larger than 4GB, or else the software crashes. While I do think their claim is bullcrap, we were hoping to see if this resolves our problems.

Also, everything I did here I've been testing on our test DB server. Naturally, we didn't see the problem until after I implemented this in production.)

↧

SQL Replication for Schema changes

November 1, 2017, 5:20 pm

I want to configure snapshot replication for all the articles. The problem I am facing is this: the first time, all the articles were synced to the subscriber. But later on, if there is a schema change at the publisher (say, a new table is added), the newly created table is not replicated to the subscriber.

So every time there is a schema change at the publisher, I have to manually update the selected articles at the publisher to get it replicated to the subscriber.

Is there a way to automate this step, so that whenever there is a schema change at the publisher (new articles added) it is automatically replicated to the subscriber? Thanks
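
To illustrate the manual step I would like to automate, here is a rough sketch, run in the publication database at the publisher. The publication and table names are hypothetical, and whether existing subscriptions pick up the new article without further steps depends on how they were created.

```sql
-- List user tables that are not yet articles in any publication.
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables  AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE t.is_ms_shipped = 0
  AND NOT EXISTS (SELECT 1
                  FROM dbo.sysarticles AS a
                  WHERE a.objid = t.object_id);

-- Add one of them to the publication (names are hypothetical).
EXEC sp_addarticle
    @publication               = N'MySnapshotPub',
    @article                   = N'NewTable',
    @source_owner              = N'dbo',
    @source_object             = N'NewTable',
    @force_invalidate_snapshot = 1;

-- Generate a fresh snapshot that includes the new article.
EXEC sp_startpublication_snapshot @publication = N'MySnapshotPub';
```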

↧

Does Error code 1237 hamper replication?

November 2, 2017, 4:36 am

I have master-slave configured MySQL with slave_parallel_type=LOGICAL_CLOCK and slave_parallel_workers=16. I needed to skip one table from slave. So I used --replicate-ignore-table. But I'm getting this note all the time.

2017-11-01T17:38:06.856729Z 80881 [Note] Slave SQL for channel '': Worker 2 failed executing transaction 'xxxxxxxxxxxxxxxxx:xxxxxxxxxx' at master log mysql-bin.001098, end_log_pos 314379975; Could not execute Query event. Detailed error: Slave SQL thread ignored the query because of replicate-*-table rules; Error log throttle is enabled. This error will not be displayed for next 60 secs. It will be suppressed, Error_code: 1237

What I need to know is:

Does this have any impact on replication performance?
Will slave_skip_errors=1237 help gain any replication speed?
Also, which one is better in terms of replication performance: --replicate-ignore-table or --replicate-ignore-db?

↧

MSSQL server replication - compatibility level

November 2, 2017, 6:26 am

≫ Next: CD-CM setup with merge replication

≪ Previous: Does Error code 1237 hamper replication?

I currently have two SQL Servers in a merge replication setup. The publisher is running SQL Server 2016 Standard and the subscriber is running SQL Server 2016 Express. For some reason I am not able to change the compatibility level of the publication in the publication properties. The only one listed there is "SQL Server 2008 or later". Likewise, I get the error:

"Incorrect value for parameter '@publication_compatibility_level'"

When running this T-SQL:

DECLARE @publication AS sysname;  
SET @publication = N'publication name*' ;   
EXEC sp_changemergepublication   
    @force_invalidate_snapshot = 1,
    @publication = @publication, 
    @property = N'publication_compatibility_level', 
    @value = N'130RTM'
GO

The only allowed value is "100RTM", which corresponds to the only version I can pick in the publication properties.

I would like to change the compatibility level to 2016. Any indication as to why this is not possible, or how it can be achieved, is much appreciated.
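To make the current state visible, the publication's properties and the database's own compatibility level can be checked separately; these are two different settings. A minimal sketch, assuming it is run in the publication database at the publisher, with N'MyMergePub' as a placeholder for the real publication name:

```sql
-- Placeholder publication name: replace with the real one.
-- Returns the publication's properties, including its
-- compatibility level as replication sees it.
EXEC sp_helpmergepublication @publication = N'MyMergePub';

-- The database compatibility level is a separate setting and
-- is not changed by sp_changemergepublication.
SELECT name, compatibility_level
FROM   sys.databases
WHERE  name = DB_NAME();
```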

↧

CD-CM setup with merge replication

November 3, 2017, 3:26 am

≫ Next: Determine when subscription became inactive in transactional replication

≪ Previous: MSSQL server replication - compatibility level

I am trying to make the publishing process quicker and simpler for one of our customers on their Sitecore-based website. Through research I stumbled upon merge replication, which might solve some of our issues, but it introduces others. I need your help and guidance to figure out which way is best!

We've got a CD & CM setup: 1 CM server, which has its own SQL instance, and 2 CD servers with a SQL instance each. At the moment the setup is:

CM (master, web and core databases). Web is shown only internally, on a secure admin URL for the site; this works like a preview site.

CD1 & CD2 are the servers for visiting users, these each have a publishing-target in Sitecore.

When we deploy a release:

1. Deploy new code for CM. Publish templates and potential content changes for Sitecore to Web. Verify and authenticate that everything is correct.
2. Take CD1 out of the load balancer, deploy new code for CD1, publish templates and potential changes to Sitecore, verify and authenticate, then put the server back into the load balancer.
3. Repeat step 2 for CD2.
4. Deployment done.

This process is working OK for us now; the site stays up and running at all times, without downtime.

We've got a few issues with the current setup:

Our search (Elasticsearch) is populated when CM publishes to Web, so at the moment Elasticsearch can potentially contain data that is not yet published to the CD servers.
When publishing, the editors could forget to publish to one of the CD servers, which would cause inconsistencies between the servers. We would like to avoid that.
Everything needs to be published multiple times for the same environment, which takes up time.
Editors do not know what a CD server is; they just want a "Preview" and a "Live" publishing target.

I've looked into the Merge Replication for Sitecore, and actually also have it working in a test environment. The advantage we want from this is that we only have two publishing targets:

Preview (CM server preview database)
Live (CM server web database, which then gets replicated out into the CD servers web databases)
The Elasticsearch instance will rely on data from CM's web database, which is live data.
We can have an Elasticsearch instance running on preview as well.

The issue here is, that now I can't deploy only for CD1 or CD2, when doing deployment. What if I have breaking changes towards Sitecore? The site will break if I publish new breaking Sitecore items to a server which hasn't been deployed to yet?

How can I get the best of both worlds? Is there any way?

↧

Determine when subscription became inactive in transactional replication

November 8, 2017, 8:24 am

≫ Next: Why does existing data not replicate when using custom stored procedures

≪ Previous: CD-CM setup with merge replication

When I try to send a tracer token on some of my publications to get the latency I get the following error -

No active subscriptions were found. The publication must have active subscriptions in order to post a tracer token.

There are subscriptions tied to these publications. I can fix this by re-initializing or rebuilding replication, but I was wondering whether there is a way to tell when the subscription stopped receiving anything. I want to determine how long this has not been working.

The tables being replicated do not have timestamps that would let me figure it out that way. I have checked Replication Monitor, navigated through several of the tables in the distribution database, and checked the job history and SQL logs, and I have not been able to determine this. Is there a timestamp recorded somewhere that shows the last sync from the distributor to the subscriber?

We are using SQL Server transactional replication (push) on SQL Server 2012 SP4.
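As an illustration of the kind of timestamp being asked about, the Distribution Agent history in the distribution database records a time for each logged entry. This is a minimal sketch, assuming the distribution database has the default name distribution. History rows are trimmed by the cleanup jobs, so the information may no longer be there for an outage that began long ago.

```sql
USE distribution;  -- assumes the default distribution database name

-- Most recent history entry logged by each Distribution Agent.
SELECT   a.name        AS agent_name,
         a.publication,
         a.subscriber_db,
         MAX(h.[time]) AS last_history_entry
FROM     dbo.MSdistribution_agents  AS a
JOIN     dbo.MSdistribution_history AS h
         ON h.agent_id = a.id
GROUP BY a.name, a.publication, a.subscriber_db
ORDER BY last_history_entry;

-- Current subscription status per article:
-- 0 = inactive, 1 = subscribed, 2 = active.
SELECT publisher_db, subscriber_db, article_id, status
FROM   dbo.MSsubscriptions;
```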

↧

Why does existing data not replicate when using custom stored procedures

November 16, 2017, 9:55 pm

≫ Next: SQL Replication - FTP

≪ Previous: Determine when subscription became inactive in transactional replication

I am setting up transactional replication in SQL Server. I am using custom stored procedures for insert, update, and delete. I am also not replicating the schema to the target, because a table already exists there. The tables do not match, as the target contains audit columns. In the SPs the values are all mapped.

When I run the snapshot it completes, and there are no undistributed commands. When I check the target table, no data has been replicated.

If I go and perform a transaction on source, it is replicated to the target. Why would the existing data not replicate?

Samples below, if you need something else from this let me know.

Source Table:

CREATE TABLE [dbo].[Table_1](
[ID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[Value1] [varchar](50) NULL,
[Value2] [varchar](50) NULL,
CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED 
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

Target Table:

CREATE TABLE [dbo].[Table_1](
[ID] [int] IDENTITY(1,1) NOT NULL,
[SRC_ID] [int] NOT NULL,
[VALUE1] [varchar](50) NULL,
[VALUE2] [varchar](50) NULL,
[SRC_DELETE] [bit] NOT NULL,
[TRAN_DT] [datetime] NOT NULL,
[PROCESS_FLAG] [bit] NOT NULL,
CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED 
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[Table_1] ADD  CONSTRAINT [DF_Table_1_SRC_DELETE]  DEFAULT ((0)) FOR [SRC_DELETE]
GO

ALTER TABLE [dbo].[Table_1] ADD  CONSTRAINT [DF_Table_1_TRAN_DT]  DEFAULT (getdate()) FOR [TRAN_DT]
GO

ALTER TABLE [dbo].[Table_1] ADD  CONSTRAINT [DF_Table_1_PROCESS_FLAG]  DEFAULT ((0)) FOR [PROCESS_FLAG]
GO

Delete SP

Create procedure [dbo].[rep_del__Table_1]
@pkc1 int
as
begin
declare @primarykey_text nvarchar(100) = ''
insert into [Target].dbo.Table_1
(SRC_ID, SRC_DELETE)
values (@pkc1, 1)
if @@rowcount = 0
if @@microsoftversion>0x07320000
Begin

set @primarykey_text = @primarykey_text + '[ID] = ' + convert(nvarchar(100),@pkc1,1)
exec sp_MSreplraiserror @errorid=20598, @param1=N'[dbo].[Table_1]', @param2=@primarykey_text, @param3=13234
End
end

Insert SP

CREATE PROCEDURE [dbo].[rep_ins__Table_1]
@c1 int,
@c2 varchar(50),
@c3 varchar(50)
as
begin
insert into [Target].dbo.Table_1
(SRC_ID, VALUE1, VALUE2)
values (@c1, @c2, @c3)
end

Update SP

CREATE procedure [dbo].[rep_upd__Table_1]
@c1 int,
@c2 varchar(50),
@c3 varchar(50),
@pkc1 int
as
begin  
declare @primarykey_text nvarchar(100) = ''
insert into [Target].dbo.Table_1
(SRC_ID, VALUE1, VALUE2)
values (@pkc1, @c2, @c3)
if @@rowcount = 0
if @@microsoftversion>0x07320000
Begin

set @primarykey_text = @primarykey_text + '[ID] = ' + convert(nvarchar(100),@pkc1,1)
exec sp_MSreplraiserror @errorid=20598, @param1=N'[dbo].[Table_1]', @param2=@primarykey_text, @param3=13233
End
end

sp_addpublication

exec sp_addpublication @publication = N'Sample', @description = N'Transactional publication of database ''Source'' from Publisher ''DESKTOP''.', 
@sync_method = N'concurrent', @retention = 0, @allow_push = N'true', @allow_pull = N'true', @allow_anonymous = N'false', @enabled_for_internet = N'false', 
@snapshot_in_defaultfolder = N'true', @compress_snapshot = N'false', @ftp_port = 21, @ftp_login = N'anonymous', @allow_subscription_copy = N'false', 
@add_to_active_directory = N'false', @repl_freq = N'continuous', @status = N'active', @independent_agent = N'true', @immediate_sync = N'false', 
@allow_sync_tran = N'false', @autogen_sync_procs = N'false', @allow_queued_tran = N'false', @allow_dts = N'false', @replicate_ddl = 0, 
@allow_initialize_from_backup = N'false', @enabled_for_p2p = N'false', @enabled_for_het_sub = N'false'

sp_addarticle

exec sp_addarticle @publication = N'Sample', @article = N'Table_1', @source_owner = N'dbo', @source_object = N'Table_1', @type = N'logbased', 
@description = N'', @creation_script = N'', @pre_creation_cmd = N'truncate', @schema_option = 0x000000000203008D, @identityrangemanagementoption = N'manual', 
@destination_table = N'Table_1', @destination_owner = N'dbo', @status = 16, @vertical_partition = N'false', @ins_cmd = N'CALL [rep_ins__Table_1]', 
@del_cmd = N'CALL [rep_del__Table_1]', @upd_cmd = N'CALL [rep_upd__Table_1]'

sp_addsubscription

exec sp_addsubscription @publication = N'Sample', @subscriber = N'DESKTOP', @destination_db = N'Target', @subscription_type = N'Push', 
@sync_type = N'replication support only', @article = N'all', @update_mode = N'read only', @subscriber_type = 0
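One detail in the scripts above may be relevant: the subscription is created with @sync_type = N'replication support only'. As far as I understand it, that option assumes the subscriber already holds the published data, so no existing rows are copied when the snapshot runs. If that is the cause here, the existing rows would have to be loaded by hand. A minimal sketch, assuming Source and Target are on the same instance (both scripts mention DESKTOP) and using the same column mapping as the insert procedure:

```sql
-- One-time seed of rows that existed before the subscription was created.
-- Assumes both databases are on the same instance and that no replicated
-- changes are being applied while this runs.
INSERT INTO [Target].dbo.Table_1 (SRC_ID, VALUE1, VALUE2)
SELECT s.ID, s.Value1, s.Value2
FROM   [Source].dbo.Table_1 AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM   [Target].dbo.Table_1 AS t
                   WHERE  t.SRC_ID = s.ID);
```

The remaining NOT NULL columns (SRC_DELETE, TRAN_DT, PROCESS_FLAG) are filled by the default constraints defined on the target table.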

↧

SQL Replication - FTP

November 20, 2017, 7:14 pm

≫ Next: Replication on a clustered Instance

≪ Previous: Why does existing data not replicate when using custom stored procedures

SQL Server transactional replication has been set up between two SQL Server 2012 servers and works using the default location for the snapshot. I want to allow the subscribers to download the snapshot from an FTP server.

Possibly a stupid question, but in the publication properties, under FTP Snapshot, what is the FTP server name? Is it the actual name of the FTP server (the Windows name) or the FTP site name? I am getting an error in Replication Monitor, and I want to tick this possibility off the list before speaking to the network team.

Error: Replication Monitor reports that it cannot connect to FTP site 'xxx' using port 21.
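To see exactly which FTP values the publication currently holds, independent of the properties dialog, the publication can be queried directly. A minimal sketch, assuming it is run in the publication database at the publisher, with N'MyPublication' as a placeholder name:

```sql
-- Placeholder publication name: replace with the real one.
EXEC sp_helppublication @publication = N'MyPublication';
-- Columns of interest in the result set:
--   enabled_for_internet, ftp_address, ftp_port,
--   ftp_subdirectory, ftp_login
```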

↧