Channel: StackExchange Replication Questions

MSSQL Transactional Replication

I have an application that almost continuously inserts or updates data. Since multiple requests are handled asynchronously, I wrote my queries like the one below. (The example is based on SO data; it's not what I'm actually doing.)

DECLARE @rows int;
INSERT INTO [user] ([username],[reputation])
SELECT [username],[reputation]
FROM (
    SELECT [username]=:user,[reputation]=:rep
) A
WHERE A.[username] NOT IN (
  SELECT [username]
  FROM [user]
);
SET @rows = @@ROWCOUNT;

IF (@rows=0) BEGIN
  UPDATE [user]
  SET [reputation]=:rep, [updated]=GetDate()
  WHERE [username]=:user
END;

This is passed as a whole to the database with PHP PDO. Because of the amount of data and other processing factors, it's heavy on the (cheap) VPS it's running on. It's not really a problem if these processes run slowly or get delayed, but on the other hand this data should be available via a website, and there the queries on the data should be quick.

I was thinking about replicating parts of the processed data to a second server and running the website on that database. But I'm wondering how that would actually work with a query like above.

I'm guessing the UPDATE query will only be in the transaction log when @rows=0, so that won't be a problem.

But would the first part only send INSERT INTO [user] ([username],[reputation]) VALUES ('Hugo Delsing', '10k') or the entire query with the WHERE NOT IN () query?

Most of the time the user will already exist, so if replication only ships the new inserts that won't be a problem. But if it re-ran the entire query each time, the benefits would be small.

Obviously I could wrap the first part in an additional IF NOT EXISTS (SELECT 1 FROM [user] WHERE [username] = :user) check to make sure it only runs when there is no such user, but I'm wondering whether that is necessary.
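For what it's worth, transactional replication ships the row changes harvested from the log, not the original statement text, so only rows actually inserted or updated reach the subscriber. If the goal is also to avoid the race between the NOT IN check and the INSERT, the two steps can be collapsed into a single atomic statement. A sketch (assuming the same [user] table and the PDO placeholders from the question, not tested against the real schema):

```sql
-- The existence check plus insert/update as one atomic statement; only rows
-- that actually change are written to the log (and hence replicated).
MERGE [user] WITH (HOLDLOCK) AS t
USING (SELECT :user AS [username], :rep AS [reputation]) AS s
    ON t.[username] = s.[username]
WHEN MATCHED THEN
    UPDATE SET [reputation] = s.[reputation], [updated] = GETDATE()
WHEN NOT MATCHED THEN
    INSERT ([username], [reputation]) VALUES (s.[username], s.[reputation]);
```

The HOLDLOCK hint matters here: without it, two concurrent MERGEs for the same new username can still race.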

Also: Would replicating be the way to go or does MS SQL offer other/better solutions for something like this?


Transactional replication reports ok, but tracer token not arriving

I've got an odd problem here. I was trialling 'initialize from LSN' in our test replication setup, and now all publications from one of our databases are broken; rebuilding from scratch (applying a new snapshot) has not fixed the issue.

In Replication Monitor, the distribution agent is running continuously and reporting:

No replicated transactions are available

The tracer token reports:

  • Publisher to Distributor: 00:00:02
  • Distributor to Subscriber: Pending...
  • Total Latency: Pending...

When I investigated the distribution logs using:

SELECT time,
       CAST(comments AS XML) AS comments,
       runstatus,
       duration,
       xact_seqno,
       delivered_transactions,
       delivered_commands,
       average_commands,
       delivery_time,
       delivery_rate,
       delivery_latency / (1000 * 60) AS delivery_latency_Min,
       agent_id
FROM dbo.MSlogreader_history WITH (NOLOCK)
WHERE agent_id = 5
ORDER BY time DESC;

SELECT *
FROM dbo.MSdistribution_history
WHERE agent_id = 125
ORDER BY time DESC;

I can see that the latest xact_seqno from:

  • LogReader agent is: 0x00000378000384880003
  • Distribution agent is: 0x000CA68B000010A8000A000000070000

As a comparison on one of the other databases (going to the same subscriber), I get:

  • LogReader agent is: 0x00000D140001F52A0003
  • Distribution agent is: 0x00000D140001F52A0003000000000000

To me it looks like the distribution agent is remembering the old broken xact_seqno (from when I made a mistake while testing 'initialize from LSN'), but it now appears to have permanently broken all replication from that database.
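One way to test that theory (a sketch; assumes a standard transactional replication setup): compare the sequence number the distribution agent has recorded as last applied at the subscriber with the log reader output above.

```sql
-- Run in the subscription database at the subscriber: the distribution agent
-- records the last applied xact_seqno in transaction_timestamp.
SELECT publisher, publisher_db, publication, transaction_timestamp
FROM dbo.MSreplication_subscriptions;
```

If that stored value is ahead of anything the log reader will ever produce for this database, the agent will sit on "No replicated transactions are available" forever, which would match the symptoms described.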

Servers:

  • PubA - One publisher hosting several databases that are replicated
  • Dist - A separate distributor
  • SubA - First subscriber
  • SubB - Second subscriber

Databases:

  • PubA.DB1 = Publisher DB that is working fine
  • PubA.DB2 = Publisher DB that is broken
  • SubA.PullWorking1 = 1st subscriber has a working pull subscription from DB1
  • SubA.PullBroken2 = 1st subscriber also has a broken pull subscription from DB2
  • SubB.PushWorking1 = 2nd subscriber has a working push subscription from DB1
  • SubB.PushBroken2 = 2nd subscriber also has a broken push subscription from DB2

Publications:

  • PubA.DB1.PushPub_works
  • PubA.DB1.PullPub1_works
  • PubA.DB1.PullPub2_works
  • PubA.DB2.PushPub_broken
  • PubA.DB2.PullPub_broken

SQL Replication Delete error from publisher with Cascade Delete FKs defined

I was testing some optimizations on our subscriber by manually adding some FKs on the subscriber between Order -> OrderItems -> etc.

I got this error on an Order delete:

The DELETE statement conflicted with the REFERENCE constraint "Orders_OrderItems_OrderID_FK". The conflict occurred in database "e5StagingDW", table "dbo.OrderItems", column 'OrderID'. (Source: MSSQLServer, Error number: 547)

Looking at the transaction in sp_browsereplcmds, I found the order of the DELETEs to be reversed, with the parent (Orders) being deleted before the children (OrderItems).

Fine, so it looks like the Order was deleted before the OrderItems (a result of the cascade delete on the FK in the publisher). So I changed the FK in the subscriber to also cascade deletes and got the following:

The row was not found at the Subscriber when applying the replicated DELETE command for Table '[dbo].[OrderItems]' with Primary Key(s): [OrderItemID] = (Source: MSSQLServer, Error number: 20598)

I can't win at this game. So I removed the FK entirely and the transactions went through.

I suspect that the cascade DELETEs on the FKs in the publisher database are the cause of this improper re-ordering of the delete operations.

1- Is this a known problem?

2- What is the proper way to set this up? I don't really want to automatically move all the FKs to the subscriber as part of the subscription definition, which I suppose is the most obvious approach.

Having no FKs at all can cause problems too, especially in this case: we have an indexed view that combines Orders, OrderItems, and another table, and with the 2016 cardinality estimator enabled it produces the most horrible Update Orders plan you could imagine.

Thanks

...Ray
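One middle ground worth noting between "full FKs" and "no FKs" on a subscriber is marking the constraint NOT FOR REPLICATION, so the replication agent's commands bypass the check while ordinary application DML is still validated. A sketch using the constraint and table names from the error message above (assumed, not verified against the real schema):

```sql
-- Replication agent DML skips this constraint; local DML is still checked.
ALTER TABLE dbo.OrderItems WITH NOCHECK
ADD CONSTRAINT Orders_OrderItems_OrderID_FK
    FOREIGN KEY (OrderID) REFERENCES dbo.Orders (OrderID)
    NOT FOR REPLICATION;
```

One caveat that matters for the indexed-view motivation: constraints created this way are marked not trusted, so the optimizer may not lean on them the way it would on a fully trusted FK.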

DELETE IGNORE on Tables with Foreign Keys Can Break Replication

I am using replication with a MySQL database. I have read somewhere that the DELETE IGNORE command on tables with foreign keys can break replication. Is that true? If so, how can I work around it?
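For context, the usual concern is with statement-based binary logging: the slave re-executes the statement text and, if its foreign-key state differs from the master's, can end up deleting a different set of rows (or erroring out). Row-based logging replicates the exact row images the master deleted, which sidesteps that class of problem. A hedged sketch of checking and switching the format:

```sql
-- Check the current binary log format
SHOW VARIABLES LIKE 'binlog_format';

-- Switch to row-based logging (needs SUPER privilege; evaluate the impact
-- on binlog size and any tooling before changing this in production)
SET GLOBAL binlog_format = 'ROW';
```

Existing connections keep their session-level format, so a rolling reconnect (or restart with the setting in my.cnf) is normally needed for it to take full effect.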

Check current memory_limit of Replication Server version 15.7.1

Does anyone know how to check the current memory_limit of a Replication Server? I need to increase the memory_limit of my Replication Server, but I couldn't find the current value. Can anyone help?

W. 2018/03/22 15:58:08. WARNING #7038 SQM(199:0 pds.pdb1) - de/generic/mem/mem.c(2763) WARNING: Memory usage is above 80 percent. Increase 'memory_limit' or reduce cache sizes to avoid repserver threads from sleeping due to lack of memory.

W. 2018/03/22 15:58:53. WARNING #7039 DSI EXEC(199(1) pds.pdb1) - de/generic/mem/mem.c(2763) WARNING CANCEL: Memory usage is below 80 percent.
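If this is SAP (Sybase) Replication Server 15.7, the current value can usually be read with admin config from isql. A sketch; command availability and exact syntax vary by version, so verify against your release's reference manual:

```sql
-- Show the current memory_limit setting
admin config, "memory_limit"
go

-- Raise it (value in MB); memory_limit changes typically require a
-- Replication Server restart to take effect
configure replication server set memory_limit to '4096'
go
```

The warning text itself also hints at the alternative: rather than raising memory_limit, the individual cache sizes (SQM/DSI caches) can be reduced.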

Replication of many masters to single slave

I want to ask for an explanation of MySQL replication. I have a server with a MySQL database.

I want to set up replication from that server to one of my slave servers. Please help me understand the flow, with an example.
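The question title mentions many masters and a single slave; if that is the intended topology, MySQL 5.7+ supports multi-source replication, where the slave runs one named replication channel per master. A sketch with placeholder hostnames and credentials:

```sql
-- On the slave: one channel per master. Multi-source replication requires
-- master_info_repository and relay_log_info_repository set to TABLE.
CHANGE MASTER TO
    MASTER_HOST = 'master1.example.com',
    MASTER_USER = 'repl',
    MASTER_PASSWORD = '...',
    MASTER_AUTO_POSITION = 1      -- assumes GTIDs; otherwise use
                                  -- MASTER_LOG_FILE / MASTER_LOG_POS
    FOR CHANNEL 'master1';

START SLAVE FOR CHANNEL 'master1';
SHOW SLAVE STATUS FOR CHANNEL 'master1'\G
```

Repeat with a second CHANGE MASTER TO ... FOR CHANNEL 'master2' for each additional master; the channels apply independently on the same slave.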

Schema Changes Are Not Populating to The Subscribers

One of my publications was dropped by mistake. I restored it from the latest backup, but now schema changes are not being replicated to the subscribers.

What can I do to re-initialize/fix the publication?
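One thing worth checking before a full re-initialization (a sketch; run in the published database, and the publication name is a placeholder): whether the restored publication still has DDL replication enabled.

```sql
-- Inspect the publication's settings (look at the replicate_ddl column)
EXEC sp_helppublication @publication = N'MyPublication';

-- Re-enable schema-change replication if it came back disabled
EXEC sp_changepublication
    @publication = N'MyPublication',
    @property    = N'replicate_ddl',
    @value       = 1;
```

If replicate_ddl is already 1 and schema changes still don't flow, re-initializing the subscriptions with a new snapshot is the usual fallback.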

Postgresql Relation ID

I'm trying to utilize Postgres 10's logical replication mechanism by reading replication messages in Go code. Most of the logical replication messages refer to something called a "Relation ID".

My question is: how do I get the Relation IDs for all existing tables? I am aware of the "Relation" message type, but I don't know how to trigger those messages.
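As far as I know, the Relation ID in the logical replication protocol is simply the table's OID in pg_class, and the pgoutput stream emits a Relation message lazily, just before the first change touching that table, so consumers normally cache them as they arrive. To enumerate the IDs up front, the catalog can be queried directly (a sketch):

```sql
-- Relation IDs (OIDs) for all ordinary user tables
SELECT c.oid AS relation_id,
       n.nspname AS schema_name,
       c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');
```

Relying on the in-stream Relation messages is still safer for decoding, since they also carry the column layout current at that point in the stream.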


SQL Server could not display the Schedule dialog box

I'm trying to establish merge replication between SQL Server 2012 (SP2) servers. In the New Subscription Wizard, I want to set up the agent schedule, and I get the following error when selecting "Define schedule...":

[screenshot of the error dialog]

Of course I have Googled the error but can't find anything helpful.

thanks for your time and help

Postgresql: Streaming Replication - determining wal sender process client port

My experience with asynchronous streaming replication has been great so far, both as an HA solution and as a backup strategy. One issue is keeping me from using streaming replication in production: the port used by the WAL sender process seems to be chosen automatically when streaming replication starts (I assume between 32768 and 61000). I need to be able to specify the exact port in order to allow access. Does anyone have any information on this issue, or even a workaround? Thanks :)
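For reference, the WAL sender accepts connections on the server's normal PostgreSQL listening port (5432 by default); the high-numbered port is the ephemeral client-side port of the standby's outbound connection, which firewalls generally do not need pinned down. This can be confirmed from the primary (a sketch):

```sql
-- On the primary: each standby connection, with the client-side
-- (standby's) address and ephemeral port
SELECT pid, client_addr, client_port, state
FROM pg_stat_replication;
```

So the firewall rule usually only needs to allow the standby to reach the primary's configured port, the same as any other client.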

Unable to Set external replication between an on-prem mariaDB instance (master) and an RDS MariaDB instance (replica) [on hold]

There are 9 MariaDB databases on my production server. I have to set up a slave on an AWS RDS MariaDB instance and start replication of all 9 databases. What is the best way to do it?

I followed the steps below on the master database:

FLUSH TABLES WITH READ LOCK; 
show master status;
mysqldump -u root db1> db1.sql

FLUSH TABLES WITH READ LOCK; 
show master status;
mysqldump -u root db2> db2.sql

FLUSH TABLES WITH READ LOCK; 
show master status;
mysqldump -u root db3> db3.sql

and so on...

I am restoring these dumps one by one. They get restored, but there are a lot of errors when I try to start replication of all these dumps on my RDS instance with the corresponding binlog files and positions captured from SHOW MASTER STATUS.

I followed the steps below to restore a dump and start replication on the RDS instance:

Step 1 : Login to RDS DB console and create database

a) mysql -h HOST -uUSER -pPASSWORD
b) CREATE DATABASE db1;

Step 2: Restore database

mysql -h HOST -uUSER -pPASSWORD db1< db1.sql

Step 3: Set up replication with the Softlater database

a) mysql -h HOST -uUSER -pPASSWORD
b) use db1;
c) CALL mysql.rds_set_external_master ('MASTER_DB_IP', 3306, 'USER', 'PASSWORD', 'mysql-bin.020616', 989736705, 0);
d) CALL mysql.rds_start_replication;
e) show slave status\G;

One of the errors I got while restoring db1:

Error executing row event: 'Table 'db3.campaign_running_caps' doesn't exist'

I don't understand why it gives me an error about db3 while restoring the dump of db1.

I have cross-checked that the db dumps are correct.

Please suggest where I am going wrong, and the best way to start replication of multiple databases on a single RDS instance.
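One likely cause of the cross-database confusion: each FLUSH TABLES WITH READ LOCK / dump pair captured a different binlog position, but mysql.rds_set_external_master can only start the whole instance's replication from one position. A common approach (a sketch; credentials, paths, and the InnoDB assumption noted below are mine) is to take all databases in a single consistent dump so they share one set of coordinates:

```shell
# One consistent snapshot of all 9 databases. --single-transaction assumes
# InnoDB tables (for MyISAM you would need the global read lock instead);
# --master-data=2 embeds the binlog file/position as a comment in the header.
mysqldump -u root -p \
  --single-transaction --master-data=2 \
  --databases db1 db2 db3 db4 db5 db6 db7 db8 db9 \
  > all_dbs.sql

# The coordinates to pass to mysql.rds_set_external_master:
head -n 30 all_dbs.sql | grep 'CHANGE MASTER TO'
```

Then restore the single file and call rds_set_external_master once with those coordinates, rather than once per database.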

Error 20598 - The row was not found at the Subscriber when applying the replicated command

I am facing the error 20598 with Transactional Replication:

The row was not found at the Subscriber when applying the replicated command.

Normally this error occurs when an UPDATE or DELETE statement is executed at the publisher for a primary key value and the record (against which the UPDATE/DELETE was executed) does not exist in the subscriber database.

But in my case the scenario is different.

I diagnosed the issue and found that the record does exist in the article/table of the subscriber database, because when I executed the command (retrieved with the help of sp_browsereplcmds) at the subscriber, it ran successfully.

What may be the possible reason for it?

I'm using SQL Server 2016 on both sides.

Should we use slave_pos or current_pos with MariaDB 2 node GTID master-master replication?

We are currently migrating our DB cluster from "traditional" binlog-based replication to GTID-based replication. We use MariaDB 10.1.31, and the topology is pretty straightforward: 2 nodes, each the master and slave of the other:

Master 1 <=> Master 2

There is a DNS-based LB (Cloudflare LB) in front to direct traffic and detect downtime. However, we need to be able to write to either node at any given time due to our short RTO, hence the master-master topology. We want to be able to write to any given node and have the data replicated to the other one; if the other node is down, it should pick up the new data as soon as it is brought back online.

I've been going through the official MariaDB docs, and they recommend current_pos, e.g. CHANGE MASTER TO master_use_gtid=current_pos. However, the statement is somewhat unclear, and other online sources (Booking.com's topology) recommend using slave_pos so the data is preserved in case of downtime. Which one is best for this topology, and could you please elaborate on the risks and drawbacks?

Thanks!
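For reference, the distinction as MariaDB documents it: current_pos follows gtid_current_pos, which also advances with transactions the node writes in its role as a master, while slave_pos follows only gtid_slave_pos, i.e. what the node has applied in its role as a slave. In a master-master pair where a rejoining node may carry local writes, that difference is exactly the trade-off being asked about. Switching between the two is a one-liner (sketch):

```sql
STOP SLAVE;
CHANGE MASTER TO master_use_gtid = slave_pos;
START SLAVE;
SHOW SLAVE STATUS\G   -- Using_Gtid should now show Slave_Pos
```

Whichever is chosen, it is worth verifying that gtid_slave_pos and gtid_binlog_pos agree on both nodes before cutting traffic over.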

A question regarding MySQL master-slave replication

I want to set up one-master-one-slave replication, but I have a question. Suppose the read load is distributed between the master and the slave: I first write to the master and then read from the slave. At that moment the information has not yet reached the slave, so my read will fail, even though I have already written the data! How is this problem solved or avoided in practice?
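One common mitigation (besides routing read-your-own-writes queries to the master) is to make the read wait until the slave has applied the master's position rather than hoping it has caught up. A sketch; the binlog coordinates shown are placeholders taken from the master right after the write:

```sql
-- On the master, immediately after the write:
SHOW MASTER STATUS;   -- note File and Position, e.g. 'mysql-bin.000042', 1234

-- On the slave, before the read: blocks until the slave has applied up to
-- that position; returns -1 on timeout, NULL if the slave SQL thread
-- is not running.
SELECT MASTER_POS_WAIT('mysql-bin.000042', 1234, 5);   -- 5-second timeout
```

With GTIDs enabled, the equivalent wait can be done on the GTID set instead, which avoids shipping binlog coordinates through the application.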

Single User Mode and Replication

If I put a replicated database in single-user mode, will the Log Reader Agent count as the single user?

ALTER DATABASE [DatabaseName] SET SINGLE_USER WITH ROLLBACK IMMEDIATE

I'm trying to block users from connecting to my database while it is being updated in the morning. I was thinking of putting it in single-user mode, running the update, then putting it back in multi-user mode. But the database is also a publication database.
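A middle ground worth considering here is RESTRICTED_USER instead of SINGLE_USER: it allows only members of db_owner, dbcreator, and sysadmin to connect, which typically still lets the replication agents in (they usually run under accounts with those rights) while locking ordinary users out. A sketch:

```sql
-- Kick out ordinary users but keep privileged connections (and, typically,
-- the replication agents) able to connect:
ALTER DATABASE [DatabaseName] SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE;

-- ... run the morning update ...

ALTER DATABASE [DatabaseName] SET MULTI_USER;
```

Whether the agents actually qualify depends on the accounts they run under, so this is worth verifying in a test environment before relying on it.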


How to prevent data loss during intra-site failover

We are looking to build an MS SQL Server solution (either SQL Server 2016 Standard or Enterprise Edition) but are having some trouble. This post may be lengthy, so if you think it is TL;DR, our question is simply:

Given two physical sites, each with two SQL Servers, the requirement is that at any given time the application can access one of its local SQL Server instances to write/read data. The data then needs to be synchronized across sites so that all four servers contain the same data. Which SQL Server HA solution, replication solution, or combination of both should we pick to achieve this?

From here on I will try to explain what we have tinkered with and the issues we are running into. We use SQL Server 2016 Developer Edition for our testing; note that we treat it as a Standard Edition for now.

Set up:

The two SQL Servers at each physical site are joined in a Basic Availability Group, with one primary and one passive secondary. We then wrote a T-SQL script based on this link to set up bidirectional replication at the database level. For simplicity, say we have Server 1 and Server 2 at physical site A, and Server 3 and Server 4 at physical site B. Initially, Server 1 is primary at site A and Server 3 is primary at site B. The two also sync with each other, meaning any change written to Server 1's database is replicated over to Server 3's database and vice versa.

Issue:

After the setup, we tinkered with intra-site failover, for example manually failing over from Server 1 to Server 2 at site A. What we observed is that if any data in a table was modified or added from site A or site B during this failover, then after we re-ran our script to clean up the orphaned replication and re-create the bidirectional transactional replication between Server 1 and Server 3, the data added/modified during the intra-site failover was not synced. For example, both Server 1 and Server 3 touched a table: Server 1 inserted a new row Test2 and Server 3 inserted a new row Test3.

Server 1 table view:

Aaa   Bbb
Test1 Test1
Test2 Test2

Server 3 table view:

Aaa   Bbb
Test1 Test1
Test3 Test3

After the bidirectional replication script was executed on both Server 1 and Server 3, the added data was not synced; however, a change to Test1 was reflected on the other server. This makes sense, because the script (or bidirectional transactional replication itself) doesn't know how to reconcile tables that already contain different data.

Essentially, during an intra-site failover there is a short window in which both databases can be written to without bidirectional transactional replication in place. As a result, when the replication is set up again, the two databases are no longer fully in sync. This is not acceptable, because the requirement is for all four SQL Server instances to be fully synced with one another. Please advise which HA solution, or combination of solutions, we should deploy.

Sorry for the lengthy post since I wanted to provide as much information as possible and thank you for your help.

MySQL InnoDb Cluster Replication Issue

I am trying to set up MySQL InnoDB Cluster, using version 5.7.20 of MySQL Server. I am able to create a cluster, but I run into my issue when I try to add the first slave: it complains that the slave has group transactions the master does not have. The error is as follows:

"This member has more executed transactions than those present in the group. Local Transactions: c5109632-32bb-11e8-a081-005056a142d2:1-5 > Group transactions: a43235a8-32bb-11-e8-97c3-005056a16fa6:1-47, a4c00453-3344-11e8-b0fb-005056a16fa6:1-14"

This is fine, because I can log into the MySQL shell (on the master) and run the following:

"SET GTID_NEXT='c5109632-32bb-11e8-a081-005056a142d2:1'; BEGIN; COMMIT;"

I do that for each of the five IDs and then end with:

"SET GTID_NEXT='AUTOMATIC';"

That gets me past that error, however I get a new error. It is:

"Plugin group_replication reported: Group contains 2 members which is greater than group_replication_auto_increment_increment value of 1. This can lead to an higher rate of transactional aborts."

Then I get:

"Slave SQL for channel 'group_replication_recovery': Error 'Unknown error 1396' on query. Default database: ''. Query: <create user bob>"

When I set all of this up, I created 3 brand-new instances on 3 different servers. I created a cluster manager user account "bob" on all 3 instances and ran all of the grants for that user on all 3 instances. I then ran checkInstanceConfiguration from all 3 instances; they all passed. I then ran configureLocalInstance on all 3 instances. Then I created my cluster and tried adding a slave.

My question is: how can I get around the 'group_replication_recovery' SQL error? What did I do wrong?

TIA
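A reading of the errors above, hedged: the 1396 on CREATE USER during group_replication_recovery suggests recovery is replaying the creation of "bob" against a node where "bob" already exists locally, i.e. the extra local transactions are the provisioning statements run on the joining node. When a joining instance is genuinely brand new and its local GTIDs are only provisioning artifacts, an alternative to injecting empty transactions is to wipe its local binlog history before adding it. A sketch; RESET MASTER is destructive and only appropriate when the local history is certainly disposable:

```sql
-- On the instance being added, BEFORE cluster.addInstance():
STOP GROUP_REPLICATION;   -- if it was started
RESET MASTER;             -- clears the binlog and gtid_executed on this node
```

After that, the node joins with an empty GTID set and recovery can replay the group's history (including the group's own copy of the user accounts) cleanly.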

Postgresql Replication - reading from slave

I'd like to know whether there is a replication mode that allows reading from the slave as well as from the master, with a guarantee that at any point in time the master is in complete sync with the slave, i.e. there is no delay between a transaction being committed on the master and the data being delivered to the slave, so that reads from the slave lose no data.
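PostgreSQL's closest fit, as I understand it, is synchronous replication with synchronous_commit = remote_apply (available from 9.6): a commit does not return to the client until the standby has applied it, so a subsequent read on that standby sees the write. A sketch of the primary's settings; the standby name is a placeholder:

```
# postgresql.conf on the primary
synchronous_standby_names = 'standby1'
synchronous_commit = remote_apply
```

The trade-off is availability: with these settings, commits stall if the synchronous standby becomes unreachable, so read-your-writes on the standby is bought at the cost of depending on it being up.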

MSSQL Server Msmerge_content

I just joined my new office as a database administrator. Here we use SQL Server merge replication. It surprised me that 3 of the major replication tables are huge:

1. MSmerge_contents
2. MSmerge_genhistory
3. MSmerge_tombstone

MSmerge_contents has grown to 64 GB, with the number of records approaching 64 billion, because no one set the expiration period for subscriptions. Now I want to clean up this table. Since we use the Simple recovery model, when I ran a DELETE query on this table everything got stuck, and I have no downtime window in which to stop/pause replication. Can anyone help me figure out how to minimize its size or delete half of its data?
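Two hedged notes. First, deleting merge metadata by hand can break change tracking, so where possible the supported route is to set a finite publication retention period and let sp_mergemetadataretentioncleanup remove expired rows. Second, if manual deletion is unavoidable, doing it in small batches keeps each transaction (and the lock/log footprint that froze things before) small. A sketch; the 30-day cutoff and batch size are placeholders, untested against this schema:

```sql
-- Batched cleanup sketch: each iteration is its own small transaction.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (5000) FROM dbo.MSmerge_contents
    WHERE generation IN (SELECT generation
                         FROM dbo.MSmerge_genhistory
                         WHERE coldate < DATEADD(day, -30, GETDATE()));
    SET @rows = @@ROWCOUNT;
END
```

Testing this on a restored copy first, and taking the retention-period route if at all possible, would be prudent given the table's role in merge tracking.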

Memory leak on MySQL slave?

I have a Percona MySQL master-slave configuration (5.6.20-68.0). The memory usage by MySQL on the slave server is constantly growing.

It has now reached 68 GB, despite the fact that innodb_buffer_pool_size is set to 30720M. On the master server, memory is stable. Any ideas?

I have checked the connections on both servers, but the results are strange: on the master DB server there are more connections than on the slave.

Master

netstat | grep TIME_WAIT | wc -l

148

htop
mysql VIRT 44G RES 34,7G

Slave

netstat | grep TIME_WAIT | wc -l

78

htop
mysql VIRT 174G RES 68G
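As a rough sanity check for cases like this, MySQL's theoretical footprint is the global buffers plus the per-connection buffers multiplied by the connection count; 5.6 predates full memory instrumentation in performance_schema, but the relevant knobs can at least be pulled in one query (a sketch):

```sql
-- Global and per-connection buffer settings relevant to memory growth
SHOW GLOBAL VARIABLES WHERE Variable_name IN
 ('innodb_buffer_pool_size', 'key_buffer_size', 'query_cache_size',
  'sort_buffer_size', 'join_buffer_size', 'read_buffer_size',
  'read_rnd_buffer_size', 'tmp_table_size', 'max_heap_table_size',
  'max_connections');
```

If the configured totals cannot account for 68 GB, the usual suspects on a slave of this vintage include oversized per-session buffers held by long-lived replication or application threads, and known allocator fragmentation issues (some installations saw relief switching to jemalloc/tcmalloc), though which applies here would need verification.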