Channel: StackExchange Replication Questions
Viewing all 17268 articles

Efficiently bulk upsert unrelated rows


As part of a manual replication process between databases with different but related schemas, for each table, we identify new and modified records in the source database and feed them via a table valued parameter to a stored procedure in the SQL Server destination database. The source database is not SQL Server and is in a different data center.

For each row to be upserted, the stored procedure queries for the primary keys of the related rows already in the destination database.

Currently, we do a single SELECT to get the primary keys, followed by a single MERGE to perform the upsert. There are two aspects of this approach that I believe may not be as efficient as possible.

  1. An implicit transaction unnecessarily wraps the MERGE. The database would remain consistent even with each row being upserted one at a time. If a row's upsert fails, we want the remaining rows to proceed.

  2. MERGE interleaves inserts and updates as it goes through the rows, which is fine, but we don't need this. It would be acceptable to update all the modified rows and subsequently insert all the new rows.

Based on the flexibility we have, the MERGE performance tip to use UPDATE and INSERT seems to apply to our case:

When simply updating one table based on the rows of another table, improved performance and scalability can be achieved with basic INSERT, UPDATE, and DELETE statements.

Do I understand that right? Am I better off with a separate UPDATE and INSERT? And what about the implicit transaction? Is performing a single SELECT, UPDATE, and INSERT over a large batch of rows most efficient, or is it better to take advantage of the ability to do one row at time by using a FOR loop? Or something else?

In general, what is the most efficient way to upsert a large batch of rows to a SQL Server table when the rows are not transactionally related and updates and inserts need not be interleaved?
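For what it's worth, here is a minimal sketch of the split approach, with a hypothetical target table and TVP shape (`dbo.Target` and `@rows` are placeholder names): existing rows are updated first, then the remaining rows are inserted, and each statement runs in its own implicit transaction instead of one MERGE transaction.

```sql
-- Update rows from the TVP that already exist in the target.
UPDATE t
SET    t.Col1 = r.Col1,
       t.Col2 = r.Col2
FROM   dbo.Target AS t
JOIN   @rows     AS r ON r.Id = t.Id;   -- @rows is the table-valued parameter

-- Insert the rows from the TVP that don't exist yet.
INSERT INTO dbo.Target (Id, Col1, Col2)
SELECT r.Id, r.Col1, r.Col2
FROM   @rows AS r
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Target AS t WHERE t.Id = r.Id);
```

Since each statement is atomic on its own, a failure in the INSERT would not roll back the UPDATE, which matches the "rows are not transactionally related" requirement.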


Why does sp_replmonitorhelpsubscription not work without parameters?


I am just starting to play around with all the cool replication tools described in Programmatically Monitor Replication.

One of the early finds is sp_replmonitorhelpsubscription.

When I run it as

 sp_replmonitorhelpsubscription

I get

Msg 20587, Level 16, State 1, Procedure sp_replmonitorhelpsubscription, Line 77 [Batch Start Line 16]

Invalid '@publication_type' value for stored procedure 'sp_replmonitorhelpsubscription'.

According to the Microsoft docs, it should be valid with a NULL default:

NULL (default)

If I run it with the parameter, it works fine:

sp_replmonitorhelpsubscription  @publication_type = '0' 

I'm using it with transactional replication, on a single server (a reporting copy). I have tried and get the same results on SQL Server 2017 and 2016. I am running it against the distribution database.

Not sure if I am doing something stupid, if the MS docs are wrong, or what.

Why does sp_replmonitorhelpsubscription not work without parameters?
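In case it helps others hitting the same error, the workaround above can be spelled out with the documented type values (0 = transactional, 1 = snapshot, 2 = merge); this is a sketch of the call, not an explanation of why the NULL default fails:

```sql
-- Run in the distribution database; passing the type explicitly avoids the
-- Msg 20587 error seen with the documented NULL default.
USE distribution;
EXEC sp_replmonitorhelpsubscription @publication_type = 0;  -- transactional
```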

MySQL replication with DRBD - Can we use Old master as slave after failover?


I have configured MySQL replication with DRBD, and it is now working fine. If something happens to my master, the slave will be promoted to master.

My question is: after the failover (let's assume after an hour), my old master will come back online. After that, will it sync the data that was modified or newly added during the downtime from my new master, and can I use it as a read replica?
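A sketch, assuming the old master is reattached as a native MySQL replica of the new master (rather than resynced at the DRBD block level); the host, credentials, and binlog coordinates below are hypothetical:

```sql
-- On the old master, once it is back online:
CHANGE MASTER TO
    MASTER_HOST     = 'new-master.example.com',
    MASTER_USER     = 'repl',
    MASTER_PASSWORD = '***',
    MASTER_LOG_FILE = 'mysql-bin.000123',
    MASTER_LOG_POS  = 4;
START SLAVE;
-- Once SHOW SLAVE STATUS reports Seconds_Behind_Master = 0, the old master
-- has caught up on the changes made during its downtime and can serve reads.
```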

Does PostgreSQL lock/queue reads to subscriber servers when there's an update to a publisher server?


If I have 3 databases

  • Db1: only for writes, publisher
  • Db2: only for reads, subscriber to Db1
  • Db3: only for reads, subscriber to Db1

Just to clarify that "only for reads" means that my application won't try to modify that database. The inverse applies to "only for writes". I'm not talking about a feature of the database itself. Maybe I will use permissions to accomplish that, but regardless.

Some questions which I couldn't locate answers to by skimming through the official documentation for logical replication:

  • In a scenario where there's a new write to Db1, will Db2 and Db3 both lock while synchronizing to the changes, or are they going to allow reads to be made in parallel with the synchronization?

  • If the subscriber server is executing a read at the time new changes to Db1 are published, will the changes be the next operation to be executed, regardless of how many reads were already waiting to be executed (if any)?

My concern is with consistency in a load-balanced cluster of read-only PostgreSQL servers that are replicas of Db1. They should all be in sync with Db1, not allowing any new reads before synchronizing to new changes published by Db1. If I can't do that with logical replication, then what are the alternatives, if any?
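One note and one sketch. Logical replication applies changes in ordinary transactions, so subscribers keep serving reads under MVCC while changes are applied; readers are not locked out or queued, but they may see stale data because replication is asynchronous by default. If reads must never observe a state older than a committed write on Db1, one option (a sketch under assumptions; the standby names below are hypothetical and must match each subscription's application_name) is to make the subscribers synchronous with remote_apply on the publisher:

```
# postgresql.conf on Db1 (the publisher)
synchronous_commit        = remote_apply
synchronous_standby_names = 'db2_sub, db3_sub'   # hypothetical subscription names
```

This makes every commit on Db1 wait until the listed subscribers have applied it, trading write latency for read consistency.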

Differences between hot standby vs warm standby in PostgreSQL?

NSG rules not replicated using Azure site recovery fail over

  1. I have recently replicated my Azure VM using Azure Site Recovery and performed a test failover. I was disappointed to see that the NSG rules and route table were not carried over from source to target. If the network settings aren't replicated from source to target, I don't see much use in Site Recovery. Am I missing any steps? I have also created an "Allow 443 port" outbound rule for the source NSG.
  2. How do I create outbound HTTPS (443) rules for the Site Recovery IPs that correspond to the source location? Location: East US; Site Recovery IP address: 13.82.88.226; Site Recovery monitoring IP address: 104.45.147.24.

Best way to set up cloudant backup/restore


I have just got to the point of having enough data in my Cloudant DB that it would be an issue if I lost it, so I am now looking into a good/reliable(/free!) method of taking regular backups of my data.

I've looked at the documentation, and it recommends backup via replication (https://docs.cloudant.com/backup-guide-using-replication.html).

The method there seems to be:

  1. A full backup via replication on day 1
  2. Incremental backups to different backup databases on subsequent days

There seem to be two drawbacks here:

  1. The backup is still in Cloudant, so I'm paying full price versus being able to move it off to a cheaper alternative for long-term storage.
  2. It isn't clear how the described method would actually work when you cycle the replication. For example, if the 'full' replication was on a Monday, then incremental every other day, when you get to the full backup the following Monday it is no longer possible to restore to any previous day (Sunday, Saturday, Friday, etc.).

I've seen other solutions described, such as installing CouchDB on EC2, replicating to that, and then taking nightly backups into Amazon Glacier (Glacier is cheap, but the EC2 CouchDB instance seems like an annoying expense there).

Does anyone have a recommendation here?

MySQL 5.7 UPDATE slow due to not using index


We have an issue similar to this. Only, the solution there won't work for us because:

  1. We don't use multi-table JOINS, but UPDATE … WHERE `userid` IN (<list of 10k ids>);
  2. We currently "only" have the issue on one replication slave, so we can't change the statement there.

Forcing the update to use an index using … USE INDEX (PRIMARY) SET … would help, according to EXPLAIN.

Is there any way, using variables, to get similar behaviour? Or is our best approach to change the application to use explicit index hints and rebuild the slave?
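One variable worth checking, offered as a hedged suggestion rather than a known fix for this case: MySQL 5.7's optimizer stops doing index dives once an IN list exceeds eq_range_index_dive_limit (default 200) and falls back to index statistics, which can cause exactly this kind of plan regression with ~10k IDs.

```sql
-- Raise the limit above the IN-list size so the optimizer costs the ranges
-- with index dives again; can also be set per session or in my.cnf.
SET GLOBAL eq_range_index_dive_limit = 10001;
```

Verifying with EXPLAIN afterwards would show whether the PRIMARY key is picked up without an explicit hint.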


Archiving strategy


I have an instance with only one database, which contains 4 years of data (400 GB). I'll call this instance inst_live. I want to implement the following archiving strategy:

  1. Create an archive instance (inst_archive) and copy the database from inst_live to inst_archive.
  2. Every day, delete from inst_live all data older than two years.
  3. Replay inst_live transactions in inst_archive, except the purge.

Implementing streaming replication from inst_live to inst_archive would be one way to keep the data in inst_archive fresh, but I fear the purge applied in inst_live would be applied in inst_archive too.

Inst_live has to contain only 2 years of data; inst_archive has to contain 4 years of data.

Does anyone have any ideas, please?

I am using PostgreSQL 9.5 on Red Hat 7.1.

Pglogical seems to allow logical replication based only on INSERT and UPDATE statements. Has anyone implemented it in production? Are there tutorials other than the official documentation? I have some trouble following the official documentation step by step.

Does Slony allow logical replication based on INSERT and UPDATE only?
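To the pglogical part of the question, here is a sketch of a replication set that filters out DELETEs (and TRUNCATEs), assuming pglogical is installed and nodes are configured on both instances; the set name and table are hypothetical:

```sql
-- On the provider: a set that replicates only INSERT and UPDATE.
SELECT pglogical.create_replication_set(
    set_name           := 'insert_update_only',
    replicate_insert   := true,
    replicate_update   := true,
    replicate_delete   := false,
    replicate_truncate := false
);
SELECT pglogical.replication_set_add_table('insert_update_only', 'public.mytable');
```

With the daily purge filtered out this way, inst_archive would retain the rows that inst_live deletes.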

Possibilities for bi-directional replication


Here I found an article on bi-directional replication. I tried that approach and it works fine for me. My question is: is that the right way to achieve this? Are there any other possibilities? And is it applicable to a single master with multiple slaves?

Is there a way to synchronize MySQL databases that are missing a few days' worth of data


I have a primary-primary MySQL server setup connected to a load balancer. At one point, a failure occurred where replication wasn't happening and data was being written to one server and not the other. We have since fixed the replication issue. However, both servers are missing data that the other server has, due to the load balancer switching.

Is there a way to sync the missing data between both devices?

Writing something then immediately read when using read/write split: how to deal with replication lag


We have an issue where we write to the master, then immediately must read using the reader. We are experiencing issues with replication lag where the data is not available to the reader at the time we execute our read statement (our replication lag is around 18 ms, which seems pretty fast, so I don't think the problem is having too much lag).

What patterns are there to deal with this problem? So far our solutions are:

  1. Just use the writer to read

  2. Put logic in our code to wait for replication to occur

Are there any other ways of dealing with this?
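A third pattern, sketched under the assumption of MySQL-style GTID replication (the GTID set below is purely illustrative): have the writer hand the transaction's GTID set to the reader path, and block briefly on the replica until that set has been applied before reading.

```sql
-- On the writer, after the write, capture the executed GTID set:
SELECT @@GLOBAL.gtid_executed;
-- On the reader, before the read, wait up to 1 second for it to be applied
-- (returns 0 on success, 1 on timeout; fall back to the writer on timeout):
SELECT WAIT_FOR_EXECUTED_GTID_SET('3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5', 1);
```

This gives read-your-writes only where it is needed, instead of routing all reads to the writer or adding a fixed sleep.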

Speed up initial MongoDB replication by sourcing partial data from another 'STARTUP2' member


I have a MongoDB replica set with 3 members:

  • member#1 - slow standalone instance transformed into the replica. Status: Primary
  • member#2 - new and super fast replica member. Status: STARTUP2
  • member#3 - new and fast replica member. Status: STARTUP2

After I enabled replication on member#1 and added members #2 and #3 to the config, the data started slowly moving to the new members (limited by member#1's performance).

The question: it looks like both new members are sourcing data from member#1 only, instead of looking for the best available source. Is it possible to configure them to look into the partial data available on each other while both of them are still in STARTUP2 status?

For example, member#2 has already processed 30% of the data, and member#3 could have used that as a partial source (and verified it later) instead of sourcing only from the slow member#1.

Thank you

SQL Server Replication - Only weeks worth of data


I have a need for a test server that hosts a small subset of data from our production systems. Is it possible to set up a SQL Server replication job that only keeps a week's worth of data so developers can develop reports?

Keeping a running 7 days of data, with a small storage footprint, is the goal.
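One possibility, sketched with hypothetical publication/table/column names: transactional replication supports static row filters via sp_addarticle's @filter_clause. A caveat: the filter controls which rows flow to the subscriber, but it does not remove rows at the subscriber once they age past 7 days, so a scheduled purge job on the test server would still be needed.

```sql
-- Hypothetical article with a row filter limiting the snapshot/stream to
-- roughly the last week of data.
EXEC sp_addarticle
    @publication   = N'TestPub',
    @article       = N'Orders',
    @source_object = N'Orders',
    @filter_clause = N'OrderDate >= DATEADD(DAY, -7, GETDATE())';
```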

SQL Server 2014 Cannot add pull subscription. ServerName is not defined as a subscriber


When I try to add a subscription to a transactional publication I get an error. The error is:

Creating Subscription(s)...

- Creating subscription for 'bejonlsql07\bi' (Error)
    Messages
    * SQL Server could not create a subscription for Subscriber 'bejonlsql07\bi'. (New Subscription Wizard)

    ------------------------------
    ADDITIONAL INFORMATION:

    An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)

    ------------------------------

    'BEJONLSQL07\Bİ' is not defined as a Subscriber for 'BEJORUMSSQL01'.
    Could not update the distribution database subscription table. The subscription status could not be changed.
    The subscription could not be created.
    Changed database context to 'NAVRUMS'. (Microsoft SQL Server, Error: 20032)

    For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&ProdVer=12.00.5557&EvtSrc=MSSQLServer&EvtID=20032&LinkId=20476

I already tried disabling publishing and distribution completely and setting them up as new, but I still get the same error.

It has worked on this server before. I just had to disable replication for some maintenance and wanted to add it again.

I looked in all the tables for leftover replication info from before, but I cannot find anything.

UPDATE

When I look at the linked servers, it somehow adds the subscription server twice.
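Two hedged things to try at the Publisher, given that the error text shows the Subscriber name as 'BEJONLSQL07\Bİ' with a Turkish dotted İ (a collation/case-mapping mismatch when uppercasing 'bi' could make the Subscriber lookup fail):

```sql
-- Check what the Publisher currently knows about its Subscribers:
EXEC sp_helpsubscriberinfo;
-- Register the Subscriber explicitly under its exact name:
EXEC sp_addsubscriber @subscriber = N'bejonlsql07\bi';
```

If the server's collation is Turkish, comparing the registered name against what the wizard reports may confirm whether the İ/I mapping is the culprit.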


Clarification on steps creating Database Replication Definition


I am being asked to create database replication definitions for a couple of databases (around 20). I just want to confirm that the steps mentioned below are correct. Let's say 'customer' is one of my databases.

  1. Set "ddl in tran" to true:
     sp_dboption customer, "ddl in tran", true
  2. Mark the primary database:
     sp_reptostandby customer, 'all'
  3. Set the RepAgent parameter "send warm standby xacts" to true:
     sp_config_rep_agent customer, 'send warm standby xacts', 'true'
  4. Create the database repdef/sub:
     create database replication definition db_repdef_customer
     with primary at DS.customer
     replicate DDL
     replicate system procedures
     go
     create subscription db_sub_customer
     for database replication definition db_repdef_customer
     with primary at DS.customer
     with replicate at DS2.customer1
     without materialization
     go

(Note: DS.customer and DS2.customer1 are ASE-ASE replication)

After following the above steps to create the db repdef/subs, I hit a lot of permission issues on my replication ID doing INSERT/UPDATE/DELETE operations on tables for which I had not yet set up table replication. Checking further on these tables in my 'customer' database (e.g., I tried insert/update/delete operations manually on tables without a table repdef), I realised that data replication is working for all the tables in the 'customer' database whether or not I set up table replication. Is this normal? Have I missed any steps? Please help.
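A possible explanation, offered tentatively: sp_reptostandby marks the entire database for replication (warm standby/MSA style), so every table replicates without per-table repdefs, which would make the observed behavior expected rather than a missed step. The marking can be checked, or reverted if per-table control is wanted:

```sql
-- Report the current replication status of the database:
sp_reptostandby customer
go
-- Unmark the database if table-level replication definitions are wanted instead:
sp_reptostandby customer, 'none'
go
```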

Row-based replication on slave whereas statement-based on master


We have a master-slave setup of MySQL 5.6. The master and slave run SBR replication, and the slave is not writing binlogs.

We need to enable binlogs on the slave server, but we want the binlogs to be in RBR format on the slave instead. We want to ship them to Kafka, which wants RBR replication only.

Is it doable to have RBR on the slave while it is getting data from the master as SBR?

Thanks
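Yes, this is the standard setup for feeding CDC pipelines from a replica. A sketch of the slave's my.cnf (values are illustrative): with log_slave_updates enabled and binlog_format=ROW, statements applied by the replication SQL thread are re-logged as row events in the slave's own binlog.

```ini
[mysqld]
log-bin           = mysql-bin
log_slave_updates = ON
binlog_format     = ROW
server-id         = 3   # hypothetical; must be unique in the topology
```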

How to mark a table as pre-populated with postgresql logical replication


I accidentally loaded data into a table before setting up a logical replication subscription for it and now I'm getting errors when I try to sync it (duplicate key already exists).

I can't delete the rows because "cannot truncate a table referenced in a foreign key constraint".

Is there a way to tell the subscription that the table was pre-synced or pre-populated?

We're getting lots of log lines like:

ERROR: duplicate key value violates unique constraint

Thanks.
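A sketch assuming PostgreSQL 10+ logical replication: creating or refreshing the subscription with copy_data = false skips the initial table sync for tables that are already populated. Names below are hypothetical.

```sql
-- When creating the subscription from scratch:
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=pub-host dbname=src'
    PUBLICATION mypub
    WITH (copy_data = false);

-- Or, for a subscription that already exists, after adding the table
-- to the publication:
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION WITH (copy_data = false);
```

Note that copy_data = false trusts that the existing rows already match the publisher; only changes from that point on are replicated.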

Redis replication for large data to new slave


I have a Redis master with 30 GB of data and 90 GB of memory. We have this setup because we have fewer writes and more reads; normally, we would have a machine with 3x the DB size in RAM.

The problem is that one slave went corrupt, and later, when we added it back using Sentinel, it got stuck in the wait_bgsave state on the master (after checking INFO on the master).

The reason was that:

 client-output-buffer-limit slave 256mb 64mb 60

This was set on the master, and since that much memory is not available, it breaks replication for the new slave. I saw the question Redis replication and client-output-buffer-limit, where a similar issue is discussed, but my question has a broader scope.

We can't use a lot of memory. So, what are the possible ways to do replication in this context that prevent any failure on the master (with respect to memory and latency impacts)?

I have a few things in mind:

  1. Should I do diskless replication? Will it have any impact on the latency of writes and reads?
  2. Should I just copy the dump file from another slave to this new slave and restart Redis? Will that work?
  3. Should I increase the slave output-buffer-limit to a greater value? If yes, then how much? I want to do this temporarily until replication completes and then revert to the normal setting, but I am skeptical about this approach.
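On option 3, the limit can be changed at runtime without a restart and reverted once the replica is synced; the raised values below are illustrative guesses sized to survive the full sync, not recommendations:

```
redis-cli CONFIG SET client-output-buffer-limit "slave 2gb 1gb 120"
# ... wait until INFO replication on the master shows the new slave online ...
redis-cli CONFIG SET client-output-buffer-limit "slave 256mb 64mb 60"
```

Diskless replication (repl-diskless-sync) changes where the RDB is staged but the replica output buffer still accumulates writes made during the sync, so the buffer limit can matter either way.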

Replication problem in SQL Server 2019


I have installed SQL Server 2019 RC1 and cannot create a snapshot. I keep getting the same message when I run the snapshot agent:

2019-10-30 12:32:41.59 Microsoft (R) SQL Server Snapshot Agent
2019-10-30 12:32:41.59 [Assembly Version = 15.0.0.0, File Version = 15.0.1900.25]
2019-10-30 12:32:41.59 Copyright (c) 2016 Microsoft Corporation.
2019-10-30 12:32:41.59 The timestamps prepended to the output lines are expressed in terms of UTC time.
2019-10-30 12:32:41.59 User-specified agent parameter values:
2019-10-30 12:32:41.59 --------------------------------------
2019-10-30 12:32:41.59 -Publisher DB100
2019-10-30 12:32:41.59 -PublisherDB REF3_DB
2019-10-30 12:32:41.59 -Publication REF3_PDR4
2019-10-30 12:32:41.59 -Distributor DB100
2019-10-30 12:32:41.59 -DistributorSecurityMode 1
2019-10-30 12:32:41.59 -XJOBID 0xA694274CD89FAF418C8B059F22C01B85
2019-10-30 12:32:41.59 --------------------------------------
2019-10-30 12:32:41.59 Connecting to Distributor 'DB100'
2019-10-30 12:32:41.64 Parameter values obtained from agent profile:
2019-10-30 12:32:41.64 ---------------------------------------------
2019-10-30 12:32:41.64 -BcpBatchSize 100000
2019-10-30 12:32:41.64 -HistoryVerboseLevel 2
2019-10-30 12:32:41.64 -LoginTimeout 15
2019-10-30 12:32:41.64 -QueryTimeout 1800
2019-10-30 12:32:41.64 ---------------------------------------------
2019-10-30 12:32:41.64 Validating Publisher
2019-10-30 12:32:41.64 Connecting to Publisher 'DB100'
2019-10-30 12:32:41.65 Publisher database compatibility level is set to 150.
2019-10-30 12:32:41.65 Retrieving publication and article information from the publisher database 'DB100.REF3_DB'
2019-10-30 12:32:41.65 [0%] Locking published tables while generating the snapshot
2019-10-30 12:32:41.67 [0%] The replication agent had encountered an exception.
2019-10-30 12:32:41.67 Source: Replication
2019-10-30 12:32:41.67 Exception Type: Microsoft.SqlServer.Replication.ReplicationAgentException
2019-10-30 12:32:41.67 Exception Message: An unspecified error had occurred in the native SQL Server connection component.
2019-10-30 12:32:41.67 Message Code: 55012
2019-10-30 12:32:41.67