Channel: StackExchange Replication Questions

Postgres Replication out of sync


We have a master server in one datacenter, and the replica is in another datacenter.

Intermittently, on the replica server, we see the following error:

2018-12-19 07:24:25 UTC: : @: [25775]: [2-1] FATAL: could not receive data from WAL stream: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

We have all the DB settings (master and replica) in place for streaming replication, and we use the same settings on 100+ replicas with no issue except on this one.

I would appreciate any help here.

  1. We have WAL files on the master.
  2. hot_standby is on.

When we initially start the replica server, it streams, and if streaming lags behind, we have a restore_command to restore from the master's pg_xlog archive.

So most of the time we don't have issues, but sometimes we hit this error.
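For reference, a sketch of the master-side settings that interact with this error on flaky cross-datacenter links; the values below are placeholders, not our actual configuration:

# postgresql.conf on the master -- illustrative values only
wal_sender_timeout = 60s        # how long before a silent WAL receiver is dropped
tcp_keepalives_idle = 60        # probe idle replication connections
tcp_keepalives_interval = 10
tcp_keepalives_count = 6
wal_keep_segments = 256         # retain extra WAL so the replica can resume
                                # streaming after a disconnect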


How do I tell what agent profile is being used?


When you change from one agent profile to another using the SSMS GUI, you often have to stop and restart the agent for the change to take effect. I'm trying to debug an issue where it looks like someone changed the active profile, but replication is behaving as if a different profile were being used. How do I determine the actual agent profile being used?

A parallel: when using sp_configure, both the configured values and the run values are returned. The replication GUI seems to return only the configured values. How do I determine the run values for the agent profile?
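For what it's worth, here is the query I would expect to show the configured profile per distribution agent, assuming a distribution database with the default name; whether the agent has re-read that profile since its last restart is exactly the part I cannot see:

-- Profiles live in msdb; agent metadata lives in the distribution database.
USE distribution;
SELECT da.name AS agent_name, p.profile_name
FROM dbo.MSdistribution_agents AS da
JOIN msdb.dbo.MSagent_profiles AS p
    ON p.profile_id = da.profile_id;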

pgpool failover not working


I have set up pgpool 3.4 with a streaming replication setup. I have installed pgpool on both servers. Below are my file contents.

pgpool.conf

listen_addresses = '*'
port = 9999
socket_dir = '/tmp'
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/tmp'
listen_backlog_multiplier = 2

backend_hostname0 = 'masterip'
backend_port0 = 5432
backend_weight0 = 1

backend_data_directory0 = '/var/lib/pgsql/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'slaveip'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'

enable_pool_hba = off
pool_passwd = 'pool_passwd'
authentication_timeout = 60

ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0

log_destination = 'syslog'
log_line_prefix = '%t: pid %p: '   
log_connections = off
log_hostname = off
log_statement = off
log_per_node_statement = off
log_standby_delay = 'if_over_threshold'


syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 0

pid_file_name = '/var/run/pgpool/pgpool.pid'
logdir = '/var/log/pgpool'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = off

master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 10
sr_check_user = 'postgres'
sr_check_password = 'password'
delay_threshold = 10000000

follow_master_command = ''
health_check_period = 5
health_check_timeout = 10
health_check_user = 'postgres'
health_check_password = 'password'
health_check_max_retries = 10
health_check_retry_delay = 2
connect_timeout = 10000

failover_command = '/etc/pgpool-II/failover.sh  %d %P %H %R'
failback_command = ''

recovery_user = 'postgres'
recovery_password = 'password'
recovery_1st_stage_command = 'recovery_1st_stage.sh'
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0

All other settings are commented out or left untouched.

failover.sh

#! /bin/sh -x
# Execute command by failover.
# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %H = new master node host name
#                  %P = old primary node id
#                  %R = new master database cluster path
#                  %r = new master port number
#                  %% = '%' character

falling_node=$1          # %d
old_primary=$2           # %P
new_primary=$3           # %H
pgdata=$4                # %R

log=/var/log/pgpool/failover.log

date >> $log
echo "failed_node_id=$falling_node new_primary=$new_primary" >> $log

if [ $falling_node = $old_primary ]; then
    if [ $UID -eq 0 ]
    then
        su postgres -c "ssh postgres@$new_primary /usr/bin/pg_ctl promote -D $pgdata"
    else
        ssh postgres@$new_primary /usr/bin/pg_ctl promote -D $pgdata
    fi
    exit 0;
fi;

recovery_1st_stage.sh

#!/bin/bash -x
# Recovery script for streaming replication.

pgdata=$1
remote_host=$2
remote_pgdata=$3
port=$4


archivedir=/var/lib/pgsql/incrementalbackup/archivedwals
hostname=$(hostname)

# Clear old archived WAL locally on the primary, then rebuild the failed
# node over ssh; the variables expand locally before the commands are
# sent to the remote shell.
rm -rf $archivedir/*

ssh -T postgres@$remote_host <<EOF
rm -rf $remote_pgdata
/usr/bin/pg_basebackup -h $hostname -U repuser -D $remote_pgdata -x -c fast
cd $remote_pgdata
cat > recovery.conf <<EOT
standby_mode = 'on'
primary_conninfo = 'host=$hostname port=$port user=repuser'
restore_command = 'scp $hostname:$archivedir/%f %p'
EOT
EOF

I haven't done anything else, and haven't changed anything in pcp.conf or pool_hba.conf.

logs

pgpool[4032]: [2-1] 2018-05-07 13:57:51: pid 4032: LOG:  Setting up socket for 0.0.0.0:9999
pgpool: 2018-05-07 13:57:51: pid 4032: LOG:  Setting up socket for 0.0.0.0:9999
pgpool[4032]: [3-1] 2018-05-07 13:57:51: pid 4032: LOG:  Setting up socket for :::9999
pgpool: 2018-05-07 13:57:51: pid 4032: LOG:  Setting up socket for :::9999
pgpool[4032]: [4-1] 2018-05-07 13:57:51: pid 4032: LOG:  pgpool-II successfully started. version$
pgpool: 2018-05-07 13:57:51: pid 4032: LOG:  pgpool-II successfully started. version 3.4.8 
pgpool: 2018-05-07 13:57:51: pid 4032: LOG:  find_primary_node: checking backend no 0
pgpool: 2018-05-07 13:57:51: pid 4032: LOG:  find_primary_node: checking backend no 1
pgpool[4032]: [5-1] 2018-05-07 13:57:51: pid 4032: LOG:  find_primary_node: checking backend no 0
pgpool[4032]: [5-2]
pgpool[4032]: [6-1] 2018-05-07 13:57:51: pid 4032: LOG:  find_primary_node: checking backend no 1
pgpool[4032]: [6-2]

When I stop the PostgreSQL service on the master, the slave server is not promoted to master. What could be the issue?
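A couple of checks I can run, assuming pgpool listens on port 9999 and the placeholder arguments below stand in for real node values:

# Ask pgpool which backends it sees and their status/roles:
psql -h localhost -p 9999 -U postgres -c "show pool_nodes;"

# Exercise the failover script by hand (0 = failed node id, 0 = old
# primary id, then the new primary host and its data directory):
/etc/pgpool-II/failover.sh 0 0 slaveip /var/lib/pgsql/data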

postgres :: FATAL: could not access file "pglogical": No such file or directory


We installed pglogical from source; we are running PostgreSQL 9.4:

 make USE_PGXS=1 clean all
 make USE_PGXS=1 install

The installation was successful. After updating the parameters as follows, we tried restarting PostgreSQL and it responded with the fatal error above:

$ echo "include 'pglogical.conf'" >> $PGDATA/postgresql.conf
$ echo "wal_level = 'logical'" >> $PGDATA/pglogical.conf
$ echo "max_worker_processes = 10" >> $PGDATA/pglogical.conf
$ echo "max_replication_slots = 10" >> $PGDATA/pglogical.conf
$ echo "max_wal_senders = 10" >> $PGDATA/pglogical.conf
$ echo "shared_preload_libraries = 'pglogical'" >> $PGDATA/pglogical.conf

Basically, I am trying to set up replication between 9.4 and 10; I was able to create the pglogical extension on 10.
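One sanity check, on the assumption that the module may have been built against a different installation's pg_config than the 9.4 server that fails to start:

# Where does the server look for loadable modules?
pg_config --pkglibdir
# Is pglogical.so actually there?
ls "$(pg_config --pkglibdir)/pglogical.so"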

If CouchDB replication were designed today, how would you do it differently?


I know that CouchDB is not a new project, and neither is some of its technology (HTTP/1.x, REST, and so on). At the time of its creation I'm sure it was state of the art, but not now. So if the CouchDB project were started today, is there a better direction for building an efficient replicating database, especially when considering replication with browsers and mobile devices?

How would I approach this project now?

Safe backup of PXC / Galera cluster


I have a 3-node 'master-master' cluster using PXC 5.6, but I am only using a single node for both reads and writes.

Before using PXC, I was using standalone MariaDB 5.5, and I took backups using xtrabackup.

My question is: how should I take backups of PXC? Should I take full and incremental backups of the node that is NOT used for reads and writes, without any additional precautions, just as I did for MariaDB 5.5?

For example: xtrabackup --backup --target-dir=/data/backup

Is there an officially recommended way to safely take backups of PXC?

I want a backup such that, if there is database corruption, I can rebuild my cluster from scratch using that backup.
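For comparison, the variant I am considering; --galera-info is the Percona XtraBackup option that records the node's wsrep position alongside the backup, which appears relevant when re-seeding a cluster:

xtrabackup --backup --galera-info --target-dir=/data/backup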

Replica Set Oplog Maxsize


I have two databases (MongoDB 3.6.5) in a replica set. When I create a network split, I force the secondary to become the primary:

cfg = rs.conf()
cfg.members[0].votes = 0
cfg.members[1].votes = 1
cfg.members[0].priority = 0
cfg.members[1].priority = 1
rs.reconfig(cfg, { force: true })

I already tested inserting on the secondary acting as primary, and when the network came back and that node returned to being a secondary, the data was replicated; the original primary was equal to the secondary.

Now I unplugged the network again and ran operations on the secondary-turned-primary until the size of the oplog exceeded maxSize. After some time without operations, the size of the oplog decreased somewhat. I ran some more operations to increase the oplog size, but it is not growing as before.

My next step is to restore the network, let that node return to being a secondary, and see if it synchronizes.

I would like to know: how does this oplog work?
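A minimal way to inspect the oplog from the mongo shell, in case it helps frame the question (sizes are reported in bytes):

use local
db.oplog.rs.stats().maxSize      // configured cap, in bytes
db.oplog.rs.stats().size         // current data size, in bytes
rs.printReplicationInfo()        // oplog window: first/last event times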

Forcing statement-based replication for an INSERT to a table so a trigger fires on the slave


We have a production DB that replicates into a slave DB using mixed replication. We want to add a trigger so that a row is added to our DW when a row is INSERTed into table_a (on the master). The issue is that this INSERT arrives via row-based replication, so the trigger (which is on table_a on the slave) does not fire. We need the trigger on the slave table, as that is where our DW is.

From what I've found online, this should work if statement-based replication is used. Is it possible to force the INSERT into table_a to be replicated as a statement? Or is there any other way we can achieve this?

The INSERT itself is deterministic, as is the trigger. We are using MySQL 5.6.

If you need any other information please let me know.
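To make the idea concrete, here is the kind of thing I mean, with hypothetical column names; as I understand it, SET SESSION binlog_format requires the SUPER privilege on 5.6:

SET SESSION binlog_format = 'STATEMENT';
INSERT INTO table_a (id, val) VALUES (1, 'x');  -- hypothetical columns
SET SESSION binlog_format = 'MIXED';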



How can I stop master from starting itself as a slave (in addition to master) on reboot?


In a simple MySQL master-slave replication configuration, I have a problem where the master tries to connect to itself as a slave on reboot.

So when I restart MySQL on the master, I see errors about the same server trying to replicate from itself, and I have to manually run mysql -e "STOP SLAVE;" every time I restart MySQL.

How can I disable the slave on the master for good? (See the sketch below, after the error log.)

Here's the relevant portion of my.cnf:

## Logging
binlog_format                   = mixed
log_bin                         = /var/log/mysql/mysql-bin.log
sync_binlog                     = 1
pid_file                        = /var/run/mysqld/mysqld.pid
log_error                       = /var/log/mysql/error.log
#general_log                     = 0
#general_log_file                = /var/log/mysql/general.log
slow_query_log                  = 1
slow_query_log_file             = /var/log/mysql/slow.log
long_query_time                 = 3
expire_logs_days                = 14

sql_mode                        = STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
# sql_mode                        = ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION

## Replication
server_id                       = 200

## Master Configuration
binlog-do-db                    = my_db_1
binlog-do-db                    = my_db_2
binlog-do-db                    = my_db_3
binlog-do-db                    = my_db_4
binlog-do-db                    = my_db_5
binlog-do-db                    = my_db_6

Also, when I run SELECT * FROM mysql.user; I don't see the repl user that's allegedly acting as a "slave" on the master.

BUT, I do see that localhost has replication grants:

mysql> select Host, User, grant_priv, Repl_slave_priv, Repl_client_priv from mysql.user;
+-----------------+---------------+------------+-----------------+------------------+
| Host            | User          | grant_priv | Repl_slave_priv | Repl_client_priv |
+-----------------+---------------+------------+-----------------+------------------+
| localhost       | root          | Y          | Y               | Y                |
| localhost       | mysql.sys     | N          | N               | N                |

Here's an example of the errors I see on reboot (before I run STOP SLAVE; on the master):

2016-09-01T15:22:23.845505Z 384 [Note] Access denied for user 'repl'@'192.168.100.200' (using password: YES)
2016-09-01T15:22:23.845761Z 1 [ERROR] Slave I/O for channel '': error connecting to master 'repl@192.168.100.200:3306' - retry-time: 30  retries: 8, Error_code: 1045
2016-09-01T15:22:50.191636Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6843ms. The settings might not be optimal. (flushed=15210 and evicted=0, during the time.)

Apart from this, replication is running fine: writes to the master show up flawlessly on the real, read-only slave.
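For reference, the remedy I am considering but have not yet run; my understanding is that RESET SLAVE ALL (MySQL 5.6+) discards the stored master.info / relay-log state so the replica threads cannot auto-start again:

-- On the master:
STOP SLAVE;
RESET SLAVE ALL;  -- clears connection metadata (host, user, password)

As a belt-and-braces measure, I could also add skip_slave_start under [mysqld] in my.cnf so the replica threads never start at boot.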


Full my.cnf:

[mysql]
default_character_set           = utf8

[mysqld]
datadir                         = /var/lib/mysql
socket                          = /var/lib/mysql/mysql.sock

symbolic-links                  = 0

## Custom Configuration
skip_external_locking           = 1
skip_name_resolve
open_files_limit                = 20000

## Cache
thread_cache_size               = 16
query_cache_type                = 1
query_cache_size                = 256M
query_cache_limit               = 4M

## Per-thread Buffers
sort_buffer_size                = 32M
read_buffer_size                = 4M
read_rnd_buffer_size            = 8M
join_buffer_size                = 2M

## Temp Tables
tmp_table_size                  = 1024M
max_heap_table_size             = 1024M

## Networking
back_log                        = 250
max_connections                 = 512
max_connect_errors              = 100000
max_allowed_packet              = 128M
interactive_timeout             = 1800
wait_timeout                    = 1800
character_set_client_handshake  = FALSE
character_set_server            = utf8mb4
collation_server                = utf8mb4_unicode_ci

### Storage Engines
default_storage_engine          = InnoDB
innodb                          = FORCE

## MyISAM
key_buffer_size                 = 128M
myisam_sort_buffer_size         = 16M

## InnoDB
innodb_buffer_pool_size         = 46G
innodb_buffer_pool_instances    = 64
innodb_log_files_in_group       = 2
innodb_log_buffer_size          = 32M
innodb_log_file_size            = 64M
innodb_file_per_table           = 1
innodb_thread_concurrency       = 0
innodb_flush_log_at_trx_commit  = 1

## Logging
binlog_format                   = mixed
log_bin                         = /var/log/mysql/mysql-bin.log
sync_binlog                     = 1
pid_file                        = /var/run/mysqld/mysqld.pid
log_error                       = /var/log/mysql/error.log
#general_log                     = 0
#general_log_file                = /var/log/mysql/general.log
slow_query_log                  = 1
slow_query_log_file             = /var/log/mysql/slow.log
long_query_time                 = 3
expire_logs_days                = 14

sql_mode                        = STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
# sql_mode                        = ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION

## Replication
# Master Server ID:
server_id                       = 200
# Slave Server ID:
# server_id                       = 300

## Master Configuration
# Comment out on Slave
binlog-do-db                    = db_1
binlog-do-db                    = db_2
binlog-do-db                    = db_3
binlog-do-db                    = db_4
binlog-do-db                    = db_5
binlog-do-db                    = db_6

## Slave Configuration
# Uncomment the following on Slave
# relay-log                       = /var/log/mysql/mysql-relay-bin.log
# binlog-do-db                    = db_1
# binlog-do-db                    = db_2
# binlog-do-db                    = db_3
# binlog-do-db                    = db_4
# binlog-do-db                    = db_5
# binlog-do-db                    = db_6
# log_slave_updates               = 1
# read_only                       = 1
# slave_skip_errors               = 1062

[mysqld_safe]
datadir                         = /var/lib/mysql
socket                          = /var/lib/mysql/mysql.sock
symbolic-links                  = 0
pid_file                        = /var/run/mysqld/mysqld.pid
log_error                       = /var/log/mysql/error.log

Accidental deletes at SQL Server Subscriber


Unfortunately, this happened twice in two weeks on our most critical DB server, just before Christmas! Instead of running a delete script at the publisher, the script was run at the subscriber, which broke replication!

I'm exploring options to avoid this scenario in the future. This is a massive data warehouse with lots of transactions, and everyone and his brother in the organization has access. Any more such mistakes might trigger a resume-generating event!

Here are some of my notes:

  1. Maybe I can use the db_denydatawriter role at the subscriber, but I believe sysadmins will still be able to run the deletes.

  2. INSTEAD OF DELETE triggers could be an option; however, we have hundreds of tables that need these triggers. I don't know whether genuine deletes coming from replication would invoke such a trigger or not (we would prefer they not; triggers support a NOT FOR REPLICATION option, and we might have to use that; see the sketch below).

  3. Can this be achieved by a DDL trigger? What could the impact on the subscriber be?

Has anyone done anything like this? Thank you so much for any suggestions.
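To illustrate note 2, here is a sketch against a hypothetical subscriber table dbo.t; my understanding is that NOT FOR REPLICATION keeps the trigger from firing for rows applied by the replication agents, so genuine replicated deletes would still flow through:

-- Block direct DELETEs at the subscriber, but not replicated ones.
CREATE TRIGGER trg_t_block_delete
ON dbo.t
INSTEAD OF DELETE
NOT FOR REPLICATION
AS
BEGIN
    RAISERROR('Direct deletes at the subscriber are not allowed.', 16, 1);
END;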

Write-lock a whole table during transaction


I need to perform a delicate operation on my table in which I will solely insert, delete and select across all of my rows, and no God may interfere with the table during this operation: the table will be in an inconsistent state, and no concurrent MySQL session may be allowed to modify it until I commit.

Using SELECT ... FOR UPDATE or LOCK IN SHARE MODE is not suitable because, while it could potentially lock all the existing rows in the table, it won't prevent a concurrent session from inserting further rows. Basically, I need LOCK TABLES my_table WRITE within the body of a transaction.

The table contains about 20,000 rows, and master-slave, mixed-format replication is in place over a slow connection, so for any workaround I'd prefer to avoid temporary tables, which might stall the slave; the amount of data written to the binlog should also ideally be minimized.

The engine is InnoDB on MySQL 5.6, for both master and slave.
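For concreteness, the pattern I have in mind follows the LOCK TABLES guidance in the MySQL manual: with autocommit disabled, LOCK TABLES can be combined with a transaction, and the lock is held until UNLOCK TABLES after the explicit COMMIT.

SET autocommit = 0;
LOCK TABLES my_table WRITE;
-- ... the delicate inserts / deletes / selects on my_table ...
COMMIT;
UNLOCK TABLES;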

Is it possible to run an arbiter on a system that also hosts the secondary member?


I have two machines (this is a hard restriction), each with MongoDB 3.6.5 installed, in a replica set (master-slave).

Is it possible to run an arbiter on the system that also hosts the secondary member?


Even if it is not advisable, if it is possible, how can I install an arbiter on the secondary server so that when the primary goes down the secondary becomes the primary?

Once installed, how can I configure the replica set so that when the original primary returns it takes a copy from the secondary (rather than resuming replication)?
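For clarity, here is what I imagine the arbiter setup would look like, assuming the arbiter runs as a third mongod process on the secondary's machine, listening on a second port:

// Run from the current primary; host and port are placeholders for the
// extra mongod started on the secondary's machine:
rs.addArb("secondaryhost:27018")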

How to set up real-time DML and DDL statement sync between two SQL Server instances


We are looking to offload data from a production environment to a reporting environment. I currently have an SSIS package set up to extract and load data between the two servers.

However, now there are some development activities, and the requirement is to capture and apply any schema changes to the destination server, for read-only purposes, all while ensuring real-time (or near-real-time) sync.

I looked into implementing Change Data Capture during my initial ETL implementation, but data transfer and replication are not what it was originally intended for, and I keep learning about undocumented CDC behavior when it is used for data sync. It would also basically be replication under the hood.

I thought transactional replication would be best.

But I have now learned that merge replication with another production environment is already configured on the source DB. What is my best course of action here? I don't want to keep re-engineering this solution.
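For reference, what I had in mind with transactional replication is something like the following sketch, with a hypothetical publication name; @replicate_ddl = 1 is the option that propagates most schema changes to subscribers (a real call needs more options than shown here):

EXEC sp_addpublication
    @publication = N'ProdToReporting',  -- hypothetical name
    @replicate_ddl = 1;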

How to replicate articles from different schemas?


From sp_addarticle (Transact-SQL), I get an example of how to add an article to a publication.

This adds the table Production.Product to the publication AdvWorksProductTran:

DECLARE @publication    AS sysname;
DECLARE @table AS sysname;
DECLARE @filterclause AS nvarchar(500);
DECLARE @filtername AS nvarchar(386);
DECLARE @schemaowner AS sysname;
SET @publication = N'AdvWorksProductTran'; 
SET @table = N'Product';
SET @filterclause = N'[DiscontinuedDate] IS NULL'; 
SET @filtername = N'filter_out_discontinued';
SET @schemaowner = N'Production';

-- Add a horizontally and vertically filtered article for the Product table.
-- Manually set @schema_option to ensure that the Production schema 
-- is generated at the Subscriber (0x8000000).
EXEC sp_addarticle 
    @publication = @publication, 
    @article = @table, 
    @source_object = @table,
    @source_owner = @schemaowner, 
    @schema_option = 0x80030F3,
    @vertical_partition = N'true', 
    @type = N'logbased',
    @filter_clause = @filterclause;

How would I go about it if I also had the following tables, from different schemas, to add to this publication?

I want to add these two tables to the publication; how do I do it using sp_addarticle? (See my attempt after the list.)

  1. my_schema01.Product
  2. my_schema02.Product
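Here is my best guess at the two calls; since article names must be unique within a publication, I assume each table needs a distinct @article value:

-- Same source table name, different schemas, distinct article names.
EXEC sp_addarticle 
    @publication = N'AdvWorksProductTran', 
    @article = N'Product_my_schema01', 
    @source_object = N'Product',
    @source_owner = N'my_schema01', 
    @type = N'logbased';

EXEC sp_addarticle 
    @publication = N'AdvWorksProductTran', 
    @article = N'Product_my_schema02', 
    @source_object = N'Product',
    @source_owner = N'my_schema02', 
    @type = N'logbased';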

MySQL replication missed quite a lot of SQL statements


I have set up a master/slave with mysql-5.1.73. The master's binlog format is "statement". The master and slave seemed to be running very well, with slave status:

         Slave_IO_Running: Yes
        Slave_SQL_Running: Yes

And when I modified content on the master manually, whether by UPDATE, INSERT, or ALTER TABLE, the modification was synchronized to the slave instantly.

However, after running for several days, I found the slave had missed a lot of INSERT statements; these statements didn't violate any PRIMARY KEY rule. Moreover, when I tried to replay the binlog on another slave with:

mysqlbinlog mysql-binlog.00000X | mysql

Those missed statements were missed again, with no warning or error.

Have you ever encountered such a situation? What should I do to restore all the changes to the slave? (There are quite a lot of missed changes, so I cannot restore them one by one.)

I dug into this and found that the relay log on the slave contains all the INSERT statements, which means the binlog is transmitted to the slave correctly. However, the slave's own binlog is missing some of the INSERT statements, so the issue appears to occur while the slave replays the relay log.

Any suggestions for diagnosing this issue or working around it?
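One diagnostic I can think of, with hypothetical log file names: compare the INSERT statements in the relay log against the slave's own binlog (this assumes log_slave_updates is enabled, which seems to be the case since the slave writes a binlog):

mysqlbinlog mysql-relay-bin.000001 | grep -c -i '^INSERT'
mysqlbinlog mysql-bin.000001 | grep -c -i '^INSERT'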

RDS Replication Error (Apply Error 1406 / Truncation)


I have a MySQL RDS instance as a master; I created a read replica from it and ran some schema-change operations on the replica. To be specific, I changed the charset and collation of all the tables and columns from utf8 to utf8mb4. Things were replicating fine, but an error just occurred:

Apply Error 1406: Error; Data too long for column... etc

This is due to lowering the VARCHAR length on some columns from 255 to 191.

I read that you can run some commands to skip replication errors, as described here: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/mysql_rds_skip_repl_error.html

However, would this "skip" the insert entirely, or just truncate the data and proceed with the insert?

I'd like the data to be truncated and still added to the table rather than the entire operation being aborted, but I'm not sure whether that is what will happen. Any suggestions would be welcome!
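For reference, the helper from that document; my reading is that it skips the offending replication event entirely rather than truncating and applying it, but I would like confirmation:

-- Run on the read replica:
CALL mysql.rds_skip_repl_error;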

Bucardo - without Primary Key and Unique Key


I read this on the PostgreSQL wiki page:

Cannot incrementally replicate tables without a unique key (it can "fullcopy" them)

  1. Now that we have 5.5.0, do we still need a PK or unique key for this?

  2. What are the impacts of running Bucardo without PK/unique keys?

Setting up Oracle publishers in SQL Server replication


I am attempting to set up an Oracle publisher on a SQL Server 2008 R2 server, but I am getting the following error:

"Unable to run SQL*PLUS. Make certain that a current version of the Oracle client code is installed at the distributor. For addition information, see SQL Server Error 21617 in Troubleshooting Oracle Publishers in SQL Server Books Online. (Microsoft SQL Server, Error:21617)"

The information I have found states that an Oracle client must be installed and that oracle_home\bin must be in the PATH variable. I have verified that it is.

So far I have taken the following steps:

  1. Installed the Oracle administrator client
  2. Added the TNS_ADMIN environment variable
  3. Added the ORACLE_HOME environment variable
  4. Connected to the remote Oracle database from the distributor via SQL*Plus

I am hoping someone has run into similar errors in the past.

How to check replication snapshot agent status?


I'd like to check the status of the agent after I start it with this statement:

EXEC sp_startpublication_snapshot @publication

as I want to perform a next step that requires the job to have already started.
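My best guess so far is to poll the snapshot agent history in the distribution database (assuming it is named distribution; runstatus 2 = succeeded, 3 = in progress, 6 = failed):

-- Latest snapshot agent history row; runstatus and comments show progress.
SELECT TOP (1) runstatus, start_time, comments
FROM distribution.dbo.MSsnapshot_history
ORDER BY start_time DESC;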
