Channel: StackExchange Replication Questions

Is there a way to add transformers to Kafka Strimzi MirrorMaker2?

Right now I need to replicate some topics from one Kafka cluster to another, but the second cluster needs the data in a different format. We are using Strimzi on Kubernetes. In some connectors one can do something like the following, but I am not sure whether MirrorMaker2 lets us do it, since it is based on Kafka Connect:

apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnector
metadata:
  name: sample-connector
spec:
  class: com.sample.SampleConnector
  tasksMax: 2
  config:
    ...
    transforms: TimestampConversion,RectificationDateTimeConversion
    transforms.TimestampConversion.type: org.apache.kafka.connect.transforms.TimestampConverter$Value
    transforms.TimestampConversion.format: yyyy-MM-dd HH:mm:ss.SSS
    transforms.TimestampConversion.field: timestamp
    transforms.TimestampConversion.target.type: string
    transforms.RectificationDateTimeConversion.type: org.apache.kafka.connect.transforms.TimestampConverter$Value
    transforms.RectificationDateTimeConversion.format: yyyy-MM-dd HH:mm:ss.SSS
    transforms.RectificationDateTimeConversion.field: rectificationDateTime
    transforms.RectificationDateTimeConversion.target.type: string
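
Since the MirrorMaker2 connectors are ordinary Kafka Connect connectors under the hood, one plausible approach is to put the same transforms.* keys into the mirror's source-connector config of a KafkaMirrorMaker2 resource. A minimal sketch under that assumption (cluster aliases, topic pattern, and version are illustrative, and I have not verified this against every Strimzi release):

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: sample-mm2
spec:
  version: 3.0.0                    # illustrative
  replicas: 1
  connectCluster: "target"          # alias of the cluster the Connect workers use
  clusters:
    - alias: "source"
      bootstrapServers: source-kafka-bootstrap:9092
    - alias: "target"
      bootstrapServers: target-kafka-bootstrap:9092
  mirrors:
    - sourceCluster: "source"
      targetCluster: "target"
      topicsPattern: "my-topic.*"   # illustrative
      sourceConnector:
        config:
          # same SMT keys as in a plain KafkaConnector
          transforms: TimestampConversion
          transforms.TimestampConversion.type: org.apache.kafka.connect.transforms.TimestampConverter$Value
          transforms.TimestampConversion.format: yyyy-MM-dd HH:mm:ss.SSS
          transforms.TimestampConversion.field: timestamp
          transforms.TimestampConversion.target.type: string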

Help preparing for MySQL Master Slave Replication

I am not in the best situation. I inherited an Ubuntu 14.04 server (8 GB RAM, 8 CPUs) running MySQL 5.5 with almost 400 GB of business-critical data (stored on an external SSD) spread across several thousand databases. My database administration skills and experience are nascent. I want to create a backup of this data in order to set up MySQL replication, but I need to create the backup with minimal impact and downtime.

These databases are individually backed up with mysqldump about every four hours. Unfortunately this means I have no single, point-in-time, logical or raw backup of the entire database server, and to top it off, binary logging is not enabled on that server. I do, however, have the ability to individually restore these backups.

In total, there are about 250,000 tables in the database server. Of those, about 90,000 use the MyISAM engine and about 160,000 use the InnoDB engine.

I know there will be some downtime, but I would really like to avoid downtime of unknown duration during which I am obliged to fully back up the data and deploy replication at the same time.

In testing, I've given thought to or tried various approaches:

  • using Percona XtraBackup (see the sketch after this list)
  • using mysqldump with --single-transaction (for InnoDB) and no locks for the MyISAM tables
  • rsync'ing the MySQL data directory, then gracefully shutting down the MySQL server and rsync'ing the flushed-out changes
  • converting the MyISAM tables to InnoDB, then doing a mysqldump or using XtraBackup
  • using my existing backups to start replication, then letting the slave catch up
  • restoring my existing backups, then syncing the changes with pt-table-checksum and pt-table-sync
  • and the list can go on...
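
For the first option, a minimal sketch of the XtraBackup route, assuming XtraBackup is installed and binary logging has been enabled on the master beforehand (credentials and paths are placeholders):

# take a consistent online backup; InnoDB is copied without locking, but
# MyISAM tables are copied under FLUSH TABLES WITH READ LOCK, so the lock
# time grows with the MyISAM volume
xtrabackup --backup --user=root --password=secret --target-dir=/backups/full

# apply the redo log so the backup is self-consistent
xtrabackup --prepare --target-dir=/backups/full

# the binlog coordinates for CHANGE MASTER TO are recorded here
cat /backups/full/xtrabackup_binlog_info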

Without me providing excessive detail about my testing methods and results, I would like to know how you would approach this situation.

EDIT: In essence, my question is: With the goal of minimal downtime and given my scenario, how would you create a backup of the database server in anticipation of setting up MySQL Replication?

I would appreciate any advice, opinions, services, or resources you may have. Thank you.

MS SQL Server to Azure Cosmos DB real-time replication in table format

What would be the easiest way to do real-time replication from an on-premises SQL Server or Oracle database to a cloud Azure Cosmos DB?

Error 1236 - "Could not find first log file name in binary log index file"

Our setup:

  • Master: MariaDB 10.0.21
  • Slave: MariaDB 10.0.17

Replication was working fine until recently, at which point the slave's DBs had to be restored from a dump. I performed all of the necessary steps: dump the master's DBs, transfer the dump to the slave, drop the old DBs, execute the dump to restore the DBs, execute the appropriate CHANGE MASTER command, and finally START SLAVE.
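
In command form, the procedure looked roughly like this (host names and credentials are placeholders; the coordinates are the ones shown further down; --master-data=2 records the master's binlog coordinates as a comment in the dump):

# on the master
mysqldump -u root -p --single-transaction --master-data=2 \
          --databases xxx_yyy xxx_zzz > dump.sql

# on the slave
mysql -u root -p < dump.sql
mysql -u root -p -e "STOP SLAVE;
  CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000289', MASTER_LOG_POS=342;
  START SLAVE;"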

I am receiving the error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

The first log file that the slave needs from the master is mysql-bin.000289. I can see that this file is present on the master (screenshot omitted; see the directory listing further down).

I can also see that the binary log index on the master has an entry for this log file (screenshot omitted).

Still, replication is not working; I keep getting the same error. I'm out of ideas. What should I check next?


Updated: Output of SHOW SLAVE STATUS\G as requested:

MariaDB [(none)]> SHOW SLAVE STATUS\G
--------------
SHOW SLAVE STATUS
--------------

*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 127.0.0.1
                  Master_User: replication
                  Master_Port: 1234
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000289
          Read_Master_Log_Pos: 342
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.000289
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: xxx_yyy,xxx_zzz
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 342
              Relay_Log_Space: 248
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
1 row in set (0.00 sec)

Additional requested information:

root@master [818 18:54:22 /var/lib/mysql]# ls -l /var/lib/mysql/mysql-bin.000289
-rw-rw---- 1 mysql mysql 1074010194 May 19 03:28 /var/lib/mysql/mysql-bin.000289
root@master [819 18:54:29 /var/lib/mysql]# ls mysql-bin.00029*
mysql-bin.000290  mysql-bin.000291  mysql-bin.000292 #(Yes, it was created)
root@master [821 18:56:52 /var/lib/mysql]# mysql -uroot -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 6345382
Server version: 10.0.21-MariaDB-log MariaDB Server

Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> SHOW BINARY LOGS;
+------------------+------------+
| Log_name         | File_size  |
+------------------+------------+
| mysql-bin.000279 | 1074114047 |
| mysql-bin.000280 | 1074004090 |
| mysql-bin.000281 | 1074035416 |
| mysql-bin.000282 | 1073895128 |
| mysql-bin.000283 | 1073742000 |
| mysql-bin.000284 | 1074219591 |
| mysql-bin.000285 | 1074184547 |
| mysql-bin.000286 | 1074217812 |
| mysql-bin.000287 | 1022733058 |
| mysql-bin.000288 |     265069 |
| mysql-bin.000289 | 1074010194 |
| mysql-bin.000290 | 1074200346 |
| mysql-bin.000291 |  617421886 |
| mysql-bin.000292 |     265028 |
+------------------+------------+
14 rows in set (0.00 sec)

MariaDB [(none)]> exit
Bye
root@master [821 18:57:24 /var/lib/mysql]# mysqlbinlog mysql-bin.000289 > /tmp/somefile.txt
root@master [822 18:58:13 /var/lib/mysql]# tail /tmp/somefile.txt 
# at 1074010124
#160519  3:28:59 server id 5  end_log_pos 1074010151    Xid = 417608063
COMMIT/*!*/;
# at 1074010151
#160519  3:28:59 server id 5  end_log_pos 1074010194    Rotate to mysql-bin.000290  pos: 4
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
root@master [823 18:58:31 /var/lib/mysql]# 

/etc/my.cnf.d/server.cnf (excerpt):

# BINARY LOGGING #
log-bin                        = /var/lib/mysql/mysql-bin
expire-logs-days               = 14
sync-binlog                    = 1

Edit: Position 342 does seem to exist:

root@master [826 12:15:33 /var/lib/mysql]# grep "end_log_pos 342 " /tmp/somefile.txt
#160517 14:43:13 server id 5  end_log_pos 342   Binlog checkpoint mysql-bin.000288

Is it possible to shrink the mdf file on a replication server

Is it possible to shrink the data file of a SQL Server replication target server and continue the replication operation?

Ideally, to reduce downtime, I would avoid shrinking on the live side: shrink the data file on the replication target first, then swap the roles, if shrinking is possible at all.

After a bulk clean-up there is 60% free space in the database, so shrinking is unavoidable.
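
For reference, a minimal sketch of the shrink itself, with an illustrative database and logical file name (shrinking in increments is gentler on a busy server):

USE TargetDb;                               -- placeholder database name
DBCC SHRINKFILE (N'TargetDb_Data', 51200);  -- target size in MB (~50 GB here)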

db choice for activity log

I'm working on a geo-replicated web platform composed of an ecosystem of microservices. We need to rework and improve the user-activity tracking pipeline, and I'm looking for "the best" database for it.

Our platform runs entirely on Kubernetes; for that reason we exclude any technology that is not compatible with this approach.

Each log is quite simple and potentially made of the following data:

  • timestamp
  • user_id
  • action_type
  • description
  • some metadata, whose format can be adapted to the chosen database (JSON, key-value, and so on); a sketch of one possible layout follows this list
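
For concreteness, a hedged sketch of how this shape could map onto Cassandra (one of the candidates below); all names are illustrative:

-- partition by user, cluster by time, newest first
CREATE TABLE activity_log (
    user_id     text,
    ts          timestamp,
    action_type text,
    description text,
    metadata    map<text, text>,
    PRIMARY KEY ((user_id), ts)
) WITH CLUSTERING ORDER BY (ts DESC);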

Goals

  • highly available write operations
  • excellent write scalability
  • excellent storage scalability

Plus

  • geo-replication
  • cloud native
  • hot-warm architecture/some kind of rotation

Non goals

  • complex data aggregation
  • complex search query
  • batch processing

Based on my knowledge, research, and experience, good candidates would be:

  • cassandra: should satisfy all the goals, plus geo-replication
  • cockroach: I've never used it before, but based on the documentation it should satisfy all the goals, plus geo-replication, and it is cloud native
  • influxDB: I'm not sure about this one. I've been using InfluxDB for a while, and though it should satisfy all the goals and all the pluses, it may not be the best choice for this kind of data

What I would not choose:

  • elasticsearch: it does a lot of things I don't need, is tricky to set up and maintain, and is very resource-hungry
  • mongodb: write scalability can only be achieved with sharding, a configuration that is hard to maintain and evolve, and the shard key is tricky to change; it is also not fully HA because of the master election mechanism
  • all the classic single-master SQL databases

UPDATE: another functionally good candidate, but it is a monster (and I don't know how it works with Kubernetes):

  • HDFS

The process could not execute 'sp_replcmds' on

I am having a lot of trouble setting up transactional replication on my test server. I am running SQL Server 2008 SP2.

I am able to create a transactional publication. The snapshot agent works fine and subscribing to the publication works fine as well. The problem that I get is that the log reader agent fails with the error:

The process could not execute 'sp_replcmds' on [ServerName]

The snapshot and log reader agents run under a Windows account with administrator privileges on the domain and sysadmin privileges on the SQL Server instance. I have also tried running the agents under the SQL Server Agent account. I have tried executing sp_replflush and restarting the SQL Server Agent, and I have tried increasing -LoginTimeout to 500 and -ReadBatchSize to 10.
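
One way to surface the underlying failure is to run the log reader's call by hand in the published database (database name is a placeholder; sp_replflush afterwards releases the reservation that sp_replcmds takes, since only one connection may hold it):

USE PublishedDb;   -- placeholder: the published database
EXEC sp_replcmds;  -- the call the Log Reader Agent is failing on
EXEC sp_replflush; -- release the log-reader reservation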

Any help greatly appreciated.

Kubernetes scale down specific pods

I have a set of Pods running commands that can take up to a couple of seconds. There is a process that keeps track of open requests and of which Pod each request is running on. I'd like to use that information when scaling down pods, either by specifying which pods to try to leave up or by specifying which pods to shut down. Is it possible to provide this kind of information when changing the number of replicas, e.g. "I want X replicas; try not to kill my long-running tasks on pods A, B, and C"?
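
For what it's worth, newer Kubernetes releases have a pod annotation that hints to the ReplicaSet controller which pods to remove first on scale-down. A minimal sketch, assuming a recent cluster version and illustrative names (it is a preference, not a guarantee):

apiVersion: v1
kind: Pod
metadata:
  name: worker-a                     # placeholder
  annotations:
    # lower cost = preferred for deletion; give busy pods a higher cost
    controller.kubernetes.io/pod-deletion-cost: "100"
spec:
  containers:
    - name: worker
      image: example/worker:latest   # placeholder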


Getting past corrupted binary log "Error in Log_event::read_log_event():"

I have a binary log that mysqlbinlog chokes on with the error in the title.

The file itself has much more activity after the cited position.

As basic confirmation that it's not all garbage, running the file through the strings command shows there's legitimate traffic up to the end of the file, where it got rotated.

I've seen a similar post about using hexdump to get past an "event too large" error, but in my case mysqlbinlog simply refuses to continue. I'm not familiar enough with the binary format to search for the position of the next event it would recognize.

It gives a starting position it can't get past, so I have a script running that basically calls mysqlbinlog --start-position=X, incrementing X by one until it exits with code 0; at this rate, though, it looks like it will take a month to get through everything.
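
Roughly, the scan looks like this (file name and starting offset are placeholders):

#!/bin/sh
# brute-force scan: advance one byte at a time until mysqlbinlog succeeds
POS=123456789   # placeholder: the first position mysqlbinlog chokes on
while ! mysqlbinlog --start-position="$POS" mysql-bin.000123 >/dev/null 2>&1; do
    POS=$((POS + 1))
done
echo "next readable event at position $POS"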

I tested a proof of concept of this idea on the "good parts" by starting at odd offsets, and it correctly resumed at the next event it found without error.

I'm running Percona Server 5.6.20 for this instance.

I realize this report might lack information needed to answer the question, so I'm happy to edit in response to comments.

SQL Server Replication - Only weeks worth of data

I need a test server that hosts a small subset of data from our production systems. Is it possible to set up a SQL Server replication job that keeps only a week's worth of data, so developers can build reports against it?

The goal is to keep a rolling seven days of data and keep the storage requirement small.
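
Row filters are the closest built-in mechanism. A hedged sketch with placeholder names; note that a GETDATE()-based filter is evaluated as rows are replicated, so it does not by itself purge rows that later age out of the window, and for a transactional article the filter must also be applied with sp_articlefilter/sp_articleview:

EXEC sp_addarticle
    @publication   = N'TestPub',   -- placeholder
    @article       = N'Orders',    -- placeholder
    @source_object = N'Orders',
    @filter_clause = N'OrderDate > DATEADD(DAY, -7, GETDATE())';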

Only kill kubernetes pod if not busy

I want to scale my deployment depending on the number of requests. Each pod can only handle one request at a time. Scaling up is no problem, but when scaling down I want to make sure I am not killing a pod that is currently working (e.g. encoding a large file).

I have the following pods:

  • Pod 1 (created 10 min ago, has a task)
  • Pod 2 (created 5 min ago, is free)
  • Pod 3 (created 1 min ago, has a task)

If I reduce the replica count, Kubernetes will kill pod 3. It does not care whether the pod is busy or not. I could manually kill pod 2, so that Kubernetes would start a new one:

  • Pod 1 (created 10 min ago, has a task)
  • Pod 3 (created 1 min ago, has a task)
  • Pod 4 (created just now, is free)

Once I know pod 2 has been killed, I could reduce the replica count so that pod 4 is killed before it picks up a task. But this solution feels very ugly, because something else has to tell pod 2 to shut down.

So Kubernetes kills the most recently created pods first; but is it possible to tell it that a pod is busy and that it has to wait before killing it?
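
One common pattern, sketched here under the assumption that the app can maintain a busy flag (the file path and names are illustrative): give pods a long termination grace period and a preStop hook that blocks while work is in flight, so a pod that is told to terminate can still finish its current task. It does not change which pod the controller picks, only how gracefully it dies:

spec:
  terminationGracePeriodSeconds: 600   # allow up to 10 minutes to drain
  containers:
    - name: worker                     # placeholder
      image: example/worker:latest     # placeholder
      lifecycle:
        preStop:
          exec:
            # block until the app removes its busy flag
            command: ["/bin/sh", "-c", "while [ -f /tmp/busy ]; do sleep 1; done"]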

Deadlocks With Trigger Based Replication

A table ORIGINAL exists with the following structure:

ID    VARCHAR Length 10 (key)
VALUE VARCHAR Length 10 (non-key)

A table REPLICATION exists with the following structure:

ID               VARCHAR Length 10 (key)
CHANGE_TIMESTAMP NUMBER  Length 15 (non-key)

I want to log every changed primary key to the REPLICATION table, with the latest change time stamp.

Therefore I have created this trigger in Oracle:

CREATE OR REPLACE TRIGGER REPLICATION_TEST
  AFTER INSERT OR UPDATE ON ORIGINAL
  FOR EACH ROW
DECLARE
  v_ts NUMBER(15);  -- renamed: "timestamp" shadows a type name
BEGIN
  v_ts := TO_NUMBER(TO_CHAR(SYSDATE, 'yyyymmddhh24miss'));

  -- inner block so the duplicate-key handler applies only to the INSERT
  BEGIN
    INSERT INTO REPLICATION ("id", "change_timestamp")
      VALUES (:NEW."id", v_ts);
  EXCEPTION
    WHEN DUP_VAL_ON_INDEX THEN
      UPDATE REPLICATION
         SET "change_timestamp" = v_ts
       WHERE "id" = :NEW."id";
  END;
END;

Functionally, this works just fine. But in a production environment with multiple sessions, where arbitrary data changes can happen at any time, it occasionally leads to deadlocks, presumably because of the UPDATE statement.

An alternative approach would be to add the CHANGE_TIMESTAMP field as an additional key field, do only INSERTs into the REPLICATION table, and skip the UPDATE in case of duplicates (sketched below). That would work functionally, but it would obviously produce much more data, which I'd like to avoid.
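
The key change for that variant would be roughly this (a sketch; whether the old primary key can simply be dropped depends on referencing objects):

-- every change becomes a new row, so the UPDATE path (and its locking) disappears
ALTER TABLE REPLICATION DROP PRIMARY KEY;
ALTER TABLE REPLICATION ADD PRIMARY KEY ("id", "change_timestamp");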

What else can I do?

mysql - can a slave have a different primary key than the master?

I have a table with PRIMARY KEY (`id`) and I want to change it to PRIMARY KEY (`username`, `id`). These columns are defined as:

  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `username` varchar(20) NOT NULL DEFAULT '',

This table is part of a master/slave MySQL topology with binary row replication. Can I get away with taking the slave offline, changing the primary key, and reconnecting it to the master, without changing the master? For clarity: only the primary key index would differ between master and slave; all other columns and their order would be the same.
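
The slave-side change would look something like this (the table name t is a placeholder; the extra plain KEY on id is needed because InnoDB requires an AUTO_INCREMENT column to be the leftmost column of some index, which (`username`, `id`) alone no longer satisfies):

STOP SLAVE;
ALTER TABLE t
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (`username`, `id`),
  ADD KEY (`id`);
START SLAVE;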

In mysql master-slave Replication which server requests the other?

I want to set up MySQL replication between two servers: one is my localhost and the other is an online server. I am free to make either one the master. However, my localhost server doesn't have a static IP, so I need to know which of the two servers (master or slave) initiates the connection to the other for updates. Does the master push the binlog updates, or does the slave periodically request new updates? Whichever initiates the connection is the one I will put on localhost. Thank you in advance.

SQL Server Replication using RMO

We are using SQL Server replication driven through RMO (Replication Management Objects). SQL Server 2016 (Standard Edition) on the server acts as the publisher, and SQL Server Express Edition is the subscriber.
Previously, the distributor and the publisher were on the same server and replication was working.
We have a client application whose data needs to be synced with the server on a regular basis.
We have transactional and merge replication set up and rely on a pull approach, where the client application pulls the data on demand. For security reasons, the client doesn't want to expose port 1433 (or any other port) on the publisher to the subscribers.
So we decided to move the distributor to a remote server, so that the subscriber talks to the publisher via the remote distributor. (The remote distributor can connect and talk to the publisher.) However, I am getting an error when I try to sync.
I wanted to check: is replication possible at all when port 1433 is blocked for the subscribers?
If yes, can you provide some sample code or pointers to it? If not, what other options do I have?

SQL Server Replication in Azure Data Studio

I've always used SSMS, but am considering switching to a Mac, so I've been exploring Azure Data Studio for my SQL Server needs. I have replication set up, and SSMS offers a nice Replication tab to monitor and manage replication. I can't find anything similar in Azure Data Studio, though. Does anyone know if it has something like this?

Download the replication snapshot file using FTPS

I have two databases belonging to two companies. Company A's database contains the domain data; the other company pulls that data using snapshot replication. We have used FTP for the transfer:

  1. Created an FTP server in IIS on Windows Server 2014
  2. Added the certificate to the server
  3. Created the replication publication and supplied the FTP account information
  4. It worked perfectly without SSL on the FTP server
  5. After setting the certificate in IIS and requiring SSL connections, it no longer works
  6. Because the data is shared between two companies, we want the communication to go over FTPS

It is not working, and we don't want to use a VPN. We found an MSDN article that says:

If you use SSL to secure the connections between computers in a replication topology, specify a value of 1 or 2 for the -EncryptionLevel parameter of each replication agent (a value of 2 is recommended). A value of 1 specifies that encryption is used, but the agent does not verify that the SSL server certificate is signed by a trusted issuer; a value of 2 specifies that the certificate is verified. Agent parameters can be specified in agent profiles and on the command line.

So where can I set this EncryptionLevel=2?
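
-EncryptionLevel is a command-line parameter of the replication agents; for a pull subscription, that is the agent running at the subscriber. A hedged sketch of a Distribution Agent invocation with placeholder server and publication names (the same parameter can also be set in the agent's profile or appended to the agent's job step):

REM placeholder names throughout; the relevant part is the trailing parameter
distrib.exe -Publisher PUBSRV -PublisherDB PubDb ^
            -Publication SnapshotPub -Distributor DISTSRV ^
            -Subscriber SUBSRV -SubscriberDB SubDb ^
            -EncryptionLevel 2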

These are the test cases we tried when connecting to the server:

  1. We changed the server name at login to ftps://Domain.com
  2. We changed the port to 990 and opened it; it still did not work

In short, I want to use FTPS for communication.

I can communicate over FTP. I am working on SQL Server 2014.

MSmerge_genhistory has a lot of rows with pubid = NULL

I have a merge replication set up, and I am worried that the metadata cleanup might not be keeping up. I have a retention period of 60 days, and I can see that the metadata cleanup job does remove rows in MSmerge_genhistory that are older than that, but only rows that have the right GUID in pubid. Most of the rows, about 1.6 million, have NULL in pubid, and I cannot figure out why. Does anybody know why there are so many NULL values?
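
A quick way to quantify the split, run in the replicated database (a sketch; MSmerge_genhistory is a system table, so treat it as read-only):

-- count generation rows per publication id; NULL shows up as its own group
SELECT pubid, COUNT(*) AS gen_rows
FROM dbo.MSmerge_genhistory
GROUP BY pubid
ORDER BY gen_rows DESC;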

PostgreSQL pg_basebackup from several port numbers

I have one slave server (call it slave1) that is used as a replication server for several master servers.

The slave1 server is set up to receive replication from several PostgreSQL servers, one instance per port. The setup looks like this:

  • Master1 port 5432 replicates to slave1 port 5432
  • Master2 port 5432 replicates to slave1 port 5433
  • etc.

The servers above (master1, master2, and slave1) are hosted in the cloud.

All servers run PostgreSQL 11 on Ubuntu 18.04.1 LTS.

Is it possible to replicate my slave1 to my on-premises server at the office (call it slave2), so that all the databases on slave1, across all ports, are replicated to slave2 on a single port (5432, the PostgreSQL default)?
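
For the per-instance mechanics, a sketch of cloning one of slave1's instances as a cascading standby (host, user, and data directory are placeholders). Note that each source instance still needs its own target instance and port: a single PostgreSQL instance cannot serve several unrelated clusters, so merging everything onto one port would require consolidating the databases into one cluster instead:

# clone the instance listening on slave1:5433 as a cascading standby
# (-R writes a recovery.conf with the right primary_conninfo on PG 11)
pg_basebackup -h slave1 -p 5433 -U replicator \
    -D /var/lib/postgresql/11/replica_master2 \
    -X stream -R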
