I was looking for a procedure to setup a new MySQL replication slave on Amazon AWS without having to do a MySQL dump of the master. Ended up using a couple of AWS features to make this an easy process. Let’s assume that we have a master server (master01) and a slave (slave01). We want to create a new slave (slave02). We’ll do this without taking the master offline and without doing a MySQL dump. This is especially important with large databases.
A couple of notes before we start:
- On AWS we put MySQL storage on it’s own EBS storage volume. We create a new empty volume, partition, add LVM and then format with xfs. This makes it easy to add additional storage in the future.
- We prefer to use the innodb-file-per-table setting in my.cnf to force MySQL to use a separate file for each InnoDB table. This keeps all the files that belong to a database in a common folder instead of mixing multiple databases in a single file. Also in may help avoid having very large database files.
- We’re assuming that you’ve already setup the slave02 EC2 instance. You might do this by cloning the server01 instance.
Now here’s my procedure:
1. Login to AWS and go to EC2 -> Volumes. Find the volume holding the MySQL data. If you’ve not already done so we would recommend adding a Name tag to the volume so that you can easily see which server/function it is associated with. For example you might name it “slave01-mysql”. Right click on the volume and select “Create Snapshot”. A dialog will open where you can name the snapshot … something like “slave01-mysql-snap-1” would be appropriate. Depending on the size of the volume, this snapshot will take some time to complete.
One of the greatest features of EBS snapshots is their incremental nature. Here’s how it’s described by Amazon:
Amazon EBS snapshots are incremental backups, meaning that only the blocks on the device that have changed since your last snapshot will be saved. If you have a device with 100 GBs of data, but only 5 GBs of data has changed since your last snapshot, only the 5 additional GBs of snapshot data will be stored back to Amazon S3.
So the first snapshot we take of a volume has to copy the entire volume but additional snapshots only copy changed blocks. We’re going to use this is next step. At this point we have a snapshot of the server01 slave but it is inconsistent and we don’t have the point-in-time log coordinates that we need to initialize a new slave.
2. Login to slave01, stop replication and apply a read lock as follows:
1 |
mysql> slave stop; |
1 |
mysql> flush tables with read lock; |
Make sure that the “flush tables …” completes before proceeding. It may take a few seconds or more if there are locked tables that have to be released before the lock can be completed. At this point the slave server is not fully functional. You’ll want to proceed quickly through the next steps. Until the next step (3) is complete, do not close the terminal with the “flush tables …” as we need to keep the read lock active.
3. Open a second terminal session to slave01, record log coordinates and shutdown the mysql process as follows:
1 |
echo "show slave status\G" | mysql |
1 |
mysqladmin shutdown |
The slave status will look something like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: reportdb_master.prod.com Master_User: repl Master_Port: 3306 Connect_Retry: 60 <strong> Master_Log_File: bin-log.001998</strong> Read_Master_Log_Pos: 564896522 Relay_Log_File: mysqld-relay-bin.000007 Relay_Log_Pos: 564896666 Relay_Master_Log_File: bin-log.001998 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 <strong>Exec_Master_Log_Pos: 564896522</strong> Relay_Log_Space: 564896864 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 501 |
The columns that we need to record are:
Exec_Master_Log_Pos
Master_Log_File
Take note of these values for later use.
The slave01 database is now stopped and we have the log coordinates from right before the shutdown.
4. Next in AWS, start a new snapshot of the slave01-mysql volume. Name it slave01-mysql-snap-2 or something similar. This snapshot will run much faster since it is only copying blocks that changed since the first snapshot.
5. When the second snapshot is complete, start the mysql process on the slave01 server. Check replication and confirm that it comes back in sync with the master. The slave01 is now back to a functional status.
6. In AWS, right click on the slave01-mysql-snap-2 snapshot and select “Create volume from snapshot”. Name the new volume something like slave02-mysql.
Another great feature of EBS snapshots is “lazy loading” as described here:
New volumes created from existing Amazon S3 snapshots load lazily in the background. This means that once a volume is created from a snapshot, there is no need to wait for all of the data to transfer from Amazon S3 to your Amazon EBS volume before your attached instance can start accessing the volume and all of its data. If your instance accesses a piece of data which hasn’t yet been loaded, the volume will immediately download the requested data from Amazon S3, and then will continue loading the rest of the volume’s data in the background.
This means that we can proceed with the new slave setup without waiting for the new volume to be loaded … instead it will load as needed.
7. In AWS, right click on the newly created volume and attach it to the new slave02 instance.
8. Login to the slave02 instance, confirm that mysql is stopped and mount the new volume.
9. Edit /etc/my.conf and add “skip-slave-start” to the “[mysqld]” section.
10. In the mysql data directory (/var/lib/mysql by default), remove the following files:
- master.info
- relay logs (mysqld-relay-bin.*)
- relay-log.info
11. Start mysql server
12. Set replication config:
1 2 3 4 5 |
mysql> change master to master_host='YOUR_MASTER_HOST', master_user='YOUR_REPLICATION_USER', master_password='YOUR_REPLICATION_PASS', master_log_file='bin-log.001998', master_log_pos=564896522; |
13. Now start replication:
1 |
mysql> slave start; |
It may take a few minutes or more for replication on the new slave to catch up with the master. This is due to the “lazy load” on the volume created from the snapshot and the need to process the logs from the master.
Once the new slave catches up with the master then the task is complete and you have a new slave. This procedure can work with very large databases. There is no downtime on the master for this process. The existing slave is down for only a short period.