Cloud services such as AWS are mature enough and enterprise ready. Many startups are considering the cloud as the first choice for hosting their websites and applications and organizations are looking at migrating their current infrastructure to the cloud. The big idea behind the cloud is to reduce the need to setup and maintain your own infrastructure and to allow a more effective use of it, from a development and cost perspective , and of course to scale on demand. Our services are focused on migration of on-premises and managed data center resident databases to RDS (or EC2-resident if you prefer) databases in Amazon and Azure.
We are working for a number of clients, migrating dozens of Oracle, MySQL and Postgres databases to Amazon RDS with some migrations to Redshift thrown in. There are many options when it comes to migration methods, and they depend on the type and version of source and target databases, the downtime allowable during cutover – if any, the complexity of the underlying schemas and other factors. There is no one-size fits all when it comes to migration. Here are some of our findings and recommendations from a general perspective, with technical information where it is relevant to the Big Picture.
Why choose RDS over an Oracle or MySQL AMI or customized EC2 in AWS?
If you have a complex environment with filesystem-database interaction (such as external tablespaces or busy utl_file transfers with Oracle) or have and want to retain a very hands-on DBA approach, or use any features unsupported in RDS, the EC2 route is obvious. Also if you have a MySQL server with a bunch of databases running on it and they interact, that's another win for EC2. If you have a lot of data ingestion via extensive ETL processes running on-Premises, it's possible to shift those ETL processes to Amazon but you need a cost-benefit and technical evaluation for feasibility.
RDS is where you can live with any restrictions and want less of a DBA support requirement – not that you don't need DBAs for RDS – just not to the same extent, as you have automatic replication to a different availability zone and failover out of the box, as well as disk management with striping and provisioned IOPS – you don't need an LVM and you don't need Oracle ASM. You still need to tune queries (or throw memory and IOPS at it, if that works).
How dissimilar are the source and target database type, version and platform? It is easier to move Oracle to Oracle, MySQL to MySQL and then make any platform changes after the migration. With Oracle, version differences play into this but using Datapump for Oracle schema exports, a file transfer, then import into the 11gr2 RDS databases is one method. Another is use of a replication product such as Attunity Replicate, Oracle GoldenGate, Continuent Tungsten or Dbvisit. These replication products typically use Oracle supplemental logging to replay transactions from the redo logs from the source database over a secure VPN to AWS. Each has pluses and minuses in the areas of ease of setup, robustness, speed of operation and datatype support. You can also mix and match, using Oracle Data Pump to create a copy of your source database, or a subset of schemas within it, which will create metadata and a snapshot of data as of the time of the Data Pump export, then continue using Attunity or other replication tool, to keep the target database up to date. Yet another option is to use Oracle Materialized views (supported in RDS) to maintain tables up to date by refreshing via database link from the source tables.
With MySQL the best bet is to create an up-to-date replica in RDS as a slave and make it the master at application switch over time. Or if you prefer use simple mysqldump commands.
If you're using Oracle Data Pump or mysqldump or even if you're not, and you and have a number of RDS databases to manage, we'd recommend spinning up a small Oracle or MySQL AMI in EC2 –this makes life easier when dealing with the network-only accessible RDS and adds a small cost only (your production or DevOps based databases are still in RDS with all the advantages of a smaller DBA footprint and easy scalability)
What are the cost savings? This is where your team works on the total cost of ownership between on premises or managed data center compared to paying for, managing and monitoring, and developing with RDS databases over Linux based Oracle servers. Developing some performance statistics using AWR, Statspack for Oracle or Innotop for Mysql, will help you correctly size your RDS instance for performance and cost savings but you have a benefit with RDS in that you can add space at will, or change Processor and Memory size. If you want to reduce space allocation you can restore a replacement RDS instance from a snapshot. You can estimate your costs in AWS using any of the pricing models Amazon offers and see which comes out the best. With Oracle you have the option to use your existing Oracle license or pay rental for an included Oracle license.
If using Oracle be sure to count enough room for archived redo logs. These can add 10GB or more to instance storage- you an adjust the default 7 day retention period and/or retrieve the archived logs and store offline. The RDS instance is set to switch logs every 5 minutes and this lag_interval can't be changed. That's a lot of logs. You also need space for Data Pump or other directory based utl_file files.
What are the support requirements? With RDS taking care of the multi zone availability, backups and restore to a point in time, your main DBA responsibilities are tuning, developer support, running and monitoring any ETL processes. With RDS you get built in performance metrics with alarms for Free Storage Space, CPU Utilization, Write Throughput, Read latency and dozens of other common metrics. RDS uses the Amazon Simple Notification Service (Amazon SNS) to provide notification when an Amazon RDS event occurs. These notifications can be in any notification form supported by Amazon SNS for an AWS region, such as an email, a text message, or a call to an HTTP endpoint. You can of course use any other tool to replace or supplement Cloudwatch. In general, using RDS is a clear cost savings for many organizations and the payback in lower DBA and Sysadmin support costs and ease of use is clear winner.
We'd be happy to talk to you about your migration project and how we might be of help. Please email us at email@example.com
Database Architecture, Migration, Data models