Skip to content

jNicker/great-migration

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Great Migrator

Ever have to migrate from Rackspace Files to S3? I did. And I couldn't find a simple way to do it, so I made one. There's a service mover.io but it gave up after 25,000 objects. I had to copy 175,000.

This is a Ruby script which will log into Rackspace, get a list of all objects from a container (paged in groups of 10,000 by default), and copy those objects to an S3 bucket.

Usage

Install require gems:

bundle install

Call bin/greatmigration and pass the required options:

bin/greatmigration --rackspace-user=username --rackspace-key=123456 \
  --rackspace-container=rackcontainername --aws-key=ABCDEF \
  --aws-secret=987654 --aws-bucket=s3bucketname

Check out bin/greatmigration -h for a couple more optional parameters.

Technical Description

To get as much performance out of this script as possible it forks processes to do the actual copying of files to avoid the GIL which is still in place with threads. The first fork occurs after grabbing a page of results from Rackspace. That fork will take care of copying all of the objects in that page. Then we pull from a pool of 8 processes to actually do the uploading of individual objects.

bin/greatmigration
|
|-- page process
|   |
|   |-- upload process
|   |-- upload process
|
|-- page process
    |
    |-- upload process
    |-- upload process

In my usage I ended up forking 18 "page" processes and typically had 4-6 "upload" processes running under each one simultaneously. If you have millions of objects you may want to tweak the code to actually start the page processes from a pool as well.

Performance

Your results may vary, but I ran this script on a c4.4xlarge EC2 instance (16 cores). This maxed out the CPUs for the entire run and copied 174,754 objects in 5,430 seconds (just over an hour and a half). This ended up costing $1.32.

About

Copy objects from Rackspace to S3

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Ruby 100.0%