How Long Does a File Migration Last Anyways?
Fast File Migration & Throughput
I have been manning conference booths and speaking at technical conferences for over a decade, usually covering the topic of migration in some shape or form. Recently, I served as eye candy (lies) at a technical conference in Las Vegas, for instance. Typical stuff: giant marketing cloths behind me making wild claims of ridiculously fast file migration – a file migration speed of up to 60TB per day.
Surely this is the work of some funny-math executed by a sales VP and then begrudgingly incorporated into a booth banner by some young, inexperienced marketing person. Except that it’s not, and we don’t do inexperienced!
Anyway, a crazy thing happened at this event no fewer than three times. Basically, an intelligent and wise technical type person walks up and calls “BS.” Literally. Two of them actually said to me, “I call BS, tell me how this is possible.”
Peak File Migration Speed
I want to set the record straight on how we legitimately achieved a peak file migration speed of 60TB per day for a recent customer.
First, let me begin by saying that it can be dangerous to “give away the IP.” But this really is not a secret. It’s all about the power of parallelization. The real IP is the incredible engineering architecture of this transfer engine. You can put the specs to paper, but it is up to the engineers to deliver.
What’s our secret sauce? We pretty much just locked these guys all into a room with a fridge full of Mountain Dew, Monster, and Red Bull. Then we slid some pizzas and Kit-Kats under the door as needed (just kidding – our application architect is a beastly physical specimen who would probably never eat that stuff). But let me tell you how we do get this done.
File Migration Speed to Scale
The fact is it’s a very tall order to start from ground zero and create an engine that can scale, while also being flexible enough to minimize friction to the point where we can move all possible data – to the tune of over 99.9%. So it takes some of the most experienced file migration engineers on the planet to build a methodology that complements this incredible technology.
For example, it goes like this:
Step 1: Start with a software engine that can scale in any direction, up or out, based on the strengths and weaknesses of the source and destination platforms.
Step 2: Perform some testing and tuning of that engine based on real data to find the “sweet spot” balance of optimal hardware, until (A) the customer decides spinning up more hardware is not worth shortening project duration or (B) the performance limits of the source or destination system are reached. You know, typical deal. You can have it FAST, you can have it RIGHT, or you can have it CHEAP… Pick two! But we prefer FAST and RIGHT! Give us a blank check on hardware and we can literally move (Iron) mountains!
The key here is that SkySync is never the bottleneck in transfer processing. This engine is essentially a real-time Rosetta Stone between two (or more) platforms. It moves data from the source directly to the destination. Do not pass “GO.” Do not collect $200. Do not store any data at rest… anywhere. Just pipe it. It’s lean, it’s mean and it’s incredibly efficient – the engine itself stays out of the way of throughput performance.
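For the curious, the "just pipe it" idea can be sketched in a few lines of Python. This is only an illustration and not SkySync's actual code: the `pipe` function, the chunk size, and the in-memory streams standing in for the source and destination platforms are all assumptions made for the example.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary illustrative value


def pipe(source, destination, chunk_size=CHUNK_SIZE):
    """Stream bytes from source to destination without staging them on disk.

    Only one chunk is held in memory at a time. Returns the byte count moved.
    """
    moved = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the source is exhausted
            break
        destination.write(chunk)
        moved += len(chunk)
    return moved


# Hypothetical stand-ins: in-memory streams play the role of the two platforms.
source = io.BytesIO(b"x" * 2_500_000)
destination = io.BytesIO()
moved = pipe(source, destination)
print(f"piped {moved} bytes")
```

The point of the sketch is that nothing lands anywhere in between: each chunk is read from one side and written straight to the other.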
Using this process, we hit just a whisker short of 60TB per day at peak speed. That said, the average throughput was still north of 30TB per day (about 31.6TB per day)! This migration wave lasted about 96 hours (4 days). During that time, SkySync transferred 19.5 million items and 126.55TB of content. Sure, our delivery guys got a little bug-eyed watching the numbers hour by hour, but it was well worth it.
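If you want to check that math yourself, here is a quick back-of-the-envelope sketch using only the figures quoted above. The one assumption is that a terabyte means a decimal terabyte (10^12 bytes); the variable names are just for illustration.

```python
TOTAL_TB = 126.55        # content moved during the wave
TOTAL_ITEMS = 19_500_000  # items moved during the wave
WAVE_HOURS = 96           # about 4 days
PEAK_TB_PER_DAY = 60      # "just a whisker short" of this at peak

BYTES_PER_TB = 10**12     # assumption: decimal terabytes
SECONDS_PER_DAY = 86_400

# Average daily throughput over the whole wave.
avg_tb_per_day = TOTAL_TB / (WAVE_HOURS / 24)

# Average item rate and average item size.
items_per_second = TOTAL_ITEMS / (WAVE_HOURS * 3600)
avg_item_mb = TOTAL_TB * BYTES_PER_TB / TOTAL_ITEMS / 10**6

# Sustained network rate implied by the peak figure, in gigabits per second.
peak_gbit_per_s = PEAK_TB_PER_DAY * BYTES_PER_TB * 8 / SECONDS_PER_DAY / 10**9

print(f"average: {avg_tb_per_day:.1f} TB/day")
print(f"items:   {items_per_second:.0f} per second, ~{avg_item_mb:.1f} MB each")
print(f"peak:    {peak_gbit_per_s:.1f} Gbit/s sustained")
```

In other words, the peak number implies a sustained pipe of several gigabits per second, which is why where the farms live matters so much.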
This is what the entire file migration speed wave looked like:
What you see above is the culmination of years of engine refinement. Like the New York Yankees, we basically went all in, loaded up on “A” players, and made this happen.
The Actual “Mic Drop”
First, we deployed two SkySync processing farms. Each had a single 16-core SQL Server and twelve 16-core processing servers. We deployed in Microsoft Azure to maximize network bandwidth and minimize latency. Additionally, each processing server can execute a minimum of 20 jobs concurrently, and each job is configured to run 6 concurrent threads of operation. Across both farms, that comes to almost 3,000 parallel operations (2 × 12 × 20 × 6 = 2,880) executing at the same time. Together, these operations pipeline the data directly from the source to the destination system with no stop in between, which makes the whole thing extremely efficient and extremely fast.
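Here is the same arithmetic as a small Python sketch, in case anyone still wants to call "BS." It uses only the numbers quoted above; the per-operation rate is a derived estimate that assumes decimal terabytes and an even spread of work across operations, which a real migration will not have.

```python
FARMS = 2
SERVERS_PER_FARM = 12
JOBS_PER_SERVER = 20   # quoted as a minimum, so the total is a floor
THREADS_PER_JOB = 6

parallel_ops = FARMS * SERVERS_PER_FARM * JOBS_PER_SERVER * THREADS_PER_JOB

PEAK_TB_PER_DAY = 60
BYTES_PER_TB = 10**12   # assumption: decimal terabytes
SECONDS_PER_DAY = 86_400

# Throughput each operation must sustain to reach the peak figure,
# assuming (unrealistically) that work is spread perfectly evenly.
per_op_mb_per_s = (
    PEAK_TB_PER_DAY * BYTES_PER_TB / parallel_ops / SECONDS_PER_DAY / 10**6
)

print(f"{parallel_ops} parallel operations")
print(f"~{per_op_mb_per_s:.2f} MB/s needed per operation at peak")
```

That is the whole trick: no single operation has to be fast. Each one only needs a modest trickle, and there are thousands of them.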
And for those guys that called “BS” on the fast file migration? All of them pretty much just paused for a few seconds (crickets chirping) and then said, “Well, OK then!”
Here’s the visual Mic Drop, in diagram form:
The obvious question now is, “Why are there two farms? Can’t you just have one farm with 24 processing servers?”
The answer is, generally, yes. But having two farms gives us the flexibility to break up the waves and operate a “tick-tock” cadence, alternating high-performance throughput on one farm with low-performance remediation on the other, which keeps human intervention to a minimum and machine efficiency high. In this case, though, we simply put the pedal to the metal and floored it to see what we could get out of both farms slamming data!
And that’s how we do a fast file migration. No magic. No “BS.” No fluff. Just massive parallelization from the fastest file and folder migration engine around, and a ton of migration experience from some of our finest Syncopaths!
One of my favorite movies of all time is Spaceballs. We just hit “Ludicrous Speed.” Next up… we go to “Plaid” – 100TB+!
Read more on file migration from this author – Intentional Migration: Analyzing & Planning the Route Through Troubled Waters