The company I work for allows users to record videos of tests. These videos are stored on Amazon S3, and users can download them whenever they like.
Unfortunately, the way this all works requires a lot of resources; VNC sessions are recorded and transcoded on the same boxes that also serve those VNC sessions to the users. This can start to overwhelm those boxes, which also handle other services. With usage that is just outside what we typically see in a day, the system can come to a grinding halt across the board. Obviously, we don’t want this to happen.
So I built a system that offloads the transcoding – for performance reasons, we can’t move the recording itself, but that’s the lightest part of the whole thing, so not really a problem.
Videos still get uploaded to S3, but there’s a feature that makes this whole process easier: S3 can send a message to an SQS queue with information about each new file. On top of that, we can have it create these messages only for certain file types; in this case, we only care about new .flv files.
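As a rough sketch of what that setup can look like with boto3, the snippet below builds a notification configuration that fires only for keys ending in .flv. The bucket name and queue ARN are placeholders rather than our real ones, and the helper names are made up for illustration.

```python
def flv_notification_config(queue_arn: str) -> dict:
    """Build an S3 notification configuration that targets one SQS queue."""
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                # Fire on any kind of object creation (PUT, POST, multipart, copy).
                "Events": ["s3:ObjectCreated:*"],
                # Only keys with this suffix produce a message.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".flv"}]}
                },
            }
        ]
    }


def apply_notification(bucket: str, queue_arn: str) -> None:
    """Attach the configuration to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above has no dependencies

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=flv_notification_config(queue_arn),
    )
```

Something like `apply_notification("example-bucket", "arn:aws:sqs:us-east-1:123456789012:example-queue")` would apply it. The queue's access policy also has to allow S3 to send messages to it, otherwise the call is rejected.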
So when an FLV file is uploaded, a message announcing the upload gets added to the queue. On its own, this is actually pretty useless: we know the file exists, but we can’t do anything useful with it just yet.
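For reference, here is a minimal sketch of pulling the bucket and key out of one of those messages. It assumes S3 delivers directly to the queue, so the message body is the S3 event JSON itself; the function name is my own invention.

```python
import json
from urllib.parse import unquote_plus


def uploaded_objects(message_body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for every object-created record in the body."""
    event = json.loads(message_body)
    found = []
    # S3 sends an "s3:TestEvent" message with no "Records" key when the
    # notification is first configured, so default to an empty list.
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded, with spaces turned into "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        found.append((bucket, key))
    return found
```

The decoding step matters more than it looks: a key with a space or other special character in it will not match anything in the bucket until it is decoded.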
I ended up writing a tiny Python application that uses this queue to do the transcoding. It pulls a message from the queue, which contains the file’s key inside the S3 bucket, retrieves that file from the bucket, and runs it through ffmpeg to convert it to a web-ready MP4. The MP4 is then uploaded to S3 beside the source FLV file, and a message is sent to our API to report that the conversion completed successfully. Then the process loops.
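The core of that loop can be sketched roughly as follows. This is not our production code: the ffmpeg options are one reasonable choice for a web-ready MP4 rather than the exact ones we use, `s3` and `sqs` are assumed to be boto3 clients, and `keys_from_body` and `notify` are hypothetical callables standing in for the message parsing and the call to our API.

```python
import subprocess
import tempfile
from pathlib import Path, PurePosixPath


def mp4_key_for(flv_key: str) -> str:
    """Place the MP4 beside the source FLV: same prefix, new extension."""
    return str(PurePosixPath(flv_key).with_suffix(".mp4"))


def ffmpeg_command(source: str, target: str) -> list[str]:
    """H.264 video and AAC audio, with the MP4 index moved to the front."""
    return [
        "ffmpeg", "-y", "-i", source,
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-movflags", "+faststart",
        target,
    ]


def transcode_object(s3, bucket: str, flv_key: str) -> str:
    """Download one FLV, convert it, upload the MP4, and return the new key."""
    target_key = mp4_key_for(flv_key)
    with tempfile.TemporaryDirectory() as workdir:
        source = str(Path(workdir) / "input.flv")
        target = str(Path(workdir) / "output.mp4")
        s3.download_file(bucket, flv_key, source)
        # check=True raises if ffmpeg exits non-zero, so nothing is uploaded.
        subprocess.run(ffmpeg_command(source, target), check=True)
        s3.upload_file(target, bucket, target_key)
    return target_key


def work_queue(sqs, s3, queue_url: str, keys_from_body, notify) -> None:
    """Poll the queue forever, transcoding every file each message names."""
    while True:
        reply = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20
        )
        for message in reply.get("Messages", []):
            for bucket, key in keys_from_body(message["Body"]):
                notify(bucket, transcode_object(s3, bucket, key))
            # Delete only after success; a failed message reappears later.
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
```

Deleting the message only after the upload succeeds means a crashed or failed conversion is retried once the message's visibility timeout expires, which is most of what makes a queue a good fit for this kind of work.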
Compared to other services that do transcoding, this is *insanely* cheap. Amazon’s Elastic Transcoder service, for example, charges 3 cents per minute of HD video (all of our videos are at a resolution of at least 720p). Other services cost between 60 and 80% as much. This doesn’t seem like much at first, but take into account that our users generate roughly 300,000 videos a month. Even if every single video is only 30 seconds long, which is a laughably low estimate, to be honest, that comes to 150,000 minutes, or $4,500 a month just for the transcoding service. A more reasonable estimate is an average of 2 minutes, since many of our videos sit at the extremes of either 10 minutes or under a minute, and that gives us a total of $18,000/month.
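The arithmetic is simple enough to check in a few lines. The function name is just for illustration, and the rate is kept in whole cents so the results come out exact.

```python
def managed_cost_usd(
    videos: int, minutes_per_video: float, cents_per_minute: int = 3
) -> float:
    """Monthly bill for a per-minute transcoding service, in dollars."""
    return videos * minutes_per_video * cents_per_minute / 100


low = managed_cost_usd(300_000, 0.5)    # every video is 30 seconds
typical = managed_cost_usd(300_000, 2)  # 2-minute average

print(f"30-second average: ${low:,.0f}/month")     # $4,500/month
print(f"2-minute average:  ${typical:,.0f}/month")  # $18,000/month
```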
This transcoder service runs on two t2.xlarge instances. Total price per month for us to keep up with 300k videos? About $300/month. I assure you, there are no digits missing from that. Each instance is right around $140/month. Even against the extremely low estimate of 30 seconds per video, this gives us a savings of about 94%. With a week of development and testing, we were able to offload a CPU-intensive task from sensitive infrastructure to a place that, really, doesn’t care how much we throw at it. Those two boxes, despite being fairly small, churn through those videos and keep up quite handily. Usually, that is; in the next part, I’ll show what happens when our videos come in too fast for those two boxes to keep up.
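To put numbers on that comparison, here is a quick back-of-the-envelope check. It uses two instances at roughly $140 each and the managed-service totals worked out earlier; the function name is illustrative.

```python
def savings_percent(self_hosted_usd: float, managed_usd: float) -> float:
    """Share of the managed-service bill avoided by self-hosting."""
    return (1 - self_hosted_usd / managed_usd) * 100


self_hosted = 2 * 140  # two instances at about $140/month each

for label, managed in [("30-second average", 4_500), ("2-minute average", 18_000)]:
    print(f"{label}: {savings_percent(self_hosted, managed):.1f}% saved")
```

Against the low estimate that works out to just under 94%, and against the 2-minute estimate it is a little over 98%.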