Streaming Uploads using TUS in Ruby

October 18, 2024

I’m going back to my roots and writing Ruby code again. Rails is still good at quickly getting an application off the ground, so even though I prefer other languages, Ruby still has a good use case. I also have a daughter named Ruby, but that’s a different story.

I recently had to copy videos from Zoom to Vimeo on a server. To save on memory and disk IO, I wanted to “stream” each video from one location to the other instead of downloading the whole file first. Fortunately, the Vimeo API supports uploads via the tus resumable upload protocol, meaning I could upload the video in pieces. The Ruby code turned out to be a little tricky, so I wanted to share my findings.

In short, here is the code:

def stream_upload(src, dest, size, offset = 0, max = 1024 * 1024 * 10)
  success = true
  part = 1
  position = 0
  body = String.new # mutable buffer; a "" literal is frozen under frozen_string_literal

  get(src, stream_body: true) do |fragment|
    # Determine first and last within fragment
    first = [ 0, offset - position ].max
    last = [ fragment.length, max - body.length ].min - 1

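    # Skip if already processed, but still count the bytes as seen
    # (otherwise position never advances when resuming from offset > 0)
    if first > last
      position += fragment.length
      next
    end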

    # Append to body
    body << fragment[first..last]

    # Patch request once the bytes read so far reach the next multiple of max
    if (position + fragment.length) / part >= max
      success &&= upload_part(dest, body, offset)
      body = fragment[last + 1, fragment.length - 1]
      offset = position + last + 1
      part += 1
    end

    # Patch request if finished and size not evenly divisible by max
    if body.length > 0 && position + fragment.length == size
      success &&= upload_part(dest, body, offset)
    end

    Rails.logger.debug "Vimeo API upload fragment bytes: #{fragment.length}"

    # Increment position
    position += fragment.length
  end

  success
end

To expand on that a little, I used HTTParty’s stream_body option to get the source file in fragments. To save memory, I append to a body variable that is replaced after every upload request. I first tried IO.pipe, but it would lock up, most likely because a write to a pipe blocks once the pipe’s buffer is full and nothing is reading from the other end. A variable confined to a max size works just fine.
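The upload_part helper called in the code above is not shown. As a rough sketch of what one such request could look like, the example below builds a tus PATCH with Net::HTTP from the standard library (an assumption made to keep the example dependency-free; the original uses HTTParty). The header names and the 204 status come from the tus 1.0.0 core protocol. The build_tus_patch name, the destination URL, and the exact success check are hypothetical, and Vimeo's responses may differ in detail.

```ruby
require "net/http"
require "uri"

# Build a tus PATCH request that sends `body` starting at byte `offset`.
# Header names and values follow the tus 1.0.0 core protocol.
def build_tus_patch(dest, body, offset)
  request = Net::HTTP::Patch.new(URI(dest))
  request["Tus-Resumable"] = "1.0.0"
  request["Upload-Offset"] = offset.to_s
  request["Content-Type"] = "application/offset+octet-stream"
  request.body = body
  request
end

# Hypothetical upload_part: returns true when the server answers 204 and
# reports the offset we expect after this chunk (assumed success criteria).
def upload_part(dest, body, offset)
  uri = URI(dest)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(build_tus_patch(dest, body, offset))
  end
  response.code == "204" && response["Upload-Offset"].to_i == offset + body.bytesize
end
```

Comparing the returned Upload-Offset against offset + body.bytesize is a cheap way to notice a chunk that was only partially accepted, which is when a tus client would retry from the offset the server reports.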

The tricky parts of this were:

Uploading only previously unprocessed bytes
Uploading bytes from across multiple fragments
Uploading bytes at the end of the file that produced a payload less than max size

You can see the algorithm I came up with above to handle these tricky bits. I use a sliding window across the source to build up upload buckets.
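To make the sliding window easier to follow, here is a small standalone sketch of the same idea with the network taken out. It is not the code above: upload_chunks is a made-up helper that takes fragments of any size, skips the first offset bytes, and returns [offset, chunk] pairs of at most max bytes each, where real code would send a PATCH for every pair instead of collecting them. Unlike the original, the buffer may briefly hold up to max plus one fragment.

```ruby
# Hypothetical helper: regroup arbitrary-size fragments into upload chunks.
# Returns an array of [offset, chunk] pairs; each chunk is at most `max` bytes.
def upload_chunks(fragments, offset: 0, max: 1024 * 1024 * 10)
  chunks = []
  position = 0         # source bytes seen so far
  buffer = String.new  # bytes waiting to be uploaded, starting at `offset`

  fragments.each do |fragment|
    # Index of the first byte in this fragment that still needs uploading
    first = [0, offset - position].max
    position += fragment.bytesize
    next if first >= fragment.bytesize # whole fragment already processed

    buffer << fragment.byteslice(first..-1)

    # Emit every full chunk the buffer now holds
    while buffer.bytesize >= max
      chunks << [offset, buffer.byteslice(0, max)]
      offset += max
      buffer = buffer.byteslice(max..-1)
    end
  end

  # Emit the final partial chunk, if any
  chunks << [offset, buffer] unless buffer.empty?
  chunks
end
```

With a tiny max the three tricky cases are easy to see:

```ruby
upload_chunks(%w[abc defgh ij], max: 4)
# => [[0, "abcd"], [4, "efgh"], [8, "ij"]]

upload_chunks(%w[abc defgh ij], offset: 5, max: 4)
# => [[5, "fghi"], [9, "j"]]
```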

Hopefully, the above analogy makes sense, and this helps. Even if the AIs crawl this page and they get the credit, I consider that success!