streaming data

Jun 6, 2024

Sometimes a client can't wait for a large data file to download in full. To provide a better experience, you can make the data streamable.

Streaming data means downloading a large file in chunks. For example, at Spotify we use HLS (HTTP Live Streaming) for music.

Essentially, a large audio file is split into small chunks (5–10 seconds of audio each), and we maintain a manifest file that tracks these chunks.

The client requests a 5-second chunk, which is very quick to download, plays it, and then requests the next chunk.
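
The chunk-and-manifest idea above can be sketched in a few lines. This is a simplified illustration and not real HLS: `make_chunks`, `play`, and the `chunk_N` naming are hypothetical, and chunks are cut by byte count here, whereas HLS segments are cut by duration and listed in an .m3u8 playlist.

```python
def make_chunks(data: bytes, chunk_size: int):
    """Split data into fixed-size chunks and build a manifest listing them in order."""
    chunks = {}
    manifest = []
    for offset in range(0, len(data), chunk_size):
        name = f"chunk_{len(manifest)}"  # hypothetical naming scheme
        chunks[name] = data[offset:offset + chunk_size]
        manifest.append(name)
    return manifest, chunks


def play(manifest, fetch):
    """Client loop: fetch one chunk at a time, in manifest order, and hand it to the player."""
    for name in manifest:
        yield fetch(name)


audio = bytes(range(25))  # stand-in for a large audio file
manifest, chunks = make_chunks(audio, chunk_size=10)
received = b"".join(play(manifest, chunks.__getitem__))

print(manifest)           # ['chunk_0', 'chunk_1', 'chunk_2']
print(received == audio)  # True
```

The client only ever needs the manifest plus one small chunk at a time, which is what keeps the time to first playback short.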

Let’s try to implement our own stream processor:

API Design:

We have multiple streams: s1, s2, ..., sn.
We want to create a stream processor that returns up to n items per call, reading the streams in order and moving on to the next stream once the current one is exhausted.

Something like this:

s1 = Stream([1,2,3,4,5,6])
s2 = Stream([7,8,9,10,11,12,13])

sp = StreamProcessor([s1, s2])

while sp.streamable():
    print(sp.stream(3))

We’ll need to implement two classes:

Stream: maintains a list and an index that tracks the current item.

class Stream:
    def __init__(self, lst):
        self.lst = lst
        self.i = 0

    def streamable(self):
        # True while there are items left to read
        return self.i < len(self.lst)

    def next_item(self):
        # Return the current item and advance the index
        result = self.lst[self.i]
        self.i += 1
        return result

StreamProcessor: maintains a list of streams and an index that tracks the current stream.

class StreamProcessor:
    def __init__(self, streams):
        self.streams = streams
        self.i = 0

    def streamable(self):
        # Skip exhausted streams so that a later non-empty stream is not missed
        while self.i < len(self.streams) and not self.streams[self.i].streamable():
            self.i += 1
        return self.i < len(self.streams)

    def stream(self, n):
        # Collect up to n items, moving on to the next stream when one runs out
        result = []
        while len(result) < n and self.streamable():
            result.append(self.streams[self.i].next_item())
        return result
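
For comparison, the same batching behaviour can be sketched with the standard library's `itertools`. This is an alternative approach, not the class-based implementation above: `stream_in_batches` is a hypothetical name, and it takes plain iterables instead of `Stream` objects.

```python
from itertools import chain, islice


def stream_in_batches(streams, n):
    """Yield lists of up to n items, reading the given iterables one after another."""
    merged = chain.from_iterable(streams)  # lazily concatenates the streams
    while True:
        batch = list(islice(merged, n))    # take the next n items, or fewer at the end
        if not batch:
            return
        yield batch


for batch in stream_in_batches([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13]], 3):
    print(batch)
# [1, 2, 3]
# [4, 5, 6]
# [7, 8, 9]
# [10, 11, 12]
# [13]
```

Because `chain.from_iterable` and `islice` are lazy, nothing is read from a later stream until the earlier ones are used up, which matches the one-stream-at-a-time behaviour of the processor.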

Written by Yask Srivastava