streaming data

Yask Srivastava
2 min read · Jun 6, 2024


Sometimes a client can't wait for a large data file to download in full. To provide a better experience, you can make the data streamable.

Streaming data means downloading a large file in chunks. For example, at Spotify we use HLS (HTTP Live Streaming) for music.

Essentially, a large audio file is split into small chunks (5–10 seconds of audio each), and we maintain a manifest file that tracks these chunks.

The client requests a chunk of about 5 seconds, which is quick to download, plays it, and then requests the next chunk.
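
The chunk-and-manifest idea can be sketched in a few lines. This is a toy, in-memory illustration and not real HLS: `make_chunks`, `build_manifest`, and `play` are hypothetical names, the "audio" is just a byte string, and chunks are cut by byte count rather than by seconds of audio.

```python
def make_chunks(data, chunk_size):
    """Split data into consecutive pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def build_manifest(chunks):
    """Toy manifest (assumed format): one entry per chunk, in playback order."""
    return [{"index": i, "size": len(c)} for i, c in enumerate(chunks)]


def play(manifest, fetch):
    """Walk the manifest, fetching one chunk at a time via fetch(index)."""
    played = b""
    for entry in manifest:
        chunk = fetch(entry["index"])  # a small request instead of the whole file
        played += chunk                # stand-in for "play this clip"
    return played


audio = b"0123456789" * 3 + b"ab"      # 32 bytes standing in for an audio file
chunks = make_chunks(audio, 10)
manifest = build_manifest(chunks)
result = play(manifest, lambda i: chunks[i])  # lambda stands in for a network call
```

The client only ever needs the manifest and the next chunk, so playback can start after the first small fetch instead of after the full download.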

Let's try to implement our own stream processor:

API Design:

We have multiple streams: s1, s2, ..., sn.
We want to create a stream processor that returns up to n items per call, drawing from the streams in order.

something like this:

s1 = Stream([1,2,3,4,5,6])
s2 = Stream([7,8,9,10,11,12,13])

sp = StreamProcessor([s1, s2])

while sp.streamable():
    print(sp.stream(3))

we’ll need to implement two classes:

Stream: maintains a list and an index that tracks the current item.

class Stream:
    def __init__(self, lst):
        self.lst = lst
        self.i = 0  # index of the next item to hand out

    def streamable(self):
        # True while there are items left to read.
        return self.i < len(self.lst)

    def next_item(self):
        result = self.lst[self.i]
        self.i += 1
        return result

StreamProcessor: maintains a list of streams and an index that tracks the current stream.

class StreamProcessor:
    def __init__(self, streams):
        self.streams = streams
        self.i = 0  # index of the current stream

    def streamable(self):
        # Skip exhausted streams so a drained stream doesn't hide later ones.
        while self.i < len(self.streams) and not self.streams[self.i].streamable():
            self.i += 1
        return self.i < len(self.streams)

    def stream(self, n):
        result = []
        # streamable() moves self.i to the next stream that still has items.
        while len(result) < n and self.streamable():
            result.append(self.streams[self.i].next_item())
        return result
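
For comparison, the same "n items at a time across all streams" behavior can be sketched with Python's standard library. This is an alternative take, not the implementation above: `stream_batches` is a hypothetical helper, and it assumes plain lists rather than `Stream` objects. `itertools.chain.from_iterable` joins the lists lazily, and `itertools.islice` pulls up to n items per batch.

```python
from itertools import chain, islice


def stream_batches(lists, n):
    """Yield batches of up to n items, drawing from each list in turn."""
    items = chain.from_iterable(lists)  # lazily walks list 1, then list 2, ...
    while True:
        batch = list(islice(items, n))  # next n items (fewer at the very end)
        if not batch:
            return                      # nothing left in any list
        yield batch


batches = list(stream_batches([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13]], 3))
for batch in batches:
    print(batch)
```

Because the generator keeps a single iterator over all the lists, a batch can span the boundary between two lists without any index bookkeeping.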
