In an ideal world, data flows through your system piece by piece, each piece processed and passed on quickly to minimise latency and keep the system responsive.
Sometimes, though, the overhead of processing can kill this flow.
I’ve had two such examples recently on customer projects: one reading frames from a file, the other pushing frames through a GPU.
Very different causes, but a similar solution – we must spread the overhead across multiple pieces of data with batching.
Let’s define some terms:
Frame – A single data/image capture.
Throughput – The number of frames per second the system can process from an input.
Latency – The time from the system capturing a frame to outputting it in whatever form is required.
Overhead – A fixed minimum time for an operation, regardless of how much work it does.
The overhead in the file example was around 100ms (plus negligible time per frame). If we grab an individual frame on each read, we achieve 1 frame per 100ms, or 10 frames/second throughput.
But we are targeting 100 frames/second, so if instead we grab 10 frames in each read, that gives us 10 frames per 100ms, or 100 frames/second throughput. What we give up is latency: instead of a frame every 10ms, we get 10 frames every 100ms, which means some frames are delayed on their way through the system.
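That arithmetic can be written down as a minimal model. Assumptions to flag: the 100ms overhead and 100 frames/second input rate come from this post, but the function names, the (negligible) per-frame cost, and the worst-case latency formula are illustrative, not taken from either customer project:

```python
def throughput(batch_size, overhead_s=0.1, per_frame_s=0.0):
    """Frames/second when each read costs a fixed overhead plus a
    (here negligible) per-frame cost."""
    return batch_size / (overhead_s + batch_size * per_frame_s)

def worst_case_latency(batch_size, input_fps=100, overhead_s=0.1):
    """Delay seen by the oldest frame in a batch: the time to accumulate
    the batch at the input rate, plus the read overhead itself."""
    return batch_size / input_fps + overhead_s

print(throughput(1))           # ~10 frames/second with no batching
print(throughput(10))          # ~100 frames/second with batches of 10
print(worst_case_latency(10))  # ~0.2s delay for the oldest frame in a batch
```

Throughput grows roughly linearly with batch size (until the per-frame cost dominates), while the oldest frame in each batch pays for the whole batch in latency – exactly the trade described above.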
(and yes this is exactly like a manufacturing production flow)
Whether latency is important depends on your application.
If you are using the data to make real-time decisions, latency becomes very important. This is the case for customer 1, so we will test different batch sizes carefully to keep throughput high enough without introducing too much delay.
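Picking a batch size then becomes a simple search over that trade-off. A sketch, with the caveat that the 100 frames/second target and 100ms overhead match the numbers in this post but the latency budget is an invented placeholder:

```python
def throughput(batch_size, overhead_s=0.1):
    # Frames/second with a fixed per-read overhead and negligible per-frame cost.
    return batch_size / overhead_s

def latency(batch_size, input_fps=100, overhead_s=0.1):
    # Worst-case delay: time to accumulate the batch, plus the read overhead.
    return batch_size / input_fps + overhead_s

def smallest_batch(target_fps=100, latency_budget_s=0.25):
    """Smallest batch size meeting the throughput target within the
    latency budget, or None if the two constraints can't both be met."""
    for batch in range(1, 1001):
        if throughput(batch) >= target_fps and latency(batch) <= latency_budget_s:
            return batch
    return None

print(smallest_batch())  # 10: the smallest batch that hits 100 fps under 250ms
```

In practice you would measure `throughput` and `latency` on the real system rather than use this model, but the shape of the search is the same: find the smallest batch that keeps throughput high enough, because smaller batches mean less delay.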
The GPU customer is sending this data to a user interface and data storage so we can tolerate larger delays.
100ms doesn’t mean much if the data is going to a user interface. But if you need 10-second batches, the interface updates will look juddery (think video buffering when the internet was younger).
For storage we can tolerate much higher latency – the key concern is how much data might be lost if there is a system failure. The latency represents a quantity of buffered data that could be lost in the case of, say, a power failure.