Source here.
Throughput is a measure of how much work can be completed in a given amount of time. Following the same example, how many files could we download in an hour?
What is the relationship between latency and throughput?
It may take, again, 1 hour to download a file. Notice already that we have a relationship between some amount of work completed and time -- if the file is, say, 2 GB, it takes 1 hour to download 2 GB in our example. What we really want to do here is generalize: how long does it take to download 10 GB? How many GB can we download in ten hours, or in one minute? Generalizing to how long an arbitrary amount of work takes, we derive latency; generalizing to how much work completes in an arbitrary amount of time, we derive throughput.
So if we start with an observation of some relationship, deriving throughput and latency is only a matter of unit conversion. I know I can download 2 GB in an hour, but I want to know how many bytes I download in one second -- or I have a specific job in mind and want to download a 100 GB game, so I want to know how long that will take.
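As a concrete illustration, here is a minimal sketch (in Python, assuming the decimal convention of 10^9 bytes per GB) of those two unit conversions, starting from the observed 2 GB in an hour:

```python
# A minimal sketch of the unit conversions described above.
# The figures (2 GB per hour, a 100 GB game) come from the running example.

GB = 1_000_000_000        # bytes in a gigabyte (decimal convention)
HOUR = 3600               # seconds in an hour

observed_work = 2 * GB    # bytes downloaded during the observation
observed_time = 1 * HOUR  # seconds the observation took

# Throughput: generalize to work completed per unit of time.
bytes_per_second = observed_work / observed_time
print(f"throughput: {bytes_per_second:,.0f} bytes/s")     # ~555,556 bytes/s

# Latency for a specific job: generalize to the time an arbitrary job takes.
game_size = 100 * GB
hours_for_game = game_size / (observed_work / observed_time) / HOUR
print(f"100 GB game: {hours_for_game:.0f} hours")          # 50 hours
```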
Throughput and latency seem like inverses of one another, however. I can download 2 GB in an hour, with the hour as my baseline for time and the GB as my baseline for work -- I start with the latency for the 2 GB job and want to know its throughput, that is, how often a GB is downloaded.
It seems helpful to understand ratios, and what the inversion of a ratio really means -- notionally and not just symbolically. I can say that I download 2 GB in an hour, and just as well that an hour is what it takes to download 2 GB -- that is all we really mean by the inversion: the same relationship, read the other way around. But when you invert, your units change. I said above that I consider the GB as my baseline for work, and the raw inversion no longer expresses the relationship in terms of that baseline -- it is per 2 GB, not per 1 GB. So I normalize my baseline by solving the simple equation that asks, if I can download 2 GB in an hour, how long it takes to download 1 GB -- and easily conclude that it takes 0.5 hours.
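Under the assumption that the downloader handles only one job at a time, the inverted-and-normalized latency and the throughput really are reciprocals; a minimal sketch (in Python) of that step:

```python
# A minimal sketch of the inversion-and-normalization step, under the
# (soon to be questioned) assumption that the downloader works on one
# job at a time. The 2 GB / 1 hour figures come from the running example.

work_gb = 2.0    # GB observed
time_h = 1.0     # hours observed

hours_per_gb = time_h / work_gb    # normalized latency: 0.5 hours per GB
gb_per_hour = work_gb / time_h     # throughput: 2 GB per hour

# Inverting one ratio gives the other -- but only if nothing else was
# being downloaded during the observation.
assert gb_per_hour == 1 / hours_per_gb
print(hours_per_gb, gb_per_hour)   # 0.5 2.0
```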
But there's a fallacy in this reasoning. The pipeline is provided as a counter-example. Say that it takes 5 hours for a car to be completed via a pipeline process with 5 steps, each step taking an hour. Because of the structure of the system that builds the car, other cars are also being built as each part of this car is completed. So the throughput in this case is not necessarily 0.2 cars an hour. If the pipeline is full, the final step completes a car every hour, so 5 more cars roll off over the next 5 hours. The throughput then is considerably more: 1 car an hour.
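To make the counter-example concrete, here is a small simulation sketch (in Python; the stage-by-stage bookkeeping is my own illustration, not drawn from the source) of a 5-stage pipeline in which every stage takes an hour:

```python
# A hypothetical 5-stage car pipeline: each stage takes one hour, so a
# single car's latency is 5 hours, yet once the pipeline is full, one
# car rolls off per hour.

STAGES = 5

def simulate(hours: int) -> list[int]:
    """Return the hour at which each completed car rolled off the line."""
    pipeline = [None] * STAGES   # pipeline[i] = car currently at stage i
    next_car = 0
    completed = []
    for hour in range(1, hours + 1):
        # A new car enters stage 0 and every car already in the line advances.
        pipeline = [next_car] + pipeline[:-1]
        next_car += 1
        # Each occupied stage does an hour of work; the car at the last
        # stage is finished at the end of this hour.
        if pipeline[-1] is not None:
            completed.append(hour)
    return completed

print(simulate(10))   # [5, 6, 7, 8, 9, 10]: the first car after 5 hours,
                      # then one car per hour once the pipeline is full
```

The first car still takes 5 hours, but six cars come out in ten hours: a latency of 5 hours sitting alongside a throughput of 1 car per hour.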
Where does the fallacy come from? We fell into it, I think, by confusing a data point with a more general relationship. We might observe a car come out of a pipeline after 5 hours and then, through inversion and normalization, conclude that the next car will come in another 5 hours. But if we observe the system, which we might suppose is now saturated, for another hour, we will see another car come out. The fallacy is not in the inversion itself but in mistaking a data point for a relationship. Likewise, if you are downloading several files at a time and notice that it takes a certain amount of time to download one file, you can't necessarily arrive at any general conclusion about the throughput of the downloader from the time it took to download that one file. In fact, during the time it took to download that one file, other files were also downloading by hypothesis, so the total amount downloaded -- and hence the throughput over that interval -- is greater than that one file's size alone would suggest.
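The same point in numbers, with a hypothetical downloader (the 2 GB/hour link capacity and the four concurrent files are assumptions chosen purely for illustration):

```python
# A toy illustration of confusing a data point with a relationship:
# four files download concurrently, sharing the link equally. Watching
# only one of them suggests 0.5 GB/hour; the system's throughput is 2 GB/hour.

link_gb_per_hour = 2.0    # total capacity of the link (assumed)
concurrent_files = 4      # files downloading at the same time (assumed)
file_size_gb = 2.0

# Each file sees a quarter of the link, so one 2 GB file takes 4 hours.
hours_for_one_file = file_size_gb / (link_gb_per_hour / concurrent_files)

naive_throughput = file_size_gb / hours_for_one_file                       # 0.5 GB/hour
actual_throughput = concurrent_files * file_size_gb / hours_for_one_file   # 2.0 GB/hour

print(hours_for_one_file, naive_throughput, actual_throughput)   # 4.0 0.5 2.0
```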
But perhaps it isn't quite right to say that throughput and latency are not inverses -- the point is rather that you can't generalize from a single case. If you observe that it takes 5 hours to build a car or 1 hour to download a file, you are not necessarily observing the throughput of the system in the first place -- you are just observing how long it took the system to complete one particular job. You don't know what else the system was doing at the time -- it all depends on how the system works.