parent
bcefca10f7
commit
63cce5c70a
1 changed files with 62 additions and 0 deletions
@@ -0,0 +1,62 @@
Consider three cases just to suggest the spectrum
of possibilities:

a) linear upsample: each output pixel is a weighted sum
of 4 input pixels

b) cubic upsample: each output pixel is a weighted sum
of 16 input pixels

c) downsample by N with box filter: each output pixel
is a weighted sum of NxN input pixels, where N can be very large

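As an illustration of case (a), here is a minimal sketch of a
4-tap weighted sum. The function name, the single-channel float
layout, and the replicate-from-edge handling are assumptions made
for the example, not something the notes above prescribe.

```c
#include <assert.h>

/* Hypothetical helper: sample a single-channel float image (w x h,
   row-major) at a continuous position using bilinear weights.
   Assumes 0 <= x <= w-1 and 0 <= y <= h-1. */
static float bilinear_sample(const float *img, int w, int h, float x, float y)
{
    assert(img && w > 0 && h > 0);
    assert(x >= 0.0f && x <= (float)(w - 1));
    assert(y >= 0.0f && y <= (float)(h - 1));

    int x0 = (int)x;                     /* truncation == floor since x >= 0 */
    int y0 = (int)y;
    int x1 = x0 + 1 < w ? x0 + 1 : x0;   /* replicate the last column/row */
    int y1 = y0 + 1 < h ? y0 + 1 : y0;
    float fx = x - (float)x0;
    float fy = y - (float)y0;

    /* The four weights always sum to 1. */
    float w00 = (1.0f - fx) * (1.0f - fy);
    float w10 = fx * (1.0f - fy);
    float w01 = (1.0f - fx) * fy;
    float w11 = fx * fy;

    return w00 * img[y0 * w + x0] + w10 * img[y0 * w + x1]
         + w01 * img[y1 * w + x0] + w11 * img[y1 * w + x1];
}
```

Cubic (b) has the same shape with a 4x4 neighborhood, and the box
filter (c) with an NxN one; only the number of taps and the weights
change.
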
Now, suppose you want to handle 8-bit, 16-bit, and float
input, each with or without sRGB correction.

Suppose you create a temporary buffer of float pixels, say
one scanline tall. Actually, make that two temp buffers: one
for the input and one for the output. You decode a scanline of
the input into the input temp buffer, which always holds linear
floats. This isolates the handling of 8/16/float and sRGB to one
place (and still allows you to make optimized 8-bit-sRGB-to-float
lookup tables). This also allows you to put wrap logic here,
explicitly wrapping, reflecting, or replicating-from-edge
the pixels that would otherwise come from off-edge.

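A minimal sketch of that decode step for 8-bit, single-channel
input follows. All names here (edge_mode, edge_index,
decode_scanline_u8) are hypothetical, and the 256-entry lookup
table is assumed to be built by the caller (an sRGB-to-linear
table, or a plain v/255 table when no sRGB correction is wanted).

```c
#include <assert.h>
#include <stdint.h>

typedef enum { EDGE_CLAMP, EDGE_REFLECT, EDGE_WRAP } edge_mode;

/* Map a possibly off-edge coordinate i into [0, n). */
static int edge_index(int i, int n, edge_mode mode)
{
    assert(n > 0);
    switch (mode) {
    case EDGE_CLAMP:                      /* replicate from edge */
        return i < 0 ? 0 : (i >= n ? n - 1 : i);
    case EDGE_WRAP: {
        int m = i % n;
        return m < 0 ? m + n : m;
    }
    case EDGE_REFLECT: {                  /* mirror, edge pixel repeated */
        int period = 2 * n;
        int m = i % period;
        if (m < 0) m += period;
        return m < n ? m : period - 1 - m;
    }
    }
    return 0;
}

/* Decode `count` pixels starting at column `first` (which may be
   negative or run past the edge) into linear floats. */
static void decode_scanline_u8(const uint8_t *src, int width,
                               const float lut[256],
                               int first, int count,
                               edge_mode mode, float *dst)
{
    for (int k = 0; k < count; ++k)
        dst[k] = lut[src[edge_index(first + k, width, mode)]];
}
```

With this in place the filtering code only ever sees linear floats
and never has to test for image edges itself. A 16-bit or float
decoder would fill the same buffer through a different function.
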
You then accumulate the appropriate weighted sums into
the output buffer, and move on to the next scanline of
the input.

The algorithm just described works directly for case (c).
Suppose you're downsampling by 2.5; then output scanline 0
sums from input scanlines 0, 1, and 2; output scanline 1
sums from 2, 3, 4; output 2 from 5, 6, 7; output 3 from 7, 8, 9.
Note how input scanlines 2 and 7 each get used twice, but we
don't have to recompute them because we can do things in a
single linear pass through the input and output at the same time.

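The scanline ranges in the 2.5x example can be reproduced with a
few lines of integer arithmetic. This is only a sketch: the name
box_input_span and the choice to express the scale as a ratio
num/den (5/2 for 2.5) are assumptions made to keep the example
free of floating-point rounding.

```c
#include <assert.h>

typedef struct { int first, last; } span;

/* For a box-filter downsample by num/den (num >= den), return the
   inclusive range of input scanlines that overlap output scanline
   `out`, i.e. the interval [out*num/den, (out+1)*num/den). */
static span box_input_span(int out, int num, int den)
{
    assert(out >= 0 && den > 0 && num >= den);
    span s;
    s.first = (out * num) / den;                      /* floor of start */
    s.last  = ((out + 1) * num + den - 1) / den - 1;  /* ceil of end, minus 1 */
    return s;
}
```

For num = 5, den = 2 this gives 0..2, 2..4, 5..7, and 7..9 for
output scanlines 0 through 3. Both ends of the range never move
backwards as `out` increases, which is what makes the single
linear pass possible.
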
Now, consider case (a). When upsampling, the same two input
scanlines will get sampled from for multiple output scanlines.
So, to avoid re-decoding the input scanlines, we need either
multiple input or multiple output temp buffer lines. Since
the number of output lines a given pair of input scanlines
might touch scales with the upsample amount, it makes more
sense to use two input scanline buffers. For cubic, you'll
need four scanline buffers, and in general the number of
buffers will be bounded by the max filter width, which is
presumably hardcoded.

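One way to manage those input scanline buffers is a small ring
indexed by scanline number. The sketch below only tracks which
scanline each slot holds and counts decodes; MAX_TAPS,
scanline_ring, and the simple out/factor mapping from output to
input lines are all assumptions for illustration, not the actual
sampling positions a real resizer would use.

```c
#include <assert.h>

#define MAX_TAPS 4   /* hypothetical hardcoded max filter height (cubic) */

typedef struct {
    int line[MAX_TAPS];  /* input scanline held by each slot, -1 if empty */
    int decodes;         /* how many decode operations were needed */
} scanline_ring;

static void ring_init(scanline_ring *r)
{
    for (int i = 0; i < MAX_TAPS; ++i) r->line[i] = -1;
    r->decodes = 0;
}

/* Return the slot holding input scanline y, decoding it only if it
   is not already cached. Valid as long as the filter never needs
   more than MAX_TAPS consecutive scanlines at once. */
static int ring_fetch(scanline_ring *r, int y)
{
    assert(y >= 0);
    int slot = y % MAX_TAPS;
    if (r->line[slot] != y) {
        /* A real resizer would decode scanline y into this slot's
           float buffer here. */
        r->line[slot] = y;
        r->decodes++;
    }
    return slot;
}

/* Linear upsample by an integer factor: each output line reads the
   input pair (y0, y0 + 1). Returns the number of decodes performed. */
static int count_decodes_linear_upsample(int out_lines, int factor)
{
    assert(factor >= 1);
    scanline_ring r;
    ring_init(&r);
    for (int out = 0; out < out_lines; ++out) {
        int y0 = out / factor;
        ring_fetch(&r, y0);
        ring_fetch(&r, y0 + 1);
    }
    return r.decodes;
}
```

For 8 output lines at 4x, only 3 input scanlines are decoded even
though 16 fetches are issued, so the decode cost no longer depends
on the upsample factor.
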
You want to avoid memory allocations (since you're passing
in the target buffer already), so instead of using a scanline-width
temp buffer, use some fixed-width temp buffer that's W pixels,
and scale the image in vertical stripes that are that wide.
Suppose you make the temp buffers 256 wide; then an upsample
by 8 computes 256-pixel-wide output strips (from ~32-pixel-wide
input strips), but a downsample by 8 computes ~32-pixel-wide
output strips (from a 256-pixel-wide input strip). Note this
limits the max down/upsampling to be ballpark 256x along the
horizontal axis.

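The strip arithmetic can be sketched as follows. TEMP_WIDTH,
strip_widths, and strip_for_factor are hypothetical names, and the
sketch ignores the few extra input pixels the filter footprint
needs at each strip boundary, which is why the text says "~32"
rather than exactly 32.

```c
#include <assert.h>

#define TEMP_WIDTH 256   /* fixed temp buffer width W from the example */

typedef struct { int in_width, out_width; } strip_widths;

/* Widths of one vertical strip when scaling horizontally by an
   integer factor. The larger side always fills the temp buffer;
   the smaller side is the buffer width divided by the factor. */
static strip_widths strip_for_factor(int factor, int upsample)
{
    /* A factor above TEMP_WIDTH would leave the smaller side with
       less than one pixel per strip, hence the ~256x limit. */
    assert(factor >= 1 && factor <= TEMP_WIDTH);
    strip_widths s;
    if (upsample) {
        s.out_width = TEMP_WIDTH;
        s.in_width  = TEMP_WIDTH / factor;
    } else {
        s.in_width  = TEMP_WIDTH;
        s.out_width = TEMP_WIDTH / factor;
    }
    return s;
}
```

An 8x upsample yields 32-wide input strips and 256-wide output
strips; an 8x downsample yields the reverse.
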