created stb_resample_ideas.txt

11 years ago · 63cce5c70a
parent bcefca10f7
commit 63cce5c70a
1 changed files with 62 additions and 0 deletions
--- a/docs/stb_resample_ideas.txt
+++ b/docs/stb_resample_ideas.txt
@@ -0,0 +1,62 @@
 Consider three cases just to suggest the spectrum
 of possibilities:
 a) linear upsample: each output pixel is a weighted sum
 of 4 input pixels
 b) cubic upsample: each output pixel is a weighted sum
 of 16 input pixels
 c) downsample by N with box filter: each output pixel
 is a weighted sum of NxN input pixels, where N can be very large
 Now, suppose you want to handle 8-bit input, 16-bit
 input, and float input, and you want to do sRGB correction
 or not.
 Suppose you create a temporary buffer of float pixels, say
 one scanline tall. Actually two temp buffers, one for the
 input and one for the output. You decode a scanline of the
 input into the temp buffer which is always linear floats. This
 isolates the handling of 8/16/float and sRGB to one place
 (and still allows you to make optimized 8-bit-sRGB-to-float
 lookup tables). This also allows you to put wrap logic here,
 explicitly wrapping, reflecting, or replicating-from-edge
 pixels that would come from off-edge.
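The decode step described above could be sketched roughly as follows. This is only an illustration for single-channel 8-bit input: the names `edge_mode`, `edge_index`, `make_linear_lut`, and `decode_scanline_u8` are hypothetical, and the lookup table here is a plain linear one; an sRGB-correcting decoder would fill the same 256-entry table with the sRGB-to-linear curve instead.

```c
#include <assert.h>

/* Hypothetical edge modes; the names are illustrative only. */
typedef enum { EDGE_CLAMP, EDGE_REFLECT, EDGE_WRAP } edge_mode;

/* Map a possibly off-edge coordinate x into [0, n). */
int edge_index(int x, int n, edge_mode mode)
{
    assert(n > 0);
    switch (mode) {
    case EDGE_WRAP: {
        int m = x % n;
        return m < 0 ? m + n : m;
    }
    case EDGE_REFLECT: {
        /* Mirror with the edge pixel repeated: 1 0 | 0 1 .. n-1 | n-1 n-2 */
        int period = 2 * n;
        int m = x % period;
        if (m < 0) m += period;
        return m < n ? m : period - 1 - m;
    }
    default: /* EDGE_CLAMP: replicate the edge pixel */
        return x < 0 ? 0 : (x >= n ? n - 1 : x);
    }
}

/* Fill a 256-entry table mapping 8-bit values to floats in [0, 1].
   Assumption: linear data. For sRGB input, store the sRGB-to-linear
   curve in the same table and the decode loop stays unchanged. */
void make_linear_lut(float lut[256])
{
    for (int i = 0; i < 256; ++i)
        lut[i] = (float)i / 255.0f;
}

/* Decode `count` pixels starting at x0 (which may lie off-edge) from
   one 8-bit scanline into a float temp buffer. */
void decode_scanline_u8(const unsigned char *row, int width, int x0,
                        int count, const float lut[256], edge_mode mode,
                        float *out)
{
    for (int i = 0; i < count; ++i)
        out[i] = lut[row[edge_index(x0 + i, width, mode)]];
}
```

Everything after this point only ever sees floats, so 16-bit or float input would need just another decode function of the same shape.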
 You then do whatever the appropriate weighted sums are
 into the output buffer, and you move on to the next
 scanline of the input.
 The algorithm just described works directly for case (c).
 Suppose you're downsampling by 2.5; then output scanline 0
 sums from input scanlines 0, 1, and 2; output scanline 1
 sums from 2,3,4; output 2 from 5,6,7; output 3 from 7,8,9.
 Note how 2 & 7 get reused, but we don't have to recompute
 them because we can do things in a single linear pass
 through the input and output at the same time.
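The 2.5x example can be checked with a small sketch. It is written for a single column of samples rather than whole scanlines, and the function names `box_source_range` and `box_downsample_1d` are made up for illustration. Integer arithmetic on the ratio n_in/n_out (10/4 for 2.5x) is an assumption chosen to avoid rounding trouble.

```c
#include <assert.h>

/* Which input samples overlap output sample y when n_in samples are
   box-filtered down to n_out? Output y covers [y*r, (y+1)*r) with
   r = n_in / n_out. */
void box_source_range(int y, int n_in, int n_out, int *first, int *last)
{
    assert(n_in > 0 && n_out > 0 && y >= 0 && y < n_out);
    *first = (int)(((long)y * n_in) / n_out);
    *last  = (int)((((long)y + 1) * n_in + n_out - 1) / n_out) - 1;
}

/* Single pass over the input: each input sample is visited once and
   added, weighted by its overlap, to every output sample it touches. */
void box_downsample_1d(const float *in, int n_in, float *out, int n_out)
{
    assert(n_in > 0 && n_out > 0 && n_out <= n_in);
    for (int k = 0; k < n_out; ++k)
        out[k] = 0.0f;

    int y = 0; /* first output sample that can still receive input */
    for (int i = 0; i < n_in; ++i) {
        /* Work in units of 1/n_out input samples so all edges are integers:
           input i covers [i*n_out, (i+1)*n_out),
           output k covers [k*n_in, (k+1)*n_in). */
        long in_lo = (long)i * n_out, in_hi = in_lo + n_out;
        while ((long)(y + 1) * n_in <= in_lo)
            ++y; /* output y is finished; move on */
        for (int k = y; k < n_out && (long)k * n_in < in_hi; ++k) {
            long lo = (long)k * n_in, hi = lo + n_in;
            long a = in_lo > lo ? in_lo : lo;
            long b = in_hi < hi ? in_hi : hi;
            out[k] += in[i] * (float)(b - a) / (float)n_in;
        }
    }
}
```

With n_in = 10 and n_out = 4, `box_source_range` gives 0..2, 2..4, 5..7, and 7..9, and input samples 2 and 7 each contribute to two outputs while being read only once.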
 Now, consider case (a). When upsampling, the same two input
 scanlines will be sampled for multiple output scanlines.
 So, to avoid recomputing the input scanlines, we need either
 multiple input or multiple output temp buffer lines. Since
 the number of output lines a given pair of input scanlines
 might touch scales with the upsample amount, it makes more
 sense to use two input scanline buffers. For cubic, you'll
 need four scanline buffers, and in general the number of
 buffers is bounded by the max filter width, which is
 presumably hardcoded.
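One way to picture the input-buffer idea is a small ring of decoded scanlines, sketched below for the linear case. The `row_ring` type, the function names, and the pixel-centre coordinate mapping are all assumptions made for illustration. The "decode" is a plain copy standing in for the real 8/16/float and sRGB handling, and the edge is handled by replication only.

```c
#include <assert.h>

#define RING_ROWS 2     /* 2 rows for linear; a cubic filter would need 4 */
#define RING_MAX_W 256  /* hypothetical fixed temp-buffer width */

typedef struct {
    float rows[RING_ROWS][RING_MAX_W];
    int src_y[RING_ROWS]; /* input scanline held by each slot, -1 if empty */
    int decode_count;     /* how many scanlines were actually decoded */
} row_ring;

void ring_init(row_ring *r)
{
    for (int i = 0; i < RING_ROWS; ++i)
        r->src_y[i] = -1;
    r->decode_count = 0;
}

/* Return decoded input scanline y, decoding it only if it is not cached. */
const float *ring_get(row_ring *r, const float *image, int w, int h, int y)
{
    assert(w > 0 && w <= RING_MAX_W && h > 0);
    if (y < 0) y = 0;          /* replicate-from-edge */
    if (y >= h) y = h - 1;
    int slot = y % RING_ROWS;
    if (r->src_y[slot] != y) {
        for (int x = 0; x < w; ++x) /* stand-in for the real decode */
            r->rows[slot][x] = image[(long)y * w + x];
        r->src_y[slot] = y;
        r->decode_count++;
    }
    return r->rows[slot];
}

/* Linear vertical upsample of a w x h_in float image to w x h_out. */
void upsample_rows_linear(const float *in, int w, int h_in,
                          float *out, int h_out, row_ring *r)
{
    for (int oy = 0; oy < h_out; ++oy) {
        /* Assumed pixel-centre mapping from output row to input position. */
        float src = ((float)oy + 0.5f) * (float)h_in / (float)h_out - 0.5f;
        int y0 = (int)src;
        if ((float)y0 > src) --y0; /* floor for negative positions */
        float t = src - (float)y0;
        const float *a = ring_get(r, in, w, h_in, y0);
        const float *b = ring_get(r, in, w, h_in, y0 + 1);
        for (int x = 0; x < w; ++x)
            out[(long)oy * w + x] = a[x] * (1.0f - t) + b[x] * t;
    }
}
```

Upsampling a 4-row image to 16 rows this way leaves `decode_count` at 4: each input scanline is decoded once even though it feeds several output scanlines.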
 You want to avoid memory allocations (since you're passing
 in the target buffer already), so instead of using a scanline-width
 temp buffer, use some fixed-width temp buffer that's W pixels,
 and scale the image in vertical strips that are that wide.
 Suppose you make the temp buffers 256 wide; then an upsample
 by 8 computes 256-pixel-width strips (from ~32-pixel-wide input
 strips), but a downsample by 8 computes ~32-pixel-width
 strips (from 256-pixel-wide input strips). Note this limits
 the max down/upsampling to be ballpark 256x along the
 horizontal axis.
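The strip arithmetic can be sketched as below. `TEMP_W`, `strip_out_width`, and `strip_in_width` are hypothetical names, and the sketch ignores the extra input pixels a filter wider than a box would need at each strip edge.

```c
#include <assert.h>

#define TEMP_W 256 /* hypothetical fixed temp-buffer width, in pixels */

/* Output pixels per strip such that both the output strip and the input
   span feeding it fit in TEMP_W. Returns 0 when the downsample ratio is
   too large for even a one-pixel output strip. */
int strip_out_width(int in_w, int out_w)
{
    assert(in_w > 0 && out_w > 0);
    if (out_w >= in_w)
        return TEMP_W; /* upsampling: the output strip is the wide side */
    return (int)((long)TEMP_W * out_w / in_w);
}

/* Input pixels spanned by an output strip of the given width (rounded up). */
int strip_in_width(int out_strip, int in_w, int out_w)
{
    assert(out_strip > 0 && in_w > 0 && out_w > 0);
    return (int)(((long)out_strip * in_w + out_w - 1) / out_w);
}
```

For a 100-to-800 upsample this gives 256-pixel output strips fed by 32-pixel input strips; for 800-to-100 it gives 32-pixel output strips fed by 256-pixel input strips. A ratio beyond 256x downsampling returns 0, which is the limit mentioned above.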