I love this! I tried to apply the same idea to scan the tallest tree in New England with a drone. It didn't come out great, but I might just try again now.
Here is how it came out: https://www.daviddegner.com/wp-content/uploads/2023/09/Tree-...
It was part of this story: https://www.daviddegner.com/photography/discovering-old-grow...
Does anyone know what it looks like when you use a line scan camera to take a picture of the landscape from a moving car or train? I suspect the parallax produces some interesting distortions.
It’s just a blur. Like the background of the photos in this article.
You can get some cool distortions at very slow speeds, but at car or train speeds you won’t see anything.
Reminds me of the early experiments with using a flat-bed scanner as a digital back. Here is one: https://www.sentex.net/~mwandel/tech/scanner.html
Iirc, at the last Olympics, Omega paired a high-frequency linear display with their finish-line strip cameras. Regular cameras saw a flashing line, but the backdrop to photo-finishes was an Omega logo. Very subtle, but impressive to pull off.
Wow, great article. I love the cable car photo https://upload.wikimedia.org/wikipedia/commons/e/e0/Strip_ph...
Must be somewhat interesting deciding on the background content, too.
IMO the denoising looks rather unnatural and emphasizes the remaining artifacts, especially the color fringing around details. Personally I'd leave that turned off. Also, with respect to the demosaic step, I wonder if it's possible to implement a version of RCD [1] for improved resolution without the artifacts that seem to result from the current process.
[1] https://github.com/LuisSR/RCD-Demosaicing
Yeah I actually have it disabled by default since it makes the horizontal stripes more obvious and it's also extremely slow. Also, I found that my vertical stripe correction doesn't work in all cases and sometimes introduces more stripes. Lots more work to do.
As for RCD demosaicing, that's my next step. The color fringing is due to the naive linear interpolation for the red and blue channels. But with the RCD strategy, since the green channel is sampled at twice the density of red and blue, we can use it as a guide to make their interpolation better.
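A minimal sketch of the green-as-guide idea (plain difference interpolation, not RCD itself; assumes an RGGB mosaic in a float array):

    import numpy as np
    from scipy.ndimage import convolve

    def demosaic_green_guided(raw):
        # raw: float (H, W) mosaic with RGGB layout
        h, w = raw.shape
        r_mask = np.zeros((h, w), bool); r_mask[0::2, 0::2] = True
        b_mask = np.zeros((h, w), bool); b_mask[1::2, 1::2] = True
        g_mask = ~(r_mask | b_mask)

        # Bilinear green at R/B sites: average of the 4 cross neighbors.
        k_cross = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]) / 4.0
        g = np.where(g_mask, raw, convolve(np.where(g_mask, raw, 0.0), k_cross))

        # Interpolate the smoother R-G / B-G differences instead of R and B
        # directly, then add green back: green's density guides the chroma.
        k_bilin = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
        def chroma(mask):
            return g + convolve(np.where(mask, raw - g, 0.0), k_bilin)

        return np.dstack([chroma(r_mask), g, chroma(b_mask)])

RCD proper adds directional discrimination on top of this, as I understand it, which is what suppresses the fringing along edges.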
When you do the demosaicing, and perhaps other steps, did you ever consider declaring the x-positions, spline parameters, ... as latent variables to estimate?
Consider a color histogram: the logo (showing color oscillations) would have a wider spread and a lower-peaked histogram, versus a correctly mapped one (just the few true colors, plus or minus some noise), which would show a very thin but strong peak in colorspace. A high-variance color occupation has higher entropy than a low-variance, strongly centered peak (or multi-peak) distribution.
So it seems colorspace entropy could be a strong term in a loss function for optimization (using RMAD, i.e. reverse-mode automatic differentiation).
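A rough sketch of the scoring term (hard-binned, so you'd need a soft/kernel histogram before reverse-mode AD applies; the bin count is arbitrary):

    import numpy as np

    def colorspace_entropy(img, bins=32):
        # img: float RGB in [0, 1], shape (H, W, 3).
        # Low entropy = color mass concentrated in a few tight peaks,
        # i.e. the correctly mapped case described above.
        hist, _ = np.histogramdd(img.reshape(-1, 3), bins=bins,
                                 range=[(0.0, 1.0)] * 3)
        p = hist.ravel() / hist.sum()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))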
Do you share some of the original raw recordings somewhere?
Yeah, I don't think the denoised result looks that good either.
I looked into line cameras for a project. I think their main application is in quality control of food on conveyor belts. There are plenty of automated sorting systems where imaging can become a bottleneck. One of the units I spec'd out could record an 8k-pixel line at up to 40k lines per second.
https://youtu.be/E_I9kxHEYYM
They are used in OCT (optical coherence tomography) as well.
OCT is a technique which sees "through" tissue using a beam in the near infrared (roughly 950 nm, with a spread of roughly 100 nm). The return is passed through an interferometer and what amounts to a diffraction grating to produce the "spread" that the line camera sees. After some signal processing (an FFT is a big part of it), you get the intensity at depth. If you sweep in X and Y somehow, usually by deflecting the beam with a mirror, you can obtain a volumetric image like an MRI or sonogram. Very useful for imaging the eye, particularly the back of the retina where the blood vessels are.
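A toy version of the FFT-to-depth step, with made-up numbers: a single reflector, an already-uniform wavenumber grid, and a known source spectrum to subtract:

    import numpy as np

    lam_min, lam_max, n = 900e-9, 1000e-9, 2048   # ~950 nm center, ~100 nm spread
    k = np.linspace(2 * np.pi / lam_max, 2 * np.pi / lam_min, n)  # wavenumber grid
    dk = k[1] - k[0]

    z = 300e-6                                    # reflector 300 um deep
    source = np.exp(-((k - k.mean()) / (0.2 * np.ptp(k))) ** 2)  # source spectrum
    spectrum = source * (1 + 0.5 * np.cos(2 * k * z))  # fringes on the line camera

    ascan = np.abs(np.fft.rfft(spectrum - source))     # background-subtract + FFT
    depth = np.pi * np.fft.rfftfreq(n, d=dk)           # cos(2kz) peaks at f = z/pi
    print(f"recovered depth: {depth[ascan.argmax()] * 1e6:.0f} um")  # ~300

Real systems also have to resample the spectrometer output onto a uniform k grid first, since the grating spreads wavelength, not wavenumber, linearly.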
It's neat that it captured the shadow of the subway train, too, which arrived just ahead of the train itself. This virtual shadow is thrown against a sort of extruded tube with the profile of the slice of track and wall that the slit was pointed at.
Fun read! I used to work in sensor calibration, and most people take for granted how much engineering goes into making phones take good photos. There’s a nontrivial amount of math and computational photography in the modern phone camera.
Super cool. I wonder if you could re-use a regular 2-D CMOS digital camera sensor to the same effect. But now I realize your sensor is basically 1-D and has a 95 kHz sampling rate. At the same rate with a 4k sensor you'd have way too much data to store and would need to throw most of it away.
Pretty sure you could do it, but it would be very expensive, because you'd need a lot more very fast ADCs.
Like if the camera is $5k: in order to get that exposure time full-field, you would need to duplicate the hardware 800 times, or whatever you wanted the horizontal resolution to be. That's a lot of zeros for a single camera.
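Back-of-envelope on why (the ~95 kHz line rate comes from upthread; the 2048-pixel line, 12-bit samples, and 800-column frame are made-up but plausible numbers):

    line_px, line_rate, bits = 2048, 95_000, 12     # one line-scan column
    cols = 800                                      # hypothetical frame width

    line_bps = line_px * line_rate * bits           # ~2.3 Gbit/s for the line
    full_bps = line_bps * cols                      # ~1.9 Tbit/s for a full frame
    print(f"{line_bps / 1e9:.1f} Gbit/s -> {full_bps / 1e12:.1f} Tbit/s, "
          f"{cols}x the readout hardware")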
If you like this sort of thing, check out https://www.magyaradam.com/wp/ too. A lot of his work uses a line scan camera.
The video [https://www.magyaradam.com/wp/?page_id=806] blew my mind. I can only imagine he reconstructed the video by first reconstructing one frame's worth of slits — then shifting them over by one column and adding the next slit's data.
None of the shots in that video use the slit-scan technique. It’s using a technique called Mean Stack Mode to get the average pixel value across multiple frames, over a rolling selection of an input video.
> Hmm, I think my speed estimation still isn’t perfect. It could be off by about 10%.
Probably would be worth asking a train driver about this, e.g. "where is a stretch with smooth track and constant speed?"
Maybe an optical flow sensor to estimate speed in real time?
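A sketch of that idea using phase correlation, assuming an auxiliary 2-D camera pointed at the passing train (my assumption, not something from the article):

    import numpy as np

    def estimate_shift(f0, f1):
        # Phase correlation: if f1 is f0 translated, the inverse FFT of the
        # normalized cross-power spectrum peaks at the translation.
        F0, F1 = np.fft.fft2(f0), np.fft.fft2(f1)
        cross = np.conj(F0) * F1
        corr = np.fft.ifft2(cross / (np.abs(cross) + 1e-9)).real
        dy, dx = np.unravel_index(corr.argmax(), corr.shape)
        # Map wrap-around peaks to signed shifts.
        if dx > f0.shape[1] // 2: dx -= f0.shape[1]
        if dy > f0.shape[0] // 2: dy -= f0.shape[0]
        return dx, dy   # pixels per frame; speed = dx * meters_per_px / dt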
They have an amazing painterly quality. I'm not a huge train fan but I'd put some of these on my wall.
That's a lot more than I thought I'd want to know about this, but I was totally nerd sniped. Great writeup.
What a beautiful example of image processing. Great post
These are amazing images. I don't understand what's going on here, but I do like the images.
Imagine a camera that only takes pictures one pixel wide. Now make it take a picture, for example, 60 times a second and append every pixel-wide image together in order. This is what's happening here, it's a bunch of one pixel wide images ordered by time. The background stays still as it's always the same area captured by that one pixel, resulting in the lines, but moving objects end up looking correct as they're spread out over time.
At first, I thought this explanation would make sense, but then I read back what I just wrote and I'm not sure it really does. Sorry about that.
No, thank you. This was perfect. It completely explains where the train comes from and where the lines come from.
Lightbulb on.
Aha achieved. (Don’t you love Aha? I love Aha.)
Yeah, like walking past a door that's cracked just a bit so you can see into an office only a slit. Now reconstruct the whole office from that traveling slit that you saw.
Very cool.
It made sense to me!
reading this is how I imagine it feels to be chatgpt
Okay I was stumped about how this works because it's not explained, as far as I can tell. But I guess the sensor array has its long axis perpendicular to the direction the train is traveling.
The analogue equivalent (a slit scan camera) is easier to understand, I think: https://www.lomography.com/magazine/283280-making-a-slit-sca... https://petapixel.com/2017/10/18/role-slit-scan-image-scienc...
You can also get close in software. Record some video while walking past a row of shops. Use ffmpeg to explode the video into individual frames. Extract column 0 from every frame, and combine them into a single image, appending each extracted column to the right-hand-side of your output image. You'll end up with something far less accurate than the images in this post, but still fun. Also interesting to try scenes from movies. This technique maps time onto space in interesting ways.
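With OpenCV you can even skip the explode-to-frames step and read the video directly; a minimal sketch, where "walk.mp4" stands in for your recording:

    import cv2
    import numpy as np

    cap = cv2.VideoCapture("walk.mp4")   # video shot while walking past shops
    columns = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        columns.append(frame[:, 0])      # one-pixel-wide slice per frame
    cap.release()

    # Stack the slices left to right: time becomes the horizontal axis.
    cv2.imwrite("slitscan.png", np.stack(columns, axis=1))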
Thanks, I added a section called "Principle of operation" to explain how it works.
Absolutely fascinating stuff! Thank you so much for adding detailed explanations of the math involved and your process. Always wondered how it worked but never bothered to look it up until today. Reading your page pushed it beyond idle curiosity for me. Thanks for that. And thanks also to HN for always surfacing truly interesting reading material on a daily basis!
What's your FPS/LPS in this setup? I've experimented with similar imaging with an ordinary camera, but LPS was limiting, and I know line-scan machine vision cameras can output some amazing numbers, like 50k+ LPS.
You use a single vertical line of sensors and resample "continuously". When doing this with film, the aperture is a vertical slit and you continuously advance the film during the exposure.
For "finish line" cameras, the slit is located at the finish line and you start pulling film when the horses approach. Since the exposure is continuous, you never miss the exact moment of the finish.
Line scan sensors are basically just scanners; heck, people make them out of scanners.
Usually the issue is they need rather still subjects, but in this case rather than the sensor doing a scanning sweep they're just capturing the subject as it moves by, keeping the background pixels static.
It only works for trains because the image of the train at t+1 is basically the image of the train at time t shifted over by a few pixels, right? It doesn't seem like this would work to capture a picture of a human, since humans don't just rigidly translate in space as they move.
If the human is running and doesn't frantically shake, it works decently. There are samples of horse race finish-line pics in the article, and they look pretty good IMHO.
It falls apart when the subject is either static or moves its limbs faster than the speed at which the whole subject moves (e.g. fist bumping while slowly walking past the camera would screw it up).
Depends what you're going for.
https://en.wikipedia.org/wiki/Slit-scan_photography#/media/F...