What's a good scan?

February 15, 2026

In my previous post, I talked about failed scans, but what's a good scan? What does it look like? How is it different from a photo? It's a question that comes back regularly as I work on FairScan, and I still don't have a clear answer. What I do know is that it quietly drives many of the strategic choices I make. I recently worked on improving how FairScan handles color documents, which was a good opportunity to revisit the question with the experience I've gained over the past months.

A scan is not a photo

A photo is about capturing a scene, something that happens within a given environment, with a particular atmosphere. It's largely about light: in fact, the word photo comes from the Greek word for light. Light comes from a source, either natural (the sun) or artificial (a lamp), and it has a direction. The way light falls on a scene, the way it highlights some parts and leaves others in the dark, plays a crucial role in the impression we get when looking at a photo.

A scan, on the other hand, tries to capture a document and only that, as if it were isolated from the world, under light that is as uniform as the surface of a screen. A flatbed scanner is a good reference for that ideal.

In comparison, a phone's camera has a major advantage: it fits in your pocket and is already connected to your digital world, making it easy to share documents instantly. But being able to take a photo of a document is very different from having a scan of that document.

What we expect from a scan

When you look at a scan, you probably have expectations that you don't consciously articulate. There is likely no absolute truth here, but working on FairScan forces me to identify what I believe are the most common ones:

Ideally, I think a good scan should look like the document as it appeared on a screen before it was printed. Sometimes it's more complex than that: elements can be added on paper later, like stamps or handwritten annotations. But in most cases, trying to reproduce the original digital document is a simple way to define a clear goal.

As a user, I don't want to think about how to adjust a photo to get a decent scan. Which parameters should I tweak? In what order? An app like FairScan should take care of that automatically. That's how I can get a usable scan in a few seconds, without effort.

Enhancing without betraying

A document scanning application starts by cropping a photo to remove everything that is not part of the document (see this blog post). The goal is then to "enhance" the photo to make it look like a scan, but without damaging it in the process.
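
To make the cropping step concrete, here is a minimal sketch of the classic contour-based approach in Python with OpenCV. It's a textbook technique, not necessarily what FairScan ships: detect edges, keep the largest contour that simplifies to four corners, and warp it to a flat rectangle.

```python
import cv2
import numpy as np

def crop_document(image):
    """Find the largest quadrilateral in a photo and warp it flat."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # Keep the largest contour that simplifies to four corners.
    for contour in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(contour, 0.02 * cv2.arcLength(contour, True), True)
        if len(approx) == 4:
            corners = order_corners(approx.reshape(4, 2).astype(np.float32))
            return warp_to_rectangle(image, corners)
    return image  # no document-like shape found; keep the photo as-is

def order_corners(pts):
    """Order corners as top-left, top-right, bottom-right, bottom-left."""
    s, d = pts.sum(axis=1), np.diff(pts, axis=1).ravel()
    return np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                     pts[np.argmax(s)], pts[np.argmax(d)]], dtype=np.float32)

def warp_to_rectangle(image, corners):
    """Map the four corners onto an axis-aligned rectangle of matching size."""
    tl, tr, br, bl = corners
    width = int(max(np.linalg.norm(br - bl), np.linalg.norm(tr - tl)))
    height = int(max(np.linalg.norm(tr - br), np.linalg.norm(tl - bl)))
    dst = np.array([[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
                   dtype=np.float32)
    M = cv2.getPerspectiveTransform(corners, dst)
    return cv2.warpPerspective(image, M, (width, height))
```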

When you see a document as part of a photo, as one element of the physical world, your brain tends to extrapolate what the document should look like. The surprise comes after cropping: the exact same pixels that made the document look white within a photo often don't look white anymore once everything around them is removed. What appeared to be white paper is in fact frequently composed of grey pixels (see another blog post).

To mitigate this, earlier versions of FairScan processed color documents using a very basic approach: increasing the brightness of all pixels by the same factor. Unsurprisingly, the result was sometimes too dark, sometimes washed out and unreadable. FairScan 1.14 uses a more careful strategy, increasing brightness based on image content rather than an arbitrary constant.
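
The release doesn't spell out the exact strategy, so here is one common content-based approach as a minimal sketch: estimate the paper's white point from a high percentile of luminance (the percentile choice is my assumption) and derive the gain from it.

```python
import cv2
import numpy as np

def normalize_brightness(image, white_percentile=95.0):
    """Scale the image so the estimated paper white maps to pure white.

    The white point is taken as a high percentile of luminance; FairScan's
    actual heuristic is not documented in this post.
    """
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    white_point = np.percentile(gray, white_percentile)
    if white_point < 1.0:
        return image  # nearly black capture; scaling would only amplify noise
    gain = 255.0 / white_point
    # One gain for all channels preserves color balance; clip to avoid wrap-around.
    return np.clip(image.astype(np.float32) * gain, 0, 255).astype(np.uint8)
```

Unlike a fixed factor, the gain adapts to each capture: a dim photo gets a strong push while an already bright one is barely touched.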

Making brightness homogeneous is also difficult. Reducing shadows involves evaluating brightness variations across the document and compensating for them. The challenge is that two things are mixed together: the lighting that reaches the document, with its direction and intensity, and the intrinsic reflectance of the document itself, which depends on its content. On a black-and-white document, how can an algorithm distinguish a shadow from an area that was intentionally printed in gray?
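
A standard way to separate the two, shown here as a generic technique rather than FairScan's actual algorithm, is to estimate the illumination field with a heavy blur and divide it out. The sketch also makes the ambiguity concrete: any gray area larger than the blur kernel is absorbed into the lighting estimate and washed out.

```python
import cv2
import numpy as np

def flatten_illumination(gray, kernel_size=51):
    """Reduce shadows by dividing out a smooth estimate of the lighting.

    Shown on a grayscale image for simplicity; the same idea applies per channel.
    """
    # The heavy blur keeps only slow brightness variations: our lighting estimate.
    background = cv2.GaussianBlur(gray.astype(np.float32), (kernel_size, kernel_size), 0)
    # Dividing by it pushes paper toward white while keeping fine content (text, lines).
    # Caveat: a uniform gray region wider than the kernel looks exactly like a shadow
    # to this estimate, so it gets flattened away too.
    flattened = gray.astype(np.float32) / np.maximum(background, 1.0) * 255.0
    return np.clip(flattened, 0, 255).astype(np.uint8)
```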

How I approach this in FairScan

I didn't know anything about image processing before working on FairScan. I learn techniques as I encounter concrete problems and look for ways to solve them. To compare algorithms or parameter choices, I usually test them side by side on a small set of images that cover a wide range of situations. Before shipping, I also review the results on hundreds of images from my dataset to make sure the changes actually move FairScan closer to what I consider a "good scan".
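
In that spirit, the side-by-side harness can be as small as a loop that writes both candidates next to each other; the module name, variant functions, and folders below are all hypothetical.

```python
from pathlib import Path
import cv2
import numpy as np

# Hypothetical module and variants; substitute the two pipelines under comparison.
from pipeline import enhance_v1, enhance_v2

def compare(input_dir="test-images", output_dir="side-by-side"):
    """Write each test image processed by both variants, side by side."""
    Path(output_dir).mkdir(exist_ok=True)
    for path in sorted(Path(input_dir).glob("*.jpg")):
        image = cv2.imread(str(path))
        if image is None:
            continue
        a, b = enhance_v1(image), enhance_v2(image)
        # Pad the shorter result so the two can sit next to each other.
        h = max(a.shape[0], b.shape[0])
        a = cv2.copyMakeBorder(a, 0, h - a.shape[0], 0, 0, cv2.BORDER_CONSTANT)
        b = cv2.copyMakeBorder(b, 0, h - b.shape[0], 0, 0, cv2.BORDER_CONSTANT)
        cv2.imwrite(str(Path(output_dir) / path.name), np.hstack([a, b]))
```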

It's also essential that the processing shipped in the app doesn't make it slower. FairScan should feel fast: when you capture an image, you expect to see the result almost immediately. This often means finding a trade-off between visual quality and performance on a mobile device. In practice, it's sometimes possible to gain significant performance with a loss in quality that very few people will notice.
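
One generic example of that kind of trade-off, not necessarily one FairScan makes: the illumination estimate from the earlier sketch is smooth by construction, so it can be computed on a downscaled copy and upsampled. At quarter resolution, the blur touches roughly sixteen times fewer pixels, with little visible difference in the result.

```python
import cv2
import numpy as np

def flatten_illumination_fast(gray, scale=0.25, kernel_size=15):
    """Same idea as flatten_illumination, with the blur run at reduced resolution."""
    small = cv2.resize(gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    background_small = cv2.GaussianBlur(small.astype(np.float32),
                                        (kernel_size, kernel_size), 0)
    # The lighting field is smooth, so bilinear upsampling loses almost nothing.
    background = cv2.resize(background_small, (gray.shape[1], gray.shape[0]),
                            interpolation=cv2.INTER_LINEAR)
    flattened = gray.astype(np.float32) / np.maximum(background, 1.0) * 255.0
    return np.clip(flattened, 0, 255).astype(np.uint8)
```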

So what is a good scan?

I still don't have a single, definitive answer. I believe it should be both readable and reasonably faithful to the original document, but I'm still exploring how to translate that into FairScan. Much of it comes down to product choices, especially deciding how far the app should go when it "enhances" images.

What I know for sure is that this has to remain compatible with what I'm trying to build: an app that produces PDFs that don't require manual adjustments. The work I want to spare users must be handled by FairScan itself, which means carefully engineering and fine-tuning its behavior. In the end, even if the app does a lot behind the scenes, that complexity should remain invisible. To users, FairScan should feel fast and simple. Always.