February 26, 2010
In reading about visual acuity recently, I noticed that different sources state different figures as the limit of detail that our eyes can resolve.
We’re talking here about the finest line spacings we can see in the center of our vision—the retina’s “fovea”—where our cone cells are most tightly packed. That’s reasonable, since our eyes are constantly scanning across anything we see, noticing significant details, to build up a complete impression.
The acuity unit often mentioned is “cycles per degree.” Each “cycle” is the same as a “line pair”—namely, one black/white pair taken together. You may find human acuity limits quoted as anywhere from 40 to 50 cycles per degree. (But 20/20 vision only corresponds to 30 cycles per degree.)
One reason for this uncertainty is the human contrast sensitivity function: finer spacings are perceived with much lower clarity. So the limit is not black and white; rather, it is literally “in a gray area.”
But if you’re curious, it’s pretty simple to test yourself.
All you really need is a nice long tape measure (50 feet is probably enough).
Draw two heavy lines with a Sharpie, with a gap between them the same width as the lines. Tack the paper to a wall, and hook your tape measure onto a convenient nearby door frame, etc.
Then start walking backwards. At some point you’ll find it becomes very difficult—then impossible—to see the white gap between the lines. Write down your tape-measure distance from the target.
Next, you need to measure the width of your “cycle” (one black line and the white gap). Convert this width into the same units as your tape-measure distance (feet, inches, meters, etc.).
In my case, I’d measured 36 feet on the tape measure; and my “cycle” was 0.0139 feet wide.
Divide the cycle width by the tape-measure distance, and you’ll get some tiny number. Now, to convert this to degrees, you need to take the “inverse tangent” (if you’ve lost your scientific calculator, try the bottom section of this online one).
That gives you the degrees per cycle. To get cycles per degree, divide one by that number.
I didn’t estimate any numbers beforehand; so I was pleasantly surprised that my measured distance translated neatly into 45 cycles per degree. That was good news about my eyesight (and my eyeglass prescription); and it perfectly split the range of acuity numbers I’d seen quoted.
But note that I did this test outdoors, on an overcast day. Moderately bright illumination like this gives the maximum visual acuity.
So I did a retest indoors, at an illumination level that might be more relevant to typical print-viewing conditions. Here, the lighting was 7 stops dimmer (1/128th as bright).
Perhaps not surprisingly, it became harder to judge the cut-off point where the white gap disappeared. But it definitely happened at a closer distance—roughly 28 feet.
Crunching those numbers, my indoor acuity dropped to 35 cycles per degree. This illumination level is probably more representative of the conditions used in eye-doctor tests; so being in the ballpark of 30 cycles per degree (20/20 vision) seems pretty plausible.
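If you’d rather let a script do the trigonometry, here’s the whole calculation in Python, plugging in my measured numbers from both tests:

```python
import math

def cycles_per_degree(cycle_width, distance):
    """Convert one line-pair width and a viewing distance
    (in the same units) into cycles per degree."""
    degrees_per_cycle = math.degrees(math.atan(cycle_width / distance))
    return 1 / degrees_per_cycle

# Outdoors: 0.0139 ft cycle width, gap vanished at 36 feet
print(round(cycles_per_degree(0.0139, 36)))   # 45

# Indoors, 7 stops dimmer: gap vanished at roughly 28 feet
print(round(cycles_per_degree(0.0139, 28)))   # 35
```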
Now remember from my earlier discussion that detectable detail doesn’t equate very well with subjectively significant detail.
But, sitting typing this, I have unconsciously adjusted my distance from my computer screen so that its height occupies about 32° of my field of vision.
If you think that’s a reasonable distance to view a photo print from, you can do a little math. Even losing some resolution to Bayer demosaicing, a digital camera of 12.5 megapixels would capture all the detail my acuity could possibly discern.
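Here’s a sketch of that math in Python, assuming my measured 45 cycles/degree, the 32° image height, a 3:2 frame, and the Nyquist minimum of two pixels per cycle (treating the degrees as linear, which is harmless here):

```python
acuity = 45           # cycles per degree (my outdoor measurement)
field_height = 32     # degrees of visual field the image height spans

cycles = acuity * field_height       # finest resolvable cycles over the height
pixels_high = 2 * cycles             # Nyquist: 2 pixels per cycle
pixels_wide = pixels_high * 3 // 2   # 3:2 aspect ratio
megapixels = pixels_high * pixels_wide / 1e6

print(pixels_high, pixels_wide)      # 2880 4320
print(round(megapixels, 1))          # 12.4
```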
February 24, 2010
This year’s PMA trade show in Anaheim is over now, without much to show for itself.
To keep my life simple and my blood pressure under control, I intend to ignore all new cameras with 1/2.3″ sensors.
It’s only ones with larger, high-ISO-friendly chips that interest me.
However in that category, few actual, working products got unwrapped. I did mention the Samsung TL500 already. But otherwise, there were vague statements about future possibilities and “intentions.”
Okay, Sigma announced the DP2s and its wide-angle sister the DP1x—modest evolutions of their earlier versions. The Foveon sensors remain (larger than Four Thirds), as do their superior non-zooming lenses. But we need to wait for reviews of these models’ handling and high-ISO performance.
Samsung confirmed its lens roadmap for the NX mount. But it will be months before we see their “wide pancake” 20mm f/2.8, which is a shame. Samsung promises lenses that are “stylish and iconic,” and I’ve always wanted to be iconic. Oh wait, that was “ironic.”
Ricoh announced that soon we’ll see two more “units” for its oddball GXR system. Again, the interesting one is the non-zoomer, with an APS-C sensor, coming later this year. But its 42e normal lens is an underwhelming f/2.5. How is this supposed to sell me on the GXR system?
Hopes for new Panasonic G-series µ4/3 bodies also came to nothing (despite persistent rumors that something new is on the way).
An Olympus rep was bold enough to suggest that DSLR mirrors may die soon. Reflex optical viewfinders have always been a challenge for Four Thirds cameras, since the smaller image makes the groundglass so tiny. The Olympus VF-2, with 1.4 million dots of resolution, has won over some doubters to electronic viewfinders.
As for other brands joining the EVIL bandwagon, a Nikon exec coyly said that mirrorless cameras were “one solution.” Sigma dropped a mention of its “plans” to build a mirrorless system around the Foveon sensor. But the biggest buzz came from Sony’s non-functioning model of a mirrorless APS-C-sensor compact:
First, let’s be clear: none of these would be compatible with Micro Four Thirds. Lenses for µ4/3 only cover an image circle of 21.65 mm. For Foveon you need 24.9 mm; and for APS-C it’s 28.4 mm.
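Those image-circle figures are just the sensor diagonals. A quick check, using commonly quoted sensor dimensions (treat the exact millimeter values as approximations):

```python
import math

# Width x height in mm; the required image circle is the diagonal
sensors = {
    "Micro Four Thirds": (17.3, 13.0),
    "Foveon (Sigma DP)": (20.7, 13.8),
    "APS-C":             (23.6, 15.8),
}

for name, (w, h) in sensors.items():
    print(f"{name}: {math.hypot(w, h):.2f} mm image circle")
```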
So, if these other mirrorless models come to market, their options for lenses could be fragmented, with only a few manufacturer-specific choices.
The Sony lens mount shown above is clearly a dummy: There are no bayonet tabs, or electrical contacts. Yet if it keeps that shallow register distance and wide throat (seemingly about 42 mm in their mockup) it will be much friendlier to lens adapters than the Samsung NX mount. Leica lenses on an affordable APS-C sensor, anyone?
Of course Sony has a history of imposing proprietary standards on customers (think Memory Stick or MiniDisc’s ATRAC). We shouldn’t assume the camera will even turn on, if it can’t find a properly-coded Sony lens on the front.
And Sony’s concept has no control dials visible at all. Maybe it would be a touchscreen-driven interface. Meh.
There was one bright note of hope for me in this PMA however. And that’s a bit of a convergence in comments from several different photo executives.
A Samsung VP expressed surprise how many NX10 buyers were opting for the 30mm pancake, rather than the kit zoom.
Actually this doesn’t surprise me at all: In any indoor lighting, the f/2 lens is vastly preferable to the zoom (which is 2 stops slower at 30mm).
And the pancake is ridiculously small. Plus, the little guy tests pretty well too.
Meanwhile, Sigma’s chief has started seeing that in Asian markets, even non-techie consumers are buying fast primes. They want that glamorous, shallow depth of field look—even for family snaps, or blogging what they cooked last night.
And Pentax’s Ned Bunnell chimed in:
“I don’t have to be paid to say this, I really enjoy our small compact cameras and I actually adore our Limited [prime] lenses. I’m not a zoom type of photographer, and so I love our 31mm, I love all of our compact lenses, because it suits the way I was trained as photographer.”
“[W]e are finding a lot of people who maybe are more serious photographers who have bought the K-x, and now on the forums are asking about our Limited lenses.”
Well all right then! Three different executives suddenly think prime lenses are good. (Photographers do seem to have limped along with them OK, during those first 85 years of film cameras.)
So maybe we have a groundswell on our hands: Say goodbye to chubby, dim zooms; and hello to small, perky, and bright primes!
But… er, Ned? The FA 31mm is an oversized holdout from the film era; it costs almost $1000.
The Sigma 30mm f/1.4 (newly-beloved of Asian moms) is $440.
So, why can’t Pentax build a small, fast “normal” for the APS-C format?
Better have a pancake and think it over.
February 23, 2010
Last week’s post, comparing excessive bits in music recording with megapixel overkill in cameras, drew more comments than usual.
So today, I will recklessly continue my strained analogy between audio and optics. I hope you’ll bear with me.
Anyone who has researched stereo gear or sound-recording equipment will have come across graphs of frequency response, like this one:
The horizontal axis shows the pitch of the sound, increasing from low notes on the left to high ones on the right. The wavy curve shows how the output signal rises and falls with these different sound frequencies.
In a mic like this, air pressure needs to physically shove a diaphragm and a coil of wire back and forth. When you get to the right end of the graph, with air vibrating 20,000 times per second, mechanical motion has trouble keeping up; the response curve nosedives.
Now, other brands of microphones exist where the frequency response is almost ruler-flat across the spectrum. So is this just a poor one?
In fact, you have to look at this response curve in the context of human hearing ability. As it happens, our ears don’t have a very flat frequency response either:
Note that the orientation of this graph is “upside down” compared to the first one. These curves show that our hearing is most sensitive to pitches around 3 or 4 kHz; tones of other frequencies must be cranked up higher to have subjectively equal loudness.
(Fletcher & Munson from Bell Labs published the first comprehensive measurements of this effect in 1933. So you’ll still hear mention of “Fletcher-Munson curves.” But the measurements have been redone several times since. Click the graph for a PDF of a recent, and hopefully definitive study.)
The upshot is, the rolled-off ends of this microphone’s frequency response may not be that obvious to our ears. And actually, the SM58 has become the standard stage mic for vocalists. Its ability to tolerate nightly abuse in a bar room or on a music tour offers a real-world advantage outweighing any lack of extended treble.
If you’ve ever seen live music, you’ve undoubtedly heard vocals through an SM58. Did the roll-off above 10 kHz change how you felt about the singer’s performance? I doubt it.
Last week I made a post about lens sharpness, referring to MTF graphs. While the little test target I posted on this blog has solid black & white bars (which was easiest for me to create), formal MTF testing presents a lens with stripes varying in brightness according to a sine function—just like audio test tones:
The brightness difference between the white and black stripes at a very low frequency defines “100% contrast.” Then, as you photograph more and more closely-spaced lines, you measure how the contrast ratio drops off, due to diffraction and aberrations.
Ultimately, at some frequency, the contrast percentage plummets into the single digits. At that point details become effectively undetectable.
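The “contrast percentage” here is typically the Michelson definition: brightest minus darkest, divided by their sum. A minimal sketch:

```python
import math

def michelson_contrast(samples):
    """Contrast of an intensity pattern: (max - min) / (max + min)."""
    hi, lo = max(samples), min(samples)
    return (hi - lo) / (hi + lo)

# One full period of a sinusoidal test target, sampled at 100 points
xs = [i * 2 * math.pi / 100 for i in range(100)]
target  = [0.5 + 0.5  * math.sin(x) for x in xs]   # full black-to-white swing
blurred = [0.5 + 0.05 * math.sin(x) for x in xs]   # after a lens wipes most of it out

print(michelson_contrast(target))    # 1.0 -> 100% contrast
print(michelson_contrast(blurred))   # 0.1 -> down in the single digits
```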
In fact you can draw a “spatial frequency response” curve for a lens—very much like our audio frequency response graph for the microphone:
(This chart comes from a deeply techie PDF from Carl Zeiss, authored by Dr. Hubert Nasse. It discusses understanding and interpreting MTF in great detail, if you’re curious.)
Unlike the earlier MTF graph I showed, here the horizontal axis doesn’t show distance from the center of the frame. Instead, it graphs increasing spatial frequencies (that is, stripes in the target getting more closely spaced) at a single location in the image.
The lower dotted curve shows how lens aberrations at f/2 cause contrast to drop off rapidly for fine details. The heavy solid line shows how contrast at f/16 would be limited even for a flawless lens, simply due to diffraction.
The lens performance at f/5.6 is much better. It approaches, but does not quite reach, the diffraction limit for that aperture. Results like this are representative of many real lenses.
Now, our natural assumption is that greater lens sharpness is always useful. But as I mentioned in my earlier post, our eyes have flawed lenses too (squishy, organic ones), limited by diffraction and aberrations. They have their own MTF curve, which also falls off as details become more closely spaced.
Might we say that our visual acuity has its own “Fletcher-Munson curves”?
To answer that I’m going to ask you to open a new tab, for this link at Imatest (a company who writes optical-testing software). This is their discussion of the human “Contrast Sensitivity Function.”
The top graph shows the approximate spatial frequency response for human vision. It shows that subjectively, we are most sensitized to details at a spacing of about 6–8 cycles per degree of our visual field.
Even more startling is the gray test image below. Notice how the darker stripes seem to fill a “U” shape?
In fact, at any given vertical height across that figure, every stripe has the same contrast. You are seeing your own spatial sensitivity function at work. It’s the middle spacings that we perceive most clearly.
You can have a bit of fun moving your head closer to and farther from your computer screen (changing the cycles/degree in your vision). Notice how your greatest sensitivity to the contrast moves from the left to the center of the image? (Ignore the right edge, where the target starts to alias against your screen’s pixel spacing.)
It might surprise you that there is a low-frequency roll-off to our vision’s contrast detection. After all, imagine filling your field of vision with ONE giant white stripe, and one black one—surely you’d notice that?
But, MTF measurements don’t use stripes with hard edges. Instead MTF uses smoothly-varying sinusoidal brightness changes, as I showed above.
Our eyes are great at finding edges. But smooth gradients over a large fraction of your visual field are much less obvious. Our retinas locally adjust their sensitivity—which is why we get afterimages from camera flashes or from staring at one thing for too long—and tend to subtract out those large-scale patterns.
Defining the whole human contrast sensitivity function precisely can get extremely complex. However we can make a few points:
The ultimate limit for human acuity might be at 45 or 50 cycles per degree (sources differ). But that spacing is right at the edge of perceivability.
It’s only under bright light that we achieve our greatest acuity; but we generally view photographs under dimmer indoor illumination. There, even spacings of 20 cycles per degree might show seriously degraded contrast.
In a second PDF, Dr. Nasse discusses a “Subjective Quality Factor” for lenses (see page 8). This is an empirical figure of merit, which integrates lens MTF over the spatial frequencies that our eyes find most significant. The convention is to use 3 to 12 cycles per degree—near our vision’s peak sensitivity.
My prior post on MTF mentioned that 10 and 30 lp/mm (on a full 24 x 36mm frame) were lens-test spacings chosen for their relevance to human vision. Actually, that was a slight oversimplification: I didn’t specify how the photographs would be viewed.
In fact, those criteria correspond to “closely examining” an image. If you put your eye very near to a print (a distance half the image diagonal) the photo practically fills your view. It’s about the limit before viewing becomes uncomfortable. In those conditions, the 10 and 30 lp/mm of lens tests translate to 4 and 12 cycles per degree of your vision.
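That translation is easy to check. Here’s a sketch assuming a full-frame capture viewed from half its diagonal; the print’s enlargement scales the viewing distance and the line spacing equally, so it cancels out and we can work at sensor scale:

```python
import math

diagonal = math.hypot(36, 24)   # full-frame diagonal, ~43.3 mm
distance = diagonal / 2         # "closely examining": half the diagonal away

# How many millimeters one degree of vision spans at that distance
mm_per_degree = distance * math.tan(math.radians(1))

for lp_mm in (10, 30):
    print(f"{lp_mm} lp/mm -> {lp_mm * mm_per_degree:.1f} cycles/degree")
```

The arithmetic lands at about 3.8 and 11.3 cycles per degree, close to the 4 and 12 quoted above.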
The conditions of “regular” viewing (at a distance maybe twice the image diagonal) relax the detail requirements significantly.
So how much detail do our photos really, truly need?
Ultimately, that’s a personal question, which photographers must answer for themselves. It depends how large you print, and how obsessively you examine.
But the eye’s own limited “frequency response” suggests we may not always need to worry about it.
February 21, 2010
Yes, I have been keeping one eye on the product introductions at PMA 2010. But anyone waiting for some industry-rattling blockbuster has likely been disappointed so far.
It seems that a few camera manufacturers are conspicuously snubbing the show this year. The “poor economy” is the glib explanation; but I wonder if there’s some more complex backstory we haven’t heard yet.
There is a new international imaging exhibition in Japan called CP+, launching in Yokohama on March 11th. Camera makers may feel that the Asian market is the center of their future growth; and so that’s where the promotional dollars (er, yen) and media attention would be best targeted.
From the perspective of this blog, the one PMA unveiling of note is the Samsung TL500 (to be called EX1 in Europe). A friend asked me if this would change my “crackhead” assessment of Samsung; and the answer is “a little.”
The TL500 and its closest rivals (Panasonic’s LX3 and Canon’s S90) are all 10 megapixel models with roughly 2.0 micron pixel size. That’s significant, since each pixel grabs 60-100% more light than the tiny pixels used in mainstream point & shoots.
Panasonic’s LX3 is more compact than the TL500. But Samsung has matched its extrawide 24e lens coverage, while eking out an extra 1/3 stop of lens brightness.
Some complained that the LX3’s zoom range only extended to 60e (which is hardly even a portrait lens); and was only f/2.8 at that point. The new Samsung stretches that to 72e even while maintaining f/2.4 brightness.
Before anyone geeks out about the selective-focus potential here, remember that’s about equivalent to the DOF at f/11 if you were using an APS-C lens covering the same 72e.
If you need more telephoto reach than that, the Canon S90 gives you 105e, in a smaller package—albeit at a cost of a couple of f/stops. Another deficiency of the S90 is its lack of a hot shoe, unlike the others.
The styling of the Samsung is a bit unusual—rather angular and “brutalist.” I think I’d need to handle the TL500 in person to see whether that bothered me.
I’m amused that those chamfered ends seem to echo Kodak’s old (German-made) Retina cameras, the series that introduced the 35mm film cassette to the world.
In their day, Retinas were considered quite desirable precision models. Most offered superior Schneider Xenar or Xenon lenses. Alongside the even swankier offerings of Leica and Contax, they helped cement 35mm film as a “miniature” format which could be taken seriously.
The TL500’s lens also carries the Schneider name (its manufacture is undoubtedly Asian, of course). And the physical size of the Samsung is a very close match to early Retinas, too.
February 19, 2010
The stats show that my post, “the Great Megapixel Swindle,” continues to get quite a lot of traffic. That’s a little unnerving, given that it was written as a quick, off-the-cuff tantrum. If I’d known how many folks would read it, I would have said several things more precisely.
Around the internet, the “Swindle” spawned many, many discussion threads—I can’t keep track of them all. I did try to respond to many of the questions and misunderstandings that I was seeing, in a followup post here. And I’ve expanded on the same issues in many other posts as well.
Today I’ve noticed a thread over at Rangefinder Forum which raises the question, “isn’t it unfair to use a crop from the background? Naturally that looks bad, since it’s out of focus.”
First, the real point this example makes is this: Cameras with tiny pixels must use aggressive post-processing to reduce noise; and this can cause strange, unnatural-looking artifacts. (There’s more on the subject here.)
But on the question of depth of field, I should clarify a bit.
The EXIF data for this shot shows the Olympus FE-26 was set at the wide end of its zoom range—namely 6.3mm. Such a short focal length implies extreme depth of field. The f/stop used was f/4.6.
The H. Lee White is a 700-foot-long Great Lakes freighter. I’m not exactly sure how far the camera was from the subject; but it’s doubtful that it was closer than 15 feet.
Now, we can’t blindly apply standard depth-of-field tables here. The standard calculation (e.g. if you scroll down to Olympus FE-26 here) uses a circle of confusion of 0.005 mm for this sensor format.
But when you are looking at such an extreme enlargement, the assumptions behind that break down (CoC is generally referenced to viewing an 8×10 print at a moderate distance).
But consider that this camera has a sensor size of about 6 x 4.5mm, and so each individual pixel is 1.53 microns wide. In other words, 0.00153 mm.
Clearly, the meaningful circle of confusion can’t be smaller than one pixel. Given the resolution loss that happens with Bayer interpolation, 0.002 mm seems a realistic CoC.
So the out-of-focus blur is actually negligible compared to the pixel size in this case.
You can use an alternate depth of field calculator which lets you input arbitrary values if you’d like to explore this further yourself.
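For anyone who wants to see the arithmetic, here’s a sketch using the thin-lens hyperfocal approximation, with the 6.3mm and f/4.6 from the EXIF data and the 0.002 mm pixel-scale CoC suggested above:

```python
def hyperfocal_mm(focal_mm, f_number, coc_mm):
    """Thin-lens hyperfocal distance: focused here or beyond,
    everything from about half this distance to infinity renders
    within the given circle of confusion."""
    return focal_mm ** 2 / (f_number * coc_mm) + focal_mm

h = hyperfocal_mm(6.3, 4.6, 0.002)   # FE-26 at its wide end
print(f"{h / 304.8:.1f} feet")       # 14.2 feet
```

Since the camera was focused at 15 feet or more, beyond that ~14-foot hyperfocal distance, depth of field at this CoC extends all the way to infinity.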
February 19, 2010
Once upon a time, the way photo-geeks evaluated lens quality was in terms of pure resolution. What was the finest line spacing which a lens could make detectable at all?
Excellent lenses are able to resolve spacings in excess of 100 lines per millimeter at the image plane.
But unfortunately, this measure didn’t correlate very well with how crisp or “snappy” a lens looked in real photos.
The problem is that our eyes themselves are a flawed optical system. We can do tests to determine the minimum spacing between details that it’s possible for our eyes to discern. But as those details become more and more finely spaced, they become less clear, less obvious—even when theoretically detectable.
The aspects of sharpness which are subjectively most apparent actually happen at a slightly larger scale than you’d expect, given the eye’s pure resolution limit.
This is the reason why most lens testing has turned to a more relevant—but unfortunately much less intuitive—way to quantify sharpness, namely MTF at specified frequencies.
An MTF graph for a typical lens shows contrast on the vertical axis, and distance from the center of the frame on the horizontal one. The black curves represent the lens with its aperture wide open. Color means the lens has been stopped down to minimize aberrations, usually to f/8 or so. (I’ll leave it to Luminous Landscape to explain the dashed/solid line distinction.)
For the moment, all I want to point out is that there’s a thicker set of curves and a thinner set.
The thinner curves show the amount of contrast the lens retains at a very fine subject line spacing. The thicker ones represent the contrast at a somewhat coarser line spacing. (That’s mnemonically helpful, at least.)
The thick curves correspond well to our subjective sense of the “snap,” or overall contrast that a lens gives. Good lenses can retain most of the original subject contrast right across the frame. Here, this lens is managing almost 80% contrast over a large fraction of the field, even wide open. Very respectable.
The thin curves correspond to a much finer scale—i.e. in your photo subject, can you read tiny lettering, or detect subtle textures?
You can see that preserving contrast at this scale becomes more challenging for an optical design. Wide open, this lens is giving only 50 or 60% of the original subject contrast. After stopping down (thin blue curves), the contrast improves significantly.
When lenses are designed for the full 35mm frame (as this one was) it’s typical to use a spacing of 30 line-pairs per millimeter to draw this “detail” MTF curve.
And the industry’s choice of this convention wasn’t entirely arbitrary. It’s the scale of fine resolution that seems most visually significant to our eyes.
So if that’s true… let’s consider this number, 30 lp/mm, and see where it takes us.
A full-frame sensor (or 35mm film frame) is 24mm high. So, a 30 lp/mm level of detail corresponds to 720 lines over the entire frame height.
The number “720” might jog some HDTV associations here. Remember the dispute about whether people can see a difference between 720 and 1080 TV resolutions, when they’re at a sensible viewing distance? (“Jude’s Law,” that we’re comfortable viewing from a distance twice the image diagonal, might be a plausible assumption for photographic prints as well.)
But keep in mind that 30 line pairs/mm (or cycles/mm in some references) means a black stripe and a white stripe per pair. So if a digital camera sensor is going to resolve those 720 line pairs, it must have a minimum of 1440 pixels in height (at the Nyquist limit).
So we would probably need an extra 1/3 more pixels to get clean resolution: 1920 pixels high, then.
In a 3:2 format, 1920 pixels of height implies 2880 pixels of width. Do you see where this is going?
Multiply those two numbers and you get roughly 5.5 megapixels.
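The whole chain, in a few lines of Python (with the 1/3 margin and 3:2 shape as above):

```python
line_pairs = 30 * 24                # 30 lp/mm over a 24 mm frame height = 720
nyquist_high = 2 * line_pairs       # 1440 pixels minimum
clean_high = nyquist_high * 4 // 3  # ~1/3 extra margin -> 1920 pixels
clean_wide = clean_high * 3 // 2    # 3:2 frame -> 2880 pixels

print(round(clean_high * clean_wide / 1e6, 1))      # 5.5 megapixels

# Doubling linear resolution (60 lp/mm) quadruples the count:
print(round(4 * clean_high * clean_wide / 1e6, 1))  # 22.1 megapixels
```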
Now, please understand: I am not saying there is NO useful or perceivable detail beyond this scale. I am saying that 5 or 6 Mp captures a substantial fraction of the visually relevant detail.
There are certainly subjects, and styles of photography, where finer detail than this is essential to convey the artistic intention. Anselesque landscapes are one obvious example. You might actually press your nose against an exhibition-sized print in that case.
But if you want to make a substantial resolution improvement—for example, capturing what a lens can resolve at the 60 lp/mm level—remember that you must quadruple, not double the pixel count.
And that tends to cost a bit of money.
February 17, 2010
I’m sorry I called you a crackhead, really.
It was just a little joke. Can we still be friends?
Recently, we all learned that your Lumix GH1 has the best sensor of any Micro Four Thirds camera. That’s great!
And I think the native multi-aspect-ratio feature is awesome too. (You do that nicely on several cameras, like the LX3.)
The GH1 has great HD video capabilities, and I understand that the zoom is optimized for this purpose (with quieter motors, etc.)
But for stills photography, the GH1 also has class-leading high ISO performance. Available-light shooters would surely appreciate a smaller body that can still deliver the goods at ISO 800.
I’m one of them.
So as an alternative, why not also bundle the GH1 with your excellent 20mm f/1.7 pancake?
The 20mm is over two and a half stops faster! And presumably, you could offer a GH1 kit for a lot less money then. Like $400 less.
You might sell a few extra GH1s that way.
February 16, 2010
Today, some musings that are a bit more (har har) abstract.
Sometimes we become numbed by the perpetual escalation of tech specs. The mindset of the computer industry, where each new generation promises more, faster & bigger, seems to be the new normal.
And it can become a self-fulfilling prophecy. If it is technically possible to ratchet up some specs number, inevitably we’ll choose to. That’s what keeps people buying!
But there’s one nice example of an industry that dangled ever-increasing numbers in front of consumers—who then yawned and said “no thanks.”
How many DVD-Audio or “Super Audio CD” titles do you own? (Okay, I’m sure someone out there is an enthusiastic adopter. But I mean, the average person.)
The original standard for music CDs uses 44,100 samples per second. Each sound sample has 16 bits; meaning it encodes the full range from soft to loud with about 65,000 discrete levels.
The CD standard was adopted around 1980; so at that time there was pressure to keep the bit rate low enough so that the player electronics would not be prohibitively expensive.
No one knew how data-handling ability would explode over the following decades. When you see a speed “60x” or “133x” on a camera memory card, those are referenced against the original CD-player (or, 1x CD-ROM) bit rate.
So in the audiophile world there were grumbles from the start that the CD bit rate was insufficient.
With only 16 bits, the quietest elements of music (like note decays and room ambience) must be recorded with somewhat coarse resolution. It’s similar to how shadow areas in digital photos can look noisier than highlights. A standard that used 20 bits or so would have preserved more fine “texture” in low-amplitude sounds.
And before digitizing sound for CD, any frequency of 22,050 cycles/second or higher must get chopped off. A high-pitched rising chirp that went past that frequency would make the A/D converter miss the true peaks and troughs of the wave; it would misrepresent it as a falling note, back down in the audible spectrum.
For CD audio, 22 kHz is the “Nyquist Limit,” and you need an “antialias filter” to block higher frequencies (yes, these exact same principles apply to digital-camera sensors). But there were complaints that the antialias filters degraded sound quality.
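The fold-back is easy to compute. Here’s a sketch, just the reflection rule for the apparent frequency of a sampled tone (not a full resampling demo):

```python
def alias_frequency(f, sample_rate):
    """Apparent frequency after sampling: fold f into [0, Nyquist]."""
    f = f % sample_rate
    return min(f, sample_rate - f)

# A chirp rising past the 22,050 Hz Nyquist limit comes back DOWN:
for tone in (20000, 30000, 40000):
    print(tone, "Hz ->", alias_frequency(tone, 44100), "Hz")
```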
And while most adults can’t hear steady pitches much above 16 kHz, very brief clicks at a higher theoretical frequency might contribute some “edge” to percussion and note attacks. A lot of professionals preferred to record at 48 kHz instead. (You can go higher today, though bandwidth limitations in the analog realm become significant.)
Folks who produce music may have reasons for 24-bit sampling (tracks are often put through computation-intensive effects; you don’t want rounding errors), but 20-bit delivery covers an excellent dynamic range. Even taking a 20-bit master and dithering it down to a 16-bit release version can work well.
I apologize to all of you whose eyes glazed over during those past few paragraphs. The truth is, most of us found CD sound quality perfectly adequate.
If you do the math, the bit rate used for CD sound is about 1.4 Mbit/sec (in stereo). A reasonable standard that would have handled any outstanding quality issues might have been 20 bits at 48 kHz. That works out to 1.9 Mbit/sec.
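For the record, here’s the arithmetic (the SACD figure comes from its 1-bit, 2.8224 MHz DSD encoding):

```python
def stereo_bitrate(sample_rate, bits):
    """Uncompressed PCM/DSD bit rate for two channels, in bits/sec."""
    return sample_rate * bits * 2

print(stereo_bitrate(44100, 16) / 1e6)    # CD: ~1.41 Mbit/s
print(stereo_bitrate(48000, 20) / 1e6)    # a "fixed" CD standard: ~1.92
print(stereo_bitrate(2822400, 1) / 1e6)   # SACD (DSD): ~5.64
print(stereo_bitrate(192000, 24) / 1e6)   # DVD-A maximum: ~9.2
```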
But the arrival of DVD technology offered a huge increase in disc capacity. It was a great opportunity to sell a newer, zingier, whizzier-spec music format too.
So a “format war” broke out, but based on bit rates that were sort of crazy. Stereo in Sony’s SACD standard burns up 5.6 Mbit/sec—quadrupling the CD rate. DVD-A is a “family” of standards (a standard that doesn’t standardize); but its highest supported stereo rate is 9.2 Mbits/sec!
A leading writer in the digital-audio field once told me in an email that the reasons for these bit rates had more to do with “quieting the lunatic fringe” than with any technical justification.
But the public treated these new audio formats with indifference. SACD has found a niche in classical music, but most folks are completely unaware of it. (Even though, soon enough, millions did go out to buy a new disc format: Blu-ray.)
You all know what actually happened with music: Buying it online, and being able to take it everywhere in your pocket, totally changed the game.
So with downloadable music, the bit rate actually plunged instead. Today we’re buying music that uses 1/5th or even 1/8th the old CD standard! (Psychoacoustically intelligent data compression makes it possible.)
So… does this have anything to do with photography?
Camera manufacturers today (even those making enthusiast models) continue to use megapixels as the spec that defines “improvement.” Every year, more bits!
This shows a depressing lack of creativity. Past the point where this offers any real value, it’s just mindlessly chasing a number.
What we need is a serious rethink—to create something so novel and desirable that any talk about pixel count becomes irrelevant.
My feeling is that a game-changer for digital cameras is radical improvements in low-light capability. (This parallels the opinions of that recent Gizmodo article.)
We’ve suffered through decades of terrible point & shoots—whose slow zooms and limited sensitivity demanded nuclear-blast electronic flash for every indoor shot.
Flash is blinding, conspicuous, annoying to bystanders, and quite rightly prohibited at most museums and concerts. It drags out the lag time before shots.
It’s also a form of lighting which makes people look like shit.
Consider the fraction of our days we spend indoors, often under marginal illumination. But living rooms, restaurants, etc.—isn’t this where our real lives happen? Wouldn’t it be amazing to record those moments realistically, accurately, but without blinding and ugly flash?
What if you had a camera that could shoot at ISO 1600, cleanly? What about a camera where anti-shake let you trust shooting at 1/15th sec.? Plus a lens of f/1.7 or f/1.4—scooping up four times as much light?
Then, people could take photos freaking anywhere. Without flash. The light of a single candle is enough! (That’s LV 2, if you’re wondering.)
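You can check that candlelight claim with the standard exposure equation. Here's a sketch, assuming the usual convention that LV equals EV at ISO 100, so that EV = LV + log2(ISO/100) and N²/t = 2^EV:

```python
import math

def shutter_time(light_value, iso, f_number):
    """Required shutter time (seconds), assuming LV = EV at ISO 100."""
    ev = light_value + math.log2(iso / 100)
    return f_number ** 2 / 2 ** ev

t = shutter_time(2, 1600, 1.7)   # single candle, fast lens, clean high ISO
print(f"1/{1 / t:.0f} sec")      # roughly 1/22 sec
```

That's comfortably faster than the 1/15th sec. an anti-shake system would let you trust, so the numbers hang together.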
Now, to get a lens that fast, we’d probably need to lose the zoom.
Oh noes! Cameramakers’ second most-flogged spec number is zoom range. Our 12x is better than their 10x! I can hear the howls already: “Who would buy a camera without a zoom?”
Well, how many people use cell phones as their main camera today? Those have no optical zoom.
Some used to ask, “who would pay for a compressed MP3 when you can own the real physical disc?”
People who appreciate convenience. People who want technology to fit into their real lives. Us.
February 15, 2010
DxO Labs has released their sensor test results for two Micro Four Thirds cameras, the Panasonic GF1 and the Olympus E-P2.
Since DxO provides a handy comparison feature, here’s a link comparing both cameras with an APS-C sensor DSLR, the Nikon D5000.
Note that the D5000 is Nikon’s second-cheapest DSLR, which you can get in a kit for $750. These two µ4/3 models cost more. (Although you can find the Olympus E-P1 for a bit under $700 now.)
I’m not “cherry picking” the D5000 for any particular reason (it is not the best-performing APS-C sensor DSLR). It’s just a current-technology APS-C model which matches the 12 megapixels of the µ4/3 models.
According to DP Review, the sensor used in the cheaper Pentax K-x is very similar.
But the larger APS-C sensor, at the same 12 megapixels, simply means larger pixels (5.5 microns wide, versus 4.3). And if you click the “dynamic range” and “SNR 18%” tabs, you can see what a huge difference this makes.
For any given noise level, the APS-C camera gains nearly one whole f/stop of ISO sensitivity. The dynamic range is two stops greater.
As always with DxO tests, note that they evaluate just the sensor, based on raw images. They ignore any differences between different cameras’ JPEG processing quality, or any consideration of camera handling, etc.
February 14, 2010
The first thing to understand about picture noise (aka grain, speckles) is that it’s already present in the optical image brought into focus on the sensor.
Even when you photograph something featureless and uniform like a blank sky, the light falling onto the sensor isn’t creamy and smooth, like mayonnaise. At microscopic scales, it’s lumpy & gritty.
This is because light consists of individual photons. They sprinkle across the sensor at somewhat random timings and spacings. And eventually you get down to the scale where one tiny area might receive no photons at all, even as its neighbor receives many.
Perhaps one quarter of the photons striking the sensor release an electron—which is stored, then counted by the camera after the exposure. This creates the brightness value recorded for each pixel. (There is also a bit of circuit noise sneaking in, mostly affecting the darkest parts of the image.)
But no matter how carefully a camera is constructed, it is subject to photon noise—sometimes called “shot noise.” You might also hear some murmurs about Poisson statistics, the math describing how this noise is distributed.
When you start from a focused image that’s tiny (as happens with point & shoot cameras), then magnify it by dozens of times, this inherent noise becomes much more noticeable:
In fact, the only way to reduce the speckles is to average them out over some larger area. However, the best method for doing this requires some consideration.
The most obvious solution is this: You decide what is the minimum spatial resolution needed for your uses (i.e., what pixel count), then simply make each pixel the largest area permissible. Bigger pixels = more photon averaging.
Let’s recall that quite a nice 8 x 10″ print can be made from a 5 Mp image. The largest inkjet printers bought by ordinary citizens print at 13 inches wide; 9 Mp suffices for this, at least at arm’s length. And any viewing on a computer screen requires far fewer pixels still.
The corollary is that when a photographer does require more pixels (and a few do), you increase the sensor size, not shrink the pixels. For a given illumination level (a particular scene brightness and f/stop) the larger sensor will simply collect more photons in total—allowing better averaging of the photon noise.
But say we take our original sensor area, then subdivide it into many smaller, but noisier, pixels. Their photon counts are bobbling all over the place now! The hope here is that later down the road, we can blend them in some useful way that reduces noise.
One brute-force method is just applying a small-radius blur to the more pixel-dense image. However this will certainly destroy detail too. It’s not clear what advantage this offers compared to starting from a crisp, lower-megapixel image (for one thing, the file size will be larger).
Today, the approach actually taken is to start with the noisier high-megapixel image, then run sophisticated image-processing routines on it. Theoretically, smart algorithms can enhance true detail, like edges; while smoothing shot noise in the areas that are deemed featureless.
This is done on every current digital camera. Yet it must be done much more aggressively when using tiny sensors, like those in compact models or phone-cams.
One argument is that by doing this, we’ve simply turned the matter into a software problem. Throw in Moore’s law, plus ever-more-clever programming, and we may get a better result than the big-pixel solution. I associate the name Nathan Myhrvold with this form of techno-optimism (e.g. here). Assuming the files were saved in raw format, you might even go back and improve photos shot in the past.
But it’s important to note the limits of this image-processing, as they apply to real cameras today.
Most inexpensive cameras do not give the option of saving raw sensor files. So before we actually see the image, the camera’s processor chip puts it through a series of steps:
Bayer demosaicing —> Denoising & Sharpening —> JPEG compression
The problem is that photon noise affects pixels randomly—without regard to their assigned color. If it happens (and statistically, it will) that several nearby “red” pixels end up too bright (because of random fluctuations), the camera can’t distinguish this from a true red detail within the subject. So, false rainbow blobs can propagate on scales much larger than individual pixels:
The next problem is that de-noising and sharpening actually tug in opposite directions. So the camera must make an educated guess about what is a true edge, sharpen that, then blur the rest.
This works pretty well when the processor finds a crisp, high-contrast outline. But low-contrast details (which are very important to our subjective sense of texture) can simply be smudged away.
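A toy one-dimensional version makes that failure mode concrete. This is purely an illustrative sketch (the threshold value and the 3-tap blur are arbitrary choices of mine, not any camera's actual algorithm):

```python
def edge_aware_smooth(signal, edge_threshold):
    """Toy 1-D denoiser: keep samples next to strong gradients ("edges"),
    and replace everything else with a 3-tap average ("blur the rest")."""
    out = list(signal)
    for i in range(1, len(signal) - 1):
        local_contrast = max(abs(signal[i] - signal[i - 1]),
                             abs(signal[i + 1] - signal[i]))
        if local_contrast < edge_threshold:      # deemed featureless
            out[i] = (signal[i - 1] + signal[i] + signal[i + 1]) / 3
    return out

# A faint texture (amplitude 2) sitting next to a hard edge (a step of 50):
scan = [10, 12, 10, 12, 10, 60, 60, 60]
print(edge_aware_smooth(scan, edge_threshold=5))
```

The hard step from 10 to 60 survives untouched, but the faint 10/12 texture is flattened toward a uniform ~11: exactly the kind of low-contrast detail that gets smudged away.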
The result can be a very unnatural, “watercolors” effect. Even when somewhat sharp-edged, blobs of color will be nearly featureless inside those outlines.
Or, combined with a bit more aggressive sharpening, you might get this,
Clearly, the camera’s guesses about what is true detail can fail in many real-world situations.
There’s an excellent technical PDF from the DxO Labs researchers, discussing (and attempting to quantify) this degradation. Their research was originally oriented towards cell-phone cameras (where these issues are even more severe); but the principles apply to any small-sensor camera dependent on algorithmic signal recovery.
Remember that image processing done within the camera must trade off sophistication against speed and battery consumption. Otherwise, camera performance becomes unacceptable. And larger files tax the write-speed and picture capacity of memory cards; they also take longer to load and edit in our computers.
So there is still an argument for taking the conservative approach.
We can subdivide sensors into more numerous, smaller pixels. But we should stop at the point which is sufficient for our purposes, in order to minimize reliance on this complex, possibly-flawed software image wrangling.
And when aberrations & diffraction limit the pixel count which is actually useful, the argument becomes even stronger.