From the Editor’s Desk: Why DxOMark scores are basically worthless
DxOMark controversy is back in the news this week, but the problem with the mobile camera rating system isn’t as simple as manufacturers ‘buying’ inflated scores.
This week, following a small amount of fanfare, the OnePlus 5 nabbed a DxOMark Mobile score of 87. A day later, as if timed perfectly to demonstrate the flaws of the rating system, DxO gave the LG G6 an 84. And the internet is suitably riled up. Comment threads suggest something untoward has happened as a result of OnePlus's recently announced partnership with DxO. Reddit is swimming in incredulous anger.
Let me start by saying I don’t think DxO has allowed however much money changed hands between it and OnePlus to influence the objectivity of its testing. Nobody is directly buying or selling higher benchmark scores — that would be crazy. Nevertheless, it’s become clear that as a basis for judging whether one smartphone camera is better than another, the firm’s numbered scores are, at best, flawed.
DxO’s overall scores are taken from an average of sub-scores for exposure and contrast, color, autofocus, texture, noise, artifacts and stabilization. There’s a brief explainer (dating all the way back to 2012) detailing how DxO generates these mobile scores, apparently showing a mix of automatic and perceptual testing, the latter involving a human using the phone out in the real world. For the automated tests, DxO relies on software like its own DxO Analyzer, which is used by the world’s top camera makers to gauge image quality.
The specifics of DxO’s partnership with OnePlus (and other manufacturers like HTC and Google) haven’t been publicly disclosed. But presumably, it’s this software, along with other testing equipment, that the imaging teams at these phone makers get access to.
Single numbered scores for phone cameras are at once too vague and too specific.
Firstly, let’s address the flaws of using a single number to sum up the entire mobile camera experience. Reducing a smartphone camera to a percentage score has the problem of being at once too vague and too specific. A number (a non-weighted average) doesn’t do justice to the complexity of modern smartphone cameras, where performance can vary widely depending on the situation, and not all factors are equally important. At the same time, a score out of 100 implies precision. The OnePlus 5, Huawei P10 and Samsung Galaxy S6 edge+ are all equally good, the numbers tell us. The LG G6 and Moto G4 Plus are also equal, both with DxO scores of 84. Anyone who’s used these devices out in the real world will tell you the reality is not even close.
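To see why a flat average obscures so much, here’s a toy sketch in Python. The sub-scores below are made up for illustration (they are not DxO’s actual numbers, and DxO’s real rounding and weighting may differ), but they show how a non-weighted mean lets two very different cameras land on the same overall score:

```python
# Hypothetical sub-scores, for illustration only: a non-weighted
# average lets two very different cameras tie on the overall number.

def overall_score(subscores):
    """Non-weighted mean of the sub-scores, rounded to a whole number."""
    return round(sum(subscores.values()) / len(subscores))

# Camera A: consistently decent across the board.
camera_a = {"exposure": 84, "color": 84, "autofocus": 84, "texture": 84,
            "noise": 84, "artifacts": 84, "stabilization": 84}

# Camera B: excellent in good light, but weak on noise and stabilization.
camera_b = {"exposure": 95, "color": 92, "autofocus": 90, "texture": 88,
            "noise": 70, "artifacts": 85, "stabilization": 68}

print(overall_score(camera_a))  # 84
print(overall_score(camera_b))  # 84, an identical score for a very different camera
```

Both cameras come out at 84, even though one would frustrate you every time the light drops. That is the sense in which a single number is too vague; the one-point gaps between near-identical phones are the sense in which it is too specific.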
Meanwhile, DxO rates the Galaxy S6 edge+ as an 87, whereas the Galaxy Note 5 is an 86. Both phones have the same internal hardware and camera modules. There’s a one-point variance between these two phones, which in imaging terms are identical. There’s also a one-point difference between a Samsung Galaxy S8 and a Sony Xperia Z5, which are light-years apart in real-world performance.
This underscores the craziness of putting stock in these single numbered scores for phone cameras, particularly when the same variance can exist between two physically identical cameras and two very, very different ones. DxO scores may well serve as a decent benchmark for the raw capabilities of each camera (personally, I think even that is debatable — see the LG G6 vs Moto G4 Plus example above), but they also have the effect of muddying important details around real-world use.
DxOMark scores often don’t line up with reality, telling us the LG G6 is only as good as last year’s Moto G4 Plus.
DxO’s written reviews are far more informative — actually digging down into the important details that make a phone camera good or bad. Yet it’s the numbers that manufacturers flaunt in their marketing materials, and which DxO displays hierarchically on its sidebar.
Next, there’s the very obvious problem, and potential conflict of interest, in selling phone makers the very testing hardware you eventually use to rate their performance in public. The idea, presumably, is to let phone makers improve their cameras through repeatable, scientific tests of image quality. But this also has the effect of allowing OEMs to “teach to the test.”
DxO’s partnership model allows OEMs to ‘teach to the test’
Like a wily student preparing for a standardized test, manufacturers who partner with DxO, and get access to its hardware and software, can tune their image processing to ace the firm’s synthetic tests (within the limits of the hardware, of course). As a result, their review scores are higher when DxO eventually publishes them, because they’ve had access to the testing equipment all along. Manufacturers who don’t partner with DxO are at an automatic disadvantage in terms of their score, even though real-world, outside-the-lab image quality might not be substantially different. When that happens, as it inevitably will, consumers who put faith in comparisons between partner and non-partner scores are misled.
That’s how we end up with scores that tell you the Moto G4 Plus is as good as the LG G6, which is worse than the OnePlus 5.
And that’s where the potential conflict of interest arises. Partner with DxO, and your phone has the opportunity to max out its eventual review score — and if it’s a new flagship phone, maybe steal the crown with a new high score. Manufacturers who don’t license DxO’s stuff compete on an uneven playing field.
All that being said, DxOMark scores do often match up with the observations of experienced tech reviewers. The firm correctly called the Google Pixel the best smartphone camera of 2016. And I think most reviewers would agree that the HTC U11 has, by a slender margin, probably the best phone camera released in 2017 to date.
But that shouldn’t excuse the egregious cases where DxOMark scores don’t line up with reality, the most recent of which tells us that LG’s flagship phone of 2017 is only as good as a year-old Motorola mid-ranger. For directly comparing two or more phones, these scores can be pretty much worthless.
Bottom line: DxO’s reviews are informative and well-researched. But those numbered scores? Forget ’em.
Other odds and ends for a working weekend:
- We’re a month out from Android O being finalized, and surely just days away from finding out what “O” stands for. Oatmeal Cookie is apparently the internal codename — don’t read too much into that; internal codenames are often different to public nicknames. But really what other options are there besides Oreo? If Google doesn’t ink some kind of partnership with the cookie giant, Oatmeal Cookie might end up winning by default.
- On a related note, big win for Sony if it can be first with Android O on a new phone, as the rumor mill would seem to suggest.
- DxOMark controversy and jelly scrolling hullabaloo aside, I’m enjoying the OnePlus 5, and I’ll have a second opinion piece coming up early next week. The camera is pretty good. Not great. Not bad. I award it 85 AleXMarks.
- Because I’m about to do some traveling, and Google still insists on charging crazy money for the Pixel C, I have an iPad Pro on the way to me. (Don’t judge.) I’ve used iPads on and off over the years — almost inevitably selling them after a few months. We’ll see if the latest one fares any better, and how it eventually adapts to iOS 11.
- Google surely has a convertible of its own on the way this fall. I’m eagerly awaiting the Android tablet space being less of a wasteland than it currently is.
That’s it. More hot takes from me in a few weeks.