Standard ASTM D4265 Stain Monitor Courtesy Delltech
I'm frequently asked why I don't start testing laundry detergents myself and instead rely on test results from Consumer Reports, Wirecutter and Jeeves_NYC.
The answer is twofold: it's too resource-intensive for this one-man show and the standard tests are fundamentally flawed.
Let's dig in.
In the US, the standard method for laundry detergent testing is ASTM D4265. In this test, a single stain monitor on a particular fabric gets washed with eight pounds of ballast fabrics to simulate a typical washload. The monitor swatches are virgin textiles; the ballast fabrics can either be virgin textiles or reused textiles washed in a specific manner to reset them to new condition. Either the ballast textiles are then soiled in a prescribed manner, or a liquid that simulates normal human soiling is added to the wash load.
The color and intensity of each stain on the monitor is recorded before and after washing, and stain removal performance is scored from the percentage reduction in the color intensity of each stain, expressed here as the share of the stain left behind. So if a particular stain is visibly reduced 75% in intensity, the score is 25. In a perfect world, all stains being removed to bare fabric would result in 0 for each swatch.
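The scoring arithmetic described above can be sketched in a few lines. This is only an illustration of the convention used in this post (score = percent of the stain left behind), not the calculation any lab actually runs; the function name and the "arbitrary intensity units" are my own assumptions.

```python
def residual_stain_score(intensity_before: float, intensity_after: float) -> float:
    """Percent of the original stain intensity left after washing (0 = bare fabric).

    Intensities are assumed to be measured relative to clean fabric,
    in whatever units the instrument reports (hypothetical here).
    """
    if intensity_before <= 0:
        raise ValueError("intensity_before must be positive")
    return 100.0 * intensity_after / intensity_before


# Hypothetical readings: a stain that starts at 40 and ends at 10
# has lost 75% of its intensity, so 25% remains.
print(residual_stain_score(40.0, 10.0))  # 25.0
```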
Consumer Reports, Wirecutter and Jeeves all use some variation on this method. The particular stain swatches vary, the way they weight the importance of individual stains varies, and the specific wash conditions (temperature, equipment) aren't common between them, but each is internally consistent. All of them are reliable test methods that should provide consistent, reproducible results when comparing two products tested by the same organization. It's harder to compare results across testing platforms because of the variations in which stains get tested and how they are weighted. This results in some interesting quirks, with products that test extremely well at one organization testing very poorly at another.
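To see how weighting alone can flip a ranking, here is a toy example. Every number in it is made up for illustration: the products, the stain scores and both weighting schemes are hypothetical and do not come from any of the organizations mentioned.

```python
# Hypothetical residual-stain scores (lower is better) for two made-up products.
scores = {
    "Product A": {"grass": 10, "blood": 40, "sebum": 30},
    "Product B": {"grass": 35, "blood": 15, "sebum": 30},
}

# Two hypothetical weighting schemes, standing in for two different testers.
weights_org_1 = {"grass": 0.6, "blood": 0.2, "sebum": 0.2}
weights_org_2 = {"grass": 0.2, "blood": 0.6, "sebum": 0.2}


def weighted_score(stain_scores, weights):
    """Weighted average of per-stain scores; weights need not sum to 1."""
    total = sum(weights.values())
    return sum(stain_scores[stain] * w for stain, w in weights.items()) / total


def ranking(weights):
    """Product names sorted best (lowest weighted score) to worst."""
    return sorted(scores, key=lambda name: weighted_score(scores[name], weights))


print(ranking(weights_org_1))  # Product A ranks first
print(ranking(weights_org_2))  # Product B ranks first
```

Same wash results, same products, opposite winner, purely because one tester cares more about grass and the other more about blood.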
Each test should be repeated a few times and in varying water hardnesses to truly represent how well a detergent copes with varying wash conditions, and optimally would be tested in both conventional and HE machines.
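A quick way to see why this gets resource-intensive is to enumerate the test matrix for a single product. The hardness levels and repeat count below are placeholders I picked for illustration, not values from any standard.

```python
from itertools import product

# Hypothetical conditions for one product under test.
hardness_ppm = [0, 150, 300]          # assumed soft / moderate / hard, ppm as CaCO3
machines = ["conventional", "HE"]     # both machine types mentioned above
repeats = 3                           # "a few times"; assumed value

test_matrix = [
    {"hardness_ppm": h, "machine": m, "repeat": r}
    for h, m, r in product(hardness_ppm, machines, range(1, repeats + 1))
]

print(len(test_matrix))  # 18 full ballasted wash loads for a single product
```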
Stains aren't the only thing that happens to laundry. One could argue that the primary reason we wash clothing isn't stain removal, but broader soil and odor removal.
Visual methods cannot, in my opinion, adequately reflect the necessary full spectrum of cleaning performance - that textiles come out looking, feeling and smelling clean after ordinary washing. It's incredibly hard to adequately quantify these three aspects of clean, and I wouldn't propose that they necessarily could be - smell is incredibly difficult to measure quantitatively, as an example.
But one quantifiable aspect of cleaning performance that affects all three dimensions of clean is sebum removal, and I have serious doubts about how ASTM D4265 measures this fundamental factor in whether consumers actually get clean clothing. Since fresh sebum is essentially clear, the monitors for sebum are necessarily artificially pigmented, and the hope is that lower pigment intensity means that the carrier sebum was removed as well. I believe this is less axiomatic than proponents of this testing method hope. Anecdotally, Redditors and I have seen plenty of products that remove visible stains but leave fabrics less than completely degreased, leading to texture and odor problems that require rewashing or more intensive treatment to resolve. Further, even very low retained sebum levels lead to poor textile properties, especially on modern polyesters, which are not part of typical stain monitor panels.
I would approach detergent testing from a couple of perspectives.
The most important change is to acknowledge that complete removal of less-visible soils like sebum and other oily components prone to rancidity or polymerization has the most significant impact on how laundered clothing looks, feels and smells. I would propose that a superior method would directly measure the percentage of a known quantity of a particular oil on monitors that is removed in a wash process and would look something like this:
1. Soil two monitor fabrics, a cellulosic and a modern lobed polyester, in a standard way with precisely measured quantities of a reference sebum substitute. This pairing would reflect a typical load more accurately than the 80s-style polyester-cotton woven monitors typically used. Oily soil retained by lobed polyester is one of the biggest sources of laundry malodor and texture issues.
2. Wash the monitors in a properly ballasted standardized load with the product under test.
3. Separately solvent-extract both monitors using a solvent that is highly effective against the reference sebum substitute.
4. Measure the residual oil after distilling off the solvent.
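The number that falls out of the steps above is a simple mass balance. The sketch below assumes the residue mass has already been corrected for whatever the solvent pulls out of unsoiled fabric (a blank), and the masses shown are invented for illustration.

```python
def percent_oil_removed(applied_mg: float, residual_mg: float) -> float:
    """Percent of the applied reference oil removed by the wash.

    applied_mg:  oil dosed onto the monitor before washing
    residual_mg: oil recovered by solvent extraction after washing
                 (assumed already blank-corrected)
    """
    if applied_mg <= 0:
        raise ValueError("applied_mg must be positive")
    if residual_mg < 0:
        raise ValueError("residual_mg cannot be negative")
    return 100.0 * (applied_mg - residual_mg) / applied_mg


# Hypothetical results for the two monitor fabrics: (applied mg, residual mg).
monitors = {
    "cotton": (500.0, 25.0),
    "lobed polyester": (500.0, 100.0),
}

for fabric, (applied, residual) in monitors.items():
    print(f"{fabric}: {percent_oil_removed(applied, residual):.1f}% removed")
```

Reporting the two fabrics separately matters: in this made-up example the product looks fine on cotton and leaves a fifth of the oil behind on polyester, which a single averaged score would hide.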
This style of testing is admittedly expensive misery - organic chemistry is more dangerous and time-intensive than photometric methods, but it's also much more sensitive to this important aspect of cleaning performance. The vast majority of problems people bring to r/Laundry are about "invisible" oily soils, and most of those people are using reputable name-brand detergents at more or less the label dose - a test procedure that says "used as directed, this product either does or does not completely degrease clothing" would be a substantial advance.
Another point of differentiation in testing that isn't being well-addressed is how well detergents handle hard water. Half of American households do laundry in water that is hard enough to affect wash performance at a standard product dose, and stain-removal-focused methods have two blind spots when it comes to water hardness.
The first is that the chemistries that handle many kinds of stain removal are relatively insensitive to water hardness. Enzymes and oxygen bleaches target many reference stains and they could not care less about how much calcium is in the wash water. So a detergent with little ability to neutralize wash water hardness while preserving surfactant activity can still do decently at stain removal according to photometric methods, even though it does poorly at sebum removal.
The second blind spot is that some detergent ingredients are actively problematic in hard water - soaps / saponified oils make soap scum when used in hard water without sufficient chelating buffers, and some precipitating buffers form mineral residues that collect in some fabrics.
A superior test suite for hard water compatibility would directly measure how much free calcium and magnesium remains after a measured dose of the product is added to a known quantity of water of known composition, followed by a filtration step that mimics wash water draining through dense dark fabrics, to reveal insoluble soil formation.
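For the free calcium and magnesium measurement, the results could be boiled down to a single "percent of hardness neutralized" figure. The sketch below converts both ions to the usual "ppm as CaCO3" basis using molar-mass ratios; the function names and the example concentrations are my own assumptions, and the filtration step is not modeled.

```python
# Conversion factors to mg/L as CaCO3 (ratios of molar masses).
CA_TO_CACO3 = 2.497  # 100.09 / 40.08
MG_TO_CACO3 = 4.118  # 100.09 / 24.305


def hardness_as_caco3(ca_mg_per_l: float, mg_mg_per_l: float) -> float:
    """Total hardness in mg/L as CaCO3 from free Ca and Mg concentrations."""
    return CA_TO_CACO3 * ca_mg_per_l + MG_TO_CACO3 * mg_mg_per_l


def percent_hardness_neutralized(before, after) -> float:
    """Percent of the starting hardness no longer present as free Ca/Mg.

    before, after: (free Ca mg/L, free Mg mg/L) measured before and
    after dosing the product.
    """
    h_before = hardness_as_caco3(*before)
    h_after = hardness_as_caco3(*after)
    if h_before <= 0:
        raise ValueError("starting water has no measurable hardness")
    return 100.0 * (h_before - h_after) / h_before


# Hypothetical test water and post-dose readings (mg/L free Ca, mg/L free Mg).
water_before = (80.0, 20.0)
water_after = (8.0, 2.0)

print(f"{hardness_as_caco3(*water_before):.0f} ppm as CaCO3 before dosing")
print(f"{percent_hardness_neutralized(water_before, water_after):.0f}% neutralized")
```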
As with the other testing organizations, I believe photometric stain removal is an important part of overall detergent performance testing, so that would be included.
A read of these other results, a review of the ingredient list and a few washes with a chosen product gives you a reasonable place to start improving your laundry routine.