That is not what I said. A bit more context (with added emphasis) makes it clear that each of the samples is an example of what the particular methodology created or could have created:
Do note that Randall’s own estimate is that “horse correct battery staple” has 44 bits of entropy. Absent evidence discrediting Randall’s calculation, I do not feel it fair to conclude that “horse ... onion” has 0+13 bits, although I would accept that 44+13 is not “close enough” to 64 to be considered similar.
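For anyone who wants to check the arithmetic, here is a rough sketch of where those figures come from. It assumes each word is drawn uniformly and independently from a fixed list, and that the 44-bit figure corresponds to four words at 11 bits each (a 2048-word list). The function name is mine, purely for illustration.

```python
import math

def passphrase_entropy_bits(num_words: int, wordlist_size: int) -> float:
    """Entropy in bits of num_words drawn uniformly and independently
    from a list of wordlist_size words (an assumed selection model)."""
    return num_words * math.log2(wordlist_size)

# Four words at 11 bits each (2**11 = 2048 words, assumed list size).
four_words = passphrase_entropy_bits(4, 2048)

# Adding the 13 bits credited to one extra word, as in the figures above.
combined = four_words + 13

# Shortfall against a 64-bit target, expressed as a factor in guesses.
gap_bits = 64 - combined
gap_factor = 2 ** gap_bits

print(four_words, combined, gap_bits, gap_factor)
```

Under those assumptions, 44+13 falls 7 bits short of 64, which is a factor of 128 in guessing effort.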
Help me understand how one determines that “horse correct battery staple” contributes zero bits of entropy because it is pwned (11 times), but “accurate cocoa heroics impulsive last” enjoys 13 bits of entropy contributed by “accurate” despite it having been pwned 3,500 times.
Derating based on the pwnage of substrings is something I have never heard of. Nor do I believe it scales. The letter “a” has been pwned 750k+ times, yet we would not severely derate every password/phrase that contains an “a”. And since all other letters appear to be pwned, things rapidly disintegrate as every possible substring is considered.
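To make the scaling concern concrete, here is a small sketch. The `PWNED` set is a made-up stand-in for a real breach corpus (just the 26 lowercase letters), and the function names are hypothetical. It is not how any actual checker works.

```python
import string

# Hypothetical stand-in for a breach corpus: every single lowercase letter.
PWNED = set(string.ascii_lowercase)

def substring_positions(s: str) -> int:
    """Number of (start, end) substring positions in s, i.e. n(n+1)/2."""
    n = len(s)
    return n * (n + 1) // 2

def pwned_substrings(s: str, pwned: set) -> set:
    """Every substring of s that also appears in the pwned set."""
    subs = {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}
    return subs & pwned

def whole_password_pwned(s: str, pwned: set) -> bool:
    """Whole-string check: only an exact match counts."""
    return s in pwned

phrase = "accurate cocoa heroics impulsive last"
print(substring_positions(phrase))                 # candidates to look up
print(len(pwned_substrings(phrase, PWNED)) > 0)    # substring rule flags it
print(whole_password_pwned(phrase, PWNED))         # whole-string rule does not
```

With even this tiny corpus, the substring rule flags any phrase containing a lowercase letter, while the whole-string rule only flags exact matches. The number of substrings to consider also grows quadratically with length.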
Derating based on the entire password being pwned is something I could get behind; derating based on a substring seems nonsensical.