Nobody Defined Success
When did anyone last tell you what a successful evaluation looks like?
Not what makes it rigorous. Not what makes it relevant. Not what makes it independent. What makes it a success.
Think back to the last evaluation you commissioned or oversaw. Was there a moment — in the terms of reference, the kick-off meeting, the steering group — where someone said, in plain language, this is what success will look like?
Probably not (success is not a word the profession uses about its own work). And if there was, it almost certainly said something like “a rigorous, independent, methodologically sound evaluation that produces credible findings.”
That sounds like a definition of success. It isn’t. It’s a definition of quality. And the gap between the two is where most evaluations quietly fail.
I recently reviewed fifty evaluation frameworks from governments, multilaterals, foundations, and professional associations, looking for one thing: explicit definitions of evaluation success.
I found almost none.
What I found instead was a near-universal substitution: success ≈ quality, and quality ≈ research-style methodological rigour. The OECD-DAC criteria, the UNEG norms, the CDC standards — all excellent on what makes evaluation good, all silent on what makes it successful.
Quality is an attribute of the work. Success is a judgement about whether the work achieved what it was for. A study can be impeccably designed and still be the kind of experience no one wants to repeat. A study can also be methodologically modest and still leave people glad they commissioned it.
Frameworks don’t help you tell the difference. They evaluate the evidence, not the evaluation.
This is not a writing problem. It is a structural one. When success is silently equated with quality, trade-offs become invisible. Asking for an evaluation faster, cheaper, or narrower feels like asking for less success. So everyone pretends the constraints aren’t there — until the project is already in trouble.
Every evaluation lives inside four constraints at once: epistemic, strategic, temporal, and economic. A useful definition of success has to speak to all four.
It has to include credibility, because evidence no one trusts isn’t evidence.
It has to include usefulness, because an evaluation that informs no decision is orphaned research.
It has to include timeliness, because evidence that arrives after the decision window has closed can no longer inform that decision.
And it has to include efficiency, because evaluation resources are real resources, and an evaluation that costs more than the value of the decision it informs is a strategic loss, not a methodological win.
Credibility. Usefulness. Timeliness. Efficiency. CUTE.
I’ll unpack each of these — and the trade-offs between them — in future posts.
For now, the question is not: Was the evaluation rigorous?
The question is: What would success have looked like before the evaluation even began?
If you can’t answer that in one sentence, you don’t have a definition of success. You have a definition of quality, and a hope that quality will be enough.
It usually isn’t.

