3 min read · ai · engineering management · leadership

Measuring Impact vs. Output

For years, I measured developers, myself, and my peers by how much we produced. Features shipped. Bugs fixed. Pull requests merged. That worked when writing code was done by hand and took time.

AI changed that. A developer using AI can now do in one afternoon what used to take a week. So “how much did you get done?” is no longer the right question.

AI Is a Basic Expectation Now

I don’t take developers seriously if they refuse to use AI tools. It’s like a construction worker who won’t use power tools. The skill is real, but the speed isn’t there.

If everyone on a team produces more code than ever, you need a different way to find your best people.

Output Was a Proxy for Value

I used to measure output because it was the closest thing I had to measuring value. More features shipped meant progress. More bugs fixed meant an effective developer. Those connections were always loose. Now they’re breaking.

The better question is: what impact did this work have on the business? Not how much work was done. What actually changed because someone spent their time on this task instead of another one?

I’m not good at measuring that yet. That’s a gap in my skills as a manager.

What My Team Is Trying

We built a point system. Tasks get ranked as small, medium, or large based on business impact, not technical difficulty. A large task is worth about 20 times the points of a small one.

The goal is to measure who has the most impact on the business.
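As a minimal sketch, the scoring could look like this. The small and medium weights below are placeholders I made up for illustration; the only fixed constraint is that a large task is worth roughly 20 times a small one.

```python
# Placeholder weights: only the ~20x large-to-small ratio is fixed.
POINTS = {"small": 1, "medium": 5, "large": 20}

def impact_score(tasks):
    """Sum impact points for a list of (task, size) pairs."""
    return sum(POINTS[size] for _, size in tasks)

completed = [
    ("fix checkout bug", "small"),
    ("migrate billing service", "large"),
    ("add usage dashboard", "medium"),
]
print(impact_score(completed))  # 26
```

Notice that one large task outweighs a whole sprint of small ones, which is exactly the behavior we wanted: it rewards picking the right work, not doing lots of it.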

I don’t fully agree with this approach, for a couple of reasons. The weights feel arbitrary, and categorizing tasks still feels subjective. But it’s a first version. What matters is that we stopped measuring volume and started attempting to measure impact.

Not All Impact Shows Up as Revenue

Historically, some of the most valuable engineering work had no direct impact on revenue. Infrastructure upgrades. Security improvements. Better internal tools. That’s still true. It used to be true for technical debt too, and I’m curious about how that’s changing now that AI can pay it down faster.

That work shows up later — when the system doesn’t crash under load, or when the next feature ships in two days instead of two weeks.

I think impact has to include more than revenue. It should also cover:

  • Risk reduction — did this prevent a future problem?
  • Speed — did this make future work faster or cheaper?
  • Strategy — did this move the business closer to where it needs to be in a year?

These are harder to measure than story points. But they’re closer to what matters.

I’m Starting With Myself

Before asking my team to think this way, I’m trying to learn to think this way. Every Monday I ask: what is the most important thing I can work on this week? Not the most urgent. Not the most interesting. The thing that moves the business forward the most.

Some weeks that’s reviewing other developers’ work. Some weeks that’s planning a big task. Some weeks that’s jumping into the trenches and clearing out the backlog.

Summary

The old way I used to measure developers — by volume of output — is losing its usefulness. AI makes code cheap to produce. I need to find a new way to measure impact, and I’m still figuring that out.

If anyone has figured this out, tell me.