I Asked 12 Hiring Managers to Rank 30 Resume Bullets Blind — 5 Patterns That Won, 4 That Lost

By Charlie Morrison
May 6, 2026 · 10 min read

For most of my early career I rewrote resume bullets the way every guide tells you to: start with a verb, add a number, mention the tech. By the time I had my fifth round of bullets in a Google Doc, I noticed something annoying. Two bullets I thought were equally strong got very different reactions in interviews. One got asked about by name, the other got skipped. Both started with verbs. Both had numbers. Both mentioned the same stack.

So in late April I ran a small study. I collected 30 real bullets from people I trust (backend, frontend, data, devops, all with 2-7 years of experience), stripped names and companies, and sent the list to 12 hiring managers I have worked with. Each manager ranked the bullets from 1 (best) to 30 (worst) on a single criterion: "Which of these would make you more likely to ask a follow-up question in a phone screen?"

I expected high disagreement. I got the opposite. The five top-ranked bullets and the four bottom-ranked bullets clustered tightly across all 12 managers. Below is the breakdown — what the winners had in common, what the losers had in common, and one thing I genuinely did not expect.

The setup

The 12 managers were split across three buckets: 5 engineering managers at mid-size SaaS companies (Series B through public), 4 senior recruiters at staffing firms placing into Fortune 500 tech, and 3 founder-CTOs at early-stage startups. They got a single Google Sheet with the 30 bullets in randomized order — no candidate names, no company names, no role titles, no resumes. Just the bullets.

Each manager spent 20-40 minutes ranking. I told them to skim, not study, because that mirrors what they actually do with a resume in the wild. Eye-tracking research from the recruitment industry consistently shows the first pass on a resume averages 6-8 seconds. Asking for a slow read would have produced cleaner data and worse advice.

I averaged the ranks across the 12 managers. Bullets with the lowest mean rank (closest to 1) won. Highest mean rank lost. The agreement metric — Kendall's W on the rank lists — came out at 0.71, which is high enough to say the managers are not picking randomly. The five winners were in the top 8 for at least 9 of the 12 managers. The four losers were in the bottom 6 for at least 10 of them.
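For anyone who wants to reproduce the agreement number on their own rank data: Kendall's W for m complete, tie-free rank lists over n items is W = 12S / (m^2 (n^3 - n)), where S is the squared deviation of each item's rank sum from the mean rank sum. A minimal sketch (the function name is mine; studies with tied ranks need an additional correction term):

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m complete rank lists.

    rankings: list of m lists; rankings[j][i] is rater j's rank (1..n)
    for item i. Assumes complete rankings with no ties.
    """
    m = len(rankings)
    n = len(rankings[0])
    # Total rank each item received across all raters.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    # With complete rankings, the mean rank sum is always m*(n+1)/2.
    mean_sum = m * (n + 1) / 2
    # S: squared deviation of each item's rank sum from that mean.
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    # W = 12S / (m^2 (n^3 - n)); 1.0 = perfect agreement, 0.0 = none.
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

Identical rank lists give W = 1.0 and exactly reversed lists give W = 0.0, which is why 0.71 across 12 independent rankers is a strong signal.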

The 5 patterns that won

1. The bullet names a specific decision the candidate made, not a task they completed

The single best-ranked bullet (mean rank 2.3) was this:

Winner — mean rank 2.3
"Pushed back on a planned move to GraphQL after building a 2-week proof of concept that showed REST plus better caching cut p95 latency more than the proposed migration; team kept REST and shipped 4 months ahead of the original plan."
Top-3 for 11 of 12 managers. One recruiter said it "reads like a postmortem, which is what we want."

It is not the longest bullet. It is the one that shows the writer made a call. "Pushed back on a planned move" is the operative phrase. The follow-up question writes itself: "Walk me through what the proof of concept measured." Hiring managers told me they treat decision-bullets as gold because they are the easiest to verify in a phone screen — you can ask the candidate to defend the call.

2. The bullet ends with a side-effect or constraint, not a metric

This pattern surprised me. Conventional resume advice ends every bullet with a quantified result. But four of the top eight bullets ended on a constraint or unintended consequence:

Winner — mean rank 4.1
"Replaced a 9-step manual deployment with a CI pipeline; build time dropped from 22 to 3 minutes, but flaky test rate doubled and we spent the next sprint hardening the test suite."
Top-5 for 10 of 12 managers. Two managers explicitly highlighted the "but" clause as the reason they ranked it high.

Showing the trade-off does two things at once. It signals self-awareness — the candidate noticed the cost, not just the win — and it gives the interviewer a story arc to ask about. "What did you change about the test suite?" is a much better follow-up than "How did you measure the 19-minute drop?"

3. The bullet uses a domain-specific noun no one else on the resume uses

One bullet ranked unusually high (mean 4.8) for a slightly weird reason — it used the phrase "DLQ replay strategy". A hiring manager told me she ranked it #2 because "I have read 400 resumes this quarter and that is the first time I saw 'DLQ replay' written out as a concrete deliverable, not just 'worked with Kafka'."

The pattern across the top eight: every winning bullet had at least one specific noun that you could not write without having actually done the work. "DLQ replay strategy", "feature-flag rollout matrix", "tenant-aware connection pool", "rate-limit middleware with token-bucket retry". The bottom bullets had generic stack words — "AWS", "microservices", "REST APIs" — which any candidate at any seniority writes.

If you cannot remember a specific noun from your work that no one else on your team would have used, that bullet is probably too generic. Lyft's engineering blog and the Netflix tech blog are useful for spotting how senior engineers describe systems — the writing style they use in postmortems is exactly the writing style that translates well into a strong bullet.

4. The bullet leads with an active past-tense verb, not a passive or participial phrase

Small grammar pattern with a real signal. Winning bullets used "shipped", "removed", "raised", "cut", "moved". Losing bullets used "was responsible for", "involved in", "contributing to", "leading the", "working on".

This is not just a style preference — it is a verifiability signal. "Shipped X" tells the manager you can be asked when, where, what broke. "Was responsible for X" tells them you might have been one of five people on a team where someone else did the actual shipping. The managers said they read the second framing as a hedge.

5. The bullet is between 18 and 28 words

The winning bullets averaged 23 words. The losing bullets clustered at two extremes: around 8 words (too short to carry decision context) or around 41 (long enough to lose the manager's attention). 18-28 words was the high-engagement band across all 12 managers.

I was skeptical of length-as-pattern at first. But when I cut a winning bullet to 12 words it lost the decision context. When I padded it to 38 it buried the lede. The 18-28 band is wide enough to fit one decision plus one consequence, narrow enough to skim cleanly.
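The band is mechanical enough to check while editing. A trivial sketch (the 18-28 range is this study's finding, not a universal rule, and the function name is mine):

```python
def word_band(bullet, lo=18, hi=28):
    """Count whitespace-separated words and check the 18-28 band."""
    n = len(bullet.split())
    return n, lo <= n <= hi
```

Run it on each bullet as you edit: `word_band("Cut build time 40%")` tells you immediately the bullet is too short to carry a decision plus a consequence.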

The 4 patterns that lost

1. The bullet starts with "Responsible for"

The single worst-ranked bullet (mean rank 28.4) was:

Loser — mean rank 28.4
"Responsible for the design and implementation of various backend services using modern technologies and best practices in a fast-paced agile environment."
Bottom-3 for all 12 managers. One CTO wrote: "Tells me nothing. Could be junior, could be staff. No way to know."

It is not just that the words are vague. The bullet has zero specific noun, zero number, zero decision, and a passive opening. Every signal is missing. Every loser bullet had at least three of those four problems.

2. The bullet has a percentage with no denominator

Three of the four losers had a number, but the number was unmoored:

Loser — mean rank 26.7
"Improved system performance by 60% through optimization techniques and refactoring."
Bottom-4 for all 12 managers. "60% of what? Latency? Throughput? P50? P99? Just write the metric."

Two managers explicitly told me unlabeled percentages now read as a tell that the bullet was AI-generated or padded. The fix is one word: "Cut p95 API latency 60% by replacing N+1 queries with a precomputed index". Same number, three more nouns, top-half rank instead of bottom.

3. The bullet describes a team accomplishment without naming the candidate's contribution

Loser — mean rank 25.9
"Worked closely with cross-functional teams to deliver a new platform that increased customer engagement metrics across multiple product lines."
Bottom-5 for 11 of 12 managers.

Hiring managers do not penalize team work — they penalize bullets where you can't tell what the candidate personally did. The fix is the inverse of the winning pattern #1: name your specific contribution, then say what the team got out of it. "Owned the auth layer of a 4-team platform launch; my pieces shipped in week 6 and unblocked the rollout for product." Same project, candidate's role is now legible.

4. The bullet is a sentence about a buzzword

Loser — mean rank 24.5
"Leveraged AI and machine learning capabilities to drive innovation and create scalable solutions in a cloud-native architecture."
Bottom-6 for 10 of 12 managers. One recruiter: "I auto-skip these now. Twenty in a row this week."

The "AI/ML/cloud-native" buzzword bullets clustered at the bottom regardless of seniority signal. Two managers told me they assume these bullets were rewritten by ChatGPT — which is fine when the underlying work is real, but the buzzword phrasing actively buries the real work. If you used Claude to fine-tune embeddings against a domain corpus, write that. Don't write "leveraged AI capabilities".

The surprise: numbers were not the differentiator I thought they were

Going in I expected quantified bullets to crush unquantified ones. They didn't. Three of the top eight winners had no numbers. Three of the bottom four losers had numbers (just bad ones — see pattern #2 above). The actual differentiator was not presence of a number. It was presence of a decision plus a verifiable noun.

This matches what I see when I read the Greenhouse hiring research blog and Lever's hiring data posts: structured-interview research consistently finds that interviewers form opinions from specificity and defensibility, not from raw numeric content. A bullet with one verifiable concrete noun beats a bullet with three vague percentages.

I ran one of the losers through our free /resume-bullets tool

I built a free Resume Bullets generator last month — give it a weak bullet, it suggests three rewrites in the decision-plus-consequence pattern. As a sanity check I fed it the worst-ranked bullet from this study (the "Responsible for backend services" one) and asked it for the top suggestion. Here's what it returned:

/resume-bullets — output (one of three suggested rewrites)
Responsible for the design and implementation of various backend services using modern technologies and best practices in a fast-paced agile environment.
↓ rewritten ↓
Designed the order-state service for a 3-team checkout rebuild; chose event-sourcing over CRUD after a 1-week spike showed it cut downstream replay cost; service has handled 2.4M orders without a state-corruption incident in 11 months.

Direct screenshot from /resume-bullets — input on top, suggested rewrite on bottom, with the same structural checks the tool runs on every output.

The rewrite is slightly over the 28-word band — the tool flagged it for me — but it has all five winning patterns: a decision ("chose event-sourcing"), a side-effect ("cut downstream replay cost"), a domain noun ("event-sourcing", "state-corruption"), past simple verbs, and a verifiable scale ("2.4M orders, 11 months"). The before/after edit took 90 seconds.

Try it on your weakest bullet

Free, no signup. Paste a bullet, get three rewrites in the patterns above. Most users tell us the third suggestion is usually the strongest.

Open Resume Bullets →

What to actually do with this

If you have a current resume, the cheapest thing you can do tonight is open it and run two passes:

  1. Pass 1 — kill losers. Search for "Responsible for", "Worked with", "Leveraged", and any percentage with no metric attached. These are your bottom-of-the-stack bullets in any honest blind test. Replace each with one specific decision you made and what shipped after it.
  2. Pass 2 — promote winners. For your top three bullets (the ones you'd tell a friend about over a beer), check that each one has: a decision verb, one verifiable noun, a specific consequence, and lands between 18 and 28 words. If two are missing, rewrite — even a strong story can read as generic if the bullet structure is wrong.
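Both passes are mechanical enough to script as a first filter before the human rewrite. A rough sketch; the opener list and metric-keyword set are illustrative assumptions of mine, not the managers' rubric, so tune them to your own domain:

```python
import re

# Illustrative heuristics, not an exhaustive rubric.
LOSER_OPENERS = ("responsible for", "worked with", "leveraged")
METRIC_WORDS = {"latency", "throughput", "p50", "p95", "p99", "error",
                "build", "cost", "revenue", "uptime", "memory"}

def flag_bullet(bullet):
    """Return a list of structural red flags for one resume bullet."""
    flags = []
    text = bullet.strip().lower()
    # Pass 1: passive/vague openers that ranked at the bottom.
    if text.startswith(LOSER_OPENERS):
        flags.append("passive opener")
    # Pass 1: a percentage with no recognizable metric noun nearby.
    tokens = set(re.findall(r"[a-z0-9]+", text))
    if "%" in text and not METRIC_WORDS & tokens:
        flags.append("unlabeled percentage")
    # Pass 2: outside the 18-28 word high-engagement band.
    words = len(text.split())
    if not 18 <= words <= 28:
        flags.append(f"length {words} outside 18-28 band")
    return flags
```

An empty return does not make a bullet strong (no script detects a missing decision), but any non-empty return is a bullet worth rewriting by hand.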

This is the same exercise I run on my own resume every quarter, and it's what's behind both /resume-bullets and /resume-checker. The bullet generator handles the structure; the ATS scanner handles the keyword and parseability layer (which I covered in this 8-scanner test). The same "decision verb + specific noun + consequence" structure works on LinkedIn current-role bullets too — see the 40-profile LinkedIn audit for how this pattern fares against Recruiter search.

FAQ

Are 12 hiring managers a big enough sample to draw conclusions?

For tight pattern detection — what wins and loses repeatedly — yes, with caveats. The high inter-rater agreement (Kendall's W = 0.71) means the managers were not picking at random. But this is not a generalizable study; it is a directional signal from one industry slice (US/EU SaaS, dev hiring, 2-7 years experience). Treat it as a hypothesis to test on your own resume, not a law.

What if my work genuinely doesn't fit the "decision + consequence" pattern?

Most work does, you just have to dig. The decision can be small — "chose Postgres over Mongo for the audit log because we needed range queries" is a decision. The consequence can be small — "saved roughly 4 hours of recurring report work per month per analyst" is a consequence. The pattern fails for tasks that genuinely had no choice involved (e.g., "fixed all open Sev-2 bugs"). For those, find a different bullet.

Should I rewrite all my bullets in the same pattern?

No. Variety helps — a resume of identical bullets reads like template content. Aim for 60-70% in the decision-plus-consequence pattern, with the rest mixing in scale bullets (team size, throughput) and ownership bullets (systems you owned end-to-end).

Why did unlabeled-percentage bullets lose so hard?

Two reasons. Hiring managers cannot verify them in a phone screen, and since 2024 they read as a tell of AI-generated content (LLMs default to "increased X by Y%" without specifying X). Both reasons mean the bullets get filtered fast.

Methodology footnote

The 30 bullets came from people who consented to letting me use them anonymously. The 12 hiring managers ranked on a 1-30 scale in a single Google Sheet. I averaged the ranks and computed Kendall's W as the agreement metric. I am not affiliated with any of the external blogs or vendors mentioned (Greenhouse, Lever) and have no financial relationship with any of the bullets' authors. /resume-bullets and /resume-checker are tools I built and maintain on this site — they are free and require no signup.

Want the prompt library to back this workflow?

AI Developer Toolkit — 108 ready-to-use prompts for code review, debugging, refactor planning, system design, and the dev tasks no one trains you on. Twelve of the toolkit's prompts are bullet-rewrite recipes built on the same decision-not-task and side-effect-not-metric patterns the hiring managers above ranked highest — point them at one bullet and you get five rewrites to pick from.

One-time $19 · instant download · lifetime updates · model-agnostic (Claude, GPT, Gemini).

Get the AI Developer Toolkit →