The more senior the role, the fewer the candidates, and thus the less it makes sense to restrict the pipeline.

Last week, a friend posted on LinkedIn about a role: $1.5m-$2.3m comp (base + equity), only hiring via referral. As someone who has publicly pledged never to accept more than $1m a year in comp, I suggested that the salary range was an immediate red flag. But it is the referral-only that really bothers me.

Broadly speaking, workers are a combination of two things: skills (things they can do) and temperament (the way they do those things). Most companies require large numbers of lower-skill employees, with successively smaller numbers of higher-skill employees at each level above them. Temperament matters at all levels, because companies aren’t just giant skill machines and you actually need the gears to want to work together. 

The problem is that skills are relatively easy to assess through the hiring process but temperament is much harder; the short, artificial nature of interviews means you don’t really know much about how someone actually does the work until six months or more into working with them.

This is why employers often rely on referrals. Because of homophily (the tendency of likes to attract), it is generally true that if you enjoy working with a referrer, you’ll enjoy working with the referral.

Here is where it gets tricky. Because you will always have a larger supply of lower-skill talent (most people can do most things at the absolute bottom of the skill pyramid), the primary differentiating factor for lower-skill roles is temperament. Which means referrals should be far more important in low-skill roles.

But that is the precise opposite of what happens. Instead, higher-skill roles are much more likely to be referral-only, despite the fact that the relative rarity of high-skill/good-temperament creates a much smaller hiring pool. Temperament is a much smaller differentiator when only a few people have the necessary skills.

In some ways, all of this is moot because in reality, no role should be referral-only. There is overwhelming evidence that relying on referrals alone increases systemic barriers related to gender, ethnicity, credential, etc. Remember that homophily? It applies to more than just temperament. Referral-only recruiting is anti-science and pro-bias.

But if you insist on referral-only, it should be primarily used for lower-skill roles, where it is a better differentiator. This also avoids magnifying the bias problem by conflating it with higher-skill roles that typically come with higher compensation. If you’re using referral-only and then paying those candidates millions of dollars, you’re being sexist, racist, etc. at the highest possible scale of wealth-gap creation.

The death of shared experience is a fundamental paradigm shift…and it will absolutely wreck your customer service department.

At some point, every teen ends up in the same conversation: how do I know that the reality I am experiencing is the same one you are? The context changes (as does the amount of alcohol consumed) but the easy example is always color. I look at the sky and call it blue, and so do you, but that is just because we have both been taught to label the experience that way. How do I know that your blue is my blue?

The truth, as most adults come to realize, is that functionally it doesn’t matter. As long as we agree on the connection between the label and the experience, and both of those are stable, we can communicate with each other.

And that’s useful. One of my many jobs in college was working the IT helpdesk. And it was relatively easy to support people over the phone because the interface was predictable; I can tell you to go to the lower right and click the blue button, because I know what I see is what you see. Sometimes there are errors when people encode the information differently (Do you think the blue button is more of a purple?) but as long as you find a way to translate, it works.

One of the recent promises of AI optimists is fully personalized interfaces. They imagine a world that adapts to us to such a degree that there is no one standard way of engaging. This is often presented as being modular at first, where designers will create discrete blocks that rearrange depending on your needs, and then growing increasingly adaptive as the blocks become smaller and smaller.

But this disrupts one of the most fundamental assumptions that allows humans to collaborate: shared experience. In a future where I have literally no way of replicating the experience you’re having, how am I supposed to support you when you need help? How can we work together, when we literally exist in different worlds?

The easy, AI optimistic answer is that I don’t need to, because the AI also does the supporting and collaborating. But even if we believe that one AI is going to support you with another AI, what happens when the AIs aren’t having the same experience? And how are humans helping you before the support AI gets trained?

Hyperpersonalization feels like a clear win; if all my experiences are tailored to fit me perfectly, how can that possibly be a bad thing? But if you take it to the logical conclusion, where the Venn diagram of the experiences between humans drifts to nothing…that has profound implications for humanity.

And for your customer service team. This isn’t just drunken teenage musing; shared experience is fundamental to how businesses currently operate. And so at the same time you’re investing in the latest technology to make hyperpersonalization possible, you have to also invest in the business infrastructure that makes it supportable.

But our tendency to confuse solutions with causes often traps us; when it comes to plugging leaks, it can’t just be a form of bailing faster.

Take racism. We know that empathetic listening, coupled with strong challenges, can convert hardliners. And we have plenty of examples: Ann Atwater and C. P. Ellis. Matthew Stevenson and Derek Black. Radical kindness does seem to be one potential solution to abject hatred.

At the same time, plenty of well-meaning people have taken that solution and suggested the false corollary that a lack of kindness is what radicalizes young men in the first place. They spin tales of how young White men experience hardship that causes them to start down the road to racism and it feels intuitively true: if kindness is the solution, then a lack of kindness must be the cause.

But C. P. Ellis didn’t become a Ku Klux Klan leader because Black people were unkind to him; he lived in extreme poverty and parroted the beliefs of his KKK-leading father. Derek Black had almost no interaction with Black people and yet hated them intensely, encouraged by his Stormfront-founding father (I’m noticing a theme). Americans did not enslave Africans because they were wronged by them. The boat is not leaking for lack of bailing.

This is particularly relevant to those of us in tech right now. In the wake of Mark Zuckerberg calling for an increase in toxic masculinity in the workplace, there has been a fair amount of online handwringing about what can be done to combat the pervasive sexist behavior of men in tech. And far too many of the suggestions sound like “Well, if women were just nicer to men in the first place…”

No. No no no.

It isn’t just the -isms of the world; it is any behavior change. The crusade to reduce smoking before it killed us all started with anti-smoking ads; in 1967, the FCC’s Fairness Doctrine ruled that for every smoking ad on TV, there had to be a corresponding anti-smoking ad. But of course smoking continued: you can’t bail faster than the leaks.

It took us 30 years to figure out that instead of just running matched anti-cigarette ads, we should just ban cigarette advertising to begin with; the Master Settlement Agreement didn’t happen until 1998.

So how can you prevent these false syllogisms when you’re designing interventions?

I often talk about the five behavioral archetypes: Always, Never, Sometimes, Started, Stopped. The false syllogisms above are really a conflation of Never and Stopped; if radical kindness and anti-smoking ads can cause someone to Stop, they can also cause them to Never. 

But that isn’t always true. Be deliberate and systematic in your approach to gathering insights, remember that Never and Stopped are not equivalents, and investigate the two behavioral states as clearly different to reveal where the pressures themselves diverge.

The thin line between ambition and moral bankruptcy isn’t just about what we do but what we allow from others. In the end, we are what we tolerate.

The fabulous Katie Scarpa recently opened a role for an Integration Specialist on her team at Oceans. And almost immediately, she sent me a very impressive resume to look at. The candidate’s claim to fame? Inventing time travel.

Oddly, he didn’t explicitly mention his invention. But the first job on his resume was “Integration Specialist at Oceans”, which he apparently has been doing since October of last year. And that’s a truly impressive feat, since we just posted for the first hire on this team a few weeks ago.

Now if it were me and I invented time travel, you better believe that would be my resume headline. But no, this humble candidate just slipped it in casually, like it was no big deal. “I’m already an Integration Specialist at Oceans, so you should…hire me to be an Integration Specialist at Oceans.”

It makes sense, until it doesn’t.

This is apparently now a common tactic: trying to trick algorithmic resume screening by telling it that you’re perfect for the job by pretending you already do it. Presumably, once you cheat your way past the system, the hiring manager is supposed to respect your hustle and want to meet you so you can shoot your shot.

It reminds me of Apple Cider Vinegar on Netflix (I haven’t watched it but my partner loves telling me about it). She was relating a scene in which the main character Belle bluffs her way into a publishing meeting by pretending to have an appointment. Belle confesses the lie, lies again and gets caught, but the publisher accepts her anyway and just suggests lies that will sell better to the public.

Bluffing your way into a meeting is the sort of thing you hear about in startup culture all the time; it is meant to be a sign of how determined and gritty you are. If the lie is discovered, it becomes a form of recommendation: look how hard I’ll work to make this startup succeed.

The liars aren’t all that interesting to me because I understand the candidate who looks at the current job market and in desperation to simply be seen, fakes a resume to get past the screener. They probably tell themselves it isn’t a lie, that the hiring manager will know they are simply hustling, and that Shakespeare had it right: “My poverty and not my will consents.”

What interests me is the hiring manager who goes along with it. The investor or publisher who decides that lying is a virtue, not a liability. It isn’t Romeo and Juliet; it is Macbeth – being a lord but wanting to be king.

All of us are at least sometimes in positions of power. We choose where to spend our money, time, attention, and love. And there will always be people who don’t have enough of those things and so will do what they feel they must to get them. Which means that the limiting factor in the development of our culture is how we choose to allocate scarce resources, more than how we choose to pursue them.

Faking your resume to get past AI doesn’t work at Oceans because a human reads every application (we even say it at the top of our job postings!). And we don’t reward deceit, whether it comes from determination or desperation. Because celebrating growth and creating opportunity means being a good steward of those resources, even if it means losing out on hiring the inventor of time travel (he’d probably just get us stuck in a temporal paradox anyway). Sure, you might miss out on that rare apothecary whose willingness to hustle gets you more More MORE…but at the end of the day, you are what you accept. So accept better.

Yesterday, I wrote about using an Alternative Universe exercise to get stakeholders focused on behaviors. Today, I’ll share a second tactic: Outliers.

Let’s use a different example and focus on an internal use case: “I want people to be more inclusive in meetings.” And for variety, our stakeholder can be a glasses-wearing CEO named Satya.

The Outliers exercise relies on a natural human bias: our tendency to remember vivid extremes better than averages. This is useful because using real exemplars of actual outliers helps stakeholders connect with behaviors as observed, rather than hypothesized.

I generally start with the positive version. “Alright, Satya, I want you to think about the most inclusive person you know. Do you have them in your mind? Great; tell me a bit about them.”

Satya will usually start with the person’s demographics, which you can ignore unless that is all he mentions, in which case you’ll need to prompt him: “You know he’s inclusive because he’s Black? Are all Black people inclusive?”

Usually, though, he’ll mention some behaviors in passing and those are what you want to call out and emphasize. “Oh, so he calls out when men repeat ideas that a woman has already said. Is that what you mean by inclusive: someone who calls out idea attribution?” 

I generally suggest waiting until he’s finished his initial description, as he may mention several behaviors without prompting. You’re not actually trying to finalize a selection here, just get a list of possibilities, so the more behaviors you can pull out, the greater the chance that you’ll find the sufficient one. You can worry about narrowing down later as you actually write your behavioral statement. 

I also think it is worthwhile to do the negative version. “OK, let’s try something different. I want you to think of the least inclusive person you know and tell me about them.”


This is useful because sometimes our end goal isn’t getting people to do a desirable behavior but rather stopping them from doing an undesirable behavior; there are plenty of inclusive practices that are about the absence of a bias. And because of our innate tendency to focus on promoting pressures, focusing on a negative outlier can help close that blindspot.

I tend to use Outliers over Alternative Universes when the subject is serious, as it doesn’t rely so much on the entertainment factor to hold interest; talking about exemplars is inherently fascinating. And because Satya has no access to the exemplars’ cognitions and emotions, he’s forced to rely on observable behaviors, which makes it easier to move people away from concepts like “loving a product”.

But the focus on exemplars comes at a cost: because it uses real people, the Outliers exercise often struggles with niche behaviors where the stakeholders have no personal experience. If Satya has never actually seen an inclusive leader, how is he supposed to describe them?

Tomorrow’s exercise, The Genie, addresses this by moving back into the realm of fantasy.

One of the hardest jobs that applied behavioral scientists have is getting stakeholders to focus on behaviors, rather than emotions or cognitions. Over the years, I’ve come to rely on three tactics for creating a behavioral outcome and while none of them is a silver bullet, at least one of them usually manages to do the job.

They all work on the same central premise: that people find it easier to make decisions through comparisons. They’re designed to be entertaining (high promoting pressure) but also simple enough that anyone in the room can use and understand them (low inhibiting pressure). And they all start with an emotion or cognition that a stakeholder expresses.

For our examples, let’s use an emotional outcome: “I want users to be obsessed with our product.” And because it feels appropriate, let’s say that was expressed by a turtleneck-wearing CEO named Steve.

The first tactic is using Alternative Universes.

“We’ve all seen a sci-fi movie where there are alternative universes. There is Original Steve and Steve Prime, and you know Steve Prime is the bad Steve because he has a mustache. And then he shaves his mustache and they end up in some dramatic fight scene where you have to figure out which Steve is the original so you can shoot the other one and keep them from taking over the multiverse.”

“Now Steve and Steve Prime are identical in every possible way except Steve Prime is from a universe that is obsessed with our product. You’ve got to shoot him…how would you know he is from the universe that is obsessed and Original Steve isn’t?”

Now we’ve got people laughing (entertaining, check!) and everyone can relate (easy, check!) and you can start to push on the behavioral bit. You’ve got two jobs: be the buzzer when someone says something that isn’t observable and provide suggestions of potential behaviors if people are struggling.

So if someone says “Steve Prime loves the phone!” you have to be quick to shut that down. “Bzzzzz, you just killed Original Steve and now we’re doomed.” (Bonus points if you sing “I don’t know what love is…but I want you to show me!”) Be funny but firm on this; you have to shut down anything that isn’t a physically observable behavior.

If people aren’t getting there, you can always make suggestions. “What about owning our phone? Original Steve doesn’t own one of our phones, Steve Prime does…is that enough?” 

The ‘enough’ part of that is key – is owning the phone alone enough to say that someone is truly obsessed? If both of them owned our phone, would they both be obsessed? Why or why not? Provocative questions are key, because disagreement is good. If everyone rushes to the behavior, that is likely a false consensus and will come back to bite you later.

Tomorrow we’ll look at the second tactic I use, Outliers.

Delight without satisfaction is addiction. And so when we design to make people feel happy in the moment, we must be mindful that it also enhances their long-term happiness, or risk creating a suboptimal world.

Over the weekend, designer Taurean Bryant posted about his hatred for the term “delight” in design. And Justin Maxwell shared an anecdote about the Mint.com team being bewildered by “Design for Delight” as an OKR at Intuit after they were acquired. Yet given the option, I think we’d all prefer a delightful experience. So why the dissonance?

Let’s start by defining terms. Generally, hedonic psychologists think of happiness as made up of two parts: delight (a momentary experience; I often think of the first lick of an ice cream cone) and satisfaction (a long-term experience; I imagine myself watching my sleeping son, thinking my life is good).

One of the oddities of hedonics is that the two don’t correlate particularly strongly. In studies where people are randomly pinged to rate their in-the-moment delight and then asked about their satisfaction at the end of the day, the two emerge as distinct: some have many moments of delight but are highly unsatisfied, while others aren’t particularly delighted but are very satisfied generally (I may be a strong outlier in this category).

Since they are distinct, the perfect product would both delight and satisfy me. But that is hard to achieve and so design leaders frequently pronounce edicts about what they perceive as the deficit. I have no doubt that some well-meaning Intuit leader thought “Well, our product is very satisfying but not particularly delightful, so let’s lean into introducing more delightful moments.”

Which makes sense, as long as it is contextualized. In general, I don’t buy Intuit products to be delighted; I have Netflix for that. I just want to file my taxes so I can be satisfied. So what Intuit really wants is to be maximally satisfying and delightful enough. The result would be an app that files my taxes correctly and doesn’t make me totally hate the in-the-moment experience.

That is a cohesive, achievable design strategy. You can say that finishing filing your taxes and having them accepted is the satisfying behavior and advancing to the next screen is the delightful behavior, then divide your team to focus on designs that maximize the rate of each. In the event of conflicting needs, filing your taxes wins.

The converse is true for Netflix: be maximally delightful and satisfying enough. The result is an entertainment experience that also, on occasion, makes me reflect on some of the deeper truths of my life. Again, those are differing behaviors that can be designed for separately.

Because in reality, the problem isn’t with the introduction of delight into design, but rather a misunderstanding of when and why it matters. Every product needs both but can only maximize one, so they have to be viewed and communicated by leaders as a tradeoff of both resources and features.

Yesterday, I wrote about Lenny Rachitsky’s attempt to figure out which companies produce the best PMs and the problems with his analysis. But today’s point is more important: even if you correct for method errors, this data doesn’t really answer the question he is asking. But it may tell PMs, particularly those underrepresented in tech, what companies to avoid.

A brief reminder of method: he looked at PMs who have left a company and what happens across 7 career categories, like how quickly they are promoted in their next job.

But using only alumni data introduces significant confounds. And he acknowledges this deep in one section of his analysis: “Another explanation is that the best PMs at FAANG companies are happy and don’t leave, and so we don’t see their trajectories in the data.”

This is very much burying the lede. All good insights start with removing as much systematic bias as you can from your sample and looking at only alumni means ignoring all the reasons that PMs choose to stay at a company. Many of the companies have experienced layoffs. Some are better at retaining leaders vs juniors. Companies that are newer have less time to show attrition.

And these biases aren’t random. Take Average Time to First Promotion. In his view, lower is better: it means the company turned you into a talented PM. But it could just as easily mean that a company systematically underpromotes top talent, causing them to leave and get quickly promoted elsewhere. This is particularly true for underrepresented people, who are most likely to be overlooked. 


But what if, by combining Time to Promotion and Leadership, we try to find companies that systematically underpromote talented people?

There are caveats. Smaller companies might not have as much room to promote people and we don’t have data in this sample to control for that. Cross-validating with another dataset (like time in role without promotion before leaving) and qualitative interviews would go a long way.


But let’s say we do believe the combined metric is reasonable. Where should folks choose to work?

Worst first; these companies appear to systematically overlook top talent.

46. Discord

45. Deel

44. Revolut

43. Scale AI

42. Plaid

Both Revolut and Plaid made Rachitsky’s Top 5 Best Companies. This is why looking at things through an inclusive lens is so important: otherwise you might not just give random advice but advice that is actively bad for disadvantaged groups.

The best companies?

  1. Microsoft
  2. Adobe
  3. Apple
  4. Intuit
  5. eBay

Maybe large companies with more formal processes are better at mitigating promotion bias. Maybe they just have more room to grow. Maybe underrepresented people can more freely transfer internally away from biased managers there. I don’t feel strongly enough about this data to feel like I know the answer.

And that’s really the point: good causal analysis matters and being wrong can worsen systematic issues. So if you have a large audience, be particularly careful what you say, and don’t rely on others to check your work.

Newsletter expert Lenny Rachitsky tried to figure out which companies produce the best product managers, using a very large dataset. And not only did he arrive at the wrong conclusion, he inadvertently produced a list of places you probably shouldn’t work, particularly if you’re underrepresented in tech.

Preamble: I don’t know Rachitsky but our mutuals paint him as reasonable. And he had a chance to comment on this draft and gave it his blessing. The point isn’t to drag him – it is to understand how and why we have to do better when creating insights.

So today I’ll explain how his analysis actually miscategorizes the top companies. Tomorrow I’ll talk about why that doesn’t matter, because the data isn’t suited to answer his question anyway. But it can potentially answer another important question.

Let’s start with his stated intention: to guide people to great companies, where PMs learn the craft and go on to have stellar careers. To do that, he looked at PMs who have left a company and what happens across 7 categories: Total promotions, Fastest immediate promotion, Fastest rise to leadership, Highest rate of CPOs, HOPs, First PM hires, and Founders.

He then ranks each category and looks at the Top Ten in each. His overall Top 5?

1. Revolut

2. N26

3. eBay 

4. Plaid

5. Intercom 


Using rankings is his first mistake. If the difference between 1st and 2nd is small but 2nd and 3rd is huge, using ordinals obscures that. So you need to normalize to standard deviations with Z-scores, looking at how far companies are from the average.
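To make the ranking problem concrete, here is a minimal sketch that compares ordinal ranks with Z-scores. The company names and promotion rates are invented for illustration and are not taken from Rachitsky's data; the helper names `ranks` and `z_scores` are also just assumptions for this example.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return how many standard deviations each value sits from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def ranks(values):
    """Return 1-based ranks, highest value first (ties are not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


# Hypothetical promotion rates for three made-up companies.
rates = {"Company A": 0.30, "Company B": 0.29, "Company C": 0.10}

company_ranks = ranks(list(rates.values()))
company_z = z_scores(list(rates.values()))

for name, rank, z in zip(rates, company_ranks, company_z):
    print(f"{name}: rank {rank}, z-score {z:+.2f}")
```

The ranks are evenly spaced (1, 2, 3), which hides the fact that A and B are nearly tied while C is far behind. The Z-scores keep that distance visible.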

Which leads to his second mistake: differences in population size. Deel has 79 former PMs on LinkedIn; Microsoft has 32k. Given the rarity of events like becoming a founder, this creates massive data anomalies if even one more person takes that path at Deel.
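A quick sketch of why population size matters, using the two alumni counts quoted above. It treats "32k" as exactly 32,000, which is an approximation, and the function name is made up for this example.

```python
def rate_shift_from_one_more(population):
    """Change in a rate (as a fraction) when one more person out of
    `population` takes a given path, such as becoming a founder."""
    return 1 / population


deel_alumni = 79            # former PMs on LinkedIn, per the text above
microsoft_alumni = 32_000   # "32k" in the text, rounded for illustration

deel_shift = rate_shift_from_one_more(deel_alumni)
microsoft_shift = rate_shift_from_one_more(microsoft_alumni)

print(f"Deel: one extra founder moves the rate by {deel_shift:.2%}")
print(f"Microsoft: one extra founder moves the rate by {microsoft_shift:.4%}")
print(f"Deel's rate is about {deel_shift / microsoft_shift:.0f}x more sensitive")
```

A single person changing paths moves Deel's rate by more than a full percentage point, while the same event at Microsoft barely registers.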

Finally, he assumes independence among his criteria. But this isn’t true. Take PMs from Discord. They’re almost dead last in terms of becoming founders but top the list in fastest promotions. Why? Because you can’t get promoted if you become a founder!


In fact, all of the data is substantially correlated. Average Time To Promotion and Average Time To Leadership have a correlation of 0.96; they might as well be the same list.
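For anyone who wants to check this kind of overlap on their own data, here is a minimal sketch of a Pearson correlation between two criteria. The numbers are invented for illustration and are not the values behind the 0.96 figure; `pearson` is a hand-rolled helper, not a library call.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-company averages, in years (made-up numbers).
time_to_promotion = [1.8, 2.1, 2.5, 3.0, 3.4]
time_to_leadership = [4.0, 4.5, 5.2, 6.1, 6.8]

r = pearson(time_to_promotion, time_to_leadership)
print(f"correlation: {r:.2f}")
```

When two criteria correlate this strongly, counting both in a Top Ten tally effectively counts the same signal twice.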

There seem to be two discriminant criteria: Rate of Founder/Head of Product/CPO/First Hire (Early Stage) and Average Time to Promotions/To Reach Leadership/Total Promotions (Late Stage).

So if we adjust to these criteria and use Z-scores, who actually comes out on top?

  1. Revolut
  2. N26
  3. Intercom
  4. Plaid
  5. Palantir

This isn’t entirely different from Rachitsky’s list but it isn’t the same either. And with 230k followers and over $2m a year in newsletter revenue, what he says matters to a lot of people. So it is worth saying the right thing.

But it gets worse: even if you correct for bad statistics, this data doesn’t actually answer the question he is asking and may actually be an anti-signal that disproportionately affects underrepresented PMs. Tomorrow, I’ll talk about why.

During breakfast on Monday, my son turned to me and said “Last night was really fun. Thank you for having them over.” That is an exact quote and, I think we can all agree, an utterly bizarre thing for a nine-year-old to say. It was sandwiched between two long soliloquies about anime, so I’m fairly sure he’s still my child and not a clone.

“Them” was Di Le, Asli Aydin, Misti Cain, and Diana Wolosin. We all met (along with the absent but beloved Kevin Bethune) at the speaker dinner for DDX San Diego and it became a rollicking WhatsApp group that resulted in a followup dinner at my house. I asked them before tagging but won’t say more; the Chatham House rule applies.

My son is used to explicit discussions of diversity, because my co-parent, my partner, and I are all aligned on being clear with him about our beliefs and how we came to them. And most parents spend at least some anxious midnight hours worrying about how we talk to our children.

Ditto executives. CEOs often have multi-person executive communications teams responsible for helping them with messaging for employees, investors, and the public.

But I know of few executives who have an equally sized team responsible for making sure they are behaving in line with that communication. Exec comms is a role; exec behavior change isn’t. And this represents a misallocation of resources.

Imagine four CEOs. CEO A talks about diversity and has a diverse set of leaders around them. B talks about diversity but doesn’t have it on their team. C doesn’t talk about diversity but has a diverse team. D neither talks about diversity nor has a diverse team.

Intuitively, A is the best and D is the worst; as with any 2×2, the matching corners are easy. And in the mixed corners, intuitively C is better than B: it is better to walk than talk, given the choice.

The surprising finding is that B is actually often just as bad as D, and in some situations may even be worse. This isn’t irrational. If someone doesn’t do or talk about something, the possibility at least exists that they are unaware and that they might change their behavior with awareness. But if they talk about it and don’t do it, that possibility is closed off; our brain naturally says “if they are aware but still not doing it, there must be good reasons I shouldn’t do it either.”

There are exceptions: leaders can talk about their struggle with a behavior rather than an accomplished reality and that tends to reduce the anti-signal. But ultimately, behaviors rule. Which means we need to be investing both time and money in making sure that we are behaving in ways that are congruent with the messages we send.

At home, you can use the dinner party test. Who was the last group of non-family people that your kid saw around your table? Do they accurately represent the values and beliefs you have expressed? Certainly those four women do and I am grateful for their friendship.

At work, consider the same. What shows people that you mean what you say? Are you sure you’re doing it?