Optimal Stopping

How do you hire the best candidate for a job? How many properties should you view before making an offer on one? When choosing a restaurant, do you revisit a place you really liked, or gamble on somewhere new that you might like better? These are all examples of a common problem known as Optimal Stopping. As the name suggests, the problem is about knowing when to stop gathering information and start taking advantage of what you have learned.

Fundamentally, it boils down to exploration vs exploitation. We need to spend some amount of time exploring the situation to better inform our ability to exploit it later. The question, then, is how much time to spend in each phase.

How can we apply it?

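Take the example of interviewing candidates for a job. Say there are 100 candidates in total, interviewed one at a time in random order. A decision must be made straight after each interview: a successful candidate is hired on the spot, and a rejected one cannot be called back.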

Here's the plan: Let's interview a number of people first without making any offers in order to set a benchmark standard ("exploration" period). After a number of people have been interviewed, we will hire the first one who is better than anybody found in the exploration period.
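To make the plan concrete, here is a minimal sketch in Python. The function name `hire` and its arguments are made up for illustration. It also assumes something the plan doesn't spell out: if nobody beats the benchmark, we are stuck with the last candidate.

```python
import random


def hire(scores, explore_fraction):
    """Return the index of the candidate hired under the explore-then-hire plan."""
    n = len(scores)
    # Number of candidates interviewed without any offer being made.
    n_explore = round(n * explore_fraction)
    # Benchmark: the best score seen while exploring (nothing to beat if we skip exploring).
    benchmark = max(scores[:n_explore], default=float("-inf"))
    # Hire the first later candidate who beats the benchmark.
    for i in range(n_explore, n):
        if scores[i] > benchmark:
            return i
    # Assumption: if nobody beats the benchmark, we must take the last candidate.
    return n - 1


random.seed(1)  # fixed seed so the example is repeatable
scores = [random.uniform(0, 100) for _ in range(100)]
chosen = hire(scores, 0.37)
print(f"Hired candidate {chosen + 1} with a score of {scores[chosen]:.1f}")
print(f"The best available score was {max(scores):.1f}")
```

Setting `explore_fraction` to 0 hires the very first candidate, and setting it to 1 always ends up with the very last, which are the two extremes of the trade-off.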

It wouldn't make sense to hire the very first candidate, as there is only a 1% chance they are the best person for the job. Similarly, it wouldn't make sense to hold out for the very last candidate, as again there is only a 1% chance they are the best person for the job. There must be a sweet spot somewhere in the middle. As it turns out, there is.

Thirty-Seven Percent

If we interview 37% of the candidates without making any offers, we set a high enough standard and still have enough time to wait for the right candidate. Of all the possible cut-off points, this one lands the best candidate most often. Take a look at the widget below - it runs 300 simulations for each proportion and shows the average score of the candidate eventually hired. The blue dots are the result of each simulation and the orange bars indicate the proportion of times that the candidate hired under this method is actually the best candidate available in the entire process.

Note: all candidate scores are randomly generated between 0% and 100%. It is assumed that every candidate can be scored objectively and quantitatively, but I am aware that reality is more nuanced than what we assume here.
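For anyone who wants to reproduce this kind of experiment offline, here is a rough sketch of the simulation. It is not the widget's actual code. The names `hire` and `simulate`, the 10% steps and the seed are all assumptions for illustration, as is the rule that we take the last candidate if nobody beats the benchmark.

```python
import random


def hire(scores, n_explore):
    """Skip the first n_explore candidates, then hire the first one who beats them all."""
    benchmark = max(scores[:n_explore], default=float("-inf"))
    for i in range(n_explore, len(scores)):
        if scores[i] > benchmark:
            return i
    return len(scores) - 1  # assumption: forced to take the last candidate


def simulate(percent, n_candidates=100, n_runs=300, rng=random):
    """Return (average hired score, share of runs that hired the very best)."""
    n_explore = n_candidates * percent // 100
    total_score = 0.0
    best_hits = 0
    for _ in range(n_runs):
        # Scores are drawn uniformly between 0 and 100, as in the note above.
        scores = [rng.uniform(0, 100) for _ in range(n_candidates)]
        chosen = hire(scores, n_explore)
        total_score += scores[chosen]
        best_hits += scores[chosen] == max(scores)
    return total_score / n_runs, best_hits / n_runs


rng = random.Random(2024)  # seeded so repeated runs print the same table
for percent in range(0, 101, 10):
    avg, share = simulate(percent, rng=rng)
    print(f"{percent:3d}% explored: average score {avg:5.1f}, hired the best {share:.0%} of the time")
```

With only 300 runs per proportion the numbers are noisy, so neighbouring rows can swap places from one seed to the next. Raising `n_runs` smooths things out.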

In the widget, the data is regenerated randomly every time the page is refreshed, so the results may not show a peak between 30% and 40% on every run, but they do in aggregate.


There isn't much in it, but the theoretical optimum of 37% holds true. This proportion most regularly ends up hiring the best possible candidate, and the most capable candidates overall.
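The theoretical figure can also be checked directly, without any random numbers. This is a small sketch, assuming every candidate has a distinct score and that candidates arrive in random order. The function name `p_best` is made up for illustration.

```python
def p_best(n_skip, n=100):
    """Exact chance of hiring the single best of n candidates after skipping n_skip."""
    if n_skip == 0:
        return 1 / n  # we hire the first candidate, who is the best with chance 1/n
    # The best candidate sits at position i (1-based) with chance 1/n, and we only
    # reach them if the best of the first i-1 candidates was among the n_skip skipped.
    return sum((1 / n) * (n_skip / (i - 1)) for i in range(n_skip + 1, n + 1))


best_skip = max(range(100), key=p_best)
print(best_skip, round(p_best(best_skip), 4))
```

For 100 candidates this lands on skipping the first 37, with roughly a 37.1% chance of ending up with the best one. As the number of candidates grows, both the optimal proportion and the success chance settle towards 1/e, or about 36.8%, which is where the 37% figure comes from.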

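It also shows that if we spend more than 50% of our time exploring, our expected returns drop off fairly rapidly. That makes sense: once more than half of the candidates are in the exploration group, the best one is more likely than not to be among those we have already turned away.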

I wasn't expecting the curve to be so flat across the 15-45% range, but there we are. Perhaps, then, if an improved model were created that considered the cost of running more interviews, the optimal exploration period would be shorter. Perhaps this could be the subject of a future blog post...

Attention: Recruiters!

Please do me a favour and set me up as the 38% candidate for any potential interviews. I can't promise any fees for arranging this, but I'll make you a lovely chart on a topic of your choice. Deal?