There room N agents and also K of them room spies. Your job is to recognize all the spies. You have the right to send a given number of agents to a “retreat” top top a remote island. If every K spies are existing at the retreat, they will accomplish to strategize. If also one spy is missing, this spy meeting will certainly not take place. The only info you gain from a retreat is whether or not the spy meeting happened. You deserve to send as plenty of agents as you prefer to the retreat, and the retreat can occur as numerous times as needed. You know the values of N and K.

You are watching: Sniff out the spy

What’s the minimum variety of retreats required to insurance you have the right to identify all K spies? If every retreat expenses \$1,000 per person, what is the full cost to determine all K spies?

To begin with, let’s assume you know that N = 1,024 and also K = 17.

Here is my solution for $K=1$:


The instance $k=1$

Let’s warm up v the simple case $k=1$. We’ll speak to $r_1(n)$ the minimum variety of retreats necessary to identify a single spy in a team of $n$ agents. If we choose to send $m$ that the $n$ agents ~ above a retreat, then either a conference took place, which way the spy is amongst those $m$, or no meeting took place, therefore the spy must be among the $n-m$ that didn’t walk on the retreat. Depending on the result, we’ll either need $r_1(m)$ or $r_1(n-m)$ an ext retreats. Us can thus write this little (dynamic programming) recursion:\beginalignr_1(1) &= 0 \\r_1(n) &= \undersetm\in\1,\dots,n-1\\textminimize\,\, 1+\textmax\ r_1(m), r_1(n-m) \\qquad\textfor n=2,3,\dots\endalignThe factor for the “max” is the we want a guarantee the this variety of retreats will certainly always find the spy. Therefore, us assume the spy is where they should be so the it bring away as long as feasible to find them. Indigenous here, it’s clean that we should always split in half. Therefore,\beginalignr_1(1) = 0\quad\textand\quad r_1(n) = 1 + r_1\bigl(\lceil \tfracn2 \rceil\bigr)\quad\textfor n=2,3,\dots\endalignWhere $\lceil x \rceil$ method that us round $x$ up to the nearest integer. By inspection, we deserve to solve this recursion and also we uncover that:

$\displaystyle\textnumber that retreats = r_1(n) = \lceil \log_2(n) \rceil$

What around the cost? We have the right to simply count the total variety of retreat attendees, i beg your pardon we’ll call $a_1(n)$. Once we separation $n$ in half, we can select to send either $\lceil \tfracn2 \rceil$ or $\lfloor \tfracn2 \rfloor$ (round up or ring down) come the retreat. Both offer us the very same information, so we’ll choose to send fewer agents. The number of attendees thus satisfies:\beginaligna_1(1) = 0\quad\textand\quad a_1(n) = \lfloor \tfracn2 \rfloor + a_1\bigl(\lceil \tfracn2 \rceil\bigr)\quad\textfor n=2,3,\dots\endalignAgain, by inspection, we uncover that the number of attendees is:

$\displaystyle\textnumber of attendees = a_1(n) = n-1$

Note the we might have achieved the same full cost by sending the agents one-by-one on individual retreats. By the moment we’ve sent all however one agent, us must know who the spy is. However, we were inquiry to minimize the number of retreats, so it would certainly be garbage to have actually this countless retreats.


The situation $k\gt 1$

The instance $k=1$ to be all about finding one spy. There were $n$ possibilities (each the the $n$ agents might be the spy) and each time we had a retreat, we received one little bit of information (yes/no). The best we could do was use that bit to cut our possibilities by a factor of two, and this was achievable by sending fifty percent of the agents top top the retreat every time.

If $k \gt 1$, we space tasked with finding $k$ spies, and there are currently $\binomnk$ possibilities (choosing $k$ spies out of $n$ agents). Again, the logic is the same: the ideal we can hope for with one little bit of info is to alleviate the possibilities through a variable of two. If we call $r_k(n)$ the minimum variety of retreats forced to determine the spies v certainty, climate this dispute yields the bound:

$\displaystyle\textnumber of retreats = r_k(n) \ge \left\lceil \log_2\binomnk \right\rceil$

If $n=1024$ and $k=17$, this yields the tied $r_17(1024) \ge 122$. The concern then becomes: how close can we actually pertained to reaching this bound? One feasible approach is to it is in greedy: we save track the all feasible remaining subsets, then we take into consideration all possible retreat arrangements and also use the one the reduces the variety of remaining subsets through as lot as possible (which have the right to never be more than than a variable of two). We proceed in this fashion until we have diminished the variety of possible subsets to one, and that must be our set of spies.

While this greedy technique would likely be optimal, it’s computationally intractable. There room $\binom102417 \approx 3.68\times 10^34$ feasible subsets that spies and there room $2^1024 \approx 1.8\times 10^308$ feasible retreats we could arrange at every step. So there is no expect of computer the optimal greedy strategy and seeing how well the performs compared to our lower bound.

We deserve to think the this together a game, where I’m trying to guess the spies and the evil one is trying come arrange the spies such the it’s as challenging as feasible for me to guess. For this reason this becomes a philosophical question: if I pick a break-up such that there is a slight benefit for my devil to say “yes, there was a meeting”, then as lengthy as they save saying “yes”, the systems is recursive and easily computable. However if my adversary knows that the trouble gets an extremely hard whenever they to speak “no, there was no meeting”, castle they might just carry out something “suboptimal” v the score of to run me out of memory.

Suboptimal heuristic

The greedy heuristic seems out the reach, so here is an additional heuristic, which to be truly a team initiative — discussions through my colleague Alberto Del Pia, comments on this article by Jim Crimmins, Adam, and also Jason Weisman, as well as a little bit of effort on my end. The idea is to split the $n$ agents into $m$ groups of approximately equal size, wherein $k+1 \le m \le n$. Let’s number the teams $1,\dots,m$. Right here is the plan:

TakeGroup 1 continues to be home, all other groups go top top a retreat. If there was a meeting, then group 1 has no spies in it. If there to be no meeting, then group 1 has at least one spy in it. Go back to step 1, however this time, group 2 stays home and all other groups go ~ above a retreat.Keep going in this fashion until we have determined all groups with at the very least one spy in them. There can be at many $k$ such groups. Unify these groups (this becomes our new $n$), discard all various other groups, and repeat the entire procedure with the brand-new smaller group: choose $m$, splitting into groups, etc.

Note that we carry out not require to shot all $m$ feasible retreats in each round! As quickly as there space $k$ retreats through “no meeting”, then we can stop and immediately continue to the next round, because we now understand which $k$ groups contain spies. In the worst case, this won’t happen. Us will constantly be forced to have actually as numerous retreats as possible. By the moment we have actually done $m-1$ retreats, two things can happen:

We have determined $k$ spies, in which situation we can lug over our $k$ spy-containing groups to the following round immediately.We have determined $k-1$ spies, in which instance we deserve to just lug over the last team to the next round without testing it.

In either case, we only require $m-1$ retreats and we constantly send $k$ groups to the following round. So there is never any type of need to have all $m$ retreats in a offered round. To watch what wake up at each round, note that if we division $n$ into $m$ groups, climate to number out the team sizes, let $q$ and also $r$ be the quotient and also remainder, respectively, as soon as we divide $n$ through $m$. Therefore $n = qm + r$, v $0\le r\le m-1$. Climate $n$ will certainly be split into $m$ teams as follows:\But i m sorry groups lug over come the next round? because we acquire to select the bespeak in which us schedule the retreats, and not all teams are of same size, we have the right to ensure that the last team (corresponding to the last retreat that never ever happens) is as tiny as possible. Yet it’s also feasible that ours $(m-1)^\textth$ team gets sent to the next round instead. Therefore our finest bet is to save the two smallest groups for last. Ultimately, the worst case will check out the $k-1$ biggest groups and the $2^\textnd$ smallest team sent come the following round. Let’s speak to $g(n,m)$ this number. It’s rather annoying to compute $g(m,n)$, since it counts on the relative size of $k$ and $r$, yet it’s around equal come $kn/m$. Below is what the recursion watch like:\The recursion have the right to be solved in a simple manner making use of recursion. Below is some Julia code together with the complete computation that $g(n,m)$ that does the job:

nmax = 1024k = 17memo = -1 + zeros(Int,nmax)# if n agents are split into m groups, choose k-1 biggest and second smallestfunction g(n,m,k) # n = q*(m-r) + (q+1)*r q,r = floor(Int,n/m), rem(n,m) if m == k+1 # we must pick the k biggest in this case (r >= k ? k*(q+1) : r*(q+1) + (k-r)*q) rather # choose the k-1 largest and the 2nd smallest in this case (r >= k-1 ? (k-1)*(q+1) : r*(q+1) + (k-1-r)*q) + (m-r >= 2 ? q : q+1) endendfunction rk(n) if memo != -1 memo rather memo = ( n == k ? 0 : minimum() ) endend rk(1024)In the very first round, we have actually $n=1024$ suspects. The optimal thing to perform is to choose $m=43$ groups, leaving us with $407$ suspects. Then we select $51$ teams to get down come $136$ suspects. Then choose $46$ teams again to acquire down to $50$ suspects. Finally, we choose $50$ groups and also we’re down to $17$ suspects, which have to be our spies. For this reason the full is $(43-1)+(51-1)+(46-1)+(50-1)=186$ complete retreats. Interestingly, if we use the approximation $g(n,m) \approx \lceil \tfracknm \rceil$, we additionally get the an outcome of $186$. In addition to our previously obtained lower bound, us obtain:

$\displaystyle\textnumber the retreats:\quad 122 \leq r_17(1024) \leq 186$

The reduced bound method that it’s impossible for any strategy to do far better than this, yet there might not be any kind of strategies that are this good. The top bound states it’s definitely possible to execute this well using a particular strategy, yet other strategies might do better. In this case, the ratio between our upper and lower limit is around 1.52. This sort of ratio is used in approximation theory to quantify the top quality of a suggest solution when the trouble is tough to deal with exactly.

How about the cost? We will certainly approximate. For each round, in the worst case, we have actually $m$ retreats, where $k$ the them have actually meetings and $m-k$ execute not. Whenever over there is a meeting, the group that continued to be home deserve to be to exclude, forever after ~ (since we recognize it has no spies), so us don’t have to send that team on subsequent retreats. The many expensive thing that can occur is that our $k$ retreats with meetings come last. So the total variety of attendees will be approximately:\beginaligna(n,m) &\approx (m-1-k)\left(n-\tfracnm\right) + \sum_j=1^k \left(n-j\tfracnm\right)\\&\approx \fracn \left(k-k^2+2 (m-1)^2\right)2 m\endalignSo if we end up choosing $m$ groups when there space $n$ attendees, we should include $a(n,m)$ come our running count of attendees. Incorporating this utilizing the solution found above, we achieve the an outcome that the strategy will require sending around $65484$ attendees come retreats. In ~ a expense of \$1,000 per attendee, this means it will price at most \$65.48 million to uncover all the spies.

For the curious, right here is a plot showing the maximum variety of retreats forced to sniff out 17 spies, together a role of the total number of agents:

*

As we deserve to see, the upper and also lower bounds begin close together yet end up spreading apart as $n$ increases.

Minimizing cost instead?

The difficulty statement asked united state to minimize the variety of retreats, which led us to 186 retreats in ~ a cost of \$65.48 million. If instead we test to minimization dollars using a similar strategy of dividing into groups and also keeping one team home, we can formulate a similar recursion come the one above, except this time we execute the minimization over cost rather 보다 retreats. The result in this case is 220 retreats at a cost of \$54.46 million. This renders sense. We can lower the price if us want, yet only at the price of adding much more retreats.

See more: How To Start A Mini Cooper Features Even Owners Don'T Know About

Even better!

It transforms out the upper bound can be diminished to $131$ by rather singling out one spy at a time! This turns out to have a much far better worst-case performance than splitting the continuing to be agents into equally-sized teams as I described in mine solution. As one can expect, the worst-case cost in dollars goes increase dramatically. The solution, as result of Tim Black, deserve to be discovered here. It still continues to be to be checked out whether there exists a straightforward strategy that can further close the gap between the reduced bound of 122 and the brand-new upper bound of 131.