Imagine there’s have an array of integers but you aren’t allowed to access any of the values (so no `Arr[i] > Arr[i+1]`

or whatever). The only way to discern the integers from one another is by using a **query()** function: this function takes a subset of elements as inputs and returns the number of unique integers in this subset. The goal is to partition the integers into groups based on their values — integers in the same group should have the same value, while integers in different groups have different values.

The catch – the code has to be O(nlog(n)), or in other words the query() function can only be called O(nlog(n)) times.

I’ve spent hours optimizing different algorithms in Python, but all of them have been O(n^2). For reference, here’s the code I start out with:

```
n = 100
querycalls = 0
secretarray = [random.randint(0, n-1) for i in range(n)]
def query(items):
global querycalls
querycalls += 1
return len(set(items))
groups = []
```

`secretarray`

generates a giant random list of numbers of length `n`

. `querycalls`

keeps track of how much the function is called. `groups`

are where the results go.

The first thing I did was try to create an algorithm based off of merge sort (split the arrays down and then merge based on the query() value) but I could never get it below O(n^2).

### >Solution :

Let’s say you have an element `x`

and an array of distinct elements, `A = [x0, x1, ..., x_{k-1}]`

and want to know if the `x`

is equivalent to some element in the array and if yes, to which element.

What you can do is a simple recursion (let’s call it `check-eq`

):

- Check if
`query([x, A]) == k + 1`

. If yes, then you know that`x`

is distinct from every element in`A`

. - Otherwise, you know that
`x`

is equivalent to some element of`A`

. Let`A1 = A[:k/2], A2 = A[k/2+1:]`

. If`query([x, A1]) == len(A1)`

, then you know that`x`

is equivalent to some element in`A1`

, so recurse in`A1`

. Otherwise recurse in`A2`

.

This recursion takes at most `O(logk)`

steps. Now, let our initial array be `T = [x0, x1, ..., x_{n-1}]`

. `A`

will be an array of "representative" of the groups of elements. What you do is first take `A = [x0]`

and `x = x1`

. Now use `check-eq`

to see if `x1`

is in the same group as `x0`

. If no, then let `A = [x0, x1]`

. Otherwise do nothing. Proceed with `x = x2`

. You can see how it goes.

Complexity is of course `O(nlogn)`

, because `check-eq`

is called exactly `n-1`

times and each call take `O(logn)`

time.