Meeting 17 September 2016

Sep 17, 2016 • meetings • Christoph Dürr, Antoine Amarilli and all the present students

We read some problems solved by Miguel Oliveira. Let’s learn how to think as the problem setter Miguel Oliveira.

[!] this file might be updated during the coming days.

SLIDE - Team Slide Treasure Hunt Race

We are given a grid with positive integers with up to 10 columns and 10.000 rows. We want to find two paths from the top-left and top-right cells that reach the bottom-left and bottom-right cells, where each path consists only of down, down-left, down-right steps. The goal is to maximize the total value over all covered cells.

The following observation simplifies the implementation, but is not crucial for the computational complexity. First we can observe that in an optimal solution the paths will never cross. This still needs a rigorous proof, but here is the idea. Conceptually color the path starting top-left in blue and the path starting in top-right in red. Then if at some rows the red path is to the right of the blue path then this happens between two particular rows, where the paths cross. At these crossing points we can exchange the portions of the paths (untangle them), without changing the objective value of the solution. Also if paths do not cross but intersect in some cells, than we can shift some portion of a path such that more cells are covered. Since the grid values are positive, this would improve the solution.

As a consequence the optimal solution is completely defined by pairs \((a_i,b_i)\) with \(0\leq a_i < b_i \leq M-1\) for each row \(i\), such that \(a_i-a_{i+1} \in \{-1,0,+1\}\) and \(b_i-b_{i+1} \in \{-1,0,+1\}\), and \(a_0=a_{N-1}=0, b_0=b_{N-1}=M-1\).

Denote by \(O[i,a,b]\) the maximal total value of all cells covered by paths from cells (0,0) to (i,a) and from (0,M-1) to (i,b) respecting the required conditions. The base case is

\[O[0,a,b] = \begin{cases} G_{i,a} + G_{i,b} &\mbox{if } a=0, b=M-1 \\ -\infty & \mbox{otherwise.} \end{cases}\]

The recursion is for \(i>0\)

\[O[i,a,b] = G_{i,a} + G_{i,b} + \max\{ O[i-1,a',b']\},\]

where the maximization is over all pairs such that \(\max\{0,a-1\} \leq a' \leq \min\{M-2, a+1\}\) and \(\max\{a'+1, b-1\} \leq b' \leq \min\{M-1,b+1\}\).

The answer to the problem is then \(O[N-1,0,M-1]\). The dynamic program has \(O(N M)\) variables, each of which is the maximization over a constant number of expressions. With the given bounds, the program has the right complexity.

CANDYSTN - Candies and Milestones

This problem is pure simulation and can be solved in linear time.

Suppose that we start with x candies in the bag. At any moment there are x+d candies in the bag, and c candies in total, with d=0, c=N initially. Then when we simulate the sequence of operations we update c and d, and each step generates the inequalities \(0\leq x+d \leq c\). This generates a strongest lower bound for x and a strongest upper bound. If are contradictory, you can answer that there is no solution (print -1), otherwise you can choose x as the lower bound.

Since each milestone number of the input is in the range \([-10^6,10^6]\) and there are up to \(10^4\) of them, the intermediate computed numbers can be in the range \([-2\cdot 10^{10},+2\cdot 10^{10}]\) which does not fit in a 32 bit integer. A solution in C++ should therefore use long long.

LINEUP - Lineup

This is a perfect matching with maximum weight problem. Consider the bipartite graph with 11 vertices on the left (the players \(u_0\) to \(u_{10}\)) and 11 vertices on the right (the slots \(v_0\) to \(v_{10}\)). Let there be an edge between \(u_i\) and \(v_j\) if the given weight matrix has non-zero entry in row i and column j. This entry defines the weight of the edge. This graph has at most 55 edges, since the statement says that there are at most 5 non zero entries per row. But to simplify the implementation we will assume that the graph is complete, by adding dummy edges of sufficiently small weight -D, where \(D=11\cdot 100\) should be enough. Using \(-\infty\) is not good here because we would not be able to distinguish between cycles having one or two of these additional edges.

A perfect matching in this graph is a selection of edges such that every vertex is covered exactly once by this selection. Our goal is to find a perfect matching which maximizes total edge weight.

This can be solved using the Hungarian algorithm in time \(O(n^3)\), which for \(n=11\) is more than sufficient. But the algorithm is long to implement.

We propose a solution which repeatedly searches for positive weight alternating cycles and improves the matching by changing its along the cycles.

Now that we have a complete graph, we can start with an arbitrary perfect matching, say consisting of all edges of the form \((u_i,v_i)\). Matching theory tells us that if the current matching is not of maximal weight then there is an alternating cycle of positive weight. An alternating cycle is a cycle in the graph, that alternates with edges from the matching and edges which are not in the matching. The weight of such a cycle is the total weight of its non-matching edges minus the total weight of its matching edges. The symmetric difference of the current perfect matching and of the alternating cycle is another perfect matching, and the weight of the matching is changed by adding the weight of the cycle.

The algorithm has the following structure.

start with an arbitrary perfect matching
while there is an alternating cycle with positive weight
- change the matching along the cycle

Let n be the number of vertices in the graph on each side (\(n\leq 11\)) and c be the maximum edge weight (\(c\leq 100\)) then the algorithm above makes at most cn+n iterations, as the matching weight increases strictly and is upper bounded by cn, and there are at most n negative edges in the matching. Positive cycles can be found with a simple modification of Bellman-Ford, as sketched below. Hence the algorithm has running time in the order of \(cn^4\) which is less than a million with the given limits, hence it is ok.

In order to find positive alternating cycles in the bipartite graph G(U,V,E) with edge weights w and matching \(M:U\rightarrow V\) (seen as a mapping for convenience), we consider the graph H(U,E’) with weights w’ such that \((u_i,u_j)\in E'\) iff there is an edge from \(M[u_i]\) to \(u_j\). The edge has weight \(w'(u_i,w_j) := w(u_i, M[u_i]) - w(M[u_i], u_j)\). Every cycle C in H corresponds to a cycle in G of opposite weight (that is, multiplied by -1). Hence we can just run Bellman-Ford in order to find negative cycles in H. By replacing each edge in H by two alternating edges in G we obtain an alternating cycle in G of positive weight. The implementation below follows this idea, without building the graph H explicitly.

Here is the code that detects an alternating cycle of positive weight. The graph is given as a 2 dimensional matrix. We encode a matching by a table associating to each row vertex from U the corresponding column vertex in V.

/* Find a positive weight alternating cycle given a perfect matching
Bellman-Ford

weight:  an n by n weight matrix
match: matching left to right mapping
complexity: O(n^3)
return: vertex that leads to a cycle, or -1
*/
int find_pos_alt_cycle(int weight[][N], int match[], int prec[]) {
    int changed;
    int potential[N];
    fill(potential, potential + N, 0);
    for (int i=0; i<N; i++) {              // has to iterate N times in order to detect cycles
        changed = -1;
        for (int u=0; u<N; u++) {          // relax all 2-paths of the form (v0, u, v1), where v0 is matching of u
            int v0 = match[u];
            for (int v1=0; v1<N; v1++) {
                int alt = potential[v0] - weight[u][v0] + weight[u][v1];
                if (alt > potential[v1]) {
                    potential[v1] = alt;   // improvement found
                    prec[v1] = u;
                    changed = v1;
                }
            }
        }
    }
    return changed;
}

If the function above returns a vertex (a number different from -1), then we can follow the precedence relation until we hit a cycle. After that, we just make a single loop around that cycle, changing the matching along the way:

/* maximum profit bipartite perfect matching

weight: an n by n weight matrix
returns: the maximum total weight over all perfect matchings
complexity: O(n^4 C), where C is the maximum difference in weights
*/
int max_profit_bipartite_matching(int weight[][N]){
    int match[N];
    for (int i=0; i<N; i++)
        match[i] = i;
    int prec[N];
    int start;
    while ((start = find_pos_alt_cycle(weight, match, prec)) != -1) {
        bool visited[N] = {false};
        while (! visited[start]) {   // follow precedence relation, until cycle hit
            visited[start] = true;
            start = match[prec[start]];
        }
        int v1 = start;              // vertex on cycle found
        int count = 0;
        do {
            int u = prec[v1];        // follow this cycle
            int v0 = match[u];
            match[u] = v1;           // alternate matching along the cycle
            v1 = v0;
        } while (v1 != start);
    }
    int sum = 0;                     // compute total weight of matching
    for (int u=0; u<N; u++)
        sum += weight[u][match[u]];
    return sum;
}

GOODA - Good Travels

The correct approach is to strongly connected components. This can be done, for instance, using the algorithm of Kosaraju, which consists of two DFS traversals of the graph, and which produces the strongly connected components in topological sorted order. Well done, because we need this order. Life is good, isn’t it? We can then consider the directed acyclic graph resulting from the contraction of these components. The weight of a super vertex is the sum of the weights of the vertices in the component.

Now we want to find a path from the super vertex containing the source to the super vertex containing the destination. Clearly we can solve this by dynamic programming, by computing the heaviest path from the source super vertex to each super vertex v, which we perform by processing the super vertices in topological order.

This gives a linear time algorithm.

POUR1 - Pouring water

The problem is rumored to have been invented by Poisson, but we could not verify this.

You are given two vessels of respectively a and b liters, which initially are empty. You are allowed (1) to fill a vessels completely, (2) empty it completely or (3) pour one vessel into the other until one becomes empty or the other full. Using these operations, determine whether it is possible to obtain exactly c liters in one of the vessels, and find the minimal number of operations required.

It seems that one just needs to check whether 0≤c≤b and if the greatest common divisor of a and b is also a divisor of c. This would yield a solution in logarithmic time in the upper bound of the given integers. See Prologin 2012, Correction des questions d’algorithmique, Jill-Jênn Vie et Antoine Pietri, section 3.3.

Another solution is to perform a BFS to explore the possible configurations of the vessels, i.e., the possible pairs of contents. While it seems that the number of configurations could be quadratic in the vessel capacity (which would not be acceptable given the bounds), a finer analysis reveals that the number is linear: indeed, each operation preserves the invariant that there has to be one vessel which is empty or full. Hence, we can simply perform the BFS, using a map as a sparse storage of the reachable configurations. There are \(2\cdot2\cdot40000\) vertices in the graph, each of outdegree at most 6. Hence the running time of this solution is acceptable.

#define mkp make_pair
#define mkps(choice, a, b) ((!choice) ? mkp((a), (b)) : mkp((b), (a)))

typedef pair<int, int> pii;

inline void my_push(queue<pii> &q, map<pii, int> &V, int d, pii val) {
  if (V[val] > 0)
    return;
  V[val] = d+1;
  q.push(val);
}

int main() {
  int t;
  scanf("%d", &t);
  for (int ncase = 0; ncase < t; ncase++) {
    int capa[2], c;
    scanf("%d%d%d", &(capa[0]), &(capa[1]), &c);

    map<pii, int> V;
    V[mkp(0, 0)] = 1;
    queue<pair<int, int> > myq;
    myq.push(mkp(0, 0));

    int result = -1;
    while (!myq.empty()) {
      pii pp = myq.front();
      myq.pop();
      int d = V[pp];
      int v[2] = {pp.first, pp.second};
      if (v[0] == c || v[1] == c) {
        result = d-1;
        break;
      }
      for (int j = 0; j <= 1; j++) {
        my_push(myq, V, d, mkps(j, 0, v[1-j]));
        my_push(myq, V, d, mkps(j, capa[j], v[1-j]));
        if (v[j] > 0) {
          if (v[j] <= capa[1-j] - v[1-j])
            my_push(myq, V, d, mkps(j, 0, v[j] + v[1-j]));
          else
            my_push(myq, V, d, mkps(j, v[j] - capa[1-j] + v[1-j], capa[1-j]));
        }
      }
    }
    printf("%d\n", result);
  }
  return 0;
}

Question

In case we want to compute the minimum number of operations needed to create a volume of c liters, do we need to perform this BFS exploration, or does the proof of the gcd solution contain an answer to this optimization question?

MISERMAN - Wise And Miser

Look closer. Do you see the relation with the problem SLIDE above?

MTWALK - Mountain Walking

Given a cell-valued N by N grid, you want to find a path from the top-left cell (source cell) to the top-right cell (target cell) using only left-, right, top or down steps, which minimizes the difference between the largest and smallest cell value covered by the path.

Our idea is to sort the grid vales in a table T, which costs \(O(N^2 log N)\). Then for every index i you find the smallest index \(j\geq i\) such that the source and target cells are connected in the grid restricted to the cells with value between \(T[i]\) and \(T[j]\). For a fixed index \(i\) this task can be achieved in time \(O^*(N^2 \alpha(N))\), where \(\alpha\) is the inverse Ackermann function of \(N\). Simply use a union-find structure and, for increasing \(j\), add edges adjacent to the \(j\)-th cell in the order \(T\) until source and target are in the same connected component.

Of course, we only need to consider the distinct values of \(T\). As there are only \(C\leq 110\) different grid values, this leads to an overall complexity of \(O(C N^2 \alpha(N))\), which is in the order of a million with the given bounds.