# Résolution de problèmes algorithmiques

Jussieu, room 26-25/105, 14:00 until 17:00

No need to bring a computer. Only paper and pen, and if needed some fruits and cookies. Coffee is provided.

We will talk about solutions to problems A to I from this booklet and some subselection from this selection.

## HeavyTransportation

Given a graph G(V,E) with V={1,…,n} and edge weights, find the largest x such that the in the graph restricted to edges of weight at least x, the vertices 1 and n are connected.

In time $O(n \log n)$. Start with empty graph. Add edges in order of decreasing weight, until source 1 and target n become connected.

### vertex indexes

(optional) Decrement indexes by 1 after reading to make them start at 0.

See Seminar room

## Collecting Beepers

Given a grid with 10 targets and an origin, find a minimum length tour from origin to origin that visits all targets. This a traveling salesman problem, which you can solve in time $(n^2 2^n)$.

Given n=30000 rectilinear rectangles, the goal is to find the total area that is covered by at least 10 rectangles. This can be solved in time $O(n\log n)$ by the sweepline algorithm using a segment tree.

## Disk Tree

The key observation is that if you order the input lines lexicographically, then they are in the output order. Then you just need to use spaces to replace the common prefix with the previous line for each output line. This is much simpler than to build the actual tree.

## Phone Numbers

Organize the dictionary as a prefix tree with outdegree 10. This permits to generate all words that match some input numbers.

You can attach to every node of the tree a possible word that corresponds to the path from the root to that node. In addition by dynamic programming you can attach to every node the smallest word in the sub-tree rooted at that node.

All in all the running time to build the tree is linear in the total word size of the dictionary. Also for every input number you need to do a linear time traversal of the tree.

## ROCK - Sweet and Sour Rock

The problem is about breaking a 01 string into smaller strings of maximum total score. The score of a string is its length if it contains more 1’s than 0’s otherwise its score is zero.

This can be solved by dynamic programming. For every j compute the optimal solution for the prefix of size j. This solution consists of a last string, say from i to j, and the optimal solution for the prefix of size i - 1. The solution is straightforward and runs in quadratic time.

## City Driving

Given a tree with a single additional edge, find shortest paths between different source-destination pairs. Essentially we have to implement a solution to the least common ancestor problem.

See an explanation here.

Opérations pour obtenir un mot lexicographiquement plus grand :

CHANGING, INSERTION, SUPPRESSION

Calculer un plus long chemin est NP-dur en général, mais polynomial par programmation dynamique dans un graphe orienté sans cycle (DAG en anglais).

Le graphe orienté obtenu est acyclique. Alors par composition un plus long chemin i->j->k est composé de plus longs chemins i->j et j->k. On va tout simplement calculer d[i] la longueur du plus long chemin commençant par i, et succ[i], le successeur dans ce chemin, où succ[i]=-1 si le chemin s’arrête là. Et d[i]=1+max j d[j] pour tout arc i->j.

Pour avoir les mots u qu’on peut obtenir d’une opération sur w et avec u<w, on utilise une table de hachage. Chaque mot “jus” est aussi stocke sous les clés “?us” “j?s” et “ju?”. Pire : pour “ju?” on stocke seulement l’indice du mot qui matche avec “ju?” et qui a plus longue sequence parmi ceux-la.

Pour s’assurer que la séquence est lexicographiquement croissante, on traite tout simplement les mots dans le bon ordre et on les ajoute dans le dictionaire seulement au fur et a mesure.

Problème 1: les mots données ne sont pas du tout triés comme promis dans l’énoncé. Je l’ai verifie en forcant un time limit exceeded dans le cas non-trié. Et il y a des doublons.

Problème 2: une map basée sur un arbre binaire de recherche est trop lente pour nous. Il nous faut une table de hashage. Pour cela il faut fournir une fonction de hashage comme ci-dessous.

## Edit distance

This is a classical dynamic programming problem. Reduces to finding a shortest path in a grid.

## Spiral of Numbers

This a problem of converting a position to the spiral to a grid coordinate and vice-versa. By looking at the example we observe the following relations.

and

## Color a Tree

This is a scheduling problem, which is denoted as 1|tree|sum wjCj.

We are given a rooted tree, representing a precendence relation between jobs. Every job has a unit size and a weight. The goal is to find an order on the jobs that respects the precendence relation and minimizes the total weighted completion time.

Algo glouton en $O(n^2)$ – peut être fait en $O(n \log n)$.

Problème d’ordonnancement de taches unitaires avec relation de précédence en forme d’arbre. Minimiser somme pondéré sur les temps de complétude. Donc on veut mettre le plus de poids possible vers le début.

Algorithme glouton dans le sens suivant. On ne peut pas juste ordonnancer la racine, puis à chaque fois la tâche la plus lourde active (dont le père a été ordonnancé).

Plutôt on va ordonnancer la tâche la plus lourde juste après son père, par un argument d’échange c’est optimal. Ensuite on fusionne les deux taches, en une tache plus lourde et plus longue. En général w[u] est la taille de la tâche u, s[u] sa longueur et o[u] le coût engendré si la tâche est ordonnancée au temps 0. On recommence a chaque fois avec la tâche qui a le plus grand rapport poids/taille, aussi appelé Smith-ratio.

Quand on fusionne u dans v, pour marquer u comme supprimée on pose s[u]=0.

This is the classical Floyd-Marshal all pair shortest paths problem.

On lit les mots, et on les trie. Ensuite on ne travaille qu’avec les indices, et l’ordre sur les indices est aussi l’ordre lexicographique sur les mots.

On calcule la distance d’édition entre les mots. C’est la distance L1 entre les vecteurs d’occurrence des lettres des mots. Cette distance défini un graphe, où il y a une arrête (i,j) si la distance L1 est 1.

Invariant: dist[i][j] est la distance entre i et j dans ce graphe en n’utilisant que des sommets intermédiaires entre 0 et k, ou k va varier de 0 a n-1.

next[i][j] est le prochain sommet sur le plus court chemin entre i et j. S’il y a plusieurs plus court chemins, on ne garde que le lexicographiquement plus petit.

La solution est un couple (bi,bj) (b, comme best), qui dans l’ordre

• maximise la distance L1 entre bi et bj
• minimise la distance dist[bi][bj]
• minimise lexicographiquement le chemin de bi à bj

## Tiling a Grid With Dominoes

Can be solved using fast exponentiation in time $O(\log W)$, plus a preprocessing using $6^3 \cdot 32$ multiplications.

The important aspect of the tiling is the line between two columns. There are several configurations depending on where the vertical dominoes cross this line. We encode these configurations using 4 bits, one for each row.

There are 6 different possible configurations as depicted here. The - represents the crossing by a horizontal domino.

0  1  2  3  4  5
|  -  -  |  -  |
|  -  -  |  |  -
|  -  |  -  |  -
|  -  |  -  -  |


We encode in a 6 by 6 adjacency matrix, the compatibility between configurations. M[i,j]=1 if there is a tiling with configuration to the left of a column and configuration j to the right of the same column. Otherwise M[i,j]=0.

Now M taken to the W-th power gives a matrix, where the entry at row 0 and column 0 is the answer to the problem, namely the number of domino tilings of a 4 by W grid.

## Boolean Expression

Given two boolean expressions on 27 variables, decide whether their truth value is the identical for all possible boolean assignements to the variables.

This is a co-NP-hard problem. Deciding if a boolean expression is equivalent to the constant False, is equivalent with deciding if there is a boolean assignment that validates the boolean expression.

We propose a backtracking algorithm to answer the question. Let L be the list of all variables that appear in the given expressions. In order to avoid evaluating the expression for all $2^{\#L}$ possible boolean assignments, we use a simple trick to prune the search tree. We extend the boolean algebra to partial assignments. The consist of assigning to the variables the values True, False and Undefined. The evaluation rules are quite natural, for example True OR Undefined = True while False OR Undefined = Undefined. Every node in the search tree will correspond to a True-False assignment to some prefix of L, and Undefined assignment for the remaining variables. Now we have to consider 3 cases.

• If one of the boolean expressions evaluates as undefined, we extend the tree as follows. Let x be the first undefined variable in L (there must be at least one). The current node has two descendants, one where x is assigned to True and one where x is assigned to False.

• If both expression are true or both are false, we proceed as in the previous case.

• If one of the expression is false and one is true, we can stop the exploration, because we know that the two boolean expression are not equivalent.

Now if the construction of the tree completes without interruption we can safely answer that the expressions are equivalent. It should be noted that we don’t need to build explicitly the tree, it is induced by recursive calls to the backtracking function.

In fact the main difficultly is to parse the expression and build a internal representation that permits an easy evaluation. For this purpose we encode the truth values as True=+1, False=-1 and Undefined=0. With the following class we can compose any boolean expression.

In order to parse the expression we follow a simple grammar, which is:

 O = A | A + O
A = N | N A
N = V #*
V = var | ( O )


The expression can be constructed from the string representation as follows. (There might quite well be a simpler method.) Note how every non-terminal of the grammar is represented by a dedicated method.

## FTOUR - Free Tour

See shortest cycle.

Explanation

## Clustering

Given n points in the plane, clustern them into k groups, such that the minimal distance between two points of different clusters is maximized.

This is a nice variation of Kruskal’s algorithm. Simple interrupt it, once there are k connected components.

## Muddy Fields

This a minimum bipartite vertex cover problem. Simply let every maximal row segment containing only mud fields be a vertex. Segments can have length as small as 1. Do the same for column segments. Whenever a row segment and column segment intersect, let there be an edge. This is a bipartite graph. The required solution is precisely a minimum vertex cover in that graph.

## Street directions

You are given a city map formed by vertical and horizontal streets. There are n intersection pairs forming a source and a target. The goal is to orient the streets to make them one way streets such that every source-target pair can be connected using only a single turn.

Think a bit, this can be modeled as a 2-SAT formula. See Seminar room.