  # Résolution de problèmes algorithmiques

Given the a rewriting system of the form $f:\Sigma\rightarrow \Sigma^\star$, determine the limit of $f^i(a)$ for some given seed letter $a\in \Sigma$.

# Rewriting system

Consider the infinite sequence $a,f(a),f(f(a)),\ldots$. In the first step, letter $a$ is replaced by the string $f(a)$. In the second step every letter in $f(a)$ is replaced by its image by $f$ and the results are concatenated, and so on.

Consider these examples: $f:\: a \mapsto bb,\: b \mapsto b$
a, bb, bb, bb, …
$f:\: a \mapsto ab,\: b \mapsto a$
a, ab, aba, abaab, abaab, abaababa, abaababaabaab, …
$f:\: a \mapsto ba,\: b \mapsto c,\: c \mapsto b$
a, ba, cba, bcba, cbcba, bcbcba, …

We can observe quite different behaviors. The first example reaches quickly a fixpoint, the second example seems to converge to a fixpoint (in form of an infinite string), but which is not reached within a finite number of steps. And the last example observes some cyclic behavior.

# Letter type

This motivates us to distinguish 3 types of letters, depending on properties of the sequence generated by sucessive applications of $f$ starting with the letter $x$.

• Letter $x$ is of type 1 if successive applications of f eventually reach a fixpoint, i.e. if there is an integer i such that $f^i(a)=f^{i+1}(a)$.
• Letter $x$ is of type 2 if for every integer k there is an integer i such that $f^i(x)$ has length at least k and the k first letters are the same in all strings $f^j(x)$ for $j\geq i$.
• Letter $x$ is of type 0 in the remaining case.

# Notation

We denote the i-th string in this sequence by $f^i(a)$. If the sequence has a limit, then denote it by $f^\star(a)$. In particular if $f^\star(a)$ is a finite string, then it is reached in a finite number of iterations, and $a$ is of type 2. If $f^\star(a)$ is an infinite string, then clearly it is not reached within a finite number of steps, and $a$ is of type 2. (This is less obvious, but I believe it to hold.)

# Simplification

We might get rid of letters $x$ which are mapped to the empty string, simply by removing $x$ from all right handsides. This can be done with a queue containing symbols to be removed, and whenever the right hand side of some rule becomes empty, the left hand side enters the queue.

From now on, let’s assume that $f$ does not map to an empty string for all letters considered in the problem.

# Rules determining the types

## Basic rules:

• If $f(x)=x$, then $x$ has type 1. ($x$ is a fixpoint for $f$)
• If $f(x)=xw$ for some arbitrary non empty word $w$, then $x$ is of type 2. Indeed, it generates the sequence $x, xw, xwf(w), xwf(w)f(f(w)), \ldots$.

• If $f(x)=y$ for some letter $y\neq x$, then clearly $x$ has the same type as $y$. But if these implications are circular, then $x$ has type 0 (and so have y and all the other letters along the cycle). ## More complex rules:

If $f(x) = y_1 y_2 \ldots y_k$ with $y_1 \neq x$ and $k > 1$, then things become more interesting.

If all letters $y_1 y_2 \ldots y_k$ are of type 1, then so is $x$. Simply because the limit of f on x is $f^\star(y_1\ldots y_k)$ which is $f^\star(y_1)\ldots f^\star(y_k)$.

If $y_i$ is the first letter not of type 1, then the type of x is determined by the type of $y_i$. Again if this leads to a circular dependecy – for example when $y_i=x$ – then $x$ has type 2. Indeed in the limit f will generate on x the string $w^\star$, where $w$ is the fixpoint reached by $y_1,\ldots,y_{i-1}$. # How to determine efficiently the types?

We will maintain a table type indexed by the alphabet. Initially all letters have an unknown type, that we could represent by the constant -1. For an implementation in Python, one would like to name the table kind instead to avoid the use of a keyword.

First we apply the basic rules, by setting the type of all letters $x$ with $f(x)=x$ or $f(x)=xw$.

Then for the remaining letters we inspect a function $g$, mapping $x$ to the first letter of $f(x)$. If iterated mapping of $g$ on a letter, ends on a letter with a defined type, then we don’t do anything for the moment. But if it ends on a cycle composed of letters with undefined type, then we know that all letters covered by the cycle need to be set to type 0.

Next, for all letters $x$ of undefined type, we maintain a pointer pos[x] on the string $f(x)$, such that we already know that all letters before this position have type 1. If the letter pointed by pos[x] has type 1, then clearly we can increment pos[x]. But if it has type is 0 or 2, then the type of $x$ must become the same type (0 or 2). If all letters of $f(x)$ have type 1 (pos[x] exceeds the length of $f(x)$), then the type of $x$ is set to $1$.

And finally we can safely set to 2 the type of all yet undefined letters.

# Time complexity

It is $O(kn)$, where $k$ is the size of the alphabet and $m$ is the total length of the images of $f$, hence roughly the size of the input.