## Deep Reinforcement Learning

Stable Baselines relies on TensorFlow 1.x, whereas Stable Baselines3 relies on PyTorch.



In December 2016, Étienne Simon and I presented it at ENS Paris-Saclay, but apparently I never ended up writing a blog post about it.

- 300 BC
- Euclid's algorithm (which may date back to the school of Pythagoras, 530 BC)

Application: Given an ordered list of keys with frequencies, build a binary search tree on those keys which minimizes the average query cost.
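
As a rough illustration of the problem above, here is a minimal sketch of the classic cubic-time dynamic program. It assumes that a query for a key costs the depth of that key in the tree (root at depth 1) and that only successful queries are counted; the function name `optimal_bst_cost` is hypothetical. Faster variants exist, so this is a baseline rather than the method of the post.

```python
def optimal_bst_cost(freq):
    """Minimal total cost sum(freq[i] * depth(i)) over all BSTs on the keys.

    freq[i] is the frequency of the i-th key in sorted order.
    Assumption: the root has depth 1, so a query costs one unit per visited node.
    """
    n = len(freq)
    # prefix[j] - prefix[i] is the total frequency of keys i..j-1
    prefix = [0] * (n + 1)
    for i, f in enumerate(freq):
        prefix[i + 1] = prefix[i] + f
    # cost[i][j] is the optimal cost for the half-open key range i..j-1
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            weight = prefix[j] - prefix[i]
            # Try every key r as the root; both subtrees sink one level deeper,
            # which adds the total weight of the range to the cost.
            cost[i][j] = weight + min(
                cost[i][r] + cost[r + 1][j] for r in range(i, j)
            )
    return cost[0][n]
```

Dividing the returned value by the sum of the frequencies gives the average query cost.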

Compute the Pareto set of a given set of points in 2 or 3 dimensions.
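
For the 2-dimensional case, a sort followed by a single sweep is enough. The sketch below assumes that smaller is better on both coordinates, so a point is dominated when another point is no larger on both coordinates; the name `pareto_front_2d` is hypothetical, and the 3-dimensional case needs more machinery than shown here.

```python
def pareto_front_2d(points):
    """Return the non-dominated points, sorted by increasing x.

    Assumption: both coordinates are to be minimized.
    Duplicate points are reported once.
    """
    best_y = float("inf")
    front = []
    # After sorting by (x, y), a point is non-dominated exactly when its y
    # is strictly smaller than every y seen so far.
    for x, y in sorted(set(points)):
        if y < best_y:
            front.append((x, y))
            best_y = y
    return front
```

The sort dominates the running time, which is O(n log n) for n points.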

You are given two polynomials $P$ and $Q$ and want to compute their product. The polynomials are given as arrays of their coefficients.
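
As a baseline, the schoolbook method multiplies every pair of coefficients. The sketch below assumes that index $i$ of each array holds the coefficient of $x^i$; the name `poly_mul` is hypothetical. It runs in quadratic time, whereas faster methods such as Karatsuba or the fast Fourier transform are the usual choice for large inputs.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists.

    Assumption: p[i] is the coefficient of x**i (lowest degree first).
    """
    if not p or not q:
        return []
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            # The term a*x^i times b*x^j contributes to the coefficient of x^(i+j)
            result[i + j] += a * b
    return result
```

For example, `poly_mul([1, 2], [3, 1])` computes $(1 + 2x)(3 + x) = 3 + 7x + 2x^2$.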

You are given an array $X$ with the promise that each of its values appears exactly twice. You want to transform $X$ so that in the end the two occurrences of every value are adjacent in the array. An allowed operation consists of removing a value from $X$ and appending it to the end. The cost of a solution is the largest value that was moved.

Given an array $x$ of length $n$ and an integer $k$, determine for every index $1 \leq j\leq n$ the maximum $x[i]$ among all indices $\max\{1,j-k+1 \} \leq i \leq j$.
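
One standard way to answer all these queries in linear total time is a monotone deque of indices. The sketch below uses 0-based indices instead of the 1-based ones in the statement and assumes $k \geq 1$; the name `sliding_window_max` is hypothetical.

```python
from collections import deque


def sliding_window_max(x, k):
    """For every j, return max(x[i] for i in range(max(0, j - k + 1), j + 1)).

    Assumption: k >= 1; indices are 0-based.
    """
    window = deque()  # indices whose values are strictly decreasing
    result = []
    for j, value in enumerate(x):
        # Older entries that are not larger than the new value can never
        # be the maximum of a later window, so drop them.
        while window and x[window[-1]] <= value:
            window.pop()
        window.append(j)
        # Drop the front index once it has left the window of size k.
        if window[0] <= j - k:
            window.popleft()
        result.append(x[window[0]])
    return result
```

Each index enters and leaves the deque at most once, which gives the linear bound.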

Maintain a set that supports adding and removing elements and querying the sum of its $k$ largest items (or of all items if there are fewer than $k$).
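
To make the interface concrete, here is a deliberately naive baseline built on a sorted Python list. It assumes that $k$ is fixed when the structure is created and that repeated values are allowed; the class name `TopKSum` and its methods are hypothetical. Updates take linear time and queries take O(k), so this is a reference to test faster structures against, not an efficient solution.

```python
import bisect


class TopKSum:
    """Naive baseline: keep all items in a sorted list."""

    def __init__(self, k):
        self.k = k          # assumption: k is fixed for the lifetime of the structure
        self.items = []     # always sorted in increasing order

    def add(self, value):
        bisect.insort(self.items, value)

    def remove(self, value):
        i = bisect.bisect_left(self.items, value)
        if i == len(self.items) or self.items[i] != value:
            raise KeyError(value)
        self.items.pop(i)

    def query(self):
        # Sum of the k largest items, or of all items if there are fewer than k
        start = max(0, len(self.items) - self.k)
        return sum(self.items[start:])
```

A short usage example: after `s = TopKSum(2)` and adding 5, 1 and 3, `s.query()` sums the two largest items, 5 and 3.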