Abstract: We consider the case where a pool of DNA molecules clones both, flipped and not-flipped, have been cut by restriction enzymes. Ideally, each clone is cut in the same positions, although in practice due to errors, this does not always happen. The computational problem is to determine where the cuts have occurred.

This is a key problem in determining the structure of the original DNA molecule.

A single molecule is represented by a string of 1's and 0's, with cuts represented by $1's$. A set of molecules clones (with errors) is observed, but the orientation/parity of each molecule is unknown. Clear is that the location of the observed cuts of one molecule are dependent on the parity: flipping the molecule would result in the cuts location, as observed, being ``flipped'' .

We propose a Bayesian approach to generate a posterior distribution on the cuts and parity, given the data. We first present an approximate algorithm where we attempt to divide the problem into subproblems, but it is not guaranteed to solve the problem. Then, we propose another approximate method based on a statistical framework and a mean field annealing algorithm. It computes the maximum posterior marginal (MPM estimator) and maximum aposteriori estimate (MAP estimator).

We also provide evidence that the exact solution of the problem is intractable.