Alignment of Noisy Unstructured Text Data
Julien Bourdaillet, Jean-Gabriel Ganascia
Pierre and Marie Curie University - Paris
• Monolingual alignment
• Texts can be very different
• Textual genetic criticism
• Sequence processing algorithm: Suffix tree + A*
• Experiments:
  • Very noisy text alignment
  • Duplicate linkage
  • Text reuse
  • Synthetic data alignment
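To make "monolingual alignment" concrete, here is a minimal sketch of aligning two versions of a text at the word level. It is not the suffix tree + A* algorithm named in the outline: it swaps in Python's standard `difflib.SequenceMatcher` as a simple stand-in, and the function name `align_words` and the sample sentences are assumptions made only for illustration.

```python
from difflib import SequenceMatcher


def align_words(text_a, text_b):
    """Align two texts word by word and return a list of edit blocks.

    Each block is (tag, words_from_a, words_from_b), where tag is one of
    'equal', 'replace', 'delete' or 'insert'.
    NOTE: illustrative stand-in only; this uses difflib's matching-block
    heuristic, not the suffix tree + A* approach from the slides.
    """
    a = text_a.split()
    b = text_b.split()
    # autojunk=False disables difflib's popularity heuristic so that
    # frequent words are still considered when matching.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


blocks = align_words("the quick brown fox", "the slow brown fox jumps")
for tag, words_a, words_b in blocks:
    print(tag, words_a, words_b)
```

The 'equal' blocks play the role of invariant passages shared by both versions, while 'replace', 'insert' and 'delete' blocks mark where the texts diverge. Note that this stand-in cannot detect blocks that were moved from one place to another.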