NLPainter “Text Analysis for picture/movie generation” David Leoni Eduardo Cárdenas 23/10/2011
Team: David Leoni, Eduardo Cárdenas Project name: “Text Analysis for picture/movie generation” Professors in charge: Prof. Dr. Ing. Stefan Trausan-Matu, As. Drd. Ing. Costin Chiru
Motivation for choosing the project: • The purpose of our project is to transform text into images so that both express the same meaning. • More than 50% of the human brain is devoted to vision • This is a fact that mere text can't exploit, no matter how inspired and well written it is. • Adding illustrations to text can greatly help readers memorize its contents • But searching for images that represent the text is a time-consuming task • Drawing entirely new images from scratch takes even longer.
How can the problem be solved? In order to solve this problem we are going to use different techniques such as text mining, natural language processing and the semantic web. We need to obtain a large image database. The images need to be tagged with the things that appear in them. We need to know how to select the most representative picture in our database for a specific object (a minimal selection heuristic is sketched below).
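A minimal sketch of the "most representative picture" selection step, assuming a toy in-memory database where each image carries a set of tags. The IMAGE_DB contents and the scoring heuristic (fewer co-occurring tags means the object dominates the picture) are illustrative assumptions, not part of the actual project.

```python
# Toy tagged image database; paths and tags are invented for illustration.
IMAGE_DB = [
    {"path": "img/ball_01.jpg", "tags": {"ball", "red", "floor"}},
    {"path": "img/ball_02.jpg", "tags": {"ball"}},
    {"path": "img/baby_01.jpg", "tags": {"baby", "ball", "sofa"}},
]

def most_representative(obj, db=IMAGE_DB):
    """Pick the image that best 'stands for' the given object."""
    candidates = [img for img in db if obj in img["tags"]]
    if not candidates:
        return None
    # Heuristic: prefer pictures where the object is one of the few tagged
    # things, i.e. the object fills most of the scene.
    return min(candidates, key=lambda img: len(img["tags"]))

print(most_representative("ball")["path"])   # img/ball_02.jpg
```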
How can the problem be solved? We need to apply different text mining techniques in order to obtain the most frequent words, perform stop-word removal, etc. We need to obtain the PoS tags of the phrase that we want to convert into an image. We need to associate the text with the images (a small NLTK sketch follows).
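A small sketch of those text-mining steps using NLTK: word frequencies after stop-word removal, plus PoS tagging. The example sentences are taken from the input example at the end of these slides, and the snippet assumes the usual NLTK data packages (tokenizer, stop-word list, tagger) are installed.

```python
from collections import Counter

import nltk
from nltk.corpus import stopwords

text = ("The ball is on the floor. It is a red ball. "
        "It is a rubber ball. The baby looks at the ball.")

# Tokenize and keep only alphabetic tokens.
tokens = [t for t in nltk.word_tokenize(text.lower()) if t.isalpha()]

# Stop-word removal, then the most frequent content words.
content = [t for t in tokens if t not in stopwords.words("english")]
print(Counter(content).most_common(3))        # e.g. [('ball', 4), ...]

# Part-of-speech tags for the phrase we want to turn into an image.
print(nltk.pos_tag(nltk.word_tokenize("The baby looks at the ball.")))
```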
Ontology • We need an ontology to hold implicit knowledge which might not be present in the text provided by the user • Example: • "A lion is hunting a prey" • Which animals are usually hunted by a lion? • This information is needed especially if we want to generate a composition of several pictures • we can use data from the Earth of Life website (a toy lookup is sketched below)
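A purely illustrative sketch of how such implicit knowledge could be consulted. The tiny PREY_OF table below is hand-written for this example and is not taken from any real ontology or dataset.

```python
# Hand-written toy ontology fragment: typical prey of some predators.
PREY_OF = {
    "lion": ["zebra", "antelope", "buffalo"],
    "wolf": ["deer", "rabbit"],
}

def resolve_prey(predator):
    """When the text only says 'a prey', return concrete animals we could draw."""
    return PREY_OF.get(predator, [])

# "A lion is hunting a prey" -> candidates for the second picture in the composition
print(resolve_prey("lion"))   # ['zebra', 'antelope', 'buffalo']
```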
Databases The following databases of images could be useful for our project: • ImageNet • has images corresponding to WordNet synsets (see the lookup sketch below) • some pictures are marked with bounding boxes for objects • 25 attributes are available for ~400 popular synsets • annotations are verified by paid workers through Amazon Mechanical Turk • LabelMe • images are annotated with the shapes of the objects contained in the scene • labeling was done by unpaid users • WordNet matching was done by the authors.
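Since ImageNet is organised around WordNet noun synsets, a word from the input text could be mapped to candidate synsets with NLTK's WordNet interface, as in the hedged sketch below. It assumes a recent NLTK with the WordNet corpus installed; the "n" + zero-padded offset convention is how ImageNet names its categories.

```python
from nltk.corpus import wordnet as wn

# Map a word to its WordNet noun synsets; ImageNet categories are identified
# by ids of the form 'n' followed by the zero-padded synset offset.
for syn in wn.synsets("lion", pos=wn.NOUN):
    wnid = "n{:08d}".format(syn.offset())
    print(wnid, syn.name(), "-", syn.definition())
```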
Databases • LabelMe [figure: example of a LabelMe annotated scene]
Databases • Google Image Search • a vast and semantically rich corpus, thanks to a game Google used to invite people to label images for free • but the API is deprecated! In some time it will be pay-only or totally unusable • Flickr • labeling is done by the picture authors • 3D databases, the Princeton database in particular • has ~2000 objects stored in OFF, a very simple mesh format that could easily be adapted to our needs (a minimal reader is sketched below).
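A minimal sketch of reading such an OFF file (plain, uncommented variant only), to show how simple the format is. The parsing follows the public OFF layout: a header token, three counts, then vertex coordinates and face index lists; colored faces and comments are not handled.

```python
def read_off(path):
    """Read a plain OFF mesh file into (vertices, faces) lists."""
    with open(path) as f:
        tokens = f.read().split()
    assert tokens[0] == "OFF", "not an OFF file"
    n_verts, n_faces = int(tokens[1]), int(tokens[2])   # third count (edges) unused
    pos = 4
    vertices = []
    for _ in range(n_verts):
        vertices.append(tuple(float(t) for t in tokens[pos:pos + 3]))
        pos += 3
    faces = []
    for _ in range(n_faces):
        k = int(tokens[pos])                            # number of vertices in this face
        faces.append(tuple(int(t) for t in tokens[pos + 1:pos + 1 + k]))
        pos += 1 + k
    return vertices, faces

# Usage: vertices, faces = read_off("model.off")
```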
Input example: The ball is on the floor. It is a red ball. It is a rubber ball. The baby looks at the ball. Output 1 example: the same sentences with each noun replaced by its picture ("The [ball picture] is on the [floor picture]. It is a [red ball picture]. ..."). Output 2 example: [composed image of the scene]
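A rough end-to-end sketch of how Output 1 could be produced: PoS-tag the input and replace each known noun with a reference to its picture. The IMAGES mapping and the bracketed placeholder rendering are assumptions for illustration; a real run would use the tagged database above and insert the actual pictures.

```python
import nltk

# Invented word -> picture mapping; in practice this would come from the
# tagged image database and the "most representative picture" selection.
IMAGES = {"ball": "img/ball.jpg", "floor": "img/floor.jpg", "baby": "img/baby.jpg"}

def illustrate(sentence):
    """Replace nouns that have a known picture with an image placeholder."""
    out = []
    for word, tag in nltk.pos_tag(nltk.word_tokenize(sentence)):
        if tag.startswith("NN") and word.lower() in IMAGES:
            out.append("[" + IMAGES[word.lower()] + "]")
        else:
            out.append(word)
    return " ".join(out)

print(illustrate("The ball is on the floor."))
# -> The [img/ball.jpg] is on the [img/floor.jpg] .
```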