
Transforming Text to Images: A Comprehensive Approach to Visual Content Generation

Our project aims to convert text into images that accurately represent its content, using techniques from text mining, natural language processing, and the semantic web. By building an extensive database of tagged images, we can identify and generate appropriate visuals for a given textual input. The project highlights the importance of visual aids in enhancing comprehension and memorization, and addresses the difficulty of finding or creating illustrative content. We use existing databases such as ImageNet and LabelMe to improve our model's efficiency and accuracy in transforming descriptions into visual representations.





Presentation Transcript


  1. NLPainter “Text Analysis for picture/movie generation” David Leoni Eduardo Cárdenas 23/10/2011

  2. Team: David Leoni, Eduardo Cárdenas. Project name: “Text Analysis for picture/movie generation”. Professors in charge: Prof. Dr. Ing. Stefan Trausan-Matu, As. Drd. Ing. Costin Chiru

  3. Motivation for choosing the project: • The purpose of our project is to transform text into images so that both express the same meaning. • More than 50% of the human brain is devoted to vision • a fact mere text can't exploit, no matter how inspired and well written it is. • Adding illustrations to text can greatly help readers memorize its contents • but searching for images that represent the text is a time-consuming task • and drawing entirely new images from scratch takes even longer.

  4. How can the problem be solved? To solve this problem we are going to use several techniques: text mining, natural language processing, and the semantic web. We need to obtain a big image database. The images need to be tagged with the objects they contain. We need to know how to select the picture in our database that best represents a specific object.
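The tag-based selection step described above can be sketched as follows. The schema (a relevance score per tag, stored per image) is an illustrative assumption, not something specified on the slides:

```python
# Sketch of a tag-indexed image database. Each image record carries a set of
# tags with a relevance score in [0, 1]; both the schema and the sample paths
# are hypothetical.
from collections import defaultdict

class ImageDB:
    def __init__(self):
        self._by_tag = defaultdict(list)  # tag -> [(score, image_path)]

    def add(self, image_path, tags):
        """tags: {tag: relevance score in [0, 1]}"""
        for tag, score in tags.items():
            self._by_tag[tag].append((score, image_path))

    def best_for(self, tag):
        # The most representative picture for an object is the one whose
        # annotators gave the tag the highest relevance score.
        candidates = self._by_tag.get(tag)
        if not candidates:
            return None
        return max(candidates)[1]

db = ImageDB()
db.add("img/ball_01.jpg", {"ball": 0.9, "floor": 0.3})
db.add("img/ball_02.jpg", {"ball": 0.6})
print(db.best_for("ball"))  # img/ball_01.jpg
```

With real databases such as ImageNet or LabelMe, the tags would come from the existing annotations rather than being assigned by hand.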

  5. How can the problem be solved? We need to apply text-mining techniques to obtain the most frequent words, perform stop-word removal, etc. We need to obtain the part-of-speech (PoS) tags of the phrase we want to convert to an image. We need to associate the text with the images.
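A minimal version of this text-mining pass might look like the following. The stop-word list and the tiny PoS lexicon are illustrative assumptions; a real system would use a trained tagger:

```python
# Tokenize, remove stop words, and count word frequencies; then look up
# part-of-speech tags in a toy lexicon (illustrative stand-in for a tagger).
import re
from collections import Counter

STOP_WORDS = {"the", "is", "on", "a", "it", "at"}  # assumed list

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def frequent_words(text):
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    return Counter(tokens)

# Hypothetical PoS lexicon covering the slide's example sentences.
POS_LEXICON = {"ball": "NOUN", "floor": "NOUN", "baby": "NOUN",
               "red": "ADJ", "rubber": "ADJ", "looks": "VERB"}

text = ("The ball is on the floor. It is a red ball. "
        "It is a rubber ball. The baby looks at the ball.")
print(frequent_words(text).most_common(1))  # [('ball', 4)]
print(POS_LEXICON.get("ball"))              # NOUN
```

The frequency counts indicate which objects dominate the text (here, "ball"), and the PoS tags separate the nouns to be illustrated from the adjectives and verbs that modify them.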

  6. Ontology • We need an ontology to hold implicit knowledge that might not be present in the text provided by the user • Example: • "A lion is hunting a prey" • Which animals are usually hunted by a lion? • This information is needed especially if we want to generate a composition of several pictures • we can use data from the Earth of Life website
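The ontology lookup can be sketched as a simple subject → relation → objects mapping. The facts and relation names below are illustrative, not taken from any real ontology:

```python
# Minimal ontology sketch holding implicit knowledge that may be absent
# from the user's text. All facts here are illustrative assumptions.
ONTOLOGY = {
    "lion": {"hunts": ["zebra", "gazelle", "buffalo"]},
    "baby": {"plays_with": ["ball", "toy"]},
}

def typical_objects(subject, relation):
    """Answer questions like: which animals are usually hunted by a lion?"""
    return ONTOLOGY.get(subject, {}).get(relation, [])

# "A lion is hunting a prey" -> pick a concrete prey to draw in the picture
preys = typical_objects("lion", "hunts")
print(preys[0] if preys else "unknown")  # zebra
```

In a full system, these triples would be loaded from an external knowledge source rather than hard-coded.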

  7. Databases The following image databases could be useful for our project: • ImageNet • has images corresponding to WordNet synsets • some pictures are marked with bounding boxes for objects • 25 attributes are available for ~400 popular synsets • annotations are verified by paid workers through Amazon Mechanical Turk • LabelMe • images are annotated with the shapes of the objects contained in the scene • labeling was done by unpaid users • WordNet matching was done by the authors.

  8. Databases • LabelMe

  9. Databases • Google Image Search • a vast and semantically rich corpus, thanks to a game Google used to invite people to label images for free • but the API is deprecated! In time it will become pay-only or entirely unusable. • Flickr • labeling is done by the picture authors • 3D databases, the Princeton db in particular • has ~2000 objects stored in OFF, a very simple mesh format, which could easily be adapted to our needs.

  10. Input example: The ball is on the floor. It is a red ball. It is a rubber ball. The baby looks at the ball. Output 1 example (inline images, shown here as [image] placeholders, replace the nouns): The [image] is on the [image]. It is a [image]. It is a [image]. The [image] looks at the [image]. Output 2 example:
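The substitution shown in Output 1 (nouns replaced by references to tagged images) can be sketched as follows. The noun-to-image mapping and the image paths are illustrative assumptions:

```python
# Replace each noun that has a matching tagged image with a reference to
# that image, leaving all other words untouched. IMAGE_FOR is hypothetical.
import re

IMAGE_FOR = {"ball": "img/ball.jpg", "floor": "img/floor.jpg",
             "baby": "img/baby.jpg"}

def illustrate(sentence):
    def swap(match):
        word = match.group(0)
        img = IMAGE_FOR.get(word.lower())
        return f"[{img}]" if img else word
    return re.sub(r"[A-Za-z]+", swap, sentence)

print(illustrate("The ball is on the floor."))
# The [img/ball.jpg] is on the [img/floor.jpg].
```

A renderer would then inline the referenced pictures at those positions, producing the illustrated sentence of Output 1; composing them into a single scene would give something like Output 2.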
