
DeviantART Analysis using Image Features

DeviantART Analysis using Image Features. Bart Buter, Davide Modolo, Sander van Noort, Nick Dijkshoorn, Quang Nguyen, Bart van de Poel. Profile Project. Our project focused on exploratory research into the artists and images of deviantART, a very large online art community.


Presentation Transcript


  1. DeviantART Analysis using Image Features Bart Buter, Davide Modolo, Sander van Noort, Nick Dijkshoorn, Quang Nguyen, Bart van de Poel

  2. Profile Project • Our project focused on exploratory research into the artists and images of deviantART, a very large online art community • The research touched on several fields: • Visualization (implementation of a toolkit) • Data collection • Feature extraction (statistical and cognitively inspired) • Classification • Network analysis

  3. Overview • Introduction • Toolkit • Experiments & Results • Future work • Conclusion

  4. Introduction - deviantART • deviantART (dA) is the largest online community showcasing various forms of user-made artwork • 13 million registered members (called Deviants) • Allows emerging and established artists to exhibit, promote, and share their works • All artwork is well organized (comprehensive category structure) • Covers traditional media (painting and sculpture) as well as digital art, pixel art, films and anime

  5. Research questions • Can we visualize important aspects of deviantART? • Can artists and/or styles be distinguished? • Are artists influencing each other? • Do art styles change over time? • Are there non-artists who are interesting for deviantART?

  6. Toolkit • General tool to answer research questions about social art communities (deviantART) • 4 Components Online

  7. Data collection from deviantART • Network of “professional” artists • Download artists’ names and their watchers • Output for Pajek and the Matlab graph toolbox • Artists’ images and information about these images • Download users’ galleries as the dataset • No web API; follow backend links instead • Parse RSS XML files and download images Data collection

  8. Data collection • For each image, store an XML file • Example:
      <?xml version="1.0"?>
      <root xml_tb_version="3.1">
        <guid>http://catluvr2.deviantart.com/art/42-Journals-73664427</guid>
        <title>-42 Journals</title>
        <category>customization/screenshots/other</category>
        <filename>_42_Journals_by_catluvr2.jpg</filename>
      </root>
      Data collection
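As a rough illustration of how such a per-image file could be read back, the sketch below parses the example from the slide with Python's standard `xml.etree.ElementTree`. The function name `parse_image_metadata` and the derived `top_category` field are assumptions made for this example; the slides do not show the toolkit's own parsing code.

```python
import xml.etree.ElementTree as ET

# Example metadata copied from the slide.
EXAMPLE_XML = """<?xml version="1.0"?>
<root xml_tb_version="3.1">
  <guid>http://catluvr2.deviantart.com/art/42-Journals-73664427</guid>
  <title>-42 Journals</title>
  <category>customization/screenshots/other</category>
  <filename>_42_Journals_by_catluvr2.jpg</filename>
</root>"""


def parse_image_metadata(xml_text):
    """Return the child elements of <root> as a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    meta = {child.tag: (child.text or "").strip() for child in root}
    # Hypothetical convenience field: first path component of the category.
    meta["top_category"] = meta.get("category", "").split("/")[0]
    return meta


metadata = parse_image_metadata(EXAMPLE_XML)
print(metadata["top_category"], "|", metadata["filename"])
```

Splitting the category path on "/" is one plausible way to arrive at top-level counts such as "photography" or "customization".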

  9. Dataset information • Downloaded 31 users • About 5000 images • Daily Deviations of a random day • Top categories: • photography: 2244 • customization: 906 • traditional: 842 • digitalart: 587 • fanart: 239 Data collection

  10. Feature extraction • Why we need features • Can’t visualize sets of images in high-dimensional space • Features can be intuitive for toolkit users • Easier to work with than raw data (classification) • Kinds of features: • Statistical features • Cognitively-inspired features Feature extraction

  11. Feature format • Store features in XML files • One XML file per image describing all features • Easy to add new features of existing images • Easy to add images • Only calculate features that are not already present in the XML file • Add those features to the XML file of the image Feature extraction
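The "only calculate what is missing" idea can be sketched as follows. This is a minimal illustration, not the toolkit's code: the function `update_features`, the flat one-element-per-feature layout, and the element name `maxIntensity` are assumptions, and the "image" is a toy list of grey values.

```python
import xml.etree.ElementTree as ET


def update_features(root, extractors, image):
    """Add missing features to an image's XML tree; skip ones already stored.

    `extractors` maps a feature name to a function of the image.
    Returns the list of feature names that were newly computed.
    """
    added = []
    for name, extract in extractors.items():
        if root.find(name) is None:  # feature not yet present in the file
            element = ET.SubElement(root, name)
            element.text = repr(extract(image))
            added.append(name)
    return added


# Toy "image": a flat list of grey values (an assumption for illustration).
image = [0, 64, 128, 255]
extractors = {
    "avgIntensity": lambda img: sum(img) / len(img),
    "maxIntensity": lambda img: max(img),
}

# The stored file already holds avgIntensity, so only maxIntensity is computed.
root = ET.fromstring("<root><avgIntensity>111.75</avgIntensity></root>")
newly_added = update_features(root, extractors, image)
print(newly_added)
```

Keeping one element per feature means a new extractor can be added later without recomputing anything that is already stored.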

  12. Statistical features • Low-level & understandable features • RGB values (average, median) • Hue, Saturation & Intensity values (average, median) • Edge-pixel ratio • Corner-pixel ratio • Entropy of the intensity • Variance of the intensity • Compositional features Feature extraction – Statistic part

  13. Edge-pixel ratio Ratio: 0.0094 Ratio: 0.0998 Feature extraction - Statistic part

  14. Average of the intensity AvgIntensity: 21.90 AvgIntensity: 123.96 AvgIntensity: 243.67 Feature extraction - Statistic part

  15. Entropy of the intensity Intensity entropy: 1.5408 Intensity entropy: 7.8799 Feature extraction - Statistic part

  16. Variance of the intensity Intensity variance: 506 Intensity variance: 14676 Feature extraction - Statistic part
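The three intensity statistics shown on the preceding slides can be sketched in a few lines. The code assumes 8-bit grey values given as a flat list and uses the population variance and the Shannon entropy in bits; the slides do not state the exact definitions used, so treat these as plausible choices rather than the authors' formulas.

```python
import math
from collections import Counter


def intensity_stats(pixels):
    """Mean, variance and Shannon entropy (bits) of 8-bit grey values."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    counts = Counter(pixels)  # histogram of grey values
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return mean, variance, entropy


flat = [128] * 16            # uniform grey image: no variation at all
mixed = [0] * 8 + [255] * 8  # half black, half white
print(intensity_stats(flat))
print(intensity_stats(mixed))
```

Under these definitions an 8-bit image has an entropy of at most 8 bits and a variance of at most 16256.25, which is consistent with the ranges of the example values on the slides.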

  17. Compositional edge-pixel ratio Feature extraction - Statistic part
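A compositional edge-pixel ratio can be read as the edge-pixel ratio computed per region of the image. The sketch below is one possible reading: the edge detector (a simple grey-level step threshold) and the 2x2 grid are assumptions, since the slides name neither the detector nor the cell layout.

```python
def edge_mask(gray, threshold=50):
    """Mark pixels whose horizontal or vertical grey-level step exceeds threshold."""
    h, w = len(gray), len(gray[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(gray[y][x] - gray[y][x - 1]) if x > 0 else 0
            dy = abs(gray[y][x] - gray[y - 1][x]) if y > 0 else 0
            mask[y][x] = max(dx, dy) > threshold
    return mask


def edge_pixel_ratio(mask):
    """Fraction of pixels flagged as edges."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


def grid_edge_ratios(mask, rows=2, cols=2):
    """Edge-pixel ratio per grid cell, in row-major order."""
    h, w = len(mask), len(mask[0])
    ratios = []
    for r in range(rows):
        for c in range(cols):
            cell = [row[c * w // cols:(c + 1) * w // cols]
                    for row in mask[r * h // rows:(r + 1) * h // rows]]
            ratios.append(edge_pixel_ratio(cell))
    return ratios


# 4x4 test image: dark left half, bright right half (one vertical edge).
gray = [[0, 0, 255, 255] for _ in range(4)]
mask = edge_mask(gray)
print(edge_pixel_ratio(mask), grid_edge_ratios(mask))
```

The global ratio alone cannot say where the edges are; the per-cell values show that all edge pixels lie in the right-hand cells.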

  18. Hue and Saturation Feature extraction - Statistic part

  19. Weibull-Distribution Image Contrast • Why Feature extraction – Statistical part

  20. Cognitively-inspired features Model of Saliency-Based Visual Attention • Attention appears to influence visual information even in the earliest areas of the primate visual cortex • This influence seems to shape an integrated saliency map • This map is a representation of the environment that weighs every input by its local feature contrast and its current behavioral relevance • It enables the visual system to integrate a large amount of information Feature extraction - Cognitive part

  21. Itti, Koch and Niebur’s Model Feature extraction - Cognitive part

  22. Example of saliency map • [Figure: original image; conspicuity maps for color, orientation and intensity, plus an extra skin map; resulting saliency map] Feature extraction - Cognitive part

  23. What do we have • [Figure: intensity map, orientation map, skin map, color map and saliency map of one photo] • Important visual features about the style of this photo: • The subject is not exactly in the middle • The subject is a human • The subject is standing still • Colors are quite uniform, and there are not many of them • But how can all the different maps be used to represent this information? Feature extraction - Cognitive part

  24. Cognitively-inspired features (1) • Shannon entropy of the 5 different maps (the saliency and the conspicuity ones) • Standard deviation of the saliency distribution in the saliency map • Location of the three most salient points • Skin intensity Feature extraction - Cognitive part

  25. Cognitively-inspired features (2) • Location has been computed using the Inhibition Of Return (IOR) procedure • [Figure: original saliency map; map after the first inhibition; map after the second inhibition; the 3 most salient locations] Feature extraction - Cognitive part
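The inhibition-of-return idea can be illustrated on a small grid: pick the most salient location, suppress its neighbourhood, and repeat. The sketch below is a simplification under stated assumptions. It zeroes a square neighbourhood of hypothetical radius 1, whereas the slides do not specify the shape or size of the inhibited region.

```python
def most_salient_points(saliency, count=3, radius=1):
    """Pick `count` peaks, zeroing a square neighbourhood after each pick."""
    work = [row[:] for row in saliency]  # copy so the input map is untouched
    h, w = len(work), len(work[0])
    points = []
    for _ in range(count):
        # Winner-take-all: location of the current maximum.
        y, x = max(((r, c) for r in range(h) for c in range(w)),
                   key=lambda rc: work[rc[0]][rc[1]])
        points.append((y, x))
        # Inhibition of return: suppress the winner and its neighbourhood.
        for r in range(max(0, y - radius), min(h, y + radius + 1)):
            for c in range(max(0, x - radius), min(w, x + radius + 1)):
                work[r][c] = 0.0
    return points


saliency = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.8, 0.0, 0.0],
    [0.1, 0.2, 0.1, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.0, 0.0],
]
points = most_salient_points(saliency)
print(points)
```

Without the inhibition step the second pick would be the 0.8 cell right next to the first peak; suppressing the neighbourhood forces the later picks onto distinct regions of the image.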

  26. Cognitively-inspired features (3) • Skin is an extra channel (not standard in Itti’s model) but it turned out to be really interesting • It can easily be used to detect nude images (which are quite popular among deviantART’s professional photographers) • [Figure: two original images with their skin maps] Feature extraction - Cognitive part

  27. OpenCV face detector Feature extraction - Cognitive part

  28. Classification • Given a set of features, the classification is used to: • Determine if two artists/categories are distinguishable • Determine which features are useful to do it • Different classifiers are available in the Toolkit: • k-Nearest Neighbour (kNN) • Naive Bayes (NB) • Nearest Mean (NM) • Support Vector Machine (libSVM) Classification
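Of the classifiers listed, Nearest Mean is the simplest to write down: each class is represented by the mean of its training vectors, and a new image goes to the class with the closest mean. The sketch below is a generic illustration with made-up two-feature data and hypothetical labels `artistA` and `artistB`; it is not the toolkit's implementation.

```python
def fit_nearest_mean(X, y):
    """Return a dict mapping each class label to its mean feature vector."""
    means = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict_nearest_mean(means, x):
    """Assign x to the class whose mean is closest (squared Euclidean distance)."""
    def sq_dist(mean):
        return sum((a - b) ** 2 for a, b in zip(x, mean))
    return min(means, key=lambda label: sq_dist(means[label]))


# Made-up feature vectors for two hypothetical artists.
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = ["artistA", "artistA", "artistB", "artistB"]
means = fit_nearest_mean(X, y)
print(predict_nearest_mean(means, [0.2, 0.3]))
print(predict_nearest_mean(means, [0.7, 0.6]))
```

Because the distance is Euclidean, features with large numeric ranges dominate unless the data is normalized first.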

  29. Classification • Pre-processing functions: • Reading in XML files and creating a dataset • Normalization • Dataset filtering on classes and features • Parameter optimization using cross-validation • Classification current capabilities: • 1 class against another class • 1 class against all other classes Classification

  30. Classification • Feature selection is needed when dealing with a lot of features • Reduces the dimensionality of the data representation • Gives the feature combination that best separates a class • Sequential forward feature selection • First select the most informative feature, then iteratively add the next most informative feature to it • Criterion is based on the inter-intra distance Classification
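Sequential forward selection can be sketched as a greedy loop around a separability score. The criterion below is one simple inter-intra variant for two classes (squared distance between class means divided by the summed within-class scatter); the exact criterion used in the project is not given on the slides, so this particular formula is an assumption.

```python
def inter_intra(X, y, features):
    """Between-class distance over within-class scatter for two classes (0/1)."""
    def class_stats(label):
        rows = [[x[f] for f in features] for x, t in zip(X, y) if t == label]
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        scatter = sum((v - m) ** 2 for row in rows for v, m in zip(row, mean))
        return mean, scatter / len(rows)

    mean0, intra0 = class_stats(0)
    mean1, intra1 = class_stats(1)
    inter = sum((a - b) ** 2 for a, b in zip(mean0, mean1))
    return inter / (intra0 + intra1 + 1e-12)  # small constant avoids division by zero


def forward_select(X, y, k):
    """Greedily add the feature that most improves the criterion."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: inter_intra(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: feature 1 separates the classes, features 0 and 2 do not.
X = [[0.5, 0.0, 3.0], [0.4, 0.1, 1.0], [0.6, 0.2, 2.0],
     [0.5, 1.0, 2.0], [0.6, 0.9, 3.0], [0.4, 1.1, 1.0]]
y = [0, 0, 0, 1, 1, 1]
print(forward_select(X, y, 2))
```

The greedy search evaluates each candidate together with the features already chosen, so it needs far fewer evaluations than testing every subset, at the cost of possibly missing the best combination.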

  31. Classification • Evaluation measures: • Precision • The percentage of positively classified images that were indeed positive • Recall • The percentage of all positive images that were classified as positive • F1-Measure • The harmonic mean of precision and recall Classification
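For reference, the three measures can be computed from the counts of true positives, false positives and false negatives; F1 is conventionally the harmonic mean of precision and recall. The labels below are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels: 4 positive and 4 negative images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

In a one-against-another setting each of the two classes can take the positive role in turn, which gives one F-measure per class.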

  32. Visualization • Purpose of the visualization: • Visualize the dataset • Find patterns • Analyse classification results • Filtering (relevant information) • Input: Dataset (thumbs+full) images & XML feature files • Converted to a single tab-separated file • Express the classification performance • Capture the performance in one graph • Input: performance output of the classifier Visualization
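The conversion to a single tab-separated file could look roughly like the sketch below, which flattens per-image feature dictionaries into one table with Python's `csv` module. The column layout, the function name `features_to_tsv` and the file names are assumptions; missing features are written as empty cells.

```python
import csv
import io


def features_to_tsv(records):
    """Write per-image feature dicts as one tab-separated table (returned as text)."""
    columns = sorted({key for rec in records for key in rec})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical records; the second image lacks the intVariance feature.
records = [
    {"filename": "a.jpg", "avgHue": 0.31, "intVariance": 506},
    {"filename": "b.jpg", "avgHue": 0.72},
]
tsv = features_to_tsv(records)
print(tsv)
```

A flat table like this is easy to load in a visualization front end, since every row is one image and every column one feature.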

  33. Visualization • Use existing visualization application? • Mondrian, general purpose statistical data-visualization system Visualization http://rosuda.org/mondrian/

  34. Visualization • Use existing visualization application? • XmdvTool, interactive visual exploration of multivariate data sets • Flat version of the data set Visualization http://davis.wpi.edu/~xmdv/

  35. Visualization • Use existing visualization application? • Tools built for generic uses produce only generic displays • Data can take many interesting forms • These require unique types of display and interaction • Not captured by general applications • UI not intuitive (lacks an easy way to filter data) • (These tools also look outdated) Visualization

  36. Visualization • What language/framework for our visualization? • There are many… • Prefuse visualization toolkit (generic displays) • Adobe Flash/Flex (expensive, slow for large datasets) Visualization

  37. Visualization • (Partially) Implemented in “Processing” • Open source programming language to create images, animations, and interactions • Built on top of Java (collection of Java classes) • Consists of: • Processing Development Environment (PDE) (very minimalistic) • A collection of commands (API) • Several libraries that support more advanced features (OpenGL, XML) • Easy to integrate into Java (Eclipse) Visualization

  38. Visualization: Processing • Provides functions that make life easier • image(img, x, y, [width, height]) • line(x1, y1, x2, y2) • stroke(color) • No ready-made functions for drawing complete graphs/plots • Right combination of cost, ease of use and speed • Export the application as a Java Applet • Run it on a website • Use URLs instead of images to avoid legal issues Visualization

  39. Experiments & Results

  40. Experiment #1 – Classification • Goal: • Use the toolkit to find what kind of features best separate two artists • Details of the experiment • Experiment was performed for all artists in the dataset • Feature selection algorithm was used to output the 1-5 most informative features • Evaluation was done using the F-measure

  41. Selecting the classifier • Select classifier for the experiment • Train all the classifiers on a subset of the training data, using cross-validation to optimize parameters • Criterion of selection: F-measure • SVM gives the highest F-measure • [Chart: average F-measure of 1-vs-1 classification over all artists]

  42. Result Matrix using the top 1 feature

  43. Result Matrix using top 2 features

  44. Result Matrix using top 3 features

  45. Result Matrix using the top 4 features

  46. Result Matrix using the top 5 features

  47. Result Matrix using all features

  48. Visualization Case (1) • Artist pair: Kitsunebaka91 and LALAax • F-measure pair: 0.952941 and 0.884615 • medIntCells_2 • gridEdgeRatio_4 • Artist pair: fediaFedia and gsphoto • F-measure pair: 0.867347 and 0.938095 • avgHue • intVariance

  49. Visualization Case (2) • Artist pair: K1lgore and sekcyjny • F-measure pair: 0.692308 and 0.640000 • avgBCells_3 • salMapCEntropy • Artist pair: stereoflow and zihnisinir • F-measure pair: 0.649007 and 0.683871 • avgHueCells_4 • avgR

  50. Results
