Learning object affordances based on structural object representation

Learning object affordances based on structural object representation. Kadir F. Uyanik, Asil Kaan Bozcuoglu. EE 583 Pattern Recognition, Jan 4, 2011

Content • Goal • Inspirations • Potential Difficulties • Problem Definition • Proposed Method • References • Appendix

Goal

Inspirations • Ecological psychologist James Jerome Gibson (1904-1979) • Cognitive psychologist Irving Biederman (1939-)

Inspirations: Affordances [1] “… an affordance is neither an objective property nor a subjective property; or both if you like. An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behavior. It is both physical and psychical, yet neither. An affordance points both ways, to the environment and to the observer.” • Example affordances: throw-able, push-able • Revised definition: an affordance is an acquired relation between a certain <effect> and an <(entity, behavior)> tuple of an agent, written (<effect>, <(entity, behavior)>), such that applying the <behavior> to the <entity> generates the <effect> [2]. • [Diagram labels: environment, agent, <entity>, <behavior>, <effect>]
[1] J. J. Gibson (1977), The Theory of Affordances. In Perceiving, Acting, and Knowing, Eds. Robert Shaw and John Bransford, ISBN 0-470-99014-7.
[2] E. Sahin, M. Cakmak, M. R. Dogar, E. Ugur, G. Ucoluk, To Afford or Not to Afford: A New Formalization of Affordances Toward Affordance-Based Robot Control, Adaptive Behavior, 2007, pp. 447-472.

Inspirations: Human Image Understanding [3] “There is a small number of geometric components that constitute the primitive elements of the object recognition system (like letters to form words).”
[3] I. Biederman, Recognition-by-components: A theory of Human Image Understanding, Psychological Review, Vol. 94 (1987), pp. 115-148.

Potential Difficulties [4] • A structural description is not enough; metric information is also needed • Geons are difficult to extract from real images • The structural description is ambiguous: most often there are several candidates • For some objects, deriving a structural representation can be difficult
[4] M. A. Arbib, CS564 – Brain Theory and Artificial Intelligence, USC, Fall 2001, Lecture 7: Object Recognition.

Problem Definition: how to • decompose objects into parts/components? • find relations between components? • find a generic graph representation of an <action-entity-effect> triple?

Object Decomposition: Proposed Algorithm

Object Decomposition: What is missing? • Try different clustering algorithms • Triangulate the 3D surfaces (Delaunay) • Compute the Gaussian curvature at each vertex • Detect region boundaries by curvature thresholding • Perform iterative region growing (flood fill)
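The last two steps of the list above (curvature thresholding, then region growing by flood fill) could look roughly like the following. This is only a minimal sketch under stated assumptions: the mesh is assumed to be available as a vertex adjacency dictionary, the per-vertex Gaussian curvature is assumed to be already computed, and the function name `segment_by_curvature`, the threshold value and the toy data are hypothetical, not taken from the presented system.

```python
from collections import deque


def segment_by_curvature(adjacency, curvature, threshold):
    """Label mesh vertices by flood fill, stopping at high-curvature vertices.

    adjacency: dict vertex -> list of neighbouring vertices (assumed input)
    curvature: dict vertex -> Gaussian curvature (assumed precomputed)
    threshold: |curvature| above this marks a region boundary (hypothetical)
    Returns dict vertex -> region id; boundary vertices are labelled -1.
    """
    # Curvature thresholding: boundary vertices are never entered by the fill.
    labels = {v: -1 for v in adjacency if abs(curvature[v]) > threshold}
    next_id = 0
    for seed in adjacency:
        if seed in labels:
            continue  # already a boundary vertex or part of a grown region
        labels[seed] = next_id
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in labels:
                    labels[w] = next_id
                    queue.append(w)
        next_id += 1
    return labels


# Toy example: a chain of five vertices with a sharp crease at vertex 2.
toy_adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
toy_curvature = {0: 0.0, 1: 0.1, 2: 5.0, 3: 0.1, 4: 0.0}
toy_labels = segment_by_curvature(toy_adjacency, toy_curvature, threshold=1.0)
```

In the toy chain the crease vertex separates the two ends, so each end becomes its own candidate part.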

Graphical Representation • We represent each object as an undirected graph, as follows: • Each node stores the geometric shape of the part • Each edge stores its direction along each of the three axes, e.g. from node1 to node2 the x coordinate increases.
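One possible way to hold such a graph in memory is sketched here. The class name `PartGraph`, the shape labels and the encoding of an edge direction as a sign per axis (+1 increases, -1 decreases, 0 unchanged) are all assumptions made for illustration; the slides do not specify a concrete data structure.

```python
from dataclasses import dataclass, field


@dataclass
class PartGraph:
    """Undirected part graph with shape labels and per-axis edge directions."""
    shapes: dict = field(default_factory=dict)      # node -> shape label
    directions: dict = field(default_factory=dict)  # (u, v) -> (sx, sy, sz)

    def add_part(self, node, shape):
        self.shapes[node] = shape

    def connect(self, u, v, direction):
        # The edge is undirected, but its attribute depends on the reading
        # order, so both orientations are stored with opposite signs.
        self.directions[(u, v)] = tuple(direction)
        self.directions[(v, u)] = tuple(-s for s in direction)


# Hypothetical cup: going from the body to the handle, x increases.
cup = PartGraph()
cup.add_part("body", "cylinder")
cup.add_part("handle", "curved cylinder")
cup.connect("body", "handle", (1, 0, 0))
```

Storing both orientations keeps lookups symmetric, at the cost of duplicating each edge entry.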

Graphical Representation: Similarity Checking
• [isIsomorphic, label_list] = check_Isomorphism(G1, G2)
• If isIsomorphic
  • Check the geometric shapes of same-labeled nodes in the two graphs
  • Check the directions of equivalent edges in both graphs
  • If both match, return true
  • Else return false
• Else return false
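The check above can be sketched in a few lines for small graphs. This is a brute-force illustration, not the authors' `check_Isomorphism`: it is assumed here that every candidate node mapping is tried in turn, that shapes are plain string labels, and that edge directions are sign triples; the function names, node names and shape labels are hypothetical.

```python
from itertools import permutations


def both_orientations(edges):
    """Expand {(u, v): direction} so that (v, u) holds the negated direction."""
    out = {}
    for (u, v), d in edges.items():
        out[(u, v)] = tuple(d)
        out[(v, u)] = tuple(-s for s in d)
    return out


def find_isomorphisms(nodes1, e1, nodes2, e2):
    """Yield every node mapping that preserves adjacency (brute force)."""
    if len(nodes1) != len(nodes2) or len(e1) != len(e2):
        return
    for perm in permutations(nodes2):
        mapping = dict(zip(nodes1, perm))
        if all((mapping[u], mapping[v]) in e2 for (u, v) in e1):
            yield mapping


def is_similar(shapes1, edges1, shapes2, edges2):
    """True if some isomorphism also matches node shapes and edge directions."""
    e1, e2 = both_orientations(edges1), both_orientations(edges2)
    for m in find_isomorphisms(list(shapes1), e1, list(shapes2), e2):
        shapes_ok = all(shapes1[n] == shapes2[m[n]] for n in shapes1)
        dirs_ok = all(e1[(u, v)] == e2[(m[u], m[v])] for (u, v) in e1)
        if shapes_ok and dirs_ok:
            return True
    return False


# Two three-part graphs with the same structure and attributes.
shapes_a = {"n1": "cylinder", "n2": "box", "n3": "sphere"}
edges_a = {("n3", "n1"): (1, 0, 0), ("n3", "n2"): (-1, 0, 0)}
shapes_b = {"n4": "box", "n5": "sphere", "n6": "cylinder"}
edges_b = {("n5", "n6"): (1, 0, 0), ("n5", "n4"): (-1, 0, 0)}
same = is_similar(shapes_a, edges_a, shapes_b, edges_b)

# Same structure, but no mapping makes the shapes agree.
shapes_c = {"n4": "box", "n5": "sphere", "n6": "box"}
different = is_similar(shapes_a, edges_a, shapes_c, edges_b)
```

Trying all permutations is factorial in the number of parts, which is tolerable only because the part graphs here have a handful of nodes.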

Graphical Representation: Similarity Checking • Isomorphism check yields two candidates: • n1 = n6, n2 = n4, n3 = n5 (attributes match) • n1 = n4, n2 = n6, n3 = n5 (attributes do not match)

Current System • Successful in 80% of cases • Assumes no occlusion • For the cup case, the handle must always be visible • Needs metric information to distinguish bigger objects from smaller ones

One way to go… • Learn a generic graph for each affordance type • Check the maximal cliques of the match graph when comparing the graph of an object with a generic graph • Use the Mahalanobis distance as the metric for generic graphs, with maximum likelihood estimation (MLE)
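The maximal-clique idea can be illustrated as follows. In this sketch the match graph is assumed to be the usual association graph: its nodes are (object part, generic part) pairs with equal shape labels, and two pairs are linked when the relation between the object parts equals the relation between the generic parts. Maximal cliques are listed with the basic Bron-Kerbosch recursion. All names and the toy data are hypothetical, and edge dictionaries are assumed to contain both orientations of every edge.

```python
def match_graph(shapes1, edges1, shapes2, edges2):
    """Build the association graph as an adjacency dict over node pairs."""
    pairs = [(a, b) for a in shapes1 for b in shapes2
             if shapes1[a] == shapes2[b]]
    adj = {p: set() for p in pairs}
    for i, (a, b) in enumerate(pairs):
        for (c, d) in pairs[i + 1:]:
            # Compatible if distinct parts are related the same way in both
            # graphs (same direction triple, or no edge in either graph).
            if a != c and b != d and edges1.get((a, c)) == edges2.get((b, d)):
                adj[(a, b)].add((c, d))
                adj[(c, d)].add((a, b))
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Object with three parts compared against a two-part generic graph.
obj_shapes = {"b": "cylinder", "h": "torus", "e": "box"}
obj_edges = {("b", "h"): (1, 0, 0), ("h", "b"): (-1, 0, 0),
             ("b", "e"): (0, 0, 1), ("e", "b"): (0, 0, -1)}
gen_shapes = {"B": "cylinder", "H": "torus"}
gen_edges = {("B", "H"): (1, 0, 0), ("H", "B"): (-1, 0, 0)}

adj = match_graph(obj_shapes, obj_edges, gen_shapes, gen_edges)
best = max(maximal_cliques(adj), key=len)
```

The largest clique gives the biggest set of mutually consistent part correspondences, so the extra part of the object does not prevent a match with the generic graph.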

Tools

References
[1] J. J. Gibson (1977), The Theory of Affordances. In Perceiving, Acting, and Knowing, Eds. Robert Shaw and John Bransford, ISBN 0-470-99014-7.
[2] E. Sahin, M. Cakmak, M. R. Dogar, E. Ugur, G. Ucoluk, To Afford or Not to Afford: A New Formalization of Affordances Toward Affordance-Based Robot Control, Adaptive Behavior, 2007, pp. 447-472.
[3] I. Biederman, Recognition-by-components: A theory of Human Image Understanding, Psychological Review, Vol. 94 (1987), pp. 115-148.
[4] M. A. Arbib, CS564 – Brain Theory and Artificial Intelligence, USC, Fall 2001, Lecture 7: Object Recognition.

Thanks for listening

Appendix

Human Image Understanding • Hypothesis: a small number of geometric components constitute the primitive elements of the object recognition system (like letters to form words) • Geons are recognized directly from edges, based on their non-accidental properties (i.e., 3D features that are usually preserved by the projective imaging process): • edges are straight or curved • pairs of edges are parallel or non-parallel • vertices will always appear to be vertices • Non-accidental properties allow geons to be recognized from any perspective. • The information in the geons is redundant, so they can be recognized even when partially occluded.

Appendix: The Importance of Spatial Arrangement

Appendix: The Principle of Non-accidentalness • Examples: • Collinearity • Smoothness • Symmetry • Parallelism • Cotermination

Appendix: Some Non-accidental Differences
