Learning object affordances based on structural object representation
Kadir F. Uyanik, Asil Kaan Bozcuoglu
EE 583 Pattern Recognition, Jan 4, 2011
Content
• Goal
• Inspirations
• Potential Difficulties
• Problem Definition
• Proposed Method
• References
• Appendix
Inspirations
• Ecological psychologist James Jerome Gibson (1904-1979)
• Cognitive psychologist Irving Biederman (b. 1939)
Inspirations: Affordances [1]
Example affordances: throw-able, push-able.
“… an affordance is neither an objective property nor a subjective property; or both if you like. An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behavior. It is both physical and psychical, yet neither. An affordance points both ways, to the environment and to the observer.”
Revised definition: An affordance is an acquired relation between a certain <effect> and an <(entity, behavior)> tuple of an agent, such that applying the <behavior> to the <entity> generates that <effect> [2]. Notation: (<effect>, <(entity, behavior)>).
[Figure labels: environment, agent, <entity>, <behavior>, <effect>]
[1] J. J. Gibson, "The Theory of Affordances," in Perceiving, Acting, and Knowing, R. Shaw and J. Bransford, Eds., 1977. ISBN 0-470-99014-7.
[2] E. Sahin, M. Cakmak, M. R. Dogar, E. Ugur, and G. Ucoluk, "To Afford or Not to Afford: A New Formalization of Affordances Toward Affordance-Based Robot Control," Adaptive Behavior, 2007, pp. 447-472.
Inspirations: Human Image Understanding [3]
“There are a small number of geometric components that constitute the primitive elements of the object recognition system (like letters to form words)”
[3] I. Biederman, "Recognition-by-Components: A Theory of Human Image Understanding," Psychological Review, vol. 94, 1987, pp. 115-148.
Potential Difficulties [4]
• A structural description is not enough; metric information is also needed
• Geons are difficult to extract from real images
• The structural description is ambiguous: most often there are several candidates
• For some objects, deriving a structural representation can be difficult
[4] M. A. Arbib, CS564: Brain Theory and Artificial Intelligence, USC, Fall 2001, Lecture 7: Object Recognition.
Problem Definition
How to:
• decompose objects into parts/components?
• find relations between components?
• find a generic graph representation of an <action, entity, effect> 3-tuple?
Object Decomposition
What is missing?
• Use/try different clustering algorithms
• Triangulate 3D surfaces (Delaunay)
• Compute Gaussian curvature at each vertex
• Detect region boundaries by curvature thresholding
• Perform iterative region growing (flood fill)
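The last two steps above (curvature thresholding, then region growing) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the mesh is assumed to be already triangulated and given as a vertex adjacency map, the per-vertex Gaussian curvature is assumed to be precomputed, and the function name `segment_by_curvature` and the threshold value are hypothetical.

```python
from collections import deque


def segment_by_curvature(adjacency, curvature, threshold):
    """Flood-fill mesh vertices into regions separated by high-curvature vertices.

    adjacency: dict mapping each vertex to its neighbouring vertices (mesh edges)
    curvature: dict mapping each vertex to a precomputed Gaussian curvature value
    threshold: vertices with |curvature| above this are treated as region boundaries
    Returns a dict vertex -> region id; boundary vertices keep the label -1.
    """
    labels = {v: -1 for v in adjacency}
    is_boundary = {v: abs(curvature[v]) > threshold for v in adjacency}
    next_region = 0
    for seed in adjacency:
        # Skip boundary vertices and vertices already claimed by a region.
        if is_boundary[seed] or labels[seed] != -1:
            continue
        labels[seed] = next_region
        queue = deque([seed])
        while queue:  # breadth-first region growing from the seed
            v = queue.popleft()
            for n in adjacency[v]:
                if not is_boundary[n] and labels[n] == -1:
                    labels[n] = next_region
                    queue.append(n)
        next_region += 1
    return labels


# Toy example: a chain of five vertices with a sharp crease at vertex 2.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
curvature = {0: 0.0, 1: 0.1, 2: 5.0, 3: 0.0, 4: -0.1}
labels = segment_by_curvature(adjacency, curvature, threshold=1.0)
```

In the toy chain, the crease vertex splits the surface into two regions; on a real mesh the threshold would have to be tuned to the curvature estimator and the noise level.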
Graphical Representation
• We represent each object as an undirected graph, as follows:
• Each node stores the geometric shape of the part
• Each edge stores the direction from one node to the other along the three axes, e.g. from node1 to node2, x increases.
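One way such a graph could be stored is sketched below. The slides do not say how the edge directions are computed, so deriving them from the sign of the difference between part centroids is an assumption made here for illustration; the names `build_part_graph` and `axis_sign`, the shape labels, and the coordinates are all hypothetical.

```python
def axis_sign(delta, tol=1e-6):
    """Return +1, -1 or 0 depending on whether a coordinate increases, decreases or stays put."""
    if delta > tol:
        return 1
    if delta < -tol:
        return -1
    return 0


def build_part_graph(parts, connections):
    """Build node and edge attributes for an object.

    parts: dict part name -> (shape label, (x, y, z) centroid)
    connections: list of (u, v) pairs of connected parts, each pair listed once
    Returns (shapes, edges): shapes maps node -> shape label, and edges maps
    (u, v) -> per-axis direction signs going from u to v.
    """
    shapes = {name: shape for name, (shape, _) in parts.items()}
    edges = {}
    for u, v in connections:
        cu, cv = parts[u][1], parts[v][1]
        edges[(u, v)] = tuple(axis_sign(b - a) for a, b in zip(cu, cv))
    return shapes, edges


# Toy cup: the handle sits next to the body along the x axis.
parts = {
    "body": ("cylinder", (0.0, 0.0, 0.0)),
    "handle": ("torus", (0.06, 0.0, 0.0)),
}
shapes, edges = build_part_graph(parts, [("body", "handle")])
```

Storing each undirected edge once, with the direction read from its first node to its second, keeps the graph undirected while still recording "from node1 to node2, x increases".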
Graphical Representation: Similarity Checking
• [isIsomorphic, label_list] = check_Isomorphism(G1, G2)
• If isIsomorphic
  • Check the geometric shapes of same-labeled nodes in the two graphs
  • Check the directions of equivalent edges in both graphs
  • If both match, return true
  • Else return false
• Else return false
Graphical Representation: Similarity Checking
Isomorphism check, two candidates:
• n1 = n6, n2 = n4, n3 = n5 (attributes match)
• n1 = n4, n2 = n6, n3 = n5 (attributes do not match)
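The similarity check can be sketched in a few lines. This is a brute-force illustration, not the `check_Isomorphism` routine used in the slides: it tries every node mapping and accepts the first one whose node shapes and edge directions agree, which is only practical for graphs with a handful of parts. The function names, shape labels, and direction values are hypothetical, and each undirected edge is assumed to be stored once.

```python
from itertools import permutations


def edge_direction(edges, u, v):
    """Direction signs from u to v, or None if no edge joins them."""
    if (u, v) in edges:
        return edges[(u, v)]
    if (v, u) in edges:
        # The edge is stored the other way round, so flip the signs.
        return tuple(-c for c in edges[(v, u)])
    return None


def graphs_match(shapes1, edges1, shapes2, edges2):
    """Return a node mapping G1 -> G2 preserving shapes and edge directions, else None."""
    nodes1, nodes2 = list(shapes1), list(shapes2)
    if len(nodes1) != len(nodes2) or len(edges1) != len(edges2):
        return None
    for perm in permutations(nodes2):
        mapping = dict(zip(nodes1, perm))
        # Node attribute check: mapped parts must have the same geometric shape.
        if any(shapes1[n] != shapes2[mapping[n]] for n in nodes1):
            continue
        # Edge attribute check: every edge must map to an edge with the same direction.
        if all(edge_direction(edges2, mapping[u], mapping[v]) == d
               for (u, v), d in edges1.items()):
            return mapping
    return None


# Toy graphs using the node names from the example above.
shapes1 = {"n1": "cylinder", "n2": "cylinder", "n3": "box"}
edges1 = {("n1", "n3"): (0, 0, 1), ("n2", "n3"): (1, 0, 0)}
shapes2 = {"n4": "cylinder", "n5": "box", "n6": "cylinder"}
edges2 = {("n6", "n5"): (0, 0, 1), ("n5", "n4"): (-1, 0, 0)}
mapping = graphs_match(shapes1, edges1, shapes2, edges2)
```

With this toy data, the candidate n1 = n4, n2 = n6, n3 = n5 passes the shape check but fails on edge directions, so the search moves on to n1 = n6, n2 = n4, n3 = n5. This is why every structurally valid candidate has to be checked against the attributes, not just the first one found.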
Current System
• Success rate: 80%
• Assumes no occlusion
• For the cup case, the handle must always be visible
• Needs metric information to distinguish bigger objects from smaller ones
One way to go…
• Learn a generic graph for each affordance type.
• Check the maximal cliques of the match graph when comparing the graph of an object with a generic graph.
• Use a Mahalanobis distance metric for generic graphs, together with maximum likelihood estimation (MLE).
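The maximal-clique idea can be illustrated with a small sketch. It assumes the "match graph" is the usual association graph: one node per shape-compatible pair (object part, generic part), with two pairs connected when their parts are distinct and related in the same way in both graphs. The cliques are enumerated with the basic Bron-Kerbosch algorithm, which is a choice made here and is not named in the slides; all function names and data are hypothetical.

```python
def direction(edges, u, v):
    """Direction signs from u to v, or None if the parts are not connected."""
    if (u, v) in edges:
        return edges[(u, v)]
    if (v, u) in edges:
        return tuple(-c for c in edges[(v, u)])
    return None


def build_match_graph(shapes1, edges1, shapes2, edges2):
    """Association graph: nodes are (part1, part2) pairs with equal shape labels."""
    pairs = [(a, b) for a in shapes1 for b in shapes2 if shapes1[a] == shapes2[b]]
    adj = {p: set() for p in pairs}
    for i, (a1, b1) in enumerate(pairs):
        for a2, b2 in pairs[i + 1:]:
            # Two pairings are compatible if they use distinct parts and the
            # relation between the parts is the same in both graphs.
            if (a1 != a2 and b1 != b2
                    and direction(edges1, a1, a2) == direction(edges2, b1, b2)):
                adj[(a1, b1)].add((a2, b2))
                adj[(a2, b2)].add((a1, b1))
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(adj), set())
    return cliques


# Toy object with one extra part (n7) that the generic graph does not have.
obj_shapes = {"n1": "cylinder", "n2": "cylinder", "n3": "box", "n7": "sphere"}
obj_edges = {("n1", "n3"): (0, 0, 1), ("n2", "n3"): (1, 0, 0), ("n7", "n3"): (0, 1, 0)}
gen_shapes = {"n4": "cylinder", "n5": "box", "n6": "cylinder"}
gen_edges = {("n6", "n5"): (0, 0, 1), ("n5", "n4"): (-1, 0, 0)}

cliques = maximal_cliques(build_match_graph(obj_shapes, obj_edges, gen_shapes, gen_edges))
best = max(cliques, key=len)  # largest consistent partial matching
```

Unlike a strict isomorphism test, the largest clique gives a partial matching, so an object with an extra or missing part can still be compared against a generic graph.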
References
[1] J. J. Gibson, "The Theory of Affordances," in Perceiving, Acting, and Knowing, R. Shaw and J. Bransford, Eds., 1977. ISBN 0-470-99014-7.
[2] E. Sahin, M. Cakmak, M. R. Dogar, E. Ugur, and G. Ucoluk, "To Afford or Not to Afford: A New Formalization of Affordances Toward Affordance-Based Robot Control," Adaptive Behavior, 2007, pp. 447-472.
[3] I. Biederman, "Recognition-by-Components: A Theory of Human Image Understanding," Psychological Review, vol. 94, 1987, pp. 115-148.
[4] M. A. Arbib, CS564: Brain Theory and Artificial Intelligence, USC, Fall 2001, Lecture 7: Object Recognition.
Human Image Understanding
• Hypothesis: a small number of geometric components constitute the primitive elements of the object recognition system (like letters forming words).
• Geons are recognized directly from edges, based on their non-accidental properties (i.e., 3D features that are usually preserved by the projective imaging process):
  • edges are straight or curved
  • pairs of edges are parallel or non-parallel
  • vertices will always appear to be vertices
• Non-accidental properties allow geons to be recognized from any perspective.
• The information in the geons is redundant, so they can be recognized even when partially occluded.
Appendix: The Principle of Non-Accidentalness
Examples:
• Collinearity
• Smoothness
• Symmetry
• Parallelism
• Cotermination