Object Recognition based on Shape and Function

Akihiro Eguchi. BS Thesis Defense – November 30, 2011. Committee: Craig Thompson, Russell Deaton, John Gauch. College of Engineering, Computer Science and Computer Engineering Department.


Presentation Transcript


  1. Object Recognition based on Shape and Function • Akihiro Eguchi • BS Thesis Defense – November 30, 2011 • Committee: Craig Thompson, Russell Deaton, John Gauch • College of Engineering, Computer Science and Computer Engineering Department

  2. Outline • Personal Background • Introduction • Background • Approach and Architecture • Methodology, Results, and Analysis • Conclusion

  3. Outline • 1. Personal Background (current section) • 2. Introduction • 3. Background • 4. Approach and Architecture • 5. Methodology, Results, and Analysis • 6. Conclusion

  4. Akihiro Eguchi • Majors at the University of Arkansas • College of Engineering • Honors B.S., Computer Science • Advisor: Dr. Craig Thompson • Fulbright College of Fine Arts and Science • Honors B.A., Psychology • Thesis: “Cultural bias during word learning” • Advisor: Dr. Douglas Behrend • Minor in B.S., Mathematics

  5. Projects, Publications, Awards • Projects in Computer Science field • Semantic World • Publications: • International Journal of Computer Information Systems and Industrial Management (2011) • Web Virtual Reality and Three-Dimensional Worlds Workshop, Freiburg, Germany (2010) • Inquiry Journal of Undergraduate Research (2010) • Cyber Infrastructure Days Conference (2010) • Conference on Applied Research in Information Technology (2010) • X10 Workshop on Extensible Virtual Worlds (2010) • Awards: • Honorable Mention, CRA Outstanding Undergraduate Researchers Award (2011) • Winner, University of Arkansas Undergraduate Research Award, U of A. (2010) • Autonomous Floor Mapping Robot • Publications: • 9th IEEE International Symposium on Robotic and Sensors Environments (ROSE), Montreal, QC, Canada (2011) • Minority Game • Publications: • 5th IEEE International Workshop on Multi-Agent Systems and Simulation (MAS&S), Szczecin, Poland (2011) • Smart Housekeeper Robot with Android • Projects in Psychology field • Cultural bias during children’s word learning • Cultural differences in identity perspectives

  6. Outline • 1. Personal Background • 2. Introduction (current section) • 3. Background • 4. Approach and Architecture • 5. Methodology, Results, and Analysis • 6. Conclusion

  7. Earlier Object Recognition • Required • extensive knowledge of math • expensive equipment • proposed remedy: use of the Kinect • Mostly based only on the shape of objects • difficult to recognize objects like • a uniquely designed chair • different objects that have similar shapes • proposed remedy: ideas from developmental psychology (“function bias”)

  8. Objective • To combine the Kinect sensor with machine learning techniques • To develop a new object recognition model based on shape bias and function bias

  9. Outline • 1. Personal Background • 2. Introduction • 3. Background (current section) • 4. Approach and Architecture • 5. Methodology, Results, and Analysis • 6. Conclusion

  10. Key Concept 1: Machine Learning • Gives computers a way to learn without being explicitly programmed • applications include autonomous vehicles, checkers playing, and signal processing • Learn a function f such that f(x) = y • Examples include • rote learning • decision tree based (ID3) • neural networks • k-nearest neighbor (k-NN)

  11. Key Concept 2: Microsoft Kinect for the Xbox • A controller for the Microsoft Xbox 360 gaming console that costs only $150 • dynamic depth image retrieval • human body recognition • skeletal joint tracking • multi-array microphone • Kinect SDK was officially released in June 2011

  12. Related Work: Developmental Psychology • Human children use two main biases when learning to name objects: • Shape bias (B. Landau et al., 1988) • generalize the name of an object if the shape is similar • Function bias (D. G. K. Nelson et al., 2000) • generalize the name of an object if its function seems to be the same • B. Landau, L. Smith, and S. Jones, “The Importance of Shape in Early Lexical Learning,” Cognitive Development, vol. 3, no. 3, 1988, pp. 299-321. • D. G. K. Nelson, R. Russell, N. Duke, and K. Jones, “Two-Year-Olds Will Name Artifacts by Their Functions,” Child Development, vol. 71, no. 5, 2000, pp. 1271-1288.

  13. Related Work: Object Recognition in CS • Simulation of biases (Grabner et al., 2011) • Shape bias: • Define 3D models that are labeled with the same name (e.g., chair) and run a machine learning algorithm to train the classifier. • Function bias: • First define the use of the object; e.g., a chair is an object to sit on. • Then, let the program learn the posture of sitting. • If the object is sittable, the object is more likely to be a chair. • H. Grabner, J. Gall, L. V. Gool, “What Makes a Chair a Chair?,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11), Colorado Springs, CO, June 20-25, 2011, pp. 1529-1536.

  14. Related Work: Activity Recognition • Workflow analysis using RFID (Philipose et al., 2004) • Label objects with RFID to record actions • Bayesian networks to infer actions • Video analysis with SVM (Schuldt et al., 2004) • 4-second video recordings of walking, jogging, and running • SVM to train the classifier to recognize those actions. • M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Fox, H. Kautz, D. Hahnel, “Inferring Activities from Interactions with Objects,” IEEE Pervasive Computing, vol. 3, no. 4, Oct.-Dec. 2004, pp. 50-57. • C. Schuldt, I. Laptev, B. Caputo, “Recognizing Human Actions: a Local SVM Approach,” Proceedings of the 17th International Conference on Pattern Recognition (ICPR), 2004.

  15. Outline • 1. Personal Background • 2. Introduction • 3. Background • 4. Approach and Architecture (current section) • 5. Methodology, Results, and Analysis • 6. Conclusion

  16. Selecting a Learning Technique • Test task: handwritten number recognition • 7x11 grid of 77 cells • Different Techniques • Neural network (RBFN) • 5 nodes, 3 nodes, 300 iterations • accurate, but slow • K-NN • simple and fast for small input • less accurate but sufficient for this project
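
The K-NN option on this slide can be sketched as follows. This is only an illustration, not the thesis code. The 7-column by 11-row orientation, the Hamming distance, the toy glyphs, and the names `to_vector`, `hamming`, and `knn_classify` are all assumptions introduced here.

```python
from collections import Counter

ROWS, COLS = 11, 7  # assumed orientation of the 77-cell grid


def to_vector(glyph):
    """Flatten a glyph ('#' = ink, '.' = empty) into a list of 77 zeros and ones."""
    assert len(glyph) == ROWS and all(len(row) == COLS for row in glyph)
    return [1 if ch == "#" else 0 for row in glyph for ch in row]


def hamming(a, b):
    """Number of cells in which two grids differ (assumed distance measure)."""
    return sum(x != y for x, y in zip(a, b))


def knn_classify(training, query, k=1):
    """Return the majority label among the k training grids closest to query."""
    nearest = sorted(training, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy glyphs, made up for this sketch.
ONE = ["...#..."] * ROWS
ZERO = [".#####."] + ["#.....#"] * (ROWS - 2) + [".#####."]
NOISY_ONE = ["...#...", "..##..."] + ["...#..."] * (ROWS - 2)

training = [(to_vector(ONE), "1"), (to_vector(ZERO), "0")]
predicted = knn_classify(training, to_vector(NOISY_ONE))
```

With so few cells per example, comparing a query against every stored grid stays cheap, which is consistent with the slide's "simple and fast for small input" remark.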

  17. Learning to Use the Kinect SDK • Noise reduction • 3D reconfiguration with OpenGL

  18. Architecture of the Object Recognition Model

  19. Plane Surface Removal with RANSAC Algorithm • Random Sample Consensus (RANSAC) • randomly takes 3 points from the point cloud to determine a random plane • counts how many points are on the plane • iterates to find the plane that contains the maximum number of points
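
The three RANSAC steps on this slide can be sketched as below. This is a hedged illustration under assumptions, not the thesis implementation. The iteration count, the distance tolerance, the fixed random seed, the synthetic point cloud, and the helper names `plane_from_points`, `ransac_plane`, and `remove_plane` are all hypothetical.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal, d) of the plane through p, q, r, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, iterations=200, tolerance=0.01, seed=0):
    """Return indices of the points on the best plane found by random sampling."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))  # 3 random points
        if plane is None:
            continue
        (a, b, c), d = plane
        # Count the points that lie within `tolerance` of the candidate plane.
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= tolerance]
        if len(inliers) > len(best_inliers):  # keep the plane with most points
            best_inliers = inliers
    return best_inliers


def remove_plane(points, **kwargs):
    """Drop the dominant plane (e.g., a floor or table top) from the cloud."""
    on_plane = set(ransac_plane(points, **kwargs))
    return [p for i, p in enumerate(points) if i not in on_plane]


# Synthetic cloud: a flat 10x10 floor patch plus five points of an "object".
floor = [(x * 0.1, y * 0.1, 0.0) for x in range(10) for y in range(10)]
obj = [(0.5, 0.5, 0.2 + 0.1 * i) for i in range(5)]
remaining = remove_plane(floor + obj)
```

On real depth data the tolerance would have to absorb sensor noise, so the values used here should be read as placeholders.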

  20. K-NN for Shape Learning • Learning: • User names each object or chooses the name from a list of names the user has already told the program • Testing • program compares the shape of the target object with previously learned shape information to infer the name

  21. Activity Recognition using the Kinect • Learning: • Kinect tracks twenty skeletal joints • program records the coordinates of each joint every 0.1 seconds for 10 seconds • user names the activity • Testing: • Use of K-NN to infer the target activity
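
Sampling every 0.1 seconds for 10 seconds yields 100 frames per recording. If each of the twenty joints is stored as an (x, y, z) triple, which is an assumption here, one recording becomes 100 × 20 × 3 = 6000 numbers. The sketch below compares such recordings with a 1-nearest-neighbor rule. The Euclidean distance, the synthetic poses, and the names `flatten`, `distance`, `infer_activity`, and `make_recording` are hypothetical, and no Kinect SDK calls are shown.

```python
import math

JOINTS = 20
SAMPLE_PERIOD_S = 0.1
DURATION_S = 10
FRAMES = round(DURATION_S / SAMPLE_PERIOD_S)  # 100 frames per recording


def flatten(recording):
    """Turn FRAMES frames of JOINTS (x, y, z) triples into one flat vector."""
    return [c for frame in recording for joint in frame for c in joint]


def distance(a, b):
    """Euclidean distance between two flat vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def infer_activity(learned, recording):
    """Return the name of the learned recording closest to the new one (1-NN)."""
    query = flatten(recording)
    name, _ = min(learned, key=lambda item: distance(flatten(item[1]), query))
    return name


def make_recording(height):
    """Synthetic stand-in for sensor data: every joint held at one height."""
    frame = [(0.0, height, 2.0)] * JOINTS
    return [frame] * FRAMES


learned = [("standing", make_recording(1.0)), ("sitting", make_recording(0.5))]
guess = infer_activity(learned, make_recording(0.55))
```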

  22. Demo: Activity Recognition using the Kinect • http://www.youtube.com/watch?v=AxCn0eKWkiQ

  23. Object Recognition Model with Shape Bias and Function Bias • Learning: • Let the program learn the shape • Use activity learning to learn the use • Associate the activity with the name of the object. • Testing: • Answer based on the confidence: • "Maybe the object is [answer based on the shape]. But the object may be a [answer based on the action] because you used the object for [the name of action]" • "I think the object is [answer based on the action] because you used the object for [name of action]. But it might be a [answer based on the shape] based on the shape."
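
One way the two answer templates could be selected is sketched below. The slide only says the choice is "based on the confidence", so the rule used here is an assumption: shape confidence at least as high as action confidence picks the first template, otherwise the second. The confidence values, the wording of the answer when both cues agree, and the names `answer` and `OBJECT_FOR_ACTION` are also hypothetical.

```python
# Association learned during training: activity name -> object name.
OBJECT_FOR_ACTION = {
    "sitting": "chair",
    "deodorizing": "antiperspirant",
    "killing bugs": "insecticide",
}


def answer(shape_guess, shape_conf, action, action_conf, object_for_action):
    """Combine the shape-based and function-based guesses into one reply."""
    function_guess = object_for_action[action]
    if shape_guess == function_guess:
        # Both cues agree; this sentence is invented for the sketch.
        return f"The object is {shape_guess}."
    if shape_conf >= action_conf:  # assumed selection rule
        return (f"Maybe the object is {shape_guess}. But the object may be a "
                f"{function_guess} because you used the object for {action}")
    return (f"I think the object is {function_guess} because you used the "
            f"object for {action}. But it might be a {shape_guess} "
            f"based on the shape.")


# An oddly shaped chair: weak shape match, strong "sitting" match.
reply = answer("antiperspirant", 0.2, "sitting", 0.9, OBJECT_FOR_ACTION)
```

The sketch fills the templates literally and does not adjust articles such as "a" or "an".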

  24. Demo: Object Recognition Model with Shape Bias and Function Bias • http://www.youtube.com/watch?v=4ia76fzxm68

  25. Outline • 1. Personal Background • 2. Introduction • 3. Background • 4. Approach and Architecture • 5. Methodology, Results, and Analysis (current section) • 6. Conclusion

  26. Methodology • Target Objects • two objects that look similar but have different uses: a can of antiperspirant and a can of insecticide • two other objects that look different but have the same name and function: a conventional chair and an oddly shaped chair

  27. Object Recognition based only on Shape • Learning: • Antiperspirant • Insecticide • Chair (1) • Testing: • Antiperspirant -> “Antiperspirant” / “Insecticide” • Insecticide -> “Antiperspirant” / “Insecticide” • Chair (1) -> “Chair” • Chair (2) -> “Antiperspirant” ??? • Because shape is quite different from the chair (1)

  28. Function Recognition • Learning: • “killing bugs” with insecticide • “deodorizing” with antiperspirant • “sitting” on a chair (1) • Testing: • Killing bugs with insecticide -> “killing bugs” • Deodorizing with antiperspirant -> “deodorizing” • Sitting on a chair (1) -> “sitting” • Sitting on a chair (2) -> “sitting”

  29. Object Recognition based on both Shape and Function • “Deodorizing” with antiperspirant -> correct answer, OR -> "Maybe the object is insecticide. But the object may be antiperspirant because you used the object for deodorizing". • “Killing bugs” with insecticide -> correct answer, OR -> "Maybe the object is antiperspirant. But the object may be insecticide because you used the object for killing bugs". • Sitting on a Chair (1) -> correct answer “chair” • Sitting on a Chair (2) -> "I think the object is a chair because you used the object for sitting. But it might be a antiperspirant based on the shape."

  30. Analysis • when the action does not involve a lot of movement, like sitting, the program works well • when the action involves movement, like walking, a timing shift between the learned recording and the test recording can be a problem.
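
The timing issue can be illustrated with a toy example. It assumes recordings are compared frame by frame with a Euclidean distance, which the slides do not state explicitly. A motionless pose gives the same samples no matter when recording starts, while a periodic motion started half a step later looks very different at every frame. The one-second step period, the sine-shaped motion, and the names `swinging_joint` and `still_joint` are made up for this sketch.

```python
import math

SAMPLE_PERIOD_S = 0.1
FRAMES = 100


def distance(a, b):
    """Frame-by-frame Euclidean distance between two recordings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def swinging_joint(delay_s, step_period_s=1.0):
    """Height of one joint that swings once per step, started delay_s late."""
    return [math.sin(2 * math.pi * (i * SAMPLE_PERIOD_S - delay_s) / step_period_s)
            for i in range(FRAMES)]


def still_joint(height=0.5):
    """Height of one joint that does not move, as when sitting."""
    return [height] * FRAMES


# Same activity recorded twice; only the start time differs for walking.
sitting_gap = distance(still_joint(), still_joint())
walking_gap = distance(swinging_joint(0.0), swinging_joint(0.5))
```

Here `sitting_gap` is zero while `walking_gap` is large, even though both pairs show the same activity. Aligning the recordings in time before comparing them would be one possible remedy.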

  31. Outline • 1. Personal Background • 2. Introduction • 3. Background • 4. Approach and Architecture • 5. Methodology, Results, and Analysis • 6. Conclusion (current section)

  32. Summary • Proposed a new way of designing a computational object recognition model • knowledge from developmental psychology • shape + function • Leveraged powerful features of the Kinect sensor • depth map retrieval • human body joint recognition • easier to program an advanced application • less expensive • Results show that the model works as expected

  33. Future Work • The machine learning technique used for this model can be improved • Speech recognition technology can be used to name objects and actions • The model can be extended to recognize sequences of activities (workflows) • brushing your teeth involves turning on the faucet, picking up the toothpaste, brushing, rinsing … • Help to create a future semantic world with smart objects • visual object recognition complements RFID tag based recognition • the accuracy can be improved by combining the model with work from the field of ontologies

  34. Questions
