Conceptual Spaces

Conceptual Spaces. Part 1: Fundamental notions. P.D. Bruza Information Ecology Project Distributed Systems Technology Centre. Opening remarks. This tutorial is more about cognitive science than IR, is fragmented and offers a somewhat personal interpretation


Presentation Transcript


  1. Conceptual Spaces. Part 1: Fundamental notions. P.D. Bruza, Information Ecology Project, Distributed Systems Technology Centre

  2. Opening remarks • This tutorial is more about cognitive science than IR, is fragmented and offers a somewhat personal interpretation • The content is drawn mostly from Gärdenfors’ “Conceptual Spaces: The geometry of thought”, MIT Press, 2000. • Also driven by some personal intuition: • The model theory for IR should be rooted in cognitive semantics • How do you capture these cognitive semantics in a computational form, and what can you do with them?

  3. Gärdenfors’ point of departure • How can representations (information) in a cognitive system be modelled in an appropriate way? • Symbolic perspective: representation via symbol, a cognitive system is described by a Turing machine (cognition = computation = symbol manipulation) • Associationist perspective: representation via associations between “different kinds of information elements” (e.g. connectionism – associations modelled by artificial neural networks)

  4. The problem with the symbolic and associationist perspectives • “mechanisms of concept acquisition, which are paramount for the understanding of many cognitive phenomena, cannot be given a satisfactory treatment in any of these representational forms” • Concept acquisition (learning) closely tied with similarity • Geometric representation: similarity can be “modelled in a natural way”

  5. Gärdenfors’ cognitive model • Symbolic level: propositional representation • Conceptual level: geometric representation • Associationist (sub-conceptual) level: connectionist representation

  6. Conceptual spaces outline • Quality dimension • Domain (Context) • Property • Concept • “Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics” • How can conceptual spaces be realized (e.g., for IR)?

  7. Quality dimensions • Represent various “qualities” of an object: • Temperature • Weight • Brightness • Pitch • Height • Width • Depth • A distinction is made between “scientific” and “phenomenal” (psychological) dimensions

  8. Quality dimensions (cont’d) “Each quality dimension is endowed with certain geometrical structures (in some cases topological or ordering relations)” • Example: weight is isomorphic to the non-negative reals (a half-line starting at 0)

  9. Quality dimensions may have a discrete geometric structure Discrete structure divides objects into disjoint classes 1. Kinship relation: father, mother, sister etc, (geometric structure = discrete points) 2. t “Even for discrete dimensions we can distinguish a rudimentary geometric structure”

  10. Phenomenal vs. scientific interpretations of dimensions • Phenomenal interpretation: dimensions originate from cognitive structures (perception, memories) of humans or other organisms • E.g. (height, width, depth), hue, pitch • Scientific interpretation: dimensions are treated as part of a scientific theory • E.g., weight

  11. Example: colour • Hue: the particular shade of colour • Geometric structure: circle • Value: polar coordinate • Chromaticity: the saturation of the colour, from grey to higher intensities • Geometric structure: segment of reals • Value: real number • Brightness: black to white • Geometric structure: reals in [0,1] • Value: real number

  12. Example: colour (hue, chromaticity, brightness) N.B. the geometric structure allows phenomenologically “complementary” and “opposite” hues to be distinguished
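
The circular structure of the hue dimension can be illustrated with a small sketch. It assumes hue is expressed as an angle in degrees (the slides only say "polar coordinate"), and the function names `hue_distance` and `opposite_hue` are hypothetical, introduced here for illustration only.

```python
def hue_distance(h1: float, h2: float) -> float:
    """Shortest angular distance between two hues, in degrees (range 0..180).

    Assumption: hues are angles in degrees on a circle.
    """
    diff = abs(h1 - h2) % 360.0
    return min(diff, 360.0 - diff)


def opposite_hue(h: float) -> float:
    """Hue diametrically opposite to h on the hue circle."""
    return (h + 180.0) % 360.0
```

On a circle, two hues near 0 and near 360 degrees come out as close neighbours, which a plain interval of reals would not capture; this is the kind of structure that lets "opposite" hues be defined geometrically.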

  13. Integral and separable dimensions • Dimensions are integral if an object cannot be assigned a value in one dimension without giving it a value in another: • E.g. one cannot distinguish hue without brightness, or pitch without loudness • Dimensions that are not integral are said to be separable • Psychologically, integral and separable dimensions are assumed to differ in cross-dimensional similarity: • integral dimensions are higher in cross-dimensional similarity than separable dimensions. • (This point will motivate how similarities in the conceptual space are calculated depending on whether dimensions are integral or separable. N.B. IR matching functions treat all dimensions equally)

  14. Where do dimensions originate from? • Scientific dimensions: tightly connected to the measurement methods used • Psychological dimensions: • Some dimensions appear innate, or developed very early; e.g. inside/outside, dangerous/not-dangerous. (These appear to be pre-conscious) • Dimensions are necessary for learning – to make sense of “blooming, buzzing, confusion”. Dimensions are added by the learning process to expand the conceptual space: • E.g., young children have difficulty in identifying whether two objects differ w.r.t brightness or size, even though they can see the objects differ in some way. “Both differentiation and dimensionalization occur throughout one’s lifetime”.

  15. In summary, • Quality dimensions are the building blocks of representations within a conceptual space • Gärdenfors’ rebuttal of logical positivism: • “Humans and other animals can represent the qualities of objects, for example, when planning an action, without presuming an internal language or another symbolic system in which these qualities are expressed. As a consequence, I claim that the quality dimensions of conceptual spaces are independent of symbolic representations and more fundamental than these”

  16. Conceptual spaces outline • Quality dimension • Domain (Context) • Property • Concept • “Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics” • How can conceptual spaces be realized (e.g., for IR)?

  17. Domains and conceptual space • A domain is a set of integral dimensions, i.e. a separable subspace (e.g., hue, chromaticity, brightness) • A conceptual space is a collection of one or more domains • Cognitive structure is defined in terms of domains, as it is assumed that an object can be ascribed certain properties independently of other properties • Not all domains are assumed to be metric: a domain may be an ordering with no distance defined • Domains are not independent, but may be correlated, e.g., the ripeness and colour domains co-vary in the space of fruits

  18. Conceptual spaces outline • Quality dimension • Domain (Context) • Property • Concept • “Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics” • How can conceptual spaces be realized (e.g., for IR)?

  19. Properties and concepts: general idea • A property is a region in a subspace (domain) • A concept is based on several separable subspaces

  20. Example property: “red” (figure: a region in the colour domain with axes hue, chromaticity and brightness) • Criterion P: A natural property is a convex region of a domain (subspace) • “natural”: those properties that are natural for the purposes of problem solving, planning, communicating, etc.

  21. Motivation for convex regions (figure: a convex and a non-convex region, each containing two points x and y) • x and y are points (objects) in the conceptual space • If x and y both have property P, then any object between x and y is assumed to have property P
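
The convexity condition can be made concrete with a rough numerical check. The sketch below assumes Euclidean betweenness (points on the straight segment from x to y) and represents a region as a membership predicate; `segment_stays_inside`, `in_disc` and `in_ring` are hypothetical names. Sampling the segment can only refute convexity for a given pair of points, it does not prove it in general.

```python
from typing import Callable, Sequence

Point = Sequence[float]


def segment_stays_inside(region: Callable[[Point], bool],
                         x: Point, y: Point, steps: int = 50) -> bool:
    """Check that sampled points on the straight segment from x to y all lie in region.

    Assumption: "between" means Euclidean betweenness (the straight segment).
    """
    for k in range(steps + 1):
        t = k / steps
        z = [a + t * (b - a) for a, b in zip(x, y)]
        if not region(z):
            return False
    return True


def in_disc(p: Point) -> bool:
    """Unit disc centred at the origin: a convex region."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0


def in_ring(p: Point) -> bool:
    """Ring between radius 0.5 and 1: a non-convex region."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0
```

For the disc, the segment between any two member points stays inside; for the ring, the segment between two points on opposite sides passes through the hole, so the "anything between x and y also has P" inference fails.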

  22. Remarks about Criterion P • Criterion P: A natural property is a convex region of a domain (subspace) • Assumption: “Most properties expressed by simple words in natural languages can be analyzed as natural properties” • “The semantics of the linguistic constituents (e.g. “red”) is severely constrained by the underlying conceptual space” (i.e. no “bleen”) • “Criterion P provides an account of properties that is independent of both possible worlds and objects” • Strong connection between convex regions and prototype theory (categorization) • (Easier to understand how inductive inferences are made)

  23. < , , > Example concept: “apple” Apple = < , , , texture, fruit, nutrition> Criterion C: A natural concept is represented as a set of regions in a number of domains together with an assignment of salience weights to the domains and information about how the regions in the different domains are correlated
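
One way to picture Criterion C as a data structure is sketched below. Everything concrete in it is an assumption made for illustration: the class names, the particular regions and salience values chosen for "apple", and the `weighted_match` score (the salience-weighted fraction of domains whose region contains the object). Correlations between domains, which Criterion C also mentions, are left out of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

Point = Sequence[float]


@dataclass
class DomainRegion:
    contains: Callable[[Point], bool]  # membership test for the region in one domain
    salience: float                    # salience weight of that domain


@dataclass
class Concept:
    name: str
    regions: Dict[str, DomainRegion]   # one region per domain

    def weighted_match(self, obj: Dict[str, Point]) -> float:
        """Salience-weighted fraction of domains in which obj falls inside the region."""
        total = sum(r.salience for r in self.regions.values())
        matched = sum(
            r.salience
            for domain, r in self.regions.items()
            if domain in obj and r.contains(obj[domain])
        )
        return matched / total


# Hypothetical regions: hue between 0 and 120 degrees, sweetness at least 0.5.
apple = Concept(
    name="apple",
    regions={
        "colour": DomainRegion(lambda p: 0.0 <= p[0] <= 120.0, salience=2.0),
        "taste": DomainRegion(lambda p: p[0] >= 0.5, salience=1.0),
    },
)
```

Changing the salience values models a change of context: the same object can match the concept more or less strongly depending on which domains are currently prominent.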

  24. Concepts and inference (in passing) • The salience of different domains determines which associations can be made, and which inferences can be triggered • Context: moving a piano – leads to association “heavy” • More about this next time…..

  25. How to model relevance: concept? Table from Yuan, Belkin and Kim, ACM SIGIR 2002 Poster

  26. How to model a document(s): ? • “An exosomatic memory is a computerized system that operates as an extension to human memory. Ideally, use of an exosomatic system would be transparent, so that finding information would seem the same as remembering it to the human user” (B.C. Brookes, 1975) • To create computerized representations of data sets that are consistent with human perception of the data sets • To enable personalized relations to representations of data sets • To provide natural interfaces for interaction with exosomatic memory • Newby, G. Cognitive space and information space. JASIST 52(12), 2001

  27. Term = dimension • “Since many of the fundamental quality dimensions are determined by our perceptual mechanisms, there is a direct link between properties described by regions of such dimensions and perceptions” (rats!) • However, dimensional spaces based on terms have shown marked correlation with human information processing: • The HAL authors note: “It is difficult to know how to encode abstract concepts with traditional semantic features. Global co-occurrence models, such as HAL, may provide a solution to part of this problem” • So, using terms as dimensions in a global co-occurrence model leads to useful vector representations of abstract concepts • HAL’s results seem to be echoed by Newby using Principal Component Analysis on a term-term co-occurrence matrix
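
The idea of "term = dimension" can be sketched with a simple windowed co-occurrence count. This is a simplification for illustration only: it counts symmetric co-occurrences within a fixed window and is not HAL's actual weighting scheme. The function name `cooccurrence` and the `window` parameter are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def cooccurrence(tokens: List[str], window: int = 2) -> Dict[str, Dict[str, int]]:
    """Count how often each pair of terms occurs within `window` positions of each other.

    Row counts[w] can be read as the vector for term w, with one
    dimension per co-occurring term.
    """
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for i, w in enumerate(tokens):
        # Look ahead only; each pair is recorded in both directions.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            v = tokens[j]
            counts[w][v] += 1
            counts[v][w] += 1
    return counts
```

Each row of the resulting table is a vector whose dimensions are terms, so two words can be compared by applying a distance function to their rows.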

  28. Text fragment = dimension • For example, (term x document) matrix • Latent semantic analysis (LSA) produces vector representations of words in a reduced dimensional space: • LSA correlates with human information processing on a number of tasks, e.g., semantic priming • Landauer et al. often use short fragments (dimension = 1 or 2 sentences) • Dimensional reduction is apparently successful in reproducing cognitive compatibility, but the reason for this is unknown • Determining the appropriate dimensional structure for IR models is still an open question, especially in light of cognitive aspects

  29. Similarity: introductory remarks • Similarity is central to many aspects of cognition: concept formation (learning), memory and perceptual organization • Similarity is not an absolute notion but relative to a particular domain (or dimension) • “an apple and an orange are similar as they have the same shape” • Similarity defined in terms of the “number of shared properties” leads to arbitrary similarity: “a writing desk is like a raven” • Similarity is an exponentially decreasing function of distance • N.B. clustering in IR often uses an “absolute” notion of similarity

  30. Metric spaces • A real-valued function d(x,y) is said to be a distance function for space S if it satisfies the following conditions for all points x, y and z in S: (1) minimality: $d(x,y) \ge 0$, and $d(x,y) = 0$ only if $x = y$; (2) symmetry: $d(x,y) = d(y,x)$; (3) triangle inequality: $d(x,y) + d(y,z) \ge d(x,z)$ • A space that has a distance function is called a metric space • (There is debate about whether distance is symmetric from a psychological viewpoint, e.g. Tversky et al.: “Tel Aviv judged more similar to New York” than vice versa. Gärdenfors accepts the symmetry axiom)

  31. Equi-distance under the Euclidean metric • The set of points at distance d from a point x forms a circle • Points between x and y are on a straight line

  32. Equi-distance under the city-block metric • The set of points at distance d from a point x forms a diamond • The set of points between x and y is a rectangle generated by x and y and the directions of the axes

  33. Between-ness in the city-block metric (figure: the axis-aligned rectangle with corners x and y) • All points in the rectangle are considered to be between x and y
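
The difference between the two notions of between-ness can be checked numerically. The sketch assumes the usual metric definition, that z is between x and y when d(x,z) + d(z,y) = d(x,y); the function names and the tolerance `tol` are hypothetical.

```python
import math
from typing import Callable, Sequence

Point = Sequence[float]


def city_block(x: Point, y: Point) -> float:
    """City-block (Manhattan) distance."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x: Point, y: Point) -> float:
    """Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def is_between(x: Point, z: Point, y: Point,
               dist: Callable[[Point, Point], float], tol: float = 1e-9) -> bool:
    """z is between x and y if going via z does not lengthen the path (within tol)."""
    return abs(dist(x, z) + dist(z, y) - dist(x, y)) <= tol
```

With x = (0, 0) and y = (2, 1), a point such as (1.5, 0.25) lies inside the rectangle spanned by x and y, so it counts as between them under the city-block metric, but it is off the straight line and so is not between them under the Euclidean metric.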

  34. Metrics: integral and separable dimensions • For separable dimensions, calculate the distance using the city-block metric: • “If two dimensions are separable, the dissimilarity of two stimuli is obtained by adding the dissimilarity along each of the two dimensions” • For integral dimensions, calculate distance using the Euclidean metric: • “When two dimensions are integral, the dissimilarity is determined by both dimensions taken together”
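
One possible reading of this rule is to measure distance with the Euclidean metric inside each domain (integral dimensions) and then add the per-domain distances in city-block fashion (separable domains). The sketch below follows that reading; it is an assumption about how the two metrics might be combined, not a formula taken from the slides, and `combined_distance` is a hypothetical name.

```python
import math
from typing import Dict, Sequence

Point = Sequence[float]


def euclidean(x: Point, y: Point) -> float:
    """Euclidean distance, used within a domain of integral dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_distance(x: Dict[str, Point], y: Dict[str, Point]) -> float:
    """Sum (city-block style) of within-domain Euclidean distances.

    Only domains present in both objects are compared.
    """
    shared = x.keys() & y.keys()
    return sum(euclidean(x[d], y[d]) for d in shared)
```

For two objects described by a three-dimensional colour domain and a one-dimensional size domain, the colour difference is computed as a single Euclidean distance and the size difference is simply added to it.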

  35. Minkowski metrics • Euclidean and city-block are special cases of Minkowski metrics: $d_r(x,y) = \left(\sum_i |x_i - y_i|^r\right)^{1/r}$ • City-block: r = 1 • Euclidean: r = 2
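
A minimal sketch of the Minkowski family, assuming the standard form (sum of |x_i - y_i|^r, raised to the power 1/r). The function name `minkowski` is chosen here for illustration.

```python
from typing import Sequence


def minkowski(x: Sequence[float], y: Sequence[float], r: float) -> float:
    """Minkowski distance of order r (r = 1: city-block, r = 2: Euclidean)."""
    if r < 1:
        raise ValueError("r must be >= 1 for the triangle inequality to hold")
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)
```

For the points (0, 0) and (3, 4), r = 1 gives the city-block distance 7 and r = 2 gives the Euclidean distance 5.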

  36. Scaling dimensions • Due to context, the scales of the different dimensions cannot be assumed identical • Each dimension is therefore given a dimensional scaling factor (weight) in the distance function
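
A sketch of how scaling factors might enter the distance computation. It assumes each dimension i carries a non-negative weight w_i that multiplies its term inside the sum, which is one common way to write a weighted Minkowski distance; the exact placement of the weights is an assumption, as is the name `weighted_minkowski`.

```python
from typing import Sequence


def weighted_minkowski(x: Sequence[float], y: Sequence[float],
                       weights: Sequence[float], r: float) -> float:
    """Minkowski distance with a per-dimension scaling factor.

    Assumed form: (sum_i w_i * |x_i - y_i| ** r) ** (1 / r).
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    total = sum(w * abs(a - b) ** r for w, a, b in zip(weights, x, y))
    return total ** (1.0 / r)
```

Raising the weight of a dimension stretches the space along it, so differences on that dimension count for more; setting all weights to 1 recovers the unweighted metric.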

  37. Similarity as a function of distance • A common assumption in psychological literature is that similarity is an exponentially decaying function of distance: $s(x,y) = e^{-c \cdot d(x,y)}$ • The constant c is a sensitivity parameter • The similarity between x and y drops quickly when the distance between the objects is relatively small, while it drops more slowly when the distance is relatively large • The formula captures the similarity-based generalization performances of human subjects in a variety of settings
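
A small sketch of similarity as an exponentially decaying function of distance, assuming the form s = exp(-c * d) with sensitivity parameter c. The function name `similarity` and the default c = 1.0 are illustrative choices.

```python
import math


def similarity(distance: float, c: float = 1.0) -> float:
    """Similarity as an exponentially decaying function of distance.

    c is the sensitivity parameter: a larger c makes similarity fall off faster.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-c * distance)
```

Identical objects (distance 0) get similarity 1. Moving from distance 0 to 1 costs far more similarity than moving from 3 to 4, which matches the remark that similarity drops quickly at small distances and slowly at large ones.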

  38. IR-related comments on similarity • In the vector-space model, similarity is determined by the cosine function, which is not exponentially decaying • IR models don’t distinguish between integral and separable dimensions, even though this distinction is significant from a cognitive point of view • Experience so far with computational cognitive models is mixed: • LSA uses cosine similarity (not exponentially decaying)!! • HAL used Minkowski (r = 1) to measure semantic distance, i.e. a non-Euclidean distance metric was employed • (Non-Euclidean metrics should perhaps be explored)

  39. Prototypes and categorical perception: introductory remarks • Human subjects judge “a robin as a more prototypical bird than a penguin” • Classifying an object is accomplished by determining its similarity to the prototype: • Similarity is judged w.r.t a reference object/region • Similarity is context-sensitive: a robin is a prototypical bird, but a canary is a prototypical pet bird • Continuous perception: membership to a category is graded

  40. Prototype regions in animal space (figure labels: reptile, emu, archaeopteryx, mammal, robin, bat, bird, penguin, platypus) • Categorical perception: stimuli between categories are distinguished with more ease and accuracy than those within them • Based on Gärdenfors & Williams, IJCAI 2001

  41. Computing categories in conceptual space: Voronoi tessellations • Given a set of prototypes, require that any point q be in the same category as its most similar prototype • Consequence: partitioning of the space into convex regions
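
The nearest-prototype rule behind a Voronoi tessellation can be sketched as follows. The sketch assumes Euclidean distance (under which the resulting cells are convex) and uses hypothetical names (`categorize`) and made-up prototype coordinates.

```python
import math
from typing import Callable, Dict, Sequence

Point = Sequence[float]


def euclidean(x: Point, y: Point) -> float:
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def categorize(q: Point, prototypes: Dict[str, Point],
               dist: Callable[[Point, Point], float] = euclidean) -> str:
    """Assign q to the category of its nearest (most similar) prototype."""
    return min(prototypes, key=lambda name: dist(q, prototypes[name]))


# Made-up two-dimensional prototype positions, for illustration only.
prototypes = {"bird": (0.0, 0.0), "mammal": (10.0, 0.0)}
```

Because similarity decreases monotonically with distance, picking the most similar prototype is the same as picking the closest one; the category boundaries are the points equidistant from two prototypes.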

  42. Voronoi tessellations (cont’d) • Much psychological data concords with tessellating conceptual spaces into star-shaped (and sometimes convex) regions around prototypes (e.g., stop consonants in phoneme classification) • Boundaries produced by Voronoi tessellations provide the threshold of similarity and support a mechanism explaining categorical perception • Gärdenfors & Williams, Reasoning about categories in conceptual spaces, Proceedings of IJCAI 2001

  43. Part II • Concept combination • Induction • Semantics • Non-monotonic aspects of concepts • Realizing (approximating) conceptual spaces
