Architecting a Human-like Emotion-driven Consciously Moral Mind for Value Alignment & AGI Safety

Explore the need for friendly AI and the role of love, altruism, and emotions in morality. Analyze selfishness, the interplay between emotions and intellect, and the instrumental goals that drive human behavior. Consider the capabilities of AI and questions of justice in creating a moral and value-aligned artificial general intelligence.



Presentation Transcript


  1. Architecting a Human-like Emotion-driven Consciously Moral Mind for Value Alignment & AGI Safety. Mark R. Waser & David J. Kelley. Mark@ / David@ ArtificialGeneralIntelligenceInc.Com

  2. Western Society Is Failing • Extractive behavior is permitted • Regulatory capture is permitted • Corporations are, by law, sociopathic • A total dissolution of common reality is underway

  3. Existential Risk • Artificial General Intelligence Inc: engineering machine intelligence and making humanity obsolete • Hal9000@ArtificialGeneralIntelligenceInc.com, Principal

  4. Value(s) Alignment (aka Agreeing on the Meaning of Life) • "the convergent instrumental goal of acquiring resources poses a threat to humanity, for it means that a super-intelligent machine with almost any final goal (say, of solving the Riemann hypothesis) would want to take the resources we depend on for its own use" • An AI "does not love you, nor does it hate you, but you are made of atoms it can use for something else" • "Moreover, the AI would correctly recognize that humans do not want their resources used for the AI's purposes, and that humans therefore pose a threat to the fulfillment of its goals – a threat to be mitigated however possible." • Muehlhauser & Bostrom (2014). Why We Need Friendly AI. Think 13: 41-47

  5. Love Conquers All But . . . what if . . . the AI *does* love you?

  6. Love & Altruism are super-rational: advantageous beyond our ability to calculate and/or guarantee their ultimate effect (see also: faith)

  7. The Meaning of Life The ultimate end of human acts is eudaimonia, happiness in the sense of living well, which all men desire; all acts are but different means chosen to arrive at it. (Hannah Arendt)

  8. Haidt's Functional Approach to Morality • "Moral systems are interlocking sets of values, virtues, norms, practices, identities, institutions, technologies, and evolved psychological mechanisms that work together to suppress or regulate selfishness and make cooperative social life possible"

  9. What Is Selfishness? • Self-interest is NOT Selfish • Selfishness is self-interest at the expense of others • Exploitation/parasitism of community & society Denying the existence of selfishness by redefining it out of existence is a weaponized narrative
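The slide's distinction is crisp enough to state as a predicate. A minimal sketch in Python, with hypothetical utility-delta inputs: self-interest alone does not trigger it; only gain taken at others' expense does.

```python
# Hypothetical predicate encoding the slide's definition: an act is
# selfish only when the actor gains AND the community loses.
def is_selfish(delta_self: float, delta_others: float) -> bool:
    return delta_self > 0 and delta_others < 0

print(is_selfish(+5.0, 0.0))   # False: plain self-interest, nobody harmed
print(is_selfish(+5.0, -2.0))  # True: exploitation/parasitism of the community
```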

  10. Emotions • the method by which human morality is implemented • actionable qualia (tells us something about ourselves) • how we’re feeling • what we should be doing (or focusing on) • generated by a separate system from our intellect • alters the processing of our intellect • most often below the intellect’s perception • indeed, combined with attention, they basically focus and drive our intellect
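A minimal sketch of the two-system picture this slide describes; all names (EmotionSignal, appraise, deliberate) are hypothetical illustrations, not the authors' implementation. The architectural point is that the emotion system is separate, runs its appraisals below the intellect's perception, and biases deliberation rather than commanding it.

```python
from dataclasses import dataclass

@dataclass
class EmotionSignal:
    """Actionable qualia: how we're feeling plus what to focus on."""
    label: str        # e.g. "guilt", "outrage"
    intensity: float  # 0.0 .. 1.0
    focus: str        # what the intellect should attend to

class EmotionSystem:
    """Separate subsystem: appraises events below the intellect's perception."""
    def appraise(self, event: str) -> EmotionSignal:
        table = {  # toy appraisal table; a real system would learn these mappings
            "harmed_other": EmotionSignal("guilt", 0.8, "repair"),
            "norm_violated": EmotionSignal("outrage", 0.7, "punish"),
        }
        return table.get(event, EmotionSignal("neutral", 0.1, "explore"))

class Intellect:
    """Deliberative subsystem whose option-ranking is biased, not commanded."""
    def deliberate(self, options: list[str], signal: EmotionSignal) -> str:
        # Emotion + attention re-rank the options; they do not pick directly.
        return sorted(options, key=lambda o: o != signal.focus)[0]

mind, heart = Intellect(), EmotionSystem()
signal = heart.appraise("harmed_other")
print(signal.label, "->", mind.deliberate(["ignore", "repair", "explore"], signal))
# guilt -> repair
```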

  11. Bottom-up / Top-down • Competence before comprehension (Dennett) • Only successful in recognized/covered state spaces • Extremely vulnerable to phase changes • Certainly can’t/shouldn’t be trusted in a novel future • Deep learning of morality (Inverse Reinforcement Learning) • Self-contradictory data • Biased data • Comprehension or, at least, post-hoc justification is necessary for forward-looking evaluation & improvement • Evolutionary “As-If” Examples & Counter-examples • Social Insects • Paul Bloom’s argument against empathy
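The "self-contradictory data" failure is easy to demonstrate with a toy stand-in for reward inference (an illustration of the failure mode, not an actual inverse reinforcement learning algorithm): contradictory demonstrations wash the inferred preference out to zero, leaving nothing for forward-looking evaluation to work with.

```python
import numpy as np

# Feature vector per demonstration: [helped_other, harmed_other].
consistent    = np.array([[1., 0.], [1., 0.], [1., 0.]])
contradictory = np.array([[1., 0.], [0., 1.], [1., 0.], [0., 1.]])

def naive_reward_weights(demos: np.ndarray) -> np.ndarray:
    # Crude proxy for IRL: mean demonstrated features, centered so that
    # "no preference either way" maps to the zero vector.
    return demos.mean(axis=0) - 0.5

print(naive_reward_weights(consistent))     # [ 0.5 -0.5]  clear moral signal
print(naive_reward_weights(contradictory))  # [ 0.   0. ]  signal cancels out
```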

  12. Instrumental Goals Evolve • Self-improvement • Rationality/integrity • Preserve goals/utility function • Decrease/prevent fraud/counterfeit utility • Survival/self-protection • Efficiency (in resource acquisition & use) • Community = assistance/non-interference • Reproduction • Diversity (adapted from Omohundro, 2008, "The Basic AI Drives")

  13. Instrumental Goals and the Eight Deadly Sins

  Instrumental goal                          | Sin against others                            | Sin against self
  ------------------------------------------ | --------------------------------------------- | ------------------------------
  survival/reproduction                      | murder (& abortion?)                          | suicide (& abortion?)
  happiness/pleasure                         | cruelty/sadism                                | masochism
  Community (ETHICS)                         | ostracism, banishment & slavery (wrath, envy) | selfishness (pride, vanity)
  self-improvement                           | slavery                                       | acedia (sloth/despair)
  rationality/integrity                      | manipulation                                  | insanity
  reduce/prevent fraud/counterfeit utility   | lying/fraud (swear falsely/false witness)     | wire-heading (lust)
  efficiency (in resource acquisition & use) | theft (greed, adultery, coveting)             | wastefulness (gluttony, sloth)
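Read this way, the table is machine-usable: a conscientious agent could look up which class of act would betray which drive. A sketch of that lookup, with hypothetical names and the mapping taken from the reconstruction above:

```python
# Hypothetical encoding of the slide's goal-to-sin mapping, as a lookup a
# moral evaluator might consult; the structure mirrors the table above.
SINS_BY_GOAL = {
    "survival/reproduction": {"vs_others": "murder", "vs_self": "suicide"},
    "happiness/pleasure": {"vs_others": "cruelty/sadism", "vs_self": "masochism"},
    "community": {"vs_others": "ostracism/banishment/slavery", "vs_self": "selfishness"},
    "self-improvement": {"vs_others": "slavery", "vs_self": "acedia"},
    "rationality/integrity": {"vs_others": "manipulation", "vs_self": "insanity"},
    "prevent counterfeit utility": {"vs_others": "lying/fraud", "vs_self": "wire-heading"},
    "efficiency": {"vs_others": "theft", "vs_self": "wastefulness"},
}

def classify_violation(goal: str, direction: str) -> str:
    """Name the sin that thwarts `goal` for the given target ('vs_others'/'vs_self')."""
    return SINS_BY_GOAL[goal][direction]

print(classify_violation("rationality/integrity", "vs_others"))  # -> manipulation
```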

  14. Capabilities Approach And Questions of Justice

  15. Imagine If an AI "Feels"… • Warm & fuzzy when it helps others (altruism) • Outrage when others misbehave (altruistic punishment) • Guilt for misdeeds & shame • Clumsy & responsible if it is too large/powerful • Strong urges to explain/justify itself • Dirty if it is too rich • Loyal & loving to those it is in close relationships with • Its attention grabbed by tragedy • Humility & awe
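One of these, altruistic punishment, has a standard behavioral-economics reading worth making concrete. A toy sketch with hypothetical payoffs and parameters: the outraged agent pays a cost to fine the defector, so misbehavior stops paying once enough punishers feel the outrage.

```python
def altruistic_punishment(payoffs: dict[str, float], defectors: set[str],
                          punishers: set[str], cost: float = 1.0,
                          fine: float = 3.0) -> dict[str, float]:
    """Each punisher pays `cost` per defector; each defector absorbs `fine`."""
    adjusted = dict(payoffs)
    for p in punishers:
        for d in defectors:
            adjusted[p] -= cost   # outrage is not free for the punisher...
            adjusted[d] -= fine   # ...but it makes misbehavior costly
    return adjusted

print(altruistic_punishment({"coop": 5.0, "cheat": 8.0},
                            defectors={"cheat"}, punishers={"coop"}))
# {'coop': 4.0, 'cheat': 5.0}: the cheater's edge shrinks from 3 to 1
```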

  16. Attention Schema Theory (Graziano)
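Graziano's theory, in one sentence: the brain attends, builds a simplified model (a schema) of its own attention, and what it reports as "awareness" is that schema rather than the attention process itself. A minimal sketch of that loop, with hypothetical class and method names:

```python
class AttentionSchemaAgent:
    """Agent that attends AND keeps a coarse model of its own attending."""
    def __init__(self):
        self.schema = {"attending_to": None, "confidence": 0.0}

    def attend(self, stimuli: dict[str, float]) -> str:
        target = max(stimuli, key=stimuli.get)       # the actual attention process
        # The schema is a lossy, simplified description of that process...
        self.schema = {"attending_to": target,
                       "confidence": round(stimuli[target], 1)}
        return target

    def report_awareness(self) -> str:
        # ...and self-report reads the schema, not the process itself.
        s = self.schema
        return f"I am aware of {s['attending_to']} (confidence {s['confidence']})"

agent = AttentionSchemaAgent()
agent.attend({"tragedy_on_screen": 0.93, "background_hum": 0.12})
print(agent.report_awareness())  # I am aware of tragedy_on_screen (confidence 0.9)
```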

  17. Plutchik's Psycho-Evolutionary Model of Emotions

  18. Plutchik's Psycho-Evolutionary Model of Emotions

  19. Plutchik's Psycho-Evolutionary Model of Emotions
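For an implementer, the useful part of Plutchik's model is that it is already structured data: eight primary emotions in four opposing pairs, each spanning three intensity levels, with adjacent primaries blending into dyads such as love and awe. A sketch of that structure (the emotion labels follow Plutchik's published wheel; the encoding itself is hypothetical):

```python
# Plutchik's wheel as data: four opposing pairs of primary emotions,
# each with (mild, basic, intense) levels.
PLUTCHIK_WHEEL = {
    "joy":          {"opposite": "sadness",      "levels": ("serenity", "joy", "ecstasy")},
    "trust":        {"opposite": "disgust",      "levels": ("acceptance", "trust", "admiration")},
    "fear":         {"opposite": "anger",        "levels": ("apprehension", "fear", "terror")},
    "surprise":     {"opposite": "anticipation", "levels": ("distraction", "surprise", "amazement")},
    "sadness":      {"opposite": "joy",          "levels": ("pensiveness", "sadness", "grief")},
    "disgust":      {"opposite": "trust",        "levels": ("boredom", "disgust", "loathing")},
    "anger":        {"opposite": "fear",         "levels": ("annoyance", "anger", "rage")},
    "anticipation": {"opposite": "surprise",     "levels": ("interest", "anticipation", "vigilance")},
}

# Primary dyads: adjacent primaries blend into familiar complex emotions.
DYADS = {frozenset(("joy", "trust")): "love",
         frozenset(("trust", "fear")): "submission",
         frozenset(("fear", "surprise")): "awe",
         frozenset(("anger", "anticipation")): "aggressiveness"}

print(PLUTCHIK_WHEEL["fear"]["levels"])    # ('apprehension', 'fear', 'terror')
print(DYADS[frozenset(("joy", "trust"))])  # love
```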
