Artificial Intelligence - An Introduction

Department of Computer Science & Engineering

What is AI?
• Artificial Intelligence is composed of two words, "Artificial" and "Intelligence": "artificial" means "man-made" and "intelligence" means "thinking power", so AI means "a man-made thinking power."
• Artificial Intelligence exists when a machine can exhibit human-like skills such as learning, reasoning, and problem solving.

History of AI

Does AI have applications?
• Autonomous planning and scheduling of tasks aboard a spacecraft
• Beating Garry Kasparov in a chess match
• Steering a driverless car
• Understanding language
• Robotic assistants in surgery
• Monitoring trade in the stock market to detect insider trading

Applications

Goals of AI
• Problem solving
• Problem-solving agents: In Artificial Intelligence, search techniques are universal problem-solving methods. Rational agents, or problem-solving agents, mostly use these search strategies or algorithms to solve a specific problem and provide the best result.
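As a rough illustration of a search strategy that a problem-solving agent might use, the sketch below runs breadth-first search over a small state graph. The slides do not name a specific algorithm, so breadth-first search is chosen here only as an example; the graph, its state names, and the function name are hypothetical.

```python
from collections import deque

def breadth_first_search(graph, start, goal):
    """Return a path with the fewest steps from start to goal, or None."""
    frontier = deque([[start]])   # queue of partial paths still to explore
    visited = {start}             # states already placed on the queue
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:
            return path
        for neighbor in graph.get(state, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None

# Hypothetical state space: each key is a state, each value lists reachable states.
example_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}

path = breadth_first_search(example_graph, "A", "F")
print(path)
```

Breadth-first search explores states level by level, so the first path it finds to the goal uses the fewest steps. Other strategies trade this guarantee for lower memory use or for the ability to take step costs into account.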

An Agent
Anything that can gather information about its environment and take action based on that information.
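The sense-then-act loop in this definition can be sketched with a very small agent. The two-location cleaning world, the action names, and both functions below are assumptions made for illustration; they are not taken from the slides.

```python
def reflex_vacuum_agent(percept):
    """Map a percept (location, status) directly to an action."""
    location, status = percept
    if status == "dirty":
        return "clean"
    return "move_right" if location == "left" else "move_left"

def run(environment, location, steps):
    """Let the agent sense and act for a fixed number of steps."""
    actions = []
    for _ in range(steps):
        percept = (location, environment[location])  # gather information
        action = reflex_vacuum_agent(percept)        # choose an action
        actions.append(action)
        if action == "clean":                        # act on the environment
            environment[location] = "clean"
        elif action == "move_right":
            location = "right"
        else:
            location = "left"
    return actions

# Hypothetical world with two locations, both dirty at the start.
world = {"left": "dirty", "right": "dirty"}
actions = run(world, "left", 4)
print(actions)
print(world)
```

This agent keeps no memory and reacts only to the current percept, which makes it about the simplest thing that fits the definition above.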

Components of a Basic Speech Recognition System
• A speech capturing device: It consists of a microphone, which converts the sound wave signals to electrical signals, and an Analog to Digital Converter, which samples and digitizes the analog signals to obtain discrete data that the computer can understand.
• A Digital Signal Module or Processor: It processes the raw speech signal, for example by converting it to the frequency domain and retaining only the required information.
• Preprocessed signal storage: The preprocessed speech is stored in memory for the further task of speech recognition.
• Reference speech patterns: The computer or system holds predefined speech patterns or templates in memory, to be used as the reference for matching.
• Pattern matching algorithm: The unknown speech signal is compared with the reference speech patterns to determine the actual words or the pattern of words.
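The pattern matching step can be sketched as a nearest-template lookup. This is a minimal sketch that assumes each utterance has already been reduced to a short, fixed-length feature vector. The template values, the words, and the use of Euclidean distance are all illustrative assumptions; the slides do not specify a distance measure.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match(features, templates):
    """Return the word whose stored template is closest to the input."""
    return min(templates, key=lambda word: distance(features, templates[word]))

# Hypothetical reference patterns stored in memory (made-up numbers).
templates = {
    "yes": [0.9, 0.1, 0.3],
    "no": [0.2, 0.8, 0.5],
    "stop": [0.5, 0.5, 0.9],
}

unknown = [0.25, 0.75, 0.4]   # features of the unknown speech signal
best = match(unknown, templates)
print(best)
```

Real utterances vary in length and speed, so practical template matchers usually need a comparison that can stretch or compress the time axis rather than a plain point-by-point distance.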

Working of the System
Speech can be seen as an acoustic waveform, i.e., a signal carrying message information. The microphone converts this acoustic waveform to an analog electrical signal. The Analog to Digital Converter converts the analog signal to digital samples by taking precise measurements of the wave at discrete intervals. The digitized signal is a stream of samples taken 16,000 times per second. In this form it is not suitable for speech recognition, because patterns cannot easily be located in it. To extract the actual information, the time-domain signal is converted to the frequency domain. The Digital Signal Processor does this using the Fast Fourier Transform (FFT). The digital signal is analyzed in segments of 1/100th of a second, and the frequency spectrum of each segment is computed. In other words, the digitized signal is split into short segments, each described by the amplitudes of its frequency components. Each segment, or frequency graph, represents one of the different sounds made by human beings. The computer then matches the unknown segments against the stored phonetics of the particular language.
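The framing and frequency-domain conversion described above can be sketched in plain Python. The 16,000 samples-per-second rate and the 1/100-second segments come from the text. The zero-padding of each 160-sample segment to 256 points, the recursive radix-2 FFT, and the synthetic 1000 Hz test tone are assumptions made to keep the example self-contained.

```python
import cmath
import math

SAMPLE_RATE = 16000               # samples per second, as in the text
FRAME_SIZE = SAMPLE_RATE // 100   # 160 samples = 1/100th of a second
FFT_SIZE = 256                    # assumption: pad each frame to a power of two

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result

def frame_spectra(samples):
    """Split samples into 1/100 s frames and return each magnitude spectrum."""
    spectra = []
    for start in range(0, len(samples) - FRAME_SIZE + 1, FRAME_SIZE):
        frame = samples[start:start + FRAME_SIZE]
        padded = frame + [0.0] * (FFT_SIZE - FRAME_SIZE)
        spectrum = fft(padded)
        # Keep only the first half; for real input the rest mirrors it.
        spectra.append([abs(c) for c in spectrum[:FFT_SIZE // 2]])
    return spectra

def dominant_frequency(magnitudes):
    """Return the frequency in Hz of the strongest bin in one spectrum."""
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * SAMPLE_RATE / FFT_SIZE

# Synthetic stand-in for digitized speech: 0.05 s of a 1000 Hz tone.
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE // 20)]
spectra = frame_spectra(tone)
print(len(spectra), dominant_frequency(spectra[0]))
```

Each list in `spectra` plays the role of one "frequency graph" from the description. A real system would pass these on to the matching stage instead of just reporting the strongest frequency.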

Factors on which a Speech Recognition System Depends
• Isolated words: There needs to be a pause (a short silence) between consecutive spoken words, because continuous words can overlap, making it difficult for the system to tell where a word starts or ends.
• Single speaker: Many speakers giving speech input at the same time can cause overlapping signals and interruptions. Most speech recognition systems in use are speaker-dependent systems.
• Vocabulary size: Languages with a large vocabulary are harder to handle in pattern matching than those with a small vocabulary, since the chance of ambiguous words is lower in the latter.
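To see why pauses help, the sketch below finds word boundaries by looking for runs of low-energy frames. It assumes the signal has already been reduced to one energy value per frame. The threshold, the minimum silence length, and the energy values are made-up illustration values, not parameters from the slides.

```python
def split_on_silence(energies, threshold, min_silence):
    """Return (start, end) frame index pairs, end exclusive, for each word."""
    segments = []
    start = None       # first frame of the word currently being tracked
    silent_run = 0     # consecutive quiet frames seen inside that word
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence:
                # The pause is long enough: close the word before the pause.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        segments.append((start, len(energies) - silent_run))
    return segments

# Hypothetical per-frame energies: two words separated by a clear pause.
energies = [0.0, 0.1, 0.9, 0.8, 0.7, 0.0, 0.0, 0.0,
            0.6, 0.9, 0.1, 0.8, 0.0, 0.0, 0.0]
segments = split_on_silence(energies, threshold=0.5, min_silence=3)
print(segments)
```

The brief dip inside the second word is shorter than `min_silence`, so it does not split the word. Without a long enough pause between words, the two would merge into a single segment.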

Components of ASR (Automatic Speech Recognition)
Lexicon, Acoustic Model, and Language Model

Lexicon
The lexicon is the primary step in decoding speech. Creating a comprehensive lexical design for an ASR system involves including the fundamental elements of both spoken language (the audio input the ASR system receives) and written vocabulary (the text the system sends out).

Acoustic Model
Acoustic modeling involves separating an audio signal into small time frames. Acoustic models analyze each frame and estimate the probability of each phoneme occurring in that section of audio. Simply put, acoustic models aim to predict which sound is spoken in each frame.

Language Model
Today's ASR systems employ natural language processing (NLP) to help computers understand the context of what a speaker says. Language models recognize the intent of spoken phrases and use that knowledge to compose word sequences. Like acoustic models, they use deep neural networks, in this case trained on text data, to estimate the probability of which word comes next in a phrase.

Together, the lexicon, acoustic model, and language model enable ASR systems to make close-to-accurate predictions about the words and sentences in an audio input.
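The idea of estimating which word comes next can be shown with a much simpler stand-in for the neural language models mentioned above: a bigram model built from word-pair counts. The tiny corpus and the function names are hypothetical, and a counting model is used here only because it fits in a few lines.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def next_word_probabilities(model, word):
    """Return the estimated probability of each word that can follow `word`."""
    followers = model.get(word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in followers.items()}

# Hypothetical training text.
corpus = [
    "recognize speech today",
    "recognize speech now",
    "recognize text now",
]
model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "recognize")
print(probs)
```

In a full ASR system, probabilities like these would be combined with the acoustic model's per-frame sound predictions, so that a candidate word sequence is preferred when it both sounds right and is likely as text.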

How Does ASR Work?
In the simplest terms, speech recognition occurs when a computer receives audio input from a person speaking, processes that input by breaking down the various components of speech, and then transcribes that speech to text.

Some ASR systems are speaker-dependent and must be trained to recognize particular words and speech patterns. These are essentially the voice-recognition systems used in smart devices: before the ASR-powered voice assistant starts working, you need to say specific words and phrases into your phone so that it can learn to identify your voice.

Other ASR systems are speaker-independent. These systems do not require any training by the user and can recognize spoken words regardless of the speaker. Speaker-independent systems are practical solutions for business applications like interactive voice response (IVR).

ASR Use Cases
From speech recognition's mid-twentieth-century origins to its multi-industry applications today, the use cases for ASR technology are far-reaching. ASR has made it out of the computer science laboratories and is now integrated into our everyday lives.
• Voice assistants: According to a 2020 survey conducted by NPR and Edison Research, 63% of respondents said they use a voice assistant. The ability to use voice commands for tasks like opening mobile apps, sending a text message, or searching the web affords users a greater level of convenience.
• Language learning: For people engaged in self-guided language study, apps using speech-recognition tools bring them a step closer to a comprehensive learning experience. Apps like Busuu and Babbel use ASR technology to help students practice their pronunciation and accents in their target languages. A student speaks into their phone or computer in the target language. The ASR software listens to that voice input and analyzes it. If the input matches what the system identifies as the correct pronunciation, it informs the learner; if it doesn't match, it points out the mispronunciation.
• Transcription services: One of the first widespread use cases of ASR was the simple transcription of speech. Speech-to-text services offer a level of convenience in many contexts and open the door to improved audio and video accessibility. Health care practitioners use dictation products like Dragon NaturallySpeaking to take hands-free notes while attending to patients. ASR captioning also allows for real-time transcription of live video, which lets a broader audience access the media.
• Call centers: ASR is crucial for automating processes in businesses with extensive customer support demands. With an influx of callers, companies need a way to efficiently handle a vast amount of customer communication. ASR technology is one of the main mechanisms involved in smart IVR, a system that automates routine inbound communications as well as large-scale outbound call campaigns.

Challenges & Issues in ASR
• Imprecision and false interpretations
• Time and lack of efficiency
• Accents and local differences
• Background noise and loud environments
• Privacy and data security

Aravali College of Engineering And Management Jasana, Tigoan Road, Neharpar, Faridabad, Delhi NCR Toll Free Number : 91- 8527538785 Website : www.acem.edu.in
