
Center for Interdisciplinary Research on Language and Speech


Presentation Transcript


  1. Centro di Ricerca Interdisciplinare sul Linguaggio CRIL Center for Interdisciplinary Research on Language and Speech Mirko Grimaldi – Barbara Gili Fivela University of Lecce (Italy – Southern Puglia – SALENTO)

  2. Principal Characteristics • Co-presence of diversified and highly sophisticated instruments; • The instrument typology identifies two connected research areas: • The study of the acoustic/auditory – articulatory/aerodynamic nature of speech and its relation with the discrete – mental – properties of sounds; • The study, with modern methods, of how language is organized in the human brain, especially with respect to the ancient problem of the biological foundations of language.

  3. Consequences • Vocation to an interdisciplinary approach for: • Integrating phonetic and phonological knowledge in a coherent model of speech perception and production; • Discovering the neural primitives that play a role in speech perception and production (and probably in the encoding of linguistic representations in long-term memory); • Trying to formulate a neural theory of language, exploring the boundary spaces where current neuroscientific and linguistic knowledge conflict.

  4. Instruments • Speech: soundproof room; software and hardware to analyze spectrograms, sonograms, formants and many other speech characteristics; laryngograph; airflow measurements; ultrasound imaging of tongue motion; 3D articulograph system; electropalatograph. • Neuroscience: repetitive Transcranial Magnetic Stimulation (rTMS); electroencephalography with 64-channel Event-Related Potentials (ERPs); eye tracker; SofTaxic brain navigator software; stimulus presentation software.

  5. Acoustics • CRIL has an 11 m² soundproof room with in/out connections for audio, video, PC and internet.

  6. Principal acoustic tools • Software-hardware tools: • Computerized Speech Lab (CSL) 4500: CSL, produced by Kay Elemetrics (New Jersey), is an advanced and flexible speech analysis system. • Unlike systems built around generic, plug-in, multimedia sound cards, CSL, with its fully integrated hardware and software, is well tailored for sound input and measurement in the most exacting speech processing applications.

  7. Acoustics • CSL offers input signal-to-noise performance typically 20-30 dB superior to generic, plug-in sound cards (which are designed primarily for sound output). • Typical uses: speech analysis for research, voice measurements, clinical feedback, acoustic phonetics, second-language articulation, forensic work and teaching.

  8. Acoustics • Our CSL software and database options include: Analysis-Synthesis Laboratory (ASL); Applied Speech Science for Voice & Resonance Disorders; Auditory Feedback Tools; Disordered Voice Database; Games; Motor Speech Profile; Multi-Dimensional Voice Program; Palatometer Database; Phonetic & Perception Simulation Programs; Respiration, Phonation and Prosody Simulation; Signal Enhancement Program; Sona-Match; Synthesis (ASL); Video Phonetics Program and Database; Voice Range Profile.

  9. Laryngograph • It measures vocal fold contact: from its signal a waveform can be derived representing the variations in conductance across the larynx. • It extracts this information non-invasively by running a small electric current through the larynx between two electrodes placed on the neck. • The waveforms then provide the basis for physical interpretation of normal and pathological voice conditions. • Some of the features that can be investigated include: vibration regularity, closure and opening definition, open/closed phase ratio, and closure/opening sequence shape.
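The pitch analysis that the next slides build on can be illustrated with a minimal sketch, assuming the Lx signal is available as a NumPy array and that a glottal cycle begins at each upward threshold crossing; the function name and threshold are placeholders, not the Laryngograph software's own algorithm.

```python
import numpy as np

def fx_from_lx(lx, fs, threshold=0.0):
    """Estimate per-cycle fundamental frequency (Fx) from a laryngograph (Lx) waveform.

    lx: 1-D array of Lx samples; fs: sampling rate in Hz.
    A new glottal cycle is assumed to start at each positive-going crossing
    of `threshold` (a simplification of the real Laryngograph processing).
    """
    lx = np.asarray(lx, dtype=float)
    above = lx > threshold
    crossings = np.flatnonzero(~above[:-1] & above[1:]) + 1  # upward crossings
    periods = np.diff(crossings) / fs                        # cycle durations Tx in seconds
    return 1.0 / periods                                     # Fx in Hz, one value per cycle
```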

  10. Laryngograph hardware and electrodes: small electrodes (front view) and large electrodes (back view).

  11. Laryngograph software • Displays the important measurable aspects of the voice in a typical main interface which can be tailored to individual needs. • The most important analyses are: pattern, sonogram, vocal fold vibration, waveform, and pitch (Fx).

  12. Laryngograph: voice quality, “closed” phase and pitch • For estimation of the closed phase of each glottal cycle we use the laryngograph waveform (Lx). The black bars in Fig. 1 mark the closed phase at a level 70% down from the peak of the waveform, and its duration is simply taken as a ratio with the total period, Tx, to give this parameter. The symbol Qx denotes this particular closed-phase estimate. Qx can be measured for each individual period, and it can also be linked to pitch regularity, which occurs when two successive periods have the same value (as in Fig. 2). Figures 1 and 2: Lx waveforms with the closed-phase bars marked.
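Read literally, this Qx estimate amounts to thresholding each cycle at a level 70% down from its peak and taking the time spent above that level as a fraction of Tx. A minimal sketch of that reading (NumPy assumed; the cycle boundaries are placeholders, e.g. from the crossing detector sketched above, and the real Laryngograph implementation may differ):

```python
import numpy as np

def qx_per_cycle(lx, cycle_starts):
    """Closed-phase quotient Qx for each glottal cycle of an Lx waveform.

    cycle_starts: sample indices marking the start of each cycle.
    The closed phase is counted where Lx stays above a level 70% down
    from the cycle peak, then divided by the cycle period Tx.
    """
    lx = np.asarray(lx, dtype=float)
    qx = []
    for start, end in zip(cycle_starts[:-1], cycle_starts[1:]):
        cycle = lx[start:end]
        peak, trough = cycle.max(), cycle.min()
        level = peak - 0.7 * (peak - trough)       # 70% down from the peak
        closed = np.count_nonzero(cycle >= level)  # samples in the closed phase
        qx.append(closed / len(cycle))             # ratio to the total period Tx
    return np.array(qx)
```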

  13. Airflow measurements • SCICON system for measuring • oral airflow • nasal airflow • audio signal • Main components • two masks and transducers • up to 16 (DC) channels • 1 audio channel • multiple sample rates

  14. Ultrasound • Why ultrasound? • Ultrasound Machines (UM) provide researchers with a portable, user- and subject-friendly means to do speech research. They can be used in a laboratory as well as in a field setting (SonoSite, for instance, produces portable UMs for imaging work outside of a hospital or lab: fieldwork, schools, clinics, etc.). • The field of ultrasound research is still very young because, in the past, as a clinical instrument, it needed additional modifications to make reliable research measurements. • It was also difficult to access such instruments for speech research because they were primarily found in hospitals and heavily used clinically. Reduced cost, improved reliability and increased interest in its unique data have made ultrasound a good tool for imaging tongue motion.

  15. Using ultrasound • Laboratories around the world using ultrasound for speech research include: • University of British Columbia Interdisciplinary Speech Research Laboratory; • Haskins Laboratories (Connecticut) / Yale University; • Queen Margaret University College Speech and Language Science Department (Edinburgh); • University of Maryland Dental School Vocal Tract Visualization Lab; • University of Toronto Voice and Resonance Laboratory and Graduate Department of Speech-Language Pathology; • University of Arizona Department of Linguistics; • University of South Florida Department of Communication Sciences and Disorders; • New York University Department of Linguistics.

  16. Anatomy • The tongue is important to all oropharyngeal behaviours. In speech, the tongue is the major contributor to the vocal tract shapes that are our speech sounds. In chewing, the tongue positions the bolus between the molars for grinding food, while protecting the airway from spillage. In swallowing, the tongue propels the bolus backward into the pharynx. In breathing, the tongue maintains tonic muscle contraction to prevent collapse into the airway.

  17. Ultrasound principle • Ultrasound is an ultra-high-frequency sound wave emanating from a piezoelectric crystal that produces an image by using the reflective properties of sound waves. • When using ultrasound to measure an object, the transducer is placed at one edge of the object and the sound passes through it until it reaches the impedance mismatch at the opposite edge or surface, which causes a reflection. The impedance mismatch is due to the change in density between the object and its surroundings.
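The size of the echo at an impedance mismatch can be made concrete: for normal incidence, the reflected fraction of the intensity is ((Z2 − Z1)/(Z2 + Z1))². A minimal sketch, using impedance values that are only textbook orders of magnitude for soft tissue and air:

```python
def reflection_coefficient(z1, z2):
    """Fraction of incident ultrasound intensity reflected at the boundary
    between media with acoustic impedances z1 and z2 (normal incidence)."""
    return ((z2 - z1) / (z2 + z1)) ** 2

# Illustrative values only (order of magnitude, in rayl):
Z_TISSUE = 1.6e6   # soft tissue
Z_AIR = 4.0e2      # air
print(reflection_coefficient(Z_TISSUE, Z_AIR))  # ~0.999: the tongue-air surface reflects almost everything
```

This near-total reflection at the tongue-air boundary is why the upper tongue surface shows up so brightly in the scans described on the next slides.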

  18. Transducer As with any sound wave, the reflected sound returns straight to the source if the reflecting surface is perpendicular to the ultrasound beam. If the surface is at an angle, the sound is reflected away at an angle and may not be received by the transducer. In the case of the tongue, the transducer is placed below the chin and the sound travels upward to be reflected back by the upper surface. The upper surface of the tongue is typically bounded by air or by the palate bone. Figure 3: Focused transducer beam (from Hedrick et al. 1995)

  19. Ultrasound • Gives a good image of the tongue surface from about the hyoid bone to near the tip. • Setup time is minimal. • UMs typically collect up to 30 scans per second (some machines have a faster scan rate of 80-90 Hz). • At 30 Hz, each scan is about 33 ms in duration. • Requires a holding system to co-register with the head.

  20. Sagittal ultrasound images (from Shawker et al. 1984). Fig. 4: transducer placement. Fig. 5 labels – S: tongue surface; M: mucosa; GG: genioglossus muscle; FIS: floor intermuscular septum; GH: geniohyoid muscle; MH: mylohyoid muscle. Fig. 6: ultrasound scan.

  21. Palate and velum • When making a velar sound, the tongue-velum contact allows the beam to pass into the velum and reflect back from the air above it, imaging the floor of the nasal cavity. Fig. 7: Visible palate and velum in an ultrasound image. From Stone (2004)

  22. Coronal scan (from Shawker et al. 1984). Fig. 8: transducer placement. Fig. 10: ultrasound scan. Labels – S: tongue surface; M: mucosa; GG: genioglossus muscle; FIS: floor intermuscular septum; GH: geniohyoid muscle; MH: mylohyoid muscle; MFS: median fibrous septum; LM: lateral muscles; J: jaw inner aspect; PS: paramedian septum; CF: cervical fascia.

  23. Rigid Transducer Placement • The goal of transducer positioning is to maintain intimate contact with the chin and accurate beam direction, while ensuring no tissue depression. There are two methods for positioning the transducer under the chin: immobile (rigid) and mobile. • When the transducer is rigid and the head is steady relative to it, tongue measurements are actually tongue-plus-jaw measurements, and have the palate (i.e., skull) as the reference. This is because the image reference point, the transducer, is immobile relative to the head, so measurements made relative to the transducer are also relative to the head (or palate). • In this case, tongue and jaw motion are not separated, although jaw motion can be subtracted from the tongue measurements by using an additional instrument to measure jaw position, such as video, Optotrak or Electromagnetic Midsagittal Articulography (EMA) (Stone and Davis, 1995).
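As a sketch of that subtraction (hypothetical names; it assumes midsagittal tongue points and a jaw marker already expressed in the same head-based coordinates, and treats jaw motion as pure translation, which real jaw motion is not):

```python
import numpy as np

def tongue_relative_to_jaw(tongue_xy, jaw_xy, jaw_ref_xy):
    """Remove jaw motion from tongue measurements, frame by frame.

    tongue_xy:  (n_frames, n_points, 2) tongue contour points in head coordinates.
    jaw_xy:     (n_frames, 2) jaw marker position in the same coordinates
                (from video, Optotrak or EMA, as the slide notes).
    jaw_ref_xy: (2,) jaw marker position in a reference (e.g. rest) frame.
    Returns tongue points with the jaw's translation subtracted.
    """
    jaw_displacement = jaw_xy - jaw_ref_xy              # (n_frames, 2)
    return tongue_xy - jaw_displacement[:, None, :]     # broadcast over contour points
```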

  24. Some Rigid Transducer Placement From Stone (2004): Automated Head and Transducer Support System (AHATS), built at Maureen Stone's Vocal Tract Visualization Lab in Baltimore in order to make 3D reconstructions of the tongue surface from ultrasound images.

  25. Some Rigid Transducer Placement Current lab setup used at the University of British Columbia by Bryan Gick and his group. From Gick & Rahemtulla (2004)

  26. Some Rigid Transducer Placement From Wrench (2004): helmet with a headband, two clamps at the cheeks and one at the nape of the neck, used by Alan Wrench and James Scobbie of the Speech and Language Science Department, Queen Margaret University College.

  27. Some Rigid Transducer Placement HOCUS: The Haskins Optically Corrected Ultrasound System From Whalen et al. 2004

  28. Haskins system • Combines ultrasound images of the tongue and hard palate with optical measurements of head position, lips and jaw. • Optotrak IREDs on the head and on the ultrasound transducer allow rigid-body reconstruction, so that tongue position relative to the head can be recovered.
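That head correction reduces, per frame, to applying a rigid transform that maps probe coordinates into head coordinates. A minimal sketch, assuming the rotation and translation for each frame have already been recovered from the Optotrak rigid bodies (the actual HOCUS processing chain is more elaborate):

```python
import numpy as np

def to_head_frame(points_probe, rotation, translation):
    """Map tongue points from transducer (probe) coordinates into head coordinates.

    points_probe: (n_points, 3) points in the probe's coordinate system.
    rotation:     (3, 3) rotation matrix for this frame.
    translation:  (3,) translation vector for this frame.
    Both are assumed to come from the rigid-body solution for the probe
    and head markers.
    """
    return points_probe @ rotation.T + translation
```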

  29. Ultrasound data analysis and statistical representation • Tongue surface contour extracted from each frame in a sequence; • Change in tongue shape during speech; • x, y, t surfaces overlaid and ready for statistical comparison. From Stone (2004)
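Once contours are extracted frame by frame, many such comparisons reduce to distances between curves. A minimal sketch (NumPy assumed; this nearest-neighbour measure is just one simple option, not the analysis used in Stone 2004):

```python
import numpy as np

def mean_nearest_distance(contour_a, contour_b):
    """Mean distance from each point of contour_a to its nearest point on contour_b.

    contour_a, contour_b: (n, 2) arrays of (x, y) contour points.
    A symmetric version would average the measure in both directions.
    """
    diffs = contour_a[:, None, :] - contour_b[None, :, :]   # (na, nb, 2) pairwise differences
    dists = np.linalg.norm(diffs, axis=-1)                  # (na, nb) pairwise distances
    return dists.min(axis=1).mean()
```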

  30. 3D Static tongue surfaces • Using an ordinary (2D) ultrasound transducer to recreate a 3D surface is, at present, more convenient for quantitative analysis than using 3D machines. A sparse data set of 5-6 coronal slices is adequate to accurately reconstruct 3D surfaces. • Stone and Lundberg (1996) and Lundberg and Stone (1999) defined 3D tongue surface shapes for static American English sounds using first a 60-slice and then sparser ultrasound data sets. • The goal was to determine the optimal number and location of slices needed to reconstruct the tongue surface while minimizing error and maximizing coverage.
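Conceptually, the reconstruction fits a surface z = f(x, y) through the contour points of a handful of coronal slices. A minimal sketch using SciPy interpolation (the axis conventions and method are placeholders, not the procedure of Stone and Lundberg):

```python
import numpy as np
from scipy.interpolate import griddata

def reconstruct_surface(slice_points, grid_x, grid_y):
    """Interpolate a 3-D tongue surface from sparse coronal-slice contours.

    slice_points: (n, 3) array of (x, y, z) points pooled from 5-6 coronal
                  slices, where x is the front-back position of each slice,
                  y is the left-right axis and z is tongue height.
    grid_x, grid_y: 1-D arrays defining the output grid.
    Returns tongue height sampled on the grid (NaN outside the data's hull).
    """
    gx, gy = np.meshgrid(grid_x, grid_y)
    return griddata(slice_points[:, :2], slice_points[:, 2], (gx, gy), method='cubic')
```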

  31. 3D tongue reconstruction: some English vowels. From Stone (2004)

  32. CRIL ultrasound equipment Toshiba Xario 2D-3D-4D system. In collaboration with LIRA-Lab we are considering either a rigid or a corrected transducer placement.

  33. CRIL Ultrasound Some debated retroflex sounds of the Salentinian dialects.

  34. Ultrasound strengths: relatively easy to record raw data (cf. MRI, EMA); potential to record more natural speech; potential to be used as a speech-therapy assessment and feedback tool; ultrasound can show a lot about the tongue; acoustics, EPG and video (lips) can be aligned; non-invasive; a good resource for dialectal speech (high articulatory variation). Ultrasound weaknesses: sampling rate and temporal accuracy are limited; the tongue surface is sometimes difficult to interpret; the palate and pharyngeal wall are missing from the image; the accuracy of the correction for probe movement is a concern; OVERALL – accuracy in time and space needs to be assessed and improved. So…

  35. Articulography – AG500 • Carstens electromagnetic articulography (EMA): AG500 – 3D system • Audio • Kinematics • Sensors can be tracked at all positions and in all orientations • The transmitters produce alternating magnetic fields

  36. Articulography – AG500 - II • Six transmitter coils on the case produce the magnetic fields • Alternating currents are induced in the sensors • Up to 12 sensors fixed on the articulators • The current induced in each sensor has a strength that is a function of the distance between the sensor and each transmitter coil
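The geometry behind this can be sketched as an inverse problem: each sensor's induced amplitude falls off with distance from a transmitter (roughly as 1/d³ for a dipole field), and the position is found by fitting the six measured amplitudes. A minimal sketch (SciPy assumed; it ignores sensor orientation, which the real AG500 also solves for, and uses a placeholder calibration constant):

```python
import numpy as np
from scipy.optimize import least_squares

def locate_sensor(transmitter_xyz, amplitudes, k=1.0):
    """Estimate a sensor position from the signal amplitudes of six transmitters.

    transmitter_xyz: (6, 3) transmitter coil positions.
    amplitudes:      (6,) measured induced-signal amplitudes.
    Model: amplitude ~ k / distance**3 (dipole far-field approximation;
    the real calibration and orientation handling are far more involved).
    """
    def residuals(pos):
        d = np.linalg.norm(transmitter_xyz - pos, axis=1)   # sensor-transmitter distances
        return k / d**3 - amplitudes                        # model minus measurement

    fit = least_squares(residuals, x0=np.zeros(3))
    return fit.x
```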

  37. Electropalatography (EPG) • Articulate Instruments EPG system • Records timing and location of tongue contact with the hard palate • Used in the treatment and diagnosis of many articulation disorders • Main components • an artificial hard palate with 62 silver contacts • a multiplexer unit hung around the speaker's neck • interface units connected to a PC

  38. Electropalatography - II • Tongue-palate contact patterns are displayed live and recorded along with the audio and laryngograph signals • Measures include percentage of contact and duration of contact
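The two contact measures reduce to simple counts over the 62-electrode frames. A minimal sketch (NumPy assumed; the frame rate is a placeholder, not the Articulate Instruments specification):

```python
import numpy as np

def epg_measures(frames, frame_rate=100.0):
    """Percentage and duration of tongue-palate contact from EPG frames.

    frames: (n_frames, 62) boolean array, True where an electrode is contacted.
    frame_rate: frames per second of the EPG system (placeholder value).
    Returns (percent contact per frame, total duration in seconds with any contact).
    """
    frames = np.asarray(frames, dtype=bool)
    percent = 100.0 * frames.sum(axis=1) / frames.shape[1]       # contact % per frame
    duration = np.count_nonzero(frames.any(axis=1)) / frame_rate  # seconds with any contact
    return percent, duration
```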

  39. Summary CRIL – for interdisciplinary research • Acoustics • Frequency at the larynx • Oral and nasal airflow • Ultrasound imaging of the tongue • Articulography for tongue, lips and jaw • Electropalatography for tongue-palate contact • Instruments for neurolinguistic studies

  40. Bibliography
Stone, M. (2004). A Guide to Analysing Tongue Motion from Ultrasound Images. http://speech.umaryland.edu/Publications/Guide_to_Ultrasound.pdf
Hedrick, W.R., Hykes, D.L., & Starchman, D.E. (1995). Ultrasound Physics and Instrumentation, Third Edition. Mosby Inc.: St. Louis, MO.
Shawker, T., Sonies, B.C., & Stone, M. (1984). Soft tissue anatomy of the tongue and floor of the mouth: An ultrasound demonstration. Brain and Language, 21, 335-350.
Stone, M. & Davis, E. (1995). A head and transducer support (HATS) system for use in ultrasound imaging of the tongue during speech. Journal of the Acoustical Society of America, 98, 3107-3112.
Gick, B. & Rahemtulla, S. (2004). http://www.linguistics.ubc.ca/isrl/UltraSoundResearch/ULTRAFESTpresentations.htm
Wrench, A. (2004). http://www.linguistics.ubc.ca/isrl/UltraSoundResearch/ULTRAFESTpresentations.htm
Whalen, D. (2004). http://www.linguistics.ubc.ca/isrl/UltraSoundResearch/ULTRAFESTpresentations.htm
