Digitalized Dialect Studies: North-Western Romanian

Sheila M. Embleton, Dorin Uritescu & Eric S. Wheeler • York University, Toronto, Canada

Context

Noul Atlas lingvistic român. Crişana • Crişana region in north-west Romania • Hard-copy atlas by Stan and Uritescu (1996, 2003) • Digitized to make it more accessible

RODA: Romanian Online Dialect Atlas • Digitize and present the hard-copy atlas • Work done mostly by graduate students in Canada and Romania • Data entered from the maps into text files • When complete, it will be posted on the Internet for general use

Objective • Use Information Technology to permit a broad range of scholars to • access the data, • select the data appropriately, and • present the data clearly; and so gain greater understanding of its significance.

Other Digital Atlases

Other Digital Atlases • Salzburg • H. Goebl • phonetic dialect atlas of Dolomitic Ladinian (since 1985) • Edgar Haimerl • ‘Visual DialectoMetry’ (VDM) (ca. 2000) • Netherlands • Heeringa et al.; de Vriend et al. • Dialectometric and cartographic software

Other Dialect Atlases • Japan • D. Long, others (http://nihongo.human.metro-u.ac.jp/~long/maps/perceptmaps.htm) • Japanese area maps

Related endeavours • Google Earth • Available mapping software • Images world-wide • Dialect studies with databases • e.g. Iran: National survey for 2009 • Visualization software • e.g. T. Pi, Atlas of Dialect Topography • http://dialect.topography.chass.utoronto.ca/dt_atlas.php

Overall challenges: • Digitize data • Accessible interface to data • Search • Analyze • Presentation of data • As data • As maps

RODA as linguistic technology

The technology allows one to: • View the data • Search for data and count it • Interpret the data or the counts • Analyze the data (e.g. MDS) • See the results as maps • Save the maps as .jpg pictures • Save the results for later use • Hear samples of the data

RODA: function • Custom-defined maps • You select the data • You see the result as a map • Programmable access to the whole set of digitized data • You ask about data spread over many maps • You can customize what you search for (not just the editor’s choice)

RODA: selection of data • Context of search becomes important • Word-final vs non-final vs either • Plain character vs accented character • Character vs (superposed) alternate • Choice of fields to search • E.g. With nouns: sg. vs pl. entries • Variations heard by field workers • Flags to mark special situations (e.g. hesitation)
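The search contexts listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not RODA's actual code: the function names `strip_accents` and `matches` are hypothetical, and the sample forms in the usage note are plain strings chosen for the example, not atlas data.

```python
import unicodedata


def strip_accents(text):
    """Drop combining diacritics so an accented character matches its plain form."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def occurrences(form, target):
    """Yield every start index at which `target` occurs in `form`."""
    start = form.find(target)
    while start != -1:
        yield start
        start = form.find(target, start + 1)


def matches(form, target, position="either", ignore_accents=False):
    """Return True if `target` occurs in `form` in the requested position.

    position is "final", "non-final", or "either" (hypothetical option names).
    """
    if ignore_accents:
        form, target = strip_accents(form), strip_accents(target)
    ends = [i + len(target) for i in occurrences(form, target)]
    if position == "final":
        return any(end == len(form) for end in ends)
    if position == "non-final":
        return any(end < len(form) for end in ends)
    return bool(ends)
```

For example, `matches("lupu", "u", "final")` and `matches("lupu", "u", "non-final")` are both true, while `matches("casă", "a", "final")` is true only when `ignore_accents=True`.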

Examples from RODA

Crişana, Romania

Crişana, Romania (from RODA)

Seeing Words Change: Word-final /u/ in Latin and non-Latin words

Word-final /u/ from Latin

Is word-final /u/ random? • Look for a geographic pattern over all potential occurrences • The maps for single examples, such as /ochi/, are in the hard-copy dialect atlas • But the total data for all examples is spread widely over many maps

Word-final /u/ • Data from: • 407 maps • Field 1 • Size of cross shows the number of occurrences • Horizontal = syllabic • Vertical = non-syllabic

Word-final, syllabic /u/ • Data from: • 407 maps • Field 1 • word-final only • (horizontal = vertical) • Locations 137, 141, 146 show most examples

Word-final, syllabic /u/ • Can review the data

Word-final, syllabic /u/ • Data from: • selected maps • Field 1 • word-final only • removed non-vocalic /u/, the definite article, and some clusters + /u/ • (horizontal = vertical) • Locations 137, 141, 146 show most examples

/u/ Pattern • There is a pattern: • Word-final /u/ is retained in central and north-eastern areas • It is syllabic mostly in parts of the central area • The locations with most frequent syllabic final /u/ do not form a continuous area

Raised word-final /e/

Raised, word-final /e/ • Data from: • 407 maps • Field 1 • Horizontal = vertical • Raised /e/ is widespread

Raised, word-final /e/ vs schwa • Data from: • 407 maps • Field 1 • Raised /e/ (horizontal) • Raised schwa (vertical) • Raised schwa is also widespread but does not always coincide with raised /e/ • (cf. 158, 159)

High /e/ and schwa

Retained /u/ versus Raised /e/ • Syllabic word-final /u/ (horizontal) • Raised word-final /e/ (vertical) • Zoom-in view of central area • 137, 141, 146 have both

Retained /u/ versus Raised schwa • Syllabic word-final /u/ (horizontal) • Raised word-final schwa (vertical) • Zoom-in view of central area • 137, 146 (not 141) have both

Conclusion • The raising of final mid vowels and the weakening of final high vowels are distinct natural lenition processes.

Non-palatalized dentals before front vowels

Non-palatalized dentals before front vowels • Crişana: dentals before front vowels are palatalized. • Are they restructured as palatals? • If the process is no longer productive, there may be non-palatalized dentals before front vowels. • If so, where, in what forms and what is the frequency?

Non-palatalized dentals before front vowels • Examples everywhere. • (As is well-known, dentals are not palatalized in Oaş, except for 220.) • Map shows where and how many examples.

/st/ before front vowels

/t/ but not /st/ before /e/ and /i/ • 407 maps, field 1 • /te/ (horizontal) • /ti/ (vertical) • values all scaled x 3 to make more visible
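A search of this kind can be sketched with a regular expression. The sketch below assumes, purely for illustration, that each record is a `(map number, location number, form)` tuple and that a non-palatalized dental is written as a plain `t`; RODA's real transcription and file format are not described here, and the names `T_NOT_ST`, `count_t_before_front`, and `scaled` are hypothetical.

```python
import re
from collections import Counter

# "t" followed by "e" or "i", but not preceded by "s" (so /st/ is excluded).
T_NOT_ST = re.compile(r"(?<!s)t([ei])")


def count_t_before_front(records):
    """Count /te/ and /ti/ hits per location, skipping /ste/ and /sti/."""
    counts = {"e": Counter(), "i": Counter()}
    for _map_no, location, form in records:
        for hit in T_NOT_ST.finditer(form):
            counts[hit.group(1)][location] += 1
    return counts


def scaled(counter, factor=3):
    """Multiply every count by `factor`, as when enlarging map symbols."""
    return {location: n * factor for location, n in counter.items()}


# Invented records for illustration only; they are not taken from the atlas.
records = [(1, 137, "frate"), (2, 137, "este"), (3, 141, "tine"), (4, 141, "stea")]
counts = count_t_before_front(records)
```

With these invented records, only "frate" (location 137) contributes to the /te/ count and only "tine" (location 141) to the /ti/ count; "este" and "stea" are skipped because the `t` follows `s`.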

/t/ but not /st/ before /e/ and /i/ • Shown as an interpretive map • 407 maps, field 1 • /te/ (red) • /ti/ (black) • Map is automatically drawn from the previous searches

/t/ before /e/ or /i/ • See the examples that were found and counted. • See the source map number and location number of each. • Can delete “exceptions” from the count.

Non-palatalized dentals before front vowels • There are examples everywhere (not only in Oaş) • Here we establish a result with the location and frequency of examples. • Can view the examples that support the conclusion.

/e, i/ after /ts, z, s/ • With digital data and tools, we easily discover significant patterns • Here, we see the conservation of front vowels after velarizing consonants. • We see • frequency and areas • phonological context

/e, i/ after /ts, z, s/

MDS

MDS process • Multidimensional Scaling (MDS) uses the “linguistic distance” between N+1 locations to place them in an N-dimensional space. • Then, the N-space is projected onto a 2-space (a map) such that the distances among the points are preserved as well as possible.
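The slides do not say which MDS variant was used. As one possible illustration, the sketch below implements classical (Torgerson) MDS with NumPy: it double-centres the squared distances and keeps the two leading eigenvectors as map coordinates. The function names `classical_mds` and `pairwise` are hypothetical, and the choice of the classical variant is an assumption.

```python
import numpy as np


def classical_mds(dist, dims=2):
    """Embed points in `dims` dimensions from a symmetric distance matrix.

    Classical (Torgerson) MDS; an assumed variant, not necessarily the one
    used for the maps in this talk.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dims]   # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def pairwise(coords):
    """Euclidean distances between all pairs of rows in `coords`."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```

When the input distances come from points that really lie in a plane, the two-dimensional result reproduces them exactly (up to rotation and reflection); for linguistic distances the projection is only approximate.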

MDS and dialects • Embleton and Wheeler have used an MDS process on • English dialects • Finnish dialects • Dialect roughly correlates with geography

Dialect groupings • Began with a hypothesis about dialect groupings in Crişana • Analyzed all data in 407 maps using the MDS method • Two responses count as identical only if they match exactly; any difference counts as 1. • The distance between two locations is the sum of these differences. • We see the groupings on a map.
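The distance measure described above (exact match or a difference of 1, summed over maps) can be written out as a short sketch. It assumes, for illustration only, that each location has exactly one response per map, stored in the same map order; the names `linguistic_distance` and `distance_matrix` and the sample responses are hypothetical.

```python
from itertools import combinations


def linguistic_distance(forms_a, forms_b):
    """Sum of differences: 0 for an exact match on a map, 1 for anything else."""
    if len(forms_a) != len(forms_b):
        raise ValueError("both locations need one response per map")
    return sum(1 for a, b in zip(forms_a, forms_b) if a != b)


def distance_matrix(data):
    """Build a symmetric matrix from a {location: [response per map]} dict."""
    locations = sorted(data)
    index = {loc: i for i, loc in enumerate(locations)}
    matrix = [[0] * len(locations) for _ in locations]
    for a, b in combinations(locations, 2):
        d = linguistic_distance(data[a], data[b])
        matrix[index[a]][index[b]] = d
        matrix[index[b]][index[a]] = d
    return locations, matrix


# Invented responses for three locations on three maps (not atlas data).
sample = {137: ["a", "b", "c"], 141: ["a", "x", "c"], 146: ["z", "x", "c"]}
locations, matrix = distance_matrix(sample)
```

In this toy example locations 137 and 146 differ on two maps, so their distance is 2, while each is at distance 1 from location 141. A matrix of this kind is the input that an MDS step would take.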

MDS map: All groups • South-east and South-west are distinct. • The rest are less so. • Suggests the dialect unity of the region • --> refine groupings

MDS map: Refined groupings • Still, considerable overlap or closeness • More groups could be identified, e.g.: • Several divisions in West • Two areas in Oaş • Oaş is close to southern areas • Still, its distinctness is clear (cf. also Uritescu 1984a).
