Localization Technology for Business Decision Makers • David Filip, Moravia Worldwide • Angelika Zerfaß, zaac • Localization World Berlin, 2010
The myths • [hand out glossary] • Technology will solve all our problems • Translation Tools/Technologies • TM (Translation Memory): never translate (and pay for) a sentence twice • TMX (Translation Memory Exchange): moving data from one TM tool to another without problems • Machine Translation (MT): can be used wherever you need it
The myths • Terminology: it's just words, right? • Content Management (CMS): send out only new or changed text to translation • More • Cloud and Crowd • ROI • TCO
Technology will solve all our problems • Buying a car does not teach you to drive, read a map (or use the navigation system), or know where to add petrol, oil and water. Likewise, buying a translation tool does not by itself give you a good, fast or cheap translation • As with a car, localization technology requires continuous investment of effort and money after you have purchased it
Technology will solve all our problems • Technology helps us to work faster and more efficiently • But the technology itself does not solve the problem, so investing in a tool is only one step • You need to • Adapt your text creation processes to the translation tool • Train your authors, translators, project managers, reviewers • Document file preparation and post-translation processes • Connect the tool with authoring systems, CMS, project management systems, in-house/home-grown systems…
Technology will solve all our problems • Document your existing processes before buying a tool and investing in a technology • Test where the software/technology can help improve your processes • Test where you need to adapt your processes to be able to use the tools efficiently • Price comparison comes at the end, when you know what you want, what each tool can offer and how each tool would meet your requirements • Your technology simply needs to be aligned with your core business needs
TM – never translate (and pay for) the same sentence twice • Translation Memory Systems help achieve better consistency and faster turnaround, and can save money • Whenever a sentence that has already been translated has to be translated again, the TM system SUGGESTS the existing translation to the translator (a so-called 100% match) • The translator still has to check • That the translation contains no mistakes (TMs save what you give them, they don't check for spelling mistakes or the like) • That the translation makes sense in this context • That the translation (length) fits the available space • That the terminology is used correctly
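The lookup behaviour described on this slide can be sketched roughly as follows. This is a minimal illustration, not how any particular TM product works: the in-memory dictionary, the `suggest` function, the sample sentence pairs and the 75% threshold are all assumptions, and the similarity score comes from Python's standard `difflib` rather than from a real fuzzy-matching engine.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory TM: source segment -> stored translation.
# Whatever was stored is returned as-is, mistakes included.
translation_memory = {
    "Click OK to continue.": "Klicken Sie auf OK, um fortzufahren.",
    "The file could not be saved.": "Die Datei konnte nicht gespeichert werden.",
}


def suggest(segment, tm, threshold=0.75):
    """Return (match_percent, stored_source, stored_target), or None.

    The result is only a SUGGESTION: the translator still has to check
    it for mistakes, context, length and terminology.
    """
    best = None
    for source, target in tm.items():
        # Character-based similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, segment, source).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is None or best[0] < threshold:
        return None
    return (round(best[0] * 100), best[1], best[2])


hit = suggest("Click OK to continue.", translation_memory)
miss = suggest("Zzz", translation_memory)
```

An identical sentence scores 100 and comes back with the stored translation, while an unrelated string falls below the threshold and yields no suggestion. Nothing in the lookup validates the stored translation itself, which is why the checks listed above remain the translator's job.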
Examples • "The sentence needs to be reviewed." Sentence = penalty or phrase? • "If there is not enough volume, go to step 4." Volume = sound or mass/size? • "The client does not have the necessary access rights." Client = client computer or customer?
Context Information • Saving context in the TM helps to improve matches • Context information • Structural information (segment is a heading, content of a table cell…) • Surrounding information (previous and following segments) • IDs of strings from software files • Translator still has to check for terminology, typos, appropriateness in context
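One way to picture the "surrounding information" variant is a TM keyed by the previous segment, the segment itself and the following segment. The sketch below makes that assumption. The function names, the sample segments and the labels "context match" and "100% match" are illustrative only, and real tools store context in their own ways.

```python
def context_key(segments, i):
    """Key a segment by its neighbours: (previous, current, next)."""
    prev_seg = segments[i - 1] if i > 0 else None
    next_seg = segments[i + 1] if i + 1 < len(segments) else None
    return (prev_seg, segments[i], next_seg)


def build_tm(source_segments, target_segments):
    """Store each translation together with its source context."""
    return {
        context_key(source_segments, i): target_segments[i]
        for i in range(len(source_segments))
    }


def classify(segments, i, tm):
    """Distinguish a match in the same context from a plain text match."""
    if context_key(segments, i) in tm:
        return "context match"
    if any(key[1] == segments[i] for key in tm):
        return "100% match"
    return "no match"


# Hypothetical legacy document: "Volume" meant loudness here.
old_source = ["Volume", "Turn the knob to the right."]
old_target = ["Lautstärke", "Drehen Sie den Knopf nach rechts."]
tm = build_tm(old_source, old_target)

# New document: same heading text, different surroundings.
new_source = ["Volume", "Enter the size in litres."]
status_heading = classify(new_source, 0, tm)
status_body = classify(new_source, 1, tm)
```

In the new document the heading "Volume" is only a plain 100% match, because its neighbours differ from the legacy document. That downgrade is the signal that the stored translation may not fit, as in the sound versus size example.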
Advanced Leverage • Re-use of parts of a translation memory that are smaller than whole segments • Advanced Leverage aka sub-segment matching • Becomes particularly important because segmentation rules are currently not standardized and legacy TM assets are valuable • Different tools create different segments because they use different rules to recognize segments
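As a rough illustration of the source-side half of sub-segment matching, the sketch below looks for the longest run of words that a new segment shares with a stored TM segment. The function name and the three-word minimum are assumptions. A real tool would also have to locate the corresponding part of the stored translation, which this sketch does not attempt.

```python
from difflib import SequenceMatcher


def longest_shared_phrase(new_segment, tm_source, min_words=3):
    """Return the longest word sequence shared by both segments, or None."""
    a = new_segment.split()
    b = tm_source.split()
    # Compare word lists rather than characters, so matches are whole words.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    if match.size < min_words:
        return None
    return " ".join(a[match.a:match.a + match.size])


# The new tool produced one long segment; the legacy TM holds a shorter one.
phrase = longest_shared_phrase(
    "Please consider: There should be more than one.",
    "There should be more than one.",
)
```

Here the whole legacy segment is found inside the longer new segment, so it could still be offered to the translator even though the two segments do not match as wholes.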
Standard File Formats: Moving Data between Tools • Standards such as TMX, for exchanging the contents of one TM system with another, and XLIFF, a standard format for translating any file format, make it easier to move between different tools • But these standard formats can still differ from vendor to vendor, because the specifications allow different "flavors"; other tools can often read only the text information • Formatting information might get lost in TMX exchange • Customized information might get lost or be only partly preserved in TMX exchange • Specific features of XLIFF might not be supported
TMX • Tool A • Segmentation rule: Colon is not a segment end character • "Please consider: There should be more than one." • Tool A sees one segment • Tool B • Segmentation rule: Colon is a segment end character • "Please consider:" • "There should be more than one." • Tool B sees two segments • After moving TM data from Tool A to Tool B, there will be no match for the 2 separate sentences during translation
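The Tool A / Tool B effect can be reproduced with a toy segmenter. The sketch below assumes a deliberately simple rule, splitting after a segment end character that is followed by whitespace. The `segment` function and the way the TM is modelled as a set of source segments are illustrative assumptions, not the rules of any actual tool.

```python
import re


def segment(text, end_chars):
    """Split text after any of end_chars when followed by whitespace."""
    pattern = r"(?<=[" + re.escape(end_chars) + r"])\s+"
    return [s for s in re.split(pattern, text) if s]


text = "Please consider: There should be more than one."

# Tool A: colon is not a segment end character.
tool_a_segments = segment(text, ".!?")
# Tool B: colon is a segment end character.
tool_b_segments = segment(text, ".!?:")

# A TM built with Tool A contains only Tool A's segments.
tm_from_tool_a = set(tool_a_segments)
exact_matches = [s for s in tool_b_segments if s in tm_from_tool_a]
```

Tool A produces one segment and Tool B produces two, so none of Tool B's segments is found in the TM that Tool A created, even though every word has been translated before.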
TMX • Using TMX to move translation memory data from one tool to the next is possible • Allows the use of different translation tools within one project • Will result in a 5 to 20% loss of match rates, depending on the file format that was used to create the original TM and the segmentation rules of each tool
Interoperability • Translation/Localization is not an isolated process, tools need to interface with • Authoring environment, document/content management, project management, web resources • Increasing demand for interoperability on the client side leads to better and smarter implementation of standards such as XML, XLIFF… • Some tools can, for example, adapt segmentation rules to get better leverage from legacy material • Sub-segment matching, context matching and metadata can also help a lot
MT – fast, use it everywhere • Machine translation is being used more and more, but it works best on text that has been written for the machine (controlled language) • Post-editing is always necessary if the translation needs to be of good quality • The MT system needs to be constantly trained and configured to prevent the same mistakes from occurring over and over again • Feeding MT systems with sentence pairs from TMs is a good idea, provided the quality of the TM content is proven and controlled
MT • Microsoft Support Pages
ROI of MT • The sweet spot for MT is just NOT the traditional content • The ROI of MT on traditional content is boring • Even a good TM can be dirty from the MT-training point of view • There is no "vanilla" MT, the purpose determines everything • The sweet spot is customer support and community-generated content
Mismatch case-study • L10N manager wants to shorten turnaround times and save on per-word rate… Localized release comes 10 weeks after source • Is it an MT-ready scenario? • We need to ask more diagnostic questions • 6% new content, structured authoring • The solution is not MT! Process optimization, in-context and sub-segment leverage • But MT is buzzword-compliant
Types of MT • General SMT (Statistical MT) or RBMT (Rule-based MT) • Low level of investment • Gisting, perishable content like e-mail or IM • RBMT with highly specific customized glossaries, domain-specific SMT • Medium level of investment • Productivity post-editing, with gross savings of a few dozen percent at most
Types of MT • Highly specifically customized hybrid systems (usually SMT or tree-based/example-based) • Highest level of investment and high cost of ongoing maintenance • Draft quality – good enough in certain publication contexts – such as support or knowledge base articles, community-generated content, etc.
Terminology – it's just words, right? • Terminology defines the corporate identity • Company specific and subject matter specific • Terminology questions / misunderstandings by translators (and later by clients) are costly • What needs to be done • Source terminology first • Allowed / forbidden terms for authors to check • Target terminologies • Correct usage and forbidden terms can be checked • Source and target language style guides
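The "allowed / forbidden terms" check can be sketched as a simple scan of the text against a term list. The term base below is entirely hypothetical, as are the function name and the choice of whole-phrase, case-insensitive matching. Real terminology checkers also deal with inflected forms, which this sketch ignores.

```python
import re

# Hypothetical term base: forbidden term -> approved term.
forbidden_terms = {
    "hard disc": "hard disk",
    "log on": "sign in",
}


def find_forbidden(text, terms):
    """Return sorted (position, found_text, approved_term) tuples."""
    findings = []
    for bad, good in terms.items():
        # Match the forbidden term as a whole phrase, ignoring case.
        pattern = r"\b" + re.escape(bad) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            findings.append((m.start(), m.group(0), good))
    return sorted(findings)


findings = find_forbidden("Log on and check the hard disc.", forbidden_terms)
```

The same function can be run on source text for authors and on target text for translators, each with its own term list.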
CMS – Send out only new or changed text to translation • Content Management Systems store text modules that can be re-used during authoring • Changed or new modules can be extracted for translation • Single modules (often only single sentences) are harder to translate than full chapters, because of context • Single modules, sent to translation, might later not conform to the style of the existing modules, if no comparison is possible by the translator • Structuring your data into modules is a significant effort and requires some rethinking by the authors as well
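Extracting only new or changed modules can be sketched by comparing a fingerprint of each module's current text with the fingerprint recorded when it was last translated. The module IDs, the sample texts and the use of SHA-256 are assumptions for illustration, since a real CMS tracks versions in its own way.

```python
import hashlib


def fingerprint(text):
    """Stable fingerprint of a module's source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def modules_to_translate(modules, translated_fingerprints):
    """Return the modules that are new or changed since the last translation.

    modules: module id -> current source text
    translated_fingerprints: module id -> fingerprint at last translation
    """
    return {
        module_id: text
        for module_id, text in modules.items()
        if translated_fingerprints.get(module_id) != fingerprint(text)
    }


# Hypothetical state recorded after the previous translation round.
last_round = {
    "intro": fingerprint("Welcome to the product."),
    "safety": fingerprint("Unplug the device before cleaning."),
}

current_modules = {
    "intro": "Welcome to the product.",                           # unchanged
    "safety": "Unplug the device before cleaning or servicing.",  # changed
    "warranty": "The warranty period is two years.",              # new
}

pending = modules_to_translate(current_modules, last_round)
```

Only the changed and the new module are selected. The unchanged module is left out, and it is exactly that missing neighbouring text that makes isolated modules harder to translate in context.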
Structured authoring and translation • Is a great idea • However, you need to combine it with other ideas to succeed • Be sure to bring the context to the translator • Previews, embedded or referenced screenshots • Guaranteed matches from previous translations should be included to provide context, even though locked or otherwise excluded from translation…
Business case • Technology decisions are not much different from other business decisions • Executive sponsors should ask their techies to explain how technology of a particular kind will improve their competitiveness in core business areas • Avoid chasing buzzword compliance, ask for ROI
Cloud and Crowd • Extremely fashionable buzzwords equal extreme danger of overkill and mismatch • Putting a piece of old-fashioned software into a data center somewhere on the West Coast hardly solves your issues, even if it is considered cloud and hence cool • Platforms that really scale and can use the flexibility of the cloud are not yet mature • Architecture issue with systems that need centrally valid data
Cloud and Crowd • Other types of possible mismatch • You might not need the flexibility of expenditure • Your current solution can easily scale for your needs mid-term • You might need tight database-level integrations with systems running in-house • You might not have the community that is needed for a successful crowdsourcing project
TCO • Factors • Longevity? Critical application? Complex existing infrastructure? • Interoperability, Customizability • APIs, SDKs, open standards, open source, licensing of non-production instances, escrow • Avoid chasing buzzword compliance, ask for TCO • You do not need to be bleeding edge outside of your core business area
Take-aways • Business Case • Require human-readable explanations • Avoid overkill • Make sure that you are using the full potential of your solutions • Do not buy cool features that you won't be able to exploit during the lifetime of the tool • Avoid mismatch