
Noisy Text Correction – an exercise in futility?

Sreeram Balakrishnan, IBM India Research Lab


Presentation Transcript


  1. Noisy Text Correction – an exercise in futility? Sreeram Balakrishnan IBM India Research Lab

  2. Aggregate versus Instance analysis
     • Applications for noisy text fall into two broad categories:
       • Applications that look at individual text instances, e.g. search, transcription (OCR, speech-to-text)
       • Applications that look at aggregate features of the text, e.g. document classification, aggregate text analytics
     • Aggregate analysis is more robust to noise, since errors can be averaged out.
     • Text correction techniques can help improve the accuracy of aggregate statistics.
     • Applications that require accurate correction of each text instance may be an exercise in futility (at least in the short term).
       • E.g. SMS messages whose manual correction requires knowledge of the whole context of the conversation
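The robustness claim on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two topics, the filler sentence, the 5% per-letter corruption rate, and the function names `corrupt` and `simulate` are all hypothetical and do not come from the presentation.

```python
import random

def corrupt(text, p, rng):
    """Replace each letter with a random lowercase letter with probability p."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    return "".join(
        rng.choice(letters) if ch.isalpha() and rng.random() < p else ch
        for ch in text
    )

def simulate(n_docs=2000, p=0.05, seed=0):
    """Compare instance-level and aggregate-level damage from character noise."""
    rng = random.Random(seed)
    topics = ["battery", "screen"]   # hypothetical complaint topics
    weights = [0.75, 0.25]           # hypothetical true topic mix
    filler = "customer called about a problem with the laptop"
    clean = [f"{filler} {rng.choices(topics, weights)[0]}" for _ in range(n_docs)]
    noisy = [corrupt(doc, p, rng) for doc in clean]

    # Instance view: fraction of documents that survived with no error at all.
    exact = sum(c == n for c, n in zip(clean, noisy)) / n_docs

    # Aggregate view: each topic's share among documents where a topic is found.
    def share(docs):
        counts = {t: sum(t in d for d in docs) for t in topics}
        total = sum(counts.values()) or 1
        return {t: counts[t] / total for t in topics}

    return exact, share(clean), share(noisy)

exact, clean_share, noisy_share = simulate()
print(f"documents without any error: {exact:.2f}")
print("topic share (clean):", clean_share)
print("topic share (noisy):", noisy_share)
```

In this toy setting only a small fraction of documents survive without an error, while the topic mix estimated from the noisy text stays close to the clean one. That holds here because the noise hits both topics about equally, which is an assumption of the sketch and not a property of real data.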

  3. Example – Customer Contact Records
     • IBM PC help centers received over 500,000 calls per year
     • Agents produce summary transcripts for each call

     Date: 19990425, ID: 13163548 PRELA
     04/25/1999 20:46 - Call started by John Velocci (MOB_NORTH).
     Q: wants to know if he's protected againt the CIH virus.
     a: has ibm anti virus installed.
     A: told him to goto the web site for upgrade patches. told him to fax pop.
     s: self st: closed
     04/25/1999 20:51 - Call closed by John Velocci (MOB_NORTH).

     Date: 19990426, ID: 13171316 POWER
     b04/26/1999 18:50 - Call started by Scott MacDonald (MOB_NORTH).
     q: CX'S DOG ATE HER POWER SUPPLY
     a: looked up the pn for ac adapter 02k6496 and transfered her to parts
     04/26/1999 18:55 - Call closed by Scott MacDonald (MOB_NORTH).

     Date: 19990604, ID: 13376646MONIT
     06/04/1999 22:16 - Call started by Barry O'Kelly (IREL_MOB3).
     Q:Tp attached to dock with external monitor.....black border around LCD and monitor
     A:Undocked Tp......booted.....screen full Only gets border when attached to dock and monitor Was reinstalling monitor when cus disconnected
     S:Training
     06/04/1999 23:11 - Call placed in Mobiles call back queue by Barry O'Kelly (IREL_MOB3).
     06/04/1999 18:14 - Call taken by Andrew Atias (TAG4).
     Q: Customer calling back, customer still getting black border on LCD and monitor.
     A: I explained to the customer that this will will happen when using a simultanious display.
     S: SOP
     06/04/1999 19:04 - Call closed by Andrew Atias (TAG4).
     CALL TYPE: Technical Information for Purchased Equipment
     CODENAME: MICH-2
     MACHINE TYPE: 9546
     OMPONENT TYPE: Monitor/Display
     : :

     Date: 19990605, ID: 13376581MONIT
     06/03/1999 10:39 - Call started by Robert Dennis (MOB_NORTH).
     Machine is oow warranty
     Q:machine lcd panel will cut out on cust with the machine sitting normally cust states that presure on the under side of machine at or near the F10 key is what is needed to keep machine running....advised billable repair cust agreed, seeking R3 service
     10:46:02 * MSG FROM EZSRV : EasyServe R3 pickup request received for
     10:46:02 * MSG FROM EZSRV : Machine Type: 2640 Serial# 78GM283
     advised customer that data may be lost when sending in to ez serve.... advised customer to back up all personal data...if possible also machine may be reloaded as part of pd/repair process please have all software, product licenses and COA's available when machine is returned also please write the case number on the outside of the box before sending in to ez serve repair
     06/03/1999 10:49 - Call closed by Robert Dennis (MOB_NORTH).
     06/04/1999 22:52 - Case Number: 13366697 continued by Margaret Butler
     : :

     Date: 19991001, ID: 8629697
     inquiry about TP8:memory upgrade customer would like to upgrade TP8:memory but needs information. gave customer information about memory products.

     Date: 19991051, ID: 8630655
     complaint aboutTPxx customer reportspaint ispeelingaroundpalm rest

     [Figure labels: large number of customer claim logs; diverse contact media]

     • Summary records contain details of why customers are unhappy
     • Aggregate analysis of the key phrases reveals that paint peeling at the palm rest is a common complaint among TPxx users
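One simple way to count key phrases across records like these, even when spaces are missing, is to compare text after removing everything except letters and digits. The sketch below is an illustration, not the method used in the presentation; the helper names `squash` and `phrase_counts` and the third sample record are hypothetical.

```python
import re
from collections import Counter

def squash(text):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def phrase_counts(records, phrases):
    """Count how many records contain each phrase, ignoring spacing and case."""
    counts = Counter()
    for rec in records:
        flat = squash(rec)
        for ph in phrases:
            if squash(ph) in flat:
                counts[ph] += 1
    return counts

records = [
    # The first two records are taken from the slide.
    "inquiry about TP8:memory upgrade customer would like to upgrade "
    "TP8:memory but needs information.",
    "complaint aboutTPxx customer reportspaint ispeelingaroundpalm rest",
    # Hypothetical record, added only for illustration.
    "PAINT PEELING OFF PALMREST",
]

counts = phrase_counts(records, ["palm rest", "paint is peeling", "memory upgrade"])
for phrase, n in counts.most_common():
    print(phrase, n)
```

Ignoring spacing makes the match tolerant of run-together words such as "ispeelingaroundpalm", at the cost of possible false matches that span word boundaries.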

  4. Value from wider sources of data

  5. Some examples of SMS data
     • Please send me about yyy card
     • What is the no. that i may have to dial for kaun banega lakhpati ?
     • Tell me about new plans plese thanks
     • Mera custmbar care kayu band kar diya gaya hai kirpa karke aap mera custmbar chalu karen kayu ki mera ko newplan ki jankare chahi
     • Sir mere custmer care nall gall nahi ho rahi..
     • Please activate my wap over gprs
     • Gup shup pack not activating .unable to connect 656.
     • Pl. confirm the receipt of payment of Rs. 500 paid on 19.05.06 vide receipt 0244213 at Karanagar. Thanks
     • Request har never made by me for ISD. I dont need ISD
     • 24 hrs over what is the reply
     • I am post paid customor & i have quary about my bill but custamer ex are not there. What can i do now
     • 3 din ho gaye aap ko aap ke 24 hour kab pure honge
     • Xxx ki service ko kya ho gaya hai.custmer ko satisfied hi nahi karte.
     • Tell me where i can contact othewise i would take another connection.
     • The service of xxx is extremly bad and some of the senior employee are irresponsible regarding their work.e.g. (XYZ)
     • Xxx service veri poor.
     • No care for customer is what Xxx focus on. I've to leave xxx as it is not solving my problem. Gudbye Keep NOT care customers
     • I am very distrebed to xxx massangar I riqvest 3rd time complained
     • Bhaji plz custmar care service chalu kar do nahi ta mai no. Band kar devaga.Menu bahut mushkil aa rahi plz.Kal spice da no chalega
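The messages above spell "customer" in at least five different ways. A minimal sketch of vocabulary-based correction, using Python's standard `difflib`, shows how such variants could be folded into one form before computing aggregate statistics. The three-word vocabulary, the 0.7 cutoff, and the function name `normalize_token` are assumptions made for illustration.

```python
import difflib

# Hypothetical target vocabulary; a real system would need a much larger one.
VOCAB = ["customer", "service", "care"]

def normalize_token(token, vocab=VOCAB, cutoff=0.7):
    """Return the closest vocabulary word, or the token unchanged if none is close."""
    matches = difflib.get_close_matches(token.lower(), vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else token

# Spelling variants taken from the SMS examples on this slide.
variants = ["custmbar", "custmer", "custamer", "customor", "custmar"]
for v in variants:
    print(v, "->", normalize_token(v))

# A word with no close vocabulary entry is left as it is.
print("kayu", "->", normalize_token("kayu"))
```

Word-level matching of this kind says nothing about the mixed-language content or the conversational context of a message, which is the harder part of instance-level correction.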

  6. Some examples of call centre notes
     • no.fwd to unbarr
     • pls actv. AR on cst reqt ............10:19am.
     • cust wants to actv roam as he don't understand the ivr
     • roming actv on cust req ,,,,,,,,,,charges told,,,,reena
     • no. unbarred as pymt reflected on cust req
     • xxx roaming deactv on cust req
     • the cust secratory called up and he inf tht he was not able to access GPRS ,he was not able to confirm whether its masala or MO,and he told that he will call back with other details later and disconn teh call
     • No waiver given to him at any cost........
     • promotional mssg restricted as on cst req......11:14am
     • Customer was charged SMS for Rs.3074.But customer didnt give request for deactivation of 10000sms pack.Since om dwn,not able to chk active or not.But its shows active in new crm window.
     • resume no. as pymt is reflected.....................9.20am
     • ar deactivated on cust request
     • case escallated : HEALTH ALERTS to be deactivated ......11:15 am
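Agent notes like these rely heavily on shorthand. One possible first normalization step is a plain dictionary lookup over alphabetic tokens, sketched below. The `ABBREV` table is a guess at what the abbreviations mean and is not part of the presentation; `expand` is a hypothetical helper name.

```python
import re

# Assumed expansions, inferred from how the abbreviations are used in the notes.
ABBREV = {
    "pls": "please",
    "actv": "activate",
    "cust": "customer",
    "cst": "customer",
    "req": "request",
    "reqt": "request",
    "pymt": "payment",
}

def expand(note):
    """Replace known abbreviations; leave all other text untouched."""
    return re.sub(
        r"[A-Za-z]+",
        lambda m: ABBREV.get(m.group(0).lower(), m.group(0)),
        note,
    )

print(expand("pls actv. AR on cst reqt ............10:19am."))
print(expand("no. unbarred as pymt reflected on cust req"))
```

A fixed table cannot resolve ambiguous shorthand such as "AR" or "no.", so those tokens pass through unchanged.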

  7. Enriching structured BI with unstructured data
     • Augment classic structured data warehouses with information extracted from unstructured sources
     • Domain-specialized annotators embedded in UIMA (Unstructured Information Management Architecture, open source) extract structured attributes from unstructured sources
     [Diagram labels: Unstructured sources; UIMA processing; Structured sources; Link/Cleanse/Transform; Unstructured Enriched Data Warehouse; Reporting; OLAP tools; Modeling; Mining]
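The idea of an annotator that turns free text into structured attributes can be sketched in a few lines. The code below is a plain-Python stand-in and does not use the UIMA API; the regular expressions, the field names, and the function `annotate` are assumptions based only on the layout of the contact records shown earlier in the deck.

```python
import re

# Patterns inferred from records such as
# "Date: 19990604, ID: 13376646MONIT 06/04/1999 22:16 - Call started by ...".
HEADER = re.compile(r"Date:\s*(\d{8}),\s*ID:\s*(\d+)")
STARTED = re.compile(r"Call started by (.+?) \((\w+)\)")

def annotate(record):
    """Extract a few structured attributes from one free-text contact record."""
    out = {}
    m = HEADER.search(record)
    if m:
        out["date"], out["case_id"] = m.group(1), m.group(2)
    m = STARTED.search(record)
    if m:
        # "agent_group" is a hypothetical name for the code in parentheses.
        out["agent"], out["agent_group"] = m.group(1), m.group(2)
    return out

record = (
    "Date: 19990604, ID: 13376646MONIT 06/04/1999 22:16 - "
    "Call started by Barry O'Kelly (IREL_MOB3)."
)
print(annotate(record))
```

Rows produced this way could then be loaded next to the structured sources in the warehouse; fields that cannot be found are simply left out.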

  8. Analysis of Agent Performance (AT002)
     Scenario: IBM BPO business with agents handling car rentals wants a higher check-out rate
     Solution: Extract key phrases from call transcripts and correlate them with outcomes
     [Figure labels: Value Selling Phrases; Mention of Good Vehicle; Mention of Good Rate]

  9. Analysis of Agent Performance
     Value selling phrases that mention a good rate are used more often for checked-out cars than for no-shows.
     Value selling phrases: Checked out cars 47%, No shows 25%
     [Other chart labels: Cancelled; Pick up information]
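The comparison on slides 8 and 9 amounts to computing, for each outcome, the share of calls in which a value selling phrase appears. A minimal sketch follows; the transcripts, the outcome labels, and the function name `phrase_rate_by_outcome` are hypothetical and do not reproduce the figures on the slide.

```python
from collections import defaultdict

def phrase_rate_by_outcome(calls, phrases):
    """Share of calls per outcome whose transcript contains any of the phrases."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for transcript, outcome in calls:
        totals[outcome] += 1
        text = transcript.lower()
        if any(p in text for p in phrases):
            hits[outcome] += 1
    return {outcome: hits[outcome] / totals[outcome] for outcome in totals}

# Hypothetical toy data: (transcript, outcome) pairs.
calls = [
    ("we can offer a good rate on that car", "checked_out"),
    ("pickup is at noon tomorrow", "checked_out"),
    ("your reservation is confirmed", "no_show"),
    ("the desk closes at six", "no_show"),
]

rates = phrase_rate_by_outcome(calls, ["good rate", "good vehicle"])
print(rates)
```

Because the result is a rate over many calls, a phrase that is misspelled or missed in a few transcripts shifts the numbers only slightly, in line with the aggregate-analysis argument on slide 2.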
