Introduction to Online Marketing Intelligence Zhangxi Lin ISQS 3358 Texas Tech University
Outline • Online Targeted Advertising • About Web mining • Data • Knowing your customer • Consumer segmentation
Marketing Technology Adoption • In December 2005, Forrester surveyed 371 marketing technology decision-makers and influencers to investigate trends in marketing technology adoption and spending. • Respondents hail from six major industry groups, and two-thirds work for firms whose annual revenues in 2005 exceeded $1 billion. • Marketing technology adoption is widespread. • Marketers say they need a more comprehensive application suite. • Vendors aren’t delivering yet.
Marketing Technology Spending • Since 2003, budgets have crept steadily upward and, on average, 2006 budgets are up 7% over 2005. But spending varies significantly by company size and industry. Specifically: • The largest and smallest firms are scaling back slightly. • Technology followers are putting cash behind their intentions. • As a percentage of revenue, retailers spend the most on marketing technology. • B2B firms are growing marketing technology spend aggressively.
Online Advertising Market Status • In 2006, online advertising spending was $16.8 billion, an increase of 34% over 2005 (IAB 2007). • According to DoubleClick Research (2005): • Online advertising inventory is limited, because users can view only so many of the growing number of web pages • Online targeted advertising is a seller’s market • Online targeted advertising is emerging as a new trend. • In March 2007, Focus Media Holding Ltd, China’s largest advertising company by advertising revenue, agreed to buy the leading Chinese online advertising firm Allyes Information Technology Co. Ltd for $225 million. • In April 2007, Google Inc. announced a definitive agreement to acquire DoubleClick for $3.1 billion.
Targeted Marketing • Users know what they want • Users purchased certain items from certain websites • We can apply real-time customized marketing solutions (see the process map later) • Users did not purchase but clicked through some links • Mine the customers’ clickstreams to infer their needs (behavioral targeting) • Users do not know what they want (behavioral targeting) • Collect information online (such as blogs and discussion boards in a community) • Segment/target/position strategy • We can potentially build a database profiling the online users • How to design (create) ads that appeal to end users
Implications of Targeted Marketing • For advertisers • Help to drive immediate responses (or increased sales) to their advertisements • Help to build branding for the advertisers • For publishers • Maximize the value of high-quality ad inventory space (differential services for different site sectors)
Effectiveness of Online Marketing • When executed properly, behavioral marketing is a highly effective means of reaching and converting your target audience. • [Figure: Network Behavioral Targeting vs. Non-Targeted Advertising. Source: Advertising.com, 2004] • [Figure: Behavioral Re-Targeting vs. Non-Targeting Advertising. Source: Advertising.com, 2005]
PRODUCT PURCHASE This travel advertiser targeted consumers who previously visited its website in order to drive actual reservations. Visitors who had not booked a reservation received custom ads highlighting guaranteed rates, seasonal discounts, new hotel perks and free gifts with an online booking. A hotel booking was generated for every 2,000 impressions served. 1 out of every 2 people who clicked on the ad completed a booking.
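The two ratios in this case study imply a click-through rate, which the slide does not state directly. The short calculation below is only a sketch of that arithmetic: the two ratios come from the slide, while the campaign size of one million impressions is a hypothetical figure chosen for illustration.

```python
from fractions import Fraction

# Ratios reported on the slide.
bookings_per_impression = Fraction(1, 2000)  # one booking per 2,000 impressions
bookings_per_click = Fraction(1, 2)          # one in two clickers booked

# bookings/impression = (clicks/impression) * (bookings/click),
# so the implied click-through rate is the quotient of the two ratios.
click_through_rate = bookings_per_impression / bookings_per_click

# Hypothetical campaign size, used only to make the ratios concrete.
impressions = 1_000_000
expected_clicks = impressions * click_through_rate
expected_bookings = impressions * bookings_per_impression

print(f"Implied click-through rate: {float(click_through_rate):.2%}")
print(f"Expected clicks: {expected_clicks}, expected bookings: {expected_bookings}")
```

Under these figures the campaign needs roughly 1,000 impressions per click, so the strong result comes from the conversion rate after the click rather than from the click-through rate itself.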
Web Mining • When online users browse web pages, their activities can be recorded. Analyzing these activities with data mining techniques enables more accurately targeted online advertising. • Possible web mining applications include • Consumer Profiling • Purchase propensity analysis • Web page effectiveness evaluation • Online recommendation • Real-time advertising • Others
Some Business Questions • Who is visiting my Web site? • Who is buying my product(s)? • Who are my repeat buyers? • Which customers are churning? • Which Web design produces the most purchases? • What campaign strategies are most effective in increasing Web site visits?
Business Questions • What factors influence product purchases? • Time-of-day effects • Gender, Age, Income, and so forth • Latent factors: e-shopper, Web expert, and so forth • Which sales channels produce the most profitable customers? • Do any site-visit patterns correlate with outcomes that can be exploited for business advantage? • How can I forecast peak usage and future usage to ensure I have the hardware and technology to keep my Web site running? • How can I monitor my Web site to prevent inappropriate access and malicious activity? • How can I manage purchases, returns, and exchanges to avoid fraud and reduce waste?
Web Mining for Profitability • Increase viewing, navigation, and transaction efficiency. • Improve the customer experience. • Add services and features that promote cross-selling and up-selling opportunities. • Identify problem areas. • Improve security. • Attract more high quality customers.
Customer Relationship Management (CRM) • Making the right offer to the right customer at the right time. • One-to-one marketing. — Peppers and Rogers • TQM (Total Quality Management) with new buzz words. • “The practice of annoying customers for short term profits.” — Herb Edelstein
Examples of Web Site Services • Recommender systems • Stock quotes or financial services • News, weather, sports, traffic conditions • Celebrity or event photos and multimedia • Search engines • Web site hosting or e-mail • Games or contests • Beach cams, space cams, hot spot cams
Internet Commerce Challenges • 24/7 operations • International scope • Non-standard media • Many browsers • Different display monitors and graphics adapters • Different window geometry • Different computers and operating systems • Different customer concerns • Secure transactions • Privacy and confidentiality • Legitimacy
Data Collection Methods • Web logs • Cookies • Forms • Java applications • Other applications
Web Log Data • Fields • User’s IP address, also called: • Remote host name • Client IP address • User name, also called: • Remote user log name (may be different) • Authenticated user name • Date and time of request, with or without a UTC offset • Request type, also called “method” • HTTP request with (CLF) or without (IIS) argument • Status: HTTP three-digit status code • Number of bytes sent to client
Web Log Fields • The URL path requested, if request type has no argument • The port to which the request was served • The name of the server • The IP address of the server • The time taken to serve the request • Number of bytes in the request received from the client • User agent, which is usually a text string with the name and version number of Web browser used by the client and the operating system of the client machine • The domain name or IP address of the referring URL • Query information in a text string • Cookie information in a text string
The User Session • User requests index.htm. • Server sends copy of index.htm. • Browser parses index.htm, finds references to image files, and requests image files. • ... [Diagram: requests and responses exchanged between Browser and Web Server]
Three Popular Web Log Formats • NCSA Common Log Format • Microsoft IIS Format • W3C Extended Log File Format
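To make the field list concrete, here is a minimal sketch of reading one line in the NCSA Common Log Format (host, identity, user, timestamp, request, status, bytes). The regular expression, the function name `parse_clf_line`, and the sample line are illustrative assumptions, not part of the course material; real logs may need a more tolerant pattern.

```python
import re
from datetime import datetime

# One Common Log Format line:
# host ident user [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of fields for one CLF line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Timestamp with UTC offset, e.g. 10/Oct/2006:13:55:36 -0700
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    # The request holds the method and its argument (the URL path).
    parts = record["request"].split()
    record["method"] = parts[0] if parts else None
    record["path"] = parts[1] if len(parts) > 1 else None
    record["status"] = int(record["status"])
    # A "-" means no bytes were sent.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record

# Made-up sample line for illustration only.
sample = '192.0.2.10 - - [10/Oct/2006:13:55:36 -0700] "GET /index.htm HTTP/1.0" 200 2326'
record = parse_clf_line(sample)
print(record["host"], record["method"], record["path"], record["status"])
```

The IIS and W3C Extended formats arrange and name their fields differently, so each would need its own parser.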
Web Logs May Be Inadequate for Data Mining • Limitations exist with respect to defining users, sessions, and page hits. • User preferences must be inferred from limited data: referring URL, page selections, browser. • Different users within a household may be indistinguishable.
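The difficulty of defining users and sessions can be seen in a common heuristic, sketched below: treat each (IP address, user agent) pair as one visitor and start a new session after a period of inactivity. The 30-minute timeout, the function name `sessionize`, and the request data are all assumptions made for illustration. As the slide warns, this heuristic cannot separate household members who share a machine.

```python
from datetime import datetime, timedelta

# Assumed inactivity threshold; the slides do not prescribe a value.
SESSION_TIMEOUT = timedelta(minutes=30)

def sessionize(requests):
    """Group (ip, user_agent, timestamp, path) tuples into sessions.

    A visitor is approximated by the (ip, user_agent) pair, and a new
    session starts when the gap since that visitor's last request
    exceeds SESSION_TIMEOUT.
    """
    sessions = []
    current = {}    # visitor key -> index of that visitor's open session
    last_seen = {}  # visitor key -> timestamp of the last request
    for ip, agent, ts, path in sorted(requests, key=lambda r: r[2]):
        key = (ip, agent)
        if key not in last_seen or ts - last_seen[key] > SESSION_TIMEOUT:
            sessions.append({"visitor": key, "pages": []})
            current[key] = len(sessions) - 1
        sessions[current[key]]["pages"].append(path)
        last_seen[key] = ts
    return sessions

# Hypothetical requests for illustration only.
day = datetime(2006, 10, 10)
requests = [
    ("192.0.2.10", "BrowserA", day.replace(hour=9, minute=0), "/index.htm"),
    ("192.0.2.20", "BrowserB", day.replace(hour=9, minute=5), "/index.htm"),
    ("192.0.2.10", "BrowserA", day.replace(hour=9, minute=10), "/products.htm"),
    ("192.0.2.10", "BrowserA", day.replace(hour=10, minute=0), "/index.htm"),
]
sessions = sessionize(requests)
for s in sessions:
    print(s["visitor"], s["pages"])
```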
What Is a Cookie? • Browser requests Web page • Web page is delivered with instructions for creating cookie • Browser creates cookie and writes it to hard disk • Value of cookie is sent to server • Custom content is returned [Diagram: Web Browser (Client) and Web Server]
The Anatomy of a Cookie • Name: sequence of characters uniquely identifying the cookie • Value: stored information • Domain: domain name • Path: path within a site; access is restricted to this path • Expires: expiration date in UTC • Secure: encryption flag
A Sample Cookie session-id 103-0556164-3592039 www.megastock.com/ 0 730710016 30123554 2742100288 29450847 *
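The cookie attributes described earlier can be illustrated with Python's standard `http.cookies` module. The sketch below builds the header a server might send to create the sample cookie and then reads the value back as the server would on a later request. The expiry date is a hypothetical value, and setting the Secure flag is an assumption; the sample does not say how the original cookie was issued.

```python
from http.cookies import SimpleCookie

# Server side: instructions for creating the cookie.
outgoing = SimpleCookie()
outgoing["session-id"] = "103-0556164-3592039"        # name and value from the sample
outgoing["session-id"]["domain"] = "www.megastock.com"
outgoing["session-id"]["path"] = "/"
# Hypothetical expiration date (UTC), for illustration only.
outgoing["session-id"]["expires"] = "Fri, 31 Dec 2010 23:59:59 GMT"
outgoing["session-id"]["secure"] = True               # assumed; send over HTTPS only

header = outgoing.output(header="Set-Cookie:")
print(header)

# Later request: the browser returns only the name=value pair.
incoming = SimpleCookie()
incoming.load("session-id=103-0556164-3592039")
session_id = incoming["session-id"].value
print(session_id)
```

Only the name and value travel back to the server; the domain, path, expiry, and secure attributes tell the browser when it may send them.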
Limitations of Cookies • Can only be accessed by the domain name that created them (which is a GOOD thing) • Are restricted to a maximum number of cookies per Web site (20 with Netscape Version 0) • Are limited in size (4K with Netscape Version 0) • Have an expiration date
Server-Side Cookies • Can be used to restrict access • Support shopping cart applications • Help track user activity on the Web site
Server-Side Data Collection • Maintaining user information • Collecting and updating information • e-Commerce strategies
Some Common Web Log Statistics • Most popular pages • Frequency of referring sites • Page count statistics: means, percentiles, variation • Session count statistics • Frequency of Web browser usage • Frequency of operating systems • Frequency of error types Check web log statistics: http://www.commerx.com/usage/ This website is the business site of IMW (http://www.inetworks.com) headquartered in Austin, Texas.
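Several of these statistics reduce to simple counting once log lines have been parsed. The sketch below assumes each record is a dictionary with `session`, `path`, and `status` keys; that layout and the records themselves are hypothetical, chosen only to show the idea.

```python
from collections import Counter
from statistics import mean

# Hypothetical parsed log records for illustration only.
records = [
    {"session": "s1", "path": "/index.htm", "status": 200},
    {"session": "s1", "path": "/products.htm", "status": 200},
    {"session": "s1", "path": "/cart.htm", "status": 200},
    {"session": "s2", "path": "/index.htm", "status": 200},
    {"session": "s2", "path": "/missing.htm", "status": 404},
    {"session": "s3", "path": "/index.htm", "status": 200},
]

# Most popular pages.
page_counts = Counter(r["path"] for r in records)

# Frequency of error types (HTTP status codes of 400 and above).
error_counts = Counter(r["status"] for r in records if r["status"] >= 400)

# Page count statistics per session.
pages_per_session = Counter(r["session"] for r in records)
mean_pages = mean(pages_per_session.values())

print("Most popular pages:", page_counts.most_common(3))
print("Error frequencies:", dict(error_counts))
print("Mean pages per session:", mean_pages)
```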
Baselines and Comparisons • Which statement is more informative? • Our Web server recorded 11,000 page views yesterday. • Our Web server recorded an increase of 1,000 page views yesterday compared to the previous day. • Our Web server recorded a 10% increase in page views yesterday compared to the previous day.
Baselines and Comparisons: Good or Bad? • “We converted 25% of our registered customers to premium account status this month.” • “We converted 50% more of our registered customers to premium account status this month compared to last month.” • “Last month we converted 2 registered customers to premium account status, and this month we converted 3.”
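The point of these statements is that a percentage change means little without its baseline. A small sketch of the arithmetic follows; the function name `describe_change` is illustrative, and the figure of 10,000 page views for the previous day is derived from the slide's own numbers (11,000 views after an increase of 1,000).

```python
def describe_change(previous, current):
    """Return the absolute and relative change from previous to current."""
    absolute = current - previous
    relative = absolute / previous
    return absolute, relative

# Page views: 10,000 the previous day, 11,000 yesterday.
views_abs, views_rel = describe_change(10_000, 11_000)

# Premium conversions: 2 last month, 3 this month.
conv_abs, conv_rel = describe_change(2, 3)

print(f"Page views: +{views_abs} ({views_rel:.0%})")
print(f"Conversions: +{conv_abs} ({conv_rel:.0%})")
```

The conversions show the larger relative change (50% against 10%) yet amount to a single additional customer, which is why the absolute figure and the baseline should be reported together.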
Methods of Evaluating Visitor Behavior • Web Stats • Path Analysis • Link Analysis • Stochastic Process Methods • Page transition probabilities • Probability of site abandonment
Path Analysis for an E-tailer [Diagram of site states: Final Decision, Product Selection, Customer Info, Shipping, Billing/Credit Card Info, Product Info]
A Visitor Path Path: 1 6 7 1 3 8 1 5 1 4 2 6 3 2 3 7 5 4 8 6 EXIT
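A path like this one is the raw material for the page transition probabilities mentioned under stochastic process methods. The sketch below estimates first-order transition probabilities by counting consecutive pairs, assuming each number in the path is a page identifier and EXIT marks leaving the site. A single path gives only crude estimates; in practice the counts would be pooled over many sessions.

```python
from collections import Counter, defaultdict

# The visitor path from the slide, read as a sequence of page IDs.
path = "1 6 7 1 3 8 1 5 1 4 2 6 3 2 3 7 5 4 8 6 EXIT".split()

def transition_probabilities(paths):
    """Estimate P(next page | current page) from a list of paths."""
    counts = defaultdict(Counter)
    for p in paths:
        for src, dst in zip(p, p[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(nexts.values()) for dst, n in nexts.items()}
        for src, nexts in counts.items()
    }

probs = transition_probabilities([path])

# Transitions out of page 1, and the estimated chance of leaving from page 6.
print(probs["1"])
print(probs["6"].get("EXIT", 0.0))
```

The probability attached to EXIT from each page is the per-page abandonment estimate.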
Path Analysis Example Results • Sixty percent of site visitors leave after viewing the home page. • Seventy-three percent of customers who purchase product X do not access the product X information page. • The highest probability of abandonment occurs on the shipping page. • Sixty-three percent of consumers who purchased product X viewed warranty information, while twenty-seven percent of consumers who abandoned a shopping cart containing product X viewed warranty information.
Path Analysis E-tailer Example • Data • Only sessions with shopping carts are included • All paths up to “checking out” are condensed into a single “Product Selection” state • Each session consists of 1 to 7 states, number of items selected, value of all items in the shopping cart, and time each state is entered. • Purpose: investigate the abandonment of shopping carts and exiting the site without making a purchase. • Analysis: group shopping carts by value, perform a sequential association analysis, and plot confidence as a function of state.
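The "confidence as a function of state" idea can be sketched outside SAS as follows. Every session here starts in state 1 (Product Selection), so the confidence of the sequence rule "state 1 then state k" is the share of sessions that reach state k. The data are hypothetical, and the sketch assumes sessions move through the states in order, so the furthest state reached summarizes each session; the actual analysis also groups carts by value, which is omitted here.

```python
# Hypothetical data: the furthest state (1 to 7) reached in each session.
furthest_state = [1, 1, 2, 3, 3, 4, 7, 7, 5, 2]

def confidence_by_state(furthest, n_states=7):
    """Share of sessions reaching each state, given all start in state 1."""
    total = len(furthest)
    return {
        k: sum(1 for f in furthest if f >= k) / total
        for k in range(1, n_states + 1)
    }

confidence = confidence_by_state(furthest_state)
for state, value in confidence.items():
    print(f"state {state}: confidence {value:.0%}")
```

Plotting these values against the state number shows where the curve drops most steeply, which is where carts are being abandoned.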
SAS Code for Path Analysis
Produce Contents of the RLINKS Dataset
    ods html path='C:\workshop\winsas\CCWEB' body='rlnkstat.html';
    title1 "Path Analysis of E-tailer Data";
    proc contents data=crssamp.rlinks;
    run;
SAS Code for Path Analysis
Produce Frequencies for Class Variables
    proc freq data=crssamp.rlinks;
      tables Category DollarCat NumItems PurchaseStep / list missing;
    run;
SAS Code for Path Analysis
Basic Descriptive Statistics
    proc univariate data=rlinks;
      var TotalCost;
    run;
    title2 "Total Cost when a Purchase is Made";
    proc univariate data=rlinks (where=(PurchaseSequence=7));
      var TotalCost;
    run;
Link Analysis • Link analysis is the examination of the linkages between effects in a complex system. (SAS Help screen) • Analysts try to discover the relationships between states in a complex system. • A link analysis may employ a variety of techniques including OLAP, associations, sequences, clustering, and graphics. • The path analysis performed on the e-tailer data may be viewed as a link analysis performed on a rather simple retail system.
SAS Link Node • C1U -- the unweighted first-order Centrality measure. • C2U -- the unweighted second-order Centrality measure. • C1 -- the first-order Centrality measure. • C2 -- the second-order Centrality measure. • VALUE -- the value of the class variable, or the midpoint of the bin of the interval variable that constitutes the node. • VAR -- the variable that constitutes the node. • ROLE -- the variable role. • COUNT -- the node count. The number of observations that are represented by the level of the variable. • PERCENT -- the node count divided by the total number of observations. • ID -- the node ID. • TEXT -- the text variable, represented as VAR=VALUE. • X -- the X-coordinate of the node in the Link Graph. • Y -- the Y-coordinate of the node in the Link Graph.