
The Lifecycle of Enterprise Information



Presentation Transcript


  1. The Lifecycle of Enterprise Information. Pankaj Mehra, HP Distinguished Technologist, Chief Scientist, HP Labs Russia

  2. Business Records
     [Figure labels: product design; peak of product success; identified in a legal dispute; deemed evidence. Source for business value indicator: Geoff Moore, General Manager of Tower Software]
     • Businesses record and retain a lot of information
     • Old information is generally less valuable
     • Information must be retained for a long time
     Syrcodis 2008 – Pankaj Mehra, HP Labs

  3. The discipline of Information Lifecycle Management
     • Stored data, paper, business records
     • Business processes: people (portals, search), applications (query)
     • Discovery, classification, policy engine, migration, retention, deletion
     • Key differentiators: metadata indexing, content indexing, compression, tamper-proofing, time-range search
     • Databases, filesystems, content repositories, scalable storage

  4. To appreciate the full generality of Information Lifecycles, consider:
     • A store coupon (price, expiry date)
     • A gift card (value, identity)
     • Office space lease contract
     • An airline ticket
     • A child's X-ray

  5. To appreciate the full generality of Information Lifecycles, consider:
     • A store coupon (price, expiry date)
     • A gift card (value, identity)
     • Office space lease contract
     • An airline ticket
     • A child's X-ray
     • Retention / deletion policy: 90 years!
     • Legal requirements: HIPAA
     • Privacy and access control: healthcare power-of-attorney, parental rights
     • Life and death issues
     • Cost and risk of doing business
     • Future laws!

  6. Can you figure out the systems implications of all this?
     • A store coupon (price, expiry date)
     • A gift card (value, identity)
     • Office space lease contract
     • An airline ticket
     • A child's X-ray
     • Retention / deletion policy: 90 years!
     • Legal requirements: HIPAA
     • Privacy and access control: healthcare power-of-attorney, parental rights
     • Life and death issues
     • Cost and risk of doing business
     • Future laws!
     • Comprehensive data capture
     • Technology neutral
     • De-normalized
     • Sufficient indexing
     • Disaster tolerance
     • Volume of accumulated information (PBs)?
     • At what cost and what quality of service?
     • In what format and on what media?
     • Legal and loyalty cost of data outages and data loss?
     • What to keep and for how long?
     • What to delete and when?

  7. ILM Platforms
     • Robust and auditable architecture, to allow compliance with prevailing laws
     • Connectors and data discovery, to capture information anywhere (image in medical equipment, invoice from SOAP message, project data fragmented across documents, or sales order normalized across many tables)
     • Scalable, tiered storage with migration
     • Data de-duplication and compression
     • Tamper-proofing
     • Robust classification and analysis algorithms
     • Compact and uniformly applied policies
     • Access methods resilient against forgotten names and locations
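
As a rough illustration of the "data de-duplication and compression" and "tamper-proofing" items on this slide, the sketch below stores each distinct payload once under its SHA-256 digest and compresses it with zlib. The `DedupStore` class and its method names are hypothetical and are not taken from the talk or from any HP product. Recomputing the digest only detects tampering; it does not prevent it.

```python
import hashlib
import zlib


class DedupStore:
    """Toy content-addressed store (hypothetical): identical payloads are kept once, compressed."""

    def __init__(self):
        self._blobs = {}  # SHA-256 hex digest -> compressed payload
        self._names = {}  # logical name -> digest of its payload

    def put(self, name: str, data: bytes) -> str:
        """Store data under name; a payload already present is not stored again."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = zlib.compress(data)
        self._names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        """Return the original (decompressed) payload for name."""
        return zlib.decompress(self._blobs[self._names[name]])

    def verify(self, name: str) -> bool:
        """Tamper check: the stored payload must still hash to its recorded digest."""
        digest = self._names[name]
        payload = zlib.decompress(self._blobs[digest])
        return hashlib.sha256(payload).hexdigest() == digest

    def blob_count(self) -> int:
        """Number of distinct payloads physically stored."""
        return len(self._blobs)
```

Storing the same invoice under two names leaves `blob_count()` at 1, which is the space saving that de-duplication is after.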

  8. To comply with laws

  9. To tame unchecked data growth

  10. To differentially manage by business value
      [Figure: data-protection techniques plotted by data outage penalty rate ($/hr) and data loss penalty rate ($/hr), each axis running from 500 to 50 000 000. Region labels: tape backup; async mirror; asynchronous, batched mirroring; synchronous mirroring; fail-over to secondary site; reconstruct primary site.]
      Source: Keeton, et al. (HP Labs), Designing for disasters, FAST '04 conference
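
The chart on this slide relates protection techniques to two penalty rates. A minimal sketch of that kind of trade-off follows, assuming the total cost of a technique is its outlay plus outage penalty rate times recovery time plus loss penalty rate times the window of lost updates. The technique names echo the slide, but every recovery time, loss window, and outlay figure below is invented for illustration and is not taken from Keeton et al.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    name: str
    recovery_hours: float   # how long the data stays unavailable after a failure
    data_loss_hours: float  # how many hours of recent updates are lost
    outlay: float           # cost of deploying the technique, in dollars


# Hypothetical figures, chosen only to make the example concrete.
TECHNIQUES = [
    Technique("tape backup", recovery_hours=24, data_loss_hours=24, outlay=10_000),
    Technique("async mirror", recovery_hours=2, data_loss_hours=0.5, outlay=200_000),
    Technique("sync mirror", recovery_hours=2, data_loss_hours=0, outlay=500_000),
]


def total_cost(t: Technique, outage_rate: float, loss_rate: float) -> float:
    """Outlay plus penalties; both rates are in dollars per hour."""
    return t.outlay + outage_rate * t.recovery_hours + loss_rate * t.data_loss_hours


def cheapest(outage_rate: float, loss_rate: float) -> str:
    """Name of the technique with the lowest total cost for the given penalty rates."""
    return min(TECHNIQUES, key=lambda t: total_cost(t, outage_rate, loss_rate)).name
```

With these made-up numbers, low penalty rates favor tape backup and high data loss penalties favor synchronous mirroring, which is the general shape of the argument for managing data differentially by business value.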

  11. Typical Lifecycle of Structured Data
      [Diagram labels: test and development; subset the data; data warehouse; extract, transform, load; production databases; archive (or delete); historical archive (long-term retention); individual data marts (decision support)]

  12. Database Lifecycle Management

  13. … not forgetting the other 85% (in files and folders on fileservers). These graphs are typical of "reports" produced by discovery tools, such as Scentric Tao Destiny, Kazeon Discovery Engine, and Intermine FileCensus.

  14. The Lifecycle of Files
      [Timeline labels: Continuously Protect, Optimize, Archive; 0-72 hrs, 72 hrs – 2 wks, Months, Years, Decades]
      • Information in documents typically passes through 3 phases during its life
      • Operational: frequently updated during 72 hours after creation
      • Transitional: infrequently updated; converted to business record format
      • Archival: static (rarely accessed); subject to long-term retention management
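
The three phases on this slide can be sketched as a function of document age. The 72-hour boundary comes from the slide. Treating two weeks as the end of the transitional phase is an assumption read off the timeline labels, and the name `lifecycle_phase` is hypothetical.

```python
from datetime import timedelta

OPERATIONAL_LIMIT = timedelta(hours=72)  # from the slide: frequent updates for 72 hours
TRANSITIONAL_LIMIT = timedelta(weeks=2)  # assumption based on the "72 hrs – 2 wks" label


def lifecycle_phase(age: timedelta) -> str:
    """Map the time since creation to one of the three phases named on the slide."""
    if age < timedelta(0):
        raise ValueError("age must not be negative")
    if age <= OPERATIONAL_LIMIT:
        return "operational"
    if age <= TRANSITIONAL_LIMIT:
        return "transitional"
    return "archival"
```

A real system would more likely use last-modified and last-accessed times than creation age alone. Age is used here only because it is the one signal the slide gives.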

  15. Information Lifecycle Management Process
      [Diagram labels: Discover (find info sources); Classify (determine categories); Analyze (extract metadata); Storage class 1; Storage class 2; Storage class 3 [normalize, compress, encrypt]; disposition? (at end of retention period); discard; ILM policies]
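
A minimal sketch of the classify and disposition steps named on this slide, assuming a policy is just a retention period plus a target storage class. The category names, file-name rules, and the 1-year and 7-year periods are hypothetical. The 90-year figure echoes the X-ray example earlier in the talk.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Policy:
    retention_years: int
    storage_class: int


# Hypothetical policy table: category -> policy.
POLICIES = {
    "medical_image": Policy(retention_years=90, storage_class=3),
    "coupon": Policy(retention_years=1, storage_class=1),
}
DEFAULT_POLICY = Policy(retention_years=7, storage_class=2)


def classify(name: str) -> str:
    """Toy classification rule based on the file name only."""
    if name.lower().endswith(".dcm"):
        return "medical_image"
    if "coupon" in name.lower():
        return "coupon"
    return "general"


def full_years_between(start: date, end: date) -> int:
    """Whole years elapsed from start to end."""
    years = end.year - start.year
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


def disposition(name: str, created: date, today: date) -> str:
    """Discard once the retention period has elapsed, otherwise name the storage class."""
    policy = POLICIES.get(classify(name), DEFAULT_POLICY)
    if full_years_between(created, today) >= policy.retention_years:
        return "discard"
    return f"retain in storage class {policy.storage_class}"
```

Keeping the policy table separate from the classification rules is one way to keep policies compact and uniformly applied: a change to a retention period touches one table entry rather than every rule.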

  16. ILM's 3 Pillars: Business value, Laws, Costs
      [Diagram labels: Discover; Classify; Analyze; Storage class 1; Storage class 2; Storage class 3 [normalize, compress, encrypt]; disposition?; discard; ILM policies]
      • in: info sources, apps; out: business records
      • classification rules
      • in: examples (e.g., of insider trading); out: notifications (e.g., of possible insider-trading instances)
      • infrastructure costs, efficiency
      • lifecycle action rules/goals
      • business value of information
