
Data Tagging Architecture for System Monitoring in Dynamic Environments

Bharat Krishnamurthy, Anindya Neogi, Bikram Sengupta, Raghavendra Singh (IBM Research Division in India). IEEE Network Operations and Management Symposium (NOMS), 2008.


Presentation Transcript


  1. Data Tagging Architecture for System Monitoring in Dynamic Environments Bharat Krishnamurthy, Anindya Neogi, Bikram Sengupta, Raghavendra Singh (IBM Research Division in India) IEEE Network Operations and Management Symposium (NOMS), 2008.

  2. Introduction • Why a monitoring system is needed: • Monitoring the parameters of the entire IT system is key to managing a data center environment efficiently. • Each IT component is often packaged with open monitoring interfaces and tools • e.g., a database server or a network element has published APIs for querying its performance metrics. • Even so, a monitoring system is still required to integrate and process the data collected from these heterogeneous sources.

  3. Concept: Monitoring Objective • A specification that defines how to collect a data stream, process it, and generate a type of event or an aggregated data stream. • Also called a Service Level Objective (SLO) in this paper. • A “Service Level Agreement” (SLA) is viewed as a possible composition of multiple SLOs.
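The definition above (collect, process, then emit an event or an aggregate) can be pictured as a small data structure. This is only an illustrative sketch: the class name `MonitoringObjective`, its fields, and the threshold rule are assumptions made for this example and are not taken from the paper or from MDMS.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass
class MonitoringObjective:
    """Hypothetical SLO: how to collect a stream, process it, and raise an event."""
    name: str
    collect: Callable[[], Iterable[float]]       # returns the raw data stream
    process: Callable[[Iterable[float]], float]  # reduces the stream to one value
    threshold: Optional[float] = None            # assumed event rule: value > threshold

    def evaluate(self) -> Tuple[float, Optional[str]]:
        """Return the aggregated value and an event message (or None)."""
        value = self.process(self.collect())
        if self.threshold is not None and value > self.threshold:
            return value, f"{self.name}: {value} exceeds {self.threshold}"
        return value, None


# Usage with made-up readings: the peak of two samples is compared to 0.8.
slo = MonitoringObjective("disk-utilization", lambda: [0.5, 0.9], max, 0.8)
value, event = slo.evaluate()
```

Under this reading, an SLA would simply be a collection of such objectives evaluated together.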

  4. Issue • In dynamic environments, monitoring objectives cannot be frozen at setup. • Monitoring objectives need to be specified on a continuous basis as new applications or hardware are deployed. • Operations personnel lack sufficient knowledge and skills to use or extend complex data modeling standards • such as the Common Information Model (CIM) [10]. [10] DMTF, Common Information Model (CIM). http://www.dmtf.org/standards/cim/

  5. Example • An operations team member wants to measure the utilization of a disk. • Assume we have a function that returns the current utilization of a disk. • The operator must choose the “BaseMetricDefinition”, “SystemResource”, and “ComputerSystem” classes from the higher-level CIM classes • in order to tell the monitoring system what the data is. • Furthermore, the operator has to combine all of the monitoring parameters into one logical unit.

  6. Goal • The goal is to balance the benefits (e.g., uniform interpretation) of well-defined taxonomies like CIM with the ease of use that free text descriptions offer. • Model Driven Monitoring System (MDMS) • An SLO authoring and monitoring system

  7. Model Driven Monitoring System (MDMS) • MDMS has two parts: • The structured part • Uses standard system configuration and monitoring specifications (CIM) so that standard processing can be performed automatically on the monitoring data. • The unstructured part • Allows any additional “information” not represented by the structured part to be modeled statically or at runtime.

  8. A simplified data model (1/2) • Use tags to represent the CIM model, and leave out redundant attributes. • For example: • A user wants to monitor the disk utilization on server 1 • ‘<BaseMetricDefinition.name=utilization>/<SystemResource.name=disk>/<ComputerSystem.name=server1>’ • The ComputerSystem class has other attributes, such as Dedicated and ResetCapability, that are redundant here and are left out of the tag.
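To make the tag format on this slide concrete, here is a minimal sketch that builds and parses a tag string of the form `<Class.attribute=value>/...`. The `Tag` type, the helper names `format_tags` and `parse_tags`, and the regular expression are assumptions introduced for illustration; the paper's actual representation may differ.

```python
import re
from typing import List, NamedTuple


class Tag(NamedTuple):
    cim_class: str  # e.g. "ComputerSystem"
    attribute: str  # e.g. "name"
    value: str      # e.g. "server1"


def format_tags(tags: List[Tag]) -> str:
    """Join tags into one '/'-separated string, one <Class.attr=value> per tag."""
    return "/".join(f"<{t.cim_class}.{t.attribute}={t.value}>" for t in tags)


# Assumed grammar: no '<', '>' inside a tag, no '.' in the class, no '=' in the attribute.
_TAG_RE = re.compile(r"<([^.<>]+)\.([^=<>]+)=([^<>]*)>")


def parse_tags(path: str) -> List[Tag]:
    """Recover the individual tags from a tag string."""
    return [Tag(*m.groups()) for m in _TAG_RE.finditer(path)]


disk_on_server1 = format_tags([
    Tag("BaseMetricDefinition", "name", "utilization"),
    Tag("SystemResource", "name", "disk"),
    Tag("ComputerSystem", "name", "server1"),
])
```

Only the attributes the user cares about are written out, which is how attributes such as Dedicated or ResetCapability stay out of the string.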

  9. A simplified data model (2/2) • Use a string matching algorithm to find relations between tags. • For example: • A different user wants to measure the throughput of an application on server 1. • ‘<BaseMetricDefinition.name=throughput>/<application.name=printbill>/<ComputerSystem.name=server1>’ • Using fast string matching algorithms, we can relate this tag to the example on the previous slide, since both contain <ComputerSystem.name=server1>.
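The slide does not say which string matching algorithm is used. As a stand-in, the sketch below relates two tag strings by the tags they have in common, using a plain split and set lookup; `split_tags` and `shared_tags` are hypothetical helper names, and the sketch assumes that tag values never contain `/`.

```python
from typing import List


def split_tags(path: str) -> List[str]:
    """Split a '/'-separated tag string into its individual <...> tags."""
    return [segment for segment in path.split("/") if segment]


def shared_tags(a: str, b: str) -> List[str]:
    """Return the tags of `a` that also occur in `b`, in the order they appear in `a`."""
    tags_in_b = set(split_tags(b))
    return [tag for tag in split_tags(a) if tag in tags_in_b]


disk = "<BaseMetricDefinition.name=utilization>/<SystemResource.name=disk>/<ComputerSystem.name=server1>"
printbill = "<BaseMetricDefinition.name=throughput>/<application.name=printbill>/<ComputerSystem.name=server1>"

related_by = shared_tags(disk, printbill)
```

Here the two objectives are related only through the server they run on, which is the kind of relation the slide describes.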

  10. Incorporating Unstructured Data Types • Since the system uses tags as its model, unstructured data types are simply “concatenated” onto the structured tags. • In the disk utilization example above, the data collection logic can generate the pairs by collecting per-partition disk utilization. • ‘<BaseMetricDefinition.name=utilization>/<SystemResource.name=disk>/<ComputerSystem.name=server1>/<part.name=hda>’ • Assumption: analytics application writers are domain experts who can use existing unstructured text mining algorithms and tools to infer types from the free text descriptions and/or the tag values.
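The concatenation step can be sketched as follows: an unstructured tag such as `<part.name=hda>` is appended to the structured tag string, and each per-partition reading is paired with its extended tag. The function names and the utilization values are made up for illustration and do not come from the paper.

```python
from typing import Dict, List, Tuple

BASE = "<BaseMetricDefinition.name=utilization>/<SystemResource.name=disk>/<ComputerSystem.name=server1>"


def extend_tag(base: str, name: str, value: str) -> str:
    """Append one unstructured <name=value> tag to an existing tag string."""
    return f"{base}/<{name}={value}>"


def per_partition_pairs(base: str, readings: Dict[str, float]) -> List[Tuple[str, float]]:
    """Pair each partition's reading with the base tag extended by <part.name=...>."""
    return [
        (extend_tag(base, "part.name", partition), utilization)
        for partition, utilization in sorted(readings.items())
    ]


# Hypothetical readings, used only to show the shape of the output.
pairs = per_partition_pairs(BASE, {"hda": 0.42, "hdb": 0.10})
```

Because the unstructured tag is appended at the end, every extended tag still starts with the structured part, so processing that understands only the CIM-derived tags can match on that prefix.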

  11. System Architecture overview

  12. A look inside the agent

  13. Data Processing by Server

  14. Overhead Experiments (1/3) • Environment • The MDMS Server is hosted on an IBM p9113-550 with 16 GB of RAM and 4 CPUs running at 1.6 GHz each. • At the time of the measurements, close to 1500 SLOs were configured to monitor around 500 servers.

  15. Overhead Experiments(2/3) • MDMS server overhead:

  16. Overhead Experiments (3/3) • The measured agent was configured to handle 235 objectives. • The utilization measurements do not include the actual running of the code that performs the data sensing and processing activity.

  17. Conclusion • In this paper we described a monitoring system that uses a hybrid data model consisting of structured and unstructured parts to describe the semantics of monitoring data and events. • The representation of data semantics is in terms of string tags. • The authoring of specifications becomes simpler and more intuitive for operations personnel.

  18. Future work
