AgentCities - Agents and Grids
Thoughts on Monitoring and Agents
Prof Mark Baker
ACET, University of Reading
Tel: +44 118 378 8615
E-mail: Mark.Baker@computer.org
Web: http://acet.rdg.ac.uk/~mab
Outline
• Monitoring: What is it?
• A View of Grid Monitoring.
• Ganglia Example.
• Generic Monitoring Architecture.
• A Layered View.
• Monitoring Issues.
• Where do Agents fit in?
• Summary.
Monitoring: What is it?
• Monitoring is part of the process of administering and managing computer-based resources:
  • However, the term “monitoring” is rather overloaded.
  • The term implies that we are effectively “watching” the state of some component or resource.
  • This type of passive monitoring (read only) is useful in some spheres (e.g. job submission), but is of limited use for actually managing these computer-based resources.
  • Dynamic monitoring (read/write) is more useful because we can not only watch the status of the resources, but also interact with them to control and manage them (e.g. reconfigure on the fly, change QoS settings, queue priorities…).
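The passive/dynamic distinction above can be sketched as two interfaces over the same resource. This is a minimal illustration only: `Resource`, `PassiveMonitor`, `DynamicMonitor` and the `queue_priority` field are hypothetical names chosen for the example, not part of any system named in these slides.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical managed resource with one observable and one tunable value."""
    name: str
    load: float = 0.0
    queue_priority: int = 0


class PassiveMonitor:
    """Read-only view: can watch the resource but offers no way to change it."""

    def __init__(self, resource: Resource) -> None:
        self._resource = resource

    def status(self) -> dict:
        r = self._resource
        return {"name": r.name, "load": r.load, "queue_priority": r.queue_priority}


class DynamicMonitor(PassiveMonitor):
    """Read/write view: adds a control operation on top of watching."""

    def set_queue_priority(self, priority: int) -> None:
        self._resource.queue_priority = priority
```

A passive client can only call `status()`, while a dynamic client can also call `set_queue_priority()` and see the change reflected in later status reads.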
A View of Grid Monitoring
• The traditional view of monitoring looks at static and dynamic computer-based resource information:
  • Static information: for example, CPU type, amount of memory, OS type…
  • Dynamic information: for example, CPU, memory and disk use.
• The information gathered can be used for all manner of tasks:
  • Basic systems monitoring (sys admin tasks),
  • General accounting,
  • Monitoring for job submission purposes (choosing the best resource for task placement),
  • Monitoring to ensure QoS,
  • Policing SLAs,
  • Performance profiling of systems and applications (looking for bottlenecks and other problems),
  • Potentially, security.
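As a rough illustration of the static/dynamic split, the sketch below gathers a few values of each kind using only the Python standard library. The function names and dictionary keys are assumptions made for this example; memory use is omitted because the standard library has no portable call for it, and the load average is reported as `None` on platforms that do not provide it.

```python
import os
import platform
import shutil


def static_info() -> dict:
    """Properties that rarely change while the machine is running."""
    return {
        "cpu_type": platform.machine(),
        "os_type": platform.system(),
        "cpu_count": os.cpu_count(),
    }


def dynamic_info(path: str = ".") -> dict:
    """Properties that change from one sample to the next."""
    usage = shutil.disk_usage(path)
    info = {"disk_used_fraction": usage.used / usage.total}
    try:
        # 1-minute load average; not available on every platform.
        info["load_avg_1min"] = os.getloadavg()[0]
    except (AttributeError, OSError):
        info["load_avg_1min"] = None
    return info
```

Static values could be collected once and cached, whereas dynamic values need periodic sampling, which is where the intrusiveness trade-off comes in.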
Ganglia
Generic Architecture (Local)
[Diagram: local architecture with three Agent/Sensor components]
Generic Architecture (Global)
Data Management Issues
• Need to produce:
  • A simple and expressive API,
  • Device drivers and a manager for each Agent,
  • A means of describing the monitored data:
    • Implies an XML-based schema and an ontology.
[Diagram labels: Resource Markup Language; Ontologies and Schema; API; Agent API; Driver Manager; Agent Driver Manager; Common Agent API; Agent Devices: XYZ Agent, SNMP Agent, NWS Agent, NetL Agent, WBEM Agent, SCM Agent]
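One way to picture the driver/manager layering is sketched below: each agent type gets a driver behind a common interface, and a manager returns readings as XML. All class names, the XML element names and the canned reading are hypothetical; a real driver would talk to SNMP, NWS and so on, and the slides do not define an actual schema.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class AgentDriver(ABC):
    """Hypothetical common interface: one subclass per monitoring agent."""

    name: str

    @abstractmethod
    def read(self, resource: str) -> dict:
        """Return the current metrics for one resource."""


class FakeSnmpDriver(AgentDriver):
    """Stand-in driver returning canned data instead of querying SNMP."""

    name = "snmp"

    def read(self, resource: str) -> dict:
        return {"cpu_load": "0.42"}


class DriverManager:
    """Routes a query to the right driver and wraps the result in XML."""

    def __init__(self) -> None:
        self._drivers = {}

    def register(self, driver: AgentDriver) -> None:
        self._drivers[driver.name] = driver

    def query(self, agent: str, resource: str) -> ET.Element:
        data = self._drivers[agent].read(resource)
        root = ET.Element("resource", name=resource, agent=agent)
        for key, value in data.items():
            ET.SubElement(root, "metric", name=key).text = str(value)
        return root
```

Calling `ET.tostring(manager.query("snmp", "node01"))` on a manager with the fake driver registered yields a small XML document, so clients see one format regardless of which agent produced the data.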
Some Architectural Issues
• Sensors/Agents:
  • Make everyone install custom agents, or use existing ones!
  • Potentially billions of resources that need monitoring!
• Protocols:
  • No real standards apart from SNMP.
  • XML is used extensively now - GLUE is often used (limited).
• Resources versus Services:
  • Ongoing debate.
• Scalability:
  • Need global extent; current systems are typically designed for small scale, based on cluster monitoring.
• Security:
  • Often little or no security.
  • OK for read-only systems, but…
• Intrusiveness:
  • Trade-off as usual; do not want to affect the systems being monitored.
Monitoring Systems
• A recent review showed that there are about twenty active Grid-based monitoring systems.
• These range from systems:
  • That are “built from scratch” - to use such a system you need to install all of their software for monitoring purposes,
  • To those that are built on existing infrastructure and standards - they gather SNMP/Ganglia data and use this for monitoring purposes.
• The latter systems are becoming increasingly popular and are widely used today.
Where do Agents fit in with Monitoring?
• Agent booklet definition:
  • “An agent is a computer system that is capable of flexible autonomous action in dynamic, unpredictable, typically multi-agent domains.”
• According to this definition we “just” throw away what we have and start again with agents!
• However, there are a raft of very practical problems…
• Not least among these is that most of the world does not use agent-based technologies, and does not want to replace its monitoring infrastructure with something new and unproven.
Where do Agents fit in with Monitoring?
[Diagram labels: Intelligence/Knowledge: Clients; Intelligent Tools; Ontologies and Schema; Brokers, Schedulers, Policing. Data/Information: API; Agent/Sensor API; Driver Manager; Agent/Sensor Driver Manager; Common Agent/Sensor API; Agent Devices: XYZ Agent, SNMP Agent, NWS Agent, NetL Agent, WBEM Agent, SCM Agent]
Where do Agents fit in with Monitoring?
• Not practical to replace existing monitoring infrastructure with agents.
• However, there is a vast space in which to use agents to process the data/information gathered and use this to provide intelligence/knowledge to higher-level tools.
• Key agent features:
  • Intelligence - rule-based decision making.
  • Complex agent-to-agent interaction - to produce knowledge for more sophisticated decision making.
• Potential problems!:
  • Integrating agent frameworks and the Grid, APIs, and protocols - practical aspects of wide-scale deployment!
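Rule-based decision making over gathered monitoring data might look something like the sketch below. The rule names, thresholds, metric keys and action strings are all invented for illustration; a real agent framework would supply its own rule language and would act on the results rather than return strings.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """Hypothetical rule: if condition(sample) holds, recommend the action."""
    name: str
    condition: Callable[[dict], bool]
    action: str


def evaluate(rules: List[Rule], sample: dict) -> List[str]:
    """Return the actions of every rule whose condition matches the sample."""
    return [rule.action for rule in rules if rule.condition(sample)]


# Illustrative thresholds only.
RULES = [
    Rule("high-load", lambda s: s["cpu_load"] > 0.9, "stop scheduling new jobs"),
    Rule("disk-nearly-full", lambda s: s["disk_used"] > 0.95, "alert administrator"),
]
```

The monitoring layer stays unchanged here: the agent only consumes samples it is handed, which matches the idea of adding intelligence above the existing infrastructure rather than replacing it.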
Where do Agents fit in with Monitoring?
• SLA/QoS/site-policy policing.
• Intelligent brokering for a range of tasks:
  • Negotiation,
  • Bartering,
  • Arbitration,
  • Job submission,
  • Resource reservation.
• Accounting tools.
• Autonomic behaviour - help in providing self-healing capabilities of distributed systems.
• Working with Semantic Web technologies to create/provide knowledge.
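As a very small example of brokering for job submission, the sketch below picks the least-loaded resource that meets a job's memory requirement. The function name, the record fields (`name`, `memory_gb`, `cpu_load`) and the selection policy are assumptions for illustration; negotiation, bartering and reservation are not modelled.

```python
from typing import Dict, List, Optional


def choose_resource(resources: List[Dict], min_memory_gb: float) -> Optional[str]:
    """Pick the least-loaded resource with enough memory, or None if none fits."""
    candidates = [r for r in resources if r["memory_gb"] >= min_memory_gb]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["cpu_load"])["name"]
```

The input records would come from the monitoring layer (static memory size plus dynamic CPU load), so the broker consumes monitoring data without collecting it itself.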
Summary
• There is well-established monitoring infrastructure for existing distributed systems - clusters, LANs, the Grid…
• Higher-level tools/services that use the gathered monitoring data are few and far between - this seems a good space where agent-based systems can work.
• Need “intelligence” to provide knowledge to consumers of Grid-based services.
• Not necessarily easy to put agent and Grid infrastructure together - various issues: security, different architectures, APIs, protocols…