An Introduction to IBM Tivoli Monitoring 6.2.1 and the ITM 6.x Universal Agent
Tivoli User Group Meet (Mumbai), 4th July 2009.
IBM Tivoli Monitoring 6.2.1
• What ITM is - Service Management
• What ITM does – Visibility, Control, Automation
• Monitoring Space and Scalability
• Advanced Features and Customization
• Problem Determination
• Questions
IBM Tivoli Monitoring 6.2.1 – Service Management
• Visibility: Visualize service performance and health across all network, server, middleware and application components.
• Control: Increase effectiveness and productivity, reduce errors and improve availability through consolidated tooling.
• Automation: Keep costs under control as all aspects of the infrastructure grow, with integrated policy-based automation.
[Diagram labels: Business Service Management, Application Availability, Consolidated Operations Management]
IBM Tivoli Monitoring 6.2.1 - Control
• Situations allow operators to quickly distribute a set of conditions that determine whether a potential problem exists in any monitored resource
• Out-of-the-box situations provide immediate return on investment and fast time to value
• Extended situations reduce false alerts and raise operators' confidence that alerts are real
• Tight integration with root cause analysis tools improves mean time to recovery
IBM Tivoli Monitoring 6.2.1 - Automation
• Take Action allows individual commands, and either manual or automated processes, to be executed in response to an individual situation
• Out-of-the-box take actions provide immediate return on investment and fast time to value
• Personalized take actions can capture a local best practice for unique situations and execute it preemptively
• User-defined text can also embed knowledge that may be unique to a particular situation
IBM Tivoli Monitoring 6.2.1 - What ITM does
• Runs as framework components, plus intelligent agents on monitored and/or monitoring infrastructure
• Collects metrics and reports them, per configuration
• Raises events, takes action, runs automation policies, per configuration
• Gives you key status indicators for your entire infrastructure at a glance and at a single console
IBM Tivoli Monitoring 6.2.1 – Scalable Architecture
• One Hub TEMS is required; Remote TEMS (RTEMS) can be added to allow scaling
• One or more Warehouses if you decide to collect metrics over time; otherwise you deal only with real-time data
• About 500 agents per [R]TEMS
• Large installations can have > 10K OS + app agents
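As a rough sizing sketch based only on the rule of thumb above (about 500 agents per TEMS), the number of TEMS instances needed for a given agent count is a ceiling division. The function name is made up for illustration, and real sizing depends on workload, so treat the result as a starting estimate rather than a recommendation.

```shell
#!/bin/sh
# Rough sizing helper: how many TEMS instances a given agent count implies.
# The default of 500 agents per TEMS is the slide's rule of thumb, not a hard limit.
tems_needed() {
    agents=$1
    per_tems=${2:-500}
    # Ceiling division using integer arithmetic.
    echo $(( ($agents + $per_tems - 1) / $per_tems ))
}

tems_needed 10000     # prints 20
tems_needed 1200      # prints 3
```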
IBM Tivoli Monitoring 6.2.1 - Advanced features and customization
Agent customization
• Agent Builder
  • Eclipse-based GUI for agent development
  • Generates agents for Windows, AIX, Linux, Solaris, HP-UX
  • Incorporates queries, situations and workspaces
  • Creates an installable image – local install and remote deploy image
IBM Tivoli Monitoring 6.2.1 - Advanced features and customization
Agent customization
• Agentless Monitoring – New in 6.2.1
  • You can remotely monitor operating systems using Java Management Extensions (JMX), Common Information Model (CIM), Simple Network Management Protocol (SNMP), and Windows Management Instrumentation (WMI)
  • Nothing is installed on the target box, but these services/APIs must be enabled there
  • Easier and faster to implement
IBM Tivoli Monitoring 6.2.1 - Advanced features and customization
Agent customization
• Universal Agent – the Swiss Army knife of agents
  • You can create UA applications (.mdl metafiles)
  • You can interface with your enterprise using API services, Sockets, Files, Scripts, HTTP URLs, Post, ODBC, SNMP
  • This covers a lot of ground, and gives you immense flexibility
  • Multiple applications can run in a single UA instance
  • We will take a detailed look at the UA further on
IBM Tivoli Monitoring 6.2.1 - Advanced features and customization
• Workflows and automation
  • Automate tasks to be carried out under certain conditions
  • Start, stop policies and situations; send SNMP traps
IBM Tivoli Monitoring 6.2.1 – Problem Determination
• What to do when you have problems
  • Look up the documentation for error codes; it lists the actions to take
  • See if the logs tell you what’s wrong
  • Use the IBM Support Assistant (ISA) to analyze logs and find solutions: http://www-01.ibm.com/software/support/isa/
  • IBM’s product Information Centers
  • Troubleshooting docs, Technotes, FAQs on the Support website
  • Contact IBM Support: 1800-425-6666 (Toll Free) or +91 080-26788970
  • Set traces as advised
The ITM 6.x Universal Agent
• Generic, configurable agent
  • Customizable
  • Extensible
  • Flexible
  • Cross-platform
• Brings ITM-enabled VCA (Visibility, Control, Automation) to your custom data
• Helps with custom, exotic scenarios for which an agent doesn’t already exist
• Do-It-Yourself agents for data from many sources which can’t be “traditionally” monitored
• Does not replace the standard agents, but complements their capabilities
• Powerful tools to build solutions that work for you when you want more from ITM
How everything fits together
[Diagram: TEP <- TEPS <- (R)TEMS <- UA <- DP <- [data]]
Major UA components and steps
• The data providers
  • Eight types
• The metafile(s) – defining your UA application
  • <ua-appname>.mdl
• The process of getting your UA application to run
  • Validation
  • Importing
  • Refreshing (if you change your application)
  • kumpcon, um_console, take action commands
Data Providers – the interfaces to the UA
Data Providers are the interfaces that the Universal Agent uses to enable data collection from external sources.
API | File | HTTP | ODBC | Post | Script | SNMP | Socket
API: Application Programming Interface Server DP
• Supports API client functions
• You can develop scripts interfacing with the UA
• You can write C/C++ code interfacing with the UA
• You can use the manual CLI to interact with (a subset of) the API
Minimal requirements
• Library + accompanying header
• libc and the socket API, plus any other libraries required by your code
API Data Provider (contd)
• Functions and commands (library + binaries) that you can call
Functions related to
• Defining data
• Formatting data
• Transferring data
• Requesting activity
• Status
File Data Provider
Monitors data in text files written out sequentially, typically custom logs
• Can monitor dynamic file names given a defined pattern
• You may use some regular expressions to filter for specific keywords in the logs
• Must run locally on the box where the files are monitored
• Monitored files cannot be > 4 GB in size
• You may, however, “map” files from remote filesystems (using NFS, SMB, etc.) so that they are seen as a local resource, if you want to monitor logs centrally
HTTP Data Provider
HTTP URL monitoring
• Checks HTTP protocol status and availability of a URL
• Is proxy-aware
• Can alert you on availability of a particular link
• This is one DP for which you have a predefined workspace, and no metafiles are needed
• Relatively quick and simple to set up
• For more complex scenarios, you can use the other DPs
ODBC Data Provider
Open Database Connectivity – query any ODBC-compliant DB via the ODBC client/library provided with the DB
• Note: this one is Windows-only!
• You can run custom SQL queries
• You can raise alerts based on the values returned
• Since ODBC clients work across the network, you can obtain data from different boxes over the network
Post Data Provider
Helps you send (“post”) messages to the UA to display on the TEP
• This is actually a Socket DP with a fixed metafile
• The metafile can be overridden if required
• Listens on both TCP and UDP (port 7575)
• Ten predefined message types
• You can raise alerts based on the values returned
• kumpsend is the equivalent command
Script Data Provider
Runs a script and collects the data it writes to stdout
• Any data that can be returned by a script
• Not just shell, but perl, python, php – as long as it returns formatted data via its stdout
• You get to tie in binaries already on your box – stuff like awk, netstat and such
• Script is run at regular intervals
• You can access environment variables that the script has access to (we will discuss an example)
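A minimal sketch of the kind of script the Script DP could run: it ties in a binary already on the box (df) and prints one delimiter-separated row per filesystem on stdout. The field layout (filesystem;mount point;percent used) and the function name are assumptions for illustration; a matching metafile would need ';' as its delimiter and three attributes in the same order. Mount points containing spaces are not handled.

```shell
#!/bin/sh
# Convert `df -P` style output into "filesystem;mountpoint;used_percent" rows.
# Skips the header line and strips the trailing % from the capacity column.
format_df() {
    awk 'NR > 1 { sub(/%$/, "", $5); print $1 ";" $6 ";" $5 }'
}

# What the data provider would see on the script's stdout.
df -P | format_df
```

Keeping the formatting in its own function makes it easy to check the output against sample input before pointing a metafile at the script.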
SNMP Data Provider
Integrates ITM with the SNMP world. Useful for devices like routers and networked storage, which can speak SNMP
• Collect enterprise MIB data
• Monitor any SNMP traps sent to the data provider
• Perform SNMP SET operations
• Manage and configure SNMP-enabled devices in your enterprise
• Collect historical statistics from your SNMP-enabled devices
• MibUtility: http://www-01.ibm.com/software/brandcatalog/portal/opal/details?catalog.label=1TW10TM3P
Socket Data Provider
Uses standard TCP and UDP sockets to communicate and pass data to the UA
• Listens on both TCP and UDP (port 7500, or KUMP_DP_PORT)
• You can write code in any network-aware language (on the sending side) and connect to this data provider
• For example, use IO::Socket::INET from Perl to connect to this DP
• This is helpful when you can’t run an agent on your monitored platform, but can connect to the UA over TCP or UDP (say, Mac OS)
• You can specify your own metafile and send custom data to the UA listening on a specified socket
• You can raise alerts based on the values returned
• kumpsend is the equivalent command when using a script on the sending side
• The Post DP is a special case of this DP
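A hedged sketch of the sending side for the Socket DP, written in shell rather than Perl. It assumes a netcat binary (nc) is installed on the sender; the record layout (host;metric;value), the function names and the host name in the commented call are all hypothetical, and the fields must match the //ATTRIBUTES section of your own metafile. Only the record builder runs here; the send is left commented out because it needs a listening UA.

```shell
#!/bin/sh
# Build one delimiter-separated record. The three-field layout is hypothetical
# and must match the attribute list in your own metafile.
build_record() {
    printf '%s;%s;%s\n' "$1" "$2" "$3"
}

# Send one record to the Socket DP over TCP.
# Assumes nc (netcat) is available; -w 2 gives up after two seconds of silence.
send_record() {
    ua_host=$1
    ua_port=$2
    shift 2
    build_record "$@" | nc -w 2 "$ua_host" "$ua_port"
}

build_record mac01 cpu_idle 87      # prints mac01;cpu_idle;87
# send_record ua.example.com 7500 mac01 cpu_idle 87
```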
The metafile
• Defines separate UA apps
  • Name of the application
  • Name of each of the attribute groups that comprise the application
  • Source or sources of the data in each attribute group
  • Names and characteristics of the individual attributes
• Helps define data and formats
• Helps generate application support during validate/import/refresh
• This is NOT the UA configuration file!
• This is a file that defines a specific set of data being collected by the UA
• One UA can concurrently run more than one application
A very simple example UA application from OPAL that monitors password expiry – the metafile
• Metafile: password.mdl
http://www-01.ibm.com/software/brandcatalog/portal/opal/details?catalog.label=1TW10TM70

//APPL ExpiryPassword
*******************************************************************************
*V 1.0 Password Expiration Monitoring solution
******************************************************************************/
//NAME PasswordExpiry K 7200 AddTimeStamp
//SOURCE SCRIPT password.sh Interval=3600
//ATTRIBUTES ';'
User D 32 KEY Atomic
PasswordAge C 999999
MaximumAge C 999999
DaysToExpire C 999999
LastPasswordChange D 128
PasswordExpiryDate D 128
A very simple example UA application (contd) – the script
• script: password.sh

#!/bin/sh
# Emit one ';'-separated row per account whose password has a maximum age set.
# /etc/shadow fields used: 1 = user, 3 = last change (days since epoch), 5 = maximum age (days).
for line in `cat /etc/shadow`
do
  MAXAGE=`echo $line | cut -d: -f5`
  if [ "$MAXAGE" != 99999 ]; then
    USER=`echo $line | cut -d: -f1`
    CURRENT_EPOCH=`echo $line | cut -d: -f3`
    # Today, expressed in days since the epoch
    EPOCH=`/usr/bin/perl -e 'print int(time/(60*60*24))'`
    AGE=`echo $EPOCH - $CURRENT_EPOCH |/usr/bin/bc`
    MAX=`echo $line | cut -d: -f5`
    EXPIRE=`echo $MAX - $AGE |/usr/bin/bc`
    CHANGE=`echo $CURRENT_EPOCH + 1 |/usr/bin/bc`
    LSTCNG="`perl -e 'print scalar localtime('$CHANGE' * 24 *3600);'`"
    EXPD=`echo $EPOCH + $EXPIRE |/usr/bin/bc`
    EXPDATE="`perl -e 'print scalar localtime('$EXPD' * 24 *3600);'`"
    if [ "$EXPIRE" -lt 0 ]; then
      EXPIRE=`echo "PASSWORD EXPIRED"`
    fi
    echo "$USER;$AGE;$MAX;$EXPIRE;$LSTCNG;$EXPDATE "
  fi
done
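The arithmetic behind the PasswordAge and DaysToExpire values can be sketched in isolation. All dates are in days since the epoch, as in /etc/shadow; the function names are made up for illustration, and days_since_epoch assumes a date command that supports +%s (GNU and BSD date do), where the original script uses perl instead.

```shell
#!/bin/sh
# Today as a day count since the epoch (assumes `date +%s` is supported).
days_since_epoch() {
    echo $(( $(date +%s) / 86400 ))
}

# Days left before a password expires.
# Arguments: last change day, maximum age in days, today's day count.
password_days_left() {
    last_change=$1
    max_age=$2
    today=$3
    age=$(( $today - $last_change ))
    echo $(( $max_age - $age ))
}

password_days_left 14000 90 14050    # age is 50 days, prints 40
password_days_left 14000 30 14050    # prints -20, i.e. already expired
```

A negative result corresponds to the case where the script replaces the number with the text "PASSWORD EXPIRED".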
A very simple example UA application (contd) – installation
• Ensure that the SCRIPT data provider is enabled on your Universal Agent. Example: KUMA_STARTUP_DP=asfs
• Copy the password.mdl file to the ITMHOME/<arch>/um/metafile directory
• Copy the password.sh file to the ITMHOME/<arch>/um/scripts directory
• Ensure that the password.sh file has execute permission for the user account running the Universal Agent
• Next, import the metafiles. Example: on Linux/UNIX, the command is "bin/kumpcon import *.mdl", or you can use the um_console command. To run the um_console command, you must first set the CANDLEHOME environment variable
• After successfully importing the password.mdl file, you can see the result in the TEP console. Once the Universal Agent is up and running, you can customize the queries and workspaces and create Situations
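The copy and permission steps above can be sketched as a small staging function. The directory names are taken as written on the slide and should be checked against your own installation; the function name, its arguments and the paths in the commented call are illustrative only. The import itself is left as a manual step, since it needs a running Universal Agent.

```shell
#!/bin/sh
# Stage a UA application: copy the metafile and script into place and make
# the script executable. Directory names follow the slide; verify them locally.
# Arguments: ITM home, architecture directory, metafile path, script path.
stage_ua_app() {
    itm_home=$1
    arch=$2
    mdl=$3
    script=$4
    meta_dir="$itm_home/$arch/um/metafile"
    script_dir="$itm_home/$arch/um/scripts"
    # Refuse to guess: both target directories must already exist.
    [ -d "$meta_dir" ] && [ -d "$script_dir" ] || return 1
    cp "$mdl" "$meta_dir/" || return 1
    cp "$script" "$script_dir/" || return 1
    chmod +x "$script_dir/$(basename "$script")"
}

# Example call (paths are placeholders):
# stage_ua_app /opt/IBM/ITM "<arch>" ./password.mdl ./password.sh
# Then import the metafile as described on the slide: bin/kumpcon import *.mdl
```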
The ITM 6.x Universal Agent • Questions?
The ITM 6.x Universal Agent • Thanks!
ITM 6.x / UA Backup slides
The ITM 6.x Universal Agent
• //APPL
  • Name of the application
  • The first three characters of the application name must be unique in the enterprise
  • Enable summarization and pruning of warehoused data by using the WHEN parameter
• //NAME
  • Name of each of the attribute groups that comprise the application
  • There must be at least one //NAME, and there can be up to 64
  • Specifies the nature of the data (Polled, Sampled, Event, Keyed)
  • Controls the interval for which the data is available to the agent (TTL); the default is 300 seconds
The ITM 6.x Universal Agent
• //SOURCE
  • Sources of the data in each attribute group
  • When used, it follows immediately after //NAME
  • Not required for API and SNMP metafiles
  • Supports “run-as” user switching for SCRIPT and ODBC (on Windows) Data Providers
• //ATTRIBUTES
  • Names and characteristics of the individual attributes
  • Specifies the attribute delimiter in the data string
  • An attribute group can contain a maximum of 63 attributes
  • Can override with KIB_MAXCOLS – and set to a max of 127 attributes
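A quick way to check a metafile against the 63-attribute default is to count the attribute lines that follow //ATTRIBUTES. This is only a sketch: it assumes one attribute per line, treats lines starting with * as comments, and sums across groups, so it is meaningful for a metafile with a single attribute group. The function name and the sample lines are made up for illustration.

```shell
#!/bin/sh
# Count attribute definitions in a metafile read from stdin.
# Counting starts after //ATTRIBUTES and stops at the next // control statement.
count_attributes() {
    awk '
        /^\/\/ATTRIBUTES/ { in_attrs = 1; next }
        /^\/\//           { in_attrs = 0 }
        in_attrs && NF > 0 && $0 !~ /^\*/ { n++ }
        END { print n + 0 }
    '
}

printf '%s\n' '//NAME Demo K 300' "//ATTRIBUTES ';'" 'User D 32 KEY' 'Age C 999999' |
    count_attributes      # prints 2
```

With a real file the call would be `count_attributes < password.mdl`, and the result can be compared against 63 (or the KIB_MAXCOLS value in use).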
The ITM 6.x Universal Agent
• P: Polled (default). Polled data becomes available periodically and only the latest set of values is available for situation monitoring and reporting.
• S: Sampled. Sampled data behaves in the same way as polled data except that more than one set of attribute data values can be available for use.
• E: Event. Event data occurs unpredictably and is reported in asynchronous fashion as soon as the data becomes available.
• K: Keyed. Keyed data behaves in the same way as sampled data, but allows you to correlate events. You can designate up to five attributes in each group as key attributes.
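To connect the type letters to a metafile, the sketch below decodes a //NAME line of the shape used in the password example (keyword, group name, type letter, TTL). That field order is inferred from that one example line only, so other forms of //NAME may not parse correctly; the defaults of P and 300 seconds come from the slides, and the function name is made up.

```shell
#!/bin/sh
# Decode a //NAME line of the form: //NAME <group> <type> <ttl> [options...]
# Missing fields fall back to the documented defaults (P, 300 seconds).
describe_name_line() {
    # Split the line into words; $1 becomes the literal //NAME keyword.
    set -- $1
    group=$2
    dtype=${3:-P}
    ttl=${4:-300}
    case $dtype in
        P) kind=Polled ;;
        S) kind=Sampled ;;
        E) kind=Event ;;
        K) kind=Keyed ;;
        *) kind=Unknown ;;
    esac
    echo "group=$group type=$kind ttl=$ttl"
}

describe_name_line '//NAME PasswordExpiry K 7200 AddTimeStamp'
# prints group=PasswordExpiry type=Keyed ttl=7200
```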