Rational Configuration Design
To Prevent Irrational Problem Solving
John Murphy
Introduction
Basic
• Contacts
• Hosts
• Services
Advanced
• Parents and dependencies
• Managing exceptions
• Automation
Contacts
Contact
• Contact address for support.
• Email, SMS, Ticketing, etc.
User
• Login account for an actual user.
• No contact information.
Contacts
Contact Definition

    define contact {
        name                           contact-user
        host_notifications_enabled     1
        service_notifications_enabled  1
        host_notification_period       24x7
        service_notification_period    24x7
        host_notification_options      d,u
        service_notification_options   c
        host_notification_commands     notify-h-email
        service_notification_commands  notify-s-email
        register                       0
    }

    define contact {
        contact_name   cu-contact
        contactgroups  cg-main
        email          servers@domain.com
        use            contact-user
    }

    define contactgroup {
        contactgroup_name     cg-main
        alias                 Kmart Contact
        contactgroup_members  vg-team
    }
Contacts
User Definition

    define contact {
        name                           read-contact
        host_notifications_enabled     0
        service_notifications_enabled  0
        host_notification_period       none
        service_notification_period    none
        host_notification_options      n
        service_notification_options   n
        host_notification_commands     check_none
        service_notification_commands  check_none
        register                       0
    }

    define contact {
        contact_name   vu-jsmurphy
        contactgroups  vg-team
        use            read-contact
    }

    define contactgroup {
        contactgroup_name  vg-team
        alias              Kmart Team
    }

    define contactgroup {
        contactgroup_name     cg-main
        alias                 Kmart Contact
        contactgroup_members  vg-team
    }
Contacts
LDAP/AD For Nagios Core

    ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
    <Directory "/usr/local/nagios/sbin">
        SetEnv TZ "Australia/Melbourne"
        Options ExecCGI
        AllowOverride None
        Order allow,deny
        Allow from all
        AuthName "Nagios Core"
        AuthType Basic
        # AuthUserFile /usr/local/nagios/etc/htpasswd.users
        # Require valid-user
        AuthBasicProvider ldap
        AuthName "Nagios server"
        AuthzLDAPAuthoritative off
        AuthLDAPBindDN "CN=bindAccount,OU=User,DC=domain,DC=com"
        AuthLDAPBindPassword xxxxxxxxx
        AuthLDAPURL ldaps://domain.com/OU=User,DC=Domain,DC=com?sAMAccountName?sub?(objectClass=user)
        AuthLDAPGroupAttribute member
        AuthLDAPGroupAttributeIsDN on
        Require ldap-group CN=NagiosAccessGroup,OU=Groups,DC=domain,DC=com
    </Directory>
Contacts Summary
• Distinguish between your users and your contacts.
• Use an existing authentication source for your user logins.
• Consider the end-user experience… try to ensure it’s easy to get the information they need.
Hosts
• Focus on minimizing host configuration to make automation easier.
• Use templates to assign user view information.
• Create host groups based on shared monitoring profiles.
Hosts
Host Definitions

    define host {
        name                   srv-template
        alias                  Server host template
        check_command          check_icmp!250.0,60%!500.0,80%
        max_check_attempts     3
        check_interval         10
        retry_interval         2
        check_period           24x7
        contact_groups         cg-main
        notification_interval  60
        notification_period    24x7
        notification_options   d,f
        notifications_enabled  1
        register               0
    }

    define host {
        host_name   exchange01
        use         srv-template
        alias       Exchange server
        address     exchange01
        parents     switch001,switch002
        hostgroups  srv-exchange, srv-windows
        icon_image  exchange.png
        register    1
    }

    define hostgroup {
        hostgroup_name  srv-windows
        alias           Windows group
    }
Hosts Summary
• Minimize configuration in host objects to make automation easier.
• Hostnames allow for easier maintenance than IP addresses.
• Create logical host-groupings that will make service assignment easier e.g. OS type, Location, Applications it serves.
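The point about keeping host objects minimal can be illustrated with a small generator. This is only a sketch under stated assumptions: the `render_host` function and the inventory record layout (`name`, `parents`, `hostgroups`) are hypothetical and not part of Nagios. The directive names and the `srv-template` template come from the host definitions in the slides.

```python
def render_host(record, template="srv-template"):
    """Render one Nagios host object from a small inventory record.

    `record` is a hypothetical dict. Everything not listed here is
    expected to be inherited from the template named in `use`.
    """
    directives = [
        ("host_name", record["name"]),
        ("use", template),
        # Use the hostname as the address, as the slides suggest.
        ("address", record["name"]),
    ]
    # Optional directives are only emitted when the record supplies them.
    if record.get("parents"):
        directives.append(("parents", ",".join(record["parents"])))
    if record.get("hostgroups"):
        directives.append(("hostgroups", ",".join(record["hostgroups"])))

    # Align values in a column so the generated file stays readable.
    width = max(len(key) for key, _ in directives)
    body = "\n".join(
        f"    {key.ljust(width)}  {value}" for key, value in directives
    )
    return "define host {\n" + body + "\n}\n"


example = render_host({
    "name": "exchange01",
    "parents": ["switch001", "switch002"],
    "hostgroups": ["srv-exchange", "srv-windows"],
})
print(example)
```

Because the template carries the check, notification, and contact settings, the generator needs to know only a handful of per-host facts, which is what makes this kind of script short.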
Services
• Keep services as generic as possible to prevent the need for duplicate services.
• Minimizing service templates allows for easier management and baseline changes.
• Use service groups for applications.
Services
Service Definitions

    define service {
        name                   main-service-template
        service_description    main service template
        max_check_attempts     3
        check_interval         10
        retry_interval         2
        check_period           24x7
        notification_interval  60
        notification_period    24x7
        notification_options   c
        register               0
    }

    define service {
        service_description  Windows C: usage
        use                  main-service-template
        hostgroup_name       srv-windows,srv-v-windows
        check_command        check_nt!USEDDISKSPACE!-w 80 -c 90
        contact_groups       cg-main,cg-main-SMS
        register             1
    }
Services Summary
• Strike a balance between your service templates and your service definitions.
• Service groups are very useful when used appropriately; used inappropriately, they are an administrative burden.
• Device life cycles happen; ensure your configuration isn’t burdened by over-complexity.
Good Parenting (or how to not get woken up 20 times at ~3am)
Parenting
• Use host parenting.
• Use host parenting.
• Use host parenting.
Service Dependencies
• Parent indirectly monitored services with service dependencies.
Indirect Services
…And the art of dependencies
A typical ESX monitoring setup…
Q. But what happens when the vSphere server fails?
A. Something like this
Indirect Services
…And the art of dependencies

    define servicedependency {
        dependent_hostgroup_name       srv-v-windows
        dependent_service_description  CPU Usage
        host_name                      vSphereServer
        service_description            Ping dependency
        inherits_parent                1
        execution_failure_criteria     w,u,c,p
        notification_failure_criteria  w,u,c
        dependency_period              24x7
    }

    define service {
        host_name            vSphereServer
        service_description  Ping dependency
        use                  main-service-template
        check_command        check_ping!100,80%!200,90%
        register             1
    }

    define service {
        service_description  CPU Usage
        use                  main-service-template
        hostgroup_name       srv-v-windows
        check_command        check_esx!CPU
        contact_groups       cg-main
        register             1
    }
Managing Exceptions
• Clearly label exceptions in your config.
• Make sure you can use the same solution again if necessary.
Image by Mike Bade: http://robotseatingpies.blogspot.com.au/2011/06/robots-dont-have-feelings_16.html
Automation (or intrapreneurship ideas for the lazy)
Every piece of infrastructure is a potential data source… make use of it!
• AD/LDAP servers.
• Virtual infrastructure APIs.
• Patching systems.
• Asset databases.
• Network management platforms.
• Network LLDP/CDP tables.
• SNMP-enabled servers.
• Help, I’m running out of space!
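As one illustration of treating infrastructure as a data source, the sketch below derives host `parents` from a neighbor table. It is not taken from the talk. It assumes the LLDP/CDP data has already been exported as simple (device, neighbor) pairs, a hypothetical format, and the function names are illustrative. The host and switch names reuse those from the host definitions in the slides.

```python
from collections import defaultdict


def parents_from_neighbors(neighbors, network_devices):
    """Map each monitored host to the network devices it is attached to.

    `neighbors` is a list of (device, neighbor) pairs as they might be
    exported from LLDP/CDP tables (hypothetical format). Only links from
    a known network device to a non-network host become parent links.
    """
    parents = defaultdict(set)
    for device, neighbor in neighbors:
        if device in network_devices and neighbor not in network_devices:
            parents[neighbor].add(device)
    # Sort hosts and parents for stable, diff-friendly output.
    return {host: sorted(devs) for host, devs in sorted(parents.items())}


def parents_directive(host, parent_map):
    """Return the `parents` line for a host, or an empty string."""
    devs = parent_map.get(host)
    return f"parents {','.join(devs)}" if devs else ""


neighbor_table = [
    ("switch001", "exchange01"),
    ("switch002", "exchange01"),
    ("switch001", "switch002"),  # switch-to-switch link, ignored here
]
parent_map = parents_from_neighbors(neighbor_table, {"switch001", "switch002"})
print(parents_directive("exchange01", parent_map))
# prints: parents switch001,switch002
```

Switch-to-switch links are deliberately skipped in this sketch, because a neighbor table alone does not say which switch is upstream of the other. That ordering would have to come from another source or from a manually maintained exception.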
Nagios World Conference Thanks For Listening!