Shifters browse shift history, monitor site performance, file tickets, manage tasks, escalate issues, and maintain a shift activity log. Efficient communication and a clear escalation process allow quick issue resolution in a highly automated production environment.
Typical Shift Plan
• Browse recent shift history
• Check performance of all sites
• File tickets for new issues
• Continue interactions about old issues
• Check status of current tasks
• Check all central processing tasks
• Monitor analysis flow (not individual tasks)
• Overall data movement
• File software (validation) bug reports
• Check Panda, DDM health
• Maintain elog of shift activities
Some Communications Issues
• Monitoring: primary goal of ADCoS team
• When problems are found
  • Elog entry must be opened first
  • If software error, file validation Savannah bug report
  • If task error, file operations Savannah bug report
  • If site error, file GGUS (or RT) bug report
  • If serious error, inform/consult ADCoS mailing list
  • If DDM error, inform DDM on-call expert (through mailing list)
  • If central services problem, inform central services on-call expert
• Interventions
  • If a site is in downtime or failing badly for many hours, shifters can set the site to offline (please leave the elog ticket open until the site is set online)
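The routing rules above can be summarized as a small lookup, sketched here in Python. This is only an illustration of the decision flow: the `Problem` enum, the `ROUTES` table, and the `escalation_steps` function are hypothetical names, and the strings are plain descriptions of the actions listed on the slide, not calls to any real Savannah, GGUS, or elog API.

```python
from enum import Enum


class Problem(Enum):
    """Problem categories named on the slide (hypothetical enum)."""
    SOFTWARE = "software"
    TASK = "task"
    SITE = "site"
    DDM = "ddm"
    CENTRAL_SERVICES = "central_services"


# Action for each problem category, transcribed from the slide.
# These are human-readable descriptions, not real endpoints.
ROUTES = {
    Problem.SOFTWARE: "file validation Savannah bug report",
    Problem.TASK: "file operations Savannah bug report",
    Problem.SITE: "file GGUS (or RT) bug report",
    Problem.DDM: "inform DDM on-call expert (through mailing list)",
    Problem.CENTRAL_SERVICES: "inform central services on-call expert",
}


def escalation_steps(problem, serious=False):
    """Return the ordered actions a shifter takes for a given problem."""
    # The elog entry always comes first, whatever the problem type.
    steps = ["open elog entry", ROUTES[problem]]
    # Serious errors additionally go to the ADCoS mailing list.
    if serious:
        steps.append("inform/consult ADCoS mailing list")
    return steps


if __name__ == "__main__":
    for step in escalation_steps(Problem.SITE, serious=True):
        print(step)
```

Keeping the elog entry as a fixed first step, outside the lookup table, mirrors the slide's rule that it must be opened before any ticket is filed.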
Monitor-Escalate-Follow up
• We expect production to be maximally automated
• Shifters' primary function: monitor
• Any problems discovered: escalate
• Continue to follow up all open issues
• Types of escalation:
  • Site problems
  • Service problems (Panda, DB, DDM…)
  • Software bugs
  • Workflow (tasks, exercises…)