
Incident Reporting - This is a guide to the life of an Incident from first report to closed record. This is a process we already use but we need more consistency.

sylvie

Presentation Transcript


  1. Incident Reporting - This is a guide to the life of an Incident from first report to closed record. This is a process we already use, but we need more consistency. From the earliest possible time, each incident should be encapsulated in a transportable package which can be picked up and understood by an Engineer on the next shift, the Lead Engineer, Encompass Management, a Supplier Support Department or the customer.

     Basic Information Required For Each Ticket

     Following the initial report in Sharepoint, there is a minimum of information which should be available to anyone looking at the case.
     • Channels affected :- We have VOD Demand 5 available now for VOD cases; some may be just ‘Internal’ or ‘External’.
     • Services affected :- Engineering refer to TX chains, Operations refer to Client designated names; Sharepoint uses both to reduce confusion.
     • TX-Sides affected :- This is very significant for Engineering, as the equipment and logs are specific to TX-Sides.
     • Material IDs affected :- Basic investigation involves searching logs for Mat IDs to find errors and previous TX timings.
     • Clients affected :- Simple field; we have ‘Internal’ or ‘External’ for MCR and other non-precise clients.
     • Time & Date of Incident :- Significant for obvious reasons. A timetable of the start, development and end of the outage is useful.
     • TCs on Duty :- Shifts change and first hand reports are most useful.
     • DOMs on Duty :- Shifts change and first hand reports are most useful.
     • Engineers on Duty :- Shifts change and first hand reports are most useful.
     • Supplier Case No. :- If it is escalated, it should be tracked in the Sharepoint ‘Supplier Cases’ list, referenced by Supplier Case Number.
     • Ticket Status :- This field should be kept up to date; nothing should be ‘New’ for more than an hour.

     Most of this should come from TX Operations. The new Incident form has nine required fields for anyone opening an Incident, so most of this information should be available from the start, making prioritisation and assignment easier. However, this information will come from a variety of sources. Incident tickets can be opened by anyone, not just TX; it could be Ingest, MCR or clients opening tickets, so some tickets will be opened with little more than a brief description and guesses at what the required fields should contain.

     Incident Tickets which have been escalated to a Supplier still need detailed Engineering input. They are still Incidents, but they are incidents we cannot resolve without help. The Sharepoint Supplier Case list will track the issue and record the progress, but the Service Desk needs to follow up and respond to Supplier requests. Escalation should not be seen as First Line resolution; Suppliers are there to assist our work, not take it off our hands.

     There are two views in the Suppliers Case list particularly for Lead and Support Engineers:
     Supplier Problems Overview - Encompass Support - Priorities | Encompass Support - To Close
     • ‘Encompass Support – Priorities’ is the filtered view of cases where the action is with Encompass; these may need logs or information.
     • ‘Encompass Support – To Close’ is the filtered list of cases where a resolution has been deployed and we are monitoring the results ready for closure.
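The minimum ticket information described on this slide can be sketched as a simple record with a completeness check. This is only an illustration: the field names, the `missing_fields` and `stale_new` helpers and the choice to treat the Supplier Case No. as optional are assumptions made for the example, not the actual Sharepoint column names or form rules.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime, timedelta
from typing import List, Optional


# Hypothetical record layout; names are illustrative, not Sharepoint columns.
@dataclass
class IncidentTicket:
    description: str
    channels: List[str] = field(default_factory=list)
    services: List[str] = field(default_factory=list)
    tx_sides: List[str] = field(default_factory=list)
    material_ids: List[str] = field(default_factory=list)
    clients: List[str] = field(default_factory=list)
    incident_time: Optional[datetime] = None
    tcs_on_duty: List[str] = field(default_factory=list)
    doms_on_duty: List[str] = field(default_factory=list)
    engineers_on_duty: List[str] = field(default_factory=list)
    supplier_case_no: Optional[str] = None  # only set once escalated
    status: str = "New"
    created: datetime = field(default_factory=datetime.now)


# Assumption: not every incident is escalated, so this field may stay empty.
OPTIONAL_FIELDS = {"supplier_case_no"}


def missing_fields(ticket):
    """Return the names of basic fields that are still empty."""
    missing = []
    for f in fields(ticket):
        if f.name in OPTIONAL_FIELDS:
            continue
        value = getattr(ticket, f.name)
        if value is None or value == "" or value == []:
            missing.append(f.name)
    return missing


def stale_new(ticket, now, limit=timedelta(hours=1)):
    """True if the ticket has sat in 'New' for longer than the limit."""
    return ticket.status == "New" and now - ticket.created > limit
```

An Engineer picking up a ticket opened with only a brief description could run `missing_fields` on it to see what still needs to be gathered before handover or escalation.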

  2. Escalation of Incidents

  3. This is a guide to Normal / Low priority Incidents, broken down into three stages:
     • 1st Stage - Recognise and diagnose issue, Support Resolution - All involved
     • 2nd Stage - Lead Engineers Investigation - Lead Engineers
     • 3rd Stage - Supplier Investigation and resolution - Suppliers plus Leads and Support collaborating

     1st Stage (5 to 10 minutes) is primarily recognising and diagnosing issues. It is broken down into 4 actions:
     • 1. Awareness of new issues - Carrying the phone & monitoring the list.
     • 2. Management of new issues - Prioritising & assigning tickets.
     • 3. Initial Investigation - Identify equipment / asset involved, gather logs, check for alarms.
     • 4. Resolution by Support Engineer - If resolved, get Operations sign off and update Sharepoint.
     DECISION TO ESCALATE TO LEAD ENGINEER / SUPPLIER - is made now, 10 to 30 minutes into incident.

     2nd Stage (60 to 90 minutes) is Lead Engineer Investigation. Escalation should be a handover of a prepared case from Support.
     • 1. Detailed Investigation by Lead - Including invasive work on live systems in Emergency.
     • 2. Check Logs, Check alarms - Detailed analysis of available information.
     • 3. Resolution by Lead Engineer - If resolved, get Operations sign off and update Sharepoint.
     DECISION TO ESCALATE TO SUPPLIER - is made now, 60 to 90 minutes into incident.

     3rd Stage (Timing depends on Supplier SLA) - Dial-in, RMA, Site visit. All with Support involvement and update feedback.

     Encompass Incident Priorities
     • Critical Incident - PHONE CALL - Off Air, No output, No options, slide to air or completely off scheduled output - Escalate to ALL NOW.
     • High Priority - PHONE CALL - Will be off air within eight hours - Escalate to Leads immediately and Supplier within 30 mins.
     • Normal Priority - Update Sharepoint with initial findings, escalate to Lead / Supplier within 30 minutes if on-going.
     • Low Priority - Update Sharepoint with initial findings, undertake investigation, escalate to Lead after 24 hours.
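The priority rules on this slide can be written down as a small lookup table. The sketch below is a hedged illustration only: the names `ESCALATION_RULES`, `escalations_due` and `needs_phone_call` are invented for the example, and it assumes "within 30 minutes" is treated as a deadline at which the escalation becomes due if the incident is still on-going.

```python
from datetime import timedelta

# Who to escalate to, and how long after the incident starts (per the slide).
ESCALATION_RULES = {
    "Critical": [("All", timedelta(0))],
    "High": [("Lead", timedelta(0)), ("Supplier", timedelta(minutes=30))],
    "Normal": [("Lead / Supplier", timedelta(minutes=30))],
    "Low": [("Lead", timedelta(hours=24))],
}

# Priorities which the slide marks as PHONE CALL.
PHONE_CALL_PRIORITIES = {"Critical", "High"}


def needs_phone_call(priority):
    """True if the priority level calls for a phone call."""
    return priority in PHONE_CALL_PRIORITIES


def escalations_due(priority, elapsed, resolved=False):
    """List the parties who should have been brought in by now.

    Assumption: a resolved incident needs no further escalation.
    """
    if resolved:
        return []
    return [who for who, delay in ESCALATION_RULES[priority] if elapsed >= delay]
```

For example, `escalations_due("High", timedelta(minutes=10))` lists only the Lead, while the same call at 30 minutes also lists the Supplier.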

  4. Incident Flow – Graphical view of Incident Priorities

  5. Sharepoint Problem Management Site – Lists and Configuration Database for Support

     Sharepoint is divided into Sites which can be seen along the horizontal bar near the top of the screen. From here you can access ‘VMP’ for VOD information, and the ‘Engineering’ and ‘Playout Operations’ sites.

     In Support we own and maintain the ‘Problem Management’ site as seen above. The Homepage is split into different areas:
     • The sidebar on the left will take you to different areas within the site.
     • The Knowledge Base in the middle is a view of the ‘Knowledge Base’ list, which can also be accessed from the sidebar.
     • The section on the right is where new documents and news are uploaded; there are also quick links to useful and commonly used areas / lists / documents.
     • The top centre section gives client filtered composite views of Active Incidents, Problems and NOW’s for each client, plus links to specific areas of the Configuration Database relevant to each client. Supplier filtered views of active Incidents are available from here too.

  6. Sharepoint for Engineers

     Guide to the Sharepoint Sidebar in Problem Management

     Daily Checks - The Knowledge Base, News, Useful Documents and Links
     • New Support Handover - Engineers Shift Handover
     • System Status - Routine Checks for shift Engineers
     • Dayshift Checks - Routine Checks for shift Engineers
     • Nightshift Checks - Routine Checks for shift Engineers

     Service Desk
     • Incidents ops View - List of tickets, for Operations
     • Engineers incident List - List of tickets, for Engineers
     • Engineers View - Composite View of Active Incidents, Problems and NOW’s
     • Created Past Week - Composite View of Incidents, Problems and NOW’s created in past 7 days regardless of current status
     • Active Outage - Incidents not closed which caused a noticeable effect to end user
     • Outage 30 days - All Incidents noticeable to end user in last 30 days
     • Incident Ageing - Active and Closed Incidents with ages from creation to last update
     • Completed Work - Composite view of closed Incidents, Completed Problems and Expired NOW’s

     Problem Management
     • Supplier Cases - The list of issues escalated to Suppliers, referenced by supplier case number
     • Suppliers Details - Supplier Contact Details
     • Known Error Database - Descriptions of frequently seen issues with details of investigations and resolutions
     • Service Improvement - Continual Service Improvement: wasteful and inefficient processes with workarounds / solutions
     • Knowledge Base - List of lessons learned, tips and information
     • System Changes - List of changes to the system not covered by NOW’s
     • Support RMA’s - List of RMA’s which have come through Support. Mike Glynn manages deliveries / dispatches and records.
     • Active NOW’s - NOW’s which have not expired, i.e. not passed their timeslot
     • NOWs - Up to date list of Notifications of Work sorted by ‘date of work’; sort by ‘created’ to see most recently uploaded
     • NOW Calendar - Calendar of Planned work; entering events here is a manual process, so it is not always up to date
     • Current Projects - Link to list of current Projects on the Engineering Site

     Support Config DB - Support Configuration Database: lists of equipment, IPs, firmware, serial numbers etc.
     • Morpheus CH5 - Channel 5 Morpheus, 2330 card IPs and devices, server applications
     • Morpheus SYS-4 - Disney / Info TV Morpheus, 2330 card IPs and devices, server applications
     • Morpheus SYS-5 - Disney Morpheus, 2330 card IPs and devices, server applications
     • GV Servers - List of Grass Valley K2’s with serial numbers, IP addresses, software and rack numbers
     • Omneon ports - List of Omneon Mediadeck serial numbers, playout ports, record ports, IP addresses
     • Axon Cards - The Axon Digital glue with frames, IP addresses, card types, firmware
     • Vertigos - List of Vertigos with IP addresses and firmware
     • LogoVisions - List of Logovisions with IP addresses and firmware versions
     • U4000 Subtitles - List of Screen U4000’s with IPs and HD Driver
     • VBI / VANC Insertion - Inserted data on the services, details of lines and data type
     • Pharos Map - Complete map of all the Pharos Hosts and Processes in Info centre with IPs and firmware
     • Archive LTO Drives - List of LTO Drives in the Archive with previous drives and spares listed by serial numbers
     • Volicon Loggers - List of Volicon Loggers with IP addresses, serial numbers and logins
     • EDML Perl Scripts - List of the Encompass perl scripts with host machine and description

  7. This Document is on the Problem Management Homepage with active hyperlinks – (Problem Management Support Status)

     Views to Check on a regular Basis
     • Incidents resulting in outage still to be closed – Live link to list
     • Problem cases in ‘monitoring’ state requiring Engineer sign off or re-opening – Live link to list
     • Problem Cases requiring Encompass Engineers action – Live link to list
     • Problem Cases – Action with Supplier – Live link to list

     Views to help monitor current state
     • Engineer View – Under Service Desk - Composite view of Engineers tasks, i.e. Active Incidents, Problems to Close & Problems requiring info
     • Created Past Week – Under Service Desk - See Incidents, NOW’s and Problems as they arrive regardless of status, i.e. open or closed
     • 30Day Outages – Under Service Desk - Sliding window of all Incidents in last 30 days which have caused an outage
     • Notification Of Works – On Home Page - NOW’s and Calendar together, I have colour coded NOW’s on Calendar

     Client View of Incidents, Problems and NOW’s from Home Page
     Status | Disney | Channel5 | Sony | Information TV | Filmflex | VOD 5.com | PB Channels |
     Use these for a snapshot of current Incidents, Problems and NOW’s for individual Clients.

     Past Performance Reports – From Home Page
     Last Week | Last Month | Operations Report | Problem Management Support Status History | Disney Performance | Channel 5 Performance | Sony Performance | Info TV Performance | Filmflex Performance

  8. Sharepoint for Engineers

  9. Sharepoint for Engineers

  10. Sharepoint for Engineers The Logovision link will tell you more: As will the U4000 link: And the Axon Link: The Axon Card screen leads to others. Clicking on the ‘Axon Card’, e.g. HEP100, gives the latest firmware available from the Axon website. Click on ‘Frames & Cards’ for current installed firmware versions.

  11. Sharepoint for Engineers The same principles work for the Sony services. They have some different equipment, so there are some different screens: Click on ‘POD 5 Sony’ for all the channels & services on POD 5: Click on TX27 AXN POL HD for more details: Plus-one services will have less equipment: GV Server tells you about the K2 servers: Data Insertion information is available for VBI and VANC from the Configuration Database: There are Vertigos on some Sony Services:

  12. Sharepoint for Engineers

      Other areas have our active lists and tasks. The ‘Engineers incident list’ in Service Desk is a filtered view of the full incident list. We try to enter all support requests in the incident list as a guide to the tasks the Service Desk is asked to undertake. The ‘Engineers incident list’ shows just the active tasks, not the closed or resolved cases; it also filters out the TX Alerts where the owner is not ‘Encompass’.

      I have recently gone through the list and added a client to all incidents, and updated the status to ‘Responded’ where there is a ‘reply’ but the status is still ‘NEW’. I add ‘Root Cause’ in the ‘TX Alerts – Encompass’ view as I use these in the performance analysis. ‘Alert2’ changes from ‘OK’ to ‘SLA’ if a NEW ticket is not modified within 4 hours.

      Keeping this list current and up to date is seen by senior management as a key indicator of the performance of the Service Desk and the Support Engineers. If it is used productively, it will form the basis of a solutions database containing useful information on first line diagnosis, investigation and resolution techniques.
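The ‘Alert2’ rule described on this slide can be illustrated with a few lines of Python. This is a sketch rather than the actual Sharepoint calculated column: the function name `alert2`, the `SLA_WINDOW` constant and the decision to flip to ‘SLA’ at exactly four hours are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Assumption: the 4 hour window is measured from the last modification.
SLA_WINDOW = timedelta(hours=4)


def alert2(status, last_modified, now):
    """Return 'SLA' for a NEW ticket left untouched for 4 hours, else 'OK'."""
    if status.upper() == "NEW" and now - last_modified >= SLA_WINDOW:
        return "SLA"
    return "OK"
```

Note that only tickets still in the NEW state can trip the flag, which is why updating the status to ‘Responded’ once there is a reply matters for the reported figures.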

  13. Sharepoint for Engineers

      The Suppliers Case List contains the support cases which have been escalated to Suppliers and are on-going. They are referenced by the Supplier’s case No. The links at the top of the page filter the list:
      • ‘Encompass Support – Priorities’ - these are the cases which are ‘Action with us’.
      • ‘Encompass Support – To Close’ - these are cases in a state of ‘Monitoring’; they are resolved and await Engineering sign off.
      • The links for Suppliers select the cases with the named supplier and include a brief summary of each case. These lists include cases in states of ‘In Progress’, ‘Deferred’, ‘Monitoring’ & ‘Completed’.
      • The links for clients do a similar thing for each named client...
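The filtered views of the Suppliers Case List behave like simple filters over one list of cases. The sketch below uses made-up rows to show the idea; the case numbers, supplier and client names, the `action_with` column and the exclusion of ‘Completed’ cases from the priorities view are all assumptions for illustration, not the real list schema.

```python
# Made-up sample rows; every value here is illustrative only.
CASES = [
    {"case_no": "SUP-001", "supplier": "Supplier A", "client": "Client X",
     "status": "In Progress", "action_with": "Encompass"},
    {"case_no": "SUP-002", "supplier": "Supplier A", "client": "Client Y",
     "status": "Monitoring", "action_with": "Supplier"},
    {"case_no": "SUP-003", "supplier": "Supplier B", "client": "Client X",
     "status": "Deferred", "action_with": "Supplier"},
    {"case_no": "SUP-004", "supplier": "Supplier B", "client": "Client Y",
     "status": "Completed", "action_with": "Encompass"},
]


def priorities_view(cases):
    """Cases where the action is with us (assumed to exclude Completed)."""
    return [c for c in cases
            if c["action_with"] == "Encompass" and c["status"] != "Completed"]


def to_close_view(cases):
    """Cases in 'Monitoring', resolved and awaiting Engineering sign off."""
    return [c for c in cases if c["status"] == "Monitoring"]


def supplier_view(cases, supplier):
    """All cases for one named supplier, whatever their state."""
    return [c for c in cases if c["supplier"] == supplier]


def client_view(cases, client):
    """All cases for one named client, whatever their state."""
    return [c for c in cases if c["client"] == client]
```

Each view is the same underlying list seen through a different filter, so keeping the status and action fields current on each case is what keeps every view accurate.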

  14. Sharepoint for Engineers

      Under Problem Management we also have:-
      • CSI – Continual Service Improvement, a list of support processes which could work better:
      • Changes – These are modifications to our systems and processes which do not require a NOW but we should know about:

  15. Sharepoint for Engineers

      • RMA’s – These are the ‘Return to Manufacturer Authorizations’ with reference numbers linked to Supplier problem cases:
      • Support Handovers are also found here. There is a handover from early shift to night shift and another from night shift to early shift, two per calendar day.

  16. I have added sections for ‘Known Errors’, EMC / Isilon, Central Storage, Script Servers and Current Projects recently

  17. EDML Broadcast Support Services
      • Who we are & what we do
      • The roles we have
      • The responsibilities we have

  18. Broadcast Support – Definitions and Aims

      Definition:

      Broadcast Support Service – The specific role ‘Support’ plays within the wider EDML business. We maintain the technical infrastructure, working closely with TX Operations and Ingest, with the intent of maximising the availability of the client services. To achieve this we run a 24/7 Service Desk, 365 days a year, which responds to incidents raised by users. The Service Desk Engineers are supervised by Lead Engineers who report to the Head of Support. Roles and responsibilities are defined in detail later in this document. We are answerable to the contractual obligations, which may vary from client to client, but our direct interaction is generally with internal users and 3rd party Suppliers.

      Client Service – This is anything EDML are delivering to clients under the terms of our contracts, including those things we charge for, such as ‘Support Services’, and also those which are a good will gesture or a generally accepted practice.

      Output - The ‘output’ in this sense is quite different to the ‘Service’ and shouldn’t be confused with it. We can work on the redundant side of a channel with no effect on the ‘output’, but the ‘service’ is compromised, as redundancy is something we are charging someone for but not delivering. This may seem like a subtle distinction but it is significant to the clients.

      Aims:

      Broadcast Support Aims – TX / Delivery. We aim to maintain the integrity of the signal from the MPEG decoder on the TX server to the decoded return. Much of this path is not our responsibility, but we do have a responsibility to monitor the whole path and report / escalate issues anywhere along its length.

      Broadcast Support Aims – Ingest / Content. We aim to maintain the integrity of the MAM / Archive interconnectivity from the point of delivery to the TX servers. This is a collaboration with I.T., as the workflow starts with outward facing connections and technologies such as Aspera and Signiant. From there a series of firewalls protect the broadcast network from the wider world. Media around the network is linked by a web of scripts written and maintained internally. Archiving extends to a second site at Drummond Street. VOD has recently been introduced as an Ingest activity, with Carbon Coders re-purposing assets for various platforms; these are PC based applications and again fall between Broadcast Support and I.T.

  19. Roles: Many tasks overlap, it is up to us how we manage these responsibilities.

      Broadcast Support Engineer:- This role has been loosely defined for years and can be interpreted in many ways. Some Broadcast Support Engineers come from a bench maintenance background, others come from Operations or IT backgrounds. Like many large scale international operators, we are moving Broadcast Support to a ‘Service Desk Operation’ with an ITIL philosophy. This is mainly due to changes in the industry as it moves towards more IT based solutions. There is also pressure from Clients and encouragement from Suppliers to move in this direction. Big companies, particularly those operating in the U.S., are expected to work within recognised ITIL guidelines. This common language allows joined up planning between ourselves, our clients, our suppliers and contractors.

      Service Desks have different methods of working. The ‘insourced’ Service Desk can be either skilled or unskilled. We have always been the equivalent of a skilled service desk, only escalating to suppliers when we need help. The unskilled service desk is a telephone messaging service which collects information and passes it on to skilled Engineers. The ‘outsourced’ service desk can also be unskilled or skilled. There are many professional outfits now offering ‘follow the sun’ service desk coverage, with skilled people situated around the world remotely offering dial-in and web-ex support sessions. These services can look very attractive to international companies, and these companies are our competition for support work.

      Specifically, from the recent ‘Engagement Model’, Broadcast Support Engineers should:-
      • Address priority support tickets as directed by support leads, according to SOPs and engineering policies.
      • Accurately reflect and record ticket status
      • Accurately complete shift handovers

  20. Roles: Many tasks overlap, it is up to us how we manage these responsibilities.

      Broadcast Lead Engineer:- This role is distinct from Senior Engineer or System Specialist in that there is an element of supervision and training. The Lead Engineer will have the experience and knowledge of a Senior Engineer but will additionally be a mentor for Support Engineers, guiding them through the investigation and resolution of issues.

      Also from the recent ‘Engagement Model’, Lead Engineers should:-
      • Manage support requests by client area and/or technology platform
      • Ensure all support works are planned and completed to time and quality parameters as per the NOWs
      • Maintain and implement regular checks across the client and/or technology estate
      • Maintain the documentation, SOPs and HW/SW versions and roadmap across the estate
      • Accept any SR/Project works into support
      • Ensure all support data and metrics are captured and accurate.
      • Attend regular client operations/support meetings

  21. Roles: Many tasks overlap, it is up to us how we manage these responsibilities.

      Supplier Manager:- Main Purpose - To own, manage and maintain the Engineering:
      • Support problem management and resolution process, and
      • Service level reporting across Encompass and third party/supplier support teams.

      Principal Responsibilities from job description:
      • Problem Management:- Own, manage and maintain the support problem management process, and underpinning systems and reports. Ensure all support problems are captured, tracked and reported.
      • Configuration Management:- Ensure that all configuration management processes are documented, and clearly communicated to Encompass staff and 3rd parties/suppliers. Own, manage and maintain the configuration management of Encompass technology assets. Ensure that all support processes are documented, and clearly communicated to Encompass staff and 3rd parties/suppliers.
      • Supplier Management:- Ensure all 3rd party/supplier SLA’s are tracked in accordance with contracted levels. Manage the interaction, tracking and resolution of 3rd party/supplier support tickets in relation to Encompass support tickets.
      • Reporting:- Facilitate meetings and resources required to resolve support issues across Encompass and our 3rd party suppliers. Ensure that all problem reporting systems are used in accordance with the process and are up-to-date. Produce operational and management reporting on support issues and their resolution across Encompass and 3rd parties/suppliers.

      Also from the recent ‘Engagement Model’, the Supplier Manager should:-
      • Ensure support requests are logged and tracked (internal and external)
      • Ensure support requests are directed to relevant support area(s) and/or suppliers
      • Ensure response times are set and monitored
      • Produce client reporting on status, priority and completion dates
      • Produce internal reporting on EDML/3rd Party performance
      • Attend operational/support client meetings

  22. Broadcast Support responsibilities

      ‘Broadcast Support Department Responsibilities’ from the recent ‘Engagement Model’
      • Responsible for the support and maintenance of the technology estate underpinning EDML services:
        • Playout
        • Content Management
        • Distribution
      • Ownership of the support and maintenance roadmap
      • Ownership of the technology support procedures and standards, and assurance of documentation for accuracy and completeness
      • Responsible for managing and reporting on the performance of 1st line technology support measured through the support ticketing system
      • Responsible for managing and reporting on the performance of 3rd party 2nd and 3rd line support suppliers against contractual SLAs
      • Responsible for continuous improvement initiatives driven by operational and service performance requirements
      • Responsible for the training and awareness of support staff on EDML services, technologies and operational/support procedures and policies
      • Responsible for managing key suppliers for 2nd and 3rd line technology
      • Responsible for administering the NOW process

  23. Broadcast Support responsibilities – Broken down into Roles

      Ownership & Direction of Service
      • Ownership of the support and maintenance roadmap
        Head of Department - Direction
      • Responsible for continuous improvement initiatives driven by operational and service performance requirements
        Supplier Manager / Head of Department - Sharepoint Service Improvement List

      24/7 Service Operation
      • Responsible for the support and maintenance of the technology estate underpinning EDML services: Playout, Content Management, Distribution
        All - Preventative maintenance checks, service desk operation, daily meetings

      Supervision of Operation
      • Ownership of the technology support procedures and standards, and assurance of documentation for accuracy and completeness
        Leads - Processes & Visios, demonstrable and consistent approach
      • Responsible for the training and awareness of support staff on EDML services, technologies and operational/support procedures and policies
        Leads - Processes and Visios, Configuration Database, Sharepoint collaboration

      Service Management
      • Responsible for managing and reporting on the performance of 1st line technology support measured through the support ticketing system
        Supplier Manager - Powerpoint Incident Reports and statistical analysis
      • Responsible for managing and reporting on the performance of 3rd party 2nd and 3rd line support suppliers against contractual SLAs
        Supplier Manager - Powerpoint Problem Reports and statistical analysis
      • Responsible for managing key suppliers for 2nd and 3rd line technology
        Supplier Manager - Sharepoint Problem List, weekly Supplier calls
      • Responsible for administering the NOW process
        Supplier Manager - NOW List in Sharepoint
