Purpose
The purpose of event management is to manage events throughout their lifecycle. This lifecycle of activities to detect events, make sense of them and determine the appropriate control action is coordinated by the event management process. Event management is therefore the basis for operational monitoring and control.
If events are programmed to communicate operational information as well as warnings and exceptions, they can be used as a basis for automating many routine operations management activities.
Examples include executing scripts on remote devices, submitting jobs for processing, or even dynamically balancing the demand for a service across multiple devices to enhance performance.
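As a minimal sketch of how events could drive routine automation, the Python below routes each event to a registered action. The event kinds (`job_ready`, `load_high`), the `Event` fields and the handler functions are hypothetical stand-ins chosen for illustration, not part of any particular event management tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Event:
    source: str       # CI or service that raised the event (hypothetical field)
    kind: str         # hypothetical event name, e.g. "job_ready"
    detail: str = ""  # optional free-text description


# Registry mapping an event kind to the automated action that handles it.
HANDLERS: Dict[str, Callable[[Event], str]] = {}


def on(kind: str):
    """Register the decorated function as the handler for one event kind."""
    def register(func: Callable[[Event], str]) -> Callable[[Event], str]:
        HANDLERS[kind] = func
        return func
    return register


@on("job_ready")
def submit_job(event: Event) -> str:
    # Stand-in for submitting a job for processing.
    return f"submitted job for {event.source}"


@on("load_high")
def rebalance(event: Event) -> str:
    # Stand-in for balancing demand across multiple devices.
    return f"rebalanced load away from {event.source}"


def dispatch(event: Event) -> str:
    """Run the automated action for an event, or just log it if none is registered."""
    handler = HANDLERS.get(event.kind)
    if handler is None:
        return f"logged {event.kind} from {event.source}"
    return handler(event)
```

Calling `dispatch(Event("web-01", "load_high"))` runs the rebalancing stand-in, while an event kind with no registered handler is only logged.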
An event can be defined as any change of state that has significance for the management of a configuration item (CI) or IT service. Events are typically recognized through notifications created by an IT service, CI or monitoring tool. Effective service operation is dependent on knowing the status of the infrastructure and detecting any deviation from normal or expected operation.
The objectives of the event management process are to:
Detect all changes of state that have significance for the management of a CI or IT service
Determine the appropriate control action for events, and ensure these are communicated to the appropriate functions
Provide the trigger, or entry point, for the execution of many service operation processes and operations management activities
Provide the means to compare actual operating performance and behavior against design standards and SLAs
Provide a basis for service assurance, reporting and service improvement (this is covered in detail in ITIL Continual Service Improvement)
These objectives can be achieved by good monitoring and control systems, which are based on two types of tools:
Active monitoring tools that poll key CIs to determine their status and availability. Any exceptions will generate an alert that needs to be communicated to the appropriate tool or team for action.
Passive monitoring tools that detect and correlate operational alerts or communications generated by CIs.
Scope
Event management can be applied to any aspect of service management that needs to be controlled and which can be automated. This includes:
Configuration items (CIs). Some CIs will be included because they need to stay in a constant state (e.g. a switch on a network needs to stay on, and event management tools confirm this by monitoring responses to ‘pings’). Other CIs will be included because their status needs to change frequently, and event management can be used to automate this and update the configuration management system (CMS) (e.g. the updating of a file server).
Environmental conditions (e.g. fire and smoke detection).
Software license monitoring for usage to ensure optimum/legal license utilization and allocation.
Security (e.g. intrusion detection).
Normal activity (e.g. tracking the use of an application or the performance of a server).
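The switch example above, where a tool confirms that a CI stays in a constant state by polling it, can be sketched roughly as follows. This is an illustration only: `poll_cis`, `fake_probe` and the CI names are hypothetical, and the probe is a stub standing in for a real check such as a ping.

```python
from typing import Callable, Dict, List


def poll_cis(expected: Dict[str, str], probe: Callable[[str], str]) -> List[str]:
    """Poll each CI and return one alert per deviation from its expected state."""
    alerts = []
    for ci, expected_state in expected.items():
        actual = probe(ci)
        if actual != expected_state:
            alerts.append(f"ALERT {ci}: expected {expected_state}, found {actual}")
    return alerts


def fake_probe(ci: str) -> str:
    # Stub standing in for a real status check; the states below are invented.
    return {"switch-01": "up", "switch-02": "down"}.get(ci, "unknown")
```

Passing the probe in as a parameter keeps the polling loop independent of how status is actually obtained, so the same loop could serve different kinds of CI.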
The difference between monitoring and event management
Monitoring and event management are closely related, but slightly different in nature.
Event management is focused on generating and detecting meaningful notifications about the status of the IT infrastructure and services. While it is true that monitoring is required to detect and track these notifications, monitoring is broader than event management. For example, monitoring tools will check the status of a device to ensure that it is operating within acceptable limits, even if that device is not generating events. Put more simply, event management works with occurrences that are specifically generated to be monitored. Monitoring tracks these occurrences, but it will also actively seek out conditions that do not generate events.
Event Management – Value to Business
Event management’s value to the business is generally indirect; however, it is possible to determine the basis for its value as follows:
Event management provides mechanisms for early detection of incidents. In many cases it is possible for the incident to be detected and assigned to the appropriate group for action before any actual service outage occurs.
When integrated into other service management processes (such as availability or capacity management), event management can signal status changes or exceptions that allow the appropriate person or team to perform early response, thus improving the performance of the process. This, in turn, will allow the business to benefit from more effective and more efficient service management overall.
Event management provides a basis for automated operations, thus increasing efficiency and allowing expensive human resources to be used for more innovative work, such as designing new or improved functionality or defining new ways in which the business can exploit technology for increased competitive advantage.
Event management can have a direct bearing on service delivery and customer satisfaction. As an example, an automated teller machine may generate event notifications that indicate the device is running low on cash, potentially avoiding the failure of the cash withdrawal portion of that service and its immediate impact on customer satisfaction.
Event Management – Types of Events
There are three types of events:
Informational Events: These events only provide information. Examples: a scheduled workload has completed; a user has logged in to use an application; an email has reached its intended recipient.
Warning Events: These events provide alerts when set threshold levels have been reached. Examples: a server’s memory utilization reaches within 5% of its highest acceptable performance level; the completion time of a transaction is 10% longer than normal.
Exceptional Events: These events indicate that a CI or service is operating abnormally. Examples: a user attempts to log on to an application with an incorrect password; a device’s CPU is above the acceptable utilization rate; a PC scan reveals the installation of unauthorized software.
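For threshold-based measurements such as memory or CPU utilization, the three event types can be told apart with a simple comparison. The sketch below is one possible reading, not a prescribed rule: the function name `classify` and its parameters are hypothetical, and "within 5%" is assumed to mean within 5 percentage points of the acceptable limit.

```python
def classify(utilization: float, limit: float, warning_margin: float = 5.0) -> str:
    """Classify a utilization reading (in percent) against its acceptable limit.

    Assumption: a reading above the limit is an exception, and a reading
    within `warning_margin` percentage points of the limit is a warning.
    """
    if utilization > limit:
        return "exception"
    if utilization >= limit - warning_margin:
        return "warning"
    return "informational"
```

With a limit of 90, a reading of 60 is informational, 86 is a warning and 95 is an exception.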
Warning events signify unusual, but not exceptional, operation. These are an indication that the situation may require closer monitoring. In some cases the condition will resolve itself; for example, in the case of an unusual combination of workloads, normal operation is restored as they are completed. In other cases, operator intervention may be required if the situation is repeated, or if it continues for too long. These rules or policies are defined in the monitoring and control objectives for that device or service.
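The rule that a warning calls for operator intervention when it is repeated or continues for too long could be expressed along these lines. This is a hedged sketch: `needs_operator` and its default limits (3 repeats, 600 seconds) are invented for illustration, since the real values would come from the monitoring and control objectives for the device or service.

```python
from typing import List


def needs_operator(warning_times: List[float],
                   max_repeats: int = 3,
                   max_duration: float = 600.0) -> bool:
    """Decide whether a run of warnings for one CI needs operator intervention.

    `warning_times` holds the timestamps (in seconds, oldest first) of
    consecutive warnings raised since the CI last operated normally.
    """
    if not warning_times:
        return False
    repeated = len(warning_times) >= max_repeats
    too_long = warning_times[-1] - warning_times[0] >= max_duration
    return repeated or too_long
```

A single warning, or two warnings a minute apart, would be left to resolve on their own under these assumed limits; three warnings, or a run of warnings spanning ten minutes or more, would be escalated.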