Incident Response Policy
TABLE OF CONTENTS
- Security Incident Response Policy
- Security Incident Response Procedures
- Appendix A: Terms & Definitions
- Appendix B: Roles & Responsibilities
- Appendix C: Incident Analysis Guidelines
Security Incident Response Policy
Purpose
Brado has published this policy and associated procedures to define standards related to the handling of information security incidents. Despite taking all reasonable and appropriate steps to protect the confidentiality, integrity, and availability of sensitive information, information security incidents may occur. A consistent and effective process must be followed to ensure these incidents are remediated appropriately.
It is the policy of Brado to investigate and respond to any actual, threatened, suspected, or alleged security incident, including a violation of Brado policies, a breach involving protected data, or a breach of information security safeguards. This includes any attempts to bypass, break through, or override any information security safeguards.
This policy is designed to ensure Brado complies with all applicable data privacy and protection laws and regulations. See the Data Privacy and Protection Policy for details regarding protected data.
The appendices Terms & Definitions, Roles & Responsibilities and Incident Analysis Guidelines provide additional supporting details.
Scope
This policy applies to the use of all of Brado’s resources, networks, and data. This policy does not cover information technology investigations managed by Brado’s legal counsel.
The Security Incident Response Policy is in place to bring required resources together in an organized manner to handle an information security incident involving Brado resources or Brado-held data. An information security incident includes but is not limited to the following:
- Malicious code attack including ransomware
- Unauthorized access to any resource
- Unauthorized utilization of Brado services
- Denial of service (DoS) attacks
- General misuse of resources
- Virus hoaxes
- Loss, theft, or compromise of Brado data or resources
Classification, Prioritization and Measurement of Incidents
Details regarding prioritization and monitoring are addressed in the procedures. The Information Security Incident Coordinator will classify incidents into one or more of the following areas:
Incident Reporting
Brado’s IT Department must report information security incident information to appropriate Brado management and staff in a prompt manner. Brado management is responsible for notifying clients and other individuals in accordance with applicable breach notification laws.
- All information security incidents must be reported to the Director of Technology and CFO. Legal counsel is consulted at their discretion.
- All security incidents involving data subject to HIPAA or GDPR must be reported to the HIPAA Privacy Officer, the HIPAA Security Officer, or GDPR Data Protection Officer as appropriate. Currently, the Director of Technology fills these roles.
- Additional individuals may be notified based on the discretion of the Director of Technology, CFO, or the Incident Coordinator.
Enforcement
Violations of this policy should be reported to the user’s supervisor or HR. Brado employees who violate this policy may be subject to disciplinary action, up to and including termination. Other users in violation of this policy and related procedures may be subject to loss of visitor privileges, termination of services, and/or termination of their engagement with Brado.
Security Incident Response Procedures
Purpose
Brado has determined it is important to have a formal, focused, consistent, and coordinated approach to responding to security incidents. These procedures:
- Give Brado a roadmap to ensure a consistent response to security incidents.
- Provide an approach to identify, respond, and successfully mitigate any actual or suspected security incident.
- Apply to all Brado managed resources.
The following procedures are intended to be scalable and are not necessary for every security incident, as many incidents are small and only require a single responder. The Director of Technology, or designee, should determine when to assemble a Security Incident Response Team (SIRT). If a SIRT is convened, these procedures must be consulted and the elements appropriate to the individual incident must be used. The SIRT will be responsible for determining the categorization of the incident.
See the appendices for Terms & Definitions, Roles & Responsibilities and Incident Analysis Guidelines.
Categories
It is impractical to develop comprehensive procedures with detailed instructions for handling every type of incident as security incidents can occur in many ways. These procedures are designed to prepare Brado to generally handle any type of incident with more specific steps to handle common security incident types. The incident categories below are not comprehensive and not intended to provide definitive classification for incidents. They are designed to give a basis for providing advice on how to handle incidents based on their primary category:
- Denial of Service
- Inappropriate Usage
- Malicious Code or Malware
- Unauthorized Access
Brado utilizes the following categorization scheme:
- Data Breach – An accidental or intentional breach (unauthorized access) of protected data
- Human Error – An accidental incident that causes the loss of confidentiality, integrity, and/or availability of Brado data
- Lost/Stolen Equipment – The loss or theft of a computing device or media used by Brado staff or containing Brado or client data, such as a laptop, smartphone, or removable media
- Malicious Attack – An attack by an internal or external party that impacts the confidentiality, integrity, or availability of a Brado managed system or data
- Network Availability – A significant outage of a major Brado managed system
- Other – An attack that does not fit into any of the other categories
- Policy Violation – Any incident resulting from violation of Brado acceptable usage or documented IT policies by an authorized user
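For teams that track incidents in a script or ticketing tool, the categorization scheme above could be represented as a fixed set of values. The following is a minimal sketch only; the names IncidentCategory and parse_category are hypothetical and are not part of any existing Brado system.

```python
from enum import Enum


class IncidentCategory(Enum):
    """The seven categories from the categorization scheme above."""
    DATA_BREACH = "Data Breach"
    HUMAN_ERROR = "Human Error"
    LOST_STOLEN_EQUIPMENT = "Lost/Stolen Equipment"
    MALICIOUS_ATTACK = "Malicious Attack"
    NETWORK_AVAILABILITY = "Network Availability"
    OTHER = "Other"
    POLICY_VIOLATION = "Policy Violation"


def parse_category(label: str) -> IncidentCategory:
    """Map a free-text label to a category (hypothetical helper).

    Unrecognized labels fall back to OTHER, mirroring the catch-all
    category in the scheme.
    """
    normalized = label.strip().lower()
    for category in IncidentCategory:
        if category.value.lower() == normalized:
            return category
    return IncidentCategory.OTHER
```

Using a fixed set of values keeps reports consistent; the final categorization remains a SIRT decision.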
Definition and Analysis
Security incidents can be detected in a variety of ways (e.g. audit logs, IPS/IDS alerts, user reporting, vulnerability scanning, etc.). Signs of an incident fall into one of two categories:
- Precursor – a sign that an incident may occur in the future
- Indicator – a sign that an incident may have occurred or may be occurring now
Both categories can come from a variety of sources. Any actual or suspected security incident must be reported to the Director of Technology. Once an actual or suspected security incident is detected, the analysis phase should begin with the assembly of a SIRT led by the Information Security Incident Coordinator. In the analysis phase, the SIRT should assume that a security incident is occurring (even if it is only suspected) until they have determined that it is not. The team should work quickly to validate whether any potential or suspected security incident is truly a security incident. The actions taken during this phase should be documented, as needed. Actions taken can include any of the following (see Appendix C for further detail):
- Profiling networks and systems
- Understanding normal behaviors
- Reviewing audit logs
- Correlating events
- Researching the symptoms/causes on the Internet
- Collecting network traffic
- Asking for assistance from other parties (US-CERT, CSIRT, etc.)
Documentation
When you suspect or know that an incident has occurred, you must immediately start recording all of the facts regarding the incident. Documenting system events, conversations, and observed changes in files can lead to a more efficient, more systematic, and less error-prone handling of the problem. The SIRT should maintain records about the status of incidents, along with other pertinent information as appropriate. The Security Incident Response Team must safeguard security incident data and restrict access to it because it often contains sensitive information (e.g. data on exploited vulnerabilities, recent security breaches, and users that may have performed inappropriate actions).
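As an illustration of the record keeping described above, the sketch below shows one possible shape for an incident record with a timestamped log of facts. All names (IncidentRecord, LogEntry, the "INC-" style identifier) are assumptions made for this example; the policy does not prescribe a format or tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class LogEntry:
    recorded_at: datetime  # when the fact was observed or recorded
    author: str            # who recorded it
    note: str              # system event, conversation, or observed change


@dataclass
class IncidentRecord:
    incident_id: str       # hypothetical identifier, e.g. "INC-0001"
    summary: str
    status: str = "open"
    entries: List[LogEntry] = field(default_factory=list)

    def add_entry(self, author: str, note: str,
                  when: Optional[datetime] = None) -> LogEntry:
        """Record a fact; defaults to the current UTC time."""
        entry = LogEntry(when or datetime.now(timezone.utc), author, note)
        self.entries.append(entry)
        return entry

    def timeline(self) -> List[str]:
        """Return the entries as text lines in chronological order."""
        ordered = sorted(self.entries, key=lambda e: e.recorded_at)
        return [f"{e.recorded_at.isoformat()} {e.author}: {e.note}"
                for e in ordered]
```

Because these records often contain sensitive information, wherever they are stored should be access-restricted as described above.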
Prioritization or Severity Ratings
During the initial phase of an incident or suspected incident, a priority should be assigned to the incident. This prioritization should be based on the functional impact, information impact, and recoverability of a security incident.
Incidents targeting IT systems typically impact the business functionality that those systems provide. This results in some level of negative impact to the users of those systems. Incident handlers should consider how the incident will impact the existing functionality of the affected systems. Incident handlers should consider not only the current functional impact of the incident, but also the likely future functional impact of the incident if the incident is not immediately contained.
Incidents involving protected data may result in lawsuits, the loss of certifications, fines imposed by regulatory agencies, and/or the termination of Client agreements. Incident handlers should be well informed regarding management and notification procedures dictated by regulatory agencies and Client agreements in order to mitigate these possible damages.
Incidents may affect the confidentiality, integrity, and availability of Brado’s or a client’s information. Incident handlers should consider how the incident will impact Brado’s overall mission.
The size of an incident and the type of resources affected will determine the amount of time and resources that must be spent on recovering from an incident. In some instances, it may not be possible to recover from an incident (e.g. the confidentiality of sensitive information has been compromised). It would not be a good use of limited resources to create an elongated incident handling cycle unless that effort was directed at ensuring a similar incident did not occur in the future. In other instances, an incident may require far more resources to handle than what Brado has available. The authority to commit additional resources to incident response shall rest with Brado management. The SIRT should consider the effort necessary to recover from an incident and carefully weigh that against the value created by the recovery effort and any requirements related to incident handling.
Care should be taken to ensure that the communication method is appropriate for the security incident. Documentation supporting these efforts should be maintained. In addition, an initial categorization of the security incident, as described above, should take place.
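The three prioritization factors described in this section (functional impact, information impact, and recoverability) could be combined into a single priority in many ways. The sketch below assumes a simple four-level rating for each factor and takes the highest of the three as the overall priority. The rating scale and the "highest wins" rule are assumptions for illustration and are not defined by this procedure.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Hypothetical four-level scale applied to each factor."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def assign_priority(functional_impact: Rating,
                    information_impact: Rating,
                    recoverability_effort: Rating) -> Rating:
    """Return the overall priority as the highest of the three ratings.

    Assumption: an incident is as urgent as its most severe factor.
    """
    return max(functional_impact, information_impact, recoverability_effort)
```

For example, an incident with low functional impact but high information impact (such as exposure of protected data) would be rated high overall under this rule.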
Containment
Containment ensures that the security incident does not grow and affect other resources. Containment strategies will vary based on the type of security incident. Criteria for determining the appropriate strategy include:
- Compromise to PHI, PII, or other confidential Brado or client data
- Potential damage to and/or theft of resources
- Need for evidence preservation
- Service availability (e.g. network connectivity or services provided to external entities)
- Time and resources needed to implement the strategy
- Effectiveness of the strategy (e.g. partial containment or full containment)
- Duration of the solution (e.g. emergency workaround, temporary workaround, permanent solution, etc.)
As needed, evidence should be maintained during the incident to ensure it is resolved and documented appropriately. Evidence should be retained in compliance with Brado record retention policies and procedures.
During incident handling, system owners and others sometimes want or need to identify the attacking host or hosts. Although this information can be important, the SIRT should generally stay focused on containment, eradication, and recovery. Identifying an attacking host can be a time-consuming and futile process that can prevent a team from achieving its primary goal of minimizing the business impact.
Eradication
After a security incident has been contained, eradication may be necessary to eliminate certain components, such as deleting malicious code and disabling breached user accounts. For some security incidents, eradication is either not necessary or is performed during recovery. Subject Matter Experts (SMEs) may be needed to provide direction and perform actions to eradicate the issue.
Recovery
In recovery, administrators restore systems to normal operation and, if applicable, harden systems to prevent similar security incidents. Recovery may involve such actions as restoring systems from clean backups, rebuilding systems from scratch, replacing compromised files with clean versions, installing patches, changing passwords, and tightening network perimeter security (e.g., firewall rule sets, boundary router access control lists). It is also often desirable to employ higher levels of system logging or network monitoring as part of the recovery process. Once a resource is successfully attacked, it is often attacked again, or other resources within the organization are attacked in a similar manner.
Post-Incident Activity
One of the most important parts of security incident response is also the most often omitted: learning and improving. Each SIRT should evolve to reflect new threats, improved technology, and lessons learned. A lessons learned meeting provides a chance to achieve closure with respect to a security incident by reviewing what occurred, what was done to intervene, and how well the intervention worked. The meeting should be held within a couple of days of the closure of the security incident. Questions to be answered in the lessons learned meeting may include:
- Exactly what happened and at what times?
- How well did staff and management perform in dealing with the incident?
- Were documented procedures followed and found to be adequate?
- What information was needed sooner?
- Were any steps or actions taken which may have inhibited the recovery?
- What should be done differently the next time a similar incident occurs?
- What corrective actions can prevent similar incidents in the future?
- What precursors or indicators should be watched to detect similar incidents?
- What additional tools or resources are needed to detect, analyze, and mitigate future incidents?
Communication
The SIRT will use all available communication methods to appropriately manage a security incident. These methods could include face-to-face conversations, phone, email, etc. The method used should be consistent with the sensitivity of the information being transferred. If sensitive information is being transferred, appropriate controls, including encryption, should be utilized to ensure the information remains protected.
Only authorized individuals are permitted to talk with the media and other external parties.
Brado may need to report a security incident to an external party (e.g. client and/or a regulatory agency). Brado’s Client Service Team is responsible for knowing client requirements. The HIPAA Privacy Officer, HIPAA Security Officer and/or GDPR Data Protection Officer is responsible for managing regulatory issues. These two groups will work in concert regarding responses to incidents involving breaches or disclosures of client-related confidential or private information. Most Master Service Agreements (MSAs) require client notification within one (1) business day and specific processes have been agreed upon.
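The one-business-day notification window mentioned above can be turned into a calendar date. The sketch below assumes that business days are Monday through Friday, that the day of detection does not count, and that holidays are ignored. Individual MSAs may define the window differently, so the agreement itself remains the authority.

```python
from datetime import date, timedelta


def notification_deadline(detected_on: date, business_days: int = 1) -> date:
    """Return the date that falls the given number of business days
    after detection (assumes Mon-Fri business days, no holidays)."""
    current = detected_on
    remaining = business_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current
```

Under these assumptions, an incident detected on a Friday would have a notification deadline of the following Monday.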
Metrics
Metrics are developed as requested by Brado management. These metrics may include, but are not limited to, the number of security incidents per quarter or the average time to close a security incident. A periodic report regarding new security incidents is sent to the CFO.
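The two example metrics named above can be computed from a list of incident open and close times. The following is a sketch under the assumption that each incident is available as an (opened, closed) pair, with closed set to None while the incident is still open; the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def incidents_per_quarter(opened_times):
    """Count incidents by the calendar quarter in which they were opened."""
    counts = Counter()
    for opened in opened_times:
        quarter = (opened.month - 1) // 3 + 1
        counts[f"{opened.year}-Q{quarter}"] += 1
    return dict(counts)


def average_time_to_close(incidents):
    """Average duration of closed incidents.

    incidents: iterable of (opened, closed) datetime pairs.
    Incidents that are still open (closed is None) are skipped.
    Returns None when no incident has been closed yet.
    """
    durations = [closed - opened
                 for opened, closed in incidents
                 if closed is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)
```

Skipping open incidents keeps the average from being understated; a separate count of open incidents may be worth reporting alongside it.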
Roadmap
The implementation of these procedures and the continued use of consistent processes to respond to and address security incidents will ensure the maturation of Brado’s incident response capabilities. Brado updates its incident policy and procedures based on operational need, NIST guidance, and lessons learned.
Appendix A: Terms & Definitions
Appendix B: Roles & Responsibilities
Appendix C: Incident Analysis Guidelines
The National Institute of Standards and Technology recommends the following actions to make incident analysis easier and more effective (NIST SP 800-61 rev. 2):
Profile Networks and Systems
Profiling is measuring the characteristics of expected activity so that changes can be more easily identified. Examples of profiling are running file integrity checking software on hosts to derive checksums for critical files and monitoring network bandwidth usage to determine what the average and peak usage levels are on various days and times. In practice, it is difficult to detect incidents accurately using most profiling techniques; organizations should use profiling as one of several detection and analysis techniques.
Examples in place at Brado: Active Directory group policy objects and firewall traffic monitoring
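The file integrity checking example mentioned above can be sketched in a few lines. This is an illustration of the general technique only, not a description of the tools in place at Brado; the function names are hypothetical, and SHA-256 is an assumed choice of checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths):
    """Record the current checksum of each critical file."""
    return {str(p): sha256_of(Path(p)) for p in paths}


def changed_files(baseline):
    """Return files that are missing or whose checksum no longer
    matches the recorded baseline."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return sorted(changed)
```

The baseline would need to be stored somewhere an attacker cannot modify, and refreshed after legitimate changes such as patching.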
Understanding Normal Behaviors
Incident response team members should study applications, networks, and systems to understand what their normal behavior is so that abnormal behavior can be more easily recognized. No incident handler will have a comprehensive knowledge of all behavior throughout the environment, but handlers should know which experts could fill in the gaps. One way to gain this knowledge is through reviewing log entries and security alerts. This may be tedious if filtering is not used to condense the logs to a reasonable size. As handlers become more familiar with the logs and alerts, they should be able to focus on unexplained entries, which are usually more important to investigate. Conducting frequent log reviews should keep the knowledge fresh, and the analyst should be able to notice trends and changes over time. The reviews also give the analyst an indication of the reliability of each source.
Examples in place at Brado: Daily logs are sent to the team from the IPS system. These are reviewed and a general idea of what constitutes regular and irregular traffic has been formed.
Create a Log Retention Policy
Information regarding an incident may be recorded in several places, such as application, firewall, and IDS/IPS logs. Creating and implementing a log retention policy that specifies how long log data should be maintained may be extremely helpful in analysis because older log entries may show reconnaissance activity or previous instances of similar attacks. Another reason for retaining logs is that incidents may not be discovered until days, weeks, or even months later. The length of time to maintain log data is dependent on several factors, including the organization’s data retention policies and the volume of data.
Examples in place at Brado: Brado network logs are uploaded to the cloud daily and are kept for a period of one year.
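A retention period such as the one-year period above can be checked programmatically. The sketch below treats one year as 365 days, which is an assumption (the policy does not say how leap years are counted), and assumes each log is identified by a name and the date it was written.

```python
from datetime import date, timedelta

# Assumption: "one year" is treated as 365 days.
RETENTION = timedelta(days=365)


def is_expired(log_date: date, today: date,
               retention: timedelta = RETENTION) -> bool:
    """True when the log is older than the retention period."""
    return today - log_date > retention


def expired_logs(log_dates: dict, today: date) -> list:
    """Return the names of logs that have passed the retention period.

    log_dates maps a log name (hypothetical) to the date it was written.
    """
    return sorted(name for name, written in log_dates.items()
                  if is_expired(written, today))
```

Logs tied to an open incident may need to be kept longer than the standard period, so a check like this should only flag candidates rather than delete anything on its own.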
Perform Event Correlation
Evidence of an incident may be captured in several logs that each contain different types of data. For example, a firewall log may have the source IP address that was used, whereas an application log may contain a username. A network IDS/IPS may detect that an attack was launched against a specific host, but it may not know if the attack was successful. The analyst may need to examine the host’s logs to determine that information. Correlating events among multiple indicator sources can be invaluable in validating whether a specific incident occurred.
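The firewall and application log example above can be sketched as a join on host and time. The field names (time, source_ip, dest_host, host, username) and the 30-second window are assumptions made for illustration; real log formats will differ.

```python
from datetime import datetime, timedelta


def correlate(firewall_events, app_events, window=timedelta(seconds=30)):
    """Pair firewall and application events that involve the same host
    and occurred within the given time window of each other.

    Each event is a dict; the keys used here are hypothetical.
    """
    matches = []
    for fw in firewall_events:
        for app in app_events:
            same_host = fw["dest_host"] == app["host"]
            close_in_time = abs(fw["time"] - app["time"]) <= window
            if same_host and close_in_time:
                matches.append({
                    "host": app["host"],
                    "source_ip": fw["source_ip"],  # from the firewall log
                    "username": app["username"],   # from the application log
                    "time": app["time"],
                })
    return matches
```

A match links a source IP address to a username, which neither log provides on its own. The result is only as reliable as the timestamps in each log.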
Keep All Host Clocks Synchronized
Protocols such as the Network Time Protocol (NTP) synchronize clocks among hosts. Event correlation will be more complicated if the devices reporting events have inconsistent clock settings.
Examples in place at Brado: All server time settings are synchronized through Active Directory
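To illustrate why inconsistent clocks complicate correlation, the sketch below flags hosts whose reported time differs from a reference time by more than a tolerance. How the reported times are collected is outside the scope of this example, and the five-second tolerance is an assumed value.

```python
from datetime import datetime, timedelta


def skewed_hosts(reference: datetime, reported: dict,
                 tolerance: timedelta = timedelta(seconds=5)) -> list:
    """Return hosts whose clock differs from the reference time by
    more than the tolerance.

    reported maps a host name (hypothetical) to the time that host
    reported at the moment the reference time was taken.
    """
    return sorted(host for host, host_time in reported.items()
                  if abs(host_time - reference) > tolerance)
```

Events from any flagged host would need their timestamps adjusted, or treated with caution, before being correlated with events from other sources.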
Maintain and Use a Knowledge Base of Information
The knowledge base should include information that handlers need for referencing quickly during incident analysis. Although it is possible to build a knowledge base with a complex structure, a simple approach can be effective. Text documents, spreadsheets, and relatively simple databases provide effective, flexible, and searchable mechanisms for sharing data among team members. The knowledge base should also contain a variety of information, including explanations of the significance and validity of precursors and indicators, such as IDS/IPS alerts, operating system log entries, and application error codes.
Examples in place at Brado: The IT team maintains a shared OneNote notebook as a text reference. It is continually updated by members of the team as needed.
Use Internet Search Engines for Research
Internet search engines can help analysts find information on unusual activity. For example, an analyst may see some unusual connection attempts targeting TCP port 22912. Performing a search on the terms “TCP,” “port,” and “22912” may return some hits that contain logs of similar activity or even an explanation of the significance of the port number. Note that separate workstations should be used for research to minimize the risk to the organization from conducting these searches.
Run Packet Sniffers to Collect Additional Data
Sometimes the indicators do not record enough detail to permit the handler to understand what is occurring. If an incident is occurring over a network, the fastest way to collect the necessary data may be to have a packet sniffer capture network traffic. Configuring the sniffer to record traffic that matches specified criteria should keep the volume of data manageable and minimize the inadvertent capture of other information. Because of privacy concerns, some organizations may require incident handlers to request and receive permission before using packet sniffers.
Filter the Data
There is not enough time to review and analyze all the indicators. However, at a minimum, the most suspicious activity should be investigated. One effective strategy is to filter out categories of indicators that tend to be insignificant. Another filtering strategy is to show only the categories of indicators that are of the highest significance. However, this second approach carries substantial risk because new malicious activity may not fall into one of the chosen indicator categories.
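The two filtering strategies described above, and the risk carried by the second, can be shown with a short sketch. The indicator structure and category names are hypothetical.

```python
def drop_insignificant(indicators, ignored_categories):
    """Strategy 1: hide categories known to be insignificant.
    Anything not explicitly ignored, including new categories, is kept."""
    return [i for i in indicators if i["category"] not in ignored_categories]


def keep_only_significant(indicators, watched_categories):
    """Strategy 2: show only categories chosen as highly significant.
    Anything not explicitly watched, including new categories, is hidden."""
    return [i for i in indicators if i["category"] in watched_categories]


# Hypothetical indicators; the last one is of a previously unseen category.
indicators = [
    {"category": "port scan", "host": "web01"},
    {"category": "failed login", "host": "dc01"},
    {"category": "unrecognized outbound traffic", "host": "db01"},
]

remaining = drop_insignificant(indicators, {"port scan"})
shown = keep_only_significant(indicators, {"failed login"})
```

Here the previously unseen category survives the first strategy but is silently dropped by the second, which is the risk the text describes.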
Seek Assistance from Others
Occasionally the team will be unable to determine the full cause and nature of an incident. If the team lacks sufficient information to contain and eradicate the incident, then it should consult with internal resources (e.g., information security staff) and external resources (e.g., US-CERT, other CSIRTs, contractors with incident response expertise). It is important to accurately determine the cause of each incident so that it can be fully contained and the exploited vulnerabilities can be mitigated to prevent similar incidents from occurring.
Examples in place at Brado: Brado works very closely with a local partner who helps to analyze and maintain the network.