Challenges encountered during remediation
Organizations dealt with a variety of specific challenges while remediating the Log4Shell vulnerability, including putting together an initial response, determining exposure and next steps, and addressing third-party risks.
1. Assembling remediation teams
Log4Shell remediation was not simple, and organizations had to assemble teams capable of carrying it out properly. The pervasive nature of the vulnerability made it difficult to locate, and the sheer variety of actions malicious actors could take while exploiting it made it especially difficult to address. Proper remediation required a team with diverse, specialized skills and a broad overview of operations.
Organizations that rushed to assemble patchwork incident response teams tended to respond more slowly and less effectively, leaving potentially exploitable gaps in target systems. Conversely, organizations that had properly constructed teams in place fared better. These teams were usually led by a high-level coordinator with information pipelines to all system owners. With a leader coordinating remediation strategy, other team members could fill more niche responsibilities. Operational members could zero in on detection, testing, and patching, while other team members focused on communication, documentation, and budgeting. Such specialized roles helped decrease the chance of errors, and the high-level coordination and expansive knowledge base helped protect systems under threat.
2. Assessing exposure and tracking remediation
Addressing the Log4Shell vulnerability challenged many organizations, and those without up-to-date views of their internal infrastructure faced several unique problems. First, organizations had to determine whether their systems were at risk, which proved difficult due to the widespread use of the Log4j library. Companies that maintained an effective IT asset management program could quickly determine which systems used the Log4j library. But organizations that lacked that insight wasted valuable time cataloging which of their key systems used Log4j.
Determining which systems and applications hosted Log4j required an in-depth review of documentation and baselines. Even organizations with comprehensive system documentation couldn’t be 100% sure whether Log4j existed somewhere on their systems. Antivirus and malware scanning tools alone were not sufficient, and vulnerability scanner repositories initially lacked the information needed to detect the presence of the Log4j library. Therefore, even well-prepared organizations employed more specific scanning methods that were, early on, far more effective than standard vulnerability scanners. IT teams without access to a diverse suite of scanners and detection tools were virtually blind during the first weeks after the vulnerability’s discovery.
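One of the more specific scanning methods can be illustrated with a short sketch. It is not any particular vendor's scanner; it is a minimal example that assumes Java archives on disk can be read directly, and it uses the presence of the log4j-core JndiLookup class file as a heuristic for finding the library. The function names, the archive suffix list, and the nesting limit are assumptions chosen for illustration.

```python
import io
import zipfile
from pathlib import Path

# Class file that implements JNDI lookups in log4j-core 2.x. Its presence is a
# heuristic for locating the library, not proof that a system is exploitable.
JNDI_LOOKUP = "org/apache/logging/log4j/core/lookup/JndiLookup.class"
ARCHIVE_SUFFIXES = (".jar", ".war", ".ear")  # assumed set of archive types


def find_in_archive(data, label, max_depth=3):
    """Return labels of archives (including nested ones) that contain JndiLookup."""
    hits = []
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for name in archive.namelist():
                if name == JNDI_LOOKUP or name.endswith("/" + JNDI_LOOKUP):
                    if label not in hits:
                        hits.append(label)
                elif max_depth > 0 and name.lower().endswith(ARCHIVE_SUFFIXES):
                    # Libraries are often bundled inside other archives, so
                    # open nested archives up to a fixed depth.
                    nested = archive.read(name)
                    hits.extend(
                        find_in_archive(nested, f"{label}!{name}", max_depth - 1)
                    )
    except zipfile.BadZipFile:
        pass  # Not a readable archive; skip it.
    return hits


def scan_tree(root):
    """Scan every archive under root and return the locations that matched."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in ARCHIVE_SUFFIXES:
            hits.extend(find_in_archive(path.read_bytes(), str(path)))
    return hits
```

Calling scan_tree on an application directory returns one entry per archive that bundles the class, with nested locations written as outer!inner. The sketch reads each archive fully into memory and ignores renamed or repackaged classes, so a match list like this is a starting point for review rather than a complete inventory.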
Determining where to focus Log4Shell remediation efforts was a tricky process, even for organizations that maintained in-depth documentation of their internal infrastructure. The most effective approach started by prioritizing key external-facing systems that hosted sensitive data and applications. Security teams then turned to noncritical external-facing systems due to their exposure and then patched internal systems running Log4j.
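The prioritization order described above can be expressed as a simple tiering rule. The following is a minimal sketch, assuming each system in an asset inventory carries two flags; the System class, its field names, and the tier numbers are hypothetical and would need to match an organization's own inventory data.

```python
from dataclasses import dataclass


@dataclass
class System:
    # Hypothetical inventory record; real inventories will carry more detail.
    name: str
    external_facing: bool
    hosts_sensitive_data: bool


def remediation_tier(system):
    """Map a system to a patching tier; lower numbers are patched first."""
    if system.external_facing and system.hosts_sensitive_data:
        return 0  # key external-facing systems
    if system.external_facing:
        return 1  # noncritical external-facing systems
    return 2  # internal systems


def patch_order(systems):
    """Return systems sorted by tier."""
    # sorted() is stable, so systems in the same tier keep their input order.
    return sorted(systems, key=remediation_tier)
```

Keeping the rule in one small function makes the criteria easy to review and adjust, for example if an organization decides that certain internal systems warrant earlier attention.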
3. Managing third-party risks
In the wake of the Log4Shell vulnerability, organizations addressed their own network security concerns first. However, third-party service providers, including data processors and managed service providers with remote access to the organizations’ networks, presented additional risks. These third parties required additional reviews to determine how they were affected by the vulnerability and what steps they had taken to begin remediation. Despite the gravity of the threat, organizations often experienced issues responding to these third-party risks. Notably, a lack of accurate and complete third-party inventories, emergency third-party escalation procedures, and remediation enforcement mechanisms delayed response efforts.
When Apache disclosed the Log4Shell vulnerability, organizations often discovered problems when reviewing their third-party inventories, including limited information on third parties’ data processing practices and access levels, and missing contact information. These gaps forced organizations to perform ad hoc inventory updates based on old contract documents and emails before even beginning response efforts.
After reviewing third-party inventories, organizations encountered additional issues when attempting to execute emergency third-party escalation procedures. Incident response teams needed to prioritize the most at-risk third parties for outreach efforts, despite the lack of approved prioritization criteria and requirements. Even with a prioritized listing, response teams still needed to determine the proper evidence to request from third parties.
Typically, organizations review independent security certifications once before agreeing to a contract, or they assess third parties on a periodic basis using a standard questionnaire. However, some independent security certifications and questionnaire responses lacked the information necessary to determine which third parties were vulnerable to Log4Shell. Organizations without proper emergency third-party escalation procedures had to redirect already limited response resources, including valuable time, to implement such procedures.
Enforcement of timely third-party remediation presented another challenge. Some third parties contacted organizations directly, explaining that their systems were not vulnerable or describing ongoing remediation efforts. However, the third parties that required specific outreach often ignored or even refused to respond to the request, leaving their incident response details unknown to the organization. To manage these uncooperative third parties, many organizations enforced right-to-audit contract clauses, put contract renewals at risk, and disabled network access until the third parties provided sufficient remediation details.
Lessons learned
Responding to the Log4Shell vulnerability was a difficult and costly process. However, many lessons emerged from how organizations dealt with it, and those lessons can be applied to help those organizations and others improve their security posture.
Following are actions organizations can take to better prepare for the next vulnerability.
- Maintain a playbook. A playbook is a formalized set of policies and procedures outlining the steps that should be taken during a security incident. It serves as a guide to personnel to minimize delays and errors during incident response. Organizations that maintain a playbook with steps to address open-source, zero-day vulnerabilities like Log4Shell benefit from this proactive measure. Premade procedures can save time and help the organization respond efficiently.
- Document internal infrastructure. Organizations should establish up-to-date documentation of all internal systems and services provided. Such documentation should be comprehensive and detailed, while helping to condense the internal systems into a digestible form. When responding to zero-day vulnerabilities, this documentation can be used to prioritize which systems require attention and identify which assets are at risk. In addition, baseline metrics can be used as indicators of compromise for monitored systems.
- Implement a diverse suite of scanners. IT teams should have procedures in place to continually update and maintain various scanners for every major organizational system. Vulnerability scanners are essential to system security, and consistent repository updates are necessary to keep the scanners as effective as possible. Organizations should also use various types of software composition analysis (SCA) tools. The Log4j library has various fingerprints that traditional vulnerability scans overlook, but SCA tools can parse through applications’ source code and dependencies and find references to the library. In addition, system behavior monitors can reveal a compromised system by identifying distinct patterns in a system’s operation.
- Establish emergency patching processes. All IT teams should have a documented emergency patching process in place. An emergency patching process is a set of standard procedures that facilitate patching when an organization discovers a sudden high-impact vulnerability. Standard patching procedures optimize for scope, time, and cost. However, the Log4Shell vulnerability demonstrated that standard patching procedures are not always an adequate response. An emergency patching process is an important tool that can bypass the usual obstacles that delay patching, such as cost and availability.
- Maintain a list of third parties with use cases. Organizations should maintain an updated third-party inventory list, including services provided, data processed, and third-party contact information. They should also maintain emergency third-party escalation procedures, including prioritization criteria and custom outreach templates. Lastly, organizations should inform nonresponsive and uncooperative third parties of contractual requirements for inquiries and emergency third-party offboarding procedures. In extreme cases, organizations can temporarily remove remote access privileges or terminate service agreements with such third parties.
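The dependency check that SCA tools perform, mentioned in the scanner recommendation above, can be sketched in a few lines. This is a simplified illustration rather than how any specific SCA product works. It assumes dependencies are available as Maven-style group:artifact:version strings, and the minimum version used here is an assumed threshold that should be confirmed against the current Apache Log4j security advisories. The function names are hypothetical.

```python
import re

# Assumed minimum acceptable log4j-core version; confirm against the current
# Apache Log4j security advisories before relying on it.
MINIMUM_VERSION = (2, 17, 1)


def parse_version(text):
    """Turn '2.14.1' or '2.0-beta9' into a comparable tuple of three integers."""
    numbers = re.findall(r"\d+", text.split("-")[0])
    parts = [int(n) for n in numbers[:3]]
    return tuple(parts + [0] * (3 - len(parts)))


def flag_outdated(coordinates, minimum=MINIMUM_VERSION):
    """Return log4j-core entries whose version is below the minimum."""
    flagged = []
    for line in coordinates:
        fields = line.strip().split(":")
        if len(fields) != 3:
            continue  # not a group:artifact:version entry
        group, artifact, version = fields
        if group == "org.apache.logging.log4j" and artifact == "log4j-core":
            # A single threshold also flags patched releases on older
            # branches, so flagged entries still need manual review.
            if parse_version(version) < minimum:
                flagged.append(line.strip())
    return flagged
```

A check like this only covers declared dependencies. It complements, rather than replaces, scanning of packaged artifacts and behavior monitoring.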
Preparing for next time
Most organizations have begun or completed the remediation process for the Log4Shell vulnerability. However, remediation is not always enough. Arguably, it’s just as important to internalize and apply lessons learned about the vulnerability’s impact and the factors that made it so difficult to address.