[Servercert-wg] Ballot SC29: System Configuration Management

Wed Apr 1 03:36:28 MST 2020

On 2020-03-30 2:38 PM, Neil Dunbar via Servercert-wg wrote:
>
> On 30/03/2020 11:07, Dimitris Zacharopoulos (HARICA) via Servercert-wg 
> wrote:
>
>> We had an internal discussion at HARICA and focused mainly on the OS 
>> security patching process. Here are some of the team's observations:
>>
>>  1. Security patches, especially for 0-day exploits, should be
>>     applied as soon as possible. Human review in most cases adds
>>     unnecessary delay and increases the risk of the CA being
>>     vulnerable to attacks.
>>
> It's not at all clear to me that this is the case. However, it might 
> help to understand what the notion of "review" means in this case. I 
> suspect that we're coming at this word from different angles.
>
> Emergency Change Orders are a well-known phenomenon in most workflows. 
> Now - they _should_ be painful (i.e., a senior manager or director 
> should be approving the change) because that discourages using the 
> ECO mechanism to bypass proper testing.
>

The team considers this "Emergency Change Order" to cause unnecessary 
delay in some cases, for some categories of OS patches (e.g. security 
updates). Whenever the vendor has patches that *the vendor* labels as 
security updates, we would like to have the option for these patches to 
be applied as soon as possible, without requiring additional human 
review by the CA.

Just to avoid any misunderstanding, only packages that are already 
installed would be updated. This would not trigger the installation of 
new packages (e.g. Bluetooth packages or other unwanted software). 
Servers that follow section 1.g of the NSRs already ensure that only 
strictly necessary software is installed.
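
To illustrate the selection rule we have in mind, here is a minimal 
Python sketch. The `Update` record, its fields and the 
`select_unattended` function are hypothetical names made up for this 
example and are not part of any package manager's API. An update 
qualifies only if the vendor labels it as a security update and the 
package is already installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """Hypothetical record describing one update offered by the OS vendor."""
    package: str
    vendor_label: str  # assumed values: "security", "bugfix", "enhancement"


def select_unattended(installed, offered):
    """Return the updates that may be applied without human review.

    An update qualifies only if the vendor labels it "security" and the
    package is already installed; nothing new is ever pulled in.
    """
    installed = set(installed)
    return [
        u for u in offered
        if u.vendor_label == "security" and u.package in installed
    ]


if __name__ == "__main__":
    installed = {"openssl", "kernel", "openssh-server"}
    offered = [
        Update("openssl", "security"),
        Update("kernel", "bugfix"),
        Update("bluez", "security"),  # not installed, so never selected
    ]
    for u in select_unattended(installed, offered):
        print(u.package)
```

In a real deployment the vendor's own tooling would supply the label 
and the installed-package list; the sketch only shows the filter.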

We would also like to remind everyone that this ballot affects 
"Certificate Systems, Issuing Systems, Certificate Management Systems, 
Security Support Systems, and Front-End / Internal-Support Systems", 
which covers almost every system under the CA's PKI management.

>>  2. Human review doesn't scale as the number of Systems increases.
>>     Automation is the only way to handle a large number of Systems.
>>
> Well, no one is asking for a per-system review. If you host a curated 
> mirror of your vendor's upstream patches, then the current automation 
> systems would work with virtually no alteration - just a repository 
> change.

We don't disagree that for some packages this manual approval process is 
an improvement. We disagree that this needs to be done for all cases, 
like OS vendor security patches.

>>  3. Human review of patches for so many different OS components,
>>     sometimes without enough details or the source code available, is
>>     too difficult to be done correctly and humans do make mistakes.
>>
> Of course - so do OS vendors! But again - what do we mean by "review". 
> I don't understand it to be a line-by-line formal code analysis. The 
> issue at hand is whether the OS vendor is addressing a security issue 
> which the CA is likely to suffer from, and that the patches produced 
> don't introduce any extra vulnerability which the CA cannot tolerate.
>

This decision must be left for the CA to make based on its risk 
assessment. When weighing a potential "extra vulnerability" against a 
"real and existing vulnerability", the balance leans towards mitigating 
the "real and existing vulnerability". Historically, cases where an OS 
vendor introduces additional vulnerabilities are very rare. Can you 
give us some examples of cases where the vendor introduced new 
vulnerabilities when trying to patch existing ones?

> To accept a patch - at least to me - means that you are aware of the 
> issue which the patch seeks to address (ie, you've looked at the 
> changelog); you've determined that you are likely to suffer from the 
> asserted vulnerability; you decide that the exploit of the 
> vulnerability is an unacceptable risk; that you believe that the 
> vendor has sufficient QA processes in place not to introduce further 
> unacceptable vulnerabilities while deploying the updated software; and 
> that you know how to roll back the change in the event that it either 
> does not work or introduces unacceptable changes in your systems. 
> [There are probably more steps in there, but those are the major ones, 
> to my thinking].
>

Whether the vendor has sufficient QA processes was already decided when 
the CA chose to accept this vendor as... a trusted vendor :-)

I think our misunderstanding is about the generic use of the term 
"vendor". Our concerns are focused on OS vendors that the CA trusts to 
do a good job of addressing security risks at the OS level and of QA, 
and that have a history of delivering security updates/patches that do 
not introduce additional vulnerabilities. CAs should be allowed to make 
this decision once, and not daily by approving/screening security 
updates that are critical for the stability and safety of the CA 
systems.

Of course it's up to the CA to have revert mechanisms if a security 
patch "breaks" something, but that's the responsibility of the CA and we 
should not be so prescriptive about it in the requirements.

In our understanding, installing security patches without undue delay 
improves overall security, which ultimately benefits the end-users. 
This is the one automation that can be done securely and unattended for 
the benefit of the ecosystem. In most cases the risk of being "down" is 
smaller than the risk of being "vulnerable". Also, a CA can implement a 
series of compensating controls that can prevent a system from being 
"down" in those rare cases.

> Put another way - if you knew that a patch addressed a vulnerability 
> in the Bluetooth stack on a host, but you have no Bluetooth equipment 
> in your array, then isn't patching something with no avenue of 
> exploitation a source of risk? And if so, is that risk then 
> acceptable? It may, or may not be, but I think an auditor would at 
> least want to see that someone had made that judgement call.
>

Again, we are not discussing upgrading anything outside the installed 
software stack. According to section 1.g of the NSRs, only necessary 
software should be installed in the first place. But, just to entertain 
the idea, and assuming we have allowed the Bluetooth stack to remain 
installed, what potential "harm" would you see if a patch for the 
Bluetooth stack were installed unattended?

>>  4. The risk of being non-functional is of lesser importance and
>>     potential impact, compared to the risk of remaining vulnerable
>>     while waiting for a human review.
>>
>> We believe that OS patches, digitally signed by the software vendor 
>> and provided through secure channels, MUST be allowed as a duly 
>> approved Change Management Policy practice, to be applied without 
>> employing a human review process. Such updates and patches are 
>> thoroughly tested beforehand by the software vendors themselves, and 
>> the CA staff should be discouraged from duplicating such a process 
>> with dubious results.
>>
> Again - I don't think anyone is asking for the CAs to _duplicate_ the 
> QA processes of the software vendor - merely to be assured that the 
> patches which are going to be installed are proportionate to the risk 
> being mitigated.
>

If a CA trusts the OS vendor for its QA process and its security 
practices, the CA probably doesn't need to repeat this process. Also, 
in some cases the CA may have pre-determined (via its Change Management 
Policy) that the risks mitigated by OS security patches outweigh the 
risks of "breaking" things.

> For the vast majority of (non-emergency) patches which get applied - 
> surely we apply them to test systems before we go live on production? 
> Isn't that part of a proper review?
>

Repeating our previous statement, in most cases the risk of being "down" 
is smaller than the risk of being "vulnerable", and there are several 
compensating controls the CA can implement to mitigate this issue.

> And I would be horrified to learn if CAs are installing software right 
> now _without_ ensuring that software's authenticity and integrity!
>

So would we :-)

>> The recommended change in 1.h:
>>
>> "Ensure that the CA’s security policies encompass a Change Management 
>> Process, following the principles of documentation, approval and 
>> testing, and to ensure that all changes to Certificate Systems, 
>> Issuing Systems, Certificate Management Systems, Security Support 
>> Systems, and Front-End / Internal-Support Systems follow said Change 
>> Management Process;"
>>
>> doesn't seem to clearly allow this practice. We cannot support a 
>> ballot that enforces manual human review in applying OS patches which 
>> puts systems at risk, even for a short amount of time.
>
> I think that it does allow the practice, even for urgent changes - 
> most Change Management Systems encompass the notion of Emergency 
> Change Orders. The review can be as little as "we cannot accept the 
> vulnerability existing a moment longer; the vendor has produced patch 
> X; fire up an ECO and phone the Chief Operating Officer because the 
> risk of not deploying the patch is greater than the risk of deploying it".
>

We believe that all security OS patches should be considered "ECO". 
*Security* vulnerabilities are introduced quite frequently (several days 
per week). CAs have publicly accessible systems that are scanned 
constantly in search of vulnerabilities.

> I guess that my question then boils down to this: other high risk or 
> high accountability institutions have rigorous change order management 
> processes, which encompass emergency changes too - isn't the onus on 
> CAs to explain why their business is so radically different so as not 
> to be a proper fit for such widespread processes?
>

The same applies to ACME implementations that push for more automation. 
Manually approving updates is not a one-size-fits-all solution. Yes, 
there are some more critical systems for which manual approval of 
security updates would make sense, but for others the best practice 
would be to apply security updates as quickly as possible (with or 
without the curated mirror you proposed). We suggest keeping the time 
between a patch becoming available and the patch being installed to a 
minimum. Regardless of how many humans are in the decision process, we 
believe that some systems would be more secure if there were no manual 
approvals by humans for security patches.
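
As a rough sketch of how such a per-system decision could be written 
down once in a Change Management Policy, consider the Python fragment 
below. The `Approval` enum, the `UNATTENDED_CATEGORIES` table and the 
`approval_mode` function are hypothetical, and the choice of which 
system categories go into the table is purely an illustrative 
assumption, not a recommendation.

```python
from enum import Enum


class Approval(Enum):
    UNATTENDED = "unattended"
    MANUAL = "manual"


# Hypothetical policy table: the system categories for which the CA has
# decided, once and in advance, that vendor security patches may be
# applied without review. The split below is an assumption for the
# example only.
UNATTENDED_CATEGORIES = {
    "Front-End / Internal-Support Systems",
    "Security Support Systems",
}


def approval_mode(system_category, vendor_label):
    """Decide how an update is approved under the sketched policy."""
    if vendor_label == "security" and system_category in UNATTENDED_CATEGORIES:
        return Approval.UNATTENDED
    # Everything else goes through the normal change process.
    return Approval.MANUAL


if __name__ == "__main__":
    print(approval_mode("Issuing Systems", "security").value)
    print(approval_mode("Security Support Systems", "security").value)
```

The point of the table is that the human decision is taken once, at 
policy level, rather than for every individual patch.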

We hope to have provided a clear view of our position and why we see 
real security benefit from quick and automated installation of OS 
security patches.

Best regards,
Dimitris.

> Regards,
>
> Neil