[cabf_netsec] SC20 and OS Patch Management

Mon Feb 24 03:49:04 MST 2020

All,

As discussed in the F2F meeting, within the context of a Change 
Management Process for CAs, the question came up regarding OS and 
application patch management, and whether it was intended to be covered 
by SC20.

From my perspective, the answer is an unequivocal "Yes", but I did want 
to take the temperature of the NetSec contributors before I open up 
SC20v2 for re-discussion on the public list.

What follows is _my_ interpretation of how OS/App patches could/should be 
handled: this does _not_ mean that there are not other, probably better, 
ways of handling them.

1. Upstream packages/patches need to come with a well-known source of 
authenticity and integrity. A fairly common example would be packages 
which are digitally signed by a private key; the public key counterpart 
of which is well known and well publicised (or even baked in at OS 
installation time). If you can't know that the package being installed 
is highly likely to be what you _think_ you are installing, then you 
don't have any real constraints on your changes, and you are just hoping 
that no malware sneaks in. [NB: OS/Application vendors can vary hugely 
in their signing key discipline, so it's an open question as to just how 
strong a constraint this truly is]. Ideally both the individual packages 
and the overall repository manifest should be signed by the upstream vendor.

2. A CA has a channel by which it reviews security patches coming from 
the vendor (e.g. Red Hat's rhsa-announce, ubuntu-security-announce, RSS 
feeds, etc). Using this, the CA personnel can judge whether or not they 
could be affected by the security issue, and what the likely effect of 
patching is. (For example, if a Linux kernel has a vulnerability in a 
module which is not capable of being exercised in the CA's deployed 
systems, the CA could determine that the potential disturbance to 
customers is not acceptable, and refuse to curate the patched kernel). 
The decision to accept or reject the patch is recorded in a log (who did 
it, when, and with what result).

3. For each security notification by which the CA decides it is 
affected, and where the risk of keeping the vulnerability is deemed 
unacceptable, any patch files covered by the security notification are 
marked and copied into an internal package repository. [By "internal", I 
mean where the contents of the repository are purely determined by 
responsible CA personnel, and where access to the repository flows via 
trusted network connections.]

4. At some point in the patch cycle, the CA deploys the patches to a 
stage server to determine the likely effect on production systems (The 
stage systems are kept as close as possible to the production 
platforms). The relevant monitoring systems (HIDS, file integrity 
monitors, etc.) will produce logs showing that files have been altered. 
Those logs are stored in preparation for the production deployment.

5. Assuming that no ill effects were noted on the staged deployment, a 
change ticket is opened up to note that production systems A, B, C... 
are going to be patched in the normal cycle.

6. The relevant system administrators apply the patches (whether 
scripted, or individually is a matter for systems personnel) to the 
production systems, and perform such steps as needed to restore the 
production systems to full capability.

7. The monitoring systems of step 4 will record a set of file 
alterations to a log. These change entries are reconciled with the list 
produced in step 4 for the stage systems. Note: it's possible that 
because of scheduling differences and minor discrepancies between stage 
and production, the change logs won't be *identical*, but they should be 
substantively similar.

8. Assuming that no discrepancies in change logs are noted, the change 
ticket of step 5 is closed successfully. If there are change 
discrepancies, then an investigation would need to be launched within 24 
hours (per SC20), as this is an indication of a potential unauthorised 
change.
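
To make the reconciliation of steps 7 and 8 concrete, here is a minimal
sketch in Python. It is illustrative only, not something SC20 prescribes.
It assumes each monitoring system can export the list of file paths it
saw change; the names reconcile, Reconciliation and known_differences
are hypothetical, and known_differences stands in for whatever list of
expected stage/production discrepancies a CA chooses to maintain.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Reconciliation:
    """Outcome of comparing stage and production change logs."""

    # Changed on stage but not seen on production (patch not applied?).
    missing_in_production: set[str] = field(default_factory=set)
    # Changed on production but never seen on stage (unauthorised change?).
    unexpected_in_production: set[str] = field(default_factory=set)

    @property
    def clean(self) -> bool:
        """True when the change ticket could be closed successfully."""
        return not (self.missing_in_production or self.unexpected_in_production)


def reconcile(stage_changes, production_changes, known_differences=frozenset()):
    """Compare the paths altered on stage with those altered on production.

    known_differences holds paths that are expected to differ between
    the two environments and are ignored on both sides.
    """
    ignored = set(known_differences)
    stage = set(stage_changes) - ignored
    production = set(production_changes) - ignored
    return Reconciliation(
        missing_in_production=stage - production,
        unexpected_in_production=production - stage,
    )


if __name__ == "__main__":
    # Hypothetical paths, purely for illustration.
    stage = ["/usr/bin/openssl", "/usr/lib/libssl.so", "/var/log/stage.log"]
    production = ["/usr/bin/openssl", "/usr/lib/libssl.so", "/etc/cron.d/odd"]
    result = reconcile(stage, production, known_differences={"/var/log/stage.log"})
    if result.clean:
        print("Close the change ticket")
    else:
        print("Open an investigation:", sorted(result.unexpected_in_production))
```

Comparing sets of paths rather than raw log lines is a deliberate
simplification: timestamps and ordering will differ between stage and
production, so only the "what changed" part is reconciled here.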

Observations/Open Questions:

1. The change ticket of step 5 can be instantiated from a ticket 
template. For example, if the CA does stage testing on Tuesdays, and 
production deployment on Thursdays, then the ticket could essentially be 
a template which says "Security Patching of Production Equipment", 
listing the systems being patched (and ideally the package deployment 
logs of the step 4 stage deployment). In other words, you don't need to 
generate a discrete change ticket for each package on every system you 
are patching, and get individual approval - that's way too much manual 
effort. Generating one ticket on Wednesday, with approval in place for 
Thursday and covering all systems, is still an effective control.

2. Is it necessary for the internal repository manifest to be re-signed 
by the CA using an internal package key? If this is not done, an 
attacker could cause patches not to be applied by deleting the package 
files prior to deployment. On the other hand, one assumes that the 
internal repository is under the same monitoring regime as the rest of 
the CA's systems, so personnel would be alerted to the deletion (absent 
a log trail, this would be suspicious). However, re-signing the packages 
and repository does create a nice, self-sustaining change log for the 
CA, who can demonstrate to their auditors that the package was curated 
at a given time. That said, software signing keys tend to be shared 
resources, so some additional evidence is required to show which 
individual made the call to accept the package (hopefully the log of 
step 2 is sufficient).

3. Where is the Change Management Process in the above list? I think it 
would be steps 2 and 3, rather than step 5 (Or rather, step 5 is just 
the request and approval after steps 2 and 3 have been followed).

4. Is it allowable for the connections to the internal repository to 
flow over potentially hostile network connections from a highly secure 
zone? Is something like a VPN sufficient to make the hostile network 
threat go away? Or should the repository belong purely within a high 
security network zone, per the rest of the NSRs? For example, if I have 
a repository hosted on S3, where the packages and manifests are signed 
by me, is there an unacceptable risk to the high security systems which 
a purely internal network location would address?
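
As a rough illustration of what a manifest buys you in questions 2 and
4, the sketch below checks a local copy of the internal repository
against a manifest of SHA-256 digests. It is a sketch under stated
assumptions, not a recommended design: check_repository and sha256_of
are hypothetical names, the manifest is assumed to be a simple mapping
of file name to hex digest, and the manifest's own signature is assumed
to have been verified already by whatever tooling the CA uses (that
part is not shown).

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_repository(repo_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files in repo_dir with an already-authenticated manifest.

    Returns three sorted lists of file names:
      missing  - listed in the manifest but absent (e.g. deleted packages)
      modified - present, but the digest does not match the manifest
      unlisted - present, but never curated into the manifest
    """
    present = {p.name for p in repo_dir.iterdir() if p.is_file()}
    report: dict[str, list[str]] = {"missing": [], "modified": [], "unlisted": []}
    for name, expected in sorted(manifest.items()):
        if name not in present:
            report["missing"].append(name)
        elif sha256_of(repo_dir / name) != expected:
            report["modified"].append(name)
    report["unlisted"] = sorted(present - set(manifest))
    return report
```

With a check like this run on the consuming system, a deleted package
shows up as "missing" rather than silently not being applied, and
tampering in transit or at rest shows up as "modified". Whether that is
enough to justify hosting the repository outside the high security zone
is exactly the open question.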

Anyway - I'd really value some thoughts and observations. If we're 
asking CAs to make changes to their internal processes, then it's well 
that we understand the depth, scope and complexity of those changes 
prior to voting.

Cheers,

Neil