[Servercert-wg] OCSP Service Availability

Neil Dunbar ndunbar at trustcorsystems.com
Thu May 14 10:38:23 MST 2020


On 11/05/2020 18:46, Ben Wilson via Servercert-wg wrote:
>
>
> OCSP uptime has recently been discussed in the m.d.s.p. list[1] and a 
> suggestion has been made that we address OCSP uptime in the Mozilla 
> Root Store Policy.[2]  Section 4.10.2 of the Baseline Requirements 
> only specifies 24x7 availability.[3] It could be argued that this is a 
> requirement for 100% uptime. I know that many CAs have SLAs that 
> commit to less than 100% uptime. What is a reasonable baseline 
> requirement? I am interested in co-sponsoring a ballot that says what 
> an expected reasonable uptime should be.
>
> [1] 
> https://groups.google.com/forum/#!topic/mozilla.dev.security.policy/Pnyo3vhMhJY
> [2]https://github.com/mozilla/pkipolicy/issues/214 
> <https://github.com/mozilla/pkipolicy/issues/214>
> [3] https://cabforum.org/wp-content/uploads/CA-Browser-Forum-BR-1.7.0.pdf

> Section 4.10.2 says, "The CA SHALL maintain an online 24x7 Repository 
> that application software can use to automatically check the current 
> status of all unexpired Certificates issued by the CA."

I don't think that 24x7 necessarily implies 100% uptime of the service: 
to me it implies that there are no a priori scheduled times when the 
service will be unavailable (e.g. public holidays, or every second Friday 
in the month). Additionally, a service can suffer an outage local to a 
relying party's query (which might have nothing to do with the CA or its 
service providers, but rather some network issue surrounding the RP).

I think that a good many CAs try to ensure high availability by 
offloading pre-generated OCSP responses to a CDN; the problem in 
maximising uptime there is that OCSP is _spectacularly_ ill-suited to 
traditional CDN architectures [CDNs tend not to cache POSTs, and OCSP 
nonces are a brilliant way of guaranteeing a cache miss]. But even if we 
could guarantee that requests were purely expressed as RFC 5019 optimised 
GET queries, the maximum uptime would then be constrained by the uptime of 
the CDN. I should imagine that most CAs don't deploy their own global 
CDNs, but rather use commercial offerings; not all commercial CDNs will 
guarantee 100% uptime, and the remedies offered for SLO failure are 
typically service credits, which might not be sufficient to cover the 
downside of an OCSP availability failure appearing as an audit qualification.
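For illustration, the RFC 5019 "lightweight" profile puts the whole request 
into the URL, which is what makes the response CDN-cacheable in the first 
place; a minimal sketch of that GET encoding (the responder URL and request 
bytes below are placeholders, not a real responder or a valid DER request):

```python
import base64
import urllib.parse

def ocsp_get_url(responder_url: str, der_request: bytes) -> str:
    """Build an RFC 5019-style OCSP GET URL.

    Per RFC 5019 / RFC 6960 Appendix A, the GET form is
    {url}/{url-encoding of base-64 encoding of the DER-encoded OCSPRequest},
    with no nonce, so every query for the same certificate is byte-identical
    and therefore cacheable by a CDN.
    """
    b64 = base64.b64encode(der_request).decode("ascii")
    # Percent-encode '+', '/' and '=' so the base64 survives as a URL path segment.
    return responder_url.rstrip("/") + "/" + urllib.parse.quote(b64, safe="")

# Placeholder bytes standing in for a DER-encoded OCSPRequest (a real one
# would be produced by an OCSP library from the certificate and its issuer).
print(ocsp_get_url("http://ocsp.example.test", b"\xfb\xff\xbf"))
```

A POST body, or a GET with a nonce, defeats exactly this property: the cache 
key differs on every request, so the CDN forwards everything to the origin.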

Perhaps we should be aiming for the notion that a CA publishes, in its 
CPS, the details of the SLO it intends for its OCSP service, and the 
browsers can then decide whether that meets the criteria for trust in 
their root programs. There would need to be auditable records to show 
evidence that the SLO was met - and a failure to meet the SLO would 
warrant an incident report. From a monitoring standpoint, you would want 
to ensure that the requests to the OCSP service were sufficiently 
randomised to prevent gaming the system by having a "pre-optimised" set 
of responses at the ready (e.g. a single canned OCSP response served in 
place of a 502 from a faulty origin).
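A monitor along those lines might draw its probe targets at random from the 
set of unexpired serials, then compare the observed success ratio against 
the published SLO; a hypothetical sketch (the sampling scheme and the 99.9% 
figure are purely illustrative, not anything the BRs require):

```python
import random

def choose_probe_serials(unexpired_serials, k, rng=random):
    """Sample k serials uniformly at random, so the CA cannot pre-stage a
    fixed set of 'known good' responses for the monitor to hit."""
    return rng.sample(list(unexpired_serials), k)

def meets_slo(probe_results, slo=0.999):
    """probe_results: one boolean per OCSP probe (True = good response
    within the deadline). Returns (measured_uptime, slo_met)."""
    uptime = sum(probe_results) / len(probe_results)
    return uptime, uptime >= slo

# Illustrative run: 10,000 probes with 3 failures -> 99.97% measured uptime.
results = [True] * 9997 + [False] * 3
uptime, ok = meets_slo(results, slo=0.999)
print(f"measured uptime {uptime:.2%}, SLO met: {ok}")
```

The auditable record would then be the raw probe log (which serials, when, 
and what came back), not just the summary ratio, so an auditor can check 
that the sampling really was unpredictable.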

I'm sure there are a million things which could be considered - these 
are my not-well-organised thoughts.

Neil


