[Servercert-wg] OCSP Service Availability
Neil Dunbar
ndunbar at trustcorsystems.com
Thu May 14 10:38:23 MST 2020
On 11/05/2020 18:46, Ben Wilson via Servercert-wg wrote:
>
>
> OCSP uptime has recently been discussed in the m.d.s.p. list[1] and a
> suggestion has been made that we address OCSP uptime in the Mozilla
> Root Store Policy.[2] Section 4.10.2 of the Baseline Requirements
> only specifies 24x7 availability.[3] It could be argued that this is a
> requirement for 100% uptime. I know that many CAs have SLAs that
> commit to less than 100% uptime. What is a reasonable baseline
> requirement? I am interested in co-sponsoring a ballot that says what
> an expected reasonable uptime should be.
>
> [1] https://groups.google.com/forum/#!topic/mozilla.dev.security.policy/Pnyo3vhMhJY
> [2] https://github.com/mozilla/pkipolicy/issues/214
> [3] https://cabforum.org/wp-content/uploads/CA-Browser-Forum-BR-1.7.0.pdf
> Section 4.10.2 says, "The CA SHALL maintain an online 24x7 Repository
> that application software can use to automatically check the current
> status of all unexpired Certificates issued by the CA."
I don't think that 24x7 necessarily implies 100% uptime of the service:
to me it implies that there are no a priori scheduled times when the
service will be unavailable (e.g. public holidays, or every second
Friday in the month). Additionally, a service can suffer an outage local
to a relying party's query, which might have nothing to do with the CA
or its service providers, but rather some network issue surrounding the RP.
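
For a sense of scale, here is a quick sketch of what various SLO figures
would permit in monthly downtime. The percentages are purely
illustrative; none of them come from the BRs or any root program:

    # Illustrative only: monthly downtime permitted by assumed SLO figures.
    MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month

    for slo in (0.999, 0.9995, 0.9999):
        allowed = MINUTES_PER_MONTH * (1 - slo)
        print(f"{slo:.2%} uptime permits ~{allowed:.1f} minutes/month down")

So 99.90% allows roughly 43 minutes a month of outage; 99.99% only
about 4.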
I think that a good many CAs try to ensure high availability by
offloading pre-generated OCSP responses to a CDN; the problem with
maximising uptime there is that OCSP is _spectacularly_ ill-suited to
traditional CDN architectures [CDNs tend not to cache POSTs, and OCSP
nonces are a brilliant way of guaranteeing a cache miss]. But even if we
could guarantee that requests were purely expressed as RFC 5019-optimised
GET queries, the maximum uptime would then be constrained by the uptime
of the CDN. I should imagine that most CAs don't deploy their own global
CDNs, but rather use commercial offerings; not all commercial CDNs will
guarantee 100% uptime, and the remedies offered for SLO failure are
typically service credits, which might not be sufficient to cover the
downsides of an audit qualification for failing to satisfy OCSP uptime.
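
For what it's worth, the cache-friendly form looks something like this
(a sketch using Python's "cryptography" library; the cert, issuer and
responder URL variables are assumed to be in hand, not taken from any
particular CA's setup):

    # Sketch: building an RFC 5019-style OCSP GET URL. Assumes `cert` and
    # `issuer` are loaded x509.Certificate objects and `responder` is the
    # CA's OCSP URL taken from the certificate's AIA extension.
    import base64
    import urllib.parse
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.x509 import ocsp

    builder = ocsp.OCSPRequestBuilder()
    # No nonce extension (per the RFC 5019 profile), so every request for
    # the same certificate is byte-identical: exactly what a CDN can cache.
    builder = builder.add_certificate(cert, issuer, hashes.SHA1())
    der = builder.build().public_bytes(serialization.Encoding.DER)

    # GET form: the base64 (then percent-encoded) DER request becomes part
    # of the URL path, unlike a POST body, which CDNs typically won't cache.
    url = responder.rstrip("/") + "/" + urllib.parse.quote(
        base64.b64encode(der).decode("ascii"))
    print(url)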
Perhaps we should be aiming for the notion that a CA publishes the
details of the SLO it intends for its OCSP service in its CPS, and the
browsers can then decide whether that meets the criteria for trust in
their root programs. There would need to be auditable records to show
evidence that the SLO was met, and a failure to meet the SLO would
warrant an incident report. From a monitoring standpoint, you would want
to ensure that the requests to the OCSP service were sufficiently
randomised to prevent gaming the system by having a "pre-optimised" set
of responses at the ready (e.g. providing a single OCSP response to
serve as the 502 response to signify a faulty origin); a sketch of such
a randomised probe follows.
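
Roughly along these lines (a sketch only; issued_serial_numbers() and
build_ocsp_get_url() are hypothetical helpers, the latter as in the
earlier GET sketch; neither is part of any real tool):

    # Hypothetical sketch of a randomised OCSP availability probe.
    import random
    import urllib.request
    from cryptography.x509 import ocsp

    def probe_once(sample_size=20):
        ok = 0
        # Sample serials at random from the live certificate population,
        # so the responder cannot pre-stage answers for a fixed probe set.
        for serial in random.sample(issued_serial_numbers(), sample_size):
            url = build_ocsp_get_url(serial)
            try:
                with urllib.request.urlopen(url, timeout=10) as resp:
                    body = resp.read()
                answer = ocsp.load_der_ocsp_response(body)
                # Count as "up" only if the responder genuinely answered,
                # not merely returned HTTP 200 with an error status inside.
                if answer.response_status == ocsp.OCSPResponseStatus.SUCCESSFUL:
                    ok += 1
            except Exception:
                pass  # timeouts and network errors count as misses
        return ok / sample_size  # fraction of probes answered successfully

Logged over time, runs like that would also provide the auditable record
mentioned above.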
I'm sure there are a million things which could be considered; these
are my not-well-organised thoughts.
Neil