[Servercert-wg] Ballot SC22: Reduce Certificate Lifetimes (v2)

Ryan Sleevi sleevi at google.com
Tue Sep 3 10:05:06 MST 2019


On Tue, Sep 3, 2019 at 12:19 PM Doug Beattie <doug.beattie at globalsign.com>
wrote:

> Ryan,
>
>
>
> If I understood and agreed with the reasons for these changes, then I
> could certainly convey this to our customers, but you continue to skirt the
> real subject which is there is not a definitive place where the authors of
> this ballot have laid out the reasons for the change and tied that to the
> proposed timeline.  I’m more than willing to send along the position
> statement and provide commentary on it.
>

You've said that, but what is and remains unclear is how the Ballot is not
that. That's why I'm again trying to understand what it is you feel is
lacking.

GlobalSign has previously been supportive of shorter lived certificates -
for example,
https://www.globalsign.com/en/blog/ssl-certificate-validity-capped-at-maximum-two-years/
-
so it's clear GlobalSign sees benefits and has been able to communicate
those benefits in the past with its customers. Do those benefits not apply
to one year reductions?


> I don’t buy the comment that incident reports are the driving reason for
> shorter periods, or that shorter periods will reduce the number of incident
> reports.
>

https://bugzilla.mozilla.org/show_bug.cgi?id=1547691

In this incident, GlobalSign oversaw a third-party Sub-CA to issue
certificates, which then violated the Baseline Requirements. In this
scenario, because this customer issued certificates at 30 days or less,
they were able to reconfigure and replace the affected certificates
rapidly, and in the worst case, would have been no more than 30 days from
remediation. However, other certificates were manually managed, and
required significant manual effort to replace, which would have created
non-trivial impact if GlobalSign and its Sub-CA followed the BR-required
timeline of 5 days.

In the course of that response, concerns with the Sub-CA were raised.
However, GlobalSign's response was that because the majority of the
certificates were expired, it was reasonable to delay revocation and focus
on holistic replacement, rather than taking immediate steps to protect
users. Here, the overall reduction in lifetime allowed for a better risk
management calculus for those certificates still in use, without having to
worry about 'legacy' certificates that might no longer be used, but were
unexpired.

https://bugzilla.mozilla.org/show_bug.cgi?id=1575880

GlobalSign employees failed to appropriately validate certificates. During
the course of investigation, GlobalSign was able to focus only on unexpired
certificates, and did not examine certificates that had previously expired.
This risk calculus is likely because GlobalSign understands that expired
certificates can "do no harm", even if they may provide useful insight into
the systemic issues behind the failure of GlobalSign and its employees to
validate the data correctly.

https://bugzilla.mozilla.org/show_bug.cgi?id=1393555

GlobalSign failed to properly validate domain names or follow RFC 5280, as
well as oversee that of its technically constrained sub-CAs. In this
situation, GlobalSign had corrected their issuance practice in February
2016; however, they did not discover the issue until February 2017. This
issue went undetected because the customer ordered the certificates in
August 2015, with a two year validity. As a result, this issue would not
have been detected until August 2017 by GlobalSign, except for the fact
that Relying Parties discovered GlobalSign was violating its CP/CPS and the
Baseline Requirements.

Here, shorter lifetimes would have ensured that, as the customer replaced
their certificate in a hypothetical August 2016, the issue would have been
discovered, approximately six months before the community discovered it.
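
As a rough sketch of that timeline arithmetic: the months and years below
come from the incident described above, while pinning each date to the
first of the month and the helper names are assumptions made purely for
illustration.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months (day is kept; we only use day=1 here)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


# Dates from the incident; the day of month is an assumption.
ISSUED = date(2015, 8, 1)      # customer ordered the certificates
FIXED = date(2016, 2, 1)       # issuance practice corrected
REPORTED = date(2017, 2, 1)    # issue discovered

# First natural replacement under each maximum validity period.
renewal_two_year = add_months(ISSUED, 24)   # August 2017
renewal_one_year = add_months(ISSUED, 12)   # August 2016

# Under a one-year cap, the replacement would have been requested after the
# fix and before the community report.
months_earlier = months_between(renewal_one_year, REPORTED)
print(renewal_two_year, renewal_one_year, months_earlier)
```

With a one-year cap, the replacement request lands after the February 2016
fix and six months ahead of the February 2017 discovery; with two years, it
lands six months after that discovery.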

As part of that response, GlobalSign also announced it was moving the
technically constrained sub-CAs it oversaw to managed solutions.
Notwithstanding any concerns for certificate pinning, one would assume that
from the moment that decision is made, it would take GlobalSign
approximately two years to complete that migration from making the service
available. Anything sooner than that would be disruptive to Subscribers, as
it would involve revoking the Sub-CA and requiring a forced replacement of
their certificates. Had the validity period been capped at one year, then
the time period that GlobalSign adopted - roughly a year and a half - would
have been able to be completed sooner (within a year) and without any
disruption or negative impact, simply through the natural cadence of
certificates.

https://bugzilla.mozilla.org/show_bug.cgi?id=1390997

GlobalSign had been failing to follow the EV Guidelines for a number of
years, not enforcing certain provisions. This was reported by Relying
Parties in August 2017. As part of its incident response, GlobalSign shared
that it had corrected the underlying problem in late November 2016.
However, these certificates had all been issued prior to then, and thus
evaded detection. Had lifetime been capped at a year, both the underlying
issue and the improved remediation would have allowed GlobalSign to detect
this particular issue and remedy the underlying issue sooner.

GlobalSign then decided that, despite GlobalSign's violation of its CP/CPS
and the EV Guidelines, it would further violate the EV Guidelines by not
revoking these certificates, and they would be permitted to be used until
their natural expiration and replacement. Here, the mitigation for this was
the fact that many of the certificates would be promptly expiring, due to
the limits on the overall certificate lifetime.

Over the course of investigation, it was determined that GlobalSign had
misissued over 2200 certificates in this form. However, this was mitigated
by the fact that, despite the rampant misissuance by GlobalSign, many of
these certificates were expired.


This is just a small sample of incidents where a key factor
for timely resolution and correction was the certificate validity period.
It allowed GlobalSign to promptly scope the issue, focus on timely
replacement, or otherwise minimize any disruption to their customers, all
caused by failures of GlobalSign to follow the unambiguous requirements.

Now, it may be that GlobalSign does not view its non-compliance with its
CP, CPS, the Baseline Requirements and EV Guidelines, and Root Store
Program Requirements as serious, because no harm was demonstrated. This
certainly echoes DigiNotar, whose non-compliance was otherwise
unobservable and insignificant until catastrophic harm was caused to
hundreds of thousands of Iranian users. Browsers have firmly rejected
this selective approach to compliance, because, much like the story of Van
Halen and brown M&Ms, the failure to spot the little things represents the
chance of systemic and catastrophic failures that can cause real, lasting,
permanent harm.

I only chose GlobalSign incidents, out of a continued desire to avoid
CAs positioning Incident Reports as a means to shame, versus what they
are: an opportunity to improve the ecosystem. That's what Ballot SC22
attempts to do: to learn from those incidents, apply a systemic
understanding about the many varied and complex causes, and to accept that
if we must accept human error as a potentiality in the CA ecosystem, we
should balance that risk with harm reduction, such as reducing the harm
that can be caused when those ever-so-fallible humans make mistakes.


> Yes, there are a couple of incidents where stale data was re-used, but
> typically incidents are for issues other than this.
>
>
>
> What’s missing is a public blog or position by the ballot authors on the
> reasons this is needed and why April 2020 is the drop dead date.  The
> current ballot intro is insufficient.
>

You continue to dismiss the reasoning being given, as you do in this
message, so I'm not sure there's any reasonable path forward. This response
functionally feels like "nuh uh, you're wrong", and that's why it makes it
difficult to explain or even reasonably engage in discussion with.

This latest reply similarly doesn't help move the discussion further, as
appealing as it might sound. For example, you present April 2020 as a
drop-dead date, but haven't engaged on any substantive discussion about
what actual harms are caused, what's unreasonable (which is implicitly
stated in a discussion about a date), or what reasons may exist for
delaying, and when. I've provided a long list of harms which date reduction
would address, once the existing two year certificates were phased out.

I appreciate the attempt to move the Overton window, in order to suggest
that April 2020 is an extreme position, but that somehow, there exists some
more reasonable compromise position. However, GlobalSign hasn't
demonstrated any evidence that would, compared to the harms caused today
and benefits from reduction, justify further delay.


> We need a list of issues and attacks that have resulted in, or have a high
> potential to harm the ecosystem and exactly how these proposed changes
> help more than they hurt.
>

Do you think CAs bear the same burden of discussion for establishing
"hurt"? The only evidence of any harm has been one CA highlighting the
challenges this makes. The information provided by Entrust, DigiCert, and
GoDaddy does not actually do anything to establish that there is any hurt
whatsoever.


>   Describe them without calling out specific CAs or organizations,
> intimidating the community, or demeaning those that have expressed their
> opinion in the past.
>

This is why it's functionally impossible to engage in a reasonable
discussion about this. You cannot have a discussion about the harm
mitigation without discussing the past harm and issues, and I've been very
careful to engage in specific examples precisely because, regardless of the
facts, CAs will spin it as calling them out. For example, we cannot discuss
the SHA-1 issues without calling out the organizations that got a SHA-1
exception, and yet that conveniently would allow CAs to dismiss those
concerns. Similarly, any statement about the desire to protect the security
of users is going to be seen as "intimidating the community", and any
disagreement can be painted as "demeaning", regardless of the merits.

Hopefully the facts provided above, which provide concrete examples of the
harm reduction of reduced lifetimes, and how one year could have corrected
or remediated issues even more than the then-current two years or three
years, and could have further reduced impact and challenges for GlobalSign
customers, moves that discussion forward.

In the two weeks of discussion, I tried publicly and privately to engage
with GlobalSign to enumerate which parts of the Ballot Text it felt were
not accurate or which it disagreed with. I appreciate that you repeated
your call here for the reasons, but you've continually skirted engaging on
the substance, and instead presented it as an argument about presentation,
and so naturally, we haven't been able to engage.