The Strange World of Public Key Infrastructure and Certificates

An e-mail discussion related to my recent post on IT security has motivated me to ponder issues with Public Key Infrastructure once more. So I attempt – most likely in vain – to merge a pop-sci introduction to certificates with a sort of attachment to said e-mail discussion.

So this post might be opaque to normal users and too epic and introductory for security geeks.

I mentioned the failed governmental PKI pilot project in that post – a hardware security device destroyed the key, and there was no backup. I would have made fun of this – had I not so often experienced that it is the so-called simple processes and logistics that go wrong.

I didn’t expect to find such a poetic metaphor for “security systems” rendered inaccessible. Padlocks in Graz, Austria. Legend has it that lovers attaching a padlock to the bridge and throwing the key into the water will be together forever.

When compiling the following I had in mind what I call infrastructure PKIs – company-internal systems to be used mainly for internal purposes and very often for use by devices rather than by humans. (Ah, the internet of things.)

Issues often arise due to a combination of the following:

  • Human project resources assigned to such projects are often limited.
  • Many applications simply demand certificates so you need to create them.

Since the best way to understand certificates is probably by comparing them to passports or driver licenses I will nonetheless use one issued to me as a human life-form:

Digital Certificate
In Austria the chipcards used to identify you as a patient to medical doctors can also be used as digital ID cards. That is, the card’s chip also holds the cryptographic private key, and the related certificate ties your identity as a citizen to the corresponding public key. A certificate is a file digitally signed by a Certificate Authority which in this case has the name a-sign-Token-03. The certificate can be searched for in the directory (German site).
Digital X.509 Certificate: Details
The public key related to my identity as a citizen (or better: to a database record representing myself as a citizen). Like a passport, the certificate has an end of life and requires renewal.

Alternatives to Hardware Security Modules

An HSM protects that sacred private key of the certification authority. It is often a computer, running a locked-down version of an operating system, and it is equipped with sensors that detect any attempt to access the key store physically – it should actually destroy the key rather than let an attacker gain access to it.

It allows for implementing science-fiction-style (… Kirk Alpha 2 … Spock Omega 3 …) split administration and provides strong key protection that cannot be provided if the private key is stored in software – somewhere on the hard disk of the machine running the CA.

Modern HSMs have become less cryptic in terms of usage but still: It is a hardware device not used on a daily basis, and requires additional training and management. Storage of physical items like the keys for unlocking the device and the corresponding password(s) is a challenge as is keeping the know-how of admins up to date.

Especially for infrastructure CAs I propose a purely organizational split administration for offline CAs such as a Root CA: Storing the key in software, but treating the whole CA machine as a device to be protected physically. You could store the private key of the Root CA or the virtual machine running the Root CA server on removable media (and at least one backup). The “protocol” provides split administration: E.g. one party has the key to the room, the other party has the password to decrypt the removable medium. Or the unencrypted medium is stored in a location protected by a third party – which in turn only allows two persons to enter the room together.

But before any split administration is applied, an evaluation of risks should make sure that the overall security strategy does not look like these cartoons mocking firewalls – as a padlocked gate in the middle of an otherwise open meadow.

You might have to question the holy order (hierarchy) and the security implemented at the lowest levels of CA hierarchies.

Hierarchies and Security

In the simplest case a certification authority issues certificates to end-entities – users or computers. More complex PKIs consist of hierarchies of CAs and thus tree-like structures. The theoretical real-world metaphor would be an agency issuing some license to a subordinate agency that issues passports to citizens.

Chain of certificates associated with this blog: * is certified by Go Daddy Secure Certification Authority which is in turn certified by Go Daddy Class 2 Certification Authority. The asterisk in the names makes it usable with any site – but it defies the purpose of denoting one specific entity.

The Root CA at the top of the hierarchy should be the most secure, because if it is compromised (that is: its private key has – probably – been stolen) all certificates issued anywhere in the tree should be invalidated.

However, this logic only makes sense:

  • if there is or will with high probability be at least a second Issuing CA – otherwise the security of the Issuing CA is as important as that of the Root CA.
  • if the only purpose of that Root CA is to revoke the certificate of the Issuing CA. The Root CA’s key is going to sign a blacklist referring to the Issuing CA. Since the Root should not revoke itself its key signing the revocation list should be harder to compromise than the key of the to-be-revoked Issuing CA.
Certificate Chain
The certificate chain associated with my “National ID” certificate. Actually, these certificates stored on chipcards are invalidated every time the card (which serves another purpose primarily) is retired as a physical item. Invalidation of tons of certificates can create other issues I will discuss below.

Discussions of the design of such hierarchies focus a lot on the security of the private keys and cryptographic algorithms involved.

Yet the effective security of an infrastructure PKI in terms of Who will be able to enroll for certificate type X (that in turn might entitle you to do Y) is often mainly determined by typical access control lists in databases or directory systems that are integrated with an infrastructure PKI. Think of would-be subscribers logging on to a web portal or to a Windows domain in order to enroll for a certificate. Consider e.g. Windows Autoenrollment (licensed also by non-Windows CAs) or the Simple Certificate Enrollment Protocol used with devices.

You might argue that it should be a no-no to make allegedly weak software-credential-based authentication the only prerequisite for the issuance of certificates that are then considered strong authentication. However, this is one of the things that distinguish primarily infrastructure-focused CAs from, say, governmental CAs, or “High Assurance” smartcard CAs that require a face-to-face enrollment process.

In my opinion certificates are often deployed because there is no other option to provide platform-independent authentication – as cumbersome as it may be to import key and certificate to something like a printer box. Authentication based on something else might be as secure, considering all risks, but not as platform-agnostic. (For geeks: One of my favorites is 802.1x computer authentication via PEAP-TLS versus EAP-TLS.)

It is finally the management of group memberships or access control lists or the like that will determine the security of the PKI.

Hierarchies and Cross-Certification

It is often discussed if it does make sense to deploy more intermediate levels in the hierarchy – each level associated with additional management efforts. In theory you could delegate the management of a whole branch of the CA tree to different organizations, e.g. corresponding to continents in global organizations. Actually, I found that the delegation argument is often used for political reasons – which results in CA-per-local-fiefdom instead of the (in terms of performance much more reasonable) CA-per-continent.

I believe the most important reason to introduce the middle level is for (future) cross-certification: If an external CA cross-certifies yours it issues a certificate to your CA:

Cross Certification
Cross Certification between two CA hierarchies, each comprising three levels. Within a hierarchy each CA issues a certificate for its subordinate CA (orange lines). In addition the middle-tier CAs in each hierarchy issue certificates to the Root CAs of the other hierarchy – effectively creating logical chains consisting of 4 CAs. Image credits mine.

Any CA on any level could in principle be cross-certified. It would be easier to cross-certify the Root CA, but then the full tree of CAs subordinate to it will also be certified (For the experts: I am not considering name or other constraints here). If a CA at an intermediate level is issued the cross-certificate, trust is limited to this branch.

Cross-Certification constitutes a bifurcation in the CA tree and its consequences can be as weird and sci-fi as this sounds. It means that two different paths exist, each connecting an end-entity certificate to a different Root CA. Which path is actually chosen depends on the application validating the certificate and the protocol involved in exchanging or collecting certificates.

In an SSL handshake (which happens if you access your blog via https://, using the certificate with that asterisk) the web server is so kind as to send the full certificate chain – usually excl. the Root CA – to the client. So the path finally picked by the client depends on the chain the server knows or that takes precedence at the server.

Cross-certification is usually done by CAs considered external, and it is expected that an application in the external world sees the path chaining to the External CAs.
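
To make the bifurcation a bit more tangible, here is a minimal sketch in Python. It is a toy model and not real X.509: a certificate is reduced to a pair of names, no keys or signatures are checked, and all names (Internal Root, External Root, Middle CA, Issuing CA, www.example.org) are made up for illustration. The cross-certificate is assumed to be issued to a CA at the intermediate level. The function simply enumerates every chain that leads from an end-entity certificate to a self-signed root.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Toy certificate: only the two names matter here (no keys, no signatures)."""
    subject: str
    issuer: str


def build_paths(leaf, pool):
    """Return every chain from leaf up to a self-signed certificate in pool."""
    paths = []

    def walk(chain):
        current = chain[-1]
        if current.subject == current.issuer:  # self-signed: a root, path complete
            paths.append(chain)
            return
        for candidate in pool:
            # A candidate is a possible next link if its subject is our issuer.
            if candidate.subject == current.issuer and candidate not in chain:
                walk(chain + [candidate])

    walk([leaf])
    return paths


# Hypothetical hierarchy: the middle-tier CA holds a regular certificate from
# the internal root and a cross-certificate from an external root.
pool = [
    Cert("Internal Root", "Internal Root"),
    Cert("External Root", "External Root"),
    Cert("Middle CA", "Internal Root"),   # regular certificate
    Cert("Middle CA", "External Root"),   # cross-certificate, same subject
    Cert("Issuing CA", "Middle CA"),
]
leaf = Cert("www.example.org", "Issuing CA")

for path in build_paths(leaf, pool):
    print(" -> ".join(cert.subject for cert in path))
```

Both chains are equally consistent as far as names go. Which one a real client ends up with depends on the certificates it is handed or can find, which is exactly the application-dependent part.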

Tongue-in-cheek I had once depicted the world of real PKI hierarchies and their relations as:

CA hierarchies in the real world. Sort of. Image credits mine.

Weird things can happen if a web server is available on an internal network and accessible by the external world (…via a reverse proxy. I am assuming there is no deliberate termination of the SSL connection at the proxy – what I call a corporate-approved man-in-the-middle attack). This server knows the internal certificate chain and sends it to the external client – which does not trust the corresponding internal-only Root CA. But the chain sent in the handshake may take precedence over any other chain found elsewhere so the client throws an error.

How to Really Use “Cross-certification”

As confusing as cross-certification is – it can be used in a peculiar way to solve other PKI problems – those with applications that cannot deal with the validation of a hierarchy at all or that can deal with only a one-level hierarchy. This is interesting in particular in relation to devices such as embedded industry systems or iPhones.

Assuming that only the needed certificates can be safely injected to the right devices and that you really know what you are doing, the full, pesky PKI hierarchy can be circumvented by providing an alternative Root CA certificate to the CA at the bottom of the hierarchy:

The real, full blown hierarchy is

  1. Root CA issues a root certificate for Root CA (itself). It contains the key 1234.
  2. Root CA issues a certificate to Some Other CA related to key 5678.

… then the shortcut hierarchy for “dumb devices” looks like:

  1. Some Other CA issues a root certificate to itself, thus to Subject named Some Other CA. The public key listed in this certificate is 5678, the same as in certificate (2) of the extended hierarchy.

Client certificates can then use either chain – the long chain including several levels or the short one consisting of a single CA only. Thus if certificates have been issued by the full-blown hierarchy they can be “dumbed-down to devices” by creating the “one-level hierarchy” in addition.
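
The reason this works is that a client certificate's signature is checked against the issuer's public key, and that key (5678) is the same in certificate (2) and in the self-signed twin. Here is a minimal sketch of that idea in Python. Again a toy model: keys are plain strings, "signed_with" stands in for a real signature check, and the client name client-17 and its key are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    public_key: str   # toy stand-in for the subject's public key
    signed_with: str  # toy stand-in for "the signature verifies under this key"


def chain_is_valid(chain, trusted_roots):
    """chain[0] is the end entity, chain[-1] must be a trusted self-signed cert."""
    for cert, parent in zip(chain, chain[1:]):
        # Names must chain, and the signature must verify under the parent's key.
        if cert.issuer != parent.subject or cert.signed_with != parent.public_key:
            return False
    root = chain[-1]
    return root in trusted_roots and root.signed_with == root.public_key


root_ca = Cert("Root CA", "Root CA", "1234", "1234")                     # (1)
some_other_ca = Cert("Some Other CA", "Root CA", "5678", "1234")         # (2)
shortcut_root = Cert("Some Other CA", "Some Other CA", "5678", "5678")   # self-signed twin

# Hypothetical client certificate issued by Some Other CA.
client_cert = Cert("client-17", "Some Other CA", "9abc", "5678")

# A full-featured application trusts Root CA and walks the long chain.
print(chain_is_valid([client_cert, some_other_ca, root_ca], {root_ca}))
# A "dumb device" only has the self-signed twin injected as its trust anchor.
print(chain_is_valid([client_cert, shortcut_root], {shortcut_root}))
```

The very same client certificate passes in both setups. Note that the short chain only holds up because the device has been given the twin certificate as a trusted root, so everything hinges on injecting it safely.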

Names and Encoding

In the chain of certificates the Issuer field in the certificate of the Subordinate CA needs to be the same as the Subject field of the Root CA – just as the Subject field in my National ID certificate contains my name and the Issuer field that of the signing CA. And it depends on the application how names will be checked. In a global world, names are not simple ASCII strings anymore, but encoding matters.

Certificates are based on an original request sent by the subordinate CA, and this request most often contains the name – the encoded name. I have sometimes seen that CAs changed the encoding of the names when issuing the certificates, or they reshuffled the components of the name – the order of tags like organization and country. An application may accept that or not, and the reasons for rejections can be challenging to troubleshoot if the application is running in a blackbox-style device.
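
A small sketch of why this bites, in Python. It hand-encodes a name value the way DER does for short strings (one tag byte, one length byte, then the content), once as PrintableString (tag 0x13) and once as UTF8String (tag 0x0C). The name components and the simplified list-of-pairs representation of a name are assumptions for illustration. Whether a given application compares raw bytes or decoded text is precisely what you cannot see from outside a black box.

```python
PRINTABLE_STRING = 0x13  # ASN.1 universal tag 19
UTF8_STRING = 0x0C       # ASN.1 universal tag 12


def der_string(tag, text):
    """DER-encode a short string: tag byte, length byte, content bytes."""
    data = text.encode("utf-8")
    if len(data) >= 128:
        raise ValueError("this sketch only handles short-form lengths")
    return bytes([tag, len(data)]) + data


# The same text, encoded differently in the request and in the issued certificate.
in_request = der_string(PRINTABLE_STRING, "Some Other CA")
in_certificate = der_string(UTF8_STRING, "Some Other CA")

print(in_request.hex())
print(in_certificate.hex())
print(in_request == in_certificate)          # strict byte-wise comparison
print(in_request[2:] == in_certificate[2:])  # comparison of the text only

# Reshuffled name components (hypothetical values).
requested = [("C", "AT"), ("O", "Example Org"), ("CN", "Some Other CA")]
issued = [("CN", "Some Other CA"), ("O", "Example Org"), ("C", "AT")]
print(requested == issued)                   # order-sensitive comparison
print(sorted(requested) == sorted(issued))   # order-insensitive comparison
```

The two encodings differ only in the first byte, which is enough for a strict comparison to fail while a human looking at the displayed name sees no difference at all.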

Revocation List Headaches

Certificates (X.509) can be invalidated by adding their respective serial number to a blacklist. This list is – or actually: may be – checked by relying parties. So full-blown certificate validation comprises collecting all certificates in the chain up to a self-signed Root CA (Subject=Issuer) and then checking each blacklist signed by each CA in the chain for the serial number of the entity one level below:

Certificate Validation
Validation of a certificate chain (“path”). You start from the bottom and locate both CA certificates and the revocation lists via URLs in each subordinate certificate. Image credits mine.

The downside: If the CRL isn’t available at all, applications following the recommended practices will, for example, deny network access to thousands of clients. With infrastructure PKIs that means that e.g. access to WLAN or remote access via VPN will fail.
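
The validation walk can be sketched in a few lines of Python. This is a toy model under loud assumptions: signatures and validity dates are not checked at all, a CRL is just a set of serial numbers per CA, and the names and serial numbers are invented. A strict application is assumed, that is, one that refuses a certificate when a CRL cannot be retrieved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    serial: int


def validate(chain, crls):
    """Check a chain that runs from the end entity up to the root.

    crls maps a CA name to the set of serial numbers on its blacklist;
    a missing entry means that this CA's CRL could not be retrieved.
    """
    root = chain[-1]
    if root.subject != root.issuer:
        return "chain does not end in a self-signed root"
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject:
            return "broken chain at " + cert.subject
        if parent.subject not in crls:
            return "CRL of " + parent.subject + " unavailable"
        if cert.serial in crls[parent.subject]:
            return cert.subject + " is revoked"
    return "ok"


# Hypothetical two-level hierarchy with one client certificate.
root = Cert("Root CA", "Root CA", 1)
issuing = Cert("Issuing CA", "Root CA", 2)
laptop = Cert("laptop-42", "Issuing CA", 1001)
chain = [laptop, issuing, root]

print(validate(chain, {"Root CA": set(), "Issuing CA": set()}))
print(validate(chain, {"Root CA": set(), "Issuing CA": {1001}}))
print(validate(chain, {"Root CA": set()}))  # Issuing CA's CRL is missing
```

The third call fails although nothing has been revoked: the missing list alone is enough to reject the certificate.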

This makes desperate PKI architects (or rather the architects accountable for the application requiring certificate based logon) build all kinds of workarounds, such as switching off CRL checking in case of an emergency or configuring grace periods. Note that this is all heavily application dependent and has to be figured out and documented individually for emergencies for all VPN servers, exotic web servers, Windows domain controllers etc.

A workaround is imperative if a very important application is dependent on a CRL issued by an “external” certificate provider. If I used my Austrian digital ID card’s certificate for logging on to server X, that server would need to have a valid version of this CRL, which only lives for 6 hours.

Certificate Revocation List
A Certificate Revocation List (CRL) looks similar to a certificate. It is a file signed by the Certification Authority that also signed the certificates that might be invalidated via that CRL. From downloading this CRL frequently I conclude that a current version is published every hour – so there are 5 hours of overlap.

The predicament is that CRLs may be cached for performance reasons. Thus if you publish short-lived CRLs frequently you might face “false negative” outages due to operational issues (web server down…) but if the CRL is too long-lived it does not serve its purpose.

Ideally, CRLs would be valid for a few days, but a current CRL would be published, say, every day, AND you could delete the CRL cache at the validating application every day. That’s exactly how I typically try to configure it. VPN servers, for example, have allowed deleting the CRL cache for a long time, and Windows has had a supported way to do that since Vista. This allows for reasonable continuity while revocation information is still current.
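
The trade-off can be put into numbers with a little arithmetic, sketched here in Python. The function and its parameter names are mine, and “a few days” is assumed to be 3 days. The overlap is the time an outage of the download location can last before a client holding the newest CRL runs into trouble. The staleness is the longest time a revocation may go unnoticed by a caching client.

```python
from datetime import timedelta


def crl_timing(validity, publish_every, flush_cache_every=None):
    """Derive overlap and worst-case staleness from a CRL publication schedule."""
    # Time during which a CRL is still valid after its successor has appeared.
    overlap = validity - publish_every
    # Without a cache flush a client may keep a CRL until it expires.
    staleness = validity
    if flush_cache_every is not None:
        # In the worst case the client picks up a CRL that is almost one
        # publication interval old and keeps it until the next flush.
        staleness = min(validity, publish_every + flush_cache_every)
    return overlap, staleness


# The CRL discussed above: lives for 6 hours, republished every hour.
print(crl_timing(timedelta(hours=6), timedelta(hours=1)))
# The preferred setup: 3 days validity (assumed), daily publication, daily flush.
print(crl_timing(timedelta(days=3), timedelta(days=1), timedelta(days=1)))
```

With the daily flush the second setup tolerates a two-day outage of the publication point, while revocation information is at most about two days old rather than three.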

If you cannot control the CRL issuance process one workaround is: Pro-active fetching of the CRL in case it is published with an overlap – that is: the next CRL is published while the current one is still valid – and mirroring the repository in question.
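
A minimal sketch of such a mirror in Python, under stated assumptions: a CRL is represented only by its two dates, the download is passed in as a function so that no real URL or network call is involved, and all names and dates are hypothetical. The point is merely that a failed download must never replace a copy that is still valid.

```python
from datetime import datetime, timedelta


def refresh_mirror(mirror, fetch, now):
    """Try to update the mirrored CRL; keep the old copy if the fetch fails.

    mirror: dict holding the current copy under the key "crl" (or nothing yet)
    fetch:  callable returning {"this_update": ..., "next_update": ...}
            or raising OSError if the external repository is unreachable
    """
    try:
        candidate = fetch()
    except OSError:
        candidate = None
    current = mirror.get("crl")
    if candidate is not None and (
        current is None or candidate["this_update"] > current["this_update"]
    ):
        mirror["crl"] = current = candidate
    if current is None or current["next_update"] <= now:
        return "no valid CRL"
    return "valid until " + current["next_update"].isoformat()


def fetch_ok():
    # Stand-in for a successful download of a CRL that lives for 6 hours.
    start = datetime(2013, 6, 1, 12, 0)
    return {"this_update": start, "next_update": start + timedelta(hours=6)}


def fetch_down():
    # Stand-in for an unreachable external repository.
    raise OSError("repository unreachable")


mirror = {}
print(refresh_mirror(mirror, fetch_ok, datetime(2013, 6, 1, 12, 5)))
print(refresh_mirror(mirror, fetch_down, datetime(2013, 6, 1, 15, 0)))
print(refresh_mirror(mirror, fetch_down, datetime(2013, 6, 1, 19, 0)))
```

The second call survives the outage thanks to the mirrored copy. The third shows the limit of the approach: once the overlap is used up, the mirror has nothing valid left to serve.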

As an aside: It is more difficult than it sounds to give internal machines access to a “public” external URL. Machines do not necessarily use the proxy server configured for users (which causes false positive results – Look, I tested it by accessing it in the browser and it works), and/or machines in the servers’ network are not necessarily allowed to access “the internet”.

CRLs might also simply be too big – for some devices with limited processing capabilities. Some devices of a major vendor used to refuse to process CRLs larger than 256kB. The CRL associated with my sample certificate is about 700kB:

How the revocation list is located – via a URL embedded in the certificate. For the experts: OCSP is supported, too, and it is the recommended method. However, considering older devices it might be necessary to resort to CRLs.
CRL Details - Blacklist
The actual blacklist part of the CRL. The scrollbar is misleading – the list contains about 20.000 entries (best viewed with openssl or Windows certutil).
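
A back-of-the-envelope calculation with these numbers, written as a few lines of Python. It assumes that “about 700kB” means 700,000 bytes, that 256kB means 256 × 1024 bytes, and it ignores the fixed overhead of the CRL header and signature, so the result is a rough estimate only.

```python
def entries_that_fit(crl_size_bytes, entries, limit_bytes):
    """Estimate the average entry size and how many entries fit below a size limit."""
    bytes_per_entry = crl_size_bytes / entries
    max_entries = int(limit_bytes // bytes_per_entry)
    return bytes_per_entry, max_entries


# Roughly 700 kB for about 20,000 entries, versus a device limit of 256 kB.
bytes_per_entry, max_entries = entries_that_fit(700_000, 20_000, 256 * 1024)
print(bytes_per_entry, max_entries)
```

So each revoked certificate costs a few dozen bytes, and such a device would already give up at well under half the number of entries in this list.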

Emergency Revocation List

In case anything goes wrong – HSM inaccessible, passwords lost, datacenter 1 flooded and backup datacenter 2 destroyed by a meteorite – there is one remaining option to keep PKI-dependent applications happy:

Prepare a revocation list in advance whose end of life (NextUpdate date) is after the end of validity of the CA certificate. In contrast to any backup of key material this CRL can be “backed up” by pasting the BASE64 string to the documentation as it does not contain sensitive information.

In an emergency this CRL will be published to the locations embedded in certificates. You will never be able to revoke anything anymore as CRLs might be cached – but business continuity is secured.

Emergency CRL
An Emergency CRL for my home-grown CA. It seems 9999 days is the maximum I can use with Windows certutil. Actually, the question of How many years should the lifetime be so that I will not be bothered anymore until retirement? comes up often in relation to all kinds of validity dates.
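
Whether 9999 days are enough depends on how long the CA certificate itself still lives. A tiny sketch of that check in Python, with hypothetical dates; the 9999-day cap is simply the value observed above, not a documented constant.

```python
from datetime import datetime, timedelta

MAX_DAYS = 9999  # the maximum observed with certutil, as mentioned above


def emergency_crl_covers_ca(issued_on, ca_not_after, days=MAX_DAYS):
    """Return (covered, next_update) for an emergency CRL issued on issued_on."""
    next_update = issued_on + timedelta(days=days)
    return next_update > ca_not_after, next_update


issued_on = datetime(2013, 6, 1)  # hypothetical issuance date of the CRL

# A CA certificate expiring 20 years later is covered ...
print(emergency_crl_covers_ca(issued_on, datetime(2033, 6, 1)))
# ... one expiring in 2050 is not, since 9999 days are a bit over 27 years.
print(emergency_crl_covers_ca(issued_on, datetime(2050, 1, 1)))
```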

10 Comments

  1. I have not been keeping up with my follows lately Elke … forgive me. I did try to wade through this but, must admit that it took me quite some time and I’m afraid that most of it went right over my head. I’m glad, however, to know that folks like you are thinking about such things. What your ideas did make me very much aware of is how much more I am doing online these days … banking, healthcare paperwork, livestock registrations … and the list goes on. I have two issues with all of these new habits … (1) the security of all of these sites, and (2) the degree to which my ‘data’ is accumulating out there such that folks with the proper knowledge can put together a pretty good picture of me … which, I suppose, is something I’m not that happy about. So … thanks for getting me to think! D

    1. elkement says:

      Thanks, Dave – and no need for apologies. I have been nearly offline, too, in the last days (travelling and busy).
      I really appreciate your comment as I know that this post was very long and very special. It is even too special when just pondering about ‘security’ as the issue with profiles created from the traces you leave online will not be solved by ‘security technology’… as it would be rather hard (though not impossible) to process data while keeping them anonymous.

  2. I particularly like the padlocked gate image near the top of the post as it so nicely illustrates the “root” (pun intended) issue: we can make the infrastructure as secure as we can but someone will find a way to “bend” the system to circumvent all of our hard work. Authentication and encrypted send-receive are of no use if you make compromises at the user level. A key-logger, for example can capture system passwords and such. Once you get admin access to the computer everything else is moot.
    I am quite drawn to one of the essential ideas in the book you so kindly passed my way: keep the context in mind when you design security. In my case the main security I am worried about is privacy protection of particular information such as grades and to a lesser extent personally identifying information (PII). In that case it seems that limiting access is a giant step, that is, set the system so that all access to the grades is logged by user and IP and by system policy only allow school administrators access to them anyway.
    The lingering thought I am still having is this: the current ‘younger’ generation of people, say 35 and under or so are so used to hyper-social-media that they may have no great concern for privacy anyway. As they slowly become the management class what changes will they enact on policy, never mind the underlying mechanisms. Will we all grow in to a society where there’s no need to protect PII because nobody cares about it anymore?

    1. elkement says:

      Thanks, Maurice, especially because I know the length of this post was against all best practices. (I use the root cause pun often, too ;-))
      As for the younger generation: I would bet there are some really skilled young hackers, too, and systems protecting grades and exams would be attractive targets.

      The “management class” question is quite an interesting one, and I am totally biased. I think that management / business education has become more and more (and too much!) about policies and compliance actually. It seems it started with Enron and the like and the baby has been thrown out with the bathwater. I don’t worry at all that freshly minted 30 year old managers in charge of security and compliance today don’t impose enough security policies on the working resources … rather the contrary.
      Years ago a “contract” was signed by shaking hands only – including agreements with large corporations. (Actually still being able to work that way today has become a criterion of mine for picking clients and projects). Today as a vendor you often have to jump through all kinds of hoops and send papers back and forth, most of them through very secure and compliant IT systems. And sign the vendor code of conduct and the statement about your compliance with data retention policies etc. etc. Often these requirements about how to handle customer data for example are logically inconsistent and/or inconsistent with data protection law.
      It is more often the old guard that works with you on circumventing the systems in a way that is still somewhat legitimate.
      So it seems that “secure and compliant systems” in the broader sense (from cryptical HSMs to any corporate policy) do simply not allow to work productively.

      Another interesting question is how people connect their professional over-compliant personas with the lax way they act on social media. I have to leave this to the psychologists – but I think this wouldn’t be the first example for how so-called cognitive dissonance is resolved in surprising ways. I am often stunned by the difference between people’s private values and those that their corporate employers effectively demonstrate (which do not necessarily coincide with published corporate values). So I guess I am trying to say that we (any generation) are more than used to execute “corporate rules” intuitively that run counter to anything we believe in… and younger people might have been immersed in that business world from day 1.

  3. I feel like my brain just got a security update. I’ll have to delay my reboot until I get out of work, though

    1. Joseph Nebus says:

      Yeah, I feel a bit overwhelmed by it all, but I’m fairly confident that another round or two of reading it and it’ll be clear enough.

      1. elkement says:

        I know – this post was of the tl;dr kind. So thanks a lot for reading!!

  4. OK, I can’t say I understood all that… However, I am pondering on a new type of security that we’ll be needing. As I am developing designs for 3D printing, I need to find a way to protect them both before and after they are printed. If someone without the proper authorization tries to print them, I want the printer to get stuck, or the object to fall apart. And if they scan them, I want the scanner to get messed up and produce garbage.

    I have been pondering hardware viruses: structural inclusions that will physically impact an illicit object – only the inclusion of a key in the 3D mesh would stop that. Taking security from digital space to real world space will prove an interesting challenge and a nice new industry.

    Sorry for being 100% off topic!!!

    1. elkement says:

      Thanks – and the off-topic-ness is sort of on topic again: As I pointed out in the previous post on certificates I linked at the top (and maybe only between the lines in this one) I agree with critics of PKI – like Peter Gutmann – who consider it an ancient standard developed in a time nobody could predict which security challenge we have to face today.
      Your question is about a real-world security issue, and chances are high that current standards for certificates would be the best way to tackle it. If I understood correctly this is not similar to “secure printing” (You have to enter a code directly at a printer so that nobody can snatch your “printouts”) but the design – the printing instructions – and not the printer should authorize the owner.

      I am now just thinking aloud…

      You might need something like a signature or a message authentication code created from the print design. When the printer validates this signature/MAC it knows it has been created by yourself and the printer design has not been modified. Issues with this: What if the cryptographic check is turned off in the printer hardware? What if somebody else signed the printer design, too, and replaced only the signature part. One solution would be to encrypt the design, too,…but as soon as the design is once used it has to be decrypted… somewhere, probably only in memory…. so the “printer rootkit” that might have infected the printer would steal the decryption key for future use.
      I think I cannot think out of the PKI box as it would help if the printer would only accept specific “certified” signatures, thus you would not only sign the design but you need to add your certificate related to the signing key – and this would in turn be signed by a trusted CA. But then the hacker would try to hack the store of trusted CAs at the printer (it is a real issue that every application and device has its own CA store in the IT world).
      I have not much experience with Message Authentication Codes but as I understood the principles here they might help: In contrast to “just signing” (= encryption of a hash value with the private key) MACs require a secret key… a secret shared between the owner and “the design”. Normally, the second party is another computer or user, that would have a specifically protected store holding passwords and keys. But in this case the design is just software – I guess it is not an option to turn “printing instruction”, that is: a document, into something like a smartcard = a small computer with a key store… a device somewhat similar to an HSM actually. Thus when you would want to print you insert that “smartcard”, that checks your signature … which requires the MAC secret key held in the physical store of the card. Neglecting security issues with this “card” it would be able to use the key internally but it would not allow an attacker to extract it.
      Sorry – I got carried away – it was such an interesting question… I am sure I missed something obvious … if I ever stumble upon some existing security feature / device / technology / whatever that might be helpful here I will let you know.

      1. Wow, thanks – lots of good thinking. You might well be right that security will have to be built into the printers such that only printers with compatible security can be used.

        It is also possible to watermark the designs, like they do with jpgs etc. That would lead to barely visible or invisible deformations, which could be picked up by 3D scanners and which would allow the scan to pick up the security code.

        The problem isn’t big today but within 5 years people will start copying expensive designer items and make chinese copies themselves. I’m sure someone will insist on 3D security then.
