X.509 implementation notes
I've been putting together a few notes on X.509 implementation issues over the
last few days after discussion with other people revealed that there seems to
be a lot of confusion over how these should be done. A very rough first draft
is included below; if anyone has any comments on them please let me know.
I'll probably put this on the web somewhere when it's finished since it seems
there's a need for this sort of thing.
Just give me a minute to get a fork and some marshmallows before you reply :-).
Peter.
X.509 Implementation Notes
==========================
Peter Gutmann, pgut001@cs.auckland.ac.nz
30 October 1996
Anyone who has had to work with X.509 has probably experienced what can best be
described as ISO water torture, which involves ploughing through all sorts of
ISO, ANSI, ITU, and IETF standards, amendments, meeting notes, draft standards,
committee drafts, working drafts, and other work-in-progress documents, some of
which are best understood when held upside-down in front of a mirror. Because
of this confusion, no one seems to know how to implement X.509 certificates
properly (see 'Known Bugs' below). This document is an attempt at providing a
cookbook for certificates which should give you everything that you can't find
anywhere else without devoting your life to the search, as well as comments on
what you'd typically expect to find in certificates. The conventions used are:
- All encodings follow the DER unless otherwise noted.
- Most of the formats are ASN.1. Occasionally 15 levels of indirection are cut
out to make things easier to understand.
- Questions are marked by '=>' at the start. If anyone has any comments on
these please let me know.
Certificate
-----------
Certificate ::= SEQUENCE {
tbsCertificate TBSCertificate,
signatureAlgorithm AlgorithmIdentifier,
signature BIT STRING
}
The encoding of the Certificate may follow the BER rather than the DER. At
least one implementation uses the indefinite-length encoding form for the
SEQUENCE.
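As a rough illustration of what accepting BER here involves, the following
Python sketch reads a length field and reports the indefinite form instead of
rejecting it. The name read_length and its return convention are made up for
this example; a real decoder would also have to locate the matching
end-of-contents octets (00 00).

```python
from typing import Optional, Tuple


def read_length(data: bytes, pos: int) -> Tuple[Optional[int], int]:
    """Read a BER/DER length starting at data[pos].

    Returns (length, next_pos). length is None for the indefinite form,
    which DER forbids but BER allows.
    """
    if pos >= len(data):
        raise ValueError("truncated length field")
    first = data[pos]
    pos += 1
    if first < 0x80:
        # Short form: the byte itself is the length.
        return first, pos
    if first == 0x80:
        # Indefinite form: content runs until an end-of-contents marker.
        return None, pos
    if first == 0xFF:
        raise ValueError("reserved length octet 0xFF")
    # Long form: the low 7 bits give the number of length bytes that follow.
    count = first & 0x7F
    if pos + count > len(data):
        raise ValueError("truncated length field")
    return int.from_bytes(data[pos:pos + count], "big"), pos + count
```

A caller that only wants DER can treat a None length as an error; one that
wants to read the certificates mentioned above has to cope with it.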
TBSCertificate
--------------
TBSCertificate ::= SEQUENCE {
version [ 0 ] Version DEFAULT v1(0),
serialNumber CertificateSerialNumber,
signature AlgorithmIdentifier,
issuer Name,
validity Validity,
subject Name,
subjectPublicKeyInfo SubjectPublicKeyInfo,
issuerUniqueID [ 1 ] IMPLICIT UniqueIdentifier OPTIONAL,
subjectUniqueID [ 2 ] IMPLICIT UniqueIdentifier OPTIONAL,
extensions [ 3 ] Extensions OPTIONAL
}
Version
-------
Version ::= INTEGER { v1(0), v2(1), v3(2) }
The default version is v1(0). If the issuerUniqueID or subjectUniqueID are
present then the version must be v2(1) or v3(2). If extensions are present
then the version must be v3(2). An implementation should target v3
certificates, which is what everyone is moving towards.
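The version rules can be sketched as a small consistency check. This is only
an illustration in Python; the function names and the boolean flags are
assumptions of the example, not part of any existing library.

```python
V1, V2, V3 = 0, 1, 2  # values of the Version INTEGER


def minimum_version(has_unique_ids: bool, has_extensions: bool) -> int:
    """Lowest version allowed for the optional fields that are present."""
    if has_extensions:
        return V3
    if has_unique_ids:
        return V2
    return V1


def version_is_consistent(version: int, has_unique_ids: bool,
                          has_extensions: bool) -> bool:
    """True if the stated version permits the fields actually present."""
    return minimum_version(has_unique_ids, has_extensions) <= version <= V3
```

For example, version_is_consistent(V1, False, True) is False because a v1
certificate can't carry extensions.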
SerialNumber
------------
CertificateSerialNumber ::= INTEGER
This should be unique for each certificate issued by a CA (typically a CA will
keep a counter in persistent store somewhere, perhaps a config file under Unix
and in the registry under Windows). Another way is to take the current time in
seconds and subtract some base time like the first time you ran the software,
to keep the numbers manageable.
Serial numbers aren't necessarily restricted to 32-bit quantities. For example,
the RSADSI Commercial Certification Authority serial number is 0x0241000016,
which is larger than 32 bits. If you're writing certificate-handling code,
just treat them as a blob which happens to be an encoded integer.
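To show what that blob looks like, here is a minimal Python sketch that
DER-encodes a non-negative INTEGER of any size. The helper names
encode_length and encode_der_integer are invented for the example, and
negative values are deliberately left out.

```python
def encode_length(n: int) -> bytes:
    """DER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_der_integer(value: int) -> bytes:
    """DER-encode a non-negative INTEGER (tag 0x02)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    # One extra bit is needed for the sign, so the content is
    # bit_length // 8 + 1 bytes long. Zero still gets one content byte.
    content = value.to_bytes(value.bit_length() // 8 + 1, "big")
    return bytes([0x02]) + encode_length(len(content)) + content


# The RSADSI serial number above needs five content bytes:
print(encode_der_integer(0x0241000016).hex())  # 02050241000016
```

Code that stores the serial number in a 32-bit variable would truncate this
value, which is why keeping the encoded bytes around is the safer choice.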
Signature
---------
This rather misnamed field contains the algorithm identifier for the signature
algorithm used by the CA to sign the certificate. There doesn't seem to be
much use for this field, although you should check that the algorithm
identifier matches the one on the signature on the cert. If anyone can forge
the signature on the cert then they can also change the inner algorithm
identifier. It's possible that this was included because of some obscure attack
where someone who could convince (broken) signature algorithm A to produce the
same signature value as (secure) algorithm B could change the outer,
unprotected algorithm identifier from B to A, but couldn't change the inner
identifier without invalidating the signature. What this would achieve is
unclear.
Name
----
Name ::= SEQUENCE OF RelativeDistinguishedName
RelativeDistinguishedName ::= SET OF AttributeValueAssertion
AttributeValueAssertion ::= SEQUENCE {
attributeType OBJECT IDENTIFIER,
attributeValue ANY
}
This is used to encode that wonderful ISO creation, the Distinguished Name
(DN), a path through an X.500 directory information tree (DIT) which uniquely
identifies everything on earth. Although the RelativeDistinguishedName (RDN)
is given as a SET OF AttributeValueAssertion (AVA) each set should only contain
one element. However, it *may* contain more than one: there has been a
reported case of a certificate which contained more than one element in the SET.
=> How are these handled? If you've got an entry with a CN and extra
attributes of email address and phone number it's obvious to a human that
the CN is the important one, but how do you handle it in software?
When encoding sets with cardinality > 1, you need to take care to follow the
DER rules which say that they should be ordered by their encoded values
(although ASN.1 says a SET is unordered, the DER adds ordering rules to ensure
it can be encoded in an unambiguous manner). What you need to do is encode
each value in the set, then sort them by the encoded values, and output them
wrapped up in the SET OF encoding.
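The encode-then-sort procedure can be sketched in a few lines of Python. The
helper names are made up for the example, and the elements are assumed to be
complete, valid DER encodings already (for such encodings a plain
byte-by-byte comparison gives the ordering the DER asks for).

```python
from typing import Iterable


def encode_length(n: int) -> bytes:
    """DER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_set_of(encoded_elements: Iterable[bytes]) -> bytes:
    """Wrap already-encoded elements in a DER SET OF (tag 0x31)."""
    # Sort by the encoded bytes, not by the decoded values.
    body = b"".join(sorted(encoded_elements))
    return bytes([0x31]) + encode_length(len(body)) + body


# Two PrintableStrings handed over in the "wrong" order:
elements = [b"\x13\x01B", b"\x13\x01A"]
print(encode_set_of(elements).hex())  # 3106130141130142
```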
You don't have to use a Name for the subject name if you don't want to; there
is a subjectAltName extension which allows use of email addresses or URL's. If
you want to do this, make the Name an empty sequence and include a
subjectAltName extension and mark it critical. Because of this, you should be
prepared to accept a zero-length sequence for the subject name in version 3
certificates.
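A decoder could handle the empty-name case along the following lines. This is
a Python sketch only: the function name and its arguments are hypothetical,
and rejecting an empty name that lacks a critical subjectAltName is an
assumption of the sketch rather than something spelled out above.

```python
EMPTY_SEQUENCE = bytes([0x30, 0x00])  # DER encoding of a zero-length SEQUENCE
V3 = 2                                # value of the Version INTEGER for v3


def subject_is_acceptable(version: int, subject_der: bytes,
                          has_critical_alt_name: bool) -> bool:
    """Accept an empty subject Name only where subjectAltName replaces it."""
    if subject_der != EMPTY_SEQUENCE:
        return True
    # Assumption: an empty Name is only meaningful in a v3 certificate
    # that carries a critical subjectAltName extension.
    return version == V3 and has_critical_alt_name
```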
Typically you would expect to find the following types of AVA's in an X.509
certificate, starting from the top:
countryName ::= SEQUENCE { { 2 5 4 6 }, StringType( SIZE( 2 ) ) }
organization ::= SEQUENCE { { 2 5 4 10 }, StringType( SIZE( 1..64 ) ) }
organizationalUnitName
::= SEQUENCE { { 2 5 4 11 }, StringType( SIZE( 1..64 ) ) }
commonName ::= SEQUENCE { { 2 5 4 3 }, StringType( SIZE( 1..64 ) ) }
The countryName is the ISO 3166 code for the country. A StringType is either a
TeletexString, a PrintableString, a UniversalString, or an IA5String.
There appears to be some confusion about what format a Name in a certificate
should take. In theory it should be a full, proper DN, which traces a path
through the X.500 DIT, eg:
C=US/L=Area 51/O=Hanger 13/OU=X.500 Standards Designers/CN=John Doe
but since the DIT's usually don't exist, exactly what format the DN should take
seems open to debate. Some implementations seem to let you stuff anything with
an OID into a DN.
=> Are there any guidelines for this?
DN's are a very awkward way to handle information about a certificate because
most people will want to associate an email address and a name with a
certificate and no more. If you want to take the easy way out, use an empty
sequence for the subjectName, provide an email address as a subjectAltName
extension, and mark it critical.
Validity
--------
Validity ::= SEQUENCE {
notBefore UTCTime,
notAfter UTCTime
}
The IETF recommends that all times be expressed in GMT and seconds not be
encoded, giving:
YYMMDDHHMMZ
as the time encoding. However certificates have been found which include
seconds. This doesn't lead to an ambiguous encoding because you should never
encode a value of 00 seconds, which means if you read in a UTCTime value
generated by an implementation which doesn't use seconds and write it out again
with an implementation which does, it'll have the same encoding because the 00
won't be encoded.
Non-GMT encodings have never been reported, but it may be a good idea to
include handling for the "+/-xxxx" time offset format just in case (but flag it
as a decoding error).
In coming up with the world's least efficient time encoding format, the ISO
nevertheless decided to forgo the encoding of centuries, so if you find a year
less than 80, add 100 to it (that is, treat YY as 20YY rather than 19YY).
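Putting the Validity rules together, a tolerant UTCTime parser might look like
the Python sketch below. The name parse_utctime is invented for the example.
It accepts GMT times with or without seconds, treats offset forms as a
decoding error, and reads a two-digit year below 80 as 20YY.

```python
import re
from datetime import datetime, timezone

# YYMMDDHHMM, optional SS, then a literal Z for GMT.
_UTCTIME = re.compile(rb"(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})?Z")


def parse_utctime(content: bytes) -> datetime:
    """Parse the content bytes of a UTCTime into an aware datetime."""
    match = _UTCTIME.fullmatch(content)
    if match is None:
        # Covers the "+/-xxxx" offset forms as well as outright garbage.
        raise ValueError("not a GMT UTCTime: %r" % content)
    yy, month, day, hour, minute = (int(g) for g in match.groups()[:5])
    second = int(match.group(6) or 0)  # missing seconds mean 00
    year = 1900 + yy
    if yy < 80:
        year += 100  # no century is encoded, so 00..79 means 2000..2079
    return datetime(year, month, day, hour, minute, second,
                    tzinfo=timezone.utc)
```

Because missing seconds are read as 00, the two encodings of the same instant
compare equal after parsing.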
UniqueIdentifier
----------------
UniqueIdentifier ::= BIT STRING
These were added in X509v2 to handle the possible reuse of subject and/or
issuer names over time. Their use is deprecated by the IETF. If you're
writing certificate-handling code, just treat them as a blob which happens to
be an encoded bitstring.
No occurrences of a UniqueIdentifier have been reported.
Extensions
----------
Extensions ::= SEQUENCE OF Extension
Extension ::= SEQUENCE {
extnid OBJECT IDENTIFIER,
critical BOOLEAN DEFAULT FALSE,
extnValue OCTET STRING
}
X.509 certificate extensions are like a LISP property list: an ISO-standardised
place to store crufties. Extensions can consist of key and policy
information, certificate subject and issuer attributes, certificate path
constraints, CRL distribution points, and private extensions.
[There are large numbers of these things. If people can tell me what they've
encountered in the past I'll document them]
Netscape have quite a few defined for things like CRL URL's and SET defines a
few as well, but none have ever been spotted in the wild.
Open Issues
-----------
The creation of a no-authentication certificate (which is the equivalent of an
unsigned PGP key which states something like "No one has certified this key, but
here it is anyway") is possible using self-signed certificates. Technically
this isn't a legal way to certify a key, and Netscape will reject these keys.
It may be possible to provide an equivalent mechanism with a null signature OID
and signature.
Known Bugs
----------
The following bugs are known to exist in X.509 (and related) implementations
from different vendors:
Verisign
Their PKCS #10 stuff is wrong. According to the PKCS #10 spec, the
attributes field is mandatory, so if it is empty it is encoded as NULL.
Verisign's software however assumes that if there are no attributes, the field
should not be present, treating it like an OPTIONAL field. This is,
unfortunately, what the example in PKCS #10 does - there are no attributes so
it leaves them out, contrary to the spec a few pages earlier.
SHTTP specification
There is at least one invalid PKCS #7 example in earlier versions of the spec.
The most recent draft <draft-ietf-wts-shttp-03.txt> July 1996 (Expires
January-97) fixes this. Implementors should ensure they are working with the
latest version of the draft.
COST
The lengths of some of the fields in CRL's are broken. Specifically, the
lengths of some sequences are calculated incorrectly, so if your code merely
nods at things like SET and SEQUENCE tags and lengths as they whiz past and
then works with the individual fields it'll work, but if it tries to work
with the length values given (for example making sure the lengths of the
components of a sequence add up correctly) then it'll break. The sequence
lengths are longer than the amount of data in the sequence; the COST code may
be adding the lengths of the elements of the sequence incorrectly (it's a bit
hard to tell what's going wrong; basically the CRL's are broken).
Microsoft
Will create certificates with one of the weirder wide-character string types
being used to encode all fields in the DN (probably a BMPString containing
straight 16-bit Unicode; the 32-bit type is UniversalString). While not
illegal, it's a truly Microsoft way of using ASN.1.
Netscape
Invalid encoding of some (but only some) occurrences of the integer value 0
(encoded as integer, length = 0 rather than integer, length = 1, value = 0).
How they can get it right in one place and then wrong 50 bytes later is a
puzzle.
Sample Certificates
-------------------
Include some samples:
- Basic v1.
- Basic v3 with C, O, OU, and CN in the DN.
- Basic v3 with BER encoding of outer wrapper.
- v3 with other fields in the DN (what would be used?).
- v3 with extensions (what would be used?).
- Microsoft certificates with oddball string types.
Acknowledgements
----------------
Eric Young provided lots of useful information on encoding issues and bugs.