(Long) comments on requirements document
Let me try to make some concrete suggestions for updates
to draft-ietf-ltans-reqs-00.txt to remove from it the
assumption or requirement that a long-term archive MUST
use hash algorithms for time stamps or encryption for
privacy. My goal is not to require non-hash-based time
stamps, but rather not to exclude them from the requirements
document.
I think we can be more productive than continuing the
discussion in the abstract, since I don't think the abstract
discussion is converging, and we're having one of these
talking-past-each-other conversations ("I don't think this
solution will work", "But the problem is important and we
need a solution now").
I started to review the document just for places where
it unnecessarily made assumptions, but wound up reviewing
more generally. I'm sorry for the length.
I also will cheerfully admit that I might be completely wrong,
have misunderstood or misinterpreted the current document,
and so I hope you will take my comments as questions, or
at least as constructive.
=============================
CURRENT:
A long-term archive service aids in the preservation of data over
long periods of time. In particular, it periodically performs
activities to preserve the non-repudiation of existence and integrity
as well as ensuring the availability of data.
Instead of "In particular, it periodically performs...", say
"For example, it might periodically perform...". This leaves
the door open for long-term archive services that do not need
to "periodically" perform activities.
==============================
CURRENT:
A variety of technical and operational means are required to achieve
this goal and technical issues beyond cryptography, such as storage
media lifetime, disaster planning, changes in processing or software
technology, etc., as well as legal issues must be addressed.
I suggest:
NEW:
A variety of technical and operational means are required to
achieve this goal. A solution must address issues such as
storage media lifetime, disaster planning, changes in processing
and software technology and legal issues, in addition to technical
issues of cryptography.
This is just wordsmithing. It may be that you can accomplish
long-term archiving without 'cryptography' per se (well, secret
sharing is described in 'Applied Cryptography' and 'Current Cryptology'
but still).
==============================
About digital signatures:
CURRENT:
Any digital signature that may need to be verified at points in time
well into the future, e.g. past the certificate validity period or
past the cryptoperiod of the signature private key, is a candidate
for preservation using a long-term archive service.
I propose:
NEW:
A long-term archive or notary service may be used to validate the
existence of documents, or assertions of agreements, e.g., originally
asserted with digital signatures, at times in the future well
beyond the validity period of the certificate used originally
for signing the document, or even the validity of the algorithms
available for digital signatures or encryption.
I believe the verification is not of the "digital signature" per se,
but an assertion that the parties involved signed the document at the
date claimed. So it isn't the "signature" that needs to be verified,
it's the agreement. The signature is a way of communicating the agreement.
======================
On requirements:
OLD:
Operational requirements, such as storage media concerns, individual
legal requirements and questions dealing with accounting and billing
techniques are not addressed by this document.
My concern here is that we might not be able to get away with not addressing
other requirements (are they operational or technical?): issues with the
long-term interpretability of file formats, with metadata, and with
identity of principals. I think that we need to address what assumptions
we're making about these things, e.g., that the documents are represented
in formats that the originator believes will be interpretable during the
lifetime of the signature, and that the identities of roles (people and
organizations) are meaningful over the same lifetime.
I think there is a problem with identity on the same time scale as we're
concerned with in long-term archives. AT&T 20 years ago isn't the same
organization as AT&T today.
==========================================
Originator: Role (person or process) who produces, and possibly
signs, a data object that is to be archived. The Originator does
not necessarily generate or request generation of an evidence record
for the data.
Do you want to use the term 'principal' here, since it is standard,
rather than 'Role', and give a reference? Again, I think there may
be a problem if the lifetime of being able to determine identity
isn't as long as the lifetime required for archival.
=============================================
Timestamp: A signed confirmation generated by a Time Stamping
Authority (TSA) that a data item existed at a certain time.
[RFC3161] specifies a good structure for timestamps and a protocol
for communicating with a Timestamp Authority (TSA).
I think there is a problem, in that this definition combines the
function (asserting that a document existed or had been asserted
at a given point in time) and the mechanism for accomplishing that
function (e.g., RFC 3161). I'm not sure RFC 3161 gives a good
structure for timestamps that have to survive for as long as
this document requires.
Also, "Timestamp Authority (TSA)" isn't defined in the glossary,
even though it is referenced.
So I think that it is necessary to separate out the notion of
an "assertion of existence of a document or statement/agreement"
from the electronic timestamp mechanism referenced.
==================================================
User: Role (person or process), who submit data objects for
archiving, request archive packages and verify the evidence of an
archived data object using the associated evidence record,
optionally including the verification of any signatures within the
archived data object itself.
I think it's confusing to call this "User", perhaps "Submitter"?
There are a lot of users running around.
===================================================
3.1.3 Preservation of evidence for signed or timestamped data
Archived data objects may contain digital signatures or time stamps
to be able to prove the origin and the time of existence of these
objects and signatures. In the course of time the value of evidence
of these signatures or timestamps can decrease or can get lost for
many reasons:
I'm not sure that the requirement is to 'preserve the evidence'.
Rather, the requirement is to be able to credibly assert in the future
something that is currently asserted. The future assertion might
or might not use the same mechanism as the current assertion.
So you might accomplish future assertion of timestamp by somehow
"renewing" a timestamp, but you might also accomplish it by having
several trusted organizations willing to assert that, according
to their records, the document was received at a particular time,
with no use of tree-hash-based timestamps.
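To make the second alternative concrete, here is a minimal sketch
in Python of accepting a receipt date only when enough independent
organizations agree on it from their own records. The function name,
the organization names, the date and the threshold k are all
hypothetical illustrations, not anything proposed in the draft.

```python
from collections import Counter
from datetime import date
from typing import Optional


def quorum_receipt_date(
    attestations: dict[str, Optional[date]], k: int
) -> Optional[date]:
    """Return the receipt date that at least k attesters agree on, else None.

    attestations maps an organization name to the date its own records
    show for the document, or None if it has no record of the document.
    Choosing k greater than half the number of attesters rules out ties.
    """
    counts = Counter(d for d in attestations.values() if d is not None)
    if not counts:
        return None
    day, votes = counts.most_common(1)[0]
    return day if votes >= k else None


# Hypothetical records held by three independent organizations.
records = {
    "archive-a": date(2004, 3, 1),
    "archive-b": date(2004, 3, 1),
    "archive-c": None,  # this organization has no record of the document
}
```

With these records, a threshold of 2 would accept the date and a
threshold of 3 would not. Nothing here depends on hash trees or on
renewing a timestamp; the credibility comes from the independence
of the organizations.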
In order to avoid problems in case of disputes in the future it is
necessary to preserve a digitally signed document, as well as
certificates, revocations lists, OCSP responses and time stamps, even
if these elements are not included in the signed document itself.
I don't believe this statement. Certainly keeping all of these things
won't "avoid problems" in the case of disputes. And keeping these things
might not be necessary.
By periodically inspecting and acting upon stored evidence, it is
possible to generate a cryptographically protected history for a data
item that contains no periods of time during which an algorithm was
thought to be weak, an authority thought to be compromised, etc.
I don't believe this sentence either, or at least, not that it
is true without other conditions. After all, there is no way to
predict when an algorithm might suddenly be 'thought to be weak'.
So it's hard to imagine a practice that would guarantee such a
history. Perhaps there is an operational practice which would make
it very unlikely that there would be any such periods,
but my impression was that you were going to leave such operational
practices "out of scope".
=====================================
- The Long-term Archive Service is to store archived data objects
over a long, optionally undefined, period of time.
I think rather than "optionally undefined" you might mean
"arbitrarily long"? Typically in records management,
there is a well-defined retention schedule.
================================
- The Long-term Archive Service provides material needed to prove the
existence and integrity of data objects for users as well as in
court. These means especially are time-stamps, periodically
generated during the archiving period of the data objects. Additional
verification data, to verify these time-stamps after a long period of
time (CRLS, OCSP responses and certificates) need also be provided
Again, I don't think that the particular mechanism alluded
to here -- of periodically generated time stamps -- should be
part of the requirements. It may be part of a proposed solution,
but the requirement is to credibly be able to verify the existence
of the document at its original communication date.
=============================================
A Long-term Archive Service is to not be designed to solve all
thinkable problems of long-term-verification of digital signatures.
It does not provide data necessary to verify signatures which are
part of the archived data object itself. This has to be done by
verifiers using PKI-Services like SCVP (Simple Certificate Validation
Protocol) or DVCS (Data Validation and Certification Server).
This doesn't read right. It is to be designed to not solve these
problems? Or it is not necessary to solve all such problems?
I think this section is alluding to some problems that you want
to rule out of scope.
Of course, the problem here is the confounding of the problem statement
with one proposed mechanism for solving the problem.
I think the point of the section is that the archive service
needs to be aware of the assertions that it is being called
upon to assert in the future, and that embedded signatures
need to be recognized somehow. I don't see any reason why
an archive service SHOULDN'T be able to provide assurance about
embedded signatures as well as external ones, except that the
location of the signature needs to be clear.
I don't see why SCVP or DVCS (references?) are referenced here.
===========================================
- Archive service assuring long-term Non-Repudiation
Long-term Archive Service stores data objects like signed or
unsigned documents for identified and authenticated users. It
generates time stamps for these data objects and obtains necessary
verification data over a given time or until a request of deletion by
this authorized user is sent.
Again, I'm concerned about long-term identity of what might
constitute an "authorized user". This also seems to bake in
the notion that one (and only one) user is required to authorize
a deletion. With contractual data, for example, you might expect
that a third-party archive service might offer to only delete
records where the deletion is requested by both parties, not just
one. This can't be easily simulated with a single role-based
user authorization model.
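As an illustration of the two-party case, a deletion rule might
look something like the following Python sketch. The function and
the party names are hypothetical; the only point is that the
authorization check takes a set of parties rather than a single
authorized user.

```python
def deletion_authorized(required_parties: set[str], requesters: set[str]) -> bool:
    """Allow deletion only when every required party has requested it.

    An empty set of required parties never authorizes deletion, so a
    record with no known parties cannot be deleted by accident.
    """
    return bool(required_parties) and required_parties <= requesters


# Hypothetical parties to an archived contract.
contract_parties = {"buyer", "seller"}
```

Here a request from the buyer alone would be refused, while a
request carrying both parties' consent would be allowed.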
And this 'requirement' seems to bake in the 'periodic replenishment
of timestamps' mechanism unnecessarily.
================================================
- Pure long-term non-repudiation Service
Long-term Archive Service only guarantees non-repudiation of
existence of data. It periodically generates time stamps and obtains
additional verification data for a given period of time. It stores
data objects (e.g. documents, but also relevant parts of documents
containing signatures) locally only for the purpose of non-
repudiation. It is not a document-archive for users and therefore
does not provide retrieval of documents and no deletion of data
objects. Therefore it does not need any access control.
I'm not sure that access control is unneeded even for pure non-repudiation
services, if it is possible to retrieve the metadata, or to search
for documents. I think "it stores data objects" is not prescriptive,
but is intended as a possibility.
I *think* you're trying to define a long-term notary service with
this scenario -- a service that can assert, over the long term, that a
dated assertion was made at a particular time. I'm not sure why you
aren't using the term, though.
=====================================
3.3 Instances and overall architecture
Is this section meant to be an example, or actually prescriptive?
Is this just saying how you MIGHT build an archive or notary service,
or are you requiring them all to work this way? The text sort of
sounds like you intend it to be normative, and I'm not sure why
this belongs in a requirements document.
Users transfer the data objects that shall be archived at the Trusted
Archive Authority (TAA) using their application of choice.
I'm confused -- it sounds like you're imagining that there
is a user-determined transfer protocol for actually uploading
the data, and that the archive protocol then is applied, after
the fact, to the previously transferred data. Is this what you
meant?
================================
Long-term Archive Service may allow for relays using Long-term
Archive Protocol. The use of external archive services may be also
possible. But Relaying must be transparent to the client.
Here you're allowing the TAA to relay the request to
other archive services (1 out of 2) and possibly to
multiple services (K out of N). So this architecture
might be consistent with K out of N archiving after all,
without too many changes.
============================
A TAA may be a server within an enterprise network communicating with
local archive servers and other applications or an external service
accessible via internet.
Is this an issue of firewalls, trust and organizational
control, or some other factors? This is the first time
the document alludes to the possible organizational relationship
between the various parties involved, but I think the
relationship is crucial, an important part of the analysis
of the validity of long-term archives, and probably needs
to be expanded elsewhere.
If an organization is charged with archiving its own documents,
then it could assert that a contract was valid even if it
wasn't. Doesn't the long-term threat model for document
forgery require organizational independence of the
TAA/TSA that is doing revalidation of signatures?
=================================
4. Long-term Archive Service functional and quality requirements
...
- Generate, store and maintain evidence records (i.e. by
periodically obtaining timestamps) for data objects submitted for
preservation;
I think the requirement is to be able to provide,
over the lifetime of the record, assurances that were
originally obtained through techniques such as timestamps
and digital signatures. I don't think there is a requirement
for periodically obtaining timestamps.
========================================
- Be able to provide an acknowledgement that a data object existed
at a certain time, as an alternative, if user is not able to
interpret the evidence record;
I think this is a key to a longer discussion about how
to provide for long-term interpretability of records,
through providing records in standard archivable formats,
and providing, at the time of archiving, conversions of
the record to multiple formats, etc.
========================================
A long-term archive service must be able to work efficiently even for
large amounts of archived data objects. In order to limit expenses,
costs and dependency on high performance, time-stamp services, the
number of necessary time stamps MUST be minimized and a time stamp
should include a large number of signatures and documents;
This is pretty strong for a MUST (and RFC 2119 is probably not
appropriate here anyway).
I don't understand why this is here at all. I guess this is assuming
some cost model for timestamp services? What is the use case where
this matters?
===========================
Necessity to access stored archived data object SHOULD be minimized.
It SHOULD only be necessary access to the archived data objects only
if the archived data objects are requested by users or if applied
hash algorithms become insecure.
This comment doesn't apply only here, but I just thought of it.
There's some assumption here about identifiers for
records, or searching for records, or record locators,
that isn't discussed in this draft. (I remember some
conversation about this at the meeting, though.) How
are records identified? Are the record identifiers globally
unique? Is the hash of the document used as its identifier,
for example? Who has access to the record?
Interestingly, even if one-way hash algorithms are cracked
(someone finds a way of generating another document with different
content with the same hash), it's still unlikely that you could
_discover_ a valid document hash without knowing anything.
So you might provide access control for documents by using
document hashes as the key for accessing the document.
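A minimal Python sketch of that idea follows. The class name is
hypothetical, and the use of SHA-256 is only an assumption made
for the example; the draft does not say which hash would be used.

```python
import hashlib
from typing import Optional


class HashKeyedStore:
    """Toy in-memory store where the document hash is the retrieval key."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def submit(self, data: bytes) -> str:
        # The handle is derived from the content itself (SHA-256 is an
        # assumption of this sketch), so only someone who already holds
        # the document, or was given the handle, can name it.
        handle = hashlib.sha256(data).hexdigest()
        self._objects[handle] = data
        return handle

    def retrieve(self, handle: str) -> Optional[bytes]:
        # An unknown handle reveals nothing beyond "no such object".
        return self._objects.get(handle)
```

A submitter would keep the handle returned by `submit` and present
it later to `retrieve`. Someone without the document would have to
guess a 256-bit value to get anything back.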
I don't understand why it is important to minimize the access
to stored archived data, as a protocol requirement. This sounds
like it's an operational requirement for a well-implemented
TAA service, but there are other operational and implementation
requirements that aren't mentioned here. ("It should work.
It should run on commercially available platforms. It
shouldn't have any security holes. It shouldn't erase
its disk periodically.").
=========================================
The data structure for the evidence record itself should have the
following properties:
- It MUST be possible to include all timestamps necessary to
verify the existence of the archived data objects.
....
Again, if there are ways of providing evidence for the
existence of an assertion or signature or record at a
given point in time, then this requirement would be empty.
==============================
- It SHOULD be possible to provide evidence for groups of archived
data objects. For example, it should be possible to archive a
document file and a signature file together such that they get the
same evidence record.
- Where groups of data objects are submitted, non-repudiation
proof MUST still be available for each archived data object
separately.
Is this a 'record' requirement or a protocol requirement?
Why is this a requirement? The obvious use cases given
in the document ("signed contracts or agreements, wills,
property deeds, ...") don't have this kind of
performance requirement.
What are the uses for "CRLs and timestamps over any
type of data" that would have this requirement?
=======================================
- It SHOULD be possible to create timestamps without the need to
access the archived data objects. The access to the archived data
SHOULD only be necessary if the security suitability of employed hash
algorithm is menaced.
Why is this a requirement? Is this a performance requirement?
An operational one? Just something you think might be a good
idea? How often are timestamps created that this is important?
====================================
- It SHOULD be possible to package all evidence along with the
archived data objects in a single data item or to package evidence
and archived data objects in separate items.
I think there is some assumption about 'evidence records'
and their use that might be unclear. Evidence records are
used to send assertions TO the archive service at the time
the document is archived. And it might -- or might not --
be used to supply evidence when there is a dispute. Is it
a requirement that the same data structure be used for both?
========================================
Standardization of a protocol for interactions with a Long-term
Archive Service is desirable. The protocol should have the following
properties:
Again, here you say 'should' in front of a bunch of 'MUST's.
I think probably the right thing to do is to get rid of the
MUSTs in some cases, and get rid of the requirement in others.
==============================================
- The protocol MUST define interactions with a Long-term Archive
Service including, at a minimum: submission of data or groups of
data for preservation, retrieval of archive packages and deletion
of archived data and associated evidence records.
I'm concerned about deletion, since a threat to a contract is
to have one of the parties delete it. And I'm not sure of the
value of deletion, or that it is actually possible, e.g., if
the TAA backs up its disks and sends them offsite, is it possible
to actually delete its records from its offsite backups?
And I'm not sure 'submission of groups' is a MUST requirement.
It sounds like a minor optimization. After all, the bulk of
the protocol overhead is probably sending the data.
=====================================
- The protocol MUST provide a response indicating successful
submission or deletion of data. The acknowledgement of successful
submission SHOULD permit a submitter to verify that the correct data
was received by the service for preservation, e.g. the
acknowledgement could include an index, a signature or a timestamp
obtained for the archived data object.
Shouldn't you always verify the correct data? I think that
it's more important to provide for the converse: if there are
any errors or difficulties with the request, the failure must
be reported. Otherwise, the best you have is the archive service's
guess that all went well.
================================
- The protocol MUST response an index to retrieve the archive
package. It also should be possible to retrieve archive packages by
using hash values of the archived data objects.
Earlier you had it possible to get the data independent of
the assertion. And "the hash value of the archived data object"
is (as I said earlier) one way to get a handle on the document.
But it's also problematic -- which hash? which handle?
I think this mixes mechanism with requirement. When a package
is archived, it gets a handle, and the protocol should allow
for the retrieval of the package, assertions about the package,
and pieces of the package.
=============================================
- The protocol SHOULD support some basic Metadata (Mime-Type, key
words, etc.), i.e. the client should be able to provide metadata
along with the archived data to facilitate future search operations
based on the metadata.
Oh, this one is really problematic. First, allowing for search
introduces lots of discovery attacks that might not be necessary.
Why is this important? If the client wants to archive an index
along with some documents, let them archive the index.
On the other hand, metadata is crucial. Bits aren't very useful
without an understanding of how to interpret them. The question is
whether you need just the Content-Type, or all of the
other Content-* MIME headers, e.g., Content-Language?
For example, it would be useful to allow the possibility of
multiple renderings of the same document in multiple formats,
along with an assertion of their equivalence, for long-term
document retention (e.g., XHTML, PDF/A, TIFF 6, Unicode text).
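One way to picture such a submission is the following Python
sketch. The class and field names are hypothetical, chosen only
for illustration; the draft defines no such structure.

```python
from dataclasses import dataclass, field


@dataclass
class Rendition:
    """One rendering of a document, with the metadata needed to read it."""

    content_type: str           # MIME Content-Type of this rendering
    data: bytes
    content_language: str = ""  # optional Content-Language, e.g. "en"


@dataclass
class ArchivedDocument:
    """A document submitted as several renderings of the same content."""

    renditions: list[Rendition] = field(default_factory=list)
    # True when the submitter asserts that all renditions are equivalent.
    asserted_equivalent: bool = False

    def content_types(self) -> list[str]:
        return [r.content_type for r in self.renditions]


# Hypothetical submission of one agreement in three formats.
doc = ArchivedDocument(
    renditions=[
        Rendition("application/xhtml+xml", b"<html>...</html>", "en"),
        Rendition("application/pdf", b"%PDF-...", "en"),
        Rendition("text/plain; charset=utf-8", b"...", "en"),
    ],
    asserted_equivalent=True,
)
```

The equivalence flag is itself an assertion by the submitter, so
it would need the same long-term protection as the document.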
===============================================
- If a Long-term Archive Service does not support a client-
requested long-term archive policy, the service MUST return an error.
- A Long-term Archive Service MUST provide an indication of the
long-term archive policy under which the service operates.
This is the first mention that I can find of 'archive policy'.
What is this, how does a client request one, what are
servers allowed not to implement and still be compliant?
(There's no point in defining a standard for behavior of
non-compliant participants, since they're already non-compliant!).
How does a server 'indicate' a policy?
========================================
- The protocol MUST prevent replay attacks.
It's odd that this is mentioned without any other attacks
being mentioned. And it's not clear what kind of
"replay attack" is being asked to be prevented. All?
Even ones on unrelated services?
=========================================
- The protocol SHOULD permit encryption of data before submission
in such a way that there exists non-repudiation evidence for the
unencrypted data.
So this is assuming that communication between requestor
and service isn't over TLS? I'm not sure why this
is a requirement.
=====================================
- The protocol SHOULD provide means of associating submitted data
objects with previously submitted data objects, i.e. to facilitate
retrieval based on aggregation of objects over time.
This is the first mention of "associating data". Does this
mean that access to the first data object also gives you access
to the second? Or just vice versa? How is the association
noted? What is this used for? I don't see this in the use
cases for contracts, land deeds, etc.
====================================
- The protocol SHOULD provide means for specifying a point in time
at which an archived data object need no longer be preserved. It
also should be possible to extend the archiving period.
Extending the archiving period may also be problematic. Who
is authorized to extend the archiving period for an agreement
that was previously agreed would expire?
==============================================
- The format for the acknowledgements MUST allow the
identification of the archiving provider.
- The format for the acknowledgements MUST allow (at least from
the creating archiving provider) the identification of the
participating client.
See comments above about identity. Perhaps this is just
metadata about the archived data rather than identity?
========================================
- Responses must uniquely reference corresponding requests
This seems out of the blue. Any protocol with requests and
responses needs some way of linking them, but the linkage might be
temporal (by ordering) rather than through identifiers.
================================
- It should be possible to sign requests and responses. It is
recommended that in particular acknowledgements are signed.
Why is this a requirement? There is some security requirement
for authentication of the participants in the protocol and their
authorization to use the protocol and to be identified later,
but why is it important to sign the requests and responses
rather than, say, using TLS for the communication?
======================
- Deletion must be authenticated.
See comments above about deletion. Deletion must be authorized,
and authorization policy for Deletion must be observed.
==============================
- The archive service MUST be able to provide evidence about the
policies that have been used at any time
I'm not sure how it is possible to supply evidence. It
is very hard, in general, to provide evidence for policies
that were used, especially if the policies are baked into
the code or into various tables. And some of the policies
are operational policies, not technical ones.
====================================
- The protocol SHOULD be as simple to use as possible.
A quote attributed to Einstein: "Everything should be as
simple as possible, but no simpler".
Personally, I think that protocols are simple to use if
they aren't new protocols. I wonder if the archival protocol
could be implemented by using WebDAV over TLS with
certain well-known dynamic properties identified for
archival objects.
====================================
Means to enable accountability, access control, confidentiality of
communication between applications and TAA need additional
precautions (like SSL) that are outside the scope of these
requirements.
I think not. That is, there is a service requirement
for those things, and there has to be some way of
accomplishing them. In some cases, you're asking
for things like signatures of requests which have
only a transactional lifetime (of the request), and might
be accomplished other ways.
=======================================
7. Security Considerations
This section isn't really about 'security considerations' in
the typical sense that IETF requires. What you're supposed to
do is outline the threats to a long-term archive or notary
service. You've written some things that also need to
be considered, but probably as protocol requirements.
=================================================
Where non-standard formats are used or proprietary processing is
employed, verification of signatures on or in archived data may
require the availability of specific applications or tools.
I think this is really a matter of whether or not the
long-term archive or notary service relies on these things
at all.
An archive service cannot assert the validity of a signature
(at the time of submission) if it can't interpret the signature.
And the believability of the assertion cannot be better than the
believability of the signature and its associated software.
===================================================
Certificate revocation could retroactively invalidate previously
verified signatures. Measures may be implemented to support such
claims by an alleged signer, e.g. collection of revocation
information after a grace period during which compromise can be
reported or preservation of subsequent revocation information.
This is some requirement on the archive service at the time
of submission, isn't it?
========================================================
Access control mechanisms associated with data stored by a TAA should
consider the lifespan of the data object. For example, the
credentials of an entity that submitted data to an archive may not be
available or valid when the data needs to be retrieved.
Well, worse, the identity of the entity might not be clear after 30 years.
========================================================
To achieve accountability, local means should be employed to ensure
that all data is inserted in chronological order, e.g. by using
write-once media. Similarly, methods should be deployed to ensure
that all deletions are detectable.
This is completely out of place. It's one idea for an operational
practice which might or might not be credible for later asserting
validity. And if you have 'write-once media', then deletion isn't
possible!
==================
Larry
--
http://larry.masinter.net