Here are a few comments (before I leave on
holiday).
Besides the comments made below, the document contains more than 50 typos
or incorrect grammatical constructions.
The document states on
page 6:
"An LTA is a part of a general archive service that
provides evidence used to
demonstrate the existence of an archived
data object at a given time and
the integrity of the archived data
object since that
time".
However, nothing in the protocols described makes it possible to achieve
this property.
The document should consider the existence of a journal
for keeping track of all operations
that are performed using LTAP. In other
words, an "archive journal" should be considered
and actions like "read",
"search" and "delete" allowed on it.
An archive service cannot work
without any data storage capabilities, contrary to what is stated at the top of page 7:
"Alternatively, it may simply act as an evidence
or information service without data storage capabilities
(it relies upon other services for storage of the archived data)."
It may archive imprints of data rather than the data itself, and
thus needs to have storage capabilities.
On page 7, the document
states:
"They are atomic elements of an LTA service
consisting of three logical parts:
o Archive data (including metadata or
other related data) entering
the LTA using
the interaction protocol,
o Archive process-related
meta or binding information, and
o Evidence
information"
However, later on, the document does not consider "archive process-related meta or binding
information" or "Evidence information".
The data to be considered instead should be:
- Archive data associated with metadata,
- Archive data references with metadata, and
- Metadata alone.
If some evidence is provided, it should be considered as
a special type of metadata.
The document should distinguish
between metadata that is opaque to LTAP and metadata
that needs to be
interpreted by LTAP, for example the date of storage, the end of retention, etc.
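To make the distinction concrete, here is a minimal sketch in Python. The key names "storage-date" and "retention-end" and the function split_metadata are assumptions made for illustration only; nothing in the draft defines them.

```python
from datetime import date
from typing import Dict, Tuple

# Hypothetical names for the metadata that the archive service itself must interpret.
INTERPRETED_KEYS = frozenset({"storage-date", "retention-end"})


def split_metadata(metadata: Dict[str, str]) -> Tuple[Dict[str, date], Dict[str, str]]:
    """Separate interpreted metadata (parsed as ISO dates) from opaque metadata."""
    interpreted = {
        key: date.fromisoformat(value)
        for key, value in metadata.items()
        if key in INTERPRETED_KEYS
    }
    opaque = {
        key: value for key, value in metadata.items() if key not in INTERPRETED_KEYS
    }
    return interpreted, opaque
```

Opaque entries are stored and returned unchanged, whereas interpreted entries are parsed, so a malformed date would be rejected at deposit time.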
The document should consider the existence of "archive profiles". The
document currently considers
a "service policy" but the concept behind this
wording is not sufficiently explained.
It seems close to an archive profile (page 14) but is
different:
"A client MAY
indicate one or more service policy identifiers associated to a service type
in order to select different features to be performed by the
LTA".
It should be possible to define an "archive policy" so that a
deposit can be made according
to that policy. The archive
policy should then state which metadata are mandatory
when making a deposit
and which are automatically computed and added to the data at that
time.
Archive policies may be created and deleted. The operations
performed on archive policy objects
(e.g. create, read, modify, list,
delete) should be logged in the journal.
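As an illustration of a deposit made according to an archive policy, here is a minimal Python sketch. The class ArchivePolicy, the function deposit and the metadata names ("archive-policy", "deposit-time", "imprint-sha256") are hypothetical and not taken from the draft; SHA-256 is chosen only as an example imprint algorithm.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class ArchivePolicy:
    """Hypothetical archive policy: an identifier plus the metadata it requires."""
    policy_id: str
    mandatory_metadata: FrozenSet[str]


def deposit(policy: ArchivePolicy, data: bytes, metadata: Dict[str, str],
            now: datetime) -> Dict[str, str]:
    """Check the mandatory metadata, then add the automatically computed ones."""
    missing = policy.mandatory_metadata - set(metadata)
    if missing:
        raise ValueError(f"missing mandatory metadata: {sorted(missing)}")
    complete = dict(metadata)
    # Metadata computed by the archive service rather than supplied by the client.
    complete["archive-policy"] = policy.policy_id
    complete["deposit-time"] = now.isoformat()
    complete["imprint-sha256"] = hashlib.sha256(data).hexdigest()
    return complete
```

A deposit that omits a mandatory item is refused before anything is stored, which is the behaviour one would expect a policy to enforce.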
When
archiving some data, it is fundamental to know when the data should be deleted.
It should be possible
to ask the archive system to identify which
archives should be deleted. With the current operations, this is
impossible.
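The kind of query meant here could be as simple as the following Python sketch. It assumes that each archived object carries an end-of-retention date; the function name due_for_deletion and the shape of its input are hypothetical.

```python
from datetime import date
from typing import Dict, List


def due_for_deletion(retention_end: Dict[str, date], today: date) -> List[str]:
    """Return the identifiers whose retention period has ended, sorted for stable output."""
    return sorted(
        object_id for object_id, end in retention_end.items() if end <= today
    )
```

This only works if the end of retention is metadata that the archive service understands, which is one more reason not to treat all metadata as opaque.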
On page 7, the document states:
"The
LTA performs perpetual maintenance of archive objects".
First of all, different concepts lie behind this
wording, as indicated at the top of page 8: "(e.g. by proof-reading,
copying to
new material, or performing time stamp renewal)". The VERIFY
operation as defined in section 5.4
is too vague to achieve
these different goals.
Note that the word "operation" is used on page 9
and the wording "service type" is used on page 31.
The word "operation"
should be used everywhere.
The VERIFY operation is not always necessary
since some systems automatically take care of moving data
from one medium to another. Some systems also automatically take care of
duplicating data
across different sites.
This topic is related to
another issue.
On page 19, the document
states:
"Servers MUST create a server-wide unique identifier for each
data
object managed by
the LTA. The identifier MUST be
global during the
intended lifetime of an object".
On page
30, the document states:
"It MUST be ensured that in an actual context of a client/server
network names are scalable and
global both in terms of actual
community space and time to live of the treated data
objects".
If the
reference of the data is server-specific (rather than service-specific) and the
server is changed,
then the reference would change and the evidence would no
longer be valid.
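The difference can be sketched in Python as follows. This is only one possible design, under the assumption that the archive service keeps a mapping from references to the server currently holding each object; the class ArchiveService and its methods are hypothetical.

```python
from typing import Dict


class ArchiveService:
    """Hypothetical service whose references name the service, not a server."""

    def __init__(self, service_name: str) -> None:
        self.service_name = service_name
        # Maps a service-scoped reference to the server currently holding the object.
        self._server_of: Dict[str, str] = {}

    def register(self, local_id: str, server: str) -> str:
        reference = f"{self.service_name}/{local_id}"
        self._server_of[reference] = server
        return reference

    def migrate(self, reference: str, new_server: str) -> None:
        if reference not in self._server_of:
            raise KeyError(reference)
        self._server_of[reference] = new_server

    def locate(self, reference: str) -> str:
        return self._server_of[reference]
```

With such a scheme the reference bound into the evidence survives a change of server, because only the internal mapping is updated.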
Let us suppose that media migration is not automatic and
that no automatic backup of the storage is supported.
The document should
provide some guidance on the use of the protocol. Currently it seems that at
least
two operations would be needed for the deposit. To improve
performance, it would be better to allow a deposit to address
two servers, or better still two archive services,
at the same time.
On page 9, one operation is missing.
There should be a MODIFY operation to modify metadata.
Some metadata may not
be modifiable, e.g. the time of an initial deposit. So the archive systems
should be able to recognize which metadata is or is not
modifiable.
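A MODIFY operation of this kind might behave as in the following Python sketch. The set of non-modifiable keys and the function modify_metadata are assumptions for illustration; the draft defines neither.

```python
from typing import Dict

# Hypothetical set of metadata that may never be changed after the deposit.
IMMUTABLE_KEYS = frozenset({"deposit-time"})


def modify_metadata(current: Dict[str, str], changes: Dict[str, str]) -> Dict[str, str]:
    """Return updated metadata, refusing any change to non-modifiable items."""
    blocked = IMMUTABLE_KEYS & set(changes)
    if blocked:
        raise PermissionError(f"metadata not modifiable: {sorted(blocked)}")
    updated = dict(current)
    updated.update(changes)
    return updated
```

The request is rejected as a whole when it touches a non-modifiable item, so a partially applied MODIFY cannot occur.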
There should be a SEARCH operation, which is more explicit
than LISTIDS mentioned on the top of page 10.
The EXPORT operation is
mentioned. This naming may be understood as "read and delete". It would be
advantageous
to call it READ. Read would apply to "data or imprint +
metadata" or to metadata only.
In order to increase the performance
there should be a COPY operation. This means that the data would not flow
through the client, but directly from one LTAS to another LTAS.
On
page 13, there is an odd sentence:
"Some
metadata MAY remain available even after deleting the object".
If data is deleted, the metadata associated with
the object is also deleted. However, the records
of the various operations
performed on the data are in the journal and are not deleted.
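The intended behaviour can be sketched in Python as follows. The classes JournalEntry and Archive and their methods are hypothetical; they only illustrate that deleting an object removes its data and metadata while the journal keeps its records.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class JournalEntry:
    operation: str
    object_id: str


class Archive:
    """Hypothetical archive keeping objects, their metadata and a journal."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.metadata: Dict[str, Dict[str, str]] = {}
        self.journal: List[JournalEntry] = []

    def archive(self, object_id: str, data: bytes, metadata: Dict[str, str]) -> None:
        self.objects[object_id] = data
        self.metadata[object_id] = dict(metadata)
        self.journal.append(JournalEntry("ARCHIVE", object_id))

    def delete(self, object_id: str) -> None:
        # Data and metadata go away together; the journal is never touched here.
        del self.objects[object_id]
        del self.metadata[object_id]
        self.journal.append(JournalEntry("DELETE", object_id))

    def search_journal(self, object_id: str) -> List[JournalEntry]:
        return [entry for entry in self.journal if entry.object_id == object_id]
```

After a delete, a search of the journal for the object identifier still returns both the ARCHIVE and the DELETE records.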
On page 19, the text
states:
"The data
to be archived are arbitrary binary data and, minimally, an associated type that
MUST be either
available as part of a server configuration policy or
explicitly indicated by the client".
On page 27, about ArchiveData the text
states:
"This type is
used to describe data together with optional metadata
and reference information. At least one of the optional
elements
MUST be provided
in order to either provide or identify the data".
Data to be archived
shall always be associated with metadata. Some metadata types should be
mandatory,
in particular the one defining the type of data that is
archived (e.g. using a MIME type). No metadata
should be part of a “server
configuration policy”. As said earlier, metadata may be explicitly provided
by the client or be derived from an archiving policy.
On page 20,
the text states:
Meta
information is associated with archive data and can be included
implicitly, i.e. be a part of a
document, or explicitly, i.e. as a
document attachment
and also:
Meta
information may occur in various forms and may be an integral
part of archive data, e.g.
security attributes in form of digital
signatures.
Metadata should not be confused with security attributes of the data.
Metadata is always associated with the data
and is NOT included in the data
itself. Signatures are part of the data and can be considered in some cases
as attributes of the data. However, a metadata item may indicate that one or more
signatures are present in the data.
On page 20, the text
states:
"An LTA does not
interprete metadata that may
express logical relations among documents in the archive that
is
submitted selectively using
several requests".
The text should rather say that an LTA MUST understand some types of
metadata. For example, a metadata item
may specify a “communication delay”, i.e.
a time period, starting from a date and time held in another metadata item,
after which the public may read the data. The LTA
shall be able to control the release
of information to the
public.
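The check that the LTA would have to perform can be sketched in Python as follows. The function name may_release_to_public and its parameters are hypothetical; the sketch assumes that both the reference date and the delay are metadata that the LTA can parse.

```python
from datetime import datetime, timedelta


def may_release_to_public(reference_time: datetime,
                          communication_delay: timedelta,
                          now: datetime) -> bool:
    """True once the communication delay, counted from the reference time, has elapsed."""
    return now >= reference_time + communication_delay
```

Such a rule cannot be applied if the LTA treats both metadata items as opaque.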
The text states on page 20:
To process such information, the
LTA MUST retrieve
enough
information on the type and purpose of information enclosed,
which may simply be defined with
the use of an apropriate archive
service policy, e.g. archive service for digitally signed
documents.
An “archive service policy” is left undefined. There is not
enough explanation. The example is not sufficient
to understand whether
this concept is valid or not.
The text states on page 21:
In some scenarios, a specific set
of meta information must be
preserved together with archive data, e.g. information
identifying
the document
owner/author, location or time.
This highlights the fact that some
metadata must be understood by the LTA and thus needs to be
standardized.
The text states on page 21:
Evidence information demonstrates
the integrity and existence of
archived data. The LTA
accepts data for the single purpose of
generating or obtaining evidence
information for data submitted by a
client. The evidence information structure is
defined in [RFC4998].
It is unclear how ERS is being used. An ERS record
should be treated as metadata.
The text states on page 21:
In the case where LTA accepts data
only for the purpose of generating
evidence information (without
storage capabilites to avoid, e.g.
confidentiality issues), the
archivation process is limited in time.
It is not explained why the
“archivation process is limited in time”.
The
text states on page 21:
When an LTA performs a renewal of
evidence, … .
It is not
explained what this means in terms of operations and metadata to
handle.
The text states on page 21:
… collisions may exist now or in
the future.
Hash
algorithms shall be chosen so that collisions do not exist “now”. Collisions
might exist in the future.
In this case, the draft should give guidance on
how to handle them in the security considerations section.
The text
states on page 23:
The Metadata type is a list of
open types
What is an “open type”? As said earlier, some
metadata MUST be understood by the LTA.
In order to be able to perform
search operations using metadata, the LTAS MUST understand
the syntax of the
metadata. Then it can apply matching rules. The matching rules may also be part
of the metadata.
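A search based on matching rules might look like the following Python sketch. The rule names ("equals", "prefix", "case-ignore") and the function search are assumptions for illustration; the draft defines no matching rules.

```python
from typing import Callable, Dict, List

# Hypothetical matching rules; each compares a stored value with the value sought.
MATCHING_RULES: Dict[str, Callable[[str, str], bool]] = {
    "equals": lambda stored, wanted: stored == wanted,
    "prefix": lambda stored, wanted: stored.startswith(wanted),
    "case-ignore": lambda stored, wanted: stored.lower() == wanted.lower(),
}


def search(archive: Dict[str, Dict[str, str]], key: str, rule: str, wanted: str) -> List[str]:
    """Return the identifiers of the objects whose metadata item `key` matches `wanted`."""
    match = MATCHING_RULES[rule]
    return sorted(
        object_id
        for object_id, metadata in archive.items()
        if key in metadata and match(metadata[key], wanted)
    )
```

The server can only apply a rule to a metadata item whose syntax it understands, which is why the syntax has to be known to the LTAS.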
The text states on page 23:
No attempt is made to recur to some other existing metedata
specification, e.g., the Dublin
Core.
Some metadata from the Dublin Core should be made
mandatory. One particular type of metadata
should be standardized: one that allows
an “archive policy” to be identified.
The text states on page 23:
Since some global metadata are always associated to data objects
and
necessary for the LTA
service, an LTA MUST provide a complete
description of all metadata it
associates with an archived data
object for operational
purposes.
The
text recognizes that some metadata is necessary for the LTA services. Those that
are absolutely
necessary should be described.
The text states on
page 23:
A client is not required to
understand the semantics of metadata.
How could a client
perform a search if it does not understand the semantics of the metadata?
On page 26, RawData should not be a choice but rather a pair
composed of an OCTET STRING
and metadata indicating the type of the OCTET
STRING.
The text states on page 27:
For
preservation purposes, an LTA must have information on
archive
data type (e.g., signed or
unsigned).
When preservation is supported, this type of metadata is necessary for
the LTA services. It should be standardized.
The text states on page
33:
servicePolicyInfo PolicyInformation,
PolicyInformation is imported
from PKIX1Implicit-2009. However, the semantics of this item have nothing to do
with this document since it relates to a Certification Policy rather than
an Archiving Policy. A reference to
an Archiving Policy should be used
instead and should be optional.
The text states on page 33:
serial SerialNumber OPTIONAL,
nonce Nonce OPTIONAL,
There is no explanation of
when a SerialNumber is needed or useful in addition to a Nonce.
The text
states on page 37:
4.3. OperationResponse
(…)
It references the initial request as well as the data that
had been submitted.
OperationResponse ::= SEQUENCE {
    information RequestInformation,
    status StatusNotice,
    data ArchiveData
}
The ArchiveData should be made OPTIONAL, otherwise 10
Mbytes of data might be returned with each response.
The text states on
page 39:
The client builds a Request with an
information item including service
policy interpreting service
characteristics and service
configuration parameters.
The
client should build a request including an archive policy and metadata in
accordance with the archive policy.
There should be no “service policy
interpreting service characteristics”, nor “service configuration
parameters”.
The text states on page 40 about the DELETE
operation:
After a successful operation, the the server does not
maintain any status information
about the object.
This is incorrect. A log of what happened should
be maintained.
The text states on page 40 about the DELETE
operation:
o The metadata MAY be set to replace the
existing metadata of the
object.
This is quite odd. When the object is
deleted, the metadata is also deleted.
The text states on page 40 about
the DELETE operation:
If the client retries a delete
operation, it may happen that the LTA
has already deleted all traces of
the operation.
This is incorrect. If the client retries a delete operation, it may
happen that the LTA has already deleted the object.
However, it shall have a
trace of the operation.
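The behaviour expected on a retry can be sketched in Python as follows. The function delete_object, the outcome strings and the tuple layout of the journal are hypothetical and serve only to illustrate the point.

```python
from typing import Dict, List, Tuple


def delete_object(objects: Dict[str, bytes],
                  journal: List[Tuple[str, str, str]],
                  object_id: str) -> str:
    """Delete an object and always leave a trace of the attempt in the journal."""
    if object_id in objects:
        del objects[object_id]
        outcome = "deleted"
    elif any(op == "DELETE" and oid == object_id and result == "deleted"
             for op, oid, result in journal):
        # The object is gone, but the journal still knows it was deleted earlier.
        outcome = "already-deleted"
    else:
        outcome = "unknown-object"
    journal.append(("DELETE", object_id, outcome))
    return outcome
```

Thanks to the journal, a client that retries a DELETE can be told that the object was already deleted rather than that it never existed.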
The text states on page 41 about the VERIFY
operation:
This operation allows a client to
verify the authenticity of
information stored in the
archive.
In
practice, the verify operation would be used for different purposes. The primary
one is for checking the quality of the media.
In such a case, transient
information may be returned and that information is not necessarily added to the
metadata.
Thus the last sentence of this section is
wrong:
"The LTA
returns updated metadata of the object".
If there is a wish to maintain
the validity of the data by renewing time-stamp tokens, then a different verb should be
used,
like “MAINTAIN”. The details of the parameters should then be
given.
The text
states on page 41 about the STATUS operation:
A client can request the status of
a data object.
It is unclear what the status of the data object is. What is the
difference between a status and metadata?
In some places within the
document there is also confusion between the status of an operation and the
status
of a data
object. The current text is not sufficient to understand this correctly.
Normally a subset of
the metadata should be returned.
The text on
page 41 about the LISTIDS operation does not include data to be used with
matching rules
and compared against metadata. Simply providing a range of
dates as indicated on page 42 is insufficient.
On page 43, the text
speaks of “restricted
BER encoding”. What is it?
On page 44, the text
states:
“The owner of the S/MIME arc doesn't like to register them in the S/MIME
arc”.
Such a sentence should be
deleted.
On page 48, the text states:
The validity of data
should be checked by periodic execution of
VERIFY operations intended to
ensure data with demonstratable
integrity is available throughout the lifetime of an archived
data
object.
The
sentence should be changed since some systems do this automatically without the
need of such an operation.
On page 48, the text
states:
Depending on the lifetime and the
quality of data, …
What
concept lies behind the “quality of data”?
On page 48,
the text states:
Nevertheless, in case the private
key does become
compromised, an audit trail of all the response generated by the
service SHOULD be kept as a means
to help discriminate between
genuine and false responses.
In
any case, an audit trail SHALL be kept.
End of
comments
Denis