RE: ERS - using hash tree?
Hello Jesus,
Hm, I am not sure I understand your question completely, but let me explain a bit more about ERS. (If you like, we can also talk at the Vancouver meeting next week, or you can give me a short call, as this is maybe a lot easier in person or by phone.)
However:
Typically you will build one hashtree per time interval (let's say per day) for all your documents, independent of how long you want to keep the individual documents.
Let's say that on one day you get 2 groups of 3 documents each and a set of 5 individual documents (d1...d5).
Then you would, for example, arrange them on the lowest level in four groups: group-1, group-2, a generic group-3 (d1, d2, d3) and a generic group-4 (d4, d5, hash(of null or random)).
You get your timestamp and keep the whole ER until the last of the documents can be deleted.
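To make the grouping above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: SHA-256 as the hash algorithm, made-up document contents, a padding leaf computed as the hash of an empty byte string, and a single root node directly above the four groups. The helper names h and node_hash are hypothetical. Sorting the child hashes in binary ascending order before concatenating them follows my reading of the ERS hash tree construction.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash one data object (SHA-256 is an assumption, not mandated here)."""
    return hashlib.sha256(data).digest()

def node_hash(children: list[bytes]) -> bytes:
    """Hash of a tree node: child hashes sorted in binary ascending order,
    concatenated, then hashed."""
    return h(b"".join(sorted(children)))

# Hypothetical intake for one day: two groups of three documents each
# and five individual documents d1..d5.
group_1 = [b"g1-doc-a", b"g1-doc-b", b"g1-doc-c"]
group_2 = [b"g2-doc-a", b"g2-doc-b", b"g2-doc-c"]
singles = [b"d1", b"d2", b"d3", b"d4", b"d5"]

# Padding leaf for the incomplete last group ("hash of null" in this sketch).
padding = h(b"")

lowest_level = [
    [h(d) for d in group_1],                  # group-1
    [h(d) for d in group_2],                  # group-2
    [h(d) for d in singles[:3]],              # generic group-3: d1, d2, d3
    [h(d) for d in singles[3:]] + [padding],  # generic group-4: d4, d5, padding
]

# One node per group, then a single root over the four group nodes.
level_1 = [node_hash(group) for group in lowest_level]
root = node_hash(level_1)

# Only this root value is sent to the timestamp service once per interval.
print(len(level_1), root.hex())
```

The point of the sketch is that the retention period of a document never enters the computation: all documents of the day end up under one root, and only that root is timestamped.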
Let's look at the retention:
Let's say d1 passes its retention time first: then you simply delete d1. You leave the ER intact, as it contains only the hash value of d1 and not the document itself. If an ER is requested for one of the other documents, you send back the specific ER for that document.
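A small sketch of why deleting d1 does not disturb the evidence for d2. It assumes SHA-256, invented document contents and stand-in hashes for the other three groups; the names reduced_er and verify are hypothetical, and the timestamp itself is left out. The structure (a first list holding the document's hash and its siblings, followed by lists of sibling hashes for the higher levels) mirrors my understanding of a reduced hash tree and is not a complete implementation.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def node_hash(children: list[bytes]) -> bytes:
    # Children are sorted in binary ascending order before hashing.
    return h(b"".join(sorted(children)))

# Generic group-3 from the example: d1, d2, d3 (contents are made up).
documents = {
    "d1": b"contents of d1",
    "d2": b"contents of d2",
    "d3": b"contents of d3",
}
leaf_hashes = {name: h(body) for name, body in documents.items()}

# Stand-ins for the node hashes of the other three groups of that day.
other_groups = [h(b"stand-in group-1"), h(b"stand-in group-2"), h(b"stand-in group-4")]

group_3 = node_hash(list(leaf_hashes.values()))
root = node_hash(other_groups + [group_3])   # the value covered by the timestamp

# Evidence for any document of group-3: hashes only, lowest level first.
reduced_er = [list(leaf_hashes.values()), other_groups]

def verify(document: bytes, hash_lists: list[list[bytes]], expected_root: bytes) -> bool:
    """Recompute the root from one document plus the stored hash lists."""
    current = h(document)
    if current not in hash_lists[0]:
        return False
    current = node_hash(hash_lists[0])
    for siblings in hash_lists[1:]:
        current = node_hash(siblings + [current])
    return current == expected_root

# d1 reaches the end of its retention time and is deleted; its hash stays.
del documents["d1"]

print(verify(documents["d2"], reduced_er, root))    # True
print(verify(b"tampered contents", reduced_er, root))  # False
```

The evidence for d2 needs only the hash of d1, so the deleted document is never required again.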
Personal comment: a few systems I know of do not actually care about the possible semantics of data object groups in ERS, and only look at the individual data objects. They simply sort the received data objects in binary ascending order and then bundle them into groups (based on a system-wide defined/customized arity of the hashtree). You give away the advantage of the semantics of the data object group, but you can parse and build the ER structures a lot faster, as you always know the exact arity/group sizes on every level.
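The fixed-arity approach can be sketched as follows. This is an illustration only: SHA-256, the function name build_levels and the generated document contents are assumptions, and a real system would also have to decide how to pad incomplete groups. With 2187 leaves and arity 3 it reproduces the level sizes listed in my earlier mail quoted below.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaf_hashes: list[bytes], arity: int = 3) -> list[list[bytes]]:
    """Build a hash tree bottom-up with a fixed, system-wide arity.

    Leaves are sorted in binary ascending order and then bundled into
    groups of `arity`; every group becomes one node on the next level.
    """
    levels = [sorted(leaf_hashes)]
    while len(levels[-1]) > 1:
        previous = levels[-1]
        levels.append([
            h(b"".join(sorted(previous[i:i + arity])))
            for i in range(0, len(previous), arity)
        ])
    return levels

# Hypothetical day with 2187 (= 3**7) archived documents.
leaves = [h(f"document-{i}".encode()) for i in range(2187)]
levels = build_levels(leaves, arity=3)

sizes = [len(level) for level in levels]
print(sizes)  # [2187, 729, 243, 81, 27, 9, 3, 1]
```

Because the group size is known on every level, the evidence for a single document carries at most 3 hashes per level over 7 levels, instead of all 2187 leaf hashes.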
Hope this explains it a bit more.
Tobias
PS: Can you maybe tell me a bit more about your goal? For example, a verification tool would have to be much more flexible with respect to the arities and node geometry of the ER hashtrees it accepts than a TAS would. If you build a TAS, you could define a fixed arity for your system-generated trees, decide whether to offer users "data object groups", and set other parameters of your system as you like.
PPS: I'll be at the IETF conference in Vancouver next week and back on Dec-10. So if you like, you can give me a short call tomorrow, meet me next week at the LTANS meeting, or call me after Dec-10.
__________________________________________
Tobias Gondrom
Head of Open Text Security Team
Director, Product Security
Open Text
Technopark 2
Werner-von-Siemens-Ring 20
D-85630 Grasbrunn
Phone: +49 (0) 89 4629-1816
Mobile: +49 (0) 173 5942987
Telefax: +49 (0) 89 4629-33-1816
eMail: mailto:tobias.gondrom@xxxxxxxxxxxx
Internet: http://www.opentext.com/
Place of Incorporation / Sitz der Gesellschaft: Open Text GmbH, Werner-von-Siemens-Ring 20, 85630 Grasbrunn, Germany | Phone: +49 (0) 89 4629 0 | Fax: +49 (0) 89 4629 1199 | Register Court / Registergericht: München, Germany | Trade Register Number / HRB: 168364 | VAT ID Number /USt-ID: DE 114 169 819 | Managing Director / Geschäftsführer: John Shackleton, Walter Köhler
> -----Original Message-----
> From: Jesus Maria Mendez Perez [mailto:jesusmmp@xxxxxxxxx]
> Sent: Tuesday, November 27, 2007 4:55 PM
> To: Tobias Gondrom
> Cc: ietf-ltans@xxxxxxx
> Subject: Re: ERS - using hash tree?
>
> On Nov 22, 2007 7:27 PM, Tobias Gondrom <tgondrom@xxxxxxxxxxxx> wrote:
> > Hello Jesus,
> >
> > 1.
> > > Then, what is the point of building the hash tree if, once it is
> > > built, you can't add or remove any data object?
> >
> > Yes, after you have completed the hashtree and protected it with a
> > timestamp you cannot add further nodes. (Comment: Typical ERS systems
> > build one hashtree per day for all documents that were archived during
> > that day and sign the hashtree every evening with a timestamp.)
>
> Then suppose I want to preserve some documents on the same day, for
> example 3, and to archive them for different periods of time: the
> first one (d1) until 01-12-07, the second (d2) until 15-12-07 and the
> last one (d3) until 01-01-08, and I send each one independently to
> the TAS server.
>
> How are they managed?
> What evidence does the TAS server archive?
> Does the TAS server have to build 2 evidence records? (First, one to
> preserve each document separately (Fig 1))
>
>      ------             ------             ------
>      | d1 |             | d2 |             | d3 |
>      ------             ------             ------
>         |                  |                  |
>         |                  |                  |
>      ------             ------             ------
>      | h1 |             | h2 |             | h3 |
>      ------             ------             ------
>         |                  |                  |
> ----------------   ----------------   ----------------
> | TS(01-12-07) |   | TS(15-12-07) |   | TS(01-01-08) |
> ----------------   ----------------   ----------------
>                         Fig 1
>
> and after that a second evidence record to include them in the evening
> hash-tree (Fig 2)?
> Suppose another user sends the document "anotherd4" during the same
> day, so that only d1, d2, d3 and anotherd4 were sent that day.
>
> ------    ------    ------    -------------
> | d1 |    | d2 |    | d3 |    | anotherd4 |
> ------    ------    ------    -------------
>    |         |         |            |
>    |         |         |            |
> ------    ------    ------       ------
> | h1 |    | h2 |    | h3 |       | h4 |
> ------    ------    ------       ------
>    |         |         |            |
>         |                    |
>      -------              -------
>      | h12 |              | h34 |
>      -------              -------
>         |                    |
>                   |
>              -----------
>              |  h1234  |
>              -----------
>                   |
>              -----------
>              | TSh1234 |
>              -----------
>                 Fig 2
>
>
> And how long does the evening's timestamp last?
>
>
> Another doubt: can I then send a group of documents for preservation
> and have it preserved independently of all the others (Fig 3), without
> including them in the evening hash-tree?
> Can I decide which groups of documents form the evening hash-tree? Can
> the TAS server archive "little" hash-trees, like folders, where every
> folder contains the hash tree of a group of documents?
>
> ------    ------      |
> | d1 |    | d2 |      |
> ------    ------      |
>    |         |        |
>    |         |        |    And, on this side,
>    |         |        |    the evening hash tree
> ------    ------      |
> | h1 |    | h2 |      |
> ------    ------      |
>    |         |        |
>         |             |
>      -------          |
>      | h12 |          |
>      -------          |
>         |             |
>      --------         |
>      | TS12 |         |
>      --------         |
>         Fig 3
>
>
> I have tried to explain this as best I can.
>
> Thanks Tobias
>
>
> >
> >
> > 2. Arity (number of branches at each node) of the hashtree:
> > Actually the arity of the hash tree is not restricted by the ERS
> > standard and can be decided by the implementer.
> > Several performance and efficiency tests I have seen have shown that
> > typically a ternary (or binary) hashtree offers the highest efficiency
> > and performance when handling larger numbers of documents.
> >
> > E.g. hashtree for about 2000 documents
> > With ternary (arity 3):
> > 2187 nodes on the lowest level
> > 729 nodes on 2nd level
> > 243 nodes on 3rd level
> > 81 nodes on 4th level
> > 27 nodes on 5th level
> > 9 nodes on 6th level
> > 3 nodes on 7th level
> > And the root hash, covered by the timestamp, at the top.
> >
> > That way the reduced ER for one specific document is a lot smaller, as
> > it does not need to contain all 2000 nodes of all neighbours, but far
> > fewer.
> >
> > Plus other operational parameters have also been more efficient.
> > (Typically I would recommend a ternary tree.)
> >
> > But if you like you can also use a tree with arity n, e.g. n=1000 or
> > n=1,000,000.
> >
> > Hope this helps, Tobias
> >
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: owner-ietf-ltans@xxxxxxxxxxxx [mailto:owner-ietf-ltans@xxxxxxxxxxxx]
> > > On Behalf Of Jesus Maria Mendez Perez
> > > Sent: Thursday, November 22, 2007 6:48 PM
> > > To: ietf-ltans@xxxxxxx
> > > Subject: ERS - using hash tree?
> > >
> > >
> > > Hello,
> > >
> > > I'm Jesus and I don't understand some aspects of using the
> > > hash-tree to preserve data objects.
> > >
> > > According to ERS, if you have 4 data objects it works as
> > > follows:
> > >
> > > ------    ------    ------    ------
> > > | d1 |    | d2 |    | d3 |    | d4 |
> > > ------    ------    ------    ------
> > >    |         |         |         |
> > >    |         |         |         |
> > > ------    ------    ------    ------
> > > | h1 |    | h2 |    | h3 |    | h4 |
> > > ------    ------    ------    ------
> > >    |         |         |         |
> > >         |                   |
> > >      -------             -------
> > >      | h12 |             | h34 |
> > >      -------             -------
> > >         |                   |
> > >                   |
> > >              -----------
> > >              |  h1234  |
> > >              -----------
> > >                   |
> > >              -----------
> > >              | TSh1234 |
> > >              -----------
> > >
> > > where "d" are the data objects, "h" are the hashes of the data
> > > objects and "TS" is the timestamp.
> > >
> > > In this case you'll have to compute 7 hashes plus the additional
> > > timestamp.
> > >
> > > Then, what is the point of building the hash tree if, once it is
> > > built, you can't add or remove any data object?
> > >
> > > You could simply concatenate the 4 hashes of the 4 data objects and
> > > compute the final hash.
> > > Then we would have to compute 5 hashes plus the additional timestamp,
> > > versus the 7 hashes we had before.
> > >
> > > ------    ------    ------    ------
> > > | d1 |    | d2 |    | d3 |    | d4 |
> > > ------    ------    ------    ------
> > >    |         |         |         |
> > >    |         |         |         |
> > > ------    ------    ------    ------
> > > | h1 |  + | h2 |  + | h3 |  + | h4 |
> > > ------    ------    ------    ------
> > >    |         |         |         |
> > >    |         |         |         |
> > > ------------------------------------
> > > |              h1234               |
> > > ------------------------------------
> > >                  |
> > > ------------------------------------
> > > |             TSh1234              |
> > > ------------------------------------
> > >
> > > I don't know if we could do it this last way, or if I don't understand
> > > the Generation and Verification processing at all.
> > >
> > > Thanks.
> >
> >