Re: Examples Are Somewhat Confusing



In your scenario #1, if the data is in non-paged-out memory on the
server *and* the parsing is being done on the server, then you are
right.  Of course if the server owns this data, it could choose to
re-order the data as it reads it into memory to make future searches
faster.
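
To make the scenario #1 trade-off concrete, here is a minimal sketch in
Python.  It assumes a simplified "NAME:value" line format (not the real
application/directory grammar), and parse_until_found and the sample
property lines are hypothetical names made up for illustration.  It only
counts how many lines a linear scan has to touch before it has every
property it needs.

```python
def parse_until_found(lines, wanted):
    """Scan 'NAME:value' lines, stopping once every wanted property is seen.

    Returns (found_properties, number_of_lines_parsed).
    """
    remaining = set(wanted)
    found = {}
    parsed = 0
    for line in lines:
        parsed += 1
        name, _, value = line.partition(":")
        if name in remaining:
            found[name] = value
            remaining.discard(name)
            if not remaining:
                break  # early exit: nothing left to look for
    return found, parsed


# Hypothetical card with the searched-for properties first (fixed order) ...
fixed_order = ["FN:Pat Example", "EMAIL:pat@example.com", "TEL:555-0100",
               "ORG:Example Corp", "TITLE:Engineer", "NOTE:none"]
# ... and the same properties in an arbitrary order.
any_order = ["NOTE:none", "TEL:555-0100", "FN:Pat Example",
             "ORG:Example Corp", "TITLE:Engineer", "EMAIL:pat@example.com"]

_, parsed_fixed = parse_until_found(fixed_order, ["FN", "EMAIL"])
_, parsed_any = parse_until_found(any_order, ["FN", "EMAIL"])
print(parsed_fixed, parsed_any)  # prints "2 6"
```

The saving depends entirely on where the wanted properties happen to
fall, which is also why a server that owns the data could get the same
effect by re-ordering on load rather than requiring it on the wire.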

For scenario #2, how common are networks with small packets?  Will this
become more or less common as we move forward?
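
For reference, the packet counts in scenario #2 follow from simple
arithmetic.  A quick check in Python, assuming the roughly 500-byte vCard
average mentioned earlier in the thread, 100-byte packets, and a reader
that stops halfway through; packets_needed is a hypothetical helper, and
per-packet header overhead is ignored:

```python
import math


def packets_needed(bytes_read, packet_size):
    """Number of fixed-size packets required to deliver bytes_read bytes."""
    return math.ceil(bytes_read / packet_size)


full_card = packets_needed(500, 100)  # read the whole vCard
half_card = packets_needed(250, 100)  # stop halfway through
print(full_card, half_card)  # prints "5 3"
```

With larger packets the difference shrinks: at 512 bytes or more, the
whole card fits in one packet either way.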

The reason I see to keep the ordering more open is that doing otherwise
pretty much forces the writing code to buffer everything before writing
it out.  This seems restrictive, with only minor (or at least
hard-to-quantify) benefits to justify it.  It also hampers hand-coded
data, which a text-based format otherwise allows (e.g., hand-coded HTML
is quite common).
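
A rough sketch of the buffering point, in Python.  REQUIRED_ORDER, the
two writer functions, and the sample properties are all hypothetical and
only illustrate the shape of the problem; they are not taken from any
draft.  A writer with no ordering rule can emit each property as it is
produced, while a writer bound to a fixed order has to hold the whole
object before it can write the first line.

```python
import io

# Hypothetical required order, for illustration only.
REQUIRED_ORDER = ["FN", "EMAIL", "TEL", "NOTE"]


def write_streaming(props, out):
    """Emit each property as soon as it is produced; nothing is buffered."""
    for name, value in props:
        out.write(f"{name}:{value}\n")


def write_fixed_order(props, out):
    """Collect every property first, then emit them in REQUIRED_ORDER.

    Returns the number of properties that had to be held in memory.
    """
    buffered = list(props)  # the whole object is buffered before writing
    rank = {name: i for i, name in enumerate(REQUIRED_ORDER)}
    # Unknown property names sort after the known ones, in arrival order.
    buffered.sort(key=lambda p: rank.get(p[0], len(rank)))
    for name, value in buffered:
        out.write(f"{name}:{value}\n")
    return len(buffered)


def produce():
    """Hypothetical source that yields properties in whatever order it has."""
    yield ("NOTE", "none")
    yield ("FN", "Pat Example")
    yield ("TEL", "555-0100")


streamed = io.StringIO()
write_streaming(produce(), streamed)

ordered = io.StringIO()
held = write_fixed_order(produce(), ordered)
```

For three properties the buffer is trivial; the cost shows up when the
source is large or slow, or when the "writer" is a person with a text
editor.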

-- Mike Weston

Alec Dun wrote:
> 
> Mike,
> 
> You're right about the "read-from-disk" case; most disks have at least
> 512-byte sectors.  There are a couple of other scenarios that are
> significant though:
> 
> 1.  If the data is in-memory on a server, then my performance is
> directly related to the amount of parsing I'm doing, so stopping
> halfway through each vCard cuts my parse time in half.
> 
> 2.  Some networks (particularly slow ones) have small packets, so if I
> am reading from a stream with a 100-byte packet size and I can stop
> reading early, then I may only use 3 packets.  My overall network
> traffic will be smaller in this case.
> 
> I guess one of the things I'm trying to understand is why folks are
> opposed to a fixed order, which should not hurt anything and has the
> potential to improve performance.  I mean, nobody is going to hand-code
> an application/directory body part, right?   ...or do people do that?
> 
> I'm trying to figure out the downside, help me out here...  Give me an
> example of where I'd want to have a random order and it makes life
> easier for me.
> 
> Thanks,
> Alec.
> 
> >----------
> >From:  Mike Weston[SMTP:mweston@netscape.com]
> >Sent:  Tuesday, August 27, 1996 3:40 PM
> >To:    Alec Dun
> >Cc:    'Frank Dawson'; 'ietf-asid'; 'ietf-calendar'
> >Subject:       Re: Examples Are Somewhat Confusing
> >
> >If the 2.5MB you are parsing is coming from a disk, and is alternately
> >interspersed between the 2.5MB that you are potentially skipping, the
> >reduced parsing time will be dwarfed by the time to read the data from
> >the disk.  If the data is coming over a network, the situation is
> >probably similar.  It seems like we're talking about single-digit
> >percentage performance issues, at a non-trivial cost in terms of code
> >for some simple applications.
> >
> >-- Mike Weston
> >
> >Alec Dun (Exchange) wrote:
> >>
> >> Hi Frank,
> >>
> >> The vCard may be 500 bytes, but I may have 10,000 of them.  That's 5 MB
> >> of data I have to look thru (if I'm doing a linear search), and if, on
> >> average, I can stop searching 1/2 way thru a vCard because I found all
> >> the properties that I need to do the comparison, I can avoid parsing
> >> 2.5 MB of data.
> >>
> >> What specifically is the down-side here that worries you?
> >>
> >> Thanks,
> >> Alec.
> >>
> >> >----------
> >> >From:  Frank Dawson[SMTP:fdawson@raleigh.ibm.com]
> >> >Sent:  Monday, August 26, 1996 2:11 PM
> >> >To:    Alec Dun (Exchange)
> >> >Cc:    'ietf-asid'; 'ietf-calendar'; 'Frank Dawson'
> >> >Subject:       RE: Examples Are Somewhat Confusing
> >> >
> >> >Alec:
> >> >
> >> >I appreciate your answering my questions. Thanks.
> >> >
> >> >The required sequencing of the content information beyond boundary
> >> >sentinels seems somewhat harsh. You have made a point that a search
> >> >engine can be optimized if it is predictable that the vCard or other
> >> >application/directory content has such a sequencing order.
> >> >
> >> >It just seems an unnecessarily restrictive request. Saying the header
> >> >must appear before the body, and in the body, the body header must
> >> >appear before the body content is an appropriate level of required
> >> >ordering. But beyond that, it might be a bit restrictive.
> >> >
> >> >The size of these things is not really very large. For a vCard, we see
> >> >less than 500 bytes on average. That is really small. This ordering is
> >> >not going to improve efficiency to the point of paying for the
> >> >restriction on generating these content portions.
> >> >
> >> >-- Frank Dawson