time zone database interface and formatting
I have been in communication with Doug Royer about the VTIMEZONE postings on the new list, and one thing led to another. Doug suggested that I post my last email to this list for discussion, which I have done with a little editing. I have posted information to this list before and have monitored the development discussions, but I have not taken part in them. I don't know anything except the data at this point, so please excuse my ignorance.
I have thought about your emails, the current state of VTIMEZONE as far as I understand it from you, and the current state of the Olson database. I have suggestions that may involve changes too radical for the current data and code, and would like to run a few things by you for your consideration. Possibly you can help me in posting something that will be helpful.
The primary changes I would suggest are to make a clear distinction between information, symbols, machine readable text, and human readable text in the data. Here are a few notes about specifics, using some formatting examples from VTIMEZONE as discussion points.
> TZNAME:US/Los_Angles
This is a classic example of trying to make something both machine readable and human readable, and failing on both counts in the long run. The human is taken into consideration here more than the machine, as this symbol is easily human readable. However, in making it human readable, the time zone is tied to a particular point location rather than a region. This has caused problems in the past not only with Olson's list but with every other group that has put together time zone data. The problem occurs when the chosen city decides to do something different and the rest of the region it was representing stays the same. The historical records then have a discontinuity unless some method is used to correct either the records or the newly chosen name for the region. This has occurred, and will occur again, in many of the cities currently used in the Olson database. If you look at the political and economic situation in some of the cities chosen to represent an area, it is obvious that they will eventually separate from the surrounding region in their timekeeping method. IATA gets around this problem, which I believe they also experienced at one point in their development of Appendix F, by using a country abbreviation followed by a number iterator, and sometimes followed by a letter for a sub-region. This has worked fairly well, and has not been a cause for confusion with the airlines thus far.
Using America/Pangnirtung as a designation, for example, mixes data with symbol representation, which is the equivalent of hard-coding numbers in C code instead of using header definitions. America/Pangnirtung currently uses two time zones within the city, whereas the surrounding region it would supposedly represent does not. Because of this, another designator will need to be chosen to represent the surrounding area.
As a solution I would propose something more like the IATA method, possibly using the ISO 3166 codes along with iterators if a move to consolidating time zones was desired.
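To make the idea concrete, here is a minimal sketch in Python of what such an identifier could look like. The layout (a two-letter ISO 3166-1 code, a number, and an optional sub-region letter) and the names `RegionId`, `parse_region_id`, and `split_region` are assumptions made for illustration. They are not the actual IATA Appendix F format, and the identifiers shown are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical layout: ISO 3166-1 alpha-2 country code, a region number,
# and an optional sub-region letter, for example "CA3" or "CA3A".
_REGION_ID = re.compile(r"([A-Z]{2})([1-9][0-9]*)([A-Z]?)")


class RegionId(NamedTuple):
    country: str              # ISO 3166-1 alpha-2 code
    number: int               # iterator within the country
    subregion: Optional[str]  # letter for a sub-region, if any


def parse_region_id(text: str) -> RegionId:
    """Parse an identifier such as 'CA3A'; reject anything else."""
    match = _REGION_ID.fullmatch(text)
    if match is None:
        raise ValueError(f"not a region identifier: {text!r}")
    country, number, sub = match.groups()
    return RegionId(country, int(number), sub or None)


def split_region(parent: RegionId, letter: str) -> RegionId:
    """Name a sub-region that has diverged from its parent region."""
    if parent.subregion is not None:
        raise ValueError("parent is already a sub-region")
    return parent._replace(subregion=letter)
```

The point of the sketch is that the identifier carries no city name, so a city changing its timekeeping only requires a new sub-region such as `split_region(parse_region_id("CA3"), "A")`, while the parent identifier and its history stay untouched.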
> BEGIN:DAYLIGHT
Of the 30 methods currently used that we might call daylight savings time, many are not looked on as daylight savings by the local government and populace. The changes in time zone that the country makes are based on energy, political, economic, and local conditions other than extending the daylight hours.
Argentina is a recent example. Their change to daylight savings time without changing the time was a move based on local conditions, and not what we would consider a daylight savings time mechanism, even though it will have daylight savings effects. Georgia is another good example, as they base the change of their time zone primarily on how many nuclear reactors they have up and running. This allows them to be a net importer or exporter of electrical power and in control of how much their populace spends on electricity by manipulating which time zone they are in. Lebanon is a third example, as in most cases they do not base their time change on daylight savings at all (1999 was a perfect example of this), but on neighboring political climates and negotiations. I don't have an alternative name, but I think this is important to address so that users of the database do not make incorrect assumptions. If daylight extension is the reason for the change, with the intention clearly stated in an edict or law that the time will be changed back, the term daylight savings is relevant. Otherwise a second, different term should be used where applicable.
This is an area that I have spent hours on trying to define for each country that changes their time zone at some point during the year. All I can say at this point is that there are these categories:
1) Countries for which a methodology can be defined, as it exists in national policy, edicts, agreements, etc. (examples: North America, EC, Cuba, Syria, New Zealand, Chile, Paraguay, Namibia, Thule Airbase Greenland)
2) Countries in which someone decides each year with a proclamation or edict; it may or may not be the same each year, and may or may not be implemented each year (examples: the states of Australia, the states of Brazil, Israel)
3) Countries in which someone decides each year based on local conditions not necessarily directly tied to extending the daylight hours as the primary objective (examples: Georgia, Lebanon, Armenia, Fiji, Tonga)
4) Countries that have mixed daylight savings and non-daylight savings time within geographic boundaries that sometimes overlap (examples: Canada (parts of the territory of Nunavut), Falkland Islands)
5) Countries that decide to switch time zones for other reasons, which overlaps with 3): (example: Lebanon)
As yet I have not come up with a satisfactory method of dealing with this, although I have tried several. I think that something should be built into the data that recognizes these types of differences if any prediction of future time zone changes is to be included. At the very least, what is not predictable can be defined, and the predictable can be defined until the methodologies in 1) are changed by the country.
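One way the categories above might be carried in the data is sketched below in Python. The names `RuleBasis`, `RegionRules`, and `may_project_future` are hypothetical, as is the region identifier in the usage note. Treating only category 1 as safe to project forward is an assumption drawn from the paragraph above, not a settled rule.

```python
from dataclasses import dataclass
from enum import Enum


class RuleBasis(Enum):
    """How a region arrives at its time changes (categories 1 to 5 above)."""
    FIXED_POLICY = 1             # defined in national policy, edict, agreement
    ANNUAL_DECREE = 2            # decided each year by proclamation
    LOCAL_CONDITIONS = 3         # decided each year, not primarily for daylight
    MIXED_WITHIN_BOUNDARIES = 4  # mixed practice inside overlapping boundaries
    OTHER_ZONE_SWITCH = 5        # switches zones for other reasons


@dataclass(frozen=True)
class RegionRules:
    region_id: str    # hypothetical identifier for the region
    basis: RuleBasis


def may_project_future(rules: RegionRules) -> bool:
    # Assumption: only a stated methodology (category 1) is treated as
    # predictable, and only until the country changes that methodology.
    return rules.basis is RuleBasis.FIXED_POLICY
```

With this in place, a consumer of the data could refuse to extrapolate for a record such as `RegionRules("XX1", RuleBasis.ANNUAL_DECREE)` instead of silently assuming last year's dates.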
> DTSTART:YYYYMMDDTHHMMSS
The majority of the world's population, including what will be the largest computer user base in the world in the near future, does not use the Gregorian calendar. Because of this it makes much more sense to use the Julian day count, or Modified Julian day count, as NASA and most other scientific agencies dealing in important time keeping matters do. This embedding of the Gregorian calendar into systems is a much, much larger problem than the Y2K problem ever was. It will cause problems in the future that will make the Y2K problem look like a pretty insignificant bug. I think that it is very important to recognize this now and correct the problem before it is too late. It is the responsibility of the people on this list to do this. No one else is addressing this in a way that affects as many people as this source of information does.
A simple fix for now would be to include a field specifying the calendar method used before the field YYYYMMDDTHHMMSS or before any reference to a year. A better fix would be to reference all past dates as the Julian day count, and specify which calendar system is being used to predict future dates. Calendar conversion algorithms are available from many sources. The book "Calendrical Calculations" is thorough, although I find it difficult to follow because of the author's style. "Standard C Date/Time Library" (available from Amazon.com) is much more easily followed, and appears very thorough, although I have not had a chance to fully go over it for its usefulness in all the calendar systems, as I just got it. There are several other good references, including the "Explanatory Supplement to the Astronomical Almanac". Whatever is used, conversions can be made to the Julian day count and back with minimal programming effort, giving a much more universally useful database.
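As an illustration of how little code the conversion needs, here is a sketch in Python of the widely published integer-arithmetic conversion between a Gregorian calendar date and the Julian Day Number. It is not taken from any of the books named above, the function names are my own, and it assumes the proleptic Gregorian calendar with astronomical year numbering.

```python
def gregorian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a (proleptic) Gregorian calendar date."""
    a = (14 - month) // 12      # 1 for January and February, else 0
    y = year + 4800 - a         # shift so the counted year starts in March
    m = month + 12 * a - 3      # March = 0 ... February = 11
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def jdn_to_gregorian(jdn: int) -> tuple:
    """Inverse of gregorian_to_jdn; returns (year, month, day)."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097           # 400-year cycles
    c = a - (146097 * b) // 4
    d = (4 * c + 3) // 1461             # years within the cycle
    e = c - (1461 * d) // 4             # day of the March-based year
    m = (5 * e + 2) // 153              # March-based month index
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return (year, month, day)


def jdn_to_mjd(jdn: int) -> int:
    """Modified Julian Day of the civil date at 0h (MJD = JD - 2400000.5)."""
    return jdn - 2400001
```

A conversion to any other calendar system then only needs its own pair of functions to and from the day count, and the stored data never has to change.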
> BEGIN:STANDARD
This is a misnomer in some cases due to information stated above when addressing daylight savings time. It probably, however, can be understood in most cases, Argentina being a recent exception. There are other exceptions that will come up, so it may be of some use to define what is meant by Standard.
>
> Good point --- PLEASE PLEASE contribute to the ietf-calendar@xxxxxxx
> list with your suggestions. It will proceed without them if you
> don't!
I think possibly an interface to the data (the equivalent of a header file) would be the main thing I suggest, rather than having the data mixed with misleading symbols. In addition, with all respect to the former British Empire, I would suggest using positive offsets from the west side of the International Date Line to avoid the use of negative offsets. This eliminates any date confusion in calculations and notations. Using the west side of the International Date Line gives a clear and unchanging representation of the last hour of the present time at any time, and is free of the confusion that can arise when London uses daylight savings time, which, believe it or not, happens to a lot of people. This is especially important when writing code or recording data that needs to be parsed by humans as well as read by machines.
>
> -MOST- Applications will care about their own time zone and perhaps
> that of their offices in other countries. So I don't think that
> a typical application will download the entire database. And
> it will often never care about the past.
The past is about the only thing we can pin down with certainty, unfortunate as this is. IATA and the OAG have had a very difficult time scheduling three years in advance for all airports. It has caused numerous headaches with flight scheduling. I think a few cues should be taken from their experience when proposing something that is dependable. The best I have been able to come up with, working on this full time, is a prediction six months in advance, with small updates every month that don't affect the majority of the world. Whatever is developed, it has to be dynamic and presented as such to its users. Otherwise it will lose credibility very fast, and with no one willing to take responsibility as IATA does for the airlines, it will not be used.
I currently print a 30" by 40" map of the time zones of the world every month or so. It takes me 3 days to enter all the changes since the last printing. I have not yet printed one that was not out of date the day it was printed. It was a little frustrating at first, but finally it helped me to develop a system that can be used, and is dynamic.
The method I use, which I refer to as Global Time Systems, is currently ported as a library to the Mac OS and ready to port to Windows. I have also ported it to an obscure 8-bit microprocessor, and have it running successfully. It takes up about 115K on the 8-bit micro; the Mac version is 168K; the size of the Windows library is still to be determined, but I expect it to be about the same as the Mac version. It does not have any interface built yet except in my development platform, but will have one soon, as it is being ported to a clock program now sold for the Mac and being ported to Windows.
Any arrangement of symbols can be used to access the data. These symbols are what you are calling the time zone names. I believe the choice of symbols is as important as the understanding of what the symbols are representing. If the choice of symbols clouds this understanding, another more generic symbol should be selected. The data it is representing can then be of any depth; it can be four dimensional, three dimensional, or list based.
There is always the concern about legacy code and data when making changes. I think this can be addressed with the proper interface or header file equivalent. What should be our concern is that the code and data we generate and use is forward compatible, as this is something that we can do something about now with a little thought and planning, but is much more difficult to correct in the future without affecting a large number of programs that have come to rely on what has been written.
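A rough sketch in Python of what such a header-file equivalent could look like follows. The names `ZoneRecord` and `ZoneTable`, the identifiers, and the offsets are all hypothetical and invented for illustration. This is not how Global Time Systems or the Olson code is actually organized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZoneRecord:
    region_id: str            # stable, generic identifier
    utc_offset_minutes: int   # illustrative payload only


class ZoneTable:
    """Keeps the data under stable ids and legacy symbols in an alias layer."""

    def __init__(self) -> None:
        self._records = {}   # region_id -> ZoneRecord
        self._aliases = {}   # legacy symbol -> region_id

    def add_record(self, record: ZoneRecord) -> None:
        self._records[record.region_id] = record

    def set_alias(self, symbol: str, region_id: str) -> None:
        # An alias may only point at data that exists.
        if region_id not in self._records:
            raise KeyError(region_id)
        self._aliases[symbol] = region_id

    def lookup(self, symbol: str) -> ZoneRecord:
        # Accept either a legacy symbol or a region id directly.
        region_id = self._aliases.get(symbol, symbol)
        return self._records[region_id]


# Hypothetical data: one region and a city that later diverges from it.
table = ZoneTable()
table.add_record(ZoneRecord("XX1", 120))
table.add_record(ZoneRecord("XX1A", 180))
table.set_alias("Example/City", "XX1")
```

If the city later diverges from its region, `table.set_alias("Example/City", "XX1A")` repoints the legacy symbol, while programs that refer to `XX1` keep seeing the same region and its history.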
I hope that this is not too big a chunk of information to be useful to this list. I posted this at the encouragement of Doug, and because I would like to see something develop that will be useful in the long run. In addition, I would like the methodology I have developed to interface with what is being developed here, as several large users have said that they will use what I have developed for embedded systems.
Sincerely,
Rives McDow