GeneaBloggers

Tuesday, 4 August 2015

Synchronised Dates



In A Calendar for Your Date — Part I, I mentioned the changeover from the Julian calendar to the Gregorian calendar. People who have read about this may have heard terms such as dual dates, double dates, double years, new/old style, but do you really know what they mean? Was the handling of this changeover a special case — one that needs its own treatment in our data — or was it really an example of a generalised case? It’s time to take the lid off.
Figure 1 - Synchronised swimming in a sea of dates.


The Gregorian calendar was introduced by papal bull on Thursday 4 Oct 1582, Julian (followed by Friday 15 Oct 1582, Gregorian), but a couple of factors prevented its automatic adoption everywhere. One of these was the fact that it was devised by the Roman Catholic Church, and so in a time of especially fraught church tensions other churches saw it as some type of power-play and resisted. For instance, although other Catholic countries in Europe adopted it either immediately (e.g. Spain), or later that year (e.g. France), or the following year (e.g. Netherlands), Britain, including its colonies — some to later become part of the US — didn’t change until Wednesday 2 Sep 1752, Julian (followed by Thursday 14 Sep 1752, Gregorian; the difference by that time being 11 days rather than 10). Another factor was that many people considered that days were being stolen from them — between 10 and 13 days, depending on the date that their changeover occurred. Birthdays and anniversaries changed, events changed, and that shortened year (282 days for Britain) created difficulties for handling taxes, deadlines, and interest.[1] So why were days taken away?

To understand this, it is important to know the reasons for the calendar change. The Julian year was too long (365.25 days on average) and that meant that the Easter date was drifting away from the traditional date as defined by the early church. There were therefore two main parts to the calendar change: the solar part whereby new leap-year rules corrected the average year to 365.2425 days (much closer to the measured average of 365.24219 days), and the lunar part which corrected the cumulative error of 13 centuries of drift by removing 10 days.
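The two average year lengths quoted above follow directly from the leap-year rules. The following is a minimal sketch that counts leap years over one full cycle of each calendar; the function names are illustrative only and are not taken from any particular library.

```python
from fractions import Fraction


def is_julian_leap(year: int) -> bool:
    # Julian rule: every fourth year is a leap year.
    return year % 4 == 0


def is_gregorian_leap(year: int) -> bool:
    # Gregorian rule: century years are leap years only if divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def mean_year_length(is_leap, cycle_years: int) -> Fraction:
    # Average number of days per year over one complete leap-year cycle.
    leap_count = sum(1 for year in range(1, cycle_years + 1) if is_leap(year))
    return Fraction(365 * cycle_years + leap_count, cycle_years)


julian_mean = mean_year_length(is_julian_leap, 4)          # 4-year cycle
gregorian_mean = mean_year_length(is_gregorian_leap, 400)  # 400-year cycle

print(float(julian_mean))     # 365.25
print(float(gregorian_mean))  # 365.2425
```

Exact fractions are used so that the comparison with the quoted figures does not depend on floating-point rounding.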

To complicate things slightly more, the British year was also deemed to start on the 1st January rather than the 25th March (Lady Day) that it had done since the 12th Century (except in Scotland where they had already changed in 1600). The UK Calendar (New Style) Act 1750, which introduced the calendar changeover, justified the adjustment of the civil year by “Whereas the legal supputation of the year of our Lord in England, according to which the year beginneth on the twenty-fifth day of March, hath been found by experience to be attended with divers inconveniences, not only as it differs from the usage of neighbouring nations, but also from the legal method of computation in Scotland, and from the common usage throughout the whole kingdom, …”.[2] Note that 1st January had long been celebrated as the start of the “historical year” (New Year’s Day) but the Gregorian calendar was essentially a civil calendar, and this was part of the reason why the church could not mandate it. The terms Old Style (O.S.) and New Style (N.S.) are often used to clarify the ambiguities of dates falling between 1st January and 24th March. Old Style meant that something was dated according to the old civil year and so must be adjusted to align with the New Style civil year, or to the historical year.[3]

The fact that the civil and historical years were already different in Britain, even before the calendar changeover (except in Scotland), meant that there was already a means of representing a year combination using a notation informally called double years. For instance, 3rd March 1733/4 clarified that this was March of the civil year 1733 and of the historical year 1734 (i.e. the month before April 1734). Following the Julian-to-Gregorian calendar change, the difference of 11 days meant that it was not just the year that was different; a date such as 10/21 January 1705/6 explicitly represented both the Julian and Gregorian dates, including their corresponding years.[4] This was another form of a dual date, or double date, although the latter term is no longer preferred due to the ambiguity with the social occasion of the same name.
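The double-year rule can be sketched in a few lines. This is only an illustration, assuming the English convention of a civil year beginning on 25th March and an input already expressed in the historical (1st January) year; the function name `double_year` is hypothetical.

```python
def double_year(day: int, month: int, historical_year: int) -> str:
    """Year label for a date under the old English civil year (assumed 25 March start)."""
    if (month, day) >= (3, 25):
        # From Lady Day onwards the civil and historical years agree.
        return str(historical_year)

    old = str(historical_year - 1)  # civil year still running
    new = str(historical_year)      # historical year already begun

    # Keep only the trailing digits of the later year that differ.
    i = 0
    while i < len(new) - 1 and i < len(old) and new[i] == old[i]:
        i += 1
    suffix = new[i:]
    if len(suffix) > 2:
        # Century boundaries are written out in full, e.g. 1699/1700.
        suffix = new
    return f"{old}/{suffix}"


print(double_year(3, 3, 1734))   # 1733/4
print(double_year(25, 3, 1734))  # 1734
```

Going the other way, from a written double year back to a single interpretation, is the harder problem, since it depends on knowing which convention the writer was following.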

Even today, the start of the old civil year affects everyday life in Britain since the start of the personal tax year remained at 25th March (O.S.), or 5th April (N.S.), until 1800 when it moved to 6th April. Britain also retains a double-year notation to indicate that a tax year spans two year numbers, e.g. 2011/12.

Although Easter should fall on the Sunday following the full moon that follows the northern spring equinox, both the full moon and the equinox are now determined by calendar rules rather than direct observation. This means that the date calculation (Computus) now differs in different calendars, in different localities, and in different religions. Not all holy festival dates were moved during the changeover, though; the date of Christmas was already 25th December in the Julian calendar. This was subsequently retained by all Western churches, and by some Eastern ones. At present, 25th December in the Julian calendar falls on 7th January in the Gregorian calendar, and some churches do continue to celebrate on that date.
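The day offsets mentioned in this article (10 days in 1582, 11 in 1752, 13 today) can be checked with a small conversion routine. The sketch below goes via the Julian Day Number using a well-known integer formula, and then relies on Python's `datetime.date`, which implements the proleptic Gregorian calendar. It assumes AD dates given in the historical (1st January) year; the function names are my own.

```python
from datetime import date

# Difference between a Julian Day Number and Python's proleptic Gregorian ordinal.
_ORDINAL_OFFSET = 1721425


def julian_to_jdn(year: int, month: int, day: int) -> int:
    # Integer formula for the Julian Day Number of a Julian-calendar date.
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    # The same physical day, expressed in the Gregorian calendar.
    return date.fromordinal(julian_to_jdn(year, month, day) - _ORDINAL_OFFSET)


# Last Julian day before the papal changeover; the next day was 15 Oct 1582.
print(julian_to_gregorian(1582, 10, 4))   # 1582-10-14
# Last Julian day in Britain; the next day was 14 Sep 1752.
print(julian_to_gregorian(1752, 9, 2))    # 1752-09-13
# Julian Christmas in the present day.
print(julian_to_gregorian(2015, 12, 25))  # 2016-01-07
```

A calculation like this is only safe once the source date is known to be Julian and its year start has been established, which is precisely the interpretation step that cannot be automated.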

A large part of the confusion must be attributed to the fact that the same calendar era was used — meaning that year numbering was designed to run, as consecutively as possible, from the previous ones — and that the same month names and day counts were retained. Hence, when birthdays or anniversaries had been “bumped up”, it appeared to be a seriously intrusive change, despite the measurement actually being according to a different calendar scale. It is interesting to observe that this demonstrates how indispensable the notion of a calendar had become, and how people attached greater significance to the day number and month name than to the actual time of the year.

If the change had merely included new leap-year rules then no one would have noticed until the next difference (year 1700). Even the change of the year start would have been manageable since countries such as Scotland had already achieved it. However, the correction of 10 days, combined with retention of the old months, made it appear as though days were being stolen, and it supposedly caused riots. Finally, the pope’s bull came at a very difficult time as far as relations between the Catholic and Protestant churches were concerned. It was issued during the reign of Elizabeth I, and an earlier attempt to adopt the new calendar was made in 1584. A bill was prepared entitled ‘An act to give Her Majesty authority to alter and make a new calendar, according to the calendar used in other countries’. Although the bill passed two readings in the House of Lords, stalling tactics by Protestant bishops ensured that it was eventually ignored.[5]

When we observe dual dates written during this chaotic transition, or from the period before, when the civil and historical years differed, we can identify two dates expressed according to distinct calendars: the Julian and Gregorian, or Julian with civil and historical years, respectively. In other words, it was not always the case that dual dates represented Julian and Gregorian dates during the changeover — a common misconception. What the cases have in common is that the date pairs represented the same day.

In A Calendar for Your Date — Part II, I made a case for holding an additional normalised (computer-readable) version of our dates, but how should this be extended to such pairs? The situation is not a special case as there are other precedents for representing the same day according to different calendars. In Israel, for instance, government documents usually carry a dual date embracing both the traditional Hebrew calendar and the Gregorian calendar. Similarly, following the calendar reforms of India, in 1957, government documents carry a dual date embracing the new national calendar of India and the Gregorian calendar.

STEMMA therefore defines the notion of synchronised dates, where a single entity describes an item from a written or printed source that embraces representations of the same day according to two or more calendars. To see this in action, let’s look at the combined Julian-Gregorian date example from above:

Figure 2 - Synchronised Julian and Gregorian dates.


We can immediately see that the one evidential form — that obtained from the consulted source — yields two normalised forms: one according to the Julian calendar and one to the Gregorian calendar. Any number of display forms can be generated from these normalised forms, dependent upon the regional settings and personal preferences of the end-user.
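As a rough illustration of the idea, a synchronised date might be modelled as one evidential string bound to several normalised forms. This is not STEMMA's actual syntax: the class names, field names, and calendar labels below are all hypothetical, and the dual date used is an invented example (Julian 3rd March 1734 is Gregorian 14th March 1734).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class NormalisedDate:
    calendar: str             # illustrative label, e.g. "julian" or "gregorian"
    value: str                # all-numeric form within that calendar
    calculated: bool = False  # True when derived by conversion, not read from the source


@dataclass(frozen=True)
class SynchronisedDate:
    evidential: str                        # transcript of what the source says
    normalised: Tuple[NormalisedDate, ...]  # the same day in one or more calendars

    def in_calendar(self, calendar: str) -> Optional[NormalisedDate]:
        # Return the form held for a given calendar, if there is one.
        for form in self.normalised:
            if form.calendar == calendar:
                return form
        return None


# A dual date: both forms are direct readings of the evidential text.
dual = SynchronisedDate(
    evidential="3/14 March 1733/4",
    normalised=(
        NormalisedDate("julian", "1734-03-03"),
        NormalisedDate("gregorian", "1734-03-14"),
    ),
)

# A converted date: the Gregorian form is flagged as calculated.
converted = SynchronisedDate(
    evidential="18 Brum an VIII",
    normalised=(
        NormalisedDate("french-republican", "#FR#08-02-18"),
        NormalisedDate("gregorian", "1799-11-09", calculated=True),
    ),
)

print(dual.in_calendar("gregorian").value)            # 1734-03-14
print(converted.in_calendar("gregorian").calculated)  # True
```

The point of the design is that nothing is discarded: the evidential form stays intact, and each derived form records whether it was read or computed.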

Anyone who hasn’t read my previous two articles on dates might be wondering why we need to store two normalised values when one will do, employing a conversion algorithm when necessary. However, note that not all such dual dates have an obvious interpretation. For instance, double years were occasionally used for months other than January to March which makes little sense, and so needs some considerable interpretation. These two normalised forms provide a direct interpretation of the relevant parts of the evidential form.

When other calendars are considered then the conversion to the Gregorian calendar — necessary for date comparison and timelines — is not always reliable. In these circumstances, the same entity can describe the direct interpretation of the evidential form and the calculated version of it. I’ll return to the French Republican calendar example from the aforementioned article in order to demonstrate such a conversion:

Figure 3 - Synchronised French Republican and calculated Gregorian dates.


In effect, the same synchronised-date entity is used to represent both dual dates and cases where a value in an alternative calendar has been calculated. In both circumstances it binds the multiple derived forms (normalised and/or calculated) to a single evidential form.



[1] David Ewing Duncan, The Calendar (London: Fourth Estate Ltd, 1998), pp.288–289.
[2] “Calendar (New Style) Act 1750”, transcription, legislation.gov.uk (http://www.legislation.gov.uk/apgb/Geo2/24/23 : accessed 4 Aug 2015), in introduction; delivered by The National Archives of UK.
[3] University of Nottingham, “Historical Year and the Civil Year”, UK Campus: Manuscripts and Special Collections (https://www.nottingham.ac.uk/manuscriptsandspecialcollections/researchguidance/datingdocuments/historicalandcivil.aspx : accessed 3 Aug 2015).
[4] Mike Spathaky, "Old Style and New Style Dates and the change to the Gregorian Calendar: A summary for genealogists", GENUKI (http://www.cree.name/genuki/dates.htm : accessed 4 Aug 2015), under "The cause of ambiguities - 2. The Start of the year".
[5] E. G. Richards, Mapping Time: The Calendar and its History (Oxford University Press, 1998, reprinted 2005), p.252.

Sunday, 19 July 2015

A Calendar for Your Date — Part II



In A Calendar for Your Date — Part I, I gave a brief tour of the variations in current and historical calendar systems. I now want to approach the question of how we should represent dates that are not expressed according to our Gregorian calendar.

The different calendar systems may be categorised as follows:[1]

Empirical. The start of the months or years is determined by direct observation and intercalary days or months are inserted on an ad hoc basis.

Calculated. These are rule-based and so are predictable. Lunisolar and Solar calendars may be astronomical rather than arithmetic in that the start of months and years may be determined through astronomical calculation rather than purely by using a fixed rule. Calendars with “wandering years”, such as the Egyptian civil calendar and the Mayan calendar, have a simple fixed number of days per year.

Conversion between calculated calendars can be approached algorithmically, if enough information is known, but empirical calendars require the use of tables, and it is rarely the case that enough historical information survives to make this an accurate process.



In general, when we see an historical date, we cannot always convert it directly and unambiguously to our modern Gregorian calendar; it requires some interpretation, and that in turn requires information about the actual calendar variant that was being used, the social group involved, their political and religious leanings, the weather, and maybe even the geographical coordinates.

As a modern-day analogy of this problem, consider if I’d written the date 9/6/2015. Now did I mean June 9th or September 6th? If the date had been written in the US then you might believe that the latter alternative is more likely. However, some knowledge of the author, and the fact he has worked in the US but that he is English by birth, might suggest the former alternative is more likely. Hopefully, you see the problem: the converted value cannot always be faithful to the original source information, and it might require an update based on the analysis of new information, or the availability of a revised algorithm or tables.
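The ambiguity is easy to demonstrate in code. The sketch below, using only Python's standard `datetime` module, returns every valid reading of a numeric date rather than silently choosing one; the function name and the reading labels are illustrative.

```python
from datetime import date, datetime
from typing import Dict


def candidate_readings(text: str) -> Dict[str, date]:
    # Try both common orderings and keep every one that yields a valid date.
    formats = {"day-first": "%d/%m/%Y", "month-first": "%m/%d/%Y"}
    readings = {}
    for label, fmt in formats.items():
        try:
            readings[label] = datetime.strptime(text, fmt).date()
        except ValueError:
            pass  # this ordering is impossible for the given numbers
    return readings


print(candidate_readings("9/6/2015"))
# Both readings survive: 2015-06-09 (day-first) and 2015-09-06 (month-first)

print(candidate_readings("25/12/2015"))
# Only the day-first reading is valid
```

Software can narrow the candidates, but choosing between the survivors needs the contextual knowledge described above.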

A number of resources exist for calendar conversions — both algorithms and documentation — although some seem to have disappeared since I first saw them:

Description                          Status                  Notes
Converter for historical calendars
Calendars and Their History          Currently inaccessible  See "inexact" in sec 1.3 observational calendars
Indian Calendrical Calculations
Convert a Date
Pancanga (version 3.14)



These resources involve specific calendar variants that would have to be known in advance, and at least one acknowledges the inexact nature of general calculations. In effect, a calculated date can never be more accurate than the original, and there will be many cases where it will be less accurate.

What I’m suggesting is that wholesale conversion of historical dates to the Gregorian calendar is a very bad approach for genealogy, and for historians in general. Obviously we need the ability to put dates from different calendars on the same timeline, but that does not mean discarding the original information in favour of a calculated alternative; such a process may involve an increased degree of uncertainty or imprecision (as differentiated in Warm Fuzzy Dates), as well as some loss of evidence. Furthermore, if software is going to collate these dates then it needs a representation that it can understand, and we cannot make do with just the written evidential form and the calculated variant.

Let me explain what this last sentence means by using a small example. STEMMA applies a bilateral approach to all data items assimilated into a computer-readable form, as explained in Returning to Normalised Names and Dates and in Is That a Fact?. What this means is that it holds a transcript of the original evidential form and a separate computer-readable (normalised) form. A computer-readable date would be used for sorting and collation, but also for generating a display form — say for a report or a chart — according to the regional settings and personal preferences of the current end-user. For instance:

Evidential form:     25 Dec 56
Normalised form:     1856-12-25
Display form:        25th December 1856

There are a couple of points to note in this simple Gregorian example. Firstly, those software developers who believe that it is possible to automatically convert written (Gregorian-)dates to a normalised form (ISO 8601 in this case) would probably have interpreted this date as 1956 rather than 1856, thus emphasising the importance of the context of the information. I could have used an evidential form such as “Christmas 56” to hammer that home but I wanted to give a sense of its subtlety. Similarly, an evidential form of “my birthday” is also referencing a date, but whose birthday, and in which year — all of which is contextual information that a researcher would use to apply a conversion. My second point is that the display form is generally (in modern software) produced according to a “short”, “medium”, “long”, or “full” request, and that request would examine the end-user’s settings in order to generate a consistent representation for readability. This is an approach that could be applied to all calendars, in principle.
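A minimal sketch of this three-form arrangement follows. The class name, the style names, and the English-only month list are assumptions made for illustration; real software would take the display format from the end-user's regional settings. The last lines show what one automatic conversion does with a two-digit year: Python's `%y` directive maps 56 to 2056.

```python
from dataclasses import dataclass
from datetime import date, datetime

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def ordinal(n: int) -> str:
    # 1st, 2nd, 3rd, 4th ... with 11th, 12th, 13th as special cases.
    if 11 <= n % 100 <= 13:
        return f"{n}th"
    return f"{n}{ {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th') }"


@dataclass(frozen=True)
class RecordedDate:
    evidential: str  # transcript of the source
    normalised: str  # ISO 8601 form, chosen by the researcher from context

    def display(self, style: str = "long") -> str:
        # Generate a display form from the normalised value only.
        d = date.fromisoformat(self.normalised)
        if style == "short":
            return f"{d.day:02d}/{d.month:02d}/{d.year}"
        return f"{ordinal(d.day)} {MONTHS[d.month - 1]} {d.year}"


christmas = RecordedDate(evidential="25 Dec 56", normalised="1856-12-25")
print(christmas.display())         # 25th December 1856
print(christmas.display("short"))  # 25/12/1856

# An automatic reading of the two-digit year picks the wrong century.
naive_year = datetime.strptime("25/12/56", "%d/%m/%y").year
print(naive_year)                  # 2056
```

The normalised value is entered once, with the benefit of context, and every display form is then derived from it rather than from the evidential text.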

In the case of a calendar conversion, a missing item would be a normalised value in the alternative calendar; one that must be flagged as being “calculated” to avoid ambiguity. The next example includes a date from the French Republican calendar converted to the Gregorian calendar.

Evidential form:     18 Brum an VIII
Normalised form:     #FR#08-02-18
Display form:        18 Brumaire An 8
Normalised form:     1799-11-09         (Calc)
Display form:        9th November 1799  (Calc)

The normalised form is the STEMMA one since there is no standard that I am aware of. The associated display form uses Arabic year numbers rather than Roman numerals. Coinage of the time often used these rather than the Roman numerals used elsewhere, but it would obviously be a display setting. The two extra fields show the equivalent normalised date after conversion to the Gregorian calendar, and its associated display form. The two normalised forms are therefore distinct in that one is a direct implementation of the evidential form whereas the other is a derivation. The second should therefore be flagged as a calculated datum using something akin to the GEDCOM CAL flag (see DATE_APPROXIMATED in the specification). STEMMA would allow the two forms to be bound using the DATE_ENTITY that’s also used for synchronised dates (i.e. its generalised form of dual dates).
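The calculated Gregorian value in this example can be reproduced with a table-driven sketch. It assumes the historical start dates of Republican years I to XIV as I understand them, twelve months of 30 days, and the complementary days treated as a thirteenth month. The function names are hypothetical, and the `#FR#` layout simply copies the example above.

```python
from datetime import date, timedelta

# Gregorian date of 1 Vendémiaire for each Republican year (assumed table).
YEAR_START = {
    1: date(1792, 9, 22),  2: date(1793, 9, 22),  3: date(1794, 9, 22),
    4: date(1795, 9, 23),  5: date(1796, 9, 22),  6: date(1797, 9, 22),
    7: date(1798, 9, 22),  8: date(1799, 9, 23),  9: date(1800, 9, 23),
    10: date(1801, 9, 23), 11: date(1802, 9, 23), 12: date(1803, 9, 24),
    13: date(1804, 9, 23), 14: date(1805, 9, 23),
}


def republican_to_gregorian(year: int, month: int, day: int) -> date:
    # Months are numbered 1-12; 13 stands for the complementary days.
    if year not in YEAR_START:
        raise ValueError(f"No table entry for Republican year {year}")
    return YEAR_START[year] + timedelta(days=(month - 1) * 30 + (day - 1))


def normalised_republican(year: int, month: int, day: int) -> str:
    # All-numeric form in the style of the example above.
    return f"#FR#{year:02d}-{month:02d}-{day:02d}"


# 18 Brumaire An VIII (Brumaire is the second month)
print(normalised_republican(8, 2, 18))                # #FR#08-02-18
print(republican_to_gregorian(8, 2, 18).isoformat())  # 1799-11-09
```

Using a lookup table rather than a rule reflects the fact that the year start was fixed by the autumn equinox; outside the range of the table the function refuses to guess.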

Unfortunately, there are no data standards to accommodate the normalised representation of dates in other calendars. All we have is the ISO 8601 date standard[2], which is specific to the Gregorian calendar and largely the result of an amalgamation of previous standards. Much of its content gets ignored in favour of the pure representation of a Gregorian date and/or time, and that includes ranges, ordinal dates, etc. A critique of that standard may be found at: Is the ISO Date Standard Bad?.

GEDCOM 5.5 includes a small set of “date escapes”[3] that can prefix a date value in order to address different calendars:

@#DGREGORIAN@ — Gregorian calendar
@#DJULIAN@ — Julian calendar
@#DHEBREW@ — Jewish calendar
@#DFRENCH R@ — French Republican calendar
@#DROMAN@ — for future definition
@#DUNKNOWN@ — for unknown calendars

This sounds like a step in the right direction although the specification offers little help on the encoding of year numbers or month names for the non-Gregorian calendars. It does acknowledge the ambiguity of using words rather than numbers via the statement: “No future calendar types will use words (e.g. month names) from this list: FROM, TO, BEF, AFT, BET, AND, ABT, EST, CAL, or INT”.
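Recognising a date escape is straightforward, as the sketch below suggests; interpreting the remainder of the value is where the specification gives little help. The function name is illustrative, and treating an absent escape as Gregorian is my reading of the default, so it is marked as an assumption in the code.

```python
import re
from typing import Tuple

# Matches a leading calendar escape such as @#DJULIAN@ or @#DFRENCH R@.
_ESCAPE = re.compile(r"^@#D(?P<calendar>[A-Z ]+)@\s*(?P<rest>.*)$")


def split_calendar_escape(value: str) -> Tuple[str, str]:
    """Split a GEDCOM date value into (calendar name, remaining text)."""
    value = value.strip()
    match = _ESCAPE.match(value)
    if match:
        return match.group("calendar"), match.group("rest")
    # Assumption: a value without an escape is treated as Gregorian.
    return "GREGORIAN", value


print(split_calendar_escape("@#DJULIAN@ 1 MAR 1740"))
# ('JULIAN', '1 MAR 1740')
print(split_calendar_escape("@#DFRENCH R@ 18 BRUM 8"))
# ('FRENCH R', '18 BRUM 8')
print(split_calendar_escape("25 DEC 1856"))
# ('GREGORIAN', '25 DEC 1856')
```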

In February of 2015, Bob Coret analysed the usage of this calendar feature in a sample of 82.9 million DATE lines from about 7000 GEDCOM files.[4] He reported the following very low permille (i.e. tenths of a percent) usage, all others being zero:

@#DJULIAN@      0.123 ‰
@#DHEBREW@      0.013 ‰
@#DFRENCH R@    0.006 ‰

Clearly this feature is very underutilised, but what is the reason? Is it that few people have dates in alternative calendars, or that they only store the Gregorian equivalents, or that their software does not support this feature?

Family Historian uses a “[J]” prefix for entering dates in the Julian calendar, and this has also become a display option in some other products (e.g. TNG). For instance: “[J] 1 Mar 1740”. A consequence is that this alternative syntax occasionally creeps into exported GEDCOM dates to muddy the waters.

The Unicode Common Locale Data Repository (CLDR) has also proposed a set of calendar names for computer use at: http://unicode.org/repos/cldr/trunk/common/bcp47/calendar.xml, although I cannot see any details of corresponding date encodings. It appears to be part of an extension to BCP 47 (“Tags for Identifying Languages”) called RFC 6067 for “subtags that specify language and/or locale-based behaviour or refinements to language tags, according to work done by the Unicode Consortium”.

The MARC Extended Date/Time Format (EDTF) makes no mention of calendars as it is applicable only to the Gregorian calendar.

The Society of American Archivists (SAA) adopted DACS (Describing Archives, A Content Standard) in 2004. This mentions alternative calendar systems but only from a written point of view as opposed to a digital one. Their Standards for Archival Description, Chapter 7 (Codes), does mention the Julian calendar but only in the context of ordinal dates.

The MSS Working Group discusses a number of issues related to date/time representation, including dates from non-Gregorian calendars.

The ISO 8601 standard that addresses the Gregorian calendar has a few attractive core features:

  • It uses fixed-length all-numeric fields and so avoids language issues and textual ambiguities (see GEDCOM list of avoided names, above).
  • The resultant text is implicitly sortable without the host software having to understand dates at all.
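The second point is simple to demonstrate: a plain text sort, with no date logic at all, puts full ISO 8601 dates into chronological order, whereas the same sort applied to typical display forms does not. The sample values below are dates that appear elsewhere in these articles.

```python
# Full ISO 8601 dates: fixed-length, numeric, most significant field first.
iso_dates = ["1856-12-25", "1799-11-09", "1752-09-14"]

# The same days written in a typical display form.
display_dates = ["25 Dec 1856", "9 Nov 1799", "14 Sep 1752"]

# A plain string sort knows nothing about dates.
sorted_iso = sorted(iso_dates)
sorted_display = sorted(display_dates)

print(sorted_iso)      # ['1752-09-14', '1799-11-09', '1856-12-25']
print(sorted_display)  # ['14 Sep 1752', '25 Dec 1856', '9 Nov 1799']
```

The first result is chronological; the second is merely alphabetical, ordering the values by their leading day digits.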

Ideally, any standard for the computer-readable date formats in the other calendars should adopt a similar approach. This was STEMMA’s goal from its inception. However, it found that it had to adopt a variation of the ISO 8601 format in order to support missing levels of granularity (such as yearly quarters) and to correctly sort differing granularities with respect to each other, two criticisms made in the aforementioned article. Apart from the easy cases of the Julian and French Republican calendars, it has made no further headway. What it has done, though, is to create a generic Date entity that can be back-filled with the encodings for any number of calendars, once they’ve been defined, without changing its overall data model. This is an approach that I strongly recommend to FHISO in order to avoid prematurely dismissing this issue, and then later finding that some method of date escapes is required, similar to GEDCOM.

A number of papers were received by FHISO on the subject of calendars, and their approaches and coverage appear to be very constructive. At the time of writing, there was no associated Exploratory Group established for research into this field.

Title: Proposal to support dates BC as negative years
Description: This paper presents a case for allowing dates BC to be recorded using the standard Julian and Gregorian calendars, and proposes a representation for such dates that is naturally sortable.

Title: Proposal to extend the calendar style mechanism of CFPS 43 into an abstract formatting model
Description: CFPS 43's style mechanism is extended into an abstract formatting model that would allow applications to format correctly dates written in many unknown calendar systems.

Title: Proposal to support the Julian calendar similarly to CFPS 17
Description: Proposal for a Julian calendar with years starting on 1 Jan.

Title: Proposal to add style to the wholly-numeric representation of dates in CFPS 13
Description: Proposal to separate presentation from representation in calendars in order to avoid a proliferation of calendars.

Title: Proposal for compound calendars to resolve a difficulty with default calendars
Description: Proposal to allow the default calendar to be dependent on the date.

Title: Proposal for a Generalised Dual-Date Representation
Description: Proposal for a generalised dual-date representation that applies to multiple calendars.

Title: Proposal to Accommodate Alternative World Calendar Systems
Description: Proposed adoption of a date syntax applicable to multiple world calendars, both historical and modern-day.


A question I have heard before is why those uncertainties and inaccuracies should be relevant to genealogists. What difference does it make if you’re a few days out, or a month, or possibly even a year? I entirely disagree with this thinking. Even if you’re only building a family tree then the relationships and vital events might not be supported by direct and non-conflicting evidence; there may have to be some interpretation, and some correlation with information from elsewhere in order to justify them.

A bigger question people may pose is why the historical calendars are of interest to genealogists. After all, there is at least some agreed synchronisation between the six principal calendars that are in use today. Not many people can trace their lineage back to, say, Caesar. Well, even historical characters had lineage, and family history, so whether you’re studying modern genealogy or ancient genealogy should be irrelevant. More than this, though, I do not consider genealogy (including family trees and family history) to be a special case that needs its own standards and methodologies. It is a form of micro-history, which in turn is a form of history. The information that we uncover and analyse in our research does not come from a world of its own, and it cannot be considered in isolation. All those events — both large-scale and small-scale — relate to the real world, and will affect each other. Historical research needs a consistent scheme that respects the integrity of our sources and the information found therein. To suggest that software standards, or the Internet, or populist genealogy products, must stick to Gregorian dates would be a case of the tail wagging the dog.



[1] E. G. Richards, Mapping Time: The Calendar and its History (Oxford University Press, 1998, reprinted 2005), p.99.
[2] Data elements and interchange formats — Information interchange — Representation of dates and times, International Standard, ISO 8601:2004(E), 3rd ed. 1 Dec 2004; online copy obtained from http://dotat.at/tmp/ISO_8601-2004_E.pdf (accessed 11 Jul 2015).
[3] Actually, version 5.3 also contained this feature but version 5.4 omitted it with the following explanatory statement: “The Lineage-Linked GEDCOM Form is restricted to Gregorian calendar forms. This version of GEDCOM chose not to support multiple calendars. The reason is that support of multiple calendars would require each receiving system to handle multiple calendar conversions”. Source: Tamura Jones, "FamilySearch GEDCOM Specifications", Modern Software Experience (http://www.tamurajones.net/FamilySearchGEDCOMSpecifications.xhtml : accessed 19 Jul 2015).
[4] Bob Coret, "Usage of calendars in GEDCOM", Bob Coret in English, posted 5 Feb 2015 (http://blog-en.coret.org/2015/02/usage-of-calendars-in-gedcom.html : accessed 19 Jul 2015).