In line with DickyP's story, I have an "annoying or what?" tale to tell...
About ten years ago, after adding a few members to the family tree, I downloaded a new genealogy file from Ancestry.COM in GEDCOM format, which is essentially an entity-attribute-value format - with the complication of having up to six columns of attribute identification. It arrived in one of the many text formats (Notepad recognized at least six), but it was not ANSI text, judging by the line-end delimiters. No problem: if you opened it in Notepad, you could save it back out in ANSI format. But this time it was different.
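For anyone who hasn't seen one, GEDCOM lines look roughly like this (a made-up fragment, not from my actual file) - the leading number is the nesting level, the @...@ token is the entity ID, the tag is the attribute, and the rest is the value:

    0 @I0042@ INDI
    1 NAME John /Smith/
    1 BIRT
    2 DATE 12 MAR 1890
    2 PLAC Leeds, Yorkshire, England
    1 FAMS @F0007@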
My data import routine saw only one VERY LONG record, because the input had changed to UTF-8, which Notepad couldn't completely convert. The "very long record" was because they changed the end-of-line delimiters, too. So when my code did its INPUT LINE to identify the next entity, it saw no delimiters and just slurped up the WHOLE FILE (about 1.6 MB) as a single record. No warnings, no notices - they just changed to UTF-8, AND actually used a few characters from the extended character set that UTF-8 allows.
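If anyone hits the same thing, the defensive fix is to stop trusting the file's line endings altogether: read the whole thing in one gulp and split it yourself. A simplified sketch of the idea (names and path invented for the example, not my actual routine):

    Dim intFile As Integer
    Dim strRaw As String
    Dim varLines As Variant

    intFile = FreeFile
    Open "C:\Temp\family.ged" For Binary As #intFile  ' example path only
    strRaw = Space$(LOF(intFile))
    Get #intFile, , strRaw
    Close #intFile

    ' Normalize whatever delimiter they used this month to bare LF,
    ' then split - handles CRLF, LF-only, and CR-only files alike.
    strRaw = Replace(strRaw, vbCrLf, vbLf)
    strRaw = Replace(strRaw, vbCr, vbLf)
    varLines = Split(strRaw, vbLf)

That sidesteps the slurp-the-whole-file problem, though it does nothing about the UTF-8 bytes themselves - that still needs the parser below.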
I had to write a character-by-character parser, because the only way to recognize the UTF-8 extended characters and actually DO something with them is to catch an in-line non-ASCII lead byte, which "escapes" you into a two-byte, three-byte, or four-byte encoding sequence that resolves to a single extended character. Fortunately, none of the extended characters make a huge difference, because they only appear on the VALUE side of the record, never in the entity or attribute sections. When I catch one, I can substitute a benign character, for which I chose "?".
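The parser boils down to classifying the lead byte: in UTF-8, anything below &H80 is plain ASCII, &HF0 and up starts a 4-byte sequence, &HE0 and up a 3-byte sequence, and the rest of the high range a 2-byte sequence. The gist of it, as a sketch (not my production code):

    Function CleanUtf8(abytIn() As Byte) As String
        Dim i As Long, bytLead As Byte, lngSkip As Long
        Dim strOut As String

        i = LBound(abytIn)
        Do While i <= UBound(abytIn)
            bytLead = abytIn(i)
            If bytLead < &H80 Then        ' plain ASCII - keep as-is
                strOut = strOut & Chr$(bytLead)
                lngSkip = 1
            ElseIf bytLead >= &HF0 Then   ' lead byte of a 4-byte sequence
                strOut = strOut & "?"
                lngSkip = 4
            ElseIf bytLead >= &HE0 Then   ' lead byte of a 3-byte sequence
                strOut = strOut & "?"
                lngSkip = 3
            Else                          ' &HC0-&HDF: 2-byte sequence
                strOut = strOut & "?"     ' substitute the benign character
                lngSkip = 2
            End If
            i = i + lngSkip               ' skip the continuation bytes
        Loop
        CleanUtf8 = strOut
    End Function

Since we always jump past the continuation bytes, the loop never lands on one in well-formed UTF-8.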
Just when I thought I had it cleaned up, the next import crashed again, with an arithmetic overflow. THAT blew my mind, because (a) the DB doesn't do a lot of math and (b) at that time I only had about 1800 people in this DB, so a DCOUNT wouldn't even overflow a WORD integer. It took a bit of debugging - including a change to my logging routine - to find the input line on which it crashed.
The second problem was that their database had gotten so big that their numeric "entity ID" field had grown from 12 digits to (at least) 16 digits - which of course overflowed my LONG integer entity PK, and I'm on 32-bit Access, which doesn't support LONG LONG (or QUAD) integers. So I had to retain the 16 digits as text, but I didn't want to use that as a PK.
I built a lookup table where I assign a "local" LONG ID number to the people and other entities involved, and I use the local number for all records and relationships. Whenever the input has an entity ID in a value field - e.g. as the person for a "SPOUSE", "PARENT", or "CHILD" attribute - I can look up the original ID and determine the locally assigned number. Ancestry uses their own numbers for that kind of reference, so I have to translate them. Given how often they use those cross-reference numbers, the translation is economical storage-wise, and it let me keep a 4-byte numeric PK instead of a 16-character text PK.
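The lookup itself is nothing fancy - roughly this shape, with table and field names invented for the example:

    Function GetLocalID(strAncestryID As String) As Long
        Dim varHit As Variant

        ' tblIDMap: LocalID (AutoNumber Long, PK) plus
        '           AncestryID (Text 16, indexed, no duplicates)
        varHit = DLookup("LocalID", "tblIDMap", _
                         "AncestryID = '" & strAncestryID & "'")
        If IsNull(varHit) Then
            ' First time we've seen this ID - add it and let
            ' AutoNumber hand out the next local PK.
            CurrentDb.Execute "INSERT INTO tblIDMap (AncestryID) " & _
                              "VALUES ('" & strAncestryID & "')", dbFailOnError
            varHit = DLookup("LocalID", "tblIDMap", _
                             "AncestryID = '" & strAncestryID & "'")
        End If
        GetLocalID = varHit
    End Function

Every cross-reference in the import just goes through that one function, so new entities register themselves on first sight.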
It took me over a month to track all of this down and propagate the entity-ID changes throughout my code, because a PK used for relationship support is pervasive by nature. To say I was annoyed by this little format change doesn't quite cut it. The worst part is that when it comes to Ancestry.COM's online documentation for the format of their GEDCOM file, I've seen better manuals on Chinese-made DVD drives suitable for installation in an old 486 PC.