Spiro Title Slide - ICU

Knotty problems in date/time parsing and formatting and time zones
Yoshito Umaoka
IBM Globalization Center of Competency
32nd Internationalization and Unicode Conference
2008 IBM Corporation

Agenda
  • Challenges for Implementing Date and Time UI
  • Understanding Time Zone Formatting and Parsing

IUC32: Knotty problems in date/time parsing and formatting and time zones 2008 IBM Corporation

Challenges for Implementing Date and Time UI
  • Two examples
      • Google Calendar
      • IBM Lotus Notes
  • Walking through various requirements for displaying date and time
  • Solutions provided by CLDR
  • Design/Implementation Tips

Google Calendar

Lotus Notes 8 Calendar

Date Format Types
  • Basic: July 27, 2008 / Relative: Today
  • Basic: July 28, 2008 / Relative: Tomorrow
  • Basic: August 3, 2008 / Relative: August 3, 2008
  • Interval: July 27 - 28, 2008 / Duration: 1 day
  • Interval: July 27 - August 3, 2008 / Duration: 7 days
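The relative and duration examples above can be sketched in a few lines. This is only an English-only illustration, not how ICU implements RelativeDateFormat or TimeUnitFormat: the function names `basic`, `relative` and `duration`, the hardcoded month names and the naive plural rule are all assumptions made for this sketch.

```python
from datetime import date

# English month names are hardcoded to keep the sketch locale-independent.
MONTHS = ("January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December")


def basic(d: date) -> str:
    """Basic format, e.g. 'July 27, 2008'."""
    return f"{MONTHS[d.month - 1]} {d.day}, {d.year}"


def relative(d: date, today: date) -> str:
    """Relative format that falls back to the basic format."""
    offset = (d - today).days
    if offset == 0:
        return "Today"
    if offset == 1:
        return "Tomorrow"
    return basic(d)


def duration(start: date, end: date) -> str:
    """Duration in whole days, using a naive English plural rule."""
    days = (end - start).days
    return f"{days} day" if days == 1 else f"{days} days"
```

With `today = date(2008, 7, 27)`, `relative(date(2008, 7, 28), today)` gives "Tomorrow", while `relative(date(2008, 8, 3), today)` falls back to "August 3, 2008", matching the slide. A real implementation would take the words and plural rules from locale data rather than from string literals.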

Mini Calendar
  • Month
      • Some locales use a different form of the month name when it appears without a day
      • E.g. Polish: lipiec (nominative) vs. lipca (genitive), as in "lipiec 2008" and "28 lipca 2008"
  • Day of week
      • Very short abbreviation
      • Not always the first letter of the day-of-week name, e.g. in Chinese
  • The first day of week
      • Sunday is the first day of the week in many regions, but not in all of them

Month/Day of Week Names in CLDR
  • 3 different widths: wide / abbreviated / narrow
  • 2 context types: format / stand-alone

Month name example - January

  Locale      format                            stand-alone
              wide      abbreviated  narrow     wide      abbreviated  narrow
  en_US       January   Jan          J          January   Jan          J
  pl_PL       stycznia  sty          s          styczeń   sty          s
  ru_RU

Day of week name example - Sunday

  Locale      format                            stand-alone
              wide      abbreviated  narrow     wide      abbreviated  narrow
  en_US       Sunday    Sun          S          Sunday    Sun          S
  zh_Hans_CN

Date and Time Interval
  • When displaying a date interval, duplicated date fields can be stripped off.
  • 3 possible patterns depending on the combination of start date and end date
      • July 20 - 26, 2008
      • July 20 - August 1, 2008
      • July 20, 2008 - July 19, 2009
  • Different combination patterns in different locales
      • 20 - 26 July 2008
      • 20 July - 1 August 2008
      • 20 July 2008 - 19 July 2009

Date/Time Interval in CLDR
  • Each <intervalFormatItem> is associated with a skeleton pattern and contains one or more <greatestDifference> patterns.
  • A <greatestDifference> element contains a pattern that is used when the greatest difference between the two given dates matches its id attribute.
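The greatest-difference lookup described above can be sketched as follows. This is a simplified illustration, not the ICU DateIntervalFormat implementation: the `PATTERNS` table, its placeholder names and the English month names are assumptions made for this sketch, and real CLDR data uses date pattern letters rather than Python format fields.

```python
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December")

# Hypothetical stand-in for one skeleton's interval patterns, keyed by
# the greatest differing field ("y" = year, "M" = month, "d" = day).
PATTERNS = {
    "y": "{sm} {sd}, {sy} - {em} {ed}, {ey}",
    "M": "{sm} {sd} - {em} {ed}, {ey}",
    "d": "{sm} {sd} - {ed}, {ey}",
}


def greatest_difference(start: date, end: date):
    """Return the largest calendar field in which the two dates differ."""
    if start.year != end.year:
        return "y"
    if start.month != end.month:
        return "M"
    if start.day != end.day:
        return "d"
    return None


def format_interval(start: date, end: date) -> str:
    fields = dict(sm=MONTHS[start.month - 1], sd=start.day, sy=start.year,
                  em=MONTHS[end.month - 1], ed=end.day, ey=end.year)
    key = greatest_difference(start, end)
    if key is None:
        # Both dates are the same day: format a single date.
        return "{sm} {sd}, {sy}".format(**fields)
    return PATTERNS[key].format(**fields)
```

For example, `format_interval(date(2008, 7, 20), date(2008, 8, 1))` selects the "M" pattern and yields "July 20 - August 1, 2008". Supporting another locale would mean swapping the pattern table, not the selection logic.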

  • Example patterns for one skeleton, from a difference in year down to a difference in day:
      • MMM d, yyyy - MMM d, yyyy
      • MMM d - MMM d, yyyy
      • MMM d - d, yyyy

Other Challenges
  • Various combinations of date fields and widths
      • Sat 7/26
      • The UI needs to display a short format including month, day of month and day of week, but not year
      • The pattern may change depending on the locale
          • Sat 26/7 for en_GB
          • 7/26(土) for ja_JP
  • Week number
      • Week numbers are commonly used in European countries
      • The way of calculating week numbers in a year may vary depending on local conventions

Flexible Date Format Support in CLDR (1)

  • <availableFormats> contains various <dateFormatItem> elements
  • Each <dateFormatItem> has an id attribute representing a skeleton
      • A skeleton contains only field information, in a canonical order
  • A CLDR consumer provides a skeleton
      • When the matching skeleton is available in the locale, the associated pattern is returned.
      • If not, the closest match which contains all requested fields is returned.
  • Example patterns: E d MMM, d MMMM, dd/MM, d/M, MMM yy, MM/yyyy, MMMM yyyy

Flexible Date Format Support in CLDR (2)
  • When no <dateFormatItem> element satisfies the matching criteria, the rules defined by <appendItems> are used to append the missing fields to one of the existing formats.
  • Example <appendItem> patterns: {0} ({2}: {1}) and {0} {1}

Week Data in CLDR
  • minDays: minimum number of days in the first week
  • firstDay: first day of a week
  • weekendStart/weekendEnd: start/end day of the weekend
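To show how minDays and firstDay affect week numbers, here is a minimal sketch of a week-of-year calculation driven by those two values. The function names are invented for this illustration and it is not the ICU Calendar code. It assumes that week 1 is the first week containing at least `min_days` days of the new year, and it uses Python's weekday numbering (Monday = 0, Sunday = 6).

```python
from datetime import date, timedelta

MONDAY, SUNDAY = 0, 6  # Python's date.weekday() numbering


def week_start(d: date, first_day: int) -> date:
    """First day of the week that contains d."""
    return d - timedelta(days=(d.weekday() - first_day) % 7)


def first_week_start(year: int, first_day: int, min_days: int) -> date:
    """Start of week 1: the first week with at least min_days days in year."""
    jan1 = date(year, 1, 1)
    start = week_start(jan1, first_day)
    days_in_year = 7 - (jan1 - start).days
    if days_in_year < min_days:
        start += timedelta(days=7)
    return start


def week_of_year(d: date, first_day: int, min_days: int):
    """Return (week_year, week_number) for d under the given week rules."""
    next_start = first_week_start(d.year + 1, first_day, min_days)
    if d >= next_start:
        # Late December days can already belong to week 1 of the next year.
        return d.year + 1, 1
    start = first_week_start(d.year, first_day, min_days)
    week_year = d.year
    if d < start:
        # Early January days can still belong to the previous year's last week.
        week_year = d.year - 1
        start = first_week_start(week_year, first_day, min_days)
    return week_year, 1 + (d - start).days // 7
```

With `first_day=MONDAY, min_days=4` this reproduces ISO 8601 week numbering, so Sunday, July 27, 2008 falls in week 30. With `first_day=SUNDAY, min_days=1`, assumed here to be the United States convention, the same date starts week 31.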

Comparison of Format Functions

  Basic format function
      • Standard C library: strftime/wcsftime
      • Microsoft .NET: DateTime
      • JDK: SimpleDateFormat
      • ICU: SimpleDateFormat

  Predefined format patterns
      • Standard C library: LC_TIME (date, time, date & time)
      • Microsoft .NET: DateTimeFormatInfo (date (long/short), time (long/short), date & time, month and day, year and month)
      • JDK: DateFormat constructor (date, time, date & time; 4 different lengths for each: full/long/medium/short)
      • ICU: DateFormat constructor (same as JDK), plus support for arbitrary combinations of date fields using a skeleton pattern

  Localized month/day names
      • Standard C library: LC_TIME (full & abbreviated)
      • Microsoft .NET: DateTimeFormatInfo (full & abbreviated, genitive month, shortest day names)
      • JDK: DateFormatSymbols (full & abbreviated)
      • ICU: DateFormatSymbols (full/abbreviated/narrow, formatting/standalone)

  Relative
      • Standard C library: n/a
      • Microsoft .NET: n/a
      • JDK: n/a
      • ICU: DateFormat (RelativeDateFormat)

  Interval
      • Standard C library: n/a
      • Microsoft .NET: n/a
      • JDK: n/a
      • ICU: DateIntervalFormat

  Duration
      • Standard C library: n/a
      • Microsoft .NET: n/a
      • JDK: n/a
      • ICU: TimeUnitFormat

  Calendar system
      • Standard C library: Gregorian and its variants
      • Microsoft .NET: 15 calendar types
      • JDK: Gregorian, Thai Buddhist and Japanese
      • ICU: 11 calendar types

Design/Implementation Tips
  • Keep the internal date/time representation locale-independent
      • Localized formats may vary depending on the implementation
      • Use a standard format such as ISO 8601 for data exchange
  • Do not hardcode format patterns in your source code
      • Do not put format patterns in resource bundles with other localizable messages!
      • Locale support is more than UI translation
      • Translation vendors are usually not able to handle regional variants
      • You should be able to find solutions in CLDR/ICU; if none is available, file a bug to request a new feature
  • Avoid date/time data entry by text
      • Formatting date/time is complicated, and so is parsing
      • Use a UI widget to eliminate ambiguous data entry
  • Understand the regional conventions of calendar systems
      • Rules for calculating some calendar fields may vary
      • Be prepared to support non-Gregorian calendar systems
      • For example, the Buddhist calendar is the preferred calendar system in Thailand
      • Japanese calendar support may be required depending on the target sectors

Understanding Time Zone Formatting and Parsing
  • CLDR's approach for supporting time zone formatting
  • Choosing the right time zone format type for your needs
  • Tips for processing date/time with time zone

http://www.time.gov/images/worldzones.gif

Time Zone Implementations
  • The tz database (a.k.a. Olson database)
      • 568 zones (436 unique zones / 132 aliases) (2008d)
      • Supports historic time transitions since the late 19th century
      • At least 1 zone per country/region
      • Time zone abbreviations for display (3 or 4 ASCII letters), such as EST, JST
      • Used by *nix systems (Solaris, Linux, AIX, Mac OS X) and Java
  • MS Windows time zone
      • 84 zones (Windows Vista), some are obsolete
      • Supports historic rules (2005 and beyond) in Vista/2008 Server (Dynamic DST)
      • A zone is shared by multiple cities/countries
      • Time zone display names include the standard offset and a common name or exemplar cities, such as "(GMT-05:00) Eastern Time (US & Canada)" and "(GMT+09:00) Osaka, Sapporo, Tokyo"

Time Zone Format Types in CLDR (1)
  • Generic location format
      • Designed for populating choice lists for time zones
      • Uniquely mapped to canonical zone IDs
      • Examples
          • Europe/Rome -> Italy Time [en]
          • America/New_York -> United States (New York) Time [en]
          • America/New_York -> Hora de Estados Unidos (New York) [es]
  • Generic non-location format
      • Designed for recurring events, meetings, or anywhere people do not want to be overly specific
      • Two widths: long/short
      • Examples
          • America/New_York -> ET [en/short]
          • America/New_York -> Eastern Time [en/long]
          • America/Montreal -> Eastern Time [en/long]

Time Zone Format Types in CLDR (2)
  • Generic partial location format
      • A variant of the generic non-location format, used as a fallback name when the generic non-location format is not specific enough
      • Two widths: long/short
      • Examples
          • America/Mexico_City -> Hora central (Ciudad de México) [es_US/short/Mar 9 - April 6, 2008]
          • America/Chicago -> Hora central (Chicago) [es_MX/short/Mar 9 - April 6, 2008]
  • Specific (non-location) format
      • Designed to distinguish between standard time and daylight time
      • Two widths: long/short
      • Examples
          • America/New_York -> EST [en/short/standard time]
          • America/New_York -> EDT [en/short/daylight time]
          • America/New_York -> Eastern Standard Time [en/long/standard time]
          • America/Montreal -> Eastern Standard Time [en/long/standard time]

Time Zone Format Types in CLDR (3)
  • Localized GMT format
      • Designed for representing the offset from GMT
      • Local decimal digits are used
      • Examples
          • America/New_York -> GMT-05:00 [en/standard time]
          • America/New_York -> GMT-04:00 [en/daylight time]
          • America/New_York -> -0500 [bg/standard time]
  • RFC 822 format
      • Locale-insensitive fixed format representing the offset from GMT, as defined by RFC 822
      • ASCII decimal digits are always used
      • Examples
          • America/New_York -> -0500 [standard time]
          • America/New_York -> -0400 [daylight time]
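The difference between the two offset-based formats can be illustrated with a short sketch. The function names, the `gmt_format` template and the `digits` parameter are assumptions made for this illustration; in CLDR the GMT pattern and the digits come from locale data, which is not modelled here.

```python
from datetime import timedelta


def _split(offset: timedelta):
    """Return (sign, hours, minutes) for a GMT offset, dropping any seconds."""
    seconds = int(offset.total_seconds())
    sign = "-" if seconds < 0 else "+"
    hours, minutes = divmod(abs(seconds) // 60, 60)
    return sign, hours, minutes


def rfc822_offset(offset: timedelta) -> str:
    """Locale-insensitive form: sign plus hhmm, always in ASCII digits."""
    sign, hours, minutes = _split(offset)
    return f"{sign}{hours:02d}{minutes:02d}"


def localized_gmt(offset: timedelta, gmt_format: str = "GMT{0}",
                  digits: str = "0123456789") -> str:
    """Locale-sensitive form: a GMT pattern around hh:mm, in local digits.

    gmt_format and digits are hypothetical stand-ins for locale data;
    digits must contain exactly ten characters, for 0 through 9.
    """
    sign, hours, minutes = _split(offset)
    text = f"{sign}{hours:02d}:{minutes:02d}"
    return gmt_format.format(text.translate(str.maketrans("0123456789", digits)))
```

For America/New_York in standard time, `rfc822_offset(timedelta(hours=-5))` gives "-0500" and `localized_gmt(timedelta(hours=-5))` gives "GMT-05:00". Passing a different ten-character `digits` string swaps in another digit set, which is the kind of variation the RFC 822 format deliberately rules out.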

CLDR Metazone
  • A metazone is a grouping of one or more internal zones that share common non-location display names
  • The following zones are currently associated with the metazone America_Eastern (CLDR 1.6.1):
    America/Nassau, America/Resolute, America/Coral_Harbour, America/Thunder_Bay, America/Nipigon, America/Toronto, America/Montreal, America/Iqaluit, America/Pangnirtung, America/Port-au-Prince, America/Jamaica, America/Cayman, America/Panama, America/Grand_Turk, America/Indiana/Vincennes, America/Indiana/Petersburg, America/Indiana/Marengo, America/Indiana/Winamac, America/Indianapolis, America/Louisville, America/Indiana/Vevay, America/Kentucky/Monticello, America/Detroit, America/New_York
  • Each metazone has a set of localizable names
  • The following names are used for the metazone America_Eastern (CLDR 1.6.1):

  Locale   long                                                                    short
           generic         standard                daylight                  generic  standard  daylight
  en       Eastern Time    Eastern Standard Time   Eastern Daylight Time     ET       EST       EDT
  fr       Heure de l'Est  Heure normale de l'Est  Heure avancée de l'Est    HE       HNE       HAE
  zh_Hans                                                                    ET       EST       EDT

Time Zone Short Abbreviation Problem
  • Abbreviations of 2 to 4 ASCII letters are used for short names, such as ET, EST, PDT
  • The extent to which time zone abbreviations are understood varies heavily by region
      • For example, how many people in the US recognize EAT (East Africa Time)?
  • CLDR's solution: a boolean value, commonlyUsed, associated with a zone/metazone to enable/disable short abbreviations
      • The metazone Africa_Eastern has the short standard name EAT for English locales
      • For the metazone Africa_Eastern:
          • commonlyUsed = true in en_ZA [English (South Africa)]
          • commonlyUsed = false in en_US [English (United States)]

Ambiguous Time with Generic format
  • Daylight -> Standard transition
      • Sunday, November 2, 2008 01:30:00 Pacific Time?
      • Valid, but happens twice
      • The generic format cannot distinguish between 1:30 PST and 1:30 PDT
  • Standard -> Daylight transition
      • Sunday, March 9, 2008 02:30:00 Pacific Time?
      • Invalid!
      • 30 minutes 1 second after 01:59:59? Or 30 minutes before 03:00:00?

CLDR Time Zone Formatting Patterns (1)

  Letter z, width 1-3
      • Formats: Specific non-location short format (commonlyUsed = true); Localized GMT format
      • Examples: PST, PDT, GMT-08:00
      • Roundtrip time: yes
      • Roundtrip canonical zone: no

  Letter z, width 4
      • Formats: Specific non-location long format; Localized GMT format
      • Examples: Pacific Standard Time, Pacific Daylight Time, GMT-08:00
      • Roundtrip time: yes
      • Roundtrip canonical zone: no

  Letter Z, width 1-3
      • Formats: RFC 822 format
      • Examples: -0800
      • Roundtrip time: yes
      • Roundtrip canonical zone: no

  Letter Z, width 4
      • Formats: Localized GMT format
      • Examples: GMT-08:00, -0800
      • Roundtrip time: yes
      • Roundtrip canonical zone: no

CLDR Time Zone Formatting Patterns (2)

  Letter v, width 1
      • Formats: Generic non-location short format (commonlyUsed = true); Generic partial location short format (commonlyUsed = true); Localized GMT format
      • Examples: PT, PT (Canada), PT (Yellowknife), GMT-08:00
      • Roundtrip time: no (at transition)
      • Roundtrip canonical zone: no

  Letter v, width 4
      • Formats: Generic non-location long format; Generic partial location long format; Localized GMT format
      • Examples: Pacific Time, Pacific Time (Canada), Pacific Time (Yellowknife), GMT-08:00
      • Roundtrip time: no (at transition)
      • Roundtrip canonical zone: no

  Letter V, width 1
      • Formats: Specific non-location short format; Localized GMT format
      • Examples: PST, PDT, GMT-08:00
      • Roundtrip time: yes
      • Roundtrip canonical zone: no

  Letter V, width 4
      • Formats: Generic location format; Localized GMT format (only for GMT-style time zones such as Etc/GMT+8)
      • Examples: Italy Time, United States (Los Angeles) Time, GMT-08:00
      • Roundtrip time: no (at transition)
      • Roundtrip canonical zone: yes

Tips for Processing Date/Time with Time Zone
  • For serializing future date/time data in text format, use RFC 822 format with zone ID
      • Time zone rules can change
      • GMT offset information along with the zone ID is sufficient to fix up the data
  • The result of java.util.Date#toString() might be ambiguous
      • CST is used for both America/Chicago and Asia/Shanghai in Java
      • CLDR does not use the same name for multiple time zones/metazones
  • Many zones in the tz database use LMT (Local Mean Time) as the initial offset
      • LMT is calculated from the longitude, and its GMT offset has a fraction of a minute
      • ISO 8601 / RFC 822 / Java GMT formats do not have a seconds field, so such offsets may not roundtrip
  • Minimize the dependencies on Windows time zones in multi-platform applications
      • Some Windows time zones are not well maintained
      • No historic time zone rule support before Vista/2008 Server
      • Mapping between Windows time zones and the tz database is 1-to-n
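The first and third tips above can be made concrete with a small sketch. The record layout (local time, RFC 822 offset, zone ID in square brackets) and all function names are assumptions made for this illustration, not a format defined by CLDR or ICU. The offset under the current rules is passed in by the caller, since in practice it would come from a time zone library.

```python
from datetime import datetime, timedelta


def format_offset(offset: timedelta) -> str:
    """RFC 822 style offset (sign plus hhmm); seconds are dropped."""
    seconds = int(offset.total_seconds())
    sign = "-" if seconds < 0 else "+"
    hours, minutes = divmod(abs(seconds) // 60, 60)
    return f"{sign}{hours:02d}{minutes:02d}"


def parse_offset(text: str) -> timedelta:
    sign = -1 if text[0] == "-" else 1
    return sign * timedelta(hours=int(text[1:3]), minutes=int(text[3:5]))


def serialize(local: datetime, offset: timedelta, zone_id: str) -> str:
    """Hypothetical record layout: local time, offset, then [zone ID]."""
    return f"{local:%Y-%m-%dT%H:%M:%S}{format_offset(offset)}[{zone_id}]"


def deserialize(text: str):
    stamp, _, rest = text.partition("[")
    local = datetime.strptime(stamp[:-5], "%Y-%m-%dT%H:%M:%S")
    return local, parse_offset(stamp[-5:]), rest.rstrip("]")


def to_utc(local: datetime, stored_offset: timedelta,
           current_offset: timedelta, keep_wall_time: bool = True) -> datetime:
    """Fix up a record after a rule change.

    keep_wall_time=True keeps the local clock reading (typical for meetings);
    False keeps the originally computed instant.
    """
    offset = current_offset if keep_wall_time else stored_offset
    return local - offset
```

As a usage example, take a 09:00 meeting on March 20, 2007 in America/Los_Angeles that was saved as "2007-03-20T09:00:00-0800[America/Los_Angeles]" before the US daylight time rules changed. Under the new rules the offset on that date is -0700, so `to_utc(local, stored, timedelta(hours=-7))` keeps the 09:00 wall time and returns 16:00 UTC instead of the stale 17:00. The seconds problem from the LMT tip also shows up here: the America/New_York LMT offset of -4:56:02 serializes as "-0456" and comes back two seconds off.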

Links
  • Unicode CLDR project - http://www.unicode.org/cldr/
  • UTS#35 Unicode Locale Data Markup Language (LDML) - http://www.unicode.org/reports/tr35/
  • ICU Project - http://icu-project.org/
  • tz database - http://www.twinsun.com/tz/tz-link.htm
