
172 Wonky Dating

Author: Dr. Heinz M. Kabutz
Date: 2009-04-23
Java Version: 5
Category: Tips and Tricks
 

Abstract: DateFormat can produce some seemingly unpredictable results, such as parsing the date 2009-01-28-09:11:12 as "Sun Nov 30 22:07:51 CET 2008". In this newsletter we examine why and also show how DateFormat reacts to concurrent access.

 

Welcome to the 172nd issue of The Java(tm) Specialists' Newsletter. One of my pet peeves is being asked to predict the future of Java. As a Java Champion, I am expected to have a better idea than the average person. The truth is that I do not have a clue what will happen to Java or any other technology. When cellular telephones were first invented, I dismissed them as something that would never become successful: far too expensive, and besides, who would want their boss to be able to contact them 24x7? I could not even predict the amazing popularity of my Java Specialist Master Course. My Design Patterns for Delphi course, which I was sure would fly, did not sell a single seat. Recently my Toronto buddy Jean suggested I read the book The Black Swan [ISBN 1400063515], which explains these outliers very nicely and at long last vindicates my "I don't know" answer about the future. It also explains that experts in a field, especially those with a reputation to protect, are notoriously bad at predicting the future, as they are too conservative to expect the unexpected. In future, when someone asks me what will happen to Java in the next 5 years, I will take a wild guess and say that Java won't exist in 5 years' time.

Wonky Dating

A few weeks ago, one of my newsletter readers sent me the following code:

    DateFormat df = new SimpleDateFormat("yyyyMMddHHmmss");
    Date d = df.parse("2009-01-28-09:11:12");
    System.err.println(d);

Since the date format was different to the incoming text, she was getting the rather strange result of "Sun Nov 30 22:07:51 CET 2008".

The SimpleDateFormat is by default lenient and tries to fit our dates into the format as best it can. Whilst doing that, it might cause some strange effects. Here is how I think it gets interpreted:

    yyyyMMddHHmmss
    2009-01-28-09:11:12

    year = 2009
    month = -0
    day = 1
    hour = -2
    minute = 8
    second = -09

The year is easy, just 2009. In our interpretation of month, there is no such month as 0. January would be 1. So it would be one month before January, in other words December 2008. The day is the 1st. Next comes the hour, which I would have imagined should have been set to 28, but was read as -2. Perhaps due to the confusing yyyyMMdd start, the time was offset by one character. Since hour is -2, minute is set to 8 and second to -09. If we subtract 2 hours from 1st Dec 2008, we come to 22:00:00 on the 30th Nov 2008. We then add 8 minutes and subtract 9 seconds, thus having 7 minutes and 51 seconds. The end result is 30th Nov 2008 22:07:51.

Similarly, when we have as input "2009-12-31-00:00:00", it will be parsed as:

    yyyyMMddHHmmss
    2009-12-31-00:00:00

    year = 2009
    month = -1
    day = 2
    hour = -3
    minute = 1
    second = 0

Thus the year is 2009 and the month is -1. Since month 0 already meant December 2008, month -1 takes us one month further back, to November 2008. The day is the 2nd, but the hour is -3, which puts us on the 1st of November 2008 at 21:00:00. The minutes are set to 1 and the seconds to 0, so we get the completely incorrect (by more than 12 months) answer of Sat Nov 01 21:01:00 CET 2008.
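The roll-over arithmetic in both examples can be checked against a lenient Calendar. The following is only a minimal sketch and not the code that SimpleDateFormat itself runs: the class name LenientRollover and the helper normalise() are made up for illustration, and the time zone is fixed to UTC (an assumption) so that the printed fields do not depend on where the code runs.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class LenientRollover {
  // Hypothetical helper: feeds raw (possibly out-of-range) field values
  // into a lenient calendar and returns the normalised result as
  // "yyyy-MM-dd HH:mm:ss". The month is 1-based, as in the date pattern,
  // so 0 means "the month before January".
  static String normalise(int year, int month, int day,
                          int hour, int minute, int second) {
    Calendar cal = new GregorianCalendar(
        TimeZone.getTimeZone("UTC"), Locale.ROOT); // UTC is an assumption
    cal.clear();
    cal.setLenient(true); // the default, stated here for emphasis
    cal.set(Calendar.YEAR, year);
    cal.set(Calendar.MONTH, month - 1); // Calendar months are 0-based
    cal.set(Calendar.DAY_OF_MONTH, day);
    cal.set(Calendar.HOUR_OF_DAY, hour);
    cal.set(Calendar.MINUTE, minute);
    cal.set(Calendar.SECOND, second);
    return String.format(Locale.ROOT, "%04d-%02d-%02d %02d:%02d:%02d",
        cal.get(Calendar.YEAR), cal.get(Calendar.MONTH) + 1,
        cal.get(Calendar.DAY_OF_MONTH), cal.get(Calendar.HOUR_OF_DAY),
        cal.get(Calendar.MINUTE), cal.get(Calendar.SECOND));
  }

  public static void main(String[] args) {
    // Field values as interpreted for "2009-01-28-09:11:12"
    System.out.println(normalise(2009, 0, 1, -2, 8, -9));
    // Field values as interpreted for "2009-12-31-00:00:00"
    System.out.println(normalise(2009, -1, 2, -3, 1, 0));
  }
}
```

Running main() prints 2008-11-30 22:07:51 and 2008-11-01 21:01:00, the same date and time fields as the two strange results above.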

We would not have had this problem if we had specified the DateFormat to be strict, with df.setLenient(false). In that case, we would have immediately seen the mistake, rather than have a date that is completely off.
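As a rough sketch of what strict parsing buys us, the class below tries the reader's input against both the mismatched pattern and a pattern that actually fits the text. The class name StrictParsing, the helper parsesStrictly() and the pattern "yyyy-MM-dd-HH:mm:ss" are my own choices for illustration and not part of the reader's code.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class StrictParsing {
  // Hypothetical helper: returns true if the text can be parsed with the
  // given pattern once leniency has been switched off.
  static boolean parsesStrictly(String pattern, String text) {
    DateFormat df = new SimpleDateFormat(pattern);
    df.setLenient(false); // reject out-of-range fields instead of rolling them over
    try {
      df.parse(text);
      return true;
    } catch (ParseException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String text = "2009-01-28-09:11:12";
    // Mismatched pattern from the reader's code: rejected in strict mode
    System.out.println(parsesStrictly("yyyyMMddHHmmss", text));
    // Pattern that matches the separators in the text: accepted
    System.out.println(parsesStrictly("yyyy-MM-dd-HH:mm:ss", text));
  }
}
```

With the mismatched pattern the strict formatter throws a ParseException, so the mistake shows up at the point of parsing rather than as a wrong date much later.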

Concurrent Dating

Another issue with DateFormat is that it is not thread safe. Since DateFormat is an expensive object to create, you might want to keep a copy available in a static final field. That means, however, that you can only use it from a single thread at a time, otherwise the results are unpredictable.

Take for example the DateConverter class:

import java.text.*;
import java.util.Date;

public class DateConverter {
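  // Shared by every thread. DateFormat is not thread safe, so
  // concurrent calls to parse() and format() can corrupt its state.
  private static final DateFormat df =
      new SimpleDateFormat("yyyy/MM/dd");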

  public void testConvert(String date) {
    try {
      Date d = df.parse(date);
      String newDate = df.format(d);
      if (!date.equals(newDate)) {
        System.out.println(date + " converted to " + newDate);
      }
    } catch (Exception e) {
      System.out.println(e);
    }
  }
}

When we call the testConvert() method, we would expect date to always equal newDate. However, I managed to get rather strange results in conversion, such as:

    1971/12/04 converted to 0000/09/-730498
    1971/12/04 converted to 100083/09/02
    1971/12/04 converted to 19711971/12/04
    2001/09/02 converted to 1971/02/04
    2001/09/02 converted to 1977/04/23

In other words, the results bore no relation to the values we passed in. In production, the probability of calling the format() or parse() methods concurrently might be low, so you would only rarely see such mangled dates. However, that is what makes these "black swans" [ISBN 1400063515] even more dangerous, since the values are completely different to what you expected. Imagine trying to work out the interest due on a loan, based on the starting date parsed as "0000/09/-730498". Here is my test code:

import java.text.*;
import java.util.Date;
import java.util.concurrent.*;

public class DateConverterTest {
  public static void main(String[] args) {
    ExecutorService pool = Executors.newCachedThreadPool();
    convert(pool, "1971/12/04");
    convert(pool, "2001/09/02");
  }

  private static void convert(ExecutorService pool, final String date) {
    pool.submit(new Runnable() {
      public void run() {
        DateConverter dc = new DateConverter();
        while (true) {
          dc.testConvert(date);
        }
      }
    });
  }
}

We can fix the problem of concurrent access to the DateFormat either by synchronizing the testConvert() method or by having a separate DateFormat instance for each thread. Synchronizing introduces contention, so that is probably not the best approach. Instead, we should rather create a ThreadLocal that gives each thread its own DateFormat instance. With ThreadLocal, we want to set the value the first time the thread requests it and then simply use that in future. The easiest way to do that is by overriding the initialValue() method, like so:

import java.text.*;
import java.util.Date;

public class DateConverter {
  // One formatter per thread, created the first time that thread calls get()
  private static final ThreadLocal<DateFormat> tl =
      new ThreadLocal<DateFormat>() {
        @Override
        protected DateFormat initialValue() {
          return new SimpleDateFormat("yyyy/MM/dd");
        }
      };

  public void testConvert(String date) {
    try {
      DateFormat formatter = tl.get();
      Date d = formatter.parse(date);
      String newDate = formatter.format(d);
      if (!date.equals(newDate)) {
        System.out.println(date + " converted to " + newDate);
      }
    } catch (Exception e) {
      System.out.println(e);
    }
  }
}

As long as the thread is alive, this thread local would stay set, even if the thread never used the DateFormat again. We could instead use a SoftReference as the value for the ThreadLocal, so that the garbage collector can reclaim the DateFormat when memory runs low:

import java.lang.ref.SoftReference;
import java.text.*;
import java.util.Date;

public class DateConverter {
  private static final ThreadLocal<SoftReference<DateFormat>> tl
      = new ThreadLocal<SoftReference<DateFormat>>();

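  // Returns this thread's formatter, creating a new one if none exists
  // yet or if the garbage collector has cleared the soft reference.
  private static DateFormat getDateFormat() {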
    SoftReference<DateFormat> ref = tl.get();
    if (ref != null) {
      DateFormat result = ref.get();
      if (result != null) {
        return result;
      }
    }
    DateFormat result = new SimpleDateFormat("yyyy/MM/dd");
    ref = new SoftReference<DateFormat>(result);
    tl.set(ref);
    return result;
  }

  public void testConvert(String date) {
    try {
      DateFormat formatter = getDateFormat();
      Date d = formatter.parse(date);
      String newDate = formatter.format(d);
      if (!date.equals(newDate)) {
        System.out.println(date + " converted to " + newDate);
      }
    } catch (Exception e) {
      System.out.println(e);
    }
  }
}

Now we can use the testConvert() method from as many threads as we want, without any fear of race conditions on the format() or parse() methods.
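To convince ourselves, here is a bounded variation of the test that stops after a fixed number of round trips and reports how many dates came back mangled. It is only a sketch under my own assumptions: the class name BoundedRoundTripCheck, the helpers roundTrips() and countMismatches(), and the iteration count are invented for illustration, and one thread is started per input date.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedRoundTripCheck {
  // One formatter per thread, as in the ThreadLocal version of DateConverter
  private static final ThreadLocal<DateFormat> FORMAT =
      new ThreadLocal<DateFormat>() {
        @Override
        protected DateFormat initialValue() {
          return new SimpleDateFormat("yyyy/MM/dd");
        }
      };

  // Parses and re-formats the text; true if it survives unchanged.
  static boolean roundTrips(String date) {
    try {
      DateFormat df = FORMAT.get();
      return date.equals(df.format(df.parse(date)));
    } catch (ParseException e) {
      return false;
    }
  }

  // Runs the given number of round trips for each date on its own thread
  // and returns the total number of mismatches seen across all threads.
  static int countMismatches(int iterations, String... dates) {
    ExecutorService pool = Executors.newFixedThreadPool(dates.length);
    try {
      List<Future<Integer>> results = new ArrayList<Future<Integer>>();
      for (String date : dates) {
        Callable<Integer> task = () -> {
          int bad = 0;
          for (int i = 0; i < iterations; i++) {
            if (!roundTrips(date)) bad++;
          }
          return bad;
        };
        results.add(pool.submit(task));
      }
      int total = 0;
      for (Future<Integer> result : results) {
        total += result.get();
      }
      return total;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(
        countMismatches(100000, "1971/12/04", "2001/09/02") + " mismatches");
  }
}
```

Since no formatter is ever shared between threads, the count should be 0 mismatches regardless of how the threads are scheduled.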

Kind regards

Heinz

 
