HTML5 Spec Calls For Altering Your Data

While reading the jQuery 1.6 release announcement I came across this gem:

jQuery 1.5 introduced a feature in the .data() method to automatically import any data- attributes that were set on the element and convert them to JavaScript values using JSON semantics. In jQuery 1.6 we have updated this feature to match the W3C HTML5 spec with regards to camel-casing data attributes that have embedded dashes. So for example in jQuery 1.5.2, an attribute of data-max-value="15" would create a data object of { max-value: 15 } but as of jQuery 1.6 it sets { maxValue: 15 }.

The relevant portion of the W3C HTML5 spec is http://www.w3.org/TR/html5/elements.html#embedding-custom-non-visible-data-with-the-data-attributes and the specific wording is:

element . dataset
Returns a DOMStringMap object for the element’s data-* attributes.

Hyphenated names become camel-cased. For example, data-foo-bar="" becomes element.dataset.fooBar.
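
To make the rule concrete, here is a minimal sketch of the name conversion as I understand it from the spec: drop the data- prefix, then replace each dash that is followed by a lowercase ASCII letter with that letter in uppercase. The function name attributeToDatasetKey is my own invention for illustration; it is not part of the spec, the DOM, or jQuery.

```javascript
// Hypothetical helper (not a real DOM or jQuery API): models how a
// data-* attribute name maps to a key on element.dataset.
function attributeToDatasetKey(attrName) {
  if (!attrName.startsWith("data-")) {
    throw new Error("not a data-* attribute: " + attrName);
  }
  // Remove the "data-" prefix, then turn "-x" into "X" for lowercase ASCII x.
  return attrName
    .slice("data-".length)
    .replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

console.log(attributeToDatasetKey("data-foo-bar"));   // fooBar
console.log(attributeToDatasetKey("data-max-value")); // maxValue
```

The second call mirrors the jQuery example above: the attribute written as data-max-value ends up under the key maxValue.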

My feelings on this are simple: this is an absolutely horrible idea! Why would a spec direct implementations to alter the attribute names used in the HTML? The only result I can see is confusion and wasted time when code breaks because of this and the cause isn’t obvious.

To make this even more confusing, the data attribute section of the spec also mentions:

Note: All attributes on HTML elements in HTML documents get ASCII-lowercased automatically, so the restriction on ASCII uppercase letters doesn’t affect such documents.

So even if you wanted to camel-case your data attribute names yourself, to avoid the automatic conversion described in the spec, you can’t. An attribute written as data-CamelCase would become data-camelcase.
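
A rough sketch of how the two rules stack up, assuming the lowercasing happens first (when the HTML is parsed) and the dash conversion happens afterwards (when the name is exposed through dataset). Both function names are hypothetical, and this only models the behavior in plain JavaScript; it does not touch a real DOM.

```javascript
// Step 1 (assumed model of the HTML parser): ASCII-lowercase the name.
function asciiLowercase(name) {
  return name.replace(/[A-Z]/g, (ch) => ch.toLowerCase());
}

// Step 2: apply the spec's dash-to-camelCase rule to the lowercased name.
// Assumes the name as written starts with "data-".
function datasetKeyAsWritten(writtenName) {
  const parsed = asciiLowercase(writtenName);
  return parsed
    .slice("data-".length)
    .replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

console.log(asciiLowercase("data-CamelCase"));      // data-camelcase
console.log(datasetKeyAsWritten("data-CamelCase")); // camelcase
console.log(datasetKeyAsWritten("data-Max-Value")); // maxValue
```

Under these assumptions, capital letters in the markup never survive, and the only way to get a capital letter in the resulting key is to put a dash in the attribute name.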

Where does all this leave the average web dev? Well, I think the easiest way out is to never, ever, ever use a name for a data attribute that the spec indicates will be altered. This means no dashes anywhere in the dev-controlled section of data attribute names. That isn’t very pretty, since the attribute is required to start with data-, but it is the safest route.

In practice this will result in either names with no breaks or using underscores where you would normally use dashes. “Safe” examples would be:

  • data-foobar
  • data-foo_bar
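
If you want to check a name against this rule, something like the following sketch would do. The helper survivesUnchanged is hypothetical, and it assumes the name is already lowercase and starts with data-; it simply applies the conversion described in the spec and reports whether the part after data- comes out the same.

```javascript
// Hypothetical check: does the part after "data-" come through the
// spec's dash-to-camelCase conversion unchanged?
// Assumes attrName is already lowercase and starts with "data-".
function survivesUnchanged(attrName) {
  const suffix = attrName.slice("data-".length);
  const key = suffix.replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
  return key === suffix;
}

console.log(survivesUnchanged("data-foobar"));  // true
console.log(survivesUnchanged("data-foo_bar")); // true
console.log(survivesUnchanged("data-foo-bar")); // false
```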

I’m really disappointed to see attribute name changes be part of the HTML5 spec. I’d be really curious to know what the justification was for this move.