
HTML5 Spec Calls For Altering Your Data

While reading the jQuery 1.6 release announcement I came across this gem:

jQuery 1.5 introduced a feature in the .data() method to automatically import any data- attributes that were set on the element and convert them to JavaScript values using JSON semantics. In jQuery 1.6 we have updated this feature to match the W3C HTML5 spec with regards to camel-casing data attributes that have embedded dashes. So for example in jQuery 1.5.2, an attribute of data-max-value="15" would create a data object of { max-value: 15 } but as of jQuery 1.6 it sets { maxValue: 15 }.

The portion of the W3C HTML5 spec mentioned is http://www.w3.org/TR/html5/elements.html#embedding-custom-non-visible-data-with-the-data-attributes – and the specific wording in this regard is:

element . dataset
Returns a DOMStringMap object for the element’s data-* attributes.

Hyphenated names become camel-cased. For example, data-foo-bar="" becomes element.dataset.fooBar.
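
As a rough sketch of the conversion that wording describes: drop the data- prefix, then turn each dash followed by a lowercase letter into the uppercase letter. The function name toDatasetKey is my own invention for illustration, not a browser or jQuery API, and it assumes the attribute name is already lowercase.

```javascript
// Hypothetical helper: not a browser or jQuery API, just an illustration.
// Assumes attrName is already lowercase, as it would be in an HTML document.
function toDatasetKey(attrName) {
  const prefix = "data-";
  if (!attrName.startsWith(prefix)) {
    return null; // not a custom data attribute
  }
  // Drop the prefix, then turn each "-x" (dash + lowercase letter) into "X".
  return attrName
    .slice(prefix.length)
    .replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

console.log(toDatasetKey("data-foo-bar"));   // fooBar
console.log(toDatasetKey("data-max-value")); // maxValue
console.log(toDatasetKey("data-foo_bar"));   // foo_bar
```

Note that the underscore name comes out untouched, since only a dash followed by a letter triggers the rewrite.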

My feelings on this are simple: this is an absolutely horrible idea! Why would a spec tell implementations to alter the attribute names used in the HTML? The only result I can see is confusion and wasted time when code breaks and the cause isn’t obvious.

To make this even more confusing, the data attribute section of the spec also mentions:

Note: All attributes on HTML elements in HTML documents get ASCII-lowercased automatically, so the restriction on ASCII uppercase letters doesn’t affect such documents.

So even if you wanted to CamelCase your data attribute names, to avoid the auto conversion described in the spec, you can’t. Using data-CamelCase would become data-camelcase.
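
To see the two steps together, here is a small simulation that needs no browser. The name datasetKeyFromMarkup is hypothetical, and toLowerCase() is only an approximation of the spec's ASCII-lowercasing (the two differ for non-ASCII characters).

```javascript
// Hypothetical helper that mimics two steps: the HTML parser lowercasing the
// attribute name, then the dataset conversion of "-x" to "X".
// toLowerCase() approximates ASCII-lowercasing; fine for ASCII-only names.
function datasetKeyFromMarkup(nameAsTyped) {
  const stored = nameAsTyped.toLowerCase(); // the name the document keeps
  const key = stored
    .replace(/^data-/, "")
    .replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
  return { stored, key };
}

const camel = datasetKeyFromMarkup("data-CamelCase");
console.log(camel.stored); // data-camelcase
console.log(camel.key);    // camelcase

const dashed = datasetKeyFromMarkup("data-Max-Value");
console.log(dashed.stored); // data-max-value
console.log(dashed.key);    // maxValue
```

The capital letters typed in the markup are gone before the dash conversion ever runs, so the only way to end up with a camel-cased key is to start with dashes.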

Where does all this leave the average web dev? Well, I think the easiest way out is to never, ever, ever use a name for a data attribute that the spec indicates will be altered. This means no dashes anywhere in the developer-controlled section of data attribute names. That isn’t very pretty since the attribute is required to start with data-, but it is the safest route.

In practice this will result in either names with no breaks or using underscores where you would normally use dashes. “Safe” examples would be:

  • data-foobar
  • data-foo_bar
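
If you wanted to enforce that convention, a check along these lines could do it. This is only a sketch: isSafeDataName is a made-up name, and the pattern is deliberately more conservative than the spec requires (lowercase letters, digits and underscores only after the data- prefix).

```javascript
// Hypothetical lint-style check, stricter than the spec on purpose:
// after "data-" allow only lowercase letters, digits, and underscores,
// so the name reads the same in the HTML and as a data key.
function isSafeDataName(attrName) {
  return /^data-[a-z0-9_]+$/.test(attrName);
}

console.log(isSafeDataName("data-foobar"));    // true
console.log(isSafeDataName("data-foo_bar"));   // true
console.log(isSafeDataName("data-foo-bar"));   // false
console.log(isSafeDataName("data-CamelCase")); // false
```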

I’m really disappointed to see attribute name changes be part of the HTML5 spec. I’d be really curious to know what the justification was for this move.

7 replies on “HTML5 Spec Calls For Altering Your Data”

The HTML editors probably wanted to match the DOM spec for accessing CSS properties: http://www.w3.org/TR/DOM-Level-2-Style/css.html#CSS-extended

In jQuery, though, doing .data( ‘hyphenated-key’ ) or .data( ‘camelCaseKey’ ) works, but only for keys that were prepopulated from data-* attributes. Try this out, for example.

<script src="http://code.jquery.com/jquery-1.6.js"></script>
<script>
jQuery( function( $ ) {
    // Log a copy of the data object so later changes don't alter earlier output.
    console.debug( 'BEGIN', $.extend( {}, $( '#bob' ).data() ) );

    // Add one dashed key and one camelCase key via an object.
    $( '#bob' ).data( {
        "max-value": 15,
        minValue: 12
    } );

    console.debug( 'AFTER OBJECT ADD', $.extend( {}, $( '#bob' ).data() ) );

    // Add one dashed key and one camelCase key via string names.
    $( '#bob' ).data( 'another-max-value', 25 );
    $( '#bob' ).data( 'anotherMinValue', 22 );

    console.debug( 'AFTER STRING ADD', $.extend( {}, $( '#bob' ).data() ) );

    // Read every key back using its dashed form.
    console.debug( 'GET hello-there', $( '#bob' ).data( 'hello-there' ) );
    console.debug( 'GET max-value', $( '#bob' ).data( 'max-value' ) );
    console.debug( 'GET min-value', $( '#bob' ).data( 'min-value' ) );
    console.debug( 'GET another-max-value', $( '#bob' ).data( 'another-max-value' ) );
    console.debug( 'GET another-min-value', $( '#bob' ).data( 'another-min-value' ) );

    console.debug( 'AFTER DASH GETS', $.extend( {}, $( '#bob' ).data() ) );

    // Read every key back using its camelCase form.
    console.debug( 'GET helloThere', $( '#bob' ).data( 'helloThere' ) );
    console.debug( 'GET maxValue', $( '#bob' ).data( 'maxValue' ) );
    console.debug( 'GET minValue', $( '#bob' ).data( 'minValue' ) );
    console.debug( 'GET anotherMaxValue', $( '#bob' ).data( 'anotherMaxValue' ) );
    console.debug( 'GET anotherMinValue', $( '#bob' ).data( 'anotherMinValue' ) );

    console.debug( 'AFTER CAMEL GETS', $.extend( {}, $( '#bob' ).data() ) );
} );
</script>
<p id="bob" data-hello-there="hi">yo</p>

Unfortunately that makes things even more confusing. The new rules for figuring out this flow look like:

– data-* attributes in the original HTML get dashes converted to CamelCase
– data-* attributes added via jQuery afterwards with dashes do not get converted into CamelCase

The only thing worse than altering attribute names is doing it inconsistently.

Good post. Guess the only way to have multi-word, easy to read names that are consistent between the HTML and jQuery is to use underscores. It makes no sense that it ends up camel-cased, but can’t start out that way.

I totally agree, I thought the idea was ridiculous the moment I saw it. But at least we can still use underscores.
