Update (November 12th 2011): Read Jeremy Keith's reply to this article, in which he argues strongly for the importance of pursuing semantic value and addresses issues raised in the article as well as in the comments here on Smashing Magazine.

Disclaimer: This article is published in the Opinion column section in which we provide active members of the community with the opportunity to share their thoughts and ideas publicly. Do you agree with the author? Please leave a comment. And if you disagree, would you like to write a rebuttal or counter piece? Leave a comment, too, and we will get back to you! Thank you.

Meta-utopia is a world of reliable meta data. When poisoning the well confers benefits to the poisoners, the meta-waters get awfully toxic in short order.

– Cory Doctorow

Allow me to paint a picture:

  1. You are busy creating a website.
  2. You have a thought, “Oh, now I have to add an element.”
  3. Then another thought, “I feel so guilty adding a div. Div-itis is terrible, I hear.”
  4. Then, “I should use something else. The aside element might be appropriate.”
  5. Three searches and five articles later, you’re fairly confident that
    aside is not semantically correct.
  6. You decide on article, because at least it’s not a div.
  7. You’ve wasted 40 minutes, with no tangible benefit to show for it.

This Just Straight Up Sucks

This is not the first time this topic has been broached. In 2004, Andy Budd wrote on semantic purity versus semantic realism.

If your biggest problem with HTML5 is the distinction between an aside and a blockquote or the right way to mark up addresses, then you are not using HTML5 the way it was intended.

Mark-up structures content, but your choice of tags matters a lot less than we’ve been taught for a while. Let’s go through some of the reasons why.

The Web No Longer Consists Of Structured Content

In the golden days of the Web, Web pages were supposed to be repositories of information and meaning, nothing more. Today, the Web has content, but meaning is derived from users’ interactions with it.

XML, RDFa, Dublin Core and other structured specifications have very solid use cases, but those use cases do not account for the majority of interactions on the Web. Heck, no website really has the purity of semantic mark-up that such specifications demand. Mark Pilgrim writes about this much better than I do.

If you have content that demands semantic purity, such as a library database, a document that needs a table of contents, or an online book (i.e. anything for which semantic purity makes sense), then by all means stick to the HTML5 outlining algorithm, and split hairs on which element should be an article and which a section. Keep in mind, though, that no customer-facing tool takes advantage of this algorithm to produce a table of contents, and no browser seems to exploit it either.
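For content that genuinely needs a table of contents, a rough one can be built from heading levels alone. The sketch below is a simplification under stated assumptions: it reads only h1 to h6 levels with Python's standard html.parser, it does not implement the HTML5 outlining algorithm, and the BOOK snippet and the names HeadingCollector and table_of_contents are made up for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for every h1-h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading being read, if any
        self._buffer = []    # text fragments of that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            text = "".join(self._buffer).strip()
            self.headings.append((self._level, text))
            self._level = None


def table_of_contents(html):
    """Returns one indented line per heading, two spaces per level."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.headings]


# Hypothetical document: section and article wrappers are ignored on purpose.
BOOK = """
<h1>Field Guide</h1>
<section><h2>Birds</h2><article><h3>Owls</h3></article></section>
<section><h2>Trees</h2></section>
"""

toc = table_of_contents(BOOK)
print("\n".join(toc))
```

Note that the section and article wrappers play no part in the result; swapping them for div elements would give the same list.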

Is It Really Accessible?

If accessibility is your reason for using semantic mark-up, then understand that accessibility and semantic mark-up have very little correlation, due to the massive abuse of HTML mark-up on the Web. (I would love to link to Mark Pilgrim’s post on this, but it is dead, so this will have to do.)

The b, strong, i and em tags are equivalent to the span tag as far as the specification is concerned. And so are some of HTML5’s tags.

As stated on HTML5 Accessibility, almost every new HTML5 element currently provides to assistive technology only as much semantic information as a div element. So, if you thought that using HTML5 elements would make your website more accessible, think again. (How much additional information do <figure> and <figcaption> bring? None.)

The recent debate (or debacle?) on the <time> element is just more proof of the impermanence of the semantic meanings associated with elements.

Is It Really Searchable?

If SEO is your grand purpose for using semantic mark-up, then know that most search engines do not give more credence to a page just because of its mark-up. The only thing recommended in this SEO guide from Google is to use relevant headings and anchor links (other search engines work similarly). Your use of HTML5 elements or of strong or span tags will not affect how your content is read by them.

There is another way to provide rich data to search engines, and that is via microdata. In no way does this make your website rank better on search engines; it simply adds value to the search result when a relevant one is found for your website.
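To make the idea concrete, here is a minimal sketch of what microdata looks like and how a consumer might read it. The person, the SNIPPET mark-up and the MicrodataExtractor class are hypothetical; the parser uses Python's standard html.parser and handles only flat itemprop values, which is far less than a real search engine would do.

```python
from html.parser import HTMLParser

# Hypothetical snippet: a person described with schema.org microdata.
SNIPPET = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>,
  <span itemprop="jobTitle">Front-end developer</span>
</div>
"""


class MicrodataExtractor(HTMLParser):
    """Collects the text of elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current_prop = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.item_type = attrs["itemtype"]
        if "itemprop" in attrs:
            self._current_prop = attrs["itemprop"]
            self.properties[self._current_prop] = ""

    def handle_data(self, data):
        if self._current_prop is not None:
            self.properties[self._current_prop] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current_prop = None


def extract_microdata(html):
    """Returns (item type, {property name: text}) for a flat item."""
    parser = MicrodataExtractor()
    parser.feed(html)
    parser.close()
    props = {name: text.strip() for name, text in parser.properties.items()}
    return parser.item_type, props


item_type, props = extract_microdata(SNIPPET)
print(item_type)
print(props)
```

The attributes sit on plain div and span elements; the extra data comes from itemscope, itemtype and itemprop, not from the choice of tag.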

Is It Really Portable?

Another much-touted advantage of the semantic Web is data portability. Miraculously, all devices are supposed to understand the semantic mark-up used everywhere and be able to parse the information therein with no effort. Aryeh Gregor puts that myth to rest:

… +Manu Sporny said that semantic Web people had received feedback that out-of-band data was harder to keep in sync with content. I can attest that in MediaWiki’s case this isn’t true, though… The only times I can see where you’d want to use RDFa or microdata instead of separate RDF is if either you don’t have good enough page-generation tools, or you want the metadata to be consumed by specific known clients that only support inline metadata (e.g. search engines supporting schema.org or such). If the page is being processed by a script anyway, and if the script author has ready access to server-side tools that can extract the metadata into a separate RDF stream, then it’s normally going to be just as easy to publish as a separate stream as to publish inline. And it saves a lot of bloat on every page view.

What Now, Then?

  • There is no harm in using div elements; you can continue using them instead of section and article. I believe we should use the new elements to make our mark-up readable, not for any inherent semantic advantage. If you want to use HTML5 section and article tags to enhance some particular textual documentation for a future document reader, do it.
  • Tools exist today that take advantage of the nav, header and footer elements. NVDA now assigns implied semantics to these elements. The elements are straightforward to understand and use.
  • There is good support for ARIA landmarks in screen readers, but be careful when using them with HTML5 elements.
  • HTML5 has a host of new features. Learn about them, use them, give feedback. Make these features more robust and stable. Yes, most of these features require you to understand and write JavaScript in order to create a richer experience for your audience. If that task sounds formidable to you, then start learning how to code, particularly JavaScript.
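One way to act on the advice about nav, header, footer and ARIA landmarks is to list where a page relies on each, so that the combinations can be checked in a real screen reader. The sketch below is only an illustration: the IMPLIED_ROLES mapping is an assumption about the landmark roles these elements commonly imply (header and footer only at page level), and the PAGE snippet and the LandmarkAudit class are hypothetical.

```python
from html.parser import HTMLParser

# Assumed mapping from HTML5 elements to the landmark roles they imply.
IMPLIED_ROLES = {
    "nav": "navigation",
    "header": "banner",
    "footer": "contentinfo",
}


class LandmarkAudit(HTMLParser):
    """Records how each landmark on a page is expressed."""

    def __init__(self):
        super().__init__()
        self.findings = []  # (tag, role, note) tuples in document order

    def handle_starttag(self, tag, attrs):
        role = dict(attrs).get("role")
        if tag in IMPLIED_ROLES and role is not None:
            # The case to test by hand in assistive technology.
            self.findings.append((tag, role, "element and explicit role combined"))
        elif tag in IMPLIED_ROLES:
            self.findings.append((tag, IMPLIED_ROLES[tag], "role implied by element"))
        elif role in IMPLIED_ROLES.values():
            self.findings.append((tag, role, "explicit role on generic element"))


def audit_landmarks(html):
    audit = LandmarkAudit()
    audit.feed(html)
    audit.close()
    return audit.findings


# Hypothetical page mixing the three styles.
PAGE = """
<header role="banner"><h1>Site</h1></header>
<nav><a href="/">Home</a></nav>
<div role="contentinfo">Footer text</div>
"""

findings = audit_landmarks(PAGE)
for tag, role, note in findings:
    print(f"<{tag}> {role}: {note}")
```

The script only reports what it finds; whether a given combination behaves well still has to be verified in the screen readers you care about.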


© Divya Manian for Smashing Magazine, 2011.