So many people talk about the semantic web, valid code and other related stuff. I can’t imagine how I can add more than the usual 5 cents.
But I will try.
Every professional web developer knows about these topics, but sometimes something stops them from following semantic rules or writing valid markup. What is it? Its name is Mr. Style. Yes, every HTML coder knows about the additional div elements needed to wrap some blocks for fancy shadows or other eye-candy features.
It’s a long story that begins at HTML’s roots. HTML is based on SGML, a general markup standard. HTML was created as a content and style markup language at the same time. For example, an HTML tag like b is purely style-related. The main task of early HTML was not the meaning of the content (semantics), but rather styling and presentation (rendering).
In contrast to HTML, XML was invented as a purely semantic standard. Its main use is hierarchical data storage and retrieval. Its element naming is fully flexible and generalized, which allows one to create new markup languages on top of XML.
Another standard, CSS, brought new possibilities for styling both HTML and XML documents. With CSS, each tag/element can be styled in any manner. For example, ordinary HTML paragraph tags (p) can be styled to look like list items with a few lines of CSS code. With CSS, the presentational role of HTML becomes less important.
The next level of document markup was reached with XHTML. Yes, it seems like a natural evolution of HTML, with a focus on strict semantic markup, as in XML. It brings new possibilities for machine processing and data retrieval, but keeps backward compatibility with older browsers and other HTML processors.
Well, why can’t we use pure XML formats for semantic data exchange on the Web? RSS/Atom is the first widely used format on the Web that comes to my mind. OK, here are some points in favor of semantic (x)HTML:
- Semantic (x)HTML can be interpreted by web browsers (with or without corresponding styles the user can see rendered content).
- Semantic (x)HTML can be interpreted by machines for data retrieval.
- Semantic (x)HTML can be interpreted by screen readers and other assistive devices for people with disabilities.
- Semantic (x)HTML can mix different types of information such as news feeds, contact cards, calendars, etc. in a single document.
I want to write a little more about the last point. Microformats are one of the most famous and usable forms of semantic HTML. They are very useful, because lots of robots/tools can retrieve information (hAtom, hCard, hCalendar) from web pages. Many of the well-known big players of the Internet community (Google, Yahoo, etc.) have already integrated this standard into their document markup. It’s still not Internet-wide, but it already shows more and more use cases with new possibilities.
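To make the "robots can retrieve information" part more concrete, here is a minimal sketch of how a tool might pull an hCard out of a page. It uses only Python's standard html.parser module. The sample markup, the HCardParser class and the extract_hcards function are hypothetical names made up for this illustration, and the sketch assumes each property element contains plain text only. A real microformats parser handles many more properties and edge cases.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: a contact card marked up with hCard class names.
PAGE = """
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Corp</span>,
  <span class="tel">+1-555-0100</span>
</div>
"""

# Elements without a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HCardParser(HTMLParser):
    """Collects a few hCard properties from elements inside class="vcard"."""

    # Small subset of hCard properties, chosen for this sketch.
    PROPERTIES = ("fn", "org", "tel")

    def __init__(self):
        super().__init__()
        self.cards = []    # one dict per finished vcard
        self._card = None  # the vcard currently being read, if any
        self._depth = 0    # nesting depth inside the current vcard
        self._prop = None  # property whose text is being collected

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._card is None:
            if "vcard" in classes:
                self._card, self._depth = {}, 1
            return
        self._depth += 1
        for name in self.PROPERTIES:
            if name in classes:
                self._prop = name
                break

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._card is None:
            return
        self._depth -= 1
        self._prop = None
        if self._depth == 0:  # the vcard element itself was closed
            self.cards.append(self._card)
            self._card = None

    def handle_data(self, data):
        if self._card is not None and self._prop:
            previous = self._card.get(self._prop, "")
            self._card[self._prop] = previous + data.strip()


def extract_hcards(html):
    """Return a list of dicts, one per vcard found in the given HTML."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


print(extract_hcards(PAGE))
```

The point is that the tool never looks at styles or layout: the class names alone tell it which piece of text is the name, the organization and the phone number.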
As CSS evolves, the Web will move closer and closer to the semantic meaning of web page content. I recommend that modern web developers stay on the cutting edge of markup technologies. You can easily start with Microformats as a rich example of semantic markup. You can dive into CSS3 techniques and features to be ready to use them in some of your next projects (and write fewer divs). The markup will shine for those who reach this formatting Zen, and for a few robots too! :-)