HTML is HyperText Markup Language, created by Tim Berners-Lee as an application of the far larger and more complex SGML. In turn, XHTML and HTML5 are successors to this standard. To break down the HTML acronym:
- HyperText: the ability to jump from one document to another via links.
- Markup: HTML specifies (marks up) content, i.e. it provides meaning. This is a very important point. HTML is not intended to specify how something looks; instead, it specifies what things are: a paragraph, a heading, a quotation, etc. The appearance of content is provided by CSS.
- Language: a misnomer. HTML is not a language in the classical sense of programming languages, because it cannot create executable programs on its own. Do not put HTML on your resume under “Programming Languages”: any employer who knows their code will be able to tell that you are, at best, uninformed.
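To make the first two points concrete, here is a minimal sketch of a page. The markup only says what each piece of content is (a heading, a paragraph, a link, a quotation), while the small stylesheet in the head decides how it looks. The file name `notes.html`, the text, and the specific style rules are placeholders chosen for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Markup describes meaning</title>
  <style>
    /* Appearance lives here, not in the markup (illustrative rules only) */
    h1 { font-family: Georgia, serif; }
    blockquote { border-left: 4px solid #999; padding-left: 1em; }
  </style>
</head>
<body>
  <!-- A heading: the markup says what it is, not how large or bold it is -->
  <h1>On the origin of the web</h1>

  <!-- A paragraph containing a hyperlink to another (hypothetical) document -->
  <p>This is a paragraph. It links to <a href="notes.html">another document</a>.</p>

  <!-- A quotation -->
  <blockquote>
    <p>This is a quotation.</p>
  </blockquote>
</body>
</html>
```

Removing the `style` element changes how the page looks, but not what any part of it means.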
HTML was made for scientists, not artists.
HTML is the lingua franca of the web, the means by which most documents on the web are encoded. It has become so popular that many programs can now output HTML, even for documents that were never intended to be viewed on the web (Microsoft Word, for example, which encodes HTML terribly).
However, there is a problem with HTML, one that lies at the heart of its creation. Tim Berners-Lee, the developer of HTML, was working at CERN when he developed HTML as a method to present simple scientific documents on a heterogeneous network. (You can still find the original release announcement for the Web on Usenet.) The first version of HTML didn’t even have the ability to display graphics. HTML was developed for scientists, not for artists. The original focus of HTML was on features that could be used in scientific papers: tables, lists, headings, and the like, and HTML never entirely left this heritage behind.
As the web grew, artists wanted to jump on the bandwagon. Lacking any other means to do so, they took the functionality developed by Tim Berners-Lee and tried to push it towards design and appearance. HTML was never intended for this, and it coped badly. Designers jumped through all kinds of hoops – cutting up graphics and nesting tables inside tables inside tables, for example – to try to get the visual effects they were after.
The browser wars of the mid-to-late 1990s caused HTML to suffer further. Dissatisfied with the slow, academic advancement of HTML and driven to dominate the market, Netscape and Microsoft began to introduce their own proprietary code. This code looked like HTML, but it was only recognized in the particular browser made by that company (Netscape Navigator and Internet Explorer, respectively). Both Netscape and Microsoft pushed these “advancements” to web developers, who were forced to code for one particular browser if they wanted to use the new features. At its worst point this war threatened to Balkanize the web, producing web pages that could be seen in only one browser and not in the rest.
Thankfully, sanity slowly prevailed. The development of HTML was passed to the World Wide Web Consortium (W3C). New features such as CSS were advanced to support designers, restoring the separation of appearance and content. While there are still some inconsistencies in how different browsers support these open, standardized features, the situation is continuing to improve.
HTML has been standardized as HTML5 as a way of documenting data. CSS has been standardized as a means of presenting that data. And ECMAScript (more popularly known as JavaScript) has been standardized as a way of controlling the interactive behavior of that data.
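As a rough sketch of how these three standards divide the work, the page below keeps each concern in its own layer: the body holds the data, the `style` element controls presentation, and the `script` element adds behavior. The ids, class names, and text are invented for this example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Three layers</title>
  <style>
    /* CSS: how the data is presented */
    .note { color: #555; font-style: italic; }
    .hidden { display: none; }
  </style>
</head>
<body>
  <!-- HTML: what the data is -->
  <h1>Release notes</h1>
  <button id="toggle" type="button">Show or hide details</button>
  <p id="details" class="note hidden">HTML documents the data, CSS presents it, and JavaScript reacts to the user.</p>

  <script>
    // JavaScript: interactive behavior
    const button = document.getElementById("toggle");
    const details = document.getElementById("details");

    // Each click adds or removes the "hidden" class on the paragraph
    button.addEventListener("click", () => {
      details.classList.toggle("hidden");
    });
  </script>
</body>
</html>
```

In practice the CSS and JavaScript usually live in separate files linked from the page; they are inlined here only to keep the example in one piece.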
XHTML was the immediate successor to HTML, and shares many qualities with it; you might want to think of XHTML as “strict” HTML, cleared of the accumulated cruft of <font> tags and the other presentation elements that crept in over 10 years. For many, this new standard was overly draconian and too limited in scope; HTML5 was the result.
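For readers who never met this cruft, the fragment below contrasts the two approaches. The first line bakes appearance into the content with presentational tags; the second states what the text is and leaves the look to CSS. The `notice` class name and the specific values are made up for illustration, and the two versions are only roughly equivalent visually.

```html
<!-- Presentational markup of the browser-war era: appearance mixed into content -->
<font face="Arial" size="5" color="red"><b>Important notice</b></font>

<!-- The same intent with appearance moved to CSS (hypothetical class name) -->
<style>
  .notice { font-family: Arial, sans-serif; font-size: 1.5em; color: red; }
</style>
<p class="notice"><strong>Important notice</strong></p>
```

With the second form, changing the look of every notice on a site means editing one rule instead of every page.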
Image courtesy of Daniel Rehn