Coming Attractions: HTML5 and beyond, and XHTML

 

1. Introduction

HTML is being revised to include more support for rich internet applications (RIA), mobile computing, and other recent developments. The new revision will also include better support for sections, sidebars, etc.

The new version is called HTML5 by the w3c and it is currently a working draft (eventually to become a recommendation, which is the w3c's term for a standard). As of August 2010 HTML5 is implemented, at least in part, in Firefox, Chrome, Safari, Opera, some mobile browsers, and Seamonkey. It is NOT implemented in IE8 (or earlier), but Microsoft is implementing it in IE9. (IE9 is in beta testing at this writing.)

Given the time lag for users to switch to new versions of browsers, you may expect many users to not have access to the new features in HTML5 for quite some time. So you will need to continue to write HTML which degrades gracefully for users of the earlier versions. This is done by using functions which test whether or not a specific feature new to HTML5 is supported in the user's browser (or by using the Modernizr library to do this).
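The kind of test function described above can be sketched in JavaScript. This is a minimal sketch, not the Modernizr implementation: the function names supportsCanvas and chartMarkup, and the fallback image chart.png, are made up for illustration. The test relies on the fact that a browser without canvas support creates a generic element that has no getContext method.

```javascript
// Feature-detection sketch. `doc` is any object with a createElement method;
// in a real page you would pass the browser's global `document`.
function supportsCanvas(doc) {
  const el = doc.createElement('canvas');
  // Older browsers return an unknown element without a getContext method.
  return typeof el.getContext === 'function' && el.getContext('2d') !== null;
}

// Hypothetical helper: pick the markup to use, falling back to a plain image
// (chart.png is an assumed file name) when canvas is unavailable.
function chartMarkup(doc) {
  return supportsCanvas(doc)
    ? '<canvas id="chart" width="300" height="150"></canvas>'
    : '<img src="chart.png" alt="Chart">';
}
```

In a page you would call supportsCanvas(document) once and branch on the result, so that users of older browsers still get usable content.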

HTML5 may be 'served' or 'serialized' as either HTML (which, of course, is more forgiving) or as XHTML (more complex, but with support for namespaces, etc.). These two forms are referred to as HTML5 and XHTML; the latter is sometimes called XHTML5.

 

The "beyond HTML5" revision of HTML is being developed by the WHATWG, or Web Hypertext Application Technology Working Group: http://www.whatwg.org/ (see also "What is this (beyond) version").

 

HTML5 may be used now (just include the <!doctype html> declaration), but some browsers do not yet support all of it (see above). Some features (e.g. the canvas for bitmaps) are supported, and while the w3c said that HTML5 would become the standard when it is supported by at least two browsers, the w3c is still calling it a working draft, despite meeting the two-browser criterion.

 

For developers who want to include equations, mathematical symbols, and other material from MathML, the good news is that HTML5 will include MathML. Until most users have browsers which support HTML5, however, you are probably forced to stick with XHTML1.0.

 

The next sections of this document tell you what you need to know for HTML5 pages, for XHTML(5) pages, and provide a list of references. That list is valuable as this is a work-in-progress. Until X/HTML5 is the standard I am continuing to write XHTML1.0 without the <?xml ... > processing instruction when I need an XML language and will be using HTML5 when I need the new tags. A brief survey of the sites which provide the best technical information on HTML and XHTML in August 2010 showed that they were all using XHTML1.0 Transitional.

What should I use?

 

 

2. HTML5 documents begin

<!DOCTYPE html>
<html lang='en'>

 

You may write either <!DOCTYPE or <!doctype in the first line, but (unlike HTML4.01) you must include a doctype. Including it also ensures that the browser renders the page in standards mode, using the most recent version of HTML it supports.

 

Anything transmitted with the MIME type text/html will be rendered as HTML5. The w3c recommends this for most authors as it will be compatible with older browsers. (see section 1.4.1 in http://www.w3.org/TR/html5/introduction.html ).

 

The lang attribute is optional and is specified in the html tag as lang='en'. (If not specified, it defaults to the lang value of the parent.)

 

The charset is specified as the FIRST element after <head>

<meta charset='utf-8'>
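To see how the pieces of this section fit together, here is a small JavaScript sketch that assembles a complete minimal HTML5 page as a string: doctype first, then the html tag with lang, then the charset as the first element after <head>. The function name html5Skeleton and its parameters are assumptions made for this illustration, and it does no escaping, so treat it as a sketch rather than a page generator.

```javascript
// Assemble a minimal HTML5 page as a string.
// html5Skeleton and its parameters are illustrative names only.
// No escaping is done, so pass only text you trust.
function html5Skeleton(lang, title, bodyHtml) {
  return [
    '<!DOCTYPE html>',            // required first line; case does not matter
    `<html lang='${lang}'>`,      // lang is optional in HTML5
    '<head>',
    "<meta charset='utf-8'>",     // first element after <head>
    `<title>${title}</title>`,
    '</head>',
    '<body>',
    bodyHtml,
    '</body>',
    '</html>',
  ].join('\n');
}
```

For example, html5Skeleton('en', 'Demo', '<p>Hello</p>') returns a ten-line page that can be saved as an .html file and served as text/html.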

 

 

 

3. XHTML5 documents begin

 

<html xmlns='http://www.w3.org/1999/xhtml'
     xml:lang='en'>

 

XHTML5 documents do not need a DOCTYPE declaration, as the xmlns attribute identifies the document as XHTML.

Anything transmitted with an XML MIME type such as application/xhtml+xml or application/xml will be processed by the XML processor in the web browser, i.e. treated as XHTML. (See same reference.)
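The rule that the MIME type, not the markup itself, decides how a page is processed can be sketched as a small JavaScript function. This is only an illustration of the rule as stated in these notes: the name parserForContentType is made up, and real browsers recognise more types than the ones listed here.

```javascript
// Decide which kind of parser handles a response, based on its
// Content-Type header value. Illustrative only: covers just the
// MIME types discussed in these notes.
function parserForContentType(contentType) {
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const mime = contentType.split(';')[0].trim().toLowerCase();
  if (mime === 'text/html') {
    return 'html'; // forgiving HTML parser: the page is HTML5
  }
  if (mime === 'application/xhtml+xml' || mime === 'application/xml') {
    return 'xml';  // strict XML processor: the page is XHTML5
  }
  return 'other';  // not treated as a web page by this sketch
}
```

So the same file is HTML5 when served as text/html and XHTML5 when served as application/xhtml+xml, which is why the server configuration matters as much as the markup.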

 

In HTML5, the DOM now is more than a way to manipulate the page (an API); each element in the DOM now has a meaning or semantics attached to it.

 

The lang attribute is mandatory and is specified in the tag as xml:lang='en'

 

The w3c recommends that you do NOT include the processing instructions for XHTML5.

 

  <?xml version="1.0" encoding="UTF-8" ?>

 

This is because some user agents will render (produce) this. Lacking this line, the charset will default to UTF-8 encoding (or possibly UTF-16), which is just fine.
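The advice in this section can be turned into a rough checklist. The JavaScript sketch below looks at the start of an XHTML5 document and reports which of the points above it misses. It uses naive string checks rather than a real XML parser, and the function name xhtml5StartWarnings and the warning texts are invented for this example.

```javascript
// Rough checklist for the start of an XHTML5 document.
// Naive string matching only; this is not a validator.
function xhtml5StartWarnings(source) {
  const warnings = [];
  // The w3c recommends leaving out the <?xml ... ?> line for XHTML5.
  if (/^\s*<\?xml\s/.test(source)) {
    warnings.push('remove the <?xml ... ?> line');
  }
  // The XHTML namespace must be declared on the html tag.
  if (!source.includes('http://www.w3.org/1999/xhtml')) {
    warnings.push('add the XHTML namespace (xmlns) to the html tag');
  }
  // The language is given with xml:lang rather than lang.
  if (!source.includes('xml:lang=')) {
    warnings.push('add xml:lang to the html tag');
  }
  return warnings;
}
```

An empty result means the opening lines follow the recommendations in this section; it says nothing about the rest of the document, which still needs a validator.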

 

Note: You will need the processing instruction for XML documents — see Unit 4 — Ch. 7 of this course.

 

Please also see references on doctypes and xml processors.

 

4. References:

http://www.w3.org/TR/html5/ has the current (June 2010) working draft for HTML5 and http://www.w3.org/TR/html-markup/ has the Reference Guide.
http://dev.w3.org/html5/html-author/#getting-started-with-html-5, an earlier version of the HTML Reference, has a clear introduction to starting on HTML5.

A later version (August 2010) of the working draft is at http://www.whatwg.org/specs/web-apps/current-work/ and describes the "beyond" version of HTML5; the WHATWG wiki includes a good description of how all the various HTMLs and XHTMLs are related.

http://www.w3.org/TR/html5/introduction.html specifies XHTML5 and http://www.w3.org/TR/html5/introduction.html#html-vs-xhtml describes the difference between XHTML5 and HTML5, as does http://wiki.whatwg.org/wiki/HTML_vs._XHTML.

http://diveintohtml5.org/ will start you on HTML5 (if you know HTML) and http://diveintohtml5.org/detect.html explains how to determine if specific HTML5 features are implemented in the user's browser, including the link to Modernizr, which is open source.

http://www.w3.org/TR/html5-diff/ has the differences between HTML4.01 and HTML5, including information on the new elements in HTML5 in the Language section. (Also see the first reference or http://www.ibm.com/developerworks/xml/library/x-html5/ for a sophisticated introduction, or http://www.runwalsoft.com/blog/?p=15 for a gentler version.) The wiki on the differences between HTML and XHTML is at http://wiki.whatwg.org/wiki/HTML_vs._XHTML, and http://wiki.whatwg.org/wiki/HTML_vs._XHTML#Differences_Between_HTML_and_XHTML has a chart summarizing the syntactic differences and a list of differences about specific elements.

 

http://simon.html5.org/html5-elements has all elements and attributes of HTML5 (click on the item of interest in the left column), but has no revision date, so I don't know if it is staying current or not.

http://www.whatwg.org/ is the group developing X/HTML5. They run a wiki at http://wiki.whatwg.org/wiki/Main_Page and an FAQ at http://wiki.whatwg.org/wiki/FAQ.

http://xhtml.com/en/future/conversation-with-x-html-5-team/ is a gentle introduction to HTML5. (User agent means things like browsers.)

http://xhtml.com/en/future/x-html-5-versus-xhtml-2/ explains the differences between HTML5 and XHTML2.0

http://www.w3.org/QA/2008/01/html5-is-html-and-xml.html explains the HTML5 vs XHTML5 difference

 

http://meyerweb.com/eric/thoughts/category/tech/xhtml/ has the writings of Eric Meyer (the great guru of CSS) on XHTML and HTML5. He is always worth reading (e.g. http://meyerweb.com/eric/thoughts/2008/06/02/the-missing-link/ ).

 

http://www.w3.org/MarkUp/xhtml-roadmap/ has the plan for all XHTML modifications — especially see table near the bottom — as of 8/2008, but as of 12/2009 the group was disbanded to focus resources on HTML5.

 

http://stackoverflow.com/questions/2662508/html-4-html-5-xhtml-mime-types-the-definitive-resource contains a sophisticated summary of the differences among all the various current HTMLs and XHTMLs

 

http://www.w3.org/QA/2008/03/html-charset.html has more information than you probably want to know about charsets and encoding.

http://html5doctor.com has many useful articles about bugs in HTML5 and new tags and new developments.
http://www.ibm.com/developerworks/xml/library/x-think45/index.html provides a different point of view about getting XHTML5 to work properly and using MIME types etc. that differ from official recommendations. A cautionary tale.

 

 

Validators
http://html5.validator.nu/ has a "highly experimental" validator, which is very easy to use. Clicking on the relevant warning or error message takes you to the relevant code.
http://validator.w3.org/ is the official w3c validator. Also "highly experimental", it appears to give the same results as the one above, without the nice feature of jumping to the relevant spot in your code. (It is possible that they are the same engine, as the validator for whatwg.org links to the one above.)
http://www.totalvalidator.com/ validates against HTML5 specs and for accessibility.

Information on polyglot files
http://www.w3.org/TR/html-polyglot/ has the most recent working draft of polyglot markup (24 June 2010 as of this writing) and http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html has the most recent editor's draft of the same (13 August 2010 as of this writing).
http://blog.whatwg.org/ (July 25, 2010 entry) has the whatwg blog post on polyglot markup.
http://stackoverflow.com/questions/3106699/should-i-write-polyglot-html5-documents has some useful (sophisticated) articles on polyglot documents.


Information on older versions, including HTML4.01 and XHTML 1.0 and XHTML1.1
http://www.w3.org/QA/2002/04/valid-dtd-list.html lists all the doctypes for all the versions of HTML and XHTML prior to HTML5 and XHTML5. The list (last revised in 2007) includes all possible valid doctype declarations. HTML5 is not included as it is not yet a recommendation. I suggest, however, that you check periodically and NOT use the <?xml...> line for XHTML documents.