This page is a cookbook for making HTML 4.01 pages into XHTML pages, rather than a tutorial.
XHTML stands for EXtensible HyperText Markup Language, and is intended to replace HTML. It is very similar to HTML 4.01, but is stricter and more logical.
XHTML is based on HTML and XML (eXtensible Markup Language). It is compatible with older browsers, and is the 'HTML' of the future.
Put crudely, XHTML is the new HTML. You don't need to rename your pages to ".xhtml". They remain ".htm" or ".html", but they are written in a way that makes them less likely to fail in current browsers and keeps them compatible with future ones.
Browsers are presently very forgiving and support the old HTML. In the future, however, browsers and other programs dealing with HTML will become stricter. A correct XHTML document is easier for a program to examine and to check.
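To illustrate how easily a program can check a correct document, here is a minimal sketch in Python. It assumes the page is available as a string and only tests XML well-formedness with the standard library parser; it does not validate against a DTD. The function name is_well_formed and the two sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return (True, None) if markup parses as XML, else (False, reason)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return False, str(err)
    return True, None

# Hypothetical sample pages: one written as XHTML, one as forgiving old HTML.
good = ('<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Greetings</title>'
        '</head><body><p>Hello!</p></body></html>')
bad = "<html><body><p>Hello<p>From me</body></html>"

print(is_well_formed(good))   # (True, None)
print(is_well_formed(bad)[0]) # False, because the <p> tags are never closed
```

A page that passes this check can still be invalid XHTML, so a full validator remains the final authority.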
A browser can open an XHTML page more quickly, because it doesn't have to go into "quirks" mode to figure out non-standard code.
Documents that follow the XHTML guidelines are easier to transport to other media and are more accessible, for instance in talking browsers and text-only browsers. In the future, the number of clients will increase, with telephones and other devices reading internet documents. Using standard documents gives your pages forward compatibility.
Converting to XHTML now will make your pages compatible with future technology.
The following is an example of a basic XHTML document:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Greetings</title>
</head>
<body>
<p>Hello!</p>
</body>
</html>
The above document defines the character set in the first line. Instead of UTF-8, you can use other character sets. For example:
<?xml version="1.0" encoding="iso-8859-1"?>
The next one defines the character set in the meta tags.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title>My Page</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
</head>
<body>
</body>
</html>
A minimal document must have a DOCTYPE declaration, which names the DTD, at the top. The Content Type needs to be defined, although this can be done server side. The page also needs the html, head, title, and body tags.
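The list of required tags can be checked mechanically. The following is a rough sketch in Python, assuming the page is well-formed and held in a string; it looks only for the four tags named above and does not inspect the DOCTYPE or the Content Type. The names missing_required_tags and local_name, and the sample page, are invented for this example.

```python
import xml.etree.ElementTree as ET

REQUIRED_TAGS = ("html", "head", "title", "body")

def local_name(tag):
    """Strip an XML namespace prefix such as {http://www.w3.org/1999/xhtml}."""
    return tag.rsplit("}", 1)[-1]

def missing_required_tags(markup):
    """Return the required tag names that do not occur anywhere in markup."""
    root = ET.fromstring(markup)
    present = {local_name(el.tag) for el in root.iter()}
    return [name for name in REQUIRED_TAGS if name not in present]

# Hypothetical page that forgot its title.
page = ('<html xmlns="http://www.w3.org/1999/xhtml"><head></head>'
        '<body><p>Hello!</p></body></html>')
print(missing_required_tags(page))  # ['title']
```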
There are three possible doctypes:
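<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
An XHTML document should start with one of the above. More information is available from the W3C.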
Use the strict DTD when you do not use any old or deprecated HTML attributes, and use style sheets (not HTML) to format your pages.
Use the "transitional" DTD when you have not yet adopted all the XHTML standards and wish to format your pages with HTML, using "font", "align", "size", "width", etc. Also, the strict DTD does not accept the target attribute, although transitional does. This page, for example, follows the transitional standard.
Use the frameset DTD when your page uses frames.
You can also define the xml namespace as follows:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
In the example page above, the Content Type was defined with the UTF-8 character set:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
Another common one is ISO-8859-1:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
There are a number of valid character sets and implementations. The one above is a common one for those writing in Roman letters. The page must have its Content Type defined.
Next we need to look at the differences between HTML and XHTML, or what changes we must make to produce XHTML documents.
HTML is not case sensitive, so you can write <a Href, <A hRef, etc. In XHTML, all tags must be lowercase. Therefore, the following is the only correct way to write tags:
<a href= ....
The surprising thing here is that event attributes, which we think of as JavaScript, must also be lowercase. So while we could write onClick= or ONCLICK= in HTML, we must now write:
onclick=""
Similarly, attribute names must be lowercase:
<a href="myURL.htm" id="myID"></a>
<img src="myImage.gif" height="20" width="30" align="left" alt="My Image" />
The <img tag, by the way, MUST have an "alt" attribute in XHTML, which is logical because some browsers are text browsers and others talking browsers, which need the "alt" text.
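The lowercase rule for tag and attribute names lends itself to an automatic check, because an XML parser keeps names exactly as written. Below is a small sketch in Python, assuming a well-formed fragment without namespaces; the function name non_lowercase_names and the sample fragment are made up for the illustration.

```python
import xml.etree.ElementTree as ET

def non_lowercase_names(markup):
    """List tag names and tag@attribute names that contain uppercase letters."""
    problems = []
    for el in ET.fromstring(markup).iter():
        if el.tag != el.tag.lower():
            problems.append(el.tag)
        for attr in el.attrib:
            if attr != attr.lower():
                problems.append(f"{el.tag}@{attr}")
    return problems

# Hypothetical fragment: the first link breaks the rule, the second follows it.
sample = ('<p><A Href="myURL.htm" onClick="go()">link</A> '
          '<a href="myURL.htm" id="myID">ok</a></p>')
print(non_lowercase_names(sample))
```

Attribute values such as "myID" are deliberately left alone, since only the names are covered by the rule.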
The forward-slash at the end of the IMG tag leads us onto the next rule:
In HTML, it wasn't necessary to close some tags, and the following is OK in HTML:
<p>Hello<p>From me
However, in XHTML, the tags must be closed:
<p>Hello</p><p>From me</p>
Which, after all, is logical.
Some tags in HTML are empty tags and it wasn't necessary to close them. The famous example is:
<br>
In XHTML we must close this tag. One way to do this, which is compatible with older browsers, is to add a space and a forward slash:
<br />
So in XHTML we would write the empty tags as follows:
<img src="Back.jpg" border="0" width="127" height="44" alt="My Message" />
<input type="button" value="button" />
<input type="checkbox" name="C1" id="C1" value="" checked="checked" />
<input type="radio" name="r1" id="r1" value="" checked="checked" />
<link rel="stylesheet" href="menu.css" type="text/css" />
<meta name="keywords" content="" />
<meta name="description" content="" />
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<base target="_top" />
It is simplest to keep each of these tags on one line, although XML, like HTML, allows a tag to span several lines.
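When converting many old pages, the closing slash can be added with a search and replace. Here is a rough sketch in Python using a regular expression. It assumes that no attribute value contains a ">" character, and the tag list EMPTY_TAGS covers only the empty tags discussed on this page; close_empty_tags is a name invented for the example.

```python
import re

# Empty tags covered by this sketch (not a complete list).
EMPTY_TAGS = ("base", "br", "hr", "img", "input", "link", "meta")

_pattern = re.compile(r"<(%s)\b([^>]*?)\s*/?>" % "|".join(EMPTY_TAGS), re.IGNORECASE)

def close_empty_tags(html):
    """Rewrite <br>, <img ...> and friends as <br />, <img ... /> with lowercase names."""
    return _pattern.sub(lambda m: "<%s%s />" % (m.group(1).lower(), m.group(2)), html)

print(close_empty_tags('Line one<br>Line two<IMG src="Back.jpg" alt="My Message">'))
# Line one<br />Line two<img src="Back.jpg" alt="My Message" />
```

Tags that are already closed come out unchanged, so the function can be run over a page more than once.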
In HTML, we could write:
<input type="radio" name="r1" id="r1" value="" checked>
However, this breaks three rules. The correct way to write the above is:
<input type="radio" name="r1" id="r1" value="" checked="checked" />
That is, the empty tag is closed, all the values are in quotes, and there is no attribute minimisation, where we could write simply checked when we meant checked="checked". (If the name attribute is present, the id attribute is required, but they don't need to have the same value.)
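The three fixes just described (closing the empty tag, quoting values, expanding minimised attributes) can be applied by a program. The sketch below uses Python's standard html.parser module to read a forgiving HTML start tag and write it back out in XHTML style. The class StartTagNormalizer, the helper normalise, and the short EMPTY_TAGS list are assumptions made for this illustration, and only start tags are handled.

```python
from html import escape
from html.parser import HTMLParser

# Empty tags covered by this sketch (not a complete list).
EMPTY_TAGS = {"base", "br", "hr", "img", "input", "link", "meta"}

class StartTagNormalizer(HTMLParser):
    """Rebuild each start tag with lowercase names and quoted, non-minimised attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        parts = [tag]  # the parser has already lowercased tag and attribute names
        for name, value in attrs:
            if value is None:  # minimised attribute such as "checked"
                value = name
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        closer = " />" if tag in EMPTY_TAGS else ">"
        self.tags.append("<" + " ".join(parts) + closer)

def normalise(tag_text):
    """Return the start tags found in tag_text, rewritten in XHTML style."""
    parser = StartTagNormalizer()
    parser.feed(tag_text)
    parser.close()
    return "".join(parser.tags)

print(normalise('<INPUT TYPE=radio name=r1 id=r1 value="" CHECKED>'))
# <input type="radio" name="r1" id="r1" value="" checked="checked" />
```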
Tags must also be properly nested. This means that we should not write:
<b><i>Hello</b></i>
But we should write this:
<b><i>Hello</i></b>
That is, we close the tags in the opposite order to the way we opened them.
Also, we should not write block tags (such as "<p>" or "<table>") inside inline elements (such as "<a>", "<span>", or "<font>").
So the following is incorrect:
<font size="3"><p>Hello</p></font>
It is incorrect because the block element (p) is inside an inline element (font). Instead, write:
<p><font size="3">Hello</font></p>
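A block tag sitting directly inside an inline tag can be found by walking the parsed document. The following is a minimal sketch in Python, assuming a well-formed fragment without namespaces. The INLINE and BLOCK sets contain only the tags mentioned above and are not complete, and block_inside_inline is a made-up name.

```python
import xml.etree.ElementTree as ET

# Partial lists, limited to the tags named in the text.
INLINE = {"a", "span", "font"}
BLOCK = {"p", "table"}

def block_inside_inline(markup):
    """Return (inline, block) pairs where a block tag is a direct child of an inline tag."""
    root = ET.fromstring(markup)
    return [(parent.tag, child.tag)
            for parent in root.iter()
            if parent.tag in INLINE
            for child in parent
            if child.tag in BLOCK]

print(block_inside_inline('<body><font size="3"><p>Hello</p></font></body>'))
# [('font', 'p')]
print(block_inside_inline('<body><p><font size="3">Hello</font></p></body>'))
# []
```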
In XHTML, the name attribute is replaced with the id attribute. In transitional documents both name and id can be used, but a name without an id is not allowed.
The name and id attribute values must be one word. So not:
<a name="a nice day" id="a nice day"></a>
But:
<a name="a_nice_day" id="a_nice_day"></a>
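Turning a multi-word value into a single word is easy to automate. A tiny sketch in Python follows; it uses the underscore convention from the example above, and the helper names one_word and is_one_word are invented here.

```python
def one_word(value):
    """Join the words of a name or id value with underscores."""
    return "_".join(value.split())

def is_one_word(value):
    """True if value is non-empty and contains no whitespace."""
    return bool(value) and not any(ch.isspace() for ch in value)

print(one_word("a nice day"))     # a_nice_day
print(is_one_word("a nice day"))  # False
```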
Some elements require an attribute. For instance, the FORM requires an action:
<form name="form1" id="form1" action="">
An empty action, it appears, is better than no action!
An image requires both an "alt" attribute and an "src" attribute:
<img src="myImage.jpg" alt="" />
JavaScript and CSS tags need a type attribute:
<script type="text/javascript">
<style type="text/css">
<link rel="stylesheet" href="menu.css" type="text/css" />
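The required attributes mentioned in this section can be checked with a lookup table. Below is a small sketch in Python, assuming a well-formed fragment without namespaces. The REQUIRED table lists only the cases named above (form, img, script, style) and is not the full set defined by the DTDs; missing_attributes is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Only the requirements discussed in the text, not the complete DTD rules.
REQUIRED = {
    "form": ["action"],
    "img": ["src", "alt"],
    "script": ["type"],
    "style": ["type"],
}

def missing_attributes(markup):
    """Return (tag, attribute) pairs for required attributes that are absent."""
    problems = []
    for el in ET.fromstring(markup).iter():
        for attr in REQUIRED.get(el.tag, []):
            if attr not in el.attrib:
                problems.append((el.tag, attr))
    return problems

sample = '<body><form name="form1" id="form1"><img src="myImage.jpg" /></form></body>'
print(missing_attributes(sample))  # [('form', 'action'), ('img', 'alt')]
```

An empty value such as action="" or alt="" counts as present, which matches the advice above.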
In strict XHTML 1.0, the language attribute is deprecated. So we write:
<script type="text/javascript">
Rather than:
<script language="javascript" type="text/javascript">
Inside a script, a forward slash that follows "<" can be read as an end-tag marker, so when writing HTML with JavaScript the forward slashes in closing tags should be preceded by an escape character ("\", backward slash).
So:
document.write("<h1>Hello<\/h1>");
The backward slash precedes any forward slashes.
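If many scripts need this treatment, the backslashes can be added by a simple text replacement. Here is a minimal sketch in Python, assuming the script source is held in a string and that every "</" in it belongs to markup being written out; escape_end_tags is a name made up for this example.

```python
def escape_end_tags(script_text):
    """Put a backslash before the slash of every "</" in script_text."""
    return script_text.replace("</", "<\\/")

print(escape_end_tags('document.write("<h1>Hello</h1>");'))
# document.write("<h1>Hello<\/h1>");
```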
Some characters may not be recognised in the character set chosen. For instance, © may not be recognised, and needs to be replaced by &#169;
Strict XHTML recognizes the following special character names only:
&amp; - ampersand ( & )
&lt; - less than, open bracket ( < )
&gt; - greater than, close bracket ( > )
&quot; - double quote ( " )
&nbsp; - non-breaking space (hard space)
(See special characters.)
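Replacing special characters by hand is error-prone, so it may help to let a program do it. The sketch below, in Python, escapes the markup characters with the standard html.escape function and then turns any character outside ASCII into a numeric reference. The function name xhtml_text and the sample sentence are invented for the illustration, and it assumes the input is plain text rather than markup.

```python
from html import escape

def xhtml_text(text):
    """Escape markup characters, then replace non-ASCII characters with numeric references."""
    escaped = escape(text, quote=True)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(xhtml_text('Copyright © 1998 "Fish & Chips" <Ltd>'))
# Copyright &#169; 1998 &quot;Fish &amp; Chips&quot; &lt;Ltd&gt;
```

Numeric references work whichever character set the page declares, which is why the sketch prefers them over named ones.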
Most Recent Revision: 18-Oct-98.
Copyright © 1998