I migrated the HTML pages on my website to XHTML this weekend. I’d been hesitant to do this after hearing a couple of horror stories about how a slight error could have big consequences (for example, see how extra slashes caused Google to stop indexing CodeProject), but I took the plunge. Mostly this was a matter of changing <br> to <br /> and so on. But I did discover a few errors, such as missing </p> tags. I was surprised to find such things because I had previously validated the site as HTML 4.01 strict.
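One likely reason the HTML 4.01 validator never complained is that HTML 4.01 makes the closing </p> tag optional, while XHTML, being XML, requires every element to be closed. Here is a minimal sketch of a well-formedness check using Python's standard XML parser. The helper name is_well_formed and the sample fragments are my own illustrations, not something from the migration itself, and the check covers only XML well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML-style <br> is never closed, so the following </p> is a mismatch.
html_style = "<div><p>one<br>two</p></div>"
# XHTML-style <br /> is self-closing and parses cleanly.
xhtml_style = "<div><p>one<br />two</p></div>"
# A missing </p> is legal in HTML 4.01 but not well-formed XML.
missing_close = "<div><p>one<p>two</p></div>"

for sample in (html_style, xhtml_style, missing_close):
    print(is_well_formed(sample), sample)
```

Only the second fragment passes. A full validator does more than this, but a parse like this catches unclosed tags quickly.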
I thought that HTML entities such as &beta; for the Greek letter β were illegal in XHTML. Apparently not. Three validators (Microsoft Expression Web 2, W3C, and WDG) all seem to think they’re OK. Apparently they’re defined in XHTML though not in XML in general. I looked at the official W3C docs and didn’t see anything ruling these out.
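The difference between XHTML and plain XML here can be seen with a generic XML parser that has no XHTML DTD to consult. The sketch below is only an illustration under that assumption: the helper name parses is hypothetical, and Python's standard parser stands in for "XML in general".

```python
import xml.etree.ElementTree as ET


def parses(fragment: str) -> bool:
    """Return True if a DTD-less XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The named entity is undefined without the XHTML DTD, so parsing fails.
print(parses("<p>&beta;</p>"))    # False
# The numeric character reference needs no DTD.
print(parses("<p>&#946;</p>"))    # True
# &amp; is one of the five entities predefined by XML itself.
print(parses("<p>&amp;</p>"))     # True

# The numeric reference decodes to the same Greek letter.
print(ET.fromstring("<p>&#946;</p>").text)  # β
```

So a named entity like this is fine on a page that declares an XHTML doctype, but the same markup fed to a parser that ignores the DTD will be rejected. Numeric references such as &#946; work either way.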
Also, I’ve read that <i> and <b> are not allowed in strict XHTML. That’s what Elliotte Rusty Harold says in Refactoring HTML, and he certainly knows more about (X)HTML than I do. But the three validators I mentioned before all approved of these tags on pages marked as XHTML strict. I changed the <i> and <b> tags to <em> and <strong> respectively just to be safe, but I didn’t see anything in the W3C docs suggesting that the <i> and <b> tags were illegal or even deprecated. (I understand that italic and bold refer to presentation rather than content, but it seems pedantic to me to suggest that <em> and <strong> are any different from their frowned-upon counterparts.)
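For anyone wanting to make the same tag swap in bulk, here is a rough sketch of how it could be scripted. This is not the method used for the migration described above. The function name to_semantic is hypothetical, and the regular expression is a simplification that does not understand comments, CDATA sections, or script content, so the output should be reviewed and revalidated.

```python
import re

# Presentational tag name -> replacement tag name.
_RENAMES = {"i": "em", "b": "strong"}

# Matches <i>, </i>, <b>, </b>, with optional attributes on the opening tag.
# The tag name must be followed by whitespace or ">", so <br />, <body>,
# and <img> are left alone.
_TAG = re.compile(r"<(/?)(i|b)(\s[^>]*)?>", re.IGNORECASE)


def to_semantic(markup: str) -> str:
    """Rename <i>/<b> tags to <em>/<strong>, keeping any attributes."""
    def swap(match: re.Match) -> str:
        slash = match.group(1)
        name = match.group(2).lower()
        attrs = match.group(3) or ""
        return f"<{slash}{_RENAMES[name]}{attrs}>"

    return _TAG.sub(swap, markup)


sample = '<p><i>a</i> and <b class="x">b</b><br /></p>'
print(to_semantic(sample))
# <p><em>a</em> and <strong class="x">b</strong><br /></p>
```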