At work (http://selkirk.ca/), we’re looking at buying a new Content Management System to run our web site. Right now we’re using what I built by hand, but we could really use a system built by more than one or two people.
But one of the scariest issues for me is ensuring that only pure, spotless, valid, accessible XHTML 1.0 Strict content goes into my database. And generally speaking, web-based WYSIWYG HTML editors are… less than exemplary.
I’ve used TinyMCE (http://tinymce.moxiecode.com/), FCKEditor (http://www.fckeditor.net/), and XStandard (http://xstandard.com/). Currently XStandard (http://xstandard.com/) is by far the most successful at stripping inappropriate code. It is well worth the modest license fee, and the free version is very nearly as good as the paid version.
When I inherited this web site, it was built the bad old way, mostly using Microsoft FrontPage™. It was riddled with deeply nested <font> tags, often a font tag for every other character in the content. The code was unreadable, it was easily twenty times its necessary file size, and it was very unfriendly to search engines. Reusing the same text for another purpose (in print documents, for example) required running several cleanup routines, and still there was hour after hour of manually removing little niggly bits of bad code that lingered even after the RegEx (http://en.wikipedia.org/wiki/Regular_expression) tools had had their fun.
After getting my code to a POSH (http://www.456bereastreet.com/archive/200711/posh_plain_old_semantic_html/) state, it was very distasteful to see users trying to enter <font> tags, inline styles (which bloat the code nearly as bad as
With clean, strict code, I can allow Marketing to make the design decisions, I can make the behaviour decisions globally, and content providers can focus on what they need to focus on — providing good, clean, well-written content to their visitors.
I Hate Bad Editors
There are lots of online WYSIWYG HTML editors out there. But most of them allow any old abomination of HTML. The better ones might provide a cleanup routine based on Tidy (http://en.wikipedia.org/wiki/HTML_Tidy), but again, it tends to be only about as effective as any RegEx tool. And many of those require the person editing to click a button or some such to do the cleaning, and they just don’t bother. Some claim to be XHTML-compliant, but they usually don’t forbid users from adding presentational markup to the content (leading to code bloat, design problems, etc.), because XHTML Transitional isn’t worried about those elements so much.
I’ve played around with TinyMCE and FCKEditor enough to get them kind of working as Strict… but I don’t have a lot of confidence in those products.
Peter Krantz (http://www.standards-schmandards.com/2007/wysiwyg-editor-test-2/) of standards-schmandards.com did a good evaluation of WYSIWYG editors for semantic features, but the need for Strict code is implied, not stated explicitly.
To test the CMS products that are coming up in our evaluation process, I put together the most abominable page I could. The CMS editor should strip most, if not all, of the bad markup listed in the document. Warning: this is not for the weak of stomach.
Test Web Page for Cleanup of Bad HTML (http://jasonfriesen.ca/schtuff/bad-html.html)
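
For anyone who wants to check what an editor actually saves after pasting in a page like that, here is a rough sketch in Python of a scan for leftover presentational markup. It is not what I run on our site, and it is not a validator: it does not check against the XHTML 1.0 Strict DTD. The tag and attribute lists, and the name `find_presentational_markup`, are my own assumptions for illustration, and the lists are deliberately partial.

```python
from html.parser import HTMLParser

# Assumption: a partial list of elements that exist in XHTML 1.0
# Transitional but not in Strict. Extend it to suit your own rules.
PRESENTATIONAL_TAGS = {"font", "center", "u", "s", "strike", "basefont"}

# Assumption: attributes to flag. "bgcolor" is not in Strict; "style" is
# valid in Strict but is flagged here because inline styles bloat content.
FLAGGED_ATTRS = {"bgcolor", "style"}


class PresentationalMarkupFinder(HTMLParser):
    """Collects a description of each flagged element or attribute."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in PRESENTATIONAL_TAGS:
            self.problems.append(f"<{tag}> element")
        for name, _value in attrs:
            if name in FLAGGED_ATTRS:
                self.problems.append(f"{name} attribute on <{tag}>")


def find_presentational_markup(markup):
    """Return a list of flagged items found in the saved markup."""
    finder = PresentationalMarkupFinder()
    finder.feed(markup)
    finder.close()
    return finder.problems


if __name__ == "__main__":
    saved = '<p style="color: red"><font face="Arial">Hello</font></p>'
    for problem in find_presentational_markup(saved):
        print(problem)
```

An empty list means none of the flagged items survived the editor's cleanup. It says nothing about whether the markup is otherwise valid, so a real validator is still needed for that.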