How to make TinyMCE output clean HTML
I like TinyMCE; I think it's the best WYSIWYG editor you can get. There's just one thing that bothers me quite a lot: by default TinyMCE outputs really messy HTML code. For instance, imagine you want to make an ordinary unordered list.
The output is the following:
<ul>
  <li><span style="font-size: x-small;"><span style="font-size: 10px; line-height: 16px;">first</span></span></li>
  <li><span style="font-size: x-small;"><span style="font-size: 10px; line-height: 16px;">second</span></span></li>
  <li><span style="font-size: x-small;"><span style="font-size: 10px; line-height: 16px;">third</span></span></li>
</ul>
Why the hell so many spans and styles?
Fortunately there is a simple solution (it just took me 3 hours to find it). If you take a look at TinyMCE Configuration you'll find two inconspicuous parameters: invalid_elements and extended_valid_elements.
invalid_elements
This parameter allows you to specify which elements you want to exclude from HTML output.
invalid_elements: "span"
This solves our problem, but you can never be sure you will always want to get rid of every `span` element. What if one day you need a span with a class attribute? On the other hand, the output is nice, pure HTML:
<ul>
  <li>first</li>
  <li>second</li>
  <li>third</li>
</ul>
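To show where that line actually lives, here is a minimal sketch of an editor setup. It assumes a TinyMCE version that takes the `selector` option in `tinymce.init` (older releases configure the target differently), and the `editorConfig` name and the `"textarea"` selector are just placeholders for illustration.

```javascript
// Editor options. `editorConfig` and the selector value are assumptions,
// adjust them to your own page.
const editorConfig = {
  selector: "textarea",     // assumption: turn every <textarea> into an editor
  invalid_elements: "span"  // exclude all <span> elements from the output
};

// Only call TinyMCE when the library is actually loaded on the page.
if (typeof tinymce !== "undefined") {
  tinymce.init(editorConfig);
}
```

Any other options you already pass to the editor go into the same object.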
extended_valid_elements
I think a better way of doing this is extended_valid_elements. It works the opposite way to invalid_elements: instead of listing elements to exclude, you specify which elements are allowed and which attributes they can have. The default configuration is quite wild, but that doesn't matter here.
All we want to do is say: "remove all spans without a class attribute".
And here's the solution (the exclamation mark makes the class attribute required):
extended_valid_elements: "span[!class]"
The HTML output is still nice, and if you want to use a span with a class ... you can :].
<ul>
  <li>fdgsdfsfdg</li>
  <li class="hello">cvbcxvbxcb</li>
  <li>dsfgsdfgsdg</li>
</ul>
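If you want a feel for what "remove all spans without a class attribute" means, here is a rough model of the rule in plain JavaScript. It is not TinyMCE's own parser, the function name `unwrapClasslessSpans` is made up for this example, and it only copes with simple, well-formed markup (no `>` inside attribute values).

```javascript
// Rough model of the "span[!class]" rule: drop <span> tags that have no
// class attribute but keep whatever is inside them.
// This is NOT how TinyMCE is implemented, just an illustration.
function unwrapClasslessSpans(html) {
  const keepStack = []; // one entry per open <span>: true if that span is kept
  // Split into tags and text; the capture group keeps the tags in the result.
  return html
    .split(/(<[^>]+>)/)
    .map((token) => {
      if (/^<span\b/i.test(token)) {
        const keep = /\sclass\s*=/i.test(token);
        keepStack.push(keep);
        return keep ? token : "";
      }
      if (/^<\/span\s*>$/i.test(token)) {
        // Close tags follow the decision made for their opening tag.
        return keepStack.pop() ? token : "";
      }
      return token; // text and every other tag pass through untouched
    })
    .join("");
}

const messy =
  '<li><span style="font-size: x-small;"><span style="font-size: 10px;">first</span></span></li>' +
  '<li><span class="hello">second</span></li>';

console.log(unwrapClasslessSpans(messy));
// prints: <li>first</li><li><span class="hello">second</span></li>
```

One difference to keep in mind: this sketch leaves the attributes of a kept span alone, while the real rule lists only `class` as an allowed attribute for `span`.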