Tuesday, 27 November 2012

HTML to valid XML code using tidy.

Sometimes the HTML source you copy from a web page is not valid. Some tools, like my htmlminimizator, require valid XML fragments to work, so in that case the HTML first has to be transformed into valid XML. I do this with the tidy utility, as follows.

Assuming your HTML fragment is in a file called test1.html, run:

tidy -asxml -utf8 test1.html
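If you want the result in a file rather than on the terminal, something along these lines should work. It is only a sketch: the sample fragment, the output name test1.xhtml and the extra options (-q, -o, --numeric-entities) are my own additions for illustration, not part of the command above.

```shell
# Create a small, deliberately sloppy fragment to try the command on.
printf '<p>Hello<br>world &nbsp; <b>bold\n' > test1.html

# -asxml : ask tidy for well-formed XHTML output
# -utf8  : treat input and output as UTF-8
# -q     : quiet mode, suppresses the informational banner
# -o     : write the result to a file instead of stdout
# --numeric-entities yes : prefer numeric entities over named ones,
#                          which plain XML parsers tend to accept more readily
# tidy exits with 1 when it only issued warnings and with 2 on errors,
# so only a status of 2 or more is treated as a failure here.
status=0
tidy -q -asxml -utf8 --numeric-entities yes -o test1.xhtml test1.html || status=$?
if [ "$status" -ge 2 ]; then
    echo "tidy reported errors" >&2
fi
```

Note that tidy wraps the fragment in a complete document (html, head, body). If you only want the repaired fragment back, the --show-body-only yes option may be what you need.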

I hope this helps others in a similar situation.