Sanitising HTML with C# and Tidy
// October 26th, 2008 // Programming
(*See article end for updates)
A while back I was given the task of sanitising HTML entered by users in our Discussions area at Huddle. After searching around, I initially couldn’t find many easy-to-implement tools to help me.
I first came across Jeff Atwood’s solution for cleaning HTML. It is a very fast solution, although not watertight. Regular expressions are great for string matching when you know what to expect, but attack payloads vary so much that it’s possible for some of them to slip through.
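To illustrate the general weakness, here is a minimal sketch of a deliberately naive regex filter. It is not Jeff Atwood’s code; the class name, the `StripScriptTags` helper and the sample payloads are all made up for this example. It only shows how input the pattern did not anticipate can get past a regex.

```csharp
using System;
using System.Text.RegularExpressions;

class NaiveSanitiserDemo
{
    // Hypothetical naive filter (an assumption for illustration only):
    // removes complete <script>...</script> blocks and nothing else.
    static string StripScriptTags(string html)
    {
        return Regex.Replace(
            html,
            @"<script\b[^>]*>.*?</script>",
            "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    static void Main()
    {
        // The expected case: the script block is removed.
        Console.WriteLine(StripScriptTags("<p>Hi</p><script>alert(1)</script>"));

        // No <script> tag at all, so the event handler passes through untouched.
        Console.WriteLine(StripScriptTags("<img src=x onerror=alert(1)>"));

        // Removing the inner block splices the outer fragments
        // back together into a working <script> tag.
        Console.WriteLine(StripScriptTags("<scr<script></script>ipt>alert(1)</script>"));
    }
}
```

Only the first input comes out clean. The second is returned unchanged, and the third becomes `<script>alert(1)</script>` after a single pass, which is the kind of slip a pattern-matching approach is prone to.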