This replaces DOMParser with htmlparser2, which is much, much faster.
How much faster? 80%. The new implementation can parse at 50 MB/s,
which is insane! The old one could only do 5-10 MB/s.
We haven't gotten rid of DOMParser entirely, though, since HTML-to-MD
conversion still needs it. That will follow soon by switching to `dr-sax`.

This uses a custom implementation of htmlparser2, which is 50% faster
than the default one.

Since HTML is a tree-like language, it is futile to compare it character
for character: `html1 === html2` is almost always false. This commit
introduces a simple diffing algorithm that only checks the text inside
the HTML plus a few other attributes to decide whether the two documents
are actually different. This is obviously not foolproof, and it
ignores everything aesthetic (b, em, strong tags etc.). This is actually
desirable because in our case only a difference in text should
warrant a conflict. Everything else can easily be brought back.
Similarly, this also ignores whitespace differences surrounding the
tags.
All in all, it provides a more reliable alternative to MD5 hashing the
two HTML documents.
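
For illustration, here is a rough sketch of the idea in TypeScript. It is
not the code from this commit: it uses a naive regex tag stripper as a
stand-in for htmlparser2 so that it runs without dependencies, and the
names `fingerprint`, `isHtmlDifferent`, `INLINE_TAGS` and
`TRACKED_ATTRIBUTES` are hypothetical. The list of aesthetic tags and the
choice of `href`/`src` as the "few other attributes" are assumptions too.

```typescript
// Tags treated as purely aesthetic (assumed list, not taken from the commit).
const INLINE_TAGS = new Set(["b", "i", "u", "em", "strong", "span"]);

// Attributes that count as content (assumed; the commit does not name them).
const TRACKED_ATTRIBUTES = ["href", "src"];

// Reduces an HTML string to its text plus the tracked attribute values.
// Regex-based stand-in for a real parser: it does not decode entities and
// will be confused by a ">" inside an attribute value.
function fingerprint(html: string): string {
  const attributes: string[] = [];
  const text = html
    // Comments never contribute visible text.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Visit every opening or closing tag.
    .replace(
      /<\/?([a-zA-Z][a-zA-Z0-9]*)([^>]*)>/g,
      (_tag: string, name: string, rest: string) => {
        for (const attr of TRACKED_ATTRIBUTES) {
          const pattern = new RegExp(`(?:^|\\s)${attr}\\s*=\\s*"([^"]*)"`, "i");
          const match = pattern.exec(rest);
          if (match) attributes.push(`${attr}=${match[1]}`);
        }
        // Aesthetic tags vanish; any other tag acts as a word boundary.
        return INLINE_TAGS.has(name.toLowerCase()) ? "" : " ";
      }
    )
    // Whitespace runs (including those around tags) collapse to one space.
    .replace(/\s+/g, " ")
    .trim();
  return JSON.stringify({ text, attributes });
}

// True only when the text or a tracked attribute differs.
function isHtmlDifferent(a: string, b: string): boolean {
  return fingerprint(a) !== fingerprint(b);
}
```

With this sketch, `<p>Hello <b>world</b></p>` and `<p>Hello world</p>`
compare as equal, while changing a word or a link target is reported as a
difference. An MD5 hash of the raw strings would flag both cases.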