I'm developing PHP applications for quite a while now. But this one realy gets me struggled. I’m loading complete HTML pages using the DomDocument. These pages are external and may contain JavaScript. This is beyond my control.
On some pages things were not rendered the way it supposed to when it came down to basic HTML formatting in JavaScript strings. I've wrote down an example which explains it all.
<?php
$html = new DOMDocument();
libxml_use_internal_errors(true);
$strPage = '<html>
<head>
<title>Demo</title>
<script type="text/javascript">
var strJS = "<b>This is bold.</b><br /><br />This should not be bold. Where did my closing tag go to?";
</script>
</head>
<body>
<script type="text/javascript">
document.write(strJS);
</script>
</body>
</html>';
$html->loadHTML($strPage);
echo $html->saveHTML();
exit;
?>
Am I missing something?
Edit: I've changed the demo. Changing the LoadHTML to LoadXML doesn't work anymore now and the output of the demo will pass w3c validation. Also adding the CDATA block to the JavaScript doesn't seem to have any effect.
from DOMDocument removes HTML tags in JavaScript string
No comments:
Post a Comment