With an allowlist-based Sanitizer you are definitely less likely to shoot yourself in the foot, but as long as you use setHTML you at least can't introduce XSS.
I'll be very excited to use this in Lit when it hits baseline.
While lit-html templates are already XSS-hardened because template strings aren't forgeable, we do have utilities like `unsafeHTML()` that let you treat untrusted strings as HTML, which are currently... unsafe.
With `Element.setHTML()` we can make a `safeHTML()` directive and let the developer specify sanitizer options too.
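To make that concrete, here is a rough sketch of the core of such a helper as a plain function rather than a real Lit directive. The name `safeHTML` and its signature are hypothetical; the only platform call assumed is `Element.setHTML(html, { sanitizer })` from the Sanitizer API, and the sketch refuses to run where that method is missing instead of falling back to `innerHTML`.

```javascript
// Hypothetical helper, not part of Lit: writes untrusted HTML into `element`
// through the platform sanitizer and passes developer-supplied options along.
function safeHTML(element, untrustedHtml, sanitizer) {
  if (typeof element.setHTML !== "function") {
    // Deliberately no innerHTML fallback: that would reintroduce the XSS risk.
    throw new TypeError("Element.setHTML() is not supported here");
  }
  if (sanitizer === undefined) {
    // No options given: rely on the browser's default safe configuration.
    element.setHTML(untrustedHtml);
  } else {
    element.setHTML(untrustedHtml, { sanitizer });
  }
  return element;
}
```

A caller would write something like `safeHTML(container, comment.body, { elements: ["b", "i", "p"] })`, where `container` and `comment` are whatever the app already has.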
Two, even if we did, DOMPurify is ~2.7x bigger than lit-html core (3.1Kb minzipped), and the unsafeHTML() directive is less than 400 bytes minzipped. It's just really big to take on a sanitizer, and which one to use is an opinion we'd have to have. And lit-html is extensible and people can already write their own safeHTML() directive that uses DOMPurify.
For us it's a lot simpler to have safe templates, an unsafe directive, and not parse things too finely in between.
A built-in API is different for us though. It's standard, stable, and should eventually be well known by all web developers. We can integrate it with no extra dependencies or code, and just adopt the standard platform options.
Are you certain that this is secure? What about parsing depth/DOM clobbering, etc?
See https://mizu.re/post/exploring-the-dompurify-library-bypasse... for an example of why this is really hard. Please do not roll your own sanitizers; DOMPurify has very good maintenance hygiene, and the maintainer is an expert. I have reported a bunch of issues and never waited for more than two hours for a response in the past.
He is also one of the leading authors of the specification behind `setHTML`.
My code accepts only a very limited subset of HTML tags and their respective attributes. (<a>, <img>, <font>, <br>, <b>, <strong>, <i>, <em>, <del>, <s>, <u>, <p>, <hr>, <li>, <ul>, <ol>).
I could easily add more, like headings or tables; I just decided not to overwhelm the readers. But all of the allowed elements and attributes here are harmless. When I'm copying them, I'm only copying the known-safe elements and attributes (which forbids unknown attributes, including styles/scripts, event handlers, style attributes, ids, or even classes). I have fine control over the allowed elements, attributes, and structure. This makes things much easier. For basic HTML content management this kind of filtering is fine, since DOMParser actually does the heavy lifting.
Sure, DOMPurify is powerful and handles much more complex use cases (doesn't it also use DOMParser though?), no doubts about that. But a basic CMS probably only has to handle basic HTML text elements. I guess inline SVG sanitization is more complicated (maybe just use an ordinary <img> instead?).
If you have some html example that will inject js/css or cause any unexpected behavior in my code example, please provide that HTML.
The app developers can still use that right now, but if the framework forced its usage it'd unnecessarily increase package size for people that didn't need it.
So that's why template literals are broken. I am not much of a JS dev but sometimes I play one on TV, and I was cursing up a storm because I could not get templates to work the way I wanted them to. And I quote: "What do you mean template strings are not strings? What idiot designed this?"
If you're curious, I had a bright idea for a string translation library. Yes, I know there are plenty of great internationalization libraries, but I wanted to try it out. The idea was to just write normalish template strings so the code reads well; then the translation engine would look up the template string in the language table and replace it with the translated template string, and this new template string is the one that would be filled. But I could not get it to work. I finally gave up and had to implement "the template system we have at home" from scratch just to get anything working.
To the designers of JS template literals, I apologize: you were blocking an attack vector that never crossed my mind. It was the same thing the first time I had to do the CORS dance. I thought it was just about the stupidest thing I had ever seen. "This protects nothing, it only works when the client (the part you have no control over) decides to do it." The idea that you need protection after you have deliberately injected unknown malicious code (ads) into your web app took me several days of hard thought to understand.
My example: a table to look up translated templates. Most translation engines require you to use placeholder strings. This lets you use the template directly as the optional lookup key.
Simplified, with some liberties taken, as this can't be done with template literals.
Easy enough to fake with some regexes and loops, but I was a bit surprised that the built-in JS templates are limited in this manner.
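// Lookup table keyed by the untranslated template itself (plain strings, not template literals).
const translate_table = {
  'Where is the ${thing}': '${thing} はどこですか',
};

// Stand-in for the string .format() that JS does not have built in:
// fills ${name} placeholders in a plain string from args.
function format(template, args) {
  return template.replace(/\$\{(\w+)\}/g, (match, name) => args[name] ?? match);
}

function t(template, args) {
  // Fall back to the untranslated template when the table has no entry.
  return format(translate_table[template] ?? template, args);
}

// user_dialog and users_thing are placeholders for whatever the app provides.
user_dialog(t('Where is the ${thing}', { thing: users_thing }));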
I even dug deep into tagged templates, but they can't do this either. The only solution I found was a variant of eval(), and at that point I would rather write my own template engine.
I think I understand what you're suggesting, and I think it can be achieved with javascript template literals. It might be easier to understand with a usage example instead of an implementation example.
The only restriction may be that variable placeholders in additional translations might need to be positional rather than named.
You can make your tagged template literal return an array of tokens, so the developer gets to write naturally and no one has to deal with parsing. Just use the JSON-stringified token array as the key in your translation map.
Here's how the tagged template literal maps to tokens:
t`Where is the ${t.thing()}` ->
["Where is the ", ["thing"]] // ["variable name"]
Example rendering a translated string directly:
t`Where is the ${t.thing(user_data)}?`.toString()
It's an internet forum, so I made it as short as possible over all other style factors. Untested, just trying to express the idea.
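For anyone who wants to poke at it, here is one way the token idea could be fleshed out. This is only a sketch under my own assumptions: the `Proxy` trick for `t.thing(...)`, the `translations` table, and the shape of the returned object are all made up for illustration, not an existing library.

```javascript
// Hypothetical table: JSON-stringified source tokens -> translated tokens.
const translations = {
  '["Where is the ",["thing"],"?"]': [["thing"], " はどこですか?"],
};

// Tag function: interleaves literal chunks with [name] tokens.
function tag(strings, ...vars) {
  const tokens = [];
  const values = {};
  strings.forEach((chunk, i) => {
    if (chunk !== "") tokens.push(chunk); // skip empty literal chunks
    if (i < vars.length) {
      tokens.push([vars[i].name]);
      values[vars[i].name] = vars[i].value;
    }
  });
  return {
    tokens,
    toString() {
      // Use the translated token list if there is one, else the original.
      const out = translations[JSON.stringify(tokens)] ?? tokens;
      return out
        .map((tok) => (Array.isArray(tok) ? String(values[tok[0]]) : tok))
        .join("");
    },
  };
}

// Proxy so that t.anything(value) produces a named placeholder.
const t = new Proxy(tag, {
  get: (target, name) =>
    typeof name === "string"
      ? (value) => ({ name, value })
      : Reflect.get(target, name),
});
```

Because the translated entry carries the variable names, the placeholders can be reordered per language, which is what the Japanese entry does here.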
This has nothing to do with XSS or security. It's also pretty common for template literals/string interpolation to work like this. There are a couple of exceptions, but the majority of programming languages do it this way.
As far as I can tell, JS has no way to symbolically handle unformatted templates and then format them later.
For example, you can't do this.
const t1 = new Template('Hello ${name}');
const str_1 = t1.format({'name':user_name});
You could argue, perhaps correctly, that this is by design and doing something like this is a mistake. But when my whole clever idea depended on doing exactly this, I was a bit surprised when it does not work with native templates.
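For what it's worth, the "fake it with a regex" version of that is small. A minimal sketch, assuming a hypothetical `Template` class (nothing like it is built into JS) that works on plain strings and only understands simple `${name}` placeholders, not arbitrary expressions:

```javascript
// Hypothetical deferred template: stores a plain string, fills it in later.
class Template {
  constructor(source) {
    this.source = source;
  }

  // Replaces each ${name} with args[name]; unknown names are left untouched.
  format(args) {
    return this.source.replace(/\$\{(\w+)\}/g, (placeholder, name) =>
      Object.prototype.hasOwnProperty.call(args, name)
        ? String(args[name])
        : placeholder
    );
  }
}

const t1 = new Template('Hello ${name}');
const str_1 = t1.format({ name: 'Ada' });
```

Substituted values are inserted literally in a single pass and never re-scanned, so a value that itself contains `${...}` is not expanded again.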
I'm not saying it's right or wrong, just that PHP is following the trend with this feature when it comes to language design.
I know I said earlier it's not for security, but it could very well be for security (not XSS though), as format string injection is a common vulnerability in C and Python, which allow this sort of thing.
I upgraded mine to FF 138. After the update it even opened the tab groups blog post. 15 minutes of going through settings later, no, my browser has no tab groups feature at all. Of course then I see it's a progressive rollout. Sad.
There is an about:config setting you can turn on. I don’t have it here, but it has been widely posted, so take a look. I think if you search about:config for “tab” it might show up.
You might want something like:
This will replace <h1> elements with their children (i.e. text in this case), but disallow all other elements and attributes.
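A sketch of a configuration with that behaviour, assuming the Sanitizer API config keys `replaceWithChildrenElements`, `elements`, and `attributes` as currently specified. The names `unwrapHeadingsConfig` and `setUnwrapped` are made up for illustration, and the exact config should be checked against the spec and the browser you target.

```javascript
// Assumed config: unwrap <h1> (keep its children), allow no elements or attributes.
const unwrapHeadingsConfig = {
  replaceWithChildrenElements: ["h1"],
  elements: [],
  attributes: [],
};

// Hypothetical wrapper: applies the config through Element.setHTML().
function setUnwrapped(element, untrustedHtml) {
  element.setHTML(untrustedHtml, { sanitizer: unwrapHeadingsConfig });
}
```

In a browser that ships `setHTML`, it would be called as `setUnwrapped(document.body, "<h1>Hello</h1>")`.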