When creating HTML, it is necessary to escape HTML entities properly: “&” needs to be replaced with “&amp;”, “<” with “&lt;”, and so on.
When serving HTML data remotely, the main reason for this is protection from XSS attacks. Aside from security concerns, having HTML entities escaped when they should not be is also a problem. If your rendered page shows the literal text “<img src="president-cat.png" />” instead of the actual image, you will be an unhappy user.
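As a rough illustration of the escaping described above, here is a minimal sketch in standard C++. The function name escapeHtml and the use of std::string are assumptions made for the example; this is not the Grantlee or Qt implementation.

```cpp
#include <string>

// Hypothetical helper for illustration only; not a Grantlee or Qt function.
// Replaces the characters that have special meaning in HTML with entities.
std::string escapeHtml(const std::string &input)
{
    std::string output;
    output.reserve(input.size());
    for (char c : input) {
        switch (c) {
        case '&':  output += "&amp;";  break;
        case '<':  output += "&lt;";   break;
        case '>':  output += "&gt;";   break;
        case '"':  output += "&quot;"; break;
        case '\'': output += "&#39;";  break;
        default:   output += c;        break;
        }
    }
    return output;
}
```

Escaping the same text twice turns “&” into “&amp;amp;”, which is the second problem mentioned above: the output needs to be escaped exactly once, so something has to remember whether it has already happened.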
The Django Template System uses a subclass of the Python string type, extended to also store information about whether the string should be escaped or not. A string that should not be escaped anymore is “safe”. The task was to create a type which can be used just like QString, but which also stores information about whether the string is safe from further escaping or not. Additionally, certain operations on the object need to change the “safe” state of the string. Even though the string “Kernighan &amp; Ritchie” is safe, if the developer does something like str.remove("amp; Ritchie");, the result is not safe: “Kernighan &”. So remove is an unsafe operation, but it’s not the only one. Every such operation on QString needs to be reimplemented so that calling it on a safe string produces an unsafe one.
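The idea of a string that carries a safe-ness flag, with remove as an unsafe operation, might be sketched like this. FlaggedString is a hypothetical name, and it wraps std::string instead of QString so that the example builds without Qt. Clearing the flag on every call to remove is a conservative assumption made for this sketch.

```cpp
#include <string>
#include <utility>

// Illustrative stand-in only: FlaggedString is a hypothetical name and wraps
// std::string rather than QString so the sketch needs no Qt headers.
class FlaggedString
{
public:
    enum Safety { IsNotSafe, IsSafe };

    FlaggedString(std::string text, Safety safety = IsNotSafe)
        : m_text(std::move(text)), m_safe(safety == IsSafe) {}

    const std::string &text() const { return m_text; }
    bool isSafe() const { return m_safe; }

    // Removing characters can cut an entity such as "&amp;" in half, so the
    // result can no longer be trusted as already escaped. This sketch
    // conservatively clears the flag on every call.
    FlaggedString &remove(const std::string &needle)
    {
        if (!needle.empty()) {
            std::string::size_type pos = 0;
            while ((pos = m_text.find(needle, pos)) != std::string::npos)
                m_text.erase(pos, needle.size());
        }
        m_safe = false;
        return *this;
    }

private:
    std::string m_text;
    bool m_safe;
};
```

Constructing FlaggedString("Kernighan &amp; Ritchie", FlaggedString::IsSafe) and calling remove("amp; Ritchie") on it leaves the text “Kernighan &” with the flag cleared, so a later rendering step knows it has to escape the string again.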
In Python, all methods behave as virtual, all classes behave polymorphically, and code is largely independent of concrete types, though types must often conform to some concept. That means that in Python the solution was to create a subclass which contains the safe/unsafe variable and which reimplements the methods that need to change the safeness. Instances of the new class can then be passed around just as easily as plain strings, without the surrounding code caring about the type.
In C++ it’s not so simple, so Grantlee needed another way. I started experimenting with a SafeString class as a subclass of QString. Preserving the safeness of an instance was not a large problem. Nor was overriding safeness-changing methods by using SafeString in the API instead of QString. However, the tricky part was passing SafeString instances around in QVariants without using QVariant::fromValue all the time.
There are some methods in the extension API which must be implemented to return a QVariant, possibly containing a SafeString. That should be as easy as simply returning an instance of the SafeString class, just as it is with QString. The reason that works with QString is that QVariant has an implicit constructor which takes a QString. I couldn’t just add a new constructor to QVariant, so I gave SafeString an operator QVariant() which simply returns QVariant::fromValue(str);.
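The mechanism can be sketched without Qt using stand-in types. Text, Variant, SafeText and render below are all hypothetical names that play the roles of QString, QVariant, the SafeString subclass and an extension method. The sketch assumes a compiler that prefers the conversion operator (an exact match) over the base-class constructor (a derived-to-base conversion) when both are candidates.

```cpp
#include <string>
#include <utility>

// Stand-in for QString, so that the sketch builds without Qt.
struct Text
{
    std::string value;
};

// Stand-in for QVariant: it can be built implicitly from a Text, and it
// records whether a safe-ness flag travelled along with the value.
struct Variant
{
    Variant(const Text &text) : value(text.value), safe(false) {}
    Variant(std::string v, bool s) : value(std::move(v)), safe(s) {}

    std::string value;
    bool safe;
};

// Hypothetical subclass carrying the flag. The conversion operator plays the
// role of "operator QVariant()" described in the text.
struct SafeText : Text
{
    SafeText(std::string v, bool s) : Text{std::move(v)}, safe(s) {}

    operator Variant() const { return Variant(value, safe); }

    bool safe;
};

// Mirrors an extension method that must return a variant: the SafeText is
// returned directly and converted implicitly.
Variant render()
{
    return SafeText("Kernighan &amp; Ritchie", true);
}
```

If the compiler instead converts the SafeText to its Text base and calls Variant(const Text &), the value still arrives but the flag is lost, and nothing in the calling code shows that it happened.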
It worked quite well. Except that it only worked on gcc. Testing revealed that MSVC and Sun Studio were converting the SafeString to its QString base class and putting that in the QVariant, discarding the safe-ness information.
After discussing it with colleagues I tried out some other ideas, such as inheriting privately from QString and making use of the using keyword to bring the QString methods public again. That didn’t work either. The compile error showed that, instead of using operator QVariant(), the compiler was still trying to cast up to QString and use the implicit constructor, but QString was now an inaccessible base class. Another idea was to create a QString wrapper class instead of a subclass. That got me most of the way there, in that the operator QVariant() worked on all compilers, but I still had to implement forwarding for all QString operations.
The penultimate solution was to wrap a subclass of QString instead of QString directly. Initially the wrapped class was available through operator->() so that things like mySafeString->replace(); called the reimplemented method and returned a SafeString with the appropriate safe-ness attribute. A simple file was used to test the behavior of different compilers. Eventually though, the words of the wise were confirmed that that was abuse of operator->(), so now the NestedString in SafeString is available through .get(). That works quite well too. All QString methods which return a QString are reimplemented to return a SafeString and a few more overloads are added to make the use of SafeString quite transparent, so it’s mostly hidden in the documentation.
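The shape of that final design might look roughly like the sketch below. It uses std::string and a toy Variant in place of QString and QVariant, and the names SafeText and NestedText are hypothetical stand-ins for SafeString and NestedString. Only remove is reimplemented here, and the copy handling is an assumption of the sketch rather than a description of the Grantlee code.

```cpp
#include <string>
#include <utility>

// Toy stand-in for QVariant, so that the sketch builds without Qt.
struct Variant
{
    std::string value;
    bool safe;
};

// The wrapper is not derived from the string type, so a compiler cannot
// convert it to the string base class behind the developer's back.
class SafeText
{
public:
    // Plays the role of NestedString: a string subclass that knows its owner.
    class NestedText : public std::string
    {
    public:
        // Reimplemented unsafe operation: edits the text, clears the owner's
        // flag, and returns the wrapper rather than a bare string.
        SafeText &remove(const std::string &needle)
        {
            if (!needle.empty()) {
                for (size_type pos = find(needle); pos != npos; pos = find(needle, pos))
                    erase(pos, needle.size());
            }
            m_owner->m_safe = false;
            return *m_owner;
        }

    private:
        friend class SafeText;
        NestedText(std::string text, SafeText *owner)
            : std::string(std::move(text)), m_owner(owner) {}
        SafeText *m_owner;
    };

    SafeText(std::string text, bool safe)
        : m_nested(std::move(text), this), m_safe(safe) {}

    // Copies must point the nested string at the new wrapper, not the old one.
    SafeText(const SafeText &other)
        : m_nested(other.m_nested, this), m_safe(other.m_safe) {}
    SafeText &operator=(const SafeText &other)
    {
        m_nested.assign(other.m_nested); // copies the text, keeps m_owner == this
        m_safe = other.m_safe;
        return *this;
    }

    NestedText &get() { return m_nested; }
    const NestedText &get() const { return m_nested; }
    bool isSafe() const { return m_safe; }

    // Plays the role of "operator QVariant()".
    operator Variant() const { return Variant{m_nested, m_safe}; }

private:
    NestedText m_nested;
    bool m_safe;
};
```

With this layout, s.get().remove("amp; Ritchie") goes through the reimplemented method and hands back the wrapper, while every method that is not reimplemented is still reachable on the nested string through get() without any forwarding code.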