Archive for the ‘KDE’ Category

Grantlee version 5.3.1 now available

November 11, 2022

I’ve just made a new 5.3.1 release of Grantlee. The 5.3.0 release had some build issues with Qt 6 which should now be resolved with version 5.3.1.

Unlike previous releases, this release will not appear on http://www.grantlee.org/downloads/. I’ll be turning off grantlee.org soon. All previous releases have already been uploaded to https://github.com/steveire/grantlee/releases.

The continuation of Grantlee for Qt 6 is happening as KTextTemplate so as not to be constrained by my lack of availability. I’ll only make new Grantlee patch releases as needed to fix any issues that come up in the meantime.

Many thanks to the KDE community for taking up stewardship and ownership of this library!

Location, Location, Location

April 27, 2021

As of a few days ago, a new feature in clang-query allows introspecting the source locations for a given clang AST node. The feature is also available for experimentation in Compiler Explorer. I previously delivered a talk at EuroLLVM 2019 and blogged in 2018 about this feature and others to assist in discovery of AST matchers and source locations. This is a major step in getting the Tooling API discovery features upstream into LLVM/Clang.

Background

When creating clang-tidy checks to perform source to source transformation, there are generally two steps common to all checks:

  • Matching on the AST
  • Replacing particular source ranges in source files with new text

To complete the latter, you will need to become familiar with the source locations clang provides for the AST. A diagnostic is then issued with zero or more “fix it hints” which indicate changes to the code. Almost all clang-tidy checks are implemented in this way.

Some of the source locations which might be interesting for a FunctionDecl are illustrated here:

Pick Your Name

A common use case for this kind of tooling is to port a large codebase from a deprecated API to a new API.

A tool might replace a member call pushBack with push_back on a custom container, for the purpose of making the API more like standard containers. It might be the case that you have multiple classes with a pushBack method and you only want to change uses of it on a particular class, so you cannot simply find and replace across the entire repository.

Given test code like

    struct MyContainer
    {
        // deprecated:
        void pushBack(int t);

        // new:
        void push_back(int t);    
    };

    void calls()
    {
        MyContainer mc;

        mc.pushBack(42);
    }

A matcher could look something like:

    match cxxMemberCallExpr(
        on(expr(hasType(cxxRecordDecl(hasName("MyContainer"))))),
        callee(cxxMethodDecl(hasName("pushBack")))
        )

Try experimenting with it on Compiler Explorer.

An explanation of how to discover how to write this AST matcher expression is out of scope for this blog post, but you can see blogs passim for that too.

Know Your Goal

Having matched a call to pushBack, the next step is to replace the source text of the call with push_back. The call to mc.pushBack() is represented by an instance of CXXMemberCallExpr. Given the instance, we need to identify the location in the source of the first character after the “.” and the location of the opening paren. Given those locations, we create a diagnostic with a FixItHint to replace that source range with the new method name:

    diag(MethodCallLocation, "Use push_back instead of pushBack")
        << FixItHint::CreateReplacement(
            sourceRangeForCall, "push_back");

When we run our porting tool in clang-tidy, we get output similar to:

warning: Use push_back instead of pushBack [misc-update-pushBack]
    mc.pushBack(42);
       ^~~~~~~~
       push_back

Running clang-tidy with -fix then causes the tooling to apply the suggested fix. Once we have tested it, we can run the tool to apply the change to all of our code at once.

Find Your Place

So, how do we identify the sourceRangeForCall?

One way is to study the documentation of the Clang AST to try to identify what API calls might be useful to access that particular source range. That is quite difficult to determine for newcomers to the Clang AST API.

The new clang-query feature allows users to introspect all available locations for a given AST node instance.

note: source locations here
    mc.pushBack(42);
       ^
 * "getExprLoc()"

note: source locations here
    mc.pushBack(42);
                  ^
 * "getEndLoc()"
 * "getRParenLoc()"

With this output, we can see that the location of the member call is retrievable by calling getExprLoc() on the CXXMemberCallExpr, which happens to be defined on the Expr base class. Because clang replacements can operate on token ranges, the location for the start of the member call is actually all we need to complete the replacement.
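To make the token-range idea concrete, here is a minimal sketch in plain C++. It is a toy model and not Clang's implementation: the helper names identifierLengthAt and replaceTokenAt are invented for illustration, and a simple character scan stands in for Clang's lexer. Given only the offset where the member name starts, the whole identifier is replaced:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Toy model only: Clang works on lexed tokens, not on raw characters.
// Returns the length of the identifier that starts at `offset`.
std::size_t identifierLengthAt(const std::string& source, std::size_t offset)
{
    std::size_t end = offset;
    while (end < source.size()
           && (std::isalnum(static_cast<unsigned char>(source[end]))
               || source[end] == '_'))
        ++end;
    return end - offset;
}

// Replaces the whole identifier beginning at `offset` with `newName`.
// Only the start position is needed; the end is found by scanning the token.
std::string replaceTokenAt(std::string source, std::size_t offset,
                           const std::string& newName)
{
    source.replace(offset, identifierLengthAt(source, offset), newName);
    return source;
}
```

Calling replaceTokenAt on the text of the call with the offset of the "p" in pushBack and the new name push_back rewrites just the member name and leaves the object and the arguments untouched.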

One of the design choices of the srcloc output of clang-query is that only locations on the “current” AST node are part of the output. That’s why for example, the arguments of a function call are not part of the locations output for a CXXMemberCallExpr. Instead it is necessary to traverse to the argument and introspect the locations of the node which represents the argument.

By traversing to the MemberExpr of the CXXMemberCallExpr we can see more locations. In particular, we can see that getOperatorLoc() can be used to get the location of the operator (a “.” in this case, but it could be a “->” for example) and getMemberNameInfo().getSourceRange() can be used to get a source range for the name of the member being called.

The Best Location

Given the choice of using getExprLoc() or getMemberNameInfo().getSourceRange(), the latter is preferable because it is more semantically related to what we want to replace. Aside from the hint that we want the “source range” of the “member name”, the getExprLoc() should be disfavored as that API is usually only used to choose a position to indicate in a diagnostic. That is not specifically what we wish to use the location for.

Additionally, by experimenting with slightly more complex code, we can see that getExprLoc() on a template-dependent call expression does not give the desired source location (at the time of publishing, at least; this behavior is likely undesirable). At any rate, getMemberNameInfo().getSourceRange() gives the correct source range in all cases.

In the end, our diagnostic can look something like:

    diag(MethodCallLocation, "Use push_back instead of pushBack")
        << FixItHint::CreateReplacement(
            theMember->getMemberNameInfo().getSourceRange(), "push_back");

This feature is a powerful way to discover source locations and source ranges while creating and maintaining clang-tidy checks. Let me know if you find it useful!

AST Matchmaking made easy

February 14, 2021

The upcoming version of Clang 12 includes a new traversal mode which can be used for easier matching of AST nodes.

I presented this mode at EuroLLVM and ACCU 2019, but at the time I was calling it “ignoring invisible” mode. The primary aim is to make AST Matchers easier to write by requiring less “activation learning” of the newcomer to the AST Matcher API. I’m analogizing to “activation energy” here – this mode reduces the amount of learning of new concepts that must be done before starting to use AST Matchers.

The new mode is a mouthful – IgnoreUnlessSpelledInSource – but it makes AST Matchers easier to use correctly and harder to use incorrectly. Some examples of the mode are available in the AST Matchers reference documentation.

In clang-query, the mode affects both matching and dumping of AST nodes and it is enabled with:

set traversal IgnoreUnlessSpelledInSource

while in the C++ API of AST Matchers, it is enabled by wrapping a matcher in:

traverse(TK_IgnoreUnlessSpelledInSource, ...)

The result is that matching of AST nodes corresponds closely to what is written syntactically in the source, rather than corresponding to the somewhat arbitrary structure implicit in the clang::RecursiveASTVisitor class.

Using this new mode makes it possible to “add features by removing code” in clang-tidy, making the checks more maintainable and making it possible to run checks in all language modes.

Clang does not use this new mode by default.

Implicit nodes in expressions

One of the issues identified is that the Clang AST contains many nodes which must exist in order to satisfy the requirements of the language. For example, a simple function relying on an implicit conversion might look like:

struct A {
    A(int);
    ~A();
};

A f()
{
    return 42;
}

In the new IgnoreUnlessSpelledInSource mode, this is represented as

ReturnStmt
`-IntegerLiteral '42'

and the integer literal can be matched with

returnStmt(hasReturnValue(integerLiteral().bind("returnVal")))

In the default mode, the AST might be (depending on C++ language dialect) represented by something like:

ReturnStmt
`-ExprWithCleanups
  `-CXXConstructExpr
    `-MaterializeTemporaryExpr
      `-ImplicitCastExpr
        `-CXXBindTemporaryExpr
          `-ImplicitCastExpr
            `-CXXConstructExpr
              `-IntegerLiteral '42'

To newcomers to the Clang AST, and to me, it is not obvious what all of the nodes there are for. I can reason that an instance of A must be constructed. However, there are two CXXConstructExprs in this AST and many other nodes, some of which are due to the presence of a user-provided destructor, others due to the temporary object. These kinds of extra nodes appear in most expressions, such as when processing arguments to a function call or constructor, declaring or assigning a variable, converting something to bool in an if condition etc.

There are already AST Matchers such as ignoringImplicit() which skip over some of the implicit nodes in AST Matchers. Still though, a complete matcher for the return value of this return statement looks something like

returnStmt(hasReturnValue(
    ignoringImplicit(
        ignoringElidableConstructorCall(
            ignoringImplicit(
                cxxConstructExpr(hasArgument(0,
                    ignoringImplicit(
                        integerLiteral().bind("returnVal")
                        )
                    ))
                )
            )
        )
    ))

Another mouthful.

There are several problems with this.

  • Typical clang-tidy checks which deal with expressions tend to require extensive use of such ignoring...() matchers. This makes the matcher expressions in such clang-tidy checks quite noisy
  • Different language dialects represent the same C++ code with different AST structures/extra nodes, necessitating testing and implementing the check in multiple language dialects
  • The requirement or possibility to use these intermediate matchers at all is not easily discoverable, nor are the required matchers to satisfy all language modes easily discoverable
  • If an AST Matcher is written without explicitly ignoring implicit nodes, Clang produces lots of surprising results and incorrect transformations

Implicit declaration nodes

Aside from implicit expression nodes, Clang AST Matchers also match on implicit declaration nodes in the AST. That means that if we wish to make copy constructors in our codebase explicit we might use a matcher such as

cxxConstructorDecl(
    isCopyConstructor()
    ).bind("prepend_explicit")

This will work fine in the new IgnoreUnlessSpelledInSource mode.

However, in the default mode, if we have a struct with a compiler-provided copy constructor such as:

struct Copyable {
    OtherStruct m_o;
    Copyable();
};

we will match the compiler-provided copy constructor. When our check inserts explicit at the copy constructor location, it will result in:

struct explicit Copyable {
    OtherStruct m_o;
    Copyable();
};

Clearly this is an incorrect transformation despite the transformation code “looking” correct. This AST Matcher API is hard to use correctly and easy to use incorrectly. Because of this, the isImplicit() matcher is typically used in clang-tidy checks to attempt to exclude such transformations, making the matcher expression more complicated.
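The copy constructor that gets matched really does exist even though nobody wrote it. A minimal sketch in standard C++ can show that, under the assumption that OtherStruct is an empty struct (the post does not define it) and with hasTrivialCopyConstructor and HandWritten as hypothetical names added for comparison:

```cpp
#include <type_traits>

// Assumption: the post does not show OtherStruct, so an empty struct is used.
struct OtherStruct {};

// Same shape as the example above: no copy constructor is written here.
struct Copyable {
    OtherStruct m_o;
    Copyable();
};

// For contrast, a class whose copy constructor is spelled in the source.
struct HandWritten {
    HandWritten(const HandWritten&) {}
};

// A trivial copy constructor is never user-provided, so a true result means
// the compiler declared the copy constructor on our behalf.
template <typename T>
constexpr bool hasTrivialCopyConstructor()
{
    return std::is_copy_constructible<T>::value
        && std::is_trivially_copy_constructible<T>::value;
}

static_assert(hasTrivialCopyConstructor<Copyable>(),
              "Copyable has a copy constructor that is not in the source");
static_assert(!hasTrivialCopyConstructor<HandWritten>(),
              "HandWritten's copy constructor is user-provided");
```

Because that declaration has no spelling of its own in the source, a check that inserts text "at the copy constructor" has nowhere sensible to put it.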

Implicit template instantiations

Another surprise in the behavior of AST Matchers is that template instantiations are matched by default. That means that if we wish to change class members of type int to type safe_int for example, we might write a matcher something like

fieldDecl(
    hasType(asString("int"))
    ).bind("use_safe_int")

This works fine for non-template code.

If we have a template like

template <typename T>
struct TemplStruct {
    TemplStruct() {}
    ~TemplStruct() {}

private:
    T m_t;
};

then clang internally creates an instantiation of the template with a substituted type for each template instantiation in our translation unit.

The new IgnoreUnlessSpelledInSource mode ignores those internal instantiations and matches only on the template declaration (i.e., with the T un-substituted).

However, in the default mode, our template will be transformed to use safe_int too:

template <typename T>
struct TemplStruct {
    TemplStruct() {}
    ~TemplStruct() {}

private:
    safe_int m_t;
};

This is clearly an incorrect transformation. Because of this, isTemplateInstantiation() and similar matchers are often used in clang-tidy to exclude AST matches which produce such transformations.
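The reason the int matcher fires on the template can be seen without any Clang API. In this minimal sketch the member is made public so that it can be inspected, and fieldIsInt is a hypothetical helper, not part of any library:

```cpp
#include <type_traits>

// Simplified version of the template above; m_t is public only so that the
// sketch can look at its type.
template <typename T>
struct TemplStruct {
    T m_t;
};

// Reports whether the m_t field of a concrete struct type has type int.
template <typename S>
constexpr bool fieldIsInt()
{
    return std::is_same<decltype(S::m_t), int>::value;
}

// In the instantiation for int, the field declared as "T m_t" has type int,
// which is what a matcher looking for int fields sees.
static_assert(fieldIsInt<TemplStruct<int>>(), "instantiated field is int");

// Other instantiations of the same declaration have other field types.
static_assert(!fieldIsInt<TemplStruct<double>>(), "instantiated field is double");
```

Both instantiations come from the same source line, so a replacement computed from the int instantiation lands on the T in the template declaration.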

Matching metaphorical code

C++ has multiple features which are designed to be simple expressions which the compiler expands to something less-convenient to write. Range-based for loops are a good example as they are a metaphor for an explicit loop with calls to begin and end among other things. Lambdas are another good example as they are a metaphor for a callable object. C++20 adds several more, including rewriting use of operator!=(...) to use !operator==(...) and operator<(...) to use the spaceship operator.

[I admit that in writing this blog post I searched for a metaphor for “a device which aids understanding by replacing the thing it describes with something more familiar” before realizing the recursion. I haven’t heard these features described as metaphorical before though…]

All of these metaphorical replacements can be explored in the Clang AST or on CPP Insights.
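As a rough illustration of two of these expansions, the following sketch writes a range-based for loop and a lambda next to hand-written approximations of what they stand for. The function and type names are made up for the example, and the expansions are simplified compared with what the compiler really generates:

```cpp
#include <vector>

// Sum written with a range-based for loop.
int sumRangeFor(const std::vector<int>& values)
{
    int total = 0;
    for (int v : values)
        total += v;
    return total;
}

// Approximately what the loop above stands for: explicit begin/end calls.
int sumExpanded(const std::vector<int>& values)
{
    int total = 0;
    auto&& range = values;
    auto first = range.begin();
    auto last = range.end();
    for (; first != last; ++first) {
        int v = *first;
        total += v;
    }
    return total;
}

// A lambda that captures n by value...
inline auto makeAdder(int n)
{
    return [n](int x) { return x + n; };
}

// ...stands for a class with a data member and a call operator.
struct Adder {
    int n;
    int operator()(int x) const { return x + n; }
};
```

The begin and end calls and the Adder-like class are never written by the author of the short forms, yet they are the kind of nodes the default traversal visits.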

Matching these internal representations is confusing and can cause incorrect transformations. None of these internal representations are matchable in the new IgnoreUnlessSpelledInSource mode.

In the default matching mode, the CallExprs for begin and end are matched, as are the CXXRecordDecl implicit in the lambda and hidden comparisons within rewritten binary operators such as spaceship (causing bugs in clang-tidy checks).

Easy Mode

This new mode of AST Matching is designed to be easier for users, especially newcomers to the Clang AST, to use and discover while offering protection from typical transformation traps. It will likely be used in my Qt-based GUI Quaplah, but it must be enabled explicitly in existing clang tools.

As usual, feedback is very welcome!

Grantlee version 5.2 (Alles hat ein Ende, nur die Wurst hat zwei) now available

December 18, 2019

The Grantlee community is pleased to announce the release of Grantlee version 5.2.0.

For the benefit of the uninitiated, Grantlee is a set of Qt based libraries including an advanced string template system in the style of the Django template system.

{# This is a simple template #}
{% for item in list %}
    {% if item.quantity == 0 %}
    We're out of {{ item.name }}!
    {% endif %}
{% endfor %}

This release contains a major update to the script bindings used in Grantlee to provide JavaScript implementations of custom features. Allan Jensen provided a port from the old QtScript bindings to new bindings based on the QtQml engine. This is a significant future-proofing of the library. Another feature which keeps pace with Qt is the ability, provided by Volker Krause, to introspect classes decorated with Q_GADGET. Various cleanups and bug fixes make up the rest of the release. I made some effort to modernize it as this is the last release I intend to make of Grantlee.

This release comes over 3 and a half years after the previous release, because I have difficulty coming up with new codenames for releases. Just joking of course, but I haven’t been making releases as frequently as I should have, and it is having an impact on the users of Grantlee, largely in KDE applications. To remedy that, I am submitting Grantlee for inclusion in KDE Frameworks. This will mean releases will happen monthly and in an automated fashion. There is some infrastructure work to finish before that transition is complete, so hopefully it will be done early in the new year.

The Future of AST-Matching refactoring tools (EuroLLVM and ACCU)

April 30, 2019

I recently made a trip to EuroLLVM in Brussels and ACCU in Bristol. It was a busy week. I gave a talk at both conferences on the topic of the future of AST Matchers-based refactoring.

As usual, the ‘hallway track’ also proved useful at both conferences, leading to round-table discussions at the LLVM conference with other interested contributors and getting to talk to other developers interested in refactoring tooling at ACCU.

Presentations

The learning curve for AST-Matcher-based refactoring is currently too steep. Most C++ developers who are not already familiar with the internal Clang APIs need to invest a lot in order to learn how to make such bespoke tooling to improve and maintain their codebase.

The presentations both include demos of steps I’ve been taking to try to address these problems.

The first demo is of clang-query discovery features which aim to reduce the need to infer AST Matcher code by examining the Clang AST itself. I also showed the debugging features I am preparing to upstream to clang-query. Finally – in terms of demo content – I showed a Qt tool which can eliminate some of the additional difficulties and problems of long-developer-iteration-time.

The debugging features and the Qt tool were world exclusives at the LLVM conference (and at the ACCU conference because most people didn’t attend both 🙂 ). I hadn’t shown them to anyone else before, so I was quite happy the demos went well.

Videos

My 25 minute presentation to the LLVM developers tried to show that these changes can make mechanical refactoring more easily available to C++ developers.

The aim was to show the features to the LLVM community to

  1. illustrate the issues as I see them
  2. get some feedback about whether this is a good direction
  3. introduce myself for the sake of further code reviews (and collaborators). As this was my first LLVM conference, I am not already familiar with most of the attendees.

My 1.5 hour ACCU presentation is a far-less-rushed presentation of the same tools and a repetition of some of the content at code::dive 2018. In the ACCU presentation, the new demo content starts about 30 minutes in. This talk is the one to watch if you are interested in using mechanical refactoring on your own code.

Feedback was very positive from both talks, so I’m happy with that.

Qt Tooling

Earlier this year I refactored the clang AST dump functionality. It was previously implemented in one class, ASTDumper, which had the dual responsibilities of traversing the clang AST and creating a textual representation of it. I separated the textual output from the generic, output-independent traversal, which introduces the possibility of alternative output formats such as JSON.

Of course, given my KDE and Qt contribution history, I would only create a generic tree traversal class in order to implement QAbstractItemModel for it.

The demos show all of the features you would expect from a point-and-click refactoring tool including exploring, feature discovery, debugging with in-source feedback, live source updates, experimental refactoring etc.

Of course, all of this requires changes to Clang upstream (for example to add the debugging interface) which was the point of my visit to EuroLLVM. Hopefully, once enough of the changes are upstream, I’ll be able to open source the tool.

The idea as always is to hopefully have enough functionality in Clang itself that IDEs such as Qt-Creator, KDevelop and Visual Studio would be able to integrate these features using their own GUI APIs, making the simple tool I made obsolete anyway. I only made it for demo purposes.

This will take the mechanical refactoring workflow which is currently

and turn it into

You will still do the same things, but with much faster development iteration to achieve the same result.

There is even more that can be done to make the process of mechanical refactoring with clang easier and faster. We discussed some of that at EuroLLVM, and hopefully all the pieces can come together soon. Meanwhile I’ll be upstreaming some of this work, talking at NDC Oslo, and at my local meetup group on this topic.

Debugging Clang AST Matchers

April 16, 2019

Last week I flew to Brussels for EuroLLVM followed by Bristol for ACCU.

At both conferences I presented the work I’ve been doing to make it easier for regular C++ programmers to perform ‘mechanical’ bespoke refactoring using the clang ASTMatchers tooling. Each talk was prepared specifically for the particular audience at that conference, but both were very well received. The features I am working on require changes to the upstream Clang APIs in order to enable modern tooling, so I was traveling to EuroLLVM to try to build some buy-in and desire for those features.

I previously delivered a talk on the same topic about AST Matchers at code::dive 2018. This week I presented updates to the tools and features that I have worked on during the 6 months since.

One of the new features I presented is a method of debugging AST Matchers.

Part of the workflow of using AST Matchers is an iterative development process. For example, the developer wishes to find functions of a particular pattern, and creates an ever-more-complex matcher to find all desired cases without false-positives. As the matcher becomes more complex, it becomes difficult to determine why a particular function is not found as desired.

The debugger features I wrote for AST Matchers intend to solve that problem. It is now possible to create, remove and list breakpoints, and then enable debugger output to visualize the result of attempting to match at each location. A simple example of that is shown here.

When using a larger matcher it becomes obvious that the process of matching is short-circuited, meaning that the vertically-last negative match result is the cause of the overall failure to match the desired location. The typical workflow with the debugger is to insert break points on particular lines, and then remove surplus breakpoints which do not contribute useful output.
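The short-circuit behavior can be pictured with a small toy model in plain C++. Nothing here is Clang API: a "node" is reduced to a parameter count, a "matcher" is a named predicate, and traceAllOf and samplePredicates are hypothetical names. The trace stops at the first failing predicate, so the last entry is the one that explains the failure:

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins: a "node" is just a parameter count and a "matcher" is a
// named predicate. Neither type exists in Clang.
using NamedPredicate = std::pair<std::string, std::function<bool(int)>>;

struct TraceEntry {
    std::string name;
    bool matched;
};

// Evaluates the predicates in order and stops at the first failure, the way
// a conjunction of matchers short-circuits.
std::vector<TraceEntry> traceAllOf(const std::vector<NamedPredicate>& predicates,
                                   int node)
{
    std::vector<TraceEntry> trace;
    for (const auto& predicate : predicates) {
        const bool matched = predicate.second(node);
        trace.push_back({predicate.first, matched});
        if (!matched)
            break;
    }
    return trace;
}

// Hypothetical three-part matcher used for the demonstration.
std::vector<NamedPredicate> samplePredicates()
{
    return {
        {"atLeastOneParam", [](int n) { return n >= 1; }},
        {"exactlyTwoParams", [](int n) { return n == 2; }},
        {"evenParamCount", [](int n) { return n % 2 == 0; }},
    };
}
```

For a node with three parameters the trace has two entries and ends with the negative result for exactlyTwoParams; the third predicate is never evaluated.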

This feature is enabled by a new interface in the Clang AST Matchers, but the interface is also rich enough to implement some profiling of AST Matchers in the form of a hit counter.

Some matchers (and matcher sub-trees) are slower/more expensive to run than others. For example, running a matcher like `matchesName` on every AST node in a translation unit requires creation of a regular expression object, and comparing the name of each AST node with the regular expression. That may result in slower runtime than trimming the search tree by checking a parameter count first, for example.

Of course, the hit counter does not include timing output, but can give an indication of what might be relevant to change. Comparison of different trees of matchers can then be completed with a full clang-tidy check.
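A toy hit counter in plain C++ shows the kind of indication meant here. This is a sketch under stated assumptions rather than the Clang interface: FunctionInfo, HitCounts and the two match functions are invented names, and std::regex stands in for the regular expression behind matchesName:

```cpp
#include <regex>
#include <string>
#include <vector>

// Toy stand-in for a function declaration; not a Clang type.
struct FunctionInfo {
    std::string name;
    unsigned paramCount;
};

// Counts how often each kind of check ran, in the spirit of a hit counter.
struct HitCounts {
    unsigned nameChecks = 0;
    unsigned paramChecks = 0;
};

// Runs the expensive regular-expression check on every function first.
unsigned matchNameFirst(const std::vector<FunctionInfo>& functions, HitCounts& hits)
{
    const std::regex pattern("^awesome_");
    unsigned matches = 0;
    for (const auto& fn : functions) {
        ++hits.nameChecks;
        if (!std::regex_search(fn.name, pattern))
            continue;
        ++hits.paramChecks;
        if (fn.paramCount == 2)
            ++matches;
    }
    return matches;
}

// Trims the candidates with the cheap parameter-count check first.
unsigned matchParamCountFirst(const std::vector<FunctionInfo>& functions, HitCounts& hits)
{
    const std::regex pattern("^awesome_");
    unsigned matches = 0;
    for (const auto& fn : functions) {
        ++hits.paramChecks;
        if (fn.paramCount != 2)
            continue;
        ++hits.nameChecks;
        if (std::regex_search(fn.name, pattern))
            ++matches;
    }
    return matches;
}

// Hypothetical input data for the comparison.
std::vector<FunctionInfo> sampleFunctions()
{
    return {{"awesome_add", 2}, {"awesome_reset", 0}, {"other", 2}, {"plain", 1}};
}
```

Both orderings find the same single match in the sample data, but the second one runs the regular expression on two functions instead of four. The counts say nothing about wall-clock time, only about where the work goes.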

There is much more to say about both conferences and the tools that I demoed there, but that will be for a future blog post. I hope this tool is useful and helps discover and debug AST Matchers!

Refactor with Clang Tooling at code::dive 2018

January 2, 2019

I delivered a talk about writing a refactoring tool with Clang Tooling at code::dive in November. It was uploaded to YouTube today:

The slides are available here and the code samples are here.

This was a fun talk to deliver as I got to demo some features which had never been seen by anyone before. For people who are already familiar with clang-tidy and clang-query, the interesting content starts about 15 minutes in. There I start to show new features in the clang-query interpreter command line.

The existing clang-query interpreter lacks many features which the replxx library provides, such as syntax highlighting and portable code completion:

It also allows scrolling through results to make a selection:

A really nice feature is value-based code-completion instead of type-based code completion. Existing code completion only completes candidates based on type-compatibility. It recognizes that a parameterCountIs() matcher can be used with a functionDecl() matcher for example. If the matcher already on the command line is sufficiently constrained so that there is only one result, the code completion is able to complete candidates based on that one result node:

Another major problem with clang-query currently is that it is hard to know which parenthesis represents the closing of which matcher. The syntax highlighting of replxx helps with this, along with a brace matching feature I wrote for it:

I’m working on upstreaming those features to replxx and Clang to make them available for everyone, but for now it is possible to experiment with some of the features on my Compiler Explorer instance on ce.steveire.com.

I wrote about the AST-Matcher and Source Location/Source Range discovery features on my blog here since delivering the talk. I also wrote about Composing AST Matchers, which was part of the tips and tricks section of the talk. Over on the Visual C++ blog, I wrote about distributing the refactoring task among computers on the network using Icecream. My blogs on that platform can be seen in the Clang category.

All of that blog content is repeated in the code::dive presentation, but some people prefer to learn from conference videos instead of blogs, so this might help the content reach a larger audience. Let me know if there is more you would like to see about clang-query!

Composing AST Matchers in clang-tidy

November 20, 2018

When creating clang-tidy checks, it is common to extract parts of AST Matcher expressions to local variables. I expanded on this in a previous blog.

auto nonAwesomeFunction = functionDecl(
  unless(matchesName("^::awesome_"))
  );

Finder->addMatcher(
  nonAwesomeFunction.bind("addAwesomePrefix")
  , this);

Finder->addMatcher(
  callExpr(callee(nonAwesomeFunction)).bind("addAwesomePrefix")
  , this);

Use of such variables establishes an emergent extension API for re-use in the checks, or in multiple checks you create which share matcher requirements.

When attempting to match items inside a ForStmt for example, we might encounter the difference in the AST depending on whether braces are used or not.

#include <vector>

void foo()
{
    std::vector<int> vec;
    int c = 0;
    for (int i = 0; i < 100; ++i)
        vec.push_back(i);

    for (int i = 0; i < 100; ++i) {
        vec.push_back(i);
    }
}

In this case, we wish to match the push_back method inside a ForStmt body. The body item might be a CompoundStmt or the CallExpr we wish to match. We can match both cases with the anyOf matcher.

auto pushbackcall = callExpr(callee(functionDecl(hasName("push_back"))));

Finder->addMatcher(
    forStmt(
        hasBody(anyOf(
            pushbackcall.bind("port_call"), 
            compoundStmt(has(pushbackcall.bind("port_call")))
            ))
        )
    , this);

Having to list the pushbackcall twice in the matcher is suboptimal. We can do better by defining a new API function which we can use in AST Matcher expressions:

auto hasIgnoringBraces = [](auto const& Matcher)
{
    return anyOf(
        Matcher, 
        compoundStmt(has(Matcher))
        );
};

With this in hand, we can simplify the original expression:

auto pushbackcall = callExpr(callee(functionDecl(hasName("push_back"))));

Finder->addMatcher(
    forStmt(
        hasBody(hasIgnoringBraces(
            pushbackcall.bind("port_call")
            ))
        ) 
    , this);

This pattern of defining AST Matcher API using a lambda function finds use in other contexts. For example, sometimes we want to find and bind to an AST node if it is present, ignoring its absence if it is not present.

For example, consider wishing to match struct declarations and match a copy constructor if present:

struct A
{
};

struct B
{
    B(B const&);
};

We can match the AST with the anyOf() and anything() matchers.

Finder->addMatcher(
    cxxRecordDecl(anyOf(
        hasMethod(cxxConstructorDecl(isCopyConstructor()).bind("port_method")), 
        anything()
        )).bind("port_record")
    , this);

This can be generalized into an optional() matcher:

auto optional = [](auto const& Matcher)
{
    return anyOf(
        Matcher,
        anything()
        );
};

The anything() matcher matches, well, anything. It can also match nothing because of the fact that a matcher written inside another matcher matches itself.

That is, matchers such as

functionDecl(decl())
functionDecl(namedDecl())
functionDecl(functionDecl())

match ‘trivially’.

If a functionDecl() in fact binds to a method, then the derived type can be used in the matcher:

functionDecl(cxxMethodDecl())

The optional matcher can be used as expected:

Finder->addMatcher(
    cxxRecordDecl(
        optional(
            hasMethod(cxxConstructorDecl(isCopyConstructor()).bind("port_method"))
            )
        ).bind("port_record")
    , this);
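The behavior of this composition can be tried out without Clang in a small toy model. Everything in the sketch below is an assumption made for illustration: Record, Bindings, Matcher and copyConstructor only imitate the shape of AST Matchers, and the real anyOf() and anything() are far more general:

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy model only: these types imitate the shape of AST Matchers, not Clang's API.
struct Record {
    std::string name;
    bool hasCopyConstructor;
};

using Bindings = std::vector<std::string>;
using Matcher = std::function<bool(const Record&, Bindings&)>;

// Matches every record and binds nothing.
Matcher anything()
{
    return [](const Record&, Bindings&) { return true; };
}

// Tries the first matcher and falls back to the second if it fails.
Matcher anyOf(Matcher first, Matcher second)
{
    return [first, second](const Record& record, Bindings& bound) {
        return first(record, bound) || second(record, bound);
    };
}

// Matches records with a copy constructor and records the binding id.
Matcher copyConstructor(std::string id)
{
    return [id](const Record& record, Bindings& bound) {
        if (!record.hasCopyConstructor)
            return false;
        bound.push_back(id);
        return true;
    };
}

// Same composition as in the post: bind if present, still match if absent.
auto optional = [](auto const& inner) {
    return anyOf(inner, anything());
};
```

Applied to a record like B, the optional matcher matches and produces the port_method binding; applied to a record like A, it still matches but binds nothing, which is exactly the "ignore its absence" behavior.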

Yet another problem writers of clang-tidy checks will find is that AST nodes CallExpr and CXXConstructExpr do not share a common base representing the ability to take expressions as arguments. This means that separate matchers are required for calls and constructions.

Again, we can solve this problem generically by creating a composition function:

auto callOrConstruct = [](auto const& Matcher)
{
    return expr(anyOf(
        callExpr(Matcher),
        cxxConstructExpr(Matcher)
        ));
};

which reads as ‘an Expression which is any of a call expression or a construct expression’.

It can be used in place of either in matcher expressions:

Finder->addMatcher(
    callOrConstruct(
        hasArgument(0, integerLiteral().bind("port_literal"))
        )
    , this);

Creating composition functions like this is a very convenient way to simplify and create maintainable matchers in your clang-tidy checks. A recently published RFC on the topic of making clang-tidy checks easier to write proposes some other conveniences which can be implemented in this manner.

API Changes in Clang

September 13, 2018

I’ve started contributing to Clang, in the hope that I can improve the API for tooling. This will eventually mean changes to the C++ API of Clang, the CMake buildsystem, and new features in the tooling. Hopefully I’ll remember to blog about changes I make.

The Department of Redundancy Department

I’ve been implementing custom clang-tidy checks and have become quite familiar with the AST Node API. Because of my background in Qt, I was immediately disoriented by some API inconsistency. Certain API classes had both getStartLoc and getLocStart methods, as well as both getEndLoc and getLocEnd etc. The pairs of methods return the same content, so at least one set of them is redundant.

I’m used to working on stable library APIs, but Clang is different in that it offers no API stability guarantees at all. As an experiment, we staggered the introduction of new API and removal of old API. I ended up replacing the getStartLoc and getLocStart methods with getBeginLoc for consistency with other classes, and replaced getLocEnd with getEndLoc. Both old and new APIs are in the Clang 7.0.0 release, but the old APIs are already removed from Clang master. Users of the old APIs should port to the new ones at the next opportunity as described here.

Wait a minute, Where’s me dump()er?

Clang AST classes have a dump() method which is very useful for debugging. Several tools shipped with Clang are based on dumping AST nodes.

The SourceLocation type also provides a dump() method which outputs the file, line and column corresponding to a location. The long-standing problem with it is that it does not end its output with a newline, so the output gets lost in the noise. This 2013 video tutorial shows the typical developer experience using that dump method. I’ve finally fixed that in Clang, but it did not make it into Clang 7.0.0.

In the same vein, I also added a dump() method to the SourceRange class. This prints out locations in an angle-bracket format which shows only what changed between the beginning and end of the range.
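The idea of printing only what changed can be sketched with a toy model. The exact output of the real dump() is not shown in this post, so the format below (full file:line:col for the beginning, then line:L:C or col:C for the end) is an assumption modeled on the ranges seen in Clang AST dumps. Loc, printDifference and formatRange are hypothetical names for this sketch, not Clang functions.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Toy stand-in for a resolved source location (not clang::SourceLocation).
struct Loc {
    std::string file;
    unsigned line;    // 1-based
    unsigned column;  // 1-based
};

// Print `loc`, omitting whatever it shares with `previous` (if any).
void printDifference(std::ostream& os, const Loc& loc, const Loc* previous) {
    assert(loc.line > 0 && loc.column > 0);
    if (!previous || loc.file != previous->file)
        os << loc.file << ':' << loc.line << ':' << loc.column;
    else if (loc.line != previous->line)
        os << "line:" << loc.line << ':' << loc.column;
    else
        os << "col:" << loc.column;
}

// Format a range as <begin, end>, where end shows only the difference.
std::string formatRange(const Loc& begin, const Loc& end) {
    std::ostringstream os;
    os << '<';
    printDifference(os, begin, nullptr);
    const bool same = begin.file == end.file && begin.line == end.line &&
                      begin.column == end.column;
    if (!same) {
        os << ", ";
        printDifference(os, end, &begin);
    }
    os << '>';
    return os.str();
}
```

For a range within one line of somefile.cpp, from 4:2 to 4:13, this sketch produces `<somefile.cpp:4:2, col:13>`. If the end is on line 6, column 1, it produces `<somefile.cpp:4:2, line:6:1>`.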

Let it bind

When writing clang-tidy checks using AST Matchers, it is common to factor out intermediate variables for re-use or for clarity in the code.

auto valueMethod = cxxMethodDecl(hasName("value"));
Finder->addMatcher(valueMethod.bind("methodDecl"), this);

clang-query has an analogous way to create intermediate matcher variables, but binding to them did not work. As of my recent commit, it is possible to create matcher variables and bind them later in a matcher:

let valueMethod cxxMethodDecl(hasName("value"))
match valueMethod.bind("methodDecl")
match callExpr(callee(valueMethod.bind("methodDecl"))).bind("methodCall")

Preload your Queries

Staying on the same topic, I extended clang-query with a --preload option. This allows starting clang-query with some commands already invoked, and then continuing to use it as a REPL:

bash$ cat cmds.txt
let valueMethod cxxMethodDecl(hasName("value"))

bash$ clang-query --preload cmds.txt somefile.cpp
clang-query> match valueMethod.bind("methodDecl")

Match #1:

somefile.cpp:4:2: note: "methodDecl" binds here
        void value();
        ^~~~~~~~~~~~

1 match.

Previously, it was only possible to run commands up front without also creating a REPL, using the -c option for individual commands or the -f option for a file of commands. The --preload option with the REPL is useful when experimenting with matchers and having to restart clang-query regularly. This happens a lot when modifying code to examine changes to AST nodes.

Enjoy!

Embracing Modern CMake

November 5, 2017

I spoke at the ACCU conference in April 2017 on the topic of Embracing Modern CMake. The talk was very well attended and received, but was unfortunately not recorded at the event. In September I gave the talk again at the Dublin C++ User Group, so that it could be recorded for the internet.

The slides are available here. The intention of the talk was to present a ‘gathered opinion’ about what Modern CMake is and how it should be written. I got a lot of input from CMake users on reddit which informed some of the content of the talk.

Much of the information about how to write Modern CMake is available in the CMake documentation, and there are many presentations these days advocating the use of modern patterns and commands, discouraging use of older commands. Two other talks from this year that I’m aware of and which are popular are:

It’s very pleasing to see so many well-received and informative talks about something that I worked so hard on designing (together with Brad King) and implementing so many years ago.

One of the points which I tried to labor a bit in my talk was just how old ‘Modern’ CMake is. I was recently asked by private email about the origin and definition of the term, so I’ll try to reproduce that information here.

I coined the term “Modern CMake” while preparing for Meeting C++ 2013, where I presented on the topic and the developments in CMake in the preceding years. Unfortunately (this happens to me a lot with CMake), the talk was not recorded, but I wrote a blog post with the slides and content. The slides are no longer on the KDAB website, but can be found here. Already in 2013, the simple example with Qt showed the essence of Modern CMake:


find_package(Qt5Widgets 5.2 REQUIRED)

add_executable(myapp main.cpp)
target_link_libraries(myapp Qt5::Widgets)

Indeed, the first terse attempt at a definition of “Modern CMake” and first public appearance of the term with its current meaning was when I referred to it as approximately “CMake with usage requirements”. That’s when the term gained a capitalized ‘M’ and its current meaning and then started to gain traction.

The first usage I found of “Modern CMake” in private correspondence was March 13 2012 in an email exchange with Alex Neundorf about presenting together on the topic at a KDE conference:

Hi Alex

Are you planning on going to Tallinn for Akademy this year? I was thinking about submitting a talk along the lines of Qt5, KF5, CMake (possibly along the lines of the discussion of ‘modern CMake’ we had before with Clinton, and what KDE CMake files could look like as a result).

I thought maybe we should coordinate so either we don’t submit overlapping proposals, or we can submit a joint talk.

Thanks,

Steve.

The “discussion with Clinton” was probably this thread and the corresponding thread on the cmake mailing list where I started to become involved in what would become Modern CMake over the following years.

The talk was unfortunately not accepted to the conference, but here’s the submission:

Speakers: Stephen Kelly, Alexander Neundorf
Title: CMake in 2012 – Modernizing CMake usage in Qt5 and KDE Frameworks 5
Duration: 45 minutes

KDE Frameworks 5 (KF5) will mark the start of a new chapter in the history of KDE and of the KDE platform. Starting from a desire to make our developments more easy to use by 3rd parties and ‘Qt-only’ developers, the effort to create KF5 is partly one of embracing and extending upstreams to satisfy the needs of the KDE Platform, to enable a broadening of the user base of our technology.

As it is one of our most important upstreams, and as the tool we use to build our software, KDE relies on CMake to provide a high standard of quality and features. Throughout KDE 4 times, KDE has added extensions to CMake which we consider useful to all developers using Qt and C++. To the extent possible, we are adding those features upstream to CMake. Together with those features, we are providing feedback from 6 years of experience with CMake to ensure it continues to deliver an even more awesome build experience for at least the next 6 years. Qt5 and KF5 will work together with CMake in ways that were not possible in KDE 4 times.

The presentation will discuss the various aspects of the KDE buildsystem planned for KF5, both hidden and visible to the developer. These aspects will include the CMake automoc feature, the role of CMake configuration files, and how a target orientated and consistency driven approach could change how CMake will be used in the future.

There is a lot to recognize there in what has since come to pass and become common in Modern CMake usage, in particular the “target orientated and consistency driven approach” which is the core characteristic of Modern CMake.