Last week’s MR Port QRegExp in advancedrename tool had introduced a regression which was caught by the test advancedrename_utest.

About QRegularExpression

QRegularExpression implements Perl-compatible regular expressions. It should be kept in mind that QRegularExpression != QRegExp. There are cases where a pattern can silently fail with QRegularExpression but pass with QRegExp. One of the key differences between these classes is that the result of the match of QRegExp is contained the object itself whereas in QRegularExpression, the result is contained in another object of class QRegularExpressionMatch. This was an important point which I forgot to pay attention to while porting advancedrename tool.

Regression in Advanced Rename tool

The advancedrename tool API is defined in core/utilities/advancedrename/common. Rule::parse() does the regex matching. This is the implementation from master

ParseResults Rule::parse(ParseSettings &settings)
{
    ParseResults parsedResults;
    const QRegExp& reg         = regExp();
    const QString& parseString = settings.parseString;

    int pos = 0;

    while (pos > -1)
    {
        pos = reg.indexIn(parseString, pos);

        if (pos > -1)
        {
            QString result = parseOperation(settings);

            ParseResults::ResultsKey   k(pos, reg.cap(0).count());
            ParseResults::ResultsValue v(reg.cap(0), result);
            parsedResults.addEntry(k, v);
            pos           += reg.matchedLength();
        }
    }

    return parsedResults;
}

As one can see, the regex match is done in the line

...
while (pos > -1)
    {
        pos = reg.indexIn(parseString, pos);
...

is contained in the QRegExp object reg.

The new ported line is:

...
while (pos > -1)
    {
        pos = parseString.indexOf(reg, pos);
 ...

The Rule::parseOperation() implementation also has to have the access to this reg object in-order to operate on the parsed results. For example from core/utilities/advancedrename/parser/options/filepropertiesoption.cpp,

QString FilePropertiesOption::parseOperation(ParseSettings& settings)
{
...
    const QRegExp& reg   = regExp();
    const QString& token = reg.cap(1);
...

When porting this API, I realized that reg object will no longer have the parsed results when it is an instance of QRegularExpression class. My mistake was that I assumed that I can always perform this match again when needed in Rule::parseOperation(), as I had the match string settings.parseString and reg object. So I did this,

QString FilePropertiesOption::parseOperation(ParseSettings& settings)
{
...
    const QRegularExpression& reg   = regExp();
    const QString& token            = reg.match(settings.parseString).captured(1);
...

This approach broke some test settings.parseString where it was for example [ext].[ext].

The string [ext].[ext] has two parts. The correct parsing of the string was like this:

  • [ext] in 1st iteration
  • .[ext] in 2nd iteration

But I got the result [ext] in both the iterations. I initially thought that the regex pattern had broken silently in transition from QRegExp to QRegularExpression. But after some days, I realized that my assumption that I can re-match whenever I want was wrong. Match needs to be done only once in Rule::parse(), because it parses the whole string and has the role of positioning as well, whereas if re-matched the string is matched from the beginning and that is why I got [ext] in both iterations.

I patched the API to also pass the result of the match as a second parameter to Rule::parseOperation().

The patch now looks like this:

core/utilities/advancedrename/common/rule.cpp

ParseResults Rule::parse(ParseSettings &settings)
{
    ParseResults parsedResults;
    const QRegularExpression& reg         = regExp();
    const QString& parseString            = settings.parseString;
    QRegularExpressionMatch                 match;

    int pos = 0;

    while (pos > -1)
    {
        pos = parseString.indexOf(reg, pos, &match);

        if (pos > -1)
        {
            QString result = parseOperation(settings, match);

            ParseResults::ResultsKey   k(pos, match.captured(0).count());
            ParseResults::ResultsValue v(match.captured(0), result);
            parsedResults.addEntry(k, v);
            pos           += match.capturedLength();
        }
    }

    return parsedResults;
}

core/utilities/advancedrename/parser/options/filepropertiesoption.cpp

QString FilePropertiesOption::parseOperation(ParseSettings& settings, const QRegularExpressionMatch &match)
{
...
    const QString& token            = match.captured(1);
...

All tests passed after this and this got merged. It seems a silly mistake now but at the time while working, the API was so big that I did this mistake.😅

Porting QTextCodec

This was trivial to port as can be seen from the MR.

As of now, it is not possible to port these two instances of use in digiKam,

So to keep this code running, we decided to keep using on Core5Compat module for now.

Thanks for reading!