Last week’s MR Port QRegExp in advancedrename tool
had introduced a regression which was caught by the test advancedrename_utest
.
About QRegularExpression
QRegularExpression
implements Perl-compatible regular expressions. It should
be kept in mind that QRegularExpression != QRegExp
. There are cases where a pattern can silently fail
with QRegularExpression
but pass with QRegExp
. One of the key differences between these classes is that the
result of the match of QRegExp
is contained the object itself whereas in QRegularExpression
, the result is
contained in another object of class QRegularExpressionMatch
. This was an important point which I forgot to pay
attention to while porting advancedrename tool.
Regression in Advanced Rename tool
The advancedrename tool API is defined in core/utilities/advancedrename/common. Rule::parse() does the regex matching. This is the implementation from master
ParseResults Rule::parse(ParseSettings &settings)
{
ParseResults parsedResults;
const QRegExp& reg = regExp();
const QString& parseString = settings.parseString;
int pos = 0;
while (pos > -1)
{
pos = reg.indexIn(parseString, pos);
if (pos > -1)
{
QString result = parseOperation(settings);
ParseResults::ResultsKey k(pos, reg.cap(0).count());
ParseResults::ResultsValue v(reg.cap(0), result);
parsedResults.addEntry(k, v);
pos += reg.matchedLength();
}
}
return parsedResults;
}
As one can see, the regex match is done in the line
...
while (pos > -1)
{
pos = reg.indexIn(parseString, pos);
...
is contained in the QRegExp
object reg
.
The new ported line is:
...
while (pos > -1)
{
pos = parseString.indexOf(reg, pos);
...
The Rule::parseOperation()
implementation also has to have the
access to this reg
object in-order to operate on the parsed results. For example from
core/utilities/advancedrename/parser/options/filepropertiesoption.cpp,
QString FilePropertiesOption::parseOperation(ParseSettings& settings)
{
...
const QRegExp& reg = regExp();
const QString& token = reg.cap(1);
...
When porting this API, I realized that reg
object will no longer have the parsed results when it is an instance of
QRegularExpression
class. My mistake was that I assumed that I can always perform this match again when needed
in Rule::parseOperation()
, as I had the match string settings.parseString
and reg
object. So I did this,
QString FilePropertiesOption::parseOperation(ParseSettings& settings)
{
...
const QRegularExpression& reg = regExp();
const QString& token = reg.match(settings.parseString).captured(1);
...
This approach broke some test settings.parseString
where it was for example [ext].[ext]
.
The string [ext].[ext]
has two parts. The correct parsing of the string was like this:
[ext]
in 1st iteration.[ext]
in 2nd iteration
But I got the result [ext]
in both the iterations. I initially thought that the regex pattern had broken silently in
transition from QRegExp
to QRegularExpression
. But after some days, I realized that my assumption that I can
re-match whenever I want was wrong. Match needs to be done only once in Rule::parse()
, because it parses the whole
string and has the role of positioning as well, whereas if re-matched the string is matched from the beginning and
that is why I got [ext]
in both iterations.
I patched the API to also pass the result of the match as a second parameter
to Rule::parseOperation()
.
The patch now looks like this:
core/utilities/advancedrename/common/rule.cpp
ParseResults Rule::parse(ParseSettings &settings)
{
ParseResults parsedResults;
const QRegularExpression& reg = regExp();
const QString& parseString = settings.parseString;
QRegularExpressionMatch match;
int pos = 0;
while (pos > -1)
{
pos = parseString.indexOf(reg, pos, &match);
if (pos > -1)
{
QString result = parseOperation(settings, match);
ParseResults::ResultsKey k(pos, match.captured(0).count());
ParseResults::ResultsValue v(match.captured(0), result);
parsedResults.addEntry(k, v);
pos += match.capturedLength();
}
}
return parsedResults;
}
core/utilities/advancedrename/parser/options/filepropertiesoption.cpp
QString FilePropertiesOption::parseOperation(ParseSettings& settings, const QRegularExpressionMatch &match)
{
...
const QString& token = match.captured(1);
...
All tests passed after this and this got merged. It seems a silly mistake now but at the time while working, the API was so big that I did this mistake.😅
Porting QTextCodec
This was trivial to port as can be seen from the MR.
As of now, it is not possible to port these two instances of use in digiKam,
So to keep this code running, we decided to keep using on Core5Compat module for now.
Thanks for reading!