Be cautious when using the “g” flag in regex

Lately, I got stuck on a very interesting problem. But let’s start from the beginning.

There’s an internal Library application at the company I currently work for. In a nutshell, the app lets employees borrow books the company owns, and my task was to fix a problem with its search functionality. The search generally works, but not always: sometimes it doesn’t show all the results it should. For example, take these two books:

  • “Institutional Investment Management: Equity and Bond Portfolio Strategies and Applications”
  • “Institutional Investors (MIT Press)”

When you type “institutional” in the search box, only the first of these books appears in the results. When you type the whole title of the second book, it appears as it should. However, you never get both of them at the same time. What’s even more interesting, this happened only when searching by a word at the beginning of the titles. For example, if the second book had the word “institutional” in the middle of its title, both books would appear in the search results. Strange.

First, I checked the controller responsible for retrieving books from the database, but it works correctly and, what’s more, it isn’t involved in the search at all. Searching is implemented as filtering on the web page, in an Angular pipe. Pipes in Angular are used to filter, format, or transform the way data is displayed.

The application’s frontend is written in Angular 2, and I have to admit that I had never worked with Angular before. That’s more or less what the code looked like.
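The original snippet is not reproduced here. As a rough, hypothetical reconstruction (the Book interface, the BookSearchPipe class, and the way the pattern is built are all assumptions), the filtering could have looked something like this, written as a plain class so it runs without Angular:

```typescript
// Hypothetical reconstruction: names and shapes are assumptions, not the original code.
interface Book {
  title: string;
}

// In the Angular app this class would carry the @Pipe decorator
// and implement PipeTransform; both are left out to keep the sketch standalone.
class BookSearchPipe {
  transform(books: Book[], term: string): Book[] {
    if (!term) {
      return books;
    }
    // One RegExp object, created once and reused for every title.
    // (Assumes the search term contains no regex metacharacters.)
    const pattern = new RegExp(term, "ig");
    return books.filter((book) => pattern.test(book.title));
  }
}

const libraryBooks: Book[] = [
  { title: "Institutional Investment Management: Equity and Bond Portfolio Strategies and Applications" },
  { title: "Institutional Investors (MIT Press)" },
];

const foundBooks = new BookSearchPipe().transform(libraryBooks, "institutional");

// Only the first title is logged, matching the behaviour described above.
console.log(foundBooks.map((book) => book.title));
```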

At first glance, everything looks fine. There is a regex with the “i” and “g” flags, which stand for case-insensitive matching and global matching respectively. A global match means that it finds all matches rather than stopping after the first one.
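To make the “g” flag concrete, here is a small sketch using made-up text (the sample string and variable names are assumptions, not code from the app). It shows how match() behaves with and without the flag on a single string:

```typescript
// Illustrative string only; not taken from the application.
const sampleText = "Institutional investors study institutional investment";

// Without "g", match() stops at the first occurrence.
const firstMatch = sampleText.match(/institutional/i);

// With "g", match() returns every occurrence found in this one string.
const allMatches = sampleText.match(/institutional/gi);

console.log(firstMatch ? firstMatch[0] : null); // "Institutional"
console.log(allMatches); // ["Institutional", "institutional"]
```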

This piece of code, for example, works correctly and the result is true:
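The snippet itself is not shown here; a minimal stand-in, assuming a hypothetical pattern built from the search term and the second title from the list above, might be:

```typescript
// Hypothetical stand-in for the single check described above.
const secondTitle = "Institutional Investors (MIT Press)";
const freshPattern = new RegExp("institutional", "ig");

// A freshly created RegExp starts searching at index 0, so this succeeds.
const titleMatches = freshPattern.test(secondTitle);

console.log(titleMatches); // true
```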

So the title matches the search criteria.

So why doesn’t the original code work correctly? Why doesn’t it return all matches?

I spent many hours rewriting the code in different ways and debugging it. Finally, my teammate had a look at it and found the bug very quickly. What he did was remove the “g” flag from the regex, and everything started working as expected. But why? Shouldn’t the “g” flag be necessary to find all the occurrences of the searched phrase? Generally yes, but it should be used when you want to find many occurrences in one string, not one occurrence in each of many strings. It’s explained on StackOverflow:

The RegExp object keeps track of the lastIndex where a match occurred, so on subsequent matches it will start from the last used index, instead of 0.
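A small sketch of that behaviour, using shortened titles and hypothetical variable names, shows the stored index moving between calls to test():

```typescript
// Shortened, illustrative titles; variable names are assumptions.
const titleA = "Institutional Investment Management";
const titleB = "Institutional Investors (MIT Press)";

// One shared RegExp object with the "g" flag.
const sharedPattern = /institutional/gi;

// First call: the search starts at index 0 and matches characters 0..12.
const firstResult = sharedPattern.test(titleA);
const indexAfterFirst = sharedPattern.lastIndex;

// Second call: the search starts at index 13 of the NEW string,
// so the word at the very beginning is skipped.
const secondResult = sharedPattern.test(titleB);
const indexAfterSecond = sharedPattern.lastIndex;

console.log(firstResult, indexAfterFirst); // true 13
console.log(secondResult, indexAfterSecond); // false 0 (a failed match resets the index)
```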

That explains why the code fails when the search term sits at the beginning of each book title, yet still works when the term appears further along in the next title, at or after the stored index. So when we reuse the same regex object with the “g” flag for many strings, we should reset its lastIndex to 0 between checks, or simply drop the flag.
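Both fixes can be sketched as follows. This is an illustration under assumptions (the array and variable names are hypothetical), not the exact change made in the application:

```typescript
const bookTitles = [
  "Institutional Investment Management: Equity and Bond Portfolio Strategies and Applications",
  "Institutional Investors (MIT Press)",
];

// Option 1: drop the "g" flag. Without "g" (or "y"), test() ignores
// lastIndex and always searches from the start of the string.
const withoutGlobal = /institutional/i;
const fixedByDroppingFlag = bookTitles.filter((title) => withoutGlobal.test(title));

// Option 2: keep the "g" flag but reset lastIndex before every check.
const withGlobal = /institutional/gi;
const fixedByReset = bookTitles.filter((title) => {
  withGlobal.lastIndex = 0;
  return withGlobal.test(title);
});

console.log(fixedByDroppingFlag.length, fixedByReset.length); // 2 2
```

Dropping the flag is the simpler of the two when all that is needed is a yes/no answer per title.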

The conclusion for me is to read the documentation more carefully in such cases, because things don’t always work the way we think they do. Another important thing is to ask others to take a look at the code if that’s possible. It’s often easier for someone “fresh” to find a bug than for somebody who’s been staring at the code for many hours.
