by G Adams
Last Updated August 10, 2016 08:03 AM

I'm trying to categorize some data in a Google spreadsheet.

- I have "phrases to screen" in column A, TAB "Phrases". For example "blueish house"
- I have "words" in column A, TAB "Words". For example, "blue"
- I have "categories" in column B, TAB "Words". For example, "color"

I created a sample file here: There's a link to a related post too that can help.

I'd like to use regular expressions using the words in column A to match some phrases in A, and then return the appropriate category from B.

For example, I'd like to find "blue" in "blueish house" and return "color". I'd like to find "white" in "great white shoes" and return "color", but not when followed by numbers. Hence, I need to use regular expressions.

I'm using the following formula in column C TAB "phrases". It works, but not with regular expressions. It returns the reg. expressions itself and then can't match the category.

`=arrayformula(vlookup(arrayformula(iferror(regexextract(A2:A8,join("|",Words!$A$1:$A$7)))),Words!A$1:B,2,0))`

I tried to adapt the formula (from the related post) in column D but it's not working.

SO here is what I did - I added a sheet to your doc called SO Test - Aurielle.

Then I made a unique list of the possible categories in column B using:

```
=UNIQUE(Words!B:B)
```

In Column A, I did a JOIN using the regex `AND`

operator which is `|`

and used the formula:

```
=IF(ISTEXT(B2),JOIN("|",FILTER(Words!A:A,Words!B:B=B2)),)
```

Basically the filter restricts it to combine keywords by their category value.

Then in column D I added this formula:

```
=IFERROR(INDEX(A$2:B,MATCH(TRUE,ARRAYFORMULA(REGEXMATCH(C2,INDIRECT("A"&2&":A"&3+COUNTA(B$2:B)))),0),2))
```

Basically what happens here , is if you break it down from the inside out is I'm using ARRAYFORMULA along with REGEXMATCH - which returns true or false depending on which row the value actually exists in.

So I use the word TRUE to be my key for the MATCH formula, then using INDEX, I navigate it to pull in the index row, and one column over thus grabbing the category.

NOTE: I also added an additional INDIRECT formula in there to calculate how many values actually exist in column B so that your formula will dynamically accommodate on the number of rows it needs to..

- Serverfault Help
- Superuser Help
- Ubuntu Help
- Webapps Help
- Webmasters Help
- Programmers Help
- Dba Help
- Drupal Help
- Wordpress Help
- Magento Help
- Joomla Help
- Android Help
- Apple Help
- Game Help
- Gaming Help
- Blender Help
- Ux Help
- Cooking Help
- Photo Help
- Stats Help
- Math Help
- Diy Help
- Gis Help
- Tex Help
- Meta Help
- Electronics Help
- Stackoverflow Help
- Bitcoin Help
- Ethereum Help