REGEX.EXTRACT - The range … can’t be used as a single value.

Question

Level 1

4 points

REGEX.EXTRACT - The range … can’t be used as a single value.

Good day!

I have never before used REGEX.EXTRACT.
I am unfamiliar with Apple/Numbers wildcard strings (regular expressions).

I want to retrieve an array of only the populated cells in a column of numeric values from a separate table - to be displayed using ROW()-1 to step through the array.

Using the named column “L 4” results in an error: REGEX.EXTRACT couldn’t find a match for the regular expression “\d+”.

REGEX.EXTRACT(L 4,"\d+",ROW()−1)

Specifying the range by reference inside column headers & footers results in an error: The range “Table 1::I5:I158” can’t be used as a single value.

REGEX.EXTRACT(Table 1::I5:I158,"\d+",ROW()−1)

To verify the validity of my regular expression string against my data, I tried using the MATCH() function with REGEX().

Using the named column “L 4” as the range, MATCH() returned the column heading (1).

MATCH(REGEX("\d+"),L 4,0)

Specifying the data range by reference, MATCH() retrieved the index number of the first populated data cell (29, a valid result!)

MATCH(REGEX("\d+"),Table 1::I5:I158,0)

Please advise?

Thank you much!

iPad Pro, iPadOS 18

Posted on Feb 10, 2025 2:10 PM

Reply

Answer 1

Top-ranking reply

Badunit

Level 7

30,366 points

Feb 11, 2025 6:40 PM in response to Solsticemoon

REGEX.EXTRACT is simple in theory but the codes needed to extract exactly what you want can be fairly complex.

REGEX.EXTRACT extracts substrings from a longer string. It is a text function. In order to use it to retrieve data from a range of cells, you first need to turn that range of cells into one long string. Then you can extract them one by one. Here is one possible solution:

=1*REGEX.EXTRACT("~"&TEXTJOIN("~",TRUE,Table 1::I$5:I$158)&"~","(?<=~)[0-9,.-]+(?=~)",ROW(cell)−1)

TEXTJOIN concatenates all the cells in that range into one long string with a delimiter between each. I used ~ as the delimiter and I tacked a ~ onto each end of the string so all cells have a ~ on both sides. The TRUE means blank cells are ignored.

\d is the numbers 0-9. If those are the only characters in your numbers then you can use \d+ instead of what I did. I included comma, decimal point, and negative sign as allowable characters.

The code "(?<=~)[0-9,.-]+(?=~)" is looking for a string of the characters 0-9,.- with a ~ on both sides, but not including those ~ characters.

The results from REGEX.EXTRACT are strings, not actual "numbers". I multiplied by 1 to turn them into actual numbers.

Note that a string like 90-34,,56.,7 will be extracted as if it was a number but will end up being an error when it is multiplied by 1.

Reply

Answer 2

Badunit

Level 7

30,366 points

Feb 12, 2025 7:38 AM in response to Solsticemoon

I think every other solution in Numbers requires an extra column to determine (or keep track of) the rows that match your criteria. If the REGEX.EXTRACT approach works for you, use it. If you want another solution, I'm sure we can make you one. I do not know the limits or the speed of computation of using TEXTJOIN and REGEX.EXTRACT together. I doubt it could do a million rows of data, maybe not even tens of thousands, but it works well enough for smaller tables.

One thing I thought of this morning is that if your cells contain numbers or nothing at all (no other text or data, just numbers or nothing), the criteria doesn't need to be [0-9.,-]+ it can be [^~]+ which means anything not a ~.

=1*REGEX.EXTRACT("~"&TEXTJOIN("~",TRUE,Table 1::I$5:I$158)&"~","(?<=~)[^~]+(?=~)",ROW(cell)−1)

One problem I have with REGEX.EXTRACT for numeric data is that it is text-based. For text it is perfectly fine and can do some very complex matching. For numbers and dates and things like that you have to be careful, trying to make sure it will match all your numbers and not something else unintended, and you can't do numeric comparisons (like searching for numbers greater than 5 or dates greater than Jan 1, 2000). Also, and of high importance, the results are the text you see in the cell. If the value in the cell is 100.123 but the cell is formatted to one decimal as 100.1 then the result from REGEX.EXTRACT will be 100.1 exactly, it will not be 100.123.

Search the internet for help on regular expressions. I didn't find it easy to learn. I still don't when it gets complex. Half the "solutions" you find online are wrong or don't actually work, but enough of them do.

Reply

Answer 3

Camelot

Level 9

58,299 points

Feb 11, 2025 9:31 AM in response to Solsticemoon

I think you need to be clearer about what you're trying to do with this data.

Ostensibly, I was able to get your REGEX.EXTRACT to work by amending the parameters. Even though the third parameter says it returns the match-occurrence, I found that using the fourth parameter (capture-group-occurrence) actually works.

=REGEX.EXTRACT(L 4, "\d+",0,ROW()-1)

Maybe I'm too fuzzy about regular expressions to be able to decipher this one.

Reply

Answer 4

Solsticemoon Author

Level 1

4 points

Feb 11, 2025 1:48 PM in response to Solsticemoon

I believe I misinterpreted the support files for REGEX.EXTRACT(). I apologize.

If there is a proper way to collect an array of filtered data from a range of cells, please direct me to the proper support files?

Thank you very much!

Reply

Answer 5

Solsticemoon Author

Level 1

4 points

Feb 12, 2025 6:10 AM in response to Badunit

I very much appreciate the information about regular expressions!

Is there a user-friendly reference website to support the standard?

An external search for “regular expressions” turned up developer-focused literature and “tutorials” rather than comprehensive documentation.

Is there a better solution than REGEX.EXTRACT?

I managed to employ a convoluted find/next logic by stacking MATCH functions, but this is quite cumbersome and I used additional columns.

Thank you very much for all of the information!

Reply

Answer 6

Solsticemoon Author

Level 1

4 points

Feb 12, 2025 10:40 AM in response to Badunit

Thank you so much for sharing your wisdom!

I’ll give it a try.

Yes, I am becoming aware of unnecessary and excessive calculations.

Elsewhere in this Numbers database, I manually created histogram data for a graph by using VLOOKUP to match each minute of an 18 hour span. It works marvelously! However, each time I duplicate the blank sheet, my iPad needs about 4 minutes to recalculate.

Again, I appreciate your time to answer my real questions. Your responses have been very helpful.

Reply

Answer 7

Solsticemoon Author

Level 1

4 points

Feb 11, 2025 1:59 PM in response to Camelot

I am completely unfamiliar with capture groups.

What value should be used to specify the capture-group-occurrence?

Thanks so much for your time!

Reply