ICAEW.com works better with JavaScript enabled.

Taking a Break

Author: Liam Bastick

Published: 20 Dec 2022

This month, we look at a common “clean up” problem in Excel – how to separate text and numbers. Before everyone writes in and tells me “use Power Query”, the point here is to create a formulaic solution which will not need refreshing when data changes.

As we all recognise, often we receive data that isn’t in the best of shape. It may have text and numbers intermingled and you wish to separate (i.e. space) them out. For example, you might want “Pelham123” to be “Pelham 123”, or require “121Training” to be “121 Training”.

Where there is just the one change of data type in the text string, it’s possible to create a reasonably straightforward formula, but things get harder with multiple occurrences, e.g.

bubble chart

This problem has been a regular challenge for Excel users since the dawn of time. Well, 1985 anyway. Dynamic arrays have made the task easier because you can now undertake repetitive checks. Dynamic arrays have made the task easier because you can now undertake repetitive checks. Dynamic arrays have made the task easier because you can now undertake repetitive checks…

OK, the joke’s over.

For this month’s example, you can see my suggested solution is extremely simple:

table

=PROPER(TRIM(CONCAT(IF(INDEX(IF(ISNUMBER(VALUE(MID(F12,SEQUENCE
(1,LEN(F12)),1))),1,),SEQUENCE(1,LEN(F12)))=INDEX(IF(ISNUMBER(VALUE(MID
(F12,SEQUENCE(1,LEN(F12)),1))),1,),SEQUENCE(1,LEN(F12),0)),MID(F12,SEQUENCE
(1,LEN(F12)),1)," "&MID(F12,SEQUENCE(1,LEN(F12)),1)))))

Any questions?

Perhaps we need to step through this monster. In the associated Excel file, the second section steps through the mechanics of this calculation. The first formula is (hopefully) straightforward!

table

=F12

If you need help with this calculation, then the rest of this explanation is not looking good. This simply links the original input into cell P28, so all the formulae are close together. That’s not so exciting.

The next formula is a little more interesting:

excel drop down

=MID(P28,SEQUENCE(1,LEN(P28)),1)

Even in this stepped out approach, I have managed to include three functions in one formula!

1. LEN(P28) is a text function that counts the number of characters (its “volume”) in the text string in cell P28. In the example (above), there are 58 characters, so LEN(P28) equals 58

2. The SEQUENCE(rows, [columns], [start], [step]) function has one required (rows) and three optional arguments. This is one of the new dynamic array functions and fills rows number of rows, columns number of columns (1 if not specified), starting at start (1 if not specified) in increments of step (1 if not specified). Hence, =SEQUENCE(4,3,2,1) would be

excel drop down

It would consist of 12 cells (4 rows by 3 columns), starting with the number 2 and increasing in increments of 1 (as pictured).

Therefore, =SEQUENCE(1,LEN(P28)) would be the numbers 1 to 58, all in one row, across 58 columns

3. Finally, =MID(P28,SEQUENCE(1,LEN(P28)),1) uses the MID(text, start_number, length) function. Here, it takes a sub-text string from the text reference, starting at position start_number of length length. So, =MID(“abcdefg”,2,3) would be “bcd” being three characters starting from the second position.

Therefore, =MID(P28,SEQUENCE(1,LEN(P28)),1) simply splits the text string out into 58 cells across a row, each cell consisting of just one of the 58 characters in the text string. It should be noted that MID converts every element to a text string, i.e. all characters are text – even numbers.

Let’s look at the next two formulae together:

excel drop down with the likelihood options

=VALUE(P30#)

=IF(ISNUMBER(P32#),1,

I want to identify which characters are numbers in the range starting in cell P30 (P30# simply means the spilled dynamic array, so all 58 cells – but this will change if the text length changes). ISNUMBER will do this – but not straight away. Remember, MID has converted everything to text, so first I need to convert numbers back to numbers.
Interestingly, the N function won’t work here (it will “coerce” the range back to one cell), but VALUE will. Values of text characters are #VALUE! by definition.
Then, with this new range in P32#, we can see what is a number and what isn’t. The formula

=IF(ISNUMBER(P32#),1,)

simply provides a one [1] for every numerical value and zero [0] for everything else (including #VALUE! errors).

The next three steps may be considered together:

excel drop down with the likelihood options

=INDEX(P34#,SEQUENCE(1,LEN(P28)))
=INDEX(P34#,SEQUENCE(1,LEN(P28),0))

=P36#=P38#

The expression =SEQUENCE(1,LEN(P28)) has already been explained: this generates the numbers one to 58 in 58 cells across the row. Since P34# contains 58 cells across one row,

=INDEX(P34#,SEQUENCE(1,LEN(P28)))

is a seemingly rather longwinded way of replicating P34# - the first cell provides the first value in the vector P34#, the second cell provides the second value in the vector P34# and so on. It seems convoluted. It’s subtle though: this formula is giving us a way of referencing a particular position in the range P34#. That’s important when we consider the second formula:

=INDEX(P34#,SEQUENCE(1,LEN(P28),0))

Almost the same formula, the start of zero [0] is specified in the SEQUENCE function. When it is not specified (as in the first formula considered in this section), it is assumed to start at one [1]. This has the effect of producing the same result as the first formula, albeit displaced one cell to the right, e.g. the first cell provides the “noughth” value in the vector P34#, the second cell provides the first value in the vector P34#, and so on. In other words, it displays the previous cell result.

The formula

=P36#=P38#

simply checks that the current cell and the previous cell in the range are both text or both numerical values (TRUE). If they differ, the result is FALSE. This is what we want to find. Wherever we have a FALSE value, we need to add a space to separate text and numbers (or vice versa). This is precisely what the next formula does:

excel grid with labels

=IF(INDEX(P34#,SEQUENCE(1,LEN(P28)))=INDEX(P34#,SEQUENCE(1,LEN(P28),0)),P30#," "&P30#)

This formula takes the logic of the above. The formula =P36#=P38# cannot be referred to per se (it has been provided for illustration) as this is not how dynamic arrays work. However, this formula checks for inconsistent data types between the cell and the prior and adds a preceding space if the two data types differ (i.e. the equality is FALSE). For example, the value in cell T42 is not 1, but “ 1” (i.e. it has a preceding space to delineate between text and number).

The final three formulae simply tidy things up:

excel grid

=CONCAT(P42#)
=TRIM(P44)
=PROPER(P46)

The formula in cell P44 simply combines all the cell values back into one cell using the CONCAT (concatenate) function; the formula on cell P46 removes excess spaces using TRIM; the final formula in cell P48 just puts all text into “Proper” case, i.e. each word starts with a capital letter, with all other letters using lower case.

That’s it.

To get the monster formula, you just build each formula into its dependent calculation so the last formula could be expressed as

=PROPER(TRIM(P44))

and the formula before that could be written as

=PROPER(TRIM(CONCAT(P42#)))

If you carry on this approach, this will eventually result in our monster formula, viz.

=PROPER(TRIM(CONCAT(IF(INDEX(IF(ISNUMBER(VALUE(MID(F12,SEQUENCE
(1,LEN(F12)),1))),1,),SEQUENCE(1,LEN(F12)))=INDEX(IF(ISNUMBER(VALUE(MID
(F12,SEQUENCE(1,LEN(F12)),1))),1,),SEQUENCE(1,LEN(F12),0)),MID(F12,SEQUENCE
(1,LEN(F12)),1)," "&MID(F12,SEQUENCE(1,LEN(F12)),1)))))

excel table settings options

Word to the Wise

Dynamic Arrays are presently only available in Microsoft 365 and Excel 2021 for Desktop users. If you have an earlier version of Excel, the attached Excel file will neither display correctly nor calculate appropriately. In this case, you will have to revert to VBA (e.g. as User Defined Function), a third party add-in or Power Query.

For the time being!
Archive and Knowledge Base

This archive of Excel Community content from the ION platform will allow you to read the content of the articles but the functionality on the pages is limited. The ION search box, tags and navigation buttons on the archived pages will not work. Pages will load more slowly than a live website. You may be able to follow links to other articles but if this does not work, please return to the archive search. You can also search our Knowledge Base for access to all articles, new and archived, organised by topic.

Open AddCPD icon

Add Verified CPD Activity

Introducing AddCPD, a new way to record your CPD activities!

Log in to start using the AddCPD tool. Available only to ICAEW members.

Add this page to your CPD activity

Step 1 of 3
Download recorded
Download not recorded

Please download the related document if you wish to add this activity to your record

What time are you claiming for this activity?
Mandatory fields

Add this page to your CPD activity

Step 2 of 3
Mandatory field

Add activity to my record

Step 3 of 3
Mandatory field

Activity added

An error has occurred
Please try again

If the problem persists please contact our helpline on +44 (0)1908 248 250