Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Is there a way in Excel to count the occurrences of specific text within a string but also include previous character?

I have a spreadsheet containing 98 rows of different strings that are sequences of male/female visits recorded using only "M" and "F". I am trying to find the frequency of male followed by male visits in these sequences which would appear in the strings as "MM". However, using the methods I know of in Excel, it currently misses some occurrences because after it has found matching characters it then ignores those and moves onto the next, whereas I need it to always consider the last character. E.g, MMM would first be a male followed by a male, but is again then a male followed by another male. For example, one cell includes "MMMFMMMMM", using the methods I know of in Excel it counts it only as 3 occurrences because once it finds "MM" it then moves onto the rest of the string. But in my scenario this would actually be 6 occurrences of a male followed by a male, as for the sequence I always consider the last visitor. Is there any way to automate this using a formula or even VBA or macro etc in Excel? Please see the example below of some records from my A and B columns that I currently have and the C column that I would like to add for the frequency of MM visits in each sequence:

Day Sequence MM Count
1 MFFFM 0
2 FMMM 2
3 FF 0
4 MMMMM 4
5 M 0
6 MFMMMFF 2

So far I have tried:

For all cells:
=SUM(LEN(B2:B7)-LEN(SUBSTITUTE(B2:B7,"MM","")))/LEN("MM")
=SUMPRODUCT(--(ISNUMBER(FIND("MM", B2:B7))))
=COUNTIF(B2:B7, "*MM*")

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

For each cell:
=LEN(B2)-LEN(SUBSTITUTE(B2,"MM",""))
=MATCH(2,1/(MID(B2,SEQUENCE(LEN(B2)),2)="MM"))

However nearly all of these approaches only count each MM occurrence and then moves onto the next characters. Some of them also only count the cells that contain an occurrence of MM. Is there a method in Excel that can count the occurrences of MM in each cell’s string whilst also considering the previous character in the string? Thanks so much for any input!

>Solution :

Nice question. One option could be:

enter image description here

Formula in C2:

=MAP(B2:B7,LAMBDA(s,SUM(LEN(TEXTSPLIT(s&"FM","F",,1))-1)))

Or, if you know the strings are no longer then 99 chars then avoid LAMBDA() and maybe use something like:

=MMULT(N(MID(B2:B7,SEQUENCE(,99),2)="MM"),ROW(1:99)^0)

If you happen to be forced to use legacy functions, maybe:

=MMULT(N(MID(B2,TRANSPOSE(ROW(1:99)),2)="MM"),ROW(1:99)^0)

And confirm through CtrlShiftEnter

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading