Match substring with regex inside file with bash

May 29, 2023

I have the following text file:

┌─ This is folder description
│
├─ ##### ─────── Magenta 
│
├─ Size ──────── 245.7 GiB
│
├─ URL ───────── https://www.google.com
│
│
├─ Video ─────── 23.976fps
│
├─ Audio ─────┬─ French 
│             │         
│             │
│             └─ German
│
├─ Subtitles ─── French, German, Italian, C#

I want to verify whether French appears between Audio and Subtitles; I don’t care about the rest.

I tried:

grep -E 'Audio(.*)(French|Franz?)(.*)Sub(s|(titles?)|(titulos))' test.txt

But it finds no match. I’m pretty sure I’m using grep incorrectly. Can someone point me to the correct way to do this?

>Solution :

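grep reads one line at a time by default, so a pattern that has to span several lines can never match. With GNU grep you can add -z, which makes it split the input on NUL bytes instead of newlines (so a text file with no NULs is read as a single record), and -o to print only the matched text (I’m guessing at what output you want):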

$ grep -Ezo 'Audio(.*)(French|Franz?)(.*)Sub(s|(titles?)|(titulos))' test.txt
Audio ─────┬─ French
│             │
│             │
│             └─ German
│
├─ Subtitles
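
If the goal is only a yes/no check rather than printing the match, the same command can report its result through the exit status. The following is a minimal sketch under a few assumptions: the function name has_french_audio and the temporary sample file are made up for illustration, and -z still requires GNU grep (or another grep that supports it).

```shell
# Hypothetical helper: exit status 0 if the pattern matches, non-zero otherwise.
# -q suppresses output; -z reads the file as one record so .* can span lines.
has_french_audio() {
    grep -Eqz 'Audio(.*)(French|Franz?)(.*)Sub(s|(titles?)|(titulos))' "$1"
}

# Illustrative sample file mirroring the layout in the question.
sample=$(mktemp)
printf '%s\n' \
    '├─ Audio ─────┬─ French' \
    '│             └─ German' \
    '├─ Subtitles ─── French, German' > "$sample"

if has_french_audio "$sample"; then
    echo "French audio found"
else
    echo "French audio not found"
fi
rm -f "$sample"
```

Because .* is greedy, this only confirms that French occurs somewhere between the first Audio and the last Sub... in the file, which is enough for a file with a single Audio block.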

Alternatively, using any awk:

$ awk '
    { rec = rec $0 ORS }
    END {
        if ( match(rec,/Audio(.*)(French|Franz?)(.*)Sub(s|(titles?)|(titulos))/) ) {
            print substr(rec,RSTART,RLENGTH)
        }
    }
' test.txt
Audio ─────┬─ French
│             │
│             │
│             └─ German
│
├─ Subtitles
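
A stricter variant that avoids loading the whole file into one string is to track, line by line, whether the current line is inside the Audio block. This is a sketch of a different technique from the two commands above, not a rewrite of them: the function name has_french_audio_awk is hypothetical, and it assumes the block starts at a line containing Audio and ends at the next line containing Subtitles. Unlike the regex, it does not require a Subtitles line to follow.

```shell
# Hypothetical helper: exit status 0 if French appears on the Audio line or on
# any line after it, up to (but not including) the Subtitles line.
has_french_audio_awk() {
    awk '
        /Subtitles/ { in_audio = 0 }               # Subtitles line closes the block
        /Audio/     { in_audio = 1 }               # Audio line opens the block
        in_audio && /French/ { found = 1; exit }   # stop at the first hit
        END { exit !found }                        # 0 when found, 1 otherwise
    ' "$1"
}

# Illustrative sample: French is listed only under Subtitles, not under Audio.
sample=$(mktemp)
printf '%s\n' \
    '├─ Audio ─────┬─ Italian' \
    '│             └─ German' \
    '├─ Subtitles ─── French, German' > "$sample"

if has_french_audio_awk "$sample"; then
    echo "French audio found"
else
    echo "French audio not found"
fi
rm -f "$sample"
```

This uses only POSIX awk features, so it should work with any awk, and a French entry that appears only on the Subtitles line is not counted.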