
Bash: Extract URL from markdown format

I have a set of Markdown posts for a Jekyll site, each of which contains a Markdown link. For example:

---
layout: post
title: "The Title"
date: 2022-07-31
categories:
- CategoryX
- CategoryY
author: AuthorName, SecondAuthor
tags: [tag1,tag2,tag3]
---

Some text that might contain (brackets] or other symbols.

[Visit Link](https://www.linkhere.net/somepage){:target="_blank" rel="noopener"}

I’d like to extract just the full URLs from each file in the _posts directory and write them to a new file.

This is my script so far, with my failed attempts commented out:

#!/bin/bash

# configuration
jekyll_post_dir="<jekyll_dir>/_posts"


for file in $jekyll_post_dir/*
do
    #link=$(sed -n -e '/[Visit Link]/,/{:target/p' $file)

    #link=$(sed -n '/[Visit Link]/,/target/{ /html>/d; p }' $file)

    #link=$(awk '/[Visit Link]/,/target/' $file)

    #link=$(sed -n 's/[^{]*\({[^}]*}\).*/\1/g' $file)

    #link=$(sed 's/.*Link](\(.*\))/\1/' $file)

    #link=$(awk -F"[()]" '{print $2}' $file )

    #while IFS="](){" read a b; do echo "$b"; done < $file

    #link=$(sed -n '/\](/,/)\{:/p' $file)

    #echo $link >> linklist.txt

done

All my attempts have either selected unwanted text or failed completely. I am not familiar with regex or similar definitions so I would appreciate some guidance. I’m happy to use any bash-supported solution.

Thanks for reading/helping…

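Solution:

The command below prints the expected URL. The address /:target=/ limits the substitution to lines that contain :target=, and the capture group keeps only the text between ]( and ){:target=. The -r flag enables extended regular expressions in GNU sed.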

sed -nre '/:target=/ s/.*[]][(]([^)]+)[)][{]:target=.*/\1/p' test.txt 

Result

https://www.linkhere.net/somepage

Alternative command, escaping the delimiters with backslashes instead of single-character bracket expressions:

sed -nre '/:target=/ s/.*\]\(([^)]+)\)\{:target=.*/\1/p' test.txt
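Both commands read a single file, while the question asks for every file in the _posts directory and one combined output file. A minimal sketch of that step follows. It assumes a sed that accepts -E (the same as the -r used above on GNU sed), and the function name extract_links and its two arguments are hypothetical, chosen only for illustration:

```shell
#!/bin/bash

# extract_links POSTS_DIR OUT_FILE  (hypothetical helper, not part of the answer)
# Appends the URL from every "[text](url){:target=...}" line found in the
# regular files of POSTS_DIR to OUT_FILE, one URL per line.
extract_links() {
    local posts_dir=$1 out_file=$2 file

    : > "$out_file"                    # start from an empty output file

    for file in "$posts_dir"/*; do
        [ -f "$file" ] || continue     # skip directories and an unmatched glob
        # Same expression as the alternative command above, with -E instead of -r.
        sed -nE '/:target=/ s/.*\]\(([^)]+)\)\{:target=.*/\1/p' "$file" >> "$out_file"
    done
}
```

It could be called as extract_links "<jekyll_dir>/_posts" linklist.txt, with the directory placeholder replaced by the real path. Quoting "$file" and "$posts_dir" keeps file names that contain spaces intact, which the unquoted variables in the question's loop would not.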
