Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Awk: In search for better ways of filtering through date columns

I have the following file called dates.csv

1,2020-01-27,mom,walk1
2,2023-07-27,dad,walk2
3,2020-06-04,uncle jeff,walk2
4,2022-04-26,uncle celso,walk3
5,2020-01-06,mom,walk2
6,2023-01-19,val,walk1
7,2019-12-02,otice,walk2
8,2021-01-20,uncle celso,walk1
9,2021-04-16,dad,walk3
10,2023-10-03,dad,walk2
11,2021-04-16,uncle jeff,walk3
12,2019-10-11,uncle jeff,walk3
13,2023-01-04,dad,walk3
14,2019-08-22,uncle celso,walk2
15,2022-04-27,val,walk3
16,2019-04-01,dad,walk1
17,2021-01-27,uncle jeff,walk2
18,2019-12-02,val,walk2
19,2022-04-20,uncle celso,walk2
20,2022-07-04,uncle jeff,walk3

I am trying to filter dates in a more sophisticated way than I am doing using awk. What I am currently doing is the following. If I want select dates larger than or equal to 2020-06-04 I will run

awk 'BEGIN{FS=OFS=","}$2 >= 20200604{print $2=gensub(/(....)-(..)-(..)/,"\\1\\2\\3",1)}' dates.txt | \
sed 's/,\(....\)\(..\)\(..\)/,\1-\2-\3/1'

which gives

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

2,2023-07-27,dad,walk2
4,2022-04-26,uncle celso,walk3
6,2023-01-19,val,walk1
8,2021-01-20,uncle celso,walk1
9,2021-04-16,dad,walk3
10,2023-10-03,dad,walk2
11,2021-04-16,uncle jeff,walk3
13,2023-01-04,dad,walk3
15,2022-04-27,val,walk3
17,2021-01-27,uncle jeff,walk2
19,2022-04-20,uncle celso,walk2
20,2022-07-04,uncle jeff,walk3

If I want to select dates smaller than or equal to 2020-06-04 I will run

awk 'BEGIN{FS=OFS=","}$2 <= 20200604{print $2=gensub(/(....)-(..)-(..)/,"\\1\\2\\3",1)}' dates.txt | \
sed 's/,\(....\)\(..\)\(..\)/,\1-\2-\3/1'

which results in

1,2020-01-27,mom,walk1
3,2020-06-04,uncle jeff,walk2
5,2020-01-06,mom,walk2
7,2019-12-02,otice,walk2
12,2019-10-11,uncle jeff,walk3
14,2019-08-22,uncle celso,walk2
16,2019-04-01,dad,walk1
18,2019-12-02,val,walk2

I have been trying around with mktime(), strftime() but am not able to get anything to work that is better than what I have got. I appreciate any ideas on this in advance.

>Solution :

You can simplify like this:

awk -v date=20200604 'BEGIN{FS=OFS=","} $2 >= date' file
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading