How to get Perl to match \r in files with Windows EOL characters

I’m trying to write a Perl script to identify files with Windows EOL characters, but matching \r doesn’t seem to work.

Here’s my test script:

#!/usr/bin/perl
use File::Slurp;

$winfile = read_file('windows-newlines.txt');
if($winfile =~ m/\r/) {
  print "winfile has windows newlines!\n"; # I expect to get here
}
else {
  print "winfile has unix newlines!\n"; # But I actually get here
}

$unixfile = read_file('unix-newlines.txt');
if($unixfile =~ m/\r/) {
  print "unixfile has windows newlines!\n";
}
else {
  print "unixfile has unix newlines!\n";
}

Here’s what it outputs:

winfile has unix newlines!
unixfile has unix newlines!

I’m running this on Windows, and I can confirm in Notepad++ that the files definitely have the correct EOL characters:

Screenshot of windows-newlines.txt in Notepad++
Screenshot of unix-newlines.txt in Notepad++

Solution:

Unless the binmode option is set (it isn’t in your code), read_file converts \r\n to \n on Windows, so by the time your regex runs there is no \r left to match. From the File::Slurp source:

# line endings if we're on Windows
${$buf_ref} =~ s/\015\012/\012/g if ${$buf_ref} && $is_win32 && !$opts->{binmode};
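
To see the effect of that substitution in isolation, here is a minimal sketch that applies the same regex to an in-memory string. The sample string and the variable names $raw and $converted are made up for illustration; only the substitution itself comes from the quoted source.

```perl
use strict;
use warnings;

# Sample data standing in for the raw bytes of a Windows text file (assumption).
my $raw = "one\r\ntwo\r\n";

# Apply the same substitution to a copy: \015\012 is CRLF, \012 is LF.
(my $converted = $raw) =~ s/\015\012/\012/g;

printf "raw has \\r: %s\n",       ($raw       =~ /\r/ ? "yes" : "no");
printf "converted has \\r: %s\n", ($converted =~ /\r/ ? "yes" : "no");
```

The converted copy no longer contains any \r, which is why the m/\r/ test in the question falls through to the else branch.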

To keep the original line endings, set binmode, as shown in the documentation:

my $bin = read_file('/path/file', { binmode => ':raw' });
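
If you would rather not depend on File::Slurp, a rough equivalent is to open the file with Perl's built-in :raw layer, which also skips the CRLF translation. The sketch below is an alternative technique, not part of the original answer. The helper names has_crlf and write_sample are hypothetical, and the sample files are generated on the fly so the script runs without the files from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return 1 if the file at $path contains at least one CRLF pair, else 0.
# has_crlf is a hypothetical helper name, not part of File::Slurp.
sub has_crlf {
    my ($path) = @_;
    # ':raw' disables the CRLF-to-LF translation Perl applies on Windows.
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;                  # slurp mode: read the whole file at once
    my $content = <$fh>;
    close($fh);
    return (defined $content && $content =~ /\r\n/) ? 1 : 0;
}

# Write $text to a temporary file byte for byte and return its path.
# write_sample is a hypothetical helper used only to create test data.
sub write_sample {
    my ($text) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode($fh);              # write the bytes exactly as given
    print {$fh} $text;
    close($fh);
    return $path;
}

my $win_path  = write_sample("one\r\ntwo\r\n");
my $unix_path = write_sample("one\ntwo\n");

print "windows sample: ", (has_crlf($win_path)  ? "CRLF" : "LF only"), "\n";
print "unix sample: ",    (has_crlf($unix_path) ? "CRLF" : "LF only"), "\n";
```

Matching on \r\n rather than a bare \r avoids flagging files that contain a stray carriage return in the middle of a line.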