I need to calculate the MD5 hash of a file:
    private string GetMD5(string file)
    {
        using var md5 = MD5.Create();
        using var stream = new StreamReader(file);
        return BitConverter.ToString(md5.ComputeHash(stream.BaseStream)).Replace("-", string.Empty).ToLower();
    }

    private string GetMD5_V2(string file)
    {
        using var md5 = MD5.Create();
        using var stream = new StreamReader(file);
        _ = stream.EndOfStream; // the only difference from GetMD5
        return BitConverter.ToString(md5.ComputeHash(stream.BaseStream)).Replace("-", string.Empty).ToLower();
    }

    private void Test()
    {
        var fichier = "myFile.txt";
        var md5_1 = GetMD5(fichier);
        var md5_2 = GetMD5_V2(fichier);
    }
When I run this code, md5_1 and md5_2 are different. I don't understand why reading the stream.EndOfStream property changes the hash computed from stream.BaseStream.
>Solution :
Querying the EndOfStream property of a freshly initialized StreamReader reads some bytes from the underlying stream. See the source code of this property’s getter:
    public bool EndOfStream
    {
        get
        {
            ThrowIfDisposed();
            CheckAsyncTaskInProgress();
            if (_charPos < _charLen)
            {
                return false;
            }
            // This may block on pipes!
            int numRead = ReadBuffer();
            return numRead == 0;
        }
    }
On a freshly instantiated StreamReader, both _charPos and _charLen are zero, so the EndOfStream getter invokes ReadBuffer().
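The effect of that ReadBuffer() call can be observed directly on the base stream's Position. Here is a minimal sketch, assuming an in-memory stream instead of a file; the class and method names (EndOfStreamDemo, PositionsAroundEndOfStream) are made up for illustration:

```csharp
using System;
using System.IO;

static class EndOfStreamDemo
{
    // Returns the base stream's position before and after querying EndOfStream
    // on a StreamReader that has not read anything yet.
    public static (long Before, long After) PositionsAroundEndOfStream(byte[] content)
    {
        using var stream = new MemoryStream(content);
        using var reader = new StreamReader(stream);

        long before = stream.Position; // constructing the reader reads nothing
        _ = reader.EndOfStream;        // fills the reader's buffer from the base stream
        long after = stream.Position;  // the base stream has been advanced

        return (before, after);
    }
}
```

For non-empty content, After should be greater than Before, even though no characters were requested from the reader explicitly.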
ReadBuffer() reads from the underlying stream and thereby advances that stream's position. The MD5 instance then consumes only the bytes remaining after the advanced position, which yields a different hash than one computed over the entire stream.
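One way to avoid the problem is to not involve a StreamReader at all and hash a plain file stream. The sketch below is only a suggestion, not the only possible fix; FileHashing, GetMD5FromReader and ToLowerHex are hypothetical names, and the second method assumes the reader's base stream is seekable:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class FileHashing
{
    // Hashes the raw bytes of the file. No StreamReader is involved, so nothing
    // can consume bytes before ComputeHash sees them.
    public static string GetMD5(string file)
    {
        using var md5 = MD5.Create();
        using var stream = File.OpenRead(file);
        return ToLowerHex(md5.ComputeHash(stream));
    }

    // If a StreamReader has already been used, rewind its base stream first.
    // Assumption: the base stream supports seeking (a FileStream does); a
    // non-seekable stream throws NotSupportedException here.
    public static string GetMD5FromReader(StreamReader reader)
    {
        reader.BaseStream.Position = 0;
        using var md5 = MD5.Create();
        return ToLowerHex(md5.ComputeHash(reader.BaseStream));
    }

    private static string ToLowerHex(byte[] hash) =>
        BitConverter.ToString(hash).Replace("-", string.Empty).ToLowerInvariant();
}
```

With this, FileHashing.GetMD5("myFile.txt") hashes the whole file regardless of what any reader did earlier. Note that after GetMD5FromReader the reader's internal buffer no longer matches the stream position, so the reader should not be used for further text reads without calling DiscardBufferedData().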