Efficient way to remove nulls and do a find-replace on a list of custom objects

I have a large-ish data set (26,000 records) that I get from an API.
In one pass I filter out the records whose value for a specific field is null or empty.
In a second pass I transform the value of that same field with a foreach loop, as in the code below:

$data = $data | Where-Object { $_.myField -ne $null -and $_.myField -ne '' }

foreach ($item in $data) {
    $item.myField = $item.myField.Replace("ABC_", "")
}

I just wanted to see if there is a more efficient way of doing this. I am on PowerShell 5.0, so I don't have ForEach-Object -Parallel and things like that.

Sample Input Data:

$data = @(
    [PSCustomObject]@{Id=1; myField='2345';    Car = $null},
    [PSCustomObject]@{Id=2; myField='ABC_123'; Car = 'Pagani'},
    [PSCustomObject]@{Id=2; myField= $null; Car = 'Pagani'},
    [PSCustomObject]@{Id=2; myField= ''; Car = 'Pagani'},
    [PSCustomObject]@{Id=3; myField='ABC_456'; Car = 'KIA'}
)

Sample Expected Result:

$data = @(
    [PSCustomObject]@{Id=1; myField='2345';    Car = $null},
    [PSCustomObject]@{Id=2; myField='123'; Car = 'Pagani'},    
    [PSCustomObject]@{Id=3; myField='456'; Car = 'KIA'}
)

Solution:

ForEach-Object -Parallel wouldn't help you here anyway. What you can do is reduce the two passes to one. Since the collection is already in memory, a foreach statement is generally the fastest way to iterate over it in PowerShell:

$data = foreach ($item in $data) {
    # handles the 3 possibilities (null, '' and '  ')
    if ([string]::IsNullOrWhiteSpace($item.myField)) {
        continue
    }

    $item.myField = $item.myField.Replace('ABC_', '')
    $item
}
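
One detail worth checking: IsNullOrWhiteSpace also drops whitespace-only values, which the original Where-Object filter kept. If that difference matters, a variant using IsNullOrEmpty should match the original filter exactly. The sketch below runs it against the sample input, plus one whitespace-only record (Id 4) that is not in the question and is added here purely for illustration:

```powershell
# Sample records from the question, plus one hypothetical whitespace-only record.
$data = @(
    [PSCustomObject]@{Id=1; myField='2345';    Car=$null},
    [PSCustomObject]@{Id=2; myField='ABC_123'; Car='Pagani'},
    [PSCustomObject]@{Id=2; myField=$null;     Car='Pagani'},
    [PSCustomObject]@{Id=2; myField='';        Car='Pagani'},
    [PSCustomObject]@{Id=3; myField='ABC_456'; Car='KIA'},
    [PSCustomObject]@{Id=4; myField='  ';      Car='KIA'}   # assumption: not in the original data
)

$data = foreach ($item in $data) {
    # IsNullOrEmpty skips only $null and '', so the '  ' record is kept
    if ([string]::IsNullOrEmpty($item.myField)) {
        continue
    }

    $item.myField = $item.myField.Replace('ABC_', '')
    $item
}

$data | Format-Table Id, myField, Car
```

If only one record can survive the filter, wrapping the loop as @(foreach ...) keeps $data an array.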

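I haven't tested this, but a compiled regex instance might be faster than String.Replace, so it is probably worth a try. Note that the pattern ^ABC_ removes the prefix only at the start of the string, whereas Replace('ABC_', '') removes every occurrence: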

$re = [regex]::new('^ABC_', [System.Text.RegularExpressions.RegexOptions]::Compiled)
$data = foreach ($item in $data) {
    if ([string]::IsNullOrWhiteSpace($item.myField)) {
        continue
    }

    $item.myField = $re.Replace($item.myField, '')
    $item
}
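
Since the regex variant is untested, one way to compare the two is to time both with Measure-Command on synthetic data of the same size. This is only a rough sketch: New-SampleData is a hypothetical helper written for this example, the record mix (a quarter null, a quarter empty, a quarter prefixed, a quarter plain) is an assumption, and real timings will depend on the actual data and machine.

```powershell
# Hypothetical helper: builds synthetic records shaped like the sample input.
function New-SampleData {
    param([int]$Count = 26000)

    foreach ($i in 1..$Count) {
        # Cycle through the four kinds of value seen in the question.
        $value = switch ($i % 4) {
            0 { $null }
            1 { '' }
            2 { "ABC_$i" }
            3 { "$i" }
        }
        [PSCustomObject]@{ Id = $i; myField = $value; Car = 'KIA' }
    }
}

$re = [regex]::new('^ABC_', [System.Text.RegularExpressions.RegexOptions]::Compiled)

# Each variant gets its own copy because the loops modify the objects in place.
$dataA = @(New-SampleData)
$dataB = @(New-SampleData)

$timeReplace = Measure-Command {
    $resultA = foreach ($item in $dataA) {
        if ([string]::IsNullOrWhiteSpace($item.myField)) { continue }
        $item.myField = $item.myField.Replace('ABC_', '')
        $item
    }
}

$timeRegex = Measure-Command {
    $resultB = foreach ($item in $dataB) {
        if ([string]::IsNullOrWhiteSpace($item.myField)) { continue }
        $item.myField = $re.Replace($item.myField, '')
        $item
    }
}

'String.Replace: {0:N1} ms' -f $timeReplace.TotalMilliseconds
'Compiled regex: {0:N1} ms' -f $timeRegex.TotalMilliseconds
```

Running the comparison several times and ignoring the first run gives a fairer picture, because the compiled regex pays a one-time compilation cost.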