Efficient way to remove nulls and do a find-replace on a list of custom objects

I have a large-ish data set (26,000 records) that I get from an API.
In one pass I filter out the records where a specific field is null or empty.
In a second pass I transform the value of that same field with a foreach loop, as in the code below:

$data = $data | Where-Object { $_.myField -ne $null -and $_.myField -ne '' }

foreach ($item in $data) {
    $item.myField = $item.myField.Replace("ABC_", "")
}

I just wanted to see whether there is a more efficient way of doing this. I am on PowerShell 5.0, so I don't have ForEach-Object -Parallel and things like that.

Sample Input Data:

$data = @(
    [PSCustomObject]@{Id=1; myField='2345';    Car = $null},
    [PSCustomObject]@{Id=2; myField='ABC_123'; Car = 'Pagani'},
    [PSCustomObject]@{Id=2; myField= $null; Car = 'Pagani'},
    [PSCustomObject]@{Id=2; myField= ''; Car = 'Pagani'},
    [PSCustomObject]@{Id=3; myField='ABC_456'; Car = 'KIA'}
)

Sample Expected Result:

$data = @(
    [PSCustomObject]@{Id=1; myField='2345';    Car = $null},
    [PSCustomObject]@{Id=2; myField='123';     Car = 'Pagani'},
    [PSCustomObject]@{Id=3; myField='456'; Car = 'KIA'}
)

Solution:

ForEach-Object -Parallel wouldn't help you here. What you can do is reduce the two loops to one. Since the collection is already in memory, a foreach statement is generally the fastest way to enumerate it in PowerShell:

$data = foreach ($item in $data) {
    # handles the 3 possibilities (null, '' and '  ')
    if ([string]::IsNullOrWhiteSpace($item.myField)) {
        continue
    }

    $item.myField = $item.myField.Replace('ABC_', '')
    $item
}
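
As a quick check, the single-pass loop can be run against the sample input from the question. This is only a sketch: the `$result` variable and the `@( )` wrapper are additions that do not appear in the original answer. The wrapper is there because, without it, the assignment yields `$null` or a single object (rather than an array) when zero or one records survive the filter.

```powershell
# Sample input copied from the question.
$data = @(
    [PSCustomObject]@{Id=1; myField='2345';    Car = $null},
    [PSCustomObject]@{Id=2; myField='ABC_123'; Car = 'Pagani'},
    [PSCustomObject]@{Id=2; myField= $null;    Car = 'Pagani'},
    [PSCustomObject]@{Id=2; myField= '';       Car = 'Pagani'},
    [PSCustomObject]@{Id=3; myField='ABC_456'; Car = 'KIA'}
)

# @( ) keeps the result an array even if zero or one records pass the filter.
$result = @(foreach ($item in $data) {
    # Skip records whose field is null, empty, or whitespace only.
    if ([string]::IsNullOrWhiteSpace($item.myField)) {
        continue
    }

    $item.myField = $item.myField.Replace('ABC_', '')
    $item
})

$result.Count                # 3
$result.myField -join ','    # 2345,123,456
```

Note that the loop modifies the original objects in place, so the surviving records in `$data` carry the stripped values as well.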

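I haven't tested this, but a compiled regex instance might be faster than String.Replace, so it is probably worth a try. Note that the anchored pattern ^ABC_ only removes the prefix at the start of the value, whereas Replace('ABC_', '') removes every occurrence: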

$re = [regex]::new('^ABC_', [System.Text.RegularExpressions.RegexOptions]::Compiled)
$data = foreach ($item in $data) {
    if ([string]::IsNullOrWhiteSpace($item.myField)) {
        continue
    }

    $item.myField = $re.Replace($item.myField, '')
    $item
}
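
Since the regex variant is untested, one way to compare the two is to time them with `Measure-Command` on synthetic data of about the same size. The following is a rough sketch, not a rigorous benchmark: `New-TestData` is a hypothetical helper, the generated values are made up, and the timings will vary by machine and run.

```powershell
# Hypothetical helper: builds synthetic records shaped like the question's sample data.
function New-TestData {
    param([int]$Count = 26000)
    foreach ($i in 1..$Count) {
        # Cycle through an empty field, a prefixed value, and a plain value.
        $value = switch ($i % 3) {
            0 { '' }
            1 { "ABC_$i" }
            2 { "$i" }
        }
        [PSCustomObject]@{ Id = $i; myField = $value; Car = 'KIA' }
    }
}

$re = [regex]::new('^ABC_', [System.Text.RegularExpressions.RegexOptions]::Compiled)

# Both loops modify the objects in place, so each one gets its own fresh copy.
$forReplace = @(New-TestData)
$forRegex   = @(New-TestData)

$replaceTime = Measure-Command {
    foreach ($item in $forReplace) {
        if ([string]::IsNullOrWhiteSpace($item.myField)) { continue }
        $item.myField = $item.myField.Replace('ABC_', '')
        $item
    }
}

$regexTime = Measure-Command {
    foreach ($item in $forRegex) {
        if ([string]::IsNullOrWhiteSpace($item.myField)) { continue }
        $item.myField = $re.Replace($item.myField, '')
        $item
    }
}

'String.Replace: {0:N1} ms' -f $replaceTime.TotalMilliseconds
'Compiled regex: {0:N1} ms' -f $regexTime.TotalMilliseconds
```

Running the comparison several times and looking at the spread gives a more reliable picture than a single measurement.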
