Find similar words in an array and eliminate them

$a[] = "paris";
$a[] = "london";
$a[] = "paris";
$a[] = "london tour";
$a[] = "london tours";
$a[] = "london";
$a[] = "londonn";

foreach($a as $name) {

echo $name;
echo '<br>';

}

Output:

paris
london
paris
london tour
london tours
london
londonn

I can eliminate identical words with array_unique:

foreach(array_unique($a) as $name) {

echo $name;
echo '<br>';

}

Output:

paris
london
london tour
london tours
londonn

I want to take this further and eliminate similar words. Like, if there is a "london", I want to eliminate "londonn".

So the output will be:

paris
london
london tour

I tried similar_text($name, $name, $percent) but it did not help.

Here is what I tried with my limited knowledge:

foreach(array_unique($a) as $name) {

$test = $a;
foreach($test as $test1) {

 similar_text($name, $test1, $percent);
if ($percent > 90) {
echo $name;
echo '<br>';
} 

}
}

Output:

paris
paris
london
london
london
london tour
london tour
london tours
london tours
londonn
londonn
londonn

The source of the words is a search list:

$a[] = "$popular_search";

Solution:

The main problem seems to be the way you use the two nested loops. Here’s a very explicit example, without anything fancy, showing how you could do this:

$a[] = "paris";
$a[] = "london";
$a[] = "paris";
$a[] = "london tour";
$a[] = "london tours";
$a[] = "london";
$a[] = "londonn";

$b = [];
foreach($a as $outerName) {
    // start optimistic, no similar string found
    $isUnique = true;
    foreach($b as $innerName) {
        // check whether the string already has a similar entry
        similar_text($outerName, $innerName, $percent);
        if ($percent > 90) {
            $isUnique = false;
            break;
        }
    }
    if ($isUnique) {
        $b[] = $outerName;
    }
}

print_r($b);

The output is:

Array
(
    [0] => paris
    [1] => london
    [2] => london tour
)
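To see why a threshold of 90 drops "londonn" and "london tours" but keeps "london tour", it can help to print the percentages that similar_text reports for a few of these pairs. This is only a small sketch; the $pairs array is made up for illustration and is not part of the original code.

```php
<?php
// Illustrative pairs taken from the example data above.
$pairs = [
    ["london", "londonn"],
    ["london tour", "london tours"],
    ["london", "london tour"],
];

foreach ($pairs as [$first, $second]) {
    // similar_text() writes the similarity percentage into $percent.
    similar_text($first, $second, $percent);
    echo $first, " / ", $second, ": ", round($percent, 2), "%\n";
}

// Prints:
// london / londonn: 92.31%
// london tour / london tours: 95.65%
// london / london tour: 70.59%
```

The first two pairs are above 90, so the second string of each pair is treated as a near-duplicate. The last pair is well below 90, so both "london" and "london tour" survive.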

How does it work? The outer loop goes through every string in array $a. For each one, the inner loop goes through the strings in $b, which holds the strings already accepted as unique enough. If the string from $a is more than 90% similar to any string in $b, it is skipped; otherwise it is appended to $b. That’s all.
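If you need this in more than one place, the same loop could be wrapped in a function with the threshold as a parameter. The following is a minimal sketch of that idea; the function name filterSimilar and the $threshold parameter are hypothetical names chosen here, not something PHP provides.

```php
<?php
/**
 * Hypothetical helper: keeps the first occurrence of each string and drops
 * any later string that is more than $threshold percent similar to one
 * that has already been kept.
 */
function filterSimilar(array $words, float $threshold = 90.0): array
{
    $kept = [];
    foreach ($words as $candidate) {
        // Start optimistic: no similar string found yet.
        $isUnique = true;
        foreach ($kept as $existing) {
            similar_text($candidate, $existing, $percent);
            if ($percent > $threshold) {
                $isUnique = false;
                break;
            }
        }
        if ($isUnique) {
            $kept[] = $candidate;
        }
    }
    return $kept;
}

$words = ["paris", "london", "paris", "london tour", "london tours", "london", "londonn"];
print_r(filterSimilar($words));
```

Note that the result depends on the order of the input: the first string of a similar group is the one that is kept, so if "londonn" came before "london" in the array, "londonn" would stay and "london" would be dropped.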
