I have a list of blacklisted domains in an array:
var exclusion = ["facebook.com","instagram.com","twitter.com","youtube.com","linkedin.com","google.com","wordpress.org","pinterest.com","plus.google.com","miit.gov.cn","whatsapp.com","apple.com","goo.gl","qq.com","policies.google.com","youtu.be","microsoft.com","maps.google.com","play.google.com","wa.me","accounts.google.com","github.com","en.wikipedia.org","support.google.com"]
I will be given a single URL, such as one of these, to test against the exclusion list:
https://www.facebook.com
http://www.facebook.com
https://facebook.com
http://facebook.com
http://facebook.com?login=true
http://facebook.com/?login=true
instagram.com
Hence
for(var i=0;i<exclusion.length;i++)
{
if("https://www.facebook.com".indexOf(exclusion[i]) == 0)
return true;
}
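is an unreliable and inefficient technique,
since the list holds domain names while the given strings are URLs: once a scheme or a "www." prefix is present, the domain no longer starts at index 0.
How do I write a function that returns true if the specified URL belongs to one of the listed domains?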
>Solution :
One way to do this is to strip the extraneous parts of the URL off (preferably using the URL API, but if that is not possible, using a regex) and then test whether the result is in the exclusions array:
const exclusion = ["facebook.com","instagram.com","twitter.com","youtube.com","linkedin.com","google.com","wordpress.org","pinterest.com","plus.google.com","miit.gov.cn","whatsapp.com","apple.com","goo.gl","qq.com","policies.google.com","youtu.be","microsoft.com","maps.google.com","play.google.com","wa.me","accounts.google.com","github.com","en.wikipedia.org","support.google.com"]
const tests = ['https://www.facebook.com','http://www.facebook.com','https://facebook.com','http://facebook.com','http://facebook.com?login=true','http://facebook.com/?login=true','instagram.com']
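// Scheme and "www." are both optional; the captured host stops at the first "/", "?", "#" or ":" (port).
const urlRegex = /^(?:https?:\/\/)?(?:www\.)?([^/?#:]+).*$/i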
const blacklistedURL = (url) => exclusion.includes(url.replace(urlRegex, '$1'))
tests.forEach(url => {
if (blacklistedURL(url)) {
console.log(`${url} is blacklisted!`)
}
})
Note that I've used a trivial URL-matching regex in the code for demonstration purposes. There are many sources of more robust regexes for matching URLs, and you should use one of them.
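For the URL API route mentioned at the start of this answer, a minimal sketch might look like the following. The helper names hostnameOf and isBlacklisted are made up for illustration, the exclusion list is shortened, and prepending "http://" to scheme-less input is an assumption about how strings like "instagram.com" should be treated.

```javascript
// Shortened list for illustration; the full exclusion array works the same way.
const exclusion = ["facebook.com", "instagram.com", "twitter.com", "google.com"];

// hostnameOf is a hypothetical helper, not part of the original answer.
function hostnameOf(input) {
  // new URL() throws on scheme-less input such as "instagram.com",
  // so prepend a scheme when none is present (an assumption about the inputs).
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  const withScheme = hasScheme ? input : `http://${input}`;
  try {
    // hostname is lower-cased by the parser and excludes port, path, query and fragment.
    return new URL(withScheme).hostname.replace(/^www\./, "");
  } catch (e) {
    return null; // not parseable as a URL
  }
}

// True only when the extracted host exactly equals an entry in the list.
const isBlacklisted = (url) => {
  const host = hostnameOf(url);
  return host !== null && exclusion.includes(host);
};

const tests = ["https://www.facebook.com", "http://facebook.com?login=true", "instagram.com", "https://example.com"];
tests.forEach((url) => {
  if (isBlacklisted(url)) {
    console.log(`${url} is blacklisted!`);
  }
});
```

This is an exact-match check, like the regex version: a subdomain such as mail.google.com is only flagged if it appears in the list itself.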