How Web Filtering Works

Internet security has become an umbrella term covering everything from intrusion detection and antivirus to Internet usage monitoring and filtering. This article covers the key concepts of Internet filtering and, more specifically, Web Filtering.

Leading Internet monitoring and filtering products combine several employee Internet management capabilities, Web Filtering among them. Web Filtering blocks access to Web pages based on how their content is classified. It is typically done by contextual word analysis, flesh tone analysis, maintenance of a database of categorized Web sites, or a combination of all three. Contextual word analysis checks the context in which a word is used (e.g. sex as a verb versus sex as an adjective). Flesh tone analysis looks for images with a large share of flesh colors, which suggests a higher probability of nudity. These two techniques produce the highest rates of false positives and therefore tend to over-filter, or over-block.
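To make the over-blocking problem concrete, here is a minimal sketch in Python that compares plain substring matching with a crude form of contextual word analysis. The term lists, function names and window size are assumptions chosen for illustration; they are not taken from any real filtering product.

```python
import re

# Hypothetical term lists, chosen only for illustration.
BLOCKED_TERMS = {"sex"}
BENIGN_CONTEXT = {"education", "gender", "chromosome", "discrimination"}


def naive_match(text: str) -> bool:
    """Flag the text if any blocked term appears anywhere as a substring."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def contextual_match(text: str, window: int = 3) -> bool:
    """Flag a blocked word only when no benign word occurs within
    `window` tokens on either side of it."""
    tokens = re.findall(r"[a-z]+", text.lower())
    for i, token in enumerate(tokens):
        if token in BLOCKED_TERMS:
            nearby = tokens[max(0, i - window): i + window + 1]
            if not BENIGN_CONTEXT.intersection(nearby):
                return True
    return False
```

With these lists, naive_match flags "Essex County Council" and "Sex discrimination law", while contextual_match lets both through. A page about "sex determination in reptiles" is still flagged by both functions, because "determination" is not in the benign list. Gaps like that are where the false positives come from.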

The most prevalent and accurate form of Web Filtering relies on a database of categorized Web sites. When the list of categorized Web addresses is comprehensive and accurate, this is a powerful approach. Internet monitoring and filtering software companies build their databases by having their computers crawl the Web and apply custom logic to identify and categorize content, much as search engines like Google and Yahoo! crawl the Web for indexing purposes. A Web Filtering list should be built with a blend of three analyses: category-based content analysis (what is the topic of the site’s pages?), link analysis (which sites lead to and from the site under review?) and domain analysis (who owns the site, and what other sites fall under the same owner’s purview?). Each Internet monitoring and filtering software provider creates its own rules for categorizing Web content. Because an artificial intelligence approach is not foolproof, quality Web Filtering software incorporates human review for sites that cannot be reliably evaluated and classified in an automated fashion. The Web Filter database should be maintained by the Internet filtering software provider, and periodic updates to the database should be distributed to customers automatically over the Internet.
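The blended approach with a human fallback can be sketched as a simple vote. This is only an illustration: the function name, the "two signals must agree" rule and the category labels are assumptions, and each provider applies its own, usually far more elaborate, rules.

```python
from collections import Counter
from typing import Optional, Tuple


def classify_site(content_cat: Optional[str],
                  link_cat: Optional[str],
                  domain_cat: Optional[str]) -> Tuple[str, bool]:
    """Combine the three analyses into (category, needs_human_review).

    Each argument is the category suggested by one analysis, or None
    if that analysis could not decide.
    """
    votes = Counter(c for c in (content_cat, link_cat, domain_cat)
                    if c is not None)
    if not votes:
        # No automated signal at all: leave it to a human reviewer.
        return ("uncategorized", True)
    category, count = votes.most_common(1)[0]
    # Assumed rule: accept automatically only when at least two
    # analyses agree; otherwise keep a tentative category and flag it.
    return (category, count < 2)
```

For example, classify_site("gambling", "gambling", "business") is accepted automatically as "gambling", whereas classify_site("news", None, "shopping") comes back flagged for human review because no two analyses agree.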

When evaluating a Web Filter database, be leery of size claims. Web filter providers may try to impress you with a large number of Web site categories that you can purchase to configure your access rules, and may boast of an inordinate number of Web sites included in their databases. These counts are not directly comparable. For example, the pearlsoftware.com Web site has roughly 500 pages and can be categorized as a business-to-business Web site. Some Internet filtering software providers count pearlsoftware.com as a single entry in their database, because every rule that applies to the root Web site, pearlsoftware.com, also applies to all pages contained within the site (e.g. pearlsoftware.com/about/index.html). Other Web filter providers may store each page as a separate entry and thus, for a site like this one, boast 500 times as many entries as their competitors without covering anything more. The 80-20 rule is a good rule of thumb to use when evaluating Web filtering capabilities: 80 percent of your employees will visit the most popular 20 percent of Web sites in each category. Most Internet monitoring and filtering software providers will have that 20 percent covered and are competing at the edge to categorize less popular sites visited by fewer people.
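The difference between counting sites and counting pages can be shown with a small Python sketch. The dictionary layout, the lookup_category function and the page file names are assumptions made up for illustration; only the pearlsoftware.com domain, its category and the 500-page figure come from the example above.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

# Hypothetical site-level database: one entry covers the whole site.
SITE_DB = {"pearlsoftware.com": "business-to-business"}

# Hypothetical page-level database: one entry per page (made-up file names).
PAGE_DB = {f"pearlsoftware.com/page{i}.html": "business-to-business"
           for i in range(500)}


def lookup_category(url: str, db: Dict[str, str]) -> Optional[str]:
    """Return the category of the site a URL belongs to, or None.

    The rule stored for a root site applies to every page and
    subdomain beneath it, so only the host name is consulted.
    """
    # urlsplit only recognizes the host when a scheme or "//" is present.
    host = urlsplit(url if "://" in url else "//" + url).hostname or ""
    labels = host.split(".")
    # Try the full host first, then each parent domain in turn.
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in db:
            return db[candidate]
    return None
```

Here lookup_category("pearlsoftware.com/about/index.html", SITE_DB) returns "business-to-business" from a single entry, while len(PAGE_DB) // len(SITE_DB) is 500 even though both databases describe the same site.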
