Content Filtering

ACL Content Filtering with DansGuardian

Table of Contents:

  • Content Filtering

  • Our Configuration

    • (a) authentication
    • (b) DansGuardian
    • (c) caching
  • Execution Script

    • Squid
    • DansGuardian

We had a requirement to support content filtering, but we still needed the overview provided by the squid analysis tools, which was not available from our content filtering program (DansGuardian).

These notes are intended to be used together with the existing notes on configuring squid. Information already covered in those notes will not necessarily be reviewed here. If you have not installed a working version of squid, I recommend working through those notes first and making sure you can get a standard install of squid working before delving further into these notes.

Content Filtering

What is it?

People seem to take extreme views on the legitimacy of content filtering of the Internet. Nonetheless, if you need to restrict web access in your environment, DansGuardian is a good Open Source program for the task. If you don’t share our other requirement, you can simply install DansGuardian on its own.

Essentially, content filtering is the ability to determine access privileges to websites and web pages based on the ‘content’ of the page, rather than on the URL (host and links) of the page.

Our additional requirement, as hinted above, was to retain knowledge of who was accessing what on the Internet and to limit access to the Internet to legitimate users of our service (i.e. authentication).

Our Configuration:

We have/had an OpenBSD server as our gateway and caching box between the WAN and the Internet.

The proposal is to send all connections to an Authentication Proxy, from that Proxy to the Content Filter (DansGuardian) and then to a Caching Server.

client –> cache (authentication) –> content filter –> cache (caching)

In this setup, we run two instances of squid, (a) authentication and (c) caching, with the content filter (b) DansGuardian in between.

(a) authentication

The (a) authentication instance of squid is configured to:

  1. Listen on the standard squid port (3128)
  2. Authenticate users before passing the request to DansGuardian using the cache_peer TAG. [Please refer to the above-mentioned squid.htm documentation for how to configure authentication]
  3. Not cache anything, so that DansGuardian reviews all connections
  4. Allow access from anyone on our network (only)
  5. Distinguish this instance of squid by using log and cache names of *_authenticate

(b) DansGuardian

Configuration is per its documentation with the following revisions.

  1. Listen on port 8080 (or change the settings in this documentation)
  2. Forward Internet Access to the caching squid proxy at port 3138
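
As a quick sanity check of the two settings above, the following is a minimal sketch of a shell helper that reads a "key = value" setting from the DansGuardian configuration file. It assumes the listening port and the upstream proxy port are set through the filterport and proxyport settings. The file location /etc/dansguardian/dansguardian.conf and the helper name conf_value are assumptions for illustration, so adjust them to your install.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a "key = value" setting from a
# DansGuardian-style config file. Commented-out lines are ignored and the
# last matching line wins. Prints nothing if the key is absent.
conf_value() {
    # $1 = config file, $2 = setting name
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$1" | tail -n 1
}

# Assumed location of the DansGuardian config; override with DG_CONF=...
DG_CONF=${DG_CONF:-/etc/dansguardian/dansguardian.conf}

if [ -r "$DG_CONF" ]; then
    # Ports used in these notes: 8080 for DansGuardian, 3138 for the caching squid
    [ "$(conf_value "$DG_CONF" filterport)" = "8080" ] || echo "filterport is not 8080"
    [ "$(conf_value "$DG_CONF" proxyport)" = "3138" ] || echo "proxyport is not 3138"
fi
```

The check prints nothing when both ports match the values used in these notes, and it is skipped entirely if the file cannot be read.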

(c) caching

The (c) caching instance of squid is configured to:

  1. Listen on a non-standard squid port (3138) or whichever you choose.
  2. DO NOT AUTHENTICATE users.
  3. Cache everything (practically)
  4. ONLY allow access from localhost (i.e. DansGuardian)
  5. Distinguish this instance of squid by using log and cache names of *_cache

  The standard configuration settings we have used are shown below.

TAG                  Squid authentication                                  Squid caching
-------------------  ----------------------------------------------------  --------------------------------
filename             /etc/squid/squid_authenticate.conf                    /etc/squid/squid_cache.conf
http_port            3128                                                  3138
icp_port             3130                                                  3140
cache_peer           127.0.0.1 parent 8080 7 allow-miss no-digest
                     no-netdb-exchange no-query proxy-only
hierarchy_stoplist   (remove cgi-bin ?)
no_cache             #We recommend you to use the following two lines.
                     acl allhttp proto HTTP
                     acl QUERY urlpath_regex cgi-bin \?
                     acl localnet src LAN_SUBNET/LAN_MASK
                     no_cache deny QUERY
                     no_cache deny allhttp
                     no_cache deny localnet
cache_access_log     /var/squid/logs/access_authenticate.log               /var/squid/logs/access_cache.log
cache_mem            8 MB                                                  8 MB
                     Size should match size in cache_dir
cache_dir            ufs /var/squid/cache_authenticate 8 16 256 read-only  ufs /var/squid/cache 100 16 256
cache_log            /var/squid/logs/access_authenticate.log               /var/squid/logs/access_cache.log
cache_store_log      /var/squid/logs/store_authenticate.log                /var/squid/logs/store_cache.log
pid_filename         /var/run/squid_authenticate.pid                       /var/run/squid_cache.pid
authenticate_ip_ttl  60 seconds                                            1 hour
http_access          allow our_networks                                    allow localhost
                                                                           deny our_networks
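
Because both instances run on the same host, their ports, pid files and log files should not collide. The following is a minimal sketch of a shell check for that, assuming the two configuration file names listed above. The helper names squid_directive and check_distinct are hypothetical, and only the first uncommented occurrence of each TAG is compared.

```shell
#!/bin/sh
# Hypothetical helper: print the arguments of the first uncommented
# occurrence of a directive (TAG) in a squid config file.
squid_directive() {
    # $1 = config file, $2 = directive name
    awk -v tag="$2" '$1 == tag { $1 = ""; sub(/^ +/, ""); print; exit }' "$1"
}

# Report every directive that has the same non-empty value in both files.
# Returns non-zero if at least one collision was found.
check_distinct() {
    # $1, $2 = config files; remaining arguments = directive names
    conf_a=$1; conf_b=$2; shift 2
    rc=0
    for tag in "$@"; do
        va=$(squid_directive "$conf_a" "$tag")
        vb=$(squid_directive "$conf_b" "$tag")
        if [ -n "$va" ] && [ "$va" = "$vb" ]; then
            echo "collision on $tag: $va"
            rc=1
        fi
    done
    return $rc
}

AUTH_CONF=/etc/squid/squid_authenticate.conf
CACHE_CONF=/etc/squid/squid_cache.conf
if [ -r "$AUTH_CONF" ] && [ -r "$CACHE_CONF" ]; then
    check_distinct "$AUTH_CONF" "$CACHE_CONF" \
        http_port icp_port pid_filename cache_access_log cache_store_log
fi
```

The list of TAGs passed to check_distinct is only a suggestion; add any other directive that must differ between the two instances.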
     
 

Execution Script

Because three different servers need to be started, and there are times when you may just wish to stop or start them, I’ve documented a script for the squid processes. DansGuardian comes with a script that seems to work well enough for it.

squid.sh

To ensure that both required squid processes are running, the script below can be used. A good place for the script would be /etc/rc.d/init.d/squid.sh

File: squid.sh

#!/bin/sh
# Control both squid instances (authentication and caching) together.
echo -n ' Squid '

case "$1" in
        start)
            # -D disables the initial DNS tests; -f selects the config file
            /usr/local/sbin/squid -D -f /etc/squid/squid_authenticate.conf
            /usr/local/sbin/squid -D -f /etc/squid/squid_cache.conf
            ;;
        stop)
            /usr/local/sbin/squid -f /etc/squid/squid_authenticate.conf -k shutdown
            /usr/local/sbin/squid -f /etc/squid/squid_cache.conf -k shutdown
            ;;
        restart)
            # -k reconfigure re-reads the config without stopping the process
            /usr/local/sbin/squid -f /etc/squid/squid_authenticate.conf -k reconfigure
            /usr/local/sbin/squid -f /etc/squid/squid_cache.conf -k reconfigure
            ;;
        rotate)
            # -k rotate rotates the log files
            /usr/local/sbin/squid -f /etc/squid/squid_authenticate.conf -k rotate
            /usr/local/sbin/squid -f /etc/squid/squid_cache.conf -k rotate
            ;;
        *)
            echo "Usage: $(basename "$0") {start|stop|restart|rotate}"
            ;;
esac

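The script merely ensures that both running instances of squid are started, stopped, or reconfigured together when required. Note that restart sends -k reconfigure, which makes squid re-read its configuration rather than stopping and starting the process.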
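
The script does not report whether the two instances are actually running. A minimal sketch of such a check follows, assuming the pid_filename values from the configuration settings above. The function name is_running is hypothetical, and the loop could be placed in an additional status) branch of the case statement.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the pid file is readable and the process
# it names is alive. kill -0 sends no signal; it only tests that the
# process exists and that we are permitted to signal it, so run this as
# root (or as the squid user) to avoid false "not running" reports.
is_running() {
    # $1 = pid file
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# pid_filename values taken from the two squid configurations.
for pidfile in /var/run/squid_authenticate.pid /var/run/squid_cache.pid; do
    if is_running "$pidfile"; then
        echo "running: $pidfile"
    else
        echo "not running: $pidfile"
    fi
done
```

A pid file left behind by a crashed squid is reported as not running, because the process it names no longer exists.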

dansguardian

The following script is generally supplied with dansguardian and is placed here for completeness of documentation.

File: dansguardian.sh

#!/bin/sh
#
# BSD startup script for dansguardian
# partly based on httpd startup script
#
# description: A web content filtering plugin for web \
#              proxies, developed to filter using lists of \
#              banned phrases, MIME types, filename \
#              extensions and PICS labeling.
# processname: dansguardian


# See how we were called.

case "$1" in
    start)
        [ -x /usr/sbin/dansguardian ] && /usr/sbin/dansguardian > /dev/null && echo -e ' dansguardian'
        ;;
    stop)
        /usr/sbin/dansguardian -q
        [ -r /tmp/.dguardianipc ] && echo -e ' dansguardian'
        rm -f /tmp/.dguardianipc
        ;;
    restart)
        $0 stop
        $0 start
        ;;
    *)
        echo "Usage: configure {start|stop|restart}" >&2
        ;;
esac
exit 0