Taming Web Proxy Auto-Discovery

12/13/2010 -- Updates below Jump

I've gone through hell trying to get this stuff working (especially with IE) so I'm hoping that my few quick notes will be helpful to some other poor schmuck. My environment is a simple home network with a single web proxy, so everything below is geared around that. So if you're doing something like running multiple web proxies with load balancing for a large organization, you're going to need a bit more than I'm covering here.

I don't know why this stuff is such deep voodoo. It should be really really stupid simple. Perhaps it's because it was never really formed into a standard, and seems to have been caught up in the Great Browser Wars between MS and Netscape as well. See the link at the bottom of this page for more on that and other useful bits.

The most basic config possible

Best I can tell, the most basic config requires internal DNS and a web server. DHCP appears to not be required. My lack of certainty in places is due to the general opaqueness of this feature, coupled with lots of highly erratic test results and the resulting (hopefully temporary) brain damage.

My clients are all WinXP except one Vista64 box. The server is Fedora Core 10 (FC10). I'm still early in getting used to some of the curve balls thrown at me with FC10 in this config. For example, there are apparently some new security settings in Apache that prevent access to any *.dat files. See a problem? Covered further below...

Client & Network Config

Whether you use DHCP or not, your clients should have a concept of their domain name and proper DNS. For an internal network, you should set up a special zone called "my.lan" or somesuch, and restrict external queries/transfers. Since the whole idea of WPAD is to have clients "just work", you should probably be doing DHCP too, even if you're mostly assigning fixed addresses. Fixed or dynamic, it really doesn't matter for our purposes.

Client Browser Settings

IE7 comes with Tools->Internet Options->Connections->LAN Settings->Automatically Detect Settings already checked and none of the other proxy options enabled. If you're having problems getting WPAD to work with IE7, check those settings first. It could save your brain.

Speaking of brain-saving, Firefox 3 may or may not have Tools->Options->Advanced->Network->Settings->Auto-detect proxy settings for this network checked. Better check whether or not it's checked...

I don't use any other browsers. If you do, your mission is to find and verify/set the auto-proxy option.

If you're using DHCP, that's about all that's required on the client side.

Server config

DHCP Config

Read this one through carefully. It ends up saying nearly the opposite of where it starts.

In your /etc/dhcpd.conf global config, you need something like this:

# For browser auto proxy configuration
option wpad code 252 = text;
option wpad "http://wpad.my.lan/wpad.dat\n";

Then do a service dhcpd restart and check /var/log/messages for complaints. Note the above assumes you've set up your internal DNS to actually resolve wpad.my.lan to your web server hosting the wpad.dat file. More below...

If you are not using DHCP, then you may or may not get this stuff to work, and your results may vary between browsers or depend on the phase of the moon; i.e., all bets are off. This is because there is a DHCP option that provides the URL for the proxy config file. Based on my wacky adventures, I don't think use of this DHCP option is consistent across browsers, or that clients will fully obey what you tell them. E.g., if you define the filename in the URL as wpad.pac, both IE7 and FF3 seem to happily ignore the DHCP-defined filename and retrieve wpad.dat instead. Maybe it's because I also have DNS set up with an internal "wpad" alias to a web server?

So to confirm (as best anyone can confirm anything with this WPAD stuff), I disabled the DHCP options, restarted dhcpd, then on the client I did an ipconfig /release, an ipconfig /renew, and rebooted just to be sure (hah!).

My latest results suggest that as long as you at least have all the DNS and webserver stuff set up correctly, then the DHCP config is not required. YMMV...
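As I understand the (never standardized) WPAD draft, a client first looks at DHCP option 252 and, failing that, tries DNS names of the form wpad.<domain>, always asking for /wpad.dat. That would fit what I saw with DNS alone being enough. Here is a rough sketch of that candidate list. The function name wpadCandidates and the rule about where to stop walking up the domain are my own assumptions for illustration, not any browser's actual code:

```javascript
// Hypothetical helper: list the URLs a WPAD client might try, in order.
// dhcpUrl is the value of DHCP option 252 (or null if not configured);
// clientFqdn is the client's fully qualified name, e.g. "pc1.my.lan".
function wpadCandidates(dhcpUrl, clientFqdn) {
    var urls = [];
    if (dhcpUrl) {
        // Strip trailing whitespace, such as the "\n" appended in dhcpd.conf.
        urls.push(dhcpUrl.replace(/\s+$/, ""));
    }
    // Drop the host label, then walk up the remaining domain labels.
    var labels = clientFqdn.toLowerCase().split(".").slice(1);
    // Assumption: stop before querying a bare top-level label (e.g. "wpad.lan").
    while (labels.length >= 2) {
        urls.push("http://wpad." + labels.join(".") + "/wpad.dat");
        labels.shift();
    }
    return urls;
}

console.log(wpadCandidates(null, "pc1.my.lan"));
```

With no DHCP option and a client named pc1.my.lan, the only candidate is http://wpad.my.lan/wpad.dat, which is exactly what the DNS and web server config below provides.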

DNS Config

All you need is a CNAME so that the wpad host alias points at your web server. It's just that easy. So in your my.lan zone file, you should have something like the following:

myserver	IN	A	192.168.100.1
wpad		IN	CNAME	myserver
proxy		IN	CNAME	myserver

That is, you've got a standard A record pointing at your server's internal address (presumably that was already there) and you just added a "wpad" CNAME pointing to the same place. A second "proxy" CNAME is just to give you some flexibility. It should point at whatever server is your actual proxy server.

After that, service named reload, check /var/log/messages for screaming and you're on your way. Oh, don't forget to change the serial in your SOA. (Does that matter if you aren't doing zone xfers?)

Web Server Config

In FC10, there's a couple of changes required to get Apache to serve up the wpad.dat file correctly. First, there's some lovely new security stuff that will spit stuff like this into your /var/log/httpd/error_log:
[Mon Dec 15 21:06:04 2008] [error] [client 192.168.100.21] ModSecurity: Access denied with code 500 (phase 2). Pattern match "(?:lock-token|translate|if)$" at REQUEST_HEADERS_NAMES:translate. [file "/etc/httpd/modsecurity.d/modsecurity_crs_30_http_policy.conf"] [line "106"] [id "960038"] [msg "HTTP header is restricted by policy"] [severity "WARNING"] [tag "POLICY/HEADER_RESTRICTED"] [tag "POLICY/FILES_NOT_ALLOWED"] [hostname "boink"] [uri "/"] [unique_id "SUcNDH8AAAEAAAnibLcAAAAF"]

If you squint just right, you can make out the offending file and line number.

File: /etc/httpd/modsecurity.d/modsecurity_crs_30_http_policy.conf
Line: 106

Visiting that line reveals a rather large regex that's matching on our ".dat" file extension and prohibiting it from being served up:

SecRule REQUEST_BASENAME "\.(?:c(?:o(?:nf(?:ig)?|m)|s(?:proj|r)?|dx|er|fg|md)|p(?:rinter|ass|db|ol|wd)|v(?:b(?:proj|s)?|sdisco)|a(?:s(?:ax?|cx)|xd)|d(?:bf?|at|ll|os)|i(?:d[acq]|n[ci])|ba(?:[kt]|ckup)|res(?:ources|x)|s(?:h?tm|ql|ys)|l(?:icx|nk|og)|\w{0,5}~|webinfo|ht[rw]|xs[dx]|key|mdb|old)$" \
    "phase:2,t:none,t:urlDecodeUni, t:lowercase, deny,log,auditlog,status:500,msg:'URL file extension is restricted by policy', severity:'2',id:'960035',tag:'POLICY/EXT_RESTRICTED'"

You could turn the whole damned rule off, but I try to take the philosophy that these things are the product of someone's hard thinking and hard work and generally there for a reason, so just a bit of tweaking instead:

#SecRule REQUEST_BASENAME "\.(?:c(?:o(?:nf(?:ig)?|m)|s(?:proj|r)?|dx|er|fg|md)|p(?:rinter|ass|db|ol|wd)|v(?:b(?:proj|s)?|sdisco)|a(?:s(?:ax?|cx)|xd)|d(?:bf?|at|ll|os)|i(?:d[acq]|n[ci])|ba(?:[kt]|ckup)|res(?:ources|x)|s(?:h?tm|ql|ys)|l(?:icx|nk|og)|\w{0,5}~|webinfo|ht[rw]|xs[dx]|key|mdb|old)$" \
# CW - Removed .dat from list so wpad.dat auto proxy config would work
SecRule REQUEST_BASENAME "\.(?:c(?:o(?:nf(?:ig)?|m)|s(?:proj|r)?|dx|er|fg|md)|p(?:rinter|ass|db|ol|wd)|v(?:b(?:proj|s)?|sdisco)|a(?:s(?:ax?|cx)|xd)|d(?:bf?|ll|os)|i(?:d[acq]|n[ci])|ba(?:[kt]|ckup)|res(?:ources|x)|s(?:h?tm|ql|ys)|l(?:icx|nk|og)|\w{0,5}~|webinfo|ht[rw]|xs[dx]|key|mdb|old)$" \
    "phase:2,t:none,t:urlDecodeUni, t:lowercase, deny,log,auditlog,status:500,msg:'URL file extension is restricted by policy', severity:'2',id:'960035',tag:'POLICY/EXT_RESTRICTED'"

You'll have to be pretty familiar with regexes and look quite closely to find the problem and fix it. Buried in there, there's a regex fragment that has to be changed from this:

d(?:bf?|at|ll|os)
To this:
d(?:bf?|ll|os)

Masochistic bastards... Well at least they gave us the file and line #, eh?
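If you want a quick sanity check of that edit without restarting Apache over and over, the fragment can be tried in any JavaScript engine. This sketch tests only the "d" fragment, not the full rule, and it assumes JavaScript's regex behavior is close enough to ModSecurity's for a pattern this simple. The helper name isBlocked is made up for the example:

```javascript
// The "d" extension group before and after the edit, anchored the same way
// as the full rule: a literal dot, the group, then end of string.
var before = /\.(?:d(?:bf?|at|ll|os))$/;
var after = /\.(?:d(?:bf?|ll|os))$/;

// Hypothetical helper: true if the rule fragment would match (and so deny) the name.
// The rule applies t:lowercase first, so lowercase the name the same way.
function isBlocked(re, name) {
    return re.test(name.toLowerCase());
}

["wpad.dat", "old.db", "table.dbf", "lib.dll"].forEach(function (name) {
    console.log(name, "before:", isBlocked(before, name), "after:", isBlocked(after, name));
});
```

The only name whose result should change between the two columns is wpad.dat; the .db, .dbf and .dll names stay blocked either way.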

And when you can get your eyes to focus again, you also need to change /etc/httpd/conf/httpd.conf to add the following MIME-type config items:

#
#       Charlie's addition to cleanly (Hah!!) support WPAD (auto-proxy config)
#
AddType application/x-ns-proxy-autoconfig .pac
AddType application/x-ns-proxy-autoconfig .dat

You could probably leave out the .pac line -- I never got the browsers to retrieve anything other than wpad.dat anyhow.

wpad.dat itself

Finally! You'll need to drop this file in the doc root for your web server, which by default is /var/www/html/:

function FindProxyForURL(url, host) {
	// Proxy everything!!!
	return "PROXY proxy.my.lan:3128";
}

That's the simplest possible proxy rule -- great for initial testing. Once you get that behaving, you can get fancy and add checks for things you don't want/need proxied. As a rule, Squid seems to proxy anything thrown at it pretty well these days, so adding exceptions is mostly to avoid burning system resources having Squid proxy content that's on the same server, or elsewhere on your local network.

Anyway, I'll cover that later. I haven't gotten fancy yet. :-)

A word of caution however. Note that the contents of this file get executed every time a URL is retrieved (so I'm told), so you should probably avoid "expensive" function calls like isInNet(). If you're just supporting a SOHO type setup, isInNet() doesn't really do what you need anyway, even though I found it suggested in many places. More later.
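On why isInNet() gets called "expensive": as far as I know, when it's handed a hostname rather than an IP address, it has to do a DNS lookup before it can compare anything. The comparison itself is trivial. Here's a rough sketch of just the comparison step, with no DNS involved. The names ipToInt and inNetNoDns are hypothetical, and this is not how any particular browser implements it:

```javascript
// Convert a dotted-quad IPv4 string to an unsigned 32-bit number.
function ipToInt(ip) {
    var parts = ip.split(".");
    if (parts.length !== 4) {
        throw new Error("not an IPv4 address: " + ip);
    }
    var n = 0;
    for (var i = 0; i < 4; i++) {
        if (!/^\d+$/.test(parts[i]) || Number(parts[i]) > 255) {
            throw new Error("bad octet in: " + ip);
        }
        n = n * 256 + Number(parts[i]);
    }
    return n;
}

// True if ip falls inside the network described by pattern and mask.
// Unlike the real isInNet(), this takes an IP literal only and never resolves names.
function inNetNoDns(ip, pattern, mask) {
    var m = ipToInt(mask);
    // ">>> 0" keeps the masked values unsigned so they compare cleanly.
    return ((ipToInt(ip) & m) >>> 0) === ((ipToInt(pattern) & m) >>> 0);
}

console.log(inNetNoDns("192.168.100.21", "192.168.100.0", "255.255.255.0"));
```

The bit masking costs next to nothing; it's the name resolution in front of it, repeated for every URL, that adds up.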

Getting fancy with wpad.dat

Next day... This seems to work:

function FindProxyForURL(url, host) { 

	// If unqualified hostname, pretty much has to be local content.
	// (Unqualified hostnames don't even work on my LAN clients...)
	if( isPlainHostName(host) ){
		return "DIRECT";
	}

	// Various local domains...

	if( dnsDomainIs(host, ".my.lan") ){
		return "DIRECT";
	}

	if( dnsDomainIs(host, ".external1.com") ){
		return "DIRECT";
	}

	if( dnsDomainIs(host, ".external2.com") ){
		return "DIRECT";
	}

	// Anything that gets this far should be non-local and gets proxied

	return "PROXY proxy.my.lan:3128"; 
}

Testing

Can your client browser resolve the CNAME and access the wpad.dat file? Try entering http://wpad.my.lan/wpad.dat into your browser. If you got the MIME type correct, your browser will probably try to download the file rather than display it.

Is the MIME-type getting set correctly by Apache? From a shell prompt on your server, run "lynx -head http://wpad.my.lan/wpad.dat" and check that you get back "Content-Type: application/x-ns-proxy-autoconfig".

Are the contents of wpad.dat formed correctly and behaving as expected? Since wpad.dat is a JavaScript function, one option is to throw "alert()" calls in there to display variable values and show program flow. Note this could prove extremely annoying to anyone else trying to use the web from your LAN. Here's one to get you started:

function FindProxyForURL(url, host) {
	alert("url: " + url + " host: " + host);
	// Proxy everything!!!
	return "PROXY proxy.my.lan:3128";
}
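A less annoying option might be to test the logic offline, outside any browser. The sketch below assumes you have Node.js (or some other standalone JavaScript engine) handy. It supplies simple stand-ins for the two PAC helper functions used in the "fancy" wpad.dat and then calls FindProxyForURL with a few sample hosts. The stand-ins are my approximation of what browsers provide, and the sample hostnames are made up:

```javascript
// Stand-in for the PAC helper: true if the host has no dots in it.
function isPlainHostName(host) {
    return host.indexOf(".") === -1;
}

// Stand-in for the PAC helper: true if the host ends with the given domain.
function dnsDomainIs(host, domain) {
    return host.length >= domain.length &&
        host.substring(host.length - domain.length) === domain;
}

// Paste the function from your wpad.dat here. This is a trimmed copy of the
// "fancy" version, with the external domain checks left out.
function FindProxyForURL(url, host) {
    if (isPlainHostName(host)) {
        return "DIRECT";
    }
    if (dnsDomainIs(host, ".my.lan")) {
        return "DIRECT";
    }
    return "PROXY proxy.my.lan:3128";
}

// Hypothetical sample requests: [url, host]
var samples = [
    ["http://myserver/", "myserver"],
    ["http://wpad.my.lan/wpad.dat", "wpad.my.lan"],
    ["http://www.example.com/", "www.example.com"]
];

samples.forEach(function (s) {
    console.log(s[1] + " -> " + FindProxyForURL(s[0], s[1]));
});
```

Save it under any name you like and run it with node. Leave out any alert() calls when testing this way, since alert() isn't defined outside a browser.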

More later -- I've got Christmas chores to do!

Time passes...

UPDATE 12/13/2010

Learned a few new things...

Links

Lots of good info here: http://www.mercenary.net/blog/index.php?/archives/42-HOWTO-WPAD.html

A nice WPAD tutorial here: http://findproxyforurl.com/wpad_tutorial.html

Slapped together one cold December weekend in 2008 by Charlie Wilkinson -- cwilkins@boinklabs.com... And updated another cold December night in 2010.