Geographically Targeted Web Content

Using Apache Server Side Includes

by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Suppose you wanted to geographically target some web content. For the sake of example, suppose you wanted to include content in some of your web pages to force those pages to be blocked by the great firewall of China by including a "Free Tibet" banner.

If your web server runs Apache with server-side includes enabled, you can do this in a manner that will not intrude on web users outside of China.

Prerequisite Information

Apache gives access to the IP address of the client viewing a web page with the REMOTE_ADDR variable. The web site Locating Countries from IP addresses gives information on identifying the country of origin of a web client from the IP address. This shows the range of China's IP numbers and gives the formula for converting these to IP addresses.

IP number IP address
from 3401580544 202.192.0.0
to 3402629119 202.207.255.255
from 3411087360 203.81.16.0
to 3411091455 203.81.31.255
from 3411533824 203.87.224.0
to 3411543039 203.88.3.255
from 3411607552 203.89.0.0
to 3411608575 203.89.3.255
from 3411673088 203.90.0.0
to 3411674111 203.90.3.255
from 3411738624 203.91.0.0
to 3411739647 203.91.3.255
from 3411804160 203.92.0.0
to 3411805183 203.92.3.255
from 3411869696 203.93.0.0
to 3411943423 203.94.31.255
from 3412000768 203.95.0.0
to 3412002815 203.95.7.255

What is immediately clear here is that there are a dismaying number of different blocks of IP addresses allocated to China, with similar small blocks allocated to other countries. The first of these blocks is by far the largest.
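The conversion between the two columns of the table can be sketched in a few lines of Python. This is only an illustration of the usual base-256 formula (each field of the dotted address is one byte of a 32-bit number); the function names are invented for this example.

```python
def ip_number_to_address(number: int) -> str:
    """Convert a 32-bit IP number to dotted notation, most significant byte first."""
    fields = [(number >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(field) for field in fields)


def ip_address_to_number(address: str) -> int:
    """Convert a dotted address a.b.c.d to the number a*256^3 + b*256^2 + c*256 + d."""
    a, b, c, d = (int(field) for field in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


print(ip_number_to_address(3401580544))      # 202.192.0.0
print(ip_address_to_number("203.95.7.255"))  # 3412002815
```

Both printed values agree with the first and last rows of the table above.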

The Apache SSI language has an if test that allows the server to test a variety of conditions. For example, it is easy to see whether the client has a particular IP address:

    <!--#if expr="$REMOTE_ADDR = 203.95.7.255" -->
      Content displayed only to a specific client
    <!--#endif -->

We can also test using pattern matching, by enclosing the pattern in slash characters. Within patterns, a period matches any single character, and an asterisk matches zero or more repetitions of whatever precedes it. To force literal matching of a period, escape it with a backslash. For example, to match all addresses starting with 203.89. we could use this:

    <!--#if expr="$REMOTE_ADDR = /203\.89\..*/" -->

The final dot-star in this pattern matches any sequence of zero or more characters after the preceding literal dot.
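The behavior of such a pattern can be tried offline. The sketch below uses Python's re module as a stand-in for Apache's matcher; that the two agree on this simple pattern, and that Apache searches for the pattern anywhere in the string rather than only at the start, are assumptions worth confirming on your own server.

```python
import re

# The same pattern as in the SSI example, written as a Python regular expression.
PATTERN = re.compile(r"203\.89\..*")


def matches(address: str) -> bool:
    # re.search looks for the pattern anywhere in the string (unanchored).
    return PATTERN.search(address) is not None


for address in ("203.89.3.17", "203.890.3.17", "203x89.3.17", "10.203.89.3"):
    print(address, matches(address))
```

The first address matches and the next two do not, because each escaped period insists on a literal dot. The last address also matches, since an unanchored search finds 203.89. in the middle of the string; a leading ^ in the pattern would restrict the match to the start.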

We can also compare strings. For example, consider testing to see if the IP address is in the range from 202.192.0.0 to 202.207.255.255. To do this, we need to apply two different boolean conditions:

    <!--#if expr="$REMOTE_ADDR > 202.192.0.0" -->
      <!--#if expr="$REMOTE_ADDR < 202.207.255.255" -->
        Content displayed only when within the indicated range
      <!--#endif -->
    <!--#endif -->

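The above test is not entirely correct because the comparison used is not a numeric comparison, but rather, an alphabetical one. As a result, 202.2.0.0 satisfies it: its fifth character, 2, sorts after the 1 of 202.192.0.0, and its sixth character, a dot, sorts before the 0 of 202.207.255.255, because in the alphabetical order of the ASCII character set, dot is less than any numeral.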
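The difference between the two kinds of comparison is easy to reproduce. The following Python sketch assumes that Python's character-by-character string ordering behaves like the SSI comparison described above; the function names are made up for the example.

```python
LOW, HIGH = "202.192.0.0", "202.207.255.255"


def as_fields(address: str) -> tuple:
    """Split a dotted address into a tuple of integers for numeric comparison."""
    return tuple(int(field) for field in address.split("."))


def in_range_as_strings(address: str) -> bool:
    # Character-by-character (alphabetical) comparison.
    return LOW < address < HIGH


def in_range_as_numbers(address: str) -> bool:
    # Field-by-field numeric comparison, which is what we actually want.
    return as_fields(LOW) <= as_fields(address) <= as_fields(HIGH)


for address in ("202.200.1.1", "202.2.0.0", "202.207.3.1"):
    print(address, in_range_as_strings(address), in_range_as_numbers(address))
```

The two methods agree on 202.200.1.1. They disagree on 202.2.0.0, which only the string test accepts, and on 202.207.3.1, which only the numeric test accepts because 3 sorts after the 2 of 255.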

The Basic Solution

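The following test identifies the first (and largest) range of IP addresses that concern us. It does so by checking the length of the second field first, so that the string comparisons that follow are made only between fields of equal width. The digit positions are written as [0-9] rather than as periods, because a wildcard period can also match a literal dot, which would let an address such as 202.2.5.1 slip through. The bounds are truncated after the second field (202.192. and 202.208.) so that only that field decides the comparison; comparing against a full address such as 202.207.255.255 would wrongly reject 202.207.3.1, because 3 sorts after 2:

    <!--#if expr="$REMOTE_ADDR = /202\.[0-9][0-9][0-9]\..*/" -->
      <!--#if expr="$REMOTE_ADDR > 202.192." -->
        <!--#if expr="$REMOTE_ADDR < 202.208." -->
          <!--#set var="country" value="CHINA" -->
        <!--#endif -->
      <!--#endif -->
    <!--#endif -->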

Note that the above code does not create any output. Instead, it sets the server-side variable country to CHINA if the IP address matches. We will use this to allow multiple tests to arrive at a result.

We can do similar tricks for each of the ranges, but many of the tests can be combined. There is also the question of how we format the actual protest text. What we will do here is assume that this text is in a separate file, called ChinaProtest.shtml. Ultimately, we will either include this file or display a horizontal rule:

    <!--#if expr="$country = CHINA" -->
      <!--#include file="ChinaProtest.shtml" -->
    <!--#else -->
      <hr>
    <!--#endif -->

Putting It All Together

In order to avoid cluttering web pages that have content where we wish to include this protest notice, these IP address tests can be stored in a separate file. Here, we will assume that the file called incTestIP.shtml holds the code to test the country of origin.

Now, where you would have put a simple horizontal rule <hr> in your HTML file, you can replace this with

    <!--#include file="incTestIP.shtml" -->

Here is a complete incTestIP.shtml file for detecting web clients in China. It is ugly and can probably be improved considerably. Due to limitations on the string matching tools used here, some of the IP address ranges had to be tested using multiple cases, and the framework of the entire thing is a linear search:

    <!--#set var="country" value="none" -->
    <!--#if expr="$REMOTE_ADDR = /202\.[0-9][0-9][0-9]\..*/" -->
      <!--#if expr="$REMOTE_ADDR > 202.192." -->
        <!--#if expr="$REMOTE_ADDR < 202.208." -->
          <!--#set var="country" value="CHINA" -->
        <!--#endif -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /203\.81\...\..*/" -->
      <!--#if expr="$REMOTE_ADDR > 203.81.16." -->
        <!--#if expr="$REMOTE_ADDR < 203.81.32." -->
          <!--#set var="country" value="CHINA" -->
        <!--#endif -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /203\.87\....\..*/" -->
      <!--#if expr="$REMOTE_ADDR > 203.87.224." -->
        <!--#set var="country" value="CHINA" -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /203\.88\..\..*/" -->
      <!--#if expr="$REMOTE_ADDR < 203.88.4." -->
        <!--#set var="country" value="CHINA" -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /203\.89\..\..*/" -->
      <!--#if expr="$REMOTE_ADDR < 203.89.4." -->
        <!--#set var="country" value="CHINA" -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /203\.90\..\..*/" -->
      <!--#if expr="$REMOTE_ADDR < 203.90.4." -->
        <!--#set var="country" value="CHINA" -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /203\.91\..\..*/" -->
      <!--#if expr="$REMOTE_ADDR < 203.91.4." -->
        <!--#set var="country" value="CHINA" -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /203\.92\..\..*/" -->
      <!--#if expr="$REMOTE_ADDR < 203.92.4." -->
        <!--#set var="country" value="CHINA" -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /^203\.93\..*/" -->
      <!--#set var="country" value="CHINA" -->
    <!--#elif expr="$REMOTE_ADDR = /203\.94\..\..*/" -->
      <!--#set var="country" value="CHINA" -->
    <!--#elif expr="$REMOTE_ADDR = /203\.94\...\..*/" -->
      <!--#if expr="$REMOTE_ADDR < 203.94.32." -->
        <!--#set var="country" value="CHINA" -->
      <!--#endif -->
    <!--#elif expr="$REMOTE_ADDR = /203\.95\..\..*/" -->
      <!--#if expr="$REMOTE_ADDR < 203.95.8." -->
        <!--#set var="country" value="CHINA" -->
      <!--#endif -->
    <!--#endif -->
    <!--#if expr="$country = CHINA" -->
      <!--#include file="ChinaProtest.shtml" -->
    <!--#else -->
      <hr>
    <!--#endif -->

The above code is extraordinarily ugly, and also difficult to debug, since you need to try to access it from each of many different IP addresses to see if it worked correctly.
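One way to ease the debugging problem is to check the logic offline before deploying it. The Python sketch below is not the SSI file itself: RULES is a hypothetical transcription of string tests in the same style (a pattern that fixes the field widths, then optional alphabetical bounds), and RANGES is copied from the table of IP addresses. It assumes that Python's regular expressions and string ordering behave like Apache's for these simple cases.

```python
import re
from ipaddress import IPv4Address

# Inclusive numeric ranges, copied from the table of IP addresses.
RANGES = [
    ("202.192.0.0", "202.207.255.255"), ("203.81.16.0", "203.81.31.255"),
    ("203.87.224.0", "203.88.3.255"), ("203.89.0.0", "203.89.3.255"),
    ("203.90.0.0", "203.90.3.255"), ("203.91.0.0", "203.91.3.255"),
    ("203.92.0.0", "203.92.3.255"), ("203.93.0.0", "203.94.31.255"),
    ("203.95.0.0", "203.95.7.255"),
]
NUMERIC = [(int(IPv4Address(lo)), int(IPv4Address(hi))) for lo, hi in RANGES]

# Hypothetical string rules: (pattern, exclusive lower bound, exclusive upper bound).
# None means that no bound is needed for that rule.
RULES = [(re.compile(pattern), low, high) for pattern, low, high in [
    (r"^202\.[0-9][0-9][0-9]\.", "202.192.", "202.208."),
    (r"^203\.81\.[0-9][0-9]\.", "203.81.16.", "203.81.32."),
    (r"^203\.87\.[0-9][0-9][0-9]\.", "203.87.224.", None),
    (r"^203\.88\.[0-9]\.", None, "203.88.4."),
    (r"^203\.89\.[0-9]\.", None, "203.89.4."),
    (r"^203\.90\.[0-9]\.", None, "203.90.4."),
    (r"^203\.91\.[0-9]\.", None, "203.91.4."),
    (r"^203\.92\.[0-9]\.", None, "203.92.4."),
    (r"^203\.93\.", None, None),
    (r"^203\.94\.[0-9]\.", None, None),
    (r"^203\.94\.[0-9][0-9]\.", None, "203.94.32."),
    (r"^203\.95\.[0-9]\.", None, "203.95.8."),
]]


def by_strings(address: str) -> bool:
    """Apply the first rule whose pattern matches, like an if/elif chain."""
    for pattern, low, high in RULES:
        if pattern.search(address):
            return (low is None or address > low) and (high is None or address < high)
    return False


def by_numbers(address: str) -> bool:
    """Reference answer: numeric comparison against the table."""
    number = int(IPv4Address(address))
    return any(lo <= number <= hi for lo, hi in NUMERIC)


def mismatches() -> list:
    """Compare both methods on every address a.b.c.1 with a in (202, 203)."""
    return [
        f"{a}.{b}.{c}.1"
        for a in (202, 203) for b in range(256) for c in range(256)
        if by_strings(f"{a}.{b}.{c}.1") != by_numbers(f"{a}.{b}.{c}.1")
    ]


print(mismatches())  # an empty list means the two methods agree
```

Any address that appears in the printed list points at a rule that needs another look. A check of this kind is how one notices, for instance, that a pattern written with wildcard periods, such as 202\....\., also accepts 202.2.5.1, because the three wildcards can absorb the characters 2.5.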

What to Output

Here is an example framework for a ChinaProtest.shtml file. Please invent content more interesting than this:

<table width="100%" cellspacing="0" cellpadding="5" border="1">
 <tr>
   <td bgcolor="#FFE0E0">
     <h2>Free Tibet</h2>
     <p>
     Appropriate text to add a bit of content to the protest.
     </p>
   </td>
 </tr>
</table>

Generalizing the Idea

The mechanisms discussed above can clearly be generalized to detect multiple countries. A protest directed at the Saudi firewall would be equally appropriate, and it is not difficult to imagine targeting other countries.

There is, however, a big difference between content designed to test the various great firewalls on the net and content designed to add a note of protest to a web site. If, for example, you want to protest US actions in Iraq or Israeli actions on the West Bank, your message will be seen by viewers from those countries. Keep it polite and they will continue visiting your web site despite a protest with which they may disagree. Make it loud and distracting, and people will simply go elsewhere.

A second generalization is possible. Instead of showing the message every time the web site is displayed, it is fairly easy to make the message show only occasionally. If a firewall system crawls the web periodically making a list of politically dangerous web sites, then a site that only exhibits its dangerous content occasionally is likely to avoid being blocked. Here is an if statement that only succeeds when the seconds field of the time of day ends with a zero, so it is true for one second out of every ten:

    <!--#if expr="$DATE_GMT = /:.0 /" -->
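The one-in-ten claim can be checked with a short Python sketch. It assumes that DATE_GMT is rendered with the time of day followed by a space and the zone name, as in "Saturday, 15-Mar-2008 12:34:50 GMT"; if your server uses a different time format, the pattern will need adjusting.

```python
import re
import time


def date_gmt(seconds_since_epoch: int) -> str:
    """Format a timestamp in the assumed DATE_GMT layout."""
    return time.strftime("%A, %d-%b-%Y %H:%M:%S GMT", time.gmtime(seconds_since_epoch))


def shows_message(date_string: str) -> bool:
    # Colon, any character, a zero, then the space that follows the seconds field.
    return re.search(r":.0 ", date_string) is not None


# Count how many of 600 consecutive seconds would show the message.
hits = sum(shows_message(date_gmt(t)) for t in range(600))
print(hits)  # 60
```

The trailing space in the pattern is what ties the match to the seconds field: a minutes field ending in zero is followed by a colon, not a space, so it does not match.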

If you use this random display idea, you should try to make your protest message relatively non-disruptive. Even subtle and brief messages will be noticed in a country where any political content is routinely quashed.

Thoughts

Given that we expect the great firewall of China to block any web page using this mechanism, it is fair to ask what web pages should be equipped with this protest. Here are my suggestions regarding this:

Do not include such protest material in web pages where the content involves human rights issues. Where the content of those pages is otherwise inoffensive to the great firewall, these pages should be freely available in China. This applies to web pages that discuss a wide range of subjects such as criminal and civil law, the machinery of democracy, and environmental issues.

Do consider adding such protest material to web pages that are of potential economic value to China. Web pages containing instructional content in various fields of engineering are perfect vehicles for such protests.

Disclaimer

Who knows if this works? I probably made some mistakes. If anyone finds an error, I would appreciate knowing about it. My E-mail address is on my home page.