Sep 9 2007

Coming back

Regular visitors to my page (yes, there are such people) will have noticed that it has been behaving erratically, being inaccessible most of the time. This was due to two reasons:

To fix this I did the following:

  • Redirected my web page to the Coral Content Distribution Network cache, so when you visit my web page you are redirected to that copy (this is why it’s slower now, but at least it works), while the original page hosted at SDF-eu acts as the source for Coral CDN. If you want to do the same with your web page, you must detect whether a request comes from Coral CDN: if it does, let it through; otherwise, redirect the visitor to the same URL with .nyud.net appended to the domain name. Using PHP:

    <?php
      if (strpos($_SERVER["HTTP_USER_AGENT"], "CoralWebPrx") === false)
      {
        header("HTTP/1.1 302 Found");
        // The Location header needs an absolute URL, including the scheme
        header("Location: http://" . $_SERVER["HTTP_HOST"] . ".nyud.net" . $_SERVER["REQUEST_URI"]);
        exit;
      }
    ?>
    

    If you don’t use PHP you can do it with .htaccess:

    RewriteEngine on
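    RewriteCond %{HTTP_USER_AGENT}  !^CoralWebPrx
    # R=302 sends an external redirect, like the PHP version ([P] would proxy the request through this server instead)
    RewriteRule ^(.*)$ http://%{HTTP_HOST}.nyud.net/$1  [R=302,L]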
    
  • Block Cuill. Cuill is a search engine founded by ex-Googlers that isn’t open to the public yet; for now it only crawls and indexes web pages. It is known only to those of us who have had to suffer its robot crawling our sites, and I’m not interested in being DoSed by this crawler just to be indexed in their search engine. To block Twiceler, Cuill’s robot, with PHP:

    <?php
      if (strpos($_SERVER["HTTP_USER_AGENT"], "Twiceler") !== false)
      {
        header("HTTP/1.1 403 Forbidden");
        exit;
      }
    ?>
    

    With .htaccess:

    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} Twiceler
    RewriteRule .* - [F,L]
    

    If this search engine respected the most basic rules for crawlers, it would be possible to block it using robots.txt:

    User-Agent: Twiceler
    Disallow: /
    

    But it seems I must add to the list of Cuill’s misbehaviours the fact that it doesn’t appear to honour this protocol.

  • Block hotlinking. Hotlinking consists of embedding an image or any other file hosted on one site in a page on another site without consent, consuming the first site’s bandwidth. It is one of the most common, harmful and frowned-upon bad practices, which is why I blocked it. From now on you won’t be able to hotlink any image hosted on my web page, but you can still use the ethical hotlinking service ImgRed, as I do.

    RewriteEngine On
    
    # Only protect image files
    RewriteCond %{REQUEST_FILENAME} \.(jpg|gif|png)$ [NC]
    RewriteCond %{HTTP_REFERER} !^$
    # Allow my own web page
    RewriteCond %{HTTP_REFERER} !h0m3r\.sdf-eu\.org [NC]
    # Allow my own web page in Coral CDN
    RewriteCond %{HTTP_REFERER} !h0m3r\.sdf-eu\.org\.nyud\.net [NC]
    # Allow ImgRed.com
    RewriteCond %{HTTP_REFERER} !imgred\.com [NC]
    # Allow search engines
    RewriteCond %{HTTP_REFERER} !google\. [NC]
    RewriteCond %{HTTP_REFERER} !yahoo\. [NC]
    # Allow Google cache
    RewriteCond %{HTTP_REFERER} !search\?q=cache [NC]
    
    # Never rewrite the replacement image itself
    RewriteCond %{REQUEST_URI} !/img/leech\.png$ [NC]
    RewriteRule (.*) /img/leech.png [L]
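
Where .htaccess isn’t available, the same hotlinking check could be sketched in PHP. This is only an illustration, not part of the setup described above: it assumes image requests are routed through a PHP script, and the function name is_hotlink is hypothetical. The allowed Referer patterns simply mirror the RewriteCond lines.

```php
<?php
// Hypothetical helper (not from the original post): returns true when an
// image request appears to be hotlinked from a site that is not allowed.
function is_hotlink(string $path, string $referer): bool
{
    // Only image files are protected.
    if (!preg_match('/\.(jpg|gif|png)$/i', $path)) {
        return false;
    }
    // An empty Referer (direct visit, stripped header) is allowed.
    if ($referer === '') {
        return false;
    }
    // Referer patterns allowed to embed images, mirroring the .htaccess rules.
    $allowed = [
        '/h0m3r\.sdf-eu\.org/i',   // own site; also matches the .nyud.net copy
        '/imgred\.com/i',          // ImgRed
        '/google\./i',             // search engines
        '/yahoo\./i',
        '/search\?q=cache/i',      // Google cache
    ];
    foreach ($allowed as $pattern) {
        if (preg_match($pattern, $referer)) {
            return false;
        }
    }
    return true;
}

// Usage sketch: skipped on the command line, where there is no HTTP request.
if (PHP_SAPI !== 'cli') {
    $path = parse_url($_SERVER['REQUEST_URI'] ?? '', PHP_URL_PATH) ?: '';
    if (is_hotlink($path, $_SERVER['HTTP_REFERER'] ?? '')) {
        // Serve the replacement image directly instead of redirecting,
        // so the replacement itself cannot trigger another check.
        header('Content-Type: image/png');
        readfile($_SERVER['DOCUMENT_ROOT'] . '/img/leech.png');
        exit;
    }
}
```

Keep in mind that the Referer header is supplied by the client and can be missing or forged, so a check like this only discourages casual hotlinking.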
    

That’s all. These changes may affect site performance (especially the Coral CDN redirect), but I have no choice as long as my bandwidth is so limited and some people ignore a minimum code of ethics. If you want me to move to a better server, then you should know what the advertisements are for 😉
