A few weeks back, a client was facing a severe problem: their database disk space consumption was steadily going up for no apparent reason. I was called in to help, and provided with the hosting control panel password as well as the Drupal admin password.
Investigation and Findings
Upon investigation, I quickly found two things:
- They were using an old Drupal version based on CivicSpace. It was a 4.4 or earlier version.
- The cache was growing very fast. For such a relatively small site, the cache table alone was 184 MB! That is larger than anything else on the site (including the comments, node, and accesslog tables).
When I emptied the cache table, it started to grow again almost immediately.
Upon looking more closely, it was apparent that two factors caused this fast growth:
- The fact that this is a pre-4.5 Drupal version is crucial. On 4.4 and older, Drupal did not handle 404 errors correctly. Instead of returning a 404, it displayed the contents of the site's home page.
- The site was being hit by referer spam attempts or proxy scans. Since these requests never got a 404, the senders apparently treated them as successful and kept trying again.
Moreover, since Drupal caching was turned on, the home page was cached again for every attempt, with the off-site URL as the cache key. Each attempt added about 48 kB to the cache. At that size, the 184 MB table would hold on the order of 4,000 such copies.
Besides the above problem, the site also used an excessive amount of bandwidth due to the sheer number of requests and the serving of the home page over and over.
Here are some examples of URLs:
http://partners.mygeek.com/search.jsp?partnerid=98980&ip=64.60.171.35&query=appraisal
http://partners.mygeek.com/search.jsp?partnerid=98765&ip=64.21.136.223&query=auto+dialers
http://feed.genieknows.com/search/search_html.jsp?client_id=GOTOMAI_7997&q=Linux+file+server
http://txsearch.epilot.com/getresults.aspx?aff=ebuyarts&ip=216%2E92%2E142%2E138&keyword=Yoga+Tapes&source=s&r=www.ebuyarts.com
http://txsearch.epilot.com/getresults.aspx?aff=gotomai&ip=216%2E92%2E142%2E138&keyword=Search+Term&source=s&r=www.gotomai.com
http://partner.search.sohu.com/cpc/partner.php?pid=info-xatom&type=14
http://partner.search.sohu.com/cpc/partner.php?pid=info-xa163&type=14
Recommendations
Several measures can take care of this problem:
Blacklist the IP addresses
If the spam attempts come from only a handful of IP addresses, you can block those addresses in the .htaccess file.
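As a rough sketch of what such a block could look like, assuming Apache 2.2 or earlier with access-control overrides allowed in .htaccess. The addresses are placeholders from the reserved documentation ranges, not the actual offenders:

```apache
# Allow everyone by default, then deny the listed addresses.
# With "Order Allow,Deny", a client matching both Allow and Deny is denied.
Order Allow,Deny
Allow from all

# Hypothetical offenders; replace with addresses taken from your own logs.
Deny from 192.0.2.10
Deny from 198.51.100.0/24

# On Apache 2.4 the rough equivalent would be:
#   <RequireAll>
#     Require all granted
#     Require not ip 192.0.2.10
#   </RequireAll>
```

Blocked clients get a 403 response before Drupal is invoked, so nothing is added to the cache table for them.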
Upgrade Drupal
You are better off with a newer Drupal since it does issue a 404 when it does not find the page. There are other reasons that make upgrading a good idea, including security issues with older releases.
Block certain file types
It also helps to reject requests for file types that your site does not serve. For example, if you are only running Drupal, then the following file types are not needed:
aspx|jsp|look|cgi
You can add those to this line in .htaccess:
<Files ~ "(\.(inc|module|pl|sh|sql|theme|engine|xtmpl)|Entries|Repositories|Root|scripts|updates)$">
So it looks like this:
<Files ~ "(\.(inc|module|pl|sh|sql|theme|engine|xtmpl|aspx|jsp|look|cgi)|Entries|Repositories|Root|scripts|updates)$">
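For context, here is a sketch of the complete section after the change. It assumes the deny rules found in the stock Drupal .htaccess of that era, so the two inner lines are an assumption and should be checked against your own copy of the file:

```apache
# Refuse any request whose file name matches the pattern, including the
# added aspx, jsp, look, and cgi extensions.
<Files ~ "(\.(inc|module|pl|sh|sql|theme|engine|xtmpl|aspx|jsp|look|cgi)|Entries|Repositories|Root|scripts|updates)$">
  Order deny,allow
  Deny from all
</Files>
```

Matching requests are answered with a 403 by Apache itself rather than being passed on to Drupal.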
Disable Drupal's cache
If your site does not get lots of hits, you can disable the Drupal cache from the admin menus. This stops the cache table from growing.
Links and Resources
- Drupal Forums: Site Slammed by Offsite Ad and Proxy Requests.
- Drupal Forums: Is this a hacking attempt? Also see the details of how it looks in the log here.
- AndySpace: partners.mygeek.com officially shitlisted.