Laycat, Kyklo, what next?... and it even admits it is ‘cloaking’ itself
When I was looking through my November website logs, Laycat and Kyclo were among the most frequent visiting robots, above even Yahoo and Google. Of course, I googled them to see what on earth they were, and sure enough other people were also complaining that these were their highest visitors.
Only a relatively small cross-section of web designers and developers actually look through their logs, and we are among them. The hits from Kyclo and Laycat were too big to ignore. Only a handful of people at the time had reported on this particular robot; some said that they were getting a minimum of 550 hits, e.g. http://jagf.net/blog/?tag=laycat
For a short period Laycat.com issued a web crawler notice on their site saying that they were simply gathering information for a new search engine, and that was good enough for some, since a poster had copied and pasted the robot notice on a forum. The robots are sporadic, keep changing names and hit A LOT, and on the multiple occasions we checked, the links to their website did not lead to any information. That is why this post was originally written. It looked a bit dodgy.
Now that this post has been brought to the attention of Laycat/Kyclo, the very plain robot information page is back online. The admin at Laycat assured me that it must have been temporary downtime when I was looking.
There are currently 3 known robots, all named differently, operating under the same people (rather odd - and how many more are there?): kyklo.com, aceleo.com and laycat.com. Not to tell someone else how to run their operation, but couldn’t you simply use 3 different server names at one domain, for example kyklo.laycat.com, aceleo.laycat.com and laycat.laycat.com? This might make people slightly less suspicious of 3 different robots with completely different names linking back to the same place.
http://www.kyklo.com and http://www.aceleo.com both redirect to http://www.laycat.com/. Don’t expect anything too fancy - it’s just a plain robot information notice blurb: no site, no branding or company information, nor anything further. Despite being asked for further details on several occasions, they will not oblige, and instead insist that we change our public and, might I say, rightfully free opinion of it without further information. I’m sorry, but if that’s the way I ran my life I’d be a devout Christian who thought science was just the devil’s way of trying to trick us, because I’d be ignoring all evidence and putting my faith in someone else’s words.
The admin at Laycat have been extremely bitter and resentful about their bots being mentioned on here in a skeptical light. Their initial contact was immediately followed by the post being re-titled, their admin being thanked for the 3 links above and thanked for their Robots text being re-issued online…. I got told I was being ‘Nasty’ !
Without further aggravation from us, the Laycat admin continued to bombard us with very long comment posts laced with further derogatory remarks. He called us ‘undocumented trolls’ and used childish tactics such as posting the word counts of his posts, because we had said that the length of his comments may have had something to do with the Akismet spam filter canning them. He ripped our post and comments apart line by line with negatively verbose responses (just like what would normally be considered “a troll” on most forums/blogs). We were painted as simpletons, writing rubbish just to drive people through our affiliate links (hardly advert city here, with a maximum of 4 links placed for layout aid vs 30+ links to our own site and services). We just won’t stand for that. Tell us we’re wrong by all means, but provide proof of it; don’t just bombard the comments with links and excuses.
Laycat (also Aceleo and Kyklo... even though I was told by Laycat that it was kyclo not kyklo, although the Kyklo website is kyklo.com) have an absolutely stinking attitude, to say the least. Given Laycat’s response, the dawn of a new search engine being the reason for these robots has become highly unlikely in our minds, and if it has that sort of childish mentality at the head of it, then frankly we don’t need it. Considering the type of responses that were given, we find it far more likely that this new search engine will be the next “Web Ripper” and not a search engine at all. Due to the nature of our site in comparison to the nature of his comments, we have been forced to remove ALL comments, re-write this post appropriately and close further comments. If admin@laycat.com would like to comment further on this post, we invite him to use our contact form http://www.symsysit.com/core/Symsys-Contact-Details.php to do so. Beware though: if you fill your email to us with lots of links, a massive character count, swear words etc., then our web spam filter will probably pick it up as well.
As repeated in all of Laycat’s comments, it is highly recommended that their bots be blocked by IP banning and robots.txt block lists if you think they may be malicious. I am only repeating the advice given by the Laycat admin here. Just to please him, since he thinks we have such a controlling effect on our readers, I must mollycoddle you all by saying, “We encourage you to make up your own mind and this post is purely for informational purposes, we are not the definitive voice on the internet”. Laycat, do you feel reassured that we still don’t like your bots but have told our readers to make up their own minds? Readers, do you feel reassured that you’re not being “ordered” to believe what we tell you to?
Laycat, Kyklo, Aceleo malicious?... I say HELL YES... well, the admin certainly is!
Paranoid?... YES :) lol, maybe just bored. At the end of the day, it is your site, and you should be able to control, to some extent, what comes through and takes your information, be it on the Internet or not. I’m now off to put on my tin hat, install barbed wire fencing around my house and instruct my datacenter to restrict all traffic to and from my server, just because I feel like it!
Our crawler has visited your web site? Do you have any questions?
1) Why is your robot visiting my web site?
Laycat crawler is a web documents indexing robot.
5) What is the search engine this web crawler is working for?
The search engine this crawler is working for is currently in an early development stage, and will go public as soon as we achieve the beta stage. His job is to retrieve millions of pages from the world wide web in order to feed a search engine.
6) Why is your crawler using an anonymous user agent?
Many documents found on the internet are generated dynamicaly, and may present different content to crawlers than they would to regular visitors by examining the user agent string. Examples of pages adding links to gambling, adult content web sites when a crawler is visiting are plethora. This practice is called cloaking, and the goal is to fool crawlers and search engines in order to make them index some different content than a normal person would actually see. This is what we might call search engine spamming. To avoid that kind of practice, the crawler uses an anonymous user agent, and it will remain that way until we have enough data to do it the best way. At this point we will of course consider using a dedicated user agent. Most antivirus software use the same method as we do when scanning web pages. There is no real need for a webmaster to detect a crawler using the user agent string since this crawler respects the Robot Exclusion Standard, and webmasters can decide to allow him to visit or not using this standard. Please also note that the crawler will never fetch more than one page every two seconds on a same IP address, thus never eating server's resources.
Filed under: Robots + Htaccess
Robots.txt bot list update Oct 08
It has come to the time to do another website clean up. This generally involves sitewide link and accessibility checks, making sure the sitemap is correct etc. I only sporadically turn my mind to robots.txt and htaccess. To keep things relatively simple, this page is a bad bot list recompiled from old and new records for robots.txt only.
PLEASE NOTE: this will not prevent bad bots completely by any means! Many bots ignore robots.txt altogether, so the fact that this text file includes older bots is actually a good thing, since they are the more likely to still be actually reading the file.
I had the thought again today that it may not always be the best idea to shut out so many of these bots. I know we don’t want the site ripped off (not that robots.txt is going to make a difference to that). I know we don’t want spam marketing emails galore. I just have to point out that some of the higher Google-ranked websites turn out not to be the ones that enforce exhaustive security in robots.txt and htaccess. There is cautious, there is meticulous, and then there is downright anal and overboard.
If you look through this list, you will see that when one version of a robot is discovered, another is released, and it can take as little as a name change, no matter how small, to break through the disallow list. Always remember, nothing is infallible, and if someone wants IN, they WILL get in - end. Saying that, it is still a good idea to list many of these robots. Some are above-board marketing companies with ethics that will honour the txt file (well, we like to think so). A lot of these are part of commercial and open software releases that anyone can use - so if you don’t want their program to work on your site, then you can tell it so.
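To illustrate why a list like this ends up carrying a dozen versions of the same robot, here is a minimal Python sketch. The two functions and the sample names are hypothetical and are not how any particular robot or parser does its matching: one compares the whole user-agent string, the other compares only the name before the first slash.

```python
def blocked_exact(user_agent, blocklist):
    """Blocked only if the full user-agent string is listed verbatim."""
    return user_agent in blocklist


def blocked_by_name(user_agent, blocklist):
    """Blocked if the name before the first '/' matches a listed name.

    Comparison is case-insensitive and ignores the version suffix.
    """
    name = user_agent.split("/")[0].strip().lower()
    listed = {entry.split("/")[0].strip().lower() for entry in blocklist}
    return name in listed


# Hypothetical block list with two versioned entries.
blocklist = ["GetRight/4.5", "WebZIP/5.0"]

print(blocked_exact("GetRight/6.0", blocklist))    # new version slips past
print(blocked_by_name("GetRight/6.0", blocklist))  # name match still catches it
print(blocked_by_name("GetLeft/1.0", blocklist))   # a renamed robot beats both
```

Matching on the name alone copes with version bumps, but as the last line shows, a simple rename still gets through either way.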
Best effort has been made to remove any duplicate entries and get the list into a general alphabetical order. I never use anyone else’s code without checking it over, checking the syntax etc., so neither should you.
How do you find the latest bots? With a watchful eye on your raw access logs. Many robots / crawlers include links to their source so that you can find out who it is and what the robot is doing, and then decide for yourself whether to add it to your disallow list or not. If you are the sort that wants to ‘look’ like your web hits are through the roof on, say, ‘webalizer’, then you may as well allow the lot…. I prefer to only monitor real humans and the crawlers I DO want, not lies and statistics.
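Keeping a watchful eye on raw access logs can be partly automated. The following is a rough Python sketch, assuming the log is in the common Apache "combined" format (client IP first, user agent as the last quoted field); the function name count_hits and the sample lines are made up for illustration. It tallies hits per user agent and per IP address, since a crawler with an anonymous user agent will only stand out by IP.

```python
import re
from collections import Counter

# Tail of a "combined" log line: "request" status bytes "referer" "user-agent"
COMBINED_TAIL = re.compile(
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$'
)


def count_hits(lines):
    """Return (hits per user agent, hits per client IP) for parseable lines."""
    agents = Counter()
    ips = Counter()
    for line in lines:
        match = COMBINED_TAIL.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        agents[match.group("agent")] += 1
        ips[line.split(" ", 1)[0]] += 1
    return agents, ips


# Invented sample lines, not taken from any real log.
sample = [
    '1.2.3.4 - - [01/Nov/2008:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '1.2.3.4 - - [01/Nov/2008:00:00:03 +0000] "GET /a HTTP/1.1" 200 100 "-" "ExampleBot/1.0"',
    '5.6.7.8 - - [01/Nov/2008:00:00:05 +0000] "GET / HTTP/1.1" 200 512 "http://example.com/" "Mozilla/5.0"',
    "not a log line",
]

agents, ips = count_hits(sample)
for agent, hits in agents.most_common(10):
    print(hits, agent)
```

To run it over a real file, pass the open file object instead of the sample list, for example count_hits(open("access.log")), where access.log stands in for wherever your host keeps its raw logs.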
DON’T FORGET:
Disallow EVERYTHING, including Google, from your cgi-bin, private, secure etc. folders
Example:
User-agent: *
Disallow: /cgi-bin/
The ‘Google Hacking Database’ has become a quite popular pastime for many.
For example, ‘secret’ folders are easily reached by searching for intitle:index.of.secret in Google.
It is the same for ‘secure’, ‘cgi-bin’, /tmp and so on. Just go and see for yourself.
There is a handy little database which tells you more about some of the robots and what they do by name on robotstxt.org
User-agent: 216.34.209.23
Disallow: /

User-agent: aipbot
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: Alexibot
Disallow: /

User-agent: Aqua_Products
Disallow: /

User-agent: asterias
Disallow: /

User-agent: b2w/0.1
Disallow: /

User-agent: BackDoorBot
Disallow: /

User-agent: BackDoorBot/1.0
Disallow: /

User-agent: Black.Hole
Disallow: /

User-agent: BlackWidow
Disallow: /

User-agent: BlowFish
Disallow: /

User-agent: BlowFish/1.0
Disallow: /

User-agent: Bookmark search tool
Disallow: /

User-agent: Bot mailto:craftbot@yahoo.com
Disallow: /

User-agent: BotALot
Disallow: /

User-agent: BotRightHere
Disallow: /

User-agent: BuiltBotTough
Disallow: /

User-agent: Bullseye
Disallow: /

User-agent: Bullseye/1.0
Disallow: /

User-agent: BunnySlippers
Disallow: /

User-agent: b2w/0.1
Disallow: /

User-agent: becomebot
Disallow: /

User-agent: Cegbfeieh
Disallow: /

User-agent: CheeseBot
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: CherryPickerElite/1.0
Disallow: /

User-agent: CherryPickerSE/1.0
Disallow: /

User-agent: ChinaClaw
Disallow: /

User-agent: Copernic
Disallow: /

User-agent: CopyRightCheck
Disallow: /

User-agent: Crescent
Disallow: /

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /

User-agent: Custo
Disallow: /

User-agent: cosmos
Disallow: /

User-agent: DISCo
Disallow: /

User-agent: DISCo Pump 3.0
Disallow: /

User-agent: DISCo Pump 3.2
Disallow: /

User-agent: DISCoFinder
Disallow: /

User-agent: DittoSpyder
Disallow: /

User-agent: Download Demon
Disallow: /

User-agent: Download Demon/3.2.0.8
Disallow: /

User-agent: Download Demon/3.5.0.11
Disallow: /

User-agent: dumbot
Disallow: /

User-agent: eCatch
Disallow: /

User-agent: eCatch/3.0
Disallow: /

User-agent: EirGrabber
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: EroCrawler
Disallow: /

User-agent: Express WebPictures
Disallow: /

User-agent: Express WebPictures (www.express-soft.com)
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: EyeNetIE
Disallow: /

User-agent: Enterprise_Search
Disallow: /

User-agent: Enterprise_Search/1.0
Disallow: /

User-agent: es
Disallow: /

User-agent: FairAd Client
Disallow: /

User-agent: Flaming AttackBot
Disallow: /

User-agent: FlashGet
Disallow: /

User-agent: FlashGet WebWasher 3.2
Disallow: /

User-agent: Foobot
Disallow: /

User-agent: FrontPage
Disallow: /

User-agent: FrontPage [NC,OR]
Disallow: /

User-agent: Fasterfox
Disallow: /

User-agent: Gaisbot
Disallow: /

User-agent: GetRight
Disallow: /

User-agent: GetRight/2.11
Disallow: /

User-agent: GetRight/3.1
Disallow: /

User-agent: GetRight/3.2
Disallow: /

User-agent: GetRight/3.3
Disallow: /

User-agent: GetRight/3.3.3
Disallow: /

User-agent: GetRight/3.3.4
Disallow: /

User-agent: GetRight/4.0.0
Disallow: /

User-agent: GetRight/4.1.0
Disallow: /

User-agent: GetRight/4.1.1
Disallow: /

User-agent: GetRight/4.1.2
Disallow: /

User-agent: GetRight/4.2
Disallow: /

User-agent: GetRight/4.2b (Portuguxeas)
Disallow: /

User-agent: GetRight/4.2c
Disallow: /

User-agent: GetRight/4.3
Disallow: /

User-agent: GetRight/4.5
Disallow: /

User-agent: GetRight/4.5a
Disallow: /

User-agent: GetRight/4.5b
Disallow: /

User-agent: GetRight/4.5b1
Disallow: /

User-agent: GetRight/4.5b2
Disallow: /

User-agent: GetRight/4.5b3
Disallow: /

User-agent: GetRight/4.5b6
Disallow: /

User-agent: GetRight/4.5b7
Disallow: /

User-agent: GetRight/4.5c
Disallow: /

User-agent: GetRight/4.5d
Disallow: /

User-agent: GetRight/4.5e
Disallow: /

User-agent: GetRight/5.0beta1
Disallow: /

User-agent: GetRight/5.0beta2
Disallow: /

User-agent: GetWeb!
Disallow: /

User-agent: Go!Zilla
Disallow: /

User-agent: Go!Zilla (www.gozilla.com)
Disallow: /

User-agent: Go!Zilla 3.3 (www.gozilla.com)
Disallow: /

User-agent: Go!Zilla 3.5 (www.gozilla.com)
Disallow: /

User-agent: Go-Ahead-Got-It
Disallow: /

User-agent: GrabNet
Disallow: /

User-agent: Grafula
Disallow: /

User-agent: grub
Disallow: /

User-agent: grub-client
Disallow: /

User-agent: HMView
Disallow: /

User-agent: HTTrack
Disallow: /

User-agent: HTTrack 3.0
Disallow: /

User-agent: HTTrack [NC,OR]
Disallow: /

User-agent: Harvest
Disallow: /

User-agent: Harvest/1.5
Disallow: /

User-agent: hloader
Disallow: /

User-agent: httplib
Disallow: /

User-agent: humanlinks
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: ia_archiver/1.6
Disallow: /

User-agent: IconSurf
Disallow: /

User-agent: Image Stripper
Disallow: /

User-agent: ImageWalker/2.0
Disallow: /

User-agent: Image Sucker
Disallow: /

User-agent: Indy Library
Disallow: /

User-agent: Indy Library [NC,OR]
Disallow: /

User-agent: InfoNaviRobot
Disallow: /

User-agent: InterGET
Disallow: /

User-agent: Internet Ninja
Disallow: /

User-agent: InternetSeer.com
Disallow: /

User-agent: Internet Ninja 4.0
Disallow: /

User-agent: Internet Ninja 5.0
Disallow: /

User-agent: Internet Ninja 6.0
Disallow: /

User-agent: Iron33/1.0.2
Disallow: /

User-agent: JOC Web Spider
Disallow: /

User-agent: JennyBot
Disallow: /

User-agent: JetCar
Disallow: /

User-agent: Kenjin Spider
Disallow: /

User-agent: Kenjin.Spider
Disallow: /

User-agent: Keyword Density/0.9
Disallow: /

User-agent: Keyword.Density
Disallow: /

User-agent: LNSpiderguy
Disallow: /

User-agent: LeechFTP
Disallow: /

User-agent: LexiBot
Disallow: /

User-agent: LinkScan/8.1a Unix
Disallow: /

User-agent: LinkWalker
Disallow: /

User-agent: LinkWalker/2.0
Disallow: /

User-agent: LinkextractorPro
Disallow: /

User-agent: larbin
Disallow: /

User-agent: larbin (samualt9@bigfoot.com)
Disallow: /

User-agent: larbin samualt9@bigfoot.com
Disallow: /

User-agent: larbin_2.6.2 (kabura@sushi.com)
Disallow: /

User-agent: larbin_2.6.2 (larbin2.6.2@unspecified.mail)
Disallow: /

User-agent: larbin_2.6.2 (listonATccDOTgatechDOTedu)
Disallow: /

User-agent: larbin_2.6.2 (vitalbox1@hotmail.com)
Disallow: /

User-agent: larbin_2.6.2 kabura@sushi.com
Disallow: /

User-agent: larbin_2.6.2 larbin2.6.2@unspecified.mail
Disallow: /

User-agent: larbin_2.6.2 larbin@correa.org
Disallow: /

User-agent: larbin_2.6.2 listonATccDOTgatechDOTedu
Disallow: /

User-agent: larbin_2.6.2 vitalbox1@hotmail.com
Disallow: /

User-agent: libWeb/clsHTTP
Disallow: /

User-agent: lwp-trivial
Disallow: /

User-agent: looksmart
Disallow: /

User-agent: lwp-trivial/1.34
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: MIDown tool
Disallow: /

User-agent: MIIxpc
Disallow: /

User-agent: MIIxpc/4.2
Disallow: /

User-agent: MSIECrawler
Disallow: /

User-agent: Mass Downloader
Disallow: /

User-agent: Mass Downloader/2.2
Disallow: /

User-agent: Mata Hari
Disallow: /

User-agent: Mata.Hari
Disallow: /

User-agent: Microsoft URL Control
Disallow: /

User-agent: Microsoft URL Control - 5.01.4511
Disallow: /

User-agent: Microsoft URL Control - 6.00.8169
Disallow: /

User-agent: Microsoft.URL
Disallow: /

User-agent: Mister PiX
Disallow: /

User-agent: Mister PiX version.dll
Disallow: /

User-agent: Mister Pix II 2.01
Disallow: /

User-agent: Mister Pix II 2.02a
Disallow: /

User-agent: Mister.PiX
Disallow: /

User-agent: moget
Disallow: /

User-agent: moget/2.1
Disallow: /

User-agent: naver
Disallow: /

User-agent: NICErsPRO
Disallow: /

User-agent: NPBot
Disallow: /

User-agent: Navroad
Disallow: /

User-agent: NearSite
Disallow: /

User-agent: Net Vampire
Disallow: /

User-agent: Net Vampire/3.0
Disallow: /

User-agent: NetAnts
Disallow: /

User-agent: NetAnts/1.10
Disallow: /

User-agent: NetAnts/1.23
Disallow: /

User-agent: NetAnts/1.24
Disallow: /

User-agent: NetAnts/1.25
Disallow: /

User-agent: NetMechanic
Disallow: /

User-agent: NetSpider
Disallow: /

User-agent: NetZIP
Disallow: /

User-agent: NetZip Downloader 1.0 Win32(Nov 12 1998)
Disallow: /

User-agent: NetZip-Downloader/1.0.62 (Win32; Dec 7 1998)
Disallow: /

User-agent: NetZippy+(http://www.innerprise.net/usp-spider.asp)
Disallow: /

User-agent: Octopus
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Offline Explorer/1.2
Disallow: /

User-agent: Offline Explorer/1.4
Disallow: /

User-agent: Offline Explorer/1.6
Disallow: /

User-agent: Offline Explorer/1.7
Disallow: /

User-agent: Offline Explorer/1.9
Disallow: /

User-agent: Offline Explorer/2.0
Disallow: /

User-agent: Offline Explorer/2.1
Disallow: /

User-agent: Offline Explorer/2.3
Disallow: /

User-agent: Offline Explorer/2.4
Disallow: /

User-agent: Offline Explorer/2.5
Disallow: /

User-agent: Offline Navigator
Disallow: /

User-agent: Offline.Explorer
Disallow: /

User-agent: Openbot
Disallow: /

User-agent: Openfind
Disallow: /

User-agent: Openfind data gatherer
Disallow: /

User-agent: Oracle Ultra Search
Disallow: /

User-agent: pavuk
Disallow: /

User-agent: PerMan
Disallow: /

User-agent: pcBrowser
Disallow: /

User-agent: psbot
Disallow: /

User-agent: PageGrabber
Disallow: /

User-agent: Papa Foto
Disallow: /

User-agent: PerMan
Disallow: /

User-agent: ProPowerBot/2.14
Disallow: /

User-agent: ProWebWalker
Disallow: /

User-agent: Python-urllib
Disallow: /

User-agent: QueryN Metasearch
Disallow: /

User-agent: QueryN.Metasearch
Disallow: /

User-agent: RMA
Disallow: /

User-agent: Radiation Retriever 1.1
Disallow: /

User-agent: ReGet
Disallow: /

User-agent: RealDownload
Disallow: /

User-agent: RealDownload/4.0.0.40
Disallow: /

User-agent: RealDownload/4.0.0.41
Disallow: /

User-agent: RealDownload/4.0.0.42
Disallow: /

User-agent: RepoMonkey
Disallow: /

User-agent: RepoMonkey Bait & Tackle/v1.01
Disallow: /

User-agent: SBIder
Disallow: /

User-agent: SBIder/SBIder-0.8.2-dev
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: SlySearch
Disallow: /

User-agent: SmartDownload
Disallow: /

User-agent: SmartDownload/1.2.76 (Win32; Apr 1 1999)
Disallow: /

User-agent: SmartDownload/1.2.77 (Win32; Aug 17 1999)
Disallow: /

User-agent: SmartDownload/1.2.77 (Win32; Feb 1 2000)
Disallow: /

User-agent: SmartDownload/1.2.77 (Win32; Jun 19 2001)
Disallow: /

User-agent: SpankBot
Disallow: /

User-agent: sootle
Disallow: /

User-agent: Sqworm/2.9.85-BETA (beta_release; 20011115-775; i686-pc-linux
Disallow: /

User-agent: SuperBot
Disallow: /

User-agent: SuperBot/3.0 (Win32)
Disallow: /

User-agent: SuperBot/3.1 (Win32)
Disallow: /

User-agent: SuperHTTP
Disallow: /

User-agent: SuperHTTP/1.0
Disallow: /

User-agent: Surfbot
Disallow: /

User-agent: Szukacz/1.4
Disallow: /

User-agent: searchpreview
Disallow: /

User-agent: spanner
Disallow: /

User-agent: SurveyBot
Disallow: /

User-agent: suzuran
Disallow: /

User-agent: tAkeOut
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: Teleport Pro/1.29
Disallow: /

User-agent: Teleport Pro/1.29.1590
Disallow: /

User-agent: Teleport Pro/1.29.1634
Disallow: /

User-agent: Teleport Pro/1.29.1718
Disallow: /

User-agent: Teleport Pro/1.29.1820
Disallow: /

User-agent: Teleport Pro/1.29.1847
Disallow: /

User-agent: Telesoft
Disallow: /

User-agent: The Intraformant
Disallow: /

User-agent: The.Intraformant
Disallow: /

User-agent: TheNomad
Disallow: /

User-agent: TightTwatBot
Disallow: /

User-agent: toCrawl/UrlDispatcher
Disallow: /

User-agent: True_Robot
Disallow: /

User-agent: True_Robot/1.0
Disallow: /

User-agent: turingos
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: TurnitinBot/1.5
Disallow: /

User-agent: Titan
Disallow: /

User-agent: URL Control
Disallow: /

User-agent: URL_Spider_Pro
Disallow: /

User-agent: URLy Warning
Disallow: /

User-agent: URLy.Warning
Disallow: /

User-agent: VCI
Disallow: /

User-agent: VCI WebViewer VCI WebViewer Win32
Disallow: /

User-agent: VoidEYE
Disallow: /

User-agent: WWW-Collector-E
Disallow: /

User-agent: WWWOFFLE
Disallow: /

User-agent: Web Image Collector
Disallow: /

User-agent: Web Sucker
Disallow: /

User-agent: Web.Image.Collector
Disallow: /

User-agent: WebAuto
Disallow: /

User-agent: WebAuto/3.40 (Win98; I)
Disallow: /

User-agent: WebBandit
Disallow: /

User-agent: WebBandit/3.50
Disallow: /

User-agent: WebCapture 2.0
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: WebCopier v.2.2
Disallow: /

User-agent: WebCopier v2.5
Disallow: /

User-agent: WebCopier v2.6
Disallow: /

User-agent: WebCopier v2.7a
Disallow: /

User-agent: WebCopier v2.8
Disallow: /

User-agent: WebCopier v3.0
Disallow: /

User-agent: WebCopier v3.0.1
Disallow: /

User-agent: WebCopier v3.2
Disallow: /

User-agent: WebCopier v3.2a
Disallow: /

User-agent: WebEMailExtrac.*
Disallow: /

User-agent: WebEnhancer
Disallow: /

User-agent: WebFetch
Disallow: /

User-agent: webfetch/2.1.0
Disallow: /

User-agent: WebGo IS
Disallow: /

User-agent: WebLeacher
Disallow: /

User-agent: WebReaper
Disallow: /

User-agent: WebReaper [info@webreaper.net]
Disallow: /

User-agent: WebReaper [webreaper@otway.com]
Disallow: /

User-agent: WebReaper v9.1 - www.otway.com/webreaper
Disallow: /

User-agent: WebReaper v9.7 - www.webreaper.net
Disallow: /

User-agent: WebReaper v9.8 - www.webreaper.net
Disallow: /

User-agent: WebReaper vWebReaper v7.3 - www,otway.com/webreaper
Disallow: /

User-agent: WebSauger
Disallow: /

User-agent: WebSauger 1.20b
Disallow: /

User-agent: WebSauger 1.20j
Disallow: /

User-agent: WebSauger 1.20k
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebStripper/2.03
Disallow: /

User-agent: WebStripper/2.10
Disallow: /

User-agent: WebStripper/2.12
Disallow: /

User-agent: WebStripper/2.13
Disallow: /

User-agent: WebStripper/2.15
Disallow: /

User-agent: WebStripper/2.16
Disallow: /

User-agent: WebStripper/2.19
Disallow: /

User-agent: Website Quester
Disallow: /

User-agent: Webster Pro
Disallow: /

User-agent: WebZip
Disallow: /

User-agent: WebWhacker
Disallow: /

User-agent: WebZIP/2.75 (http://www.spidersoft.com)
Disallow: /

User-agent: WebZIP/3.65 (http://www.spidersoft.com)
Disallow: /

User-agent: WebZIP/3.80 (http://www.spidersoft.com)
Disallow: /

User-agent: WebZIP/4.1 (http://www.spidersoft.com)
Disallow: /

User-agent: WebZIP/4.21
Disallow: /

User-agent: WebZIP/4.21 (http://www.spidersoft.com)
Disallow: /

User-agent: WebZIP/5.0
Disallow: /

User-agent: WebZIP/5.0 PR1 (http://www.spidersoft.com)
Disallow: /

User-agent: WebZIP/7.0
Disallow: /

User-agent: WebZip/4.0
Disallow: /

User-agent: wget
Disallow: /

User-agent: Wget/1.5.3
Disallow: /

User-agent: Wget/1.6
Disallow: /

User-agent: Wget/1.5.2
Disallow: /

User-agent: Wget/1.7
Disallow: /

User-agent: Wget/1.8
Disallow: /

User-agent: Wget/1.8.1
Disallow: /

User-agent: Wget/1.8.1+cvs
Disallow: /

User-agent: Wget/1.8.2
Disallow: /

User-agent: Wget/1.9-beta
Disallow: /

User-agent: Widow
Disallow: /

User-agent: WebmasterWorldForumBot
Disallow: /

User-agent: Website Quester
Disallow: /

User-agent: Website Quester - www.asona.org
Disallow: /

User-agent: Website Quester - www.esalesbiz.com/extra/
Disallow: /

User-agent: Website eXtractor
Disallow: /

User-agent: Website eXtractor (http://www.asona.org)
Disallow: /

User-agent: WebmasterWorldForumBot
Disallow: /

User-agent: Xaldon WebSpider
Disallow: /

User-agent: Xaldon WebSpider 2.5.b3
Disallow: /

User-agent: Xenu's
Disallow: /

User-agent: Xenu's Link Sleuth 1.1c
Disallow: /

User-agent: Zeus
Disallow: /

User-agent: Zeus 11389 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 11652 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 18018 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 26378 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 30747 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 32297 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 39206 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 41641 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 44238 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 51070 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 51674 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 51837 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 63567 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 6694 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 71129 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 82016 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 82900 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 84842 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 90872 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 94934 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 95245 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 95351 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus 97371 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus Link Scout
Disallow: /
Now, you may agree that this list is BEYOND ridiculous, and it still will not include the probably 10 or so malicious robots that appeared only yesterday.
I think it would be a far better and wiser approach to simply make a robots.txt that bans the lot apart from a safe list of good robots, if there is such a thing!
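As a rough sketch of that "ban the lot apart from a safe list" idea, the Python below writes an allowlist-style robots.txt and checks it with the standard library's urllib.robotparser. The helper names build_allowlist_robots and can_fetch, and the two robot names used as the "good" list, are only illustrative assumptions; which robots you trust is your own call. As with the long list above, only robots that actually honour robots.txt will take any notice.

```python
from urllib.robotparser import RobotFileParser


def build_allowlist_robots(allowed_agents):
    """Return robots.txt text allowing the named agents and banning the rest."""
    lines = []
    for agent in allowed_agents:
        # An empty Disallow means "nothing is disallowed" for this agent.
        lines += [f"User-agent: {agent}", "Disallow:", ""]
    # Every other robot falls through to the catch-all record.
    lines += ["User-agent: *", "Disallow: /"]
    return "\n".join(lines) + "\n"


def can_fetch(robots_text, user_agent, path="/"):
    """Ask Python's own parser whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, path)


# Illustrative "good robot" list only.
robots_text = build_allowlist_robots(["Googlebot", "Slurp"])
print(robots_text)
print(can_fetch(robots_text, "Googlebot/2.1"))  # allowed by its own record
print(can_fetch(robots_text, "WebZIP/5.0"))     # caught by the catch-all
```

The result is a handful of lines rather than several hundred, and a renamed ripper lands in the catch-all record by default instead of slipping past a list of known names.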
Filed under: Code, Robots + Htaccess




