## http://www.robotstxt.org/wc/norobots.html

#User-agent: ultraseek
#Disallow:

#User-agent: recseek
#Disallow:
## Crawl-delay: 300

## These bots are harvester/collector bots and are
## used to siphon email addresses from websites:

User-agent: CherryPickerElite/1.0
Disallow: /

User-agent: CherryPickerSE/1.0
Disallow: /

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /

User-agent: EmailCollector/1.0
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: EmailWolf 1.00
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: Mozilla/2.0 (compatible; NEWT ActiveX; Win32)
Disallow: /

User-agent: WebBandit/2.1
Disallow: /

User-agent: WebBandit/3.50
Disallow: /

User-agent: Webbandit/4.00.0
Disallow: /

#User-agent:
#Disallow: /

## The following bots are ill-behaved and tend to index our site during
## working hours, often with detrimental effects:

## Turnitin is an education spider which pulls data from various
## sources to identify plagiarism in student papers.

User-agent: TurnitinBot/2.0 http://www.turnitin.com/robot/crawlerinfo.html
#Disallow:
Crawl-delay: 30

User-Agent: MJ12bot
Crawl-Delay: 600

## Below here are the good bots (like OURS) and the
## default behaviour for unknown bots & spiders

User-agent: UltraSeek
#Disallow:
Crawl-Delay: 15

User-agent: *
Disallow: /webaccess
Crawl-Delay: 300