honor robots.txt
03-15-2011, 01:56 PM
Post: #1
Hey guys,

I'm working on httparchive, and one of the current bugs is to make it honor robots.txt, which is really an upstream bug in wpt.

I'm not familiar with the wpt codebase, but I'd be happy to try to contribute if altering the spidering call to respect robots.txt seems relatively straightforward to someone who knows the code. (I'm guessing the actual spidering is executed via a PHP cURL extension call?)

If supporting robots.txt would be tough, I'll just handle the spidering step in httparchive for now.
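
To make the idea concrete, here is a rough sketch of the kind of check I mean, done before a URL is handed off for testing. It is written in Python with the standard library's robots.txt parser purely for illustration; it is not wpt or httparchive code, and the `is_allowed` helper, the `ExampleBot` user agent, and the sample rules are all made up for the example.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt body lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access happens here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt body. A real crawler would download it from the
# target host's /robots.txt before queueing any test for that host.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in ("http://example.com/", "http://example.com/private/page.html"):
        allowed = is_allowed(SAMPLE_ROBOTS, "ExampleBot", url)
        print(url, "->", "fetch" if allowed else "skip")
```

Keeping the check as a separate step in front of the fetch means it could live either upstream or in httparchive's own spidering step without changing how pages are actually loaded.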


Jared Hirsch

Messages In This Thread
honor robots.txt - jared - 03-15-2011, 01:56 PM
RE: honor robots.txt - pmeenan - 03-15-2011, 11:03 PM
RE: honor robots.txt - jared - 03-16-2011, 05:44 AM
