Google Blogoscoped

Forum

WMW Updates Redirect Handling (Again)

John Honeck [PersonRank 10]

Tuesday, March 6, 2007
17 years ago, 5,197 views

Kudos to all involved. He went further than 'first click free' and made it 4-5 clicks free, which really has the searcher in mind, as they may want to click through the site some more after hitting the landing page.

Now that they are in the clear, I think we can expect a definitive follow-up by Matt on his blog or even an 'official' webmaster blog post from Adam.

Cloakers beware!

(Uh-oh! I said cloaking; now this will turn into a 100-comment thread of it's-cloaking vs. it's-not-cloaking.)

JohnMu [PersonRank 10]

17 years ago #

Sounds like a good plan :-). Let's hope it sticks.

Hong Xiaowan [PersonRank 10]

17 years ago #

Useless update.

As a test, I already downloaded 1697 pages from WMW this morning. I will wait and see whether WMW gives me the error page. So far, I have not found any error pages.

jersey [PersonRank 0]

17 years ago #

God help you, Philipp, if you had any ulterior motive or skin in the game in pursuing Brett and WMW for this length of time.

Philipp Lenssen [PersonRank 10]

17 years ago #

You too if you had any ulterior motive in posting that comment, Jersey ;)

Elias Kai [PersonRank 10]

17 years ago #

This should apply to all newspapers.

Hong Xiaowan [PersonRank 10]

17 years ago #

Now I have downloaded 6460 pages from WMW and still have not found any error pages.

engine [PersonRank 0]

17 years ago #

"Hong Xiaowan
Now I have downloaded 6460 pages from WMW and still have not found any error pages"

It's things like the above that make Brett block users.

Hong Xiaowan [PersonRank 10]

17 years ago #

Now the error pages work. Do I need to fake my IP to test again? Useless.

Hong Xiaowan [PersonRank 10]

17 years ago #

Downloading from 03-07 05:45 to 03-08 01:13.
Total threads: 34607; successfully downloaded: 7654; overall success rate: 21.86%.
To successfully download all 34607 threads, I still need 3 days and about 14 hours.

WMW seems not to block my IP; it just lets a certain share of page requests succeed.
In the chart, a rate of 500 equals 100%; the thread and success counts are the real amounts per minute:
http://hongxiaowan.googlepages.com/wmw.gif
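
For readers who want to check figures like these, here is a minimal Python sketch of the arithmetic. It assumes the year 2007 (taken from the thread date) and a constant download rate, and the function name `estimate_remaining` is made up for illustration. The poster's own method is not stated, so the numbers it prints need not match the ones quoted.

```python
from datetime import datetime, timedelta


def estimate_remaining(start, end, done, total):
    """Return (success_fraction, estimated_time_left), assuming a constant rate."""
    elapsed = end - start
    fraction = done / total
    seconds_per_page = elapsed.total_seconds() / done
    remaining = timedelta(seconds=seconds_per_page * (total - done))
    return fraction, remaining


# Figures quoted in the post; the year is an assumption based on the thread date.
fraction, remaining = estimate_remaining(
    datetime(2007, 3, 7, 5, 45),
    datetime(2007, 3, 8, 1, 13),
    done=7654,
    total=34607,
)
print(f"success so far: {fraction:.2%}")
print(f"estimated time left: {remaining}")
```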

Philipp Lenssen [PersonRank 10]

17 years ago #

Xiaowan, do you follow the robots.txt guidelines while screenscraping WMW? If not, I think it's unfair to Brett to do this.

Hong Xiaowan [PersonRank 10]

17 years ago #

The way WMW did this really makes many people interested in trying to work out how to download pages from WMW.

I know there are another 7 guys also researching this now. I think more people will join this senseless but funny test.

If WMW opened all pages, they would lose interest and stop at once.

Hong Xiaowan [PersonRank 10]

17 years ago #

Dear Philipp Lenssen
Yes, I followed the robots.txt and used real cookies. Those are the rules, and I think I have the right to download the pages when I obey the rules.
Best regards.

Hong Xiaowan [PersonRank 10]

17 years ago #

User-agent: *
Disallow: /

I have only just now seen the rules. They mean that no bot has the right to download pages.

Sorry, I did not follow the rules. I used a bot and set it to follow the robots.txt. Something is wrong. I am checking.
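
As an illustration of how a bot can check rules like the two lines quoted above, here is a small sketch using Python's standard `urllib.robotparser`. The rules are parsed from a list of strings rather than fetched from the site; the helper name `is_allowed` and the user-agent string `ExampleBot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, url):
    """Parse robots.txt lines and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# The two rules quoted in the post: every user agent, every path disallowed.
rules = ["User-agent: *", "Disallow: /"]
url = "http://www.webmasterworld.com/"
print(is_allowed(rules, "ExampleBot", url))
```

With a blanket `Disallow: /` for every user agent, the check comes back negative for any path on the site, which matches the reading in the post that no bot is permitted to download pages.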

Hong Xiaowan [PersonRank 10]

17 years ago #

The logs say the bot followed the robots rules. There appear to be three robots files. The line-break character ("enter letter") in the first file is not a real line break, so the bot followed the rules in the other two files.

10:40:09 Warning: Link www.webmasterworld.com/profilev4.cgi?action=view&member=Mikevanh not scanned (follow robots meta tag)
10:41:38 Warning: Link www.webmasterworld.com/profilev4.cgi?action=view&member=Vahid not scanned (follow robots meta tag)

But this bot did not follow all the rules; it should have used a 17 s delay.

Excuse me. I am stopping the bot now and asking the other guys to stop downloading the pages, because WMW said they only allow four bots.
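
A delay between requests, such as the 17 seconds mentioned in this post, can be sketched roughly as follows. This is only an illustration: `fetch_politely` is a hypothetical helper, the `fetch` callable stands in for whatever actually downloads a page, and the sleep function is injectable so the example runs without really waiting or touching the network.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=17.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing delay_seconds between requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        results.append(fetch(url))
    return results


# Demo with a stand-in fetch and a recording "sleep" instead of a real pause.
pauses = []
pages = fetch_politely(
    ["/a", "/b", "/c"],
    fetch=lambda url: f"body of {url}",
    sleep=pauses.append,
)
print(pages)
print(pauses)
```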

Panda Mimi [PersonRank 1]

17 years ago #

This thread brought GBC (Google Blogoscoped in Chinese) down for a while each day.
