Google Blogoscoped

On the "Google Copy Paste Syndrome"

/pd [PersonRank 10]

Tuesday, December 4, 2007
16 years ago, 7,247 views

Yes, this is an interesting syndrome: the copy-paste thingy does not permit the younger generation to string words together to form meaningful statements. It will surely be interesting to see how future generations make up new stories and articles.

asdf2579 [PersonRank 0]

16 years ago #

Educational institutions need to take a firm stand on this practice. At the college program where I teach, I busted nearly half my class a few years ago for cut and paste plagiarism and the school refused to discipline the students involved--in fact one of them made the honor roll.

Armand Asante [PersonRank 1]

16 years ago #

My God!

That paper has some HORRIBLE English.
I think those writers could do with a little copy/pasting themselves.

And as for copy/paste killing creativity – just take a look at all those 'Darth Vader Sings the Blues' vids on youtube. Creations made entirely of copy/pasting "plagiarized" material.

I feel sorry for the authors of that paper – they're old.

Matt Cutts [PersonRank 10]

16 years ago #

If I didn't have time to read all 187 pages, could I get a shorter summary? Maybe, one page? :)

Matt Cutts [PersonRank 10]

16 years ago #

Skimming it now. Page 50 has a picture that some people may like.

Page 73 has an interesting quote: "We would rather see a number of big search engines run by some official non-profit organisations than a single one run by a private, profit driven company."

But I don't think there's anything stopping a non-profit that wanted to build their own search engine?

The points on pages 73 & 74 are nearly a verbatim repeat of text earlier in the doc. Given that the paper discusses plagiarism, that's sort of strange.

Also on page 73 I read "It is now some 18 months ago that we found out that Google proved to be unwilling to support high power plagiarism detection services." I wonder if someone wanted to send a bunch of queries to Google or use our API, and we couldn't support that? If someone wanted to do high-volume queries for academic research, they might want to check out this program that Google now offers: http://research.google.com/university/search/

Some strange speculation on page 82: "There were rumours about a secret collaboration of Wikipedia and Google because Wikipedia key term entries usually appear very high on the list of Google search results (if not on top rank). In the empirical part of this study we were able to prove this fact. Of course this is no proof of a hidden collaboration! But it is interesting that for example Wikipedia admits on its web site that they had arranged a collaboration with Yahoo!, and the months after the contract Wikipedia links climbed up the Yahoo! search results list"

On page 84, the case of "Google addicts" is discussed: "Turning web searchers into "Google addicts": The problem of Google addiction so far is nearly completely neglected by social sciences and media psychology. On a recent ARTE documentation on Google, the phenomenon of "Google addiction" was covered: People who are Google addicts not only google the terms of their interest nearly the whole day (in an endless loop to see if something has changed), but they also use googling in the same way as chatting, doing phone calls or watching TV: to fill up time in which they do not work or do not want to work. So far there is no empirical study on Google addiction with convincing hard facts. We would strongly recommend to conduct such an investigation."

I don't know. I would take the feedback and suggestions (e.g. for better plagiarism APIs and tools) more seriously if they were proposed in a less controversial way.

Philipp Lenssen [PersonRank 10]

16 years ago #

> Also on page 73 I read "It is now some 18 months
> ago that we found out that Google proved to be
> unwilling to support high power plagiarism
> detection services." I wonder if someone wanted
> to send a bunch of queries to Google or use our
> API, and we couldn't support that?

There are some related sample apps here, and probably on other sites as well:

http://blogoscoped.com/quotefinder/
http://blogoscoped.com/text-color/

Admittedly, the official way to do this stuff server-side is not available for new users these days, as the Google SOAP API has been cancelled (and not everyone's a university student). The AJAX Search API is a limited alternative; perhaps it works in this case. Then again, screen-scraping Google results with tools like PHP5 is just as easy as using the SOAP API, and sometimes even more reliable.

B. [PersonRank 2]

16 years ago #

And, for better or for worse, it's going to get worse – or better – who knows? (http://plasq.com/skitch#demo). They copy and paste our DNA too, don't they? Anyway, we are only at the beginning of the discussion.

Future Converged [PersonRank 1]

16 years ago #

A while back there was news that Google was buying the claim-your-content.com domain and its variants. As always, the same technology that makes it possible to find and use lots of text excerpts from various sources can also be used to see where those excerpts came from.

I guess this may become a model for the future. You register with an authoritative site such as claim-your-content.com. Then every time you write something, you signal the site to index it (like you do with Technorati now) and you claim what you just wrote. From then on you are on record as the first person to have written this text. Of course the system can check to make sure you are actually the person who wrote it in the first place before it registers it to you. Google can use its search technology to see where the text might have originated from. In fact this is fantastically in line with their effort to scan paper books. Google already has the online digital world indexed, and once most books are also digitized and indexed, just about any sentence can be traced back to its origin.

So I would say this is a special time for those who want to plagiarize, because in the future it will get harder.
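
To make the "first person to claim it" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `ClaimRegistry` class, the `fingerprint` helper and the naive sentence splitter are illustrations only, not anything claim-your-content.com or Google is known to run. It also skips the hard part mentioned above, namely verifying that the claimant really is the author.

```python
import hashlib
import re
from typing import Dict, List, Optional, Tuple


def fingerprint(sentence: str) -> str:
    """Hash a sentence after lowercasing it and dropping punctuation and extra spaces."""
    normalized = " ".join(re.findall(r"[a-z0-9]+", sentence.lower()))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def split_sentences(text: str) -> List[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


class ClaimRegistry:
    """Hypothetical first-claim registry: the earliest claimant of a sentence is kept."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}

    def claim(self, author: str, text: str) -> None:
        for sentence in split_sentences(text):
            # setdefault keeps the first claimant and ignores any later claim
            self._owners.setdefault(fingerprint(sentence), author)

    def trace(self, text: str) -> List[Tuple[str, Optional[str]]]:
        """Return (sentence, first claimant or None) for every sentence in text."""
        return [(s, self._owners.get(fingerprint(s))) for s in split_sentences(text)]
```

Tracing a suspect essay is then `registry.trace(essay)`: every sentence comes back paired with whoever claimed it first, or `None` if nobody did. Exact hashing only catches near-verbatim copies, so lightly reworded text would slip through.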

David T [PersonRank 7]

16 years ago #

Very interesting article. I agree that the amount of "copy and pasting" going on is a negative thing. Nevertheless, in terms of university papers, I think this depends somewhat on your educational institution and the country you're studying in. For example, when I was at uni in the UK, they were very strict on plagiarism; France is much less strict about it; and in India, my impression was that it is more or less accepted!

All this to say that, at least for universities, I think there is certainly some cultural aspect in all of this that shouldn't be neglected.

stefan2904 [PersonRank 10]

16 years ago #

about the author: http://www.iicm.tugraz.at/maurer/

stefan2904 [PersonRank 10]

16 years ago #

German:
http://www.heise.de/newsticker/meldung/99953

http://www.iicm.tugraz.at/Ressourcen/Papers/Google.pdf

Mathias Schindler [PersonRank 10]

16 years ago #

thanks stefan2904 for the links. The google.pdf article is fun, he confuses gmail.de with googlemail :)

Helge Fahrnberger [PersonRank 0]

16 years ago #

The author had an outstanding track record earlier in his academic career but has stirred up quite some controversy in recent times. Links that German speakers might want to check out:

http://www.helge.at/2007/12/die-idee-fixe-des-herrn-maurer/
http://www.helge.at/2007/09/grazer-dekan-im-fettnaepfchen/

Andy Wong [PersonRank 10]

16 years ago #

I seem to see the points of the author of this lengthy article.

If you did what Google does slowly and manually, as we have been doing for the last 20 years without a decent search engine, the author might not consider it an invasion of privacy.

This professor is not alone. Do you remember that many governments blamed Google Earth/Maps for leaking military sites, even though those photos had been available through commercial channels for the last two decades?

The point is, the info is publicly available. Harvesting and aggregating it automatically and quickly does not constitute a privacy invasion. If you don't want private info to be seen through Google, just don't make it available through public channels. Of course, if someone digs your private info out and puts it on public channels, that is a privacy invasion.

These authors showed very poor logic when drafting this article. See this excerpt from the article:
"However, in conjunction with the fact that Google is operating many other services, and probably silently cooperating with still further players, this is unacceptable.
The reasons are basically:
– Google is massively invading privacy. It knows more than any other organisation about people, companies and organisations than any institution in history before, and is not restricted by national data protection laws.
– Thus, Google has turned into the largest and most powerful detective agency the world has ever known. I do not contend that Google has started to use this potential, but as commercial company it is FORCED to use this potential in the future, if it promises big revenue. If government x or company y is requesting support from Google for information on whatever for a large sum, Google will have to comply or else is violating its responsibilities towards its stockholders."

These are not reasons, but conclusions. After all, this article aggregates the viewpoints of many people who fear Google's power. That fear overrode their common sense: they kept blaming what made them afraid and ended up with illogical thoughts and confusion, as many readers have pointed out above.

Philipp Lenssen [PersonRank 10]

16 years ago #

Mathias Schindler says this is the official statement from Google in regards to the paper:

<<These allegations are premised on numerous inaccuracies, conspiracy theories and fundamental misunderstandings about Google's products and services. They're completely without foundation and, frankly, a little strange.>>
