
Wikipedia Stubs Written By Bots

Under the Rose
Wikipedia is used by many people as a first resource when doing a quick search on a topic of interest, largely because it often provides links to a myriad of other sources. I just read a piece by ZME Science which makes the following remarks:

Sverker Johansson could encompass the definition of prolific. The 53-year-old Swede has edited so far 2.7 million articles on Wikipedia, or 8.5% of the entire collection. But there’s a catch – he did this with the help of a bot he wrote. Wait, you thought all Wikipedia articles are written by humans?
Read more at This author edits 10,000 Wikipedia entries a day

Lsjbot’s entries are categorized by Wikipedia as stubs – pages that contain only the most important, basic bits of information. This is why his bot works so well for animal species or towns, where it can make sense to automatize the process. In fact, if Wikipedia has a chance of reaching its goal of encompassing the sum of the whole human knowledge, it needs bots. It needs billions of entries, and this is no task a community of humans can achieve alone, not even one as active and large as Wikipedia.


http://www.zmescience.com/research/w...nce%29#!bf3Iwr


This is very interesting to me and explains why many of the pages contain only very basic content, presented in a similar format. I was wondering how many of you were already aware of this, and what your thoughts and comments are regarding the use of bots as research assistants and authors.
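For anyone curious about the mechanics, a stub bot like this works, as far as I understand, by pouring fields from a structured data record into a fixed sentence template, which is why the pages all read so similarly. Here is a minimal sketch of that idea in Python; the template wording, field names, and the record itself are my own invention for illustration, not Lsjbot's actual code or data sources.

```python
# Minimal sketch of template-driven stub generation (illustrative only;
# the field names and wording are hypothetical, not Lsjbot's real pipeline).

SPECIES_TEMPLATE = (
    "{name} is a species of {group} in the family {family}. "
    "It was described by {authority} in {year}. "
    "No subspecies are listed."
)

def make_species_stub(record: dict) -> str:
    """Fill the fixed template with fields scraped from a structured source."""
    return SPECIES_TEMPLATE.format(**record)

if __name__ == "__main__":
    # A hypothetical record, standing in for data pulled from a taxonomy database.
    record = {
        "name": "Example beetleus",
        "group": "beetle",
        "family": "Carabidae",
        "authority": "Smith",
        "year": 1902,
    }
    print(make_species_stub(record))
```

Once the source data is in hand, that loop can obviously be repeated thousands of times a day, which is the whole point.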
 
Thank you for the reply, NobleSavage.

A number of other sources are giving this a bit of coverage.

AnonTechie writes:
From Popular Science:
You might think writing 10,000 articles per day would be impossible. But not for a Swede named Sverker Johansson. He created a computer program that has written a total of 2.7 million articles, making Johansson the most prolific author, by far, on the "internet's encyclopedia." His contributions account for 8.5 percent of the articles on Wikipedia, the Wall Street Journal reports.
But how can a bot write so many articles, and do it coherently? As Johansson--a science teacher with degrees in linguistics, civil engineering, economics and particle physics--explained to the WSJ, the bot scrapes information from various trusted sources, and then cobbles that material together, typically into a very short entry, or "stub." Many of the articles cover the taxonomy of little-known animals such as butterflies and beetles, and also small towns in the Philippines (his wife is Filipino).
Johansson's creation, known as Lsjbot, is certainly not the only bot to write articles meant for human eyes. For example, the Associated Press just announced that it will use robots to write thousands of pieces, and other news outlets use programs to write articles, especially finance and sports stories. And on Wikipedia, half of all of the edits are made by bots.

 
What he is doing may be OK, but humans need to watch out for a bot that writes many one-sided articles on a subject. Imagine a political bot aligned with one side of politics that writes articles about every politician: the ones on the bot's side get good articles, and the ones that are not get bad ones. Ditto for many other subjects. There are solutions for this sort of thing.
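On the "solutions" point: one hypothetical safeguard (my own toy example, not anything from the article) would be to compare the tone of a bot's output across the groups it writes about and flag a consistent gap. A crude sketch in Python, assuming a hand-made list of loaded words:

```python
# Toy check for one-sided bot output: compare the rate of loaded words
# across articles grouped by affiliation. Word lists and articles are
# hypothetical; a real review process would be far more involved.

POSITIVE = {"principled", "respected", "effective", "honest"}
NEGATIVE = {"corrupt", "failed", "divisive", "dishonest"}

def tone_score(text: str) -> int:
    """Crude tone: +1 per positive word, -1 per negative word."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def average_tone_by_group(articles: list[tuple[str, str]]) -> dict[str, float]:
    """articles is a list of (group, text); return the mean tone per group."""
    totals: dict[str, list[int]] = {}
    for group, text in articles:
        totals.setdefault(group, []).append(tone_score(text))
    return {g: sum(scores) / len(scores) for g, scores in totals.items()}

if __name__ == "__main__":
    sample = [
        ("Party A", "a respected and effective legislator"),
        ("Party A", "an honest and principled reformer"),
        ("Party B", "a divisive and corrupt career politician"),
        ("Party B", "a failed and dishonest incumbent"),
    ]
    # A large, consistent gap between groups is the warning sign.
    print(average_tone_by_group(sample))
```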
 
I bet he had a bot doing all his homework. :)
 
This is further proof that Wikipedia is full of lies by the liberal intellectual elite trying to convert everyone to collectivism and turn their chilluns gay. This is yet another reason why patriotic Real Americans™ know to use Conservapedia instead. That way you get Fair and Balanced™ information instead of lies! [/conservolibertarian]
 
The remark about bots and homework strikes home with me; I surely would not want to be a teacher attempting to judge essay assignments fairly these days. It is bad enough that we no longer teach cursive script and now have spell check to correct instantly, instead of laboriously double-checking one's submission against a dictionary or thesaurus before handing it in. How can an educator be sure that the work they are scoring has not simply been purchased online?

As long as a student is clever enough to present work that is reasonably close to their own vocabulary and presentation style (or lightly edits a procured work), I'm not sure that they can make an easy determination. :thinking: A verbal presentation, on the other hand, they could assess more reliably, in my opinion.
 
There are tools teachers can buy to work out if an essay is a cut and paste from the Internet.
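For what it's worth, the core trick behind most of those checkers is simple: chop both texts into short overlapping word sequences ("shingles") and measure how much the two sets overlap. A minimal sketch in Python, with made-up example strings; the commercial tools add web-scale indexed corpora on top of this:

```python
# Minimal sketch of shingle-based overlap detection, the basic idea behind
# plagiarism checkers. Real products compare against huge indexed corpora;
# the example texts here are made up.

def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping n-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard_similarity(a: str, b: str, n: int = 3) -> float:
    """Overlap between the two shingle sets, from 0 (disjoint) to 1 (identical)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog near the riverbank"
    pasted = "as we know the quick brown fox jumps over the lazy dog near the riverbank today"
    reworded = "a fast auburn fox leaps above the idle hound beside the river"
    print(jaccard_similarity(source, pasted))    # high: mostly copied verbatim
    print(jaccard_similarity(source, reworded))  # near zero: synonyms defeat the check
```

Anything that rewrites the wording, whether a synonym pass or a translation, drives that overlap toward zero, which is exactly the weakness described in the next reply.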
 
Some universities use similar things on final papers. They will probably catch the lazy students, but I'm not sure that they will discover essays that change the order of arguments, manipulate the word order, and excel in using synonyms. Let's assume that I find a nice essay in English (or whatever language) that would suit my BA/MA or even PhD thesis. I translate it into Swedish in a way that's immensely more colloquial than the presumably dry original academic language, and juggle the details as described above. I'll bet you some useful money that I won't be caught by software. A well-read professor might spot it.
 
There are tons of these tools online for free. Any smart cheater would just run his plagiarized content through a few checkers and make changes until it comes back clean.

http://www.plagscan.com/seesources/analyse.php
 
Given that a large majority of people use a Bachelor's degree as a kind of generic indicator of intelligence, rather than using it in the specific field to which it relates, there is an argument to be made that a person who is a sufficiently skilled cheat deserves a degree anyway.

Given the number of middle managers who have a degree, but appear to be incompetent nonetheless, there is an argument to be made that this is already happening with some regularity. ;)
 