Turnitin Crawler
While troubleshooting a problem with a website, I was capturing HTTP
traffic with Wireshark. I noticed a connection from 65.98.224.2. The
first packet received from that address showed the software accessing
my support website identifying itself as shown below:
User-Agent: TurnitinBot/2.1 (http://www.turnitin.com/robot/crawlerinfo.html)
Checking the URL listed, I found the following:
Chances are that you are reading this because you found a reference
to this web page from your web server logs. This reference was left
by Turnitin.com's web crawling robot, also known as TurnitinBot. This
robot collects content from the Internet for the sole purpose of helping
educational institutions prevent plagiarism. In particular, we compare
student papers against the content we find on the Internet to see if
we can find similarities. For more information on this service, please
visit www.turnitin.com
The Wikipedia article on Turnitin describes it as "an Internet-based
plagiarism-detection service created by iParadigms, LLC. Institutions
(typically universities and high schools) buy licenses to submit essays
to the Turnitin website, which checks the document for plagiarism."
I had read that many schools now use such services to deter students
from submitting plagiarized papers. I've seen services offering pre-written
papers for students to submit for classes, so I can see the need for teachers
to use such detection services. I didn't realize this service crawled websites
to index materials on the web as part of its detection efforts, but it makes
sense to me that the service would do so. This is the first time I've
noticed this particular web crawler.
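To see how often a crawler like this visits, the web server's access log
can be searched for its User-Agent string. The following is a minimal
sketch, not something taken from Turnitin's documentation. It assumes the
log uses a layout like Apache's "combined" format, where the client
address is the first field on each line and the User-Agent string appears
later on the same line. The function name bot_hits is made up for
illustration.

```shell
# Print "address count" for every client whose log lines mention a crawler.
# Assumption: the client address is the first whitespace-separated field
# and the User-Agent string appears somewhere on the same line.
bot_hits() {
    # $1 = crawler name to look for, $2 = path to the access log
    grep -F -- "$1" "$2" |
        awk '{ count[$1]++ } END { for (ip in count) print ip, count[ip] }' |
        sort
}
```

It could be called as bot_hits TurnitinBot /var/log/httpd/access_log,
where the log path is an assumption (the usual Apache location on CentOS)
and should be adjusted to match the server.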
[/network/web/crawlers]
Installing Wireshark
I wanted to install Ethereal on a CentOS Linux system to sniff network traffic
to try to resolve a problem for a website. I have tcpdump on the system,
but I wanted to have a GUI tool to make analyzing the packets a little
easier for me.
I ran yum install ethereal, which installed wireshark and its dependency,
libsmi. Wireshark was installed because development of Ethereal has stopped
and the core development team is now developing Wireshark.
The Wireshark FAQ offers the following explanation of the name change:
In May of 2006, Gerald Combs (the original author of Ethereal) went
to work for CACE Technologies (best known for WinPcap). Unfortunately,
he had to leave the Ethereal trademarks behind.
This left the project in an awkward position. The only reasonable way to
ensure the continued success of the project was to change the name. This
is how Wireshark was born.
Wireshark is almost (but not quite) a fork. Normally a "fork" of an
open source project results in two names, web sites, development teams,
support infrastructures, etc. This is the case with Wireshark except for
one notable exception -- every member of the core development team is now
working on Wireshark. There has been no active development on Ethereal
since the name change. Several parts of the Ethereal web site (such as the
mailing lists, source code repository, and build farm) have gone offline.
After the installation completed, I tried running Wireshark by issuing
the command wireshark.
# wireshark
bash: wireshark: command not found
I then discovered that installing the wireshark RPM only installs a
command line program, tshark. The program was installed in
/usr/sbin/tshark. You can obtain help on tshark using man tshark or
tshark -h.
There is also documentation installed in /usr/share/wireshark/help.
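A quick way to see which executables a package actually put on the system
is to list its files with rpm -ql and keep only those under a bin or sbin
directory. This is a sketch rather than the method used above, and the
function name package_binaries is made up for illustration.

```shell
# List the files an installed RPM package placed in a bin or sbin directory.
package_binaries() {
    # $1 = name of an installed package, e.g. wireshark
    rpm -ql "$1" | grep -E '/s?bin/'
}
```

Running package_binaries wireshark on the system described here should
list /usr/sbin/tshark, which shows that the base package does not include
the wireshark GUI executable.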
I had to install wireshark-gnome to get the GUI version, which I did with
yum -y install wireshark-gnome. I could then start the GUI version from a
shell prompt with wireshark or start it by clicking on Applications,
Internet, and then Wireshark Network Analyzer.
Since I wanted to see only HTTP traffic, I typed HTTP in the Filter field
and then clicked on the Apply button. I then clicked on Capture,
Interfaces, and clicked on the Start button next to the eth0 interface to
start the capture.
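A similar capture can be sketched from the command line with tshark, which
is handy on a system without the GUI package. The function name
capture_user_agents is made up for illustration, and the filters are
assumptions: "tcp port 80" captures only traffic on the standard HTTP
port, and the display filter http.request limits the output to HTTP
requests. Current tshark releases take the display filter with -Y. Older
releases, possibly including the one installed above, used -R instead, so
check tshark -h first. Capturing normally requires root privileges.

```shell
# Print the source address and User-Agent header of each HTTP request seen.
capture_user_agents() {
    # $1 = interface to capture on (defaults to eth0)
    iface=${1:-eth0}
    # -f is the capture filter, -Y the display filter (older releases: -R);
    # -T fields with -e selects which fields are printed for each packet.
    tshark -i "$iface" -f "tcp port 80" -Y "http.request" \
        -T fields -e ip.src -e http.user_agent
}
```

With output like this, a crawler such as TurnitinBot would show up as one
line per request, with the client address followed by its User-Agent
string.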
[/network/tools/sniffing/wireshark]
Web Developer Extension for Firefox
The Web Developer extension for Firefox adds a menu and a toolbar to the
browser with various web developer tools. It is designed for Firefox,
Flock, and SeaMonkey, and will run on any platform that these browsers
support, including Windows, Mac OS X, and Linux.
You can install the extension by simply clicking on the link for it. When
it is installed, you will be notified that you should restart Firefox to
complete your changes.
The extension lets you easily view the headers or CSS information for a
page, check for Section 508 compliance, display the dimensions of images
on the page, and use many other capabilities useful to web developers.
[/network/web/browser/firefox]