earthworm
"benevolent virus propagation" for team file sharing
Technical Elements
IP/port scanning
Unlike regular computer viruses, most worms spread over a network. Our IP/port scanner first pings a set range of IP addresses to detect every computer that is online. After detecting all the computers on the network, the scanner looks for an open port on each of them. An open port allows the worm to send a file to another computer and to communicate with other infected computers.
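The open-port check described above can be sketched roughly as follows. This is a minimal illustration, not our scanner: it skips the ping step (ICMP usually needs elevated privileges) and simply attempts a TCP connection. The function names `is_port_open` and `find_peers` and the timeout value are assumptions made for the example.

```python
import socket


def is_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the port as closed.
        return False


def find_peers(hosts, port, timeout=0.5):
    """Return the hosts (from any iterable) that accept connections on port."""
    return [host for host in hosts if is_port_open(host, port, timeout)]
```

For example, `find_peers((f"192.168.1.{i}" for i in range(1, 255)), 13370)` would probe a hypothetical /24 range for machines listening on the Earthworm port.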
Another method of file propagation that we looked into was the Gnutella protocol. The Gnutella protocol consists of five message types: ping, pong, query, query reply, and push. A ping message is used to discover new nodes on the network. A pong message is sent as a reply to a ping and provides information on a network node, including its IP address, port number, and number of files shared. A query message is used to search for files shared by other nodes on the network; it contains a query string and a minimum requested link speed. A query reply message contains a list of one or more files that match a given query, the size of each file, and the link speed of the responding node. Finally, a push message asks a node behind a firewall, which cannot accept incoming connections, to open the connection itself and upload the requested file. Due to limited time, we were unable to implement this type of file propagation.
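Since we did not implement Gnutella, the following is only an illustrative sketch of how the five message types are told apart on the wire. It assumes the 23-byte descriptor header layout of the Gnutella 0.4 specification (16-byte ID, type code, TTL, hops, little-endian payload length). The helper names and the default TTL of 7 are assumptions for the example.

```python
import os
import struct

# Payload descriptor codes as given in the Gnutella 0.4 specification.
PING, PONG, PUSH, QUERY, QUERY_HIT = 0x00, 0x01, 0x40, 0x80, 0x81

# 16-byte descriptor ID, type, TTL, hops, payload length (little-endian).
HEADER = struct.Struct("<16sBBBI")


def pack_header(kind, payload_length=0, ttl=7, hops=0, descriptor_id=None):
    """Build a descriptor header; a random ID is generated if none is given."""
    if descriptor_id is None:
        descriptor_id = os.urandom(16)
    return HEADER.pack(descriptor_id, kind, ttl, hops, payload_length)


def unpack_header(data):
    """Decode the first 23 bytes of a message into a dictionary."""
    descriptor_id, kind, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {"id": descriptor_id, "kind": kind, "ttl": ttl,
            "hops": hops, "payload_length": length}
```

A ping carries no payload, so `pack_header(PING)` is already a complete message; the other types append their payload after the header.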
Local dropbox monitoring
When a user drops a file into their local dropbox, a message must be sent to every other computer on the network currently running the Earthworm client, telling those computers to come get the new file. A monitoring script takes a snapshot of the dropbox directory contents every ten seconds and compares it to the previous snapshot. If a file has appeared in the user's local dropbox that wasn't there before, or if a file that was there before is gone, that change must be propagated to the other computers on the network. A GET request is sent to each of these computers, which then fetch the file for their own local dropboxes.
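The snapshot comparison can be sketched as below. This is a simplified illustration under stated assumptions: a snapshot is just the set of file names, the names `snapshot`, `diff_snapshots`, and `watch` are hypothetical, and `on_change` stands in for whatever code notifies the other computers.

```python
import os
import time


def snapshot(directory):
    """Return the set of regular file names currently in directory."""
    with os.scandir(directory) as entries:
        return {entry.name for entry in entries if entry.is_file()}


def diff_snapshots(previous, current):
    """Return (added, removed) file names between two snapshots, sorted."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    return added, removed


def watch(directory, on_change, interval=10.0):
    """Poll directory forever, calling on_change(added, removed) on changes."""
    previous = snapshot(directory)
    while True:
        time.sleep(interval)
        current = snapshot(directory)
        added, removed = diff_snapshots(previous, current)
        if added or removed:
            on_change(added, removed)
        previous = current
```

Comparing names alone misses files that are edited in place; detecting those needs extra per-file information such as a hash or modification time.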
Remote vs. local dropbox comparison
The purpose of this software is to propagate files the way a worm would, meaning that all files in one computer’s shared directory are placed in the shared directories of all other computers running our software. To determine when files need to be transferred from one computer to another, we needed a way to compare the files on the local computer with the files on another computer on our network. A comprehensive comparison requires more information about each file than its name and date modified. We needed a way to compare two files with the same name and determine which file should be transferred and which should be overwritten (we assumed that the newer file should always overwrite the older file). We used the Python md5 library to find the md5 hash of each file. An md5 hash is the output of a digest algorithm that characterizes a file's contents as a 32-digit hexadecimal number, which makes it easy to see whether two versions of a file are different. We wrote code that builds a Python dictionary with the name of each file in the shared directory as the keys and the md5 hashes and dates modified as the values. We also altered a method in the request handler so that a GET request without a specific filename returns the dictionary.
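A dictionary of this kind could be built along the following lines. This is a sketch rather than our original code: it uses `hashlib.md5`, the current standard-library equivalent of the older standalone `md5` module, and the names `file_md5` and `build_index` are assumptions for the example.

```python
import hashlib
import os


def file_md5(path, chunk_size=65536):
    """Return the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(directory):
    """Map each file name in directory to (md5 hex digest, modification time)."""
    index = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                index[entry.name] = (file_md5(entry.path), entry.stat().st_mtime)
    return index
```

Reading in chunks keeps memory use flat for large files; the digest is the same as hashing the whole file at once.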
The overall process we use to view and compare files is as follows. We randomly choose one computer that is on the network, and then compare the dictionaries of the two shared directories. If there is a file on the remote computer that is not on the local computer, we transfer it over the network via an HTTP request. If there is a file on the remote computer with the same name but a different md5 hash and a more recent date modified, we also download it to the local directory. This process, when run repeatedly on all machines, brings the shared directories on every computer running our software to the same set of files.
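The comparison rule above can be written as a small pure function. This sketch assumes each dictionary maps a file name to an `(md5 hash, date modified)` pair, with the date as a number; the name `files_to_fetch` is hypothetical.

```python
def files_to_fetch(local_index, remote_index):
    """Return the sorted names of files to download from the remote computer.

    A file is wanted if it is missing locally, or if the remote copy has a
    different hash and a more recent modification time.
    """
    wanted = []
    for name, (remote_hash, remote_mtime) in remote_index.items():
        if name not in local_index:
            wanted.append(name)
            continue
        local_hash, local_mtime = local_index[name]
        if remote_hash != local_hash and remote_mtime > local_mtime:
            wanted.append(name)
    return sorted(wanted)
```

Under this rule a file that is newer locally is left alone; the remote computer picks it up when it runs the same comparison in the other direction.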
File sending/receiving
Network communication between machines is handled using HTTP. Each machine runs its own HTTP server on port 13370 (not the standard port 80). The HTTP server listens for various requests made by clients and responds to them appropriately.
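A per-machine server of this kind might look roughly like the sketch below, built on Python's standard `http.server` module. It is an illustration under assumptions, not our request handler: the class name `DropboxHandler`, the helper `make_server`, the directory name, and the JSON listing (file name to modification time, returned when no filename is given) are all choices made for the example.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import unquote


class DropboxHandler(BaseHTTPRequestHandler):
    shared_dir = "dropbox"  # hypothetical directory name

    def do_GET(self):
        # Keep only the last path component so requests stay inside shared_dir.
        name = os.path.basename(unquote(self.path.split("?", 1)[0]))
        if not name:
            # No filename: describe the shared directory instead.
            listing = {}
            for n in os.listdir(self.shared_dir):
                p = os.path.join(self.shared_dir, n)
                if os.path.isfile(p):
                    listing[n] = os.path.getmtime(p)
            self._send(200, "application/json", json.dumps(listing).encode())
            return
        path = os.path.join(self.shared_dir, name)
        if not os.path.isfile(path):
            self._send(404, "text/plain", b"not found")
            return
        with open(path, "rb") as f:
            self._send(200, "application/octet-stream", f.read())

    def _send(self, status, content_type, body):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(shared_dir, port=13370):
    """Create (but do not start) a server that shares shared_dir on port."""
    handler = type("Handler", (DropboxHandler,), {"shared_dir": shared_dir})
    return ThreadingHTTPServer(("", port), handler)
```

Calling `make_server("dropbox").serve_forever()` would then serve the directory on port 13370 until interrupted.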
Earthworm was created for ENGR 3410 (Computer Architecture) at Olin College of Engineering in Needham, Massachusetts.
Team Earthworm is:
Leslie Gerhat, Eric Hwang, Katarina Miller, Raghu Rangan, and Garrett Rodrigues (Class of 2010).