earthworm

"benevolent virus propagation" for team file sharing

Technical Elements

IP/port scanning

Unlike ordinary computer viruses, most worms spread over a network. The IP/port scanner first pings every address in a configured range of IP addresses to detect which computers on the network are online. It then looks for an open port on each computer it found. An open port lets the worm send files to that computer and communicate with other infected computers.
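The scan can be sketched roughly as follows. This is a minimal illustration rather than our actual scanner: it folds the ping step and the port check into a single TCP connection attempt, and the names is_port_open and scan_range are made up for this example.

```python
import socket

def is_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0

def scan_range(prefix, start, end, port, timeout=0.5):
    """Try prefix.start through prefix.end (inclusive) and
    return the addresses that accept connections on the port."""
    found = []
    for last_octet in range(start, end + 1):
        host = f"{prefix}.{last_octet}"
        if is_port_open(host, port, timeout):
            found.append(host)
    return found
```

For example, scan_range("192.168.1", 1, 254, 13370) would walk a typical home subnet looking for machines listening on the Earthworm port. The subnet here is only a placeholder.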

Another method of file propagation that we looked into was the Gnutella protocol.  The Gnutella protocol consists of five message types: ping, pong, query, query reply, and push. First, a ping message is used to discover new nodes on the network. Second, a pong message is sent in reply to a ping and provides information about a network node, including its IP address, port number, and number of files shared. Third, a query message is used to search for files shared by other nodes on the network; it contains a query string and a minimum requested link speed. Fourth, a query reply message contains a list of one or more files that match a given query, the size of each file, and the link speed of the responding node. Finally, a push message asks a node that sits behind a firewall, and so cannot accept incoming connections, to open the connection itself and upload the requested file.  Due to limited time, we were unable to implement this type of file propagation.
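Since we did not implement Gnutella, the following is only an illustrative sketch of how the five message types might be modeled. The descriptor byte values are those of the Gnutella 0.4 protocol; the class and field names are invented for this example, and only the fields mentioned above are included.

```python
from dataclasses import dataclass
from enum import Enum

class Descriptor(Enum):
    # Payload descriptor bytes used by Gnutella 0.4
    PING = 0x00
    PONG = 0x01
    PUSH = 0x40
    QUERY = 0x80
    QUERY_REPLY = 0x81

@dataclass
class Pong:
    # Information a node reports about itself in reply to a ping
    ip_address: str
    port: int
    files_shared: int

@dataclass
class Query:
    # A search request sent to other nodes
    search: str
    min_speed: int

# Which message type answers which request
REPLY_TO = {
    Descriptor.PING: Descriptor.PONG,
    Descriptor.QUERY: Descriptor.QUERY_REPLY,
}
```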


Local dropbox monitoring

When a user drops a file into their local dropbox, a message must be sent to every other computer on the network currently running the Earthworm client, telling those computers to come get the new file. Our monitoring script takes a snapshot of the dropbox directory contents every ten seconds and compares it to the previous snapshot. If a file is in the user's local dropbox that wasn't there before, or if a file that was in the user's local dropbox is no longer there, that change must be propagated to the other computers on the network. A GET request is sent to each of these computers, which then fetch the file for their own local dropboxes.
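The snapshot comparison can be sketched as below. This is a simplified illustration, not our actual script: the function names and the on_change callback are hypothetical, and the notification step is left to the caller. The max_polls argument exists only so the loop can be stopped in a test.

```python
import os
import time

def take_snapshot(directory):
    """Return the set of file names currently in the directory."""
    return {name for name in os.listdir(directory)
            if os.path.isfile(os.path.join(directory, name))}

def diff_snapshots(previous, current):
    """Return (added, removed) file names between two snapshots."""
    return current - previous, previous - current

def watch(directory, on_change, interval=10, max_polls=None):
    """Poll the directory every `interval` seconds and call
    on_change(added, removed) whenever the contents differ."""
    previous = take_snapshot(directory)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = take_snapshot(directory)
        added, removed = diff_snapshots(previous, current)
        if added or removed:
            on_change(added, removed)
        previous = current
        polls += 1
```

Comparing names alone catches new and deleted files but not edits to an existing file; detecting those needs extra per-file information such as a hash or modification time.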


Remote vs. local dropbox comparison

The purpose of this software is to propagate files like a worm would, meaning that all files in one computer’s shared directory are placed in the shared directories of all other computers running our software. To determine when files need to be transferred from one computer to another, we needed a way to compare the files on the local computer with the files on another computer on our network. A comprehensive comparison requires more information about each file than its name and date modified.  We needed a way to compare two files with the same name and determine which file should be transferred and which should be overwritten (we assumed that the newer file should always overwrite the older file). We used the Python md5 library to find the md5 hash of each file. An md5 hash is the output of a message-digest algorithm that characterizes a file's contents as a 32-digit hexadecimal number. This makes it easy to see whether two versions of a file are different. We wrote a piece of code that builds a Python dictionary with the names of the files in the shared directory as the keys and their md5 hashes and dates modified as the values. We also altered a method in the request handler so that a GET request without a specific filename returns this dictionary.
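A dictionary of this kind could be built along the following lines. This is a sketch under assumptions: it uses hashlib, the current standard-library home of md5, rather than the older md5 module, and the function names and the exact shape of the values are chosen for illustration.

```python
import hashlib
import os

def md5_of_file(path, chunk_size=65536):
    """Return the md5 hash of a file as 32 hexadecimal digits."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_index(directory):
    """Map each file name to its md5 hash and modification time."""
    index = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            index[name] = {
                "md5": md5_of_file(path),
                "modified": os.path.getmtime(path),  # seconds since the epoch
            }
    return index
```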

The overall process we use to view and compare files is as follows.  We randomly choose one computer that is on the network, and then compare the dictionaries of the two shared directories. If there is a file on the remote computer that is not on the local computer, we transfer it over the network via an HTTP request. If there is a file on the remote computer with the same name but a different md5 hash and a more recent date modified, we also download it over the network to the local directory. When run repeatedly on all machines, this process ensures that the shared directories on each computer running our software eventually contain the same files.
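The comparison rule can be written out as a short function. This is an illustrative sketch: it assumes each dictionary maps a file name to a record with "md5" and "modified" entries, which is one possible layout and not necessarily the one our code uses, and the names pick_peer and files_to_fetch are hypothetical.

```python
import random

def pick_peer(peers):
    """Choose one remote computer at random to compare against."""
    return random.choice(peers)

def files_to_fetch(local_index, remote_index):
    """Return the names of remote files that should be downloaded."""
    wanted = []
    for name, remote in remote_index.items():
        local = local_index.get(name)
        if local is None:
            # File exists only on the remote computer
            wanted.append(name)
        elif (remote["md5"] != local["md5"]
              and remote["modified"] > local["modified"]):
            # Same name, different contents, and the remote copy is newer
            wanted.append(name)
    return sorted(wanted)
```

A file whose remote copy differs but is older is left alone here; the other computer picks up the newer local copy when it runs the same comparison in the opposite direction.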


File sending/receiving

Network communication between machines is handled using HTTP. Each machine runs its own HTTP server on port 13370 (not the standard port 80). The HTTP server listens for various requests made by clients and responds to them appropriately. Click the link below to continue reading:

File sending/receiving protocol
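Independently of the protocol details on the linked page, a per-machine HTTP server of this kind can be sketched with Python's standard library. Everything here apart from port 13370 is an assumption made for illustration: the handler name, the JSON listing returned for a request without a filename (a simplified stand-in that lists names only), and the response headers.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import unquote

PORT = 13370  # the port each Earthworm machine listens on

def make_handler(shared_dir):
    """Build a request handler class that serves files from shared_dir."""
    class DropboxHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # basename() discards any directory part, so requests
            # cannot reach outside the shared directory
            name = os.path.basename(unquote(self.path))
            if not name:
                # No filename given: return a listing of the shared files
                listing = sorted(os.listdir(shared_dir))
                self._send(200, "application/json",
                           json.dumps(listing).encode())
                return
            path = os.path.join(shared_dir, name)
            if not os.path.isfile(path):
                self._send(404, "text/plain", b"not found")
                return
            with open(path, "rb") as f:
                self._send(200, "application/octet-stream", f.read())

        def _send(self, status, content_type, body):
            self.send_response(status)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the console quiet

    return DropboxHandler

def serve(shared_dir, port=PORT):
    """Serve the shared directory until the process is stopped."""
    server = ThreadingHTTPServer(("", port), make_handler(shared_dir))
    server.serve_forever()
```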


Earthworm was created for ENGR 3410 (Computer Architecture) at Olin College of Engineering in Needham, Massachusetts.


Team Earthworm is: Leslie Gerhat, Eric Hwang, Katarina Miller, Raghu Rangan, and Garrett Rodrigues (Class of 2010).