# Distributed File System in Rust for CCOM4017
This suite of programs handles file copying over TCP with a client/server model. It contains the following programs:

- `copy`
- `ls`
- `data_node`
- `meta_data`
`copy` and `ls` are clients that connect to the servers. `copy` sends file read and write requests to the `meta_data` server, which uses an SQLite database to keep track of which nodes are connected, as well as which files have been added. When a file is added, `meta_data` sends back the list of available `data_node` servers; `copy` then divides the file into one chunk per node and transfers each chunk over the wire 256 bytes at a time. `ls` simply prints out a list of the existing files on the `meta_data` server.
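As a rough sketch of that chunking step (the function names, signatures, and error handling here are illustrative, not the actual code in this repository), the split-and-stream logic could look something like this:

```rust
// Illustrative sketch only: split a file into one chunk per data_node and
// stream each chunk to its node 256 bytes at a time.
use std::fs;
use std::io::Write;
use std::net::TcpStream;

fn send_file(path: &str, nodes: &[String]) -> std::io::Result<()> {
    assert!(!nodes.is_empty(), "meta_data returned no data nodes");
    let bytes = fs::read(path)?;
    // One chunk per available node, rounded up so no bytes are dropped.
    let chunk_size = (bytes.len() + nodes.len() - 1) / nodes.len();
    for (chunk, node) in bytes.chunks(chunk_size.max(1)).zip(nodes) {
        let mut stream = TcpStream::connect(node.as_str())?;
        // Transfer this chunk over the socket 256 bytes at a time.
        for block in chunk.chunks(256) {
            stream.write_all(block)?;
        }
    }
    Ok(())
}
```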
The code uses `serde_json` to serialize and deserialize Rust structs to and from JSON. The clients and servers then listen for incoming streams of data and parse them as JSON. As well as exchanging metadata, this protocol also establishes the handshake used before transferring the raw file chunks.
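As an illustration, a message with the `p_type`/`json` shape shown in the `sm` example further below could be modeled and round-tripped like this (a minimal sketch, not the project's actual type definitions):

```rust
// Minimal sketch of serde_json round-tripping; the Request struct here is an
// assumption based on the {"p_type": ..., "json": ...} messages shown below.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct Request {
    p_type: String,
    json: Option<serde_json::Value>,
}

fn main() -> serde_json::Result<()> {
    let raw = r#"{"p_type":"ListFiles","json":null}"#;
    let req: Request = serde_json::from_str(raw)?; // parse an incoming message
    println!("{}", serde_json::to_string(&req)?);  // serialize it back out
    Ok(())
}
```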
`rusqlite` is used for managing the SQLite database. This allows SQL queries to be run from the Rust code and the database to be managed in a relatively type-safe way. Unit tests in `meta_data` provide coverage of these SQL operations against an in-memory database.
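A unit test of that kind might look roughly like the sketch below; the table and column names are assumptions chosen for illustration, not the project's actual schema.

```rust
// Illustrative rusqlite test against an in-memory SQLite database.
use rusqlite::Connection;

#[test]
fn insert_and_count_files() {
    let conn = Connection::open_in_memory().unwrap();
    conn.execute(
        "CREATE TABLE files (fname TEXT NOT NULL, fsize INTEGER NOT NULL)",
        [],
    )
    .unwrap();
    conn.execute(
        "INSERT INTO files (fname, fsize) VALUES (?1, ?2)",
        rusqlite::params!["pug.jpg", 21633],
    )
    .unwrap();
    let count: i64 = conn
        .query_row("SELECT COUNT(*) FROM files", [], |row| row.get(0))
        .unwrap();
    assert_eq!(count, 1);
}
```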
**WARNING:** If you're my professor, please do not generate a database with the default `createdb.py` provided in the skeleton dfs. I have included a custom version of the file in the root of the project. The reason is that I changed chunks to be stored as integers rather than strings, in order to preserve the ordering of the chunks when transferring.
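To make the ordering concern concrete, here is a tiny standalone illustration (not project code) of why integer chunk indices reassemble in the right order while string indices may not:

```rust
fn main() {
    // Chunk indices stored as strings sort lexicographically...
    let mut as_text = vec!["2", "10", "1", "3"];
    as_text.sort();
    assert_eq!(as_text, vec!["1", "10", "2", "3"]); // "10" lands before "2"

    // ...while integer indices sort numerically, the order the chunks
    // must be reassembled in.
    let mut as_int = vec![2, 10, 1, 3];
    as_int.sort();
    assert_eq!(as_int, vec![1, 2, 3, 10]);
}
```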
## Running
To run the `ls` client, provide an endpoint in the `ip:port` format. The `ip` can be "localhost". Consider invoking it as `./ls` to avoid a naming conflict with the GNU version of `ls`.
$ ./ls 127.0.0.1:6770
The `meta_data` server takes an optional port, but will default to 8000 if none is specified.
$ meta_data 6710
The data node takes two endpoints in the `ip:port` format and then an optional path. The first endpoint is the IP and port the node binds its TCP listener to, and also the address it reports to the `meta_data` server. The second endpoint is the `meta_data` server's IP and port. The optional base path will default to the working directory if none is provided.
$ data_node localhost:6771 127.0.0.1:8000 my_cool_data_node
`copy` takes its parameters in two different forms, depending on whether it is sending to or receiving from the server. To send a file, provide the path to the local file, then the destination endpoint in the `ip:port:filepath` format. The `data_node` will save the file relative to the base path provided to it.
$ copy some_path/pug.jpg localhost:6700:another_path/pug.jpg
To receive a file, simply invert the parameters:
$ copy localhost:6700:another_path/pug.jpg some_path/pug.jpg
## Misc Scripts
`shutdown_node` sends a JSON request with a provided port to shut down a `data_node`. This ensures that the node can terminate gracefully and unregister itself from the `meta_data` server. I was advised against using Unix signals, so I opted for this instead.
$ shutdown_node 6770
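For reference, the request it sends could be produced with something like the sketch below; the exact `p_type` value and message shape are assumptions rather than the real protocol.

```rust
// Hypothetical sketch: open a TCP connection to the node's port and send a
// JSON shutdown request. The "Shutdown" p_type is an assumed value.
use std::io::Write;
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    let port = std::env::args().nth(1).unwrap_or_else(|| "6770".into());
    let mut stream = TcpStream::connect(format!("localhost:{port}"))?;
    stream.write_all(br#"{"p_type":"Shutdown","json":null}"#)?;
    Ok(())
}
```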
`sm` simply sends a message to a provided port. It can be used to test and inspect the JSON messages; for instance, it can mimic the `ls` client:
$ sm '{"p_type":"ListFiles","json":null}' 8000
Connection to localhost 8000 port [tcp/*] succeeded!
{"paths":["pug.jpg 21633 bytes"]}%
`clean_db` simply recreates `dfs.db` with the custom Python script.
## Building
If you wish to compile the code, install Rust and Cargo. Then just run a build:
cargo build
If you wish to run a specific binary:
cargo run --bin copy
## Testing
cargo test --bin meta_data