## Distributed File System in Rust for CCOM4017
This suite of programs copies files over TCP using a client/server model.
It contains the following programs:

- `copy`
- `ls`
- `data_node`
- `meta_data`

`copy` and `ls` are clients that connect to the servers. `copy` sends file read and write requests
to the `meta_data` server, which uses a sqlite3 database to keep track of which nodes are connected
and which files have been added. When a file is added, `meta_data` replies with the list of available
`data_node` servers. `copy` then divides the file by the number of nodes and transfers
each chunk 256 bytes at a time. `ls` simply prints the list of files registered on the
`meta_data` server.

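
The split-and-send step can be sketched roughly as follows. This is a minimal illustration, not the code in `copy`: the helper names `chunk_ranges` and `send_chunk` are hypothetical, the way leftover bytes are spread over the first chunks is an assumption, and an in-memory buffer stands in for the TCP stream to a `data_node`.

```rust
use std::io::{self, Read, Write};

/// Bytes per write, matching the 256-byte figure mentioned above.
const BUF_SIZE: usize = 256;

/// Hypothetical helper: split `file_len` bytes into one contiguous
/// `(offset, length)` range per node. Assumption: leftover bytes go to
/// the first ranges, one byte each.
fn chunk_ranges(file_len: u64, nodes: u64) -> Vec<(u64, u64)> {
    if nodes == 0 {
        return Vec::new();
    }
    let base = file_len / nodes;
    let extra = file_len % nodes;
    let mut offset = 0;
    (0..nodes)
        .map(|i| {
            let len = base + u64::from(i < extra);
            let range = (offset, len);
            offset += len;
            range
        })
        .collect()
}

/// Hypothetical helper: copy everything from `src` to `dst`, writing at
/// most BUF_SIZE bytes at a time. Returns the number of bytes copied.
fn send_chunk<R: Read, W: Write>(mut src: R, mut dst: W) -> io::Result<u64> {
    let mut buf = [0u8; BUF_SIZE];
    let mut total = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // end of this chunk
        }
        dst.write_all(&buf[..n])?;
        total += n as u64;
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // 21633 bytes is the size of pug.jpg in the `sm` example; 3 nodes is arbitrary.
    for (i, (offset, len)) in chunk_ranges(21633, 3).into_iter().enumerate() {
        println!("chunk {}: offset {}, {} bytes", i, offset, len);
    }
    // A Vec stands in for the socket; a real client would write to a TcpStream.
    let chunk = vec![0u8; 1000];
    let mut sink = Vec::new();
    let sent = send_chunk(&chunk[..], &mut sink)?;
    println!("sent {} bytes", sent);
    Ok(())
}
```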
The code uses `serde_json` to serialize and deserialize Rust structs to and from JSON. The clients and
servers listen for incoming streams of data and parse them as JSON. Besides exchanging
metadata, this protocol also performs the handshake that precedes the transfer of the raw file chunks.

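
As a rough illustration of the message envelope, the `ListFiles` request shown in the `sm` example further down carries a `p_type` tag and a `json` payload field. The sketch below rebuilds that one message by hand with only the standard library so that it runs without external crates; the project itself produces its messages through `serde_json`, and the `PacketType` enum and `empty_request` function are names assumed here for illustration.

```rust
/// Request kinds. Only `ListFiles` appears in this README; the enum name
/// is an assumption made for this sketch.
#[derive(Debug, Clone, Copy)]
enum PacketType {
    ListFiles,
}

impl PacketType {
    /// The tag written into the `p_type` field.
    fn as_str(self) -> &'static str {
        match self {
            PacketType::ListFiles => "ListFiles",
        }
    }
}

/// Build an envelope whose payload is empty (`null`), formatted by hand.
fn empty_request(p_type: PacketType) -> String {
    format!(r#"{{"p_type":"{}","json":null}}"#, p_type.as_str())
}

fn main() {
    // Prints the same request body that the `sm` example sends.
    println!("{}", empty_request(PacketType::ListFiles));
}
```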
`rusqlite` is used for managing the SQLite database. This allows SQL queries to be performed from
the Rust code and the database to be managed in a relatively type-safe way. Unit tests in `meta_data`
cover these SQL operations against an in-memory version of the database.

### WARNING:
If you're my professor, please do not generate a database with the default `createdb.py`
provided in the skeleton dfs. I have included a custom version of the file in the root of the project.
I changed the chunk identifiers from strings to integers so that the chunks can be ordered
when transferring.

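
To illustrate why integer chunk identifiers matter, the sketch below sorts the same identifiers once as text and once as integers. The identifiers and helper names are made up for the example; it only shows the general difference between lexicographic and numeric ordering, not the actual schema.

```rust
/// Sort chunk identifiers stored as text (lexicographic order).
fn order_as_text(mut ids: Vec<&str>) -> Vec<&str> {
    ids.sort();
    ids
}

/// Sort chunk identifiers stored as integers (numeric order).
fn order_as_int(mut ids: Vec<u32>) -> Vec<u32> {
    ids.sort();
    ids
}

fn main() {
    // As text, "10" and "11" sort before "2", so chunks would be reassembled out of order.
    println!("{:?}", order_as_text(vec!["10", "2", "1", "11"]));
    // As integers, the chunks come back in the order they were written.
    println!("{:?}", order_as_int(vec![10, 2, 1, 11]));
}
```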
##### Running
To run `ls`, provide an endpoint in the _`ip:port`_ format. _`ip`_ can be _"localhost"_. Consider
invoking it as `./ls` to avoid a naming conflict with the GNU version of `ls`.

```$ ./ls 127.0.0.1:6770```

The `meta_data` server takes an optional port and defaults to `8000` if none is specified.

```$ meta_data 6710```

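
The default-port behaviour could look roughly like this. `port_from_arg` is a hypothetical helper, and falling back to `8000` on an unparsable argument (rather than reporting an error) is an assumption of this sketch.

```rust
/// Hypothetical helper: use the given argument as the port, or fall back
/// to 8000 when it is missing or not a valid port number.
fn port_from_arg(arg: Option<&str>) -> u16 {
    arg.and_then(|s| s.parse().ok()).unwrap_or(8000)
}

fn main() {
    // The first command-line argument, if any, is the port.
    let arg = std::env::args().nth(1);
    println!("meta_data would listen on port {}", port_from_arg(arg.as_deref()));
}
```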
The data node takes two endpoints in the _`ip:port`_ format and then an optional base path. The first endpoint
is the node's own ip and port, used both for binding to a TCP port and for registering itself with the `meta_data` server.
The second endpoint is the `meta_data` server's ip and port. The base path defaults to the
working directory if none is provided.

```$ data_node localhost:6771 127.0.0.1:8000 my_cool_data_node```

`copy` takes its two parameters in a different order depending on whether it is sending to or receiving
from the server. To send a file, provide the path to the local file, then the endpoint with the remote file path
in the _`ip:port:filepath`_ format. The `data_node` will save the file relative to the base path
provided to it.

```$ copy some_path/pug.jpg localhost:6700:another_path/pug.jpg```

To receive a file, simply invert the parameters:

```$ copy localhost:6700:another_path/pug.jpg some_path/pug.jpg```

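
Telling the two invocations apart comes down to recognizing which argument has the remote form. A minimal sketch, assuming that a remote argument always has the `ip:port:filepath` shape and that local paths never do; `RemoteFile`, `Direction`, `parse_remote` and `direction` are illustrative names, not the ones used in `copy`.

```rust
/// A remote file reference in the `ip:port:filepath` form.
#[derive(Debug, PartialEq)]
struct RemoteFile {
    host: String,
    port: u16,
    path: String,
}

/// Which way the file travels.
#[derive(Debug, PartialEq)]
enum Direction {
    Upload,
    Download,
}

/// Hypothetical parser: returns None unless `arg` looks like `ip:port:filepath`.
fn parse_remote(arg: &str) -> Option<RemoteFile> {
    // Split on the first two colons only, so the file path may itself contain ':'.
    let mut parts = arg.splitn(3, ':');
    let host = parts.next()?;
    let port: u16 = parts.next()?.parse().ok()?;
    let path = parts.next()?;
    if host.is_empty() || path.is_empty() {
        return None;
    }
    Some(RemoteFile {
        host: host.to_string(),
        port,
        path: path.to_string(),
    })
}

/// Decide the direction from the order of the two arguments.
fn direction(first: &str, second: &str) -> Option<Direction> {
    match (parse_remote(first), parse_remote(second)) {
        (None, Some(_)) => Some(Direction::Upload),
        (Some(_), None) => Some(Direction::Download),
        _ => None, // both local or both remote: not a valid invocation
    }
}

fn main() {
    let args = ["some_path/pug.jpg", "localhost:6700:another_path/pug.jpg"];
    println!("{:?}", direction(args[0], args[1]));
    println!("{:?}", parse_remote(args[1]));
}
```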
##### Misc Scripts
`shutdown_node` sends a JSON request to the provided port to shut down a `data_node`. This ensures
that the node can terminate gracefully and unregister itself from the `meta_data` server. I was
advised against using Unix signals, so I opted for this instead.

```$ shutdown_node 6770```

`sm` sends a message (_send message_) to a provided port. It can be used to test and inspect the JSON messages.
For instance, it can mimic `ls`:

```
$ sm '{"p_type":"ListFiles","json":null}' 8000
Connection to localhost 8000 port [tcp/*] succeeded!
{"paths":["pug.jpg 21633 bytes"]}%
```

`clean_db` recreates `dfs.db` using the custom Python script.

##### Building
If you wish to compile the code, install Rust and Cargo:

[link](https://www.rust-lang.org/en-US/install.html)

Then run the build:

```cargo build```

To run a specific program:

```cargo run --bin copy```

##### Testing
`cargo test --bin meta_data`