
iyidgnaw/PyDFS

 
 


PyDFS

Simple fault-tolerant distributed file system like HDFS (and, of course, GFS). It consists of one Master (NameNode), multiple Minions (DataNodes), and a client for interaction. The master dumps its metadata/namespace when it receives SIGINT and reloads it the next time it starts. Data is replicated the way HDFS does it: the client sends a block to one minion, that minion forwards it to the next one, and so on. Reads work in a similar manner: the client contacts the first minion for a block, falls back to the second if that fails, and so on. Uses RPyC for RPC.
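The write and read paths described above can be illustrated with a minimal in-memory sketch. It does not use RPyC and is not the project's actual code: the `Minion` class, `write_block`, and `read_block` are hypothetical names chosen only to show the chained forwarding and the read fallback.

```python
class Minion:
    """In-memory stand-in for a DataNode (hypothetical; real minions are RPyC services)."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.blocks = {}

    def put(self, block_id, data, downstream):
        """Store the block locally, then forward it to the next minion in the chain."""
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.blocks[block_id] = data
        if downstream:
            next_minion, rest = downstream[0], downstream[1:]
            next_minion.put(block_id, data, rest)

    def get(self, block_id):
        """Return the stored block, or fail if this minion is down or lacks it."""
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.blocks[block_id]


def write_block(block_id, data, replicas):
    """Client side: send to the first replica only; it forwards to the rest."""
    replicas[0].put(block_id, data, replicas[1:])


def read_block(block_id, replicas):
    """Client side: try each replica in order until one returns the block."""
    for minion in replicas:
        try:
            return minion.get(block_id)
        except (ConnectionError, KeyError):
            continue  # this replica failed, try the next one
    raise IOError(f"no replica could serve block {block_id}")


if __name__ == "__main__":
    chain = [Minion("m1"), Minion("m2"), Minion("m3")]
    write_block("blk-1", b"hello", chain)
    chain[0].alive = False             # simulate a failed first minion
    print(read_block("blk-1", chain))  # prints b'hello'
```

The client talks to a single minion on write, so the cost of replication is spread along the chain rather than paid by the client.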

Requirements:

  • rpyc (Really! That's it.)

How to run:

  1. pip3 install -r requirements.txt
  2. Edit conf.py to set the block size, the replication factor, and the list of minions (minionid:host:port)
  3. Fire up the master and minions using python admin.py
  4. To store and retrieve a file:
$ python client.py put sourcefile.txt sometxt
$ python client.py get sometxt
Stop it using Ctrl + C so that it dumps the namespace.
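Dumping on Ctrl + C and reloading on the next start can be sketched as follows. This only illustrates the idea: the file name `namespace.json`, the JSON format, and the function names are assumptions, not taken from the project.

```python
import json
import os
import signal
import sys

NAMESPACE_FILE = "namespace.json"  # hypothetical path, not from the project


def load_namespace(path=NAMESPACE_FILE):
    """Reload a previously dumped namespace, or start empty on first run."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}


def dump_namespace(namespace, path=NAMESPACE_FILE):
    """Write to a temp file first so a crash mid-write cannot corrupt the dump."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(namespace, f)
    os.replace(tmp, path)


def install_sigint_handler(namespace, path=NAMESPACE_FILE):
    """Dump the namespace and exit when the process receives SIGINT (Ctrl + C)."""
    def handler(signum, frame):
        dump_namespace(namespace, path)
        sys.exit(0)
    signal.signal(signal.SIGINT, handler)
```

A master process would call `load_namespace()` at startup and `install_sigint_handler(namespace)` before it starts serving requests.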

TODO:

Priority:

  • Minion heartbeats / Block reports (Luke)
  • Add entry in namespace only after write succeeds
  • Admin interface: NODE_CREATE, NODE_KILL, GET, PUT, DELETE (Luke)
  • Delete/Add minion node (diyi)
  • Block integrity check (diyi)
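One possible shape for the block integrity check listed above is to store a checksum next to each block and recompute it on read. This is a sketch under assumptions: SHA-256 and the `store_block` / `verify_block` helpers are illustrative choices, not part of the project.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex digest used to detect corrupted blocks."""
    return hashlib.sha256(data).hexdigest()


def store_block(store, block_id, data):
    """Keep the block together with the checksum computed at write time."""
    store[block_id] = (data, checksum(data))


def verify_block(store, block_id):
    """Return True if the stored data still matches its recorded checksum."""
    data, expected = store[block_id]
    return checksum(data) == expected
```

A minion that finds a mismatch could report the block as corrupt, so the client falls back to the next replica.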

Optional:

  • Use a better algorithm for selecting the minions that receive a block (currently random)
  • Dump namespace periodically (check-pointing)
  • Use a proper data structure (tree-like, e.g. treedict) to store the namespace (currently a simple dict)
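As an illustration of the first optional item, one alternative to random placement is to pick the minions that currently hold the fewest blocks. This is only a sketch of one candidate policy: the `pick_minions` function and the `block_counts` mapping are hypothetical.

```python
import heapq


def pick_minions(block_counts, replication_factor):
    """Choose the least-loaded minions and record the new block on each.

    block_counts maps minion id -> number of blocks it currently stores.
    Ties are broken by minion id so the result is deterministic.
    """
    if replication_factor > len(block_counts):
        raise ValueError("not enough minions for the requested replication factor")
    chosen = heapq.nsmallest(
        replication_factor,
        block_counts,
        key=lambda minion: (block_counts[minion], minion),
    )
    for minion in chosen:
        block_counts[minion] += 1
    return chosen


if __name__ == "__main__":
    counts = {"m1": 5, "m2": 1, "m3": 3}
    print(pick_minions(counts, 2))  # prints ['m2', 'm3']
```

Counting blocks ignores block size and disk capacity, so a real policy would likely weigh those as well.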

Dev

To use the pre-commit hook:

  1. Copy pre-commit to .git/hooks and change the file permission if necessary
  2. Install Pylint (for python3) and use "google-pylint.rc" as your pylint config

Issue:

  1. ReferenceError in master sync.
