Set up the storage
GlusterFS is a scalable network filesystem. Using common off-the-shelf hardware, you can create large, distributed storage solutions for media streaming, data analysis, and other data- and bandwidth-intensive tasks. More information is available on the official website and the project GitHub page.
Within SecurityCloud, we use two volumes: conf and flow. The conf volume serves as shared storage for configuration files, and the flow volume stores the flow files. Make sure there is enough disk space for the flow files on each node.
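One way to check the available space up front is with df. A minimal sketch, where the directory and threshold are illustrative assumptions rather than values taken from install.conf:

```shell
# Sketch: warn when the filesystem that will hold the flow bricks has
# less than MIN_KB kilobytes free. DIR and MIN_KB are illustrative;
# on a real node DIR would be the brick location, e.g. /data.
DIR=/tmp
MIN_KB=1048576   # 1 GiB
AVAIL=$(df -Pk "$DIR" | awk 'NR==2 {print $4}')
if [ "$AVAIL" -lt "$MIN_KB" ]; then
    echo "WARNING: only ${AVAIL} KB free under $DIR"
else
    echo "OK: ${AVAIL} KB free under $DIR"
fi
```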
Configuration options in install.conf for GlusterFS are all mandatory and share the gfs_ prefix. The gfs_conf_brick option determines the path to the conf brick directory (a brick is the place where data are stored). The gfs_flow_primary_brick and gfs_flow_backup_brick options determine the paths to the primary and backup brick directories of the flow volume, respectively. The flow volume uses two bricks because data are stored redundantly on two nodes. The gfs_conf_mount and gfs_flow_mount options determine the paths to the mount points of the GlusterFS volumes.
In the example below, the paths are set according to the naming convention. This is not mandatory, but we recommend using the paths from the example configuration. If any of the directories do not exist, the installation script will create them.
gfs_conf_brick=/data/glusterfs/conf/brick
gfs_flow_primary_brick=/data/glusterfs/flow/brick1
gfs_flow_backup_brick=/data/glusterfs/flow/brick2
gfs_conf_mount=/data/conf
gfs_flow_mount=/data/flow
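If you prefer to prepare the directories yourself rather than let the installation script create them, the layout from the example can be sketched as follows; the scratch prefix is used here only so the commands are safe to try outside a storage node:

```shell
# Sketch: pre-create the brick and mount directories from the example
# install.conf. PREFIX is an illustrative scratch location; on a real
# node the directories would live directly under /data.
PREFIX="$(mktemp -d)"
mkdir -p "$PREFIX/data/glusterfs/conf/brick" \
         "$PREFIX/data/glusterfs/flow/brick1" \
         "$PREFIX/data/glusterfs/flow/brick2" \
         "$PREFIX/data/conf" \
         "$PREFIX/data/flow"
# The mount points must exist before the GlusterFS volumes are mounted.
ls "$PREFIX/data"
```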
The GlusterFS services have to be running on all the nodes.
#start and enable GlusterFS daemon on CentOS
$ systemctl start glusterd.service
$ systemctl enable glusterd.service
#start and enable GlusterFS daemon on Debian
$ systemctl start glusterfs-server
$ systemctl enable glusterfs-server
On an arbitrary node, run:
$ ./install.sh glusterfs
Now you can verify that all the actions were successful and GlusterFS is ready.
GlusterFS services should be running:
$ ps -C glusterd,glusterfs,glusterfsd
PID TTY TIME CMD
7596 ? 00:00:00 glusterd
8325 ? 00:00:00 glusterfsd
8550 ? 00:00:00 glusterfs
8777 ? 00:00:00 glusterfs
8843 ? 00:00:00 glusterfs
...
Connections between nodes should be established:
$ netstat -tavn | grep "2400[78]"
tcp 0 0 0.0.0.0:24007 0.0.0.0:* LISTEN
tcp 0 0 10.4.0.25:49144 10.4.0.41:24007 ESTABLISHED
tcp 0 0 127.0.0.1:24007 127.0.0.1:49069 ESTABLISHED
tcp 0 0 10.4.0.25:49142 10.4.0.25:24007 ESTABLISHED
tcp 0 0 10.4.0.25:49149 10.4.0.37:24007 ESTABLISHED
tcp 0 0 127.0.0.1:24007 127.0.0.1:49121 ESTABLISHED
tcp 0 0 10.4.0.25:24007 10.4.0.39:49143 ESTABLISHED
tcp 0 0 127.0.0.1:49121 127.0.0.1:24007 ESTABLISHED
...
All nodes should be present in the trusted pool in a connected state:
$ gluster pool list
UUID Hostname State
b6a46565-45c1-4b54-8611-950616cbc765 sub1.example.org Connected
9435070c-0f2c-40b9-be94-da91c4a4c0d3 sub2.example.org Connected
609e386e-ca6f-4a89-932f-0d70557bac12 sub3.example.org Connected
...
Check information about the volumes:
$ gluster volume info conf
Volume Name: conf
Type: Replicate
Volume ID: c37231e4-1e7b-48a7-86db-a3f0635bc6e8
Status: Started
Number of Bricks: 1 x 10 = 10
Transport-type: tcp
Bricks:
Brick1: sub1.example.org:/data/glusterfs/conf/brick
Brick2: sub2.example.org:/data/glusterfs/conf/brick
Brick3: sub3.example.org:/data/glusterfs/conf/brick
...
Options Reconfigured:
network.ping-timeout: 10
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: true
$ gluster volume info flow
Volume Name: flow
Type: Distributed-Replicate
Volume ID: 7c620b8c-8b09-4ada-8a4b-86fd2cc1e263
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: sub1.example.org:/data/glusterfs/flow/brick1
Brick2: sub2.example.org:/data/glusterfs/flow/brick2
Brick3: sub2.example.org:/data/glusterfs/flow/brick1
...
Options Reconfigured:
cluster.nufa: enable
network.ping-timeout: 10
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: true
Check the status of the volumes:
$ gluster volume status conf
Status of volume: conf
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick sub1.example.org:/data/glusterf
s/conf/brick 49152 0 Y 9370
Brick sub2.example.org:/data/glusterf
s/conf/brick 49152 0 Y 9005
Brick sub3.example.org:/data/glusterf
s/conf/brick 49152 0 Y 8964
...
Self-heal Daemon on sub1.example.org N/A N/A Y 9701
Self-heal Daemon on sub2.example.org N/A N/A Y 9242
Self-heal Daemon on sub3.example.org N/A N/A Y 9201
...
Task Status of Volume conf
------------------------------------------------------------------------------
There are no active volume tasks
$ gluster volume status flow
Status of volume: flow
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick sub1.example.org:/data/glusterf
s/flow/brick1 49153 0 Y 9660
Brick sub2.example.org:/data/glusterf
s/flow/brick2 49153 0 Y 9201
Brick sub2.example.org:/data/glusterf
s/flow/brick1 49154 0 Y 9220
...
Self-heal Daemon on sub1.example.org N/A N/A Y 9701
Self-heal Daemon on sub3.example.org N/A N/A Y 9201
Self-heal Daemon on sub2.example.org N/A N/A Y 9242
...
Task Status of Volume flow
------------------------------------------------------------------------------
There are no active volume tasks
Check if the volumes are mounted on all the nodes:
$ mount | grep glusterfs
localhost:/conf on /data/conf type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
localhost:/flow on /data/flow type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
And finally, try to write some data to the volumes. You should be able to access the data from all the nodes:
$ dd if=/dev/urandom of=/data/conf/test.bin bs=4M count=1
$ dd if=/dev/urandom of=/data/flow/test.bin bs=4M count=1
$ ls -l /data/conf/test.bin /data/flow/test.bin
-rw-r--r-- 1 root root 4194304 Jul 28 12:36 /data/conf/test.bin
-rw-r--r-- 1 root root 4194304 Jul 28 12:37 /data/flow/test.bin
The SecurityCloud project is supported by the Technology Agency of the Czech Republic under No. TA04010062, Technology for processing and analysis of network data in big data concept.
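For repeated health checks, the trusted-pool verification above is easy to script. A minimal sketch that counts peers not in the Connected state; the captured sample output stands in for a live gluster pool list call, which is assumed here only for illustration:

```shell
# Sketch: count peers whose state is not "Connected".
# On a real node the output would come from: POOL="$(gluster pool list)".
POOL='UUID                                    Hostname          State
b6a46565-45c1-4b54-8611-950616cbc765    sub1.example.org  Connected
9435070c-0f2c-40b9-be94-da91c4a4c0d3    sub2.example.org  Connected
609e386e-ca6f-4a89-932f-0d70557bac12    sub3.example.org  Disconnected'
# Skip the header line, then count rows whose last field is not "Connected".
BAD=$(printf '%s\n' "$POOL" | awk 'NR>1 && $NF != "Connected" {n++} END {print n+0}')
echo "disconnected peers: $BAD"   # prints: disconnected peers: 1
```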