As described in a previous article, Gluster is a good way to provide centralized storage for a load balanced cluster on Storm.
Although Gluster itself falls into our best effort support category, it really isn't that hard to set up, and from my experience, seems to work pretty well.
Just a note that most of these instructions are based on the installation and administration documentation. However, I've included the commands I ran as a quick reference.
The point of this article isn't the actual load balanced setup, so the end configuration will be:
- Gluster version will be 3.2
- Two Gluster nodes, set up in a "RAID 1" configuration
- A single client
- Everything connected via private networking
- All servers running core managed CentOS 6.2 images on 1GB instances
Although I will only be detailing the setup of one client, the process is identical for additional clients. Also keep in mind that if you use cPanel boxes as the clients (as in trying to load balance them), you need to make sure all the accounts are set up consistently across the cluster. Currently cPanel isn't really cluster friendly. I have talked to them, though, and that is something they are working on releasing, probably in the 12.x branch, whenever that comes out.
Anyways, let the fun begin.
So the first thing to do is kick (provision) the first Gluster node. In this case, I will refer to it as gl1.
Once that is done, we need to get Gluster installed:
[root@gl1 ~]# mkdir gluster
[root@gl1 ~]# cd gluster/
[root@gl1 ~/gluster]# wget http://download.gluster.com/pub/gluster/glusterfs/LATEST/CentOS/6/glusterfs-fuse-3.2.5-2.el6.x86_64.rpm
[root@gl1 ~/gluster]# wget http://download.gluster.com/pub/gluster/glusterfs/LATEST/CentOS/6/glusterfs-core-3.2.5-2.el6.x86_64.rpm
[root@gl1 ~/gluster]# yum install glusterfs-core-3.2.5-2.el6.x86_64.rpm
[root@gl1 ~/gluster]# yum install glusterfs-fuse-3.2.5-2.el6.x86_64.rpm
Now let's start it up and make sure it starts up on reboot:
[root@gl1 ~/gluster]# chkconfig glusterd on
[root@gl1 ~/gluster]# service glusterd start
One thing that needs to happen is to make sure the proper ports are open. From the docs:
Ensure that TCP ports 111, 24007, 24008, 24009 (24009 + number of bricks across all volumes) are open on all Gluster servers. If you will be using NFS, open additional ports 38465 to 38467.
So in our case, with two bricks in total, the last port is 24009 + 2 = 24011, which means we need to open up 111 and 24007 through 24011.
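The port arithmetic from the docs can be sketched as a small shell function. The function name is made up for this example, and the iptables rules it prints are just one assumed way of opening the ports on a CentOS 6 box. It only prints the rules, so you can review them before running anything:

```shell
#!/bin/sh
# Hypothetical helper: print iptables rules for a Gluster server.
# The upper port follows the rule quoted from the docs:
# 24009 + number of bricks across all volumes.
gluster_port_rules() {
    bricks=$1
    last_port=$((24009 + bricks))
    echo "iptables -I INPUT -p tcp --dport 111 -j ACCEPT"
    echo "iptables -I INPUT -p tcp --dport 24007:${last_port} -j ACCEPT"
}

# Two bricks in this article: one on gl1 and one on gl2.
gluster_port_rules 2
```

If the printed rules look right, run them on each Gluster server. You will probably also want to save them (for example with service iptables save) so they survive a reboot.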
Once that is done, clone the server (which I will call gl2) and enable the private networking.
When this is done, add the appropriate host or DNS entries so that the servers reference each other via the private network. In my case, my hosts files would have these entries:
10.38.4.134 gl1.rrfaae.com gl1
10.38.4.135 gl2.rrfaae.com gl2
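Since the same entries have to go on every server (and later on every client), here is a minimal sketch of a helper that appends an entry only if the short name isn't already in the file. The function name and argument order are made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: add "IP FQDN SHORTNAME" to a hosts file
# unless the short name already appears in it as a whole word.
add_host_entry() {
    hosts_file=$1
    ip=$2
    fqdn=$3
    short=$4
    if ! grep -qwF -- "$short" "$hosts_file" 2>/dev/null; then
        printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
    fi
}

# Example usage (run as root on each server):
#   add_host_entry /etc/hosts 10.38.4.134 gl1.rrfaae.com gl1
#   add_host_entry /etc/hosts 10.38.4.135 gl2.rrfaae.com gl2
```

Running it twice is harmless, which is handy when you are cloning servers and can't remember which ones you have already touched.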
Once that is done, we need to establish our volumes. To do this, the first thing we need to do is create the location where we would like the data to actually reside:
[root@gl1 ~]# mkdir /storage1
[root@gl2 ~]# mkdir /storage2
Basically, when the client talks to the cluster, the data is stored in this location on the servers.
So now, let's actually make sure the nodes are talking and set up the volume:
[root@gl1 ~]# gluster peer probe gl2
[root@gl1 ~]# gluster peer status
Number of Peers: 1
Hostname: gl2
Uuid: 8010a6b2-324c-4917-846d-734b451f31de
State: Peer in Cluster (Connected)
[root@gl1 ~]# gluster volume create raid-volume replica 2 gl1:/storage1 gl2:/storage2
Creation of volume raid-volume has been successful. Please start the volume to access data.
[root@gl1 ~]# gluster volume start raid-volume
Starting volume raid-volume has been successful
So what happened here is that we created a volume called "raid-volume" (creative, isn't it?) across the servers gl1 and gl2, with the respective export directories specified. The replica 2 option is what gives the "RAID 1" behavior: each file is kept on both bricks. Without it, Gluster creates a distributed volume, which spreads files across the bricks instead of mirroring them. Once that was done, we started the volume.
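Assuming the "RAID 1" layout is meant to be a Gluster replicated volume, the create command takes a replica count equal to the number of bricks being mirrored. Here is a small sketch that builds the command from a volume name and a list of bricks. The function name is made up, and it only prints the command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: print a "gluster volume create" command that
# mirrors data across every brick given (replica count = brick count).
mirror_create_cmd() {
    volume=$1
    shift
    # After the shift, $# is the number of bricks and $* is the brick list.
    echo "gluster volume create $volume replica $# $*"
}

mirror_create_cmd raid-volume gl1:/storage1 gl2:/storage2
```

For the two nodes in this article it prints a command with replica 2, so every file ends up on both gl1 and gl2.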
So now, create a client server (or servers; the process is the same for multiple clients, such as in a load balanced situation).
Once we do that, we need to install the native client (and of course set up the hosts file properly. You did that, right? Good!). Fortunately, it's just like installing Gluster on the actual servers: install the same glusterfs-core and glusterfs-fuse RPMs.
The native client uses FUSE (Filesystem in Userspace), so check that the fuse module is loaded, then create a mount point:
[root@glclient ~]# dmesg | grep -i fuse
fuse init (API version 7.13)
[root@glclient ~]# mkdir /mnt/gluster
Make sure you add modprobe fuse to /etc/rc.d/rc.local so that fuse gets loaded again when the server reboots. That way the Gluster mount can be automounted.
So this is what I added to fstab (to allow for automounting on boot):
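If the dmesg output has already scrolled away, another way to check is /proc/filesystems, which lists fuse once the module is loaded. A minimal sketch, with a made-up function name (the optional argument only exists so the check can be pointed at a different file):

```shell
#!/bin/sh
# Hypothetical helper: succeed if "fuse" is listed as a registered
# filesystem. -w keeps fuseblk and fusectl from counting as a match.
fuse_registered() {
    fs_list=${1:-/proc/filesystems}
    grep -qw fuse "$fs_list" 2>/dev/null
}

if fuse_registered; then
    echo "fuse is loaded"
else
    echo "fuse is not loaded; try: modprobe fuse"
fi
```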
gl2:/raid-volume /mnt/gluster glusterfs defaults,_netdev 0 0
Note: You only need to reference one of the nodes.
[root@glclient ~/gluster]# mount gl2:/raid-volume
[root@glclient ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/vda3              72G  1.8G   67G   3% /
tmpfs                 427M     0  427M   0% /dev/shm
/dev/vda1              99M   45M   50M  48% /boot
gl2:/raid-volume       72G  1.8G   67G   3% /mnt/gluster
So now, let's write something to the volume, and verify it stays there:
[root@glclient ~]# touch /mnt/gluster/test.txt
[root@glclient ~]# ls /mnt/gluster/
test.txt
There it be.
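Since the data physically lives in the export directories on the servers, you can also confirm the file landed on the bricks themselves. Here is a rough sketch of a check you could run on each Gluster node. The function name is made up, and it assumes the volume is mirroring, so the file should show up on both nodes:

```shell
#!/bin/sh
# Hypothetical helper: report whether a file exists in a brick directory.
# Returns 0 if found and 1 if missing.
brick_has_file() {
    brick_dir=$1
    name=$2
    if [ -e "$brick_dir/$name" ]; then
        echo "found $name in $brick_dir"
    else
        echo "missing $name in $brick_dir"
        return 1
    fi
}

# Example usage:
#   on gl1: brick_has_file /storage1 test.txt
#   on gl2: brick_has_file /storage2 test.txt
```

Only look, don't touch: changes should always go through the client mount, not directly into the export directories.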
Hopefully this is something that works for you. Don't hesitate to ask questions in the comments field below.