How big can I build a Gluster cluster?

3 Answers

4 Votes
While it is clearly possible to have hundreds of bricks in the same volume, there are still reasons why it might be preferable to partition data into multiple volumes (using the same underlying servers and storage) instead.  Chief among these is the handling of directories in the distribute (DHT) translator.  Directories must exist on all bricks, so creating/deleting/renaming entire directories can be extremely expensive and vulnerable to weird failure scenarios if you're not also using replication (or have more failures than replicas).

Also, directory listings are "striped" across all of the bricks in a volume, so they can be fairly slow and resource-intensive.  The new directory-spread-count option should alleviate the worst symptoms, but it's not yet available in a release.  If your application has a high rate of create/list/rename/delete operations for either files or directories, a single high-brick-count volume is probably not going to perform as well as several lower-brick-count volumes.

1 Vote
Because we have no centralized data store, no metadata, and we operate at the file level, there are no theoretical limits to the size, speed, inode count, file count, number of nodes, number of bricks, number of clients, etc.

In reality the current limiting factor for a Gluster cluster is the brick count. Each brick requires 2 open ports on the Gluster server, effectively limiting the number of bricks to ~1000.

Gluster supports the XFS file system on each brick. With XFS, each brick can be as large as 9 million terabytes [1], limiting the total possible size of a single Gluster namespace to 6.75 x 10^9 TB (that's really big!).
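
As a rough check of that arithmetic (a sketch only; the constant names are made up for illustration, and the implied brick count is inferred from the quoted numbers rather than stated anywhere in this answer):

```python
# Per-brick XFS limit quoted above, in terabytes.
XFS_MAX_TB_PER_BRICK = 9_000_000

# Total namespace size quoted above, in terabytes (6.75 x 10^9).
QUOTED_TOTAL_TB = 6_750_000_000

# Number of bricks the quoted total implies at the per-brick limit.
implied_bricks = QUOTED_TOTAL_TB // XFS_MAX_TB_PER_BRICK
print(implied_bricks)  # 750

# Total at the ~1000-brick ceiling mentioned above, for comparison.
total_at_1000_bricks_tb = 1000 * XFS_MAX_TB_PER_BRICK
print(f"{total_at_1000_bricks_tb:.2e} TB")  # 9.00e+09 TB
```

So the quoted total corresponds to about 750 bricks at the XFS maximum; at a full ~1000 bricks the same multiplication gives 9 x 10^9 TB.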

As of May 2011 the cluster with the most nodes in production has ~250 servers participating in a single volume. The largest volume in production is 2.5PB after hardware RAID and 1.25PB usable after Gluster replication.

[1] http://scalability.org/?p=3192&cpage=1

NOTE: Just because you can build a huge volume doesn't mean you should. Think through the implications of having a multi-petabyte volume before you decide to build one.

1 Vote
If you neglect the overflow in df, a GlusterFS volume can be as large as 2^32 (the maximum number of subvolumes of the distribute translator) x 18 exabytes (the maximum XFS volume size [1]), which comes to 72 brontobytes (or 89,131,682,828,547,379,792,736,944,128 bytes).

GlusterFS is capable of supporting 2^128 (uuid) inodes.
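
The numbers above can be reproduced with a few lines of Python. This is only a sketch: it assumes "exabyte" means 2^60 bytes and "brontobyte" means 2^90 bytes (binary units), which is the reading that matches the byte count given, and the constant names are made up for illustration.

```python
# Assumption: binary units, i.e. 1 exabyte = 2**60 bytes, 1 brontobyte = 2**90 bytes.
EXABYTE = 2**60
BRONTOBYTE = 2**90

MAX_DHT_SUBVOLUMES = 2**32        # max subvolumes of the distribute translator
XFS_MAX_BYTES = 18 * EXABYTE      # max XFS volume size quoted above

max_volume_bytes = MAX_DHT_SUBVOLUMES * XFS_MAX_BYTES
print(f"{max_volume_bytes:,} bytes")
# 89,131,682,828,547,379,792,736,944,128 bytes

print(max_volume_bytes // BRONTOBYTE, "brontobytes")
# 72 brontobytes

# A UUID is 128 bits wide, which gives the inode count quoted above.
max_inodes = 2**128
print(f"{max_inodes:,} inodes")
```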

[1] http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide//tmp/en-US/html/ch02s04.html