RADOS Gateway (RGW)
Ceph natively manages objects, but it is crucial not to confuse this with other uses of the name, especially in relation to object storage in the vein of OpenStack Swift or Amazon's S3 service. The Ceph RGW service can be used to provide object storage compatible with both Swift and S3. Note that when used, the Ceph RGW service utilizes one or more dedicated pools (see Chapter 2, Ceph Components and Services) and cannot be used to access RBD volumes or other types of data that may live within your cluster in their own pools. The service is provided RESTfully over a familiar HTTP/HTTPS interface.
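As a quick illustration of the S3 flavor of this interface, one can provision a user with S3 credentials via the radosgw-admin utility; the uid and display name below are arbitrary examples, not required values.

    # Create an RGW user; the output includes generated S3
    # access and secret keys for use by S3-compatible clients.
    radosgw-admin user create --uid=cmsuser --display-name="CMS user"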
The Ceph RGW service can reside on the same servers as the OSDs and their devices in a converged architecture, but it is more common to dedicate servers or even virtual machines to RGW service. Environments with very light usage may co-locate RGW with MON daemons on physical Ceph servers, but there are pitfalls, and this must be done carefully, perhaps even using containers and cgroups. Small, lightly used, or proof-of-concept (PoC) installations may choose a virtual machine for ease of provisioning and to contain costs and space. Larger, production-class installations often provision RGW on bare metal servers for performance, to limit dependencies, and to avoid cascading failures. The author of this chapter has run, but does not recommend, a combination of dedicated RGW servers and ones co-located with Ceph MONs on modest servers.
Typically, one uses HAProxy or another load balancer solution, perhaps in conjunction with keepalived, to achieve balanced and highly available service across multiple RGW instances. The number of RGW servers can be scaled up and down according to workload requirements, independently of other Ceph resources including OSDs and MONs. This flexibility is one of Ceph's many salient advantages over traditional storage solutions, including appliances. Ceph can readily utilize server resources that you already know how to manage, without one-off differences in chassis management, support contracts, or component sparing. It is even straightforward to migrate an entire Ceph cluster from one operating system and server vendor to another without users noticing. Ceph also allows one to architect and expand for changing requirements of usable bytes, IOPS, and workload mix without having to over-provision one component in order to scale another.
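As a sketch of what such a frontend might look like, the following minimal HAProxy configuration round-robins requests across three hypothetical RGW instances; the hostnames and health-check settings are illustrative assumptions, and 7480 is Civetweb's default port.

    # Minimal HAProxy sketch: balance HTTP across three hypothetical
    # RGW instances. Hostnames are placeholders for your own servers.
    frontend rgw_frontend
        mode http
        bind *:80
        default_backend rgw_backend

    backend rgw_backend
        mode http
        balance roundrobin
        server rgw1 rgw1.example.com:7480 check
        server rgw2 rgw2.example.com:7480 check
        server rgw3 rgw3.example.com:7480 check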
In releases of Ceph prior to Hammer, the RGW service was provided by a discrete daemon that interacted with the cluster, coupled with the traditional Apache httpd as the client-facing frontend. This was subtle and fiddly to manage; with the Ceph Hammer release, the service was reworked into a single radosgw application with an embedded Civetweb web server.
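Configuration of the embedded Civetweb frontend lives in ceph.conf. A minimal sketch follows; the instance name gateway1 is a placeholder, the exact section naming varies somewhat across releases, and 7480 is Civetweb's default port.

    # ceph.conf: gateway1 is a placeholder instance name.
    [client.rgw.gateway1]
    rgw_frontends = "civetweb port=7480"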
Some large Ceph installations are run purely for the RGW service, without any RBD or other clients. Ceph's flexibility is among its strengths.
Object storage does not offer the low operation latency and predictable performance that block storage boasts, but capacity can scale up or down effortlessly.
A potential use case is a collection of web servers using a Content Management System (CMS) to store an unstructured mix of HTML, JavaScript, image, and other content that may grow considerably over time.
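To make this concrete, here is a minimal sketch of such a CMS storing an asset through RGW's S3-compatible API, using the widely used boto3 Python library; the endpoint URL, credentials, bucket name, and file name are all placeholder assumptions.

    # Minimal boto3 sketch; endpoint, keys, and names are placeholders.
    import boto3

    s3 = boto3.client(
        's3',
        endpoint_url='http://rgw.example.com:7480',  # RGW or load balancer
        aws_access_key_id='ACCESS_KEY',
        aws_secret_access_key='SECRET_KEY',
    )

    # Create a bucket for CMS assets, then upload one object into it.
    s3.create_bucket(Bucket='cms-assets')
    with open('logo.png', 'rb') as f:
        s3.put_object(Bucket='cms-assets', Key='images/logo.png', Body=f)

Because the API is S3-compatible, existing S3 client libraries and tooling generally need only an endpoint override and RGW-issued credentials to work against a Ceph cluster.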