Data Lake for Enterprises

Serving layer - data delivery and exports

The Lambda Architecture also emphasizes how critical it is to serve, or deliver, data to the consuming application. Data can be delivered between systems in multiple ways, and one of the most common is via services. In the context of a Data Lake, these are often called Data Services, since their primary job is to deliver data.

One of the other ways to deliver data is via exports. The data in its final form can be exported as messages, files, data dumps, and so on for other systems to consume.

The primary focus when serving data is to deliver it in the desired form. This form can be enforced as a data contract, whether the data is delivered by services or by exports. During delivery, it is also very important to merge the batch data with the data from near real-time processing, because both streams hold key information from an organizational domain perspective. The data serving/delivery layer must ensure that the data is consistent and adheres to the contract agreed with the consuming application.
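The merge of batch and near real-time data described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that both layers expose a view of order counts keyed by customer ID, and that the real-time view holds only the increments seen since the last batch run. The names `merge_views`, `batch_view`, and `realtime_view` are hypothetical and do not come from any particular framework.

```python
from typing import Dict


def merge_views(batch_view: Dict[str, int], realtime_view: Dict[str, int]) -> Dict[str, int]:
    """Combine batch totals with near real-time increments (hypothetical helper)."""
    merged = dict(batch_view)  # start from the precomputed batch totals
    for key, delta in realtime_view.items():
        # Add increments that arrived after the last batch run;
        # keys seen only by the real-time layer start from zero.
        merged[key] = merged.get(key, 0) + delta
    return merged


# Illustrative data only: order counts per customer.
batch_view = {"cust-1": 10, "cust-2": 4}
realtime_view = {"cust-2": 1, "cust-3": 2}

print(merge_views(batch_view, realtime_view))
```

Summing works here only because the assumed metric is additive. Other kinds of data, such as latest-value records, would need a different merge rule, for example preferring the record with the newer timestamp.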

Overall, high-level specifications for the data serving/delivery layer can be summarized as follows:

  • It must support multiple mechanisms to serve data to the consuming application
  • For every mechanism supported for serving the data, there should be adherence to a contract in agreement with the consuming application
  • It must support merged views of both batch-processed and near real-time processed data
  • It must be scalable and responsive to the consuming application
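As a rough sketch of the first two specifications, the following shows one contract applied to two delivery mechanisms: a service-style JSON payload and a CSV export. The contract format (field name mapped to a Python type), the field names, and the functions `conforms`, `validate`, `serve_as_json`, and `export_as_csv` are all assumptions made for illustration; a real Data Lake would more likely use a schema technology agreed with the consuming application.

```python
import csv
import io
import json
from typing import Any, Dict, List

# Hypothetical data contract: field name -> required Python type.
CONTRACT = {"customer_id": str, "order_count": int}


def conforms(record: Dict[str, Any]) -> bool:
    """True if the record has exactly the contracted fields with the contracted types."""
    return record.keys() == CONTRACT.keys() and all(
        isinstance(record[name], expected) for name, expected in CONTRACT.items()
    )


def validate(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Reject the whole delivery if any record violates the contract."""
    for record in records:
        if not conforms(record):
            raise ValueError(f"record violates contract: {record!r}")
    return records


def serve_as_json(records: List[Dict[str, Any]]) -> str:
    """Service-style delivery: a JSON payload."""
    return json.dumps(validate(records))


def export_as_csv(records: List[Dict[str, Any]]) -> str:
    """Export-style delivery: CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(CONTRACT), lineterminator="\n")
    writer.writeheader()
    writer.writerows(validate(records))
    return buffer.getvalue()


records = [{"customer_id": "cust-1", "order_count": 10}]
print(serve_as_json(records))
print(export_as_csv(records))
```

Routing every mechanism through the same `validate` step is one way to keep services and exports consistent with each other; it is a design choice for this sketch, not something the Lambda Architecture prescribes.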

In addition to its key responsibility of serving data out of the Data Lake, this layer may optionally merge data for enrichment.

While these are primarily specifications of the Lambda Architecture layers, there are other layers too, such as the data acquisition, messaging, and data ingestion layers, that feed data into the Lambda Architecture for processing. We will discuss these later in this chapter.