Aggregator Architecture

Deploy Vector in your clusters to receive data from all your systems.

Overview

The aggregator architecture deploys Vector as an aggregator on dedicated nodes for remote processing. The aggregators ingest data from one or more upstream agents or upstream systems:

[Diagram: Aggregator]

We recommend this architecture to most Vector users for its high availability and easy setup.

When to Use this Architecture

We recommend this architecture for environments that require high durability and high availability (most environments). This architecture is easy to set up and slot into complex environments without changing agents. It is exceptionally well suited for enterprises and large organizations.

Going to Production

Architecting

  • Deploy multiple aggregators within each network boundary (i.e., each cluster or VPC).
  • Use DNS or service discovery to route agent traffic to your aggregators.
  • Use HTTP-based protocols when possible.
  • Use the vector source and sink for inter-Vector communication.
  • Shift the responsibility of data processing and durability to your aggregators.
  • Configure your agents to be simple data forwarders.
See the architecting document for more detail.
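
As a rough illustration of the points above, an agent configured as a simple forwarder might look like the following YAML sketch. The component names (app_logs, to_aggregators), the log path, the hostname, and the port are assumptions made for illustration, not values from this document. The hostname stands in for whatever DNS or service discovery name resolves to your aggregators.

```yaml
# agent.yaml: a minimal sketch of an agent acting as a simple forwarder.
# All names, paths, and addresses below are hypothetical.
sources:
  app_logs:
    type: file
    include:
      - /var/log/app/*.log        # assumed log location

sinks:
  to_aggregators:
    type: vector                  # vector sink for inter-Vector communication
    inputs:
      - app_logs
    # Assumed DNS name that resolves to your aggregators
    # (or to a load balancer in front of them).
    address: vector-aggregator.example.internal:6000
```

The agent does no parsing or enrichment here. In this sketch, that work is left to the aggregators, which keeps the agent configuration small and unlikely to need changes.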

High Availability

  • Deploy your aggregators across multiple nodes and availability zones.
  • Enable end-to-end acknowledgements for all sources.
  • Use disk buffers for your system of record sink.
  • Use waterfall buffers for your system of analysis sink.
  • Route failed data to your system of record.
See the high availability document for more detail.
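
The buffering and acknowledgement recommendations above could be sketched on the aggregator side roughly as follows. This is an illustrative fragment, not a reference configuration: the sink types (aws_s3 as the system of record, elasticsearch as the system of analysis), bucket, region, endpoint, and buffer sizes are all assumptions. It also assumes a Vector version in which end-to-end acknowledgements are enabled on sinks and a buffer can be given as a list of stages, so check the option names against the reference for the version you run.

```yaml
# aggregator.yaml: illustrative only; names, endpoints, and sizes are assumptions.
sources:
  from_agents:
    type: vector
    address: 0.0.0.0:6000

sinks:
  # System of record: buffer on disk so queued data survives a restart.
  record:
    type: aws_s3
    inputs:
      - from_agents
    bucket: example-archive-bucket      # hypothetical bucket
    region: us-east-1                   # hypothetical region
    encoding:
      codec: json
    acknowledgements:
      enabled: true                     # sources wait until this sink has the data
    buffer:
      type: disk
      max_size: 1073741824              # bytes (1 GiB), an arbitrary example size
      when_full: block

  # System of analysis: waterfall buffer, memory first, then overflow to disk.
  analysis:
    type: elasticsearch
    inputs:
      - from_agents
    endpoints:
      - https://elasticsearch.example.internal:9200   # hypothetical endpoint
    acknowledgements:
      enabled: true
    buffer:
      - type: memory
        max_events: 500
        when_full: overflow             # spill to the next stage when full
      - type: disk
        max_size: 1073741824
        when_full: block
```

The policy for the last buffer stage (block here) is a design choice. Blocking applies back pressure to upstream agents, while dropping would favor availability over completeness.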

Hardening

See the hardening recommendations for more detail.

Sizing, Scaling, & Capacity Planning

Rolling Out

See the rolling out document for more detail.

Advanced

Pub-Sub Systems

We do not recommend provisioning a new pub-sub service for the sole purpose of Vector. Vector can be deployed in a highly available manner that minimizes the need for such systems.

The aggregator architecture can be deployed as a consumer of a pub-sub service, like Kafka:

[Diagram: Aggregator]

Partitioning

Partitioning, or “topics” in Kafka terminology, refers to separating data in your pub-sub systems. We strongly recommend partitioning along data origin lines, such as the service or host that generated the data.

[Diagram: Aggregator]

Recommendations

  • Use memory buffers with buffer.when_full set to block. This ensures that back pressure flows upstream to your pub-sub system, where durable buffering should occur.
  • Enable end-to-end acknowledgements for your Vector pub-sub source (e.g., the kafka source) to ensure data is persisted downstream before it is removed from your pub-sub systems.
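
A minimal sketch of these two recommendations, assuming Kafka as the pub-sub system, might look like this. The broker addresses, consumer group, topic names, and the elasticsearch sink are hypothetical. The topics are named per originating service only to illustrate partitioning along data origin lines. The fragment also assumes a Vector version in which acknowledgements are enabled on the sinks that a source feeds.

```yaml
# Illustrative fragment: all hostnames, topics, and component names are assumptions.
sources:
  from_kafka:
    type: kafka
    bootstrap_servers: kafka-1.example.internal:9092,kafka-2.example.internal:9092
    group_id: vector-aggregators        # hypothetical consumer group
    topics:
      - logs-checkout                   # one topic per originating service
      - logs-payments

sinks:
  analysis:
    type: elasticsearch
    inputs:
      - from_kafka
    endpoints:
      - https://elasticsearch.example.internal:9200
    acknowledgements:
      enabled: true                     # data is acknowledged to Kafka only after delivery
    buffer:
      type: memory
      max_events: 500
      when_full: block                  # back pressure flows upstream to Kafka
```

With a blocking memory buffer, a slow sink causes the kafka source to read more slowly, so unprocessed data stays in Kafka rather than accumulating in Vector.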

Global Aggregation

Because Vector can be deployed anywhere in your infrastructure, it offers a unique approach to global aggregation. Aggregation can be tiered, allowing local aggregators to reduce data before forwarding it to your global aggregators.

[Diagram: Aggregator]

This eliminates the need to deploy a single monolithic aggregator, which would create an unnecessary single point of failure. Global aggregation should therefore be limited to use cases that can reduce data, such as computing global histograms.

Recommendations

  • Limit global aggregation to tasks that can reduce data, such as computing global histograms. Never send all data to your global aggregators.
  • Continue to use your local aggregators to process and deliver most data. Never introduce a single point of failure.
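
One way to picture tiered aggregation is a local aggregator that derives a histogram from logs, aggregates it over an interval, and forwards only the reduced metrics to a global tier. The sketch below assumes log events carry a numeric duration_ms field and a service field. Those field names, the component names, the 10-second interval, and the global hostname are all hypothetical.

```yaml
# local-aggregator.yaml: illustrative only; fields and addresses are assumptions.
sources:
  from_agents:
    type: vector
    address: 0.0.0.0:6000

transforms:
  # Derive a histogram metric from an assumed duration_ms log field.
  request_duration:
    type: log_to_metric
    inputs:
      - from_agents
    metrics:
      - type: histogram
        field: duration_ms
        name: request_duration_ms
        tags:
          service: "{{ service }}"

  # Combine metrics locally so fewer events are sent to the global tier.
  reduce_locally:
    type: aggregate
    inputs:
      - request_duration
    interval_ms: 10000

sinks:
  to_global:
    type: vector
    inputs:
      - reduce_locally                  # only reduced metrics are forwarded
    address: vector-global.example.internal:6000
```

The raw logs would still be delivered by the local aggregator's own sinks, which are omitted here for brevity. The global tier sees only the reduced metrics.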

Support

For easy setup and maintenance of this architecture, consider Vector’s discussions or chat. These are free, best-effort channels. For enterprise needs, consider Datadog Observability Pipelines, which comes with enterprise-level support. In addition, we offer Datadog Premier Support Services for expert architectural guidance. Contact us for more information.