How Humio Does Scale-Out Clustering

When scaling a Humio cluster, there are two main concerns to optimize: ingest and search.

The cluster configuration allows:

  • scaling out to deal with high ingest
  • scaling out to improve query performance

In a Humio cluster, some nodes are arrival nodes: they are the ones that actually receive data from the outside world. You can specify which nodes handle ingest processing and which nodes are used for search. The nodes responsible for search also store the underlying segment files. In the default setup all nodes are equal: each one plays the roles of arrival, ingest, and search node, so the load is distributed evenly across the cluster. This is a good starting point.
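To make the role split concrete, here is a minimal sketch in Python. It is only an illustrative model of the idea described above, not Humio's configuration format or API. The names `Node`, `nodes_with`, the role constants, and the node names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical role names taken from the prose above;
# these are not Humio configuration keys.
ARRIVAL, INGEST, SEARCH = "arrival", "ingest", "search"
ALL_ROLES = frozenset({ARRIVAL, INGEST, SEARCH})


@dataclass(frozen=True)
class Node:
    name: str
    # Default setup: every node plays every role.
    roles: frozenset = ALL_ROLES


def nodes_with(cluster, role):
    """Return the names of the nodes in `cluster` that play `role`."""
    return [node.name for node in cluster if role in node.roles]


# Default setup: all nodes are equal, so load spreads across all of them.
default_cluster = [Node(f"node{i}") for i in range(1, 4)]

# One possible specialized layout (an assumption for illustration):
# two nodes receive and process ingest, two nodes store segments and search.
specialized_cluster = [
    Node("node1", frozenset({ARRIVAL, INGEST})),
    Node("node2", frozenset({ARRIVAL, INGEST})),
    Node("node3", frozenset({SEARCH})),
    Node("node4", frozenset({SEARCH})),
]

print(nodes_with(default_cluster, SEARCH))
print(nodes_with(specialized_cluster, INGEST))
print(nodes_with(specialized_cluster, SEARCH))
```

In the default cluster every role query returns all nodes. In the specialized layout, ingest and search land on disjoint sets of nodes, which is what lets each concern be scaled independently.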

We discussed part of the ingest pipeline in a previous post, but left out how we scale out ingest to a cluster.