For large Splunk deployments, we are often asked how to centrally monitor the platform. The problem at hand is the Monitoring of Monitoring.
Let’s say we have a production cluster spread across 100+ locations globally, and the goal is to provide visibility across all the Splunk environments: performance, errors, issues and so on. The options commonly visited are as follows:
The first option is to connect every environment to a single central Monitoring Console over distributed search. This is feasible and fulfils the purpose, but the complexity rises when there are issues around network connectivity, cross-data-centre chatter and security concerns over data egress, all of which make this kind of solution difficult to put in place. Moreover, there can be limitations in the IP address, CIDR and hostname conventions followed across different sites.
The second option, setting up a separate monitoring console in each environment, is a no-go, since it would be too much overhead for someone to maintain that many consoles separately. (Bad idea! No central visibility, and inconvenient.)
Since both of the usual suspects have been ruled out, it is time to think outside the box. The third option uses Splunk’s built-in “indexAndForward” capability to deal with the problem. It ensures that each cluster has access to its own internal data AND sends a copy of that data over to the Central Monitoring Console (yes, what an original name!).
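At its core, the approach rests on a single outputs.conf setting on each indexer. A minimal sketch, assuming a target group named primary_indexers (the same group name used in the full example further below):

[tcpout]
defaultGroup = primary_indexers
# Index a local copy of the data in addition to forwarding it
indexAndForward = true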
Following are the default parameters for the whitelist and blacklist filters in outputs.conf:
# Match all indexes
forwardedindex.0.whitelist = .*
# Blacklist all indexes starting with "_"
forwardedindex.1.blacklist = _.*
forwardedindex.2.whitelist = (_audit|_internal|_introspection|_telemetry)
With the above configuration, Splunk forwards all the data, both its own internal logs and the data sources coming in: the filters are evaluated in numeric order and the last matching filter wins, so every index matches filter 0, and the internal indexes blacklisted by filter 1 are re-whitelisted by filter 2.
To address the problem, we have to whitelist only the internal indexes (_*) in outputs.conf on the indexers. Post the below change, Splunk will only forward the data coming into the internal indexes. On the indexer, make the following change to outputs.conf:
[tcpout]
defaultGroup = primary_indexers
# Keep indexing the data locally while forwarding (the indexAndForward capability described above)
indexAndForward = true
forwardedindex.0.whitelist = (_audit|_internal|_introspection|_telemetry)
forwardedindex.1.blacklist = _.*
forwardedindex.2.whitelist = (_audit|_internal|_introspection|_telemetry)

[tcpout:primary_indexers]
server = <ip_addr>:9997,<ip_addr>:9997,<ip_addr>:9997,<ip_addr>:9997
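Once the change is deployed, a quick sanity check from the Central Monitoring Console is to confirm that internal events are arriving from every site; a sketch of such a search (the hosts you expect to see depend on your own naming conventions):

index=_internal | stats count by host, sourcetype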
In contrast, consider a configuration like the following:

[tcpout]
defaultGroup = primary_indexers
forwardedindex.0.whitelist =
forwardedindex.1.blacklist =
forwardedindex.5.whitelist = (_audit|_internal|_introspection|_telemetry)
server = 10.1.5.2:9997,10.1.5.3:9997,10.1.5.4:9997,10.1.5.5:9997

[tcpout:primary_indexers]
server = 10.1.5.2:9997,10.1.5.3:9997,10.1.5.4:9997,10.1.5.5:9997
This won’t work, for the following reasons:
1. The forwardedindex filter numbers are expected to run sequentially from 0; the gap between filter 1 and filter 5 means the whitelist at 5 is not applied as intended.
2. Blanking out the default filters 0 and 1 is not the supported way to remove them; if the goal is to disable index filtering altogether, that is what forwardedindex.filter.disable = true is for.
3. server is a target-group setting: it belongs in the [tcpout:primary_indexers] stanza, not in the global [tcpout] stanza.
You can get creative and tag the data using inputs.conf to differentiate the clusters based on metadata. You can then use these tags in the Monitoring Console to tell the clusters apart at a glance, which makes the otherwise challenging problem of isolating an issue to a particular cluster very easy to tackle.
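As a minimal sketch of such tagging, the _meta setting in inputs.conf can stamp an indexed field onto everything an instance sends; the field name splunk_cluster and its value here are illustrative assumptions, not fixed names:

# inputs.conf on every instance in a given cluster
[default]
_meta = splunk_cluster::apac_prod_01

On the Central Monitoring Console, that indexed field can then scope a search to a single cluster, for example:

index=_internal splunk_cluster::apac_prod_01 log_level=ERROR | stats count by host, component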
For all the Splunkers visiting Splunk Conf 2019 next week: we urge you to visit our Booth #160 at the event for more such tips and tricks that help you manage Splunk better.