Migrate My Indexer Cluster to New Hosts
Hello again, everyone!
Here today to talk about migrating an indexer cluster to new hosts. I recently had a project to refresh all of my Splunk VMs with new physical hardware, which included migrating data off the SAN/NAS storage as well.
We had 24 indexers with 17TB of hot/warm data on NAS and about 25TB on SAN storage.
We were moving from VMs with this storage setup to new physical servers with all local storage and SSDs for hot/warm data.
Splunk has a native "copy" behavior built into the "offline" command: the cluster essentially migrates the data for you, which saves a huge amount of work.
A couple other notes I'd like to add: The new hosts should already be in-place and running with Splunk installed. It is assumed they are already connected to the License Master as well.
Migration is actually pretty easy; here's a basic outline (with commands):
Add new indexers to the cluster
Pre 8.1.0: $SPLUNK_HOME/bin/splunk edit cluster-config -mode slave -master_uri https://<host>.com:8089 -replication_port 9887 -secret <splunk_secret>
8.1.0 and later: $SPLUNK_HOME/bin/splunk edit cluster-config -mode peer -master_uri https://<host>.com:8089 -replication_port 9887 -secret <splunk_secret>
Restart after the command is run. Test.
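If you prefer managing this by config file, the command above writes roughly the following into server.conf on each new peer. This is a sketch using the 8.1.0+ naming; `<host>` and `<splunk_secret>` are the same placeholders as in the commands above:

```ini
# server.conf on each new indexer (peer) - approximately what
# `splunk edit cluster-config` writes for you. Values in <> are
# placeholders; Splunk encrypts pass4SymmKey on restart.
[clustering]
mode = peer
master_uri = https://<host>.com:8089
pass4SymmKey = <splunk_secret>

[replication_port://9887]
```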
Prepare to decommission old indexers
– Point forwarders to new indexers - For me this is as simple as adjusting the outputs.conf app I created on my deployment server and pushing the change, while making sure that the endpoints restart Splunk as well.
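For reference, the outputs.conf change amounts to swapping the server list in the deployed app. A minimal sketch; the group name, hostnames, and port here are hypothetical:

```ini
# outputs.conf in the deployment-server app pushed to forwarders.
# Replace the old indexer hostnames with the new ones; forwarders
# auto-load-balance across the server list.
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = new-idx01.example.com:9997, new-idx02.example.com:9997, new-idx03.example.com:9997
```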
– Put old indexers into detention - I went to each legacy indexer and ran: $SPLUNK_HOME/bin/splunk edit cluster-config -auth <username>:<password> -manual_detention on
This keeps the data that currently resides on the legacy indexers searchable while stopping the flow of incoming data to those hosts.
Decommission old indexers (one at a time is best practice)
– Run the offline command:
$SPLUNK_HOME/bin/splunk offline --enforce-counts
– Wait for indexer status to show as GracefulShutdown in CM UI
After running the offline command, the CM puts the peer into "Decommissioning" status, which then turns into "GracefulShutdown".
– Repeat for remaining indexers
– CM will fix / migrate buckets to new hardware
The time it takes to migrate all depends on the amount of data and the bandwidth. For me, transferring roughly 42TB over 10Gb pipes took about 16-18 hours per indexer.
– Remove the old peer from the master's list
$SPLUNK_HOME/bin/splunk remove cluster-peers -peers <guid>,<guid>,<guid>,...
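The peer GUIDs are visible on the cluster manager (in the CM UI, or via the CLI). If you collect them into a file, here's a small sketch to build the comma-separated -peers argument; the filename and GUID placeholders are hypothetical:

```shell
# Hypothetical helper: save the decommissioned peers' GUIDs, one per
# line (these example GUIDs are placeholders), then join them into the
# comma-separated list that `remove cluster-peers -peers` expects.
printf '%s\n' '<guid1>' '<guid2>' '<guid3>' > old_peer_guids.txt
peers=$(paste -sd, old_peer_guids.txt)
echo "$peers"   # <guid1>,<guid2>,<guid3>

# Then, on the cluster manager:
# $SPLUNK_HOME/bin/splunk remove cluster-peers -peers "$peers"
```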
You CAN lower the bandwidth-related settings below during business hours to prevent bandwidth saturation, then raise them during off hours to speed up the migration.
Accelerated Migration - Data rebalancing uses the value of these attributes reduced by 1.
max_peer_build_load = 3
* Max # of concurrent tasks to make buckets searchable on peers
* Defaults to 2.
max_peer_rep_load = 6
* Max # of concurrent replications a peer can accept as a target
* Defaults to 5.
$SPLUNK_HOME/bin/splunk edit cluster-config -max_peer_build_load 3
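These attributes live in server.conf on the cluster manager; if you prefer editing the file directly, a sketch using the values above:

```ini
# server.conf on the cluster manager - raise these during off hours to
# accelerate bucket fixup, lower them again during business hours.
[clustering]
max_peer_build_load = 3
max_peer_rep_load = 6
```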
At this point the legacy indexers will have Splunk in a stopped state. From here you can fully shut down the machines.
Hope this helps some people work through a migration. There were some things that made me really nervous, but it went surprisingly smoothly.