
Splunk - Verifying accuracy of counts

  • Writer: Todd Waller
  • 23 hours ago

Hello everyone, and welcome back to Old Logs New Tricks. Happy New Year! This year we will be working to provide new tips and tricks, as well as other interesting things we think will be helpful.


Today I will be talking about accurately verifying results.

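I find that the hardest part of verification is confirming that the source data and its counts match what Splunk reports, precisely because Splunk makes getting a count so easy. Time ranges are critical as well: the source data and the Splunk search have to cover the same period.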

For example, imagine you have a log file with the following fields:

INFO 2026-01-05 13:10:15,234 Type=AAAA ID=012345 Extended_ID=ab234567.23452365

INFO 2026-01-05 13:10:15,234 Type=BB ID=345567 Extended_ID=ab456785678.3465689

INFO 2026-01-05 13:10:15,234 Type=CCC ID=678905 Extended_ID=ab06789.458657


Using a simple search in Splunk, counting by Type, ID and Extended_ID would yield:

Type  ID      Extended_ID          Count
AAAA  012345  ab234567.23452365    1
BB    345567  ab456785678.3465689  1
CCC   678905  ab06789.458657       1
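The same count can be reproduced outside Splunk so there is something independent to compare against. Here is a minimal Python sketch, assuming every record is one line and the fields appear as key=value pairs like the sample above. The function name count_by_fields is made up for this illustration.

```python
import re
from collections import Counter

# Matches key=value pairs such as Type=AAAA or Extended_ID=ab234567.23452365.
PAIR = re.compile(r"(\w+)=(\S+)")

def count_by_fields(lines, fields=("Type", "ID", "Extended_ID")):
    """Count log lines grouped by the given key=value fields."""
    counts = Counter()
    for line in lines:
        pairs = dict(PAIR.findall(line))
        # Skip lines that do not carry every field we group by.
        if all(field in pairs for field in fields):
            counts[tuple(pairs[field] for field in fields)] += 1
    return counts

sample = [
    "INFO 2026-01-05 13:10:15,234 Type=AAAA ID=012345 Extended_ID=ab234567.23452365",
    "INFO 2026-01-05 13:10:15,234 Type=BB ID=345567 Extended_ID=ab456785678.3465689",
    "INFO 2026-01-05 13:10:15,234 Type=CCC ID=678905 Extended_ID=ab06789.458657",
]

for key, n in sorted(count_by_fields(sample).items()):
    print(*key, n)
```

Values stay as strings here, so an ID such as 012345 keeps its leading zero.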


To verify this against the actual file, you would need to download the file, then open it in an editor and compare the counts to what Splunk reports. On a small scale this is easy. On a large scale, with thousands of records, it's not so easy. It is, however, necessary to verify accuracy.
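Since time ranges are critical, it can help to first cut the source file down to the same window the Splunk search covered. The sketch below is only an illustration. It assumes the timestamp is the second and third whitespace-separated token, as in the sample lines. It also assumes a window that includes the start and excludes the end, and that the file and the search use the same time zone. The name in_window is hypothetical.

```python
from datetime import datetime

# Timestamp layout of the sample lines: 2026-01-05 13:10:15,234
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

def in_window(line, start, end):
    """Return True if the line's timestamp falls in [start, end)."""
    parts = line.split(maxsplit=3)
    if len(parts) < 3:
        return False  # not a record in the assumed layout
    try:
        ts = datetime.strptime(f"{parts[1]} {parts[2]}", TS_FORMAT)
    except ValueError:
        return False  # timestamp missing or in another format
    return start <= ts < end

line = "INFO 2026-01-05 13:10:15,234 Type=AAAA ID=012345 Extended_ID=ab234567.23452365"
start = datetime(2026, 1, 5, 13, 0, 0)
end = datetime(2026, 1, 5, 14, 0, 0)
print(in_window(line, start, end))
```

Lines that fall outside the window, or that cannot be parsed, are left out of the comparison rather than guessed at.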

What I do is download the Splunk results and open them in a spreadsheet. Then I download the data source file, open it in Excel, and split the data into columns based on delimiters.


This splits each line into columns. Then split those columns again on the "=" delimiter so that each field name and each value ends up in its own column.


Remove unneeded columns and data.

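The splitting steps above can also be scripted when the file is too large to handle comfortably in a spreadsheet. This is a rough sketch, not a replacement for the Excel workflow. It assumes the three leading tokens are level, date and time, and the column names Level, Date and Time are made up for the illustration.

```python
import csv
import io

def line_to_row(line):
    """Split one log line on whitespace, then split key=value tokens on '='."""
    tokens = line.split()
    # Assumed layout: level, date, time, then key=value pairs.
    row = {"Level": tokens[0], "Date": tokens[1], "Time": tokens[2]}
    for token in tokens[3:]:
        key, sep, value = token.partition("=")
        if sep:  # keep only tokens that really are key=value
            row[key] = value
    return row

def to_csv(lines, columns=("Type", "ID", "Extended_ID")):
    """Return CSV text with only the wanted columns, dropping the rest."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for line in lines:
        writer.writerow(line_to_row(line))
    return out.getvalue()

print(to_csv([
    "INFO 2026-01-05 13:10:15,234 Type=AAAA ID=012345 Extended_ID=ab234567.23452365",
]))
```

The values are written as text. When the resulting file is opened in Excel, an ID like 012345 may be shown as 12345 unless the column is imported as text, which is worth checking before comparing IDs.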

At this point you can paste the columns into a single sheet and then use the option at Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values. IDs that appear in both columns are highlighted, so any ID left unhighlighted is one that does not match.


As you can see in the last example, either we expected a record that isn't there, or there's an additional record that's being counted. Either way, that one record changes the count. You would need to determine whether it should be there, and whether the count is accurate or you need to change your criteria.
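The comparison itself can be sketched in a few lines of Python. This assumes the counts from both sides have already been loaded into dictionaries keyed by ID. The function name compare_counts and the sample values, including the ID missing from the Splunk side, are invented for the illustration.

```python
from collections import Counter

def compare_counts(source_counts, splunk_counts):
    """Return {key: (source_count, splunk_count)} for every key whose counts differ."""
    diffs = {}
    for key in source_counts.keys() | splunk_counts.keys():
        src = source_counts.get(key, 0)
        spl = splunk_counts.get(key, 0)
        if src != spl:
            diffs[key] = (src, spl)
    return diffs

# Made-up data: one ID is present in the source file but absent from Splunk.
source = Counter({"012345": 1, "345567": 1, "678905": 1})
splunk = Counter({"012345": 1, "345567": 1})

for key, (src, spl) in sorted(compare_counts(source, splunk).items()):
    print(f"{key}: source={src} splunk={spl}")
# prints: 678905: source=1 splunk=0
```

A key that shows up on only one side is reported with a count of 0 on the other, which covers both the missing record and the extra record case.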


Additionally, there are diff tools, such as WinMerge and similar desktop or online tools, that will do this for you, but you will need to adhere to your security rules to protect the data.
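If the data cannot leave your machine, a diff can also be produced locally with Python's standard difflib module. This is a small sketch that assumes two lists of IDs have already been extracted. The name local_diff and the sample IDs are hypothetical.

```python
import difflib

def local_diff(source_ids, splunk_ids):
    """Return unified-diff lines between two sorted lists of IDs."""
    return list(difflib.unified_diff(
        sorted(source_ids),
        sorted(splunk_ids),
        fromfile="source",
        tofile="splunk",
        lineterm="",  # inputs have no trailing newlines
        n=0,          # no context lines, only the differences
    ))

# Made-up IDs: the last one is missing from the Splunk side.
for diff_line in local_diff(["012345", "345567", "678905"], ["012345", "345567"]):
    print(diff_line)
```

Lines starting with "-" are only in the source list and lines starting with "+" are only in the Splunk list. Sorting both lists first keeps a difference in ordering from showing up as a difference in content.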


This is just a simple example; there are certainly other ways to verify the data, as well as any analytics you are doing with it. Imagine a dataset 100 times larger and how complicated that may be.


What are some of your go-to verification techniques?


Thanks for reading, have a great day!

Cheers!

-Todd

 
 
 

©2018 by Old Logs New Tricks. Proudly created with Wix.com
