Please read this entire document:

*** Checksplunk is not a supported product. It is use-at-your-own-risk software. ***

Like any new program, try checksplunk in a development or test environment first.

More information can be found at http://www.ugu.com/software/checksplunk

written by: Kirk Waingrow 3/2009 - kirk.waingrow@gmail.com

1. OVERVIEW
2. FEATURES
3. INSTALLATION
4. USING CHECKSPLUNK
5. EXAMPLES
6. COMMANDS RAN BY CHECKSPLUNK

=============================
1. OVERVIEW
=============================

Checksplunk is a non-obtrusive Perl script for Splunk Administrators to
understand the health and integrity of Splunk and the server(s) Splunk is
running on. It doesn't write anything to the system or any splunk config
files. It doesn't change or modify anything; it only reads information that
is readily available from the server or within Splunk.

While checksplunk provides useful information on the fly, it was designed for
spdash (http://www.ugu.com/software/spdash), a web based splunk dashboard.

Feel free to copy and distribute for others to use. Feel free to offer
feature requests or contributions.

=============================
2. FEATURES
=============================

The following is collected and displayed by checksplunk:

System Level Output
    CPU load (vmstat)
    Disk utilization (iostat) on disk with hot/warm DBs
    Load Average (uptime)
    Free memory (meminfo)
    Server hostname
    Disk size of dbase storage
    Current day/time
    Seconds since 1970 (see spdash docs)

Splunk Level Output
    Splunk version
    Splunk daemon running (from process table)
    Splunkd running (from splunk status)
    Splunkweb running (from splunk status)
    Number of events indexed
    Number of errors in the log files
    Display errors in a log file
    Display the errors in the log files
    Number of hosts
    Display indexed hosts
    License Information
    Number of user accounts created in Splunk
    Display users with accounts in Splunk
    Display users actively working in Splunk
    Display user actions
    Display servers that are license hogs

=============================
3. INSTALLATION
=============================

Checksplunk is very easy to get working. Checksplunk is a simple Perl script
that does not rely on any Perl modules. It can run from any directory and
should run out of the box after a few simple changes to the variables at the
top of the script to match your environment.

* Checksplunk needs to be run by a user account that has access to execute
  splunk CLI commands and view splunk log files. Restrict checksplunk so that
  only the user account running it can read, write, and execute it:

    $ chown splunkadmin:splunkadmin checksplunk
    $ chmod 700 checksplunk

* Changes may be needed at the top of the script to the following:

    #! /usr/bin/perl
        At the top of the script, set your path to Perl.

    $SPLUNK="/opt/splunk/bin/splunk";
        Set the path to the installed version of the splunk binary command.

    $AUTH="-auth splunkadmin:xxxxxx";
        Set the authentication ID and password of an account in Splunk.
        This is a splunk account and should have the admin/power/user role.
        Yes, the account password is stored in the clear, which is why
        checksplunk should have its permissions restricted to the owner
        only (700).

    $STORAGE="/storage";
        Mounted filesystem name of the Hot/Warm DBs. (This is not a path.)
        Go to the directory of your Hot/Warm DBs and type:
            df .
        This is used for finding remaining storage space using the df
        command.

    $DISKIO="c0d0";
        The device (c0d0, sda1, etc.) that iostat checks. Use df and fdisk
        to find the device name the Hot/Warm DBs are on. The iostat command
        is not installed on some linux versions by default. Sometimes iostat
        is packaged with the sysstat software package.

    $IOCOUNT=5;
        To get an accurate iostat, the first stat returned is dropped and
        the remaining results are averaged. Each result takes 1 second to
        return. If IOCOUNT is set to 5, it will take 5 seconds to run and
        the total count is averaged across 4 results.

    $MAXOUT=10000;
        Used for returning maximum results for displaying users and hosts.

    $STATDIR="/home/kwaingrow/splunk/spdash/status";
        The location where the "spdash" web dashboard will look for the data
        files created by checksplunk.

    $HST=`hostname -a`;
        This is for the files that are created by using the "spdash" option.
        It is best to use the fully qualified hostname, if possible.

    $LOGDIR="/opt/splunk/var/log/splunk";
        Location of the splunk log directories.

    $TMP="/tmp";
        Temp files are stored in this location. The script cleans up the
        temp files when it is done with them. A lock (LCK) file is also
        written here to ensure that checksplunk does not run multiple times
        in case it hangs on a splunk or system command.

=============================
4. USING CHECKSPLUNK
=============================

The following is the syntax for using checksplunk and the arguments that can
be passed to it. The [C] or [S] at the end of an argument denotes whether it
is part of the -C or the -S option. The "hosts" and "users" arguments must be
used by themselves; they cannot be combined with any other arguments.

NOTE: When checksplunk runs it creates the file $TMP/checksplunk.LCK; when it
finishes it removes this file.
This ensures that if something hangs and checksplunk is executed from cron,
it won't spawn multiple checksplunk processes.

SYNTAX: checksplunk [OPTIONS]

    hosts  : Display all hosts indexed by Splunk
    hogs   : Display the top 10 systems using the largest amount of license in kb
    search : Displays the number of searches & last access time by user
    spdash : Builds all the SPDASH files needed for the web dashboard interface
    users  : Display users authenticated to use Splunk

    -A : All options are processed, excluding -G, hosts, and users
    -c : CPU load (vmstat) [C]
    -C : display all 'computer' related information
    -d : splunkd running (from splunk status) [S]
    -D : Add a description to the output of an argument
    -e : number of events indexed [S]
    -g : number of errors in the log files [S]
    -G : display the errors in the log files [S]
    -h : number of hosts [S]
    -i : disk utilization (iostat) on disk with hot/warm dbs [C]
    -l : Load Average (uptime) [C]
    -L : license information [S]
    -m : free memory (meminfo) [C]
    -n : name of the server/host [C]
    -p : splunk daemon running (from process table) [S]
    -s : disk size of dbase storage [C]
    -S : display all 'splunk' related information, excluding -G, hosts, users
    -t : current day/time [C]
    -u : number of users authenticated to use Splunk [S]
    -U : Output user audit logs
    -v : splunk version [S]
    -w : splunkweb running (from splunk status) [S]

=============================
5. EXAMPLES
=============================

Show the number of events that splunk has indexed:
    $ checksplunk -e

To see all the users that splunk knows about:
    $ checksplunk users

To see all the hosts indexed by splunk:
    $ checksplunk hosts

To see all the server related information:
    $ checksplunk -CD

To see all the splunk related information:
    $ checksplunk -SD

To use with "spdash":
    $ checksplunk -AD > /var/www/html/spdash/`hostname -a`

To build all the necessary files for the "spdash" web based dashboard:
    $ checksplunk spdash

=============================
6. COMMANDS RAN BY CHECKSPLUNK
=============================

The following are all the commands that checksplunk runs. As you can see,
checksplunk is not doing anything special. You may want to run these commands
yourself to ensure that your system supports all of them.

System Related Commands
-----------------------------------
    hostname -a          Server name
    date                 Date
    uptime               Load Average
    vmstat               cpu utilization
    ps -ef               check process table (may need to be swapped with aux)
    date +%s             seconds since 1970
    df                   disk check
    cat /proc/meminfo    check meminfo
    cat /proc/cpuinfo    check cpuinfo
    iostat               check iostat (packaged with sysstat sometimes)

Splunk Related Commands
-----------------------------------
    splunk version

    splunk status

    splunk dispatch '| metadata type=hosts | stats sum(totalCount)'
        # of events

    splunk dispatch 'index=_internal todaysBytesIndexed LicenseManager-Audit NOT source=*web_service.log NOT source=*web_access.log | eval Daily_Indexing_Volume_in_MBs = todaysBytesIndexed/1024/1024 | timechart avg(Daily_Indexing_Volume_in_MBs) by host | convert ctime(_time)'

    splunk dispatch '| metadata type=hosts'

    splunk dispatch '| metadata type=hosts | stats count(host)'

    splunk list user
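
One way to act on the suggestion above (checking that your system supports
the commands) is a small pre-flight script. The following is only a sketch
and is not part of checksplunk: the helper name check_cmd is made up for
illustration, and the default SPLUNK path is just the example value from the
INSTALLATION section. It only tests that each command or file is present; it
does not run any splunk searches.

```shell
#!/bin/sh
# Pre-flight sketch: report which commands used by checksplunk are available.
# Assumption: SPLUNK defaults to the example path from the INSTALLATION section.
SPLUNK="${SPLUNK:-/opt/splunk/bin/splunk}"

# check_cmd NAME (hypothetical helper): look NAME up on PATH and report it.
# Returns 0 if found, 1 if missing.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok       $1"
    else
        echo "MISSING  $1"
        return 1
    fi
}

missing=0

# System commands from the "System Related Commands" list.
for cmd in hostname date uptime vmstat ps df iostat; do
    check_cmd "$cmd" || missing=$((missing + 1))
done

# /proc files that checksplunk reads with cat.
for f in /proc/meminfo /proc/cpuinfo; do
    if [ -r "$f" ]; then
        echo "ok       $f"
    else
        echo "MISSING  $f"
        missing=$((missing + 1))
    fi
done

# The splunk binary itself must exist and be executable.
if [ -x "$SPLUNK" ]; then
    echo "ok       $SPLUNK"
else
    echo "MISSING  $SPLUNK"
    missing=$((missing + 1))
fi

echo "$missing item(s) missing"
```

If iostat is reported as MISSING, installing the sysstat package is the usual
fix. To check a splunk binary in a different location, set SPLUNK in the
environment before running the script.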