# Network UPS Tools: example upsmon configuration
#
# This file contains passwords, so keep it secure.
# --------------------------------------------------------------------------
# RUN_AS_USER <userid>
#
# By default, upsmon splits into two processes. One stays as root and
# waits to run the SHUTDOWNCMD. The other one switches to another userid
# and does everything else.
#
# The default unprivileged user is set at compile-time with the option
# 'configure --with-user=...'
#
# You can override it with '-u <user>' when starting upsmon, or just
# define it here for convenience.
#
# Note: if you plan to use the reload feature, this file (upsmon.conf)
# must be readable by this user! Since it contains passwords, DO NOT
# make it world-readable. Also, do not make it writable by the upsmon
# user, since it creates an opportunity for an attack by changing the
# SHUTDOWNCMD to something malicious.
#
# For best results, you should create a new normal user like "nutmon",
# and make it a member of a "nut" group or similar. Then specify it
# here and grant read access to the upsmon.conf for that group.
#
# This user should not have write access to upsmon.conf.
#
# RUN_AS_USER nut
RUN_AS_USER root
# --------------------------------------------------------------------------
# MONITOR <system> <powervalue> <username> <password> ("primary"|"secondary" )
#
# List systems you want to monitor. Not all of these may supply power
# to the system running upsmon, but if you want to watch it, it has to
# be in this section.
#
# You must have at least one of these declared.
#
# <system> is a UPS identifier in the form <upsname>@<hostname>[:<port>]
# like ups@localhost, su700@mybox, etc.
#
# Examples:
#
# - "su700@mybox" means a UPS called "su700" on a system called "mybox"
#
# - "fenton@bigbox:5678" is a UPS called "fenton" on a system called
# "bigbox" which runs upsd on port "5678".
#
# The UPS names like "su700" and "fenton" are set in your ups.conf
# in [brackets] which identify a section for a particular driver.
#
# If the ups.conf on host "doghouse" has a section called "snoopy", the
# identifier for it would be "snoopy@doghouse".
#
# <powervalue> is an integer - the number of power supplies that this UPS
# feeds on this system. Most personal computers only have one power supply,
# so this value is normally set to 1, while most modern servers have at least
# two. You need a pretty big or special box to have any other value here.
#
# You can also set this to 0 for a system that doesn't take any power
# from the MONITORed supply, which you still want to monitor (e.g. for an
# administrative workstation fed from a different circuit than the datacenter
# servers it monitors). Use <powervalue> if 0 when you want to hear about
# changes for a given UPS without shutting down when it goes critical.
#
# <username> and <password> must match an entry in that system's
# upsd.users. If your username is "upsmon" and your password is
# "blah", the upsd.users would look like this:
#
# [upsmon]
# password = blah
# upsmon primary # (or secondary)
#
# "primary" means this system will shutdown last, allowing the secondary
# systems time to shutdown first.
#
# "secondary" means this system shuts down immediately when power goes
# critical and less than MINSUPPLIES power sources have reliable input feeds.
#
# The general assumption is that the "primary" system is the one with direct
# connection to an UPS (such as serial or USB cable), so the primary system
# runs the NUT driver and 'upsd' server locally and can manage the device,
# and it would often tell the UPS to completely power itself off as a step
# in power-race avoidance (see POWERDOWNFLAG for details).
#
# Also, since the primary system stays up the longest, it suffers higher risks
# of ungraceful shutdown if the estimation of remaining runtime (or of the
# time it takes to shut down this system) was guessed wrong. By consequence,
# the "secondary" systems typically monitor the power environment state
# through the 'upsd' processes running on the remote (often "primary" ) systems
# and do not directly interact with an UPS (no local NUT drivers are running
# on the secondary systems). As such, secondaries typically shut down as
# soon as there is a sufficiently long power outage, or a low-battery alert
# from the UPS, or a loss of connection to the primary while the power was
# last known to be missing.
#
# This assumption and configuration can also make sense for networked UPSes,
# where a rack full of servers might overload the communications capacity
# of the networked management card on the UPS - in this case you might either
# reduce the 'snmp-ups' or 'netxml-ups' driver polling rate, or dedicate a
# "primary" server and set up the rest as "secondary" systems.
#
# In case of such large setups as mentioned above, beware also that shutdown
# times of the rack done all at once can substantially differ from smaller
# scale experiments with single-server shutdowns, since systems can compete
# for shared storage and other limited resources as they go down (and also
# not everyone may safely shut down simultaneously - e.g. a NAS or DB server
# would better go down after all its clients). You would be well served by
# higher-end UPSes with manageable thresholds to declare a critical state.
#
# Examples:
#
# MONITOR myups@bigserver 1 upswired blah primary
# MONITOR su700@server.example.com 1 upsmon secretpass secondary
# MONITOR myups@localhost 1 upsmon pass primary # (or secondary)
# MONITOR ups@192.168.2.201 1 monuser secret slave
MONITOR ups@192.168.2.201 1 monuser secret secondary
# --------------------------------------------------------------------------
# MINSUPPLIES <num>
#
# Give the number of power supplies that must be receiving power to keep
# this system running. Most systems have one power supply, so you would
# put "1" in this field.
#
# Large/expensive server type systems usually have more, and can run with
# a few missing. Some of these can run with 2 out of 4, for example,
# so you'd set that to 2. The idea is to keep the box running as long
# as possible, right?
#
# Obviously you have to put the redundant supplies on different UPS circuits
# for this to make sense! See big-servers.txt in the docs subdirectory
# for more information and ideas on how to use this feature.
MINSUPPLIES 1
# --------------------------------------------------------------------------
# SHUTDOWNCMD "<command>"
#
# upsmon runs this command when the system needs to be brought down.
#
# This should work just about everywhere ... if it doesn't, well, change it,
# perhaps to a more complicated custom script.
#
# Note that while you experiment with the initial setup and want to test how
# your configuration reacts to power state changes and ultimately when power
# is reported to go critical, but do not want your system to actually turn
# off, consider setting the SHUTDOWNCMD temporarily to do something benign -
# such as posting a message with 'logger' or 'wall' or 'mailx'. Do be careful
# to plug the UPS back into the wall in a timely fashion.
SHUTDOWNCMD "/sbin/shutdown -h +0"
# --------------------------------------------------------------------------
# NOTIFYCMD <command>
#
# upsmon calls this to send messages when things happen
#
# This command is called with the full text of the message (from NOTIFYMSG)
# as one argument.
#
# The environment string NOTIFYTYPE will contain the type string of
# whatever caused this event to happen.
#
# The environment string UPSNAME will contain the name of the system/device
# that generated the change.
#
# Note that this is only called for NOTIFY events that have EXEC set with
# NOTIFYFLAG. See NOTIFYFLAG below for more details.
#
# Making this some sort of shell script might not be a bad idea.
# Alternately you can use the upssched program as your NOTIFYCMD for some
# more complex setups (e.g. to ease handling of notification storms).
# For more information and ideas, see docs/scheduling.txt
#
# Example:
# NOTIFYCMD /bin/notifyme
NOTIFYCMD /usr/sbin/upssched
# --------------------------------------------------------------------------
# POLLFREQ <n>
#
# Polling frequency for normal activities, measured in seconds.
#
# Adjust this to keep upsmon from flooding your network, but don't make
# it too high or it may miss certain short-lived power events.
POLLFREQ 5
# --------------------------------------------------------------------------
# POLLFREQALERT <n>
#
# Polling frequency in seconds while UPS on battery.
#
# You can make this number lower than POLLFREQ, which will make updates
# faster when any UPS is running on battery. This is a good way to tune
# network load if you have a lot of these things running.
#
# The default is 5 seconds for both this and POLLFREQ.
POLLFREQALERT 5
# --------------------------------------------------------------------------
# HOSTSYNC - How long upsmon will wait before giving up on another upsmon
#
# The primary upsmon process uses this number when waiting for secondary
# systems to disconnect once it has set the forced shutdown (FSD) flag.
# If they don't disconnect after this many seconds, it goes on without them.
#
# Similarly, upsmon secondary processes wait up to this interval for the
# primary upsmon to set FSD when an UPS they are monitoring goes critical -
# that is, on battery and low battery. If the primary doesn't do its job,
# the secondaries will shut down anyway to avoid damage to the file systems.
#
# This "wait for FSD" is done to avoid races where the status changes
# to critical and back between polls by the primary.
HOSTSYNC 15
# --------------------------------------------------------------------------
# DEADTIME - Interval to wait before declaring a stale ups "dead"
#
# upsmon requires a UPS to provide status information every few seconds
# (see POLLFREQ and POLLFREQALERT) to keep things updated. If the status
# fetch fails, the UPS is marked stale. If it stays stale for more than
# DEADTIME seconds, the UPS is marked dead.
#
# A dead UPS that was last known to be on battery is assumed to have gone
# to a low battery condition. This may force a shutdown if it is providing
# a critical amount of power to your system.
#
# Note: DEADTIME should be a multiple of POLLFREQ and POLLFREQALERT.
# Otherwise you'll have "dead" UPSes simply because upsmon isn't polling
# them quickly enough. Rule of thumb: take the larger of the two
# POLLFREQ values, and multiply by 3.
DEADTIME 15
# --------------------------------------------------------------------------
# POWERDOWNFLAG - Flag file for forcing UPS shutdown on the primary system
#
# upsmon will create a file with this name in primary mode when it's time
# to shut down the load. You should check for this file's existence in
# your shutdown scripts and run 'upsdrvctl shutdown' if it exists, to tell
# the UPS(es) to power off.
#
# See the config-notes.txt file in the docs subdirectory for more information.
# Refer to the section:
# [[UPS_shutdown]] "Configuring automatic shutdowns for low battery events"
# or refer to the online version.
POWERDOWNFLAG /etc/killpower
# --------------------------------------------------------------------------
# NOTIFYMSG - change messages sent by upsmon when certain events occur
#
# You can change the default messages to something else if you like.
#
# NOTIFYMSG <notify type> "message"
#
# NOTIFYMSG ONLINE "UPS %s on line power"
# NOTIFYMSG ONBATT "UPS %s on battery"
# NOTIFYMSG LOWBATT "UPS %s battery is low"
# NOTIFYMSG FSD "UPS %s: forced shutdown in progress"
# NOTIFYMSG COMMOK "Communications with UPS %s established"
# NOTIFYMSG COMMBAD "Communications with UPS %s lost"
# NOTIFYMSG SHUTDOWN "Auto logout and shutdown proceeding"
# NOTIFYMSG REPLBATT "UPS %s battery needs to be replaced"
# NOTIFYMSG NOCOMM "UPS %s is unavailable"
# NOTIFYMSG NOPARENT "upsmon parent process died - shutdown impossible"
#
# Note that %s is replaced with the identifier of the UPS in question.
#
# Possible values for <notify type>:
#
# ONLINE : UPS is back online
# ONBATT : UPS is on battery
# LOWBATT : UPS has a low battery (if also on battery, it's "critical" )
# FSD : UPS is being shutdown by the primary (FSD = "Forced Shutdown" )
# COMMOK : Communications established with the UPS
# COMMBAD : Communications lost to the UPS
# SHUTDOWN : The system is being shutdown
# REPLBATT : The UPS battery is bad and needs to be replaced
# NOCOMM : A UPS is unavailable (can't be contacted for monitoring)
# NOPARENT : The process that shuts down the system has died (shutdown impossible)
# --------------------------------------------------------------------------
# NOTIFYFLAG - change behavior of upsmon when NOTIFY events occur
#
# By default, upsmon sends walls (global messages to all logged in users)
# and writes to the syslog when things happen. You can change this.
#
# NOTIFYFLAG <notify type> <flag>[+<flag>][+<flag>] ...
#
# NOTIFYFLAG ONLINE SYSLOG+WALL
# NOTIFYFLAG ONBATT SYSLOG+WALL
# NOTIFYFLAG LOWBATT SYSLOG+WALL
# NOTIFYFLAG FSD SYSLOG+WALL
# NOTIFYFLAG COMMOK SYSLOG+WALL
# NOTIFYFLAG COMMBAD SYSLOG+WALL
# NOTIFYFLAG SHUTDOWN SYSLOG+WALL
# NOTIFYFLAG REPLBATT SYSLOG+WALL
# NOTIFYFLAG NOCOMM SYSLOG+WALL
# NOTIFYFLAG NOPARENT SYSLOG+WALL
#
# Possible values for the flags:
#
# SYSLOG - Write the message in the syslog
# WALL - Write the message to all users on the system
# EXEC - Execute NOTIFYCMD (see above) with the message
# IGNORE - Don't do anything
#
# If you use IGNORE, don't use any other flags on the same line.
NOTIFYFLAG ONLINE EXEC
NOTIFYFLAG ONBATT EXEC
NOTIFYFLAG LOWBATT EXEC
NOTIFYFLAG NOCOMM EXEC
NOTIFYFLAG COMMBAD IGNORE NOTIFYFLAG COMMOK IGNORE
NOTIFYFLAG SHUTDOWN IGNORE
NOTIFYFLAG FSD EXEC
NOTIFYFLAG NOPARENT SYSLOG
# --------------------------------------------------------------------------
# RBWARNTIME - replace battery warning time in seconds
#
# upsmon will normally warn you about a battery that needs to be replaced
# every 43200 seconds, which is 12 hours. It does this by triggering a
# NOTIFY_REPLBATT which is then handled by the usual notify structure
# you've defined above.
#
# If this number is not to your liking, override it here.
RBWARNTIME 43200
# --------------------------------------------------------------------------
# NOCOMMWARNTIME - no communications warning time in seconds
#
# upsmon will let you know through the usual notify system if it can't
# talk to any of the UPS entries that are defined in this file. It will
# trigger a NOTIFY_NOCOMM by default every 300 seconds unless you
# change the interval with this directive.
NOCOMMWARNTIME 300
# --------------------------------------------------------------------------
# FINALDELAY - last sleep interval before shutting down the system
#
# On a primary, upsmon will wait this long after sending the NOTIFY_SHUTDOWN
# before executing your SHUTDOWNCMD. If you need to do something in between
# those events, increase this number. Remember, at this point your UPS is
# almost depleted, so don't make this too high. If needed, on high-end UPS
# devices you can usually configure when the low-battery state is announced
# based on estimated remaining run-time or on charge level of the batteries.
#
# Alternatively, you can set this very low so you don't wait around when
# it's time to shut down. Some UPSes don't give much warning for low
# battery and will require a value of 0 here for a safe shutdown.
#
# Note: If FINALDELAY on the secondary is greater than HOSTSYNC on the
# primary, the primary will give up waiting for that secondary system
# to disconnect.
FINALDELAY 5
# --------------------------------------------------------------------------
# CERTPATH - path to certificates (database directory or directory with CA's)
#
# When compiled with SSL support, you can enter the certificate path here.
#
# With NSS:
# Certificates are stored in a dedicated database (split into 3 files).
# Specify the path of the database directory.
#
# CERTPATH /etc/nut/cert/upsmon
#
# With OpenSSL:
# Directory containing CA certificates in PEM format, used to verify
# the server certificate presented by the upsd server. The files each
# contain one CA certificate. The files are looked up by the CA subject
# name hash value, which must hence be available.
#
# CERTPATH /usr/ssl/certs
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTIDENT - self certificate name and database password
# CERTIDENT <certificate name> <database password>
#
# When compiled with SSL support with NSS, you can specify the certificate
# name to retrieve from database to authenticate itself and the password
# required to access certificate related private key.
#
# CERTIDENT "my nut monitor" "MyPasSw0rD"
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTHOST - security properties for an host
# CERTHOST <hostname> <certificate name> <certverify> <forcessl>
#
# When compiled with SSL support with NSS, you can specify security directive
# for each server you can contact.
# Each entry maps server name with the expected certificate name and flags
# indicating if the server certificate is verified and if the connection
# must be secure.
#
# CERTHOST localhost "My nut server" 1 1
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTVERIFY - make upsmon verify all connections with certificates
# CERTVERIFY 1
#
# When compiled with SSL support, make upsmon verify all connections with
# certificates.
# Without this, there is no guarantee that the upsd is the right host.
# Enabling this greatly reduces the risk of man in the middle attacks.
# This effectively forces the use of SSL, so don't use this unless
# all of your upsd hosts are ready for SSL and have their certificates
# in order.
# When compiled with NSS support of SSL, can be overridden for host
# specified with a CERTHOST directive.
# --------------------------------------------------------------------------
# FORCESSL - force upsmon to use SSL
# FORCESSL 1
#
# When compiled with SSL, specify that a secured connection must be used
# to communicate with upsd.
# If you don't use 'CERTVERIFY 1', then this will at least make sure
# that nobody can sniff your sessions without a large effort. Setting
# this will make upsmon drop connections if the remote upsd doesn't
# support SSL, so don't use it unless all of them have it running.
# When compiled with NSS support of SSL, can be overridden for host
# specified with a CERTHOST directive.
# --------------------------------------------------------------------------
# DEBUG_MIN - specify minimal debugging level for upsmon daemon
# e.g. DEBUG_MIN 6
#
# Optionally specify a minimum debug level for `upsmon` daemon, e.g. for
# troubleshooting a deployment, without impacting foreground or background
# running mode directly, and without need to edit init-scripts or service
# unit definitions. Note that command-line option `-D` can only increase
# this verbosity level.
#
# NOTE: if the running daemon receives a `reload` command, presence of the
# `DEBUG_MIN NUMBER` value in the configuration file can be used to tune
# debugging verbosity in the running service daemon (it is recommended to
# comment it away or set the minimum to explicit zero when done, to avoid
# huge journals and I/O system abuse). Keep in mind that for this run-time
# tuning, the `DEBUG_MIN` value *present* in *reloaded* configuration files
# is applied instantly and overrides any previously set value, from file
# or CLI options, regardless of older logging level being higher or lower
# than the newly found number; a missing (or commented away) value however
# does not change the previously active logging verbosity.
|