This document is intended to supplement the General Configuration and Change Management Plan for ARSC systems by providing details on how configuration files and local modifications are maintained across all ARSC platforms.
The general principle is that files are not arbitrarily modified and are monitored for correctness, including type, content, ownership, and mode (access permission). In addition, change history and backout copies are retained for all configuration changes; for recovery and security, these are maintained on an administrative host not exposed to general users. These principles apply to all aspects of systems management, but software installed and managed by vendor tools (rpm, pkginst, lslpp, ...), or installed under /usr/local using 'Installing Third Party Software', is managed by those procedures. Likewise, account creation is handled by separate procedures, even though the related files are monitored under ConfigFiles.
The ConfigFiles are typically under /etc or /usr, but may fall under other directories. Typically /usr/local files do not fall under ConfigFiles, but there are a few exceptions for critical files. Any file in system space which must be added, changed, or replaced (e.g., a binary) outside of a vendor-provided tool is a candidate for ConfigFiles, to ensure it is not regressed or arbitrarily lost or changed. Any files which have significant security ramifications, whether locally modified or not, are also monitored by ConfigFiles. The suid|sgid files which must be monitored for security are handled via the permchk process; a few exceptions are kept in ConfigFiles for local modification or version tracking.
ARSC uses similar procedures across all platforms:
All ConfigFiles changes are evaluated for impact. Major changes are installed on test systems and may be implemented during scheduled downtime. Major changes and significant changes are recorded via RT entries. Changes are typically made by System Administrator staff on the associated Support and Usability (S&U) Project, but where projects overlap or in on-call situations they may be made by other ARSC IT II staff.
Platform     Host(s)               Admin     /var/local/  List and machines     Uses   Other platform
type                               from      directories                        push?  tools
-----------  --------------------  --------  -----------  --------------------  -----  --------------
Linux        puppychow, ag*, cms,  dispatch  LinuxFiles   LinuxFiles/List.txt   yes    linuxman
             workstations (48+),                          LinuxFiles/Long.txt
             ...                                          etc/machines.linux
Sun Solaris  40+ systems           dispatch  SunFiles     SunFiles/List.txt     yes    mush
                                                          SunFiles/Long.txt            sunman
                                                          etc/machines.sun
Sun Linux    midnight              mn1sm     LinuxFiles   LinuxFiles/List.txt   yes    pdsh
             (420+ nodes)                                 LinuxFiles/Long.txt          midman
                                                          etc/machines.linux
Cray CLE     pingo (25)            piboot    CrayFiles    CrayFiles/List.txt    yes    xtopview
             ognip (7)             ogboot                 CrayFiles/Long.txt           xtspec
                                                          etc/machines.cray            pdsh
                                                                                       crayman
The machines file (for example, etc/machines.linux) contains the list of managed systems and is manually maintained. The white space delimited columns include:
     1  machine               # hostname
     2  type                  # e.g.: 'x4200', 'V120', 'w1100z', ...
     3  status                # 'offline' or 'ok' (or 'Linux', 'SunOS', ...)
     4  usage/class           # e.g.: 'admin', 'compute', 'login', ...
     5  version_major         # e.g.: '5.2', 'rh4', '2.6', ...
     6  version_minor         # e.g.: '4', '8', ...
     7  serial_number/alias   # e.g.: '0521AM0162', 'a=csmflyer', ...
     8  location/rack/other   # e.g.: 'Butro007', 'W105-G', ...
     9  management_server     # Where used, indicates controlling server (e.g.: 'nis', 'mn1sm', ...)
    10  reserved1             # - not currently used -
    11  reserved2             # - not currently used -
    12  comments              # Any content permitted, including white space.

Columns 1 - 9 are used by various scripts. The 'comments' column is always the last column. As the need arises, additional columns (such as 'reserved1' and 'reserved2') may from time to time be inserted just before the comments column. No column entry may contain white space, with the exception of the final field ('comments'). The 'status' field is used to indicate whether a system is online and must be updated when a machine is down for a significant period of time to prevent delays (ssh timeouts) in push and monitoring processes.
Contact the platform ISSO for the specific meaning of other columns.
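The column rules above (white space delimited, 'comments' always last and free to contain white space, 'status' marking offline machines) can be illustrated with a short Python sketch. This is not one of the ARSC scripts: the function names, the dictionary keys, and the treatment of lines starting with '#' as comments are assumptions made for illustration.

```python
# Hypothetical key names for the 12 columns described above.
COLUMNS = [
    "machine", "type", "status", "usage", "version_major", "version_minor",
    "serial_alias", "location_other", "management_server",
    "reserved1", "reserved2", "comments",
]


def parse_machines(text, columns=COLUMNS):
    """Return one dict per machine; blank lines and '#' lines are skipped."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on white space but keep the final 'comments' field intact.
        fields = line.split(None, len(columns) - 1)
        # Pad short lines so every record has every key.
        fields += [""] * (len(columns) - len(fields))
        records.append(dict(zip(columns, fields)))
    return records


def online(records):
    """Drop machines marked 'offline' so a push would not wait on ssh timeouts."""
    return [r for r in records if r["status"] != "offline"]
```

Because additional columns may be inserted before 'comments', the column list is passed as a parameter rather than fixed in the parsing logic.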
Note, capability for a virtual host alias column prior to "other" exists and is recognized for push commands, for example:

    #machine  type   status  usage    Maj  Min  alias       other    comments
    #-------  ----   ------  -----    ---  ---  -----       ----     --------
    flyman    flyer  AIX     admin    5.2  9    a=csmflyer  MS       management server
    iceflyer  flyer  AIX     login    5.2  9    a=f2n1      Login    p650 node
    icef1n1   flyer  AIX     compute  5.2  9    a=f1n1      Compute  p690 lpar
    icef3n1   flyer  AIX     test     5.2  9    a=f3n1      Test     p610 node
    icef4n1   flyer  AIX     compute  5.3  5    a=f4n1      Compute  p575 node
    ...

In the above, 'a=' allows a node to be recognized by a common alias, such as:

    dispatch: push -ibm -m csmflyer cmd -eq uname -a
    flyman:AIX csmflyer 2 5 00095C1A4C00

Note that any machine *Files entries must represent the actual host name, not the alias, e.g. krb5.keytab.flyman. The 'push -o' option can still be used:

    dispatch: push -ibm -o MS cmd -eq uname -a
    flyman:AIX csmflyer 2 5 00095C1A4C00
    bergman:AIX csmberg 2 5 0007BB3A4C00
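The alias and 'other' selection shown in the push examples above can be sketched in Python as follows. This only models the selection visible in those examples (a name matching either the host name or an 'a=' entry, and '-o' matching the 'other' column); how push itself implements this is not documented here, and the function names and dictionary keys are hypothetical.

```python
# Column layout of the alias example above (9 columns, comments last).
ALIAS_COLUMNS = ["machine", "type", "status", "usage", "maj", "min",
                 "alias", "other", "comments"]

SAMPLE = """\
#machine type  status usage   Maj Min alias      other   comments
flyman   flyer AIX    admin   5.2 9   a=csmflyer MS      management server
iceflyer flyer AIX    login   5.2 9   a=f2n1     Login   p650 node
icef1n1  flyer AIX    compute 5.2 9   a=f1n1     Compute p690 lpar
"""


def load(text):
    """Parse machine lines into dicts, skipping blank and '#' lines."""
    rows = []
    for line in text.splitlines():
        if line.strip() and not line.startswith("#"):
            fields = line.split(None, len(ALIAS_COLUMNS) - 1)
            rows.append(dict(zip(ALIAS_COLUMNS, fields)))
    return rows


def select_by_name(rows, name):
    """Match either the real host name or an 'a=<alias>' entry."""
    return [r["machine"] for r in rows
            if r["machine"] == name or r["alias"] == "a=" + name]


def select_by_other(rows, value):
    """Match the 'other' column, as the '-o' option appears to do."""
    return [r["machine"] for r in rows if r["other"] == value]
```

In both cases the result is the real host name, which is the name that per-machine *Files entries must use.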
These directories represent the monitored ConfigFiles for each platform.
The directories and contents are manually maintained on the administrative management host for the platform.
The directory structure mirrors the path of the monitored file. For example, the directory
/var/local/ConfigFiles/etc/hosts/
maps to /etc/hosts on the platform nodes.
Note: the files in these directories represent the actual files on the managed machines, for example:
/var/local/ConfigFiles/etc/hosts/hosts.mcgrew
represents the /etc/hosts file on mcgrew including contents, mode, and ownership.
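The mapping from a ConfigFiles entry to the file it represents can be sketched in Python. This is an illustration of the naming convention only, assuming the directory name is always the base name of the monitored file and everything after "<base>." is the extension; the function name is hypothetical.

```python
import posixpath

CONFIG_ROOT = "/var/local/ConfigFiles"  # root used in the examples above


def split_entry(entry_path, root=CONFIG_ROOT):
    """Return (target_path, extension) for a ConfigFiles entry."""
    rel = posixpath.relpath(entry_path, root)    # etc/hosts/hosts.mcgrew
    directory, filename = posixpath.split(rel)   # etc/hosts, hosts.mcgrew
    base = posixpath.basename(directory)         # hosts
    if not filename.startswith(base + "."):
        raise ValueError("not a per-node entry: " + entry_path)
    # Take the extension relative to the directory name, so base names
    # that themselves contain dots are handled.
    extension = filename[len(base) + 1:]
    return "/" + directory, extension
```

For the example above, split_entry("/var/local/ConfigFiles/etc/hosts/hosts.mcgrew") yields the target /etc/hosts and the extension mcgrew.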
The extension (in order of precedence) can be .hostname, .type,
.usage, .Maj, .Maj.Min, .other, or .template.
For example, if machines.list contained:

    #machine  type  status  usage    Maj  Min   Other
    #-------  ----  ------  -----    ---  ---   -----
    csmflyer  csm   AIX     admin    5.2  9.03  MS
    f2n1      p4    AIX     network  5.2  8csp  Frame2
    f3n1      p4    AIX     test     5.2  9.03  Frame3
    f4n1      p5    AIX     compute  5.3  4csp  Frame4

The file for node f2n1 under /var/local/ConfigFiles/etc/hosts/ might be (first match):

    hosts.f2n1      (machine)  specific to the node
    hosts.p4        (type)     for power4 systems
    hosts.network   (usage)    for network nodes
    hosts.5.2       (MajorOS)  for AIX 5.2 nodes
    hosts.5.2.8csp  (Maj.Min)  for AIX 5200-08-CSP nodes
    hosts.Frame2    (Other)    for nodes in Frame2
    hosts.template  (general)  for any node not matched above

When the above are files, the mode and ownership match what is on the node and the file contents should be identical. For platforms where the management host is a different architecture, pay attention to the numeric UID|GID for ownership as there may be variance in the alpha representation when <100. Note, ACLs are not supported by ARSC and are not specifically monitored. However, presence of an ACL will be detected and reported. ACLs are represented with a '+' after the 4-character octal mode.
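The first-match selection described above can be sketched in Python. The precedence order is the one stated in the text (.hostname, .type, .usage, .Maj, .Maj.Min, .other, .template); the function names and record keys are hypothetical and this is not the push implementation.

```python
def candidate_names(base, rec):
    """Candidate file names for one machine, highest precedence first."""
    extensions = [
        rec["machine"],                  # .hostname
        rec["type"],                     # .type
        rec["usage"],                    # .usage
        rec["maj"],                      # .Maj
        rec["maj"] + "." + rec["min"],   # .Maj.Min
        rec["other"],                    # .other
        "template",                      # .template
    ]
    return [base + "." + ext for ext in extensions]


def pick_file(base, rec, available):
    """Return the first candidate present in 'available', or None."""
    for name in candidate_names(base, rec):
        if name in available:
            return name
    return None
```

For node f2n1 in the example, the candidates are exactly the seven hosts.* names listed above, in that order. A result of None corresponds to a node with no appropriate file.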
In addition, the following special representations or files can exist within a ConfigFiles directory:
The README.ARSC files are typically short, indicating who made a change, when, and why, along with any pertinent history, and linking to more detailed information in a particular RT ticket or ARSC web document.
The List.txt file contains the list of managed files and is updated by the upd_*Files.ksh script, run either manually or from the nightly cron job via chk_sanity.ksh. The ConfigFiles are under the corresponding /var/local/*Files directory for each platform and are identified by the presence of a 'backout' directory in any sub-directory.
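The rule that a managed directory is identified by the presence of a 'backout' sub-directory can be sketched in Python. This is an illustration under that one rule, not a port of upd_*Files.ksh; the function names and the choice to list paths relative to the root are assumptions.

```python
import os


def managed_dirs(root):
    """Yield directories under 'root' that contain a 'backout' sub-directory."""
    for dirpath, dirnames, _filenames in os.walk(root):
        if "backout" in dirnames:
            dirnames.remove("backout")   # do not descend into backout copies
            yield dirpath


def managed_files(root):
    """List managed entries as paths relative to 'root', sorted."""
    found = []
    for dirpath in managed_dirs(root):
        for name in os.listdir(dirpath):
            full = os.path.join(dirpath, name)
            if name == "backout":
                continue
            # Keep files and symlinks; skip real sub-directories.
            if os.path.isdir(full) and not os.path.islink(full):
                continue
            found.append(os.path.relpath(full, root))
    return sorted(found)
```

Symbolic links are kept in the listing because ConfigFiles entries may themselves be links.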
The Long.txt file contains the long listing (mode, ownership, size, sum) of all files under the /var/local/*Files directory and is used by the chk_sanity.ksh script. It is updated by the upd_*Files.ksh script, run either manually or from the nightly cron job via chk_sanity.ksh.
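A long-listing record of the kind described above (mode, ownership, size, sum) can be sketched in Python. The actual layout of the listing and the sum algorithm used by the ARSC scripts are not given here; this sketch assumes a 4-character octal mode, numeric UID and GID, and the classic 16-bit BSD checksum. All function names are hypothetical.

```python
import os
import stat


def bsd_sum(data):
    """16-bit BSD checksum: rotate right by one bit, then add each byte."""
    checksum = 0
    for byte in data:
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


def long_entry(path):
    """Return (mode, uid, gid, size, sum) for one file, without following links."""
    st = os.lstat(path)
    mode = "%04o" % stat.S_IMODE(st.st_mode)
    if stat.S_ISREG(st.st_mode):
        with open(path, "rb") as handle:
            checksum = bsd_sum(handle.read())
    else:
        checksum = None   # no content sum for links, directories, ...
    return (mode, st.st_uid, st.st_gid, st.st_size, checksum)
```

Numeric IDs are used because the alphabetic owner and group names may differ between the management host and the managed nodes.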
Any time ConfigFiles are updated the following will generally be followed:
    admin: /usr/local/adm/bin/push config filename
  or
    admin: /usr/local/adm/bin/push one system1[,system2,...] filename
  or
    admin: /usr/local/adm/bin/push -m system1[,system2,...] config filename
The following fabricated ConfigFile demonstrates symlink translations, including:
    csmflyer: uals -y Mog; pwd
    d 0750 root     ibmman backout
    l 0777 kcarlson ibmman test.csmflyer -> Directory
    l 0777 kcarlson ibmman test.f1n1 -> xxx
    l 0777 kcarlson ibmman test.f1n3 -> ./xxx
    - 0640 kcarlson ibmman test.f2n1
    l 0777 kcarlson ibmman test.f4n1 -> nofile
    l 0777 kcarlson ibmman test.f4n2 -> ./NOFILE
    l 0777 kcarlson ibmman test.f4n3 -> yyy
    l 0777 kcarlson ibmman test.f4n4 -> ./yyy
    - 0640 kcarlson ibmman xxx
    /var/local/ConfigFiles/var/local/test

    csmflyer: dsh -av -n csmflyer \
        uals -dMog /var/local/test 2>&1 | sort
    csmflyer: d 0700 root     root   /var/local/test
    f1n1:     - 0640 kcarlson ibmman /var/local/test
    f1n3:     l 0777 root     root   /var/local/test -> ./xxx
    f2n1:     - 0640 kcarlson ibmman /var/local/test
    f3n1:     uals: /var/local/test: No such file or directory
    f4n1:     uals: /var/local/test: No such file or directory
    f4n2:     uals: /var/local/test: No such file or directory
    f4n3:     l 0777 root     root   /var/local/test -> yyy
    f4n4:     l 0777 root     root   /var/local/test -> ./yyy
    csmflyer: push config -q test
    push_rcp: 1 file(s)
    csmflyer:#symbolic link: /var/local/ConfigFiles/var/local/test/test.csmflyer
    f2n1:/usr/bin/rcp -p /var/local/ConfigFiles/var/local/test/test.f2n1 f2n1:/var/local/test:new
    f1n1:/usr/bin/rcp -p /var/local/ConfigFiles/var/local/test/xxx f1n1:/var/local/test:new
    f1n3:#symbolic link: /var/local/ConfigFiles/var/local/test/test.f1n3
    f3n1:#no appropriate /var/local/ConfigFiles/var/local/test/test.*
    f4n1:#linked NOFILE: /var/local/ConfigFiles/var/local/test/test.f4n1
    f4n2:#linked NOFILE: /var/local/ConfigFiles/var/local/test/test.f4n2
    f4n3:#symbolic link: /var/local/ConfigFiles/var/local/test/test.f4n3
    f4n4:#symbolic link: /var/local/ConfigFiles/var/local/test/test.f4n4
The following tools reside in /usr/local/adm/bin and are used to validate or manage system configuration:
    Usage: mkbko -options file1 [file2...]
      Make backout copy of file in [../]backout|old/*.YYYYMMDD[.HHMM]
    Options:
      -q*uiet    # very quiet, do not report anything
      -Q*uiet    # semi-quiet, report any copies made
      -i mask    # files to ignore via egrep, current: " _\.|~$"
      -s*udo     # re-invoke with sudo
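The backout naming pattern in the mkbko usage above (*.YYYYMMDD[.HHMM]) can be sketched in Python. This is not the mkbko implementation: it assumes the copy always goes into a 'backout' directory beside the file, and it assumes the optional .HHMM part is added only when a copy with the date-only stamp already exists. The function names are hypothetical.

```python
import os
import shutil
from datetime import datetime


def backout_path(path, now, taken):
    """Name the backout copy: backout/<file>.YYYYMMDD, or .YYYYMMDD.HHMM
    when a copy with the date-only stamp is already present in 'taken'."""
    directory, name = os.path.split(path)
    stamped = name + "." + now.strftime("%Y%m%d")
    if stamped in taken:
        stamped += "." + now.strftime("%H%M")
    return os.path.join(directory, "backout", stamped)


def make_backout(path, now=None):
    """Copy 'path' into its backout directory, preserving mode and times."""
    now = now or datetime.now()
    backout_dir = os.path.join(os.path.dirname(path), "backout")
    os.makedirs(backout_dir, exist_ok=True)
    target = backout_path(path, now, set(os.listdir(backout_dir)))
    shutil.copy2(path, target)
    return target
```

Taking the copy before editing a ConfigFile keeps the previous version available for backing out the change.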
    Usage: ckbko -options file1 [file2...]
      Check backout copy of file in [../]backout|old with *.YYYYMMDD[.HHMM] stamp.
    Options:
      -diff|-sdiff-s|-SDiff  # compare with diff, sdiff -s, or sdiff
      -q*uiet                # quieter (do not report Ok's)
      -[0-9]*                # compare against an older backup
      -i mask                # files to ignore via egrep, current: " _\.|~$"
      -s*udo                 # re-invoke with sudo
      -D*ebug                # debug option enabling set -x
    Usage: cksumnode [-options]
      Check file sums (bsd) between two systems
    Options:
      -d      compare directories,     Default: $PWD
      -1|-f   1st host to compare,     Default: piman
      -2|-o   other hosts to compare,  Default:
      -e      exclude filter,          Default: backout|old
      -c      use [r|s|kr]sh command,  Default: ssh -q
      -y      uals fields beyond sum,  Default:
      -r      do not use sum
      -s      use root (sudo)
      +s      not 'sdiff -s'
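The comparison that cksumnode performs, checking file sums between two systems with an exclude filter, can be sketched in Python once the per-host listings have been collected. This covers only the comparison step; gathering the sums over ssh is not shown, and the function name and the {path: sum} input format are assumptions.

```python
import re


def compare_sums(first, other, exclude=r"backout|old"):
    """Compare {path: sum} listings from two hosts.

    Returns sorted (path, first_sum, other_sum) tuples for every path whose
    sums differ or that exists on only one host (the missing side is None).
    Paths matching the 'exclude' regular expression are ignored."""
    skip = re.compile(exclude)
    differences = []
    for path in sorted(set(first) | set(other)):
        if skip.search(path):
            continue
        a, b = first.get(path), other.get(path)
        if a != b:
            differences.append((path, a, b))
    return differences
```

The default filter matches anywhere in the path, so an unrelated name containing "old" would also be skipped; a stricter pattern can be passed if that matters.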
The upd_*Files.ksh scripts regenerate the *Files/List.txt and *Files/Long.txt files. They can be invoked manually, and are also invoked from chk_sanity.ksh in the nightly cron job or any time all nodes on a platform are being checked.