The Volume Manager system, otherwise known as volman, services user requests for removable volumes (e.g. tapes). Typical requests are ``mount volume,'' ``unmount volume'' and ``move volume.'' The system overview is in the Volume Manager External Reference Specification.
The libvol communication routines all pass messages through the Client Daemon (volcd) on the user's host, which confirms the client's UID and passes messages on to the main server (vold).
Some utilities which require root to run make direct, console connections to vold without going through volcd; these are in the volman/srvr/ directory. An example is vshutdown. The console interface is discussed further below.
All the Volume Manager daemons (including the Repository Controllers) run as root. Originally this was thought to provide access to privileged sockets that would guarantee root identity in communicating with one another; however, it turns out that no such privilege exists, and other means would be necessary to guarantee identity. Additionally, volnd needs root to access device nodes. All processes that connect with vold use the XDR(3M) protocol (over sockets), which guarantees that transmitted data structures are valid on different architectures.
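As a rough illustration of that XDR step, the sketch below encodes a small request packet into a buffer before it would be written to a vold socket. The struct, its fields, and the helper name are hypothetical, not the actual volman packet layout.

    /* Hypothetical sketch: XDR-encode a request packet before sending it
     * to vold.  The struct and fields are illustrative only. */
    #include <rpc/rpc.h>     /* XDR, xdrmem_create(), xdr_int(), xdr_string() */

    struct vm_packet {       /* hypothetical request packet */
        int  type;           /* e.g. a mount request code */
        int  uid;            /* client UID as established by volcd */
        char label[7];       /* external volume label, e.g. "XXXXXX" */
    };

    /* Encode pkt into buf; return the encoded length, or -1 on failure.
     * The receiver runs the same routine with XDR_DECODE, so the bytes
     * on the wire are architecture-independent. */
    int vm_encode(struct vm_packet *pkt, char *buf, unsigned buflen)
    {
        XDR xdrs;
        char *lp = pkt->label;

        xdrmem_create(&xdrs, buf, buflen, XDR_ENCODE);
        if (!xdr_int(&xdrs, &pkt->type) ||
            !xdr_int(&xdrs, &pkt->uid)  ||
            !xdr_string(&xdrs, &lp, sizeof(pkt->label)))
            return -1;
        return (int)xdr_getpos(&xdrs);
    }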
The volcd (Client Daemon) accepts client connections and authenticates identity with a handshake in which a random key is written to a file chowned to the client's claimed UID. Upon the first client request after the handshake, volcd informs vold of the new client and begins passing requests to vold and responses to the client. When the client terminates the connection, volcd informs vold. There is a volcd on every host supported by the system. (See Volcd in detail.)
The vold is the central decision maker and message router of the system. It authorizes access to volumes based on client UID (as established by volcd) and volume ownership and permissions contained in its database. It keeps a queue of volume requests, allocates volumes and drives (avoiding deadlock using the Banker's Algorithm), and forwards mount and move requests to the appropriate Repository Controller(s) according to the client request and the location of the volume per the database. When an RC needs to check whether a volume is mounted (vaultrc) or check an internal label and set up a user-accessible node (vaultrc, chrc, acsrc), vold forwards the request to the volnd on the drive's system and forwards the response back to the RC. The vold can run on any host, including hosts not served by the volman system - it does not access the drives. Its administrative configuration file, /usr/mss/etc/rc.volman, contains a list of hosts from which it will accept connections, which would include all hosts served (via volcd and volnd) as well as any other hosts on which RCs run. (See Vold in detail.)
The volnd (Node Daemon) handles all volman system access to the drives' data paths, checking internal labels, setting up / deleting client-accessible device nodes (/dev/vol/xxxxxx), and scratching volumes (erasing and writing new internal labels). It accepts requests forwarded by vold from RCs (mount / node creation) and from the vold (node delete). There is a volnd on every host supported by the system. (See Volnd in detail.)
The vaultrc manages an operator-run vault. It sends messages to the operator via syslog(3) broadcasts to the console account (defined in the vold /usr/mss/etc/rc.volman file) and receives error status from the operator via rcerr(1M). The vaultrc does not attempt to optimize channel usage in allocating drives. Its optimization is that it caches mounts by leaving `dismounted' volumes on the drives until a dismount is physically required for a new mount. This incurs no cost in time since manual drives can dismount immediately (this is done by volnd as the first stage of a vaultrc mount request), and saves the significant operator time required to relocate and remount a volume that would otherwise have been dismounted. (See Vaultrc in detail.)
Each chrc process manages a 'changer' robot, currently lower-cost devices with a single volume-moving mechanism that can be reasonably driven in synchronous mode. The chrc process runs on a host connected to the robot via a SCSI bus. (See Chrc in detail.)
Each acsrc process manages a StorageTek Automated Cartridge System (ACS) robot farm which is controlled by ACS Library Server (ACSLS) software that runs on a Sun server. A dedicated `ssi' process on the acsrc host passes messages between the acsrc and its Sun server. The database function is handled by the Sun server - to allocate a drive, an acsrc determines which silo a volume is in and searches 'outward' from it in the silo numbering scheme for an empty drive, preferring drives on less-used channels. (See Acsrc in detail.)
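A minimal sketch of that 'outward' search, under assumed data structures; the real acsrc gets silo and drive state from the ACSLS server, and the names and sizes here are illustrative.

    /* Search silos at increasing distance from the volume's silo; within
     * each candidate silo prefer an empty drive on the least-used channel. */
    #define NSILOS   8          /* illustrative sizes */
    #define NDRIVES  4

    struct drive {
        int empty;              /* 1 if no volume mounted */
        int channel_load;       /* current mounts on this drive's channel */
    };

    /* Return a drive index into drives[silo][], or -1 if none is free. */
    int pick_drive(struct drive drives[NSILOS][NDRIVES], int vol_silo,
                   int *chosen_silo)
    {
        int dist, side, silo, d, best;

        for (dist = 0; dist < NSILOS; dist++) {
            for (side = 0; side < (dist ? 2 : 1); side++) {
                silo = vol_silo + (side ? -dist : dist);
                if (silo < 0 || silo >= NSILOS)
                    continue;
                best = -1;
                for (d = 0; d < NDRIVES; d++)
                    if (drives[silo][d].empty &&
                        (best < 0 || drives[silo][d].channel_load <
                                     drives[silo][best].channel_load))
                        best = d;
                if (best >= 0) {
                    *chosen_silo = silo;
                    return best;
                }
            }
        }
        return -1;              /* no empty drive anywhere */
    }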
The Volume Manager account name, along with server ports and various directories and file names used for installation and running, is set in the src/volman/Mkconfig.volman file. This file is included by the various Makefiles.
Client-volcd communication, on the other hand, uses simple socket writes without XDR: the packet data structures are written directly into the socket. This is possible because client and volcd are on the same host, so the structures have identical layouts in both processes (e.g. sizeof(int) is the same).
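A sketch of that direct-structure style, using the same hypothetical vm_packet layout as the XDR sketch above (redeclared here so the fragment stands alone):

    #include <sys/types.h>
    #include <unistd.h>

    struct vm_packet { int type; int uid; char label[7]; };   /* hypothetical */

    /* Client and volcd run on the same host and are built with the same
     * structure layout, so the in-memory bytes can be written and read
     * as-is -- the reason no XDR step is needed on this path. */
    int vm_send_raw(int sock, const struct vm_packet *pkt)
    {
        return write(sock, pkt, sizeof(*pkt)) == (ssize_t)sizeof(*pkt) ? 0 : -1;
    }

    int vm_recv_raw(int sock, struct vm_packet *pkt)
    {
        return read(sock, pkt, sizeof(*pkt)) == (ssize_t)sizeof(*pkt) ? 0 : -1;
    }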
Server-server communications are always between vold and the other servers - when the non-vold servers need to communicate, they pass messages through vold. Vold has a list of acceptable server hosts in its administrative configuration file, /usr/mss/etc/rc.volman.
All communication is via sockets. Some attempt has been made to allow relatively painless switching to another protocol; however, this mainly extends to client-volcd connections. Both client-volcd and server-server connections use Internet domain sockets for convenience; volcd does not check the host for its connections, so a client could be on any host as long as it could read the key file written on the volcd host.
General client-volcd routines are in src/volman/lib/vm_sock.c, while client-specific ones are in src/volman/lib/cl_sock.c.
The general server-server routines are in src/volman/srvr/rl_sock.c and rl_xdr.c; vold-specific ones are in rp_sock.c and rp_xdr.c, and routines for the other servers are in srv_xdr.c. Some higher-level message-sending routines are in cd_msg.c (volcd) and rp_msg.c (vold).
Connections are established the same way in all cases: a server process at startup creates a socket using socket(2), uses bind(2) to associate it with a well-known (hardcoded) port, then does a listen(2) to register willingness to accept connections. Another process builds a socket using socket(2), then uses connect(2) to establish a connection with the server's port. The server detects the main port connection (along with messages on regular sockets) using select(2), and uses accept(2) to create a unique socket for the connection.
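The sketch below shows this pattern under simplified assumptions (a hypothetical port number, minimal error handling): the server builds its listening socket at startup, a peer connects to the well-known port, and the server multiplexes the main port and accepted connections with select(2).

    #include <sys/types.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <string.h>
    #include <unistd.h>

    #define SRVR_PORT 5555                 /* hypothetical well-known port */

    /* Server startup: socket(2), bind(2) to the well-known port, listen(2). */
    int make_listener(void)
    {
        struct sockaddr_in addr;
        int s = socket(AF_INET, SOCK_STREAM, 0);

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(SRVR_PORT);
        if (s < 0 || bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(s, 5) < 0)
            return -1;
        return s;
    }

    /* Client (or peer server) side: connect to the well-known port. */
    int connect_to(struct in_addr host)
    {
        struct sockaddr_in addr;
        int s = socket(AF_INET, SOCK_STREAM, 0);

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr = host;
        addr.sin_port = htons(SRVR_PORT);
        if (s < 0 || connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0)
            return -1;
        return s;
    }

    extern void handle_message(int fd);    /* per-connection message handler */

    /* Wait on the main port and all accepted connections with select(2);
     * accept(2) creates a unique socket for each new connection. */
    void serve(int listen_fd)
    {
        fd_set active, ready;
        int fd;

        FD_ZERO(&active);
        FD_SET(listen_fd, &active);
        for (;;) {
            ready = active;
            if (select(FD_SETSIZE, &ready, NULL, NULL, NULL) < 0)
                continue;
            for (fd = 0; fd < FD_SETSIZE; fd++) {
                if (!FD_ISSET(fd, &ready))
                    continue;
                if (fd == listen_fd) {
                    int nfd = accept(listen_fd, NULL, NULL);
                    if (nfd >= 0)
                        FD_SET(nfd, &active);
                } else {
                    handle_message(fd);
                }
            }
        }
    }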
The volcd keeps an array of per-client information which is indexed by the file descriptor of the connection. When volcd accepts a connection, it notes that there is an unknown client on that socket and waits for an identifying packet (M_HELLO) specifying the client UID. When this is received, volcd writes a random key to a randomly-named file in a special volman spool directory, chowns the file to the claimed UID, and sends the name of the file to the client. The client reads the key from the file and includes it in any future request. On receipt of the first request, volcd removes the key file and sends a copy of the client structure to the vold before forwarding the request.
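A rough sketch of the volcd side of that handshake; the spool directory, file naming, and helper are hypothetical, but the point is that only a process actually running as the claimed UID (or root) can read the key back.

    #include <sys/types.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>

    #define SPOOL_DIR "/usr/mss/spool/volman"   /* hypothetical spool path */

    /* On M_HELLO: write a random key to a randomly-named file, chown it to
     * the claimed UID, and return the file name (which is sent back to the
     * client).  When the key comes back in the client's next request, the
     * claimed identity has been proven. */
    int make_key_file(uid_t claimed_uid, long *key_out, char *path, size_t plen)
    {
        int fd;

        snprintf(path, plen, "%s/key.%ld.%d", SPOOL_DIR, random(), (int)getpid());
        fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0400);
        if (fd < 0)
            return -1;
        *key_out = random();                        /* the secret */
        if (write(fd, key_out, sizeof(*key_out)) != sizeof(*key_out) ||
            fchown(fd, claimed_uid, (gid_t)-1) < 0) {
            close(fd);
            unlink(path);
            return -1;
        }
        close(fd);
        return 0;
    }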
Messages from vold are forwarded to the appropriate client, or handled internally if they are for volcd itself (shutdown and starting a new logfile).
When a client connection is closed, volcd marks the client slot as free and informs vold.
Vold also forks a volcd and any RCs that have a PATH defined in /usr/mss/etc/rc.volman. It logs SIGCLD from these processes but does not attempt to restart them.
Vold then goes into a loop on select() (in rp_sock.c), accepting connections and receiving messages from voltimer, volcd, volnd, and the RCs, as well as console commands such as vshutdown(1M). When a connection is made, vold uses gethostbyaddr(3N) to get the host name and checks it against the list that was read in from the /usr/mss/etc/rc.volman file. The first message received on a new connection must identify the type of connecting process. Subsequent messages are handled in sender-type-specific switch routines in rp_1.c, which also contains main() and `housekeeping' routines such as MShutdown(), which handles the M_SHUTDOWN packet sent by vshutdown. The switch routine for client requests calls packet-handling routines in the following categories:
Client mount and move requests are checked and placed in a queue for resources. Routines for manipulating the requests are in rp_vol.c. Each request is also queued in several other ways, including hash-queued on the volume's external label, which speeds up checks for resource availability. Request queue processing is handled in rp_q.c - requests are examined in order of receipt and commands are sent to RCs when resources are available. Vold keeps counts of available drives in each RC and sends only as many mounts as there are drives. A mount request to an RC contains a bitmap of varied-on drives of the volume's medium in the RC, and the RC uses this list to choose a specific drive to optimize according to the nature of the repository.
In addition to checking drive availability, vold uses the Banker's Algorithm to check if requests using reservations for multiple resources could deadlock. Requests that are not part of reservations are put in `implicit' reservations, and all requests are also queued by reservation. Reservation-related routines are in rp_rid.c. The reservation functionality is described in the vreserve(1M) man page.
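A minimal sketch of the Banker's Algorithm safety test for a single drive type, with simplified bookkeeping; vold's real version, tied into the reservation queues, lives in rp_rid.c and rp_q.c.

    #define MAXRES 64

    struct resv {
        int allocated;       /* drives currently held by this reservation */
        int maximum;         /* drives the reservation may ultimately claim */
        int done;            /* scratch flag used by the safety test */
    };

    /* Return 1 if, with `avail' free drives, every reservation can still be
     * driven to completion in some order (so the current state cannot
     * deadlock); 0 otherwise. */
    int state_is_safe(struct resv r[], int nres, int avail)
    {
        int i, progressed, finished = 0;

        for (i = 0; i < nres; i++)
            r[i].done = 0;
        do {
            progressed = 0;
            for (i = 0; i < nres; i++) {
                if (r[i].done)
                    continue;
                /* this reservation's remaining need fits in what is free */
                if (r[i].maximum - r[i].allocated <= avail) {
                    avail += r[i].allocated;  /* assume it finishes, releases */
                    r[i].done = 1;
                    finished++;
                    progressed = 1;
                }
            }
        } while (progressed);
        return finished == nres;
    }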
In addition to resource-queued requests for volumes, vold handles various requests to read/write the database that are handled immediately and not queued, such as allocating a volume, updating volume ownership or permissions, or recycling a volume, as described in volalloc(1), vchown(1), vchmod(1) and vrecycle(1). The vold also responds immediately to requests for status of mount/move requests, drives, reservations, and configuration.
The database consists of two tables: volume information and quotas. The tables are in fixed-field ASCII format, with each record terminated by a newline and fields within a record separated by a blank character. The records are padded so that none will cross a disk block boundary - this, in combination with synchronized disk access (open(2) using the O_SYNC flag), ensures that a record will not be partially written to disk in the event of a system crash. The database routines in rp_db.c translate the ASCII format to C structures and maintain B-tree indexes on various fields.
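The sketch below illustrates that record style with an assumed record size and field layout (not the actual volman table format): each record is padded to a size that evenly divides the disk block size and is written through an O_SYNC descriptor.

    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdio.h>

    #define BLKSIZE 512
    #define RECSIZE 128      /* divides BLKSIZE, so no record straddles a
                                disk block boundary */

    /* Write record `recno' as fixed-width ASCII fields, space-padded to
     * RECSIZE and ending in a newline, through an O_SYNC descriptor so
     * the record is on disk (whole) before we return. */
    int put_record(const char *dbfile, long recno,
                   const char *label, int uid, const char *status)
    {
        char rec[RECSIZE];
        int fd, n;

        memset(rec, ' ', sizeof(rec));
        n = snprintf(rec, sizeof(rec), "%-6s %6d %-10s", label, uid, status);
        if (n < 0 || n >= RECSIZE - 1)
            return -1;
        rec[n] = ' ';                    /* undo snprintf's NUL, keep padding */
        rec[RECSIZE - 1] = '\n';

        fd = open(dbfile, O_WRONLY | O_CREAT | O_SYNC, 0600);
        if (fd < 0)
            return -1;
        if (lseek(fd, recno * RECSIZE, SEEK_SET) < 0 ||
            write(fd, rec, RECSIZE) != RECSIZE) {
            close(fd);
            return -1;
        }
        return close(fd);
    }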
The quota table is indexed by concatenated UID,RC. The volume table is indexed by volume external label (for most lookups), by owner UID (for vls(1) requests), by internal label concatenated with UID (used for guaranteeing that users do not have duplicate internal labels), and by status (lookup of scratch volumes). If a previous vold process did not close the indexes (required to flush blocks buffered by the B-tree package), they are rebuilt on startup in parallel by forked processes.
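A minimal sketch of the parallel rebuild, assuming a hypothetical rebuild_index() helper and an illustrative index count; the real work is done against the B-tree package in rp_db.c.

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define NINDEXES 4     /* illustrative: external label, owner UID,
                              internal label + UID, status */

    extern void rebuild_index(int which);   /* hypothetical: rebuild one index */

    /* Fork one child per index so the rebuilds run in parallel, then wait
     * for all of them before accepting requests. */
    void rebuild_all_indexes(void)
    {
        int i, status;

        for (i = 0; i < NINDEXES; i++) {
            if (fork() == 0) {              /* child: rebuild one index */
                rebuild_index(i);
                _exit(0);
            }
        }
        while (wait(&status) > 0)           /* parent: wait for every child */
            ;
    }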
When a volnd starts up, it registers with the vold, then clears the /dev/vol/ directory of any user volume nodes left by a previous volnd process. It then waits on requests from vold, catching SIGCLD and getting status when the child it forks for each request terminates. The parent process and message-sending routines are in nd_1.c, and the child routines are in nd_2.c. This file in turn includes either nd_2a.c (for hosts on which a device will not open until it's ready) or nd_2b.c (for hosts on which a device can be opened then polled to see if it's ready).
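A rough sketch of that fork-per-request structure, with hypothetical names (the real parent logic is in nd_1.c and the child work in nd_2.c):

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <signal.h>
    #include <unistd.h>

    struct nd_request { int type; int drive; };             /* hypothetical */

    extern void do_request(const struct nd_request *req);   /* child side */
    extern void report_status(pid_t pid, int status);       /* reply to vold */

    /* SIGCLD/SIGCHLD handler: collect each finished child and report it. */
    static void reap(int sig)
    {
        int status;
        pid_t pid;

        (void)sig;
        while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
            report_status(pid, status);
    }

    void handle_request(const struct nd_request *req)
    {
        signal(SIGCHLD, reap);          /* SIGCLD on older systems */
        if (fork() == 0) {              /* child: do the drive work */
            do_request(req);
            _exit(0);
        }
        /* parent returns at once and goes back to waiting on vold */
    }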
The basic commands for which a volnd child is forked are: checking a volume and creating its client-accessible device node (mount), deleting a node (dismount), scratching a volume, and testing a drive for a mount.
Volume checking involves testing `write ring' status, and also checking the internal label unless the client requested bypassing this step. `Test for mount' is invoked by the vaultrc and involves putting `MOUNT / XXXXXX' on the drive's billboard if the drive has one.
Communication with the barcode server is via RPC(3N), i.e. the response is immediately available as the result of a remote procedure call; this is known as a `stateless' system since there are no followup messages after an interval, unlike the protocol between the vold and the vaultrc. This simple protocol is adequate because the asynchronous action waited on by the vault is the mount of a volume, which is detected and reported by volnd. Thus the barcode system has no appreciable impact on the complexity of the vaultrc - vaultrc merely keeps the barcode server informed of requests, and the server provides the same info as the console broadcasts, plus volume location, both on its own console and on the barcode gun's LCD display.
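For illustration, a stateless notification of this kind might look like the sketch below; the program, version, and procedure numbers and the argument type are hypothetical, since the real definitions are generated from vault.x.

    #include <rpc/rpc.h>

    #define BARCODE_PROG   0x20000099   /* hypothetical program number */
    #define BARCODE_VERS   1
    #define BARCODE_NOTIFY 1            /* hypothetical procedure number */

    /* Tell the barcode server about a pending mount; the reply comes back
     * as the result of the call itself -- no later status message. */
    int notify_barcode(const char *host, char *label)
    {
        CLIENT *clnt;
        int result = 0;
        struct timeval tmo = { 10, 0 };

        clnt = clnt_create(host, BARCODE_PROG, BARCODE_VERS, "tcp");
        if (clnt == NULL)
            return -1;
        if (clnt_call(clnt, BARCODE_NOTIFY,
                      (xdrproc_t)xdr_wrapstring, (caddr_t)&label,
                      (xdrproc_t)xdr_int, (caddr_t)&result,
                      tmo) != RPC_SUCCESS) {
            clnt_destroy(clnt);
            return -1;
        }
        clnt_destroy(clnt);
        return result;
    }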
When the vaultrc starts up, it connects to the vold and gets the system configuration, using the RC id (provided as the only required argument in starting the program) to look up its own configuration. If the barcode system is configured (as described at the beginning of vaultrc.c), vaultrc also connects to the barcode server. It checks if anyone is logged into the console account (complaining to the log if not), and notifies vold that it is ready. It does not worry about what volumes may be currently on the drives; they will be automatically dismounted when the drives are assigned new mounts.
(Note: tape media should in fact not be left mounted and idle indefinitely, because tape can lose tension, and, in the case of helical scan technology, tape and tape head can wear out since the head spins constantly in contact with the tape.)
After initialization, vaultrc then loops on receipt of packets from vold, which can be client requests for volumes, operator error reports, status from volnds, or commands like shutdown from vold.
When a drive is assigned a mount, a message is sent via vold to the volnd on the drive's machine. If the volume has been left on a drive (cached) following a dismount, volnd is only requested to recheck the internal label and create the user-accessible device node, /dev/vol/XXXXXX. If it is a fresh mount, volnd is instructed to first loop on testing the device until the mount has occurred.
All the code is in vaultrc.c, except for barcode RPC calls which are in vault.x.
When a mount has occurred, chrc sends a message (via vold) to the volnd on the machine the drive is attached to. When volnd has checked write ring status and label, a message is forwarded back and chrc informs vold that the mount has completed.
Author mail: Bill Ross