The Volume Manager system, otherwise known as volman, services user requests for removable volumes (e.g. tapes). Typical requests are ``mount volume,'' ``unmount volume'' and ``move volume.'' The system overview is in the Volman External Reference Specification.
The libvol communication routines all operate through the Client Daemon (volcd) on the user's host, which establishes the client's UID and passes messages on to the main server (vold).
Some utilities which run as root make direct console connections to vold without going through volcd; an example is vshutdown, which is in the volman/srvr/ directory. The console interface is discussed further below.
All the Volume Manager daemons (including the Repository Controllers) run as root in order to make use of privileged sockets in communicating with one another; additionally, volcd and volnd need root to verify client UIDs and to access device nodes, respectively. All processes that connect to vold use the XDR(3M) protocol, which guarantees that transmitted data structures are valid across different architectures.
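As a hedged illustration of what the XDR guarantee buys, the sketch below encodes a hypothetical request structure into a buffer before it is written to a socket; the struct, its fields and the buffer handling are assumptions for illustration, not the actual volman packet layout.

```c
#include <rpc/rpc.h>

/* Illustrative request packet -- not the real volman layout. */
struct vm_packet {
    int  type;                  /* e.g. a hypothetical M_MOUNT */
    int  uid;                   /* client UID as established by volcd */
    char extlabel[32];          /* external volume label */
};

bool_t
xdr_vm_packet(XDR *xdrs, struct vm_packet *p)
{
    if (!xdr_int(xdrs, &p->type))
        return FALSE;
    if (!xdr_int(xdrs, &p->uid))
        return FALSE;
    /* fixed-length opaque keeps the label the same size on both ends */
    return xdr_opaque(xdrs, p->extlabel, sizeof p->extlabel);
}

/* Encode a packet into a buffer before writing it to the vold socket;
 * returns the number of encoded bytes, or -1 on failure. */
int
encode_packet(struct vm_packet *p, char *buf, unsigned buflen)
{
    XDR xdrs;

    xdrmem_create(&xdrs, buf, buflen, XDR_ENCODE);
    if (!xdr_vm_packet(&xdrs, p))
        return -1;
    return (int)xdr_getpos(&xdrs);
}
```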
The volcd (Client Daemon) accepts client connections and authenticates identity by performing a handshake in which it writes a random key to a file chowned to the client's claimed UID. Upon the first client request after the handshake, volcd informs vold of the new client and begins passing requests to vold and responses to the client. When the client terminates the connection, volcd informs vold. There is a volcd on every host supported by the system. (See Volcd in detail.)
The vold is the central decision maker and message router of the system. It authorizes access to volumes based on client UID (as established by volcd) and volume ownership and permissions contained in its database. It keeps a queue of volume requests, allocates volumes and drives (avoiding deadlock using the Banker's Algorithm), and forwards mount and move requests to the appropriate Repository Controller(s) according to the client request and the location of the volume per the database. When an RC needs to check whether a volume is mounted (vaultrc) or check an internal label and set up a user-accessible node (vaultrc, acsrc), vold forwards the request to the volnd on the client's system and forwards the response back to the RC. The vold can run on any host, including hosts not served by the volman system - it does not access the drives. Its administrative configuration file, VCONF, contains a list of hosts from which it will accept connections, which would include all hosts served (via volcd and volnd) as well as any other hosts on which RCs run. (See Vold in detail.)
The volnd (Node Daemon) handles all access to the drives' data paths, checking internal labels and setting up / deleting client-accessible device nodes (/dev/vol/xxxxxx). It accepts requests forwarded by vold from RCs (mount / node creation) and from the vold (node delete). There is a volnd on every host supported by the system. (See Volnd in detail.)
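The node setup volnd performs might look roughly like the following sketch, which creates a /dev/vol/ entry for a volume and restricts it to the requesting client; the function, permissions and parameters are illustrative rather than taken from the volnd source.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <unistd.h>

/* Create a client-accessible node such as /dev/vol/VOL001 pointing at the
 * drive's data path, readable and writable only by the requesting client. */
int
make_user_node(const char *extlabel, dev_t drive_dev, uid_t uid, gid_t gid)
{
    char path[64];

    snprintf(path, sizeof path, "/dev/vol/%s", extlabel);

    /* character-special node for the drive's data path */
    if (mknod(path, S_IFCHR | 0600, drive_dev) < 0) {
        perror(path);
        return -1;
    }
    /* only the requesting client may open the volume */
    if (chown(path, uid, gid) < 0) {
        unlink(path);
        return -1;
    }
    return 0;
}
```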
The vaultrc manages an operator-run vault. It sends messages to the operator via syslog(3) broadcasts to the console account (defined in the vold VCONF file) and receives error status from the operator via rcerr(1M). The vaultrc does not attempt to optimize channel usage in allocating drives. Its optimization is that it caches mounts by leaving `dismounted' volumes on the drives until a dismount is physically required for a new mount. This incurs no cost in time since manual drives can dismount immediately (this is done by volnd as the first stage of a vaultrc mount request), and saves the significant operator time required to relocate and remount a volume that would otherwise have been dismounted. (See Vaultrc in detail.)
Each acsrc process manages a StorageTek Automated Cartridge System (ACS) robot farm which is controlled by ACS Library Server (ACSLS) software that runs on a Sun server. A dedicated `ssi' process on the acsrc host passes messages between the acsrc and its Sun server. The database function is handled by the Sun server - to allocate a drive, an acsrc determines which silo a volume is in and searches `outward' from it in the silo numbering scheme for an empty drive, preferring drives on less-used channels. (See Acsrc in detail.)
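A hedged sketch of the `search outward' allocation described above; the silo and channel tables are invented for illustration and do not reflect the acsrc data structures.

```c
#define NSILO   8
#define NDRIVES 4                       /* drives per silo (illustrative) */

struct drive {
    int busy;                           /* currently assigned a mount */
    int channel;                        /* data channel the drive hangs off */
};

extern struct drive silo_drives[NSILO][NDRIVES];
extern int channel_load[16];            /* mounts outstanding per channel */

/* Find a free drive as close as possible to the volume's home silo,
 * preferring drives on less-used channels; returns -1 if none is free. */
int
pick_drive(int home_silo, int *silo_out, int *drive_out)
{
    int best = -1, best_silo = -1, best_load = 0;

    /* examine silos in order of distance from the volume's silo */
    for (int dist = 0; dist < NSILO && best < 0; dist++) {
        for (int dir = -1; dir <= 1; dir += 2) {
            int s = home_silo + dir * dist;
            if (s < 0 || s >= NSILO || (dist == 0 && dir == 1))
                continue;
            /* within this distance, prefer a drive on a lighter channel */
            for (int d = 0; d < NDRIVES; d++) {
                struct drive *dp = &silo_drives[s][d];
                if (dp->busy)
                    continue;
                if (best < 0 || channel_load[dp->channel] < best_load) {
                    best = d;
                    best_silo = s;
                    best_load = channel_load[dp->channel];
                }
            }
        }
    }
    if (best < 0)
        return -1;
    *silo_out = best_silo;
    *drive_out = best;
    return 0;
}
```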
The Volume Manager account name, along with server ports and various directories and file names used for installation and running, is set in the `config.csh' file in the top volman source directory. This file is sourced by the various Makefiles with a single argument such as `SPOOLDIR', and returns the appropriate string, which is assigned to a local variable and used in the current command line of the Makefile.
Client-volcd communication uses simple sockets, writing the packet data structures directly into the socket. This is possible because the client and volcd are on the same host, so the structures are identical for both processes (e.g. sizeof(int) is the same).
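A minimal sketch of this same-host exchange, assuming nothing about the real packet layout: the client writes the structure's bytes straight into the socket and volcd reads the identical structure on the other end.

```c
#include <sys/types.h>
#include <unistd.h>

/* Write a packet structure verbatim into the volcd socket.  Safe only
 * because client and volcd share one host and therefore one ABI. */
int
send_raw(int sock, const void *pkt, size_t len)
{
    const char *p = pkt;

    while (len > 0) {                   /* cope with short writes */
        ssize_t n = write(sock, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```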
Server-server communications are always between vold and the other servers - when the other servers need to communicate, they pass messages through vold. Since the servers can be on different hosts, XDR translation is used to guarantee that the packet data structures translate correctly. The connections between vold and the other servers are made via a root-only port (i.e. < 5000) to guarantee identity; the vold has a list of acceptable hosts in its administrative configuration file, VCONF.
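A hedged sketch of the identity check implied above: accept a server-server connection only if it arrives from a privileged source port on a host named in VCONF. The host-list representation is an assumption, and the reserved-port cutoff shown is the standard IPPORT_RESERVED rather than volman's own limit.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <strings.h>

extern char *vconf_hosts[];             /* NULL-terminated list from VCONF */

/* Accept a server-server connection only if it comes from a privileged
 * source port on a host named in the VCONF list. */
int
peer_acceptable(int sock)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof peer;
    struct hostent *hp;

    if (getpeername(sock, (struct sockaddr *)&peer, &len) < 0)
        return 0;
    if (ntohs(peer.sin_port) >= IPPORT_RESERVED)
        return 0;                       /* not a root-only source port */

    hp = gethostbyaddr((char *)&peer.sin_addr, sizeof peer.sin_addr, AF_INET);
    if (hp == NULL)
        return 0;                       /* unknown host: refuse */

    for (char **h = vconf_hosts; *h != NULL; h++)
        if (strcasecmp(hp->h_name, *h) == 0)
            return 1;
    return 0;
}
```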
All communication is via sockets. Some attempt has been made to allow relatively painless switching to another protocol; however, this mainly extends to client-volcd connections. Both client-volcd and server-server connections use Internet domain sockets for convenience; volcd does not check the host for its connections, so a client could be on any host as long as it could read the key file written on the volcd host.
General client-volcd routines are in lib/vm_sock.c, while client-specific ones are in lib/cl_sock.c. Volcd client socket routines are in srvr/cd_sock.c.
The general server-server routines are in srvr/rl_sock.c and rl_xdr.c; vold-specific ones are in rp_sock.c and rp_xdr.c, and routines for the other servers are in srvr_xdr.c. Some higher-level message-sending routines are in cd_msg.c and rp_msg.c.
Connections are established the same way in all cases: a server process at startup creates a socket using socket(2), uses bind(2) to associate it with a well-known (hardcoded) port, then does a listen(2) to register willingness to accept connections. Another process builds a socket using socket(2), then uses connect(2) to establish a connection with the server's port. The server detects connection attempts on the main port (along with messages on established sockets) using select(2), and uses accept(2) to create a unique socket for each connection.
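That sequence condenses to roughly the sketch below; the port number, backlog and error handling are assumptions.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

#define VM_PORT 6789                    /* hypothetical well-known port */

/* Server side: create, bind and listen on the well-known port. */
int
server_listen(void)
{
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;
    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(VM_PORT);
    if (bind(s, (struct sockaddr *)&sin, sizeof sin) < 0 || listen(s, 5) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* One pass of the server loop: select() on the listen socket plus all
 * established connections, accept()ing new ones onto unique sockets. */
int
one_pass(int lsock, fd_set *conns, fd_set *ready)
{
    *ready = *conns;
    FD_SET(lsock, ready);
    if (select(FD_SETSIZE, ready, NULL, NULL, NULL) < 0)
        return -1;
    if (FD_ISSET(lsock, ready)) {
        int ns = accept(lsock, NULL, NULL);
        if (ns >= 0)
            FD_SET(ns, conns);          /* remember the new connection */
        FD_CLR(lsock, ready);
    }
    return 0;                           /* fds left in `ready' have messages */
}
```

The connecting process is the usual socket(2)/connect(2) pair aimed at the server host's well-known port.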
The volcd keeps an array of per-client information which is indexed by the file descriptor of the connection. When volcd accepts a connection, it notes that there is an unknown client on that socket and waits for an identifying packet (M_HELLO) specifying the client UID. When this is received, volcd writes a random key to a randomly-named file in a special volman spool directory, chowns the file to the claimed UID, and sends the name of the file to the client. The client reads the key from the file and includes it in any future request. On receipt of the first request, volcd removes the key file and sends a copy of the client structure to the vold before forwarding the request.
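A hedged sketch of the key-file side of the handshake, assuming a hypothetical spool directory and key format:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

#define SPOOLDIR "/var/spool/volman"    /* hypothetical spool directory */

/* Create a randomly named key file readable only by the claimed UID and
 * return the key; the file name is what gets sent back to the client. */
int
issue_key(uid_t claimed_uid, long *key_out, char *path, size_t pathlen)
{
    char kbuf[32];
    long key;
    int fd;

    snprintf(path, pathlen, "%s/keyXXXXXX", SPOOLDIR);
    if ((fd = mkstemp(path)) < 0)       /* random name, created mode 0600 */
        return -1;

    srandom((unsigned)(time(NULL) ^ getpid()));
    key = random();
    snprintf(kbuf, sizeof kbuf, "%ld\n", key);
    if (write(fd, kbuf, strlen(kbuf)) < 0 ||
        fchmod(fd, 0400) < 0 ||
        fchown(fd, claimed_uid, (gid_t)-1) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    *key_out = key;                     /* compared against later requests */
    return 0;
}
```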
Messages from vold are forwarded to the appropriate client, or handled internally if they are for volcd itself (shutdown and starting a new logfile).
When a client connection is closed, volcd marks the client slot as free and informs vold.
For convenience in the current production configuration (Fall 1996), vold also forks a volcd, a volnd and the RCs; it logs SIGCLD from these processes but does not attempt to restart them.
Vold then goes into a loop on select() (in rp_sock.c), accepting connections and receiving messages from voltimer, volcd, volnd, the RCs, and console commands such as vshutdown(1M). When a connection is made, vold uses gethostbyaddr(3N) to get the host name and checks it against the list that was read in from the VCONF file. The first message received on a new connection must identify the type of connecting process. Subsequent messages are handled in sender-type-specific switch routines in rp_1.c, which also contains main() and `housekeeping' routines such as MShutdown(), which handles the M_SHUTDOWN packet sent by vshutdown. The switch routine for client requests calls packet-handling routines in the following categories:
Client mount and move requests are checked and placed in a queue for resources. Routines for manipulating the requests are in rp_vol.c. The structure used to track requests is also hash-queued on the volume's external label, which speeds up checks for resource availability. Request queue processing is handled in rp_q.c - requests are examined in order of receipt and commands are sent to RCs when resources are available. Vold keeps counts of available drives in each RC and sends only as many mounts as there are drives. A mount request to an RC contains a bitmap of varied-on drives of the volume's medium in the RC, and the RC uses this list to choose a specific drive, optimizing according to the nature of the repository.
In addition to checking drive availability, vold uses the Banker's Algorithm to check if requests using reservations for multiple resources could deadlock. Requests that are not part of reservations are put in `implicit' reservations, and all requests are also queued by reservation. Reservation-related routines are in rp_rid.c. The reservation functionality is described in the vreserve(1M) man page.
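A hedged sketch of the safety test the Banker's Algorithm implies, reduced to a single resource class (drives in one repository); rp_rid.c's actual bookkeeping and the multi-resource case will differ. The idea is that vold tentatively applies a grant and keeps it only if every active reservation could still obtain its maximum claim in some order.

```c
#define MAXRES 64                       /* illustrative table size */

struct resv {
    int in_use;                         /* slot holds an active reservation */
    int alloc;                          /* drives currently allocated to it */
    int claim;                          /* maximum drives it may ever hold */
};

/* Return 1 if, with `avail' free drives, every reservation can still run
 * to completion in some order (i.e. the current state is deadlock-safe). */
int
state_is_safe(struct resv res[MAXRES], int avail)
{
    int done[MAXRES] = {0};
    int progress = 1;

    while (progress) {
        progress = 0;
        for (int i = 0; i < MAXRES; i++) {
            if (!res[i].in_use || done[i])
                continue;
            /* this reservation's remaining need fits in what is free */
            if (res[i].claim - res[i].alloc <= avail) {
                avail += res[i].alloc;  /* it can finish and release drives */
                done[i] = 1;
                progress = 1;
            }
        }
    }
    for (int i = 0; i < MAXRES; i++)
        if (res[i].in_use && !done[i])
            return 0;                   /* someone could deadlock */
    return 1;
}
```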
In addition to resource-queued requests for volumes, vold handles various requests to read or write the database immediately, without queueing - for example allocating a volume, updating volume ownership or permissions, or recycling a volume, as described in volalloc(1), vchown(1), vchmod(1) and vrecycle(1). The vold also responds immediately to requests for status of mount/move requests, drives, reservations, and configuration.
The database consists of two tables: volume information and quotas. The tables are in fixed-field ASCII format, with each record terminated by a newline and a blank character between each field in a record. The records are padded so that none will cross a disk block boundary - this, in combination with synchronized disk access (open(2) using the O_SYNC flag), ensures that a record will not be partially written to disk in the event of a system crash. The database routines in rp_db.c translate the ASCII format to C structures and maintain B-tree indexes on various fields.
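A hedged sketch of such a block-aligned, synchronous record update; the 512-byte record size is an assumption chosen so that a record never straddles a disk block.

```c
#include <fcntl.h>
#include <unistd.h>

#define VOLREC_SIZE 512                 /* padded, newline-terminated record */

/* Rewrite one fixed-size record in place; O_SYNC means the write has
 * reached the disk before the call returns. */
int
update_record(const char *dbfile, long recno, const char rec[VOLREC_SIZE])
{
    int fd = open(dbfile, O_WRONLY | O_SYNC);
    if (fd < 0)
        return -1;

    /* records are fixed-size, so the offset is a simple multiple */
    if (pwrite(fd, rec, VOLREC_SIZE, recno * VOLREC_SIZE) != VOLREC_SIZE) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```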
The quota table is indexed by concatenated UID,RC. The volume table is indexed by volume external label (for most lookups), by owner UID (for vls(1) requests), by internal label concatenated with UID (used for guaranteeing that users do not have duplicate internal labels), and by status (lookup of scratch volumes). If a previous vold process did not close the indexes (required to flush blocks buffered by the B-tree package), they are rebuilt on startup in parallel by forked processes.
When a volnd starts up, it registers with the vold, then clears the /dev/vol/ directory of any user volume nodes left by a previous volnd process. It then waits on requests from vold, catching SIGCLD and getting status when the child it forks for each request terminates. The parent process and message-sending routines are in nd_1.c, and the child routines are in nd_2.c. The basic commands for which a child is forked are:
Volume checking involves testing `write ring' status, and also checking the internal label unless the client requested bypassing this step. `Test for mount' is invoked by the vaultrc and involves putting `MOUNT / XXXXXX' on the drive's billboard if the drive has one.
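The fork-per-request pattern described above might be sketched as follows; struct request, do_device_work() and note_request_done() are placeholders, not names from the volnd source.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct request;                             /* placeholder packet type */
extern void do_device_work(struct request *);
extern void note_request_done(pid_t, int);

/* Reap finished children and record their exit status for the parent. */
static void
reap_child(int sig)
{
    int status;
    pid_t pid;

    (void)sig;
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
        note_request_done(pid, WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}

/* Fork a child to do the (possibly slow) device work for one request,
 * leaving the parent free to keep servicing vold. */
int
handle_request(struct request *req)
{
    pid_t pid;

    signal(SIGCHLD, reap_child);            /* SIGCLD on older System V */

    if ((pid = fork()) == 0) {
        do_device_work(req);                /* check label, wait for mount, ... */
        _exit(0);
    }
    return pid < 0 ? -1 : 0;                /* parent returns to its loop */
}
```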
Communication with the barcode server is via RPC(3N), i.e. the response is immediately available as the result of a remote procedure call; this is known as a `stateless' system since there are no followup messages after an interval, unlike the protocol between the vold and the vaultrc. This simple protocol is adequate because the asynchronous action waited on by the vault is the mount of a volume, which is detected and reported by volnd. Thus the barcode system has no appreciable impact on the complexity of the vaultrc - vaultrc merely keeps the barcode server informed of requests, and the server provides the same info as the console broadcasts, plus volume location, both on its own console and on the barcode gun's LCD display.
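A hedged sketch of such a stateless exchange using the standard RPC library; the program, version and procedure numbers and the argument type are invented, not what vault.x defines.

```c
#include <rpc/rpc.h>
#include <sys/time.h>

#define BARCODE_PROG 0x20000099         /* hypothetical program number */
#define BARCODE_VERS 1
#define BARCODE_NOTE 1                  /* hypothetical procedure number */

/* Tell the barcode server about a volume request; the reply is simply the
 * return value of the call -- there is no follow-up message. */
int
notify_barcode(char *host, char *extlabel)
{
    CLIENT *cl;
    int result = 0;
    struct timeval tv = { 10, 0 };

    if ((cl = clnt_create(host, BARCODE_PROG, BARCODE_VERS, "tcp")) == NULL)
        return -1;

    if (clnt_call(cl, BARCODE_NOTE,
                  (xdrproc_t)xdr_wrapstring, (caddr_t)&extlabel,
                  (xdrproc_t)xdr_int, (caddr_t)&result, tv) != RPC_SUCCESS)
        result = -1;

    clnt_destroy(cl);
    return result;
}
```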
When the vaultrc starts up, it connects to the vold and gets the system configuration, using the RC id (provided as the only argument in starting the program) to look up its own configuration. If the barcode system is configured (as described at the beginning of vaultrc.c), vaultrc also connects to the barcode server. It checks if anyone is logged into the console account (complaining to the log if not), changes its process name to the title indicated in its configuration, and notifies vold that it is ready. It does not worry about what volumes may be currently on the drives; they will be automatically dismounted when the drives are assigned new mounts. Vaultrc then loops on receipt of packets from vold, which can be client requests for volumes, operator error reports, status from volnds, or commands like shutdown from vold.
When a drive is assigned a mount, a message is sent via vold to the volnd on the client's machine. If the volume has been left on a drive (cached) following a dismount, volnd is only requested to recheck the internal label and create the user-accessible device node, /dev/vol/XXXXXX. If it is a fresh mount, volnd is instructed to first loop on testing the device until the mount has occurred.
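The `loop on testing the device' step might be sketched as below; the assumption that open(2) fails until a volume is loaded, along with the poll interval, is illustrative.

```c
#include <fcntl.h>
#include <unistd.h>

/* Poll the drive until a volume is actually loaded; the open() succeeds
 * only once a tape is up. */
int
wait_for_mount(const char *drivepath)
{
    for (;;) {
        int fd = open(drivepath, O_RDONLY | O_NDELAY);
        if (fd >= 0)
            return fd;          /* volume is up; caller rechecks the label */
        sleep(10);              /* nothing mounted yet - try again shortly */
    }
}
```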
All the code is in vaultrc.c, except for barcode RPC calls which are in vault.x.
Author mail: Bill Ross