mhd(7i)
NAME
mhd - multihost disk control operations
SYNOPSIS
#include <sys/mhd.h>
DESCRIPTION
The mhd ioctl(2) requests control the access rights of a
multihost disk by using disk reservations on the disk device.
The stability level of this interface (see attributes(5)) is
evolving. As a result, the interface is subject to change
and you should limit your use of it.
The mhd ioctls fall into two major categories:
o ioctls for non-shared multihost disks
o ioctls for shared multihost disks
One ioctl, MHIOCENFAILFAST, is applicable to both non-shared
and shared multihost disks. It is described after the first
two categories.
All the ioctls require root privilege.
For all of the ioctls, the caller should obtain the file
descriptor for the device by calling open(2) with the
O_NDELAY flag. Without the O_NDELAY flag, the open may fail
because another host already holds a conflicting reservation
on the device. Some of the ioctls below permit the caller
to forcibly clear a conflicting reservation held by another
host; to call such an ioctl, however, the caller must first
obtain an open file descriptor.
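The open step described above can be sketched as follows. The
function name open_multihost_disk is hypothetical, and the
fallback from O_NDELAY to O_NONBLOCK is an assumption made only
so that the sketch compiles on systems that hide O_NDELAY; the
access mode O_RDWR is likewise an assumption, since the text
does not name one.

```c
#include <fcntl.h>

/* Assumption: where O_NDELAY is hidden, O_NONBLOCK is used instead. */
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif

/*
 * Open a multihost disk device with O_NDELAY so that the open does
 * not fail merely because another host holds a reservation.
 * Returns a file descriptor, or -1 with errno set by open(2).
 */
int
open_multihost_disk(const char *path)
{
	return (open(path, O_RDWR | O_NDELAY));
}
```

The descriptor returned is the fd argument passed to every
ioctl described in the rest of this page.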
Non-shared multihost disks
Non-shared multihost disk ioctls consist of MHIOCTKOWN,
MHIOCRELEASE, MHIOCSTATUS, and MHIOCQRESERVE. These ioctl
requests control the access rights of non-shared multihost
disks. A non-shared multihost disk is one that supports
serialized, mutually exclusive I/O mastery by the connected
hosts. This is in contrast to the shared-disk model, in
which concurrent access is allowed from more than one host
(see below).
A non-shared multihost disk can be in one of two states:
o Exclusive access state, where only one connected host
has I/O access
o Non-exclusive access state, where all connected hosts
have I/O access. An external hardware reset can cause
the disk to enter the non-exclusive access state.
Each multihost disk driver views the machine on which it's
running as the "local host"; each views all other machines
as "remote hosts". For each I/O or ioctl request, the
requesting host is the local host.
Note that the non-shared ioctls are designed to work with
SCSI-2 disks. The SCSI-2 RESERVE/RELEASE command set is the
underlying hardware facility in the device that supports the
non-shared ioctls.
The function prototypes for the non-shared ioctls are:
ioctl(fd, MHIOCTKOWN);
ioctl(fd, MHIOCRELEASE);
ioctl(fd, MHIOCSTATUS);
ioctl(fd, MHIOCQRESERVE);
MHIOCTKOWN
Forcefully acquires exclusive access rights to the
multihost disk for the local host. Revokes all access
rights to the multihost disk from remote hosts.
Causes the disk to enter the exclusive access state.
Implementation Note: Reservations (exclusive access
rights) broken via random resets should be reinstated
by the driver upon their detection, for example, in
the automatic probe function described below.
MHIOCRELEASE
Relinquishes exclusive access rights to the multihost
disk for the local host. On success, causes the disk
to enter the non-exclusive access state.
MHIOCSTATUS
Probes a multihost disk to determine whether the local
host has access rights to the disk. Returns 0 if the
local host has access to the disk, 1 if it doesn't,
and -1 with errno set to EIO if the probe failed for
some other reason.
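As an illustration of the three MHIOCSTATUS outcomes, the
following sketch maps the ioctl return value to a named state.
The enum and function names are hypothetical. The call to
ioctl itself is compiled only on Solaris, where <sys/mhd.h> is
assumed to be available.

```c
#if defined(__sun)
#include <sys/types.h>
#include <sys/mhd.h>
#include <unistd.h>
#endif

/* Hypothetical names for the outcomes described above. */
enum mhd_access {
	MHD_ACCESS_HELD,	/* ioctl returned 0 */
	MHD_ACCESS_LOST,	/* ioctl returned 1 */
	MHD_ACCESS_UNKNOWN	/* ioctl returned -1; check errno (EIO) */
};

/* Translate the return value of the MHIOCSTATUS ioctl. */
enum mhd_access
mhd_classify_status(int rc)
{
	if (rc == 0)
		return (MHD_ACCESS_HELD);
	if (rc == 1)
		return (MHD_ACCESS_LOST);
	return (MHD_ACCESS_UNKNOWN);
}

#if defined(__sun)
/* Solaris only: probe the disk behind an already open descriptor. */
enum mhd_access
mhd_probe(int fd)
{
	return (mhd_classify_status(ioctl(fd, MHIOCSTATUS)));
}
#endif
```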
MHIOCQRESERVE
Issues, simply and only, a SCSI-2 Reserve command. If
the attempt to reserve fails due to the SCSI error
Reservation Conflict (which implies that some other
host has the device reserved), then the ioctl will
return -1 with errno set to EACCES. The
MHIOCQRESERVE ioctl does NOT issue a bus device reset
or bus reset prior to attempting the SCSI-2 reserve
command. It also does not take care of re-instating
reservations that disappear due to bus resets or bus
device resets; if that behavior is desired, then the
caller can call MHIOCTKOWN after the MHIOCQRESERVE has
returned success. If the device does not support the
SCSI-2 Reserve command, then the ioctl returns -1
with errno set to ENOTSUP. The MHIOCQRESERVE ioctl is
intended to be used by high-availability or clustering
software for a "quorum" disk; hence the "Q" in the
name of the ioctl.
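The error cases of MHIOCQRESERVE can be told apart as in the
sketch below. The enum and function names are hypothetical.
The Solaris-only wrapper issues just the reserve; a caller that
also wants reservations reinstated after resets would follow a
successful result with MHIOCTKOWN, as described above.

```c
#include <errno.h>

#if defined(__sun)
#include <sys/types.h>
#include <sys/mhd.h>
#include <unistd.h>
#endif

/* Hypothetical names for the documented MHIOCQRESERVE results. */
enum qreserve_result {
	QRESERVE_OK,		/* reservation placed */
	QRESERVE_CONFLICT,	/* EACCES: another host holds it */
	QRESERVE_UNSUPPORTED,	/* ENOTSUP: no SCSI-2 Reserve support */
	QRESERVE_ERROR		/* any other failure */
};

/* Classify the ioctl return value and the errno it left behind. */
enum qreserve_result
qreserve_classify(int rc, int err)
{
	if (rc == 0)
		return (QRESERVE_OK);
	if (err == EACCES)
		return (QRESERVE_CONFLICT);
	if (err == ENOTSUP)
		return (QRESERVE_UNSUPPORTED);
	return (QRESERVE_ERROR);
}

#if defined(__sun)
/* Solaris only: issue the reserve without any preceding reset. */
enum qreserve_result
quorum_reserve(int fd)
{
	int rc = ioctl(fd, MHIOCQRESERVE);

	return (qreserve_classify(rc, rc == 0 ? 0 : errno));
}
#endif
```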
Shared Multihost Disks
Shared multihost disk ioctls control access to shared
multihost disks. The ioctls are merely a veneer on the SCSI-3
Persistent Reservation facility. Therefore, the underlying
semantic model is not described in detail here; see instead
the SCSI-3 standard. SCSI-3 Persistent Reservations support
the concept of a group of hosts all sharing access to a
disk.
The function prototypes and descriptions for the shared
multihost ioctls are as follows:
ioctl(fd, MHIOCGRP_INKEYS, (mhioc_inkeys_t *)k);
Issues the SCSI-3 command Persistent Reserve In Read
Keys to the device. On input, the caller should
initialize the field k->li, setting k->li.listsize to
the number of elements allocated for the k->li.list
array and k->li.listlen to 0. On return, the field
k->li.listlen is updated to indicate the number of
reservation keys the device currently has. If this
value is larger than k->li.listsize, the caller should
have passed a bigger k->li.list array with a bigger
k->li.listsize. The number of array elements actually
written by the callee into k->li.list is the minimum of
k->li.listlen and k->li.listsize. The field
k->generation is updated with the generation
information returned by the SCSI-3 Read Keys query. If
the device does not support SCSI-3 Persistent
Reservations, then this ioctl returns -1 with errno set
to ENOTSUP.
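The array bookkeeping described for MHIOCGRP_INKEYS reduces to
two small computations, sketched here. The function names are
hypothetical, and the sketch works on plain integers rather
than on the mhioc_inkeys_t structure, whose exact layout is not
given on this page.

```c
#include <stdint.h>

/*
 * Number of valid entries the driver wrote into the caller's array:
 * the minimum of the reported key count and the allocated size.
 */
uint32_t
inkeys_entries_written(uint32_t listlen, uint32_t listsize)
{
	return (listlen < listsize ? listlen : listsize);
}

/*
 * Nonzero when the device holds more keys than the array could take,
 * in which case the query should be repeated with an array of at
 * least listlen elements.
 */
int
inkeys_truncated(uint32_t listlen, uint32_t listsize)
{
	return (listlen > listsize);
}
```

A caller would typically loop, growing the array until
inkeys_truncated() reports zero, because the number of
registered keys can change between two queries.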
ioctl(fd, MHIOCGRP_INRESVS, (mhioc_inresvs_t *)r);
Issues the SCSI-3 command Persistent Reserve In Read
Reservations to the device. Remarks similar to
MHIOCGRP_INKEYS apply to the array manipulation. If
the device does not support SCSI-3 Persistent
Reservations, then this ioctl returns -1 with errno set
to ENOTSUP.
ioctl(fd, MHIOCGRP_REGISTER, (mhioc_register_t *)r);
Issues the SCSI-3 command Persistent Reserve Out
Register. The fields of structure r are all inputs;
none of the fields are modified by the ioctl. The
field r->aptpl should be set to true to specify that
registrations and reservations should persist across
device power failures, or to false to specify that
registrations and reservations should be cleared upon
device power failure; true is the recommended setting.
The field r->oldkey is the key that the caller
believes the device may already have for this host
initiator; if the caller believes that this host
initiator is not already registered with this device,
it should pass the special key of all zeros. To
achieve the effect of unregistering with the device,
the caller should pass its current key for the
r->oldkey field and an r->newkey field containing the
special key of all zeros. If the device returns the
SCSI error code Reservation Conflict, this ioctl
returns -1 with errno set to EACCES.
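The role of the all-zeros key in MHIOCGRP_REGISTER can be
illustrated with the sketch below, which decides what a given
pair of keys asks the device to do. All names are hypothetical.
The key length of 8 bytes is an assumption based on the SCSI-3
reservation key size, and the case where both keys are zero is
not covered by the text, so it is reported separately.

```c
#include <stddef.h>

/* Assumption: SCSI-3 reservation keys are 8 bytes long. */
#define	EXAMPLE_KEY_SIZE	8

/* Hypothetical names for what a register request expresses. */
enum register_intent {
	REG_NOTHING,	/* both keys zero: not described by the text */
	REG_NEW,	/* oldkey zero: host not yet registered */
	REG_REPLACE,	/* both nonzero: replace oldkey with newkey */
	REG_UNREGISTER	/* newkey zero: drop the registration */
};

/* Return 1 if every byte of the key is zero, otherwise 0. */
int
key_is_zero(const unsigned char *key)
{
	size_t i;

	for (i = 0; i < EXAMPLE_KEY_SIZE; i++) {
		if (key[i] != 0)
			return (0);
	}
	return (1);
}

/* Classify a request from its oldkey and newkey values. */
enum register_intent
classify_register(const unsigned char *oldkey, const unsigned char *newkey)
{
	int old_zero = key_is_zero(oldkey);
	int new_zero = key_is_zero(newkey);

	if (old_zero && new_zero)
		return (REG_NOTHING);
	if (old_zero)
		return (REG_NEW);
	if (new_zero)
		return (REG_UNREGISTER);
	return (REG_REPLACE);
}
```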
ioctl(fd, MHIOCGRP_RESERVE, (mhioc_resv_desc_t *)r);
Issues the SCSI-3 command Persistent Reserve Out
Reserve. The fields of structure r are all inputs;
none of the fields are modified by the ioctl. If the
device returns the SCSI error code Reservation
Conflict, this ioctl returns -1 with errno set to EACCES.
ioctl(fd, MHIOCGRP_PREEMPTANDABORT, (mhioc_preemptandabort_t *)r);
Issues the SCSI-3 command Persistent Reserve Out
Preempt-And-Abort. The fields of structure r are all
inputs; none of the fields are modified by the
ioctl. The key of the victim host is specified by the
field r->victim_key. The field r->resvdesc supplies
the preempter's key and the reservation that it is
requesting as part of the SCSI-3 Preempt-And-Abort
command. If the device returns the SCSI error code
Reservation Conflict, this ioctl returns -1 with errno
set to EACCES.
ioctl(fd, MHIOCGRP_PREEMPT, (mhioc_preemptandabort_t *)r);
Similar to MHIOCGRP_PREEMPTANDABORT, but instead
issues the SCSI-3 command Persistent Reserve Out
Preempt.
ioctl(fd, MHIOCGRP_CLEAR, (mhioc_resv_key_t *)r);
Issues the SCSI-3 command Persistent Reserve Out
Clear. The input parameter r is the reservation key of
the caller, which should already have been registered
with the device by an earlier call to
MHIOCGRP_REGISTER.
For each device, the non-shared ioctls should not be mixed
with the Persistent Reserve Out shared ioctls, and vice
versa. Otherwise, the underlying device is likely to return
errors, because SCSI does not permit SCSI-2 reservations to
be mixed with SCSI-3 reservations on a single device. It
is, however, legitimate to call the Persistent Reserve In
ioctls, because these are query only. Issuing the
MHIOCGRP_INKEYS ioctl is the recommended way for a caller
to determine if the device supports SCSI-3 Persistent
Reservations (the ioctl will return -1 with errno set to
ENOTSUP if the device does not).
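The support check recommended above can be expressed as a
small decision on the result of an MHIOCGRP_INKEYS call. This
is a sketch with hypothetical names; treating every failure
other than ENOTSUP as "undetermined" is an assumption, since
the text only defines the ENOTSUP case.

```c
#include <errno.h>

/* Hypothetical names for the outcome of the support check. */
enum pr_support {
	PR_SUPPORTED,		/* query succeeded: use the shared ioctls */
	PR_UNSUPPORTED,		/* ENOTSUP: fall back to the non-shared ioctls */
	PR_UNDETERMINED		/* some other failure, such as EIO */
};

/* Interpret the return value and errno of an MHIOCGRP_INKEYS call. */
enum pr_support
pr_support_from_inkeys(int rc, int err)
{
	if (rc == 0)
		return (PR_SUPPORTED);
	if (rc == -1 && err == ENOTSUP)
		return (PR_UNSUPPORTED);
	return (PR_UNDETERMINED);
}
```

Because the query does not create a reservation, running it on
a device that is managed with the non-shared ioctls does not
violate the rule against mixing the two families.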
MHIOCENFAILFAST Ioctl
The MHIOCENFAILFAST ioctl is applicable to both non-shared
and shared disks, and may be used with either the non-shared
or shared ioctls.
ioctl(fd, MHIOCENFAILFAST, (unsigned int *) millisecs);
Enables or disables the failfast option in the
multihost disk driver and enables or disables automatic
probing of a multihost disk, described below. The
argument is an unsigned integer specifying the number
of milliseconds to wait between executions of the
automatic probe function. An argument of zero
disables the failfast option and disables automatic
probing. If the MHIOCENFAILFAST ioctl is never called,
the effect is defined to be that both the failfast
option and automatic probing are disabled.
Automatic Probing
The MHIOCENFAILFAST ioctl sets up a timeout in the driver to
periodically schedule automatic probes of the disk. The
automatic probe function works in this manner: The driver is
scheduled to probe the multihost disk every n milliseconds,
rounded up to the next integral multiple of the system
clock's resolution. If
1. the local host no longer has access rights to the
multihost disk, and
2. access rights were expected to be held by the local
host,
the driver immediately panics the machine to comply with the
failfast model.
If the driver makes this discovery outside the timeout
function, especially during a read or write operation, it is
imperative that it panic the system then as well.
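The two rules of the automatic probe function, the rounding of
the interval and the panic condition, can be sketched as
follows. The function names are hypothetical, and tick_ms
stands in for the system clock's resolution in milliseconds. It
is assumed here that an interval which is already an exact
multiple of the resolution is left unchanged; overflow near
the top of the unsigned range is not handled.

```c
/*
 * Round the requested probe interval up to a multiple of the clock
 * resolution. A resolution of zero is treated as "no rounding".
 */
unsigned int
probe_interval_ms(unsigned int requested_ms, unsigned int tick_ms)
{
	unsigned int rem;

	if (tick_ms == 0)
		return (requested_ms);
	rem = requested_ms % tick_ms;
	if (rem == 0)
		return (requested_ms);
	return (requested_ms + (tick_ms - rem));
}

/*
 * Failfast condition: panic only when access has been lost AND the
 * local host expected to hold access rights.
 */
int
failfast_should_panic(int has_access, int access_expected)
{
	return (!has_access && access_expected);
}
```

Both conditions are needed for the panic: a host that never
took ownership of the disk is not expected to hold access
rights, so losing access does not trigger failfast there.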
RETURN VALUES
Each request returns -1 on failure and sets errno to
indicate the error.
EPERM Caller is not root.
EACCES
Access rights were denied.
EIO The multihost disk or controller was unable to
successfully complete the requested operation.
ENOTSUP
The multihost disk does not support the operation. For
example, it does not support the SCSI-2
Reserve/Release command set, or the SCSI-3 Persistent
Reservation command set.
ATTRIBUTES
See attributes(5) for a description of the following
attributes:
_____________________________________________________________
| ATTRIBUTE TYPE              | ATTRIBUTE VALUE             |
|_____________________________|_____________________________|
| Availability                | SUNWhea                     |
| Stability                   | Evolving                    |
|_____________________________|_____________________________|
SEE ALSO
ioctl(2), open(2), attributes(5)
NOTES
The ioctls for shared multihost disks and the MHIOCQRESERVE
ioctl are currently implemented only for SPARC and only for
the following disk device drivers: sd(7D), ssd(7D).