NAME
disk,
disk_init,
disk_attach,
disk_begindetach,
disk_detach,
disk_destroy,
disk_wait,
disk_busy,
disk_unbusy,
disk_isbusy,
disk_find,
disk_set_info —
generic disk framework
SYNOPSIS
#include <sys/types.h>
#include <sys/disklabel.h>
#include <sys/disk.h>
void
disk_init(
struct
disk *,
const char
*name,
const struct
dkdriver *driver);
void
disk_attach(
struct
disk *);
int
disk_begindetach(
struct
disk *,
int
(*lastclose)(device_t),
device_t self,
int flags);
void
disk_detach(
struct
disk *);
void
disk_destroy(
struct
disk *);
void
disk_wait(
struct
disk *);
void
disk_busy(
struct
disk *);
void
disk_unbusy(
struct
disk *,
long bcount,
int read);
bool
disk_isbusy(
struct
disk *);
struct disk *
disk_find(
const
char *);
void
disk_set_info(
device_t,
struct disk *,
const char *type);
DESCRIPTION
The
NetBSD generic disk framework is designed to provide
flexible, scalable, and consistent handling of disk state and metrics
information. The fundamental component of this framework is the
disk structure, which is defined as follows:
struct disk {
TAILQ_ENTRY(disk) dk_link; /* link in global disklist */
const char *dk_name; /* disk name */
prop_dictionary_t dk_info; /* reference to disk-info dictionary */
int dk_bopenmask; /* block devices open */
int dk_copenmask; /* character devices open */
int dk_openmask; /* composite (bopen|copen) */
int dk_state; /* label state ### */
int dk_blkshift; /* shift to convert DEV_BSIZE to blks */
int dk_byteshift; /* shift to convert bytes to blks */
/*
* Metrics data; note that some metrics may have no meaning
* on certain types of disks.
*/
struct io_stats *dk_stats;
const struct dkdriver *dk_driver; /* pointer to driver */
/*
* Information required to be the parent of a disk wedge.
*/
kmutex_t dk_rawlock; /* lock on these fields */
u_int dk_rawopens; /* # of opens of rawvp */
struct vnode *dk_rawvp; /* vnode for the RAW_PART bdev */
kmutex_t dk_openlock; /* lock on these and openmask */
u_int dk_nwedges; /* # of configured wedges */
/* all wedges on this disk */
LIST_HEAD(, dkwedge_softc) dk_wedges;
/*
* Disk label information. Storage for the in-core disk label
* must be dynamically allocated, otherwise the size of this
* structure becomes machine-dependent.
*/
daddr_t dk_labelsector; /* sector containing label */
struct disklabel *dk_label; /* label */
struct cpu_disklabel *dk_cpulabel;
};
The system maintains a global linked-list of all disks attached to the system.
This list, called
disklist, may grow or shrink over time as
disks are dynamically added and removed from the system. Drivers which
currently make use of the detachment capability of the framework are the
ccd,
dm, and
vnd
pseudo-device drivers.
The following is a brief description of each function in the framework:
-
-
- disk_init()
- Initialize the disk structure.
-
-
- disk_attach()
- Attach a disk; allocate storage for the disklabel, set the
“attached time” timestamp, insert the disk into the disklist,
and increment the system disk count.
-
-
- disk_begindetach()
- Check whether the disk is open, and if not, return 0. If
the disk is open, and DETACH_FORCE is not set in
flags, return EBUSY. Otherwise, call the provided
lastclose routine (if not NULL) and return its exit code.
-
-
- disk_detach()
- Detach a disk; free storage for the disklabel, remove the
disk from the disklist, and decrement the system disk count. If the count
drops below zero, panic.
-
-
- disk_destroy()
- Release resources used by the disk structure when it is no
longer required.
-
-
- disk_wait()
- Disk timings are measured by counting the number of queued
requests (wait counter) and requests issued to the hardware (busy counter)
and keeping a timestamp of when each counter changes. The time interval between
two changes of a counter is accumulated into a total, and is also multiplied
by the counter value and then accumulated into a sum. Both values can be
used to determine how much time is spent in the driver queue or in flight
to the hardware, as well as the average number of requests in either state.
disk_wait() increments the disk's wait counter and
handles the accumulation.
-
-
- disk_busy()
- Decrements the disk's wait counter, increments the
disk's “busy counter”, and handles the accumulation for both. If the
wait counter is still zero, it is assumed that the driver has not been
updated to call disk_wait(); in that case only the values from
the busy counter are available.
-
-
- disk_unbusy()
- Decrements the disk's busy counter and handles the
accumulation. The third argument read specifies the
direction of I/O; if non-zero it means reading from the disk, otherwise it
means writing to the disk.
-
-
- disk_isbusy()
- Returns true if the disk is marked as
busy and false if it is not.
-
-
- disk_find()
- Return a pointer to the disk structure corresponding to the
name provided, or
NULL
if the disk does not
exist.
-
-
- disk_set_info()
- Set up the disk-info dictionary and other dependent values of
the disk structure. The driver must have initialized the dk_geom member of
struct disk with suitable values beforehand. If
type is not NULL, it will be
added to the dictionary.
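The decision made by disk_begindetach() can be modelled outside the kernel. The following is only a userland sketch of the three outcomes described above, not the kernel implementation: ex_begindetach(), ex_lastclose_t, and EX_DETACH_FORCE (including its value) are hypothetical stand-ins, and returning 0 when no lastclose routine is supplied is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the kernel's DETACH_FORCE flag; the value is an assumption. */
#define EX_DETACH_FORCE 0x01

/* Hypothetical callback type; the kernel version takes a device_t. */
typedef int (*ex_lastclose_t)(void *self);

int
ex_begindetach(int openmask, ex_lastclose_t lastclose, void *self, int flags)
{
    if (openmask == 0)
        return 0;               /* not open: nothing to do */
    if ((flags & EX_DETACH_FORCE) == 0)
        return EBUSY;           /* open and detach not forced */
    if (lastclose == NULL)
        return 0;               /* assumption: no callback means success */
    return lastclose(self);     /* forced: pass on the callback's result */
}

/* Example callback that fails, to show its result being passed through. */
static int
ex_lastclose_fail(void *self)
{
    (void)self;
    return EIO;
}

/* Walk through the outcomes. */
void
ex_begindetach_demo(void)
{
    assert(ex_begindetach(0, ex_lastclose_fail, NULL, 0) == 0);
    assert(ex_begindetach(1, ex_lastclose_fail, NULL, 0) == EBUSY);
    assert(ex_begindetach(1, ex_lastclose_fail, NULL, EX_DETACH_FORCE) == EIO);
    assert(ex_begindetach(1, NULL, NULL, EX_DETACH_FORCE) == 0);
}
```

A driver's detach routine would typically stop and return the error if this step fails, and only then go on to disk_detach() and disk_destroy().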
The functions typically called by device drivers are
disk_init(),
disk_attach(),
disk_begindetach(),
disk_detach(),
disk_destroy(),
disk_wait(),
disk_busy(),
disk_unbusy(), and
disk_set_info(). The function
disk_find()
is provided as a utility function.
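The timing scheme described for disk_wait(), disk_busy(), and disk_unbusy() may be easier to follow as a small userland model. This is a sketch under stated assumptions, not the kernel code: every ex_* name is hypothetical, time is passed in explicitly as microseconds so the result is deterministic, and the interval is only accumulated while the counter is non-zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-counter state for one of the two counters. */
struct ex_counter {
    int      count;     /* requests currently in this state */
    uint64_t stamp_us;  /* time of the last counter change */
    uint64_t total_us;  /* accumulated time with count > 0 */
    uint64_t sum_us;    /* accumulated interval * count */
};

struct ex_disk {
    struct ex_counter wait;  /* queued in the driver */
    struct ex_counter busy;  /* issued to the hardware */
};

/* Account for the interval since the last change, then apply delta. */
static void
ex_counter_change(struct ex_counter *c, uint64_t now_us, int delta)
{
    uint64_t diff = now_us - c->stamp_us;

    if (c->count > 0) {
        c->total_us += diff;
        c->sum_us += diff * (uint64_t)c->count;
    }
    c->count += delta;
    assert(c->count >= 0);
    c->stamp_us = now_us;
}

void
ex_disk_wait(struct ex_disk *d, uint64_t now_us)
{
    ex_counter_change(&d->wait, now_us, +1);
}

void
ex_disk_busy(struct ex_disk *d, uint64_t now_us)
{
    /* A driver that never calls the wait hook leaves this counter at zero. */
    if (d->wait.count > 0)
        ex_counter_change(&d->wait, now_us, -1);
    ex_counter_change(&d->busy, now_us, +1);
}

void
ex_disk_unbusy(struct ex_disk *d, uint64_t now_us)
{
    ex_counter_change(&d->busy, now_us, -1);
}

/* Two requests: queued at 0 and 10, issued at 30 and 50, done at 70 and 100. */
void
ex_timing_demo(void)
{
    struct ex_disk d;

    memset(&d, 0, sizeof(d));
    ex_disk_wait(&d, 0);
    ex_disk_wait(&d, 10);
    ex_disk_busy(&d, 30);
    ex_disk_busy(&d, 50);
    ex_disk_unbusy(&d, 70);
    ex_disk_unbusy(&d, 100);
    assert(d.wait.total_us == 50 && d.wait.sum_us == 70);
    assert(d.busy.total_us == 70 && d.busy.sum_us == 90);
}
```

Dividing a counter's sum by the length of the observation window gives the average number of requests in that state; in the demo, 90 / 100 means an average of 0.9 requests in flight.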
DISK IOCTLS
The following ioctls should be implemented by disk drivers:
-
-
DIOCGDINFO
struct disklabel
- Get disklabel.
-
-
DIOCSDINFO
struct disklabel
- Set in-memory disklabel.
-
-
DIOCWDINFO
struct disklabel
- Set in-memory disklabel and write on-disk disklabel.
-
-
DIOCGPART
struct partinfo
- Get partition information. This is used internally.
-
-
DIOCRFORMAT
struct format_op
- Read format.
-
-
DIOCWFORMAT
struct format_op
- Write format.
-
-
DIOCSSTEP
int
- Set step rate.
-
-
DIOCSRETRIES
int
- Set number of retries.
-
-
DIOCKLABEL
int
- Specify whether to keep or drop the in-memory disklabel
when the device is closed.
-
-
DIOCWLABEL
int
- Enable or disable writing to the part of the disk that
contains the label.
-
-
DIOCSBAD
struct dkbad
- Set kernel dkbad.
-
-
DIOCEJECT
int
- Eject removable disk.
-
-
DIOCLOCK
int
- Lock or unlock disk pack. For devices with removable media,
locking is intended to prevent the operator from removing the media.
-
-
DIOCGDEFLABEL
struct disklabel
- Get default label.
-
-
DIOCCLRLABEL
- Clear disk label.
-
-
DIOCGCACHE
int
- Get status of disk read and write caches. The result is a
bitmask containing the following values:
-
-
DKCACHE_READ
- Read cache enabled.
-
-
DKCACHE_WRITE
- Write(back) cache enabled.
-
-
DKCACHE_RCHANGE
- Read cache enable is changeable.
-
-
DKCACHE_WCHANGE
- Write cache enable is changeable.
-
-
DKCACHE_SAVE
- Cache parameters may be saved, so that they persist
across reboots or device detach/attach cycles.
-
-
DIOCSCACHE
int
- Set status of disk read and write caches. The input is a
bitmask in the same format as used for DIOCGCACHE.
-
-
DIOCCACHESYNC
int
- Synchronise the disk cache. This causes information in the
disk's write cache (if any) to be flushed to stable storage. The argument
specifies whether or not to force a flush even if the kernel believes that
there is no outstanding data.
-
-
DIOCBSLIST
struct disk_badsecinfo
- Get bad sector list.
-
-
DIOCBSFLUSH
- Flush bad sector list.
-
-
DIOCAWEDGE
struct dkwedge_info
- Add wedge.
-
-
DIOCGWEDGEINFO
struct dkwedge_info
- Get wedge information.
-
-
DIOCDWEDGE
struct dkwedge_info
- Delete wedge.
-
-
DIOCLWEDGES
struct dkwedge_list
- List wedges.
-
-
DIOCGSTRATEGY
struct disk_strategy
- Get disk buffer queue strategy.
-
-
DIOCSSTRATEGY
struct disk_strategy
- Set disk buffer queue strategy.
-
-
DIOCGDISKINFO
struct plistref
- Get disk-info dictionary.
-
-
DIOCGMEDIASIZE
off_t
- Get disk size in bytes.
-
-
DIOCGSECTORSIZE
u_int
- Get sector size in bytes.
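As an illustration of the cache bitmask returned by DIOCGCACHE, the following userland sketch decodes a mask into a readable string. The EX_DKCACHE_* names and their values are stand-ins chosen so the example builds anywhere and should be treated as assumptions; real code would include the system header and use the DKCACHE_* macros. The helper names are hypothetical as well.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in values (assumptions); use the real DKCACHE_* macros on NetBSD. */
#define EX_DKCACHE_READ    0x000001
#define EX_DKCACHE_WRITE   0x000002
#define EX_DKCACHE_RCHANGE 0x000100
#define EX_DKCACHE_WCHANGE 0x000200
#define EX_DKCACHE_SAVE    0x010000

/* Write a comma-separated description of mask into buf; returns its length. */
size_t
ex_describe_cache(int mask, char *buf, size_t len)
{
    static const struct { int bit; const char *name; } tab[] = {
        { EX_DKCACHE_READ,    "read" },
        { EX_DKCACHE_WRITE,   "write" },
        { EX_DKCACHE_RCHANGE, "rchange" },
        { EX_DKCACHE_WCHANGE, "wchange" },
        { EX_DKCACHE_SAVE,    "save" },
    };
    size_t used = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(tab) / sizeof(tab[0]); i++) {
        if ((mask & tab[i].bit) == 0)
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
            used > 0 ? "," : "", tab[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* drop the partial entry */
            break;
        }
        used += (size_t)n;
    }
    return used;
}

/* True if the write cache is off but the device says it can be turned on. */
int
ex_can_enable_write_cache(int mask)
{
    return (mask & EX_DKCACHE_WRITE) == 0 &&
        (mask & EX_DKCACHE_WCHANGE) != 0;
}

void
ex_cache_demo(void)
{
    char buf[64];
    size_t n = ex_describe_cache(EX_DKCACHE_READ | EX_DKCACHE_WCHANGE,
        buf, sizeof(buf));

    assert(n == 12 && strcmp(buf, "read,wchange") == 0);
    assert(ex_can_enable_write_cache(EX_DKCACHE_READ | EX_DKCACHE_WCHANGE));
    assert(!ex_can_enable_write_cache(EX_DKCACHE_READ));
}
```

A program would obtain the mask by issuing DIOCGCACHE on an open disk device and passing the resulting int to a helper like this one.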
USING THE FRAMEWORK
This section includes a description of basic use of the framework and example
usage of its functions. Actual implementation of a device driver which uses
the framework may vary.
Each device in the system uses a “softc” structure which contains
autoconfiguration and state information for that device. In the case of disks,
the softc should also contain one instance of the disk structure, e.g.:
struct foo_softc {
device_t sc_dev; /* generic device information */
struct disk sc_dk; /* generic disk information */
kmutex_t sc_dk_mtx; /* serializes disk metrics calls (used below) */
[ . . . more . . . ]
};
In order for the system to gather metrics data about a disk, the disk must be
registered with the system. The
disk_attach() routine
performs all of the functions currently required to register a disk with the
system including allocation of disklabel storage space, recording of the time
since boot that the disk was attached, and insertion into the disklist. Note
that since this function allocates storage space for the disklabel, it must be
called before the disklabel is read from the media or used in any other way.
Before
disk_attach() is called, portions of the disk
structure must be initialized with data specific to that disk. For example, in
the “foo” disk driver, the following would be performed in the
autoconfiguration “attach” routine:
void
fooattach(device_t parent, device_t self, void *aux)
{
struct foo_softc *sc = device_private(self);
[ . . . ]
/* Initialize and attach the disk structure. */
disk_init(&sc->sc_dk, device_xname(self), &foodkdriver);
disk_attach(&sc->sc_dk);
/* Read geometry and fill in pertinent parts of disklabel. */
/* Initialize geometry values of the disk structure */
[ . . . ]
disk_set_info(self, &sc->sc_dk, type);
}
The
foodkdriver above is the disk's “driver”
switch. This switch currently includes pointers to several driver entry
points, where only the
d_strategy entry point is used by the
disk framework. This switch needs to have global scope and should be
initialized as follows:
void (foostrategy)(struct buf *);
void (foominphys)(struct buf *);
int (fooopen)(dev_t, int, int, struct lwp *);
int (fooclose)(dev_t, int, int, struct lwp *);
int (foo_discard)(device_t, off_t, off_t);
int (foo_diskstart)(device_t, struct buf *);
void (foo_iosize)(device_t, int *);
int (foo_dumpblocks)(device_t, void *, daddr_t, int);
int (foo_lastclose)(device_t);
int (foo_firstopen)(device_t, dev_t, int, int);
int (foo_label)(device_t, struct disklabel *);
const struct dkdriver foodkdriver = {
.d_open = fooopen,
.d_close = fooclose,
.d_strategy = foostrategy,
.d_minphys = foominphys,
.d_discard = foo_discard,
.d_diskstart = foo_diskstart, /* optional */
.d_dumpblocks = foo_dumpblocks, /* optional */
.d_iosize = foo_iosize, /* optional */
.d_firstopen = foo_firstopen, /* optional */
.d_lastclose = foo_lastclose, /* optional */
.d_label = foo_label, /* optional */
};
Once the disk is attached, metrics may be gathered on that disk. In order to
gather metrics data, the driver must tell the framework when the disk queues,
starts and stops operations. This functionality is provided by the
disk_wait(),
disk_busy() and
disk_unbusy() routines. Because
struct
disk is part of the device driver's private data, it needs to be guarded. Mutual
exclusion must be provided by the driver, because
disk_wait(),
disk_busy() and
disk_unbusy() are not
thread safe. The
disk_busy() routine should be called
immediately before a command to the disk is sent, e.g.:
void
foostrategy(struct buf *bp)
{
[ . . . ]
mutex_enter(&sc->sc_dk_mtx);
disk_wait(&sc->sc_dk);
/* Put buffer onto drive's transfer queue */
mutex_exit(&sc->sc_dk_mtx);
foostart(sc);
}
void
foostart(struct foo_softc *sc)
{
[ . . . ]
/* Get buffer from drive's transfer queue. */
[ . . . ]
/* Build command to send to drive. */
[ . . . ]
/* Tell the disk framework we're going busy. */
mutex_enter(&sc->sc_dk_mtx);
disk_busy(&sc->sc_dk);
mutex_exit(&sc->sc_dk_mtx);
/* Send command to the drive. */
[ . . . ]
}
The routine
disk_unbusy() performs some consistency checks,
such as ensuring that the calls to
disk_busy() and
disk_unbusy() are balanced. It also performs the final steps
of the metrics calculation. A byte count is added to the disk's running total,
and if greater than zero, the number of transfers the disk has performed is
incremented. The third argument
read specifies the
direction of I/O; if non-zero it means reading from the disk, otherwise it
means writing to the disk.
void
foodone(struct foo_xfer *xfer)
{
struct foo_softc *sc = (struct foo_softc *)xfer->xf_softc;
struct buf *bp = xfer->xf_buf;
long nbytes;
[ . . . ]
/*
* Get number of bytes transferred. If there is no buf
* associated with the xfer, we are being called at the
* end of a non-I/O command.
*/
if (bp == NULL)
nbytes = 0;
else
nbytes = bp->b_bcount - bp->b_resid;
[ . . . ]
mutex_enter(&sc->sc_dk_mtx);
/* Notify the disk framework that we've completed the transfer. */
disk_unbusy(&sc->sc_dk, nbytes,
bp != NULL ? bp->b_flags & B_READ : 0);
mutex_exit(&sc->sc_dk_mtx);
[ . . . ]
}
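The bookkeeping that disk_unbusy() is described as doing (a balance check, a running byte total, and a transfer count that only advances for non-zero byte counts) can be sketched as a userland model. All ex_* names are hypothetical, and returning -1 for an unbalanced call is a choice made for this example rather than the kernel's behaviour.

```c
#include <assert.h>
#include <stdint.h>

struct ex_io_totals {
    int      busy;      /* ex_busy() calls not yet matched by ex_unbusy() */
    uint64_t rbytes;    /* bytes read */
    uint64_t wbytes;    /* bytes written */
    uint64_t rxfer;     /* read transfers */
    uint64_t wxfer;     /* write transfers */
};

void
ex_busy(struct ex_io_totals *t)
{
    t->busy++;
}

/* Returns 0 on success, -1 if there is no matching ex_busy() call. */
int
ex_unbusy(struct ex_io_totals *t, long bcount, int read)
{
    if (t->busy <= 0)
        return -1;
    t->busy--;
    if (bcount > 0) {
        if (read != 0) {
            t->rbytes += (uint64_t)bcount;
            t->rxfer++;
        } else {
            t->wbytes += (uint64_t)bcount;
            t->wxfer++;
        }
    }
    return 0;
}

void
ex_unbusy_demo(void)
{
    struct ex_io_totals t = { 0, 0, 0, 0, 0 };
    int rc;

    ex_busy(&t);
    rc = ex_unbusy(&t, 4096, 0x100);    /* any non-zero value means read */
    assert(rc == 0);
    ex_busy(&t);
    rc = ex_unbusy(&t, 512, 0);         /* zero means write */
    assert(rc == 0);
    ex_busy(&t);
    rc = ex_unbusy(&t, 0, 0);           /* non-I/O command: nothing counted */
    assert(rc == 0);
    rc = ex_unbusy(&t, 512, 1);         /* unbalanced call is rejected */
    assert(rc == -1);
    assert(t.rbytes == 4096 && t.rxfer == 1);
    assert(t.wbytes == 512 && t.wxfer == 1);
}
```

Because the read argument is only tested against zero, passing the masked flag value, as foodone() does with bp->b_flags & B_READ, works without converting it to 0 or 1 first.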
disk_isbusy() reports the status of the disk device: it returns
true if the device is currently busy and false if it is not. Like
disk_wait(),
disk_busy() and
disk_unbusy(), it requires explicit locking by the caller.
CODE REFERENCES
The disk framework itself is implemented within the file
sys/kern/subr_disk.c. Data structures and function
prototypes for the framework are located in
sys/sys/disk.h.
The
NetBSD machine-independent SCSI disk and CD-ROM
drivers use the disk framework. They are located in
sys/dev/scsipi/sd.c and
sys/dev/scsipi/cd.c.
The
NetBSD ccd,
dm,
and
vnd drivers use the detachment capability of the
framework. They are located in
sys/dev/ccd.c,
sys/dev/vnd.c, and
sys/dev/dm/device-mapper.c.
SEE ALSO
ccd(4),
dm(4),
vnd(4),
dksubr(9)
HISTORY
The
NetBSD generic disk framework appeared in
NetBSD 1.2.
AUTHORS
The
NetBSD generic disk framework was architected and
implemented by
Jason R. Thorpe <thorpej@NetBSD.org>.