
Advanced Character Driver Operations

Ted Baker  Andy Wang COP 5641 / CIS 4930.


Presentation Transcript


  1. Advanced Character Driver Operations Ted Baker  Andy Wang COP 5641 / CIS 4930

  2. Topics • Managing ioctl command numbers • Blocking/unblocking a process • Seeking on a device • Access control

  3. ioctl • For operations beyond simple data transfers • Eject the media • Report error information • Change hardware settings • Self destruct • Alternatives • Embedded commands in the data stream • Driver-specific file systems

  4. ioctl • User-level interface int ioctl(int fd, unsigned long cmd, ...); • ... • Variable number of arguments • Problematic for the system call interface • In this context, it is meant to pass a single optional argument • Just a way to bypass the type checking • Difficult to audit ioctl calls • E.g., 32-bit vs. 64-bit modes • Currently uses lock_kernel(), or the global kernel lock • See vfs_ioctl() in /fs/ioctl.c

  5. ioctl • Driver-level interface int (*ioctl) (struct inode *inode, struct file *filp, unsigned int cmd, unsigned long arg); • cmd is passed from the user unchanged • arg can be an integer or a pointer • Compiler does not type check

  6. Choosing the ioctl Commands • Need a numbering scheme to avoid mistakes • E.g., issuing a command to the wrong device (changing the baud rate of an audio device) • Check include/asm/ioctl.h and Documentation/ioctl/ioctl-decoding.txt

  7. Choosing the ioctl Commands • A command number uses four bitfields • Defined in <linux/ioctl.h> • <direction, type, number, size> • direction: direction of data transfer • _IOC_NONE • _IOC_READ • _IOC_WRITE • _IOC_READ | _IOC_WRITE

  8. Choosing the ioctl Commands • type (ioctl device type) • 8-bit (_IOC_TYPEBITS) magic number • Associated with the device • number • 8-bit (_IOC_NRBITS) sequential number • Unique within device • size: size of user data involved • The width is either 13 or 14 bits (_IOC_SIZEBITS)

  9. Choosing the ioctl Commands • Useful macros to create ioctl command numbers • _IO(type, nr) • _IOR(type, nr, datatype) • _IOW(type, nr, datatype) • _IOWR(type, nr, datatype) • Example • cmd = _IOWR('k', 1, struct foo) • The macro will figure out that size = sizeof(datatype)

  10. Choosing the ioctl Commands • Useful macros to decode ioctl command numbers • _IOC_DIR(nr) • _IOC_TYPE(nr) • _IOC_NR(nr) • _IOC_SIZE(nr)
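The encoding and decoding macros can be tried from user space, since <linux/ioctl.h> is also exported to applications. The following is a minimal sketch, not part of the slides: struct foo, DEMO_IOC_MAGIC, DEMO_IOCXFOO, struct ioc_fields, and decode_cmd() are all hypothetical names chosen for illustration.

```c
#include <assert.h>        /* static_assert (C11) */
#include <linux/ioctl.h>   /* _IO, _IOR, _IOW, _IOWR and the _IOC_* decoders */

/* Hypothetical payload type, standing in for the slide's "struct foo". */
struct foo {
    int a;
    int b;
};

/* The size bitfield is only _IOC_SIZEBITS wide, so the payload must fit. */
static_assert(sizeof(struct foo) < (1u << _IOC_SIZEBITS),
              "payload must fit in the size bitfield");

/* Hypothetical magic number and command, mirroring the slide's example. */
#define DEMO_IOC_MAGIC 'k'
#define DEMO_IOCXFOO   _IOWR(DEMO_IOC_MAGIC, 1, struct foo)

/* The four bitfields of a command number, pulled apart. */
struct ioc_fields {
    unsigned int dir;   /* _IOC_NONE, _IOC_READ, _IOC_WRITE, or both */
    unsigned int type;  /* magic number */
    unsigned int nr;    /* sequential number within the device */
    unsigned int size;  /* sizeof the user data involved */
};

/* Decode a command number with the _IOC_* macros. */
struct ioc_fields decode_cmd(unsigned int cmd)
{
    struct ioc_fields f;

    f.dir  = _IOC_DIR(cmd);
    f.type = _IOC_TYPE(cmd);
    f.nr   = _IOC_NR(cmd);
    f.size = _IOC_SIZE(cmd);
    return f;
}
```

Calling decode_cmd(DEMO_IOCXFOO) should give back the direction, magic number, sequence number, and sizeof(struct foo) that went into _IOWR, which is the same information a driver extracts before dispatching on the command.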

  11. Choosing the ioctl Commands • The scull example
      /* Use 'k' as magic number */
      #define SCULL_IOC_MAGIC 'k'
      /* Please use a different 8-bit number in your code */
      #define SCULL_IOCRESET _IO(SCULL_IOC_MAGIC, 0)

  12. Choosing the ioctl Commands • The scull example
      /*
       * S means "Set" through a ptr,
       * T means "Tell" directly with the argument value
       * G means "Get": reply by setting through a pointer
       * Q means "Query": response is on the return value
       * X means "eXchange": switch G and S atomically
       * H means "sHift": switch T and Q atomically
       * (X and H set the new value and return the old value)
       */
      #define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)
      #define SCULL_IOCSQSET    _IOW(SCULL_IOC_MAGIC, 2, int)
      #define SCULL_IOCTQUANTUM _IO(SCULL_IOC_MAGIC, 3)
      #define SCULL_IOCTQSET    _IO(SCULL_IOC_MAGIC, 4)
      #define SCULL_IOCGQUANTUM _IOR(SCULL_IOC_MAGIC, 5, int)

  13. Choosing the ioctl Commands • The scull example
      #define SCULL_IOCGQSET    _IOR(SCULL_IOC_MAGIC, 6, int)
      #define SCULL_IOCQQUANTUM _IO(SCULL_IOC_MAGIC, 7)
      #define SCULL_IOCQQSET    _IO(SCULL_IOC_MAGIC, 8)
      #define SCULL_IOCXQUANTUM _IOWR(SCULL_IOC_MAGIC, 9, int)
      #define SCULL_IOCXQSET    _IOWR(SCULL_IOC_MAGIC, 10, int)
      #define SCULL_IOCHQUANTUM _IO(SCULL_IOC_MAGIC, 11)
      #define SCULL_IOCHQSET    _IO(SCULL_IOC_MAGIC, 12)
      #define SCULL_IOC_MAXNR 14

  14. The Return Value • When the command number is not supported • Return -EINVAL • Or -ENOTTY (according to the POSIX standard)

  15. The Predefined Commands • Handled by the kernel first • Will not be passed down to device drivers • Three groups • For any file (regular, device, FIFO, socket) • Magic number: 'T' • For regular files only • Specific to the file system type

  16. Using the ioctl Argument • If it is an integer, just use it directly • If it is a pointer • Need to check for valid user address int access_ok(int type, const void *addr, unsigned long size); • type: either VERIFY_READ or VERIFY_WRITE • Returns 1 for success, 0 for failure • On failure, the driver returns -EFAULT to the caller • Defined in <asm/uaccess.h> • Mostly called by memory-access routines

  17. Using the ioctl Argument • The scull example
      int scull_ioctl(struct inode *inode, struct file *filp,
                      unsigned int cmd, unsigned long arg)
      {
          int err = 0, tmp;
          int retval = 0;
          /* check the magic number and whether the command is defined */
          if (_IOC_TYPE(cmd) != SCULL_IOC_MAGIC) {
              return -ENOTTY;
          }
          if (_IOC_NR(cmd) > SCULL_IOC_MAXNR) {
              return -ENOTTY;
          }
          …

  18. Using the ioctl Argument • The scull example
          …
          /*
           * the concept of "read" and "write" is reversed here:
           * _IOC_READ means the user reads, so the kernel writes to
           * user space (VERIFY_WRITE), and vice versa
           */
          if (_IOC_DIR(cmd) & _IOC_READ) {
              err = !access_ok(VERIFY_WRITE, (void __user *) arg, _IOC_SIZE(cmd));
          } else if (_IOC_DIR(cmd) & _IOC_WRITE) {
              err = !access_ok(VERIFY_READ, (void __user *) arg, _IOC_SIZE(cmd));
          }
          if (err)
              return -EFAULT;
          …

  19. Using the ioctl Argument • Data transfer functions optimized for most used data sizes (1, 2, 4, and 8 bytes) • If the size mismatches • Cryptic compiler error message: • Conversion to non-scalar type requested • Use copy_to_user and copy_from_user • #include <asm/uaccess.h> • put_user(datum, ptr) • Writes to a user-space address • Calls access_ok() • Returns 0 on success, -EFAULT on error

  20. Using the ioctl Argument • __put_user(datum, ptr) • Does not check access_ok() • Can still fail if the user-space memory is not writable • get_user(local, ptr) • Reads from a user-space address • Calls access_ok() • Stores the retrieved value in local • Returns 0 on success, -EFAULT on error • __get_user(local, ptr) • Does not check access_ok() • Can still fail if the user-space memory is not readable

  21. Capabilities and Restricted Operations • Limit certain ioctl operations to privileged users • See <linux/capability.h> for the full set of capabilities • To check a certain capability call int capable(int capability); • In the scull example
      /* CAP_SYS_ADMIN: a catch-all capability for many system administration operations */
      if (!capable(CAP_SYS_ADMIN)) {
          return -EPERM;
      }

  22. The Implementation of the ioctl Commands • A giant switch statement
          …
          switch(cmd) {
          case SCULL_IOCRESET:
              scull_quantum = SCULL_QUANTUM;
              scull_qset = SCULL_QSET;
              break;
          case SCULL_IOCSQUANTUM: /* Set: arg points to the value */
              if (!capable(CAP_SYS_ADMIN)) {
                  return -EPERM;
              }
              retval = __get_user(scull_quantum, (int __user *)arg);
              break;
          …

  23. The Implementation of the ioctl Commands
          …
          case SCULL_IOCTQUANTUM: /* Tell: arg is the value */
              if (!capable(CAP_SYS_ADMIN)) {
                  return -EPERM;
              }
              scull_quantum = arg;
              break;
          case SCULL_IOCGQUANTUM: /* Get: arg is pointer to result */
              retval = __put_user(scull_quantum, (int __user *) arg);
              break;
          case SCULL_IOCQQUANTUM: /* Query: return it (> 0) */
              return scull_quantum;
          …

  24. The Implementation of the ioctl Commands
          …
          case SCULL_IOCXQUANTUM: /* eXchange: use arg as pointer */
              if (!capable(CAP_SYS_ADMIN)) {
                  return -EPERM;
              }
              tmp = scull_quantum;
              retval = __get_user(scull_quantum, (int __user *) arg);
              if (retval == 0) {
                  retval = __put_user(tmp, (int __user *) arg);
              }
              break;
          …

  25. The Implementation of the ioctl Commands
          …
          case SCULL_IOCHQUANTUM: /* sHift: like Tell + Query */
              if (!capable(CAP_SYS_ADMIN)) {
                  return -EPERM;
              }
              tmp = scull_quantum;
              scull_quantum = arg;
              return tmp;
          default: /* redundant, as cmd was checked against MAXNR */
              return -ENOTTY;
          } /* switch */
          return retval;
      } /* scull_ioctl */

  26. The Implementation of the ioctl Commands • Six ways to pass and receive arguments from the user space • Need to know command number
      int quantum;
      ioctl(fd, SCULL_IOCSQUANTUM, &quantum);           /* Set by pointer */
      ioctl(fd, SCULL_IOCTQUANTUM, quantum);            /* Set by value */
      ioctl(fd, SCULL_IOCGQUANTUM, &quantum);           /* Get by pointer */
      quantum = ioctl(fd, SCULL_IOCQQUANTUM);           /* Get by return value */
      ioctl(fd, SCULL_IOCXQUANTUM, &quantum);           /* Exchange by pointer */
      quantum = ioctl(fd, SCULL_IOCHQUANTUM, quantum);  /* Exchange by value */
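The six argument-passing conventions can be exercised without a kernel module. The sketch below is a user-space stand-in, not the scull driver: demo_ioctl(), demo_quantum, and the DEMO_* command names are hypothetical, the initial value is arbitrary, and plain pointer dereferences take the place of __get_user()/__put_user() and the capability checks.

```c
#include <errno.h>
#include <linux/ioctl.h>

/* Hypothetical command set, numbered like the quantum commands above. */
#define DEMO_IOC_MAGIC   'k'
#define DEMO_IOCSQUANTUM _IOW(DEMO_IOC_MAGIC, 1, int)   /* Set */
#define DEMO_IOCTQUANTUM _IO(DEMO_IOC_MAGIC, 3)         /* Tell */
#define DEMO_IOCGQUANTUM _IOR(DEMO_IOC_MAGIC, 5, int)   /* Get */
#define DEMO_IOCQQUANTUM _IO(DEMO_IOC_MAGIC, 7)         /* Query */
#define DEMO_IOCXQUANTUM _IOWR(DEMO_IOC_MAGIC, 9, int)  /* eXchange */
#define DEMO_IOCHQUANTUM _IO(DEMO_IOC_MAGIC, 11)        /* sHift */
#define DEMO_IOC_MAXNR   11

int demo_quantum = 4000;   /* arbitrary starting value */

/* User-space stand-in for a driver's ioctl method. */
long demo_ioctl(unsigned int cmd, unsigned long arg)
{
    int *uptr = (int *)arg;   /* in a driver: (int __user *)arg */
    int tmp;

    /* Reject commands that belong to another device or are out of range. */
    if (_IOC_TYPE(cmd) != DEMO_IOC_MAGIC || _IOC_NR(cmd) > DEMO_IOC_MAXNR)
        return -ENOTTY;

    switch (cmd) {
    case DEMO_IOCSQUANTUM:            /* Set: arg points to the value */
        demo_quantum = *uptr;
        return 0;
    case DEMO_IOCTQUANTUM:            /* Tell: arg is the value */
        demo_quantum = (int)arg;
        return 0;
    case DEMO_IOCGQUANTUM:            /* Get: arg points to the result */
        *uptr = demo_quantum;
        return 0;
    case DEMO_IOCQQUANTUM:            /* Query: result is the return value */
        return demo_quantum;
    case DEMO_IOCXQUANTUM:            /* eXchange: Set + Get via pointer */
        tmp = demo_quantum;
        demo_quantum = *uptr;
        *uptr = tmp;
        return 0;
    case DEMO_IOCHQUANTUM:            /* sHift: Tell + Query */
        tmp = demo_quantum;
        demo_quantum = (int)arg;
        return tmp;
    default:                          /* in range but not defined */
        return -ENOTTY;
    }
}
```

A caller passes either a value or the address of an int cast to unsigned long, exactly as the variadic ioctl() call does on its behalf. The Query and sHift forms only work for values that cannot be mistaken for a negative error code.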

  27. Device Control Without ioctl • Writing control sequences into the data stream itself • Example: console escape sequences • Advantages: • No need to implement ioctl methods • Disadvantages: • Need to make sure that escape sequences do not appear in the normal data stream (e.g., cat a binary file) • Need to parse the data stream
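To make the parsing cost concrete, here is a small sketch of an in-band control scheme. The protocol is invented for illustration and is not taken from any real console driver: ESC (0x1B) followed by one byte is a command, and ESC ESC stands for a literal ESC in the data. split_stream() and struct split_result are hypothetical names.

```c
#include <stddef.h>

#define ESC 0x1B   /* introducer byte of the hypothetical protocol */

/* Outcome of splitting one input buffer. */
struct split_result {
    size_t data_len;   /* payload bytes written to data_out */
    size_t cmd_count;  /* command bytes written to cmds_out */
    int    dangling;   /* 1 if the input ended right after an unpaired ESC */
};

/*
 * Separate payload bytes from embedded commands.
 * data_out and cmds_out must each have room for n bytes.
 */
struct split_result split_stream(const unsigned char *in, size_t n,
                                 unsigned char *data_out,
                                 unsigned char *cmds_out)
{
    struct split_result r = { 0, 0, 0 };
    size_t i = 0;

    while (i < n) {
        if (in[i] != ESC) {
            data_out[r.data_len++] = in[i++];      /* ordinary data */
        } else if (i + 1 == n) {
            r.dangling = 1;    /* command byte arrives in the next buffer */
            i++;
        } else if (in[i + 1] == ESC) {
            data_out[r.data_len++] = ESC;          /* escaped literal ESC */
            i += 2;
        } else {
            cmds_out[r.cmd_count++] = in[i + 1];   /* embedded command */
            i += 2;
        }
    }
    return r;
}
```

The dangling case shows one of the listed disadvantages: a sequence can be split across two writes, so the parser has to carry state between calls, and every writer must escape binary data that happens to contain the introducer byte.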

  28. Blocking I/O • Needed when no data is available for reads • When the device is not ready to accept data • Output buffer is full

  29. Introduction to Sleeping

  30. Introduction to Sleeping • A process is removed from the scheduler’s run queue • Certain rules • Never sleep when running in an atomic context • Multiple steps must be performed without concurrent accesses • Not while holding a spinlock, seqlock, or RCU lock • Not while disabling interrupts

  31. Introduction to Sleeping • Okay to sleep while holding a semaphore • Other threads waiting for the semaphore will also sleep • Need to keep it short • Make sure that it is not blocking the process that will wake it up • After waking up • Make no assumptions about the state of the system • The resource one is waiting for might be gone again • Must check the wait condition again

  32. Introduction to Sleeping • Wait queue: contains a list of processes waiting for a specific event • #include <linux/wait.h> • To initialize statically, call DECLARE_WAIT_QUEUE_HEAD(my_queue); • To initialize dynamically, call wait_queue_head_t my_queue; init_waitqueue_head(&my_queue);

  33. Simple Sleeping • Call variants of wait_event macros • wait_event(queue, condition) • queue = wait queue head • Passed by value • Waits until the boolean condition becomes true • Puts into an uninterruptible sleep • Usually is not what you want • wait_event_interruptible(queue, condition) • Can be interrupted by any signals • Returns nonzero if sleep was interrupted • Your driver should return -ERESTARTSYS

  34. Simple Sleeping • wait_event_killable(queue, condition) • Can be interrupted only by fatal signals • wait_event_timeout(queue, condition, timeout) • Wait for a limited time (in jiffies) • Returns 0 if the timeout expires, regardless of how the condition evaluates • wait_event_interruptible_timeout(queue, condition, timeout)

  35. Simple Sleeping • To wake up, call variants of wake_up functions void wake_up(wait_queue_head_t *queue); • Wakes up all processes waiting on the queue void wake_up_interruptible(wait_queue_head_t *queue); • Wakes up processes that perform an interruptible sleep

  36. Simple Sleeping • Example module: sleepy
      static DECLARE_WAIT_QUEUE_HEAD(wq);
      static int flag = 0;
      ssize_t sleepy_read(struct file *filp, char __user *buf,
                          size_t count, loff_t *pos)
      {
          printk(KERN_DEBUG "process %i (%s) going to sleep\n",
                 current->pid, current->comm);
          wait_event_interruptible(wq, flag != 0);
          /* multiple threads can wake up at this point */
          flag = 0;
          printk(KERN_DEBUG "awoken %i (%s)\n", current->pid, current->comm);
          return 0; /* EOF */
      }

  37. Simple Sleeping • Example module: sleepy
      ssize_t sleepy_write(struct file *filp, const char __user *buf,
                           size_t count, loff_t *pos)
      {
          printk(KERN_DEBUG "process %i (%s) awakening the readers...\n",
                 current->pid, current->comm);
          flag = 1;
          wake_up_interruptible(&wq);
          return count; /* succeed, to avoid retrial */
      }
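The rule to re-check the condition after waking has a direct user-space analogue in POSIX condition variables. The sketch below is an analogy under that assumption, not kernel code: sleepy_wait(), sleepy_wake(), and sleepy_flag_peek() are hypothetical names, and a mutex is added so that testing and clearing the flag happen as one step.

```c
#include <pthread.h>

static pthread_mutex_t sleepy_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  sleepy_wq   = PTHREAD_COND_INITIALIZER;
static int sleepy_flag = 0;   /* the wait condition: nonzero means "go" */

/* Reader side: sleep until the flag is set, then consume it. */
void sleepy_wait(void)
{
    pthread_mutex_lock(&sleepy_lock);
    /* Loop, because a wakeup does not guarantee the condition still holds. */
    while (sleepy_flag == 0)
        pthread_cond_wait(&sleepy_wq, &sleepy_lock);
    sleepy_flag = 0;          /* cleared under the lock: only one waiter wins */
    pthread_mutex_unlock(&sleepy_lock);
}

/* Writer side: set the flag and wake every waiter. */
void sleepy_wake(void)
{
    pthread_mutex_lock(&sleepy_lock);
    sleepy_flag = 1;
    pthread_cond_broadcast(&sleepy_wq);
    pthread_mutex_unlock(&sleepy_lock);
}

/* Read the flag under the lock (for inspection only). */
int sleepy_flag_peek(void)
{
    int value;

    pthread_mutex_lock(&sleepy_lock);
    value = sleepy_flag;
    pthread_mutex_unlock(&sleepy_lock);
    return value;
}
```

With several threads blocked in sleepy_wait(), one call to sleepy_wake() wakes all of them, but each re-tests the flag while holding the mutex, so exactly one proceeds and the rest go back to sleep. The sleepy module tests and clears its flag without such a lock, which is why more than one reader can get through there.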

  38. Blocking and Nonblocking Operations • By default, operations block • If no data is available for reads • If no space is available for writes • Non-blocking I/O is indicated by the O_NONBLOCK flag in filp->f_flags • Defined in <linux/fcntl.h> • Only open, read, and write calls are affected • Returns -EAGAIN immediately instead of blocking • Applications need to distinguish non-blocking returns vs. EOFs
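From the application side, the distinction between "would block" and end-of-file looks like the following sketch. It uses an ordinary pipe rather than a driver, and read_one_byte(), make_nonblocking_pipe(), and enum read_status are hypothetical helper names. Note that user space sees the driver's -EAGAIN as a return value of -1 with errno set to EAGAIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum read_status { READ_DATA, READ_EOF, READ_WOULD_BLOCK, READ_ERROR };

/* Try to read one byte and classify the outcome. */
enum read_status read_one_byte(int fd, char *out)
{
    ssize_t n = read(fd, out, 1);

    if (n > 0)
        return READ_DATA;
    if (n == 0)
        return READ_EOF;               /* no writers left, nothing buffered */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_WOULD_BLOCK;       /* no data yet, but more may come */
    return READ_ERROR;
}

/* Create a pipe whose read end is non-blocking; returns 0 on success. */
int make_nonblocking_pipe(int fds[2])
{
    int flags;

    if (pipe(fds) != 0)
        return -1;
    flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    return 0;
}
```

Reading from the empty pipe reports READ_WOULD_BLOCK while the write end is open, READ_DATA once a byte has been written, and READ_EOF only after the write end is closed and the buffer is drained.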

  39. A Blocking I/O Example • scullpipe • A read process • Blocks when no data is available • Wakes a blocking write when buffer space becomes available • A write process • Blocks when no buffer space is available • Wakes a blocking read process when data arrives

  40. A Blocking I/O Example • scullpipe data structure
      struct scull_pipe {
          wait_queue_head_t inq, outq;       /* read and write queues */
          char *buffer, *end;                /* begin of buf, end of buf */
          int buffersize;                    /* used in pointer arithmetic */
          char *rp, *wp;                     /* where to read, where to write */
          int nreaders, nwriters;            /* number of openings for r/w */
          struct fasync_struct *async_queue; /* asynchronous readers */
          struct semaphore sem;              /* mutual exclusion semaphore */
          struct cdev cdev;                  /* Char device structure */
      };

  41. A Blocking I/O Example
      static ssize_t scull_p_read(struct file *filp, char __user *buf,
                                  size_t count, loff_t *f_pos)
      {
          struct scull_pipe *dev = filp->private_data;
          if (down_interruptible(&dev->sem))
              return -ERESTARTSYS;
          while (dev->rp == dev->wp) { /* nothing to read */
              up(&dev->sem); /* release the lock */
              if (filp->f_flags & O_NONBLOCK)
                  return -EAGAIN;
              if (wait_event_interruptible(dev->inq, (dev->rp != dev->wp)))
                  return -ERESTARTSYS;
              if (down_interruptible(&dev->sem))
                  return -ERESTARTSYS;
          }

  42. A Blocking I/O Example
          if (dev->wp > dev->rp)
              count = min(count, (size_t)(dev->wp - dev->rp));
          else /* the write pointer has wrapped */
              count = min(count, (size_t)(dev->end - dev->rp));
          if (copy_to_user(buf, dev->rp, count)) {
              up(&dev->sem);
              return -EFAULT;
          }
          dev->rp += count;
          if (dev->rp == dev->end)
              dev->rp = dev->buffer; /* wrapped */
          up(&dev->sem);
          /* finally, awake any writers and return */
          wake_up_interruptible(&dev->outq);
          return count;
      }

  43. Advanced Sleeping

  44. Advanced Sleeping • Uses low-level functions to effect a sleep • How a process sleeps 1. Allocate and initialize a wait_queue_t structure (the queue element) DEFINE_WAIT(my_wait); • Or wait_queue_t my_wait; init_wait(&my_wait);

  45. Advanced Sleeping 2. Add to the proper wait queue and mark a process as being asleep • TASK_RUNNING → TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE • Call void prepare_to_wait(wait_queue_head_t *queue, wait_queue_t *wait, int state);

  46. Advanced Sleeping 3. Give up the processor • Double check the sleeping condition before going to sleep • The wakeup thread might have changed the condition between steps 1 and 2 if (/* sleeping condition */) { schedule(); /* yield the CPU */ }

  47. Advanced Sleeping 4. Return from sleep • Remove the process from the wait queue if schedule() was not called • Call void finish_wait(wait_queue_head_t *queue, wait_queue_t *wait);

  48. Advanced Sleeping • scullpipe write method
      /* How much space is free? */
      static int spacefree(struct scull_pipe *dev)
      {
          if (dev->rp == dev->wp)
              return dev->buffersize - 1;
          return ((dev->rp + dev->buffersize - dev->wp) % dev->buffersize) - 1;
      }
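The arithmetic in spacefree() is easier to check with indices instead of pointers. The sketch below restates it in user space under that assumption: struct ring, ring_spacefree(), and ring_used() are hypothetical names, and ring_used() is an added helper that is not in the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Index-based view of a circular buffer; one slot is always left empty. */
struct ring {
    size_t size;  /* total number of slots (buffersize) */
    size_t rp;    /* next slot to read */
    size_t wp;    /* next slot to write */
};

/* Free slots, using the same formula as spacefree(). */
size_t ring_spacefree(const struct ring *r)
{
    assert(r->size > 1 && r->rp < r->size && r->wp < r->size);
    if (r->rp == r->wp)
        return r->size - 1;                            /* empty buffer */
    return ((r->rp + r->size - r->wp) % r->size) - 1;
}

/* Slots currently holding unread data. */
size_t ring_used(const struct ring *r)
{
    assert(r->size > 1 && r->rp < r->size && r->wp < r->size);
    return (r->wp + r->size - r->rp) % r->size;
}
```

Because rp == wp is reserved to mean "empty", the buffer counts as full when wp sits one slot behind rp, so the free and used counts always add up to size - 1. For example, with size 8, rp 6, and wp 5, ring_spacefree() is 0 and ring_used() is 7.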

  49. Advanced Sleeping
      static ssize_t scull_p_write(struct file *filp, const char __user *buf,
                                   size_t count, loff_t *f_pos)
      {
          struct scull_pipe *dev = filp->private_data;
          int result;
          if (down_interruptible(&dev->sem))
              return -ERESTARTSYS;
          /* Wait for space for writing */
          result = scull_getwritespace(dev, filp);
          if (result)
              return result; /* scull_getwritespace called up(&dev->sem) */
          /* ok, space is there, accept something */
          count = min(count, (size_t)spacefree(dev));

  50. Advanced Sleeping
          if (dev->wp >= dev->rp)
              count = min(count, (size_t)(dev->end - dev->wp));
          else /* the write pointer has wrapped, fill up to rp - 1 */
              count = min(count, (size_t)(dev->rp - dev->wp - 1));
          if (copy_from_user(dev->wp, buf, count)) {
              up(&dev->sem);
              return -EFAULT;
          }
          dev->wp += count;
          if (dev->wp == dev->end)
              dev->wp = dev->buffer; /* wrapped */
          up(&dev->sem);
          wake_up_interruptible(&dev->inq);
          /* notify asynchronous readers who are waiting */
          if (dev->async_queue)
              kill_fasync(&dev->async_queue, SIGIO, POLL_IN);
          return count;
      }
