Based on kernel version 4.7.2. Page generated on 2016-08-22 22:48 EST.
An introduction to the videobuf layer
Jonathan Corbet <corbet@lwn.net>
Current as of 2.6.33

The videobuf layer functions as a sort of glue layer between a V4L2 driver
and user space. It handles the allocation and management of buffers for
the storage of video frames. There is a set of functions which can be used
to implement many of the standard POSIX I/O system calls, including read(),
poll(), and, happily, mmap(). Another set of functions can be used to
implement the bulk of the V4L2 ioctl() calls related to streaming I/O,
including buffer allocation, queueing and dequeueing, and streaming
control. Using videobuf imposes a few design decisions on the driver
author, but the payback comes in the form of reduced code in the driver and
a consistent implementation of the V4L2 user-space API.

Buffer types

Not all video devices use the same kind of buffers. In fact, there are (at
least) three common variations:

 - Buffers which are scattered in both the physical and (kernel) virtual
   address spaces. (Almost) all user-space buffers are like this, but it
   makes great sense to allocate kernel-space buffers this way as well when
   it is possible. Unfortunately, it is not always possible; working with
   this kind of buffer normally requires hardware which can do
   scatter/gather DMA operations.

 - Buffers which are physically scattered, but which are virtually
   contiguous; buffers allocated with vmalloc(), in other words. These
   buffers are just as hard to use for DMA operations, but they can be
   useful in situations where DMA is not available but virtually-contiguous
   buffers are convenient.

 - Buffers which are physically contiguous. Allocation of this kind of
   buffer can be unreliable on fragmented systems, but simpler DMA
   controllers cannot deal with anything else.

Videobuf can work with all three types of buffers, but the driver author
must pick one at the outset and design the driver around that decision.

[It's worth noting that there's a fourth kind of buffer: "overlay" buffers
which are located within the system's video memory. The overlay
functionality is considered to be deprecated for most use, but it still
shows up occasionally in system-on-chip drivers where the performance
benefits merit the use of this technique. Overlay buffers can be handled
as a form of scattered buffer, but there are very few implementations in
the kernel and a description of this technique is currently beyond the
scope of this document.]

Data structures, callbacks, and initialization

Depending on which type of buffer is being used, the driver should
include one of the following files:

    <media/videobuf-dma-sg.h>        /* Physically scattered */
    <media/videobuf-vmalloc.h>       /* vmalloc() buffers */
    <media/videobuf-dma-contig.h>    /* Physically contiguous */

The driver's data structure describing a V4L2 device should include a
struct videobuf_queue instance for the management of the buffer queue,
along with a list_head for the queue of available buffers. There will also
need to be an interrupt-safe spinlock which is used to protect (at least)
the queue.
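
As a rough illustration of the structure just described, the following
user-space sketch lays out a hypothetical per-device structure with the
three members mentioned: a queue, a list of available buffers, and a lock.
Every name beginning with sketch_ is a stand-in invented here so that the
example compiles outside the kernel; a real driver would use struct
videobuf_queue, struct list_head, and spinlock_t instead, and the kernel's
own list helpers rather than the ones written out below.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct list_head (hypothetical, user space). */
struct sketch_list {
    struct sketch_list *next, *prev;
};

/* Stand-ins for spinlock_t and struct videobuf_queue; contents elided. */
struct sketch_lock { int locked; };
struct sketch_queue { void *priv_data; };

/* Hypothetical per-device structure holding the three members. */
struct sketch_device {
    struct sketch_queue vb_queue;  /* buffer queue managed by videobuf */
    struct sketch_list  buffers;   /* driver's list of available buffers */
    struct sketch_lock  irqlock;   /* protects (at least) the list */
};

/* An empty circular list points back at its own head. */
static void sketch_list_init(struct sketch_list *head)
{
    head->next = head;
    head->prev = head;
}

static int sketch_list_empty(const struct sketch_list *head)
{
    return head->next == head;
}

/* Append at the tail so that buffers leave the list in arrival order. */
static void sketch_list_add_tail(struct sketch_list *item,
                                 struct sketch_list *head)
{
    assert(item != head);
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Put the device into a known state before the queue is used. */
static void sketch_device_init(struct sketch_device *dev)
{
    dev->vb_queue.priv_data = dev;
    dev->irqlock.locked = 0;
    sketch_list_init(&dev->buffers);
}
```

The list head lives in the driver's own structure rather than in the queue
because it is the driver, not videobuf, that decides which buffer the
hardware fills next.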

The next step is to write four simple callbacks to help videobuf deal with
the management of buffers:

    struct videobuf_queue_ops {
        int (*buf_setup)(struct videobuf_queue *q,
                         unsigned int *count, unsigned int *size);
        int (*buf_prepare)(struct videobuf_queue *q,
                           struct videobuf_buffer *vb,
                           enum v4l2_field field);
        void (*buf_queue)(struct videobuf_queue *q,
                          struct videobuf_buffer *vb);
        void (*buf_release)(struct videobuf_queue *q,
                            struct videobuf_buffer *vb);
    };

buf_setup() is called early in the I/O process, when streaming is being
initiated; its purpose is to tell videobuf about the I/O stream. The count
parameter will be a suggested number of buffers to use; the driver should
check it for rationality and adjust it if need be. As a practical rule, a
minimum of two buffers is needed for proper streaming, and there is
usually a maximum (which cannot exceed 32) which makes sense for each
device. The size parameter should be set to the expected (maximum) size
for each frame of data.

Each buffer (in the form of a struct videobuf_buffer pointer) will be
passed to buf_prepare(), which should set the buffer's size, width, height,
and field fields properly. If the buffer's state field is
VIDEOBUF_NEEDS_INIT, the driver should pass it to:

    int videobuf_iolock(struct videobuf_queue *q, struct videobuf_buffer *vb,
                        struct v4l2_framebuffer *fbuf);

Among other things, this call will usually allocate memory for the buffer.
Finally, the buf_prepare() function should set the buffer's state to
VIDEOBUF_PREPARED.

When a buffer is queued for I/O, it is passed to buf_queue(), which should
put it onto the driver's list of available buffers and set its state to
VIDEOBUF_QUEUED. Note that this function is called with the queue spinlock
held; if it tries to acquire it as well things will come to a screeching
halt. Yes, this is the voice of experience.
Note also that videobuf may wait on the first buffer in the queue; placing
other buffers in front of it could again gum up the works. So use
list_add_tail() to enqueue buffers.

Finally, buf_release() is called when a buffer is no longer intended to be
used. The driver should ensure that there is no I/O active on the buffer,
then pass it to the appropriate free routine(s):

    /* Scatter/gather drivers */
    int videobuf_dma_unmap(struct videobuf_queue *q,
                           struct videobuf_dmabuf *dma);
    int videobuf_dma_free(struct videobuf_dmabuf *dma);

    /* vmalloc drivers */
    void videobuf_vmalloc_free(struct videobuf_buffer *buf);

    /* Contiguous drivers */
    void videobuf_dma_contig_free(struct videobuf_queue *q,
                                  struct videobuf_buffer *buf);

One way to ensure that a buffer is no longer under I/O is to pass it to:

    int videobuf_waiton(struct videobuf_buffer *vb, int non_blocking, int intr);

Here, vb is the buffer, non_blocking indicates whether non-blocking I/O
should be used (it should be zero in the buf_release() case), and intr
controls whether an interruptible wait is used.

File operations

At this point, much of the work is done; much of the rest is slipping
videobuf calls into the implementation of the other driver callbacks. The
first step is in the open() function, which must initialize the
videobuf queue.
The function to use depends on the type of buffer used:

    void videobuf_queue_sg_init(struct videobuf_queue *q,
                                struct videobuf_queue_ops *ops,
                                struct device *dev,
                                spinlock_t *irqlock,
                                enum v4l2_buf_type type,
                                enum v4l2_field field,
                                unsigned int msize,
                                void *priv);

    void videobuf_queue_vmalloc_init(struct videobuf_queue *q,
                                struct videobuf_queue_ops *ops,
                                struct device *dev,
                                spinlock_t *irqlock,
                                enum v4l2_buf_type type,
                                enum v4l2_field field,
                                unsigned int msize,
                                void *priv);

    void videobuf_queue_dma_contig_init(struct videobuf_queue *q,
                                struct videobuf_queue_ops *ops,
                                struct device *dev,
                                spinlock_t *irqlock,
                                enum v4l2_buf_type type,
                                enum v4l2_field field,
                                unsigned int msize,
                                void *priv);

In each case, the parameters are the same: q is the queue structure for the
device, ops is the set of callbacks as described above, dev is the device
structure for this video device, irqlock is an interrupt-safe spinlock to
protect access to the data structures, type is the buffer type used by the
device (cameras will use V4L2_BUF_TYPE_VIDEO_CAPTURE, for example), field
describes which field is being captured (often V4L2_FIELD_NONE for
progressive devices), msize is the size of any containing structure used
around struct videobuf_buffer, and priv is a private data pointer which
shows up in the priv_data field of struct videobuf_queue. Note that these
are void functions which, evidently, are immune to failure.

V4L2 capture drivers can be written to support either of two APIs: the
read() system call and the rather more complicated streaming mechanism. As
a general rule, it is necessary to support both to ensure that all
applications have a chance of working with the device. Videobuf makes it
easy to do that with the same code.
To implement read(), the driver need only make a call to one of:

    ssize_t videobuf_read_one(struct videobuf_queue *q,
                              char __user *data, size_t count,
                              loff_t *ppos, int nonblocking);

    ssize_t videobuf_read_stream(struct videobuf_queue *q,
                                 char __user *data, size_t count,
                                 loff_t *ppos, int vbihack, int nonblocking);

Either one of these functions will read frame data into data, returning the
amount actually read; the difference is that videobuf_read_one() will only
read a single frame, while videobuf_read_stream() will read multiple frames
if they are needed to satisfy the count requested by the application. A
typical driver read() implementation will start the capture engine, call
one of the above functions, then stop the engine before returning (though a
smarter implementation might leave the engine running for a little while in
anticipation of another read() call happening in the near future).

The poll() function can usually be implemented with a direct call to:

    unsigned int videobuf_poll_stream(struct file *file,
                                      struct videobuf_queue *q,
                                      poll_table *wait);

Note that the actual wait queue eventually used will be the one associated
with the first available buffer.

When streaming I/O is done to kernel-space buffers, the driver must support
the mmap() system call to enable user space to access the data. In many
V4L2 drivers, the often-complex mmap() implementation simplifies to a
single call to:

    int videobuf_mmap_mapper(struct videobuf_queue *q,
                             struct vm_area_struct *vma);

Everything else is handled by the videobuf code.

The release() function requires two separate videobuf calls:

    void videobuf_stop(struct videobuf_queue *q);
    int videobuf_mmap_free(struct videobuf_queue *q);

The call to videobuf_stop() terminates any I/O in progress - though it is
still up to the driver to stop the capture engine. The call to
videobuf_mmap_free() will ensure that all buffers have been unmapped; if
so, they will all be passed to the buf_release() callback. If buffers
remain mapped, videobuf_mmap_free() returns an error code instead. The
purpose is clearly to cause the closing of the file descriptor to fail if
buffers are still mapped, but every driver in the 2.6.32 kernel cheerfully
ignores its return value.

ioctl() operations

The V4L2 API includes a very long list of driver callbacks to respond to
the many ioctl() commands made available to user space. A number of these
- those associated with streaming I/O - turn almost directly into videobuf
calls. The relevant helper functions are:

    int videobuf_reqbufs(struct videobuf_queue *q,
                         struct v4l2_requestbuffers *req);
    int videobuf_querybuf(struct videobuf_queue *q, struct v4l2_buffer *b);
    int videobuf_qbuf(struct videobuf_queue *q, struct v4l2_buffer *b);
    int videobuf_dqbuf(struct videobuf_queue *q, struct v4l2_buffer *b,
                       int nonblocking);
    int videobuf_streamon(struct videobuf_queue *q);
    int videobuf_streamoff(struct videobuf_queue *q);

So, for example, a VIDIOC_REQBUFS call turns into a call to the driver's
vidioc_reqbufs() callback which, in turn, usually only needs to locate the
proper struct videobuf_queue pointer and pass it to videobuf_reqbufs().
These support functions can replace a great deal of buffer management
boilerplate in a lot of V4L2 drivers.
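
The "locate the queue and pass it on" pattern can be sketched in user space
as follows. This is only an illustration of the shape of such a callback:
the sketch_ types and sketch_helper_reqbufs() are placeholders invented
here to stand in for struct videobuf_queue, struct v4l2_requestbuffers, and
videobuf_reqbufs(), and the helper's behaviour (reject a zero count, record
the rest) is an assumption made for the sake of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder for struct videobuf_queue. */
struct sketch_queue { unsigned int allocated; };

/* Placeholder for struct v4l2_requestbuffers. */
struct sketch_reqbufs { unsigned int count; };

/* Hypothetical per-device structure that embeds the queue. */
struct sketch_device { struct sketch_queue vb_queue; };

/*
 * Placeholder for the videobuf helper. The behaviour is assumed for
 * illustration: refuse a request for zero buffers, otherwise record it.
 */
static int sketch_helper_reqbufs(struct sketch_queue *q,
                                 struct sketch_reqbufs *req)
{
    if (req->count == 0)
        return -EINVAL;
    q->allocated = req->count;
    return 0;
}

/*
 * Shape of a driver ioctl callback: recover the device from the private
 * pointer, find its queue, and hand the request to the helper unchanged.
 */
static int sketch_vidioc_reqbufs(void *priv, struct sketch_reqbufs *req)
{
    struct sketch_device *dev = priv;

    assert(dev != NULL);
    return sketch_helper_reqbufs(&dev->vb_queue, req);
}
```

The callback adds no buffer logic of its own; its return value is simply
whatever the helper reports.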

The vidioc_streamon() and vidioc_streamoff() functions will be a bit more
complex, of course, since they will also need to deal with starting and
stopping the capture engine.

Buffer allocation

Thus far, we have talked about buffers, but have not looked at how they are
allocated. The scatter/gather case is the most complex on this front. For
allocation, the driver can leave buffer allocation entirely up to the
videobuf layer; in this case, buffers will be allocated as anonymous
user-space pages and will be very scattered indeed. If the application is
using user-space buffers, no allocation is needed; the videobuf layer will
take care of calling get_user_pages() and filling in the scatterlist array.

If the driver needs to do its own memory allocation, it should be done in
the vidioc_reqbufs() function, *after* calling videobuf_reqbufs(). The
first step is a call to:

    struct videobuf_dmabuf *videobuf_to_dma(struct videobuf_buffer *buf);

The returned videobuf_dmabuf structure (defined in
<media/videobuf-dma-sg.h>) includes a couple of relevant fields:

    struct scatterlist *sglist;
    int sglen;

The driver must allocate an appropriately-sized scatterlist array and
populate it with pointers to the pieces of the allocated buffer; sglen
should be set to the length of the array.

Drivers using the vmalloc() method need not (and cannot) concern themselves
with buffer allocation at all; videobuf will handle those details. The
same is normally true of contiguous-DMA drivers as well; videobuf will
allocate the buffers (with dma_alloc_coherent()) when it sees fit. That
means that these drivers may be trying to do high-order allocations at any
time, an operation which is not always guaranteed to work. Some drivers
play tricks by allocating DMA space at system boot time; videobuf does not
currently play well with those drivers.

As of 2.6.31, contiguous-DMA drivers can work with a user-supplied buffer,
as long as that buffer is physically contiguous. Normal user-space
allocations will not meet that criterion, but buffers obtained from other
kernel drivers, or those contained within huge pages, will work with these
drivers.

Filling the buffers

The final part of a videobuf implementation has no direct callback - it's
the portion of the code which actually puts frame data into the buffers,
usually in response to interrupts from the device. For all types of
drivers, this process works approximately as follows:

 - Obtain the next available buffer and make sure that somebody is actually
   waiting for it.

 - Get a pointer to the memory and put video data there.

 - Mark the buffer as done and wake up the process waiting for it.

The first step above is done by looking at the driver-managed list_head
structure - the one which is filled in the buf_queue() callback. Because
starting the engine and enqueueing buffers are done in separate steps, it's
possible for the engine to be running without any buffers available - in
the vmalloc() case especially. So the driver should be prepared for the
list to be empty. It is equally possible that nobody is yet interested in
the buffer; the driver should not remove it from the list or fill it until
a process is waiting on it. That test can be done by examining the buffer's
done field (a wait_queue_head_t structure) with waitqueue_active().

A buffer's state should be set to VIDEOBUF_ACTIVE before being mapped for
DMA; that ensures that the videobuf layer will not try to do anything with
it while the device is transferring data.

For scatter/gather drivers, the needed memory pointers will be found in the
scatterlist structure described above.
Drivers using the vmalloc() method can get a memory pointer with:

    void *videobuf_to_vmalloc(struct videobuf_buffer *buf);

For contiguous DMA drivers, the function to use is:

    dma_addr_t videobuf_to_dma_contig(struct videobuf_buffer *buf);

The contiguous DMA API goes out of its way to hide the kernel-space address
of the DMA buffer from drivers.

The final step is to set the size field of the relevant videobuf_buffer
structure to the actual size of the captured image, set state to
VIDEOBUF_DONE, then call wake_up() on the done queue. At this point, the
buffer is owned by the videobuf layer and the driver should not touch it
again.

Developers who are interested in more information can go into the relevant
header files; there are a few low-level functions declared there which have
not been talked about here. Also worthwhile is the vivi driver
(drivers/media/platform/vivi.c), which is maintained as an example of how V4L2
drivers should be written. Vivi only uses the vmalloc() API, but it's good
enough to get started with. Note also that all of these calls are exported
GPL-only, so they will not be available to non-GPL kernel modules.
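
The buffer-filling sequence from the "Filling the buffers" section can be
simulated in user space as a small state machine. This is a sketch under
stated assumptions, not driver code: the sketch_ types and SKETCH_ states
are invented stand-ins for struct videobuf_buffer and the VIDEOBUF_QUEUED,
VIDEOBUF_ACTIVE, and VIDEOBUF_DONE states, the has_waiter flag stands in
for a waitqueue_active() test on the done field, the woken flag stands in
for the effect of wake_up(), and a memcpy() stands in for the device
transfer. A real handler would also hold the queue spinlock while touching
the list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the buffer states named in the text. */
enum sketch_state { SKETCH_QUEUED, SKETCH_ACTIVE, SKETCH_DONE };

struct sketch_buffer {
    struct sketch_buffer *next;  /* simplified stand-in for list_head */
    enum sketch_state state;
    int has_waiter;              /* stand-in for waitqueue_active() */
    int woken;                   /* stand-in for the effect of wake_up() */
    size_t size;                 /* bytes actually captured */
    unsigned char data[64];      /* stand-in for the buffer memory */
};

struct sketch_device {
    struct sketch_buffer *first; /* head of the list of available buffers */
};

/*
 * Called once per captured frame. Returns 1 if a buffer was completed,
 * or 0 if the frame was dropped because no usable buffer was waiting.
 */
static int sketch_frame_ready(struct sketch_device *dev,
                              const unsigned char *frame, size_t len)
{
    struct sketch_buffer *vb = dev->first;

    /* The engine may be running with no buffers queued at all. */
    if (vb == NULL)
        return 0;

    /* Nobody is waiting yet: leave the buffer on the list, untouched. */
    if (!vb->has_waiter)
        return 0;

    assert(vb->state == SKETCH_QUEUED);

    /* Take the buffer off the list and mark it busy before the transfer. */
    dev->first = vb->next;
    vb->next = NULL;
    vb->state = SKETCH_ACTIVE;

    /* Put the video data into the buffer memory, truncating if needed. */
    if (len > sizeof(vb->data))
        len = sizeof(vb->data);
    memcpy(vb->data, frame, len);

    /* Record the real size, mark the buffer done, and wake the waiter. */
    vb->size = len;
    vb->state = SKETCH_DONE;
    vb->woken = 1;
    return 1;
}
```

Both early returns leave the list exactly as it was, which matches the
rule that a buffer is neither removed nor filled until a process is
waiting on it.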