From: Lai Jiangshan Date: Fri, 18 Mar 2011 03:15:47 +0000 (+0800) Subject: rcu: introduce kfree_rcu() X-Git-Tag: v3.0-rc1~392^2~49 X-Git-Url: https://openfabrics.org/gitweb/?a=commitdiff_plain;h=9ab1544eb4196ca8d05c433b2eb56f74496b1ee3;p=~emulex%2Finfiniband.git rcu: introduce kfree_rcu() Many rcu callbacks functions just call kfree() on the base structure. These functions are trivial, but their size adds up, and furthermore when they are used in a kernel module, that module must invoke the high-latency rcu_barrier() function at module-unload time. The kfree_rcu() function introduced by this commit addresses this issue. Rather than encoding a function address in the embedded rcu_head structure, kfree_rcu() instead encodes the offset of the rcu_head structure within the base structure. Because the functions are not allowed in the low-order 4096 bytes of kernel virtual memory, offsets up to 4095 bytes can be accommodated. If the offset is larger than 4095 bytes, a compile-time error will be generated in __kfree_rcu(). If this error is triggered, you can either fall back to use of call_rcu() or rearrange the structure to position the rcu_head structure into the first 4096 bytes. Note that the allowable offset might decrease in the future, for example, to allow something like kmem_cache_free_rcu(). The new kfree_rcu() function can replace code as follows: call_rcu(&p->rcu, simple_kfree_callback); where "simple_kfree_callback()" might be defined as follows: void simple_kfree_callback(struct rcu_head *p) { struct foo *q = container_of(p, struct foo, rcu); kfree(q); } with the following: kfree_rcu(&p->rcu, rcu); Note that the "rcu" is the name of a field in the structure being freed. The reason for using this rather than passing in a pointer to the base structure is that the above approach allows better type checking. This commit is based on earlier work by Lai Jiangshan and Manfred Spraul: Lai's V1 patch: http://lkml.org/lkml/2008/9/18/1 Manfred's patch: http://lkml.org/lkml/2009/1/2/115 Signed-off-by: Lai Jiangshan Signed-off-by: Manfred Spraul Signed-off-by: Paul E. McKenney Reviewed-by: David Howells Reviewed-by: Josh Triplett --- diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index c7aeacf7fc9..99f9aa7c280 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -809,4 +809,60 @@ static inline void debug_rcu_head_unqueue(struct rcu_head *head) } #endif /* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */ +static __always_inline bool __is_kfree_rcu_offset(unsigned long offset) +{ + return offset < 4096; +} + +static __always_inline +void __kfree_rcu(struct rcu_head *head, unsigned long offset) +{ + typedef void (*rcu_callback)(struct rcu_head *); + + BUILD_BUG_ON(!__builtin_constant_p(offset)); + + /* See the kfree_rcu() header comment. */ + BUILD_BUG_ON(!__is_kfree_rcu_offset(offset)); + + call_rcu(head, (rcu_callback)offset); +} + +extern void kfree(const void *); + +static inline void __rcu_reclaim(struct rcu_head *head) +{ + unsigned long offset = (unsigned long)head->func; + + if (__is_kfree_rcu_offset(offset)) + kfree((void *)head - offset); + else + head->func(head); +} + +/** + * kfree_rcu() - kfree an object after a grace period. + * @ptr: pointer to kfree + * @rcu_head: the name of the struct rcu_head within the type of @ptr. + * + * Many rcu callbacks functions just call kfree() on the base structure. + * These functions are trivial, but their size adds up, and furthermore + * when they are used in a kernel module, that module must invoke the + * high-latency rcu_barrier() function at module-unload time. + * + * The kfree_rcu() function handles this issue. Rather than encoding a + * function address in the embedded rcu_head structure, kfree_rcu() instead + * encodes the offset of the rcu_head structure within the base structure. + * Because the functions are not allowed in the low-order 4096 bytes of + * kernel virtual memory, offsets up to 4095 bytes can be accommodated. + * If the offset is larger than 4095 bytes, a compile-time error will + * be generated in __kfree_rcu(). If this error is triggered, you can + * either fall back to use of call_rcu() or rearrange the structure to + * position the rcu_head structure into the first 4096 bytes. + * + * Note that the allowable offset might decrease in the future, for example, + * to allow something like kmem_cache_free_rcu(). + */ +#define kfree_rcu(ptr, rcu_head) \ + __kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head)) + #endif /* __LINUX_RCUPDATE_H */ diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c index 0c343b9a46d..4d60fbc9c64 100644 --- a/kernel/rcutiny.c +++ b/kernel/rcutiny.c @@ -167,7 +167,7 @@ static void rcu_process_callbacks(struct rcu_ctrlblk *rcp) prefetch(next); debug_rcu_head_unqueue(list); local_bh_disable(); - list->func(list); + __rcu_reclaim(list); local_bh_enable(); list = next; RCU_TRACE(cb_count++); diff --git a/kernel/rcutree.c b/kernel/rcutree.c index b579e4f9721..2c07adb9708 100644 --- a/kernel/rcutree.c +++ b/kernel/rcutree.c @@ -1206,7 +1206,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp) next = list->next; prefetch(next); debug_rcu_head_unqueue(list); - list->func(list); + __rcu_reclaim(list); list = next; if (++count >= rdp->blimit) break;