[v2] lib/kmapset: fix undefined behavior in kmapset_cmp()

Submitted by Pavel Butsykin on Dec. 7, 2018, 12:38 p.m.

Details

Message ID 20181207123859.12743-1-pbutsykin@virtuozzo.com
State New
Series "lib/kmapset: fix undefined behavior in kmapset_cmp()"
Headers show

Commit Message

Pavel Butsykin Dec. 7, 2018, 12:38 p.m.
Here we are dealing with undefined behavior when taking the address from lvalue
which doesn't designate an object. The expression '&a->b' is undefined behavior
in the C language in that case if 'a' is NULL pointer. But in most cases this
causes no problems, until compiler starts using an aggressive optimization
policy.

That's what I got after compiling the kernel with gcc9:

[    0.422039] BUG: unable to handle kernel paging request at fffffffffffffff0
[    0.422557] IP: [<ffffffff9134bd9c>] kmapset_hash+0x37c/0x730
[    0.422971] PGD 6a69e067 PUD 6a6a0067 PMD 0
[    0.423276] Oops: 0000 [#1] SMP KASAN
[    0.423546] Modules linked in:
[    0.423782] CPU: 0 PID: 0 Comm: swapper/0 ve: 0 Not tainted 3.10.0-862.20.2.ovz.73.6 #67 73.6
[    0.424341] Hardware name: Virtuozzo KVM, BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[    0.425044] task: ffffffff926a53c0 ti: ffffffff92688000 task.ti: ffffffff92688000
[    0.425528] RIP: 0010:[<ffffffff9134bd9c>]  [<ffffffff9134bd9c>] kmapset_hash+0x37c/0x730
[    0.426072] RSP: 0000:ffffffff9268fae8  EFLAGS: 00010a02
[    0.426416] RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 1ffffffffffffffe
[    0.426876] RDX: 0000000000000000 RSI: 1ffffffffffffffe RDI: fffffffffffffff0
[    0.427339] RBP: ffffffff9268fb48 R08: 0000000000000000 R09: 0000000000000000
[    0.427947] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880067191508
[    0.428466] R13: ffffffffffffffe8 R14: ffff8800671915a0 R15: ffffffffffffffe8
[    0.428937] FS:  0000000000000000(0000) GS:ffff880067600000(0000) knlGS:0000000000000000
[    0.429455] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.429826] CR2: fffffffffffffff0 CR3: 000000006a69a000 CR4: 00000000000606b0
[    0.430296] Call Trace:
[    0.430464]  [<ffffffff9134c529>] kmapset_commit+0x99/0x160
[    0.430828]  [<ffffffff9134c490>] ? kmapset_put+0xf0/0xf0
[    0.431193]  [<ffffffff910bf830>] kernfs_get_ve_perms+0xc0/0x120
[    0.431587]  [<ffffffff910b9b69>] kernfs_add_one+0x289/0x4b0
[    0.431965]  [<ffffffff910b9e87>] kernfs_create_dir_ns+0xf7/0x150
[    0.432358]  [<ffffffff910c2e43>] sysfs_create_dir_ns+0xc3/0x1f0
[    0.432746]  [<ffffffff912ff4dc>] kobject_add_internal+0x1ec/0xa20
[    0.433160]  [<ffffffff91300164>] kobject_add+0x124/0x190
...

It's an attempt to obtain link_a->key when map_a->links.first is NULL.
Disassembly listing showed that this happened because the compiler optimized
condition - "while (&link_a->map_link)". Looks like gcc noticed the check of
map->links.first pointer to NULL in kmapset_hash() above, and decided that if
the pointer isn't NULL then 'while' condition is always true, but if the
pointer is NULL then it's undefined behavior and gcc doesn’t care about it.

The second undefined behavior is pointer overflow, the compiler could assumes
&link_a->map_link cannot be NULL if the member's offset is greater than 0. In
the case when map_a->links.first is NULL, then link_a will be equal to
(NULL - 24) and expression &((struct kmapset_link *)(NULL - 24))->map_link will
refer to NULL. But the problem is that POINTER_PLUS_EXPR in gcc is unsigned
sizetype, and negative pointer (NULL - 24) appear as very large positive, so for
this reason the compiler can also optimize this 'while' condition.

But gcc with -fno-delete-null-pointer-checks and -fno-strict-overflow flags
should not optimize such NULL pointer checks. The kernel uses these flags, so it
turned out to be gcc9 bug - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88367

Let's fix this UB stuff by using hlist_for_each_entry() that safely iterates
hlist with hlist_entry_safe().

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
---
 v2: fix list_entry() to hlist_entry()

 lib/kmapset.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

Patch hide | download patch | download mbox

diff --git a/lib/kmapset.c b/lib/kmapset.c
index e369603b0bca..79261d0d0554 100644
--- a/lib/kmapset.c
+++ b/lib/kmapset.c
@@ -40,18 +40,14 @@  static long kmapset_cmp(struct kmapset_map *map_a, struct kmapset_map *map_b)
 	if (map_a->size != map_b->size)
 		return map_a->size - map_b->size;
 
-	link_a = hlist_entry(map_a->links.first,
-			struct kmapset_link, map_link);
 	link_b = hlist_entry(map_b->links.first,
 			struct kmapset_link, map_link);
-	while (&link_a->map_link) {
+	hlist_for_each_entry(link_a, &map_a->links, map_link) {
 		if (link_a->key != link_b->key)
 			return (long)link_a->key - (long)link_b->key;
 		if (link_a->value != link_b->value)
 			return link_a->value - link_b->value;
-		link_a = list_entry(link_a->map_link.next,
-				struct kmapset_link, map_link);
-		link_b = list_entry(link_b->map_link.next,
+		link_b = hlist_entry(link_b->map_link.next,
 				struct kmapset_link, map_link);
 	}
 

Comments

Kirill Tkhai Dec. 7, 2018, 3:49 p.m.
On 07.12.2018 15:38, Pavel Butsykin wrote:
> Here we are dealing with undefined behavior when taking the address from lvalue
> which doesn't designate an object. The expression '&a->b' is undefined behavior
> in the C language in that case if 'a' is NULL pointer. But in most cases this
> causes no problems, until compiler starts using an aggressive optimization
> policy.
> 
> That's what I got after compiling the kernel with gcc9:
> 
> [    0.422039] BUG: unable to handle kernel paging request at fffffffffffffff0
> [    0.422557] IP: [<ffffffff9134bd9c>] kmapset_hash+0x37c/0x730
> [    0.422971] PGD 6a69e067 PUD 6a6a0067 PMD 0
> [    0.423276] Oops: 0000 [#1] SMP KASAN
> [    0.423546] Modules linked in:
> [    0.423782] CPU: 0 PID: 0 Comm: swapper/0 ve: 0 Not tainted 3.10.0-862.20.2.ovz.73.6 #67 73.6
> [    0.424341] Hardware name: Virtuozzo KVM, BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [    0.425044] task: ffffffff926a53c0 ti: ffffffff92688000 task.ti: ffffffff92688000
> [    0.425528] RIP: 0010:[<ffffffff9134bd9c>]  [<ffffffff9134bd9c>] kmapset_hash+0x37c/0x730
> [    0.426072] RSP: 0000:ffffffff9268fae8  EFLAGS: 00010a02
> [    0.426416] RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 1ffffffffffffffe
> [    0.426876] RDX: 0000000000000000 RSI: 1ffffffffffffffe RDI: fffffffffffffff0
> [    0.427339] RBP: ffffffff9268fb48 R08: 0000000000000000 R09: 0000000000000000
> [    0.427947] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880067191508
> [    0.428466] R13: ffffffffffffffe8 R14: ffff8800671915a0 R15: ffffffffffffffe8
> [    0.428937] FS:  0000000000000000(0000) GS:ffff880067600000(0000) knlGS:0000000000000000
> [    0.429455] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.429826] CR2: fffffffffffffff0 CR3: 000000006a69a000 CR4: 00000000000606b0
> [    0.430296] Call Trace:
> [    0.430464]  [<ffffffff9134c529>] kmapset_commit+0x99/0x160
> [    0.430828]  [<ffffffff9134c490>] ? kmapset_put+0xf0/0xf0
> [    0.431193]  [<ffffffff910bf830>] kernfs_get_ve_perms+0xc0/0x120
> [    0.431587]  [<ffffffff910b9b69>] kernfs_add_one+0x289/0x4b0
> [    0.431965]  [<ffffffff910b9e87>] kernfs_create_dir_ns+0xf7/0x150
> [    0.432358]  [<ffffffff910c2e43>] sysfs_create_dir_ns+0xc3/0x1f0
> [    0.432746]  [<ffffffff912ff4dc>] kobject_add_internal+0x1ec/0xa20
> [    0.433160]  [<ffffffff91300164>] kobject_add+0x124/0x190
> ...
> 
> It's an attempt to obtain link_a->key when map_a->links.first is NULL.
> Disassembly listing showed that this happened because the compiler optimized
> condition - "while (&link_a->map_link)". Looks like gcc noticed the check of
> map->links.first pointer to NULL in kmapset_hash() above, and decided that if
> the pointer isn't NULL then 'while' condition is always true, but if the
> pointer is NULL then it's undefined behavior and gcc doesn’t care about it.
> 
> The second undefined behavior is pointer overflow, the compiler could assumes
> &link_a->map_link cannot be NULL if the member's offset is greater than 0. In
> the case when map_a->links.first is NULL, then link_a will be equal to
> (NULL - 24) and expression &((struct kmapset_link *)(NULL - 24))->map_link will
> refer to NULL. But the problem is that POINTER_PLUS_EXPR in gcc is unsigned
> sizetype, and negative pointer (NULL - 24) appear as very large positive, so for
> this reason the compiler can also optimize this 'while' condition.
> 
> But gcc with -fno-delete-null-pointer-checks and -fno-strict-overflow flags
> should not optimize such NULL pointer checks. The kernel uses these flags, so it
> turned out to be gcc9 bug - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88367
> 
> Let's fix this UB stuff by using hlist_for_each_entry() that safely iterates
> hlist with hlist_entry_safe().
> 
> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>

Looks OK

> ---
>  v2: fix list_entry() to hlist_entry()
> 
>  lib/kmapset.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/kmapset.c b/lib/kmapset.c
> index e369603b0bca..79261d0d0554 100644
> --- a/lib/kmapset.c
> +++ b/lib/kmapset.c
> @@ -40,18 +40,14 @@ static long kmapset_cmp(struct kmapset_map *map_a, struct kmapset_map *map_b)
>  	if (map_a->size != map_b->size)
>  		return map_a->size - map_b->size;
>  
> -	link_a = hlist_entry(map_a->links.first,
> -			struct kmapset_link, map_link);
>  	link_b = hlist_entry(map_b->links.first,
>  			struct kmapset_link, map_link);
> -	while (&link_a->map_link) {
> +	hlist_for_each_entry(link_a, &map_a->links, map_link) {
>  		if (link_a->key != link_b->key)
>  			return (long)link_a->key - (long)link_b->key;
>  		if (link_a->value != link_b->value)
>  			return link_a->value - link_b->value;
> -		link_a = list_entry(link_a->map_link.next,
> -				struct kmapset_link, map_link);
> -		link_b = list_entry(link_b->map_link.next,
> +		link_b = hlist_entry(link_b->map_link.next,
>  				struct kmapset_link, map_link);
>  	}
>  
>