correct the operand specifiers in the riscv64 CAS routines

Submitted by Palmer Dabbelt on Sept. 25, 2019, 3:30 a.m.

Details

Message ID 20190925033015.9905-1-palmer@sifive.com
State New
Series "correct the operand specifiers in the riscv64 CAS routines"
Headers show

Commit Message

Palmer Dabbelt Sept. 25, 2019, 3:30 a.m.
The operand sepcifiers in a_cas and a_casp for riscv64 were incorrect:
there's a backwards branch in the routine, so despite tmp being written
at the end of the assembly fragment it cannot be allocated in one of the
input registers because the input values may be needed for another trip
around the loop.

For code that follows the guarnteed forward progress requirements, he
backwards branch is rarely taken: SiFive's hardware only fails a store
conditional on execptional cases (ie, instruction cache misses inside
the loop), and until recently a bug in QEMU allowed back-to-back
store conditionals to succeed.  The bug has been fixed in the latest
QEMU release, but it turns out that the fix caused this latent bug in
musl to manifest.

Full disclosure: I haven't actually even compiled musl.  I just guessed
this would fix a bug introducted by the new QEMU behavior, Alistair
(CC'd) actually checked it fixes the problem.  The rest is just
conjecture.
---
 arch/riscv64/atomic_arch.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Patch hide | download patch | download mbox

diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
index c976534284aa..41ad4d04907c 100644
--- a/arch/riscv64/atomic_arch.h
+++ b/arch/riscv64/atomic_arch.h
@@ -14,7 +14,7 @@  static inline int a_cas(volatile int *p, int t, int s)
 		"	sc.w.aqrl %1, %4, (%2)\n"
 		"	bnez %1, 1b\n"
 		"1:"
-		: "=&r"(old), "=r"(tmp)
+		: "=&r"(old), "=&r"(tmp)
 		: "r"(p), "r"(t), "r"(s)
 		: "memory");
 	return old;
@@ -31,7 +31,7 @@  static inline void *a_cas_p(volatile void *p, void *t, void *s)
 		"	sc.d.aqrl %1, %4, (%2)\n"
 		"	bnez %1, 1b\n"
 		"1:"
-		: "=&r"(old), "=r"(tmp)
+		: "=&r"(old), "=&r"(tmp)
 		: "r"(p), "r"(t), "r"(s)
 		: "memory");
 	return old;

Comments

Khem Raj Sept. 25, 2019, 6:43 a.m.
On 9/24/19 8:30 PM, Palmer Dabbelt wrote:
> The operand sepcifiers in a_cas and a_casp for riscv64 were incorrect:
> there's a backwards branch in the routine, so despite tmp being written
> at the end of the assembly fragment it cannot be allocated in one of the
> input registers because the input values may be needed for another trip
> around the loop.
>
> For code that follows the guarnteed forward progress requirements, he
> backwards branch is rarely taken: SiFive's hardware only fails a store
> conditional on execptional cases (ie, instruction cache misses inside
> the loop), and until recently a bug in QEMU allowed back-to-back
> store conditionals to succeed.  The bug has been fixed in the latest
> QEMU release, but it turns out that the fix caused this latent bug in
> musl to manifest.
>
> Full disclosure: I haven't actually even compiled musl.  I just guessed
> this would fix a bug introducted by the new QEMU behavior, Alistair
> (CC'd) actually checked it fixes the problem.  The rest is just
> conjecture.

thanks Palmer, I tested this fix with OpenEmbedded and it indeed fixes
the problem I have tested mimimal image built with with both gcc and
clang on qemu 4.1 Tested-By: Khem Raj <raj.khem@gmail.com>

> ---
>  arch/riscv64/atomic_arch.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
> index c976534284aa..41ad4d04907c 100644
> --- a/arch/riscv64/atomic_arch.h
> +++ b/arch/riscv64/atomic_arch.h
> @@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int s)
>  		"	sc.w.aqrl %1, %4, (%2)\n"
>  		"	bnez %1, 1b\n"
>  		"1:"
> -		: "=&r"(old), "=r"(tmp)
> +		: "=&r"(old), "=&r"(tmp)
>  		: "r"(p), "r"(t), "r"(s)
>  		: "memory");
>  	return old;
> @@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
>  		"	sc.d.aqrl %1, %4, (%2)\n"
>  		"	bnez %1, 1b\n"
>  		"1:"
> -		: "=&r"(old), "=r"(tmp)
> +		: "=&r"(old), "=&r"(tmp)
>  		: "r"(p), "r"(t), "r"(s)
>  		: "memory");
>  	return old;
Alistair Francis Sept. 25, 2019, 11:38 p.m.
On Tue, 2019-09-24 at 20:30 -0700, Palmer Dabbelt wrote:
> The operand sepcifiers in a_cas and a_casp for riscv64 were

> incorrect:

> there's a backwards branch in the routine, so despite tmp being

> written

> at the end of the assembly fragment it cannot be allocated in one of

> the

> input registers because the input values may be needed for another

> trip

> around the loop.

> 

> For code that follows the guarnteed forward progress requirements, he

> backwards branch is rarely taken: SiFive's hardware only fails a

> store

> conditional on execptional cases (ie, instruction cache misses inside

> the loop), and until recently a bug in QEMU allowed back-to-back

> store conditionals to succeed.  The bug has been fixed in the latest

> QEMU release, but it turns out that the fix caused this latent bug in

> musl to manifest.

> 

> Full disclosure: I haven't actually even compiled musl.  I just

> guessed

> this would fix a bug introducted by the new QEMU behavior, Alistair

> (CC'd) actually checked it fixes the problem.  The rest is just

> conjecture.


Reviewed-by: Alistair Francis <alistair.francis@wdc.com>

Tested-by: Alistair Francis <alistair.francis@wdc.com>


> ---

>  arch/riscv64/atomic_arch.h | 4 ++--

>  1 file changed, 2 insertions(+), 2 deletions(-)

> 

> diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h

> index c976534284aa..41ad4d04907c 100644

> --- a/arch/riscv64/atomic_arch.h

> +++ b/arch/riscv64/atomic_arch.h

> @@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int

> s)

>  		"	sc.w.aqrl %1, %4, (%2)\n"

>  		"	bnez %1, 1b\n"

>  		"1:"

> -		: "=&r"(old), "=r"(tmp)

> +		: "=&r"(old), "=&r"(tmp)

>  		: "r"(p), "r"(t), "r"(s)

>  		: "memory");

>  	return old;

> @@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void

> *t, void *s)

>  		"	sc.d.aqrl %1, %4, (%2)\n"

>  		"	bnez %1, 1b\n"

>  		"1:"

> -		: "=&r"(old), "=r"(tmp)

> +		: "=&r"(old), "=&r"(tmp)

>  		: "r"(p), "r"(t), "r"(s)

>  		: "memory");

>  	return old;