mips: add single-instruction math functions

Submitted by info@mobile-stream.com on Sept. 11, 2019, 10:05 a.m.

Details

Message ID 20190911103224.504A15C44C@mx7.valuehost.ru
State New
Series "mips: add single-instruction math functions"

Commit Message

info@mobile-stream.com Sept. 11, 2019, 10:05 a.m.
non-commit text:
gcc puts an annoying nop into the delay slot for these functions, e.g.:
	abs.d	$f0,$f12
	jr	$ra
	 nop
Is there any way to get rid of this without using pure .S?



SQRT.fmt exists on MIPS II+ (float), MIPS III+ (double).

ABS.fmt exists on MIPS I+, but only cores with the ABS2008 flag set in FCSR
implement the required behaviour.
---
 src/math/mips/fabs.c  | 16 ++++++++++++++++
 src/math/mips/fabsf.c | 16 ++++++++++++++++
 src/math/mips/sqrt.c  | 16 ++++++++++++++++
 src/math/mips/sqrtf.c | 16 ++++++++++++++++
 4 files changed, 64 insertions(+)
 create mode 100644 src/math/mips/fabs.c
 create mode 100644 src/math/mips/fabsf.c
 create mode 100644 src/math/mips/sqrt.c
 create mode 100644 src/math/mips/sqrtf.c

Patch

diff --git a/src/math/mips/fabs.c b/src/math/mips/fabs.c
new file mode 100644
index 00000000..0a5aa3b1
--- /dev/null
+++ b/src/math/mips/fabs.c
@@ -0,0 +1,16 @@ 
+#if !defined(__mips_soft_float) && defined(__mips_abs2008)
+
+#include <math.h>
+
+double fabs(double x)
+{
+	double r;
+	__asm__("abs.d %0,%1" : "=f"(r) : "f"(x));
+	return r;
+}
+
+#else
+
+#include "../fabs.c"
+
+#endif
diff --git a/src/math/mips/fabsf.c b/src/math/mips/fabsf.c
new file mode 100644
index 00000000..35307be6
--- /dev/null
+++ b/src/math/mips/fabsf.c
@@ -0,0 +1,16 @@ 
+#if !defined(__mips_soft_float) && defined(__mips_abs2008)
+
+#include <math.h>
+
+float fabsf(float x)
+{
+	float r;
+	__asm__("abs.s %0,%1" : "=f"(r) : "f"(x));
+	return r;
+}
+
+#else
+
+#include "../fabsf.c"
+
+#endif
diff --git a/src/math/mips/sqrt.c b/src/math/mips/sqrt.c
new file mode 100644
index 00000000..595c9dbc
--- /dev/null
+++ b/src/math/mips/sqrt.c
@@ -0,0 +1,16 @@ 
+#if !defined(__mips_soft_float) && __mips >= 3
+
+#include <math.h>
+
+double sqrt(double x)
+{
+	double r;
+	__asm__("sqrt.d %0,%1" : "=f"(r) : "f"(x));
+	return r;
+}
+
+#else
+
+#include "../sqrt.c"
+
+#endif
diff --git a/src/math/mips/sqrtf.c b/src/math/mips/sqrtf.c
new file mode 100644
index 00000000..84090d2d
--- /dev/null
+++ b/src/math/mips/sqrtf.c
@@ -0,0 +1,16 @@ 
+#if !defined(__mips_soft_float) && __mips >= 2
+
+#include <math.h>
+
+float sqrtf(float x)
+{
+	float r;
+	__asm__("sqrt.s %0,%1" : "=f"(r) : "f"(x));
+	return r;
+}
+
+#else
+
+#include "../sqrtf.c"
+
+#endif

Comments

Rich Felker Sept. 11, 2019, 11:46 a.m.
On Wed, Sep 11, 2019 at 01:05:04PM +0300, info@mobile-stream.com wrote:
> 
> non-commit text:
> gcc puts annoying nop into the delay slot for these functions, e.g.:
> 	abs.d	$f0,$f12
> 	jr	$ra
> 	 nop
> is there any way to get rid of this without using pure .S?

I think you don't want to get rid of it anyway, since if FPU emulation
is in use, emulation of floating point instructions in branch delay
slots is really problematic and requires nasty hacks with executable
stacks and whatnot. It would be nice if we could tell GCC not to put
the fpu instructions it generates in branch delay slots either, but I
don't know a way to do that.

Rich
Rich Felker Sept. 12, 2019, 5:57 p.m.
On Wed, Sep 11, 2019 at 01:05:04PM +0300, info@mobile-stream.com wrote:
> 
> non-commit text:
> gcc puts annoying nop into the delay slot for these functions, e.g.:
> 	abs.d	$f0,$f12
> 	jr	$ra
> 	 nop
> is there any way to get rid of this without using pure .S?
> 
> 
> 
> SQRT.fmt exists on MIPS II+ (float), MIPS III+ (double).
> 
> ABS.fmt exists on MIPS I+ but only cores with ABS2008 flag in FCSR
> implement the required behaviour.

One other thing: can you confirm/cite that the mips sqrt instructions
are correctly-rounded? I would assume so but this needs checking since
there have been some low-end fpus that do things wrong (like early
pre-vfp ARM ones).

Rich
info@mobile-stream.com Sept. 13, 2019, 11:46 a.m.
R4000UM, MIPS-IV (pre-MTI), MIPS32, MIPS64 specs say sqrt is rounded according to the current rounding mode in FCSR.


R> One other thing: can you confirm/cite that the mips sqrt instructions
R> are correctly-rounded? I would assume so but this needs checking since
R> there have been some low-end fpus that do things wrong (like early
R> pre-vfp ARM ones).
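
For readers following the rounding question, here is a minimal sketch of how a single-precision sqrt result could be spot-checked for correct rounding in the default round-to-nearest mode, without trusting a second sqrt implementation. All function names are hypothetical, and the check assumes a positive, finite, normal input and result. It illustrates what "correctly rounded" means; it is not a test that anyone in this thread reports having run.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret between float and its bit pattern. */
float float_from_bits(uint32_t b)
{
	float f;
	memcpy(&f, &b, sizeof f);
	return f;
}

uint32_t bits_from_float(float f)
{
	uint32_t b;
	memcpy(&b, &f, sizeof b);
	return b;
}

/* Returns 1 if r is the float nearest to the exact square root of x.
 * Assumes x and r are positive, finite and normal. */
int sqrtf_result_is_nearest(float x, float r)
{
	uint32_t rb = bits_from_float(r);
	double below = float_from_bits(rb - 1); /* neighbour toward zero */
	double above = float_from_bits(rb + 1); /* neighbour away from zero */

	/* The midpoints between neighbouring floats need at most 25
	 * significant bits, so their squares (at most 50 bits) are exact
	 * in double and the comparisons below involve no rounding. */
	double lo_mid = (r + below) / 2;
	double hi_mid = (r + above) / 2;

	return lo_mid * lo_mid < x && x < hi_mid * hi_mid;
}
```

On a hard-float MIPS target one would pass the output of the sqrt.s-based sqrtf as r; for other rounding modes the acceptance interval would have to change accordingly.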
Rich Felker Sept. 13, 2019, 6:31 p.m.
On Wed, Sep 11, 2019 at 01:05:04PM +0300, info@mobile-stream.com wrote:
> 
> non-commit text:
> gcc puts annoying nop into the delay slot for these functions, e.g.:
> 	abs.d	$f0,$f12
> 	jr	$ra
> 	 nop
> is there any way to get rid of this without using pure .S?
> 
> 
> 
> SQRT.fmt exists on MIPS II+ (float), MIPS III+ (double).
> 
> ABS.fmt exists on MIPS I+ but only cores with ABS2008 flag in FCSR
> implement the required behaviour.
> ---
>  src/math/mips/fabs.c  | 16 ++++++++++++++++
>  src/math/mips/fabsf.c | 16 ++++++++++++++++
>  src/math/mips/sqrt.c  | 16 ++++++++++++++++
>  src/math/mips/sqrtf.c | 16 ++++++++++++++++
>  4 files changed, 64 insertions(+)
>  create mode 100644 src/math/mips/fabs.c
>  create mode 100644 src/math/mips/fabsf.c
>  create mode 100644 src/math/mips/sqrt.c
>  create mode 100644 src/math/mips/sqrtf.c
> 
> diff --git a/src/math/mips/fabs.c b/src/math/mips/fabs.c
> new file mode 100644
> index 00000000..0a5aa3b1
> --- /dev/null
> +++ b/src/math/mips/fabs.c
> @@ -0,0 +1,16 @@
> +#if !defined(__mips_soft_float) && defined(__mips_abs2008)

Why is this dependent on __mips_abs2008?

Rich
Rich Felker Sept. 13, 2019, 6:56 p.m.
On Fri, Sep 13, 2019 at 02:31:23PM -0400, Rich Felker wrote:
> On Wed, Sep 11, 2019 at 01:05:04PM +0300, info@mobile-stream.com wrote:
> > 
> > non-commit text:
> > gcc puts annoying nop into the delay slot for these functions, e.g.:
> > 	abs.d	$f0,$f12
> > 	jr	$ra
> > 	 nop
> > is there any way to get rid of this without using pure .S?
> > 
> > 
> > 
> > SQRT.fmt exists on MIPS II+ (float), MIPS III+ (double).
> > 
> > ABS.fmt exists on MIPS I+ but only cores with ABS2008 flag in FCSR
> > implement the required behaviour.
> > ---
> >  src/math/mips/fabs.c  | 16 ++++++++++++++++
> >  src/math/mips/fabsf.c | 16 ++++++++++++++++
> >  src/math/mips/sqrt.c  | 16 ++++++++++++++++
> >  src/math/mips/sqrtf.c | 16 ++++++++++++++++
> >  4 files changed, 64 insertions(+)
> >  create mode 100644 src/math/mips/fabs.c
> >  create mode 100644 src/math/mips/fabsf.c
> >  create mode 100644 src/math/mips/sqrt.c
> >  create mode 100644 src/math/mips/sqrtf.c
> > 
> > diff --git a/src/math/mips/fabs.c b/src/math/mips/fabs.c
> > new file mode 100644
> > index 00000000..0a5aa3b1
> > --- /dev/null
> > +++ b/src/math/mips/fabs.c
> > @@ -0,0 +1,16 @@
> > +#if !defined(__mips_soft_float) && defined(__mips_abs2008)
> 
> Why is this dependent on __mips_abs2008?

OK, I see. The macro isn't well-documented, but it corresponds to the
gcc -mabs=2008 option indicating an ISA variant that treats the
abs.[sd] instruction as bitwise rather than arithmetic. I'm somewhat
unclear on whether we should do anything like this; it's one of the
places where MIPS blurred the distinction between ISA levels and
incompatible ABIs, and seems to be able to produce code that would
silently run on the wrong type of cpu and produce the wrong result (or
maybe this is a runtime-switchable cpu mode?)

I'm also unclear though whether fabs() actually needs to behave as
bitwise. Neither C nor Annex F nor POSIX seems to specify its behavior
for NANs except that a NAN is returned. Am I missing something? If
not, it seems the 'legacy' abs.[sd] instructions would also be fine
for implementing this, and then no conditional is needed.

Rich
James Y Knight Sept. 13, 2019, 7:12 p.m.
I've no idea about the issues with MIPS' instruction set, but yes, to be
correct, abs does need to affect only the sign bit, leaving everything else
the same.

It's specified as such in IEEE 754-2008: 5.5.1 Sign bit operations --
"Implementations shall provide the following homogeneous
quiet-computational sign bit operations for all supported arithmetic
formats; they only affect the sign bit. The operations treat floating-point
numbers and NaNs alike, and signal no exception. These operations may
propagate non-canonical encodings."
...
"abs(x) copies a floating-point operand x to a destination in the same
format, setting the sign bit to 0 (positive)."

(That section also defines the copy, negate, and copySign operations)
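
As an illustration of the quoted requirement, a minimal portable sketch of an abs that touches only the sign bit is shown here. The function names are hypothetical, and the code assumes the usual 64-bit IEEE 754 layout for double; it is meant to show the behaviour being discussed, not any particular libc's implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for moving between double and its bit pattern. */
uint64_t bits_of(double x)
{
	uint64_t bits;
	memcpy(&bits, &x, sizeof bits);
	return bits;
}

double double_of(uint64_t bits)
{
	double x;
	memcpy(&x, &bits, sizeof x);
	return x;
}

/* Clears the sign bit and leaves exponent, mantissa and any NaN
 * payload exactly as they were. */
double abs_sign_bit_only(double x)
{
	return double_of(bits_of(x) & ~(UINT64_C(1) << 63));
}
```

A negative quiet NaN such as 0xFFF8000000001234 comes back as 0x7FF8000000001234 with its payload unchanged, which is the case where an arithmetic abs can behave differently.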


Rich Felker Sept. 13, 2019, 8:01 p.m.
On Fri, Sep 13, 2019 at 03:12:11PM -0400, James Y Knight wrote:
> I've no idea about the issues with MIPS' instruction set, but yes, to be
> correct, abs does need to affect only the sign bit, leaving everything else
> the same.
> 
> It's specified as such in IEEE 754-2008: 5.5.1 Sign bit operations --
> "Implementations shall provide the following homogeneous
> quiet-computational sign bit operations for all supported arithmetic
> formats; they only affect the sign bit. The operations treat floating-point
> numbers and NaNs alike, and signal no exception. These operations may
> propagate non-canonical encodings."
> ....
> "abs(x) copies a floating-point operand x to a destination in the same
> format, setting the sign bit to 0 (positive)."
> 
> (That section also defines the copy, negate, and copySign operations)

OK, the text I was missing was:

"The fabs functions in <math.h> provide the abs function recommended
in the Appendix to IEC 60559."

from F.3 ¶1.

Rich

Rich Felker Sept. 13, 2019, 8:23 p.m.
On Fri, Sep 13, 2019 at 04:01:41PM -0400, Rich Felker wrote:
> On Fri, Sep 13, 2019 at 03:12:11PM -0400, James Y Knight wrote:
> > I've no idea about the issues with MIPS' instruction set, but yes, to be
> > correct, abs does need to affect only the sign bit, leaving everything else
> > the same.
> > 
> > It's specified as such in IEEE 754-2008: 5.5.1 Sign bit operations --
> > "Implementations shall provide the following homogeneous
> > quiet-computational sign bit operations for all supported arithmetic
> > formats; they only affect the sign bit. The operations treat floating-point
> > numbers and NaNs alike, and signal no exception. These operations may
> > propagate non-canonical encodings."
> > ....
> > "abs(x) copies a floating-point operand x to a destination in the same
> > format, setting the sign bit to 0 (positive)."
> > 
> > (That section also defines the copy, negate, and copySign operations)
> 
> OK, the text I was missing was:
> 
> "The fabs functions in <math.h> provide the abs function recommended
> in the Appendix to IEC 60559."
> 
> from F.3 ¶1.
> 

Anyway, back to the patch review, I think the fabs[f] part should be
held back contingent on really understanding the ABI issue and whether
we can treat this as an ISA level and make it safe, or whether this is
really an incompatible ISA/ABI variant.

(And in any case, the caller will inline the insn and never call the
external fabs function if you're using the right CFLAGS...)
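
To illustrate the parenthetical: GCC and Clang normally treat fabs() as a builtin and expand it inline in the caller, so the libc symbol is not reached. A minimal sketch follows, using the `__builtin_fabs` spelling directly so that it does not depend on <math.h>. The wrapper name is hypothetical, and which instruction the compiler picks on MIPS (abs.d or an integer sign-bit mask) depends on target flags that this sketch does not assume.

```c
/* Hypothetical caller. __builtin_fabs is a GCC/Clang builtin and is
 * expanded inline rather than emitted as a call into libc. */
double magnitude(double x)
{
	return __builtin_fabs(x);
}
```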

The sqrt part is probably okay for merge in the next release cycle.

Rich
info@mobile-stream.com Sept. 18, 2019, 1:07 p.m.
>> diff --git a/src/math/mips/fabs.c b/src/math/mips/fabs.c
>> new file mode 100644
>> index 00000000..0a5aa3b1
>> --- /dev/null
>> +++ b/src/math/mips/fabs.c
>> @@ -0,0 +1,16 @@
>> +#if !defined(__mips_soft_float) && defined(__mips_abs2008)

R> Why is this dependent on __mips_abs2008?


ABS.fmt is non-arithmetic only on cores with the corresponding FCSR bit hardwired to 1. These are R6 cores (always) and some R5 cores (such as P5600, M5150).
Rich Felker Sept. 18, 2019, 1:49 p.m.
On Wed, Sep 18, 2019 at 04:07:33PM +0300, info@mobile-stream.com wrote:
> >> diff --git a/src/math/mips/fabs.c b/src/math/mips/fabs.c
> >> new file mode 100644
> >> index 00000000..0a5aa3b1
> >> --- /dev/null
> >> +++ b/src/math/mips/fabs.c
> >> @@ -0,0 +1,16 @@
> >> +#if !defined(__mips_soft_float) && defined(__mips_abs2008)
> 
> R> Why is this dependent on __mips_abs2008?
> 
> 
> ABS.fmt is non-arithmetic only on cores with the corresponding FCSR
> bit hardwired to 1. It is R6 (always) and some R5 cores (such as
> P5600, M5150).

OK, in that case it may be okay to condition it on R6 (which is a
different ISA/ABI), as long as that's actually a hard requirement of
the R6 ISA.
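
A minimal sketch of what conditioning on R6 could look like, assuming (as stated earlier in the thread, and not verified here) that R6 always implements the non-arithmetic ABS.fmt. The function name is hypothetical; `__mips_isa_rev` is the release macro GCC predefines for MIPS32/MIPS64 targets, and the fallback branch is a portable sign-bit mask so the sketch also builds on non-MIPS hosts.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: one-instruction path on hard-float MIPS R6,
 * portable sign-bit mask everywhere else. */
double fabs_r6_sketch(double x)
{
#if defined(__mips_isa_rev) && __mips_isa_rev >= 6 && !defined(__mips_soft_float)
	double r;
	__asm__("abs.d %0,%1" : "=f"(r) : "f"(x));
	return r;
#else
	uint64_t bits;
	memcpy(&bits, &x, sizeof bits);
	bits &= ~(UINT64_C(1) << 63); /* clear only the sign bit */
	memcpy(&x, &bits, sizeof x);
	return x;
#endif
}
```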

Rich
info@mobile-stream.com Sept. 18, 2019, 5:18 p.m.
R> Why is this dependent on __mips_abs2008?

There is also __mips_nan2008 (always set for hard-float R6 and -mnan=2008).

Binaries built with this option (implicit or not) are unusable on a -mnan=legacy system; this is enforced by the kernel (unless booted with some debugging option).

The fabs code could be changed to also depend on __mips_nan2008 (since these ISA features are paired) to prevent running a -mabs=2008 musl on a -mabs=legacy system (a rather unrealistic scenario).
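
The stricter condition suggested above can be sketched as a preprocessor guard that requires both macros. The `FABS_USE_ABS_INSN` macro and the reporting function are hypothetical names introduced only for illustration; `__mips_soft_float`, `__mips_abs2008` and `__mips_nan2008` are the compiler macros already mentioned in this thread.

```c
/* 1 if this translation unit would take the abs.[sd] path under the
 * stricter condition, 0 if it would fall back to the generic code. */
#if !defined(__mips_soft_float) && defined(__mips_abs2008) && defined(__mips_nan2008)
#define FABS_USE_ABS_INSN 1
#else
#define FABS_USE_ABS_INSN 0
#endif

/* Hypothetical helper that reports which path was selected at build time. */
int fabs_uses_abs_insn(void)
{
	return FABS_USE_ABS_INSN;
}
```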

Why is it wrong to depend on fine-grained ISA features after all?
Why would it be wrong to explicitly depend on, e.g., __mips_dsp in the strchr code, for fear of improper usage on a system without the DSP ASE?

powerpc64 and s390x have similar ifdefs in their math code and, IIUC, nothing prevents running (until SIGILL) a statically-linked _ARCH_PWR5X binary on an _ARCH_PWR5 system.

Or some powerpc64 code depends on __VSX__. Is it wrong to depend on __mips_msa?

What is different with mips here?
Rich Felker Sept. 18, 2019, 8:04 p.m.
On Wed, Sep 18, 2019 at 08:18:04PM +0300, info@mobile-stream.com wrote:
> R> Why is this dependent on __mips_abs2008?
> 
> There is also __mips_nan2008 (always set for hard-float R6 and
> -mnan=2008).
> 
> Binaries built with this option (implicit or not) are unusable on
> -mnan=legacy system, this is enforced by kernel (unless booted with
> some debugging option).
> 
> The fabs code could be changed to also depend on __mips_nan2008
> (since these ISA features are paired) to prevent -mabs=2008 musl on
> -mabs=legacy system (rather unrealistic).
> 
> Why is it wrong to depend on fine-grained ISA features after all?

It's not. The presence of a new instruction for non-arithmetic abs
would be a fine-grained ISA feature. An incompatible change in an
existing instruction is a *different ISA*, which needs a different
ldsoname per musl policy of always allowing different ISAs to coexist
in the same filesystem and have their own library ecosystems.

I'm guessing we've hit a situation where people have been building
binaries for an incompatible MIPS-family ISA reusing the same
ldsoname, which is a huge mess we probably need to figure out how to
deal with...

> Why is it wrong to explicitly depend e.g. on __mips_dsp in the
> strchr code fearing improper usage on a system without DSP ASE?

Because it's an ISA level, not an incompatible ISA. With libc built
for the baseline ISA level (or any ISA level not assuming dsp;
actually it probably doesn't matter even if it does since I can't
imagine the compiler generates dsp insns for anything in libc) you can
run *both baseline non-dsp mips binaries, and ones using dsp
features*.

Note that this is the same situation as i386; as long as libc is built
for a baseline (like i486; i386 is actually a misnomer) you can run
both baseline binaries and ones built for i686 or whatever more recent
ISA level you like using the same libc (and library ecosystem and
filesystem).

> powerpc64, s390x have similar ifdefs in their math code and IIUC
> nothing prevents running (until SIGILL) statically-linked
> _ARCH_PWR5X binary on an _ARCH_PWR5 system.

You're looking at it the other way around.

> Or some powerpc64 code depends on __VSX__. Is it wrong to depend on
> __mips_msa?
> 
> What is different with mips here?

Reversal of direction of the incompatibility.

Rich
info@mobile-stream.com Sept. 19, 2019, 12:54 p.m.
R> It's not. The presence of a new instruction for non-arithmetic abs
R> would be a fine-grained ISA feature. An incompatible change in an
R> existing instruction is a *different ISA*, which needs a different
R> ldsoname per musl policy of always allowing different ISAs to coexist
R> in the same filesystem and have their own library ecosystems.

1) A -mabs=legacy ("baseline") musl is safe on a -mabs=2008 ("non-baseline") system because it uses no explicit ABS.fmt in fabs[f]().
And since the compiler does not generate trapping ABS.fmt/NEG.fmt unless specifically instructed, there will be no implicit insns either.
Any explicit __mips_abs2008-protected code would not change anything here, for better or for worse.

2) A -mabs=2008 application will work correctly with a -mabs=legacy musl on a -mabs=2008 system.

3) It is not musl's business if some -mabs=legacy application behaves unexpectedly on a -mabs=2008 system due to non-trapping ABS.fmt/NEG.fmt.

4) A -mnan=legacy musl is probably not safe on a -mnan=2008 system, but this is externally prevented by the kernel, and there is *no* need for two ecosystems as a CPU is *either* 2008 or legacy.

That is, this is *not* like the o32 or n32 ABI on 64-bit MIPS (which can run these directly at the same time for different goals),
not a soft-float set of libraries on a hard-float system (which is usually just a subset with a different argument passing convention),
not a hard-float binary on a soft-float system with kernel FPU emulation.

It is not even like an r2 set of libraries on r6 (where the kernel may want to emulate missing/redefined r2 instructions) -- *efficient* emulation of trapping ABS.fmt/NEG.fmt or 1985-style NaN generation seems impossible on a 2008 system.

5) From the kernel's point of view (IIUC), the nan2008 flag in the ELF header defines the abs2008 behaviour too (though these are distinct bits/flags per the arch spec and in compilers).


I know glibc and uclibc have different ldso names for nan2008. I think it is because they implemented it years ago, when
1) the kernel had no nan2008 enforcement;
2) the mips r3 spec defined the nan2008/abs2008 FCSR bits as possibly writable on a given CPU.

The mips r5 spec has changed these bits to be strictly read-only, and no r3 cores from IMG with writable nan2008/abs2008 bits exist (per spec at least).

So the feature bits are hard-wired, cross-binary and cross-system consistency is externally enforced, and efficient emulation is barely possible.

Why bother with a different ldsoname for nan2008 then?


(Though all this nan2008 stuff is independent of the __mips_abs2008 fabs[f]() oneliners.)


R> I'm guessing we've hit a situation where people have been building
R> binaries for an incompatible MIPS-family ISA reusing the same
R> ldsoname, which is a huge mess we probably need to figure out how to
R> deal with...

R> actually it probably doesn't matter even if it does since I can't
R> imagine the compiler generates dsp insns for anything in libc) you can

Sure it does, and to good effect: indexed loads/stores (gcc/clang), 64-bit additions with ADDWC/ADDSC (clang).

But do you essentially rule out a non-baseline musl without a new ldsoname just because someone could misuse it on a baseline system?

If not, a -mabs=2008 ("non-baseline") musl on a -mabs=legacy ("baseline") system is irrelevant (though wrong of course).

If yes, do you consider x86 with LZCNT a different ISA?
It is perfectly possible to build musl with -mlzcnt ("non-baseline") and let it fail silently on a non-ABM/BMIx system ("baseline").
Neither musl nor the kernel prevents this, yet nobody invents a new ldsoname for this case.

It is probably possible to build a soft-float ARMv6 musl with LDRD and let it crash on an XScale v5TE system due to stricter alignment requirements.
Does musl prevent this with whatever ldsonames? I think not.
And if the kernel would reject such a binary (dunno), why are you against external nan2008 enforcement then?

Finally, I believe it is possible to build a mips32r2 binary with rotate instructions and let it fail silently and strangely on an r1 system (the rotate opcode reuses one of the shift opcodes), as the kernel apparently ignores the corresponding flag in the ELF header.
This seems to be the only case of the three you want to fix.
Rich Felker Sept. 19, 2019, 1:14 p.m.
On Thu, Sep 19, 2019 at 03:54:31PM +0300, info@mobile-stream.com wrote:
> R> It's not. The presence of a new instruction for non-arithmetic abs
> R> would be a fine-grained ISA feature. An incompatible change in an
> R> existing instruction is a *different ISA*, which needs a different
> R> ldsoname per musl policy of always allowing different ISAs to coexist
> R> in the same filesystem and have their own library ecosystems.
> 
> 1) -mabs=legacy ("baseline") musl is safe on -mabs=2008
> ("non-baseline") system cause it uses no explicit ABS.fmt in
> fabs[f]().
> And since compiler does not generate trapping ABS.fmt/NEG.fmt unless
> specifically instructed there will be no implicit insns too.
> Even whatever explicit __mips_abs2008-protected code would not
> change (for good or for bad) anything here.

OK, this is really good to know. I tried to follow these discussions
when they first happened for the tooling and glibc, but got lost in it
all and was unclear on what the compatibility properties are and
whether they're different for nan2008 vs abs2008 (AIUI now they are).

> 2) -mabs=2008 application will work correctly with -mabs=legacy musl
> on a -mabs=2008 system.

Are you sure? This seems to disagree with what you're saying below
about the same ABI tagging being used for abs2008 and nan2008 and
kernel refusing to load mismatching binaries.

> 3) It is not musl's business if some -mabs=legacy application
> behaves unexpectedly on a -mabs=2008 system due to non-trapping
> ABS.fmt/NEG.fmt.

Absolutely.

> 4) -mnan=legacy musl is probably not safe on a -mnan=2008 system but
> this is externally prevented by the kernel and there is *no* need
> for two ecosystems as CPU is *either* 2008 or legacy.

musl supports multiple ecosystems in the same filesystem regardless of
whether a cpu does; that's the whole point of supporting even multiple
unrelated archs like mips and arm or x86 and riscv and why they all
have differing ldso names. For example you can be running the foreign
ISA via qemu-user with binfmt_misc or explicitly.

Now, musl doesn't really do anything special with signaling nans and
doesn't particularly consider them a supported feature, so in some
sense it probably does work anyway for them to mismatch. But if the
kernel refuses to load -mnan=legacy binaries on nan2008 hardware, that
undermines the above.

> That is, this is *not* like o32 or n32 ABI on 64-bit MIPS (which can
> run these directly at the same time for different goals), not a
> soft-float set of libraries on a hard-float system (which is usually
> just a subset with different argument passing convention), not a
> hard-float binary on a soft-float system with kernel FPU emulation.
> 
> It is not even like r2 set of libraries on r6 (where kernel may want
> to emulate missing/redefined r2 instructions) -- *efficient*
> emulation of trapping ABS.fmt/NEG.fmt or 1985-style NaNs generation
> seems impossible on a 2008 system.
> 
> 5) From the kernel pov (IIUC), nan2008 flag in the ELF header
> defines the abs2008 behaviour too (though these are distinct
> bits/flags per arch spec and in compilers).

In that case, it seems like the kernel would refuse to load
-mabs=legacy binaries on nan2008 hardware, gratuitously due to
conflating the two properties. :( Is that the case, and if so, is
there any way to avoid it?

> I know glibc and uclibc have different ldso names for nan2008. I
> think it is because they have implemented it years ago when
> 1) the kernel had no nan2008 enforcement;
> 2) the mips r3 spec defined nan2008/abs2008 FCSR bits as possibly
> writable on a given CPU.
> 
> mips r5 spec has changed these bits to be strictly read-only and no
> r3 cores from IMG with writable nan2008/abs2008 bits exist (per spec
> at least).
> 
> So the feature bits are hard-wired, cross-binary and cross-system
> consistency is externally enforced, efficient emulation is barely
> possible.
> 
> Why bother with different ldsoname for nan2008 then?

If the tooling is capable of treating them as the same ABI (which
implies considering signaling nan an unsupported feature and treating
all nans as the same), then it's not needed. But if it's enforced that
they're separate ABIs, they need separate ldsonames.

> (though all this nan2008 stuff is independent from the
> __mips_abs2008 fabs[f]() oneliners).

I think that sounds correct.

> R> I'm guessing we've hit a situation where people have been building
> R> binaries for an incompatible MIPS-family ISA reusing the same
> R> ldsoname, which is a huge mess we probably need to figure out how to
> R> deal with...
> 
> R> actually it probably doesn't matter even if it does since I can't
> R> imagine the compiler generates dsp insns for anything in libc) you can
> 
> Sure it does, and for good. Indexed load/stores (gcc/clang), 64-bit
> additions with ADDWC/ADDSC (clang).

OK.

> But do you essentially deny non-baseline musl without new ldsoname
> just because someone could misuse it on a baseline system?

No. It's the same ldsoname because binaries with or without dsp
features can use the same baseline (without dsp) libc.so/ldso.

> If not, the -mabs=2008 ("non-baseline") musl on a -mabs=legacy
> ("baseline") system is irrelevant (though wrong of course).
> 
> If yes, do you consider x86 with LZCNT a different ISA?

No. Both binaries using lzcnt and binaries not using lzcnt can share
the same ldso with the baseline ISA.

> It is perfectly possible to build musl with -mlzcnt ("non-baseline")
> and let it fail silently on non-ABM/BMIx system ("baseline").
> Neither musl nor kernel prevents this yet nobody invents new
> ldsoname for this case.

Because the ABI is the same.

> It is probably possible to build soft-float ARMv6 musl with LDRD and
> let it crash on XScale v5TE system due to stricter alignment
> requirements. Does musl prevent this with whatever ldsonames? I
> think not.

No, because the ABI is the same and you can run armv6 and armv5
binaries using the same libc.so/ldso.

> And if the kernel would reject such binary (dunno), why are you
> against external nan2008 enforcement then?
> 
> Finally, I believe it is possible to build mips32r2 binary with
> rotate instructions and let it fail silently and strangely on r1
> system (rotate opcode reuses one of the shift opcodes) as kernel
> apparently ignores the corresponding flag in the ELF header. This
> seems to be the only case of three you want to fix.

This isn't about noisy or silent failure of binaries using new ISA
features on old cpus. It's about the inability to use a
baseline-supporting libc.so/ldso on a newer cpu with the changed ISA,
for binaries built with the changed ISA.

Rich
Rich Felker Sept. 19, 2019, 1:25 p.m.
On Thu, Sep 19, 2019 at 09:14:37AM -0400, Rich Felker wrote:
> On Thu, Sep 19, 2019 at 03:54:31PM +0300, info@mobile-stream.com wrote:
> > It is probably possible to build soft-float ARMv6 musl with LDRD and
> > let it crash on XScale v5TE system due to stricter alignment
> > requirements. Does musl prevent this with whatever ldsonames? I
> > think not.
> 
> No, because the ABI is the same and you can run armv6 and armv5
> binaries using the same libc.so/ldso.
> 
> > And if the kernel would reject such binary (dunno), why are you
> > against external nan2008 enforcement then?
> > 
> > Finally, I believe it is possible to build mips32r2 binary with
> > rotate instructions and let it fail silently and strangely on r1
> > system (rotate opcode reuses one of the shift opcodes) as kernel
> > apparently ignores the corresponding flag in the ELF header. This
> > seems to be the only case of three you want to fix.
> 
> This isn't about noisy or silent failure of binaries using new ISA
> features on old cpus. It's about the inability to use a
> baseline-supporting libc.so/ldso on a newer cpu with the changed ISA,
> for binaries built with the changed ISA.

The closest analogy to the mips situation might be arm where there are
separate branches of baseline ISA: armv4t, and the intersection of
armv7-a and armv7-m (called just armv7?). Here, there's currently no
way to make a single baseline ISA libc.so that can run on everything
just because armv4t lacks thumb2 and armv7-m lacks arm (the 32-bit arm
ISA). This is unfortunate but might be fixable if we ever get to the
point where building as thumb1 is supportable, but would break again
if we wanted to add plain armv4 (non-t) support (EABI doesn't specify
this but can be extended in a compatible way).

I don't think the arm situation is an exact analogy since it does not
have completely disconnected components in the compatibility graph. It
just has 2 roots.

I bring this up because my last email was probably too long, and the
point here is not "rejecting abs2008/nan2008 without a new ldsoname"
but rather highlighting that there are issues that were never
discussed, and that might have already created a big mess in the wild,
and figuring out what if anything needs to be done about it. This
discussion should have happened a long time ago, and didn't, at least
not with musl.

Rich
info@mobile-stream.com Sept. 19, 2019, 6:26 p.m.
>> 2) -mabs=2008 application will work correctly with -mabs=legacy musl
>> on a -mabs=2008 system.

R> Are you sure? This seems to disagree with what you're saying below

Yes, assuming they all were built with the same -mnan= option.

The -mabs=2008 option has no ABI tag, it is just a compiler flag that controls how many instructions are used for |x| and -x.
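
For illustration only, here is a minimal sketch of what that flag boils down to; it is not the patch itself. The name sketch_fabs is hypothetical, and the extra __mips__ check is an assumption added so the file also builds on non-MIPS hosts. With __mips_abs2008 defined, |x| is a single abs.d. Otherwise the sign bit is cleared by hand, which gives the same non-arithmetic result that abs.d produces on an abs2008 core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name; mirrors the preprocessor guard used in the patch. */
double sketch_fabs(double x)
{
#if defined(__mips__) && !defined(__mips_soft_float) && defined(__mips_abs2008)
	/* abs2008 core: ABS.D only clears the sign bit, NaNs included. */
	double r;
	__asm__("abs.d %0,%1" : "=f"(r) : "f"(x));
	return r;
#else
	/* Portable path: clear bit 63 (the sign bit) through a union. */
	union { double f; uint64_t i; } u = { x };
	assert(sizeof u.f == sizeof u.i); /* assumes a 64-bit IEEE double */
	u.i &= ~(UINT64_C(1) << 63);
	return u.f;
#endif
}
```

Either branch returns the same value for the same input, so which one gets compiled in does not change the function's interface.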

I think there is some optimisation flag in gcc that does exactly the same without declaring __mips_abs2008.

Only -mnan=xxx consistency is enforced by the kernel, based on the corresponding bits in the ELF header and FCSR.

But yes, IIUC kernel assumes nan2008 binary is abs2008 too though this has no deep (or whatever) meaning on cores with hard-wired feature bits in FCSR.



R> musl supports multiple ecosystems in the same filesystem regardless of
R> whether a cpu does; that's the whole point of supporting even multiple
R> unrelated archs like mips and arm or x86 and riscv and why they all
R> have differing ldso names. For example you can be running the foreign
R> ISA via qemu-user with binfmt_misc or explicitly.

OK. Though qemu-user has -L option for things like this.



R> In that case, it seems like the kernel would refuse to load
R> -mabs=legacy binaries on nan2008 hardware, gratuitously due to

No, no. The kernel only knows -mnan=xxx status for binary.

All this mess is because the architecture spec defines these features as distinct, compilers have independent options for them (gcc even has --with-nan= in ./configure but no --with-abs=), the kernel code is written as if it could change the FCSR bits, etc.

In reality these bits are always either both zero (nonexistent on <= R3) or both set in hardware, and only the NAN2008 FCSR bit is reflected in the ELF header flags.
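
A rough sketch of that relationship in C, under stated assumptions: the constants are the values I believe the Linux/glibc headers use (EF_MIPS_NAN2008 = 0x400 in e_flags, FCSR bit 18 for NAN2008, bit 19 for ABS2008) and should be checked against the real headers. Every sketch_ name is hypothetical, and this is not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; verify against elf.h and the MIPS FPU headers. */
#define SKETCH_EF_MIPS_NAN2008 UINT32_C(0x00000400) /* ELF e_flags bit */
#define SKETCH_FCSR_NAN2008    (UINT32_C(1) << 18)  /* FCSR bit */
#define SKETCH_FCSR_ABS2008    (UINT32_C(1) << 19)  /* FCSR bit */

/* Only the NaN encoding is recorded in the ELF header. */
int sketch_elf_is_nan2008(uint32_t e_flags)
{
	return (e_flags & SKETCH_EF_MIPS_NAN2008) != 0;
}

/* Hard-wired cores are expected to have both bits clear or both set. */
int sketch_fcsr_bits_agree(uint32_t fcsr)
{
	int nan2008 = (fcsr & SKETCH_FCSR_NAN2008) != 0;
	int abs2008 = (fcsr & SKETCH_FCSR_ABS2008) != 0;
	return nan2008 == abs2008;
}

/* The -mnan= mode of the binary must match the mode of the hardware. */
int sketch_binary_matches_cpu(uint32_t e_flags, uint32_t fcsr)
{
	/* Assumption from this thread: the two FCSR bits never differ. */
	assert(sketch_fcsr_bits_agree(fcsr));
	return sketch_elf_is_nan2008(e_flags) ==
	       ((fcsr & SKETCH_FCSR_NAN2008) != 0);
}
```

Note there is no abs2008 input on the ELF side at all, which is why only the -mnan= mismatch can be detected at load time.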
Rich Felker Sept. 19, 2019, 7:10 p.m.
On Thu, Sep 19, 2019 at 09:26:18PM +0300, info@mobile-stream.com wrote:
> 
> >> 2) -mabs=2008 application will work correctly with -mabs=legacy musl
> >> on a -mabs=2008 system.
> 
> R> Are you sure? This seems to disagree with what you're saying below
> 
> Yes, assuming they all were built with the same -mnan= option.
> 
> The -mabs=2008 option has no ABI tag, it is just a compiler flag
> that controls how many instructions are used for |x| and -x.
> 
> I think there is some optimisation flag in gcc that does exactly the
> same without declaring __mips_abs2008.
> 
> Only -mnan=xxx consistency is enforced by the kernel based on the
> corresponding bits in the ELF header and FCSR.
> 
> But yes, IIUC kernel assumes nan2008 binary is abs2008 too though
> this has no deep (or whatever) meaning on cores with hard-wired
> feature bits in FCSR.

I guess I was assuming that if the two generally (always?) match in
the hardware, you'd necessarily be using nan2008 when using abs2008.

> R> musl supports multiple ecosystems in the same filesystem regardless of
> R> whether a cpu does; that's the whole point of supporting even multiple
> R> unrelated archs like mips and arm or x86 and riscv and why they all
> R> have differing ldso names. For example you can be running the foreign
> R> ISA via qemu-user with binfmt_misc or explicitly.
> 
> OK. Though qemu-user has -L option for things like this.

Indeed you can use -L in the qemu-user case, but there are plenty of
other situations like shared rootfs across network or with qemu-system
guests. In any case, ability for different archs and subarchs (ABIs)
to coexist in the same filesystem regardless of any relationship or
non-relationship between them has always been part of musl.

> R> In that case, it seems like the kernel would refuse to load
> R> -mabs=legacy binaries on nan2008 hardware, gratuitously due to
> 
> No, no. The kernel only knows -mnan=xxx status for binary.
> 
> All this mess is because architecture spec defines these features as
> distinct, compilers have independent options for them (gcc even has
> --with-nan= in ./configure but no --with-abs=), kernel code is
> written like it can change FCSR bits etc.
> 
> In reality these bits are always either both zero (nonexistent on
> <= R3) or both set in hardware and only NAN2008 FCSR bit has
> reflection in the ELF header flags.

In that case it seems like using abs2008 as you've proposed does not
make the situation any worse than it might already be.

Rich
Rich Felker Oct. 14, 2019, 2:18 p.m.
On Wed, Sep 11, 2019 at 01:05:04PM +0300, info@mobile-stream.com wrote:
> 
> non-commit text:
> gcc puts annoying nop into the delay slot for these functions, e.g.:
> 	abs.d	$f0,$f12
> 	jr	$ra
> 	 nop
> is there any way to get rid of this without using pure .S?

I'm taking care of merging this now, since it seems concerns about
abs2008 being able to be treated as an isa level were adequately
addressed. One interesting thing is that I don't see the above
happening. My mips toolchain is gcc 6.3.0 and I get:

00000000 <fabs>:
   0:   03e00008        jr      ra
   4:   46206005        abs.d   $f0,$f12

Perhaps you have an older gcc, or there's some option that affects
whether it can use delay slots? It looks like gcc is emitting the code
as:

#APP
 # 8 "../../src/math/mips/fabs.c" 1
	abs.d $f0,$f12
 # 0 "" 2
#NO_APP
	jr      $31

but not using the noreorder directive, allowing the assembler to
reorder into delay slots.

Rich