Q: dealing with missing removal of excess precision

Submitted by Rich Felker on March 22, 2020, 1:19 a.m.

Details

Message ID 20200322011958.GM11469@brightrain.aerifal.cx
State New
Series "Q: dealing with missing removal of excess precision"
Headers show

Commit Message

Rich Felker March 22, 2020, 1:19 a.m.
On Fri, Mar 20, 2020 at 02:12:50PM -0400, Rich Felker wrote:
> On Sun, Feb 23, 2020 at 07:14:08PM +0300, Alexander Monakov wrote:
> > On Sat, 22 Feb 2020, Rich Felker wrote:
> > 
> > > I'm not well acquainted with SSE, and only so-so with x87, so pretty
> > > much I'm reading them for higher-level issues with tooling
> > > compatibility (like the concerns I already raised and looked up and
> > > seem to have resolved about x87 constraints and non-GCC compilers) and
> > > logic, then planning to apply and test them. I think being aware of
> > > non-obvious mistake modes that have already been found would be a lot
> > > more useful than staring at things, especially if the bugs you've
> > > found are in subtleties of the insn behavior or constraint behavior.
> > 
> > Okay, thanks. I found two issues:
> > 
> > 1. i386 lrint* functions mistakenly used fistpll instead of fistpl (I
> > posted the fix for x32 asm after noticing my own mistake).
> > 
> > 2. Some functions bind a 32-bit lvalue as output for fnstsw %ax, which
> > as the operand says writes only 16 bits. They should be changed to either
> > use a 16-bit lvalue, or a zero-initialized 32-bit lvalue with "+a" constraint.
> > 
> > Plus, not bugs, but still worth mentioning:
> > 
> > 3. The new remquol in C could alternatively be implemented by using fxam
> > to extract sign bits instead of loading them from stack slots. The current
> > approach makes sense given the ABI, but an implementation aiming for better
> > code after inlining could choose to use fxam instead of forcing a spill.
> > 
> > 4. I did not manage to find a copy of Figueroa's "When is double rounding
> > innocuous", but I could cite e.g. "Innocuous Double Rounding of Basic
> > Arithmetic Operations" by Pierre Roux instead (in i386/sqrtf.c).
> > 
> > If you like you can fetch a Git tree with my patches from 
> > 
> >     https://git.sr.ht/~amonakov/musl
> > 
> > (issue 1 is already corrected in that repo)
> 
> Hi. I'm trying to catch up on this and other patches after being sick
> and not getting much done for a while. Is there anything else I should
> be aware of before going forward with these?
> 
> One minor cosmetic change I'd like to make to commit names if you
> don't object is changing "x86-family" to "x87-family" for the commits
> that are "i386 + long double functions on x86_64" since I found it
> confusing when there's one commit for "x86 family" then a separate one
> for x86_64 versions of the same function. Let me know if you think of
> any alternative that might be more clear than "x87"

With the fixups mentioned (included in attached patches), i386 and
x86_64 are passing libc-test and seem fine. OK if I merge them
(rebasing the fixups in as fixups)? With s/x86/x87/ naming?

Rich
From 3f99f4595aa3ce528008ee22d65512768376a9a6 Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Sat, 21 Mar 2020 16:23:51 -0400
Subject: [PATCH 1/3] fabsf i386 fixup

---
 src/math/i386/fabsf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/src/math/i386/fabsf.c b/src/math/i386/fabsf.c
index d07be321..d882eee3 100644
--- a/src/math/i386/fabsf.c
+++ b/src/math/i386/fabsf.c
@@ -1,6 +1,6 @@ 
 #include <math.h>
 
-float fabs(float x)
+float fabsf(float x)
 {
 	__asm__ ("fabs" : "+t"(x));
 	return x;

Comments

Alexander Monakov March 22, 2020, 5:40 p.m.
On Sat, 21 Mar 2020, Rich Felker wrote:

> With the fixups mentioned (included in attached patches), i386 and
> x86_64 are passing libc-test and seem fine. OK if I merge them
> (rebasing the fixups in as fixups)? With s/x86/x87/ naming?

Ack for x87 naming, fabsf fix and sqrt +0x300 fix.

Nack for sqrt 'unsigned short' fix, I recommend to consider
'unsigned fpsr = 0' and "+a" constraint, the resulting assembly is
much better (GCC doesn't seem to do a good job optimizing the 'unsigned short'
variant at all).

I also would like to see i386/sqrtf.c to explain why double rounding
is safe, either as a comment (my preference) or in the commit message.
I would write something like the following:

  The long double result has sufficient precision so that second rounding
  to float still keeps the returned value correctly rounded, see
  Pierre Roux, "Innocuous Double Rounding of Basic Arithmetic Operations".

(this paper elaborates the earlier [currently paywalled] paper by Figueroa)

Actually, may I ask why the initial commit did not mention that it relies
on this nontrivial property?

Thanks.
Alexander
Rich Felker March 22, 2020, 5:53 p.m.
On Sun, Mar 22, 2020 at 08:40:19PM +0300, Alexander Monakov wrote:
> On Sat, 21 Mar 2020, Rich Felker wrote:
> 
> > With the fixups mentioned (included in attached patches), i386 and
> > x86_64 are passing libc-test and seem fine. OK if I merge them
> > (rebasing the fixups in as fixups)? With s/x86/x87/ naming?
> 
> Ack for x87 naming, fabsf fix and sqrt +0x300 fix.
> 
> Nack for sqrt 'unsigned short' fix, I recommend to consider
> 'unsigned fpsr = 0' and "+a" constraint, the resulting assembly is
> much better (GCC doesn't seem to do a good job optimizing the 'unsigned short'
> variant at all).

Sorry, I forgot to disassemble and check after making that change, and
indeed it is very bad.

How about just leaving it as-is? The value is masked such that upper
bits don't matter, and "whatever the register happened to contain
before" is a valid albeit ugly output from inline asm -- it doesn't
admit any compiler transformations that would cause inconsistent value
to be observed or other problems.

> I also would like to see i386/sqrtf.c to explain why double rounding
> is safe, either as a comment (my preference) or in the commit message.
> I would write something like the following:
> 
>   The long double result has sufficient precision so that second rounding
>   to float still keeps the returned value correctly rounded, see
>   Pierre Roux, "Innocuous Double Rounding of Basic Arithmetic Operations".

I'm fine with adding that as a comment.

> (this paper elaborates the earlier [currently paywalled] paper by Figueroa)
> 
> Actually, may I ask why the initial commit did not mention that it relies
> on this nontrivial property?

The need for fixup of double was only realized later, in commit
809556e60a3359f646946879dd94c4760e5b8e84. It was discussed at the time
that no action was needed for single, but it seems since there was no
change there wasn't any mention of it in log.

Rich
Alexander Monakov March 22, 2020, 6:51 p.m.
On Sun, 22 Mar 2020, Rich Felker wrote:

> > Nack for sqrt 'unsigned short' fix, I recommend to consider
> > 'unsigned fpsr = 0' and "+a" constraint, the resulting assembly is
> > much better (GCC doesn't seem to do a good job optimizing the 'unsigned short'
> > variant at all).
> 
> Sorry, I forgot to disassemble and check after making that change, and
> indeed it is very bad.
> 
> How about just leaving it as-is? The value is masked such that upper
> bits don't matter, and "whatever the register happened to contain
> before" is a valid albeit ugly output from inline asm -- it doesn't
> admit any compiler transformations that would cause inconsistent value
> to be observed or other problems.

Yes, I guess it's acceptable.

> > Actually, may I ask why the initial commit did not mention that it relies
> > on this nontrivial property?
> 
> The need for fixup of double was only realized later, in commit
> 809556e60a3359f646946879dd94c4760e5b8e84. It was discussed at the time
> that no action was needed for single, but it seems since there was no
> change there wasn't any mention of it in log.

Are you sure? single-precision sqrtf received a change just two days
prior to that in commit e0a54e6725 ("correct rounding for i387 sqrtf function")
which is the same day as when Stephen Canon supplied his answer in
https://stackoverflow.com/questions/9678224/is-there-any-way-to-get-correct-rounding-with-the-i387-fsqrt-instruction
(and also the same day you asked the question)

Alexander
Rich Felker March 22, 2020, 7:10 p.m.
On Sun, Mar 22, 2020 at 09:51:09PM +0300, Alexander Monakov wrote:
> On Sun, 22 Mar 2020, Rich Felker wrote:
> 
> > > Nack for sqrt 'unsigned short' fix, I recommend to consider
> > > 'unsigned fpsr = 0' and "+a" constraint, the resulting assembly is
> > > much better (GCC doesn't seem to do a good job optimizing the 'unsigned short'
> > > variant at all).
> > 
> > Sorry, I forgot to disassemble and check after making that change, and
> > indeed it is very bad.
> > 
> > How about just leaving it as-is? The value is masked such that upper
> > bits don't matter, and "whatever the register happened to contain
> > before" is a valid albeit ugly output from inline asm -- it doesn't
> > admit any compiler transformations that would cause inconsistent value
> > to be observed or other problems.
> 
> Yes, I guess it's acceptable.
> 
> > > Actually, may I ask why the initial commit did not mention that it relies
> > > on this nontrivial property?
> > 
> > The need for fixup of double was only realized later, in commit
> > 809556e60a3359f646946879dd94c4760e5b8e84. It was discussed at the time
> > that no action was needed for single, but it seems since there was no
> > change there wasn't any mention of it in log.
> 
> Are you sure? single-precision sqrtf received a change just two days
> prior to that in commit e0a54e6725 ("correct rounding for i387 sqrtf function")
> which is the same day as when Stephen Canon supplied his answer in
> https://stackoverflow.com/questions/9678224/is-there-any-way-to-get-correct-rounding-with-the-i387-fsqrt-instruction
> (and also the same day you asked the question)

I just dug through the old gcc and glibc bz's it was mentioned on
(52593 and 14032 resp.) and didn't find anything, but I'm pretty sure
it was known in 2012 that it was a non-issue for single-precision.
What I didn't understand then was that the callee was responsible for
dropping the excess precision; I wrongly believed that "something" in
the caller would make it not-matter/get collapsed down to nominal
precision and rounded correctly. Commit e0a54e6725 and related commits
were fixing that mistake, which I'd since recognized was wrong but
hadn't yet recognized the urgency of fixing until I started seeing how
bad it could break the compiler.

Rich
Alexander Monakov March 22, 2020, 7:46 p.m.
On Sun, 22 Mar 2020, Rich Felker wrote:

> > Are you sure? single-precision sqrtf received a change just two days
> > prior to that in commit e0a54e6725 ("correct rounding for i387 sqrtf function")
> > which is the same day as when Stephen Canon supplied his answer in
> > https://stackoverflow.com/questions/9678224/is-there-any-way-to-get-correct-rounding-with-the-i387-fsqrt-instruction
> > (and also the same day you asked the question)
> 
> I just dug through the old gcc and glibc bz's it was mentioned on
> (52593 and 14032 resp.) and didn't find anything, but I'm pretty sure
> it was known in 2012 that it was a non-issue for single-precision.
> What I didn't understand then was that the callee was responsible for
> dropping the excess precision; I wrongly believed that "something" in
> the caller would make it not-matter/get collapsed down to nominal
> precision and rounded correctly. Commit e0a54e6725 and related commits
> were fixing that mistake, which I'd since recognized was wrong but
> hadn't yet recognized the urgency of fixing until I started seeing how
> bad it could break the compiler.

Yes, that stackoverflow thread is also from 2012. Reading the question,
it's clear that you are asking about both single and double precision
(no indication you're aware that double-rounding would be safe for floats),
then Stephen Canon answers, and soon after the aforementioned commit appears.

Anyhow - thanks for the extra historical background.

Alexander