pthread_key_create bug?

Submitted by Markus Wichmann on Jan. 7, 2019, 5:13 p.m.

Details

Message ID 20190107171327.GD29911@voyager
State New
Series "pthread_key_create bug?"
Headers show

Commit Message

Markus Wichmann Jan. 7, 2019, 5:13 p.m.
On Sun, Jan 06, 2019 at 09:11:28PM -0500, Rich Felker wrote:
> See commit 84d061d5a31c9c773e29e1e2b1ffe8cb9557bc58.
> 
> Rich

Speaking of that commit: Am I missing something? The log message of that
commit says that pthread_key_delete was stuffed into another file to
avoid __synccall() being pulled into programs not referencing
pthread_key_delete(). Yet pthread_key_create.c contains
clean_dirty_tsd(), which contains a hard dependency on
__pthread_key_delete_synccall(), which will do the synccall, and thus
pull everything in.

I appreciate that the function in question can never be called unless
pthread_key_delete() is used, but the linker can't know that. The
compiler can't see code of the other compilation units and thus has to
generate code containing a reference to __pthread_key_delete_synccall(),
and the linker can't see that the reference is unreachable unless
__pthread_key_delete() is linked in...

Wait a minute. If we made that a weak reference, that would already
suffice. Wouldn't even necessarily need an alias, since it is
unreachable without pthread_key_delete() anyway.

Is the attached patch acceptable?

Ciao,
Markus

Patch hide | download patch | download mbox

From 851f8bb89d9fae664a0d6018fce251ab64d737b7 Mon Sep 17 00:00:00 2001
From: Markus Wichmann <nullplan@gmx.net>
Date: Mon, 7 Jan 2019 18:07:34 +0100
Subject: [PATCH 4/4] Add weak reference to reduce static size.

Commit 84d061d5a31c9c773e29e1e2b1ffe8cb9557bc58 claimed to try and
reduce the static size for programs referencing pthread_key_create() but
not pthread_key_delete(). Unfortunately, the critical reference was
still strong, so the __synccall() mechanism was pulled in, anyway. This
commit makes the reference weak, so it is resolved to NULL if
pthread_key_delete() is not used, but in that case the function is
unreachable, anyway.
---
 src/thread/pthread_key_create.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/thread/pthread_key_create.c b/src/thread/pthread_key_create.c
index e26f199c..31ed0826 100644
--- a/src/thread/pthread_key_create.c
+++ b/src/thread/pthread_key_create.c
@@ -35,6 +35,8 @@  static void clean_dirty_tsd_callback(void *p)
 	if (args->caller == self) args->ret = 0;
 }
 
+extern hidden weak void __pthread_key_delete_synccall(void (*f)(void *), void *p);
+
 static int clean_dirty_tsd(void)
 {
 	struct cleanup_args args = {
-- 
2.19.1


Comments

Rich Felker Jan. 8, 2019, midnight
On Mon, Jan 07, 2019 at 06:13:28PM +0100, Markus Wichmann wrote:
> On Sun, Jan 06, 2019 at 09:11:28PM -0500, Rich Felker wrote:
> > See commit 84d061d5a31c9c773e29e1e2b1ffe8cb9557bc58.
> > 
> > Rich
> 
> Speaking of that commit: Am I missing something? The log message of that
> commit says that pthread_key_delete was stuffed into another file to
> avoid __synccall() being pulled into programs not referencing
> pthread_key_delete(). Yet pthread_key_create.c contains
> clean_dirty_tsd(), which contains a hard dependency on
> __pthread_key_delete_synccall(), which will do the synccall, and thus
> pull everything in.
> 
> I appreciate that the function in question can never be called unless
> pthread_key_delete() is used, but the linker can't know that. The
> compiler can't see code of the other compilation units and thus has to
> generate code containing a reference to __pthread_key_delete_synccall(),
> and the linker can't see that the reference is unreachable unless
> __pthread_key_delete() is linked in...
> 
> Wait a minute. If we made that a weak reference, that would already
> suffice. Wouldn't even necessarily need an alias, since it is
> unreachable without pthread_key_delete() anyway.
> 
> Is the attached patch acceptable?
> 
> Ciao,
> Markus

> >From 851f8bb89d9fae664a0d6018fce251ab64d737b7 Mon Sep 17 00:00:00 2001
> From: Markus Wichmann <nullplan@gmx.net>
> Date: Mon, 7 Jan 2019 18:07:34 +0100
> Subject: [PATCH 4/4] Add weak reference to reduce static size.
> 
> Commit 84d061d5a31c9c773e29e1e2b1ffe8cb9557bc58 claimed to try and
> reduce the static size for programs referencing pthread_key_create() but
> not pthread_key_delete(). Unfortunately, the critical reference was
> still strong, so the __synccall() mechanism was pulled in, anyway. This
> commit makes the reference weak, so it is resolved to NULL if
> pthread_key_delete() is not used, but in that case the function is
> unreachable, anyway.
> ---
>  src/thread/pthread_key_create.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/thread/pthread_key_create.c b/src/thread/pthread_key_create.c
> index e26f199c..31ed0826 100644
> --- a/src/thread/pthread_key_create.c
> +++ b/src/thread/pthread_key_create.c
> @@ -35,6 +35,8 @@ static void clean_dirty_tsd_callback(void *p)
>  	if (args->caller == self) args->ret = 0;
>  }
>  
> +extern hidden weak void __pthread_key_delete_synccall(void (*f)(void *), void *p);
> +
>  static int clean_dirty_tsd(void)
>  {
>  	struct cleanup_args args = {
> -- 
> 2.19.1
> 

I think you're right, though we don't generally use weak references in
musl on the basis (perhaps somewhat dubious) that they're an
additional toolchain feature that might cause problems reusing the
code in non-ELF contexts (this may affect midipix; I'm not sure).
That's why weak aliases to dummy functions are used for this purpose
everywhere else.

Also: weak references to hidden functions historically had some
breakage with respect to generating unresolvable-at-ld-time
pc-relative references to the symbol. Thus their use might break some
gcc/binutils versions too.

Rich
u-uy74@aetey.se Jan. 8, 2019, 8:43 a.m.
On Mon, Jan 07, 2019 at 07:00:18PM -0500, Rich Felker wrote:
> >  
> > +extern hidden weak void __pthread_key_delete_synccall(void (*f)(void *), void *p);

> musl on the basis (perhaps somewhat dubious) that they're an
> additional toolchain feature that might cause problems reusing the
> code in non-ELF contexts (this may affect midipix; I'm not sure).

Thanks. That's no doubt, the less the reliance on toolchain features,
the easier to use, especially in the ways/areas not known in advance.

Rune
Markus Wichmann Jan. 8, 2019, 7:34 p.m.
On Tue, Jan 08, 2019 at 09:43:10AM +0100, u-uy74@aetey.se wrote:
> On Mon, Jan 07, 2019 at 07:00:18PM -0500, Rich Felker wrote:
> > >  
> > > +extern hidden weak void __pthread_key_delete_synccall(void (*f)(void *), void *p);
> 
> > musl on the basis (perhaps somewhat dubious) that they're an
> > additional toolchain feature that might cause problems reusing the
> > code in non-ELF contexts (this may affect midipix; I'm not sure).
> 
> Thanks. That's no doubt, the less the reliance on toolchain features,
> the easier to use, especially in the ways/areas not known in advance.
> 
> Rune
> 

Well, what happens on midipix with this patch? Worst case scenario is,
the toolchain doesn't do weak references, and the reference becomes
strong. So that would leave you no worse than the current situation. Or
am I missing the point?

Ciao,
Markus
Markus Wichmann Jan. 8, 2019, 10:10 p.m.
On Mon, Jan 07, 2019 at 07:00:18PM -0500, Rich Felker wrote:
> I think you're right, though we don't generally use weak references in
> musl on the basis (perhaps somewhat dubious) that they're an
> additional toolchain feature that might cause problems reusing the
> code in non-ELF contexts (this may affect midipix; I'm not sure).
> That's why weak aliases to dummy functions are used for this purpose
> everywhere else.
> 

I don't quite get you. Weak aliases are just weak references with a
non-zero default value. Any toolchain being able to handle weak aliases
should be able to handle weak references, right?

And if a port had to fudge things to get weak aliases to work, they
could just continue to fudge things to make weak references into strong
ones. This is merely an optimization; not having the weak reference
doesn't break anything, just increases the binary size.

> Also: weak references to hidden functions historically had some
> breakage with respect to generating unresolvable-at-ld-time
> pc-relative references to the symbol. Thus their use might break some
> gcc/binutils versions too.
> 
> Rich

OK, that's interesting. But do we really want to work around bugs in
gcc/binutils? Especially ones that were fixed a while ago? Because that
might lead us to a situation where we have ugly code we can't improve
because an old version of binutils had a bug, which is a fact that will
never go away, so the code stays ugly forever.

Not saying this is that situation, of course.

Ciao,
Markus
writeonce@midipix.org Jan. 8, 2019, 10:40 p.m.
On 01/08/2019 20:34, Markus Wichmann wrote:
> On Tue, Jan 08, 2019 at 09:43:10AM +0100, u-uy74@aetey.se wrote:
> > On Mon, Jan 07, 2019 at 07:00:18PM -0500, Rich Felker wrote:
> > > >  
> > > > +extern hidden weak void __pthread_key_delete_synccall(void (*f)(void *), void *p);
> > 
> > > musl on the basis (perhaps somewhat dubious) that they're an
> > > additional toolchain feature that might cause problems reusing the
> > > code in non-ELF contexts (this may affect midipix; I'm not sure).
> > 
> > Thanks. That's no doubt, the less the reliance on toolchain features,
> > the easier to use, especially in the ways/areas not known in advance.
> > 
> > Rune
> > 
> 
> Well, what happens on midipix with this patch? Worst case scenario is,
> the toolchain doesn't do weak references, and the reference becomes
> strong. So that would leave you no worse than the current situation. Or
> am I missing the point?

Weak/hidden symbols and references as used in the patch are actually supported. Generally speaking, default visibility means that a register (say, rax) is going to have the address of a function's .got entry (.got entries in PE are specific to midipix and are part of the toolchain's customization), and the call will thus take the form of callq *%rax. With hidden visibility, rax would contain the actual function address, and the call would accordingly become callq %rax. Weak references on midipix provide most of everything that ELF provides. With your patch applied, and linking libc.so with pthread_key_delete.lo (and thus also tss_delete.lo) left out, the resuling image contains the following as expected (callq 0 being the bit of interest):

6bc4d0f3:     48 8d 0d 46 c7 00 00    lea    0xc746(%rip),%rcx        # 6bc59840 <clean_dirty_tsd_callback>
6bc4d0fa:     48 89 44 24 20          mov    %rax,0x20(%rsp)
6bc4d0ff:     c7 44 24 28 0b 00 00    movl   $0xb,0x28(%rsp)
6bc4d106:     00 
6bc4d107:     e8 f4 2e 3b 94          callq  0 <__dll__>

To complete the picture, the one weak/hidden trick that's currently not supported on midipix consists in musl's src/errno/__errno_location.c and src/include/errno.h, where you end up with "call ___errno_location" (because of the hidden attribute) in many translation units, yet a strongly defined ___errno_location in none (for which the workaround is to provide a strong ___errno_location() function).


> 
> Ciao,
> Markus
> 

--
A. Wilcox Jan. 8, 2019, 11:07 p.m.
On 01/08/19 16:10, Markus Wichmann wrote:
> OK, that's interesting. But do we really want to work around bugs in
> gcc/binutils?

Yes.

> Especially ones that were fixed a while ago?

Yes.

> Because that
> might lead us to a situation where we have ugly code we can't improve
> because an old version of binutils had a bug, which is a fact that will
> never go away, so the code stays ugly forever.

That is correct.


Best,
--arw
Rich Felker Jan. 9, 2019, 12:29 a.m.
On Tue, Jan 08, 2019 at 11:10:06PM +0100, Markus Wichmann wrote:
> On Mon, Jan 07, 2019 at 07:00:18PM -0500, Rich Felker wrote:
> > I think you're right, though we don't generally use weak references in
> > musl on the basis (perhaps somewhat dubious) that they're an
> > additional toolchain feature that might cause problems reusing the
> > code in non-ELF contexts (this may affect midipix; I'm not sure).
> > That's why weak aliases to dummy functions are used for this purpose
> > everywhere else.
> > 
> 
> I don't quite get you. Weak aliases are just weak references with a
> non-zero default value. Any toolchain being able to handle weak aliases
> should be able to handle weak references, right?

No, not necessarily, and no they're not equivalent to weak definitions
with a zero value. A weak definition with a zero value would provide a
definition for other translation units to see/use, preventing one
with a non-weak reference from pulling in the real definition. Weak
references are a property of the reference. Weak definitions are a
property of the definition. They are not in any way equivalent.

In any case, the policy is that we don't use weak references (*). If
there were a strong reason to want them, we could review the reasons
and see if they are still relevant.

(*) This is only partly true. There are weak references to some
special ELF symbols defined by the linker, because providing a weak
definition would override the linker-provided value. For these, there
will never be definitions provided by source files, so the weak
reference is mostly equivalent to a weak definition by your above
logic. Furthermore, we know they'll always be defined for
position-independent linking (_DYNAMIC has to exist for shared
libraries and PIE) so there is no concern about the issue below
(PC-relative references to an undefined weakly-referenced symbol).

> > Also: weak references to hidden functions historically had some
> > breakage with respect to generating unresolvable-at-ld-time
> > pc-relative references to the symbol. Thus their use might break some
> > gcc/binutils versions too.
> 
> OK, that's interesting. But do we really want to work around bugs in
> gcc/binutils? Especially ones that were fixed a while ago? Because that
> might lead us to a situation where we have ugly code we can't improve
> because an old version of binutils had a bug, which is a fact that will
> never go away, so the code stays ugly forever.

It depends on the situation. In this case the functionality itself is
an extension to the language and it's not formally defined exactly
what usage is valid; we have to infer that from what the tools do and
how they're documented. The whole point here is to minimize the extent
to which we depend on behavior that has historically differed between
tools, so that we don't get bogged down worrying about the specifics
of it.

Rich
Szabolcs Nagy Jan. 9, 2019, 11:45 a.m.
* Rich Felker <dalias@libc.org> [2019-01-08 19:29:53 -0500]:
> On Tue, Jan 08, 2019 at 11:10:06PM +0100, Markus Wichmann wrote:
> > On Mon, Jan 07, 2019 at 07:00:18PM -0500, Rich Felker wrote:
> > > I think you're right, though we don't generally use weak references in
> > > musl on the basis (perhaps somewhat dubious) that they're an
> > > additional toolchain feature that might cause problems reusing the
> > > code in non-ELF contexts (this may affect midipix; I'm not sure).
> > > That's why weak aliases to dummy functions are used for this purpose
> > > everywhere else.
> > > 
> > 
> > I don't quite get you. Weak aliases are just weak references with a
> > non-zero default value. Any toolchain being able to handle weak aliases
> > should be able to handle weak references, right?
> 
> No, not necessarily, and no they're not equivalent to weak definitions
> with a zero value. A weak definition with a zero value would provide a
> definition for other translation units to see/use, preventing one
> with a non-weak reference from pulling in the real definition. Weak
> references are a property of the reference. Weak definitions are a
> property of the definition. They are not in any way equivalent.
> 
> In any case, the policy is that we don't use weak references (*). If
> there were a strong reason to want them, we could review the reasons
> and see if they are still relevant.
> 
> (*) This is only partly true. There are weak references to some
> special ELF symbols defined by the linker, because providing a weak
> definition would override the linker-provided value. For these, there
> will never be definitions provided by source files, so the weak
> reference is mostly equivalent to a weak definition by your above
> logic. Furthermore, we know they'll always be defined for
> position-independent linking (_DYNAMIC has to exist for shared
> libraries and PIE) so there is no concern about the issue below
> (PC-relative references to an undefined weakly-referenced symbol).

note that weak references are not even well supported on elf targets:
weak references can be undefined and then the symbol address is 0 at
runtime, which should be possible to check with if (weakref == 0)..,
but this does not work reliably, see e.g.
https://groups.google.com/forum/#!topic/generic-abi/eZv8VQskSD0
so the specified weak reference semantics is broken in important use
cases: tls, copy reloc, plt address.