pthread_setattr_default_np behavior difference from glibc

Submitted by Rich Felker on July 26, 2018, 3:51 p.m.

Details

Message ID 20180726155154.GA29079@brightrain.aerifal.cx
State New
Series "pthread_setattr_default_np behavior difference from glibc"
Headers show

Commit Message

Rich Felker July 26, 2018, 3:51 p.m.
On Sat, Aug 12, 2017 at 12:38:53AM -0500, Bobby Bingham wrote:
> I've got a glibc-linked binary I'm trying to get running, and I'm
> running into the usual stack overflow for threads other than the main
> thread because of musl's smaller default stack size.  So I decided to
> see if I could LD_PRELOAD a library with a constructor that used
> pthread_setattr_default_np to increase the default stack size.
> 
> It turns out that it doesn't work for this binary because it passes a
> thread attribute object to pthread_create without explicitly setting the
> stack size in the attribute object.  According to the man page, the
> values set with pthread_getattr_default_np are only used if no attribute
> object is passed to pthread_create, which is how musl behaves.
> 
> It turns out this isn't what glibc does for stack size (but is for guard
> size).
> 
> glibc's pthread_attr_init sets the stack size field in the attribute
> object to zero, which is interpreted to mean the default stack size at
> whatever time it's queried.  The attached program will print two
> different stack sizes on glibc.
> 
> I think that the glibc behavior is problematic because
> pthread_attr_getstacksize can return a value different than what will
> actually be used if the default is changed before pthread_create is
> called.
> 
> However, if we changed pthread_attr_init to initialize the stack size
> based on the current default at the time its called, we'd be a little
> closer to glibc's behavior, but without the quirk I mentioned above.
> And it would fix my use case :)
> 
> Thoughts?

Somehow I missed this at the time, or forgot about it, and just
rediscovered the same issue. Actually I thought glibc ignored the
default set by pthread_setattr_default_np in this case, because I
wasn't aware of the (likely non-conforming, unless
pthread_attr_getstacksize patches it up) way zero is used in glibc,
and I was about to propose we diverge from glibc here just because the
current behavior is not useful for the intended purpose.

The whole reason pthread_setattr_default_np was added was to allow
easy workarounds for applications/libraries that assume large stacks:
so packagers can just add an early call to pthread_setattr_default_np
to ensure that all threads created have sufficiently large stacks. But
some code, notably SDL, creates an attribute object to set other
properties, leaving the stack size alone, and thereby gets the
musl-default stack size rather than the new default configured by
pthread_setattr_default_np.

The attached patch fixes this, and I think I'll apply it, but I'm not
entirely happy with the situation before or after this change, since
pthread_setattr_default_np is thread-unsafe and thus it's unsafe to
call from ctors in libraries that might be dynamic-loaded or where
threads might have already been created by other libraries' ctors.

I think we should probably have pthread_setattr_default_np use a_cas
to update the defaults, changing them from size_t to volatile int
(there's no need to support defaults >INT_MAX; doing so would just be
harmful) so a_cas is usable on them. Then the consumers,
pthread_create and pthread_attr_init, would be getting relaxed-order
atomics rather than data races. Assuming there's something else to
order the pthread_setattr_default_np with respect to pthread_create or
pthread_attr_init (e.g. being called in the same thread) you know the
latter will get the latest-set default; otherwise it gets *some valid*
default from the past or present.

Rich

Patch hide | download patch | download mbox

diff --git a/src/thread/default_attr.c b/src/thread/default_attr.c
index e69de29..46fe98e 100644
--- a/src/thread/default_attr.c
+++ b/src/thread/default_attr.c
@@ -0,0 +1,4 @@ 
+#include "pthread_impl.h"
+
+size_t __default_stacksize = DEFAULT_STACK_SIZE;
+size_t __default_guardsize = DEFAULT_GUARD_SIZE;
diff --git a/src/thread/pthread_attr_init.c b/src/thread/pthread_attr_init.c
index 8f6e337..a962b46 100644
--- a/src/thread/pthread_attr_init.c
+++ b/src/thread/pthread_attr_init.c
@@ -1,9 +1,12 @@ 
 #include "pthread_impl.h"
 
+extern size_t __default_stacksize;
+extern size_t __default_guardsize;
+
 int pthread_attr_init(pthread_attr_t *a)
 {
 	*a = (pthread_attr_t){0};
-	a->_a_stacksize = DEFAULT_STACK_SIZE;
-	a->_a_guardsize = DEFAULT_GUARD_SIZE;
+	a->_a_stacksize = __default_stacksize;
+	a->_a_guardsize = __default_guardsize;
 	return 0;
 }
diff --git a/src/thread/pthread_create.c b/src/thread/pthread_create.c
index 2c066cf..2df2e9f 100644
--- a/src/thread/pthread_create.c
+++ b/src/thread/pthread_create.c
@@ -165,8 +165,8 @@  static void *dummy_tsd[1] = { 0 };
 weak_alias(dummy_tsd, __pthread_tsd_main);
 
 volatile int __block_new_threads = 0;
-size_t __default_stacksize = DEFAULT_STACK_SIZE;
-size_t __default_guardsize = DEFAULT_GUARD_SIZE;
+extern size_t __default_stacksize;
+extern size_t __default_guardsize;
 
 static FILE *volatile dummy_file = 0;
 weak_alias(dummy_file, __stdin_used);

Comments

Rich Felker July 26, 2018, 10:07 p.m.
On Thu, Jul 26, 2018 at 11:51:54AM -0400, Rich Felker wrote:
> On Sat, Aug 12, 2017 at 12:38:53AM -0500, Bobby Bingham wrote:
> > I've got a glibc-linked binary I'm trying to get running, and I'm
> > running into the usual stack overflow for threads other than the main
> > thread because of musl's smaller default stack size.  So I decided to
> > see if I could LD_PRELOAD a library with a constructor that used
> > pthread_setattr_default_np to increase the default stack size.
> > 
> > It turns out that it doesn't work for this binary because it passes a
> > thread attribute object to pthread_create without explicitly setting the
> > stack size in the attribute object.  According to the man page, the
> > values set with pthread_getattr_default_np are only used if no attribute
> > object is passed to pthread_create, which is how musl behaves.
> > 
> > It turns out this isn't what glibc does for stack size (but is for guard
> > size).
> > 
> > glibc's pthread_attr_init sets the stack size field in the attribute
> > object to zero, which is interpreted to mean the default stack size at
> > whatever time it's queried.  The attached program will print two
> > different stack sizes on glibc.
> > 
> > I think that the glibc behavior is problematic because
> > pthread_attr_getstacksize can return a value different than what will
> > actually be used if the default is changed before pthread_create is
> > called.
> > 
> > However, if we changed pthread_attr_init to initialize the stack size
> > based on the current default at the time its called, we'd be a little
> > closer to glibc's behavior, but without the quirk I mentioned above.
> > And it would fix my use case :)
> > 
> > Thoughts?
> 
> Somehow I missed this at the time, or forgot about it, and just
> rediscovered the same issue. Actually I thought glibc ignored the
> default set by pthread_setattr_default_np in this case, because I
> wasn't aware of the (likely non-conforming, unless
> pthread_attr_getstacksize patches it up) way zero is used in glibc,
> and I was about to propose we diverge from glibc here just because the
> current behavior is not useful for the intended purpose.
> 
> The whole reason pthread_setattr_default_np was added was to allow
> easy workarounds for applications/libraries that assume large stacks:
> so packagers can just add an early call to pthread_setattr_default_np
> to ensure that all threads created have sufficiently large stacks. But
> some code, notably SDL, creates an attribute object to set other
> properties, leaving the stack size alone, and thereby gets the
> musl-default stack size rather than the new default configured by
> pthread_setattr_default_np.
> 
> The attached patch fixes this, and I think I'll apply it, but I'm not
> entirely happy with the situation before or after this change, since
> pthread_setattr_default_np is thread-unsafe and thus it's unsafe to
> call from ctors in libraries that might be dynamic-loaded or where
> threads might have already been created by other libraries' ctors.
> 
> I think we should probably have pthread_setattr_default_np use a_cas
> to update the defaults, changing them from size_t to volatile int
> (there's no need to support defaults >INT_MAX; doing so would just be
> harmful) so a_cas is usable on them. Then the consumers,
> pthread_create and pthread_attr_init, would be getting relaxed-order
> atomics rather than data races. Assuming there's something else to
> order the pthread_setattr_default_np with respect to pthread_create or
> pthread_attr_init (e.g. being called in the same thread) you know the
> latter will get the latest-set default; otherwise it gets *some valid*
> default from the past or present.

One possible additional improvement that could be made:

There's a PT_GNU_STACK ELF program header that's normally only used to
request non-executable stack, but it can also specify a stack size the
program wants. This is used (and was historically required) for
FDPIC/NOMMU targets to tell the kernel how much stack to reserve for
the main thread, and is conventionally left at 0/default for normal
ELF, but ld is capable of writing a custom size with
-Wl,-z,stacksize=X.

So, we could automatically increase __default_stacksize according to
max(current,dso.pt_gnu_stacksize.p_memsz) for the main program and
each loaded dso. This would make it possible to work around programs
that need large stacks but don't explicitly request them without
patching, by instead just adding the appropriate LDFLAGS.

Are there any significant cons to doing this? Would it help
distros/integrators?

Rich
A. Wilcox Aug. 30, 2018, 6:13 p.m.
On 07/26/18 17:07, Rich Felker wrote:
> There's a PT_GNU_STACK ELF program header that's normally only used to
> request non-executable stack, but it can also specify a stack size the
> program wants. This is used (and was historically required) for
> FDPIC/NOMMU targets to tell the kernel how much stack to reserve for
> the main thread, and is conventionally left at 0/default for normal
> ELF, but ld is capable of writing a custom size with
> -Wl,-z,stacksize=X.
> 
> So, we could automatically increase __default_stacksize according to
> max(current,dso.pt_gnu_stacksize.p_memsz) for the main program and
> each loaded dso. This would make it possible to work around programs
> that need large stacks but don't explicitly request them without
> patching, by instead just adding the appropriate LDFLAGS.
> 
> Are there any significant cons to doing this? Would it help
> distros/integrators?
> 
> Rich


Short answer: YES!


Long answer:

One of the stickiest issues we've (Adélie) had with the default thread
stack size on musl is not really patching programs (that's easy, and
most upstreams actually do appreciate the conformance help).

It's Mozilla products.

There is no single place where a thread is spawned in Gecko / XPCOM /
MozJS.  It's a mess to try and fix it, so we haven't.

And that means that pages that use very deeply nested JavaScript... some
examples, off the top of my head:

- AT&T's landing pages
- Blue Cross of Oklahoma
- Forbes (magazine)
- Google Maps
- Namecheap's domain renewal page

... crash the browser because they exceeded the thread stack size.

Being able to simply add a linker flag that would "magically" fix this
for Firefox would be a huge win for us.


Best,
--arw