[v2] MT fork

Submitted by Rich Felker on Nov. 11, 2020, 12:52 a.m.

Details

Message ID 20201111005216.GH534@brightrain.aerifal.cx
State New
Series "Status report and MT fork"
Headers show

Commit Message

Rich Felker Nov. 11, 2020, 12:52 a.m.
On Mon, Nov 09, 2020 at 05:23:21PM -0500, Rich Felker wrote:
> On Mon, Nov 09, 2020 at 10:59:01PM +0100, Szabolcs Nagy wrote:
> > (not ideal, since then interposers can't see all
> > allocations, which some tools would like to see,
> > but at least correct and robust. and it is annoying
> > that we have to do all this extra work just because
> > of mt-fork)
> 
> Yes. On the other hand if this were done more rigorously it would fix
> the valgrind breakage of malloc during ldso startup..
> 
> > > The other pros of such an approach are stuff like making it so
> > > application code doesn't get called as a callback from messy contexts
> > > inside libc, e.g. with dynamic linker in inconsistent state. The major
> > > con I see is that it precludes omitting the libc malloc entirely when
> > > static linking, assuming you link any part of libc that uses malloc
> > > internally. However, almost all such places only call malloc, not
> > > free, so you'd just get the trivial bump allocator gratuitously
> > > linked, rather than full mallocng or oldmalloc, except for dlerror
> > > which shouldn't come up in static linked programs anyway.
> > 
> > i see.
> > that sounds fine to me.
> 
> I'm still not sure it's fine, so I appreciate your input and anyone
> else's who has spent some time thinking about this.

Here's a proposed first patch in series, getting rid of getdelim/stdio
usage in ldso. I think that suffices to set the stage for adding
__libc_malloc, __libc_free, __libc_calloc, __libc_realloc and having
ldso use them.

To make this work, I think malloc needs to actually be a separate
function wrapping __libc_malloc -- this is because __simple_malloc
precludes the malloc symbol itself being weak. That's a very slight
runtime cost, but has the benefit of eliminating the awful hack of
relyin on link order to get __simple_malloc (bump allocator) to be
chosen over full malloc. Now, the source file containing the bump
allocator can define malloc as a call to __libc_malloc, and provide
the bump allocator as the weak definition of __libc_malloc.
mallocng/malloc.c would then provide the strong definition of
__libc_malloc.

For the other functions, I think __libc_* can theoretically just be
aliases for the public symbols, but this may (almost surely does)
break valgrind, and it's messy to do at the source level, so perhaps
they should be wrapped too. This should entirely prevent valgrind from
interposing the libc-internal calls, thereby fixing the longstanding
bug where it crashes by interposing them too early.

Rich
From b24186676cbc87c0e2c3e0fa672ef73ea55b600e Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Tue, 10 Nov 2020 19:32:09 -0500
Subject: [PATCH] drop use of getdelim/stdio in dynamic linker

the only place stdio was used here was for reading the ldso path file,
taking advantage of getdelim to automatically allocate and resize the
buffer. the motivation for use here was that, with shared libraries,
stdio is already available anyway and free to use. this has long been
a nuisance to users because getdelim's use of realloc here triggered a
valgrind bug, but removing it doesn't really fix that; on some archs
even calling the valgrind-interposed malloc at this point will crash.

the actual motivation for this change is moving towards getting rid of
use of application-provided malloc in parts of libc where it would be
called with libc-internal locks held, leading to the possibility of
deadlock if the malloc implementation doesn't follow unwritten rules
about which libc functions are safe for it to call. since getdelim is
required to produce a pointer as if by malloc (i.e. that can be passed
to reallor or free), it necessarily must use the public malloc.

instead of performing a realloc loop as the path file is read, first
query its size with fstat and allocate only once. this produces
slightly different truncation behavior when racing with writes to a
file, but neither behavior is or could be made safe anyway; on a live
system, ldso path files should be replaced by atomic rename only. the
change should also reduce memory waste.
---
 ldso/dynlink.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

Patch hide | download patch | download mbox

diff --git a/ldso/dynlink.c b/ldso/dynlink.c
index f9ac0100..d736bf12 100644
--- a/ldso/dynlink.c
+++ b/ldso/dynlink.c
@@ -1,6 +1,5 @@ 
 #define _GNU_SOURCE
 #define SYSCALL_NO_TLS 1
-#include <stdio.h>
 #include <stdlib.h>
 #include <stdarg.h>
 #include <stddef.h>
@@ -556,6 +555,20 @@  static void reclaim_gaps(struct dso *dso)
 	}
 }
 
+static ssize_t read_loop(int fd, void *p, size_t n)
+{
+	unsigned char *b = p;
+	for (size_t l, i=0; i<n; i+=l) {
+		l = read(fd, b+i, n-i);
+		if (l<0) {
+			if (errno==EINTR) continue;
+			else return -1;
+		}
+		if (l==0) return i;
+	}
+	return n;
+}
+
 static void *mmap_fixed(void *p, size_t n, int prot, int flags, int fd, off_t off)
 {
 	static int no_map_fixed;
@@ -1060,13 +1073,17 @@  static struct dso *load_library(const char *name, struct dso *needed_by)
 				snprintf(etc_ldso_path, sizeof etc_ldso_path,
 					"%.*s/etc/ld-musl-" LDSO_ARCH ".path",
 					(int)prefix_len, prefix);
-				FILE *f = fopen(etc_ldso_path, "rbe");
-				if (f) {
-					if (getdelim(&sys_path, (size_t[1]){0}, 0, f) <= 0) {
+				fd = open(etc_ldso_path, O_RDONLY|O_CLOEXEC);
+				if (fd>=0) {
+					size_t n = 0;
+					if (!fstat(fd, &st)) n = st.st_size;
+					sys_path = malloc(n+1);
+					sys_path[n] = 0;
+					if (!sys_path || read_loop(fd, sys_path, n)<0) {
 						free(sys_path);
 						sys_path = "";
 					}
-					fclose(f);
+					close(fd);
 				} else if (errno != ENOENT) {
 					sys_path = "";
 				}

Comments

Alexey Izbyshev Nov. 11, 2020, 6:35 a.m.
On 2020-11-11 03:52, Rich Felker wrote:
> Here's a proposed first patch in series, getting rid of getdelim/stdio
> usage in ldso. I think that suffices to set the stage for adding
> __libc_malloc, __libc_free, __libc_calloc, __libc_realloc and having
> ldso use them.
> 

> +static ssize_t read_loop(int fd, void *p, size_t n)
> +{
> +    unsigned char *b = p;
> +    for (size_t l, i=0; i<n; i+=l) {
> +        l = read(fd, b+i, n-i);
> +        if (l<0) {
> +            if (errno==EINTR) continue;
This increments `i` by a negative `l`.
> +            else return -1;
> +        }
> +        if (l==0) return i;
> +    }
> +    return n;
> +}

> +                if (fd>=0) {
> +                    size_t n = 0;
> +                    if (!fstat(fd, &st)) n = st.st_size;
> +                    sys_path = malloc(n+1);
> +                    sys_path[n] = 0;
`sys_path` can be NULL here.
> +                    if (!sys_path || read_loop(fd, sys_path, n)<0) {
>                         free(sys_path);
>                         sys_path = "";
>                     }
> -                    fclose(f);
> +                    close(fd);
>                 } else if (errno != ENOENT) {
>                     sys_path = "";
>                 }

Alexey
Szabolcs Nagy Nov. 11, 2020, 11:25 a.m.
* Alexey Izbyshev <izbyshev@ispras.ru> [2020-11-11 09:35:22 +0300]:
> On 2020-11-11 03:52, Rich Felker wrote:
> > Here's a proposed first patch in series, getting rid of getdelim/stdio
> > usage in ldso. I think that suffices to set the stage for adding
> > __libc_malloc, __libc_free, __libc_calloc, __libc_realloc and having
> > ldso use them.
> > 

if we don't have to replicate a lot of code in ldso then this sounds good.

> > +static ssize_t read_loop(int fd, void *p, size_t n)
> > +{
> > +    unsigned char *b = p;
> > +    for (size_t l, i=0; i<n; i+=l) {
> > +        l = read(fd, b+i, n-i);
> > +        if (l<0) {
> > +            if (errno==EINTR) continue;
> This increments `i` by a negative `l`.

it's worse: l cannot be negative so the error check is ineffective.

maybe it should be ssize_t? or check == -1

> > +            else return -1;
> > +        }
> > +        if (l==0) return i;
> > +    }
> > +    return n;
> > +}
> 
> > +                if (fd>=0) {
> > +                    size_t n = 0;
> > +                    if (!fstat(fd, &st)) n = st.st_size;
> > +                    sys_path = malloc(n+1);
> > +                    sys_path[n] = 0;
> `sys_path` can be NULL here.
> > +                    if (!sys_path || read_loop(fd, sys_path, n)<0) {
> >                         free(sys_path);
> >                         sys_path = "";
> >                     }
> > -                    fclose(f);
> > +                    close(fd);
> >                 } else if (errno != ENOENT) {
> >                     sys_path = "";
> >                 }
> 
> Alexey
Rich Felker Nov. 11, 2020, 2:56 p.m.
On Wed, Nov 11, 2020 at 12:25:00PM +0100, Szabolcs Nagy wrote:
> * Alexey Izbyshev <izbyshev@ispras.ru> [2020-11-11 09:35:22 +0300]:
> > On 2020-11-11 03:52, Rich Felker wrote:
> > > Here's a proposed first patch in series, getting rid of getdelim/stdio
> > > usage in ldso. I think that suffices to set the stage for adding
> > > __libc_malloc, __libc_free, __libc_calloc, __libc_realloc and having
> > > ldso use them.
> > > 
> 
> if we don't have to replicate a lot of code in ldso then this sounds good.

Indeed the ldso part of the patch it just:

+#define malloc __libc_malloc
+#define calloc __libc_calloc
+#define realloc __libc_realloc
+#define free __libc_free

That's the minimal needed to make it work. Assuming we adopt and keep
this I might also remove a bunch of ugly code in dynlink.c that
special-cases whether it's running with replaced malloc or not.

> > > +static ssize_t read_loop(int fd, void *p, size_t n)
> > > +{
> > > +    unsigned char *b = p;
> > > +    for (size_t l, i=0; i<n; i+=l) {
> > > +        l = read(fd, b+i, n-i);
> > > +        if (l<0) {
> > > +            if (errno==EINTR) continue;
> > This increments `i` by a negative `l`.

Thanks Alexey for catching that!

> it's worse: l cannot be negative so the error check is ineffective.
> 
> maybe it should be ssize_t? or check == -1

Yes, there are multiple problems and this was the original motivation
for using stdio -- not having to write this ugly error-prone code. But
it only has to be written and reviewed once so it shouldn't be too bad
to get it right. ssize_t should be fine. I mainly did ridiculous stuff
trying to be clever with scope of the vars.

> > > +                if (fd>=0) {
> > > +                    size_t n = 0;
> > > +                    if (!fstat(fd, &st)) n = st.st_size;
> > > +                    sys_path = malloc(n+1);
> > > +                    sys_path[n] = 0;
> > `sys_path` can be NULL here.

Thanks, I meant to put that in if ((sys_path = malloc(n+1))) or
something. Will fix.

Rich
Rich Felker Nov. 11, 2020, 4:35 p.m.
On Tue, Nov 10, 2020 at 07:52:17PM -0500, Rich Felker wrote:
> On Mon, Nov 09, 2020 at 05:23:21PM -0500, Rich Felker wrote:
> > On Mon, Nov 09, 2020 at 10:59:01PM +0100, Szabolcs Nagy wrote:
> > > (not ideal, since then interposers can't see all
> > > allocations, which some tools would like to see,
> > > but at least correct and robust. and it is annoying
> > > that we have to do all this extra work just because
> > > of mt-fork)
> > 
> > Yes. On the other hand if this were done more rigorously it would fix
> > the valgrind breakage of malloc during ldso startup..
> > 
> > > > The other pros of such an approach are stuff like making it so
> > > > application code doesn't get called as a callback from messy contexts
> > > > inside libc, e.g. with dynamic linker in inconsistent state. The major
> > > > con I see is that it precludes omitting the libc malloc entirely when
> > > > static linking, assuming you link any part of libc that uses malloc
> > > > internally. However, almost all such places only call malloc, not
> > > > free, so you'd just get the trivial bump allocator gratuitously
> > > > linked, rather than full mallocng or oldmalloc, except for dlerror
> > > > which shouldn't come up in static linked programs anyway.
> > > 
> > > i see.
> > > that sounds fine to me.
> > 
> > I'm still not sure it's fine, so I appreciate your input and anyone
> > else's who has spent some time thinking about this.
> 
> Here's a proposed first patch in series, getting rid of getdelim/stdio
> usage in ldso. I think that suffices to set the stage for adding
> __libc_malloc, __libc_free, __libc_calloc, __libc_realloc and having
> ldso use them.
> 
> To make this work, I think malloc needs to actually be a separate
> function wrapping __libc_malloc -- this is because __simple_malloc
> precludes the malloc symbol itself being weak. That's a very slight
> runtime cost, but has the benefit of eliminating the awful hack of
> relyin on link order to get __simple_malloc (bump allocator) to be
> chosen over full malloc. Now, the source file containing the bump
> allocator can define malloc as a call to __libc_malloc, and provide
> the bump allocator as the weak definition of __libc_malloc.
> mallocng/malloc.c would then provide the strong definition of
> __libc_malloc.
> 
> For the other functions, I think __libc_* can theoretically just be
> aliases for the public symbols, but this may (almost surely does)
> break valgrind, and it's messy to do at the source level, so perhaps
> they should be wrapped too. This should entirely prevent valgrind from
> interposing the libc-internal calls, thereby fixing the longstanding
> bug where it crashes by interposing them too early.

The __simple_malloc link logic for this actually turned out to be
somewhat complicated, so I want to document it here independent of
code before actually working out and posting the code draft:

The source file defining __simple_malloc (lite_malloc.c but I'll
probably rename it to malloc.c at some point) needs to also define
(the only definition of) __libc_malloc and a weak symbol for malloc,
which should be the only definition of malloc in libc.

The first ensures that any reference to __libc_malloc will cause this
file to be linked (rather than relying on link order or else having
the possibility of pulling in full malloc). The latter ensures that
any reference to malloc will cause this file to be linked, unless the
application already defined its own malloc, and weakness ensures they
won't clash if the application did.

Both __libc_malloc and (the weak) malloc function should just call
__libc_malloc_impl, which has a weak definition provided by
__simple_malloc and a strong definition if full malloc is linked (as a
result of referencing free, realloc, etc.). Here malloc could simply
be a weak alias for __libc_malloc, except that valgrind would then
munge __libc_malloc. The function is nothing but a tail call so
duplicating the code twice really doesn't hurt anyway.

Aside from the new __libc_malloc function, the above is really how
this always should have worked (to avoid the link order hack). Not
doing it was a matter of avoiding the wrapper function layer.

Rich