dlsym(handle) may search in unrelated libraries

Submitted by Markus Wichmann on Feb. 7, 2019, 5:43 p.m.

Details

Message ID 20190207174312.GE5469@voyager
State New
Series "dlsym(handle) may search in unrelated libraries"

Commit Message

Markus Wichmann Feb. 7, 2019, 5:43 p.m.
On Thu, Feb 07, 2019 at 04:42:15PM +0300, Alexey Izbyshev wrote:
> I think the easiest way is simply to modify load_deps() to always traverse
> DT_NEEDED in breadth-first order without relying on the dso list in the
> outer loop. load_deps() already effectively maintains a queue (deps) that
> can be used for BFS, so no recursion is needed.

OK, since we have to implement a BFS, that does in fact work. So I
implemented that. Still needs testing, though.
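
For readers following along, the queue-driven traversal can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch itself: struct node, its needed field, and collect_deps_bfs() are hypothetical stand-ins for struct dso, the DT_NEEDED entries, and load_deps(). The sketch allocates the terminator up front so the loop increment always has a slot to read, and it does no duplicate check, so it assumes an acyclic graph in which no library is needed twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dso: only what the traversal needs. */
struct node {
	const char *name;
	struct node **needed;   /* NULL-terminated; plays the role of DT_NEEDED */
};

/* Returns a malloc'd, NULL-terminated array with the transitive
 * dependencies of root in breadth-first order, or NULL if allocation
 * fails. The array doubles as the BFS queue: j is the read position,
 * ndeps the write position. No duplicate check is done here. */
static struct node **collect_deps_bfs(struct node *root)
{
	size_t i, ndeps = 0, j = 0;
	struct node **deps, **tmp, *p;

	assert(root != 0);
	/* Start with an empty, terminated queue so deps[j] is always readable. */
	deps = calloc(1, sizeof *deps);
	if (!deps) return 0;

	for (p = root; p; p = deps[j++]) {
		for (i = 0; p->needed && p->needed[i]; i++) {
			tmp = realloc(deps, sizeof *tmp * (ndeps + 2));
			if (!tmp) { free(deps); return 0; }
			tmp[ndeps++] = p->needed[i];
			tmp[ndeps] = 0;
			deps = tmp;
		}
	}
	return deps;
}
```

With a needing b and c, and b needing d, the returned array lists b, c, d: the direct dependencies come first, then the next level, and nothing is read from any global library list.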

One side effect is, patch 7 from the previous mail was reverted.

Another is that now load_deps() depends on the deps array as loop
structure. I was almost as far as just using the runtime code and adding
an a_crash() in case of allocation failure during loadtime, but then I
decided to just split loadtime and runtime apart. So
load_deps_loadtime() is just a copy of load_deps(), refactored with the
assumption runtime==0, and load_deps_runtime() is a copy of load_deps(),
with the patch discussed here and refactored under the assumption
runtime!=0.

I had noticed, during the refactoring, that this means that app->deps ==
{0}, always. So I wondered if that might bite us. However, the only
normal way to obtain a handle to the app itself is to call dlsym() with
RTLD_NEXT, which is one of the special handles that make dlsym() search
the given DSO and all following ones in the symbol list. And all of
the main app's dependencies are immediately added to the symbol list
after the first load_deps() call (now load_deps_loadtime()).

For Rich's comfort, I am attaching patch 6 again, so all relevant
patches are in one mail.

> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlsym.html
> [2] http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlopen.html
> 

Right you are, again. Maybe I should read more POSIX before working on
these things. :-)

> Alexey
> 

Ciao,
Markus

Patch

From 18008eb03acd59f6cbaa82c607f1969c70707e21 Mon Sep 17 00:00:00 2001
From: Markus Wichmann <nullplan@gmx.net>
Date: Thu, 7 Feb 2019 18:17:25 +0100
Subject: [PATCH 9/9] Fix runtime dependency accounting in dlopen().

As Alexey Izbyshev pointed out, the library given as argument to
dlopen() does not necessarily have to be the last in the chain. Nor do
any of the dependencies. Therefore it is wrong to assume that walking
the chain of libraries from any of them forward will only walk over
dependencies of the freshly loaded library.

I had to split the runtime and loadtime paths of load_deps() apart, or
otherwise I would have had to allocate the deps array for the
application at loadtime. And then I would have needed a resolution for
allocation failure, which would have been a crash.
---
 ldso/dynlink.c | 43 ++++++++++++++++++++++++++++---------------
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/ldso/dynlink.c b/ldso/dynlink.c
index 6ffeca85..66e6f18b 100644
--- a/ldso/dynlink.c
+++ b/ldso/dynlink.c
@@ -1136,10 +1136,10 @@  static struct dso *load_library(const char *name, struct dso *needed_by)
 	return p;
 }
 
-static void load_deps(struct dso *p)
-{
-	size_t i, ndeps=0;
-	struct dso ***deps = &p->deps, **tmp, *dep;
+static void load_deps_loadtime(struct dso *p) {
+	size_t i;
+	struct dso *dep;
+	p->deps = (struct dso**)&nodeps_dummy;
 	for (; p; p=p->next) {
 		for (i=0; p->dynv[i]; i+=2) {
 			if (p->dynv[i] != DT_NEEDED) continue;
@@ -1147,19 +1147,32 @@  static void load_deps(struct dso *p)
 			if (!dep) {
 				error("Error loading shared library %s: %m (needed by %s)",
 					p->strings + p->dynv[i+1], p->name);
-				if (runtime) longjmp(*rtld_fail, 1);
-				continue;
 			}
-			if (runtime) {
-				tmp = realloc(*deps, sizeof(*tmp)*(ndeps+2));
-				if (!tmp) longjmp(*rtld_fail, 1);
-				tmp[ndeps++] = dep;
-				tmp[ndeps] = 0;
-				*deps = tmp;
+		}
+	}
+}
+
+static void load_deps_runtime(struct dso *p)
+{
+	size_t i, ndeps=0, j=0;
+	struct dso ***deps = &p->deps, **tmp, *dep;
+	for (; p; p=(*deps)[j++]) {
+		for (i=0; p->dynv[i]; i+=2) {
+			if (p->dynv[i] != DT_NEEDED) continue;
+			dep = load_library(p->strings + p->dynv[i+1], p);
+			if (!dep) {
+				error("Error loading shared library %s: %m (needed by %s)",
+					p->strings + p->dynv[i+1], p->name);
+				longjmp(*rtld_fail, 1);
 			}
+			tmp = realloc(*deps, sizeof(*tmp)*(ndeps+2));
+			if (!tmp) longjmp(*rtld_fail, 1);
+			tmp[ndeps++] = dep;
+			tmp[ndeps] = 0;
+			*deps = tmp;
 		}
 	}
-	if (!*deps) *deps = (struct dso **)&nodeps_dummy;
+	if (!*deps) *deps = (struct dso**)&nodeps_dummy;
 }
 
 static void load_preload(char *s)
@@ -1653,7 +1666,7 @@  _Noreturn void __dls3(size_t *sp)
 
 	/* Load preload/needed libraries, add symbols to global namespace. */
 	if (env_preload) load_preload(env_preload);
- 	load_deps(&app);
+ 	load_deps_loadtime(&app);
 	for (struct dso *p=head; p; p=p->next)
 		add_syms(p);
 
@@ -1836,7 +1849,7 @@  void *dlopen(const char *file, int mode)
 	/* First load handling */
 	int first_load = !p->deps;
 	if (first_load) {
-		load_deps(p);
+		load_deps_runtime(p);
 		if (!p->relocated && (mode & RTLD_LAZY)) {
 			prepare_lazy(p);
 			for (i=0; p->deps[i]; i++)
-- 
2.20.1


Comments

Rich Felker Feb. 7, 2019, 9:29 p.m.
On Thu, Feb 07, 2019 at 06:43:12PM +0100, Markus Wichmann wrote:
> On Thu, Feb 07, 2019 at 04:42:15PM +0300, Alexey Izbyshev wrote:
> > I think the easiest way is simply to modify load_deps() to always traverse
> > DT_NEEDED in breadth-first order without relying on the dso list in the
> > outer loop. load_deps() already effectively maintains a queue (deps) that
> > can be used for BFS, so no recursion is needed.
> 
> OK, since we have to implement a BFS, that does in fact work. So I
> implemented that. Still needs testing, though.

Comments below:

> One side effect is, patch 7 from the previous mail was reverted.
> 
> Another is that now load_deps() depends on the deps array as loop
> structure. I was almost as far as just using the runtime code and adding
> an a_crash() in case of allocation failure during loadtime, but then I
> decided to just split loadtime and runtime apart. So
> load_deps_loadtime() is just a copy of load_deps(), refactored with the
> assumption runtime==0, and load_deps_runtime() is a copy of load_deps(),
> with the patch discussed here and refactored under the assumption
> runtime!=0.

The error() function sets ldso_fail to true, which prevents running of
the program. So it should work just fine to keep them together, and to
build the deps lists at program start time if we want (which is still
an open question, I think).

> I had noticed, during the refactoring, that this means that app->deps ==
> {0}, always. So I wondered if that might bite us. However, the only
> normal way to obtain a handle to the app itself is to call dlsym() with
> RTLD_NEXT, which is one of the special handles that make dlsym() search
> the given DSO and all following ones in the symbol list. And all of
> the main app's dependencies are immediately added to the symbol list
> after the first load_deps() call (now load_deps_loadtime()).

This seems correct.

> For Rich's comfort, I am attaching patch 6 again, so all relevant
> patches are in one mail.

One thing to fix in it; see below.

> From e823910d69ff56ffccecaa9b29fd4b67b901798a Mon Sep 17 00:00:00 2001
> From: Markus Wichmann <nullplan@gmx.net>
> Date: Wed, 6 Feb 2019 16:51:53 +0100
> Subject: [PATCH 6/9] Make libc and vdso explicitly have no deps.
> 
> Alexey Izbyshev reported that without this, dlopen("libc.so") returns a
> handle that is capable of finding every symbol in libraries loaded as
> dependencies, since dso->deps == 0 usually means dependencies haven't
> been loaded.
> ---
>  ldso/dynlink.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/ldso/dynlink.c b/ldso/dynlink.c
> index ec921dfd..6ffeca85 100644
> --- a/ldso/dynlink.c
> +++ b/ldso/dynlink.c
> @@ -1244,6 +1244,7 @@ static void reloc_all(struct dso *p)
>  static void kernel_mapped_dso(struct dso *p)
>  {
>  	size_t min_addr = -1, max_addr = 0, cnt;
> +	static const struct dso *sentinel = 0;

This fragment is unused and looks like cruft left over from an earlier
idea.

>  	Phdr *ph = p->phdr;
>  	for (cnt = p->phnum; cnt--; ph = (void *)((char *)ph + p->phentsize)) {
>  		if (ph->p_type == PT_DYNAMIC) {
> @@ -1428,6 +1429,7 @@ hidden void __dls2(unsigned char *base, size_t *sp)
>  	ldso.phdr = laddr(&ldso, ehdr->e_phoff);
>  	ldso.phentsize = ehdr->e_phentsize;
>  	kernel_mapped_dso(&ldso);
> +	ldso.deps = (struct dso**)&nodeps_dummy;
>  	decode_dyn(&ldso);
>  
>  	if (DL_FDPIC) makefuncdescs(&ldso);
> @@ -1675,6 +1677,7 @@ _Noreturn void __dls3(size_t *sp)
>  		vdso.prev = tail;
>  		tail->next = &vdso;
>  		tail = &vdso;
> +		vdso.deps = (struct dso**)&nodeps_dummy;
>  	}

Style nit: (struct dso **)

> From 18008eb03acd59f6cbaa82c607f1969c70707e21 Mon Sep 17 00:00:00 2001
> From: Markus Wichmann <nullplan@gmx.net>
> Date: Thu, 7 Feb 2019 18:17:25 +0100
> Subject: [PATCH 9/9] Fix runtime dependency accounting in dlopen().
> 
> As Alexey Izbyshev pointed out, the library given as argument to
> dlopen() does not necessarily have to be the last in the chain. Nor do
> any of the dependencies. Therefore it is wrong to assume that walking
> the chain of libraries from any of them forward will only walk over
> dependencies of the freshly loaded library.
> 
> I had to split the runtime and loadtime paths of load_deps() apart, or
> otherwise I would have had to allocate the deps array for the
> application at loadtime. And then I would have needed a resolution for
> allocation failure, which would have been a crash.
> ---
>  ldso/dynlink.c | 43 ++++++++++++++++++++++++++++---------------
>  1 file changed, 28 insertions(+), 15 deletions(-)
> 
> diff --git a/ldso/dynlink.c b/ldso/dynlink.c
> index 6ffeca85..66e6f18b 100644
> --- a/ldso/dynlink.c
> +++ b/ldso/dynlink.c
> @@ -1136,10 +1136,10 @@ static struct dso *load_library(const char *name, struct dso *needed_by)
>  	return p;
>  }
>  
> -static void load_deps(struct dso *p)
> -{
> -	size_t i, ndeps=0;
> -	struct dso ***deps = &p->deps, **tmp, *dep;
> +static void load_deps_loadtime(struct dso *p) {
> +	size_t i;
> +	struct dso *dep;
> +	p->deps = (struct dso**)&nodeps_dummy;
>  	for (; p; p=p->next) {
>  		for (i=0; p->dynv[i]; i+=2) {
>  			if (p->dynv[i] != DT_NEEDED) continue;
> @@ -1147,19 +1147,32 @@ static void load_deps(struct dso *p)
>  			if (!dep) {
>  				error("Error loading shared library %s: %m (needed by %s)",
>  					p->strings + p->dynv[i+1], p->name);
> -				if (runtime) longjmp(*rtld_fail, 1);
> -				continue;
>  			}
> -			if (runtime) {
> -				tmp = realloc(*deps, sizeof(*tmp)*(ndeps+2));
> -				if (!tmp) longjmp(*rtld_fail, 1);
> -				tmp[ndeps++] = dep;
> -				tmp[ndeps] = 0;
> -				*deps = tmp;
> +		}
> +	}
> +}
> +
> +static void load_deps_runtime(struct dso *p)
> +{
> +	size_t i, ndeps=0, j=0;
> +	struct dso ***deps = &p->deps, **tmp, *dep;
> +	for (; p; p=(*deps)[j++]) {
> +		for (i=0; p->dynv[i]; i+=2) {
> +			if (p->dynv[i] != DT_NEEDED) continue;
> +			dep = load_library(p->strings + p->dynv[i+1], p);
> +			if (!dep) {
> +				error("Error loading shared library %s: %m (needed by %s)",
> +					p->strings + p->dynv[i+1], p->name);
> +				longjmp(*rtld_fail, 1);
>  			}
> +			tmp = realloc(*deps, sizeof(*tmp)*(ndeps+2));
> +			if (!tmp) longjmp(*rtld_fail, 1);
> +			tmp[ndeps++] = dep;
> +			tmp[ndeps] = 0;
> +			*deps = tmp;
>  		}

Aside from the above remark about not splitting the two versions, I don't
think the algorithm works. In the case of circular dependencies, which
are awful but do happen in the wild, the loop will run forever and
keep appending to the deps array. I think this can be fixed via
comparison of each new dep against prior slots (linear time for each
addition, so quadratic overall) to avoid adding it more than once, or
since we hold a lock, a tag could be added to struct dso to tag which
ones we've already hit (constant time). Since load_library is already
a linear search with a larger value of N than the number of
dependencies, I don't really see any advantage to avoiding the linear
search here, and would just go with it since it's simpler and less
invasive.
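
To make the linear-search variant concrete, a rough sketch follows. It is an illustration only: struct node, already_queued(), and collect_deps_dedup() are hypothetical names, with struct node standing in for struct dso and its needed array standing in for the DT_NEEDED entries. Each candidate is compared against the slots filled so far and skipped if present, which is enough to make the walk terminate on a cycle.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dso. */
struct node {
	const char *name;
	struct node **needed;   /* NULL-terminated; plays the role of DT_NEEDED */
};

/* Linear scan of the slots filled so far: O(ndeps) per lookup. */
static int already_queued(struct node **deps, size_t ndeps, struct node *dep)
{
	for (size_t k = 0; k < ndeps; k++)
		if (deps[k] == dep) return 1;
	return 0;
}

/* Breadth-first walk that appends each dependency at most once.
 * Returns a malloc'd, NULL-terminated array (NULL on allocation
 * failure) and stores the number of entries in *count. Only prior
 * slots are checked, so on a cycle the root can appear in its own list. */
static struct node **collect_deps_dedup(struct node *root, size_t *count)
{
	size_t i, ndeps = 0, j = 0;
	struct node **deps, **tmp, *p;

	assert(root != 0 && count != 0);
	deps = calloc(1, sizeof *deps);
	if (!deps) return 0;

	for (p = root; p; p = deps[j++]) {
		for (i = 0; p->needed && p->needed[i]; i++) {
			if (already_queued(deps, ndeps, p->needed[i])) continue;
			tmp = realloc(deps, sizeof *tmp * (ndeps + 2));
			if (!tmp) { free(deps); return 0; }
			tmp[ndeps++] = p->needed[i];
			tmp[ndeps] = 0;
			deps = tmp;
		}
	}
	*count = ndeps;
	return deps;
}
```

With a needing b and c, and b needing a and c, the walk stops after three entries (b, c, a) instead of appending forever. The tag-in-struct alternative would replace already_queued() with a constant-time flag test, at the cost of touching the struct.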

>  	}
> -	if (!*deps) *deps = (struct dso **)&nodeps_dummy;
> +	if (!*deps) *deps = (struct dso**)&nodeps_dummy;

Spurious style regression. :)

Rich