value of thread-specific data immediately after initialization

Submitted by Rich Felker on Sept. 18, 2018, 3:18 a.m.

Details

Message ID 20180918031807.GS17995@brightrain.aerifal.cx
State New
Series "value of thread-specific data immediately after initialization"
Headers show

Commit Message

Rich Felker Sept. 18, 2018, 3:18 a.m.
On Mon, Sep 17, 2018 at 04:54:56PM -0400, Rich Felker wrote:
> On Thu, Sep 13, 2018 at 01:39:38PM -0700, Benjamin Peterson wrote:
> > POSIX says that after the creation of a thread-specific key, its
> > associated value should be NULL in all threads. musl may not uphold
> > this requirement if a particular key is deleted and later resued.
> > 
> > Here's an example of the bug:
> > 
> > #include <pthread.h>
> > #include <stdio.h>
> > 
> > int main() {
> > 	pthread_key_t k;
> > 	pthread_key_create(&k, NULL);
> > 	printf("%p\n", pthread_getspecific(k));
> > 	pthread_setspecific(k, &k);
> > 	pthread_key_delete(k);
> > 	pthread_key_create(&k, NULL);
> > 	printf("%p\n", pthread_getspecific(k));
> > 	pthread_key_delete(k);
> > 	return 0;
> > }
> > 
> > Per POSIX, I would expect this testcase to output two NULLs.
> > However, musl prints the address of the old data even after the
> > second initialization:
> > 
> > $ musl-gcc -o test test.c
> > $ ./test 
> > 0
> > 0x7fff2ba3d004
> 
> I'm aware of this, and was aware of it when I wrote the code. My view
> was that it's a programming error to delete a key while there are
> still values associated with it, especially when the intended usage
> has dtors (which obviously couldn't be run) linked to the keys, with
> the implication that the data stored may have nontrivial destruction
> requirements. However, this is not in agreement with POSIX, which
> says:
> 
>     "The thread-specific data values associated with key need not be
>     NULL at the time pthread_key_delete() is called."
> 
> I think this should be fixed, but I'm not sure the whole thing is
> sufficiently specified. For example what is supposed to happen if a
> call to pthread_key_delete is made concurrently with the exit of
> another thread?
> 
> I think pthread_key_create and delete need to take an rwlock for write
> on the keys (dtors) array, and __pthread_tsd_run_dtors needs to hold a
> read lock while walking the list, releasing and reacquiring it
> whenever it finds a dtor it needs to call. This at least makes it
> thread-safe. The current code is not; if a key is deleted while data
> remains, the loop in __pthread_tsd_run_dtors has a TOCTOU race where
> it could find (self->tsd[i] && keys[i]) true, then crash when calling
> keys[i] if keys[i] has become a null pointer.
> 
> For clearing values at delete time, I don't want to use sequence
> numbers and lazy deletion. They waste a lot of memory, aren't robust
> unless they're very large (at least 64-bit, preferably larger), and
> slow down read access to tsd values. Instead I think we can move
> pthread_key_delete to a separate file (so it doesn't pull in bulky
> dependencies if not linked) and have it call __synccall to zero the
> value in all threads. This is very expensive, but deletion is not
> something you expect to have happen often if at all.
> 
> If we maintained a linked list of all live threads we could just walk
> that instead, but doing that would impose heavy synchronization costs
> on reasonable operations rather than just on unreasonable ones, I
> think.
> 
> One mitigation of the cost would be not to reuse keys until they're
> exhausted. That is, on deletion, set the dtor value to a sentinel
> rather than clearing it. Then, pthread_key_create would only allocate
> new key values that haven't been recycled until they all run out. When
> they run out, __synccall would be used to clear all the stale values
> at once, and all the sentinel dtor pointers could be cleared for
> reuse.

I have a patch I'm testing that implements this idea, including the
mitigation in the last paragraph. It seems to work fine; I've attached
it in case you or anyone else wants to take a look before I'm ready to
commit/push.

Note that your test program is insufficient to demonstrate that it's
working; it assumes the same key will be handed out again, which now
won't happen until all keys have been exhausted. The best way I know
to test is loop allocating keys until you get an error, setting a
value for all of them, then delete and recreate just one and confirm
that its value is null.

Rich

Patch hide | download patch | download mbox

diff --git a/src/internal/pthread_impl.h b/src/internal/pthread_impl.h
index 052a547..26e6e1d 100644
--- a/src/internal/pthread_impl.h
+++ b/src/internal/pthread_impl.h
@@ -148,6 +148,9 @@  hidden void __do_cleanup_push(struct __ptcb *);
 hidden void __do_cleanup_pop(struct __ptcb *);
 hidden void __pthread_tsd_run_dtors();
 
+hidden void __pthread_key_delete_synccall(void (*)(void *), void *);
+hidden int __pthread_key_delete_impl(pthread_key_t);
+
 extern hidden volatile int __block_new_threads;
 extern hidden volatile size_t __pthread_tsd_size;
 extern hidden void *__pthread_tsd_main[];
diff --git a/src/thread/pthread_key_create.c b/src/thread/pthread_key_create.c
index a78e507..b6ec8c5 100644
--- a/src/thread/pthread_key_create.c
+++ b/src/thread/pthread_key_create.c
@@ -5,52 +5,100 @@  void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
 
 static void (*volatile keys[PTHREAD_KEYS_MAX])(void *);
 
+static pthread_rwlock_t key_lock = PTHREAD_RWLOCK_INITIALIZER;
+
+static pthread_key_t next_key;
+
 static void nodtor(void *dummy)
 {
 }
 
+static void dirty(void *dummy)
+{
+}
+
+static void clean_dirty_tsd(void *dummy)
+{
+	pthread_t self = __pthread_self();
+	pthread_key_t i;
+	for (i=0; i<PTHREAD_KEYS_MAX; i++) {
+		if (keys[i] == dirty && self->tsd[i])
+			self->tsd[i] = 0;
+	}
+}
+
 int __pthread_key_create(pthread_key_t *k, void (*dtor)(void *))
 {
-	unsigned i = (uintptr_t)&k / 16 % PTHREAD_KEYS_MAX;
-	unsigned j = i;
+	pthread_key_t j = next_key;
 	pthread_t self = __pthread_self();
+	int found_dirty = 0;
 
 	/* This can only happen in the main thread before
 	 * pthread_create has been called. */
 	if (!self->tsd) self->tsd = __pthread_tsd_main;
 
 	if (!dtor) dtor = nodtor;
+
+	pthread_rwlock_wrlock(&key_lock);
 	do {
-		if (!a_cas_p(keys+j, 0, (void *)dtor)) {
-			*k = j;
+		if (!keys[j]) {
+			keys[next_key = *k = j] = dtor;
+			pthread_rwlock_unlock(&key_lock);
 			return 0;
+		} else if (keys[j] == dirty) {
+			found_dirty = 1;
+		}
+	} while ((j=(j+1)%PTHREAD_KEYS_MAX) != next_key);
+
+	if (!found_dirty) {
+		pthread_rwlock_unlock(&key_lock);
+		return EAGAIN;
+	}
+
+	__pthread_key_delete_synccall(clean_dirty_tsd, 0);
+
+	for (j=0; j<PTHREAD_KEYS_MAX; j++) {
+		if (keys[j] == dirty) {
+			if (dtor) {
+				keys[next_key = *k = j] = dtor;
+				dtor = 0;
+			} else {
+				keys[j] = 0;
+			}
 		}
-	} while ((j=(j+1)%PTHREAD_KEYS_MAX) != i);
-	return EAGAIN;
+	}
+
+	pthread_rwlock_unlock(&key_lock);
+	return 0;
 }
 
-int __pthread_key_delete(pthread_key_t k)
+int __pthread_key_delete_impl(pthread_key_t k)
 {
-	keys[k] = 0;
+	pthread_rwlock_wrlock(&key_lock);
+	keys[k] = dirty;
+	pthread_rwlock_unlock(&key_lock);
 	return 0;
 }
 
 void __pthread_tsd_run_dtors()
 {
 	pthread_t self = __pthread_self();
-	int i, j, not_finished = self->tsd_used;
-	for (j=0; not_finished && j<PTHREAD_DESTRUCTOR_ITERATIONS; j++) {
-		not_finished = 0;
+	int i, j;
+	pthread_rwlock_rdlock(&key_lock);
+	for (j=0; self->tsd_used && j<PTHREAD_DESTRUCTOR_ITERATIONS; j++) {
+		self->tsd_used = 0;
 		for (i=0; i<PTHREAD_KEYS_MAX; i++) {
-			if (self->tsd[i] && keys[i]) {
-				void *tmp = self->tsd[i];
-				self->tsd[i] = 0;
-				keys[i](tmp);
-				not_finished = 1;
+			void *val = self->tsd[i];
+			void (*dtor)(void *) = keys[i];
+			self->tsd[i] = 0;
+			if (val && dtor && dtor != nodtor && dtor != dirty) {
+				pthread_rwlock_unlock(&key_lock);
+				dtor(val);
+				pthread_rwlock_rdlock(&key_lock);
 			}
 		}
 	}
+	pthread_rwlock_unlock(&key_lock);
 }
 
-weak_alias(__pthread_key_delete, pthread_key_delete);
 weak_alias(__pthread_key_create, pthread_key_create);
diff --git a/src/thread/pthread_key_delete.c b/src/thread/pthread_key_delete.c
index e69de29..012fe2d 100644
--- a/src/thread/pthread_key_delete.c
+++ b/src/thread/pthread_key_delete.c
@@ -0,0 +1,14 @@ 
+#include "pthread_impl.h"
+#include "libc.h"
+
+void __pthread_key_delete_synccall(void (*f)(void *), void *p)
+{
+	__synccall(f, p);
+}
+
+int __pthread_key_delete(pthread_key_t k)
+{
+	return __pthread_key_delete_impl(k);
+}
+
+weak_alias(__pthread_key_delete, pthread_key_delete);