time: Fix bug in timer_create

Submitted by changdiankang on Sept. 24, 2019, 11:16 a.m.

Details

Message ID 5114FFD1D7758546892A923CB3078CD4121624A3@dggemi523-mbx.china.huawei.com
State New
Series "time: Fix bug in timer_create"
Headers show

Commit Message

changdiankang Sept. 24, 2019, 11:16 a.m.
time: Fix bug in timer_create

When create a SIGEV_THREAD timer, in helper thread "start",
"id" is assigned with self->timer_id immediately after pthread_create.
But the real timer_id is assigned to self->timer_id after SYS_timer_create.
So "id" in "start" always be 0, and timer_delete will always fail.
Put assignement of "id" after pthread_barrier_wait can fix this bug.

Signed-off-by: Chang Diankang <changdiankang@huawei.com<mailto:changdiankang@huawei.com>>

Patch hide | download patch | download mbox

diff --git a/src/time/timer_create.c b/src/time/timer_create.c
index c5e40a1..4172b9e 100644
--- a/src/time/timer_create.c
+++ b/src/time/timer_create.c
@@ -48,13 +48,14 @@  static void *start(void *arg)
{
        pthread_t self = __pthread_self();
        struct start_args *args = arg;
-       int id = self->timer_id;
+       int id;
        jmp_buf jb;

        void (*notify)(union sigval) = args->sev->sigev_notify_function;
        union sigval val = args->sev->sigev_value;

        pthread_barrier_wait(&args->b);
+       id = self->timer_id;
        for (;;) {
                siginfo_t si;
                while (sigwaitinfo(SIGTIMER_SET, &si) < 0);

Comments

Rich Felker Sept. 25, 2019, 5:19 p.m.
On Tue, Sep 24, 2019 at 11:16:41AM +0000, changdiankang wrote:
> time: Fix bug in timer_create
> 
> When create a SIGEV_THREAD timer, in helper thread "start",
> "id" is assigned with self->timer_id immediately after pthread_create.
> But the real timer_id is assigned to self->timer_id after SYS_timer_create.
> So "id" in "start" always be 0, and timer_delete will always fail.
> Put assignement of "id" after pthread_barrier_wait can fix this bug.
> 
> Signed-off-by: Chang Diankang <changdiankang@huawei.com<mailto:changdiankang@huawei.com>>
> 
> diff --git a/src/time/timer_create.c b/src/time/timer_create.c
> index c5e40a1..4172b9e 100644
> --- a/src/time/timer_create.c
> +++ b/src/time/timer_create.c
> @@ -48,13 +48,14 @@ static void *start(void *arg)
> {
>         pthread_t self = __pthread_self();
>         struct start_args *args = arg;
> -       int id = self->timer_id;
> +       int id;
>         jmp_buf jb;
> 
>         void (*notify)(union sigval) = args->sev->sigev_notify_function;
>         union sigval val = args->sev->sigev_value;
> 
>         pthread_barrier_wait(&args->b);
> +       id = self->timer_id;
>         for (;;) {
>                 siginfo_t si;
>                 while (sigwaitinfo(SIGTIMER_SET, &si) < 0);
> 

Thanks! In the future, please send patches as attachments rather than
inline since it looks like your mailer software is corrupting the
patch (converting tabs to spaces, maybe other problems too). I can
apply it manually though.

Rich
Rich Felker Sept. 25, 2019, 5:34 p.m.
On Wed, Sep 25, 2019 at 01:19:05PM -0400, Rich Felker wrote:
> On Tue, Sep 24, 2019 at 11:16:41AM +0000, changdiankang wrote:
> > time: Fix bug in timer_create
> > 
> > When create a SIGEV_THREAD timer, in helper thread "start",
> > "id" is assigned with self->timer_id immediately after pthread_create.
> > But the real timer_id is assigned to self->timer_id after SYS_timer_create.
> > So "id" in "start" always be 0, and timer_delete will always fail.
> > Put assignement of "id" after pthread_barrier_wait can fix this bug.
> > 
> > Signed-off-by: Chang Diankang <changdiankang@huawei.com<mailto:changdiankang@huawei.com>>
> > 
> > diff --git a/src/time/timer_create.c b/src/time/timer_create.c
> > index c5e40a1..4172b9e 100644
> > --- a/src/time/timer_create.c
> > +++ b/src/time/timer_create.c
> > @@ -48,13 +48,14 @@ static void *start(void *arg)
> > {
> >         pthread_t self = __pthread_self();
> >         struct start_args *args = arg;
> > -       int id = self->timer_id;
> > +       int id;
> >         jmp_buf jb;
> > 
> >         void (*notify)(union sigval) = args->sev->sigev_notify_function;
> >         union sigval val = args->sev->sigev_value;
> > 
> >         pthread_barrier_wait(&args->b);
> > +       id = self->timer_id;
> >         for (;;) {
> >                 siginfo_t si;
> >                 while (sigwaitinfo(SIGTIMER_SET, &si) < 0);
> > 
> 
> Thanks! In the future, please send patches as attachments rather than
> inline since it looks like your mailer software is corrupting the
> patch (converting tabs to spaces, maybe other problems too). I can
> apply it manually though.

Actually, I don't think this patch is a complete fix. After the
barrier wait passes, the caller can immediately call timer_delete,
which sets the sign bit of ->timer_id, possibly before the start
function reads the initial value of ->timer_id and saves it.

I'll extend this to a full fix. Just masking off the bit would be a
hack that would work, but would still leave the formal data race, so I
think it would make sense to just drop use of the barrier and instead
use a semaphore that both sides have to post (timer thread does
wait->post, calling thread does post->wait). That would be nice in
terms of getting rid of linking dependency on pthread barriers too.

Rich
Rich Felker Sept. 26, 2019, 3:41 a.m.
On Wed, Sep 25, 2019 at 01:34:24PM -0400, Rich Felker wrote:
> On Wed, Sep 25, 2019 at 01:19:05PM -0400, Rich Felker wrote:
> > On Tue, Sep 24, 2019 at 11:16:41AM +0000, changdiankang wrote:
> > > time: Fix bug in timer_create
> > > 
> > > When create a SIGEV_THREAD timer, in helper thread "start",
> > > "id" is assigned with self->timer_id immediately after pthread_create.
> > > But the real timer_id is assigned to self->timer_id after SYS_timer_create.
> > > So "id" in "start" always be 0, and timer_delete will always fail.
> > > Put assignement of "id" after pthread_barrier_wait can fix this bug.
> > > 
> > > Signed-off-by: Chang Diankang <changdiankang@huawei.com<mailto:changdiankang@huawei.com>>
> > > 
> > > diff --git a/src/time/timer_create.c b/src/time/timer_create.c
> > > index c5e40a1..4172b9e 100644
> > > --- a/src/time/timer_create.c
> > > +++ b/src/time/timer_create.c
> > > @@ -48,13 +48,14 @@ static void *start(void *arg)
> > > {
> > >         pthread_t self = __pthread_self();
> > >         struct start_args *args = arg;
> > > -       int id = self->timer_id;
> > > +       int id;
> > >         jmp_buf jb;
> > > 
> > >         void (*notify)(union sigval) = args->sev->sigev_notify_function;
> > >         union sigval val = args->sev->sigev_value;
> > > 
> > >         pthread_barrier_wait(&args->b);
> > > +       id = self->timer_id;
> > >         for (;;) {
> > >                 siginfo_t si;
> > >                 while (sigwaitinfo(SIGTIMER_SET, &si) < 0);
> > > 
> > 
> > Thanks! In the future, please send patches as attachments rather than
> > inline since it looks like your mailer software is corrupting the
> > patch (converting tabs to spaces, maybe other problems too). I can
> > apply it manually though.
> 
> Actually, I don't think this patch is a complete fix. After the
> barrier wait passes, the caller can immediately call timer_delete,
> which sets the sign bit of ->timer_id, possibly before the start
> function reads the initial value of ->timer_id and saves it.
> 
> I'll extend this to a full fix. Just masking off the bit would be a
> hack that would work, but would still leave the formal data race, so I
> think it would make sense to just drop use of the barrier and instead
> use a semaphore that both sides have to post (timer thread does
> wait->post, calling thread does post->wait). That would be nice in
> terms of getting rid of linking dependency on pthread barriers too.

It turns out it was easiest to just move the load of timer_id to the
deletion code path.

All of this will go away at some point anyway when we stop using
kernel timer objects for SIGEV_THREAD timers and instead just use
clock_nanosleep and manual overrun counting.

Rich