[REVIEW,03/26] signal/arm64: Use force_sig not force_sig_fault for SIGKILL

Submitted by Eric W. Biederman on May 23, 2019, 12:38 a.m.

Details

Message ID 20190523003916.20726-4-ebiederm@xmission.com
State New
Series "signal: Remove task argument from force_sig_info"
Headers show

Commit Message

Eric W. Biederman May 23, 2019, 12:38 a.m.
It really only matters to debuggers but the SIGKILL does not have any
si_codes that use the fault member of the siginfo union.  Correct this
the simple way and call force_sig instead of force_sig_fault when the
signal is SIGKILL.

Cc: stable@vger.kernel.org
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: af40ff687bc9 ("arm64: signal: Ensure si_code is valid for all fault signals")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/arm64/kernel/traps.c | 5 +++++
 1 file changed, 5 insertions(+)

Patch hide | download patch | download mbox

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index ade32046f3fe..0feb17bdcaa0 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -282,6 +282,11 @@  void arm64_notify_die(const char *str, struct pt_regs *regs,
 		current->thread.fault_address = 0;
 		current->thread.fault_code = err;
 
+		if (signo == SIGKILL) {
+			arm64_show_signal(signo, str);
+			force_sig(signo, current);
+			return;
+		}
 		arm64_force_sig_fault(signo, sicode, addr, str);
 	} else {
 		die(str, regs, err);

Comments

Will Deacon May 23, 2019, 10:17 a.m.
On Wed, May 22, 2019 at 07:38:53PM -0500, Eric W. Biederman wrote:
> It really only matters to debuggers but the SIGKILL does not have any
> si_codes that use the fault member of the siginfo union.  Correct this
> the simple way and call force_sig instead of force_sig_fault when the
> signal is SIGKILL.
> 
> Cc: stable@vger.kernel.org
> Cc: Dave Martin <Dave.Martin@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Fixes: af40ff687bc9 ("arm64: signal: Ensure si_code is valid for all fault signals")
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  arch/arm64/kernel/traps.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index ade32046f3fe..0feb17bdcaa0 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -282,6 +282,11 @@ void arm64_notify_die(const char *str, struct pt_regs *regs,
>  		current->thread.fault_address = 0;
>  		current->thread.fault_code = err;
>  
> +		if (signo == SIGKILL) {
> +			arm64_show_signal(signo, str);
> +			force_sig(signo, current);
> +			return;
> +		}

I know it's a bit of a misnomer, but I'd rather do this check inside
arm64_force_sig_fault, since I think we have other callers (e.g.
do_bad_area()) which also blindly pass in SIGKILL here.

We could rename the thing if necessary.

Will
Dave P Martin May 23, 2019, 10:21 a.m.
On Thu, May 23, 2019 at 01:38:53AM +0100, Eric W. Biederman wrote:
> It really only matters to debuggers but the SIGKILL does not have any
> si_codes that use the fault member of the siginfo union.  Correct this
> the simple way and call force_sig instead of force_sig_fault when the
> signal is SIGKILL.

I haven't fully understood the context for this, but why does it matter
what's in siginfo for SIGKILL?  My understanding is that userspace
(including ptrace) never gets to see it anyway for the SIGKILL case.

Here it feels like SIGKILL is logically a synchronous, thread-targeted
fault: we must ensure that no subsequent insn in current executes (just
like other fault signal).  In this case, I thought we fall back to
SIGKILL not because there is no fault, but because we failed to
properly diagnose or report the type of fault that occurred.

So maybe handling it consistently with other faults signals makes
sense.  The fact that delivery of this signal destroys the process
before anyone can look at the resulting siginfo feels like a
side-effect rather than something obviously wrong.

The siginfo is potentially useful diagnostic information, that we could
subsequently provide a means to access post-mortem.

I just dived in on this single patch, so I may be missing something more
fundamental, or just being pedantic...

Cheers
---Dave

> Cc: stable@vger.kernel.org
> Cc: Dave Martin <Dave.Martin@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Fixes: af40ff687bc9 ("arm64: signal: Ensure si_code is valid for all fault signals")
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  arch/arm64/kernel/traps.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index ade32046f3fe..0feb17bdcaa0 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -282,6 +282,11 @@ void arm64_notify_die(const char *str, struct pt_regs *regs,
>  		current->thread.fault_address = 0;
>  		current->thread.fault_code = err;
>  
> +		if (signo == SIGKILL) {
> +			arm64_show_signal(signo, str);
> +			force_sig(signo, current);
> +			return;
> +		}
>  		arm64_force_sig_fault(signo, sicode, addr, str);
>  	} else {
>  		die(str, regs, err);
> -- 
> 2.21.0
>
Eric W. Biederman May 23, 2019, 2:53 p.m.
Dave Martin <Dave.Martin@arm.com> writes:

> On Thu, May 23, 2019 at 01:38:53AM +0100, Eric W. Biederman wrote:
>> It really only matters to debuggers but the SIGKILL does not have any
>> si_codes that use the fault member of the siginfo union.  Correct this
>> the simple way and call force_sig instead of force_sig_fault when the
>> signal is SIGKILL.
>
> I haven't fully understood the context for this, but why does it matter
> what's in siginfo for SIGKILL?  My understanding is that userspace
> (including ptrace) never gets to see it anyway for the SIGKILL case.

Yes.  In practice I think it would take tracing or something very
exotic to notice anything going wrong because the task will be killed.

> Here it feels like SIGKILL is logically a synchronous, thread-targeted
> fault: we must ensure that no subsequent insn in current executes (just
> like other fault signal).  In this case, I thought we fall back to
> SIGKILL not because there is no fault, but because we failed to
> properly diagnose or report the type of fault that occurred.
>
> So maybe handling it consistently with other faults signals makes
> sense.  The fact that delivery of this signal destroys the process
> before anyone can look at the resulting siginfo feels like a
> side-effect rather than something obviously wrong.
>
> The siginfo is potentially useful diagnostic information, that we could
> subsequently provide a means to access post-mortem.
>
> I just dived in on this single patch, so I may be missing something more
> fundamental, or just being pedantic...

Not really.  I was working on another cleanup and this usage of SIGKILL
came up.

A synchronous thread synchronous fault gets us as far as the forc_sig
family of functions.  That only leaves the question of which union
member in struct siginfo we are using.  The union members are _kill,
_fault, _timer, _rt, _sigchld, _sigfault, _sigpoll, and _sigsys.

As it has prove quite error prone for people to fill out struct siginfo
in the past by hand, I have provided a couple of helper functions for
the common cases that come up such as: force_sig_fault,
force_sig_mceerr, force_sig_bnderr, force_sig_pkuerr.  Each of those
helper functions takes the information needed to fill out the union
member of struct siginfo that kind of fault corresponds to.

For the SIGKILL case the only si_code I see being passed SI_KERNEL.
The SI_KERNEL si_code corresponds to the _kill union member while
force_sig_fault fills in fields for the _fault union member.

Because of the mismatch of which union member SIGKILL should be using
and the union member force_sig_fault applies alarm bells ring in my head
when I read the current arm64 kernel code.  Somewhat doubly so because
the other fields in passed to force_sig_fault appear to be somewhat
random when SIGKILL is the signal.

So I figured let's preserve the usage of SIGKILL as a synchronous
exception.  That seems legitimate and other folks do that as well but
let's use force_sig instead of force_sig_fault instead.  I don't know if
userspace will notice but at the very least we won't be providing a bad
example for other kernel code to follow and we won't wind up be making
assumptions that are true today and false tomorrow when some
implementation detail changes.

For imformation on what signals and si_codes correspond to which
union members you can look at siginfo_layout.  That function
is the keeper of the magic decoder key.  Currently the only two
si_codes defined for SIGKILL are SI_KERNEL and SI_USER both of which
correspond to a _kill union member.

Eric


> Cheers
> ---Dave
>
>> Cc: stable@vger.kernel.org
>> Cc: Dave Martin <Dave.Martin@arm.com>
>> Cc: James Morse <james.morse@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Fixes: af40ff687bc9 ("arm64: signal: Ensure si_code is valid for all fault signals")
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  arch/arm64/kernel/traps.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>> 
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index ade32046f3fe..0feb17bdcaa0 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -282,6 +282,11 @@ void arm64_notify_die(const char *str, struct pt_regs *regs,
>>  		current->thread.fault_address = 0;
>>  		current->thread.fault_code = err;
>>  
>> +		if (signo == SIGKILL) {
>> +			arm64_show_signal(signo, str);
>> +			force_sig(signo, current);
>> +			return;
>> +		}
>>  		arm64_force_sig_fault(signo, sicode, addr, str);
>>  	} else {
>>  		die(str, regs, err);
>> -- 
>> 2.21.0
>>
Eric W. Biederman May 23, 2019, 2:59 p.m.
Will Deacon <will.deacon@arm.com> writes:

> On Wed, May 22, 2019 at 07:38:53PM -0500, Eric W. Biederman wrote:
>> It really only matters to debuggers but the SIGKILL does not have any
>> si_codes that use the fault member of the siginfo union.  Correct this
>> the simple way and call force_sig instead of force_sig_fault when the
>> signal is SIGKILL.
>> 
>> Cc: stable@vger.kernel.org
>> Cc: Dave Martin <Dave.Martin@arm.com>
>> Cc: James Morse <james.morse@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Fixes: af40ff687bc9 ("arm64: signal: Ensure si_code is valid for all fault signals")
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  arch/arm64/kernel/traps.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>> 
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index ade32046f3fe..0feb17bdcaa0 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -282,6 +282,11 @@ void arm64_notify_die(const char *str, struct pt_regs *regs,
>>  		current->thread.fault_address = 0;
>>  		current->thread.fault_code = err;
>>  
>> +		if (signo == SIGKILL) {
>> +			arm64_show_signal(signo, str);
>> +			force_sig(signo, current);
>> +			return;
>> +		}
>
> I know it's a bit of a misnomer, but I'd rather do this check inside
> arm64_force_sig_fault, since I think we have other callers (e.g.
> do_bad_area()) which also blindly pass in SIGKILL here.

Sigh.  You are right.

I thought I had checked for that when I made my change there.  But
do_bad_area will definitely do that, and that was one of the cases that
jumped out at me as needing to be fixed, when I skimmed the arm code.

I will respin this patch to move that lower.

> We could rename the thing if necessary.

I would not mind but as long as we aren't misusing the generic bits
I won't have alarm bells going of in my head when I look at their
users.

Eric
Dave P Martin May 23, 2019, 4:12 p.m.
On Thu, May 23, 2019 at 03:53:06PM +0100, Eric W. Biederman wrote:
> Dave Martin <Dave.Martin@arm.com> writes:
>
> > On Thu, May 23, 2019 at 01:38:53AM +0100, Eric W. Biederman wrote:
> >> It really only matters to debuggers but the SIGKILL does not have any
> >> si_codes that use the fault member of the siginfo union.  Correct this
> >> the simple way and call force_sig instead of force_sig_fault when the
> >> signal is SIGKILL.
> >
> > I haven't fully understood the context for this, but why does it matter
> > what's in siginfo for SIGKILL?  My understanding is that userspace
> > (including ptrace) never gets to see it anyway for the SIGKILL case.
>
> Yes.  In practice I think it would take tracing or something very
> exotic to notice anything going wrong because the task will be killed.
>
> > Here it feels like SIGKILL is logically a synchronous, thread-targeted
> > fault: we must ensure that no subsequent insn in current executes (just
> > like other fault signal).  In this case, I thought we fall back to
> > SIGKILL not because there is no fault, but because we failed to
> > properly diagnose or report the type of fault that occurred.
> >
> > So maybe handling it consistently with other faults signals makes
> > sense.  The fact that delivery of this signal destroys the process
> > before anyone can look at the resulting siginfo feels like a
> > side-effect rather than something obviously wrong.
> >
> > The siginfo is potentially useful diagnostic information, that we could
> > subsequently provide a means to access post-mortem.
> >
> > I just dived in on this single patch, so I may be missing something more
> > fundamental, or just being pedantic...
>
> Not really.  I was working on another cleanup and this usage of SIGKILL
> came up.
>
> A synchronous thread synchronous fault gets us as far as the forc_sig
> family of functions.  That only leaves the question of which union
> member in struct siginfo we are using.  The union members are _kill,
> _fault, _timer, _rt, _sigchld, _sigfault, _sigpoll, and _sigsys.
>
> As it has prove quite error prone for people to fill out struct siginfo
> in the past by hand, I have provided a couple of helper functions for
> the common cases that come up such as: force_sig_fault,
> force_sig_mceerr, force_sig_bnderr, force_sig_pkuerr.  Each of those
> helper functions takes the information needed to fill out the union
> member of struct siginfo that kind of fault corresponds to.
>
> For the SIGKILL case the only si_code I see being passed SI_KERNEL.
> The SI_KERNEL si_code corresponds to the _kill union member while
> force_sig_fault fills in fields for the _fault union member.
>
> Because of the mismatch of which union member SIGKILL should be using
> and the union member force_sig_fault applies alarm bells ring in my head
> when I read the current arm64 kernel code.  Somewhat doubly so because
> the other fields in passed to force_sig_fault appear to be somewhat
> random when SIGKILL is the signal.
>
> So I figured let's preserve the usage of SIGKILL as a synchronous
> exception.  That seems legitimate and other folks do that as well but
> let's use force_sig instead of force_sig_fault instead.  I don't know if
> userspace will notice but at the very least we won't be providing a bad
> example for other kernel code to follow and we won't wind up be making
> assumptions that are true today and false tomorrow when some
> implementation detail changes.
>
> For imformation on what signals and si_codes correspond to which
> union members you can look at siginfo_layout.  That function
> is the keeper of the magic decoder key.  Currently the only two
> si_codes defined for SIGKILL are SI_KERNEL and SI_USER both of which
> correspond to a _kill union member.

I see.  Assuming we cannot have a dummy internal si_code for this
special case (probably a bad idea), I think Will's suggestion of at
least pushing the special case handling down into
arm64_force_sig_fault() is probably a bit cleaner here, expecially
if other callers of that function may pass in SIGKILL (I haven't
looked though).

Cheers
---Dave
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Eric W. Biederman May 23, 2019, 9 p.m.
Dave P Martin <Dave.Martin@arm.com> writes:

> On Thu, May 23, 2019 at 03:53:06PM +0100, Eric W. Biederman wrote:
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> > On Thu, May 23, 2019 at 01:38:53AM +0100, Eric W. Biederman wrote:
>> >> It really only matters to debuggers but the SIGKILL does not have any
>> >> si_codes that use the fault member of the siginfo union.  Correct this
>> >> the simple way and call force_sig instead of force_sig_fault when the
>> >> signal is SIGKILL.
>> >
>> > I haven't fully understood the context for this, but why does it matter
>> > what's in siginfo for SIGKILL?  My understanding is that userspace
>> > (including ptrace) never gets to see it anyway for the SIGKILL case.
>>
>> Yes.  In practice I think it would take tracing or something very
>> exotic to notice anything going wrong because the task will be killed.
>>
>> > Here it feels like SIGKILL is logically a synchronous, thread-targeted
>> > fault: we must ensure that no subsequent insn in current executes (just
>> > like other fault signal).  In this case, I thought we fall back to
>> > SIGKILL not because there is no fault, but because we failed to
>> > properly diagnose or report the type of fault that occurred.
>> >
>> > So maybe handling it consistently with other faults signals makes
>> > sense.  The fact that delivery of this signal destroys the process
>> > before anyone can look at the resulting siginfo feels like a
>> > side-effect rather than something obviously wrong.
>> >
>> > The siginfo is potentially useful diagnostic information, that we could
>> > subsequently provide a means to access post-mortem.
>> >
>> > I just dived in on this single patch, so I may be missing something more
>> > fundamental, or just being pedantic...
>>
>> Not really.  I was working on another cleanup and this usage of SIGKILL
>> came up.
>>
>> A synchronous thread synchronous fault gets us as far as the forc_sig
>> family of functions.  That only leaves the question of which union
>> member in struct siginfo we are using.  The union members are _kill,
>> _fault, _timer, _rt, _sigchld, _sigfault, _sigpoll, and _sigsys.
>>
>> As it has prove quite error prone for people to fill out struct siginfo
>> in the past by hand, I have provided a couple of helper functions for
>> the common cases that come up such as: force_sig_fault,
>> force_sig_mceerr, force_sig_bnderr, force_sig_pkuerr.  Each of those
>> helper functions takes the information needed to fill out the union
>> member of struct siginfo that kind of fault corresponds to.
>>
>> For the SIGKILL case the only si_code I see being passed SI_KERNEL.
>> The SI_KERNEL si_code corresponds to the _kill union member while
>> force_sig_fault fills in fields for the _fault union member.
>>
>> Because of the mismatch of which union member SIGKILL should be using
>> and the union member force_sig_fault applies alarm bells ring in my head
>> when I read the current arm64 kernel code.  Somewhat doubly so because
>> the other fields in passed to force_sig_fault appear to be somewhat
>> random when SIGKILL is the signal.
>>
>> So I figured let's preserve the usage of SIGKILL as a synchronous
>> exception.  That seems legitimate and other folks do that as well but
>> let's use force_sig instead of force_sig_fault instead.  I don't know if
>> userspace will notice but at the very least we won't be providing a bad
>> example for other kernel code to follow and we won't wind up be making
>> assumptions that are true today and false tomorrow when some
>> implementation detail changes.
>>
>> For imformation on what signals and si_codes correspond to which
>> union members you can look at siginfo_layout.  That function
>> is the keeper of the magic decoder key.  Currently the only two
>> si_codes defined for SIGKILL are SI_KERNEL and SI_USER both of which
>> correspond to a _kill union member.
>
> I see.  Assuming we cannot have a dummy internal si_code for this
> special case (probably a bad idea), I think Will's suggestion of at
> least pushing the special case handling down into
> arm64_force_sig_fault() is probably a bit cleaner here, expecially
> if other callers of that function may pass in SIGKILL (I haven't
> looked though).

Done in my v2 version of this patch.

Eric