[Devel] kvm: x86: hyperv: avoid livelock in oneshot SynIC timers

Submitted by Roman Kagan on July 19, 2017, 5:31 p.m.

Details

Message ID 20170719173141.12082-1-rkagan@virtuozzo.com
State New
Series "kvm: x86: hyperv: avoid livelock in oneshot SynIC timers"
Headers show

Commit Message

Roman Kagan July 19, 2017, 5:31 p.m.
If the SynIC timer message delivery fails due to SINT message slot being
busy, there's no point to attempt starting the timer again until we're
notified of the slot being released by the guest (via EOM or EOI).

Even worse, when a oneshot timer fails to deliver its message, its
re-arming with an expiration time in the past leads to immediate retry
of the delivery, without ever letting the guest vcpu to run and release
the slot, which results in a livelock.

To avoid that, only start the timer when there's no timer message
pending delivery.  When the guest releases the slot, the processing is
resumed so the timer will be started then.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
 arch/x86/kvm/hyperv.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Patch hide | download patch | download mbox

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index c8efdce3e702..ab9501c2f32c 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -632,9 +632,10 @@  void kvm_hv_process_stimers(struct kvm_vcpu *vcpu)
 				}
 
 				if ((stimer->config & HV_STIMER_ENABLE) &&
-				    stimer->count)
-					stimer_start(stimer);
-				else
+				    stimer->count) {
+					if (!stimer->msg_pending)
+						stimer_start(stimer);
+				} else
 					stimer_cleanup(stimer);
 			}
 		}

Comments

Konstantin Khorenko July 20, 2017, 8:38 a.m.
Need it in update 5?
Any bug id?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 07/19/2017 08:31 PM, Roman Kagan wrote:
> If the SynIC timer message delivery fails due to SINT message slot being
> busy, there's no point to attempt starting the timer again until we're
> notified of the slot being released by the guest (via EOM or EOI).
>
> Even worse, when a oneshot timer fails to deliver its message, its
> re-arming with an expiration time in the past leads to immediate retry
> of the delivery, without ever letting the guest vcpu to run and release
> the slot, which results in a livelock.
>
> To avoid that, only start the timer when there's no timer message
> pending delivery.  When the guest releases the slot, the processing is
> resumed so the timer will be started then.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---
>  arch/x86/kvm/hyperv.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index c8efdce3e702..ab9501c2f32c 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -632,9 +632,10 @@ void kvm_hv_process_stimers(struct kvm_vcpu *vcpu)
>  				}
>
>  				if ((stimer->config & HV_STIMER_ENABLE) &&
> -				    stimer->count)
> -					stimer_start(stimer);
> -				else
> +				    stimer->count) {
> +					if (!stimer->msg_pending)
> +						stimer_start(stimer);
> +				} else
>  					stimer_cleanup(stimer);
>  			}
>  		}
>
Roman Kagan July 20, 2017, 10:01 a.m.
On Thu, Jul 20, 2017 at 11:38:19AM +0300, Konstantin Khorenko wrote:
> Need it in update 5?

That's up to Den to decide.  The livelock is only triggerable with
one-shot SynIC timers; none of our regular guests uses them (Windows
uses periodic timers, Linux does use one-shot timers but needs to be
started with hyperv on and kvm off which doesn't happen in normal use).

> Any bug id?

The problem was noticed in a special scenario on my dev machine with the
mainstream kernel so I didn't bother to file a bug.

I'm about to post the patch to the KVM list so you'll have the master
commit id once (if) it's merged.

Roman.

> 
> --
> Best regards,
> 
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
> 
> On 07/19/2017 08:31 PM, Roman Kagan wrote:
> > If the SynIC timer message delivery fails due to SINT message slot being
> > busy, there's no point to attempt starting the timer again until we're
> > notified of the slot being released by the guest (via EOM or EOI).
> > 
> > Even worse, when a oneshot timer fails to deliver its message, its
> > re-arming with an expiration time in the past leads to immediate retry
> > of the delivery, without ever letting the guest vcpu to run and release
> > the slot, which results in a livelock.
> > 
> > To avoid that, only start the timer when there's no timer message
> > pending delivery.  When the guest releases the slot, the processing is
> > resumed so the timer will be started then.
> > 
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > ---
> >  arch/x86/kvm/hyperv.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> > index c8efdce3e702..ab9501c2f32c 100644
> > --- a/arch/x86/kvm/hyperv.c
> > +++ b/arch/x86/kvm/hyperv.c
> > @@ -632,9 +632,10 @@ void kvm_hv_process_stimers(struct kvm_vcpu *vcpu)
> >  				}
> > 
> >  				if ((stimer->config & HV_STIMER_ENABLE) &&
> > -				    stimer->count)
> > -					stimer_start(stimer);
> > -				else
> > +				    stimer->count) {
> > +					if (!stimer->msg_pending)
> > +						stimer_start(stimer);
> > +				} else
> >  					stimer_cleanup(stimer);
> >  			}
> >  		}
> >