[rh7,2/2] ixgbe: don't check firmware errors

Submitted by Konstantin Khorenko on Jan. 10, 2020, 3:31 p.m.

Details

Message ID 20200110153123.27448-2-khorenko@virtuozzo.com
State New
Series "Series without cover letter"
Headers show

Commit Message

Konstantin Khorenko Jan. 10, 2020, 3:31 p.m.
There is a check in new ixgbe version driver in RHEL7.7
which verify the firmware error register value for correctness.

Apparently seems the check is not fully correct itself
causing kernel to spoil logs (every second).

The driver in RHEL7.6 does not have that check.
The driver from vendor site does not have that check.
https://downloadmirror.intel.com/14687/eng/ixgbe-5.6.5.tar.gz

So let's drop the check as well in Virtuozzo kernels.

https://bugs.centos.org/view.php?id=16495
https://forum.proxmox.com/threads/pve-6-0-7-ixgbe-firmware-errors.58592/
https://jira.sw.ru/browse/PSBM-100722

Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 9 ---------
 1 file changed, 9 deletions(-)

Patch hide | download patch | download mbox

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index eef479f762f93..b39a1b477b246 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7913,15 +7913,6 @@  static void ixgbe_reset_subtask(struct ixgbe_adapter *adapter)
 static bool ixgbe_check_fw_error(struct ixgbe_adapter *adapter)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
-	u32 fwsm;
-
-	/* read fwsm.ext_err_ind register and log errors */
-	fwsm = IXGBE_READ_REG(hw, IXGBE_FWSM(hw));
-
-	if (fwsm & IXGBE_FWSM_EXT_ERR_IND_MASK ||
-	    !(fwsm & IXGBE_FWSM_FW_VAL_BIT))
-		e_dev_warn("Warning firmware error detected FWSM: 0x%08X\n",
-			   fwsm);
 
 	if (hw->mac.ops.fw_recovery_mode && hw->mac.ops.fw_recovery_mode(hw)) {
 		e_dev_err("Firmware recovery mode detected. Limiting functionality. Refer to the Intel(R) Ethernet Adapters and Devices User Guide for details on firmware recovery mode.\n");

Comments

Kirill Tkhai Jan. 10, 2020, 3:45 p.m.
On 10.01.2020 18:31, Konstantin Khorenko wrote:
> There is a check in new ixgbe version driver in RHEL7.7
> which verify the firmware error register value for correctness.
> 
> Apparently seems the check is not fully correct itself
> causing kernel to spoil logs (every second).
> 
> The driver in RHEL7.6 does not have that check.
> The driver from vendor site does not have that check.
> https://downloadmirror.intel.com/14687/eng/ixgbe-5.6.5.tar.gz
> 
> So let's drop the check as well in Virtuozzo kernels.
> 
> https://bugs.centos.org/view.php?id=16495
> https://forum.proxmox.com/threads/pve-6-0-7-ixgbe-firmware-errors.58592/
> https://jira.sw.ru/browse/PSBM-100722
> 
> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index eef479f762f93..b39a1b477b246 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -7913,15 +7913,6 @@ static void ixgbe_reset_subtask(struct ixgbe_adapter *adapter)
>  static bool ixgbe_check_fw_error(struct ixgbe_adapter *adapter)
>  {
>  	struct ixgbe_hw *hw = &adapter->hw;
> -	u32 fwsm;
> -
> -	/* read fwsm.ext_err_ind register and log errors */
> -	fwsm = IXGBE_READ_REG(hw, IXGBE_FWSM(hw));
> -
> -	if (fwsm & IXGBE_FWSM_EXT_ERR_IND_MASK ||
> -	    !(fwsm & IXGBE_FWSM_FW_VAL_BIT))
> -		e_dev_warn("Warning firmware error detected FWSM: 0x%08X\n",
> -			   fwsm);

Maybe we warn here only once?

>  
>  	if (hw->mac.ops.fw_recovery_mode && hw->mac.ops.fw_recovery_mode(hw)) {
>  		e_dev_err("Firmware recovery mode detected. Limiting functionality. Refer to the Intel(R) Ethernet Adapters and Devices User Guide for details on firmware recovery mode.\n");
>
Konstantin Khorenko Jan. 10, 2020, 3:56 p.m.
On 01/10/2020 06:45 PM, Kirill Tkhai wrote:
> On 10.01.2020 18:31, Konstantin Khorenko wrote:
>> There is a check in new ixgbe version driver in RHEL7.7
>> which verify the firmware error register value for correctness.
>>
>> Apparently seems the check is not fully correct itself
>> causing kernel to spoil logs (every second).
>>
>> The driver in RHEL7.6 does not have that check.
>> The driver from vendor site does not have that check.
>> https://downloadmirror.intel.com/14687/eng/ixgbe-5.6.5.tar.gz
>>
>> So let's drop the check as well in Virtuozzo kernels.
>>
>> https://bugs.centos.org/view.php?id=16495
>> https://forum.proxmox.com/threads/pve-6-0-7-ixgbe-firmware-errors.58592/
>> https://jira.sw.ru/browse/PSBM-100722
>>
>> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
>> ---
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 9 ---------
>>  1 file changed, 9 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index eef479f762f93..b39a1b477b246 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -7913,15 +7913,6 @@ static void ixgbe_reset_subtask(struct ixgbe_adapter *adapter)
>>  static bool ixgbe_check_fw_error(struct ixgbe_adapter *adapter)
>>  {
>>  	struct ixgbe_hw *hw = &adapter->hw;
>> -	u32 fwsm;
>> -
>> -	/* read fwsm.ext_err_ind register and log errors */
>> -	fwsm = IXGBE_READ_REG(hw, IXGBE_FWSM(hw));
>> -
>> -	if (fwsm & IXGBE_FWSM_EXT_ERR_IND_MASK ||
>> -	    !(fwsm & IXGBE_FWSM_FW_VAL_BIT))
>> -		e_dev_warn("Warning firmware error detected FWSM: 0x%08X\n",
>> -			   fwsm);
>
> Maybe we warn here only once?

i thought about it but did not find any reason why do we need this warning (even once).
1) we know it is triggered both on our internal hardware and there are complains from others in the internet
2) there were no this check and warning previously
3) there is no such check in the vendor driver version
4) network works fine with drivers without this warning (both old and new)

i don't see how this warning can be useful for us,
moreother it will threaten user while they can do nothing with it.

So, i've chosen to just remove it completely.

>
>>
>>  	if (hw->mac.ops.fw_recovery_mode && hw->mac.ops.fw_recovery_mode(hw)) {
>>  		e_dev_err("Firmware recovery mode detected. Limiting functionality. Refer to the Intel(R) Ethernet Adapters and Devices User Guide for details on firmware recovery mode.\n");
>>
>
> .
>
Kirill Tkhai Jan. 10, 2020, 3:57 p.m.
On 10.01.2020 18:56, Konstantin Khorenko wrote:
> On 01/10/2020 06:45 PM, Kirill Tkhai wrote:
>> On 10.01.2020 18:31, Konstantin Khorenko wrote:
>>> There is a check in new ixgbe version driver in RHEL7.7
>>> which verify the firmware error register value for correctness.
>>>
>>> Apparently seems the check is not fully correct itself
>>> causing kernel to spoil logs (every second).
>>>
>>> The driver in RHEL7.6 does not have that check.
>>> The driver from vendor site does not have that check.
>>> https://downloadmirror.intel.com/14687/eng/ixgbe-5.6.5.tar.gz
>>>
>>> So let's drop the check as well in Virtuozzo kernels.
>>>
>>> https://bugs.centos.org/view.php?id=16495
>>> https://forum.proxmox.com/threads/pve-6-0-7-ixgbe-firmware-errors.58592/
>>> https://jira.sw.ru/browse/PSBM-100722
>>>
>>> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
>>> ---
>>>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 9 ---------
>>>  1 file changed, 9 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index eef479f762f93..b39a1b477b246 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -7913,15 +7913,6 @@ static void ixgbe_reset_subtask(struct ixgbe_adapter *adapter)
>>>  static bool ixgbe_check_fw_error(struct ixgbe_adapter *adapter)
>>>  {
>>>  	struct ixgbe_hw *hw = &adapter->hw;
>>> -	u32 fwsm;
>>> -
>>> -	/* read fwsm.ext_err_ind register and log errors */
>>> -	fwsm = IXGBE_READ_REG(hw, IXGBE_FWSM(hw));
>>> -
>>> -	if (fwsm & IXGBE_FWSM_EXT_ERR_IND_MASK ||
>>> -	    !(fwsm & IXGBE_FWSM_FW_VAL_BIT))
>>> -		e_dev_warn("Warning firmware error detected FWSM: 0x%08X\n",
>>> -			   fwsm);
>>
>> Maybe we warn here only once?
> 
> i thought about it but did not find any reason why do we need this warning (even once).
> 1) we know it is triggered both on our internal hardware and there are complains from others in the internet
> 2) there were no this check and warning previously
> 3) there is no such check in the vendor driver version
> 4) network works fine with drivers without this warning (both old and new)
> 
> i don't see how this warning can be useful for us,
> moreother it will threaten user while they can do nothing with it.
> 
> So, i've chosen to just remove it completely.

Sounds reasonable.

Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>