Intel 82599ES 10-Gigabit Adapter Causing System Panic (SLES11SP4)

KWitt4

Greetings Intel Community,

I have several systems that panic for reasons we have not yet identified. A brief overview of the configuration:

OS: SLES 11 SP 4

Kernel: 3.0.101-68-default

ixgbe driver (inbox): 3.19.1-k

Adapter: Each server has two dual-port X520-SR2 networking adapters installed (4x10G ports total, 2 physical adapters)

Configuration: all 4 interfaces are under bond0 in mode=802.3ad (LACP), connected to a managed switch with a matching configuration

Note #1: We've also seen the crashes on SLES 11 SP3 and on RHEL systems, with different kernels and different ixgbe driver versions. The configuration above is just the one currently experiencing the issue.

Note #2: Load and traffic type don't seem to matter. The system can crash with or without load; one crash happened with nothing but an SSH session open.

Based on the crash dump information we have, it seems that the adapters are running into "IOH timeouts", and we also see "CATERRs":

0x000A:IOH: D3_IOBAS_IOLIM_SSTS @ 0x00000181C = 0x0000000040006060

0x000A:IOH: D3_IOBAS_IOLIM_SSTS:SIGSYSTEMERROR <30> = 0x1

0x000A:IOH: D3_IOBAS_IOLIM_SSTS:IOBASEADDRLIMIT <15:12> = 0x6

0x000A:IOH: D3_IOBAS_IOLIM_SSTS:IOBASEADDR <7:4> = 0x6
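
The decoded fields in this dump can be reproduced from the raw register value, which is a quick way to sanity-check other registers from the same dump. Below is a minimal Python sketch. The helper names `extract_bits` and `decode` are made up for illustration, and the bit positions are taken from the `<hi:lo>` annotations printed in the dump itself, not from a chipset datasheet.

```python
# Raw register value copied from the D3_IOBAS_IOLIM_SSTS line in the dump.
RAW = 0x0000000040006060

# Field name -> (high bit, low bit), as given by the dump's <hi:lo> notation.
# Assumption: a single-bit field such as <30> means hi == lo == 30.
FIELDS = {
    "SIGSYSTEMERROR": (30, 30),
    "IOBASEADDRLIMIT": (15, 12),
    "IOBASEADDR": (7, 4),
}


def extract_bits(value, hi, lo):
    """Return bits hi..lo (inclusive) of value, shifted down to bit 0."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def decode(value):
    """Decode every known field of the register into a name -> value dict."""
    return {name: extract_bits(value, hi, lo) for name, (hi, lo) in FIELDS.items()}


if __name__ == "__main__":
    for name, field in decode(RAW).items():
        print(f"D3_IOBAS_IOLIM_SSTS:{name} = {field:#x}")
```

Run against the value above, this gives 0x1, 0x6 and 0x6 for the three fields, matching the decoded lines in the dump.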

Also...

0x0024 r002i23b02 IOH on IP93-5 sn RPM031 10 Stuck RH Tracker Entry - IOH (6 total on this node)

0x003E r002i23b15 IP93-5 sn RRB997 9 RH H0 detected TRB Timeout - Destination Node

0x0004 r002i01b02 SKT0 on IP93-5 sn RPX686 8 RH H0 detected TRB Timeout - Requester

0x0000 r002i01b00 IP93-5 sn RPJ019 7 RH H0 detected TRB Timeout - Destination Node

What I'm hoping to find is a way to further debug what the driver is doing at the time of a system crash. Is there a way to build the ixgbe driver with debugging options that would generate information useful for understanding the problem?

I'm also wondering if anyone has suggestions for kernel parameters that might help with this type of issue.

Thanks for any input.

st4

Hi KarlW,

We need to check further on this.

Regards,

wb
