
CXL IP Debug Toolkit

RicardoC
Beginner

Hello,

 

An Intel development kit DK-DEV-AGI027R1BES with the CXL Type 3 Example Design image causes an AMD Siena architecture system to reboot when a single write is issued. The information extracted by the Debug Toolkit seems to point to failures, but the documentation does not describe these registers. The most notable entries are:

 

Local Retry State Machine,0x8c00
Num Local CRC Detected,0x2
Local FSM State Status,0x3
Viral Log,0x4
Link Received Viral,0x1
BBS Idle Status,0x0
BBS Error Status,0x1
BBS CXL Status Register Slice0,0xc0000000
BBS Error Status Register,0x12
Device Protocol Table Error,0x1
M2S Viral Received,0x1
BBS Error Status First Register,0x10
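To make entries like these easier to scan, the dump can be parsed and each value broken down into its set bit positions. This is only a minimal sketch: it assumes the dump is plain text with one `name,0xVALUE` pair per line, as shown above, and the helper names `parse_dump` and `set_bits` are hypothetical. It does not assign any meaning to the bits, since that is exactly what the documentation is missing.

```python
def parse_dump(text):
    """Parse 'name,0xVALUE' lines into a dict of name -> int (assumed format)."""
    regs = {}
    for line in text.splitlines():
        # Split on the last comma so that a name containing commas still works.
        name, sep, value = line.strip().rpartition(",")
        if not sep:
            continue  # blank line or no comma: skip
        try:
            regs[name.strip()] = int(value.strip(), 16)
        except ValueError:
            continue  # value is not hexadecimal: skip
    return regs


def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


# A few of the entries quoted in the post.
DUMP = """\
Local Retry State Machine,0x8c00
BBS CXL Status Register Slice0,0xc0000000
BBS Error Status Register,0x12
BBS Idle Status,0x0
"""

regs = parse_dump(DUMP)
for name, value in regs.items():
    print(f"{name:32s} {value:#012x} bits={set_bits(value)}")
```

With this, `BBS Error Status Register,0x12` is reported as bits 1 and 4 set, which is the level of detail needed to look the fields up once a register description is available.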

 

The counters show some interesting results. Even though the application requested only a single-byte write (RwD), a Req was also issued. Apparently only the Req was answered with a DRS, whereas the RwD did not trigger an NDR:

 

Counter of M2SReq Operations,0x1
M2SReq Counter,0x1
Counter of M2SRwD Operations,0x1
M2SRwD Counter,0x1
Counter of S2MDRS Operations,0x1
S2MDRS Counter,0x1
Counter of S2MNDR Operations,0x0
S2MNDR Counter,0x0
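The mismatch can be stated as a simple balance check. The sketch below is a simplification and rests on an assumption: that every M2S Req counted here is a read answered by an S2M DRS, and that every M2S RwD is answered by an S2M NDR. Some CXL.mem Req opcodes are answered with an NDR instead, so treat the result as a hint rather than a verdict. The function name `unanswered` and the result keys are hypothetical.

```python
def unanswered(counters):
    """Count requests with no matching response, under the stated assumption."""
    req = counters["M2SReq Counter"]
    rwd = counters["M2SRwD Counter"]
    drs = counters["S2MDRS Counter"]
    ndr = counters["S2MNDR Counter"]
    return {
        # Reads (Req) are assumed to complete with a data response (DRS).
        "reads_without_drs": max(req - drs, 0),
        # Writes (RwD) are assumed to complete with a no-data response (NDR).
        "writes_without_ndr": max(rwd - ndr, 0),
    }


# Counter values quoted in the post.
COUNTERS = {
    "M2SReq Counter": 0x1,
    "M2SRwD Counter": 0x1,
    "S2MDRS Counter": 0x1,
    "S2MNDR Counter": 0x0,
}

result = unanswered(COUNTERS)
print(result)
```

For the values above, the check reports one write without an NDR and no read without a DRS.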

 

Is there more information available on the meaning of the registers for the CXL IP Debug Toolkit?

 

Thank you,

 

Ricardo

 

PS: The complete dump of registers from the Debug Toolkit can be found attached.

 

 

 

WZ2
Employee

Hi there,

Can you please specify the steps involved in the operation? I'm not quite clear on what 'a single write is issued' means. Please provide a complete overview of the usage scenario and the sequence of steps.

At the same time, I'm unsure if I understand correctly. After performing an operation, did the system reboot, and upon recovery, did you use the debug toolkit to capture some information?

Best regards,

WZ


RicardoC
Beginner

Hi WZ,

The debugging evolved into two phases.

The first, and original, situation was the failure to boot the OS on the AMD system. With the Intel CXL Type 3 Example Design image in the DK-DEV-AGI027R1BES development kit plugged into the system, the handoff from firmware (EFI) boot to the OS causes the system to reboot itself. It happens at a very early stage of the boot, so no logs are created by the OS. By forcing the CXL memory to be treated as Special Purpose Memory in the BIOS settings, the OS is able to boot, but the device is not seen as a CXL device, which leads to phase 2 of the investigation.

In the second situation, as stated earlier, the OS is able to boot, but the card does not appear as a CXL device to the OS. daxctl is invoked to make the device usable by the OS as memory. A simple C program was written to write a single byte to the device. However, as soon as the write operation is executed, the system reboots itself.
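The C program itself is not included in the thread. For illustration only, the same access pattern can be sketched in Python with the standard `mmap` module. Everything specific here is an assumption: that the device was left in devdax mode and shows up as a character device such as `/dev/dax0.0` (a hypothetical path), and that a 2 MiB mapping length satisfies the device's alignment. The function name `write_one_byte` is also hypothetical.

```python
import mmap
import os


def write_one_byte(path, offset=0, value=0xA5, map_len=2 * 1024 * 1024):
    """Map `path` shared and read/write, store one byte, and read it back.

    map_len defaults to 2 MiB on the assumption that the dax device
    requires mappings aligned to that size.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        with mmap.mmap(fd, map_len,
                       flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as m:
            m[offset] = value   # the single-byte store to the mapping
            return m[offset]    # read back through the same mapping
    finally:
        os.close(fd)


# Example call (hypothetical device path, needs sufficient privileges):
#   write_one_byte("/dev/dax0.0")
```

On the failing system, the store in `m[offset] = value` would correspond to the point at which the reboot is observed.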

The file attached to the first post of the thread shows the status of the CXL IP registers, extracted with the Debug Toolkit after the write was executed and the system rebooted.

Could you provide details on the meaning of the Debug Toolkit registers?

Thank you,

Ricardo

WZ2
Employee

Hi Ricardo,

I apologize for the delay in responding due to health reasons; I am truly sorry for any inconvenience caused.

  1. Regarding the registers, I understand that you wish to understand the values of the CXL IP internal registers for debugging. I can provide you with a register map for your reference. However, this document requires certain permissions. If you are unable to access it, please contact the Intel personnel who sold the board to you to obtain the necessary permissions:

https://www.intel.com/content/www/us/en/secure/content-details/777521/r-tile-intel-fpga-ip-for-compute-express-link-cxl-register-maps.html

  2. I am particularly interested in the first operation, where the system cannot recognize the CXL device. For Type 3 operations, are you using AVST to load the pof? If you only use the sof, the system may not recognize the device, so please ensure that the CXL image is stored in the flash.

Best regards,

WZ


RicardoC
Beginner

Hi WZ,

I hope you are recovering well.

The document that you pointed to lists the CXL register addresses. It does not provide any information on the debug registers that the Debug Toolkit outputs.

We are using the original pof image shipped by Intel.

Regards,

Ricardo

WZ2
Employee

Hi there,

Thank you for your concern. For some information on registers in the debug toolkit, you can refer to sections 5.13-5.14 in this document, as well as the official documentation for CXL.


https://www.intel.com/content/www/us/en/secure/content-details/795866/intel-agilex-7-r-tile-compute-express-link-cxl-1-1-2-0-fpga-ip-user-guide.html?DocID=795866


I am not questioning the source of your POF; rather, I am interested in the methodology you are using. Typically, we have not encountered situations where devices are not recognized by the system, and some of our customers also use AMD CPUs.

Best regards,

WZ

