
Dual Xeon 2696v4 25% performance drop, help me find the issue please?

RDerr
Beginner

Hi

I have 2 render farm nodes called "Rush-In" and "Rush-Out" that I use commercially to produce animation and stills. No overclocking has been attempted, and the two nodes have exactly the same hardware and software. The motherboards are Asus Z10PH-D16.

Recently I noticed that one node, "Rush-In", is rendering frames about 25% slower than my "Rush-Out" node, which suggests to me that a thermal or hardware issue is reducing its performance.
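"About 25% slower" can mean two different things: each frame takes 25% longer, or the node completes 25% fewer frames per hour. A minimal sketch for putting a number on the gap is below. The frame times are hypothetical placeholders, not measurements from either node, and the function name `slowdown` is made up for illustration.

```python
from statistics import mean

def slowdown(reference_times, suspect_times):
    """Compare mean per-frame render times of two nodes.

    Returns (extra_time, throughput_loss) as fractions:
    extra_time      = how much longer a frame takes on the suspect node
    throughput_loss = how many fewer frames it finishes per unit time
    """
    ref = mean(reference_times)
    sus = mean(suspect_times)
    extra_time = sus / ref - 1.0
    throughput_loss = 1.0 - ref / sus
    return extra_time, throughput_loss

# Hypothetical per-frame render times in seconds (same frames on both nodes).
rush_out = [120.0, 118.0, 122.0]
rush_in = [150.0, 149.0, 151.0]

extra, loss = slowdown(rush_out, rush_in)
print(f"{extra:.1%} longer per frame, {loss:.1%} fewer frames per hour")
```

Rendering the same few frames on both nodes and feeding the times in makes it easier to tell whether a later change (BIOS setting, swapped part) actually closed the gap.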

These machines sit in my garage, which is cool. They have water cooling and additional fans to move air around, so I'm struggling to see how I could improve the cooling. The CPU temps also look fair: under 100% load they sit between 60 and 69 °C.

Alarmingly, Rush-Out shows all the correct graphical bars inside XTU, but Rush-In shows only CPU Utilization. On Rush-In I am using CPU-Z and Corsair to provide the core speed and the CPU temps, so something is wrong with Rush-In.

Regarding Rush-In: at first you (like me) will probably want to say that something is wrong with XTU and that is why core frequency and package temps are not showing up. However, Rush-Out (identical hardware) shows these stats in XTU.

I'm 95% sure all the BIOS settings are the same as well. I had a quick look at them yesterday and could not see anything different.

In the following images, you will see what the healthy Rush-Out looks like without load and after 20 mins of 100% load.

Under load:

In the following images, you will see what unhealthy Rush-In looks like without load and after 20 mins of 100% load.

I just noticed that CPU-Z is showing 11 threads in the screen grabs. This is not the case: I just checked and it shows 44 after a refresh.

Under load:

My questions are:

1) What would cause Rush-In to not show the stats in XTU (while CPU-Z does show them), and could this be the cause of the performance drop?

2) How do I get Rush-In back to performing well?

3) Why is XTU on Rush-Out indicating that Power Limit Throttling is happening at 150W, and can I turn this off somewhere without damaging my baby?

4) XTU on Rush-Out says the active core count is 22, but these PCs have dual sockets (2 CPUs installed), so this should be 44, correct? Hyperthreaded, that is 88 threads. Can XTU only read one socket at a time?
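The arithmetic in question 4 can be written out as a short sketch. The three constants come from the figures in this post (2 sockets, 22 cores per socket, 2 threads per core with Hyper-Threading); the function name `expected_counts` is made up for illustration. The last line prints whatever the operating system reports on the machine where the script runs, which gives a second opinion independent of XTU or CPU-Z.

```python
import os

SOCKETS = 2             # dual-socket board, both CPUs installed
CORES_PER_SOCKET = 22   # the active core count XTU shows on Rush-Out
THREADS_PER_CORE = 2    # Hyper-Threading enabled

def expected_counts(sockets, cores_per_socket, threads_per_core):
    """Return (physical_cores, logical_processors) for the whole machine."""
    physical = sockets * cores_per_socket
    logical = physical * threads_per_core
    return physical, logical

physical, logical = expected_counts(SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE)
print(f"Expected: {physical} physical cores, {logical} logical processors")

# What the OS itself reports for this machine (may be None if undetermined).
print("OS reports:", os.cpu_count(), "logical processors")
```

If the OS figure matches the expected 88 on both nodes, a tool showing 22 cores or 44 threads is most likely describing a single socket rather than the whole system.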

Thank you in advance

Regards Rueben

4 Replies
idata
Employee

Hello RuebenAD,

Thank you for joining the Intel Community Support.

Determining the cause of the performance drop may require reviewing specific information about your setup. I would like to gather some basic information about it so I can do further research and attempt to provide a detailed answer.

Please scan both systems using the Intel® System Support Utility. Follow the steps below:

1. Download the Intel® System Support Utility and save the application to your system:
https://downloadcenter.intel.com/download/25293/Intel-System-Support-Utility-for-Windows-?product=91600

2. Open the application and click Scan to see system and device information. The Intel® System Support Utility defaults to the Summary View on the output screen following the scan. Click the menu where it says Summary to change to Detailed View.

3. To save your scan, click Next and click Save. You can save the file to any accessible location on your computer.

4. Attach the report to this thread.
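Since the two nodes are meant to be identical, it can also help to compare the two saved reports line by line rather than by eye. The following is a rough sketch, assuming both reports were saved as plain text files; the file names in the usage comment are hypothetical, and `report_differences` is an illustrative helper, not part of any Intel tool.

```python
import difflib
from pathlib import Path

def report_differences(path_a, path_b):
    """Return only the lines that differ between two text reports.

    Lines prefixed with '-' appear only in the first file,
    lines prefixed with '+' appear only in the second file.
    """
    a = Path(path_a).read_text(encoding="utf-8", errors="replace").splitlines()
    b = Path(path_b).read_text(encoding="utf-8", errors="replace").splitlines()
    diff = difflib.unified_diff(
        a, b, fromfile=str(path_a), tofile=str(path_b), lineterm="", n=0
    )
    # Skip the two file-header lines, then drop the '@@' hunk markers.
    body = list(diff)[2:]
    return [line for line in body if line.startswith(("+", "-"))]

# Usage (hypothetical file names):
# for line in report_differences("rush-out.txt", "rush-in.txt"):
#     print(line)
```

Expect some harmless differences such as host names, serial numbers, and timestamps. Lines that differ in firmware versions, driver versions, memory configuration, or clock speeds are the ones worth a closer look.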

 

 

Also, are you using software to make your systems work as a cluster, or are they working independently?

Wanner G.

RDerr
Beginner

Hi

Thank you for showing an interest!

Please find both reports attached.

Regarding your question about clusters: I don't think so. I use them as render slaves/nodes, but neither one also operates as the repository or manager, so, for example, "Rush-In" does not have additional processes running that would cause the slowdown in renders.

I think CPU-Z shows a clock speed drop on Rush-In, so I thought the cause must be heat or power. But Rush-Out is operating at a faster clock speed in the same temperature range... strange?

Regards Rueben

idata
Employee

Hello RuebenAD,

Thank you for your response.

Based on the information provided about your system configuration and the attached reports, you may be experiencing a hardware issue, considering that overheating does not appear to be the problem.

In one of your questions you stated that you would like to get Rush-In back to performing well. Does that mean that you usually monitor and compare the performance of both systems? Were they delivering the same performance before you noticed the drop?

What you can do is swap hardware, such as processors and video cards, between the two systems to check whether you get the same behavior. Also, keep in mind that having identical hardware does not guarantee identical performance. Please let me know if you are able to swap hardware between both systems.

Regarding the questions about Intel® XTU, I am doing further research and will provide an update.

Wanner G.
idata
Employee

Hello RuebenAD,

I am writing to follow up on your inquiry.

Please let us know if you need any further help.

Wanner G.