Processors
Intel® Processors, Tools, and Utilities
14506 Discussions

Why my Intel Atom D2550 chipset freezes after a few hours of operation? It is used in a Laptop MB designed by BCM for VeriFone Ruby2 Touch Screen terminal working with a DOM (SSD). SATA analyzer shows no issue.

NZlat
Beginner
1,332 Views

Here is what I found last night and why we changed the Controller Mode in BIOS setting from IDE to AHCI. It was disclosed by Intel to use

AHCI mode when using a SSD (Solid State Driver) or a.k.a. DOM (Disk-On-a-Module) with Serial ATA (Serial AT Attachment).

SATA is a computer bus interface that connects host bus adapters to mass storage devices such as hard disk drives and optical drives.

AHCI stand for Advance Host Controller Interface. AHCI is a hardware mechanism that allows software to communicate

with Serial ATA (SATA) devices (such as host bus adapters) that are designed to offer features not offered by Parallel ATA

(PATA) controllers, such as hot-plugging and native command queuing (NCQ). See link below.

http://forum.crucial.com/t5/Crucial-SSDs/Why-do-i-need-AHCI-with-a-SSD-Drive-Guide-Here-Crucial-AHCI-vs/td-p/57078 http://forum.crucial.com/t5/Crucial-SSDs/Why-do-i-need-AHCI-with-a-SSD-Drive-Guide-Here-Crucial-AHCI-vs/td-p/57078

As of last night since the setting was changed, the bad "V" DOM unit and the Golden "# 4" DOM unit heartbeats are still running fine.

It is very likely in the light of my finding that the solution to the freezing heartbeat is to do the suggested BIOS setting from IDE to AHCI.

I came this morning and found the "Old Fateful" Ruby2 in a Heartbeat freeze state at A9 off. I checked the SATA protocol monitor file and found an error.

The error is flagged as # 1 Code Violation and right after that # 2 Disparity Error. My research found a detailed explanation of the error codes.

The recording started last night when I left at 6:05PM and stopped after 8h 39min which is a little after 2:00AM today. For 55sec the Kernel tried to

correct the error and reestablish communication but the heartbeat went into a freeze state and the protocol error recovery stopped.

Below is the explanation of the process. I believe this issue is caused by a marginal hardware deviation that results in a SATA protocol malfunction right above the physical layer.

The link layer is the next layer and is directly above the physical layer (PHY). This layer is responsible for encapsulating data payloads and manages the protocol for sending and

receiving them. A data payload that is sent is called a Frame Information Structure (FIS). The link layer also provides some other services for ensuring data integrity, handling flow

control, and reducing EMI. The host and the disk each have their own transmit pair in a SATA cable, and theoretically data could be sent in both directions simultaneously.

However, this does not occur. Instead, the receiver sends "backchannel" information to the sender that indicates the status of the transfer in progress.

For instance, if an error were to be detected midtransmission, such as a disparity error, the receiver could notify the sender of this.

The link layer uses a set of defined Link Layer Primitives to perform these functions. Primitives are each 4 Dwords long and start with the control character K28.3

(except for ALIGN, as discussed above). The following table lists most of the defined primitives and their value in hexadecimal before encoding. The usage of these will be<...

0 Kudos
0 Replies
Reply