Here is what I found last night and why we changed the Controller Mode setting in the BIOS from IDE to AHCI. According to Intel, AHCI mode should be used
with an SSD (Solid State Drive), a.k.a. DOM (Disk-On-a-Module), attached over Serial ATA (Serial AT Attachment).
SATA is a computer bus interface that connects host bus adapters to mass storage devices such as hard disk drives and optical drives.
AHCI stands for Advanced Host Controller Interface. AHCI is a hardware mechanism that allows software to communicate
with Serial ATA (SATA) devices (such as host bus adapters) that are designed to offer features not offered by Parallel ATA
(PATA) controllers, such as hot-plugging and native command queuing (NCQ). See the link below.
http://forum.crucial.com/t5/Crucial-SSDs/Why-do-i-need-AHCI-with-a-SSD-Drive-Guide-Here-Crucial-AHCI-vs/td-p/57078
Since the setting was changed last night, the heartbeats of the bad "V" DOM unit and the Golden "# 4" DOM unit are still running fine.
In light of this finding, it is very likely that the solution to the freezing heartbeat is to change the BIOS setting from IDE to AHCI as suggested.
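One way to confirm that the BIOS change actually took effect is to look at which kernel driver claimed the SATA controller. The sketch below assumes a Linux system with sysfs mounted at /sys, and assumes that an Intel controller left in IDE mode is picked up by the `ata_piix` driver while AHCI mode is picked up by `ahci`. The helper names `bound_devices` and `controller_mode` are made up for illustration.

```python
import os
import re

# PCI addresses look like 0000:00:1f.2 (domain:bus:device.function).
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def bound_devices(driver, sysfs_root="/sys"):
    """Return the PCI addresses currently bound to the given kernel driver."""
    path = os.path.join(sysfs_root, "bus", "pci", "drivers", driver)
    if not os.path.isdir(path):
        return []  # driver not loaded, or not a Linux sysfs tree
    # The directory also holds control files (bind, unbind, ...); keep
    # only the entries that are named like PCI addresses.
    return sorted(e for e in os.listdir(path) if PCI_ADDR.match(e))


def controller_mode(sysfs_root="/sys"):
    """Guess the controller mode from which driver claimed a device.

    Assumption: 'ahci' means AHCI mode, 'ata_piix' means IDE mode.
    """
    if bound_devices("ahci", sysfs_root):
        return "AHCI"
    if bound_devices("ata_piix", sysfs_root):
        return "IDE (legacy)"
    return "unknown"
```

Calling `controller_mode()` on the unit after the BIOS change should report `AHCI` if the setting was applied. A result of `unknown` only means that neither of the two assumed drivers has a device bound.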
I came in this morning and found the "Old Fateful" Ruby2 in a heartbeat freeze state at A9 off. I checked the SATA protocol monitor file and found an error.
The error is flagged first as # 1 Code Violation and immediately afterwards as # 2 Disparity Error. My research found a detailed explanation of the error codes.
The recording started last night when I left at 6:05PM and stopped after 8h 39min, which is about 2:44AM today. For 55 sec the kernel tried to
correct the error and reestablish communication, but the heartbeat went into a freeze state and the protocol error recovery stopped.
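To make the two logged error types more concrete, here is a minimal sketch of how a receiver can tell them apart on an 8b/10b encoded link. It is a simplified model and not what the protocol monitor actually runs: it only counts ones and zeros per 10-bit word, whereas a real decoder uses the full code tables and tracks disparity per 6-bit and 4-bit sub-block. The function names are made up for illustration.

```python
def word_disparity(word):
    """Ones minus zeros in a 10-bit code word."""
    ones = bin(word & 0x3FF).count("1")
    return ones - (10 - ones)


def check_stream(words, rd=-1):
    """Classify each 10-bit word under a simplified running-disparity model.

    rd is the running disparity before the first word (-1 or +1).
    Simplification: a word counts as valid if its disparity is -2, 0 or +2.
    """
    results = []
    for w in words:
        d = word_disparity(w)
        if d not in (-2, 0, 2):
            # No valid 8b/10b code word is this unbalanced.
            results.append("code violation")
        elif d != 0 and (d > 0) == (rd > 0):
            # An unbalanced word must pull the running disparity back
            # the other way; same sign twice means a bit was corrupted.
            results.append("disparity error")
        else:
            results.append("ok")
            if d != 0:
                rd = 1 if d > 0 else -1
    return results


# K28.5 sent at negative running disparity has six ones (0011111010).
K28_5_RD_NEG = 0b0011111010
print(check_stream([K28_5_RD_NEG, K28_5_RD_NEG, 0b1111111111]))
# prints ['ok', 'disparity error', 'code violation']
```

Either error means that the bits seen by the receiver are not what a healthy transmitter would have sent, which is consistent with a marginal signal at the physical layer.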
Below is the explanation of the process. I believe this issue is caused by a marginal hardware deviation that results in a SATA protocol malfunction right above the physical layer.
The link layer is the next layer and sits directly above the physical layer (PHY). This layer is responsible for encapsulating data payloads and managing the protocol for sending and
receiving them. A data payload that is sent is called a Frame Information Structure (FIS). The link layer also provides some other services for ensuring data integrity, handling flow
control, and reducing EMI. The host and the disk each have their own transmit pair in a SATA cable, and theoretically data could be sent in both directions simultaneously.
However, this does not occur. Instead, the receiver sends "backchannel" information to the sender that indicates the status of the transfer in progress.
For instance, if an error such as a disparity error were detected mid-transmission, the receiver could notify the sender of it.
The link layer uses a set of defined Link Layer Primitives to perform these functions. Primitives are each one Dword (4 bytes) long and start with the control character K28.3
(except for ALIGN, as discussed above). The following table lists most of the defined primitives and their value in hexadecimal before encoding. The usage of these will be<...
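As a rough illustration of the primitive layout described above, the sketch below builds the pre-encoding Dword value from 8b/10b character names and classifies a Dword by its first character. It assumes the usual Dx.y naming (x is the low five bits, y the high three bits of the byte) and that the first character on the wire sits in the low-order byte of the Dword. The ALIGN and SOF byte sequences are my understanding of the SATA definitions and should be checked against the table in the reference. All function names are made up for illustration.

```python
def dxy(x, y):
    """Byte value (before encoding) of the character named Dx.y or Kx.y."""
    if not (0 <= x <= 31 and 0 <= y <= 7):
        raise ValueError("x must be 0..31 and y must be 0..7")
    return (y << 5) | x


K28_3 = dxy(28, 3)  # 0x7C, first character of most primitives
K28_5 = dxy(28, 5)  # 0xBC, first character of ALIGN


def pack_dword(b0, b1, b2, b3):
    """Pack four bytes into a Dword; b0 is sent first and stored lowest."""
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)


def classify(dword, first_is_control):
    """Classify a Dword by its first character.

    first_is_control is the K flag reported by the 8b/10b decoder for
    the first byte; without it, 0x7C could just as well be data D28.3.
    """
    if not first_is_control:
        return "data"
    first = dword & 0xFF
    if first == K28_5:
        return "ALIGN"
    if first == K28_3:
        return "primitive"
    return "unknown control character"


# Assumed sequences: ALIGN = K28.5 D10.2 D10.2 D27.3, SOF = K28.3 D21.5 D23.1 D23.1
ALIGN = pack_dword(K28_5, dxy(10, 2), dxy(10, 2), dxy(27, 3))
SOF = pack_dword(K28_3, dxy(21, 5), dxy(23, 1), dxy(23, 1))
print(hex(ALIGN), classify(ALIGN, True))  # prints 0x7b4a4abc ALIGN
print(hex(SOF), classify(SOF, True))      # prints 0x3737b57c primitive
```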