Skip to content

Commit

Permalink
EDAC, mce_amd: Print IPID and Syndrome on a separate line
Browse files Browse the repository at this point in the history
Currently, the IPID and Syndrome are printed on the same line as the
Address. There are cases when we can have a valid Syndrome but not a
valid Address.

For example, the MCA_SYND register can be used to hold more detailed
error info that the hardware folks can use. It's not just DRAM ECC
syndromes. There are some error types that aren't related to memory that
may have valid syndromes, like some errors related to links in the Data
Fabric, etc.

In these cases, the IPID and Syndrome are not printed at the same log
level as the rest of the stanza, so users won't see them on the console.

Console:
  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Dmesg:
  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  , Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Print the IPID first and on a new line. The IPID should always be
printed on SMCA systems. The Syndrome will then be printed with the IPID
and at the same log level when valid:

  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  [Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Signed-off-by: Yazen Ghannam <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
  • Loading branch information
yghannam authored and suryasaimadhu committed Feb 16, 2017
1 parent e62d2ca commit 75bf2f6
Showing 1 changed file with 4 additions and 5 deletions.
9 changes: 4 additions & 5 deletions drivers/edac/mce_amd.c
Original file line number Diff line number Diff line change
Expand Up @@ -991,20 +991,19 @@ int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
pr_cont("]: 0x%016llx\n", m->status);

if (m->status & MCI_STATUS_ADDRV)
pr_emerg(HW_ERR "Error Addr: 0x%016llx", m->addr);
pr_emerg(HW_ERR "Error Addr: 0x%016llx\n", m->addr);

if (boot_cpu_has(X86_FEATURE_SMCA)) {
pr_emerg(HW_ERR "IPID: 0x%016llx", m->ipid);

if (m->status & MCI_STATUS_SYNDV)
pr_cont(", Syndrome: 0x%016llx", m->synd);

pr_cont(", IPID: 0x%016llx", m->ipid);

pr_cont("\n");

decode_smca_errors(m);
goto err_code;
} else
pr_cont("\n");
}

if (!fam_ops)
goto err_code;
Expand Down

0 comments on commit 75bf2f6

Please sign in to comment.