Finding IDE drive firmware version from Solaris

I am working on a mysterious bug at work, and absent any other promising theories, I decided to compare the firmware revisions of all the hard drives involved to see if they correlated with the pattern of problems. If you aren’t a sysadmin, that will either not make any sense at all, or will give you some insight into what our lives are like.

So I figured it can’t be too hard, right? Getting this info from a SCSI disk turns out to be easy, as long as you have Veritas NetBackup server installed and are on SPARC. You run “sgscan”, which uses the SCSI Generic driver to scan the buses. Fine, but I was on Solaris x86, and though I did have the NetBackup client package available to me, sgscan is only available in the server install.

After digging around some, I learned about the uscsi(7) man page. Interesting stuff there, and good unique terms to search on in Google. As a result, I found some code at MIT I could adapt to do the same thing sgscan was doing, but for free, and in source form, so I could compile it on Solaris x86. The resulting program is here.

Of course, this is only for SCSI, not for IDE. I did it because I thought it would be nice to understand how sgscan does its work, and also because I was hoping that Solaris’ unified handling of disk and tape devices might extend far enough to make IDE drives act like SCSI ones for the sake of inquiries. No such luck. When you run it on a raw device for an IDE drive, you get “inappropriate ioctl”.
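For reference, here is a minimal sketch of what you do with a SCSI INQUIRY reply once you have one. It assumes the standard 36-byte inquiry layout (vendor in bytes 8-15, product in bytes 16-31, revision in bytes 32-35) and does not show the uscsi ioctl itself. The names parse_inquiry and copy_field are made up for illustration and are not taken from sgscan or the MIT code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offsets and widths of the ASCII fields in standard SCSI INQUIRY data. */
enum {
    INQ_MIN_LEN = 36,
    INQ_VID_OFF = 8,  INQ_VID_LEN = 8,   /* vendor id */
    INQ_PID_OFF = 16, INQ_PID_LEN = 16,  /* product id */
    INQ_REV_OFF = 32, INQ_REV_LEN = 4    /* product revision */
};

/*
 * Copy a fixed-width, space-padded field into a NUL-terminated string,
 * dropping trailing spaces and NULs. dst must hold at least len + 1 bytes.
 */
static void
copy_field(char *dst, const unsigned char *src, size_t len)
{
    while (len > 0 && (src[len - 1] == ' ' || src[len - 1] == '\0'))
        len--;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/*
 * Hypothetical helper: split an inquiry reply into its three strings.
 * Returns 0 on success, -1 if the reply is too short to contain them.
 */
static int
parse_inquiry(const unsigned char *buf, size_t buflen,
    char vid[9], char pid[17], char rev[5])
{
    assert(buf != NULL);
    if (buflen < INQ_MIN_LEN)
        return (-1);
    copy_field(vid, buf + INQ_VID_OFF, INQ_VID_LEN);
    copy_field(pid, buf + INQ_PID_OFF, INQ_PID_LEN);
    copy_field(rev, buf + INQ_REV_OFF, INQ_REV_LEN);
    return (0);
}
```

The fields are not NUL-terminated on the wire, which is why the helper trims and terminates them before anything tries to print them as C strings.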

I knew that IDE drives get probed at boot just like SCSI ones do. I found this line in my logs, and then went and found the corresponding code in the Solaris source.

gda: Disk0:  <Vendor 'Gen-ATA ' Product 'Maxtor 5T030H3  '>

That convinced me that the data I needed was very likely there in the kernel, but the gda driver had not bothered to print it out. I toyed briefly with trying to compile the gda driver, but I know from past experience that the Solaris sources are pretty much read-only the way they ship them. Critical files are missing (hello, tcp.c, where are you?) and the Makefiles assume that you are sitting at Scott McNealy’s desk in Mountain View.

So I started digging, planning to use adb to ferret out the info. Let me tell you, digging backwards from the log message towards the probe was not a good idea. I wasted lots of time out in kernel hinterland, then finally realized that “Gen-ATA” is probably not coming from the disk (duh!), and could probably be found in the source, and would act as a useful signpost. From there it got much easier, since I could see that indeed the firmware revision gets copied out of the inquiry reply from the IDE disk and put into the fake SCSI inquiry structure. (This faking they do is part of what got me thinking uscsi(7) would be enough.)

static void
ata_disk_fake_inquiry(
    ata_drv_t *ata_drvp)
{
    struct ata_id *ata_idp = &ata_drvp->ad_id;
    struct scsi_inquiry *inqp = &ata_drvp->ad_inquiry;

    ADBG_TRACE(("ata_disk_fake_inquiry entered\n"));

    if (ata_idp->ai_config & ATA_ID_REM_DRV) /* ide removable bit */
        inqp->inq_rmb = 1;      /* scsi removable bit */

    /* vendor is a fixed string; it never comes from the disk */
    (void) strncpy(inqp->inq_vid, "Gen-ATA ", sizeof (inqp->inq_vid));
    inqp->inq_dtype = DTYPE_DIRECT;
    inqp->inq_qual = DPQ_POSSIBLE;

    /* model and firmware revision come from the drive's identify data */
    (void) strncpy(inqp->inq_pid, ata_idp->ai_model,
        sizeof (inqp->inq_pid));
    (void) strncpy(inqp->inq_revision, ata_idp->ai_fw,
        sizeof (inqp->inq_revision));
}
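One consequence of this copy is worth spelling out. The sketch below uses simplified stand-ins for the kernel structures, and the field widths are my assumption: 8/16/4 bytes for the standard SCSI inquiry vendor/product/revision fields, and 8 and 40 characters for the ATA firmware and model strings. If those widths hold, the fake inquiry keeps only the first 4 characters of the firmware revision and the first 16 of the model, so the full strings still have to come from the ATA identify data itself. The struct and function names here are hypothetical, and memcpy stands in for the kernel's strncpy because the fields are fixed-width rather than NUL-terminated.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for the ATA identify data (widths are assumptions). */
struct fake_ata_id {
    char ai_fw[8];      /* firmware revision, space padded */
    char ai_model[40];  /* model number, space padded */
};

/* Simplified stand-in for the SCSI inquiry data (widths are assumptions). */
struct fake_scsi_inquiry {
    char inq_vid[8];
    char inq_pid[16];
    char inq_revision[4];
};

/* Mimic the copy: each destination field bounds how much is kept. */
static void
fake_inquiry(const struct fake_ata_id *idp, struct fake_scsi_inquiry *inqp)
{
    assert(idp != NULL && inqp != NULL);
    memcpy(inqp->inq_vid, "Gen-ATA ", sizeof (inqp->inq_vid));
    memcpy(inqp->inq_pid, idp->ai_model, sizeof (inqp->inq_pid));
    memcpy(inqp->inq_revision, idp->ai_fw, sizeof (inqp->inq_revision));
}
```

Feeding it an 8-character revision such as TAH71DP0 leaves only TAH7 in inq_revision, while a 14-character model padded with spaces survives intact in the 16-byte product field, which matches the padded Product string in the boot log line.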

Finally, I came up with the following bit of adb magic, which prints out all the info. (You paste in the six expression lines that begin with “(*((”; adb prints the physmem line and the address-prefixed lines that follow.)

# adb -k
physmem 3f8b0
(*((***ata_state)+0t12))+0t20/20c
(*((***ata_state)+0t12))+0t50/8c
(*((***ata_state)+0t12))+0t58/40c
(*((***ata_state)+0t76))+0t20/20c
(*((***ata_state)+0t76))+0t50/8c
(*((***ata_state)+0t76))+0t58/40c
0xe153761c:     T3RJ9B7C
0xe153763a:     TAH71DP0
0xe153763e:     Maxtor 5T030H3
0xe153791c:     V80KQ8WC
0xe153793a:     ZAH814YO
0xe153793e:     Maxtor 98196H8

That’s the serial number first, then the revision, then the model number of the disk. The second set of data is for the slave IDE device on the primary controller. I do not have a machine with multiple controllers to test with, but you’d be monkeying around with the part of the expressions nearest the ata_state to skip to the second controller. Good luck!
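If you have more than a couple of drives to check, typing those expressions by hand gets old. Here is a small sketch that builds them from a table. The field offsets and widths (20/20, 50/8, 58/40) and the drive offsets (12 and 76) are copied straight from my session above, so they are only known to be right for that kernel build. The names adb_field and format_adb_expr are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One identify string: byte offset into the drive structure and its width. */
struct adb_field {
    const char *name;
    int offset;
    int width;
};

/* Values taken from the adb session above; tied to that kernel build. */
static const struct adb_field adb_fields[] = {
    { "serial",   20, 20 },
    { "revision", 50,  8 },
    { "model",    58, 40 },
};

/*
 * Format one adb expression for the drive whose pointer lives at
 * drive_off within the controller structure. Returns snprintf's result,
 * i.e. the length the full expression needs.
 */
static int
format_adb_expr(char *buf, size_t buflen, int drive_off,
    const struct adb_field *f)
{
    assert(buf != NULL && f != NULL);
    return (snprintf(buf, buflen, "(*((***ata_state)+0t%d))+0t%d/%dc",
        drive_off, f->offset, f->width));
}
```

Loop over adb_fields for drive offsets 12 and 76, print each result on its own line, and you get the same six lines I pasted into adb.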

This is a read-only adb operation, so it is very safe. I did it on all of our machines while they were in service, and I still work for my employer. However, YMMV. Do not ever start up adb unless you either can tolerate a panic, or know why you are not going to make one.
