Mountroot before drives are initialized
As per Sacha's comment about splitting my previous bug report into separate
ones, here is an issue that has been bugging me on this box. (HP Proliant
It seems that mountroot is run before da0 (and cd0) get initialized on
this box. mountroot then complains that da0s1 is missing.
Here is a verbose dmesg:
Note that this problem seems to happen randomly when booting from the live CD,
but it happens every time when booting from disk.
#1 Updated by eric.j.christeson over 9 years ago
I am also running into this problem on a Dell Optiplex GX270 (P4 2.26GHz). I
have a SCSI drive as my boot/root drive, plus an IDE drive and a CD drive.
I've been tracking HEAD and first noticed the problems after the devfs changes.
At the time I was going a few days between rebuilds so I can't easily pinpoint
I noticed a few interesting things:
1. Booting in verbose mode does NOT result in a mountroot failure.
2. At the mountroot prompt, ? doesn't list da0 (root device) the first time, but
will list it on subsequent attempts.
3. At mountroot, specifying root doesn't work the first time I type it. If I type ?
first, or specify root twice, it works.
4. Booting with or without a CD in the CD-ROM drive gives the same results.
I've got a couple of hours, so I may try to look at this.
dmesg.fine.out: Verbose boot, no mountroot hang.
dmesg.hang: Standard boot. Note the failure after first specifying rootdev, the
strange cd0: message after ?, and root being found after specifying rootdev again.
#3 Updated by dillon over 9 years ago
Do a verbose boot so we can see when CAM starts its probes.
It is probably the SYM driver rejecting the initial bus scan
from CAM, and then later (after it is too late) notifying CAM that
a new bus and/or devices are present asynchronously.
I'm not sure how easy it will be to fix, the SYM driver is 10,000
lines and it will take a few hours to figure out how it deals
with the SCSI bus scan.
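The ordering problem described above can be modelled in a few lines of userland C. This is only a toy sketch, not CAM or SYM driver code: `dev_registry`, `registry_announce`, `registry_has` and `boot_finds_root` are hypothetical names, and a "tick" is an abstract stand-in for boot time. It shows that a single lookup at mountroot time fails whenever the driver announces the disk after that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVS 8

/* Hypothetical registry of device names announced so far. */
struct dev_registry {
	const char *names[MAX_DEVS];
	size_t count;
};

/* Record a device as present (what an async notification would do). */
void
registry_announce(struct dev_registry *r, const char *name)
{
	assert(r->count < MAX_DEVS);
	r->names[r->count++] = name;
}

/* Return true if the named device has already been announced. */
bool
registry_has(const struct dev_registry *r, const char *name)
{
	for (size_t i = 0; i < r->count; i++) {
		if (strcmp(r->names[i], name) == 0)
			return true;
	}
	return false;
}

/*
 * Simulate a boot: the driver announces da0s1 at announce_tick and
 * mountroot performs one lookup at mount_tick. Returns true if the
 * lookup succeeds.
 */
bool
boot_finds_root(int announce_tick, int mount_tick)
{
	struct dev_registry r = { .count = 0 };

	for (int tick = 0; tick <= mount_tick; tick++) {
		if (tick == announce_tick)
			registry_announce(&r, "da0s1");
	}
	return registry_has(&r, "da0s1");
}
```

In this model `boot_finds_root(1, 5)` succeeds, while `boot_finds_root(7, 5)` fails because the announcement arrives after the lookup. A second lookup later on would succeed, which matches the behaviour seen at the mountroot prompt.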
#6 Updated by elekktretterr over 9 years ago
> Do a verbose boot so we can see when CAM starts its probes.
> It is probably the SYM driver rejecting the initial bus scan
> from CAM, and then later (after it is too late) notifying CAM that
> a new bus and/or devices are present asynchronously.
> I'm not sure how easy it will be to fix, the SYM driver is 10,000
> lines and it will take a few hours to figure out how it deals
> with the SCSI bus scan.
I attached my verbose dmesg to the original email. Do you think our
problems are the same? This server uses the ciss driver.
#8 Updated by elekktretterr over 9 years ago
> Eric J. Christeson <firstname.lastname@example.org> added the comment:
> Don't know if this info helps, but setting SCSI_DELAY to 10000 or 20000
> had no
I should point out that this problem was already happening before we put
the ciss RAID card in it. It was happening with a USB-attached CD drive
too. I saw it being initialized AFTER mountroot was run.
#9 Updated by eric.j.christeson over 9 years ago
I've been booting with various levels of CAMDEBUG (and it boots fine since the
output gives enough delay for init) and something occurred to me. Do you also
have (n)atapicam compiled in? I noticed that sometimes I would see messages like:
**WARNING** waiting for the following device to finish configuring:
xpt: func=0xc0144b4f arg=0
With CAMDEBUG I see why: xpt has to enumerate all the SCSI, ATA, and USB
buses and devices in the system. xpt has to init before any of the SCSI devices
(or it does, even if it doesn't _have_ to), so I wonder if some of the delay
comes from there. I'm going to take out natapicam and see if things improve or not.
#10 Updated by elekktretterr over 9 years ago
I haven't tried it without natapicam. I'm going to have to install FBSD 7 on
it for now as we are rushing to put this box in the datacentre. I realized
I can't put DragonFly on it because it's going to run pgpool-II. The other
boxes all run a 64-bit OS, but all DragonFly builds are currently still only
32-bit and I was told that pgpool recovery (uses postgres PITR) is
#11 Updated by alexh over 9 years ago
Can you please try booting again with commit
8c05caabb07caf24fd0dfab4f1497fb58a8c31e0 and typing "set kern.disk_debug=1"
at the bootloader prompt?
It should give some more insight on the source of the problem. If you feel
like it, you can also set it to 2, so it gives some info on partition probing
for each slice.
#13 Updated by dillon over 9 years ago
:Hasso Tepper <email@example.com> added the comment:
:I have the same issue and also using ciss(4) (HP Proliant DL360 G6). Vanilla
:kernel fails 100% here to mount root from harddisk, but with kern.disk_debu=
If Alex doesn't come up with something in the next week or so we will
add a straight-up delay before mountroot.
In fact, could you test whether a straight-up delay before mountroot works?
Here's a patch.
diff --git a/sys/kern/vfs_conf.c b/sys/kern/vfs_conf.c
index a159afc..8bdea67 100644
@@ -109,8 +109,9 @@ SYSINIT(mountroot, SI_SUB_MOUNT_ROOT, SI_ORDER_SECOND, vfs_mountroot, NULL);
- int i;
cdev_t save_rootdev = rootdev;
+ int i;
+ int dummy;
* Make sure all disk devices created so far have also been probed,
@@ -121,6 +122,8 @@ vfs_mountroot(void *junk)
+ tsleep(&dummy, 0, "syncer", hz*2);
* The root filesystem information is compiled in, and we are
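For comparison with the fixed two-second sleep in the patch, here is a minimal userland sketch of a bounded wait that returns as soon as the root device shows up. This is not what the patch does and not kernel code: `wait_for_root`, `root_ready_fn`, `fake_bus` and `fake_bus_ready` are hypothetical names used only for illustration, and the sleep between polls is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true once the root device exists. */
typedef bool (*root_ready_fn)(void *arg);

/*
 * Poll `ready` at most max_polls times. Returns the number of polls
 * used on success, or -1 if the device never appeared.
 */
int
wait_for_root(root_ready_fn ready, void *arg, int max_polls)
{
	assert(ready != NULL && max_polls > 0);

	for (int polls = 1; polls <= max_polls; polls++) {
		if (ready(arg))
			return polls;
		/* A kernel version would sleep for a short interval here. */
	}
	return -1;
}

/* Fake bus for illustration: becomes ready after a set number of polls. */
struct fake_bus {
	int polls_until_ready;
};

bool
fake_bus_ready(void *arg)
{
	struct fake_bus *bus = arg;

	if (bus->polls_until_ready > 0)
		bus->polls_until_ready--;
	return bus->polls_until_ready == 0;
}
```

A fast bus costs only as many polls as it needs, and a bus that never answers still hits the upper bound, so the worst case is no longer than a fixed delay of the same total length.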
#15 Updated by alexh over 9 years ago
I don't know how to approach this. The solution lies within CAM and scsi_da, I
think. The disk is created in time (disk_create), but setdiskinfo does NOT
occur in time to trigger probing before mountroot.
I'll continue investigating, but I'd welcome any ideas on how to solve this in
the aforementioned direction.
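One way to picture the gap between disk creation and probing is a pair of counters that mountroot could check before proceeding. The sketch below is a userland illustration under that assumption only. `probe_tracker` and its functions are hypothetical names, not the disk subsystem's actual interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for disks that exist but are not yet probed. */
struct probe_tracker {
	int created;	/* disks registered so far */
	int probed;	/* disks whose slices have been probed */
};

/* Called when a disk is registered (the early step). */
void
tracker_disk_created(struct probe_tracker *t)
{
	t->created++;
}

/* Called when probing of one disk finishes (the late step). */
void
tracker_disk_probed(struct probe_tracker *t)
{
	assert(t->probed < t->created);
	t->probed++;
}

/* mountroot would proceed only when no created disk is still unprobed. */
bool
tracker_safe_to_mount(const struct probe_tracker *t)
{
	return t->probed == t->created;
}
```

With this bookkeeping, a disk that has been created but whose probe has not yet run keeps `tracker_safe_to_mount` false, which is the window in which mountroot currently fails.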