Bug #2952
closed
vkernel crashes during boot
Description
Can't start vkernel on my 4.6 machine:
a@kl:~$ uname -a
DragonFly kl.zta.lk 4.6-RELEASE DragonFly v4.6.0.10.g16fba-RELEASE #10: Wed Aug 17 14:26:31 CEST 2016 root@kl.zta.lk:/usr/obj/usr/src/sys/X86_64_GENERIC x86_64
a@kl:~$ sudo /var/vkernel/4.6/boot/kernel/kernel -m 2g -r /vhost/dev/root.img -I auto:bridge0 -d -p /var/run/vkernel.vhost-dev.pid
Wachtwoord:
Using memory file: /var/vkernel/memimg.000003
KVM mapped at 0x8000000000-0x10000000000
TAP UNIT 7
Copyright (c) 2003-2016 The DragonFly Project.
Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
DragonFly v4.6.0.20.ged336-RELEASE #16: Wed Sep 21 14:46:27 CEST 2016
root@kl.zta.lk:/usr/obj/usr/src/sys/VKERNEL64
real memory = 2147483648 (2097152K bytes)
avail memory = 2027917312 (1980388K bytes)
Fatal trap 12: page fault while in kernel mode
cpuid = 0
fault virtual address = 0x9
fault code = supervisor read, page not present
instruction pointer = 0x2b:0x6ef9b9
stack pointer = 0x10:0x7fffffffe6f0
frame pointer = 0x10:0x1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 ()
current thread = pri 12
<- SMP: XXX
kernel: type 12 trap, code=0
CPU0 stopping CPUs: 0x0000000000000000
stopped
Stopped at 0x6ef9b9: movq 0x8(%rbp),%rdi
db>
Interestingly, I was able to start the 4.4 vkernel some time ago, but I recently recompiled it and it started to crash the same way 4.6 does. I still have access to my working 4.4 vkernel, but don't know what to compare to see the difference.
Updated by tuxillo over 8 years ago
- Category changed from Kernel to vkernel
- Status changed from New to Feedback
- Assignee set to tuxillo
Hi,
Can you boot it with -v to see if we get a better idea where it might be crashing?
Also, can you run 'addr2line -f -e /var/vkernel/4.6/boot/kernel/kernel 0x6ef9b9'?
As a final test, would you be able to cherry-pick these commits and build the vkernel to see if it works?
4dd1b99459f58c096edd1945eb144cf12006d85a
57cbfb93d182ba7966c918d24df413bf77f7e459
Best regards,
Antonio Huete
Updated by zhtw over 8 years ago
I only now noticed that I compiled it with the CONFIGARGS=-p option.
After I removed the option, the 4.4 vkernel started to work. The 4.6 vkernel, however, now hangs:
a@kl:~$ sudo /var/vkernel/4.6/boot/kernel/kernel -m 2g -r /vhost/dev/root.img -I auto:bridge0 -d -p /var/run/vkernel.vhost-dev.pid
Wachtwoord:
Using memory file: /var/vkernel/memimg.000003
KVM mapped at 0x8000000000-0x10000000000
TAP UNIT 7
Copyright (c) 2003-2016 The DragonFly Project.
Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
DragonFly v4.6.0.20.ged336-RELEASE #17: Wed Sep 21 15:29:59 CEST 2016
root@kl.zta.lk:/usr/obj/usr/src/sys/VKERNEL64
real memory = 2147483648 (2097152K bytes)
avail memory = 2027917312 (1980388K bytes)
DragonFly/MP: Multiprocessor
cpu0 (BSP)
cpu1 (AP)
Initialize MI interrupts
initclocks
SMP: AP CPU #1 Launched!
I will recompile it with -p again and do what you just asked.
(Thanks for such a quick response.)
Updated by tuxillo over 8 years ago
Hi,
If the 4.6 vkernel hangs there, you have to cherry-pick the commits I mentioned.
Let me know how it works for you.
Cheers,
Antonio Huete
Updated by zhtw over 8 years ago
Booting with -v gives exactly the same output.
addr2line doesn't seem to recognize the line:
a@kl:/usr/src$ sudo addr2line -f -e /var/vkernel/4.6-p/boot/kernel/kernel 0x6ef9b9
.mcount
??:?
After cherry-picking the two commits it still doesn't work, but addr2line shows something different:
a@kl:/usr/src$ sudo addr2line -f -e /var/vkernel/4.6-p-cherry/boot/kernel/kernel 0x6ef9b9
time
??:?
But could you check if I did everything right (I don't have much experience in all this):
a@kl:/usr/src$ git branch
  DragonFly_RELEASE_3_8
  DragonFly_RELEASE_4_0
  DragonFly_RELEASE_4_2
  DragonFly_RELEASE_4_4
* DragonFly_RELEASE_4_6
  master
a@kl:/usr/src$ git status | head -n 2
On branch DragonFly_RELEASE_4_6
Your branch is ahead of 'origin/DragonFly_RELEASE_4_6' by 2 commits.
a@kl:/usr/src$ git log -n 2
commit 49c5ab8692483d2ec472789f02823714859a81ba
Author: Antonio Huete Jimenez <tuxillo@quantumachine.net>
Date: Wed Sep 21 01:31:58 2016 +0200

    vkernel - Invalidate pte before setting attributes to the vm_page

    - Fixes a problem at mountroot time where it doesn't find any disk
      even though the disk is detected earlier.

commit 2ea5d46f3046bdaa05c13c93f357b16170c14461
Author: Antonio Huete Jimenez <tuxillo@quantumachine.net>
Date: Wed Sep 21 00:03:05 2016 +0200

    vkernel - Fix a vkernel lockup on startup

    - During ap_init() any pending IPIs is processed manually so
      clear gd_npoll as the real kernel does.
    - Do not disable interrupts for vkernels during lwkt_send_ipiq3()
      because they don't seem to be re-enabled afterwards as they should.
      I'm not entirely sure this is the right fix, more investigation
      is required.
Build command was: make -DNO_MODULES CONFIGARGS=-p buildkernel KERNCONF=VKERNEL64 -j 4
Install command: make -DNO_MODULES installkernel CONFIGARGS=-p KERNCONF=VKERNEL64 DESTDIR=/var/vkernel/4.6-p-cherry
So all of this was with CONFIGARGS=-p.
But when compiled without it, it started to work. Thanks!
Updated by zhtw over 8 years ago
Just to be clear, the summary:
When compiled with CONFIGARGS=-p, neither 4.4 nor 4.6 works -- both crash.
When compiled without this option, 4.4 works, but 4.6 hangs.
After cherry-picking:
Without CONFIGARGS=-p, 4.6 works.
With it, it still crashes.
Anyway, my problem is solved. Thank you so much!
Updated by zhtw over 8 years ago
- Status changed from Feedback to Closed
I'm closing this ticket because the problem was solved for me. But I wonder if it makes sense to back-port (basically cherry-pick and submit, as there are no conflicts) the two commits that solved the problem from master to DragonFly_RELEASE_4_6?
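For reference, a minimal sketch of what such a back-port might look like locally. It assumes /usr/src is a git checkout that already has master fetched, and that the two master commit IDs quoted earlier in this ticket are the ones to pick. The names SRC, BRANCH, COMMITS and backport are illustrative only, and the order of the two commits is an assumption; if they turn out to depend on each other, apply them in the order they appear on master.

```shell
#!/bin/sh
# Sketch only: back-port the two vkernel fixes onto the 4.6 release branch.
# SRC, BRANCH, COMMITS and backport() are hypothetical names, not from the ticket.
SRC="${SRC:-/usr/src}"
BRANCH="DragonFly_RELEASE_4_6"
# Master commit IDs as quoted earlier in this ticket (order is an assumption).
COMMITS="4dd1b99459f58c096edd1945eb144cf12006d85a 57cbfb93d182ba7966c918d24df413bf77f7e459"

# The function body runs in a subshell so the caller's working directory is untouched.
backport() (
    cd "$SRC" &&
    git checkout "$BRANCH" &&
    # -x records "(cherry picked from commit ...)" in each new commit message.
    # $COMMITS is deliberately unquoted so it splits into two arguments.
    git cherry-pick -x $COMMITS
)
```

Nothing happens until backport is called. After a successful run, the vkernel can be rebuilt and installed with the same make buildkernel / installkernel commands shown earlier in this ticket (without CONFIGARGS=-p).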