bruce.dubbs at gmail.com
Sun Mar 2 19:03:37 PST 2014
Ken Moffat wrote:
> On Sun, Mar 02, 2014 at 05:47:36PM -0600, Bruce Dubbs wrote:
>> Ken, You mentioned a problem with linux-3.13.5. Can you explain your
>> issues with that version?
>> -- Bruce
> Ah! I've just put a short summary in my reply to you re libexecdir
> on blfs-dev. Here's a MUCH fuller one:
> I built 7.5 with a 3.13.5 kernel (and 3.13.5 headers) to test some
> gnome packages. I hadn't built those packages since
> October/November and there was quite a lot of change amongst them,
> so I kept the new system in chroot (with a 3.13.4 kernel) until I'd
> finished libreoffice. Up to then, everything was fine.
> I then booted it - still ok - and started building the remainder.
> I'd forgotten that my original "build everything else" script (from
> when I was testing that everything built with make-4.0) included
> java/icedtea, and because I'd forgotten that, those versions were out
> of date. Realised it was building them, thought "oh well, let it
> build the old versions, no benefit but no harm done" and left it
> running. When I woke up in the morning, I was surprised to find
> that my OpenJDK script was still running rm -rf for the source
> directory, and had been doing so for more than 5 hours (both
> wall-clock time and CPU time), and was now at 99%-100% of one CPU,
> according to top.
> This build was running as root (yeah, I know) and somehow
> /etc/passwd- seemed to have been altered during OpenJDK.
> Specifically, I couldn't su or login, and none of the previous
> package's logs showed that file had changed during the package's
> build. But I also need to mention that this is an AMD Phenom
> running at -j4 for the builds, and it tends to "lose its lunch"
> doing that. In fact, pass-1 gcc usually does this (an ICE), and
> dropping the caches is not usually sufficient to allow it to
> proceed. So, much of LFS was built with -j3. I think I had changed
> back to -j4 for docbook and xorg.
> Anyway, I booted an older system, chrooted, fixed up the passwords.
> After that I resumed my build - got as far as GConf, and again rm
> -rf was taking 100% of one cpu, but this time only for about 15
> minutes before I killed it. Built 3.13.4, rebooted, everything else
> built fine running 3.13.4.
> Tried SysRq-T before killing the rm -rf, but rm was not running,
> merely ready to run. So, I asked on lkml about trying to find out
> what is going on. I got a suggestion to use perf (thought I hadn't
> enabled that, but in fact I had done or perhaps it is a default on
> x86), and to perhaps use 'crash' which needs a kernel with debugging
> information, and a saved vmlinux. So, I turned on a *lot* of
> debugging options, built crash, booted it - the kernel was noticeably
> slower, even simple things showed they were using a lot of CPU in
> top, but I couldn't replicate the problem.
> Yesterday (Sunday) evening, I went back to the regular 3.13.5
> kernel, confirmed that perf did work there, and tried to replicate
> the problem without any success. Then I built the new system - if
> nothing else, I'll be able to update and test my kde scripts on this
> build. Last time I looked, it was still running ok at -j4.
> Once again, I'm starting to think that cosmic rays are the problem,
> or perhaps someone forgot to feed or muck-out the imps, or someone
> upset the camel. [ Can you guess what I'm reading at the moment ? ].
> Probably, this kernel is fine for most people. Even if I can
> replicate the problem, it might be something in my .config. OTOH,
> this does mean it's going to be some days before I get round to
> testing 3.14-rc kernels :-( As always, YMMV and you may be burned
> even by a stable kernel.
OK. It sounds like you are doing a lot of quixotic things. I still do
almost all of my builds at -j1. I like to see my build logs in order.
I ran into a problem with the nouveau drivers and now can't duplicate
it after a reboot, so it may indeed be wild muons.