From stefan at mdy.univie.ac.at Tue Sep 1 08:01:29 2009
From: stefan at mdy.univie.ac.at (Stefan Boresch)
Date: Tue, 1 Sep 2009 17:01:29 +0200
Subject: [Caos] jmicron module missing during install
Message-ID: <20090901150129.GB10349@loop.mdy.univie.ac.at>

The following pertains to an install using

caos-nsa-full-1.0.8.x86_64.iso

(md5sum checked OK. I was not able to find a later install CD, let alone
the live CD described in the wiki ???)

The problem arose on rather new machines, Intel Xeon Nehalem (E5530/X5550)
and matching supermicro motherboards. Since there were only ATA CD/DVD
drives on these machines, the jmicron stuff is used to simulate SCSI.

Unfortunately, the Caos NSA 1.0.8 installer doesn't load the jmicron
module, and so while the "test" target reports SUCCESS, the
"autoinstall" target fails. To add insult to injury, the jmicron
module is included on the boot CD. A naive use of the (by now
undocumented(?)) preshell boot option doesn't help, since the
check for the presence of the cinch installer is carried out before
the preshell is executed, so loading jmicron in the preshell is too
late ...

I finally could install after studying the init scripts on the CD and
manually carrying out *again* the check for the cinch installer after
having loaded the jmicron module. So, while this problem is solved, I thought
this might help someone later on. At present, this raises the following
issues.

* Is it only stupid me, or does the wiki describe something I couldn't find to
download? Presumably, with a live cd, the above issue could be solved easily.

* Why is the preshell option not documented anymore? I knew about it
because I needed it in the past to manually load an ethernet driver;
it was documented then. This time I had to read the init script to
remember the exact "incantation".

* The specific problem suggests to me that preshell (if specified)
should be the first thing to be executed. It's the logical point to
fix things that the autodetection missed. Checking for the cinch
installer should come later ...

* Having "test" report success only to be greeted with 'cannot find
cinch installer, rebooting in 60 seconds' when attempting
"autoinstall" resulted in some medium to heavy cursing ... (one might
consider finding the cinch installer a prerequisite for success ?)

[Don't construe this as too much of a rant; caos never claimed to be for
aunt Tilly ... in retrospect reading through these nash scripts was quite
educational ;-) ]

Cheers,

Stefan

--
Stefan Boresch
Institute for Computational Biological Chemistry
University of Vienna, Waehringerstr. 17       A-1090 Vienna, Austria
Phone: -43-1-427752715                        Fax:   -43-1-427752790

From gmkurtzer at gmail.com Tue Sep 1 09:05:25 2009
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Tue, 1 Sep 2009 09:05:25 -0700
Subject: [Caos] jmicron module missing during install
In-Reply-To: <20090901150129.GB10349@loop.mdy.univie.ac.at>
References: <20090901150129.GB10349@loop.mdy.univie.ac.at>
Message-ID: <571f1a060909010905t6435e8dexf785e9e81d002b17@mail.gmail.com>

Hi Stefan. Comments inline...

On Tue, Sep 1, 2009 at 8:01 AM, Stefan Boresch wrote:
> The following pertains to an install using
>
> caos-nsa-full-1.0.8.x86_64.iso
>
> (md5sum checked OK. I was not able to find a later install CD, let alone
> the live CD described in the wiki ???)
>
> The problem arose on rather new machines, Intel Xeon Nehalem (E5530/X5550)
> and matching supermicro motherboards. Since there were only ATA CD/DVD
> drives on these machines, the jmicron stuff is used to simulate SCSI.
>
> Unfortunately, the Caos NSA 1.0.8 installer doesn't load the jmicron
> module, and so while the "test" target reports SUCCESS, the
> "autoinstall" target fails. To add insult to injury, the jmicron
> module is included on the boot CD.
> A naive use of the (by now
> undocumented(?)) preshell boot option doesn't help, since the
> check for the presence of the cinch installer is carried out before
> the preshell is executed, so loading jmicron in the preshell is too
> late ...
>
> I finally could install after studying the init scripts on the CD and
> manually carrying out *again* the check for the cinch installer after
> having loaded the jmicron module. So, while this problem is solved, I thought
> this might help someone later on. At present, this raises the following
> issues.
>
> * Is it only stupid me, or does the wiki describe something I couldn't find to
> download? Presumably, with a live cd, the above issue could be solved easily.

Yes, we know. We were planning on releasing the live media installer
already, but then there were several high priority exploits that were
found in the current kernels.

For testing purposes, they can be found at:
http://altruistic.infiscale.org/isos. Please expect a new one in the
next day or so, then we will push that to the live mirrors.

>
> * Why is the preshell option not documented anymore? I knew about it
> because I needed it in the past to manually load an ethernet driver;
> it was documented then. This time I had to read the init script to
> remember the exact "incantation".

It was documented at one point in the wiki, but I am not sure what
happened to it or if someone removed it.

>
> * The specific problem suggests to me that preshell (if specified)
> should be the first thing to be executed. It's the logical point to
> fix things that the autodetection missed. Checking for the cinch
> installer should come later ...

Agreed.

>
> * Having "test" report success only to be greeted with 'cannot find
> cinch installer, rebooting in 60 seconds' when attempting
> "autoinstall" resulted in some medium to heavy cursing ... (one might
> consider finding the cinch installer a prerequisite for success ?)

Yes, again I agree.

>
> [Don't construe this as too much of a rant; caos never claimed to be for
> aunt Tilly ... in retrospect reading through these nash scripts was quite
> educational ;-) ]

Since we are moving the installer to the live media solution I don't
think most of these issues will continue to be a problem. If you have
the opportunity, I would be very interested in some feedback on the
new live media installer.

Thanks,
Greg

--
Greg Kurtzer
http://www.infiscale.com/
http://www.perceus.org/
http://www.caoslinux.org/

From stefan at mdy.univie.ac.at Wed Sep 2 00:19:22 2009
From: stefan at mdy.univie.ac.at (Stefan Boresch)
Date: Wed, 2 Sep 2009 09:19:22 +0200
Subject: [Caos] jmicron module missing during install
In-Reply-To: <571f1a060909010905t6435e8dexf785e9e81d002b17@mail.gmail.com>
References: <20090901150129.GB10349@loop.mdy.univie.ac.at> <571f1a060909010905t6435e8dexf785e9e81d002b17@mail.gmail.com>
Message-ID: <20090902071922.GC10349@loop.mdy.univie.ac.at>

Hi Greg,

On Tue, Sep 01, 2009 at 09:05:25AM -0700, Greg Kurtzer wrote:
> On Tue, Sep 1, 2009 at 8:01 AM, Stefan Boresch wrote:
> Yes, we know. We were planning on releasing the live media installer
> already, but then there were several high priority exploits that were
> found in the current kernels.
>
> For testing purposes, they can be found at:
> http://altruistic.infiscale.org/isos. Please expect a new one in the
> next day or so, then we will push that to the live mirrors.

OK, I thought something like this ...

> > * Why is the preshell option not documented anymore? I knew about it
> > because I needed it in the past to manually load an ethernet driver;
> > it was documented then. This time I had to read the init script to
> > remember the exact "incantation".
>
> It was documented at one point in the wiki, but I am not sure what
> happened to it or if someone removed it.

I know; that's where I found the option some months ago, when it was
*very* helpful as at the time the installer made a wrong choice of
ethernet driver ... At least I knew that something most likely existed.
My general impression of the wiki is that some editors seem to be
quick to delete stuff once newer (= better?) versions appear. E.g.,
the one-liners outlining which modules to install for nvidia were also
online for a few months; a few weeks ago, when I wanted to look at them
for a new machine, they were gone and could only be "resurrected" by
google (meaning the page is still available, but there is no direct
link pointing to it).

[snip]

> Since we are moving the installer to the live media solution I don't
> think most of these issues will continue to be a problem. If you have
> the opportunity, I would be very interested in some feedback on the
> new live media installer.
>

That makes sense. I will certainly test the new installer on new
machines; unfortunately, those needing the jmicron module are already
in production (my co-workers wouldn't appreciate it if I told them to
move those 12TB (or so) somewhere else for a quick test ;-) ).

Speaking of installers: any chance of a net-install, or maybe a "howto"
on how one could install over the net? Until caos, it had been some
years since I actually inserted a CD anywhere ...
[wiki.caoslinux.org/Caos_NSA_1_Installation_-_Soekris_net5501
describes an install that relies in part on a netboot]

Best regards,

Stefan

--
Stefan Boresch
Institute for Computational Biological Chemistry
University of Vienna, Waehringerstr. 17       A-1090 Vienna, Austria
Phone: -43-1-427752715                        Fax:   -43-1-427752790

From stefan at mdy.univie.ac.at Wed Sep 2 05:17:19 2009
From: stefan at mdy.univie.ac.at (Stefan Boresch)
Date: Wed, 2 Sep 2009 14:17:19 +0200
Subject: [Caos] Conflict nvidia-libs and xorg-server-glx
Message-ID: <20090902121719.GE10349@loop.mdy.univie.ac.at>

[applies to stuff in nsa-testing]

This has occurred to me for all nvidia-*-185.* packages for
several weeks: After 'smart update' (testing repository enabled),
'smart upgrade' fails with

Committing transaction...
Preparing...                    ######################################## [  0%]
error: file /usr/lib64/xorg/modules/extensions/libglx.so from install of nvidia-libs-185.18.14-16.caos.x86_64 conflicts with file from package xorg-server-glx-1.4-8.caos.x86_64

I am working around that by forcing install of the nvidia packages by
hand, then doing 'smart upgrade' for the rest, but nice it is not ...

[I assume that I want libglx.so from the nvidia.rpm; things seem to work
that way]

Cheers,

Stefan

--
Stefan Boresch
Institute for Computational Biological Chemistry
University of Vienna, Waehringerstr. 17       A-1090 Vienna, Austria
Phone: -43-1-427752715                        Fax:   -43-1-427752790

From gmkurtzer at gmail.com Wed Sep 2 07:50:02 2009
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Wed, 2 Sep 2009 07:50:02 -0700
Subject: [Caos] Conflict nvidia-libs and xorg-server-glx
In-Reply-To: <20090902121719.GE10349@loop.mdy.univie.ac.at>
References: <20090902121719.GE10349@loop.mdy.univie.ac.at>
Message-ID: <571f1a060909020750u4aebd406x4e203b519f9a372@mail.gmail.com>

Hi Stefan,

There is an update available in nsa-testing now that adds a Conflicts
tag for xorg-server-glx. Can you give another upgrade a test and let
me know how it goes?

Thanks!
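[Editorial sketch] The file conflict reported above names one path and two packages. As a minimal illustration (the helper name parse_rpm_conflict is hypothetical and is not part of rpm or smart), the three pieces can be pulled out of the error line with sed so that each can be inspected separately:

```shell
#!/bin/sh
# Hypothetical helper: extract the conflicting path and both package names
# from an rpm "file ... conflicts with file from package ..." error line.
parse_rpm_conflict() {
    # $1: the full error line as printed during the transaction
    printf '%s\n' "$1" | sed -n \
        's/^error: file \(.*\) from install of \(.*\) conflicts with file from package \(.*\)$/\1 \2 \3/p'
}

# The error line quoted in the report above.
msg='error: file /usr/lib64/xorg/modules/extensions/libglx.so from install of nvidia-libs-185.18.14-16.caos.x86_64 conflicts with file from package xorg-server-glx-1.4-8.caos.x86_64'

# Prints: <path> <package being installed> <package already installed>
parse_rpm_conflict "$msg"
```

With the path in hand, `rpm -qf /usr/lib64/xorg/modules/extensions/libglx.so` reports which installed package currently owns the file, and `rpm -ql xorg-server-glx` lists everything that package ships.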
Greg

--
Greg Kurtzer
http://www.infiscale.com/
http://www.perceus.org/
http://www.caoslinux.org/

From stefan at mdy.univie.ac.at Thu Sep 3 01:50:18 2009
From: stefan at mdy.univie.ac.at (Stefan Boresch)
Date: Thu, 3 Sep 2009 10:50:18 +0200
Subject: [Caos] Conflict nvidia-libs and xorg-server-glx
In-Reply-To: <571f1a060909020750u4aebd406x4e203b519f9a372@mail.gmail.com>
References: <20090902121719.GE10349@loop.mdy.univie.ac.at> <571f1a060909020750u4aebd406x4e203b519f9a372@mail.gmail.com>
Message-ID: <20090903085018.GF10349@loop.mdy.univie.ac.at>

Hi Greg,

On Wed, Sep 02, 2009 at 07:50:02AM -0700, Greg Kurtzer wrote:
> Hi Stefan,
>
> There is an update available in nsa-testing now that adds a Conflicts
> tag for xorg-server-glx. Can you give another upgrade a test and let
> me know how it goes?
>

Just tried, and it did this

-----------------------------------------------------
Computing transaction...

Upgrading packages (7):
  nvidia-cuda-185.18.14-17.caos@x86_64
  nvidia-cuda-devel-185.18.14-17.caos@x86_64
  nvidia-devel-185.18.14-17.caos@x86_64
  nvidia-kmods-185.18.14-17.caos@x86_64
  nvidia-libs-185.18.14-17.caos@x86_64
  nvidia-utils-185.18.14-17.caos@x86_64
  perl-5.10.0-13.caos@x86_64

Removing packages (2):
  metapkg-gl-1.2-162.nsa1@x86_64         xorg-server-glx-1.4-8.caos@x86_64

29.7MB of package files are needed. 508.5kB will be freed.
-----------------------------------------------------

Is this what you intended? (Doesn't xorg-server-glx contain some other
stuff that may be needed for GL related stuff ??)

I use the chance to report another little snag that I see constantly
on this machine ... (output from the same smart upgrade run)

-----------------------------------------------------
   4:Installing nvidia-kmods    ######################################## [ 25%]
Output from nvidia-kmods-185.18.14-17.caos@x86_64:
WARNING: Module /lib/modules/2.6.30.4-3.caos/updates/kvm-amd.ko ignored, due to loop
WARNING: Loop detected: /lib/modules/2.6.30.4-3.caos/updates/kvm.ko which needs kvm.ko again!
WARNING: Module /lib/modules/2.6.30.4-3.caos/updates/kvm.ko ignored, due to loop
WARNING: Module /lib/modules/2.6.30.4-3.caos/updates/kvm-intel.ko ignored, due to loop
   5:Cleaning nvidia-kmods      ######################################## [ 31%]
WARNING: Module /lib/modules/2.6.30.4-3.caos/updates/kvm-amd.ko ignored, due to loop
WARNING: Loop detected: /lib/modules/2.6.30.4-3.caos/updates/kvm.ko which needs kvm.ko again!
WARNING: Module /lib/modules/2.6.30.4-3.caos/updates/kvm.ko ignored, due to loop
WARNING: Module /lib/modules/2.6.30.4-3.caos/updates/kvm-intel.ko ignored, due to loop
   6:Removing xorg-server-glx   ######################################## [ 37%]
-----------------------------------------------------

Since I don't use kvm, I don't know whether this is harmless or less so ...

Thanks,

Stefan

--
Stefan Boresch
Institute for Computational Biological Chemistry
University of Vienna, Waehringerstr. 17       A-1090 Vienna, Austria
Phone: -43-1-427752715                        Fax:   -43-1-427752790

From gmkurtzer at gmail.com Thu Sep 3 10:37:31 2009
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Thu, 3 Sep 2009 10:37:31 -0700
Subject: [Caos] Conflict nvidia-libs and xorg-server-glx
In-Reply-To: <20090903085018.GF10349@loop.mdy.univie.ac.at>
References: <20090902121719.GE10349@loop.mdy.univie.ac.at> <571f1a060909020750u4aebd406x4e203b519f9a372@mail.gmail.com> <20090903085018.GF10349@loop.mdy.univie.ac.at>
Message-ID: <571f1a060909031037g7ddd85b8ndc1650dfeb16a799@mail.gmail.com>

Hi Stefan,

The smart output removing xorg-server-glx is perfect, as that package
only provides the conflicting library (that is why it was separated
into a sub-package).

The install warnings about kvm-kmods aren't harmful, but they aren't
helpful either. They appear because the KVM package provides its own
kernel modules, and so does our kernel, so there is a duplicate set of
kernel modules for kvm*.ko. Chances are we are going to remove the
kvm-kmods package and just use the KVM support that is in the kernel.

Thanks,
Greg

--
Greg Kurtzer
http://www.infiscale.com/
http://www.perceus.org/
http://www.caoslinux.org/

From Eliezer.Rosengaus at kla-tencor.com Tue Sep 22 08:54:39 2009
From: Eliezer.Rosengaus at kla-tencor.com (Rosengaus, Eliezer)
Date: Tue, 22 Sep 2009 08:54:39 -0700
Subject: [Caos] Shutdown problems
Message-ID: <87B5F8D7EA676B4EA235361F2250E6650281F2F1@CA1EXCLV06.adcorp.kla-tencor.com>

I have a cluster of boxes running caos linux (provisioned by perceus). Each
box is a dual-socket motherboard (Supermicro X8DTN). When I try to shut down
a node by issuing "shutdown -f -h now", the shutdown proceeds down to a
point where one of the CPUs is shut down, but the system then freezes and
never completes the powerdown sequence. Has anyone seen this before or have
any idea of how to fix it?

Eliezer Rosengaus

From gmk at infiscale.org Tue Sep 22 10:10:52 2009
From: gmk at infiscale.org (Greg Kurtzer)
Date: Tue, 22 Sep 2009 10:10:52 -0700
Subject: [Caos] Shutdown problems
In-Reply-To: <87B5F8D7EA676B4EA235361F2250E6650281F2F1@CA1EXCLV06.adcorp.kla-tencor.com>
References: <87B5F8D7EA676B4EA235361F2250E6650281F2F1@CA1EXCLV06.adcorp.kla-tencor.com>
Message-ID: <571f1a060909221010l799d7c5dk5d49a17f93372dee@mail.gmail.com>

Hi

What are the last things printed to the screen?

--
Greg M. Kurtzer
Chief Technology Officer
HPC Systems Architect
Infiscale, Inc. - http://www.infiscale.com

From Eliezer.Rosengaus at kla-tencor.com Tue Sep 22 10:30:40 2009
From: Eliezer.Rosengaus at kla-tencor.com (Rosengaus, Eliezer)
Date: Tue, 22 Sep 2009 10:30:40 -0700
Subject: [Caos] Shutdown problems
In-Reply-To: <571f1a060909221010l799d7c5dk5d49a17f93372dee@mail.gmail.com>
References: <87B5F8D7EA676B4EA235361F2250E6650281F2F1@CA1EXCLV06.adcorp.kla-tencor.com> <571f1a060909221010l799d7c5dk5d49a17f93372dee@mail.gmail.com>
Message-ID: <87B5F8D7EA676B4EA235361F2250E6650281F343@CA1EXCLV06.adcorp.kla-tencor.com>

Disabling non-boot CPUs
...
CPU 1 is now offline

SYSTEM FREEZES

Eliezer Rosengaus
Sr. Systems Architect, Image Computer
KLA-Tencor Corp.
5 Technology Dr.
MS 5-1245
Milpitas, CA 95035

(408)875-5196

From gmk at infiscale.org Tue Sep 22 11:02:12 2009
From: gmk at infiscale.org (Greg Kurtzer)
Date: Tue, 22 Sep 2009 11:02:12 -0700
Subject: [Caos] Shutdown problems
In-Reply-To: <87B5F8D7EA676B4EA235361F2250E6650281F343@CA1EXCLV06.adcorp.kla-tencor.com>
References: <87B5F8D7EA676B4EA235361F2250E6650281F2F1@CA1EXCLV06.adcorp.kla-tencor.com> <571f1a060909221010l799d7c5dk5d49a17f93372dee@mail.gmail.com> <87B5F8D7EA676B4EA235361F2250E6650281F343@CA1EXCLV06.adcorp.kla-tencor.com>
Message-ID: <20090922180212.GF15187@infiscale.org>

Which kernel are you running in that VNFS?

--
Greg M. Kurtzer
Chief Technology Officer
HPC Systems Architect
Infiscale, Inc. - http://www.infiscale.com

From Eliezer.Rosengaus at kla-tencor.com Tue Sep 22 11:35:46 2009
From: Eliezer.Rosengaus at kla-tencor.com (Rosengaus, Eliezer)
Date: Tue, 22 Sep 2009 11:35:46 -0700
Subject: [Caos] Shutdown problems
In-Reply-To: <20090922180212.GF15187@infiscale.org>
References: <87B5F8D7EA676B4EA235361F2250E6650281F2F1@CA1EXCLV06.adcorp.kla-tencor.com> <571f1a060909221010l799d7c5dk5d49a17f93372dee@mail.gmail.com> <87B5F8D7EA676B4EA235361F2250E6650281F343@CA1EXCLV06.adcorp.kla-tencor.com> <20090922180212.GF15187@infiscale.org>
Message-ID: <87B5F8D7EA676B4EA235361F2250E6650281F38E@CA1EXCLV06.adcorp.kla-tencor.com>

the caos vnfs on the perceus site:

Linux node001 2.6.28.3-1.nsa1 #1 SMP Wed Feb 4 23:08:20 PST 2009 x86_64 x86_64 x86_64 GNU/Linux

Eliezer Rosengaus
Sr. Systems Architect, Image Computer
KLA-Tencor Corp.
5 Technology Dr.
MS 5-1245
Milpitas, CA 95035

(408)875-5196

From gmk at infiscale.org Tue Sep 22 12:10:13 2009
From: gmk at infiscale.org (Greg Kurtzer)
Date: Tue, 22 Sep 2009 12:10:13 -0700
Subject: [Caos] Shutdown problems
In-Reply-To: <87B5F8D7EA676B4EA235361F2250E6650281F38E@CA1EXCLV06.adcorp.kla-tencor.com>
References: <20090922180212.GF15187@infiscale.org> <87B5F8D7EA676B4EA235361F2250E6650281F38E@CA1EXCLV06.adcorp.kla-tencor.com>
Message-ID: <20090922191013.GG15187@infiscale.org>

There is an updated kernel on the mirrors. Give that a try and let us know.

--
Greg M. Kurtzer
Chief Technology Officer
HPC Systems Architect
Infiscale, Inc. - http://www.infiscale.com

From Eliezer.Rosengaus at kla-tencor.com Tue Sep 22 15:00:14 2009
From: Eliezer.Rosengaus at kla-tencor.com (Rosengaus, Eliezer)
Date: Tue, 22 Sep 2009 15:00:14 -0700
Subject: [Caos] Shutdown problems
In-Reply-To: <20090922191013.GG15187@infiscale.org>
References: <20090922180212.GF15187@infiscale.org> <87B5F8D7EA676B4EA235361F2250E6650281F38E@CA1EXCLV06.adcorp.kla-tencor.com> <20090922191013.GG15187@infiscale.org>
Message-ID: <87B5F8D7EA676B4EA235361F2250E6650281F471@CA1EXCLV06.adcorp.kla-tencor.com>

Sorry for the newbie questions, but I copied the new kernel files to the
vnfs /boot directory. How do I change the boot kernel specification (like
in the grub menu.lst) in perceus?

By the way, the Centos capsule (older kernel 2.6.18...) shuts down fine.

Eliezer Rosengaus
Sr. Systems Architect, Image Computer
KLA-Tencor Corp.
5 Technology Dr.
MS 5-1245
Milpitas, CA 95035

(408)875-5196

-----Original Message-----
From: caos-bounces at lists.infiscale.org
[mailto:caos-bounces at lists.infiscale.org] On Behalf Of Greg Kurtzer
Sent: Tuesday, September 22, 2009 12:10 PM
To: Caos Linux discussion and support forum
Subject: Re: [Caos] Shutdown problems

There is an updated kernel on the mirrors. Give that a try and let us
know.

On Tuesday, 22 September 2009, at 11:35:46 (-0700),
Rosengaus, Eliezer wrote:

> the caos vnfs on the perceus site:
>
> Linux node001 2.6.28.3-1.nsa1 #1 SMP Wed Feb 4 23:08:20 PST 2009 x86_64 x86_64 x86_64 GNU/Linux
>
> Eliezer Rosengaus
> Sr. Systems Architect, Image Computer
> KLA-Tencor Corp.
> 5 Technology Dr.
> MS 5-1245
> Milpitas, CA 95035
>
> (408)875-5196
>
> -----Original Message-----
> From: caos-bounces at lists.infiscale.org
> [mailto:caos-bounces at lists.infiscale.org] On Behalf Of Greg Kurtzer
> Sent: Tuesday, September 22, 2009 11:02 AM
> To: Caos Linux discussion and support forum
> Subject: Re: [Caos] Shutdown problems
>
> Which kernel are you running in that VNFS?
>
> On Tuesday, 22 September 2009, at 10:30:40 (-0700),
> Rosengaus, Eliezer wrote:
>
> > Disabling non-boot CPUs
> > ...
> > CPU 1 is now offline
> >
> > SYSTEM FREEZES
> >
> > Eliezer Rosengaus
> > Sr. Systems Architect, Image Computer
> > KLA-Tencor Corp.
> > 5 Technology Dr.
> > MS 5-1245
> > Milpitas, CA 95035
> >
> > (408)875-5196
> >
> > -----Original Message-----
> > From: caos-bounces at lists.infiscale.org
> > [mailto:caos-bounces at lists.infiscale.org] On Behalf Of Greg Kurtzer
> > Sent: Tuesday, September 22, 2009 10:11 AM
> > To: Caos Linux discussion and support forum
> > Subject: Re: [Caos] Shutdown problems
> >
> > Hi
> >
> > What are the last things printed to the screen?
> >
> > On Tue, Sep 22, 2009 at 8:54 AM, Rosengaus, Eliezer wrote:
> > > I have a cluster of boxes running caos linux (provisioned by perceus).
> > > Each box is a dual-socket motherboard (Supermicro X8DTN). When I try
> > > to shut down a node by issuing "shutdown -f -h now", the shutdown
> > > proceeds down to a point where one of the CPUs is shut down, but the
> > > system then freezes and never completes the powerdown sequence. Has
> > > anyone seen this before or have any idea of how to fix it?
> > >
> > > Eliezer Rosengaus
> > >
> > > _______________________________________________
> > > Caos mailing list
> > > Caos at lists.infiscale.org
> > > http://lists.infiscale.org/mailman/listinfo/caos
> > >
> > --
> > Greg M. Kurtzer
> > Chief Technology Officer
> > HPC Systems Architect
> > Infiscale, Inc. - http://www.infiscale.com
> > _______________________________________________
> > Caos mailing list
> > Caos at lists.infiscale.org
> > http://lists.infiscale.org/mailman/listinfo/caos
> > _______________________________________________
> > Caos mailing list
> > Caos at lists.infiscale.org
> > http://lists.infiscale.org/mailman/listinfo/caos
>
> --
> Greg M. Kurtzer
> Chief Technology Officer
> HPC Systems Architect
> Infiscale, Inc. - http://www.infiscale.com
> _______________________________________________
> Caos mailing list
> Caos at lists.infiscale.org
> http://lists.infiscale.org/mailman/listinfo/caos
> _______________________________________________
> Caos mailing list
> Caos at lists.infiscale.org
> http://lists.infiscale.org/mailman/listinfo/caos

--
Greg M. Kurtzer
Chief Technology Officer
HPC Systems Architect
Infiscale, Inc.
- http://www.infiscale.com
_______________________________________________
Caos mailing list
Caos at lists.infiscale.org
http://lists.infiscale.org/mailman/listinfo/caos

From gmk at infiscale.org Tue Sep 22 16:38:27 2009
From: gmk at infiscale.org (Greg Kurtzer)
Date: Tue, 22 Sep 2009 16:38:27 -0700
Subject: [Caos] Shutdown problems
In-Reply-To: <87B5F8D7EA676B4EA235361F2250E6650281F471@CA1EXCLV06.adcorp.kla-tencor.com>
References: <20090922191013.GG15187@infiscale.org>
	<87B5F8D7EA676B4EA235361F2250E6650281F471@CA1EXCLV06.adcorp.kla-tencor.com>
Message-ID: <20090922233827.GH15187@infiscale.org>

Give this a try:

# perceus vnfs mount [vnfs name]
# smart -o rpm-root=/mnt/[vnfs name] upgrade caos-kernel
# perceus vnfs umount [vnfs name]
# vi /etc/perceus/vnfs/[vnfs name]/config

Define the kernel you wish to boot.

Reboot the node.

Greg

On Tuesday, 22 September 2009, at 15:00:14 (-0700),
Rosengaus, Eliezer wrote:

> Sorry for the newbie questions, but I copied the new kernel files to the
> vnfs /boot directory. How do I change the boot kernel specification
> (like in the grub menu.lst) in perceus?
> By the way, the Centos capsule (older kernel 2.6.18...) shuts down fine.
>
> Eliezer Rosengaus
> Sr. Systems Architect, Image Computer
> KLA-Tencor Corp.
> 5 Technology Dr.
> MS 5-1245
> Milpitas, CA 95035
>
> (408)875-5196
>
> -----Original Message-----
> From: caos-bounces at lists.infiscale.org
> [mailto:caos-bounces at lists.infiscale.org] On Behalf Of Greg Kurtzer
> Sent: Tuesday, September 22, 2009 12:10 PM
> To: Caos Linux discussion and support forum
> Subject: Re: [Caos] Shutdown problems
>
> There is an updated kernel on the mirrors. Give that a try and let us know.
>
> On Tuesday, 22 September 2009, at 11:35:46 (-0700),
> Rosengaus, Eliezer wrote:
>
> > the caos vnfs on the perceus site:
> >
> > Linux node001 2.6.28.3-1.nsa1 #1 SMP Wed Feb 4 23:08:20 PST 2009 x86_64 x86_64 x86_64 GNU/Linux
> >
> > Eliezer Rosengaus
> > Sr.
Systems Architect, Image Computer
> > KLA-Tencor Corp.
> > 5 Technology Dr.
> > MS 5-1245
> > Milpitas, CA 95035
> >
> > (408)875-5196
> >
> > -----Original Message-----
> > From: caos-bounces at lists.infiscale.org
> > [mailto:caos-bounces at lists.infiscale.org] On Behalf Of Greg Kurtzer
> > Sent: Tuesday, September 22, 2009 11:02 AM
> > To: Caos Linux discussion and support forum
> > Subject: Re: [Caos] Shutdown problems
> >
> > Which kernel are you running in that VNFS?
> >
> > On Tuesday, 22 September 2009, at 10:30:40 (-0700),
> > Rosengaus, Eliezer wrote:
> >
> > > Disabling non-boot CPUs
> > > ...
> > > CPU 1 is now offline
> > >
> > > SYSTEM FREEZES
> > >
> > > Eliezer Rosengaus
> > > Sr. Systems Architect, Image Computer
> > > KLA-Tencor Corp.
> > > 5 Technology Dr.
> > > MS 5-1245
> > > Milpitas, CA 95035
> > >
> > > (408)875-5196
> > >
> > > -----Original Message-----
> > > From: caos-bounces at lists.infiscale.org
> > > [mailto:caos-bounces at lists.infiscale.org] On Behalf Of Greg Kurtzer
> > > Sent: Tuesday, September 22, 2009 10:11 AM
> > > To: Caos Linux discussion and support forum
> > > Subject: Re: [Caos] Shutdown problems
> > >
> > > Hi
> > >
> > > What are the last things printed to the screen?
> > >
> > > On Tue, Sep 22, 2009 at 8:54 AM, Rosengaus, Eliezer wrote:
> > > > I have a cluster of boxes running caos linux (provisioned by
> > > > perceus). Each box is a dual-socket motherboard (Supermicro X8DTN).
> > > > When I try to shut down a node by issuing "shutdown -f -h now", the
> > > > shutdown proceeds down to a point where one of the CPUs is shut down,
> > > > but the system then freezes and never completes the powerdown
> > > > sequence. Has anyone seen this before or have any idea of how to fix
> > > > it?
> > > >
> > > > Eliezer Rosengaus
> > > >
> > > > _______________________________________________
> > > > Caos mailing list
> > > > Caos at lists.infiscale.org
> > > > http://lists.infiscale.org/mailman/listinfo/caos
> > > >
> > > --
> > > Greg M. Kurtzer
> > > Chief Technology Officer
> > > HPC Systems Architect
> > > Infiscale, Inc. - http://www.infiscale.com
> > > _______________________________________________
> > > Caos mailing list
> > > Caos at lists.infiscale.org
> > > http://lists.infiscale.org/mailman/listinfo/caos
> > > _______________________________________________
> > > Caos mailing list
> > > Caos at lists.infiscale.org
> > > http://lists.infiscale.org/mailman/listinfo/caos
> >
> > --
> > Greg M. Kurtzer
> > Chief Technology Officer
> > HPC Systems Architect
> > Infiscale, Inc. - http://www.infiscale.com
> > _______________________________________________
> > Caos mailing list
> > Caos at lists.infiscale.org
> > http://lists.infiscale.org/mailman/listinfo/caos
> > _______________________________________________
> > Caos mailing list
> > Caos at lists.infiscale.org
> > http://lists.infiscale.org/mailman/listinfo/caos
>
> --
> Greg M. Kurtzer
> Chief Technology Officer
> HPC Systems Architect
> Infiscale, Inc. - http://www.infiscale.com
> _______________________________________________
> Caos mailing list
> Caos at lists.infiscale.org
> http://lists.infiscale.org/mailman/listinfo/caos
> _______________________________________________
> Caos mailing list
> Caos at lists.infiscale.org
> http://lists.infiscale.org/mailman/listinfo/caos

--
Greg M. Kurtzer
Chief Technology Officer
HPC Systems Architect
Infiscale, Inc. - http://www.infiscale.com
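
[Archive note] The kernel-upgrade steps in Greg's 16:38 reply above use "[vnfs name]" as a placeholder. As a rough sketch only, the helper below prints those same four commands with a concrete name substituted. The function name print_vnfs_kernel_upgrade is hypothetical and not part of Perceus; it only echoes the commands and runs nothing. The setting inside the config file that selects the kernel is not named in the thread, so it is left out here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Perceus): print the commands from the
# reply above with a concrete VNFS name filled in. Nothing is executed.
print_vnfs_kernel_upgrade() {
    vnfs_name="$1"
    if [ -z "$vnfs_name" ]; then
        echo "usage: print_vnfs_kernel_upgrade VNFS_NAME" >&2
        return 1
    fi
    # Mount the VNFS image (the reply installs into /mnt/[vnfs name]).
    echo "perceus vnfs mount $vnfs_name"
    # Upgrade the kernel package inside the mounted image.
    echo "smart -o rpm-root=/mnt/$vnfs_name upgrade caos-kernel"
    # Unmount the image again.
    echo "perceus vnfs umount $vnfs_name"
    # Edit the VNFS config to define the kernel to boot, then reboot the node.
    echo "vi /etc/perceus/vnfs/$vnfs_name/config"
}
```

Calling it as "print_vnfs_kernel_upgrade my-vnfs" (my-vnfs is a made-up name) lists the commands to review before running them by hand as root.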