AbcLinuxu portal, 11 May 2025 09:49

Question: Server freezes after several hours.

23.11.2007 15:21 Lemmy
Hi, I have a problem with a server. It is an IBM Netvista with a Pentium 4 2.40GHz processor, 256 MB RAM and an IBM/Hitachi Deskstar GXP-180 40 GB hard disk. The PC previously served as a workstation with WinXP.

I installed CentOS 5 on the server. It is meant to serve as a backup firewall, so at the moment practically nothing is running on it.

The problem is that after several hours of operation the server starts to freeze. Some commands take an absurdly long time to run (ls). Other commands show no output (top), and the server cannot be restarted by any command (shutdown -r now, reboot, init 6).

I found nothing suspicious in the system log. The only odd thing is the process listing after running shutdown -r now:
ps -A
.
 3659 ?        00:00:00 udevd
 3660 ?        00:00:00 udevd
 3672 ?        00:00:00 udev_run_devd <defunct>
 3676 ?        00:00:00 udev_run_hotplu <defunct>
.
When I restart the server by hand, everything works as it should again for a while after the restart.

I am attaching the list of running processes:
  PID TTY          TIME CMD
    1 ?        00:00:00 init
    2 ?        00:00:00 migration/0
    3 ?        00:00:00 ksoftirqd/0
    4 ?        00:00:00 watchdog/0
    5 ?        00:00:00 events/0
    6 ?        00:00:00 khelper
    7 ?        00:00:00 kthread
   10 ?        00:00:00 kblockd/0
   11 ?        00:00:00 kacpid
   94 ?        00:00:00 cqueue/0
   97 ?        00:00:00 khubd
   99 ?        00:00:00 kseriod
  158 ?        00:00:00 pdflush
  159 ?        00:00:00 pdflush
  160 ?        00:00:00 kswapd0
  161 ?        00:00:00 aio/0
  317 ?        00:00:00 kpsmoused
  334 ?        00:00:00 kjournald
  361 ?        00:00:00 kauditd
  395 ?        00:00:00 udevd
 1258 ?        00:00:00 kmirrord
 1904 ?        00:00:00 auditd
 1906 ?        00:00:00 python
 1924 ?        00:00:00 syslogd
 1927 ?        00:00:00 klogd
 1978 ?        00:00:00 portmap
 2003 ?        00:00:00 rpc.statd
 2044 ?        00:00:00 rpc.idmapd
 2071 ?        00:00:00 dbus-daemon
 2117 ?        00:00:00 pcscd
 2151 ?        00:00:00 hidd
 2170 ?        00:00:00 automount
 2193 ?        00:00:00 acpid
 2232 ?        00:00:00 sshd
 2256 ?        00:00:00 gpm
 2271 ?        00:00:00 crond
 2300 ?        00:00:00 atd
 2315 ?        00:00:13 yum-updatesd
 2331 ?        00:00:00 avahi-daemon
 2332 ?        00:00:00 avahi-daemon
 2347 ?        00:00:00 hald
 2348 ?        00:00:00 hald-runner
 2354 ?        00:00:00 hald-addon-acpi
 2359 ?        00:00:00 hald-addon-keyb
 2371 ?        00:00:00 hald-addon-stor
 2398 ?        00:00:00 smartd
 3400 ?        00:00:00 sshd
 3402 pts/0    00:00:00 bash
 3429 pts/0    00:00:01 mc
 3431 pts/1    00:00:00 bash
 3696 ?        00:00:00 sshd
 3698 ?        00:00:00 bash
 3729 pts/1    00:00:00 ps
Thanks for a nudge in the right direction.

Replies

23.11.2007 15:36 Shadow | score: 25 | blog: Brainstorm
I would guess a hardware problem, and I would start the diagnosis with memtest.
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
23.11.2007 15:54 Lemmy
Unfortunately the server is abroad, so I cannot run memtest from a live CD. Is there a memtest that can be run directly from bash?

The odd thing is that I have the same problem on three servers, and all of them have the same hardware.

Thanks for the reply.
23.11.2007 16:11 svaca | score: 38
If it is not CentOS itself (the kernel and so on), it could also be the power supply, for example some problem with the UPS or something similar... You don't need to try memtest: when a computer has a memory problem, programs generate errors and crash, or the machine reboots on its own, but it does not freeze... If a computer freezes and hardware is the cause, it is usually the CPU, the bus, or add-in cards...
Never give up ! Stay ATARI !
24.11.2007 20:59 Marian Krucina | score: 13
Not directly from bash, but you can install it from within the system, set it as the default entry in GRUB, and reboot. That is where my imagination ends :(

And what about dmesg? Does it report any problems?
24.11.2007 21:08 Luboš Doležel (Doli) | score: 98 | blog: Doliho blog | Kladensko
[I] sys-apps/memtester
     Available versions:  4.0.5 4.0.7
     Installed versions:  4.0.7(17:40:26 26.8.2007)
     Homepage:            http://pyropus.ca/software/memtester/
     Description:         userspace utility for testing the memory subsystem for faults
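
As a rough illustration of how memtester might be invoked from a shell, here is a minimal sketch. It assumes memtester is already installed on the server; the helper name pick_test_mb and the "half of free memory" rule are illustrative choices, not anything from the thread. The script only prints a suggested command so that nothing is locked in memory by accident.

```shell
#!/bin/sh
# Suggest a memtester size in MB: half of the free memory given in kB,
# but never less than 1 MB. The 50% ratio is an arbitrary assumption
# meant to leave headroom for the running system.
pick_test_mb() {
    mb=$(( $1 / 1024 / 2 ))
    if [ "$mb" -lt 1 ]; then mb=1; fi
    echo "$mb"
}

# MemFree from /proc/meminfo is reported in kB.
free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo 2>/dev/null || true)
size_mb=$(pick_test_mb "${free_kb:-0}")

# Print the command instead of running it; memtester needs root to lock
# the memory under test. The trailing 1 is the number of loops.
echo "memtester ${size_mb}M 1"
```

A userspace test like this can only cover memory that the kernel is willing to hand out, so a clean run does not rule out faults in the rest of the RAM.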
25.11.2007 09:05 Lemmy
dmesg:
Linux version 2.6.18-8.1.15.el5 (mockbuild@builder6.centos.org) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #1 SMP Mon Oct 22 08:32:04 EDT 2007
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
 BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000000fcf0000 (usable)
 BIOS-e820: 000000000fcf0000 - 000000000fcfb000 (ACPI data)
 BIOS-e820: 000000000fcfb000 - 000000000fd00000 (ACPI NVS)
 BIOS-e820: 000000000fd00000 - 000000000fe80000 (usable)
 BIOS-e820: 000000000fe80000 - 0000000010000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved)
 BIOS-e820: 00000000fffffc00 - 0000000100000000 (reserved)
0MB HIGHMEM available.
254MB LOWMEM available.
found SMP MP-table at 000f62d0
Using x86 segment limits to approximate NX protection
On node 0 totalpages: 65152
  DMA zone: 4096 pages, LIFO batch:0
  Normal zone: 61056 pages, LIFO batch:15
DMI present.
Using APIC driver default
ACPI: RSDP (v000 PTLTD                                 ) @ 0x000f6360
ACPI: RSDT (v001 PTLTD    RSDT   0x060400d0  LTP 0x00000000) @ 0x0fcf73ae
ACPI: FADT (v001 IBM    NETVISTA 0x060400d0 PTL  0x00000001) @ 0x0fcfaee2
ACPI: TCPA (v001 IBM    NETVISTA 0x060400d0 PTL  0x00000001) @ 0x0fcfaf56
ACPI: MADT (v001 PTLTD           APIC   0x060400d0  LTP 0x00000000) @ 0x0fcfaf88
ACPI: BOOT (v001 PTLTD  $SBFTBL$ 0x060400d0  LTP 0x00000001) @ 0x0fcfafd8
ACPI: DSDT (v001    IBM Yelotail 0x060400d0 MSFT 0x0100000e) @ 0x00000000
ACPI: PM-Timer IO Port: 0x1008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 20000000 (gap: 10000000:eec00000)
Detected 2392.073 MHz processor.
Built 1 zonelists.  Total pages: 65152
Kernel command line: ro root=LABEL=/
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
CPU 0 irqstacks, hard=c0737000 soft=c0717000
PID hash table entries: 1024 (order: 10, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 252380k/260608k available (2043k kernel code, 7600k reserved, 846k data, 232k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 4786.57 BogoMIPS (lpj=2393287)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00000400 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00000400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Hyper-Threading is disabled
CPU: After all inits, caps: bfebf3ff 00000000 00000000 00000080 00000400 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
Freeing SMP alternatives: 16k freed
ACPI: Core revision 20060707
CPU0: Intel(R) Pentium(R) 4 CPU 2.40GHz stepping 07
Total of 1 processors activated (4786.57 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
Brought up 1 CPUs
sizeof(vma)=84 bytes
sizeof(page)=32 bytes
sizeof(inode)=340 bytes
sizeof(dentry)=136 bytes
sizeof(ext3inode)=492 bytes
sizeof(buffer_head)=52 bytes
sizeof(skbuff)=172 bytes
checking if image is initramfs... it is
Freeing initrd memory: 1365k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfd98d, last bus=2
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:00:02.0
* The chipset may have PM-Timer Bug. Due to workarounds for a bug,
* this clock source is slow. If you are sure your timer does not have
* this bug, please use "acpi_pm_good" to disable the workaround
PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 1180-11bf claimed by ICH4 GPIO
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
PCI: Firmware left 0000:02:08.0 e100 interrupts enabled, disabling
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.SLOT._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 *9 10 11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 15 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
PCI: Ignore bogus resource 6 [0:0] of 0000:00:02.0
PCI: Bridge: 0000:00:1e.0
  IO window: 2000-2fff
  MEM window: c0100000-c01fffff
  PREFETCH window: 20000000-200fffff
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 4096)
TCP reno registered
Simple Boot Flag at 0x6c set to 0x1
IBM machine detected. Enabling interrupts during APM calls.
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
apm: overridden by ACPI.
audit: initializing netlink socket (disabled)
audit(1195866433.668:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux:  Registering netfilter hooks
Initializing Cryptographic API
ksign: Installing public key data
Loading keyring
- Added public key 906C6E63F7A1235A
- User ID: CentOS (Kernel Module GPG key)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: Processor [CPU0] (supports 8 throttling states)
ACPI: Thermal Zone [THM0] (53 C)
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected an Intel 845G Chipset.
agpgart: Detected 892K stolen memory.
agpgart: AGP aperture is 128M @ 0x88000000
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:0c: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:0d: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH4: IDE controller at PCI slot 0000:00:1f.1
PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 177
ICH4: chipset revision 1
ICH4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x1860-0x1867, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0x1868-0x186f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
hda: IC35L060AVV207-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: SAMSUNG CD-ROM SC-148C, ATAPI CD/DVD-ROM drive
hdc: Disabling (U)DMA for SAMSUNG CD-ROM SC-148C (blacklisted)
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 512KiB
hda: 78156288 sectors (40016 MB) w/1821KiB Cache, CHS=16383/255/63, UDMA(100)
hda: cache flushes supported
 hda: hda1 hda2 hda3
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: PS/2 Controller [PNP0303:KBC,PNP0f13:PSM] at 0x60,0x64 irq 1,12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI No-Shortcut mode
ACPI: (supports S0<6>Time: tsc clocksource has been installed.
 S1 S3 S4 S5)
Freeing unused kernel memory: 232k freed
Write protecting the kernel read-only data: 383k
input: AT Translated Set 2 keyboard as /class/input/input0
input: PS/2 Generic Mouse as /class/input/input1
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 185
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:1d.0: irq 185, io base 0x00001800
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 193
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.1: irq 193, io base 0x00001820
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 177
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.2: irq 177, io base 0x00001840
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 201
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 4
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: irq 201, io mem 0xc0080000
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 6 ports detected
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
SELinux:  Disabled at runtime.
SELinux:  Unregistering netfilter hooks
audit(1195866439.816:2): selinux=0 auid=4294967295
input: PC Speaker as /class/input/input2
hdc: ATAPI 48X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.20
intel_rng: Firmware space is locked read-only. If you can't or
intel_rng: don't want to disable this in firmware setup, and if
intel_rng: you are certain that your system has a functional
intel_rng: RNG, try using the 'no_fwh_detect' option.
8139cp: 10/100 PCI Ethernet driver v1.2 (Mar 22, 2004)
8139cp 0000:02:09.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip
8139cp 0000:02:09.0: Try the "8139too" driver instead.
8139cp 0000:02:0a.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip
8139cp 0000:02:0a.0: Try the "8139too" driver instead.
e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
ACPI: PCI Interrupt 0000:02:08.0[A] -> GSI 20 (level, low) -> IRQ 209
e100: eth0: e100_probe: addr 0xc0100000, irq 209, MAC addr 00:09:6B:45:55:F4
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE]
8139too Fast Ethernet driver 0.9.27
ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 21 (level, low) -> IRQ 217
eth1: RealTek RTL8139 at 0xd0856000, 00:e0:4c:68:03:4f, IRQ 217
eth1:  Identified 8139 chip type 'RTL-8139C'
ACPI: PCI Interrupt 0000:02:0a.0[A] -> GSI 22 (level, low) -> IRQ 225
eth2: RealTek RTL8139 at 0xd0902400, 00:e0:4c:31:66:90, IRQ 225
eth2:  Identified 8139 chip type 'RTL-8139C'
ACPI: PCI Interrupt 0000:00:1f.3[B] -> GSI 17 (level, low) -> IRQ 233
i810_smbus 0000:00:02.0: i810/i815 i2c device found.
ACPI: PCI Interrupt 0000:00:1f.5[B] -> GSI 17 (level, low) -> IRQ 233
PCI: Setting latency timer of device 0000:00:1f.5 to 64
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
intel8x0_measure_ac97_clock: measured 50871 usecs
intel8x0: clocking to 48000
lp0: using parport0 (interrupt-driven).
lp0: console ready
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ibm_acpi: ec object not found
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
device-mapper: ioctl: 4.11.0-ioctl (2006-09-14) initialised: dm-devel@redhat.com
EXT3 FS on hda2, internal journal
IA-32 Microcode Update Driver: v1.14a tigran@veritas.com
microcode: CPU0 updated from revision 0x27 to 0x37, date = 06042003
eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
eth2: link down
audit(1195866455.146:3): audit_pid=1903 old=0 by auid=4294967295
Bluetooth: Core ver 2.10
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.8
Bluetooth: L2CAP socket layer initialized
Bluetooth: HIDP (Human Interface Emulation) ver 1.1
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
ADDRCONF(NETDEV_UP): eth0: link is not ready
ADDRCONF(NETDEV_UP): eth2: link is not ready
IPv6 over IPv4 tunneling driver
eth1: no IPv6 routers present
Apart from this part:
Boot video device is 0000:00:02.0
* The chipset may have PM-Timer Bug. Due to workarounds for a bug,
* this clock source is slow. If you are sure your timer does not have
* this bug, please use "acpi_pm_good" to disable the workaround
PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO
I don't see anything suspicious.

Thanks for the reply.
25.11.2007 09:23 rudiik | score: 16 | blog: rudiikuv miniblog
Hello!

This won't help much. It would be more useful to photograph the screen output at the moment the machine hangs, if that is possible. The kernel will surely spit out what happened to it there, but unfortunately that cannot be found anywhere else, so either a Linux person has to go there, or a newbie with a camera :-D
KDE 2.0 .. KDE 3.5.10 -> KDE 4.1 .. KDE 4.4.5 -> E17 Alpha/Beta -> Trinity 3.5.12 -> GNOME 2.30 -> KDE 4.6.5
25.11.2007 09:35 Lemmy
That will probably be a problem, because the server is located abroad.
25.11.2007 09:46 Shadow | score: 25 | blog: Brainstorm
"Is there a memtest that can be run directly from bash?"
Yes, there is: memtester.
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
25.11.2007 09:52 Shadow | score: 25 | blog: Brainstorm
Two more things occurred to me. It could be a runaway process that is eating memory or CPU time (or both). When the problems start again, I would check free memory with the free command and look at how active the processes are. top or htop serve well for this.

The second idea is probably complete nonsense, but how about turning off udev and finding out whether the problem happens to be in udev itself?
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
25.11.2007 09:54 Shadow | score: 25 | blog: Brainstorm
Sorry, you said that top shows no output for you, so try this instead:

ps aux
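
When commands such as ls or top hang, one thing worth looking for in the process list is processes stuck in uninterruptible sleep (state D), which usually means they are waiting on I/O. Whether that applies to this server is an assumption, not something the thread confirms. A minimal sketch, with filter_blocked as an illustrative helper name:

```shell
#!/bin/sh
# Print only processes whose state starts with "D" (uninterruptible sleep).
# Reads "PID STAT WCHAN COMMAND" lines on stdin and skips the header line.
filter_blocked() {
    awk 'NR > 1 && $2 ~ /^D/'
}

# wchan is the kernel function the process is sleeping in, when known.
ps -eo pid,stat,wchan:25,comm | filter_blocked
```

An empty result means no process was in state D at that instant; the check is only meaningful while the slowdown is actually happening.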
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
25.11.2007 10:08 Lemmy
Output of ps aux:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2   2032   644 ?        Ss   Nov24   0:00 init [3]
root         2  0.0  0.0      0     0 ?        S    Nov24   0:00 [migration/0]
root         3  0.0  0.0      0     0 ?        SN   Nov24   0:00 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S    Nov24   0:00 [watchdog/0]
root         5  0.0  0.0      0     0 ?        S<   Nov24   0:00 [events/0]
root         6  0.0  0.0      0     0 ?        S<   Nov24   0:00 [khelper]
root         7  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kthread]
root        10  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kblockd/0]
root        11  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kacpid]
root        94  0.0  0.0      0     0 ?        S<   Nov24   0:00 [cqueue/0]
root        97  0.0  0.0      0     0 ?        S<   Nov24   0:00 [khubd]
root        99  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kseriod]
root       158  0.0  0.0      0     0 ?        S    Nov24   0:00 [pdflush]
root       159  0.0  0.0      0     0 ?        S    Nov24   0:00 [pdflush]
root       160  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kswapd0]
root       161  0.0  0.0      0     0 ?        S<   Nov24   0:00 [aio/0]
root       317  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kpsmoused]
root       334  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kjournald]
root       361  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kauditd]
root       395  0.0  0.2   2208   600 ?        S<s  Nov24   0:00 /sbin/udevd -d
root      1257  0.0  0.0      0     0 ?        S<   Nov24   0:00 [kmirrord]
root      1903  0.0  0.2  12044   664 ?        S<sl Nov24   0:00 auditd
root      1905  0.0  1.4   9616  3800 ?        S<s  Nov24   0:00 python /sbin/audispd
root      1923  0.0  0.2   1792   704 ?        Ss   Nov24   0:00 syslogd -m 0
root      1926  0.0  0.1   1644   396 ?        Ss   Nov24   0:00 klogd -x -c 1
rpc       1977  0.0  0.2   1772   548 ?        Ss   Nov24   0:00 portmap
root      2002  0.0  0.3   1884   812 ?        Ss   Nov24   0:00 rpc.statd
root      2043  0.0  0.2   4936   556 ?        Ss   Nov24   0:00 rpc.idmapd
dbus      2070  0.0  0.3   2712   776 ?        Ss   Nov24   0:00 dbus-daemon --system
root      2116  0.0  0.4  12696  1268 ?        Ssl  Nov24   0:00 pcscd
root      2150  0.0  0.1   1876   444 ?        Ss   Nov24   0:00 /usr/bin/hidd --server
root      2169  0.0  0.4   9336  1116 ?        Ssl  Nov24   0:00 automount
root      2192  0.0  0.2   1636   536 ?        Ss   Nov24   0:00 /usr/sbin/acpid
root      2240  0.0  0.3   5176   948 ?        Ss   Nov24   0:00 /usr/sbin/sshd
root      2255  0.0  0.1   1872   364 ?        Ss   Nov24   0:00 gpm -m /dev/input/mice -t exps2
root      2270  0.0  0.4   5220  1108 ?        Ss   Nov24   0:00 crond
root      2299  0.0  0.1   2200   424 ?        Ss   Nov24   0:00 /usr/sbin/atd
root      2314  0.0 12.3  44592 31400 ?        S    Nov24   1:39 /usr/bin/python /usr/sbin/yum-updatesd
avahi     2330  0.0  0.5   2544  1344 ?        Ss   Nov24   0:00 avahi-daemon: running [defaultdde.local]
avahi     2331  0.0  0.1   2544   320 ?        Ss   Nov24   0:00 avahi-daemon: chroot helper
68        2346  0.0  1.4   5324  3564 ?        Ss   Nov24   0:00 hald
root      2347  0.0  0.3   3100   984 ?        S    Nov24   0:00 hald-runner
68        2353  0.0  0.3   1964   804 ?        S    Nov24   0:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
68        2358  0.0  0.3   1968   812 ?        S    Nov24   0:00 hald-addon-keyboard: listening on /dev/input/event0
root      2370  0.0  0.2   1920   620 ?        S    Nov24   0:00 hald-addon-storage: polling /dev/hdc
root      2397  0.0  0.2   1952   516 ?        S    Nov24   0:00 /usr/sbin/smartd -q never
root      5556  0.0  0.9   8184  2480 ?        Ss   08:50   0:00 sshd: root@pts/0
root      5558  0.0  0.5   4492  1436 pts/0    Ss   08:51   0:00 -bash
root      5699  0.0  0.3   4216   944 pts/0    R+   10:01   0:00 ps aux
I can't find anywhere how to turn off udev. Isn't it automount in RHEL?

Thanks for the reply.
25.11.2007 10:42 Shadow | score: 25 | blog: Brainstorm
I don't see any runaway process. It should be possible to stop udev temporarily like this:

/etc/init.d/udev stop

Or also like this:

service udev stop

It should be possible to disable it permanently with chkconfig. But I wouldn't do that for now.
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
25.11.2007 11:42 Lemmy
I don't have any such service:
[root@default_dde ~]# chkconfig udev off
error reading information on service udev: No such file or directory
25.11.2007 12:20 Shadow | score: 25 | blog: Brainstorm
The udev daemon can be stopped by hand. If you have a way to restart the machine remotely even without ssh (just in case), then try:

kill -15 `ps aux | grep '[u]devd' | awk '{print $2}'`
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
25.11.2007 14:20 Lemmy
Unfortunately, the situation did not change even after terminating udevd. The server still behaves the same way and cannot be restarted remotely. There is nothing left but to wait until Monday, when a colleague will power-cycle it by hand. We will see after the restart.

This might possibly help: http://www.fedoraforum.org/forum/printthread.php?t=139648&page=2&pp=80

Especially the last post.

Thanks for the replies.
25.11.2007 15:27 svaca | score: 38
As I said, unless you change some hardware (in my opinion the power supply or the CPU), you probably won't get anywhere...
Never give up ! Stay ATARI !
25.11.2007 19:46 Lemmy
Unfortunately I can't replace the hardware. The servers are abroad. The problem is identical on two servers with the same hardware. Both previously served as workstations with WinXP, where they behaved perfectly normally.

Thanks for the reply.
25.11.2007 20:32 svaca | score: 38
In my experience, behavior on Windows is COMPLETELY different from Linux. An example: when I push an overclocked processor too far and test it with prime95, on Windows the whole PC OFTEN freezes, while on Linux only the application crashes...

And when the problem is in the memory, on Windows the computer resets... On Linux, USUALLY the application crashes again, or kernel modules crash, and in general things around the kernel crash...
Never give up ! Stay ATARI !
25.11.2007 12:22 Shadow | score: 25 | blog: Brainstorm
One more thing. I assume this listing comes from the period when "Some commands take an absurdly long time to run (ls). Other commands show no output (top), and the server cannot be restarted by any command (shutdown -r now, reboot, init 6)". If not, try it again at a moment when these problems are occurring.
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
26.11.2007 09:49 Lemmy
I have to revise the problem description slightly. In the end the server does not freeze. I attribute the earlier freezes to my attempts to restart the system.

The server stays in a state where:
"Some commands take an absurdly long time to run (ls). Other commands show no output (top), and the server cannot be restarted by any command (shutdown -r now, reboot, init 6)"
On one of the servers I managed to run memtester, and it found no errors.

Thanks for your replies.
26.11.2007 10:36 Shadow | score: 25 | blog: Brainstorm
OK, so I recommend the following. When one of the servers gets into this state again, use strace on the problematic commands:

strace top

strace ls

Then you will hopefully learn which operation it is waiting on, or why it shows no output. strace output is not exactly easy to interpret, but it should give you at least some clue.
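
One way to narrow down a long trace is to have strace record the time spent in each system call (the -T option) and then pick out the slowest one. The sketch below assumes the trace was written to a file with -o; the helper name slowest_call and the file name are illustrative, not part of the thread.

```shell
#!/bin/sh
# From strace -T output on stdin, print the line with the largest time
# spent in a single system call (the <seconds> suffix that -T appends).
slowest_call() {
    awk '
        match($0, /<[0-9.]+>$/) {
            t = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (line == "" || t > max) { max = t; line = $0 }
        }
        END { if (line != "") print line }
    '
}

# Example (the file name is arbitrary):
#   strace -f -tt -T -o /tmp/ls.trace ls
#   slowest_call < /tmp/ls.trace
```

If the traced command never returns at all, the call it is blocked in has no timing suffix yet, so the last line of the trace file is usually the more interesting one.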
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
26.11.2007 11:24 Lemmy
So I tried the thing that bothers me most right now:

[root@server ~]# strace reboot
execve("/sbin/reboot", ["reboot"], [/* 20 vars */]) = 0
brk(0)                                  = 0x8f03000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f49000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=20076, ...}) = 0
mmap2(NULL, 20076, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f44000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000?\216"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1576952, ...}) = 0
mmap2(0x8ce000, 1295780, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x110000
mmap2(0x247000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x137) = 0x247000
mmap2(0x24a000, 9636, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x24a000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f43000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7f436c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0x247000, 8192, PROT_READ)     = 0
mprotect(0x8c5000, 4096, PROT_READ)     = 0
munmap(0xb7f44000, 20076)               = 0
geteuid32()                             = 0
chdir("/")                              = 0
access("/var/run/utmpx", F_OK)          = -1 ENOENT (No such file or directory)
open("/var/run/utmp", O_RDWR)           = 3
fcntl64(3, F_GETFD)                     = 0
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
_llseek(3, 0, [0], SEEK_SET)            = 0
brk(0)                                  = 0x8f03000
brk(0x8f24000)                          = 0x8f24000
alarm(0)                                = 0
rt_sigaction(SIGALRM, {0x20ff70, [], 0}, {SIG_DFL}, 8) = 0
alarm(1)                                = 0
fcntl64(3, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}) = 0
read(3, "\10\0\0\0O\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 384) = 384
fcntl64(3, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
alarm(0)                                = 1
rt_sigaction(SIGALRM, {SIG_DFL}, NULL, 8) = 0
alarm(0)                                = 0
rt_sigaction(SIGALRM, {0x20ff70, [], 0}, {SIG_DFL}, 8) = 0
alarm(1)                                = 0
fcntl64(3, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}) = 0
read(3, "\2\0\0\0\0\0\0\0~\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 384) = 384
fcntl64(3, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
alarm(0)                                = 1
rt_sigaction(SIGALRM, {SIG_DFL}, NULL, 8) = 0
alarm(0)                                = 0
rt_sigaction(SIGALRM, {0x20ff70, [], 0}, {SIG_DFL}, 8) = 0
alarm(1)                                = 0
fcntl64(3, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}) = 0
read(3, "\1\0\0\00063\0\0~\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 384) = 384
fcntl64(3, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
alarm(0)                                = 1
rt_sigaction(SIGALRM, {SIG_DFL}, NULL, 8) = 0
open("/var/log/wtmp", O_WRONLY|O_APPEND) = 4
gettimeofday({1196075207, 865073}, NULL) = 0
uname({sys="Linux", node="default_hamme", ...}) = 0
access("/var/log/wtmpx", F_OK)          = -1 ENOENT (No such file or directory)
open("/var/log/wtmp", O_WRONLY)         = 5
alarm(0)                                = 0
rt_sigaction(SIGALRM, {0x20ff70, [], 0}, {SIG_DFL}, 8) = 0
alarm(1)                                = 0
fcntl64(5, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
_llseek(5, 0, [265728], SEEK_END)       = 0
write(5, "\1\0\0\0\0\0\0\0~~\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 384) = 384
fcntl64(5, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
alarm(0)                                = 1
rt_sigaction(SIGALRM, {SIG_DFL}, NULL, 8) = 0
close(5)                                = 0
close(4)                                = 0
sync()                                  = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({2, 0},
{2, 0})               = 0
reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, LINUX_REBOOT_CMD_RESTART
That is as far as it gets.

Thanks for the replies.
26.11.2007 12:12 Shadow | score: 25 | blog: Brainstorm
Re: Server freezes after a few hours.
Judging by this output and by this post, I suspect it could be an ACPI problem. That does not explain the problems with ls or top, though. You could also try compiling your own kernel and carefully tailoring it to your hardware. You could try a vanilla kernel, perhaps even a newer one.

It might be worth first running your motherboard through Google, though, to see whether anyone has had a similar Linux-related problem with it.
If we do not believe in freedom of speech for those we despise we do not believe in it at all.
26.11.2007 12:25 Lemmy
Re: Server freezes after a few hours.
Something strange happened.

After I ran strace a few times on one of the servers with top and reboot as arguments, the server started behaving normally. top started working and I was able to restart the server normally.

I had already suspected ACPI as well, so I have now booted an older kernel with the parameters acpi=off noapic. In addition, I am not letting the asus_acpi module load.

[root@server log]# uname -a
Linux server 2.6.18-8.el5 #1 SMP Thu Mar 15 19:57:35 EDT 2007 i686 i686 i386 GNU/Linux

Another hint at an ACPI problem may be that when I was installing and testing the servers, the monitor kept switching off and on during boot, so it was impossible to follow the boot process itself.

We will see whether the stable state lasts. After a restart the servers always looked fine, but after a few hours the problem described above appeared.

Thanks for the reply.
28.11.2007 10:40 Lemmy
Re: Server freezes after a few hours.
So the problem was in power management.

After adding the parameters acpi=off and noapic to the kernel line in menu.lst and disabling all daemons in any way related to power management, the server has been running fine for the third day.
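
For anyone repeating this fix, a small sketch for checking after reboot that the parameters really reached the kernel. The function name has_boot_param is an assumption made for illustration; /proc/cmdline is the standard place where Linux exposes the boot command line. The menu.lst line in the comment uses a placeholder for the root device, since the actual value was not posted here.

```shell
#!/bin/sh
# In menu.lst the parameters go at the end of the kernel line, roughly:
#   kernel /vmlinuz-2.6.18-8.el5 ro root=<your-root-device> acpi=off noapic
# (<your-root-device> is a placeholder, not a value from this thread.)

# Hypothetical helper: succeed if PARAM appears as a whole word on the
# kernel command line. The command line is read from /proc/cmdline unless
# a second argument is given, which makes it easy to try by hand.
has_boot_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After a reboot, `has_boot_param acpi=off && has_boot_param noapic && echo "parameters active"` would confirm that the edited kernel line was actually the one booted.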

Thanks everyone for the replies.

ISSN 1214-1267, (c) 1999-2007 Stickfish s.r.o.