mt-daapd crash (NSLU 6.8)

Viewing 11 posts - 1 through 11 (of 11 total)
  • #1799
    hk242
    Participant

    hello all,

    i have some problems running the package “mt-daapd_svn-1673-1_armeb.ipk” on an unslung NSLU (V2.3R63-uNSLUng-6.8-beta).

    after several minutes (sometimes 1 minute, sometimes 1 hour) the process crashes, logging:

    <41>Oct 8 16:57:27 klogd: Unable to handle kernel paging request at virtual address 7265736b
    <41>Oct 8 16:57:27 klogd: mm = c000bd20 pgd = c04e4000
    <41>Oct 8 16:57:27 klogd: *pgd = 00000000, *pmd = 00000000
    <44>Oct 8 16:57:27 klogd: Internal error: Oops: 0
    <44>Oct 8 16:57:27 klogd: CPU: 0
    <44>Oct 8 16:57:27 klogd: pc : [] lr : [] Tainted: P
    <44>Oct 8 16:57:27 klogd: sp : c1f03ecc ip : c026e0e8 fp : 00001000
    <44>Oct 8 16:57:27 klogd: r10: c12737a0 r9 : 00000000 r8 : c02f641c
    <44>Oct 8 16:57:27 klogd: r7 : 000003c1 r6 : c000d514 r5 : c12737a0 r4 : 72657363
    <44>Oct 8 16:57:27 klogd: r3 : 6c5f666f r2 : ffffffff r1 : 0000000d r0 : c02f0000
    <44>Oct 8 16:57:27 klogd: Flags: nzCv IRQs on FIQs on Mode SVC_32 Segment user
    <25>Oct 8 16:57:27 mt-daapd[470]: Rendezvous socket closed (daap server crashed?) Aborting.
    <44>Oct 8 16:57:27 klogd: Control: 39FF Table: 004E4000 DAC: 00000015
    <44>Oct 8 16:57:27 klogd: Process mt-daapd (pid: 493, stack limit = 0xc1f02368)
    <44>Oct 8 16:57:27 klogd: Stack: (0xc1f03ecc to 0xc1f04000)
    <44>Oct 8 16:57:27 klogd: 3ec0: 000003b0 c12737a0 00000011 0000052b 00000020
    <44>Oct 8 16:57:27 klogd: 3ee0: 0000001f c0062a04 c02f635c c023ca50 c12737a0 00000000 c000d514 00000391
    <44>Oct 8 16:57:27 klogd: 3f00: c0062cb4 00000001 c000d460 c0063238 c1f03f4c c12737c0 c12737c0 c12737a0
    <44>Oct 8 16:57:27 klogd: 3f20: ffffffea 00001000 00001000 bf5fe208 00000000 c12737a0 c006334c c12737a0
    <44>Oct 8 16:57:27 klogd: 3f40: c000d460 a0000013 c12737c0 00000000 00001000 bf5fe208 00000000 c12737c0
    <44>Oct 8 16:57:27 klogd: 3f60: c12737a0 ffffffea 00391000 00001000 c1f02000 bf5fe208 00000000 c00b5ed4
    <44>Oct 8 16:57:27 klogd: 3f80: c0070908 0000000b bf5fe208 0000000c bf5fe208 00001000 00000003 c00466e4
    <44>Oct 8 16:57:27 klogd: 3fa0: bf5fe208 c0046520 0000000c c0046388 0000000c bf5fe208 00001000 00000000
    <44>Oct 8 16:57:27 klogd: 3fc0: 0000000c bf5fe208 00001000 bf5fe200 bf5fe208 bf5fe204 bf5fe208 bf5fe208
    <44>Oct 8 16:57:27 klogd: 3fe0: 400be424 bf5fe1ac 400ad9cc 402d7a14 60000010 0000000c 00000000 00000000
    <44>Oct 8 16:57:27 klogd: Backtrace: invalid frame pointer
    <44>Oct 8 16:57:27 klogd: Code: ea000000 e5944010 e3540000 0a000007 (e5943008)

    any ideas what went wrong here?

    thx,
    hk

    #12844
    fizze
    Participant

    Well, either your slug ran out of memory, or you just had some memory corruption which led to the crash.

    Do you have a particularly large library, and what apart from mt-daapd is your slug running?

    It's a long shot, but I think it might be possible that it just ran out of memory.
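
    One way to probe the memory-corruption theory is to look at the register values in the oops as text. The following is only a sketch in Python (the helper names are made up for illustration, not taken from any tool). It assumes the words are read most significant byte first, which fits the big-endian armeb build but is not confirmed by the log itself. It also decodes the instruction word shown in parentheses on the "Code:" line, using the standard ARM single-data-transfer layout.

```python
# Values copied from the oops in the first post.
FAULT_ADDR = 0x7265736B   # "virtual address 7265736b"
R4 = 0x72657363
R3 = 0x6C5F666F
FAULT_INSN = 0xE5943008   # the word in parentheses on the "Code:" line


def as_ascii(word: int) -> str:
    """Render a 32-bit word as four characters, most significant byte first."""
    raw = word.to_bytes(4, "big")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


def decode_ldr_imm(insn: int) -> dict:
    """Extract fields from an ARM load/store word with an immediate offset."""
    # Bits 27..26 must be 01; bit 25 set would mean a register offset instead.
    if (insn >> 26) & 0b11 != 0b01 or (insn >> 25) & 1:
        raise ValueError("not a load/store with immediate offset")
    offset = insn & 0xFFF
    if not (insn >> 23) & 1:      # U bit clear: offset is subtracted
        offset = -offset
    return {
        "load": bool((insn >> 20) & 1),   # L bit: 1 = LDR, 0 = STR
        "base": (insn >> 16) & 0xF,       # Rn
        "dest": (insn >> 12) & 0xF,       # Rd
        "offset": offset,
    }


if __name__ == "__main__":
    for name, value in (("fault", FAULT_ADDR), ("r4", R4), ("r3", R3)):
        print(f"{name}: {value:08x} -> {as_ascii(value)!r}")
    print(decode_ldr_imm(FAULT_INSN))
```

    Under those assumptions the fault address, r4 and r3 all come out as printable characters, and the faulting instruction is a load from r4 plus 8, which is exactly the fault address. That would be consistent with the kernel following a pointer that had been overwritten with string data, though it does not say what did the overwriting.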

    #12845
    hk242
    Participant

    there should be some swap space left…

    # free
              total     used     free   shared  buffers
    Mem:      30524    29804      720        0     9672
    Swap:    127852     2152   125700
    Total:   158376    31956   126420

    apart from bash, openssh and mt-daapd, it’s a plain system:

    # ipkg list_installed
    alac-decoder - 0.1.0-2 - A decoder for the apple lossless file format
    bash - 3.2.17-1 - A bourne style shell
    cpio - 2.5-r3 -
    findutils - 4.1.20-r2 -
    flac - 1.1.4-1 - FLAC is a free lossless audio codec. This package contains the codec libraries and the command-line tools flac and metaflac.
    gdbm - 1.8.3-2 - GNU dbm is a set of database routines that use extensible hashing. It works similar to the standard UNIX dbm routines.
    ipkg - 0.99.154-r2 -
    ivorbis-tools - 1.0-6 - Tools to allow you to play, encode, and manage Ogg Vorbis files. This version is hacked to use the Tremor integer decoder.
    kernel - 2.4.22.l2.3r63-r10 -
    kernel-image-2.4.22-xfs - 2.4.22.l2.3r63-r10 -
    libao - 0.8.8-1 - Cross Platform Audio Library.
    libc6-unslung - 2.2.5-r5 -
    libcurl - 7.17.0-2 - Curl is a command line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FI
    libgcc - 3.4.4-r3 -
    libid3tag - 0.15.1b-1 - The library used for ID3 tag reading
    libipkg - 0.99.154-r2 -
    libogg - 1.1.3-3 - Ogg is a multimedia container format.
    libvorbis - 1.1.2-5 - Ogg Vorbis compressed audio format.
    libvorbisidec - cvs-20050221-2 - libvorbisidec is the integer-only ogg decoder library, AKA Tremor
    mt-daapd - svn-1673-1 - A multi-threaded DAAP server for Linux and other POSIX type systems. Allows a Linux box to share audio files with iTunes users
    ncurses - 5.6-1 - NCurses libraries
    nslu2-linksys-libs - 2.3r63-r2 -
    openssh - 4.7p1-1 - a FREE version of the SSH protocol suite of network connectivity tools.
    openssl - 0.9.7m-3 - Openssl provides the ssl implementation in libraries libcrypto and libssl, and is needed by many other applications and librari
    readline - 5.2-2 - The GNU Readline library provides a set of functions for use by applications that allow users to edit command lines as they are
    slingbox - 1.00-r8 -
    sqlite - 3.4.1-1 - SQLite is a small C library that implements a self-contained, embeddable, zero-configuration SQL database engine.
    sqlite2 - 2.8.16-1 - SQLite is a small C library that implements a self-contained, embeddable, zero-configuration SQL database engine.
    unslung-rootfs - 2.3r63-r11 -
    update-modules - 1.0-r4 -
    zlib - 1.2.3-2 - zlib is a library implementing the 'deflate' compression system.

    could it be
    – a hardware problem (memory ?)
    – a firmware problem (flashing again ?)
    – a software problem (maybe try an older version ?)

    if it just ran out of memory – how to solve this ?

    cheers
    hk

    #12846
    rpedde
    Participant

    @hk242 wrote:

    Oct 8 16:57:27 klogd: Unable to handle kernel paging request at virtual address 7265736b
    Oct 8 16:57:27 klogd: mm = c000bd20 pgd = c04e4000
    Oct 8 16:57:27 klogd: *pgd = 00000000, *pmd = 00000000
    Oct 8 16:57:27 klogd: Internal error: Oops: 0

    Fundamentally it’s a kernel or a module bug, as a userspace program shouldn’t be able to oops the kernel.

    That said, it’s obvious that mt-daapd is doing something to provoke the oops. Can you watch memory while it’s running, to see if it is exhausting memory?

    You can also try stepping back to 1586, just to see if that’s stable.

    If it isn’t, then it’s almost certainly kernel/module or hardware.

    — Ron
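
    For watching memory over time, a small parser for the `free` output makes the numbers easier to log and compare. This is a rough sketch in Python, assuming the busybox-style layout shown in this thread (a header row, then "Mem:", "Swap:" and "Total:" rows, presumably in kB). The function names are hypothetical, and `snapshot()` assumes a Python 3 interpreter is available on the box, which may not be the case on a stock slug. The parser also works on output pasted from an ssh session.

```python
import subprocess


def parse_free(text: str) -> dict:
    """Parse `free` output into {row name: {column name: value}}."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header = rows[0]                      # e.g. total used free shared buffers
    table = {}
    for fields in rows[1:]:
        name = fields[0].rstrip(":")      # "Mem:" -> "Mem"
        # Swap and Total rows have fewer columns; zip stops at the shorter one.
        table[name] = {col: int(val) for col, val in zip(header, fields[1:])}
    return table


def snapshot() -> dict:
    """Run `free` once and parse it (assumes `free` is on the PATH)."""
    out = subprocess.run(["free"], capture_output=True, text=True, check=True)
    return parse_free(out.stdout)


SAMPLE = """
          total     used     free   shared  buffers
Mem:      30524    29676      848        0     5820
Swap:    127852     1668   126184
Total:   158376    31344   127032
"""

if __name__ == "__main__":
    print(parse_free(SAMPLE)["Mem"])
```

    Calling `snapshot()` every few seconds and logging the "Mem" and "Swap" rows would show whether free memory or swap is actually running out before the crash.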

    #12847
    hk242
    Participant

    i installed version 1586, still no changes.
    so i’ve tried to watch memory:

    # /opt/etc/init.d/S60mt-daapd
    # free
              total     used     free   shared  buffers
    Mem:      30524    29676      848        0     5820
    Swap:    127852     1668   126184
    Total:   158376    31344   127032

    from this point on, the free mem decreases approx. 4/sec, reaching:

    # free
              total     used     free   shared  buffers
    Mem:      30524    29788      736        0     5924
    Swap:    127852     1668   126184
    Total:   158376    31456   126920

    at this point, the process crashes.. after that, memory returns to a higher value…

    # free
              total     used     free   shared  buffers
    Mem:      30524    28784     1740        0     5928
    Swap:    127852     1668   126184
    Total:   158376    30452   127924

    does that make any sense ?

    cheers,
    hk

    #12848
    rpedde
    Participant

    @hk242 wrote:

    does that make any sense ?

    No, not really. 🙂

    It’s not pushing the machine out of memory… is the file system on a flash drive? Are you swapping to flash? I’m wondering about either bad memory or physically bad swap device.
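
    To put numbers on that, the three posted snapshots can be compared directly. A minimal sketch in Python, with the figures copied from the `free` output above and labels invented for illustration:

```python
# Figures copied from the three `free` snapshots, in the units `free` printed.
SNAPSHOTS = {
    "after start":       {"used": 29676, "free": 848,  "buffers": 5820, "swap_used": 1668},
    "just before crash": {"used": 29788, "free": 736,  "buffers": 5924, "swap_used": 1668},
    "after crash":       {"used": 28784, "free": 1740, "buffers": 5928, "swap_used": 1668},
}


def delta(before: dict, after: dict) -> dict:
    """Per-column change between two snapshots."""
    return {key: after[key] - before[key] for key in before}


if __name__ == "__main__":
    names = list(SNAPSHOTS)
    for first, second in zip(names, names[1:]):
        print(f"{first} -> {second}: {delta(SNAPSHOTS[first], SNAPSHOTS[second])}")
```

    Between start and crash, free memory drops by only 112 while buffers grow by 104, and swap usage does not move at all. After the crash about 1004 is released again. So most of the visible decrease is buffer cache rather than the process growing, and over 120000 of swap stays unused throughout.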

    #12849
    fizze
    Participant

    Well Ron, even if the flash were “bad”, data integrity is still ensured when reading. Only writing becomes impossible then. Of course a really bad fs-driver could still make it worse 😉

    This is highly peculiar. Has the slug been running fine otherwise, or did you just receive it?

    Oh, and you don’t use any NTFS-formatted drives, btw? 🙄

    #12850
    hk242
    Participant

    Disk 2 is a Corsair Flash Voyager, 1 GB.
    Disk 1 is a FAT32-formatted HDD.

    The file system (unslung) and the swap space are on Disk 2.

    The slug is about 2-3 years old, rarely used. The initial flashing caused some problems (simple programs died, e.g. fdisk), so I flashed it again and everything seemed to be ok.

    #12851
    fizze
    Participant

    Fair enough. This might be unrelated, but are you using a .ext3flash file as proposed in http://www.nslu2-linux.org/wiki/Unslung/Ext3flash ?

    Well, either your stick is badly failing again, or some of the slug’s internal memory is failing. I’ve never seen such weird bugs on a rather stable piece of software.

    Did you try to do some kind of memory benchmarks, or actually test the RAM of the slug? http://www.nslu2-linux.org/wiki/HowTo/TestSDRAMinRedBoot seems cumbersome, but it could let you rule out RAM as the culprit.

    #12852
    rpedde
    Participant

    @fizze wrote:

    Well Ron, even if the flash were “bad”, data integrity is still ensured when reading. Only writing becomes impossible then. Of course a really bad fs-driver could still make it worse 😉

    This is highly peculiar. Has the slug been running fine otherwise, or did you just receive it?

    Oh, and you don’t use any NTFS-formatted drives, btw? 🙄

    I’m not a kernel guru or anything, but it looks like the error comes when the kernel is trying to evict in-memory pages to swap. So it looks like a problem on write. I’m going with bad flash as a diagnosis.

    — Ron

    #12853
    hk242
    Participant

    @fizze wrote:

    Fair enough. This might be unrelated, but are you using a .ext3flash file as proposed in http://www.nslu2-linux.org/wiki/Unslung/Ext3flash ?

    No, not yet. So far I’ve just tried to get the slug running. Extending the life of the flash drive is the second step 😉

    @fizze wrote:

    Well, either your stick is badly failing again, or some of the slug’s internal memory is failing. I’ve never seen such weird bugs on a rather stable piece of software.

    Did you try to do some kind of memory benchmarks, or actually test the RAM of the slug? http://www.nslu2-linux.org/wiki/HowTo/TestSDRAMinRedBoot seems cumbersome, but it could let you rule out RAM as the culprit.

    No benchmarks so far. I followed your link and did a memory test – the checksum produced by the slug was identical to the one created by my suse-linux machine – the memory seems to be ok.

    @rpedde wrote:

    I’m not a kernel guru or anything, but it looks like the error comes when the kernel is trying to evict in-memory pages to swap. So it looks like a problem on write. I’m going with bad flash as a diagnosis.

    — Ron

    What I tried next was disabling the swap space (“/sbin/swapoff /dev/sda3”) – still crashing after a while…

    😥
    kh242

  • The forum ‘Setup Issues’ is closed to new topics and replies.