17th March 2009 at 7:02 am #2814
I’ve been trying to upgrade from a fedora 8 to a fedora 10 server to host mt-daapd-svn-1696.
I’ve been running mt-daapd-svn-1696 for a long time on an f8 server with no problems other than I had to put my gig E card into promiscuous mode to handle the multicast traffic (port 5353 etc).
After upgrade to fedora 10, recompiling with the same options as previously, my M1001 soundbridge no longer finds the mt-daapd server. I’ve compiled using both the following options
./configure --prefix=/usr --sysconfdir=/ --enable-sqlite3 --libdir=/usr/lib --enable-mdns --enable-oggvorbis
./configure --prefix=/usr --sysconfdir=/ --enable-sqlite3 --libdir=/usr/lib --enable-avahi --enable-oggvorbis
I’ve tried two separate fedora 10 systems – same result. M1001 doesn’t see the server when using gigabit ethernet.
What’s got me stumped is the following…
I’ve tried two different Gig E NICs on both machines. Neither gig E NIC works.
I’ve tried two different 10/100 NICs on both machines ***AND BOTH OF THESE WORK***.
I’m out of ideas – but am hoping this will ring a bell for somebody out there. Any ideas you can offer??
Thanks
19th March 2009 at 3:18 am #18247
Have you tried the basic stuff? Like can you ping your M1001? And I assume you have it hard wired. Do you have other 10/100 devices on the network and can you talk to them? Do you have other gigabit devices on the network? Should the M1001 be able to see the internet, and does it? I have fedora 10 on my laptop and it has gigabit ethernet, and it works with firefly. But nothing else on the network is gigabit, so maybe that’s why it works.
rsl
19th March 2009 at 4:22 am #18248
Thanks rsl… Your questions are good and cover off all the obvious things to check…
Have you tried the basic stuff? Like can you ping your M1001?
Yes I can ping it and can telnet into ports 4444 and 5555.
And I assume you have it hard wired.
Do you have other 10/100 devices on the network and can you talk to them?
Yes – the network is working fine. All devices are contactable and working fine.
Do you have other gigabit devices on the network?
Yes – many – it’s sad. And they’re all communicating perfectly, at the 300-400 Mbit/s they normally sustain, with the fedora 10 server that’s hosting firefly.
Should the M1001 be able to see the internet, and does it?
It can definitely see more than the internet. Just for grins – and to perhaps be informative for anybody else that has trouble getting their M1001 working in the future, a wireshark trace taken on the server, filtering everything other than traffic to/from the M1001, shows the following: (Scenario is that M1001 is turned off, I start the wireshark trace, then turn on the M1001)…
No. Time Source Destination Protocol Info
1 0.000000 0.0.0.0 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x55443322
2 0.000002 RokuLlc_04:45:2a Broadcast ARP Who has 169.254.50.91? Tell 0.0.0.0
3 0.001310 0.0.0.0 255.255.255.255 DHCP DHCP Request - Transaction ID 0x55443322
4 0.130036 RokuLlc_04:45:2a Broadcast ARP Who has 172.16.1.1? Tell 172.16.1.16
5 0.130066 AsustekC_5a:b1:8f RokuLlc_04:45:2a ARP 172.16.1.1 is at 00:18:f3:5a:b1:8f
6 0.130257 172.16.1.16 172.16.1.1 NTP NTP client
7 0.130475 172.16.1.1 172.16.1.16 NTP NTP server
8 0.131075 172.16.1.16 172.16.1.1 NTP NTP client
9 0.131184 172.16.1.1 172.16.1.16 NTP NTP server
10 0.131794 172.16.1.16 172.16.1.1 NTP NTP client
11 0.131887 172.16.1.1 172.16.1.16 NTP NTP server
12 0.155875 172.16.1.16 172.16.1.1 DNS Standard query A www.radioroku.com
13 0.156391 172.16.1.1 172.16.1.16 DNS Standard query response CNAME radioroku.com A 188.8.131.52
14 0.182405 172.16.1.16 172.16.1.1 DNS Standard query A soundbridgeupgrade.rokulabs.com
15 0.182891 172.16.1.1 172.16.1.16 DNS Standard query response A 184.108.40.206
16 0.352190 172.16.1.16 220.127.116.11 TCP 8978 > http [SYN] Seq=0 Win=8192 Len=0
17 0.352320 18.104.22.168 172.16.1.16 TCP http > 8978 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460
18 0.352622 172.16.1.16 22.214.171.124 TCP 8978 > http [ACK] Seq=1 Ack=1 Win=8192 Len=0
19 0.353200 172.16.1.16 126.96.36.199 TCP [TCP segment of a reassembled PDU]
20 0.353268 188.8.131.52 172.16.1.16 TCP http > 8978 [ACK] Seq=1 Ack=229 Win=6432 Len=0
21 0.353729 172.16.1.16 184.108.40.206 HTTP POST /services/url_performance.php
22 0.353817 220.127.116.11 172.16.1.16 TCP http > 8978 [ACK] Seq=1 Ack=712 Win=7504 Len=0
23 0.379974 172.16.1.16 18.104.22.168 TCP 8979 > http [SYN] Seq=0 Win=8192 Len=0
24 0.380108 22.214.171.124 172.16.1.16 TCP http > 8979 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460
25 0.380413 172.16.1.16 126.96.36.199 TCP 8979 > http [ACK] Seq=1 Ack=1 Win=8192 Len=0
26 0.381845 172.16.1.16 188.8.131.52 HTTP GET /cgi-bin/debug4
27 0.381959 184.108.40.206 172.16.1.16 TCP http > 8979 [ACK] Seq=1 Ack=1461 Win=8760 Len=0
28 0.381848 172.16.1.16 220.127.116.11 HTTP Continuation or non-HTTP traffic
29 0.382004 18.104.22.168 172.16.1.16 TCP http > 8979 [ACK] Seq=1 Ack=1698 Win=11680 Len=0
30 0.395446 172.16.1.16 224.0.0.251 MDNS Standard query PTR _daap._tcp.local, "QU" question PTR _daap._udp.local, "QU" question PTR _rsp._tcp.local, "QU" question PTR _rsp._udp.local, "QU" question ANY SoundBridge._telnet._tcp.local, "QU" question ANY SoundBridge._roku-rcp._tcp.local, "QU" question ANY SoundBridge.local, "QU" question ANY SoundBridge._http._tcp.local, "QU" question
31 0.645222 172.16.1.16 224.0.0.251 MDNS Standard query ANY SoundBridge._telnet._tcp.local, "QM" question ANY SoundBridge._roku-rcp._tcp.local, "QM" question ANY SoundBridge.local, "QM" question ANY SoundBridge._http._tcp.local, "QM" question
...And the chit-chat continues for a while
You can see the DHCP service give the M1001 an address of 172.16.1.16. The M1001 then contacts the NTP service on the server (at 172.16.1.1) to get the time and date, ‘phones home’ to soundbridgeupgrade.rokulabs.com to check for upgrades and to http://www.radioroku.com to tell them what I listen to (not really), and finally multicasts to 224.0.0.251 on UDP port 5353 (the mDNS port) looking for DAAP and RSP servers.
When it’s working (with a 10/100 card plugged into my one and only PCI slot), my firefly server at 172.16.1.1 responds immediately after the M1001’s multicast to 224.0.0.251 with a list of supported services. With either of my Gig E cards (one is an “r8169” based card and the other is based on the “Sundance Technology IPG Triple-Speed Ethernet” chipset), there’s no response from the firefly server.
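One way to check the advertisement without the M1001 in the loop is to browse for the DAAP service from another Linux machine on the same segment. This is only a sketch: it assumes the avahi-browse tool (from the avahi-tools package) is installed and avahi-daemon is running on that machine, and check_daap_advert is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: reports whether any _daap._tcp service is currently
# visible via mDNS. Assumes avahi-browse and a running avahi-daemon.
check_daap_advert() {
    if ! command -v avahi-browse >/dev/null 2>&1; then
        echo "avahi-browse not found"
        return 2
    fi
    # -t: exit after dumping the current list, -r: resolve each service,
    # -p: parsable output, where resolved entries start with '='.
    if avahi-browse -t -r -p _daap._tcp 2>/dev/null | grep -q '^='; then
        echo "advertised"
    else
        echo "not advertised"
        return 1
    fi
}
```

If this reports "advertised" while the M1001 still finds nothing, the problem is more likely on the path to the M1001 than in the server's mDNS responder.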
I have fedora 10 on my laptop and it has gigabit ethernet, and it works with firefly. But nothing else on the network is gigabit, so maybe that’s why it works.
I have some really old Dell D600 laptops with onboard Gig E and I checked one of these. It seems to work OK, i.e. when I compile and run a firefly server on it, the M1001 sees it immediately and streams music OK. I’ve not had it running for long, just long enough to start a song and see that it works. Also, the same gig E NICs used to work fine with F8 and svn-1696. I was running this version of firefly on F8 without a power-down for 18 months or more, and earlier versions on F7, F6, F5, F4 ….
Since the last posting, I’ve somehow configured the ‘broken’ GigE server so that the M1001 sees the firefly server *sometimes* but not all the time. If it sees it, it *sometimes* can’t connect (with the connection failed message on the M1001’s screen). If it connects, it is able to start streaming music, but it stops some time later with the message that it can’t find any music libraries. I’ve been trying desperately to capture the offending behaviour in wireshark so that I can compare the packet trace using a 10/100 card with the packet trace using a gig E card and find out what’s missing, but I can’t reproduce the behaviour reliably. Like I say, it *sometimes* seems to work, but not consistently. And I’ve been changing too many things to be able to say confidently what it was that got the M1001 to start seeing the firefly server in the first place.
This is seeming horribly like driver / networking issues with the latest fedora kernel or the ethernet driver code. Or possibly it’s an issue with the (now old) svn-1696 code with the (newer) fedora 10 kernel and drivers.
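For comparing the two cases, it may help to record each one to its own capture file with the same filter and then open both in wireshark side by side. A rough sketch, assuming tcpdump is installed, the M1001 keeps the address 172.16.1.16 from the trace above, and the interface and file names are placeholders; capture_m1001 and DRY_RUN are names invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: capture traffic to/from the M1001 plus all mDNS
# traffic (UDP port 5353) into a pcap file. Needs root when actually run.
# With DRY_RUN set, the tcpdump command is printed instead of executed.
capture_m1001() {
    iface=$1
    out=$2
    host=${3:-172.16.1.16}      # M1001 address seen in the trace above
    run=${DRY_RUN:+echo}
    $run tcpdump -n -i "$iface" -w "$out" "host $host or udp port 5353"
}

# Example (as root), once per NIC, stopping each run with Ctrl-C:
#   capture_m1001 eth1 gige.pcap
#   capture_m1001 eth0 fast-ethernet.pcap
```

The "udp port 5353" clause is there so that the server's own multicast answers are captured even though they are not addressed to the M1001 directly.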
I shall be expending more effort because I’m damned if I’m going to give up and return to F8!!!
Any other ideas??
Thanks.
19th March 2009 at 8:19 am #18249
Now this is an interesting issue! 😉
Well, maybe some kernel behaviour was changed, that affects timing and whatnot.
I’d look into the kernel changes between F8 and F10. Especially regarding network stuff, and probably those network drivers.
The fact that you get the same behaviour with two different NICs indicates that the problem might not be with a device driver, but rather somewhere in the stack.
I could think of timing issues, or whatnot.
To drop a famous quote here: We’re still puzzled, but on a much higher level! 😉
19th March 2009 at 4:37 pm #18250
What do you have between the m1001 and the firefly server? Switch? Hub? Some sort of home router?
rsl
19th March 2009 at 10:13 pm #18251
I have two ethernet switches between the firefly server and the M1001 – a gig E switch and a 10/100 switch. I have zero routers between them. (Why two switches??? Because I have about 14 wired ethernet devices in my house and I need more than one eight port switch).
Anyway – I got it working finally YIPEE – but not sure exactly what I did to make it work. Too many things changing…
For what it’s worth, here’s what I’ve ended up with and how I did it more or less…
1) Because I am ‘lazy’ and don’t want to run upstairs every time I restart my mt-daapd server, I discovered some time back that you can telnet to port 5555 of the M1001 (i.e. telnet “IP of M1001” 5555) and issue a Reboot command. It now seems to me that this is NOT THE SAME as powering down the M1001 and doing a hard reboot. I think that it’s necessary to power cycle the M1001 after restarting the mt-daapd server, or it seems to remain confused by some previous data, but I am not 100% sure about this. Without power cycling the M1001, a ‘working’ firefly build won’t work properly! My guess is that if the M1001 was previously connected to a broken build, then it stays broken even if you change to a working build, until you power cycle it.
2) I was flipping between compiling mt-daapd using the --enable-mdns flag and the --enable-avahi flag. In the end, I have a working server with a build using --enable-avahi. This doesn’t necessarily mean that the built-in mdns server doesn’t work, but maybe it doesn’t.
It slowly dawned upon me that, with avahi (which I have not used before), if you restart the mt-daapd server, you also need to restart the avahi-daemon server. In other words…
service mt-daapd stop
service avahi-daemon restart
service mt-daapd start
...Then power cycle the M1001
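The three commands above can be wrapped up so the order is never mixed up. This is just a convenience sketch around the same service commands; restart_firefly and the DRY_RUN variable are names made up for the example.

```shell
#!/bin/sh
# Hypothetical wrapper: stop mt-daapd, restart avahi-daemon, then start
# mt-daapd again, aborting at the first step that fails.
# With DRY_RUN set, the commands are printed instead of executed.
restart_firefly() {
    run=${DRY_RUN:+echo}
    $run service mt-daapd stop &&
    $run service avahi-daemon restart &&
    $run service mt-daapd start
}

# Example (as root):
#   restart_firefly
# Then power cycle the M1001 by hand.
```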
3) At some stage in all the rigmarole, I realised I was having trouble with the avahi-daemon and rendezvous/bonjour because the M1001 was always seeing TWO instances of the firefly server. When I told the M1001 to connect to what it listed as the first firefly server, the mt-daapd server or the avahi-daemon would seem to lock up.
I found that the problem was that I had erroneously followed instructions in different postings on various discussion groups saying you need to define a daap service in the file /etc/avahi/services/daap.service. THIS IS DEFINITELY WRONG.
It seems that if you compile mt-daapd with the --enable-avahi flag, you SHOULD NOT create an /etc/avahi/services/daap.service file. When I removed this file and restarted everything as above, the M1001 correctly found one instance of the firefly server and connected to it without locking everything up.
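A quick way to look for a leftover static definition like this is to search the avahi services directory for any file that mentions the DAAP service type. A minimal sketch, assuming the default /etc/avahi/services location; find_static_daap_services is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: list static avahi service files that declare
# _daap._tcp. Prints nothing if there are none.
find_static_daap_services() {
    dir=${1:-/etc/avahi/services}
    [ -d "$dir" ] || return 0
    # -l: print matching file names only, -F: treat the pattern as plain text
    grep -lF '_daap._tcp' "$dir"/*.service 2>/dev/null
}

# Example:
#   find_static_daap_services
```

Any file it prints is a candidate for the duplicate entry the M1001 was showing when mt-daapd is built with --enable-avahi and registers itself.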
4) Try as I may, I never got svn-1696 to work. At some stage, I looked to see if I could find a ‘newer’ build and discovered that the project moved to sourceforge some time ago. I downloaded mt-daapd-0.2.4.2.tar.gz. This is the version of mt-daapd I now have working with Fedora 10.
Interesting because the /etc/mt-daapd.conf file seems to be an *older* format than the version required by svn-1696 (ie I remember having to make a change in format a couple of years ago to the format used by svn-1696 and its immediate predecessors).
And it seems that the M1001 works much better with 0.2.4.2 with regard to browsing composers / classical music than it ever did with svn-1696 (it was actually broken in 1696, at least for me). I’m not sure whether this is because I have a *large* (>10k tracks) classical collection or what. I also noticed that 0.2.4.2 is much faster (four times faster?) than svn-1696 in scanning the library.
Anyway, 0.2.4.2 works for me on Fedora 10 and svn-1696 does not, even though 1696 worked on Fedora 8.
5) I still have an ‘ifconfig eth1 promisc’ command in my /etc/rc.d/rc.local script. I added ‘ifconfig eth1 allmulti’ for good measure. This is because my IPG-based gigE card at eth1 doesn’t work with multicast otherwise. I required the promisc flag to make mt-daapd work even under Fedora 8.
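To confirm that these flags actually stuck after boot, the interface flags word can be decoded. The sketch below assumes a Linux system that exposes the flags in /sys/class/net/eth1/flags as a hex number, and uses the standard Linux values IFF_PROMISC = 0x100 and IFF_ALLMULTI = 0x200; the function names are made up for the example.

```shell
#!/bin/sh
# Hypothetical helpers for decoding a Linux interface flags word.

# flag_set FLAGS MASK: succeeds if any bit of MASK is set in FLAGS.
flag_set() {
    [ $(( $1 & $2 )) -ne 0 ]
}

# report_multicast_flags FLAGS: print the promisc and allmulti state.
report_multicast_flags() {
    if flag_set "$1" 0x100; then echo "promisc: on"; else echo "promisc: off"; fi
    if flag_set "$1" 0x200; then echo "allmulti: on"; else echo "allmulti: off"; fi
}

# Example on a live system:
#   report_multicast_flags "$(cat /sys/class/net/eth1/flags)"
# iproute2 equivalents of the two ifconfig commands:
#   ip link set dev eth1 promisc on
#   ip link set dev eth1 allmulticast on
```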
6) I can’t remember all of the libraries I had to download to get it to compile because my system is now full of libraries, but here’s a guess at what it was (this may contain more libraries than actually required)…
yum install pcre-devel clamav-devel avahi-devel flac-devel nss-mdns avahi-compat-libdns_sd-devel sqlite-devel libid3tag-devel faad2-devel ffmpeg-devel libogg-devel libvorbis-devel libtheora-devel taglib-devel mysql-embedded-devel libmp4v2-devel
For what it’s worth, I’m using the Fedora 10 development-testing repositories but I have a feeling it will still work with the regular repositories.
7) It probably didn’t make any difference, but I added the --enable-flac flag to my configure command. Here is the configure line I used to build a working server…
./configure --prefix=/usr --sysconfdir=/ --enable-sqlite3 --libdir=/usr/lib --enable-avahi --enable-oggvorbis --enable-flac
phew. It works now. And it works better than the old svn-1696.
Thanks for the ideas and support.
21st March 2009 at 4:56 pm #18252
Well, 0.2.4 is the stable branch, and it is _ancient_.
It doesn’t use sqlite as its database, but rather gdbm, which is why it’s faster. You’ll spot the difference when you browse to the admin interface, for instance.
Anyway, interesting piece of information about avahi, thanks for sharing. 🙂