Ticket #190 (closed bug: fixed)

Opened 4 months ago

Last modified 3 months ago

DNS queries to locally defined aliases fail from node itself

Reported by: luckman212
Owned by: dfl-owner
Priority: major
Milestone: ng-beta
Component: dashboard
Keywords:
Cc:
Network name: cloudpound

Description

I've updated my om2p node to r357. It still seems the OM2P node itself ignores or has trouble with local DNS server lookups. e.g. I have the following defined in the local DNS on my LAN router (192.168.20.1, also runs dnsmasq):

address=/cloudprv.com/192.168.20.248

Yet, when I ssh into the OM2P, here's what I see:

# cat /tmp/resolv.conf && /tmp/resolv.conf.auto search lan nameserver 127.0.0.1 nameserver 192.168.20.1 search lan

# nslookup cloudprv.com Server: 127.0.0.1 Address 1: 127.0.0.1 localhost

nslookup: can't resolve 'cloudprv.com': Name or service not known

# nslookup cloudprv.com 192.168.20.1 Server: 192.168.20.1 Address 1: 192.168.20.1 r1.lan

Name: cloudprv.com Address 1: 192.168.20.248 pbx.lan

So it resolves correctly when I specifically force nslookup to query the local DNS. And it appears to have the right stuff in resolv.conf/resolv.conf.auto. Anything I can do to make this work as desired here?

Change History

  Changed 4 months ago by luckman212

That formatting got totally messed up. Should have looked like this:

# cat /tmp/resolv.conf && cat /tmp/resolv.conf.auto
search lan
nameserver 127.0.0.1
nameserver 192.168.20.1
search lan

# nslookup cloudprv.com
Server:    127.0.0.1
Address 1: 127.0.0.1 localhost

nslookup: can't resolve 'cloudprv.com': Name or service not known

# nslookup cloudprv.com 192.168.20.1 
Server:    192.168.20.1
Address 1: 192.168.20.1 r1.lan

Name:      cloudprv.com
Address 1: 192.168.20.248 pbx.lan

P.S. (unrelated) - what happened to http://dev.cloudtrax.com/downloads/testing/firmware-ng/ ?

  Changed 4 months ago by marek

Ok, I checked your node and I have no idea why it is not working. It seems to always query the same DNS server. What happens if you change the name to something else ? Preferably random ? Maybe there is a name collision somewhere ?

  Changed 4 months ago by luckman212

Yeah it's weird. I don't think there is any name collision happening. I tested some other queries as well e.g.

root@ap1:~# nslookup x703.lan
Server:    127.0.0.1
Address 1: 127.0.0.1 localhost

nslookup: can't resolve 'x703.lan': Name or service not known
root@ap1:~# nslookup x703.lan 192.168.20.1
Server:    192.168.20.1
Address 1: 192.168.20.1 r1.lan

Name:      x703.lan
Address 1: 192.168.20.227 x703.lan

So it gives the same kind of results. When you say it always queries the same name server, do you mean it is querying the external nameserver or the DNS running on my LAN router (dnsmasq)?

  Changed 4 months ago by luckman212

A little more to add: I was just in there again with SSH, trying to figure out what might be going on. I noticed tcpdump was available (I think you installed it?), so I checked. Here's a screenshot showing the nslookup where the upstream router (192.168.20.1) actually does return the correct A record, but for some reason nslookup still reports a "can't resolve" error. Screenshot:

 http://i.imgur.com/KF5ts.png

  Changed 4 months ago by marek

Yes, I installed tcpdump and did the same as you did: Dumping the DNS queries. This left me with the impression that the DNS server is correct but something else is wrong. Maybe it is one of the settings ?

For testing purposes I disabled the dnsmasq rebinding protection (--stop-dns-rebind) by running:

uci set dhcp.conf.rebind_protection=0
uci commit

Can you try it now ?

  Changed 4 months ago by luckman212

Thank you for checking it out Marek. Disabling the rebind attack protection seems to have fixed it! I was mucking around in /etc/config/dhcp yesterday -- I also modified these values:

  domainneeded = 0
  boguspriv = 0
  option 'local' ''
  option 'authoritative' '0'

I guess I can leave those but not sure if any of those settings were actually needed, or if there are 'uci set' commands I could put in my custom.sh to set them without hand-editing.

  Changed 4 months ago by luckman212

Okay, I found the relevant UCI commands. I did some further testing, and it turns out that 'dhcp.conf.local' *is* crucial to the puzzle. If I don't delete that value (it defaults to '/lan/'), then the local DNS lookups start failing again. Here is a complete set of UCI commands that resolves the issue. Not sure if this creates any other problems that I am unaware of, but it's working for me:

uci set dhcp.conf.domainneeded=0
uci set dhcp.conf.boguspriv=0
uci set dhcp.conf.rebind_protection=0
uci set dhcp.conf.authoritative=0
uci delete dhcp.conf.local
uci commit
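The commands above can be wrapped so that they are easy to review before they touch a node. This is only a sketch: the function names dns_fixup_cmds and dns_fixup_apply are hypothetical, the section name "conf" is taken from this ticket (other firmware may name the dnsmasq section differently), and the 'authoritative' line is left out because it is a DHCP setting rather than a DNS one.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; the "conf" section name
# comes from this ticket and may differ on other firmware.

# Print the uci commands reported to work in this ticket, one per line.
dns_fixup_cmds() {
    cat <<'EOF'
uci set dhcp.conf.domainneeded=0
uci set dhcp.conf.boguspriv=0
uci set dhcp.conf.rebind_protection=0
uci delete dhcp.conf.local
uci commit dhcp
EOF
}

# Run each command if uci is installed; otherwise only print what would run.
dns_fixup_apply() {
    dns_fixup_cmds | while read -r cmd; do
        if command -v uci >/dev/null 2>&1; then
            # "uci delete" fails when the option is already gone; keep going.
            $cmd || echo "failed: $cmd" >&2
        else
            echo "dry-run: $cmd"
        fi
    done
}
```

Calling dns_fixup_apply on a machine without uci only prints the commands. On the node itself, dnsmasq probably needs a restart (for example /etc/init.d/dnsmasq restart) before the new settings take effect.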

  Changed 4 months ago by marek

There are uci commands for all options and files in /etc/config/. Generally, the syntax is:

uci set $file_in_etc_config.$section_name.$option=value

If you are interested you can read its documentation:  http://wiki.openwrt.org/doc/uci
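As a small illustration of that syntax, the sketch below composes a key from its three parts and reads an option with a fallback. The helper names uci_key and uci_get_or are hypothetical; 'uci -q get' is the standard quiet form of 'uci get'. The key dhcp.conf.rebind_protection is the one used in this ticket.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Compose a uci key from its three parts: file, section, option.
uci_key() {
    printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# Read an option; print the fallback if it is unset or uci is not installed.
uci_get_or() {
    key="$1"
    fallback="$2"
    uci -q get "$key" 2>/dev/null || echo "$fallback"
}
```

For example, uci_get_or "$(uci_key dhcp conf rebind_protection)" unset shows the current value before anything is changed, and 'uci show dhcp' lists every option in /etc/config/dhcp.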

If I were you I'd be more cautious about changing configuration parameters unless you know exactly what you are doing. For example, the 'authoritative' option is not related to DNS at all. It also makes it harder for us to help you because you might introduce bugs that we are not aware of. I don't go through the entire config to check what you changed. Next time you report something, you should at least add information about your changes to the ticket.

Note that we won't change the default configuration of the rebind protection because it makes sense for most people.

  Changed 4 months ago by marek

  • status changed from new to closed
  • resolution set to fixed

  Changed 4 months ago by luckman212

Thanks again Marek. I do try to be fairly careful about setting config options, but this is just my home network so it's not too important. I have some experience with dnsmasq although I am certainly no expert. I found a link in the dnsmasq FAQ about the dhcp authoritative option which pointed to  http://www.isc.org/files/auth.html

After reading through that, I removed the dhcp authoritative directive from my custom.sh -- thanks for the tip on that.

I certainly understand the decision to not make these options default, although I suspect this problem could affect anyone who wants to put a custom.sh file on a webserver that is on the same subnet as the open-mesh gateway nodes.

  Changed 4 months ago by luckman212

  • status changed from closed to reopened
  • resolution fixed deleted

Sorry, but something just occurred to me so I am re-opening this ticket. So let's say that the node receives an automatic/OTA upgrade. I guess after flashing it will reboot and no longer be able to download the locally-hosted custom.sh because of the DNS issue (kind of a catch-22). Am I wrong in this assumption?

  Changed 4 months ago by marek

You are right. That will be a problem.

By the way, you are the first person I heard of that runs a webserver in his LAN.

  Changed 4 months ago by luckman212

Haha really? I don't run a heavy duty production webserver here or anything, but I have a small Intel Atom server that runs CentOS and has apache, Asterisk, a syslog daemon that I use to capture log output from a few different devices, a small FTP server etc. It's just convenient (and more secure) to have these things running locally as I have no need for anyone to access them from outside the LAN.

In general I would think it is not that uncommon to have a device that can function as a mini webserver on one's LAN these days?

  Changed 4 months ago by marek

Maybe it is not uncommon to have a webserver but you are the first to run into this. Combining it with the custom.sh seems to be uncommon.

By the way, if you wish to keep your settings during an OTA I suggest you check ticket #187. The next build is going to support that.

  Changed 4 months ago by luckman212

The support for /etc/sysupgrade.conf sounds good & I think that will be a solution. I looked on the OpenWrt website for some docs on how to use it, seems you just make a simple text file and place each file you want 'preserved' one per line? So in that case would I try to preserve the custom.sh script (I don't think this is saved anywhere on the filesystem) or '/etc/config/dhcp' ? Still not quite sure how I would make use of this feature.

  Changed 4 months ago by marek

Yes, one entry per line for each file you want to have restored or the name of a folder if you wish to keep an entire folder (plus its content).

I suggest creating a file that is called at boot time and sets the things you want to configure. Keeping '/etc/config/dhcp' will get you into trouble once we want to change something in that file or OpenWrt.
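A minimal sketch of that suggestion follows, under several assumptions: /etc/dns_fixup.sh is a hypothetical path (chosen under /etc because /tmp is RAM-backed on OpenWrt and does not survive a reboot), /etc/rc.local is assumed to be run at the end of boot as on stock OpenWrt, and the uci commands are the ones reported to work earlier in this ticket. The function install_fixup is hypothetical as well.

```shell
#!/bin/sh
# Sketch only. /etc/dns_fixup.sh is a hypothetical path; the "conf" section
# name comes from this ticket and may differ on other firmware.

# install_fixup ROOT: write the fix-up script under ROOT and register it.
# Pass "" as ROOT on the node itself, or a scratch directory to try it safely.
install_fixup() {
    root="$1"
    fixup="/etc/dns_fixup.sh"
    rc="$root/etc/rc.local"
    keep="$root/etc/sysupgrade.conf"

    mkdir -p "$root/etc"

    # The commands reported to work earlier in this ticket.
    cat > "$root$fixup" <<'EOF'
#!/bin/sh
uci set dhcp.conf.domainneeded=0
uci set dhcp.conf.boguspriv=0
uci set dhcp.conf.rebind_protection=0
uci delete dhcp.conf.local
uci commit dhcp
/etc/init.d/dnsmasq restart
EOF
    chmod +x "$root$fixup"

    # Call the script from rc.local, keeping any trailing "exit 0" last.
    touch "$rc"
    if ! grep -qxF "$fixup" "$rc"; then
        grep -vx 'exit 0' "$rc" > "$rc.new" || true
        printf '%s\nexit 0\n' "$fixup" >> "$rc.new"
        cat "$rc.new" > "$rc"
        rm -f "$rc.new"
    fi

    # One path per line: files to preserve across a sysupgrade.
    touch "$keep"
    for path in "$fixup" /etc/rc.local; do
        grep -qxF "$path" "$keep" || echo "$path" >> "$keep"
    done
}
```

Running install_fixup against a scratch directory first (for example install_fixup /tmp/scratch) shows the three files it would create without touching the real /etc. Whether this firmware honours rc.local and sysupgrade.conf in the same way as stock OpenWrt is an assumption that would need checking on the node.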

  Changed 4 months ago by luckman212

Sorry to keep asking questions but: how would I make sure the file gets called at boot time? To be sure I understand what you're proposing, the 'flow' of this upgrade process would be something like:

1. create a file for my 'special' commands e.g. /tmp/dns_fixup.sh
2. the contents of this file could be:

uci set dhcp.conf.domainneeded=0
uci set dhcp.conf.boguspriv=0
uci set dhcp.conf.rebind_protection=0
uci delete dhcp.conf.local
uci commit

3. make that file executable
4. somehow add that file to the initrc (how?)
5. create /etc/sysupgrade.conf and its contents should be:

/tmp/dns_fixup.sh

6. now if I want to propagate such a custom config to >1 node then I would need to somehow "wrap" those commands up in the custom.sh file? I think there's a catch-22 logic loop here because if e.g. I add a node to the mesh there is NO way that node is going to be able to fetch the custom.sh "bootstrap" until I manually ssh into it and disable the rebind protection.

So perhaps this is all way too complicated, I should just host the custom.sh on an "external" webserver. Or, is it possible to specify an IP address via the dashboard? e.g.  http://192.168.20.248/ ? that would sort of solve this problem.

  Changed 4 months ago by marek

You can use an IP address - there should be no problem with that.

  Changed 4 months ago by marek

Does using the IP address work ?

  Changed 4 months ago by luckman212

Sorry Marek, I forgot to try it. I'll give it a try towards the end of the day

  Changed 3 months ago by marek

Any update here ?

  Changed 3 months ago by luckman212

IP address does (did) work - so did dhcp.conf.rebind_protection=0. But as I said in the other ticket I haven't been testing much past few weeks as I don't have the OM2P hooked up anymore.

  Changed 3 months ago by marek

So, can we close the ticket then or shall we wait for your additional testing ?

  Changed 3 months ago by luckman212

Go ahead and close the ticket if you like. Can I ask one more question- sorry I know it's not related but it's not a bug & so didn't want to open another ticket.

I've got a list of 6 networks that are "stuck" on old versions of the firmware - some are 299p, others are 300p, and one is r330. I've enabled the "NG-firmware" checkbox for these networks but they don't update for some reason. All the other nets have successfully upgraded to 376.

Is there a way I can ssh into these and force an upgrade? in case it's important, the networks are:

Yassky_CT [ng-r299p]
Yassky_NY [ng-r299p]
Yassky_office [ng-r300p]
magnolia_Bleecker [ng-r300p]
magnolia_Chicago [ng-r300p]
simondev [ng-r330]

All nodes are OM1P's with the exception of 'simondev' which is a single MR500. thanks

  Changed 3 months ago by marek

Replying to luckman212:

Is there a way I can ssh into these and force an upgrade? in case it's important, the networks are:

Yassky_CT [ng-r299p]

Has upgraded.

Yassky_NY [ng-r299p]

Did not have "test firmware" checked - I corrected that.

Yassky_office [ng-r300p]

Has upgraded.

magnolia_Bleecker [ng-r300p]

Has upgraded.

magnolia_Chicago [ng-r300p]

Has upgraded.

simondev [ng-r330]

Has upgraded.

  Changed 3 months ago by luckman212

  • status changed from reopened to closed
  • resolution set to fixed

Marek, thank you so much. All nets are now at r376. So far so good. Going to close this ticket up.
