Hi Terry,
On Wed, Jul 15, 2020 at 11:32:02PM -0000, Terry Bolinger wrote:
We have a product that we are selling that uses the Beaglebone Black
SBC with Debian Jessie as a small network appliance in a residential
environment. Connman is the network manager in this distribution. We
are experiencing a very random issue where occasionally the DNS
servers being returned by a network DHCP server get corrupted.
What does corrupted mean here? Is the DHCP response on the wire already
bad, or does ConnMan parse it wrongly?
All other configurations are fine - IP address, gateway, etc. When
this happens, the unit still has local network connection but loses
the connection to the Internet (as you would expect). This impacts
the operation of the unit in a very negative manner. We have a user
interface that shows the configuration settings and when the Primary
DNS server gets corrupted, the user interface field normally shows the
DNS server as " [ ".
Ah, okay, this answers my question.
The Secondary DNS server is mostly blank when this happens. Manually
changing the Primary DNS Server to a good nameserver always resolves
the issue.
Any chance you could record the DHCP traffic via tcpdump? It's very hard
to say what's going on without the input.
It happens often enough that it's annoying us because of the support
calls we get, but it happens so infrequently that it's very difficult
to reproduce. In fact, I've never seen it on my test units - I've
only seen it on units installed in the field.
Sure, this is annoying. I suspect this problem depends on the DHCP
server used. Any idea what kind of server it could be?
Since it's so hard to reproduce, I'm looking at some bandaid
approaches using some scripts. But, before I did that, I thought I'd
check to see if connman might have some options that I could use to
resolve the issue. One potentially available option I was reading
about was the use of the nodnsproxy option. This one might be
extremely hard to determine if it has resolved the issue or not, as it
is so hard to reproduce but would be simple to implement, if it were a
potential solution. This would assume that DNS proxy feature in
connman was in conflict with the external DNS server, as I was
reading, which might be a real stretch.
One thing we should look into is whether we could prevent resolv.conf
from being updated with bad addresses. Also, adding a warning to the
logs would help, I suppose.
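A minimal sketch of such a check, written in Python for illustration and
not taken from ConnMan (which is C). The names is_valid_nameserver and
filter_nameservers are hypothetical. It assumes nameservers arrive as
strings and rejects anything that does not parse as an IPv4 or IPv6
address, logging a warning for each rejected entry:

```python
import ipaddress
import logging

log = logging.getLogger("dns-check")


def is_valid_nameserver(value):
    """Return True if value parses as a usable IPv4 or IPv6 address."""
    try:
        addr = ipaddress.ip_address(value.strip())
    except ValueError:
        # Covers garbage such as " [ " as well as empty strings.
        return False
    # 0.0.0.0 and :: parse fine but are useless as nameservers.
    return not addr.is_unspecified


def filter_nameservers(candidates):
    """Keep only valid addresses; warn about every entry that is dropped."""
    good = []
    for entry in candidates:
        if is_valid_nameserver(entry):
            good.append(entry.strip())
        else:
            log.warning("ignoring invalid nameserver %r", entry)
    return good
```

A band-aid script on the appliance could run the same check against the
nameservers it reads back before trusting them.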
A second option I was looking at was the "FallbackNameservers" option.
However, I don't understand how this works. Does connman do a test to
verify Internet connection or proper DNS operation, and if it fails
that test, it "falls back" to a secondary DNS server and then if that
also fails, it falls back to the "FallbackNameservers"? If this is how
this works, this is a possibility to explore - I would try setting the
FallBack servers to available DNS servers such as the Google
nameserver at 8.8.8.8.
The FallbackNameservers are there for the case where we get a valid DHCP
response without DNS entries. In that case they should be used. But as
ConnMan adds the '[' (which is decimal 91) as a nameserver, it might
prevent ConnMan from using the FallbackNameservers.
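To illustrate the suspicion, here is a small Python sketch of the
selection logic as described above. It is an assumption about the
behaviour, not ConnMan's actual code, and both function names are
hypothetical. The first variant falls back only when the DHCP list is
empty, so a single garbage entry blocks the fallback. The second drops
unparseable entries first:

```python
import ipaddress


def select_naive(dhcp_dns, fallback):
    """Fall back only when DHCP supplied no DNS entries at all."""
    return list(dhcp_dns) if dhcp_dns else list(fallback)


def select_checked(dhcp_dns, fallback):
    """Discard entries that are not IP addresses, then fall back if empty."""
    valid = []
    for entry in dhcp_dns:
        try:
            ipaddress.ip_address(entry.strip())
        except ValueError:
            continue  # e.g. "[" is not an address, so skip it
        valid.append(entry.strip())
    return valid if valid else list(fallback)


# A corrupted DHCP result as seen in the field, plus a fallback server.
print(select_naive(["["], ["8.8.8.8"]))    # the garbage entry wins
print(select_checked(["["], ["8.8.8.8"]))  # the fallback is used
```

With the naive variant the '[' entry counts as "DHCP provided a
nameserver", which would match the symptom of losing name resolution
even though fallback servers are configured.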
By the way, the connman version on my Debian distribution is V1.32. I
see that connman is up to at least 1.37. Is it possible that
upgrading connman might resolve this issue?
I would appreciate it if you could update to the latest version of
ConnMan. 1.32 is 4 years old and we have had plenty of fixes (including
DHCP parsing) since then. I am not sure whether the Debian package got
any of those fixes backported.
I apologize - I'm comfortable with simple networking but I'm by no
means a networking expert and I'm really hoping for a sanctioned
option to resolve this issue.
All good. Thanks for your report. This kind of feedback is highly
appreciated. Sorry for the long delay in answering.
Thanks,
Daniel