FX30 Connected, losing name resolution

I have everything working now in my application. I did a 12+ hour run last night, and the script collected the data and pushed it to the cloud on the schedule I set.

I call ‘cm data connect &’ through a Python subprocess call in my startup script. It works as expected: everything comes up and runs well for a while.

A new problem, which also happened occasionally during testing, is that the system shows as connected (‘cm data’ shows the IP address, gateway, dns1, and dns2), but name resolution does not work. Based on my MQTT reports, it appears to fix itself eventually, but it may take 2+ hours before reporting starts again.

Python script throws: socket.gaierror: [Errno -3] Temporary failure in name resolution

From manual testing: ntpdate -q ntpserver => also throws name server error

So the system thinks it is connected, but the dns resolution does not work.

If I manually run ‘cm data disconnect’, wait, ‘cm data connect’, then it will re-connect and run for a while, but eventually might happen again.

I have seen some posts from people who had connection issues, but a search for ‘name resolution’ on both the Sierra forums and Google doesn’t turn up many solutions.

The pyRTE authors (Python for Legato, written by Sierra developers in Australia) wrote an app that monitors the sent and received data on the rmnet_data0 interface and, if the counters don’t change in X seconds, reboots the modem. That app is not open source, but I could probably re-create the functionality if that is the only way.
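A minimal sketch of how such a traffic watchdog might look, assuming Python 3 and the standard Linux per-interface counters under /sys/class/net/<iface>/statistics. This is not the pyRTE code: the function names, the 300-second threshold, and the on_stall callback are all hypothetical.

```python
import os
import time


def read_counters(iface="rmnet_data0", base="/sys/class/net"):
    """Return (rx_bytes, tx_bytes) for an interface, read from sysfs."""
    stats = os.path.join(base, iface, "statistics")
    values = []
    for name in ("rx_bytes", "tx_bytes"):
        with open(os.path.join(stats, name)) as f:
            values.append(int(f.read().strip()))
    return tuple(values)


def watch(on_stall, stall_seconds=300, poll_seconds=10,
          read=read_counters, clock=time.monotonic, sleep=time.sleep,
          max_polls=None):
    """Call on_stall() when the counters have not changed for stall_seconds.

    on_stall is a hypothetical callback (for example, a connection reset).
    max_polls=None means run forever; a number is useful for testing.
    """
    last = read()
    last_change = clock()
    polls = 0
    while max_polls is None or polls < max_polls:
        sleep(poll_seconds)
        polls += 1
        current = read()
        now = clock()
        if current != last:
            # Traffic moved: remember the new counters and reset the timer.
            last, last_change = current, now
        elif now - last_change >= stall_seconds:
            on_stall()
            last_change = now  # restart the timer after the recovery action
```

Note that byte counters only show whether traffic is moving at all. If IP traffic still flows while name resolution fails, this kind of watchdog may not trigger.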

  1. Are other people having this issue?

  2. Is the correct course of action a reboot or is issuing ‘cm data disconnect’, wait, ‘cm data connect’ acceptable?

  3. I am reporting every 15 minutes. Would I be better off just calling ‘cm data connect’ directly before the MQTT push, and then ‘cm data disconnect’ directly after?

I have not been able to quantify how much data is used just to maintain the connection indefinitely (‘cm data connect’ called at startup) versus the connect-when-needed approach (#3 above).

At this point, I don’t plan to have the device go into deep sleep (ULPM), since they want 15-minute reports.

Thanks

Before the problem happens, are you able to ping the DNS server?
When the problem happens, are you still able to ping the DNS server?
When the problem happens, are you still able to ping 8.8.8.8?

You can check whether disabling the eDRX feature with AT+CEDRXS=0 improves things.

Will the DNS work fine after disconnecting “cm data” and connecting “cm data”?

BTW, does it work if you use the real IP address instead of host name resolution (gethostbyname())?

This is what I get back…

AT+CEDRXS?
+CEDRXS:

OK

I can’t find any description of what AT+CEDRXS does.
Yes, I have downloaded and reviewed the WP77xx AT Command guide. It mentions the command, but doesn’t tell you what it does, whether you need it, or how to configure it.

I will try pinging by IP address the next time it fails, to see whether it is just a loss of name resolution.
I am pretty sure it is just DNS loss.

Are you thinking during startup, just lookup the ipaddress of my dashboard host and send data to the ipaddress instead of the mqtthost.com name?


You can directly set AT+CEDRXS=0 and see what happens.

Are you thinking during startup, just lookup the ipaddress of my dashboard host and send data to the ipaddress instead of the mqtthost.com name?

Yes, if it is just a DNS problem, you can try using the real IP address.
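One way this could be sketched in Python is to resolve the name whenever DNS works and fall back to the last address that resolved successfully when it does not. The function name and the in-memory cache are hypothetical, and the sketch assumes the host's IP address does not change often.

```python
import socket

# Hypothetical in-memory cache: hostname -> last IP address that resolved.
_last_good_ip = {}


def resolve_with_fallback(hostname, resolver=socket.gethostbyname):
    """Resolve hostname, falling back to the last address that worked."""
    try:
        ip = resolver(hostname)
    except socket.error:
        # socket.gaierror is a subclass of socket.error, so the
        # "Temporary failure in name resolution" case lands here.
        if hostname in _last_good_ip:
            return _last_good_ip[hostname]
        raise
    _last_good_ip[hostname] = ip
    return ip
```

Usage would be something like connecting to `resolve_with_fallback("mqtthost.com")` instead of the name itself. If the broker uses TLS with hostname verification, connecting by IP address may fail the certificate check, so this suits plain connections best.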

It finally failed again, and I can confirm it is just a loss of name resolution (DNS) and not a connectivity problem.

ping www.google.com - does not work
ping 8.8.8.8 - works fine

‘cm data disconnect’, wait 5 seconds, ‘cm data connect’ and everything works again.
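The name-versus-IP check above could be scripted roughly as follows. This is a sketch under assumptions: it uses a TCP connection to port 53 on 8.8.8.8 instead of an ICMP ping (which avoids needing raw sockets), and the function name and return strings are hypothetical.

```python
import socket


def classify_link(hostname="www.google.com", probe_ip="8.8.8.8",
                  probe_port=53, timeout=5.0,
                  resolver=socket.gethostbyname,
                  connect=socket.create_connection):
    """Return 'ok', 'dns_only' or 'no_connectivity'.

    'dns_only' means the name lookup failed but a connection by
    IP address still succeeded, as seen in the ping test above.
    """
    try:
        resolver(hostname)
        return "ok"
    except socket.error:
        pass  # name resolution failed; check raw IP connectivity next
    try:
        conn = connect((probe_ip, probe_port), timeout)
    except socket.error:
        return "no_connectivity"
    conn.close()
    return "dns_only"
```

Logging the result each time a publish fails would show whether every failure is the DNS-only kind.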

I have not tried AT+CEDRXS=0 yet, as I wanted to see if it was just DNS failure and needed a failure to do so.

Before the problem happens, are you able to ping the DNS server?
When the problem happens, are you still able to ping the DNS server?

If you can still ping the DNS server, you might need to find out why the DNS packets are not being sent out to it.

It has not failed again yet, but yes, I can ping both the DNS 1 and DNS 2 IP addresses while the connection is active.

That time it took 12+ hours to fail…

I can confirm that I CAN ping both DNS1 and DNS2 by IP ADDRESS (the DNS addresses as listed in ‘cm data’).

ntpdate -q timeserver gives a name resolution error.

So it definitely is a DNS error, not a connectivity issue.

On the Python side, I am going to wrap my MQTT connect call in a try/except. On failure, it will call my cycle_connection function, which will issue a

cm data disconnect,
wait 10 seconds,
cm data connect,
wait 10 seconds, and then check with a

try:
    socket.gethostbyname(hostname)
    return 1
except socket.error:
    return 0

If it returns 1, go back and retry the MQTT publish; if it returns 0, reboot.
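Put together, the plan above might look something like the sketch below. It assumes ‘cm data connect’ is started in the background (as with the ‘&’ in the startup script), so it uses Popen for that step. The publish and reboot arguments are hypothetical callbacks supplied by the caller, and letting a second publish failure propagate is a choice made for this sketch.

```python
import socket
import subprocess
import time


def dns_ok(hostname):
    """Return True if hostname currently resolves."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.error:
        return False


def cycle_connection(hostname, wait=10, run=subprocess.call,
                     spawn=subprocess.Popen, sleep=time.sleep, check=dns_ok):
    """Drop and re-establish the data session, then test DNS."""
    run(["cm", "data", "disconnect"])
    sleep(wait)
    # Started without waiting, mirroring 'cm data connect &'.
    spawn(["cm", "data", "connect"])
    sleep(wait)
    return check(hostname)


def publish_with_recovery(publish, hostname, reboot, recover=cycle_connection):
    """Try publish(); on a socket error, cycle the link and retry once.

    publish and reboot are caller-supplied callables (hypothetical here).
    """
    try:
        return publish()
    except socket.error:
        pass
    if recover(hostname):
        return publish()  # a second failure propagates to the caller
    reboot()
    return None
```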

But if you can ping the DNS server when the problem happens, why is the DNS server not responding to the DNS request?
You might need to capture a Wireshark log to see what happens to the DNS packets when the problem occurs.

BTW, does AT+CEDRXS=0 help?

All I know is that DNS is hung up somehow.

Since the unit has stopped abiding by the APN setting and constantly switches carriers, I can no longer run any more tests, as we have no connectivity to test with. I am going to try to escalate the connectivity issue with my distributor today…