GobiNet Linux driver: BUG: scheduling while atomic

We are trying to get an MC7430 running on an embedded ARM board running Linux kernel 2.6.35.
When we try to load the GobiNet kernel module (version S2.26N2.39), the kernel prints the following error messages
(the complete log can be found in the attachment):

# modprobe GobiNet
[  244.791441] GobiNet: 2016-11-07/SWI_2.39
[  244.796014] GobiNet::GobiNetDriverBind in 86, out 4
[  244.807065] GobiNet 2-1.2:1.8: usb0: register 'GobiNet' at usb-fsl-ehci.1-1.2, GobiNet Ethernet Device, 32:cd:ee:83:ee:c6
[  244.824178] Ethernet mode
[  244.827178] GobiNet::GobiUSBNetProbe Mac Address:
[  244.831955] GobiNet::PrintHex    : 32 CD EE 83 EE 08 
[  244.837043] GobiNet::ClearTaskID ClearTaskID iTaskID(0)
[  244.842436] GobiNet::GobiUSBNetProbe <6>GobiNet Thread : GobiNetThread:0 2:8
[  244.849484] USB Speed : USB 2.0
[  244.852654] GobiNet::thread_function Handle qcqmi(qcqmi0-2-1.2:1.8), task: 0
[  244.859719] GobiNet::FindClientMem Could not find client mem 0x0000
[  244.866072] BUG: scheduling while atomic: GobiNetThread:0/2313/0x00000002
[  244.873002] Modules linked in: GobiNet(+) g_file_storage tps61170_gpio rtc_mxc_v2(+) ad7843_mxc hwmon arcotg_udc sierra GobiSerial usbserial usbnet mii ad799x(C) ring_sw(C) industrialio(C) adv7181d [last unloaded: GobiNet]
[  244.893043] [<8020b770>] (unwind_backtrace+0x0/0x16c) from [<805cf960>] (schedule+0x68/0x314)
[  244.901594] [<805cf960>] (schedule+0x68/0x314) from [<805d037c>] (schedule_timeout+0x1a8/0x1dc)
[  244.910299] [<805d037c>] (schedule_timeout+0x1a8/0x1dc) from [<805d0050>] (wait_for_common+0x100/0x1bc)
[  244.919718] [<805d0050>] (wait_for_common+0x100/0x1bc) from [<80490ea4>] (usb_start_wait_urb+0x68/0x190)
[  244.929217] [<80490ea4>] (usb_start_wait_urb+0x68/0x190) from [<80491180>] (usb_control_msg+0xb8/0xdc)
[  244.938797] [<80491180>] (usb_control_msg+0xb8/0xdc) from [<7f1c2a34>] (RegisterQMIDevice+0x174/0x55c [GobiNet])
[  244.949038] [<7f1c2a34>] (RegisterQMIDevice+0x174/0x55c [GobiNet]) from [<7f1bdf50>] (thread_function+0xa0/0x180 [GobiNet])
[  244.960209] [<7f1bdf50>] (thread_function+0xa0/0x180 [GobiNet]) from [<802457e8>] (kthread+0x78/0x80)
[  244.969451] [<802457e8>] (kthread+0x78/0x80) from [<802079a8>] (kernel_thread_exit+0x0/0x8)
[  244.979507] GobiNet::FindClientMem Found client's 0x0 memory
...

How can that be fixed?

Thanks,
Peter
gobinet_log.txt (190 KB)

Hi Peter,
We are also having the same problem with GobiNet (BUG: scheduling while atomic: GobiNetThread). Were you able to solve the issue?

This is caused by major new code changes introduced in v2.39 of the GobiNet driver. My best guess is that it isn’t quite ready yet. I would try using S2.26N2.38 instead until the issue is resolved.

The 74xx doesn’t support Ethernet mode; please check the RAWIP option in the Makefile.
You can compile with “make RAWIP=1”.

I believe the driver would have refused to bind if missing RAWIP support was the problem.

EDIT: Sorry, I was wrong. You are completely right.

Reading the full log I see that it finally fails with

[  486.046568] GobiNet::QMIWDASetDataFormatResp EFAULT: Data Format Cannot be set to Ethernet Mode
[  486.055266] GobiNet::QMIWDASetDataFormat Data Format Cannot be set

so yes, the driver needs to be built with RAWIP=1.

I still don’t see how that is related to the “BUG: scheduling while atomic” warnings, but maybe it is somehow? Worth trying.

BTW, I see that there is now a new version, S2.27N2.40, where the build-time RAWIP flag has been removed. It is now finally a runtime setting, and it is even automatic. From the release notes:

No notes about the “scheduling while atomic” issue, but if it was related to the lack of RAWIP support then it should be fixed.

This issue does not seem to be entirely fixed, at least.

I have seen the “scheduling while atomic” issue on our embedded system running kernel 2.6.35 on an armv5 CPU (ARM926EJ-S rev 5 (v5l)). It leads to kernel panics with GobiNet drivers as recent as v2.50 as soon as I load the kernel module with a modem connected.

One issue, at least, seems to be that the function LocalClientMemLockSpinIsLock (in QMIDevice.c) checks a spin lock’s status using spin_is_locked before the lock is released using spin_unlock_irqrestore. On my setup, spin_is_locked always returns 0, so the unlock path returns early and the lock is never actually released. This is probably because spin_is_locked returns 0 on all non-SMP kernels, AFAIK.
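To illustrate the failure mode described above, here is a minimal userspace model, not the driver's code. All names (model_spin_lock, guarded_unlock, plain_unlock, leftover_after_cycle) are hypothetical stand-ins. It assumes a uniprocessor, preemptible kernel where taking a spin lock raises the preempt count and spin_is_locked is a constant 0:

```c
/* Userspace model only: these names imitate the kernel API but are
 * simplified, hypothetical stand-ins, not the real implementations. */

/* Models the kernel's preempt count; nonzero means "atomic context". */
int preempt_count;

/* Assumption: on a uniprocessor build the lock carries no state. */
struct model_spinlock { int unused; };

/* Taking the lock disables preemption, even on a uniprocessor. */
void model_spin_lock(struct model_spinlock *l)
{
    (void)l;
    preempt_count++;
}

void model_spin_unlock(struct model_spinlock *l)
{
    (void)l;
    preempt_count--;
}

/* Assumed uniprocessor behaviour: there is no lock word to inspect,
 * so the query always answers "not locked". */
int model_spin_is_locked(struct model_spinlock *l)
{
    (void)l;
    return 0;
}

/* Pattern described above: skip the unlock if the lock looks free. */
int guarded_unlock(struct model_spinlock *l)
{
    if (model_spin_is_locked(l) == 0)
        return 0;               /* early return, lock never released */
    model_spin_unlock(l);
    return 1;
}

/* Alternative: release unconditionally, trusting the caller's pairing. */
int plain_unlock(struct model_spinlock *l)
{
    model_spin_unlock(l);
    return 1;
}

/* Runs one lock/unlock cycle and returns the preempt count left over. */
int leftover_after_cycle(int (*unlock)(struct model_spinlock *))
{
    struct model_spinlock lock = { 0 };

    preempt_count = 0;
    model_spin_lock(&lock);
    unlock(&lock);
    return preempt_count;
}
```

In this model, leftover_after_cycle(guarded_unlock) leaves the count raised, while leftover_after_cycle(plain_unlock) brings it back to zero. A count that stays raised is the kind of state in which a later sleeping call, such as the usb_control_msg in the backtrace, would trigger "scheduling while atomic".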

It seems to me that the separate spin_is_locked check is a bad idea anyway, so I removed it. I no longer see the scheduling while atomic issue, but I have not yet tested the driver enough to see if the patch causes other issues. There might also be other locking-related issues remaining in other parts of the code base. The patch is the following:

--- a/GobiNet/QMIDevice.c
+++ b/GobiNet/QMIDevice.c
@@ -618,13 +618,6 @@
    if(pDev!=NULL)
    {
       unsigned long flags = pDev->mQMIDev.mFlag;
-      if(LocalClientMemLockSpinIsLock(pDev)==0)
-      {
-         #if SPIN_LOCK_DEBUG
-         printk("(%d)%s :%d Not Locked\n",task_pid_nr(current),__FUNCTION__,line);
-         #endif
-         return 0;
-      }
       #if SPIN_LOCK_DEBUG
       printk("(%d)%s %d :%d\n",task_pid_nr(current),__FUNCTION__,__LINE__,line);
       #endif

Best regards,
Stefan