GobiNet Linux driver: BUG: scheduling while atomic


#1

We are trying to get an MC7430 running on an embedded ARM board with Linux kernel 2.6.35.
When we try to load the GobiNet kernel module (version S2.26N2.39), the kernel prints the following error messages (the complete log can be found in the attachment):

# modprobe GobiNet
[  244.791441] GobiNet: 2016-11-07/SWI_2.39
[  244.796014] GobiNet::GobiNetDriverBind in 86, out 4
[  244.807065] GobiNet 2-1.2:1.8: usb0: register 'GobiNet' at usb-fsl-ehci.1-1.2, GobiNet Ethernet Device, 32:cd:ee:83:ee:c6
[  244.824178] Ethernet mode
[  244.827178] GobiNet::GobiUSBNetProbe Mac Address:
[  244.831955] GobiNet::PrintHex    : 32 CD EE 83 EE 08 
[  244.837043] GobiNet::ClearTaskID ClearTaskID iTaskID(0)
[  244.842436] GobiNet::GobiUSBNetProbe <6>GobiNet Thread : GobiNetThread:0 2:8
[  244.849484] USB Speed : USB 2.0
[  244.852654] GobiNet::thread_function Handle qcqmi(qcqmi0-2-1.2:1.8), task: 0
[  244.859719] GobiNet::FindClientMem Could not find client mem 0x0000
[  244.866072] BUG: scheduling while atomic: GobiNetThread:0/2313/0x00000002
[  244.873002] Modules linked in: GobiNet(+) g_file_storage tps61170_gpio rtc_mxc_v2(+) ad7843_mxc hwmon arcotg_udc sierra GobiSerial usbserial usbnet mii ad799x(C) ring_sw(C) industrialio(C) adv7181d [last unloaded: GobiNet]
[  244.893043] [<8020b770>] (unwind_backtrace+0x0/0x16c) from [<805cf960>] (schedule+0x68/0x314)
[  244.901594] [<805cf960>] (schedule+0x68/0x314) from [<805d037c>] (schedule_timeout+0x1a8/0x1dc)
[  244.910299] [<805d037c>] (schedule_timeout+0x1a8/0x1dc) from [<805d0050>] (wait_for_common+0x100/0x1bc)
[  244.919718] [<805d0050>] (wait_for_common+0x100/0x1bc) from [<80490ea4>] (usb_start_wait_urb+0x68/0x190)
[  244.929217] [<80490ea4>] (usb_start_wait_urb+0x68/0x190) from [<80491180>] (usb_control_msg+0xb8/0xdc)
[  244.938797] [<80491180>] (usb_control_msg+0xb8/0xdc) from [<7f1c2a34>] (RegisterQMIDevice+0x174/0x55c [GobiNet])
[  244.949038] [<7f1c2a34>] (RegisterQMIDevice+0x174/0x55c [GobiNet]) from [<7f1bdf50>] (thread_function+0xa0/0x180 [GobiNet])
[  244.960209] [<7f1bdf50>] (thread_function+0xa0/0x180 [GobiNet]) from [<802457e8>] (kthread+0x78/0x80)
[  244.969451] [<802457e8>] (kthread+0x78/0x80) from [<802079a8>] (kernel_thread_exit+0x0/0x8)
[  244.979507] GobiNet::FindClientMem Found client's 0x0 memory
...

How can that be fixed?

Thanks,
Peter
gobinet_log.txt (190 KB)


#2

Hi Peter,
We are having the same problem with GobiNet (BUG: scheduling while atomic: GobiNetThread). Were you able to solve the issue?


#3

This is caused by major code changes introduced in v2.39 of the GobiNet driver. My best guess is that it isn’t quite ready yet. I would try S2.26N2.38 instead until the issue is resolved.


#4

The 74xx doesn’t support Ethernet mode; please check the RAWIP option in the Makefile.
You can compile with “make RAWIP=1”.


#5

I believe the driver would have refused to bind if missing RAWIP support was the problem.

EDIT: Sorry, I was wrong. You are completely right.

Reading the full log I see that it finally fails with

[  486.046568] GobiNet::QMIWDASetDataFormatResp EFAULT: Data Format Cannot be set to Ethernet Mode
[  486.055266] GobiNet::QMIWDASetDataFormat Data Format Cannot be set

so yes, the driver needs to be built with RAWIP=1.

I still don’t see how that is related to the “BUG: scheduling while atomic” warnings, but maybe it is somehow? Worth trying.


#6

BTW, I see that there is now a new version, S2.27N2.40, in which the build-time RAWIP flag has been removed. It is now finally a runtime setting, and it is even automatic. From the release notes:

No notes about the “scheduling while atomic” issue, but if it was related to the lack of RAWIP support then it should be fixed.


#7

This issue does not seem to be entirely fixed, at least.

I have seen the “scheduling while atomic” issue on our embedded system running kernel 2.6.35 on an armv5 CPU (ARM926EJ-S rev 5 (v5l)). With GobiNet driver versions as recent as v2.50, it leads to kernel panics as soon as I load the kernel module with a modem connected.

At least one issue seems to be that the function LocalClientMemLockSpinIsLock (in QMIDevice.c) checks the spin lock’s status using spin_is_locked before the lock is released using spin_unlock_irqrestore. On my setup, spin_is_locked always returns 0, so the unlock path returns early and the lock is never released. This is probably because, AFAIK, spin_is_locked always returns 0 on non-SMP kernels (at least when they are built without spinlock debugging).

It seems to me that the separate spin_is_locked check is a bad idea anyway, so I removed it. I no longer see the scheduling while atomic issue, but I have not yet tested the driver enough to know whether the patch causes other issues. There might also be other locking-related issues remaining in other parts of the code base. The patch is the following:

--- a/GobiNet/QMIDevice.c
+++ b/GobiNet/QMIDevice.c
@@ -618,13 +618,6 @@
    if(pDev!=NULL)
    {
       unsigned long flags = pDev->mQMIDev.mFlag;
-      if(LocalClientMemLockSpinIsLock(pDev)==0)
-      {
-         #if SPIN_LOCK_DEBUG
-         printk("(%d)%s :%d Not Locked\n",task_pid_nr(current),__FUNCTION__,line);
-         #endif
-         return 0;
-      }
       #if SPIN_LOCK_DEBUG
       printk("(%d)%s %d :%d\n",task_pid_nr(current),__FUNCTION__,__LINE__,line);
       #endif

Best regards,
Stefan