VM crashes OpenAT

I am trying to execute this very basic code in the Lua VM:

for i=1,1000 do print(i) end

What happens is that after “i” reaches 380+, the modem resets, without any warning/error.
When checking the backtrace, the following errors show up:

Current OAT task index 0
ARM Data Abort caught at 0028723A, Current Task 1D by CP15

The 0028723a address seems to map to the luaV_execute function.

It could be caused by a memory leak, but it’s just a guess.

I am using oatlua revision 97 on a Fastrack Supreme 10 modem (compiled with a 256KB memory model).
The oatlua library was built without any debug information to save memory.

A fix for this problem would be great, because it really impacts our project.

Thank you,
Alex

Big Loops Are Bad!

They trip the watchdog.

FAQ: wavecom.com/modules/movie/sc … iki#p16098

don’t link to the old forum! :stuck_out_tongue:

is the correct link

:blush: Oops - good point.

The trouble is, all the references in existing posts are to the old forum!

What I did was a search for “FAQ Wiki”, and then just followed the link in the first post it found - which, of course, ended up at the old forum!

See: viewtopic.php?f=7&t=4579&p=18492#p18492

I am not sure if the infinite loop problem applies to the Lua VM. If it does, then I strongly believe it should be considered a bug.
I tried adding a wait() function call to the loop to let the other threads do their job, but the same happens:

for i=1,1000 do print(i);wait(5);end

After “i” reaches 60+, a crash occurs.

This might sound like a trivial question, but from what I experienced during the last few days I can assure you it’s not. Can someone please show me how to print the numbers from 1 to 1000 in OpenAT lua without crashing?

Thank you,
Alex

I’m not sure either, but I suspect it does

Why?

The Lua VM, as I understand it, is just an Open-AT app - so the watchdog limitation would be expected to apply?

I do not have any LuaW HW available here, but I doubt that an OpenAT watchdog reset is the reason for this behaviour. LuaW calls adl_wdPut2Sleep to keep the watchdog from triggering for at least 1 minute (see src_orig\lvm.c for details).

To see if memory is the problem, run something like:

for i = 1,1000 do print(i, mem().USED_MEM_NOW) end  -- see memory rising
-- or
for i = 1,1000 do print(i); gc() end   -- garbage collection on each iteration

AFAIK r97 is prepared for a 1MB+ memory footprint, so if you are using the smaller one, you will have to change some constants (LUAW_MEM_ALARM_THRESHOLD should be set to 150, etc.). Better yet, get a 1MB+ device and enjoy Lua.

Hmm… sounds like a risky idea!

In any embedded system, the watchdog is there for a purpose - so disabling it without specific, careful thought is not a good idea!

Now wandering off-topic…

Newcomers to the embedded world often consider watchdog resets to be a fault in themselves - but this is to miss the point!
The purpose of a watchdog is to prevent operations from taking “too long”. Therefore, if you are getting watchdog resets, it indicates that something is taking “too long”, and you need to look at the cause of that - not just blame (or disable) the watchdog!

OK, I’ll stop rambling now.

Just to note that I do realise that it is not proven that watchdog resets are involved here…

Indeed, I think this was one of the problems - after setting LUAW_MEM_ALARM_THRESHOLD to 140 KB “mem().USED_MEM_NOW” never went above that value.

However there was another detail that I missed before and it seems it’s responsible for the reset:
I have an adl_tmr_t timer callback C function used for printing debug messages directly into the VM. This function is triggered periodically and calls the “print” Lua function (using lua_call). Therefore, while the ‘for’ loop is running in the Lua VM, this function calls the “print” function at the same time. And then, for some reason, OpenAT crashes.

I think one of the possible reasons is that while Lua’s garbage collector is releasing the resources allocated during the execution of the ‘for’ loop, I call the ‘print’ function and this somehow disrupts the garbage collector. This is just a theory, though, and I was unable to find any synchronization functions in the Lua C API.

What do you think?
Thanks

I don’t think the watchdog has anything to do with the crash because the VM seems to reset the watchdog periodically as long as it’s executing bytecode. Also, I would’ve considered this to be a bug since I don’t see why anyone programming in a VM should worry about some external OS watchdog …
Let’s see…

Thanks
Alex

Fair enough.

Your theory about the “interference” between ‘C’ and Lua printing sounds plausible…

There is watchdog control in the Lua VM, but the watchdog isn’t disabled. It is periodically put to sleep in the VM main loop, i.e. as long as new Lua bytecode instructions are being executed, the watchdog won’t trip.

  • If some C callback doesn’t release the CPU, after ~ 1 minute the watchdog will trigger.
  • If the VM somehow fails, it will stop executing new bytecode instructions, stop calling put2sleep(), and let the watchdog trigger.
  • If it has nothing left to do, it lets the user app go idle and won’t trigger the watchdog.
  • If it takes more than 60 seconds (put2sleep duration) to execute 1000 bytecode instructions (put2sleep periodicity), the watchdog triggers.
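
To make the last case concrete, here is a small stand-alone simulation of that rule. It is only a sketch: it does not call adl_wdPut2Sleep or any other real API, the function name watchdog_fires is made up for illustration, and it assumes the watchdog has been put to sleep once before the first instruction runs. The 1000-instruction period and the 60-second duration are the figures quoted above.

```c
#include <assert.h>
#include <stdbool.h>

#define PUT2SLEEP_PERIOD 1000     /* instructions between put2sleep calls */
#define PUT2SLEEP_MS     60000L   /* sleep time granted by each call, in ms */

/* Hypothetical simulation, not part of oatlua: returns true if the watchdog
 * would fire while running `count` instructions of `ms_each` ms each. */
bool watchdog_fires(long count, long ms_each)
{
    assert(count >= 0 && ms_each >= 0);

    long now = 0;                    /* simulated clock, in ms */
    long deadline = PUT2SLEEP_MS;    /* assumed: put to sleep once at start */

    for (long i = 1; i <= count; i++) {
        now += ms_each;              /* "execute" one instruction */
        if (now > deadline)
            return true;             /* sleep expired before the next feed */
        if (i % PUT2SLEEP_PERIOD == 0)
            deadline = now + PUT2SLEEP_MS;   /* VM loop feeds the watchdog */
    }
    return false;
}
```

With these numbers, watchdog_fires(100000, 1) is false (each batch of 1000 instructions takes 1 second), while watchdog_fires(100000, 61) is true, because a single batch would need 61 seconds. A plain print loop is nowhere near that, which is why the watchdog is an unlikely culprit here.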

To come back to your issue: Lua expects multi-tasking to either happen in different lua_State (n.a. on Open AT, we can’t afford to maintain several states in RAM, and lack of context arguments in many ADL callbacks would make it very tricky to implement anyway), or be handled through coroutines. By doing a lua_call out of the blue, you might indeed cause some problems.
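
One way to avoid calling into Lua from the timer callback is to have the callback only store the message, and let the code that owns the Lua state pick it up later at a point where no other Lua call is in progress. The sketch below is plain C with no ADL or Lua calls. All names (msg_post, msg_take, MSG_SLOTS, MSG_LEN) are hypothetical, and it assumes the producer and the consumer never interrupt each other in the middle of one of these functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MSG_SLOTS 8    /* hypothetical queue depth */
#define MSG_LEN   32   /* hypothetical max message size, including the NUL */

static char   queue[MSG_SLOTS][MSG_LEN];
static size_t head, tail, count;   /* head: next read, tail: next write */

/* Call this from the timer callback instead of lua_call(): it only copies
 * the text. Returns 0 on success, -1 if the queue is full (message dropped). */
int msg_post(const char *text)
{
    assert(text != NULL);
    if (count == MSG_SLOTS)
        return -1;
    strncpy(queue[tail], text, MSG_LEN - 1);
    queue[tail][MSG_LEN - 1] = '\0';   /* long messages are truncated */
    tail = (tail + 1) % MSG_SLOTS;
    count++;
    return 0;
}

/* Call this from the side that owns the Lua state, when it is safe to print.
 * Copies the oldest message into `out`; returns 1 if one was taken, 0 if
 * the queue was empty. */
int msg_take(char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);
    if (count == 0)
        return 0;
    strncpy(out, queue[head], out_len - 1);
    out[out_len - 1] = '\0';
    head = (head + 1) % MSG_SLOTS;
    count--;
    return 1;
}
```

The consumer would loop on msg_take() and pass each string to print from there. Dropping messages when the queue is full is a deliberate choice in this sketch: it keeps the callback from ever blocking.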

There’s a (probably undocumented?) wip_debug() function, which works like printf() and outputs to the port given to wip_netInitOpts as the WIP_NET_DEBUG_PORT option. I suggest you use this. To preserve non-blocking behavior, it might lose some data if buffers are saturated.