BUG: In the ongoing saga of timers, our hero boldly codes

keithnicholas · February 10, 2011, 12:54am

So… starting a new project with the helloworld in 2.34

I wanted to see how timers behave when their handlers take too long. The code at the bottom starts a few timers, one of which (“the sluggish timer”) takes too long. After 33 or 34 cycles on my AirPrime Q2686RD / 7.44, the whole app resets with an exception. That implies something behind the scenes is queuing up timer expirations and eventually runs out of space to queue them.
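
To make that queuing hypothesis concrete, here is a toy model in plain C. It is not Open AT code and not a description of the real firmware: it assumes a single cyclic timer, one task that runs handlers one at a time, and a fixed-size queue of pending expirations. The function name and the `capacity` parameter are made up for illustration.

```c
#include <stdint.h>

/* Toy model (assumption, not the real firmware): one cyclic timer whose
 * expirations are queued and handled serially by a single task.
 * Returns the cycle on which the backlog first exceeds `capacity`,
 * or -1 if that never happens within `max_cycles`. */
int cycles_until_overflow(uint32_t period_ms, uint32_t handler_ms,
                          uint32_t capacity, uint32_t max_cycles)
{
    uint32_t queued = 0;   /* expirations waiting to be handled */
    uint32_t busy_ms = 0;  /* time left in the handler currently running */
    uint32_t cycle;

    for (cycle = 1; cycle <= max_cycles; cycle++) {
        uint32_t budget = period_ms;  /* time until the next expiration */

        /* The timer expires: one more expiration joins the queue. */
        queued++;
        if (queued > capacity)
            return (int)cycle;

        /* Spend the rest of this period running handlers. */
        while (budget > 0) {
            if (busy_ms == 0) {
                if (queued == 0)
                    break;            /* nothing left to do: idle */
                queued--;
                busy_ms = handler_ms; /* start the next handler */
            }
            if (busy_ms > budget) {
                busy_ms -= budget;    /* handler overruns into next period */
                budget = 0;
            } else {
                budget -= busy_ms;    /* handler finishes within the period */
                busy_ms = 0;
            }
        }
    }
    return -1;
}
```

With a 100ms period and a 200ms handler, a hypothetical queue of 4 entries overflows on cycle 9 in this model; with a handler that fits inside the period it never overflows. Whether the real module has such a queue, and how deep it is, is exactly what is unknown here.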

I don’t think it should reset with an exception, as the API gives you no way to work out whether that is ever going to happen or whether you are close to it.

With more code I found that if the timer takes an extremely long time in one cycle and then has one short cycle, the whole thing stays alive, which means a 100ms timer ends up taking about 500ms. That slows everything down but doesn’t kill the unit.

So far I can’t reproduce, in simple code, another problem I have where one timer seems to stop working while the rest carry on. But I’m working on it.

#include "adl_global.h"


const u16 wm_apmCustomStackSize = 1024*3;

bool timers_launched = false;
u32 sluggish_count  = 0;
u32 watching_count = 0;
u32 compute = 0;

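/* Deliberately slow handler: the nested busy loop makes it overrun its 100ms period. */
void sluggish_handler( u8 ID, void * Context )
{
	u32 i;
	u32 ii;
	for(i = 0; i < 562; i++)
	{
		for(ii = 0; ii < 562; ii++)
		{
			/* Busy work only; note that shifting a 32-bit value by 32 or more is undefined in C. */
			compute = i >> ii;
		}
	}
	sluggish_count++;
}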

void watching_handler( u8 ID, void * Context )
{
	watching_count++;
}

void launch_timers( void )
{
	/* Two cyclic timers, each with a period of 1 x 100ms. Return values are not checked here. */
	adl_tmrSubscribe ( TRUE, 1, ADL_TMR_TYPE_100MS, sluggish_handler );
	adl_tmrSubscribe ( TRUE, 1, ADL_TMR_TYPE_100MS, watching_handler );
	timers_launched = true;
}


void HelloWorld_TimerHandler ( u8 ID, void * Context )
{
	ascii s[200];
	if(! timers_launched) launch_timers();

	sprintf(s, "slug %u   watching %u   [%u]\r\n", sluggish_count, watching_count, compute);

	adl_atSendResponse ( ADL_AT_UNS, s);
}


void adl_main ( adl_InitType_e  InitType )
{
	ascii s[100];
	sprintf(s, "Init %d\r\n", (int) InitType);
	adl_atSendResponse ( ADL_AT_UNS, s);
	adl_tmrSubscribe ( TRUE, 10, ADL_TMR_TYPE_100MS, HelloWorld_TimerHandler );
}

awneil · February 10, 2011, 7:45am

There is a documented limit of 32 timers running simultaneously:

Your code doesn’t check the return values from adl_tmrSubscribe!
That would be the obvious way for the API to tell you about such things and, according to the above quote, it does.

Always check API return codes!
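
A minimal sketch of what such a check could look like, in plain C so it can be tried off-target. The helper name `timer_handle_ok` is hypothetical and not part of ADL. It assumes that failure is reported either as NULL or as a negative error code cast to the pointer type; the exact error values should be taken from the ADL documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ADL: classifies the value returned by a
 * subscribe-style call that hands back a pointer on success. */
int timer_handle_ok(const void *handle)
{
    /* Assumption: NULL means no timer could be allocated. */
    if (handle == NULL)
        return 0;

    /* Assumption: a negative error code may come back cast to a pointer. */
    if ((intptr_t)handle < 0)
        return 0;

    return 1; /* looks like a usable timer handle */
}
```

In the sample above, each adl_tmrSubscribe result could be passed through a check like this, and a failure reported with adl_atSendResponse, instead of being discarded.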

keithnicholas · February 10, 2011, 8:53pm

I’m not starting 32 timers.

I have 3 timers running… the return codes are fine.

I have 1 timer that takes slightly too long for its time period.

Even if I were starting more than 32 timers, the application shouldn’t crash. The API call would return an error code, no timer handler would get called for the failed timer, and the app would simply carry on with the timers that had started.

ALWAYS read the question properly

awneil · February 11, 2011, 8:01am

Touché!

But how do you know that, since your code discards all return codes?

Note that the “success” return code from adl_tmrSubscribe is a pointer to a timer structure - have you tried looking into that structure, to see if there are any clues there…?

keithnicholas · February 11, 2011, 10:11am

It’s got nothing to do with the return codes… this is just a sample piece of code to show how to crash the app when it shouldn’t. Look through the example projects shipped with the SDK: most don’t store the return code unless they want to unsubscribe, because it’s not industrial code, just example code. In our actual app we have our own wrappers and a bit of macro magic around the timer API, which handle all the return codes automatically and keep track of them.

I know the 3 timers are running because of what the program prints out.

It then crashes, and it shouldn’t.

I can make the timers do all kinds of interesting things. But it should never crash, and I would like Sierra Wireless to document the internal architecture of timers and tasks so their exact nature isn’t so much of a mystery.

And don’t even get me started on why the API returns a pointer to a struct. That’s just wrong. It should return an actual handle. If you wanted to query the parameters of a timer, then given the handle it should return a copy of the struct. But anyway…
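
A sketch of that handle-based design in plain C. All names here (`tmr_handle_t`, `tmr_info_t`, `tmr_create`, `tmr_query`) are hypothetical and none of this is ADL API; the table size of 32 simply mirrors the limit mentioned earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

#define TMR_MAX 32                /* mirrors the 32-timer limit mentioned above */

typedef int32_t tmr_handle_t;     /* hypothetical handle: slot index, negative on error */

typedef struct {
    uint32_t period_ticks;
    int      cyclic;
    int      in_use;
} tmr_info_t;

/* Internal state stays private to the timer module. */
static tmr_info_t tmr_table[TMR_MAX];

/* Allocates a timer slot and returns its handle, or -1 if none is free. */
tmr_handle_t tmr_create(int cyclic, uint32_t period_ticks)
{
    tmr_handle_t h;
    for (h = 0; h < TMR_MAX; h++) {
        if (!tmr_table[h].in_use) {
            tmr_table[h].period_ticks = period_ticks;
            tmr_table[h].cyclic = cyclic;
            tmr_table[h].in_use = 1;
            return h;
        }
    }
    return -1;
}

/* Copies the timer's parameters into *out; returns 0 on success, -1 on a bad handle. */
int tmr_query(tmr_handle_t h, tmr_info_t *out)
{
    if (h < 0 || h >= TMR_MAX || out == NULL || !tmr_table[h].in_use)
        return -1;
    *out = tmr_table[h];          /* caller gets a copy, never the internal struct */
    return 0;
}
```

Because the caller only ever holds an index and a copy, nothing the application does to its `tmr_info_t` can corrupt the module’s own bookkeeping, and a stale or bogus handle is rejected instead of being dereferenced.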

awneil · February 11, 2011, 11:02am

You’re probably right, but how do you know?

If you don’t observe the return codes, how can you be sure that they aren’t giving you some clue(s)?

And don’t get me started on the quality of the example projects…
(see rant elsewhere)

It’s possible that the return code might tell you that there’s a problem - and yet still start the timer…
That would, of course, be wrong in itself - but possible.

It shouldn’t be necessary for the internal architecture to be documented, but the behaviour should certainly be fully & completely documented - especially exceptional conditions like this!

Unfortunately, Wavecom have an extremely poor record on documentation, and it remains to be seen if SiWi can redeem the situation…

keithnicholas · February 13, 2011, 8:21pm

I know it’s not the return codes because the timers are started. The return code only tells you that it failed to start the timer, either because it has run out of timers or because the service is locked. Since I know the timers are running (and in fact there is no reason in this situation for the timers not to run unless something is REALLY screwed), I know the return codes are OK and that they hold no useful information.

Having the behavior documented is like having the results of an algorithm documented. You might be able to work out how it works, or you might not. It’s simpler to document the algorithm so it’s possible to predict behavior in different contexts. You’ll find that for most non-open-source OSes there is a lot of detail available about the internal workings.

awneil · February 23, 2011, 1:47pm

Cross reference: https://forum.sierrawireless.com/t/details-of-how-timers-work/4865/3
