Skip to content

xtimer: timer already in the list #4841

@immesys

Description

@immesys

I initially reported this on the mailing list, but I figured I should add an issue.

I was getting strange lockups on my nodes, and managed to work out that a timer was being inserted into the list when it was already in the list, which would cause an infinite loop here

I modified the function with this inside the while loop:

if (*list_head == timer || (*list_head)->next == timer){
   disableIRQ();
   while(1);
}

And waited to see where the invocation was coming from. It triggered (on the second condition in the if statement), and this was the call chain:

#0  _add_timer_to_list (list_head=0x200021ac <sampling_stack+1755>, list_head@entry=0x20000630 <timer_list_head>, timer=timer@entry=0x2000148c <comms_stack+1832>) at //sys/xtimer/xtimer_core.c:226
#1  0x00000d12 in _xtimer_set_absolute (timer=0x2000148c <comms_stack+1832>, target=1963159872) at //sys/xtimer/xtimer_core.c:197
#2  0x00000dec in xtimer_set (timer=timer@entry=0x2000148c <comms_stack+1832>, offset=offset@entry=1000000) at //sys/xtimer/xtimer_core.c:133
#3  0x00000a4e in xtimer_set_msg (target_pid=<optimized out>, msg=0x20001484 <comms_stack+1824>, offset=1000000, timer=0x2000148c <comms_stack+1832>) at //sys/xtimer/xtimer.c:124
#4  xtimer_msg_receive_timeout (msg=msg@entry=0x200014bc <comms_stack+1880>, us=us@entry=1000000) at //sys/xtimer/xtimer.c:200
#5  0x00001c94 in comms (arg=<optimized out>) at //Apps/accel/main.c:192
#6  0x00000288 in sched_switch (other_prio=<optimized out>) at //core/sched.c:180
#7  0x20001404 in comms_stack ()

The comms thread looks basically like this (I am working on isolated reproducer):

msg_t dns_msg; //type set to MSG_TYPE_DNS_TIMER once in main
xtimer_t dns_timer; //nobody else uses this
void *comms(void *arg)
{
  msg_t msg;
  msg_init_queue(_comms_msg_queue, COMMS_QUEUE_SIZE);
  gnrc_netreg_entry_t entry = { NULL, 4041, thread_getpid() };
  gnrc_netreg_register(GNRC_NETTYPE_UDP, &entry))
  while(1) {
    if (xtimer_msg_receive_timeout(&msg, NETWORK_MAXRTT_US) < 0)
      msg.type = MSG_TYPE_MUST_TX; //default to resending old packet
    switch(msg.type) {
      case GNRC_NETAPI_MSG_TYPE_RCV: {
          /* snip */
          gnrc_pktbuf_release((gnrc_pktsnip_t *) msg.content.ptr);
          //Deliberately do not break
      }
      case MSG_TYPE_MUST_TX: { //Sent every ~100ms from sampling_thread
          /* snip building packet 'ip' */
          gnrc_netapi_dispatch_send(GNRC_NETTYPE_UDP, GNRC_NETREG_DEMUX_CTX_ALL, ip)
          break;
      }
      case MSG_TYPE_DNS_TIMER: { //sent once by main, otherwise only to ourselves
        update_dns_record(); //Sends a UDP packet, like above
        xtimer_set_msg(&dns_timer, DNS_INTERVAL, &dns_msg, thread_getpid());
        break;
      }
    } //end switch
  } //end while
} //endfun

So essentially the xtimer_msg_receive_timeout(&msg, NETWORK_MAXRTT_US) is sometimes causing a stack-allocated timer to appear in the timer list twice. This is pretty interesting considering that there is a call to remove the timer earlier in the call chain, although they are in different critical sections, so maybe something is happening between them.

I will do my best to reproduce this with a small application that runs on the SAMR21-xpro.

Also, apologies if this is user error, I am still getting the hang of RIOT.

Metadata

Metadata

Assignees

Labels

Area: timersArea: timer subsystemsType: bugThe issue reports a bug / The PR fixes a bug (including spelling errors)

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions