Discussion forum about ELOG
ELOG deamon stuck in find_thread_head(), posted by soren poulsen on Sat Apr 30 19:45:30 2011
    Re: ELOG deamon stuck in find_thread_head(), posted by Soren Poulsen on Tue May 3 17:35:57 2011
       Re: ELOG deamon stuck in find_thread_head(), posted by Soren Poulsen on Wed Jul 6 12:06:01 2011
          Re: ELOG deamon stuck in find_thread_head(), posted by David Pilgram on Wed Jul 6 12:36:33 2011
Message ID: 67087     Entry time: Wed Jul 6 12:36:33 2011     In reply to: 67086
Icon: Reply  Author: David Pilgram  Author Email: David.Pilgram@epost.org.uk 
Category: Bug report  OS: Linux  ELOG Version: 2.9.0-2413 
Subject: Re: ELOG deamon stuck in find_thread_head() 

Soren Poulsen wrote:

Soren Poulsen wrote:

soren poulsen wrote:

ELOG seems to enter a loop when you do certain operations on certain messages: I moved a message to a different logbook and the daemon just got stuck.

If I restart the daemon, the message was in fact moved: I can move it back to its original destination without problems.

I started the daemon under GDB and interrupted it with Ctrl-C when the process got stuck, which gave:

Program received signal SIGINT, Interrupt.
0x000000000040a968 in find_thread_head ()

I then made a core dump.

I put the files here: http://cern.ch/poulsen2/elog-error-report-110430.zip (they are too big to upload).

I run into the same problem in other circumstances, such as when opening some threads (maybe because they contain "Reply-to" references to non-existing messages), but I have problems reproducing this on the test installation.

I should maybe also submit the offending thread.

Soren

 

1. It appears that find_thread_head is sometimes called with message references that do not exist. That is not good.

I put in a little check like this, before testing whether the message has an "in_reply_to" reference:

The line:

if (lbs->el_index[i].in_reply_to)

becomes:

if (i < *lbs->n_el_index && lbs->el_index[i].in_reply_to)
 

2. The trouble started when I deleted a message in the middle of a thread, which left the thread badly "connected" (references to a deleted message).

3. Also, when a thread is badly connected, it is a problem moving messages to a different logbook. ELOG complains that it cannot access the message (with the invalid reference). But ELOG should ignore it, since the message was deleted.
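To illustrate points 1 to 3 together, here is a minimal sketch of a thread-head lookup that tolerates a missing start message, a deleted parent, and a reference cycle. It is not ELOG's actual code: the IndexEntry struct, find_position() and find_thread_head_safe() are hypothetical names made up for this example, and the real lbs->el_index entries carry more fields. The assumption is that in_reply_to holds the parent's message ID, with 0 meaning "no parent".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified index entry (assumption: not ELOG's real struct). */
typedef struct {
    int message_id;   /* ID of this entry */
    int in_reply_to;  /* ID of the parent entry, 0 if there is none */
} IndexEntry;

/* Return the position of message_id in the index, or -1 if it is absent. */
static int find_position(const IndexEntry *index, int n, int message_id)
{
    for (int i = 0; i < n; i++)
        if (index[i].message_id == message_id)
            return i;
    return -1;
}

/* Walk the in_reply_to chain upwards and return the ID of the thread head.
 * Returns -1 if the starting message itself does not exist.
 * A parent that is no longer in the index (deleted message) ends the walk,
 * so the last existing message is reported as the head.
 * The walk is capped at n hops, so a cyclic chain cannot loop forever. */
static int find_thread_head_safe(const IndexEntry *index, int n, int message_id)
{
    assert(index != NULL || n == 0);

    int pos = find_position(index, n, message_id);
    if (pos < 0)
        return -1;

    for (int hops = 0; hops < n; hops++) {
        int parent = index[pos].in_reply_to;
        if (parent == 0)
            break;                      /* reached a real thread head */
        int parent_pos = find_position(index, n, parent);
        if (parent_pos < 0)
            break;                      /* dangling reference: stop here */
        pos = parent_pos;
    }
    return index[pos].message_id;
}
```

With an index holding messages 10, 11 (reply to 10), 13 (reply to the deleted 12) and 14 (reply to 13), the head of 14 comes out as 13 rather than the walk getting stuck. If the chain is cyclic, the value returned is just wherever the hop limit stopped, so a caller would probably want to log that case.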

 

Soren

It would be nice to have this corrected. The problem occurs when you select (read) a message which refers to another message via "In-reply-to", and the referenced message does not exist.

Soren

Soren, you're not alone!  I've had similar problems, as did Sara Vanini (elog:67077).

 

In my case, it is because the "move" or "copy" function does not move all the messages in very long threads.   To be more precise, elog will crash in the attempt to move a long thread - say over 40 replies, I don't know for sure.  Sometimes it has already moved the entire thread before it crashes, sometimes not.  I'd not flagged it up as an issue because I could not be sure it was not a memory issue with the old (>12 years) Linux box I was using earlier this year, but it still happens on this new (to me, only 3 years old) Linux box.

 

Whether it is the number of entries, the total memory size of the thread or some combination, I don't know.

 

I've found that in the "move" case, it has not deleted all the messages from the donor thread, so that a semi-thread is still hidden there.  Should one by chance select that semi-thread (because it is found during a search), elog goes into an infinite loop, which requires a reboot of this Linux box to fix.   Pinning the issue down to the missing entry referenced by an "In reply to:" line certainly explains this part of the problem.  Of course, deleting one entry within a thread, or other adjustments, will do the same thing, just as you (Soren) point out above.

 

If it happens to me, I go into the yymmdda.log files and fix the problem by hand, be it deleting the entries of the semi-thread, moving missing entries across from the donor to the acceptor logbook, or adjusting the "Reply:" and "In reply to:" lines, but that is quite a time-consuming and error-prone exercise.
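Before editing by hand, it may help to list which "In reply to:" references point at nothing. The following is a rough sketch only, under two assumptions that should be checked against your own log files: that each entry starts with a line of the form "$@MID@$: <id>", and that the parent is given on a line starting with "In reply to:". The function names collect_ids() and find_dangling_parents() are made up for this example. It works on text already read into memory, so all of a logbook's yymmdda.log files would need to be concatenated first; otherwise a parent stored in another file is reported as missing.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IDS 4096   /* arbitrary cap for this sketch */

/* Collect the number that follows `prefix` on every line starting with it.
 * Returns how many numbers were stored in `out` (at most max_out). */
static int collect_ids(const char *text, const char *prefix, int *out, int max_out)
{
    assert(text != NULL && prefix != NULL && out != NULL);

    size_t plen = strlen(prefix);
    int count = 0;
    const char *line = text;

    while (*line != '\0') {
        if (count < max_out && strncmp(line, prefix, plen) == 0) {
            const char *p = line + plen;
            while (*p == ' ' || *p == '\t')   /* skip blanks, stay on this line */
                p++;
            if (isdigit((unsigned char)*p))
                out[count++] = (int)strtol(p, NULL, 10);
        }
        const char *nl = strchr(line, '\n');
        if (nl == NULL)
            break;
        line = nl + 1;
    }
    return count;
}

/* Store every parent ID that has no matching entry ID in `dangling`.
 * Returns the number of dangling references found (at most max_dangling). */
static int find_dangling_parents(const char *text, int *dangling, int max_dangling)
{
    int ids[MAX_IDS], parents[MAX_IDS];
    int n_ids = collect_ids(text, "$@MID@$:", ids, MAX_IDS);          /* assumed entry marker */
    int n_parents = collect_ids(text, "In reply to:", parents, MAX_IDS);
    int n_dangling = 0;

    for (int i = 0; i < n_parents && n_dangling < max_dangling; i++) {
        int found = 0;
        for (int j = 0; j < n_ids; j++) {
            if (ids[j] == parents[i]) {
                found = 1;
                break;
            }
        }
        if (!found)
            dangling[n_dangling++] = parents[i];
    }
    return n_dangling;
}
```

Note that a line in a message body which happens to begin with "In reply to:" would be counted too, so the output is a list of candidates to inspect, not something to act on blindly.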
