Discussion forum about ELOG
elogd hangs, posted by Alan Grant on Fri Aug 18 05:28:16 2017
    Re: elogd hangs, posted by Stefan Ritt on Fri Aug 18 08:59:08 2017
       Re: elogd hangs, posted by David Pilgram on Fri Aug 18 12:40:21 2017
          Re: elogd hangs, posted by Alan Grant on Fri Aug 18 15:10:15 2017
          Re: elogd hangs, posted by Alan Grant on Fri Aug 18 15:15:57 2017
             Re: elogd hangs, posted by David Pilgram on Fri Aug 18 21:16:12 2017
       Re: elogd hangs, posted by Alan Grant on Fri Aug 18 14:49:42 2017
          Re: elogd hangs, posted by Stefan Ritt on Fri Aug 18 16:26:14 2017
             Re: elogd hangs, posted by Alan Grant on Fri Aug 18 16:39:21 2017
Message ID: 68657     Entry time: Fri Aug 18 15:10:15 2017     In reply to: 68655
Icon: Reply  Author: Alan Grant  Author Email: agrant@winnipeg.ca 
Category: Question  OS: Windows  ELOG Version: 3.1.2 
Subject: Re: elogd hangs 

Yes, I think I recall the Forum incident you're talking about from previous searches I've done on hanging. However, so far I haven't used Reply To's in this elog instance. Nevertheless, you explained it very well, and these are good points to keep in mind should I ever use them. Thank you, David.

David Pilgram wrote:

I have experience of elog hanging (under Linux).  I'll describe my situation, although it may not apply to you.  I still use elog 2.9.2, but I am unaware of this issue ever being resolved, although I have mentioned it in the past (possibly because I'm one of the few who has this situation).  I certainly recall that another person had this problem, and my reply on this forum solved it.  The cause is the following:

1.  A thread with a large number of replies (something over 40, I think).

2.  This long thread is deleted from the first entry.  This will crash elog.

3.  Once elog is restarted, the later entries of the deleted thread (which survived the deletion attempt when elog crashed) are accessed.  This will cause elog to go into an endless loop and hang.  Until I learnt better, I had to reboot the computer.  Under Linux, kill -9 (process) does the job, but kill (process) does not.

The problem lies with the first entry that survived the attempted deletion.  Its entry in the yymmdda.log file has an "In reply to" line referring to an entry that has now been deleted.  Manually editing the yymmdda.log file to remove that line does the trick, and then the surviving entries can be accessed and deleted.
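The manual edit described above could be scripted roughly as follows. This is only a sketch, not part of elog: the function name strip_in_reply_to is made up, and it assumes that each entry in a yymmdda.log file starts with a "$@MID@$: <number>" line, followed by header lines such as "In reply to: <number>", and that a line of "=" characters separates the header from the body. Check those assumptions against your own files first, and it is presumably safest to try this on a copy of the file with elogd stopped.

```python
def strip_in_reply_to(log_text, entry_id):
    """Return log_text without the "In reply to:" header line of one entry.

    Assumed layout (verify against a real yymmdda.log file):
      $@MID@$: <entry number>
      <header lines, possibly including "In reply to: <number>">
      ======================================== (header/body separator)
      <body text>
    """
    kept = []
    in_target_header = False
    for line in log_text.splitlines(keepends=True):
        if line.startswith("$@MID@$:"):
            # A new entry begins; is it the one we want to repair?
            in_target_header = line.split(":", 1)[1].strip() == str(entry_id)
        elif line.startswith("====="):
            # End of the header block; body text is never modified.
            in_target_header = False
        elif in_target_header and line.startswith("In reply to:"):
            continue  # drop the dangling reference
        kept.append(line)
    return "".join(kept)
```

To use it, read the file into a string, call strip_in_reply_to(text, 1234) with the number of the surviving entry, and write the result to a new file so the original stays untouched for comparison.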

A good workaround, if you are about to delete a long thread, is to delete it in sections, starting at the end.  It is useful to note the entry number, or some other way to find the thread again, after the last section is deleted, as of course it will now be back in with the even older entries.  Alternatively, keep two browser tabs open on the same thread.

If you want to move a long thread to another logbook, avoid the problem by copying the thread and then doing the deletion in stages.  Moving a long thread causes the same crash and hang: the copying part works fine, but the deletion part is the problem.

You don't need a large number of replies to an entry to cause the hang under controlled conditions.  Editing the yymmdda.log file of a new entry to add an "In reply to" line that refers to an earlier entry number that does not exist is enough to cause the problem when you try to access the thread.

If this is the cause of your issue, the difficulty is finding the orphan thread that is causing the hang, especially after all this time.  Also, you may have more than one orphan thread.  Even though I am aware of the problem, I do occasionally find orphan threads in my logbooks.  In my case I use the ticketing system, and searching by ticket number will find an orphan thread without hanging the computer, but if you then click on any entry found, it hangs.

 

There is a related issue, which I think I have now resolved.  If the entry named in the "Reply to" field in the yymmdda.log file does not exist, and it is a later entry (not an earlier one, as above), elog will show a duplicate entry, always in bold and with entry number 0, in the listings.  This entry is an artifact of the listings, not a real entry in a yymmdda.log file.  Again, finding the rogue entry is the tricky bit.
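Finding the rogue entries could be done offline, without clicking on anything in elog. The sketch below is only an illustration, not an elog tool: the function name find_dangling_links is made up, and it assumes the same file layout as the yymmdda.log files discussed above, namely a "$@MID@$: <number>" line starting each entry, header lines "In reply to: <number>" and "Reply to: <number>, <number>, ...", and a line of "=" characters closing the header. It collects every entry number in a logbook directory and reports references to numbers that do not exist.

```python
import re
from pathlib import Path


def find_dangling_links(logbook_dir):
    """List (file name, entry id, field, missing id) for broken thread links.

    The header field names and the "$@MID@$:" marker are assumptions about
    the log file layout; verify them against a real file before relying on it.
    """
    known_ids = set()
    links = []
    for path in sorted(Path(logbook_dir).rglob("*.log")):
        entry_id = None
        in_header = False
        for line in path.read_text(errors="replace").splitlines():
            if line.startswith("$@MID@$:"):
                match = re.search(r"\d+", line[len("$@MID@$:"):])
                entry_id = int(match.group()) if match else None
                in_header = entry_id is not None
                if in_header:
                    known_ids.add(entry_id)
            elif line.startswith("====="):
                in_header = False  # body text follows; ignore it
            elif in_header:
                for field in ("In reply to:", "Reply to:"):
                    if line.startswith(field):
                        for target in re.findall(r"\d+", line[len(field):]):
                            links.append(
                                (path.name, entry_id, field.rstrip(":"), int(target))
                            )
    # A link is dangling when its target was never seen as an entry.
    return [link for link in links if link[3] not in known_ids]
```

Calling find_dangling_links on a logbook directory returns an empty list when every reference resolves. Each tuple otherwise names the file and entry to look at, which is where the "In reply to" or "Reply to" line would need editing by hand.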

Stefan Ritt wrote:

I have to figure out where elog hangs. I guess it must be some kind of endless loop, triggered by some corrupt data in one of the elog entries. Under Linux this is fairly simple (just run elogd under the gdb debugger, wait until it hangs, then press Ctrl-C and enter "where" to see a full stack dump showing where elogd is currently executing). Under Windows this is more difficult, since you need Visual C++ from Microsoft to do the debugging. One thing you can do without VC, however, is to check whether elogd is consuming 100% of the CPU time, which would indicate an endless loop.

Stefan

Alan Grant wrote:

I have a very long-standing problem with elog, across the last few versions, where the service hangs almost daily. I cannot even restart elogd; the restart just hangs. Clients see "Page not Found". I can only get the service reinitialized by rebooting the VM. I have ELOG verbose logging on, plus a number of external triage monitors running, but nothing is yielding clues beyond the precise time the hang occurs. Aside from the config and log files, what else can I provide to assist you, and what other triage measures can you suggest I try? FYI, there can be up to 20 users at one time doing searches (not updates), and I've trimmed the depth of log files that can be searched so that the CPU/service doesn't bog down, but that hasn't helped either. Inserts happen in the background using the elog client app (about 2 or 3 inserts per batch at sporadic times).

 

 

 
