Discussion forum about ELOG
ID | Date | Icon | Author | Author Email | Category | OS | ELOG Version | Subject
66449 | Mon Jul 20 09:26:41 2009 | Question | lance | lance1.hayward@yahoo.com | Bug report | Windows | 2.7.6 | Elog Crashes
Stefan,
 
Our log is crashing on a regular basis and I have been unable to identify the reason. The crash itself is not a major problem; however, when you try to stop the daemon from the Windows services, it fails to stop, which means the daemon cannot be restarted. The only way out is then to start killing processes, and that is not something I want inexperienced guys to do.
 
Looking at the processes, it looks like elogd.exe is still running and doesn't die when you try to stop the daemon service.
 
I compared the times of the crashes with the events in the elog logfiles, but nothing was actually happening at those times. It seems something is causing it to just hang.
 
I have attached the event log files for you; if you have any ideas, I would appreciate them.
 
I have not run the log in verbose mode, as I have so far been unable to redirect the screen output in order to see what is happening. If you have any tips on how to redirect the output, I would save it to a file for offline analysis. Our log is used 24/7, so it is critical that it be kept running; if I were to run it with the -v option, the guys would have to restart it and I would lose the data.
 
Any help is much appreciated
 
 
Regards,
 
Lance
Attachment 1: Elog_crash_events.doc
66450 | Mon Jul 20 10:30:44 2009 | Reply | Stefan Ritt | stefan.ritt@psi.ch | Bug report | Windows | 2.7.6 | Re: Elog Crashes

Using the Windows event log won't help much. I guess in your case elogd is driven into some kind of endless loop (does the CPU go to 100%?). There are only two ways to tackle this:

1) You find a way to reliably reproduce this problem and tell me how to do it. Once I can reproduce it here, I can fix it easily.

2) You do the debugging yourself. Under Linux this is simple, since you have debuggers on most systems. Under Windows, however, you first have to install the Visual C++ development environment. I believe there is a free version (Express?) which you can use. You then run elogd under the debugger, and when it hangs you investigate where. This needs some basic knowledge of C++ development, and I'm not sure if you have this, but maybe you can find someone around you who does.

66455 | Wed Jul 22 12:12:37 2009 | Warning | T. Ribbrock | emgaron+elog@ribbrock.org | Bug report | Linux | 2.7.6r2233 | Crashes when editing entries

For some odd reason, we have been experiencing frequent crashes of elogd over the past few days. It had been working fine so far, but more or less out of the blue it became rather unreliable. The current configuration is installed on two servers, one running 2.7.5-r2174 on ClarkConnect 4 and one running 2.7.6-r2233 on Debian 4.0; both show the same problem. Each of them has an "active" group with four logbooks and an "archive" group with three logbooks. In the "active" group, there are two logbooks that share the same index (using Subdir=...), and it looks like the crashes occur most of the time in these, though that's just a hunch so far. Also, most of the crashes seem to happen when submitting an entry that has been edited. Actually, submitting a modified entry has always been strange in our logbooks: when we hit submit, we get a pop-up window asking "Submit modified entry?". When choosing "OK", the entry that has been edited is duplicated. When choosing "Cancel", it is submitted correctly.

I've been running elogd like this (to get more info):

elogd -v > elog-2233-2.log 2>&1

The last entry I get in the log when elogd crashes is:

  Same index as logbook Machine Log
elogd: src/elogd.c:727: xfree: Assertion `*((unsigned int *) (temp - 4)) == 0xdeadc0de' failed.
Received unknown cookie "wikidb_mw__session"
Received unknown cookie "wikidb_mw__session"

I did actually make a few changes to the configuration before we noticed the crashes: I added one extra attribute and a few more conditionals.

 

Any additional information you need: Just let me know.

Regards,

Thomas

66456 | Wed Jul 22 12:15:56 2009 | Reply | T. Ribbrock | emgaron+elog@ribbrock.org | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries

Forgot to mention: I've also seen error messages like this upon a crash:

*** glibc detected *** corrupted double-linked list: 0x0911bbc0 ***

Regards,

Thomas

66457 | Wed Jul 22 12:46:36 2009 | Reply | Stefan Ritt | stefan.ritt@psi.ch | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries

Well, I need to reproduce your problem in order to fix it. The failed assertion you get is due to some internal writing beyond array boundaries, but I have no clue which part of the code causes this. It might be related to the fact that you use the same index (via Subdir=...) for two logbooks. In this scenario, you are only allowed to modify/add entries in one logbook, not the other. The other one may only be used for reading. And even then it's not guaranteed that new entries show up in the second logbook immediately; you might have to restart the server in order to re-index the logbooks. Internally, the daemon does not know that two logbooks are "the same", and one instance will not realize it if the other instance modifies the data "below its feet". Can you try giving up the double logbooks and see if the problem goes away?
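
The assertion in the log above compares a word stored just before the freed block against 0xdeadc0de, which suggests that each allocation is framed by marker words so that writes past an array boundary are caught when the block is freed. The following is a minimal C sketch of that general technique, not the actual elogd code: the names guarded_alloc, guarded_check and guarded_free, the stored length field and the exact memory layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_WORD  0xdeadc0deu  /* marker value taken from the assertion text */
#define HEADER_SIZE 8            /* assumed: 4-byte length + 4-byte leading guard */

/* Hypothetical helper: allocate `size` usable bytes framed by guard words.
   Assumed layout: [length][GUARD][ user data ... ][GUARD] */
static void *guarded_alloc(uint32_t size)
{
    const uint32_t guard = GUARD_WORD;
    unsigned char *raw = malloc((size_t)size + HEADER_SIZE + sizeof guard);

    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);                         /* remember the length */
    memcpy(raw + 4, &guard, sizeof guard);                   /* guard before the data */
    memcpy(raw + HEADER_SIZE + size, &guard, sizeof guard);  /* guard after the data */
    return raw + HEADER_SIZE;
}

/* Returns 1 if both guards are intact, 0 if something wrote outside the block. */
static int guarded_check(const void *ptr)
{
    const unsigned char *user = ptr;
    uint32_t size, front, back;

    memcpy(&front, user - 4, sizeof front);
    if (front != GUARD_WORD)
        return 0;                 /* underrun: the stored length is no longer trusted */
    memcpy(&size, user - HEADER_SIZE, sizeof size);
    memcpy(&back, user + size, sizeof back);
    return back == GUARD_WORD;    /* overrun if the trailing guard changed */
}

/* Free a guarded block; aborts via assert() if a guard was overwritten. */
static void guarded_free(void *ptr)
{
    if (ptr == NULL)
        return;
    assert(guarded_check(ptr));
    free((unsigned char *)ptr - HEADER_SIZE);
}
```

In this sketch, writing even one byte past the end of the buffer (or before its start) changes a guard word, so the damage is noticed when the block is freed. By then nothing indicates which code did the writing, which fits the difficulty described above.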

66458 | Wed Jul 22 15:35:57 2009 | Reply | T. Ribbrock | emgaron+elog@ribbrock.org | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries

Hm... I originally implemented this set-up based on this: https://midas.psi.ch/elogs/Forum/66024. The "double logbook" is a machine log with a "software" (OS installations etc.) and a "hardware" (CPU, RAM, etc.) view. The "hardware" view has the "Subdir=" statement. Thinking about it, the "software" view is used most: I have several automatic scripts running which update the contents whenever a machine gets updated, re-installed and so on. The hardware part did not see much editing until this week, when we decided to start an inventory... So it's quite possible that we never noticed that this was iffy. For the rest of our goals, this set-up has worked fantastically; we never noticed any problem with one view not updating, actually. Also, I do not remember any crashes with the other, single logbooks.

What I've done for now is to ask all team members to use only the software part (the one without the Subdir statement) to actually change content (the entry masks are the same in both versions) and use the hardware part just for viewing. I'll report back as soon as I get some feedback.

Nonetheless, given that this set-up has been a great help for us - if you ever get the chance to make this work (even) better, I'd be most grateful.

Regards,

Thomas
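
Stefan's point that each logbook keeps its own view of a shared directory can be pictured with a toy model in C. Everything here is hypothetical (struct shared_dir, struct logbook, logbook_submit, logbook_reindex); it does not reflect elogd's real data structures and only illustrates why an entry submitted through one logbook stays invisible to the other until that one re-indexes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTRIES 64

/* Hypothetical stand-in for the entry files in one shared data directory. */
struct shared_dir {
    int    ids[MAX_ENTRIES];
    size_t count;
};

/* Hypothetical logbook: it keeps a private in-memory index of the directory. */
struct logbook {
    struct shared_dir *dir;
    int    index[MAX_ENTRIES];
    size_t indexed;
};

/* Rebuild the private index from the directory, as a server restart would. */
static void logbook_reindex(struct logbook *lb)
{
    for (size_t i = 0; i < lb->dir->count; i++)
        lb->index[i] = lb->dir->ids[i];
    lb->indexed = lb->dir->count;
}

/* Add an entry through this logbook. Only this logbook's index is updated;
   any other logbook pointing at the same directory is not told about it. */
static int logbook_submit(struct logbook *lb, int id)
{
    assert(lb->indexed <= lb->dir->count);
    if (lb->dir->count >= MAX_ENTRIES)
        return -1;
    lb->dir->ids[lb->dir->count++] = id;
    lb->index[lb->indexed++] = id;
    return 0;
}
```

With two struct logbook values sharing one struct shared_dir, a submit through the first leaves the second's indexed count unchanged until logbook_reindex is called on it. Restricting all writes to one logbook, as in the workaround above, keeps at least the writing side consistent in this model.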

66459 | Wed Jul 22 16:30:48 2009 | Reply | Stefan Ritt | stefan.ritt@psi.ch | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries

T. Ribbrock wrote:

Nonetheless, given that this set-up has been a great help for us - if you ever get the chance to make this work (even) better, I'd be most grateful.

Well, for that I have to reproduce the problem. So it would be best if you stripped it down to the bare minimum needed to reproduce this reliably. Then zip everything and send it to me, and tell me what I have to edit and submit in order to trigger the crash. Once I can reproduce it, I can fix it.

66460 | Wed Jul 22 16:52:13 2009 | Reply | T. Ribbrock | emgaron+elog@ribbrock.org | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries

Thank you - I shall look into that, though it'll probably take a while to prepare it.
