ID |
Date |
Icon |
Author |
Author Email |
Category |
OS |
ELOG Version |
Subject |
66449
|
Mon Jul 20 09:26:41 2009 |
| lance | lance1.hayward@yahoo.com | Bug report | Windows | 2.7.6 | Elog Crashes | Stefan,
Our log is crashing on a regular basis and I have been unable to identify the reason. A crash in itself is not a major problem; however, when you try to stop the daemon from the services it fails to stop. This means that the daemon cannot be restarted. The only way out then is to start killing processes. This is not something I want inexperienced guys to do.
Looking at the processes, it looks like elogd.exe is still running and doesn't die when you try to stop the daemon service.
I checked the times it was crashing against events in the elog logfiles, but there was nothing actually happening at these times. It seems something is causing it to just hang.
I have attached the event log files for you; if you have any ideas I would appreciate them.
I have not run the log in verbose mode, as I have thus far been unable to redirect the screen output in order to see what is happening. If you have any tips on how to redirect the output, I would save it to a file for offline analysis. Our log is used 24/7, so it is critical that it be kept running; if I were to run it with the -v option, the guys would have to restart it and I would lose the data.
Any help is much appreciated.
Regards,
Lance |
Attachment 1: Elog_crash_events.doc
|
66450
|
Mon Jul 20 10:30:44 2009 |
| Stefan Ritt | stefan.ritt@psi.ch | Bug report | Windows | 2.7.6 | Re: Elog Crashes |
Using the Windows event log won't help much. I guess in your case elogd is driven into some kind of endless loop (does the CPU go to 100%?). There are only two ways to tackle this:
1) You find a way to reliably reproduce this problem and tell me how to do it. When I can reproduce it here, I can fix it easily.
2) You do debugging yourself. Under Linux this is simple, since you have debuggers on most systems. Under Windows however, you first have to install the Visual C++ development environment. I believe there is a free version (Express?) which you can use. You then run elogd under the debugger, and when it hangs you investigate where. This needs some basic knowledge about C++ development and I'm not sure if you have this, but maybe you can find someone around you who does. |
66455
|
Wed Jul 22 12:12:37 2009 |
| T. Ribbrock | emgaron+elog@ribbrock.org | Bug report | Linux | 2.7.6r2233 | Crashes when editing entries | For some odd reason, we have been experiencing frequent crashes of elogd over the past few days. It had been working fine so far, but more or less out of the blue it became rather unreliable. The current configuration is installed on two servers, one running 2.7.5.-r2174 on ClarkConnect 4 and one running 2.7.6-r2233 on Debian 4.0 - both show the same problem. Each of them has an "active" group with four logbooks and an "archive" group with three logbooks. In the "active" group, there are two logbooks that share the same index (using Subdir=...) and it looks like the crashes occur most of the time in these, though that's just a hunch so far. Also, most of the crashes seem to happen when submitting an entry that has been edited. Actually, submitting a modified entry has always been strange in our logbooks: When we hit submit, we get a pop-up window asking "Submit modified entry?". When choosing "OK", the entry that has been edited is duplicated. When choosing "Cancel", it is submitted correctly.
I've been running elogd like this (to get more info):
elogd -v > elog-2233-2.log 2>&1
The last entry I get in the log when elogd crashes is:
Same index as logbook Machine Log
elogd: src/elogd.c:727: xfree: Assertion `*((unsigned int *) (temp - 4)) == 0xdeadc0de' failed.
Received unknown cookie "wikidb_mw__session"
Received unknown cookie "wikidb_mw__session"
I did actually make a few changes to the configuration before we noticed the crashes: I added one extra attribute and a few more conditionals.
Any additional information you need: Just let me know.
Regards,
Thomas |
66456
|
Wed Jul 22 12:15:56 2009 |
| T. Ribbrock | emgaron+elog@ribbrock.org | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries |
Forgot to mention: I've also seen error messages like this upon a crash:
*** glibc detected *** corrupted double-linked list: 0x0911bbc0 ***
Regards,
Thomas |
66457
|
Wed Jul 22 12:46:36 2009 |
| Stefan Ritt | stefan.ritt@psi.ch | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries |
Well, I need to reproduce your problem in order to fix it. The failed assertion you get is due to some internal write beyond array boundaries, but I have no clue which part of the code does this. It might be related to the fact that you use the same index (via Subdir=...) for two logbooks. In this scenario, you are only allowed to modify/add entries in one logbook, not the other. The other one may only be used for reading. And even then it's not guaranteed that new entries show up in the second logbook immediately; you might have to restart the server in order to re-index the logbooks. Internally, the daemon does not know that two logbooks are "the same", and one instance will not notice if the other instance modifies the data "under its feet". Can you try to give up the double logbooks and see if the problem goes away? |
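The assertion in the log above compares the word stored just before the freed pointer with 0xdeadc0de. The following is a minimal sketch of how such a guard-word check can catch writes beyond array boundaries. The memory layout is an assumption inferred only from that assertion message, and `guard_alloc`, `guard_intact` and `guard_free` are hypothetical names; this is not the elogd source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_MAGIC   0xdeadc0deu
#define GUARD_HEADER  8u   /* assumed layout: 4 bytes size + 4 bytes magic */
#define GUARD_TRAILER 4u   /* assumed layout: 4 bytes magic after the data */

/* Hypothetical helper: allocate `bytes` of user data with a magic word
   directly before and directly after it. Returns NULL on failure. */
void *guard_alloc(size_t bytes)
{
   uint32_t size = (uint32_t) bytes;
   uint32_t magic = GUARD_MAGIC;
   unsigned char *raw = malloc(GUARD_HEADER + bytes + GUARD_TRAILER);

   if (raw == NULL)
      return NULL;

   memcpy(raw, &size, 4);                          /* remember the data size */
   memcpy(raw + 4, &magic, 4);                     /* front guard            */
   memcpy(raw + GUARD_HEADER + bytes, &magic, 4);  /* back guard             */
   return raw + GUARD_HEADER;
}

/* Returns 1 if both guard words are untouched, 0 if one was overwritten. */
int guard_intact(const void *ptr)
{
   const unsigned char *p = ptr;
   uint32_t size, front, back;

   memcpy(&front, p - 4, 4);
   if (front != GUARD_MAGIC)
      return 0;              /* the stored size can no longer be trusted */

   memcpy(&size, p - 8, 4);
   memcpy(&back, p + size, 4);
   return back == GUARD_MAGIC;
}

/* Free a block from guard_alloc; aborts if a guard word was overwritten,
   which is the same kind of failure as the assertion in the log. */
void guard_free(void *ptr)
{
   if (ptr == NULL)
      return;
   assert(guard_intact(ptr));
   free((unsigned char *) ptr - GUARD_HEADER);
}
```

With a scheme like this, the overrun is only detected when the damaged block is freed, not when the bad write happens. That would fit the observation that the assertion message does not point to the part of the code that wrote out of bounds.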
66458
|
Wed Jul 22 15:35:57 2009 |
| T. Ribbrock | emgaron+elog@ribbrock.org | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries |
Hm... I have implemented this set-up originally based on this: https://midas.psi.ch/elogs/Forum/66024. The "double logbook" is a machine log with a "software" (OS installations etc.) and a "hardware" (CPU, RAM, etc.) view. The "hardware" view has the "Subdir=" statement. Thinking about it, the "software" view is used most - I have several automatic scripts running which update the contents whenever a machine gets updated, re-installed and so on. The hardware part does not see much editing - until this week, when we decided to start an inventory... So, it's quite possible that we never noticed that this was iffy. For the rest of our goals, this set-up has worked fantastically - never noticed any problem with one view not updating, actually. Also, I do not remember any crashes with the other, single logbooks.
What I've done for now is to ask all team members to use only the software part (the one without the Subdir statement) to actually change content (the entry masks are the same in both versions) and use the hardware part just for viewing. I'll report back as soon as I get some feedback.
Nonetheless, given that this set-up has been a great help for us - if you ever get the chance to make this work (even) better, I'd be most grateful.
Regards,
Thomas |
66459
|
Wed Jul 22 16:30:48 2009 |
| Stefan Ritt | stefan.ritt@psi.ch | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries |
T. Ribbrock wrote: |
Nonetheless, given that this set-up has been a great help for us - if you ever get the chance to make this work (even) better, I'd be most grateful.
|
Well, for that I have to reproduce the problem. The best approach would be to strip your set-up down to the bare minimum that still reproduces the crash reliably. Then zip everything and send it to me, and tell me what I have to edit and submit in order to trigger the crash. Once I can reproduce it, I can fix it. |
66460
|
Wed Jul 22 16:52:13 2009 |
| T. Ribbrock | emgaron+elog@ribbrock.org | Bug report | Linux | 2.7.6r2233 | Re: Crashes when editing entries |
Thank you - I shall look into that, though it'll probably take a while to prepare it. |
|