ID |
Date |
Icon |
Author |
Author Email |
Category |
OS |
ELOG Version |
Subject |
67079
|
Thu Jun 2 20:20:19 2011 |
| David Pilgram | David.Pilgram@epost.org.uk | Bug report | Linux | 2.7 | Re: editor dosn't work |
Sara Vanini wrote: |
David Pilgram wrote: |
Sara Vanini wrote: |
Hi,
when I try to edit an entry of my ELOG, the editor window comes up blank, without the previous content of the entry, and it is not possible to write in it. It worked until yesterday, when ELOG tried to save a new entry but the disk was full, and ELOG got screwed up. I deleted the buggy entry and now I can display all the previous entries, but I cannot edit anymore... Please help!
Sara
|
I've a little experience of digging myself out of (in my case, self-induced) problems using ELOG. I'm also aware that I may be the least experienced/qualified user..
First: Archive your work directories. Then at least whatever you do from here, you've got the status quo to fall back on. Also, record anything you can remember (ID number, thread, etc) of the deleted entry/entries.
I've found that ELOG can hang in an infinite loop if it tries to find an entry that is no longer there - and that depends upon how you approach the point where the missing entry would be. ELOG's own delete works fine in normal circumstances. I'm talking about abnormal circumstances, for example when idiots (me) are playing around with the yymmdda.log files, or *possibly* if the disk is full, and you then try deleting the entry that caused the full disk problem. Whether that is what you are seeing, I cannot say at present.
However, to progress this: When you are stuck, unable to edit anything, in a[nother] terminal, try the process report
ps -A
two or three times, with a short interval between commands. (Or other switches if you know how to select to view the elogd process on your system). If elogd is using seconds of CPU time between each ps command, it's probably in an infinite loop. If you need to be sure, wait a minute and check again. If so, you'll have to stop the daemon, possibly requiring a computer reboot. In my experience, ELOG does not get stuck in an infinite loop when just indexing the pages when the daemon starts, but experts may well know better.
This may at least diagnose whether you cannot edit because ELOG is stuck in an infinite loop, or has some other cause.
If it is the infinite loop, the trick is to find which entry causes the loop without getting stuck in that loop next time around.
David Pilgram.
|
Hi David,
you have been very helpful indeed. The problem was the one you spotted: I had deleted the buggy entry by removing the ***.log file, and this caused the disaster. Now it is working again, thanks a lot, I have all my PhD thesis in ELOG....
Sara
|
Don't get too excited yet!
When you reply to an entry in ELOG, then some additional data is added to that original entry.
So, if you reply today (say 02/06/11) to an entry made yesterday, then you will find that the file 110602a.log has a large change (the new entry in full, plus elog extra codes), *and* an additional line added into 110601a.log. Deleting 110602a.log will not remove the line in 110601a.log, and that could still cause problems, that is, wandering into an infinite loop.
To save a lot of effort, I'll suggest that you (a) keep the back-ups up to date, and keep two (the latest and the one before that); (b) proceed carefully at least to start with. If you fall into the infinite loop again, then flag it up and I (or someone else) will be able to give further pointers.
David Pilgram.
So unless you are sure that |
67084
|
Mon Jun 20 05:31:31 2011 |
| Andreas Luedeke | andreas.luedeke@psi.ch | Bug report | Linux | 2.9.0-2414 | segmentation fault when "restrict edit" is used and "new" is allowed for anonymous users | The simple config file below produces a segmentation fault when elogd is started,
http://localhost/Test/?cmd=New
is opened in the browser and then e.g. "Entry" is switched to "Problem".
gdb shows the following output:
(gdb) run -c /usr/local/elog/elogd.cfg
Starting program: /usr/local/sbin/elogd -c /usr/local/elog/elogd.cfg
elogd 2.9.0 built Jun 20 2011, 04:57:23 revision 2414
Falling back to default group "elog"
Falling back to default user "elog"
FCKedit detected
Falling back to default group "elog"
Falling back to default user "elog"
ImageMagick detected
Indexing logbooks ... done
Server listening on port 80 ...
Program received signal SIGSEGV, Segmentation fault.
0x080a2940 in get_user_line (lbs=0xae3c1c0, user=0x0, password=0x0, full_name=0xbfca1690 "", email=0x0, email_notify=0x0,
last_logout=0x0, inactive=0x0) at src/elogd.c:24864
24864 if (!str[0] || !user[0])
|
Attachment 1: elogd.cfg
|
[global]
Authentication = File
Password file = passwd.txt
Restrict edit = 1
[Test]
Guest Menu commands = New, List, Login, Help
Guest List Menu commands = New, Login, Help
Comment = Test ELog
Attributes = Author, Entry, Title
List display = ID, Author, Entry, Title
Start page = ?rsort=When
# Author
Preset Author = $long_name
Locked Attributes = Author
# Entry
Options Entry = Problem{1}, Measurement{2}
|
67085
|
Mon Jun 20 17:53:58 2011 |
| Stefan Ritt | stefan.ritt@psi.ch | Bug report | Linux | 2.9.0-2414 | Re: segmentation fault when "restrict edit" is used and "new" is allowed for anonymous users | You are the first one allowing guests to enter new entries, so this probes a code path which was never used before. I fixed the crash in SVN revision 2416, but it might be that there are more issues with that. Just keep reporting. |
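The gdb frame in the report above shows get_user_line being entered with user=0x0, so the test `!user[0]` on line 24864 dereferences a NULL pointer. A minimal sketch of a NULL-safe emptiness check follows. The helper name `is_blank` is hypothetical and not part of elogd, and this is not necessarily how revision 2416 fixes the crash.

```c
#include <stddef.h>

/* Hypothetical helper (not part of elogd): returns 1 when s is NULL or
   points to an empty string, so callers never dereference a NULL pointer. */
int is_blank(const char *s)
{
   return s == NULL || s[0] == '\0';
}
```

With such a helper, a condition of the form `!str[0] || !user[0]` could be written as `is_blank(str) || is_blank(user)`, which stays safe for an anonymous (guest) request that carries no user name.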
67086
|
Wed Jul 6 12:06:01 2011 |
| Soren Poulsen | soren.poulsen@cern.ch | Bug report | Linux | 2.9.0-2413 | Re: ELOG deamon stuck in find_thread_head() |
Soren Poulsen wrote: |
soren poulsen wrote: |
ELOG seems to enter a loop when you do certain operations on certain messages: I moved a message to a different logbook and the daemon just got stuck.
If I restart the daemon, the message was in fact moved: I can move it back to its original destination without problems.
I started elogd in GDB and interrupted it with Ctrl-C when the process got stuck, and was told:
Program received signal SIGINT, Interrupt.
0x000000000040a968 in find_thread_head ()
I then made a core dump.
I put the files here: http://cern.ch/poulsen2/elog-error-report-110430.zip (they are too big to upload).
I get into the same problem in other circumstances, such as when opening some threads (maybe because they contain "Reply-to" references to non-existing messages), but I have problems reproducing this on the test installation.
I should maybe also submit the incriminating thread.
Soren
|
1. It appears that find_thread_head is sometimes called with message references that do not exist. That is not good.
I put in a little check like this before seeing if the message has an "in_reply_to" reference:
The line:
if (lbs->el_index[i].in_reply_to)
becomes:
if (i < *lbs->n_el_index && lbs->el_index[i].in_reply_to)
2. The trouble started when I deleted a message in the middle of a thread, which left the thread badly "connected" (references to a deleted message).
3. Also, when a thread is badly connected, it is a problem moving messages to a different logbook. ELOG complains that it cannot access the message (with the invalid reference). But ELOG should ignore it, since the message was deleted.
Soren
|
It would be nice to have this corrected. The problem occurs when you select (read) a message which refers to another message via "In-reply-to", and this message does not exist.
Soren |
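To illustrate the kind of guard discussed above, here is a small self-contained sketch of walking "in reply to" links up to the head of a thread. It is not the elogd implementation: the `IndexEntry` structure, `find_entry` and `thread_head` are simplified, hypothetical stand-ins. The walk stops when a parent is missing and takes at most n steps, so neither a deleted message nor a cyclic chain can keep it spinning.

```c
#include <stddef.h>

/* Simplified stand-in for an index entry (assumption: the real elogd
   structure has more fields). */
typedef struct {
   int message_id;
   int in_reply_to;   /* 0 when the entry is not a reply */
} IndexEntry;

/* Return the position of the entry with the given id, or -1 if missing. */
static int find_entry(const IndexEntry *idx, int n, int message_id)
{
   int i;
   for (i = 0; i < n; i++)
      if (idx[i].message_id == message_id)
         return i;
   return -1;
}

/* Follow "in reply to" links upwards from message_id.
   Returns the id of the topmost entry that can be reached, or -1 if
   message_id itself is not in the index. At most n steps are taken. */
int thread_head(const IndexEntry *idx, int n, int message_id)
{
   int steps, parent;
   int i = find_entry(idx, n, message_id);

   if (i < 0)
      return -1;

   for (steps = 0; steps < n; steps++) {
      if (idx[i].in_reply_to == 0)
         break;                     /* reached a real thread head */
      parent = find_entry(idx, n, idx[i].in_reply_to);
      if (parent < 0)
         break;                     /* parent was deleted: stop here */
      i = parent;
   }
   return idx[i].message_id;
}
```

Treating the entry with the missing parent as the head is a design choice of this sketch; reporting an error instead would be equally defensible.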
67087
|
Wed Jul 6 12:36:33 2011 |
| David Pilgram | David.Pilgram@epost.org.uk | Bug report | Linux | 2.9.0-2413 | Re: ELOG deamon stuck in find_thread_head() |
Soren, you're not alone! I've had similar problems, as did Sara Vanini (elog:67077).
In my case, it is because the "move" or "copy" function does not move all the messages in very long threads. To be more precise, elog will crash in the attempt to move a long thread - say over 40 replies, I don't know for sure. Sometimes it has already moved the entire thread before it crashes, sometimes not. I'd not flagged it up as an issue because I could not be sure it was not a memory issue with the old (>12 years) Linux box I was using earlier this year, but it still happens on this new (to me, only 3 years old) Linux box.
Whether it is the number of entries, the total memory size of the thread or some combination, I don't know.
I've found that in the "move" case, it has not deleted all the messages from the donor thread, so that there is a semi-thread still hidden there. Should one by chance select that semi-thread (because it is found during a search), elog goes into an infinite loop, which requires a reboot of this Linux box to fix. Certainly, pinning the issue down to the missing entry referenced by an "In reply to:" line explains this part of the issue. Of course, deletion of one entry within a thread, or other adjustments, will do the same thing, just as you (Soren) point out above.
If it happens to me, I go into the yymmdda.log files and fix the problem, be it deleting the entries of the semi-thread, moving missing entries across from the donor to the acceptor logbook, or adjusting the "Reply:" and "In reply to:" lines, but that is quite a time-consuming and error-prone exercise. |
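Before editing log files by hand, it can help to know which entries actually carry a reference to a message that no longer exists. The sketch below works on plain in-memory arrays of ids and does not parse the yymmdda.log files. The function names and the array layout are assumptions made for illustration only.

```c
#include <stddef.h>

/* Returns 1 if id occurs in ids[0..n-1], else 0. */
static int id_exists(const int *ids, size_t n, int id)
{
   size_t i;
   for (i = 0; i < n; i++)
      if (ids[i] == id)
         return 1;
   return 0;
}

/* Count entries whose "in reply to" id matches no existing message id.
   ids[i] is the id of entry i; in_reply_to[i] is 0 when entry i is not
   a reply. If out is not NULL, the ids of the affected entries are
   stored there, up to out_max of them. */
size_t dangling_replies(const int *ids, const int *in_reply_to, size_t n,
                        int *out, size_t out_max)
{
   size_t i, found = 0;

   for (i = 0; i < n; i++) {
      if (in_reply_to[i] == 0 || id_exists(ids, n, in_reply_to[i]))
         continue;
      if (out != NULL && found < out_max)
         out[found] = ids[i];
      found++;
   }
   return found;
}
```

For a logbook holding entries 1, 2, 4 and 5, where entry 4 replies to the deleted entry 3, the function reports one dangling reference and names entry 4 as the one to repair.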
67102
|
Mon Aug 15 11:36:02 2011 |
| Kester Habermann | kester.habermann@gmail.com | Bug report | Other | 2.9.0 | SEGV after upgrade from 2.7.8 to 2.9.0 | Hello,
We've been using ELOG 2.6.5 to 2.7.8 for 4 years without any major problems.
Recently we upgraded to version 2.9.0, and since then the daemon has frequently crashed with SEGV.
I've attached debugging output from one of the times ELOG crashed. We've had many crashes,
and it was a different logbook each time. Platform is Solaris 10 5/08 on SPARC.
Has anyone else experienced problems with 2.9.0?
Best Regards
Kester
|
Attachment 1: elog-2.9.0-dbx.txt
|
signal SEGV (no mapping at the fault address) in show_elog_list at line 19781 in file "elogd.c"
19781 message_id = msg_list[index].lbs->el_index[msg_list[index].index].message_id;
(dbx)
(dbx) list
19781 message_id = msg_list[index].lbs->el_index[msg_list[index].index].message_id;
19782
19783 if (filtering) {
19784 status = el_retrieve(msg_list[index].lbs, message_id, date, attr_list, attrib, lbs->n_attr, text,
19785 &size, in_reply_to, reply_to, attachment, encoding, locked_by);
19786 if (status != EL_SUCCESS)
19787 break;
19788
19789 /* apply filter for attributes */
19790 for (i = 0; i < lbs->n_attr; i++) {
(dbx) print index
index = 0
(dbx) where
=>[1] show_elog_list(lbs = 0x1180200, past_n = 0, last_n = 0, page_n = 0, default_page = 1, info = (nil)), line 19781 in "elogd.c"
[2] interprete(lbook = 0xffbd89f8 "Galileo-Coord", path = 0xffbd8648 ""), line 27213 in "elogd.c"
[3] decode_get(logbook = 0xffbd89f8 "Galileo-Coord", string = 0xffbfe896 ""), line 27253 in "elogd.c"
[4] process_http_request(request = 0x13a4eb8 "GET /Galileo-Coord/", i_conn = 1), line 28001 in "elogd.c"
[5] server_loop(), line 28926 in "elogd.c"
[6] main(argc = 5, argv = 0xffbffb8c), line 29947 in "elogd.c"
(dbx) print n_msg
n_msg = 49
(dbx) print *msg_list
*msg_list = {
lbs = 0x1195dd0
index = 1667786092
string = "\001\017��-D"
number = 0
in_reply_to = 0
}
(dbx) print msg_list[index].lbs->el_index[msg_list[index].index].message_id
dbx: cannot access address 0x18da195b00
(dbx) print ms(dbx) [index].lbs->el_index[msg_list[index].index].message_id
(dbx) print msg_list[index].lbs
msg_list[index].lbs = 0x1195dd0
(dbx) print msg_list[index].lbs->el_index
msg_list[index].lbs->el_index = (nil)
(dbx) pr(dbx) g_list[index].lbs->el_index
(dbx) print *msg_list[index].lbs
*msg_list[index].lbs = {
name = ""
name_enc = ""
data_dir = ""
top_group = ""
el_index = (nil)
n_el_index = (nil)
n_attr = 0
pwd_xml_tree = (nil)
}
(dbx) print msg_list[1].lbs
msg_list[1].lbs = (nil)
(dbx) print msg_list[2].lbs
msg_list[2].lbs = (nil)
(dbx) print msg_list[3].lbs
msg_list[3].lbs = (nil)
(dbx) exit
|
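The dbx session above shows msg_list[0] pointing at a logbook whose el_index is nil, together with an index value far outside any plausible range. The following sketch shows a defensive lookup that refuses such a reference instead of dereferencing it. The three structures are simplified, hypothetical stand-ins for the elogd types, and the check only turns the crash into a detectable error; it does not explain why the list got corrupted.

```c
#include <stddef.h>

/* Simplified stand-ins (assumptions) for the structures seen in the dbx output. */
typedef struct { int message_id; } EntryIndex;
typedef struct { EntryIndex *el_index; int *n_el_index; } Logbook;
typedef struct { Logbook *lbs; int index; } MsgRef;

/* Store the message id referenced by ref in *message_id and return 1,
   or return 0 when any pointer is NULL or the index is out of range. */
int msg_ref_id(const MsgRef *ref, int *message_id)
{
   if (ref == NULL || message_id == NULL || ref->lbs == NULL ||
       ref->lbs->el_index == NULL || ref->lbs->n_el_index == NULL)
      return 0;
   if (ref->index < 0 || ref->index >= *ref->lbs->n_el_index)
      return 0;

   *message_id = ref->lbs->el_index[ref->index].message_id;
   return 1;
}
```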
67122
|
Tue Sep 13 11:54:16 2011 |
| Andreas Luedeke | andreas.luedeke@psi.ch | Bug report | Linux | 2.9.0-2414 | Elog crashes with URL find npp=0 | A user modified the URL by hand and managed to crash the elogd process with npp=now
It appears that npp=0 crashes elogd with the following error message:
Program received signal SIGFPE, Arithmetic exception.
0x0808eba2 in show_elog_list (lbs=0xab3c770, past_n=0, last_n=0, page_n=1,
default_page=1, info=0x0) at src/elogd.c:20214
20214 sprintf(str + strlen(str), loc("Page %d of %d"), page_n, (n_msg - 1) / n_page + 1);
I guess this bug is not OS dependent: you can crash every logbook that you can search ;-) |
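The expression on line 20214 divides by n_page, so a value of zero raises SIGFPE. Assuming n_page is taken from the npp URL parameter (which the report suggests but does not show), a guarded page count could look like the sketch below. The function name `page_count` is hypothetical and not part of elogd.

```c
/* Number of result pages for n_msg messages at n_page entries per page.
   A non-positive n_page (for example npp=0 in the URL) or an empty
   result yields a single page instead of dividing by zero. */
int page_count(int n_msg, int n_page)
{
   if (n_page <= 0 || n_msg <= 0)
      return 1;
   return (n_msg - 1) / n_page + 1;
}
```

Falling back to one page is a choice made for this sketch; replacing an invalid npp with the logbook's default entries-per-page setting would work just as well.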
67124
|
Tue Sep 20 04:46:55 2011 |
| Ryan | ryan.hoitt@intelsat.com | Bug report | Linux | 2.9.0-2411 | Memory Leak in V2.9.0-2411 (Mirroring Related) | I have two identical servers (IBM X337) set up on the same LAN, running Ubuntu Linux 10.04 LTS with ELOGD (compiled from the tarball) and the exact same installed package base. (The only differences between the two servers are the hostname and the [global] section of ELOGD.CFG.)
I noticed after setting these servers up today that ELOGD crashed on the server configured to mirror. It looks like there may be a memory leak in the mirroring of ELOG.
SERVER 1 ELOGD.CFG
[global]
Mirror server = http://10.146.1.76
Mirror config = 1
Mirror cron = 0,5,10,15,20,25,30,35,40,45,50,55 * * * *
Mirror user = (* Removed for Web Post *)
port = 80
Allowed encoding = 1
Suppress default = 3
Mode commands = 1
Password file = password.pwd
Self register = 1
Admin user = (* Removed for Web Post *)
Time format = %d-%b-%y %H:%M UTC
Group 2009 = Station Log-09, DAT-09, Hours Logging-09
Group 2010 = Station Log 10, DAT-10, Hours Logging-10
Group 2011 = Station Log, DAT, Hours Logging, Operations Tasks, Viasat-1, OS-2
Group Cable Database = Cable Database
Group Provisioning = Provisioning
Group ECR = ECR
SERVER 1 SYSLOGD (cat /var/log/syslog |grep elog)
Sep 19 12:14:13 riverside-log elogd[8588]: elogd 2.9.0 built Sep 19 2011, 10:32:58
Sep 19 12:14:13 riverside-log elogd[8588]: revision 2411
Sep 19 12:14:13 riverside-log elogd[8588]: Falling back to default group "elog"
Sep 19 12:14:13 riverside-log elogd[8588]: Falling back to default user "elog"
Sep 19 12:14:13 riverside-log elogd[8588]: FCKedit detected
Sep 19 12:14:13 riverside-log elogd[8590]: Falling back to default group "elog"
Sep 19 12:14:13 riverside-log elogd[8590]: Falling back to default user "elog"
Sep 19 12:14:13 riverside-log elogd[8588]: Server listening on port 80 ...
Sep 19 19:55:05 riverside-log elogd[8588]: xmalloc: not enough memory
SERVER 1 (Set to mirror off server 2) Memory Usage over 1 hour (ps aux|grep elog)
elog 8760 11.6 3.4 109240 35092 ?
elog 8760 12.2 3.9 137852 40204 ?
elog 8760 11.6 4.4 165448 45440 ?
elog 8760 10.7 5.4 221652 55548 ?
elog 8760 9.9 5.9 249752 60552 ?
elog 8760 10.1 6.4 278364 65680 ?
elog 8760 9.5 6.8 305712 70700 ?
SERVER 2 Memory Usage over 1 hour (ps aux|grep elog)
elog 799 2.1 2.6 31744 27116 ?
elog 799 2.0 2.6 31744 27116 ?
elog 799 2.1 2.6 31744 27116 ?
elog 799 2.0 2.6 31744 27116 ?
elog 799 2.0 2.6 31744 27116 ?
elog 799 2.0 2.6 31744 27116 ?
elog 799 2.1 2.6 31744 27116 ? |
|