Problems when trying to set up mirror elog , posted by Ricardo Goncalo on Tue Jul 28 19:14:23 2009
|
Hi,
I'm trying to synchronize an elog in my computer with my personal elog in my institute's server. The two elogd.cfg files are necessarily different since the elog in my institute is for many people's elogs, so I'm not completely sure this is supposed to work.
I've copied the configuration of my (remote) elog into my local elogd.cfg file, and added in the global section of the local file:
Mirror server = https://www.pp.rhul.ac.uk:8080/
Mirror Config = 1
Mirror cron = 0 7-19 * * 0-5
Mirror user = ricardo
Password file = [my password file]
Then I relaunched my local elogd daemon and tried both synchronizing the local elog by hand and waiting for the cron job to do it. When I try to synchronize the "ATLAS Trigger" elog I get in the browser:
"Safari can’t open the page “https://localhost:8080/ATLAS+Trigger/?cmd=Synchronize” because Safari can’t establish a secure connection to the server “localhost”"
(the local elog is in https://localhost:8080/ATLAS+Trigger/)
When I wait for cron to update from the mirror this is what I get in the log:
28-Jul-2009 18:51:09 [] SSLServer listening on port 8080 ...
28-Jul-2009 18:50:00 [] Cron job started
28-Jul-2009 18:50:00 [] {ATLAS Trigger} MIRROR: Remote server is not an ELOG server
Any ideas of what I'm doing wrong? I thought it might be the password, but checked that the one in the local password file should be the same as in the remote server. Then I thought it could be the path to the remote server, but can't figure out what might be bad about it and it still doesn't work after a few variations. Another possibility is the elog version: 2.7.3 in the remote server and 2.7.6 locally. Any ideas would be welcome... this is a very convenient feature and it would be great to get it to work!
Cheers,
Ricardo
|
Re: Problems when trying to set up mirror elog , posted by Stefan Ritt on Wed Jul 29 09:33:16 2009
|
Ricardo Goncalo wrote: |
Any ideas of what I'm doing wrong?
|
Yepp. Synchronizing over SSL does not yet work. I have had it on my to-do list for quite some time, but can't find the time to implement it. So at the moment you have to synchronize without SSL. |
Re: Problems when trying to set up mirror elog , posted by Ricardo Goncalo on Wed Jul 29 09:59:02 2009
|
Stefan Ritt wrote: |
Ricardo Goncalo wrote: |
Any ideas of what I'm doing wrong?
|
Yepp. Synchronizing over SSL does not yet work. I have it on my to-do list since quite some time, but can't find the time to implement it. So at the moment you have to synchronize without SSL.
|
Hi,
Ok, to see if I understand. You mean setting SSL = 0 in my cfg file and leaving the rest as it is, right? Then I synchronize by hand and I guess I'll be prompted for the password. Perhaps I should remove my local password file to avoid the password being sent unencrypted?
Cheers,
Ricardo |
Re: Problems when trying to set up mirror elog , posted by Stefan Ritt on Wed Jul 29 10:17:49 2009
|
Ricardo Goncalo wrote: |
Ok, to see if I understand. You mean setting SSL = 0 in my cfg file and leaving the rest as it is, right? Then I synchronize by hand and I guess I'll be prompted for the password. Perhaps I should remove my local password file to avoid the password being sent unencrypted?
|
That's correct. The password will be sent unencrypted if you get prompted, but if you use the automatic scheme the password will be encrypted (but not the logbook entries of course). But your concerns are right, running this thing not over SSL is a bad thing these days... |
Re: Problems when trying to set up mirror elog , posted by Ricardo Goncalo on Wed Jul 29 10:33:53 2009
|
Stefan Ritt wrote: |
Ricardo Goncalo wrote: |
Ok, to see if I understand. You mean setting SSL = 0 in my cfg file and leaving the rest as it is, right? Then I synchronize by hand and I guess I'll be prompted for the password. Perhaps I should remove my local password file to avoid the password being sent unencrypted?
|
That's correct. The password will be sent unencrypted if you get prompted, but if you use the automatic scheme the password will be encrypted (but not the logbook entries of course). But your concerns are right, running this thing not over SSL is a bad thing these days...
|
Ok, thanks a lot! I'll try asap and report back.
Cheers,
Ricardo |
Re: Problems when trying to set up mirror elog , posted by Ricardo Goncalo on Thu Jul 30 17:35:50 2009
|
Ricardo Goncalo wrote: |
Ok, thanks a lot! I'll try asap and report back.
|
Hi again,
Unfortunately I only got 1/2 hour to go back to this... I was trying to avoid copying the whole remote elog server from 20 people (that's what I get with the automatic cloning, right?)
So, I set SSL=0 and removed the password file, but still got the same result. Then I looked in the code a bit, and can see that the problem happens in retrieve_remote_md5(...) in the lines:
    p = strstr(text, "ELOG HTTP ");
    if (!p) {
       if (isparam("debug"))
          rsputs(text);
       sprintf(error_str, loc("Remote server is not an ELOG server"));
in elogd.c, where I see text is filled by retrieve_url()
So what seems to fail is that retrieve_url() gets back a string from the remote server which doesn't include the string "ELOG HTTP ". But I don't know what that really means. Here is what I get if I try:
bash-3.2$ /usr/local/sbin/elogd -v -m -p 8080 -c /usr/local/elog/elogd.cfg -D
I get the following output:
Indexing logbook "ATLAS Trigger" in "/usr/local/elog/logbooks/ATLAS Trigger/" ...
Config [ATLAS Trigger], MD5=1FAE83FC1D3B920AFDB3DC5F49C25FAF
Entries:
ID 1, 090728a.log, ofs 0, thead, MD5=8D8E44C14FCFA9E2FC24CEC14E60D5ED
After sort:
ID 1, 090728a.log, ofs 0
ok
Indexing logbook "Top physics and SLT" in "/usr/local/elog/logbooks/Top/" ...
Config [Top physics and SLT], MD5=C6A82A4BD6FF708BFDA3EA8719ECE48C
Found empty logbook "Top physics and SLT"
Indexing logbook "Trigger Slices and Core SW" in "/usr/local/elog/logbooks/Slices/" ...
Config [Trigger Slices and Core SW], MD5=316B8D7A8FBA661518FD61D3BAC39F3C
Entries:
ID 2, 090727a.log, ofs 0, thead, MD5=AA8B0B0972718F9BD95F5BA89E70DD97
ID 3, 090727a.log, ofs 3870, thead, MD5=A69E46D18074A59C4445B72EE72F025D
ID 4, 090727a.log, ofs 8354, thead, MD5=0DC3AF86F2A88ACD76E766FA1AA08665
ID 5, 090730a.log, ofs 0, thead, MD5=59299CDFA98983EB33EC08CF1A8FF7C0
ID 6, 090730a.log, ofs 10120, thead, MD5=0039C61DA667AA36D06A5772F8E3D0FA
After sort:
ID 2, 090727a.log, ofs 0
ID 3, 090727a.log, ofs 3870
ID 4, 090727a.log, ofs 8354
ID 5, 090730a.log, ofs 0
ID 6, 090730a.log, ofs 10120
ok
Retrieving entries from "https://www.pp.rhul.ac.uk:8080/ATLAS Trigger"...
Remote server is not an ELOG server
...so I'm running out of options. Any ideas would be welcome!
Cheers,
Ricardo
|
Re: Problems when trying to set up mirror elog , posted by Stefan Ritt on Mon Aug 3 10:00:18 2009
|
Ricardo Goncalo wrote: |
Retrieving entries from "https://www.pp.rhul.ac.uk:8080/ATLAS Trigger"...
Remote server is not an ELOG server
...so I'm running out of options. Any ideas would be welcome!
Cheers,
Ricardo
|
Your problem is here. I wrote that synchronization is not possible through SSL, but you try to access https://www.pp.rhul.ac.uk:8080 which is SSL (because you have https:// not http://). |
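The two conditions discussed in this thread can be sketched in a few lines of C. This is only an illustration under the assumptions stated in the posts: mirroring does not work for https:// URLs, and the reply must contain the marker "ELOG HTTP ". The names url_uses_ssl and looks_like_elog_reply are hypothetical and are not functions from elogd.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the mirror URL would require SSL. */
int url_uses_ssl(const char *url)
{
   assert(url != NULL);
   return strncmp(url, "https://", 8) == 0;
}

/* Same test as in the quoted elogd.c excerpt:
   the reply must contain the marker "ELOG HTTP ". */
int looks_like_elog_reply(const char *text)
{
   assert(text != NULL);
   return strstr(text, "ELOG HTTP ") != NULL;
}
```

With a "Mirror server" value starting with https://, the first check returns 1, which matches the case where the cron log reports "Remote server is not an ELOG server".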
Re: Problems when trying to set up mirror elog , posted by Ricardo Goncalo on Tue Aug 4 10:48:26 2009
|
Stefan Ritt wrote: |
Ricardo Goncalo wrote: |
Retrieving entries from "https://www.pp.rhul.ac.uk:8080/ATLAS Trigger"...
Remote server is not an ELOG server
...so I'm running out of options. Any ideas would be welcome!
Cheers,
Ricardo
|
Your problem is here. I wrote that synchronization is not possible through SSL, but you try to access https://www.pp.rhul.ac.uk:8080 which is SSL (because you have https:// not http://).
|
Ah, I see. Hmm, ok it doesn't work without the s either. I can't access the server in that case. Ok, I think I'll just wait for this feature to be available. Thanks for your help! |
Re: Problems when trying to set up mirror elog , posted by Stefan Ritt on Tue Aug 4 11:22:18 2009
|
Ricardo Goncalo wrote: |
Ah, I see. Hmm, ok it doesn't work without the s either. I can't access the server in that case. Ok, I think I'll just wait for this feature to be available. Thanks for your help!
|
Of course you also have to switch your mirror server to not use SSL. You can then check this by accessing it directly via http://www.pp.rhul.ac.uk:8080/ATLAS Trigger from the same computer. Also make sure that you don't have any firewall issues. |
synchronization, posted by lance on Wed Jun 17 09:05:27 2009
|
We are running elog across two sites and synchronizing every four hours on change only. There are about 100 entries per day, most of which are just one-line entries. However, this is taking up to 9 minutes, and during the replication process the server gives us an "unavailable" error. We are using a T1 across the sites so bandwidth should not be an issue; I am confused as to why this takes so long.
The issue for us is not how long the sync takes, provided it happens in the background and doesn't lock out the server while the replication is taking place. We operate a 24-hour call-center-type environment, so the server being available all the time is of paramount importance.
We use version 2.7.2 and I know there have been several changes made since this version. Would changing to the latest version have any impact on this?
Cheers,
Lance |
Re: synchronization, posted by lance on Fri Jun 19 09:44:02 2009
|
lance wrote: |
We are running elog across two sites and synchronizing every four hours on change only. There are about 100 entries per day, most of which are just one-line entries. However, this is taking up to 9 minutes, and during the replication process the server gives us an "unavailable" error. We are using a T1 across the sites so bandwidth should not be an issue; I am confused as to why this takes so long.
The issue for us is not how long the sync takes, provided it happens in the background and doesn't lock out the server while the replication is taking place. We operate a 24-hour call-center-type environment, so the server being available all the time is of paramount importance.
We use version 2.7.2 and I know there have been several changes made since this version. Would changing to the latest version have any impact on this?
Cheers,
Lance
|
Stefan,
I have been running logging and I think I have found the problem; however, I do not know how to resolve it.
Here is where it is going wrong:
19-May-2009 04:02:48 [] {NSS} MIRROR: ID21268: Local entry submitted
19-May-2009 04:07:34 [] {AMC} MIRROR: send entry #31056
It seems that when it has finished replication on one logbook there is a significant delay before the next logbook replication starts. During this time the server is not available. I have noticed that the time between ending one logbook and starting the next varies between 2 and 5 minutes. Again, the server is not available when this happens.
Do you have any idea?
Cheers,
Lance |
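The gap between two log lines like the ones quoted above can be measured with a small C helper. This is a sketch only: it assumes the "DD-Mon-YYYY HH:MM:SS" prefix seen in the posted log, assumes both lines are from the same day, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Returns seconds since midnight parsed from a line that starts with
   "DD-Mon-YYYY HH:MM:SS", or -1 if the line does not match. */
int log_seconds_of_day(const char *line)
{
   int h, m, s;

   assert(line != NULL);
   if (sscanf(line, "%*d-%*3s-%*d %d:%d:%d", &h, &m, &s) != 3)
      return -1;
   return h * 3600 + m * 60 + s;
}

/* Gap in seconds between two log lines; assumes both are from the same day. */
int log_gap_seconds(const char *earlier, const char *later)
{
   int a = log_seconds_of_day(earlier);
   int b = log_seconds_of_day(later);

   if (a < 0 || b < 0)
      return -1;
   return b - a;
}
```

For the two lines quoted in the post (04:02:48 and 04:07:34) this gives 286 seconds, i.e. 4 minutes 46 seconds between the end of the NSS replication and the start of the AMC one.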
Re: synchronization, posted by Stefan Ritt on Thu Jun 25 16:30:24 2009
|
lance wrote: |
Stefan,
I have been running logging and I think I have found the problem; however, I do not know how to resolve it.
Here is where it is going wrong:
19-May-2009 04:02:48 [] {NSS} MIRROR: ID21268: Local entry submitted
19-May-2009 04:07:34 [] {AMC} MIRROR: send entry #31056
It seems that when it has finished replication on one logbook there is a significant delay before the next logbook replication starts. During this time the server is not available. I have noticed that the time between ending one logbook and starting the next varies between 2 and 5 minutes. Again, the server is not available when this happens.
Do you have any idea?
Cheers,
Lance
|
Hi,
I need more information to narrow down the problem:
- does this only happen on automatic mirroring or also when you do a manual synchronize?
- does this happen on both sides, like when you make the "other" elog server the master?
- if the server is unresponsive, does the CPU on that machine go to 100% or to 0%?
- do you have very long attachments in your logbooks?
- do you have the same problem on a "tiny" logbook like it comes with the distribution?
- are you using any Apache proxy in between?
I'm afraid that in the end you have to debug this yourself, since it will be very hard for me to reproduce exactly your problem (unless you send me all your files).
Cheers,
Stefan |
Re: synchronization, posted by lance on Fri Jun 26 14:27:27 2009 
|
Stefan Ritt wrote: |
Hi,
I need more information to narrow down the problem:
- does this only happen on automatic mirroring or also when you do a manual synchronize?
- does this happen on both sides, like when you make the "other" elog server the master?
- if the server is unresponsive, does the CPU on that machine go to 100% or to 0%?
- do you have very long attachments in your logbooks?
- do you have the same problem on a "tiny" logbook like it comes with the distribution?
- are you using any Apache proxy in between?
I'm afraid that in the end you have to debug this yourself, since it will be very hard for me to reproduce exactly your problem (unless you send me all your files).
Cheers,
Stefan
|
Stefan,
Thanks for the reply.
This happens on automatic mirroring and on manual sync. However, only the site initiating the mirror is locked out; the remote still seems to be able to function.
The CPU jumps from very little usage to 50%+ being used by elogd.exe as soon as you start the mirroring/sync process.
I have attached a file that is in three parts, and it's pretty big. When I start up elogd -v it takes over two minutes to scroll through hundreds of files. I have attached the last of those entries in the first part of the attached PDF; the second part of the PDF shows a manual sync and the third part shows the same sync on the same logbook a few minutes later. It seems to take about 3 minutes even when there have been no new log entries. In addition, if you are mirroring more than one logbook through the automated cron job, it can take about 3-5 minutes before the second logbook starts its replication. I have also added a screenshot of the completed replications on both runs.
If there is a way to redirect the output of the cmd window when running elogd -v I would capture all the data for you, but the standard redirect ">> elog.txt" only creates a blank file.
We are running several logbooks and it does look like the smaller logbooks still take several minutes to start up. I have attached the PMCLogfile, and if you look between the NSS and the AMC replications on any day there seems to be a 3-minute gap between one book finishing and another starting.
We are not using an Apache proxy in between.
I am not a programmer but I can follow instruction, if you need anything else let me know.
Stefan this has been driving me nuts for a while now so any help you can give would be more than appreciated.
Cheers,
Lance
|
Re: synchronization, posted by Stefan Ritt on Mon Aug 3 10:16:12 2009
|
lance wrote: |
Thanks for the reply.
This happens on automatic mirroring and on manual sync. However, only the site initiating the mirror is locked out; the remote still seems to be able to function.
The CPU jumps from very little usage to 50%+ being used by elogd.exe as soon as you start the mirroring/sync process.
I have attached a file that is in three parts, and it's pretty big. When I start up elogd -v it takes over two minutes to scroll through hundreds of files. I have attached the last of those entries in the first part of the attached PDF; the second part of the PDF shows a manual sync and the third part shows the same sync on the same logbook a few minutes later. It seems to take about 3 minutes even when there have been no new log entries. In addition, if you are mirroring more than one logbook through the automated cron job, it can take about 3-5 minutes before the second logbook starts its replication. I have also added a screenshot of the completed replications on both runs.
If there is a way to redirect the output of the cmd window when running elogd -v I would capture all the data for you, but the standard redirect ">> elog.txt" only creates a blank file.
We are running several logbooks and it does look like the smaller logbooks still take several minutes to start up. I have attached the PMCLogfile, and if you look between the NSS and the AMC replications on any day there seems to be a 3-minute gap between one book finishing and another starting.
We are not using an Apache proxy in between.
I am not a programmer but I can follow instruction, if you need anything else let me know.
Stefan this has been driving me nuts for a while now so any help you can give would be more than appreciated.
|
Sorry for my late reply, but I'm pretty busy these days...
I don't have a clear solution, just a few thoughts:
- Network handling has been improved recently, so I propose you first upgrade to version 2.7.7
- Looking at your sync logs, I see many lines of the form
19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} MIRROR change entry #1095 to #23357
19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} DELETE entry #1095
19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} MIRROR send entry #23357
This indicates that you add entries to both logbooks (with ID 1095 in this case). Then elog has a problem, since you have new entries with ID 1095 on both sides. The only solution is to re-submit entry #1095 on the source logbook as a new entry (with ID #23357 in this case), delete the old one and then send the new one. This happens very often in your logs, which takes quite some time. Mirroring mainly makes sense if there is one active logbook where new entries get submitted, and the second logbook serves mainly as a read-only backup. Then mirroring is very effective. If you submit new entries very heavily on both sides, the merge process is quite complicated.
- If nothing has changed on either side and you still have heavy synchronization work, it means that the two logbooks have become inconsistent and elog tries to sort that out. So a good starting point is to manually copy all xxxxxxa.log files from one side to the other, thus ensuring both logbooks are 100% identical. Then restart both elogd servers, issue a manual synchronization, and make sure it reports back to you that everything is identical.
Hope some of this helps,
Stefan
|
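The ID collision described in the reply above can be illustrated with a small C sketch. It is not the code elogd uses: the struct mirror_entry, the function resolve_entry_id and the caller-supplied next_free_id are all hypothetical, and the only behaviour taken from the thread is that a local entry whose ID clashes with a different remote entry is re-submitted under a new ID.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical view of one logbook entry: its ID and the MD5 of its content. */
typedef struct {
   int  id;
   char md5[33];
} mirror_entry;

/* Returns the ID the local entry should carry after synchronization.
   Same ID but different content on both sides means a collision, so the
   local entry moves to next_free_id; otherwise it keeps its ID. */
int resolve_entry_id(const mirror_entry *local, const mirror_entry *remote,
                     int next_free_id)
{
   assert(local != NULL && remote != NULL);
   if (local->id == remote->id && strcmp(local->md5, remote->md5) != 0)
      return next_free_id;
   return local->id;
}
```

In the log excerpt from the reply, entry #1095 would be such a collision and #23357 the new ID. Each collision costs a delete plus a re-send, which is consistent with the advice to submit new entries on one side only.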
display GMT time instead of local time in Entry time/ Last edit field, posted by Dan Duong on Tue Jul 28 02:50:42 2009
|
Hi all,
I have set my PC to time zone GMT+10:00, but I get GMT time in the Entry time/Last edit field.
I have also installed elog on another PC, which is also set to time zone GMT+10:00, but I still get GMT time in the Entry time/Last edit field.
Please help. Thank you very much.
|
Re: display GMT time instead of local time in Entry time/ Last edit field, posted by Stefan Ritt on Tue Jul 28 11:04:01 2009
|
Dan Duong wrote: |
Hi all,
I have set my PC to time zone GMT+10:00, but I get GMT time in the Entry time/Last edit field.
I have also installed elog on another PC, which is also set to time zone GMT+10:00, but I still get GMT time in the Entry time/Last edit field.
Please help. Thank you very much.
|
That's strange. I use the C function localtime() to obtain the local time from Windows. The documentation says that this function checks the Windows control panel and returns the proper local time. So far, nobody complained so I guess only you have this problem (anybody else to correct me???). The only hint I found is to set the environment variable TZ. So open a DOS box and enter
set TZ=AST+10
then start elogd.exe interactively in that dos box and see if you get something else. |
Re: display GMT time instead of local time in Entry time/ Last edit field, posted by Dan Duong on Wed Jul 29 04:56:27 2009
|
Stefan Ritt wrote: |
Dan Duong wrote: |
Hi all,
I have set my PC to time zone GMT+10:00, but I get GMT time in the Entry time/Last edit field.
I have also installed elog on another PC, which is also set to time zone GMT+10:00, but I still get GMT time in the Entry time/Last edit field.
Please help. Thank you very much.
|
That's strange. I use the C function localtime() to obtain the local time from Windows. The documentation says that this function checks the Windows control panel and returns the proper local time. So far, nobody complained so I guess only you have this problem (anybody else to correct me???). The only hint I found is to set the environment variable TZ. So open a DOS box and enter
set TZ=AST+10
then start elogd.exe interactively in that dos box and see if you get something else.
|
I did as instructed, but the time was 20 hours behind.
When I entered set TZ=AST-10 I got the correct time. I think my elog files have been changed by someone. elogd is running in the DOS box now. Please tell me how to run elog normally or how to correct the elog files. Which file should I check? Is it elconv.c? Thank you, Stefan. |
Re: display GMT time instead of local time in Entry time/ Last edit field, posted by Stefan Ritt on Wed Jul 29 09:21:39 2009
|
Dan Duong wrote: |
I did as instructed, but the time was 20 hours behind.
When I entered set TZ=AST-10 I got the correct time. I think my elog files have been changed by someone. elogd is running in the DOS box now. Please tell me how to run elog normally or how to correct the elog files. Which file should I check? Is it elconv.c? Thank you, Stefan.
|
You have to change your environment variable "TZ" system wide. You do that by going to
My Computer/Properties/Advanced/Environment Variables/New
then you enter TZ as the variable name and AST-10 as the value. You might have to reboot your computer. |
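The sign confusion in this thread follows from the POSIX convention for TZ strings: the number after the zone name is what must be added to local time to reach UTC, so "AST+10" means ten hours behind UTC and "AST-10" means ten hours ahead. A machine that is really at GMT+10:00 therefore shows a time 20 hours behind with TZ=AST+10. The C sketch below demonstrates this with localtime() and gmtime(). It assumes a POSIX system (setenv, tzset); on Windows the corresponding calls would be _putenv_s and _tzset. The function name tz_offset_hours is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() and tzset() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Returns local time minus UTC, in whole hours, for a POSIX-style TZ string. */
int tz_offset_hours(const char *tz)
{
   time_t t = 10 * 86400;   /* fixed instant: 11 Jan 1970 00:00 UTC */
   struct tm lt, gt;

   assert(tz != NULL);
   setenv("TZ", tz, 1);
   tzset();
   lt = *localtime(&t);     /* copy at once: the result may be a shared static buffer */
   gt = *gmtime(&t);
   /* the chosen instant is far from a year boundary, so tm_yday is safe to compare */
   return (lt.tm_hour - gt.tm_hour) + 24 * (lt.tm_yday - gt.tm_yday);
}
```

Calling tz_offset_hours("AST-10") yields 10 and tz_offset_hours("AST+10") yields -10, which matches what was observed with the two settings.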
Re: display GMT time instead of local time in Entry time/ Last edit field, posted by Dan Duong on Mon Aug 3 03:26:23 2009
|
Stefan Ritt wrote: |
Dan Duong wrote: |
I did as instructed but time was 20 hours behind.
When I entered set TZ=AST-10 I got the correct time. I think my elog files have been changed by someone. elogd is running in the DOS box now. Please tell me how to run elog normally or how to correct the elog files. Which file should I check? Is it elconv.c? Thank you, Stefan.
|
You have to change your environment variable "TZ" system wide. You do that by going to
My Computer/Properties/Advanced/Environment Variables/New
then you enter TZ as the variable name and AST-10 as the value. You might have to reboot your computer.
|
It is working with the correct time stamp now. Thank you very much, Stefan. |
Can't install on Fedora 11, posted by Neil B. Cohen on Fri Jul 31 20:41:12 2009
|
Tried installing on Fedora 11 and it failed with a dependency on libssl.so.6; I have libssl.so.8 installed. What do I need to do to install this package?
thanks,
nbc |
Re: Can't install on Fedora 11, posted by Stefan Ritt on Sat Aug 1 09:34:00 2009
|
Neil B. Cohen wrote: |
Tried installing on Fedora 11 and it failed with a dependency on libssl.so.6; I have libssl.so.8 installed. What do I need to do to install this package?
thanks,
nbc
|
The easiest way is to install from source by downloading the tarball. It's hard these days to make an RPM which runs on all possible distributions. I would have to maintain a zoo of new and old distributions, which I don't have the hardware for. Or you can install libssl.so.6 by hand. |
Crashes when editing entries, posted by T. Ribbrock on Wed Jul 22 12:12:37 2009
|
For some odd reason, we have been experiencing frequent crashes of elogd over the past few days. It had been working fine so far, but more or less out of the blue it became rather unreliable. The current configuration is installed on two servers, one running 2.7.5.-r2174 on ClarkConnect 4 and one running 2.7.6-r2233 on Debian 4.0 - both show the same problem. Each of them has an "active" group with four logbooks and an "archive" group with three logbooks. In the "active" group, there are two logbooks that share the same index (using Subdir=...) and it looks like the crashes occur most of the time in these, though that's just a hunch so far. Also, most of the crashes seem to happen when submitting an entry that has been edited. Actually, submitting a modified entry has always been strange in our logbooks: When we hit submit, we get a pop-up window asking "Submit modified entry?". When choosing "OK", the entry that has been edited is duplicated. When choosing "Cancel", it is submitted correctly.
I've been running elogd like this (to get more info)
elogd -v > elog-2233-2.log 2>&1
The last entry I get in the log when elogd crashes is:
Same index as logbook Machine Log
elogd: src/elogd.c:727: xfree: Assertion `*((unsigned int *) (temp - 4)) == 0xdeadc0de' failed.
Received unknown cookie "wikidb_mw__session"
Received unknown cookie "wikidb_mw__session"
I did actually make a few changes to the configuration before we noticed the crashes: I added one extra attribute and a few more conditionals.
Any additional information you need: Just let me know.
Regards,
Thomas |
Re: Crashes when editing entries, posted by T. Ribbrock on Wed Jul 22 12:15:56 2009
|
T. Ribbrock wrote: |
For some odd reasons, we are experiencing frequent crashes of elogd over the past few days. It has been working fine so far, but more or less out of the blue it became rather unreliable. The current configuration is installed on two servers, one running 2.7.5.-r2174 on ClarkConnect 4 and one running 2.7.6-r2233 on Debian 4.0 - both show the same problem. Each of them has an "active" group with four logbooks and an "archive" group with three logbooks. In the "active" group, there are two logbooks that share the same index (using Subdir=...) and it looks like the crashes occur most of the time in these, though that's just a hunch so far. Also, most of the crashes seem to happen when submitting an entry that has been edited. Actually, submitting a modified entry has always been strange in our logbooks: When we hit submit, we get a pop-up window asking "Submit modified entry?". When choosing "OK", the entry that has been edited is duplicated. When choosing "Cancel", it is submitted correctly.
I've been running elogd like this (to get more info)
elogd -v > elog-2233-2.log 2>&1
The last entry I get in the log when elogd crashes is:
Same index as logbook Machine Log
elogd: src/elogd.c:727: xfree: Assertion `*((unsigned int *) (temp - 4)) == 0xdeadc0de' failed.
Received unknown cookie "wikidb_mw__session"
Received unknown cookie "wikidb_mw__session"
I did actually make a few changes to the configuration before we noticed the crashes: I added one extra attribute and a few more conditionals.
Any additional information you need: Just let me know.
Regards,
Thomas
|
Forgot to mention: I've also seen error messages like this upon a crash:
*** glibc detected *** corrupted double-linked list: 0x0911bbc0 ***
Regards,
Thomas |
Re: Crashes when editing entries, posted by Stefan Ritt on Wed Jul 22 12:46:36 2009
|
T. Ribbrock wrote: |
For some odd reasons, we are experiencing frequent crashes of elogd over the past few days. It has been working fine so far, but more or less out of the blue it became rather unreliable. The current configuration is installed on two servers, one running 2.7.5.-r2174 on ClarkConnect 4 and one running 2.7.6-r2233 on Debian 4.0 - both show the same problem. Each of them has an "active" group with four logbooks and an "archive" group with three logbooks. In the "active" group, there are two logbooks that share the same index (using Subdir=...) and it looks like the crashes occur most of the time in these, though that's just a hunch so far. Also, most of the crashes seem to happen when submitting an entry that has been edited. Actually, submitting a modified entry has always been strange in our logbooks: When we hit submit, we get a pop-up window asking "Submit modified entry?". When choosing "OK", the entry that has been edited is duplicated. When choosing "Cancel", it is submitted correctly.
I've been running elogd like this (to get more info)
elogd -v > elog-2233-2.log 2>&1
The last entry I get in the log when elogd crashes is:
Same index as logbook Machine Log
elogd: src/elogd.c:727: xfree: Assertion `*((unsigned int *) (temp - 4)) == 0xdeadc0de' failed.
Received unknown cookie "wikidb_mw__session"
Received unknown cookie "wikidb_mw__session"
I did actually make a few changes to the configuration before we noticed the crashes: I added one extra attribute and a few more conditionals.
Any additional information you need: Just let me know.
|
Well, I need to reproduce your problem in order to fix it. The failed assertion you get is due to some internal writing beyond array boundaries, but I have no clue which part of the code does this. It might be related to the fact that you use the same index (via Subdir=...) for two logbooks. In this scenario, you are only allowed to modify/add entries to one logbook, not the other. The other one may only be used for reading. And even then it's not guaranteed that new entries show up in the second logbook immediately; you might have to restart the server in order to re-index the logbooks. Internally, the daemon does not know that two logbooks are "the same", and one instance will not realize if the other instance modifies the data "below its feet". Can you try to give up the double logbooks and see if the problem goes away? |
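The assertion in the crash log compares the word just before an allocated block with 0xdeadc0de, which is the usual guard-word technique for catching writes beyond array boundaries. The C sketch below shows the general idea only. It is not the xmalloc/xfree code from elogd.c: the function names, the header layout and the trailing guard are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD 0xdeadc0deu
#define HDR   (2 * sizeof(size_t))   /* header size; keeps user data aligned */

/* Allocates size bytes with a guard word directly before and after them. */
void *guard_alloc(size_t size)
{
   unsigned int g = GUARD;
   unsigned char *raw = malloc(HDR + size + sizeof g);

   if (raw == NULL)
      return NULL;
   memcpy(raw, &size, sizeof size);              /* remember the user size */
   memcpy(raw + HDR - sizeof g, &g, sizeof g);   /* guard before user data  */
   memcpy(raw + HDR + size, &g, sizeof g);       /* guard after user data   */
   return raw + HDR;
}

/* Returns 1 if both guard words are untouched, 0 if something overwrote them. */
int guard_intact(const void *p)
{
   const unsigned char *user = p;
   size_t size;
   unsigned int before, after;

   assert(p != NULL);
   memcpy(&size, user - HDR, sizeof size);
   memcpy(&before, user - sizeof before, sizeof before);
   memcpy(&after, user + size, sizeof after);
   return before == GUARD && after == GUARD;
}

/* Frees the block; aborts with a failed assertion if a guard was damaged. */
void guard_free(void *p)
{
   assert(guard_intact(p));
   free((unsigned char *)p - HDR);
}
```

A single byte written past the end of the buffer is enough to make guard_intact() return 0, and the failure only shows up when the block is freed, possibly long after the bad write. That is why the assertion alone does not reveal which part of the code is responsible.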
Re: Crashes when editing entries, posted by T. Ribbrock on Wed Jul 22 15:35:57 2009
|
Stefan Ritt wrote: |
Well, I need to reproduce your problem in order to fix it. The failed assertion you get is due to some internal writing beyond array boundaries, but I have no clue which part of the code causes this. It might be related to the fact that you use the same index (via Subdir=...) for two logbooks. In this scenario, you are only allowed to modify/add entries in one logbook, not the other. The other one may only be used for reading. And even then it's not guaranteed that new entries show up in the second logbook immediately; you might have to restart the server in order to re-index the logbooks. Internally, the daemon does not know that two logbooks are "the same", and one instance will not realize if the other instance modifies the data "below its feet". Can you try to give up the double logbooks and see if the problem goes away?
|
Hm... I have implemented this set-up originally based on this: https://midas.psi.ch/elogs/Forum/66024. The "double logbook" is a machine log with a "software" (OS installations etc.) and a "hardware" (CPU, RAM, etc.) view. The "hardware" view has the "Subdir=" statement. Thinking about it, the "software" view is used most - I have several automatic scripts running which update the contents whenever a machine gets updated, re-installed and so on. The hardware part does not see much editing - until this week, when we decided to start an inventory... So, it's quite possible that we never noticed that this was iffy. For the rest of our goals, this set-up has worked fantastically - never noticed any problem with one view not updating, actually. Also, I do not remember any crashes with the other, single logbooks.
What I've done for now is to ask all team members to use only the software part (the one without the Subdir statement) to actually change content (the entry masks are the same in both versions) and use the hardware part just for viewing. I'll report back as soon as I get some feedback.
Nonetheless, given that this set-up has been a great help for us - if you ever get the chance to make this work (even) better, I'd be most grateful.
Regards,
Thomas |
Re: Crashes when editing entries, posted by Stefan Ritt on Wed Jul 22 16:30:48 2009
|
T. Ribbrock wrote: |
Nonetheless, given that this set-up has been a great help for us - if you ever get the chance to make this work (even) better, I'd be most grateful.
|
Well, for that I have to reproduce the problem. So best would be if you strip it down to the bare minimum in order to reproduce this reliably. Then you zip everything and send it to me. Then tell me what I have to edit and submit in order to stimulate the crash. Once this is successful, I can fix it. |
Re: Crashes when editing entries, posted by T. Ribbrock on Wed Jul 22 16:52:13 2009
|
Stefan Ritt wrote: |
T. Ribbrock wrote: |
Nonetheless, given that this set-up has been a great help for us - if you ever get the chance to make this work (even) better, I'd be most grateful.
|
Well, for that I have to reproduce the problem. So best would be if you strip it down to the bare minimum in order to reproduce this reliably. Then you zip everything and send it to me. Then tell me what I have to edit and submit in order to stimulate the crash. Once this is successful, I can fix it.
|
Thank you - I shall look into that, though it'll probably take a while to prepare it. |
Re: Crashes when editing entries, posted by T. Ribbrock on Wed Jul 29 14:48:34 2009
|
By now, I've installed 2244 and run some rudimentary tests. So far, I have not been able to reproduce the crash anymore. Looking good! |
Odd problem (bug?) with certain attribute, posted by T. Ribbrock on Tue Jul 28 16:30:46 2009
|
I have the following simple test logbook:
; General
List display = Edit, Hostname, OS, Size
Entries per page = 150
Quick filter = OS
Date Format = %d/%m/%Y
Summary Lines = 0
; Attributes
Attributes = Hostname, OS, CPU,Size
Required Attributes = Hostname
Sort Attributes = Hostname
; Message part: log is text only, but elog is allowed
Default encoding = 0
Allowed encoding = 2
For some strange reason, I'm having problems with the "Size" attribute, which I added later. If I start adding/editing entries, at some point the "Size" attribute will stay at "0" and will not accept any further changes. I've tried to pare down the config to find which statement could be causing this, but to no avail. The only thing I can say is that it does not seem to happen if the configuration consists of the sole line
Attributes = Hostname, OS, CPU,Size
I'm quite puzzled as to what is going on here (the original problem stems from a more complicated logbook) - I'm not even 100% sure whether this happens due to a bug or not.
Regards,
Thomas |
Re: Odd problem (bug?) with certain attribute, posted by Stefan Ritt on Tue Jul 28 16:57:39 2009
|
T. Ribbrock wrote: |
I have the following simple test logbook:
; General
List display = Edit, Hostname, OS, Size
Entries per page = 150
Quick filter = OS
Date Format = %d/%m/%Y
Summary Lines = 0
; Attributes
Attributes = Hostname, OS, CPU,Size
Required Attributes = Hostname
Sort Attributes = Hostname
; Message part: log is text only, but elog is allowed
Default encoding = 0
Allowed encoding = 2
For some strange reason, I'm having problems with the "Size" attribute, which I added later. If I start adding/editing entries, at some point the "Size" attribute will stay at "0" and will not accept any further changes. I've tried to pare down the config to find which statement could be causing this, but to no avail. The only thing I can say is that it does not seem to happen if the configuration consists of the sole line
Attributes = Hostname, OS, CPU,Size
I'm quite puzzled as to what is going on here (the original problem stems from a more complicated logbook) - I'm not even 100% sure whether this happens due to a bug or not.
|
That's indeed a strange bug, and thanks to your detailed explanation I could easily reproduce it. The problem was that in the ELCode toolbar there is already a "SIZE" parameter, which gets submitted instead of your "size" attribute. Therefore whatever you submit as "size" gets replaced by zero (since the SIZE drop-down box usually sits at zero). So you can either change your "size" attribute into something else like "Memory Size", or upgrade to version 2244, where I fixed this problem. |
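The name clash described above can be sketched in a few lines of Python. This is only an illustration of the mechanism, assuming that submitted form fields are matched without regard to case and that a later field overwrites an earlier one; `parse_form` and the sample values are hypothetical and do not come from the elogd source.

```python
def parse_form(pairs):
    """Collapse (name, value) pairs into a dict keyed by lowercased name.

    A later field with the same lowercased name overwrites an earlier one.
    """
    params = {}
    for name, value in pairs:
        params[name.lower()] = value
    return params

# The user's "Size" attribute is submitted together with the toolbar's "SIZE" drop-down.
clash = parse_form([("Hostname", "node01"), ("Size", "42"), ("SIZE", "0")])

# Renaming the attribute, as suggested, avoids the clash.
renamed = parse_form([("Hostname", "node01"), ("Memory Size", "42"), ("SIZE", "0")])

print(clash["size"])           # value from the toolbar drop-down
print(renamed["memory size"])  # value the user entered
```

Under these assumptions the user's value is lost in the first case and preserved in the second, which matches the workaround of renaming the attribute.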
Re: Odd problem (bug?) with certain attribute, posted by T. Ribbrock on Tue Jul 28 17:10:13 2009
|
Stefan Ritt wrote: |
That's indeed a strange bug, and thanks to your detailed explanation I could easily reproduce it. The problem was that in the ELCode toolbar there is already a "SIZE" parameter, which gets submitted instead of your "size" attribute. Therefore whatever you submit as "size" gets replaced by zero (since the SIZE drop-down box usually sits at zero). So you can either change your "size" attribute into something else like "Memory Size", or upgrade to version 2244, where I fixed this problem.
|
Very nice, thank you! Given that I also still have to test the fix you made for the crash problem with "shared" logbooks (I assume it's also present in 2244), I'll upgrade and report back (probably tomorrow). |
Re: Odd problem (bug?) with certain attribute, posted by T. Ribbrock on Wed Jul 29 13:23:41 2009
|
T. Ribbrock wrote: |
Very nice, thank you! Given that I also still have to test the fix you made for the crash problem with "shared" logbooks (I assume it's also present in 2244), I'll upgrade and report back (probably tomorrow).
|
2244 is now running, and indeed, the "Size" attribute works now. Thank you!  |
attachment not displayed if entry contains link to attachment., posted by Devin Bougie on Mon Jul 27 18:02:14 2009 
|
I'm not sure if this is the expected behavior, but it appears as though an attachment is not displayed in the list of attachments if you manually add a link to the attachment into the body of the entry. I would greatly appreciate any advice on how to fix or change this behavior.
I will try to demonstrate with the two attachments on this entry. There are two attachments in this entry, but only one appears in the standard view of the entry.
Picture_1.png.png
Many thanks,
Devin |
Re: attachment not displayed if entry contains link to attachment., posted by Stefan Ritt on Tue Jul 28 10:42:40 2009
|
Devin Bougie wrote: |
I'm not sure if this is the expected behavior, but it appears as though an attachment is not displayed in the list of attachments if you manually add a link to the attachment into the body of the entry. I would greatly appreciate any advice on how to fix or change this behavior.
I will try to demonstrate with the two attachments on this entry. There are two attachments in this entry, but only one appears in the standard view of the entry.
Picture_1.png.png
Many thanks,
Devin
|
Well, you can have "inline" pictures like that:
In this case it is clumsy if this image gets displayed twice, once in the text and once as the attachment. So I look at the entry and if I find the image inlined, I suppress the display at the end. Now the "check if the image is shown inline" is a bit stupid, it just looks for the link. So I never thought that someone would just put a link in the text manually. I will have a look and see if I can change that. |
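The difference between "just looks for the link" and a check for a truly inlined image can be sketched as follows. Both functions, the attachment path and the sample bodies are hypothetical; the stricter variant is one possible approach and not necessarily what elogd does.

```python
import re

def is_inlined_naive(body, attachment):
    """Treat the attachment as inlined if its name appears anywhere in the body."""
    return attachment in body

def is_inlined_strict(body, attachment):
    """Treat the attachment as inlined only if an <img> tag references it."""
    pattern = r'<img\b[^>]*\bsrc="[^"]*' + re.escape(attachment) + r'"'
    return re.search(pattern, body, re.IGNORECASE) is not None

# A manually added link versus a real inline image (paths are made up).
link_body = 'See <a href="090727_180214/Picture_1.png">the screenshot</a>.'
img_body = 'Result: <img src="090727_180214/Picture_1.png">'

print(is_inlined_naive(link_body, "Picture_1.png"))   # link alone is enough here
print(is_inlined_strict(link_body, "Picture_1.png"))  # link alone is not enough
print(is_inlined_strict(img_body, "Picture_1.png"))   # real inline image
```

With the naive check, a plain link hides the attachment from the attachment list; with the stricter one, only an actual inline image does.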
Re: attachment not displayed if entry contains link to attachment., posted by Stefan Ritt on Tue Jul 28 13:30:35 2009
|
Devin Bougie wrote: |
I'm not sure if this is the expected behavior, but it appears as though an attachment is not displayed in the list of attachments if you manually add a link to the attachment into the body of the entry. I would greatly appreciate any advice on how to fix or change this behavior.
I will try to demonstrate with the two attachments on this entry. There are two attachments in this entry, but only one appears in the standard view of the entry.
Picture_1.png.png
Many thanks,
Devin
|
I improved the mentioned check for inline attachments. Now your page correctly shows both attachments. This fix is in SVN revision 2241. |
Wrong error message if invalid attribute is used, posted by T. Ribbrock on Mon Jul 27 10:20:14 2009
|
I just ran into this little bug: I had defined a new logbook in my config file and suddenly got the message Attribute "Date" not allowed. While I did have several attributes starting with the word "Date" (e.g. "Date In Service", "Date Retired") I had no attribute "Date" in there. After some pondering and wildly commenting out lines, it finally dawned on me: I had used an attribute "ID" - which is also not allowed. However, it would be very helpful if the error message actually reflected that...  |
Re: Wrong error message if invalid attribute is used, posted by Stefan Ritt on Mon Jul 27 10:49:19 2009
|
T. Ribbrock wrote: |
I just ran into this little bug: I had defined a new logbook in my config file and suddenly got the message Attribute "Date" not allowed. While I did have several attributes starting with the word "Date" (e.g. "Date In Service", "Date Retired") I had no attribute "Date" in there. After some pondering and wildly commenting out lines, it finally dawned on me: I had used an attribute "ID" - which is also not allowed. However, it would be very helpful if the error message actually reflected that... 
|
Oops, just a typo. The message Attribute "Date" not allowed should read Attribute "ID" not allowed. Fixed in the current SVN version. |
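A minimal sketch of a check that reports the attribute name that actually triggered the error. The reserved list contains only the two names mentioned in this thread, and `check_attributes` is a hypothetical helper, not the elogd implementation.

```python
RESERVED = ("Date", "ID")  # only the names mentioned above; the real list may differ

def check_attributes(attributes):
    """Return an error message naming the first reserved attribute found, else None."""
    for name in attributes:
        for reserved in RESERVED:
            if name.strip().lower() == reserved.lower():
                return f'Attribute "{reserved}" not allowed'
    return None

print(check_attributes(["Hostname", "ID", "Date In Service"]))
print(check_attributes(["Date In Service", "Date Retired"]))
```

Building the message from the matched name, rather than from a fixed string, avoids the kind of mismatch reported here. Attributes that merely start with "Date" pass the check.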
|