Discussion forum about ELOG
Re: synchronization, posted by lance on Fri Jun 19 09:44:02 2009

lance wrote:

We are running elog across two sites and synchronizing every four hours on change only. There are about 100 entries per day, most of which are just one-line entries. However, this takes up to 9 minutes, and during the replication process the server gives us an "unavailable" error. We are using a T1 across the sites, so bandwidth should not be an issue; I am confused as to why this takes so long.

How long the sync takes is not the issue for us, provided it happens in the background and doesn't lock out the server while replication is taking place. We operate a 24-hour call-center-type environment, so the server being available at all times is of paramount importance.

We use version 2.7.2 and I know there have been several changes made since this version. Would changing to the latest version have any impact on this?

 

Cheers,

 

Lance

 

Stefan,

I have been running logging and I think I have found the problem; however, I do not know how to resolve it.

Here is where it is going wrong:

19-May-2009 04:02:48 [] {NSS} MIRROR: ID21268: Local entry submitted

19-May-2009 04:07:34 [] {AMC} MIRROR: send entry #31056

It seems that when it has finished replicating one logbook, there is a significant delay before replication of the next logbook starts. During this time the server is not available. I have noticed that the time between ending one logbook and starting the next varies between 2 and 5 minutes. Again, the server is not available when this happens.
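
The gap between logbooks can be measured directly from the log file. The following is a minimal sketch, assuming every log line follows the format of the two lines quoted above (timestamp, user in square brackets, logbook in curly braces, message). The function names are hypothetical and not part of elog.

```python
import re
from datetime import datetime

# Pattern modeled on the quoted log lines; the exact log format is an assumption.
LINE_RE = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(.*?)\] \{(.*?)\} (.*)$"
)


def parse_line(line):
    """Return (timestamp, logbook, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # %b expects English month abbreviations (true for the default C locale).
    stamp = datetime.strptime(m.group(1), "%d-%b-%Y %H:%M:%S")
    return stamp, m.group(3), m.group(4)


def logbook_gaps(lines):
    """List (previous_logbook, next_logbook, seconds) for each logbook switch."""
    gaps = []
    prev = None
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip blank or unrecognized lines
        if prev is not None and rec[1] != prev[1]:
            gaps.append((prev[1], rec[1], (rec[0] - prev[0]).total_seconds()))
        prev = rec
    return gaps


sample = [
    "19-May-2009 04:02:48 [] {NSS} MIRROR: ID21268: Local entry submitted",
    "19-May-2009 04:07:34 [] {AMC} MIRROR: send entry #31056",
]
for before, after, seconds in logbook_gaps(sample):
    print(f"{before} -> {after}: {seconds:.0f} s")
```

For the two quoted lines, the switch from NSS to AMC is 286 seconds, just under five minutes.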

Do you have any idea?

Cheers,

 

Lance

Re: synchronization, posted by Stefan Ritt on Thu Jun 25 16:30:24 2009

Hi,

I need more information to narrow down the problem:

- does this only happen on automatic mirroring or also when you do a manual synchronize?

- does this happen on both sides, for example when you make the "other" elog server the master?

- when the server is unresponsive, does the CPU on that machine go to 100% or to 0%?

- do you have very long attachments in your logbooks?

- do you have the same problem on a "tiny" logbook like the one that comes with the distribution?

- are you using any Apache proxy in between?

I'm afraid that in the end you will have to debug this yourself, since it will be very hard for me to reproduce your exact problem (unless you send me all your files).

Cheers,

  Stefan

Re: synchronization, posted by lance on Fri Jun 26 14:27:27 2009 (attachments: log_for_stefan.pdf, PMCLogfile)

Stefan,

Thanks for the reply.

This happens on automatic mirroring and on manual sync. However, only the site initiating the mirror is locked out; the remote site still seems to be able to function.

The CPU jumps from very little usage to 50%+ being used by elogd.exe as soon as you start the mirroring/sync process.

I have attached a file that is in three parts, and it's pretty big. When I start up elogd -v, it takes over two minutes to scroll through hundreds of files. I have attached the last of those entries in the first part of the attached PDF; the second part of the PDF shows a manual sync, and the third part shows the same sync on the same logbook a few minutes later. It seems to take about 3 minutes even when there have been no new log entries. In addition, if you are mirroring more than one logbook through the automated cron job, it can take about 3-5 minutes before the second logbook starts its replication. I have also added a screenshot of the completed replications on both runs.

If there is a way to redirect the output of the cmd window when running elogd -v, I would capture all the data for you, but the standard redirect ">> elog.txt" only creates a blank file.
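
One possible cause of the blank file is that the verbose output is written to standard error rather than standard output; ">>" on its own redirects only standard output, and adding "2>&1" after it also redirects standard error. Whether elogd actually writes its verbose output to standard error is an assumption here, not something the thread confirms. The same capture can be sketched in Python; `run_and_capture` is a hypothetical helper name.

```python
import subprocess


def run_and_capture(cmd, out_path):
    """Run cmd and append both stdout and stderr to out_path; return the exit code."""
    with open(out_path, "ab") as out:
        # stderr=subprocess.STDOUT merges the two streams into the same file.
        proc = subprocess.run(cmd, stdout=out, stderr=subprocess.STDOUT)
    return proc.returncode


# Hypothetical usage (blocks until the server is stopped):
# run_and_capture(["elogd", "-v"], "elog.txt")
```

If the file is still empty with both streams captured, output buffering or console-only output would be the next things to suspect.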

We are running several logbooks, and it does look like the smaller logbooks still take several minutes to start up. I have attached the PMCLogfile; if you look between the NSS and the AMC replications on any day, there seems to be a 3-minute gap between one logbook finishing and the next starting.

We are not using an Apache proxy in between.

I am not a programmer, but I can follow instructions; if you need anything else, let me know.

Stefan, this has been driving me nuts for a while now, so any help you can give would be more than appreciated.

Cheers,

 

Lance

 

Re: synchronization, posted by Stefan Ritt on Mon Aug 3 10:16:12 2009

Sorry for my late reply, but I'm pretty busy these days...

I don't have a clear solution, just a few thoughts:

- Network handling has been improved recently, so I propose you first upgrade to version 2.7.7

- Looking at your sync logs, I see many lines of the form

19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} MIRROR change entry #1095 to #23357
19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} DELETE entry #1095
19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} MIRROR send entry #23357

This indicates that you add entries to both logbooks (with ID 1095 in this case). Then elog has a problem, since there are new entries with ID 1095 on both sides. The only solution is to re-submit entry #1095 on the source logbook as a new entry (with ID #23357 in this case), delete the old one, and then send the new one. This happens very often, which takes quite some time. Mirroring mainly makes sense if there is one active logbook where new entries get submitted, and the second logbook serves mainly as a read-only backup. Then mirroring is very effective. If you submit many new entries on both sides, the merge process is quite complicated.

- If nothing has changed on either side and you still have heavy synchronization work, it means that the two logbooks have become inconsistent and elog is trying to sort that out. A good starting point is to manually copy all xxxxxxa.log files from one side to the other, thus ensuring both logbooks are 100% identical. Then restart both elogd servers, issue a manual synchronization, and make sure it reports back that everything is identical.
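
Two of the points above can be checked with a short script: how often entries get renumbered during a sync, and whether the data files on both sides really are identical after copying. This is only a sketch under stated assumptions: the log lines follow the format quoted above, the data files match the glob pattern `*a.log` (a reading of "xxxxxxa.log"), and both function names are hypothetical rather than part of elog.

```python
import hashlib
import re
from pathlib import Path

# Matches lines such as "... MIRROR change entry #1095 to #23357".
CHANGE_RE = re.compile(r"MIRROR change entry #(\d+) to #(\d+)")


def renumbered_entries(lines):
    """Return (old_id, new_id) pairs for every renumbering found in the log lines."""
    pairs = []
    for line in lines:
        m = CHANGE_RE.search(line)
        if m:
            pairs.append((int(m.group(1)), int(m.group(2))))
    return pairs


def file_digest(path):
    """SHA-256 of a file's contents, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def differing_logfiles(dir_a, dir_b, pattern="*a.log"):
    """Names of data files that are missing on one side or differ in content."""
    a = {p.name: p for p in Path(dir_a).glob(pattern)}
    b = {p.name: p for p in Path(dir_b).glob(pattern)}
    diffs = []
    for name in sorted(set(a) | set(b)):
        if name not in a or name not in b:
            diffs.append(name)  # present on only one side
        elif file_digest(a[name]) != file_digest(b[name]):
            diffs.append(name)  # present on both sides, contents differ
    return diffs
```

A large number of pairs from `renumbered_entries` would suggest that entries are being submitted on both sides, and an empty list from `differing_logfiles` would suggest the two logbook directories hold the same data files before the manual synchronization is issued.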

 

Hope some of this helps,

  Stefan
 

Re: svn revision number in the source, posted by Stefan Ritt on Tue Feb 21 20:24:18 2006

Steve Jones wrote:
There is a variable $Id$ in source that looks like it is supposed to reflect the svn revision number of the compiled code. How is this supposed to be set, manually just before compiling?


It gets set automatically on every commit to the Subversion repository.
Re: svn revision number in the source, posted by Steve Jones on Tue Feb 21 21:01:22 2006

So, when we go to the download section and download directly from there, that is not "committed" source? I ask because the revision id there is not set to anything that I can see.
Re: svn revision number in the source, posted by Stefan Ritt on Tue Feb 21 21:17:13 2006

Can you be a bit more specific? What do you download? The Windows binaries, the Linux RPM? Or from the Subversion repository? The current version in the repository, which you can download here, contains the following on line 8 of the file elogd.c:
   $Id: elogd.c 1660 2006-02-17 19:48:12Z ritt $

This tells you that this is revision 1660, committed on Feb. 17 by myself. So what is the problem?
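
The fields of an expanded keyword line like the one above can be pulled apart mechanically. A minimal sketch, assuming the layout shown in the quoted line (file name, revision, date, time in UTC, author); `parse_svn_id` is a hypothetical helper, not something shipped with elog.

```python
import re

# Layout taken from the quoted line: $Id: <file> <rev> <date> <time>Z <author> $
ID_RE = re.compile(
    r"\$Id: (\S+) (\d+) (\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2})Z (\S+) \$"
)


def parse_svn_id(text):
    """Return the keyword fields as a dict, or None if the keyword is not expanded."""
    m = ID_RE.search(text)
    if m is None:
        return None  # e.g. a bare "$Id$" that was never expanded
    return {
        "file": m.group(1),
        "revision": int(m.group(2)),
        "date": m.group(3),
        "time": m.group(4),
        "author": m.group(5),
    }


print(parse_svn_id("$Id: elogd.c 1660 2006-02-17 19:48:12Z ritt $"))
```

A bare `$Id$` yields None, which matches the situation where the source did not come through a Subversion checkout or export that expands the keyword.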
Re: svn revision number in the source, posted by Steve Jones on Tue Feb 21 21:58:16 2006

Ok, this is really strange, but just an hour ago I clicked on the http://midas.psi.ch/elog/download.html link and was taken to a completely different web view. In fact, I am quite sure that the bottom right corner said "WebCVS"! Now it says WebSVN and the revision info is in there. I've been trying to debug a problem with default.css and the elcode icons, and somewhere in there I cleared my Firefox cache. Perhaps an old page was cached?

I have no idea how I got to CVS, and it makes sense that CVS was not setting the SVN revision code.
Sorry to bother you with this.