lance wrote: |
Thanks for the reply.
This happens with automatic mirroring and with manual sync. However, only the site initiating the mirror is locked out; the remote still seems to be able to function.
The CPU jumps from very little usage to 50%+ being used by elogd.exe as soon as you start the mirroring/sync process.
I have attached a file that is in three parts, and it's pretty big. When I start up elogd -v, it takes over two minutes to scroll through hundreds of files. I have attached the last of those entries in the first part of the attached PDF; the second part of the PDF shows a manual sync, and the third part shows the same sync on the same logbook a few minutes later. It seems to take about 3 minutes even when there have been no new log entries. In addition, if you are mirroring more than one logbook through the automated cron job, it can take about 3-5 minutes before the second logbook starts its replication. I have also added a screenshot of the completed replications on both runs.
If there is a way to redirect the output of the cmd window when running elogd -v, I would capture all the data for you, but the standard redirect ">> elog.txt" only creates a blank file.
We are running several logbooks, and it looks like even the smaller logbooks take several minutes to start up. I have attached the PMCLogfile; if you look between the NSS and the AMC replications on any day, there seems to be a 3-minute gap between one book finishing and another starting.
We are not using an Apache proxy in between.
I am not a programmer, but I can follow instructions. If you need anything else, let me know.
Stefan, this has been driving me nuts for a while now, so any help you can give would be more than appreciated.
|
Sorry for my late reply, but I'm pretty busy these days...
I don't have a clear solution, just a few thoughts:
- Network handling has been improved recently, so I propose you first upgrade to Version 2.7.7
- Looking at your sync logs, I see many lines of the form
19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} MIRROR change entry #1095 to #23357
19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} DELETE entry #1095
19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} MIRROR send entry #23357
This indicates that you add entries to both logbooks (with ID 1095 in this case). Then elog has a problem, since you have new entries with ID 1095 on both sides. So the only solution is to re-submit entry #1095 on the source logbook as a new entry (with ID #23357 in this case), delete the old one and then send the new one. This happens very often, which takes quite some time. Mirroring mainly makes sense if there is one active logbook where new entries get submitted, and the second logbook serves mainly as a read-only backup. Then mirroring is very effective. If you submit new entries very heavily on both sides, the merge process is quite complicated.
- If nothing has changed on either side and you still have heavy synchronization work, it means that the two logbooks have become inconsistent, and elog tries to sort that out. So a good starting point is to manually copy all xxxxxxa.log files from one side to the other, thus ensuring both logbooks are 100% identical. Then restart both elogd servers, issue a manual synchronization, and make sure it reports back to you that everything is identical.
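To get a feeling for how often the renumbering described above happens, one could count the "MIRROR change entry" lines per logbook. The following is only a rough sketch in Python, not part of elog: the function name count_renumbered and the pattern are my own assumptions, modelled on the three sample lines quoted above, so real log files may contain variations it does not catch.

```python
import re
from collections import Counter

# Pattern modelled on the sample log lines quoted above (an assumption;
# the real log format may have variations this does not cover).
RENUMBER = re.compile(
    r"\{(?P<book>[^}]+)\} MIRROR change entry #(?P<old>\d+) to #(?P<new>\d+)"
)


def count_renumbered(lines):
    """Return a Counter mapping logbook name to number of renumbered entries."""
    counts = Counter()
    for line in lines:
        match = RENUMBER.search(line)
        if match:
            counts[match.group("book")] += 1
    return counts


# The three lines from the sync log quoted above.
sample = [
    "19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} MIRROR change entry #1095 to #23357",
    "19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} DELETE entry #1095",
    "19-Jun-2009 15:41:05 [lance@127.0.0.1] {NSS} MIRROR send entry #23357",
]

for book, n in count_renumbered(sample).items():
    print(f"{book}: {n} renumbered entries")
```

To run it on a real log, pass the opened log file instead of the sample list. A large count for a logbook would suggest that entries are regularly being submitted on both sides, which is the case where the merge gets slow.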
Hope some of this helps,
Stefan
|