Good bit of detective work. To me it suggests that something as yet undetermined occurs that, when the 57th reply happens, causes the issue. If that "something" hasn't happened, all is well. Apart from Heinz varieties (not true, in fact), 57 isn't an obvious number; nor did it leap out at me on a quick look at the parameters in the code. My example of deleting more than 40 entries causing elog to crash was at least consistent: it happened every time.
I'm trying to think what this something might be. With my (admittedly largeish) database of elog entries, starting elog from a cold start will take minutes of indexing before it will display home page or whatever. Presumably it must count the number of entries in each thread (as otherwise why always 57?), yet if you stop and restart, it doesn't necessarily need to do the full indexing again - time between restarts I guess, the authors not considering the evil deeds I perform on yymmdda.log entries.
Bear with me on this: I once had software that ran a system, and every Thursday, without fail, it did a full recalibration on every start-up. Since updates were issued on Fridays, I commented that it was just adding to our pressure, "as if it knew the day of the week"; it really was (and turned out to be) a day-of-the-week bug. So, I've been right on more than one occasion. Do the threads with cross indexing have anything in common, such as day of the week, day of the month, or time, especially if crossing midnight before the 57th reply?
Another line would be to view the yymmdda.log files while you are making a normal reply. In my v2.9.2 version, nothing is written until the Submit button is pressed, then either one or two files are modified or one modified and one new one created. Is that still true with your version? I ask because clearly one or two entry numbers have somehow already been "reserved" as if opened, but where? That Autosave =0 looks to be a useful test to do.
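One way to watch which yymmdda.log files change during a reply is to snapshot the logbook directory before pressing Submit and again afterwards. This is only a rough sketch: the function names are my own, and it assumes the entries live as plain `.log` files directly inside one logbook directory.

```python
import os

def snapshot(logbook_dir):
    """Map each .log file name in logbook_dir to its (size, mtime) pair."""
    state = {}
    for name in os.listdir(logbook_dir):
        if name.endswith(".log"):
            st = os.stat(os.path.join(logbook_dir, name))
            state[name] = (st.st_size, st.st_mtime_ns)
    return state

def diff_snapshots(before, after):
    """Return (created, modified) lists of file names between two snapshots."""
    created = sorted(n for n in after if n not in before)
    modified = sorted(n for n in after if n in before and after[n] != before[n])
    return created, modified
```

Taking one snapshot when the reply form is opened and another after Submit would show whether anything is written (or an entry number "reserved") before the button is pressed.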
Sorry I cannot be more help. I'm not one of the development team, though I do have experience of (ab)using elog, and I'm a pretty rubbish coder as well, but I do have some experience in bug finding!
Well, you've made some very interesting observations, and raised some excellent questions. So, I went back and did some homework, reviewing a number of logbooks to find instances where this strange 'record twist' occurs. You had asked, "Do you have enough information to decide that this event always happens after x replies?" -- and to my surprise, there was indeed a magic number that I didn't expect to see. The 57th reply to the original posting was always where the corruption began. Mind you, we don't always get a corruption on the 57th reply -- most of the time, it works as expected. However, in all the cases where I saw this record twist, it was the 57th reply after the original posting. Go figure.
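For anyone wanting to repeat the count, the position of a reply in its thread can be worked out by following the "In reply to" links back to the original posting. A minimal sketch, assuming you already have a dictionary mapping each entry ID to its parent ID (`None` for the original posting); the function name and the data layout are my own invention, not anything from elog itself.

```python
def reply_depth(parents, entry_id):
    """Count how many 'In reply to' hops separate entry_id from the thread root."""
    depth = 0
    seen = {entry_id}
    parent = parents.get(entry_id)
    while parent is not None:
        if parent in seen:
            # A corrupted thread could loop back on itself; stop rather than spin.
            raise ValueError(f"cycle in reply chain at entry {parent}")
        seen.add(parent)
        depth += 1
        parent = parents.get(parent)
    return depth
```

With no branching allowed, the first reply has depth 1, so the entry of interest in each affected thread would be the one at depth 57.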
I also reviewed my elogd.cfg file to see how I handled drafts. Currently, it does have the flag Save drafts = 0. What I plan to try next, if only to satisfy my curiosity, is to also add Autosave=0.
I can't thank you enough for your time and feedback...very much appreciated!
There are two interesting points about the log file.
1. Entry 5658 is timestamped later than 5659, but is earlier in the entry list. It also is "In Reply to" 5659, despite 5659 having not been written (or at least timestamped) at the time that 5658 is. Might this be a feature of the draft function? I've not upgraded my elog for a long time now so my version doesn't have the feature - so I cannot test the idea of more than one entry being worked upon at the same time.
2. Entry 5657 says it is "In Reply to" 5656, but entry 5656 does not reference 5657 in the "Reply to" line, as it should. Again, this might be a feature of the draft function.
Could someone be confusing a draft entry with a real one? Or two attempts to make an entry?
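Both checks above can be done mechanically over a .log file. The sketch below rests on assumptions about the file layout that should be checked against a real file first: that each entry starts with a `$@MID@$: <id>` line, that the header ends at a line of `=` characters, and that `Date:` is in the usual e-mail date format. The function names are mine.

```python
import re
from email.utils import parsedate_to_datetime

MID_RE = re.compile(r"^\$@MID@\$:\s*(\d+)\s*$")

def parse_entries(text):
    """Return {id: {"date", "reply_to", "in_reply_to"}} from .log file text."""
    entries, current, in_header = {}, None, False
    for line in text.splitlines():
        m = MID_RE.match(line)
        if m:
            current = {"date": None, "reply_to": [], "in_reply_to": None}
            entries[int(m.group(1))] = current
            in_header = True
        elif not in_header:
            continue  # body text, or anything before the first entry
        elif line.startswith("====="):
            in_header = False  # assumed header/body separator
        elif line.startswith("Date:"):
            current["date"] = line[5:].strip()
        elif line.startswith("Reply to:"):
            current["reply_to"] = [int(x) for x in re.findall(r"\d+", line[9:])]
        elif line.startswith("In reply to:"):
            nums = re.findall(r"\d+", line[12:])
            current["in_reply_to"] = int(nums[0]) if nums else None
    return entries

def check_links(entries):
    """List entries whose parent does not link back, or that predate their parent."""
    problems = []
    for eid, e in sorted(entries.items()):
        parent_id = e["in_reply_to"]
        if parent_id is None:
            continue
        parent = entries.get(parent_id)
        if parent is None:
            problems.append(f"{eid}: parent {parent_id} not in this file")
            continue
        if eid not in parent["reply_to"]:
            problems.append(f"{eid}: parent {parent_id} does not list it under 'Reply to'")
        if e["date"] and parent["date"]:
            if parsedate_to_datetime(e["date"]) < parsedate_to_datetime(parent["date"]):
                problems.append(f"{eid}: timestamped earlier than its parent {parent_id}")
    return problems
```

A thread that spans midnight is split over more than one file, so a "parent not in this file" message is not necessarily a fault.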
On the idea of a large number of entries, elog doesn't handle deleting a thread of more than 40 replies well - it crashes after deleting the 40th. This leaves an orphan thread that causes other issues. Do you have enough information to decide that this event always happens after x replies?
Thanks for the quick response! Well, I'd have to say that the sequence is as tangled as it looks in the logbook -- I've attached a copy of the log file for your reading pleasure.
This one is definitely a "head-scratcher" for me...it definitely seems like it is more prevalent on log entries with many replies.
I've had problems in the past due to a dodgy pointer creating branches despite a "No branches" in the configuration file. It would be very interesting to see what the 200428a.log file looks like with these entries: in the screenshot they appear to be shown in time order, but do the "Reply to" and "In reply to" lines in each entry (in the .log file) show a linear progression through the entries, a branch, or indeed this same order as the screenshot? If the duplicated entry were sequential to 5657 (i.e. 5658) then I would suspect something akin to my pointer's double click when I only made a single click, so fast that the second entry was created before the "No branches" checking part of the program had been reached. Not so sure about such an event here unless entry 5658 were already open but not closed?
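The linear-versus-branch question can be answered from the "In reply to" lines alone. A rough sketch follows; it assumes each entry in the .log file begins with a `$@MID@$: <id>` line and that its header ends at a line of `=` characters, which should be verified against the real file, and the function names are only illustrative.

```python
import re
from collections import defaultdict

def parent_map(text):
    """Map entry ID -> parent ID (None for a thread root) from .log file text."""
    parents, current, in_header = {}, None, False
    for line in text.splitlines():
        m = re.match(r"\$@MID@\$:\s*(\d+)", line)
        if m:
            current = int(m.group(1))
            parents[current] = None
            in_header = True
        elif in_header and line.startswith("====="):
            in_header = False  # assumed header/body separator
        elif in_header and line.startswith("In reply to:"):
            ref = re.search(r"\d+", line[len("In reply to:"):])
            parents[current] = int(ref.group()) if ref else None
    return parents

def children_map(parents):
    """Invert the parent map: parent ID -> sorted list of reply IDs."""
    children = defaultdict(list)
    for child, parent in parents.items():
        if parent is not None:
            children[parent].append(child)
    return {p: sorted(c) for p, c in children.items()}

def find_branches(parents):
    """Return {parent: [replies]} for every entry that has more than one reply."""
    return {p: c for p, c in children_map(parents).items() if len(c) > 1}

def walk_thread(parents, root):
    """Follow the lowest-numbered reply from root; return IDs in thread order."""
    children = children_map(parents)
    order, seen, node = [], set(), root
    while node is not None and node not in seen:
        order.append(node)
        seen.add(node)
        kids = children.get(node, [])
        node = kids[0] if kids else None
    return order
```

An empty result from `find_branches` together with a `walk_thread` order that differs from the ID order would point to mis-linked entries rather than a true branch.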
I've encountered an occasional problem that seems to be exacerbated by having a message with many replies.
In our use of ELOG, we run lengthy environmental tests (often several days) in multiple temperature chambers (one logbook for each chamber). We document the start of the test with a log entry, and then periodically create replies -- first to the original log entry, and then to each successive reply (no branching allowed), in order to document how far along the test is.
What I'm seeing is an occasional "hiccup" in the order of records -- in the snapshot below, you can see that the record ID(s) go (in chronological order) ....5654, 5655, 5656, 5659, 5657, 5658, 5660, 5661....
Additionally, in this example, record ID# 5659 and record ID# 5657 are duplicates -- duplicate time stamp and duplicate text.
Has anyone else encountered this?
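To hunt for other occurrences, a short script can flag the same two symptoms: an ID that appears after a higher ID in chronological order, and entries sharing a timestamp and text. This is only a sketch; the function names are made up for illustration, and the records in the duplicate example are placeholder values, not the real entry contents.

```python
from collections import defaultdict

def out_of_order(ids_in_time_order):
    """Return IDs that appear after a higher ID in the chronological listing."""
    flagged, highest = [], None
    for entry_id in ids_in_time_order:
        if highest is not None and entry_id < highest:
            flagged.append(entry_id)
        else:
            highest = entry_id
    return flagged

def duplicates(records):
    """Group IDs sharing the same (timestamp, text); records are (id, ts, text)."""
    groups = defaultdict(list)
    for entry_id, timestamp, text in records:
        groups[(timestamp, text)].append(entry_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]

# The sequence reported above, in chronological order.
sequence = [5654, 5655, 5656, 5659, 5657, 5658, 5660, 5661]
print(out_of_order(sequence))  # [5657, 5658]

# Placeholder records: timestamps and text are invented for the example.
sample = [(5657, "t1", "same text"), (5659, "t1", "same text"), (5658, "t2", "other")]
print(duplicates(sample))  # [[5657, 5659]]
```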