I could begin debugging with C++. In the interim, if you think it will help, I can also provide you with my Config and log files, and the section of redacted data encompassing the time of the hang - just let me know. Regarding CPU usage, I have noted that in the past and have never seen very high CPU usage during an Elog hang. The VM itself remains responsive.
Stefan Ritt wrote: |
I have to figure out where elog hangs. I guess it must be some kind of endless loop, triggered by some corrupt data in one of the elog entries. Under linux this is fairly simple (just run elogd under the gdb debugger, wait until it hangs, then press ctrl-c and enter "where" to see a full stack dump where elogd is currently executing). Under Windows this is more difficult, since you need Visual C++ from Microsoft to do the debugging. One thing you can do however without VC is to check if the CPU time is consumed to 100% by elogd, indicating an endless loop.
Stefan
Alan Grant wrote: |
I have a very long standing problem with elog over the last few versions where almost daily the service will hang. Cannot even Restart elogd, that just hangs. Clients experience Page not Found. I can only get the service reinitialized by rebooting the VM machine. I have Elog verbose logging On plus a number of external triage monitors running but nothing is yielding clues beyond the precise time the hang occurs. Aside from providing the Config and log files what else can I provide for you to assist, and what other triage measures can you suggest I try? FYI, there can be up to 20 users at one time doing searches (not updates), and I've trimmed the depth of log files that can be searched so that the CPU/service doesn't bog down but that hasn't helped either. Inserts happen in the background using the elog client app (about 2 or 3 inserts per batch at sporadic times).
|
|
|