Discussion forum about ELOG, Page 401 of 808
ID | Date | Icon | Author | Author Email | Category | OS | ELOG Version | Subject
  67713 | Wed Nov 12 03:19:17 2014 | Reply | Konstantin Olchanski | olchansk@triumf.ca | Bug report | Linux | 2.9.2-a738 | Re: Defunct daemons
We also see this in ALPHA at CERN. Eventually there are so many defunct elogd processes that the user runs out of "maxproc" quota and automatic submission
of elog messages starts to fail (and the users complain, reboot all computers, etc).

The elogd we use is this:
https://bitbucket.org/ritt/elog/commits/44800a769b99599db7620779e2142b1161c694fc?at=master

As best I can tell, the main elogd is spawning something but does not reap the finished subprocesses (wait() syscall). My guess is that it is spawning ImageMagick stuff
to create preview images.

K.O.
  67714 | Wed Nov 12 03:48:29 2014 | Reply | Konstantin Olchanski | olchansk@triumf.ca | Bug report | Linux | 2.9.2-a738 | Re: Defunct daemons
> Also see this in ALPHA at CERN.
> The elogd we use is this: https://bitbucket.org/ritt/elog/commits/44800a769b99599db7620779e2142b1161c694fc?at=master

Okay, found it. waitpid() in my_shell() is not protected against the periodic alarm signal. (UNIX signals are evil).

In the following log file, notice the entries that have "wait_status" of "-1". Those would have generated zombies ("defunct" processes).

Nov 12 03:43:05 alphacpc05 elogd[4809]: WAITPID pid 4873, wait_status 4873, errno 2 (No such file or directory), status 0, command "convert  
'/home/alpha/online/elog/logbooks/test/141112_034304_xvthr04.pdf[0-7]' -thumbnail '600' '/home/alpha/online/elog/logbooks/test/141112_034304_xvthr04-%d.png'"
Nov 12 03:43:05 alphacpc05 elogd[4809]: WAITPID pid 4880, wait_status 4880, errno 2 (No such file or directory), status 0, command "identify -format '%wx%h' 
'/home/alpha/online/elog/logbooks/test/141112_034304_xvthr04.pdf[0]'"
Nov 12 03:43:19 alphacpc05 elogd[4809]: WAITPID pid 4890, wait_status 4890, errno 2 (No such file or directory), status 0, command "identify -format '%wx%h' 
'/home/alpha/online/elog/logbooks/test/141112_034304_xvthr04.pdf[0]'"
Nov 12 03:43:19 alphacpc05 elogd[4809]: WAITPID pid 4896, wait_status -1, errno 4 (Interrupted system call), status 0, command "convert  
'/home/alpha/online/elog/logbooks/test/141112_034318_xvthr05.pdf[0-7]' -thumbnail '600' '/home/alpha/online/elog/logbooks/test/141112_034318_xvthr05-%d.png'"
Nov 12 03:43:19 alphacpc05 elogd[4809]: WAITPID pid 4896, wait_status 4896, errno 4 (Interrupted system call), status 0, command "convert  
'/home/alpha/online/elog/logbooks/test/141112_034318_xvthr05.pdf[0-7]' -thumbnail '600' '/home/alpha/online/elog/logbooks/test/141112_034318_xvthr05-%d.png'"
Nov 12 03:43:20 alphacpc05 elogd[4809]: WAITPID pid 4904, wait_status 4904, errno 4 (Interrupted system call), status 0, command "identify -format '%wx%h' 
'/home/alpha/online/elog/logbooks/test/141112_034318_xvthr05.pdf[0]'"
Nov 12 03:43:48 alphacpc05 elogd[4809]: WAITPID pid 4922, wait_status 4922, errno 2 (No such file or directory), status 0, command "identify -format '%wx%h' 
'/home/alpha/online/elog/logbooks/test/141112_034304_xvthr04.pdf[0]'"
Nov 12 03:43:49 alphacpc05 elogd[4809]: WAITPID pid 4929, wait_status -1, errno 4 (Interrupted system call), status 1302603136, command "identify -format '%wx%h' 
'/home/alpha/online/elog/logbooks/test/141112_034318_xvthr05.pdf[0]'"
Nov 12 03:43:49 alphacpc05 elogd[4809]: WAITPID pid 4929, wait_status 4929, errno 4 (Interrupted system call), status 0, command "identify -format '%wx%h' 
'/home/alpha/online/elog/logbooks/test/141112_034318_xvthr05.pdf[0]'"
Nov 12 03:43:50 alphacpc05 elogd[4809]: WAITPID pid 4935, wait_status 4935, errno 2 (No such file or directory), status 0, command "convert  
'/home/alpha/online/elog/logbooks/test/141112_034348_xvthr06.pdf[0-7]' -thumbnail '600' '/home/alpha/online/elog/logbooks/test/141112_034348_xvthr06-%d.png'"
Nov 12 03:43:50 alphacpc05 elogd[4809]: WAITPID pid 4943, wait_status 4943, errno 2 (No such file or directory), status 0, command "identify -format '%wx%h' 
'/home/alpha/online/elog/logbooks/test/141112_034348_xvthr06.pdf[0]'"

The following patch is verified not to generate zombies; please apply it to the master branch of elog:

alphadaq.cern.ch:~/packages/elog> git diff
diff --git a/src/elogd.c b/src/elogd.c
index 277ba30..2d9a848 100755
--- a/src/elogd.c
+++ b/src/elogd.c
@@ -892,14 +892,25 @@ int my_shell(char *cmd, char *result, int size)
 
 #ifdef OS_UNIX
    pid_t child_pid;
-   int fh, status, i;
+   int fh, status, i, wait_status;
    char str[1024];
 
    if ((child_pid = fork()) < 0)
       return 0;
    else if (child_pid > 0) {
       /* parent process waits for child */
-      waitpid(child_pid, &status, 0);
+
+      while (1) {
+         wait_status = waitpid(child_pid, &status, 0);
+
+         sprintf(str, "WAITPID pid %d, wait_status %d, errno %d (%s), status %d, command \"%s\"", child_pid, wait_status, errno, strerror(errno), status, cmd);
+         write_logfile(NULL, str);
+         eprintf("%s", str);
+
+         if (wait_status == -1 && errno == EINTR)
+            continue;
+         break;
+      }
 
       /* read back result */
       memset(result, 0, size);
diff --git a/src/git-revision.h b/src/git-revision.h

K.O.
  67915 | Wed May 20 01:45:09 2015 | Entry | Konstantin Olchanski | olchansk@triumf.ca | Bug report | Linux | 3.1.0 | elogd complains about unknown cookies
elogd is spewing these messages about unknown cookies:

Received unknown cookie "is_returning"
Received unknown cookie "__utma"
Received unknown cookie "__utmz"
Received unknown cookie "SSESSee3cc9c70bedf9a840203765bf409d7b"
Received unknown cookie "SESSee3cc9c70bedf9a840203765bf409d7b"
Received unknown cookie "MidasWikiUserID"
Received unknown cookie "MidasWikiUserName"
Received unknown cookie "MidasWiki_session"

K.O.
  67916 | Wed May 20 01:49:37 2015 | Entry | Konstantin Olchanski | olchansk@triumf.ca | Bug report | Linux | 3.1.0 | elconv deletes everything
When converting from elog 2.9.something to the new elog 3.1.0, elogd refuses to start and instructs me to run elconv on one logbook.

When I do so, elconv converts the existing mhttpd-style elog entries to the new format (the corresponding new-format entries already exist)
and deletes everything else. This is very bad.

So there are 2 bugs:
- elogd should not tell us to run elconv when both old-style and corresponding new-style elog entries exist
- elconv should not delete all existing new-style elog entries.

I confirm that elconv *does* delete all new-style elog entries - with strace, I see it issue "unlink" on every elog entry.

What a disaster!

K.O.
  67917 | Wed May 20 01:52:23 2015 | Entry | Konstantin Olchanski | olchansk@triumf.ca | Bug report | Other | this one | this elog errors sending email
This elog gives errors when sending mail through the PSI email server (I did not capture the error messages, sorry). K.O.
  67918 | Wed May 20 01:54:55 2015 | Entry | Konstantin Olchanski | olchansk@triumf.ca | Bug report | Other | this one | edit somebody else's draft
This elog offers to let me edit a draft message, then yells at me "only some other user can edit this draft!!!".
Methinks I should only be offered draft messages that I own or am allowed to edit. K.O.
  67919 | Wed May 20 01:59:17 2015 | Entry | Konstantin Olchanski | olchansk@triumf.ca | Bug report | Linux | 3.1.0 | elogd moves elog entries
elogd 3.1.0 moves all elog entries into year-named subdirectories. This feature makes it incompatible with older elogs, so it should be clearly mentioned in the documentation,
in the release announcement and in the release and migration notes. K.O.
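
To make the layout change concrete, here is a minimal sketch of how an entry file named in the yymmdda.log pattern (quoted in the reply below) could map to a year-named subdirectory. This is not taken from the elogd source: the function name year_subdir_path, the four-digit directory name and the rule that every two-digit year means 20yy are assumptions for illustration only.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<year>/<file_name>" for a file whose name
   starts with yymmdd, e.g. "150520a.log". Returns 0 on success, -1 if the
   name does not start with six digits or the output buffer is too small. */
static int year_subdir_path(const char *file_name, char *out, size_t out_size)
{
   int i, year, n;

   /* the loop also rejects names shorter than six characters,
      because the terminating '\0' is not a digit */
   for (i = 0; i < 6; i++)
      if (!isdigit((unsigned char) file_name[i]))
         return -1;

   /* Assumption: all two-digit years fall in 2000-2099 */
   year = 2000 + (file_name[0] - '0') * 10 + (file_name[1] - '0');

   n = snprintf(out, out_size, "%d/%s", year, file_name);
   if (n < 0 || (size_t) n >= out_size)
      return -1;

   return 0;
}
```

Under these assumptions an old-style file in the logbook directory would end up one level deeper, which is why an older elogd pointed at the same directory would no longer find it.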
  67924 | Wed May 20 20:03:06 2015 | Reply | Konstantin Olchanski | olchansk@triumf.ca | Bug report | Linux | 3.1.0 | Re: elogd moves elog entries
> Stefan told me that the change was because some users were having thousands of yymmdda.log files
> in the logbook directories

I am one of those users. The elog for the ALPHA experiment at CERN goes back to 2006 or so,
with a large volume of messages and a huge number of attachments. The MIDAS forum elog goes back to 2003.
The TRIUMF DAQ internal elog goes back to 2001.

I think the new organization is an improvement.

K.O.