Small improvement of rc script makes big difference (bug report 35898)

Simon Perreault nomis80 at yahoo.com
Fri Jul 28 21:12:54 PDT 2000


> I've tried to test it as well but I think I don't understand what you
> are trying to accomplish and therefore I haven't been able to create
> mistakes on purpose to trigger it.

Seems like creating mistakes is easier for some than for others... ;)

>
> Could you give me a situation where a startup() call will fail.

I modified the rc script when my dhcpcd script failed without signaling the
failure with a [FAILED]. The problem was a leftover
/var/run/dhcpcd-eth0.pid file. Modifying the rc script was the only way I
could get a [FAILED] notice.

> One
> reason I can think of is startup() trying to execute one of the symlinks
> in say /etc/rc3.d which point to a non-existing file, but that will
> never be the case. Way above in the script is a line that checks whether
> the symlink points to a valid file or not. The startup() call is only
> reached when a symlink exists to an existing file.

That would be one reason. But another, much more common, reason would be, as
you mention in the following paragraph, an /etc/init.d script failing with a
non-zero return value.

>
> The only way an /etc/init.d script would give back a non-zero return
> value is when the script aborts with something like "exit 1", but an
> "exit 1" is always preceded by an error message.

And so it should say [FAILED]. Simply. If the script fails, you can't just
say "We don't need a red failed notification, we already have an error
message." The evaluate_retval function evaluates the return value of sub sub
scripts, where a non-zero retval should also come with an error message, but
you implemented a [FAILED] notification there anyway. Remember: "Consistency".

We should do it for the same reason it is there for sub sub scripts: to
quickly and visually identify failed booting steps.
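
To make the rc modification concrete, here is a minimal sketch of the idea,
not the actual rc code. The startup() body is my own guess at how such a
wrapper could look, and print_status is replaced by a simplified stand-in
(the real function also handles colors and positioning).

```shell
#!/bin/sh
# Hypothetical stand-in for the real print_status function; it only
# prints the notice text.
print_status() {
    case "$1" in
        success) echo "[  OK  ]" ;;
        failure) echo "[FAILED]" ;;
    esac
}

# Sketch of an rc-side wrapper (assumed shape, not the real startup()):
# run the sub script with its arguments and flag any non-zero return
# value with a [FAILED] notice.
startup() {
    "$@"
    retval=$?
    if [ "$retval" -ne 0 ]; then
        print_status failure
    fi
    # Hand the original return value back to the caller.
    return "$retval"
}
```

With this in place, something like "startup /etc/init.d/sysklogd start" would
get a [FAILED] notice whenever the script returns non-zero, whatever the
reason for the failure was.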

> *)
> echo -n "Usage: $0 {start|stop|reload|restart|status}"
> print_status failure
> ;;
>
> # End of /etc/init.d/sysklogd

That would work too. You can still add the previous code to the sysklogd
script while using the rc modification. It would be more graceful, yes. And
I'd recommend it. But you still need an rc modification to cover _all_
possible errors. There are countless possible ways a script can fail, and
adding custom-fit code for every one of them would be a tedious job, to say
the least.

>
> An 'exit 1' wouldn't be necessary anymore (but you can still do it if
> you want of course)
>
> Now that I have been thinking about it, while trying to understand your
> rc script modification, I think I'll make these changes to the scripts.

You almost understood what I meant (or maybe you did, and I just didn't
realize that you had (am I confusing?)). Let me explain how errors should be
dealt with. Now that I think about it, in light of your feedback, this is
pretty different from the current error-handling procedure.

1) rc launches a sub script
    1.1) the sub script executes commands
    1.2) the sub script checks for errors within commands, and DOES NOT make
use of the "print_status failure" function. It simply exits with a return
value of 1. The [FAILED] notification is taken care of by the rc script.
    1.3) you can add code in the script to deal with many kinds of errors,
and have the script output custom-fit error messages (like your previous
example, but not limited to it). The custom-fit code should print its
custom-fit error message, without adding a [FAILED] notice.
2) rc should evaluate the sub script's retval, and issue a "print_status
failure" command if retval is non-zero.

So, no "print_status" nor "evaluate_retval" function should be called in the
sub script. That should be taken care of in the rc script.
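
The sub script side of steps 1.1 to 1.3 could be sketched roughly like this.
It is only an illustration: sub_script_sketch is a made-up name, the script
body is written as a function so it can be tried out in one file, and "true"
stands in for whatever command the real script would run.

```shell
#!/bin/sh
# Hypothetical sub script body. The parentheses run it in a subshell,
# so "exit" ends only the sub script, as it would for a separate file.
sub_script_sketch() (
    case "$1" in
        start)
            # 1.1) execute the command ("true" is a placeholder)
            # 1.2) on error, just exit with 1, no print_status here
            true || exit 1
            ;;
        *)
            # 1.3) custom-fit error message, without a [FAILED] notice
            echo "Usage: sub_script_sketch {start}" >&2
            exit 1
            ;;
    esac
)
```

The script never prints [FAILED] itself. Whoever calls it (rc, in step 2)
only has to look at $? afterwards and decide whether to issue the
"print_status failure".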

But then, one problem arises (or maybe more, I just thought about this one):
what if, like in sysklogd, you have more than one command to run, and you
have to use more than one print_status call? I suggest that this kind of
practice should be non-standard. After all, if you look at the sysklogd
script, there's a defined order in which each logging daemon must be
started. I suggest that two distinct scripts be created. Each script should
only need one print_status call. Some scripts would need to be split... Hum.
I'm now thinking about other scripts, like mountfs, checkfs, and possibly
others. Maybe this radical change isn't that good after all.

So, this is where I stand:
1) A modification to rc NEEDS to be done so that sub scripts that exit with
a non-zero return value get a [FAILED] notification.
2) If we change the way errors get dealt with, the way I described above,
some scripts will need to be split. That would be a bit of work, but it
would maybe result in more error-proof scripts, and maybe we would never
again scour the scripts to find where the swap is activated, as it would
have its own script. Since that would restrict flexibility, I don't
think this is such a good idea. I'm not too sure about this issue.

As usual, this is just my newbie advice.



