On Sat 2017-04-08 00:13:06, Sergey Senozhatsky wrote:
On (04/07/17 14:44), Pavel Machek wrote:
> > [..]
> > > I believe "spend at most 2 seconds in printk(), then print a warning
> > > and offload" is a solution closer to what we had before.
> > a warning here can be very noisy.
> Well, on normally-configured systems it should be ok. We don't commonly
> see printk problems... If it is too noisy, perhaps we should increase
> from 2 seconds, but I don't think it will be a problem.
we are looking at different typical setups :) serial console being 45
seconds behind logbuf does not surprise me anymore.
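
For reference, the "spend at most 2 seconds in printk(), then print a warning and offload" idea can be modelled roughly as below. This is only a userspace sketch under stated assumptions, not kernel code: flush_with_budget(), struct flush_result and the per-message cost are hypothetical names and numbers made up for illustration.

```c
#include <stddef.h>

/* Hypothetical model of one printk() caller flushing to a slow console. */
struct flush_result {
	size_t printed;   /* messages pushed to the console by the caller */
	size_t offloaded; /* messages left for the offload thread */
	int warned;       /* 1 if a "message delayed" warning would be emitted */
};

/*
 * Print pending messages while the time budget lasts, then warn once
 * and hand the remainder over. cost_ms is the simulated time needed to
 * push one message to the console; budget_ms would be 2000 for the
 * "2 seconds" discussed in this thread.
 */
struct flush_result flush_with_budget(size_t pending, unsigned int cost_ms,
				      unsigned int budget_ms)
{
	struct flush_result r = { 0, 0, 0 };
	unsigned int spent_ms = 0;

	while (r.printed < pending && spent_ms + cost_ms <= budget_ms) {
		spent_ms += cost_ms;
		r.printed++;
	}

	if (r.printed < pending) {
		r.warned = 1;
		r.offloaded = pending - r.printed;
	}

	return r;
}
```

With a console that is far behind logbuf the warning fires on nearly every flush, which is the noise concern raised above; with a console that keeps up, it never fires.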
> > what we have been thinking about is something like printk-stall detection.
> > we probably (there are some if-s) can detect in printk() that offloading
> > does not work and we must automatically switch to printk_emergency mode.
> > that, in theory, can relax our dependency on printk_emergency_begin/end
> > being in the right place at the right time. need to think more about it.
> So... I don't really like the begin/end interface. I would rather have
> printk_emergency(KERN_ ...).
you mean a single printk_emergency() switches printk to emergency mode
or printk_emergency(KERN_ ... ) is a single message that must be printed
in emergency mode?
The latter. Having state is ugly.
printk() depends on console_trylock(). we can't expect
to always do more than just log_store().
the idea behind begin/end interface is that you can do
without the need of rewriting dump_stack() or anything else to use
printk_emergency(). we, for example, do this in sysrq patch from this
Well.. I guess it is less work to include emergency_begin/end() but I
also believe the state-less solution will be cleaner.
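
To make the two interface shapes concrete, here is a rough userspace model. Everything in it is a hypothetical stand-in (mock_printk(), mock_emergency_begin/end(), mock_printk_emergency(), mock_dump_stack()); none of these are the real kernel functions, and the counters only record which path a message would take.

```c
/* Hypothetical model: counts messages by the path they would take. */
static int emergency_depth;  /* state kept by the begin/end variant */
static int direct_count;     /* messages printed directly by the caller */
static int deferred_count;   /* messages left to the offload path */

/* Stateful variant: open and close an emergency section. */
void mock_emergency_begin(void)
{
	emergency_depth++;
}

void mock_emergency_end(void)
{
	if (emergency_depth > 0)
		emergency_depth--;
}

/* Plain printk model: direct only while an emergency section is open. */
void mock_printk(const char *msg)
{
	(void)msg;
	if (emergency_depth > 0)
		direct_count++;
	else
		deferred_count++;
}

/* Stateless variant: this one message is always printed directly. */
void mock_printk_emergency(const char *msg)
{
	(void)msg;
	direct_count++;
}

/* Existing helper that only knows about plain printk, like dump_stack(). */
void mock_dump_stack(void)
{
	mock_printk("frame 0");
	mock_printk("frame 1");
}
```

In this model, wrapping mock_dump_stack() in begin/end sends its output down the direct path without touching the helper, while the stateless call needs no global state but only affects the one message it is given. That is the trade-off being argued here.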
> Second... I don't think "stuck detector" is that helpful. What I
> usually saw was some rather innocent kernel message followed by
> hard-lock. That's where "message delayed" is useful..
a side note,
it's rather unclear to me how "message delayed" would really help.
if your system hard-locks up so badly that there are no printk messages
even from the NMI watchdog, then we won't be able to print that message.
We are talking about
do_something_clever(); /* Which unfortunately hard-crashes the machine */
that works with my proposal, but not with yours. Seen it happen many
times.
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html