[cpia] Re: CPiA problem?

Johannes Erdfelt jerdfelt@sventech.com
Tue, 16 May 2000 13:03:12 -0400


On Tue, May 16, 2000, Chris Jones <chris@black-sun.co.uk> wrote:
> I just tried out 2.3.99-pre8 on my 2xCeleron 433 ABit BP6 machine (256MB
> RAM). I decided to stress test it a bit so I left xawtv running on the
> webcam /dev/video entry, an mpeg playing and an mp3 playing overnight.
> From the point of the cpia module being loaded at 06:56am (i was using your
> alternative usb stack - the main one just hung my machine) I had xawtv
> reading the webcam pretty much continuously and the last entry in my
> /var/log/messages is at 13:29pm so it managed to go quite a while, but after
> that the system was very unstable - I managed to issue some SysRq commands,
> but it hung and corrupted the video display before I could finish.
> 
> I've looked through the logs and I can't find any oopsen so I guess this may
> be hard to track down, but I've put some of the more important bits of the
> log below (let me know if you need anymore).
> 
> On a slightly different note - using xawtv to view the webcam I could only
> take 2 jpeg snapshots with it before it would exit when trying to take
> another - is this a CPiA problem or an xawtv problem?

It's hard to tell whose problem it is. There is apparently a bug in the
CPiA code with respect to getting a number of corrupted frames in a short
time. One of my coworkers showed me his machine, which has the CPiA camera
stuck with a bunch of error messages about parsing the compressed frame.

I haven't been able to duplicate the problem quite yet. These may be
related.

> This is me loading your alternative usb stack:
> May 16 06:55:58 bitch kernel: usb.c: registered new driver hub 
> May 16 06:55:58 bitch kernel: uhci.c: USB UHCI at I/O 0xc000, IRQ 19 
> May 16 06:55:58 bitch kernel: uhci.c: detected 2 ports 
> May 16 06:55:58 bitch kernel: usb.c: new USB bus registered, assigned bus
> number 1 
> May 16 06:55:58 bitch kernel: usb.c: USB new device connect, assigned device
> number 1 
> May 16 06:55:58 bitch kernel: hub.c: USB hub found 
> May 16 06:55:58 bitch kernel: hub.c: 2 ports detected 
> May 16 06:56:00 bitch kernel: usb.c: USB new device connect, assigned device
> number 2 
> May 16 06:56:00 bitch kernel: usb.c: This device is not recognized by any
> installed USB driver. 

Looks good.

> This is me loading the CPiA module
> May 16 06:56:12 bitch kernel: Linux video capture interface: v1.00 
> May 16 06:56:12 bitch kernel: V4L-Driver for Vision CPiA based cameras
> v0.7.4 
> May 16 06:56:12 bitch kernel: usb.c: registered new driver cpia 
> May 16 06:56:12 bitch kernel: USB CPiA camera found 
> May 16 06:56:12 bitch kernel:   CPiA Version: 1.20 (2.0) 
> May 16 06:56:12 bitch kernel:   CPiA PnP-ID: 0553:0002:0100 
> May 16 06:56:12 bitch kernel:   VP-Version: 1.0 0100 

Looks good.

> There were an awful lot of these (I have heard this is an ABit BP6 problem?)
> May 16 07:52:44 bitch kernel: APIC error interrupt on CPU#0, should never
> happen. 
> May 16 07:52:44 bitch kernel: ... APIC ESR0: 00000000 
> May 16 07:52:44 bitch kernel: ... APIC ESR1: 00000004 
> May 16 07:52:44 bitch kernel: ... bit 2: APIC Send Accept Error. 
> May 16 07:59:14 bitch kernel: usb_control/bulk_msg: timeout 
> May 16 08:05:56 bitch kernel: APIC error interrupt on CPU#1, should never
> happen. 
> May 16 08:05:56 bitch kernel: ... APIC ESR0: 00000000 
> May 16 08:05:56 bitch kernel: ... APIC ESR1: 00000002 
> May 16 08:05:56 bitch kernel: ... bit 1: APIC Receive CS Error (hw problem). 

Looks very bad.

This could be the cause of some of the problems you were seeing. I'm pretty
confident that we (USB or CPiA) didn't cause this problem.
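
For reference, here is a rough sketch of how the ESR values in those
messages map to error names. The bit names for bits 1 and 2 match what the
kernel printed in the log above; the remaining names follow my understanding
of Intel's local APIC documentation and should be treated as an assumption.
The helper names decode_esr and decode_esr_hex are made up for illustration
and are not kernel functions.

```python
# Local APIC Error Status Register bits. Bits 1 and 2 match the kernel
# messages in the log; the other names are assumed from Intel's local APIC
# documentation. Bit 4 is left out because its meaning varies by CPU.
ESR_BITS = {
    0: "Send Checksum Error",
    1: "Receive Checksum Error",
    2: "Send Accept Error",
    3: "Receive Accept Error",
    5: "Send Illegal Vector",
    6: "Received Illegal Vector",
    7: "Illegal Register Address",
}


def decode_esr(value):
    """Return the names of the known error bits set in an ESR value."""
    return [name for bit, name in sorted(ESR_BITS.items()) if value & (1 << bit)]


def decode_esr_hex(text):
    """Decode an ESR value printed as hex, e.g. '00000004' from the log."""
    return decode_esr(int(text, 16))


if __name__ == "__main__":
    # The two ESR1 values from the log excerpt.
    for raw in ("00000004", "00000002"):
        print(raw, "->", decode_esr_hex(raw))
```

Both values in your log are bus-level errors between the two local APICs,
which is why they point at the hardware rather than at a driver.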

> This was the final log entry before the machine froze.
> May 16 13:29:39 bitch kernel: usb_control/bulk_msg: timeout

Quite a while before this happened. Were there any other USB/UHCI related
messages?
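
If it helps, something like the following could pull the relevant lines out
of /var/log/messages. It is only a sketch: the keyword list is my guess at
what is worth looking for, usb_related is a made-up helper name, and the
sample lines are abbreviated from your log with the hostname replaced by a
placeholder.

```python
# Keywords that suggest a kernel message came from the USB stack or the
# camera driver. This list is an assumption; extend it as needed.
KEYWORDS = ("usb", "uhci", "hub.c", "cpia")


def usb_related(lines):
    """Yield syslog lines whose kernel message mentions one of KEYWORDS."""
    for line in lines:
        # Only look at the text after "kernel: " so that hostnames and
        # timestamps cannot cause false matches.
        _, sep, message = line.partition("kernel: ")
        if sep and any(word in message.lower() for word in KEYWORDS):
            yield line.rstrip("\n")


# Sample lines abbreviated from the log excerpt ("host" is a placeholder).
SAMPLE = [
    "May 16 07:52:44 host kernel: ... bit 2: APIC Send Accept Error.",
    "May 16 07:59:14 host kernel: usb_control/bulk_msg: timeout",
    "May 16 13:29:39 host kernel: usb_control/bulk_msg: timeout",
]

if __name__ == "__main__":
    for hit in usb_related(SAMPLE):
        print(hit)
```

To run it against the real log, pass an open file object instead of SAMPLE,
for example usb_related(open("/var/log/messages")).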

If not, then there are a couple of possibilities. The camera could be
continuously NAKing the control transfer, which is unlikely but
possible.

Or the Host Controller could have locked up, but it usually tells us
something really bad happened with an interrupt.

There could be some sort of problem with the APIC, which is definitely a
possibility given the errors you see above.

Have you sent a message to linux-kernel yet about the APIC problem?

JE