[Linux-Xtensa] Re: A2000 kernel work - Welcome aboard Ed; current changes in the Kernel 2.6.29-smp git repo, buildroot direction, ...

Piet Delaney pdelaney at tensilica.com
Tue Nov 30 00:19:38 PST 2010


Morning Ed, welcome aboard, I thought it might be helpful
to mention a bit of what's going on recently/now on the Xtensa kernel
and buildroot. I thought it would likely be worthy of sharing with
the rest of the Linux-Xtensa community; hope you don't mind my killing
two birds with one stone here.

For the kernel I'm currently thinking of merging 'Initialize_MMU_Inside_vmlinux'
into the master branch now that it ran relatively fine on the LX200 with the new
V3 MMU over Thanksgiving.

LTP with the Audio stress test ran for almost 5 days and had a reasonable
FAIL/PASS ratio of 0.024 (2.4%):

  cat Log.25-Nov-2010 | grep PASS | wc
   42561  395693 2584913

cat Log.25-Nov-2010 | grep FAIL | wc
    1035    8914   65234

An LX200 with the three-core SMP kernel has run for almost 5 days now
with a significantly better FAIL/PASS ratio of only 0.0177 (about 1.8%).

cat Log.25-Nov-2010 | grep PASS | wc
  121755 1131657 7392343

cat Log.25-Nov-2010 | grep FAIL | wc
    2154   17562  110662
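
The FAIL/PASS arithmetic above can be reproduced straight from a log. This is only a minimal sketch: the function name ltp_ratio is hypothetical, and it assumes each LTP result occupies one line containing either PASS or FAIL, which is what the grep | wc line counts above rely on.

```shell
# Hypothetical helper: print PASS count, FAIL count and the FAIL/PASS
# ratio for an LTP log. grep -c counts matching lines, which is the
# same number as the first column of "grep ... | wc".
ltp_ratio() {
    log=$1
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 1; }
    pass=$(grep -c PASS "$log" || true)
    fail=$(grep -c FAIL "$log" || true)
    awk -v p="$pass" -v f="$fail" 'BEGIN {
        if (p + 0 == 0) { print "no PASS lines"; exit 1 }
        printf "PASS=%d FAIL=%d ratio=%.4f (%.2f%%)\n", p, f, f / p, 100 * f / p
    }'
}
```

Running it as "ltp_ratio Log.25-Nov-2010" on each board would give the counts and the ratio in one step. With the counts shown above, 1035/42561 is about 0.0243 and 2154/121755 is about 0.0177.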

The smaller boards, the LX60 and LX110, didn't do as well,
I suspect due to their having much less memory.

	LX200:	96MB
	LX110:	48MB
	LX60:	64MB

I've seen problems on the A2000 with the LTP tests, perhaps due to its
using a lot of memory for the RAM based file system. I'd like
to test this new master branch with the A2000 and make sure
it's improving.
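
One way to check the memory theory would be to sample free memory while LTP runs. A minimal sketch follows; the helper names mem_free_kb and log_mem_free are hypothetical, and it assumes the target's /proc/meminfo has the usual "MemFree:" line.

```shell
# Hypothetical helper: print MemFree in kB from a meminfo-style file
# (defaults to /proc/meminfo).
mem_free_kb() {
    awk '/^MemFree:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Hypothetical logger: append "<epoch seconds> <MemFree kB>" to a file
# every INTERVAL seconds until it is killed.
log_mem_free() {
    out=$1
    interval=${2:-60}
    while :; do
        echo "$(date +%s) $(mem_free_kb)" >> "$out"
        sleep "$interval"
    done
}
```

Started in the background before the LTP run (for example "log_mem_free /some/nfs/path/memfree.log 60 &", where the path is a placeholder), the log should show whether free memory trends toward zero before tests start failing. Writing it somewhere other than the memory based /tmp keeps the logger from adding to the problem.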

One issue is that after 4 days and 20 hours the kernel on the
LX200 with the three HiFi-2 cores appears to have locked up:

(gdb) bt
#0  0xd00cc0be in _raw_spin_lock (lock=0xd027b214) at lib/spinlock_debug.c:79
#1  0xd01501f4 in _spin_lock (lock=0xd027b214) at kernel/spinlock.c:181
#2  0xd004370d in page_check_address (page=0xd0282df4, mm=0xd2977438, address=0x2018c000, ptlp=0xd5a7d9b0, sync=<value optimized out>) at mm/rmap.c:296
#3  0xd00437f0 in page_referenced_one (page=0xd0282df4, vma=0xd5a455f0, mapcount=0xd5a7d9e0) at mm/rmap.c:348
#4  0xd004440d in page_referenced (page=0xd0282df4, is_locked=<value optimized out>, mem_cont=<value optimized out>) at mm/rmap.c:408
#5  0xd0038042 in shrink_page_list (page_list=0xd5a7db00, sc=0xd5a7dbf0, sync_writeback=PAGEOUT_IO_ASYNC) at mm/vmscan.c:634
#6  0xd00388fd in shrink_zone (priority=0x0, zone=0xd01bb7a0, sc=0xd5a7dbf0) at mm/vmscan.c:1104
#7  0xd0039229 in try_to_free_pages (zonelist=0xd01bbfa0, order=<value optimized out>, gfp_mask=<value optimized out>) at mm/vmscan.c:1572
#8  0xd0033aea in __alloc_pages_internal (gfp_mask=0x1201d2, order=0x0, zonelist=0xd01bbfa0, nodemask=0x0) at mm/page_alloc.c:1584
#9  0xd00357fe in __do_page_cache_readahead (mapping=0xd542bb78, filp=0xd5a95e60, offset=0x12, nr_to_read=0xf, lookahead_size=0x0) at include/linux/gfp.h:178
#10 0xd00358fc in do_page_cache_readahead (mapping=0xd542bb78, filp=0xd5a95e60, offset=0x12, nr_to_read=0xf) at mm/readahead.c:223
#11 0xd002fe44 in filemap_fault (vma=0xd5a7aeb0, vmf=0xd5a7dde0) at mm/filemap.c:1520
#12 0xd003ef60 in __do_fault (mm=0xd5ad8d60, vma=0xd5a7aeb0, address=0x419558, pmd=0xd5ac0004, pgoff=0x19, flags=0x0, orig_pte={pte = 0xc, present = {x = 0x0, w = 0x0, attr = invalid, ring = kern, 
present = 0x0, dirty = 0x0, accessed = 0x0, writable = 0x0, _unused = 0x0, ppn = 0x0}}) at mm/memory.c:2595
#13 0xd003f918 in handle_mm_fault (mm=0xd5ad8d60, vma=0xd5a7aeb0, address=0x419558, write_access=0x0) at mm/memory.c:2740
#14 0xd0007fad in do_page_fault (regs=0xd5a7df40) at arch/xtensa/mm/fault.c:113
#15 0xd0003e04 in _user_exception () at arch/xtensa/kernel/entry.S:332
(gdb)

I think I've seen this recently and thought I fixed it last week
with an exception table fix. Perhaps this implies we have a rare
bug in exception handling. It's very hard to say.

Some of the changes we have made since the snapshot_2+SMP branch
7 months ago have only been stress tested for three days without
any problems showing up and since this only seems to have shown up
after 4 days it's difficult to know when the possible/likely regression
crept in. For the HiFi-2 Three core release we had over three weeks
of reliability under constant LTP stress testing on the snapshot_2+SMP branch.

It's possibly an artefact of running out of memory. Our writing
everything in /var to /tmp (memory based) feels like asking for
trouble. I'm thinking of letting /var/log/messages et al. go
to the NFS mounted root. It would save memory and make /var/log/messages
available after a crash.

	[root@rtos11 ddd-3.3.12]# cd /var
	[root@rtos11 var]# ls -l
	lrwxrwxrwx    1 root     root            6 Nov 25 01:46 cache -> ../tmp/
	drwxr-xr-x    2 root     root         4096 Aug  3 11:04 empty/
	drwxr-xr-x    3 root     root         4096 Aug 13 19:45 lib/
	lrwxrwxrwx    1 root     root            6 Nov 25 01:46 lock -> ../tmp/
	lrwxrwxrwx    1 root     root            6 Nov 25 01:46 log -> ../tmp/
	lrwxrwxrwx    1 root     root            6 Nov 25 01:46 pcmcia -> ../tmp/
	lrwxrwxrwx    1 root     root            6 Nov 25 01:46 run -> ../tmp/
	lrwxrwxrwx    1 root     root            6 Nov 25 01:46 spool -> ../tmp/
	lrwxrwxrwx    1 root     root            6 Nov 25 01:46 tmp -> ../tmp/
	[root@rtos11 var]#
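
Moving the logs off the memory based /tmp could be as simple as replacing the log symlink with a real directory on the root file system. A minimal sketch, not something tested on the boards: the function name persist_var_log is hypothetical, and it assumes syslogd is restarted afterwards so that it reopens /var/log/messages.

```shell
# Hypothetical helper: if VAR/log is a symlink (e.g. log -> ../tmp/),
# replace it with a real directory so logs land on the root file system
# (NFS mounted here) instead of the memory based /tmp.
persist_var_log() {
    var=${1:-/var}
    if [ -L "$var/log" ]; then
        # rm on the link name (no trailing slash) removes only the
        # symlink, not the /tmp directory it points at.
        rm "$var/log" && mkdir "$var/log"
    fi
}
```

Only the log entry is touched; cache, lock, run and the rest stay pointed at /tmp, since those are small and do not need to survive a crash.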

I'm not seeing the hang on the simple non-SMP test on the LX200 running
with the DC233 (V3 MMU) (yet?).  I suspect that the regression, if it exists,
is already in the master branch in the pgtable.h changes that Marc and I
made quite a while ago to fix pthread problems with shared libraries. Lots
of variables have changed.

I hate losing our best longevity of 3.5 weeks during the HiFi-2 SMP release;
I'm hoping we can do much better.


On buildroot, I'd really like to get a unified Development Environment
for all of the important Variants, including the A2000. I started a web page
for the A2000 board as I'm building it.

	http://wiki.linux-xtensa.org/index.php/Transwitch_A2000_Development_Board

I'm using git repos for both the kernel and buildroot. I'm updating the web pages now,
using the Development Environment for the SMP HiFi and A2000 as a model. I'm running it
now on the DC233L (new V3 MMU).

I'll add an Updated HiFi snapshot based on the 2nd Snapshot:

	http://wiki.linux-xtensa.org/index.php/Mplayer_Hifi_2_Codec_Development_Board

and recent kernel changes going in now and make a Third Snapshot:

	http://wiki.linux-xtensa.org/index.php/Buildroot_Snapshots

I thought it best to support the Transwitch A2000 and the Chipsbank 570T variant as well;
this will enable quick support via our engineering staff.

Due to time limitations it's likely best to make the 3rd snapshot the last 2.6.29-smp version and
then move on to a 2.6.35-smp kernel, which apparently is desirable for embedded platforms.

It would also be nice to move forward with buildroot, but I think that's
a big effort and likely best for a 4th snapshot with 2.6.35-smp.

Christian Zankel started an attempt at a new buildroot a few months ago:

	http://git.linux-xtensa.org/cgi-bin/git.cgi?p=buildroot/xtensa;a=summary

perhaps we should start with it.

I'd like to get the std buildroot Development Environment, as seen in the buildroot .config:

#
# Other development stuff
#
BR2_PACKAGE_AUTOCONF=y
BR2_PACKAGE_AUTOMAKE=y
BR2_PACKAGE_BISON=y
# BR2_PACKAGE_CCACHE_TARGET is not set
# BR2_PACKAGE_CVS is not set
# BR2_PACKAGE_DISTCC is not set
# BR2_PACKAGE_DMALLOC is not set
# BR2_PACKAGE_EXPAT is not set
# BR2_PACKAGE_FAKEROOT is not set
BR2_HOST_FAKEROOT=y
# BR2_PACKAGE_GETTEXT is not set
# BR2_PACKAGE_LIBINTL is not set
# BR2_PACKAGE_LIBGMP is not set
# BR2_PACKAGE_LIBMPFR is not set
# BR2_PACKAGE_LIBTOOL is not set
# BR2_PACKAGE_M4 is not set
# BR2_PACKAGE_MPATROL is not set
# BR2_PACKAGE_PKGCONFIG is not set
# BR2_READLINE is not set
# BR2_PACKAGE_XERCES is not set

and selected via 'make menuconfig':
	BuildRoot Configuration
		Target options
			[*] Generic Development System

working without having to cherry pick parts of the Generic Development System.
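
To keep track of which parts of the Generic Development System are actually selected, the .config can be split into enabled and disabled options. A minimal sketch, assuming the usual Kconfig line formats shown in the excerpt above ("BR2_X=y" and "# BR2_X is not set"); the function names are hypothetical.

```shell
# Hypothetical helpers: list BR2 options that are enabled (=y) or still
# disabled ("is not set") in a buildroot .config.
br2_enabled() {
    sed -n 's/^\(BR2_[A-Za-z0-9_]*\)=y$/\1/p' "${1:-.config}"
}

br2_not_set() {
    sed -n 's/^# \(BR2_[A-Za-z0-9_]*\) is not set$/\1/p' "${1:-.config}"
}
```

Running "br2_not_set .config" in the buildroot tree gives a checklist of packages still to be brought up, and diffing the output between two snapshots shows what changed.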

We really should get this completely working, not just the subset that currently works.
I've been documenting the land mines with the various buildroot packages.

For example, for the top-level .config file:

	http://wiki.linux-xtensa.org/index.php/HiFi-2_snapshot_2_SMP_Snapshot_menuconfig

and for the uClibc .config file:
	
	http://wiki.linux-xtensa.org/index.php/HiFi-2_snapshot_2_SMP_uclibc-menuconfig


It would be nice/conventional to get buildroot building most, if not all, of the Development Environment
as selected in the Buildroot menu, as well as X11, xmplayer, and the Qt graphics library
that Chipsbank has been messing with.

When fixing the Pthreads bugs we used the 2nd snapshot, with smp additions that Joe Taylor did:

	http://git.linux-xtensa.org/cgi-bin/git.cgi?p=buildroot/buildroot-xtensa-HiFi2-Snapshot.git;a=shortlog;h=snapshot_2%2BSMP

I'd like to maintain the ability to compile code on the target with gcc and
debug it with gdb on the target. This was very comfortable for debugging user space and kernel problems.
Adding a documented procedure for reducing this to a flash based file-system
like Prasanna has set up would be nice; as well as details for setting up
simulations.

In addition it would be great to get a clean procedure set up for
building buildroot and the kernel with XCC. Perhaps we could do this
for the 3rd snapshot.


Let me know if there is anything else I can do to help, or if you, or anyone else,
have any thoughts or suggestions on how we should move the Xtensa-Linux community
forward.

-piet