[Linux-Xtensa] RAM load/store reorder

Baruch Siach baruch at tkos.co.il
Thu Nov 28 04:31:25 UTC 2013


Hi Chris,

(Adding the list to Cc)

On Wed, Nov 27, 2013 at 04:14:31PM -0800, Chris Zankel wrote:
> Basically came to the same conclusions as the two of you when I
> looked at it this morning. There seems to be a weird reordering
> execution issue. Are you using just the default bitstream from
> Tensilica? What version? Were there any changes to the external
> memory? I'd be surprised that we hadn't caught such an issue when
> running the kernel and other tools on that board before.

This is not the default Tensilica bitstream. I'll check how this bitstream was 
generated.

It is worth noting that U-Boot version 2009.08 and the kernel (versions 3.12 
and 3.13-rc1), all built with the same toolchain, run happily on that board.

> Regarding the compiler, it's really weird that it would reload a3
> for every loop. Marc, you probably don't have the entire
> disassembled function in front of you, so here it is.

Which version of gcc have you used?

> (I can't really come up with any explanation other than that it
> might have optimized out some additional code, but that would have
> been something rather complex like TLS or so... weird..)

I didn't notice the jump label between the store and the load. This makes the 
code a little less weird, but not by much.

baruch

> 00000344 <fdt_next_subnode>:
>      344:       008136          entry   a1, 64
>      347:       03bd            mov.n   a11, a3
>      349:       ff3331          l32r    a3, 18 <___arch__swab32-0xfc>
>      34c:       180c            movi.n  a8, 1
>      34e:       0189            s32i.n  a8, a1, 0
>      350:       4139            s32i.n  a3, a1, 16
> 
> JUMP_LABEL
>      352:       4138            l32i.n  a3, a1, 16    # reloads a3 for every loop, but a3 never changes
>      354:       02ad            mov.n   a10, a2
>      356:       20c110          or      a12, a1, a1
>      359:       0003e0          callx8  a3
>      35c:       0abd            mov.n   a11, a10
>      35e:       00aa96          bltz    a10, 36c <fdt_next_subnode+0x28>
>      361:       0188            l32i.n  a8, a1, 0
>      363:       0518a6          blti    a8, 1, 36c <fdt_next_subnode+0x28>
>      366:       e81866          bnei    a8, 1, JUMP_LABEL ## 352 <fdt_next_subnode+0xe>
> 
>      369:       000046          j       36e <fdt_next_subnode+0x2a>
>      36c:       fb7c            movi.n  a11, -1
>      36e:       0b2d            mov.n   a2, a11
>      370:       f01d            retw.n
> 
> 
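
For readers following the listing above, here is a minimal C sketch of a loop
with the same shape. It is modeled on libfdt's fdt_next_subnode() and is not
taken from the build in question. stub_next_node(), its fixed depth table, and
sketch_next_subnode() are hypothetical names used only for illustration; the
stub stands in for fdt_next_node(), which the listing calls through a3. The
comments map each statement to the instruction it appears to correspond to.

```c
#include <stddef.h>

#define FDT_ERR_NOTFOUND 1   /* same numeric value libfdt uses */

/* Hypothetical stand-in for fdt_next_node(): replays a fixed list of depths. */
static const int stub_depths[] = { 2, 3, 2, 1 };
static int stub_calls;

static int stub_next_node(const void *fdt, int offset, int *depth)
{
    (void)fdt;
    if (stub_calls >= 4)
        return -FDT_ERR_NOTFOUND;       /* no more nodes */
    *depth = stub_depths[stub_calls++]; /* callee writes depth through the pointer */
    return offset + 4;                  /* arbitrary next offset (assumption) */
}

int sketch_next_subnode(const void *fdt, int offset)
{
    int depth = 1;                      /* movi.n a8, 1 ; s32i.n a8, a1, 0 */

    do {
        /* a10 = fdt, a11 = offset, a12 = &depth ; callx8 a3 */
        offset = stub_next_node(fdt, offset, &depth);
        if (offset < 0 || depth < 1)    /* bltz a10 / blti a8, 1 */
            return -FDT_ERR_NOTFOUND;   /* movi.n a11, -1 */
    } while (depth != 1);               /* bnei a8, 1, JUMP_LABEL */

    return offset;
}
```

Only depth is written by the callee between iterations, which is why the
reload from a1+0 is expected. Nothing in this sketch modifies the function
pointer, so the a1+16 spill and reload has no source-level counterpart here.
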
> On 11/27/13, 11:46 AM, Baruch Siach wrote:
> >On Wed, Nov 27, 2013 at 11:22:08AM -0800, Marc Gauthier wrote:
> >>Baruch Siach wrote:
> >>>Thanks for your prompt response.
> >>>
> >>>On Wed, Nov 27, 2013 at 08:39:25AM -0800, Marc Gauthier wrote:
> >>>>Baruch Siach wrote:
> >>>>>1. Why is this seemingly NOP store/load pair at 0x5f3d414 and 0x5f3d416
> >>>>>needed at all?
> >>>>Looks like you're compiling at -O0.  In that case, the compiler
> >>>>ensures that all variables are in memory at every source line
> >>>>boundary ie. nothing is cached in registers across such boundaries,
> >>>>so debuggers can access/modify variables in memory (eg. stack) only,
> >>>>and the right thing happens.
> >>>The source file in question builds with -Os. I checked this again by
> >>>looking at the actual build command line. Also, a3 holds an internal
> >>>function pointer, not a user set variable.
> >>Is fdt_next_node() a macro?  what does it expand to?
> >No. fdt_next_node() is a regular function. See its full listing in my first
> >email at
> >http://lists.linux-xtensa.org/pipermail/linux-xtensa/Week-of-Mon-20131125/001350.html.
> >
> >>Is this GCC or XCC?
> >It's GCC version 4.7.2.
> >
> >[...]
> >
> >>>The FPGA runs at 25MHz which I believe is nowhere near the limit of this
> >>>board. I'll check here how exactly this bitstream was generated.
> >>Ok.  I can't explain the behavior otherwise.
> >>Can you also try a different ML605 board?
> >It's a KC705. I'm not sure we have another one, but I'll check.
> >
> >>Am assuming the code works fine when single-stepped, or putting
> >>breakpoints after each instruction?
> >Yes. Single stepping through the code makes the problem magically disappear.
> >
> >Thanks,
> >baruch

-- 
     http://baruch.siach.name/blog/                  ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -

