[Linux-Xtensa] Re: l32r in FLIX

Piet Delaney pdelaney at tensilica.com
Thu Jun 10 01:07:26 PDT 2010


Marc Gauthier wrote:
> Modules also use TRY etc, so they end up with exception table entries.
> It's the module's __ex_table that's unaligned (not the kernel's).

The .align 4 is used for both the static kernel's __ex_table section and the modules' __ex_table
sections. When we get a bad page fault we check the kernel and module tables in search_exception_tables():
------------------------------------------------------------------------------------------------
/* Given an address, look for it in the exception tables. */
const struct exception_table_entry *search_exception_tables(unsigned long addr)
{
         const struct exception_table_entry *e;

         e = search_extable(__start___ex_table, __stop___ex_table-1, addr);
         if (!e)
                 e = search_module_extables(addr);
         return e;
}

A separate function looks through the __ex_table section of each module:
-------------------------------------------------------------------------
/* Given an address, look for it in the module exception tables. */
const struct exception_table_entry *search_module_extables(unsigned long addr)
{
         const struct exception_table_entry *e = NULL;
         struct module *mod;

         preempt_disable();
         list_for_each_entry_rcu(mod, &modules, list) {
                 if (mod->num_exentries == 0)
                         continue;

                 e = search_extable(mod->extable,
                                    mod->extable + mod->num_exentries - 1,
                                    addr);
                 if (e)
                         break;
         }
         preempt_enable();

         /* Now, if we found one, we are running inside it now, hence
            we cannot unload the module, hence no refcnt needed. */
         return e;
}
------------------------------------------------------------------------------
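
For reference, the lookup that search_extable() performs can be sketched as a
binary search over (insn, fixup) pairs. This is a minimal stand-alone sketch,
not the kernel source: it assumes the table is already sorted by insn, and
sketch_search_extable() with its (table, count) signature is a made-up name for
illustration. On Xtensa each entry is the pair of 32-bit words that
".long 1b, 5b" emits, which is why the section wants 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Same two-field layout that ".long 1b, 5b" emits into __ex_table
 * (unsigned long is 32 bits on Xtensa; it may be wider on a host). */
struct exception_table_entry {
	unsigned long insn;   /* address of the instruction that may fault */
	unsigned long fixup;  /* address to resume at if it does */
};

/*
 * Hypothetical helper: binary search over table[0..count-1], assuming the
 * entries are sorted by insn. Returns the matching entry, or NULL.
 */
const struct exception_table_entry *
sketch_search_extable(const struct exception_table_entry *table,
		      size_t count, unsigned long addr)
{
	size_t lo = 0, hi = count;	/* half-open range [lo, hi) */

	assert(table != NULL || count == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].insn < addr)
			lo = mid + 1;
		else if (table[mid].insn > addr)
			hi = mid;
		else
			return &table[mid];
	}
	return NULL;
}
```

A fault handler would call this with the faulting PC and, on a hit, resume at
the entry's fixup address instead of dying.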

The UNIX nofault() was effectively merged with copy_user() and renamed __put_user().
Kernel code that uses pointers from user space can't trust those pointers, so this
code gets entries in the static kernel's __ex_table section, which bad_page_fault()
checks upon getting a fault while in the kernel.

Your added .align statements will affect both the static kernel's __ex_table section
and the modules' __ex_table sections.

I'm making sure I understand this a bit more while digesting the exception table issue:

	#define __put_user_asm(x, addr, err, align, insn, cb)   \
    __asm__ __volatile__(                                \
         __check_align_##align                           \
         "1: "insn"  %2, %3, 0           \n"             \
         "2:                             \n"             \
         "   .section  .fixup,\"ax\"     \n"             \
         "   .align 4                    \n"             \
         "4:                             \n"             \
         "   .long  2b                   \n"             \
         "5:                             \n"             \
         "   l32r   %1, 4b               \n"             \
         "   movi   %0, %4               \n"             \
         "   jx     %1                   \n"             \
         "   .previous                   \n"             \
         "   .section  __ex_table,\"a\"  \n"             \
         "    .align 4                   \n"             \
         "   .long       1b, 5b          \n"             \
         "   .previous"                                  \
         :"=r" (err), "=r" (cb)                          \
         :"r" ((int)(x)), "r" (addr), "i" (-EFAULT), "0" (err))

where:
	%0 is 'err',
	%1 is 'cb',
	%2 is 'x',
	%3 is 'addr',
	%4 is '-EFAULT',
	%5 is not used because 'err' is already defined as %0 (the "0" matching constraint).


I'd prefer we used the less obfuscated %[variable] format for asm macros:

  asm ("fsinx %[angle],%[output]"
           : [output] "=f" (result)
           : [angle] "f" (angle));



> When the module loader gets to the __ex_table section, it fails.

So the section being relocated in frame 4 was __ex_table:
  ---------------------------------------------------------------------------------------------------------------------------------
  (gdb) bt
  #0  die (str=0xd022b148 "Unhandled unalig"..., regs=0xd3a2fd20, err=0x9) at arch/xtensa/kernel/traps.c:573
  #1  0xd000674c in __die_if_kernel (str=0xd022b148 "Unhandled unalig"..., regs=0xd3a2fd20, err=0x9) at arch/xtensa/kernel/traps.c:180
  #2  0xd0006880 in do_unaligned_user (regs=0xd3a2fd20) at arch/xtensa/kernel/traps.c:287
  #3  0xd000253c in _kernel_exception () at arch/xtensa/kernel/entry.S:747
 >#4  0xd0007284 in apply_relocate_add (sechdrs=0xc029fb10, strtab=0xc02e14dc "", symindex=0x32, relsec=0x11, mod=0xc02e344c) at arch/xtensa/kernel/module.c:116
  #5  0xd0044cb8 in load_module (umod=0x2006f008, len=0x2d1405, uargs=0x482008 <Address 0x482008 out of bounds>) at kernel/module.c:2193
  #6  0xd0045135 in sys_init_module (umod=0x2006f008, len=0x2d1405, uargs=0x482008 <Address 0x482008 out of bounds>) at kernel/module.c:2329
  #7  0xd0002c70 in system_call () at arch/xtensa/kernel/entry.S:2103
  #8  0xd0002388 in _user_exception () at arch/xtensa/kernel/entry.S:321
  #9  0xc0000001 in ?? ()
  (gdb)
-----------------------------------------------------------------------------------------------------------------------------------------
I missed that, and also that there are multiple exception fixup tables.
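
To make the failure at module.c:116 concrete: the relocation does a 32-bit
read-modify-write through 'location', and the value printed in the quoted mail
below (0xc02df399), like the __ex_table file offset 0x7205, is not a multiple
of 4. The following is only an illustrative sketch; is_word_aligned() and
add32_unaligned() are hypothetical helper names, and the fix under discussion
is the .align 4, not a byte-wise relocation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if addr is a multiple of 4, i.e. safe for a plain 32-bit access
 * on a core that traps on unaligned loads and stores. */
int is_word_aligned(uintptr_t addr)
{
	return (addr & 3u) == 0;
}

/*
 * Add 'value' to the 32-bit word at 'location' without assuming alignment.
 * memcpy leaves it to the compiler to use accesses that are legal for the
 * target, unlike "*(uint32_t *)location += value".
 */
void add32_unaligned(unsigned char *location, uint32_t value)
{
	uint32_t word;

	assert(location != NULL);
	memcpy(&word, location, sizeof(word));
	word += value;
	memcpy(location, &word, sizeof(word));
}
```

Both 0xc02df399 and 0x7205 leave a remainder of 1 when divided by 4, so
is_word_aligned() returns 0 for either of them.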


> 
> Since it's an alignment problem, you have 1 in 4 chances of getting
> it right, maybe they got lucky?

I just got off the phone with Prasanna and he wasn't aware of anything
that they did.


> 
> Anyway, if you verify my two sets of patches work, feel free to check
> them in and tell Transwitch and Chipsbanks (or anyone) about them.

I planned to test two more modules from the KERNEL HACKING section
which at least do something, to make sure the code was linked correctly.
But only the ext2 module, which doesn't do much, gets through module initialization.
-------------------------------------------------------------------------------------------------------------
CONFIG_EXT2_FS=m
CONFIG_RCU_TORTURE_TEST=m
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_AES=m
-------------------------------------------------------------------------------------------------------------
   INSTALL crypto/aes_generic.ko
   INSTALL crypto/tcrypt.ko
   INSTALL fs/ext2/ext2.ko
   INSTALL kernel/rcutorture.ko
-------------------------------------------------------------------------------------------------------------
[root at hifi ~]# modprobe ext2                WORKS
[root at hifi ~]#
[root at hifi ~]# modprobe tcrypt		    FAILS
alg: No test for md5 (md5)
--------------------------------------------------------------------------------------------------------------
(gdb) bt
#0  die (str=0xd022b0c0 "Illegal instruct"..., regs=0xd291bb10, err=0x9) at arch/xtensa/kernel/traps.c:573
#1  0xd000674c in __die_if_kernel (str=0xd022b0c0 "Illegal instruct"..., regs=0xd291bb10, err=0x9) at arch/xtensa/kernel/traps.c:180
#2  0xd000682e in do_illegal_instruction (regs=0xd291bb10) at arch/xtensa/kernel/traps.c:261
#3  0xd000253c in _kernel_exception () at arch/xtensa/kernel/entry.S:747
#4  0xc0354e01 in ?? ()								[ODD ADDRESS - Linked Wrong]
#5  0xc03544b1 in ?? ()								[ODD ADDRESS]
#6  0xc0359064 in ?? ()
#7  0xd000139e in do_one_initcall (fn=0xc0359014) at init/main.c:709
#8  0xd0045188 in sys_init_module (umod=0x2006f008, len=0x67880, uargs=0x482008 <Address 0x482008 out of bounds>) at kernel/module.c:2343
#9  0xd0002c70 in system_call () at arch/xtensa/kernel/entry.S:2104
#10 0xd0002388 in _user_exception () at arch/xtensa/kernel/entry.S:321
#11 0xc0000001 in ?? ()
(gdb)
--------------------------------------------------------------------------------------------------------------
I just double checked the diffs and don't see anything missing:

	/export/src/xtensa-2.6.29-smp.test_mmuhifi_c3

In my Transwitch notes I've got a procedure to add the symbols for the gap in the
above backtrace. I just modified kernel/module.c to print the module's final section addresses with:

#if 0
#define DEBUGP printk
#else
#define DEBUGP(fmt , a...)
#endif

to get the symbols right and watch the code at apply_relocate_add() in arch/xtensa/kernel/module.c,
and tried again...
----------------------------------------------------------------------------------------------------------------
# modprobe tcrypt
load_module: umod=2006f008, len=424064, uargs=00482008
Core section allocation order:
         .literal
         .text
         .note.gnu.build-id
         .rodata
         .symtab
         .strtab
         .data
         .data.rel.local
         __param
         .gnu.linkonce.this_module
         .bss
Init section allocation order:
         .init.text
final section addresses:
         0xc006bd6c .note.gnu.build-id
         0xc0070000 .init.text
         0xc006a000 .literal
         0xc006a1dc .text
         0xc006bd90 .rodata
         0xc006d1c0 .data
         0xc006d2ac .data.rel.local
         0xc006d34c __param
         0xc006d374 .gnu.linkonce.this_module
         0xc006d464 .bss
         0xc006c3a0 .symtab
         0xc006cb70 .strtab
Absolute symbol: 0x00000000
Absolute symbol: 0x00000000
--------------------------------------------------------------------------------------------------------------------
Breakpoint 26, apply_relocate_add (sechdrs=0xc005cee0, strtab=0xc006cb70 "", symindex=0x29, relsec=0x3, mod=0xc006d374) at arch/xtensa/kernel/module.c:91
1: x/i $pc
0xd00070c9 <apply_relocate_add+13>:	l32i.n	a9, a1, 44
(gdb) bt
#0  apply_relocate_add (sechdrs=0xc005cee0, strtab=0xc006cb70 "", symindex=0x29, relsec=0x3, mod=0xc006d374) at arch/xtensa/kernel/module.c:91
#1  0xd0044ca8 in load_module (umod=0x2006f008, len=0x67880, uargs=0x482008 "") at kernel/module.c:2193
#2  0xd0045125 in sys_init_module (umod=0x2006f008, len=0x67880, uargs=0x482008 "") at kernel/module.c:2329
#3  0xd0002c68 in system_call () at arch/xtensa/kernel/entry.S:2104
#4  0xd0002380 in _user_exception () at arch/xtensa/kernel/entry.S:321
#5  0xc0000001 in ?? ()

(gdb) add-symbol-file /exports/LINUX_ROOT.HiFi-2/lib/modules/2.6.29-rc7/kernel/crypto/tcrypt.ko 0xc006a1dc -s .data 0xc006d1c0 -s .init.text 0xc0070000 -s .bss 0xc006d464 -s .literal 0xc006a000 -s .rodata 0xc006bd90
add symbol table from file "/exports/LINUX_ROOT.HiFi-2/lib/modules/2.6.29-rc7/kernel/crypto/tcrypt.ko" at
	.text_addr = 0xc006a1dc
	.data_addr = 0xc006d1c0
	.init.text_addr = 0xc0070000
	.bss_addr = 0xc006d464
	.literal_addr = 0xc006a000
	.rodata_addr = 0xc006bd90
(gdb)
------------------------------------------------------------------------------------------------------------------------
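
Typing that add-symbol-file line by hand from the "final section addresses"
printout is error prone, so here is a rough sketch of a helper that formats it.
It follows the "FILE ADDR [-s SECT SECT_ADDR ...]" usage shown by gdb's help;
struct sect_addr and format_add_symbol_file() are hypothetical names invented
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct sect_addr {
	const char *name;	/* section name, e.g. ".data" */
	unsigned long addr;	/* final address printed by the loader */
};

/*
 * Format "add-symbol-file FILE TEXT_ADDR -s SECT ADDR ..." into buf.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int format_add_symbol_file(char *buf, size_t size, const char *file,
			   unsigned long text_addr,
			   const struct sect_addr *sects, size_t nsects)
{
	size_t i, used;
	int n;

	assert(buf != NULL && file != NULL);
	n = snprintf(buf, size, "add-symbol-file %s 0x%lx", file, text_addr);
	if (n < 0 || (size_t)n >= size)
		return -1;
	used = (size_t)n;

	for (i = 0; i < nsects; i++) {
		n = snprintf(buf + used, size - used, " -s %s 0x%lx",
			     sects[i].name, sects[i].addr);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Feeding it the .text address 0xc006a1dc plus the .data, .init.text, .bss,
.literal and .rodata pairs from the printout yields the command used above.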
I thought we likely wanted to clean up module.c by removing
the relocations that are no longer necessary now that the literal
section of the module is linked with the text section and has
a permanent offset. I tried it on my first try and didn't see a regression.

With the xtensa module.c still relocating the code, it still seemed somewhat reasonable.
But I'm not reproducing the gap from the previous run, which was without the module symbols
loaded and the module debug code enabled.

It seems weird that the symbols are treated like second-class citizens:
xt-gdb can find them but doesn't show them in the asm listing. As I recall
this used to work when I worked with Prasanna on the A2000 board and its
modules; I thought the gdb backtrace showed them in the panics we were getting.
--------------------------------------------------------------------------
Breakpoint 26, apply_relocate_add (sechdrs=0xc005cee0, strtab=0xc006cb70 "", symindex=0x29, relsec=0x3, mod=0xc006d374) at arch/xtensa/kernel/module.c:91
1: x/i $pc
0xd00070c9 <apply_relocate_add+13>:	l32i.n	a9, a1, 44
(gdb) cont

Breakpoint 26, apply_relocate_add (sechdrs=0xc005cee0, strtab=0xc006cb70 "", symindex=0x29, relsec=0x6, mod=0xc006d374) at arch/xtensa/kernel/module.c:91
1: x/i $pc
0xd00070c9 <apply_relocate_add+13>:	l32i.n	a9, a1, 44
(gdb) add-symbol-file /exports/LINUX_ROOT.HiFi-2/lib/modules/2.6.29-rc7/kernel/crypto/tcrypt.ko 0xc006a1dc -s .data 0xc006d1c0 -s .init.text 0xc0070000 -s .bss 0xc006d464 -s .literal 0xc006a000 -s .rodata 0xc006bd90
add symbol table from file "/exports/LINUX_ROOT.HiFi-2/lib/modules/2.6.29-rc7/kernel/crypto/tcrypt.ko" at
	.text_addr = 0xc006a1dc
	.data_addr = 0xc006d1c0
	.init.text_addr = 0xc0070000
	.bss_addr = 0xc006d464
	.literal_addr = 0xc006a000
	.rodata_addr = 0xc006bd90
(gdb) print apply_relocate_add_enabled
$4 = 0x0
(gdb) set apply_relocate_add_enabled = 1
(gdb) print apply_relocate_add_enabled
$5 = 0x1
(gdb) c

Breakpoint 26, apply_relocate_add (sechdrs=0xc005cee0, strtab=0xc006cb70 "", symindex=0x29, relsec=0x8, mod=0xc006d374) at arch/xtensa/kernel/module.c:91
1: x/i $pc
0xd00070c9 <apply_relocate_add+13>:	l32i.n	a9, a1, 44
(gdb) c

Breakpoint 26, apply_relocate_add (sechdrs=0xc005cee0, strtab=0xc006cb70 "", symindex=0x29, relsec=0xd, mod=0xc006d374) at arch/xtensa/kernel/module.c:91
1: x/i $pc
0xd00070c9 <apply_relocate_add+13>:	l32i.n	a9, a1, 44
(gdb) c

Breakpoint 26, apply_relocate_add (sechdrs=0xc005cee0, strtab=0xc006cb70 "", symindex=0x29, relsec=0xf, mod=0xc006d374) at arch/xtensa/kernel/module.c:91
1: x/i $pc
0xd00070c9 <apply_relocate_add+13>:	l32i.n	a9, a1, 44
(gdb) c

Breakpoint 26, apply_relocate_add (sechdrs=0xc005cee0, strtab=0xc006cb70 "", symindex=0x29, relsec=0x11, mod=0xc006d374) at arch/xtensa/kernel/module.c:91
1: x/i $pc
0xd00070c9 <apply_relocate_add+13>:	l32i.n	a9, a1, 44
(gdb) c

Breakpoint 18, bad_page_fault_bp () at arch/xtensa/mm/fault.c:228
1: x/i $pc
0xd0008afb <bad_page_fault_bp+3>:	retw.n
(gdb) bt
#0  bad_page_fault_bp () at arch/xtensa/mm/fault.c:228
#1  0xd0008b42 in bad_page_fault (regs=0xd3a47b90, address=0x2e736468, sig=0xb) at arch/xtensa/mm/fault.c:248
#2  0xd0008965 in do_page_fault (regs=0xd3a47b90) at arch/xtensa/mm/fault.c:142
#3  0xd0002534 in _kernel_exception () at arch/xtensa/kernel/entry.S:747
#4  0xee73646c in ?? ()
#5  0xd000139e in do_one_initcall (fn=0xc0070014) at init/main.c:709
#6  0xd0045178 in sys_init_module (umod=0x2006f008, len=0x67880, uargs=0x482008 <Address 0x482008 out of bounds>) at kernel/module.c:2343
#7  0xd0002c68 in system_call () at arch/xtensa/kernel/entry.S:2104
#8  0xd0002380 in _user_exception () at arch/xtensa/kernel/entry.S:321
#9  0xc0000001 in ?? ()
(gdb) c

Breakpoint 16, die (str=0xd022bb20 "Oops", regs=0xd3a47b90, err=0xb) at arch/xtensa/kernel/traps.c:573
1: x/i $pc
0xd0006c71 <die+9>:	movi.n	a8, 0

(gdb) x/20i tcrypt_test
0xc006bd50:	entry	a1, 48
0xc006bd53:	s32i.n	a2, a1, 0
0xc006bd55:	l32i.n	a10, a1, 0
0xc006bd57:	l32i.n	a11, a1, 0
0xc006bd59:	movi.n	a12, 0
0xc006bd5b:	movi	a13, 0
0xc006bd5e:	l32r	a8, 0xc002bd60
0xc006bd61:	callx8	a8
0xc006bd64:	mov.n	a8, a10
0xc006bd66:	mov.n	a2, a8
0xc006bd68:	retw.n
0xc006bd6a:	extui	a0, a0, 0, 1
0xc006bd6d:	ill
0xc006bd70:	.byte 0x14
0xc006bd71:	ill
0xc006bd74:	.byte 0x3
0xc006bd75:	ill
0xc006bd78:	ball	a14, a4, 0xc006bdd1
0xc006bd7b:	.byte 00
0xc006bd7c:	.byte 0x76

(gdb) x/20i sg_set_buf
0xc006a7b0:	entry	a1, 48
0xc006a7b3:	s32i.n	a2, a1, 0
0xc006a7b5:	s32i.n	a3, a1, 4
0xc006a7b7:	s32i.n	a4, a1, 8
0xc006a7b9:	l32i.n	a8, a1, 4
0xc006a7bb:	mov.n	a9, a8
0xc006a7bd:	l32r	a8, 0xc002a7c0
0xc006a7c0:	add.n	a8, a9, a8
0xc006a7c2:	srli	a9, a8, 12
0xc006a7c5:	mov.n	a8, a9
0xc006a7c7:	slli	a8, a8, 3
0xc006a7ca:	add.n	a8, a8, a9
0xc006a7cc:	slli	a8, a8, 2
0xc006a7cf:	mov.n	a9, a8
0xc006a7d1:	l32r	a8, 0xc002a7d4
0xc006a7d4:	l32i.n	a8, a8, 0
0xc006a7d6:	add.n	a9, a9, a8
0xc006a7d8:	l32i.n	a8, a1, 4
0xc006a7da:	extui	a8, a8, 0, 12
0xc006a7dd:	l32i.n	a10, a1, 0

(gdb) x/20i sg_set_page
0xd0226278 <sg_set_page>:	entry	a1, 48
0xd022627b <sg_set_page+3>:	s32i.n	a2, a1, 0
0xd022627d <sg_set_page+5>:	s32i.n	a3, a1, 4
0xd022627f <sg_set_page+7>:	s32i.n	a4, a1, 8
0xd0226281 <sg_set_page+9>:	s32i.n	a5, a1, 12
0xd0226283 <sg_set_page+11>:	l32i.n	a10, a1, 0
0xd0226285 <sg_set_page+13>:	l32i.n	a11, a1, 4
0xd0226287 <sg_set_page+15>:	call8	0xd0226298 <sg_assign_page>
0xd022628a <sg_set_page+18>:	l32i.n	a9, a1, 0
0xd022628c <sg_set_page+20>:	l32i.n	a8, a1, 12
0xd022628e <sg_set_page+22>:	s32i.n	a8, a9, 4
0xd0226290 <sg_set_page+24>:	l32i.n	a9, a1, 0
0xd0226292 <sg_set_page+26>:	l32i.n	a8, a1, 8
0xd0226294 <sg_set_page+28>:	s32i.n	a8, a9, 12
0xd0226296 <sg_set_page+30>:	retw.n
0xd0226298 <sg_assign_page>:	entry	a1, 64
0xd022629b <sg_assign_page+3>:	s32i.n	a2, a1, 16
0xd022629d <sg_assign_page+5>:	s32i.n	a3, a1, 20
0xd022629f <sg_assign_page+7>:	l32i.n	a8, a1, 16
0xd02262a1 <sg_assign_page+9>:	l32i.n	a8, a8, 0
(gdb) x/20i sg_page
0xd0226264 <sg_page>:	entry	a1, 48
0xd0226267 <sg_page+3>:	s32i.n	a2, a1, 0
0xd0226269 <sg_page+5>:	l32i.n	a3, a1, 0
0xd022626b <sg_page+7>:	l32i.n	a4, a3, 0
0xd022626d <sg_page+9>:	movi.n	a3, -4
0xd022626f <sg_page+11>:	and	a3, a4, a3
0xd0226272 <sg_page+14>:	mov.n	a2, a3
0xd0226274 <sg_page+16>:	retw.n
0xd0226276:	.byte 00
0xd0226277:	.byte 00
0xd0226278 <sg_set_page>:	entry	a1, 48
0xd022627b <sg_set_page+3>:	s32i.n	a2, a1, 0
0xd022627d <sg_set_page+5>:	s32i.n	a3, a1, 4
0xd022627f <sg_set_page+7>:	s32i.n	a4, a1, 8
0xd0226281 <sg_set_page+9>:	s32i.n	a5, a1, 12
0xd0226283 <sg_set_page+11>:	l32i.n	a10, a1, 0
0xd0226285 <sg_set_page+13>:	l32i.n	a11, a1, 4
0xd0226287 <sg_set_page+15>:	call8	0xd0226298 <sg_assign_page>
0xd022628a <sg_set_page+18>:	l32i.n	a9, a1, 0
0xd022628c <sg_set_page+20>:	l32i.n	a8, a1, 12


(gdb) x/20i sg_chain
0xd014a770 <sg_chain>:	entry	a1, 48
0xd014a773 <sg_chain+3>:	s32i.n	a2, a1, 0
0xd014a775 <sg_chain+5>:	s32i.n	a3, a1, 4
0xd014a777 <sg_chain+7>:	s32i.n	a4, a1, 8
0xd014a779 <sg_chain+9>:	l32r	a8, 0xd0146ed4
0xd014a77c <sg_chain+12>:	l32r	a9, 0xd0146ed8
0xd014a77f <sg_chain+15>:	l32r	a13, 0xd0146ef4
0xd014a782 <sg_chain+18>:	or	a10, a8, a8
0xd014a785 <sg_chain+21>:	or	a11, a9, a9
0xd014a788 <sg_chain+24>:	movi	a12, 135
0xd014a78b <sg_chain+27>:	l32r	a8, 0xd0114c64
0xd014a78e <sg_chain+30>:	callx8	a8
0xd014a791 <sg_chain+33>:	l32r	a8, 0xd0146ee0
0xd014a794 <sg_chain+36>:	mov.n	a10, a8
0xd014a796 <sg_chain+38>:	l32r	a8, 0xd0114c6c
0xd014a799 <sg_chain+41>:	callx8	a8
0xd014a79c <sg_alloc_table>:	entry	a1, 64
0xd014a79f <sg_alloc_table+3>:	s32i.n	a2, a1, 16
0xd014a7a1 <sg_alloc_table+5>:	s32i.n	a3, a1, 20
0xd014a7a3 <sg_alloc_table+7>:	s32i.n	a4, a1, 24

(gdb) x/20i 0xc006a1dc
0xc006a1dc:	entry	a1, 96
0xc006a1df:	s32i.n	a2, a1, 32
0xc006a1e1:	s32i.n	a3, a1, 36
0xc006a1e3:	s32i.n	a4, a1, 40
0xc006a1e5:	s32i.n	a5, a1, 44
0xc006a1e7:	s32i.n	a6, a1, 48
0xc006a1e9:	l32r	a8, 0xc002a1ec
0xc006a1ec:	memw
0xc006a1ef:	l32i.n	a8, a8, 0
0xc006a1f1:	s32i.n	a8, a1, 12
0xc006a1f3:	l32i.n	a9, a1, 48
0xc006a1f5:	mov.n	a8, a9
0xc006a1f7:	slli	a8, a8, 2
0xc006a1fa:	add.n	a8, a8, a9
0xc006a1fc:	slli	a9, a8, 2
0xc006a1ff:	add.n	a8, a8, a9
0xc006a201:	slli	a8, a8, 2
0xc006a204:	mov.n	a9, a8
0xc006a206:	l32i.n	a8, a1, 12
0xc006a208:	add.n	a8, a9, a8

(gdb) x/20i 0xc0070000
0xc0070000:	.byte 0x24
0xc0070001:	movi.n	a5, 64
0xc0070003:	extui	a7, a13, 4, 14
0xc0070006:	j	0xc008c30a
0xc0070009:	ae_movpa24x2	aep0, a6, a13
0xc007000c:	beqz.n	a0, 0xc007004b
0xc007000e:	j	0xc0086312
0xc0070011:	.byte 0x4e
0xc0070012:	call0	0xc00a6d14
0xc0070015:	l32r	a6, 0xc004f018
0xc0070018:	l32i.n	a4, a9, 32
0xc007001a:	l32r	a1, 0xc003204c
0xc007001d:	s32i.n	a8, a1, 0
0xc007001f:	j	0xc0070052
0xc0070022:	l32i.n	a6, a1, 0
0xc0070024:	movi	a10, 208
0xc0070027:	movi	a11, 0
0xc007002a:	l32r	a8, 0xc003002c
0xc007002d:	callx8	a8
0xc0070030:	mov.n	a8, a10


(gdb) x/100i 0xc006a1dc
0xc006a1dc:	entry	a1, 96
0xc006a1df:	s32i.n	a2, a1, 32
0xc006a1e1:	s32i.n	a3, a1, 36
0xc006a1e3:	s32i.n	a4, a1, 40
0xc006a1e5:	s32i.n	a5, a1, 44
0xc006a1e7:	s32i.n	a6, a1, 48
0xc006a1e9:	l32r	a8, 0xc002a1ec
0xc006a1ec:	memw
0xc006a1ef:	l32i.n	a8, a8, 0
0xc006a1f1:	s32i.n	a8, a1, 12
0xc006a1f3:	l32i.n	a9, a1, 48
0xc006a1f5:	mov.n	a8, a9
0xc006a1f7:	slli	a8, a8, 2
0xc006a1fa:	add.n	a8, a8, a9
0xc006a1fc:	slli	a9, a8, 2
0xc006a1ff:	add.n	a8, a8, a9
0xc006a201:	slli	a8, a8, 2
0xc006a204:	mov.n	a9, a8
0xc006a206:	l32i.n	a8, a1, 12
0xc006a208:	add.n	a8, a9, a8
0xc006a20a:	s32i.n	a8, a1, 8
0xc006a20c:	movi.n	a8, 0
0xc006a20e:	s32i.n	a8, a1, 4
0xc006a210:	j	0xc006a256
0xc006a213:	srai	a8, a0, 2
0xc006a216:	s32i.n	a0, a6, 4
0xc006a218:	l32i.n	a0, a0, 0
0xc006a21a:	l32i	a8, a1, 44
0xc006a21d:	l32i.n	a10, a1, 32
0xc006a21f:	l32i.n	a11, a1, 40
0xc006a221:	l32i.n	a12, a1, 40
0xc006a223:	mov.n	a13, a8
0xc006a225:	call8	0xc006a228
0xc006a228:	mov.n	a8, a10
0xc006a22a:	s32i.n	a8, a1, 0
0xc006a22c:	j	0xc006a244
0xc006a22f:	srai	a8, a0, 2
0xc006a232:	addi.n	a10, a2, -1
0xc006a234:	l32r	a2, 0xc0056a54
0xc006a237:	l32r	a2, 0xc005c260
0xc006a23a:	l32r	a10, 0xc002c5b0
0xc006a23d:	call8	0xc006a240
0xc006a240:	mov.n	a8, a10
0xc006a242:	s32i.n	a8, a1, 0
0xc006a244:	l32i.n	a8, a1, 0
0xc006a246:	beqz.n	a8, 0xc006a250
0xc006a248:	l32i.n	a8, a1, 0
0xc006a24a:	s32i.n	a8, a1, 52
0xc006a24c:	j	0xc006a28c
0xc006a24f:	slli	a8, a8, 16
0xc006a252:	addi.n	a8, a8, 1
0xc006a254:	s32i.n	a8, a1, 4
0xc006a256:	movi.n	a8, 1
0xc006a258:	beqz.n	a8, 0xc006a270
0xc006a25a:	movi.n	a8, 1
0xc006a25c:	beqz.n	a8, 0xc006a270
0xc006a25e:	l32r	a8, 0xc002a260
0xc006a261:	memw
0xc006a264:	l32i.n	a8, a8, 0
0xc006a266:	mov.n	a9, a8
0xc006a268:	l32i.n	a8, a1, 8
0xc006a26a:	sub	a8, a9, a8
0xc006a26d:	bltz	a8, 0xc006a214
0xc006a270:	l32i.n	a9, a1, 4
0xc006a272:	l32i.n	a8, a1, 44
0xc006a274:	mull	a9, a9, a8
0xc006a277:	l32r	a8, 0xc002a278
0xc006a27a:	mov.n	a10, a8
0xc006a27c:	l32i.n	a11, a1, 4
0xc006a27e:	l32i.n	a12, a1, 48
0xc006a280:	mov.n	a13, a9
0xc006a282:	l32r	a8, 0xc002a284
0xc006a285:	callx8	a8                                    [LOOK AT THIS CALLX8 TOMORROW]
0xc006a288:	movi.n	a8, 0
0xc006a28a:	s32i.n	a8, a1, 52
0xc006a28c:	l32i.n	a8, a1, 52
0xc006a28e:	mov.n	a2, a8
0xc006a290:	retw.n
0xc006a292:	.byte 00
0xc006a293:	.byte 00
0xc006a294:	entry	a1, 48
0xc006a297:	s32i.n	a2, a1, 0
0xc006a299:	s32i.n	a3, a1, 4
0xc006a29b:	s32i.n	a4, a1, 8
0xc006a29d:	s32i.n	a5, a1, 12
0xc006a29f:	l32i.n	a8, a1, 0
0xc006a2a1:	l32i.n	a8, a8, 0
0xc006a2a3:	mov.n	a10, a8
0xc006a2a5:	call8	0xc006a2d0
0xc006a2a8:	mov.n	a8, a10
0xc006a2aa:	l32i.n	a9, a8, 0
0xc006a2ac:	l32i.n	a8, a1, 0
0xc006a2ae:	s32i.n	a9, a8, 4
0xc006a2b0:	l32i.n	a8, a1, 0
0xc006a2b2:	l32i.n	a8, a8, 0
0xc006a2b4:	mov.n	a10, a8
0xc006a2b6:	call8	0xc006a2d0
0xc006a2b9:	mov.n	a8, a10
0xc006a2bb:	l32i.n	a8, a8, 8
0xc006a2bd:	l32i.n	a10, a1, 0

(gdb) x/30i 0xc0070014							do_one_initcall (fn=0xc0070014)
0xc0070014:	entry	a1, 48
0xc0070017:	movi.n	a8, -12
0xc0070019:	s32i.n	a8, a1, 4
0xc007001b:	movi.n	a8, 0
0xc007001d:	s32i.n	a8, a1, 0
0xc007001f:	j	0xc0070052
0xc0070022:	l32i.n	a6, a1, 0
0xc0070024:	movi	a10, 208
0xc0070027:	movi	a11, 0
0xc007002a:	l32r	a8, 0xc003002c
0xc007002d:	callx8	a8						[Look at this CallX8 and L32r; breakpoint, and single step during initcall()]
0xc0070030:	mov.n	a8, a10
0xc0070032:	mov.n	a10, a8
0xc0070034:	l32r	a9, 0xc0030034
0xc0070037:	slli	a8, a6, 2
0xc007003a:	add.n	a8, a8, a9
0xc007003c:	s32i.n	a10, a8, 0
0xc007003e:	l32i.n	a8, a1, 0
0xc0070040:	l32r	a9, 0xc0030040
0xc0070043:	slli	a8, a8, 2
0xc0070046:	add.n	a8, a8, a9
0xc0070048:	l32i.n	a8, a8, 0
0xc007004a:	beqz.n	a8, 0xc0070068
0xc007004c:	l32i.n	a8, a1, 0
0xc007004e:	addi.n	a8, a8, 1
0xc0070050:	s32i.n	a8, a1, 0
0xc0070052:	l32i.n	a8, a1, 0
0xc0070054:	blti	a8, 4, 0xc0070022
0xc0070057:	l32r	a8, 0xc0030058
0xc007005a:	l32i.n	a8, a8, 0


(gdb) help add-symbol-file
Load symbols from FILE, assuming FILE has been dynamically loaded.
Usage: add-symbol-file FILE ADDR [-s <SECT> <SECT_ADDR> -s <SECT> <SECT_ADDR> ...]
ADDR is the starting address of the file's text.
The optional arguments are section-name section-address pairs and
should be specified if the data and bss segments are not contiguous
with the text.  SECT is a section name to be loaded at SECT_ADDR.

(gdb) x/20i crypto_blkcipher_encrypt
0xc006a294:	entry	a1, 48
0xc006a297:	s32i.n	a2, a1, 0
0xc006a299:	s32i.n	a3, a1, 4
0xc006a29b:	s32i.n	a4, a1, 8
0xc006a29d:	s32i.n	a5, a1, 12
0xc006a29f:	l32i.n	a8, a1, 0
0xc006a2a1:	l32i.n	a8, a8, 0
0xc006a2a3:	mov.n	a10, a8
0xc006a2a5:	call8	0xc006a2d0
0xc006a2a8:	mov.n	a8, a10
0xc006a2aa:	l32i.n	a9, a8, 0
0xc006a2ac:	l32i.n	a8, a1, 0
0xc006a2ae:	s32i.n	a9, a8, 4
0xc006a2b0:	l32i.n	a8, a1, 0
0xc006a2b2:	l32i.n	a8, a8, 0
0xc006a2b4:	mov.n	a10, a8
0xc006a2b6:	call8	0xc006a2d0
0xc006a2b9:	mov.n	a8, a10
0xc006a2bb:	l32i.n	a8, a8, 8
0xc006a2bd:	l32i.n	a10, a1, 0
(gdb)
--------------------------------------------------------------------------------------------------------------------

It's late, and I might have an 11:00am dept meeting. Seems like a good point to quit
and check the callx8 and l32r instructions and literals tomorrow.
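
As a starting point for that check, here is a small sketch of how the targets
gdb prints for l32r and call8 follow from the instruction address and the
offset field. It mirrors the arithmetic in the R_XTENSA_SLOT0_OP code quoted
below and assumes the plain PC-relative form of L32R (no literal base
register). l32r_target() and calln_target() are hypothetical names made up for
this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * L32R: literal address = ((pc + 3) & ~3) + (imm16 one-extended, then << 2).
 * The offset is always negative, so literals sit below the instruction.
 */
uint32_t l32r_target(uint32_t pc, uint16_t imm16)
{
	uint32_t offset = (uint32_t)(0xffff0000u | imm16) << 2;

	return ((pc + 3u) & ~3u) + offset;
}

/*
 * CALLn: target = (pc & ~3) + 4 + (imm18 sign-extended, then << 2).
 * imm18 is passed already sign-extended.
 */
uint32_t calln_target(uint32_t pc, int32_t imm18)
{
	assert(imm18 >= -(1 << 17) && imm18 < (1 << 17));
	return (pc & ~3u) + 4u + ((uint32_t)imm18 << 2);
}
```

With a zero offset field these give 0xc002bd60 for the l32r at 0xc006bd5e and
0xc006a228 for the call8 at 0xc006a225, which are the targets in the module
listings above. That might mean those fields were never patched, but it is
only a guess from the arithmetic and has not been verified.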


-piet

> 
> -Marc
> 
> 
> 
> Piet Delaney wrote:
>> Marc Gauthier wrote:
>>> Piet Delaney wrote:
>>>> Sterling Augustine wrote:
>>>>> I was only tangentially involved in the original design, but my
>>>>> hazy memory on this is that the module loader does need to
>>>>> disassemble and modify the l32rs. That was one reason that you
>>>>> couldn't have l32r's in flix in a module--because then the loader
>>>>> wouldn't know how to disassemble it.
>>>> Any thoughts then on what this fragment would be doing for the CALLn
>>>> and L32R relocation entrys for say big-endian:
>>>>
>>>>               case R_XTENSA_SLOT0_OP:
>>>>                          if (decode_calln_opcode(location)) {
>>>>                                  value -= ((unsigned long)location & -4) + 4;
>>>>                                  value = (signed int)value >> 2;
>>>>                                  location[0] = ((location[0] & ~0x3) | ((value >> 16) & 0x3));
>>>>                                  location[1] = (value >> 8) & 0xff;
>>>>                                  location[2] = value & 0xff;
>>>>                          } else if (decode_l32r_opcode(location)) {
>>>>                                  value -= (((unsigned long)location + 3) & -4);
>>>>                                  value = (signed int)value >> 2;
>>>>                                  location[1] = (value >> 8) & 0xff;
>>>>                                  location[2] = value & 0xff;
>>>>                          }
>>>
>>> First look at the piece of code that precedes:
>>>
>>>                 location = (char *)sechdrs[sechdrs[relsec].sh_info].sh_addr
>>>                         + rela[i].r_offset;
>>>                 sym = (Elf32_Sym *)sechdrs[symindex].sh_addr
>>>                         + ELF32_R_SYM(rela[i].r_info);
>>>                 value = sym->st_value + rela[i].r_addend;
>>>
>>> it figures out which symbol is associated with the relocation
>>> (sym = ...), and computes the relocated value (symbol address
>>> plus relocation addend).  Then, the piece of code you showed
>>> applies this value in the instruction's PC-relative offset field
>>> (after making the value PC relative, ie. subtracting location aka
>>> PC).
>>>
>>> You could load a few modules, and verify that location[n] is never
>>> actually modified.
>>>
>>>
>>> I tried loading one just now (ext2.ko) but I get an unaligned
>>> exception upon insmod.
>> I tried the same thing on HiFi-2 and aso got an unaligned
>> exception upon using modprobe.
>> -------------------------------------------------------------------
>> (gdb) bt
>> #0  die (str=0xd022b148 "Unhandled unalig"..., regs=0xd3a2fd20, err=0x9) at arch/xtensa/kernel/traps.c:573
>> #1  0xd000674c in __die_if_kernel (str=0xd022b148 "Unhandled unalig"..., regs=0xd3a2fd20, err=0x9) at arch/xtensa/kernel/traps.c:180
>> #2  0xd0006880 in do_unaligned_user (regs=0xd3a2fd20) at arch/xtensa/kernel/traps.c:287
>> #3  0xd000253c in _kernel_exception () at arch/xtensa/kernel/entry.S:747
>> #4  0xd0007284 in apply_relocate_add (sechdrs=0xc029fb10, strtab=0xc02e14dc "", symindex=0x32, relsec=0x11, mod=0xc02e344c) at arch/xtensa/kernel/module.c:116
>> #5  0xd0044cb8 in load_module (umod=0x2006f008, len=0x2d1405, uargs=0x482008 <Address 0x482008 out of bounds>) at kernel/module.c:2193
>> #6  0xd0045135 in sys_init_module (umod=0x2006f008, len=0x2d1405, uargs=0x482008 <Address 0x482008 out of bounds>) at kernel/module.c:2329
>> #7  0xd0002c70 in system_call () at arch/xtensa/kernel/entry.S:2103
>> #8  0xd0002388 in _user_exception () at arch/xtensa/kernel/entry.S:321
>> #9  0xc0000001 in ?? ()
>> (gdb)
>> ----------------------------------------------------------------------
>> But I didn't see the connection to the exception fixup table
>> being unaligned.
>> I didn't see any 'TRY's being set up to use the kernel
>> exception fixup table.
>>
>> The alignment error occurs  because the pointer location is
>> odd where we are doing
>> relocating a literal or other simple relocation:
>>
>>
>>   114                 case R_XTENSA_32:
>>   115                 case R_XTENSA_PLT:
>>   116                         *(uint32_t *)location += value;
>>   117                         break;
>>   118
>>
>>               (gdb) print location
>>               $10 = (unsigned char *) 0xc02df399 "\025n"
>>
>> what's the connection with the kernel exception fixup table.
>>
>> I was also wondering why ChipBanks nor Transwitch mentioning
>> this problem
>> loading modules? Maybe a lot more of this chit-chat should be
>> on the linux-xtensa
>> mailing list so we are all on the same page.
>>
>>
>>>  After some investigation, turns out the
>>> __ex_table section is not properly aligned in the ELF file;
>>> see that file offset is 0x7205 below, and Algn is 2**0 instead
>>> of 2**2.  Sterling, is that a compiler bug?  Am not sure offhand
>>> how __ex_table gets generated.
>> Great find Marc! I'd really like to hear more about the investigation.
>>
>> More to come on the module linker script change....
>>
>> -piet
>>
>>>
>>> [marc at gums linux2629]$ xt-objdump -wh build-dc232b/fs/ext2/ext2.ko
>>>
>>> build-dc232b/fs/ext2/ext2.ko:     file format elf32-xtensa-le
>>>
>>> Sections:
>>> Idx Name                      Size      VMA       LMA       File off  Algn  Flags
>>>   0 .note.gnu.build-id        00000024  00000000  00000000  00000034  2**2  CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   1 .init.literal             00000020  00000000  00000000  00000058  2**2  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>   2 .text                     00005a89  00000000  00000000  00000078  2**2  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>   3 .fixup                    00000052  00000000  00000000  00005b04  2**2  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>   4 .exit.literal             00000010  00000000  00000000  00005b58  2**2  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>   5 .exit.text                0000001a  00000000  00000000  00005b68  2**2  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>   6 .init.text                00000041  00000000  00000000  00005b84  2**2  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>   7 .rodata                   0000017c  00000000  00000000  00005bc8  2**2  CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   8 .rodata.str1.4            0000143e  00000000  00000000  00005d44  2**2  CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   9 .modinfo                  00000081  00000000  00000000  00007184  2**2  CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  10 __ex_table                00000030  00000000  00000000  00007205  2**0  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>>>  11 .data                     00000008  00000000  00000000  00007238  2**2  CONTENTS, ALLOC, LOAD, DATA
>>>  12 .data.rel.ro              00000378  00000000  00000000  00007240  2**2  CONTENTS, ALLOC, LOAD, RELOC, DATA
>>>  13 .data.rel.ro.local        00000138  00000000  00000000  000075b8  2**2  CONTENTS, ALLOC, LOAD, RELOC, DATA
>>>  14 .data.rel                 00000020  00000000  00000000  000076f0  2**2  CONTENTS, ALLOC, LOAD, RELOC, DATA
>>>  15 .gnu.linkonce.this_module 000000f8  00000000  00000000  00007710  2**2  CONTENTS, ALLOC, LOAD, RELOC, DATA, LINK_ONCE_DISCARD
>>>  16 .bss                      00000004  00000000  00000000  00007808  2**2  ALLOC
>>>  17 .comment                  000000c6  00000000  00000000  00007808  2**0  CONTENTS, READONLY
>>>  18 .debug_aranges            00000150  00000000  00000000  000078ce  2**0  CONTENTS, RELOC, READONLY, DEBUGGING
>>>  19 .debug_pubnames           00000526  00000000  00000000  00007a1e  2**0  CONTENTS, RELOC, READONLY, DEBUGGING
>>>  20 .debug_info               0004f256  00000000  00000000  00007f44  2**0  CONTENTS, RELOC, READONLY, DEBUGGING
>>>  21 .debug_abbrev             00002f9b  00000000  00000000  0005719a  2**0  CONTENTS, READONLY, DEBUGGING
>>>  22 .debug_line               00008c1d  00000000  00000000  0005a135  2**0  CONTENTS, RELOC, READONLY, DEBUGGING
>>>  23 .debug_frame              00000958  00000000  00000000  00062d54  2**2  CONTENTS, RELOC, READONLY, DEBUGGING
>>>  24 .debug_str                00026021  00000000  00000000  000636ac  2**0  CONTENTS, READONLY, DEBUGGING
>>>  25 .debug_loc                00006a0b  00000000  00000000  000896cd  2**0  CONTENTS, RELOC, READONLY, DEBUGGING
>>>  26 .debug_ranges             00001d08  00000000  00000000  000900d8  2**0  CONTENTS, RELOC, READONLY, DEBUGGING
>>>  27 .xt.lit                   00000018  00000000  00000000  00091de0  2**0  CONTENTS, RELOC, READONLY
>>>  28 .xt.prop                  00003570  00000000  00000000  00091df8  2**0  CONTENTS, RELOC, READONLY
>>>  29 .xtensa.info              00000038  00000000  00000000  00095368  2**0  CONTENTS, READONLY
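
In the listing above, __ex_table is the only loadable data section with 2**0 alignment, and its file offset 00007205 is exactly where .modinfo ends (00007184 + 00000081). A minimal sketch of the arithmetic, assuming the table is expected to sit on a 4-byte boundary; the helper names is_aligned() and align_up() are made up for this illustration and are not taken from the kernel or binutils:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when `offset` is a multiple of `align`.
 * `align` must be a non-zero power of two. */
bool is_aligned(uint32_t offset, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset & (align - 1)) == 0;
}

/* Round `offset` up to the next multiple of `align`
 * (again a non-zero power of two). */
uint32_t align_up(uint32_t offset, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}
```

With the numbers from the listing, is_aligned(0x7205, 4) is false and align_up(0x7205, 4) gives 0x7208, whereas the following .data section at 0x7238 already passes the same check.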
>>>
>>> -Marc
>>>
>>>
>>>
>>>> decode_calln_opcode() and decode_l32r_opcode() just identify
>>>> the op codes
>>>> in slot0.
>>>>
>>>> static int
>>>> decode_calln_opcode (unsigned char *location)
>>>> {
>>>>          return (location[0] & 0xf0) == 0x50;
>>>> }
>>>>
>>>> static int
>>>> decode_l32r_opcode (unsigned char *location)
>>>> {
>>>>          return (location[0] & 0xf0) == 0x10;
>>>> }
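
The two checks quoted above test the high nibble of the first instruction byte, which is where op0 sits on a big-endian configuration. As far as I know, a little-endian configuration keeps op0 in the low nibble of that byte instead. A small sketch that covers both cases; the big_endian parameter and the names op0_of(), is_calln() and is_l32r() are assumptions for illustration, not the code in module.c:

```c
#include <stdbool.h>

/* op0 values as used by the quoted checks: L32R = 1, CALLN = 5. */
enum { OP0_L32R = 0x1, OP0_CALLN = 0x5 };

/* Extract the 4-bit op0 field from the first byte of an instruction.
 * `big_endian` is a hypothetical parameter standing in for the
 * core's configured byte order. */
unsigned op0_of(const unsigned char *location, bool big_endian)
{
    return big_endian ? (location[0] >> 4) & 0xfu
                      : location[0] & 0xfu;
}

/* Slot0-only test for CALLn, like the quoted helper. */
bool is_calln(const unsigned char *location, bool big_endian)
{
    return op0_of(location, big_endian) == OP0_CALLN;
}

/* Slot0-only test for L32R, like the quoted helper. */
bool is_l32r(const unsigned char *location, bool big_endian)
{
    return op0_of(location, big_endian) == OP0_L32R;
}
```

Like the originals, this only looks at a narrow instruction's opcode field, so it says nothing about an L32R or branch placed in another slot of a FLIX bundle.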
>>>>
>>>> I think Marc's theory is that the literals won't move and
>>>> values won't change.
>>>>
>>>>
>>>>
>>>>
>>>> Don't slot3 xt_widebranch18 instructions on cores supporting
>>>> P600_VLIW also need to have relocation entries applied to them?
>>>> They won't be handled by the literal pool.
>>>>
>>>> Looks like they are generated by the assembler:
>>>>
>>>>       90:   0000000005ffbc2b748051a004f80c2e        { nop; nop; nop; bltu.w18       a5, a6, c <clearpage+0xc> }
>>>>
>>>> and can be disabled with -mno-flix
>>>>
>>>>
>>>>       177:   022897          blt     a8, a9, 17d <copypage+0xe5>
>>>>       17a:   ffcbc6          j       ad <copypage+0x15>
>>>>
>>>> Since -mno-flix would disable so much, wouldn't it be best to just
>>>> add code to handle relocation of the widebranch18 instructions in
>>>> slot3?
>>>>
>>>>   xt-objdump -d -r copypage.o
>>>> --------------------------------------------------------------------------------------------------------------
>>>>   18d:   0000000005ff9944948051a004f80c2e        { nop; nop; nop; bge.w18        a8, a9, c3 <copypage+0x1f> }
>>>>                          18d: R_XTENSA_SLOT3_OP  .text+0xc3
>>>> --------------------------------------------------------------------------------------------------------------
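
One way to look at the R_XTENSA_SLOT3_OP entry above: the branch at 18d and its target at c3 are both offsets into the same .text section, so the distance between them does not depend on where the module lands in memory. A minimal sketch of that arithmetic, assuming the section is loaded as one contiguous unit; the function names are hypothetical, and the encoding-specific bias and field layout of bge.w18 are deliberately left out:

```c
#include <stdint.h>

/* Signed distance from a branch to its target.  The real encoding
 * adds an instruction-specific bias, which is omitted here. */
int32_t branch_displacement(uint32_t branch_addr, uint32_t target_addr)
{
    return (int32_t)(target_addr - branch_addr);
}

/* The same distance once the whole section has been placed at
 * `load_base` (a hypothetical load address). */
int32_t displacement_after_load(uint32_t load_base,
                                uint32_t branch_off,
                                uint32_t target_off)
{
    return branch_displacement(load_base + branch_off,
                               load_base + target_off);
}
```

For the pair in the dump, branch_displacement(0x18d, 0xc3) is -202, and displacement_after_load() returns the same -202 for any load_base. If that holds, the bits the assembler already put in slot3 stay valid and the loader has nothing to patch for this entry.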
>>>>
>>>> -piet
>>>>
>>>>
>>>>> Perhaps the design has evolved? Or my memory could be off.
>>>>>
>>>>> Sterling
>>>>>
>>>>> Marc Gauthier wrote:
>>>>>> Sterling Augustine wrote:
>>>>>>> Marc Gauthier wrote:
>>>>>>>> The module loader never does "link-time relaxation", as I said,
>>>>>>>> no need for any flag to remove the need -- the need isn't there.
>>>>>>>> The flag suggested by Bob is to remove unneeded relocations
>>>>>>>> in the case no relaxation is being done.
>>>>>>> You would get somewhat faster module load times and somewhat
>>>>>>> smaller modules.
>>>>>>>
>>>>>>> But the problem would be that the assembler would have to know
>>>>>>> which relocations the kernel module loader needs and which ones
>>>>>>> it doesn't. Personally I think it makes more sense to keep that
>>>>>>> knowledge only in the module loader, rather than pushing that
>>>>>>> into the assembler. The number of relocations we are dealing
>>>>>>> with is linear in the size of the text, and I assume that
>>>>>>> loading a module already takes time linear in that size.
>>>>>> Indeed it is.
>>>>>>
>>>>>> There's desire to keep the kernel module loader relatively simple.
>>>>>> I think that's doable if we understand what's going on well
>>>>>> enough. The module loader need not touch any relocation that is
>>>>>> only there to deal with relaxation.  So, which are the
>>>>>> relocations that need to be dealt with?  Does the module loader
>>>>>> *ever* need to relocate an instruction?  I suspect not.  For
>>>>>> example, relocating literals involves applying a relocation on a
>>>>>> 32-bit data word (the literal), not on the L32R instruction
>>>>>> itself.  I somehow doubt we ever need to touch any instruction.
>>>>>> We don't support CONST16 in the kernel. So, perhaps it is
>>>>>> appropriate to ignore *all* slot relocations, rather than raise
>>>>>> an error as we do today?  In which case, the "L32R in FLIX" issue
>>>>>> is a non-issue.  Also, the current module.c has code to handle
>>>>>> relocations in CALLn and L32R instructions: I'm sure there are
>>>>>> many such relocations, but I bet they never actually change any
>>>>>> code.  At first I thought they'd be needed to handle symbolic
>>>>>> references to variables and functions in the kernel itself, but
>>>>>> that doesn't make sense -- for functions, we need L32R;CALLX8,
>>>>>> not CALL8 which probably won't reach; and for variables, it's the
>>>>>> literal we relocate, can't have L32R in the module load a literal
>>>>>> that's outside the module, again the L32R instruction couldn't
>>>>>> reach.  Together the CALLn and L32R relocs are most of the code
>>>>>> in module.c, take those out and it becomes really simple.  And
>>>>>> works for any config, any FLIX, etc.
>>>>>>
>>>>>> Make sense?
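
A rough sketch of the loader Marc describes above: patch 32-bit data words (the literals) and skip every slot relocation instead of raising an error. Everything here is illustrative. struct reloc and apply_relocs() are hypothetical, `value` is assumed to be the already-resolved symbol address plus addend, and the numeric constants are meant to mirror R_XTENSA_32 and R_XTENSA_SLOT0_OP..R_XTENSA_SLOT14_OP from <elf.h> as I remember them, so check them against your headers:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum {
    RELOC_32        = 1,   /* assumed to match R_XTENSA_32 */
    RELOC_SLOT0_OP  = 20,  /* assumed to match R_XTENSA_SLOT0_OP */
    RELOC_SLOT14_OP = 34   /* assumed to match R_XTENSA_SLOT14_OP */
};

/* Hypothetical, simplified relocation record. */
struct reloc {
    uint32_t offset;  /* byte offset into the section */
    uint32_t type;    /* relocation type */
    uint32_t value;   /* resolved symbol address + addend */
};

bool is_slot_reloc(uint32_t type)
{
    return type >= RELOC_SLOT0_OP && type <= RELOC_SLOT14_OP;
}

/* Returns the number of words patched, or -1 on an unsupported
 * relocation type or an out-of-range offset. */
int apply_relocs(unsigned char *section, size_t size,
                 const struct reloc *relocs, size_t count)
{
    int patched = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t word;

        if (is_slot_reloc(relocs[i].type))
            continue;               /* instruction reloc: leave code alone */
        if (relocs[i].type != RELOC_32)
            return -1;
        if (relocs[i].offset > size || size - relocs[i].offset < 4)
            return -1;

        /* memcpy keeps this safe even if the word is unaligned. */
        memcpy(&word, section + relocs[i].offset, sizeof word);
        word += relocs[i].value;
        memcpy(section + relocs[i].offset, &word, sizeof word);
        patched++;
    }
    return patched;
}
```

Under these assumptions the loader never decodes an instruction, so it would not care whether an L32R or a wide branch sits in slot0 or slot3 of a bundle.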
>>>>>>
>>>>>>
>>>>>>
>>>>>>> It's also important to note that the ld doesn't do branch-level
>>>>>>> relaxation.
>>>>>> Ah, thanks for pointing that out, makes sense.
>>>>>>
>>>>>>
>>>>>> -Marc
>>>>>>
>>>>>>
>>>>>>
>>>>>>> It does do branch-level relocation though. So the assembler
>>>>>>> leaves branch-related relocations in the object files. Which in
>>>>>>> turn confuse the kernel module loader.
>>>>>>>
>>>>>>> Also, the need for relocation and relaxation is not related to
>>>>>>> the way we put literals in the text section.
>>>>>>>
>>>>>>> Sterling
> 