[Linux-Xtensa] Re: TLB and PTE coherency during munmap

Max Filippov jcmvbkbc at gmail.com
Tue May 28 21:15:28 PDT 2013


Hi Catalin,

On Tue, May 28, 2013 at 6:35 PM, Catalin Marinas <catalin.marinas at arm.com> wrote:
> Max,
>
> On 26 May 2013 03:42, Max Filippov <jcmvbkbc at gmail.com> wrote:
>> Hello arch and mm people.
>>
>> Is it intentional that threads of a process that invoked munmap syscall
>> can see TLB entries pointing to already freed pages, or it is a bug?
>
> If it happens, this would be a bug. It means that a process can access
> a physical page that has been allocated to something else, possibly
> kernel data.
>
>> I'm talking about zap_pmd_range and zap_pte_range:
>>
>>       zap_pmd_range
>>         zap_pte_range
>>           arch_enter_lazy_mmu_mode
>>             ptep_get_and_clear_full
>>             tlb_remove_tlb_entry
>>             __tlb_remove_page
>>           arch_leave_lazy_mmu_mode
>>         cond_resched
>>
>> With the default arch_{enter,leave}_lazy_mmu_mode, tlb_remove_tlb_entry
>> and __tlb_remove_page there is a loop in the zap_pte_range that clears
>> PTEs and frees corresponding pages, but doesn't flush TLB, and
>> surrounding loop in the zap_pmd_range that calls cond_resched. If a thread
>> of the same process gets scheduled then it is able to see TLB entries
>> pointing to already freed physical pages.
>
> It looks to me like cond_resched() here introduces a possible bug but
> it depends on the actual arch code, especially the
> __tlb_remove_tlb_entry() function. On ARM we record the range in
> tlb_remove_tlb_entry() and queue the pages to be removed in
> __tlb_remove_page(). It pretty much acts like tlb_fast_mode() == 0
> even for the UP case (which is also needed for hardware speculative
> TLB loads). The tlb_finish_mmu() takes care of whatever pages are left
> to be freed.
>
> With a dummy __tlb_remove_tlb_entry() and tlb_fast_mode() == 1,
> cond_resched() in zap_pmd_range() would cause problems.

So it looks like most architectures in the UP configuration should have
this issue (unless they flush the TLB in switch_mm, even when switching
to the same mm):

            tlb_remove_tlb_entry  __tlb_remove_tlb_entry  __tlb_remove_page  __HAVE_ARCH_ENTER_LAZY_MMU_MODE
            non-default           non-trivial             non-default        defined
alpha
arc
arm         yes                                           yes
arm64       yes                                           yes
avr32
blackfin
c6x
cris
frv
h8300
hexagon
ia64        yes                   yes                     yes
Kconfig
m32r
m68k
metag
microblaze
mips
mn10300
openrisc
parisc
powerpc                           yes                                        yes
s390        yes                                           yes (a)
score
sh          yes                                           yes (a)
sparc                                                                        yes
tile
um          yes                   yes                     yes
unicore32
x86                                                                          yes
xtensa

(a) __tlb_remove_page == free_page_and_swap_cache
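
To make the window concrete, here is a toy user-space model of the
default fast-mode path. It is not kernel code: struct toy_mm and the
toy_* helpers are invented for illustration, and the single flat TLB
array is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 4

/* Toy model (hypothetical, not kernel code): one mm, one TLB, a few pages. */
struct toy_mm {
	int  pte[NPAGES];        /* virtual page -> frame, -1 = unmapped */
	int  tlb[NPAGES];        /* cached translation, -1 = no entry */
	bool frame_free[NPAGES]; /* true once the frame was given back */
};

static void toy_init(struct toy_mm *mm)
{
	for (size_t i = 0; i < NPAGES; i++) {
		mm->pte[i] = (int)i;
		mm->tlb[i] = (int)i; /* assume every page has been touched */
		mm->frame_free[i] = false;
	}
}

static void toy_flush_tlb(struct toy_mm *mm)
{
	for (size_t i = 0; i < NPAGES; i++)
		mm->tlb[i] = -1;
}

/*
 * Mimics the zap_pte_range loop with the default hooks in fast mode:
 * clear the PTE, free the page at once, never touch the TLB.
 */
static void toy_zap_fast(struct toy_mm *mm)
{
	for (size_t i = 0; i < NPAGES; i++) {
		int frame = mm->pte[i];

		assert(frame >= 0);
		mm->pte[i] = -1;              /* ptep_get_and_clear_full */
		/* default tlb_remove_tlb_entry: nothing to do */
		mm->frame_free[frame] = true; /* page freed immediately */
	}
}

/* Number of TLB entries that still point at an already freed frame. */
static int toy_stale_entries(const struct toy_mm *mm)
{
	int n = 0;

	for (size_t i = 0; i < NPAGES; i++)
		if (mm->tlb[i] >= 0 && mm->frame_free[mm->tlb[i]])
			n++;
	return n;
}
```

After toy_init() followed by toy_zap_fast(), toy_stale_entries() equals
NPAGES. That is the state a sibling thread would find if cond_resched
switched to it at that point; only toy_flush_tlb() brings it back to 0.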

> I think possible workarounds:
>
> 1. tlb_fast_mode() always returning 0.
> 2. add a tlb_flush_mmu(tlb) before cond_resched() in zap_pmd_range().
> 3. implement __tlb_remove_tlb_entry() on xtensa to always flush the
> tlb (which is probably costly).
> 4. drop the cond_resched() (not sure about preemptible kernels though).
>
> I would vote for 1 but let's see what the mm people say.
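
For comparison, a rough sketch of what option 1 would buy us, again as
a toy user-space model rather than kernel code. struct gather_model and
the gather_* names are made up, and the real mmu_gather batching is
more involved. The idea is that pages are only queued while the PTEs
are cleared, and are freed after the TLB has been flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GATHER_PAGES 4

/* Toy model (hypothetical): a TLB plus a batch of frames awaiting free. */
struct gather_model {
	int    tlb[GATHER_PAGES];        /* cached frame, -1 = no entry */
	bool   frame_free[GATHER_PAGES]; /* true once the frame is freed */
	int    batch[GATHER_PAGES];      /* frames queued for freeing */
	size_t nr;                       /* number of queued frames */
};

static void gather_init(struct gather_model *g)
{
	for (size_t i = 0; i < GATHER_PAGES; i++) {
		g->tlb[i] = (int)i; /* assume every page has been touched */
		g->frame_free[i] = false;
	}
	g->nr = 0;
}

/* Stand-in for __tlb_remove_page with tlb_fast_mode() == 0: queue only. */
static void gather_remove_page(struct gather_model *g, int frame)
{
	assert(g->nr < GATHER_PAGES);
	g->batch[g->nr++] = frame;
}

/* Stand-in for the flush step: drop TLB entries first, then free. */
static void gather_flush(struct gather_model *g)
{
	for (size_t i = 0; i < GATHER_PAGES; i++)
		g->tlb[i] = -1;
	for (size_t i = 0; i < g->nr; i++)
		g->frame_free[g->batch[i]] = true;
	g->nr = 0;
}

/* Number of TLB entries that point at an already freed frame. */
static int gather_stale_entries(const struct gather_model *g)
{
	int n = 0;

	for (size_t i = 0; i < GATHER_PAGES; i++)
		if (g->tlb[i] >= 0 && g->frame_free[g->tlb[i]])
			n++;
	return n;
}
```

In this model gather_stale_entries() stays at 0 both before and after
gather_flush(), because no frame is released while a TLB entry can
still refer to it.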

-- 
Thanks.
-- Max

