Why does mincore() return a different value from stat?


DESCRIPTION

An analyzer of MongoDB 2.4's new feature returned results that were hard to understand: the value of "resident" was completely different from "pagesInMemory". But why? "resident" comes from STAT, while "pagesInMemory" comes from mincore(). These slides illustrate the issue.


Page 1: Why does mincore() return a different value from stat?

Page 2: mapped file

Page 3: Interior of a process

[Diagram, divided into User-land and Kernel-land: FS holds the mapped file (page 1 to page 256); virtual mem holds the mapped area created by "mmap() 1MB"; physical mem is used by others.]

Page 4:

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem is used by others.]

Page 5: Major page fault

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem is used by others. Annotation: "touch".]

Page 6: Major page fault. Read page from disk

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem now holds page 2, the rest is used by others. Annotation: "touch".]

Page 7: Major page fault. Associate physical memory with virtual memory

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds page 2, the rest is used by others. Annotation: "touch".]

Page 8:

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds page 2 and page 3, the rest is used by others. Annotation: "touch".]

Page 9:

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 2, 3, 6, 7, 1 and 4, the rest is used by others.]

Page 10: swap out (just an illustration, not the actual behavior)

Page 11:

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 2, 3, 6, 7, 1 and 4, the rest is used by others. Annotation: "touch".]

Page 12:

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 2, 3, 6, 7, 1 and 4, the rest is used by others. Annotation: "touch".]

Page 13:

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 3, 6, 7, 1 and 4 (page 2 is no longer there), the rest is used by others. Annotation: "touch".]

Page 14:

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 5, 3, 6, 7, 1 and 4, the rest is used by others. Annotation: "touch".]

Page 15: restart the process

Page 16: Old process

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 5, 3, 6, 7, 1 and 4, the rest is used by others.]

Page 17: Kill process

[Diagram: User-land now shows "Nothing". In Kernel-land, FS still holds the mapped file (page 1 to page 256) and physical mem still holds pages 5, 3, 6, 7, 1 and 4, the rest is used by others.]

Page 18: New process

[Diagram, divided into User-land and Kernel-land: virtual mem of the new process, with no mapped area yet; FS holds the mapped file (page 1 to page 256); physical mem holds pages 5, 3, 6, 7, 1 and 4, the rest is used by others.]

Page 19: mmap() 1MB

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 5, 3, 6, 7, 1 and 4, the rest is used by others.]

Page 20: Actually, some pages are already in memory

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 5, 3, 6, 7, 1 and 4, the rest is used by others.]

Page 21: Minor page fault. ONLY associate physical memory with virtual memory

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 5, 3, 6, 7, 1 and 4, the rest is used by others. Annotation: "touch".]

Page 22: names & seeing

Page 23:

[Diagram, divided into User-land and Kernel-land: virtual mem holds the mapped area (page 1 to page 256); FS holds the mapped file (page 1 to page 256); physical mem holds pages 5, 3, 6, 7, 1 and 4, the rest is used by others. Labels placed on the diagram: PTE, VMA, RES (resident), SHR, VIRT (virtual), mincore(). Tools named as sources of these values: top, /proc/<pid>/smaps, /proc/<pid>/statm, and so on.]

Page 24: Kernel code

Page 25:

fs/proc/task_mmu.c#L447

    static void smaps_pte_entry(pte_t ptent, unsigned long addr,
            unsigned long ptent_size, struct mm_walk *walk)
    {
        struct mem_size_stats *mss = walk->private;
        struct vm_area_struct *vma = mss->vma;
        pgoff_t pgoff = linear_page_index(vma, addr);
        struct page *page = NULL;
        int mapcount;

        if (pte_present(ptent)) {
            page = vm_normal_page(vma, addr, ptent);
        } else if (is_swap_pte(ptent)) {
            swp_entry_t swpent = pte_to_swp_entry(ptent);

            if (!non_swap_entry(swpent))
                mss->swap += ptent_size;
            else if (is_migration_entry(swpent))
                page = migration_entry_to_page(swpent);
        } else if (pte_file(ptent)) {
            if (pte_to_pgoff(ptent) != pgoff)
                mss->nonlinear += ptent_size;
        }

        if (!page)
            return;

        if (PageAnon(page))
            mss->anonymous += ptent_size;

        if (page->index != pgoff)
            mss->nonlinear += ptent_size;

        mss->resident += ptent_size;
        :
        :

mm/mincore.c#L108

    static void mincore_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
            unsigned long addr, unsigned long end,
            unsigned char *vec)
    {
        unsigned long next;
        spinlock_t *ptl;
        pte_t *ptep;

        ptep = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
        do {
            pte_t pte = *ptep;
            pgoff_t pgoff;

            next = addr + PAGE_SIZE;
            if (pte_none(pte))
                mincore_unmapped_range(vma, addr, next, vec);
            else if (pte_present(pte))
                *vec = 1;
            else if (pte_file(pte)) {
                pgoff = pte_to_pgoff(pte);
                *vec = mincore_page(vma->vm_file->f_mapping, pgoff);
            } else { /* pte is a swap entry */
                swp_entry_t entry = pte_to_swp_entry(pte);
                :
                :