summaryrefslogtreecommitdiffstats
path: root/arch (follow)
Commit message (Collapse)AuthorAgeFilesLines
* ARM: mach-shmobile: sh73a0 CPGA fix for FRQCRA M3Magnus Damm2011-01-181-1/+1
| | | | | | | | Fix the M3 field offset for the FRQCRA register in the sh73a0 CPGA. It should be 12, not 8. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* ARM: mach-shmobile: remove sh7367 on-chip set_irq_type()Magnus Damm2011-01-181-1/+0
| | | | | | | | set_irq_type() should only be used for external IRQ pins, so update the G3EVM board code to remove low level request. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* ARM: mach-shmobile: sh7372 INTCS MFIS2 interrupt updateMagnus Damm2011-01-181-4/+7
| | | | | | | | | | Enable the MFIS2 interrupt source in the INTCS interrupt controller included in the sh7372 processor. The priority field is constantly enabled to let the interrupt through to both the ARM side and the SH side. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* ARM: mach-shmobile: ag5evm: Add IrDA supportKuninori Morimoto2011-01-182-1/+29
| | | | | Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* ARM: mach-shmobile: clock-sh7372: fixup pllc2 set_rateKuninori Morimoto2011-01-181-1/+3
| | | | | | | | | | | This patch fixup 421b446abeec55bed1251fab80cb5c12be58b773 - Care clk->rate - Don't over write PLLC2 enable bit Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reported-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* Merge branch 'common/mmcif' into rmobile-latestPaul Mundt2011-01-1457-598/+1222
|\
| * Merge branch 'stable/gntdev' of ↵Linus Torvalds2011-01-144-369/+525
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/gntdev' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/p2m: Fix module linking error. xen p2m: clear the old pte when adding a page to m2p_override xen gntdev: use gnttab_map_refs and gnttab_unmap_refs xen: introduce gnttab_map_refs and gnttab_unmap_refs xen p2m: transparently change the p2m mappings in the m2p override xen/gntdev: Fix circular locking dependency xen/gntdev: stop using "token" argument xen: gntdev: move use of GNTMAP_contains_pte next to the map_op xen: add m2p override mechanism xen: move p2m handling to separate file xen/gntdev: add VM_PFNMAP to vma xen/gntdev: allow usermode to map granted pages xen: define gnttab_set_map_op/unmap_op Fix up trivial conflict in drivers/xen/Kconfig
| | * xen/p2m: Fix module linking error.Konrad Rzeszutek Wilk2011-01-111-0/+1
| | | | | | | | | | | | | | | | | | | | | Fixes: ERROR: "m2p_find_override_pfn" [drivers/block/xen-blkfront.ko] undefined! Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen p2m: clear the old pte when adding a page to m2p_overrideJeremy Fitzhardinge2011-01-112-6/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding a page to m2p_override we change the p2m of the page so we need to also clear the old pte of the kernel linear mapping because it doesn't correspond anymore. When we remove the page from m2p_override we restore the original p2m of the page and we also restore the old pte of the kernel linear mapping. Before changing the p2m mappings in m2p_add_override and m2p_remove_override, check that the page passed as argument is valid and return an error if it is not. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen p2m: transparently change the p2m mappings in the m2p overrideStefano Stabellini2011-01-111-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In m2p_add_override store the original mfn into page->index and then change the p2m mapping, setting mfns as FOREIGN_FRAME. In m2p_remove_override restore the original mapping. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen: add m2p override mechanismJeremy Fitzhardinge2011-01-112-3/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a simple hashtable based mechanism to override some portions of the m2p, so that we can find out the pfn corresponding to an mfn of a granted page. In fact entries corresponding to granted pages in the m2p hold the original pfn value of the page in the source domain that granted it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen: move p2m handling to separate fileJeremy Fitzhardinge2011-01-113-366/+378
| | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | thp: mm: define MADV_NOHUGEPAGEAndrea Arcangeli2011-01-144-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Define MADV_NOHUGEPAGE. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: mmu_notifier_test_youngAndrea Arcangeli2011-01-143-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For GRU and EPT, we need gup-fast to set referenced bit too (this is why it's correct to return 0 when shadow_access_mask is zero, it requires gup-fast to set the referenced bit). qemu-kvm access already sets the young bit in the pte if it isn't zero-copy, if it's zero copy or a shadow paging EPT minor fault we relay on gup-fast to signal the page is in use... We also need to check the young bits on the secondary pagetables for NPT and not nested shadow mmu as the data may never get accessed again by the primary pte. Without this closer accuracy, we'd have to remove the heuristic that avoids collapsing hugepages in hugepage virtual regions that have not even a single subpage in use. ->test_young is full backwards compatible with GRU and other usages that don't have young bits in pagetables set by the hardware and that should nuke the secondary mmu mappings when ->clear_flush_young runs just like EPT does. Removing the heuristic that checks the young bit in khugepaged/collapse_huge_page completely isn't so bad either probably but I thought it was worth it and this makes it reliable. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: don't allow transparent hugepage support without PSEAndrea Arcangeli2011-01-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Archs implementing Transparent Hugepage Support must implement a function called has_transparent_hugepage to be sure the virtual or physical CPU supports Transparent Hugepages. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: add pmd_modifyJohannes Weiner2011-01-142-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add pmd_modify() for use with mprotect() on huge pmds. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: add x86 32bit supportJohannes Weiner2011-01-145-111/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for transparent hugepages to x86 32bit. Share the same VM_ bitflag for VM_MAPPED_COPY. mm/nommu.c will never support transparent hugepages. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: transparent hugepage coreAndrea Arcangeli2011-01-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lately I've been working to make KVM use hugepages transparently without the usual restrictions of hugetlbfs. Some of the restrictions I'd like to see removed: 1) hugepages have to be swappable or the guest physical memory remains locked in RAM and can't be paged out to swap 2) if a hugepage allocation fails, regular pages should be allocated instead and mixed in the same vma without any failure and without userland noticing 3) if some task quits and more hugepages become available in the buddy, guest physical memory backed by regular pages should be relocated on hugepages automatically in regions under madvise(MADV_HUGEPAGE) (ideally event driven by waking up the kernel deamon if the order=HPAGE_PMD_SHIFT-PAGE_SHIFT list becomes not null) 4) avoidance of reservation and maximization of use of hugepages whenever possible. Reservation (needed to avoid runtime fatal faliures) may be ok for 1 machine with 1 database with 1 database cache with 1 database cache size known at boot time. It's definitely not feasible with a virtualization hypervisor usage like RHEV-H that runs an unknown number of virtual machines with an unknown size of each virtual machine with an unknown amount of pagecache that could be potentially useful in the host for guest not using O_DIRECT (aka cache=off). hugepages in the virtualization hypervisor (and also in the guest!) are much more important than in a regular host not using virtualization, becasue with NPT/EPT they decrease the tlb-miss cacheline accesses from 24 to 19 in case only the hypervisor uses transparent hugepages, and they decrease the tlb-miss cacheline accesses from 19 to 15 in case both the linux hypervisor and the linux guest both uses this patch (though the guest will limit the addition speedup to anonymous regions only for now...). Even more important is that the tlb miss handler is much slower on a NPT/EPT guest than for a regular shadow paging or no-virtualization scenario. So maximizing the amount of virtual memory cached by the TLB pays off significantly more with NPT/EPT than without (even if there would be no significant speedup in the tlb-miss runtime). The first (and more tedious) part of this work requires allowing the VM to handle anonymous hugepages mixed with regular pages transparently on regular anonymous vmas. This is what this patch tries to achieve in the least intrusive possible way. We want hugepages and hugetlb to be used in a way so that all applications can benefit without changes (as usual we leverage the KVM virtualization design: by improving the Linux VM at large, KVM gets the performance boost too). The most important design choice is: always fallback to 4k allocation if the hugepage allocation fails! This is the _very_ opposite of some large pagecache patches that failed with -EIO back then if a 64k (or similar) allocation failed... Second important decision (to reduce the impact of the feature on the existing pagetable handling code) is that at any time we can split an hugepage into 512 regular pages and it has to be done with an operation that can't fail. This way the reliability of the swapping isn't decreased (no need to allocate memory when we are short on memory to swap) and it's trivial to plug a split_huge_page* one-liner where needed without polluting the VM. Over time we can teach mprotect, mremap and friends to handle pmd_trans_huge natively without calling split_huge_page*. The fact it can't fail isn't just for swap: if split_huge_page would return -ENOMEM (instead of the current void) we'd need to rollback the mprotect from the middle of it (ideally including undoing the split_vma) which would be a big change and in the very wrong direction (it'd likely be simpler not to call split_huge_page at all and to teach mprotect and friends to handle hugepages instead of rolling them back from the middle). In short the very value of split_huge_page is that it can't fail. The collapsing and madvise(MADV_HUGEPAGE) part will remain separated and incremental and it'll just be an "harmless" addition later if this initial part is agreed upon. It also should be noted that locking-wise replacing regular pages with hugepages is going to be very easy if compared to what I'm doing below in split_huge_page, as it will only happen when page_count(page) matches page_mapcount(page) if we can take the PG_lock and mmap_sem in write mode. collapse_huge_page will be a "best effort" that (unlike split_huge_page) can fail at the minimal sign of trouble and we can try again later. collapse_huge_page will be similar to how KSM works and the madvise(MADV_HUGEPAGE) will work similar to madvise(MADV_MERGEABLE). The default I like is that transparent hugepages are used at page fault time. This can be changed with /sys/kernel/mm/transparent_hugepage/enabled. The control knob can be set to three values "always", "madvise", "never" which mean respectively that hugepages are always used, or only inside madvise(MADV_HUGEPAGE) regions, or never used. /sys/kernel/mm/transparent_hugepage/defrag instead controls if the hugepage allocation should defrag memory aggressively "always", only inside "madvise" regions, or "never". The pmd_trans_splitting/pmd_trans_huge locking is very solid. The put_page (from get_user_page users that can't use mmu notifier like O_DIRECT) that runs against a __split_huge_page_refcount instead was a pain to serialize in a way that would result always in a coherent page count for both tail and head. I think my locking solution with a compound_lock taken only after the page_first is valid and is still a PageHead should be safe but it surely needs review from SMP race point of view. In short there is no current existing way to serialize the O_DIRECT final put_page against split_huge_page_refcount so I had to invent a new one (O_DIRECT loses knowledge on the mapping status by the time gup_fast returns so...). And I didn't want to impact all gup/gup_fast users for now, maybe if we change the gup interface substantially we can avoid this locking, I admit I didn't think too much about it because changing the gup unpinning interface would be invasive. If we ignored O_DIRECT we could stick to the existing compound refcounting code, by simply adding a get_user_pages_fast_flags(foll_flags) where KVM (and any other mmu notifier user) would call it without FOLL_GET (and if FOLL_GET isn't set we'd just BUG_ON if nobody registered itself in the current task mmu notifier list yet). But O_DIRECT is fundamental for decent performance of virtualized I/O on fast storage so we can't avoid it to solve the race of put_page against split_huge_page_refcount to achieve a complete hugepage feature for KVM. Swap and oom works fine (well just like with regular pages ;). MMU notifier is handled transparently too, with the exception of the young bit on the pmd, that didn't have a range check but I think KVM will be fine because the whole point of hugepages is that EPT/NPT will also use a huge pmd when they notice gup returns pages with PageCompound set, so they won't care of a range and there's just the pmd young bit to check in that case. NOTE: in some cases if the L2 cache is small, this may slowdown and waste memory during COWs because 4M of memory are accessed in a single fault instead of 8k (the payoff is that after COW the program can run faster). So we might want to switch the copy_huge_page (and clear_huge_page too) to not temporal stores. I also extensively researched ways to avoid this cache trashing with a full prefault logic that would cow in 8k/16k/32k/64k up to 1M (I can send those patches that fully implemented prefault) but I concluded they're not worth it and they add an huge additional complexity and they remove all tlb benefits until the full hugepage has been faulted in, to save a little bit of memory and some cache during app startup, but they still don't improve substantially the cache-trashing during startup if the prefault happens in >4k chunks. One reason is that those 4k pte entries copied are still mapped on a perfectly cache-colored hugepage, so the trashing is the worst one can generate in those copies (cow of 4k page copies aren't so well colored so they trashes less, but again this results in software running faster after the page fault). Those prefault patches allowed things like a pte where post-cow pages were local 4k regular anon pages and the not-yet-cowed pte entries were pointing in the middle of some hugepage mapped read-only. If it doesn't payoff substantially with todays hardware it will payoff even less in the future with larger l2 caches, and the prefault logic would blot the VM a lot. If one is emebdded transparent_hugepage can be disabled during boot with sysfs or with the boot commandline parameter transparent_hugepage=0 (or transparent_hugepage=2 to restrict hugepages inside madvise regions) that will ensure not a single hugepage is allocated at boot time. It is simple enough to just disable transparent hugepage globally and let transparent hugepages be allocated selectively by applications in the MADV_HUGEPAGE region (both at page fault time, and if enabled with the collapse_huge_page too through the kernel daemon). This patch supports only hugepages mapped in the pmd, archs that have smaller hugepages will not fit in this patch alone. Also some archs like power have certain tlb limits that prevents mixing different page size in the same regions so they will not fit in this framework that requires "graceful fallback" to basic PAGE_SIZE in case of physical memory fragmentation. hugetlbfs remains a perfect fit for those because its software limits happen to match the hardware limits. hugetlbfs also remains a perfect fit for hugepage sizes like 1GByte that cannot be hoped to be found not fragmented after a certain system uptime and that would be very expensive to defragment with relocation, so requiring reservation. hugetlbfs is the "reservation way", the point of transparent hugepages is not to have any reservation at all and maximizing the use of cache and hugepages at all times automatically. Some performance result: vmx andrea # LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=yes HUGETLB_PATH=/mnt/huge/ ./largep ages3 memset page fault 1566023 memset tlb miss 453854 memset second tlb miss 453321 random access tlb miss 41635 random access second tlb miss 41658 vmx andrea # LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=yes HUGETLB_PATH=/mnt/huge/ ./largepages3 memset page fault 1566471 memset tlb miss 453375 memset second tlb miss 453320 random access tlb miss 41636 random access second tlb miss 41637 vmx andrea # ./largepages3 memset page fault 1566642 memset tlb miss 453417 memset second tlb miss 453313 random access tlb miss 41630 random access second tlb miss 41647 vmx andrea # ./largepages3 memset page fault 1566872 memset tlb miss 453418 memset second tlb miss 453315 random access tlb miss 41618 random access second tlb miss 41659 vmx andrea # echo 0 > /proc/sys/vm/transparent_hugepage vmx andrea # ./largepages3 memset page fault 2182476 memset tlb miss 460305 memset second tlb miss 460179 random access tlb miss 44483 random access second tlb miss 44186 vmx andrea # ./largepages3 memset page fault 2182791 memset tlb miss 460742 memset second tlb miss 459962 random access tlb miss 43981 random access second tlb miss 43988 ============ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/time.h> #define SIZE (3UL*1024*1024*1024) int main() { char *p = malloc(SIZE), *p2; struct timeval before, after; gettimeofday(&before, NULL); memset(p, 0, SIZE); gettimeofday(&after, NULL); printf("memset page fault %Lu\n", (after.tv_sec-before.tv_sec)*1000000UL + after.tv_usec-before.tv_usec); gettimeofday(&before, NULL); memset(p, 0, SIZE); gettimeofday(&after, NULL); printf("memset tlb miss %Lu\n", (after.tv_sec-before.tv_sec)*1000000UL + after.tv_usec-before.tv_usec); gettimeofday(&before, NULL); memset(p, 0, SIZE); gettimeofday(&after, NULL); printf("memset second tlb miss %Lu\n", (after.tv_sec-before.tv_sec)*1000000UL + after.tv_usec-before.tv_usec); gettimeofday(&before, NULL); for (p2 = p; p2 < p+SIZE; p2 += 4096) *p2 = 0; gettimeofday(&after, NULL); printf("random access tlb miss %Lu\n", (after.tv_sec-before.tv_sec)*1000000UL + after.tv_usec-before.tv_usec); gettimeofday(&before, NULL); for (p2 = p; p2 < p+SIZE; p2 += 4096) *p2 = 0; gettimeofday(&after, NULL); printf("random access second tlb miss %Lu\n", (after.tv_sec-before.tv_sec)*1000000UL + after.tv_usec-before.tv_usec); return 0; } ============ Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: kvm mmu transparent hugepage supportAndrea Arcangeli2011-01-142-17/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This should work for both hugetlbfs and transparent hugepages. [akpm@linux-foundation.org: bring forward PageTransCompound() addition for bisectability] Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: split_huge_page_mm/vmaAndrea Arcangeli2011-01-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | split_huge_page_pmd compat code. Each one of those would need to be expanded to hundred of lines of complex code without a fully reliable split_huge_page_pmd design. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: pte alloc trans splittingAndrea Arcangeli2011-01-148-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pte alloc routines must wait for split_huge_page if the pmd is not present and not null (i.e. pmd_trans_splitting). The additional branches are optimized away at compile time by pmd_trans_splitting if the config option is off. However we must pass the vma down in order to know the anon_vma lock to wait for. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: bail out gup_fast on splitting pmdAndrea Arcangeli2011-01-141-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Force gup_fast to take the slow path and block if the pmd is splitting, not only if it's none. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: add pmd mangling functions to x86Andrea Arcangeli2011-01-143-8/+179
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add needed pmd mangling functions with symmetry with their pte counterparts. pmdp_splitting_flush() is the only new addition on the pmd_ methods and it's needed to serialize the VM against split_huge_page. It simply atomically sets the splitting bit in a similar way pmdp_clear_flush_young atomically clears the accessed bit. pmdp_splitting_flush() also has to flush the tlb to make it effective against gup_fast, but it wouldn't really require to flush the tlb too. Just the tlb flush is the simplest operation we can invoke to serialize pmdp_splitting_flush() against gup_fast. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: special pmd_trans_* functionsAndrea Arcangeli2011-01-142-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These returns 0 at compile time when the config option is disabled, to allow gcc to eliminate the transparent hugepage function calls at compile time without additional #ifdefs (only the export of those functions have to be visible to gcc but they won't be required at link time and huge_memory.o can be not built at all). _PAGE_BIT_UNUSED1 is never used for pmd, only on pte. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: no paravirt version of pmd opsAndrea Arcangeli2011-01-141-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No paravirt version of set_pmd_at/pmd_update/pmd_update_defer. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: add pmd paravirt opsAndrea Arcangeli2011-01-143-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Paravirt ops pmd_update/pmd_update_defer/pmd_set_at. Not all might be necessary (vmware needs pmd_update, Xen needs set_pmd_at, nobody needs pmd_update_defer), but this is to keep full simmetry with pte paravirt ops, which looks cleaner and simpler from a common code POV. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: add native_set_pmd_atAndrea Arcangeli2011-01-141-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Used by paravirt and not paravirt set_pmd_at. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: alter compound get_page/put_pageAndrea Arcangeli2011-01-142-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alter compound get_page/put_page to keep references on subpages too, in order to allow __split_huge_page_refcount to split an hugepage even while subpages have been pinned by one of the get_user_pages() variants. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | thp: mm: define MADV_HUGEPAGEAndrea Arcangeli2011-01-144-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define MADV_HUGEPAGE. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mm: unify module_alloc code for vmallocDavid Rientjes2011-01-144-45/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Four architectures (arm, mips, sparc, x86) use __vmalloc_area() for module_init(). Much of the code is duplicated and can be generalized in a globally accessible function, __vmalloc_node_range(). __vmalloc_node() now calls into __vmalloc_node_range() with a range of [VMALLOC_START, VMALLOC_END) for functionally equivalent behavior. Each architecture may then use __vmalloc_node_range() directly to remove the duplication of code. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | Merge branch 'release' of ↵Linus Torvalds2011-01-141-1/+1
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] fix build error - arch/ia64/kernel/perfmon.c
| | * | [IA64] fix build error - arch/ia64/kernel/perfmon.cTony Luck2011-01-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch/ia64/kernel/perfmon.c:621: error: duplicate 'static' Introduced by commit c74a1cbb3cac348f276fabc381758f5b0b4713b2 pass default dentry_operations to mount_pseudo() Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | | Merge branch 'for-linus' of ↵Linus Torvalds2011-01-1321-149/+206
| |\ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/avr32-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/avr32-2.6: avr32: update default configuration files for Atmel boards avr32: Convert to clocksource_register_hz avr32: make architecture sys_clone prototype match asm-generic prototype avr32: use syscall prototypes from asm-generic instead of arch avr32: disable kprobes for all default configurations avr32: boards: setup: use IS_ERR() instead of NULL check
| | * | avr32: update default configuration files for Atmel boardsHans-Christian Egtvedt2011-01-1310-109/+184
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adjusts some values to make the default configuration for Atmel boards more similar, and adds missing values to enable required functions. Also remove defined symbols for functions not in use. Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
| | * | avr32: Convert to clocksource_register_hzJohn Stultz2011-01-131-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This converts the avr32 clocksource to use clocksource_register_hz. This is untested, so any assistance in testing would be appreciated! CC: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <johnstul@us.ibm.com>
| | * | avr32: make architecture sys_clone prototype match asm-generic prototypeHans-Christian Egtvedt2011-01-132-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch will fix the arguments to the architecture sys_clone() function to match the asm-generic/syscalls.h prototype. In the same go remove the architecture specific prototype for the same function. The sys_clone() function is only called from assembly, hence the argument types were not having any affect. Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
| | * | avr32: use syscall prototypes from asm-generic instead of archHans-Christian Egtvedt2011-01-131-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the redundant syscalls prototypes in the architecture specific syscalls.h header file. These were identical with the ones in asm-generic/syscalls.h. Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Reported-by: Peter Huewe <PeterHuewe@gmx.de> Reported-by: Sven Schnelle <svens@stackframe.org> Cc: stable <stable@kernel.org>
| | * | avr32: disable kprobes for all default configurationsHans-Christian Egtvedt2011-01-1311-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch will disable kprobes for all the default AVR32 board configurations. This works around a regression in kprobes which seems to be related to AVR32 is now lacking the struct kprobe_ctlblk. Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
| | * | avr32: boards: setup: use IS_ERR() instead of NULL checkVasiliy Kulikov2011-01-136-6/+6
| | |/ | | | | | | | | | | | | | | | | | | clk_get() returns ERR_PTR() on error, not NULL. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
* | | ARM: mach-shmobile: ag5evm requires GPIOLIBYoshii Takashi2011-01-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Because ag5evm board setup code uses gpio functions, ARCH_REQUIRE_GPIOLIB should be set in Kconfig. Otherwise, the first build with defconfig fails. Signed-off-by: Takashi YOSHII <takashi.yoshii.zj@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* | | ARM: mach-shmobile: fix cpu_base of gic_init() on sh73a0Yoshii Takashi2011-01-131-2/+3
|/ / | | | | | | | | | | | | | | | | | | | | | | The latest rmobile-latest doesn't run on ag5evm because of a small mistake on initialization. Though, I don't have any idea to write them smart. anyway, On sh73a0, GIC cpu_base is 0xf0000100 but 0xf0001000. Signed-off-by: Takashi YOSHII <takashi.yoshii.zj@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* | Merge branch 'release' of ↵Linus Torvalds2011-01-131-1/+1
|\ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Fix format warning in arch/ia64/kernel/acpi.c
| * | [IA64] Fix format warning in arch/ia64/kernel/acpi.cTony Luck2011-01-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | arch/ia64/kernel/acpi.c:481: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long unsigned int’ Introduced by commit 05f2f274c8a8747bbfb13ac8ee0c27d5f2ad8510 [IA64] Avoid array overflow if there are too many cpus in SRAT table Signed-off-by: Tony Luck <tony.luck@intel.com>
* | | Merge branch 'rmobile-latest' of ↵Linus Torvalds2011-01-138-19/+67
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'rmobile-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: ARM: mach-shmobile: Kill off unused !gpio_is_valid() case ARM: mach-shmobile: sh7372 Enable SDIO IRQs for Mackerel ARM: mach-shmobile: sh7377 Enable SDIO IRQs ARM: mach-shmobile: sh7367 Enable SDIO IRQs ARM: mach-shmobile: sh7372 Enable SDIO IRQs ARM: mach-shmobile: mackerel: Add touchscreen ST1232 support ARM: mach-shmobile: ap4eb: SCIF port for earlyprintk when using zboot ARM: mach-shmobile: mackerel: SCIF port for earlyprintk when using zboot ARM: mach-shmobile: mackerel: Add support get_cd in CN23
| * | | ARM: mach-shmobile: Kill off unused !gpio_is_valid() caseMagnus Damm2011-01-132-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Card Detect GPIOs used on AP4EVB and Mackerel are alwayws valid, so kill off the unused !gpio_is_valid() case. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
| * | | Merge branch 'rmobile/sdio' into rmobile-latestPaul Mundt2011-01-12283-11761/+12046
| |\ \ \
| | * | | ARM: mach-shmobile: sh7372 Enable SDIO IRQs for MackerelMagnus Damm2011-01-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable SDIO IRQ support for the Mackerel board. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
| | * | | ARM: mach-shmobile: sh7377 Enable SDIO IRQsMagnus Damm2011-01-122-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables interrupt generation for SDIO IRQs of the SDHI block on the sh7377 aka G4 processor. Use together with a recent SDHI driver using TMIO_MMC_SDIO_IRQ and with the MMC_CAP_SDIO_IRQ flag in the board code. The G4EVM specific SDHI platform data is also updated to flag SDIO capabilities. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
| | * | | ARM: mach-shmobile: sh7367 Enable SDIO IRQsMagnus Damm2011-01-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables interrupt generation for SDIO IRQs of the SDHI block on the sh7367 aka G3 processor. Use together with a recent SDHI driver using TMIO_MMC_SDIO_IRQ and with the MMC_CAP_SDIO_IRQ flag in the board code. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
| | * | | ARM: mach-shmobile: sh7372 Enable SDIO IRQsArnd Hannemann2011-01-122-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables the interrupt generation for SDIO IRQs of the sdhi controllers of the SoC. To make sure interrupts are handled announce the MMC_CAP_SDIO_IRQ capability on AP4EVB. Tested with a b43-based SDIO wireless card. Signed-off-by: Arnd Hannemann <arnd@arndnet.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>