aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-08-03Merge branch 'security-fixes' into fixesRussell King
2013-08-03ARM: fix nommu builds with 48be69a02 (ARM: move signal handlers into a ↵Russell King
vdso-like page) Olof reports that noMMU builds error out with: arch/arm/kernel/signal.c: In function 'setup_return': arch/arm/kernel/signal.c:413:25: error: 'mm_context_t' has no member named 'sigpage' This shows one of the evilnesses of IS_ENABLED(). Get rid of it here and replace it with #ifdef's - and as no noMMU platform can make use of sigpage, depend on CONIFG_MMU not CONFIG_ARM_MPU. Reported-by: Olof Johansson <olof@lixom.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-03ARM: fix a cockup in 48be69a02 (ARM: move signal handlers into a vdso-like page)Russell King
Unfortunately, I never committed the fix to a nasty oops which can occur as a result of that commit: ------------[ cut here ]------------ kernel BUG at /home/olof/work/batch/include/linux/mm.h:414! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 490 Comm: killall5 Not tainted 3.11.0-rc3-00288-gabe0308 #53 task: e90acac0 ti: e9be8000 task.ti: e9be8000 PC is at special_mapping_fault+0xa4/0xc4 LR is at __do_fault+0x68/0x48c This doesn't show up unless you do quite a bit of testing; a simple boot test does not do this, so all my nightly tests were passing fine. The reason for this is that install_special_mapping() expects the page array to stick around, and as this was only inserting one page which was stored on the kernel stack, that's why this was blowing up. Reported-by: Olof Johansson <olof@lixom.net> Tested-by: Olof Johansson <olof@lixom.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-01Merge branch 'security-fixes' into fixesRussell King
2013-08-01ARM: Add .text annotations where required after __CPUINIT removalRussell King
Commit 8bd26e3a7 (arm: delete __cpuinit/__CPUINIT usage from all ARM users) caused some code to leak into sections which are discarded through the removal of __CPUINIT annotations. Add appropriate .text annotations to bring these back into the kernel text. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-01ARM: 7803/1: Fix deadlock scenario with smp_send_stop()Stephen Boyd
If one process calls sys_reboot and that process then stops other CPUs while those CPUs are within a spin_lock() region we can potentially encounter a deadlock scenario like below. CPU 0 CPU 1 ----- ----- spin_lock(my_lock) smp_send_stop() <send IPI> handle_IPI() disable_preemption/irqs while(1); <PREEMPT> spin_lock(my_lock) <--- Waits forever We shouldn't attempt to run any other tasks after we send a stop IPI to a CPU so disable preemption so that this task runs to completion. We use local_irq_disable() here for cross-arch consistency with x86. Reported-by: Sundarajan Srinivasan <sundaraj@codeaurora.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-01ARM: make vectors page inaccessible from userspaceRussell King
If kuser helpers are not provided by the kernel, disable user access to the vectors page. With the kuser helpers gone, there is no reason for this page to be visible to userspace. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-01ARM: move signal handlers into a vdso-like pageRussell King
Move the signal handlers into a VDSO page rather than keeping them in the vectors page. This allows us to place them randomly within this page, and also map the page at a random location within userspace further protecting these code fragments from ROP attacks. The new VDSO page is also poisoned in the same way as the vector page. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: allow kuser helpers to be removed from the vector pageRussell King
Provide a kernel configuration option to allow the kernel user helpers to be removed from the vector page, thereby preventing their use with ROP (return orientated programming) attacks. This option is only visible for CPU architectures which natively support all the operations which kernel user helpers would normally provide, and must be enabled with caution. Cc: <stable@vger.kernel.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: update FIQ support for relocation of vectorsRussell King
FIQ should no longer copy the FIQ code into the user visible vector page. Instead, it should use the hidden page. This change makes that happen. Cc: <stable@vger.kernel.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: use linker magic for vectors and vector stubsRussell King
Use linker magic to create the vectors and vector stubs: we can tell the linker to place them at an appropriate VMA, but keep the LMA within the kernel. This gets rid of some unnecessary symbol manipulation, and have the linker calculate the relocations appropriately. Cc: <stable@vger.kernel.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: move vector stubsRussell King
Move the machine vector stubs into the page above the vector page, which we can prevent from being visible to userspace. Also move the reset stub, and place the swi vector at a location that the 'ldr' can get to it. This hides pointers into the kernel which could give valuable information to attackers, and reduces the number of exploitable instructions at a fixed address. Cc: <stable@vger.kernel.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: poison memory between kuser helpersRussell King
Poison the memory between each kuser helper. This ensures that any branch between the kuser helpers will be appropriately trapped. Cc: <stable@vger.kernel.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: poison the vectors pageRussell King
Fill the empty regions of the vectors page with an exception generating instruction. This ensures that any inappropriate branch to the vector page is appropriately trapped, rather than just encountering some code to execute. (The vectors page was filled with zero before, which corresponds with the "andeq r0, r0, r0" instruction - a no-op.) Cc: <stable@vger.kernel.org> Acked-by Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: 7801/1: v6: prevent gcc 4.5 from reordering extended CP15 reads above ↵Paul Walmsley
is_smp() test Commit 621a0147d5c921f4cc33636ccd0602ad5d7cbfbc ("ARM: 7757/1: mm: don't flush icache in switch_mm with hardware broadcasting") breaks the boot on OMAP2430SDP with omap2plus_defconfig. Tracked to an undefined instruction abort from the CP15 read in cache_ops_need_broadcast(). It turns out that gcc 4.5 reorders the extended CP15 read above the is_smp() test. This breaks ARM1136 r0 cores, since they don't support several CP15 registers that later ARM cores do. ARM1136JF-S TRM section 3.2.1 "Register allocation" has the details. So mark the extended CP15 read as clobbering memory, which prevents the compiler from reordering it before the is_smp() test. Russell states that the code generated from this approach is preferable to marking the inline asm as volatile. Remove the existing condition code clobber as it's obsolete, per Nico's post: http://www.spinics.net/lists/arm-kernel/msg261208.html This patch is a collaboration with Will Deacon and Russell King. Comments from Paul Walmsley: Russell, if you accept this one, might you also add Will's ack from the lists: Comments from Paul Walmsley: I'd also be obliged if you could add a Cc: line for Jonathan Austin, since he helped test: Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Tony Lindgren <tony@atomide.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Jonathan Austin <jonathan.austin@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: 7800/1: ARMv7-M: Fix name of NVIC handler functionUwe Kleine-König
The name changed in response to review comments for the nvic irqchip driver when the original name was already accepted into Russell King's tree. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-27ARM: Fix sorting of machine- initializersRussell King
So, there's a comment I put at the top of this, which people seem to fail to read. So let's fix it for them instead. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7791/1: a.out: remove partial a.out supportWill Deacon
a.out support on ARM requires that argc, argv and envp are passed in r0-r2 respectively, which requires hacking load_aout_binary to prevent argc being clobbered by the return code. Whilst mainline kernels do set the registers up in start_thread, the aout loader has never carried the hack in mainline. Initialising the registers in this way actually goes against the libc expectations for ELF binaries, where argc, argv and envp are passed on the stack, with r0 being used to hold a pointer to an exit function for cleaning up after the dynamic linker if required. If the pointer is NULL, then it is ignored. When execing an ELF binary, Linux currently zeroes r0, then sets it to argc and then finally clobbers it with the return value of the execve syscall, so we actually end up with: r0 = 0 stack[0] = argc r1 = stack[1] = argv r2 = stack[2] = envp libc treats r1 and r2 as undefined. The clobbering of r0 by sys_execve works for user-spawned threads, but when executing an ELF binary from a kernel thread (via call_usermodehelper), the execve is performed on the ret_from_fork path, which restores r0 from the saved pt_regs, resulting in argc being presented to the C library. This has horrible consequences when the application exits, since we have an exit function registered using argc, resulting in a jump to hyperspace. This patch solves the problem by removing the partial a.out support from arch/arm/ altogether. Cc: <stable@vger.kernel.org> Cc: Ashish Sangwan <ashishsangwan2@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7790/1: Fix deferred mm switch on VIVT processorsCatalin Marinas
As of commit b9d4d42ad9 (ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs), the mm switching on VIVT processors is done in the finish_arch_post_lock_switch() function to avoid whole cache flushing with interrupts disabled. The need for deferred mm switch is stored as a thread flag (TIF_SWITCH_MM). However, with preemption enabled, we can have another thread switch before finish_arch_post_lock_switch(). If the new thread has the same mm as the previous 'next' thread, the scheduler will not call switch_mm() and the TIF_SWITCH_MM flag won't be set for the new thread. This patch moves the switch pending flag to the mm_context_t structure since this is specific to the mm rather than thread. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Cc: <stable@vger.kernel.org> # 3.5+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7789/1: Do not run dummy_flush_tlb_a15_erratum() on non-Cortex-A15Fabio Estevam
Commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)) causes the following undefined instruction error on a mx53 (Cortex-A8): Internal error: Oops - undefined instruction: 0 [#1] SMP ARM CPU: 0 PID: 275 Comm: modprobe Not tainted 3.11.0-rc2-next-20130722-00009-g9b0f371 #881 task: df46cc00 ti: df48e000 task.ti: df48e000 PC is at check_and_switch_context+0x17c/0x4d0 LR is at check_and_switch_context+0xdc/0x4d0 This problem happens because check_and_switch_context() calls dummy_flush_tlb_a15_erratum() without checking if we are really running on a Cortex-A15 or not. To avoid this issue, only call dummy_flush_tlb_a15_erratum() inside check_and_switch_context() if erratum_a15_798181() returns true, which means that we are really running on a Cortex-A15. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7787/1: virt: ensure visibility of __boot_cpu_modeMark Rutland
Secondary CPUs write to __boot_cpu_mode with caches disabled, and thus a cached value of __boot_cpu_mode may be incoherent with that in memory. This could lead to a failure to detect mismatched boot modes. This patch adds flushing to ensure that writes by secondaries to __boot_cpu_mode are made visible before we test against it. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-22ARM: 7788/1: elf: fix lpae hwcap feature reporting in proc/cpuinfoTetsuyuki Kobayashi
Commit a469abd0f868 ("ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions") added a new hwcap to identify LPAE on CPUs which support it. Whilst the hwcap data is correct, the string reported in /proc/cpuinfo actually matches on HWCAP_VFPD32, which was missing an entry in the string table. This patch fixes this problem by adding a "vfpd32" string at the correct offset, preventing us from falsely advertising LPAE on CPUs which do not support it. [will: added commit message] Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-22ARM: 7786/1: hyp: fix macro parameterisationMark Rutland
Currently, compare_cpu_mode_with_primary uses a mixture of macro arguments and hardcoded registers, and does so incorrectly, as it stores (__boot_cpu_mode_offset | BOOT_CPU_MODE_MISMATCH) to (__boot_cpu_mode + &__boot_cpu_mode_offset), which could corrupt an arbitrary portion of memory. This patch fixes up compare_cpu_mode_with_primary to use the macro arguments, correctly updating __boot_cpu_mode. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-22ARM: 7785/1: mm: restrict early_alloc to section-aligned memoryRussell King
When map_lowmem() runs, and processes a memory bank whose start or end is not section-aligned, memory must be allocated to store the 2nd-level page tables. Those allocations are made by calling memblock_alloc(). At this point, the only memory that is free *and* mapped is memory which has already been mapped by map_lowmem() itself. For this reason, we must calculate the first point at which map_lowmem() will need to allocate memory, and set the memblock allocation limit to a lower address, so that memblock_alloc() is guaranteed to return memory that is already mapped. This patch enhances sanity_check_meminfo() to calculate that memory address, and pass it to memblock_set_current_limit(), rather than just assuming the limit is arm_lowmem_limit. The algorithm applied is: * Default memblock_limit to arm_lowmem_limit in the absence of any other limit; arm_lowmem_limit is the highest memory that is mapped by map_lowmem(). * While walking the list of memblocks, if the start of a block is not aligned, 2nd-level page tables will need to be allocated to map the first few pages of the block. Hence, the memblock_limit must be before the start of the block. * Similarly, if the end of any block is not aligned, 2nd-level page tables will need to be allocated to map the last few pages of the block. Hence, the memblock_limit must point at the end of the block, rounded down to section-alignment. * The memory blocks are assumed to be sorted in address order, so the first unaligned block start or end is used to set the limit. With this algorithm, the start or end of almost any bank can be non- section-aligned. The only exception is that the start of bank 0 must be section-aligned, since otherwise memory would need to be allocated when mapping the start of bank 0, which occurs before any free memory is mapped. [swarren, wrote commit description, rewrote calculation of memblock_limit] Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-22ARM: 7784/1: mm: ensure SMP alternates assemble to exactly 4 bytes with Thumb-2Will Deacon
Commit ae8a8b9553bd ("ARM: 7691/1: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead") added early function returns for page table cache flushing operations on ARMv7 SMP CPUs. Unfortunately, when targetting Thumb-2, these `mov pc, lr' sequences assemble to 2 bytes which can lead to corruption of the instruction stream after code patching. This patch fixes the alternates to use wide (32-bit) instructions for Thumb-2, therefore ensuring that the patching code works correctly. Cc: <stable@vger.kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-22ARM: document DEBUG_UNCOMPRESS Kconfig optionRussell King
This non-user visible option lacked any kind of documentation. This is quite common for non-user visible options; certian people can't understand the point of documenting such options with help text. However, here we have a case in point: developers don't understand the option either, as they were thinking that when the option is not set, the decompressor should produce no output what so ever. This is incorrect, as the purpose of this option is to control whether a multiplatform kernel uses the kernel debugging macros to produce output or not. So let's document this via help rather than commentry to prevent others falling into this misunderstanding. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-21Linux 3.11-rc2Linus Torvalds
2013-07-21Merge tag 'acpi-video-3.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI video support fixes from Rafael Wysocki: "I'm sending a separate pull request for this as it may be somewhat controversial. The breakage addressed here is not really new and the fixes may not satisfy all users of the affected systems, but we've had so much back and forth dance in this area over the last several weeks that I think it's time to actually make some progress. The source of the problem is that about a year ago we started to tell BIOSes that we're compatible with Windows 8, which we really need to do, because some systems shipping with Windows 8 are tested with it and nothing else, so if we tell their BIOSes that we aren't compatible with Windows 8, we expose our users to untested BIOS/AML code paths. However, as it turns out, some Windows 8-specific AML code paths are not tested either, because Windows 8 actually doesn't use the ACPI methods containing them, so if we declare Windows 8 compatibility and attempt to use those ACPI methods, things break. That occurs mostly in the backlight support area where in particular the _BCM and _BQC methods are plain unusable on some systems if the OS declares Windows 8 compatibility. [ The additional twist is that they actually become usable if the OS says it is not compatible with Windows 8, but that may cause problems to show up elsewhere ] Investigation carried out by Matthew Garrett indicates that what Windows 8 does about backlight is to leave backlight control up to individual graphics drivers. At least there's evidence that it does that if the Intel graphics driver is used, so we've decided to follow Windows 8 in that respect and allow i915 to control backlight (Daniel likes that part). The first commit from Aaron Lu makes ACPICA export the variable from which we can infer whether or not the BIOS believes that we are compatible with Windows 8. The second commit from Matthew Garrett prepares the ACPI video driver by making it initialize the ACPI backlight even if it is not going to be used afterward (that is needed for backlight control to work on Thinkpads). The third commit implements the actual workaround making i915 take over backlight control if the firmware thinks it's dealing with Windows 8 and is based on the work of multiple developers, including Matthew Garrett, Chun-Yi Lee, Seth Forshee, and Aaron Lu. The final commit from Aaron Lu makes us follow Windows 8 by informing the firmware through the _DOS method that it should not carry out automatic brightness changes, so that brightness can be controlled by GUI. Hopefully, this approach will allow us to avoid using blacklists of systems that should not declare Windows 8 compatibility just to avoid backlight control problems in the future. - Change from Aaron Lu makes ACPICA export a variable which can be used by driver code to determine whether or not the BIOS believes that we are compatible with Windows 8. - Change from Matthew Garrett makes the ACPI video driver initialize the ACPI backlight even if it is not going to be used afterward (that is needed for backlight control to work on Thinkpads). - Fix from Rafael J Wysocki implements Windows 8 backlight support workaround making i915 take over bakclight control if the firmware thinks it's dealing with Windows 8. Based on the work of multiple developers including Matthew Garrett, Chun-Yi Lee, Seth Forshee, and Aaron Lu. - Fix from Aaron Lu makes the kernel follow Windows 8 by informing the firmware through the _DOS method that it should not carry out automatic brightness changes, so that brightness can be controlled by GUI" * tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / video: no automatic brightness changes by win8-compatible firmware ACPI / video / i915: No ACPI backlight if firmware expects Windows 8 ACPI / video: Always call acpi_video_init_brightness() on init ACPICA: expose OSI version
2013-07-20Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext[34] tmpfile bugfix from Ted Ts'o: "Fix regression caused by commit af51a2ac36d1f which added ->tmpfile() support (along with a similar fix for ext3)" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext3: fix a BUG when opening a file with O_TMPFILE flag ext4: fix a BUG when opening a file with O_TMPFILE flag
2013-07-20ext3: fix a BUG when opening a file with O_TMPFILE flagZheng Liu
When we try to open a file with O_TMPFILE flag, we will trigger a bug. The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and this check always fails because we set ->i_nlink = 1 in inode_init_always(). We can use the following program to trigger it: int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_TMPFILE, 0666); if (fd < 0) { perror("open "); return -1; } close(fd); return 0; } The oops message looks like this: kernel: kernel BUG at fs/ext3/namei.c:1992! kernel: invalid opcode: 0000 [#1] SMP kernel: Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand ipv6 dm_mirror dm_region_hash dm_log dm_mod parport_pc parport serio_raw sg dcdbas pcspkr i2c_i801 ehci_pci ehci_hcd button acpi_cpufreq mperf e1000e ptp pps_core ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core ext3 jbd sd_mod ahci libahci libata scsi_mod uhci_hcd kernel: CPU: 0 PID: 2882 Comm: tst_tmpfile Not tainted 3.11.0-rc1+ #4 kernel: Hardware name: Dell Inc. OptiPlex 780 /0V4W66, BIOS A05 08/11/2010 kernel: task: ffff880112d30050 ti: ffff8801124d4000 task.ti: ffff8801124d4000 kernel: RIP: 0010:[<ffffffffa00db5ae>] [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3] kernel: RSP: 0018:ffff8801124d5cc8 EFLAGS: 00010202 kernel: RAX: 0000000000000000 RBX: ffff880111510128 RCX: ffff8801114683a0 kernel: RDX: 0000000000000000 RSI: ffff880111510128 RDI: ffff88010fcf65a8 kernel: RBP: ffff8801124d5d18 R08: 0080000000000000 R09: ffffffffa00d3b7f kernel: R10: ffff8801114683a0 R11: ffff8801032a2558 R12: 0000000000000000 kernel: R13: ffff88010fcf6800 R14: ffff8801032a2558 R15: ffff8801115100d8 kernel: FS: 00007f5d172b5700(0000) GS:ffff880117c00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b kernel: CR2: 00007f5d16df15d0 CR3: 0000000110b1d000 CR4: 00000000000407f0 kernel: Stack: kernel: 000000000000000c ffff8801048a7dc8 ffff8801114685a8 ffffffffa00b80d7 kernel: ffff8801124d5e38 ffff8801032a2558 ffff88010ce24d68 0000000000000000 kernel: ffff88011146b300 ffff8801124d5d44 ffff8801124d5d78 ffffffffa00db7e1 kernel: Call Trace: kernel: [<ffffffffa00b80d7>] ? journal_start+0x8c/0xbd [jbd] kernel: [<ffffffffa00db7e1>] ext3_tmpfile+0xb2/0x13b [ext3] kernel: [<ffffffff821076f8>] path_openat+0x11f/0x5e7 kernel: [<ffffffff821c86b4>] ? list_del+0x11/0x30 kernel: [<ffffffff82065fa2>] ? __dequeue_entity+0x33/0x38 kernel: [<ffffffff82107cd5>] do_filp_open+0x3f/0x8d kernel: [<ffffffff82112532>] ? __alloc_fd+0x50/0x102 kernel: [<ffffffff820f9296>] do_sys_open+0x13b/0x1cd kernel: [<ffffffff820f935c>] SyS_open+0x1e/0x20 kernel: [<ffffffff82398c02>] system_call_fastpath+0x16/0x1b kernel: Code: 39 c7 0f 85 67 01 00 00 0f b7 03 25 00 f0 00 00 3d 00 40 00 00 74 18 3d 00 80 00 00 74 11 3d 00 a0 00 00 74 0a 83 7b 48 00 74 04 <0f> 0b eb fe 49 8b 85 50 03 00 00 4c 89 f6 48 c7 c7 c0 99 0e a0 kernel: RIP [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3] kernel: RSP <ffff8801124d5cc8> Here we couldn't call clear_nlink() directly because in d_tmpfile() we will call inode_dec_link_count() to decrease ->i_nlink. So this commit tries to call d_tmpfile() before ext4_orphan_add() to fix this problem. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20ext4: fix a BUG when opening a file with O_TMPFILE flagZheng Liu
When we try to open a file with O_TMPFILE flag, we will trigger a bug. The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and this check always fails because we set ->i_nlink = 1 in inode_init_always(). We can use the following program to trigger it: int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_TMPFILE, 0666); if (fd < 0) { perror("open "); return -1; } close(fd); return 0; } The oops message looks like this: kernel BUG at fs/ext4/namei.c:2572! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dlci bridge stp hidp cmtp kernelcapi l2tp_ppp l2tp_netlink l2tp_core sctp libcrc32c rfcomm tun fuse nfnetli nk can_raw ipt_ULOG can_bcm x25 scsi_transport_iscsi ipx p8023 p8022 appletalk phonet psnap vmw_vsock_vmci_transport af_key vmw_vmci rose vsock atm can netrom ax25 af_rxrpc ir da pppoe pppox ppp_generic slhc bluetooth nfc rfkill rds caif_socket caif crc_ccitt af_802154 llc2 llc snd_hda_codec_realtek snd_hda_intel snd_hda_codec serio_raw snd_pcm pcsp kr edac_core snd_page_alloc snd_timer snd soundcore r8169 mii sr_mod cdrom pata_atiixp radeon backlight drm_kms_helper ttm CPU: 1 PID: 1812571 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #12 Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010 task: ffff88007dfe69a0 ti: ffff88010f7b6000 task.ti: ffff88010f7b6000 RIP: 0010:[<ffffffff8125ce69>] [<ffffffff8125ce69>] ext4_orphan_add+0x299/0x2b0 RSP: 0018:ffff88010f7b7cf8 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff8800966d3020 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88007dfe70b8 RDI: 0000000000000001 RBP: ffff88010f7b7d40 R08: ffff880126a3c4e0 R09: ffff88010f7b7ca0 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801271fd668 R13: ffff8800966d2f78 R14: ffff88011d7089f0 R15: ffff88007dfe69a0 FS: 00007f70441a3740(0000) GS:ffff88012a800000(0000) knlGS:00000000f77c96c0 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000002834000 CR3: 0000000107964000 CR4: 00000000000007e0 DR0: 0000000000780000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Stack: 0000000000002000 00000020810b6dde 0000000000000000 ffff88011d46db00 ffff8800966d3020 ffff88011d7089f0 ffff88009c7f4c10 ffff88010f7b7f2c ffff88007dfe69a0 ffff88010f7b7da8 ffffffff8125cfac ffff880100000004 Call Trace: [<ffffffff8125cfac>] ext4_tmpfile+0x12c/0x180 [<ffffffff811cba78>] path_openat+0x238/0x700 [<ffffffff8100afc4>] ? native_sched_clock+0x24/0x80 [<ffffffff811cc647>] do_filp_open+0x47/0xa0 [<ffffffff811db73f>] ? __alloc_fd+0xaf/0x200 [<ffffffff811ba2e4>] do_sys_open+0x124/0x210 [<ffffffff81010725>] ? syscall_trace_enter+0x25/0x290 [<ffffffff811ba3ee>] SyS_open+0x1e/0x20 [<ffffffff816ca8d4>] tracesys+0xdd/0xe2 [<ffffffff81001001>] ? start_thread_common.constprop.6+0x1/0xa0 Code: 04 00 00 00 89 04 24 31 c0 e8 c4 77 04 00 e9 43 fe ff ff 66 25 00 d0 66 3d 00 80 0f 84 0e fe ff ff 83 7b 48 00 0f 84 04 fe ff ff <0f> 0b 49 8b 8c 24 50 07 00 00 e9 88 fe ff ff 0f 1f 84 00 00 00 Here we couldn't call clear_nlink() directly because in d_tmpfile() we will call inode_dec_link_count() to decrease ->i_nlink. So this commit tries to call d_tmpfile() before ext4_orphan_add() to fix this problem. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Tested-by: Darrick J. Wong <darrick.wong@oracle.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20Merge tag 'staging-3.11-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging tree fixes from Greg KH: "Here are a few iio driver fixes for 3.11-rc2. They are still spread across drivers/iio and drivers/staging/iio so they are coming in through this tree. I've also removed the drivers/staging/csr/ driver as the developers who originally sent it to me have moved on to other companies, and CSR still will not send us the specs for the device, making the driver pretty much obsolete and impossible to fix up. Deleting it now prevents people from sending in lots of tiny codingsyle fixes that will never go anywhere. It also helps to offset the large lustre filesystem merge that happened in 3.11-rc1 in the overall 3.11.0 diffstat. :)" * tag 'staging-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: csr: remove driver iio: lps331ap: Fix wrong in_pressure_scale output value iio staging: fix lis3l02dq, read error handling staging:iio:ad7291: add missing .driver_module to struct iio_info iio: ti_am335x_adc: add missing .driver_module to struct iio_info iio: mxs-lradc: Remove useless check in read_raw iio: mxs-lradc: Fix misuse of iio->trig iio: inkern: fix iio_convert_raw_to_processed_unlocked iio: Fix iio_channel_has_info iio:trigger: device_unregister->device_del to avoid double free iio: dac: ad7303: fix error return code in ad7303_probe()
2013-07-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "The sget() one is a long-standing bug and will need to go into -stable (in fact, it had been originally caught in RHEL6), the other two are 3.11-only" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: constify dentry parameter in d_count() livelock avoidance in sget() allow O_TMPFILE to work with O_WRONLY
2013-07-20Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "Fixes for 3.11-rc2, sent at 5pm, in the professoinal style. :-)" I'm not sure I like this new level of "professionalism". 9-5, people, 9-5. * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: call ext4_es_lru_add() after handling cache miss ext4: yield during large unlinks ext4: make the extent_status code more robust against ENOMEM failures ext4: simplify calculation of blocks to free on error ext4: fix error handling in ext4_ext_truncate()
2013-07-20Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: - Fix a regression against NFSv4 FreeBSD servers when creating a new file - Fix another regression in rpc_client_register() * tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix a regression against the FreeBSD server SUNRPC: Fix another issue with rpc_client_register()
2013-07-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next Pull btrfs fixes from Josef Bacik: "I'm playing the role of Chris Mason this week while he's on vacation. There are a few critical fixes for btrfs here, all regressions and have been tested well" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next: Btrfs: fix wrong write offset when replacing a device Btrfs: re-add root to dead root list if we stop dropping it Btrfs: fix lock leak when resuming snapshot deletion Btrfs: update drop progress before stopping snapshot dropping
2013-07-20vfs: constify dentry parameter in d_count()Peng Tao
so that it can be used in places like d_compare/d_hash without causing a compiler warning. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20livelock avoidance in sget()Al Viro
Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about to fail. The superblock is on ->fs_supers, ->s_umount is held exclusive, ->s_active is 1. Along comes two more processes, trying to mount the same thing; sget() in each is picking that superblock, bumping ->s_count and trying to grab ->s_umount. ->s_active is 3 now. Original mount(2) finally gets to deactivate_locked_super() on failure; ->s_active is 2, superblock is still ->fs_supers because shutdown will *not* happen until ->s_active hits 0. ->s_umount is dropped and now we have two processes chasing each other: s_active = 2, A acquired ->s_umount, B blocked A sees that the damn thing is stillborn, does deactivate_locked_super() s_active = 1, A drops ->s_umount, B gets it A restarts the search and finds the same superblock. And bumps it ->s_active. s_active = 2, B holds ->s_umount, A blocked on trying to get it ... and we are in the earlier situation with A and B switched places. The root cause, of course, is that ->s_active should not grow until we'd got MS_BORN. Then failing ->mount() will have deactivate_locked_super() shut the damn thing down. Fortunately, it's easy to do - the key point is that grab_super() is called only for superblocks currently on ->fs_supers, so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and bump ->s_active; we must never increment ->s_count for superblocks past ->kill_sb(), but grab_super() is never called for those. The bug is pretty old; we would've caught it by now, if not for accidental exclusion between sget() for block filesystems; the things like cgroup or e.g. mtd-based filesystems don't have anything of that sort, so they get bitten. The right way to deal with that is obviously to fix sget()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20allow O_TMPFILE to work with O_WRONLYAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-19Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/umlLinus Torvalds
Pull UML fixes from Richard Weinberger: "Special thanks goes to Toralf Föster for continuously testing UML and reporting issues!" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: remove dead code um: siginfo cleanup uml: Fix which_tmpdir failure when /dev/shm is a symlink, and in other edge cases um: Fix wait_stub_done() error handling um: Mark stub pages mapping with VM_PFNMAP um: Fix return value of strnlen_user()
2013-07-19Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds
Pull MIPS fixes from Ralf Baechle: "MIPS fixes for 3.11. Half of then is for Netlogic the remainder touches things across arch/mips. Nothing really dramatic and by rc1 standards MIPS will be in fairly good shape with this applied. Tested by building all MIPS defconfigs of which with this pull request four platforms won't build. And yes, it boots also on my favorite test systems" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: kvm: Kconfig: Drop HAVE_KVM dependency from VIRTUALIZATION MIPS: Octeon: Fix DT pruning bug with pip ports MIPS: KVM: Mark KVM_GUEST (T&E KVM) as BROKEN_ON_SMP MIPS: tlbex: fix broken build in v3.11-rc1 MIPS: Netlogic: Add XLP PIC irqdomain MIPS: Netlogic: Fix USB block's coherent DMA mask MIPS: tlbex: Fix typo in r3000 tlb store handler MIPS: BMIPS: Fix thinko to release slave TP from reset MIPS: Delete dead invocation of exception_exit().
2013-07-19Merge tag 'arm64-stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull arm64 fixes from Catalin Marinas: - Post -rc1 update to the common reboot infrastructure. - Fixes (user cache maintenance fault handling, !COMPAT compilation, CPU online and interrupt hanlding). * tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: arm64: use common reboot infrastructure arm64: mm: don't treat user cache maintenance faults as writes arm64: add '#ifdef CONFIG_COMPAT' for aarch32_break_handler() arm64: Only enable local interrupts after the CPU is marked online
2013-07-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "An update for the BFP jit to the latest and greatest, two patches to get kdump working again, the random-abort ptrace extention for transactional execution, the z90crypt module alias for ap and a tiny cleanup" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/zcrypt: Alias for new zcrypt device driver base module s390/kdump: Allow copy_oldmem_page() copy to virtual memory s390/kdump: Disable mmap for s390 s390/bpf,jit: add pkt_type support s390/bpf,jit: address randomize and write protect jit code s390/bpf,jit: use generic jit dumper s390/bpf,jit: call module_free() from any context s390/qdio: remove unused variable s390/ptrace: PTRACE_TE_ABORT_RAND
2013-07-19Btrfs: fix wrong write offset when replacing a deviceStefan Behrens
Miao Xie reported the following issue: The filesystem was corrupted after we did a device replace. Steps to reproduce: # mkfs.btrfs -f -m single -d raid10 <device0>..<device3> # mount <device0> <mnt> # btrfs replace start -rfB 1 <device4> <mnt> # umount <mnt> # btrfsck <device4> The reason for the issue is that we changed the write offset by mistake, introduced by commit 625f1c8dc. We read the data from the source device at first, and then write the data into the corresponding place of the new device. In order to implement the "-r" option, the source location is remapped using btrfs_map_block(). The read takes place on the mapped location, and the write needs to take place on the unmapped location. Currently the write is using the mapped location, and this commit changes it back by undoing the change to the write address that the aforementioned commit added by mistake. Reported-by: Miao Xie <miaox@cn.fujitsu.com> Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-19Btrfs: re-add root to dead root list if we stop dropping itJosef Bacik
If we stop dropping a root for whatever reason we need to add it back to the dead root list so that we will re-start the dropping next transaction commit. The other case this happens is if we recover a drop because we will add a root without adding it to the fs radix tree, so we can leak it's root and commit root extent buffer, adding this to the dead root list makes this cleanup happen. Thanks, Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-19Btrfs: fix lock leak when resuming snapshot deletionJosef Bacik
We aren't setting path->locks[level] when we resume a snapshot deletion which means we won't unlock the buffer when we free the path. This causes deadlocks if we happen to re-allocate the block before we've evicted the extent buffer from cache. Thanks, Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-19Btrfs: update drop progress before stopping snapshot droppingJosef Bacik
Alex pointed out a problem and fix that exists in the drop one snapshot at a time patch. If we decide we need to exit for whatever reason (umount for example) we will just exit the snapshot dropping without updating the drop progress. So the next time we go to resume we will BUG_ON() because we can't find the extent we left off at because we never updated it. This patch fixes the problem. Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-19Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fix from Paolo Bonzini: "This single patch fixes a regression caused by one of the optimizations introduced in 3.11, which is generally visible only on AMD processors" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: MMU: avoid fast page fault fixing mmio page fault
2013-07-19Merge tag 'pm+acpi-3.11-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "These are fixes collected over the last week, most importnatly two cpufreq reverts fixing regressions introduced in 3.10, an autoseelp fix preventing systems using it from crashing during shutdown and two ACPI scan fixes related to hotplug. Specifics: - Two cpufreq commits from the 3.10 cycle introduced regressions. The first of them was buggy (it did way much more than it needed to do) and the second one attempted to fix an issue introduced by the first one. Fixes from Srivatsa S Bhat revert both. - If autosleep triggers during system shutdown and the shutdown callbacks of some device drivers have been called already, it may crash the system. Fix from Liu Shuo prevents that from happening by making try_to_suspend() check system_state. - The ACPI memory hotplug driver doesn't clear its driver_data on errors which may cause a NULL poiter dereference to happen later. Fix from Toshi Kani. - The ACPI namespace scanning code should not try to attach scan handlers to device objects that have them already, which may confuse things quite a bit, and it should rescan the whole namespace branch starting at the given node after receiving a bus check notify event even if the device at that particular node has been discovered already. Fixes from Rafael J Wysocki. - New ACPI video blacklist entry for a system whose initial backlight setting from the BIOS doesn't make sense. From Lan Tianyu. - Garbage string output avoindance for ACPI PNP from Liu Shuo. - Two Kconfig fixes for issues introduced recently in the s3c24xx cpufreq driver (when moving the driver to drivers/cpufreq) from Paul Bolle. - Trivial comment fix in pm_wakeup.h from Chanwoo Choi" * tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / video: ignore BIOS initial backlight value for Fujitsu E753 PNP / ACPI: avoid garbage in resource name cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression cpufreq: s3c24xx: fix "depends on ARM_S3C24XX" in Kconfig cpufreq: s3c24xx: rename CONFIG_CPU_FREQ_S3C24XX_DEBUGFS PM / Sleep: Fix comment typo in pm_wakeup.h PM / Sleep: avoid 'autosleep' in shutdown progress cpufreq: Revert commit a66b2e to fix suspend/resume regression ACPI / memhotplug: Fix a stale pointer in error path ACPI / scan: Always call acpi_bus_scan() for bus check notifications ACPI / scan: Do not try to attach scan handlers to devices having them
2013-07-19arm64: use common reboot infrastructureMarc Zyngier
Commit 7b6d864b48d9 (reboot: arm: change reboot_mode to use enum reboot_mode) changed the way reboot is handled on arm, which has a direct impact on arm64 as we share the reset driver on the VE platform. The obvious fix is to move arm64 to use the same infrastructure. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> [catalin.marinas@arm.com: removed reboot_mode = REBOOT_HARD default setting] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>