linux.git - Linux kernel

Age	Commit message (Collapse)	Author
2006-09-26	[PATCH] i386: Allow a kernel not to be in ring 0	Rusty Russell
	We allow for the fact that the guest kernel may not run in ring 0. This requires some abstraction in a few places when setting %cs or checking privilege level (user vs kernel). This is Chris' [RFC PATCH 15/33] move segment checks to subarch, except rather than using #define USER_MODE_MASK which depends on a config option, we use Zach's more flexible approach of assuming ring 3 == userspace. I also used "get_kernel_rpl()" over "get_kernel_cs()" because I think it reads better in the code... 1) Remove the hardcoded 3 and introduce #define SEGMENT_RPL_MASK 3 2) Add a get_kernel_rpl() macro, and don't assume it's zero. And: Clean up of patch for letting kernel run other than ring 0: a. Add some comments about the SEGMENT_IS_*_CODE() macros. b. Add a USER_RPL macro. (Code was comparing a value to a mask in some places and to the magic number 3 in other places.) c. Add macros for table indicator field and use them. d. Change the entry.S tests for LDT stack segment to use the macros Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: Abstract sensitive instructions	Rusty Russell
	Abstract sensitive instructions in assembler code, replacing them with macros (which currently are #defined to the native versions). We use long names: assembler is case-insensitive, so if something goes wrong and macros do not expand, it would assemble anyway. Resulting object files are exactly the same as before. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Fix a PDA warning uncovered by the new type checking	Andi Kleen
	Fix linux/arch/x86_64/kernel/process.c: In function __switch_to: linux/arch/x86_64/kernel/process.c:626: warning: assignment makes integer from pointer without a cast Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Fix a irqcount comment in entry.S	Andi Kleen
	Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Add the -fstack-protector option to the CFLAGS	Arjan van de Ven
	Add a feature check that checks that the gcc compiler has stack-protector support and has the bugfix for PR28281 to make this work in kernel mode. The easiest solution I could find was to have a shell script in scripts/ to do the detection; if needed we can make this fancier in the future without making the makefile too complex. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andi Kleen <ak@suse.de> CC: Andi Kleen <ak@suse.de> CC: Sam Ravnborg <sam@ravnborg.org>
2006-09-26	[PATCH] Add the canary field to the PDA area and the task struct	Arjan van de Ven
	This patch adds the per thread cookie field to the task struct and the PDA. Also it makes sure that the PDA value gets the new cookie value at context switch, and that a new task gets a new cookie at task creation time. Signed-off-by: Arjan van Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> CC: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Add the Kconfig option for the stackprotector feature	Arjan van de Ven
	This patch adds the config options for -fstack-protector. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> CC: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Don't use kernel_text_address in oops context	Andi Kleen
	Because it can take spinlocks. Suggested by Mathieu Desnoyers Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: Avoid overwriting the current pgd (V4, i386)	Magnus Damm
	kexec: Avoid overwriting the current pgd (V4, i386) This patch upgrades the i386-specific kexec code to avoid overwriting the current pgd. Overwriting the current pgd is bad when CONFIG_CRASH_DUMP is used to start a secondary kernel that dumps the memory of the previous kernel. The code introduces a new set of page tables. These tables are used to provide an executable identity mapping without overwriting the current pgd. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Avoid overwriting the current pgd (V4, x86_64)	Magnus Damm
	kexec: Avoid overwriting the current pgd (V4, x86_64) This patch upgrades the x86_64-specific kexec code to avoid overwriting the current pgd. Overwriting the current pgd is bad when CONFIG_CRASH_DUMP is used to start a secondary kernel that dumps the memory of the previous kernel. The code introduces a new set of page tables. These tables are used to provide an executable identity mapping without overwriting the current pgd. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Remove most of the special cases for the debug IST stack	Keith Owens
	Remove most of the special cases for the debug IST stack. This is a follow on clean up patch, it requires the bug fix patch that adds orig_ist. Signed-off-by: Keith Owens <kaos@ocs.com.au> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: Fix the EDD code misparsing the command line	H. Peter Anvin
	The EDD code would scan the command line as a fixed array, without taking account of either whitespace, null-termination, the old command-line protocol, late overrides early, or the fact that the command line may not be reachable from INITSEG. This should fix those problems, and enable us to use a longer command line. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Optimize PDA accesses slightly	Andi Kleen
	Based on a idea by Jeremy Fitzhardinge: Replace the volatiles and memory clobbers in the PDA access with telling gcc about access to a proxy PDA structure that doesn't actually exist. But the dummy accesses give a defined ordering for read/write accesses. Also add some memory barriers to the early GS initialization to make sure no PDA access is moved before it. Advantage is some .text savings (probably most from better code for accessing "current"): text data bss dec hex filename 4845647 1223688 615864 6685199 66020f vmlinux 4837780 1223688 615864 6677332 65e354 vmlinux-pda 1.2% smaller code Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Put .note.* sections into a PT_NOTE segment	Ian Campbell
	This patch updates x86_64 linker script to pack any .note.* sections into a PT_NOTE segment in the output file. To do this, we tell ld that we need a PT_NOTE segment. This requires us to start explicitly mapping sections to segments, so we also need to explicitly create PT_LOAD segments for text and data, and map the sections to them appropriately. Fortunately, each section will default to its previous section's segment, so it doesn't take many changes to vmlinux.lds.S. The corresponding change is already made for i386 in -mm and I'd like this patch to join it. The section to segment mappings do change as do the segment flags so some time in -mm would be good for that reason as well, just in case. In particular .data and .bss move from the text segment to the data segment and .data.cacheline_aligned .data.read_mostly are put in the data segment instead of a separate one. I think that it would be possible to exactly match the existing section to segment mapping and flags but it would be a more intrusive change and I'm not sure there is a reason for the existing layout other than it is what you get by default if you don't explicitly specify something else. If there is a reason for the existing layout then I will of course make the more intrusive change. If there is no reason we could probably drop the executable or writable flags from some segments but I don't know how much attention is paid to them anyway so it might not be worth the effort. The vsyscall related sections need to go in a different segment to the normal data segment and so I invented a "user" segment to contain them. I believe this should appear to be another data segment as far as the kernel is concerned so the flags are setup accordingly. The notes will be used in the Xen paravirt_ops backend to provide additional information to the domain builder. I am in the process of converting the xen-unstable kernels and tools over to this scheme at the moment to support this in the future. It has been suggested to me that the notes segment should have flags 0 (i.e. not readable) since it is only used by the loader and is not used at runtime. For now I went with a readable segment since that is what the i386 patch uses. AK: dropped NOTES addition right now because the needed infrastructure for that is not merged yet Signed-off-by: Ian Campbell <ian.campbell@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Reload CS when startup_64 is used.	Eric W. Biederman
	In long mode the %cs is largely a relic. However there are a few cases like iret where it matters that we have a valid value. Without this patch it is possible to enter the kernel in startup_64 without setting %cs to a valid value. With this patch we don't care what %cs value we enter the kernel with, so long as the cs shadow register indicates it is a privileged code segment. Thanks to Magnus Damm for finding this problem and posting the first workable patch. I have moved the jump to set %cs down a few instructions so we don't need to take an extra jump. Which keeps the code simpler. Signed-of-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Remove non e820 fallbacks in high level code	Andi Kleen
	Drop support for non e820 BIOS calls to get the memory map. The boot assembler code still has some support, but not the C code now. Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Fix boot code head.S warning	Paolo 'Blaisorblade' Giarrusso
	When compiling a 64-bit kernel on an Ubuntu 6.06 32bit system (whose GCC is also a cross-compiler for x86_64) I've seen that head.o is compiled as a 64-bit file (while it should not) and ld complaining about this during linking: [AK: it happens on all systems with new binutils] ld: warning: i386:x86-64 architecture of input file `arch/x86_64/boot/compressed/head.o' is incompatible with i386 output I've verified that removing -m64 from compilation flags to turn "-m64 -traditional -m32" into "-traditional -m32" fixes the issue. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Add a missing check for irq flags tracing in NMI	Andi Kleen
	NMIs are not supposed to track the irq flags, but TRACE_IRQS_IRETQ did it anyways. Add a check. Cc: mingo@elte.hu Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Fix coding style and output of the mptable parser	Andi Kleen
	Give the printks a consistent prefix. Add some missing white space. Cc: len.brown@intel.com Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Remove some cruft in apic id checking during processor setup	Andi Kleen
	- Remove a define that was used only once - Remove the too large APIC ID check because we always support the full 8bit range of APICs. - Restructure code a bit to be simpler. Cc: len.brown@intel.com Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Remove APIC version/cpu capability mpparse checking/printing	Andi Kleen
	ACPI went to great trouble to get the APIC version and CPU capabilities of different CPUs before passing them to the mpparser. But all that data was used was to print it out. Actually it even faked some data based on the boot cpu, not on the actual CPU being booted. Remove all this code because it's not needed. Cc: len.brown@intel.com Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Use proper accessors to change PSE bits in change_page_attr()	Andi Kleen
	Use normal pte accessors in change_page_attr() to access the PSE bits. Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Fix pte_exec/mkexec and use it in change_page_attr()	Andi Kleen
	Fix the pte_exec/mkexec page table accessor functions to really use the NX bit. Previously they only checked the USER bit, but weren't actually used for anything. Then use them in change_page_attr() to manipulate the NX bit properly. Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Remove bogus warning from early_ioremap	Andi Kleen
	It is correct for its only caller right now, but not for possible future others. Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Remove safe_smp_processor_id()	Andi Kleen
	And replace all users with ordinary smp_processor_id. The function was originally added to get some basic oops information out even if the GS register was corrupted. However that didn't work for some anymore because printk is needed to print the oops and it uses smp_processor_id() already. Also GS register corruptions are not particularly common anymore. This also helps the Xen port which would otherwise need to do this in a special way because it can't access the local APIC. Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Detect clock skew during suspend	Rafael J. Wysocki
	Detect the situations in which the time after a resume from disk would be earlier than the time before the suspend and prevent them from happening on x86_64. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: annotate FIX_STACK() and the rest of nmi()	Chuck Ebbert
	In i386's entry.S, FIX_STACK() needs annotation because it replaces the stack pointer. And the rest of nmi() needs annotation in order to compile with these new annotations. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] make numa_emulation() __init	Andrew Morton
	Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Don't force reserve the 640k-1MB range	Andi Kleen
	From i386 x86-64 inherited code to force reserve the 640k-1MB area. That was needed on some old systems. But we generally trust the e820 map to be correct on 64bit systems and mark all areas that are not memory correctly. This patch will allow to use the real memory in there. Or rather the only way to find out if it's still needed is to try. So far I'm optimistic. Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: Disallow kprobes on NMI handlers	Fernando Luis V�zquez Cao
	A kprobe executes IRET early and that could cause NMI recursion and stack corruption. Note: This problem was originally spotted and solved by Andi Kleen in the x86_64 architecture. This patch is an adaption of his patch for i386. AK: Merged with current code which was a bit different. AK: Removed printk in nmi handler that shouldn't be there in the first time AK: Added missing include. AK: added KPROBES_END Signed-off-by: Fernando Vazquez <fernando@intellilink.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: Disallow kprobes on NMI handlers	Fernando Luis V�zquez Cao
	A kprobe executes IRET early and that could cause NMI recursion and stack corruption. Note: This problem was originally spotted by Andi Kleen. This patch adds fixes not included in his original patch. [AK: Jan Beulich originally discovered these classes of bugs] Signed-off-by: Fernando Vazquez <fernando@intellilink.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: mark cpu cache functions as __cpuinit	Magnus Damm
	Mark i386-specific cpu cache functions as __cpuinit. They are all only called from arch/i386/common.c:display_cache_info() that already is marked as __cpuinit. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: mark cpu identify functions as __cpuinit	Magnus Damm
	Mark i386-specific cpu identification functions as __cpuinit. They are all only called from arch/i386/common.c:identify_cpu() that already is marked as __cpuinit. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: mark cpu init functions as __cpuinit, data as __cpuinitdata	Magnus Damm
	Mark i386-specific cpu init functions as __cpuinit. They are all only called from arch/i386/common.c:identify_cpu() that already is marked as __cpuinit. This patch also removes the empty function init_umc(). Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: mark cpu_dev structures as __cpuinitdata	Magnus Damm
	The different cpu_dev structures are all used from __cpuinit callers what I can tell. So mark them as __cpuinitdata instead of __initdata. I am a little bit unsure about arch/i386/common.c:default_cpu, especially when it comes to the purpose of this_cpu. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] mark init_amd() as __cpuinit	Magnus Damm
	The init_amd() function is only called from identify_cpu() which is already marked as __cpuinit. So let's mark it as __cpuinit. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: remove redundant generic_identify() calls when identifying cpus	Magnus Damm
	cpu_dev->c_identify is only called from arch/i386/common.c:identify_cpu(), and this after generic_identify() already has been called. There is no need to call this function twice and hook it in c_identify - but I may be wrong, please double check before applying. This patch also removes generic_identify() from cpu.h to avoid unnecessary future nesting. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] x86_64 kernel mapping fix	Keith Mannthey
	Fix for the x86_64 kernel mapping code. Without this patch the update path only inits one pmd_page worth of memory and tramples any entries on it. now the calling convention to phys_pmd_init and phys_init is to always pass a [pmd/pud] page not an offset within a page. Signed-off-by: Keith Mannthey<kmannth@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26	[PATCH] wire up oops_enter()/oops_exit()	Andrew Morton
	Implement pause_on_oops() on x86_64. AK: I redid the patch to do the oops_enter/exit in the existing oops_begin()/end(). This makes it much shorter. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] non lazy "sleazy" fpu implementation	Arjan van de Ven
	Right now the kernel on x86-64 has a 100% lazy fpu behavior: after every context switch a trap is taken for the first FPU use to restore the FPU context lazily. This is of course great for applications that have very sporadic or no FPU use (since then you avoid doing the expensive save/restore all the time). However for very frequent FPU users... you take an extra trap every context switch. The patch below adds a simple heuristic to this code: After 5 consecutive context switches of FPU use, the lazy behavior is disabled and the context gets restored every context switch. If the app indeed uses the FPU, the trap is avoided. (the chance of the 6th time slice using FPU after the previous 5 having done so are quite high obviously). After 256 switches, this is reset and lazy behavior is returned (until there are 5 consecutive ones again). The reason for this is to give apps that do longer bursts of FPU use still the lazy behavior back after some time. [akpm@osdl.org: place new task_struct field next to jit_keyring to save space] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26	[PATCH] i386: Support physical cpu hotplug for x86_64	Ashok Raj
	This patch enables ACPI based physical CPU hotplug support for x86_64. Implements acpi_map_lsapic() and acpi_unmap_lsapic() to support physical cpu hotplug. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@muc.de> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26	[PATCH] fix bus numbering format in mmconfig warning	Brice Goglin
	Make an mmconfig warning print the bus id with a regular format. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26	[PATCH] i386: mark two more functions as __init	Magnus Damm
	cyrix_identify() should be __init because transmeta_identify() is. tsc_init() is only called from setup_arch() which is marked as __init. These two section mismatches have been detected using running modpost on a vmlinux image compiled with CONFIG_RELOCATABLE=y. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: clean up topology.c	Magnus Damm
	There is no need to duplicate the topology_init() function. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] Auto size the per cpu area.	Eric W. Biederman
	Now for a completely different but trivial approach. I just boot tested it with 255 CPUS and everything worked. Currently everything (except module data) we place in the per cpu area we know about at compile time. So instead of allocating a fixed size for the per_cpu area allocate the number of bytes we need plus a fixed constant for to be used for modules. It isn't perfect but it is much less of a pain to work with than what we are doing now. AK: fixed warning Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: Descriptor and trap table cleanups.	Rusty Russell
	The implementation comes from Zach's [RFC, PATCH 10/24] i386 Vmi descriptor changes: Descriptor and trap table cleanups. Add cleanly written accessors for IDT and GDT gates so the subarch may override them. Note that this allows the hypervisor to transparently tweak the DPL of the descriptors as well as the RPL of segments in those descriptors, with no unnecessary kernel code modification. It also allows the hypervisor implementation of the VMI to tweak the gates, allowing for custom exception frames or extra layers of indirection above the guest fault / IRQ handlers. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: move kernel_thread_helper into entry.S	Andi Kleen
	And add proper CFI annotation to it which was previously impossible. This prevents "stuck" messages by the dwarf2 unwinder when reaching the top of a kernel stack. Includes feedback from Jan Beulich Cc: jbeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: Make enable_local_apic static	Adrian Bunk
	enable_local_apic can now become static. Cc: len.brown@intel.com Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: Make acpi_force static	Adrian Bunk
	acpi_force can become static. Cc: len.brown@intel.com Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26	[PATCH] i386: make fault notifier unconditional and export it	Andi Kleen
	It's needed for external debuggers and overhead is very small. Also make the actual notifier chain they use static Cc: jbeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de>