diff options
author | Linus Torvalds | 2015-04-18 11:10:49 -0400 |
---|---|---|
committer | Linus Torvalds | 2015-04-18 11:10:49 -0400 |
commit | d6a24d0640d609138a4e40a4ce9fd9fe7859e24c (patch) | |
tree | aa40fb7f5d03dc605634fb4030afd8798bd6bc0b /Documentation | |
parent | 1f5014d6a77513fa7cefe30eb7791d5856c04384 (diff) | |
parent | 197175427a221fe3200f7727ea35e261727e7228 (diff) |
Merge tag 'docs-for-linus' of git://git.lwn.net/linux-2.6
Pull documentation updates from Jonathan Corbet:
"Numerous fixes, the overdue removal of the i2o docs, some new Chinese
translations, and, hopefully, the README fix that will end the flow of
identical patches to that file"
* tag 'docs-for-linus' of git://git.lwn.net/linux-2.6: (34 commits)
Documentation/memcg: update memcg/kmem status
Documentation: blackfin: Makefile: Typo building issue
Documentation/vm/pagemap.txt: correct location of page-types tool
Documentation/memory-barriers.txt: typo fix
doc: Add guest_nice column to example output of `cat /proc/stat'
Documentation/kernel-parameters: Move "eagerfpu" to its right place
Documentation: gpio: Update ACPI part of the document to mention _DSD
docs/completion.txt: Various tweaks and corrections
doc: completion: context, scope and language fixes
Documentation:Update Documentation/zh_CN/arm64/memory.txt
Documentation:Update Documentation/zh_CN/arm64/booting.txt
Documentation: Chinese translation of arm64/legacy_instructions.txt
DocBook media: fix broken EIA hyperlink
Documentation: tweak the maintainers entry
README: Change gzip/bzip2 to xz compression format
README: Update version number reference
doc:pci: Fix typo in Documentation/PCI
Documentation: drm: Use '->' when describing access through pointers.
Documentation: Remove mentioning of block barriers
Documentation/email-clients.txt: Fix one grammar mistake, add extra info about TB
...
Diffstat (limited to 'Documentation')
36 files changed, 467 insertions, 743 deletions
diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle index 4d4f06d47e06..f4b78eafd92a 100644 --- a/Documentation/CodingStyle +++ b/Documentation/CodingStyle @@ -13,7 +13,7 @@ and NOT read it. Burn them, it's a great symbolic gesture. Anyway, here goes: - Chapter 1: Indentation + Chapter 1: Indentation Tabs are 8 characters, and thus indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) @@ -56,7 +56,6 @@ instead of "double-indenting" the "case" labels. E.g.: break; } - Don't put multiple statements on a single line unless you have something to hide: @@ -156,25 +155,25 @@ comments on. Do not unnecessarily use braces where a single statement will do. -if (condition) - action(); + if (condition) + action(); and -if (condition) - do_this(); -else - do_that(); + if (condition) + do_this(); + else + do_that(); This does not apply if only one branch of a conditional statement is a single statement; in the latter case use braces in both branches: -if (condition) { - do_this(); - do_that(); -} else { - otherwise(); -} + if (condition) { + do_this(); + do_that(); + } else { + otherwise(); + } 3.1: Spaces @@ -186,8 +185,11 @@ although they are not required in the language, as in: "sizeof info" after "struct fileinfo info;" is declared). So use a space after these keywords: + if, switch, case, for, do, while + but not with sizeof, typeof, alignof, or __attribute__. E.g., + s = sizeof(struct file); Do not add spaces around (inside) parenthesized expressions. This example is @@ -209,12 +211,15 @@ such as any of these: = + - < > * / % | & ^ <= >= == != ? : but no space after unary operators: + & * + - ~ ! sizeof typeof alignof __attribute__ defined no space before the postfix increment & decrement unary operators: + ++ -- no space after the prefix increment & decrement unary operators: + ++ -- and no space around the '.' and "->" structure member operators. @@ -268,13 +273,11 @@ See chapter 6 (Functions). Chapter 5: Typedefs Please don't use things like "vps_t". - It's a _mistake_ to use typedef for structures and pointers. When you see a vps_t a; in the source, what does it mean? - In contrast, if it says struct virtual_container *a; @@ -372,11 +375,11 @@ In source files, separate functions with one blank line. If the function is exported, the EXPORT* macro for it should follow immediately after the closing function brace line. E.g.: -int system_is_up(void) -{ - return system_state == SYSTEM_RUNNING; -} -EXPORT_SYMBOL(system_is_up); + int system_is_up(void) + { + return system_state == SYSTEM_RUNNING; + } + EXPORT_SYMBOL(system_is_up); In function prototypes, include parameter names with their data types. Although this is not required by the C language, it is preferred in Linux @@ -405,34 +408,34 @@ The rationale for using gotos is: modifications are prevented - saves the compiler work to optimize redundant code away ;) -int fun(int a) -{ - int result = 0; - char *buffer; - - buffer = kmalloc(SIZE, GFP_KERNEL); - if (!buffer) - return -ENOMEM; - - if (condition1) { - while (loop1) { - ... + int fun(int a) + { + int result = 0; + char *buffer; + + buffer = kmalloc(SIZE, GFP_KERNEL); + if (!buffer) + return -ENOMEM; + + if (condition1) { + while (loop1) { + ... + } + result = 1; + goto out_buffer; } - result = 1; - goto out_buffer; + ... + out_buffer: + kfree(buffer); + return result; } - ... -out_buffer: - kfree(buffer); - return result; -} A common type of bug to be aware of it "one err bugs" which look like this: -err: - kfree(foo->bar); - kfree(foo); - return ret; + err: + kfree(foo->bar); + kfree(foo); + return ret; The bug in this code is that on some exit paths "foo" is NULL. Normally the fix for this is to split it up into two error labels "err_bar:" and "err_foo:". @@ -503,9 +506,9 @@ values. To do the latter, you can stick the following in your .emacs file: (defun c-lineup-arglist-tabs-only (ignored) "Line up argument lists by tabs, not spaces" (let* ((anchor (c-langelem-pos c-syntactic-element)) - (column (c-langelem-2nd-pos c-syntactic-element)) - (offset (- (1+ column) anchor)) - (steps (floor offset c-basic-offset))) + (column (c-langelem-2nd-pos c-syntactic-element)) + (offset (- (1+ column) anchor)) + (steps (floor offset c-basic-offset))) (* (max steps 1) c-basic-offset))) @@ -612,7 +615,7 @@ have a reference count on it, you almost certainly have a bug. Names of macros defining constants and labels in enums are capitalized. -#define CONSTANT 0x12345 + #define CONSTANT 0x12345 Enums are preferred when defining several related constants. @@ -623,28 +626,28 @@ Generally, inline functions are preferable to macros resembling functions. Macros with multiple statements should be enclosed in a do - while block: -#define macrofun(a, b, c) \ - do { \ - if (a == 5) \ - do_this(b, c); \ - } while (0) + #define macrofun(a, b, c) \ + do { \ + if (a == 5) \ + do_this(b, c); \ + } while (0) Things to avoid when using macros: 1) macros that affect control flow: -#define FOO(x) \ - do { \ - if (blah(x) < 0) \ - return -EBUGGERED; \ - } while(0) + #define FOO(x) \ + do { \ + if (blah(x) < 0) \ + return -EBUGGERED; \ + } while(0) is a _very_ bad idea. It looks like a function call but exits the "calling" function; don't break the internal parsers of those who will read the code. 2) macros that depend on having a local variable with a magic name: -#define FOO(val) bar(index, val) + #define FOO(val) bar(index, val) might look like a good thing, but it's confusing as hell when one reads the code and it's prone to breakage from seemingly innocent changes. @@ -656,8 +659,8 @@ bite you if somebody e.g. turns FOO into an inline function. must enclose the expression in parentheses. Beware of similar issues with macros using parameters. -#define CONSTANT 0x4000 -#define CONSTEXP (CONSTANT | 3) + #define CONSTANT 0x4000 + #define CONSTEXP (CONSTANT | 3) 5) namespace collisions when defining local variables in macros resembling functions: @@ -809,11 +812,11 @@ you should use, rather than explicitly coding some variant of them yourself. For example, if you need to calculate the length of an array, take advantage of the macro - #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) Similarly, if you need to calculate the size of some structure member, use - #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f)) + #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f)) There are also min() and max() macros that do strict type checking if you need them. Feel free to peruse that header file to see what else is already @@ -826,19 +829,19 @@ Some editors can interpret configuration information embedded in source files, indicated with special markers. For example, emacs interprets lines marked like this: --*- mode: c -*- + -*- mode: c -*- Or like this: -/* -Local Variables: -compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c" -End: -*/ + /* + Local Variables: + compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c" + End: + */ Vim interprets markers that look like this: -/* vim:set sw=8 noet */ + /* vim:set sw=8 noet */ Do not include any of these in source files. People have their own personal editor configurations, and your source files should not override them. This @@ -915,9 +918,9 @@ At the end of any non-trivial #if or #ifdef block (more than a few lines), place a comment after the #endif on the same line, noting the conditional expression used. For instance: -#ifdef CONFIG_SOMETHING -... -#endif /* CONFIG_SOMETHING */ + #ifdef CONFIG_SOMETHING + ... + #endif /* CONFIG_SOMETHING */ Appendix I: References diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl index 03f1985a4bd1..0cad3ce957ff 100644 --- a/Documentation/DocBook/drm.tmpl +++ b/Documentation/DocBook/drm.tmpl @@ -1293,7 +1293,7 @@ int max_width, max_height;</synopsis> </para> <para> If a page flip can be successfully scheduled the driver must set the - <code>drm_crtc-<fb</code> field to the new framebuffer pointed to + <code>drm_crtc->fb</code> field to the new framebuffer pointed to by <code>fb</code>. This is important so that the reference counting on framebuffers stays balanced. </para> diff --git a/Documentation/DocBook/media/v4l/biblio.xml b/Documentation/DocBook/media/v4l/biblio.xml index 7ff01a23c2fe..fdee6b3f3eca 100644 --- a/Documentation/DocBook/media/v4l/biblio.xml +++ b/Documentation/DocBook/media/v4l/biblio.xml @@ -1,14 +1,13 @@ <bibliography> <title>References</title> - <biblioentry id="eia608"> - <abbrev>EIA 608-B</abbrev> + <biblioentry id="cea608"> + <abbrev>CEA 608-E</abbrev> <authorgroup> - <corpauthor>Electronic Industries Alliance (<ulink -url="http://www.eia.org">http://www.eia.org</ulink>)</corpauthor> + <corpauthor>Consumer Electronics Association (<ulink +url="http://www.ce.org">http://www.ce.org</ulink>)</corpauthor> </authorgroup> - <title>EIA 608-B "Recommended Practice for Line 21 Data -Service"</title> + <title>CEA-608-E R-2014 "Line 21 Data Services"</title> </biblioentry> <biblioentry id="en300294"> diff --git a/Documentation/DocBook/media/v4l/dev-sliced-vbi.xml b/Documentation/DocBook/media/v4l/dev-sliced-vbi.xml index 7a8bf3011ee9..0aec62ed2bf8 100644 --- a/Documentation/DocBook/media/v4l/dev-sliced-vbi.xml +++ b/Documentation/DocBook/media/v4l/dev-sliced-vbi.xml @@ -254,7 +254,7 @@ ETS 300 231, lsb first transmitted.</entry> <row> <entry><constant>V4L2_SLICED_CAPTION_525</constant></entry> <entry>0x1000</entry> - <entry><xref linkend="eia608" /></entry> + <entry><xref linkend="cea608" /></entry> <entry>NTSC line 21, 284 (second field 21)</entry> <entry>Two bytes in transmission order, including parity bit, lsb first transmitted.</entry> diff --git a/Documentation/DocBook/media/v4l/vidioc-g-sliced-vbi-cap.xml b/Documentation/DocBook/media/v4l/vidioc-g-sliced-vbi-cap.xml index bd015d1563ff..d05623c55403 100644 --- a/Documentation/DocBook/media/v4l/vidioc-g-sliced-vbi-cap.xml +++ b/Documentation/DocBook/media/v4l/vidioc-g-sliced-vbi-cap.xml @@ -205,7 +205,7 @@ ETS 300 231, lsb first transmitted.</entry> <row> <entry><constant>V4L2_SLICED_CAPTION_525</constant></entry> <entry>0x1000</entry> - <entry><xref linkend="eia608" /></entry> + <entry><xref linkend="cea608" /></entry> <entry>NTSC line 21, 284 (second field 21)</entry> <entry>Two bytes in transmission order, including parity bit, lsb first transmitted.</entry> diff --git a/Documentation/PCI/MSI-HOWTO.txt b/Documentation/PCI/MSI-HOWTO.txt index 0d920d54536d..1179850f453c 100644 --- a/Documentation/PCI/MSI-HOWTO.txt +++ b/Documentation/PCI/MSI-HOWTO.txt @@ -353,7 +353,7 @@ retry: rc = pci_enable_msix_range(adapter->pdev, adapter->msix_entries, maxvec, maxvec); /* - * -ENOSPC is the only error code allowed to be analized + * -ENOSPC is the only error code allowed to be analyzed */ if (rc == -ENOSPC) { if (maxvec == 1) @@ -370,7 +370,7 @@ retry: return rc; } -Note how pci_enable_msix_range() return value is analized for a fallback - +Note how pci_enable_msix_range() return value is analyzed for a fallback - any error code other than -ENOSPC indicates a fatal error and should not be retried. @@ -486,7 +486,7 @@ during development. If your device supports both MSI-X and MSI capabilities, you should use the MSI-X facilities in preference to the MSI facilities. As mentioned above, MSI-X supports any number of interrupts between 1 and 2048. -In constrast, MSI is restricted to a maximum of 32 interrupts (and +In contrast, MSI is restricted to a maximum of 32 interrupts (and must be a power of two). In addition, the MSI interrupt vectors must be allocated consecutively, so the system might not be able to allocate as many vectors for MSI as it could for MSI-X. On some platforms, MSI @@ -501,18 +501,9 @@ necessary to disable interrupts (Linux guarantees the same interrupt will not be re-entered). If a device uses multiple interrupts, the driver must disable interrupts while the lock is held. If the device sends a different interrupt, the driver will deadlock trying to recursively -acquire the spinlock. - -There are two solutions. The first is to take the lock with -spin_lock_irqsave() or spin_lock_irq() (see -Documentation/DocBook/kernel-locking). The second is to specify -IRQF_DISABLED to request_irq() so that the kernel runs the entire -interrupt routine with interrupts disabled. - -If your MSI interrupt routine does not hold the lock for the whole time -it is running, the first solution may be best. The second solution is -normally preferred as it avoids making two transitions from interrupt -disabled to enabled and back again. +acquire the spinlock. Such deadlocks can be avoided by using +spin_lock_irqsave() or spin_lock_irq() which disable local interrupts +and acquire the lock (see Documentation/DocBook/kernel-locking). 4.6 How to tell whether MSI/MSI-X is enabled on a device diff --git a/Documentation/PCI/pci-error-recovery.txt b/Documentation/PCI/pci-error-recovery.txt index 898ded24510d..ac26869c7db4 100644 --- a/Documentation/PCI/pci-error-recovery.txt +++ b/Documentation/PCI/pci-error-recovery.txt @@ -256,7 +256,7 @@ STEP 4: Slot Reset ------------------ In response to a return value of PCI_ERS_RESULT_NEED_RESET, the -the platform will peform a slot reset on the requesting PCI device(s). +the platform will perform a slot reset on the requesting PCI device(s). The actual steps taken by a platform to perform a slot reset will be platform-dependent. Upon completion of slot reset, the platform will call the device slot_reset() callback. diff --git a/Documentation/PCI/pcieaer-howto.txt b/Documentation/PCI/pcieaer-howto.txt index 26d3d945c3c2..b4987c0bcb20 100644 --- a/Documentation/PCI/pcieaer-howto.txt +++ b/Documentation/PCI/pcieaer-howto.txt @@ -66,8 +66,8 @@ hardware (mostly chipsets) has root ports that cannot obtain the reporting source ID. nosourceid=n by default. 2.3 AER error output -When a PCI-E AER error is captured, an error message will be outputed to -console. If it's a correctable error, it is outputed as a warning. +When a PCI-E AER error is captured, an error message will be outputted to +console. If it's a correctable error, it is outputted as a warning. Otherwise, it is printed as an error. So users could choose different log level to filter out correctable error messages. diff --git a/Documentation/arm/Booting b/Documentation/arm/Booting index 371814a36719..83c1df2fc758 100644 --- a/Documentation/arm/Booting +++ b/Documentation/arm/Booting @@ -58,13 +58,18 @@ serial format options as described in -------------------------- Existing boot loaders: OPTIONAL -New boot loaders: MANDATORY +New boot loaders: MANDATORY except for DT-only platforms The boot loader should detect the machine type its running on by some method. Whether this is a hard coded value or some algorithm that looks at the connected hardware is beyond the scope of this document. The boot loader must ultimately be able to provide a MACH_TYPE_xxx -value to the kernel. (see linux/arch/arm/tools/mach-types). +value to the kernel. (see linux/arch/arm/tools/mach-types). This +should be passed to the kernel in register r1. + +For DT-only platforms, the machine type will be determined by device +tree. set the machine type to all ones (~0). This is not strictly +necessary, but assures that it will not match any existing types. 4. Setup boot data ------------------ diff --git a/Documentation/arm/README b/Documentation/arm/README index aea34095cdcf..9d1e5b2c92e6 100644 --- a/Documentation/arm/README +++ b/Documentation/arm/README @@ -185,13 +185,20 @@ Kernel entry (head.S) board devices are used, or the device is setup, and provides that machine specific "personality." - This fine-grained machine specific selection is controlled by the machine - type ID, which acts both as a run-time and a compile-time code selection - method. + For platforms that support device tree (DT), the machine selection is + controlled at runtime by passing the device tree blob to the kernel. At + compile-time, support for the machine type must be selected. This allows for + a single multiplatform kernel build to be used for several machine types. - You can register a new machine via the web site at: + For platforms that do not use device tree, this machine selection is + controlled by the machine type ID, which acts both as a run-time and a + compile-time code selection method. You can register a new machine via the + web site at: <http://www.arm.linux.org.uk/developer/machines/> + Note: Please do not register a machine type for DT-only platforms. If your + platform is DT-only, you do not need a registered machine type. + --- Russell King (15/03/2004) diff --git a/Documentation/blackfin/Makefile b/Documentation/blackfin/Makefile index 03f78059d6f5..6782c58fbc29 100644 --- a/Documentation/blackfin/Makefile +++ b/Documentation/blackfin/Makefile @@ -1,5 +1,5 @@ ifneq ($(CONFIG_BLACKFIN),) -ifneq ($(CONFIG_BFIN_GPTIMERS,) +ifneq ($(CONFIG_BFIN_GPTIMERS),) obj-m := gptimers-example.o endif endif diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt index 5aabc08de811..fd12c0d835fd 100644 --- a/Documentation/block/biodoc.txt +++ b/Documentation/block/biodoc.txt @@ -48,8 +48,7 @@ Description of Contents: - Highmem I/O support - I/O scheduler modularization 1.2 Tuning based on high level requirements/capabilities - 1.2.1 I/O Barriers - 1.2.2 Request Priority/Latency + 1.2.1 Request Priority/Latency 1.3 Direct access/bypass to lower layers for diagnostics and special device operations 1.3.1 Pre-built commands @@ -255,29 +254,12 @@ some control over i/o ordering. What kind of support exists at the generic block layer for this ? The flags and rw fields in the bio structure can be used for some tuning -from above e.g indicating that an i/o is just a readahead request, or for -marking barrier requests (discussed next), or priority settings (currently -unused). As far as user applications are concerned they would need an -additional mechanism either via open flags or ioctls, or some other upper -level mechanism to communicate such settings to block. - -1.2.1 I/O Barriers - -There is a way to enforce strict ordering for i/os through barriers. -All requests before a barrier point must be serviced before the barrier -request and any other requests arriving after the barrier will not be -serviced until after the barrier has completed. This is useful for higher -level control on write ordering, e.g flushing a log of committed updates -to disk before the corresponding updates themselves. - -A flag in the bio structure, BIO_BARRIER is used to identify a barrier i/o. -The generic i/o scheduler would make sure that it places the barrier request and -all other requests coming after it after all the previous requests in the -queue. Barriers may be implemented in different ways depending on the -driver. For more details regarding I/O barriers, please read barrier.txt -in this directory. - -1.2.2 Request Priority/Latency +from above e.g indicating that an i/o is just a readahead request, or priority +settings (currently unused). As far as user applications are concerned they +would need an additional mechanism either via open flags or ioctls, or some +other upper level mechanism to communicate such settings to block. + +1.2.1 Request Priority/Latency Todo/Under discussion: Arjan's proposed request priority scheme allows higher levels some broad @@ -906,8 +888,8 @@ queue and specific I/O schedulers. Unless stated otherwise, elevator is used to refer to both parts and I/O scheduler to specific I/O schedulers. Block layer implements generic dispatch queue in block/*.c. -The generic dispatch queue is responsible for properly ordering barrier -requests, requeueing, handling non-fs requests and all other subtleties. +The generic dispatch queue is responsible for requeueing, handling non-fs +requests and all other subtleties. Specific I/O schedulers are responsible for ordering normal filesystem requests. They can also choose to delay certain requests to improve diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index a22df3ad35ff..f456b4315e86 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -275,11 +275,6 @@ When oom event notifier is registered, event will be delivered. 2.7 Kernel Memory Extension (CONFIG_MEMCG_KMEM) -WARNING: Current implementation lacks reclaim support. That means allocation - attempts will fail when close to the limit even if there are plenty of - kmem available for reclaim. That makes this option unusable in real - life so DO NOT SELECT IT unless for development purposes. - With the Kernel memory extension, the Memory Controller is able to limit the amount of kernel memory used by the system. Kernel memory is fundamentally different than user memory, since it can't be swapped out, which makes it @@ -345,6 +340,9 @@ set: In this case, the admin could set up K so that the sum of all groups is never greater than the total memory, and freely set U at the cost of his QoS. + WARNING: In the current implementation, memory reclaim will NOT be + triggered for a cgroup when it hits K while staying below U, which makes + this setup impractical. U != 0, K >= U: Since kmem charges will also be fed to the user counter and reclaim will be diff --git a/Documentation/email-clients.txt b/Documentation/email-clients.txt index eede6088f978..c7d49b885559 100644 --- a/Documentation/email-clients.txt +++ b/Documentation/email-clients.txt @@ -211,7 +211,7 @@ Thunderbird (GUI) Thunderbird is an Outlook clone that likes to mangle text, but there are ways to coerce it into behaving. -- Allows use of an external editor: +- Allow use of an external editor: The easiest thing to do with Thunderbird and patches is to use an "external editor" extension and then just use your favorite $EDITOR for reading/merging patches into the body text. To do this, download @@ -219,6 +219,15 @@ to coerce it into behaving. View->Toolbars->Customize... and finally just click on it when in the Compose dialog. + Please note that "external editor" requires that your editor must not + fork, or in other words, the editor must not return before closing. + You may have to pass additional flags or change the settings of your + editor. Most notably if you are using gvim then you must pass the -f + option to gvim by putting "/usr/bin/gvim -f" (if the binary is in + /usr/bin) to the text editor field in "external editor" settings. If you + are using some other editor then please read its manual to find out how + to do this. + To beat some sense out of the internal editor, do this: - Edit your Thunderbird config settings so that it won't use format=flowed. diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index 8e36c7e3c345..c3b6b301d8b0 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -1260,9 +1260,9 @@ Various pieces of information about kernel activity are available in the since the system first booted. For a quick look, simply cat the file: > cat /proc/stat - cpu 2255 34 2290 22625563 6290 127 456 0 0 - cpu0 1132 34 1441 11311718 3675 127 438 0 0 - cpu1 1123 0 849 11313845 2614 0 18 0 0 + cpu 2255 34 2290 22625563 6290 127 456 0 0 0 + cpu0 1132 34 1441 11311718 3675 127 438 0 0 0 + cpu1 1123 0 849 11313845 2614 0 18 0 0 0 intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] ctxt 1990473 btime 1062191376 diff --git a/Documentation/gpio/board.txt b/Documentation/gpio/board.txt index 8b35f51fe7b6..b80606de545a 100644 --- a/Documentation/gpio/board.txt +++ b/Documentation/gpio/board.txt @@ -50,10 +50,43 @@ gpiod_is_active_low(power) will be true). ACPI ---- -ACPI does not support function names for GPIOs. Therefore, only the "idx" -argument of gpiod_get_index() is useful to discriminate between GPIOs assigned -to a device. The "con_id" argument can still be set for debugging purposes (it -will appear under error messages as well as debug and sysfs nodes). +ACPI also supports function names for GPIOs in a similar fashion to DT. +The above DT example can be converted to an equivalent ACPI description +with the help of _DSD (Device Specific Data), introduced in ACPI 5.1: + + Device (FOO) { + Name (_CRS, ResourceTemplate () { + GpioIo (Exclusive, ..., IoRestrictionOutputOnly, + "\\_SB.GPI0") {15} // red + GpioIo (Exclusive, ..., IoRestrictionOutputOnly, + "\\_SB.GPI0") {16} // green + GpioIo (Exclusive, ..., IoRestrictionOutputOnly, + "\\_SB.GPI0") {17} // blue + GpioIo (Exclusive, ..., IoRestrictionOutputOnly, + "\\_SB.GPI0") {1} // power + }) + + Name (_DSD, Package () { + ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"), + Package () { + Package () { + "led-gpios", + Package () { + ^FOO, 0, 0, 1, + ^FOO, 1, 0, 1, + ^FOO, 2, 0, 1, + } + }, + Package () { + "power-gpios", + Package () {^FOO, 3, 0, 0}, + }, + } + }) + } + +For more information about the ACPI GPIO bindings see +Documentation/acpi/gpio-properties.txt. Platform Data ------------- diff --git a/Documentation/i2o/README b/Documentation/i2o/README deleted file mode 100644 index ee91e2626ff0..000000000000 --- a/Documentation/i2o/README +++ /dev/null @@ -1,63 +0,0 @@ - - Linux I2O Support (c) Copyright 1999 Red Hat Software - and others. - - This program is free software; you can redistribute it and/or - modify it under the terms of the GNU General Public License - as published by the Free Software Foundation; either version - 2 of the License, or (at your option) any later version. - -AUTHORS (so far) - -Alan Cox, Building Number Three Ltd. - Core code, SCSI and Block OSMs - -Steve Ralston, LSI Logic Corp. - Debugging SCSI and Block OSM - -Deepak Saxena, Intel Corp. - Various core/block extensions - /proc interface, bug fixes - Ioctl interfaces for control - Debugging LAN OSM - -Philip Rumpf - Fixed assorted dumb SMP locking bugs - -Juha Sievanen, University of Helsinki Finland - LAN OSM code - /proc interface to LAN class - Bug fixes - Core code extensions - -Auvo Häkkinen, University of Helsinki Finland - LAN OSM code - /Proc interface to LAN class - Bug fixes - Core code extensions - -Taneli Vähäkangas, University of Helsinki Finland - Fixes to i2o_config - -CREDITS - - This work was made possible by - -Red Hat Software - Funding for the Building #3 part of the project - -Symbios Logic (Now LSI) - Host adapters, hints, known to work platforms when I hit - compatibility problems - -BoxHill Corporation - Loan of initial FibreChannel disk array used for development work. - -European Commission - Funding the work done by the University of Helsinki - -SysKonnect - Loan of FDDI and Gigabit Ethernet cards - -ASUSTeK - Loan of I2O motherboard diff --git a/Documentation/i2o/ioctl b/Documentation/i2o/ioctl deleted file mode 100644 index 27c3c5493116..000000000000 --- a/Documentation/i2o/ioctl +++ /dev/null @@ -1,394 +0,0 @@ - -Linux I2O User Space Interface -rev 0.3 - 04/20/99 - -============================================================================= -Originally written by Deepak Saxena(deepak@plexity.net) -Currently maintained by Deepak Saxena(deepak@plexity.net) -============================================================================= - -I. Introduction - -The Linux I2O subsystem provides a set of ioctl() commands that can be -utilized by user space applications to communicate with IOPs and devices -on individual IOPs. This document defines the specific ioctl() commands -that are available to the user and provides examples of their uses. - -This document assumes the reader is familiar with or has access to the -I2O specification as no I2O message parameters are outlined. For information -on the specification, see http://www.i2osig.org - -This document and the I2O user space interface are currently maintained -by Deepak Saxena. Please send all comments, errata, and bug fixes to -deepak@csociety.purdue.edu - -II. IOP Access - -Access to the I2O subsystem is provided through the device file named -/dev/i2o/ctl. This file is a character file with major number 10 and minor -number 166. It can be created through the following command: - - mknod /dev/i2o/ctl c 10 166 - -III. Determining the IOP Count - - SYNOPSIS - - ioctl(fd, I2OGETIOPS, int *count); - - u8 count[MAX_I2O_CONTROLLERS]; - - DESCRIPTION - - This function returns the system's active IOP table. count should - point to a buffer containing MAX_I2O_CONTROLLERS entries. Upon - returning, each entry will contain a non-zero value if the given - IOP unit is active, and NULL if it is inactive or non-existent. - - RETURN VALUE. - - Returns 0 if no errors occur, and -1 otherwise. If an error occurs, - errno is set appropriately: - - EFAULT Invalid user space pointer was passed - -IV. Getting Hardware Resource Table - - SYNOPSIS - - ioctl(fd, I2OHRTGET, struct i2o_cmd_hrt *hrt); - - struct i2o_cmd_hrtlct - { - u32 iop; /* IOP unit number */ - void *resbuf; /* Buffer for result */ - u32 *reslen; /* Buffer length in bytes */ - }; - - DESCRIPTION - - This function returns the Hardware Resource Table of the IOP specified - by hrt->iop in the buffer pointed to by hrt->resbuf. The actual size of - the data is written into *(hrt->reslen). - - RETURNS - - This function returns 0 if no errors occur. If an error occurs, -1 - is returned and errno is set appropriately: - - EFAULT Invalid user space pointer was passed - ENXIO Invalid IOP number - ENOBUFS Buffer not large enough. If this occurs, the required - buffer length is written into *(hrt->reslen) - -V. Getting Logical Configuration Table - - SYNOPSIS - - ioctl(fd, I2OLCTGET, struct i2o_cmd_lct *lct); - - struct i2o_cmd_hrtlct - { - u32 iop; /* IOP unit number */ - void *resbuf; /* Buffer for result */ - u32 *reslen; /* Buffer length in bytes */ - }; - - DESCRIPTION - - This function returns the Logical Configuration Table of the IOP specified - by lct->iop in the buffer pointed to by lct->resbuf. The actual size of - the data is written into *(lct->reslen). - - RETURNS - - This function returns 0 if no errors occur. If an error occurs, -1 - is returned and errno is set appropriately: - - EFAULT Invalid user space pointer was passed - ENXIO Invalid IOP number - ENOBUFS Buffer not large enough. If this occurs, the required - buffer length is written into *(lct->reslen) - -VI. Setting Parameters - - SYNOPSIS - - ioctl(fd, I2OPARMSET, struct i2o_parm_setget *ops); - - struct i2o_cmd_psetget - { - u32 iop; /* IOP unit number */ - u32 tid; /* Target device TID */ - void *opbuf; /* Operation List buffer */ - u32 oplen; /* Operation List buffer length in bytes */ - void *resbuf; /* Result List buffer */ - u32 *reslen; /* Result List buffer length in bytes */ - }; - - DESCRIPTION - - This function posts a UtilParamsSet message to the device identified - by ops->iop and ops->tid. The operation list for the message is - sent through the ops->opbuf buffer, and the result list is written - into the buffer pointed to by ops->resbuf. The number of bytes - written is placed into *(ops->reslen). - - RETURNS - - The return value is the size in bytes of the data written into - ops->resbuf if no errors occur. If an error occurs, -1 is returned - and errno is set appropriately: - - EFAULT Invalid user space pointer was passed - ENXIO Invalid IOP number - ENOBUFS Buffer not large enough. If this occurs, the required - buffer length is written into *(ops->reslen) - ETIMEDOUT Timeout waiting for reply message - ENOMEM Kernel memory allocation error - - A return value of 0 does not mean that the value was actually - changed properly on the IOP. The user should check the result - list to determine the specific status of the transaction. - -VII. Getting Parameters - - SYNOPSIS - - ioctl(fd, I2OPARMGET, struct i2o_parm_setget *ops); - - struct i2o_parm_setget - { - u32 iop; /* IOP unit number */ - u32 tid; /* Target device TID */ - void *opbuf; /* Operation List buffer */ - u32 oplen; /* Operation List buffer length in bytes */ - void *resbuf; /* Result List buffer */ - u32 *reslen; /* Result List buffer length in bytes */ - }; - - DESCRIPTION - - This function posts a UtilParamsGet message to the device identified - by ops->iop and ops->tid. The operation list for the message is - sent through the ops->opbuf buffer, and the result list is written - into the buffer pointed to by ops->resbuf. The actual size of data - written is placed into *(ops->reslen). - - RETURNS - - EFAULT Invalid user space pointer was passed - ENXIO Invalid IOP number - ENOBUFS Buffer not large enough. If this occurs, the required - buffer length is written into *(ops->reslen) - ETIMEDOUT Timeout waiting for reply message - ENOMEM Kernel memory allocation error - - A return value of 0 does not mean that the value was actually - properly retrieved. The user should check the result list - to determine the specific status of the transaction. - -VIII. Downloading Software - - SYNOPSIS - - ioctl(fd, I2OSWDL, struct i2o_sw_xfer *sw); - - struct i2o_sw_xfer - { - u32 iop; /* IOP unit number */ - u8 flags; /* DownloadFlags field */ - u8 sw_type; /* Software type */ - u32 sw_id; /* Software ID */ - void *buf; /* Pointer to software buffer */ - u32 *swlen; /* Length of software buffer */ - u32 *maxfrag; /* Number of fragments */ - u32 *curfrag; /* Current fragment number */ - }; - - DESCRIPTION - - This function downloads a software fragment pointed by sw->buf - to the iop identified by sw->iop. The DownloadFlags, SwID, SwType - and SwSize fields of the ExecSwDownload message are filled in with - the values of sw->flags, sw->sw_id, sw->sw_type and *(sw->swlen). - - The fragments _must_ be sent in order and be 8K in size. The last - fragment _may_ be shorter, however. The kernel will compute its - size based on information in the sw->swlen field. - - Please note that SW transfers can take a long time. - - RETURNS - - This function returns 0 no errors occur. If an error occurs, -1 - is returned and errno is set appropriately: - - EFAULT Invalid user space pointer was passed - ENXIO Invalid IOP number - ETIMEDOUT Timeout waiting for reply message - ENOMEM Kernel memory allocation error - -IX. Uploading Software - - SYNOPSIS - - ioctl(fd, I2OSWUL, struct i2o_sw_xfer *sw); - - struct i2o_sw_xfer - { - u32 iop; /* IOP unit number */ - u8 flags; /* UploadFlags */ - u8 sw_type; /* Software type */ - u32 sw_id; /* Software ID */ - void *buf; /* Pointer to software buffer */ - u32 *swlen; /* Length of software buffer */ - u32 *maxfrag; /* Number of fragments */ - u32 *curfrag; /* Current fragment number */ - }; - - DESCRIPTION - - This function uploads a software fragment from the IOP identified - by sw->iop, sw->sw_type, sw->sw_id and optionally sw->swlen fields. - The UploadFlags, SwID, SwType and SwSize fields of the ExecSwUpload - message are filled in with the values of sw->flags, sw->sw_id, - sw->sw_type and *(sw->swlen). - - The fragments _must_ be requested in order and be 8K in size. The - user is responsible for allocating memory pointed by sw->buf. The - last fragment _may_ be shorter. - - Please note that SW transfers can take a long time. - - RETURNS - - This function returns 0 if no errors occur. If an error occurs, -1 - is returned and errno is set appropriately: - - EFAULT Invalid user space pointer was passed - ENXIO Invalid IOP number - ETIMEDOUT Timeout waiting for reply message - ENOMEM Kernel memory allocation error - -X. Removing Software - - SYNOPSIS - - ioctl(fd, I2OSWDEL, struct i2o_sw_xfer *sw); - - struct i2o_sw_xfer - { - u32 iop; /* IOP unit number */ - u8 flags; /* RemoveFlags */ - u8 sw_type; /* Software type */ - u32 sw_id; /* Software ID */ - void *buf; /* Unused */ - u32 *swlen; /* Length of the software data */ - u32 *maxfrag; /* Unused */ - u32 *curfrag; /* Unused */ - }; - - DESCRIPTION - - This function removes software from the IOP identified by sw->iop. - The RemoveFlags, SwID, SwType and SwSize fields of the ExecSwRemove message - are filled in with the values of sw->flags, sw->sw_id, sw->sw_type and - *(sw->swlen). Give zero in *(sw->len) if the value is unknown. IOP uses - *(sw->swlen) value to verify correct identication of the module to remove. - The actual size of the module is written into *(sw->swlen). - - RETURNS - - This function returns 0 if no errors occur. If an error occurs, -1 - is returned and errno is set appropriately: - - EFAULT Invalid user space pointer was passed - ENXIO Invalid IOP number - ETIMEDOUT Timeout waiting for reply message - ENOMEM Kernel memory allocation error - -X. Validating Configuration - - SYNOPSIS - - ioctl(fd, I2OVALIDATE, int *iop); - u32 iop; - - DESCRIPTION - - This function posts an ExecConfigValidate message to the controller - identified by iop. This message indicates that the current - configuration is accepted. The iop changes the status of suspect drivers - to valid and may delete old drivers from its store. - - RETURNS - - This function returns 0 if no erro occur. If an error occurs, -1 is - returned and errno is set appropriately: - - ETIMEDOUT Timeout waiting for reply message - ENXIO Invalid IOP number - -XI. Configuration Dialog - - SYNOPSIS - - ioctl(fd, I2OHTML, struct i2o_html *htquery); - struct i2o_html - { - u32 iop; /* IOP unit number */ - u32 tid; /* Target device ID */ - u32 page; /* HTML page */ - void *resbuf; /* Buffer for reply HTML page */ - u32 *reslen; /* Length in bytes of reply buffer */ - void *qbuf; /* Pointer to HTTP query string */ - u32 qlen; /* Length in bytes of query string buffer */ - }; - - DESCRIPTION - - This function posts an UtilConfigDialog message to the device identified - by htquery->iop and htquery->tid. The requested HTML page number is - provided by the htquery->page field, and the resultant data is stored - in the buffer pointed to by htquery->resbuf. If there is an HTTP query - string that is to be sent to the device, it should be sent in the buffer - pointed to by htquery->qbuf. If there is no query string, this field - should be set to NULL. The actual size of the reply received is written - into *(htquery->reslen). - - RETURNS - - This function returns 0 if no error occur. If an error occurs, -1 - is returned and errno is set appropriately: - - EFAULT Invalid user space pointer was passed - ENXIO Invalid IOP number - ENOBUFS Buffer not large enough. If this occurs, the required - buffer length is written into *(ops->reslen) - ETIMEDOUT Timeout waiting for reply message - ENOMEM Kernel memory allocation error - -XII. Events - - In the process of determining this. Current idea is to have use - the select() interface to allow user apps to periodically poll - the /dev/i2o/ctl device for events. When select() notifies the user - that an event is available, the user would call read() to retrieve - a list of all the events that are pending for the specific device. - -============================================================================= -Revision History -============================================================================= - -Rev 0.1 - 04/01/99 -- Initial revision - -Rev 0.2 - 04/06/99 -- Changed return values to match UNIX ioctl() standard. Only return values - are 0 and -1. All errors are reported through errno. -- Added summary of proposed possible event interfaces - -Rev 0.3 - 04/20/99 -- Changed all ioctls() to use pointers to user data instead of actual data -- Updated error values to match the code diff --git a/Documentation/input/alps.txt b/Documentation/input/alps.txt index 92ae734c00c3..b9d229fee6b9 100644 --- a/Documentation/input/alps.txt +++ b/Documentation/input/alps.txt @@ -58,7 +58,7 @@ To exit command mode, PSMOUSE_CMD_SETSTREAM (EA) is sent to the touchpad. While in command mode, register addresses can be set by first sending a specific command, either EC for v3 devices or F5 for v4 devices. Then the address is sent one nibble at a time, where each nibble is encoded as a -command with optional data. This enoding differs slightly between the v3 and +command with optional data. This encoding differs slightly between the v3 and v4 protocols. Once an address has been set, the addressed register can be read by sending @@ -139,7 +139,7 @@ ALPS Absolute Mode - Protocol Version 3 --------------------------------------- ALPS protocol version 3 has three different packet formats. The first two are -associated with touchpad events, and the third is associatd with trackstick +associated with touchpad events, and the third is associated with trackstick events. The first type is the touchpad position packet. diff --git a/Documentation/input/event-codes.txt b/Documentation/input/event-codes.txt index 96705616f582..3f0f5ce3338b 100644 --- a/Documentation/input/event-codes.txt +++ b/Documentation/input/event-codes.txt @@ -229,7 +229,7 @@ such device to feedback. EV_PWR: ---------- EV_PWR events are a special type of event used specifically for power -mangement. Its usage is not well defined. To be addressed later. +management. Its usage is not well defined. To be addressed later. Device properties: ================= diff --git a/Documentation/input/gpio-tilt.txt b/Documentation/input/gpio-tilt.txt index 06d60c3ff5e7..2cdfd9bcb1af 100644 --- a/Documentation/input/gpio-tilt.txt +++ b/Documentation/input/gpio-tilt.txt @@ -28,7 +28,7 @@ Example: -------- Example configuration for a single TS1003 tilt switch that rotates around -one axis in 4 steps and emitts the current tilt via two GPIOs. +one axis in 4 steps and emits the current tilt via two GPIOs. static int sg060_tilt_enable(struct device *dev) { /* code to enable the sensors */ diff --git a/Documentation/input/iforce-protocol.txt b/Documentation/input/iforce-protocol.txt index 2d5fbfd6023e..66287151c54a 100644 --- a/Documentation/input/iforce-protocol.txt +++ b/Documentation/input/iforce-protocol.txt @@ -97,7 +97,7 @@ LEN= 0e *** Attack and fade *** OP= 02 LEN= 08 -00-01 Address where to store the parameteres +00-01 Address where to store the parameters 02-03 Duration of attack (little endian encoding, in ms) 04 Level at end of attack. Signed byte. 05-06 Duration of fade. diff --git a/Documentation/input/walkera0701.txt b/Documentation/input/walkera0701.txt index 561385d38482..49e3ac60dcef 100644 --- a/Documentation/input/walkera0701.txt +++ b/Documentation/input/walkera0701.txt @@ -91,7 +91,7 @@ absolute binary value. (10 bits per channel). Next nibble is checksum for first ten nibbles. Next nibbles 12 .. 21 represents four channels (not all channels can be -directly controlled from TX). Binary representations ar the same as in first +directly controlled from TX). Binary representations are the same as in first four channels. In nibbles 22 and 23 is a special magic number. Nibble 24 is checksum for nibbles 12..23. diff --git a/Documentation/input/yealink.txt b/Documentation/input/yealink.txt index 5360e434486c..8277b76ec506 100644 --- a/Documentation/input/yealink.txt +++ b/Documentation/input/yealink.txt @@ -93,7 +93,7 @@ Format description: Format specifier '8' : Generic 7 segment digit with individual addressable segments - Reduced capability 7 segm digit, when segments are hard wired together. + Reduced capability 7 segment digit, when segments are hard wired together. '1' : 2 segments digit only able to produce a 1. 'e' : Most significant day of the month digit, able to produce at least 1 2 3. diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 11a76df2e1f1..5bc92d2f4716 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -928,6 +928,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Enable debug messages at boot time. See Documentation/dynamic-debug-howto.txt for details. + eagerfpu= [X86] + on enable eager fpu restore + off disable eager fpu restore + auto selects the default scheme, which automatically + enables eagerfpu restore for xsaveopt. + early_ioremap_debug [KNL] Enable debug messages in early_ioremap support. This is useful for tracking down temporary early mappings @@ -2344,12 +2350,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. parameter, xsave area per process might occupy more memory on xsaves enabled systems. - eagerfpu= [X86] - on enable eager fpu restore - off disable eager fpu restore - auto selects the default scheme, which automatically - enables eagerfpu restore for xsaveopt. - nohlt [BUGS=ARM,SH] Tells the kernel that the sleep(SH) or wfi(ARM) instruction doesn't work correctly and not to use it. This is also useful when using JTAG debugger. diff --git a/Documentation/kmemcheck.txt b/Documentation/kmemcheck.txt index a41bdebbe87b..80aae85d8da6 100644 --- a/Documentation/kmemcheck.txt +++ b/Documentation/kmemcheck.txt @@ -82,8 +82,8 @@ menu to even appear in "menuconfig". These are: o CONFIG_DEBUG_PAGEALLOC=n - This option is located under "Kernel hacking" / "Debug page memory - allocations". + This option is located under "Kernel hacking" / "Memory Debugging" + / "Debug page memory allocations". In addition, I highly recommend turning on CONFIG_DEBUG_INFO=y. This is also located under "Kernel hacking". With this, you will be able to get line number diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt index 1488b6525eb6..1f9b3e2b98ae 100644 --- a/Documentation/kprobes.txt +++ b/Documentation/kprobes.txt @@ -305,8 +305,8 @@ architectures: 3. Configuring Kprobes When configuring the kernel using make menuconfig/xconfig/oldconfig, -ensure that CONFIG_KPROBES is set to "y". Under "Instrumentation -Support", look for "Kprobes". +ensure that CONFIG_KPROBES is set to "y". Under "General setup", look +for "Kprobes". So that you can load and unload Kprobes-based instrumentation modules, make sure "Loadable module support" (CONFIG_MODULES) and "Module diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 6974f1c2b4e1..f95746189b5d 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -1727,7 +1727,7 @@ There are some more advanced barrier functions: } The dma_rmb() allows us guarantee the device has released ownership - before we read the data from the descriptor, and he dma_wmb() allows + before we read the data from the descriptor, and the dma_wmb() allows us to guarantee the data is written to the descriptor before the device can see it now has ownership. The wmb() is needed to guarantee that the cache coherent memory writes have completed before attempting a write to diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt index ea03abfc97e9..ce2cfcf35c27 100644 --- a/Documentation/memory-hotplug.txt +++ b/Documentation/memory-hotplug.txt @@ -149,7 +149,7 @@ For example, assume 1GiB memory block size. A device for a memory starting at (0x100000000 / 1Gib = 4) This device covers address range [0x100000000 ... 0x140000000) -Under each memory block, you can see 4 files: +Under each memory block, you can see 5 files: /sys/devices/system/memory/memoryXXX/phys_index /sys/devices/system/memory/memoryXXX/phys_device @@ -359,38 +359,51 @@ Need more implementation yet.... -------------------------------- 8. Memory hotplug event notifier -------------------------------- -Memory hotplug has event notifier. There are 6 types of notification. +Hotplugging events are sent to a notification queue. -MEMORY_GOING_ONLINE +There are six types of notification defined in include/linux/memory.h: + +MEM_GOING_ONLINE Generated before new memory becomes available in order to be able to prepare subsystems to handle memory. The page allocator is still unable to allocate from the new memory. -MEMORY_CANCEL_ONLINE +MEM_CANCEL_ONLINE Generated if MEMORY_GOING_ONLINE fails. -MEMORY_ONLINE +MEM_ONLINE Generated when memory has successfully brought online. The callback may allocate pages from the new memory. -MEMORY_GOING_OFFLINE +MEM_GOING_OFFLINE Generated to begin the process of offlining memory. Allocations are no longer possible from the memory but some of the memory to be offlined is still in use. The callback can be used to free memory known to a subsystem from the indicated memory block. -MEMORY_CANCEL_OFFLINE +MEM_CANCEL_OFFLINE Generated if MEMORY_GOING_OFFLINE fails. Memory is available again from the memory block that we attempted to offline. -MEMORY_OFFLINE +MEM_OFFLINE Generated after offlining memory is complete. -A callback routine can be registered by +A callback routine can be registered by calling + hotplug_memory_notifier(callback_func, priority) -The second argument of callback function (action) is event types of above. -The third argument is passed by pointer of struct memory_notify. +Callback functions with higher values of priority are called before callback +functions with lower values. + +A callback function must have the following prototype: + + int callback_func( + struct notifier_block *self, unsigned long action, void *arg); + +The first argument of the callback function (self) is a pointer to the block +of the notifier chain that points to the callback function itself. +The second argument (action) is one of the event types described above. +The third argument (arg) passes a pointer of struct memory_notify. struct memory_notify { unsigned long start_pfn; @@ -412,6 +425,18 @@ node loses all memory. If this is -1, then nodemask status is not changed. If status_changed_nid* >= 0, callback should create/discard structures for the node if necessary. +The callback routine shall return one of the values +NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP +defined in include/linux/notifier.h + +NOTIFY_DONE and NOTIFY_OK have no effect on the further processing. + +NOTIFY_BAD is used as response to the MEM_GOING_ONLINE, MEM_GOING_OFFLINE, +MEM_ONLINE, or MEM_OFFLINE action to cancel hotplugging. It stops +further processing of the notification queue. + +NOTIFY_STOP stops further processing of the notification queue. + -------------- 9. Future Work -------------- diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt index cb6a596072bb..2216eb187c21 100644 --- a/Documentation/printk-formats.txt +++ b/Documentation/printk-formats.txt @@ -228,7 +228,7 @@ UUID/GUID addresses: lower ('l') or upper case ('L') hex characters - and big endian order in lower ('b') or upper case ('B') hex characters. - Where no additional specifiers are used the default little endian + Where no additional specifiers are used the default big endian order with lower case hex characters will be printed. Passed by reference. @@ -273,6 +273,16 @@ struct clk: Passed by reference. +bitmap and its derivatives such as cpumask and nodemask: + + %*pb 0779 + %*pbl 0,3-6,8-10 + + For printing bitmap and its derivatives such as cpumask and nodemask, + %*pb output the bitmap with field width as the number of bits and %*pbl + output the bitmap as range list with field width as the number of bits. + + Passed by reference. Thank you for your cooperation and attention. diff --git a/Documentation/scheduler/completion.txt b/Documentation/scheduler/completion.txt index f77651eca31e..2622bc7a188b 100644 --- a/Documentation/scheduler/completion.txt +++ b/Documentation/scheduler/completion.txt @@ -7,24 +7,24 @@ Introduction: ------------- If you have one or more threads of execution that must wait for some process -to have reached a point or a specific state, completions can provide a race -free solution to this problem. Semantically they are somewhat like a -pthread_barriers and have similar use-cases. +to have reached a point or a specific state, completions can provide a +race-free solution to this problem. Semantically they are somewhat like a +pthread_barrier and have similar use-cases. -Completions are a code synchronization mechanism that is preferable to any +Completions are a code synchronization mechanism which is preferable to any misuse of locks. Any time you think of using yield() or some quirky -msleep(1); loop to allow something else to proceed, you probably want to +msleep(1) loop to allow something else to proceed, you probably want to look into using one of the wait_for_completion*() calls instead. The -advantage of using completions is clear intent of the code but also more +advantage of using completions is clear intent of the code, but also more efficient code as both threads can continue until the result is actually needed. Completions are built on top of the generic event infrastructure in Linux, -with the event reduced to a simple flag appropriately called "done" in -struct completion, that tells the waiting threads of execution if they +with the event reduced to a simple flag (appropriately called "done") in +struct completion that tells the waiting threads of execution if they can continue safely. -As completions are scheduling related the code is found in +As completions are scheduling related, the code is found in kernel/sched/completion.c - for details on completion design and implementation see completions-design.txt @@ -32,9 +32,9 @@ implementation see completions-design.txt Usage: ------ -There are three parts to the using completions, the initialization of the +There are three parts to using completions, the initialization of the struct completion, the waiting part through a call to one of the variants of -wait_for_completion() and the signaling side through a call to complete(), +wait_for_completion() and the signaling side through a call to complete() or complete_all(). Further there are some helper functions for checking the state of completions. @@ -50,7 +50,7 @@ handling of completions is: providing the wait queue to place tasks on for waiting and the flag for indicating the state of affairs. -Completions should be named to convey the intent of the waiter. A good +Completions should be named to convey the intent of the waiter. A good example is: wait_for_completion(&early_console_added); @@ -73,7 +73,7 @@ the default state to "not available", that is, "done" is set to 0. The re-initialization function, reinit_completion(), simply resets the done element to "not available", thus again to 0, without touching the -wait queue. Calling init_completion() on the same completions object is +wait queue. Calling init_completion() twice on the same completion object is most likely a bug as it re-initializes the queue to an empty queue and enqueued tasks could get "lost" - use reinit_completion() in that case. @@ -87,10 +87,17 @@ initialization should always use: DECLARE_COMPLETION_ONSTACK(setup_done) suitable for automatic/local variables on the stack and will make lockdep -happy. Note also that one needs to making *sure* the completion passt to +happy. Note also that one needs to make *sure* the completion passed to work threads remains in-scope, and no references remain to on-stack data when the initiating function returns. +Using on-stack completions for code that calls any of the _timeout or +_interruptible/_killable variants is not advisable as they will require +additional synchronization to prevent the on-stack completion object in +the timeout/signal cases from going out of scope. Consider using dynamically +allocated completions when intending to use the _interruptible/_killable +or _timeout variants of wait_for_completion(). + Waiting for completions: ------------------------ @@ -99,34 +106,38 @@ For a thread of execution to wait for some concurrent work to finish, it calls wait_for_completion() on the initialized completion structure. A typical usage scenario is: - structure completion setup_done; + struct completion setup_done; init_completion(&setup_done); - initialze_work(...,&setup_done,...) + initialize_work(...,&setup_done,...) /* run non-dependent code */ /* do setup */ - wait_for_completion(&seupt_done); complete(setup_done) + wait_for_completion(&setup_done); complete(setup_done) -This is not implying any temporal order of wait_for_completion() and the +This is not implying any temporal order on wait_for_completion() and the call to complete() - if the call to complete() happened before the call to wait_for_completion() then the waiting side simply will continue -immediately as all dependencies are satisfied. +immediately as all dependencies are satisfied if not it will block until +completion is signaled by complete(). -Note that wait_for_completion() is calling spin_lock_irq/spin_unlock_irq +Note that wait_for_completion() is calling spin_lock_irq()/spin_unlock_irq(), so it can only be called safely when you know that interrupts are enabled. -Calling it from hard-irq context will result in hard to detect spurious -enabling of interrupts. +Calling it from hard-irq or irqs-off atomic contexts will result in +hard-to-detect spurious enabling of interrupts. wait_for_completion(): void wait_for_completion(struct completion *done): -The default behavior is to wait without a timeout and mark the task as +The default behavior is to wait without a timeout and to mark the task as uninterruptible. wait_for_completion() and its variants are only safe -in soft-interrupt or process context but not in hard-irq context. +in process context (as they can sleep) but not in atomic context, +interrupt context, with disabled irqs. or preemption is disabled - see also +try_wait_for_completion() below for handling completion in atomic/interrupt +context. + As all variants of wait_for_completion() can (obviously) block for a long -time, you probably don't want to call this with held locks - see also -try_wait_for_completion() below. +time, you probably don't want to call this with held mutexes. Variants available: @@ -141,43 +152,44 @@ A common problem that occurs is to have unclean assignment of return types, so care should be taken with assigning return-values to variables of proper type. Checking for the specific meaning of return values also has been found to be quite inaccurate e.g. constructs like -if(!wait_for_completion_interruptible_timeout(...)) would execute the same +if (!wait_for_completion_interruptible_timeout(...)) would execute the same code path for successful completion and for the interrupted case - which is probably not what you want. int wait_for_completion_interruptible(struct completion *done) -marking the task TASK_INTERRUPTIBLE. If a signal was received while waiting. -It will return -ERESTARTSYS and 0 otherwise. +This function marks the task TASK_INTERRUPTIBLE. If a signal was received +while waiting it will return -ERESTARTSYS; 0 otherwise. unsigned long wait_for_completion_timeout(struct completion *done, unsigned long timeout) -The task is marked as TASK_UNINTERRUPTIBLE and will wait at most timeout -(in jiffies). If timeout occurs it return 0 else the remaining time in -jiffies (but at least 1). Timeouts are preferably passed by msecs_to_jiffies() -or usecs_to_jiffies(). If the returned timeout value is deliberately ignored -a comment should probably explain why (e.g. see drivers/mfd/wm8350-core.c -wm8350_read_auxadc()) +The task is marked as TASK_UNINTERRUPTIBLE and will wait at most 'timeout' +(in jiffies). If timeout occurs it returns 0 else the remaining time in +jiffies (but at least 1). Timeouts are preferably calculated with +msecs_to_jiffies() or usecs_to_jiffies(). If the returned timeout value is +deliberately ignored a comment should probably explain why (e.g. see +drivers/mfd/wm8350-core.c wm8350_read_auxadc()) long wait_for_completion_interruptible_timeout( struct completion *done, unsigned long timeout) -passing a timeout in jiffies and marking the task as TASK_INTERRUPTIBLE. If a -signal was received it will return -ERESTARTSYS, 0 if completion timed-out and -the remaining time in jiffies if completion occurred. +This function passes a timeout in jiffies and marks the task as +TASK_INTERRUPTIBLE. If a signal was received it will return -ERESTARTSYS; +otherwise it returns 0 if the completion timed out or the remaining time in +jiffies if completion occurred. -Further variants include _killable which passes TASK_KILLABLE as the -designated tasks state and will return a -ERESTARTSYS if interrupted or -else 0 if completions was achieved as well as a _timeout variant. +Further variants include _killable which uses TASK_KILLABLE as the +designated tasks state and will return -ERESTARTSYS if it is interrupted or +else 0 if completion was achieved. There is a _timeout variant as well: long wait_for_completion_killable(struct completion *done) long wait_for_completion_killable_timeout(struct completion *done, unsigned long timeout) -The _io variants wait_for_completion_io behave the same as the non-_io +The _io variants wait_for_completion_io() behave the same as the non-_io variants, except for accounting waiting time as waiting on IO, which has -an impact on how scheduling is calculated. +an impact on how the task is accounted in scheduling stats. void wait_for_completion_io(struct completion *done) unsigned long wait_for_completion_io_timeout(struct completion *done @@ -187,13 +199,13 @@ an impact on how scheduling is calculated. Signaling completions: ---------------------- -A thread of execution that wants to signal that the conditions for -continuation have been achieved calls complete() to signal exactly one -of the waiters that it can continue. +A thread that wants to signal that the conditions for continuation have been +achieved calls complete() to signal exactly one of the waiters that it can +continue. void complete(struct completion *done) -or calls complete_all to signal all current and future waiters. +or calls complete_all() to signal all current and future waiters. void complete_all(struct completion *done) @@ -205,32 +217,32 @@ wakeup order is the same in which they were enqueued (FIFO order). If complete() is called multiple times then this will allow for that number of waiters to continue - each call to complete() will simply increment the done element. Calling complete_all() multiple times is a bug though. Both -complete() and complete_all() can be called in hard-irq context safely. +complete() and complete_all() can be called in hard-irq/atomic context safely. There only can be one thread calling complete() or complete_all() on a -particular struct completions at any time - serialized through the wait +particular struct completion at any time - serialized through the wait queue spinlock. Any such concurrent calls to complete() or complete_all() probably are a design bug. Signaling completion from hard-irq context is fine as it will appropriately -lock with spin_lock_irqsave/spin_unlock_irqrestore. +lock with spin_lock_irqsave/spin_unlock_irqrestore and it will never sleep. try_wait_for_completion()/completion_done(): -------------------------------------------- -The try_wait_for_completion will not put the thread on the wait queue but -rather returns false if it would need to enqueue (block) the thread, else it -consumes any posted completions and returns true. +The try_wait_for_completion() function will not put the thread on the wait +queue but rather returns false if it would need to enqueue (block) the thread, +else it consumes one posted completion and returns true. - bool try_wait_for_completion(struct completion *done) + bool try_wait_for_completion(struct completion *done) -Finally to check state of a completions without changing it in any way is -provided by completion_done() returning false if there are any posted -completion that was not yet consumed by waiters implying that there are -waiters and true otherwise; +Finally, to check the state of a completion without changing it in any way, +call completion_done(), which returns false if there are no posted +completions that were not yet consumed by waiters (implying that there are +waiters) and true otherwise; - bool completion_done(struct completion *done) + bool completion_done(struct completion *done) Both try_wait_for_completion() and completion_done() are safe to be called in -hard-irq context. +hard-irq or atomic context. diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt index 6fbd55ef6b45..6bfbc172cdb9 100644 --- a/Documentation/vm/pagemap.txt +++ b/Documentation/vm/pagemap.txt @@ -131,7 +131,8 @@ Short descriptions to the page flags: 13. SWAPCACHE page is mapped to swap space, ie. has an associated swap entry 14. SWAPBACKED page is backed by swap/RAM -The page-types tool in this directory can be used to query the above flags. +The page-types tool in the tools/vm directory can be used to query the +above flags. Using pagemap to do something useful: diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt index 6b31cfbe2a9a..8143b9e8373d 100644 --- a/Documentation/vm/transhuge.txt +++ b/Documentation/vm/transhuge.txt @@ -159,6 +159,17 @@ for each pass: /sys/kernel/mm/transparent_hugepage/khugepaged/full_scans +max_ptes_none specifies how many extra small pages (that are +not already mapped) can be allocated when collapsing a group +of small pages into one large page. + +/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none + +A higher value leads to use additional memory for programs. +A lower value leads to gain less thp performance. Value of +max_ptes_none can waste cpu time very little, you can +ignore it. + == Boot parameter == You can change the sysfs boot time defaults of Transparent Hugepage diff --git a/Documentation/zh_CN/arm64/booting.txt b/Documentation/zh_CN/arm64/booting.txt index 6f6d956ac1c9..7cd36af11e71 100644 --- a/Documentation/zh_CN/arm64/booting.txt +++ b/Documentation/zh_CN/arm64/booting.txt @@ -15,6 +15,8 @@ Documentation/arm64/booting.txt 的中文翻译 交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 译存在问题,请联系中文版维护者。 +本文翻译提交时的 Git 检出点为: bc465aa9d045feb0e13b4a8f32cc33c1943f62d6 + 英文版维护者: Will Deacon <will.deacon@arm.com> 中文版维护者: 傅炜 Fu Wei <wefu@redhat.com> 中文版翻译者: 傅炜 Fu Wei <wefu@redhat.com> @@ -88,22 +90,44 @@ AArch64 内核当前没有提供自解压代码,因此如果使用了压缩内 u32 code0; /* 可执行代码 */ u32 code1; /* 可执行代码 */ - u64 text_offset; /* 映像装载偏移 */ - u64 res0 = 0; /* 保留 */ - u64 res1 = 0; /* 保留 */ + u64 text_offset; /* 映像装载偏移,小端模式 */ + u64 image_size; /* 映像实际大小, 小端模式 */ + u64 flags; /* 内核旗标, 小端模式 * u64 res2 = 0; /* 保留 */ u64 res3 = 0; /* 保留 */ u64 res4 = 0; /* 保留 */ u32 magic = 0x644d5241; /* 魔数, 小端, "ARM\x64" */ - u32 res5 = 0; /* 保留 */ + u32 res5; /* 保留 (用于 PE COFF 偏移) */ 映像头注释: +- 自 v3.17 起,除非另有说明,所有域都是小端模式。 + - code0/code1 负责跳转到 stext. -映像必须位于系统 RAM 起始处的特定偏移(当前是 0x80000)。系统 RAM -的起始地址必须是以 2MB 对齐的。 +- 当通过 EFI 启动时, 最初 code0/code1 被跳过。 + res5 是到 PE 文件头的偏移,而 PE 文件头含有 EFI 的启动入口点 (efi_stub_entry)。 + 当 stub 代码完成了它的使命,它会跳转到 code0 继续正常的启动流程。 + +- v3.17 之前,未明确指定 text_offset 的字节序。此时,image_size 为零, + 且 text_offset 依照内核字节序为 0x80000。 + 当 image_size 非零,text_offset 为小端模式且是有效值,应被引导加载程序使用。 + 当 image_size 为零,text_offset 可假定为 0x80000。 + +- flags 域 (v3.17 引入) 为 64 位小端模式,其编码如下: + 位 0: 内核字节序。 1 表示大端模式,0 表示小端模式。 + 位 1-63: 保留。 + +- 当 image_size 为零时,引导装载程序应该试图在内核映像末尾之后尽可能多地保留空闲内存 + 供内核直接使用。对内存空间的需求量因所选定的内核特性而异, 且无实际限制。 + +内核映像必须被放置在靠近可用系统内存起始的 2MB 对齐为基址的 text_offset 字节处,并从那里被调用。 +当前,对 Linux 来说在此基址以下的内存是无法使用的,因此强烈建议将系统内存的起始作为这个基址。 +从映像起始地址算起,最少必须为内核释放出 image_size 字节的空间。 + +任何提供给内核的内存(甚至在 2MB 对齐的基地址之前),若未从内核中标记为保留 +(如在设备树(dtb)的 memreserve 区域),都将被认为对内核是可用。 在跳转入内核前,必须符合以下状态: @@ -124,8 +148,12 @@ AArch64 内核当前没有提供自解压代码,因此如果使用了压缩内 - 高速缓存、MMU MMU 必须关闭。 指令缓存开启或关闭都可以。 - 数据缓存必须关闭且无效。 - 外部高速缓存(如果存在)必须配置并禁用。 + 已载入的内核映像的相应内存区必须被清理,以达到缓存一致性点(PoC)。 + 当存在系统缓存或其他使能缓存的一致性主控器时,通常需使用虚拟地址维护其缓存,而非 set/way 操作。 + 遵从通过虚拟地址操作维护构架缓存的系统缓存必须被配置,并可以被使能。 + 而不通过虚拟地址操作维护构架缓存的系统缓存(不推荐),必须被配置且禁用。 + + *译者注:对于 PoC 以及缓存相关内容,请参考 ARMv8 构架参考手册 ARM DDI 0487A - 架构计时器 CNTFRQ 必须设定为计时器的频率,且 CNTVOFF 必须设定为对所有 CPU @@ -141,6 +169,14 @@ AArch64 内核当前没有提供自解压代码,因此如果使用了压缩内 在进入内核映像的异常级中,所有构架中可写的系统寄存器必须通过软件 在一个更高的异常级别下初始化,以防止在 未知 状态下运行。 + 对于拥有 GICv3 中断控制器的系统: + - 若当前在 EL3 : + ICC_SRE_EL3.Enable (位 3) 必须初始化为 0b1。 + ICC_SRE_EL3.SRE (位 0) 必须初始化为 0b1。 + - 若内核运行在 EL1: + ICC_SRE_EL2.Enable (位 3) 必须初始化为 0b1。 + ICC_SRE_EL2.SRE (位 0) 必须初始化为 0b1。 + 以上对于 CPU 模式、高速缓存、MMU、架构计时器、一致性、系统寄存器的 必要条件描述适用于所有 CPU。所有 CPU 必须在同一异常级别跳入内核。 @@ -170,7 +206,7 @@ AArch64 内核当前没有提供自解压代码,因此如果使用了压缩内 ARM DEN 0022A:用于 ARM 上的电源状态协调接口系统软件)中描述的 CPU_ON 调用来将 CPU 带入内核。 - *译者注:到文档翻译时,此文档已更新为 ARM DEN 0022B。 + *译者注: ARM DEN 0022A 已更新到 ARM DEN 0022C。 设备树必须包含一个 ‘psci’ 节点,请参考以下文档: Documentation/devicetree/bindings/arm/psci.txt diff --git a/Documentation/zh_CN/arm64/legacy_instructions.txt b/Documentation/zh_CN/arm64/legacy_instructions.txt new file mode 100644 index 000000000000..68362a1ab717 --- /dev/null +++ b/Documentation/zh_CN/arm64/legacy_instructions.txt @@ -0,0 +1,72 @@ +Chinese translated version of Documentation/arm64/legacy_instructions.txt + +If you have any comment or update to the content, please contact the +original document maintainer directly. However, if you have a problem +communicating in English you can also ask the Chinese maintainer for +help. Contact the Chinese maintainer if this translation is outdated +or if there is a problem with the translation. + +Maintainer: Punit Agrawal <punit.agrawal@arm.com> + Suzuki K. Poulose <suzuki.poulose@arm.com> +Chinese maintainer: Fu Wei <wefu@redhat.com> +--------------------------------------------------------------------- +Documentation/arm64/legacy_instructions.txt 的中文翻译 + +如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文 +交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 +译存在问题,请联系中文版维护者。 + +本文翻译提交时的 Git 检出点为: bc465aa9d045feb0e13b4a8f32cc33c1943f62d6 + +英文版维护者: Punit Agrawal <punit.agrawal@arm.com> + Suzuki K. Poulose <suzuki.poulose@arm.com> +中文版维护者: 傅炜 Fu Wei <wefu@redhat.com> +中文版翻译者: 傅炜 Fu Wei <wefu@redhat.com> +中文版校译者: 傅炜 Fu Wei <wefu@redhat.com> + +以下为正文 +--------------------------------------------------------------------- +Linux 内核在 arm64 上的移植提供了一个基础框架,以支持构架中正在被淘汰或已废弃指令的模拟执行。 +这个基础框架的代码使用未定义指令钩子(hooks)来支持模拟。如果指令存在,它也允许在硬件中启用该指令。 + +模拟模式可通过写 sysctl 节点(/proc/sys/abi)来控制。 +不同的执行方式及 sysctl 节点的相应值,解释如下: + +* Undef(未定义) + 值: 0 + 产生未定义指令终止异常。它是那些构架中已废弃的指令,如 SWP,的默认处理方式。 + +* Emulate(模拟) + 值: 1 + 使用软件模拟方式。为解决软件迁移问题,这种模拟指令模式的使用是被跟踪的,并会发出速率限制警告。 + 它是那些构架中正在被淘汰的指令,如 CP15 barriers(隔离指令),的默认处理方式。 + +* Hardware Execution(硬件执行) + 值: 2 + 虽然标记为正在被淘汰,但一些实现可能提供硬件执行这些指令的使能/禁用操作。 + 使用硬件执行一般会有更好的性能,但将无法收集运行时对正被淘汰指令的使用统计数据。 + +默认执行模式依赖于指令在构架中状态。正在被淘汰的指令应该以模拟(Emulate)作为默认模式, +而已废弃的指令必须默认使用未定义(Undef)模式 + +注意:指令模拟可能无法应对所有情况。更多详情请参考单独的指令注释。 + +受支持的遗留指令 +------------- +* SWP{B} +节点: /proc/sys/abi/swp +状态: 已废弃 +默认执行方式: Undef (0) + +* CP15 Barriers +节点: /proc/sys/abi/cp15_barrier +状态: 正被淘汰,不推荐使用 +默认执行方式: Emulate (1) + +* SETEND +节点: /proc/sys/abi/setend +状态: 正被淘汰,不推荐使用 +默认执行方式: Emulate (1)* +注:为了使能这个特性,系统中的所有 CPU 必须在 EL0 支持混合字节序。 +如果一个新的 CPU (不支持混合字节序) 在使能这个特性后被热插入系统, +在应用中可能会出现不可预期的结果。 diff --git a/Documentation/zh_CN/arm64/memory.txt b/Documentation/zh_CN/arm64/memory.txt index a782704c1cb5..19b3a52d5d94 100644 --- a/Documentation/zh_CN/arm64/memory.txt +++ b/Documentation/zh_CN/arm64/memory.txt @@ -15,6 +15,8 @@ Documentation/arm64/memory.txt 的中文翻译 交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 译存在问题,请联系中文版维护者。 +本文翻译提交时的 Git 检出点为: bc465aa9d045feb0e13b4a8f32cc33c1943f62d6 + 英文版维护者: Catalin Marinas <catalin.marinas@arm.com> 中文版维护者: 傅炜 Fu Wei <wefu@redhat.com> 中文版翻译者: 傅炜 Fu Wei <wefu@redhat.com> @@ -26,69 +28,53 @@ Documentation/arm64/memory.txt 的中文翻译 =========================== 作者: Catalin Marinas <catalin.marinas@arm.com> -日期: 2012 年 02 月 20 日 本文档描述 AArch64 Linux 内核所使用的虚拟内存布局。此构架可以实现 页大小为 4KB 的 4 级转换表和页大小为 64KB 的 3 级转换表。 -AArch64 Linux 使用页大小为 4KB 的 3 级转换表配置,对于用户和内核 -都有 39-bit (512GB) 的虚拟地址空间。对于页大小为 64KB的配置,仅 -使用 2 级转换表,但内存布局相同。 +AArch64 Linux 使用 3 级或 4 级转换表,其页大小配置为 4KB,对于用户和内核 +分别都有 39-bit (512GB) 或 48-bit (256TB) 的虚拟地址空间。 +对于页大小为 64KB的配置,仅使用 2 级转换表,有 42-bit (4TB) 的虚拟地址空间,但内存布局相同。 -用户地址空间的 63:39 位为 0,而内核地址空间的相应位为 1。TTBRx 的 +用户地址空间的 63:48 位为 0,而内核地址空间的相应位为 1。TTBRx 的 选择由虚拟地址的 63 位给出。swapper_pg_dir 仅包含内核(全局)映射, -而用户 pgd 仅包含用户(非全局)映射。swapper_pgd_dir 地址被写入 +而用户 pgd 仅包含用户(非全局)映射。swapper_pg_dir 地址被写入 TTBR1 中,且从不写入 TTBR0。 -AArch64 Linux 在页大小为 4KB 时的内存布局: +AArch64 Linux 在页大小为 4KB,并使用 3 级转换表时的内存布局: 起始地址 结束地址 大小 用途 ----------------------------------------------------------------------- 0000000000000000 0000007fffffffff 512GB 用户空间 +ffffff8000000000 ffffffffffffffff 512GB 内核空间 -ffffff8000000000 ffffffbbfffeffff ~240GB vmalloc - -ffffffbbffff0000 ffffffbbffffffff 64KB [防护页] - -ffffffbc00000000 ffffffbdffffffff 8GB vmemmap - -ffffffbe00000000 ffffffbffbbfffff ~8GB [防护页,未来用于 vmmemap] -ffffffbffbc00000 ffffffbffbdfffff 2MB earlyprintk 设备 +AArch64 Linux 在页大小为 4KB,并使用 4 级转换表时的内存布局: -ffffffbffbe00000 ffffffbffbe0ffff 64KB PCI I/O 空间 - -ffffffbffbe10000 ffffffbcffffffff ~2MB [防护页] - -ffffffbffc000000 ffffffbfffffffff 64MB 模块 - -ffffffc000000000 ffffffffffffffff 256GB 内核逻辑内存映射 +起始地址 结束地址 大小 用途 +----------------------------------------------------------------------- +0000000000000000 0000ffffffffffff 256TB 用户空间 +ffff000000000000 ffffffffffffffff 256TB 内核空间 -AArch64 Linux 在页大小为 64KB 时的内存布局: +AArch64 Linux 在页大小为 64KB,并使用 2 级转换表时的内存布局: 起始地址 结束地址 大小 用途 ----------------------------------------------------------------------- 0000000000000000 000003ffffffffff 4TB 用户空间 +fffffc0000000000 ffffffffffffffff 4TB 内核空间 -fffffc0000000000 fffffdfbfffeffff ~2TB vmalloc - -fffffdfbffff0000 fffffdfbffffffff 64KB [防护页] - -fffffdfc00000000 fffffdfdffffffff 8GB vmemmap - -fffffdfe00000000 fffffdfffbbfffff ~8GB [防护页,未来用于 vmmemap] -fffffdfffbc00000 fffffdfffbdfffff 2MB earlyprintk 设备 +AArch64 Linux 在页大小为 64KB,并使用 3 级转换表时的内存布局: -fffffdfffbe00000 fffffdfffbe0ffff 64KB PCI I/O 空间 - -fffffdfffbe10000 fffffdfffbffffff ~2MB [防护页] +起始地址 结束地址 大小 用途 +----------------------------------------------------------------------- +0000000000000000 0000ffffffffffff 256TB 用户空间 +ffff000000000000 ffffffffffffffff 256TB 内核空间 -fffffdfffc000000 fffffdffffffffff 64MB 模块 -fffffe0000000000 ffffffffffffffff 2TB 内核逻辑内存映射 +更详细的内核虚拟内存布局,请参阅内核启动信息。 4KB 页大小的转换表查找: @@ -102,7 +88,7 @@ fffffe0000000000 ffffffffffffffff 2TB 内核逻辑内存映射 | | | | +-> [20:12] L3 索引 | | | +-----------> [29:21] L2 索引 | | +---------------------> [38:30] L1 索引 - | +-------------------------------> [47:39] L0 索引 (未使用) + | +-------------------------------> [47:39] L0 索引 +-------------------------------------------------> [63] TTBR0/1 @@ -115,10 +101,11 @@ fffffe0000000000 ffffffffffffffff 2TB 内核逻辑内存映射 | | | | v | | | | [15:0] 页内偏移 | | | +----------> [28:16] L3 索引 - | | +--------------------------> [41:29] L2 索引 (仅使用 38:29 ) - | +-------------------------------> [47:42] L1 索引 (未使用) + | | +--------------------------> [41:29] L2 索引 + | +-------------------------------> [47:42] L1 索引 +-------------------------------------------------> [63] TTBR0/1 + 当使用 KVM 时, 管理程序(hypervisor)在 EL2 中通过相对内核虚拟地址的 一个固定偏移来映射内核页(内核虚拟地址的高 24 位设为零): |