Some architectures may want to override the default implementation
at compile time to do things inline. For example, ARM uses a
non-standard calling convention for better efficiency in this case.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
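
As an illustration of a compile-time override, here is a minimal sketch of the usual `#ifndef` guard pattern. The names `arch_div64_32` and `generic_div64_32` are hypothetical and are not the kernel's identifiers; an architecture header would be assumed to define `arch_div64_32` before this code is seen.

```c
#include <stdint.h>

/*
 * Hypothetical names for illustration only. If an architecture header has
 * already defined arch_div64_32 (as a macro or inline), the generic
 * fallback below is skipped entirely at compile time.
 */
#ifndef arch_div64_32
static inline uint32_t generic_div64_32(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);	/* remainder is returned */

	*n /= base;				/* quotient is stored in place */
	return rem;
}
#define arch_div64_32 generic_div64_32
#endif
```

Callers only ever use `arch_div64_32()`, so they do not need to know which implementation was selected.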
|
|
The default C implementation for the 128-bit cross product is abstracted
into the __arch_xprod_64() macro that can be overridden to let
architectures provide their own assembly optimized implementation.
There are many advantages to an assembly version for this operation.
Carry bit handling becomes trivial, and 32-bit shifts may be achieved
simply by inverting register pairs on some architectures. This has the
potential to be considerably faster and use far fewer instructions.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
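
For reference, a portable C version of such a cross product might look like the sketch below. It returns only the upper 64 bits of the 128-bit product, built from four 32-by-32 multiplications. The function name `xprod_64_hi` is hypothetical, and the real macro's arguments are not shown in this log, so this should not be read as its actual signature.

```c
#include <stdint.h>

/* Hypothetical helper: upper 64 bits of the 128-bit product m * n. */
static inline uint64_t xprod_64_hi(uint64_t m, uint64_t n)
{
	uint32_t m_lo = (uint32_t)m, m_hi = (uint32_t)(m >> 32);
	uint32_t n_lo = (uint32_t)n, n_hi = (uint32_t)(n >> 32);

	/* Four 32x32->64 partial products. */
	uint64_t lo_lo = (uint64_t)m_lo * n_lo;
	uint64_t lo_hi = (uint64_t)m_lo * n_hi;
	uint64_t hi_lo = (uint64_t)m_hi * n_lo;
	uint64_t hi_hi = (uint64_t)m_hi * n_hi;

	/*
	 * Middle column: three 32-bit values, so the sum cannot overflow
	 * 64 bits. Its upper half is the carry into the high word.
	 */
	uint64_t mid = (lo_lo >> 32) + (uint32_t)lo_hi + (uint32_t)hi_lo;

	return hi_hi + (lo_hi >> 32) + (hi_lo >> 32) + (mid >> 32);
}
```

The explicit carry propagation through `mid` is the part that an assembly version can get for free from the carry flag.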
|
|
64-by-32-bit divisions are prominent in the kernel, even on 32-bit
machines. Luckily, many of them use a constant divisor that allows
for a much faster multiplication by the divisor's reciprocal.
The compiler already performs this optimization when compiling a 32-by-32
division with a constant divisor. Unfortunately, on 32-bit machines, gcc
does not optimize 64-by-32 divisions in that case, except for constant
divisors that happen to be a power of 2.
Let's avoid the slow path whenever the divisor is constant by manually
computing the reciprocal ourselves and performing the multiplication
inline. In most cases, this improves performance of 64-by-32 divisions
by about two orders of magnitude compared to the __div64_32() fallback,
especially on architectures lacking a native div instruction.
The algorithm used here comes from the existing ARM code.
The __div64_const32_is_OK macro can be predefined by architectures to
disable this optimization in some cases. For example, some ancient gcc
versions on ARM crash with an ICE (internal compiler error) when fed this code.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
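
The idea of dividing by multiplying with a precomputed reciprocal can be sketched as follows. This is not the code from the commit: the struct and function names are hypothetical, the reciprocal is computed at run time rather than folded by the compiler, and the sketch assumes a GCC or Clang 64-bit host where `unsigned __int128` and `__builtin_clz` are available. It also assumes the divisor is at least 3 and not a power of two.

```c
#include <stdint.h>

/* Hypothetical types and helpers for illustration; not the kernel's API. */
struct recip32 {
	uint64_t m;	/* scaled reciprocal, about 2^(64 + shift) / b */
	unsigned shift;	/* floor(log2(b)) */
	unsigned bias;	/* 1 if m was rounded down and needs compensation */
};

/* Precompute the reciprocal of b (b >= 3, b not a power of two). */
static struct recip32 recip32_init(uint32_t b)
{
	struct recip32 r;
	unsigned __int128 scale;
	uint64_t rem;

	r.shift = 31 - __builtin_clz(b);
	scale = (unsigned __int128)1 << (64 + r.shift);
	r.m = (uint64_t)(scale / b);	/* rounded down; fits as b > 2^shift */
	rem = (uint64_t)(scale % b);	/* nonzero as b is not a power of two */

	if (b - rem <= ((uint64_t)1 << r.shift)) {
		/* Rounding m up stays exact for every 64-bit dividend. */
		r.m += 1;
		r.bias = 0;
	} else {
		/* Otherwise keep m rounded down and add m once as a bias. */
		r.bias = 1;
	}
	return r;
}

/* n / b computed as (n * m [+ m]) >> (64 + shift). */
static uint64_t recip32_div(uint64_t n, struct recip32 r)
{
	unsigned __int128 prod = (unsigned __int128)n * r.m;

	if (r.bias)
		prod += r.m;
	return (uint64_t)(prod >> (64 + r.shift));
}
```

With a constant divisor, everything in `recip32_init()` can be evaluated at compile time, leaving only the multiplication and shift in the generated code.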
|
|
Let's perform the obvious mask and shift operation in this case.
On 32-bit targets, gcc is able to do the same thing with a constant
divisor that happens to be a power of two, i.e. it turns the division
into an inline shift, but it doesn't hurt to be explicit.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
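
A minimal sketch of the mask and shift operation, assuming a nonzero power-of-two divisor. The helper name `div64_pow2` is hypothetical, and `__builtin_ctz` is a GCC/Clang builtin used here in place of whatever log2 helper the real code uses.

```c
#include <stdint.h>

/*
 * Hypothetical helper: divides *n in place by base and returns the
 * remainder. Assumes base is a nonzero power of two.
 */
static inline uint32_t div64_pow2(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n & (base - 1));	/* mask: remainder */

	*n >>= __builtin_ctz(base);			/* shift: quotient */
	return rem;
}
```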
|
|
Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the type of the divide. Move its definition
to math64.h as currently no architecture overrides the generic implementation.
They can still override it of course, but the duplicated declarations are
avoided.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Here is the current version of the 64 bit divide common code.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
|