path: root/libavutil/intmath.h
Age | Commit message | Author
2016-01-07 | lavu: rename and move ff_parity to av_parity | James Almer
av_popcount is not defined in intmath.h. Reviewed-by: ubitux Signed-off-by: James Almer <jamrial@gmail.com>
2016-01-07 | lavu: add ff_parity() | Clément Bœsch
2015-12-19 | lavu/intmath: add faster clz support | Ganesh Ajjanagadde
This should be useful for the sofalizer filter. Reviewed-by: Kieran Kunhya <kierank@ob-encoder.com> Reviewed-by: Clément Bœsch <u@pkh.me> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-10-22 | avutil/intmath: fix undefined behavior in ff_ctzll_c() | Michael Niedermayer
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-19 | lavu/intmath.h: Move x86 only msvc/icl functions to x86 specific header. | Matt Oliver
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2015-10-14 | avutil/intmath: use de Bruijn based ff_ctz | Ganesh Ajjanagadde
It has already been demonstrated that the de Bruijn method has benefits over the current implementation: commit 971d12b7f9d7be3ca8eb98e6c04ed521f83cbd3c. That commit implemented it for long long, this extends it to the int version. Tested with FATE. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
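The de Bruijn method mentioned above can be sketched for the 32-bit case as follows. This is a minimal illustration, not the code from intmath.h: the function name `ctz32_debruijn` is hypothetical, and the constant 0x077CB531 with its table is the widely published 32-bit de Bruijn pair, assumed here rather than taken from the commit.

```c
#include <stdint.h>

/* Lookup table matching the 32-bit de Bruijn constant 0x077CB531.
 * Assumption: this is the commonly published pair, not necessarily
 * the exact table used in intmath.h. */
static const uint8_t debruijn_ctz32[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Hypothetical name. Counts trailing zero bits of v.
 * The result for v == 0 is not meaningful (this sketch returns 0). */
static inline int ctz32_debruijn(uint32_t v)
{
    /* v & -v isolates the lowest set bit; negating an unsigned value is
     * well defined. Multiplying by the constant shifts a unique 5-bit
     * pattern into the top bits, which then indexes the table. */
    uint32_t lowest = v & (0u - v);
    return debruijn_ctz32[(uint32_t)(lowest * 0x077CB531u) >> 27];
}
```

The whole lookup is one AND, one multiply, one shift and one table read, with no loop or branch on the input value.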
2015-10-11 | intmath: remove av_ctz. | Ronald S. Bultje
It's a non-installed header and only used in one place (flacenc). Since ff_ctz is static inline, it's fine to use that instead.
2015-10-11 | avutil/intmath: Change debruijn_ctz64 to use 8bit elements | Michael Niedermayer
This reduces the memory and cache footprint from 256 to 64 bytes; the code also seems faster with this change. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
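A 64-entry table of 8-bit elements, as described above, might look like the following sketch. The function name `ctz64_debruijn` is hypothetical, and the constant 0x022FDD63CC95386D with its table is assumed to be a suitable 64-bit de Bruijn pair, not confirmed as the exact one in the commit. The lowest set bit is isolated on an unsigned type, which avoids the signed-negation pitfall that a later fix to ff_ctzll_c() appears to address.

```c
#include <stdint.h>

/* 64 one-byte entries: 64 bytes in total, versus 256 bytes if each
 * entry were a 4-byte int. Assumption: table for the de Bruijn
 * constant 0x022FDD63CC95386D. */
static const uint8_t debruijn_ctz64[64] = {
     0,  1,  2, 53,  3,  7, 54, 27,  4, 38, 41,  8, 34, 55, 48, 28,
    62,  5, 39, 46, 44, 42, 22,  9, 24, 35, 59, 56, 49, 18, 29, 11,
    63, 52,  6, 26, 37, 40, 33, 47, 61, 45, 43, 21, 23, 58, 17, 10,
    51, 25, 36, 32, 60, 20, 57, 16, 50, 31, 19, 15, 30, 14, 13, 12
};

/* Hypothetical name. Counts trailing zero bits of a non-zero v. */
static inline int ctz64_debruijn(uint64_t v)
{
    /* Unsigned arithmetic throughout, so the negation cannot overflow. */
    uint64_t lowest = v & (UINT64_C(0) - v);
    return debruijn_ctz64[(lowest * UINT64_C(0x022FDD63CC95386D)) >> 58];
}
```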
2015-10-11 | avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm | Ganesh Ajjanagadde
This uses Stein's binary GCD algorithm (https://en.wikipedia.org/wiki/Binary_GCD_algorithm) to get a roughly 4x speedup over Euclidean GCD on standard architectures with a compiler intrinsic for ctzll, and a roughly 2x speedup otherwise. At the moment, the compiler intrinsic is used on GCC and Clang due to its easy availability.

Quick note regarding overflow: yes, subtractions on int64_t can overflow, but the llabs takes care of that. The llabs is also guaranteed to be safe, with no annoying INT64_MIN business, since INT64_MIN, being a power of 2, is shifted down before being sent to llabs.

The binary GCD needs ff_ctzll, an extension of ff_ctz for long long (int64_t). On GCC, this is provided by a built-in. On Microsoft, there is a BitScanForward64 analog of BitScanForward that should work, but I can't confirm. Apparently it is not available on 32 bit builds, so this may or may not work correctly. On Intel, per the documentation there is only an intrinsic for _bit_scan_forward, and people have posted on forums regarding _bit_scan_forward64, but often their documentation is woeful. Again, I don't have it, so I can't test. As such, to be safe, for now only the GCC/Clang intrinsic is added; the rest use a compiled version based on the de Bruijn method of Leiserson et al: http://supertech.csail.mit.edu/papers/debruijn.pdf.

Tested with FATE. Sample benchmark (x86-64, GCC 5.2.0, Haswell) with a START_TIMER and STOP_TIMER in libavutil/rational.c, followed by a make fate.

aac-am00_88.err:
builtin: 714 decicycles in av_gcd, 4095 runs, 1 skips
de-bruijn: 1440 decicycles in av_gcd, 4096 runs, 0 skips
previous: 2889 decicycles in av_gcd, 4096 runs, 0 skips

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
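For readers unfamiliar with the technique, here is a minimal sketch of Stein's binary GCD driven by a trailing-zero count. It is not the av_gcd implementation: the names `ctz64` and `gcd_binary` are hypothetical, and it works on unsigned operands so the llabs handling discussed in the commit message is not needed.

```c
#include <stdint.h>

/* Hypothetical helper: trailing zero count of a non-zero value.
 * Uses the GCC/Clang builtin when available, a simple loop otherwise. */
static int ctz64(uint64_t v)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_ctzll(v);
#else
    int n = 0;
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
#endif
}

/* Hypothetical name. Stein's binary GCD on unsigned 64-bit operands. */
static uint64_t gcd_binary(uint64_t a, uint64_t b)
{
    int za, zb, k;

    if (a == 0) return b;
    if (b == 0) return a;

    za = ctz64(a);
    zb = ctz64(b);
    k  = za < zb ? za : zb;  /* power of two shared by both operands */
    a >>= za;                /* both operands are odd from here on */
    b >>= zb;

    while (a != b) {
        if (a > b) {         /* keep a <= b */
            uint64_t t = a;
            a = b;
            b = t;
        }
        b -= a;              /* odd - odd: even and non-zero */
        b >>= ctz64(b);      /* strip all factors of two in one step */
    }
    return a << k;
}
```

Each iteration replaces the division of the Euclidean algorithm with a subtraction and a shift, which is where a fast ctz pays off.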
2015-08-22 | doxygen: Remove lavu_internal group | Timothy Gu
There is no use in an internal group for a public API documentation.
2015-07-18 | avutil/intmath: check for ICC before GCC | James Almer
Intel compiler also defines __GNUC__, so the Intel specific intrinsics were not really being used. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>
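The ordering problem described above can be illustrated with a small sketch. The helper `intrinsics_flavor` is hypothetical and only reports which preprocessor branch was taken; the point is that the `__INTEL_COMPILER` test has to come before the `__GNUC__` test, because the Intel compiler defines both.

```c
/* Hypothetical helper: names the branch the preprocessor selected.
 * If the __GNUC__ test came first, the Intel compiler would take the
 * GCC branch and its own intrinsics would never be used. */
static const char *intrinsics_flavor(void)
{
#if defined(__INTEL_COMPILER)
    return "icc";
#elif defined(__GNUC__)
    return "gcc-compatible";
#elif defined(_MSC_VER)
    return "msvc";
#else
    return "generic C";
#endif
}
```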
2015-02-25 | libavutil: add x86 optimized av_popcount | James Almer
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-17 | avutil/intmath: Add () to protect the ff_log2() argument | Michael Niedermayer
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
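Why a macro argument needs its own parentheses can be shown with a small sketch. The macros `OR1_UNSAFE` and `OR1_SAFE` are hypothetical and are not the actual ff_log2() definition; they only assume that the macro body combines its argument with another operator.

```c
/* Hypothetical macros: the unsafe form pastes the argument as-is, so an
 * argument containing a lower-precedence operator is split apart. */
#define OR1_UNSAFE(x) (x | 1)
#define OR1_SAFE(x)   ((x) | 1)

/* Expands to (c ? 4 : 8 | 1), i.e. c ? 4 : (8 | 1). */
static int or1_unsafe_demo(int c)
{
    return OR1_UNSAFE(c ? 4 : 8);
}

/* Expands to ((c ? 4 : 8) | 1), which is what the caller meant. */
static int or1_safe_demo(int c)
{
    return OR1_SAFE(c ? 4 : 8);
}
```

With `c` non-zero the unsafe form yields 4 while the safe form yields 5; the two only agree when the argument happens to bind tighter than the operator inside the macro.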
2014-10-26 | avutil/intmath: enable builtin intrinsics for icl and msvc. | Matthew Oliver
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-31 | intmath.h: Remove duplicated ARM include. | Reimar Döffinger
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-11-05 | Merge commit '5ff998a233d759d0de83ea6f95c383d03d25d88e' | Michael Niedermayer
* commit '5ff998a233d759d0de83ea6f95c383d03d25d88e':
  flacenc: use uint64_t for bit counts
  flacenc: remove wasted trailing 0 bits
  lavu: add av_ctz() for trailing zero bit count
  flacenc: use a separate buffer for byte-swapping for MD5 checksum on big-endian
  fate: aac: Place LATM tests and general AAC tests in different groups
  build: The A64 muxer depends on rawenc.o for ff_raw_write_packet()
Conflicts:
  doc/APIchanges
  libavutil/version.h
  tests/fate/aac.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-05 | lavu: add av_ctz() for trailing zero bit count | Justin Ruggles
2012-10-21 | Merge commit '2d09b36c0379fcda8f984bc8ad8816c8326fd7bd' | Michael Niedermayer
* commit '2d09b36c0379fcda8f984bc8ad8816c8326fd7bd':
  doc/platform: Add info on shared builds with MSVC
  doc/platform: Move a caveat down to the notes section
  ARM: reinstate optimised intmath.h
  ffv1: update to ffv1 version 3
Conflicts:
  doc/platform.texi
  libavcodec/ffv1.c
  libavcodec/ffv1.h
  libavcodec/ffv1dec.c
  libavcodec/ffv1enc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-21 | Merge commit 'd15c21e5fa3961f10026da1a3080a3aa3cf4cec9' | Michael Niedermayer
* commit 'd15c21e5fa3961f10026da1a3080a3aa3cf4cec9':
  avutil: Add a copy of ff_sqrt_tab back into avutil to restore ABI compatibility
  avutil: make some tables visible again
  avutil: remove inline av_log2 from public API
  celp_math: rename ff_log2 to ff_log2_q15
Conflicts:
  libavutil/libavutil.v
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-20 | ARM: reinstate optimised intmath.h | Mans Rullgard
Use of the ARM optimised intmath.h was accidentally dropped in 9734b8b. Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-10-20 | avutil: remove inline av_log2 from public API | Mans Rullgard
This removes inline av_log2 and av_log2_16bit from the public API, instead exporting them as regular functions. In-tree code still gets the inline and otherwise optimised variants. Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-10-12 | Merge commit '9734b8ba56d05e970c353dfd5baafa43fdb08024' | Michael Niedermayer
* commit '9734b8ba56d05e970c353dfd5baafa43fdb08024':
  Move avutil tables only used in libavcodec to libavcodec.
Conflicts:
  libavcodec/mathtables.c
  libavutil/intmath.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-11 | Move avutil tables only used in libavcodec to libavcodec. | Diego Biurrun
2012-08-22 | x86: remove FASTDIV inline asm | Mans Rullgard
GCC 4.3 and later do the right thing with the plain C code. Earlier versions in 32-bit mode generate one extra instruction, needlessly zeroing what would be the high half of the shifted value. At least two gcc configurations miscompile the inline asm in some situations. In 64-bit mode, all gcc versions generate imul r64, r64 followed by shr. On Intel i7 and later, this imul is faster than 32-bit mul. On older Intel and all AMD, it is slightly slower. On Atom it is much slower. Considering where the FASTDIV macro is used, any overall negative performance impact of this change should be negligible. If anyone cares, they should file a bug against gcc and get the instruction selection fixed. Signed-off-by: Mans Rullgard <mans@mansr.com>
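The "plain C code" referred to above is a multiply followed by a shift. The sketch below shows the general idea under stated assumptions: the names `inverse32` and `fastdiv` are hypothetical, the reciprocal is computed on the fly purely for illustration (a real implementation would presumably read it from a precomputed table), and the result is only claimed exact for divisors 2..256 and dividends below 2^24.

```c
#include <stdint.h>

/* Hypothetical helper: ceil(2^32 / b), valid for 2 <= b <= 256.
 * Computed here with a real division only to keep the sketch
 * self-contained. */
static uint32_t inverse32(uint32_t b)
{
    return (uint32_t)(((UINT64_C(1) << 32) + b - 1) / b);
}

/* Hypothetical name. Approximates a / b with a 64-bit multiply and a
 * right shift by 32, the pattern a compiler can turn into a single
 * multiply plus shift. Exact for a < 2^24 and 2 <= b <= 256, since the
 * rounding error of the reciprocal is then too small to reach the
 * integer part. */
static uint32_t fastdiv(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * inverse32(b)) >> 32);
}
```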
2012-08-22 | Merge remote-tracking branch 'qatar/master' | Michael Niedermayer
* qatar/master:
  build: x86: Only compile mpegvideo optimizations when necessary
  configure: Drop fastdiv option
  build: Make the E-AC-3 encoder select the AC-3 encoder
  fate: flac: Only run tests requiring samples when samples are available
Conflicts:
  configure
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-22 | configure: Drop fastdiv option | Diego Biurrun
There is no point in having the user disable any fastdiv macros. Besides, the conditional implementation was broken and only disabled the C implementation, but not the platform-specific assembly versions.
2011-11-23 | Merge remote-tracking branch 'qatar/master' | Michael Niedermayer
* qatar/master: (22 commits)
  aacdec: Fix PS in ADTS.
  avconv: Consistently use PIX_FMT_NONE.
  dsputil: use cpuflags in x86 emu_edge_core
  dsputil: use movups instead of movdqu in ff_emu_edge_core_sse()
  wma: initialize prev_block_len_bits, next_block_len_bits, and block_len_bits.
  mov: Remove some redundant and obsolete comments.
  Add libavutil/mathematics.h #includes for INFINITY
  doxy: structure libavformat groups
  doxy: introduce an empty structure in libavcodec
  doxy: provide a start page and document libavutil
  doxy: cleanup pixfmt.h
  regtest: split video encode/decode tests into individual targets
  ARM: add explicit .arch and .fpu directives to asm.S
  pthread: do not touch has_b_frames
  avconv: cleanup the transcoding loop in output_packet().
  avconv: split subtitle transcoding out of output_packet().
  avconv: split video transcoding out of output_packet().
  avconv: split audio transcoding out of output_packet().
  avconv: reindent.
  avconv: move streamcopy-only code out of decoding loop.
  ...
Conflicts:
  avconv.c
  libavcodec/aaccoder.c
  libavcodec/pthread.c
  libavcodec/version.h
  libavutil/audioconvert.h
  libavutil/avutil.h
  libavutil/mem.h
  tests/ref/vsynth1/dv
  tests/ref/vsynth1/mpeg2thread
  tests/ref/vsynth2/dv
  tests/ref/vsynth2/mpeg2thread
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-11-22 | doxy: provide a start page and document libavutil | Luca Barbato
Introduce a basic layout; the subpages are currently left empty. Split libavutil into multiple groups as an example of the structure.
2011-03-19 | Replace FFmpeg with Libav in licence headers | Mans Rullgard
Signed-off-by: Mans Rullgard <mans@mansr.com>
2010-07-07 | Remove macro duplication between common.h and intmath.h | Måns Rullgård
Originally committed as revision 24086 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-07 | intmath: whitespace cosmetics | Måns Rullgård
Originally committed as revision 24085 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-09 | Fix build on configurations without fast av_log2() | Måns Rullgård
This is a bit hackish. I will try to think of something nicer, but this will do for now. Originally committed as revision 22366 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-08 | Move ff_sqrt() to libavutil/intmath.h | Måns Rullgård
Originally committed as revision 22345 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-19 | Move FASTDIV macro to intmath.h | Måns Rullgård
Originally committed as revision 21335 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-14 | Optimise av_log2 with clz when available | Måns Rullgård
10% faster flac decoding on x86 and ARM. Originally committed as revision 21217 to svn://svn.ffmpeg.org/ffmpeg/trunk
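Computing an integer log2 from a count-leading-zeros instruction can be sketched as below. The function name `log2_clz` is hypothetical, the builtin branch assumes a GCC-compatible compiler with a 32-bit `unsigned int`, and OR-ing in 1 is one common way to keep the builtin away from a zero argument, for which its result is undefined.

```c
#include <stdint.h>

/* Hypothetical name. Returns floor(log2(v)) for v > 0, and 0 for v == 0. */
static int log2_clz(uint32_t v)
{
#if defined(__GNUC__) || defined(__clang__)
    /* The highest set bit sits at position 31 - clz. Setting bit 0 does
     * not change that position for v > 0 and avoids calling
     * __builtin_clz(0). */
    return 31 - __builtin_clz(v | 1);
#else
    /* Portable fallback: shift until the value is exhausted. */
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
#endif
}
```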