path: root/libavcodec/x86
Age | Commit message | Author
2014-06-23 | x86/dsputil: remove redundant global motion compensation code | James Almer
The SSE version has been no different than the mmx one since commit a41bf09d Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-22 | x86/audiodsp: move asm code out of dsputil | James Almer
Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-22 | Merge commit '9a9e2f1c8aa4539a261625145e5c1f46a8106ac2' | Michael Niedermayer
* commit '9a9e2f1c8aa4539a261625145e5c1f46a8106ac2':
  dsputil: Split audio operations off into a separate context
Conflicts:
  configure
  libavcodec/takdec.c
  libavcodec/x86/Makefile
  libavcodec/x86/dsputil.asm
  libavcodec/x86/dsputil_init.c
  libavcodec/x86/dsputil_mmx.c
  libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-22 | dsputil: Split audio operations off into a separate context | Diego Biurrun
2014-06-20 | avcodec/x86/rv40dsp_init: fix () in macros | Michael Niedermayer
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 | x86/blockdsp: restore author attribution | James Almer
See commits 649c00c96d7044aed46d70623e47d7434318e6b9 5fecfb7d58a12baf326e99f2d071060f2638d93c 73b02e24604961e49a63ca34203d8f6c56612117 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 | avcodec: add simpleauto idct | Michael Niedermayer
This will pick the "best" idct that is compatible with the simple idct. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 | x86/hevc_idct: fix movd parameter size in DC_ADD_INIT | James Almer
Fixes compilation with NASM x86_64 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 | x86/blockdsp: move asm code out of dsputil | James Almer
Also replace INLINE_<opt> with EXTERNAL_<opt> that were wrongly changed by commit 2b05db4f8102148d013755ac2a7e47f6d79ff7ca Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 | avcodec/x86/lossless_videodsp: Fix size of values read for left/left_top | Michael Niedermayer
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 | Merge commit 'e74433a8e6fc00c8dbde293c97a3e45384c2c1d9' | Michael Niedermayer
* commit 'e74433a8e6fc00c8dbde293c97a3e45384c2c1d9':
  dsputil: Split clear_block*/fill_block* off into a separate context
Conflicts:
  configure
  libavcodec/asvdec.c
  libavcodec/dnxhddec.c
  libavcodec/dnxhdenc.c
  libavcodec/dsputil.h
  libavcodec/eamad.c
  libavcodec/intrax8.c
  libavcodec/mjpegdec.c
  libavcodec/ppc/dsputil_ppc.c
  libavcodec/vc1dec.c
  libavcodec/x86/dsputil_init.c
  libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-18 | dsputil: Split clear_block*/fill_block* off into a separate context | Diego Biurrun
2014-06-17 | avcodec/hevc: new idct + asm | plepere
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-15 | x86util: add and use RSHIFT/LSHIFT macros | Christophe Gisquet
Those macros take a byte number as shift argument, as this argument differs between MMX and SSE2 instructions. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
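The bytes-versus-bits difference behind these macros can be illustrated with a small portable C model. This is only a sketch, not the x86util.asm macros themselves: the function names are hypothetical, and the two registers are modelled with a uint64_t and a 16-byte array. It relies on the fact that the 64-bit MMX shift (psrlq) counts in bits, while the whole-register SSE2 shift (psrldq) counts in bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of a 64-bit MMX-style register. The MMX shift counts in bits,
 * so a byte count has to be multiplied by 8 first. */
static uint64_t rshift_bytes_mmx(uint64_t reg, unsigned bytes)
{
    assert(bytes < 8);
    return reg >> (8 * bytes);
}

/* Model of a 128-bit SSE2-style register stored as 16 little-endian
 * bytes. The whole-register shift already counts in bytes. */
static void rshift_bytes_sse2(uint8_t reg[16], unsigned bytes)
{
    assert(bytes <= 16);
    memmove(reg, reg + bytes, 16 - bytes);   /* move data toward byte 0 */
    memset(reg + 16 - bytes, 0, bytes);      /* zero-fill the top bytes */
}
```

A macro that always takes a byte count lets the same source line serve both register widths, with the multiplication by 8 hidden in the MMX case.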
2014-06-12 | vp9/x86: fix overwrite in ipred_vl_4x4_ssse3. | Ronald S. Bultje
Fixes trac ticket 3717. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-12 | x86: huffyuv: fix {add,diff}_int16 | Christophe Gisquet
They used an extra, undeclared register. Fixes a crash in fate-vsynth3-ffvhuff444p16 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-10 | Merge commit '570d4b21863b6254d6bbca9c528bede471bb4478' | Michael Niedermayer
* commit '570d4b21863b6254d6bbca9c528bede471bb4478':
  x86: h264: Don't keep data in the redzone across function calls on 64 bit unix
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-10 | x86: h264: Don't keep data in the redzone across function calls on 64 bit unix | Martin Storsjö
We know that the called function (ff_chroma_inter_body_mmxext) doesn't touch the redzone, so the data kept there stays intact; thus, this doesn't fix any bug per se.
However, valgrind's memcheck tool intentionally assumes that the redzone is clobbered on every function call and function return (see a long comment in valgrind/memcheck/mc_main.c). This avoids false positives in that tool, at the cost of an extra stack pointer adjustment.
The other alternative would be a valgrind suppression for this issue, but that's an extra burden for everybody that wants to run libavcodec within valgrind.
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-06-09 | avcodec/x86/dct_init: fix build failure with clang && disable-optimizations | Michael Niedermayer
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-09 | x86/dct32: don't build ff_dct32_float_sse on x86_64 | James Almer
There's an SSE2 version already, and technically the SSE version on x86_64 was wrong (using pshufd and pshuflw, SSE2 instructions). Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-08 | x86/vp9: initial AVX2 intra_pred | James Almer
tos3k-vp9-b10000.webm on a Core i5-4200U @1.6GHz
1219 decicycles in ff_vp9_ipred_dc_32x32_ssse3, 131070 runs, 2 skips
439 decicycles in ff_vp9_ipred_dc_32x32_avx2, 131070 runs, 2 skips
3570 decicycles in ff_vp9_ipred_dc_top_32x32_ssse3, 4096 runs, 0 skips
2494 decicycles in ff_vp9_ipred_dc_top_32x32_avx2, 4096 runs, 0 skips
1419 decicycles in ff_vp9_ipred_dc_left_32x32_ssse3, 16384 runs, 0 skips
717 decicycles in ff_vp9_ipred_dc_left_32x32_avx2, 16384 runs, 0 skips
2737 decicycles in ff_vp9_ipred_tm_32x32_avx, 1024 runs, 0 skips
2088 decicycles in ff_vp9_ipred_tm_32x32_avx2, 1024 runs, 0 skips
3090 decicycles in ff_vp9_ipred_v_32x32_avx, 512 runs, 0 skips
2226 decicycles in ff_vp9_ipred_v_32x32_avx2, 512 runs, 0 skips
1565 decicycles in ff_vp9_ipred_h_32x32_avx, 1024 runs, 0 skips
922 decicycles in ff_vp9_ipred_h_32x32_avx2, 1024 runs, 0 skips
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
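To compare each pair of decicycle counts above, one can take the ratio of the older implementation's cost to the AVX2 one's. The helper below is a minimal sketch with a hypothetical name; it is not part of the commit and assumes both measurements were taken under the same conditions.

```c
#include <assert.h>

/* Ratio of the older implementation's cost to the newer one's.
 * Values above 1.0 mean the newer code is faster. */
static double speedup(double old_decicycles, double new_decicycles)
{
    assert(old_decicycles > 0 && new_decicycles > 0);
    return old_decicycles / new_decicycles;
}
```

For instance, speedup(1219, 439) is roughly 2.8 for the dc 32x32 predictor, while speedup(1565, 922) is roughly 1.7 for the horizontal one.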
2014-06-06 | x86/dsputil: move some mmx init code inside dsputil_init_mmx() | James Almer
This reduces differences with the fork Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-05 | apedsp: move to llauddsp | Christophe Gisquet
APE is not the sole codec using scalarproduct_and_madd_int16. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-04 | avcodec/x86/dsputilenc_mmx: fix build without yasm | Michael Niedermayer
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-04 | x86/motion_est: sad_{x, y}2_mmxext functions are bitexact | James Almer
Only the xy2 functions aren't. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-03 | x86: dsputilenc: convert hf_noise*_mmx to yasm | Timothy Gu
Signed-off-by: Timothy Gu <timothygu99@gmail.com> Several bugfixes by: Christophe Gisquet <christophe.gisquet@gmail.com> See: [FFmpeg-devel] [WIP] [PATCH 4/4] x86: dsputilenc: convert hf_noise*_mmx to yasm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-01 | x86: hevc_mc: remove unneeded shift | Christophe Gisquet
The immediate value may be 0. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-01 | x86: hevcdsp_init: fix macro usage | Christophe Gisquet
The macro was not using the parameter but unconditionally using sse4. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
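The class of bug described here, a macro that accepts an instruction-set parameter but hard-codes one variant, can be sketched in C as follows. All names are hypothetical and chosen for illustration only; the actual macro in hevcdsp_init is not reproduced here.

```c
#include <assert.h>

/* Hypothetical per-instruction-set variants; each returns a tag so the
 * caller can tell which one ran. */
static int put_pixels_sse4(void) { return 4; }
static int put_pixels_avx2(void) { return 2; }

/* Buggy form: "opt" is accepted but ignored, so every expansion names
 * the sse4 variant. */
#define CALL_BUGGY(opt) put_pixels_sse4()

/* Fixed form: the parameter is pasted into the function name. */
#define CALL_FIXED(opt) put_pixels_##opt()

static void check_macros(void)
{
    assert(CALL_BUGGY(avx2) == 4); /* sse4 ran although avx2 was requested */
    assert(CALL_FIXED(avx2) == 2);
    assert(CALL_FIXED(sse4) == 4);
}
```

The buggy form compiles cleanly, which is why such a mistake tends to go unnoticed until a second instruction set is actually wired up.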
2014-06-01 | x86/motion_est: enable sad16_sse2 on k10 CPUs | James Almer
The check is meant for k8 CPUs. sad16_sse2 is ~20% faster than sad16_mmxext on k10. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-31 | build: fix compilation of svq1enc_mmx.c with --disable-mmx | James Almer
It's needed for ff_svq1enc_init_x86() even if simd functions are disabled. Alternatively, svq1enc_init.c could be made and the relevant code moved there. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-31 | x86/huffyuvdsp: fix some prototypes | James Almer
Remove duplicate prototypes and fix int -> intptr_t in another Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | x86: huffyuvdsp: fewer functions for x86_64 | Christophe Gisquet
When there are 2 functions that are <= SSE2, only one is needed for x86_64. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
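The reasoning rests on SSE2 being part of the x86_64 baseline, so an MMX variant can never be the best choice on a 64-bit build. A minimal sketch of such a selection follows. The flag values, function names and the is_x86_64 parameter are assumptions made for illustration (the real code decides this at build time), and the "SIMD" variants are portable stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, not FFmpeg's AV_CPU_FLAG_* values. */
#define FLAG_MMX  1u
#define FLAG_SSE2 2u

typedef void (*add_bytes_fn)(uint8_t *dst, const uint8_t *src, ptrdiff_t w);

/* Plain C reference; the stand-ins below reuse it to stay portable. */
static void add_bytes_c(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    for (ptrdiff_t i = 0; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

static void add_bytes_mmx_standin(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    add_bytes_c(dst, src, w);
}

static void add_bytes_sse2_standin(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    add_bytes_c(dst, src, w);
}

static add_bytes_fn select_add_bytes(unsigned flags, int is_x86_64)
{
    add_bytes_fn fn = add_bytes_c;

    /* Every x86_64 CPU has SSE2, so the caller must report it. */
    assert(!is_x86_64 || (flags & FLAG_SSE2));

    /* The MMX variant can only win on 32-bit x86, so a 64-bit build
     * does not need to carry it. */
    if (!is_x86_64 && (flags & FLAG_MMX))
        fn = add_bytes_mmx_standin;
    if (flags & FLAG_SSE2)
        fn = add_bytes_sse2_standin;
    return fn;
}
```

On a 64-bit target the MMX branch is dead code, which is what allows the corresponding function to be left out of the build.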
2014-05-30 | x86: dsputilenc: convert ff_sse{8, 16}_mmx() to yasm | Timothy Gu
Signed-off-by: Timothy Gu <timothygu99@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | x86: dsputilenc: move all the function prototypes together | Timothy Gu
Signed-off-by: Timothy Gu <timothygu99@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | x86: huffyuvdsp: add_hfyu_left_pred_bgr32 | Christophe Gisquet
          C     MMX   SSE2
Cycles:   3092  1053  578
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | avcodec/huffyuvdsp: Change w to intptr in add_hfyu_median_pred() and add_hfyu_left_pred() | Michael Niedermayer
This avoids potential issues with the high 32 bits being random in x86-64 asm. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
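The issue can be modelled in portable C. On x86-64 a 32-bit int argument occupies only the low half of a 64-bit register, and the upper half is not guaranteed to be cleared, so asm that reads the full register may see stale data. The sketch below uses hypothetical function names and simulates the register with a uint64_t; it is not the huffyuvdsp code.

```c
#include <assert.h>
#include <stdint.h>

/* What asm reading the full 64-bit register would see when the C side
 * passed an int: the low half holds w, the upper half holds whatever
 * was left there (modelled by stale_upper). */
static uint64_t asm_view_of_int(uint64_t stale_upper, int w)
{
    assert(stale_upper <= UINT32_MAX);
    return (stale_upper << 32) | (uint32_t)w;
}

/* With a pointer-sized argument the compiler has to set every bit of
 * the register, so the asm sees exactly the intended value. */
static uint64_t asm_view_of_intptr(intptr_t w)
{
    return (uint64_t)(int64_t)w;
}
```

Declaring the width parameter as intptr_t moves the widening into the C caller, so the asm does not need an explicit sign or zero extension.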
2014-05-30 | x86: huffyuvdsp: add SSE2 median prediction | Christophe Gisquet
From 5010 to 4566 cycles on Lagarith YUY2. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | avcodec/x86/qpeldsp_init: Restore author attribution | Michael Niedermayer
See: 368f50359eb328b0b9d67451f56fda20b3255f9a
See: 44eb49512888143905860af2de2932ab002cdbf7, and many others
See:
similarity index 83%
copy from libavcodec/x86/dsputil_init.c
copy to libavcodec/x86/qpeldsp_init.c
index ebbf97f..8f296a1 100644
--- a/libavcodec/x86/dsputil_init.c
+++ b/libavcodec/x86/qpeldsp_init.c
@@ -1,6 +1,5 @@
 /*
- * Copyright (c) 2000, 2001 Fabrice Bellard
- * Copyright (c) 2002-2004 Michael Niedermayer <michaelni@gmx.at>
+ * quarterpel DSP functions
  *
  * This file is part of FFmpeg.
  *
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | Merge commit '368f50359eb328b0b9d67451f56fda20b3255f9a' | Michael Niedermayer
* commit '368f50359eb328b0b9d67451f56fda20b3255f9a':
  dsputil: Split off quarterpel bits into their own context
Conflicts:
  configure
  libavcodec/dsputil.c
  libavcodec/h263dec.c
  libavcodec/mpegvideo.c
  libavcodec/mpegvideo_enc.c
  libavcodec/vc1dec.c
  libavcodec/vc1dsp.c
  libavcodec/x86/dsputil_init.c
  libavcodec/x86/qpeldsp.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | Merge commit '054013a0fc6f2b52c60cee3e051be8cc7f82cef3' | Michael Niedermayer
* commit '054013a0fc6f2b52c60cee3e051be8cc7f82cef3':
  dsputil: Move APE-specific bits into apedsp
Conflicts:
  libavcodec/arm/int_neon.S
  libavcodec/x86/dsputil.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | avcodec/x86/svq1enc_mmx: Add author attribution | Michael Niedermayer
See: 5900637219ccccdd39ddafa4e7181da20b8e1f1b Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 | Merge commit '65d5d5865845f057cc6530a8d0f34db952d9009c' | Michael Niedermayer
* commit '65d5d5865845f057cc6530a8d0f34db952d9009c':
  dsputil: Move SVQ1 encoding specific bits into svq1enc
Conflicts:
  libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 | x86/dsputilenc: add missing guards to ff_pix_sum16_xop | James Almer
XOP support was added in Yasm 1.0.0 and Nasm 2.06, and we still support older versions. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 | x86: huffyuvdsp: port add_bytes to yasm | Christophe Gisquet
          C     MMX   SSE2
Cycles:   2972  587   302
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 | x86: hpeldsp: better factorization | Christophe Gisquet
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 | rename add_hfyu_left_prediction_int16 to add_hfyu_left_pred_int16 | Michael Niedermayer
This makes the naming more consistent with the 8bit variant Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 | rename add_hfyu_median_prediction_int16 to add_hfyu_median_pred_int16 | Michael Niedermayer
This makes the naming more consistent with the 8bit variant Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 | rename sub_hfyu_median_prediction_int16 to sub_hfyu_median_pred_int16 | Michael Niedermayer
This makes the naming more consistent with the 8bit variant Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 | x86/dsputilenc: implement XOP version of pix_sum16 | James Almer
SSE2: 137 cycles
XOP: 87 cycles
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 | dsputil: Split off quarterpel bits into their own context | Diego Biurrun