# Note that this ChangeLog covers only changes to the release branch 2011-05-18 16:13 Isibaar * xvidcore/dshow/src/CXvidDecoder.cpp, xvidcore/dshow/src/config.c, xvidcore/dshow/src/config.h, xvidcore/dshow/src/resource.h, xvidcore/dshow/src/xvid.ax.rc: Display version number in About box 2011-05-18 12:30 Isibaar * xvidcore/vfw/src/resource.rc: More GUI cosmetics 2011-05-18 09:55 Isibaar * xvidcore/vfw/src/resource.rc: Minor GUI cosmetics 2011-05-18 09:08 Isibaar * xvidcore/src/dct/README.IJG: deleted deprecated README.IJG 2011-05-18 09:07 Isibaar * xvidcore/src/dct/fdct.c, xvidcore/src/dct/fdct.h, xvidcore/src/dct/idct.c, xvidcore/src/dct/idct.h: re-applied new IDCT/DCT patch from trunk 2011-05-18 09:02 Isibaar * xvidcore/dshow/src/CAbout.h, xvidcore/dshow/src/CXvidDecoder.h, xvidcore/dshow/src/IXvidDecoder.h, xvidcore/dshow/src/config.c, xvidcore/dshow/src/config.h, xvidcore/dshow/src/debug.c, xvidcore/dshow/src/debug.h, xvidcore/dshow/src/resource.h, xvidcore/examples/xvid_bench.c, xvidcore/examples/xvid_decraw.c, xvidcore/examples/xvid_encraw.c, xvidcore/src/bitstream/bitstream.c, xvidcore/src/bitstream/bitstream.h, xvidcore/src/bitstream/cbp.c, xvidcore/src/bitstream/cbp.h, xvidcore/src/bitstream/mbcoding.c, xvidcore/src/bitstream/mbcoding.h, xvidcore/src/bitstream/vlc_codes.h, xvidcore/src/bitstream/zigzag.h, xvidcore/src/dct/fdct.c, xvidcore/src/dct/fdct.h, xvidcore/src/dct/idct.c, xvidcore/src/dct/idct.h, xvidcore/src/dct/ppc_asm/idct_altivec.c, xvidcore/src/dct/simple_idct.c, xvidcore/src/decoder.c, xvidcore/src/decoder.h, xvidcore/src/encoder.c, xvidcore/src/encoder.h, xvidcore/src/global.h, xvidcore/src/image/colorspace.c, xvidcore/src/image/colorspace.h, xvidcore/src/image/font.c, xvidcore/src/image/font.h, xvidcore/src/image/image.c, xvidcore/src/image/image.h, 
xvidcore/src/image/interpolate8x8.c, xvidcore/src/image/interpolate8x8.h, xvidcore/src/image/postprocessing.c, xvidcore/src/image/postprocessing.h, xvidcore/src/image/ppc_asm/colorspace_altivec.c, xvidcore/src/image/ppc_asm/interpolate8x8_altivec.c, xvidcore/src/image/ppc_asm/qpel_altivec.c, xvidcore/src/image/qpel.c, xvidcore/src/image/qpel.h, xvidcore/src/image/reduced.c, xvidcore/src/image/reduced.h, xvidcore/src/motion/estimation.h, xvidcore/src/motion/estimation_bvop.c, xvidcore/src/motion/estimation_common.c, xvidcore/src/motion/estimation_gmc.c, xvidcore/src/motion/estimation_pvop.c, xvidcore/src/motion/estimation_rd_based.c, xvidcore/src/motion/estimation_rd_based_bvop.c, xvidcore/src/motion/gmc.c, xvidcore/src/motion/gmc.h, xvidcore/src/motion/motion.h, xvidcore/src/motion/motion_comp.c, xvidcore/src/motion/motion_inlines.h, xvidcore/src/motion/motion_smp.h, xvidcore/src/motion/ppc_asm/sad_altivec.c, xvidcore/src/motion/sad.c, xvidcore/src/motion/sad.h, xvidcore/src/motion/vop_type_decision.c, xvidcore/src/plugins/plugin_2pass1.c, xvidcore/src/plugins/plugin_2pass2.c, xvidcore/src/plugins/plugin_dump.c, xvidcore/src/plugins/plugin_lumimasking.c, xvidcore/src/plugins/plugin_psnr.c, xvidcore/src/plugins/plugin_psnrhvsm.c, xvidcore/src/plugins/plugin_single.c, xvidcore/src/plugins/plugin_ssim.c, xvidcore/src/plugins/plugin_ssim.h, xvidcore/src/portab.h, xvidcore/src/prediction/mbprediction.c, xvidcore/src/prediction/mbprediction.h, xvidcore/src/quant/ppc_asm/quant_h263_altivec.c, xvidcore/src/quant/ppc_asm/quant_mpeg_altivec.c, xvidcore/src/quant/quant.h, xvidcore/src/quant/quant_h263.c, xvidcore/src/quant/quant_matrix.c, xvidcore/src/quant/quant_matrix.h, xvidcore/src/quant/quant_mpeg.c, xvidcore/src/utils/emms.c, xvidcore/src/utils/emms.h, xvidcore/src/utils/mbfunctions.h, xvidcore/src/utils/mbtransquant.c, xvidcore/src/utils/mem_align.c, xvidcore/src/utils/mem_align.h, xvidcore/src/utils/mem_transfer.c, xvidcore/src/utils/mem_transfer.h, 
xvidcore/src/utils/ppc_asm/altivec_trigger.c, xvidcore/src/utils/ppc_asm/mem_transfer_altivec.c, xvidcore/src/utils/timer.c, xvidcore/src/utils/timer.h, xvidcore/src/xvid.c, xvidcore/src/xvid.h, xvidcore/vfw/src/codec.c, xvidcore/vfw/src/codec.h, xvidcore/vfw/src/config.c, xvidcore/vfw/src/config.h, xvidcore/vfw/src/debug.h, xvidcore/vfw/src/driverproc.c, xvidcore/vfw/src/resource.h, xvidcore/vfw/src/status.c, xvidcore/vfw/src/status.h, xvidcore/vfw/src/vfwext.h, xvidcore/vfw/src/w32api/vfw.h: enabled auto-props property 2011-05-18 08:51 Isibaar * xvidcore/src/dct/fdct.c, xvidcore/src/dct/fdct.h, xvidcore/src/dct/idct.c, xvidcore/src/dct/idct.h: backported new DCT/IDCT C-implementations from trunk 2011-05-18 08:06 Isibaar * xvidcore/build/generic/Makefile: make info 2011-05-18 07:59 Isibaar * xvidcore/build/generic/configure.in: Increased version number to 1.3.2 2011-05-18 07:38 Isibaar * xvidcore/src/xvid.c, xvidcore/src/xvid.h: Pump up version number to 1.3.2 2011-05-16 10:09 Isibaar * xvidcore/build/win32/libxvidcore.sln: - Fixed issue with CR/LF 2011-05-16 09:38 Isibaar * xvidcore/debian: - Removed debian directory from release branch 2011-04-07 19:07 Isibaar * xvidcore/build/generic/configure.in, xvidcore/src/encoder.c, xvidcore/src/image/postprocessing.c: switchable pthread (backported from HEAD) 2011-03-21 16:00 Isibaar * xvidcore/src/image/image.c: add brackets to avoid ambiguity 2011-03-21 14:25 Isibaar * xvidcore/dshow/dshow.vcproj: switched back to LIBCMT runtime 2011-03-18 21:16 Isibaar * xvidcore/dshow/src/xvid.ico: icon with darker blue 2011-03-17 15:52 Isibaar * xvidcore/ChangeLog: Updated changelog --------------------- Date: 2011/03/17 16:13:25 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: Don't flip RGB output for MFT decoder Members: dshow/src/CXvidDecoder.cpp:1.25->1.25.2.4 dshow/src/CXvidDecoder.h:1.9->1.9.2.2 --------------------- Date: 2011/03/10 16:27:57 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: increase 
version number to 1.3.1 Members: build/generic/configure.in:1.33.2.3->1.33.2.4 debian/changelog:1.3->1.3.2.2 --------------------- Date: 2011/03/08 22:07:00 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: ramp up version number to 1.3.1 Members: src/xvid.c:1.85.2.2->1.85.2.3 src/xvid.h:1.74.2.3->1.74.2.4 --------------------- Date: 2011/03/08 20:18:34 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: fixed padding regression for input widths/heights not multiple of 16 Members: src/encoder.c:1.135.2.4->1.135.2.5 src/xvid.h:1.74.2.2->1.74.2.3 src/image/image.c:1.46.2.1->1.46.2.2 --------------------- Date: 2011/02/25 14:15:35 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: fixed syntax error for pthread check - thanks to Fabian Greffrath Members: build/generic/configure.in:1.33->1.33.2.3 --------------------- Date: 2011/02/25 13:40:25 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: noexec stack check not only for "elf" but also "elf32" - thanks to Fabian Greffrath Members: src/nasm.inc:1.7->1.7.2.2 --------------------- Date: 2011/02/16 20:04:39 Author: Isibaar Branch: release-1_3-branch Tag: release-1_3_0 Log: decoder support for lower case FourCCs (from Jawor's patch) GUI cosmetics (from Jawor's patch) Members: vfw/src/codec.c:1.30.2.2->1.30.2.3 vfw/src/codec.h:1.7->1.7.2.1 vfw/src/config.c:1.45->1.45.2.2 vfw/src/resource.rc:1.30.2.1->1.30.2.2 --------------------- Date: 2011/02/14 18:26:20 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: updated changelog for 1_3 branch Members: ChangeLog:1.17->1.17.2.1 --------------------- Date: 2011/02/14 18:21:00 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: Increased version number Members: src/xvid.c:1.85->1.85.2.2 src/xvid.h:1.74->1.74.2.2 --------------------- Date: 2011/02/14 17:58:54 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: Support for additional third-party FourCCs (based on Jawor's patch with some small fixes) Members: 
dshow/src/CXvidDecoder.cpp:1.25.2.2->1.25.2.3 dshow/src/CXvidDecoder.h:1.9->1.9.2.1 dshow/src/config.c:1.12->1.12.2.1 dshow/src/config.h:1.8->1.8.2.1 dshow/src/resource.h:1.5->1.5.2.1 dshow/src/xvid.ax.rc:1.8->1.8.2.1 --------------------- Date: 2011/02/03 16:12:34 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: skip possible only for P_VOP (Jawor) Members: src/encoder.c:1.135.2.3->1.135.2.4 --------------------- Date: 2011/02/03 16:01:06 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: auto framerate detection if possible, some other minor cosmetics (derived from Jawor's patches) Members: examples/xvid_encraw.c:1.46.2.2->1.46.2.3 --------------------- Date: 2011/01/27 14:18:13 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: Increased filter merit Members: dshow/src/CXvidDecoder.cpp:1.25.2.1->1.25.2.2 --------------------- Date: 2011/01/27 14:13:16 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: Ensure that colorspace pass-through is enabled really just for FOURCC_YV12 Members: vfw/src/codec.c:1.30.2.1->1.30.2.2 --------------------- Date: 2011/01/11 12:37:52 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: don't hardcode "ar" tool (thanks to Bin Tian) Members: build/generic/Makefile:1.18->1.18.2.1 build/generic/configure.in:1.33.2.1->1.33.2.2 build/generic/platform.inc.in:1.8->1.8.2.1 --------------------- Date: 2011/01/09 14:20:50 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: update for macho64 Members: src/nasm.inc:1.7->1.7.2.1 --------------------- Date: 2011/01/06 15:12:29 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: patch for darwin 64-bit target (thanks to Bin Tian) Members: build/generic/configure.in:1.33->1.33.2.1 --------------------- Date: 2011/01/03 09:31:22 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: removed absolute logo path (thanks to Brendan Brewster) Members: vfw/src/resource.rc:1.30->1.30.2.1 --------------------- Date: 2010/12/31 11:20:22 
Author: Isibaar Branch: #CVSPS_NO_BRANCH Tag: rc1_1_3_0 Log: fix some typo Members: src/encoder.c:1.135.2.2->1.135.2.3 --------------------- Date: 2010/12/30 23:59:31 Author: Isibaar Branch: #CVSPS_NO_BRANCH Tag: (none) Log: CVS corruption again Members: vfw/src/hd1080_40.ico:1.1->1.1.2.1 vfw/src/hd720_40.ico:1.1->1.1.2.1 vfw/src/home_40.ico:1.1->1.1.2.1 vfw/src/mobile_40.ico:1.1->1.1.2.1 --------------------- Date: 2010/12/30 23:07:43 Author: Isibaar Branch: #CVSPS_NO_BRANCH Tag: (none) Log: changed num_slice determination logic Members: vfw/src/codec.c:1.30->1.30.2.1 vfw/src/config.c:1.45->1.45.2.1 vfw/src/resource.h:1.15->1.15.2.1 --------------------- Date: 2010/12/30 12:46:58 Author: Isibaar Branch: #CVSPS_NO_BRANCH Tag: (none) Log: more bug squashing Members: examples/xvid_encraw.c:1.46.2.1->1.46.2.2 src/image/image.c:1.46->1.46.2.1 --------------------- Date: 2010/12/29 23:29:51 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: bug fixing... Members: src/encoder.c:1.135.2.1->1.135.2.2 --------------------- Date: 2010/12/29 23:29:44 Author: Isibaar Branch: #CVSPS_NO_BRANCH Tag: (none) Log: bug fixing... 
Members: src/motion/estimation_bvop.c:1.28->1.28.2.1 src/motion/estimation_rd_based.c:1.16->1.16.2.1 --------------------- Date: 2010/12/28 20:19:57 Author: Isibaar Branch: #CVSPS_NO_BRANCH Tag: (none) Log: get rid of compiler warnings Members: build/win32/libxvidcore.vcproj:1.5->1.5.2.1 build/win32/xvid_decraw.vcproj:1.3->1.3.2.1 build/win32/xvid_encraw.vcproj:1.4->1.4.2.1 dshow/dshow.vcproj:1.3->1.3.2.1 dshow/src/CXvidDecoder.cpp:1.25->1.25.2.1 examples/xvid_decraw.c:1.28->1.28.2.1 src/decoder.c:1.86->1.86.2.1 src/bitstream/bitstream.c:1.60->1.60.2.1 src/bitstream/bitstream.h:1.25->1.25.2.1 src/motion/estimation_pvop.c:1.24->1.24.2.1 src/motion/motion.h:1.27->1.27.2.1 src/motion/motion_comp.c:1.24->1.24.2.1 src/motion/sad.c:1.17->1.17.2.1 src/motion/sad.h:1.25->1.25.2.1 src/plugins/plugin_2pass2.c:1.10->1.10.2.1 src/plugins/plugin_single.c:1.4->1.4.2.1 src/quant/quant_matrix.c:1.16->1.16.4.1 vfw/vfw.vcproj:1.1->1.1.4.1 --------------------- Date: 2010/12/28 20:19:57 Author: Isibaar Branch: release-1_3-branch Tag: (none) Log: get rid of compiler warnings Members: examples/xvid_encraw.c:1.46->1.46.2.1 src/encoder.c:1.135->1.135.2.1 --------------------- Date: 2010/12/28 17:34:55 Author: Isibaar Branch: #CVSPS_NO_BRANCH Tag: (none) Log: updated readme and debian files Members: debian/changelog:1.3->1.3.2.1 debian/copyright:1.1->1.1.4.1 doc/INSTALL:1.3->1.3.2.1 doc/README:1.5->1.5.2.1 examples/README:1.3->1.3.6.1 --------------------- Date: 2010/12/28 17:04:31 Author: Isibaar Branch: #CVSPS_NO_BRANCH Tag: (none) Log: set version info to xvid-1.3.0-rc1 Members: src/xvid.c:1.85->1.85.2.1 src/xvid.h:1.74->1.74.2.1 --------------------- Date: 2010/12/27 17:39:31 Author: Isibaar Branch: HEAD Tag: tag-branching-1_3_0 Log: updated changelog Members: ChangeLog:INITIAL->1.17 TODO:1.8->1.9 --------------------- Date: 2010/12/27 17:11:05 Author: Isibaar Branch: HEAD Tag: (none) Log: auto slice determination Members: vfw/src/codec.c:1.29->1.30 vfw/src/config.c:1.44->1.45 
vfw/src/resource.rc:1.29->1.30 --------------------- Date: 2010/12/24 14:49:58 Author: Isibaar Branch: HEAD Tag: (none) Log: improved decoder robustness upon resync markers in BVOP Members: src/decoder.c:1.85->1.86 --------------------- Date: 2010/12/24 14:31:31 Author: Isibaar Branch: HEAD Tag: (none) Log: update linker definitions (psnrhvsm plugin) Members: build/generic/libxvidcore.ld:1.2->1.3 --------------------- Date: 2010/12/24 14:20:07 Author: Isibaar Branch: HEAD Tag: (none) Log: slice coding: workaround some third-party decoder bugs Members: src/encoder.c:1.134->1.135 src/xvid.h:1.73->1.74 src/motion/estimation.h:1.15->1.16 src/motion/estimation_bvop.c:1.27->1.28 src/motion/estimation_rd_based_bvop.c:1.11->1.12 src/motion/motion.h:1.26->1.27 --------------------- Date: 2010/12/22 17:52:52 Author: Isibaar Branch: HEAD Tag: (none) Log: update Members: AUTHORS:1.5->1.6 CodingStyle:1.3->1.4 README:1.2->1.3 --------------------- Date: 2010/12/22 17:52:12 Author: Isibaar Branch: HEAD Tag: (none) Log: slice coding GUI element Members: vfw/src/codec.c:1.28->1.29 vfw/src/config.c:1.43->1.44 vfw/src/config.h:1.16->1.17 vfw/src/resource.h:1.14->1.15 vfw/src/resource.rc:1.28->1.29 --------------------- Date: 2010/12/22 16:21:13 Author: Isibaar Branch: HEAD Tag: (none) Log: read cpu_flags and num_threads from registry Members: dshow/src/CXvidDecoder.cpp:1.24->1.25 dshow/src/config.c:1.11->1.12 dshow/src/config.h:1.7->1.8 --------------------- Date: 2010/12/21 21:23:06 Author: Isibaar Branch: HEAD Tag: (none) Log: some bugfixing... 
Members: vfw/src/config.c:1.42->1.43 vfw/src/config.h:1.15->1.16 --------------------- Date: 2010/12/21 17:56:42 Author: Isibaar Branch: HEAD Tag: (none) Log: moved num_threads control to common tab VHQ metric combobox Members: vfw/src/codec.c:1.27->1.28 vfw/src/config.c:1.41->1.42 vfw/src/config.h:1.14->1.15 vfw/src/resource.h:1.13->1.14 vfw/src/resource.rc:1.27->1.28 --------------------- Date: 2010/12/18 17:02:00 Author: Isibaar Branch: HEAD Tag: (none) Log: refactorized encoder multi-threading introduced slice-based encoding Members: examples/xvid_encraw.c:1.45->1.46 src/encoder.c:1.133->1.134 src/encoder.h:1.33->1.34 src/xvid.h:1.72->1.73 src/bitstream/bitstream.c:1.59->1.60 src/bitstream/bitstream.h:1.24->1.25 src/motion/estimation.h:1.14->1.15 src/motion/estimation_bvop.c:1.26->1.27 src/motion/estimation_gmc.c:1.5->1.6 src/motion/estimation_pvop.c:1.23->1.24 src/motion/estimation_rd_based.c:1.15->1.16 src/motion/motion.h:1.25->1.26 src/motion/motion_comp.c:1.23->1.24 src/motion/motion_smp.h:1.7->1.8 src/prediction/mbprediction.c:1.19->1.20 src/prediction/mbprediction.h:1.25->1.26 --------------------- Date: 2010/12/18 11:16:46 Author: Isibaar Branch: HEAD Tag: (none) Log: set decoder threads Members: dshow/src/CXvidDecoder.cpp:1.23->1.24 dshow/src/config.c:1.10->1.11 dshow/src/config.h:1.6->1.7 examples/xvid_decraw.c:1.27->1.28 --------------------- Date: 2010/12/18 11:13:30 Author: Isibaar Branch: HEAD Tag: (none) Log: MT deblocker Members: src/decoder.c:1.84->1.85 src/decoder.h:1.19->1.20 src/xvid.h:1.71->1.72 src/image/postprocessing.c:1.5->1.6 src/image/postprocessing.h:1.6->1.7 --------------------- Date: 2010/12/02 07:46:07 Author: Isibaar Branch: HEAD Tag: (none) Log: some cleanup of vfw code, increased allowed max bitrate, profile for HD 1080 Members: vfw/src/codec.c:1.26->1.27 vfw/src/codec.h:1.6->1.7 vfw/src/config.c:1.40->1.41 vfw/src/config.h:1.13->1.14 vfw/src/debug.h:1.2->1.3 vfw/src/driverproc.c:1.11->1.12 vfw/src/hd1080_40.ico:INITIAL->1.1 
vfw/src/hd720_40.ico:INITIAL->1.1 vfw/src/home_40.ico:INITIAL->1.1 vfw/src/mobile_40.ico:INITIAL->1.1 vfw/src/resource.h:1.12->1.13 vfw/src/resource.rc:1.26->1.27 vfw/src/status.c:1.4->1.5 vfw/src/status.h:1.2->1.3 vfw/src/vfwext.h:1.2->1.3 vfw/src/xvid.ico:1.2->1.3 --------------------- Date: 2010/11/28 16:18:21 Author: Isibaar Branch: HEAD Tag: (none) Log: PSNRHVSM R-D optimization Members: examples/xvid_encraw.c:1.44->1.45 src/encoder.c:1.132->1.133 src/global.h:1.26->1.27 src/xvid.c:1.84->1.85 src/xvid.h:1.70->1.71 src/image/image.c:1.45->1.46 src/image/image.h:1.17->1.18 src/image/x86_asm/qpel_mmx.asm:1.12->1.13 src/motion/estimation.h:1.13->1.14 src/motion/estimation_bvop.c:1.25->1.26 src/motion/estimation_pvop.c:1.22->1.23 src/motion/estimation_rd_based.c:1.14->1.15 src/motion/estimation_rd_based_bvop.c:1.10->1.11 src/motion/sad.c:1.16->1.17 src/motion/sad.h:1.24->1.25 src/motion/x86_asm/sad_sse2.asm:1.20->1.21 src/plugins/plugin_psnrhvsm.c:1.3->1.4 src/utils/mbtransquant.c:1.32->1.33 --------------------- Date: 2010/11/23 12:00:35 Author: Isibaar Branch: HEAD Tag: (none) Log: Changed semantics of frame_drop_ratio: "0" will not produce any N_VOPs. Members: src/encoder.c:1.131->1.132 --------------------- Date: 2010/11/16 15:58:42 Author: Isibaar Branch: HEAD Tag: (none) Log: Had accidentally overwritten the VS 2005 project files by VS 2008 ones -> Restore previous version. 
Members: build/win32/libxvidcore.sln:1.3->1.4 build/win32/libxvidcore.vcproj:1.4->1.5 build/win32/xvid_decraw.vcproj:1.2->1.3 build/win32/xvid_encraw.vcproj:1.3->1.4 --------------------- Date: 2010/11/16 15:42:07 Author: Isibaar Branch: HEAD Tag: (none) Log: Clean-up for vbv_peakrate handling (many thanks to Lasse Collin) Members: build/win32/libxvidcore.sln:1.2->1.3 build/win32/libxvidcore.vcproj:1.3->1.4 build/win32/xvid_decraw.vcproj:1.1->1.2 build/win32/xvid_encraw.vcproj:1.2->1.3 doc/INSTALL:1.2->1.3 doc/README:1.4->1.5 examples/xvid_encraw.c:1.43->1.44 src/xvid.h:1.69->1.70 src/plugins/plugin_2pass2.c:1.9->1.10 vfw/src/codec.c:1.25->1.26 --------------------- Date: 2010/11/12 11:10:40 Author: Isibaar Branch: HEAD Tag: (none) Log: fix for typo on pred mv init (thanks to Lasse Collin) Members: src/decoder.c:1.83->1.84 --------------------- Date: 2010/11/10 22:25:16 Author: Isibaar Branch: HEAD Tag: (none) Log: psnrhvsm for u/v planes too Members: src/plugins/plugin_psnrhvsm.c:1.2->1.3 --------------------- Date: 2010/11/08 21:20:39 Author: Isibaar Branch: HEAD Tag: (none) Log: fixed some bugs (possible overflow, mainly) Members: src/plugins/plugin_psnrhvsm.c:1.1->1.2 --------------------- Date: 2010/10/29 18:39:07 Author: Isibaar Branch: HEAD Tag: (none) Log: don't use tray icon and MFT by default Members: dshow/src/CXvidDecoder.cpp:1.22->1.23 --------------------- Date: 2010/10/29 16:33:39 Author: Isibaar Branch: HEAD Tag: (none) Log: tray icon update Members: dshow/src/CXvidDecoder.cpp:1.21->1.22 dshow/src/xvid.ico:1.1->1.2 --------------------- Date: 2010/10/24 10:50:54 Author: Isibaar Branch: HEAD Tag: (none) Log: forgot to add new plugin_psnrhvsm.c source file to unix-style build environment... 
Members: build/generic/sources.inc:1.15->1.16 --------------------- Date: 2010/10/17 20:36:12 Author: Isibaar Branch: HEAD Tag: (none) Log: fixed typo Members: dshow/src/CXvidDecoder.cpp:1.20->1.21 --------------------- Date: 2010/10/17 20:31:46 Author: Isibaar Branch: HEAD Tag: (none) Log: MFT decoder Members: dshow/dshow.vcproj:1.2->1.3 dshow/src/CXvidDecoder.cpp:1.19->1.20 dshow/src/CXvidDecoder.h:1.8->1.9 --------------------- Date: 2010/10/17 19:46:43 Author: Isibaar Branch: HEAD Tag: (none) Log: XVID_GBL_CONVERT: generic colorspace conversion from XVID_CSP_INTERNAL Members: src/xvid.c:1.83->1.84 --------------------- Date: 2010/10/16 14:20:30 Author: Isibaar Branch: HEAD Tag: (none) Log: tray icon Members: dshow/dshow.vcproj:1.1->1.2 dshow/src/CXvidDecoder.cpp:1.18->1.19 dshow/src/CXvidDecoder.h:1.7->1.8 dshow/src/Configure.cpp:1.6->1.7 dshow/src/debug.c:1.1->1.2 dshow/src/resource.h:1.4->1.5 dshow/src/xvid.ax.rc:1.7->1.8 dshow/src/xvid.ico:INITIAL->1.1 --------------------- Date: 2010/10/15 18:20:48 Author: Isibaar Branch: HEAD Tag: (none) Log: table update Members: src/bitstream/mbcoding.c:1.58->1.59 --------------------- Date: 2010/10/10 21:19:46 Author: Isibaar Branch: HEAD Tag: (none) Log: PSNR-HVS-M quality metric Members: build/generic/libxvidcore.def:1.6->1.7 build/win32/libxvidcore.vcproj:1.2->1.3 examples/xvid_encraw.c:1.42->1.43 src/xvid.h:1.68->1.69 src/plugins/plugin_psnrhvsm.c:INITIAL->1.1 --------------------- Date: 2010/09/13 09:38:09 Author: Isibaar Branch: HEAD Tag: (none) Log: define additional simple profile levels Members: src/encoder.h:1.32->1.33 src/global.h:1.25->1.26 src/xvid.h:1.67->1.68 --------------------- Date: 2010/08/23 16:58:48 Author: Isibaar Branch: HEAD Tag: (none) Log: Added new simple profile levels to GUI (patch by Carl Eric Codere) Members: vfw/src/config.c:1.39->1.40 --------------------- Date: 2010/08/10 17:00:06 Author: Isibaar Branch: HEAD Tag: (none) Log: decoder: better distinguish between xvid and non-xvid 
streams Members: src/decoder.c:1.82->1.83 src/bitstream/bitstream.c:1.58->1.59 src/image/image.c:1.44->1.45 src/prediction/mbprediction.c:1.18->1.19 --------------------- Date: 2010/08/10 16:17:23 Author: Isibaar Branch: HEAD Tag: (none) Log: API change: signal fourcc to xvidcore Members: dshow/src/CXvidDecoder.cpp:1.17->1.18 src/decoder.c:1.81->1.82 src/xvid.h:1.66->1.67 vfw/src/codec.c:1.24->1.25 --------------------- Date: 2010/06/07 09:03:37 Author: Isibaar Branch: HEAD Tag: (none) Log: patch for yasm >= 1.0 by Takashi Mochizuki Members: build/generic/configure.in:1.32->1.33 --------------------- Date: 2010/05/10 15:50:46 Author: Isibaar Branch: HEAD Tag: (none) Log: fix for handle leak problem reported by Chris Korda Members: vfw/src/codec.c:1.23->1.24 --------------------- Date: 2010/04/01 14:16:48 Author: Isibaar Branch: HEAD Tag: (none) Log: fixed rounding issue for app-level multi-threading Members: examples/xvid_encraw.c:1.41->1.42 --------------------- Date: 2010/03/09 17:25:17 Author: Isibaar Branch: HEAD Tag: (none) Log: fixed multithreaded AVI input (hopefully) Members: examples/xvid_encraw.c:1.40->1.41 --------------------- Date: 2010/03/09 15:56:02 Author: Isibaar Branch: HEAD Tag: (none) Log: typo with sequence splitting Members: examples/xvid_encraw.c:1.39->1.40 --------------------- Date: 2010/03/09 11:00:14 Author: Isibaar Branch: HEAD Tag: (none) Log: app-level multi-threading for xvid_encraw Members: build/win32/xvid_encraw.vcproj:1.1->1.2 examples/xvid_encraw.c:1.38->1.39 src/decoder.h:1.18->1.19 src/encoder.c:1.130->1.131 src/portab.h:1.59->1.60 src/xvid.c:1.82->1.83 src/xvid.h:1.65->1.66 src/dct/simple_idct.c:1.5->1.6 src/image/reduced.c:1.4->1.5 src/image/x86_asm/deintl_sse.asm:1.6->1.7 src/image/x86_asm/gmc_mmx.asm:1.11->1.12 src/image/x86_asm/postprocessing_mmx.asm:1.13->1.14 src/image/x86_asm/postprocessing_sse2.asm:1.16->1.17 src/image/x86_asm/qpel_mmx.asm:1.11->1.12 src/image/x86_asm/reduced_mmx.asm:1.12->1.13 
src/motion/motion_smp.h:1.6->1.7 src/plugins/plugin_2pass1.c:1.3->1.4 src/plugins/plugin_2pass2.c:1.8->1.9 src/plugins/plugin_dump.c:1.3->1.4 src/plugins/plugin_lumimasking.c:1.8->1.9 src/plugins/plugin_psnr.c:1.2->1.3 src/plugins/plugin_single.c:1.3->1.4 --------------------- Date: 2010/03/09 10:20:05 Author: Isibaar Branch: HEAD Tag: (none) Log: added option for postprocessing Members: examples/xvid_decraw.c:1.26->1.27 --------------------- Date: 2010/01/08 11:03:09 Author: Isibaar Branch: HEAD Tag: (none) Log: bugfix for new -f yuv option Members: examples/xvid_decraw.c:1.25->1.26 --------------------- Date: 2010/01/05 10:25:19 Author: Isibaar Branch: HEAD Tag: (none) Log: added option for raw yuv output format Members: examples/xvid_decraw.c:1.24->1.25 --------------------- Date: 2009/11/10 15:06:58 Author: Isibaar Branch: HEAD Tag: (none) Log: skip mv_bits assert in _DEBUG mode Members: src/bitstream/mbcoding.c:1.57->1.58 --------------------- Date: 2009/10/05 11:55:46 Author: Isibaar Branch: HEAD Tag: (none) Log: Removed inner nested AC_CHECK_LIB test for pthread_join (autoconf-2.64 compatibility) Members: build/generic/configure.in:1.31->1.32 --------------------- Date: 2009/09/16 19:07:58 Author: Isibaar Branch: HEAD Tag: (none) Log: no_exec stack patch for x86_64 too by Michal Schmidt (mschmidt at redhat dot com) Members: src/nasm.inc:1.6->1.7 src/bitstream/x86_asm/cbp_mmx.asm:1.18->1.19 src/bitstream/x86_asm/cbp_sse2.asm:1.13->1.14 src/dct/x86_asm/fdct_mmx_ffmpeg.asm:1.9->1.10 src/dct/x86_asm/fdct_mmx_skal.asm:1.11->1.12 src/dct/x86_asm/fdct_sse2_skal.asm:1.14->1.15 src/dct/x86_asm/idct_3dne.asm:1.10->1.11 src/dct/x86_asm/idct_mmx.asm:1.14->1.15 src/dct/x86_asm/idct_sse2_dmitry.asm:1.10->1.11 src/image/x86_asm/colorspace_rgb_mmx.asm:1.12->1.13 src/image/x86_asm/colorspace_yuv_mmx.asm:1.14->1.15 src/image/x86_asm/colorspace_yuyv_mmx.asm:1.11->1.12 src/image/x86_asm/deintl_sse.asm:1.5->1.6 src/image/x86_asm/gmc_mmx.asm:1.10->1.11 
src/image/x86_asm/interpolate8x8_3dn.asm:1.13->1.14 src/image/x86_asm/interpolate8x8_3dne.asm:1.13->1.14 src/image/x86_asm/interpolate8x8_mmx.asm:1.24->1.25 src/image/x86_asm/interpolate8x8_xmm.asm:1.14->1.15 src/image/x86_asm/postprocessing_mmx.asm:1.12->1.13 src/image/x86_asm/postprocessing_sse2.asm:1.15->1.16 src/image/x86_asm/qpel_mmx.asm:1.10->1.11 src/image/x86_asm/reduced_mmx.asm:1.11->1.12 src/motion/x86_asm/sad_3dn.asm:1.13->1.14 src/motion/x86_asm/sad_3dne.asm:1.11->1.12 src/motion/x86_asm/sad_mmx.asm:1.21->1.22 src/motion/x86_asm/sad_sse2.asm:1.19->1.20 src/motion/x86_asm/sad_xmm.asm:1.14->1.15 src/plugins/x86_asm/plugin_ssim-a.asm:1.12->1.13 src/quant/x86_asm/quantize_h263_3dne.asm:1.11->1.12 src/quant/x86_asm/quantize_h263_mmx.asm:1.15->1.16 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.15->1.16 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.12->1.13 src/utils/x86_asm/cpuid.asm:1.18->1.19 src/utils/x86_asm/interlacing_mmx.asm:1.11->1.12 src/utils/x86_asm/mem_transfer_3dne.asm:1.12->1.13 src/utils/x86_asm/mem_transfer_mmx.asm:1.21->1.22 --------------------- Date: 2009/07/01 11:25:38 Author: Isibaar Branch: HEAD Tag: (none) Log: Additional sanity check when reading stats file Members: src/plugins/plugin_2pass2.c:1.7->1.8 --------------------- Date: 2009/06/09 09:48:57 Author: Isibaar Branch: HEAD Tag: (none) Log: Use -maltivec only to compile the sources containing altivec code. 
GCC may otherwise produce altivec code on non-altivec PPC (thanks to Fredrik Wikstrom) Members: build/generic/Makefile:1.17->1.18 build/generic/configure.in:1.30->1.31 build/generic/platform.inc.in:1.7->1.8 --------------------- Date: 2009/06/05 09:58:41 Author: Isibaar Branch: HEAD Tag: (none) Log: Patch for Amiga OS4 by Fredrik Wikstrom Members: src/xvid.c:1.81->1.82 src/motion/motion_smp.h:1.5->1.6 src/utils/ppc_asm/altivec_trigger.c:1.1->1.2 --------------------- Date: 2009/06/02 15:06:49 Author: Isibaar Branch: HEAD Tag: (none) Log: Added alternative processor cores detection routine for Apple (thanks to Fabian Groffen) C90 style fix in variance masking code Members: src/xvid.c:1.80->1.81 src/plugins/plugin_lumimasking.c:1.7->1.8 --------------------- Date: 2009/05/28 19:03:45 Author: Isibaar Branch: release-1_2-branch Tag: release-1_2_2 Log: allow text relocations for dynlib OS X target Members: build/generic/configure.in:1.25.2.3->1.25.2.4 --------------------- Date: 2009/05/28 18:59:21 Author: Isibaar Branch: HEAD Tag: (none) Log: Allow text relocations for dynlib OS X target Members: build/generic/configure.in:1.29->1.30 --------------------- Date: 2009/05/28 17:52:33 Author: Isibaar Branch: release-1_2-branch Tag: (none) Log: back-port from HEAD: - add resync-marker range check - return E_FAIL on XVID_ERR_MEMORY error in dshow Members: ChangeLog:1.14.4.3->1.14.4.4 dshow/src/CXvidDecoder.cpp:1.16->1.16.4.1 src/decoder.c:1.80->1.80.2.1 --------------------- Date: 2009/05/28 17:42:06 Author: Isibaar Branch: HEAD Tag: (none) Log: Bugfix: - Added missing resync marker range check in decoder.c (reported by IBM X-Force. Thanks go to John McDonald and Christopher Valasek) - return E_FAIL instead of S_FALSE upon XVID_ERR_MEMORY error in dshow frontend (reported by IBM X-Force. 
Thanks to John McDonald and Mark Dowd) Members: dshow/src/CXvidDecoder.cpp:1.16->1.17 src/decoder.c:1.80->1.81 --------------------- Date: 2009/05/28 17:04:35 Author: Isibaar Branch: release-1_2-branch Tag: (none) Log: backport from HEAD: yasm compatibility Members: build/generic/configure.in:1.25.2.2->1.25.2.3 src/nasm.inc:1.1.2.3->1.1.2.4 src/image/x86_asm/colorspace_yuv_mmx.asm:1.10.2.1->1.10.2.2 src/image/x86_asm/interpolate8x8_3dne.asm:1.11.2.1->1.11.2.2 src/image/x86_asm/postprocessing_mmx.asm:1.9.2.1->1.9.2.2 src/image/x86_asm/postprocessing_sse2.asm:1.10.2.2->1.10.2.3 src/quant/x86_asm/quantize_h263_3dne.asm:1.9.2.1->1.9.2.2 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.10.2.1->1.10.2.2 --------------------- Date: 2009/05/28 16:15:53 Author: Isibaar Branch: HEAD Tag: (none) Log: require yasm >= 0.8.0 Members: build/generic/configure.in:1.28->1.29 --------------------- Date: 2009/05/28 10:42:37 Author: Isibaar Branch: release-1_2-branch Tag: (none) Log: backport from HEAD: Use of TEXT macro for Mach-O Members: src/nasm.inc:1.1.2.2->1.1.2.3 src/bitstream/x86_asm/cbp_mmx.asm:1.17->1.17.2.1 src/bitstream/x86_asm/cbp_sse2.asm:1.10.2.1->1.10.2.2 src/dct/x86_asm/fdct_mmx_ffmpeg.asm:1.8->1.8.2.1 src/dct/x86_asm/fdct_mmx_skal.asm:1.10->1.10.2.1 src/dct/x86_asm/fdct_sse2_skal.asm:1.10.2.2->1.10.2.3 src/dct/x86_asm/idct_3dne.asm:1.9->1.9.2.1 src/dct/x86_asm/idct_mmx.asm:1.13->1.13.2.1 src/dct/x86_asm/idct_sse2_dmitry.asm:1.8.2.1->1.8.2.2 src/image/x86_asm/colorspace_rgb_mmx.asm:1.10.2.1->1.10.2.2 src/image/x86_asm/colorspace_yuv_mmx.asm:1.10->1.10.2.1 src/image/x86_asm/colorspace_yuyv_mmx.asm:1.10->1.10.2.1 src/image/x86_asm/deintl_sse.asm:1.4->1.4.2.1 src/image/x86_asm/gmc_mmx.asm:1.7.2.2->1.7.2.3 src/image/x86_asm/interpolate8x8_3dn.asm:1.12->1.12.2.1 src/image/x86_asm/interpolate8x8_3dne.asm:1.11->1.11.2.1 src/image/x86_asm/interpolate8x8_mmx.asm:1.22->1.22.2.1 src/image/x86_asm/interpolate8x8_xmm.asm:1.13->1.13.2.1 src/image/x86_asm/postprocessing_mmx.asm:1.9->1.9.2.1 
src/image/x86_asm/postprocessing_sse2.asm:1.10.2.1->1.10.2.2 src/image/x86_asm/qpel_mmx.asm:1.9->1.9.2.1 src/image/x86_asm/reduced_mmx.asm:1.9->1.9.2.1 src/motion/x86_asm/sad_3dn.asm:1.12->1.12.2.1 src/motion/x86_asm/sad_3dne.asm:1.10->1.10.2.1 src/motion/x86_asm/sad_mmx.asm:1.20->1.20.2.1 src/motion/x86_asm/sad_sse2.asm:1.16.2.1->1.16.2.2 src/motion/x86_asm/sad_xmm.asm:1.13->1.13.2.1 src/plugins/x86_asm/plugin_ssim-a.asm:1.9.2.1->1.9.2.2 src/quant/x86_asm/quantize_h263_3dne.asm:1.9->1.9.2.1 src/quant/x86_asm/quantize_h263_mmx.asm:1.11.2.2->1.11.2.3 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.13->1.13.2.1 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.10->1.10.2.1 src/utils/x86_asm/cpuid.asm:1.15.2.1->1.15.2.2 src/utils/x86_asm/interlacing_mmx.asm:1.10->1.10.2.1 src/utils/x86_asm/mem_transfer_3dne.asm:1.11->1.11.2.1 src/utils/x86_asm/mem_transfer_mmx.asm:1.20->1.20.2.1 --------------------- Date: 2009/05/27 19:27:56 Author: Isibaar Branch: HEAD Tag: (none) Log: GUI for variance masking Members: vfw/src/codec.c:1.22->1.23 vfw/src/config.c:1.38->1.39 vfw/src/resource.h:1.11->1.12 vfw/src/resource.rc:1.25->1.26 --------------------- Date: 2009/05/27 17:52:05 Author: Isibaar Branch: HEAD Tag: (none) Log: Added Darkshikari's variance masking as an option to lumimasking Members: ChangeLog:1.15->1.16 examples/xvid_encraw.c:1.37->1.38 src/xvid.h:1.64->1.65 src/plugins/plugin_lumimasking.c:1.6->1.7 --------------------- Date: 2009/05/13 11:39:20 Author: Isibaar Branch: HEAD Tag: (none) Log: improved precision and rounding for RGB->YV12 conversion Members: src/image/colorspace.c:1.14->1.15 src/image/x86_asm/colorspace_rgb_mmx.asm:1.11->1.12 --------------------- Date: 2009/03/30 16:40:05 Author: Isibaar Branch: HEAD Tag: (none) Log: attempt at fixing a RGB24 access violation bug reported by Matthew Allen Members: src/image/image.c:1.43->1.44 --------------------- Date: 2009/02/19 18:07:29 Author: Isibaar Branch: HEAD Tag: (none) Log: added proper license headers to the IA64 asm 
files Members: src/dct/ia64_asm/fdct_ia64.s:1.5->1.6 src/dct/ia64_asm/idct_fini.s:1.1->1.2 src/dct/ia64_asm/idct_ia64_ecc.s:1.1->1.2 src/dct/ia64_asm/idct_ia64_gcc.s:1.1->1.2 src/dct/ia64_asm/idct_init.s:1.1->1.2 src/image/ia64_asm/interpolate8x8_ia64.s:1.5->1.6 src/image/ia64_asm/interpolate8x8_ia64_exact.s:1.1->1.2 src/motion/ia64_asm/calc_delta_1.s:1.1->1.2 src/motion/ia64_asm/calc_delta_2.s:1.1->1.2 src/motion/ia64_asm/calc_delta_3.s:1.1->1.2 src/motion/ia64_asm/halfpel8_refine_ia64.s:1.3->1.4 src/motion/ia64_asm/sad_ia64.s:1.7->1.8 src/quant/ia64_asm/quant_h263_ia64.s:1.6->1.7 src/utils/ia64_asm/mem_transfer_ia64.s:1.5->1.6 --------------------- Date: 2009/02/18 23:09:37 Author: Isibaar Branch: HEAD Tag: (none) Log: amd64 Members: debian/control:1.3->1.4 --------------------- Date: 2009/02/18 16:10:19 Author: Isibaar Branch: HEAD Tag: (none) Log: - Some updates to license headers Members: src/image/x86_asm/colorspace_mmx.inc:1.8->1.9 src/motion/motion.h:1.24->1.25 --------------------- Date: 2009/01/07 17:32:31 Author: Isibaar Branch: HEAD Tag: (none) Log: Added note for OSX users that nasm >=2.06rc2 is required for MACH-O build Will auto-check in the configure script once 2.06 release is out... 
Members: doc/README:1.3->1.4 --------------------- Date: 2009/01/07 17:22:02 Author: Isibaar Branch: HEAD Tag: (none) Log: added quotes around nasm include paths to fix problems with directory names containing spaces Members: build/win32/libxvidcore.dsp:1.15->1.16 build/win32/libxvidcore_static.dsp:1.5->1.6 --------------------- Date: 2008/12/15 11:22:07 Author: Isibaar Branch: HEAD Tag: (none) Log: added -D_WIN32_IE=0x0501 to CFLAGS Members: vfw/bin/Makefile:1.6->1.7 --------------------- Date: 2008/12/09 11:42:38 Author: Isibaar Branch: HEAD Tag: (none) Log: Note for yasm version required for MacOS X Members: doc/README:1.2->1.3 --------------------- Date: 2008/12/05 11:33:47 Author: Isibaar Branch: HEAD Tag: (none) Log: added a comment Members: src/dct/x86_asm/fdct_sse2_skal.asm:1.13->1.14 --------------------- Date: 2008/12/05 11:18:52 Author: Isibaar Branch: HEAD Tag: (none) Log: Added -arch ppc for Apple gcc Members: build/generic/configure.in:1.27->1.28 --------------------- Date: 2008/12/05 11:15:02 Author: Isibaar Branch: HEAD Tag: (none) Log: MacOS X specific changes Members: src/nasm.inc:1.5->1.6 src/dct/x86_asm/fdct_sse2_skal.asm:1.12->1.13 --------------------- Date: 2008/12/04 19:30:36 Author: Isibaar Branch: HEAD Tag: (none) Log: yasm compatibility Members: build/generic/configure.in:INITIAL->1.27 src/nasm.inc:1.4->1.5 src/image/x86_asm/colorspace_yuv_mmx.asm:1.13->1.14 src/image/x86_asm/interpolate8x8_3dne.asm:1.12->1.13 src/image/x86_asm/postprocessing_mmx.asm:1.11->1.12 src/image/x86_asm/postprocessing_sse2.asm:1.14->1.15 src/quant/x86_asm/quantize_h263_3dne.asm:1.10->1.11 src/quant/x86_asm/quantize_h263_mmx.asm:1.14->1.15 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.14->1.15 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.11->1.12 --------------------- Date: 2008/12/04 15:41:50 Author: Isibaar Branch: HEAD Tag: (none) Log: trying to add back yasm support Members: src/nasm.inc:INITIAL->1.4 src/bitstream/x86_asm/cbp_mmx.asm:1.17->1.18 
src/bitstream/x86_asm/cbp_sse2.asm:1.12->1.13 src/dct/x86_asm/fdct_mmx_ffmpeg.asm:1.8->1.9 src/dct/x86_asm/fdct_mmx_skal.asm:1.10->1.11 src/dct/x86_asm/fdct_sse2_skal.asm:INITIAL->1.12 src/dct/x86_asm/idct_3dne.asm:1.9->1.10 src/dct/x86_asm/idct_mmx.asm:1.13->1.14 src/dct/x86_asm/idct_sse2_dmitry.asm:INITIAL->1.10 src/image/x86_asm/colorspace_rgb_mmx.asm:1.10->1.11 src/image/x86_asm/colorspace_yuv_mmx.asm:1.12->1.13 src/image/x86_asm/colorspace_yuyv_mmx.asm:1.10->1.11 src/image/x86_asm/deintl_sse.asm:1.4->1.5 src/image/x86_asm/gmc_mmx.asm:INITIAL->1.10 src/image/x86_asm/interpolate8x8_3dn.asm:1.12->1.13 src/image/x86_asm/interpolate8x8_3dne.asm:1.11->1.12 src/image/x86_asm/interpolate8x8_mmx.asm:1.23->1.24 src/image/x86_asm/interpolate8x8_xmm.asm:1.13->1.14 src/image/x86_asm/postprocessing_mmx.asm:1.10->1.11 src/image/x86_asm/postprocessing_sse2.asm:1.13->1.14 src/image/x86_asm/qpel_mmx.asm:1.9->1.10 src/image/x86_asm/reduced_mmx.asm:1.10->1.11 src/motion/x86_asm/sad_3dn.asm:1.12->1.13 src/motion/x86_asm/sad_3dne.asm:1.10->1.11 src/motion/x86_asm/sad_mmx.asm:1.20->1.21 src/motion/x86_asm/sad_sse2.asm:1.18->1.19 src/motion/x86_asm/sad_xmm.asm:1.13->1.14 src/plugins/x86_asm/plugin_ssim-a.asm:1.11->1.12 src/quant/x86_asm/quantize_h263_3dne.asm:1.9->1.10 src/quant/x86_asm/quantize_h263_mmx.asm:INITIAL->1.14 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.13->1.14 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.10->1.11 src/utils/x86_asm/cpuid.asm:INITIAL->1.18 src/utils/x86_asm/interlacing_mmx.asm:1.10->1.11 src/utils/x86_asm/mem_transfer_3dne.asm:1.11->1.12 src/utils/x86_asm/mem_transfer_mmx.asm:1.20->1.21 --------------------- Date: 2008/12/02 14:58:30 Author: Isibaar Branch: HEAD Tag: (none) Log: missing ENDFUNC Members: src/utils/x86_asm/cpuid.asm:1.16->1.17 --------------------- Date: 2008/12/02 14:44:55 Author: Isibaar Branch: HEAD Tag: (none) Log: WIN64 XMM6/XMM7 bench and asm optimization patch by Andrew Dunstan Members: examples/xvid_bench.c:1.38->1.39 
src/bitstream/x86_asm/cbp_sse2.asm:1.11->1.12 src/image/x86_asm/gmc_mmx.asm:1.8->1.9 src/image/x86_asm/postprocessing_sse2.asm:1.12->1.13 src/motion/x86_asm/sad_sse2.asm:1.17->1.18 src/plugins/x86_asm/plugin_ssim-a.asm:1.10->1.11 src/quant/x86_asm/quantize_h263_mmx.asm:1.12->1.13 src/utils/emms.h:1.16->1.17 src/utils/x86_asm/cpuid.asm:1.15->1.16 --------------------- Date: 2008/12/01 16:22:37 Author: Isibaar Branch: HEAD Tag: (none) Log: fix for previous commit Members: src/image/x86_asm/colorspace_yuv_mmx.asm:1.11->1.12 --------------------- Date: 2008/12/01 16:06:48 Author: Isibaar Branch: HEAD Tag: (none) Log: OSX/Leopard compilation fix by Guillaume Poirier Members: src/xvid.c:1.79->1.80 --------------------- Date: 2008/12/01 16:00:44 Author: Isibaar Branch: HEAD Tag: (none) Log: ASM clean-up patch by Carlo Bramix Members: src/image/x86_asm/colorspace_mmx.inc:1.7->1.8 src/image/x86_asm/colorspace_yuv_mmx.asm:1.10->1.11 src/image/x86_asm/interpolate8x8_mmx.asm:1.22->1.23 src/image/x86_asm/postprocessing_mmx.asm:1.9->1.10 src/image/x86_asm/postprocessing_sse2.asm:1.11->1.12 src/image/x86_asm/reduced_mmx.asm:1.9->1.10 --------------------- Date: 2008/12/01 15:45:45 Author: Isibaar Branch: HEAD Tag: (none) Log: properly treat XMM6/XMM7 as non-volatile on WIN64 (to be tested) Members: src/nasm.inc:1.2->1.3 src/bitstream/x86_asm/cbp_sse2.asm:1.10->1.11 src/dct/x86_asm/fdct_sse2_skal.asm:1.10->1.11 src/dct/x86_asm/idct_sse2_dmitry.asm:1.8->1.9 src/image/x86_asm/gmc_mmx.asm:1.7->1.8 src/image/x86_asm/postprocessing_sse2.asm:1.10->1.11 src/motion/x86_asm/sad_sse2.asm:1.16->1.17 src/plugins/x86_asm/plugin_ssim-a.asm:1.9->1.10 src/quant/x86_asm/quantize_h263_mmx.asm:1.11->1.12 --------------------- Date: 2008/11/30 19:05:42 Author: Isibaar Branch: HEAD Tag: (none) Log: finish up WIN64 compatibility Members: vfw/src/driverproc.c:1.10->1.11 --------------------- Date: 2008/11/30 18:56:07 Author: Isibaar Branch: HEAD Tag: (none) Log: finish up WIN64 compatibility Members: 
vfw/src/config.c:1.37->1.38 vfw/src/driverproc.c:1.9->1.10 vfw/src/status.c:1.3->1.4 --------------------- Date: 2008/11/30 17:36:44 Author: Isibaar Branch: HEAD Tag: (none) Log: VC8 win32 / x64 project files Members: build/win32/libxvidcore.sln:1.1->1.2 build/win32/libxvidcore.vcproj:1.1->1.2 build/win32/xvid_decraw.vcproj:INITIAL->1.1 build/win32/xvid_encraw.vcproj:INITIAL->1.1 dshow/dshow.vcproj:INITIAL->1.1 dshow/src/CAbout.cpp:1.2->1.3 dshow/src/CAbout.h:1.2->1.3 dshow/src/Configure.cpp:1.5->1.6 dshow/src/config.c:1.9->1.10 src/nasm.inc:1.1->1.2 src/portab.h:1.58->1.59 src/xvid.c:INITIAL->1.79 src/motion/gmc.c:1.9->1.10 vfw/vfw.dsp:INITIAL->1.4 vfw/vfw.vcproj:INITIAL->1.1 vfw/src/config.c:1.36->1.37 vfw/src/config.h:1.12->1.13 vfw/src/driverproc.c:1.8->1.9 --------------------- Date: 2008/11/28 19:28:41 Author: Isibaar Branch: HEAD Tag: (none) Log: updated nasm dependency Members: debian/control:1.2->1.3 --------------------- Date: 2008/11/28 19:16:42 Author: Isibaar Branch: HEAD Tag: (none) Log: pump up HEAD version numbers Members: build/generic/configure.in:1.25->1.26 debian/changelog:1.2->1.3 src/xvid.c:1.77->1.78 src/xvid.h:1.63->1.64 --------------------- Date: 2008/11/28 17:54:43 Author: Isibaar Branch: HEAD Tag: tag-branching-1_2_0 Log: WIN64 compatibility Members: dshow/src/config.h:1.5->1.6 --------------------- Date: 2008/11/28 17:42:50 Author: Isibaar Branch: HEAD Tag: (none) Log: alternative multicore detection Members: src/xvid.c:1.76->1.77 --------------------- Date: 2008/11/28 12:56:01 Author: Isibaar Branch: HEAD Tag: (none) Log: Auto SMP Members: vfw/src/codec.c:1.21->1.22 vfw/src/config.c:1.35->1.36 vfw/src/resource.rc:1.24->1.25 --------------------- Date: 2008/11/28 11:58:07 Author: Isibaar Branch: HEAD Tag: (none) Log: bugfix: prevent access violation if width/height is not multiple of 2 Members: src/image/image.c:1.42->1.43 --------------------- Date: 2008/11/27 21:46:13 Author: Isibaar Branch: HEAD Tag: (none) Log: AMD64 fix Members: 
src/plugins/x86_asm/plugin_ssim-a.asm:1.8->1.9 --------------------- Date: 2008/11/27 21:34:53 Author: Isibaar Branch: HEAD Tag: (none) Log: readded cpu check Members: src/plugins/plugin_ssim.c:1.11->1.12 --------------------- Date: 2008/11/27 21:17:33 Author: Isibaar Branch: HEAD Tag: (none) Log: more ssim fixes Members: examples/xvid_encraw.c:1.36->1.37 src/xvid.h:1.62->1.63 src/plugins/plugin_ssim.c:1.10->1.11 src/plugins/plugin_ssim.h:1.3->1.4 --------------------- Date: 2008/11/27 20:45:28 Author: Isibaar Branch: HEAD Tag: (none) Log: fix for -ssim option Members: examples/xvid_encraw.c:1.35->1.36 --------------------- Date: 2008/11/27 19:35:36 Author: Isibaar Branch: HEAD Tag: (none) Log: 64-bit fix Members: src/utils/x86_asm/interlacing_mmx.asm:1.9->1.10 --------------------- Date: 2008/11/27 17:42:00 Author: Isibaar Branch: HEAD Tag: (none) Log: updated strings Members: vfw/bin/xvid.inf:1.3->1.4 --------------------- Date: 2008/11/27 17:33:32 Author: Isibaar Branch: HEAD Tag: (none) Log: 64-bit GUI note Members: vfw/src/config.c:1.34->1.35 vfw/src/config.h:1.11->1.12 vfw/src/resource.rc:1.23->1.24 --------------------- Date: 2008/11/27 17:31:48 Author: Isibaar Branch: HEAD Tag: (none) Log: enable SSE4 GMC code Members: src/portab.h:1.57->1.58 src/motion/gmc.c:1.8->1.9 --------------------- Date: 2008/11/27 12:57:28 Author: Isibaar Branch: HEAD Tag: (none) Log: WIN64 compatibility Members: dshow/Makefile:1.6->1.7 vfw/bin/Makefile:1.5->1.6 vfw/src/config.c:1.33->1.34 vfw/src/status.c:1.2->1.3 --------------------- Date: 2008/11/27 01:47:03 Author: Isibaar Branch: HEAD Tag: (none) Log: brightness control fix Members: src/xvid.c:1.75->1.76 src/image/postprocessing.c:1.4->1.5 src/image/x86_asm/postprocessing_sse2.asm:1.9->1.10 --------------------- Date: 2008/11/27 00:37:28 Author: Isibaar Branch: HEAD Tag: (none) Log: sad8bi bench Members: examples/xvid_bench.c:1.37->1.38 --------------------- Date: 2008/11/27 00:35:50 Author: Isibaar Branch: HEAD Tag: (none) 
Log: some WIN64 fixes Members: src/image/x86_asm/colorspace_mmx.inc:1.6->1.7 src/image/x86_asm/colorspace_yuv_mmx.asm:1.9->1.10 src/image/x86_asm/gmc_mmx.asm:1.6->1.7 src/image/x86_asm/interpolate8x8_mmx.asm:1.21->1.22 src/image/x86_asm/postprocessing_mmx.asm:1.8->1.9 src/image/x86_asm/qpel_mmx.asm:1.8->1.9 src/quant/x86_asm/quantize_h263_mmx.asm:1.10->1.11 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.12->1.13 --------------------- Date: 2008/11/26 11:11:16 Author: Isibaar Branch: HEAD Tag: (none) Log: GUI controls for SSE3/SSE4 Updated about box and messages Members: dshow/src/CXvidDecoder.h:1.6->1.7 dshow/src/Configure.cpp:1.4->1.5 dshow/src/config.c:1.8->1.9 vfw/src/codec.h:1.5->1.6 vfw/src/config.c:1.32->1.33 vfw/src/resource.h:1.10->1.11 vfw/src/resource.rc:1.22->1.23 --------------------- Date: 2008/11/26 10:31:06 Author: Isibaar Branch: HEAD Tag: (none) Log: some more benches Members: examples/xvid_bench.c:1.36->1.37 --------------------- Date: 2008/11/26 03:36:37 Author: Isibaar Branch: HEAD Tag: (none) Log: increment bs version Members: src/xvid.h:1.61->1.62 --------------------- Date: 2008/11/26 03:32:54 Author: Isibaar Branch: HEAD Tag: (none) Log: removed obsolete AMD64 asm source files Members: src/dct/x86_64_asm/fdct_mmx_skal.asm:1.3->1.4(DEAD) src/dct/x86_64_asm/idct_mmx.asm:1.3->1.4(DEAD) src/image/x86_64_asm/interpolate8x8_mmx.asm:1.3->1.4(DEAD) src/image/x86_64_asm/interpolate8x8_xmm.asm:1.3->1.4(DEAD) src/image/x86_64_asm/qpel_mmx.asm:1.4->1.5(DEAD) src/motion/x86_64_asm/sad_mmx.asm:1.3->1.4(DEAD) src/motion/x86_64_asm/sad_xmm.asm:1.3->1.4(DEAD) src/quant/x86_64_asm/quantize_h263_mmx.asm:1.3->1.4(DEAD) src/quant/x86_64_asm/quantize_mpeg_xmm.asm:1.3->1.4(DEAD) src/utils/x86_64_asm/cpuid.asm:1.6->1.7(DEAD) src/utils/x86_64_asm/interlacing_mmx.asm:1.5->1.6(DEAD) src/utils/x86_64_asm/mem_transfer_mmx.asm:1.3->1.4(DEAD) --------------------- Date: 2008/11/26 03:21:02 Author: Isibaar Branch: HEAD Tag: (none) Log: X86_64 fixes Members: 
src/image/x86_asm/postprocessing_mmx.asm:1.7->1.8 src/image/x86_asm/postprocessing_sse2.asm:1.8->1.9 src/quant/quant_mpeg.c:1.4->1.5 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.11->1.12 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.9->1.10 --------------------- Date: 2008/11/26 03:17:50 Author: Isibaar Branch: HEAD Tag: (none) Log: win64 target Members: build/generic/Makefile:1.16->1.17 build/generic/configure.in:1.24->1.25 build/generic/sources.inc:1.14->1.15 --------------------- Date: 2008/11/26 03:12:41 Author: Isibaar Branch: HEAD Tag: (none) Log: updated MSVC project files Members: build/win32/libxvidcore.dsp:1.14->1.15 build/win32/libxvidcore_static.dsp:1.4->1.5 --------------------- Date: 2008/11/26 02:04:34 Author: Isibaar Branch: HEAD Tag: (none) Log: Unified elf64/win64 X86_64 support Members: src/nasm.inc:INITIAL->1.1 src/xvid.c:1.74->1.75 src/bitstream/cbp.h:1.11->1.12 src/bitstream/mbcoding.c:1.56->1.57 src/bitstream/x86_asm/cbp_3dne.asm:1.7->1.8(DEAD) src/bitstream/x86_asm/cbp_mmx.asm:1.16->1.17 src/bitstream/x86_asm/cbp_sse2.asm:1.9->1.10 src/dct/fdct.h:1.10->1.11 src/dct/idct.h:1.12->1.13 src/dct/x86_asm/fdct_mmx_ffmpeg.asm:1.7->1.8 src/dct/x86_asm/fdct_mmx_skal.asm:1.9->1.10 src/dct/x86_asm/fdct_sse2_skal.asm:1.9->1.10 src/dct/x86_asm/idct_3dne.asm:1.8->1.9 src/dct/x86_asm/idct_mmx.asm:1.12->1.13 src/dct/x86_asm/idct_sse2_dmitry.asm:1.7->1.8 src/dct/x86_asm/simple_idct_mmx.asm:1.9->1.10(DEAD) src/image/colorspace.h:1.9->1.10 src/image/image.c:1.41->1.42 src/image/interpolate8x8.h:1.16->1.17 src/image/qpel.c:1.8->1.9 src/image/qpel.h:1.7->1.8 src/image/reduced.h:1.3->1.4 src/image/x86_asm/colorspace_mmx.inc:1.5->1.6 src/image/x86_asm/colorspace_rgb_mmx.asm:1.9->1.10 src/image/x86_asm/colorspace_yuv_mmx.asm:1.8->1.9 src/image/x86_asm/colorspace_yuyv_mmx.asm:1.9->1.10 src/image/x86_asm/deintl_sse.asm:1.3->1.4 src/image/x86_asm/gmc_mmx.asm:1.5->1.6 src/image/x86_asm/interpolate8x8_3dn.asm:1.11->1.12 src/image/x86_asm/interpolate8x8_3dne.asm:1.10->1.11 
src/image/x86_asm/interpolate8x8_mmx.asm:1.20->1.21 src/image/x86_asm/interpolate8x8_xmm.asm:1.12->1.13 src/image/x86_asm/postprocessing_mmx.asm:1.6->1.7 src/image/x86_asm/postprocessing_sse2.asm:1.7->1.8 src/image/x86_asm/qpel_mmx.asm:1.7->1.8 src/image/x86_asm/reduced_mmx.asm:1.8->1.9 src/motion/motion_smp.h:1.4->1.5 src/motion/sad.h:1.23->1.24 src/motion/x86_asm/sad_3dn.asm:1.11->1.12 src/motion/x86_asm/sad_3dne.asm:1.9->1.10 src/motion/x86_asm/sad_mmx.asm:1.19->1.20 src/motion/x86_asm/sad_sse2.asm:1.15->1.16 src/motion/x86_asm/sad_xmm.asm:1.12->1.13 src/plugins/plugin_ssim.c:1.9->1.10 src/plugins/x86_asm/plugin_ssim-a.asm:1.7->1.8 src/quant/quant.h:1.7->1.8 src/quant/quant_matrix.c:1.15->1.16 src/quant/quant_mpeg.c:1.3->1.4 src/quant/x86_asm/quantize_h263_3dne.asm:1.8->1.9 src/quant/x86_asm/quantize_h263_mmx.asm:1.9->1.10 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.10->1.11 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.8->1.9 src/utils/mbfunctions.h:1.20->1.21 src/utils/mem_transfer.h:1.17->1.18 src/utils/x86_asm/cpuid.asm:1.14->1.15 src/utils/x86_asm/interlacing_mmx.asm:1.8->1.9 src/utils/x86_asm/mem_transfer_3dne.asm:1.10->1.11 src/utils/x86_asm/mem_transfer_mmx.asm:1.19->1.20 --------------------- Date: 2008/11/14 16:43:27 Author: Isibaar Branch: HEAD Tag: (none) Log: initial SSE4 support Members: build/generic/configure.in:1.23->1.24 examples/xvid_bench.c:1.35->1.36 examples/xvid_encraw.c:1.34->1.35 src/xvid.c:1.73->1.74 src/xvid.h:1.60->1.61 src/image/x86_asm/gmc_mmx.asm:1.4->1.5 src/motion/gmc.c:1.7->1.8 src/utils/x86_64_asm/cpuid.asm:1.5->1.6 src/utils/x86_asm/cpuid.asm:1.13->1.14 --------------------- Date: 2008/11/11 21:46:24 Author: Isibaar Branch: HEAD Tag: (none) Log: NASM 2.x compatibility Members: src/bitstream/x86_asm/cbp_3dne.asm:1.6->1.7 src/bitstream/x86_asm/cbp_mmx.asm:1.15->1.16 src/bitstream/x86_asm/cbp_sse2.asm:1.8->1.9 src/dct/x86_64_asm/fdct_mmx_skal.asm:1.2->1.3 src/dct/x86_64_asm/idct_mmx.asm:1.2->1.3 
src/dct/x86_asm/fdct_mmx_ffmpeg.asm:1.6->1.7 src/dct/x86_asm/fdct_mmx_skal.asm:1.8->1.9 src/dct/x86_asm/fdct_sse2_skal.asm:1.8->1.9 src/dct/x86_asm/idct_3dne.asm:1.7->1.8 src/dct/x86_asm/idct_mmx.asm:1.11->1.12 src/dct/x86_asm/idct_sse2_dmitry.asm:1.6->1.7 src/dct/x86_asm/simple_idct_mmx.asm:1.8->1.9 src/image/x86_64_asm/interpolate8x8_mmx.asm:1.2->1.3 src/image/x86_64_asm/interpolate8x8_xmm.asm:1.2->1.3 src/image/x86_64_asm/qpel_mmx.asm:1.3->1.4 src/image/x86_asm/colorspace_mmx.inc:1.4->1.5 src/image/x86_asm/colorspace_rgb_mmx.asm:1.8->1.9 src/image/x86_asm/colorspace_yuv_mmx.asm:1.7->1.8 src/image/x86_asm/colorspace_yuyv_mmx.asm:1.8->1.9 src/image/x86_asm/deintl_sse.asm:1.2->1.3 src/image/x86_asm/gmc_mmx.asm:1.3->1.4 src/image/x86_asm/interpolate8x8_3dn.asm:1.10->1.11 src/image/x86_asm/interpolate8x8_3dne.asm:1.9->1.10 src/image/x86_asm/interpolate8x8_mmx.asm:1.19->1.20 src/image/x86_asm/interpolate8x8_xmm.asm:1.11->1.12 src/image/x86_asm/postprocessing_mmx.asm:1.5->1.6 src/image/x86_asm/postprocessing_sse2.asm:1.6->1.7 src/image/x86_asm/qpel_mmx.asm:1.6->1.7 src/image/x86_asm/reduced_mmx.asm:1.7->1.8 src/motion/x86_64_asm/sad_mmx.asm:1.2->1.3 src/motion/x86_64_asm/sad_xmm.asm:1.2->1.3 src/motion/x86_asm/sad_3dn.asm:1.10->1.11 src/motion/x86_asm/sad_3dne.asm:1.8->1.9 src/motion/x86_asm/sad_mmx.asm:1.18->1.19 src/motion/x86_asm/sad_sse2.asm:1.14->1.15 src/motion/x86_asm/sad_xmm.asm:1.11->1.12 src/plugins/x86_asm/plugin_ssim-a.asm:1.6->1.7 src/quant/x86_64_asm/quantize_h263_mmx.asm:1.2->1.3 src/quant/x86_64_asm/quantize_mpeg_xmm.asm:1.2->1.3 src/quant/x86_asm/quantize_h263_3dne.asm:1.7->1.8 src/quant/x86_asm/quantize_h263_mmx.asm:1.8->1.9 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.9->1.10 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.7->1.8 src/utils/x86_64_asm/cpuid.asm:1.4->1.5 src/utils/x86_64_asm/interlacing_mmx.asm:1.4->1.5 src/utils/x86_64_asm/mem_transfer_mmx.asm:1.2->1.3 src/utils/x86_asm/cpuid.asm:1.12->1.13 src/utils/x86_asm/interlacing_mmx.asm:1.7->1.8 
src/utils/x86_asm/mem_transfer_3dne.asm:1.9->1.10 src/utils/x86_asm/mem_transfer_mmx.asm:1.18->1.19 --------------------- Date: 2008/09/02 14:23:30 Author: Isibaar Branch: HEAD Tag: (none) Log: - added the debian files from 1.1.3 release to CVS head - applied a patch by Fabian Greffrath Members: debian/changelog:1.1->1.2 debian/compat:INITIAL->1.1 debian/control:1.1->1.2 debian/copyright:INITIAL->1.1 debian/libxvidcore4-dev.dirs:INITIAL->1.1 debian/libxvidcore4-dev.docs:INITIAL->1.1 debian/libxvidcore4-dev.install:INITIAL->1.1 debian/libxvidcore4.dirs:INITIAL->1.1 debian/libxvidcore4.docs:INITIAL->1.1 debian/libxvidcore4.install:INITIAL->1.1 debian/rules:1.1->1.2 --------------------- Date: 2008/08/19 11:17:17 Author: Isibaar Branch: HEAD Tag: (none) Log: - removed accidental duplicates Members: src/utils/x86_64_asm/cpuid.asm:1.3->1.4 src/utils/x86_64_asm/interlacing_mmx.asm:1.3->1.4 --------------------- Date: 2008/08/19 11:06:48 Author: Isibaar Branch: HEAD Tag: (none) Log: - noexecstack patch by Hans de Goede Members: src/bitstream/x86_asm/cbp_3dne.asm:1.5->1.6 src/bitstream/x86_asm/cbp_mmx.asm:1.14->1.15 src/bitstream/x86_asm/cbp_sse2.asm:1.7->1.8 src/dct/x86_64_asm/fdct_mmx_skal.asm:1.1->1.2 src/dct/x86_64_asm/idct_mmx.asm:1.1->1.2 src/dct/x86_asm/fdct_mmx_ffmpeg.asm:1.5->1.6 src/dct/x86_asm/fdct_mmx_skal.asm:1.7->1.8 src/dct/x86_asm/fdct_sse2_skal.asm:1.7->1.8 src/dct/x86_asm/idct_3dne.asm:1.6->1.7 src/dct/x86_asm/idct_mmx.asm:1.10->1.11 src/dct/x86_asm/idct_sse2_dmitry.asm:1.5->1.6 src/dct/x86_asm/simple_idct_mmx.asm:1.7->1.8 src/image/x86_64_asm/interpolate8x8_mmx.asm:1.1->1.2 src/image/x86_64_asm/interpolate8x8_xmm.asm:1.1->1.2 src/image/x86_64_asm/qpel_mmx.asm:1.2->1.3 src/image/x86_asm/colorspace_rgb_mmx.asm:1.7->1.8 src/image/x86_asm/colorspace_yuv_mmx.asm:1.6->1.7 src/image/x86_asm/colorspace_yuyv_mmx.asm:1.7->1.8 src/image/x86_asm/deintl_sse.asm:1.1->1.2 src/image/x86_asm/gmc_mmx.asm:1.2->1.3 src/image/x86_asm/interpolate8x8_3dn.asm:1.9->1.10 
src/image/x86_asm/interpolate8x8_3dne.asm:1.8->1.9 src/image/x86_asm/interpolate8x8_mmx.asm:1.18->1.19 src/image/x86_asm/interpolate8x8_xmm.asm:1.10->1.11 src/image/x86_asm/postprocessing_mmx.asm:1.4->1.5 src/image/x86_asm/postprocessing_sse2.asm:1.5->1.6 src/image/x86_asm/qpel_mmx.asm:1.5->1.6 src/image/x86_asm/reduced_mmx.asm:1.6->1.7 src/motion/x86_64_asm/sad_mmx.asm:1.1->1.2 src/motion/x86_64_asm/sad_xmm.asm:1.1->1.2 src/motion/x86_asm/sad_3dn.asm:1.9->1.10 src/motion/x86_asm/sad_3dne.asm:1.7->1.8 src/motion/x86_asm/sad_mmx.asm:1.17->1.18 src/motion/x86_asm/sad_sse2.asm:1.13->1.14 src/motion/x86_asm/sad_xmm.asm:1.10->1.11 src/plugins/x86_asm/plugin_ssim-a.asm:1.5->1.6 src/quant/x86_64_asm/quantize_h263_mmx.asm:1.1->1.2 src/quant/x86_64_asm/quantize_mpeg_xmm.asm:1.1->1.2 src/quant/x86_asm/quantize_h263_3dne.asm:1.6->1.7 src/quant/x86_asm/quantize_h263_mmx.asm:1.7->1.8 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.8->1.9 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.6->1.7 src/utils/x86_64_asm/cpuid.asm:1.2->1.3 src/utils/x86_64_asm/interlacing_mmx.asm:1.2->1.3 src/utils/x86_64_asm/mem_transfer_mmx.asm:1.1->1.2 src/utils/x86_asm/cpuid.asm:1.11->1.12 src/utils/x86_asm/interlacing_mmx.asm:1.6->1.7 src/utils/x86_asm/mem_transfer_3dne.asm:1.8->1.9 src/utils/x86_asm/mem_transfer_mmx.asm:1.17->1.18 --------------------- Date: 2007/11/23 11:45:09 Author: Isibaar Branch: HEAD Tag: (none) Log: - add missing #!/usr/bin/perl Members: examples/bench_list.pl:1.2->1.3 --------------------- Date: 2007/11/23 11:44:11 Author: Isibaar Branch: HEAD Tag: (none) Log: - update for nasm >= 0.99 Members: build/generic/configure.in:1.22->1.23 --------------------- Date: 2007/09/11 14:41:13 Author: suxen_drol Branch: HEAD Tag: (none) Log: nasm 0.99.x compatibility Members: src/dct/x86_asm/fdct_mmx_skal.asm:1.6->1.7 src/image/x86_asm/interpolate8x8_3dne.asm:1.7->1.8 src/motion/x86_asm/sad_3dne.asm:1.6->1.7 src/quant/x86_asm/quantize_h263_3dne.asm:1.5->1.6 
src/utils/x86_asm/mem_transfer_3dne.asm:1.7->1.8 --------------------- Date: 2007/07/26 16:30:31 Author: suxen_drol Branch: HEAD Tag: (none) Log: posix compliance; s/malloc\.h/stdlib\.h/ Members: src/plugins/plugin_ssim.c:1.8->1.9 --------------------- Date: 2007/07/24 11:43:10 Author: Isibaar Branch: HEAD Tag: (none) Log: - improve b-frame decoding robustness (with broken stream or missing ref frame) Members: src/bitstream/bitstream.c:1.57->1.58 --------------------- Date: 2007/07/19 15:46:16 Author: suxen_drol Branch: HEAD Tag: (none) Log: remove plugin_fixed.c as per instruction "[Xvid-devel] pete cvs commits digest" (13 May 2003) Members: src/plugins/plugin_fixed.c:1.2->1.3(DEAD) --------------------- Date: 2007/06/28 16:55:11 Author: Skal Branch: HEAD Tag: (none) Log: Simplify index testing after get_coeff(), esp. after decoding error. Members: src/bitstream/mbcoding.c:1.55->1.56 --------------------- Date: 2007/06/27 16:38:05 Author: Isibaar Branch: HEAD Tag: (none) Log: - patch for a potential vulnerability reported by Secunia Members: src/bitstream/mbcoding.c:1.54->1.55 --------------------- Date: 2007/06/02 15:53:13 Author: syskin Branch: HEAD Tag: (none) Log: compile with unicode support correctly (windows/debug). 
Patch by Kurt Pruenner Members: src/portab.h:1.56->1.57 --------------------- Date: 2007/04/28 18:30:20 Author: syskin Branch: HEAD Tag: (none) Log: Correction to out-of-bounds memory access in d_mv_bits; an assertion showing that d_mv_bits is still wrong; VC8 project files Members: build/win32/libxvidcore.sln:INITIAL->1.1 build/win32/libxvidcore.vcproj:INITIAL->1.1 src/bitstream/mbcoding.c:1.53->1.54 src/motion/motion_inlines.h:1.4->1.5 --------------------- Date: 2007/04/16 21:01:28 Author: Skal Branch: HEAD Tag: (none) Log: fix for bad resync_marker length in b-vops (both enc and dec) Thanks to Mathieu Monnier for the report (mathieu.monnier at polytechnique dot org) Members: src/decoder.c:1.79->1.80 src/bitstream/bitstream.c:1.56->1.57 --------------------- Date: 2007/03/08 22:40:12 Author: Isibaar Branch: HEAD Tag: (none) Log: - fix SSE3 detection and port to x86_64 Members: src/utils/x86_64_asm/cpuid.asm:1.1->1.2 src/utils/x86_asm/cpuid.asm:1.10->1.11 --------------------- Date: 2007/02/08 14:10:24 Author: Isibaar Branch: HEAD Tag: (none) Log: - avoid access violation when stats==NULL - patch by Felipe Contreras Members: src/encoder.c:1.129->1.130 --------------------- Date: 2007/01/09 21:08:53 Author: Isibaar Branch: HEAD Tag: (none) Log: - ssim and colorspace set-up patches by Johannes Reinhardt Members: examples/xvid_encraw.c:1.33->1.34 --------------------- Date: 2006/12/22 00:29:27 Author: Isibaar Branch: HEAD Tag: (none) Log: - build patch for Mac by Eric Petit Members: build/generic/configure.in:1.21->1.22 --------------------- Date: 2006/12/22 00:27:25 Author: Isibaar Branch: HEAD Tag: (none) Log: - PPC build cleanup patch for SSIM by Paul Kurucz Members: examples/xvid_bench.c:1.34->1.35 src/plugins/plugin_ssim.c:1.7->1.8 --------------------- Date: 2006/12/14 14:09:00 Author: Isibaar Branch: HEAD Tag: (none) Log: - missing emms() fix by squid_80 Members: src/encoder.c:1.128->1.129 --------------------- Date: 2006/12/06 20:55:42 Author: Isibaar 
Branch: HEAD Tag: (none) Log: - Add xvid_plugin_ssim Members: build/generic/libxvidcore.def:1.4->1.5 --------------------- Date: 2006/12/06 20:55:07 Author: Isibaar Branch: HEAD Tag: (none) Log: - SSE3 patch Members: src/xvid.c:1.72->1.73 src/xvid.h:1.59->1.60 src/motion/sad.h:1.22->1.23 src/motion/x86_asm/sad_sse2.asm:1.12->1.13 src/utils/x86_asm/cpuid.asm:1.9->1.10 --------------------- Date: 2006/11/12 02:40:36 Author: chl Branch: HEAD Tag: (none) Log: MMX version of RGB_to_yv12, shamelessly copy&pasted from the BGR version. Members: src/xvid.c:1.71->1.72 src/image/colorspace.h:1.8->1.9 src/image/x86_asm/colorspace_rgb_mmx.asm:1.6->1.7 --------------------- Date: 2006/11/11 23:06:44 Author: chl Branch: HEAD Tag: (none) Log: Fixed RGB bug: simply forgot to initialize the function ptr Members: src/xvid.c:1.70->1.71 --------------------- Date: 2006/11/11 23:03:30 Author: chl Branch: HEAD Tag: (none) Log: Same RGB bug, different location Members: src/image/image.c:1.40->1.41 --------------------- Date: 2006/11/11 06:07:25 Author: chl Branch: HEAD Tag: (none) Log: Typo in RGB, but still seems broken. Members: src/image/image.c:1.39->1.40 --------------------- Date: 2006/11/10 19:58:39 Author: chl Branch: HEAD Tag: (none) Log: Added support for RGB colorspace. Incredible that after 5 years, this still wasn't there (only BGR and RGB+alpha). There is no accelerated MMX version yet. Members: src/xvid.h:1.58->1.59 src/image/colorspace.c:1.13->1.14 src/image/colorspace.h:1.7->1.8 src/image/image.c:1.38->1.39 --------------------- Date: 2006/11/08 08:17:22 Author: Skal Branch: HEAD Tag: (none) Log: + added an integer-based alternative to float gaussian.
#define USE_INT_GAUSSIAN to activate it Members: src/plugins/plugin_ssim.c:1.6->1.7 --------------------- Date: 2006/11/08 07:55:27 Author: Skal Branch: HEAD Tag: (none) Log: + applied ssim_part3.diff patch, by Johannes Reinhardt Members: examples/xvid_encraw.c:1.32->1.33 src/plugins/plugin_ssim.c:1.5->1.6 src/plugins/plugin_ssim.h:1.2->1.3 --------------------- Date: 2006/11/07 20:59:03 Author: Skal Branch: HEAD Tag: (none) Log: + added a seemingly missing emms() to generate_GMCimage() + little ASM clean-up, pointed out by Celtic_Druid Members: src/image/x86_asm/gmc_mmx.asm:1.1->1.2 src/motion/gmc.c:1.6->1.7 --------------------- Date: 2006/11/01 11:04:29 Author: Isibaar Branch: HEAD Tag: (none) Log: - upped BS_VERSION to 47 Members: src/xvid.h:1.57->1.58 --------------------- Date: 2006/11/01 08:12:26 Author: Skal Branch: HEAD Tag: (none) Log: + added a very simple bench to test bitstream-read functions, mostly to be used in conjunction with valgrind to spot uninitialized reads. Members: examples/xvid_bench.c:1.33->1.34 --------------------- Date: 2006/10/30 23:23:05 Author: chl Branch: HEAD Tag: (none) Log: nasm/yasm (at least my versions) didn't like the 0EH syntax in pshufd. Changing to 0x0E fixes it (thanks for the hint, skal!). Members: src/plugins/x86_asm/plugin_ssim-a.asm:1.4->1.5 --------------------- Date: 2006/10/30 12:33:57 Author: Skal Branch: HEAD Tag: (none) Log: + fix for rounding error while descaling Members: src/plugins/x86_asm/plugin_ssim-a.asm:1.3->1.4 --------------------- Date: 2006/10/30 12:21:42 Author: Skal Branch: HEAD Tag: (none) Log: + further patch for SSIM plugin by Johannes Reinhardt + updated `xvid_bench 15` => there's still a little rounding inaccuracy in the reported CRCs. Work in progress...
Members: examples/xvid_bench.c:1.32->1.33 examples/xvid_encraw.c:1.31->1.32 src/xvid.h:1.56->1.57 src/plugins/plugin_ssim.c:1.4->1.5 src/plugins/plugin_ssim.h:1.1->1.2 src/plugins/x86_asm/plugin_ssim-a.asm:1.2->1.3 --------------------- Date: 2006/10/30 11:52:00 Author: Skal Branch: HEAD Tag: (none) Log: + added support for NULL u/v pointer in yv12_to_yv12* functions (+little bug fix for the vflip case). Added a bench in xvid_bench.c (`xvid_bench 16`) Members: examples/xvid_bench.c:1.31->1.32 src/image/colorspace.c:1.12->1.13 src/image/x86_asm/colorspace_yuv_mmx.asm:1.5->1.6 --------------------- Date: 2006/10/29 09:04:02 Author: chl Branch: HEAD Tag: (none) Log: Simple handle to flooding chroma components with 0x80: set src->u and src->v to NULL. To work with VFlip, set also src_uv_stride=0. Members: src/image/colorspace.c:1.11->1.12 --------------------- Date: 2006/10/26 18:34:32 Author: Skal Branch: HEAD Tag: (none) Log: slightly faster lum_8x8_mmx Members: src/plugins/x86_asm/plugin_ssim-a.asm:1.1->1.2 --------------------- Date: 2006/10/16 06:46:01 Author: Skal Branch: HEAD Tag: (none) Log: update totalPSNR[], whatever the ARG_PROGRESS Members: examples/xvid_encraw.c:1.30->1.31 --------------------- Date: 2006/10/13 17:19:48 Author: Skal Branch: HEAD Tag: (none) Log: bench on lum2x8 was wrong (uninitialized reads) Members: examples/xvid_bench.c:1.30->1.31 --------------------- Date: 2006/10/13 17:16:25 Author: Skal Branch: HEAD Tag: (none) Log: some more SSIM patches by Johannes Members: examples/xvid_bench.c:1.29->1.30 examples/xvid_encraw.c:1.29->1.30 src/plugins/plugin_ssim.c:1.3->1.4 --------------------- Date: 2006/10/13 13:26:18 Author: Skal Branch: HEAD Tag: (none) Log: wrong call to check_cpu_features() in case of non-ARCH_IS_IA32 Members: src/image/image.c:1.37->1.38 --------------------- Date: 2006/10/13 11:28:46 Author: Skal Branch: HEAD Tag: (none) Log: removed the #ifndef WIN32 protection around xvid_plugin_ssim Members: 
examples/xvid_encraw.c:1.28->1.29 --------------------- Date: 2006/10/13 10:39:07 Author: Isibaar Branch: HEAD Tag: (none) Log: - Updated the MSVC project files plus some minor compilation fixes Members: build/win32/libxvidcore.dsp:1.13->1.14 src/image/image.c:1.36->1.37 src/plugins/plugin_ssim.c:1.2->1.3 --------------------- Date: 2006/10/13 09:38:09 Author: Skal Branch: HEAD Tag: (none) Log: + added a simple de-interlacing func (c + sse version), declared as xvid_image_deinterlace() in image.h Of course, one should prefer deinterlacing through some avisynth plugin, but... please update the dsp/dsw Members: build/generic/sources.inc:1.13->1.14 src/image/image.c:1.35->1.36 src/image/image.h:1.16->1.17 src/image/x86_asm/deintl_sse.asm:INITIAL->1.1 --------------------- Date: 2006/10/13 08:32:02 Author: Skal Branch: HEAD Tag: (none) Log: + added a forgotten ARCH_IS_IA32 + added some missing emms() after asm calls (since floats are used) Members: src/plugins/plugin_ssim.c:1.1->1.2 --------------------- Date: 2006/10/11 16:55:28 Author: Skal Branch: HEAD Tag: (none) Log: + added a bench for SSIM's internal function (`xvid_bench 15`) Members: examples/xvid_bench.c:1.28->1.29 --------------------- Date: 2006/10/11 15:55:32 Author: Skal Branch: HEAD Tag: (none) Log: + added SSIM plugin code Patch by Johannes Reinhardt at uni-konstanz dot de Members: build/generic/sources.inc:1.12->1.13 src/xvid.h:1.55->1.56 src/plugins/plugin_ssim.c:INITIAL->1.1 src/plugins/plugin_ssim.h:INITIAL->1.1 src/plugins/x86_asm/plugin_ssim-a.asm:INITIAL->1.1 --------------------- Date: 2006/10/11 15:52:06 Author: Skal Branch: HEAD Tag: (none) Log: + added SSIM plugin to xvid_encraw.c (only for non-WIN32 for now) + modified Makefile to use generic/=build/libxvidcore.a direct path Patch by Johannes Reinhardt at uni-konstanz dot de Members: examples/Makefile:1.9->1.10 examples/xvid_encraw.c:1.27->1.28 --------------------- Date: 2006/09/22 05:40:11 Author: syskin Branch: HEAD Tag: (none) Log: stop
using cmov with mmx Members: src/quant/x86_asm/quantize_mpeg_mmx.asm:1.7->1.8 --------------------- Date: 2006/09/11 00:42:15 Author: Isibaar Branch: HEAD Tag: (none) Log: - small bug reported by Greg Handi Members: src/decoder.c:1.78->1.79 --------------------- Date: 2006/09/03 10:46:56 Author: Skal Branch: HEAD Tag: (none) Log: + added a protection flag XVID_SAFE_BS_TAIL for not reading more than 4 bytes past the end of the input buffer. This is disabled by default (because it is slow), and 8-byte padding of the input buffer should be the preferred solution in case of problems. Please cross-check I didn't break anything. Thanks to Liang Jian ( jianliang79 at gmail dot com ) for pointing out the problem. Members: src/bitstream/bitstream.h:1.23->1.24 --------------------- Date: 2006/08/23 22:27:22 Author: Skal Branch: HEAD Tag: (none) Log: Typo: use stride from data->current instead of data->reference Thanks to Johannes.Reinhardt at uni-konstanz dot de Members: src/plugins/plugin_dump.c:1.2->1.3 --------------------- Date: 2006/07/11 20:36:18 Author: Isibaar Branch: HEAD Tag: (none) Log: - updated graphics Members: dshow/src/Xvid_logo.bmp:1.2->1.3 vfw/src/Xvid_logo.bmp:1.2->1.3 vfw/src/xvid.ico:1.1->1.2 --------------------- Date: 2006/07/11 19:17:09 Author: chl Branch: HEAD Tag: (none) Log: ARG_FRAMERATE=0. 
broke encoding with default Members: examples/xvid_encraw.c:1.26->1.27 --------------------- Date: 2006/07/11 12:19:27 Author: chl Branch: HEAD Tag: (none) Log: linking to pthread library was missing Members: examples/Makefile:1.8->1.9 --------------------- Date: 2006/07/11 12:01:27 Author: chl Branch: HEAD Tag: (none) Log: missing .endfunc Members: src/quant/x86_asm/quantize_mpeg_mmx.asm:1.6->1.7 --------------------- Date: 2006/07/10 19:39:23 Author: Isibaar Branch: HEAD Tag: (none) Log: - updated profile definitions Members: vfw/src/codec.c:1.20->1.21 vfw/src/config.c:1.31->1.32 vfw/src/config.h:1.10->1.11 --------------------- Date: 2006/07/10 19:25:23 Author: Isibaar Branch: HEAD Tag: (none) Log: - increment bs version to 45 Members: src/xvid.h:1.54->1.55 --------------------- Date: 2006/07/10 10:09:59 Author: syskin Branch: HEAD Tag: (none) Log: faster and waaay more precise mpeg intra quantization Members: src/encoder.h:1.31->1.32 src/xvid.c:1.69->1.70 src/quant/quant.h:1.6->1.7 src/quant/quant_matrix.c:1.14->1.15 src/quant/quant_matrix.h:1.7->1.8 src/quant/quant_mpeg.c:1.2->1.3 src/quant/x86_asm/quantize_mpeg_mmx.asm:1.5->1.6 src/quant/x86_asm/quantize_mpeg_xmm.asm:1.5->1.6 src/utils/mbtransquant.c:1.31->1.32 --------------------- Date: 2006/07/08 16:19:04 Author: Skal Branch: HEAD Tag: (none) Log: some compile fix... note: -start only works for raw YUV input (type 0). Members: examples/xvid_encraw.c:1.25->1.26 --------------------- Date: 2006/06/17 15:07:55 Author: Isibaar Branch: HEAD Tag: (none) Log: - Enabled Skal's new SIMD optimizations for GMC Members: build/win32/libxvidcore.dsp:1.12->1.13 src/motion/gmc.c:1.5->1.6 --------------------- Date: 2006/06/16 12:08:28 Author: syskin Branch: HEAD Tag: (none) Log: xvid_encraw with AVI input support, possible MKV output support, and all options/settings. Possibly the ugliest piece of code in our tree. Needs a rewrite. 
Members: examples/xvid_encraw.c:1.24->1.25 --------------------- Date: 2006/06/14 23:44:07 Author: Skal Branch: HEAD Tag: (none) Log: added mmx/sse2 code for GMC (3-pts only). new file: image/x86_asm/gmc_mmx.asm At this point, new GMC code isn't enabled (gmc.c:586). So: this commit should give binary-exact same input/output than before. dsp/dsw not updated. Members: build/generic/sources.inc:1.11->1.12 src/xvid.c:1.68->1.69 src/image/x86_asm/gmc_mmx.asm:INITIAL->1.1 src/motion/gmc.c:1.4->1.5 src/motion/gmc.h:1.2->1.3 --------------------- Date: 2006/06/07 23:00:55 Author: Skal Branch: HEAD Tag: (none) Log: + fix for a long-standing typo in the clipping value for 1-pts GMC prediction. Members: src/motion/gmc.c:1.3->1.4 --------------------- Date: 2006/06/05 23:30:49 Author: Skal Branch: HEAD Tag: (none) Log: + added a test_yuv() stub to test YUV functions, at least : yv12_to_yuyv and yv12_to_uyvy Members: examples/xvid_bench.c:1.27->1.28 --------------------- Date: 2006/06/05 23:27:36 Author: Skal Branch: HEAD Tag: (none) Log: + faster yv12->yuyv / uyvy MMX functions patch suggested by Carlo Bramini ( carlo bramix at libero dot it ) Members: src/image/x86_asm/colorspace_yuyv_mmx.asm:1.6->1.7 --------------------- Date: 2006/05/28 09:52:45 Author: suxen_drol Branch: HEAD Tag: (none) Log: define _INTPTR_T_DEFINED Members: src/portab.h:1.55->1.56 --------------------- Date: 2006/05/06 06:37:15 Author: syskin Branch: HEAD Tag: (none) Log: missing #include b0rks compilation Members: src/plugins/plugin_lumimasking.c:1.5->1.6 --------------------- Date: 2006/04/26 19:44:29 Author: Skal Branch: HEAD Tag: (none) Log: + bswap and quant_h264_intra naming fix. 
Patch by Thomas Koeckerbauer ( k0055217 at students dot uni-linz dot ac dot at ) Members: src/portab.h:1.54->1.55 src/quant/quant.h:1.5->1.6 --------------------- Date: 2006/04/25 17:19:27 Author: syskin Branch: HEAD Tag: (none) Log: write stats file in the same directory as target file, not in root directory Members: vfw/src/config.h:1.9->1.10 --------------------- Date: 2006/04/19 17:42:19 Author: syskin Branch: HEAD Tag: (none) Log: final skip threshold had its sign reversed, oops Members: src/motion/estimation_pvop.c:1.21->1.22 --------------------- Date: 2006/04/15 06:17:02 Author: syskin Branch: HEAD Tag: (none) Log: s/max/MAX - *nix compilation bustage fix Members: src/plugins/plugin_lumimasking.c:1.4->1.5 --------------------- Date: 2006/04/14 09:24:47 Author: Skal Branch: HEAD Tag: (none) Log: preserve the intervening bytes in BitstreamInit() original reports by Alex Volkov and Liang Jian. Members: src/bitstream/bitstream.h:1.22->1.23 --------------------- Date: 2006/04/13 22:48:06 Author: Isibaar Branch: HEAD Tag: (none) Log: - debian patch by GomGom Members: debian/README.Debian:INITIAL->1.1 debian/changelog:INITIAL->1.1 debian/control:INITIAL->1.1 debian/rules:INITIAL->1.1 --------------------- Date: 2006/03/27 13:21:48 Author: Skal Branch: HEAD Tag: (none) Log: fix for the visual_object_verid vs. video_object_layer_verid problem of 6.3.3, reported by Li Xiang (lixiang01 at gmail dot com) Thanks for report and test bitstreams! Members: src/decoder.c:1.77->1.78 src/decoder.h:1.17->1.18 src/bitstream/bitstream.c:1.55->1.56 --------------------- Date: 2006/03/11 13:10:42 Author: syskin Branch: HEAD Tag: (none) Log: let lumimasking work with quant 1 too Members: src/plugins/plugin_lumimasking.c:1.3->1.4 --------------------- Date: 2006/03/05 05:01:07 Author: syskin Branch: HEAD Tag: (none) Log: detect pthreads and add proper linking flags if they are found. 
patch by caro from irc Members: build/generic/configure.in:1.20->1.21 --------------------- Date: 2006/03/03 12:54:58 Author: syskin Branch: HEAD Tag: (none) Log: fixed destructor bug - temp lambdas not freed Members: src/encoder.c:1.127->1.128 --------------------- Date: 2006/02/27 13:16:04 Author: suxen_drol Branch: HEAD Tag: (none) Log: mingw compatibility, remove gcc warnings Members: src/motion/motion_smp.h:1.3->1.4 --------------------- Date: 2006/02/27 01:24:02 Author: syskin Branch: HEAD Tag: (none) Log: synchronize only once *slaps forehead* Members: src/motion/estimation_pvop.c:1.20->1.21 --------------------- Date: 2006/02/27 01:22:31 Author: syskin Branch: HEAD Tag: (none) Log: cosmetics; make it compile on linux and others without #defines Members: src/motion/motion_smp.h:1.2->1.3 --------------------- Date: 2006/02/26 02:52:34 Author: suxen_drol Branch: HEAD Tag: (none) Log: add debug.c Members: dshow/dshow.dsp:1.6->1.7 dshow/src/debug.c:INITIAL->1.1 dshow/src/debug.h:1.5->1.6 --------------------- Date: 2006/02/25 05:41:12 Author: suxen_drol Branch: HEAD Tag: (none) Log: win32: populate info.num_thread fields using GetProcessAffinityMask() Members: src/xvid.c:1.67->1.68 --------------------- Date: 2006/02/25 02:20:41 Author: syskin Branch: HEAD Tag: (none) Log: oops I forgot to commit these yesterday ;_; Members: src/motion/estimation_bvop.c:1.24->1.25 src/motion/estimation_pvop.c:1.19->1.20 --------------------- Date: 2006/02/25 00:35:04 Author: suxen_drol Branch: HEAD Tag: (none) Log: add minfcode and minbcode members to SMPmotionData struct Members: src/motion/motion_smp.h:1.1->1.2 --------------------- Date: 2006/02/24 23:59:07 Author: suxen_drol Branch: HEAD Tag: (none) Log: prevent segfault when encoding application calls compress_end with NULL codec context (PerfectDark at yandex dot ru) Members: vfw/src/codec.c:1.19->1.20 --------------------- Date: 2006/02/24 15:18:59 Author: syskin Branch: HEAD Tag: (none) Log: SMP update - don't run encoding 
in parallel after all Members: src/encoder.c:1.126->1.127 --------------------- Date: 2006/02/24 11:39:23 Author: syskin Branch: HEAD Tag: (none) Log: support -threads parameter (defaults to zero) Members: examples/xvid_encraw.c:1.23->1.24 --------------------- Date: 2006/02/24 09:46:22 Author: syskin Branch: HEAD Tag: (none) Log: multithreaded encoding Members: src/encoder.c:1.125->1.126 src/encoder.h:1.30->1.31 src/motion/estimation_bvop.c:1.23->1.24 src/motion/estimation_pvop.c:1.18->1.19 src/motion/motion_smp.h:INITIAL->1.1 --------------------- Date: 2006/02/24 09:33:52 Author: syskin Branch: HEAD Tag: (none) Log: enable number of threads; treat it as any other config (no auto-detection) Members: vfw/src/config.c:1.30->1.31 --------------------- Date: 2006/02/23 08:22:43 Author: syskin Branch: HEAD Tag: (none) Log: reset dquant table, all of it Members: src/encoder.c:1.124->1.125 --------------------- Date: 2006/02/15 21:58:43 Author: Isibaar Branch: HEAD Tag: (none) Log: - N-VOP patch by Andrew Dunstan Members: src/encoder.c:1.123->1.124 --------------------- Date: 2006/02/15 20:16:39 Author: Isibaar Branch: HEAD Tag: (none) Log: Bugfix: Decoding was prematurely terminated upon EOF Members: examples/xvid_decraw.c:1.23->1.24 --------------------- Date: 2006/01/19 23:25:18 Author: Isibaar Branch: HEAD Tag: (none) Log: - Added MV bits to statistics Members: src/encoder.c:1.122->1.123 src/encoder.h:1.29->1.30 src/bitstream/mbcoding.c:1.52->1.53 --------------------- Date: 2006/01/17 20:06:25 Author: Isibaar Branch: HEAD Tag: (none) Log: - Removed the 9999 frames encode limit from xvid_encraw Members: examples/xvid_encraw.c:1.22->1.23 --------------------- Date: 2006/01/09 01:39:43 Author: Isibaar Branch: HEAD Tag: (none) Log: - fix for EM64T platform Members: src/xvid.c:1.66->1.67 --------------------- Date: 2006/01/08 23:25:57 Author: Isibaar Branch: HEAD Tag: (none) Log: - Increased the bs_version to 43 Members: src/xvid.h:1.53->1.54 --------------------- Date: 
2005/12/30 15:04:49 Author: Isibaar Branch: HEAD Tag: (none) Log: - Initialize dec->bs_version to high value. Before it seemed uninitialized for non Xvid streams... Members: src/decoder.c:1.76->1.77 --------------------- Date: 2005/12/30 14:52:32 Author: Isibaar Branch: HEAD Tag: (none) Log: - Made the debug build config link again Members: dshow/dshow.dsp:1.5->1.6 dshow/src/debug.h:1.4->1.5 --------------------- Date: 2005/12/24 02:06:20 Author: Isibaar Branch: HEAD Tag: (none) Log: - (hopefully) fixed the decoder bugs reported by Michael Niedermayer Members: src/decoder.c:1.75->1.76 --------------------- Date: 2005/12/18 07:52:12 Author: syskin Branch: HEAD Tag: (none) Log: cleanup; skip decision moved to separate function Members: src/motion/estimation_pvop.c:1.17->1.18 src/motion/motion.h:1.23->1.24 --------------------- Date: 2005/12/18 03:55:54 Author: syskin Branch: HEAD Tag: (none) Log: -freduce-all-givs not supported by gcc4 - easiest to just remove Members: dshow/Makefile:1.5->1.6 vfw/bin/Makefile:1.4->1.5 --------------------- Date: 2005/12/17 14:57:15 Author: syskin Branch: HEAD Tag: (none) Log: stupid typo in latest patch Members: src/image/image.c:1.34->1.35 --------------------- Date: 2005/12/17 13:04:52 Author: syskin Branch: HEAD Tag: (none) Log: easier image_interpolate() call, obsolete comments removed Members: src/encoder.c:1.121->1.122 src/image/image.c:1.33->1.34 src/image/image.h:1.15->1.16 --------------------- Date: 2005/12/17 12:24:32 Author: syskin Branch: HEAD Tag: (none) Log: ancient useless code removed Members: src/image/image.c:1.32->1.33 --------------------- Date: 2005/12/10 06:20:35 Author: syskin Branch: HEAD Tag: (none) Log: slightly better trellis - check at least 3 coefficients. 
0.05dB better with no measurable speed penalty Members: src/utils/mbtransquant.c:1.30->1.31 --------------------- Date: 2005/12/09 05:45:35 Author: syskin Branch: HEAD Tag: (none) Log: expose VHQ and Trellis lambdas to HVS plugins Members: src/encoder.c:1.120->1.121 src/encoder.h:1.28->1.29 src/global.h:1.24->1.25 src/xvid.h:1.52->1.53 src/motion/estimation_rd_based.c:1.13->1.14 src/motion/estimation_rd_based_bvop.c:1.9->1.10 src/utils/mbtransquant.c:1.29->1.30 --------------------- Date: 2005/12/09 05:39:49 Author: syskin Branch: HEAD Tag: (none) Log: tuning lambdas for better PSNR and vhq0 mode decision Members: src/motion/estimation.h:1.12->1.13 src/motion/estimation_common.c:1.12->1.13 --------------------- Date: 2005/11/25 13:07:01 Author: chl Branch: HEAD Tag: (none) Log: remove "xvid" in PGM-header, so xvid_encraw understands it Members: examples/xvid_decraw.c:1.22->1.23 --------------------- Date: 2005/11/22 11:53:10 Author: suxen_drol Branch: HEAD Tag: (none) Log: update cvs-head to reflect xvid-1.2 development status: set build string to "xvid-1.2.0-dev" set XVID_VERSION to 1.2.-127 set XVID_BS_VERSION to 40 set XVID_UNSTABLE Members: src/xvid.c:1.65->1.66 src/xvid.h:1.51->1.52 --------------------- Date: 2005/11/22 11:23:01 Author: suxen_drol Branch: HEAD Tag: (none) Log: cleanings in code spotted by sparse (ed dot gomez at free dot fr> Members: src/decoder.c:1.74->1.75 src/encoder.c:1.119->1.120 src/xvid.c:1.64->1.65 src/bitstream/bitstream.c:1.54->1.55 src/dct/idct.c:1.8->1.9 src/image/colorspace.c:1.10->1.11 src/image/font.c:1.6->1.7 src/image/qpel.c:1.7->1.8 src/motion/estimation_rd_based.c:1.12->1.13 src/motion/estimation_rd_based_bvop.c:1.8->1.9 src/prediction/mbprediction.c:1.17->1.18 src/utils/emms.c:1.10->1.11 src/utils/mbtransquant.c:1.28->1.29 src/utils/timer.h:1.10->1.11 --------------------- Date: 2005/11/03 06:44:07 Author: Skal Branch: HEAD Tag: (none) Log: typo fixed (thanks squid_80) Members: examples/xvid_bench.c:1.26->1.27 
--------------------- Date: 2005/10/26 14:38:33 Author: Skal Branch: HEAD Tag: (none) Log: + removed the x_Ref%4 in qpel.h, in favor of x_Ref>>2. As suggested by Gruel, there might be a compiler problem for some very very exotic platform. Hence, I've added a test_compiler() in xvid_bench.c, to be sure everything is ok. Hope the test is correct. + added benches for interlaced decoding, as supplied by Christoph Kuehnel (info at intek-darmstadt dot de). Thanks a lot. Members: examples/xvid_bench.c:1.25->1.26 src/image/qpel.h:1.6->1.7 --------------------- Date: 2005/10/23 00:32:44 Author: Isibaar Branch: HEAD Tag: (none) Log: - Renamed and extended the profiles Members: vfw/src/codec.c:1.18->1.19 vfw/src/config.c:1.29->1.30 vfw/src/config.h:1.8->1.9 --------------------- Date: 2005/10/16 02:00:04 Author: suxen_drol Branch: HEAD Tag: (none) Log: vfw quality presets Members: vfw/src/codec.c:1.17->1.18 vfw/src/config.c:1.28->1.29 vfw/src/config.h:1.7->1.8 vfw/src/resource.h:1.9->1.10 vfw/src/resource.rc:1.21->1.22 --------------------- Date: 2005/10/09 09:38:33 Author: suxen_drol Branch: HEAD Tag: (none) Log: TODO/Changelog update Members: ChangeLog:1.13->1.14 TODO:1.7->1.8 2005/10/8 0:58:2, 'suxen_drol' compatibility with haali media splitter: - FORMAT_MPEG2Video support - handle uppercase MP4V fourcc/clsid 2005/10/7 15:2:28, 'suxen_drol' minor xvid_{enc,dec}_raw fixes: - fix clock resolution (thanks yuri khan) - link vfw32.lib for win32 avifile support - honour avifile stream length 2005/10/6 18:28:31, 'Isibaar' - added avi/avs input support - various new options 2005/10/6 10:46:42, 'Isibaar' - Wiped the remainders of RRV encoding support - Marked the RRV flags as obsolete in xvid.h API 2005/10/5 11:20:22, 'suxen_drol' vfw: replace "Picture Aspect Ratio" with "Display Aspect Ratio" 2005/9/24 3:10:37, 'suxen_drol' bugfix: calc_cbp_mmx was ignoring negative coeff case. 
have replaced "coeff_sum>0" evaluation with "coeff_sum != 0" see http://forum.doom9.org/showthread.php?t=100275 for description of bug. 2005/9/23 12:53:35, 'suxen_drol' +ve/-ve cbp test (to demonstrate fault with current calc_cbp_mmx function 2005/9/20 11:54:11, 'suxen_drol' > > - uint32_t intra_dc_threshold; /* fake variable */ > > + int intra_dc_threshold; /* fake variable */ This patch fixes a warning spotted by gcc 4.0.1, because &intra_dc_threshold is passed to some function which expects a int*, not a uint32_t* (on 64bit this is important, even if this is fake data, the callee could corrupt the stack writing 64bit to a 32bit allocated destination) 2005/9/20 11:51:40, 'suxen_drol' msvc fails on void* arithmetic in xvid_bench.c 2005/9/20 11:19:34, 'suxen_drol' update example documentation to "newer" commandline arguments for encraw/decraw (the arguments were changed ~2003). bugfix: prevent endless loop when useful_bytes==1 within xvid_decraw.c 2005/9/18 1:34:13, 'suxen_drol' renamed dshow "Aspect_Ratio" registry key to "Decoder_Aspect_Ratio", in order to prevent conflict with vfw encoder registry key. 2005/9/15 10:52:28, 'suxen_drol' bugfix: support for aspect ratio when decoding unpacked b-frames 2005/9/15 10:55:29, 'suxen_drol' OutputDebugString cleanup 2005-09-19 19:37:45 GMT patch-38 Summary: Renamed dshow aspect ratio registry key Revision: xvidcore--head--0.0--patch-38 From pete: * Renamed dshow "Aspect_Ratio" registry key to "Decoder_Aspect_Ratio", in order to prevent conflict with vfw encoder registry key. 
modified files: dshow/src/config.c 2005-09-15 16:30:59 GMT patch-37 Summary: Field interlaced decoding Revision: xvidcore--head--0.0--patch-37 From Christoph Kuehnel: * decoder.c - Some new defines for DIV - modified: had wrong address offsets for interlaced - = new function for interlaced - = new function for interlaced motion vector prediction - modified so that it differs between frame and field prediction * global.h - For field motion prediction MACROBLOCK has new member that is the average of field1 and field2 motion vector = * xvid.c - For field predicted macroblocks we need new field oriented transfer functions. For colour calculations they may only process 4 lines (one field from the colour macroblock that is 8x8). So I introduced 4 new function pointers: * mbcoding.c - _DEBUG code; index is checked against 64 * interpolate8x8.[c,h,asm] - New 8x4 functions * mbprediction.[c,h] - New function for interlaced prediction according to spec * mem_transfer.[c,h,asm] - New 8x4 function modified files: AUTHORS src/bitstream/mbcoding.c src/decoder.c src/global.h src/image/image.c src/image/image.h src/image/interpolate8x8.c src/image/interpolate8x8.h src/image/x86_asm/interpolate8x8_3dn.asm src/image/x86_asm/interpolate8x8_3dne.asm src/image/x86_asm/interpolate8x8_mmx.asm src/image/x86_asm/interpolate8x8_xmm.asm src/prediction/mbprediction.c src/prediction/mbprediction.h src/utils/mem_transfer.c src/utils/mem_transfer.h src/utils/x86_asm/mem_transfer_3dne.asm src/utils/x86_asm/mem_transfer_mmx.asm src/xvid.c vfw/src/config.c vfw/src/driverproc.c 2005-09-15 16:27:07 GMT patch-36 Summary: Optimized cbp operator on IA32 Revision: xvidcore--head--0.0--patch-36 From carlo dot bramix at libero dot it: - Optimized CBP operator for ia32 arch modified files: examples/xvid_bench.c src/bitstream/x86_asm/cbp_mmx.asm 2005-09-15 16:20:49 GMT patch-35 Summary: OutputDebugString cleanup Revision: xvidcore--head--0.0--patch-35 From pete: - OutputDebugString cleanup modified files: 
dshow/dshow.dsp dshow/sources.inc dshow/src/CXvidDecoder.cpp dshow/src/debug.h 2005-07-26 23:32:52 GMT patch-34 Summary: Long standing error in xvid_decraw for big endian hosts Revision: xvidcore--head--0.0--patch-34 From ed.gomez, spotted by Paul Kurucz ptk9417 at ritvax.isc.rit.edu * No width and height declared in write_tga bug. modified files: examples/xvid_decraw.c 2005-07-25 19:45:25 GMT patch-33 Summary: Fix big endian tga output for decraw. Revision: xvidcore--head--0.0--patch-33 From ed.gomez: * Writing pointed data is always better than memory addresses modified files: examples/xvid_decraw.c 2005-07-14 14:25:43 GMT patch-32 Summary: Disable packed-bframes widget for dxn profile Revision: xvidcore--head--0.0--patch-32 From pete: * Disable packed-bframes widget for dxn profile modified files: vfw/src/config.c 2005-07-14 14:22:53 GMT patch-31 Summary: Fixed qpel for gcc4 and x86_64 Revision: xvidcore--head--0.0--patch-31 From Martin Drab * Incomplete type definitions are not supported in GCC4 and newer. This was breaking x86_64. From ed.gomez: * If both generic and x86_64 share 90% of the array declaration, better use an extern macro. Makes the code clearer. modified files: src/image/qpel.c 2005-07-14 14:11:24 GMT patch-30 Summary: Added VHQ support to xvid_encraw Revision: xvidcore--head--0.0--patch-30 From skal: * Added VHQ support to xvid_encraw modified files: examples/xvid_encraw.c 2005-06-26 15:05:01 GMT patch-29 Summary: Merge noise Revision: xvidcore--head--0.0--patch-29 From ed.gomez: - Merge noise forgotten bit. Other small differences exist with the CVS tree, but I consider them to not fulfill the local code style and thus don't fit well... 
modified files: src/dct/idct.h 2005-06-26 15:02:05 GMT patch-28 Summary: Bench updates Revision: xvidcore--head--0.0--patch-28 From skal: - Fixed bench for big endian platforms, updated tests modified files: examples/bench.pl examples/bench_list.pl examples/xvid_bench.c 2005-06-26 14:59:17 GMT patch-27 Summary: Optimized C mem transfer functions Revision: xvidcore--head--0.0--patch-27 From skal: - Optimized C mem transfer funcs, disabled for safety. Enabled by undefining USE_REFERENCE_CODE at the top of the mem_transfer.c file modified files: src/utils/mem_transfer.c 2005-06-26 14:55:35 GMT patch-26 Summary: Optimized gcd Revision: xvidcore--head--0.0--patch-26 From skal: - Optimized GCD, added test for gcd in xvid_bench modified files: examples/xvid_bench.c src/encoder.c 2005-06-26 14:51:35 GMT patch-25 Summary: Fixed write_video_packet_header Revision: xvidcore--head--0.0--patch-25 From Sigdrak at free.fr: - Fix write_video_packet_header() which was buggy and kind of obfuscated. From skal: - Fixed log table - Small cleanup modified files: src/bitstream/bitstream.c 2005-06-26 14:46:23 GMT patch-24 Summary: Added greyscale option support in xvid_encraw Revision: xvidcore--head--0.0--patch-24 Added greyscale option support in xvid_encraw modified files: examples/xvid_encraw.c 2005-06-26 14:43:42 GMT patch-23 Summary: IEEE-1180 SSE2 iDCT implementation Revision: xvidcore--head--0.0--patch-23 From skal: - Implemented IEEE-1180 SSE2 iDCT. Disabled for safety. modified files: src/dct/x86_asm/fdct_sse2_skal.asm src/xvid.c 2005-05-18 22:08:12 GMT patch-22 Summary: No executable shared objects installed Revision: xvidcore--head--0.0--patch-22 From ed.gomez: * Do not install the lib as executable. It's no use as the SO has no main symbol anyway, and the static lib is not runnable anyway. 
modified files: build/generic/Makefile 2005-05-18 22:05:09 GMT patch-21 Summary: Statically link xvid_bench with libxvidcore.a Revision: xvidcore--head--0.0--patch-21 Statically link xvid_bench with libxvidcore.a modified files: examples/Makefile 2005-05-18 21:59:27 GMT patch-20 Summary: New autoconf garbage removal Revision: xvidcore--head--0.0--patch-20 New autoconf garbage removal modified files: build/generic/bootstrap.sh 2005-05-18 21:58:16 GMT patch-19 Summary: Quotes in configure.in Revision: xvidcore--head--0.0--patch-19 Quotes in configure.in modified files: build/generic/configure.in 2005-05-18 19:40:18 GMT patch-18 Summary: Added bitstream helper functions for packets. Revision: xvidcore--head--0.0--patch-18 From Skal: * Added helper functions for video packets, though they're still unused. modified files: src/bitstream/bitstream.c src/bitstream/bitstream.h 2005-05-18 19:30:41 GMT patch-17 Summary: A few more bench stuff Revision: xvidcore--head--0.0--patch-17 From Skal: * Added Perl scripts to automate benches. * Worked on xvid_bench tests to cover more code. new files: examples/.arch-ids/bench.pl.id examples/.arch-ids/bench_list.pl.id examples/bench.pl examples/bench_list.pl modified files: examples/xvid_bench.c 2005-05-18 19:22:28 GMT patch-16 Summary: Decoder cleanup for memory de/allocation Revision: xvidcore--head--0.0--patch-16 From Skal: * Memory de/allocation code refactored using goto. modified files: src/decoder.c 2005-05-11 21:18:41 GMT patch-15 Summary: Export only public API for GNU/Linux and Solaris Revision: xvidcore--head--0.0--patch-15 From ed.gomez: * Use ld version script to hide internal functions. new files: build/generic/.arch-ids/libxvidcore.ld.id build/generic/libxvidcore.ld modified files: build/generic/Makefile build/generic/configure.in 2005-05-11 21:07:00 GMT patch-14 Summary: Warnings GCC4 Revision: xvidcore--head--0.0--patch-14 From ed.gomez: * Remove all GCC 4 warnings. 
modified files: src/bitstream/bitstream.c src/bitstream/bitstream.h src/decoder.c src/encoder.c src/image/image.c src/image/image.h src/motion/estimation_common.c src/utils/mbtransquant.c 2005-05-11 20:18:49 GMT patch-13 Summary: Add support for gcc-4 in configure system Revision: xvidcore--head--0.0--patch-13 From ed.gomez: * Added gcc 4 detection and CFLAG option filtering for it. modified files: build/generic/configure.in 2005-05-11 20:07:54 GMT patch-12 Summary: Revision: xvidcore--head--0.0--patch-12 From pete: * bugfix: correct max bitrate display for slider layout: "(kbps)" added to average bitrate labels within calculator dialog modified files: vfw/src/config.c vfw/src/resource.rc 2005-05-11 20:06:04 GMT patch-11 Summary: Fix alignment issue for mem transfer Revision: xvidcore--head--0.0--patch-11 From skal: * Fix alignment issue (32 bit reading from non aligned memory) likely for RISC CPUs using the C code. modified files: src/utils/mem_transfer.c 2005-05-11 20:03:57 GMT patch-10 Summary: Get time function right on win32 Revision: xvidcore--head--0.0--patch-10 From Skal: * Get the time function right for win32 (ms precision) modified files: examples/xvid_bench.c 2005-05-11 20:01:28 GMT patch-9 Summary: Better ASP bitstream autodetection Revision: xvidcore--head--0.0--patch-9 From pete: - Use more flags to determine ASP activation or not. modified files: src/bitstream/bitstream.c ######################################################################### # 1.1.0-beta2 (Bitstream Version 39) ######################################################################### 2005-04-03 20:15:00 GMT patch-7 Summary: Makefile credits and whitespace cleaning Revision: xvidcore--head--0.0--patch-7 Makefile credits and whitespace cleaning modified files: dshow/Makefile vfw/bin/Makefile 2005-04-03 19:52:35 GMT patch-6 Summary: Various small things to vbv conformance and divx5 compatibility. 
Revision: xvidcore--head--0.0--patch-6 From pete: xvidcore ======== * added XVID_GLOBAL_DIVX5_USERDATA global flag * removed the bvop delay warning text ("warning: nothing to output"), as this often confuses joe user. * minor change to closed gop image_printf statement: s/"DX50 BVOP->PVOP"/"CLOSED GOP BVOP->PVOP" * additional comments for low_delay_default mode within decoder_decode() * divx userdata string: s/DivX999b000/DivX503b1393. this has been suggested by dxn for improved hardware compatibility [nb: I don't have a hardware player to confirm this] * vbv_peakrate constraint is ignored if <= 0 vfw frontend ============ * dxn profiles now conform to "DivX Certified Profile Compatibility v1.1", February 2005. this document was provided by DivXNetworks, USA. when a dxn profile is selected, strict conformance is enabled: - force 1:1 picture aspect ratio - disable bframes if interlacing is enabled - force maximum of 1 consecutive bvops for the portable and ht profiles, 2 bvops for the hd profile - always write divx 5 userdata string to bitstream - force packed bitstream option - updated dxn vbv parameters * added PROFILE_4MV flag. 4mv is now disabled for the dxn handheld profile. * moved PROFILE_AS/PROFILE_ARTS/PROFILE_S to config.c * profile[].max_bitrate now measured in bit/sec (not kbps) * profile->level box: widgets are now greyed-out if they are not used. * increase vertical size of profile drop down list. * about box button: s/Dismiss/OK modified files: src/bitstream/bitstream.c src/decoder.c src/encoder.c src/plugins/plugin_2pass2.c src/xvid.h vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.h vfw/src/resource.rc 2005-04-03 19:50:05 GMT patch-5 Summary: Fixed bug in cartoon mode. Revision: xvidcore--head--0.0--patch-5 From Isibaar: - Fixed cartoon bug as described by CrunCher. 
modified files: src/motion/estimation_pvop.c 2005-03-22 20:40:47 GMT patch-4 Summary: Added MPEG quant support for PPC Revision: xvidcore--head--0.0--patch-4 From Christoph Nageli: * Added support for MPEG quant functions for PPC. new files: src/quant/ppc_asm/.arch-ids/quant_mpeg_altivec.c.id src/quant/ppc_asm/quant_mpeg_altivec.c modified files: build/generic/sources.inc src/quant/quant.h src/xvid.c 2005-03-18 18:00:13 GMT patch-3 Summary: Updated ChangeLog Revision: xvidcore--head--0.0--patch-3 Updated ChangeLog modified files: ChangeLog 2005-03-18 17:53:24 GMT patch-2 Summary: Colorspace code for PPC Revision: xvidcore--head--0.0--patch-2 From Christoph Nageli: - Colorspace function fixes for non 16bytes aligned target addresses. modified files: src/image/ppc_asm/colorspace_altivec.c 2005-03-18 17:39:00 GMT patch-1 Summary: Fix for 64bit interlacing Revision: xvidcore--head--0.0--patch-1 From Andrew Dunstan: * Fixed bug where 64bit mov should have been 32bit modified files: src/utils/x86_64_asm/interlacing_mmx.asm 2005-03-18 17:28:00 GMT base-0 Summary: tag of ed.gomez@free.fr--2004-1/xvidcore--head--0.0--patch-121 Revision: xvidcore--head--0.0--base-0 (automatically generated log message) # Change of arch/tla archive, explains the patch number wraparound 2005-03-18 16:58:08 GMT patch-121 Summary: ME work Revision: xvidcore--head--0.0--patch-121 From Isibaar: - Cartoon mode bugfix - New lambda tables for R-D motion search. The old tables were obviously taken from h.264, which uses a logarithmic quantizer scale. This led to bad results at very low bit-rates. With this patch, compression efficiency at low bit-rates is greatly improved. 
modified files: src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_common.c src/motion/estimation_pvop.c 2005-03-18 16:56:13 GMT patch-120 Summary: Better instruction pairing in sad mmx Revision: xvidcore--head--0.0--patch-120 From Dark sylinc (dark_sylinc at yahoo dot com dot ar), committed by Isibaar: * Better instruction pairing in sad_mmx.asm, improves speed. modified files: src/motion/x86_asm/sad_mmx.asm src/utils/emms.c 2005-03-18 16:53:00 GMT patch-119 Summary: Fixed resource leak in Dshow Revision: xvidcore--head--0.0--patch-119 From antonz, committed by Isibaar: * Fixed resource leaking caused by poor xvidcore initialization tracking. modified files: dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h 2005-03-18 16:50:44 GMT patch-118 Summary: Debug flag support in vfw Revision: xvidcore--head--0.0--patch-118 From pete: * debug flag support for vfw decoder. modified files: vfw/src/codec.c ######################################################################### # 1.1.0-beta1 (Bitstream Version 38) ######################################################################### 2005-01-16 10:27:41 GMT patch-117 Summary: License was using wrong linefeeds for vfw Revision: xvidcore--head--0.0--patch-117 License was using wrong linefeeds for vfw new files: vfw/.arch-ids/LICENSE.id vfw/LICENSE modified files: vfw/src/resource.rc 2005-01-10 22:59:46 GMT patch-116 Summary: Last minutes vfw bugfixes/improvements Revision: xvidcore--head--0.0--patch-116 From sysKin: * last minute fixes and improvements to vfw frontend. modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.rc 2005-01-09 20:37:32 GMT patch-115 Summary: Marking 1.1.0 beta1 Revision: xvidcore--head--0.0--patch-115 From ed.gomez: * Marking xvid 1.1.0 beta1 release. modified files: ChangeLog build/generic/configure.in src/xvid.c src/xvid.h 2005-01-09 20:15:14 GMT patch-114 Summary: Moved cartoon mode to zones in vfw. 
Revision: xvidcore--head--0.0--patch-114 From sysKin: * Moved cartoon mode to zones in vfw frontend. modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.rc 2005-01-09 20:09:27 GMT patch-113 Summary: Revision: xvidcore--head--0.0--patch-113 From algern0n (#xvid@freenode): * Avoid infinite loop when updating audio size. modified files: vfw/src/config.c 2005-01-09 11:32:41 GMT patch-112 Summary: Long standing bug in 2pass2 code. Double overflow accumulation. Revision: xvidcore--head--0.0--patch-112 From pengvado (x264 developer, sorry i don't have your realname): * rc_2pass2_after accumulates overflow twice, once in each I/PB subcase and then in a common code path. The common path was just supposed to store the stat struct entry error for statistics (even if they're unused) modified files: src/plugins/plugin_2pass2.c 2005-01-06 23:42:12 GMT patch-111 Summary: Merged amd64 branch fix Revision: xvidcore--head--0.0--patch-111 Merged amd64 branch fix Patches applied: * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-12 Merged upstream * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-13 Bug fix for qpel problem from Andrew Dunstan modified files: src/image/x86_64_asm/qpel_mmx.asm new patches: ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-12 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-13 2005-01-05 22:53:12 GMT patch-110 Summary: Merged x86_64 Linux port Revision: xvidcore--head--0.0--patch-110 Merged x86_64 Linux port Patches applied: * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--base-0 tag of ed.gomez@free.fr--2004-1/xvidcore--head--0.0--patch-96 * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-1 Merged mainline up to patch-101 * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-2 Merged mainline again for hotfixes * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-3 Added x86_64 detection in configure system * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-4 Added 
src/utils/x86_64_asm files * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-5 Added /src/quant/x86_64_asm files * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-6 Added src/motion/x86_64_asm files * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-7 Added src/dct/x86_64_asm * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-8 Added halfpel part of src/image/x86_64_asm files * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-9 Merged mainline * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-10 Ported the new mem transfer function * ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-11 Added src/image/x86_4 qpel code new files: src/dct/x86_64_asm/.arch-ids/=id src/dct/x86_64_asm/.arch-ids/fdct_mmx_skal.asm.id src/dct/x86_64_asm/.arch-ids/idct_mmx.asm.id src/dct/x86_64_asm/fdct_mmx_skal.asm src/dct/x86_64_asm/idct_mmx.asm src/image/x86_64_asm/.arch-ids/=id src/image/x86_64_asm/.arch-ids/interpolate8x8_mmx.asm.id src/image/x86_64_asm/.arch-ids/interpolate8x8_xmm.asm.id src/image/x86_64_asm/.arch-ids/qpel_mmx.asm.id src/image/x86_64_asm/interpolate8x8_mmx.asm src/image/x86_64_asm/interpolate8x8_xmm.asm src/image/x86_64_asm/qpel_mmx.asm src/motion/x86_64_asm/.arch-ids/=id src/motion/x86_64_asm/.arch-ids/sad_mmx.asm.id src/motion/x86_64_asm/.arch-ids/sad_xmm.asm.id src/motion/x86_64_asm/sad_mmx.asm src/motion/x86_64_asm/sad_xmm.asm src/quant/x86_64_asm/.arch-ids/=id src/quant/x86_64_asm/.arch-ids/quantize_h263_mmx.asm.id src/quant/x86_64_asm/.arch-ids/quantize_mpeg_xmm.asm.id src/quant/x86_64_asm/quantize_h263_mmx.asm src/quant/x86_64_asm/quantize_mpeg_xmm.asm src/utils/x86_64_asm/.arch-ids/=id src/utils/x86_64_asm/.arch-ids/cpuid.asm.id src/utils/x86_64_asm/.arch-ids/interlacing_mmx.asm.id src/utils/x86_64_asm/.arch-ids/mem_transfer_mmx.asm.id src/utils/x86_64_asm/cpuid.asm src/utils/x86_64_asm/interlacing_mmx.asm src/utils/x86_64_asm/mem_transfer_mmx.asm modified files: build/generic/configure.in 
build/generic/sources.inc examples/xvid_bench.c src/dct/fdct.h src/dct/idct.h src/image/interpolate8x8.h src/image/qpel.c src/image/qpel.h src/motion/sad.h src/portab.h src/quant/quant.h src/utils/emms.h src/utils/mbfunctions.h src/utils/mem_transfer.h src/xvid.c new directories: src/dct/x86_64_asm src/dct/x86_64_asm/.arch-ids src/image/x86_64_asm src/image/x86_64_asm/.arch-ids src/motion/x86_64_asm src/motion/x86_64_asm/.arch-ids src/quant/x86_64_asm src/quant/x86_64_asm/.arch-ids src/utils/x86_64_asm src/utils/x86_64_asm/.arch-ids new patches: ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--base-0 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-1 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-2 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-3 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-4 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-5 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-6 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-7 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-8 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-9 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-10 ed.gomez@free.fr--amd64/xvidcore--amd64work--0.0--patch-11 2004-12-19 16:58:13 GMT patch-109 Summary: bvhq speedup. Revision: xvidcore--head--0.0--patch-109 From sysKin: * Add cbp cost as soon as possible, so it saves a few candidates testing. modified files: src/motion/estimation_rd_based_bvop.c 2004-12-19 16:55:47 GMT patch-108 Summary: Added ia32 optimized code for new mem transfer operator. Revision: xvidcore--head--0.0--patch-108 From sysKin: * Added ia32 (xmm) optimized code for new mem transfer operator. 
modified files: src/utils/mem_transfer.c src/utils/mem_transfer.h src/utils/x86_asm/mem_transfer_mmx.asm src/xvid.c 2004-12-19 13:39:58 GMT patch-107 Summary: Added missing license header Revision: xvidcore--head--0.0--patch-107 From ed.gomez: * The GPL header was missing modified files: src/motion/estimation_rd_based_bvop.c 2004-12-19 12:41:02 GMT patch-106 Summary: Updated ChangeLog Revision: xvidcore--head--0.0--patch-106 Updated ChangeLog modified files: ChangeLog 2004-12-19 12:38:15 GMT patch-105 Summary: Merged stable 1.0.3 release patches Revision: xvidcore--head--0.0--patch-105 Merged stable 1.0.3 release patches Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-63 Trellis overflow for quant<=2 * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-64 Marking 1.0.3 release modified files: ChangeLog-1.0 src/utils/mbtransquant.c src/xvid.h new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-63 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-64 2004-12-19 11:15:53 GMT patch-104 Summary: Faster bvhq Revision: xvidcore--head--0.0--patch-104 From sysKin: * Faster bvhq skipping Intra test if the rd optimized rate is already < 24bits... some other things too modified files: src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_rd_based.c src/motion/estimation_rd_based_bvop.c 2004-12-10 20:51:56 GMT patch-103 Summary: Encoder cleanups. Revision: xvidcore--head--0.0--patch-103 From sysKin: * Moved greyscale code to mbcoding. * Moved the fcode code to its own function. * Some other minor cleanups. modified files: src/bitstream/mbcoding.c src/encoder.c 2004-12-10 20:39:23 GMT patch-102 Summary: Fixed patch-101 Revision: xvidcore--head--0.0--patch-102 From sysKin: * Add a cbp assignment that should not have disappeared in patch-101.
modified files: src/encoder.c 2004-12-09 22:53:20 GMT patch-101 Summary: Speedup using RD results Revision: xvidcore--head--0.0--patch-101 From sysKin: * Use cbp from RD to speedup things a bit. modified files: src/encoder.c 2004-12-09 22:51:02 GMT patch-100 Summary: Speedup RD a bit Revision: xvidcore--head--0.0--patch-100 From sysKin: * Saves a few multiplies in RD code by saving the quant*quant value into the SearchData struct. modified files: src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_rd_based.c 2004-12-09 22:48:06 GMT patch-99 Summary: Smarter fcode code Revision: xvidcore--head--0.0--patch-99 From sysKin: * Replaced old fcode code with smarter one. modified files: src/bitstream/mbcoding.c src/encoder.c src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_common.c src/motion/estimation_pvop.c src/motion/motion_inlines.h 2004-12-09 22:00:49 GMT patch-98 Summary: Removed Reduced Resolution Vops support Revision: xvidcore--head--0.0--patch-98 From sysKin: * We have long planned to remove support for RRV, as it adds complexity to the ME and to the decoder, and this feature fits nowhere in any MPEG4 profile we plan to support.
modified files: src/bitstream/bitstream.c src/bitstream/bitstream.h src/decoder.c src/encoder.c src/image/image.c src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_gmc.c src/motion/estimation_pvop.c src/motion/estimation_rd_based.c src/motion/estimation_rd_based_bvop.c src/motion/motion.h src/motion/motion_comp.c src/motion/motion_inlines.h src/motion/vop_type_decision.c src/utils/mbtransquant.c src/xvid.c 2004-12-07 23:58:12 GMT patch-97 Summary: Merged PowerPC fixes from christoph naegeli's branch Revision: xvidcore--head--0.0--patch-97 Merged PowerPC fixes from christoph naegeli's branch Patches applied: * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-11 Star-merge with Edouards Branch * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-12 debug alignment bugfixes * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-13 bugfixes in altivec alignment assumptions * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-14 linux gcc fixes * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-15 linux ppc long fixes * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-16 minor ppc linux enhancements... 
modified files: src/image/ppc_asm/colorspace_altivec.c src/image/ppc_asm/interpolate8x8_altivec.c src/image/ppc_asm/qpel_altivec.c src/motion/ppc_asm/sad_altivec.c src/quant/ppc_asm/quant_h263_altivec.c src/utils/ppc_asm/mem_transfer_altivec.c src/xvid.c new patches: chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-11 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-12 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-13 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-14 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-15 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-16 2004-11-24 22:10:17 GMT patch-96 Summary: Forgotten bit of patch-94 (vbv code in vfw) Revision: xvidcore--head--0.0--patch-96 Forgotten bit of patch-94 (vbv code in vfw) modified files: vfw/src/codec.c 2004-11-24 21:50:45 GMT patch-95 Summary: Changed default Brightness value in DShow frontend Revision: xvidcore--head--0.0--patch-95 Changed default Brightness value in DShow frontend modified files: dshow/src/config.c 2004-11-24 21:50:14 GMT patch-94 Summary: Added support for VBV in frontend. Revision: xvidcore--head--0.0--patch-94 From sysKin: * Added support code for VBV in VFW frontend. modified files: vfw/src/config.c vfw/src/resource.rc 2004-11-24 21:48:35 GMT patch-93 Summary: Added interlaced option parsing in xvid_encraw. Revision: xvidcore--head--0.0--patch-93 From christoph: * Added support for interlaced option in xvid_encraw. modified files: examples/xvid_encraw.c 2004-11-24 21:45:47 GMT patch-92 Summary: Synced with stable tree Revision: xvidcore--head--0.0--patch-92 Synced with stable tree Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-60 Fixed DiamondSearch * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-61 Fixed stride in DShow decoder. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-62 Fixed stride in vfw frontend. 
modified files: dshow/src/CXvidDecoder.cpp src/motion/estimation_common.c vfw/src/codec.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-60 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-61 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-62 2004-10-17 10:13:02 GMT patch-91 Summary: Syncing with Christoph Nageli branch Revision: xvidcore--head--0.0--patch-91 Syncing with Christoph Nageli branch Patches applied: * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--base-0 tag of ed.gomez@free.fr--2004-1/xvidcore--head--0.0--patch-68 * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-1 interpolate8x8_haflpel add functions * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-2 little enhancement * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-3 Basic QPel pass_16 routines altivec codec * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-4 Basic QPel pass_8 routines altivec code * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-5 packed pass_16 routines in a macro * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-6 packed pass_8 routines in a macro * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-7 Enhancement of the qpel functions for P-frames * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-8 QPel Pass_16 Add Functions * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-9 Pass_8_Add Altivec functions * chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-10 Bugfix for Pass_8 Add routines new files: src/image/ppc_asm/.arch-ids/qpel_altivec.c.id src/image/ppc_asm/qpel_altivec.c modified files: build/generic/sources.inc src/image/interpolate8x8.h src/image/ppc_asm/colorspace_altivec.c src/image/ppc_asm/interpolate8x8_altivec.c src/image/qpel.c src/image/qpel.h src/xvid.c new patches: chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--base-0 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-1 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-2 
chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-3 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-4 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-5 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-6 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-7 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-8 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-9 chn@kbw.ch--2004-1/xvidcore--naegeli-head--0.0--patch-10 2004-10-12 21:00:08 GMT patch-90 Summary: Resynced with 1.0 tree Revision: xvidcore--head--0.0--patch-90 Resynced with 1.0 tree Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-59 Don't read too short streams. modified files: src/bitstream/bitstream.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-59 2004-10-12 20:54:47 GMT patch-89 Summary: Credits to Christoph Nageli for his work on PPC port Revision: xvidcore--head--0.0--patch-89 Credits to Christoph Nageli for his work on PPC port modified files: AUTHORS 2004-10-12 20:51:24 GMT patch-88 Summary: Revision: xvidcore--head--0.0--patch-88 Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-57 ME fix. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-58 64bit fixes From ed.gomez: * Resolved conflicts caused by the 64bit fixes, extended it for qpel.h Note that 1.1 tree needs a new review for 64bit problems as a lot of ME code has changed.
modified files: src/image/qpel.h src/motion/estimation_bvop.c src/motion/estimation_common.c src/motion/motion_comp.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-57 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-58 2004-09-22 22:42:16 GMT patch-87 Summary: DESTDIR support in Makefile Revision: xvidcore--head--0.0--patch-87 From Thomas Galliano (Gentoo bug #62190): * Added DESTDIR prefix variable to all install commands destination (ed.gomez: i think it's for packaging issues, so the install process installs all files as if they were going to the real location, as xvid doesn't use any path, i always thought this was not required) modified files: build/generic/Makefile build/generic/platform.inc.in 2004-09-04 15:10:33 GMT patch-86 Summary: First bvop search must initialize best_sad Revision: xvidcore--head--0.0--patch-86 From sysKin: * First bvop search must initialize best_sad modified files: src/motion/estimation_bvop.c 2004-09-04 14:11:43 GMT patch-85 Summary: Fixed function prototype/definition mismatch Revision: xvidcore--head--0.0--patch-85 From sysKin: * Fixed function prototype/definition mismatch for some interpolation C functions. modified files: src/image/interpolate8x8.c 2004-09-04 14:08:13 GMT patch-84 Summary: Fixed buffer termination logic in xvid_decraw. Revision: xvidcore--head--0.0--patch-84 From ed.gomez: * Fixed main decoding loop condition to really match the empty buffer and end of stream condition. * Removed the unwanted frame number limitation modified files: examples/xvid_decraw.c 2004-09-04 14:04:48 GMT patch-83 Summary: Uninitialized user data usage. Revision: xvidcore--head--0.0--patch-83 From ed.gomez: - Fixed use of uninitialized data in user data parsing. modified files: src/bitstream/bitstream.c 2004-09-04 13:59:26 GMT patch-82 Summary: Uninitialized data in bvop ME Revision: xvidcore--head--0.0--patch-82 From ed.gomez: * Fixed uninitialized data usage during bvop ME.
modified files: src/motion/estimation_bvop.c 2004-09-03 00:13:31 GMT patch-81 Summary: Add VOL header saving in xvid_decraw Revision: xvidcore--head--0.0--patch-81 From ed.gomez: * Added VOL header saving in xvid_decraw The little story: I was trying to cut some frames off of a big stream (150MB) with "xvid_decraw -m", and cat'ing the single frame stream files together. The reconstructed stream was rejected by all mpeg4 decoders because the vol header wasn't present. Thus the fix. modified files: examples/xvid_decraw.c 2004-08-30 23:22:35 GMT patch-80 Summary: Complete previous API numbering change Revision: xvidcore--head--0.0--patch-80 Complete previous API numbering change modified files: src/xvid.h 2004-08-29 11:53:05 GMT patch-79 Summary: Merged stable tree Revision: xvidcore--head--0.0--patch-79 Merged stable tree Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-54 Marking 1.0.2 * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-55 Merged one important forgotten bugfix from head * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-56 ChangeLog update modified files: ChangeLog-1.0 src/xvid.h new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-54 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-55 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-56 2004-08-29 09:56:34 GMT patch-78 Summary: Write ia32 symbols' size to elf output. Revision: xvidcore--head--0.0--patch-78 From ed.gomez: - write symbols size to elf output, so the asm objects look really like any usual object file. 
modified files: src/bitstream/x86_asm/cbp_3dne.asm src/bitstream/x86_asm/cbp_mmx.asm src/bitstream/x86_asm/cbp_sse2.asm src/dct/x86_asm/fdct_mmx_ffmpeg.asm src/dct/x86_asm/fdct_mmx_skal.asm src/dct/x86_asm/fdct_sse2_skal.asm src/dct/x86_asm/idct_3dne.asm src/dct/x86_asm/idct_mmx.asm src/dct/x86_asm/idct_sse2_dmitry.asm src/dct/x86_asm/simple_idct_mmx.asm src/image/x86_asm/colorspace_mmx.inc src/image/x86_asm/colorspace_rgb_mmx.asm src/image/x86_asm/colorspace_yuv_mmx.asm src/image/x86_asm/colorspace_yuyv_mmx.asm src/image/x86_asm/interpolate8x8_3dn.asm src/image/x86_asm/interpolate8x8_3dne.asm src/image/x86_asm/interpolate8x8_mmx.asm src/image/x86_asm/interpolate8x8_xmm.asm src/image/x86_asm/postprocessing_mmx.asm src/image/x86_asm/postprocessing_sse2.asm src/image/x86_asm/qpel_mmx.asm src/image/x86_asm/reduced_mmx.asm src/motion/x86_asm/sad_3dn.asm src/motion/x86_asm/sad_3dne.asm src/motion/x86_asm/sad_mmx.asm src/motion/x86_asm/sad_sse2.asm src/motion/x86_asm/sad_xmm.asm src/quant/x86_asm/quantize_h263_3dne.asm src/quant/x86_asm/quantize_h263_mmx.asm src/quant/x86_asm/quantize_mpeg_mmx.asm src/quant/x86_asm/quantize_mpeg_xmm.asm src/utils/x86_asm/cpuid.asm src/utils/x86_asm/interlacing_mmx.asm src/utils/x86_asm/mem_transfer_3dne.asm src/utils/x86_asm/mem_transfer_mmx.asm 2004-08-28 13:00:56 GMT patch-77 Summary: Thread safety problem in sse2 brightness control Revision: xvidcore--head--0.0--patch-77 From ed.gomez: * CodingStyle for the sse2 image brightness file * Fixed thread safety problem/big error. Writing to a RO data segment is a no go ! and using global data segment is a no go either (use stack instead) ! 
modified files: src/image/x86_asm/postprocessing_sse2.asm 2004-08-22 13:11:23 GMT patch-76 Summary: Stable merge Revision: xvidcore--head--0.0--patch-76 Stable merge Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-53 Thread safety problem in idct C version modified files: src/bitstream/mbcoding.c src/dct/idct.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-53 2004-08-22 11:48:23 GMT patch-75 Summary: This tree isn't API 4.0 anymore, mark it as 4.1 Revision: xvidcore--head--0.0--patch-75 From ed.gomez: * The fields added to some structs mean this lib isn't API 4.0 anymore; mark it as 4.1 because ABI compatibility is preserved. modified files: build/generic/configure.in 2004-08-22 11:41:22 GMT patch-74 Summary: Functions qualified as such for elf format. Revision: xvidcore--head--0.0--patch-74 From ed.gomez: * Functions weren't marked as functions in ia32 asm files. Added support for the function qualifier for elf. modified files: build/generic/configure.in src/bitstream/x86_asm/cbp_3dne.asm src/bitstream/x86_asm/cbp_mmx.asm src/bitstream/x86_asm/cbp_sse2.asm src/dct/x86_asm/fdct_mmx_ffmpeg.asm src/dct/x86_asm/fdct_mmx_skal.asm src/dct/x86_asm/fdct_sse2_skal.asm src/dct/x86_asm/idct_3dne.asm src/dct/x86_asm/idct_mmx.asm src/dct/x86_asm/idct_sse2_dmitry.asm src/dct/x86_asm/simple_idct_mmx.asm src/image/x86_asm/colorspace_rgb_mmx.asm src/image/x86_asm/colorspace_yuv_mmx.asm src/image/x86_asm/colorspace_yuyv_mmx.asm src/image/x86_asm/interpolate8x8_3dn.asm src/image/x86_asm/interpolate8x8_3dne.asm src/image/x86_asm/interpolate8x8_mmx.asm src/image/x86_asm/interpolate8x8_xmm.asm src/image/x86_asm/postprocessing_mmx.asm src/image/x86_asm/postprocessing_sse2.asm src/image/x86_asm/qpel_mmx.asm src/image/x86_asm/reduced_mmx.asm src/motion/x86_asm/sad_3dn.asm src/motion/x86_asm/sad_3dne.asm src/motion/x86_asm/sad_mmx.asm src/motion/x86_asm/sad_sse2.asm src/motion/x86_asm/sad_xmm.asm src/quant/x86_asm/quantize_h263_3dne.asm
src/quant/x86_asm/quantize_h263_mmx.asm src/quant/x86_asm/quantize_mpeg_mmx.asm src/quant/x86_asm/quantize_mpeg_xmm.asm src/utils/x86_asm/cpuid.asm src/utils/x86_asm/interlacing_mmx.asm src/utils/x86_asm/mem_transfer_3dne.asm src/utils/x86_asm/mem_transfer_mmx.asm 2004-08-21 17:04:57 GMT patch-73 Summary: Added yasm support in configure.in Revision: xvidcore--head--0.0--patch-73 From ed.gomez: * Added yasm configure.in support. It's my preferred ia32 assembler because it allows debugging/profiling of assembly code with oprofile. modified files: build/generic/configure.in 2004-08-21 11:47:31 GMT patch-72 Summary: Merged fix from stable Revision: xvidcore--head--0.0--patch-72 Merged fix from stable Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-52 Stupid typo+error in fdct_xxx_skal macro generator. modified files: src/dct/x86_asm/fdct_mmx_skal.asm new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-52 2004-08-16 22:32:46 GMT patch-71 Summary: Decoder optimization (fixing regression) Revision: xvidcore--head--0.0--patch-71 From ed.gomez: * With newly introduced vector checking, decoder became noticeably slower. This was caused by poorly written code (sorry sysKin :P) + unrolled loop + removed duplicated border computations + marked the function as __inline modified files: src/decoder.c 2004-08-15 11:42:20 GMT patch-70 Summary: Out of bounds MVs clipping Revision: xvidcore--head--0.0--patch-70 From sysKin: * Clip vectors that end up out of bounds. modified files: src/decoder.c 2004-08-10 22:30:09 GMT patch-69 Summary: Fixed CBR plugin. Revision: xvidcore--head--0.0--patch-69 From Foxer: * Do not set the return quantizer to the frame's quantizer (caused the crazy quant choices) * Allow quant1 to influence the sequence quality * Allow more than +- 1 quantizer variation if the desired quantizer is much higher than the previous.
* Clamp the overflow influence to 1 unit of buffer, that should help cases where still motion scenes are followed by normal motion scenes... the old code was reaching amazingly high bitrates; with this modification it should keep the bitrate lower. modified files: src/plugins/plugin_single.c 2004-08-01 15:23:49 GMT patch-68 Summary: error in dshow par array indexing Revision: xvidcore--head--0.0--patch-68 error in dshow par array indexing modified files: dshow/src/CXvidDecoder.cpp 2004-08-01 13:38:36 GMT patch-67 Summary: Faster bframe decoding (qpel this time) Revision: xvidcore--head--0.0--patch-67 From ed.gomez: * Used the same trick as for halfpel bvops, merge backward interpolation and dst averaging steps. NB: i'm currently not able to say if it's a real speedup or not because my linux kernel uses a process scheduler that gives great variance to results... so far i'm sure this isn't a slowdown neither for C nor ia32 SIMD. modified files: src/decoder.c src/image/qpel.c src/image/qpel.h 2004-08-01 11:24:07 GMT patch-66 Summary: Unified qpel code path for all platforms Revision: xvidcore--head--0.0--patch-66 From ed.gomez and skal: * Unified qpel code path for all platforms. Next step is to fully exploit this code path to speedup qpel bframe decoding NB: this also makes life easier for ports, as they would not have to port obsolete function sets... modified files: src/decoder.c src/image/interpolate8x8.h src/image/qpel.c src/image/qpel.h src/motion/motion_comp.c 2004-07-31 15:08:19 GMT patch-65 Summary: Faster bframe decoding. Revision: xvidcore--head--0.0--patch-65 From ed.gomez and skal: * Faster direct/interpolated bvop blocks decoding for halfpel sequences. The trick is to compute and average directly with destination during one of the forward/backward interpolations. At this moment, this patch covers only halfpel decoding, the same trick is expected to be hacked for qpel.
modified files: src/decoder.c src/image/interpolate8x8.c src/image/interpolate8x8.h src/image/x86_asm/interpolate8x8_mmx.asm src/image/x86_asm/interpolate8x8_xmm.asm src/xvid.c 2004-07-31 09:13:23 GMT patch-64 Summary: Last bit for dshow gcc support Revision: xvidcore--head--0.0--patch-64 From pete: * Last bits concerning gcc build support for dshow. removed files: dshow/dxpatch/.arch-ids/DXVCSDK-9.0-gcc.patch.id dshow/dxpatch/.arch-ids/DXVCSDK-9.0-gcc.txt.id dshow/dxpatch/DXVCSDK-9.0-gcc.patch dshow/dxpatch/DXVCSDK-9.0-gcc.txt modified files: dshow/src/Configure.cpp 2004-07-27 21:10:02 GMT patch-63 Summary: Better cross compilation support for dshow. Revision: xvidcore--head--0.0--patch-63 From pete and ed.gomez: * gcc 3.4.1 is even more pedantic, ::GUID was breaking it, use struct _GUID instead. * Some uppercase/lowercase mixing in MS headers. * More documentation footage for the brave ! NB: with all this, dshow should compile, but it is not guaranteed to work ! There's even a patch for Configure.cpp that could be required... postponed to a later patch. modified files: dshow/dxpatch/dx90sdk-update-gcc.patch dshow/dxpatch/dx90sdk-update-gcc.txt 2004-07-26 20:25:52 GMT patch-62 Summary: ChangeLog 1.1 update Revision: xvidcore--head--0.0--patch-62 ChangeLog 1.1 update modified files: ChangeLog 2004-07-26 20:22:38 GMT patch-61 Summary: Update from stable Revision: xvidcore--head--0.0--patch-61 Update from stable Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-51 ChangeLog Update modified files: ChangeLog-1.0 new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-51 2004-07-26 19:26:24 GMT patch-60 Summary: Another missing memset in xvid_decraw Revision: xvidcore--head--0.0--patch-60 Another missing memset in xvid_decraw modified files: examples/xvid_decraw.c 2004-07-26 19:14:45 GMT patch-59 Summary: DShow updates for gcc toolchain. Revision: xvidcore--head--0.0--patch-59 From pete: * More work on the gcc toolchain.
new files: dshow/dxpatch/.arch-ids/dx90sdk-update-gcc.patch.id dshow/dxpatch/.arch-ids/dx90sdk-update-gcc.txt.id dshow/dxpatch/dx90sdk-update-gcc.patch dshow/dxpatch/dx90sdk-update-gcc.txt modified files: dshow/Makefile dshow/dshow.dsp dshow/src/CXvidDecoder.cpp 2004-07-25 21:31:41 GMT patch-58 Summary: Added GPL to vfw frontend Revision: xvidcore--head--0.0--patch-58 From pete: * Added GPL to VFW resources. modified files: vfw/src/config.c vfw/src/driverproc.c vfw/src/resource.h vfw/src/resource.rc 2004-07-25 19:31:32 GMT patch-57 Summary: decoder_mb_decode cleanup Revision: xvidcore--head--0.0--patch-57 From ed.gomez: * Try to cleanup the decoder_mb_decode function. A bit more computing required, fewer branches, more readable code. modified files: src/decoder.c 2004-07-24 11:39:57 GMT patch-56 Summary: Important bugfix from stable Revision: xvidcore--head--0.0--patch-56 Important bugfix from stable Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-50 BVOP direct/interpolated ref block rounding fix.
modified files: src/decoder.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-50 2004-07-23 20:40:08 GMT patch-55 Summary: Revision: xvidcore--head--0.0--patch-55 From ed.gomez: * Extended stable patch applying same change to new nasm files Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-49 Removed data qualifier in .rodata modified files: src/bitstream/x86_asm/cbp_mmx.asm src/bitstream/x86_asm/cbp_sse2.asm src/dct/x86_asm/fdct_mmx_ffmpeg.asm src/dct/x86_asm/fdct_mmx_skal.asm src/dct/x86_asm/fdct_sse2_skal.asm src/dct/x86_asm/idct_3dne.asm src/dct/x86_asm/idct_mmx.asm src/dct/x86_asm/idct_sse2_dmitry.asm src/dct/x86_asm/simple_idct_mmx.asm src/image/x86_asm/colorspace_rgb_mmx.asm src/image/x86_asm/colorspace_yuyv_mmx.asm src/image/x86_asm/interpolate8x8_3dn.asm src/image/x86_asm/interpolate8x8_3dne.asm src/image/x86_asm/interpolate8x8_mmx.asm src/image/x86_asm/interpolate8x8_xmm.asm src/image/x86_asm/postprocessing_mmx.asm src/image/x86_asm/postprocessing_sse2.asm src/image/x86_asm/qpel_mmx.asm src/image/x86_asm/reduced_mmx.asm src/motion/x86_asm/sad_3dn.asm src/motion/x86_asm/sad_3dne.asm src/motion/x86_asm/sad_mmx.asm src/motion/x86_asm/sad_sse2.asm src/motion/x86_asm/sad_xmm.asm src/quant/x86_asm/quantize_h263_3dne.asm src/quant/x86_asm/quantize_h263_mmx.asm src/quant/x86_asm/quantize_mpeg_mmx.asm src/quant/x86_asm/quantize_mpeg_xmm.asm src/utils/x86_asm/cpuid.asm src/utils/x86_asm/interlacing_mmx.asm src/utils/x86_asm/mem_transfer_3dne.asm src/utils/x86_asm/mem_transfer_mmx.asm new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-49 2004-07-21 19:36:46 GMT patch-54 Summary: Bframe fixes, still not back to 1.0.1 level Revision: xvidcore--head--0.0--patch-54 Bframe fixes, still not back to 1.0.1 level modified files: src/motion/estimation_bvop.c src/motion/estimation_rd_based_bvop.c 2004-07-19 18:46:09 GMT patch-53 Summary: Stable merge Revision: xvidcore--head--0.0--patch-53 Stable merge Patches applied: *
ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-47 ISO C99'ism fix * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-48 Complete previous xvid_decraw patch modified files: examples/xvid_decraw.c src/encoder.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-47 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-48 2004-07-18 15:19:58 GMT patch-52 Summary: Added dshow mingw build process Revision: xvidcore--head--0.0--patch-52 From pete: - Added mingw build process to dshow frontend. new files: dshow/.arch-ids/Makefile.id dshow/.arch-ids/sources.inc.id dshow/Makefile dshow/dxpatch/.arch-ids/=id dshow/dxpatch/.arch-ids/DXVCSDK-9.0-gcc.patch.id dshow/dxpatch/.arch-ids/DXVCSDK-9.0-gcc.txt.id dshow/dxpatch/DXVCSDK-9.0-gcc.patch dshow/dxpatch/DXVCSDK-9.0-gcc.txt dshow/sources.inc modified files: dshow/dshow.dsp dshow/src/CXvidDecoder.cpp dshow/src/Configure.cpp dshow/src/config.h dshow/src/debug.h dshow/src/xvid.ax.rc new directories: dshow/dxpatch dshow/dxpatch/.arch-ids 2004-07-18 15:01:02 GMT patch-51 Summary: Added RD optimized block mode decision in bvops Revision: xvidcore--head--0.0--patch-51 From sysKin: * Added RD optimized block mode decision in bvops. new files: src/motion/.arch-ids/estimation_rd_based_bvop.c.id src/motion/estimation_rd_based_bvop.c modified files: build/generic/sources.inc build/win32/libxvidcore.dsp build/win32/libxvidcore_static.dsp src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_rd_based.c src/plugins/plugin_2pass1.c src/xvid.h vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.h vfw/src/resource.rc 2004-07-17 11:37:21 GMT patch-50 Summary: Stable merges Revision: xvidcore--head--0.0--patch-50 Stable merges Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-45 Future version interoperability * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-46 Make sure time incr is never larger than 16bit. 
modified files: examples/xvid_decraw.c src/encoder.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-45 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-46 2004-07-16 19:53:27 GMT patch-49 Summary: AR support in DShow Revision: xvidcore--head--0.0--patch-49 From koepi/minolta: * Added AR support to dshow frontend. modified files: dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h dshow/src/config.c 2004-07-16 19:49:25 GMT patch-48 Summary: VFW update and fixes Revision: xvidcore--head--0.0--patch-48 From makc on our forums: * Use non deprecated defines * Fixed frame size formula. modified files: TODO vfw/src/codec.c vfw/src/driverproc.c 2004-07-16 19:29:58 GMT patch-47 Summary: SSE2 brightness postproc. Revision: xvidcore--head--0.0--patch-47 From Decoder: * Added SSE2 brightness postproc code. new files: src/image/x86_asm/.arch-ids/postprocessing_sse2.asm.id src/image/x86_asm/postprocessing_sse2.asm modified files: build/generic/sources.inc build/win32/libxvidcore.dsp build/win32/libxvidcore_static.dsp src/image/postprocessing.h src/xvid.c src/xvid.h 2004-07-14 23:27:14 GMT patch-46 Summary: More audio for VFW bitcalc Revision: xvidcore--head--0.0--patch-46 From ???: * added more audio formats to bitcalc * replaced old ogm overhead formula with more precise one modified files: TODO vfw/src/config.c vfw/src/resource.rc 2004-07-14 13:01:57 GMT patch-45 Summary: Enable MMX qpel in decoder. Revision: xvidcore--head--0.0--patch-45 From ed.gomez: * It seems we're not that smart. We had mmx qpel code for more than a year, it is used in encoder but wasn't in decoder :\ modified files: src/decoder.c 2004-07-14 10:27:43 GMT patch-44 Summary: Speedup block transfer C function Revision: xvidcore--head--0.0--patch-44 From ed.gomez: * Not that useful a patch for most users, but transfer8x8 was really too slow. Simple optimizations did great, all 32bit platforms using the C code should benefit from this speedup. 
modified files: src/utils/mem_transfer.c 2004-07-11 12:53:19 GMT patch-43 Summary: Manual AR setting for dshow. Revision: xvidcore--head--0.0--patch-43 From koepi: * added manual AR setting in dshow. modified files: dshow/src/CXvidDecoder.cpp dshow/src/config.c dshow/src/config.h dshow/src/resource.h dshow/src/xvid.ax.rc 2004-07-11 10:34:56 GMT patch-42 Summary: Added top field control to vfw. Revision: xvidcore--head--0.0--patch-42 From koepi: * added top field first flag to vfw. modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.h vfw/src/resource.rc 2004-07-10 17:47:01 GMT patch-41 Summary: Decoder work. Revision: xvidcore--head--0.0--patch-41 From ed.gomez: * Faster get coeff (now gcc can even inline it) * On the fly coeff dequant for inter blocks (intra don't get this, because there are lot more non zero coeffs, and i doubt it'd get faster with this) modified files: src/bitstream/mbcoding.c src/bitstream/mbcoding.h src/decoder.c 2004-07-10 17:34:19 GMT patch-40 Summary: ChangeLog update + removed my email Revision: xvidcore--head--0.0--patch-40 ChangeLog update + removed my email modified files: ChangeLog 2004-07-10 17:31:36 GMT patch-39 Summary: Stable merge Revision: xvidcore--head--0.0--patch-39 Stable merge Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-43 Small mem leak in vfw. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-44 ChangeLog update modified files: ChangeLog-1.0 vfw/src/codec.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-43 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-44 2004-07-10 17:25:33 GMT patch-38 Summary: Improved ME. Revision: xvidcore--head--0.0--patch-38 From sysKin: * new ME for b-frames * small redesign of subpel refinement function From ed.gomez: * Fixed some warnings reported by gcc. 
(the if condition should be checked by the original author) modified files: src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_common.c src/motion/estimation_gmc.c src/motion/estimation_pvop.c src/motion/estimation_rd_based.c src/motion/vop_type_decision.c 2004-07-10 17:16:38 GMT patch-37 Summary: qpel and chroma-sad had overlapping memory targets Revision: xvidcore--head--0.0--patch-37 From sysKin: * qpel and chroma-sad had overlapping memory targets modified files: src/motion/estimation_bvop.c 2004-07-10 17:03:06 GMT patch-36 Summary: New changelog for 1.1 tree Revision: xvidcore--head--0.0--patch-36 New changelog for 1.1 tree new files: .arch-ids/ChangeLog.id ChangeLog renamed files: .arch-ids/ChangeLog.id ==> .arch-ids/ChangeLog-1.0.id ChangeLog ==> ChangeLog-1.0 2004-07-10 16:57:53 GMT patch-35 Summary: Stable tree merge Revision: xvidcore--head--0.0--patch-35 Stable tree merge Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-40 Small memory error in ia32 cpuid function. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-41 low delay guessing (il)logic fix. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-42 Fix wrong matrix reading logic. modified files: src/bitstream/bitstream.c src/decoder.c src/utils/x86_asm/cpuid.asm new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-40 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-41 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-42 2004-06-12 13:51:50 GMT patch-34 Summary: Added VBV to twopass RC Revision: xvidcore--head--0.0--patch-34 From christoph: * Added VBV model verifier to twopass RC plugin From ed.gomez: * Do apply CodingStyle to christoph's code * Use DPRINTF instead of #ifdef VBV_DEBUG #endif blocks as the information they were outputting was useful for general RC debugging. 
modified files: examples/xvid_encraw.c src/plugins/plugin_2pass2.c src/xvid.h 2004-06-05 23:05:43 GMT patch-33 Summary: Merged stable branch patches Revision: xvidcore--head--0.0--patch-33 Merged stable branch patches Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-38 DC clipping bug for real * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-39 Marking 1.0.1 release modified files: ChangeLog TODO build/generic/configure.in src/decoder.c src/motion/estimation_rd_based.c src/prediction/mbprediction.c src/prediction/mbprediction.h src/xvid.h new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-38 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-39 2004-06-05 23:02:35 GMT patch-32 Summary: Previous merge went wrong Revision: xvidcore--head--0.0--patch-32 Previous merge went wrong modified files: src/portab.h 2004-05-31 21:32:38 GMT patch-31 Summary: Added icon into vfw frontend. Revision: xvidcore--head--0.0--patch-31 From pete: * Added icon into vfw frontend dll. Should show up in uninstall menu. new files: vfw/src/.arch-ids/xvid.ico.id vfw/src/xvid.ico modified files: vfw/bin/xvid.inf vfw/src/resource.rc vfw/vfw.dsp 2004-05-31 21:22:49 GMT patch-30 Summary: Merged stable branch fixes Revision: xvidcore--head--0.0--patch-30 Merged stable branch fixes Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-31 Close variable argument list. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-32 Bits/Bytes confusion in the VFW frontend. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-33 Nasty typo in pvop vector lambdas. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-34 FPS=1 problem in decoder. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-35 More missing va_end() calls. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-36 Wrong license header. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-37 time fixes to decoder. 
modified files: src/bitstream/bitstream.c src/decoder.c src/decoder.h src/image/font.c src/image/reduced.c src/motion/estimation_pvop.c src/portab.h vfw/src/codec.c vfw/src/config.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-31 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-32 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-33 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-34 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-35 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-36 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-37 2004-05-26 09:13:33 GMT patch-29 Summary: Stable merges Revision: xvidcore--head--0.0--patch-29 Stable merges Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-28 Small bug in bframe ME. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-29 Small trellis bug * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-30 ICM compatibility for VFW modified files: src/motion/estimation_bvop.c src/utils/mbtransquant.c vfw/src/config.c vfw/src/driverproc.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-28 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-29 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-30 2004-05-21 14:32:41 GMT patch-28 Summary: Don't set edges twice on a frame. Revision: xvidcore--head--0.0--patch-28 From ed.gomez: * A similar optimization has been done for encoder long ago, dunno why this hasn't been "ported" to decoder. This speeds up quite a lot the decoder for no effort (~7%). modified files: src/decoder.c src/decoder.h 2004-05-21 14:25:19 GMT patch-27 Summary: No 64 bit arithmetic in critical path. 
Revision: xvidcore--head--0.0--patch-27 From ed.gomez: * No 64 bit arithmetic in critical paths (direct blocks in bvops), it's way too slow (__divdi3 GNU/Linux ABI for 64bit division was taking up to 5% cycles) modified files: src/decoder.c 2004-05-15 22:20:11 GMT patch-26 Summary: Merged stable tree changes Revision: xvidcore--head--0.0--patch-26 Merged stable tree changes Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-23 Some very light Unix build system changes * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-24 Possible VOL header corruption. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-25 DC prediction fix. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-26 Small mismatch in hint<->widget in VFW * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-27 Marking 1.0.0 final modified files: ChangeLog build/generic/Makefile build/generic/bootstrap.sh build/generic/configure.in src/bitstream/bitstream.c src/decoder.c src/motion/estimation_rd_based.c src/prediction/mbprediction.c src/prediction/mbprediction.h src/xvid.h vfw/src/resource.rc new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-23 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-24 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-25 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-26 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-27 2004-04-25 21:46:25 GMT patch-25 Summary: Smarter skipping Revision: xvidcore--head--0.0--patch-25 From sysKin: * Smarter skipping + bugfix modified files: src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_pvop.c 2004-04-20 20:37:08 GMT patch-24 Summary: ME cleanup. Revision: xvidcore--head--0.0--patch-24 From sysKin: * First stage cleanup: new fast qpel refinement. 
modified files: src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_common.c src/motion/estimation_gmc.c src/motion/estimation_pvop.c src/motion/estimation_rd_based.c src/motion/vop_type_decision.c 2004-04-20 19:44:44 GMT patch-23 Summary: Merging 1.0 fixes Revision: xvidcore--head--0.0--patch-23 Merging 1.0 fixes Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-21 Fix crash in decoder for non IFrame 1st frame. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-22 Small visual fix. modified files: src/decoder.c vfw/src/config.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-21 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-22 2004-04-18 17:14:29 GMT patch-22 Summary: Merging 1.0 fixes Revision: xvidcore--head--0.0--patch-22 Merging 1.0 fixes Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-18 Tiny xvid_decraw cleaning * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-19 vfw opens audio file in shared access mode * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-20 Typo modified files: examples/xvid_decraw.c vfw/src/resource.rc new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-18 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-19 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-20 2004-04-18 17:09:59 GMT patch-21 Summary: Dering hooking in DShow. Revision: xvidcore--head--0.0--patch-21 From sysKin: * Dering widget and associated code for dering support in DShow. modified files: dshow/src/CXvidDecoder.cpp dshow/src/config.c dshow/src/config.h dshow/src/resource.h dshow/src/xvid.ax.rc 2004-04-18 17:08:53 GMT patch-20 Summary: Dering hooking in VFW. Revision: xvidcore--head--0.0--patch-20 From sysKin: * Dering widget and associated code for dering support in VFW. modified files: vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/resource.h vfw/src/resource.rc 2004-04-18 17:06:14 GMT patch-19 Summary: Added dering code. 
Revision: xvidcore--head--0.0--patch-19 From Marc Fauconneau: * Added dering code to core. modified files: src/image/postprocessing.c src/image/postprocessing.h src/xvid.h 2004-04-18 17:02:48 GMT patch-18 Summary: Added static builds for msvc. Revision: xvidcore--head--0.0--patch-18 From pete: * Added static type building project files for MSVC. Needed for xvid_bench. new files: build/win32/.arch-ids/libxvidcore_static.dsp.id build/win32/.arch-ids/xvid_decraw_static.dsp.id build/win32/.arch-ids/xvid_encraw_static.dsp.id build/win32/libxvidcore_static.dsp build/win32/xvid_decraw_static.dsp build/win32/xvid_encraw_static.dsp modified files: TODO build/win32/xvidcore.dsw 2004-04-15 19:32:53 GMT patch-17 Summary: Merged fixes from 1.0 tree Revision: xvidcore--head--0.0--patch-17 Merged fixes from 1.0 tree Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-14 Fixed small bug in trellis code. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-15 Resource leak in dshow. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-16 Fixed missing 1st frame in dshow output. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-17 Tiny minor fixes for msvc. modified files: build/win32/xvid_decraw.dsp build/win32/xvid_encraw.dsp dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h examples/xvid_decraw.c src/utils/mbtransquant.c src/xvid.h new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-14 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-15 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-16 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-17 2004-04-15 19:28:53 GMT patch-16 Summary: Starting the 1.1 numbering here. Revision: xvidcore--head--0.0--patch-16 From pete: * Started 1.1 numbering * Added 1.1 checking for the brightness field. PS: API 4 is extensible as long as we add fields at the end of the structures and add checks to mimic old core settings. 
That's why the API version doesn't change, but the public numbering does. modified files: src/decoder.c src/xvid.c src/xvid.h 2004-04-14 19:22:52 GMT patch-15 Summary: Remove ppro code from mmx h263 quant. Revision: xvidcore--head--0.0--patch-15 From Jean Marc: * Removed pentium pro opcodes from mmx functions (cmov) modified files: src/quant/x86_asm/quantize_h263_mmx.asm 2004-04-13 20:05:24 GMT patch-14 Summary: Reverted troublesome patch-11 Revision: xvidcore--head--0.0--patch-14 From ed.gomez: * Removed buggy patch-11. Though xvid_bench tests passed, the code was buggy... and as all changes were involved, the patch is reverted. modified files: src/utils/x86_asm/mem_transfer_mmx.asm 2004-04-12 15:48:21 GMT patch-13 Summary: Optimized Plane SSE. Revision: xvidcore--head--0.0--patch-13 From ed.gomez: * Mostly useless patch, as it optimizes a function that is not used very often and that doesn't eat much CPU. But as i'm always debugging (thus using plane_sse), i like the idea of debugging as fast as i can :-) modified files: examples/xvid_bench.c src/image/image.c src/motion/sad.c src/motion/sad.h src/motion/x86_asm/sad_mmx.asm src/xvid.c 2004-04-12 15:38:01 GMT patch-12 Summary: New H263 code. Revision: xvidcore--head--0.0--patch-12 From Jean Marc: * Improved H263 code. 
modified files: src/quant/x86_asm/quantize_h263_mmx.asm 2004-04-12 14:03:19 GMT patch-10 Summary: Removed CVS Id field Revision: xvidcore--head--0.0--patch-10 Removed CVS Id field modified files: src/motion/ppc_asm/sad_altivec.c 2004-04-12 14:00:16 GMT patch-9 Summary: Added debug option (-debug) Revision: xvidcore--head--0.0--patch-9 Added debug option (-debug) modified files: examples/xvid_decraw.c 2004-04-12 13:53:00 GMT patch-8 Summary: Merged stable tree fixes Revision: xvidcore--head--0.0--patch-8 Merged stable tree fixes Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-6 Compiler quirk in portab.h * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-7 DShow widget hiding. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-8 RGB 16bit output fix. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-9 3DNow Ext functions use MMXEXT opcodes. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-10 PGM support back in xvid_decraw. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-11 Better MV clipping code. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-12 3dnow functions proper separation. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-13 Don't do SAD and RD based searches for qp. modified files: dshow/src/xvid.ax.rc examples/xvid_decraw.c src/decoder.c src/image/colorspace.c src/motion/estimation_pvop.c src/portab.h src/xvid.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-6 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-7 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-8 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-9 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-10 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-11 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-12 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-13 2004-04-05 20:44:57 GMT patch-7 Summary: MSVC warning. 
Revision: xvidcore--head--0.0--patch-7 From pete: * Fixed MSVC warnings about float vs double. modified files: src/plugins/plugin_lumimasking.c 2004-04-05 20:04:10 GMT patch-6 Summary: Frame dropping alternative fix. Revision: xvidcore--head--0.0--patch-6 From sysKin: * Different solution to the same problem previously fixed in 1.0 tree. modified files: src/encoder.c 2004-04-05 19:45:17 GMT patch-5 Summary: Merged stable tree fixes Revision: xvidcore--head--0.0--patch-5 Merged stable tree fixes Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-2 Typo in ME fast comparison. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-3 Dead code removal. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-4 Frame dropping disabling for bframes. * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-5 Marking RC4 modified files: ChangeLog build/generic/configure.in src/encoder.c src/motion/estimation_common.c src/xvid.h new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-2 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-3 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-4 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-5 2004-04-03 10:33:44 GMT patch-4 Summary: Merged 1.0 fixes Revision: xvidcore--head--0.0--patch-4 Merged 1.0 fixes Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--base-0 tag of ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-53 * ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-1 VFW Resource leak fix (try #2) modified files: vfw/src/codec.c vfw/src/driverproc.c new patches: ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--base-0 ed.gomez@free.fr--2004-1/xvidcore--stable--1.0--patch-1 2004-04-02 21:44:39 GMT patch-3 Summary: Merged new PPC port Revision: xvidcore--head--0.0--patch-3 Merged new PPC port Patches applied: * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--base-0 tag of ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-4 * 
chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-1 Sad Altivec File added * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-2 Mem Transfer functions ported to altivec * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-4 bugfix in mem transfer altivec routines * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-5 Bug Fix in Mem Transfer * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-6 Walken Inverse DCT added * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-7 Interpolate8x8 altivec added * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-8 interpolate avg2 altivec added * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-9 Star-merged Edouards Branch * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-10 Added RGB to YV12 Altivec routines * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-11 Added YUV to YV12 Altivec routines * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-12 more interpolate functions * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-13 H263 Quantization added in altivec * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-14 Star-Merge with main branch * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-15 h263 dequantization with altivec * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-16 sse8_16bit added * chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-17 added yv12 to yuv colorspace routines (altivec) * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--base-0 tag of ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--base-0 * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-1 Merged with mainline patch-9 * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-2 PPC platform support cleanup. 
* ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-3 Merging Paul's changes * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-4 Merged mainline patches * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-5 Merged up to mainline RC1 * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-6 Merged chn's work * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-7 Merged chn's mem transfer functions * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-8 Merged mainline mem_transfer arch separation * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-9 Merged mainline patches * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-10 Replayed unconflicting patches from chn * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-11 Merging mainline up to patch-31 * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-12 Forgotten patch from chn * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-13 Merged chn's branch up to patch-10 * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-14 Merged work from Christoph up to patch-13 * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-15 Merged stuff from mainline * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-16 Merged chn's work * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-17 Merged mainline * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-18 Merged mainline fixes * ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-19 Merging head branch to prepare mainline merging * ptk9417@rit.edu--2004-1/xvidcore--devapi4-ppc--1.0--base-0 tag of ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-2 * ptk9417@rit.edu--2004-1/xvidcore--devapi4-ppc--1.0--patch-1 Quick changes for ppc linux new files: src/dct/ppc_asm/.arch-ids/idct_altivec.c.id src/dct/ppc_asm/idct_altivec.c src/image/ppc_asm/.arch-ids/=id src/image/ppc_asm/.arch-ids/colorspace_altivec.c.id 
src/image/ppc_asm/.arch-ids/interpolate8x8_altivec.c.id src/image/ppc_asm/colorspace_altivec.c src/image/ppc_asm/interpolate8x8_altivec.c src/motion/ppc_asm/.arch-ids/sad_altivec.c.id src/motion/ppc_asm/sad_altivec.c src/quant/ppc_asm/.arch-ids/=id src/quant/ppc_asm/.arch-ids/quant_h263_altivec.c.id src/quant/ppc_asm/quant_h263_altivec.c src/utils/ppc_asm/.arch-ids/=id src/utils/ppc_asm/.arch-ids/altivec_trigger.c.id src/utils/ppc_asm/.arch-ids/mem_transfer_altivec.c.id src/utils/ppc_asm/altivec_trigger.c src/utils/ppc_asm/mem_transfer_altivec.c removed files: src/bitstream/ppc_asm/.arch-ids/cbp_altivec.s.id src/bitstream/ppc_asm/.arch-ids/cbp_ppc.s.id src/bitstream/ppc_asm/cbp_altivec.s src/bitstream/ppc_asm/cbp_ppc.s src/dct/ppc_asm/.arch-ids/fdct_altivec.s.id src/dct/ppc_asm/.arch-ids/idct_altivec.s.id src/dct/ppc_asm/fdct_altivec.s src/dct/ppc_asm/idct_altivec.s src/motion/ppc_asm/.arch-ids/README.id src/motion/ppc_asm/.arch-ids/sad_altivec.c.id src/motion/ppc_asm/.arch-ids/sad_altivec.s.id src/motion/ppc_asm/README src/motion/ppc_asm/sad_altivec.c src/motion/ppc_asm/sad_altivec.s modified files: build/generic/Makefile build/generic/configure.in build/generic/platform.inc.in build/generic/sources.inc examples/xvid_bench.c src/bitstream/cbp.h src/dct/fdct.h src/dct/idct.h src/dct/simple_idct.c src/image/colorspace.h src/image/interpolate8x8.h src/motion/sad.h src/portab.h src/quant/quant.h src/utils/emms.h src/utils/mem_transfer.h src/xvid.c new directories: src/image/ppc_asm src/image/ppc_asm/.arch-ids src/quant/ppc_asm src/quant/ppc_asm/.arch-ids src/utils/ppc_asm src/utils/ppc_asm/.arch-ids new patches: chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--base-0 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-1 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-2 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-4 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-5 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-6 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-7 
chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-8 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-9 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-10 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-11 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-12 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-13 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-14 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-15 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-16 chn@kbw.ch--2004-1/xvidcore--naegeli--1.0--patch-17 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--base-0 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-1 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-2 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-3 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-4 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-5 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-6 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-7 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-8 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-9 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-10 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-11 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-12 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-13 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-14 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-15 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-16 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-17 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-18 ed.gomez@free.fr--2004-1/xvidcore--devapi4-ppc--1.0--patch-19 ptk9417@rit.edu--2004-1/xvidcore--devapi4-ppc--1.0--base-0 ptk9417@rit.edu--2004-1/xvidcore--devapi4-ppc--1.0--patch-1 2004-04-02 21:26:57 GMT patch-2 Summary: messed with Xvid BS version Revision: xvidcore--head--0.0--patch-2 messed with Xvid BS 
version modified files: src/xvid.h 2004-04-02 21:25:15 GMT patch-1 Summary: Brightness Postprocessing. Revision: xvidcore--head--0.0--patch-1 From Pete: * Added brightness postprocessing. From ed.gomez: * Merging changes due to CVS branches unsync state between head and last 1.0 dev branch. new files: src/image/x86_asm/.arch-ids/postprocessing_mmx.asm.id src/image/x86_asm/postprocessing_mmx.asm modified files: build/generic/sources.inc build/win32/libxvidcore.dsp dshow/src/CXvidDecoder.cpp dshow/src/config.c dshow/src/xvid.ax.rc src/decoder.c src/image/image.c src/image/image.h src/image/postprocessing.c src/image/postprocessing.h src/xvid.c src/xvid.h vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/resource.h vfw/src/resource.rc 2004-04-02 20:36:54 GMT base-0 Summary: tag of ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-53 Revision: xvidcore--head--0.0--base-0 (automatically generated log message) new patches: ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--base-0 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-1 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-2 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-3 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-4 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-5 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-6 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-7 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-8 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-9 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-10 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-11 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-12 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-13 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-14 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-15 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-16 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-17 
ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-18 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-19 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-20 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-21 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-22 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-23 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-24 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-25 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-26 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-27 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-28 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-29 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-30 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-31 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-32 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-33 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-34 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-35 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-36 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-37 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-38 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-39 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-40 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-41 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-42 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-43 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-44 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-45 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-46 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-47 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-48 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-49 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-50 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-51 
ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-52 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-53 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-54 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-55 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-56 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-57 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-58 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-59 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-60 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-61 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-62 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-63 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-64 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-65 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-66 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-67 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-68 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-69 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-70 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-71 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-72 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-73 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-74 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-75 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-76 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-77 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-78 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-79 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-80 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-81 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-82 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-83 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-84 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-85 
ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-86 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-87 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-88 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-89 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-90 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-91 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-92 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-93 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-94 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-95 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-96 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-97 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-98 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-99 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-100 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-101 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-102 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-103 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-104 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-105 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-106 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-107 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-108 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-109 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-110 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-111 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-112 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-113 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-114 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-115 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-116 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-117 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-118 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-119 
ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-120 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-121 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-122 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-123 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-124 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-125 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-126 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-127 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-128 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-129 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-130 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-131 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-132 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-133 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-134 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-135 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-136 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-137 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-138 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-139 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-140 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-141 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-142 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-143 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-144 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-145 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-146 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-147 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-148 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-149 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-150 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-151 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-152 
ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-153 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-154 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-155 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-156 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-157 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-158 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-159 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-160 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-161 ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-162 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--base-0 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-1 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-2 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-3 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-4 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-5 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-6 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-7 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-8 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-9 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-10 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-11 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-12 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-13 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-14 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-15 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-16 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-17 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-18 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-19 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-20 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-21 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-22 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-23 
ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-24 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-25 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-26 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-27 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-28 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-29 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-30 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-31 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-32 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-33 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-34 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-35 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-36 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-37 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-38 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-39 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-40 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-41 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-42 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-43 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-44 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-45 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-46 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-47 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-48 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-49 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-50 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-51 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-52 ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-53 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--base-0 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-1 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-2 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-3 
ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-4 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-5 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-6 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-7 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-8 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-9 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-10 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-11 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-12 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-13 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-14 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-15 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-16 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-17 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-18 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-19 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-20 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-21 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-22 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-23 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-24 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-25 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-26 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-27 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-28 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-29 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-30 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-31 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-32 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-33 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-34 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-35 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-36 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-37 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-38 
ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-39 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-40 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-41 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-42 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-43 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-44 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-45 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-46 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-47 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-48 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-49 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-50 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-51 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-52 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-53 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-54 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-55 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-56 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-57 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-58 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-59 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-60 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-61 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-62 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-63 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-64 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-65 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-66 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-67 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-68 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-69 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-70 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-71 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-72 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-73 
ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-74 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-75 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-76 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-77 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-78 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-79 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-80 ed.gomez@free.fr--main/xvidcore--stable--0.9--base-0 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-1 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-2 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-3 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-4 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-5 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-6 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-7 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-8 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-9 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-10 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-11 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-12 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-13 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-14 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-15 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-16 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-17 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-18 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-19 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-20 ed.gomez@free.fr--main/xvidcore--stable--0.9--version-0 ed.gomez@free.fr--main/xvidcore--stable--1.0--base-0 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-1 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-2 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-3 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-4 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-5 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-6 
xvidcore/dshow/0000775000076500007650000000000011567132332014612 5ustar xvidbuildxvidbuild
xvidcore/dshow/sources.inc0000664000076500007650000000023110312251441016752 0ustar xvidbuildxvidbuild
LIBSO = xvid.ax

SRC_DIR = src

SRC_C = \
	config.c \
	debug.c \

SRC_CPP = \
	Configure.cpp \
	CAbout.cpp \
	CXvidDecoder.cpp

SRC_RES = \
	xvid.ax.rc
xvidcore/dshow/dshow.dsp0000664000076500007650000001160511567132326016454 0ustar xvidbuildxvidbuild
# Microsoft Developer Studio Project File - Name="dshow" - Package Owner=<4>
# Microsoft Developer Studio Generated Build File, Format Version 6.00
# ** DO NOT EDIT **

# TARGTYPE "Win32 (x86) Dynamic-Link Library" 0x0102

CFG=dshow - Win32 Debug
!MESSAGE This is not a valid makefile. To build this project using NMAKE,
!MESSAGE use the Export Makefile command and run
!MESSAGE
!MESSAGE NMAKE /f "dshow.mak".
!MESSAGE
!MESSAGE You can specify a configuration when running NMAKE
!MESSAGE by defining the macro CFG on the command line. For example:
!MESSAGE
!MESSAGE NMAKE /f "dshow.mak" CFG="dshow - Win32 Debug"
!MESSAGE
!MESSAGE Possible choices for configuration are:
!MESSAGE
!MESSAGE "dshow - Win32 Release" (based on "Win32 (x86) Dynamic-Link Library")
!MESSAGE "dshow - Win32 Debug" (based on "Win32 (x86) Dynamic-Link Library")
!MESSAGE

# Begin Project
# PROP AllowPerConfigDependencies 0
# PROP Scc_ProjName ""
# PROP Scc_LocalPath ""
CPP=cl.exe
MTL=midl.exe
RSC=rc.exe

!IF "$(CFG)" == "dshow - Win32 Release"

# PROP BASE Use_MFC 0
# PROP BASE Use_Debug_Libraries 0
# PROP BASE Output_Dir "Release"
# PROP BASE Intermediate_Dir "Release"
# PROP BASE Target_Dir ""
# PROP Use_MFC 0
# PROP Use_Debug_Libraries 0
# PROP Output_Dir "Release"
# PROP Intermediate_Dir "Release"
# PROP Ignore_Export_Lib 0
# PROP Target_Dir ""
# ADD BASE CPP /nologo /MT /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c
# ADD CPP /nologo /MT /W3 /GX /O2 /I "..\src" /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c
# ADD BASE MTL /nologo /D "NDEBUG" /mktyplib203 /win32
# ADD MTL /nologo /D "NDEBUG" /mktyplib203 /win32
# ADD BASE RSC /l 0xc09 /d "NDEBUG"
# ADD RSC /l 0xc09 /d "NDEBUG"
BSC32=bscmake.exe
# ADD BASE BSC32 /nologo
# ADD BSC32 /nologo
LINK32=link.exe
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /dll /machine:I386
# ADD LINK32 kernel32.lib user32.lib msvcrt.lib advapi32.lib winmm.lib ole32.lib uuid.lib strmbase.lib oleaut32.lib comctl32.lib /nologo /entry:"DllMain" /dll /machine:I386 /nodefaultlib /out:"bin\xvid.ax"
# SUBTRACT LINK32 /pdb:none

!ELSEIF "$(CFG)" == "dshow - Win32 Debug"

# PROP BASE Use_MFC 0
# PROP BASE Use_Debug_Libraries 1
# PROP BASE Output_Dir "Debug"
# PROP BASE Intermediate_Dir "Debug"
# PROP BASE Target_Dir ""
# PROP Use_MFC 0
# PROP Use_Debug_Libraries 1
# PROP Output_Dir "Debug"
# PROP Intermediate_Dir "Debug"
# PROP Ignore_Export_Lib 0
# PROP Target_Dir ""
# ADD BASE CPP /nologo /MTd /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /GZ /c
# ADD CPP /nologo /MTd /W3 /Gm /GX /ZI /Od /I "..\src" /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /GZ /c
# ADD BASE MTL /nologo /D "_DEBUG" /mktyplib203 /win32
# ADD MTL /nologo /D "_DEBUG" /mktyplib203 /win32
# ADD BASE RSC /l 0xc09 /d "_DEBUG"
# ADD RSC /l 0xc09 /d "_DEBUG"
BSC32=bscmake.exe
# ADD BASE BSC32 /nologo
# ADD BSC32 /nologo
LINK32=link.exe
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /dll /debug /machine:I386 /pdbtype:sept
# ADD LINK32 kernel32.lib user32.lib msvcrt.lib advapi32.lib winmm.lib ole32.lib uuid.lib strmbasd.lib oleaut32.lib comctl32.lib /nologo /entry:"DllMain" /dll /debug /machine:I386 /nodefaultlib /out:"bin\xvid.ax" /pdbtype:sept
# SUBTRACT LINK32 /pdb:none

!ENDIF

# Begin Target

# Name "dshow - Win32 Release"
# Name "dshow - Win32 Debug"
# Begin Group "Source Files"

# PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat"
# Begin Source File

SOURCE=.\src\CAbout.cpp
# End Source File
# Begin Source File

SOURCE=.\src\config.c
# End Source File
# Begin Source File

SOURCE=.\src\Configure.cpp
# End Source File
# Begin Source File

SOURCE=.\src\CXvidDecoder.cpp
# End Source File
# Begin Source File

SOURCE=.\src\debug.c
# End Source File
# End Group
# Begin Group "Header Files"

# PROP Default_Filter "h;hpp;hxx;hm;inl"
# Begin Source File

SOURCE=.\src\CAbout.h
# End Source File
# Begin Source File

SOURCE=.\src\config.h
# End Source File
# Begin Source File

SOURCE=.\src\CXvidDecoder.h
# End Source File
# Begin Source File

SOURCE=.\src\debug.h
# End Source File
# Begin Source File

SOURCE=.\src\IXvidDecoder.h
# End Source File
# Begin Source File

SOURCE=.\src\resource.h
# End Source File
# End Group
# Begin Group "Resource Files"

# PROP Default_Filter "ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe"
# Begin Source File

SOURCE=.\src\xvid.ax.rc
# End Source File
# Begin Source File

SOURCE=.\src\XviD_logo.bmp
# End Source File
# End Group
# Begin Group "Linker Defs"

# PROP Default_Filter "def"
# Begin Source File

SOURCE=.\src\xvid.ax.def
# End Source File
# End Group
# End Target
# End Project
xvidcore/dshow/src/0000775000076500007650000000000011566427761015415 5ustar xvidbuildxvidbuild
xvidcore/dshow/src/CAbout.cpp0000664000076500007650000000435311565210673017272 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC - DShow Front End
 * - About Property Page -
 *
 * Copyright(C) 2002-2004 Peter Ross
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: CAbout.cpp 2006 2011-05-19 12:48:59Z Isibaar $
 *
 ****************************************************************************/

/****************************************************************************
 *
 * 2004/02/01 - Move configuration processing code into config.c
 * 2003/12/11 - added some additional options, mainly to make the deblocking
 *              code from xvidcore available. Most of the new code is taken
 *              from Nic's dshow filter, (C) Nic, http://nic.dnsalias.com
 *
 ****************************************************************************/

#include
#include
#include "CAbout.h"
#include "CXvidDecoder.h"
#include "resource.h"
#include "config.h"

CUnknown * WINAPI CAbout::CreateInstance(LPUNKNOWN punk, HRESULT *phr)
{
	CAbout * pNewObject = new CAbout(punk, phr);
	if (pNewObject == NULL)
	{
		*phr = E_OUTOFMEMORY;
	}
	return pNewObject;
}

CAbout::CAbout(LPUNKNOWN pUnk, HRESULT * phr) :
	CBasePropertyPage(NAME("CAbout"), pUnk, IDD_ABOUT, IDS_ABOUT)
{
	ASSERT(phr);
}

CAbout::~CAbout()
{
}

INT_PTR CAbout::OnReceiveMessage(HWND hwnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
	if (adv_proc(hwnd, uMsg, wParam, lParam) == FALSE)
	{
		return CBasePropertyPage::OnReceiveMessage(hwnd, uMsg, wParam, lParam);
	}
	return TRUE;
}
xvidcore/dshow/src/config.c0000664000076500007650000002276411564770043017031 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - Configuration processing -
 *
 * Copyright(C) 2002-2011 Peter Ross
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: config.c 1995 2011-05-18 16:13:23Z Isibaar $
 *
 ****************************************************************************/

#include
#include
#include
#include "config.h"
#include "debug.h"
#include "resource.h"

// -----------------------------------------
// global config structure
CONFIG g_config;

void LoadRegistryInfo()
{
	HKEY hKey;
	DWORD size;
	RegOpenKeyEx(XVID_REG_KEY, XVID_REG_SUBKEY, 0, KEY_READ, &hKey);

	// Set the default post-processing settings
	REG_GET_N("Brightness", g_config.nBrightness, 0)
	REG_GET_N("Deblock_Y", g_config.nDeblock_Y, 0)
	REG_GET_N("Deblock_UV", g_config.nDeblock_UV, 0)
	REG_GET_N("Dering_Y", g_config.nDering_Y, 0)
	REG_GET_N("Dering_UV", g_config.nDering_UV, 0)
	REG_GET_N("FilmEffect", g_config.nFilmEffect, 0)
	REG_GET_N("ForceColorspace", g_config.nForceColorspace, 0)
	REG_GET_N("FlipVideo", g_config.nFlipVideo, 0)
	REG_GET_N("Supported_4CC", g_config.supported_4cc, 0)
	REG_GET_N("Videoinfo_Compat", g_config.videoinfo_compat, 0)
	REG_GET_N("Decoder_Aspect_Ratio", g_config.aspect_ratio, 0)
	REG_GET_N("num_threads", g_config.num_threads, 0)
	REG_GET_N("cpu_flags", g_config.cpu, 0)

	RegCloseKey(hKey);
}

void SaveRegistryInfo()
{
	HKEY hKey;
	DWORD dispo;

	if (RegCreateKeyEx(
			XVID_REG_KEY,
			XVID_REG_SUBKEY,
			0,
			XVID_REG_CLASS,
			REG_OPTION_NON_VOLATILE,
			KEY_WRITE,
			0,
			&hKey,
			&dispo) != ERROR_SUCCESS)
	{
		OutputDebugString("Couldn't create XVID_REG_SUBKEY");
		return;
	}

	REG_SET_N("Brightness", g_config.nBrightness);
	REG_SET_N("Deblock_Y", g_config.nDeblock_Y);
	REG_SET_N("Deblock_UV", g_config.nDeblock_UV);
	REG_SET_N("Dering_Y", g_config.nDering_Y);
	REG_SET_N("Dering_UV", g_config.nDering_UV);
	REG_SET_N("FilmEffect", g_config.nFilmEffect);
	REG_SET_N("ForceColorspace", g_config.nForceColorspace);
	REG_SET_N("FlipVideo", g_config.nFlipVideo);
	REG_SET_N("Supported_4CC", g_config.supported_4cc);
	REG_SET_N("Videoinfo_Compat", g_config.videoinfo_compat);
	REG_SET_N("Decoder_Aspect_Ratio", g_config.aspect_ratio);
	REG_SET_N("num_threads", g_config.num_threads);

	RegCloseKey(hKey);
}

INT_PTR CALLBACK adv_proc(HWND hwnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
	HWND hBrightness;

	switch ( uMsg )
	{
	case WM_DESTROY:
		{
			LPARAM nForceColorspace;
			LPARAM aspect_ratio;

			nForceColorspace = SendMessage(GetDlgItem(hwnd, IDC_COLORSPACE), CB_GETCURSEL, 0, 0);
			if ( g_config.nForceColorspace != nForceColorspace )
			{
				MessageBox(0, "You have changed the output colorspace.\r\nClose the movie and open it for the new colorspace to take effect.", "Xvid DShow", MB_TOPMOST);
			}
			g_config.nForceColorspace = (int) nForceColorspace;

			aspect_ratio = SendMessage(GetDlgItem(hwnd, IDC_USE_AR), CB_GETCURSEL, 0, 0);
			if ( g_config.aspect_ratio != aspect_ratio )
			{
				MessageBox(0, "You have changed the default aspect ratio.\r\nClose the movie and open it for the new aspect ratio to take effect.", "Xvid DShow", MB_TOPMOST);
			}
			g_config.aspect_ratio = (int) aspect_ratio;

			SaveRegistryInfo();
		}
		break;

	case WM_INITDIALOG:
		{
			xvid_gbl_info_t info;
			char core[100];
			HINSTANCE m_hdll;

			memset(&info, 0, sizeof(info));
			info.version = XVID_VERSION;

			m_hdll = LoadLibrary(XVID_DLL_NAME);
			if (m_hdll != NULL) {
				((int (__cdecl *)(void *, int, void *, void *))GetProcAddress(m_hdll, "xvid_global"))
					(0, XVID_GBL_INFO, &info, NULL);

				wsprintf(core, "Xvid MPEG-4 Video Codec v%d.%d.%d",
					XVID_VERSION_MAJOR(info.actual_version),
					XVID_VERSION_MINOR(info.actual_version),
					XVID_VERSION_PATCH(info.actual_version));

				FreeLibrary(m_hdll);
			} else {
				wsprintf(core, "xvidcore.dll not found!");
			}

			SetDlgItemText(hwnd, IDC_CORE, core);
		}

		// Load Force Colorspace Box
		SendMessage(GetDlgItem(hwnd, IDC_COLORSPACE), CB_ADDSTRING, 0, (LPARAM)"No Force");
		SendMessage(GetDlgItem(hwnd, IDC_COLORSPACE), CB_ADDSTRING, 0, (LPARAM)"YV12");
		SendMessage(GetDlgItem(hwnd, IDC_COLORSPACE), CB_ADDSTRING, 0, (LPARAM)"YUY2");
		SendMessage(GetDlgItem(hwnd, IDC_COLORSPACE), CB_ADDSTRING, 0, (LPARAM)"RGB24");
		SendMessage(GetDlgItem(hwnd, IDC_COLORSPACE), CB_ADDSTRING, 0, (LPARAM)"RGB32");

		// Select Colorspace
		SendMessage(GetDlgItem(hwnd, IDC_COLORSPACE), CB_SETCURSEL, g_config.nForceColorspace, 0);

		hBrightness = GetDlgItem(hwnd, IDC_BRIGHTNESS);
		SendMessage(hBrightness, TBM_SETRANGE, (WPARAM)TRUE, (LPARAM)MAKELONG(-96, 96));
		SendMessage(hBrightness, TBM_SETTICFREQ, (WPARAM)16, (LPARAM)0);
		SendMessage(hBrightness, TBM_SETPOS, (WPARAM)TRUE, (LPARAM) g_config.nBrightness);

		// Load Aspect Ratio Box
		SendMessage(GetDlgItem(hwnd, IDC_USE_AR), CB_ADDSTRING, 0, (LPARAM)"Auto (MPEG-4 first)");
		SendMessage(GetDlgItem(hwnd, IDC_USE_AR), CB_ADDSTRING, 0, (LPARAM)"Auto (external first)");
		SendMessage(GetDlgItem(hwnd, IDC_USE_AR), CB_ADDSTRING, 0, (LPARAM)"4:3");
		SendMessage(GetDlgItem(hwnd, IDC_USE_AR), CB_ADDSTRING, 0, (LPARAM)"16:9");
		SendMessage(GetDlgItem(hwnd, IDC_USE_AR), CB_ADDSTRING, 0, (LPARAM)"2.35:1");

		// Select Aspect Ratio
		SendMessage(GetDlgItem(hwnd, IDC_USE_AR), CB_SETCURSEL, g_config.aspect_ratio, 0);

		// Load Buttons
		SendMessage(GetDlgItem(hwnd, IDC_DEBLOCK_Y), BM_SETCHECK, g_config.nDeblock_Y, 0);
		SendMessage(GetDlgItem(hwnd, IDC_DEBLOCK_UV), BM_SETCHECK, g_config.nDeblock_UV, 0);
		SendMessage(GetDlgItem(hwnd, IDC_DERINGY), BM_SETCHECK, g_config.nDering_Y, 0);
		SendMessage(GetDlgItem(hwnd, IDC_DERINGUV), BM_SETCHECK, g_config.nDering_UV, 0);
		SendMessage(GetDlgItem(hwnd, IDC_FILMEFFECT), BM_SETCHECK, g_config.nFilmEffect, 0);
		SendMessage(GetDlgItem(hwnd, IDC_FLIPVIDEO), BM_SETCHECK, g_config.nFlipVideo, 0);

		// 4CC checkbuttons
		SendMessage(GetDlgItem(hwnd, IDC_DIVX), BM_SETCHECK, g_config.supported_4cc & SUPPORT_DIVX, 0);
		SendMessage(GetDlgItem(hwnd, IDC_3IVX), BM_SETCHECK, g_config.supported_4cc & SUPPORT_3IVX, 0);
		SendMessage(GetDlgItem(hwnd, IDC_MP4V), BM_SETCHECK, g_config.supported_4cc & SUPPORT_MP4V, 0);
		SendMessage(GetDlgItem(hwnd, IDC_COMPAT), BM_SETCHECK, g_config.videoinfo_compat, 0);

		EnableWindow(GetDlgItem(hwnd,IDC_DERINGY),g_config.nDeblock_Y);
		EnableWindow(GetDlgItem(hwnd,IDC_DERINGUV),g_config.nDeblock_UV);
		EnableWindow(GetDlgItem(hwnd, IDC_USE_AR), !g_config.videoinfo_compat);

		// Set Date & Time of Compilation
		DPRINTF("(%s %s)", __DATE__, __TIME__);
		break;

	case WM_COMMAND:
		switch ( wParam )
		{
		case IDC_RESET:
			ZeroMemory(&g_config, sizeof(CONFIG));
			hBrightness = GetDlgItem(hwnd, IDC_BRIGHTNESS);
			SendMessage(hBrightness, TBM_SETPOS, (WPARAM) TRUE, (LPARAM) g_config.nBrightness);
			// Load Buttons
			SendMessage(GetDlgItem(hwnd, IDC_DEBLOCK_Y), BM_SETCHECK, g_config.nDeblock_Y, 0);
			SendMessage(GetDlgItem(hwnd, IDC_DEBLOCK_UV), BM_SETCHECK, g_config.nDeblock_UV, 0);
			SendMessage(GetDlgItem(hwnd, IDC_DERINGY), BM_SETCHECK, g_config.nDering_Y, 0);
			SendMessage(GetDlgItem(hwnd, IDC_DERINGUV), BM_SETCHECK, g_config.nDering_UV, 0);
			SendMessage(GetDlgItem(hwnd, IDC_FILMEFFECT), BM_SETCHECK, g_config.nFilmEffect, 0);
			SendMessage(GetDlgItem(hwnd, IDC_FLIPVIDEO), BM_SETCHECK, g_config.nFlipVideo, 0);
			g_config.nForceColorspace = 0;
			SendMessage(GetDlgItem(hwnd, IDC_COLORSPACE), CB_SETCURSEL, g_config.nForceColorspace, 0);
			g_config.aspect_ratio = 0;
			SendMessage(GetDlgItem(hwnd, IDC_USE_AR), CB_SETCURSEL, g_config.aspect_ratio, 0);
			break;
		case IDC_DEBLOCK_Y:
			g_config.nDeblock_Y = !g_config.nDeblock_Y;
			break;
		case IDC_DEBLOCK_UV:
			g_config.nDeblock_UV = !g_config.nDeblock_UV;
			break;
		case IDC_DERINGY:
			g_config.nDering_Y = !g_config.nDering_Y;
			break;
		case IDC_DERINGUV:
			g_config.nDering_UV = !g_config.nDering_UV;
			break;
		case IDC_FILMEFFECT:
			g_config.nFilmEffect = !g_config.nFilmEffect;
			break;
		case IDC_FLIPVIDEO:
			g_config.nFlipVideo = !g_config.nFlipVideo;
			break;
		case IDC_DIVX:
			g_config.supported_4cc ^= SUPPORT_DIVX;
			break;
		case IDC_3IVX:
			g_config.supported_4cc ^= SUPPORT_3IVX;
			break;
		case IDC_MP4V:
			g_config.supported_4cc ^= SUPPORT_MP4V;
			break;
		case IDC_COMPAT:
			g_config.videoinfo_compat = !g_config.videoinfo_compat;
			break;
		default :
			return FALSE;
		}
		EnableWindow(GetDlgItem(hwnd,IDC_DERINGY),g_config.nDeblock_Y);
		EnableWindow(GetDlgItem(hwnd,IDC_DERINGUV),g_config.nDeblock_UV);
		EnableWindow(GetDlgItem(hwnd, IDC_USE_AR), !g_config.videoinfo_compat);
		SaveRegistryInfo();
		break;

	case WM_NOTIFY:
		hBrightness = GetDlgItem(hwnd, IDC_BRIGHTNESS);
		g_config.nBrightness = (int) SendMessage(hBrightness, TBM_GETPOS, (WPARAM)NULL, (LPARAM)NULL);
		SaveRegistryInfo();
		break;

	default :
		return FALSE;
	}

	return TRUE; /* ok */
}
xvidcore/dshow/src/CXvidDecoder.cpp0000664000076500007650000015413411565210673020423 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - Xvid Decoder part of the DShow Filter -
 *
 * Copyright(C) 2002-2011 Peter Ross
 *              2003-2011 Michael Militzer
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: CXvidDecoder.cpp 2006 2011-05-19 12:48:59Z Isibaar $ * ****************************************************************************/ /* this requires the directx sdk place these paths at the top of the Tools|Options|Directories list headers: C:\DX90SDK\Include C:\DX90SDK\Samples\C++\DirectShow\BaseClasses C:\DX90SDK\Samples\C++\DirectShow\BaseClasses\Release C:\DX90SDK\Samples\C++\DirectShow\BaseClasses\Debug */ #ifdef ENABLE_MFT #define XVID_USE_MFT #endif #define XVID_USE_TRAYICON #include #include #include #include #if (1100 > _MSC_VER) #include #endif #include // VIDEOINFOHEADER2 #if defined(XVID_USE_MFT) #define MFT_UNIQUE_METHOD_NAMES #include #include #include #include #endif #include #include // Xvid API #include "resource.h" #include "IXvidDecoder.h" #include "CXvidDecoder.h" #include "CAbout.h" #include "config.h" #include "debug.h" static bool USE_IYUV; static bool USE_YV12; static bool USE_YUY2; static bool USE_YVYU; static bool USE_UYVY; static bool USE_RGB32; static bool USE_RGB24; static bool USE_RG555; static bool USE_RG565; const AMOVIESETUP_MEDIATYPE sudInputPinTypes[] = { { &MEDIATYPE_Video, &CLSID_XVID }, { &MEDIATYPE_Video, &CLSID_XVID_UC }, { &MEDIATYPE_Video, &CLSID_DIVX }, { &MEDIATYPE_Video, &CLSID_DIVX_UC }, { &MEDIATYPE_Video, &CLSID_DX50 }, { &MEDIATYPE_Video, &CLSID_DX50_UC }, { &MEDIATYPE_Video, &CLSID_3IVX }, { &MEDIATYPE_Video, &CLSID_3IVX_UC }, { &MEDIATYPE_Video, &CLSID_3IV0 }, { &MEDIATYPE_Video, &CLSID_3IV0_UC }, { &MEDIATYPE_Video, &CLSID_3IV1 }, { &MEDIATYPE_Video, &CLSID_3IV1_UC }, { &MEDIATYPE_Video, &CLSID_3IV2 }, { &MEDIATYPE_Video, &CLSID_3IV2_UC }, { &MEDIATYPE_Video, &CLSID_LMP4 }, { &MEDIATYPE_Video, &CLSID_LMP4_UC }, { &MEDIATYPE_Video, &CLSID_RMP4 }, { &MEDIATYPE_Video, &CLSID_RMP4_UC }, { 
&MEDIATYPE_Video, &CLSID_SMP4 }, { &MEDIATYPE_Video, &CLSID_SMP4_UC }, { &MEDIATYPE_Video, &CLSID_HDX4 }, { &MEDIATYPE_Video, &CLSID_HDX4_UC }, { &MEDIATYPE_Video, &CLSID_MP4V }, { &MEDIATYPE_Video, &CLSID_MP4V_UC }, }; const AMOVIESETUP_MEDIATYPE sudOutputPinTypes[] = { { &MEDIATYPE_Video, &MEDIASUBTYPE_NULL } }; const AMOVIESETUP_PIN psudPins[] = { { L"Input", // String pin name FALSE, // Is it rendered FALSE, // Is it an output FALSE, // Allowed none FALSE, // Allowed many &CLSID_NULL, // Connects to filter L"Output", // Connects to pin sizeof(sudInputPinTypes) / sizeof(AMOVIESETUP_MEDIATYPE), // Number of types &sudInputPinTypes[0] // The pin details }, { L"Output", // String pin name FALSE, // Is it rendered TRUE, // Is it an output FALSE, // Allowed none FALSE, // Allowed many &CLSID_NULL, // Connects to filter L"Input", // Connects to pin sizeof(sudOutputPinTypes) / sizeof(AMOVIESETUP_MEDIATYPE), // Number of types sudOutputPinTypes // The pin details } }; const AMOVIESETUP_FILTER sudXvidDecoder = { &CLSID_XVID, // Filter CLSID XVID_NAME_L, // Filter name MERIT_PREFERRED+2, // Its merit sizeof(psudPins) / sizeof(AMOVIESETUP_PIN), // Number of pins psudPins // Pin details }; // List of class IDs and creator functions for the class factory. This // provides the link between the OLE entry point in the DLL and an object // being created. 
The class factory will call the static CreateInstance CFactoryTemplate g_Templates[] = { { XVID_NAME_L, &CLSID_XVID, CXvidDecoder::CreateInstance, NULL, &sudXvidDecoder }, { XVID_NAME_L L"About", &CLSID_CABOUT, CAbout::CreateInstance } }; /* note: g_cTemplates must be global; used by strmbase.lib(dllentry.cpp,dllsetup.cpp) */ int g_cTemplates = sizeof(g_Templates) / sizeof(CFactoryTemplate); #ifdef XVID_USE_TRAYICON extern HINSTANCE g_xvid_hInst; static int GUI_Page = 0; static int Tray_Icon = 0; extern "C" void CALLBACK Configure(HWND hWndParent, HINSTANCE hInstParent, LPSTR lpCmdLine, int nCmdShow ); LRESULT CALLBACK msg_proc(HWND hwnd, UINT uMsg, WPARAM wParam, LPARAM lParam) { switch ( uMsg ) { case WM_ICONMESSAGE: switch(lParam) { case WM_LBUTTONDBLCLK: if (!GUI_Page) { GUI_Page = 1; Configure(hwnd, g_xvid_hInst, "", 1); GUI_Page = 0; } break; default: return DefWindowProc(hwnd, uMsg, wParam, lParam); }; break; case WM_DESTROY: NOTIFYICONDATA nid; ZeroMemory(&nid,sizeof(NOTIFYICONDATA)); nid.cbSize = NOTIFYICONDATA_V1_SIZE; nid.hWnd = hwnd; nid.uID = 1456; Shell_NotifyIcon(NIM_DELETE, &nid); Tray_Icon = 0; default: return DefWindowProc(hwnd, uMsg, wParam, lParam); } return TRUE; /* ok */ } #endif STDAPI DllRegisterServer() { #if defined(XVID_USE_MFT) int inputs_num = sizeof(sudInputPinTypes) / sizeof(AMOVIESETUP_MEDIATYPE); int outputs_num = sizeof(sudOutputPinTypes) / sizeof(AMOVIESETUP_MEDIATYPE); MFT_REGISTER_TYPE_INFO * mft_bs = new MFT_REGISTER_TYPE_INFO[inputs_num]; MFT_REGISTER_TYPE_INFO * mft_csp = new MFT_REGISTER_TYPE_INFO[outputs_num]; { int i; for(i=0;i<inputs_num;i++) { mft_bs[i].guidMajorType = *sudInputPinTypes[i].clsMajorType; mft_bs[i].guidSubtype = *sudInputPinTypes[i].clsMinorType; } for(i=0;i<outputs_num;i++) { mft_csp[i].guidMajorType = *sudOutputPinTypes[i].clsMajorType; mft_csp[i].guidSubtype = *sudOutputPinTypes[i].clsMinorType; } } /* Register the MFT decoder */ MFTRegister(CLSID_XVID, // CLSID MFT_CATEGORY_VIDEO_DECODER, // Category const_cast<LPWSTR>(XVID_NAME_L), // Friendly name 0, // Flags inputs_num, // Number of input types mft_bs, // Input types outputs_num, // Number of output types mft_csp, // Output types NULL // Attributes (optional) ); delete[] mft_bs; delete[] mft_csp; #endif /* XVID_USE_MFT */ return AMovieDllRegisterServer2( TRUE ); } STDAPI DllUnregisterServer() { #if defined(XVID_USE_MFT) MFTUnregister(CLSID_XVID); #endif return
AMovieDllRegisterServer2( FALSE ); } /* create instance */ CUnknown * WINAPI CXvidDecoder::CreateInstance(LPUNKNOWN punk, HRESULT *phr) { CXvidDecoder * pNewObject = new CXvidDecoder(punk, phr); if (pNewObject == NULL) { *phr = E_OUTOFMEMORY; } return pNewObject; } /* query interfaces */ STDMETHODIMP CXvidDecoder::NonDelegatingQueryInterface(REFIID riid, void **ppv) { CheckPointer(ppv, E_POINTER); if (riid == IID_IXvidDecoder) { return GetInterface((IXvidDecoder *) this, ppv); } if (riid == IID_ISpecifyPropertyPages) { return GetInterface((ISpecifyPropertyPages *) this, ppv); } #if defined(XVID_USE_MFT) if (riid == IID_IMFTransform) { return GetInterface((IMFTransform *) this, ppv); } #endif return CVideoTransformFilter::NonDelegatingQueryInterface(riid, ppv); } /* constructor */ CXvidDecoder::CXvidDecoder(LPUNKNOWN punk, HRESULT *phr) : CVideoTransformFilter(NAME("CXvidDecoder"), punk, CLSID_XVID), m_hdll (NULL) { DPRINTF("Constructor"); xvid_decore_func = NULL; // Hmm, some strange errors appearing if I try to initialize... xvid_global_func = NULL; // ...this in constructor's init-list. So, they assigned here. #if defined(XVID_USE_MFT) InitializeCriticalSection(&m_mft_lock); m_pInputType = NULL; m_pOutputType = NULL; m_rtFrame = 0; m_duration = 0; m_discont = 0; m_frameRate.Denominator = 1; m_frameRate.Numerator = 1; #endif LoadRegistryInfo(); *phr = OpenLib(); } HRESULT CXvidDecoder::OpenLib() { DPRINTF("OpenLib"); if (m_hdll != NULL) return E_UNEXPECTED; // Seems, that library already opened. 
xvid_gbl_init_t init; memset(&init, 0, sizeof(init)); init.version = XVID_VERSION; init.cpu_flags = g_config.cpu; xvid_gbl_info_t info; memset(&info, 0, sizeof(info)); info.version = XVID_VERSION; m_hdll = LoadLibrary(XVID_DLL_NAME); if (m_hdll == NULL) { DPRINTF("dll load failed"); MessageBox(0, XVID_DLL_NAME " not found","Error", MB_TOPMOST); return E_FAIL; } xvid_global_func = (int (__cdecl *)(void *, int, void *, void *))GetProcAddress(m_hdll, "xvid_global"); if (xvid_global_func == NULL) { FreeLibrary(m_hdll); m_hdll = NULL; MessageBox(0, "xvid_global() not found", "Error", MB_TOPMOST); return E_FAIL; } xvid_decore_func = (int (__cdecl *)(void *, int, void *, void *))GetProcAddress(m_hdll, "xvid_decore"); if (xvid_decore_func == NULL) { xvid_global_func = NULL; FreeLibrary(m_hdll); m_hdll = NULL; MessageBox(0, "xvid_decore() not found", "Error", MB_TOPMOST); return E_FAIL; } if (xvid_global_func(0, XVID_GBL_INIT, &init, NULL) < 0) { xvid_global_func = NULL; xvid_decore_func = NULL; FreeLibrary(m_hdll); m_hdll = NULL; MessageBox(0, "xvid_global() failed", "Error", MB_TOPMOST); return E_FAIL; } if (xvid_global_func(0, XVID_GBL_INFO, &info, NULL) < 0) { xvid_global_func = NULL; xvid_decore_func = NULL; FreeLibrary(m_hdll); m_hdll = NULL; MessageBox(0, "xvid_global() failed", "Error", MB_TOPMOST); return E_FAIL; } memset(&m_create, 0, sizeof(m_create)); m_create.version = XVID_VERSION; m_create.handle = NULL; /* Decoder threads */ if (g_config.cpu & XVID_CPU_FORCE) { m_create.num_threads = g_config.num_threads; } else { m_create.num_threads = info.num_threads; /* Autodetect */ g_config.num_threads = info.num_threads; } memset(&m_frame, 0, sizeof(m_frame)); m_frame.version = XVID_VERSION; USE_IYUV = false; USE_YV12 = false; USE_YUY2 = false; USE_YVYU = false; USE_UYVY = false; USE_RGB32 = false; USE_RGB24 = false; USE_RG555 = false; USE_RG565 = false; switch ( g_config.nForceColorspace ) { case FORCE_NONE: USE_IYUV = true; USE_YV12 = true; USE_YUY2 = true; USE_YVYU 
= true; USE_UYVY = true; USE_RGB32 = true; USE_RGB24 = true; USE_RG555 = true; USE_RG565 = true; break; case FORCE_YV12: USE_IYUV = true; USE_YV12 = true; break; case FORCE_YUY2: USE_YUY2 = true; break; case FORCE_RGB24: USE_RGB24 = true; break; case FORCE_RGB32: USE_RGB32 = true; break; } switch (g_config.aspect_ratio) { case 0: case 1: break; case 2: ar_x = 4; ar_y = 3; break; case 3: ar_x = 16; ar_y = 9; break; case 4: ar_x = 47; ar_y = 20; break; } return S_OK; } void CXvidDecoder::CloseLib() { DPRINTF("CloseLib"); if ((m_create.handle != NULL) && (xvid_decore_func != NULL)) { xvid_decore_func(m_create.handle, XVID_DEC_DESTROY, 0, 0); m_create.handle = NULL; } if (m_hdll != NULL) { FreeLibrary(m_hdll); m_hdll = NULL; } xvid_decore_func = NULL; xvid_global_func = NULL; } /* destructor */ CXvidDecoder::~CXvidDecoder() { DPRINTF("Destructor"); #ifdef XVID_USE_TRAYICON if (Tray_Icon) { /* Destroy tray icon */ NOTIFYICONDATA nid; ZeroMemory(&nid,sizeof(NOTIFYICONDATA)); nid.cbSize = NOTIFYICONDATA_V1_SIZE; nid.hWnd = MSG_hwnd; nid.uID = 1456; Shell_NotifyIcon(NIM_DELETE, &nid); Tray_Icon = 0; } #endif /* Close xvidcore library */ CloseLib(); #if defined(XVID_USE_MFT) DeleteCriticalSection(&m_mft_lock); #endif } /* check input type */ HRESULT CXvidDecoder::CheckInputType(const CMediaType * mtIn) { DPRINTF("CheckInputType"); BITMAPINFOHEADER * hdr; ar_x = ar_y = 0; if (*mtIn->Type() != MEDIATYPE_Video) { DPRINTF("Error: Unknown Type"); CloseLib(); return VFW_E_TYPE_NOT_ACCEPTED; } if (m_hdll == NULL) { HRESULT hr = OpenLib(); if (FAILED(hr) || (m_hdll == NULL)) // Paranoid checks. 
return VFW_E_TYPE_NOT_ACCEPTED; } if (*mtIn->FormatType() == FORMAT_VideoInfo) { VIDEOINFOHEADER * vih = (VIDEOINFOHEADER *) mtIn->Format(); hdr = &vih->bmiHeader; } else if (*mtIn->FormatType() == FORMAT_VideoInfo2) { VIDEOINFOHEADER2 * vih2 = (VIDEOINFOHEADER2 *) mtIn->Format(); hdr = &vih2->bmiHeader; if (g_config.aspect_ratio == 0 || g_config.aspect_ratio == 1) { ar_x = vih2->dwPictAspectRatioX; ar_y = vih2->dwPictAspectRatioY; } DPRINTF("VIDEOINFOHEADER2 AR: %d:%d", ar_x, ar_y); } else if (*mtIn->FormatType() == FORMAT_MPEG2Video) { MPEG2VIDEOINFO * mpgvi = (MPEG2VIDEOINFO*)mtIn->Format(); VIDEOINFOHEADER2 * vih2 = &mpgvi->hdr; hdr = &vih2->bmiHeader; if (g_config.aspect_ratio == 0 || g_config.aspect_ratio == 1) { ar_x = vih2->dwPictAspectRatioX; ar_y = vih2->dwPictAspectRatioY; } DPRINTF("VIDEOINFOHEADER2 AR: %d:%d", ar_x, ar_y); /* haali media splitter reports VOL information in the format header */ if (mpgvi->cbSequenceHeader>0) { xvid_dec_stats_t stats; memset(&stats, 0, sizeof(stats)); stats.version = XVID_VERSION; if (m_create.handle == NULL) { if (xvid_decore_func == NULL) return E_FAIL; if (xvid_decore_func(0, XVID_DEC_CREATE, &m_create, 0) < 0) { DPRINTF("*** XVID_DEC_CREATE error"); return E_FAIL; } } m_frame.general = 0; m_frame.bitstream = (void*)mpgvi->dwSequenceHeader; m_frame.length = mpgvi->cbSequenceHeader; m_frame.output.csp = XVID_CSP_NULL; int ret = 0; if ((ret=xvid_decore_func(m_create.handle, XVID_DEC_DECODE, &m_frame, &stats)) >= 0) { /* honour video dimensions reported in VOL header */ if (stats.type == XVID_TYPE_VOL) { hdr->biWidth = stats.data.vol.width; hdr->biHeight = stats.data.vol.height; } } if (ret == XVID_ERR_MEMORY) return E_FAIL; } } else { DPRINTF("Error: Unknown FormatType"); CloseLib(); return VFW_E_TYPE_NOT_ACCEPTED; } if (hdr->biHeight < 0) { DPRINTF("colorspace: inverted input format not supported"); } m_create.width = hdr->biWidth; m_create.height = hdr->biHeight; switch(hdr->biCompression) { case FOURCC_mp4v : case 
FOURCC_MP4V : case FOURCC_lmp4 : case FOURCC_LMP4 : case FOURCC_rmp4 : case FOURCC_RMP4 : case FOURCC_smp4 : case FOURCC_SMP4 : case FOURCC_hdx4 : case FOURCC_HDX4 : if (!(g_config.supported_4cc & SUPPORT_MP4V)) { CloseLib(); return VFW_E_TYPE_NOT_ACCEPTED; } break; case FOURCC_divx : case FOURCC_DIVX : case FOURCC_dx50 : case FOURCC_DX50 : if (!(g_config.supported_4cc & SUPPORT_DIVX)) { CloseLib(); return VFW_E_TYPE_NOT_ACCEPTED; } break; case FOURCC_3ivx : case FOURCC_3IVX : case FOURCC_3iv0 : case FOURCC_3IV0 : case FOURCC_3iv1 : case FOURCC_3IV1 : case FOURCC_3iv2 : case FOURCC_3IV2 : if (!(g_config.supported_4cc & SUPPORT_3IVX)) { CloseLib(); return VFW_E_TYPE_NOT_ACCEPTED; } case FOURCC_xvid : case FOURCC_XVID : break; default : DPRINTF("Unknown fourcc: 0x%08x (%c%c%c%c)", hdr->biCompression, (hdr->biCompression)&0xff, (hdr->biCompression>>8)&0xff, (hdr->biCompression>>16)&0xff, (hdr->biCompression>>24)&0xff); CloseLib(); return VFW_E_TYPE_NOT_ACCEPTED; } m_create.fourcc = hdr->biCompression; return S_OK; } /* get list of supported output colorspaces */ HRESULT CXvidDecoder::GetMediaType(int iPosition, CMediaType *mtOut) { BITMAPINFOHEADER * bmih; DPRINTF("GetMediaType"); if (m_pInput->IsConnected() == FALSE) { return E_UNEXPECTED; } if (!g_config.videoinfo_compat) { VIDEOINFOHEADER2 * vih = (VIDEOINFOHEADER2 *) mtOut->ReallocFormatBuffer(sizeof(VIDEOINFOHEADER2)); if (vih == NULL) return E_OUTOFMEMORY; ZeroMemory(vih, sizeof (VIDEOINFOHEADER2)); bmih = &(vih->bmiHeader); mtOut->SetFormatType(&FORMAT_VideoInfo2); if (ar_x != 0 && ar_y != 0) { vih->dwPictAspectRatioX = ar_x; vih->dwPictAspectRatioY = ar_y; forced_ar = true; } else { // just to be safe vih->dwPictAspectRatioX = m_create.width; vih->dwPictAspectRatioY = abs(m_create.height); forced_ar = false; } } else { VIDEOINFOHEADER * vih = (VIDEOINFOHEADER *) mtOut->ReallocFormatBuffer(sizeof(VIDEOINFOHEADER)); if (vih == NULL) return E_OUTOFMEMORY; ZeroMemory(vih, sizeof (VIDEOINFOHEADER)); bmih = 
&(vih->bmiHeader); mtOut->SetFormatType(&FORMAT_VideoInfo); } bmih->biSize = sizeof(BITMAPINFOHEADER); bmih->biWidth = m_create.width; bmih->biHeight = m_create.height; bmih->biPlanes = 1; if (iPosition < 0) return E_INVALIDARG; switch(iPosition) { case 0: if ( USE_YUY2 ) { bmih->biCompression = MEDIASUBTYPE_YUY2.Data1; bmih->biBitCount = 16; mtOut->SetSubtype(&MEDIASUBTYPE_YUY2); break; } case 1 : if ( USE_YVYU ) { bmih->biCompression = MEDIASUBTYPE_YVYU.Data1; bmih->biBitCount = 16; mtOut->SetSubtype(&MEDIASUBTYPE_YVYU); break; } case 2 : if ( USE_UYVY ) { bmih->biCompression = MEDIASUBTYPE_UYVY.Data1; bmih->biBitCount = 16; mtOut->SetSubtype(&MEDIASUBTYPE_UYVY); break; } case 3 : if ( USE_IYUV ) { bmih->biCompression = CLSID_MEDIASUBTYPE_IYUV.Data1; bmih->biBitCount = 12; mtOut->SetSubtype(&CLSID_MEDIASUBTYPE_IYUV); break; } case 4 : if ( USE_YV12 ) { bmih->biCompression = MEDIASUBTYPE_YV12.Data1; bmih->biBitCount = 12; mtOut->SetSubtype(&MEDIASUBTYPE_YV12); break; } case 5 : if ( USE_RGB32 ) { bmih->biCompression = BI_RGB; bmih->biBitCount = 32; mtOut->SetSubtype(&MEDIASUBTYPE_RGB32); break; } case 6 : if ( USE_RGB24 ) { bmih->biCompression = BI_RGB; bmih->biBitCount = 24; mtOut->SetSubtype(&MEDIASUBTYPE_RGB24); break; } case 7 : if ( USE_RG555 ) { bmih->biCompression = BI_RGB; bmih->biBitCount = 16; mtOut->SetSubtype(&MEDIASUBTYPE_RGB555); break; } case 8 : if ( USE_RG565 ) { bmih->biCompression = BI_RGB; bmih->biBitCount = 16; mtOut->SetSubtype(&MEDIASUBTYPE_RGB565); break; } default : return VFW_S_NO_MORE_ITEMS; } bmih->biSizeImage = GetBitmapSize(bmih); mtOut->SetType(&MEDIATYPE_Video); mtOut->SetTemporalCompression(FALSE); mtOut->SetSampleSize(bmih->biSizeImage); return S_OK; } /* (internal function) change colorspace */ #define CALC_BI_STRIDE(width,bitcount) ((((width * bitcount) + 31) & ~31) >> 3) HRESULT CXvidDecoder::ChangeColorspace(GUID subtype, GUID formattype, void * format, int noflip) { DWORD biWidth; if (formattype == FORMAT_VideoInfo) { 
VIDEOINFOHEADER * vih = (VIDEOINFOHEADER * )format; biWidth = vih->bmiHeader.biWidth; out_stride = CALC_BI_STRIDE(vih->bmiHeader.biWidth, vih->bmiHeader.biBitCount); rgb_flip = (vih->bmiHeader.biHeight < 0 ? 0 : XVID_CSP_VFLIP); } else if (formattype == FORMAT_VideoInfo2) { VIDEOINFOHEADER2 * vih2 = (VIDEOINFOHEADER2 * )format; biWidth = vih2->bmiHeader.biWidth; out_stride = CALC_BI_STRIDE(vih2->bmiHeader.biWidth, vih2->bmiHeader.biBitCount); rgb_flip = (vih2->bmiHeader.biHeight < 0 ? 0 : XVID_CSP_VFLIP); } else { return S_FALSE; } if (noflip) rgb_flip = 0; if (subtype == CLSID_MEDIASUBTYPE_IYUV) { DPRINTF("IYUV"); rgb_flip = 0; m_frame.output.csp = XVID_CSP_I420; out_stride = CALC_BI_STRIDE(biWidth, 8); /* planar format fix */ } else if (subtype == MEDIASUBTYPE_YV12) { DPRINTF("YV12"); rgb_flip = 0; m_frame.output.csp = XVID_CSP_YV12; out_stride = CALC_BI_STRIDE(biWidth, 8); /* planar format fix */ } else if (subtype == MEDIASUBTYPE_YUY2) { DPRINTF("YUY2"); rgb_flip = 0; m_frame.output.csp = XVID_CSP_YUY2; } else if (subtype == MEDIASUBTYPE_YVYU) { DPRINTF("YVYU"); rgb_flip = 0; m_frame.output.csp = XVID_CSP_YVYU; } else if (subtype == MEDIASUBTYPE_UYVY) { DPRINTF("UYVY"); rgb_flip = 0; m_frame.output.csp = XVID_CSP_UYVY; } else if (subtype == MEDIASUBTYPE_RGB32) { DPRINTF("RGB32"); m_frame.output.csp = rgb_flip | XVID_CSP_BGRA; } else if (subtype == MEDIASUBTYPE_RGB24) { DPRINTF("RGB24"); m_frame.output.csp = rgb_flip | XVID_CSP_BGR; } else if (subtype == MEDIASUBTYPE_RGB555) { DPRINTF("RGB555"); m_frame.output.csp = rgb_flip | XVID_CSP_RGB555; } else if (subtype == MEDIASUBTYPE_RGB565) { DPRINTF("RGB565"); m_frame.output.csp = rgb_flip | XVID_CSP_RGB565; } else if (subtype == GUID_NULL) { m_frame.output.csp = XVID_CSP_NULL; } else { return S_FALSE; } return S_OK; } /* set output colorspace */ HRESULT CXvidDecoder::SetMediaType(PIN_DIRECTION direction, const CMediaType *pmt) { DPRINTF("SetMediaType"); if (direction == PINDIR_OUTPUT) { return 
ChangeColorspace(*pmt->Subtype(), *pmt->FormatType(), pmt->Format(), 0); } return S_OK; } /* check input<->output compatibility */ HRESULT CXvidDecoder::CheckTransform(const CMediaType *mtIn, const CMediaType *mtOut) { DPRINTF("CheckTransform"); return S_OK; } /* input/output pin connection complete */ HRESULT CXvidDecoder::CompleteConnect(PIN_DIRECTION direction, IPin *pReceivePin) { DPRINTF("CompleteConnect"); #ifdef XVID_USE_TRAYICON if ((direction == PINDIR_OUTPUT) && (Tray_Icon == 0)) { WNDCLASSEX wc; wc.cbSize = sizeof(WNDCLASSEX); wc.lpfnWndProc = msg_proc; wc.style = CS_HREDRAW | CS_VREDRAW; wc.cbWndExtra = 0; wc.cbClsExtra = 0; wc.hInstance = (HINSTANCE) g_xvid_hInst; wc.hbrBackground = (HBRUSH) GetStockObject(NULL_BRUSH); wc.lpszMenuName = NULL; wc.lpszClassName = "XVID_MSG_WINDOW"; wc.hIcon = NULL; wc.hIconSm = NULL; wc.hCursor = NULL; RegisterClassEx(&wc); MSG_hwnd = CreateWindowEx(0, "XVID_MSG_WINDOW", NULL, 0, CW_USEDEFAULT, CW_USEDEFAULT, 0, 0, HWND_MESSAGE, NULL, (HINSTANCE) g_xvid_hInst, NULL); /* display the tray icon */ NOTIFYICONDATA nid; ZeroMemory(&nid,sizeof(NOTIFYICONDATA)); nid.cbSize = NOTIFYICONDATA_V1_SIZE; nid.hWnd = MSG_hwnd; nid.uID = 1456; nid.uCallbackMessage = WM_ICONMESSAGE; nid.hIcon = LoadIcon(g_xvid_hInst, MAKEINTRESOURCE(IDI_ICON)); strcpy_s(nid.szTip, 19, "Xvid Video Decoder"); nid.uFlags = NIF_MESSAGE | NIF_ICON | NIF_TIP; Shell_NotifyIcon(NIM_ADD, &nid); DestroyIcon(nid.hIcon); Tray_Icon = 1; } #endif return S_OK; } /* input/output pin disconnected */ HRESULT CXvidDecoder::BreakConnect(PIN_DIRECTION direction) { DPRINTF("BreakConnect"); return S_OK; } /* alloc output buffer */ HRESULT CXvidDecoder::DecideBufferSize(IMemAllocator *pAlloc, ALLOCATOR_PROPERTIES *ppropInputRequest) { DPRINTF("DecideBufferSize"); HRESULT result; ALLOCATOR_PROPERTIES ppropActual; if (m_pInput->IsConnected() == FALSE) { return E_UNEXPECTED; } ppropInputRequest->cBuffers = 1; ppropInputRequest->cbBuffer = m_create.width * m_create.height * 4; //
cbAlign causes problems with the resize filter // ppropInputRequest->cbAlign = 16; ppropInputRequest->cbPrefix = 0; result = pAlloc->SetProperties(ppropInputRequest, &ppropActual); if (result != S_OK) { return result; } if (ppropActual.cbBuffer < ppropInputRequest->cbBuffer) { return E_FAIL; } return S_OK; } /* decode frame */ HRESULT CXvidDecoder::Transform(IMediaSample *pIn, IMediaSample *pOut) { DPRINTF("Transform"); xvid_dec_stats_t stats; int length; memset(&stats, 0, sizeof(stats)); stats.version = XVID_VERSION; if (m_create.handle == NULL) { if (xvid_decore_func == NULL) return E_FAIL; if (xvid_decore_func(0, XVID_DEC_CREATE, &m_create, 0) < 0) { DPRINTF("*** XVID_DEC_CREATE error"); return E_FAIL; } } AM_MEDIA_TYPE * mtOut; pOut->GetMediaType(&mtOut); if (mtOut != NULL) { HRESULT result; result = ChangeColorspace(mtOut->subtype, mtOut->formattype, mtOut->pbFormat, 0); DeleteMediaType(mtOut); if (result != S_OK) { DPRINTF("*** ChangeColorspace error"); return result; } } m_frame.length = pIn->GetActualDataLength(); if (pIn->GetPointer((BYTE**)&m_frame.bitstream) != S_OK) { return S_FALSE; } if (pOut->GetPointer((BYTE**)&m_frame.output.plane[0]) != S_OK) { return S_FALSE; } m_frame.general = XVID_LOWDELAY; if (pIn->IsDiscontinuity() == S_OK) m_frame.general |= XVID_DISCONTINUITY; if (g_config.nDeblock_Y) m_frame.general |= XVID_DEBLOCKY; if (g_config.nDeblock_UV) m_frame.general |= XVID_DEBLOCKUV; if (g_config.nDering_Y) m_frame.general |= XVID_DERINGY; if (g_config.nDering_UV) m_frame.general |= XVID_DERINGUV; if (g_config.nFilmEffect) m_frame.general |= XVID_FILMEFFECT; m_frame.brightness = g_config.nBrightness; m_frame.output.csp &= ~XVID_CSP_VFLIP; m_frame.output.csp |= rgb_flip^(g_config.nFlipVideo ? XVID_CSP_VFLIP : 0); m_frame.output.stride[0] = out_stride; // Paranoid check.
if (xvid_decore_func == NULL) return E_FAIL; repeat : if (pIn->IsPreroll() != S_OK) { length = xvid_decore_func(m_create.handle, XVID_DEC_DECODE, &m_frame, &stats); if (length == XVID_ERR_MEMORY) return E_FAIL; else if (length < 0) { DPRINTF("*** XVID_DEC_DECODE"); return S_FALSE; } else if (g_config.aspect_ratio == 0 || g_config.aspect_ratio == 1 && forced_ar == false) { if (stats.type != XVID_TYPE_NOTHING) { /* don't attempt to set the VMR aspect ratio if no frame was returned by the decoder */ // inspired by minolta! works for VMR 7 + 9 IMediaSample2 *pOut2 = NULL; AM_SAMPLE2_PROPERTIES outProp2; if (SUCCEEDED(pOut->QueryInterface(IID_IMediaSample2, (void **)&pOut2)) && SUCCEEDED(pOut2->GetProperties(FIELD_OFFSET(AM_SAMPLE2_PROPERTIES, tStart), (PBYTE)&outProp2))) { CMediaType mtOut2 = m_pOutput->CurrentMediaType(); VIDEOINFOHEADER2* vihOut2 = (VIDEOINFOHEADER2*)mtOut2.Format(); if (*mtOut2.FormatType() == FORMAT_VideoInfo2 && vihOut2->dwPictAspectRatioX != ar_x && vihOut2->dwPictAspectRatioY != ar_y) { vihOut2->dwPictAspectRatioX = ar_x; vihOut2->dwPictAspectRatioY = ar_y; pOut2->SetMediaType(&mtOut2); m_pOutput->SetMediaType(&mtOut2); } pOut2->Release(); } } } } else { /* Preroll frame - won't be displayed */ int tmp = m_frame.output.csp; int tmp_gen = m_frame.general; m_frame.output.csp = XVID_CSP_NULL; /* Disable postprocessing to speed-up seeking */ m_frame.general &= ~XVID_DEBLOCKY; m_frame.general &= ~XVID_DEBLOCKUV; /*m_frame.general &= ~XVID_DERING;*/ m_frame.general &= ~XVID_FILMEFFECT; length = xvid_decore_func(m_create.handle, XVID_DEC_DECODE, &m_frame, &stats); if (length == XVID_ERR_MEMORY) return E_FAIL; else if (length < 0) { DPRINTF("*** XVID_DEC_DECODE"); return S_FALSE; } m_frame.output.csp = tmp; m_frame.general = tmp_gen; } if (stats.type == XVID_TYPE_NOTHING && length > 0) { DPRINTF(" B-Frame decoder lag"); return S_FALSE; } if (stats.type == XVID_TYPE_VOL) { if (stats.data.vol.width != m_create.width || stats.data.vol.height != m_create.height) {
DPRINTF("TODO: auto-resize"); return S_FALSE; } pOut->SetSyncPoint(TRUE); if (g_config.aspect_ratio == 0 || g_config.aspect_ratio == 1) { /* auto */ int par_x, par_y; if (stats.data.vol.par == XVID_PAR_EXT) { par_x = stats.data.vol.par_width; par_y = stats.data.vol.par_height; } else { par_x = PARS[stats.data.vol.par-1][0]; par_y = PARS[stats.data.vol.par-1][1]; } ar_x = par_x * stats.data.vol.width; ar_y = par_y * stats.data.vol.height; } m_frame.bitstream = (BYTE*)m_frame.bitstream + length; m_frame.length -= length; goto repeat; } if (pIn->IsPreroll() == S_OK) { return S_FALSE; } return S_OK; } /* get property page list */ STDMETHODIMP CXvidDecoder::GetPages(CAUUID * pPages) { DPRINTF("GetPages"); pPages->cElems = 1; pPages->pElems = (GUID *)CoTaskMemAlloc(pPages->cElems * sizeof(GUID)); if (pPages->pElems == NULL) { return E_OUTOFMEMORY; } pPages->pElems[0] = CLSID_CABOUT; return S_OK; } /* cleanup pages */ STDMETHODIMP CXvidDecoder::FreePages(CAUUID * pPages) { DPRINTF("FreePages"); CoTaskMemFree(pPages->pElems); return S_OK; } /*=============================================================================== // MFT Interface //=============================================================================*/ #if defined(XVID_USE_MFT) #include <limits.h> // _I64_MAX #define INVALID_TIME _I64_MAX HRESULT CXvidDecoder::MFTGetStreamLimits(DWORD *pdwInputMinimum, DWORD *pdwInputMaximum, DWORD *pdwOutputMinimum, DWORD *pdwOutputMaximum) { DPRINTF("(MFT)GetStreamLimits"); if ((pdwInputMinimum == NULL) || (pdwInputMaximum == NULL) || (pdwOutputMinimum == NULL) || (pdwOutputMaximum == NULL)) return E_POINTER; /* Just a fixed number of streams allowed */ *pdwInputMinimum = *pdwInputMaximum = 1; *pdwOutputMinimum = *pdwOutputMaximum = 1; return S_OK; } HRESULT CXvidDecoder::MFTGetStreamCount(DWORD *pcInputStreams, DWORD *pcOutputStreams) { DPRINTF("(MFT)GetStreamCount"); if ((pcInputStreams == NULL) || (pcOutputStreams == NULL)) return E_POINTER; /* We have a fixed number of streams
*/ *pcInputStreams = 1; *pcOutputStreams = 1; return S_OK; } HRESULT CXvidDecoder::MFTGetStreamIDs(DWORD dwInputIDArraySize, DWORD *pdwInputIDs, DWORD dwOutputIDArraySize, DWORD *pdwOutputIDs) { DPRINTF("(MFT)GetStreamIDs"); return E_NOTIMPL; /* We have a fixed number of streams, so stream IDs match stream indices */ } HRESULT CXvidDecoder::MFTGetInputStreamInfo(DWORD dwInputStreamID, MFT_INPUT_STREAM_INFO *pStreamInfo) { DPRINTF("(MFT)GetInputStreamInfo"); if (pStreamInfo == NULL) return E_POINTER; if (dwInputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; EnterCriticalSection(&m_mft_lock); pStreamInfo->dwFlags = MFT_INPUT_STREAM_WHOLE_SAMPLES | MFT_INPUT_STREAM_SINGLE_SAMPLE_PER_BUFFER; pStreamInfo->hnsMaxLatency = 0; pStreamInfo->cbSize = 1; /* Need at least 1 byte of input */ pStreamInfo->cbMaxLookahead = 0; pStreamInfo->cbAlignment = 1; LeaveCriticalSection(&m_mft_lock); return S_OK; } HRESULT CXvidDecoder::MFTGetOutputStreamInfo(DWORD dwOutputStreamID, MFT_OUTPUT_STREAM_INFO *pStreamInfo) { DPRINTF("(MFT)GetOutputStreamInfo"); if (pStreamInfo == NULL) return E_POINTER; if (dwOutputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; EnterCriticalSection(&m_mft_lock); pStreamInfo->dwFlags = MFT_OUTPUT_STREAM_WHOLE_SAMPLES | MFT_OUTPUT_STREAM_SINGLE_SAMPLE_PER_BUFFER | MFT_OUTPUT_STREAM_FIXED_SAMPLE_SIZE | MFT_OUTPUT_STREAM_DISCARDABLE; if (m_pOutputType == NULL) { pStreamInfo->cbSize = 0; pStreamInfo->cbAlignment = 0; } else { pStreamInfo->cbSize = m_create.width * abs(m_create.height) * 4; // XXX pStreamInfo->cbAlignment = 1; } LeaveCriticalSection(&m_mft_lock); return S_OK; } HRESULT CXvidDecoder::GetAttributes(IMFAttributes** pAttributes) { DPRINTF("(MFT)GetAttributes"); return E_NOTIMPL; /* We don't support any attributes */ } HRESULT CXvidDecoder::GetInputStreamAttributes(DWORD dwInputStreamID, IMFAttributes **ppAttributes) { DPRINTF("(MFT)GetInputStreamAttributes"); return E_NOTIMPL; /* We don't support any attributes */ } HRESULT
CXvidDecoder::GetOutputStreamAttributes(DWORD dwOutputStreamID, IMFAttributes **ppAttributes) { DPRINTF("(MFT)GetOutputStreamAttributes"); return E_NOTIMPL; /* We don't support any attributes */ } HRESULT CXvidDecoder::MFTDeleteInputStream(DWORD dwStreamID) { DPRINTF("(MFT)DeleteInputStream"); return E_NOTIMPL; /* We have a fixed number of streams */ } HRESULT CXvidDecoder::MFTAddInputStreams(DWORD cStreams, DWORD *adwStreamIDs) { DPRINTF("(MFT)AddInputStreams"); return E_NOTIMPL; /* We have a fixed number of streams */ } HRESULT CXvidDecoder::MFTGetInputAvailableType(DWORD dwInputStreamID, DWORD dwTypeIndex, IMFMediaType **ppType) { DPRINTF("(MFT)GetInputAvailableType"); if (dwInputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; DWORD i = 0; GUID *bs_guid_table[32]; /* up to 24 entries are stored below */ bs_guid_table[i++] = (GUID *)&CLSID_XVID; bs_guid_table[i++] = (GUID *)&CLSID_XVID_UC; if (g_config.supported_4cc & SUPPORT_3IVX) { bs_guid_table[i++] = (GUID *)&CLSID_3IVX; bs_guid_table[i++] = (GUID *)&CLSID_3IVX_UC; bs_guid_table[i++] = (GUID *)&CLSID_3IV0; bs_guid_table[i++] = (GUID *)&CLSID_3IV0_UC; bs_guid_table[i++] = (GUID *)&CLSID_3IV1; bs_guid_table[i++] = (GUID *)&CLSID_3IV1_UC; bs_guid_table[i++] = (GUID *)&CLSID_3IV2; bs_guid_table[i++] = (GUID *)&CLSID_3IV2_UC; } if (g_config.supported_4cc & SUPPORT_DIVX) { bs_guid_table[i++] = (GUID *)&CLSID_DIVX; bs_guid_table[i++] = (GUID *)&CLSID_DIVX_UC; bs_guid_table[i++] = (GUID *)&CLSID_DX50; bs_guid_table[i++] = (GUID *)&CLSID_DX50_UC; } if (g_config.supported_4cc & SUPPORT_MP4V) { bs_guid_table[i++] = (GUID *)&CLSID_MP4V; bs_guid_table[i++] = (GUID *)&CLSID_MP4V_UC; bs_guid_table[i++] = (GUID *)&CLSID_LMP4; bs_guid_table[i++] = (GUID *)&CLSID_LMP4_UC; bs_guid_table[i++] = (GUID *)&CLSID_RMP4; bs_guid_table[i++] = (GUID *)&CLSID_RMP4_UC; bs_guid_table[i++] = (GUID *)&CLSID_SMP4; bs_guid_table[i++] = (GUID *)&CLSID_SMP4_UC; bs_guid_table[i++] = (GUID *)&CLSID_HDX4; bs_guid_table[i++] = (GUID *)&CLSID_HDX4_UC; } const GUID *subtype; if (dwTypeIndex
< i) { subtype = bs_guid_table[dwTypeIndex]; } else { return MF_E_NO_MORE_TYPES; } EnterCriticalSection(&m_mft_lock); HRESULT hr = S_OK; if (ppType) { IMFMediaType *pInputType = NULL; hr = MFCreateMediaType(&pInputType); if (SUCCEEDED(hr)) hr = pInputType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video); if (SUCCEEDED(hr)) hr = pInputType->SetGUID(MF_MT_SUBTYPE, *subtype); if (SUCCEEDED(hr)) { *ppType = pInputType; (*ppType)->AddRef(); } if (pInputType) pInputType->Release(); } LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::MFTGetOutputAvailableType(DWORD dwOutputStreamID, DWORD dwTypeIndex, IMFMediaType **ppType) { DPRINTF("(MFT)GetOutputAvailableType"); if (ppType == NULL) return E_INVALIDARG; if (dwOutputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; if (dwTypeIndex < 0) return E_INVALIDARG; GUID csp; int bitdepth = 8; switch(dwTypeIndex) { case 0: if ( USE_YUY2 ) { csp = MFVideoFormat_YUY2; bitdepth = 4; break; } case 1 : if ( USE_UYVY ) { csp = MFVideoFormat_UYVY; bitdepth = 4; break; } case 2 : if ( USE_IYUV ) { csp = MFVideoFormat_IYUV; bitdepth = 3; break; } case 3 : if ( USE_YV12 ) { csp = MFVideoFormat_YV12; bitdepth = 3; break; } case 4 : if ( USE_RGB32 ) { csp = MFVideoFormat_RGB32; bitdepth = 8; break; } case 5 : if ( USE_RGB24 ) { csp = MFVideoFormat_RGB24; bitdepth = 6; break; } case 6 : if ( USE_RG555 ) { csp = MFVideoFormat_RGB555; bitdepth = 4; break; } case 7 : if ( USE_RG565 ) { csp = MFVideoFormat_RGB565; bitdepth = 4; break; } default : return MF_E_NO_MORE_TYPES; } if (m_pInputType == NULL) return MF_E_TRANSFORM_TYPE_NOT_SET; EnterCriticalSection(&m_mft_lock); HRESULT hr = S_OK; IMFMediaType *pOutputType = NULL; hr = MFCreateMediaType(&pOutputType); if (SUCCEEDED(hr)) { hr = pOutputType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video); } if (SUCCEEDED(hr)) { hr = pOutputType->SetGUID(MF_MT_SUBTYPE, csp); } if (SUCCEEDED(hr)) { hr = pOutputType->SetUINT32(MF_MT_FIXED_SIZE_SAMPLES, TRUE); } if (SUCCEEDED(hr)) { hr = 
pOutputType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE); } if (SUCCEEDED(hr)) { hr = pOutputType->SetUINT32(MF_MT_SAMPLE_SIZE, (m_create.height * m_create.width * bitdepth)>>1); } if (SUCCEEDED(hr)) { hr = MFSetAttributeSize(pOutputType, MF_MT_FRAME_SIZE, m_create.width, m_create.height); } if (SUCCEEDED(hr)) { hr = MFSetAttributeRatio(pOutputType, MF_MT_FRAME_RATE, m_frameRate.Numerator, m_frameRate.Denominator); } if (SUCCEEDED(hr)) { hr = pOutputType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive); } if (SUCCEEDED(hr)) { hr = MFSetAttributeRatio(pOutputType, MF_MT_PIXEL_ASPECT_RATIO, ar_x, ar_y); } if (SUCCEEDED(hr)) { *ppType = pOutputType; (*ppType)->AddRef(); } if (pOutputType) pOutputType->Release(); LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::MFTSetInputType(DWORD dwInputStreamID, IMFMediaType *pType, DWORD dwFlags) { DPRINTF("(MFT)SetInputType"); if (dwInputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; if (dwFlags & ~MFT_SET_TYPE_TEST_ONLY) return E_INVALIDARG; EnterCriticalSection(&m_mft_lock); HRESULT hr = S_OK; /* Actually set the type or just test it? */ BOOL bReallySet = ((dwFlags & MFT_SET_TYPE_TEST_ONLY) == 0); /* If we have samples pending the type can't be changed right now */ if (HasPendingOutput()) hr = MF_E_TRANSFORM_CANNOT_CHANGE_MEDIATYPE_WHILE_PROCESSING; if (SUCCEEDED(hr)) { if (pType) { /* Check the type */ hr = OnCheckInputType(pType); } } if (SUCCEEDED(hr)) { if (bReallySet) { /* Set the type if needed */ hr = OnSetInputType(pType); } } LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::MFTSetOutputType(DWORD dwOutputStreamID, IMFMediaType *pType, DWORD dwFlags) { DPRINTF("(MFT)SetOutputType"); if (dwOutputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; if (dwFlags & ~MFT_SET_TYPE_TEST_ONLY) return E_INVALIDARG; HRESULT hr = S_OK; EnterCriticalSection(&m_mft_lock); /* Actually set the type or just test it?
*/ BOOL bReallySet = ((dwFlags & MFT_SET_TYPE_TEST_ONLY) == 0); /* If we have samples pending the type can't be changed right now */ if (HasPendingOutput()) hr = MF_E_TRANSFORM_CANNOT_CHANGE_MEDIATYPE_WHILE_PROCESSING; if (SUCCEEDED(hr)) { if (pType) { /* Check the type */ AM_MEDIA_TYPE *am = NULL; hr = MFCreateAMMediaTypeFromMFMediaType(pType, GUID_NULL, &am); if (SUCCEEDED(hr)) { if (FAILED(ChangeColorspace(am->subtype, am->formattype, am->pbFormat, 1))) { DPRINTF("(MFT)InternalCheckOutputType (MF_E_INVALIDTYPE)"); hr = MF_E_INVALIDTYPE; /* Don't return here: am must still be freed and the lock released */ } CoTaskMemFree(am->pbFormat); CoTaskMemFree(am); } } } if (SUCCEEDED(hr)) { if (bReallySet) { /* Set the type if needed */ hr = OnSetOutputType(pType); } } #ifdef XVID_USE_TRAYICON if (SUCCEEDED(hr) && Tray_Icon == 0) /* Create message passing window */ { WNDCLASSEX wc; wc.cbSize = sizeof(WNDCLASSEX); wc.lpfnWndProc = msg_proc; wc.style = CS_HREDRAW | CS_VREDRAW; wc.cbWndExtra = 0; wc.cbClsExtra = 0; wc.hInstance = (HINSTANCE) g_xvid_hInst; wc.hbrBackground = (HBRUSH) GetStockObject(NULL_BRUSH); wc.lpszMenuName = NULL; wc.lpszClassName = "XVID_MSG_WINDOW"; wc.hIcon = NULL; wc.hIconSm = NULL; wc.hCursor = NULL; RegisterClassEx(&wc); MSG_hwnd = CreateWindowEx(0, "XVID_MSG_WINDOW", NULL, 0, CW_USEDEFAULT, CW_USEDEFAULT, 0, 0, HWND_MESSAGE, NULL, (HINSTANCE) g_xvid_hInst, NULL); /* display the tray icon */ NOTIFYICONDATA nid; ZeroMemory(&nid,sizeof(NOTIFYICONDATA)); nid.cbSize = NOTIFYICONDATA_V1_SIZE; nid.hWnd = MSG_hwnd; nid.uID = 1456; nid.uCallbackMessage = WM_ICONMESSAGE; nid.hIcon = LoadIcon(g_xvid_hInst, MAKEINTRESOURCE(IDI_ICON)); strcpy_s(nid.szTip, 19, "Xvid Video Decoder"); nid.uFlags = NIF_MESSAGE | NIF_ICON | NIF_TIP; Shell_NotifyIcon(NIM_ADD, &nid); DestroyIcon(nid.hIcon); Tray_Icon = 1; } #endif LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::MFTGetInputCurrentType(DWORD dwInputStreamID, IMFMediaType **ppType) { DPRINTF("(MFT)GetInputCurrentType"); if (ppType == NULL) return E_POINTER; if
(dwInputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; EnterCriticalSection(&m_mft_lock); HRESULT hr = S_OK; if (!m_pInputType) hr = MF_E_TRANSFORM_TYPE_NOT_SET; if (SUCCEEDED(hr)) { *ppType = m_pInputType; (*ppType)->AddRef(); } LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::MFTGetOutputCurrentType(DWORD dwOutputStreamID, IMFMediaType **ppType) { DPRINTF("(MFT)GetOutputCurrentType"); if (ppType == NULL) return E_POINTER; if (dwOutputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; EnterCriticalSection(&m_mft_lock); HRESULT hr = S_OK; if (!m_pOutputType) hr = MF_E_TRANSFORM_TYPE_NOT_SET; if (SUCCEEDED(hr)) { *ppType = m_pOutputType; (*ppType)->AddRef(); } LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::MFTGetInputStatus(DWORD dwInputStreamID, DWORD *pdwFlags) { DPRINTF("(MFT)GetInputStatus"); if (pdwFlags == NULL) return E_POINTER; if (dwInputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; EnterCriticalSection(&m_mft_lock); /* If there are pending output samples we don't accept new input data until ProcessOutput() or Flush() has been called */ if (!HasPendingOutput()) { *pdwFlags = MFT_INPUT_STATUS_ACCEPT_DATA; } else { *pdwFlags = 0; } LeaveCriticalSection(&m_mft_lock); return S_OK; } HRESULT CXvidDecoder::MFTGetOutputStatus(DWORD *pdwFlags) { DPRINTF("(MFT)GetOutputStatus"); if (pdwFlags == NULL) return E_POINTER; EnterCriticalSection(&m_mft_lock); /* We can render an output sample only after we have decoded one */ if (HasPendingOutput()) { *pdwFlags = MFT_OUTPUT_STATUS_SAMPLE_READY; } else { *pdwFlags = 0; } LeaveCriticalSection(&m_mft_lock); return S_OK; } HRESULT CXvidDecoder::MFTSetOutputBounds(LONGLONG hnsLowerBound, LONGLONG hnsUpperBound) { DPRINTF("(MFT)SetOutputBounds"); return E_NOTIMPL; } HRESULT CXvidDecoder::MFTProcessEvent(DWORD dwInputStreamID, IMFMediaEvent *pEvent) { DPRINTF("(MFT)ProcessEvent"); return E_NOTIMPL; /* We don't handle any stream events */ } HRESULT
CXvidDecoder::MFTProcessMessage(MFT_MESSAGE_TYPE eMessage, ULONG_PTR ulParam) { DPRINTF("(MFT)ProcessMessage"); HRESULT hr = S_OK; EnterCriticalSection(&m_mft_lock); switch (eMessage) { case MFT_MESSAGE_COMMAND_FLUSH: if (m_create.handle != NULL) { DPRINTF("(MFT)CommandFlush"); xvid_dec_stats_t stats; int used_bytes; memset(&stats, 0, sizeof(stats)); stats.version = XVID_VERSION; int csp = m_frame.output.csp; m_frame.output.csp = XVID_CSP_INTERNAL; m_frame.bitstream = NULL; m_frame.length = -1; m_frame.general = XVID_LOWDELAY; do { used_bytes = xvid_decore_func(m_create.handle, XVID_DEC_DECODE, &m_frame, &stats); } while(used_bytes>=0 && stats.type <= 0); m_frame.output.csp = csp; m_frame.output.plane[1] = NULL; /* Don't display flushed samples */ //m_timestamp = INVALID_TIME; //m_timelength = INVALID_TIME; //m_rtFrame = 0; } break; case MFT_MESSAGE_COMMAND_DRAIN: m_discont = 1; /* Set discontinuity flag */ m_rtFrame = 0; break; case MFT_MESSAGE_SET_D3D_MANAGER: hr = E_NOTIMPL; break; case MFT_MESSAGE_NOTIFY_BEGIN_STREAMING: case MFT_MESSAGE_NOTIFY_END_STREAMING: break; case MFT_MESSAGE_NOTIFY_START_OF_STREAM: case MFT_MESSAGE_NOTIFY_END_OF_STREAM: break; } LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::MFTProcessInput(DWORD dwInputStreamID, IMFSample *pSample, DWORD dwFlags) { DPRINTF("(MFT)ProcessInput"); if (pSample == NULL) return E_POINTER; if (dwInputStreamID != 0) return MF_E_INVALIDSTREAMNUMBER; if (dwFlags != 0) return E_INVALIDARG; if (!m_pInputType || !m_pOutputType) { return MF_E_NOTACCEPTING; /* Must have set input and output types */ } else if (HasPendingOutput()) { return MF_E_NOTACCEPTING; /* We still have output samples to render */ } xvid_dec_stats_t stats; int length; memset(&stats, 0, sizeof(stats)); stats.version = XVID_VERSION; if (m_create.handle == NULL) { if (xvid_decore_func == NULL) return E_FAIL; if (xvid_decore_func(0, XVID_DEC_CREATE, &m_create, 0) < 0) { DPRINTF("*** XVID_DEC_CREATE error"); return E_FAIL; } } 
EnterCriticalSection(&m_mft_lock); HRESULT hr = S_OK; IMFMediaBuffer *pBuffer = NULL; /* END_LOOP tests this pointer, so it must never be left uninitialized */ if (SUCCEEDED(hr)) { hr = pSample->ConvertToContiguousBuffer(&pBuffer); } if (SUCCEEDED(hr)) { hr = pBuffer->Lock((BYTE**)&m_frame.bitstream, NULL, (DWORD *)&m_frame.length); } m_frame.general = XVID_LOWDELAY; if (m_discont == 1) { m_frame.general |= XVID_DISCONTINUITY; m_discont = 0; } if (g_config.nDeblock_Y) m_frame.general |= XVID_DEBLOCKY; if (g_config.nDeblock_UV) m_frame.general |= XVID_DEBLOCKUV; if (g_config.nDering_Y) m_frame.general |= XVID_DERINGY; if (g_config.nDering_UV) m_frame.general |= XVID_DERINGUV; if (g_config.nFilmEffect) m_frame.general |= XVID_FILMEFFECT; m_frame.brightness = g_config.nBrightness; m_frame.output.csp &= ~XVID_CSP_VFLIP; m_frame.output.csp |= rgb_flip^(g_config.nFlipVideo ? XVID_CSP_VFLIP : 0); int csp = m_frame.output.csp; m_frame.output.csp = XVID_CSP_INTERNAL; /* Without a locked input buffer there is nothing to decode */ if (FAILED(hr)) goto END_LOOP; /* Paranoid check. */ if (xvid_decore_func == NULL) { hr = E_FAIL; goto END_LOOP; } repeat : length = xvid_decore_func(m_create.handle, XVID_DEC_DECODE, &m_frame, &stats); if (length == XVID_ERR_MEMORY) { hr = E_FAIL; goto END_LOOP; } else if (length < 0) { DPRINTF("*** XVID_DEC_DECODE"); goto END_LOOP; } if (stats.type == XVID_TYPE_NOTHING && length > 0) { DPRINTF(" B-Frame decoder lag"); m_frame.output.plane[1] = NULL; goto END_LOOP; } if (stats.type == XVID_TYPE_VOL) { if (stats.data.vol.width != m_create.width || stats.data.vol.height != m_create.height) { DPRINTF("TODO: auto-resize"); m_frame.output.plane[1] = NULL; hr = E_FAIL; } if (g_config.aspect_ratio == 0 || g_config.aspect_ratio == 1) { /* auto */ int par_x, par_y; if (stats.data.vol.par == XVID_PAR_EXT) { par_x = stats.data.vol.par_width; par_y = stats.data.vol.par_height; } else { par_x = PARS[stats.data.vol.par-1][0]; par_y = PARS[stats.data.vol.par-1][1]; } ar_x = par_x * stats.data.vol.width; ar_y = par_y * stats.data.vol.height; } m_frame.bitstream = (BYTE*)m_frame.bitstream + length; m_frame.length -= length; goto repeat; }
END_LOOP: m_frame.output.csp = csp; if (pBuffer) { pBuffer->Unlock(); pBuffer->Release(); } if (SUCCEEDED(hr)) { /* Try to get a timestamp */ if (FAILED(pSample->GetSampleTime(&m_timestamp))) m_timestamp = INVALID_TIME; if (FAILED(pSample->GetSampleDuration(&m_timelength))) { m_timelength = INVALID_TIME; } if (m_timestamp != INVALID_TIME && stats.type == XVID_TYPE_IVOP) { m_rtFrame = m_timestamp; } } LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::MFTProcessOutput(DWORD dwFlags, DWORD cOutputBufferCount, MFT_OUTPUT_DATA_BUFFER *pOutputSamples, DWORD *pdwStatus) { DPRINTF("(MFT)ProcessOutput"); /* Preroll in MFT ?? Flags ?? -> TODO... */ if (dwFlags != 0) return E_INVALIDARG; if (pOutputSamples == NULL || pdwStatus == NULL) return E_POINTER; if (cOutputBufferCount != 1) /* Must be exactly one output buffer */ return E_INVALIDARG; if (pOutputSamples[0].pSample == NULL) /* Must have a sample */ return E_INVALIDARG; if (!HasPendingOutput()) { /* If there's no sample we need to decode one first */ return MF_E_TRANSFORM_NEED_MORE_INPUT; } EnterCriticalSection(&m_mft_lock); HRESULT hr = S_OK; BYTE *Dst = NULL; DWORD buffer_size; IMFMediaBuffer *pOutput = NULL; if (SUCCEEDED(hr)) { hr = pOutputSamples[0].pSample->GetBufferByIndex(0, &pOutput); /* Get output buffer */ } if (SUCCEEDED(hr)) { hr = pOutput->GetMaxLength(&buffer_size); } if (SUCCEEDED(hr)) hr = pOutput->Lock(&Dst, NULL, NULL); if (SUCCEEDED(hr)) { xvid_gbl_convert_t convert; memset(&convert, 0, sizeof(convert)); convert.version = XVID_VERSION; convert.input.csp = XVID_CSP_INTERNAL; convert.input.plane[0] = m_frame.output.plane[0]; convert.input.plane[1] = m_frame.output.plane[1]; convert.input.plane[2] = m_frame.output.plane[2]; convert.input.stride[0] = m_frame.output.stride[0]; convert.input.stride[1] = m_frame.output.stride[1]; convert.input.stride[2] = m_frame.output.stride[2]; convert.output.csp = m_frame.output.csp; convert.output.plane[0] = Dst; convert.output.stride[0] = 
out_stride; convert.width = m_create.width; convert.height = m_create.height; convert.interlacing = 0; if (m_frame.output.plane[1] != NULL && Dst != NULL && xvid_global_func != NULL) if (xvid_global_func(0, XVID_GBL_CONVERT, &convert, NULL) < 0) /* CSP convert into output buffer */ hr = E_FAIL; m_frame.output.plane[1] = NULL; } *pdwStatus = 0; if (SUCCEEDED(hr)) { if (SUCCEEDED(hr)) hr = pOutputSamples[0].pSample->SetUINT32(MFSampleExtension_CleanPoint, TRUE); /* key frame */ if (SUCCEEDED(hr)) { /* Set timestamp of output sample */ if (m_timestamp != INVALID_TIME) hr = pOutputSamples[0].pSample->SetSampleTime(m_timestamp); else hr = pOutputSamples[0].pSample->SetSampleTime(m_rtFrame); if (m_timelength != INVALID_TIME) hr = pOutputSamples[0].pSample->SetSampleDuration(m_timelength); else hr = pOutputSamples[0].pSample->SetSampleDuration(m_duration); m_rtFrame += m_duration; } if (SUCCEEDED(hr)) hr = pOutput->SetCurrentLength(m_create.width * abs(m_create.height) * 4); /* XXX */ } if (pOutput) { pOutput->Unlock(); pOutput->Release(); } LeaveCriticalSection(&m_mft_lock); return hr; } HRESULT CXvidDecoder::OnCheckInputType(IMFMediaType *pmt) { DPRINTF("(MFT)CheckInputType"); HRESULT hr = S_OK; /* Check if input type is already set. Reject any type that is not identical */ if (m_pInputType) { DWORD dwFlags = 0; if (S_OK == m_pInputType->IsEqual(pmt, &dwFlags)) { return S_OK; } else { return MF_E_INVALIDTYPE; } } GUID majortype = {0}, subtype = {0}; UINT32 width = 0, height = 0; hr = pmt->GetMajorType(&majortype); if (SUCCEEDED(hr)) { if (majortype != MFMediaType_Video) { /* Must be Video */ hr = MF_E_INVALIDTYPE; } } /* Assign to the outer hr (the original declared a shadowing local, so an OpenLib() failure was lost) and keep any earlier failure */ if (SUCCEEDED(hr) && m_hdll == NULL) { hr = OpenLib(); if (FAILED(hr) || (m_hdll == NULL)) /* Paranoid checks. */
hr = MF_E_INVALIDTYPE; } if (SUCCEEDED(hr)) { hr = MFGetAttributeSize(pmt, MF_MT_FRAME_SIZE, &width, &height); } /* Check the frame size */ if (SUCCEEDED(hr)) { if (width > 4096 || height > 4096) { hr = MF_E_INVALIDTYPE; } } m_create.width = width; m_create.height = height; if (SUCCEEDED(hr)) { if (g_config.aspect_ratio == 0 || g_config.aspect_ratio == 1) { hr = MFGetAttributeRatio(pmt, MF_MT_PIXEL_ASPECT_RATIO, (UINT32*)&ar_x, (UINT32*)&ar_y); } } /* TODO1: Make sure there really is a frame rate after all! TODO2: Use the framerate for something! */ MFRatio fps = {0}; if (SUCCEEDED(hr)) { hr = MFGetAttributeRatio(pmt, MF_MT_FRAME_RATE, (UINT32*)&fps.Numerator, (UINT32*)&fps.Denominator); } if (SUCCEEDED(hr)) { hr = pmt->GetGUID(MF_MT_SUBTYPE, &subtype); } if (subtype == CLSID_MP4V || subtype == CLSID_MP4V_UC || subtype == CLSID_LMP4 || subtype == CLSID_LMP4_UC || subtype == CLSID_RMP4 || subtype == CLSID_RMP4_UC || subtype == CLSID_SMP4 || subtype == CLSID_SMP4_UC || subtype == CLSID_HDX4 || subtype == CLSID_HDX4_UC) { if (!(g_config.supported_4cc & SUPPORT_MP4V)) { CloseLib(); hr = MF_E_INVALIDTYPE; } else m_create.fourcc = FOURCC_MP4V; } else if (subtype == CLSID_DIVX || subtype == CLSID_DIVX_UC) { if (!(g_config.supported_4cc & SUPPORT_DIVX)) { CloseLib(); hr = MF_E_INVALIDTYPE; } else m_create.fourcc = FOURCC_DIVX; } else if (subtype == CLSID_DX50 || subtype == CLSID_DX50_UC) { if (!(g_config.supported_4cc & SUPPORT_DIVX)) { CloseLib(); hr = MF_E_INVALIDTYPE; } else m_create.fourcc = FOURCC_DX50; } else if (subtype == CLSID_3IVX || subtype == CLSID_3IVX_UC || subtype == CLSID_3IV0 || subtype == CLSID_3IV0_UC || subtype == CLSID_3IV1 || subtype == CLSID_3IV1_UC || subtype == CLSID_3IV2 || subtype == CLSID_3IV2_UC) { if (!(g_config.supported_4cc & SUPPORT_3IVX)) { CloseLib(); hr = MF_E_INVALIDTYPE; } else m_create.fourcc = FOURCC_3IVX; } else if (subtype == CLSID_XVID || subtype == CLSID_XVID_UC) { m_create.fourcc = FOURCC_XVID; } else { DPRINTF("Unknown 
subtype!"); CloseLib(); hr = MF_E_INVALIDTYPE; } /* haali media splitter reports VOL information in the format header */ if (SUCCEEDED(hr)) { UINT32 cbSeqHeader = 0; (void)pmt->GetBlobSize(MF_MT_MPEG_SEQUENCE_HEADER, &cbSeqHeader); if (cbSeqHeader>0) { xvid_dec_stats_t stats; memset(&stats, 0, sizeof(stats)); stats.version = XVID_VERSION; if (m_create.handle == NULL) { if (xvid_decore_func == NULL) hr = E_FAIL; /* Never call through a NULL function pointer */ else if (xvid_decore_func(0, XVID_DEC_CREATE, &m_create, 0) < 0) { DPRINTF("*** XVID_DEC_CREATE error"); hr = E_FAIL; } } if (SUCCEEDED(hr)) { (void)pmt->GetAllocatedBlob(MF_MT_MPEG_SEQUENCE_HEADER, (UINT8 **)&m_frame.bitstream, (UINT32 *)&m_frame.length); m_frame.general = 0; m_frame.output.csp = XVID_CSP_NULL; int ret = 0; if ((ret=xvid_decore_func(m_create.handle, XVID_DEC_DECODE, &m_frame, &stats)) >= 0) { /* honour video dimensions reported in VOL header */ if (stats.type == XVID_TYPE_VOL) { m_create.width = stats.data.vol.width; m_create.height = stats.data.vol.height; } } if (ret == XVID_ERR_MEMORY) hr = E_FAIL; CoTaskMemFree(m_frame.bitstream); } } } return hr; } HRESULT CXvidDecoder::OnSetInputType(IMFMediaType *pmt) { HRESULT hr = S_OK; UINT32 w = 0, h = 0; /* Drop the old type and clear the pointer so it can't dangle if the new type is rejected */ if (m_pInputType) { m_pInputType->Release(); m_pInputType = NULL; } if (pmt == NULL) return S_OK; /* A NULL type only clears the current input type */ hr = MFGetAttributeSize(pmt, MF_MT_FRAME_SIZE, &w, &h); if (SUCCEEDED(hr)) { m_create.width = w; m_create.height = h; } if (SUCCEEDED(hr)) hr = MFGetAttributeRatio(pmt, MF_MT_FRAME_RATE, (UINT32*)&m_frameRate.Numerator, (UINT32*)&m_frameRate.Denominator); if (SUCCEEDED(hr)) { /* Store frame duration, derived from the frame rate */ hr = MFFrameRateToAverageTimePerFrame(m_frameRate.Numerator, m_frameRate.Denominator, &m_duration); } if (SUCCEEDED(hr)) { m_pInputType = pmt; m_pInputType->AddRef(); } return hr; } HRESULT CXvidDecoder::OnSetOutputType(IMFMediaType *pmt) { if (m_pOutputType) m_pOutputType->Release(); m_pOutputType = pmt; if (m_pOutputType) m_pOutputType->AddRef(); /* pmt may be NULL when the type is being cleared */ return S_OK; } #endif /* XVID_USE_MFT */ xvidcore/dshow/src/debug.h0000664000076500007650000000047311564705453016653 0ustar
xvidbuildxvidbuild#ifndef _DSHOW_DEBUG_ #define _DSHOW_DEBUG_ #include #ifdef __cplusplus extern "C" { #endif void OutputDebugStringf(char *fmt, ...); #ifdef _DEBUG #define DPRINTF OutputDebugStringf #else static __inline void DPRINTF(char *fmt, ...) { } #endif #ifdef __cplusplus } #endif #endif /* _DSHOW_DEBUG */ xvidcore/dshow/src/Configure.cpp0000664000076500007650000000475311565210673020042 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Configure from command line - * * Copyright(C) 2002-2010 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: Configure.cpp 2006 2011-05-19 12:48:59Z Isibaar $ * ****************************************************************************/ #include #include #include "config.h" #include "resource.h" HINSTANCE g_xvid_hInst; INT_PTR adv_dialog(HWND hwndOwner) { PROPSHEETPAGE psp [1]; PROPSHEETHEADER psh; psp[0].dwSize = sizeof (PROPSHEETPAGE); psp[0].dwFlags = PSP_USETITLE; psp[0].hInstance = g_xvid_hInst; psp[0].pszTemplate = MAKEINTRESOURCE (IDD_ABOUT); psp[0].pszIcon = NULL; psp[0].pfnDlgProc = adv_proc; psp[0].pszTitle = "About"; psp[0].lParam = 0; psh.dwSize = sizeof (PROPSHEETHEADER); psh.dwFlags = PSH_PROPSHEETPAGE; psh.hwndParent = hwndOwner; psh.hInstance = g_xvid_hInst; psh.pszIcon = NULL; psh.pszCaption = (LPSTR)"Xvid Configuration"; psh.nPages = sizeof (psp) / sizeof (PROPSHEETPAGE); psh.ppsp = psp; return PropertySheet (&psh); } extern "C" void CALLBACK Configure(HWND hWndParent, HINSTANCE hInstParent, LPSTR lpCmdLine, int nCmdShow ); void CALLBACK Configure(HWND hWndParent, HINSTANCE hInstParent, LPSTR lpCmdLine, int nCmdShow ) { InitCommonControls(); LoadRegistryInfo(); adv_dialog( GetDesktopWindow() ); } /* strmbase.lib\dllentry.obj:DllEntryPoint@12 */ extern "C" BOOL WINAPI DllEntryPoint(HINSTANCE, ULONG, LPVOID); extern "C" BOOL WINAPI DllMain(HINSTANCE hInst, DWORD fdwReason, LPVOID lpvReserved); BOOL WINAPI DllMain(HINSTANCE hInst, DWORD fdwReason, LPVOID lpvReserved) { g_xvid_hInst = hInst; /* Call directshow DllEntryPoint@12 */ return DllEntryPoint(hInst, fdwReason, lpvReserved); } xvidcore/dshow/src/CXvidDecoder.h0000664000076500007650000002317311564705453020072 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - XviD Decoder part of the DShow Filter - * * 
Copyright(C) 2002-2010 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: CXvidDecoder.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _FILTER_H_ #define _FILTER_H_ #include #include "IXvidDecoder.h" #define XVID_NAME_L L"Xvid MPEG-4 Video Decoder" /* --- fourcc --- */ #define FOURCC_XVID mmioFOURCC('X','V','I','D') #define FOURCC_xvid mmioFOURCC('x','v','i','d') #define FOURCC_DIVX mmioFOURCC('D','I','V','X') #define FOURCC_divx mmioFOURCC('d','i','v','x') #define FOURCC_DX50 mmioFOURCC('D','X','5','0') #define FOURCC_dx50 mmioFOURCC('d','x','5','0') #define FOURCC_MP4V mmioFOURCC('M','P','4','V') #define FOURCC_mp4v mmioFOURCC('m','p','4','v') #define FOURCC_3IVX mmioFOURCC('3','I','V','X') #define FOURCC_3ivx mmioFOURCC('3','i','v','x') #define FOURCC_3IV0 mmioFOURCC('3','I','V','0') #define FOURCC_3iv0 mmioFOURCC('3','i','v','0') #define FOURCC_3IV1 mmioFOURCC('3','I','V','1') #define FOURCC_3iv1 mmioFOURCC('3','i','v','1') #define FOURCC_3IV2 mmioFOURCC('3','I','V','2') #define FOURCC_3iv2 mmioFOURCC('3','i','v','2') #define FOURCC_LMP4 mmioFOURCC('L','M','P','4') #define FOURCC_lmp4 mmioFOURCC('l','m','p','4') #define FOURCC_RMP4 mmioFOURCC('R','M','P','4') #define FOURCC_rmp4 mmioFOURCC('r','m','p','4') #define FOURCC_SMP4 
mmioFOURCC('S','M','P','4') #define FOURCC_smp4 mmioFOURCC('s','m','p','4') #define FOURCC_HDX4 mmioFOURCC('H','D','X','4') #define FOURCC_hdx4 mmioFOURCC('h','d','x','4') /* --- media uids --- */ DEFINE_GUID(CLSID_XVID, mmioFOURCC('x','v','i','d'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_XVID_UC, mmioFOURCC('X','V','I','D'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_DIVX, mmioFOURCC('d','i','v','x'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_DIVX_UC, mmioFOURCC('D','I','V','X'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_DX50, mmioFOURCC('d','x','5','0'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_DX50_UC, mmioFOURCC('D','X','5','0'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_3IVX, mmioFOURCC('3','i','v','x'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_3IVX_UC, mmioFOURCC('3','I','V','X'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_3IV0, mmioFOURCC('3','i','v','0'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_3IV0_UC, mmioFOURCC('3','I','V','0'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_3IV1, mmioFOURCC('3','i','v','1'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_3IV1_UC, mmioFOURCC('3','I','V','1'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_3IV2, mmioFOURCC('3','i','v','2'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_3IV2_UC, mmioFOURCC('3','I','V','2'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_LMP4, mmioFOURCC('l','m','p','4'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); 
DEFINE_GUID(CLSID_LMP4_UC, mmioFOURCC('L','M','P','4'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_RMP4, mmioFOURCC('r','m','p','4'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_RMP4_UC, mmioFOURCC('R','M','P','4'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_SMP4, mmioFOURCC('s','m','p','4'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_SMP4_UC, mmioFOURCC('S','M','P','4'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_HDX4, mmioFOURCC('h','d','x','4'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_HDX4_UC, mmioFOURCC('H','D','X','4'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_MP4V, mmioFOURCC('m','p','4','v'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); DEFINE_GUID(CLSID_MP4V_UC, mmioFOURCC('M','P','4','V'), 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); /* MEDIATYPE_IYUV is not always defined in the directx headers */ DEFINE_GUID(CLSID_MEDIASUBTYPE_IYUV, 0x56555949, 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71); class CXvidDecoder : public CVideoTransformFilter, public IXvidDecoder, public ISpecifyPropertyPages #if defined(XVID_USE_MFT) ,public IMFTransform #endif { public : static CUnknown * WINAPI CreateInstance(LPUNKNOWN punk, HRESULT *phr); STDMETHODIMP NonDelegatingQueryInterface(REFIID riid, void ** ppv); DECLARE_IUNKNOWN; CXvidDecoder(LPUNKNOWN punk, HRESULT *phr); ~CXvidDecoder(); HRESULT CompleteConnect(PIN_DIRECTION direction, IPin *pReceivePin); HRESULT BreakConnect(PIN_DIRECTION dir); HRESULT CheckInputType(const CMediaType * mtIn); HRESULT GetMediaType(int iPos, CMediaType * pmt); HRESULT SetMediaType(PIN_DIRECTION direction, const CMediaType *pmt); HRESULT CheckTransform(const CMediaType *mtIn, const CMediaType *mtOut); 
HRESULT DecideBufferSize(IMemAllocator * pima, ALLOCATOR_PROPERTIES * pProperties); HRESULT Transform(IMediaSample *pIn, IMediaSample *pOut); STDMETHODIMP GetPages(CAUUID * pPages); STDMETHODIMP FreePages(CAUUID * pPages); /* IMFTransform */ #if defined(XVID_USE_MFT) STDMETHODIMP MFTGetStreamLimits(DWORD *pdwInputMinimum, DWORD *pdwInputMaximum, DWORD *pdwOutputMinimum, DWORD *pdwOutputMaximum); STDMETHODIMP MFTGetStreamCount(DWORD *pcInputStreams, DWORD *pcOutputStreams); STDMETHODIMP MFTGetStreamIDs(DWORD dwInputIDArraySize, DWORD *pdwInputIDs, DWORD dwOutputIDArraySize, DWORD *pdwOutputIDs); STDMETHODIMP MFTGetInputStreamInfo(DWORD dwInputStreamID, MFT_INPUT_STREAM_INFO *pStreamInfo); STDMETHODIMP MFTGetOutputStreamInfo(DWORD dwOutputStreamID, MFT_OUTPUT_STREAM_INFO *pStreamInfo); STDMETHODIMP GetAttributes(IMFAttributes** pAttributes); STDMETHODIMP GetInputStreamAttributes(DWORD dwInputStreamID, IMFAttributes **ppAttributes); STDMETHODIMP GetOutputStreamAttributes(DWORD dwOutputStreamID, IMFAttributes **ppAttributes); STDMETHODIMP MFTDeleteInputStream(DWORD dwStreamID); STDMETHODIMP MFTAddInputStreams(DWORD cStreams, DWORD *adwStreamIDs); STDMETHODIMP MFTGetInputAvailableType(DWORD dwInputStreamID, DWORD dwTypeIndex, IMFMediaType **ppType); STDMETHODIMP MFTGetOutputAvailableType(DWORD dwOutputStreamID, DWORD dwTypeIndex, IMFMediaType **ppType); STDMETHODIMP MFTSetInputType(DWORD dwInputStreamID, IMFMediaType *pType, DWORD dwFlags); STDMETHODIMP MFTSetOutputType(DWORD dwOutputStreamID, IMFMediaType *pType, DWORD dwFlags); STDMETHODIMP MFTGetInputCurrentType(DWORD dwInputStreamID, IMFMediaType **ppType); STDMETHODIMP MFTGetOutputCurrentType(DWORD dwOutputStreamID, IMFMediaType **ppType); STDMETHODIMP MFTGetInputStatus(DWORD dwInputStreamID, DWORD *pdwFlags); STDMETHODIMP MFTGetOutputStatus(DWORD *pdwFlags); STDMETHODIMP MFTSetOutputBounds(LONGLONG hnsLowerBound, LONGLONG hnsUpperBound); STDMETHODIMP MFTProcessEvent(DWORD dwInputStreamID, IMFMediaEvent *pEvent); 
STDMETHODIMP MFTProcessMessage(MFT_MESSAGE_TYPE eMessage, ULONG_PTR ulParam); STDMETHODIMP MFTProcessInput(DWORD dwInputStreamID, IMFSample *pSample, DWORD dwFlags); STDMETHODIMP MFTProcessOutput(DWORD dwFlags, DWORD cOutputBufferCount, MFT_OUTPUT_DATA_BUFFER *pOutputSamples, DWORD *pdwStatus); #endif /* XVID_USE_MFT */ private : HRESULT ChangeColorspace(GUID subtype, GUID formattype, void * format, int noflip); HRESULT OpenLib(); void CloseLib(); xvid_dec_create_t m_create; xvid_dec_frame_t m_frame; HINSTANCE m_hdll; int (*xvid_global_func)(void *handle, int opt, void *param1, void *param2); int (*xvid_decore_func)(void *handle, int opt, void *param1, void *param2); int ar_x, ar_y; bool forced_ar; int rgb_flip; int out_stride; /* mft stuff */ #if defined(XVID_USE_MFT) BOOL HasPendingOutput() const { return m_frame.output.plane[1] != NULL; } HRESULT OnSetInputType(IMFMediaType *pmt); HRESULT OnCheckInputType(IMFMediaType *pmt); HRESULT OnSetOutputType(IMFMediaType *pmt); IMFMediaType *m_pInputType; IMFMediaType *m_pOutputType; CRITICAL_SECTION m_mft_lock; REFERENCE_TIME m_timestamp; REFERENCE_TIME m_timelength; int m_discont; /* Used to construct or interpolate missing timestamps */ REFERENCE_TIME m_rtFrame; MFRatio m_frameRate; UINT64 m_duration; #endif #ifdef XVID_USE_TRAYICON HWND MSG_hwnd; /* message handler window */ }; #define WM_ICONMESSAGE (WM_USER + 1) #else }; #endif static const int PARS[][2] = { {1, 1}, {12, 11}, {10, 11}, {16, 11}, {40, 33}, {0, 0}, }; #endif /* _FILTER_H_ */ xvidcore/dshow/src/CAbout.h0000664000076500007650000000300711564705453016736 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC - DShow Front End * - About Window header file - * * Copyright(C) 2002-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either 
version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: CAbout.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _CABOUT_H_ #define _CABOUT_H_ #include DEFINE_GUID(CLSID_CABOUT, 0x00000001, 0x4fef, 0x40d3, 0xb3, 0xfa, 0xe0, 0x53, 0x1b, 0x89, 0x7f, 0x98); class CAbout : public CBasePropertyPage { public: static CUnknown * WINAPI CreateInstance(LPUNKNOWN lpunk, HRESULT * phr); CAbout(LPUNKNOWN pUnk, HRESULT * phr); ~CAbout(); INT_PTR OnReceiveMessage(HWND hwnd, UINT uMsg, WPARAM wParam, LPARAM lParam); }; #endif /* _CABOUT_H_ */ xvidcore/dshow/src/XviD_logo.bmp0000664000076500007650000001140610454767660020011 0ustar xvidbuildxvidbuild
xvidcore/dshow/src/resource.h0000664000076500007650000000307011564770043017405 0ustar xvidbuildxvidbuild//{{NO_DEPENDENCIES}} // Microsoft Developer Studio generated include file. // Used by xvid.ax.rc // #define VERSION_RES_MINOR_VER 0 #define VERSION_RES_BUILD 0 #define VER_DEBUG 0 #define IDS_ABOUT 1 #define VERSION_RES_MAJOR_VER 8 #define IDD_ABOUT 102 #define IDB_LOGO 103 #define IDI_ICON 104 #define IDC_BRIGHTNESS 1002 #define IDC_DEBLOCK_UV 1003 #define IDC_DEBLOCK_Y 1004 #define IDC_FLIPVIDEO 1005 #define IDC_COLORSPACE 1006 #define IDC_RESET 1007 #define IDC_FILMEFFECT 1008 #define IDC_DERING 1009 #define IDC_DERINGY 1009 #define IDC_DIVX 1010 #define IDC_COMPAT 1011 #define IDC_MP4V 1012 #define IDC_3IVX 1013 #define IDC_DERINGUV 1014 #define IDC_USE_AR 1015 #define IDC_CORE 1016 #define VERSION_RES_LANGUAGE 0x409 #define VERSION_RES_CHARSET 1252 #define IDC_STATIC -1 // Next default values for new objects // #ifdef APSTUDIO_INVOKED #ifndef APSTUDIO_READONLY_SYMBOLS #define _APS_NEXT_RESOURCE_VALUE 106 #define _APS_NEXT_COMMAND_VALUE 40001 #define _APS_NEXT_CONTROL_VALUE 1017 #define _APS_NEXT_SYMED_VALUE 101 #endif #endif xvidcore/dshow/src/IXvidDecoder.h0000664000076500007650000000254411564705453020077 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - XviD Decoder part of the DShow Filter - * * Copyright(C) 2002-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version.
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: IXvidDecoder.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _IXVID_H_ #define _IXVID_H_ #ifdef __cplusplus extern "C" { #endif DEFINE_GUID(IID_IXvidDecoder, 0x00000000, 0x4fef, 0x40d3, 0xb3, 0xfa, 0xe0, 0x53, 0x1b, 0x89, 0x7f, 0x98); DECLARE_INTERFACE_(IXvidDecoder, IUnknown) { }; #ifdef __cplusplus } #endif #endif /* _IXVID_H_ */ xvidcore/dshow/src/debug.c0000664000076500007650000000046311564705453016645 0ustar xvidbuildxvidbuild#include "debug.h" #include #include #include /* vsprintf */ #define DPRINTF_BUF_SZ 1024 void OutputDebugStringf(char *fmt, ...) { #ifdef _DEBUG va_list args; char buf[DPRINTF_BUF_SZ]; va_start(args, fmt); vsprintf(buf, fmt, args); OutputDebugString(buf); #endif } xvidcore/dshow/src/xvid.ax.rc0000664000076500007650000001107311564770043017316 0ustar xvidbuildxvidbuild//Microsoft Developer Studio generated resource script. // #include "resource.h" #define APSTUDIO_READONLY_SYMBOLS ///////////////////////////////////////////////////////////////////////////// // // Generated from the TEXTINCLUDE 2 resource. 
// #include #ifndef IDC_STATIC #define IDC_STATIC (-1) #endif ///////////////////////////////////////////////////////////////////////////// #undef APSTUDIO_READONLY_SYMBOLS ///////////////////////////////////////////////////////////////////////////// // Neutral resources #if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_NEU) #ifdef _WIN32 LANGUAGE LANG_NEUTRAL, SUBLANG_NEUTRAL #pragma code_page(1252) #endif //_WIN32 ///////////////////////////////////////////////////////////////////////////// // // Dialog // IDD_ABOUT DIALOG DISCARDABLE 0, 0, 216, 267 STYLE WS_CHILD FONT 8, "MS Shell Dlg" BEGIN CONTROL IDB_LOGO,IDC_STATIC,"Static",SS_BITMAP,36,6,142,29 CTEXT "Xvid MPEG-4 Video Codec",IDC_CORE,36,33,142,16, SS_CENTERIMAGE | SS_SUNKEN GROUPBOX "Brightness",IDC_STATIC,7,54,202,39 CONTROL "Slider1",IDC_BRIGHTNESS,"msctls_trackbar32", TBS_AUTOTICKS | TBS_BOTH | WS_TABSTOP,17,64,181,24 GROUPBOX "Postprocessing",IDC_STATIC,7,96,202,42 CONTROL "Deblocking (Y)",IDC_DEBLOCK_Y,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,23,109,63,10 CONTROL "Deblocking (UV)",IDC_DEBLOCK_UV,"Button", BS_AUTOCHECKBOX | WS_TABSTOP,23,123,68,10 CONTROL "Dering (Y)",IDC_DERINGY,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,98,109,47,10 CONTROL "Film Effect",IDC_FILMEFFECT,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,158,109,47,10 GROUPBOX "Output",IDC_STATIC,7,141,202,43 CONTROL "Flip Video",IDC_FLIPVIDEO,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,23,154,46,10 CONTROL "Compatibility Renderer",IDC_COMPAT,"Button", BS_AUTOCHECKBOX | WS_TABSTOP,23,167,88,12 COMBOBOX IDC_COLORSPACE,134,164,69,67,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP CONTROL "DIVX",IDC_DIVX,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,21, 198,47,13 CONTROL "3IVX",IDC_3IVX,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,92, 198,43,13 GROUPBOX "Other MPEG-4 video support",IDC_STATIC,7,187,202,29 CONTROL "Other",IDC_MP4V,"Button",BS_AUTOCHECKBOX | WS_TABSTOP, 160,198,38,13 PUSHBUTTON "Reset",IDC_RESET,79,252,50,12 LTEXT "Output Colourspace",IDC_STATIC,136,151,67,9 CONTROL 
"Dering (UV)",IDC_DERINGUV,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,98,123,53,10 GROUPBOX "Aspect Ratio",IDC_STATIC,7,221,202,25 LTEXT "After restarting player, use this AR:",IDC_STATIC,14, 233,109,8 COMBOBOX IDC_USE_AR,135,230,68,95,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP END ///////////////////////////////////////////////////////////////////////////// // // Bitmap // IDB_LOGO BITMAP DISCARDABLE "XviD_logo.bmp" ///////////////////////////////////////////////////////////////////////////// // // Icon // // Icon with lowest ID value placed first to ensure application icon // remains consistent on all systems. IDI_ICON ICON DISCARDABLE "xvid.ico" #ifdef APSTUDIO_INVOKED ///////////////////////////////////////////////////////////////////////////// // // TEXTINCLUDE // 1 TEXTINCLUDE DISCARDABLE BEGIN "resource.h\0" END 2 TEXTINCLUDE DISCARDABLE BEGIN "#include \r\n" "#ifndef IDC_STATIC\r\n" "#define IDC_STATIC (-1)\r\n" "#endif\r\n" "\0" END 3 TEXTINCLUDE DISCARDABLE BEGIN "\r\n" "\0" END #endif // APSTUDIO_INVOKED ///////////////////////////////////////////////////////////////////////////// // // String Table // STRINGTABLE DISCARDABLE BEGIN IDS_ABOUT "About" END #endif // Neutral resources ///////////////////////////////////////////////////////////////////////////// #ifndef APSTUDIO_INVOKED ///////////////////////////////////////////////////////////////////////////// // // Generated from the TEXTINCLUDE 3 resource. 
// ///////////////////////////////////////////////////////////////////////////// #endif // not APSTUDIO_INVOKED xvidcore/dshow/src/config.h0000664000076500007650000000465011564770043017030 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Configuration processing header file - * * Copyright(C) 2002-2011 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: config.h 1995 2011-05-18 16:13:23Z Isibaar $ * ****************************************************************************/ #ifndef _DSHOW_CONFIG_H_ #define _DSHOW_CONFIG_H_ #ifdef __cplusplus extern "C" { #endif /* registry stuff */ #define XVID_REG_KEY HKEY_CURRENT_USER #define XVID_REG_SUBKEY "Software\\GNU\\XviD" #define XVID_REG_CLASS "config" #define REG_GET_N(X, Y, Z) size=sizeof(int);if(RegQueryValueEx(hKey, X, 0, 0, (LPBYTE)&Y, &size) != ERROR_SUCCESS) {Y=Z;} #define REG_GET_S(X, Y, Z) size=MAX_PATH;if(RegQueryValueEx(hKey, X, 0, 0, Y, &size) != ERROR_SUCCESS) {lstrcpy(Y, Z);} #define REG_SET_N(X, Y) RegSetValueEx(hKey, X, 0, REG_DWORD, (LPBYTE)&Y, sizeof(int)) #define REG_SET_S(X, Y) RegSetValueEx(hKey, X, 0, REG_SZ, Y, lstrlen(Y)+1) /* config struct */ #define SUPPORT_3IVX (1<<0) #define SUPPORT_DIVX (1<<1) #define SUPPORT_MP4V (1<<2) #define FORCE_NONE 0 
#define FORCE_YV12	1
#define FORCE_YUY2	2
#define FORCE_RGB24	3
#define FORCE_RGB32	4

typedef struct
{
	int nBrightness;
	int nDeblock_Y;
	int nDeblock_UV;
	int nDering_Y;
	int nDering_UV;
	int nFilmEffect;
	int nFlipVideo;
	int nForceColorspace;
	unsigned int supported_4cc;
	int videoinfo_compat;
	int aspect_ratio;
	int num_threads;
	DWORD cpu;
} CONFIG;

/* global */
extern CONFIG g_config;

/* functions */
void LoadRegistryInfo();
void SaveRegistryInfo();
INT_PTR CALLBACK adv_proc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam);

#define XVID_DLL_NAME "xvidcore.dll"

#ifdef __cplusplus
}
#endif

#endif
xvidcore/dshow/src/xvid.ico0000664000076500007650000002267611540745634017072 0ustar xvidbuildxvidbuild
E|GԋEEEEEEÃBkz=CADEߋEEEFF܅C@6r<'CFՎGFFFFFFE̊FH=b=*t=$EeEEFFFFFFEEH;r>!BqFFGGGGGGGGGGCr=BnEFGGGGGGGGGHG$G^GGGGGGGGGGGHG{>fc3GSGďGGGGGGGGGGGGFE.NGHHHHHHHHHHHHHHHDB=HsHHHHHHHHHHHHHGވDxU,E%G'HHIIIIIIIIIIIIIHGNIIIIIIIIIIIIIIIFz>]s: D0FJJJJJJJJJJJJJJJGE<J{JJJJJJJJJJJJJJJJJEhIוKKKKKKKKKKKKKKKGV@H^IؗLLKKKKKKKKKKKKKJpIc4j7/GKLLLLLLLLLLLLLKJKC2FLLLLLLLLLLLLLLKJ۔IeH F|JҚMLLLLLLLLLLMLKHy@Fs=F$HKMLLLLLLLLLLLLLIl9*d6|?+JLMLLLLLLLLMLLKHDT7& F;IKNNLLLLLLLLLLMJʉEiG U-B[LMMMMMMMMMMMMLIy?G4% E;HNNMMMMMMMMMMMLJFDGn;F"G|L՟POOOOOOOOOOON~ABKXM̞OPPPPPPPPPOOޢRM;;' ~?&JtNŞOPPPPPPPPPPODrp9!PkPPPPPPPPPPOQءRS}BH_MƝOPPPPPPPPPOIE8Q}QQQQQQQQRQOOWU }@v=:KPQQQQQQQQQOM>R}RRRRRRRRPΗMNNNH4PwPQRRRRRRQLI>PcQޢQRQQQQRKkR/!L:N^QSۢQQRRRQH@,{?'HxQޣRQPЗMJgO"M@'M RQ,OPϢRRRQM'?# {?'PcR~Q{OZJ<|AM2PYRsRQvRPh???xvidcore/dshow/src/xvid.ax.def0000664000076500007650000000017710027672214017445 0ustar xvidbuildxvidbuildEXPORTS Configure DllGetClassObject PRIVATE DllCanUnloadNow PRIVATE DllRegisterServer PRIVATE DllUnregisterServer PRIVATE xvidcore/dshow/dshow.vcproj0000664000076500007650000006402211567132332017167 0ustar xvidbuildxvidbuild xvidcore/dshow/authors.txt0000664000076500007650000000004510027665151017037 0ustar xvidbuildxvidbuildauthors Peter Ross xvidcore/dshow/dxpatch/0000775000076500007650000000000011566427761016261 5ustar xvidbuildxvidbuildxvidcore/dshow/dxpatch/dx90sdk-update-gcc.patch0000664000076500007650000002476610101542771022600 0ustar xvidbuildxvidbuilddiff -burN /c/DX90SDK-orig/Include/DShow.h ./Include/DShow.h --- /c/DX90SDK-orig/Include/DShow.h Mon Aug 18 21:22:52 2003 +++ ./Include/DShow.h Tue Jul 27 20:43:16 2004 @@ -44,7 +44,7 @@ // Include DirectShow include files /////////////////////////////////////////////////////////////////////////// #include // Generated IDL header file for streams interfaces -#include // ActiveMovie video interfaces and definitions +#include // ActiveMovie video interfaces and definitions #include // ActiveMovie audio interfaces and definitions #include // generated 
from control.odl #include // event code definitions diff -burN /c/DX90SDK-orig/Include/errors.h ./Include/errors.h --- /c/DX90SDK-orig/Include/errors.h Mon Aug 18 21:22:52 2003 +++ ./Include/errors.h Tue Jul 27 20:44:04 2004 @@ -24,7 +24,7 @@ #define VFW_FIRST_CODE 0x200 #define MAX_ERROR_TEXT_LEN 160 -#include // includes all message definitions +#include // includes all message definitions typedef BOOL (WINAPI* AMGETERRORTEXTPROCA)(HRESULT, char *, DWORD); typedef BOOL (WINAPI* AMGETERRORTEXTPROCW)(HRESULT, WCHAR *, DWORD); diff -burN /c/DX90SDK-orig/Include/strmif.h ./Include/strmif.h --- /c/DX90SDK-orig/Include/strmif.h Mon Aug 18 21:22:54 2003 +++ ./Include/strmif.h Tue Jul 27 20:45:07 2004 @@ -5604,7 +5604,7 @@ { DWORD dwVersion; DWORD dwMerit; - /* [switch_type][switch_is] */ union + /* [switch_type][switch_is] */ struct { /* [case()] */ struct { @@ -28732,8 +28732,8 @@ typedef struct tagVMRGUID { - GUID *pGUID; - GUID GUID; + struct _GUID *pGUID; + struct _GUID GUID; } VMRGUID; typedef struct tagVMRMONITORINFO diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/Makefile ./Samples/C++/DirectShow/BaseClasses/Makefile --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/Makefile Thu Jan 1 10:00:00 1970 +++ ./Samples/C++/DirectShow/BaseClasses/Makefile Tue Jul 27 20:47:27 2004 @@ -0,0 +1,27 @@ +SRC=amextra.cpp amfilter.cpp amvideo.cpp combase.cpp cprop.cpp ctlutil.cpp ddmm.cpp dllentry.cpp dllsetup.cpp mtype.cpp outputq.cpp pstream.cpp pullpin.cpp refclock.cpp renbase.cpp schedule.cpp seekpt.cpp source.cpp strmctl.cpp sysclock.cpp transfrm.cpp transip.cpp videoctl.cpp vtrans.cpp winctrl.cpp winutil.cpp wxdebug.cpp wxlist.cpp wxutil.cpp + +DXTREE=../../../.. 
+DXBASECLASSES=$(DXTREE)/Samples/C++/DirectShow/BaseClasses +OBJ=$(SRC:.cpp=.o) +LIB=strmbase.lib +RANLIB=ranlib + +CXX=g++ +CXXFLAGS=-O2 -fno-for-scope -mthreads + +all: $(LIB) + +$(LIB): $(OBJ) + $(AR) $(ARFLAGS) $@ $^ + $(RANLIB) $@ + +.cpp.o: + $(CXX) $(CXXFLAGS) \ + -DRELEASE \ + -I$(DXTREE)/Include \ + -I$(DXBASECLASSES) \ + -include $(DXTREE)/mingw_dshow_port.h \ + -c $(CXXFLAGS) $< -o $@ + +clean: + rm $(OBJ) diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/amfilter.cpp ./Samples/C++/DirectShow/BaseClasses/amfilter.cpp --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/amfilter.cpp Mon Aug 18 16:03:32 2003 +++ ./Samples/C++/DirectShow/BaseClasses/amfilter.cpp Tue Jul 27 20:42:25 2004 @@ -1363,7 +1363,7 @@ /* Make sure the destructor doesn't free these */ cmt.pbFormat = NULL; - cmt.cbFormat = NULL; + cmt.cbFormat = 0; cmt.pUnk = NULL; diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/ctlutil.cpp ./Samples/C++/DirectShow/BaseClasses/ctlutil.cpp --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/ctlutil.cpp Mon Aug 18 16:03:32 2003 +++ ./Samples/C++/DirectShow/BaseClasses/ctlutil.cpp Tue Jul 27 20:42:25 2004 @@ -709,7 +709,7 @@ HRESULT CPosPassThru::GetSeekingLongLong -( HRESULT (__stdcall IMediaSeeking::*pMethod)( LONGLONG * ) +( HRESULT ( GETSEEKINGLONGLONG_CALL IMediaSeeking::*pMethod)( LONGLONG * ) , LONGLONG * pll ) { diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/ctlutil.h ./Samples/C++/DirectShow/BaseClasses/ctlutil.h --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/ctlutil.h Mon Aug 18 16:03:32 2003 +++ ./Samples/C++/DirectShow/BaseClasses/ctlutil.h Tue Jul 27 20:42:25 2004 @@ -275,7 +275,7 @@ // Prevent bugs from constructing from LONG (which gets // converted to double and then multiplied by 10000000 COARefTime(LONG); - operator=(LONG); + COARefTime& operator=(LONG); }; @@ -355,7 +355,12 @@ STDMETHODIMP CanSeekBackward(LONG *pCanSeekBackward); private: - HRESULT GetSeekingLongLong( HRESULT 
(__stdcall IMediaSeeking::*pMethod)( LONGLONG * ), +#if !defined(__GNUC__) +#define GETSEEKINGLONGLONG_CALL __stdcall +#else +#define GETSEEKINGLONGLONG_CALL +#endif + HRESULT GetSeekingLongLong( HRESULT ( GETSEEKINGLONGLONG_CALL IMediaSeeking::*pMethod)( LONGLONG * ), LONGLONG * pll ); }; diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/ddmm.cpp ./Samples/C++/DirectShow/BaseClasses/ddmm.cpp --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/ddmm.cpp Mon Aug 18 16:03:32 2003 +++ ./Samples/C++/DirectShow/BaseClasses/ddmm.cpp Tue Jul 27 20:45:21 2004 @@ -17,8 +17,8 @@ */ typedef struct { LPSTR szDevice; - GUID* lpGUID; - GUID GUID; + struct _GUID* lpGUID; + struct _GUID GUID; BOOL fFound; } FindDeviceData; diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/mtype.cpp ./Samples/C++/DirectShow/BaseClasses/mtype.cpp --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/mtype.cpp Mon Aug 18 16:03:32 2003 +++ ./Samples/C++/DirectShow/BaseClasses/mtype.cpp Tue Jul 27 20:42:25 2004 @@ -13,7 +13,6 @@ // in the streams IDL file, but also has (non-virtual) functions #include -#include CMediaType::~CMediaType(){ FreeMediaType(*this); diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/refclock.h ./Samples/C++/DirectShow/BaseClasses/refclock.h --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/refclock.h Mon Aug 18 16:03:32 2003 +++ ./Samples/C++/DirectShow/BaseClasses/refclock.h Tue Jul 27 20:42:25 2004 @@ -14,7 +14,11 @@ const UINT RESOLUTION = 1; /* High resolution timer */ const INT ADVISE_CACHE = 4; /* Default cache size */ +#if !defined(__GNUC__) const LONGLONG MAX_TIME = 0x7FFFFFFFFFFFFFFF; /* Maximum LONGLONG value */ +#else +const LONGLONG MAX_TIME = 0x7FFFFFFFFFFFFFFFLL; /* Maximum LONGLONG value */ +#endif inline LONGLONG WINAPI ConvertToMilliseconds(const REFERENCE_TIME& RT) { diff -ur /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/streams.h.orig ./Samples/C++/DirectShow/BaseClasses/streams.h --- 
/c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/streams.h 2004-07-27 23:04:18.000000000 +0200 +++ ./Samples/C++/DirectShow/BaseClasses/streams.h 2004-07-27 23:03:57.000000000 +0200 @@ -135,7 +135,7 @@ #include // Helper class for REFERENCE_TIME management #include // Debug support for logging and ASSERTs -#include // ActiveMovie video interfaces and definitions +#include // ActiveMovie video interfaces and definitions //include amaudio.h explicitly if you need it. it requires the DirectX SDK. //#include // ActiveMovie audio interfaces and definitions #include // General helper classes for threads etc diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/wxdebug.cpp ./Samples/C++/DirectShow/BaseClasses/wxdebug.cpp --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/wxdebug.cpp Mon Aug 18 16:03:32 2003 +++ ./Samples/C++/DirectShow/BaseClasses/wxdebug.cpp Tue Jul 27 20:42:25 2004 @@ -564,7 +564,7 @@ { // re-read the registry every second. We cannot use RegNotify() to // notice registry changes because it's not available on win9x. 
- static g_dwLastRefresh = 0; + static DWORD g_dwLastRefresh = 0; DWORD dwTime = timeGetTime(); if(dwTime - g_dwLastRefresh > 1000) { g_dwLastRefresh = dwTime; @@ -1143,7 +1143,33 @@ hr = pUnk->QueryInterface(IID_IPin, (void **)&pp); if(SUCCEEDED(hr)) { - CDisp::CDisp(pp); +/* --- copy from CDisp::CDisp(IPin*) --- */ + PIN_INFO pi; + TCHAR str[MAX_PIN_NAME]; + CLSID clsid; + + if (pp) { + pp->QueryPinInfo(&pi); + pi.pFilter->GetClassID(&clsid); + QueryPinInfoReleaseFilter(pi); + #ifndef UNICODE + WideCharToMultiByte(GetACP(), 0, pi.achName, lstrlenW(pi.achName) + 1, + str, MAX_PIN_NAME, NULL, NULL); + #else + lstrcpy(str, pi.achName); + #endif + } else { + lstrcpy(str, TEXT("NULL IPin")); + } + + m_pString = (PTCHAR) new TCHAR[lstrlen(str)+64]; + if (!m_pString) { + pp->Release(); + return; + } + + wsprintf(m_pString, TEXT("%hs(%s)"), GuidNames[clsid], str); +/* --- copy from CDisp::CDisp(IPin*) --- */ pp->Release(); return; } diff -burN /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/wxutil.cpp ./Samples/C++/DirectShow/BaseClasses/wxutil.cpp --- /c/DX90SDK-orig/Samples/C++/DirectShow/BaseClasses/wxutil.cpp Mon Aug 18 16:03:32 2003 +++ ./Samples/C++/DirectShow/BaseClasses/wxutil.cpp Tue Jul 27 20:42:25 2004 @@ -742,8 +742,13 @@ /* This will catch c == 0 and overflow */ if (uc <= p[1].QuadPart) { +#if !defined(__GNUC__) return bSign ? (LONGLONG)0x8000000000000000 : (LONGLONG)0x7FFFFFFFFFFFFFFF; +#else + return bSign ? (LONGLONG)0x8000000000000000LL : + (LONGLONG)0x7FFFFFFFFFFFFFFFLL; +#endif } DWORDLONG ullResult; @@ -903,8 +908,13 @@ /* This will catch c == 0 and overflow */ if (uc <= p1) { +#if !defined(__GNUC__) return bSign ? (LONGLONG)0x8000000000000000 : (LONGLONG)0x7FFFFFFFFFFFFFFF; +#else + return bSign ? 
(LONGLONG)0x8000000000000000LL : + (LONGLONG)0x7FFFFFFFFFFFFFFFLL; +#endif } /* Do the division */ diff -burN /c/DX90SDK-orig/mingw_dshow_port.h ./mingw_dshow_port.h --- /c/DX90SDK-orig/mingw_dshow_port.h Thu Jan 1 10:00:00 1970 +++ ./mingw_dshow_port.h Tue Jul 27 20:42:25 2004 @@ -0,0 +1,12 @@ +#include +#include +#include +#include + +#define _WINGDI_ 1 +#define AM_NOVTABLE +#define _OBJBASE_H_ +#undef _X86_ +#define _I64_MAX LONG_LONG_MAX +#define EXTERN_GUID(itf,l1,s1,s2,c1,c2,c3,c4,c5,c6,c7,c8) \ + EXTERN_C static const IID itf = {l1,s1,s2,{c1,c2,c3,c4,c5,c6,c7,c8} } xvidcore/dshow/dxpatch/dx90sdk-update-gcc.txt0000664000076500007650000000735110101542771022307 0ustar xvidbuildxvidbuilddirectx 9.0 software development kit update (summer 2003) gnu c compiler compatibility patch =============================================== this patch has been tested with: gcc v3.2.3 gcc v3.3.3 gcc v3.4.1 (linux->win32 cross compiler) msc v6.0 + sp5 + pp I) Applying the patch ------------------ 1. Install the directx sdk update to /c/DX90SDK (to match default Makefile variable values) or to any other dir. From now on, the sdk install directory will be called ${SDKDIR}. dx90updatesdk.exe size: 190,991,976 bytes md5: ed328da4033e18124801265ee91f690e 2. cd ${SDKDIR} 3. patch -p0 --dry-run < /path/to/dx90sdk-update-gcc.patch (if all goes well, no rejects... else read the special notes) patch -p0 < /path/to/dx90sdk-update-gcc.patch -- Special notes for cross compilation on GNU/Linux systems (or any supported platform for wine): - You can install the DX SDK using the wine win32 API emulation layer. The unzipping stage of the install will succeed, it's all that is required to continue. So don't panic if the install program breaks/crashes after the self unzip did succeed. - Then depending on your cvs program, you may require to unix'ify the endlines of all sources in the SDK before applying the patch. 
  This can be necessary because CVS expands/replaces line endings according
  to the host platform type. If you extracted the xvidcore sources on a
  Windows box, this step is most likely not needed; if you are using a Unix
  box (even Cygwin), you may have to run:

    find ${SDKDIR} -name "*.cpp" -exec dos2unix {} \;
    find ${SDKDIR} -name "*.h" -exec dos2unix {} \;

II) Building strmbase.lib
-------------------------

  1. cd ${SDKDIR}/Samples/C++/DirectShow/BaseClasses
  2. make (this should output strmbase.lib)

-- Special notes for people cross compiling, or people who installed the SDK
   somewhere other than /c/DX90SDK:

 - You can override the Makefile defaults on the make command line. Use
   something like the following command (adapt it to your build
   environment):

    make \
      CXX=/opt/mingw32-cross/bin/i386-mingw32-g++ \
      RANLIB=/opt/mingw32-cross/bin/i386-mingw32-ranlib \
      DXTREE=${SDKDIR}

III) Building your own apps
---------------------------

These variables should be defined in your Makefiles:

    DXTREE=${SDKDIR}
    DXBASECLASSES=$(DXTREE)/Samples/C++/DirectShow/BaseClasses

    CXXFLAGS += -DRELEASE \
                -I$(DXTREE)/Include \
                -I$(DXBASECLASSES) \
                -include $(DXTREE)/mingw_dshow_port.h

    LDFLAGS += -L$(DXTREE)/Lib -lstrmiids \
               $(DXBASECLASSES)/strmbase.lib \
               -lole32 -loleaut32 -lstdc++

Now it is time to build the XviD DShow filter (the xvidcore source directory
is assumed to be ${xvidcore}):

  1. cd ${xvidcore}/dshow
  2.
make (this should output an xvid.ax file in a =build directory by default)

-- Notes for people using a cross compiler, or people who installed the SDK
   somewhere other than /c/DX90SDK:

 - You can override Makefile variables on the make command line. A fairly
   complete command could look like this (adapt it to your build
   environment):

    make \
      CC=/opt/mingw32-cross/bin/i386-mingw32-gcc \
      CXX=/opt/mingw32-cross/bin/i386-mingw32-g++ \
      WINDRES=/opt/mingw32-cross/bin/i386-mingw32-windres \
      DXTREE=/mnt/data/windows/dx9sdk

NB: with some win32-api headers from mingw.org, you may run into multiple
    definitions of QACONTAINERFLAGS. In that case you need to manually edit
    ${mingw_install}/include/ocidl.h and search for QACONTAINERFLAGS. It
    should look like this:

    enum tagQACONTAINERFLAGS { ... } QACONTAINERFLAGS;

    Change it to this:

    typedef enum tagQACONTAINERFLAGS { ... } QACONTAINERFLAGS;

    The added typedef is the whole point of the change.
xvidcore/dshow/Makefile0000664000076500007650000000661111113505477016257 0ustar xvidbuildxvidbuild
##############################################################################
#
# Makefile for XviD DirectShow driver
#
# Adapted from XviD VFW driver makefile.
# Modified by : Peter Ross
#
# Requires GNU Make because of shell expansion performed at a bad time with
# other make programs (even using := variable assignments)
#
# $Id: Makefile,v 1.7 2008-11-27 11:57:51 Isibaar Exp $
##############################################################################

include sources.inc

##############################################################################
# DXTREE must point to the directx sdk root directory.
# # if a release prior to "directx v9.0 sdk update (summer 2003)" is installed, # uncomment the DXBASECLASSES=$(DXTREE)/Samples/MultiMedia/DirectShow/BaseClasses ############################################################################## DXTREE=/c/DX90SDK # DXTREE=/c/DXVCSDK DXBASECLASSES=$(DXTREE)/Samples/C++/DirectShow/BaseClasses # DXBASECLASSES=$(DXTREE)/Samples/MultiMedia/DirectShow/BaseClasses MAKEFILE_PWD:=$(shell pwd) LOCAL_XVID_SRCTREE:=$(MAKEFILE_PWD)/../src LOCAL_XVID_BUILDTREE:=$(MAKEFILE_PWD)/../build/generic/=build RM = rm -rf WINDRES=windres # Constants which should not be modified # The `mingw-runtime` package is required when building with -mno-cygwin CFLAGS += -mthreads CFLAGS += -I$(SRC_DIR)/w32api -I$(LOCAL_XVID_SRCTREE) CFLAGS += -mno-cygwin CXXFLAGS +=-mthreads CXXFLAGS += -DRELEASE \ -I$(LOCAL_XVID_SRCTREE) \ -I$(DXTREE)/Include \ -I$(DXBASECLASSES) \ -include $(DXTREE)/mingw_dshow_port.h CXXFLAGS += -mno-cygwin ############################################################################## # Optional Compiler options ############################################################################## CFLAGS += -Wall CFLAGS += -O2 CFLAGS += -fstrength-reduce CFLAGS += -finline-functions CFLAGS += -fgcse CFLAGS += -ffast-math CXXFLAGS += -O2 ############################################################################## # Compiler flags for linking stage ############################################################################## #LDFLAGS += ############################################################################## # Rules ############################################################################## OBJECTS = $(SRC_C:.c=.obj) OBJECTS+= $(SRC_CPP:.cpp=.obj) OBJECTS+= $(SRC_RES:.rc=.obj) .SUFFIXES: .obj .rc .c BUILD_DIR = =build VPATH = $(SRC_DIR):$(BUILD_DIR) all: $(LIBSO) $(BUILD_DIR): @echo " D: $(BUILD_DIR)" @mkdir -p $(BUILD_DIR) .rc.obj: @echo " W: $(@D)/$( # Microsoft Developer Studio Generated Build File, Format Version 6.00 # ** DO 
NOT EDIT ** # TARGTYPE "Win32 (x86) Console Application" 0x0103 CFG=xvid_decraw - Win32 Debug !MESSAGE This is not a valid makefile. To build this project using NMAKE, !MESSAGE use the Export Makefile command and run !MESSAGE !MESSAGE NMAKE /f "xvid_decraw.mak". !MESSAGE !MESSAGE You can specify a configuration when running NMAKE !MESSAGE by defining the macro CFG on the command line. For example: !MESSAGE !MESSAGE NMAKE /f "xvid_decraw.mak" CFG="xvid_decraw - Win32 Debug" !MESSAGE !MESSAGE Possible choices for configuration are: !MESSAGE !MESSAGE "xvid_decraw - Win32 Release" (based on "Win32 (x86) Console Application") !MESSAGE "xvid_decraw - Win32 Debug" (based on "Win32 (x86) Console Application") !MESSAGE # Begin Project # PROP AllowPerConfigDependencies 0 # PROP Scc_ProjName "" # PROP Scc_LocalPath "" CPP=cl.exe RSC=rc.exe !IF "$(CFG)" == "xvid_decraw - Win32 Release" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 0 # PROP BASE Output_Dir "bin" # PROP BASE Intermediate_Dir "Release" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 0 # PROP Output_Dir "bin" # PROP Intermediate_Dir "Release" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c # ADD CPP /nologo /MT /W3 /GX /O2 /I "..\\..\\src" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /D "ARCH_IS_LITTLE_ENDIAN" /YX /FD /c # ADD BASE RSC /l 0x409 /d "NDEBUG" # ADD RSC /l 0x409 /d "NDEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386 # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib 
shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib xvidcore.dll.a /nologo /subsystem:console /machine:I386 /libpath:"bin" !ELSEIF "$(CFG)" == "xvid_decraw - Win32 Debug" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 1 # PROP BASE Output_Dir "bin" # PROP BASE Intermediate_Dir "Debug" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 1 # PROP Output_Dir "bin" # PROP Intermediate_Dir "Debug" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c # ADD CPP /nologo /MTd /W3 /Gm /GX /ZI /Od /I "..\\..\\src" /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /D "ARCH_IS_LITTLE_ENDIAN" /YX /FD /GZ /c # ADD BASE RSC /l 0x409 /d "_DEBUG" # ADD RSC /l 0x409 /d "_DEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib xvidcore.dll.a /nologo /subsystem:console /pdb:"Debug/xvid_decraw.pdb" /debug /machine:I386 /pdbtype:sept /libpath:"bin" # SUBTRACT LINK32 /pdb:none !ENDIF # Begin Target # Name "xvid_decraw - Win32 Release" # Name "xvid_decraw - Win32 Debug" # Begin Group "Source Files" # PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat" # Begin Source File SOURCE=..\..\examples\xvid_decraw.c # End Source File # End Group # End Target # End Project xvidcore/build/win32/libxvidcore.sln0000664000076500007650000000531711567132224020565 0ustar xvidbuildxvidbuildMicrosoft Visual Studio Solution File, 
Format Version 9.00 # Visual Studio 2005 Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "libxvidcore", "libxvidcore.vcproj", "{64954A96-C813-4A92-87AD-DD733A5404AF}" EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "xvid_decraw", "xvid_decraw.vcproj", "{C499489E-1438-4118-A5D6-CADFE611AFB4}" ProjectSection(ProjectDependencies) = postProject {64954A96-C813-4A92-87AD-DD733A5404AF} = {64954A96-C813-4A92-87AD-DD733A5404AF} EndProjectSection EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "xvid_encraw", "xvid_encraw.vcproj", "{C71ECEC2-B8FA-42F0-9A38-3BE4CF0B1624}" ProjectSection(ProjectDependencies) = postProject {64954A96-C813-4A92-87AD-DD733A5404AF} = {64954A96-C813-4A92-87AD-DD733A5404AF} EndProjectSection EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug|Win32 = Debug|Win32 Debug|x64 = Debug|x64 Release|Win32 = Release|Win32 Release|x64 = Release|x64 EndGlobalSection GlobalSection(ProjectConfigurationPlatforms) = postSolution {64954A96-C813-4A92-87AD-DD733A5404AF}.Debug|Win32.ActiveCfg = Debug|Win32 {64954A96-C813-4A92-87AD-DD733A5404AF}.Debug|Win32.Build.0 = Debug|Win32 {64954A96-C813-4A92-87AD-DD733A5404AF}.Debug|x64.ActiveCfg = Debug|x64 {64954A96-C813-4A92-87AD-DD733A5404AF}.Debug|x64.Build.0 = Debug|x64 {64954A96-C813-4A92-87AD-DD733A5404AF}.Release|Win32.ActiveCfg = Release|Win32 {64954A96-C813-4A92-87AD-DD733A5404AF}.Release|Win32.Build.0 = Release|Win32 {64954A96-C813-4A92-87AD-DD733A5404AF}.Release|x64.ActiveCfg = Release|x64 {64954A96-C813-4A92-87AD-DD733A5404AF}.Release|x64.Build.0 = Release|x64 {C499489E-1438-4118-A5D6-CADFE611AFB4}.Debug|Win32.ActiveCfg = Debug|Win32 {C499489E-1438-4118-A5D6-CADFE611AFB4}.Debug|Win32.Build.0 = Debug|Win32 {C499489E-1438-4118-A5D6-CADFE611AFB4}.Debug|x64.ActiveCfg = Debug|x64 {C499489E-1438-4118-A5D6-CADFE611AFB4}.Release|Win32.ActiveCfg = Release|Win32 {C499489E-1438-4118-A5D6-CADFE611AFB4}.Release|Win32.Build.0 = Release|Win32 
{C499489E-1438-4118-A5D6-CADFE611AFB4}.Release|x64.ActiveCfg = Release|x64 {C71ECEC2-B8FA-42F0-9A38-3BE4CF0B1624}.Debug|Win32.ActiveCfg = Debug|Win32 {C71ECEC2-B8FA-42F0-9A38-3BE4CF0B1624}.Debug|Win32.Build.0 = Debug|Win32 {C71ECEC2-B8FA-42F0-9A38-3BE4CF0B1624}.Debug|x64.ActiveCfg = Debug|x64 {C71ECEC2-B8FA-42F0-9A38-3BE4CF0B1624}.Release|Win32.ActiveCfg = Release|Win32 {C71ECEC2-B8FA-42F0-9A38-3BE4CF0B1624}.Release|Win32.Build.0 = Release|Win32 {C71ECEC2-B8FA-42F0-9A38-3BE4CF0B1624}.Release|x64.ActiveCfg = Release|x64 EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE EndGlobalSection EndGlobal xvidcore/build/win32/libxvidcore.vcproj0000664000076500007650000030364211567132217021300 0ustar xvidbuildxvidbuild xvidcore/build/win32/xvid_decraw.vcproj0000664000076500007650000002442711567132217021266 0ustar xvidbuildxvidbuild xvidcore/build/win32/xvid_encraw.vcproj0000664000076500007650000002465711567132217021305 0ustar xvidbuildxvidbuild xvidcore/build/win32/xvid_encraw_static.dsp0000664000076500007650000001065411567132204022123 0ustar xvidbuildxvidbuild# Microsoft Developer Studio Project File - Name="xvid_encraw_static" - Package Owner=<4> # Microsoft Developer Studio Generated Build File, Format Version 6.00 # ** DO NOT EDIT ** # TARGTYPE "Win32 (x86) Console Application" 0x0103 CFG=xvid_encraw_static - Win32 Debug !MESSAGE This is not a valid makefile. To build this project using NMAKE, !MESSAGE use the Export Makefile command and run !MESSAGE !MESSAGE NMAKE /f "xvid_encraw_static.mak". !MESSAGE !MESSAGE You can specify a configuration when running NMAKE !MESSAGE by defining the macro CFG on the command line. 
For example: !MESSAGE !MESSAGE NMAKE /f "xvid_encraw_static.mak" CFG="xvid_encraw_static - Win32 Debug" !MESSAGE !MESSAGE Possible choices for configuration are: !MESSAGE !MESSAGE "xvid_encraw_static - Win32 Release" (based on "Win32 (x86) Console Application") !MESSAGE "xvid_encraw_static - Win32 Debug" (based on "Win32 (x86) Console Application") !MESSAGE # Begin Project # PROP AllowPerConfigDependencies 0 # PROP Scc_ProjName "" # PROP Scc_LocalPath "" CPP=cl.exe RSC=rc.exe !IF "$(CFG)" == "xvid_encraw_static - Win32 Release" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 0 # PROP BASE Output_Dir "xvid_encraw_static___Win32_Release" # PROP BASE Intermediate_Dir "xvid_encraw_static___Win32_Release" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 0 # PROP Output_Dir "xvid_encraw_static___Win32_Release" # PROP Intermediate_Dir "xvid_encraw_static___Win32_Release" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c # ADD CPP /nologo /W3 /GX /O2 /I "..\\..\\src" /D "NDEBUG" /D "WIN32" /D "_CONSOLE" /D "_MBCS" /D "ARCH_IS_LITTLE_ENDIAN" /YX /FD /c # ADD BASE RSC /l 0xc09 /d "NDEBUG" # ADD RSC /l 0xc09 /d "NDEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386 # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib vfw32.lib /nologo /subsystem:console /machine:I386 !ELSEIF "$(CFG)" == "xvid_encraw_static - Win32 Debug" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 1 # PROP BASE 
Output_Dir "xvid_encraw_static___Win32_Debug" # PROP BASE Intermediate_Dir "xvid_encraw_static___Win32_Debug" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 1 # PROP Output_Dir "xvid_encraw_static___Win32_Debug" # PROP Intermediate_Dir "xvid_encraw_static___Win32_Debug" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c # ADD CPP /nologo /W3 /Gm /GX /ZI /Od /I "..\\..\\src" /D "_DEBUG" /D "WIN32" /D "_CONSOLE" /D "_MBCS" /D "ARCH_IS_LITTLE_ENDIAN" /YX /FD /GZ /c # ADD BASE RSC /l 0xc09 /d "_DEBUG" # ADD RSC /l 0xc09 /d "_DEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib vfw32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept !ENDIF # Begin Target # Name "xvid_encraw_static - Win32 Release" # Name "xvid_encraw_static - Win32 Debug" # Begin Group "Source Files" # PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat" # Begin Source File SOURCE=..\..\examples\xvid_encraw.c # End Source File # End Group # Begin Group "Header Files" # PROP Default_Filter "h;hpp;hxx;hm;inl" # End Group # Begin Group "Resource Files" # PROP Default_Filter "ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe" # End Group # End Target # End Project xvidcore/build/win32/libxvidcore.dsp0000664000076500007650000012252111567132204020552 0ustar xvidbuildxvidbuild# Microsoft Developer Studio Project File - 
Name="libxvidcore" - Package Owner=<4> # Microsoft Developer Studio Generated Build File, Format Version 6.00 # ** DO NOT EDIT ** # TARGTYPE "Win32 (x86) Dynamic-Link Library" 0x0102 CFG=libxvidcore - Win32 Debug !MESSAGE This is not a valid makefile. To build this project using NMAKE, !MESSAGE use the Export Makefile command and run !MESSAGE !MESSAGE NMAKE /f "libxvidcore.mak". !MESSAGE !MESSAGE You can specify a configuration when running NMAKE !MESSAGE by defining the macro CFG on the command line. For example: !MESSAGE !MESSAGE NMAKE /f "libxvidcore.mak" CFG="libxvidcore - Win32 Debug" !MESSAGE !MESSAGE Possible choices for configuration are: !MESSAGE !MESSAGE "libxvidcore - Win32 Release" (based on "Win32 (x86) Dynamic-Link Library") !MESSAGE "libxvidcore - Win32 Debug" (based on "Win32 (x86) Dynamic-Link Library") !MESSAGE # Begin Project # PROP AllowPerConfigDependencies 0 # PROP Scc_ProjName "" # PROP Scc_LocalPath "" CPP=cl.exe MTL=midl.exe RSC=rc.exe !IF "$(CFG)" == "libxvidcore - Win32 Release" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 0 # PROP BASE Output_Dir "Release" # PROP BASE Intermediate_Dir "Release" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 0 # PROP Output_Dir "Release" # PROP Intermediate_Dir "Release" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /MT /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c # ADD CPP /nologo /MT /W3 /GX /O2 /D "NDEBUG" /D "ARCH_IS_IA32" /D "WIN32" /D "_WINDOWS" /D "ARCH_IS_32BIT" /YX /FD /Qipo /c # ADD BASE MTL /nologo /D "NDEBUG" /mktyplib203 /win32 # ADD MTL /nologo /D "NDEBUG" /mktyplib203 /win32 # ADD BASE RSC /l 0xc09 /d "NDEBUG" # ADD RSC /l 0xc09 /d "NDEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib
uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /machine:I386 # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /version:1.0 /subsystem:windows /dll /machine:I386 /out:"bin\xvidcore.dll" /implib:"bin\xvidcore.dll.a" !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 1 # PROP BASE Output_Dir "Debug" # PROP BASE Intermediate_Dir "Debug" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 1 # PROP Output_Dir "Debug" # PROP Intermediate_Dir "Debug" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /MTd /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /GZ /c # ADD CPP /nologo /MTd /W3 /Gm /GX /ZI /Od /D "_DEBUG" /D "WIN32" /D "_WINDOWS" /D "ARCH_IS_32BIT" /D "ARCH_IS_IA32" /YX /FD /GZ /c # ADD BASE MTL /nologo /D "_DEBUG" /mktyplib203 /win32 # ADD MTL /nologo /D "_DEBUG" /mktyplib203 /win32 # ADD BASE RSC /l 0xc09 /d "_DEBUG" # ADD RSC /l 0xc09 /d "_DEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /debug /machine:I386 /pdbtype:sept # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /version:1.0 /subsystem:windows /dll /debug /machine:I386 /out:"bin\xvidcore.dll" /implib:"bin\xvidcore.dll.a" /pdbtype:sept !ENDIF # Begin Target # Name "libxvidcore - Win32 Release" # Name "libxvidcore - Win32 Debug" # Begin Group "docs" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\AUTHORS # End Source File # Begin Source File SOURCE=..\..\ChangeLog # End Source File # Begin Source File SOURCE=..\..\CodingStyle # 
End Source File # Begin Source File SOURCE="..\..\doc\INSTALL" # End Source File # Begin Source File SOURCE=..\..\LICENSE # End Source File # Begin Source File SOURCE="..\..\doc\README" # End Source File # Begin Source File SOURCE=..\..\README # End Source File # Begin Source File SOURCE=..\..\TODO # End Source File # End Group # Begin Group "bitstream" # PROP Default_Filter "" # Begin Group "bitstream_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\bitstream\x86_asm\cbp_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\bitstream\x86_asm\cbp_mmx.asm InputName=cbp_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\bitstream\x86_asm\cbp_mmx.asm InputName=cbp_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\bitstream\x86_asm\cbp_sse2.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\bitstream\x86_asm\cbp_sse2.asm InputName=cbp_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\bitstream\x86_asm\cbp_sse2.asm InputName=cbp_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End 
Source File # End Group # Begin Group "bitstream_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\bitstream\bitstream.h # End Source File # Begin Source File SOURCE=..\..\src\bitstream\cbp.h # End Source File # Begin Source File SOURCE=..\..\src\bitstream\mbcoding.h # End Source File # Begin Source File SOURCE=..\..\src\bitstream\vlc_codes.h # End Source File # Begin Source File SOURCE=..\..\src\bitstream\zigzag.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\bitstream\bitstream.c # End Source File # Begin Source File SOURCE=..\..\src\bitstream\cbp.c # End Source File # Begin Source File SOURCE=..\..\src\bitstream\mbcoding.c # End Source File # End Group # Begin Group "dct" # PROP Default_Filter "" # Begin Group "dct_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\dct\x86_asm\fdct_mmx_ffmpeg.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\fdct_mmx_ffmpeg.asm InputName=fdct_mmx_ffmpeg "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\fdct_mmx_ffmpeg.asm InputName=fdct_mmx_ffmpeg "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\fdct_mmx_skal.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\fdct_mmx_skal.asm InputName=fdct_mmx_skal "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" 
# End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\fdct_mmx_skal.asm InputName=fdct_mmx_skal "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\fdct_sse2_skal.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\fdct_sse2_skal.asm InputName=fdct_sse2_skal "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\fdct_sse2_skal.asm InputName=fdct_sse2_skal "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\idct_3dne.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\idct_3dne.asm InputName=idct_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\idct_3dne.asm InputName=idct_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File 
SOURCE=..\..\src\dct\x86_asm\idct_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\idct_mmx.asm InputName=idct_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\idct_mmx.asm InputName=idct_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\idct_sse2_dmitry.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\idct_sse2_dmitry.asm InputName=idct_sse2_dmitry "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\idct_sse2_dmitry.asm InputName=idct_sse2_dmitry "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "dct_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\dct\fdct.h # End Source File # Begin Source File SOURCE=..\..\src\dct\idct.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\dct\fdct.c # End Source File # Begin Source File SOURCE=..\..\src\dct\idct.c # End Source File # Begin Source File SOURCE=..\..\src\dct\simple_idct.c # End Source File # End Group # Begin Group 
"image" # PROP Default_Filter "" # Begin Group "image_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\image\x86_asm\colorspace_mmx.inc # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\colorspace_rgb_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Release InputPath=..\..\src\image\x86_asm\colorspace_rgb_mmx.asm InputName=colorspace_rgb_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Debug InputPath=..\..\src\image\x86_asm\colorspace_rgb_mmx.asm InputName=colorspace_rgb_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\colorspace_yuv_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Release InputPath=..\..\src\image\x86_asm\colorspace_yuv_mmx.asm InputName=colorspace_yuv_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Debug InputPath=..\..\src\image\x86_asm\colorspace_yuv_mmx.asm InputName=colorspace_yuv_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # 
End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\colorspace_yuyv_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Release InputPath=..\..\src\image\x86_asm\colorspace_yuyv_mmx.asm InputName=colorspace_yuyv_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Debug InputPath=..\..\src\image\x86_asm\colorspace_yuyv_mmx.asm InputName=colorspace_yuyv_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\deintl_sse.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\deintl_sse.asm InputName=deintl_sse "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\deintl_sse.asm InputName=deintl_sse "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\gmc_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\gmc_mmx.asm 
InputName=gmc_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\gmc_mmx.asm InputName=gmc_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\interpolate8x8_3dn.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\interpolate8x8_3dn.asm InputName=interpolate8x8_3dn "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\interpolate8x8_3dn.asm InputName=interpolate8x8_3dn "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\interpolate8x8_3dne.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\interpolate8x8_3dne.asm InputName=interpolate8x8_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\interpolate8x8_3dne.asm 
InputName=interpolate8x8_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\interpolate8x8_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\interpolate8x8_mmx.asm InputName=interpolate8x8_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\interpolate8x8_mmx.asm InputName=interpolate8x8_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\interpolate8x8_xmm.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\interpolate8x8_xmm.asm InputName=interpolate8x8_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\interpolate8x8_xmm.asm InputName=interpolate8x8_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\postprocessing_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # 
Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\postprocessing_mmx.asm InputName=postprocessing_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\postprocessing_mmx.asm InputName=postprocessing_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\postprocessing_sse2.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\postprocessing_sse2.asm InputName=postprocessing_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\postprocessing_sse2.asm InputName=postprocessing_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\qpel_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\qpel_mmx.asm InputName=qpel_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # 
Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\qpel_mmx.asm InputName=qpel_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\reduced_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\reduced_mmx.asm InputName=reduced_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\reduced_mmx.asm InputName=reduced_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "image_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\image\colorspace.h # End Source File # Begin Source File SOURCE=..\..\src\image\font.h # End Source File # Begin Source File SOURCE=..\..\src\image\image.h # End Source File # Begin Source File SOURCE=..\..\src\image\interpolate8x8.h # End Source File # Begin Source File SOURCE=..\..\src\image\postprocessing.h # End Source File # Begin Source File SOURCE=..\..\src\image\qpel.h # End Source File # Begin Source File SOURCE=..\..\src\image\reduced.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\image\colorspace.c # End Source File # Begin Source File SOURCE=..\..\src\image\font.c # End Source File # Begin Source File SOURCE=..\..\src\image\image.c # End Source File # Begin Source File SOURCE=..\..\src\image\interpolate8x8.c # End Source File # Begin Source File 
SOURCE=..\..\src\image\postprocessing.c # End Source File # Begin Source File SOURCE=..\..\src\image\qpel.c # End Source File # Begin Source File SOURCE=..\..\src\image\reduced.c # End Source File # End Group # Begin Group "motion" # PROP Default_Filter "" # Begin Group "motion_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_3dn.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # PROP Ignore_Default_Tool 1 # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_3dn.asm InputName=sad_3dn "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_3dn.asm InputName=sad_3dn "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_3dne.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_3dne.asm InputName=sad_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_3dne.asm InputName=sad_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 
Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_mmx.asm InputName=sad_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_mmx.asm InputName=sad_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_sse2.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_sse2.asm InputName=sad_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_sse2.asm InputName=sad_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_xmm.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_xmm.asm InputName=sad_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug 
InputPath=..\..\src\motion\x86_asm\sad_xmm.asm InputName=sad_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "motion_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\motion\estimation.h # End Source File # Begin Source File SOURCE=..\..\src\motion\gmc.h # End Source File # Begin Source File SOURCE=..\..\src\motion\motion.h # End Source File # Begin Source File SOURCE=..\..\src\motion\motion_inlines.h # End Source File # Begin Source File SOURCE=..\..\src\motion\sad.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\motion\estimation_bvop.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_common.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_gmc.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_pvop.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_rd_based.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_rd_based_bvop.c # End Source File # Begin Source File SOURCE=..\..\src\motion\gmc.c # End Source File # Begin Source File SOURCE=..\..\src\motion\motion_comp.c # End Source File # Begin Source File SOURCE=..\..\src\motion\sad.c # End Source File # Begin Source File SOURCE=..\..\src\motion\vop_type_decision.c # End Source File # End Group # Begin Group "prediction" # PROP Default_Filter "" # Begin Group "prediction_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\prediction\mbprediction.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\prediction\mbprediction.c # End Source File # End Group # Begin Group "quant" # PROP Default_Filter "" # Begin Group "quant_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\quant\x86_asm\quantize_h263_3dne.asm !IF "$(CFG)" == "libxvidcore - Win32 
Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\quant\x86_asm\quantize_h263_3dne.asm InputName=quantize_h263_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\quant\x86_asm\quantize_h263_3dne.asm InputName=quantize_h263_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\quant\x86_asm\quantize_h263_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\quant\x86_asm\quantize_h263_mmx.asm InputName=quantize_h263_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\quant\x86_asm\quantize_h263_mmx.asm InputName=quantize_h263_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\quant\x86_asm\quantize_mpeg_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\quant\x86_asm\quantize_mpeg_mmx.asm InputName=quantize_mpeg_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == 
"libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\quant\x86_asm\quantize_mpeg_mmx.asm InputName=quantize_mpeg_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\quant\x86_asm\quantize_mpeg_xmm.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\quant\x86_asm\quantize_mpeg_xmm.asm InputName=quantize_mpeg_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\quant\x86_asm\quantize_mpeg_xmm.asm InputName=quantize_mpeg_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "quant_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\quant\quant.h # End Source File # Begin Source File SOURCE=..\..\src\quant\quant_matrix.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\quant\quant_h263.c # End Source File # Begin Source File SOURCE=..\..\src\quant\quant_matrix.c # End Source File # Begin Source File SOURCE=..\..\src\quant\quant_mpeg.c # End Source File # End Group # Begin Group "utils" # PROP Default_Filter "" # Begin Group "utils_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\utils\x86_asm\cpuid.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\utils\x86_asm\cpuid.asm InputName=cpuid 
"$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\utils\x86_asm\cpuid.asm InputName=cpuid "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\utils\x86_asm\interlacing_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\utils\x86_asm\interlacing_mmx.asm InputName=interlacing_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\utils\x86_asm\interlacing_mmx.asm InputName=interlacing_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\utils\x86_asm\mem_transfer_3dne.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\utils\x86_asm\mem_transfer_3dne.asm InputName=mem_transfer_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\utils\x86_asm\mem_transfer_3dne.asm InputName=mem_transfer_3dne 
"$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\utils\x86_asm\mem_transfer_mmx.asm !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\utils\x86_asm\mem_transfer_mmx.asm InputName=mem_transfer_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\utils\x86_asm\mem_transfer_mmx.asm InputName=mem_transfer_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "utils_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\utils\emms.h # End Source File # Begin Source File SOURCE=..\..\src\utils\mbfunctions.h # End Source File # Begin Source File SOURCE=..\..\src\utils\mem_align.h # End Source File # Begin Source File SOURCE=..\..\src\utils\mem_transfer.h # End Source File # Begin Source File SOURCE=..\..\src\utils\ratecontrol.h # End Source File # Begin Source File SOURCE=..\..\src\utils\timer.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\utils\emms.c # End Source File # Begin Source File SOURCE=..\..\src\utils\mbtransquant.c # End Source File # Begin Source File SOURCE=..\..\src\utils\mem_align.c # End Source File # Begin Source File SOURCE=..\..\src\utils\mem_transfer.c # End Source File # Begin Source File SOURCE=..\..\src\utils\timer.c # End Source File # End Group # Begin Group "xvid_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\decoder.h # End 
Source File # Begin Source File SOURCE=..\..\src\encoder.h # End Source File # Begin Source File SOURCE=..\..\src\global.h # End Source File # Begin Source File SOURCE=..\..\src\portab.h # End Source File # Begin Source File SOURCE=..\..\src\xvid.h # End Source File # End Group # Begin Group "plugins" # PROP Default_Filter "" # Begin Group "plugins_h" # PROP Default_Filter "" # End Group # Begin Group "plugins_asm" # PROP Default_Filter "" # Begin Source File SOURCE="..\..\src\plugins\x86_asm\plugin_ssim-a.asm" !IF "$(CFG)" == "libxvidcore - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath="..\..\src\plugins\x86_asm\plugin_ssim-a.asm" InputName=plugin_ssim-a "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath="..\..\src\plugins\x86_asm\plugin_ssim-a.asm" InputName=plugin_ssim-a "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Source File SOURCE=..\..\src\plugins\plugin_2pass1.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_2pass2.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_dump.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_lumimasking.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_psnr.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_single.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_ssim.c # End Source File # End Group # Begin Source File SOURCE=..\..\src\decoder.c # End Source File # Begin Source File SOURCE=..\..\src\encoder.c # End Source File # Begin Source File 
SOURCE=..\generic\libxvidcore.def # End Source File # Begin Source File SOURCE=..\..\src\xvid.c # End Source File # End Target # End Project
xvidcore/build/win32/xvid_bench.dsp
# Microsoft Developer Studio Project File - Name="xvid_bench" - Package Owner=<4> # Microsoft Developer Studio Generated Build File, Format Version 6.00 # ** DO NOT EDIT ** # TARGTYPE "Win32 (x86) Console Application" 0x0103 CFG=xvid_bench - Win32 Debug !MESSAGE This is not a valid makefile. To build this project using NMAKE, !MESSAGE use the Export Makefile command and run !MESSAGE !MESSAGE NMAKE /f "xvid_bench.mak". !MESSAGE !MESSAGE You can specify a configuration when running NMAKE !MESSAGE by defining the macro CFG on the command line. For example: !MESSAGE !MESSAGE NMAKE /f "xvid_bench.mak" CFG="xvid_bench - Win32 Debug" !MESSAGE !MESSAGE Possible choices for configuration are: !MESSAGE !MESSAGE "xvid_bench - Win32 Release" (based on "Win32 (x86) Console Application") !MESSAGE "xvid_bench - Win32 Debug" (based on "Win32 (x86) Console Application") !MESSAGE # Begin Project # PROP AllowPerConfigDependencies 0 # PROP Scc_ProjName "" # PROP Scc_LocalPath "" CPP=cl.exe RSC=rc.exe !IF "$(CFG)" == "xvid_bench - Win32 Release" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 0 # PROP BASE Output_Dir "Release" # PROP BASE Intermediate_Dir "Release" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 0 # PROP Output_Dir "bin" # PROP Intermediate_Dir "Release" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c # ADD CPP /nologo /W3 /GX /O2 /I "..\\..\\src" /D "WIN32" /D "NDEBUG" /D "ARCH_IS_IA32" /D "ARCH_IS_32BIT" /D "_CONSOLE" /D "_MBCS" /YX /FD /c # ADD BASE RSC /l 0x409 /d "NDEBUG" # ADD RSC /l 0x409 /d "NDEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32
kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386 # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386 !ELSEIF "$(CFG)" == "xvid_bench - Win32 Debug" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 1 # PROP BASE Output_Dir "xvid_bench___Win32_Debug" # PROP BASE Intermediate_Dir "xvid_bench___Win32_Debug" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 1 # PROP Output_Dir "bin" # PROP Intermediate_Dir "Debug" # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c # ADD CPP /nologo /W3 /Gm /GX /ZI /Od /I "..\\..\\src" /D "WIN32" /D "_DEBUG" /D "ARCH_IS_IA32" /D "ARCH_IS_32BIT" /D "_CONSOLE" /D "_MBCS" /FR /YX /FD /GZ /c # ADD BASE RSC /l 0x409 /d "_DEBUG" # ADD RSC /l 0x409 /d "_DEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib 
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept !ENDIF # Begin Target # Name "xvid_bench - Win32 Release" # Name "xvid_bench - Win32 Debug" # Begin Group "Source Files" # PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat" # Begin Source File SOURCE=..\..\examples\xvid_bench.c # End Source File # End Group # Begin Group "Header Files" # PROP Default_Filter "h;hpp;hxx;hm;inl" # Begin Source File SOURCE=..\..\src\xvid.h # End Source File # End Group # Begin Group "Resource Files" # PROP Default_Filter "ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe" # End Group # End Target # End Project
xvidcore/build/win32/libxvidcore_static.dsp
# Microsoft Developer Studio Project File - Name="libxvidcore_static" - Package Owner=<4> # Microsoft Developer Studio Generated Build File, Format Version 6.00 # ** DO NOT EDIT ** # TARGTYPE "Win32 (x86) Static Library" 0x0104 CFG=libxvidcore_static - Win32 Debug !MESSAGE This is not a valid makefile. To build this project using NMAKE, !MESSAGE use the Export Makefile command and run !MESSAGE !MESSAGE NMAKE /f "libxvidcore_static.mak". !MESSAGE !MESSAGE You can specify a configuration when running NMAKE !MESSAGE by defining the macro CFG on the command line. For example: !MESSAGE !MESSAGE NMAKE /f "libxvidcore_static.mak" CFG="libxvidcore_static - Win32 Debug" !MESSAGE !MESSAGE Possible choices for configuration are: !MESSAGE !MESSAGE "libxvidcore_static - Win32 Release" (based on "Win32 (x86) Static Library") !MESSAGE "libxvidcore_static - Win32 Debug" (based on "Win32 (x86) Static Library") !MESSAGE # Begin Project # PROP AllowPerConfigDependencies 0 # PROP Scc_ProjName "" # PROP Scc_LocalPath "" CPP=cl.exe RSC=rc.exe !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 0 # PROP BASE Output_Dir "Release" # PROP BASE Intermediate_Dir "Release" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 0 # PROP Output_Dir "Release" # PROP Intermediate_Dir "Release" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c # ADD CPP /nologo /W3 /GX /O2 /D "NDEBUG" /D "ARCH_IS_IA32" /D "WIN32" /D "_WINDOWS" /D "ARCH_IS_32BIT" /YX /FD /Qipo /c # ADD BASE RSC /l 0xc09 /d "NDEBUG" # ADD RSC /l 0xc09 /d "NDEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LIB32=link.exe -lib # ADD BASE LIB32 /nologo # ADD LIB32 /nologo /out:"bin\libxvidcore.lib" !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 1 # PROP BASE Output_Dir "Debug" # PROP BASE Intermediate_Dir "Debug" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 1 # PROP Output_Dir "Debug" # PROP Intermediate_Dir "Debug" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /GZ /c # ADD CPP /nologo /W3 /Gm /GX /ZI /Od /D "_DEBUG" /D "WIN32" /D "_WINDOWS" /D "ARCH_IS_32BIT" /D "ARCH_IS_IA32" /YX /FD /GZ /c # ADD BASE RSC /l 0xc09 /d "_DEBUG" # ADD RSC /l 0xc09 /d "_DEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LIB32=link.exe
-lib # ADD BASE LIB32 /nologo # ADD LIB32 /nologo /out:"bin\libxvidcore.lib" !ENDIF # Begin Target # Name "libxvidcore_static - Win32 Release" # Name "libxvidcore_static - Win32 Debug" # Begin Group "docs" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\AUTHORS # End Source File # Begin Source File SOURCE=..\..\ChangeLog # End Source File # Begin Source File SOURCE=..\..\CodingStyle # End Source File # Begin Source File SOURCE="..\..\doc\INSTALL" # End Source File # Begin Source File SOURCE=..\..\LICENSE # End Source File # Begin Source File SOURCE="..\..\doc\README" # End Source File # Begin Source File SOURCE=..\..\README # End Source File # Begin Source File SOURCE=..\..\TODO # End Source File # End Group # Begin Group "bitstream" # PROP Default_Filter "" # Begin Group "bitstream_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\bitstream\x86_asm\cbp_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\bitstream\x86_asm\cbp_mmx.asm InputName=cbp_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\bitstream\x86_asm\cbp_mmx.asm InputName=cbp_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\bitstream\x86_asm\cbp_sse2.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\bitstream\x86_asm\cbp_sse2.asm InputName=cbp_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 
-DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\bitstream\x86_asm\cbp_sse2.asm InputName=cbp_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "bitstream_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\bitstream\bitstream.h # End Source File # Begin Source File SOURCE=..\..\src\bitstream\cbp.h # End Source File # Begin Source File SOURCE=..\..\src\bitstream\mbcoding.h # End Source File # Begin Source File SOURCE=..\..\src\bitstream\vlc_codes.h # End Source File # Begin Source File SOURCE=..\..\src\bitstream\zigzag.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\bitstream\bitstream.c # End Source File # Begin Source File SOURCE=..\..\src\bitstream\cbp.c # End Source File # Begin Source File SOURCE=..\..\src\bitstream\mbcoding.c # End Source File # End Group # Begin Group "dct" # PROP Default_Filter "" # Begin Group "dct_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\dct\x86_asm\fdct_mmx_ffmpeg.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\fdct_mmx_ffmpeg.asm InputName=fdct_mmx_ffmpeg "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\fdct_mmx_ffmpeg.asm InputName=fdct_mmx_ffmpeg "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # 
End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\fdct_mmx_skal.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\fdct_mmx_skal.asm InputName=fdct_mmx_skal "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\fdct_mmx_skal.asm InputName=fdct_mmx_skal "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\fdct_sse2_skal.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\fdct_sse2_skal.asm InputName=fdct_sse2_skal "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\fdct_sse2_skal.asm InputName=fdct_sse2_skal "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\idct_3dne.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\idct_3dne.asm InputName=idct_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o 
"$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\idct_3dne.asm InputName=idct_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\idct_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\idct_mmx.asm InputName=idct_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\idct_mmx.asm InputName=idct_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\dct\x86_asm\idct_sse2_dmitry.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\dct\x86_asm\idct_sse2_dmitry.asm InputName=idct_sse2_dmitry "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\dct\x86_asm\idct_sse2_dmitry.asm InputName=idct_sse2_dmitry "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 
-DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "dct_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\dct\fdct.h # End Source File # Begin Source File SOURCE=..\..\src\dct\idct.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\dct\fdct.c # End Source File # Begin Source File SOURCE=..\..\src\dct\idct.c # End Source File # Begin Source File SOURCE=..\..\src\dct\simple_idct.c # End Source File # End Group # Begin Group "image" # PROP Default_Filter "" # Begin Group "image_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\image\x86_asm\colorspace_mmx.inc # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\colorspace_rgb_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Release InputPath=..\..\src\image\x86_asm\colorspace_rgb_mmx.asm InputName=colorspace_rgb_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Debug InputPath=..\..\src\image\x86_asm\colorspace_rgb_mmx.asm InputName=colorspace_rgb_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\colorspace_yuv_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Release InputPath=..\..\src\image\x86_asm\colorspace_yuv_mmx.asm InputName=colorspace_yuv_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) 
"$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Debug InputPath=..\..\src\image\x86_asm\colorspace_yuv_mmx.asm InputName=colorspace_yuv_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\colorspace_yuyv_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Release InputPath=..\..\src\image\x86_asm\colorspace_yuyv_mmx.asm InputName=colorspace_yuyv_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Debug InputPath=..\..\src\image\x86_asm\colorspace_yuyv_mmx.asm InputName=colorspace_yuyv_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ -I"$(InputDir)"\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\deintl_sse.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Release InputPath=..\..\src\image\x86_asm\deintl_sse.asm InputName=deintl_sse "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom 
Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Debug InputPath=..\..\src\image\x86_asm\deintl_sse.asm InputName=deintl_sse "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\gmc_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Release InputPath=..\..\src\image\x86_asm\gmc_mmx.asm InputName=gmc_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) InputDir=\xvidcore\src\image\x86_asm IntDir=.\Debug InputPath=..\..\src\image\x86_asm\gmc_mmx.asm InputName=gmc_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\interpolate8x8_3dn.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\interpolate8x8_3dn.asm InputName=interpolate8x8_3dn "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\interpolate8x8_3dn.asm InputName=interpolate8x8_3dn "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" 
nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\interpolate8x8_3dne.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\interpolate8x8_3dne.asm InputName=interpolate8x8_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\interpolate8x8_3dne.asm InputName=interpolate8x8_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\interpolate8x8_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\interpolate8x8_mmx.asm InputName=interpolate8x8_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\interpolate8x8_mmx.asm InputName=interpolate8x8_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\interpolate8x8_xmm.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) 
IntDir=.\Release InputPath=..\..\src\image\x86_asm\interpolate8x8_xmm.asm InputName=interpolate8x8_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\interpolate8x8_xmm.asm InputName=interpolate8x8_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\postprocessing_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\postprocessing_mmx.asm InputName=postprocessing_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\postprocessing_mmx.asm InputName=postprocessing_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\postprocessing_sse2.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\postprocessing_sse2.asm InputName=postprocessing_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - 
Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\postprocessing_sse2.asm InputName=postprocessing_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\qpel_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\qpel_mmx.asm InputName=qpel_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\qpel_mmx.asm InputName=qpel_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\image\x86_asm\reduced_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\image\x86_asm\reduced_mmx.asm InputName=reduced_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\image\x86_asm\reduced_mmx.asm InputName=reduced_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "image_h" # PROP 
Default_Filter "" # Begin Source File SOURCE=..\..\src\image\colorspace.h # End Source File # Begin Source File SOURCE=..\..\src\image\font.h # End Source File # Begin Source File SOURCE=..\..\src\image\image.h # End Source File # Begin Source File SOURCE=..\..\src\image\interpolate8x8.h # End Source File # Begin Source File SOURCE=..\..\src\image\postprocessing.h # End Source File # Begin Source File SOURCE=..\..\src\image\qpel.h # End Source File # Begin Source File SOURCE=..\..\src\image\reduced.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\image\colorspace.c # End Source File # Begin Source File SOURCE=..\..\src\image\font.c # End Source File # Begin Source File SOURCE=..\..\src\image\image.c # End Source File # Begin Source File SOURCE=..\..\src\image\interpolate8x8.c # End Source File # Begin Source File SOURCE=..\..\src\image\postprocessing.c # End Source File # Begin Source File SOURCE=..\..\src\image\qpel.c # End Source File # Begin Source File SOURCE=..\..\src\image\reduced.c # End Source File # End Group # Begin Group "motion" # PROP Default_Filter "" # Begin Group "motion_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_3dn.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # PROP Ignore_Default_Tool 1 # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_3dn.asm InputName=sad_3dn "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_3dn.asm InputName=sad_3dn "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File 
SOURCE=..\..\src\motion\x86_asm\sad_3dne.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_3dne.asm InputName=sad_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_3dne.asm InputName=sad_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_mmx.asm InputName=sad_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_mmx.asm InputName=sad_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_sse2.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_sse2.asm InputName=sad_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == 
"libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_sse2.asm InputName=sad_sse2 "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\motion\x86_asm\sad_xmm.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\motion\x86_asm\sad_xmm.asm InputName=sad_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\motion\x86_asm\sad_xmm.asm InputName=sad_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "motion_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\motion\estimation.h # End Source File # Begin Source File SOURCE=..\..\src\motion\gmc.h # End Source File # Begin Source File SOURCE=..\..\src\motion\motion.h # End Source File # Begin Source File SOURCE=..\..\src\motion\motion_inlines.h # End Source File # Begin Source File SOURCE=..\..\src\motion\sad.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\motion\estimation_bvop.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_common.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_gmc.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_pvop.c # End Source File # Begin Source File SOURCE=..\..\src\motion\estimation_rd_based.c # End Source File # Begin 
Source File SOURCE=..\..\src\motion\estimation_rd_based_bvop.c # End Source File # Begin Source File SOURCE=..\..\src\motion\gmc.c # End Source File # Begin Source File SOURCE=..\..\src\motion\motion_comp.c # End Source File # Begin Source File SOURCE=..\..\src\motion\sad.c # End Source File # Begin Source File SOURCE=..\..\src\motion\vop_type_decision.c # End Source File # End Group # Begin Group "prediction" # PROP Default_Filter "" # Begin Group "prediction_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\prediction\mbprediction.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\prediction\mbprediction.c # End Source File # End Group # Begin Group "quant" # PROP Default_Filter "" # Begin Group "quant_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\quant\x86_asm\quantize_h263_3dne.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\quant\x86_asm\quantize_h263_3dne.asm InputName=quantize_h263_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\quant\x86_asm\quantize_h263_3dne.asm InputName=quantize_h263_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\quant\x86_asm\quantize_h263_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\quant\x86_asm\quantize_h263_mmx.asm InputName=quantize_h263_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f 
win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\quant\x86_asm\quantize_h263_mmx.asm InputName=quantize_h263_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\quant\x86_asm\quantize_mpeg_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\quant\x86_asm\quantize_mpeg_mmx.asm InputName=quantize_mpeg_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\quant\x86_asm\quantize_mpeg_mmx.asm InputName=quantize_mpeg_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\quant\x86_asm\quantize_mpeg_xmm.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\quant\x86_asm\quantize_mpeg_xmm.asm InputName=quantize_mpeg_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\quant\x86_asm\quantize_mpeg_xmm.asm InputName=quantize_mpeg_xmm "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" 
"$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group "quant_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\quant\quant.h # End Source File # Begin Source File SOURCE=..\..\src\quant\quant_matrix.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\quant\quant_h263.c # End Source File # Begin Source File SOURCE=..\..\src\quant\quant_matrix.c # End Source File # Begin Source File SOURCE=..\..\src\quant\quant_mpeg.c # End Source File # End Group # Begin Group "utils" # PROP Default_Filter "" # Begin Group "utils_asm" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\utils\x86_asm\cpuid.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\utils\x86_asm\cpuid.asm InputName=cpuid "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\utils\x86_asm\cpuid.asm InputName=cpuid "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\utils\x86_asm\interlacing_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\utils\x86_asm\interlacing_mmx.asm InputName=interlacing_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling 
$(InputPath) IntDir=.\Debug InputPath=..\..\src\utils\x86_asm\interlacing_mmx.asm InputName=interlacing_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\utils\x86_asm\mem_transfer_3dne.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\utils\x86_asm\mem_transfer_3dne.asm InputName=mem_transfer_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\utils\x86_asm\mem_transfer_3dne.asm InputName=mem_transfer_3dne "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # Begin Source File SOURCE=..\..\src\utils\x86_asm\mem_transfer_mmx.asm !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\utils\x86_asm\mem_transfer_mmx.asm InputName=mem_transfer_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug InputPath=..\..\src\utils\x86_asm\mem_transfer_mmx.asm InputName=mem_transfer_mmx "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Group 
"utils_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\utils\emms.h # End Source File # Begin Source File SOURCE=..\..\src\utils\mbfunctions.h # End Source File # Begin Source File SOURCE=..\..\src\utils\mem_align.h # End Source File # Begin Source File SOURCE=..\..\src\utils\mem_transfer.h # End Source File # Begin Source File SOURCE=..\..\src\utils\ratecontrol.h # End Source File # Begin Source File SOURCE=..\..\src\utils\timer.h # End Source File # End Group # Begin Source File SOURCE=..\..\src\utils\emms.c # End Source File # Begin Source File SOURCE=..\..\src\utils\mbtransquant.c # End Source File # Begin Source File SOURCE=..\..\src\utils\mem_align.c # End Source File # Begin Source File SOURCE=..\..\src\utils\mem_transfer.c # End Source File # Begin Source File SOURCE=..\..\src\utils\timer.c # End Source File # End Group # Begin Group "xvid_h" # PROP Default_Filter "" # Begin Source File SOURCE=..\..\src\decoder.h # End Source File # Begin Source File SOURCE=..\..\src\encoder.h # End Source File # Begin Source File SOURCE=..\..\src\global.h # End Source File # Begin Source File SOURCE=..\..\src\portab.h # End Source File # Begin Source File SOURCE=..\..\src\xvid.h # End Source File # End Group # Begin Group "plugins" # PROP Default_Filter "" # Begin Group "plugins_h" # PROP Default_Filter "" # End Group # Begin Group "plugins_asm" # PROP Default_Filter "" # Begin Source File SOURCE="..\..\src\plugins\x86_asm\plugin_ssim-a.asm" !IF "$(CFG)" == "libxvidcore_static - Win32 Release" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Release InputPath=..\..\src\plugins\x86_asm\plugin_ssim-a.asm InputName=plugin_ssim-a "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ELSEIF "$(CFG)" == "libxvidcore_static - Win32 Debug" # Begin Custom Build - Assembling $(InputPath) IntDir=.\Debug 
InputPath=..\..\src\plugins\x86_asm\plugin_ssim-a.asm InputName=plugin_ssim-a "$(IntDir)\$(InputName).obj" : $(SOURCE) "$(INTDIR)" "$(OUTDIR)" nasm -o "$(IntDir)\$(InputName).obj" -f win32 -DWINDOWS -I..\..\src\ "$(InputPath)" # End Custom Build !ENDIF # End Source File # End Group # Begin Source File SOURCE=..\..\src\plugins\plugin_2pass1.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_2pass2.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_dump.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_lumimasking.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_psnr.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_single.c # End Source File # Begin Source File SOURCE=..\..\src\plugins\plugin_ssim.c # End Source File # End Group # Begin Source File SOURCE=..\..\src\decoder.c # End Source File # Begin Source File SOURCE=..\..\src\encoder.c # End Source File # Begin Source File SOURCE=..\..\src\xvid.c # End Source File # End Target # End Project xvidcore/build/win32/xvid_encraw.dsp0000664000076500007650000001005211567132204020544 0ustar xvidbuildxvidbuild# Microsoft Developer Studio Project File - Name="xvid_encraw" - Package Owner=<4> # Microsoft Developer Studio Generated Build File, Format Version 6.00 # ** DO NOT EDIT ** # TARGTYPE "Win32 (x86) Console Application" 0x0103 CFG=xvid_encraw - Win32 Debug !MESSAGE This is not a valid makefile. To build this project using NMAKE, !MESSAGE use the Export Makefile command and run !MESSAGE !MESSAGE NMAKE /f "xvid_encraw.mak". !MESSAGE !MESSAGE You can specify a configuration when running NMAKE !MESSAGE by defining the macro CFG on the command line. 
For example: !MESSAGE !MESSAGE NMAKE /f "xvid_encraw.mak" CFG="xvid_encraw - Win32 Debug" !MESSAGE !MESSAGE Possible choices for configuration are: !MESSAGE !MESSAGE "xvid_encraw - Win32 Release" (based on "Win32 (x86) Console Application") !MESSAGE "xvid_encraw - Win32 Debug" (based on "Win32 (x86) Console Application") !MESSAGE # Begin Project # PROP AllowPerConfigDependencies 0 # PROP Scc_ProjName "" # PROP Scc_LocalPath "" CPP=cl.exe RSC=rc.exe !IF "$(CFG)" == "xvid_encraw - Win32 Release" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 0 # PROP BASE Output_Dir "bin" # PROP BASE Intermediate_Dir "Release" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 0 # PROP Output_Dir "bin" # PROP Intermediate_Dir "Release" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c # ADD CPP /nologo /MT /W3 /GX /O2 /I "..\\..\\src" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /D "ARCH_IS_LITTLE_ENDIAN" /YX /FD /c # ADD BASE RSC /l 0x409 /d "NDEBUG" # ADD RSC /l 0x409 /d "NDEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386 # ADD LINK32 xvidcore.dll.a kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib vfw32.lib /nologo /subsystem:console /machine:I386 /libpath:"bin" !ELSEIF "$(CFG)" == "xvid_encraw - Win32 Debug" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 1 # PROP BASE Output_Dir "bin" # PROP BASE Intermediate_Dir "Debug" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 1 # 
PROP Output_Dir "bin" # PROP Intermediate_Dir "Debug" # PROP Ignore_Export_Lib 0 # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c # ADD CPP /nologo /MTd /W3 /Gm /GX /ZI /Od /I "..\\..\\src" /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /D "ARCH_IS_LITTLE_ENDIAN" /YX /FD /GZ /c # ADD BASE RSC /l 0x409 /d "_DEBUG" # ADD RSC /l 0x409 /d "_DEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept # ADD LINK32 xvidcore.dll.a kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib vfw32.lib /nologo /subsystem:console /pdb:"Debug/xvid_encraw.pdb" /debug /machine:I386 /pdbtype:sept /libpath:"bin" # SUBTRACT LINK32 /pdb:none !ENDIF # Begin Target # Name "xvid_encraw - Win32 Release" # Name "xvid_encraw - Win32 Debug" # Begin Group "Source Files" # PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat" # Begin Source File SOURCE=..\..\examples\xvid_encraw.c # End Source File # End Group # End Target # End Project xvidcore/build/win32/xvidcore.dsw0000664000076500007650000000432611567132261020077 0ustar xvidbuildxvidbuildMicrosoft Developer Studio Workspace File, Format Version 6.00 # WARNING: DO NOT EDIT OR DELETE THIS WORKSPACE FILE! 
############################################################################### Project: "libxvidcore"=".\libxvidcore.dsp" - Package Owner=<4> Package=<5> {{{ }}} Package=<4> {{{ }}} ############################################################################### Project: "libxvidcore_static"=".\libxvidcore_static.dsp" - Package Owner=<4> Package=<5> {{{ }}} Package=<4> {{{ }}} ############################################################################### Project: "xvid_bench"=".\xvid_bench.dsp" - Package Owner=<4> Package=<5> {{{ }}} Package=<4> {{{ Begin Project Dependency Project_Dep_Name libxvidcore_static End Project Dependency }}} ############################################################################### Project: "xvid_decraw"=".\xvid_decraw.dsp" - Package Owner=<4> Package=<5> {{{ }}} Package=<4> {{{ Begin Project Dependency Project_Dep_Name libxvidcore End Project Dependency }}} ############################################################################### Project: "xvid_decraw_static"=".\xvid_decraw_static.dsp" - Package Owner=<4> Package=<5> {{{ }}} Package=<4> {{{ Begin Project Dependency Project_Dep_Name libxvidcore_static End Project Dependency }}} ############################################################################### Project: "xvid_encraw"=".\xvid_encraw.dsp" - Package Owner=<4> Package=<5> {{{ }}} Package=<4> {{{ Begin Project Dependency Project_Dep_Name libxvidcore End Project Dependency }}} ############################################################################### Project: "xvid_encraw_static"=".\xvid_encraw_static.dsp" - Package Owner=<4> Package=<5> {{{ }}} Package=<4> {{{ Begin Project Dependency Project_Dep_Name libxvidcore_static End Project Dependency }}} ############################################################################### Global: Package=<5> {{{ }}} Package=<3> {{{ }}} ############################################################################### 
xvidcore/build/win32/xvid_decraw_static.dsp0000664000076500007650000001120011567132204022075 0ustar xvidbuildxvidbuild# Microsoft Developer Studio Project File - Name="xvid_decraw_static" - Package Owner=<4> # Microsoft Developer Studio Generated Build File, Format Version 6.00 # ** DO NOT EDIT ** # TARGTYPE "Win32 (x86) Console Application" 0x0103 CFG=xvid_decraw_static - Win32 Debug !MESSAGE This is not a valid makefile. To build this project using NMAKE, !MESSAGE use the Export Makefile command and run !MESSAGE !MESSAGE NMAKE /f "xvid_decraw_static.mak". !MESSAGE !MESSAGE You can specify a configuration when running NMAKE !MESSAGE by defining the macro CFG on the command line. For example: !MESSAGE !MESSAGE NMAKE /f "xvid_decraw_static.mak" CFG="xvid_decraw_static - Win32 Debug" !MESSAGE !MESSAGE Possible choices for configuration are: !MESSAGE !MESSAGE "xvid_decraw_static - Win32 Release" (based on "Win32 (x86) Console Application") !MESSAGE "xvid_decraw_static - Win32 Debug" (based on "Win32 (x86) Console Application") !MESSAGE # Begin Project # PROP AllowPerConfigDependencies 0 # PROP Scc_ProjName "" # PROP Scc_LocalPath "" CPP=cl.exe RSC=rc.exe !IF "$(CFG)" == "xvid_decraw_static - Win32 Release" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 0 # PROP BASE Output_Dir "xvid_decraw_static___Win32_Release" # PROP BASE Intermediate_Dir "xvid_decraw_static___Win32_Release" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 0 # PROP Output_Dir "xvid_decraw_static___Win32_Release" # PROP Intermediate_Dir "xvid_decraw_static___Win32_Release" # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c # ADD CPP /nologo /W3 /GX /O2 /I "..\\..\\src" /D "NDEBUG" /D "WIN32" /D "_CONSOLE" /D "_MBCS" /D "ARCH_IS_LITTLE_ENDIAN" /YX /FD /c # ADD BASE RSC /l 0xc09 /d "NDEBUG" # ADD RSC /l 0xc09 /d "NDEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE 
LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386 # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386 !ELSEIF "$(CFG)" == "xvid_decraw_static - Win32 Debug" # PROP BASE Use_MFC 0 # PROP BASE Use_Debug_Libraries 1 # PROP BASE Output_Dir "xvid_decraw_static___Win32_Debug" # PROP BASE Intermediate_Dir "xvid_decraw_static___Win32_Debug" # PROP BASE Target_Dir "" # PROP Use_MFC 0 # PROP Use_Debug_Libraries 1 # PROP Output_Dir "xvid_decraw_static___Win32_Debug" # PROP Intermediate_Dir "xvid_decraw_static___Win32_Debug" # PROP Target_Dir "" # ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c # ADD CPP /nologo /W3 /Gm /GX /ZI /Od /I "..\\..\\src" /D "_DEBUG" /D "WIN32" /D "_CONSOLE" /D "_MBCS" /D "ARCH_IS_LITTLE_ENDIAN" /YX /FD /GZ /c # ADD BASE RSC /l 0xc09 /d "_DEBUG" # ADD RSC /l 0xc09 /d "_DEBUG" BSC32=bscmake.exe # ADD BASE BSC32 /nologo # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept # ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib 
odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept !ENDIF # Begin Target # Name "xvid_decraw_static - Win32 Release" # Name "xvid_decraw_static - Win32 Debug" # Begin Group "Source Files" # PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat" # Begin Source File SOURCE=..\..\examples\xvid_decraw.c # End Source File # End Group # Begin Group "Header Files" # PROP Default_Filter "h;hpp;hxx;hm;inl" # End Group # Begin Group "Resource Files" # PROP Default_Filter "ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe" # End Group # End Target # End Project xvidcore/build/generic/0000775000076500007650000000000011566432756016215 5ustar xvidbuildxvidbuildxvidcore/build/generic/config.guess0000755000076500007650000012776411566432664020552 0ustar xvidbuildxvidbuild#! /bin/sh # Attempt to guess a canonical system name. # Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, # 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 # Free Software Foundation, Inc. timestamp='2010-08-21' # This file is free software; you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA # 02110-1301, USA. 
# # As a special exception to the GNU General Public License, if you # distribute this file as part of a program that contains a # configuration script generated by Autoconf, you may include it under # the same distribution terms that you use for the rest of that program. # Originally written by Per Bothner. Please send patches (context # diff format) to and include a ChangeLog # entry. # # This script attempts to guess a canonical system name similar to # config.sub. If it succeeds, it prints the system name on stdout, and # exits with 0. Otherwise, it exits with 1. # # You can get the latest version of this script from: # http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD me=`echo "$0" | sed -e 's,.*/,,'` usage="\ Usage: $0 [OPTION] Output the configuration name of the system \`$me' is run on. Operation modes: -h, --help print this help, then exit -t, --time-stamp print date of last modification, then exit -v, --version print version number, then exit Report bugs and patches to ." version="\ GNU config.guess ($timestamp) Originally written by Per Bothner. Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE." help=" Try \`$me --help' for more information." # Parse command line while test $# -gt 0 ; do case $1 in --time-stamp | --time* | -t ) echo "$timestamp" ; exit ;; --version | -v ) echo "$version" ; exit ;; --help | --h* | -h ) echo "$usage"; exit ;; -- ) # Stop option processing shift; break ;; - ) # Use stdin as input. break ;; -* ) echo "$me: invalid option $1$help" >&2 exit 1 ;; * ) break ;; esac done if test $# != 0; then echo "$me: too many arguments$help" >&2 exit 1 fi trap 'exit 1' HUP INT TERM # CC_FOR_BUILD -- compiler used by this script. 
Note that the use of a # compiler to aid in system detection is discouraged as it requires # temporary files to be created and, as you can see below, it is a # headache to deal with in a portable fashion. # Historically, `CC_FOR_BUILD' used to be named `HOST_CC'. We still # use `HOST_CC' if defined, but it is deprecated. # Portable tmp directory creation inspired by the Autoconf team. set_cc_for_build=' trap "exitcode=\$?; (rm -f \$tmpfiles 2>/dev/null; rmdir \$tmp 2>/dev/null) && exit \$exitcode" 0 ; trap "rm -f \$tmpfiles 2>/dev/null; rmdir \$tmp 2>/dev/null; exit 1" HUP INT PIPE TERM ; : ${TMPDIR=/tmp} ; { tmp=`(umask 077 && mktemp -d "$TMPDIR/cgXXXXXX") 2>/dev/null` && test -n "$tmp" && test -d "$tmp" ; } || { test -n "$RANDOM" && tmp=$TMPDIR/cg$$-$RANDOM && (umask 077 && mkdir $tmp) ; } || { tmp=$TMPDIR/cg-$$ && (umask 077 && mkdir $tmp) && echo "Warning: creating insecure temp directory" >&2 ; } || { echo "$me: cannot create a temporary directory in $TMPDIR" >&2 ; exit 1 ; } ; dummy=$tmp/dummy ; tmpfiles="$dummy.c $dummy.o $dummy.rel $dummy" ; case $CC_FOR_BUILD,$HOST_CC,$CC in ,,) echo "int x;" > $dummy.c ; for c in cc gcc c89 c99 ; do if ($c -c -o $dummy.o $dummy.c) >/dev/null 2>&1 ; then CC_FOR_BUILD="$c"; break ; fi ; done ; if test x"$CC_FOR_BUILD" = x ; then CC_FOR_BUILD=no_compiler_found ; fi ;; ,,*) CC_FOR_BUILD=$CC ;; ,*,*) CC_FOR_BUILD=$HOST_CC ;; esac ; set_cc_for_build= ;' # This is needed to find uname on a Pyramid OSx when run in the BSD universe. # (ghazi@noc.rutgers.edu 1994-08-24) if (test -f /.attbin/uname) >/dev/null 2>&1 ; then PATH=$PATH:/.attbin ; export PATH fi UNAME_MACHINE=`(uname -m) 2>/dev/null` || UNAME_MACHINE=unknown UNAME_RELEASE=`(uname -r) 2>/dev/null` || UNAME_RELEASE=unknown UNAME_SYSTEM=`(uname -s) 2>/dev/null` || UNAME_SYSTEM=unknown UNAME_VERSION=`(uname -v) 2>/dev/null` || UNAME_VERSION=unknown # Note: order is significant - the case branches are not exclusive. 
case "${UNAME_MACHINE}:${UNAME_SYSTEM}:${UNAME_RELEASE}:${UNAME_VERSION}" in *:NetBSD:*:*) # NetBSD (nbsd) targets should (where applicable) match one or # more of the tupples: *-*-netbsdelf*, *-*-netbsdaout*, # *-*-netbsdecoff* and *-*-netbsd*. For targets that recently # switched to ELF, *-*-netbsd* would select the old # object file format. This provides both forward # compatibility and a consistent mechanism for selecting the # object file format. # # Note: NetBSD doesn't particularly care about the vendor # portion of the name. We always set it to "unknown". sysctl="sysctl -n hw.machine_arch" UNAME_MACHINE_ARCH=`(/sbin/$sysctl 2>/dev/null || \ /usr/sbin/$sysctl 2>/dev/null || echo unknown)` case "${UNAME_MACHINE_ARCH}" in armeb) machine=armeb-unknown ;; arm*) machine=arm-unknown ;; sh3el) machine=shl-unknown ;; sh3eb) machine=sh-unknown ;; sh5el) machine=sh5le-unknown ;; *) machine=${UNAME_MACHINE_ARCH}-unknown ;; esac # The Operating System including object format, if it has switched # to ELF recently, or will in the future. case "${UNAME_MACHINE_ARCH}" in arm*|i386|m68k|ns32k|sh3*|sparc|vax) eval $set_cc_for_build if echo __ELF__ | $CC_FOR_BUILD -E - 2>/dev/null \ | grep -q __ELF__ then # Once all utilities can be ECOFF (netbsdecoff) or a.out (netbsdaout). # Return netbsd for either. FIX? os=netbsd else os=netbsdelf fi ;; *) os=netbsd ;; esac # The OS release # Debian GNU/NetBSD machines have a different userland, and # thus, need a distinct triplet. However, they do not need # kernel version information, so it can be replaced with a # suitable tag, in the style of linux-gnu. case "${UNAME_VERSION}" in Debian*) release='-gnu' ;; *) release=`echo ${UNAME_RELEASE}|sed -e 's/[-_].*/\./'` ;; esac # Since CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM: # contains redundant information, the shorter form: # CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM is used. 
echo "${machine}-${os}${release}" exit ;; *:OpenBSD:*:*) UNAME_MACHINE_ARCH=`arch | sed 's/OpenBSD.//'` echo ${UNAME_MACHINE_ARCH}-unknown-openbsd${UNAME_RELEASE} exit ;; *:ekkoBSD:*:*) echo ${UNAME_MACHINE}-unknown-ekkobsd${UNAME_RELEASE} exit ;; *:SolidBSD:*:*) echo ${UNAME_MACHINE}-unknown-solidbsd${UNAME_RELEASE} exit ;; macppc:MirBSD:*:*) echo powerpc-unknown-mirbsd${UNAME_RELEASE} exit ;; *:MirBSD:*:*) echo ${UNAME_MACHINE}-unknown-mirbsd${UNAME_RELEASE} exit ;; alpha:OSF1:*:*) case $UNAME_RELEASE in *4.0) UNAME_RELEASE=`/usr/sbin/sizer -v | awk '{print $3}'` ;; *5.*) UNAME_RELEASE=`/usr/sbin/sizer -v | awk '{print $4}'` ;; esac # According to Compaq, /usr/sbin/psrinfo has been available on # OSF/1 and Tru64 systems produced since 1995. I hope that # covers most systems running today. This code pipes the CPU # types through head -n 1, so we only detect the type of CPU 0. ALPHA_CPU_TYPE=`/usr/sbin/psrinfo -v | sed -n -e 's/^ The alpha \(.*\) processor.*$/\1/p' | head -n 1` case "$ALPHA_CPU_TYPE" in "EV4 (21064)") UNAME_MACHINE="alpha" ;; "EV4.5 (21064)") UNAME_MACHINE="alpha" ;; "LCA4 (21066/21068)") UNAME_MACHINE="alpha" ;; "EV5 (21164)") UNAME_MACHINE="alphaev5" ;; "EV5.6 (21164A)") UNAME_MACHINE="alphaev56" ;; "EV5.6 (21164PC)") UNAME_MACHINE="alphapca56" ;; "EV5.7 (21164PC)") UNAME_MACHINE="alphapca57" ;; "EV6 (21264)") UNAME_MACHINE="alphaev6" ;; "EV6.7 (21264A)") UNAME_MACHINE="alphaev67" ;; "EV6.8CB (21264C)") UNAME_MACHINE="alphaev68" ;; "EV6.8AL (21264B)") UNAME_MACHINE="alphaev68" ;; "EV6.8CX (21264D)") UNAME_MACHINE="alphaev68" ;; "EV6.9A (21264/EV69A)") UNAME_MACHINE="alphaev69" ;; "EV7 (21364)") UNAME_MACHINE="alphaev7" ;; "EV7.9 (21364A)") UNAME_MACHINE="alphaev79" ;; esac # A Pn.n version is a patched version. # A Vn.n version is a released version. # A Tn.n version is a released field test version. # A Xn.n version is an unreleased experimental baselevel. # 1.2 uses "1.2" for uname -r. 
echo ${UNAME_MACHINE}-dec-osf`echo ${UNAME_RELEASE} | sed -e 's/^[PVTX]//' | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 'abcdefghijklmnopqrstuvwxyz'` exit ;; Alpha\ *:Windows_NT*:*) # How do we know it's Interix rather than the generic POSIX subsystem? # Should we change UNAME_MACHINE based on the output of uname instead # of the specific Alpha model? echo alpha-pc-interix exit ;; 21064:Windows_NT:50:3) echo alpha-dec-winnt3.5 exit ;; Amiga*:UNIX_System_V:4.0:*) echo m68k-unknown-sysv4 exit ;; *:[Aa]miga[Oo][Ss]:*:*) echo ${UNAME_MACHINE}-unknown-amigaos exit ;; *:[Mm]orph[Oo][Ss]:*:*) echo ${UNAME_MACHINE}-unknown-morphos exit ;; *:OS/390:*:*) echo i370-ibm-openedition exit ;; *:z/VM:*:*) echo s390-ibm-zvmoe exit ;; *:OS400:*:*) echo powerpc-ibm-os400 exit ;; arm:RISC*:1.[012]*:*|arm:riscix:1.[012]*:*) echo arm-acorn-riscix${UNAME_RELEASE} exit ;; arm:riscos:*:*|arm:RISCOS:*:*) echo arm-unknown-riscos exit ;; SR2?01:HI-UX/MPP:*:* | SR8000:HI-UX/MPP:*:*) echo hppa1.1-hitachi-hiuxmpp exit ;; Pyramid*:OSx*:*:* | MIS*:OSx*:*:* | MIS*:SMP_DC-OSx*:*:*) # akee@wpdis03.wpafb.af.mil (Earle F. Ake) contributed MIS and NILE. 
if test "`(/bin/universe) 2>/dev/null`" = att ; then echo pyramid-pyramid-sysv3 else echo pyramid-pyramid-bsd fi exit ;; NILE*:*:*:dcosx) echo pyramid-pyramid-svr4 exit ;; DRS?6000:unix:4.0:6*) echo sparc-icl-nx6 exit ;; DRS?6000:UNIX_SV:4.2*:7* | DRS?6000:isis:4.2*:7*) case `/usr/bin/uname -p` in sparc) echo sparc-icl-nx7; exit ;; esac ;; s390x:SunOS:*:*) echo ${UNAME_MACHINE}-ibm-solaris2`echo ${UNAME_RELEASE}|sed -e 's/[^.]*//'` exit ;; sun4H:SunOS:5.*:*) echo sparc-hal-solaris2`echo ${UNAME_RELEASE}|sed -e 's/[^.]*//'` exit ;; sun4*:SunOS:5.*:* | tadpole*:SunOS:5.*:*) echo sparc-sun-solaris2`echo ${UNAME_RELEASE}|sed -e 's/[^.]*//'` exit ;; i86pc:AuroraUX:5.*:* | i86xen:AuroraUX:5.*:*) echo i386-pc-auroraux${UNAME_RELEASE} exit ;; i86pc:SunOS:5.*:* | i86xen:SunOS:5.*:*) eval $set_cc_for_build SUN_ARCH="i386" # If there is a compiler, see if it is configured for 64-bit objects. # Note that the Sun cc does not turn __LP64__ into 1 like gcc does. # This test works for both compilers. if [ "$CC_FOR_BUILD" != 'no_compiler_found' ]; then if (echo '#ifdef __amd64'; echo IS_64BIT_ARCH; echo '#endif') | \ (CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) | \ grep IS_64BIT_ARCH >/dev/null then SUN_ARCH="x86_64" fi fi echo ${SUN_ARCH}-pc-solaris2`echo ${UNAME_RELEASE}|sed -e 's/[^.]*//'` exit ;; sun4*:SunOS:6*:*) # According to config.sub, this is the proper way to canonicalize # SunOS6. Hard to guess exactly what SunOS6 will be like, but # it's likely to be more like Solaris than SunOS4. echo sparc-sun-solaris3`echo ${UNAME_RELEASE}|sed -e 's/[^.]*//'` exit ;; sun4*:SunOS:*:*) case "`/usr/bin/arch -k`" in Series*|S4*) UNAME_RELEASE=`uname -v` ;; esac # Japanese Language versions have a version number like `4.1.3-JL'. 
echo sparc-sun-sunos`echo ${UNAME_RELEASE}|sed -e 's/-/_/'` exit ;; sun3*:SunOS:*:*) echo m68k-sun-sunos${UNAME_RELEASE} exit ;; sun*:*:4.2BSD:*) UNAME_RELEASE=`(sed 1q /etc/motd | awk '{print substr($5,1,3)}') 2>/dev/null` test "x${UNAME_RELEASE}" = "x" && UNAME_RELEASE=3 case "`/bin/arch`" in sun3) echo m68k-sun-sunos${UNAME_RELEASE} ;; sun4) echo sparc-sun-sunos${UNAME_RELEASE} ;; esac exit ;; aushp:SunOS:*:*) echo sparc-auspex-sunos${UNAME_RELEASE} exit ;; # The situation for MiNT is a little confusing. The machine name # can be virtually everything (everything which is not # "atarist" or "atariste" at least should have a processor # > m68000). The system name ranges from "MiNT" over "FreeMiNT" # to the lowercase version "mint" (or "freemint"). Finally # the system name "TOS" denotes a system which is actually not # MiNT. But MiNT is downward compatible to TOS, so this should # be no problem. atarist[e]:*MiNT:*:* | atarist[e]:*mint:*:* | atarist[e]:*TOS:*:*) echo m68k-atari-mint${UNAME_RELEASE} exit ;; atari*:*MiNT:*:* | atari*:*mint:*:* | atarist[e]:*TOS:*:*) echo m68k-atari-mint${UNAME_RELEASE} exit ;; *falcon*:*MiNT:*:* | *falcon*:*mint:*:* | *falcon*:*TOS:*:*) echo m68k-atari-mint${UNAME_RELEASE} exit ;; milan*:*MiNT:*:* | milan*:*mint:*:* | *milan*:*TOS:*:*) echo m68k-milan-mint${UNAME_RELEASE} exit ;; hades*:*MiNT:*:* | hades*:*mint:*:* | *hades*:*TOS:*:*) echo m68k-hades-mint${UNAME_RELEASE} exit ;; *:*MiNT:*:* | *:*mint:*:* | *:*TOS:*:*) echo m68k-unknown-mint${UNAME_RELEASE} exit ;; m68k:machten:*:*) echo m68k-apple-machten${UNAME_RELEASE} exit ;; powerpc:machten:*:*) echo powerpc-apple-machten${UNAME_RELEASE} exit ;; RISC*:Mach:*:*) echo mips-dec-mach_bsd4.3 exit ;; RISC*:ULTRIX:*:*) echo mips-dec-ultrix${UNAME_RELEASE} exit ;; VAX*:ULTRIX*:*:*) echo vax-dec-ultrix${UNAME_RELEASE} exit ;; 2020:CLIX:*:* | 2430:CLIX:*:*) echo clipper-intergraph-clix${UNAME_RELEASE} exit ;; mips:*:*:UMIPS | mips:*:*:RISCos) eval $set_cc_for_build sed 's/^ //' << EOF 
>$dummy.c #ifdef __cplusplus #include <stdio.h> /* for printf() prototype */ int main (int argc, char *argv[]) { #else int main (argc, argv) int argc; char *argv[]; { #endif #if defined (host_mips) && defined (MIPSEB) #if defined (SYSTYPE_SYSV) printf ("mips-mips-riscos%ssysv\n", argv[1]); exit (0); #endif #if defined (SYSTYPE_SVR4) printf ("mips-mips-riscos%ssvr4\n", argv[1]); exit (0); #endif #if defined (SYSTYPE_BSD43) || defined(SYSTYPE_BSD) printf ("mips-mips-riscos%sbsd\n", argv[1]); exit (0); #endif #endif exit (-1); } EOF $CC_FOR_BUILD -o $dummy $dummy.c && dummyarg=`echo "${UNAME_RELEASE}" | sed -n 's/\([0-9]*\).*/\1/p'` && SYSTEM_NAME=`$dummy $dummyarg` && { echo "$SYSTEM_NAME"; exit; } echo mips-mips-riscos${UNAME_RELEASE} exit ;; Motorola:PowerMAX_OS:*:*) echo powerpc-motorola-powermax exit ;; Motorola:*:4.3:PL8-*) echo powerpc-harris-powermax exit ;; Night_Hawk:*:*:PowerMAX_OS | Synergy:PowerMAX_OS:*:*) echo powerpc-harris-powermax exit ;; Night_Hawk:Power_UNIX:*:*) echo powerpc-harris-powerunix exit ;; m88k:CX/UX:7*:*) echo m88k-harris-cxux7 exit ;; m88k:*:4*:R4*) echo m88k-motorola-sysv4 exit ;; m88k:*:3*:R3*) echo m88k-motorola-sysv3 exit ;; AViiON:dgux:*:*) # DG/UX returns AViiON for all architectures UNAME_PROCESSOR=`/usr/bin/uname -p` if [ $UNAME_PROCESSOR = mc88100 ] || [ $UNAME_PROCESSOR = mc88110 ] then if [ ${TARGET_BINARY_INTERFACE}x = m88kdguxelfx ] || \ [ ${TARGET_BINARY_INTERFACE}x = x ] then echo m88k-dg-dgux${UNAME_RELEASE} else echo m88k-dg-dguxbcs${UNAME_RELEASE} fi else echo i586-dg-dgux${UNAME_RELEASE} fi exit ;; M88*:DolphinOS:*:*) # DolphinOS (SVR3) echo m88k-dolphin-sysv3 exit ;; M88*:*:R3*:*) # Delta 88k system running SVR3 echo m88k-motorola-sysv3 exit ;; XD88*:*:*:*) # Tektronix XD88 system running UTekV (SVR3) echo m88k-tektronix-sysv3 exit ;; Tek43[0-9][0-9]:UTek:*:*) # Tektronix 4300 system running UTek (BSD) echo m68k-tektronix-bsd exit ;; *:IRIX*:*:*) echo mips-sgi-irix`echo ${UNAME_RELEASE}|sed -e 's/-/_/g'` exit ;; 
????????:AIX?:[12].1:2) # AIX 2.2.1 or AIX 2.1.1 is RT/PC AIX. echo romp-ibm-aix # uname -m gives an 8 hex-code CPU id exit ;; # Note that: echo "'`uname -s`'" gives 'AIX ' i*86:AIX:*:*) echo i386-ibm-aix exit ;; ia64:AIX:*:*) if [ -x /usr/bin/oslevel ] ; then IBM_REV=`/usr/bin/oslevel` else IBM_REV=${UNAME_VERSION}.${UNAME_RELEASE} fi echo ${UNAME_MACHINE}-ibm-aix${IBM_REV} exit ;; *:AIX:2:3) if grep bos325 /usr/include/stdio.h >/dev/null 2>&1; then eval $set_cc_for_build sed 's/^ //' << EOF >$dummy.c #include main() { if (!__power_pc()) exit(1); puts("powerpc-ibm-aix3.2.5"); exit(0); } EOF if $CC_FOR_BUILD -o $dummy $dummy.c && SYSTEM_NAME=`$dummy` then echo "$SYSTEM_NAME" else echo rs6000-ibm-aix3.2.5 fi elif grep bos324 /usr/include/stdio.h >/dev/null 2>&1; then echo rs6000-ibm-aix3.2.4 else echo rs6000-ibm-aix3.2 fi exit ;; *:AIX:*:[4567]) IBM_CPU_ID=`/usr/sbin/lsdev -C -c processor -S available | sed 1q | awk '{ print $1 }'` if /usr/sbin/lsattr -El ${IBM_CPU_ID} | grep ' POWER' >/dev/null 2>&1; then IBM_ARCH=rs6000 else IBM_ARCH=powerpc fi if [ -x /usr/bin/oslevel ] ; then IBM_REV=`/usr/bin/oslevel` else IBM_REV=${UNAME_VERSION}.${UNAME_RELEASE} fi echo ${IBM_ARCH}-ibm-aix${IBM_REV} exit ;; *:AIX:*:*) echo rs6000-ibm-aix exit ;; ibmrt:4.4BSD:*|romp-ibm:BSD:*) echo romp-ibm-bsd4.4 exit ;; ibmrt:*BSD:*|romp-ibm:BSD:*) # covers RT/PC BSD and echo romp-ibm-bsd${UNAME_RELEASE} # 4.3 with uname added to exit ;; # report: romp-ibm BSD 4.3 *:BOSX:*:*) echo rs6000-bull-bosx exit ;; DPX/2?00:B.O.S.:*:*) echo m68k-bull-sysv3 exit ;; 9000/[34]??:4.3bsd:1.*:*) echo m68k-hp-bsd exit ;; hp300:4.4BSD:*:* | 9000/[34]??:4.3bsd:2.*:*) echo m68k-hp-bsd4.4 exit ;; 9000/[34678]??:HP-UX:*:*) HPUX_REV=`echo ${UNAME_RELEASE}|sed -e 's/[^.]*.[0B]*//'` case "${UNAME_MACHINE}" in 9000/31? ) HP_ARCH=m68000 ;; 9000/[34]?? 
) HP_ARCH=m68k ;; 9000/[678][0-9][0-9]) if [ -x /usr/bin/getconf ]; then sc_cpu_version=`/usr/bin/getconf SC_CPU_VERSION 2>/dev/null` sc_kernel_bits=`/usr/bin/getconf SC_KERNEL_BITS 2>/dev/null` case "${sc_cpu_version}" in 523) HP_ARCH="hppa1.0" ;; # CPU_PA_RISC1_0 528) HP_ARCH="hppa1.1" ;; # CPU_PA_RISC1_1 532) # CPU_PA_RISC2_0 case "${sc_kernel_bits}" in 32) HP_ARCH="hppa2.0n" ;; 64) HP_ARCH="hppa2.0w" ;; '') HP_ARCH="hppa2.0" ;; # HP-UX 10.20 esac ;; esac fi if [ "${HP_ARCH}" = "" ]; then eval $set_cc_for_build sed 's/^ //' << EOF >$dummy.c #define _HPUX_SOURCE #include #include int main () { #if defined(_SC_KERNEL_BITS) long bits = sysconf(_SC_KERNEL_BITS); #endif long cpu = sysconf (_SC_CPU_VERSION); switch (cpu) { case CPU_PA_RISC1_0: puts ("hppa1.0"); break; case CPU_PA_RISC1_1: puts ("hppa1.1"); break; case CPU_PA_RISC2_0: #if defined(_SC_KERNEL_BITS) switch (bits) { case 64: puts ("hppa2.0w"); break; case 32: puts ("hppa2.0n"); break; default: puts ("hppa2.0"); break; } break; #else /* !defined(_SC_KERNEL_BITS) */ puts ("hppa2.0"); break; #endif default: puts ("hppa1.0"); break; } exit (0); } EOF (CCOPTS= $CC_FOR_BUILD -o $dummy $dummy.c 2>/dev/null) && HP_ARCH=`$dummy` test -z "$HP_ARCH" && HP_ARCH=hppa fi ;; esac if [ ${HP_ARCH} = "hppa2.0w" ] then eval $set_cc_for_build # hppa2.0w-hp-hpux* has a 64-bit kernel and a compiler generating # 32-bit code. hppa64-hp-hpux* has the same kernel and a compiler # generating 64-bit code. 
GNU and HP use different nomenclature: # # $ CC_FOR_BUILD=cc ./config.guess # => hppa2.0w-hp-hpux11.23 # $ CC_FOR_BUILD="cc +DA2.0w" ./config.guess # => hppa64-hp-hpux11.23 if echo __LP64__ | (CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) | grep -q __LP64__ then HP_ARCH="hppa2.0w" else HP_ARCH="hppa64" fi fi echo ${HP_ARCH}-hp-hpux${HPUX_REV} exit ;; ia64:HP-UX:*:*) HPUX_REV=`echo ${UNAME_RELEASE}|sed -e 's/[^.]*.[0B]*//'` echo ia64-hp-hpux${HPUX_REV} exit ;; 3050*:HI-UX:*:*) eval $set_cc_for_build sed 's/^ //' << EOF >$dummy.c #include int main () { long cpu = sysconf (_SC_CPU_VERSION); /* The order matters, because CPU_IS_HP_MC68K erroneously returns true for CPU_PA_RISC1_0. CPU_IS_PA_RISC returns correct results, however. */ if (CPU_IS_PA_RISC (cpu)) { switch (cpu) { case CPU_PA_RISC1_0: puts ("hppa1.0-hitachi-hiuxwe2"); break; case CPU_PA_RISC1_1: puts ("hppa1.1-hitachi-hiuxwe2"); break; case CPU_PA_RISC2_0: puts ("hppa2.0-hitachi-hiuxwe2"); break; default: puts ("hppa-hitachi-hiuxwe2"); break; } } else if (CPU_IS_HP_MC68K (cpu)) puts ("m68k-hitachi-hiuxwe2"); else puts ("unknown-hitachi-hiuxwe2"); exit (0); } EOF $CC_FOR_BUILD -o $dummy $dummy.c && SYSTEM_NAME=`$dummy` && { echo "$SYSTEM_NAME"; exit; } echo unknown-hitachi-hiuxwe2 exit ;; 9000/7??:4.3bsd:*:* | 9000/8?[79]:4.3bsd:*:* ) echo hppa1.1-hp-bsd exit ;; 9000/8??:4.3bsd:*:*) echo hppa1.0-hp-bsd exit ;; *9??*:MPE/iX:*:* | *3000*:MPE/iX:*:*) echo hppa1.0-hp-mpeix exit ;; hp7??:OSF1:*:* | hp8?[79]:OSF1:*:* ) echo hppa1.1-hp-osf exit ;; hp8??:OSF1:*:*) echo hppa1.0-hp-osf exit ;; i*86:OSF1:*:*) if [ -x /usr/sbin/sysversion ] ; then echo ${UNAME_MACHINE}-unknown-osf1mk else echo ${UNAME_MACHINE}-unknown-osf1 fi exit ;; parisc*:Lites*:*:*) echo hppa1.1-hp-lites exit ;; C1*:ConvexOS:*:* | convex:ConvexOS:C1*:*) echo c1-convex-bsd exit ;; C2*:ConvexOS:*:* | convex:ConvexOS:C2*:*) if getsysinfo -f scalar_acc then echo c32-convex-bsd else echo c2-convex-bsd fi exit ;; C34*:ConvexOS:*:* | convex:ConvexOS:C34*:*) echo 
c34-convex-bsd exit ;; C38*:ConvexOS:*:* | convex:ConvexOS:C38*:*) echo c38-convex-bsd exit ;; C4*:ConvexOS:*:* | convex:ConvexOS:C4*:*) echo c4-convex-bsd exit ;; CRAY*Y-MP:*:*:*) echo ymp-cray-unicos${UNAME_RELEASE} | sed -e 's/\.[^.]*$/.X/' exit ;; CRAY*[A-Z]90:*:*:*) echo ${UNAME_MACHINE}-cray-unicos${UNAME_RELEASE} \ | sed -e 's/CRAY.*\([A-Z]90\)/\1/' \ -e y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/ \ -e 's/\.[^.]*$/.X/' exit ;; CRAY*TS:*:*:*) echo t90-cray-unicos${UNAME_RELEASE} | sed -e 's/\.[^.]*$/.X/' exit ;; CRAY*T3E:*:*:*) echo alphaev5-cray-unicosmk${UNAME_RELEASE} | sed -e 's/\.[^.]*$/.X/' exit ;; CRAY*SV1:*:*:*) echo sv1-cray-unicos${UNAME_RELEASE} | sed -e 's/\.[^.]*$/.X/' exit ;; *:UNICOS/mp:*:*) echo craynv-cray-unicosmp${UNAME_RELEASE} | sed -e 's/\.[^.]*$/.X/' exit ;; F30[01]:UNIX_System_V:*:* | F700:UNIX_System_V:*:*) FUJITSU_PROC=`uname -m | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 'abcdefghijklmnopqrstuvwxyz'` FUJITSU_SYS=`uname -p | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 'abcdefghijklmnopqrstuvwxyz' | sed -e 's/\///'` FUJITSU_REL=`echo ${UNAME_RELEASE} | sed -e 's/ /_/'` echo "${FUJITSU_PROC}-fujitsu-${FUJITSU_SYS}${FUJITSU_REL}" exit ;; 5000:UNIX_System_V:4.*:*) FUJITSU_SYS=`uname -p | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 'abcdefghijklmnopqrstuvwxyz' | sed -e 's/\///'` FUJITSU_REL=`echo ${UNAME_RELEASE} | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 'abcdefghijklmnopqrstuvwxyz' | sed -e 's/ /_/'` echo "sparc-fujitsu-${FUJITSU_SYS}${FUJITSU_REL}" exit ;; i*86:BSD/386:*:* | i*86:BSD/OS:*:* | *:Ascend\ Embedded/OS:*:*) echo ${UNAME_MACHINE}-pc-bsdi${UNAME_RELEASE} exit ;; sparc*:BSD/OS:*:*) echo sparc-unknown-bsdi${UNAME_RELEASE} exit ;; *:BSD/OS:*:*) echo ${UNAME_MACHINE}-unknown-bsdi${UNAME_RELEASE} exit ;; *:FreeBSD:*:*) case ${UNAME_MACHINE} in pc98) echo i386-unknown-freebsd`echo ${UNAME_RELEASE}|sed -e 's/[-(].*//'` ;; amd64) echo x86_64-unknown-freebsd`echo ${UNAME_RELEASE}|sed -e 's/[-(].*//'` ;; *) echo ${UNAME_MACHINE}-unknown-freebsd`echo 
${UNAME_RELEASE}|sed -e 's/[-(].*//'` ;; esac exit ;; i*:CYGWIN*:*) echo ${UNAME_MACHINE}-pc-cygwin exit ;; *:MINGW*:*) echo ${UNAME_MACHINE}-pc-mingw32 exit ;; i*:windows32*:*) # uname -m includes "-pc" on this system. echo ${UNAME_MACHINE}-mingw32 exit ;; i*:PW*:*) echo ${UNAME_MACHINE}-pc-pw32 exit ;; *:Interix*:*) case ${UNAME_MACHINE} in x86) echo i586-pc-interix${UNAME_RELEASE} exit ;; authenticamd | genuineintel | EM64T) echo x86_64-unknown-interix${UNAME_RELEASE} exit ;; IA64) echo ia64-unknown-interix${UNAME_RELEASE} exit ;; esac ;; [345]86:Windows_95:* | [345]86:Windows_98:* | [345]86:Windows_NT:*) echo i${UNAME_MACHINE}-pc-mks exit ;; 8664:Windows_NT:*) echo x86_64-pc-mks exit ;; i*:Windows_NT*:* | Pentium*:Windows_NT*:*) # How do we know it's Interix rather than the generic POSIX subsystem? # It also conflicts with pre-2.0 versions of AT&T UWIN. Should we # UNAME_MACHINE based on the output of uname instead of i386? echo i586-pc-interix exit ;; i*:UWIN*:*) echo ${UNAME_MACHINE}-pc-uwin exit ;; amd64:CYGWIN*:*:* | x86_64:CYGWIN*:*:*) echo x86_64-unknown-cygwin exit ;; p*:CYGWIN*:*) echo powerpcle-unknown-cygwin exit ;; prep*:SunOS:5.*:*) echo powerpcle-unknown-solaris2`echo ${UNAME_RELEASE}|sed -e 's/[^.]*//'` exit ;; *:GNU:*:*) # the GNU system echo `echo ${UNAME_MACHINE}|sed -e 's,[-/].*$,,'`-unknown-gnu`echo ${UNAME_RELEASE}|sed -e 's,/.*$,,'` exit ;; *:GNU/*:*:*) # other systems with GNU libc and userland echo ${UNAME_MACHINE}-unknown-`echo ${UNAME_SYSTEM} | sed 's,^[^/]*/,,' | tr '[A-Z]' '[a-z]'``echo ${UNAME_RELEASE}|sed -e 's/[-(].*//'`-gnu exit ;; i*86:Minix:*:*) echo ${UNAME_MACHINE}-pc-minix exit ;; alpha:Linux:*:*) case `sed -n '/^cpu model/s/^.*: \(.*\)/\1/p' < /proc/cpuinfo` in EV5) UNAME_MACHINE=alphaev5 ;; EV56) UNAME_MACHINE=alphaev56 ;; PCA56) UNAME_MACHINE=alphapca56 ;; PCA57) UNAME_MACHINE=alphapca56 ;; EV6) UNAME_MACHINE=alphaev6 ;; EV67) UNAME_MACHINE=alphaev67 ;; EV68*) UNAME_MACHINE=alphaev68 ;; esac objdump --private-headers 
/bin/sh | grep -q ld.so.1 if test "$?" = 0 ; then LIBC="libc1" ; else LIBC="" ; fi echo ${UNAME_MACHINE}-unknown-linux-gnu${LIBC} exit ;; arm*:Linux:*:*) eval $set_cc_for_build if echo __ARM_EABI__ | $CC_FOR_BUILD -E - 2>/dev/null \ | grep -q __ARM_EABI__ then echo ${UNAME_MACHINE}-unknown-linux-gnu else echo ${UNAME_MACHINE}-unknown-linux-gnueabi fi exit ;; avr32*:Linux:*:*) echo ${UNAME_MACHINE}-unknown-linux-gnu exit ;; cris:Linux:*:*) echo cris-axis-linux-gnu exit ;; crisv32:Linux:*:*) echo crisv32-axis-linux-gnu exit ;; frv:Linux:*:*) echo frv-unknown-linux-gnu exit ;; i*86:Linux:*:*) LIBC=gnu eval $set_cc_for_build sed 's/^ //' << EOF >$dummy.c #ifdef __dietlibc__ LIBC=dietlibc #endif EOF eval `$CC_FOR_BUILD -E $dummy.c 2>/dev/null | grep '^LIBC'` echo "${UNAME_MACHINE}-pc-linux-${LIBC}" exit ;; ia64:Linux:*:*) echo ${UNAME_MACHINE}-unknown-linux-gnu exit ;; m32r*:Linux:*:*) echo ${UNAME_MACHINE}-unknown-linux-gnu exit ;; m68*:Linux:*:*) echo ${UNAME_MACHINE}-unknown-linux-gnu exit ;; mips:Linux:*:* | mips64:Linux:*:*) eval $set_cc_for_build sed 's/^ //' << EOF >$dummy.c #undef CPU #undef ${UNAME_MACHINE} #undef ${UNAME_MACHINE}el #if defined(__MIPSEL__) || defined(__MIPSEL) || defined(_MIPSEL) || defined(MIPSEL) CPU=${UNAME_MACHINE}el #else #if defined(__MIPSEB__) || defined(__MIPSEB) || defined(_MIPSEB) || defined(MIPSEB) CPU=${UNAME_MACHINE} #else CPU= #endif #endif EOF eval `$CC_FOR_BUILD -E $dummy.c 2>/dev/null | grep '^CPU'` test x"${CPU}" != x && { echo "${CPU}-unknown-linux-gnu"; exit; } ;; or32:Linux:*:*) echo or32-unknown-linux-gnu exit ;; padre:Linux:*:*) echo sparc-unknown-linux-gnu exit ;; parisc64:Linux:*:* | hppa64:Linux:*:*) echo hppa64-unknown-linux-gnu exit ;; parisc:Linux:*:* | hppa:Linux:*:*) # Look for CPU level case `grep '^cpu[^a-z]*:' /proc/cpuinfo 2>/dev/null | cut -d' ' -f2` in PA7*) echo hppa1.1-unknown-linux-gnu ;; PA8*) echo hppa2.0-unknown-linux-gnu ;; *) echo hppa-unknown-linux-gnu ;; esac exit ;; ppc64:Linux:*:*) echo 
powerpc64-unknown-linux-gnu exit ;; ppc:Linux:*:*) echo powerpc-unknown-linux-gnu exit ;; s390:Linux:*:* | s390x:Linux:*:*) echo ${UNAME_MACHINE}-ibm-linux exit ;; sh64*:Linux:*:*) echo ${UNAME_MACHINE}-unknown-linux-gnu exit ;; sh*:Linux:*:*) echo ${UNAME_MACHINE}-unknown-linux-gnu exit ;; sparc:Linux:*:* | sparc64:Linux:*:*) echo ${UNAME_MACHINE}-unknown-linux-gnu exit ;; tile*:Linux:*:*) echo ${UNAME_MACHINE}-tilera-linux-gnu exit ;; vax:Linux:*:*) echo ${UNAME_MACHINE}-dec-linux-gnu exit ;; x86_64:Linux:*:*) echo x86_64-unknown-linux-gnu exit ;; xtensa*:Linux:*:*) echo ${UNAME_MACHINE}-unknown-linux-gnu exit ;; i*86:DYNIX/ptx:4*:*) # ptx 4.0 does uname -s correctly, with DYNIX/ptx in there. # earlier versions are messed up and put the nodename in both # sysname and nodename. echo i386-sequent-sysv4 exit ;; i*86:UNIX_SV:4.2MP:2.*) # Unixware is an offshoot of SVR4, but it has its own version # number series starting with 2... # I am not positive that other SVR4 systems won't match this, # I just have to hope. -- rms. # Use sysv4.2uw... so that sysv4* matches it. echo ${UNAME_MACHINE}-pc-sysv4.2uw${UNAME_VERSION} exit ;; i*86:OS/2:*:*) # If we were able to find `uname', then EMX Unix compatibility # is probably installed. echo ${UNAME_MACHINE}-pc-os2-emx exit ;; i*86:XTS-300:*:STOP) echo ${UNAME_MACHINE}-unknown-stop exit ;; i*86:atheos:*:*) echo ${UNAME_MACHINE}-unknown-atheos exit ;; i*86:syllable:*:*) echo ${UNAME_MACHINE}-pc-syllable exit ;; i*86:LynxOS:2.*:* | i*86:LynxOS:3.[01]*:* | i*86:LynxOS:4.[02]*:*) echo i386-unknown-lynxos${UNAME_RELEASE} exit ;; i*86:*DOS:*:*) echo ${UNAME_MACHINE}-pc-msdosdjgpp exit ;; i*86:*:4.*:* | i*86:SYSTEM_V:4.*:*) UNAME_REL=`echo ${UNAME_RELEASE} | sed 's/\/MP$//'` if grep Novell /usr/include/link.h >/dev/null 2>/dev/null; then echo ${UNAME_MACHINE}-univel-sysv${UNAME_REL} else echo ${UNAME_MACHINE}-pc-sysv${UNAME_REL} fi exit ;; i*86:*:5:[678]*) # UnixWare 7.x, OpenUNIX and OpenServer 6. 
case `/bin/uname -X | grep "^Machine"` in *486*) UNAME_MACHINE=i486 ;; *Pentium) UNAME_MACHINE=i586 ;; *Pent*|*Celeron) UNAME_MACHINE=i686 ;; esac echo ${UNAME_MACHINE}-unknown-sysv${UNAME_RELEASE}${UNAME_SYSTEM}${UNAME_VERSION} exit ;; i*86:*:3.2:*) if test -f /usr/options/cb.name; then UNAME_REL=`sed -n 's/.*Version //p' /dev/null >/dev/null ; then UNAME_REL=`(/bin/uname -X|grep Release|sed -e 's/.*= //')` (/bin/uname -X|grep i80486 >/dev/null) && UNAME_MACHINE=i486 (/bin/uname -X|grep '^Machine.*Pentium' >/dev/null) \ && UNAME_MACHINE=i586 (/bin/uname -X|grep '^Machine.*Pent *II' >/dev/null) \ && UNAME_MACHINE=i686 (/bin/uname -X|grep '^Machine.*Pentium Pro' >/dev/null) \ && UNAME_MACHINE=i686 echo ${UNAME_MACHINE}-pc-sco$UNAME_REL else echo ${UNAME_MACHINE}-pc-sysv32 fi exit ;; pc:*:*:*) # Left here for compatibility: # uname -m prints for DJGPP always 'pc', but it prints nothing about # the processor, so we play safe by assuming i586. # Note: whatever this is, it MUST be the same as what config.sub # prints for the "djgpp" host, or else GDB configury will decide that # this is a cross-build. echo i586-pc-msdosdjgpp exit ;; Intel:Mach:3*:*) echo i386-pc-mach3 exit ;; paragon:*:*:*) echo i860-intel-osf1 exit ;; i860:*:4.*:*) # i860-SVR4 if grep Stardent /usr/include/sys/uadmin.h >/dev/null 2>&1 ; then echo i860-stardent-sysv${UNAME_RELEASE} # Stardent Vistra i860-SVR4 else # Add other i860-SVR4 vendors below as they are discovered. 
echo i860-unknown-sysv${UNAME_RELEASE} # Unknown i860-SVR4 fi exit ;; mini*:CTIX:SYS*5:*) # "miniframe" echo m68010-convergent-sysv exit ;; mc68k:UNIX:SYSTEM5:3.51m) echo m68k-convergent-sysv exit ;; M680?0:D-NIX:5.3:*) echo m68k-diab-dnix exit ;; M68*:*:R3V[5678]*:*) test -r /sysV68 && { echo 'm68k-motorola-sysv'; exit; } ;; 3[345]??:*:4.0:3.0 | 3[34]??A:*:4.0:3.0 | 3[34]??,*:*:4.0:3.0 | 3[34]??/*:*:4.0:3.0 | 4400:*:4.0:3.0 | 4850:*:4.0:3.0 | SKA40:*:4.0:3.0 | SDS2:*:4.0:3.0 | SHG2:*:4.0:3.0 | S7501*:*:4.0:3.0) OS_REL='' test -r /etc/.relid \ && OS_REL=.`sed -n 's/[^ ]* [^ ]* \([0-9][0-9]\).*/\1/p' < /etc/.relid` /bin/uname -p 2>/dev/null | grep 86 >/dev/null \ && { echo i486-ncr-sysv4.3${OS_REL}; exit; } /bin/uname -p 2>/dev/null | /bin/grep entium >/dev/null \ && { echo i586-ncr-sysv4.3${OS_REL}; exit; } ;; 3[34]??:*:4.0:* | 3[34]??,*:*:4.0:*) /bin/uname -p 2>/dev/null | grep 86 >/dev/null \ && { echo i486-ncr-sysv4; exit; } ;; NCR*:*:4.2:* | MPRAS*:*:4.2:*) OS_REL='.3' test -r /etc/.relid \ && OS_REL=.`sed -n 's/[^ ]* [^ ]* \([0-9][0-9]\).*/\1/p' < /etc/.relid` /bin/uname -p 2>/dev/null | grep 86 >/dev/null \ && { echo i486-ncr-sysv4.3${OS_REL}; exit; } /bin/uname -p 2>/dev/null | /bin/grep entium >/dev/null \ && { echo i586-ncr-sysv4.3${OS_REL}; exit; } /bin/uname -p 2>/dev/null | /bin/grep pteron >/dev/null \ && { echo i586-ncr-sysv4.3${OS_REL}; exit; } ;; m68*:LynxOS:2.*:* | m68*:LynxOS:3.0*:*) echo m68k-unknown-lynxos${UNAME_RELEASE} exit ;; mc68030:UNIX_System_V:4.*:*) echo m68k-atari-sysv4 exit ;; TSUNAMI:LynxOS:2.*:*) echo sparc-unknown-lynxos${UNAME_RELEASE} exit ;; rs6000:LynxOS:2.*:*) echo rs6000-unknown-lynxos${UNAME_RELEASE} exit ;; PowerPC:LynxOS:2.*:* | PowerPC:LynxOS:3.[01]*:* | PowerPC:LynxOS:4.[02]*:*) echo powerpc-unknown-lynxos${UNAME_RELEASE} exit ;; SM[BE]S:UNIX_SV:*:*) echo mips-dde-sysv${UNAME_RELEASE} exit ;; RM*:ReliantUNIX-*:*:*) echo mips-sni-sysv4 exit ;; RM*:SINIX-*:*:*) echo mips-sni-sysv4 exit ;; *:SINIX-*:*:*) if uname -p 
2>/dev/null >/dev/null ; then UNAME_MACHINE=`(uname -p) 2>/dev/null` echo ${UNAME_MACHINE}-sni-sysv4 else echo ns32k-sni-sysv fi exit ;; PENTIUM:*:4.0*:*) # Unisys `ClearPath HMP IX 4000' SVR4/MP effort # says echo i586-unisys-sysv4 exit ;; *:UNIX_System_V:4*:FTX*) # From Gerald Hewes . # How about differentiating between stratus architectures? -djm echo hppa1.1-stratus-sysv4 exit ;; *:*:*:FTX*) # From seanf@swdc.stratus.com. echo i860-stratus-sysv4 exit ;; i*86:VOS:*:*) # From Paul.Green@stratus.com. echo ${UNAME_MACHINE}-stratus-vos exit ;; *:VOS:*:*) # From Paul.Green@stratus.com. echo hppa1.1-stratus-vos exit ;; mc68*:A/UX:*:*) echo m68k-apple-aux${UNAME_RELEASE} exit ;; news*:NEWS-OS:6*:*) echo mips-sony-newsos6 exit ;; R[34]000:*System_V*:*:* | R4000:UNIX_SYSV:*:* | R*000:UNIX_SV:*:*) if [ -d /usr/nec ]; then echo mips-nec-sysv${UNAME_RELEASE} else echo mips-unknown-sysv${UNAME_RELEASE} fi exit ;; BeBox:BeOS:*:*) # BeOS running on hardware made by Be, PPC only. echo powerpc-be-beos exit ;; BeMac:BeOS:*:*) # BeOS running on Mac or Mac clone, PPC only. echo powerpc-apple-beos exit ;; BePC:BeOS:*:*) # BeOS running on Intel PC compatible. echo i586-pc-beos exit ;; BePC:Haiku:*:*) # Haiku running on Intel PC compatible. 
echo i586-pc-haiku exit ;; SX-4:SUPER-UX:*:*) echo sx4-nec-superux${UNAME_RELEASE} exit ;; SX-5:SUPER-UX:*:*) echo sx5-nec-superux${UNAME_RELEASE} exit ;; SX-6:SUPER-UX:*:*) echo sx6-nec-superux${UNAME_RELEASE} exit ;; SX-7:SUPER-UX:*:*) echo sx7-nec-superux${UNAME_RELEASE} exit ;; SX-8:SUPER-UX:*:*) echo sx8-nec-superux${UNAME_RELEASE} exit ;; SX-8R:SUPER-UX:*:*) echo sx8r-nec-superux${UNAME_RELEASE} exit ;; Power*:Rhapsody:*:*) echo powerpc-apple-rhapsody${UNAME_RELEASE} exit ;; *:Rhapsody:*:*) echo ${UNAME_MACHINE}-apple-rhapsody${UNAME_RELEASE} exit ;; *:Darwin:*:*) UNAME_PROCESSOR=`uname -p` || UNAME_PROCESSOR=unknown case $UNAME_PROCESSOR in i386) eval $set_cc_for_build if [ "$CC_FOR_BUILD" != 'no_compiler_found' ]; then if (echo '#ifdef __LP64__'; echo IS_64BIT_ARCH; echo '#endif') | \ (CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) | \ grep IS_64BIT_ARCH >/dev/null then UNAME_PROCESSOR="x86_64" fi fi ;; unknown) UNAME_PROCESSOR=powerpc ;; esac echo ${UNAME_PROCESSOR}-apple-darwin${UNAME_RELEASE} exit ;; *:procnto*:*:* | *:QNX:[0123456789]*:*) UNAME_PROCESSOR=`uname -p` if test "$UNAME_PROCESSOR" = "x86"; then UNAME_PROCESSOR=i386 UNAME_MACHINE=pc fi echo ${UNAME_PROCESSOR}-${UNAME_MACHINE}-nto-qnx${UNAME_RELEASE} exit ;; *:QNX:*:4*) echo i386-pc-qnx exit ;; NSE-?:NONSTOP_KERNEL:*:*) echo nse-tandem-nsk${UNAME_RELEASE} exit ;; NSR-?:NONSTOP_KERNEL:*:*) echo nsr-tandem-nsk${UNAME_RELEASE} exit ;; *:NonStop-UX:*:*) echo mips-compaq-nonstopux exit ;; BS2000:POSIX*:*:*) echo bs2000-siemens-sysv exit ;; DS/*:UNIX_System_V:*:*) echo ${UNAME_MACHINE}-${UNAME_SYSTEM}-${UNAME_RELEASE} exit ;; *:Plan9:*:*) # "uname -m" is not consistent, so use $cputype instead. 386 # is converted to i386 for consistency with other x86 # operating systems. 
if test "$cputype" = "386"; then UNAME_MACHINE=i386 else UNAME_MACHINE="$cputype" fi echo ${UNAME_MACHINE}-unknown-plan9 exit ;; *:TOPS-10:*:*) echo pdp10-unknown-tops10 exit ;; *:TENEX:*:*) echo pdp10-unknown-tenex exit ;; KS10:TOPS-20:*:* | KL10:TOPS-20:*:* | TYPE4:TOPS-20:*:*) echo pdp10-dec-tops20 exit ;; XKL-1:TOPS-20:*:* | TYPE5:TOPS-20:*:*) echo pdp10-xkl-tops20 exit ;; *:TOPS-20:*:*) echo pdp10-unknown-tops20 exit ;; *:ITS:*:*) echo pdp10-unknown-its exit ;; SEI:*:*:SEIUX) echo mips-sei-seiux${UNAME_RELEASE} exit ;; *:DragonFly:*:*) echo ${UNAME_MACHINE}-unknown-dragonfly`echo ${UNAME_RELEASE}|sed -e 's/[-(].*//'` exit ;; *:*VMS:*:*) UNAME_MACHINE=`(uname -p) 2>/dev/null` case "${UNAME_MACHINE}" in A*) echo alpha-dec-vms ; exit ;; I*) echo ia64-dec-vms ; exit ;; V*) echo vax-dec-vms ; exit ;; esac ;; *:XENIX:*:SysV) echo i386-pc-xenix exit ;; i*86:skyos:*:*) echo ${UNAME_MACHINE}-pc-skyos`echo ${UNAME_RELEASE}` | sed -e 's/ .*$//' exit ;; i*86:rdos:*:*) echo ${UNAME_MACHINE}-pc-rdos exit ;; i*86:AROS:*:*) echo ${UNAME_MACHINE}-pc-aros exit ;; esac #echo '(No uname command or uname output not recognized.)' 1>&2 #echo "${UNAME_MACHINE}:${UNAME_SYSTEM}:${UNAME_RELEASE}:${UNAME_VERSION}" 1>&2 eval $set_cc_for_build cat >$dummy.c <<EOF #ifdef _SEQUENT_ # include <sys/types.h> # include <sys/utsname.h> #endif main () { #if defined (sony) #if defined (MIPSEB) /* BFD wants "bsd" instead of "newsos". Perhaps BFD should be changed, I don't know.... 
*/ printf ("mips-sony-bsd\n"); exit (0); #else #include <sys/param.h> printf ("m68k-sony-newsos%s\n", #ifdef NEWSOS4 "4" #else "" #endif ); exit (0); #endif #endif #if defined (__arm) && defined (__acorn) && defined (__unix) printf ("arm-acorn-riscix\n"); exit (0); #endif #if defined (hp300) && !defined (hpux) printf ("m68k-hp-bsd\n"); exit (0); #endif #if defined (NeXT) #if !defined (__ARCHITECTURE__) #define __ARCHITECTURE__ "m68k" #endif int version; version=`(hostinfo | sed -n 's/.*NeXT Mach \([0-9]*\).*/\1/p') 2>/dev/null`; if (version < 4) printf ("%s-next-nextstep%d\n", __ARCHITECTURE__, version); else printf ("%s-next-openstep%d\n", __ARCHITECTURE__, version); exit (0); #endif #if defined (MULTIMAX) || defined (n16) #if defined (UMAXV) printf ("ns32k-encore-sysv\n"); exit (0); #else #if defined (CMU) printf ("ns32k-encore-mach\n"); exit (0); #else printf ("ns32k-encore-bsd\n"); exit (0); #endif #endif #endif #if defined (__386BSD__) printf ("i386-pc-bsd\n"); exit (0); #endif #if defined (sequent) #if defined (i386) printf ("i386-sequent-dynix\n"); exit (0); #endif #if defined (ns32000) printf ("ns32k-sequent-dynix\n"); exit (0); #endif #endif #if defined (_SEQUENT_) struct utsname un; uname(&un); if (strncmp(un.version, "V2", 2) == 0) { printf ("i386-sequent-ptx2\n"); exit (0); } if (strncmp(un.version, "V1", 2) == 0) { /* XXX is V1 correct? 
*/ printf ("i386-sequent-ptx1\n"); exit (0); } printf ("i386-sequent-ptx\n"); exit (0); #endif #if defined (vax) # if !defined (ultrix) # include # if defined (BSD) # if BSD == 43 printf ("vax-dec-bsd4.3\n"); exit (0); # else # if BSD == 199006 printf ("vax-dec-bsd4.3reno\n"); exit (0); # else printf ("vax-dec-bsd\n"); exit (0); # endif # endif # else printf ("vax-dec-bsd\n"); exit (0); # endif # else printf ("vax-dec-ultrix\n"); exit (0); # endif #endif #if defined (alliant) && defined (i860) printf ("i860-alliant-bsd\n"); exit (0); #endif exit (1); } EOF $CC_FOR_BUILD -o $dummy $dummy.c 2>/dev/null && SYSTEM_NAME=`$dummy` && { echo "$SYSTEM_NAME"; exit; } # Apollos put the system type in the environment. test -d /usr/apollo && { echo ${ISP}-apollo-${SYSTYPE}; exit; } # Convex versions that predate uname can use getsysinfo(1) if [ -x /usr/convex/getsysinfo ] then case `getsysinfo -f cpu_type` in c1*) echo c1-convex-bsd exit ;; c2*) if getsysinfo -f scalar_acc then echo c32-convex-bsd else echo c2-convex-bsd fi exit ;; c34*) echo c34-convex-bsd exit ;; c38*) echo c38-convex-bsd exit ;; c4*) echo c4-convex-bsd exit ;; esac fi cat >&2 < in order to provide the needed information to handle your system. 
config.guess timestamp = $timestamp uname -m = `(uname -m) 2>/dev/null || echo unknown` uname -r = `(uname -r) 2>/dev/null || echo unknown` uname -s = `(uname -s) 2>/dev/null || echo unknown` uname -v = `(uname -v) 2>/dev/null || echo unknown` /usr/bin/uname -p = `(/usr/bin/uname -p) 2>/dev/null` /bin/uname -X = `(/bin/uname -X) 2>/dev/null` hostinfo = `(hostinfo) 2>/dev/null` /bin/universe = `(/bin/universe) 2>/dev/null` /usr/bin/arch -k = `(/usr/bin/arch -k) 2>/dev/null` /bin/arch = `(/bin/arch) 2>/dev/null` /usr/bin/oslevel = `(/usr/bin/oslevel) 2>/dev/null` /usr/convex/getsysinfo = `(/usr/convex/getsysinfo) 2>/dev/null` UNAME_MACHINE = ${UNAME_MACHINE} UNAME_RELEASE = ${UNAME_RELEASE} UNAME_SYSTEM = ${UNAME_SYSTEM} UNAME_VERSION = ${UNAME_VERSION} EOF exit 1 # Local variables: # eval: (add-hook 'write-file-hooks 'time-stamp) # time-stamp-start: "timestamp='" # time-stamp-format: "%:y-%02m-%02d" # time-stamp-end: "'" # End: xvidcore/build/generic/platform.inc.in0000664000076500007650000000510411513040541021114 0ustar xvidbuildxvidbuild# ============================================================================ # # Do not edit this file manually, it is generated automatically by # the configure script # # See ./configure --help # # ============================================================================ # ============================================================================ # Activated features # ============================================================================ FEATURES=@FEATURES@ # ============================================================================ # Architecture dependant things # ============================================================================ ARCHITECTURE=-DARCH_IS_@ARCHITECTURE@ BUS=-DARCH_IS_@BUS@ ENDIANNESS=-DARCH_IS_@ENDIANNESS@ SHARED_EXTENSION=@SHARED_EXTENSION@ STATIC_EXTENSION=@STATIC_EXTENSION@ OBJECT_EXTENSION=@OBJECT_EXTENSION@ # 
============================================================================ # Compiler # ============================================================================ CC=@CC@ SPECIFIC_CFLAGS=@SPECIFIC_CFLAGS@ ALTIVEC_CFLAGS=@ALTIVEC_CFLAGS@ CFLAGS=@CFLAGS@ # ============================================================================ # Assembler # ============================================================================ AS=@AS@ AFLAGS=@AFLAGS@ ASSEMBLY_EXTENSION=@ASSEMBLY_EXTENSION@ # ============================================================================ # Linker # ============================================================================ SPECIFIC_LDFLAGS=@SPECIFIC_LDFLAGS@ API_MAJOR=@API_MAJOR@ API_MINOR=@API_MINOR@ RANLIB=@RANLIB@ AR=@AR@ # ============================================================================ # Installation # ============================================================================ INSTALL=@INSTALL@ DESTDIR= prefix=@prefix@ exec_prefix=@exec_prefix@ libdir=@libdir@ includedir=@includedir@ # ============================================================================ # Sources # ============================================================================ GENERIC_SOURCES=$(@GENERIC_SOURCES@) ASSEMBLY_SOURCES=$(@ASSEMBLY_SOURCES@) DCT_IA64_SOURCES=$(@DCT_IA64_SOURCES@) PPC_ALTIVEC_SOURCES=$(@PPC_ALTIVEC_SOURCES@) GENERIC_OBJECTS=$(@GENERIC_SOURCES@:.c=.@OBJECT_EXTENSION@) ASSEMBLY_OBJECTS=$(@ASSEMBLY_SOURCES@:.@ASSEMBLY_EXTENSION@=.@OBJECT_EXTENSION@) DCT_IA64_OBJECTS=$(@DCT_IA64_SOURCES@:.@ASSEMBLY_EXTENSION@=.@OBJECT_EXTENSION@) PPC_ALTIVEC_OBJECTS=$(@PPC_ALTIVEC_SOURCES@:.c=.@OBJECT_EXTENSION@) STATIC_LIB=@STATIC_LIB@ SHARED_LIB=@SHARED_LIB@ PRE_SHARED_LIB=@PRE_SHARED_LIB@ xvidcore/build/generic/sources.inc0000664000076500007650000000562511460771556020400 0ustar xvidbuildxvidbuildSRC_DIR = ../../src SRC_GENERIC = \ decoder.c \ encoder.c \ xvid.c \ bitstream/bitstream.c \ bitstream/cbp.c \ bitstream/mbcoding.c \ dct/fdct.c \ 
dct/idct.c \ dct/simple_idct.c \ image/colorspace.c \ image/image.c \ image/interpolate8x8.c \ image/font.c \ image/postprocessing.c \ image/qpel.c \ image/reduced.c \ motion/estimation_bvop.c \ motion/estimation_common.c \ motion/estimation_gmc.c \ motion/estimation_pvop.c \ motion/estimation_rd_based.c \ motion/estimation_rd_based_bvop.c \ motion/gmc.c \ motion/motion_comp.c \ motion/vop_type_decision.c \ motion/sad.c \ prediction/mbprediction.c \ plugins/plugin_single.c \ plugins/plugin_2pass1.c \ plugins/plugin_2pass2.c \ plugins/plugin_lumimasking.c \ plugins/plugin_dump.c \ plugins/plugin_psnr.c \ plugins/plugin_ssim.c \ plugins/plugin_psnrhvsm.c \ quant/quant_h263.c \ quant/quant_matrix.c \ quant/quant_mpeg.c \ utils/emms.c \ utils/mbtransquant.c \ utils/mem_align.c \ utils/mem_transfer.c \ utils/timer.c SRC_IA32 = \ bitstream/x86_asm/cbp_mmx.asm \ bitstream/x86_asm/cbp_sse2.asm \ dct/x86_asm/fdct_mmx_ffmpeg.asm \ dct/x86_asm/fdct_mmx_skal.asm \ dct/x86_asm/fdct_sse2_skal.asm \ dct/x86_asm/idct_3dne.asm \ dct/x86_asm/idct_mmx.asm \ dct/x86_asm/idct_sse2_dmitry.asm \ image/x86_asm/colorspace_rgb_mmx.asm \ image/x86_asm/colorspace_yuv_mmx.asm \ image/x86_asm/colorspace_yuyv_mmx.asm \ image/x86_asm/interpolate8x8_3dn.asm \ image/x86_asm/interpolate8x8_3dne.asm \ image/x86_asm/interpolate8x8_mmx.asm \ image/x86_asm/interpolate8x8_xmm.asm \ image/x86_asm/postprocessing_mmx.asm \ image/x86_asm/postprocessing_sse2.asm \ image/x86_asm/reduced_mmx.asm \ image/x86_asm/qpel_mmx.asm \ image/x86_asm/gmc_mmx.asm \ image/x86_asm/deintl_sse.asm \ motion/x86_asm/sad_xmm.asm \ motion/x86_asm/sad_sse2.asm \ motion/x86_asm/sad_mmx.asm \ motion/x86_asm/sad_3dne.asm \ motion/x86_asm/sad_3dn.asm \ quant/x86_asm/quantize_h263_mmx.asm \ quant/x86_asm/quantize_h263_3dne.asm \ quant/x86_asm/quantize_mpeg_xmm.asm \ quant/x86_asm/quantize_mpeg_mmx.asm \ utils/x86_asm/mem_transfer_mmx.asm \ utils/x86_asm/mem_transfer_3dne.asm \ utils/x86_asm/interlacing_mmx.asm \ utils/x86_asm/cpuid.asm 
\ plugins/x86_asm/plugin_ssim-a.asm SRC_X86_64 = $(SRC_IA32) SRC_IA64 = \ dct/ia64_asm/fdct_ia64.s \ image/ia64_asm/interpolate8x8_ia64.s \ motion/ia64_asm/sad_ia64.s \ motion/ia64_asm/halfpel8_refine_ia64.s \ quant/ia64_asm/quant_h263_ia64.s \ utils/ia64_asm/mem_transfer_ia64.s SRC_IA64_IDCT_GCC = \ dct/ia64_asm/idct_ia64_gcc.s SRC_IA64_IDCT_ECC = \ dct/ia64_asm/idct_ia64_ecc.s SRC_PPC_ALTIVEC = \ utils/ppc_asm/altivec_trigger.c \ utils/ppc_asm/mem_transfer_altivec.c \ motion/ppc_asm/sad_altivec.c \ dct/ppc_asm/idct_altivec.c \ image/ppc_asm/interpolate8x8_altivec.c \ image/ppc_asm/colorspace_altivec.c \ image/ppc_asm/qpel_altivec.c \ quant/ppc_asm/quant_h263_altivec.c \ quant/ppc_asm/quant_mpeg_altivec.c xvidcore/build/generic/missing0000755000076500007650000002623311566433046017611 0ustar xvidbuildxvidbuild#! /bin/sh # Common stub for a few missing GNU programs while installing. scriptversion=2009-04-28.21; # UTC # Copyright (C) 1996, 1997, 1999, 2000, 2002, 2003, 2004, 2005, 2006, # 2008, 2009 Free Software Foundation, Inc. # Originally by Fran,cois Pinard , 1996. # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2, or (at your option) # any later version. # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # You should have received a copy of the GNU General Public License # along with this program. If not, see . # As a special exception to the GNU General Public License, if you # distribute this file as part of a program that contains a # configuration script generated by Autoconf, you may include it under # the same distribution terms that you use for the rest of that program. 
if test $# -eq 0; then echo 1>&2 "Try \`$0 --help' for more information" exit 1 fi run=: sed_output='s/.* --output[ =]\([^ ]*\).*/\1/p' sed_minuso='s/.* -o \([^ ]*\).*/\1/p' # In the cases where this matters, `missing' is being run in the # srcdir already. if test -f configure.ac; then configure_ac=configure.ac else configure_ac=configure.in fi msg="missing on your system" case $1 in --run) # Try to run requested program, and just exit if it succeeds. run= shift "$@" && exit 0 # Exit code 63 means version mismatch. This often happens # when the user tries to use an ancient version of a tool on # a file that requires a minimum version. In this case we # should proceed as if the program had been absent, or # if --run hadn't been passed. if test $? = 63; then run=: msg="probably too old" fi ;; -h|--h|--he|--hel|--help) echo "\ $0 [OPTION]... PROGRAM [ARGUMENT]... Handle \`PROGRAM [ARGUMENT]...' for when PROGRAM is missing, or return an error status if there is no known handling for PROGRAM. Options: -h, --help display this help and exit -v, --version output version information and exit --run try to run the given command, and emulate it if it fails Supported PROGRAM values: aclocal touch file \`aclocal.m4' autoconf touch file \`configure' autoheader touch file \`config.h.in' autom4te touch the output file, or create a stub one automake touch all \`Makefile.in' files bison create \`y.tab.[ch]', if possible, from existing .[ch] flex create \`lex.yy.c', if possible, from existing .c help2man touch the output file lex create \`lex.yy.c', if possible, from existing .c makeinfo touch the output file tar try tar, gnutar, gtar, then tar without non-portable flags yacc create \`y.tab.[ch]', if possible, from existing .[ch] Version suffixes to PROGRAM as well as the prefixes \`gnu-', \`gnu', and \`g' are ignored when checking the name. Send bug reports to ." exit $? ;; -v|--v|--ve|--ver|--vers|--versi|--versio|--version) echo "missing $scriptversion (GNU Automake)" exit $? 
;; -*) echo 1>&2 "$0: Unknown \`$1' option" echo 1>&2 "Try \`$0 --help' for more information" exit 1 ;; esac # normalize program name to check for. program=`echo "$1" | sed ' s/^gnu-//; t s/^gnu//; t s/^g//; t'` # Now exit if we have it, but it failed. Also exit now if we # don't have it and --version was passed (most likely to detect # the program). This is about non-GNU programs, so use $1 not # $program. case $1 in lex*|yacc*) # Not GNU programs, they don't have --version. ;; tar*) if test -n "$run"; then echo 1>&2 "ERROR: \`tar' requires --run" exit 1 elif test "x$2" = "x--version" || test "x$2" = "x--help"; then exit 1 fi ;; *) if test -z "$run" && ($1 --version) > /dev/null 2>&1; then # We have it, but it failed. exit 1 elif test "x$2" = "x--version" || test "x$2" = "x--help"; then # Could not run --version or --help. This is probably someone # running `$TOOL --version' or `$TOOL --help' to check whether # $TOOL exists and not knowing $TOOL uses missing. exit 1 fi ;; esac # If it does not exist, or fails to run (possibly an outdated version), # try to emulate it. case $program in aclocal*) echo 1>&2 "\ WARNING: \`$1' is $msg. You should only need it if you modified \`acinclude.m4' or \`${configure_ac}'. You might want to install the \`Automake' and \`Perl' packages. Grab them from any GNU archive site." touch aclocal.m4 ;; autoconf*) echo 1>&2 "\ WARNING: \`$1' is $msg. You should only need it if you modified \`${configure_ac}'. You might want to install the \`Autoconf' and \`GNU m4' packages. Grab them from any GNU archive site." touch configure ;; autoheader*) echo 1>&2 "\ WARNING: \`$1' is $msg. You should only need it if you modified \`acconfig.h' or \`${configure_ac}'. You might want to install the \`Autoconf' and \`GNU m4' packages. Grab them from any GNU archive site." 
files=`sed -n 's/^[ ]*A[CM]_CONFIG_HEADER(\([^)]*\)).*/\1/p' ${configure_ac}` test -z "$files" && files="config.h" touch_files= for f in $files; do case $f in *:*) touch_files="$touch_files "`echo "$f" | sed -e 's/^[^:]*://' -e 's/:.*//'`;; *) touch_files="$touch_files $f.in";; esac done touch $touch_files ;; automake*) echo 1>&2 "\ WARNING: \`$1' is $msg. You should only need it if you modified \`Makefile.am', \`acinclude.m4' or \`${configure_ac}'. You might want to install the \`Automake' and \`Perl' packages. Grab them from any GNU archive site." find . -type f -name Makefile.am -print | sed 's/\.am$/.in/' | while read f; do touch "$f"; done ;; autom4te*) echo 1>&2 "\ WARNING: \`$1' is needed, but is $msg. You might have modified some files without having the proper tools for further handling them. You can get \`$1' as part of \`Autoconf' from any GNU archive site." file=`echo "$*" | sed -n "$sed_output"` test -z "$file" && file=`echo "$*" | sed -n "$sed_minuso"` if test -f "$file"; then touch $file else test -z "$file" || exec >$file echo "#! /bin/sh" echo "# Created by GNU Automake missing as a replacement of" echo "# $ $@" echo "exit 0" chmod +x $file exit 1 fi ;; bison*|yacc*) echo 1>&2 "\ WARNING: \`$1' $msg. You should only need it if you modified a \`.y' file. You may need the \`Bison' package in order for those modifications to take effect. You can get \`Bison' from any GNU archive site." rm -f y.tab.c y.tab.h if test $# -ne 1; then eval LASTARG="\${$#}" case $LASTARG in *.y) SRCFILE=`echo "$LASTARG" | sed 's/y$/c/'` if test -f "$SRCFILE"; then cp "$SRCFILE" y.tab.c fi SRCFILE=`echo "$LASTARG" | sed 's/y$/h/'` if test -f "$SRCFILE"; then cp "$SRCFILE" y.tab.h fi ;; esac fi if test ! -f y.tab.h; then echo >y.tab.h fi if test ! -f y.tab.c; then echo 'main() { return 0; }' >y.tab.c fi ;; lex*|flex*) echo 1>&2 "\ WARNING: \`$1' is $msg. You should only need it if you modified a \`.l' file. 
You may need the \`Flex' package in order for those modifications to take effect. You can get \`Flex' from any GNU archive site." rm -f lex.yy.c if test $# -ne 1; then eval LASTARG="\${$#}" case $LASTARG in *.l) SRCFILE=`echo "$LASTARG" | sed 's/l$/c/'` if test -f "$SRCFILE"; then cp "$SRCFILE" lex.yy.c fi ;; esac fi if test ! -f lex.yy.c; then echo 'main() { return 0; }' >lex.yy.c fi ;; help2man*) echo 1>&2 "\ WARNING: \`$1' is $msg. You should only need it if you modified a dependency of a manual page. You may need the \`Help2man' package in order for those modifications to take effect. You can get \`Help2man' from any GNU archive site." file=`echo "$*" | sed -n "$sed_output"` test -z "$file" && file=`echo "$*" | sed -n "$sed_minuso"` if test -f "$file"; then touch $file else test -z "$file" || exec >$file echo ".ab help2man is required to generate this page" exit $? fi ;; makeinfo*) echo 1>&2 "\ WARNING: \`$1' is $msg. You should only need it if you modified a \`.texi' or \`.texinfo' file, or any other file indirectly affecting the aspect of the manual. The spurious call might also be the consequence of using a buggy \`make' (AIX, DU, IRIX). You might want to install the \`Texinfo' package or the \`GNU make' package. Grab either from any GNU archive site." # The file to touch is that specified with -o ... file=`echo "$*" | sed -n "$sed_output"` test -z "$file" && file=`echo "$*" | sed -n "$sed_minuso"` if test -z "$file"; then # ... or it is the one specified with @setfilename ... infile=`echo "$*" | sed 's/.* \([^ ]*\) *$/\1/'` file=`sed -n ' /^@setfilename/{ s/.* \([^ ]*\) *$/\1/ p q }' $infile` # ... or it is derived from the source name (dir/f.texi becomes f.info) test -z "$file" && file=`echo "$infile" | sed 's,.*/,,;s,.[^.]*$,,'`.info fi # If the file does not exist, the user really needs makeinfo; # let's fail without touching anything. test -f $file || exit 1 touch $file ;; tar*) shift # We have already tried tar in the generic part. 
# Look for gnutar/gtar before invocation to avoid ugly error # messages. if (gnutar --version > /dev/null 2>&1); then gnutar "$@" && exit 0 fi if (gtar --version > /dev/null 2>&1); then gtar "$@" && exit 0 fi firstarg="$1" if shift; then case $firstarg in *o*) firstarg=`echo "$firstarg" | sed s/o//` tar "$firstarg" "$@" && exit 0 ;; esac case $firstarg in *h*) firstarg=`echo "$firstarg" | sed s/h//` tar "$firstarg" "$@" && exit 0 ;; esac fi echo 1>&2 "\ WARNING: I can't seem to be able to run \`tar' with the given arguments. You may want to install GNU tar or Free paxutils, or check the command line arguments." exit 1 ;; *) echo 1>&2 "\ WARNING: \`$1' is needed, and is $msg. You might have modified some files without having the proper tools for further handling them. Check the \`README' file, it often tells you about the needed prerequisites for installing this package. You may also peek at any GNU archive site, in case some other package would contain this missing \`$1' program." exit 1 ;; esac exit 0 # Local variables: # eval: (add-hook 'write-file-hooks 'time-stamp) # time-stamp-start: "scriptversion=" # time-stamp-format: "%:y-%02m-%02d.%02H" # time-stamp-time-zone: "UTC" # time-stamp-end: "; # UTC" # End: xvidcore/build/generic/bootstrap.sh0000775000076500007650000000405011567130320020547 0ustar xvidbuildxvidbuild#!/bin/sh # # - Bootstrap script - # # Copyright(C) 2003-2004 Edouard Gomez # # This file builds the configure script and copies all needed files # provided by automake/libtoolize # # $Id: bootstrap.sh,v 1.7 2005-05-23 09:29:43 Skal Exp $ ############################################################################## # Detect the right autoconf script ############################################################################## # Find a suitable autoconf AUTOCONF="autoconf2.50" $AUTOCONF --version 1>/dev/null 2>&1 if [ $? -ne 0 ] ; then AUTOCONF="autoconf" $AUTOCONF --version 1>/dev/null 2>&1 if [ $? 
-ne 0 ] ; then echo "ERROR: 'autoconf' not found" exit 1 fi fi # Tests the autoconf version AC_VER=`$AUTOCONF --version | head -1 | sed 's/^[^0-9]*//'` AC_MAJORVER=`echo $AC_VER | cut -f1 -d'.'` AC_MINORVER=`echo $AC_VER | cut -f2 -d'.'` if [ "$AC_MAJORVER" -lt "2" ]; then echo "ERROR: This bootstrapper requires Autoconf >= 2.50 (detected $AC_VER)" exit 1 fi # Only compare the minor version when the major version is exactly 2 if [ "$AC_MAJORVER" -eq "2" ] && [ "$AC_MINORVER" -lt "50" ]; then echo "ERROR: This bootstrapper requires Autoconf >= 2.50 (detected $AC_VER)" exit 1 fi LIBTOOLIZE="libtoolize" $LIBTOOLIZE --version 1>/dev/null 2>&1 if [ $? -ne 0 ] ; then LIBTOOLIZE="glibtoolize" $LIBTOOLIZE --version 1>/dev/null 2>&1 if [ $? -ne 0 ] ; then echo "ERROR: 'libtoolize' not found" exit 1 fi fi AUTOMAKE="automake" $AUTOMAKE --version 1>/dev/null 2>&1 if [ $? -ne 0 ] ; then echo "ERROR: 'automake' not found" exit 1 fi ############################################################################## # Bootstraps the configure script ############################################################################## echo "Creating ./configure" $AUTOCONF echo "Copying files provided by automake" $AUTOMAKE -c -a 1>/dev/null 2>&1 echo "Copying files provided by libtool" $LIBTOOLIZE -f -c 1>/dev/null 2>&1 echo "Removing files that are not needed" rm -rf autom4* 1>/dev/null 2>&1 rm -rf ltmain.sh 1>/dev/null 2>&1 rm -rf *.m4 1>/dev/null 2>&1xvidcore/build/generic/configure.in0000664000076500007650000005057111564676140020531 0ustar xvidbuildxvidbuilddnl ========================================================================== dnl dnl Autoconf script for Xvid dnl dnl Copyright(C) 2003-2004 Edouard Gomez dnl dnl ========================================================================== AC_PREREQ([2.50]) AC_INIT([Xvid], [1.3.2], [xvid-devel@xvid.org]) AC_CONFIG_SRCDIR(configure.in) dnl Do not forget to increase that when needed. 
API_MAJOR="4" API_MINOR="3" dnl NASM/YASM version requirement minimum_yasm_major_version=1 minimum_nasm_minor_version=0 minimum_nasm_major_version=2 nasm_prog="nasm" yasm_prog="yasm" dnl Default CFLAGS -- Big impact on overall speed our_cflags_defaults="-Wall" our_cflags_defaults="$our_cflags_defaults -O2" our_cflags_defaults="$our_cflags_defaults -fstrength-reduce" our_cflags_defaults="$our_cflags_defaults -finline-functions" our_cflags_defaults="$our_cflags_defaults -ffast-math" our_cflags_defaults="$our_cflags_defaults -fomit-frame-pointer" dnl ========================================================================== dnl Features - configure options dnl ========================================================================== FEATURES="" dnl Internal Debug AC_ARG_ENABLE(idebug, AC_HELP_STRING([--enable-idebug], [Enable internal debug function]), [if test "$enable_idebug" = "yes" ; then FEATURES="$FEATURES -D_DEBUG" fi]) dnl Internal Profile AC_ARG_ENABLE(iprofile, AC_HELP_STRING([--enable-iprofile], [Enable internal profiling]), [if test "$enable_iprofile" = "yes" ; then FEATURES="$FEATURES -D_PROFILING_" fi]) dnl GNU Profiling options AC_ARG_ENABLE(gnuprofile, AC_HELP_STRING([--enable-gnuprofile], [Enable profiling information for gprof]), [if test "$enable_gnuprofile" = "yes" ; then GNU_PROF_CFLAGS="-pg -fprofile-arcs -ftest-coverage" GNU_PROF_LDFLAGS="-pg" fi]) dnl Assembly code AC_ARG_ENABLE(assembly, AC_HELP_STRING([--disable-assembly], [Disable assembly code]), [if test "$enable_assembly" = "no" ; then assembly="no" else if test "$enable_assembly" = "yes" ; then assembly="yes" fi fi], [assembly="yes"]) dnl pthread code AC_ARG_ENABLE(pthread, AC_HELP_STRING([--disable-pthread], [Disable pthread dependent code]), [if test "$enable_pthread" = "no" ; then pthread="no" else if test "$enable_pthread" = "yes" ; then pthread="yes" fi fi], [pthread="yes"]) dnl Build as a module not a shared lib on darwin AC_ARG_ENABLE(macosx_module, 
AC_HELP_STRING([--enable-macosx_module], [Build as a module on MacOS X]), [if test "$enable_macosx_module" = "yes" ; then macosx_module="yes" else macosx_module="no" fi], [macosx_module="no"]) dnl ========================================================================== dnl Default install prefix and checks build type dnl ========================================================================== AC_PREFIX_DEFAULT("/usr/local") AC_CANONICAL_BUILD AC_CANONICAL_HOST AC_CANONICAL_TARGET dnl ========================================================================== dnl Check for the C compiler (could be passed on command line) dnl ========================================================================== dnl dnl First we test if CFLAGS have been passed on command line dnl I do that because autoconf defaults (-g -O2) suck and they would kill dnl performance. To prevent that we define a good default CFLAGS at the end dnl of the script if and only if CFLAGS has not been passed on the command dnl line dnl AC_MSG_CHECKING(whether to use default CFLAGS) if test x"$CFLAGS" = x"" ; then force_default_cc_options="yes" AC_MSG_RESULT([yes]) else force_default_cc_options="no" AC_MSG_RESULT([no]) fi dnl Now we can safely check for the C compiler AC_PROG_CC dnl ========================================================================== dnl Check for the install program dnl ========================================================================== AC_PROG_INSTALL dnl ========================================================================== dnl Check for the ranlib program to generate static library index dnl ========================================================================== AC_PROG_RANLIB AC_CHECK_TOOL([AR], [ar], [ar-not-found]) dnl ========================================================================== dnl dnl This part looks for: dnl dnl ARCHITECTURE : The platform architecture dnl - IA32 for mmx, mmx-ext, mmx2, sse assembly dnl - IA64 dnl - PPC for PowerPC assembly 
routines dnl - GENERIC for plain C sources only dnl dnl BUS: Address bus size (in bits) dnl - 32 dnl - 64 dnl dnl ENDIANNESS: I think you can guess what this thing means :-) dnl - LITTLE_ENDIAN dnl - BIG_ENDIAN dnl dnl ========================================================================== dnl dnl Looking what sources have to be compiled according to the CPU type dnl ARCHITECTURE="" AC_MSG_CHECKING([for whether to use assembly code]) if test x"$assembly" = x"yes" ; then AC_MSG_RESULT([yes]) AC_MSG_CHECKING([for architecture type]) case "$target_cpu" in i[[3456]]86) AC_MSG_RESULT(ia32) ARCHITECTURE="IA32" ;; x86_64) AC_MSG_RESULT(x86_64) ARCHITECTURE="X86_64" ;; powerpc) AC_MSG_RESULT(PowerPC) ARCHITECTURE="PPC" ;; ia64) AC_MSG_RESULT(ia64) ARCHITECTURE="IA64" ;; *) AC_MSG_RESULT($target_cpu) ARCHITECTURE="GENERIC" ;; esac else AC_MSG_RESULT([no]) ARCHITECTURE="GENERIC" fi dnl dnl Testing address bus length dnl BUS="" AC_CHECK_SIZEOF([int *]) case "$ac_cv_sizeof_int_p" in 4) BUS="32BIT" ;; 8) BUS="64BIT" ;; *) AC_MSG_ERROR([Xvid supports only 32/64 bit architectures]) ;; esac dnl dnl Testing endianness dnl ENDIANNESS="" AC_C_BIGENDIAN(ENDIANNESS="BIG_ENDIAN", ENDIANNESS="LITTLE_ENDIAN") dnl ========================================================================== dnl dnl Check for OS specific variables dnl - SHARED_EXTENSION, STATIC_EXTENSION, OBJECT_EXTENSION dnl dnl ========================================================================== AC_MSG_CHECKING(for build extensions) SHARED_EXTENSION="" STATIC_EXTENSION="" OBJECT_EXTENSION="" case "$target_os" in *bsd*|linux*|beos|irix*|solaris*) AC_MSG_RESULT([.so .a .o]) STATIC_EXTENSION="a" SHARED_EXTENSION="so" OBJECT_EXTENSION="o" ;; [[cC]][[yY]][[gG]][[wW]][[iI]][[nN]]*|mingw32*|mks*) AC_MSG_RESULT([.dll .a .obj]) STATIC_EXTENSION="a" SHARED_EXTENSION="dll" OBJECT_EXTENSION="obj" ;; darwin*|rhapsody*) if test x"$macosx_module" = x"yes"; then AC_MSG_RESULT([.so .a .o]) SHARED_EXTENSION="so" else 
AC_MSG_RESULT([.dylib .a .o]) SHARED_EXTENSION="dylib" fi STATIC_EXTENSION="a" OBJECT_EXTENSION="o" ;; *) AC_MSG_RESULT([Unknown OS - Using .so .a .o]) STATIC_EXTENSION="a" SHARED_EXTENSION="so" OBJECT_EXTENSION="o" ;; esac dnl ========================================================================== dnl dnl Determines best options for CC and LD dnl - STATIC_LIB, SHARED_LIB, SPECIFIC_CFLAGS, SPECIFIC_LDFLAGS dnl dnl ========================================================================== AC_MSG_CHECKING(for platform specific LDFLAGS/CFLAGS) SPECIFIC_LDFLAGS="" SPECIFIC_CFLAGS="" ALTIVEC_CFLAGS="" PRE_SHARED_LIB="" case "$target_os" in linux*|solaris*) AC_MSG_RESULT([ok]) STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR).\$(API_MINOR)" SPECIFIC_LDFLAGS="-Wl,-soname,libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR) -shared -Wl,--version-script=libxvidcore.ld -lc -lm" SPECIFIC_CFLAGS="-fPIC" ;; *bsd*|irix*) AC_MSG_RESULT([ok]) STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR).\$(API_MINOR)" SPECIFIC_LDFLAGS="-Wl,-soname,libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR) -shared -lc -lm" SPECIFIC_CFLAGS="-fPIC" ;; [[cC]][[yY]][[gG]][[wW]][[iI]][[nN]]*|mingw32*|mks*) AC_MSG_RESULT([ok]) STATIC_LIB="xvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="xvidcore.\$(SHARED_EXTENSION)" SPECIFIC_LDFLAGS="-mno-cygwin -shared -Wl,--dll,--out-implib,\$@.a libxvidcore.def" SPECIFIC_CFLAGS="-mno-cygwin" ;; darwin*|raphsody*) STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SPECIFIC_CFLAGS="-fPIC -fno-common -no-cpp-precomp" if test x"$macosx_module" = x"no"; then AC_MSG_RESULT([dylib options]) SHARED_LIB="libxvidcore.\$(API_MAJOR).\$(SHARED_EXTENSION)" SPECIFIC_LDFLAGS="-Wl,-read_only_relocs,suppress -dynamiclib -flat_namespace -compatibility_version \$(API_MAJOR) -current_version \$(API_MAJOR).\$(API_MINOR) -install_name \$(libdir)/\$(SHARED_LIB)" else AC_MSG_RESULT([module options]) 
PRE_SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION)-temp.o" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR)" SPECIFIC_LDFLAGS="-r -keep_private_externs -nostdlib && \$(CC) \$(LDFLAGS) \$(PRE_SHARED_LIB) -o libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR) -bundle -flat_namespace -undefined suppress" fi ;; beos) AC_MSG_RESULT([ok]) STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION)" SPECIFIC_LDFLAGS="-nostart" SPECIFIC_CFLAGS="-fPIC" ;; *) AC_MSG_RESULT([Unknown Platform (Using default -shared -lc -lm)]) STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION)" SPECIFIC_LDFLAGS="" SPECIFIC_CFLAGS="" ;; esac if test x"$PRE_SHARED_LIB" = x; then PRE_SHARED_LIB=$SHARED_LIB fi dnl ========================================================================== dnl dnl Assembler stuff dnl - AS, AFLAGS, ASSEMBLY_EXTENSION, SOURCES dnl dnl ========================================================================== AS="" AFLAGS="" ASSEMBLY_EXTENSION="" GENERIC_SOURCES="SRC_GENERIC" ASSEMBLY_SOURCES="" dnl dnl IA32 dnl if test "$ARCHITECTURE" = "IA32" -o "$ARCHITECTURE" = "X86_64" ; then dnl dnl Checking for nasm compatible programs dnl found_nasm_comp_prog="no" chosen_asm_prog="" dnl Check for yasm first AC_CHECK_PROG([ac_yasm], [$yasm_prog], [yes], [no], , [yes]) if test "$ac_yasm" = "yes" ; then dnl dnl Checking yasm version dnl AC_MSG_CHECKING([for yasm version]) yasm_major=`$yasm_prog --version | head -1 | cut -d '.' -f 1 | cut -d ' ' -f 2` if test -z $yasm_major ; then yasm_major=-1 fi AC_MSG_RESULT([$yasm_major]) dnl Actually, yasm >= 0.7.99.2161 should be ok dnl But I'm too lazy to check also the patch version... 
if test "$yasm_major" -lt "$minimum_yasm_major_version" ; then AC_MSG_WARN([yasm version is too old]) else found_nasm_comp_prog="yes" chosen_asm_prog="$yasm_prog" fi fi dnl Check for nasm (not buggy version) if test "$found_nasm_comp_prog" = "no" ; then AC_CHECK_PROG([ac_nasm], [$nasm_prog], [yes], [no], , [yes]) if test "$ac_nasm" = "yes" ; then dnl dnl Checking nasm version dnl AC_MSG_CHECKING([for nasm version]) nasm_minor=`$nasm_prog -v | cut -d '.' -f 2 | cut -d ' ' -f 1` nasm_major=`$nasm_prog -v | cut -d '.' -f 1 | cut -d ' ' -f 3` if test -z $nasm_minor ; then nasm_minor=-1 fi if test -z $nasm_major ; then nasm_major=-1 fi AC_MSG_RESULT([$nasm_major]) dnl need nasm 2.x for SSE3/4 and X86_64 if test "$nasm_major" -lt "$minimum_nasm_major_version" ; then AC_MSG_WARN([nasm version is too old]) else found_nasm_comp_prog="yes" chosen_asm_prog="$nasm_prog" fi fi fi dnl dnl Ok now sort what object format we must use dnl if test "$found_nasm_comp_prog" = "yes" ; then AC_MSG_CHECKING([for asm object format]) case "$target_os" in *bsd*|linux*|beos|irix*|solaris*) if test "$ARCHITECTURE" = "X86_64" ; then AC_MSG_RESULT([elf64]) NASM_FORMAT="elf64" else AC_MSG_RESULT([elf]) NASM_FORMAT="elf" fi MARK_FUNCS="-DMARK_FUNCS" PREFIX="" ;; [[cC]][[yY]][[gG]][[wW]][[iI]][[nN]]*|mingw32*|mks*) if test "$ARCHITECTURE" = "X86_64" ; then AC_MSG_RESULT([win64]) NASM_FORMAT="win64" else AC_MSG_RESULT([win32]) NASM_FORMAT="win32" fi PREFIX="-DWINDOWS" MARK_FUNCS="" ;; *darwin*) if test "$ARCHITECTURE" = "X86_64" ; then AC_MSG_RESULT([macho64]) NASM_FORMAT="macho64" else AC_MSG_RESULT([macho32]) NASM_FORMAT="macho32" fi PREFIX="-DPREFIX" MARK_FUNCS="" ;; esac AS="$chosen_asm_prog" ASSEMBLY_EXTENSION="asm" AFLAGS="-I\$( before using intrinsic dnl - define vectors with vec = {0,0,0,0} dnl dnl * The compile time option will be "injected" into SPECIFIC_CFLAGS variable dnl * The need for altivec.h will also be injected into SPECIFIC_CFLAGS through dnl a -DHAVE_ALTIVEC_H dnl * The vector 
definition is handled in portab.h thx to dnl HAVE_PARENTHESES/BRACES_ALTIVEC_DECL dnl PPC_ALTIVEC_SOURCES="" if test "$ARCHITECTURE" = "PPC" ; then AS="\$(CC)" AFLAGS="" ASSEMBLY_EXTENSION=".s" ASSEMBLY_SOURCES="" AC_MSG_CHECKING([for altivec.h]) cat > conftest.c << EOF #include <altivec.h> int main() { return(0); } EOF if $CC -arch ppc -faltivec -c conftest.c 2>/dev/null 1>/dev/null || \ $CC -maltivec -mabi=altivec -c conftest.c 2>/dev/null 1>/dev/null ; then AC_MSG_RESULT(yes) SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS -DHAVE_ALTIVEC_H" TEMP_ALTIVEC="-DHAVE_ALTIVEC_H" else AC_MSG_RESULT(no) TEMP_ALTIVEC="" fi AC_MSG_CHECKING([for Altivec compiler support]) cat > conftest.c << EOF #ifdef HAVE_ALTIVEC_H #include <altivec.h> #endif int main() { vector unsigned int vartest2 = (vector unsigned int)(0); vector unsigned int vartest3 = (vector unsigned int)(1); vartest2 = vec_add(vartest2, vartest3); return(0); } EOF if $CC $TEMP_ALTIVEC -arch ppc -faltivec -c conftest.c 2>/dev/null 1>/dev/null ; then AC_MSG_RESULT([yes (Apple)]) SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS -arch ppc -faltivec -DHAVE_ALTIVEC_PARENTHESES_DECL $TEMP_ALTIVEC" PPC_ALTIVEC_SOURCES="SRC_PPC_ALTIVEC" else cat > conftest.c << EOF #ifdef HAVE_ALTIVEC_H #include <altivec.h> #endif int main() { vector unsigned int vartest2 = (vector unsigned int){0}; vector unsigned int vartest3 = (vector unsigned int){1}; vartest2 = vec_add(vartest2, vartest3); return(0); } EOF if $CC $TEMP_ALTIVEC -maltivec -mabi=altivec -c conftest.c 2>/dev/null 1>/dev/null ; then AC_MSG_RESULT([yes (GNU)]) SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS -DHAVE_ALTIVEC_BRACES_DECL $TEMP_ALTIVEC" PPC_ALTIVEC_SOURCES="SRC_PPC_ALTIVEC" ALTIVEC_CFLAGS="-maltivec -mabi=altivec" else AC_MSG_RESULT([no (ppc support won't be compiled in)]) dnl Only C code can be compiled :-( ARCHITECTURE="GENERIC" fi fi rm -f conftest.* fi dnl dnl IA64 dnl if test "$ARCHITECTURE" = "IA64" ; then AS="\$(CC)" AFLAGS="-c" ASSEMBLY_EXTENSION="s" ASSEMBLY_SOURCES="SRC_IA64" case `basename $CC` in *ecc*) 
DCT_IA64_SOURCES="SRC_IA64_IDCT_ECC" dnl If the compiler is ecc, then i don't know its options dnl fallback to "no options" if test "$force_default_cc_options" = "yes" ; then our_cflags_defaults="" fi ;; *) DCT_IA64_SOURCES="SRC_IA64_IDCT_GCC" ;; esac fi dnl ========================================================================== dnl dnl Check for header files dnl dnl ========================================================================== AC_CHECK_HEADERS( stdio.h \ signal.h \ stdlib.h \ string.h \ assert.h \ math.h \ , , AC_MSG_ERROR(Missing header file)) dnl ========================================================================== dnl dnl Check for pthread dnl dnl ========================================================================== if test x"$pthread" = x"yes" ; then AC_CHECK_HEADER( [pthread.h], [AC_CHECK_LIB( [pthread], [pthread_create], [SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS -DHAVE_PTHREAD" SPECIFIC_LDFLAGS="$SPECIFIC_LDFLAGS -lpthread"], AC_MSG_WARN(Pthread not supported. No SMP support))], AC_MSG_WARN(Pthread not supported. No SMP support)) else AC_MSG_WARN(Pthread support disabled. 
No SMP support) fi dnl ========================================================================== dnl dnl Now we can set CFLAGS if needed dnl dnl ========================================================================== if test "$force_default_cc_options" = "yes" ; then CFLAGS="$our_cflags_defaults" fi dnl ========================================================================== dnl dnl Profiling stuff goes here dnl - adds options to SPECIFIC_CFLAGS, SPECIFIC_LDFLAGS dnl - removes incompatible options from CFLAGS dnl dnl ========================================================================== SPECIFIC_LDFLAGS="$SPECIFIC_LDFLAGS $GNU_PROF_LDFLAGS" SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS $GNU_PROF_CFLAGS" if test "$enable_gnuprofile" = "yes" ; then CFLAGS=`echo $CFLAGS | sed s/'-fomit-frame-pointer'/''/` fi dnl ========================================================================== dnl Some gcc flags can't be used for gcc >= 3.4.0 dnl ========================================================================== if test "$GCC" = "yes" ; then cat << EOF > test.c #include <stdio.h> int main(int argc, char **argv) { if (*argv[[1]] == 'M') { printf("%d", __GNUC__); } if (*argv[[1]] == 'm') { printf("%d", __GNUC_MINOR__); } return 0; } EOF $CC -o gcc-ver test.c GCC_MAJOR=`./gcc-ver M` GCC_MINOR=`./gcc-ver m` rm -f test.c rm -f gcc-ver # GCC 4.x if test "${GCC_MAJOR}" -gt 3 ; then CFLAGS=`echo $CFLAGS | sed s,"-mcpu","-mtune",g` CFLAGS=`echo $CFLAGS | sed s,'-freduce-all-givs','',g` CFLAGS=`echo $CFLAGS | sed s,'-fmove-all-movables','',g` CFLAGS=`echo $CFLAGS | sed s,'-fnew-ra','',g` CFLAGS=`echo $CFLAGS | sed s,'-fwritable-strings','',g` fi # GCC 3.4.x if test "${GCC_MAJOR}" -eq 3 && test "${GCC_MINOR}" -gt 3 ; then CFLAGS=`echo $CFLAGS | sed s,"-mcpu","-mtune",g` fi fi dnl ========================================================================== dnl dnl Substitutions dnl dnl ========================================================================== AC_SUBST(FEATURES) 
AC_SUBST(ARCHITECTURE) AC_SUBST(BUS) AC_SUBST(ENDIANNESS) AC_SUBST(SHARED_EXTENSION) AC_SUBST(STATIC_EXTENSION) AC_SUBST(OBJECT_EXTENSION) AC_SUBST(NASM_FORMAT) AC_SUBST(AS) AC_SUBST(AFLAGS) AC_SUBST(ASSEMBLY_EXTENSION) AC_SUBST(GENERIC_SOURCES) AC_SUBST(ASSEMBLY_SOURCES) AC_SUBST(CC) AC_SUBST(CFLAGS) AC_SUBST(SPECIFIC_LDFLAGS) AC_SUBST(SPECIFIC_CFLAGS) AC_SUBST(DCT_IA64_SOURCES) AC_SUBST(PPC_ALTIVEC_SOURCES) AC_SUBST(RANLIB) AC_SUBST(AR) AC_SUBST(API_MAJOR) AC_SUBST(API_MINOR) AC_SUBST(STATIC_LIB) AC_SUBST(PRE_SHARED_LIB) AC_SUBST(SHARED_LIB) AC_SUBST(ALTIVEC_CFLAGS) dnl ========================================================================== dnl dnl Output files dnl dnl ========================================================================== AC_CONFIG_FILES(platform.inc) AC_OUTPUT xvidcore/build/generic/config.sub0000755000076500007650000010437611566432661020204 0ustar xvidbuildxvidbuild#! /bin/sh # Configuration validation subroutine script. # Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, # 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 # Free Software Foundation, Inc. timestamp='2010-09-11' # This file is (in principle) common to ALL GNU software. # The presence of a machine in this file suggests that SOME GNU software # can handle that machine. It does not imply ALL GNU software can. # # This file is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA # 02110-1301, USA. # # As a special exception to the GNU General Public License, if you # distribute this file as part of a program that contains a # configuration script generated by Autoconf, you may include it under # the same distribution terms that you use for the rest of that program. # Please send patches to <config-patches@gnu.org>. Submit a context # diff and a properly formatted GNU ChangeLog entry. # # Configuration subroutine to validate and canonicalize a configuration type. # Supply the specified configuration type as an argument. # If it is invalid, we print an error message on stderr and exit with code 1. # Otherwise, we print the canonical config type on stdout and succeed. # You can get the latest version of this script from: # http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD # This file is supposed to be the same for all GNU packages # and recognize all the CPU types, system types and aliases # that are meaningful with *any* GNU software. # Each package is responsible for reporting which valid configurations # it does not support. The user should be able to distinguish # a failure to support a valid configuration from a meaningless # configuration. # The goal of this file is to map all the various variations of a given # machine specification into a single specification in the form: # CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM # or in some cases, the newer four-part form: # CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM # It is wrong to echo any other type of specification. me=`echo "$0" | sed -e 's,.*/,,'` usage="\ Usage: $0 [OPTION] CPU-MFR-OPSYS $0 [OPTION] ALIAS Canonicalize a configuration name. 
Operation modes: -h, --help print this help, then exit -t, --time-stamp print date of last modification, then exit -v, --version print version number, then exit Report bugs and patches to <config-patches@gnu.org>." version="\ GNU config.sub ($timestamp) Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE." help=" Try \`$me --help' for more information." # Parse command line while test $# -gt 0 ; do case $1 in --time-stamp | --time* | -t ) echo "$timestamp" ; exit ;; --version | -v ) echo "$version" ; exit ;; --help | --h* | -h ) echo "$usage"; exit ;; -- ) # Stop option processing shift; break ;; - ) # Use stdin as input. break ;; -* ) echo "$me: invalid option $1$help" exit 1 ;; *local*) # First pass through any local machine types. echo $1 exit ;; * ) break ;; esac done case $# in 0) echo "$me: missing argument$help" >&2 exit 1;; 1) ;; *) echo "$me: too many arguments$help" >&2 exit 1;; esac # Separate what the user gave into CPU-COMPANY and OS or KERNEL-OS (if any). # Here we must recognize all the valid KERNEL-OS combinations. maybe_os=`echo $1 | sed 's/^\(.*\)-\([^-]*-[^-]*\)$/\2/'` case $maybe_os in nto-qnx* | linux-gnu* | linux-android* | linux-dietlibc | linux-newlib* | \ linux-uclibc* | uclinux-uclibc* | uclinux-gnu* | kfreebsd*-gnu* | \ knetbsd*-gnu* | netbsd*-gnu* | \ kopensolaris*-gnu* | \ storm-chaos* | os2-emx* | rtmk-nova*) os=-$maybe_os basic_machine=`echo $1 | sed 's/^\(.*\)-\([^-]*-[^-]*\)$/\1/'` ;; *) basic_machine=`echo $1 | sed 's/-[^-]*$//'` if [ $basic_machine != $1 ] then os=`echo $1 | sed 's/.*-/-/'` else os=; fi ;; esac ### Let's recognize common machines as not being operating systems so ### that things like config.sub decstation-3100 work. 
We also ### recognize some manufacturers as not being operating systems, so we ### can provide default operating systems below. case $os in -sun*os*) # Prevent following clause from handling this invalid input. ;; -dec* | -mips* | -sequent* | -encore* | -pc532* | -sgi* | -sony* | \ -att* | -7300* | -3300* | -delta* | -motorola* | -sun[234]* | \ -unicom* | -ibm* | -next | -hp | -isi* | -apollo | -altos* | \ -convergent* | -ncr* | -news | -32* | -3600* | -3100* | -hitachi* |\ -c[123]* | -convex* | -sun | -crds | -omron* | -dg | -ultra | -tti* | \ -harris | -dolphin | -highlevel | -gould | -cbm | -ns | -masscomp | \ -apple | -axis | -knuth | -cray | -microblaze) os= basic_machine=$1 ;; -bluegene*) os=-cnk ;; -sim | -cisco | -oki | -wec | -winbond) os= basic_machine=$1 ;; -scout) ;; -wrs) os=-vxworks basic_machine=$1 ;; -chorusos*) os=-chorusos basic_machine=$1 ;; -chorusrdb) os=-chorusrdb basic_machine=$1 ;; -hiux*) os=-hiuxwe2 ;; -sco6) os=-sco5v6 basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -sco5) os=-sco3.2v5 basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -sco4) os=-sco3.2v4 basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -sco3.2.[4-9]*) os=`echo $os | sed -e 's/sco3.2./sco3.2v/'` basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -sco3.2v[4-9]*) # Don't forget version if it is 3.2v4 or newer. basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -sco5v6*) # Don't forget version if it is 3.2v4 or newer. 
basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -sco*) os=-sco3.2v2 basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -udk*) basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -isc) os=-isc2.2 basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -clix*) basic_machine=clipper-intergraph ;; -isc*) basic_machine=`echo $1 | sed -e 's/86-.*/86-pc/'` ;; -lynx*) os=-lynxos ;; -ptx*) basic_machine=`echo $1 | sed -e 's/86-.*/86-sequent/'` ;; -windowsnt*) os=`echo $os | sed -e 's/windowsnt/winnt/'` ;; -psos*) os=-psos ;; -mint | -mint[0-9]*) basic_machine=m68k-atari os=-mint ;; esac # Decode aliases for certain CPU-COMPANY combinations. case $basic_machine in # Recognize the basic CPU types without company name. # Some are omitted here because they have special meanings below. 1750a | 580 \ | a29k \ | alpha | alphaev[4-8] | alphaev56 | alphaev6[78] | alphapca5[67] \ | alpha64 | alpha64ev[4-8] | alpha64ev56 | alpha64ev6[78] | alpha64pca5[67] \ | am33_2.0 \ | arc | arm | arm[bl]e | arme[lb] | armv[2345] | armv[345][lb] | avr | avr32 \ | bfin \ | c4x | clipper \ | d10v | d30v | dlx | dsp16xx \ | fido | fr30 | frv \ | h8300 | h8500 | hppa | hppa1.[01] | hppa2.0 | hppa2.0[nw] | hppa64 \ | i370 | i860 | i960 | ia64 \ | ip2k | iq2000 \ | lm32 \ | m32c | m32r | m32rle | m68000 | m68k | m88k \ | maxq | mb | microblaze | mcore | mep | metag \ | mips | mipsbe | mipseb | mipsel | mipsle \ | mips16 \ | mips64 | mips64el \ | mips64octeon | mips64octeonel \ | mips64orion | mips64orionel \ | mips64r5900 | mips64r5900el \ | mips64vr | mips64vrel \ | mips64vr4100 | mips64vr4100el \ | mips64vr4300 | mips64vr4300el \ | mips64vr5000 | mips64vr5000el \ | mips64vr5900 | mips64vr5900el \ | mipsisa32 | mipsisa32el \ | mipsisa32r2 | mipsisa32r2el \ | mipsisa64 | mipsisa64el \ | mipsisa64r2 | mipsisa64r2el \ | mipsisa64sb1 | mipsisa64sb1el \ | mipsisa64sr71k | mipsisa64sr71kel \ | mipstx39 | mipstx39el \ | mn10200 | mn10300 \ | moxie \ | mt \ | msp430 \ | nds32 | nds32le | nds32be \ | nios | 
nios2 \ | ns16k | ns32k \ | or32 \ | pdp10 | pdp11 | pj | pjl \ | powerpc | powerpc64 | powerpc64le | powerpcle | ppcbe \ | pyramid \ | rx \ | score \ | sh | sh[1234] | sh[24]a | sh[24]aeb | sh[23]e | sh[34]eb | sheb | shbe | shle | sh[1234]le | sh3ele \ | sh64 | sh64le \ | sparc | sparc64 | sparc64b | sparc64v | sparc86x | sparclet | sparclite \ | sparcv8 | sparcv9 | sparcv9b | sparcv9v \ | spu | strongarm \ | tahoe | thumb | tic4x | tic54x | tic55x | tic6x | tic80 | tron \ | ubicom32 \ | v850 | v850e \ | we32k \ | x86 | xc16x | xscale | xscalee[bl] | xstormy16 | xtensa \ | z8k | z80) basic_machine=$basic_machine-unknown ;; c54x) basic_machine=tic54x-unknown ;; c55x) basic_machine=tic55x-unknown ;; c6x) basic_machine=tic6x-unknown ;; m6811 | m68hc11 | m6812 | m68hc12 | picochip) # Motorola 68HC11/12. basic_machine=$basic_machine-unknown os=-none ;; m88110 | m680[12346]0 | m683?2 | m68360 | m5200 | v70 | w65 | z8k) ;; ms1) basic_machine=mt-unknown ;; # We use `pc' rather than `unknown' # because (1) that's what they normally are, and # (2) the word "unknown" tends to confuse beginning users. i*86 | x86_64) basic_machine=$basic_machine-pc ;; # Object if more than one company name word. *-*-*) echo Invalid configuration \`$1\': machine \`$basic_machine\' not recognized 1>&2 exit 1 ;; # Recognize the basic CPU types with company name. 
580-* \ | a29k-* \ | alpha-* | alphaev[4-8]-* | alphaev56-* | alphaev6[78]-* \ | alpha64-* | alpha64ev[4-8]-* | alpha64ev56-* | alpha64ev6[78]-* \ | alphapca5[67]-* | alpha64pca5[67]-* | arc-* \ | arm-* | armbe-* | armle-* | armeb-* | armv*-* \ | avr-* | avr32-* \ | bfin-* | bs2000-* \ | c[123]* | c30-* | [cjt]90-* | c4x-* \ | clipper-* | craynv-* | cydra-* \ | d10v-* | d30v-* | dlx-* \ | elxsi-* \ | f30[01]-* | f700-* | fido-* | fr30-* | frv-* | fx80-* \ | h8300-* | h8500-* \ | hppa-* | hppa1.[01]-* | hppa2.0-* | hppa2.0[nw]-* | hppa64-* \ | i*86-* | i860-* | i960-* | ia64-* \ | ip2k-* | iq2000-* \ | lm32-* \ | m32c-* | m32r-* | m32rle-* \ | m68000-* | m680[012346]0-* | m68360-* | m683?2-* | m68k-* \ | m88110-* | m88k-* | maxq-* | mcore-* | metag-* | microblaze-* \ | mips-* | mipsbe-* | mipseb-* | mipsel-* | mipsle-* \ | mips16-* \ | mips64-* | mips64el-* \ | mips64octeon-* | mips64octeonel-* \ | mips64orion-* | mips64orionel-* \ | mips64r5900-* | mips64r5900el-* \ | mips64vr-* | mips64vrel-* \ | mips64vr4100-* | mips64vr4100el-* \ | mips64vr4300-* | mips64vr4300el-* \ | mips64vr5000-* | mips64vr5000el-* \ | mips64vr5900-* | mips64vr5900el-* \ | mipsisa32-* | mipsisa32el-* \ | mipsisa32r2-* | mipsisa32r2el-* \ | mipsisa64-* | mipsisa64el-* \ | mipsisa64r2-* | mipsisa64r2el-* \ | mipsisa64sb1-* | mipsisa64sb1el-* \ | mipsisa64sr71k-* | mipsisa64sr71kel-* \ | mipstx39-* | mipstx39el-* \ | mmix-* \ | mt-* \ | msp430-* \ | nds32-* | nds32le-* | nds32be-* \ | nios-* | nios2-* \ | none-* | np1-* | ns16k-* | ns32k-* \ | orion-* \ | pdp10-* | pdp11-* | pj-* | pjl-* | pn-* | power-* \ | powerpc-* | powerpc64-* | powerpc64le-* | powerpcle-* | ppcbe-* \ | pyramid-* \ | romp-* | rs6000-* | rx-* \ | sh-* | sh[1234]-* | sh[24]a-* | sh[24]aeb-* | sh[23]e-* | sh[34]eb-* | sheb-* | shbe-* \ | shle-* | sh[1234]le-* | sh3ele-* | sh64-* | sh64le-* \ | sparc-* | sparc64-* | sparc64b-* | sparc64v-* | sparc86x-* | sparclet-* \ | sparclite-* \ | sparcv8-* | sparcv9-* | sparcv9b-* | 
sparcv9v-* | strongarm-* | sv1-* | sx?-* \ | tahoe-* | thumb-* \ | tic30-* | tic4x-* | tic54x-* | tic55x-* | tic6x-* | tic80-* \ | tile-* | tilegx-* \ | tron-* \ | ubicom32-* \ | v850-* | v850e-* | vax-* \ | we32k-* \ | x86-* | x86_64-* | xc16x-* | xps100-* | xscale-* | xscalee[bl]-* \ | xstormy16-* | xtensa*-* \ | ymp-* \ | z8k-* | z80-*) ;; # Recognize the basic CPU types without company name, with glob match. xtensa*) basic_machine=$basic_machine-unknown ;; # Recognize the various machine names and aliases which stand # for a CPU type and a company and sometimes even an OS. 386bsd) basic_machine=i386-unknown os=-bsd ;; 3b1 | 7300 | 7300-att | att-7300 | pc7300 | safari | unixpc) basic_machine=m68000-att ;; 3b*) basic_machine=we32k-att ;; a29khif) basic_machine=a29k-amd os=-udi ;; abacus) basic_machine=abacus-unknown ;; adobe68k) basic_machine=m68010-adobe os=-scout ;; alliant | fx80) basic_machine=fx80-alliant ;; altos | altos3068) basic_machine=m68k-altos ;; am29k) basic_machine=a29k-none os=-bsd ;; amd64) basic_machine=x86_64-pc ;; amd64-*) basic_machine=x86_64-`echo $basic_machine | sed 's/^[^-]*-//'` ;; amdahl) basic_machine=580-amdahl os=-sysv ;; amiga | amiga-*) basic_machine=m68k-unknown ;; amigaos | amigados) basic_machine=m68k-unknown os=-amigaos ;; amigaunix | amix) basic_machine=m68k-unknown os=-sysv4 ;; apollo68) basic_machine=m68k-apollo os=-sysv ;; apollo68bsd) basic_machine=m68k-apollo os=-bsd ;; aros) basic_machine=i386-pc os=-aros ;; aux) basic_machine=m68k-apple os=-aux ;; balance) basic_machine=ns32k-sequent os=-dynix ;; blackfin) basic_machine=bfin-unknown os=-linux ;; blackfin-*) basic_machine=bfin-`echo $basic_machine | sed 's/^[^-]*-//'` os=-linux ;; bluegene*) basic_machine=powerpc-ibm os=-cnk ;; c54x-*) basic_machine=tic54x-`echo $basic_machine | sed 's/^[^-]*-//'` ;; c55x-*) basic_machine=tic55x-`echo $basic_machine | sed 's/^[^-]*-//'` ;; c6x-*) basic_machine=tic6x-`echo $basic_machine | sed 's/^[^-]*-//'` ;; c90) 
basic_machine=c90-cray os=-unicos ;; cegcc) basic_machine=arm-unknown os=-cegcc ;; convex-c1) basic_machine=c1-convex os=-bsd ;; convex-c2) basic_machine=c2-convex os=-bsd ;; convex-c32) basic_machine=c32-convex os=-bsd ;; convex-c34) basic_machine=c34-convex os=-bsd ;; convex-c38) basic_machine=c38-convex os=-bsd ;; cray | j90) basic_machine=j90-cray os=-unicos ;; craynv) basic_machine=craynv-cray os=-unicosmp ;; cr16) basic_machine=cr16-unknown os=-elf ;; crds | unos) basic_machine=m68k-crds ;; crisv32 | crisv32-* | etraxfs*) basic_machine=crisv32-axis ;; cris | cris-* | etrax*) basic_machine=cris-axis ;; crx) basic_machine=crx-unknown os=-elf ;; da30 | da30-*) basic_machine=m68k-da30 ;; decstation | decstation-3100 | pmax | pmax-* | pmin | dec3100 | decstatn) basic_machine=mips-dec ;; decsystem10* | dec10*) basic_machine=pdp10-dec os=-tops10 ;; decsystem20* | dec20*) basic_machine=pdp10-dec os=-tops20 ;; delta | 3300 | motorola-3300 | motorola-delta \ | 3300-motorola | delta-motorola) basic_machine=m68k-motorola ;; delta88) basic_machine=m88k-motorola os=-sysv3 ;; dicos) basic_machine=i686-pc os=-dicos ;; djgpp) basic_machine=i586-pc os=-msdosdjgpp ;; dpx20 | dpx20-*) basic_machine=rs6000-bull os=-bosx ;; dpx2* | dpx2*-bull) basic_machine=m68k-bull os=-sysv3 ;; ebmon29k) basic_machine=a29k-amd os=-ebmon ;; elxsi) basic_machine=elxsi-elxsi os=-bsd ;; encore | umax | mmax) basic_machine=ns32k-encore ;; es1800 | OSE68k | ose68k | ose | OSE) basic_machine=m68k-ericsson os=-ose ;; fx2800) basic_machine=i860-alliant ;; genix) basic_machine=ns32k-ns ;; gmicro) basic_machine=tron-gmicro os=-sysv ;; go32) basic_machine=i386-pc os=-go32 ;; h3050r* | hiux*) basic_machine=hppa1.1-hitachi os=-hiuxwe2 ;; h8300hms) basic_machine=h8300-hitachi os=-hms ;; h8300xray) basic_machine=h8300-hitachi os=-xray ;; h8500hms) basic_machine=h8500-hitachi os=-hms ;; harris) basic_machine=m88k-harris os=-sysv3 ;; hp300-*) basic_machine=m68k-hp ;; hp300bsd) basic_machine=m68k-hp os=-bsd ;; 
hp300hpux) basic_machine=m68k-hp os=-hpux ;; hp3k9[0-9][0-9] | hp9[0-9][0-9]) basic_machine=hppa1.0-hp ;; hp9k2[0-9][0-9] | hp9k31[0-9]) basic_machine=m68000-hp ;; hp9k3[2-9][0-9]) basic_machine=m68k-hp ;; hp9k6[0-9][0-9] | hp6[0-9][0-9]) basic_machine=hppa1.0-hp ;; hp9k7[0-79][0-9] | hp7[0-79][0-9]) basic_machine=hppa1.1-hp ;; hp9k78[0-9] | hp78[0-9]) # FIXME: really hppa2.0-hp basic_machine=hppa1.1-hp ;; hp9k8[67]1 | hp8[67]1 | hp9k80[24] | hp80[24] | hp9k8[78]9 | hp8[78]9 | hp9k893 | hp893) # FIXME: really hppa2.0-hp basic_machine=hppa1.1-hp ;; hp9k8[0-9][13679] | hp8[0-9][13679]) basic_machine=hppa1.1-hp ;; hp9k8[0-9][0-9] | hp8[0-9][0-9]) basic_machine=hppa1.0-hp ;; hppa-next) os=-nextstep3 ;; hppaosf) basic_machine=hppa1.1-hp os=-osf ;; hppro) basic_machine=hppa1.1-hp os=-proelf ;; i370-ibm* | ibm*) basic_machine=i370-ibm ;; # I'm not sure what "Sysv32" means. Should this be sysv3.2? i*86v32) basic_machine=`echo $1 | sed -e 's/86.*/86-pc/'` os=-sysv32 ;; i*86v4*) basic_machine=`echo $1 | sed -e 's/86.*/86-pc/'` os=-sysv4 ;; i*86v) basic_machine=`echo $1 | sed -e 's/86.*/86-pc/'` os=-sysv ;; i*86sol2) basic_machine=`echo $1 | sed -e 's/86.*/86-pc/'` os=-solaris2 ;; i386mach) basic_machine=i386-mach os=-mach ;; i386-vsta | vsta) basic_machine=i386-unknown os=-vsta ;; iris | iris4d) basic_machine=mips-sgi case $os in -irix*) ;; *) os=-irix4 ;; esac ;; isi68 | isi) basic_machine=m68k-isi os=-sysv ;; m68knommu) basic_machine=m68k-unknown os=-linux ;; m68knommu-*) basic_machine=m68k-`echo $basic_machine | sed 's/^[^-]*-//'` os=-linux ;; m88k-omron*) basic_machine=m88k-omron ;; magnum | m3230) basic_machine=mips-mips os=-sysv ;; merlin) basic_machine=ns32k-utek os=-sysv ;; microblaze) basic_machine=microblaze-xilinx ;; mingw32) basic_machine=i386-pc os=-mingw32 ;; mingw32ce) basic_machine=arm-unknown os=-mingw32ce ;; miniframe) basic_machine=m68000-convergent ;; *mint | -mint[0-9]* | *MiNT | *MiNT[0-9]*) basic_machine=m68k-atari os=-mint ;; mips3*-*) 
basic_machine=`echo $basic_machine | sed -e 's/mips3/mips64/'` ;; mips3*) basic_machine=`echo $basic_machine | sed -e 's/mips3/mips64/'`-unknown ;; monitor) basic_machine=m68k-rom68k os=-coff ;; morphos) basic_machine=powerpc-unknown os=-morphos ;; msdos) basic_machine=i386-pc os=-msdos ;; ms1-*) basic_machine=`echo $basic_machine | sed -e 's/ms1-/mt-/'` ;; mvs) basic_machine=i370-ibm os=-mvs ;; ncr3000) basic_machine=i486-ncr os=-sysv4 ;; netbsd386) basic_machine=i386-unknown os=-netbsd ;; netwinder) basic_machine=armv4l-rebel os=-linux ;; news | news700 | news800 | news900) basic_machine=m68k-sony os=-newsos ;; news1000) basic_machine=m68030-sony os=-newsos ;; news-3600 | risc-news) basic_machine=mips-sony os=-newsos ;; necv70) basic_machine=v70-nec os=-sysv ;; next | m*-next ) basic_machine=m68k-next case $os in -nextstep* ) ;; -ns2*) os=-nextstep2 ;; *) os=-nextstep3 ;; esac ;; nh3000) basic_machine=m68k-harris os=-cxux ;; nh[45]000) basic_machine=m88k-harris os=-cxux ;; nindy960) basic_machine=i960-intel os=-nindy ;; mon960) basic_machine=i960-intel os=-mon960 ;; nonstopux) basic_machine=mips-compaq os=-nonstopux ;; np1) basic_machine=np1-gould ;; neo-tandem) basic_machine=neo-tandem ;; nse-tandem) basic_machine=nse-tandem ;; nsr-tandem) basic_machine=nsr-tandem ;; op50n-* | op60c-*) basic_machine=hppa1.1-oki os=-proelf ;; openrisc | openrisc-*) basic_machine=or32-unknown ;; os400) basic_machine=powerpc-ibm os=-os400 ;; OSE68000 | ose68000) basic_machine=m68000-ericsson os=-ose ;; os68k) basic_machine=m68k-none os=-os68k ;; pa-hitachi) basic_machine=hppa1.1-hitachi os=-hiuxwe2 ;; paragon) basic_machine=i860-intel os=-osf ;; parisc) basic_machine=hppa-unknown os=-linux ;; parisc-*) basic_machine=hppa-`echo $basic_machine | sed 's/^[^-]*-//'` os=-linux ;; pbd) basic_machine=sparc-tti ;; pbb) basic_machine=m68k-tti ;; pc532 | pc532-*) basic_machine=ns32k-pc532 ;; pc98) basic_machine=i386-pc ;; pc98-*) basic_machine=i386-`echo $basic_machine | sed 's/^[^-]*-//'` 
;; pentium | p5 | k5 | k6 | nexgen | viac3) basic_machine=i586-pc ;; pentiumpro | p6 | 6x86 | athlon | athlon_*) basic_machine=i686-pc ;; pentiumii | pentium2 | pentiumiii | pentium3) basic_machine=i686-pc ;; pentium4) basic_machine=i786-pc ;; pentium-* | p5-* | k5-* | k6-* | nexgen-* | viac3-*) basic_machine=i586-`echo $basic_machine | sed 's/^[^-]*-//'` ;; pentiumpro-* | p6-* | 6x86-* | athlon-*) basic_machine=i686-`echo $basic_machine | sed 's/^[^-]*-//'` ;; pentiumii-* | pentium2-* | pentiumiii-* | pentium3-*) basic_machine=i686-`echo $basic_machine | sed 's/^[^-]*-//'` ;; pentium4-*) basic_machine=i786-`echo $basic_machine | sed 's/^[^-]*-//'` ;; pn) basic_machine=pn-gould ;; power) basic_machine=power-ibm ;; ppc) basic_machine=powerpc-unknown ;; ppc-*) basic_machine=powerpc-`echo $basic_machine | sed 's/^[^-]*-//'` ;; ppcle | powerpclittle | ppc-le | powerpc-little) basic_machine=powerpcle-unknown ;; ppcle-* | powerpclittle-*) basic_machine=powerpcle-`echo $basic_machine | sed 's/^[^-]*-//'` ;; ppc64) basic_machine=powerpc64-unknown ;; ppc64-*) basic_machine=powerpc64-`echo $basic_machine | sed 's/^[^-]*-//'` ;; ppc64le | powerpc64little | ppc64-le | powerpc64-little) basic_machine=powerpc64le-unknown ;; ppc64le-* | powerpc64little-*) basic_machine=powerpc64le-`echo $basic_machine | sed 's/^[^-]*-//'` ;; ps2) basic_machine=i386-ibm ;; pw32) basic_machine=i586-unknown os=-pw32 ;; rdos) basic_machine=i386-pc os=-rdos ;; rom68k) basic_machine=m68k-rom68k os=-coff ;; rm[46]00) basic_machine=mips-siemens ;; rtpc | rtpc-*) basic_machine=romp-ibm ;; s390 | s390-*) basic_machine=s390-ibm ;; s390x | s390x-*) basic_machine=s390x-ibm ;; sa29200) basic_machine=a29k-amd os=-udi ;; sb1) basic_machine=mipsisa64sb1-unknown ;; sb1el) basic_machine=mipsisa64sb1el-unknown ;; sde) basic_machine=mipsisa32-sde os=-elf ;; sei) basic_machine=mips-sei os=-seiux ;; sequent) basic_machine=i386-sequent ;; sh) basic_machine=sh-hitachi os=-hms ;; sh5el) basic_machine=sh5le-unknown ;; 
sh64) basic_machine=sh64-unknown ;; sparclite-wrs | simso-wrs) basic_machine=sparclite-wrs os=-vxworks ;; sps7) basic_machine=m68k-bull os=-sysv2 ;; spur) basic_machine=spur-unknown ;; st2000) basic_machine=m68k-tandem ;; stratus) basic_machine=i860-stratus os=-sysv4 ;; sun2) basic_machine=m68000-sun ;; sun2os3) basic_machine=m68000-sun os=-sunos3 ;; sun2os4) basic_machine=m68000-sun os=-sunos4 ;; sun3os3) basic_machine=m68k-sun os=-sunos3 ;; sun3os4) basic_machine=m68k-sun os=-sunos4 ;; sun4os3) basic_machine=sparc-sun os=-sunos3 ;; sun4os4) basic_machine=sparc-sun os=-sunos4 ;; sun4sol2) basic_machine=sparc-sun os=-solaris2 ;; sun3 | sun3-*) basic_machine=m68k-sun ;; sun4) basic_machine=sparc-sun ;; sun386 | sun386i | roadrunner) basic_machine=i386-sun ;; sv1) basic_machine=sv1-cray os=-unicos ;; symmetry) basic_machine=i386-sequent os=-dynix ;; t3e) basic_machine=alphaev5-cray os=-unicos ;; t90) basic_machine=t90-cray os=-unicos ;; # This must be matched before tile*. tilegx*) basic_machine=tilegx-unknown os=-linux-gnu ;; tile*) basic_machine=tile-unknown os=-linux-gnu ;; tx39) basic_machine=mipstx39-unknown ;; tx39el) basic_machine=mipstx39el-unknown ;; toad1) basic_machine=pdp10-xkl os=-tops20 ;; tower | tower-32) basic_machine=m68k-ncr ;; tpf) basic_machine=s390x-ibm os=-tpf ;; udi29k) basic_machine=a29k-amd os=-udi ;; ultra3) basic_machine=a29k-nyu os=-sym1 ;; v810 | necv810) basic_machine=v810-nec os=-none ;; vaxv) basic_machine=vax-dec os=-sysv ;; vms) basic_machine=vax-dec os=-vms ;; vpp*|vx|vx-*) basic_machine=f301-fujitsu ;; vxworks960) basic_machine=i960-wrs os=-vxworks ;; vxworks68) basic_machine=m68k-wrs os=-vxworks ;; vxworks29k) basic_machine=a29k-wrs os=-vxworks ;; w65*) basic_machine=w65-wdc os=-none ;; w89k-*) basic_machine=hppa1.1-winbond os=-proelf ;; xbox) basic_machine=i686-pc os=-mingw32 ;; xps | xps100) basic_machine=xps100-honeywell ;; ymp) basic_machine=ymp-cray os=-unicos ;; z8k-*-coff) basic_machine=z8k-unknown os=-sim ;; z80-*-coff) 
basic_machine=z80-unknown os=-sim ;; none) basic_machine=none-none os=-none ;; # Here we handle the default manufacturer of certain CPU types. It is in # some cases the only manufacturer, in others, it is the most popular. w89k) basic_machine=hppa1.1-winbond ;; op50n) basic_machine=hppa1.1-oki ;; op60c) basic_machine=hppa1.1-oki ;; romp) basic_machine=romp-ibm ;; mmix) basic_machine=mmix-knuth ;; rs6000) basic_machine=rs6000-ibm ;; vax) basic_machine=vax-dec ;; pdp10) # there are many clones, so DEC is not a safe bet basic_machine=pdp10-unknown ;; pdp11) basic_machine=pdp11-dec ;; we32k) basic_machine=we32k-att ;; sh[1234] | sh[24]a | sh[24]aeb | sh[34]eb | sh[1234]le | sh[23]ele) basic_machine=sh-unknown ;; sparc | sparcv8 | sparcv9 | sparcv9b | sparcv9v) basic_machine=sparc-sun ;; cydra) basic_machine=cydra-cydrome ;; orion) basic_machine=orion-highlevel ;; orion105) basic_machine=clipper-highlevel ;; mac | mpw | mac-mpw) basic_machine=m68k-apple ;; pmac | pmac-mpw) basic_machine=powerpc-apple ;; *-unknown) # Make sure to match an already-canonicalized machine name. ;; *) echo Invalid configuration \`$1\': machine \`$basic_machine\' not recognized 1>&2 exit 1 ;; esac # Here we canonicalize certain aliases for manufacturers. case $basic_machine in *-digital*) basic_machine=`echo $basic_machine | sed 's/digital.*/dec/'` ;; *-commodore*) basic_machine=`echo $basic_machine | sed 's/commodore.*/cbm/'` ;; *) ;; esac # Decode manufacturer-specific aliases for certain operating systems. if [ x"$os" != x"" ] then case $os in # First match some system type aliases # that might get confused with valid system types. # -solaris* is a basic system type, with this one exception. -auroraux) os=-auroraux ;; -solaris1 | -solaris1.*) os=`echo $os | sed -e 's|solaris1|sunos4|'` ;; -solaris) os=-solaris2 ;; -svr4*) os=-sysv4 ;; -unixware*) os=-sysv4.2uw ;; -gnu/linux*) os=`echo $os | sed -e 's|gnu/linux|linux-gnu|'` ;; # First accept the basic system types. 
# The portable systems comes first. # Each alternative MUST END IN A *, to match a version number. # -sysv* is not here because it comes later, after sysvr4. -gnu* | -bsd* | -mach* | -minix* | -genix* | -ultrix* | -irix* \ | -*vms* | -sco* | -esix* | -isc* | -aix* | -cnk* | -sunos | -sunos[34]*\ | -hpux* | -unos* | -osf* | -luna* | -dgux* | -auroraux* | -solaris* \ | -sym* | -kopensolaris* \ | -amigaos* | -amigados* | -msdos* | -newsos* | -unicos* | -aof* \ | -aos* | -aros* \ | -nindy* | -vxsim* | -vxworks* | -ebmon* | -hms* | -mvs* \ | -clix* | -riscos* | -uniplus* | -iris* | -rtu* | -xenix* \ | -hiux* | -386bsd* | -knetbsd* | -mirbsd* | -netbsd* \ | -openbsd* | -solidbsd* \ | -ekkobsd* | -kfreebsd* | -freebsd* | -riscix* | -lynxos* \ | -bosx* | -nextstep* | -cxux* | -aout* | -elf* | -oabi* \ | -ptx* | -coff* | -ecoff* | -winnt* | -domain* | -vsta* \ | -udi* | -eabi* | -lites* | -ieee* | -go32* | -aux* \ | -chorusos* | -chorusrdb* | -cegcc* \ | -cygwin* | -pe* | -psos* | -moss* | -proelf* | -rtems* \ | -mingw32* | -linux-gnu* | -linux-android* \ | -linux-newlib* | -linux-uclibc* \ | -uxpv* | -beos* | -mpeix* | -udk* \ | -interix* | -uwin* | -mks* | -rhapsody* | -darwin* | -opened* \ | -openstep* | -oskit* | -conix* | -pw32* | -nonstopux* \ | -storm-chaos* | -tops10* | -tenex* | -tops20* | -its* \ | -os2* | -vos* | -palmos* | -uclinux* | -nucleus* \ | -morphos* | -superux* | -rtmk* | -rtmk-nova* | -windiss* \ | -powermax* | -dnix* | -nx6 | -nx7 | -sei* | -dragonfly* \ | -skyos* | -haiku* | -rdos* | -toppers* | -drops* | -es*) # Remember, each alternative MUST END IN *, to match a version number. 
;; -qnx*) case $basic_machine in x86-* | i*86-*) ;; *) os=-nto$os ;; esac ;; -nto-qnx*) ;; -nto*) os=`echo $os | sed -e 's|nto|nto-qnx|'` ;; -sim | -es1800* | -hms* | -xray | -os68k* | -none* | -v88r* \ | -windows* | -osx | -abug | -netware* | -os9* | -beos* | -haiku* \ | -macos* | -mpw* | -magic* | -mmixware* | -mon960* | -lnews*) ;; -mac*) os=`echo $os | sed -e 's|mac|macos|'` ;; -linux-dietlibc) os=-linux-dietlibc ;; -linux*) os=`echo $os | sed -e 's|linux|linux-gnu|'` ;; -sunos5*) os=`echo $os | sed -e 's|sunos5|solaris2|'` ;; -sunos6*) os=`echo $os | sed -e 's|sunos6|solaris3|'` ;; -opened*) os=-openedition ;; -os400*) os=-os400 ;; -wince*) os=-wince ;; -osfrose*) os=-osfrose ;; -osf*) os=-osf ;; -utek*) os=-bsd ;; -dynix*) os=-bsd ;; -acis*) os=-aos ;; -atheos*) os=-atheos ;; -syllable*) os=-syllable ;; -386bsd) os=-bsd ;; -ctix* | -uts*) os=-sysv ;; -nova*) os=-rtmk-nova ;; -ns2 ) os=-nextstep2 ;; -nsk*) os=-nsk ;; # Preserve the version number of sinix5. -sinix5.*) os=`echo $os | sed -e 's|sinix|sysv|'` ;; -sinix*) os=-sysv4 ;; -tpf*) os=-tpf ;; -triton*) os=-sysv3 ;; -oss*) os=-sysv3 ;; -svr4) os=-sysv4 ;; -svr3) os=-sysv3 ;; -sysvr4) os=-sysv4 ;; # This must come after -sysvr4. -sysv*) ;; -ose*) os=-ose ;; -es1800*) os=-ose ;; -xenix) os=-xenix ;; -*mint | -mint[0-9]* | -*MiNT | -MiNT[0-9]*) os=-mint ;; -aros*) os=-aros ;; -kaos*) os=-kaos ;; -zvmoe) os=-zvmoe ;; -dicos*) os=-dicos ;; -nacl*) ;; -none) ;; *) # Get rid of the `-' at the beginning of $os. os=`echo $os | sed 's/[^-]*-//'` echo Invalid configuration \`$1\': system \`$os\' not recognized 1>&2 exit 1 ;; esac else # Here we handle the default operating systems that come with various machines. # The value should be what the vendor currently ships out the door with their # machine or put another way, the most popular os provided with the machine. 
# Note that if you're going to try to match "-MANUFACTURER" here (say, # "-sun"), then you have to tell the case statement up towards the top # that MANUFACTURER isn't an operating system. Otherwise, code above # will signal an error saying that MANUFACTURER isn't an operating # system, and we'll never get to this point. case $basic_machine in score-*) os=-elf ;; spu-*) os=-elf ;; *-acorn) os=-riscix1.2 ;; arm*-rebel) os=-linux ;; arm*-semi) os=-aout ;; c4x-* | tic4x-*) os=-coff ;; tic54x-*) os=-coff ;; tic55x-*) os=-coff ;; tic6x-*) os=-coff ;; # This must come before the *-dec entry. pdp10-*) os=-tops20 ;; pdp11-*) os=-none ;; *-dec | vax-*) os=-ultrix4.2 ;; m68*-apollo) os=-domain ;; i386-sun) os=-sunos4.0.2 ;; m68000-sun) os=-sunos3 # This also exists in the configure program, but was not the # default. # os=-sunos4 ;; m68*-cisco) os=-aout ;; mep-*) os=-elf ;; mips*-cisco) os=-elf ;; mips*-*) os=-elf ;; or32-*) os=-coff ;; *-tti) # must be before sparc entry or we get the wrong os. os=-sysv3 ;; sparc-* | *-sun) os=-sunos4.1.1 ;; *-be) os=-beos ;; *-haiku) os=-haiku ;; *-ibm) os=-aix ;; *-knuth) os=-mmixware ;; *-wec) os=-proelf ;; *-winbond) os=-proelf ;; *-oki) os=-proelf ;; *-hp) os=-hpux ;; *-hitachi) os=-hiux ;; i860-* | *-att | *-ncr | *-altos | *-motorola | *-convergent) os=-sysv ;; *-cbm) os=-amigaos ;; *-dg) os=-dgux ;; *-dolphin) os=-sysv3 ;; m68k-ccur) os=-rtu ;; m88k-omron*) os=-luna ;; *-next ) os=-nextstep ;; *-sequent) os=-ptx ;; *-crds) os=-unos ;; *-ns) os=-genix ;; i370-*) os=-mvs ;; *-next) os=-nextstep3 ;; *-gould) os=-sysv ;; *-highlevel) os=-bsd ;; *-encore) os=-bsd ;; *-sgi) os=-irix ;; *-siemens) os=-sysv4 ;; *-masscomp) os=-rtu ;; f30[01]-fujitsu | f700-fujitsu) os=-uxpv ;; *-rom68k) os=-coff ;; *-*bug) os=-coff ;; *-apple) os=-macos ;; *-atari*) os=-mint ;; *) os=-none ;; esac fi # Here we handle the case where we know the os, and the CPU type, but not the # manufacturer. We pick the logical manufacturer. 
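# Illustrative aside (not part of upstream config.sub): a minimal sketch of
# the vendor-defaulting step that follows, wrapped in a hypothetical helper
# named example_fill_vendor so it can be tried in isolation. It covers only
# three of the OS patterns the real code handles, uses its own ex_* variables,
# and is never called by this script.
example_fill_vendor ()
{
  # $1 = CPU-VENDOR pair, $2 = OS suffix including its leading dash
  ex_vendor=unknown
  ex_machine=$1
  case $ex_machine in
    *-unknown)
      case $2 in
        -sunos*) ex_vendor=sun ;;
        -aix*)   ex_vendor=ibm ;;
        -hpux*)  ex_vendor=hp ;;
      esac
      # Swap the placeholder vendor for the one implied by the OS, if any.
      ex_machine=`echo "$ex_machine" | sed "s/unknown/$ex_vendor/"`
      ;;
  esac
  echo "$ex_machine$2"
}
# Usage, assuming the helper above has been sourced:
#   example_fill_vendor sparc-unknown -sunos4    prints sparc-sun-sunos4
#   example_fill_vendor i686-pc -linux-gnu       prints i686-pc-linux-gnu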
vendor=unknown case $basic_machine in *-unknown) case $os in -riscix*) vendor=acorn ;; -sunos*) vendor=sun ;; -cnk*|-aix*) vendor=ibm ;; -beos*) vendor=be ;; -hpux*) vendor=hp ;; -mpeix*) vendor=hp ;; -hiux*) vendor=hitachi ;; -unos*) vendor=crds ;; -dgux*) vendor=dg ;; -luna*) vendor=omron ;; -genix*) vendor=ns ;; -mvs* | -opened*) vendor=ibm ;; -os400*) vendor=ibm ;; -ptx*) vendor=sequent ;; -tpf*) vendor=ibm ;; -vxsim* | -vxworks* | -windiss*) vendor=wrs ;; -aux*) vendor=apple ;; -hms*) vendor=hitachi ;; -mpw* | -macos*) vendor=apple ;; -*mint | -mint[0-9]* | -*MiNT | -MiNT[0-9]*) vendor=atari ;; -vos*) vendor=stratus ;; esac basic_machine=`echo $basic_machine | sed "s/unknown/$vendor/"` ;; esac echo $basic_machine$os exit # Local variables: # eval: (add-hook 'write-file-hooks 'time-stamp) # time-stamp-start: "timestamp='" # time-stamp-format: "%:y-%02m-%02d" # time-stamp-end: "'" # End: xvidcore/build/generic/libxvidcore.ld0000664000076500007650000000037411505120263021030 0ustar xvidbuildxvidbuild{ global: xvid_global; xvid_decore; xvid_encore; xvid_plugin_single; xvid_plugin_2pass1; xvid_plugin_2pass2; xvid_plugin_lumimasking; xvid_plugin_dump; xvid_plugin_psnr; xvid_plugin_ssim; xvid_plugin_psnrhvsm; local: *; }; xvidcore/build/generic/libxvidcore.def0000664000076500007650000000034111454411350021164 0ustar xvidbuildxvidbuildEXPORTS xvid_global; xvid_decore; xvid_encore; xvid_plugin_single; xvid_plugin_2pass1; xvid_plugin_2pass2; xvid_plugin_lumimasking; xvid_plugin_dump; xvid_plugin_psnr; xvid_plugin_ssim; xvid_plugin_psnrhvsm; xvidcore/build/generic/install-sh0000755000076500007650000003272511566433175020224 0ustar xvidbuildxvidbuild#!/bin/sh # install - install a program, script, or datafile scriptversion=2010-02-06.18; # UTC # This originates from X11R5 (mit/util/scripts/install.sh), which was # later released in X11R6 (xc/config/util/install.sh) with the # following copyright and license. 
# # Copyright (C) 1994 X Consortium # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to # deal in the Software without restriction, including without limitation the # rights to use, copy, modify, merge, publish, distribute, sublicense, and/or # sell copies of the Software, and to permit persons to whom the Software is # furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included in # all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE # X CONSORTIUM BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN # AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNEC- # TION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # # Except as contained in this notice, the name of the X Consortium shall not # be used in advertising or otherwise to promote the sale, use or other deal- # ings in this Software without prior written authorization from the X Consor- # tium. # # # FSF changes to this file are in the public domain. # # Calling this script install-sh is preferred over install.sh, to prevent # `make' implicit rules from creating a file called install from it # when there is no Makefile. # # This script is compatible with the BSD install script, but was written # from scratch. nl=' ' IFS=" "" $nl" # set DOITPROG to echo to test this script # Don't use :- since 4.3BSD and earlier shells don't like it. doit=${DOITPROG-} if test -z "$doit"; then doit_exec=exec else doit_exec=$doit fi # Put in absolute file names if you don't have them in your path; # or use environment vars. 
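# Illustrative aside (not part of upstream install-sh): a minimal sketch of
# the ${VAR-default} form used by the assignments that follow. The helper
# example_cpprog and the variable EXAMPLE_CPPROG are hypothetical names chosen
# for this sketch; the script never calls the helper.
example_cpprog ()
{
  # Without the colon, the default applies only when EXAMPLE_CPPROG is unset;
  # a value that is set but empty is kept as-is.
  echo "${EXAMPLE_CPPROG-cp}"
}
# Usage, assuming the helper above has been sourced:
#   unset EXAMPLE_CPPROG;    example_cpprog    prints cp
#   EXAMPLE_CPPROG='cp -p';  example_cpprog    prints cp -p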
chgrpprog=${CHGRPPROG-chgrp} chmodprog=${CHMODPROG-chmod} chownprog=${CHOWNPROG-chown} cmpprog=${CMPPROG-cmp} cpprog=${CPPROG-cp} mkdirprog=${MKDIRPROG-mkdir} mvprog=${MVPROG-mv} rmprog=${RMPROG-rm} stripprog=${STRIPPROG-strip} posix_glob='?' initialize_posix_glob=' test "$posix_glob" != "?" || { if (set -f) 2>/dev/null; then posix_glob= else posix_glob=: fi } ' posix_mkdir= # Desired mode of installed file. mode=0755 chgrpcmd= chmodcmd=$chmodprog chowncmd= mvcmd=$mvprog rmcmd="$rmprog -f" stripcmd= src= dst= dir_arg= dst_arg= copy_on_change=false no_target_directory= usage="\ Usage: $0 [OPTION]... [-T] SRCFILE DSTFILE or: $0 [OPTION]... SRCFILES... DIRECTORY or: $0 [OPTION]... -t DIRECTORY SRCFILES... or: $0 [OPTION]... -d DIRECTORIES... In the 1st form, copy SRCFILE to DSTFILE. In the 2nd and 3rd, copy all SRCFILES to DIRECTORY. In the 4th, create DIRECTORIES. Options: --help display this help and exit. --version display version info and exit. -c (ignored) -C install only if different (preserve the last data modification time) -d create directories instead of installing files. -g GROUP $chgrpprog installed files to GROUP. -m MODE $chmodprog installed files to MODE. -o USER $chownprog installed files to USER. -s $stripprog installed files. -t DIRECTORY install into DIRECTORY. -T report an error if DSTFILE is a directory. 
Environment variables override the default commands: CHGRPPROG CHMODPROG CHOWNPROG CMPPROG CPPROG MKDIRPROG MVPROG RMPROG STRIPPROG " while test $# -ne 0; do case $1 in -c) ;; -C) copy_on_change=true;; -d) dir_arg=true;; -g) chgrpcmd="$chgrpprog $2" shift;; --help) echo "$usage"; exit $?;; -m) mode=$2 case $mode in *' '* | *' '* | *' '* | *'*'* | *'?'* | *'['*) echo "$0: invalid mode: $mode" >&2 exit 1;; esac shift;; -o) chowncmd="$chownprog $2" shift;; -s) stripcmd=$stripprog;; -t) dst_arg=$2 shift;; -T) no_target_directory=true;; --version) echo "$0 $scriptversion"; exit $?;; --) shift break;; -*) echo "$0: invalid option: $1" >&2 exit 1;; *) break;; esac shift done if test $# -ne 0 && test -z "$dir_arg$dst_arg"; then # When -d is used, all remaining arguments are directories to create. # When -t is used, the destination is already specified. # Otherwise, the last argument is the destination. Remove it from $@. for arg do if test -n "$dst_arg"; then # $@ is not empty: it contains at least $arg. set fnord "$@" "$dst_arg" shift # fnord fi shift # arg dst_arg=$arg done fi if test $# -eq 0; then if test -z "$dir_arg"; then echo "$0: no input file specified." >&2 exit 1 fi # It's OK to call `install-sh -d' without argument. # This can happen when creating conditional directories. exit 0 fi if test -z "$dir_arg"; then do_exit='(exit $ret); exit $ret' trap "ret=129; $do_exit" 1 trap "ret=130; $do_exit" 2 trap "ret=141; $do_exit" 13 trap "ret=143; $do_exit" 15 # Set umask so as not to create temps with too-generous modes. # However, 'strip' requires both read and write access to temps. case $mode in # Optimize common cases. *644) cp_umask=133;; *755) cp_umask=22;; *[0-7]) if test -z "$stripcmd"; then u_plus_rw= else u_plus_rw='% 200' fi cp_umask=`expr '(' 777 - $mode % 1000 ')' $u_plus_rw`;; *) if test -z "$stripcmd"; then u_plus_rw= else u_plus_rw=,u+rw fi cp_umask=$mode$u_plus_rw;; esac fi for src do # Protect names starting with `-'. 
case $src in -*) src=./$src;; esac if test -n "$dir_arg"; then dst=$src dstdir=$dst test -d "$dstdir" dstdir_status=$? else # Waiting for this to be detected by the "$cpprog $src $dsttmp" command # might cause directories to be created, which would be especially bad # if $src (and thus $dsttmp) contains '*'. if test ! -f "$src" && test ! -d "$src"; then echo "$0: $src does not exist." >&2 exit 1 fi if test -z "$dst_arg"; then echo "$0: no destination specified." >&2 exit 1 fi dst=$dst_arg # Protect names starting with `-'. case $dst in -*) dst=./$dst;; esac # If destination is a directory, append the input filename; won't work # if double slashes aren't ignored. if test -d "$dst"; then if test -n "$no_target_directory"; then echo "$0: $dst_arg: Is a directory" >&2 exit 1 fi dstdir=$dst dst=$dstdir/`basename "$src"` dstdir_status=0 else # Prefer dirname, but fall back on a substitute if dirname fails. dstdir=` (dirname "$dst") 2>/dev/null || expr X"$dst" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$dst" : 'X\(//\)[^/]' \| \ X"$dst" : 'X\(//\)$' \| \ X"$dst" : 'X\(/\)' \| . 2>/dev/null || echo X"$dst" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q' ` test -d "$dstdir" dstdir_status=$? fi fi obsolete_mkdir_used=false if test $dstdir_status != 0; then case $posix_mkdir in '') # Create intermediate dirs using mode 755 as modified by the umask. # This is like FreeBSD 'install' as of 1997-10-28. umask=`umask` case $stripcmd.$umask in # Optimize common cases. *[2367][2367]) mkdir_umask=$umask;; .*0[02][02] | .[02][02] | .[02]) mkdir_umask=22;; *[0-7]) mkdir_umask=`expr $umask + 22 \ - $umask % 100 % 40 + $umask % 20 \ - $umask % 10 % 4 + $umask % 2 `;; *) mkdir_umask=$umask,go-w;; esac # With -d, create the new directory with the user-specified mode. # Otherwise, rely on $mkdir_umask. 
        if test -n "$dir_arg"; then
          mkdir_mode=-m$mode
        else
          mkdir_mode=
        fi

        posix_mkdir=false
        case $umask in
          *[123567][0-7][0-7])
            # POSIX mkdir -p sets u+wx bits regardless of umask, which
            # is incompatible with FreeBSD 'install' when (umask & 300) != 0.
            ;;
          *)
            tmpdir=${TMPDIR-/tmp}/ins$RANDOM-$$
            trap 'ret=$?; rmdir "$tmpdir/d" "$tmpdir" 2>/dev/null; exit $ret' 0

            if (umask $mkdir_umask &&
                exec $mkdirprog $mkdir_mode -p -- "$tmpdir/d") >/dev/null 2>&1
            then
              if test -z "$dir_arg" || {
                   # Check for POSIX incompatibilities with -m.
                   # HP-UX 11.23 and IRIX 6.5 mkdir -m -p sets group- or
                   # other-writeable bit of parent directory when it shouldn't.
                   # FreeBSD 6.1 mkdir -m -p sets mode of existing directory.
                   ls_ld_tmpdir=`ls -ld "$tmpdir"`
                   case $ls_ld_tmpdir in
                     d????-?r-*) different_mode=700;;
                     d????-?--*) different_mode=755;;
                     *) false;;
                   esac &&
                   $mkdirprog -m$different_mode -p -- "$tmpdir" && {
                     ls_ld_tmpdir_1=`ls -ld "$tmpdir"`
                     test "$ls_ld_tmpdir" = "$ls_ld_tmpdir_1"
                   }
                 }
              then posix_mkdir=:
              fi
              rmdir "$tmpdir/d" "$tmpdir"
            else
              # Remove any dirs left behind by ancient mkdir implementations.
              rmdir ./$mkdir_mode ./-p ./-- 2>/dev/null
            fi
            trap '' 0;;
        esac;;
    esac

    if
      $posix_mkdir && (
        umask $mkdir_umask &&
        $doit_exec $mkdirprog $mkdir_mode -p -- "$dstdir"
      )
    then :
    else

      # The umask is ridiculous, or mkdir does not conform to POSIX,
      # or it failed possibly due to a race condition. Create the
      # directory the slow way, step by step, checking for races as we go.

      case $dstdir in
        /*) prefix='/';;
        -*) prefix='./';;
        *)  prefix='';;
      esac

      eval "$initialize_posix_glob"

      oIFS=$IFS
      IFS=/
      $posix_glob set -f
      set fnord $dstdir
      shift
      $posix_glob set +f
      IFS=$oIFS

      prefixes=

      for d
      do
        test -z "$d" && continue

        prefix=$prefix$d
        if test -d "$prefix"; then
          prefixes=
        else
          if $posix_mkdir; then
            (umask $mkdir_umask &&
             $doit_exec $mkdirprog $mkdir_mode -p -- "$dstdir") && break
            # Don't fail if two instances are running concurrently.
test -d "$prefix" || exit 1 else case $prefix in *\'*) qprefix=`echo "$prefix" | sed "s/'/'\\\\\\\\''/g"`;; *) qprefix=$prefix;; esac prefixes="$prefixes '$qprefix'" fi fi prefix=$prefix/ done if test -n "$prefixes"; then # Don't fail if two instances are running concurrently. (umask $mkdir_umask && eval "\$doit_exec \$mkdirprog $prefixes") || test -d "$dstdir" || exit 1 obsolete_mkdir_used=true fi fi fi if test -n "$dir_arg"; then { test -z "$chowncmd" || $doit $chowncmd "$dst"; } && { test -z "$chgrpcmd" || $doit $chgrpcmd "$dst"; } && { test "$obsolete_mkdir_used$chowncmd$chgrpcmd" = false || test -z "$chmodcmd" || $doit $chmodcmd $mode "$dst"; } || exit 1 else # Make a couple of temp file names in the proper directory. dsttmp=$dstdir/_inst.$$_ rmtmp=$dstdir/_rm.$$_ # Trap to clean up those temp files at exit. trap 'ret=$?; rm -f "$dsttmp" "$rmtmp" && exit $ret' 0 # Copy the file name to the temp name. (umask $cp_umask && $doit_exec $cpprog "$src" "$dsttmp") && # and set any options; do chmod last to preserve setuid bits. # # If any of these fail, we abort the whole thing. If we want to # ignore errors from any of these, just make sure not to ignore # errors from the above "$doit $cpprog $src $dsttmp" command. # { test -z "$chowncmd" || $doit $chowncmd "$dsttmp"; } && { test -z "$chgrpcmd" || $doit $chgrpcmd "$dsttmp"; } && { test -z "$stripcmd" || $doit $stripcmd "$dsttmp"; } && { test -z "$chmodcmd" || $doit $chmodcmd $mode "$dsttmp"; } && # If -C, don't bother to copy if it wouldn't change the file. if $copy_on_change && old=`LC_ALL=C ls -dlL "$dst" 2>/dev/null` && new=`LC_ALL=C ls -dlL "$dsttmp" 2>/dev/null` && eval "$initialize_posix_glob" && $posix_glob set -f && set X $old && old=:$2:$4:$5:$6 && set X $new && new=:$2:$4:$5:$6 && $posix_glob set +f && test "$old" = "$new" && $cmpprog "$dst" "$dsttmp" >/dev/null 2>&1 then rm -f "$dsttmp" else # Rename the file to the real destination. 
$doit $mvcmd -f "$dsttmp" "$dst" 2>/dev/null || # The rename failed, perhaps because mv can't rename something else # to itself, or perhaps because mv is so ancient that it does not # support -f. { # Now remove or move aside any old file at destination location. # We try this two ways since rm can't unlink itself on some # systems and the destination file might be busy for other # reasons. In this case, the final cleanup might fail but the new # file should still install successfully. { test ! -f "$dst" || $doit $rmcmd -f "$dst" 2>/dev/null || { $doit $mvcmd -f "$dst" "$rmtmp" 2>/dev/null && { $doit $rmcmd -f "$rmtmp" 2>/dev/null; :; } } || { echo "$0: cannot unlink or rename $dst" >&2 (exit 1); exit 1 } } && # Now rename the file to the real destination. $doit $mvcmd "$dsttmp" "$dst" } fi || exit 1 trap '' 0 fi done # Local variables: # eval: (add-hook 'write-file-hooks 'time-stamp) # time-stamp-start: "scriptversion=" # time-stamp-format: "%:y-%02m-%02d.%02H" # time-stamp-time-zone: "UTC" # time-stamp-end: "; # UTC" # End: xvidcore/build/generic/Makefile0000664000076500007650000001716111565152156017653 0ustar xvidbuildxvidbuild############################################################################## # # - Unified Makefile for Xvid for *nix environments - # # Copyright(C) 2003-2004 Edouard Gomez # # # Description: # This Makefile allows building Xvid sources to obtain a shared library # and a static library. This Makefile uses variables defined in the # platform.inc file. This platform.inc file is usually created by the # ./configure script whenever a unix shell is available. 
#
# Makefile functional dependencies:
#  - echo
#  - rm (with option -r and -f)
#  - cd
#  - make VPATH support (eg: GNU make, solaris 8 make)
#  - ar
#
# Building output:
#  - C means "_C_ompiling"
#  - A means "_A_ssembling"
#  - I means "_I_nstalling"
#  - D means "creating _D_irectory"
#  - Cl means "_Cl_eaning"
#
# NB: (for mingw32/djgpp users)
#   These 2 environments do not provide a shell by default. So it's impossible
#   to use the configure script to generate a platform.inc file suitable for
#   your machine. You have two choices:
#    - install minsys from the mingw project or install cygwin and then use
#      the configure script as on a unix system.
#    - write a platform.inc file by hand.
#
# PS: default build directory is "=build", it fits naming conventions that
#     make the arch/tla revision control program ignore files contained in
#     this directory during commit operations. This choice is completely
#     arbitrary, but try not to change it.
#
##############################################################################

include sources.inc

ifeq ($(findstring $(MAKECMDGOALS), clean distclean mrproper),)
include platform.inc
endif

RM = rm -rf

##############################################################################
#
# Build rules
#
##############################################################################

# Their Objects
OBJECTS  = $(GENERIC_OBJECTS)
OBJECTS += $(ASSEMBLY_OBJECTS)
OBJECTS += $(DCT_IA64_OBJECTS)
OBJECTS += $(PPC_ALTIVEC_OBJECTS)

# The VPATH mechanism could use a "per target" build directory
# To keep it simple at the moment, the directory is fixed to "=build"
BUILD_DIR = =build
VPATH     = $(SRC_DIR):$(BUILD_DIR)

#-----------------------------------------------------------------------------
# The default rule
#-----------------------------------------------------------------------------

.SUFFIXES: .$(OBJECT_EXTENSION) .$(ASSEMBLY_EXTENSION) .c

all: info $(STATIC_LIB) $(SHARED_LIB)
	@echo
	@echo "---------------------------------------------------------------"
@echo " Xvid has been successfully built." @echo @echo " * Binaries are currently located in the '$(BUILD_DIR)' directory" @echo " * To install them on your system, you can run '# make install'" @echo " as root." @echo "---------------------------------------------------------------" @echo $(OBJECTS): platform.inc $(BUILD_DIR): @echo " D: $(BUILD_DIR)" @$(INSTALL) -d $(BUILD_DIR) #----------------------------------------------------------------------------- # Generic assembly rule #----------------------------------------------------------------------------- .$(ASSEMBLY_EXTENSION).$(OBJECT_EXTENSION): @echo " A: $(@D)/$(. # # # Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, # 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software # Foundation, Inc. # # # This configure script is free software; the Free Software Foundation # gives unlimited permission to copy, distribute and modify it. ## -------------------- ## ## M4sh Initialization. ## ## -------------------- ## # Be more Bourne compatible DUALCASE=1; export DUALCASE # for MKS sh if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi as_nl=' ' export as_nl # Printing a long string crashes Solaris 7 /usr/bin/printf. as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\' as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo # Prefer a ksh shell builtin over an external printf program on Solaris, # but without wasting forks for bash or zsh. 
if test -z "$BASH_VERSION$ZSH_VERSION" \ && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='print -r --' as_echo_n='print -rn --' elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='printf %s\n' as_echo_n='printf %s' else if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"' as_echo_n='/usr/ucb/echo -n' else as_echo_body='eval expr "X$1" : "X\\(.*\\)"' as_echo_n_body='eval arg=$1; case $arg in #( *"$as_nl"*) expr "X$arg" : "X\\(.*\\)$as_nl"; arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;; esac; expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl" ' export as_echo_n_body as_echo_n='sh -c $as_echo_n_body as_echo' fi export as_echo_body as_echo='sh -c $as_echo_body as_echo' fi # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then PATH_SEPARATOR=: (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 || PATH_SEPARATOR=';' } fi # IFS # We need space, tab and new line, in precisely that order. Quoting is # there to prevent editors from complaining about space-tab. # (If _AS_PATH_WALK were called with IFS unset, it would disable word # splitting by setting IFS to empty value.) IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. as_myself= case $0 in #(( *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done IFS=$as_save_IFS ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then $as_echo "$as_myself: error: cannot find myself; rerun with an absolute file name" >&2 exit 1 fi # Unset variables that we do not need and which cause bugs (e.g. 
in # pre-3.0 UWIN ksh). But do not cause bugs in bash 2.01; the "|| exit 1" # suppresses any "Segmentation fault" message there. '((' could # trigger a bug in pdksh 5.2.14. for as_var in BASH_ENV ENV MAIL MAILPATH do eval test x\${$as_var+set} = xset \ && ( (unset $as_var) || exit 1) >/dev/null 2>&1 && unset $as_var || : done PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. LC_ALL=C export LC_ALL LANGUAGE=C export LANGUAGE # CDPATH. (unset CDPATH) >/dev/null 2>&1 && unset CDPATH if test "x$CONFIG_SHELL" = x; then as_bourne_compatible="if test -n \"\${ZSH_VERSION+set}\" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on \${1+\"\$@\"}, which # is contrary to our usage. Disable this feature. alias -g '\${1+\"\$@\"}'='\"\$@\"' setopt NO_GLOB_SUBST else case \`(set -o) 2>/dev/null\` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi " as_required="as_fn_return () { (exit \$1); } as_fn_success () { as_fn_return 0; } as_fn_failure () { as_fn_return 1; } as_fn_ret_success () { return 0; } as_fn_ret_failure () { return 1; } exitcode=0 as_fn_success || { exitcode=1; echo as_fn_success failed.; } as_fn_failure && { exitcode=1; echo as_fn_failure succeeded.; } as_fn_ret_success || { exitcode=1; echo as_fn_ret_success failed.; } as_fn_ret_failure && { exitcode=1; echo as_fn_ret_failure succeeded.; } if ( set x; as_fn_ret_success y && test x = \"\$1\" ); then : else exitcode=1; echo positional parameters were not saved. 
fi test x\$exitcode = x0 || exit 1" as_suggested=" as_lineno_1=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" as_lineno_1a=\$LINENO as_lineno_2=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" as_lineno_2a=\$LINENO eval 'test \"x\$as_lineno_1'\$as_run'\" != \"x\$as_lineno_2'\$as_run'\" && test \"x\`expr \$as_lineno_1'\$as_run' + 1\`\" = \"x\$as_lineno_2'\$as_run'\"' || exit 1 test \$(( 1 + 1 )) = 2 || exit 1" if (eval "$as_required") 2>/dev/null; then : as_have_required=yes else as_have_required=no fi if test x$as_have_required = xyes && (eval "$as_suggested") 2>/dev/null; then : else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR as_found=false for as_dir in /bin$PATH_SEPARATOR/usr/bin$PATH_SEPARATOR$PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. as_found=: case $as_dir in #( /*) for as_base in sh bash ksh sh5; do # Try only shells that exist, to save several forks. as_shell=$as_dir/$as_base if { test -f "$as_shell" || test -f "$as_shell.exe"; } && { $as_echo "$as_bourne_compatible""$as_required" | as_run=a "$as_shell"; } 2>/dev/null; then : CONFIG_SHELL=$as_shell as_have_required=yes if { $as_echo "$as_bourne_compatible""$as_suggested" | as_run=a "$as_shell"; } 2>/dev/null; then : break 2 fi fi done;; esac as_found=false done $as_found || { if { test -f "$SHELL" || test -f "$SHELL.exe"; } && { $as_echo "$as_bourne_compatible""$as_required" | as_run=a "$SHELL"; } 2>/dev/null; then : CONFIG_SHELL=$SHELL as_have_required=yes fi; } IFS=$as_save_IFS if test "x$CONFIG_SHELL" != x; then : # We cannot yet assume a decent shell, so we have to provide a # neutralization value for shells without unset; and this also # works around shells that cannot unset nonexistent variables. # Preserve -v and -x to the replacement shell. 
BASH_ENV=/dev/null ENV=/dev/null (unset BASH_ENV) >/dev/null 2>&1 && unset BASH_ENV ENV export CONFIG_SHELL case $- in # (((( *v*x* | *x*v* ) as_opts=-vx ;; *v* ) as_opts=-v ;; *x* ) as_opts=-x ;; * ) as_opts= ;; esac exec "$CONFIG_SHELL" $as_opts "$as_myself" ${1+"$@"} fi if test x$as_have_required = xno; then : $as_echo "$0: This script requires a shell more modern than all" $as_echo "$0: the shells that I found on your system." if test x${ZSH_VERSION+set} = xset ; then $as_echo "$0: In particular, zsh $ZSH_VERSION has bugs and should" $as_echo "$0: be upgraded to zsh 4.3.4 or later." else $as_echo "$0: Please tell bug-autoconf@gnu.org and $0: xvid-devel@xvid.org about your system, including any $0: error possibly output before this message. Then install $0: a modern shell, or manually run the script under such a $0: shell if you do have one." fi exit 1 fi fi fi SHELL=${CONFIG_SHELL-/bin/sh} export SHELL # Unset more variables known to interfere with behavior of common tools. CLICOLOR_FORCE= GREP_OPTIONS= unset CLICOLOR_FORCE GREP_OPTIONS ## --------------------- ## ## M4sh Shell Functions. ## ## --------------------- ## # as_fn_unset VAR # --------------- # Portably unset VAR. as_fn_unset () { { eval $1=; unset $1;} } as_unset=as_fn_unset # as_fn_set_status STATUS # ----------------------- # Set $? to STATUS, without forking. as_fn_set_status () { return $1 } # as_fn_set_status # as_fn_exit STATUS # ----------------- # Exit the shell with STATUS, even in a "trap 0" or "set -e" context. as_fn_exit () { set +e as_fn_set_status $1 exit $1 } # as_fn_exit # as_fn_mkdir_p # ------------- # Create "$as_dir" as a directory, including parents if necessary. 
as_fn_mkdir_p () { case $as_dir in #( -*) as_dir=./$as_dir;; esac test -d "$as_dir" || eval $as_mkdir_p || { as_dirs= while :; do case $as_dir in #( *\'*) as_qdir=`$as_echo "$as_dir" | sed "s/'/'\\\\\\\\''/g"`;; #'( *) as_qdir=$as_dir;; esac as_dirs="'$as_qdir' $as_dirs" as_dir=`$as_dirname -- "$as_dir" || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` test -d "$as_dir" && break done test -z "$as_dirs" || eval "mkdir $as_dirs" } || test -d "$as_dir" || as_fn_error $? "cannot create directory $as_dir" } # as_fn_mkdir_p # as_fn_append VAR VALUE # ---------------------- # Append the text in VALUE to the end of the definition contained in VAR. Take # advantage of any shell optimizations that allow amortized linear growth over # repeated appends, instead of the typical quadratic growth present in naive # implementations. if (eval "as_var=1; as_var+=2; test x\$as_var = x12") 2>/dev/null; then : eval 'as_fn_append () { eval $1+=\$2 }' else as_fn_append () { eval $1=\$$1\$2 } fi # as_fn_append # as_fn_arith ARG... # ------------------ # Perform arithmetic evaluation on the ARGs, and store the result in the # global $as_val. Take advantage of shells that can avoid forks. The arguments # must be portable across $(()) and expr. if (eval "test \$(( 1 + 1 )) = 2") 2>/dev/null; then : eval 'as_fn_arith () { as_val=$(( $* )) }' else as_fn_arith () { as_val=`expr "$@" || test $? -eq 1` } fi # as_fn_arith # as_fn_error STATUS ERROR [LINENO LOG_FD] # ---------------------------------------- # Output "`basename $0`: error: ERROR" to stderr. If LINENO and LOG_FD are # provided, also output the error to LOG_FD, referencing LINENO. Then exit the # script with STATUS, using 1 if that was 0. 
as_fn_error () { as_status=$1; test $as_status -eq 0 && as_status=1 if test "$4"; then as_lineno=${as_lineno-"$3"} as_lineno_stack=as_lineno_stack=$as_lineno_stack $as_echo "$as_me:${as_lineno-$LINENO}: error: $2" >&$4 fi $as_echo "$as_me: error: $2" >&2 as_fn_exit $as_status } # as_fn_error if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi if (basename -- /) >/dev/null 2>&1 && test "X`basename -- / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi if (as_dir=`dirname -- /` && test "X$as_dir" = X/) >/dev/null 2>&1; then as_dirname=dirname else as_dirname=false fi as_me=`$as_basename -- "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| . 2>/dev/null || $as_echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } /^X\/\(\/\/\)$/{ s//\1/ q } /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` # Avoid depending upon Character Ranges. as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits as_lineno_1=$LINENO as_lineno_1a=$LINENO as_lineno_2=$LINENO as_lineno_2a=$LINENO eval 'test "x$as_lineno_1'$as_run'" != "x$as_lineno_2'$as_run'" && test "x`expr $as_lineno_1'$as_run' + 1`" = "x$as_lineno_2'$as_run'"' || { # Blame Lee E. McMahon (1931-1989) for sed's syntax. :-) sed -n ' p /[$]LINENO/= ' <$as_myself | sed ' s/[$]LINENO.*/&-/ t lineno b :lineno N :loop s/[$]LINENO\([^'$as_cr_alnum'_].*\n\)\(.*\)/\2\1\2/ t loop s/-\n.*// ' >$as_me.lineno && chmod +x "$as_me.lineno" || { $as_echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2; as_fn_exit 1; } # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensitive to this). . 
"./$as_me.lineno" # Exit status is that of the last command. exit } ECHO_C= ECHO_N= ECHO_T= case `echo -n x` in #((((( -n*) case `echo 'xy\c'` in *c*) ECHO_T=' ';; # ECHO_T is single tab character. xy) ECHO_C='\c';; *) echo `echo ksh88 bug on AIX 6.1` > /dev/null ECHO_T=' ';; esac;; *) ECHO_N='-n';; esac rm -f conf$$ conf$$.exe conf$$.file if test -d conf$$.dir; then rm -f conf$$.dir/conf$$.file else rm -f conf$$.dir mkdir conf$$.dir 2>/dev/null fi if (echo >conf$$.file) 2>/dev/null; then if ln -s conf$$.file conf$$ 2>/dev/null; then as_ln_s='ln -s' # ... but there are two gotchas: # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail. # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable. # In both cases, we have to default to `cp -p'. ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe || as_ln_s='cp -p' elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file rmdir conf$$.dir 2>/dev/null if mkdir -p . 2>/dev/null; then as_mkdir_p='mkdir -p "$as_dir"' else test -d ./-p && rmdir ./-p as_mkdir_p=false fi if test -x / >/dev/null 2>&1; then as_test_x='test -x' else if ls -dL / >/dev/null 2>&1; then as_ls_L_option=L else as_ls_L_option= fi as_test_x=' eval sh -c '\'' if test -d "$1"; then test -d "$1/."; else case $1 in #( -*)set "./$1";; esac; case `ls -ld'$as_ls_L_option' "$1" 2>/dev/null` in #(( ???[sx]*):;;*)false;;esac;fi '\'' sh ' fi as_executable_p=$as_test_x # Sed expression to map a string onto a valid CPP name. as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'" # Sed expression to map a string onto a valid variable name. as_tr_sh="eval sed 'y%*+%pp%;s%[^_$as_cr_alnum]%_%g'" test -n "$DJDIR" || exec 7<&0 &1 # Name of the host. # hostname on some systems (SVR3.2, old GNU/Linux) returns a bogus exit status, # so uname gets run too. 
ac_hostname=`(hostname || uname -n) 2>/dev/null | sed 1q`

#
# Initializations.
#
ac_default_prefix=/usr/local
ac_clean_files=
ac_config_libobj_dir=.
LIBOBJS=
cross_compiling=no
subdirs=
MFLAGS=
MAKEFLAGS=

# Identity of this package.
PACKAGE_NAME='Xvid'
PACKAGE_TARNAME='xvid'
PACKAGE_VERSION='1.3.2'
PACKAGE_STRING='Xvid 1.3.2'
PACKAGE_BUGREPORT='xvid-devel@xvid.org'
PACKAGE_URL=''

ac_unique_file="configure.in"
ac_default_prefix="/usr/local"
# Factoring default headers for most tests.
ac_includes_default="\
#include <stdio.h>
#ifdef HAVE_SYS_TYPES_H
# include <sys/types.h>
#endif
#ifdef HAVE_SYS_STAT_H
# include <sys/stat.h>
#endif
#ifdef STDC_HEADERS
# include <stdlib.h>
# include <stddef.h>
#else
# ifdef HAVE_STDLIB_H
#  include <stdlib.h>
# endif
#endif
#ifdef HAVE_STRING_H
# if !defined STDC_HEADERS && defined HAVE_MEMORY_H
#  include <memory.h>
# endif
# include <string.h>
#endif
#ifdef HAVE_STRINGS_H
# include <strings.h>
#endif
#ifdef HAVE_INTTYPES_H
# include <inttypes.h>
#endif
#ifdef HAVE_STDINT_H
# include <stdint.h>
#endif
#ifdef HAVE_UNISTD_H
# include <unistd.h>
#endif"

ac_subst_vars='LTLIBOBJS
LIBOBJS
ALTIVEC_CFLAGS
SHARED_LIB
PRE_SHARED_LIB
STATIC_LIB
API_MINOR
API_MAJOR
PPC_ALTIVEC_SOURCES
DCT_IA64_SOURCES
SPECIFIC_CFLAGS
SPECIFIC_LDFLAGS
ASSEMBLY_SOURCES
GENERIC_SOURCES
ASSEMBLY_EXTENSION
AFLAGS
AS
NASM_FORMAT
OBJECT_EXTENSION
STATIC_EXTENSION
SHARED_EXTENSION
ENDIANNESS
BUS
ARCHITECTURE
FEATURES
ac_nasm
ac_yasm
EGREP
GREP
CPP
AR
RANLIB
INSTALL_DATA
INSTALL_SCRIPT
INSTALL_PROGRAM
OBJEXT
EXEEXT
ac_ct_CC
CPPFLAGS
LDFLAGS
CFLAGS
CC
target_os
target_vendor
target_cpu
target
host_os
host_vendor
host_cpu
host
build_os
build_vendor
build_cpu
build
target_alias
host_alias
build_alias
LIBS
ECHO_T
ECHO_N
ECHO_C
DEFS
mandir
localedir
libdir
psdir
pdfdir
dvidir
htmldir
infodir
docdir
oldincludedir
includedir
localstatedir
sharedstatedir
sysconfdir
datadir
datarootdir
libexecdir
sbindir
bindir
program_transform_name
prefix
exec_prefix
PACKAGE_URL
PACKAGE_BUGREPORT
PACKAGE_STRING
PACKAGE_VERSION
PACKAGE_TARNAME
PACKAGE_NAME
PATH_SEPARATOR
SHELL'
ac_subst_files=''
ac_user_opts='
enable_option_checking
enable_idebug enable_iprofile enable_gnuprofile enable_assembly enable_pthread enable_macosx_module ' ac_precious_vars='build_alias host_alias target_alias CC CFLAGS LDFLAGS LIBS CPPFLAGS CPP' # Initialize some variables set by options. ac_init_help= ac_init_version=false ac_unrecognized_opts= ac_unrecognized_sep= # The variables have the same names as the options, with # dashes changed to underlines. cache_file=/dev/null exec_prefix=NONE no_create= no_recursion= prefix=NONE program_prefix=NONE program_suffix=NONE program_transform_name=s,x,x, silent= site= srcdir= verbose= x_includes=NONE x_libraries=NONE # Installation directory options. # These are left unexpanded so users can "make install exec_prefix=/foo" # and all the variables that are supposed to be based on exec_prefix # by default will actually change. # Use braces instead of parens because sh, perl, etc. also accept them. # (The list follows the same order as the GNU Coding Standards.) bindir='${exec_prefix}/bin' sbindir='${exec_prefix}/sbin' libexecdir='${exec_prefix}/libexec' datarootdir='${prefix}/share' datadir='${datarootdir}' sysconfdir='${prefix}/etc' sharedstatedir='${prefix}/com' localstatedir='${prefix}/var' includedir='${prefix}/include' oldincludedir='/usr/include' docdir='${datarootdir}/doc/${PACKAGE_TARNAME}' infodir='${datarootdir}/info' htmldir='${docdir}' dvidir='${docdir}' pdfdir='${docdir}' psdir='${docdir}' libdir='${exec_prefix}/lib' localedir='${datarootdir}/locale' mandir='${datarootdir}/man' ac_prev= ac_dashdash= for ac_option do # If the previous option needs an argument, assign it. if test -n "$ac_prev"; then eval $ac_prev=\$ac_option ac_prev= continue fi case $ac_option in *=?*) ac_optarg=`expr "X$ac_option" : '[^=]*=\(.*\)'` ;; *=) ac_optarg= ;; *) ac_optarg=yes ;; esac # Accept the important Cygnus configure options, so we can diagnose typos. 
case $ac_dashdash$ac_option in --) ac_dashdash=yes ;; -bindir | --bindir | --bindi | --bind | --bin | --bi) ac_prev=bindir ;; -bindir=* | --bindir=* | --bindi=* | --bind=* | --bin=* | --bi=*) bindir=$ac_optarg ;; -build | --build | --buil | --bui | --bu) ac_prev=build_alias ;; -build=* | --build=* | --buil=* | --bui=* | --bu=*) build_alias=$ac_optarg ;; -cache-file | --cache-file | --cache-fil | --cache-fi \ | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c) ac_prev=cache_file ;; -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \ | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*) cache_file=$ac_optarg ;; --config-cache | -C) cache_file=config.cache ;; -datadir | --datadir | --datadi | --datad) ac_prev=datadir ;; -datadir=* | --datadir=* | --datadi=* | --datad=*) datadir=$ac_optarg ;; -datarootdir | --datarootdir | --datarootdi | --datarootd | --dataroot \ | --dataroo | --dataro | --datar) ac_prev=datarootdir ;; -datarootdir=* | --datarootdir=* | --datarootdi=* | --datarootd=* \ | --dataroot=* | --dataroo=* | --dataro=* | --datar=*) datarootdir=$ac_optarg ;; -disable-* | --disable-*) ac_useropt=`expr "x$ac_option" : 'x-*disable-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && as_fn_error $? 
"invalid feature name: $ac_useropt" ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "enable_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--disable-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval enable_$ac_useropt=no ;; -docdir | --docdir | --docdi | --doc | --do) ac_prev=docdir ;; -docdir=* | --docdir=* | --docdi=* | --doc=* | --do=*) docdir=$ac_optarg ;; -dvidir | --dvidir | --dvidi | --dvid | --dvi | --dv) ac_prev=dvidir ;; -dvidir=* | --dvidir=* | --dvidi=* | --dvid=* | --dvi=* | --dv=*) dvidir=$ac_optarg ;; -enable-* | --enable-*) ac_useropt=`expr "x$ac_option" : 'x-*enable-\([^=]*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && as_fn_error $? "invalid feature name: $ac_useropt" ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "enable_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--enable-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval enable_$ac_useropt=\$ac_optarg ;; -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \ | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \ | --exec | --exe | --ex) ac_prev=exec_prefix ;; -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \ | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \ | --exec=* | --exe=* | --ex=*) exec_prefix=$ac_optarg ;; -gas | --gas | --ga | --g) # Obsolete; use --with-gas. 
with_gas=yes ;; -help | --help | --hel | --he | -h) ac_init_help=long ;; -help=r* | --help=r* | --hel=r* | --he=r* | -hr*) ac_init_help=recursive ;; -help=s* | --help=s* | --hel=s* | --he=s* | -hs*) ac_init_help=short ;; -host | --host | --hos | --ho) ac_prev=host_alias ;; -host=* | --host=* | --hos=* | --ho=*) host_alias=$ac_optarg ;; -htmldir | --htmldir | --htmldi | --htmld | --html | --htm | --ht) ac_prev=htmldir ;; -htmldir=* | --htmldir=* | --htmldi=* | --htmld=* | --html=* | --htm=* \ | --ht=*) htmldir=$ac_optarg ;; -includedir | --includedir | --includedi | --included | --include \ | --includ | --inclu | --incl | --inc) ac_prev=includedir ;; -includedir=* | --includedir=* | --includedi=* | --included=* | --include=* \ | --includ=* | --inclu=* | --incl=* | --inc=*) includedir=$ac_optarg ;; -infodir | --infodir | --infodi | --infod | --info | --inf) ac_prev=infodir ;; -infodir=* | --infodir=* | --infodi=* | --infod=* | --info=* | --inf=*) infodir=$ac_optarg ;; -libdir | --libdir | --libdi | --libd) ac_prev=libdir ;; -libdir=* | --libdir=* | --libdi=* | --libd=*) libdir=$ac_optarg ;; -libexecdir | --libexecdir | --libexecdi | --libexecd | --libexec \ | --libexe | --libex | --libe) ac_prev=libexecdir ;; -libexecdir=* | --libexecdir=* | --libexecdi=* | --libexecd=* | --libexec=* \ | --libexe=* | --libex=* | --libe=*) libexecdir=$ac_optarg ;; -localedir | --localedir | --localedi | --localed | --locale) ac_prev=localedir ;; -localedir=* | --localedir=* | --localedi=* | --localed=* | --locale=*) localedir=$ac_optarg ;; -localstatedir | --localstatedir | --localstatedi | --localstated \ | --localstate | --localstat | --localsta | --localst | --locals) ac_prev=localstatedir ;; -localstatedir=* | --localstatedir=* | --localstatedi=* | --localstated=* \ | --localstate=* | --localstat=* | --localsta=* | --localst=* | --locals=*) localstatedir=$ac_optarg ;; -mandir | --mandir | --mandi | --mand | --man | --ma | --m) ac_prev=mandir ;; -mandir=* | --mandir=* | --mandi=* | 
--mand=* | --man=* | --ma=* | --m=*) mandir=$ac_optarg ;; -nfp | --nfp | --nf) # Obsolete; use --without-fp. with_fp=no ;; -no-create | --no-create | --no-creat | --no-crea | --no-cre \ | --no-cr | --no-c | -n) no_create=yes ;; -no-recursion | --no-recursion | --no-recursio | --no-recursi \ | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) no_recursion=yes ;; -oldincludedir | --oldincludedir | --oldincludedi | --oldincluded \ | --oldinclude | --oldinclud | --oldinclu | --oldincl | --oldinc \ | --oldin | --oldi | --old | --ol | --o) ac_prev=oldincludedir ;; -oldincludedir=* | --oldincludedir=* | --oldincludedi=* | --oldincluded=* \ | --oldinclude=* | --oldinclud=* | --oldinclu=* | --oldincl=* | --oldinc=* \ | --oldin=* | --oldi=* | --old=* | --ol=* | --o=*) oldincludedir=$ac_optarg ;; -prefix | --prefix | --prefi | --pref | --pre | --pr | --p) ac_prev=prefix ;; -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*) prefix=$ac_optarg ;; -program-prefix | --program-prefix | --program-prefi | --program-pref \ | --program-pre | --program-pr | --program-p) ac_prev=program_prefix ;; -program-prefix=* | --program-prefix=* | --program-prefi=* \ | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*) program_prefix=$ac_optarg ;; -program-suffix | --program-suffix | --program-suffi | --program-suff \ | --program-suf | --program-su | --program-s) ac_prev=program_suffix ;; -program-suffix=* | --program-suffix=* | --program-suffi=* \ | --program-suff=* | --program-suf=* | --program-su=* | --program-s=*) program_suffix=$ac_optarg ;; -program-transform-name | --program-transform-name \ | --program-transform-nam | --program-transform-na \ | --program-transform-n | --program-transform- \ | --program-transform | --program-transfor \ | --program-transfo | --program-transf \ | --program-trans | --program-tran \ | --progr-tra | --program-tr | --program-t) ac_prev=program_transform_name ;; -program-transform-name=* | 
--program-transform-name=* \ | --program-transform-nam=* | --program-transform-na=* \ | --program-transform-n=* | --program-transform-=* \ | --program-transform=* | --program-transfor=* \ | --program-transfo=* | --program-transf=* \ | --program-trans=* | --program-tran=* \ | --progr-tra=* | --program-tr=* | --program-t=*) program_transform_name=$ac_optarg ;; -pdfdir | --pdfdir | --pdfdi | --pdfd | --pdf | --pd) ac_prev=pdfdir ;; -pdfdir=* | --pdfdir=* | --pdfdi=* | --pdfd=* | --pdf=* | --pd=*) pdfdir=$ac_optarg ;; -psdir | --psdir | --psdi | --psd | --ps) ac_prev=psdir ;; -psdir=* | --psdir=* | --psdi=* | --psd=* | --ps=*) psdir=$ac_optarg ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) silent=yes ;; -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb) ac_prev=sbindir ;; -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \ | --sbi=* | --sb=*) sbindir=$ac_optarg ;; -sharedstatedir | --sharedstatedir | --sharedstatedi \ | --sharedstated | --sharedstate | --sharedstat | --sharedsta \ | --sharedst | --shareds | --shared | --share | --shar \ | --sha | --sh) ac_prev=sharedstatedir ;; -sharedstatedir=* | --sharedstatedir=* | --sharedstatedi=* \ | --sharedstated=* | --sharedstate=* | --sharedstat=* | --sharedsta=* \ | --sharedst=* | --shareds=* | --shared=* | --share=* | --shar=* \ | --sha=* | --sh=*) sharedstatedir=$ac_optarg ;; -site | --site | --sit) ac_prev=site ;; -site=* | --site=* | --sit=*) site=$ac_optarg ;; -srcdir | --srcdir | --srcdi | --srcd | --src | --sr) ac_prev=srcdir ;; -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*) srcdir=$ac_optarg ;; -sysconfdir | --sysconfdir | --sysconfdi | --sysconfd | --sysconf \ | --syscon | --sysco | --sysc | --sys | --sy) ac_prev=sysconfdir ;; -sysconfdir=* | --sysconfdir=* | --sysconfdi=* | --sysconfd=* | --sysconf=* \ | --syscon=* | --sysco=* | --sysc=* | --sys=* | --sy=*) sysconfdir=$ac_optarg ;; -target | --target | --targe | 
--targ | --tar | --ta | --t) ac_prev=target_alias ;; -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*) target_alias=$ac_optarg ;; -v | -verbose | --verbose | --verbos | --verbo | --verb) verbose=yes ;; -version | --version | --versio | --versi | --vers | -V) ac_init_version=: ;; -with-* | --with-*) ac_useropt=`expr "x$ac_option" : 'x-*with-\([^=]*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && as_fn_error $? "invalid package name: $ac_useropt" ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "with_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--with-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval with_$ac_useropt=\$ac_optarg ;; -without-* | --without-*) ac_useropt=`expr "x$ac_option" : 'x-*without-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && as_fn_error $? "invalid package name: $ac_useropt" ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "with_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--without-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval with_$ac_useropt=no ;; --x) # Obsolete; use --with-x. 
with_x=yes ;; -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \ | --x-incl | --x-inc | --x-in | --x-i) ac_prev=x_includes ;; -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \ | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*) x_includes=$ac_optarg ;; -x-libraries | --x-libraries | --x-librarie | --x-librari \ | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l) ac_prev=x_libraries ;; -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \ | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*) x_libraries=$ac_optarg ;; -*) as_fn_error $? "unrecognized option: \`$ac_option' Try \`$0 --help' for more information" ;; *=*) ac_envvar=`expr "x$ac_option" : 'x\([^=]*\)='` # Reject names that are not valid shell variable names. case $ac_envvar in #( '' | [0-9]* | *[!_$as_cr_alnum]* ) as_fn_error $? "invalid variable name: \`$ac_envvar'" ;; esac eval $ac_envvar=\$ac_optarg export $ac_envvar ;; *) # FIXME: should be removed in autoconf 3.0. $as_echo "$as_me: WARNING: you should use --build, --host, --target" >&2 expr "x$ac_option" : ".*[^-._$as_cr_alnum]" >/dev/null && $as_echo "$as_me: WARNING: invalid host type: $ac_option" >&2 : "${build_alias=$ac_option} ${host_alias=$ac_option} ${target_alias=$ac_option}" ;; esac done if test -n "$ac_prev"; then ac_option=--`echo $ac_prev | sed 's/_/-/g'` as_fn_error $? "missing argument to $ac_option" fi if test -n "$ac_unrecognized_opts"; then case $enable_option_checking in no) ;; fatal) as_fn_error $? "unrecognized options: $ac_unrecognized_opts" ;; *) $as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2 ;; esac fi # Check all directory arguments for consistency. for ac_var in exec_prefix prefix bindir sbindir libexecdir datarootdir \ datadir sysconfdir sharedstatedir localstatedir includedir \ oldincludedir docdir infodir htmldir dvidir pdfdir psdir \ libdir localedir mandir do eval ac_val=\$$ac_var # Remove trailing slashes. 
case $ac_val in */ ) ac_val=`expr "X$ac_val" : 'X\(.*[^/]\)' \| "X$ac_val" : 'X\(.*\)'` eval $ac_var=\$ac_val;; esac # Be sure to have absolute directory names. case $ac_val in [\\/$]* | ?:[\\/]* ) continue;; NONE | '' ) case $ac_var in *prefix ) continue;; esac;; esac as_fn_error $? "expected an absolute directory name for --$ac_var: $ac_val" done # There might be people who depend on the old broken behavior: `$host' # used to hold the argument of --host etc. # FIXME: To remove some day. build=$build_alias host=$host_alias target=$target_alias # FIXME: To remove some day. if test "x$host_alias" != x; then if test "x$build_alias" = x; then cross_compiling=maybe $as_echo "$as_me: WARNING: if you wanted to set the --build type, don't use --host. If a cross compiler is detected then cross compile mode will be used" >&2 elif test "x$build_alias" != "x$host_alias"; then cross_compiling=yes fi fi ac_tool_prefix= test -n "$host_alias" && ac_tool_prefix=$host_alias- test "$silent" = yes && exec 6>/dev/null ac_pwd=`pwd` && test -n "$ac_pwd" && ac_ls_di=`ls -di .` && ac_pwd_ls_di=`cd "$ac_pwd" && ls -di .` || as_fn_error $? "working directory cannot be determined" test "X$ac_ls_di" = "X$ac_pwd_ls_di" || as_fn_error $? "pwd does not report name of working directory" # Find the source files, if location was not specified. if test -z "$srcdir"; then ac_srcdir_defaulted=yes # Try the directory containing this script, then the parent directory. ac_confdir=`$as_dirname -- "$as_myself" || $as_expr X"$as_myself" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_myself" : 'X\(//\)[^/]' \| \ X"$as_myself" : 'X\(//\)$' \| \ X"$as_myself" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_myself" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` srcdir=$ac_confdir if test ! -r "$srcdir/$ac_unique_file"; then srcdir=.. fi else ac_srcdir_defaulted=no fi if test ! 
-r "$srcdir/$ac_unique_file"; then test "$ac_srcdir_defaulted" = yes && srcdir="$ac_confdir or .." as_fn_error $? "cannot find sources ($ac_unique_file) in $srcdir" fi ac_msg="sources are in $srcdir, but \`cd $srcdir' does not work" ac_abs_confdir=`( cd "$srcdir" && test -r "./$ac_unique_file" || as_fn_error $? "$ac_msg" pwd)` # When building in place, set srcdir=. if test "$ac_abs_confdir" = "$ac_pwd"; then srcdir=. fi # Remove unnecessary trailing slashes from srcdir. # Double slashes in file names in object file debugging info # mess up M-x gdb in Emacs. case $srcdir in */) srcdir=`expr "X$srcdir" : 'X\(.*[^/]\)' \| "X$srcdir" : 'X\(.*\)'`;; esac for ac_var in $ac_precious_vars; do eval ac_env_${ac_var}_set=\${${ac_var}+set} eval ac_env_${ac_var}_value=\$${ac_var} eval ac_cv_env_${ac_var}_set=\${${ac_var}+set} eval ac_cv_env_${ac_var}_value=\$${ac_var} done # # Report the --help message. # if test "$ac_init_help" = "long"; then # Omit some internal or obsolete options to make the list less imposing. # This message is too long to be a string in the A/UX 3.1 sh. cat <<_ACEOF \`configure' configures Xvid 1.3.2 to adapt to many kinds of systems. Usage: $0 [OPTION]... [VAR=VALUE]... To assign environment variables (e.g., CC, CFLAGS...), specify them as VAR=VALUE. See below for descriptions of some of the useful variables. Defaults for the options are specified in brackets. Configuration: -h, --help display this help and exit --help=short display options specific to this package --help=recursive display the short help of all the included packages -V, --version display version information and exit -q, --quiet, --silent do not print \`checking ...' 
messages --cache-file=FILE cache test results in FILE [disabled] -C, --config-cache alias for \`--cache-file=config.cache' -n, --no-create do not create output files --srcdir=DIR find the sources in DIR [configure dir or \`..'] Installation directories: --prefix=PREFIX install architecture-independent files in PREFIX [$ac_default_prefix] --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX [PREFIX] By default, \`make install' will install all the files in \`$ac_default_prefix/bin', \`$ac_default_prefix/lib' etc. You can specify an installation prefix other than \`$ac_default_prefix' using \`--prefix', for instance \`--prefix=\$HOME'. For better control, use the options below. Fine tuning of the installation directories: --bindir=DIR user executables [EPREFIX/bin] --sbindir=DIR system admin executables [EPREFIX/sbin] --libexecdir=DIR program executables [EPREFIX/libexec] --sysconfdir=DIR read-only single-machine data [PREFIX/etc] --sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com] --localstatedir=DIR modifiable single-machine data [PREFIX/var] --libdir=DIR object code libraries [EPREFIX/lib] --includedir=DIR C header files [PREFIX/include] --oldincludedir=DIR C header files for non-gcc [/usr/include] --datarootdir=DIR read-only arch.-independent data root [PREFIX/share] --datadir=DIR read-only architecture-independent data [DATAROOTDIR] --infodir=DIR info documentation [DATAROOTDIR/info] --localedir=DIR locale-dependent data [DATAROOTDIR/locale] --mandir=DIR man documentation [DATAROOTDIR/man] --docdir=DIR documentation root [DATAROOTDIR/doc/xvid] --htmldir=DIR html documentation [DOCDIR] --dvidir=DIR dvi documentation [DOCDIR] --pdfdir=DIR pdf documentation [DOCDIR] --psdir=DIR ps documentation [DOCDIR] _ACEOF cat <<\_ACEOF System types: --build=BUILD configure for building on BUILD [guessed] --host=HOST cross-compile to build programs to run on HOST [BUILD] --target=TARGET configure for building compilers for TARGET [HOST] 
_ACEOF
fi

if test -n "$ac_init_help"; then
  case $ac_init_help in
     short | recursive ) echo "Configuration of Xvid 1.3.2:";;
   esac
  cat <<\_ACEOF

Optional Features:
  --disable-option-checking  ignore unrecognized --enable/--with options
  --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
  --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
  --enable-idebug         Enable internal debug function
  --enable-iprofile       Enable internal profiling
  --enable-gnuprofile     Enable profiling informations for gprof
  --disable-assembly      Disable assembly code
  --disable-pthread       Disable pthread dependent code
  --enable-macosx_module  Build as a module on MacOS X

Some influential environment variables:
  CC          C compiler command
  CFLAGS      C compiler flags
  LDFLAGS     linker flags, e.g. -L<lib dir> if you have libraries in a
              nonstandard directory <lib dir>
  LIBS        libraries to pass to the linker, e.g. -l<library>
  CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
              you have headers in a nonstandard directory <include dir>
  CPP         C preprocessor

Use these variables to override the choices made by `configure' or to help
it to find libraries and programs with nonstandard names/locations.

Report bugs to <xvid-devel@xvid.org>.
_ACEOF
ac_status=$?
fi

if test "$ac_init_help" = "recursive"; then
  # If there are subdirs, report their specific --help.
  for ac_dir in : $ac_subdirs_all; do test "x$ac_dir" = x: && continue
    test -d "$ac_dir" ||
      { cd "$srcdir" && ac_pwd=`pwd` && srcdir=. && test -d "$ac_dir"; } ||
      continue
    ac_builddir=.

case "$ac_dir" in
.) ac_dir_suffix= ac_top_builddir_sub=. ac_top_build_prefix= ;;
*)
  ac_dir_suffix=/`$as_echo "$ac_dir" | sed 's|^\.[\\/]||'`
  # A ".." for each directory in $ac_dir_suffix.
  ac_top_builddir_sub=`$as_echo "$ac_dir_suffix" | sed 's|/[^\\/]*|/..|g;s|/||'`
  case $ac_top_builddir_sub in
  "") ac_top_builddir_sub=. ac_top_build_prefix= ;;
  *)  ac_top_build_prefix=$ac_top_builddir_sub/ ;;
  esac ;;
esac
ac_abs_top_builddir=$ac_pwd
ac_abs_builddir=$ac_pwd$ac_dir_suffix
# for backward compatibility:
ac_top_builddir=$ac_top_build_prefix

case $srcdir in
  .)
# We are building in place. ac_srcdir=. ac_top_srcdir=$ac_top_builddir_sub ac_abs_top_srcdir=$ac_pwd ;; [\\/]* | ?:[\\/]* ) # Absolute name. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ac_abs_top_srcdir=$srcdir ;; *) # Relative name. ac_srcdir=$ac_top_build_prefix$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_build_prefix$srcdir ac_abs_top_srcdir=$ac_pwd/$srcdir ;; esac ac_abs_srcdir=$ac_abs_top_srcdir$ac_dir_suffix cd "$ac_dir" || { ac_status=$?; continue; } # Check for guested configure. if test -f "$ac_srcdir/configure.gnu"; then echo && $SHELL "$ac_srcdir/configure.gnu" --help=recursive elif test -f "$ac_srcdir/configure"; then echo && $SHELL "$ac_srcdir/configure" --help=recursive else $as_echo "$as_me: WARNING: no configuration information is in $ac_dir" >&2 fi || ac_status=$? cd "$ac_pwd" || { ac_status=$?; break; } done fi test -n "$ac_init_help" && exit $ac_status if $ac_init_version; then cat <<\_ACEOF Xvid configure 1.3.2 generated by GNU Autoconf 2.68 Copyright (C) 2010 Free Software Foundation, Inc. This configure script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it. _ACEOF exit fi ## ------------------------ ## ## Autoconf initialization. ## ## ------------------------ ## # ac_fn_c_try_compile LINENO # -------------------------- # Try to compile conftest.$ac_ext, and return whether this succeeded. ac_fn_c_try_compile () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack rm -f conftest.$ac_objext if { { ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compile") 2>conftest.err ac_status=$? if test -s conftest.err; then grep -v '^ *+' conftest.err >conftest.er1 cat conftest.er1 >&5 mv -f conftest.er1 conftest.err fi $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 test $ac_status = 0; } && { test -z "$ac_c_werror_flag" || test ! -s conftest.err } && test -s conftest.$ac_objext; then : ac_retval=0 else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_retval=1 fi eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno as_fn_set_status $ac_retval } # ac_fn_c_try_compile # ac_fn_c_try_run LINENO # ---------------------- # Try to link conftest.$ac_ext, and return whether this succeeded. Assumes # that executables *can* be run. ac_fn_c_try_run () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack if { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } && { ac_try='./conftest$ac_exeext' { { case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_try") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; }; then : ac_retval=0 else $as_echo "$as_me: program exited with status $ac_status" >&5 $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_retval=$ac_status fi rm -rf conftest.dSYM conftest_ipa8_conftest.oo eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno as_fn_set_status $ac_retval } # ac_fn_c_try_run # ac_fn_c_compute_int LINENO EXPR VAR INCLUDES # -------------------------------------------- # Tries to find the compile-time value of EXPR in a program that includes # INCLUDES, setting VAR accordingly. 
Returns whether the value could be # computed ac_fn_c_compute_int () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack if test "$cross_compiling" = yes; then # Depending upon the size, compute the lo and hi bounds. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 int main () { static int test_array [1 - 2 * !(($2) >= 0)]; test_array [0] = 0 ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_lo=0 ac_mid=0 while :; do cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 int main () { static int test_array [1 - 2 * !(($2) <= $ac_mid)]; test_array [0] = 0 ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_hi=$ac_mid; break else as_fn_arith $ac_mid + 1 && ac_lo=$as_val if test $ac_lo -le $ac_mid; then ac_lo= ac_hi= break fi as_fn_arith 2 '*' $ac_mid + 1 && ac_mid=$as_val fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext done else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 int main () { static int test_array [1 - 2 * !(($2) < 0)]; test_array [0] = 0 ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_hi=-1 ac_mid=-1 while :; do cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 int main () { static int test_array [1 - 2 * !(($2) >= $ac_mid)]; test_array [0] = 0 ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_lo=$ac_mid; break else as_fn_arith '(' $ac_mid ')' - 1 && ac_hi=$as_val if test $ac_mid -le $ac_hi; then ac_lo= ac_hi= break fi as_fn_arith 2 '*' $ac_mid && ac_mid=$as_val fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext done else ac_lo= ac_hi= fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext # Binary search between lo and hi bounds. 
while test "x$ac_lo" != "x$ac_hi"; do as_fn_arith '(' $ac_hi - $ac_lo ')' / 2 + $ac_lo && ac_mid=$as_val cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 int main () { static int test_array [1 - 2 * !(($2) <= $ac_mid)]; test_array [0] = 0 ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_hi=$ac_mid else as_fn_arith '(' $ac_mid ')' + 1 && ac_lo=$as_val fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext done case $ac_lo in #(( ?*) eval "$3=\$ac_lo"; ac_retval=0 ;; '') ac_retval=1 ;; esac else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 static long int longval () { return $2; } static unsigned long int ulongval () { return $2; } #include <stdio.h> #include <stdlib.h> int main () { FILE *f = fopen ("conftest.val", "w"); if (! f) return 1; if (($2) < 0) { long int i = longval (); if (i != ($2)) return 1; fprintf (f, "%ld", i); } else { unsigned long int i = ulongval (); if (i != ($2)) return 1; fprintf (f, "%lu", i); } /* Do not output a trailing newline, as this causes \r\n confusion on some platforms. */ return ferror (f) || fclose (f) != 0; ; return 0; } _ACEOF if ac_fn_c_try_run "$LINENO"; then : echo >>conftest.val; read $3 <conftest.val; ac_retval=0 else ac_retval=1 fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext rm -f conftest.val fi eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno as_fn_set_status $ac_retval } # ac_fn_c_compute_int # ac_fn_c_try_cpp LINENO # ---------------------- # Try to preprocess conftest.$ac_ext, and return whether this succeeded. ac_fn_c_try_cpp () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack if { { ac_try="$ac_cpp conftest.$ac_ext" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_cpp conftest.$ac_ext") 2>conftest.err ac_status=$? if test -s conftest.err; then grep -v '^ *+' conftest.err >conftest.er1 cat conftest.er1 >&5 mv -f conftest.er1 conftest.err fi $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } > conftest.i && { test -z "$ac_c_preproc_warn_flag$ac_c_werror_flag" || test ! 
-s conftest.err }; then : ac_retval=0 else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_retval=1 fi eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno as_fn_set_status $ac_retval } # ac_fn_c_try_cpp # ac_fn_c_check_header_compile LINENO HEADER VAR INCLUDES # ------------------------------------------------------- # Tests whether HEADER exists and can be compiled using the include files in # INCLUDES, setting the cache variable VAR accordingly. ac_fn_c_check_header_compile () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } if eval \${$3+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 #include <$2> _ACEOF if ac_fn_c_try_compile "$LINENO"; then : eval "$3=yes" else eval "$3=no" fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno } # ac_fn_c_check_header_compile # ac_fn_c_check_header_mongrel LINENO HEADER VAR INCLUDES # ------------------------------------------------------- # Tests whether HEADER exists, giving a warning if it cannot be compiled using # the include files in INCLUDES and setting the cache variable VAR # accordingly. ac_fn_c_check_header_mongrel () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack if eval \${$3+:} false; then : { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } if eval \${$3+:} false; then : $as_echo_n "(cached) " >&6 fi eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } else # Is the header compilable? 
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking $2 usability" >&5 $as_echo_n "checking $2 usability... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 #include <$2> _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_header_compiler=yes else ac_header_compiler=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_header_compiler" >&5 $as_echo "$ac_header_compiler" >&6; } # Is the header present? { $as_echo "$as_me:${as_lineno-$LINENO}: checking $2 presence" >&5 $as_echo_n "checking $2 presence... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <$2> _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : ac_header_preproc=yes else ac_header_preproc=no fi rm -f conftest.err conftest.i conftest.$ac_ext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_header_preproc" >&5 $as_echo "$ac_header_preproc" >&6; } # So? What about this header? case $ac_header_compiler:$ac_header_preproc:$ac_c_preproc_warn_flag in #(( yes:no: ) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: accepted by the compiler, rejected by the preprocessor!" >&5 $as_echo "$as_me: WARNING: $2: accepted by the compiler, rejected by the preprocessor!" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: proceeding with the compiler's result" >&5 $as_echo "$as_me: WARNING: $2: proceeding with the compiler's result" >&2;} ;; no:yes:* ) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: present but cannot be compiled" >&5 $as_echo "$as_me: WARNING: $2: present but cannot be compiled" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: check for missing prerequisite headers?" >&5 $as_echo "$as_me: WARNING: $2: check for missing prerequisite headers?" 
>&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: see the Autoconf documentation" >&5 $as_echo "$as_me: WARNING: $2: see the Autoconf documentation" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: section \"Present But Cannot Be Compiled\"" >&5 $as_echo "$as_me: WARNING: $2: section \"Present But Cannot Be Compiled\"" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: proceeding with the compiler's result" >&5 $as_echo "$as_me: WARNING: $2: proceeding with the compiler's result" >&2;} ( $as_echo "## ---------------------------------- ## ## Report this to xvid-devel@xvid.org ## ## ---------------------------------- ##" ) | sed "s/^/$as_me: WARNING: /" >&2 ;; esac { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } if eval \${$3+:} false; then : $as_echo_n "(cached) " >&6 else eval "$3=\$ac_header_compiler" fi eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } fi eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno } # ac_fn_c_check_header_mongrel # ac_fn_c_try_link LINENO # ----------------------- # Try to link conftest.$ac_ext, and return whether this succeeded. ac_fn_c_try_link () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack rm -f conftest.$ac_objext conftest$ac_exeext if { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>conftest.err ac_status=$? if test -s conftest.err; then grep -v '^ *+' conftest.err >conftest.er1 cat conftest.er1 >&5 mv -f conftest.er1 conftest.err fi $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } && { test -z "$ac_c_werror_flag" || test ! 
-s conftest.err } && test -s conftest$ac_exeext && { test "$cross_compiling" = yes || $as_test_x conftest$ac_exeext }; then : ac_retval=0 else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_retval=1 fi # Delete the IPA/IPO (Inter Procedural Analysis/Optimization) information # created by the PGI compiler (conftest_ipa8_conftest.oo), as it would # interfere with the next link command; also delete a directory that is # left behind by Apple's compiler. We do this before executing the actions. rm -rf conftest.dSYM conftest_ipa8_conftest.oo eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno as_fn_set_status $ac_retval } # ac_fn_c_try_link cat >config.log <<_ACEOF This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by Xvid $as_me 1.3.2, which was generated by GNU Autoconf 2.68. Invocation command line was $ $0 $@ _ACEOF exec 5>>config.log { cat <<_ASUNAME ## --------- ## ## Platform. 
## ## --------- ## hostname = `(hostname || uname -n) 2>/dev/null | sed 1q` uname -m = `(uname -m) 2>/dev/null || echo unknown` uname -r = `(uname -r) 2>/dev/null || echo unknown` uname -s = `(uname -s) 2>/dev/null || echo unknown` uname -v = `(uname -v) 2>/dev/null || echo unknown` /usr/bin/uname -p = `(/usr/bin/uname -p) 2>/dev/null || echo unknown` /bin/uname -X = `(/bin/uname -X) 2>/dev/null || echo unknown` /bin/arch = `(/bin/arch) 2>/dev/null || echo unknown` /usr/bin/arch -k = `(/usr/bin/arch -k) 2>/dev/null || echo unknown` /usr/convex/getsysinfo = `(/usr/convex/getsysinfo) 2>/dev/null || echo unknown` /usr/bin/hostinfo = `(/usr/bin/hostinfo) 2>/dev/null || echo unknown` /bin/machine = `(/bin/machine) 2>/dev/null || echo unknown` /usr/bin/oslevel = `(/usr/bin/oslevel) 2>/dev/null || echo unknown` /bin/universe = `(/bin/universe) 2>/dev/null || echo unknown` _ASUNAME as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. $as_echo "PATH: $as_dir" done IFS=$as_save_IFS } >&5 cat >&5 <<_ACEOF ## ----------- ## ## Core tests. ## ## ----------- ## _ACEOF # Keep a trace of the command line. # Strip out --no-create and --no-recursion so they do not pile up. # Strip out --silent because we don't want to record it for future runs. # Also quote any args containing shell meta-characters. # Make two passes to allow for proper duplicate-argument suppression. 
ac_configure_args= ac_configure_args0= ac_configure_args1= ac_must_keep_next=false for ac_pass in 1 2 do for ac_arg do case $ac_arg in -no-create | --no-c* | -n | -no-recursion | --no-r*) continue ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) continue ;; *\'*) ac_arg=`$as_echo "$ac_arg" | sed "s/'/'\\\\\\\\''/g"` ;; esac case $ac_pass in 1) as_fn_append ac_configure_args0 " '$ac_arg'" ;; 2) as_fn_append ac_configure_args1 " '$ac_arg'" if test $ac_must_keep_next = true; then ac_must_keep_next=false # Got value, back to normal. else case $ac_arg in *=* | --config-cache | -C | -disable-* | --disable-* \ | -enable-* | --enable-* | -gas | --g* | -nfp | --nf* \ | -q | -quiet | --q* | -silent | --sil* | -v | -verb* \ | -with-* | --with-* | -without-* | --without-* | --x) case "$ac_configure_args0 " in "$ac_configure_args1"*" '$ac_arg' "* ) continue ;; esac ;; -* ) ac_must_keep_next=true ;; esac fi as_fn_append ac_configure_args " '$ac_arg'" ;; esac done done { ac_configure_args0=; unset ac_configure_args0;} { ac_configure_args1=; unset ac_configure_args1;} # When interrupted or exit'd, cleanup temporary files, and complete # config.log. We remove comments because anyway the quotes in there # would cause problems or look ugly. # WARNING: Use '\'' to represent an apostrophe within the trap. # WARNING: Do not start the trap code with a newline, due to a FreeBSD 4.0 bug. trap 'exit_status=$? # Save into config.log some information that might help in debugging. { echo $as_echo "## ---------------- ## ## Cache variables. 
## ## ---------------- ##" echo # The following way of writing the cache mishandles newlines in values, ( for ac_var in `(set) 2>&1 | sed -n '\''s/^\([a-zA-Z_][a-zA-Z0-9_]*\)=.*/\1/p'\''`; do eval ac_val=\$$ac_var case $ac_val in #( *${as_nl}*) case $ac_var in #( *_cv_*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cache variable $ac_var contains a newline" >&5 $as_echo "$as_me: WARNING: cache variable $ac_var contains a newline" >&2;} ;; esac case $ac_var in #( _ | IFS | as_nl) ;; #( BASH_ARGV | BASH_SOURCE) eval $ac_var= ;; #( *) { eval $ac_var=; unset $ac_var;} ;; esac ;; esac done (set) 2>&1 | case $as_nl`(ac_space='\'' '\''; set) 2>&1` in #( *${as_nl}ac_space=\ *) sed -n \ "s/'\''/'\''\\\\'\'''\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\''\\2'\''/p" ;; #( *) sed -n "/^[_$as_cr_alnum]*_cv_[_$as_cr_alnum]*=/p" ;; esac | sort ) echo $as_echo "## ----------------- ## ## Output variables. ## ## ----------------- ##" echo for ac_var in $ac_subst_vars do eval ac_val=\$$ac_var case $ac_val in *\'\''*) ac_val=`$as_echo "$ac_val" | sed "s/'\''/'\''\\\\\\\\'\'''\''/g"`;; esac $as_echo "$ac_var='\''$ac_val'\''" done | sort echo if test -n "$ac_subst_files"; then $as_echo "## ------------------- ## ## File substitutions. ## ## ------------------- ##" echo for ac_var in $ac_subst_files do eval ac_val=\$$ac_var case $ac_val in *\'\''*) ac_val=`$as_echo "$ac_val" | sed "s/'\''/'\''\\\\\\\\'\'''\''/g"`;; esac $as_echo "$ac_var='\''$ac_val'\''" done | sort echo fi if test -s confdefs.h; then $as_echo "## ----------- ## ## confdefs.h. 
## ## ----------- ##" echo cat confdefs.h echo fi test "$ac_signal" != 0 && $as_echo "$as_me: caught signal $ac_signal" $as_echo "$as_me: exit $exit_status" } >&5 rm -f core *.core core.conftest.* && rm -f -r conftest* confdefs* conf$$* $ac_clean_files && exit $exit_status ' 0 for ac_signal in 1 2 13 15; do trap 'ac_signal='$ac_signal'; as_fn_exit 1' $ac_signal done ac_signal=0 # confdefs.h avoids OS command line length limits that DEFS can exceed. rm -f -r conftest* confdefs.h $as_echo "/* confdefs.h */" > confdefs.h # Predefined preprocessor variables. cat >>confdefs.h <<_ACEOF #define PACKAGE_NAME "$PACKAGE_NAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_TARNAME "$PACKAGE_TARNAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_VERSION "$PACKAGE_VERSION" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_STRING "$PACKAGE_STRING" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_BUGREPORT "$PACKAGE_BUGREPORT" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_URL "$PACKAGE_URL" _ACEOF # Let the site file select an alternate cache file if it wants to. # Prefer an explicitly selected file to automatically selected ones. ac_site_file1=NONE ac_site_file2=NONE if test -n "$CONFIG_SITE"; then # We do not want a PATH search for config.site. case $CONFIG_SITE in #(( -*) ac_site_file1=./$CONFIG_SITE;; */*) ac_site_file1=$CONFIG_SITE;; *) ac_site_file1=./$CONFIG_SITE;; esac elif test "x$prefix" != xNONE; then ac_site_file1=$prefix/share/config.site ac_site_file2=$prefix/etc/config.site else ac_site_file1=$ac_default_prefix/share/config.site ac_site_file2=$ac_default_prefix/etc/config.site fi for ac_site_file in "$ac_site_file1" "$ac_site_file2" do test "x$ac_site_file" = xNONE && continue if test /dev/null != "$ac_site_file" && test -r "$ac_site_file"; then { $as_echo "$as_me:${as_lineno-$LINENO}: loading site script $ac_site_file" >&5 $as_echo "$as_me: loading site script $ac_site_file" >&6;} sed 's/^/| /' "$ac_site_file" >&5 . 
"$ac_site_file" \ || { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "failed to load site script $ac_site_file See \`config.log' for more details" "$LINENO" 5; } fi done if test -r "$cache_file"; then # Some versions of bash will fail to source /dev/null (special files # actually), so we avoid doing that. DJGPP emulates it as a regular file. if test /dev/null != "$cache_file" && test -f "$cache_file"; then { $as_echo "$as_me:${as_lineno-$LINENO}: loading cache $cache_file" >&5 $as_echo "$as_me: loading cache $cache_file" >&6;} case $cache_file in [\\/]* | ?:[\\/]* ) . "$cache_file";; *) . "./$cache_file";; esac fi else { $as_echo "$as_me:${as_lineno-$LINENO}: creating cache $cache_file" >&5 $as_echo "$as_me: creating cache $cache_file" >&6;} >$cache_file fi # Check that the precious variables saved in the cache have kept the same # value. ac_cache_corrupted=false for ac_var in $ac_precious_vars; do eval ac_old_set=\$ac_cv_env_${ac_var}_set eval ac_new_set=\$ac_env_${ac_var}_set eval ac_old_val=\$ac_cv_env_${ac_var}_value eval ac_new_val=\$ac_env_${ac_var}_value case $ac_old_set,$ac_new_set in set,) { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&5 $as_echo "$as_me: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&2;} ac_cache_corrupted=: ;; ,set) { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' was not set in the previous run" >&5 $as_echo "$as_me: error: \`$ac_var' was not set in the previous run" >&2;} ac_cache_corrupted=: ;; ,);; *) if test "x$ac_old_val" != "x$ac_new_val"; then # differences in whitespace do not lead to failure. 
ac_old_val_w=`echo x $ac_old_val` ac_new_val_w=`echo x $ac_new_val` if test "$ac_old_val_w" != "$ac_new_val_w"; then { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' has changed since the previous run:" >&5 $as_echo "$as_me: error: \`$ac_var' has changed since the previous run:" >&2;} ac_cache_corrupted=: else { $as_echo "$as_me:${as_lineno-$LINENO}: warning: ignoring whitespace changes in \`$ac_var' since the previous run:" >&5 $as_echo "$as_me: warning: ignoring whitespace changes in \`$ac_var' since the previous run:" >&2;} eval $ac_var=\$ac_old_val fi { $as_echo "$as_me:${as_lineno-$LINENO}: former value: \`$ac_old_val'" >&5 $as_echo "$as_me: former value: \`$ac_old_val'" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: current value: \`$ac_new_val'" >&5 $as_echo "$as_me: current value: \`$ac_new_val'" >&2;} fi;; esac # Pass precious variables to config.status. if test "$ac_new_set" = set; then case $ac_new_val in *\'*) ac_arg=$ac_var=`$as_echo "$ac_new_val" | sed "s/'/'\\\\\\\\''/g"` ;; *) ac_arg=$ac_var=$ac_new_val ;; esac case " $ac_configure_args " in *" '$ac_arg' "*) ;; # Avoid dups. Use of quotes ensures accuracy. *) as_fn_append ac_configure_args " '$ac_arg'" ;; esac fi done if $ac_cache_corrupted; then { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: error: changes in the environment can compromise the build" >&5 $as_echo "$as_me: error: changes in the environment can compromise the build" >&2;} as_fn_error $? "run \`make distclean' and/or \`rm $cache_file' and start over" "$LINENO" 5 fi ## -------------------- ## ## Main body of script. 
## ## -------------------- ## ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu API_MAJOR="4" API_MINOR="3" minimum_yasm_major_version=1 minimum_nasm_minor_version=0 minimum_nasm_major_version=2 nasm_prog="nasm" yasm_prog="yasm" our_cflags_defaults="-Wall" our_cflags_defaults="$our_cflags_defaults -O2" our_cflags_defaults="$our_cflags_defaults -fstrength-reduce" our_cflags_defaults="$our_cflags_defaults -finline-functions" our_cflags_defaults="$our_cflags_defaults -ffast-math" our_cflags_defaults="$our_cflags_defaults -fomit-frame-pointer" FEATURES="" # Check whether --enable-idebug was given. if test "${enable_idebug+set}" = set; then : enableval=$enable_idebug; if test "$enable_idebug" = "yes" ; then FEATURES="$FEATURES -D_DEBUG" fi fi # Check whether --enable-iprofile was given. if test "${enable_iprofile+set}" = set; then : enableval=$enable_iprofile; if test "$enable_iprofile" = "yes" ; then FEATURES="$FEATURES -D_PROFILING_" fi fi # Check whether --enable-gnuprofile was given. if test "${enable_gnuprofile+set}" = set; then : enableval=$enable_gnuprofile; if test "$enable_gnuprofile" = "yes" ; then GNU_PROF_CFLAGS="-pg -fprofile-arcs -ftest-coverage" GNU_PROF_LDFLAGS="-pg" fi fi # Check whether --enable-assembly was given. if test "${enable_assembly+set}" = set; then : enableval=$enable_assembly; if test "$enable_assembly" = "no" ; then assembly="no" else if test "$enable_assembly" = "yes" ; then assembly="yes" fi fi else assembly="yes" fi # Check whether --enable-pthread was given. if test "${enable_pthread+set}" = set; then : enableval=$enable_pthread; if test "$enable_pthread" = "no" ; then pthread="no" else if test "$enable_pthread" = "yes" ; then pthread="yes" fi fi else pthread="yes" fi # Check whether --enable-macosx_module was given. 
if test "${enable_macosx_module+set}" = set; then : enableval=$enable_macosx_module; if test "$enable_macosx_module" = "yes" ; then macosx_module="yes" else macosx_module="no" fi else macosx_module="no" fi ac_aux_dir= for ac_dir in "$srcdir" "$srcdir/.." "$srcdir/../.."; do if test -f "$ac_dir/install-sh"; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/install-sh -c" break elif test -f "$ac_dir/install.sh"; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/install.sh -c" break elif test -f "$ac_dir/shtool"; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/shtool install -c" break fi done if test -z "$ac_aux_dir"; then as_fn_error $? "cannot find install-sh, install.sh, or shtool in \"$srcdir\" \"$srcdir/..\" \"$srcdir/../..\"" "$LINENO" 5 fi # These three variables are undocumented and unsupported, # and are intended to be withdrawn in a future Autoconf release. # They can cause serious problems if a builder's source tree is in a directory # whose full name contains unusual characters. ac_config_guess="$SHELL $ac_aux_dir/config.guess" # Please don't use this var. ac_config_sub="$SHELL $ac_aux_dir/config.sub" # Please don't use this var. ac_configure="$SHELL $ac_aux_dir/configure" # Please don't use this var. # Make sure we can run config.sub. $SHELL "$ac_aux_dir/config.sub" sun4 >/dev/null 2>&1 || as_fn_error $? "cannot run $SHELL $ac_aux_dir/config.sub" "$LINENO" 5 { $as_echo "$as_me:${as_lineno-$LINENO}: checking build system type" >&5 $as_echo_n "checking build system type... " >&6; } if ${ac_cv_build+:} false; then : $as_echo_n "(cached) " >&6 else ac_build_alias=$build_alias test "x$ac_build_alias" = x && ac_build_alias=`$SHELL "$ac_aux_dir/config.guess"` test "x$ac_build_alias" = x && as_fn_error $? "cannot guess build type; you must specify one" "$LINENO" 5 ac_cv_build=`$SHELL "$ac_aux_dir/config.sub" $ac_build_alias` || as_fn_error $? 
"$SHELL $ac_aux_dir/config.sub $ac_build_alias failed" "$LINENO" 5 fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_build" >&5 $as_echo "$ac_cv_build" >&6; } case $ac_cv_build in *-*-*) ;; *) as_fn_error $? "invalid value of canonical build" "$LINENO" 5;; esac build=$ac_cv_build ac_save_IFS=$IFS; IFS='-' set x $ac_cv_build shift build_cpu=$1 build_vendor=$2 shift; shift # Remember, the first character of IFS is used to create $*, # except with old shells: build_os=$* IFS=$ac_save_IFS case $build_os in *\ *) build_os=`echo "$build_os" | sed 's/ /-/g'`;; esac { $as_echo "$as_me:${as_lineno-$LINENO}: checking host system type" >&5 $as_echo_n "checking host system type... " >&6; } if ${ac_cv_host+:} false; then : $as_echo_n "(cached) " >&6 else if test "x$host_alias" = x; then ac_cv_host=$ac_cv_build else ac_cv_host=`$SHELL "$ac_aux_dir/config.sub" $host_alias` || as_fn_error $? "$SHELL $ac_aux_dir/config.sub $host_alias failed" "$LINENO" 5 fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_host" >&5 $as_echo "$ac_cv_host" >&6; } case $ac_cv_host in *-*-*) ;; *) as_fn_error $? "invalid value of canonical host" "$LINENO" 5;; esac host=$ac_cv_host ac_save_IFS=$IFS; IFS='-' set x $ac_cv_host shift host_cpu=$1 host_vendor=$2 shift; shift # Remember, the first character of IFS is used to create $*, # except with old shells: host_os=$* IFS=$ac_save_IFS case $host_os in *\ *) host_os=`echo "$host_os" | sed 's/ /-/g'`;; esac { $as_echo "$as_me:${as_lineno-$LINENO}: checking target system type" >&5 $as_echo_n "checking target system type... " >&6; } if ${ac_cv_target+:} false; then : $as_echo_n "(cached) " >&6 else if test "x$target_alias" = x; then ac_cv_target=$ac_cv_host else ac_cv_target=`$SHELL "$ac_aux_dir/config.sub" $target_alias` || as_fn_error $? 
"$SHELL $ac_aux_dir/config.sub $target_alias failed" "$LINENO" 5 fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_target" >&5 $as_echo "$ac_cv_target" >&6; } case $ac_cv_target in *-*-*) ;; *) as_fn_error $? "invalid value of canonical target" "$LINENO" 5;; esac target=$ac_cv_target ac_save_IFS=$IFS; IFS='-' set x $ac_cv_target shift target_cpu=$1 target_vendor=$2 shift; shift # Remember, the first character of IFS is used to create $*, # except with old shells: target_os=$* IFS=$ac_save_IFS case $target_os in *\ *) target_os=`echo "$target_os" | sed 's/ /-/g'`;; esac # The aliases save the names the user supplied, while $host etc. # will get canonicalized. test -n "$target_alias" && test "$program_prefix$program_suffix$program_transform_name" = \ NONENONEs,x,x, && program_prefix=${target_alias}- { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to use default CFLAGS" >&5 $as_echo_n "checking whether to use default CFLAGS... " >&6; } if test x"$CFLAGS" = x"" ; then force_default_cc_options="yes" { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } else force_default_cc_options="no" { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}gcc", so it can be a program name with args. set dummy ${ac_tool_prefix}gcc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_CC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="${ac_tool_prefix}gcc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$ac_cv_prog_CC"; then ac_ct_CC=$CC # Extract the first word of "gcc", so it can be a program name with args. set dummy gcc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_ct_CC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_ac_ct_CC="gcc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_CC" >&5 $as_echo "$ac_ct_CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "x$ac_ct_CC" = x; then CC="" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac CC=$ac_ct_CC fi else CC="$ac_cv_prog_CC" fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}cc", so it can be a program name with args. set dummy ${ac_tool_prefix}cc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_CC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="${ac_tool_prefix}cc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi fi if test -z "$CC"; then # Extract the first word of "cc", so it can be a program name with args. set dummy cc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_CC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else ac_prog_rejected=no as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then if test "$as_dir/$ac_word$ac_exec_ext" = "/usr/ucb/cc"; then ac_prog_rejected=yes continue fi ac_cv_prog_CC="cc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS if test $ac_prog_rejected = yes; then # We found a bogon in the path, so make sure we never use it. set dummy $ac_cv_prog_CC shift if test $# != 0; then # We chose a different compiler from the bogus one. # However, it has the same basename, so the bogon will be chosen # first if we set CC to just the basename; use the full file name. 
shift ac_cv_prog_CC="$as_dir/$ac_word${1+' '}$@" fi fi fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then for ac_prog in cl.exe do # Extract the first word of "$ac_tool_prefix$ac_prog", so it can be a program name with args. set dummy $ac_tool_prefix$ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_CC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="$ac_tool_prefix$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$CC" && break done fi if test -z "$CC"; then ac_ct_CC=$CC for ac_prog in cl.exe do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_ct_CC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_ac_ct_CC="$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_CC" >&5 $as_echo "$ac_ct_CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$ac_ct_CC" && break done if test "x$ac_ct_CC" = x; then CC="" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac CC=$ac_ct_CC fi fi fi test -z "$CC" && { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "no acceptable C compiler found in \$PATH See \`config.log' for more details" "$LINENO" 5; } # Provide some information about the compiler. $as_echo "$as_me:${as_lineno-$LINENO}: checking for C compiler version" >&5 set X $ac_compile ac_compiler=$2 for ac_option in --version -v -V -qversion; do { { ac_try="$ac_compiler $ac_option >&5" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compiler $ac_option >&5") 2>conftest.err ac_status=$? if test -s conftest.err; then sed '10a\ ... rest of stderr output deleted ... 10q' conftest.err >conftest.er1 cat conftest.er1 >&5 fi rm -f conftest.er1 conftest.err $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } done cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ int main () { ; return 0; } _ACEOF ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files a.out a.out.dSYM a.exe b.out" # Try to create an executable without -o first, disregard a.out. # It will help us diagnose broken compilers, and finding out an intuition # of exeext. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether the C compiler works" >&5 $as_echo_n "checking whether the C compiler works... " >&6; } ac_link_default=`$as_echo "$ac_link" | sed 's/ -o *conftest[^ ]*//'` # The possible output files: ac_files="a.out conftest.exe conftest a.exe a_out.exe b.out conftest.*" ac_rmfiles= for ac_file in $ac_files do case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; * ) ac_rmfiles="$ac_rmfiles $ac_file";; esac done rm -f $ac_rmfiles if { { ac_try="$ac_link_default" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link_default") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; then : # Autoconf-2.13 could set the ac_cv_exeext variable to `no'. # So ignore a value of `no', otherwise this would lead to `EXEEXT = no' # in a Makefile. We should not override ac_cv_exeext if it was cached, # so that the user can short-circuit this test for compilers unknown to # Autoconf. for ac_file in $ac_files '' do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; [ab].out ) # We found the default executable, but exeext='' is most # certainly right. 
break;; *.* ) if test "${ac_cv_exeext+set}" = set && test "$ac_cv_exeext" != no; then :; else ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` fi # We set ac_cv_exeext here because the later test for it is not # safe: cross compilers may not add the suffix if given an `-o' # argument, so we may need to know it at that point already. # Even if this section looks crufty: it has the advantage of # actually working. break;; * ) break;; esac done test "$ac_cv_exeext" = no && ac_cv_exeext= else ac_file='' fi if test -z "$ac_file"; then : { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "C compiler cannot create executables See \`config.log' for more details" "$LINENO" 5; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for C compiler default output file name" >&5 $as_echo_n "checking for C compiler default output file name... " >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_file" >&5 $as_echo "$ac_file" >&6; } ac_exeext=$ac_cv_exeext rm -f -r a.out a.out.dSYM a.exe conftest$ac_cv_exeext b.out ac_clean_files=$ac_clean_files_save { $as_echo "$as_me:${as_lineno-$LINENO}: checking for suffix of executables" >&5 $as_echo_n "checking for suffix of executables... " >&6; } if { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; then : # If both `conftest.exe' and `conftest' are `present' (well, observable) # catch `conftest.exe'. 
For instance with Cygwin, `ls conftest' will # work properly (i.e., refer to `conftest.exe'), while it won't with # `rm'. for ac_file in conftest.exe conftest conftest.*; do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; *.* ) ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` break;; * ) break;; esac done else { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot compute suffix of executables: cannot compile and link See \`config.log' for more details" "$LINENO" 5; } fi rm -f conftest conftest$ac_cv_exeext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_exeext" >&5 $as_echo "$ac_cv_exeext" >&6; } rm -f conftest.$ac_ext EXEEXT=$ac_cv_exeext ac_exeext=$EXEEXT cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdio.h> int main () { FILE *f = fopen ("conftest.out", "w"); return ferror (f) || fclose (f) != 0; ; return 0; } _ACEOF ac_clean_files="$ac_clean_files conftest.out" # Check that the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether we are cross compiling" >&5 $as_echo_n "checking whether we are cross compiling... " >&6; } if test "$cross_compiling" != yes; then { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 test $ac_status = 0; } if { ac_try='./conftest$ac_cv_exeext' { { case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_try") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; }; then cross_compiling=no else if test "$cross_compiling" = maybe; then cross_compiling=yes else { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run C compiled programs. If you meant to cross compile, use \`--host'. See \`config.log' for more details" "$LINENO" 5; } fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $cross_compiling" >&5 $as_echo "$cross_compiling" >&6; } rm -f conftest.$ac_ext conftest$ac_cv_exeext conftest.out ac_clean_files=$ac_clean_files_save { $as_echo "$as_me:${as_lineno-$LINENO}: checking for suffix of object files" >&5 $as_echo_n "checking for suffix of object files... " >&6; } if ${ac_cv_objext+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.o conftest.obj if { { ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compile") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 test $ac_status = 0; }; then : for ac_file in conftest.o conftest.obj conftest.*; do test -f "$ac_file" || continue; case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM ) ;; *) ac_cv_objext=`expr "$ac_file" : '.*\.\(.*\)'` break;; esac done else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot compute suffix of object files: cannot compile See \`config.log' for more details" "$LINENO" 5; } fi rm -f conftest.$ac_cv_objext conftest.$ac_ext fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_objext" >&5 $as_echo "$ac_cv_objext" >&6; } OBJEXT=$ac_cv_objext ac_objext=$OBJEXT { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether we are using the GNU C compiler" >&5 $as_echo_n "checking whether we are using the GNU C compiler... " >&6; } if ${ac_cv_c_compiler_gnu+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { #ifndef __GNUC__ choke me #endif ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_compiler_gnu=yes else ac_compiler_gnu=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_cv_c_compiler_gnu=$ac_compiler_gnu fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_compiler_gnu" >&5 $as_echo "$ac_cv_c_compiler_gnu" >&6; } if test $ac_compiler_gnu = yes; then GCC=yes else GCC= fi ac_test_CFLAGS=${CFLAGS+set} ac_save_CFLAGS=$CFLAGS { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC accepts -g" >&5 $as_echo_n "checking whether $CC accepts -g... " >&6; } if ${ac_cv_prog_cc_g+:} false; then : $as_echo_n "(cached) " >&6 else ac_save_c_werror_flag=$ac_c_werror_flag ac_c_werror_flag=yes ac_cv_prog_cc_g=no CFLAGS="-g" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_g=yes else CFLAGS="" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : else ac_c_werror_flag=$ac_save_c_werror_flag CFLAGS="-g" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_g=yes fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_c_werror_flag=$ac_save_c_werror_flag fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_prog_cc_g" >&5 $as_echo "$ac_cv_prog_cc_g" >&6; } if test "$ac_test_CFLAGS" = set; then CFLAGS=$ac_save_CFLAGS elif test $ac_cv_prog_cc_g = yes; then if test "$GCC" = yes; then CFLAGS="-g -O2" else CFLAGS="-g" fi else if test "$GCC" = yes; then CFLAGS="-O2" else CFLAGS= fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $CC option to accept ISO C89" >&5 $as_echo_n "checking for $CC option to accept ISO C89... " >&6; } if ${ac_cv_prog_cc_c89+:} false; then : $as_echo_n "(cached) " >&6 else ac_cv_prog_cc_c89=no ac_save_CC=$CC cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdarg.h> #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> /* Most of the following tests are stolen from RCS 5.7's src/conf.sh. */ struct buf { int x; }; FILE * (*rcsopen) (struct buf *, struct stat *, int); static char *e (p, i) char **p; int i; { return p[i]; } static char *f (char * (*g) (char **, int), char **p, ...) { char *s; va_list v; va_start (v,p); s = g (p, va_arg (v,int)); va_end (v); return s; } /* OSF 4.0 Compaq cc is some sort of almost-ANSI by default. It has function prototypes and stuff, but not '\xHH' hex character constants. These don't provoke an error unfortunately, instead are silently treated as 'x'. 
The following induces an error, until -std is added to get proper ANSI mode. Curiously '\x00'!='x' always comes out true, for an array size at least. It's necessary to write '\x00'==0 to get something that's true only with -std. */ int osf4_cc_array ['\x00' == 0 ? 1 : -1]; /* IBM C 6 for AIX is almost-ANSI by default, but it replaces macro parameters inside strings and character constants. */ #define FOO(x) 'x' int xlc6_cc_array[FOO(a) == 'x' ? 1 : -1]; int test (int i, double x); struct s1 {int (*f) (int a);}; struct s2 {int (*f) (double a);}; int pairnames (int, char **, FILE *(*)(struct buf *, struct stat *, int), int, int); int argc; char **argv; int main () { return f (e, argv, 0) != argv[0] || f (e, argv, 1) != argv[1]; ; return 0; } _ACEOF for ac_arg in '' -qlanglvl=extc89 -qlanglvl=ansi -std \ -Ae "-Aa -D_HPUX_SOURCE" "-Xc -D__EXTENSIONS__" do CC="$ac_save_CC $ac_arg" if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_c89=$ac_arg fi rm -f core conftest.err conftest.$ac_objext test "x$ac_cv_prog_cc_c89" != "xno" && break done rm -f conftest.$ac_ext CC=$ac_save_CC fi # AC_CACHE_VAL case "x$ac_cv_prog_cc_c89" in x) { $as_echo "$as_me:${as_lineno-$LINENO}: result: none needed" >&5 $as_echo "none needed" >&6; } ;; xno) { $as_echo "$as_me:${as_lineno-$LINENO}: result: unsupported" >&5 $as_echo "unsupported" >&6; } ;; *) CC="$CC $ac_cv_prog_cc_c89" { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_prog_cc_c89" >&5 $as_echo "$ac_cv_prog_cc_c89" >&6; } ;; esac if test "x$ac_cv_prog_cc_c89" != xno; then : fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu # Find a good install program. We prefer a C program (faster), # so one script is as good as another. 
But avoid the broken or # incompatible versions: # SysV /etc/install, /usr/sbin/install # SunOS /usr/etc/install # IRIX /sbin/install # AIX /bin/install # AmigaOS /C/install, which installs bootblocks on floppy discs # AIX 4 /usr/bin/installbsd, which doesn't work without a -g flag # AFS /usr/afsws/bin/install, which mishandles nonexistent args # SVR4 /usr/ucb/install, which tries to use the nonexistent group "staff" # OS/2's system install, which has a completely different semantic # ./install, which can be erroneously created by make from ./install.sh. # Reject install programs that cannot install multiple files. { $as_echo "$as_me:${as_lineno-$LINENO}: checking for a BSD-compatible install" >&5 $as_echo_n "checking for a BSD-compatible install... " >&6; } if test -z "$INSTALL"; then if ${ac_cv_path_install+:} false; then : $as_echo_n "(cached) " >&6 else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. # Account for people who put trailing slashes in PATH elements. case $as_dir/ in #(( ./ | .// | /[cC]/* | \ /etc/* | /usr/sbin/* | /usr/etc/* | /sbin/* | /usr/afsws/bin/* | \ ?:[\\/]os2[\\/]install[\\/]* | ?:[\\/]OS2[\\/]INSTALL[\\/]* | \ /usr/ucb/* ) ;; *) # OSF1 and SCO ODT 3.0 have their own names for install. # Don't use installbsd from OSF since it installs stuff as root # by default. for ac_prog in ginstall scoinst install; do for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_prog$ac_exec_ext" && $as_test_x "$as_dir/$ac_prog$ac_exec_ext"; }; then if test $ac_prog = install && grep dspmsg "$as_dir/$ac_prog$ac_exec_ext" >/dev/null 2>&1; then # AIX install. It has an incompatible calling convention. : elif test $ac_prog = install && grep pwplus "$as_dir/$ac_prog$ac_exec_ext" >/dev/null 2>&1; then # program-specific install script used by HP pwplus--don't use. 
: else rm -rf conftest.one conftest.two conftest.dir echo one > conftest.one echo two > conftest.two mkdir conftest.dir if "$as_dir/$ac_prog$ac_exec_ext" -c conftest.one conftest.two "`pwd`/conftest.dir" && test -s conftest.one && test -s conftest.two && test -s conftest.dir/conftest.one && test -s conftest.dir/conftest.two then ac_cv_path_install="$as_dir/$ac_prog$ac_exec_ext -c" break 3 fi fi fi done done ;; esac done IFS=$as_save_IFS rm -rf conftest.one conftest.two conftest.dir fi if test "${ac_cv_path_install+set}" = set; then INSTALL=$ac_cv_path_install else # As a last resort, use the slow shell script. Don't cache a # value for INSTALL within a source directory, because that will # break other packages using the cache if that directory is # removed, or if the value is a relative name. INSTALL=$ac_install_sh fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $INSTALL" >&5 $as_echo "$INSTALL" >&6; } # Use test -z because SunOS4 sh mishandles braces in ${var-val}. # It thinks the first close brace ends the variable substitution. test -z "$INSTALL_PROGRAM" && INSTALL_PROGRAM='${INSTALL}' test -z "$INSTALL_SCRIPT" && INSTALL_SCRIPT='${INSTALL}' test -z "$INSTALL_DATA" && INSTALL_DATA='${INSTALL} -m 644' if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}ranlib", so it can be a program name with args. set dummy ${ac_tool_prefix}ranlib; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_RANLIB+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$RANLIB"; then ac_cv_prog_RANLIB="$RANLIB" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_RANLIB="${ac_tool_prefix}ranlib" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi RANLIB=$ac_cv_prog_RANLIB if test -n "$RANLIB"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $RANLIB" >&5 $as_echo "$RANLIB" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$ac_cv_prog_RANLIB"; then ac_ct_RANLIB=$RANLIB # Extract the first word of "ranlib", so it can be a program name with args. set dummy ranlib; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_ct_RANLIB+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_RANLIB"; then ac_cv_prog_ac_ct_RANLIB="$ac_ct_RANLIB" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_ac_ct_RANLIB="ranlib" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_RANLIB=$ac_cv_prog_ac_ct_RANLIB if test -n "$ac_ct_RANLIB"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_RANLIB" >&5 $as_echo "$ac_ct_RANLIB" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "x$ac_ct_RANLIB" = x; then RANLIB=":" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac RANLIB=$ac_ct_RANLIB fi else RANLIB="$ac_cv_prog_RANLIB" fi if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}ar", so it can be a program name with args. set dummy ${ac_tool_prefix}ar; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_AR+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$AR"; then ac_cv_prog_AR="$AR" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_AR="${ac_tool_prefix}ar" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi AR=$ac_cv_prog_AR if test -n "$AR"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $AR" >&5 $as_echo "$AR" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$ac_cv_prog_AR"; then ac_ct_AR=$AR # Extract the first word of "ar", so it can be a program name with args. set dummy ar; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_ct_AR+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_AR"; then ac_cv_prog_ac_ct_AR="$ac_ct_AR" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_ac_ct_AR="ar" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_AR=$ac_cv_prog_ac_ct_AR if test -n "$ac_ct_AR"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_AR" >&5 $as_echo "$ac_ct_AR" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "x$ac_ct_AR" = x; then AR="ar-not-found" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac AR=$ac_ct_AR fi else AR="$ac_cv_prog_AR" fi ARCHITECTURE="" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for whether to use assembly code" >&5 $as_echo_n "checking for whether to use assembly code... " >&6; } if test x"$assembly" = x"yes" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for architecture type" >&5 $as_echo_n "checking for architecture type... 
" >&6; } case "$target_cpu" in i[3456]86) { $as_echo "$as_me:${as_lineno-$LINENO}: result: ia32" >&5 $as_echo "ia32" >&6; } ARCHITECTURE="IA32" ;; x86_64) { $as_echo "$as_me:${as_lineno-$LINENO}: result: x86_64" >&5 $as_echo "x86_64" >&6; } ARCHITECTURE="X86_64" ;; powerpc) { $as_echo "$as_me:${as_lineno-$LINENO}: result: PowerPC" >&5 $as_echo "PowerPC" >&6; } ARCHITECTURE="PPC" ;; ia64) { $as_echo "$as_me:${as_lineno-$LINENO}: result: ia64" >&5 $as_echo "ia64" >&6; } ARCHITECTURE="IA64" ;; *) { $as_echo "$as_me:${as_lineno-$LINENO}: result: $target_cpu" >&5 $as_echo "$target_cpu" >&6; } ARCHITECTURE="GENERIC" ;; esac else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } ARCHITECTURE="GENERIC" fi BUS="" ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu { $as_echo "$as_me:${as_lineno-$LINENO}: checking how to run the C preprocessor" >&5 $as_echo_n "checking how to run the C preprocessor... " >&6; } # On Suns, sometimes $CPP names a directory. if test -n "$CPP" && test -d "$CPP"; then CPP= fi if test -z "$CPP"; then if ${ac_cv_prog_CPP+:} false; then : $as_echo_n "(cached) " >&6 else # Double quotes because CPP needs to be expanded for CPP in "$CC -E" "$CC -E -traditional-cpp" "/lib/cpp" do ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since # <limits.h> exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : else # Broken: fails on valid input. 
continue fi rm -f conftest.err conftest.i conftest.$ac_ext # OK, works on sane cases. Now check whether nonexistent headers # can be detected and how. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <ac_nonexistent.h> _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : # Broken: success on invalid input. continue else # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.i conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. rm -f conftest.i conftest.err conftest.$ac_ext if $ac_preproc_ok; then : break fi done ac_cv_prog_CPP=$CPP fi CPP=$ac_cv_prog_CPP else ac_cv_prog_CPP=$CPP fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CPP" >&5 $as_echo "$CPP" >&6; } ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since # <limits.h> exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : else # Broken: fails on valid input. continue fi rm -f conftest.err conftest.i conftest.$ac_ext # OK, works on sane cases. Now check whether nonexistent headers # can be detected and how. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <ac_nonexistent.h> _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : # Broken: success on invalid input. continue else # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.i conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. 
rm -f conftest.i conftest.err conftest.$ac_ext if $ac_preproc_ok; then : else { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "C preprocessor \"$CPP\" fails sanity check See \`config.log' for more details" "$LINENO" 5; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu { $as_echo "$as_me:${as_lineno-$LINENO}: checking for grep that handles long lines and -e" >&5 $as_echo_n "checking for grep that handles long lines and -e... " >&6; } if ${ac_cv_path_GREP+:} false; then : $as_echo_n "(cached) " >&6 else if test -z "$GREP"; then ac_path_GREP_found=false # Loop through the user's path and test for each of PROGNAME-LIST as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_prog in grep ggrep; do for ac_exec_ext in '' $ac_executable_extensions; do ac_path_GREP="$as_dir/$ac_prog$ac_exec_ext" { test -f "$ac_path_GREP" && $as_test_x "$ac_path_GREP"; } || continue # Check for GNU ac_path_GREP and select it if it is found. 
# Check for GNU $ac_path_GREP case `"$ac_path_GREP" --version 2>&1` in *GNU*) ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_found=:;; *) ac_count=0 $as_echo_n 0123456789 >"conftest.in" while : do cat "conftest.in" "conftest.in" >"conftest.tmp" mv "conftest.tmp" "conftest.in" cp "conftest.in" "conftest.nl" $as_echo 'GREP' >> "conftest.nl" "$ac_path_GREP" -e 'GREP$' -e '-(cannot match)-' < "conftest.nl" >"conftest.out" 2>/dev/null || break diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break as_fn_arith $ac_count + 1 && ac_count=$as_val if test $ac_count -gt ${ac_path_GREP_max-0}; then # Best one so far, save it but keep looking for a better one ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_max=$ac_count fi # 10*(2^10) chars as input seems more than enough test $ac_count -gt 10 && break done rm -f conftest.in conftest.tmp conftest.nl conftest.out;; esac $ac_path_GREP_found && break 3 done done done IFS=$as_save_IFS if test -z "$ac_cv_path_GREP"; then as_fn_error $? "no acceptable grep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" "$LINENO" 5 fi else ac_cv_path_GREP=$GREP fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_path_GREP" >&5 $as_echo "$ac_cv_path_GREP" >&6; } GREP="$ac_cv_path_GREP" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for egrep" >&5 $as_echo_n "checking for egrep... " >&6; } if ${ac_cv_path_EGREP+:} false; then : $as_echo_n "(cached) " >&6 else if echo a | $GREP -E '(a|b)' >/dev/null 2>&1 then ac_cv_path_EGREP="$GREP -E" else if test -z "$EGREP"; then ac_path_EGREP_found=false # Loop through the user's path and test for each of PROGNAME-LIST as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_prog in egrep; do for ac_exec_ext in '' $ac_executable_extensions; do ac_path_EGREP="$as_dir/$ac_prog$ac_exec_ext" { test -f "$ac_path_EGREP" && $as_test_x "$ac_path_EGREP"; } || continue # Check for GNU ac_path_EGREP and select it if it is found. # Check for GNU $ac_path_EGREP case `"$ac_path_EGREP" --version 2>&1` in *GNU*) ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_found=:;; *) ac_count=0 $as_echo_n 0123456789 >"conftest.in" while : do cat "conftest.in" "conftest.in" >"conftest.tmp" mv "conftest.tmp" "conftest.in" cp "conftest.in" "conftest.nl" $as_echo 'EGREP' >> "conftest.nl" "$ac_path_EGREP" 'EGREP$' < "conftest.nl" >"conftest.out" 2>/dev/null || break diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break as_fn_arith $ac_count + 1 && ac_count=$as_val if test $ac_count -gt ${ac_path_EGREP_max-0}; then # Best one so far, save it but keep looking for a better one ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_max=$ac_count fi # 10*(2^10) chars as input seems more than enough test $ac_count -gt 10 && break done rm -f conftest.in conftest.tmp conftest.nl conftest.out;; esac $ac_path_EGREP_found && break 3 done done done IFS=$as_save_IFS if test -z "$ac_cv_path_EGREP"; then as_fn_error $? "no acceptable egrep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" "$LINENO" 5 fi else ac_cv_path_EGREP=$EGREP fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_path_EGREP" >&5 $as_echo "$ac_cv_path_EGREP" >&6; } EGREP="$ac_cv_path_EGREP" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ANSI C header files" >&5 $as_echo_n "checking for ANSI C header files... " >&6; } if ${ac_cv_header_stdc+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ #include <stdlib.h> #include <stdarg.h> #include <string.h> #include <float.h> int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_header_stdc=yes else ac_cv_header_stdc=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext if test $ac_cv_header_stdc = yes; then # SunOS 4.x string.h does not declare mem*, contrary to ANSI. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <string.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "memchr" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdlib.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "free" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi. if test "$cross_compiling" = yes; then : : else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <ctype.h> #include <stdlib.h> #if ((' ' & 0x0FF) == 0x020) # define ISLOWER(c) ('a' <= (c) && (c) <= 'z') # define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c)) #else # define ISLOWER(c) \ (('a' <= (c) && (c) <= 'i') \ || ('j' <= (c) && (c) <= 'r') \ || ('s' <= (c) && (c) <= 'z')) # define TOUPPER(c) (ISLOWER(c) ? 
((c) | 0x40) : (c)) #endif #define XOR(e, f) (((e) && !(f)) || (!(e) && (f))) int main () { int i; for (i = 0; i < 256; i++) if (XOR (islower (i), ISLOWER (i)) || toupper (i) != TOUPPER (i)) return 2; return 0; } _ACEOF if ac_fn_c_try_run "$LINENO"; then : else ac_cv_header_stdc=no fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_header_stdc" >&5 $as_echo "$ac_cv_header_stdc" >&6; } if test $ac_cv_header_stdc = yes; then $as_echo "#define STDC_HEADERS 1" >>confdefs.h fi # On IRIX 5.3, sys/types and inttypes.h are conflicting. for ac_header in sys/types.h sys/stat.h stdlib.h string.h memory.h strings.h \ inttypes.h stdint.h unistd.h do : as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default " if eval test \"x\$"$as_ac_Header"\" = x"yes"; then : cat >>confdefs.h <<_ACEOF #define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done # The cast to long int works around a bug in the HP C Compiler # version HP92453-01 B.11.11.23709.GP, which incorrectly rejects # declarations like `int a3[[(sizeof (unsigned char)) >= 0]];'. # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of int *" >&5 $as_echo_n "checking size of int *... 
" >&6; } if ${ac_cv_sizeof_int_p+:} false; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (int *))" "ac_cv_sizeof_int_p" "$ac_includes_default"; then : else if test "$ac_cv_type_int_p" = yes; then { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (int *) See \`config.log' for more details" "$LINENO" 5; } else ac_cv_sizeof_int_p=0 fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_sizeof_int_p" >&5 $as_echo "$ac_cv_sizeof_int_p" >&6; } cat >>confdefs.h <<_ACEOF #define SIZEOF_INT_P $ac_cv_sizeof_int_p _ACEOF case "$ac_cv_sizeof_int_p" in 4) BUS="32BIT" ;; 8) BUS="64BIT" ;; *) as_fn_error $? "Xvid supports only 32/64 bit architectures" "$LINENO" 5 ;; esac ENDIANNESS="" { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether byte ordering is bigendian" >&5 $as_echo_n "checking whether byte ordering is bigendian... " >&6; } if ${ac_cv_c_bigendian+:} false; then : $as_echo_n "(cached) " >&6 else ac_cv_c_bigendian=unknown # See if we're dealing with a universal compiler. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #ifndef __APPLE_CC__ not a universal capable compiler #endif typedef int dummy; _ACEOF if ac_fn_c_try_compile "$LINENO"; then : # Check for potential -arch flags. It is not universal unless # there are at least two -arch flags with different values. ac_arch= ac_prev= for ac_word in $CC $CFLAGS $CPPFLAGS $LDFLAGS; do if test -n "$ac_prev"; then case $ac_word in i?86 | x86_64 | ppc | ppc64) if test -z "$ac_arch" || test "$ac_arch" = "$ac_word"; then ac_arch=$ac_word else ac_cv_c_bigendian=universal break fi ;; esac ac_prev= elif test "x$ac_word" = "x-arch"; then ac_prev=arch fi done fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext if test $ac_cv_c_bigendian = unknown; then # See if sys/param.h defines the BYTE_ORDER macro. 
cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <sys/types.h> #include <sys/param.h> int main () { #if ! (defined BYTE_ORDER && defined BIG_ENDIAN \ && defined LITTLE_ENDIAN && BYTE_ORDER && BIG_ENDIAN \ && LITTLE_ENDIAN) bogus endian macros #endif ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : # It does; now see whether it defined to BIG_ENDIAN or not. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <sys/types.h> #include <sys/param.h> int main () { #if BYTE_ORDER != BIG_ENDIAN not big endian #endif ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_c_bigendian=yes else ac_cv_c_bigendian=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi if test $ac_cv_c_bigendian = unknown; then # See if <limits.h> defines _LITTLE_ENDIAN or _BIG_ENDIAN (e.g., Solaris). cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <limits.h> int main () { #if ! (defined _LITTLE_ENDIAN || defined _BIG_ENDIAN) bogus endian macros #endif ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : # It does; now see whether it defined to _BIG_ENDIAN or not. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <limits.h> int main () { #ifndef _BIG_ENDIAN not big endian #endif ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_c_bigendian=yes else ac_cv_c_bigendian=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi if test $ac_cv_c_bigendian = unknown; then # Compile a test program. if test "$cross_compiling" = yes; then : # Try to guess by grepping values from an object file. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ short int ascii_mm[] = { 0x4249, 0x4765, 0x6E44, 0x6961, 0x6E53, 0x7953, 0 }; short int ascii_ii[] = { 0x694C, 0x5454, 0x656C, 0x6E45, 0x6944, 0x6E61, 0 }; int use_ascii (int i) { return ascii_mm[i] + ascii_ii[i]; } short int ebcdic_ii[] = { 0x89D3, 0xE3E3, 0x8593, 0x95C5, 0x89C4, 0x9581, 0 }; short int ebcdic_mm[] = { 0xC2C9, 0xC785, 0x95C4, 0x8981, 0x95E2, 0xA8E2, 0 }; int use_ebcdic (int i) { return ebcdic_mm[i] + ebcdic_ii[i]; } extern int foo; int main () { return use_ascii (foo) == use_ebcdic (foo); ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : if grep BIGenDianSyS conftest.$ac_objext >/dev/null; then ac_cv_c_bigendian=yes fi if grep LiTTleEnDian conftest.$ac_objext >/dev/null ; then if test "$ac_cv_c_bigendian" = unknown; then ac_cv_c_bigendian=no else # finding both strings is unlikely to happen, but who knows? ac_cv_c_bigendian=unknown fi fi fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $ac_includes_default int main () { /* Are we little or big endian? From Harbison&Steele. */ union { long int l; char c[sizeof (long int)]; } u; u.l = 1; return u.c[sizeof (long int) - 1] == 1; ; return 0; } _ACEOF if ac_fn_c_try_run "$LINENO"; then : ac_cv_c_bigendian=no else ac_cv_c_bigendian=yes fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_bigendian" >&5 $as_echo "$ac_cv_c_bigendian" >&6; } case $ac_cv_c_bigendian in #( yes) ENDIANNESS="BIG_ENDIAN";; #( no) ENDIANNESS="LITTLE_ENDIAN" ;; #( universal) $as_echo "#define AC_APPLE_UNIVERSAL_BUILD 1" >>confdefs.h ;; #( *) as_fn_error $? "unknown endianness presetting ac_cv_c_bigendian=no (or yes) will help" "$LINENO" 5 ;; esac { $as_echo "$as_me:${as_lineno-$LINENO}: checking for build extensions" >&5 $as_echo_n "checking for build extensions... 
" >&6; } SHARED_EXTENSION="" STATIC_EXTENSION="" OBJECT_EXTENSION="" case "$target_os" in *bsd*|linux*|beos|irix*|solaris*) { $as_echo "$as_me:${as_lineno-$LINENO}: result: .so .a .o" >&5 $as_echo ".so .a .o" >&6; } STATIC_EXTENSION="a" SHARED_EXTENSION="so" OBJECT_EXTENSION="o" ;; [cC][yY][gG][wW][iI][nN]*|mingw32*|mks*) { $as_echo "$as_me:${as_lineno-$LINENO}: result: .dll .a .obj" >&5 $as_echo ".dll .a .obj" >&6; } STATIC_EXTENSION="a" SHARED_EXTENSION="dll" OBJECT_EXTENSION="obj" ;; darwin*|raphsody*) if test x"$macosx_module" = x"yes"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: .so .a .o" >&5 $as_echo ".so .a .o" >&6; } SHARED_EXTENSION="so" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: .dynlib .a .o" >&5 $as_echo ".dynlib .a .o" >&6; } SHARED_EXTENSION="dylib" fi STATIC_EXTENSION="a" OBJECT_EXTENSION="o" ;; *) { $as_echo "$as_me:${as_lineno-$LINENO}: result: Unknown OS - Using .so .a .o" >&5 $as_echo "Unknown OS - Using .so .a .o" >&6; } STATIC_EXTENSION="a" SHARED_EXTENSION="so" OBJECT_EXTENSION="o" ;; esac { $as_echo "$as_me:${as_lineno-$LINENO}: checking for platform specific LDFLAGS/CFLAGS" >&5 $as_echo_n "checking for platform specific LDFLAGS/CFLAGS... 
" >&6; } SPECIFIC_LDFLAGS="" SPECIFIC_CFLAGS="" ALTIVEC_CFLAGS="" PRE_SHARED_LIB="" case "$target_os" in linux*|solaris*) { $as_echo "$as_me:${as_lineno-$LINENO}: result: ok" >&5 $as_echo "ok" >&6; } STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR).\$(API_MINOR)" SPECIFIC_LDFLAGS="-Wl,-soname,libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR) -shared -Wl,--version-script=libxvidcore.ld -lc -lm" SPECIFIC_CFLAGS="-fPIC" ;; *bsd*|irix*) { $as_echo "$as_me:${as_lineno-$LINENO}: result: ok" >&5 $as_echo "ok" >&6; } STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR).\$(API_MINOR)" SPECIFIC_LDFLAGS="-Wl,-soname,libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR) -shared -lc -lm" SPECIFIC_CFLAGS="-fPIC" ;; [cC][yY][gG][wW][iI][nN]*|mingw32*|mks*) { $as_echo "$as_me:${as_lineno-$LINENO}: result: ok" >&5 $as_echo "ok" >&6; } STATIC_LIB="xvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="xvidcore.\$(SHARED_EXTENSION)" SPECIFIC_LDFLAGS="-mno-cygwin -shared -Wl,--dll,--out-implib,\$@.a libxvidcore.def" SPECIFIC_CFLAGS="-mno-cygwin" ;; darwin*|raphsody*) STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SPECIFIC_CFLAGS="-fPIC -fno-common -no-cpp-precomp" if test x"$macosx_module" = x"no"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: dylib options" >&5 $as_echo "dylib options" >&6; } SHARED_LIB="libxvidcore.\$(API_MAJOR).\$(SHARED_EXTENSION)" SPECIFIC_LDFLAGS="-Wl,-read_only_relocs,suppress -dynamiclib -flat_namespace -compatibility_version \$(API_MAJOR) -current_version \$(API_MAJOR).\$(API_MINOR) -install_name \$(libdir)/\$(SHARED_LIB)" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: module options" >&5 $as_echo "module options" >&6; } PRE_SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION)-temp.o" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR)" SPECIFIC_LDFLAGS="-r -keep_private_externs -nostdlib && \$(CC) \$(LDFLAGS) \$(PRE_SHARED_LIB) -o 
libxvidcore.\$(SHARED_EXTENSION).\$(API_MAJOR) -bundle -flat_namespace -undefined suppress" fi ;; beos) { $as_echo "$as_me:${as_lineno-$LINENO}: result: ok" >&5 $as_echo "ok" >&6; } STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION)" SPECIFIC_LDFLAGS="-nostart" SPECIFIC_CFLAGS="-fPIC" ;; *) { $as_echo "$as_me:${as_lineno-$LINENO}: result: Unknown Platform (Using default -shared -lc -lm)" >&5 $as_echo "Unknown Platform (Using default -shared -lc -lm)" >&6; } STATIC_LIB="libxvidcore.\$(STATIC_EXTENSION)" SHARED_LIB="libxvidcore.\$(SHARED_EXTENSION)" SPECIFIC_LDFLAGS="" SPECIFIC_CFLAGS="" ;; esac if test x"$PRE_SHARED_LIB" = x; then PRE_SHARED_LIB=$SHARED_LIB fi AS="" AFLAGS="" ASSEMBLY_EXTENSION="" GENERIC_SOURCES="SRC_GENERIC" ASSEMBLY_SOURCES="" if test "$ARCHITECTURE" = "IA32" -o "$ARCHITECTURE" = "X86_64" ; then found_nasm_comp_prog="no" chosen_asm_prog="" # Extract the first word of "$yasm_prog", so it can be a program name with args. set dummy $yasm_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_yasm+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_yasm"; then ac_cv_prog_ac_yasm="$ac_yasm" # Let the user override the test. else ac_prog_rejected=no as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then if test "$as_dir/$ac_word$ac_exec_ext" = "yes"; then ac_prog_rejected=yes continue fi ac_cv_prog_ac_yasm="yes" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS if test $ac_prog_rejected = yes; then # We found a bogon in the path, so make sure we never use it. 
set dummy $ac_cv_prog_ac_yasm shift if test $# != 0; then # We chose a different compiler from the bogus one. # However, it has the same basename, so the bogon will be chosen # first if we set ac_yasm to just the basename; use the full file name. shift ac_cv_prog_ac_yasm="$as_dir/$ac_word${1+' '}$@" fi fi test -z "$ac_cv_prog_ac_yasm" && ac_cv_prog_ac_yasm="no" fi fi ac_yasm=$ac_cv_prog_ac_yasm if test -n "$ac_yasm"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_yasm" >&5 $as_echo "$ac_yasm" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "$ac_yasm" = "yes" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for yasm version" >&5 $as_echo_n "checking for yasm version... " >&6; } yasm_major=`$yasm_prog --version | head -1 | cut -d '.' -f 1 | cut -d ' ' -f 2` if test -z $yasm_major ; then yasm_major=-1 fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $yasm_major" >&5 $as_echo "$yasm_major" >&6; } if test "$yasm_major" -lt "$minimum_yasm_major_version" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: yasm version is too old" >&5 $as_echo "$as_me: WARNING: yasm version is too old" >&2;} else found_nasm_comp_prog="yes" chosen_asm_prog="$yasm_prog" fi fi if test "$found_nasm_comp_prog" = "no" ; then # Extract the first word of "$nasm_prog", so it can be a program name with args. set dummy $nasm_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_nasm+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_nasm"; then ac_cv_prog_ac_nasm="$ac_nasm" # Let the user override the test. else ac_prog_rejected=no as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then if test "$as_dir/$ac_word$ac_exec_ext" = "yes"; then ac_prog_rejected=yes continue fi ac_cv_prog_ac_nasm="yes" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS if test $ac_prog_rejected = yes; then # We found a bogon in the path, so make sure we never use it. set dummy $ac_cv_prog_ac_nasm shift if test $# != 0; then # We chose a different compiler from the bogus one. # However, it has the same basename, so the bogon will be chosen # first if we set ac_nasm to just the basename; use the full file name. shift ac_cv_prog_ac_nasm="$as_dir/$ac_word${1+' '}$@" fi fi test -z "$ac_cv_prog_ac_nasm" && ac_cv_prog_ac_nasm="no" fi fi ac_nasm=$ac_cv_prog_ac_nasm if test -n "$ac_nasm"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_nasm" >&5 $as_echo "$ac_nasm" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "$ac_nasm" = "yes" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for nasm version" >&5 $as_echo_n "checking for nasm version... " >&6; } nasm_minor=`$nasm_prog -v | cut -d '.' -f 2 | cut -d ' ' -f 1` nasm_major=`$nasm_prog -v | cut -d '.' -f 1 | cut -d ' ' -f 3` if test -z $nasm_minor ; then nasm_minor=-1 fi if test -z $nasm_major ; then nasm_major=-1 fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $nasm_major" >&5 $as_echo "$nasm_major" >&6; } if test "$nasm_major" -lt "$minimum_nasm_major_version" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: nasm version is too old" >&5 $as_echo "$as_me: WARNING: nasm version is too old" >&2;} else found_nasm_comp_prog="yes" chosen_asm_prog="$nasm_prog" fi fi fi if test "$found_nasm_comp_prog" = "yes" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for asm object format" >&5 $as_echo_n "checking for asm object format... 
" >&6; } case "$target_os" in *bsd*|linux*|beos|irix*|solaris*) if test "$ARCHITECTURE" = "X86_64" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: elf64" >&5 $as_echo "elf64" >&6; } NASM_FORMAT="elf64" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: elf" >&5 $as_echo "elf" >&6; } NASM_FORMAT="elf" fi MARK_FUNCS="-DMARK_FUNCS" PREFIX="" ;; [cC][yY][gG][wW][iI][nN]*|mingw32*|mks*) if test "$ARCHITECTURE" = "X86_64" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: win64" >&5 $as_echo "win64" >&6; } NASM_FORMAT="win64" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: win32" >&5 $as_echo "win32" >&6; } NASM_FORMAT="win32" fi PREFIX="-DWINDOWS" MARK_FUNCS="" ;; *darwin*) if test "$ARCHITECTURE" = "X86_64" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: macho64" >&5 $as_echo "macho64" >&6; } NASM_FORMAT="macho64" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: macho32" >&5 $as_echo "macho32" >&6; } NASM_FORMAT="macho32" fi PREFIX="-DPREFIX" MARK_FUNCS="" ;; esac AS="$chosen_asm_prog" ASSEMBLY_EXTENSION="asm" AFLAGS="-I\$(&5 $as_echo "$as_me: WARNING: no correct assembler was found - Compiling generic sources only" >&2;} ARCHITECTURE="GENERIC" fi fi PPC_ALTIVEC_SOURCES="" if test "$ARCHITECTURE" = "PPC" ; then AS="\$(CC)" AFLAGS="" ASSEMBLY_EXTENSION=".s" ASSEMBLY_SOURCES="" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for altivec.h" >&5 $as_echo_n "checking for altivec.h... 
" >&6; } cat > conftest.c << EOF #include int main() { return(0); } EOF if $CC -arch ppc -faltivec -c conftest.c 2>/dev/null 1>/dev/null || \ $CC -maltivec -mabi=altivec -c conftest.c 2>/dev/null 1>/dev/null ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS -DHAVE_ALTIVEC_H" TEMP_ALTIVEC="-DHAVE_ALTIVEC_H" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } TEMP_ALTIVEC="" fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for Altivec compiler support" >&5 $as_echo_n "checking for Altivec compiler support... " >&6; } cat > conftest.c << EOF #ifdef HAVE_ALTIVEC_H #include #endif int main() { vector unsigned int vartest2 = (vector unsigned int)(0); vector unsigned int vartest3 = (vector unsigned int)(1); vartest2 = vec_add(vartest2, vartest3); return(0); } EOF if $CC $TEMP_ALTIVEC -arch ppc -faltivec -c conftest.c 2>/dev/null 1>/dev/null ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes (Apple)" >&5 $as_echo "yes (Apple)" >&6; } SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS -arch ppc -faltivec -DHAVE_ALTIVEC_PARENTHESES_DECL $TEMP_ALTIVEC" PPC_ALTIVEC_SOURCES="SRC_PPC_ALTIVEC" else cat > conftest.c << EOF #ifdef HAVE_ALTIVEC_H #include #endif int main() { vector unsigned int vartest2 = (vector unsigned int){0}; vector unsigned int vartest3 = (vector unsigned int){1}; vartest2 = vec_add(vartest2, vartest3); return(0); } EOF if $CC $TEMP_ALTIVEC -maltivec -mabi=altivec -c conftest.c 2>/dev/null 1>/dev/null ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes (GNU)" >&5 $as_echo "yes (GNU)" >&6; } SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS -DHAVE_ALTIVEC_BRACES_DECL $TEMP_ALTIVEC" PPC_ALTIVEC_SOURCES="SRC_PPC_ALTIVEC" ALTIVEC_CFLAGS="-maltivec -mabi=altivec" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no (ppc support won't be compiled in)" >&5 $as_echo "no (ppc support won't be compiled in)" >&6; } ARCHITECTURE="GENERIC" fi fi rm -f conftest.* fi if test 
"$ARCHITECTURE" = "IA64" ; then AS="\$(CC)" AFLAGS="-c" ASSEMBLY_EXTENSION="s" ASSEMBLY_SOURCES="SRC_IA64" case `basename $CC` in *ecc*) DCT_IA64_SOURCES="SRC_IA64_IDCT_ECC" if test "$force_default_cc_options" = "yes" ; then our_cflags_defaults="" fi ;; *) DCT_IA64_SOURCES="SRC_IA64_IDCT_GCC" ;; esac fi for ac_header in stdio.h \ signal.h \ stdlib.h \ string.h \ assert.h \ math.h \ do : as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default" if eval test \"x\$"$as_ac_Header"\" = x"yes"; then : cat >>confdefs.h <<_ACEOF #define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF else as_fn_error $? "Missing header file" "$LINENO" 5 fi done if test x"$pthread" = x"yes" ; then ac_fn_c_check_header_mongrel "$LINENO" "pthread.h" "ac_cv_header_pthread_h" "$ac_includes_default" if test "x$ac_cv_header_pthread_h" = xyes; then : { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pthread_create in -lpthread" >&5 $as_echo_n "checking for pthread_create in -lpthread... " >&6; } if ${ac_cv_lib_pthread_pthread_create+:} false; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lpthread $LIBS" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ /* Override any GCC internal prototype to avoid an error. Use char because int might match the return type of a GCC builtin and then its argument prototype would still apply. 
*/ #ifdef __cplusplus extern "C" #endif char pthread_create (); int main () { return pthread_create (); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : ac_cv_lib_pthread_pthread_create=yes else ac_cv_lib_pthread_pthread_create=no fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_pthread_pthread_create" >&5 $as_echo "$ac_cv_lib_pthread_pthread_create" >&6; } if test "x$ac_cv_lib_pthread_pthread_create" = xyes; then : SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS -DHAVE_PTHREAD" SPECIFIC_LDFLAGS="$SPECIFIC_LDFLAGS -lpthread" else { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: Pthread not supported. No SMP support" >&5 $as_echo "$as_me: WARNING: Pthread not supported. No SMP support" >&2;} fi else { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: Pthread not supported. No SMP support" >&5 $as_echo "$as_me: WARNING: Pthread not supported. No SMP support" >&2;} fi else { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: Pthread support disabled. No SMP support" >&5 $as_echo "$as_me: WARNING: Pthread support disabled. 
No SMP support" >&2;} fi if test "$force_default_cc_options" = "yes" ; then CFLAGS="$our_cflags_defaults" fi SPECIFIC_LDFLAGS="$SPECIFIC_LDFLAGS $GNU_PROF_LDFLAGS" SPECIFIC_CFLAGS="$SPECIFIC_CFLAGS $GNU_PROF_CFLAGS" if test "$enable_gnuprofile" = "yes" ; then CFLAGS=`echo $CFLAGS | sed s/'-fomit-frame-pointer'/''/` fi if test "$GCC" = "yes" ; then cat << EOF > test.c #include <stdio.h> int main(int argc, char **argv) { if (*argv[1] == 'M') { printf("%d", __GNUC__); } if (*argv[1] == 'm') { printf("%d", __GNUC_MINOR__); } return 0; } EOF $CC -o gcc-ver test.c GCC_MAJOR=`./gcc-ver M` GCC_MINOR=`./gcc-ver m` rm -f test.c rm -f gcc-ver # GCC 4.x if test "${GCC_MAJOR}" -gt 3 ; then CFLAGS=`echo $CFLAGS | sed s,"-mcpu","-mtune",g` CFLAGS=`echo $CFLAGS | sed s,'-freduce-all-givs','',g` CFLAGS=`echo $CFLAGS | sed s,'-fmove-all-movables','',g` CFLAGS=`echo $CFLAGS | sed s,'-fnew-ra','',g` CFLAGS=`echo $CFLAGS | sed s,'-fwritable-strings','',g` fi # GCC 3.4.x if test "${GCC_MAJOR}" -eq 3 && test "${GCC_MINOR}" -gt 3 ; then CFLAGS=`echo $CFLAGS | sed s,"-mcpu","-mtune",g` fi fi ac_config_files="$ac_config_files platform.inc" cat >confcache <<\_ACEOF # This file is a shell script that caches the results of configure # tests run on this system so they can be shared between configure # scripts and configure runs, see configure's option --config-cache. # It is not useful on other systems. If it contains results you don't # want to keep, you may remove or edit it. # # config.status only pays attention to the cache file if you give it # the --recheck option to rerun configure. # # `ac_cv_env_foo' variables (set or unset) will be overridden when # loading this file, other *unset* `ac_cv_foo' will be assigned the # following values. _ACEOF # The following way of writing the cache mishandles newlines in values, # but we know of no workaround that is simple, portable, and efficient. # So, we kill variables containing newlines. 
# Ultrix sh set writes to stderr and can't be redirected directly, # and sets the high bit in the cache file unless we assign to the vars. ( for ac_var in `(set) 2>&1 | sed -n 's/^\([a-zA-Z_][a-zA-Z0-9_]*\)=.*/\1/p'`; do eval ac_val=\$$ac_var case $ac_val in #( *${as_nl}*) case $ac_var in #( *_cv_*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cache variable $ac_var contains a newline" >&5 $as_echo "$as_me: WARNING: cache variable $ac_var contains a newline" >&2;} ;; esac case $ac_var in #( _ | IFS | as_nl) ;; #( BASH_ARGV | BASH_SOURCE) eval $ac_var= ;; #( *) { eval $ac_var=; unset $ac_var;} ;; esac ;; esac done (set) 2>&1 | case $as_nl`(ac_space=' '; set) 2>&1` in #( *${as_nl}ac_space=\ *) # `set' does not quote correctly, so add quotes: double-quote # substitution turns \\\\ into \\, and sed turns \\ into \. sed -n \ "s/'/'\\\\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\\2'/p" ;; #( *) # `set' quotes correctly as required by POSIX, so do not add quotes. sed -n "/^[_$as_cr_alnum]*_cv_[_$as_cr_alnum]*=/p" ;; esac | sort ) | sed ' /^ac_cv_env_/b end t clear :clear s/^\([^=]*\)=\(.*[{}].*\)$/test "${\1+set}" = set || &/ t end s/^\([^=]*\)=\(.*\)$/\1=${\1=\2}/ :end' >>confcache if diff "$cache_file" confcache >/dev/null 2>&1; then :; else if test -w "$cache_file"; then if test "x$cache_file" != "x/dev/null"; then { $as_echo "$as_me:${as_lineno-$LINENO}: updating cache $cache_file" >&5 $as_echo "$as_me: updating cache $cache_file" >&6;} if test ! 
-f "$cache_file" || test -h "$cache_file"; then cat confcache >"$cache_file" else case $cache_file in #( */* | ?:*) mv -f confcache "$cache_file"$$ && mv -f "$cache_file"$$ "$cache_file" ;; #( *) mv -f confcache "$cache_file" ;; esac fi fi else { $as_echo "$as_me:${as_lineno-$LINENO}: not updating unwritable cache $cache_file" >&5 $as_echo "$as_me: not updating unwritable cache $cache_file" >&6;} fi fi rm -f confcache test "x$prefix" = xNONE && prefix=$ac_default_prefix # Let make expand exec_prefix. test "x$exec_prefix" = xNONE && exec_prefix='${prefix}' # Transform confdefs.h into DEFS. # Protect against shell expansion while executing Makefile rules. # Protect against Makefile macro expansion. # # If the first sed substitution is executed (which looks for macros that # take arguments), then branch to the quote section. Otherwise, # look for a macro that doesn't take arguments. ac_script=' :mline /\\$/{ N s,\\\n,, b mline } t clear :clear s/^[ ]*#[ ]*define[ ][ ]*\([^ (][^ (]*([^)]*)\)[ ]*\(.*\)/-D\1=\2/g t quote s/^[ ]*#[ ]*define[ ][ ]*\([^ ][^ ]*\)[ ]*\(.*\)/-D\1=\2/g t quote b any :quote s/[ `~#$^&*(){}\\|;'\''"<>?]/\\&/g s/\[/\\&/g s/\]/\\&/g s/\$/$$/g H :any ${ g s/^\n// s/\n/ /g p } ' DEFS=`sed -n "$ac_script" confdefs.h` ac_libobjs= ac_ltlibobjs= U= for ac_i in : $LIBOBJS; do test "x$ac_i" = x: && continue # 1. Remove the extension, and $U if already installed. ac_script='s/\$U\././;s/\.o$//;s/\.obj$//' ac_i=`$as_echo "$ac_i" | sed "$ac_script"` # 2. Prepend LIBOBJDIR. When used with automake>=1.10 LIBOBJDIR # will be set to the directory where LIBOBJS objects are built. 
as_fn_append ac_libobjs " \${LIBOBJDIR}$ac_i\$U.$ac_objext" as_fn_append ac_ltlibobjs " \${LIBOBJDIR}$ac_i"'$U.lo' done LIBOBJS=$ac_libobjs LTLIBOBJS=$ac_ltlibobjs : "${CONFIG_STATUS=./config.status}" ac_write_fail=0 ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files $CONFIG_STATUS" { $as_echo "$as_me:${as_lineno-$LINENO}: creating $CONFIG_STATUS" >&5 $as_echo "$as_me: creating $CONFIG_STATUS" >&6;} as_write_fail=0 cat >$CONFIG_STATUS <<_ASEOF || as_write_fail=1 #! $SHELL # Generated by $as_me. # Run this file to recreate the current configuration. # Compiler output produced by configure, useful for debugging # configure, is in config.log if it exists. debug=false ac_cs_recheck=false ac_cs_silent=false SHELL=\${CONFIG_SHELL-$SHELL} export SHELL _ASEOF cat >>$CONFIG_STATUS <<\_ASEOF || as_write_fail=1 ## -------------------- ## ## M4sh Initialization. ## ## -------------------- ## # Be more Bourne compatible DUALCASE=1; export DUALCASE # for MKS sh if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi as_nl=' ' export as_nl # Printing a long string crashes Solaris 7 /usr/bin/printf. as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\' as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo # Prefer a ksh shell builtin over an external printf program on Solaris, # but without wasting forks for bash or zsh. 
if test -z "$BASH_VERSION$ZSH_VERSION" \ && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='print -r --' as_echo_n='print -rn --' elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='printf %s\n' as_echo_n='printf %s' else if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"' as_echo_n='/usr/ucb/echo -n' else as_echo_body='eval expr "X$1" : "X\\(.*\\)"' as_echo_n_body='eval arg=$1; case $arg in #( *"$as_nl"*) expr "X$arg" : "X\\(.*\\)$as_nl"; arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;; esac; expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl" ' export as_echo_n_body as_echo_n='sh -c $as_echo_n_body as_echo' fi export as_echo_body as_echo='sh -c $as_echo_body as_echo' fi # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then PATH_SEPARATOR=: (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 || PATH_SEPARATOR=';' } fi # IFS # We need space, tab and new line, in precisely that order. Quoting is # there to prevent editors from complaining about space-tab. # (If _AS_PATH_WALK were called with IFS unset, it would disable word # splitting by setting IFS to empty value.) IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. as_myself= case $0 in #(( *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done IFS=$as_save_IFS ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then $as_echo "$as_myself: error: cannot find myself; rerun with an absolute file name" >&2 exit 1 fi # Unset variables that we do not need and which cause bugs (e.g. 
in # pre-3.0 UWIN ksh). But do not cause bugs in bash 2.01; the "|| exit 1" # suppresses any "Segmentation fault" message there. '((' could # trigger a bug in pdksh 5.2.14. for as_var in BASH_ENV ENV MAIL MAILPATH do eval test x\${$as_var+set} = xset \ && ( (unset $as_var) || exit 1) >/dev/null 2>&1 && unset $as_var || : done PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. LC_ALL=C export LC_ALL LANGUAGE=C export LANGUAGE # CDPATH. (unset CDPATH) >/dev/null 2>&1 && unset CDPATH # as_fn_error STATUS ERROR [LINENO LOG_FD] # ---------------------------------------- # Output "`basename $0`: error: ERROR" to stderr. If LINENO and LOG_FD are # provided, also output the error to LOG_FD, referencing LINENO. Then exit the # script with STATUS, using 1 if that was 0. as_fn_error () { as_status=$1; test $as_status -eq 0 && as_status=1 if test "$4"; then as_lineno=${as_lineno-"$3"} as_lineno_stack=as_lineno_stack=$as_lineno_stack $as_echo "$as_me:${as_lineno-$LINENO}: error: $2" >&$4 fi $as_echo "$as_me: error: $2" >&2 as_fn_exit $as_status } # as_fn_error # as_fn_set_status STATUS # ----------------------- # Set $? to STATUS, without forking. as_fn_set_status () { return $1 } # as_fn_set_status # as_fn_exit STATUS # ----------------- # Exit the shell with STATUS, even in a "trap 0" or "set -e" context. as_fn_exit () { set +e as_fn_set_status $1 exit $1 } # as_fn_exit # as_fn_unset VAR # --------------- # Portably unset VAR. as_fn_unset () { { eval $1=; unset $1;} } as_unset=as_fn_unset # as_fn_append VAR VALUE # ---------------------- # Append the text in VALUE to the end of the definition contained in VAR. Take # advantage of any shell optimizations that allow amortized linear growth over # repeated appends, instead of the typical quadratic growth present in naive # implementations. if (eval "as_var=1; as_var+=2; test x\$as_var = x12") 2>/dev/null; then : eval 'as_fn_append () { eval $1+=\$2 }' else as_fn_append () { eval $1=\$$1\$2 } fi # as_fn_append # as_fn_arith ARG... 
# ------------------ # Perform arithmetic evaluation on the ARGs, and store the result in the # global $as_val. Take advantage of shells that can avoid forks. The arguments # must be portable across $(()) and expr. if (eval "test \$(( 1 + 1 )) = 2") 2>/dev/null; then : eval 'as_fn_arith () { as_val=$(( $* )) }' else as_fn_arith () { as_val=`expr "$@" || test $? -eq 1` } fi # as_fn_arith if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi if (basename -- /) >/dev/null 2>&1 && test "X`basename -- / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi if (as_dir=`dirname -- /` && test "X$as_dir" = X/) >/dev/null 2>&1; then as_dirname=dirname else as_dirname=false fi as_me=`$as_basename -- "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| . 2>/dev/null || $as_echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } /^X\/\(\/\/\)$/{ s//\1/ q } /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` # Avoid depending upon Character Ranges. as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits ECHO_C= ECHO_N= ECHO_T= case `echo -n x` in #((((( -n*) case `echo 'xy\c'` in *c*) ECHO_T=' ';; # ECHO_T is single tab character. xy) ECHO_C='\c';; *) echo `echo ksh88 bug on AIX 6.1` > /dev/null ECHO_T=' ';; esac;; *) ECHO_N='-n';; esac rm -f conf$$ conf$$.exe conf$$.file if test -d conf$$.dir; then rm -f conf$$.dir/conf$$.file else rm -f conf$$.dir mkdir conf$$.dir 2>/dev/null fi if (echo >conf$$.file) 2>/dev/null; then if ln -s conf$$.file conf$$ 2>/dev/null; then as_ln_s='ln -s' # ... but there are two gotchas: # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail. # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable. # In both cases, we have to default to `cp -p'. 
ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe || as_ln_s='cp -p' elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file rmdir conf$$.dir 2>/dev/null # as_fn_mkdir_p # ------------- # Create "$as_dir" as a directory, including parents if necessary. as_fn_mkdir_p () { case $as_dir in #( -*) as_dir=./$as_dir;; esac test -d "$as_dir" || eval $as_mkdir_p || { as_dirs= while :; do case $as_dir in #( *\'*) as_qdir=`$as_echo "$as_dir" | sed "s/'/'\\\\\\\\''/g"`;; #'( *) as_qdir=$as_dir;; esac as_dirs="'$as_qdir' $as_dirs" as_dir=`$as_dirname -- "$as_dir" || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` test -d "$as_dir" && break done test -z "$as_dirs" || eval "mkdir $as_dirs" } || test -d "$as_dir" || as_fn_error $? "cannot create directory $as_dir" } # as_fn_mkdir_p if mkdir -p . 2>/dev/null; then as_mkdir_p='mkdir -p "$as_dir"' else test -d ./-p && rmdir ./-p as_mkdir_p=false fi if test -x / >/dev/null 2>&1; then as_test_x='test -x' else if ls -dL / >/dev/null 2>&1; then as_ls_L_option=L else as_ls_L_option= fi as_test_x=' eval sh -c '\'' if test -d "$1"; then test -d "$1/."; else case $1 in #( -*)set "./$1";; esac; case `ls -ld'$as_ls_L_option' "$1" 2>/dev/null` in #(( ???[sx]*):;;*)false;;esac;fi '\'' sh ' fi as_executable_p=$as_test_x # Sed expression to map a string onto a valid CPP name. as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'" # Sed expression to map a string onto a valid variable name. 
as_tr_sh="eval sed 'y%*+%pp%;s%[^_$as_cr_alnum]%_%g'" exec 6>&1 ## ----------------------------------- ## ## Main body of $CONFIG_STATUS script. ## ## ----------------------------------- ## _ASEOF test $as_write_fail = 0 && chmod +x $CONFIG_STATUS || ac_write_fail=1 cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # Save the log message, to keep $0 and so on meaningful, and to # report actual input values of CONFIG_FILES etc. instead of their # values after options handling. ac_log=" This file was extended by Xvid $as_me 1.3.2, which was generated by GNU Autoconf 2.68. Invocation command line was CONFIG_FILES = $CONFIG_FILES CONFIG_HEADERS = $CONFIG_HEADERS CONFIG_LINKS = $CONFIG_LINKS CONFIG_COMMANDS = $CONFIG_COMMANDS $ $0 $@ on `(hostname || uname -n) 2>/dev/null | sed 1q` " _ACEOF case $ac_config_files in *" "*) set x $ac_config_files; shift; ac_config_files=$*;; esac cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 # Files that config.status was made for. config_files="$ac_config_files" _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 ac_cs_usage="\ \`$as_me' instantiates files and other configuration actions from templates according to the current configuration. Unless the files and actions are specified as TAGs, all are instantiated by default. Usage: $0 [OPTION]... [TAG]... -h, --help print this help, then exit -V, --version print version number and configuration settings, then exit --config print configuration, then exit -q, --quiet, --silent do not print progress messages -d, --debug don't remove temporary files --recheck update $as_me by reconfiguring in the same conditions --file=FILE[:TEMPLATE] instantiate the configuration file FILE Configuration files: $config_files Report bugs to ." 
_ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`" ac_cs_version="\\ Xvid config.status 1.3.2 configured by $0, generated by GNU Autoconf 2.68, with options \\"\$ac_cs_config\\" Copyright (C) 2010 Free Software Foundation, Inc. This config.status script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it." ac_pwd='$ac_pwd' srcdir='$srcdir' INSTALL='$INSTALL' test -n "\$AWK" || AWK=awk _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # The default lists apply if the user does not specify any file. ac_need_defaults=: while test $# != 0 do case $1 in --*=?*) ac_option=`expr "X$1" : 'X\([^=]*\)='` ac_optarg=`expr "X$1" : 'X[^=]*=\(.*\)'` ac_shift=: ;; --*=) ac_option=`expr "X$1" : 'X\([^=]*\)='` ac_optarg= ac_shift=: ;; *) ac_option=$1 ac_optarg=$2 ac_shift=shift ;; esac case $ac_option in # Handling of the options. -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) ac_cs_recheck=: ;; --version | --versio | --versi | --vers | --ver | --ve | --v | -V ) $as_echo "$ac_cs_version"; exit ;; --config | --confi | --conf | --con | --co | --c ) $as_echo "$ac_cs_config"; exit ;; --debug | --debu | --deb | --de | --d | -d ) debug=: ;; --file | --fil | --fi | --f ) $ac_shift case $ac_optarg in *\'*) ac_optarg=`$as_echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"` ;; '') as_fn_error $? "missing file argument" ;; esac as_fn_append CONFIG_FILES " '$ac_optarg'" ac_need_defaults=false;; --he | --h | --help | --hel | -h ) $as_echo "$ac_cs_usage"; exit ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil | --si | --s) ac_cs_silent=: ;; # This is an error. -*) as_fn_error $? "unrecognized option: \`$1' Try \`$0 --help' for more information." 
;; *) as_fn_append ac_config_targets " $1" ac_need_defaults=false ;; esac shift done ac_configure_extra_args= if $ac_cs_silent; then exec 6>/dev/null ac_configure_extra_args="$ac_configure_extra_args --silent" fi _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 if \$ac_cs_recheck; then set X '$SHELL' '$0' $ac_configure_args \$ac_configure_extra_args --no-create --no-recursion shift \$as_echo "running CONFIG_SHELL=$SHELL \$*" >&6 CONFIG_SHELL='$SHELL' export CONFIG_SHELL exec "\$@" fi _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 exec 5>>config.log { echo sed 'h;s/./-/g;s/^.../## /;s/...$/ ##/;p;x;p;x' <<_ASBOX ## Running $as_me. ## _ASBOX $as_echo "$ac_log" } >&5 _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # Handling of arguments. for ac_config_target in $ac_config_targets do case $ac_config_target in "platform.inc") CONFIG_FILES="$CONFIG_FILES platform.inc" ;; *) as_fn_error $? "invalid argument: \`$ac_config_target'" "$LINENO" 5;; esac done # If the user did not use the arguments to specify the items to instantiate, # then the envvar interface is used. Set only those that are not. # We use the long form for the default assignment because of an extremely # bizarre bug on SunOS 4.1.3. if $ac_need_defaults; then test "${CONFIG_FILES+set}" = set || CONFIG_FILES=$config_files fi # Have a temporary directory for convenience. Make it in the build tree # simply because there is no reason against having it here, and in addition, # creating and moving files from /tmp can sometimes cause problems. # Hook for its removal unless debugging. # Note that there is a small window in which the directory will not be cleaned: # after its creation but before its name has been assigned to `$tmp'. $debug || { tmp= ac_tmp= trap 'exit_status=$? : "${ac_tmp:=$tmp}" { test ! 
-d "$ac_tmp" || rm -fr "$ac_tmp"; } && exit $exit_status ' 0 trap 'as_fn_exit 1' 1 2 13 15 } # Create a (secure) tmp directory for tmp files. { tmp=`(umask 077 && mktemp -d "./confXXXXXX") 2>/dev/null` && test -d "$tmp" } || { tmp=./conf$$-$RANDOM (umask 077 && mkdir "$tmp") } || as_fn_error $? "cannot create a temporary directory in ." "$LINENO" 5 ac_tmp=$tmp # Set up the scripts for CONFIG_FILES section. # No need to generate them if there are no CONFIG_FILES. # This happens for instance with `./config.status config.h'. if test -n "$CONFIG_FILES"; then ac_cr=`echo X | tr X '\015'` # On cygwin, bash can eat \r inside `` if the user requested igncr. # But we know of no other shell where ac_cr would be empty at this # point, so we can use a bashism as a fallback. if test "x$ac_cr" = x; then eval ac_cr=\$\'\\r\' fi ac_cs_awk_cr=`$AWK 'BEGIN { print "a\rb" }' /dev/null` if test "$ac_cs_awk_cr" = "a${ac_cr}b"; then ac_cs_awk_cr='\\r' else ac_cs_awk_cr=$ac_cr fi echo 'BEGIN {' >"$ac_tmp/subs1.awk" && _ACEOF { echo "cat >conf$$subs.awk <<_ACEOF" && echo "$ac_subst_vars" | sed 's/.*/&!$&$ac_delim/' && echo "_ACEOF" } >conf$$subs.sh || as_fn_error $? "could not make $CONFIG_STATUS" "$LINENO" 5 ac_delim_num=`echo "$ac_subst_vars" | grep -c '^'` ac_delim='%!_!# ' for ac_last_try in false false false false false :; do . ./conf$$subs.sh || as_fn_error $? "could not make $CONFIG_STATUS" "$LINENO" 5 ac_delim_n=`sed -n "s/.*$ac_delim\$/X/p" conf$$subs.awk | grep -c X` if test $ac_delim_n = $ac_delim_num; then break elif $ac_last_try; then as_fn_error $? "could not make $CONFIG_STATUS" "$LINENO" 5 else ac_delim="$ac_delim!$ac_delim _$ac_delim!! 
" fi done rm -f conf$$subs.sh cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 cat >>"\$ac_tmp/subs1.awk" <<\\_ACAWK && _ACEOF sed -n ' h s/^/S["/; s/!.*/"]=/ p g s/^[^!]*!// :repl t repl s/'"$ac_delim"'$// t delim :nl h s/\(.\{148\}\)..*/\1/ t more1 s/["\\]/\\&/g; s/^/"/; s/$/\\n"\\/ p n b repl :more1 s/["\\]/\\&/g; s/^/"/; s/$/"\\/ p g s/.\{148\}// t nl :delim h s/\(.\{148\}\)..*/\1/ t more2 s/["\\]/\\&/g; s/^/"/; s/$/"/ p b :more2 s/["\\]/\\&/g; s/^/"/; s/$/"\\/ p g s/.\{148\}// t delim ' >$CONFIG_STATUS || ac_write_fail=1 rm -f conf$$subs.awk cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACAWK cat >>"\$ac_tmp/subs1.awk" <<_ACAWK && for (key in S) S_is_set[key] = 1 FS = "" } { line = $ 0 nfields = split(line, field, "@") substed = 0 len = length(field[1]) for (i = 2; i < nfields; i++) { key = field[i] keylen = length(key) if (S_is_set[key]) { value = S[key] line = substr(line, 1, len) "" value "" substr(line, len + keylen + 3) len += length(value) + length(field[++i]) substed = 1 } else len += 1 + keylen } print line } _ACAWK _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 if sed "s/$ac_cr//" < /dev/null > /dev/null 2>&1; then sed "s/$ac_cr\$//; s/$ac_cr/$ac_cs_awk_cr/g" else cat fi < "$ac_tmp/subs1.awk" > "$ac_tmp/subs.awk" \ || as_fn_error $? "could not setup config files machinery" "$LINENO" 5 _ACEOF # VPATH may cause trouble with some makes, so we remove sole $(srcdir), # ${srcdir} and @srcdir@ entries from VPATH if srcdir is ".", strip leading and # trailing colons and then remove the whole line if VPATH becomes empty # (actually we leave an empty line to preserve line numbers). 
if test "x$srcdir" = x.; then ac_vpsub='/^[ ]*VPATH[ ]*=[ ]*/{ h s/// s/^/:/ s/[ ]*$/:/ s/:\$(srcdir):/:/g s/:\${srcdir}:/:/g s/:@srcdir@:/:/g s/^:*// s/:*$// x s/\(=[ ]*\).*/\1/ G s/\n// s/^[^=]*=[ ]*$// }' fi cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 fi # test -n "$CONFIG_FILES" eval set X " :F $CONFIG_FILES " shift for ac_tag do case $ac_tag in :[FHLC]) ac_mode=$ac_tag; continue;; esac case $ac_mode$ac_tag in :[FHL]*:*);; :L* | :C*:*) as_fn_error $? "invalid tag \`$ac_tag'" "$LINENO" 5;; :[FH]-) ac_tag=-:-;; :[FH]*) ac_tag=$ac_tag:$ac_tag.in;; esac ac_save_IFS=$IFS IFS=: set x $ac_tag IFS=$ac_save_IFS shift ac_file=$1 shift case $ac_mode in :L) ac_source=$1;; :[FH]) ac_file_inputs= for ac_f do case $ac_f in -) ac_f="$ac_tmp/stdin";; *) # Look for the file first in the build tree, then in the source tree # (if the path is not absolute). The absolute path cannot be DOS-style, # because $ac_f cannot contain `:'. test -f "$ac_f" || case $ac_f in [\\/$]*) false;; *) test -f "$srcdir/$ac_f" && ac_f="$srcdir/$ac_f";; esac || as_fn_error 1 "cannot find input file: \`$ac_f'" "$LINENO" 5;; esac case $ac_f in *\'*) ac_f=`$as_echo "$ac_f" | sed "s/'/'\\\\\\\\''/g"`;; esac as_fn_append ac_file_inputs " '$ac_f'" done # Let's still pretend it is `configure' which instantiates (i.e., don't # use $as_me), people would be surprised to read: # /* config.h. Generated by config.status. */ configure_input='Generated from '` $as_echo "$*" | sed 's|^[^:]*/||;s|:[^:]*/|, |g' `' by configure.' if test x"$ac_file" != x-; then configure_input="$ac_file. $configure_input" { $as_echo "$as_me:${as_lineno-$LINENO}: creating $ac_file" >&5 $as_echo "$as_me: creating $ac_file" >&6;} fi # Neutralize special characters interpreted by sed in replacement strings. 
case $configure_input in #( *\&* | *\|* | *\\* ) ac_sed_conf_input=`$as_echo "$configure_input" | sed 's/[\\\\&|]/\\\\&/g'`;; #( *) ac_sed_conf_input=$configure_input;; esac case $ac_tag in *:-:* | *:-) cat >"$ac_tmp/stdin" \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 ;; esac ;; esac ac_dir=`$as_dirname -- "$ac_file" || $as_expr X"$ac_file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$ac_file" : 'X\(//\)[^/]' \| \ X"$ac_file" : 'X\(//\)$' \| \ X"$ac_file" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$ac_file" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` as_dir="$ac_dir"; as_fn_mkdir_p ac_builddir=. case "$ac_dir" in .) ac_dir_suffix= ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_dir_suffix=/`$as_echo "$ac_dir" | sed 's|^\.[\\/]||'` # A ".." for each directory in $ac_dir_suffix. ac_top_builddir_sub=`$as_echo "$ac_dir_suffix" | sed 's|/[^\\/]*|/..|g;s|/||'` case $ac_top_builddir_sub in "") ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_top_build_prefix=$ac_top_builddir_sub/ ;; esac ;; esac ac_abs_top_builddir=$ac_pwd ac_abs_builddir=$ac_pwd$ac_dir_suffix # for backward compatibility: ac_top_builddir=$ac_top_build_prefix case $srcdir in .) # We are building in place. ac_srcdir=. ac_top_srcdir=$ac_top_builddir_sub ac_abs_top_srcdir=$ac_pwd ;; [\\/]* | ?:[\\/]* ) # Absolute name. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ac_abs_top_srcdir=$srcdir ;; *) # Relative name. ac_srcdir=$ac_top_build_prefix$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_build_prefix$srcdir ac_abs_top_srcdir=$ac_pwd/$srcdir ;; esac ac_abs_srcdir=$ac_abs_top_srcdir$ac_dir_suffix case $ac_mode in :F) # # CONFIG_FILE # case $INSTALL in [\\/$]* | ?:[\\/]* ) ac_INSTALL=$INSTALL ;; *) ac_INSTALL=$ac_top_build_prefix$INSTALL ;; esac _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # If the template does not know about datarootdir, expand it. 
# FIXME: This hack should be removed a few years after 2.60. ac_datarootdir_hack=; ac_datarootdir_seen= ac_sed_dataroot=' /datarootdir/ { p q } /@datadir@/p /@docdir@/p /@infodir@/p /@localedir@/p /@mandir@/p' case `eval "sed -n \"\$ac_sed_dataroot\" $ac_file_inputs"` in *datarootdir*) ac_datarootdir_seen=yes;; *@datadir@*|*@docdir@*|*@infodir@*|*@localedir@*|*@mandir@*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $ac_file_inputs seems to ignore the --datarootdir setting" >&5 $as_echo "$as_me: WARNING: $ac_file_inputs seems to ignore the --datarootdir setting" >&2;} _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_datarootdir_hack=' s&@datadir@&$datadir&g s&@docdir@&$docdir&g s&@infodir@&$infodir&g s&@localedir@&$localedir&g s&@mandir@&$mandir&g s&\\\${datarootdir}&$datarootdir&g' ;; esac _ACEOF # Neutralize VPATH when `$srcdir' = `.'. # Shell code in configure.ac might set extrasub. # FIXME: do we really want to maintain this feature? cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_sed_extra="$ac_vpsub $extrasub _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 :t /@[a-zA-Z_][a-zA-Z_0-9]*@/!b s|@configure_input@|$ac_sed_conf_input|;t t s&@top_builddir@&$ac_top_builddir_sub&;t t s&@top_build_prefix@&$ac_top_build_prefix&;t t s&@srcdir@&$ac_srcdir&;t t s&@abs_srcdir@&$ac_abs_srcdir&;t t s&@top_srcdir@&$ac_top_srcdir&;t t s&@abs_top_srcdir@&$ac_abs_top_srcdir&;t t s&@builddir@&$ac_builddir&;t t s&@abs_builddir@&$ac_abs_builddir&;t t s&@abs_top_builddir@&$ac_abs_top_builddir&;t t s&@INSTALL@&$ac_INSTALL&;t t $ac_datarootdir_hack " eval sed \"\$ac_sed_extra\" "$ac_file_inputs" | $AWK -f "$ac_tmp/subs.awk" \ >$ac_tmp/out || as_fn_error $? 
"could not create $ac_file" "$LINENO" 5 test -z "$ac_datarootdir_hack$ac_datarootdir_seen" && { ac_out=`sed -n '/\${datarootdir}/p' "$ac_tmp/out"`; test -n "$ac_out"; } && { ac_out=`sed -n '/^[ ]*datarootdir[ ]*:*=/p' \ "$ac_tmp/out"`; test -z "$ac_out"; } && { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined" >&5 $as_echo "$as_me: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined" >&2;} rm -f "$ac_tmp/stdin" case $ac_file in -) cat "$ac_tmp/out" && rm -f "$ac_tmp/out";; *) rm -f "$ac_file" && mv "$ac_tmp/out" "$ac_file";; esac \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 ;; esac done # for ac_tag as_fn_exit 0 _ACEOF ac_clean_files=$ac_clean_files_save test $ac_write_fail = 0 || as_fn_error $? "write failure creating $CONFIG_STATUS" "$LINENO" 5 # configure is writing to config.log, and then calls config.status. # config.status does its own redirection, appending to config.log. # Unfortunately, on DOS this fails, as config.log is still kept open # by configure, so config.status won't be able to write to it; its # output is simply discarded. So we exec the FD to /dev/null, # effectively closing config.log, so it can be properly (re)opened and # appended to by config.status. When coming back to configure, we # need to make the FD available again. if test "$no_create" != yes; then ac_cs_success=: ac_config_status_args= test "$silent" = yes && ac_config_status_args="$ac_config_status_args --quiet" exec 5>/dev/null $SHELL $CONFIG_STATUS $ac_config_status_args || ac_cs_success=false exec 5>>config.log # Use ||, not &&, to avoid exiting from the if with $? = 1, which # would make configure fail if this is the last instruction. 
  $ac_cs_success || as_fn_exit 1
fi
if test -n "$ac_unrecognized_opts" && test "$enable_option_checking" != no; then
  { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: unrecognized options: $ac_unrecognized_opts" >&5
$as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2;}
fi

xvidcore/doc/0000775000076500007650000000000011566427762014250 5ustar xvidbuildxvidbuild
xvidcore/doc/INSTALL0000664000076500007650000001665211506410657015300 0ustar xvidbuildxvidbuild
INSTALL
=======

Table of contents:
==================

1/ Generic install procedure for Unix based systems
    1.a/ Requirements.
    1.b/ How to build from a release tarball.
    1.c/ How to build from CVS.
    1.d/ Cross compiling xvidcore.
    1.e/ What is the meaning of the xvidcore Makefile output.
    1.f/ Building a Debian package.
2/ Generic install procedure for Win32/MSVC.
    2.a/ Requirements.
    2.b/ How to build the VFW frontend from a release tarball.
    2.c/ How to build from CVS.


1/ Generic install procedure for Unix based systems
===================================================

This build process works for most common Unix based systems, including
GNU/Linux, (Free|Open|Net)BSD, Solaris and faked unix environments like
cygwin and minsys on Win32 platforms.

1.a/ Requirements
-----------------

 - ANSI C compiler (gcc)
 - make (GNU make, BSD make, Solaris make)
 - a C library providing ANSI C functions like malloc/free/realloc and
   some other standard functions.
 - yasm or nasm on ia32/x86_64 platforms for MMX/SSE optimized code.

1.b/ How to build from a release tarball
----------------------------------------

Get the latest version from http://www.xvid.org/, and uncompress it on
your disk. Let's name the resulting source directory ${xvidcore}.

The next step allows you to configure the xvid sources.

  # cd ${xvidcore}/build/generic
  # ./configure

Some building options can be tuned thanks to the ./configure tool. You
can use your own CC and CFLAGS variables in order to override xvid's
default ones.
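
As a minimal sketch of such an override, the variables can be set in the
environment of the ./configure run. The compiler name and flags below are
placeholder assumptions, not values recommended by this document, and the
command is only printed (a dry run) so that it can be checked before use.

```shell
# Placeholder values (assumptions); adjust them to your own toolchain.
my_cc=gcc
my_cflags="-O2 -g"

# Dry run: build the command line and print it instead of executing it.
configure_cmd="CC=$my_cc CFLAGS='$my_cflags' ./configure"
echo "$configure_cmd"
```

Running the printed command from ${xvidcore}/build/generic applies the
override for that configure run only.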

To have a list of known options:

  # ./configure --help

Now xvidcore is configured according to your specific platform. You can
still edit the platform.inc file by hand in order to add or remove
specific flags that ./configure may have set incorrectly.

It is time to build xvidcore:

  # make

That creates a =build directory where all object files go, and where
the build targets are linked.

If no error was reported by the build process, then you can install it
on your system:

  # make install

This copies the shared and static libraries to the prefix location
passed to the ./configure tool (/usr/local by default). The xvid.h
include file is also copied during the "make install" run.

Voila, xvidcore is installed on your system. Make sure your runtime
linker knows about the xvidcore prefix lib dir where it is installed,
and also make sure that it generates a symlink to its SONAME. In case
it does not take care of the symlink itself:

  # cd ${prefix}/lib
  # ls libxvidcore.so.*
   (ls should list at least one libxvidcore.so.MAJOR.MINOR file)
  # ln -s libxvidcore.so.MAJOR.MINOR libxvidcore.so.MAJOR

You may also add a .so link to .so.MAJOR, so that applications linked
against .so are in fact linked to .so.MAJOR. This ensures better binary
compatibility, as we take care not to change the MAJOR number until
there is an incompatible ABI change.

  # ln -s libxvidcore.so.MAJOR libxvidcore.so

1.c/ How to build from CVS
--------------------------

In order to build from CVS, you need some more requirements compared to
the release building process:
 - GNU autoconf >= 2.5
 - GNU automake (no specific version)
 - GNU libtool (no specific version)

Grab the CVS version you want to build.

  # cvs -d:pserver:anonymous@cvs.xvid.org:/xvid login
   (just type enter)
  # cvs -d:pserver:anonymous@cvs.xvid.org:/xvid co xvidcore
   (read the CVS documentation if you want details about checking out
    branches)

You now need to bootstrap the build files:

  # cd xvidcore/build/generic
  # ./bootstrap.sh

A configure script has been bootstrapped; you're now able to follow the
steps of section 1.b ("How to build from a release tarball").

1.d/ Cross compiling xvidcore.
------------------------------

The configure script allows an easy handling of cross compilation. You
just have to specify the host and build platform values.

e.g: building Win32 libxvidcore.dll from a GNU/Linux system

  # cd ${xvidcore}/build/generic
  # ./configure --host=i386-mingw32 --build=i386-pc-linux-gnu

And then build as usual.

As the example uses the Win32 target, we can even build the vfw
frontend. Additional requirements are:
 - Resource compiler (the Makefile uses the syntax of windres from the
   GNU CC suite, but you can easily modify the cmd line)
 - GNU make (other make programs may have problems with shell
   expansion)

So to cross compile the VFW frontend, you just need to override the
Makefile variables pointing to the compiler and the resource compiler.
These variables are CC and WINDRES.

  # cd ${xvidcore}/vfw/bin
  # make CC=i386-mingw32-gcc WINDRES=i386-mingw32-windres

1.e/ What is the meaning of xvidcore Makefile output.
-----------------------------------------------------

The makefile available in ${xvidcore}/build/generic is handwritten and
outputs uncommon building progress strings to the terminal. You may
want to understand their meaning. Here is a brief explanation.

 - A: a/dir/file.(asm|s)
   This is an assembling rule assembling 'a/dir/file.(asm|s)'
 - C: a/dir/file.c
   This is a compilation rule compiling 'a/dir/file.c'
 - Cl: Stuff
   This is a cleaning rule in action
 - D: Directory
   This is a rule creating 'Directory'
 - I: a/dir/file
   Installing 'file' in 'a/dir'
 - L: file
   Linking 'file'
 - W: file
   Compiling the Win32 resource 'file'

1.f/ Building a Debian package.
-------------------------------

Release tarballs contain a debian dir so that you can easily build a
debian package. Just execute the usual steps. They may be summed up as:

  # cd ${xvidcore}
  # dpkg-buildpackage -rfakeroot

If all went right, you're now able to install the package:

  # cd ..
  # dpkg -i libxvidcore...


2/ Generic install procedure for Win32/MSVC.
============================================

2.a/ Requirements.
------------------

 - MS Visual C++ 2005 or later
 - nasm installed as 'nasm' in the msvc binary search paths.

2.b/ How to build the VFW frontend from a release tarball.
----------------------------------------------------------

Download the latest source distribution from http://www.xvid.org/ and
uncompress it on your disk. Let's call this directory ${xvidcore}.

 - Make sure your ${xvidcore} directory name does not contain spaces.
   MS Visual C++ can have issues handling file names with whitespace.
 - Open the workspace libxvidcore.sln located in
   ${xvidcore}/build/win32.
 - Then choose the libxvidcore project as the Active project of the
   workspace.
 - Make sure the Active configuration is 'libxvidcore Win32 Release'
 - Build the project (F7)
 - Open the project vfw.vcproj file located in ${xvidcore}/vfw.
 - Make sure the Active configuration is 'vfw Win32 Release'
 - Build the project (F7)
 - Install the resulting VFW frontend using the xvid.inf file provided
   in ${xvidcore}/vfw/bin. Right click on the file, and then click
   'Install'

2.c/ How to build from CVS.
---------------------------

You first have to retrieve the sources from the Xvid CVS repository
using a tool like WinCVS. Then follow the normal steps explained in the
previous section.

NB: your CVS program may not convert text files to the cr/lf windows
    text format. In that case, opening project files in MSVC will
    result in some weird error messages from MSVC. To fix that, you
    have to convert all .dsp files to the cr/lf format. You can do
    that by opening the .dsp file in WordPad and saving it. It should
    now be in cr/lf format.

Last edited: $Date: 2010-12-28 16:34:55 $

xvidcore/doc/README0000664000076500007650000000077511506410657015126 0ustar xvidbuildxvidbuild
Documentation of Xvid
=====================

INSTALL
-------

This file describes the steps to build and install Xvid on supported
platforms.

NOTE FOR MacOS X PLATFORM
-------------------------

Many versions of yasm/nasm are known to be buggy for Mach-O output, and
nasm seems to be less reliable than yasm. Therefore, yasm is highly
recommended for Mac OS X targets.

  * yasm 1.0 or later is required to build xvidcore.
  * http://www.tortall.net/projects/yasm/

Last edited: $Date: 2010-12-28 16:34:55 $

xvidcore/examples/0000775000076500007650000000000011566427764015323 5ustar xvidbuildxvidbuild
xvidcore/examples/cactus.pgm.bz20000664000076500007650000023357107446630304020005 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, write to the Free Software
 *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
 *
 * $Id: xvid_bench.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

/*****************************************************************************
 *
 * 'Reference' output is at the end of file.
 *
 * compiles with something like:
 *   gcc -o xvid_bench xvid_bench.c -I../src/ -lxvidcore -lm
 *
 ****************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>	/* for memset */
#include <assert.h>

#ifndef WIN32
#include <sys/time.h>	/* for gettimeofday */
#else
#include <time.h>
#endif

#include "xvid.h"

// inner guts
#include "portab.h"
#include "dct/idct.h"
#include "dct/fdct.h"
#include "image/colorspace.h"
#include "image/interpolate8x8.h"
#include "utils/mem_transfer.h"
#include "quant/quant.h"
#include "motion/sad.h"
#include "utils/emms.h"
#include "utils/timer.h"
#include "quant/quant_matrix.c"
#include "bitstream/cbp.h"
#include "bitstream/bitstream.h"

#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

int speed_ref = 100;	/* on slow machines, decrease this value */
int verbose = 0;
unsigned int cpu_mask;

/*********************************************************************
 * misc
 *********************************************************************/

/* returns time in microseconds */
double gettime_usec()
{
#ifndef WIN32
	struct timeval tv;
	gettimeofday(&tv, 0);
	return tv.tv_sec*1.0e6 + tv.tv_usec;
#else
	clock_t clk;
	clk = clock();
	return clk * 1.0e6 / CLOCKS_PER_SEC;	/* clock() counts ticks; scale to microseconds */
#endif
}

/* returns the variance (mean(v*v) - mean(v)^2) of an 8x8 block */
double sqr_dev(uint8_t v[8*8])
{
	double sum=0.;
	double sum2=0.;
	int n;
	for (n=0;n<8*8;n++) {
		sum += v[n];
		sum2 += v[n]*v[n];
	}
	sum2 /= n;
	sum /= n;
	return sum2-sum*sum;
}

/*********************************************************************
 * cpu init
 *********************************************************************/

typedef struct {
	const char *name;
	unsigned int cpu;
} CPU;

CPU cpu_list[] = {
	{ "PLAINC ", 0 },
#if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64)
	{ "MMX    ", XVID_CPU_MMX },
	{ "MMXEXT ", XVID_CPU_MMXEXT | XVID_CPU_MMX },
	{ "SSE2   ", XVID_CPU_SSE2 | XVID_CPU_MMX },
	{ "SSE3   ", XVID_CPU_SSE3 | XVID_CPU_SSE2 | XVID_CPU_MMX },
	{ "SSE41  ", XVID_CPU_SSE41| XVID_CPU_SSE3 | XVID_CPU_SSE2 | XVID_CPU_MMX },
	{ "3DNOW  ", XVID_CPU_3DNOW },
	{ "3DNOWE ", XVID_CPU_3DNOW | XVID_CPU_3DNOWEXT },
#endif
#ifdef ARCH_IS_PPC
	{ "ALTIVEC", XVID_CPU_ALTIVEC },
#endif
#ifdef ARCH_IS_IA64
//	{ "IA64   ", XVID_CPU_IA64 },
#endif
//	{ "TSC    ", XVID_CPU_TSC },
	{ 0, 0 }
};

int init_cpu(CPU *cpu)
{
	xvid_gbl_info_t xinfo;

	/* Get the available CPU flags */
	memset(&xinfo, 0, sizeof(xinfo));
	xinfo.version = XVID_VERSION;
	xvid_global(NULL, XVID_GBL_INFO, &xinfo, NULL);

	/* Are we trying to test a subset of the host CPU features? */
	if ((xinfo.cpu_flags & cpu->cpu) == cpu->cpu) {
		int xerr;
		xvid_gbl_init_t xinit;
		memset(&xinit, 0, sizeof(xinit));
		xinit.cpu_flags = cpu->cpu | XVID_CPU_FORCE;
		xinit.version = XVID_VERSION;
		xerr = xvid_global(NULL, XVID_GBL_INIT, &xinit, NULL);
		if (xerr==XVID_ERR_FAIL) {
			/* libxvidcore failed to init */
			return 0;
		}
	} else {
		/* The host CPU doesn't support some required feature for this test */
		return 0;
	}
	return 1;
}

#define CRC32_REMAINDER 0xCBF43926
#define CRC32_INITIAL 0xffffffff

#define DO1(c, crc) ((crc) = crc32tab[((unsigned int)((crc)>>24) ^ (*c++)) & 0xff] ^ ((crc) << 8))
#define DO2(c, crc) DO1(c, crc); DO1(c, crc);
#define DO4(c, crc) DO2(c, crc); DO2(c, crc);
#define DO8(c, crc) DO4(c, crc); DO4(c, crc);

/******************************************************************************
 * Precomputed AAL5 CRC32 lookup table
 ******************************************************************************/

static unsigned long crc32tab[256] = {

	0x00000000L, 0x04C11DB7L, 0x09823B6EL, 0x0D4326D9L,
	0x130476DCL, 0x17C56B6BL, 0x1A864DB2L, 0x1E475005L,
	0x2608EDB8L, 0x22C9F00FL, 0x2F8AD6D6L, 0x2B4BCB61L,
	0x350C9B64L, 0x31CD86D3L, 0x3C8EA00AL, 0x384FBDBDL,
	0x4C11DB70L, 0x48D0C6C7L, 0x4593E01EL, 0x4152FDA9L,
	0x5F15ADACL, 0x5BD4B01BL, 0x569796C2L, 0x52568B75L,
	0x6A1936C8L, 0x6ED82B7FL, 0x639B0DA6L, 0x675A1011L,
	0x791D4014L, 0x7DDC5DA3L, 0x709F7B7AL, 0x745E66CDL,
	0x9823B6E0L, 0x9CE2AB57L, 0x91A18D8EL, 0x95609039L,
	0x8B27C03CL, 0x8FE6DD8BL, 0x82A5FB52L, 0x8664E6E5L,
	0xBE2B5B58L, 0xBAEA46EFL, 0xB7A96036L, 0xB3687D81L,
	0xAD2F2D84L, 0xA9EE3033L, 0xA4AD16EAL, 0xA06C0B5DL,
	0xD4326D90L, 0xD0F37027L, 0xDDB056FEL, 0xD9714B49L,
	0xC7361B4CL, 0xC3F706FBL, 0xCEB42022L, 0xCA753D95L,
	0xF23A8028L, 0xF6FB9D9FL, 0xFBB8BB46L, 0xFF79A6F1L,
	0xE13EF6F4L, 0xE5FFEB43L, 0xE8BCCD9AL, 0xEC7DD02DL,
	0x34867077L, 0x30476DC0L, 0x3D044B19L, 0x39C556AEL,
	0x278206ABL, 0x23431B1CL, 0x2E003DC5L, 0x2AC12072L,
	0x128E9DCFL, 0x164F8078L, 0x1B0CA6A1L, 0x1FCDBB16L,
	0x018AEB13L, 0x054BF6A4L, 0x0808D07DL, 0x0CC9CDCAL,
	0x7897AB07L, 0x7C56B6B0L, 0x71159069L, 0x75D48DDEL,
	0x6B93DDDBL, 0x6F52C06CL, 0x6211E6B5L, 0x66D0FB02L,
	0x5E9F46BFL, 0x5A5E5B08L, 0x571D7DD1L, 0x53DC6066L,
	0x4D9B3063L, 0x495A2DD4L, 0x44190B0DL, 0x40D816BAL,
	0xACA5C697L, 0xA864DB20L, 0xA527FDF9L, 0xA1E6E04EL,
	0xBFA1B04BL, 0xBB60ADFCL, 0xB6238B25L, 0xB2E29692L,
	0x8AAD2B2FL, 0x8E6C3698L, 0x832F1041L, 0x87EE0DF6L,
	0x99A95DF3L, 0x9D684044L, 0x902B669DL, 0x94EA7B2AL,
	0xE0B41DE7L, 0xE4750050L, 0xE9362689L, 0xEDF73B3EL,
	0xF3B06B3BL, 0xF771768CL, 0xFA325055L, 0xFEF34DE2L,
	0xC6BCF05FL, 0xC27DEDE8L, 0xCF3ECB31L, 0xCBFFD686L,
	0xD5B88683L, 0xD1799B34L, 0xDC3ABDEDL, 0xD8FBA05AL,
	0x690CE0EEL, 0x6DCDFD59L, 0x608EDB80L, 0x644FC637L,
	0x7A089632L, 0x7EC98B85L, 0x738AAD5CL, 0x774BB0EBL,
	0x4F040D56L, 0x4BC510E1L, 0x46863638L, 0x42472B8FL,
	0x5C007B8AL, 0x58C1663DL, 0x558240E4L, 0x51435D53L,
	0x251D3B9EL, 0x21DC2629L, 0x2C9F00F0L, 0x285E1D47L,
	0x36194D42L, 0x32D850F5L, 0x3F9B762CL, 0x3B5A6B9BL,
	0x0315D626L, 0x07D4CB91L, 0x0A97ED48L, 0x0E56F0FFL,
	0x1011A0FAL, 0x14D0BD4DL, 0x19939B94L, 0x1D528623L,
	0xF12F560EL, 0xF5EE4BB9L, 0xF8AD6D60L, 0xFC6C70D7L,
	0xE22B20D2L, 0xE6EA3D65L, 0xEBA91BBCL, 0xEF68060BL,
	0xD727BBB6L, 0xD3E6A601L, 0xDEA580D8L, 0xDA649D6FL,
	0xC423CD6AL, 0xC0E2D0DDL, 0xCDA1F604L, 0xC960EBB3L,
	0xBD3E8D7EL, 0xB9FF90C9L, 0xB4BCB610L, 0xB07DABA7L,
	0xAE3AFBA2L, 0xAAFBE615L, 0xA7B8C0CCL, 0xA379DD7BL,
	0x9B3660C6L, 0x9FF77D71L, 0x92B45BA8L, 0x9675461FL,
	0x8832161AL, 0x8CF30BADL, 0x81B02D74L, 0x857130C3L,
	0x5D8A9099L, 0x594B8D2EL, 0x5408ABF7L, 0x50C9B640L,
	0x4E8EE645L, 0x4A4FFBF2L, 0x470CDD2BL, 0x43CDC09CL,
	0x7B827D21L, 0x7F436096L, 0x7200464FL, 0x76C15BF8L,
	0x68860BFDL, 0x6C47164AL, 0x61043093L, 0x65C52D24L,
	0x119B4BE9L, 0x155A565EL, 0x18197087L, 0x1CD86D30L,
	0x029F3D35L, 0x065E2082L, 0x0B1D065BL, 0x0FDC1BECL,
	0x3793A651L, 0x3352BBE6L, 0x3E119D3FL, 0x3AD08088L,
	0x2497D08DL, 0x2056CD3AL, 0x2D15EBE3L, 0x29D4F654L,
	0xC5A92679L, 0xC1683BCEL, 0xCC2B1D17L, 0xC8EA00A0L,
	0xD6AD50A5L, 0xD26C4D12L, 0xDF2F6BCBL, 0xDBEE767CL,
	0xE3A1CBC1L, 0xE760D676L, 0xEA23F0AFL, 0xEEE2ED18L,
	0xF0A5BD1DL, 0xF464A0AAL, 0xF9278673L, 0xFDE69BC4L,
	0x89B8FD09L, 0x8D79E0BEL, 0x803AC667L, 0x84FBDBD0L,
	0x9ABC8BD5L, 0x9E7D9662L, 0x933EB0BBL, 0x97FFAD0CL,
	0xAFB010B1L, 0xAB710D06L, 0xA6322BDFL, 0xA2F33668L,
	0xBCB4666DL, 0xB8757BDAL, 0xB5365D03L, 0xB1F740B4L

};

uint32_t calc_crc(uint8_t *mem, int len, uint32_t crc)
{
	while( len >= 8) {
		DO8(mem, crc);
		len -= 8;
	}

	while( len ) {
		DO1(mem, crc);
		len--;
	}

	return crc;
}

void byte_swap(uint8_t *mem, int len, int element_size)
{
#ifdef ARCH_IS_BIG_ENDIAN
	int i;

	if(element_size == 1) {
		/* No need to swap */
	} else if(element_size == 2) {
		uint8_t temp[2];

		for(i=0; i < (len/2); i++ ) {
			temp[0] = mem[0];
			temp[1] = mem[1];
			mem[0] = temp[1];
			mem[1] = temp[0];

			mem += 2;
		}
	} else if(element_size == 4) {
		uint8_t temp[4];

		for(i=0; i < (len/4); i++ ) {
			temp[0] = mem[0];
			temp[1] = mem[1];
			temp[2] = mem[2];
			temp[3] = mem[3];
			mem[0] = temp[3];
			mem[1] = temp[2];
			mem[2] = temp[1];
			mem[3] = temp[0];

			mem += 4;
		}
	} else {
		printf("ERROR: byte_swap unsupported element_size(%u)\n", element_size);
	}
#endif
}

/*********************************************************************
 * test DCT
 *********************************************************************/

#define ABS(X) ((X)<0 ? -(X) : (X))

void test_dct()
{
	const int nb_tests = 300*speed_ref;
	int tst;
	CPU *cpu;
	int i;
	DECLARE_ALIGNED_MATRIX(iDst0, 8, 8, short, 16);
	DECLARE_ALIGNED_MATRIX(iDst, 8, 8, short, 16);
	DECLARE_ALIGNED_MATRIX(fDst, 8, 8, short, 16);
	double overhead;

	printf( "\n ===== test fdct/idct =====\n" );

	for(i=0; i<8*8; ++i) iDst0[i] = (i*7-i*i) & 0x7f;
	overhead = gettime_usec();
	for(tst=0; tstname!=0; ++cpu) {
		double t, PSNR, MSE;

		if (!init_cpu(cpu))
			continue;

		t = gettime_usec();
		emms();
		for(tst=0; tstname, t, PSNR, MSE,
			   (ABS(MSE)>=64)? "| ERROR" :"");
	}
}

/*********************************************************************
 * test SAD
 *********************************************************************/

void test_sad()
{
	const int nb_tests = 2000*speed_ref;
	int tst;
	CPU *cpu;
	int i;
	DECLARE_ALIGNED_MATRIX(Cur, 16, 16, uint8_t, 16);
	DECLARE_ALIGNED_MATRIX(Ref1, 16, 16, uint8_t, 16);
	DECLARE_ALIGNED_MATRIX(Ref2, 16, 16, uint8_t, 16);

	printf( "\n ====== test SAD ======\n" );
	for(i=0; i<16*16;++i) {
		Cur[i] = (i/5) ^ 0x05;
		Ref1[i] = (i + 0x0b) & 0xff;
		Ref2[i] = i ^ 0x76;
	}

	for(cpu = cpu_list; cpu->name!=0; ++cpu) {
		double t;
		uint32_t s;

		if (!init_cpu(cpu))
			continue;

		t = gettime_usec();
		emms();
		for(tst=0; tstname, t, s,
			   (s!=3776)?"| ERROR": "" );

		t = gettime_usec();
		emms();
		for(tst=0; tstname, t, s,
			   (s!=27214)?"| ERROR": "" );

		t = gettime_usec();
		emms();
		for(tst=0; tstname, t, s,
			   (s!=26274)?"| ERROR": "" );

		t = gettime_usec();
		emms();
		for(tst=0; tstname, t, s,
			   (s!=4002)?"| ERROR": "" );

		t = gettime_usec();
		emms();
		for(tst=0; tstname, t, s,
			   (s!=3344)?"| ERROR": "" );

		printf( " --- \n" );
	}
}

/*********************************************************************
 * test interpolation
 *********************************************************************/

#define ENTER \
	for(i=0; i<16*8; ++i) Dst[i] = 0; \
	t = gettime_usec(); \
	emms();

#define LEAVE \
	emms(); \
	t = (gettime_usec() - t) / nb_tests; \
	iCrc = calc_crc((uint8_t*)Dst, sizeof(Dst), CRC32_INITIAL)

#define TEST_MB(FUNC, R) \
	ENTER \
	for(tst=0; tstname!=0; ++cpu) {
		double t;
		int tst, i, iCrc;

		if (!init_cpu(cpu))
			continue;

		TEST_MB(interpolate8x8_halfpel_h, 0);
		printf("%s - interp- h-round0 %.3f usec crc32=0x%08x %s\n",
			   cpu->name, t, iCrc,
(iCrc!=0x423cdcc7)?"| ERROR": "" ); TEST_MB(interpolate8x8_halfpel_v, 1); printf("%s - round1 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0x42202efe)?"| ERROR": "" ); TEST_MB(interpolate8x8_halfpel_hv, 0); printf("%s - interp-hv-round0 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0xd198d387)?"| ERROR": "" ); TEST_MB(interpolate8x8_halfpel_hv, 1); printf("%s - round1 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0x9ecfd921)?"| ERROR": "" ); /* this is a new function, as of 06.06.2002 */ #if 0 TEST_MB2(interpolate8x8_avrg); printf("%s - interpolate8x8_c %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=8107)?"| ERROR": "" ); #endif /* New functions for field prediction by CK 1.10.2005 */ #pragma NEW8X4 TEST_MB(interpolate8x4_halfpel_h, 0); printf("%s - interpfield-h -round0 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0x9538d6df)?"| ERROR": "" ); TEST_MB(interpolate8x4_halfpel_h, 1); printf("%s - round1 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0xde5f1db4)?"| ERROR": "" ); TEST_MB(interpolate8x4_halfpel_v, 0); printf("%s - interpfield- v-round0 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0xea5a69ef)?"| ERROR": "" ); TEST_MB(interpolate8x4_halfpel_v, 1); printf("%s - round1 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0x4f10ec0f)?"| ERROR": "" ); TEST_MB(interpolate8x4_halfpel_hv, 0); printf("%s - interpfield-hv-round0 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0xf97ee367)?"| ERROR": "" ); TEST_MB(interpolate8x4_halfpel_hv, 1); printf("%s - round1 %.3f usec crc32=0x%08x %s\n", cpu->name, t, iCrc, (iCrc!=0xb6a9f581)?"| ERROR": "" ); /* End of 8x4 functions */ printf( " --- \n" ); } } #undef ENTER #undef LEAVE #undef TEST_MB #undef TEST_MB2 /********************************************************************* * test transfer *********************************************************************/ #define INIT_TRANSFER \ for(i=0; i<8*32; ++i) { \ Src8[i] = i; 
Src16[i] = i; \ Dst8[i] = 0; Dst16[i] = 0; \ Ref1[i] = i^0x27; \ Ref2[i] = i^0x51; \ } #define TEST_TRANSFER_BEGIN(DST) \ INIT_TRANSFER \ overhead = -gettime_usec(); \ for(tst=0; tstname!=0; ++cpu) { double t, overhead; int tst, s; if (!init_cpu(cpu)) continue; TEST_TRANSFER(transfer_8to16copy, Dst16, Src8); printf("%s - 8to16 %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0x115814bb)?"| ERROR": ""); TEST_TRANSFER(transfer_16to8copy, Dst8, Src16); printf( "%s - 16to8 %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0xee7ccbb4)?"| ERROR": ""); /* New functions for field prediction by CK 1.10.2005 */ #pragma NEW8X4 TEST_TRANSFER(transfer8x4_copy, Dst8, Src8); printf("%s - 8to4 %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0xbb9c3db5)?"| ERROR": ""); /* End of new functions */ TEST_TRANSFER(transfer8x8_copy, Dst8, Src8); printf("%s - 8to8 %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0xd37b3295)?"| ERROR": ""); TEST_TRANSFER(transfer_16to8add, Dst8, Src16); printf("%s - 16to8add %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0xdd817bf4)?"| ERROR": "" ); TEST_TRANSFER2(transfer_8to16sub, Dst16, Src8, Ref1); { int s1, s2; s1 = calc_crc((uint8_t*)Dst16, 8*32*sizeof(Dst16[0]), CRC32_INITIAL); s2 = calc_crc((uint8_t*)Src8, 8*32*sizeof(Src8[0]), CRC32_INITIAL); printf("%s - 8to16sub %.3f usec crc32(1)=0x%08x crc32(2)=0x%08x %s %s\n", cpu->name, t, s1, s2, (s1!=0xa1e07163)?"| ERROR1": "", (s2!=0xd86c5d23)?"| ERROR2": "" ); } TEST_TRANSFER3(transfer_8to16sub2, Dst16, Src8, Ref1, Ref2); printf("%s - 8to16sub2 %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0x99b6c4c7)?"| ERROR": "" ); printf( " --- \n" ); } } /********************************************************************* * test quantization *********************************************************************/ #define TEST_QUANT(FUNC, DST, SRC) \ t = gettime_usec(); \ for(s=CRC32_INITIAL,qm=1; qm<=255; ++qm) { \ for(i=0; i<8*8; ++i) Quant[i] = qm; \ set_inter_matrix( mpeg_quant_matrices, Quant ); 
\ emms(); \ for(q=1; q<=max_Q; ++q) { \ for(tst=0; tstname!=0; ++cpu) { double t, overhead; int32_t tst, q; uint32_t s; if (!init_cpu(cpu)) continue; // exhaustive tests to compare against the (ref) C-version TEST_INTRA(quant_h263_intra_c, quant_h263_intra, 2048); TEST_INTRA(dequant_h263_intra_c, dequant_h263_intra , 512 ); TEST_INTER(quant_h263_inter_c, quant_h263_inter , 2048); TEST_INTER(dequant_h263_inter_c, dequant_h263_inter , 512 ); overhead = -gettime_usec(); for(s=0,qm=1; qm<=255; ++qm) { for(i=0; i<8*8; ++i) Quant[i] = qm; set_inter_matrix(mpeg_quant_matrices, Quant ); for(q=1; q<=max_Q; ++q) for(i=0; i<64; ++i) s+=Dst[i]^i^qm; } overhead += gettime_usec(); TEST_QUANT2(quant_mpeg_intra, Dst, Src); printf("%s - quant_mpeg_intra %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0x3b999af6)? "| ERROR": ""); TEST_QUANT(quant_mpeg_inter, Dst, Src); printf("%s - quant_mpeg_inter %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0xf6de7757)?"| ERROR": ""); TEST_QUANT2(dequant_mpeg_intra, Dst, Src); printf("%s - dequant_mpeg_intra %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0x2def7bc7)?"| ERROR": ""); TEST_QUANT(dequant_mpeg_inter, Dst, Src); printf("%s - dequant_mpeg_inter %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0xd878c722)?"| ERROR": ""); TEST_QUANT2(quant_h263_intra, Dst, Src); printf("%s - quant_h263_intra %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0x2eba9d43)?"| ERROR": ""); TEST_QUANT(quant_h263_inter, Dst, Src); printf("%s - quant_h263_inter %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0xbd315a7e)?"| ERROR": ""); TEST_QUANT2(dequant_h263_intra, Dst, Src); printf("%s - dequant_h263_intra %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0x9841212a)?"| ERROR": ""); TEST_QUANT(dequant_h263_inter, Dst, Src); printf("%s - dequant_h263_inter %.3f usec crc32=0x%08x %s\n", cpu->name, t, s, (s!=0xe7df8fba)?"| ERROR": ""); printf( " --- \n" ); } } /********************************************************************* * test 
distortion operators *********************************************************************/ static void ieee_reseed(long s); static long ieee_rand(int Min, int Max); #define TEST_SSE(FUNCTION, SRC1, SRC2, STRIDE) \ do { \ t = gettime_usec(); \ tst = nb_tests; \ while((tst--)>0) sse = (FUNCTION)((SRC1), (SRC2), (STRIDE)); \ emms(); \ t = (gettime_usec() - t)/(double)nb_tests; \ } while(0) void test_sse() { const int nb_tests = 100000*speed_ref; int i; CPU *cpu; DECLARE_ALIGNED_MATRIX(Src1, 8, 8, int16_t, 16); DECLARE_ALIGNED_MATRIX(Src2, 8, 8, int16_t, 16); DECLARE_ALIGNED_MATRIX(Src3, 8, 8, int16_t, 16); DECLARE_ALIGNED_MATRIX(Src4, 8, 8, int16_t, 16); printf( "\n ===== test sse =====\n" ); ieee_reseed(1); for(i=0; i<64; ++i) { Src1[i] = ieee_rand(-2048, 2047); Src2[i] = ieee_rand(-2048, 2047); Src3[i] = ieee_rand(-2048, 2047); Src4[i] = ieee_rand(-2048, 2047); } for(cpu = cpu_list; cpu->name!=0; ++cpu) { double t; int tst, sse; if (!init_cpu(cpu)) continue; /* 16 bit element blocks */ TEST_SSE(sse8_16bit, Src1, Src2, 16); printf("%s - sse8_16bit#1 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=182013834)?"| ERROR": ""); TEST_SSE(sse8_16bit, Src1, Src3, 16); printf("%s - sse8_16bit#2 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=142545203)?"| ERROR": ""); TEST_SSE(sse8_16bit, Src1, Src4, 16); printf("%s - sse8_16bit#3 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=146340935)?"| ERROR": ""); TEST_SSE(sse8_16bit, Src2, Src3, 16); printf("%s - sse8_16bit#4 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=130136661)?"| ERROR": ""); TEST_SSE(sse8_16bit, Src2, Src4, 16); printf("%s - sse8_16bit#5 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=136870353)?"| ERROR": ""); TEST_SSE(sse8_16bit, Src3, Src4, 16); printf("%s - sse8_16bit#6 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=164107772)?"| ERROR": ""); /* 8 bit element blocks */ TEST_SSE(sse8_8bit, (int8_t*)Src1, (int8_t*)Src2, 8); printf("%s - sse8_8bit#1 %.3f usec sse=%d %s\n", cpu->name, t, sse, 
(sse!=1356423)?"| ERROR": ""); TEST_SSE(sse8_8bit, (int8_t*)Src1, (int8_t*)Src3, 8); printf("%s - sse8_8bit#2 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=1173074)?"| ERROR": ""); TEST_SSE(sse8_8bit, (int8_t*)Src1, (int8_t*)Src4, 8); printf("%s - sse8_8bit#3 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=1092357)?"| ERROR": ""); TEST_SSE(sse8_8bit, (int8_t*)Src2, (int8_t*)Src3, 8); printf("%s - sse8_8bit#4 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=1360239)?"| ERROR": ""); TEST_SSE(sse8_8bit, (int8_t*)Src2, (int8_t*)Src4, 8); printf("%s - sse8_8bit#5 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=1208414)?"| ERROR": ""); TEST_SSE(sse8_8bit, (int8_t*)Src3, (int8_t*)Src4, 8); printf("%s - sse8_8bit#6 %.3f usec sse=%d %s\n", cpu->name, t, sse, (sse!=1099285)?"| ERROR": ""); printf(" ---\n"); } } /********************************************************************* * test non-zero AC counting *********************************************************************/ #define TEST_CBP(FUNC, SRC, NB) \ t = gettime_usec(); \ emms(); \ for(tst=0; tst3*64); Src4[i] = (i==(3*64+2) || i==(5*64+9)); Src5[i] = ieee_rand(0,1) ? 
-1 : 1; /* +/- test */
  }

  for(cpu = cpu_list; cpu->name!=0; ++cpu)
  {
    double t;
    int tst, cbp;

    if (!init_cpu(cpu))
      continue;

    TEST_CBP(calc_cbp, Src1, nb_tests);
    printf("%s - calc_cbp#1 %.3f usec cbp=0x%02x %s\n",
           cpu->name, t, cbp, (cbp!=0x15)?"| ERROR": "");
    TEST_CBP(calc_cbp, Src2, nb_tests);
    printf("%s - calc_cbp#2 %.3f usec cbp=0x%02x %s\n",
           cpu->name, t, cbp, (cbp!=0x38)?"| ERROR": "");
    TEST_CBP(calc_cbp, Src3, nb_tests);
    printf("%s - calc_cbp#3 %.3f usec cbp=0x%02x %s\n",
           cpu->name, t, cbp, (cbp!=0x0f)?"| ERROR": "" );
    TEST_CBP(calc_cbp, Src4, nb_tests);
    printf("%s - calc_cbp#4 %.3f usec cbp=0x%02x %s\n",
           cpu->name, t, cbp, (cbp!=0x05)?"| ERROR": "" );
    TEST_CBP(calc_cbp, Src5, nb_tests);
    printf("%s - calc_cbp#4 %.3f usec cbp=0x%02x %s\n",
           cpu->name, t, cbp, (cbp!=0x3f)?"| ERROR": "" );
    printf( " --- \n" );
  }

  for(cpu = cpu_list; cpu->name!=0; ++cpu) /* bench suggested by Carlo (carlo dot bramix at libero dot it) */
  {
    double t;
    int tst, cbp, err;

    if (!init_cpu(cpu))
      continue;

    err = 0;
    for(n=0; n<6; ++n)
    {
      for(m=0; m<64; ++m)
      {
        for(i=0; i<6*64; ++i)
          Src1[i] = (i== (m + n*64));

        TEST_CBP(calc_cbp, Src1, 1);
        if (cbp!= (((m!=0)<<(5-n))))
        {
          printf( "%s - calc_cbp#5: ERROR at pos %d / %d!\n", cpu->name, n, m);
          err = 1;
          break;
        }
      }
    }
    if (!err)
      printf( " %s - calc_cbp#5 : OK\n", cpu->name );
  }
}

/*********************************************************************
 * fdct/idct IEEE1180 compliance
 *********************************************************************/

typedef struct {
  long Errors[64];
  long Sqr_Errors[64];
  long Max_Errors[64];
  long Nb;
} STATS_8x8;

void init_stats(STATS_8x8 *S)
{
  int i;
  for(i=0; i<64; ++i) {
    S->Errors[i] = 0;
    S->Sqr_Errors[i] = 0;
    S->Max_Errors[i] = 0;
  }
  S->Nb = 0;
}

void store_stats(STATS_8x8 *S, short Blk[64], short Ref[64])
{
  int i;
  for(i=0; i<64; ++i)
  {
    short Err = Blk[i] - Ref[i];
    S->Errors[i] += Err;
    S->Sqr_Errors[i] += Err * Err;
    if (Err<0) Err = -Err;
    if (S->Max_Errors[i]<Err)
      S->Max_Errors[i] = Err;
  }
  S->Nb++;
}

void print_stats(STATS_8x8 *S)
{
  int i;
  double Norm;

  assert(S->Nb>0);
  Norm = 1. / (double)S->Nb;
  printf("\n== Max absolute values of errors ==\n");
  for(i=0; i<64; i++) {
    printf(" %4ld", S->Max_Errors[i]);
    if ((i&7)==7) printf("\n");
  }

  printf("\n== Mean square errors ==\n");
  for(i=0; i<64; i++)
  {
    double Err = Norm * (double)S->Sqr_Errors[i];
    printf(" %.3f", Err);
    if ((i&7)==7) printf("\n");
  }

  printf("\n== Mean errors ==\n");
  for(i=0; i<64; i++)
  {
    double Err = Norm * (double)S->Errors[i];
    printf(" %.3f", Err);
    if ((i&7)==7) printf("\n");
  }
  printf("\n");
}

static const char *CHECK(double v, double l) {
  if (fabs(v)<=l) return "ok";
  else return "FAIL!";
}

void report_stats(STATS_8x8 *S, const double *Limits)
{
  int i;
  double Norm, PE, PMSE, OMSE, PME, OME;

  assert(S->Nb>0);
  Norm = 1. / (double)S->Nb;
  PE = 0.;
  for(i=0; i<64; i++) {
    if (PE<S->Max_Errors[i])
      PE = S->Max_Errors[i];
  }

  PMSE = 0.;
  OMSE = 0.;
  for(i=0; i<64; i++)
  {
    double Err = Norm * (double)S->Sqr_Errors[i];
    OMSE += Err;
    if (PMSE < Err) PMSE = Err;
  }
  OMSE /= 64.;

  PME = 0.;
  OME = 0.;
  for(i=0; i<64; i++)
  {
    double Err = Norm * (double)S->Errors[i];
    OME += Err;
    Err = fabs(Err);
    if (PME < Err) PME = Err;
  }
  OME /= 64.;

  printf( "Peak error: %4.4f\n", PE );
  printf( "Peak MSE: %4.4f\n", PMSE );
  printf( "Overall MSE: %4.4f\n", OMSE );
  printf( "Peak ME: %4.4f\n", PME );
  printf( "Overall ME: %4.4f\n", OME );

  if (Limits!=0)
  {
    printf( "[PE<=%.4f %s] ", Limits[0], CHECK(PE, Limits[0]) );
    printf( "\n" );
    printf( "[PMSE<=%.4f %s]", Limits[1], CHECK(PMSE, Limits[1]) );
    printf( "[OMSE<=%.4f %s]", Limits[2], CHECK(OMSE, Limits[2]) );
    printf( "\n" );
    printf( "[PME<=%.4f %s] ", Limits[3], CHECK(PME , Limits[3]) );
    printf( "[OME<=%.4f %s] ", Limits[4], CHECK(OME , Limits[4]) );
    printf( "\n" );
  }
}

/* ////////////////////////////////////////////////////// */
/* Pseudo-random generator specified by IEEE 1180 */

static long ieee_seed = 1;
static void ieee_reseed(long s) {
  ieee_seed = s;
}
static long ieee_rand(int Min, int Max)
{
  static double z = (double) 0x7fffffff;
long i,j; double x; ieee_seed = (ieee_seed * 1103515245) + 12345; i = ieee_seed & 0x7ffffffe; x = ((double) i) / z; x *= (Max-Min+1); j = (long)x; j = j + Min; assert(j>=Min && j<=Max); return (short)j; } #define CLAMP(x, M) (x) = ((x)<-(M)) ? (-(M)) : ((x)>=(M) ? ((M)-1) : (x)) static double Cos[8][8]; static void init_ref_dct() { int i, j; for(i=0; i<8; i++) { double scale = (i == 0) ? sqrt(0.125) : 0.5; for (j=0; j<8; j++) Cos[i][j] = scale*cos( (M_PI/8.0)*i*(j + 0.5) ); } } void ref_idct(short *M) { int i, j, k; double Tmp[8][8]; for(i=0; i<8; i++) { for(j=0; j<8; j++) { double Sum = 0.0; for (k=0; k<8; k++) Sum += Cos[k][j]*M[8*i+k]; Tmp[i][j] = Sum; } } for(i=0; i<8; i++) { for(j=0; j<8; j++) { double Sum = 0.0; for (k=0; k<8; k++) Sum += Cos[k][i]*Tmp[k][j]; M[8*i+j] = (short)floor(Sum + .5); } } } void ref_fdct(short *M) { int i, j, k; double Tmp[8][8]; for(i=0; i<8; i++) { for(j=0; j<8; j++) { double Sum = 0.0; for (k=0; k<8; k++) Sum += Cos[j][k]*M[8*i+k]; Tmp[i][j] = Sum; } } for(i=0; i<8; i++) { for(j=0; j<8; j++) { double Sum = 0.0; for (k=0; k<8; k++) Sum += Cos[i][k]*Tmp[k][j]; M[8*i+j] = (short)floor(Sum + 0.5); } } } void test_IEEE1180_compliance(int Min, int Max, int Sign) { static const double ILimits[5] = { 1., 0.06, 0.02, 0.015, 0.0015 }; int Loops = 10000; int i, m, n; DECLARE_ALIGNED_MATRIX(Blk0, 8, 8, short, 16); /* reference */ DECLARE_ALIGNED_MATRIX(Blk, 8, 8, short, 16); DECLARE_ALIGNED_MATRIX(iBlk, 8, 8, short, 16); DECLARE_ALIGNED_MATRIX(Ref_FDCT, 8, 8, short, 16); DECLARE_ALIGNED_MATRIX(Ref_IDCT, 8, 8, short, 16); STATS_8x8 FStats; /* forward dct stats */ STATS_8x8 IStats; /* inverse dct stats */ CPU *cpu; init_ref_dct(); for(cpu = cpu_list; cpu->name!=0; ++cpu) { if (!init_cpu(cpu)) continue; printf( "\n===== IEEE test for %s ==== (Min=%d Max=%d Sign=%d Loops=%d)\n", cpu->name, Min, Max, Sign, Loops); init_stats(&IStats); init_stats(&FStats); ieee_reseed(1); for(n=0; nname!=0; ++cpu) { short Blk0[64], Blk[64]; STATS_8x8 Stats; if 
(!init_cpu(cpu)) continue; printf( "\n===== IEEE test for %s Min=%d Max=%d =====\n", cpu->name, Min, Max ); /* FDCT tests // */ init_stats(&Stats); /* test each computation channels separately */ for(i=0; i<64; i++) Blk[i] = Blk0[i] = ((i/8)==(i%8)) ? Max : 0; ref_fdct(Blk0); emms(); fdct(Blk); emms(); store_stats(&Stats, Blk, Blk0); for(i=0; i<64; i++) Blk[i] = Blk0[i] = ((i/8)==(i%8)) ? Min : 0; ref_fdct(Blk0); emms(); fdct(Blk); emms(); store_stats(&Stats, Blk, Blk0); /* randomly saturated inputs */ for(p=0; p=p)? Max : Min; ref_fdct(Blk0); emms(); fdct(Blk); emms(); store_stats(&Stats, Blk, Blk0); } } printf( "\n -- FDCT saturation report --\n" ); report_stats(&Stats, 0); /* IDCT tests // */ #if 0 /* no finished yet */ init_stats(&Stats); /* test each computation channel separately */ for(i=0; i<64; i++) Blk[i] = Blk0[i] = ((i/8)==(i%8)) ? IDCT_MAX : 0; ref_idct(Blk0); emms(); idct(Blk); emms(); for(i=0; i<64; i++) { CLAMP(Blk0[i], IDCT_OUT); CLAMP(Blk[i], IDCT_OUT); } store_stats(&Stats, Blk, Blk0); for(i=0; i<64; i++) Blk[i] = Blk0[i] = ((i/8)==(i%8)) ? IDCT_MIN : 0; ref_idct(Blk0); emms(); idct(Blk); emms(); for(i=0; i<64; i++) { CLAMP(Blk0[i], IDCT_OUT); CLAMP(Blk[i], IDCT_OUT); } store_stats(&Stats, Blk, Blk0); /* randomly saturated inputs */ for(p=0; p=p)? 
IDCT_MAX : IDCT_MIN;
        ref_idct(Blk0);
        emms(); idct(Blk); emms();
        for(i=0; i<64; i++) {
          CLAMP(Blk0[i],IDCT_OUT);
          CLAMP(Blk[i],IDCT_OUT);
        }
        store_stats(&Stats, Blk, Blk0);
      }
    }

    printf( "\n -- IDCT saturation report --\n" );
    print_stats(&Stats);
    report_stats(&Stats, 0);
#endif
  }
}

/*********************************************************************
 * measure raw decoding speed
 *********************************************************************/

void test_dec(const char *name, int width, int height, int ref_chksum)
{
  FILE *f = 0;
  void *dechandle = 0;
  int xerr;
  xvid_gbl_init_t xinit;
  xvid_dec_create_t xparam;
  xvid_dec_frame_t xframe;
  double t = 0.;
  int nb = 0;
  uint8_t *buf = 0;
  uint8_t *yuv_out = 0;
  int buf_size, pos;
  uint32_t chksum = 0;
  int bps = (width+31) & ~31;

  memset(&xinit, 0, sizeof(xinit));
  xinit.cpu_flags = cpu_mask;
  xinit.version = XVID_VERSION;
  xvid_global(NULL, 0, &xinit, NULL);

  memset(&xparam, 0, sizeof(xparam));
  xparam.width = width;
  xparam.height = height;
  xparam.version = XVID_VERSION;
  xerr = xvid_decore(NULL, XVID_DEC_CREATE, &xparam, NULL);
  if (xerr==XVID_ERR_FAIL) {
    printf("ERROR: can't init decoder (err=%d)\n", xerr);
    return;
  }
  dechandle = xparam.handle;

  f = fopen(name, "rb");
  if (f==0) {
    printf( "ERROR: can't open file '%s'\n", name);
    goto End; /* the decoder handle created above must still be destroyed */
  }
  fseek(f, 0, SEEK_END);
  buf_size = ftell(f);
  fseek(f, 0, SEEK_SET);
  if (buf_size<=0) {
    printf("ERROR: error while stating file\n");
    goto End;
  }

  buf = malloc(buf_size);
  yuv_out = calloc(1, bps*height*3/2 + 15);
  if (buf==0 || yuv_out==0) {
    printf( "ERROR: malloc failed!\n" );
    goto End;
  }

  if (fread(buf, buf_size, 1, f)!=1) {
    printf( "ERROR: file-read failed\n" );
    goto End;
  }

  nb = 0;
  pos = 0;
  t = -gettime_usec();
  while(1) {
    int y;

    memset(&xframe, 0, sizeof(xframe));
    xframe.version = XVID_VERSION;
    xframe.bitstream = buf + pos;
    xframe.length = buf_size - pos;
    xframe.output.plane[0] = (uint8_t*)(((size_t)yuv_out + 15) & ~15);
    xframe.output.plane[1] = (uint8_t*)xframe.output.plane[0] + bps*height;
xframe.output.plane[2] = (uint8_t*)xframe.output.plane[1] + bps/2; xframe.output.stride[0] = bps; xframe.output.stride[1] = bps; xframe.output.stride[2] = bps; xframe.output.csp = XVID_CSP_I420; xerr = xvid_decore(dechandle, XVID_DEC_DECODE, &xframe, 0); if (xerr<0) { printf("ERROR: decoding failed for frame #%d (err=%d)!\n", nb, xerr); break; } else if (xerr==0) break; else if (verbose>0) printf("#%d %d\n", nb, xerr ); pos += xerr; nb++; for(y=0; y0.) printf( "%d frames decoded in %.3f s -> %.1f FPS Checksum:0x%.8x\n", nb, t*1.e-6f, (float)(nb*1.e6f/t), chksum ); } else { printf("FPS:%.1f Checksum: 0x%.8x Expected:0x%.8x | %s\n", t>0. ? (float)(nb*1.e6f/t) : 0.f, chksum, ref_chksum, (chksum==ref_chksum) ? "OK" : "ERROR"); } End: if (yuv_out!=0) free(yuv_out); if (buf!=0) free(buf); if (dechandle!=0) { xerr= xvid_decore(dechandle, XVID_DEC_DESTROY, NULL, NULL); if (xerr==XVID_ERR_FAIL) printf("ERROR: destroy-decoder failed (err=%d)!\n", xerr); } if (f!=0) fclose(f); } /********************************************************************* * non-regression tests *********************************************************************/ void test_bugs1() { CPU *cpu; uint16_t mpeg_quant_matrices[64*8]; printf( "\n ===== (de)quant4_intra saturation bug? =====\n" ); for(cpu = cpu_list; cpu->name!=0; ++cpu) { int i; int16_t Src[8*8], Dst[8*8]; if (!init_cpu(cpu)) continue; for(i=0; i<64; ++i) Src[i] = i-32; set_intra_matrix( mpeg_quant_matrices, get_default_intra_matrix() ); dequant_mpeg_intra(Dst, Src, 31, 5, mpeg_quant_matrices); printf( "dequant_mpeg_intra with CPU=%s: ", cpu->name); printf( " Out[]= " ); for(i=0; i<64; ++i) printf( "[%d]", Dst[i]); printf( "\n" ); } printf( "\n ===== (de)quant4_inter saturation bug? 
=====\n" ); for(cpu = cpu_list; cpu->name!=0; ++cpu) { int i; int16_t Src[8*8], Dst[8*8]; if (!init_cpu(cpu)) continue; for(i=0; i<64; ++i) Src[i] = i-32; set_inter_matrix( mpeg_quant_matrices, get_default_inter_matrix() ); dequant_mpeg_inter(Dst, Src, 31, mpeg_quant_matrices); printf( "dequant_mpeg_inter with CPU=%s: ", cpu->name); printf( " Out[]= " ); for(i=0; i<64; ++i) printf( "[%d]", Dst[i]); printf( "\n" ); } } void test_dct_precision_diffs() { CPU *cpu; DECLARE_ALIGNED_MATRIX(Blk, 8, 8, int16_t, 16); DECLARE_ALIGNED_MATRIX(Blk0, 8, 8, int16_t, 16); printf( "\n ===== fdct/idct precision diffs =====\n" ); for(cpu = cpu_list; cpu->name!=0; ++cpu) { int i; if (!init_cpu(cpu)) continue; for(i=0; i<8*8; ++i) { Blk0[i] = (i*7-i*i) & 0x7f; Blk[i] = Blk0[i]; } fdct(Blk); idct(Blk); printf( " fdct+idct diffs with CPU=%s: \n", cpu->name ); for(i=0; i<8; ++i) { int j; for(j=0; j<8; ++j) printf( " %d ", Blk[i*8+j]-Blk0[i*8+j]); printf("\n"); } printf("\n"); } } void test_quant_bug() { const int max_Q = 31; int i, n, qm, q; CPU *cpu; DECLARE_ALIGNED_MATRIX(Src, 8, 8, int16_t, 16); DECLARE_ALIGNED_MATRIX(Dst, 8, 8, int16_t, 16); uint8_t Quant[8*8]; CPU cpu_bug_list[] = { { "PLAINC", 0 }, { "MMX ", XVID_CPU_MMX }, {0,0} }; uint16_t Crcs_Inter[2][32]; uint16_t Crcs_Intra[2][32]; DECLARE_ALIGNED_MATRIX(mpeg_quant_matrices, 8, 64, uint16_t, 16); printf( "\n ===== test MPEG4-quantize bug =====\n" ); for(i=0; i<64; ++i) Src[i] = 2048*(i-32)/32; #if 1 for(qm=1; qm<=255; ++qm) { for(i=0; i<8*8; ++i) Quant[i] = qm; set_inter_matrix( mpeg_quant_matrices, Quant ); for(n=0, cpu = cpu_bug_list; cpu->name!=0; ++cpu, ++n) { uint16_t s; if (!init_cpu(cpu)) continue; for(q=1; q<=max_Q; ++q) { emms(); quant_mpeg_inter( Dst, Src, q, mpeg_quant_matrices ); emms(); for(s=0, i=0; i<64; ++i) s+=((uint16_t)Dst[i])^i; Crcs_Inter[n][q] = s; } } for(q=1; q<=max_Q; ++q) for(i=0; i %d/%d !\n", qm, q, Crcs_Inter[i][q], Crcs_Inter[i+1][q]); } #endif #if 1 for(qm=1; qm<=255; ++qm) { for(i=0; i<8*8; ++i) 
Quant[i] = qm; set_intra_matrix( mpeg_quant_matrices, Quant ); for(n=0, cpu = cpu_bug_list; cpu->name!=0; ++cpu, ++n) { uint16_t s; if (!init_cpu(cpu)) continue; for(q=1; q<=max_Q; ++q) { emms(); quant_mpeg_intra( Dst, Src, q, q, mpeg_quant_matrices); emms(); for(s=0, i=0; i<64; ++i) s+=((uint16_t)Dst[i])^i; Crcs_Intra[n][q] = s; } } for(q=1; q<=max_Q; ++q) for(i=0; i %d/%d!\n", qm, q, Crcs_Inter[i][q], Crcs_Inter[i+1][q]); } #endif } /********************************************************************* * test some YUV func *********************************************************************/ #define ENTER \ for(i=0; i<(int)sizeof(Dst0); ++i) Dst0[0][i] = 0; \ t = gettime_usec(); \ emms(); #define LEAVE \ emms(); \ t = (gettime_usec() - t) / nb_tests; \ iCrc = calc_crc((uint8_t*)Dst0, sizeof(Dst0), CRC32_INITIAL) #define TEST_YUYV(FUNC, S, FLIP) \ ENTER \ for(tst=0; tst>= 1; n++; } return n; } static const uint8_t log2_tab_16[16] = { 0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4 }; static uint32_t __inline log2bin_v2(uint32_t value) { int n = 0; if (value & 0xffff0000) { value >>= 16; n += 16; } if (value & 0xff00) { value >>= 8; n += 8; } if (value & 0xf0) { value >>= 4; n += 4; } return n + log2_tab_16[value]; } void test_log2bin() { const int nb_tests = 3000*speed_ref; int n, crc1=0, crc2=0; uint32_t s, s0; double t1, t2; t1 = gettime_usec(); s0 = (int)(t1*31.241); for(s=s0, n=0; n 1) { if (*num % i == 0 && *den % i == 0) { *num /= i; *den /= i; i = *num; continue; } i--; } } static uint32_t gcd(int num, int den) { int tmp; while( (tmp=num%den) ) { num = den; den = tmp; } return den; } static void __inline new_gcd(int *num, int *den) { const int div = gcd(*num, *den); if (num) { *num /= div; *den /= div; } } void test_gcd() { const int nb_tests = 10*speed_ref; int i; uint32_t crc1=0, crc2=0; uint32_t n0, n, d0, d; double t1, t2; t1 = gettime_usec(); n0 = 0xfffff & (int)(t1*31.241); d0 = 0xfffff & (int)( ((n0*4123)%17) | 1 ); for(n=n0, d=d0, i=0; i>4)^d) + 
((crc1<<2)^n) ) & 0xffffff; n = d; d = (d*12363+31) & 0xffff; d |= !d; } t1 = (gettime_usec()-t1) / nb_tests; t2 = gettime_usec(); for(n=n0, d=d0, i=0; i>4)^d) + ((crc2<<2)^n) ) & 0xffffff; n = d; d = (d*12363+31) & 0xffff; d |= !d; } t2 = (gettime_usec() - t2) / nb_tests; printf( "old_gcd: %.3f sec crc=%d\n", t1, crc1 ); printf( "new_gcd: %.3f sec crc=%d\n", t2, crc2 ); if (crc1!=crc2) printf( " CRC ERROR !\n" ); } /********************************************************************* * test compiler *********************************************************************/ void test_compiler() { int nb_err = 0; int32_t v; if (sizeof(uint16_t)<2) { printf( "ERROR: sizeof(uint16_t)<2 !!\n" ); nb_err++; } if (sizeof(int16_t)<2) { printf( "ERROR: sizeof(int16_t)<2 !!\n" ); nb_err++; } if (sizeof(uint8_t)!=1) { printf( "ERROR: sizeof(uint8_t)!=1 !!\n" ); nb_err++; } if (sizeof(int8_t)!=1) { printf( "ERROR: sizeof(int8_t)!=1 !!\n" ); nb_err++; } if (sizeof(uint32_t)<4) { printf( "ERROR: sizeof(uint32_t)<4 !!\n" ); nb_err++; } if (sizeof(int32_t)<4) { printf( "ERROR: sizeof(int32_t)<4 !!\n" ); nb_err++; } /* yes, i know, this test is silly. But better be safe than sorry. :) */ for(v=1000; v>=0; v--) { if ( (v>>2) != v/4) nb_err++; } for(v=-1000; v!=-1; v++) { if ( (v>>2) != (v/4)-!!(v%4)) nb_err++; } if (nb_err!=0) { printf( "ERROR! 
please post your platform/compiler specs to xvid-devel@xvid.org !\n" ); } } /********************************************************************* * test SSIM functions *********************************************************************/ typedef int (*lumfunc)(uint8_t* ptr, int stride); typedef void (*csfunc)(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr); extern int lum_8x8_c(uint8_t* ptr, int stride); extern int lum_8x8_mmx(uint8_t* ptr, int stride); extern int lum_2x8_c(uint8_t* ptr, int stride); extern void consim_c(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr); extern void consim_mmx(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr); extern void consim_sse2(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr); void test_SSIM() { const int nb_tests = 3000*speed_ref; int tst; CPU *cpu; int i; int devs[3]; long lumo, lumc; DECLARE_ALIGNED_MATRIX(Ref1, 16, 16, uint8_t, 16); DECLARE_ALIGNED_MATRIX(Ref2, 16, 16, uint8_t, 16); lumfunc lum8x8; lumfunc lum2x8; csfunc csim; ieee_reseed(1); printf( "\n ====== test SSIM ======\n" ); for(i=0; i<16*16;++i) { long v1, v2; v1 = ieee_rand(-256, 511); v2 = ieee_rand(-256, 511); Ref1[i] = (v1<0) ? 0 : (v1>255) ? 255 : v1; Ref2[i] = (v2<0) ? 0 : (v2>255) ? 
255 : v2; } lumc = ieee_rand(0, 255); lumo = ieee_rand(0, 255); for(cpu = cpu_list; cpu->name!=0; ++cpu) { double t; int m; if (!init_cpu(cpu)) continue; lum8x8 = lum_8x8_c; lum2x8 = lum_2x8_c; csim = consim_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) if (cpu->cpu & XVID_CPU_MMX){ lum8x8 = lum_8x8_mmx; csim = consim_mmx; } if (cpu->cpu & XVID_CPU_SSE2){ csim = consim_sse2; } #endif t = gettime_usec(); emms(); for(tst=0; tstname, t, m, (m!=8230)?"| ERROR": "" ); t = gettime_usec(); emms(); for(tst=0; tstname, t, m, (m!=681)?"| ERROR": "" ); t = gettime_usec(); emms(); for(tst=0; tstname, t, devs[0], devs[1], devs[2], (devs[0]!=0x1bdf0f || devs[1]!=0x137258 || devs[2]!=0xcdb13)?"| ERROR": "" ); printf( " --- \n" ); } } /********************************************************************* * test bitstream functions *********************************************************************/ #define BIT_BUF_SIZE 2000 static void test_bits() { const int nb_tests = 50*speed_ref; int tst; uint32_t Crc; uint8_t Buf[BIT_BUF_SIZE]; uint32_t Extracted[BIT_BUF_SIZE*8]; /* worst case: bits read 1 by 1 */ int Lens[BIT_BUF_SIZE*8]; double t1; printf( "\n === test bitstream ===\n" ); ieee_reseed(1); Crc = 0; t1 = gettime_usec(); for(tst=0; tst0; m++) { const int b = ieee_rand(1,32); Lens[m] = b; l2 -= b; if (l2<0) break; Extracted[m] = BitstreamShowBits(&bs, b); BitstreamSkip(&bs, b); // printf( "<= %d: %d 0x%x\n", m, b, Extracted[m]); } BitstreamReset(&bs); for(m2=0; m2 %d: %d 0x%x %c\n", m2, b, v, " *"[Crc]); } } t1 = (gettime_usec() - t1) / nb_tests; printf(" test_bits %.3f usec %s\n", t1, (Crc!=0)?"| ERROR": "" ); } /********************************************************************* * main *********************************************************************/ static void arg_missing(const char *opt) { printf( "missing argument after option '%s'\n", opt); exit(-1); } int main(int argc, const char *argv[]) { int c, what = 0; int width, height; uint32_t chksum = 0; const 
char * test_bitstream = 0; #if defined(WIN32) && defined(ARCH_IS_X86_64) DECLARE_ALIGNED_MATRIX(xmm_save, 2, 4, uint64_t, 16); // assumes xmm6 and xmm7 won't be falsely preserved by C code for(c=0;c<4;c++) xmm_save[c] = read_counter(); prime_xmm(xmm_save); #endif cpu_mask = 0; // default => will use autodetect for(c=1; cargc) { printf("usage: %s %d bitstream width height (checksum)\n", argv[0], what); exit(-1); } test_bitstream = argv[++c]; width = atoi(argv[++c]); height = atoi(argv[++c]); if (c+1= 0 && what <= 6) || what == 10) { printf("\n\n" "NB: If a function isn't optimised for a specific set of instructions,\n" " a C function is used instead. So don't panic if some functions\n" " may appear to be slow.\n"); } #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) if (what == 0 || what == 5) { printf("\n" "NB: MMX mpeg4 quantization is known to have very small errors (+/-1 magnitude)\n" " for 1 or 2 coefficients a block. This is mainly caused by the fact that the unit\n" " test goes far beyond the usual limits of real encoding. Please do not report\n" " this error to the developers.\n"); } #endif return 0; } /*********************************************************************/ xvidcore/examples/bench_list.pl0000775000076500007650000000173710721527465017774 0ustar xvidbuildxvidbuild#!/usr/bin/perl # # List of benches to run # ######################################### # Decoder benches ######################################### # Raw command-line args passed to 'xvid_bench 9' # format: bitstream_name width height checksum # followed, possibly, by the CPU option to use. 
@Dec_Benches = (
  "test1.m4v 640 352 0x9fa4494d -sse2" ,
  "test1.m4v 640 352 0x9fa4494d -mmxext" ,
  "test1.m4v 640 352 0x9fa4494d -mmx" ,
  "test1.m4v 640 352 0x76c9cde2 -c" ,
  "qpel.m4v 352 288 0xc07eb687 -sse2" ,
  "qpel.m4v 352 288 0xc07eb687 -mmxext" ,
  "qpel.m4v 352 288 0xc07eb687 -mmx" ,
  "qpel.m4v 352 288 0x54e720e0 -c" ,
  "lowdelay.m4v 720 576 0xf2a3229d -sse2" ,
  "lowdelay.m4v 720 576 0xf2a3229d -mmxext" ,
  "lowdelay.m4v 720 576 0xf2a3229d -mmx" ,
  "lowdelay.m4v 720 576 0x5ea8e958 -c" ,
  "gmc1.m4v 640 272 0x94f12062 -sse2" ,
  "gmc1.m4v 640 272 0x94f12062 -mmxext" ,
  "gmc1.m4v 640 272 0x94f12062 -mmx" ,
  "gmc1.m4v 640 272 0x3b938c99 -c"
);

#########################################
xvidcore/examples/xvid_decraw.c0000664000076500007650000005556111564705453017772 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - Console based decoding test application -
 *
 * Copyright(C) 2002-2003 Christoph Lampert
 *              2002-2003 Edouard Gomez
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: xvid_decraw.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

/*****************************************************************************
 *
 * Application notes :
 *
 * An MPEG-4 bitstream is read from an input file (or stdin) and decoded,
 * the speed for this is measured.
 *
 * The program is plain C and needs no libraries except for libxvidcore,
 * and maths-lib.
 *
 * Use ./xvid_decraw -help for a list of options
 *
 ****************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#ifndef WIN32
#include <sys/time.h>
#else
#include <time.h>
#endif

#include "xvid.h"

/*****************************************************************************
 * Global vars in module and constants
 ****************************************************************************/

#define USE_PNM 0
#define USE_TGA 1
#define USE_YUV 2

static int XDIM = 0;
static int YDIM = 0;
static int ARG_SAVEDECOUTPUT = 0;
static int ARG_SAVEMPEGSTREAM = 0;
static char *ARG_INPUTFILE = NULL;
static int CSP = XVID_CSP_I420;
static int BPP = 1;
static int FORMAT = USE_PNM;
static int POSTPROC = 0;
static int ARG_THREADS = 0;

static char filepath[256] = "./";
static void *dec_handle = NULL;

#define BUFFER_SIZE (2*1024*1024)

static const int display_buffer_bytes = 0;

#define MIN_USEFUL_BYTES 1

/*****************************************************************************
 * Local prototypes
 ****************************************************************************/

static double msecond();
static int dec_init(int use_assembler, int debug_level);
static int dec_main(unsigned char *istream,
                    unsigned char *ostream,
                    int istream_size,
                    xvid_dec_stats_t *xvid_dec_stats);
static int dec_stop();
static void usage();
static int write_image(char *prefix, unsigned char *image, int filenr);
static int write_pnm(char *filename, unsigned char *image);
static int write_tga(char *filename, unsigned char *image);
static int write_yuv(char *filename, unsigned char *image);

const char * type2str(int type)
{
    if (type==XVID_TYPE_IVOP)
        return "I";
    if (type==XVID_TYPE_PVOP)
        return "P";
    if (type==XVID_TYPE_BVOP)
        return "B";
    return "S";
}

/*****************************************************************************
 *        Main program
 ****************************************************************************/

int main(int argc, char *argv[])
{
    unsigned char *mp4_buffer = NULL;
    unsigned char *mp4_ptr = NULL;
    unsigned char *out_buffer = NULL;
    int useful_bytes;
    int chunk;
    xvid_dec_stats_t xvid_dec_stats;

    double totaldectime;

    long totalsize;
    int status;

    int use_assembler = 1;
    int debug_level = 0;

    char filename[256];

    FILE *in_file;
    int filenr;
    int i;

    printf("xvid_decraw - raw mpeg4 bitstream decoder ");
    printf("written by Christoph Lampert\n\n");

/*****************************************************************************
 * Command line parsing
 ****************************************************************************/

    for (i=1; i< argc; i++) {

        if (strcmp("-noasm", argv[i]) == 0 ) {
            use_assembler = 0;
        }
        else if (strcmp("-debug", argv[i]) == 0 && i < argc - 1 ) {
            i++;
            if (sscanf(argv[i], "0x%x", &debug_level) != 1) {
                debug_level = atoi(argv[i]);
            }
        }
        else if (strcmp("-d", argv[i]) == 0) {
            ARG_SAVEDECOUTPUT = 1;
        }
        else if (strcmp("-i", argv[i]) == 0 && i < argc - 1 ) {
            i++;
            ARG_INPUTFILE = argv[i];
        }
        else if (strcmp("-m", argv[i]) == 0) {
            ARG_SAVEMPEGSTREAM = 1;
        }
        else if (strcmp("-c", argv[i]) == 0 && i < argc - 1 ) {
            i++;
            if (strcmp(argv[i], "rgb16") == 0) {
                CSP = XVID_CSP_RGB555;
                BPP = 2;
            } else if (strcmp(argv[i], "rgb24") == 0) {
                CSP = XVID_CSP_BGR;
                BPP = 3;
            } else if (strcmp(argv[i], "rgb32") == 0) {
                CSP = XVID_CSP_BGRA;
                BPP = 4;
            } else if (strcmp(argv[i], "yv12") == 0) {
                CSP = XVID_CSP_YV12;
                BPP = 1;
            } else {
                CSP = XVID_CSP_I420;
                BPP = 1;
            }
        }
        else if (strcmp("-postproc", argv[i]) == 0 && i < argc - 1 ) {
            i++;
            POSTPROC = atoi(argv[i]);
            if (POSTPROC < 0) POSTPROC = 0;
            if (POSTPROC > 2) POSTPROC = 2;
        }
        else if (strcmp("-f", argv[i]) == 0 && i < argc -1) {
            i++;
            if (strcmp(argv[i], "tga") == 0) {
                FORMAT = USE_TGA;
            } else if (strcmp(argv[i], "yuv") == 0) {
                FORMAT = USE_YUV;
            } else {
                FORMAT = USE_PNM;
            }
        }
        else if (strcmp("-threads", argv[i]) == 0 && i < argc -1) {
            i++;
            ARG_THREADS = atoi(argv[i]);
        }
        else if (strcmp("-help", argv[i]) == 0) {
            usage();
            return(0);
        }
        else {
            usage();
            exit(-1);
        }
    }

#if defined(_MSC_VER)
    if (ARG_INPUTFILE==NULL) {
        fprintf(stderr, "Warning: MSVC build does not read EOF correctly from stdin. Use the -i switch.\n\n");
    }
#endif

/*****************************************************************************
 * Values checking
 ****************************************************************************/

    if ( ARG_INPUTFILE == NULL || strcmp(ARG_INPUTFILE, "stdin") == 0) {
        in_file = stdin;
    }
    else {
        in_file = fopen(ARG_INPUTFILE, "rb");
        if (in_file == NULL) {
            fprintf(stderr, "Error opening input file %s\n", ARG_INPUTFILE);
            return(-1);
        }
    }

    /* PNM/PGM format can't handle 16/32 bit data */
    if (BPP != 1 && BPP != 3 && FORMAT == USE_PNM) {
        FORMAT = USE_TGA;
    }
    if (BPP != 1 && FORMAT == USE_YUV) {
        FORMAT = USE_TGA;
    }

/*****************************************************************************
 * Memory allocation
 ****************************************************************************/

    /* Memory for encoded mp4 stream */
    mp4_buffer = (unsigned char *) malloc(BUFFER_SIZE);
    mp4_ptr = mp4_buffer;
    if (!mp4_buffer)
        goto free_all_memory;

/*****************************************************************************
 * Xvid PART Start
 ****************************************************************************/

    status = dec_init(use_assembler, debug_level);
    if (status) {
        fprintf(stderr, "Decore INIT problem, return value %d\n", status);
        goto release_all;
    }

/*****************************************************************************
 * Main loop
 ****************************************************************************/

    /* Fill the buffer */
    useful_bytes = (int) fread(mp4_buffer, 1, BUFFER_SIZE, in_file);

    totaldectime = 0;
    totalsize = 0;
    filenr = 0;
    mp4_ptr = mp4_buffer;
    chunk = 0;

    do {
        int used_bytes = 0;
        double dectime;

        /*
         * If the buffer is half empty or there are no more bytes in it
         * then fill it.
         */
        if (mp4_ptr > mp4_buffer + BUFFER_SIZE/2) {
            int already_in_buffer = (int)(mp4_buffer + BUFFER_SIZE - mp4_ptr);

            /* Move data if needed */
            if (already_in_buffer > 0)
                memcpy(mp4_buffer, mp4_ptr, already_in_buffer);

            /* Update mp4_ptr */
            mp4_ptr = mp4_buffer;

            /* read new data */
            if(!feof(in_file)) {
                useful_bytes += (int) fread(mp4_buffer + already_in_buffer, 1, BUFFER_SIZE - already_in_buffer, in_file);
            }
        }

        /* This loop is needed to handle VOL/NVOP reading */
        do {

            /* Decode frame */
            dectime = msecond();
            used_bytes = dec_main(mp4_ptr, out_buffer, useful_bytes, &xvid_dec_stats);
            dectime = msecond() - dectime;

            /* Resize image buffer if needed */
            if(xvid_dec_stats.type == XVID_TYPE_VOL) {

                /* Check if old buffer is smaller */
                if(XDIM*YDIM < xvid_dec_stats.data.vol.width*xvid_dec_stats.data.vol.height) {

                    /* Copy new width and new height from the vol structure */
                    XDIM = xvid_dec_stats.data.vol.width;
                    YDIM = xvid_dec_stats.data.vol.height;

                    /* Free old output buffer */
                    if(out_buffer) free(out_buffer);

                    /* Allocate the new buffer */
                    out_buffer = (unsigned char*)malloc(XDIM*YDIM*4);
                    if(out_buffer == NULL)
                        goto free_all_memory;

                    fprintf(stderr, "Resized frame buffer to %dx%d\n", XDIM, YDIM);
                }

                /* Save individual mpeg4 stream if required */
                if(ARG_SAVEMPEGSTREAM) {
                    FILE *filehandle = NULL;

                    sprintf(filename, "%svolhdr.m4v", filepath);
                    filehandle = fopen(filename, "wb");
                    if(!filehandle) {
                        fprintf(stderr, "Error writing vol header mpeg4 stream to file %s\n", filename);
                    }
                    else {
                        fwrite(mp4_ptr, 1, used_bytes, filehandle);
                        fclose(filehandle);
                    }
                }
            }

            /* Update buffer pointers */
            if(used_bytes > 0) {
                mp4_ptr += used_bytes;
                useful_bytes -= used_bytes;

                /* Total size */
                totalsize += used_bytes;
            }

            if (display_buffer_bytes) {
                printf("Data chunk %d: %d bytes consumed, %d bytes in buffer\n", chunk++, used_bytes, useful_bytes);
            }
        } while (xvid_dec_stats.type <= 0 && useful_bytes > MIN_USEFUL_BYTES);

        /* Check if there is a negative number of useful bytes left in buffer
         * This means we went too far */
        if(useful_bytes < 0)
            break;

        /* Updated data - Count only useful decode time */
        totaldectime += dectime;

        if (!display_buffer_bytes) {
            printf("Frame %5d: type = %s, dectime(ms) =%6.1f, length(bytes) =%7d\n", filenr, type2str(xvid_dec_stats.type), dectime, used_bytes);
        }

        /* Save individual mpeg4 stream if required */
        if(ARG_SAVEMPEGSTREAM) {
            FILE *filehandle = NULL;

            sprintf(filename, "%sframe%05d.m4v", filepath, filenr);
            filehandle = fopen(filename, "wb");
            if(!filehandle) {
                fprintf(stderr, "Error writing single mpeg4 stream to file %s\n", filename);
            }
            else {
                fwrite(mp4_ptr-used_bytes, 1, used_bytes, filehandle);
                fclose(filehandle);
            }
        }

        /* Save output frame if required */
        if (ARG_SAVEDECOUTPUT) {
            sprintf(filename, "%sdec", filepath);

            if(write_image(filename, out_buffer, filenr)) {
                fprintf(stderr, "Error writing decoded frame %s\n", filename);
            }
        }

        filenr++;

    } while (useful_bytes>MIN_USEFUL_BYTES || !feof(in_file));

    useful_bytes = 0; /* Empty buffer */

/*****************************************************************************
 * Flush decoder buffers
 ****************************************************************************/

    do {

        /* Fake vars */
        int used_bytes;
        double dectime;

        do {
            dectime = msecond();
            used_bytes = dec_main(NULL, out_buffer, -1, &xvid_dec_stats);
            dectime = msecond() - dectime;
            if (display_buffer_bytes) {
                printf("Data chunk %d: %d bytes consumed, %d bytes in buffer\n", chunk++, used_bytes, useful_bytes);
            }
        } while(used_bytes>=0 && xvid_dec_stats.type <= 0);

        if (used_bytes < 0) { /* XVID_ERR_END */
            break;
        }

        /* Updated data - Count only useful decode time */
        totaldectime += dectime;

        /* Prints some decoding stats */
        if (!display_buffer_bytes) {
            printf("Frame %5d: type = %s, dectime(ms) =%6.1f, length(bytes) =%7d\n", filenr, type2str(xvid_dec_stats.type), dectime, used_bytes);
        }

        /* Save output frame if required */
        if (ARG_SAVEDECOUTPUT) {
            sprintf(filename, "%sdec", filepath);

            if(write_image(filename, out_buffer, filenr)) {
                fprintf(stderr, "Error writing decoded frame %s\n", filename);
            }
        }

        filenr++;

    }while(1);

/*****************************************************************************
 * Calculate totals and averages for output, print results
 ****************************************************************************/

    if (filenr>0) {
        totalsize /= filenr;
        totaldectime /= filenr;
        printf("Avg: dectime(ms) =%7.2f, fps =%7.2f, length(bytes) =%7d\n", totaldectime, 1000/totaldectime, (int)totalsize);
    }else{
        printf("Nothing was decoded!\n");
    }

/*****************************************************************************
 * Xvid PART Stop
 ****************************************************************************/

 release_all:
    if (dec_handle) {
        status = dec_stop();
        if (status)
            fprintf(stderr, "decore RELEASE problem return value %d\n", status);
    }

 free_all_memory:
    free(out_buffer);
    free(mp4_buffer);

    return(0);
}

/*****************************************************************************
 * Usage function
 ****************************************************************************/

static void usage()
{
    fprintf(stderr, "Usage : xvid_decraw [OPTIONS]\n");
    fprintf(stderr, "Options :\n");
    fprintf(stderr, " -noasm : don't use assembly optimizations (default=enabled)\n");
    fprintf(stderr, " -debug : debug level (debug=0)\n");
    fprintf(stderr, " -i string : input filename (default=stdin)\n");
    fprintf(stderr, " -d : save decoder output\n");
    fprintf(stderr, " -c csp : choose colorspace output (rgb16, rgb24, rgb32, yv12, i420)\n");
    fprintf(stderr, " -f format : choose output file format (tga, pnm, pgm, yuv)\n");
    fprintf(stderr, " -postproc : postprocessing level (0=off, 1=deblock, 2=deblock+dering)\n");
    fprintf(stderr, " -threads int : number of threads\n");
    fprintf(stderr, " -m : save mpeg4 raw stream to individual files\n");
    fprintf(stderr, " -help : This help message\n");
    fprintf(stderr, " (* means default)\n");
}

/*****************************************************************************
 * "helper" functions
 ****************************************************************************/

/* return the current time in milliseconds */
static double msecond()
{
#ifndef WIN32
    struct timeval tv;
    gettimeofday(&tv, 0);
    return((double)tv.tv_sec*1.0e3 + (double)tv.tv_usec*1.0e-3);
#else
    clock_t clk;
    clk = clock();
    return(clk * 1000.0 / CLOCKS_PER_SEC);
#endif
}

/*****************************************************************************
 * output functions
 ****************************************************************************/

static int write_image(char *prefix, unsigned char *image, int filenr)
{
    char filename[1024];
    char *ext;
    int ret;

    if (FORMAT == USE_PNM && BPP == 1) {
        ext = "pgm";
    } else if (FORMAT == USE_PNM && BPP == 3) {
        ext = "pnm";
    } else if (FORMAT == USE_YUV) {
        ext = "yuv";
    } else if (FORMAT == USE_TGA) {
        ext = "tga";
    } else {
        fprintf(stderr, "Bug: should not reach this code path -- please report to xvid-devel@xvid.org with command line options used\n");
        exit(-1);
    }

    if (FORMAT == USE_YUV) {
        sprintf(filename, "%s.%s", prefix, ext);

        if (!filenr) {
            /* Create (or truncate) the output file on the first frame */
            FILE *fp = fopen(filename, "wb");
            if (fp == NULL)
                return(-1);
            fclose(fp);
        }
    } else
        sprintf(filename, "%s%05d.%s", prefix, filenr, ext);

    if (FORMAT == USE_PNM) {
        ret = write_pnm(filename, image);
    } else if (FORMAT == USE_YUV) {
        ret = write_yuv(filename, image);
    } else {
        ret = write_tga(filename, image);
    }

    return(ret);
}

static int write_tga(char *filename, unsigned char *image)
{
    FILE * f;
    char hdr[18];

    f = fopen(filename, "wb");
    if ( f == NULL) {
        return -1;
    }

    hdr[0] = 0; /* ID length */
    hdr[1] = 0; /* Color map type */
    hdr[2] = (BPP>1)?2:3; /* Uncompressed true color (2) or greymap (3) */
    hdr[3] = 0; /* Color map specification (not used) */
    hdr[4] = 0; /* Color map specification (not used) */
    hdr[5] = 0; /* Color map specification (not used) */
    hdr[6] = 0; /* Color map specification (not used) */
    hdr[7] = 0; /* Color map specification (not used) */
    hdr[8] = 0; /* LSB X origin */
    hdr[9] = 0; /* MSB X origin */
    hdr[10] = 0; /* LSB Y origin */
    hdr[11] = 0; /* MSB Y origin */
    hdr[12] = (XDIM>>0)&0xff; /* LSB Width */
    hdr[13] = (XDIM>>8)&0xff; /* MSB Width */
    if (BPP > 1) {
        hdr[14] = (YDIM>>0)&0xff; /* LSB Height */
        hdr[15] = (YDIM>>8)&0xff; /* MSB Height */
    } else {
        hdr[14] = ((YDIM*3)>>1)&0xff; /* LSB Height */
        hdr[15] = ((YDIM*3)>>9)&0xff; /* MSB Height */
    }
    hdr[16] = BPP*8;
    hdr[17] = 0x00 | (1<<5) /* Up to down */ | (0<<4); /* Image descriptor */

    /* Write header */
    fwrite(hdr, 1, sizeof(hdr), f);

#ifdef ARCH_IS_LITTLE_ENDIAN
    /* write first plane */
    fwrite(image, 1, XDIM*YDIM*BPP, f);
#else
    {
        int i;
        for (i=0; iversion = XVID_VERSION;

    /* No general flags to set */
    if (POSTPROC == 1)
        xvid_dec_frame.general = XVID_DEBLOCKY | XVID_DEBLOCKUV;
    else if (POSTPROC==2)
        xvid_dec_frame.general = XVID_DEBLOCKY | XVID_DEBLOCKUV | XVID_DERINGY | XVID_DERINGUV;
    else
        xvid_dec_frame.general = 0;

    /* Input stream */
    xvid_dec_frame.bitstream = istream;
    xvid_dec_frame.length = istream_size;

    /* Output frame structure */
    xvid_dec_frame.output.plane[0] = ostream;
    xvid_dec_frame.output.stride[0] = XDIM*BPP;
    xvid_dec_frame.output.csp = CSP;

    ret = xvid_decore(dec_handle, XVID_DEC_DECODE, &xvid_dec_frame, xvid_dec_stats);

    return(ret);
}

/* close decoder to release resources */
static int dec_stop()
{
    int ret;

    ret = xvid_decore(dec_handle, XVID_DEC_DESTROY, NULL, NULL);

    return(ret);
}
xvidcore/examples/xvid_encraw.c0000664000076500007650000024002611564705453017774 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 *  XVID MPEG-4
 *  VIDEO CODEC
 *  - Console based test application -
 *
 *  Copyright(C) 2002-2003 Christoph Lampert
 *               2002-2003 Edouard Gomez
 *               2003 Peter Ross
 *               2003-2010 Michael Militzer
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, write to the Free Software
 *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: xvid_encraw.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

/*****************************************************************************
 *  Application notes :
 *
 *  A sequence of raw YUV I420 pics or YUV I420 PGM file format is encoded
 *  The speed is measured and frames' PSNR are taken from core.
 *
 *  The program is plain C and needs no libraries except for libxvidcore,
 *  and maths-lib.
 *
 *  Use ./xvid_encraw -help for a list of options
 *
 ************************************************************************/

#include
//#include
#include
#include
#include
#ifndef WIN32
#include
#else
#include
#include
#include
#define XVID_AVI_INPUT
#define XVID_AVI_OUTPUT
#endif

#include "xvid.h"
#include "portab.h" /* for pthread */

#ifdef XVID_MKV_OUTPUT
#include "matroska.cpp"
#endif

#undef READ_PNM

//#define USE_APP_LEVEL_THREADING /* Should xvid_encraw app use multi-threading? */

/*****************************************************************************
 * Quality presets
 ****************************************************************************/

// Equivalent to vfw's pmvfast_presets
static const int motion_presets[] = {
    /* quality 0 */
    0,

    /* quality 1 */
    0,

    /* quality 2 */
    0,

    /* quality 3 */
    0,

    /* quality 4 */
    0 | XVID_ME_HALFPELREFINE16 | 0,

    /* quality 5 */
    0 | XVID_ME_HALFPELREFINE16 | 0 | XVID_ME_ADVANCEDDIAMOND16,

    /* quality 6 */
    XVID_ME_HALFPELREFINE16 | XVID_ME_EXTSEARCH16 |
    XVID_ME_HALFPELREFINE8 | 0 | XVID_ME_USESQUARES16
};
#define ME_ELEMENTS (sizeof(motion_presets)/sizeof(motion_presets[0]))

static const int vop_presets[] = {
    /* quality 0 */
    0,

    /* quality 1 */
    0,

    /* quality 2 */
    0,

    /* quality 3 */
    0,

    /* quality 4 */
    0,

    /* quality 5 */
    XVID_VOP_INTER4V,

    /* quality 6 */
    XVID_VOP_INTER4V,
};
#define VOP_ELEMENTS (sizeof(vop_presets)/sizeof(vop_presets[0]))

/*****************************************************************************
 * Command line global variables
 ****************************************************************************/

#define MAX_ZONES 64
#define MAX_ENC_INSTANCES 4
#define DEFAULT_QUANT 400

typedef struct
{
    int frame;

    int type;
    int mode;
    int modifier;

    unsigned int greyscale;
    unsigned int chroma_opt;
    unsigned int bvop_threshold;
    unsigned int cartoon_mode;
} zone_t;

typedef struct
{
    int count;
    int size;
    int quants[32];
} frame_stats_t;

typedef struct
{
    pthread_t handle; /* thread's handle */

    int start_num; /* begin/end of sequence */
    int stop_num;

    char *outfilename; /* output filename */
    char *statsfilename1; /* pass1 statsfile */

    int input_num;

    int totalsize; /* encoder stats */
    double totalenctime;
    float totalPSNR[3];
    frame_stats_t framestats[7];
} enc_sequence_data_t;

/* Maximum number of frames to encode */
#define ABS_MAXFRAMENR -1 /* no limit */

#ifndef READ_PNM
#define IMAGE_SIZE(x,y) ((x)*(y)*3/2)
#else
#define IMAGE_SIZE(x,y) ((x)*(y)*3)
#endif

#define MAX(A,B) ( ((A)>(B)) ? (A) : (B) )
#define SMALL_EPS (1e-10)

#define SWAP(a) ( (((a)&0x000000ff)<<24) | (((a)&0x0000ff00)<<8) | \
                  (((a)&0x00ff0000)>>8) | (((a)&0xff000000)>>24) )

static zone_t ZONES[MAX_ZONES];
static int NUM_ZONES = 0;

static int ARG_NUM_APP_THREADS = 1;
static int ARG_CPU_FLAGS = 0;
static int ARG_STATS = 0;
static int ARG_SSIM = -1;
static int ARG_PSNRHVSM = 0;
static char* ARG_SSIM_PATH = NULL;
static int ARG_DUMP = 0;
static int ARG_LUMIMASKING = 0;
static int ARG_BITRATE = 0;
static int ARG_TARGETSIZE = 0;
static int ARG_SINGLE = 1;
static char *ARG_PASS1 = 0;
static char *ARG_PASS2 = 0;
//static int ARG_QUALITY = ME_ELEMENTS - 1;
static int ARG_QUALITY = 6;
static float ARG_FRAMERATE = 0.f;
static int ARG_DWRATE = 25;
static int ARG_DWSCALE = 1;
static int ARG_MAXFRAMENR = ABS_MAXFRAMENR;
static int ARG_MAXKEYINTERVAL = 300;
static int ARG_STARTFRAMENR = 0;
static char *ARG_INPUTFILE = NULL;
static int ARG_INPUTTYPE = 0;
static int ARG_SAVEMPEGSTREAM = 0;
static int ARG_SAVEINDIVIDUAL = 0;
static char *ARG_OUTPUTFILE = NULL;
static char *ARG_AVIOUTPUTFILE = NULL;
static char *ARG_MKVOUTPUTFILE = NULL;
static char *ARG_TIMECODEFILE = NULL;
static int XDIM = 0;
static int YDIM = 0;
static int ARG_BQRATIO = 150;
static int ARG_BQOFFSET = 100;
static int ARG_MAXBFRAMES = 2;
static int ARG_PACKED = 1;
static int ARG_DEBUG = 0;
static int ARG_VOPDEBUG = 0;
static int ARG_TRELLIS = 1;
static int ARG_QTYPE = 0;
static int ARG_QMATRIX = 0;
static int ARG_GMC = 0;
static int ARG_INTERLACING = 0;
static int ARG_QPEL = 0;
static int ARG_TURBO = 0;
static int ARG_VHQMODE = 1;
static int ARG_BVHQ = 0;
static int ARG_QMETRIC = 0;
static int ARG_CLOSED_GOP = 1;
static int ARG_CHROMAME = 1;
static int ARG_PAR = 1;
static int ARG_PARHEIGHT;
static int ARG_PARWIDTH;
static int ARG_QUANTS[6] = {2, 31, 2, 31, 2, 31};
static int ARG_FRAMEDROP = 0;
static double ARG_CQ = 0;
static int ARG_FULL1PASS = 0;
static int ARG_REACTION = 16;
static int ARG_AVERAGING = 100;
static int
ARG_SMOOTHER = 100; static int ARG_KBOOST = 10; static int ARG_KREDUCTION = 20; static int ARG_KTHRESH = 1; static int ARG_CHIGH = 0; static int ARG_CLOW = 0; static int ARG_OVERSTRENGTH = 5; static int ARG_OVERIMPROVE = 5; static int ARG_OVERDEGRADE = 5; static int ARG_OVERHEAD = 0; static int ARG_VBVSIZE = 0; static int ARG_VBVMAXRATE = 0; static int ARG_VBVPEAKRATE = 0; static int ARG_THREADS = 0; static int ARG_SLICES = 1; static int ARG_VFR = 0; static int ARG_PROGRESS = 0; static int ARG_COLORSPACE = XVID_CSP_YV12; /* the path where to save output */ static char filepath[256] = "./"; static unsigned char qmatrix_intra[64]; static unsigned char qmatrix_inter[64]; /**************************************************************************** * Nasty global vars ;-) ***************************************************************************/ static const int height_ratios[] = {1, 1, 11, 11, 11, 33}; static const int width_ratios[] = {1, 1, 12, 10, 16, 40}; const char userdata_start_code[] = "\0\0\x01\xb2"; /***************************************************************************** * Local prototypes ****************************************************************************/ /* Prints program usage message */ static void usage(); /* Statistical functions */ static double msecond(); int gcd(int a, int b); int minquant(int quants[32]); int maxquant(int quants[32]); double avgquant(frame_stats_t frame); /* PGM related functions */ #ifndef READ_PNM static int read_pgmheader(FILE * handle); static int read_pgmdata(FILE * handle, unsigned char *image); #else static int read_pnmheader(FILE * handle); static int read_pnmdata(FILE * handle, unsigned char *image); #endif static int read_yuvdata(FILE * handle, unsigned char *image); /* Encoder related functions */ static void enc_gbl(int use_assembler); static int enc_init(void **enc_handle, char *stats_pass1, int start_num); static int enc_info(); static int enc_stop(void *enc_handle); static int enc_main(void 
*enc_handle, unsigned char *image, unsigned char *bitstream, int *key, int *stats_type, int *stats_quant, int *stats_length, int stats[3], int framenum); static void encode_sequence(enc_sequence_data_t *h); /* Zone Related Functions */ static void apply_zone_modifiers(xvid_enc_frame_t * frame, int framenum); static void prepare_full1pass_zones(); static void prepare_cquant_zones(); void sort_zones(zone_t * zones, int zone_num, int * sel); void removedivxp(char *buf, int size); /***************************************************************************** * Main function ****************************************************************************/ int main(int argc, char *argv[]) { double totalenctime = 0.; float totalPSNR[3] = {0., 0., 0.}; FILE *statsfile; frame_stats_t framestats[7]; int input_num = 0; int totalsize = 0; int use_assembler = 1; int i; printf("xvid_encraw - raw mpeg4 bitstream encoder "); printf("written by Christoph Lampert\n\n"); /* Is there a dumb Xvid coder ? */ if(ME_ELEMENTS != VOP_ELEMENTS) { fprintf(stderr, "Presets' arrays should have the same number of elements -- Please file a bug to xvid-devel@xvid.org\n"); return(-1); } /* Clear framestats */ memset(framestats, 0, sizeof(framestats)); /***************************************************************************** * Command line parsing ****************************************************************************/ for (i = 1; i < argc; i++) { if (strcmp("-asm", argv[i]) == 0) { use_assembler = 1; } else if (strcmp("-noasm", argv[i]) == 0) { use_assembler = 0; } else if (strcmp("-w", argv[i]) == 0 && i < argc - 1) { i++; XDIM = atoi(argv[i]); } else if (strcmp("-h", argv[i]) == 0 && i < argc - 1) { i++; YDIM = atoi(argv[i]); } else if (strcmp("-csp",argv[i]) == 0 && i < argc - 1) { i++; if (strcmp(argv[i],"i420") == 0){ ARG_COLORSPACE = XVID_CSP_I420; } else if(strcmp(argv[i],"yv12") == 0){ ARG_COLORSPACE = XVID_CSP_YV12; } else { printf("Invalid colorspace\n"); return 0; } } else if 
(strcmp("-bitrate", argv[i]) == 0) { if (i < argc - 1) ARG_BITRATE = atoi(argv[i+1]); if (ARG_BITRATE) { i++; if (ARG_BITRATE <= 20000) /* if given parameter is <= 20000, assume it means kbps */ ARG_BITRATE *= 1000; } else ARG_BITRATE = 700000; } else if (strcmp("-size", argv[i]) == 0 && i < argc - 1) { i++; ARG_TARGETSIZE = atoi(argv[i]); } else if (strcmp("-cq", argv[i]) == 0 && i < argc - 1) { i++; ARG_CQ = atof(argv[i])*100; } else if (strcmp("-single", argv[i]) == 0) { ARG_SINGLE = 1; ARG_PASS1 = NULL; ARG_PASS2 = NULL; } else if (strcmp("-pass1", argv[i]) == 0) { ARG_SINGLE = 0; if ((i < argc - 1) && (*argv[i+1] != '-')) { i++; ARG_PASS1 = argv[i]; } else { ARG_PASS1 = "xvid.stats"; } } else if (strcmp("-full1pass", argv[i]) == 0) { ARG_FULL1PASS = 1; } else if (strcmp("-pass2", argv[i]) == 0) { ARG_SINGLE = 0; if ((i < argc - 1) && (*argv[i+1] != '-')) { i++; ARG_PASS2 = argv[i]; } else { ARG_PASS2 = "xvid.stats"; } } else if (strcmp("-max_bframes", argv[i]) == 0 && i < argc - 1) { i++; ARG_MAXBFRAMES = atoi(argv[i]); } else if (strcmp("-par", argv[i]) == 0 && i < argc - 1) { i++; if (sscanf(argv[i], "%d:%d", &(ARG_PARWIDTH), &(ARG_PARHEIGHT))!=2) ARG_PAR = atoi(argv[i]); else { int div; ARG_PAR = 0; div = gcd(ARG_PARWIDTH, ARG_PARHEIGHT); ARG_PARWIDTH /= div; ARG_PARHEIGHT /= div; } } else if (strcmp("-nopacked", argv[i]) == 0) { ARG_PACKED = 0; } else if (strcmp("-packed", argv[i]) == 0) { ARG_PACKED = 2; } else if (strcmp("-nochromame", argv[i]) == 0) { ARG_CHROMAME = 0; } else if (strcmp("-threads", argv[i]) == 0 && i < argc -1) { i++; ARG_THREADS = atoi(argv[i]); } else if (strcmp("-slices", argv[i]) == 0 && i < argc -1) { i++; ARG_SLICES = atoi(argv[i]); } else if (strcmp("-bquant_ratio", argv[i]) == 0 && i < argc - 1) { i++; ARG_BQRATIO = atoi(argv[i]); } else if (strcmp("-bquant_offset", argv[i]) == 0 && i < argc - 1) { i++; ARG_BQOFFSET = atoi(argv[i]); } else if (strcmp("-zones", argv[i]) == 0 && i < argc -1) { char c; char *frameoptions, *rem; int 
startframe; char options[40]; i++; do { rem = strrchr(argv[i], '/'); if (rem==NULL) rem=argv[i]; else { *rem = '\0'; rem++; } if (sscanf(rem, "%d,%c,%s", &startframe, &c, options)<3) { fprintf(stderr, "Zone error, bad parameters %s\n", rem); continue; } if (NUM_ZONES >= MAX_ZONES) { fprintf(stderr, "warning: too many zones; zone ignored\n"); continue; } memset(&ZONES[NUM_ZONES], 0, sizeof(zone_t)); ZONES[NUM_ZONES].frame = startframe; ZONES[NUM_ZONES].modifier = (int)(atof(options)*100); if (toupper(c)=='Q') ZONES[NUM_ZONES].mode = XVID_ZONE_QUANT; else if (toupper(c)=='W') ZONES[NUM_ZONES].mode = XVID_ZONE_WEIGHT; else { fprintf(stderr, "Bad zone type %c\n", c); continue; } if ((frameoptions=strchr(options, ','))!=NULL) { int readchar=0, count; frameoptions++; while (readchar<(int)strlen(frameoptions)) { if (sscanf(frameoptions+readchar, "%d%n", &(ZONES[NUM_ZONES].bvop_threshold), &count)==1) { readchar += count; } else { if (toupper(frameoptions[readchar])=='K') ZONES[NUM_ZONES].type = XVID_TYPE_IVOP; else if (toupper(frameoptions[readchar])=='G') ZONES[NUM_ZONES].greyscale = 1; else if (toupper(frameoptions[readchar])=='O') ZONES[NUM_ZONES].chroma_opt = 1; else if (toupper(frameoptions[readchar])=='C') ZONES[NUM_ZONES].cartoon_mode = 1; else { fprintf(stderr, "Error in zone %s option %c\n", rem, frameoptions[readchar]); break; } readchar++; } } } NUM_ZONES++; } while (rem != argv[i]); } else if ((strcmp("-zq", argv[i]) == 0 || strcmp("-zw", argv[i]) == 0) && i < argc - 2) { if (NUM_ZONES >= MAX_ZONES) { fprintf(stderr,"warning: too many zones; zone ignored\n"); continue; } memset(&ZONES[NUM_ZONES], 0, sizeof(zone_t)); if (strcmp("-zq", argv[i])== 0) { ZONES[NUM_ZONES].mode = XVID_ZONE_QUANT; } else { ZONES[NUM_ZONES].mode = XVID_ZONE_WEIGHT; } ZONES[NUM_ZONES].modifier = (int)(atof(argv[i+2])*100); i++; ZONES[NUM_ZONES].frame = atoi(argv[i]); i++; ZONES[NUM_ZONES].type = XVID_TYPE_AUTO; ZONES[NUM_ZONES].greyscale = 0; ZONES[NUM_ZONES].chroma_opt = 0; 
ZONES[NUM_ZONES].bvop_threshold = 0; ZONES[NUM_ZONES].cartoon_mode = 0; NUM_ZONES++; } else if (strcmp("-quality", argv[i]) == 0 && i < argc - 1) { i++; ARG_QUALITY = atoi(argv[i]); } else if (strcmp("-start", argv[i]) == 0 && i < argc - 1) { i++; ARG_STARTFRAMENR = atoi(argv[i]); } else if (strcmp("-vhqmode", argv[i]) == 0 && i < argc - 1) { i++; ARG_VHQMODE = atoi(argv[i]); } else if (strcmp("-metric", argv[i]) == 0 && i < argc - 1) { i++; ARG_QMETRIC = atoi(argv[i]); } else if (strcmp("-framerate", argv[i]) == 0 && i < argc - 1) { int exponent; i++; ARG_FRAMERATE = (float) atof(argv[i]); exponent = (int) strcspn(argv[i], "."); if (exponent<(int)strlen(argv[i])) exponent=(int)pow(10.0, (int)(strlen(argv[i])-1-exponent)); else exponent=1; ARG_DWRATE = (int)(atof(argv[i])*exponent); ARG_DWSCALE = exponent; exponent = gcd(ARG_DWRATE, ARG_DWSCALE); ARG_DWRATE /= exponent; ARG_DWSCALE /= exponent; } else if (strcmp("-max_key_interval", argv[i]) == 0 && i < argc - 1) { i++; ARG_MAXKEYINTERVAL = atoi(argv[i]); } else if (strcmp("-i", argv[i]) == 0 && i < argc - 1) { i++; ARG_INPUTFILE = argv[i]; } else if (strcmp("-stats", argv[i]) == 0) { ARG_STATS = 1; } else if (strcmp("-ssim", argv[i]) == 0) { ARG_SSIM = 2; if ((i < argc - 1) && (*argv[i+1] != '-')) { i++; ARG_SSIM = atoi(argv[i]); } } else if (strcmp("-psnrhvsm", argv[i]) == 0) { ARG_PSNRHVSM = 1; } else if (strcmp("-ssim_file", argv[i]) == 0 && i < argc -1) { i++; ARG_SSIM_PATH = argv[i]; } else if (strcmp("-timecode", argv[i]) == 0 && i < argc -1) { i++; ARG_TIMECODEFILE = argv[i]; } else if (strcmp("-dump", argv[i]) == 0) { ARG_DUMP = 1; } else if (strcmp("-masking", argv[i]) == 0 && i < argc -1) { i++; ARG_LUMIMASKING = atoi(argv[i]); } else if (strcmp("-type", argv[i]) == 0 && i < argc - 1) { i++; ARG_INPUTTYPE = atoi(argv[i]); } else if (strcmp("-frames", argv[i]) == 0 && i < argc - 1) { i++; ARG_MAXFRAMENR = atoi(argv[i]); } else if (strcmp("-drop", argv[i]) == 0 && i < argc - 1) { i++; ARG_FRAMEDROP = 
atoi(argv[i]); } else if (strcmp("-imin", argv[i]) == 0 && i < argc - 1) { i++; ARG_QUANTS[0] = atoi(argv[i]); } else if (strcmp("-imax", argv[i]) == 0 && i < argc - 1) { i++; ARG_QUANTS[1] = atoi(argv[i]); } else if (strcmp("-pmin", argv[i]) == 0 && i < argc - 1) { i++; ARG_QUANTS[2] = atoi(argv[i]); } else if (strcmp("-pmax", argv[i]) == 0 && i < argc - 1) { i++; ARG_QUANTS[3] = atoi(argv[i]); } else if (strcmp("-bmin", argv[i]) == 0 && i < argc - 1) { i++; ARG_QUANTS[4] = atoi(argv[i]); } else if (strcmp("-bmax", argv[i]) == 0 && i < argc - 1) { i++; ARG_QUANTS[5] = atoi(argv[i]); } else if (strcmp("-qtype", argv[i]) == 0 && i < argc - 1) { i++; ARG_QTYPE = atoi(argv[i]); } else if (strcmp("-qmatrix", argv[i]) == 0 && i < argc - 1) { FILE *fp = fopen(argv[++i], "rb"); if (fp == NULL) { fprintf(stderr, "Error opening input file %s\n", argv[i]); return (-1); } fseek(fp, 0, SEEK_END); if (ftell(fp) != 128) { fprintf(stderr, "Unexpected size of input file %s\n", argv[i]); return (-1); } fseek(fp, 0, SEEK_SET); fread(qmatrix_intra, 1, 64, fp); fread(qmatrix_inter, 1, 64, fp); ARG_QMATRIX = 1; ARG_QTYPE = 1; } else if (strcmp("-save", argv[i]) == 0) { ARG_SAVEMPEGSTREAM = 1; ARG_SAVEINDIVIDUAL = 1; } else if (strcmp("-debug", argv[i]) == 0 && i < argc -1) { i++; if (!(sscanf(argv[i],"0x%x", &(ARG_DEBUG)))) sscanf(argv[i],"%d", &(ARG_DEBUG)); } else if (strcmp("-o", argv[i]) == 0 && i < argc - 1) { ARG_SAVEMPEGSTREAM = 1; i++; ARG_OUTPUTFILE = argv[i]; } else if (strcmp("-avi", argv[i]) == 0 && i < argc - 1) { #ifdef XVID_AVI_OUTPUT ARG_SAVEMPEGSTREAM = 1; i++; ARG_AVIOUTPUTFILE = argv[i]; #else fprintf( stderr, "Not compiled with AVI output support.\n"); return(-1); #endif } else if (strcmp("-mkv", argv[i]) == 0 && i < argc - 1) { #ifdef XVID_MKV_OUTPUT ARG_SAVEMPEGSTREAM = 1; i++; ARG_MKVOUTPUTFILE = argv[i]; #else fprintf(stderr, "Not compiled with MKV output support.\n"); return(-1); #endif } else if (strcmp("-vop_debug", argv[i]) == 0) { ARG_VOPDEBUG = 1; } else 
if (strcmp("-notrellis", argv[i]) == 0) { ARG_TRELLIS = 0; } else if (strcmp("-bvhq", argv[i]) == 0) { ARG_BVHQ = 1; } else if (strcmp("-qpel", argv[i]) == 0) { ARG_QPEL = 1; } else if (strcmp("-turbo", argv[i]) == 0) { ARG_TURBO = 1; } else if (strcmp("-gmc", argv[i]) == 0) { ARG_GMC = 1; } else if (strcmp("-interlaced", argv[i]) == 0) { if ((i < argc - 1) && (*argv[i+1] != '-')) { i++; ARG_INTERLACING = atoi(argv[i]); } else { ARG_INTERLACING = 1; } } else if (strcmp("-noclosed_gop", argv[i]) == 0) { ARG_CLOSED_GOP = 0; } else if (strcmp("-closed_gop", argv[i]) == 0) { ARG_CLOSED_GOP = 2; } else if (strcmp("-vbvsize", argv[i]) == 0 && i < argc -1) { i++; ARG_VBVSIZE = atoi(argv[i]); } else if (strcmp("-vbvmax", argv[i]) == 0 && i < argc -1) { i++; ARG_VBVMAXRATE = atoi(argv[i]); } else if (strcmp("-vbvpeak", argv[i]) == 0 && i < argc -1) { i++; ARG_VBVPEAKRATE = atoi(argv[i]); } else if (strcmp("-reaction", argv[i]) == 0 && i < argc -1) { i++; ARG_REACTION = atoi(argv[i]); } else if (strcmp("-averaging", argv[i]) == 0 && i < argc -1) { i++; ARG_AVERAGING = atoi(argv[i]); } else if (strcmp("-smoother", argv[i]) == 0 && i < argc -1) { i++; ARG_SMOOTHER = atoi(argv[i]); } else if (strcmp("-kboost", argv[i]) == 0 && i < argc -1) { i++; ARG_KBOOST = atoi(argv[i]); } else if (strcmp("-kthresh", argv[i]) == 0 && i < argc -1) { i++; ARG_KTHRESH = atoi(argv[i]); } else if (strcmp("-chigh", argv[i]) == 0 && i < argc -1) { i++; ARG_CHIGH = atoi(argv[i]); } else if (strcmp("-clow", argv[i]) == 0 && i < argc -1) { i++; ARG_CLOW = atoi(argv[i]); } else if (strcmp("-ostrength", argv[i]) == 0 && i < argc -1) { i++; ARG_OVERSTRENGTH = atoi(argv[i]); } else if (strcmp("-oimprove", argv[i]) == 0 && i < argc -1) { i++; ARG_OVERIMPROVE = atoi(argv[i]); } else if (strcmp("-odegrade", argv[i]) == 0 && i < argc -1) { i++; ARG_OVERDEGRADE = atoi(argv[i]); } else if (strcmp("-overhead", argv[i]) == 0 && i < argc -1) { i++; ARG_OVERHEAD = atoi(argv[i]); } else if (strcmp("-kreduction", 
argv[i]) == 0 && i < argc -1) { i++; ARG_KREDUCTION = atoi(argv[i]); } else if (strcmp("-progress", argv[i]) == 0) { if (i < argc - 1) /* in kbps */ ARG_PROGRESS = atoi(argv[i+1]); if (ARG_PROGRESS > 0) i++; else ARG_PROGRESS = 10; } else if (strcmp("-help", argv[i]) == 0) { usage(); return (0); } else { usage(); exit(-1); } } /***************************************************************************** * Arguments checking ****************************************************************************/ if (XDIM <= 0 || XDIM >= 4096 || YDIM <= 0 || YDIM >= 4096) { fprintf(stderr, "Trying to retrieve width and height from input header\n"); if (!ARG_INPUTTYPE) ARG_INPUTTYPE = 1; /* pgm */ } if (ARG_QUALITY < 0 ) { ARG_QUALITY = 0; } else if (ARG_QUALITY >= ME_ELEMENTS) { ARG_QUALITY = ME_ELEMENTS - 1; } if (ARG_STARTFRAMENR < 0) { fprintf(stderr, "Bad starting frame number %d, cannot be negative\n", ARG_STARTFRAMENR); return(-1); } if (ARG_PASS2) { if (ARG_PASS2 == ARG_PASS1) { fprintf(stderr, "Can't use the same statsfile for pass1 and pass2: %s\n", ARG_PASS2); return(-1); } statsfile = fopen(ARG_PASS2, "rb"); if (statsfile == NULL) { fprintf(stderr, "Couldn't open statsfile '%s'!\n", ARG_PASS2); return (-1); } fclose(statsfile); } #ifdef XVID_AVI_OUTPUT if (ARG_AVIOUTPUTFILE == NULL && ARG_PACKED <= 1) ARG_PACKED = 0; #endif if (ARG_BITRATE < 0) { fprintf(stderr, "Bad bitrate %d, cannot be negative\n", ARG_BITRATE); return(-1); } if (NUM_ZONES) { int i; sort_zones(ZONES, NUM_ZONES, &i); } if (ARG_PAR > 5) { fprintf(stderr, "Bad PAR: %d. 
Must be [1..5] or width:height\n", ARG_PAR); return(-1); } if (ARG_MAXFRAMENR == 0) { fprintf(stderr, "Wrong number of frames\n"); return (-1); } if (ARG_INPUTFILE != NULL) { #if defined(XVID_AVI_INPUT) if (strcmp(ARG_INPUTFILE+(strlen(ARG_INPUTFILE)-3), "avs")==0 || strcmp(ARG_INPUTFILE+(strlen(ARG_INPUTFILE)-3), "avi")==0 || ARG_INPUTTYPE==2) { PAVIFILE avi_in = NULL; PAVISTREAM avi_in_stream = NULL; PGETFRAME get_frame = NULL; BITMAPINFOHEADER myBitmapInfoHeader; AVISTREAMINFO avi_info; FILE *avi_fp = fopen(ARG_INPUTFILE, "rb"); AVIFileInit(); if (avi_fp == NULL) { fprintf(stderr, "Couldn't open file '%s'!\n", ARG_INPUTFILE); return (-1); } fclose(avi_fp); if (AVIFileOpen(&avi_in, ARG_INPUTFILE, OF_READ, NULL) != AVIERR_OK) { fprintf(stderr, "Can't open avi/avs file %s\n", ARG_INPUTFILE); AVIFileExit(); return(-1); } if (AVIFileGetStream(avi_in, &avi_in_stream, streamtypeVIDEO, 0) != AVIERR_OK) { fprintf(stderr, "Can't open stream from file '%s'!\n", ARG_INPUTFILE); AVIFileRelease(avi_in); AVIFileExit(); return (-1); } AVIFileRelease(avi_in); if(AVIStreamInfo(avi_in_stream, &avi_info, sizeof(AVISTREAMINFO)) != AVIERR_OK) { fprintf(stderr, "Can't get stream info from file '%s'!\n", ARG_INPUTFILE); AVIStreamRelease(avi_in_stream); AVIFileExit(); return (-1); } if (avi_info.fccHandler != MAKEFOURCC('Y', 'V', '1', '2')) { LONG size; fprintf(stderr, "Non YV12 input colorspace %c%c%c%c! 
Attempting conversion...\n", avi_info.fccHandler%256, (avi_info.fccHandler>>8)%256, (avi_info.fccHandler>>16)%256, (avi_info.fccHandler>>24)%256); size = sizeof(myBitmapInfoHeader); AVIStreamReadFormat(avi_in_stream, 0, &myBitmapInfoHeader, &size); if (size==0) fprintf(stderr, "AVIStreamReadFormat read 0 bytes.\n"); else { fprintf(stderr, "AVIStreamReadFormat read %d bytes.\n", size); fprintf(stderr, "width = %d, height = %d, planes = %d\n", myBitmapInfoHeader.biWidth, myBitmapInfoHeader.biHeight, myBitmapInfoHeader.biPlanes); fprintf(stderr, "Compression = %c%c%c%c, %d\n", myBitmapInfoHeader.biCompression%256, (myBitmapInfoHeader.biCompression>>8)%256, (myBitmapInfoHeader.biCompression>>16)%256, (myBitmapInfoHeader.biCompression>>24)%256, myBitmapInfoHeader.biCompression); fprintf(stderr, "Bits Per Pixel = %d\n", myBitmapInfoHeader.biBitCount); myBitmapInfoHeader.biCompression = MAKEFOURCC('Y', 'V', '1', '2'); myBitmapInfoHeader.biBitCount = 12; myBitmapInfoHeader.biSizeImage = (myBitmapInfoHeader.biWidth*myBitmapInfoHeader.biHeight)*3/2; get_frame = AVIStreamGetFrameOpen(avi_in_stream, &myBitmapInfoHeader); } if (get_frame == NULL) { AVIStreamRelease(avi_in_stream); AVIFileExit(); return (-1); } else { unsigned char *temp; fprintf(stderr, "AVIStreamGetFrameOpen successful.\n"); temp = (unsigned char*)AVIStreamGetFrame(get_frame, 0); if (temp != NULL) { int i; for (i = 0; i < (int)((DWORD*)temp)[0]; i++) { fprintf(stderr, "%2d ", temp[i]); } fprintf(stderr, "\n"); } } if (avi_info.fccHandler == MAKEFOURCC('D', 'I', 'B', ' ')) { AVIStreamGetFrameClose(get_frame); get_frame = NULL; ARG_COLORSPACE = XVID_CSP_BGR | XVID_CSP_VFLIP; } } if (ARG_MAXFRAMENR<0) ARG_MAXFRAMENR = avi_info.dwLength-ARG_STARTFRAMENR; else ARG_MAXFRAMENR = min(ARG_MAXFRAMENR, (int) (avi_info.dwLength-ARG_STARTFRAMENR)); XDIM = avi_info.rcFrame.right - avi_info.rcFrame.left; YDIM = avi_info.rcFrame.bottom - avi_info.rcFrame.top; if (ARG_FRAMERATE==0) { ARG_FRAMERATE = (float) avi_info.dwRate / 
(float) avi_info.dwScale; ARG_DWRATE = avi_info.dwRate; ARG_DWSCALE = avi_info.dwScale; } ARG_INPUTTYPE = 2; if (get_frame) AVIStreamGetFrameClose(get_frame); if (avi_in_stream) AVIStreamRelease(avi_in_stream); AVIFileExit(); } else #endif { FILE *in_file = fopen(ARG_INPUTFILE, "rb"); int pos = 0; if (in_file == NULL) { fprintf(stderr, "Error opening input file %s\n", ARG_INPUTFILE); return (-1); } #ifdef USE_APP_LEVEL_THREADING fseek(in_file, 0, SEEK_END); /* Determine input size */ pos = ftell(in_file); ARG_MAXFRAMENR = pos / IMAGE_SIZE(XDIM, YDIM); /* PGM, header size ?? */ #endif fclose(in_file); } } if (ARG_FRAMERATE <= 0) { ARG_FRAMERATE = 25.00f; /* default value */ } if (ARG_TARGETSIZE) { if (ARG_MAXFRAMENR <= 0) { fprintf(stderr, "Bad target size; number of input frames unknown\n"); goto release_all; } else if (ARG_BITRATE) { fprintf(stderr, "Parameter conflict: Do not specify both -bitrate and -size\n"); goto release_all; } else ARG_BITRATE = (int) (((ARG_TARGETSIZE * 8) / (ARG_MAXFRAMENR / ARG_FRAMERATE)) * 1024); } /* Set constant quant to default if no bitrate given for single pass */ if (ARG_SINGLE && (!ARG_BITRATE) && (!ARG_CQ)) ARG_CQ = DEFAULT_QUANT; /* Init xvidcore */ enc_gbl(use_assembler); #ifdef USE_APP_LEVEL_THREADING if (ARG_INPUTFILE == NULL || strcmp(ARG_INPUTFILE, "stdin") == 0 || ARG_NUM_APP_THREADS <= 1 || ARG_THREADS != 0 || ARG_TIMECODEFILE != NULL || ARG_AVIOUTPUTFILE != NULL || ARG_INPUTTYPE == 1 || ARG_MKVOUTPUTFILE != NULL) /* TODO: PGM input */ #endif /* Spawn just one encoder instance */ { enc_sequence_data_t enc_data; memset(&enc_data, 0, sizeof(enc_sequence_data_t)); if (!ARG_THREADS) ARG_THREADS = ARG_NUM_APP_THREADS; ARG_NUM_APP_THREADS = 1; enc_data.outfilename = ARG_OUTPUTFILE; enc_data.statsfilename1 = ARG_PASS1; enc_data.start_num = ARG_STARTFRAMENR; enc_data.stop_num = ARG_MAXFRAMENR; /* Encode input */ encode_sequence(&enc_data); /* Copy back stats */ input_num = enc_data.input_num; totalsize = enc_data.totalsize; 
totalenctime = enc_data.totalenctime; for (i=0; i < 3; i++) totalPSNR[i] = enc_data.totalPSNR[i]; memcpy(framestats, enc_data.framestats, sizeof(framestats)); } #ifdef USE_APP_LEVEL_THREADING else { /* Split input into sequences and create multiple encoder instances */ int k; void *status; FILE *f_out = NULL, *f_stats = NULL; enc_sequence_data_t enc_data[MAX_ENC_INSTANCES]; char outfile[MAX_ENC_INSTANCES][256]; char statsfilename[MAX_ENC_INSTANCES][256]; for (k = 0; k < MAX_ENC_INSTANCES; k++) memset(&enc_data[k], 0, sizeof(enc_sequence_data_t)); /* Overwrite internal encoder threading */ if (ARG_NUM_APP_THREADS > MAX_ENC_INSTANCES) { ARG_THREADS = (int) (ARG_NUM_APP_THREADS / MAX_ENC_INSTANCES); ARG_NUM_APP_THREADS = MAX_ENC_INSTANCES; } else ARG_THREADS = -1; enc_data[0].outfilename = ARG_OUTPUTFILE; enc_data[0].statsfilename1 = ARG_PASS1; enc_data[0].start_num = ARG_STARTFRAMENR; enc_data[0].stop_num = (ARG_MAXFRAMENR-ARG_STARTFRAMENR)/ARG_NUM_APP_THREADS; for (k = 1; k < ARG_NUM_APP_THREADS; k++) { sprintf(outfile[k], "%s.%03d", ARG_OUTPUTFILE, k); enc_data[k].outfilename = outfile[k]; if (ARG_PASS1) { sprintf(statsfilename[k], "%s.%03d", ARG_PASS1, k); enc_data[k].statsfilename1 = statsfilename[k]; } enc_data[k].start_num = (k*(ARG_MAXFRAMENR-ARG_STARTFRAMENR))/ARG_NUM_APP_THREADS; enc_data[k].stop_num = ((k+1)*(ARG_MAXFRAMENR-ARG_STARTFRAMENR))/ARG_NUM_APP_THREADS; } /* Start multiple encoder threads in parallel */ for (k = 1; k < ARG_NUM_APP_THREADS; k++) { pthread_create(&enc_data[k].handle, NULL, (void*)encode_sequence, (void*)&enc_data[k]); } /* Encode first sequence in this thread */ encode_sequence(&enc_data[0]); /* Wait until encoder threads have finished */ for (k = 1; k < ARG_NUM_APP_THREADS; k++) { pthread_join(enc_data[k].handle, &status); } /* Join encoder stats and encoder output files */ if (ARG_OUTPUTFILE) f_out = fopen(enc_data[0].outfilename, "ab+"); if (ARG_PASS1) f_stats = fopen(enc_data[0].statsfilename1, "ab+"); for (k = 0; k < 
ARG_NUM_APP_THREADS; k++) { /* Join stats */ input_num += enc_data[k].input_num; totalsize += enc_data[k].totalsize; totalenctime = MAX(totalenctime, enc_data[k].totalenctime); for (i=0; i < 3; i++) totalPSNR[i] += enc_data[k].totalPSNR[i]; for (i=0; i < 8; i++) { int l; framestats[i].count += enc_data[k].framestats[i].count; framestats[i].size += enc_data[k].framestats[i].size; for (l=0; l < 32; l++) framestats[i].quants[l] += enc_data[k].framestats[i].quants[l]; } /* Join output files */ if ((k > 0) && (f_out != NULL)) { int ch; FILE *f = fopen(enc_data[k].outfilename, "rb"); while((ch = fgetc(f)) != EOF) { fputc(ch, f_out); } fclose(f); remove(enc_data[k].outfilename); } /* Join first pass stats files */ if ((k > 0) && (f_stats != NULL)) { char str[256]; FILE *f = fopen(enc_data[k].statsfilename1, "r"); while(fgets(str, sizeof(str), f) != NULL) { if (str[0] != '#' && strlen(str) > 3) fputs(str, f_stats); } fclose(f); remove(enc_data[k].statsfilename1); } } if (f_out) fclose(f_out); if (f_stats) fclose(f_stats); } #endif /***************************************************************************** * Calculate totals and averages for output, print results ****************************************************************************/ printf("\n"); printf("Tot: enctime(ms) =%7.2f, length(bytes) = %7d\n", totalenctime, (int) totalsize); if (input_num > 0) { totalsize /= input_num; totalenctime /= input_num; totalPSNR[0] /= input_num; totalPSNR[1] /= input_num; totalPSNR[2] /= input_num; } else { totalsize = -1; totalenctime = -1; } printf("Avg: enctime(ms) =%7.2f, fps =%7.2f, length(bytes) = %7d", totalenctime, 1000 / totalenctime, (int) totalsize); if (ARG_STATS) { printf(", psnr y = %2.2f, psnr u = %2.2f, psnr v = %2.2f", totalPSNR[0],totalPSNR[1],totalPSNR[2]); } printf("\n"); if (framestats[XVID_TYPE_IVOP].count) { printf("I frames: %6d frames, size = %7d/%7d, quants = %2d / %.2f / %2d\n", \ framestats[XVID_TYPE_IVOP].count, 
framestats[XVID_TYPE_IVOP].size/framestats[XVID_TYPE_IVOP].count, \ framestats[XVID_TYPE_IVOP].size, minquant(framestats[XVID_TYPE_IVOP].quants), \ avgquant(framestats[XVID_TYPE_IVOP]), maxquant(framestats[XVID_TYPE_IVOP].quants)); } if (framestats[XVID_TYPE_PVOP].count) { printf("P frames: %6d frames, size = %7d/%7d, quants = %2d / %.2f / %2d\n", \ framestats[XVID_TYPE_PVOP].count, framestats[XVID_TYPE_PVOP].size/framestats[XVID_TYPE_PVOP].count, \ framestats[XVID_TYPE_PVOP].size, minquant(framestats[XVID_TYPE_PVOP].quants), \ avgquant(framestats[XVID_TYPE_PVOP]), maxquant(framestats[XVID_TYPE_PVOP].quants)); } if (framestats[XVID_TYPE_BVOP].count) { printf("B frames: %6d frames, size = %7d/%7d, quants = %2d / %.2f / %2d\n", \ framestats[XVID_TYPE_BVOP].count, framestats[XVID_TYPE_BVOP].size/framestats[XVID_TYPE_BVOP].count, \ framestats[XVID_TYPE_BVOP].size, minquant(framestats[XVID_TYPE_BVOP].quants), \ avgquant(framestats[XVID_TYPE_BVOP]), maxquant(framestats[XVID_TYPE_BVOP].quants)); } if (framestats[XVID_TYPE_SVOP].count) { printf("S frames: %6d frames, size = %7d/%7d, quants = %2d / %.2f / %2d\n", \ framestats[XVID_TYPE_SVOP].count, framestats[XVID_TYPE_SVOP].size/framestats[XVID_TYPE_SVOP].count, \ framestats[XVID_TYPE_SVOP].size, minquant(framestats[XVID_TYPE_SVOP].quants), \ avgquant(framestats[XVID_TYPE_SVOP]), maxquant(framestats[XVID_TYPE_SVOP].quants)); } if (framestats[5].count) { printf("N frames: %6d frames, size = %7d/%7d\n", \ framestats[5].count, framestats[5].size/framestats[5].count, \ framestats[5].size); } /***************************************************************************** * Xvid PART Stop ****************************************************************************/ release_all: return (0); } /***************************************************************************** * Encode a sequence ****************************************************************************/ void encode_sequence(enc_sequence_data_t *h) { /* Internal 
structures (handles) for encoding */ void *enc_handle = NULL; int start_num = h->start_num; int stop_num = h->stop_num; char *outfilename = h->outfilename; float *totalPSNR = h->totalPSNR; int input_num; int totalsize; double totalenctime = 0.; unsigned char *mp4_buffer = NULL; unsigned char *in_buffer = NULL; unsigned char *out_buffer = NULL; double enctime; int result; int output_num; int nvop_counter; int m4v_size; int key; int stats_type; int stats_quant; int stats_length; int fakenvop = 0; FILE *in_file = stdin; FILE *out_file = NULL; FILE *time_file = NULL; char filename[256]; #ifdef XVID_MKV_OUTPUT PMKVFILE myMKVFile = NULL; PMKVSTREAM myMKVStream = NULL; MKVSTREAMINFO myMKVStreamInfo; #endif #if defined(XVID_AVI_INPUT) PAVIFILE avi_in = NULL; PAVISTREAM avi_in_stream = NULL; PGETFRAME get_frame = NULL; BITMAPINFOHEADER myBitmapInfoHeader; #else #define get_frame NULL #endif #if defined(XVID_AVI_OUTPUT) int avierr; PAVIFILE myAVIFile = NULL; PAVISTREAM myAVIStream = NULL; AVISTREAMINFO myAVIStreamInfo; #endif #if defined(XVID_AVI_INPUT) || defined(XVID_AVI_OUTPUT) if (ARG_NUM_APP_THREADS > 1) CoInitializeEx(0, COINIT_MULTITHREADED); AVIFileInit(); #endif if (ARG_INPUTFILE == NULL || strcmp(ARG_INPUTFILE, "stdin") == 0) { in_file = stdin; } else { #ifdef XVID_AVI_INPUT if (strcmp(ARG_INPUTFILE+(strlen(ARG_INPUTFILE)-3), "avs")==0 || strcmp(ARG_INPUTFILE+(strlen(ARG_INPUTFILE)-3), "avi")==0 || ARG_INPUTTYPE==2) { AVISTREAMINFO avi_info; FILE *avi_fp = fopen(ARG_INPUTFILE, "rb"); if (avi_fp == NULL) { fprintf(stderr, "Couldn't open file '%s'!\n", ARG_INPUTFILE); return; } fclose(avi_fp); if (AVIFileOpen(&avi_in, ARG_INPUTFILE, OF_READ, NULL) != AVIERR_OK) { fprintf(stderr, "Can't open avi/avs file %s\n", ARG_INPUTFILE); AVIFileExit(); return; } if (AVIFileGetStream(avi_in, &avi_in_stream, streamtypeVIDEO, 0) != AVIERR_OK) { fprintf(stderr, "Can't open stream from file '%s'!\n", ARG_INPUTFILE); AVIFileRelease(avi_in); AVIFileExit(); return; } 
AVIFileRelease(avi_in); if(AVIStreamInfo(avi_in_stream, &avi_info, sizeof(AVISTREAMINFO)) != AVIERR_OK) { fprintf(stderr, "Can't get stream info from file '%s'!\n", ARG_INPUTFILE); AVIStreamRelease(avi_in_stream); AVIFileExit(); return; } if (avi_info.fccHandler != MAKEFOURCC('Y', 'V', '1', '2')) { LONG size; fprintf(stderr, "Non YV12 input colorspace %c%c%c%c! Attempting conversion...\n", avi_info.fccHandler%256, (avi_info.fccHandler>>8)%256, (avi_info.fccHandler>>16)%256, (avi_info.fccHandler>>24)%256); size = sizeof(myBitmapInfoHeader); AVIStreamReadFormat(avi_in_stream, 0, &myBitmapInfoHeader, &size); if (size==0) fprintf(stderr, "AVIStreamReadFormat read 0 bytes.\n"); else { fprintf(stderr, "AVIStreamReadFormat read %d bytes.\n", size); fprintf(stderr, "width = %d, height = %d, planes = %d\n", myBitmapInfoHeader.biWidth, myBitmapInfoHeader.biHeight, myBitmapInfoHeader.biPlanes); fprintf(stderr, "Compression = %c%c%c%c, %d\n", myBitmapInfoHeader.biCompression%256, (myBitmapInfoHeader.biCompression>>8)%256, (myBitmapInfoHeader.biCompression>>16)%256, (myBitmapInfoHeader.biCompression>>24)%256, myBitmapInfoHeader.biCompression); fprintf(stderr, "Bits Per Pixel = %d\n", myBitmapInfoHeader.biBitCount); myBitmapInfoHeader.biCompression = MAKEFOURCC('Y', 'V', '1', '2'); myBitmapInfoHeader.biBitCount = 12; myBitmapInfoHeader.biSizeImage = (myBitmapInfoHeader.biWidth*myBitmapInfoHeader.biHeight)*3/2; get_frame = AVIStreamGetFrameOpen(avi_in_stream, &myBitmapInfoHeader); } if (get_frame == NULL) { AVIStreamRelease(avi_in_stream); AVIFileExit(); return; } else { unsigned char *temp; fprintf(stderr, "AVIStreamGetFrameOpen successful.\n"); temp = (unsigned char*)AVIStreamGetFrame(get_frame, 0); if (temp != NULL) { int i; for (i = 0; i < (int)((DWORD*)temp)[0]; i++) { fprintf(stderr, "%2d ", temp[i]); } fprintf(stderr, "\n"); } } if (avi_info.fccHandler == MAKEFOURCC('D', 'I', 'B', ' ')) { AVIStreamGetFrameClose(get_frame); get_frame = NULL; ARG_COLORSPACE = XVID_CSP_BGR | 
XVID_CSP_VFLIP; } } } else #endif { in_file = fopen(ARG_INPUTFILE, "rb"); if (in_file == NULL) { fprintf(stderr, "Error opening input file %s\n", ARG_INPUTFILE); return; } } } /* This should be after the avi input opening stuff */ if (ARG_TIMECODEFILE != NULL) { time_file = fopen(ARG_TIMECODEFILE, "r"); if (time_file==NULL) { fprintf(stderr, "Couldn't open timecode file '%s'!\n", ARG_TIMECODEFILE); return; } else { fscanf(time_file, "# timecode format v2\n"); } } if (ARG_INPUTTYPE==1) { #ifndef READ_PNM if (read_pgmheader(in_file)) { #else if (read_pnmheader(in_file)) { #endif fprintf(stderr, "Wrong input format, I want YUV encapsulated in PGM\n"); return; } } /* Jump to the starting frame */ if (ARG_INPUTTYPE == 0) /* TODO: Other input formats ??? */ fseek(in_file, start_num*IMAGE_SIZE(XDIM, YDIM), SEEK_SET); /* now we know the sizes, so allocate memory */ if (get_frame == NULL) { in_buffer = (unsigned char *) malloc(4*XDIM*YDIM); if (!in_buffer) goto free_all_memory; } /* this should really be enough memory ! 
*/ mp4_buffer = (unsigned char *) malloc(IMAGE_SIZE(XDIM, YDIM) * 2); if (!mp4_buffer) goto free_all_memory; /***************************************************************************** * Xvid PART Start ****************************************************************************/ result = enc_init(&enc_handle, h->statsfilename1, h->start_num); if (result) { fprintf(stderr, "Encore INIT problem, return value %d\n", result); goto release_all; } /***************************************************************************** * Main loop ****************************************************************************/ if (ARG_SAVEMPEGSTREAM) { if (outfilename) { if ((out_file = fopen(outfilename, "w+b")) == NULL) { fprintf(stderr, "Error opening output file %s\n", outfilename); goto release_all; } } #ifdef XVID_AVI_OUTPUT if (ARG_AVIOUTPUTFILE != NULL ) { { /* Open the .avi output then close it */ /* Resets the file size to 0, which AVIFile doesn't seem to do */ FILE *scrub; if ((scrub = fopen(ARG_AVIOUTPUTFILE, "w+b")) == NULL) { fprintf(stderr, "Error opening output file %s\n", ARG_AVIOUTPUTFILE); goto release_all; } else fclose(scrub); } memset(&myAVIStreamInfo, 0, sizeof(AVISTREAMINFO)); myAVIStreamInfo.fccType = streamtypeVIDEO; myAVIStreamInfo.fccHandler = MAKEFOURCC('x', 'v', 'i', 'd'); myAVIStreamInfo.dwScale = ARG_DWSCALE; myAVIStreamInfo.dwRate = ARG_DWRATE; myAVIStreamInfo.dwLength = ARG_MAXFRAMENR; myAVIStreamInfo.dwQuality = 10000; SetRect(&myAVIStreamInfo.rcFrame, 0, 0, XDIM, YDIM); if (avierr=AVIFileOpen(&myAVIFile, ARG_AVIOUTPUTFILE, OF_CREATE|OF_WRITE, NULL)) { fprintf(stderr, "AVIFileOpen failed opening output file %s, error code %d\n", ARG_AVIOUTPUTFILE, avierr); goto release_all; } if (avierr=AVIFileCreateStream(myAVIFile, &myAVIStream, &myAVIStreamInfo)) { fprintf(stderr, "AVIFileCreateStream failed, error code %d\n", avierr); goto release_all; } memset(&myBitmapInfoHeader, 0, sizeof(BITMAPINFOHEADER)); myBitmapInfoHeader.biHeight = YDIM; 
myBitmapInfoHeader.biWidth = XDIM; myBitmapInfoHeader.biPlanes = 1; myBitmapInfoHeader.biSize = sizeof(BITMAPINFOHEADER); myBitmapInfoHeader.biCompression = MAKEFOURCC('X', 'V', 'I', 'D'); myBitmapInfoHeader.biBitCount = 12; myBitmapInfoHeader.biSizeImage = 6*XDIM*YDIM; if (avierr=AVIStreamSetFormat(myAVIStream, 0, &myBitmapInfoHeader, sizeof(BITMAPINFOHEADER))) { fprintf(stderr, "AVIStreamSetFormat failed, error code %d\n", avierr); goto release_all; } } #endif #ifdef XVID_MKV_OUTPUT if (ARG_MKVOUTPUTFILE != NULL) { { /* Open the .mkv output then close it */ /* Just to make sure we can write to it */ FILE *scrub; if ((scrub = fopen(ARG_MKVOUTPUTFILE, "w+b")) == NULL) { fprintf(stderr, "Error opening output file %s\n", ARG_MKVOUTPUTFILE); goto release_all; } else fclose(scrub); } MKVFileOpen(&myMKVFile, ARG_MKVOUTPUTFILE, OF_CREATE|OF_WRITE, NULL); if (ARG_PAR) { myMKVStreamInfo.display_height = YDIM*height_ratios[ARG_PAR]; myMKVStreamInfo.display_width = XDIM*width_ratios[ARG_PAR]; } else { myMKVStreamInfo.display_height = YDIM*ARG_PARHEIGHT; myMKVStreamInfo.display_width = XDIM*ARG_PARWIDTH; } myMKVStreamInfo.height = YDIM; myMKVStreamInfo.width = XDIM; myMKVStreamInfo.framerate = ARG_DWRATE; myMKVStreamInfo.framescale = ARG_DWSCALE; myMKVStreamInfo.length = ARG_MAXFRAMENR; MKVFileCreateStream(myMKVFile, &myMKVStream, &myMKVStreamInfo); } #endif } else { out_file = NULL; } /***************************************************************************** * Encoding loop ****************************************************************************/ totalsize = 0; result = 0; input_num = 0; /* input frame counter */ output_num = start_num; /* output frame counter */ nvop_counter = 0; do { char *type; int sse[3]; if ((input_num+start_num) >= stop_num && stop_num > 0) { result = 1; } if (!result) { #ifdef XVID_AVI_INPUT if (ARG_INPUTTYPE==2) { /* read avs/avi data (YUV-format) */ if (get_frame != NULL) { in_buffer = (unsigned char*)AVIStreamGetFrame(get_frame, 
input_num+start_num); if (in_buffer == NULL) result = 1; else in_buffer += ((DWORD*)in_buffer)[0]; } else { if(AVIStreamRead(avi_in_stream, input_num+start_num, 1, in_buffer, 4*XDIM*YDIM, NULL, NULL ) != AVIERR_OK) result = 1; } } else #endif if (ARG_INPUTTYPE==1) { /* read PGM data (YUV-format) */ #ifndef READ_PNM result = read_pgmdata(in_file, in_buffer); #else result = read_pnmdata(in_file, in_buffer); #endif } else { /* read raw data (YUV-format) */ result = read_yuvdata(in_file, in_buffer); } } /***************************************************************************** * Encode and decode this frame ****************************************************************************/ if ((unsigned int)(input_num+start_num) >= (unsigned int)(stop_num-1) && ARG_MAXBFRAMES) { stats_type = XVID_TYPE_PVOP; } else stats_type = XVID_TYPE_AUTO; enctime = msecond(); m4v_size = enc_main(enc_handle, !result ? in_buffer : 0, mp4_buffer, &key, &stats_type, &stats_quant, &stats_length, sse, input_num); enctime = msecond() - enctime; /* Write the Frame statistics */ if (stats_type > 0) { /* !XVID_TYPE_NOTHING */ switch (stats_type) { case XVID_TYPE_IVOP: type = "I"; break; case XVID_TYPE_PVOP: type = "P"; break; case XVID_TYPE_BVOP: type = "B"; if (ARG_PACKED) fakenvop = 1; break; case XVID_TYPE_SVOP: type = "S"; break; default: type = "U"; break; } if (stats_length > 8) { h->framestats[stats_type].count++; h->framestats[stats_type].quants[stats_quant]++; h->framestats[stats_type].size += stats_length; } else { h->framestats[5].count++; h->framestats[5].quants[stats_quant]++; h->framestats[5].size += stats_length; } #define SSE2PSNR(sse, width, height) ((!(sse))?0.0f : 48.131f - 10*(float)log10((float)(sse)/((float)((width)*(height))))) if (ARG_PROGRESS == 0) { printf("%5d: key=%i, time= %6.0f, len= %7d", !result ? 
(input_num+start_num) : -1, key, (float) enctime, (int) m4v_size); printf(" | type=%s, quant= %2d, len= %7d", type, stats_quant, stats_length); if (ARG_STATS) { printf(", psnr y = %2.2f, psnr u = %2.2f, psnr v = %2.2f", SSE2PSNR(sse[0], XDIM, YDIM), SSE2PSNR(sse[1], XDIM / 2, YDIM / 2), SSE2PSNR(sse[2], XDIM / 2, YDIM / 2)); } printf("\n"); } else { if ((input_num) % ARG_PROGRESS == 1) { if (stop_num > 0) { fprintf(stderr, "\r%7d frames(%3d%%) encoded, %6.2f fps, Average Bitrate = %5.0fkbps", \ (ARG_NUM_APP_THREADS*input_num), (input_num)*100/(stop_num-start_num), (ARG_NUM_APP_THREADS*input_num)*1000/(totalenctime), \ ((((totalsize)/1000)*ARG_FRAMERATE)*8)/(input_num)); } else { fprintf(stderr, "\r%7d frames encoded, %6.2f fps, Average Bitrate = %5.0fkbps", \ (ARG_NUM_APP_THREADS*input_num), (ARG_NUM_APP_THREADS*input_num)*1000/(totalenctime), \ ((((totalsize)/1000)*ARG_FRAMERATE)*8)/(input_num)); } } } if (ARG_STATS) { totalPSNR[0] += SSE2PSNR(sse[0], XDIM, YDIM); totalPSNR[1] += SSE2PSNR(sse[1], XDIM/2, YDIM/2); totalPSNR[2] += SSE2PSNR(sse[2], XDIM/2, YDIM/2); } #undef SSE2PSNR } if (m4v_size < 0) break; /* Update encoding time stats */ totalenctime += enctime; totalsize += m4v_size; /***************************************************************************** * Save stream to file ****************************************************************************/ if (m4v_size > 0 && ARG_SAVEMPEGSTREAM) { char timecode[50]; if (time_file != NULL) { if (fscanf(time_file, "%s\n", timecode) != 1) { fprintf(stderr, "Error reading timecode file, frame %d\n", output_num); goto release_all; } } else sprintf(timecode, "%f", ((double)ARG_DWSCALE/ARG_DWRATE)*1000*output_num); /* Save single files */ if (ARG_SAVEINDIVIDUAL) { FILE *out; sprintf(filename, "%sframe%05d.m4v", filepath, output_num); out = fopen(filename, "w+b"); fwrite(mp4_buffer, m4v_size, 1, out); fclose(out); } #ifdef XVID_AVI_OUTPUT if (ARG_AVIOUTPUTFILE && myAVIStream) { int output_frame; if (time_file == 
NULL) output_frame = output_num; else { output_frame = (int)(atof(timecode)/1000/((double)ARG_DWSCALE/ARG_DWRATE)+.5); } if (AVIStreamWrite(myAVIStream, output_frame, 1, mp4_buffer, m4v_size, key ? AVIIF_KEYFRAME : 0, NULL, NULL)) { fprintf(stderr, "AVIStreamWrite failed writing frame %d\n", output_num); goto release_all; } } #endif if (key && ARG_PACKED) removedivxp((char*)mp4_buffer, m4v_size); /* Save ES stream */ if (outfilename && out_file && !(fakenvop && m4v_size <= 8)) { fwrite(mp4_buffer, 1, m4v_size, out_file); } #ifdef XVID_MKV_OUTPUT if (ARG_MKVOUTPUTFILE && myMKVStream) { MKVStreamWrite(myMKVStream, atof(timecode), 1, (ARG_PACKED && fakenvop && (m4v_size <= 8)) ? NULL : mp4_buffer, m4v_size, key ? AVIIF_KEYFRAME : 0, NULL, NULL); } #endif output_num++; if (stats_type != XVID_TYPE_BVOP) fakenvop=0; } if (!result) (input_num)++; /* Read the header if it's pgm stream */ if (!result && (ARG_INPUTTYPE==1)) #ifndef READ_PNM result = read_pgmheader(in_file); #else result = read_pnmheader(in_file); #endif } while (1); release_all: h->input_num = input_num; h->totalenctime = totalenctime; h->totalsize = totalsize; #ifdef XVID_AVI_INPUT if (get_frame) AVIStreamGetFrameClose(get_frame); if (avi_in_stream) AVIStreamRelease(avi_in_stream); #endif if (enc_handle) { result = enc_stop(enc_handle); if (result) fprintf(stderr, "Encore RELEASE problem return value %d\n", result); } if (in_file) fclose(in_file); if (out_file) fclose(out_file); if (time_file) fclose(time_file); #ifdef XVID_AVI_OUTPUT if (myAVIStream) AVIStreamRelease(myAVIStream); if (myAVIFile) AVIFileRelease(myAVIFile); #endif #ifdef XVID_MKV_OUTPUT if (myMKVStream) MKVStreamRelease(myMKVStream); if (myMKVFile) MKVFileRelease(myMKVFile); #endif #if defined(XVID_AVI_INPUT) || defined(XVID_AVI_OUTPUT) AVIFileExit(); #endif free_all_memory: free(out_buffer); free(mp4_buffer); free(in_buffer); } /***************************************************************************** * "statistical" functions * * these 
are not needed for encoding or decoding, but for measuring * time and quality, there is nothing specific to Xvid in these * *****************************************************************************/ /* Return a timestamp in milliseconds; callers measure an interval as the difference of two calls */ static double msecond() { #ifndef WIN32 struct timeval tv; gettimeofday(&tv, 0); return (tv.tv_sec * 1.0e3 + tv.tv_usec * 1.0e-3); #else clock_t clk; clk = clock(); return (clk * 1000.0 / CLOCKS_PER_SEC); #endif } int gcd(int a, int b) { int r ; if (b > a) { r = a; a = b; b = r; } if (b == 0) return a; /* avoid a % 0 */ while ((r = a % b)) { a = b; b = r; } return b; } int minquant(int quants[32]) { int i = 1; while (i < 31 && quants[i] == 0) { i++; } return i; } int maxquant(int quants[32]) { int i = 31; while (i > 1 && quants[i] == 0) { i--; } return i; } double avgquant(frame_stats_t frame) { double avg=0; int i; for (i=1; i < 32; i++) { avg += frame.quants[i]*i; } avg /= frame.count; return avg; } /***************************************************************************** * Usage message *****************************************************************************/ static void usage() { fprintf(stderr, "xvid_encraw built at %s on %s\n", __TIME__, __DATE__); fprintf(stderr, "Usage : xvid_encraw [OPTIONS]\n\n"); fprintf(stderr, "Input options:\n"); fprintf(stderr, " -i string : input filename (stdin)\n"); #ifdef XVID_AVI_INPUT fprintf(stderr, " -type integer: input data type (yuv=0, pgm=1, avi/avs=2)\n"); #else fprintf(stderr, " -type integer: input data type (yuv=0, pgm=1)\n"); #endif fprintf(stderr, " -w integer: frame width ([1..2048])\n"); fprintf(stderr, " -h integer: frame height ([1..2048])\n"); fprintf(stderr, " -csp string : colorspace of raw input file i420, yv12 (default)\n"); fprintf(stderr, " -frames integer: number of frames to encode\n"); fprintf(stderr, "\n"); fprintf(stderr, "Output options:\n"); fprintf(stderr, " -dump : save decoder output\n"); fprintf(stderr, " -save : save an Elementary Stream file per frame\n"); fprintf(stderr, 
" -o string : save an Elementary Stream for the complete sequence\n"); #ifdef XVID_AVI_OUTPUT fprintf(stderr, " -avi string: save an AVI file for the complete sequence\n"); #endif fprintf(stderr, " -mkv string: save a MKV file for the complete sequence\n"); fprintf(stderr, "\n"); fprintf(stderr, "BFrames options:\n"); fprintf(stderr, " -max_bframes integer: max bframes (2)\n"); fprintf(stderr, " -bquant_ratio integer: bframe quantizer ratio (150)\n"); fprintf(stderr, " -bquant_offset integer: bframe quantizer offset (100)\n"); fprintf(stderr, "\n"); fprintf(stderr, "Rate control options:\n"); fprintf(stderr, " -framerate float : target framerate (auto)\n"); fprintf(stderr, " -bitrate [integer] : target bitrate in kbps (700)\n"); fprintf(stderr, " -size integer : target size in kilobytes\n"); fprintf(stderr, " -single : single pass mode (default)\n"); fprintf(stderr, " -cq float : single pass constant quantizer\n"); fprintf(stderr, " -pass1 [filename] : twopass mode (first pass)\n"); fprintf(stderr, " -full1pass : perform full first pass\n"); fprintf(stderr, " -pass2 [filename] : twopass mode (2nd pass)\n"); fprintf(stderr, " -zq starting_frame float : bitrate zone; quant\n"); fprintf(stderr, " -zw starting_frame float : bitrate zone; weight\n"); fprintf(stderr, " -max_key_interval integer : maximum keyframe interval (300)\n"); fprintf(stderr, "\n"); fprintf(stderr, "Single Pass options:\n"); fprintf(stderr, "-reaction integer : reaction delay factor (16)\n"); fprintf(stderr, "-averaging integer : averaging period (100)\n"); fprintf(stderr, "-smoother integer : smoothing buffer (100)\n"); fprintf(stderr, "\n"); fprintf(stderr, "Second Pass options:\n"); fprintf(stderr, "-kboost integer : I frame boost (10)\n"); fprintf(stderr, "-kthresh integer : I frame reduction threshold (1)\n"); fprintf(stderr, "-kreduction integer : I frame reduction amount (20)\n"); fprintf(stderr, "-ostrength integer : overflow control strength (5)\n"); fprintf(stderr, "-oimprove integer : 
max overflow improvement (5)\n"); fprintf(stderr, "-odegrade integer : max overflow degradation (5)\n"); fprintf(stderr, "-chigh integer : high bitrate scenes degradation (0)\n"); fprintf(stderr, "-clow integer : low bitrate scenes improvement (0)\n"); fprintf(stderr, "-overhead integer : container frame overhead (0)\n"); fprintf(stderr, "-vbvsize integer : use vbv buffer size\n"); fprintf(stderr, "-vbvmax integer : vbv max bitrate\n"); fprintf(stderr, "-vbvpeak integer : vbv peak bitrate over 1 second\n"); fprintf(stderr, "\n"); fprintf(stderr, "Other options:\n"); fprintf(stderr, " -noasm : do not use assembly optimized code\n"); fprintf(stderr, " -turbo : use turbo presets for higher encoding speed\n"); fprintf(stderr, " -quality integer : quality ([0..%d]) (6)\n", ME_ELEMENTS - 1); fprintf(stderr, " -vhqmode integer : level of R-D optimizations ([0..4]) (1)\n"); fprintf(stderr, " -bvhq : use R-D optimizations for B-frames\n"); fprintf(stderr, " -metric integer : distortion metric for R-D opt (PSNR:0, PSNRHVSM: 1)\n"); fprintf(stderr, " -qpel : use quarter pixel ME\n"); fprintf(stderr, " -gmc : use global motion compensation\n"); fprintf(stderr, " -qtype integer : quantization type (H263:0, MPEG4:1) (0)\n"); fprintf(stderr, " -qmatrix filename : use custom MPEG4 quantization matrix\n"); fprintf(stderr, " -interlaced [integer] : interlaced encoding (BFF:1, TFF:2) (1)\n"); fprintf(stderr, " -nopacked : Disable packed mode\n"); fprintf(stderr, " -noclosed_gop : Disable closed GOP mode\n"); fprintf(stderr, " -masking [integer] : HVS masking mode (None:0, Lumi:1, Variance:2) (0)\n"); fprintf(stderr, " -stats : print stats about encoded frames\n"); fprintf(stderr, " -ssim [integer] : prints ssim for every frame (accurate: 0 fast: 4) (2)\n"); fprintf(stderr, " -ssim_file filename : outputs the ssim stats into a file\n"); fprintf(stderr, " -psnrhvsm : prints PSNRHVSM metric for every frame\n"); fprintf(stderr, " -debug : activates xvidcore internal debugging output\n"); 
fprintf(stderr, " -vop_debug : print some info directly into encoded frames\n"); fprintf(stderr, " -nochromame : Disable chroma motion estimation\n"); fprintf(stderr, " -notrellis : Disable trellis quantization\n"); fprintf(stderr, " -imin integer : Minimum I Quantizer (1..31) (2)\n"); fprintf(stderr, " -imax integer : Maximum I quantizer (1..31) (31)\n"); fprintf(stderr, " -bmin integer : Minimum B Quantizer (1..31) (2)\n"); fprintf(stderr, " -bmax integer : Maximum B quantizer (1..31) (31)\n"); fprintf(stderr, " -pmin integer : Minimum P Quantizer (1..31) (2)\n"); fprintf(stderr, " -pmax integer : Maximum P quantizer (1..31) (31)\n"); fprintf(stderr, " -drop integer : Frame Drop Ratio (0..100) (0)\n"); fprintf(stderr, " -start integer : Starting frame number\n"); fprintf(stderr, " -threads integer : Number of threads\n"); fprintf(stderr, " -slices integer : Number of slices\n"); fprintf(stderr, " -progress [integer] : Show progress updates every n frames (10)\n"); fprintf(stderr, " -par integer[:integer] : Set Pixel Aspect Ratio.\n"); fprintf(stderr, " 1 = 1:1\n"); fprintf(stderr, " 2 = 12:11 (4:3 PAL)\n"); fprintf(stderr, " 3 = 10:11 (4:3 NTSC)\n"); fprintf(stderr, " 4 = 16:11 (16:9 PAL)\n"); fprintf(stderr, " 5 = 40:33 (16:9 NTSC)\n"); fprintf(stderr, " other = custom (width:height)\n"); fprintf(stderr, " -help : prints this help message\n"); fprintf(stderr, "\n"); fprintf(stderr, "NB: You can define %d zones repeating the -z[qw] option as needed.\n", MAX_ZONES); } /***************************************************************************** * Input and output functions * * the are small and simple routines to read and write PGM and YUV * image. 
It's just for convenience, again nothing specific to Xvid * *****************************************************************************/ #ifndef READ_PNM static int read_pgmheader(FILE * handle) { int bytes, xsize, ysize, depth; char dummy[2]; bytes = (int) fread(dummy, 1, 2, handle); if ((bytes < 2) || (dummy[0] != 'P') || (dummy[1] != '5')) return (1); fscanf(handle, "%d %d %d", &xsize, &ysize, &depth); if ((xsize > 4096) || (ysize > 4096*3/2) || (depth != 255)) { fprintf(stderr, "%d %d %d\n", xsize, ysize, depth); return (2); } if ((XDIM == 0) || (YDIM == 0)) { XDIM = xsize; YDIM = ysize * 2 / 3; } return (0); } static int read_pgmdata(FILE * handle, unsigned char *image) { int i; char dummy; unsigned char *y = image; unsigned char *u = image + XDIM * YDIM; unsigned char *v = image + XDIM * YDIM + XDIM / 2 * YDIM / 2; /* read Y component of picture */ fread(y, 1, XDIM * YDIM, handle); for (i = 0; i < YDIM / 2; i++) { /* read U */ fread(u, 1, XDIM / 2, handle); /* read V */ fread(v, 1, XDIM / 2, handle); /* Update pointers */ u += XDIM / 2; v += XDIM / 2; } /* I don't know why, but this seems needed */ fread(&dummy, 1, 1, handle); return (0); } #else static int read_pnmheader(FILE * handle) { int bytes, xsize, ysize, depth; char dummy[2]; bytes = fread(dummy, 1, 2, handle); if ((bytes < 2) || (dummy[0] != 'P') || (dummy[1] != '6')) return (1); fscanf(handle, "%d %d %d", &xsize, &ysize, &depth); if ((xsize > 1440) || (ysize > 2880) || (depth != 255)) { fprintf(stderr, "%d %d %d\n", xsize, ysize, depth); return (2); } XDIM = xsize; YDIM = ysize; return (0); } static int read_pnmdata(FILE * handle, unsigned char *image) { int i; char dummy; /* read Y component of picture */ fread(image, 1, XDIM * YDIM * 3, handle); /* I don't know why, but this seems needed */ fread(&dummy, 1, 1, handle); return (0); } #endif static int read_yuvdata(FILE * handle, unsigned char *image) { if (fread(image, 1, IMAGE_SIZE(XDIM, YDIM), handle) != (unsigned int) IMAGE_SIZE(XDIM, YDIM)) 
		return (1);
	else
		return (0);
}

/*****************************************************************************
 *     Routines for encoding: init encoder, frame step, release encoder
 ****************************************************************************/

/* sample plugin */

int
rawenc_debug(void *handle,
			 int opt,
			 void *param1,
			 void *param2)
{
	switch (opt) {
	case XVID_PLG_INFO:
		{
			xvid_plg_info_t *info = (xvid_plg_info_t *) param1;

			info->flags = XVID_REQDQUANTS;
			return 0;
		}

	case XVID_PLG_CREATE:
	case XVID_PLG_DESTROY:
	case XVID_PLG_BEFORE:
		return 0;

	case XVID_PLG_AFTER:
		{
			xvid_plg_data_t *data = (xvid_plg_data_t *) param1;
			int i, j;

			printf("---[ frame: %5i quant: %2i length: %6i ]---\n",
				   data->frame_num, data->quant, data->length);
			for (j = 0; j < data->mb_height; j++) {
				for (i = 0; i < data->mb_width; i++)
					printf("%2i ", data->dquant[j * data->dquant_stride + i]);
				printf("\n");
			}

			return 0;
		}
	}

	return XVID_ERR_FAIL;
}

#define FRAMERATE_INCR 1001

/* Global encoder init, once per process */
void
enc_gbl(int use_assembler)
{
	xvid_gbl_init_t xvid_gbl_init;

	/*------------------------------------------------------------------------
	 * Xvid core initialization
	 *----------------------------------------------------------------------*/

	/* Set version -- version checking will be done by xvidcore */
	memset(&xvid_gbl_init, 0, sizeof(xvid_gbl_init));
	xvid_gbl_init.version = XVID_VERSION;
	xvid_gbl_init.debug = ARG_DEBUG;

	/* Do we have to enable ASM optimizations ?
*/ if (use_assembler) { #ifdef ARCH_IS_IA64 xvid_gbl_init.cpu_flags = XVID_CPU_FORCE | XVID_CPU_ASM; #else xvid_gbl_init.cpu_flags = 0; #endif } else { xvid_gbl_init.cpu_flags = XVID_CPU_FORCE; } /* Initialize Xvid core -- Should be done once per __process__ */ xvid_global(NULL, XVID_GBL_INIT, &xvid_gbl_init, NULL); ARG_CPU_FLAGS = xvid_gbl_init.cpu_flags; enc_info(); } /* Initialize encoder for first use, pass all needed parameters to the codec */ static int enc_init(void **enc_handle, char *stats_pass1, int start_num) { int xerr; //xvid_plugin_cbr_t cbr; xvid_plugin_single_t single; xvid_plugin_2pass1_t rc2pass1; xvid_plugin_2pass2_t rc2pass2; xvid_plugin_ssim_t ssim; xvid_plugin_lumimasking_t masking; //xvid_plugin_fixed_t rcfixed; xvid_enc_plugin_t plugins[8]; xvid_enc_create_t xvid_enc_create; int i; /*------------------------------------------------------------------------ * Xvid encoder initialization *----------------------------------------------------------------------*/ /* Version again */ memset(&xvid_enc_create, 0, sizeof(xvid_enc_create)); xvid_enc_create.version = XVID_VERSION; /* Width and Height of input frames */ xvid_enc_create.width = XDIM; xvid_enc_create.height = YDIM; xvid_enc_create.profile = 0xf5; /* Unrestricted */ /* init plugins */ // xvid_enc_create.zones = ZONES; // xvid_enc_create.num_zones = NUM_ZONES; xvid_enc_create.plugins = plugins; xvid_enc_create.num_plugins = 0; if (ARG_SINGLE) { memset(&single, 0, sizeof(xvid_plugin_single_t)); single.version = XVID_VERSION; single.bitrate = ARG_BITRATE; single.reaction_delay_factor = ARG_REACTION; single.averaging_period = ARG_AVERAGING; single.buffer = ARG_SMOOTHER; plugins[xvid_enc_create.num_plugins].func = xvid_plugin_single; plugins[xvid_enc_create.num_plugins].param = &single; xvid_enc_create.num_plugins++; if (!ARG_BITRATE) prepare_cquant_zones(); } if (ARG_PASS2) { memset(&rc2pass2, 0, sizeof(xvid_plugin_2pass2_t)); rc2pass2.version = XVID_VERSION; rc2pass2.filename = ARG_PASS2; 
rc2pass2.bitrate = ARG_BITRATE; rc2pass2.keyframe_boost = ARG_KBOOST; rc2pass2.curve_compression_high = ARG_CHIGH; rc2pass2.curve_compression_low = ARG_CLOW; rc2pass2.overflow_control_strength = ARG_OVERSTRENGTH; rc2pass2.max_overflow_improvement = ARG_OVERIMPROVE; rc2pass2.max_overflow_degradation = ARG_OVERDEGRADE; rc2pass2.kfreduction = ARG_KREDUCTION; rc2pass2.kfthreshold = ARG_KTHRESH; rc2pass2.container_frame_overhead = ARG_OVERHEAD; // An example of activating VBV could look like this rc2pass2.vbv_size = ARG_VBVSIZE; rc2pass2.vbv_initial = (ARG_VBVSIZE*3)/4; rc2pass2.vbv_maxrate = ARG_VBVMAXRATE; rc2pass2.vbv_peakrate = ARG_VBVPEAKRATE; plugins[xvid_enc_create.num_plugins].func = xvid_plugin_2pass2; plugins[xvid_enc_create.num_plugins].param = &rc2pass2; xvid_enc_create.num_plugins++; } if (stats_pass1) { memset(&rc2pass1, 0, sizeof(xvid_plugin_2pass1_t)); rc2pass1.version = XVID_VERSION; rc2pass1.filename = stats_pass1; if (ARG_FULL1PASS) prepare_full1pass_zones(); plugins[xvid_enc_create.num_plugins].func = xvid_plugin_2pass1; plugins[xvid_enc_create.num_plugins].param = &rc2pass1; xvid_enc_create.num_plugins++; } /* Zones stuff */ xvid_enc_create.zones = (xvid_enc_zone_t*)malloc(sizeof(xvid_enc_zone_t) * NUM_ZONES); xvid_enc_create.num_zones = NUM_ZONES; for (i=0; i < xvid_enc_create.num_zones; i++) { xvid_enc_create.zones[i].frame = ZONES[i].frame; xvid_enc_create.zones[i].base = 100; xvid_enc_create.zones[i].mode = ZONES[i].mode; xvid_enc_create.zones[i].increment = ZONES[i].modifier; } if (ARG_LUMIMASKING) { memset(&masking, 0, sizeof(xvid_plugin_lumimasking_t)); masking.method = (ARG_LUMIMASKING==2); plugins[xvid_enc_create.num_plugins].func = xvid_plugin_lumimasking; plugins[xvid_enc_create.num_plugins].param = &masking; xvid_enc_create.num_plugins++; } if (ARG_DUMP) { plugins[xvid_enc_create.num_plugins].func = xvid_plugin_dump; plugins[xvid_enc_create.num_plugins].param = NULL; xvid_enc_create.num_plugins++; } if (ARG_SSIM>=0 || ARG_SSIM_PATH != 
NULL) { memset(&ssim, 0, sizeof(xvid_plugin_ssim_t)); plugins[xvid_enc_create.num_plugins].func = xvid_plugin_ssim; if( ARG_SSIM >=0){ ssim.b_printstat = 1; ssim.acc = ARG_SSIM; } else { ssim.b_printstat = 0; ssim.acc = 2; } if(ARG_SSIM_PATH != NULL){ ssim.stat_path = ARG_SSIM_PATH; } ssim.cpu_flags = ARG_CPU_FLAGS; ssim.b_visualize = 0; plugins[xvid_enc_create.num_plugins].param = &ssim; xvid_enc_create.num_plugins++; } if (ARG_PSNRHVSM>0) { plugins[xvid_enc_create.num_plugins].func = xvid_plugin_psnrhvsm; plugins[xvid_enc_create.num_plugins].param = NULL; xvid_enc_create.num_plugins++; } #if 0 if (ARG_DEBUG) { plugins[xvid_enc_create.num_plugins].func = rawenc_debug; plugins[xvid_enc_create.num_plugins].param = NULL; xvid_enc_create.num_plugins++; } #endif xvid_enc_create.num_threads = ARG_THREADS; xvid_enc_create.num_slices = ARG_SLICES; /* Frame rate */ xvid_enc_create.fincr = ARG_DWSCALE; xvid_enc_create.fbase = ARG_DWRATE; /* Maximum key frame interval */ if (ARG_MAXKEYINTERVAL > 0) { xvid_enc_create.max_key_interval = ARG_MAXKEYINTERVAL; }else { xvid_enc_create.max_key_interval = (int) ARG_FRAMERATE *10; } xvid_enc_create.min_quant[0]=ARG_QUANTS[0]; xvid_enc_create.min_quant[1]=ARG_QUANTS[2]; xvid_enc_create.min_quant[2]=ARG_QUANTS[4]; xvid_enc_create.max_quant[0]=ARG_QUANTS[1]; xvid_enc_create.max_quant[1]=ARG_QUANTS[3]; xvid_enc_create.max_quant[2]=ARG_QUANTS[5]; /* Bframes settings */ xvid_enc_create.max_bframes = ARG_MAXBFRAMES; xvid_enc_create.bquant_ratio = ARG_BQRATIO; xvid_enc_create.bquant_offset = ARG_BQOFFSET; /* Frame drop ratio */ xvid_enc_create.frame_drop_ratio = ARG_FRAMEDROP; /* Start frame number */ xvid_enc_create.start_frame_num = start_num; /* Global encoder options */ xvid_enc_create.global = 0; if (ARG_PACKED) xvid_enc_create.global |= XVID_GLOBAL_PACKED; if (ARG_CLOSED_GOP) xvid_enc_create.global |= XVID_GLOBAL_CLOSED_GOP; if (ARG_STATS) xvid_enc_create.global |= XVID_GLOBAL_EXTRASTATS_ENABLE; /* I use a small value here, since will 
not encode whole movies, but short clips */ xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xvid_enc_create, NULL); /* Retrieve the encoder instance from the structure */ *enc_handle = xvid_enc_create.handle; free(xvid_enc_create.zones); return (xerr); } static int enc_info() { xvid_gbl_info_t xvid_gbl_info; int ret; memset(&xvid_gbl_info, 0, sizeof(xvid_gbl_info)); xvid_gbl_info.version = XVID_VERSION; ret = xvid_global(NULL, XVID_GBL_INFO, &xvid_gbl_info, NULL); if (xvid_gbl_info.build != NULL) { fprintf(stderr, "xvidcore build version: %s\n", xvid_gbl_info.build); } fprintf(stderr, "Bitstream version: %d.%d.%d\n", XVID_VERSION_MAJOR(xvid_gbl_info.actual_version), XVID_VERSION_MINOR(xvid_gbl_info.actual_version), XVID_VERSION_PATCH(xvid_gbl_info.actual_version)); fprintf(stderr, "Detected CPU flags: "); if (xvid_gbl_info.cpu_flags & XVID_CPU_ASM) fprintf(stderr, "ASM "); if (xvid_gbl_info.cpu_flags & XVID_CPU_MMX) fprintf(stderr, "MMX "); if (xvid_gbl_info.cpu_flags & XVID_CPU_MMXEXT) fprintf(stderr, "MMXEXT "); if (xvid_gbl_info.cpu_flags & XVID_CPU_SSE) fprintf(stderr, "SSE "); if (xvid_gbl_info.cpu_flags & XVID_CPU_SSE2) fprintf(stderr, "SSE2 "); if (xvid_gbl_info.cpu_flags & XVID_CPU_SSE3) fprintf(stderr, "SSE3 "); if (xvid_gbl_info.cpu_flags & XVID_CPU_SSE41) fprintf(stderr, "SSE41 "); if (xvid_gbl_info.cpu_flags & XVID_CPU_3DNOW) fprintf(stderr, "3DNOW "); if (xvid_gbl_info.cpu_flags & XVID_CPU_3DNOWEXT) fprintf(stderr, "3DNOWEXT "); if (xvid_gbl_info.cpu_flags & XVID_CPU_TSC) fprintf(stderr, "TSC "); fprintf(stderr, "\n"); fprintf(stderr, "Detected %d cpus,", xvid_gbl_info.num_threads); ARG_NUM_APP_THREADS = xvid_gbl_info.num_threads; fprintf(stderr, " using %d threads.\n", (!ARG_THREADS) ? 
ARG_NUM_APP_THREADS : ARG_THREADS); return ret; } static int enc_stop(void *enc_handle) { int xerr; /* Destroy the encoder instance */ xerr = xvid_encore(enc_handle, XVID_ENC_DESTROY, NULL, NULL); return (xerr); } static int enc_main(void *enc_handle, unsigned char *image, unsigned char *bitstream, int *key, int *stats_type, int *stats_quant, int *stats_length, int sse[3], int framenum) { int ret; xvid_enc_frame_t xvid_enc_frame; xvid_enc_stats_t xvid_enc_stats; /* Version for the frame and the stats */ memset(&xvid_enc_frame, 0, sizeof(xvid_enc_frame)); xvid_enc_frame.version = XVID_VERSION; memset(&xvid_enc_stats, 0, sizeof(xvid_enc_stats)); xvid_enc_stats.version = XVID_VERSION; /* Bind output buffer */ xvid_enc_frame.bitstream = bitstream; xvid_enc_frame.length = -1; /* Initialize input image fields */ if (image) { xvid_enc_frame.input.plane[0] = image; #ifndef READ_PNM xvid_enc_frame.input.csp = ARG_COLORSPACE; xvid_enc_frame.input.stride[0] = XDIM; #else xvid_enc_frame.input.csp = XVID_CSP_BGR; xvid_enc_frame.input.stride[0] = XDIM*3; #endif } else { xvid_enc_frame.input.csp = XVID_CSP_NULL; } /* Set up core's general features */ xvid_enc_frame.vol_flags = 0; if (ARG_STATS) xvid_enc_frame.vol_flags |= XVID_VOL_EXTRASTATS; if (ARG_QTYPE) { xvid_enc_frame.vol_flags |= XVID_VOL_MPEGQUANT; if (ARG_QMATRIX) { xvid_enc_frame.quant_intra_matrix = qmatrix_intra; xvid_enc_frame.quant_inter_matrix = qmatrix_inter; } else { /* We don't use special matrices */ xvid_enc_frame.quant_intra_matrix = NULL; xvid_enc_frame.quant_inter_matrix = NULL; } } if (ARG_PAR) xvid_enc_frame.par = ARG_PAR; else { xvid_enc_frame.par = XVID_PAR_EXT; xvid_enc_frame.par_width = ARG_PARWIDTH; xvid_enc_frame.par_height = ARG_PARHEIGHT; } if (ARG_QPEL) { xvid_enc_frame.vol_flags |= XVID_VOL_QUARTERPEL; xvid_enc_frame.motion |= XVID_ME_QUARTERPELREFINE16 | XVID_ME_QUARTERPELREFINE8; } if (ARG_GMC) { xvid_enc_frame.vol_flags |= XVID_VOL_GMC; xvid_enc_frame.motion |= XVID_ME_GME_REFINE; } /* Set up 
core's general features */ xvid_enc_frame.vop_flags = vop_presets[ARG_QUALITY]; if (ARG_INTERLACING) { xvid_enc_frame.vol_flags |= XVID_VOL_INTERLACING; if (ARG_INTERLACING == 2) xvid_enc_frame.vop_flags |= XVID_VOP_TOPFIELDFIRST; } xvid_enc_frame.vop_flags |= XVID_VOP_HALFPEL; xvid_enc_frame.vop_flags |= XVID_VOP_HQACPRED; if (ARG_VOPDEBUG) { xvid_enc_frame.vop_flags |= XVID_VOP_DEBUG; } if (ARG_TRELLIS) { xvid_enc_frame.vop_flags |= XVID_VOP_TRELLISQUANT; } /* Frame type -- taken from function call parameter */ /* Sometimes we might want to force the last frame to be a P Frame */ xvid_enc_frame.type = *stats_type; /* Force the right quantizer -- It is internally managed by RC plugins */ xvid_enc_frame.quant = 0; if (ARG_CHROMAME) xvid_enc_frame.motion |= XVID_ME_CHROMA_PVOP + XVID_ME_CHROMA_BVOP; /* Set up motion estimation flags */ xvid_enc_frame.motion |= motion_presets[ARG_QUALITY]; if (ARG_TURBO) xvid_enc_frame.motion |= XVID_ME_FASTREFINE16 | XVID_ME_FASTREFINE8 | XVID_ME_SKIP_DELTASEARCH | XVID_ME_FAST_MODEINTERPOLATE | XVID_ME_BFRAME_EARLYSTOP; if (ARG_BVHQ) xvid_enc_frame.vop_flags |= XVID_VOP_RD_BVOP; if (ARG_QMETRIC == 1) xvid_enc_frame.vop_flags |= XVID_VOP_RD_PSNRHVSM; switch (ARG_VHQMODE) /* this is the same code as for vfw */ { case 1: /* VHQ_MODE_DECISION */ xvid_enc_frame.vop_flags |= XVID_VOP_MODEDECISION_RD; break; case 2: /* VHQ_LIMITED_SEARCH */ xvid_enc_frame.vop_flags |= XVID_VOP_MODEDECISION_RD; xvid_enc_frame.motion |= XVID_ME_HALFPELREFINE16_RD; xvid_enc_frame.motion |= XVID_ME_QUARTERPELREFINE16_RD; break; case 3: /* VHQ_MEDIUM_SEARCH */ xvid_enc_frame.vop_flags |= XVID_VOP_MODEDECISION_RD; xvid_enc_frame.motion |= XVID_ME_HALFPELREFINE16_RD; xvid_enc_frame.motion |= XVID_ME_HALFPELREFINE8_RD; xvid_enc_frame.motion |= XVID_ME_QUARTERPELREFINE16_RD; xvid_enc_frame.motion |= XVID_ME_QUARTERPELREFINE8_RD; xvid_enc_frame.motion |= XVID_ME_CHECKPREDICTION_RD; break; case 4: /* VHQ_WIDE_SEARCH */ xvid_enc_frame.vop_flags |= 
XVID_VOP_MODEDECISION_RD; xvid_enc_frame.motion |= XVID_ME_HALFPELREFINE16_RD; xvid_enc_frame.motion |= XVID_ME_HALFPELREFINE8_RD; xvid_enc_frame.motion |= XVID_ME_QUARTERPELREFINE16_RD; xvid_enc_frame.motion |= XVID_ME_QUARTERPELREFINE8_RD; xvid_enc_frame.motion |= XVID_ME_CHECKPREDICTION_RD; xvid_enc_frame.motion |= XVID_ME_EXTSEARCH_RD; break; default : break; } /* Not sure what this does */ // force keyframe spacing in 2-pass 1st pass if (ARG_QUALITY == 0) xvid_enc_frame.type = XVID_TYPE_IVOP; /* frame-based stuff */ apply_zone_modifiers(&xvid_enc_frame, framenum); /* Encode the frame */ ret = xvid_encore(enc_handle, XVID_ENC_ENCODE, &xvid_enc_frame, &xvid_enc_stats); *key = (xvid_enc_frame.out_flags & XVID_KEYFRAME); *stats_type = xvid_enc_stats.type; *stats_quant = xvid_enc_stats.quant; *stats_length = xvid_enc_stats.length; sse[0] = xvid_enc_stats.sse_y; sse[1] = xvid_enc_stats.sse_u; sse[2] = xvid_enc_stats.sse_v; return (ret); } void sort_zones(zone_t * zones, int zone_num, int * sel) { int i, j; zone_t tmp; for (i = 0; i < zone_num; i++) { int cur = i; int min_f = zones[i].frame; for (j = i + 1; j < zone_num; j++) { if (zones[j].frame < min_f) { min_f = zones[j].frame; cur = j; } } if (cur != i) { tmp = zones[i]; zones[i] = zones[cur]; zones[cur] = tmp; if (i == *sel) *sel = cur; else if (cur == *sel) *sel = i; } } } /* constant-quant zones for fixed quant encoding */ static void prepare_cquant_zones() { int i = 0; if (NUM_ZONES == 0 || ZONES[0].frame != 0) { /* first zone does not start at frame 0 or doesn't exist */ if (NUM_ZONES >= MAX_ZONES) NUM_ZONES--; /* we sacrifice last zone */ ZONES[NUM_ZONES].frame = 0; ZONES[NUM_ZONES].mode = XVID_ZONE_QUANT; ZONES[NUM_ZONES].modifier = (int) ARG_CQ; ZONES[NUM_ZONES].type = XVID_TYPE_AUTO; ZONES[NUM_ZONES].greyscale = 0; ZONES[NUM_ZONES].chroma_opt = 0; ZONES[NUM_ZONES].bvop_threshold = 0; ZONES[NUM_ZONES].cartoon_mode = 0; NUM_ZONES++; sort_zones(ZONES, NUM_ZONES, &i); } /* step 2: let's change all weight 
zones into quant zones */
	for(i = 0; i < NUM_ZONES; i++)
		if (ZONES[i].mode == XVID_ZONE_WEIGHT) {
			ZONES[i].mode = XVID_ZONE_QUANT;
			ZONES[i].modifier = (int) ((100*ARG_CQ) / ZONES[i].modifier);
		}
}

/* full first pass zones */
static void
prepare_full1pass_zones()
{
	int i = 0;

	if (NUM_ZONES == 0 || ZONES[0].frame != 0) {
		/* first zone does not start at frame 0 or doesn't exist */

		if (NUM_ZONES >= MAX_ZONES) NUM_ZONES--; /* we sacrifice last zone */

		ZONES[NUM_ZONES].frame = 0;
		ZONES[NUM_ZONES].mode = XVID_ZONE_QUANT;
		ZONES[NUM_ZONES].modifier = 200;
		ZONES[NUM_ZONES].type = XVID_TYPE_AUTO;
		ZONES[NUM_ZONES].greyscale = 0;
		ZONES[NUM_ZONES].chroma_opt = 0;
		ZONES[NUM_ZONES].bvop_threshold = 0;
		ZONES[NUM_ZONES].cartoon_mode = 0;
		NUM_ZONES++;

		sort_zones(ZONES, NUM_ZONES, &i);
	}

	/* step 2: let's change all weight zones into quant zones */
	for(i = 0; i < NUM_ZONES; i++)
		if (ZONES[i].mode == XVID_ZONE_WEIGHT) {
			ZONES[i].mode = XVID_ZONE_QUANT;
			ZONES[i].modifier = 200;
		}
}

static void
apply_zone_modifiers(xvid_enc_frame_t * frame, int framenum)
{
	int i;

	/* find the last zone that starts at or before this frame */
	for (i=0; i<NUM_ZONES && ZONES[i].frame<=framenum; i++) ;

	if (--i < 0) return; /* there are no zones, or we're before the first zone */

	if (framenum == ZONES[i].frame)
		frame->type = ZONES[i].type;

	if (ZONES[i].greyscale) {
		frame->vop_flags |= XVID_VOP_GREYSCALE;
	}

	if (ZONES[i].chroma_opt) {
		frame->vop_flags |= XVID_VOP_CHROMAOPT;
	}

	if (ZONES[i].cartoon_mode) {
		frame->vop_flags |= XVID_VOP_CARTOON;
		frame->motion |= XVID_ME_DETECT_STATIC_MOTION;
	}

	if (ARG_MAXBFRAMES) {
		frame->bframe_threshold = ZONES[i].bvop_threshold;
	}
}

void removedivxp(char *buf, int bufsize) {
	int i;
	char* userdata;

	for (i=0; i <= (int)(bufsize-sizeof(userdata_start_code)); i++) {
		if (memcmp((void*)userdata_start_code, (void*)(buf+i), strlen(userdata_start_code))==0) {
			if ((userdata = strstr(buf+i+4, "DivX"))!=NULL) {
				userdata[strlen(userdata)-1] = '\0';
				return;
			}
		}
	}
}

xvidcore/examples/bench.pl0000775000076500007650000001620210244343452016722 0ustar xvidbuildxvidbuild
#!/usr/bin/perl
###############################################################################
#
# XVID MPEG-4 VIDEO CODEC
# - Unit tests and benches -
#
# Copyright(C) 2005 Pascal Massimino
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
# ***************************************************************************/
#
# Automated bench script.
#
# 1st draft: Skal / April 24th 2005
#
######################################################

$enc_bin = "xvid_encraw";
$dec_bin = "xvid_decraw";
$bench_bin = "xvid_bench";

if (-r "BENCH_CONF.pl") { require "BENCH_CONF.pl"; }
else {
  $bin_dir = ".";
  $log_dir = ".";
  $data_dir = ".";
  $bench_list = "./bench_list.pl";
}
require $bench_list;

#########################################

$arch = "";
$verbose = 0;
$use_valgrind = 0;
$output_command_only = 0;
$extra_arg = "";

#$log_name;
#$log_file;
#$input_dir;
#$bin;
#$bitstream;
#$refstream;
#@raw_options;
#@ToDo;
#$filter

#########################################
# helper funcs
#########################################

sub check_bin {
  # force re-build of binary (better safe than sorry)
  my_system( "rm -f $_[0]" );
  my_system( "make $_[0]" );
}

sub check_file {
  if (-r $_[0]) {
    my_system( "mv $_[0] $_[0]\_OLD" );
    my_system( "touch $_[0]" );
  }
}

sub setup {
  my $action = $_[0];
  $log_name = $_[1];
  $log_file = "$log_dir/$log_name";
  $input_dir = "$data_dir/$_[2]";
  $bin = $_[3];

  check_bin $bin;
  check_file $log_file;

  printf "\nBench ...... $action\n";
  printf "Binary ..... $bin $extra_arg\n";
  printf "Log ........ $log_name\n";
  printf "Data dir ... $input_dir\n\n";

  $bin = "$bin_dir/$bin $extra_arg";
  if ($use_valgrind) {
    $bin = "valgrind --tool=memcheck $bin";
    $arch = "";
  }
  if ($arch ne "") { $bin .= "-$arch"; }
}

sub parse_bench {
  @raw_options = split;
  $bitstream = $raw_options[0];
  if (defined($bench_filter) && !($bitstream =~ /(.*)$bench_filter(.*)/)) {
    printf "Filtering out bitstream [$bitstream]\n" if ($verbose>1);
    return 0;
  }
  shift @raw_options;
  printf "Checking bitstream: $bitstream [@raw_options]\n" if ($verbose>0);
  return 1;
}

#########################################
# debug
#########################################

sub my_system {
  if ($output_command_only) { printf "system: $_[0]\n"; }
  else { system($_[0]); }
  return ($? >> 8);
}

#########################################
## decoding benches
#########################################

sub Do_Dec_Benches {
  setup( "decoding", "DEC_LOG", "./data_dec", $bench_bin );

  # declare both counters lexically ('my $n = 0, $Err = 0' only scopes $n)
  my ($n, $Err) = (0, 0);
  foreach (@Dec_Benches) {
    if (parse_bench($_, 0)) {
      $n++;
      my $launch = "$bin 9 $input_dir/$bitstream @raw_options";
      my $output = `$launch`;
      if (open( LOG_FILE, ">$log_file" )) { print LOG_FILE $output; close LOG_FILE; }
      else { printf "can't open decoding bench log file '$log_file'\n"; };
      if ($output =~ /ERROR/) {
        print "ERROR detected in output, while decoding [$bitstream]:\n $output\n";
        $Err++;
        next;
      }
      if ($output =~ /FPS:(.*) Checksum*/) { print "[$bitstream]: $1 fps\n"; }
    }
  }
  die "*** $Err error(s) detected!!\n" if $Err;
}

#########################################
## Core benches
#########################################

sub Do_Core_Benches {
  setup( "Core benches", "CORE_BENCH_LOG", ".", $bench_bin );
  my $output = `$bin`;
  if (open( LOG_FILE, ">$log_file" )) { print LOG_FILE $output; close LOG_FILE; }
  else { printf "can't open core bench log file '$log_file'\n"; };

  my $warning = 0;
  foreach (split('\n', $output)) {
    if (/ERROR/) {
      print "ERROR detected in output: [$_]\n";
      if (/quant_mpeg/) {
        if (!$warning) {
          print "\n";
          print "NB: MMX mpeg4 quantization is known to have very small errors (+/-1 magnitude)\n";
          print "for 1 or 2 coefficients a block. This is mainly caused by the fact the unit\n";
          print "test goes far beyond the usual limits of real encoding. Please do not report\n";
          print "this error to the developers.\n";
          print "\n";
          $warning = 1;
        }
      }
    }
  }
}

#########################################
## Help
#########################################

sub Do_Help {
  printf "\n -= Options =-\n\n";
  printf "-h ................... this help\n";
  printf "-v ................... Verbose++\n";
  printf "-n ................... Check 'system' commands (no action performed)\n";
  printf "-vlg ................. Use valgrind\n";
  printf "-dec ................. perform decoding benches.\n";
  printf "-core ................ perform core benches (using 'xvid_bench').\n";
  printf "-all ................. perform all benches\n";
  printf "-cpu <cpu> ........... CPU to select (one of [c|mmx|mmxext|sse2|3dnow|3dnowe|altivec]).\n";
  printf "-extra <arg> ......... Append extra argument 'arg' to binary commands.\n";
  printf "\n";
  exit;
}

#########################################
## main ##
#########################################

while(@ARGV) {
  my $command = shift @ARGV;
  if ($command eq "-h") { Do_Help; }
  elsif ($command eq "-v") { $verbose++; }
  elsif ($command eq "-n") { $output_command_only = 1; }
  elsif ($command eq "-vlg") { $use_valgrind = 1; }
  elsif ($command eq "-dec") { push @ToDo, $command; }
  elsif ($command eq "-core") { push @ToDo, $command; }
  elsif ($command eq "-all") { push @ToDo, ("-dec", "-core"); }
  elsif ($command eq "-cpu") {
    die "missing argument after $command\n" if (!defined($ARGV[0]));
    if ($ARGV[0] eq "c" || $ARGV[0] eq "mmx" || $ARGV[0] eq "mmxext" ||
        $ARGV[0] eq "sse2" || $ARGV[0] eq "3dnow" || $ARGV[0] eq "3dnowe" ||
        $ARGV[0] eq "altivec") {
      $arch = shift @ARGV;
    }
    else { die "Unrecognized cpu option '$ARGV[0]'.\n"; }
  }
  elsif ($command eq "-extra") {
    die "missing argument after $command\n" if (!defined($ARGV[0]));
    $extra_arg = "$extra_arg $ARGV[0]";
    shift @ARGV;
  }
  elsif ($command =~ /^\-/) {
    printf "Unrecognized option [$command]\n";
    Do_Help;
  }
  else {
    $bench_filter = $command;
  }
}

if (@ToDo==0) { push @ToDo, "help"; }

printf "Filtering bench name with [$bench_filter]\n" if ($verbose>2 && defined($bench_filter));

foreach (@ToDo) {
  if ($_ eq "help" ) { Do_Help; }
  elsif ($_ eq "-dec") { Do_Dec_Benches; }
  elsif ($_ eq "-core") { Do_Core_Benches; }
}

#########################################

xvidcore/examples/Makefile0000664000076500007650000000165210513173406016745 0ustar xvidbuildxvidbuild
#############################################################################
#
# XviD examples Makefile
#
# $Id: Makefile,v 1.10 2006-10-11 13:52:06 Skal Exp $
#
#############################################################################

include ../build/generic/platform.inc

HDIR = -I../src
CFLAGS = -g $(ARCHITECTURE) $(BUS) $(ENDIANNESS) $(FEATURES) $(SPECIFIC_CFLAGS)
LDFLAGS = -lc -lm \
	../build/generic/=build/$(STATIC_LIB) -lpthread

SOURCES= xvid_encraw.c xvid_decraw.c xvid_bench.c
OBJECTS=$(SOURCES:.c=.o)
TESTS=$(SOURCES:.c=)

all: $(TESTS)

xvid_encraw: xvid_encraw.o
	$(CC) -o $@ $< $(LDFLAGS)

xvid_encraw.o: xvid_encraw.c
	$(CC) $(CFLAGS) $(HDIR) -c $<

xvid_decraw: xvid_decraw.o
	$(CC) -o $@ $< $(LDFLAGS)

xvid_decraw.o: xvid_decraw.c
	$(CC) $(CFLAGS) $(HDIR) -c $<

xvid_bench: xvid_bench.o
	$(CC) -o $@ $< $(LDFLAGS)

xvid_bench.o: xvid_bench.c
	$(CC) $(CFLAGS) $(HDIR) -c $<

clean:
	rm -f $(OBJECTS) $(TESTS)

xvidcore/examples/README0000664000076500007650000000510211506410657016164 0ustar xvidbuildxvidbuild
+--------------------------------------------------------------------+
|                       xvidcore lib examples                        |
+--------------------------------------------------------------------+

In this directory you can find some examples of how to use the Xvid MPEG4
codec in your own programs.

** cactus.pgm.bz2
----------------------------------------------------------------------

This is a test sequence of 3 images with a cactus moving from right to
left. It is bzip2-compressed for size reasons (half the size of a
ZIP-file). Binaries of bunzip2 are available for all major OSes at
http://sources.redhat.com/bzip2/

The original source of the cactus image is unknown...

** xvid_encraw.c
----------------------------------------------------------------------

This is a small example that allows you to encode YUV streams or PGM
files into an MPEG4 stream. It can output single files (one per encoded
frame), or one file for the whole encoded stream (m4v format or a simple
container format that we called mp4u; its description can be found at
the end of this file).

This program also outputs some very basic timing results.

Type "xvid_encraw -help" to see a description of all options.

Examples :

1) bzip2 -dc cactus.pgm.bz2 | ./xvid_encraw -type 1

   This command decompresses cactus.pgm.bz2 and pipes the PGM file to
   xvid_encraw, which compresses it to MPEG4 format. No MPEG4 stream
   output is written to disk.

2) ./xvid_encraw -type 1 -i cactus.pgm -save

   Compresses the cactus.pgm frames into an MPEG4 stream, and then writes
   one m4v file per encoded frame.

3) ./xvid_encraw -type 1 -i cactus.pgm -o my_xvid_example.m4v -stats

   Same thing as above, but saves all raw m4v data to a single file and
   displays YUV-plane PSNR statistics on stdout.

** xvid_decraw.c
----------------------------------------------------------------------

This is a decoder example that is able to decode an m4v or mp4u stream.
You can use it to decode what xvid_encraw encoded.

Type "xvid_decraw -help" to see a description of all options.

Examples :

1) ./xvid_decraw -i stream.m4v -d

   This command decodes an m4v file from stream.m4v and saves all decoder
   output frames to individual PGM files (framexxxxx.pgm).

2) cat stream.m4v | ./xvid_decraw

   This example decodes an m4v stream from standard input, but does not
   save any decoded frames.

** xvid_bench.c
----------------------------------------------------------------------

This is a tool to conduct unit testing and profiling of the signal
processing functions used internally within libxvidcore.

xvidcore/LICENSE0000664000076500007650000004310307623455453014506 0ustar xvidbuildxvidbuild
		    GNU GENERAL PUBLIC LICENSE
		       Version 2, June 1991

 Copyright (C) 1989, 1991 Free Software Foundation, Inc.
 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

			    Preamble

  The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.)
You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. 
GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. 
You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. 
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. 
However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. 
If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. 
If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this when it starts in an interactive mode:

    Gnomovision version 69, Copyright (C) year name of author
    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names:

    Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker.

    <signature of Ty Coon>, 1 April 1989
    Ty Coon, President of Vice

This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License.
xvidcore/TODO0000664000076500007650000000430511506140503014150 0ustar xvidbuildxvidbuild
TODO
====

This file lists the TODO items

Outstanding items:
------------------

xvidcore
  * update/fix CBR plugin
    - misses target bitrate, bitrate burst in static motion/high motion
      transitions
  * parallel slice decoding (pre-parse for resync marker boundaries)
  * filter/deblock reference frames before ME ("true motion")
  * PSNR-HVS-M adaptive quantization

examples
  * profile/level support within xvid_encraw

vfw
  * integrated Packed<->ISO converter
  * vfw-ext api to get/set configuration parameters
  * ICM_DECOMPRESSEX_* support
  * warn user before overwriting .pass file
  * improve ergonomics of user interface
  * user settings management

directshow
  * option to display libxvidcore version and bitstream on top of video,
    hopefully using smooth fonts, not image_printf().

Completed items:
----------------

  * clusterable two-pass coding
  * multi-threaded decoder deblocking
  * parallel slice coding
  * multi-threaded motion estimation
  * manual aspect ratio setting (1:1, 4:3, 16:9, Custom)
  * MMX MPEG4 quantization precision.
  * sse3/sse4 SIMD optimizations.
  * x86_64 optimizations for xvidcore.
  * remove divx4 api (ed.gomez)
  * remove VOP_TYPE enumerations (peter)
  * remove HINTed ME stuff (ed.gomez)
  * xvid_image_t/xvid_gbl_convert_t (peter)
  * xvid_global structs (peter)
  * errors codes (peter)
  * xvid_decoder structs (peter)
  * apply encoder api changes "HEAPS" (peter)
  * rawdec (use xvid_decraw instead) (ed.gomez)
  * Support for GMC 3 warp points (christoph)
  * New Qpel code (michael)
  * ME splitting and ME improvements (syskin)
  * New unix build process (ed.gomez)
  * Move/clean/enhance 2pass code from vfw to core (ed.gomez)
  * New thread/instance safe sse2 code (p.massimino)
  * INSTALL guide for Unix and Win32 (ed.gomez)
  * dshow static link to libxvidcore.lib (peter)
  * update/fix Lumimasking (syskin)
  * trellis for mpeg and relaxed optimization for big levels (skal)
  * thread safe mpeg quantizing (michael)
  * Interlacing for bvop and svop (syskin)
  * YV12/I420/USER clarification (christoph)
  * vfw and dshow link dynamically to xvidcore.dll (syskin)
  * vfw bitrate calculator (peter)
  * dshow configure from command line (peter)
  * bug hunting (ed.gomez/syskin)
  * video buffer verifier (christoph)

Last edited: $Date: 2010-12-27 16:39:31 $
xvidcore/src/0000775000076500007650000000000011566427763014273 5ustar xvidbuildxvidbuild
xvidcore/src/xvid.h0000664000076500007650000010315711564705453015416 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 *  XVID MPEG-4 VIDEO CODEC
 *  - Xvid Main header file -
 *
 *  Copyright(C) 2001-2011 Peter Ross
 *
 *  This program is free software ; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation ; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY ; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program ; if not, write to the Free Software
 *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: xvid.h 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#ifndef _XVID_H_
#define _XVID_H_

#ifdef __cplusplus
extern "C" {
#endif

/*****************************************************************************
 * versioning
 ****************************************************************************/

/* versioning
    version takes the form "$major.$minor.$patch"
    $patch is incremented when there is no api change
    $minor is incremented when the api is changed, but remains backwards compatible
    $major is incremented when the api is changed significantly

    when initialising an xvid structure, you must always zero it, and set the version field.
        memset(&struct,0,sizeof(struct));
        struct.version = XVID_VERSION;

    XVID_UNSTABLE is defined only during development.
    */

#define XVID_MAKE_VERSION(a,b,c) ((((a)&0xff)<<16) | (((b)&0xff)<<8) | ((c)&0xff))
#define XVID_VERSION_MAJOR(a)    ((char)(((a)>>16) & 0xff))
#define XVID_VERSION_MINOR(a)    ((char)(((a)>> 8) & 0xff))
#define XVID_VERSION_PATCH(a)    ((char)(((a)>> 0) & 0xff))

#define XVID_MAKE_API(a,b)       ((((a)&0xff)<<16) | (((b)&0xff)<<0))
#define XVID_API_MAJOR(a)        (((a)>>16) & 0xff)
#define XVID_API_MINOR(a)        (((a)>> 0) & 0xff)

#define XVID_VERSION             XVID_MAKE_VERSION(1,3,2)
#define XVID_API                 XVID_MAKE_API(4, 3)

/* Bitstream Version
 * this will be written into the bitstream to allow easy detection of xvid
 * encoder bugs in the decoder, without this it might not be possible to
 * automatically distinguish between a file which has been encoded with an
 * old & buggy XVID from a file which has been encoded with a bugfree version
 * see the infamous interlacing bug ...
 *
 * this MUST be increased if an encoder bug is fixed, increasing it too often
 * doesn't hurt but not increasing it could cause difficulty for decoders in the
 * future
 */
#define XVID_BS_VERSION 64

/*****************************************************************************
 * error codes
 ****************************************************************************/

/* all functions return values < 0 to indicate an error */
#define XVID_ERR_FAIL      -1  /* general fault */
#define XVID_ERR_MEMORY    -2  /* memory allocation error */
#define XVID_ERR_FORMAT    -3  /* file format error */
#define XVID_ERR_VERSION   -4  /* structure version not supported */
#define XVID_ERR_END       -5  /* encoder only; end of stream reached */

/*****************************************************************************
 * xvid_image_t
 ****************************************************************************/

/* colorspace values */
#define XVID_CSP_PLANAR   (1<< 0) /* 4:2:0 planar (==I420, except for pointers/strides) */
#define XVID_CSP_USER     XVID_CSP_PLANAR
#define XVID_CSP_I420     (1<< 1) /* 4:2:0 planar */
#define XVID_CSP_YV12     (1<< 2) /* 4:2:0 planar */
#define XVID_CSP_YUY2     (1<< 3) /* 4:2:2 packed */
#define XVID_CSP_UYVY     (1<< 4) /* 4:2:2 packed */
#define XVID_CSP_YVYU     (1<< 5) /* 4:2:2 packed */
#define XVID_CSP_RGB      (1<<16) /* 24-bit rgb packed */
#define XVID_CSP_BGRA     (1<< 6) /* 32-bit bgra packed */
#define XVID_CSP_ABGR     (1<< 7) /* 32-bit abgr packed */
#define XVID_CSP_RGBA     (1<< 8) /* 32-bit rgba packed */
#define XVID_CSP_ARGB     (1<<15) /* 32-bit argb packed */
#define XVID_CSP_BGR      (1<< 9) /* 24-bit bgr packed */
#define XVID_CSP_RGB555   (1<<10) /* 16-bit rgb555 packed */
#define XVID_CSP_RGB565   (1<<11) /* 16-bit rgb565 packed */
#define XVID_CSP_SLICE    (1<<12) /* decoder only: 4:2:0 planar, per slice rendering */
#define XVID_CSP_INTERNAL (1<<13) /* decoder only: 4:2:0 planar, returns ptrs to internal buffers */
#define XVID_CSP_NULL     (1<<14) /* decoder only: don't output anything */
#define XVID_CSP_VFLIP    (1<<31) /* vertical flip mask */

/* xvid_image_t
    for non-planar colorspaces use only plane[0] and stride[0]
    fourth plane reserved for alpha */
typedef struct {
    int csp;          /* [in] colorspace; or with XVID_CSP_VFLIP to perform vertical flip */
    void * plane[4];  /* [in] image plane ptrs */
    int stride[4];    /* [in] image stride; "bytes per row" */
} xvid_image_t;

/* video-object-sequence profiles */
#define XVID_PROFILE_S_L0    0x08 /* simple */
#define XVID_PROFILE_S_L1    0x01
#define XVID_PROFILE_S_L2    0x02
#define XVID_PROFILE_S_L3    0x03
#define XVID_PROFILE_S_L4a   0x04
#define XVID_PROFILE_S_L5    0x05
#define XVID_PROFILE_S_L6    0x06
#define XVID_PROFILE_ARTS_L1 0x91 /* advanced realtime simple */
#define XVID_PROFILE_ARTS_L2 0x92
#define XVID_PROFILE_ARTS_L3 0x93
#define XVID_PROFILE_ARTS_L4 0x94
#define XVID_PROFILE_AS_L0   0xf0 /* advanced simple */
#define XVID_PROFILE_AS_L1   0xf1
#define XVID_PROFILE_AS_L2   0xf2
#define XVID_PROFILE_AS_L3   0xf3
#define XVID_PROFILE_AS_L4   0xf4

/* aspect ratios */
#define XVID_PAR_11_VGA    1 /* 1:1 vga (square), default if supplied PAR is not a valid value */
#define XVID_PAR_43_PAL    2 /* 4:3 pal (12:11 625-line) */
#define XVID_PAR_43_NTSC   3 /* 4:3 ntsc (10:11 525-line) */
#define XVID_PAR_169_PAL   4 /* 16:9 pal (16:11 625-line) */
#define XVID_PAR_169_NTSC  5 /* 16:9 ntsc (40:33 525-line) */
#define XVID_PAR_EXT      15 /* extended par; use par_width, par_height */

/* frame type flags */
#define XVID_TYPE_VOL     -1 /* decoder only: vol was decoded */
#define XVID_TYPE_NOTHING  0 /* decoder only (encoder stats): nothing was decoded/encoded */
#define XVID_TYPE_AUTO     0 /* encoder: automatically determine coding type */
#define XVID_TYPE_IVOP     1 /* intra frame */
#define XVID_TYPE_PVOP     2 /* predicted frame */
#define XVID_TYPE_BVOP     3 /* bidirectionally encoded */
#define XVID_TYPE_SVOP     4 /* predicted+sprite frame */

/*****************************************************************************
 * xvid_global()
 ****************************************************************************/

/* cpu_flags definitions (make sure to sync this with cpuid.asm for ia32) */

#define XVID_CPU_FORCE    (1<<31) /* force passed cpu flags */
#define XVID_CPU_ASM      (1<< 7) /* native assembly */
/* ARCH_IS_IA32 */
#define XVID_CPU_MMX      (1<< 0) /* mmx : pentiumMMX,k6 */
#define XVID_CPU_MMXEXT   (1<< 1) /* mmx-ext : pentium2, athlon */
#define XVID_CPU_SSE      (1<< 2) /* sse : pentium3, athlonXP */
#define XVID_CPU_SSE2     (1<< 3) /* sse2 : pentium4, athlon64 */
#define XVID_CPU_SSE3     (1<< 8) /* sse3 : pentium4, athlon64 */
#define XVID_CPU_SSE41    (1<< 9) /* sse41: penryn */
#define XVID_CPU_3DNOW    (1<< 4) /* 3dnow : k6-2 */
#define XVID_CPU_3DNOWEXT (1<< 5) /* 3dnow-ext : athlon */
#define XVID_CPU_TSC      (1<< 6) /* tsc : Pentium */
/* ARCH_IS_PPC */
#define XVID_CPU_ALTIVEC  (1<< 0) /* altivec */

#define XVID_DEBUG_ERROR     (1<< 0)
#define XVID_DEBUG_STARTCODE (1<< 1)
#define XVID_DEBUG_HEADER    (1<< 2)
#define XVID_DEBUG_TIMECODE  (1<< 3)
#define XVID_DEBUG_MB        (1<< 4)
#define XVID_DEBUG_COEFF     (1<< 5)
#define XVID_DEBUG_MV        (1<< 6)
#define XVID_DEBUG_RC        (1<< 7)
#define XVID_DEBUG_DEBUG     (1<<31)

/* XVID_GBL_INIT param1 */
typedef struct {
    int version;
    unsigned int cpu_flags; /* [in:opt] zero = autodetect cpu; XVID_CPU_FORCE|{cpu features} = force cpu features */
    int debug;              /* [in:opt] debug level */
} xvid_gbl_init_t;

/* XVID_GBL_INFO param1 */
typedef struct {
    int version;
    int actual_version;     /* [out] returns the actual xvidcore version */
    const char * build;     /* [out] if !null, points to description of this xvid core build */
    unsigned int cpu_flags; /* [out] detected cpu features */
    int num_threads;        /* [out] detected number of cpus/threads */
} xvid_gbl_info_t;

/* XVID_GBL_CONVERT param1 */
typedef struct {
    int version;
    xvid_image_t input;     /* [in] input image & colorspace */
    xvid_image_t output;    /* [in] output image & colorspace */
    int width;              /* [in] width */
    int height;             /* [in] height */
    int interlacing;        /* [in]
interlacing */
} xvid_gbl_convert_t;

#define XVID_GBL_INIT    0 /* initialize xvidcore (must be called before using xvid_decore or xvid_encore) */
#define XVID_GBL_INFO    1 /* return some info about xvidcore, and the host computer */
#define XVID_GBL_CONVERT 2 /* colorspace conversion utility */

extern int xvid_global(void *handle, int opt, void *param1, void *param2);

/*****************************************************************************
 * xvid_decore()
 ****************************************************************************/

#define XVID_DEC_CREATE  0 /* create decore instance; return 0 on success */
#define XVID_DEC_DESTROY 1 /* destroy decore instance: return 0 on success */
#define XVID_DEC_DECODE  2 /* decode a frame: returns number of bytes consumed >= 0 */

extern int xvid_decore(void *handle, int opt, void *param1, void *param2);

/* XVID_DEC_CREATE param 1
    image width & height as well as FourCC code may be specified
    here when known in advance (e.g. being read from container) */
typedef struct {
    int version;
    int width;       /* [in:opt] image width */
    int height;      /* [in:opt] image height */
    void * handle;   /* [out] decore context handle */
/* ------- v1.3.x ------- */
    int fourcc;      /* [in:opt] fourcc of the input video */
    int num_threads; /* [in:opt] number of threads to use in decoder */
} xvid_dec_create_t;

/* XVID_DEC_DECODE param1 */
/* general flags */
#define XVID_LOWDELAY      (1<<0) /* lowdelay mode */
#define XVID_DISCONTINUITY (1<<1) /* indicates break in stream */
#define XVID_DEBLOCKY      (1<<2) /* perform luma deblocking */
#define XVID_DEBLOCKUV     (1<<3) /* perform chroma deblocking */
#define XVID_FILMEFFECT    (1<<4) /* adds film grain */
#define XVID_DERINGUV      (1<<5) /* perform chroma deringing, requires deblocking to work */
#define XVID_DERINGY       (1<<6) /* perform luma deringing, requires deblocking to work */

#define XVID_DEC_FAST      (1<<29) /* disable postprocessing to decrease cpu usage *todo* */
#define XVID_DEC_DROP      (1<<30) /* drop bframes to decrease cpu
usage *todo* */
#define XVID_DEC_PREROLL   (1<<31) /* decode as fast as you can, don't even show output *todo* */

typedef struct {
    int version;
    int general;         /* [in:opt] general flags */
    void *bitstream;     /* [in] bitstream (read from) */
    int length;          /* [in] bitstream length */
    xvid_image_t output; /* [in] output image (written to) */
/* ------- v1.1.x ------- */
    int brightness;      /* [in] brightness offset (0=none) */
} xvid_dec_frame_t;

/* XVID_DEC_DECODE param2 :: optional */
typedef struct {
    int version;

    int type;                   /* [out] output data type */
    union {
        struct { /* type>0 {XVID_TYPE_IVOP,XVID_TYPE_PVOP,XVID_TYPE_BVOP,XVID_TYPE_SVOP} */
            int general;        /* [out] flags */
            int time_base;      /* [out] time base */
            int time_increment; /* [out] time increment */

            /* XXX: external deblocking stuff */
            int * qscale;       /* [out] pointer to quantizer table */
            int qscale_stride;  /* [out] quantizer scale stride */
        } vop;
        struct { /* XVID_TYPE_VOL */
            int general;        /* [out] flags */
            int width;          /* [out] width */
            int height;         /* [out] height */
            int par;            /* [out] pixel aspect ratio (refer to XVID_PAR_xxx above) */
            int par_width;      /* [out] aspect ratio width [1..255] */
            int par_height;     /* [out] aspect ratio height [1..255] */
        } vol;
    } data;
} xvid_dec_stats_t;

#define XVID_ZONE_QUANT  (1<<0)
#define XVID_ZONE_WEIGHT (1<<1)

typedef struct {
    int frame;
    int mode;
    int increment;
    int base;
} xvid_enc_zone_t;

/*----------------------------------------------------------------------------
 * xvid_enc_stats_t structure
 *
 * Used in:
 *  - xvid_plg_data_t structure
 *  - optional parameter in xvid_encore() function
 *
 * .type = XVID_TYPE_NOTHING if the stats are not given
 *--------------------------------------------------------------------------*/

typedef struct {
    int version;

    /* encoding parameters */
    int type;      /* [out] coding type */
    int quant;     /* [out] frame quantizer */
    int vol_flags; /* [out] vol flags (see above) */
    int vop_flags; /* [out] vop flags (see above) */

    /* bitrate */
    int length;    /* [out] frame
length */

    int hlength;   /* [out] header length (bytes) */
    int kblks;     /* [out] number of blocks compressed as Intra */
    int mblks;     /* [out] number of blocks compressed as Inter */
    int ublks;     /* [out] number of blocks marked as not_coded */

    int sse_y;     /* [out] Y plane's sse */
    int sse_u;     /* [out] U plane's sse */
    int sse_v;     /* [out] V plane's sse */
} xvid_enc_stats_t;

/*****************************************************************************
  xvid plugin system -- internals

  xvidcore will call XVID_PLG_INFO and XVID_PLG_CREATE during XVID_ENC_CREATE
  before encoding each frame xvidcore will call XVID_PLG_BEFORE
  after encoding each frame xvidcore will call XVID_PLG_AFTER
  xvidcore will call XVID_PLG_DESTROY during XVID_ENC_DESTROY
 ****************************************************************************/

#define XVID_PLG_CREATE  (1<<0)
#define XVID_PLG_DESTROY (1<<1)
#define XVID_PLG_INFO    (1<<2)
#define XVID_PLG_BEFORE  (1<<3)
#define XVID_PLG_FRAME   (1<<4)
#define XVID_PLG_AFTER   (1<<5)

/* xvid_plg_info_t.flags */
#define XVID_REQORIGINAL (1<<0) /* plugin requires a copy of the original (uncompressed) image */
#define XVID_REQPSNR     (1<<1) /* plugin requires psnr between the uncompressed and compressed image */
#define XVID_REQDQUANTS  (1<<2) /* plugin requires access to the dquant table */
#define XVID_REQLAMBDA   (1<<3) /* plugin requires access to the lambda table */

typedef struct {
    int version;
    int flags; /* [in:opt] plugin flags */
} xvid_plg_info_t;

typedef struct {
    int version;

    int num_zones;           /* [out] */
    xvid_enc_zone_t * zones; /* [out] */

    int width;               /* [out] */
    int height;              /* [out] */
    int mb_width;            /* [out] */
    int mb_height;           /* [out] */
    int fincr;               /* [out] */
    int fbase;               /* [out] */

    void * param;            /* [out] */
} xvid_plg_create_t;

typedef struct {
    int version;

    int num_frames; /* [out] total frames encoded */
} xvid_plg_destroy_t;

typedef struct {
    int version;

    xvid_enc_zone_t * zone; /* [out] current zone */

    int width;              /* [out] */
    int height;             /* [out] */
    int mb_width;
/* [out] */
    int mb_height;          /* [out] */
    int fincr;              /* [out] */
    int fbase;              /* [out] */

    int min_quant[3];       /* [out] */
    int max_quant[3];       /* [out] */

    xvid_image_t reference; /* [out] -> [out] */
    xvid_image_t current;   /* [out] -> [in,out] */
    xvid_image_t original;  /* [out] after: points to the original (uncompressed) copy of the current frame */
    int frame_num;          /* [out] frame number */

    int type;               /* [in,out] */
    int quant;              /* [in,out] */

    int * dquant;           /* [in,out] pointer to diff quantizer table */
    int dquant_stride;      /* [in,out] diff quantizer stride */

    int vop_flags;          /* [in,out] */
    int vol_flags;          /* [in,out] */
    int motion_flags;       /* [in,out] */

    /* Lambda table for HVSPlugins */
    float * lambda;         /* [in,out] six floats for each macroblock. read, multiply, write back */

    /* Deprecated, use the stats field instead.
     * Will disappear before 1.0 */
    int length;             /* [out] after: length of encoded frame */
    int kblks;              /* [out] number of blocks compressed as Intra */
    int mblks;              /* [out] number of blocks compressed as Inter */
    int ublks;              /* [out] number of blocks marked not_coded */
    int sse_y;              /* [out] Y plane's sse */
    int sse_u;              /* [out] U plane's sse */
    int sse_v;              /* [out] V plane's sse */
    /* End of duplicated data, kept only for binary compatibility */

    int bquant_ratio;       /* [in] */
    int bquant_offset;      /* [in] */

    xvid_enc_stats_t stats; /* [out] frame statistics */
} xvid_plg_data_t;

/*****************************************************************************
  xvid plugin system -- external

  the application passes xvid an array of "xvid_enc_plugin_t" at XVID_ENC_CREATE.
  the array indicates the plugin function pointer and plugin-specific data.
  xvidcore handles the rest.
  example:

  xvid_enc_create_t create;
  xvid_enc_plugin_t plugins[2];

  plugins[0].func = xvid_plugin_psnr;
  plugins[0].param = NULL;
  plugins[1].func = xvid_plugin_single;
  plugins[1].param = &cbr_data;

  create.num_plugins = 2;
  create.plugins = plugins;

 ****************************************************************************/

typedef int (xvid_plugin_func)(void * handle, int opt, void * param1, void * param2);

typedef struct
{
    xvid_plugin_func * func;
    void * param;
} xvid_enc_plugin_t;

extern xvid_plugin_func xvid_plugin_single;      /* single-pass rate control */
extern xvid_plugin_func xvid_plugin_2pass1;      /* two-pass rate control: first pass */
extern xvid_plugin_func xvid_plugin_2pass2;      /* two-pass rate control: second pass */

extern xvid_plugin_func xvid_plugin_lumimasking; /* lumimasking */

extern xvid_plugin_func xvid_plugin_psnr;        /* write psnr values to stdout */
extern xvid_plugin_func xvid_plugin_dump;        /* dump before and after yuvpgms */

extern xvid_plugin_func xvid_plugin_ssim;        /* write ssim values to stdout */
extern xvid_plugin_func xvid_plugin_psnrhvsm;    /* write psnrhvsm values to stdout */

/* single pass rate control
 * CBR and Constant quantizer modes */
typedef struct
{
    int version;

    int bitrate;                /* [in] bits per second */
    int reaction_delay_factor;  /* [in] */
    int averaging_period;       /* [in] */
    int buffer;                 /* [in] */
} xvid_plugin_single_t;

typedef struct {
    int version;

    char * filename;
} xvid_plugin_2pass1_t;

#define XVID_PAYBACK_BIAS 0 /* payback with bias */
#define XVID_PAYBACK_PROP 1 /* payback proportionally */

typedef struct {
    int version;

    int bitrate;                /* [in] target bitrate (bits per second) */
    char * filename;            /* [in] first pass stats filename */

    int keyframe_boost;         /* [in] keyframe boost percentage: [0..100] */
    int curve_compression_high; /* [in] percentage of compression performed on the high part of the curve (above average) */
    int curve_compression_low;  /* [in] percentage of compression performed on the low part of the curve (below average) */
    int
overflow_control_strength;  /* [in] Payback delay expressed in number of frames */
    int max_overflow_improvement; /* [in] percentage of allowed range for a frame that gets bigger because of overflow bonus */
    int max_overflow_degradation; /* [in] percentage of allowed range for a frame that gets smaller because of overflow penalty */

    int kfreduction;            /* [in] maximum bitrate reduction applied to an iframe under the kfthreshold distance limit */
    int kfthreshold;            /* [in] if an iframe is closer to the next iframe than this distance, a quantity of bits
                                 *      is subtracted from its bit allocation. The reduction is computed as multiples of
                                 *      kfreduction/kfthreshold. It reaches kfreduction when the distance == kfthreshold,
                                 *      0 for 1 vbv_size + vbv_maxrate which
                                 *      guarantees that vbv_peakrate won't be exceeded. */
}xvid_plugin_2pass2_t;

typedef struct{
    /*stat output*/
    int b_printstat;
    char* stat_path;

    /*visualize*/
    int b_visualize;

    /*accuracy 0 very accurate 4 very fast*/
    int acc;

    int cpu_flags; /* XVID_CPU_XXX flags */

} xvid_plugin_ssim_t;

typedef struct {
    int version;

    int method; /* [in] masking method to apply. 0 for luminance masking, 1 for variance masking */
} xvid_plugin_lumimasking_t;

/*****************************************************************************
 * ENCODER API
 ****************************************************************************/

/*----------------------------------------------------------------------------
 * Encoder operations
 *--------------------------------------------------------------------------*/

#define XVID_ENC_CREATE  0 /* create encoder instance; returns 0 on success */
#define XVID_ENC_DESTROY 1 /* destroy encoder instance; returns 0 on success */
#define XVID_ENC_ENCODE  2 /* encode a frame: returns number of output bytes
                            * 0 means this frame should not be written (i.e.
encoder lag) */ /*---------------------------------------------------------------------------- * Encoder entry point *--------------------------------------------------------------------------*/ extern int xvid_encore(void *handle, int opt, void *param1, void *param2); /* Quick API reference * * XVID_ENC_CREATE operation * - handle: ignored * - opt: XVID_ENC_CREATE * - param1: address of a xvid_enc_create_t structure * - param2: ignored * * XVID_ENC_ENCODE operation * - handle: an instance returned by a CREATE op * - opt: XVID_ENC_ENCODE * - param1: address of a xvid_enc_frame_t structure * - param2: address of a xvid_enc_stats_t structure (optional) * its return value is asynchronous to what is written to the buffer * depending on the delay introduced by bvop use. It's display * ordered. * * XVID_ENC_DESTROY operation * - handle: an instance returned by a CREATE op * - opt: XVID_ENC_DESTROY * - param1: ignored * - param2: ignored */ /*---------------------------------------------------------------------------- * "Global" flags * * These flags are used for xvid_enc_create_t->global field during instance * creation (operation XVID_ENC_CREATE) *--------------------------------------------------------------------------*/ #define XVID_GLOBAL_PACKED (1<<0) /* packed bitstream */ #define XVID_GLOBAL_CLOSED_GOP (1<<1) /* closed_gop: was DX50BVOP dx50 bvop compatibility */ #define XVID_GLOBAL_EXTRASTATS_ENABLE (1<<2) #if 0 #define XVID_GLOBAL_VOL_AT_IVOP (1<<3) /* write vol at every ivop: WIN32/divx compatibility */ #define XVID_GLOBAL_FORCE_VOL (1<<4) /* when vol-based parameters are changed, insert an ivop NOT recommended */ #endif #define XVID_GLOBAL_DIVX5_USERDATA (1<<5) /* write divx5 userdata string this is implied if XVID_GLOBAL_PACKED is set */ /*---------------------------------------------------------------------------- * "VOL" flags * * These flags are used for xvid_enc_frame_t->vol_flags field during frame * encoding (operation XVID_ENC_ENCODE) 
 *--------------------------------------------------------------------------*/

#define XVID_VOL_MPEGQUANT      (1<<0) /* enable MPEG type quantization */
#define XVID_VOL_EXTRASTATS     (1<<1) /* enable plane sse stats */
#define XVID_VOL_QUARTERPEL     (1<<2) /* enable quarterpel: frames will be encoded as quarterpel */
#define XVID_VOL_GMC            (1<<3) /* enable GMC; frames will be checked for gmc suitability */
#define XVID_VOL_REDUCED_ENABLE (1<<4) /* enable reduced resolution vops: frames will be checked for rrv suitability */
                                       /* NOTE: the reduced resolution feature is not supported anymore. This flag will have no effect! */
#define XVID_VOL_INTERLACING    (1<<5) /* enable interlaced encoding */

/*----------------------------------------------------------------------------
 * "VOP" flags
 *
 * These flags are used for xvid_enc_frame_t->vop_flags field during frame
 * encoding (operation XVID_ENC_ENCODE)
 *--------------------------------------------------------------------------*/

/* Always valid */
#define XVID_VOP_DEBUG                (1<< 0) /* print debug messages in frames */
#define XVID_VOP_HALFPEL              (1<< 1) /* use halfpel interpolation */
#define XVID_VOP_INTER4V              (1<< 2) /* use 4 motion vectors per MB */
#define XVID_VOP_TRELLISQUANT         (1<< 3) /* use trellis based R-D "optimal" quantization */
#define XVID_VOP_CHROMAOPT            (1<< 4) /* enable chroma optimization pre-filter */
#define XVID_VOP_CARTOON              (1<< 5) /* use 'cartoon mode' */
#define XVID_VOP_GREYSCALE            (1<< 6) /* enable greyscale only mode (even for color input material chroma is ignored) */
#define XVID_VOP_HQACPRED             (1<< 7) /* high quality ac prediction */
#define XVID_VOP_MODEDECISION_RD      (1<< 8) /* enable DCT-ME and use it for mode decision */
#define XVID_VOP_FAST_MODEDECISION_RD (1<<12) /* use simplified R-D mode decision */
#define XVID_VOP_RD_BVOP              (1<<13) /* enable rate-distortion mode decision in b-frames */
#define XVID_VOP_RD_PSNRHVSM          (1<<14) /* use PSNR-HVS-M as metric for rate-distortion optimizations */

/* Only valid for vol_flags|=XVID_VOL_INTERLACING */
#define XVID_VOP_TOPFIELDFIRST        (1<< 9) /* set top-field-first flag */
#define XVID_VOP_ALTERNATESCAN        (1<<10) /* set alternate vertical scan flag */

/* only valid for vol_flags|=XVID_VOL_REDUCED_ENABLE */
#define XVID_VOP_REDUCED              (1<<11) /* reduced resolution vop */
                                              /* NOTE: reduced resolution feature is not supported anymore. This flag will have no effect! */

/*----------------------------------------------------------------------------
 * "Motion" flags
 *
 * These flags are used for xvid_enc_frame_t->motion field during frame
 * encoding (operation XVID_ENC_ENCODE)
 *--------------------------------------------------------------------------*/

/* Motion Estimation Search Patterns */
#define XVID_ME_ADVANCEDDIAMOND16     (1<< 0) /* use advdiamonds instead of diamonds as search pattern */
#define XVID_ME_ADVANCEDDIAMOND8      (1<< 1) /* use advdiamond for XVID_ME_EXTSEARCH8 */
#define XVID_ME_USESQUARES16          (1<< 2) /* use squares instead of diamonds as search pattern */
#define XVID_ME_USESQUARES8           (1<< 3) /* use square for XVID_ME_EXTSEARCH8 */

/* SAD operator based flags */
#define XVID_ME_HALFPELREFINE16       (1<< 4)
#define XVID_ME_HALFPELREFINE8        (1<< 6)
#define XVID_ME_QUARTERPELREFINE16    (1<< 7)
#define XVID_ME_QUARTERPELREFINE8     (1<< 8)
#define XVID_ME_GME_REFINE            (1<< 9)
#define XVID_ME_EXTSEARCH16           (1<<10) /* extend PMV by more searches */
#define XVID_ME_EXTSEARCH8            (1<<11) /* use diamond/square for extended 8x8 search */
#define XVID_ME_CHROMA_PVOP           (1<<12) /* also use chroma for P_VOP/S_VOP ME */
#define XVID_ME_CHROMA_BVOP           (1<<13) /* also use chroma for B_VOP ME */
#define XVID_ME_FASTREFINE16          (1<<25) /* use low-complexity refinement functions */
#define XVID_ME_FASTREFINE8           (1<<29) /* low-complexity 8x8 sub-block refinement */

/* Rate Distortion based flags
 * Valid when XVID_VOP_MODEDECISION_RD is enabled */
#define XVID_ME_HALFPELREFINE16_RD    (1<<14) /* perform RD-based halfpel refinement */
#define XVID_ME_HALFPELREFINE8_RD     (1<<15) /* perform
RD-based halfpel refinement for 8x8 mode */ #define XVID_ME_QUARTERPELREFINE16_RD (1<<16) /* perform RD-based qpel refinement */ #define XVID_ME_QUARTERPELREFINE8_RD (1<<17) /* perform RD-based qpel refinement for 8x8 mode */ #define XVID_ME_EXTSEARCH_RD (1<<18) /* perform RD-based search using square pattern enable XVID_ME_EXTSEARCH8 to do this in 8x8 search as well */ #define XVID_ME_CHECKPREDICTION_RD (1<<19) /* always check vector equal to prediction */ /* Other */ #define XVID_ME_DETECT_STATIC_MOTION (1<<24) /* speed-up ME by detecting stationary scenes */ #define XVID_ME_SKIP_DELTASEARCH (1<<26) /* speed-up by skipping b-frame delta search */ #define XVID_ME_FAST_MODEINTERPOLATE (1<<27) /* speed-up by partly skipping interpolate mode */ #define XVID_ME_BFRAME_EARLYSTOP (1<<28) /* speed-up by early exiting b-search */ /* Unused */ #define XVID_ME_UNRESTRICTED16 (1<<20) /* unrestricted ME, not implemented */ #define XVID_ME_OVERLAPPING16 (1<<21) /* overlapping ME, not implemented */ #define XVID_ME_UNRESTRICTED8 (1<<22) /* unrestricted ME, not implemented */ #define XVID_ME_OVERLAPPING8 (1<<23) /* overlapping ME, not implemented */ /*---------------------------------------------------------------------------- * xvid_enc_create_t structure definition * * This structure is passed as param1 during an instance creation (operation * XVID_ENC_CREATE) *--------------------------------------------------------------------------*/ typedef struct { int version; int profile; /* [in] profile@level; refer to XVID_PROFILE_xxx */ int width; /* [in] frame dimensions; width, pixel units */ int height; /* [in] frame dimensions; height, pixel units */ int num_zones; /* [in:opt] number of bitrate zones */ xvid_enc_zone_t * zones; /* ^^ zone array */ int num_plugins; /* [in:opt] number of plugins */ xvid_enc_plugin_t * plugins; /* ^^ plugin array */ int num_threads; /* [in:opt] number of threads to use in encoder */ int max_bframes; /* [in:opt] max sequential bframes (0=disable 
bframes) */

    int global;                 /* [in:opt] global flags; controls encoding behavior */

    /* --- vol-based stuff; included here for convenience */
    int fincr;                  /* [in:opt] framerate increment; set to zero for variable framerate */
    int fbase;                  /* [in] framerate base; frame_duration = fincr/fbase seconds */
    /* ---------------------------------------------- */

    /* --- vop-based; included here for convenience */
    int max_key_interval;       /* [in:opt] the maximum interval between key frames */

    int frame_drop_ratio;       /* [in:opt] frame dropping: 0=drop none... 100=drop all */

    int bquant_ratio;           /* [in:opt] bframe quantizer multiplier/offset; used to decide bframes quant when bquant==-1 */
    int bquant_offset;          /* bquant = (avg(past_ref_quant,future_ref_quant)*bquant_ratio + bquant_offset) / 100 */

    int min_quant[3];           /* [in:opt] */
    int max_quant[3];           /* [in:opt] */
    /* ---------------------------------------------- */

    void *handle;               /* [out] encoder instance handle */

    /* ------- v1.3.x ------- */
    int start_frame_num;        /* [in:opt] frame number of start frame relative to zones definitions. allows encoding of sub-sequences */
    int num_slices;             /* [in:opt] number of slices to code for each frame */
} xvid_enc_create_t;

/*----------------------------------------------------------------------------
 * xvid_enc_frame_t structure definition
 *
 * This structure is passed as param1 during a frame encoding (operation
 * XVID_ENC_ENCODE)
 *--------------------------------------------------------------------------*/

/* out value for the frame structure->type field
 * unlike stats output in param2, this field is not asynchronous and tells
 * the client app whether the frame written into the stream buffer is an ivop;
 * usually used for indexing purposes in the container */
#define XVID_KEYFRAME (1<<1)

/* The structure */
typedef struct {
    int version;

    /* VOL related stuff
     * unless XVID_FORCEVOL is set, the encoder will not react to any changes
     * here until the next VOL (keyframe).
*/ int vol_flags; /* [in] vol flags */ unsigned char *quant_intra_matrix; /* [in:opt] custom intra qmatrix */ unsigned char *quant_inter_matrix; /* [in:opt] custom inter qmatrix */ int par; /* [in:opt] pixel aspect ratio (refer to XVID_PAR_xxx above) */ int par_width; /* [in:opt] aspect ratio width */ int par_height; /* [in:opt] aspect ratio height */ /* Other fields that can change on a frame base */ int fincr; /* [in:opt] framerate increment, for variable framerate only */ int vop_flags; /* [in] (general)vop-based flags */ int motion; /* [in] ME options */ xvid_image_t input; /* [in] input image (read from) */ int type; /* [in:opt] coding type */ int quant; /* [in] frame quantizer; if <=0, automatic (ratecontrol) */ int bframe_threshold; void *bitstream; /* [in:opt] bitstream ptr (written to)*/ int length; /* [in:opt] bitstream length (bytes) */ int out_flags; /* [out] bitstream output flags */ } xvid_enc_frame_t; #ifdef __cplusplus } #endif #endif xvidcore/src/global.h0000664000076500007650000001333511564705453015702 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Global definitions - * * Copyright(C) 2002-2010 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: global.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _GLOBAL_H_ #define _GLOBAL_H_ #include "xvid.h" #include "portab.h" /* --- macroblock modes --- */ #define MODE_INTER 0 #define MODE_INTER_Q 1 #define MODE_INTER4V 2 #define MODE_INTRA 3 #define MODE_INTRA_Q 4 #define MODE_NOT_CODED 16 #define MODE_NOT_CODED_GMC 17 /* --- bframe specific --- */ #define MODE_DIRECT 0 #define MODE_INTERPOLATE 1 #define MODE_BACKWARD 2 #define MODE_FORWARD 3 #define MODE_DIRECT_NONE_MV 4 #define MODE_DIRECT_NO4V 5 /* * vop coding types * intra, prediction, backward, sprite, not_coded */ #define I_VOP 0 #define P_VOP 1 #define B_VOP 2 #define S_VOP 3 #define N_VOP 4 /* convert mpeg-4 coding type i/p/b/s_VOP to XVID_TYPE_xxx */ static __inline int coding2type(int coding_type) { return coding_type + 1; } /* convert XVID_TYPE_xxx to bitstream coding type i/p/b/s_VOP */ static __inline int type2coding(int xvid_type) { return xvid_type - 1; } typedef struct { int x; int y; } VECTOR; typedef struct { VECTOR duv[3]; } WARPPOINTS; /* save all warping parameters for GMC once and for all, instead of recalculating for every block. This is needed for encoding&decoding When switching to incremental calculations, this will get much shorter */ /* we don't include WARPPOINTS wp here, but in FRAMEINFO itself */ typedef struct { int num_wp; /* [input]: 0=none, 1=translation, 2,3 = warping */ /* a value of -1 means: "structure not initialized!" 
*/ int s; /* [input]: calc is done with 1/s pel resolution */ int W; int H; int ss; int smask; int sigma; int r; int rho; int i0s; int j0s; int i1s; int j1s; int i2s; int j2s; int i1ss; int j1ss; int i2ss; int j2ss; int alpha; int beta; int Ws; int Hs; int dxF, dyF, dxG, dyG; int Fo, Go; int cFo, cGo; } GMC_DATA; typedef struct _NEW_GMC_DATA { /* 0=none, 1=translation, 2,3 = warping * a value of -1 means: "structure not initialized!" */ int num_wp; /* {0,1,2,3} => {1/2,1/4,1/8,1/16} pel */ int accuracy; /* sprite size * 16 */ int sW, sH; /* gradient, calculated from warp points */ int dU[2], dV[2], Uo, Vo, Uco, Vco; void (*predict_16x16)(const struct _NEW_GMC_DATA * const This, uint8_t *dst, const uint8_t *src, int dststride, int srcstride, int x, int y, int rounding); void (*predict_8x8) (const struct _NEW_GMC_DATA * const This, uint8_t *uDst, const uint8_t *uSrc, uint8_t *vDst, const uint8_t *vSrc, int dststride, int srcstride, int x, int y, int rounding); void (*get_average_mv)(const struct _NEW_GMC_DATA * const Dsp, VECTOR * const mv, int x, int y, int qpel); } NEW_GMC_DATA; typedef struct { uint8_t *y; uint8_t *u; uint8_t *v; } IMAGE; typedef struct { uint32_t bufa; uint32_t bufb; uint32_t buf; uint32_t pos; uint32_t *tail; uint32_t *start; uint32_t length; uint32_t initpos; } Bitstream; #define MBPRED_SIZE 15 typedef struct { /* decoder/encoder */ VECTOR mvs[4]; short int pred_values[6][MBPRED_SIZE]; int acpred_directions[6]; int mode; int quant; /* absolute quant */ int field_dct; int field_pred; int field_for_top; int field_for_bot; /* encoder specific */ VECTOR pmvs[4]; VECTOR qmvs[4]; /* mvs in quarter pixel resolution */ int32_t sad8[4]; /* SAD values for inter4v-VECTORs */ int32_t sad16; /* SAD value for inter-VECTOR */ int32_t var16; /* Variance of the 16x16 luma block */ int32_t rel_var8[6]; /* Relative variances of the 8x8 sub-blocks */ int dquant; int cbp; /* lambda for these blocks */ int lambda[6]; /* bframe stuff */ VECTOR b_mvs[4]; VECTOR 
b_qmvs[4];

    VECTOR amv;     /* average motion vectors from GMC */
    int32_t mcsel;

    VECTOR mvs_avg; /* CK: average of field motion vectors */

    /* This structure has become way too big! What to do? Split it up? */

} MACROBLOCK;

static __inline uint32_t
get_dc_scaler(uint32_t quant,
              uint32_t lum)
{
    if (quant < 5)
        return 8;

    if (quant < 25 && !lum)
        return (quant + 13) / 2;

    if (quant < 9)
        return 2 * quant;

    if (quant < 25)
        return quant + 8;

    if (lum)
        return 2 * quant - 16;
    else
        return quant - 6;
}

/* useful macros */

#define MIN(X, Y) ((X)<(Y)?(X):(Y))
#define MAX(X, Y) ((X)>(Y)?(X):(Y))
/* #define ABS(X) (((X)>0)?(X):-(X)) */
#define SIGN(X) (((X)>0)?1:-1)
#define CLIP(X,AMIN,AMAX) (((X)<(AMIN)) ? (AMIN) : ((X)>(AMAX)) ? (AMAX) : (X))
#define DIV_DIV(a,b) (((a)>0) ? ((a)+((b)>>1))/(b) : ((a)-((b)>>1))/(b))
#define SWAP(_T_,A,B) { _T_ tmp = A; A = B; B = tmp; }

static __inline uint32_t
isqrt(unsigned long n)
{
    uint32_t c = 0x8000;
    uint32_t g = 0x8000;

    for(;;) {
        if(g*g > n)
            g ^= c;
        c >>= 1;
        if(c == 0)
            return g;
        g |= c;
    }
}

#endif /* _GLOBAL_H_ */
xvidcore/src/dct/
xvidcore/src/dct/fdct.h
/*****************************************************************************
 *
 *  XVID MPEG-4 VIDEO CODEC
 *  - Forward DCT header -
 *
 *  Copyright(C) 2001-2011 Michael Militzer
 *
 *  This program is free software ; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation ; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY ; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program ; if not, write to the Free Software
 *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: fdct.h 1986 2011-05-18 09:07:40Z Isibaar $
 *
 ****************************************************************************/

#ifndef _FDCT_H_
#define _FDCT_H_

#include "../portab.h"

typedef void (fdctFunc) (short *const block);
typedef fdctFunc *fdctFuncPtr;

extern fdctFuncPtr fdct;

fdctFunc fdct_int32;

#if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64)
fdctFunc fdct_mmx_ffmpeg;
fdctFunc fdct_xmm_ffmpeg;
fdctFunc fdct_mmx_skal;
fdctFunc fdct_xmm_skal;
fdctFunc fdct_sse2_skal;
#endif

#ifdef ARCH_IS_IA64
fdctFunc fdct_ia64;
#endif

#endif /* _FDCT_H_ */
xvidcore/src/dct/ppc_asm/
xvidcore/src/dct/ppc_asm/idct_altivec.c
/*
 * Copyright (c) 2001 Michel Lespinasse
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 */

/*
 * XviD integration by Christoph Nägeli
 *
 * This file is a direct copy of the altivec idct module from the libmpeg2
 * project with some minor changes to fit in XviD.
 */

#ifdef HAVE_ALTIVEC_H
#include <altivec.h>
#endif

#include "../../portab.h"

#define IDCT_Vectors \
    vector signed short vx0, vx1, vx2, vx3, vx4, vx5, vx6, vx7; \
    vector signed short vy0, vy1, vy2, vy3, vy4, vy5, vy6, vy7; \
    vector signed short a0, a1, a2, ma2, c4, mc4, zero, bias; \
    vector signed short t0, t1, t2, t3, t4, t5, t6, t7, t8; \
    vector unsigned short shift

static const vector signed short constants [5] = {
    (vector signed short)AVV(23170, 13573, 6518, 21895, -23170, -21895, 32, 31),
    (vector signed short)AVV(16384, 22725, 21407, 19266, 16384, 19266, 21407, 22725),
    (vector signed short)AVV(16069, 22289, 20995, 18895, 16069, 18895, 20995, 22289),
    (vector signed short)AVV(21407, 29692, 27969, 25172, 21407, 25172, 27969, 29692),
    (vector signed short)AVV(13623, 18895, 17799, 16019, 13623, 16019, 17799, 18895)
};

#define IDCT()\
    c4 = vec_splat (constants[0], 0); \
    a0 = vec_splat (constants[0], 1); \
    a1 = vec_splat (constants[0], 2); \
    a2 = vec_splat (constants[0], 3); \
    mc4 = vec_splat (constants[0], 4); \
    ma2 = vec_splat (constants[0], 5); \
    bias = (vector signed short)vec_splat((vector signed int)constants[0], 3); \
    \
    zero = vec_splat_s16 (0); \
    \
    vx0 = vec_adds (block[0], block[4]); \
    vx4 = vec_subs (block[0], block[4]); \
    t5 = vec_mradds (vx0, constants[1], zero); \
    t0 = vec_mradds (vx4, constants[1], zero); \
    \
    vx1 = vec_mradds (a1, block[7], block[1]); \
    vx7 = vec_mradds (a1, block[1], vec_subs (zero, block[7])); \
    t1 = vec_mradds (vx1, constants[2], zero); \
    t8 = vec_mradds (vx7, constants[2], zero); \
    \
    vx2 = vec_mradds (a0, block[6], block[2]); \
    vx6 = vec_mradds (a0, block[2], vec_subs (zero, block[6])); \
    t2 = vec_mradds (vx2, constants[3], zero); \
    t4 = vec_mradds (vx6, constants[3], zero); \
    \
    vx3 = vec_mradds (block[3], constants[4], zero); \
    vx5 = vec_mradds (block[5], constants[4], zero); \
    t7 = vec_mradds (a2, vx5, vx3); \
    t3 = vec_mradds (ma2, vx3, vx5); \
    \
    t6 = vec_adds (t8, t3); \
    t3 = vec_subs (t8, t3); \
    t8 = vec_subs (t1, t7); \
    t1 = vec_adds (t1,
t7); \ t6 = vec_mradds (a0, t6, t6); \ t1 = vec_mradds (a0, t1, t1); \ \ t7 = vec_adds (t5, t2); \ t2 = vec_subs (t5, t2); \ t5 = vec_adds (t0, t4); \ t0 = vec_subs (t0, t4); \ t4 = vec_subs (t8, t3); \ t3 = vec_adds (t8, t3); \ \ vy0 = vec_adds (t7, t1); \ vy7 = vec_subs (t7, t1); \ vy1 = vec_adds (t5, t3); \ vy6 = vec_subs (t5, t3); \ vy2 = vec_adds (t0, t4); \ vy5 = vec_subs (t0, t4); \ vy3 = vec_adds (t2, t6); \ vy4 = vec_subs (t2, t6); \ \ vx0 = vec_mergeh (vy0, vy4); \ vx1 = vec_mergel (vy0, vy4); \ vx2 = vec_mergeh (vy1, vy5); \ vx3 = vec_mergel (vy1, vy5); \ vx4 = vec_mergeh (vy2, vy6); \ vx5 = vec_mergel (vy2, vy6); \ vx6 = vec_mergeh (vy3, vy7); \ vx7 = vec_mergel (vy3, vy7); \ \ vy0 = vec_mergeh (vx0, vx4); \ vy1 = vec_mergel (vx0, vx4); \ vy2 = vec_mergeh (vx1, vx5); \ vy3 = vec_mergel (vx1, vx5); \ vy4 = vec_mergeh (vx2, vx6); \ vy5 = vec_mergel (vx2, vx6); \ vy6 = vec_mergeh (vx3, vx7); \ vy7 = vec_mergel (vx3, vx7); \ \ vx0 = vec_mergeh (vy0, vy4); \ vx1 = vec_mergel (vy0, vy4); \ vx2 = vec_mergeh (vy1, vy5); \ vx3 = vec_mergel (vy1, vy5); \ vx4 = vec_mergeh (vy2, vy6); \ vx5 = vec_mergel (vy2, vy6); \ vx6 = vec_mergeh (vy3, vy7); \ vx7 = vec_mergel (vy3, vy7); \ \ vx0 = vec_adds (vx0, bias); \ t5 = vec_adds (vx0, vx4); \ t0 = vec_subs (vx0, vx4); \ \ t1 = vec_mradds (a1, vx7, vx1); \ t8 = vec_mradds (a1, vx1, vec_subs (zero, vx7)); \ \ t2 = vec_mradds (a0, vx6, vx2); \ t4 = vec_mradds (a0, vx2, vec_subs (zero, vx6)); \ \ t7 = vec_mradds (a2, vx5, vx3); \ t3 = vec_mradds (ma2, vx3, vx5); \ \ t6 = vec_adds (t8, t3); \ t3 = vec_subs (t8, t3); \ t8 = vec_subs (t1, t7); \ t1 = vec_adds (t1, t7); \ \ t7 = vec_adds (t5, t2); \ t2 = vec_subs (t5, t2); \ t5 = vec_adds (t0, t4); \ t0 = vec_subs (t0, t4); \ t4 = vec_subs (t8, t3); \ t3 = vec_adds (t8, t3); \ \ vy0 = vec_adds (t7, t1); \ vy7 = vec_subs (t7, t1); \ vy1 = vec_mradds (c4, t3, t5); \ vy6 = vec_mradds (mc4, t3, t5); \ vy2 = vec_mradds (c4, t4, t0); \ vy5 = vec_mradds (mc4, t4, t0); \ vy3 = vec_adds 
(t2, t6); \ vy4 = vec_subs (t2, t6); \ \ shift = vec_splat_u16 (6); \ vx0 = vec_sra (vy0, shift); \ vx1 = vec_sra (vy1, shift); \ vx2 = vec_sra (vy2, shift); \ vx3 = vec_sra (vy3, shift); \ vx4 = vec_sra (vy4, shift); \ vx5 = vec_sra (vy5, shift); \ vx6 = vec_sra (vy6, shift); \ vx7 = vec_sra (vy7, shift) void idct_altivec_c(vector short *const block) { int i; int j; short block2[64]; short *block_ptr; IDCT_Vectors; block_ptr = (short*)block; for (i = 0; i < 64; i++) block2[i] = block_ptr[i]; for (i = 0; i < 8; i++) for (j = 0; j < 8; j++) block_ptr[i+8*j] = block2[j+8*i] << 4; IDCT(); block[0] = vx0; block[1] = vx1; block[2] = vx2; block[3] = vx3; block[4] = vx4; block[5] = vx5; block[6] = vx6; block[7] = vx7; } xvidcore/src/dct/ia64_asm/0000775000076500007650000000000011566427763016450 5ustar xvidbuildxvidbuildxvidcore/src/dct/ia64_asm/fdct_ia64.s0000664000076500007650000012023311147310721020354 0ustar xvidbuildxvidbuild// **************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 forward discrete cosine transform - // * // * Copyright(C) 2002 Stephan Krause, Ingo-Marc Weber, Daniel Kallfass // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. 
// * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: fdct_ia64.s,v 1.6 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * fdct_ia64.s, IA-64 optimized forward DCT // * // * Completed version provided by Intel at AppNote AP-922 // * http://developer.intel.com/software/products/college/ia32/strmsimd/ // * Copyright (C) 1999 Intel Corporation, // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // ***************************************************************************** // // ***************************************************************************** // * // * Revision history: // * // * 24.07.2002 Initial Version // * // ***************************************************************************** // This is a fast precise implementation of 8x8 Discrete Cosine Transform // published in Intel Application Note 922 from 1999 and optimized for IA-64. // // An unoptimized "straight forward" version can be found at the end of this file. 
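//
// Note on the fixed-point multiplies (a reading of the code that follows,
// offered as an assumption rather than something stated in AP-922): the
// constants loaded into r25-r31 look like the named tangents and cosines
// scaled by 2^16, e.g. 0xb505 = 46341 ~ cos(4pi/16) * 65536. pmpyshr2 is taken
// here to be a signed 16x16 multiply that keeps bits 16..31 of each product,
// so a constant with bit 15 set behaves like (k - 65536); that would explain
// why each multiply by c2 or g1..g4 is followed by a padd2 that adds the
// multiplicand back, while c0 and c1 (both below 0x8000) need no correction.
// A minimal C sketch of one 16-bit lane; the helper names mulhi_s16 and
// mul_q16 are hypothetical and exist only for this illustration:
//
//   #include <stdint.h>
//
//   /* signed multiply-high; assumes two's complement narrowing
//    * and an arithmetic right shift of negative values */
//   static int16_t mulhi_s16(int16_t x, uint16_t k)
//   {
//       return (int16_t)(((int32_t)x * (int16_t)k) >> 16);
//   }
//
//   /* floor(x * k / 65536) for a Q16 constant k, ignoring 16-bit overflow */
//   static int16_t mul_q16(int16_t x, uint16_t k)
//   {
//       int16_t hi = mulhi_s16(x, k);
//       return (k & 0x8000) ? (int16_t)(hi + x) : hi;
//   }
//
// Under those assumptions mul_q16(1000, 0xb505) evaluates to 707, i.e.
// 1000 * cos(pi/4) rounded down.
//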
.pred.safe_across_calls p1-p5,p16-p63 .text .align 16 .global fdct_ia64# .proc fdct_ia64# fdct_ia64: .prologue alloc r14 = ar.pfs, 1, 56, 0, 0 // Save constants mov r31 = 0x32ec // c0 = tan(1pi/16) mov r30 = 0x6a0a // c1 = tan(2pi/16) mov r29 = 0xab0e // c2 = tan(3pi/16) mov r28 = 0xb505 // g4 = cos(4pi/16) mov r27 = 0xd4db // g3 = cos(3pi/16) mov r26 = 0xec83 // g2 = cos(2pi/16) mov r25 = 0xfb15 // g1 = cos(1pi/16) mov r24 = 0x0002 // correction bit for descaling mov r23 = 0x0004 // correction bit for descaling // Load Matrix into registers add loc0 = r0, r32 add loc2 = 16, r32 add loc4 = 32, r32 add loc6 = 48, r32 add loc8 = 64, r32 add loc10 = 80, r32 add loc12 = 96, r32 add loc14 = 112, r32 add loc1 = 8, r32 add loc3 = 24, r32 add loc5 = 40, r32 add loc7 = 56, r32 add loc9 = 72, r32 add loc11 = 88, r32 add loc13 = 104, r32 add loc15 = 120, r32 ;; ld8 loc16 = [loc0] ld8 loc17 = [loc2] ld8 loc18 = [loc4] ld8 loc19 = [loc6] ld8 loc20 = [loc8] ld8 loc21 = [loc10] ld8 loc22 = [loc12] ld8 loc23 = [loc14] ld8 loc24 = [loc1] ld8 loc25 = [loc3] ld8 loc26 = [loc5] ld8 loc27 = [loc7] mux2 r26 = r26, 0x00 ld8 loc28 = [loc9] mux2 r31 = r31, 0x00 mux2 r25 = r25, 0x00 ld8 loc29 = [loc11] mux2 r30 = r30, 0x00 mux2 r29 = r29, 0x00 ld8 loc30 = [loc13] mux2 r28 = r28, 0x00 mux2 r27 = r27, 0x00 ld8 loc31 = [loc15] mux2 r24 = r24, 0x00 mux2 r23 = r23, 0x00 ;; pshl2 loc16 = loc16, 3 pshl2 loc17 = loc17, 3 pshl2 loc18 = loc18, 3 pshl2 loc19 = loc19, 3 pshl2 loc20 = loc20, 3 pshl2 loc21 = loc21, 3 pshl2 loc22 = loc22, 3 pshl2 loc23 = loc23, 3 ;; pshl2 loc24 = loc24, 3 // ******************* // column-DTC 1st half // ******************* psub2 loc37 = loc17, loc22 // t5 = x1 - x6 pshl2 loc25 = loc25, 3 pshl2 loc26 = loc26, 3 psub2 loc38 = loc18, loc21 // t6 = x2 - x5 pshl2 loc27 = loc27, 3 pshl2 loc28 = loc28, 3 ;; padd2 loc32 = loc16, loc23 // t0 = x0 + x7 pshl2 loc29 = loc29, 3 pshl2 loc30 = loc30, 3 padd2 loc33 = loc17, loc22 // t1 = x1 + x6 padd2 loc40 = loc37, loc38 // buf0 = t5 + 
t6 psub2 loc41 = loc37, loc38 // buf1 = t5 - t6 ;; padd2 loc34 = loc18, loc21 // t2 = x2 + x5 pshl2 loc31 = loc31, 3 padd2 loc35 = loc19, loc20 // t3 = x3 + x4 psub2 loc36 = loc16, loc23 // t4 = x0 - x7 pmpyshr2 loc37 = loc40, r28, 16 // t5 = buf0 * g4 pmpyshr2 loc38 = loc41, r28, 16 // t6 = buf1 * g4 ;; psub2 loc39 = loc19, loc20 // t7 = x3 - x4 padd2 loc37 = loc37, loc40 // t5 = t5 + buf1 padd2 loc38 = loc38, loc41 // t6 = t6 + buf2 padd2 loc16 = loc32, loc35 // x0 = t0 + t3 padd2 loc17 = loc33, loc34 // x1 = t1 + t2 psub2 loc18 = loc32, loc35 // x2 = t0 - t3 ;; psub2 loc19 = loc33, loc34 // x3 = t1 - t2 padd2 loc20 = loc36, loc37 // x4 = t4 + t5 padd2 loc21 = loc38, loc39 // x5 = t6 + t7 psub2 loc22 = loc36, loc37 // x6 = t4 - t5 psub2 loc23 = loc38, loc39 // x7 = t6 - t7 ;; pmpyshr2 loc42 = loc18, r30, 16 // buf2 = x2 * c1 padd2 loc32 = loc16, loc17 // t0 = x0 + x1 pmpyshr2 loc43 = loc19, r30, 16 // buf3 = x3 * c1 pmpyshr2 loc44 = loc20, r31, 16 // buf4 = x4 * c0 psub2 loc33 = loc16, loc17 // t1 = x0 - x1 pmpyshr2 loc46 = loc22, r29, 16 // buf6 = x6 * c2 pmpyshr2 loc47 = loc23, r29, 16 // buf7 = x7 * c2 ;; padd2 loc46 = loc46, loc22 // buf6 = buf6 + x6 pmpyshr2 loc45 = loc21, r31, 16 // buf5 = x5 * c0 padd2 loc47 = loc47, loc23 // buf7 = buf7 + x7 padd2 loc34 = loc18, loc43 // t2 = x2 + (x3 * c1) psub2 loc35 = loc42, loc19 // t3 = (c1 * x2) - x3 psub2 loc37 = loc44, loc21 // t5 = (c1 * x4) - x5 ;; padd2 loc36 = loc20, loc45 // t4 = x4 + (x5 * c1) padd2 loc38 = loc22, loc47 // t6 = x6 + (x7 * c1) pmpyshr2 loc16 = loc32, r28, 16 // x0 = t0 * g4 pmpyshr2 loc17 = loc33, r28, 16 // x1 = t1 * g4 psub2 loc39 = loc46, loc23 // t7 = (c1 * x6) - x7 ;; padd2 loc48 = loc16, loc32 // y0 = x0 + t0 pmpyshr2 loc18 = loc34, r26, 16 // x2 = t2 * g2 pmpyshr2 loc19 = loc35, r26, 16 // x3 = t3 * g2 padd2 loc52 = loc17, loc33 // y4 = x1 + t1 pmpyshr2 loc20 = loc36, r25, 16 // x4 = t4 * g1 pmpyshr2 loc21 = loc37, r25, 16 // x5 = t5 * g1 ;; padd2 loc50 = loc18, loc34 // y2 = x2 + t2 
pmpyshr2 loc22 = loc38, r27, 16 // x6 = t6 * g3 pmpyshr2 loc23 = loc39, r27, 16 // x7 = t7 * g3 padd2 loc55 = loc21, loc37 // y7 = x5 + t5 padd2 loc49 = loc20, loc36 // y1 = x4 + t4 padd2 loc54 = loc19, loc35 // y6 = x3 + t3 ;; padd2 loc51 = loc22, loc38 // y3 = x6 + t6 padd2 loc53 = loc23, loc39 // y5 = x7 + t7 //divide by 4 padd2 loc48 = loc48, r24 padd2 loc49 = loc49, r24 padd2 loc50 = loc50, r24 padd2 loc52 = loc52, r24 ;; padd2 loc51 = loc51, r24 pshr2 loc48 = loc48, 2 padd2 loc53 = loc53, r24 pshr2 loc49 = loc49, 2 padd2 loc54 = loc54, r24 pshr2 loc50 = loc50, 2 padd2 loc55 = loc55, r24 pshr2 loc52 = loc52, 2 ;; pshr2 loc51 = loc51, 2 pshr2 loc53 = loc53, 2 pshr2 loc54 = loc54, 2 pshr2 loc55 = loc55, 2 // ******************* // column-DCT 2nd half // ******************* psub2 loc37 = loc25, loc30 // t5 = x1.2 - x6.2 psub2 loc38 = loc26, loc29 // t6 = x2.2 - x5.2 padd2 loc32 = loc24, loc31 // t0 = x0.2 + x7.2 padd2 loc33 = loc25, loc30 // t1 = x1.2 + x6.2 ;; padd2 loc34 = loc26, loc29 // t2 = x2.2 + x5.2 psub2 loc41 = loc37, loc38 // buf1 = t5 - t6 padd2 loc40 = loc37, loc38 // buf0 = t5 + t6 padd2 loc35 = loc27, loc28 // t3 = x3.2 + x4.2 ;; psub2 loc36 = loc24, loc31 // t4 = x0.2 - x7.2 pmpyshr2 loc37 = loc40, r28, 16 // t5 = buf0 * g4 pmpyshr2 loc38 = loc41, r28, 16 // t6 = buf1 * g4 ;; psub2 loc39 = loc27, loc28 // t7 = x3.2 - x4.2 padd2 loc37 = loc37, loc40 // t5 = t5 + buf0 padd2 loc38 = loc38, loc41 // t6 = t6 + buf1 padd2 loc16 = loc32, loc35 // x0 = t0 + t3 padd2 loc17 = loc33, loc34 // x1 = t1 + t2 psub2 loc18 = loc32, loc35 // x2 = t0 - t3 ;; psub2 loc19 = loc33, loc34 // x3 = t1 - t2 padd2 loc20 = loc36, loc37 // x4 = t4 + t5 padd2 loc21 = loc38, loc39 // x5 = t6 + t7 psub2 loc22 = loc36, loc37 // x6 = t4 - t5 psub2 loc23 = loc38, loc39 // x7 = t6 - t7 ;; pmpyshr2 loc42 = loc18, r30, 16 // buf2 = x2 * c1 padd2 loc32 = loc16, loc17 // t0 = x0 + x1 pmpyshr2 loc43 = loc19, r30, 16 // buf3 = x3 * c1 pmpyshr2 loc44 = loc20, r31, 16 // buf4 = x4 * c0 
psub2 loc33 = loc16, loc17 // t1 = x0 - x1 pmpyshr2 loc45 = loc21, r31, 16 // buf5 = x5 * c0 pmpyshr2 loc46 = loc22, r29, 16 // buf6 = x6 * c2 pmpyshr2 loc47 = loc23, r29, 16 // buf7 = x7 * c2 ;; padd2 loc34 = loc18, loc43 // t2 = x2 + buf3 padd2 loc46 = loc46, loc22 // buf6 = buf6 + x6 padd2 loc47 = loc47, loc23 // buf7 = buf7 + x7 psub2 loc35 = loc42, loc19 // t3 = buf2 - x3 padd2 loc36 = loc20, loc45 // t4 = x4 + buf5 pmpyshr2 loc16 = loc32, r28, 16 // x0 = t0 * g4 ;; psub2 loc37 = loc44, loc21 // t5 = buf4 - x5 padd2 loc38 = loc22, loc47 // t6 = x6 + buf7 psub2 loc39 = loc46, loc23 // t7 = buf6 - x7 pmpyshr2 loc17 = loc33, r28, 16 // x1 = t1 * g4 ;; padd2 loc40 = loc16, loc32 // y0.2 = x0 + t0 pmpyshr2 loc18 = loc34, r26, 16 // x2 = t2 * g2 pmpyshr2 loc19 = loc35, r26, 16 // x3 = t3 * g2 padd2 loc44 = loc17, loc33 // y4.2 = x1 + t1 pmpyshr2 loc20 = loc36, r25, 16 // x4 = t4 * g1 pmpyshr2 loc21 = loc37, r25, 16 // x5 = t5 * g1 ;; padd2 loc42 = loc18, loc34 // y2.2 = x2 + t2 pmpyshr2 loc22 = loc38, r27, 16 // x6 = t6 * g3 pmpyshr2 loc23 = loc39, r27, 16 // x7 = t7 * g3 padd2 loc47 = loc21, loc37 // y7.2 = x5 + t5 padd2 loc41 = loc20, loc36 // y1.2 = x4 + t4 padd2 loc46 = loc19, loc35 // y6.2 = x3 + t3 ;; padd2 loc43 = loc22, loc38 // y3.2 = x6 + t6 // ******************* // transpose matrix // ******************* mix2.r loc32 = loc48, loc49 // tmp0 = mixr y0, y1 mix2.l loc33 = loc48, loc49 // tmp1 = mixl y0, y1 padd2 loc45 = loc23, loc39 // y5.2 = x7 + t7 mix2.r loc34 = loc50, loc51 // tmp2 = mixr y2, y3 mix2.l loc35 = loc50, loc51 // tmp3 = mixl y2, y3 ;; //divide by 4 padd2 loc40 = loc40, r24 padd2 loc41 = loc41, r24 mix4.r loc16 = loc32, loc34 // x0 = mixr tmp0, tmp2 padd2 loc42 = loc42, r24 padd2 loc43 = loc43, r24 mix4.r loc17 = loc33, loc35 // x1 = mixr tmp1, tmp3 padd2 loc44 = loc44, r24 padd2 loc45 = loc45, r24 mix4.l loc18 = loc32, loc34 // x2 = mixl tmp0, tmp2 padd2 loc46 = loc46, r24 padd2 loc47 = loc47, r24 mix4.l loc19 = loc33, loc35 // x3 = mixl 
tmp1, tmp3 ;; pshr2 loc40 = loc40, 2 pshr2 loc41 = loc41, 2 pshr2 loc42 = loc42, 2 pshr2 loc43 = loc43, 2 mix2.r loc32 = loc52, loc53 // tmp0 = mixr y4, y5 mix2.l loc33 = loc52, loc53 // tmp1 = mixl y4, y5 mix2.r loc34 = loc54, loc55 // tmp2 = mixr y6, y7 mix2.l loc35 = loc54, loc55 // tmp3 = mixl y6, y7 ;; pshr2 loc44 = loc44, 2 pshr2 loc45 = loc45, 2 pshr2 loc46 = loc46, 2 pshr2 loc47 = loc47, 2 mix4.r loc24 = loc32, loc34 // x0.2 = mixr tmp0, tmp2 mix4.r loc25 = loc33, loc35 // x1.2 = mixr tmp1, tmp3 mix4.l loc26 = loc32, loc34 // x2.2 = mixl tmp0, tmp2 mix4.l loc27 = loc33, loc35 // x3.2 = mixl tmp1, tmp3 ;; mix2.r loc32 = loc40, loc41 // tmp0 = mixr y0.2, y1.2 mix2.l loc33 = loc40, loc41 // tmp1 = mixl y0.2, y1.2 mix2.r loc34 = loc42, loc43 // tmp2 = mixr y2.2, y3.2 mix2.l loc35 = loc42, loc43 // tmp3 = mixl y2.2, y3.2 ;; mix4.r loc20 = loc32, loc34 // x4 = mixr tmp0, tmp2 mix4.r loc21 = loc33, loc35 // x5 = mixr tmp1, tmp3 mix4.l loc22 = loc32, loc34 // x6 = mixl tmp0, tmp2 mix4.l loc23 = loc33, loc35 // x7 = mixl tmp1, tmp3 ;; mix2.r loc32 = loc44, loc45 // tmp0 = mixr y4.2, y5.2 mix2.l loc33 = loc44, loc45 // tmp1 = mixl y4.2, y5.2 mix2.r loc34 = loc46, loc47 // tmp2 = mixr y6.2, y7.2 mix2.l loc35 = loc46, loc47 // tmp3 = mixl y6.2, y7.2 ;; mix4.r loc28 = loc32, loc34 // x4.2 = mixr tmp0, tmp2 mix4.r loc29 = loc33, loc35 // x5.2 = mixr tmp1, tmp3 mix4.l loc30 = loc32, loc34 // x6.2 = mixl tmp0, tmp2 mix4.l loc31 = loc33, loc35 // x7.2 = mixl tmp1, tmp3 // ******************* // row-DCT 1st half // ******************* psub2 loc37 = loc17, loc22 // t5 = x1 - x6 psub2 loc38 = loc18, loc21 // t6 = x2 - x5 ;; padd2 loc32 = loc16, loc23 // t0 = x0 + x7 padd2 loc33 = loc17, loc22 // t1 = x1 + x6 padd2 loc34 = loc18, loc21 // t2 = x2 + x5 psub2 loc41 = loc37, loc38 // buf1 = t5 - t6 padd2 loc40 = loc37, loc38 // buf0 = t5 + t6 padd2 loc35 = loc19, loc20 // t3 = x3 + x4 ;; psub2 loc36 = loc16, loc23 // t4 = x0 - x7 pmpyshr2 loc37 = loc40, r28, 16 // t5 = buf0 * g4 
pmpyshr2 loc38 = loc41, r28, 16 // t6 = buf1 * g4 ;; psub2 loc39 = loc19, loc20 // t7 = x3 - x4 padd2 loc37 = loc37, loc40 // t5 = t5 + buf1 padd2 loc38 = loc38, loc41 // t6 = t6 + buf2 padd2 loc16 = loc32, loc35 // x0 = t0 + t3 padd2 loc17 = loc33, loc34 // x1 = t1 + t2 psub2 loc18 = loc32, loc35 // x2 = t0 - t3 ;; psub2 loc19 = loc33, loc34 // x3 = t1 - t2 padd2 loc20 = loc36, loc37 // x4 = t4 + t5 padd2 loc21 = loc38, loc39 // x5 = t6 + t7 psub2 loc22 = loc36, loc37 // x6 = t4 - t5 psub2 loc23 = loc38, loc39 // x7 = t6 - t7 ;; pmpyshr2 loc42 = loc18, r30, 16 // buf2 = x2 * c1 padd2 loc32 = loc16, loc17 // t0 = x0 + x1 pmpyshr2 loc43 = loc19, r30, 16 // buf3 = x3 * c1 pmpyshr2 loc44 = loc20, r31, 16 // buf4 = x4 * c0 psub2 loc33 = loc16, loc17 // t1 = x0 - x1 pmpyshr2 loc45 = loc21, r31, 16 // buf5 = x5 * c0 pmpyshr2 loc46 = loc22, r29, 16 // buf6 = x6 * c2 pmpyshr2 loc47 = loc23, r29, 16 // buf7 = x7 * c2 ;; padd2 loc46 = loc46, loc22 // buf6 = buf6 + x6 padd2 loc47 = loc47, loc23 // buf7 = buf7 + x7 padd2 loc34 = loc18, loc43 // t2 = x2 + (x3 * c1) ;; psub2 loc35 = loc42, loc19 // t3 = (c1 * x2) - x3 padd2 loc36 = loc20, loc45 // t4 = x4 + (x5 * c1) psub2 loc37 = loc44, loc21 // t5 = (c1 * x4) - x5 padd2 loc38 = loc22, loc47 // t6 = x6 + (buf7 * c1) psub2 loc39 = loc46, loc23 // t7 = (c1 * buf6) - x7 pmpyshr2 loc16 = loc32, r28, 16 // x0 = t0 * g4 pmpyshr2 loc17 = loc33, r28, 16 // x1 = t1 * g4 ;; padd2 loc48 = loc16, loc32 // y0 = x0 + t0 pmpyshr2 loc18 = loc34, r26, 16 // x2 = t2 * g2 pmpyshr2 loc19 = loc35, r26, 16 // x3 = t3 * g2 padd2 loc52 = loc17, loc33 // y4 = x1 + t1 pmpyshr2 loc20 = loc36, r25, 16 // x4 = t4 * g1 pmpyshr2 loc21 = loc37, r25, 16 // x5 = t5 * g1 ;; padd2 loc50 = loc18, loc34 // y2 = x2 + t2 pmpyshr2 loc22 = loc38, r27, 16 // x6 = t6 * g3 pmpyshr2 loc23 = loc39, r27, 16 // x7 = t7 * g3 padd2 loc55 = loc21, loc37 // y7 = x5 + t5 padd2 loc49 = loc20, loc36 // y1 = x4 + t4 padd2 loc54 = loc19, loc35 // y6 = x3 + t3 ;; padd2 loc51 = loc22, 
loc38 // y3 = x6 + t6 padd2 loc53 = loc23, loc39 // y5 = x7 + t7 // ******************* // row-DCT 2nd half // ******************* psub2 loc37 = loc25, loc30 // t5 = x1.2 - x6.2 psub2 loc38 = loc26, loc29 // t6 = x2.2 - x5.2 padd2 loc32 = loc24, loc31 // t0 = x0.2 + x7.2 padd2 loc33 = loc25, loc30 // t1 = x1.2 + x6.2 ;; padd2 loc34 = loc26, loc29 // t2 = x2.2 + x5.2 psub2 loc41 = loc37, loc38 // buf1 = t5 - t6 padd2 loc40 = loc37, loc38 // buf0 = t5 + t6 padd2 loc35 = loc27, loc28 // t3 = x3.2 + x4.2 ;; psub2 loc36 = loc24, loc31 // t4 = x0.2 - x7.2 pmpyshr2 loc37 = loc40, r28, 16 // t5 = buf0 * g4 pmpyshr2 loc38 = loc41, r28, 16 // t6 = buf1 * g4 ;; psub2 loc39 = loc27, loc28 // t7 = x3.2 - x4.2 padd2 loc37 = loc37, loc40 // t5 = t5 + buf0 padd2 loc38 = loc38, loc41 // t6 = t6 + buf1 padd2 loc16 = loc32, loc35 // x0 = t0 + t3 padd2 loc17 = loc33, loc34 // x1 = t1 + t2 psub2 loc18 = loc32, loc35 // x2 = t0 - t3 ;; psub2 loc19 = loc33, loc34 // x3 = t1 - t2 padd2 loc20 = loc36, loc37 // x4 = t4 + t5 padd2 loc21 = loc38, loc39 // x5 = t6 + t7 psub2 loc22 = loc36, loc37 // x6 = t4 - t5 psub2 loc23 = loc38, loc39 // x7 = t6 - t7 ;; pmpyshr2 loc42 = loc18, r30, 16 // buf2 = x2 * c1 padd2 loc32 = loc16, loc17 // t0 = x0 + x1 pmpyshr2 loc43 = loc19, r30, 16 // buf3 = x3 * c1 pmpyshr2 loc44 = loc20, r31, 16 // buf4 = x4 * c0 psub2 loc33 = loc16, loc17 // t1 = x0 - x1 pmpyshr2 loc45 = loc21, r31, 16 // buf5 = x5 * c0 pmpyshr2 loc46 = loc22, r29, 16 // buf6 = x6 * c2 pmpyshr2 loc47 = loc23, r29, 16 // buf7 = x7 * c2 ;; padd2 loc46 = loc46, loc22 // buf6 = buf6 + x6 padd2 loc47 = loc47, loc23 // buf7 = buf7 + x7 padd2 loc34 = loc18, loc43 // t2 = x2 + (x3 * c1) ;; psub2 loc35 = loc42, loc19 // t3 = (c1 * x2) - x3 padd2 loc36 = loc20, loc45 // t4 = x4 + (x5 * c0) psub2 loc37 = loc44, loc21 // t5 = (c0 * x4) - x5 padd2 loc38 = loc22, loc47 // t6 = x6 + buf7 psub2 loc39 = loc46, loc23 // t7 = buf6 - x7 pmpyshr2 loc16 = loc32, r28, 16 // x0 = t0 * g4 pmpyshr2 loc17 
= loc33, r28, 16 // x1 = t1 * g4 ;; padd2 loc40 = loc16, loc32 // y0.2 = x0 + t0 pmpyshr2 loc18 = loc34, r26, 16 // x2 = t2 * g2 pmpyshr2 loc19 = loc35, r26, 16 // x3 = t3 * g2 padd2 loc44 = loc17, loc33 // y4.2 = x1 + t1 pmpyshr2 loc20 = loc36, r25, 16 // x4 = t4 * g1 pmpyshr2 loc21 = loc37, r25, 16 // x5 = t5 * g1 ;; padd2 loc42 = loc18, loc34 // y2.2 = x2 + t2 pmpyshr2 loc22 = loc38, r27, 16 // x6 = t6 * g3 pmpyshr2 loc23 = loc39, r27, 16 // x7 = t7 * g3 padd2 loc46 = loc19, loc35 // y6.2 = x3 + t3 nop.i 0x0 nop.i 0x0 ;; // ******************* // Transpose matrix // ******************* padd2 loc41 = loc20, loc36 // y1.2 = x4 + t4 mix2.l loc32 = loc49, loc48 // tmp0 = mixr y1, y0 mix2.r loc33 = loc49, loc48 // tmp1 = mixl y1, y0 padd2 loc47 = loc21, loc37 // y7.2 = x5 + t5 mix2.l loc34 = loc51, loc50 // tmp2 = mixr y3, y2 mix2.r loc35 = loc51, loc50 // tmp3 = mixl y3, y2 ;; padd2 loc43 = loc22, loc38 // y3.2 = x6 + t6 mix4.l loc16 = loc34, loc32 // x0 = mixr tmp2, tmp0 mix4.l loc17 = loc35, loc33 // x1 = mixr tmp3, tmp1 padd2 loc45 = loc23, loc39 // y5.2 = x7 + t7 mix4.r loc18 = loc34, loc32 // x2 = mixl tmp2, tmp0 mix4.r loc19 = loc35, loc33 // x3 = mixl tmp3, tmp1 ;; padd2 loc16 = loc16, r23 mix2.l loc32 = loc41, loc40 // tmp0 = mixr y0.2, y1.2 mix2.r loc33 = loc41, loc40 // tmp1 = mixl y0.2, y1.2 padd2 loc17 = loc17, r23 mix2.l loc34 = loc43, loc42 // tmp2 = mixr y2.2, y3.2 mix2.r loc35 = loc43, loc42 // tmp3 = mixl y2.2, y3.2 ;; padd2 loc18 = loc18, r23 mix4.l loc20 = loc34, loc32 // x4 = mixr tmp2, tmp0 mix4.l loc21 = loc35, loc33 // x5 = mixr tmp3, tmp1 padd2 loc19 = loc19, r23 mix4.r loc22 = loc34, loc32 // x6 = mixl tmp2, tmp0 mix4.r loc23 = loc35, loc33 // x7 = mixl tmp3, tmp1 ;; padd2 loc20 = loc20, r23 mix2.l loc32 = loc53, loc52 // tmp0 = mixr y5, y4 mix2.r loc33 = loc53, loc52 // tmp1 = mixl y5, y4 padd2 loc21 = loc21, r23 mix2.l loc34 = loc55, loc54 // tmp2 = mixr y7, y6 mix2.r loc35 = loc55, loc54 // tmp3 = mixl y7, y6 ;; padd2 loc22 = loc22, r23 
mix4.l loc24 = loc34, loc32 // x0.2 = mixr tmp2, tmp0 mix4.l loc25 = loc35, loc33 // x1.2 = mixr tmp3, tmp1 padd2 loc23 = loc23, r23 mix4.r loc26 = loc34, loc32 // x2.2 = mixl tmp2, tmp0 mix4.r loc27 = loc35, loc33 // x3.2 = mixl tmp3, tmp1 ;; padd2 loc24 = loc24, r23 mix2.l loc32 = loc45, loc44 // tmp0 = mixr y4.2, y5.2 mix2.r loc33 = loc45, loc44 // tmp1 = mixl y4.2, y5.2 padd2 loc25 = loc25, r23 mix2.l loc34 = loc47, loc46 // tmp2 = mixr y6.2, y7.2 mix2.r loc35 = loc47, loc46 // tmp3 = mixl y6.2, y7.2 ;; padd2 loc26 = loc26, r23 mix4.l loc28 = loc34, loc32 // x4.2 = mixr tmp2, tmp0 mix4.l loc29 = loc35, loc33 // x5.2 = mixr tmp3, tmp1 padd2 loc27 = loc27, r23 mix4.r loc30 = loc34, loc32 // x6.2 = mixl tmp2, tmp0 mix4.r loc31 = loc35, loc33 // x7.2 = mixl tmp3, tmp1 ;; // ******************* // Descale // ******************* padd2 loc28 = loc28, r23 pshr2 loc16 = loc16, 3 pshr2 loc17 = loc17, 3 padd2 loc29 = loc29, r23 pshr2 loc18 = loc18, 3 pshr2 loc19 = loc19, 3 padd2 loc30 = loc30, r23 pshr2 loc20 = loc20, 3 pshr2 loc21 = loc21, 3 padd2 loc31 = loc31, r23 pshr2 loc22 = loc22, 3 pshr2 loc23 = loc23, 3 ;; pshr2 loc24 = loc24, 3 pshr2 loc25 = loc25, 3 pshr2 loc26 = loc26, 3 pshr2 loc27 = loc27, 3 pshr2 loc28 = loc28, 3 pshr2 loc29 = loc29, 3 pshr2 loc30 = loc30, 3 pshr2 loc31 = loc31, 3 ;; // ******************* // Store matrix // ******************* st8 [loc0] = loc16 st8 [loc1] = loc24 st8 [loc2] = loc17 st8 [loc3] = loc25 st8 [loc4] = loc18 st8 [loc5] = loc26 st8 [loc6] = loc19 st8 [loc7] = loc27 st8 [loc8] = loc20 st8 [loc9] = loc28 st8 [loc10] = loc21 st8 [loc11] = loc29 st8 [loc12] = loc22 st8 [loc13] = loc30 st8 [loc14] = loc23 st8 [loc15] = loc31 mov ar.pfs = r14 br.ret.sptk.many b0 .endp fdct_ia64# .common fdct#,8,8 //*********************************************** //* Here is a version of the DCT implementation * //* unoptimized in terms of command ordering. * //* This version is about 30% slower but * //* easier to understand. 
* //*********************************************** // // .pred.safe_across_calls p1-p5,p16-p63 //.text // .align 16 // .global fdct_ia64# // .proc fdct_ia64# //fdct_ia64: // .prologue // alloc r14 = ar.pfs, 1, 56, 0, 0 // // // ******************* // // Save constants // // ******************* // mov r31 = 0x32ec // c0 = tan(1pi/16) // mov r30 = 0x6a0a // c1 = tan(2pi/16) // mov r29 = 0xab0e // c2 = tan(3pi/16) // mov r28 = 0xb505 // g4 = cos(4pi/16) // mov r27 = 0xd4db // g3 = cos(3pi/16) // mov r26 = 0xec83 // g2 = cos(2pi/16) // mov r25 = 0xfb15 // g1 = cos(1pi/16) // mov r24 = 0x0002 // correction bit for descaling // mov r23 = 0x0004 // correction bit for descaling // // // ************************** // // Load Matrix into registers // // ************************** // // add loc0 = r0, r32 // ;; // mux2 r31 = r31, 0x00 // mux2 r30 = r30, 0x00 // mux2 r29 = r29, 0x00 // mux2 r28 = r28, 0x00 // mux2 r27 = r27, 0x00 // mux2 r26 = r26, 0x00 // mux2 r25 = r25, 0x00 // mux2 r24 = r24, 0x00 // mux2 r23 = r23, 0x00 // ld8 loc16 = [loc0] // add loc2 = 16, r32 // add loc4 = 32, r32 // add loc6 = 48, r32 // add loc8 = 64, r32 // add loc10 = 80, r32 // ;; // ld8 loc17 = [loc2] // ld8 loc18 = [loc4] // add loc12 = 96, r32 // ld8 loc19 = [loc6] // ld8 loc20 = [loc8] // add loc14 = 112, r32 // ;; // ld8 loc21 = [loc10] // ld8 loc22 = [loc12] // add loc1 = 8, r32 // ld8 loc23 = [loc14] // add loc3 = 24, r32 // add loc5 = 40, r32 // ;; // ld8 loc24 = [loc1] // ld8 loc25 = [loc3] // add loc7 = 56, r32 // ld8 loc26 = [loc5] // add loc9 = 72, r32 // add loc11 = 88, r32 // ;; // ld8 loc27 = [loc7] // ld8 loc28 = [loc9] // add loc13 = 104, r32 // ld8 loc29 = [loc11] // add loc15 = 120, r32 // ;; // ld8 loc30 = [loc13] // ld8 loc31 = [loc15] // ;; // // ****** // // Scale // // ****** // pshl2 loc16 = loc16, 3 // pshl2 loc17 = loc17, 3 // pshl2 loc18 = loc18, 3 // pshl2 loc19 = loc19, 3 // pshl2 loc20 = loc20, 3 // pshl2 loc21 = loc21, 3 // pshl2 loc22 = loc22, 3 // pshl2 loc23 = 
loc23, 3 // pshl2 loc24 = loc24, 3 // pshl2 loc25 = loc25, 3 // pshl2 loc26 = loc26, 3 // pshl2 loc27 = loc27, 3 // pshl2 loc28 = loc28, 3 // pshl2 loc29 = loc29, 3 // pshl2 loc30 = loc30, 3 // pshl2 loc31 = loc31, 3 // ;; // // // ******************* // // column-DTC 1st half // // ******************* // // padd2 loc32 = loc16, loc23 // t0 = x0 + x7 // padd2 loc33 = loc17, loc22 // t1 = x1 + x6 // padd2 loc34 = loc18, loc21 // t2 = x2 + x5 // padd2 loc35 = loc19, loc20 // t3 = x3 + x4 // psub2 loc36 = loc16, loc23 // t4 = x0 - x7 // psub2 loc37 = loc17, loc22 // t5 = x1 - x6 // psub2 loc38 = loc18, loc21 // t6 = x2 - x5 // psub2 loc39 = loc19, loc20 // t7 = x3 - x4 // ;; // padd2 loc40 = loc37, loc38 // buf0 = t5 + t6 // psub2 loc41 = loc37, loc38 // buf1 = t5 - t6 // ;; // pmpyshr2 loc37 = loc40, r28, 16 // t5 = buf0 * g4 // pmpyshr2 loc38 = loc41, r28, 16 // t6 = buf1 * g4 // ;; // padd2 loc37 = loc37, loc40 // t5 = t5 + buf1 // padd2 loc38 = loc38, loc41 // t6 = t6 + buf2 // ;; // padd2 loc16 = loc32, loc35 // x0 = t0 + t3 // padd2 loc17 = loc33, loc34 // x1 = t1 + t2 // psub2 loc18 = loc32, loc35 // x2 = t0 - t3 // psub2 loc19 = loc33, loc34 // x3 = t1 - t2 // padd2 loc20 = loc36, loc37 // x4 = t4 + t5 // padd2 loc21 = loc38, loc39 // x5 = t6 + t7 // psub2 loc22 = loc36, loc37 // x6 = t4 - t5 // psub2 loc23 = loc38, loc39 // x7 = t6 - t7 // ;; // // padd2 loc32 = loc16, loc17 // t0 = x0 + x1 // psub2 loc33 = loc16, loc17 // t1 = x0 - x1 // pmpyshr2 loc42 = loc18, r30, 16 // buf2 = x2 * c1 // pmpyshr2 loc43 = loc19, r30, 16 // buf3 = x3 * c1 // pmpyshr2 loc44 = loc20, r31, 16 // buf4 = x4 * c0 // pmpyshr2 loc45 = loc21, r31, 16 // buf5 = x5 * c0 // pmpyshr2 loc46 = loc22, r29, 16 // buf6 = x6 * c2 // pmpyshr2 loc47 = loc23, r29, 16 // buf7 = x7 * c2 // ;; // padd2 loc46 = loc46, loc22 // buf6 = buf6 + x6 // padd2 loc47 = loc47, loc23 // buf7 = buf7 + x7 // ;; // padd2 loc34 = loc18, loc43 // t2 = x2 + (x3 * c1) // psub2 loc35 = loc42, loc19 // t3 = (c1 * x2) - 
x3 // padd2 loc36 = loc20, loc45 // t4 = x4 + (x5 * c1) // psub2 loc37 = loc44, loc21 // t5 = (c1 * x4) - x5 // padd2 loc38 = loc22, loc47 // t6 = x6 + (x7 * c1) // psub2 loc39 = loc46, loc23 // t7 = (c1 * x6) - x7 // ;; // pmpyshr2 loc16 = loc32, r28, 16 // x0 = t0 * g4 // pmpyshr2 loc17 = loc33, r28, 16 // x1 = t1 * g4 // pmpyshr2 loc18 = loc34, r26, 16 // x2 = t2 * g2 // pmpyshr2 loc19 = loc35, r26, 16 // x3 = t3 * g2 // pmpyshr2 loc20 = loc36, r25, 16 // x4 = t4 * g1 // pmpyshr2 loc21 = loc37, r25, 16 // x5 = t5 * g1 // pmpyshr2 loc22 = loc38, r27, 16 // x6 = t6 * g3 // pmpyshr2 loc23 = loc39, r27, 16 // x7 = t7 * g3 // ;; // padd2 loc48 = loc16, loc32 // y0 = x0 + t0 // padd2 loc49 = loc20, loc36 // y1 = x4 + t4 // padd2 loc50 = loc18, loc34 // y2 = x2 + t2 // padd2 loc51 = loc22, loc38 // y3 = x6 + t6 // padd2 loc52 = loc17, loc33 // y4 = x1 + t1 // padd2 loc53 = loc23, loc39 // y5 = x7 + t7 // padd2 loc54 = loc19, loc35 // y6 = x3 + t3 // padd2 loc55 = loc21, loc37 // y7 = x5 + t5 // ;; // // // ******************* // // column-DTC 2nd half // // ******************* // // padd2 loc32 = loc24, loc31 // t0 = x0.2 + x7.2 // padd2 loc33 = loc25, loc30 // t1 = x1.2 + x6.2 // padd2 loc34 = loc26, loc29 // t2 = x2.2 + x5.2 // padd2 loc35 = loc27, loc28 // t3 = x3.2 + x4.2 // psub2 loc36 = loc24, loc31 // t4 = x0.2 - x7.2 // psub2 loc37 = loc25, loc30 // t5 = x1.2 - x6.2 // psub2 loc38 = loc26, loc29 // t6 = x2.2 - x5.2 // psub2 loc39 = loc27, loc28 // t7 = x3.2 - x4.2 // ;; // padd2 loc40 = loc37, loc38 // buf0 = t5 + t6 // psub2 loc41 = loc37, loc38 // buf1 = t5 - t6 // ;; // pmpyshr2 loc37 = loc40, r28, 16 // t5 = buf0 * g4 // pmpyshr2 loc38 = loc41, r28, 16 // t6 = buf1 * g4 // ;; // padd2 loc37 = loc37, loc40 // t5 = t5 + buf1 // padd2 loc38 = loc38, loc41 // t6 = t6 + buf2 // ;; // padd2 loc16 = loc32, loc35 // x0 = t0 + t3 // padd2 loc17 = loc33, loc34 // x1 = t1 + t2 // psub2 loc18 = loc32, loc35 // x2 = t0 - t3 // psub2 loc19 = loc33, loc34 // x3 = t1 - t2 
// padd2 loc20 = loc36, loc37 // x4 = t4 + t5 // padd2 loc21 = loc38, loc39 // x5 = t6 + t7 // psub2 loc22 = loc36, loc37 // x6 = t4 - t5 // psub2 loc23 = loc38, loc39 // x7 = t6 - t7 // ;; // padd2 loc32 = loc16, loc17 // t0 = x0 + x1 // psub2 loc33 = loc16, loc17 // t1 = x0 - x1 // pmpyshr2 loc42 = loc18, r30, 16 // buf2 = x2 * c1 // pmpyshr2 loc43 = loc19, r30, 16 // buf3 = x3 * c1 // pmpyshr2 loc44 = loc20, r31, 16 // buf4 = x4 * c0 // pmpyshr2 loc45 = loc21, r31, 16 // buf5 = x5 * c0 // pmpyshr2 loc46 = loc22, r29, 16 // buf6 = x6 * c2 // pmpyshr2 loc47 = loc23, r29, 16 // buf7 = x7 * c2 // ;; // padd2 loc46 = loc46, loc22 // buf6 = buf6 + x6 // padd2 loc47 = loc47, loc23 // buf7 = buf7 + x7 // ;; // padd2 loc34 = loc18, loc43 // t2 = x2 + (x3 * c1) // psub2 loc35 = loc42, loc19 // t3 = (c1 * x2) - x3 // padd2 loc36 = loc20, loc45 // t4 = x4 + (x5 * c1) // psub2 loc37 = loc44, loc21 // t5 = (c1 * x4) - x5 // padd2 loc38 = loc22, loc47 // t6 = x6 + (x7 * c1) // psub2 loc39 = loc46, loc23 // t7 = (c1 * x6) - x7 // ;; // pmpyshr2 loc16 = loc32, r28, 16 // x0 = t0 * g4 // pmpyshr2 loc17 = loc33, r28, 16 // x1 = t1 * g4 // pmpyshr2 loc18 = loc34, r26, 16 // x2 = t2 * g2 // pmpyshr2 loc19 = loc35, r26, 16 // x3 = t3 * g2 // pmpyshr2 loc20 = loc36, r25, 16 // x4 = t4 * g1 // pmpyshr2 loc21 = loc37, r25, 16 // x5 = t5 * g1 // pmpyshr2 loc22 = loc38, r27, 16 // x6 = t6 * g3 // pmpyshr2 loc23 = loc39, r27, 16 // x7 = t7 * g3 // ;; // padd2 loc40 = loc16, loc32 // y0.2 = x0 + t0 // padd2 loc41 = loc20, loc36 // y1.2 = x4 + t4 // padd2 loc42 = loc18, loc34 // y2.2 = x2 + t2 // padd2 loc43 = loc22, loc38 // y3.2 = x6 + t6 // padd2 loc44 = loc17, loc33 // y4.2 = x1 + t1 // padd2 loc45 = loc23, loc39 // y5.2 = x7 + t7 // padd2 loc46 = loc19, loc35 // y6.2 = x3 + t3 // padd2 loc47 = loc21, loc37 // y7.2 = x5 + t5 // ;; // padd2 loc40 = loc40, r24 // add r24 to correct rounding // padd2 loc41 = loc41, r24 // padd2 loc42 = loc42, r24 // padd2 loc43 = loc43, r24 // padd2 loc44 = 
loc44, r24 // padd2 loc45 = loc45, r24 // padd2 loc46 = loc46, r24 // padd2 loc47 = loc47, r24 // padd2 loc48 = loc48, r24 // padd2 loc49 = loc49, r24 // padd2 loc50 = loc50, r24 // padd2 loc51 = loc51, r24 // padd2 loc52 = loc52, r24 // padd2 loc53 = loc53, r24 // padd2 loc54 = loc54, r24 // padd2 loc55 = loc55, r24 // ;; // pshr2 loc40 = loc40, 2 // Divide all matrix elements through 4 // pshr2 loc41 = loc41, 2 // pshr2 loc42 = loc42, 2 // pshr2 loc43 = loc43, 2 // pshr2 loc44 = loc44, 2 // pshr2 loc45 = loc45, 2 // pshr2 loc46 = loc46, 2 // pshr2 loc47 = loc47, 2 // pshr2 loc48 = loc48, 2 // pshr2 loc49 = loc49, 2 // pshr2 loc50 = loc50, 2 // pshr2 loc51 = loc51, 2 // pshr2 loc52 = loc52, 2 // pshr2 loc53 = loc53, 2 // pshr2 loc54 = loc54, 2 // pshr2 loc55 = loc55, 2 // ;; // // // ***************** // // Transpose matrix // // ***************** // // mix2.r loc32 = loc48, loc49 // tmp0 = mixr y0, y1 // mix2.l loc33 = loc48, loc49 // tmp1 = mixl y0, y1 // mix2.r loc34 = loc50, loc51 // tmp2 = mixr y2, y3 // mix2.l loc35 = loc50, loc51 // tmp3 = mixl y2, y3 // ;; // mix4.r loc16 = loc32, loc34 // x0 = mixr tmp0, tmp2 // mix4.r loc17 = loc33, loc35 // x1 = mixr tmp1, tmp3 // mix4.l loc18 = loc32, loc34 // x2 = mixl tmp0, tmp2 // mix4.l loc19 = loc33, loc35 // x3 = mixl tmp1, tmp3 // ;; // mix2.r loc32 = loc40, loc41 // tmp0 = mixr y0.2, y1.2 // mix2.l loc33 = loc40, loc41 // tmp1 = mixl y0.2, y1.2 // mix2.r loc34 = loc42, loc43 // tmp2 = mixr y2.2, y3.2 // mix2.l loc35 = loc42, loc43 // tmp3 = mixl y2.2, y3.2 // ;; // mix4.r loc20 = loc32, loc34 // x4 = mixr tmp0, tmp2 // mix4.r loc21 = loc33, loc35 // x5 = mixr tmp1, tmp3 // mix4.l loc22 = loc32, loc34 // x6 = mixl tmp0, tmp2 // mix4.l loc23 = loc33, loc35 // x7 = mixl tmp1, tmp3 // ;; // mix2.r loc32 = loc52, loc53 // tmp0 = mixr y4, y5 // mix2.l loc33 = loc52, loc53 // tmp1 = mixl y4, y5 // mix2.r loc34 = loc54, loc55 // tmp2 = mixr y6, y7 // mix2.l loc35 = loc54, loc55 // tmp3 = mixl y6, y7 // ;; // mix4.r 
loc24 = loc32, loc34 // x0.2 = mixr tmp0, tmp2 // mix4.r loc25 = loc33, loc35 // x1.2 = mixr tmp1, tmp3 // mix4.l loc26 = loc32, loc34 // x2.2 = mixl tmp0, tmp2 // mix4.l loc27 = loc33, loc35 // x3.2 = mixl tmp1, tmp3 // ;; // mix2.r loc32 = loc44, loc45 // tmp0 = mixr y4.2, y5.2 // mix2.l loc33 = loc44, loc45 // tmp1 = mixl y4.2, y5.2 // mix2.r loc34 = loc46, loc47 // tmp2 = mixr y6.2, y6.2 // mix2.l loc35 = loc46, loc47 // tmp3 = mixl y6.2, y6.2 // ;; // mix4.r loc28 = loc32, loc34 // x4.2 = mixr tmp0, tmp2 // mix4.r loc29 = loc33, loc35 // x5.2 = mixr tmp1, tmp3 // mix4.l loc30 = loc32, loc34 // x6.2 = mixl tmp0, tmp2 // mix4.l loc31 = loc33, loc35 // x7.2 = mixl tmp1, tmp3 // ;; // // // ******************* // // row-DTC 1st half // // ******************* // // padd2 loc32 = loc16, loc23 // t0 = x0 + x7 // padd2 loc33 = loc17, loc22 // t1 = x1 + x6 // padd2 loc34 = loc18, loc21 // t2 = x2 + x5 // padd2 loc35 = loc19, loc20 // t3 = x3 + x4 // psub2 loc36 = loc16, loc23 // t4 = x0 - x7 // psub2 loc37 = loc17, loc22 // t5 = x1 - x6 // psub2 loc38 = loc18, loc21 // t6 = x2 - x5 // psub2 loc39 = loc19, loc20 // t7 = x3 - x4 // ;; // padd2 loc40 = loc37, loc38 // buf0 = t5 + t6 // psub2 loc41 = loc37, loc38 // buf1 = t5 - t6 // ;; // pmpyshr2 loc37 = loc40, r28, 16 // t5 = buf0 * g4 // pmpyshr2 loc38 = loc41, r28, 16 // t6 = buf1 * g4 // ;; // padd2 loc37 = loc37, loc40 // t5 = t5 + buf1 // padd2 loc38 = loc38, loc41 // t6 = t6 + buf2 // ;; // padd2 loc16 = loc32, loc35 // x0 = t0 + t3 // padd2 loc17 = loc33, loc34 // x1 = t1 + t2 // psub2 loc18 = loc32, loc35 // x2 = t0 - t3 // psub2 loc19 = loc33, loc34 // x3 = t1 - t2 // padd2 loc20 = loc36, loc37 // x4 = t4 + t5 // padd2 loc21 = loc38, loc39 // x5 = t6 + t7 // psub2 loc22 = loc36, loc37 // x6 = t4 - t5 // psub2 loc23 = loc38, loc39 // x7 = t6 - t7 // ;; // padd2 loc32 = loc16, loc17 // t0 = x0 + x1 // psub2 loc33 = loc16, loc17 // t1 = x0 - x1 // pmpyshr2 loc42 = loc18, r30, 16 // buf2 = x2 * c1 // pmpyshr2 loc43 
= loc19, r30, 16 // buf3 = x3 * c1 // pmpyshr2 loc44 = loc20, r31, 16 // buf4 = x4 * c0 // pmpyshr2 loc45 = loc21, r31, 16 // buf5 = x5 * c0 // pmpyshr2 loc46 = loc22, r29, 16 // buf6 = x6 * c2 // pmpyshr2 loc47 = loc23, r29, 16 // buf7 = x7 * c2 // ;; // padd2 loc46 = loc46, loc22 // buf6 = buf6 + x6 // padd2 loc47 = loc47, loc23 // buf7 = buf7 + x7 // ;; // padd2 loc34 = loc18, loc43 // t2 = x2 + (x3 * c1) // psub2 loc35 = loc42, loc19 // t3 = (c1 * x2) - x3 // padd2 loc36 = loc20, loc45 // t4 = x4 + (x5 * c1) // psub2 loc37 = loc44, loc21 // t5 = (c1 * x4) - x5 // padd2 loc38 = loc22, loc47 // t6 = x6 + (x7 * c1) // psub2 loc39 = loc46, loc23 // t7 = (c1 * x6) - x7 // ;; // pmpyshr2 loc16 = loc32, r28, 16 // x0 = t0 * g4 // pmpyshr2 loc17 = loc33, r28, 16 // x1 = t1 * g4 // pmpyshr2 loc18 = loc34, r26, 16 // x2 = t2 * g2 // pmpyshr2 loc19 = loc35, r26, 16 // x3 = t3 * g2 // pmpyshr2 loc20 = loc36, r25, 16 // x4 = t4 * g1 // pmpyshr2 loc21 = loc37, r25, 16 // x5 = t5 * g1 // pmpyshr2 loc22 = loc38, r27, 16 // x6 = t6 * g3 // pmpyshr2 loc23 = loc39, r27, 16 // x7 = t7 * g3 // ;; // padd2 loc48 = loc16, loc32 // y0 = x0 + t0 // padd2 loc49 = loc20, loc36 // y1 = x4 + t4 // padd2 loc50 = loc18, loc34 // y2 = x2 + t2 // padd2 loc51 = loc22, loc38 // y3 = x6 + t6 // padd2 loc52 = loc17, loc33 // y4 = x1 + t1 // padd2 loc53 = loc23, loc39 // y5 = x7 + t7 // padd2 loc54 = loc19, loc35 // y6 = x3 + t3 // padd2 loc55 = loc21, loc37 // y7 = x5 + t5 // ;; // // // ******************* // // row-DTC 2nd half // // ******************* // // padd2 loc32 = loc24, loc31 // t0 = x0.2 + x7.2 // padd2 loc33 = loc25, loc30 // t1 = x1.2 + x6.2 // padd2 loc34 = loc26, loc29 // t2 = x2.2 + x5.2 // padd2 loc35 = loc27, loc28 // t3 = x3.2 + x4.2 // psub2 loc36 = loc24, loc31 // t4 = x0.2 - x7.2 // psub2 loc37 = loc25, loc30 // t5 = x1.2 - x6.2 // psub2 loc38 = loc26, loc29 // t6 = x2.2 - x5.2 // psub2 loc39 = loc27, loc28 // t7 = x3.2 - x4.2 // ;; // padd2 loc40 = loc37, loc38 // buf0 = 
t5 + t6 // psub2 loc41 = loc37, loc38 // buf1 = t5 - t6 // ;; // pmpyshr2 loc37 = loc40, r28, 16 // t5 = buf0 * g4 // pmpyshr2 loc38 = loc41, r28, 16 // t6 = buf1 * g4 // ;; // padd2 loc37 = loc37, loc40 // t5 = t5 + buf1 // padd2 loc38 = loc38, loc41 // t6 = t6 + buf2 // ;; // padd2 loc16 = loc32, loc35 // x0 = t0 + t3 // padd2 loc17 = loc33, loc34 // x1 = t1 + t2 // psub2 loc18 = loc32, loc35 // x2 = t0 - t3 // psub2 loc19 = loc33, loc34 // x3 = t1 - t2 // padd2 loc20 = loc36, loc37 // x4 = t4 + t5 // padd2 loc21 = loc38, loc39 // x5 = t6 + t7 // psub2 loc22 = loc36, loc37 // x6 = t4 - t5 // psub2 loc23 = loc38, loc39 // x7 = t6 - t7 // ;; // padd2 loc32 = loc16, loc17 // t0 = x0 + x1 // psub2 loc33 = loc16, loc17 // t1 = x0 - x1 // pmpyshr2 loc42 = loc18, r30, 16 // buf2 = x2 * c1 // pmpyshr2 loc43 = loc19, r30, 16 // buf3 = x3 * c1 // pmpyshr2 loc44 = loc20, r31, 16 // buf4 = x4 * c0 // pmpyshr2 loc45 = loc21, r31, 16 // buf5 = x5 * c0 // pmpyshr2 loc46 = loc22, r29, 16 // buf6 = x6 * c2 // pmpyshr2 loc47 = loc23, r29, 16 // buf7 = x7 * c2 // ;; // padd2 loc46 = loc46, loc22 // buf6 = buf6 + x6 // padd2 loc47 = loc47, loc23 // buf7 = buf7 + x7 // ;; // padd2 loc34 = loc18, loc43 // t2 = x2 + (x3 * c1) // psub2 loc35 = loc42, loc19 // t3 = (c1 * x2) - x3 // padd2 loc36 = loc20, loc45 // t4 = x4 + (x5 * c1) // psub2 loc37 = loc44, loc21 // t5 = (c1 * x4) - x5 // padd2 loc38 = loc22, loc47 // t6 = x6 + (x7 * c1) // psub2 loc39 = loc46, loc23 // t7 = (c1 * x6) - x7 // ;; // pmpyshr2 loc16 = loc32, r28, 16 // x0 = t0 * g4 // pmpyshr2 loc17 = loc33, r28, 16 // x1 = t1 * g4 // pmpyshr2 loc18 = loc34, r26, 16 // x2 = t2 * g2 // pmpyshr2 loc19 = loc35, r26, 16 // x3 = t3 * g2 // pmpyshr2 loc20 = loc36, r25, 16 // x4 = t4 * g1 // pmpyshr2 loc21 = loc37, r25, 16 // x5 = t5 * g1 // pmpyshr2 loc22 = loc38, r27, 16 // x6 = t6 * g3 // pmpyshr2 loc23 = loc39, r27, 16 // x7 = t7 * g3 // ;; // padd2 loc40 = loc16, loc32 // y0.2 = x0 + t0 // padd2 loc41 = loc20, loc36 // y1.2 = 
x4 + t4 // padd2 loc42 = loc18, loc34 // y2.2 = x2 + t2 // padd2 loc43 = loc22, loc38 // y3.2 = x6 + t6 // padd2 loc44 = loc17, loc33 // y4.2 = x1 + t1 // padd2 loc45 = loc23, loc39 // y5.2 = x7 + t7 // padd2 loc46 = loc19, loc35 // y6.2 = x3 + t3 // padd2 loc47 = loc21, loc37 // y7.2 = x5 + t5 // ;; // // ******************* // // Transpose matrix // // ******************* // // mix2.l loc32 = loc49, loc48 // tmp0 = mixr y1, y0 // mix2.r loc33 = loc49, loc48 // tmp1 = mixl y1, y0 // mix2.l loc34 = loc51, loc50 // tmp2 = mixr y3, y2 // mix2.r loc35 = loc51, loc50 // tmp3 = mixl y3, y2 // ;; // mix4.l loc16 = loc34, loc32 // x0 = mixr tmp2, tmp0 // mix4.l loc17 = loc35, loc33 // x1 = mixr tmp3, tmp1 // mix4.r loc18 = loc34, loc32 // x2 = mixl tmp2, tmp0 // mix4.r loc19 = loc35, loc33 // x3 = mixl tmp3, tmp1 // ;; // mix2.l loc32 = loc41, loc40 // tmp0 = mixr y0.2, y1.2 // mix2.r loc33 = loc41, loc40 // tmp1 = mixl y0.2, y1.2 // mix2.l loc34 = loc43, loc42 // tmp2 = mixr y2.2, y3.2 // mix2.r loc35 = loc43, loc42 // tmp3 = mixl y2.2, y3.2 // ;; // mix4.l loc20 = loc34, loc32 // x4 = mixr tmp2, tmp0 // mix4.l loc21 = loc35, loc33 // x5 = mixr tmp3, tmp1 // mix4.r loc22 = loc34, loc32 // x6 = mixl tmp2, tmp0 // mix4.r loc23 = loc35, loc33 // x7 = mixl tmp3, tmp1 // ;; // mix2.l loc32 = loc53, loc52 // tmp0 = mixr y5, y4 // mix2.r loc33 = loc53, loc52 // tmp1 = mixl y5, y4 // mix2.l loc34 = loc55, loc54 // tmp2 = mixr y7, y6 // mix2.r loc35 = loc55, loc54 // tmp3 = mixl y7, y6 // ;; // mix4.l loc24 = loc34, loc32 // x0.2 = mixr tmp2, tmp0 // mix4.l loc25 = loc35, loc33 // x1.2 = mixr tmp3, tmp1 // mix4.r loc26 = loc34, loc32 // x2.2 = mixl tmp2, tmp0 // mix4.r loc27 = loc35, loc33 // x3.2 = mixl tmp3, tmp1 // ;; // mix2.l loc32 = loc45, loc44 // tmp0 = mixr y4.2, y5.2 // mix2.r loc33 = loc45, loc44 // tmp1 = mixl y4.2, y5.2 // mix2.l loc34 = loc47, loc46 // tmp2 = mixr y6.2, y6.2 // mix2.r loc35 = loc47, loc46 // tmp3 = mixl y6.2, y6.2 // ;; // mix4.l loc28 = loc34, 
loc32 // x4.2 = mixr tmp2, tmp0 // mix4.l loc29 = loc35, loc33 // x5.2 = mixr tmp3, tmp1 // mix4.r loc30 = loc34, loc32 // x6.2 = mixl tmp2, tmp0 // mix4.r loc31 = loc35, loc33 // x7.2 = mixl tmp3, tmp1 // ;; // // // ******** // // descale // // ******** // // padd2 loc16 = loc16, r23 // padd2 loc17 = loc17, r23 // padd2 loc18 = loc18, r23 // padd2 loc19 = loc19, r23 // padd2 loc20 = loc20, r23 // padd2 loc21 = loc21, r23 // padd2 loc22 = loc22, r23 // padd2 loc23 = loc23, r23 // padd2 loc24 = loc24, r23 // padd2 loc25 = loc25, r23 // padd2 loc26 = loc26, r23 // padd2 loc27 = loc27, r23 // padd2 loc28 = loc28, r23 // padd2 loc29 = loc29, r23 // padd2 loc30 = loc30, r23 // padd2 loc31 = loc31, r23 // ;; // pshr2 loc16 = loc16, 3 // pshr2 loc17 = loc17, 3 // pshr2 loc18 = loc18, 3 // pshr2 loc19 = loc19, 3 // pshr2 loc20 = loc20, 3 // pshr2 loc21 = loc21, 3 // pshr2 loc22 = loc22, 3 // pshr2 loc23 = loc23, 3 // pshr2 loc24 = loc24, 3 // pshr2 loc25 = loc25, 3 // pshr2 loc26 = loc26, 3 // pshr2 loc27 = loc27, 3 // pshr2 loc28 = loc28, 3 // pshr2 loc29 = loc29, 3 // pshr2 loc30 = loc30, 3 // pshr2 loc31 = loc31, 3 // ;; // // ************ // // Store Matrix // // ************ // st8 [loc0] = loc16 // st8 [loc1] = loc24 // st8 [loc2] = loc17 // st8 [loc3] = loc25 // st8 [loc4] = loc18 // st8 [loc5] = loc26 // st8 [loc6] = loc19 // st8 [loc7] = loc27 // st8 [loc8] = loc20 // st8 [loc9] = loc28 // st8 [loc10] = loc21 // st8 [loc11] = loc29 // st8 [loc12] = loc22 // st8 [loc13] = loc30 // st8 [loc14] = loc23 // st8 [loc15] = loc31 // // mov ar.pfs = r14 // br.ret.sptk.many b0 // .endp fdct_ia64# // .common fdct#,8,8 // xvidcore/src/dct/ia64_asm/idct_init.s0000664000076500007650000000463311147310721020564 0ustar xvidbuildxvidbuild// **************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 inverse discrete cosine transform - // * // * Copyright(C) 2002 Christian Schwarz, Haiko Gaisser, Sebastian Hack // * // 
* This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. // * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: idct_init.s,v 1.2 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * idct_init.s, IA-64 optimized inverse DCT // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** // addreg3 = r20 addreg4 = r21 addreg5 = r22 addreg6 = r23 one = f30 alloc r16 = ar.pfs, 1, 71, 0, 0 addl addreg1 = @gprel(.data_c0#), gp addl addreg2 = @gprel(.data_c2#), gp ;; add addreg3 = 32, addreg1 add addreg4 = 32, addreg2 add addreg5 = 64, addreg1 add addreg6 = 64, addreg2 ;; ldfp8 c0, c1 = [addreg1] ldfp8 c2, c3 = [addreg2] ;; ldfp8 c4, c5 = [addreg3], 16 ldfp8 c6, c7 = [addreg4], 16 add addreg1 = 96, addreg1 add addreg2 = 96, addreg2 ;; ldfp8 c8, c9 = [addreg5], 16 ldfp8 c10, c11 = [addreg6], 16 ;; ldfp8 c12, c13 = [addreg1] ldfp8 c14, c15 = [addreg2] ;; mov addreg1 = in0 fpack one = f1, f1 add addreg2 = 2, in0 ;; xvidcore/src/dct/ia64_asm/idct_fini.s0000664000076500007650000000345211147310721020544 0ustar xvidbuildxvidbuild// 
**************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 inverse discrete cosine transform - // * // * Copyright(C) 2002 Christian Schwarz, Haiko Gaisser, Sebastian Hack // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. // * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: idct_fini.s,v 1.2 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * idct_fini.s, IA-64 optimized inverse DCT // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** // mov ar.pfs = r16 br.ret.sptk.few b0 xvidcore/src/dct/ia64_asm/genidct.py0000775000076500007650000002052407513041670020427 0ustar xvidbuildxvidbuild#! 
/usr/bin/python # generate the linzer-feig multiply-add idct for ia64 # (c) 2002 Christian Schwarz , # Haiko Gaisser , # Sebastian Hack import math pre_shuffle = [ 0, 4, 2, 6, 1, 7, 3, 5 ] post_shuffle = [ 0, 1, 6, 3, 7, 2, 5, 4 ] constants = 16 float_scratch = range(32, 32+constants) regbase = max(float_scratch)+1 intregbase = 33 def print_matrix(matrix,s=''): if s != '': print "\n\t// %s" % s for i in range(0, 8): print "\t// ", for j in range(0, 4): print "%2d" % matrix[i*4+j], print "" def exchange_elements(list, a, b): """ Exchange two list elements """ (list[a], list[b]) = (list[b], list[a]) def alloc_regs(matrix, n): """ get the smallest register not used by the matrix """ regs = [ ] for i in range(0, n): m = regbase while m in matrix or m in regs: m = m + 1 regs.append(m) return regs def transpose_2x2_submatrix(matrix, i, j): """ transpose a 2x2 submatrix in the 8x8 matrix """ a = j b = i tmp = matrix[i*8+j] matrix[i*8+j] = matrix[a*8+b] matrix[a*8+b] = tmp tmp = matrix[i*8+j+4] matrix[i*8+j+4] = matrix[a*8+b+4] matrix[a*8+b+4] = tmp def transpose(matrix): """ register renaming for transpose """ regs = alloc_regs(matrix, 16) save_regs = regs[:] # emit code ... 
for i in range(1,8,2): for j in range(0,4): r1 = matrix[(i-1)*4+j] r2 = matrix[i*4+j] print '\tfmix.r f%d = f%d, f%d' % (save_regs.pop(0), r1, r2) print '\t;;' for i in range(0,8,2): for j in range(0,4): r1 = matrix[i*4+j] r2 = matrix[(i+1)*4+j] print '\tfmix.l f%d = f%d, f%d' % (r1, r1, r2) print '\t;;' # first stage, transpose the 2x2 matrices for i in range(1,8,2): for j in range(0,4): r = matrix[i*4+j] matrix[i*4+j] = regs.pop(0) # print_matrix(matrix) # exchange the 2x2 matrices by renaming the registers for i in range(0, 4): for j in range(i+1, 4): transpose_2x2_submatrix(matrix, i, j) # print '' # print_matrix(matrix) # print "transpose" # print_matrix(matrix) # register renaming for 8 regs containing a column def shuffle_column(matrix, col, permutation): l = [ ] for i in range(0,8): l.append(matrix[i*4+col]) for i in range(0,8): matrix[i*4+col] = l[permutation[i]] def butterfly(matrix, col, i, j, c1, c2): """ register renaming for a butterfly operation in a column """ ri = matrix[i*4+col] rj = matrix[j*4+col] regs = alloc_regs(matrix, 1) print '\t// (f%d, f%d) = (f%d, f%d) $ (%s, %s), (line %d, %d)' % \ (regs[0], rj, ri, rj, c1, c2, i, j) print '\tfpma f%d = f%d, %s, f%d' % (regs[0], rj, c1, ri) print '\tfpnma f%d = f%d, %s, f%d' % (rj, rj, c2, ri) print '\t;;' matrix[i*4+col] = regs[0] def column_idct(matrix, col): print_matrix(matrix, "before pre shuffle") shuffle_column(matrix, col, pre_shuffle) print_matrix(matrix, "after pre shuffle") butterfly(matrix, col, 0, 1, 'c0', 'c0') butterfly(matrix, col, 2, 3, 'c1', 'c2') butterfly(matrix, col, 4, 5, 'c3', 'c4') butterfly(matrix, col, 6, 7, 'c5', 'c6') print '\t;;' butterfly(matrix, col, 0, 3, 'c7', 'c7') butterfly(matrix, col, 1, 2, 'c8', 'c8') butterfly(matrix, col, 4, 6, 'c9', 'c9') butterfly(matrix, col, 5, 7, 'c10', 'c10') print '\t;;' butterfly(matrix, col, 5, 6, 'c11', 'c11') butterfly(matrix, col, 0, 4, 'c12', 'c12') butterfly(matrix, col, 3, 7, 'c14', 'c14') print '\t;;' butterfly(matrix, col, 1, 5, 
'c13', 'c13') butterfly(matrix, col, 2, 6, 'c13', 'c13') print_matrix(matrix, "before post shuffle") shuffle_column(matrix, col, post_shuffle) print_matrix(matrix, "after post shuffle") def gen_idct(matrix): for j in range(0, 2): for i in range(0, 4): print '\tfpma f%d = f%d, c0, f0' \ % (2 * (matrix[i],)) print '\t;;' for i in range(0,4): column_idct(matrix, i) print '\t;;' transpose(matrix) def gen_consts(): print 'addreg1 = r14' print 'addreg2 = r15' for i in range(0, constants): print 'c%d = f%d' % (i, float_scratch.pop(0)) sqrt2 = math.sqrt(2.0) t = [ ] s = [ ] c = [ ] for i in range(0,5): t.append(math.tan(i * math.pi / 16)) s.append(math.sin(i * math.pi / 16)) c.append(math.cos(i * math.pi / 16)) consts = [ ] consts.append(1.0 / (2.0 * sqrt2)) consts.append(-1 / t[2]) consts.append(-t[2]) consts.append(t[1]) consts.append(1 / t[1]) consts.append(t[3]) consts.append(1 / t[3]) consts.append(0.5 * c[2]) consts.append(0.5 * s[2]) consts.append(c[3] / c[1]) consts.append(s[3] / s[1]) consts.append(c[1] / s[1]) consts.append(0.5 * c[1]) consts.append(0.5 * s[1] * c[4]) consts.append(0.5 * s[1]) consts.append(1.0) print '.sdata' for i in range(0, constants): if i % 2 == 0: print '.align 16' print '.data_c%d:' % i print '.single %.30f, %.30f' % (consts[i], consts[i]) print '' def gen_load(matrix): for i in range(0, 64, 2): print '\tld2 r%d = [addreg1], 4' % (intregbase+i) print '\tld2 r%d = [addreg2], 4' % (intregbase+i+1) print '\t;;' for i in range(0, 64, 2): print '\tsxt2 r%d = r%d' % (2*(intregbase+i,)) print '\tsxt2 r%d = r%d' % (2*(intregbase+i+1,)) print '\t;;' for i in range(0, 64, 2): print '\tsetf.sig f%d = r%d' % (regbase+i, intregbase+i) print '\tsetf.sig f%d = r%d' % (regbase+i+1, intregbase+i+1) print '\t;;' for i in range(0, 64, 2): print '\tfcvt.xf f%d = f%d' % (2*(regbase+i,)) print '\tfcvt.xf f%d = f%d' % (2*(regbase+i+1,)) print '\t;;' for i in range(0, 32): print '\tfpack f%d = f%d, f%d' \ % (regbase+i, regbase+2*i, regbase+2*i+1) print '\t;;' 
""" for i in range(0, len(matrix)): print '\tld2 r18 = [addreg1], 4' print '\tld2 r19 = [addreg2], 4' print '\t;;' print '\tsxt2 r18 = r18' print '\tsxt2 r19 = r19' print '\t;;' print '\tsetf.sig f18 = r18' print '\tsetf.sig f19 = r19' print '\t;;' print '\tfcvt.xf f18 = f18' print '\tfcvt.xf f19 = f19' print '\t;;' print '\tfpack f%d = f18, f19' % (matrix[i]) print '\t;;' """ def gen_store(matrix): print '\tmov addreg1 = in0' print '\tadd addreg2 = 4, in0' print '\t;;' for i in range(0, len(matrix)): print '\tfpcvt.fx f%d = f%d' % (2*(matrix[i],)) print '\t;;' for i in range(0, len(matrix)): print '\tgetf.sig r%d = f%d' % (intregbase+i, matrix[i]) print '\t;;' for i in range(0, len(matrix)): print '\tshl r%d = r%d, 7' % (2*(intregbase+i,)) print '\t;;' for i in range(0, len(matrix)): print '\tpack4.sss r%d = r%d, r0' % (2*(intregbase+i,)) print '\t;;' for i in range(0, len(matrix)): print '\tpshr2 r%d = r%d, 7' % (2*(intregbase+i,)) print '\t;;' for i in range(0, len(matrix)): print '\tmux2 r%d = r%d, 0xe1' % (2*(intregbase+i,)) print '\t;;' for i in range(0, len(matrix), 2): print '\tst4 [addreg1] = r%d, 8' % (intregbase+i) print '\tst4 [addreg2] = r%d, 8' % (intregbase+i+1) print '\t;;' def main(): gen_consts() print '.text' print '.global idct_ia64' print '.global idct_ia64_init' print '.align 16' print '.proc idct_ia64_init' print 'idct_ia64_init:' print 'br.ret.sptk.few b0' print '.endp' print '.align 16' print '.proc idct_ia64' print 'idct_ia64:' f = open('idct_init.s') print f.read() f.close() matrix = [ ] for i in range(0,32): matrix.append(regbase + i) gen_load(matrix) # print_matrix(matrix) gen_idct(matrix) # transpose(matrix) print_matrix(matrix) gen_store(matrix) f = open('idct_fini.s') print f.read() f.close() print '.endp' if __name__ == "__main__": main() xvidcore/src/dct/ia64_asm/idct_ia64_gcc.s0000664000076500007650000010477411147310721021207 0ustar xvidbuildxvidbuild// **************************************************************************** 
// * // * XVID MPEG-4 VIDEO CODEC // * - IA64 inverse discrete cosine transform - // * // * Copyright(C) 2002 Christian Schwarz, Haiko Gaisser, Sebastian Hack // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. // * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: idct_ia64_gcc.s,v 1.2 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * idct_ia64_gcc.s, IA-64 optimized inverse DCT // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** // addreg1 = r14 addreg2 = r15 c0 = f32 c1 = f33 c2 = f34 c3 = f35 c4 = f36 c5 = f37 c6 = f38 c7 = f39 c8 = f40 c9 = f41 c10 = f42 c11 = f43 c12 = f44 c13 = f45 c14 = f46 c15 = f47 .sdata .align 16 .data_c0: .single 0.353553390593273730857504233427, 0.353553390593273730857504233427 .data_c1: .single -2.414213562373094923430016933708, -2.414213562373094923430016933708 .align 16 .data_c2: .single -0.414213562373095034452319396223, -0.414213562373095034452319396223 .data_c3: .single 0.198912367379658006072418174881, 0.198912367379658006072418174881 .align 16 .data_c4: 
.single 5.027339492125848074977056967327, 5.027339492125848074977056967327 .data_c5: .single 0.668178637919298878955487452913, 0.668178637919298878955487452913 .align 16 .data_c6: .single 1.496605762665489169904731170391, 1.496605762665489169904731170391 .data_c7: .single 0.461939766255643369241568052530, 0.461939766255643369241568052530 .align 16 .data_c8: .single 0.191341716182544890889616340246, 0.191341716182544890889616340246 .data_c9: .single 0.847759065022573476966272210120, 0.847759065022573476966272210120 .align 16 .data_c10: .single 2.847759065022573476966272210120, 2.847759065022573476966272210120 .data_c11: .single 5.027339492125848074977056967327, 5.027339492125848074977056967327 .align 16 .data_c12: .single 0.490392640201615215289621119155, 0.490392640201615215289621119155 .data_c13: .single 0.068974844820735750627882509889, 0.068974844820735750627882509889 .align 16 .data_c14: .single 0.097545161008064124041894160655, 0.097545161008064124041894160655 .data_c15: .single 1.000000000000000000000000000000, 1.000000000000000000000000000000 .text .global idct_ia64 .global idct_ia64_init .align 16 .proc idct_ia64_init idct_ia64_init: br.ret.sptk.few b0 .endp .align 16 .proc idct_ia64 idct_ia64: addreg3 = r20 addreg4 = r21 addreg5 = r22 addreg6 = r23 one = f30 alloc r16 = ar.pfs, 1, 71, 0, 0 addl addreg1 = @gprel(.data_c0#), gp addl addreg2 = @gprel(.data_c2#), gp ;; add addreg3 = 32, addreg1 add addreg4 = 32, addreg2 add addreg5 = 64, addreg1 add addreg6 = 64, addreg2 ;; ldfp8 c0, c1 = [addreg1] ldfp8 c2, c3 = [addreg2] ;; ldfp8 c4, c5 = [addreg3], 16 ldfp8 c6, c7 = [addreg4], 16 add addreg1 = 96, addreg1 add addreg2 = 96, addreg2 ;; ldfp8 c8, c9 = [addreg5], 16 ldfp8 c10, c11 = [addreg6], 16 ;; ldfp8 c12, c13 = [addreg1] ldfp8 c14, c15 = [addreg2] ;; mov addreg1 = in0 fpack one = f1, f1 add addreg2 = 2, in0 ;; ld2 r33 = [addreg1], 4 ld2 r34 = [addreg2], 4 ;; ld2 r35 = [addreg1], 4 ld2 r36 = [addreg2], 4 ;; ld2 r37 = [addreg1], 4 ld2 r38 = [addreg2], 4 ;; 
ld2 r39 = [addreg1], 4 ld2 r40 = [addreg2], 4 ;; ld2 r41 = [addreg1], 4 ld2 r42 = [addreg2], 4 ;; ld2 r43 = [addreg1], 4 ld2 r44 = [addreg2], 4 ;; ld2 r45 = [addreg1], 4 ld2 r46 = [addreg2], 4 ;; ld2 r47 = [addreg1], 4 ld2 r48 = [addreg2], 4 ;; ld2 r49 = [addreg1], 4 ld2 r50 = [addreg2], 4 ;; ld2 r51 = [addreg1], 4 ld2 r52 = [addreg2], 4 ;; ld2 r53 = [addreg1], 4 ld2 r54 = [addreg2], 4 ;; ld2 r55 = [addreg1], 4 ld2 r56 = [addreg2], 4 ;; ld2 r57 = [addreg1], 4 ld2 r58 = [addreg2], 4 ;; ld2 r59 = [addreg1], 4 ld2 r60 = [addreg2], 4 ;; ld2 r61 = [addreg1], 4 ld2 r62 = [addreg2], 4 ;; ld2 r63 = [addreg1], 4 ld2 r64 = [addreg2], 4 ;; ld2 r65 = [addreg1], 4 ld2 r66 = [addreg2], 4 ;; ld2 r67 = [addreg1], 4 ld2 r68 = [addreg2], 4 ;; ld2 r69 = [addreg1], 4 ld2 r70 = [addreg2], 4 ;; ld2 r71 = [addreg1], 4 ld2 r72 = [addreg2], 4 ;; ld2 r73 = [addreg1], 4 ld2 r74 = [addreg2], 4 ;; ld2 r75 = [addreg1], 4 ld2 r76 = [addreg2], 4 ;; ld2 r77 = [addreg1], 4 ld2 r78 = [addreg2], 4 ;; ld2 r79 = [addreg1], 4 ld2 r80 = [addreg2], 4 ;; ld2 r81 = [addreg1], 4 ld2 r82 = [addreg2], 4 ;; ld2 r83 = [addreg1], 4 ld2 r84 = [addreg2], 4 ;; ld2 r85 = [addreg1], 4 ld2 r86 = [addreg2], 4 ;; ld2 r87 = [addreg1], 4 ld2 r88 = [addreg2], 4 ;; ld2 r89 = [addreg1], 4 ld2 r90 = [addreg2], 4 ;; ld2 r91 = [addreg1], 4 ld2 r92 = [addreg2], 4 ;; ld2 r93 = [addreg1], 4 ld2 r94 = [addreg2], 4 ;; ld2 r95 = [addreg1], 4 ld2 r96 = [addreg2], 4 ;; sxt2 r33 = r33 sxt2 r34 = r34 sxt2 r35 = r35 sxt2 r36 = r36 sxt2 r37 = r37 sxt2 r38 = r38 sxt2 r39 = r39 sxt2 r40 = r40 sxt2 r41 = r41 sxt2 r42 = r42 sxt2 r43 = r43 sxt2 r44 = r44 sxt2 r45 = r45 sxt2 r46 = r46 sxt2 r47 = r47 sxt2 r48 = r48 sxt2 r49 = r49 sxt2 r50 = r50 sxt2 r51 = r51 sxt2 r52 = r52 sxt2 r53 = r53 sxt2 r54 = r54 sxt2 r55 = r55 sxt2 r56 = r56 sxt2 r57 = r57 sxt2 r58 = r58 sxt2 r59 = r59 sxt2 r60 = r60 sxt2 r61 = r61 sxt2 r62 = r62 sxt2 r63 = r63 sxt2 r64 = r64 sxt2 r65 = r65 sxt2 r66 = r66 sxt2 r67 = r67 sxt2 r68 = r68 sxt2 r69 = r69 sxt2 r70 = r70 sxt2 r71 
= r71 sxt2 r72 = r72 sxt2 r73 = r73 sxt2 r74 = r74 sxt2 r75 = r75 sxt2 r76 = r76 sxt2 r77 = r77 sxt2 r78 = r78 sxt2 r79 = r79 sxt2 r80 = r80 sxt2 r81 = r81 sxt2 r82 = r82 sxt2 r83 = r83 sxt2 r84 = r84 sxt2 r85 = r85 sxt2 r86 = r86 sxt2 r87 = r87 sxt2 r88 = r88 sxt2 r89 = r89 sxt2 r90 = r90 sxt2 r91 = r91 sxt2 r92 = r92 sxt2 r93 = r93 sxt2 r94 = r94 sxt2 r95 = r95 sxt2 r96 = r96 ;; setf.sig f48 = r33 setf.sig f49 = r34 setf.sig f50 = r35 setf.sig f51 = r36 setf.sig f52 = r37 setf.sig f53 = r38 setf.sig f54 = r39 setf.sig f55 = r40 setf.sig f56 = r41 setf.sig f57 = r42 setf.sig f58 = r43 setf.sig f59 = r44 setf.sig f60 = r45 setf.sig f61 = r46 setf.sig f62 = r47 setf.sig f63 = r48 setf.sig f64 = r49 setf.sig f65 = r50 setf.sig f66 = r51 setf.sig f67 = r52 setf.sig f68 = r53 setf.sig f69 = r54 setf.sig f70 = r55 setf.sig f71 = r56 setf.sig f72 = r57 setf.sig f73 = r58 setf.sig f74 = r59 setf.sig f75 = r60 setf.sig f76 = r61 setf.sig f77 = r62 setf.sig f78 = r63 setf.sig f79 = r64 setf.sig f80 = r65 setf.sig f81 = r66 setf.sig f82 = r67 setf.sig f83 = r68 setf.sig f84 = r69 setf.sig f85 = r70 setf.sig f86 = r71 setf.sig f87 = r72 setf.sig f88 = r73 setf.sig f89 = r74 setf.sig f90 = r75 setf.sig f91 = r76 setf.sig f92 = r77 setf.sig f93 = r78 setf.sig f94 = r79 setf.sig f95 = r80 setf.sig f96 = r81 setf.sig f97 = r82 setf.sig f98 = r83 setf.sig f99 = r84 setf.sig f100 = r85 setf.sig f101 = r86 setf.sig f102 = r87 setf.sig f103 = r88 setf.sig f104 = r89 setf.sig f105 = r90 setf.sig f106 = r91 setf.sig f107 = r92 setf.sig f108 = r93 setf.sig f109 = r94 setf.sig f110 = r95 setf.sig f111 = r96 ;; fcvt.xf f48 = f48 fcvt.xf f49 = f49 fcvt.xf f50 = f50 fcvt.xf f51 = f51 fcvt.xf f52 = f52 fcvt.xf f53 = f53 fcvt.xf f54 = f54 fcvt.xf f55 = f55 fcvt.xf f56 = f56 fcvt.xf f57 = f57 fcvt.xf f58 = f58 fcvt.xf f59 = f59 fcvt.xf f60 = f60 fcvt.xf f61 = f61 fcvt.xf f62 = f62 fcvt.xf f63 = f63 fcvt.xf f64 = f64 fcvt.xf f65 = f65 fcvt.xf f66 = f66 fcvt.xf f67 = f67 fcvt.xf f68 = f68 
fcvt.xf f69 = f69 fcvt.xf f70 = f70 fcvt.xf f71 = f71 fcvt.xf f72 = f72 fcvt.xf f73 = f73 fcvt.xf f74 = f74 fcvt.xf f75 = f75 fcvt.xf f76 = f76 fcvt.xf f77 = f77 fcvt.xf f78 = f78 fcvt.xf f79 = f79 fcvt.xf f80 = f80 fcvt.xf f81 = f81 fcvt.xf f82 = f82 fcvt.xf f83 = f83 fcvt.xf f84 = f84 fcvt.xf f85 = f85 fcvt.xf f86 = f86 fcvt.xf f87 = f87 fcvt.xf f88 = f88 fcvt.xf f89 = f89 fcvt.xf f90 = f90 fcvt.xf f91 = f91 fcvt.xf f92 = f92 fcvt.xf f93 = f93 fcvt.xf f94 = f94 fcvt.xf f95 = f95 fcvt.xf f96 = f96 fcvt.xf f97 = f97 fcvt.xf f98 = f98 fcvt.xf f99 = f99 fcvt.xf f100 = f100 fcvt.xf f101 = f101 fcvt.xf f102 = f102 fcvt.xf f103 = f103 fcvt.xf f104 = f104 fcvt.xf f105 = f105 fcvt.xf f106 = f106 fcvt.xf f107 = f107 fcvt.xf f108 = f108 fcvt.xf f109 = f109 fcvt.xf f110 = f110 fcvt.xf f111 = f111 ;; fpack f48 = f48, f49 ;; fpack f49 = f50, f51 ;; fpack f50 = f52, f53 ;; fpack f51 = f54, f55 ;; fpack f52 = f56, f57 ;; fpack f53 = f58, f59 ;; fpack f54 = f60, f61 ;; fpack f55 = f62, f63 ;; fpack f56 = f64, f65 ;; fpack f57 = f66, f67 ;; fpack f58 = f68, f69 ;; fpack f59 = f70, f71 ;; fpack f60 = f72, f73 ;; fpack f61 = f74, f75 ;; fpack f62 = f76, f77 ;; fpack f63 = f78, f79 ;; fpack f64 = f80, f81 ;; fpack f65 = f82, f83 ;; fpack f66 = f84, f85 ;; fpack f67 = f86, f87 ;; fpack f68 = f88, f89 ;; fpack f69 = f90, f91 ;; fpack f70 = f92, f93 ;; fpack f71 = f94, f95 ;; fpack f72 = f96, f97 ;; fpack f73 = f98, f99 ;; fpack f74 = f100, f101 ;; fpack f75 = f102, f103 ;; fpack f76 = f104, f105 ;; fpack f77 = f106, f107 ;; fpack f78 = f108, f109 ;; fpack f79 = f110, f111 ;; fpma f48 = f48, c0, f0 fpma f49 = f49, c0, f0 fpma f50 = f50, c0, f0 fpma f51 = f51, c0, f0 ;; // before pre shuffle // 48 49 50 51 // 52 53 54 55 // 56 57 58 59 // 60 61 62 63 // 64 65 66 67 // 68 69 70 71 // 72 73 74 75 // 76 77 78 79 // after pre shuffle // 48 49 50 51 // 64 53 54 55 // 56 57 58 59 // 72 61 62 63 // 52 65 66 67 // 76 69 70 71 // 60 73 74 75 // 68 77 78 79 // (f80, f64) = (f48, f64) $ (c0, c0), 
(line 0, 1) fpma f80 = f64, c0, f48 fpnma f64 = f64, c0, f48 ;; // (f48, f72) = (f56, f72) $ (c1, c2), (line 2, 3) fpma f48 = f72, c1, f56 fpnma f72 = f72, c2, f56 ;; // (f56, f76) = (f52, f76) $ (c3, c4), (line 4, 5) fpma f56 = f76, c3, f52 fpnma f76 = f76, c4, f52 ;; // (f52, f68) = (f60, f68) $ (c5, c6), (line 6, 7) fpma f52 = f68, c5, f60 fpnma f68 = f68, c6, f60 ;; ;; // (f60, f72) = (f80, f72) $ (c7, c7), (line 0, 3) fpma f60 = f72, c7, f80 fpnma f72 = f72, c7, f80 ;; // (f80, f48) = (f64, f48) $ (c8, c8), (line 1, 2) fpma f80 = f48, c8, f64 fpnma f48 = f48, c8, f64 ;; // (f64, f52) = (f56, f52) $ (c9, c9), (line 4, 6) fpma f64 = f52, c9, f56 fpnma f52 = f52, c9, f56 ;; // (f56, f68) = (f76, f68) $ (c10, c10), (line 5, 7) fpma f56 = f68, c10, f76 fpnma f68 = f68, c10, f76 ;; ;; // (f76, f52) = (f56, f52) $ (c11, c11), (line 5, 6) fpma f76 = f52, c11, f56 fpnma f52 = f52, c11, f56 ;; // (f56, f64) = (f60, f64) $ (c12, c12), (line 0, 4) fpma f56 = f64, c12, f60 fpnma f64 = f64, c12, f60 ;; // (f60, f68) = (f72, f68) $ (c14, c14), (line 3, 7) fpma f60 = f68, c14, f72 fpnma f68 = f68, c14, f72 ;; ;; // (f72, f76) = (f80, f76) $ (c13, c13), (line 1, 5) fpma f72 = f76, c13, f80 fpnma f76 = f76, c13, f80 ;; // (f80, f52) = (f48, f52) $ (c13, c13), (line 2, 6) fpma f80 = f52, c13, f48 fpnma f52 = f52, c13, f48 ;; // before post shuffle // 56 49 50 51 // 72 53 54 55 // 80 57 58 59 // 60 61 62 63 // 64 65 66 67 // 76 69 70 71 // 52 73 74 75 // 68 77 78 79 // after post shuffle // 56 49 50 51 // 72 53 54 55 // 52 57 58 59 // 60 61 62 63 // 68 65 66 67 // 80 69 70 71 // 76 73 74 75 // 64 77 78 79 // before pre shuffle // 56 49 50 51 // 72 53 54 55 // 52 57 58 59 // 60 61 62 63 // 68 65 66 67 // 80 69 70 71 // 76 73 74 75 // 64 77 78 79 // after pre shuffle // 56 49 50 51 // 72 65 54 55 // 52 57 58 59 // 60 73 62 63 // 68 53 66 67 // 80 77 70 71 // 76 61 74 75 // 64 69 78 79 // (f48, f65) = (f49, f65) $ (c0, c0), (line 0, 1) fpma f48 = f65, c0, f49 fpnma f65 = f65, c0, 
f49 ;; // (f49, f73) = (f57, f73) $ (c1, c2), (line 2, 3) fpma f49 = f73, c1, f57 fpnma f73 = f73, c2, f57 ;; // (f57, f77) = (f53, f77) $ (c3, c4), (line 4, 5) fpma f57 = f77, c3, f53 fpnma f77 = f77, c4, f53 ;; // (f53, f69) = (f61, f69) $ (c5, c6), (line 6, 7) fpma f53 = f69, c5, f61 fpnma f69 = f69, c6, f61 ;; ;; // (f61, f73) = (f48, f73) $ (c7, c7), (line 0, 3) fpma f61 = f73, c7, f48 fpnma f73 = f73, c7, f48 ;; // (f48, f49) = (f65, f49) $ (c8, c8), (line 1, 2) fpma f48 = f49, c8, f65 fpnma f49 = f49, c8, f65 ;; // (f65, f53) = (f57, f53) $ (c9, c9), (line 4, 6) fpma f65 = f53, c9, f57 fpnma f53 = f53, c9, f57 ;; // (f57, f69) = (f77, f69) $ (c10, c10), (line 5, 7) fpma f57 = f69, c10, f77 fpnma f69 = f69, c10, f77 ;; ;; // (f77, f53) = (f57, f53) $ (c11, c11), (line 5, 6) fpma f77 = f53, c11, f57 fpnma f53 = f53, c11, f57 ;; // (f57, f65) = (f61, f65) $ (c12, c12), (line 0, 4) fpma f57 = f65, c12, f61 fpnma f65 = f65, c12, f61 ;; // (f61, f69) = (f73, f69) $ (c14, c14), (line 3, 7) fpma f61 = f69, c14, f73 fpnma f69 = f69, c14, f73 ;; ;; // (f73, f77) = (f48, f77) $ (c13, c13), (line 1, 5) fpma f73 = f77, c13, f48 fpnma f77 = f77, c13, f48 ;; // (f48, f53) = (f49, f53) $ (c13, c13), (line 2, 6) fpma f48 = f53, c13, f49 fpnma f53 = f53, c13, f49 ;; // before post shuffle // 56 57 50 51 // 72 73 54 55 // 52 48 58 59 // 60 61 62 63 // 68 65 66 67 // 80 77 70 71 // 76 53 74 75 // 64 69 78 79 // after post shuffle // 56 57 50 51 // 72 73 54 55 // 52 53 58 59 // 60 61 62 63 // 68 69 66 67 // 80 48 70 71 // 76 77 74 75 // 64 65 78 79 // before pre shuffle // 56 57 50 51 // 72 73 54 55 // 52 53 58 59 // 60 61 62 63 // 68 69 66 67 // 80 48 70 71 // 76 77 74 75 // 64 65 78 79 // after pre shuffle // 56 57 50 51 // 72 73 66 55 // 52 53 58 59 // 60 61 74 63 // 68 69 54 67 // 80 48 78 71 // 76 77 62 75 // 64 65 70 79 // (f49, f66) = (f50, f66) $ (c0, c0), (line 0, 1) fpma f49 = f66, c0, f50 fpnma f66 = f66, c0, f50 ;; // (f50, f74) = (f58, f74) $ (c1, c2), (line 2, 3) 
fpma f50 = f74, c1, f58 fpnma f74 = f74, c2, f58 ;; // (f58, f78) = (f54, f78) $ (c3, c4), (line 4, 5) fpma f58 = f78, c3, f54 fpnma f78 = f78, c4, f54 ;; // (f54, f70) = (f62, f70) $ (c5, c6), (line 6, 7) fpma f54 = f70, c5, f62 fpnma f70 = f70, c6, f62 ;; ;; // (f62, f74) = (f49, f74) $ (c7, c7), (line 0, 3) fpma f62 = f74, c7, f49 fpnma f74 = f74, c7, f49 ;; // (f49, f50) = (f66, f50) $ (c8, c8), (line 1, 2) fpma f49 = f50, c8, f66 fpnma f50 = f50, c8, f66 ;; // (f66, f54) = (f58, f54) $ (c9, c9), (line 4, 6) fpma f66 = f54, c9, f58 fpnma f54 = f54, c9, f58 ;; // (f58, f70) = (f78, f70) $ (c10, c10), (line 5, 7) fpma f58 = f70, c10, f78 fpnma f70 = f70, c10, f78 ;; ;; // (f78, f54) = (f58, f54) $ (c11, c11), (line 5, 6) fpma f78 = f54, c11, f58 fpnma f54 = f54, c11, f58 ;; // (f58, f66) = (f62, f66) $ (c12, c12), (line 0, 4) fpma f58 = f66, c12, f62 fpnma f66 = f66, c12, f62 ;; // (f62, f70) = (f74, f70) $ (c14, c14), (line 3, 7) fpma f62 = f70, c14, f74 fpnma f70 = f70, c14, f74 ;; ;; // (f74, f78) = (f49, f78) $ (c13, c13), (line 1, 5) fpma f74 = f78, c13, f49 fpnma f78 = f78, c13, f49 ;; // (f49, f54) = (f50, f54) $ (c13, c13), (line 2, 6) fpma f49 = f54, c13, f50 fpnma f54 = f54, c13, f50 ;; // before post shuffle // 56 57 58 51 // 72 73 74 55 // 52 53 49 59 // 60 61 62 63 // 68 69 66 67 // 80 48 78 71 // 76 77 54 75 // 64 65 70 79 // after post shuffle // 56 57 58 51 // 72 73 74 55 // 52 53 54 59 // 60 61 62 63 // 68 69 70 67 // 80 48 49 71 // 76 77 78 75 // 64 65 66 79 // before pre shuffle // 56 57 58 51 // 72 73 74 55 // 52 53 54 59 // 60 61 62 63 // 68 69 70 67 // 80 48 49 71 // 76 77 78 75 // 64 65 66 79 // after pre shuffle // 56 57 58 51 // 72 73 74 67 // 52 53 54 59 // 60 61 62 75 // 68 69 70 55 // 80 48 49 79 // 76 77 78 63 // 64 65 66 71 // (f50, f67) = (f51, f67) $ (c0, c0), (line 0, 1) fpma f50 = f67, c0, f51 fpnma f67 = f67, c0, f51 ;; // (f51, f75) = (f59, f75) $ (c1, c2), (line 2, 3) fpma f51 = f75, c1, f59 fpnma f75 = f75, c2, f59 ;; // 
(f59, f79) = (f55, f79) $ (c3, c4), (line 4, 5) fpma f59 = f79, c3, f55 fpnma f79 = f79, c4, f55 ;; // (f55, f71) = (f63, f71) $ (c5, c6), (line 6, 7) fpma f55 = f71, c5, f63 fpnma f71 = f71, c6, f63 ;; ;; // (f63, f75) = (f50, f75) $ (c7, c7), (line 0, 3) fpma f63 = f75, c7, f50 fpnma f75 = f75, c7, f50 ;; // (f50, f51) = (f67, f51) $ (c8, c8), (line 1, 2) fpma f50 = f51, c8, f67 fpnma f51 = f51, c8, f67 ;; // (f67, f55) = (f59, f55) $ (c9, c9), (line 4, 6) fpma f67 = f55, c9, f59 fpnma f55 = f55, c9, f59 ;; // (f59, f71) = (f79, f71) $ (c10, c10), (line 5, 7) fpma f59 = f71, c10, f79 fpnma f71 = f71, c10, f79 ;; ;; // (f79, f55) = (f59, f55) $ (c11, c11), (line 5, 6) fpma f79 = f55, c11, f59 fpnma f55 = f55, c11, f59 ;; // (f59, f67) = (f63, f67) $ (c12, c12), (line 0, 4) fpma f59 = f67, c12, f63 fpnma f67 = f67, c12, f63 ;; // (f63, f71) = (f75, f71) $ (c14, c14), (line 3, 7) fpma f63 = f71, c14, f75 fpnma f71 = f71, c14, f75 ;; ;; // (f75, f79) = (f50, f79) $ (c13, c13), (line 1, 5) fpma f75 = f79, c13, f50 fpnma f79 = f79, c13, f50 ;; // (f50, f55) = (f51, f55) $ (c13, c13), (line 2, 6) fpma f50 = f55, c13, f51 fpnma f55 = f55, c13, f51 ;; // before post shuffle // 56 57 58 59 // 72 73 74 75 // 52 53 54 50 // 60 61 62 63 // 68 69 70 67 // 80 48 49 79 // 76 77 78 55 // 64 65 66 71 // after post shuffle // 56 57 58 59 // 72 73 74 75 // 52 53 54 55 // 60 61 62 63 // 68 69 70 71 // 80 48 49 50 // 76 77 78 79 // 64 65 66 67 ;; fmix.r f51 = f56, f72 fmix.r f81 = f57, f73 fmix.r f82 = f58, f74 fmix.r f83 = f59, f75 fmix.r f84 = f52, f60 fmix.r f85 = f53, f61 fmix.r f86 = f54, f62 fmix.r f87 = f55, f63 fmix.r f88 = f68, f80 fmix.r f89 = f69, f48 fmix.r f90 = f70, f49 fmix.r f91 = f71, f50 fmix.r f92 = f76, f64 fmix.r f93 = f77, f65 fmix.r f94 = f78, f66 fmix.r f95 = f79, f67 ;; fmix.l f56 = f56, f72 fmix.l f57 = f57, f73 fmix.l f58 = f58, f74 fmix.l f59 = f59, f75 fmix.l f52 = f52, f60 fmix.l f53 = f53, f61 fmix.l f54 = f54, f62 fmix.l f55 = f55, f63 fmix.l f68 = f68, 
f80 fmix.l f69 = f69, f48 fmix.l f70 = f70, f49 fmix.l f71 = f71, f50 fmix.l f76 = f76, f64 fmix.l f77 = f77, f65 fmix.l f78 = f78, f66 fmix.l f79 = f79, f67 ;; fpma f56 = f56, c0, f0 fpma f52 = f52, c0, f0 fpma f68 = f68, c0, f0 fpma f76 = f76, c0, f0 ;; // before pre shuffle // 56 52 68 76 // 51 84 88 92 // 57 53 69 77 // 81 85 89 93 // 58 54 70 78 // 82 86 90 94 // 59 55 71 79 // 83 87 91 95 // after pre shuffle // 56 52 68 76 // 58 84 88 92 // 57 53 69 77 // 59 85 89 93 // 51 54 70 78 // 83 86 90 94 // 81 55 71 79 // 82 87 91 95 // (f48, f58) = (f56, f58) $ (c0, c0), (line 0, 1) fpma f48 = f58, c0, f56 fpnma f58 = f58, c0, f56 ;; // (f49, f59) = (f57, f59) $ (c1, c2), (line 2, 3) fpma f49 = f59, c1, f57 fpnma f59 = f59, c2, f57 ;; // (f50, f83) = (f51, f83) $ (c3, c4), (line 4, 5) fpma f50 = f83, c3, f51 fpnma f83 = f83, c4, f51 ;; // (f51, f82) = (f81, f82) $ (c5, c6), (line 6, 7) fpma f51 = f82, c5, f81 fpnma f82 = f82, c6, f81 ;; ;; // (f56, f59) = (f48, f59) $ (c7, c7), (line 0, 3) fpma f56 = f59, c7, f48 fpnma f59 = f59, c7, f48 ;; // (f48, f49) = (f58, f49) $ (c8, c8), (line 1, 2) fpma f48 = f49, c8, f58 fpnma f49 = f49, c8, f58 ;; // (f57, f51) = (f50, f51) $ (c9, c9), (line 4, 6) fpma f57 = f51, c9, f50 fpnma f51 = f51, c9, f50 ;; // (f50, f82) = (f83, f82) $ (c10, c10), (line 5, 7) fpma f50 = f82, c10, f83 fpnma f82 = f82, c10, f83 ;; ;; // (f58, f51) = (f50, f51) $ (c11, c11), (line 5, 6) fpma f58 = f51, c11, f50 fpnma f51 = f51, c11, f50 ;; // (f50, f57) = (f56, f57) $ (c12, c12), (line 0, 4) fpma f50 = f57, c12, f56 fpnma f57 = f57, c12, f56 ;; // (f56, f82) = (f59, f82) $ (c14, c14), (line 3, 7) fpma f56 = f82, c14, f59 fpnma f82 = f82, c14, f59 ;; ;; // (f59, f58) = (f48, f58) $ (c13, c13), (line 1, 5) fpma f59 = f58, c13, f48 fpnma f58 = f58, c13, f48 ;; // (f48, f51) = (f49, f51) $ (c13, c13), (line 2, 6) fpma f48 = f51, c13, f49 fpnma f51 = f51, c13, f49 ;; // before post shuffle // 50 52 68 76 // 59 84 88 92 // 48 53 69 77 // 56 85 89 93 // 57 
54 70 78 // 58 86 90 94 // 51 55 71 79 // 82 87 91 95 // after post shuffle // 50 52 68 76 // 59 84 88 92 // 51 53 69 77 // 56 85 89 93 // 82 54 70 78 // 48 86 90 94 // 58 55 71 79 // 57 87 91 95 // before pre shuffle // 50 52 68 76 // 59 84 88 92 // 51 53 69 77 // 56 85 89 93 // 82 54 70 78 // 48 86 90 94 // 58 55 71 79 // 57 87 91 95 // after pre shuffle // 50 52 68 76 // 59 54 88 92 // 51 53 69 77 // 56 55 89 93 // 82 84 70 78 // 48 87 90 94 // 58 85 71 79 // 57 86 91 95 // (f49, f54) = (f52, f54) $ (c0, c0), (line 0, 1) fpma f49 = f54, c0, f52 fpnma f54 = f54, c0, f52 ;; // (f52, f55) = (f53, f55) $ (c1, c2), (line 2, 3) fpma f52 = f55, c1, f53 fpnma f55 = f55, c2, f53 ;; // (f53, f87) = (f84, f87) $ (c3, c4), (line 4, 5) fpma f53 = f87, c3, f84 fpnma f87 = f87, c4, f84 ;; // (f60, f86) = (f85, f86) $ (c5, c6), (line 6, 7) fpma f60 = f86, c5, f85 fpnma f86 = f86, c6, f85 ;; ;; // (f61, f55) = (f49, f55) $ (c7, c7), (line 0, 3) fpma f61 = f55, c7, f49 fpnma f55 = f55, c7, f49 ;; // (f49, f52) = (f54, f52) $ (c8, c8), (line 1, 2) fpma f49 = f52, c8, f54 fpnma f52 = f52, c8, f54 ;; // (f54, f60) = (f53, f60) $ (c9, c9), (line 4, 6) fpma f54 = f60, c9, f53 fpnma f60 = f60, c9, f53 ;; // (f53, f86) = (f87, f86) $ (c10, c10), (line 5, 7) fpma f53 = f86, c10, f87 fpnma f86 = f86, c10, f87 ;; ;; // (f62, f60) = (f53, f60) $ (c11, c11), (line 5, 6) fpma f62 = f60, c11, f53 fpnma f60 = f60, c11, f53 ;; // (f53, f54) = (f61, f54) $ (c12, c12), (line 0, 4) fpma f53 = f54, c12, f61 fpnma f54 = f54, c12, f61 ;; // (f61, f86) = (f55, f86) $ (c14, c14), (line 3, 7) fpma f61 = f86, c14, f55 fpnma f86 = f86, c14, f55 ;; ;; // (f55, f62) = (f49, f62) $ (c13, c13), (line 1, 5) fpma f55 = f62, c13, f49 fpnma f62 = f62, c13, f49 ;; // (f49, f60) = (f52, f60) $ (c13, c13), (line 2, 6) fpma f49 = f60, c13, f52 fpnma f60 = f60, c13, f52 ;; // before post shuffle // 50 53 68 76 // 59 55 88 92 // 51 49 69 77 // 56 61 89 93 // 82 54 70 78 // 48 62 90 94 // 58 60 71 79 // 57 86 91 95 // 
after post shuffle // 50 53 68 76 // 59 55 88 92 // 51 60 69 77 // 56 61 89 93 // 82 86 70 78 // 48 49 90 94 // 58 62 71 79 // 57 54 91 95 // before pre shuffle // 50 53 68 76 // 59 55 88 92 // 51 60 69 77 // 56 61 89 93 // 82 86 70 78 // 48 49 90 94 // 58 62 71 79 // 57 54 91 95 // after pre shuffle // 50 53 68 76 // 59 55 70 92 // 51 60 69 77 // 56 61 71 93 // 82 86 88 78 // 48 49 91 94 // 58 62 89 79 // 57 54 90 95 // (f52, f70) = (f68, f70) $ (c0, c0), (line 0, 1) fpma f52 = f70, c0, f68 fpnma f70 = f70, c0, f68 ;; // (f63, f71) = (f69, f71) $ (c1, c2), (line 2, 3) fpma f63 = f71, c1, f69 fpnma f71 = f71, c2, f69 ;; // (f64, f91) = (f88, f91) $ (c3, c4), (line 4, 5) fpma f64 = f91, c3, f88 fpnma f91 = f91, c4, f88 ;; // (f65, f90) = (f89, f90) $ (c5, c6), (line 6, 7) fpma f65 = f90, c5, f89 fpnma f90 = f90, c6, f89 ;; ;; // (f66, f71) = (f52, f71) $ (c7, c7), (line 0, 3) fpma f66 = f71, c7, f52 fpnma f71 = f71, c7, f52 ;; // (f52, f63) = (f70, f63) $ (c8, c8), (line 1, 2) fpma f52 = f63, c8, f70 fpnma f63 = f63, c8, f70 ;; // (f67, f65) = (f64, f65) $ (c9, c9), (line 4, 6) fpma f67 = f65, c9, f64 fpnma f65 = f65, c9, f64 ;; // (f64, f90) = (f91, f90) $ (c10, c10), (line 5, 7) fpma f64 = f90, c10, f91 fpnma f90 = f90, c10, f91 ;; ;; // (f68, f65) = (f64, f65) $ (c11, c11), (line 5, 6) fpma f68 = f65, c11, f64 fpnma f65 = f65, c11, f64 ;; // (f64, f67) = (f66, f67) $ (c12, c12), (line 0, 4) fpma f64 = f67, c12, f66 fpnma f67 = f67, c12, f66 ;; // (f66, f90) = (f71, f90) $ (c14, c14), (line 3, 7) fpma f66 = f90, c14, f71 fpnma f90 = f90, c14, f71 ;; ;; // (f69, f68) = (f52, f68) $ (c13, c13), (line 1, 5) fpma f69 = f68, c13, f52 fpnma f68 = f68, c13, f52 ;; // (f52, f65) = (f63, f65) $ (c13, c13), (line 2, 6) fpma f52 = f65, c13, f63 fpnma f65 = f65, c13, f63 ;; // before post shuffle // 50 53 64 76 // 59 55 69 92 // 51 60 52 77 // 56 61 66 93 // 82 86 67 78 // 48 49 68 94 // 58 62 65 79 // 57 54 90 95 // after post shuffle // 50 53 64 76 // 59 55 69 92 // 51 60 
65 77 // 56 61 66 93 // 82 86 90 78 // 48 49 52 94 // 58 62 68 79 // 57 54 67 95 // before pre shuffle // 50 53 64 76 // 59 55 69 92 // 51 60 65 77 // 56 61 66 93 // 82 86 90 78 // 48 49 52 94 // 58 62 68 79 // 57 54 67 95 // after pre shuffle // 50 53 64 76 // 59 55 69 78 // 51 60 65 77 // 56 61 66 79 // 82 86 90 92 // 48 49 52 95 // 58 62 68 93 // 57 54 67 94 // (f63, f78) = (f76, f78) $ (c0, c0), (line 0, 1) fpma f63 = f78, c0, f76 fpnma f78 = f78, c0, f76 ;; // (f70, f79) = (f77, f79) $ (c1, c2), (line 2, 3) fpma f70 = f79, c1, f77 fpnma f79 = f79, c2, f77 ;; // (f71, f95) = (f92, f95) $ (c3, c4), (line 4, 5) fpma f71 = f95, c3, f92 fpnma f95 = f95, c4, f92 ;; // (f72, f94) = (f93, f94) $ (c5, c6), (line 6, 7) fpma f72 = f94, c5, f93 fpnma f94 = f94, c6, f93 ;; ;; // (f73, f79) = (f63, f79) $ (c7, c7), (line 0, 3) fpma f73 = f79, c7, f63 fpnma f79 = f79, c7, f63 ;; // (f63, f70) = (f78, f70) $ (c8, c8), (line 1, 2) fpma f63 = f70, c8, f78 fpnma f70 = f70, c8, f78 ;; // (f74, f72) = (f71, f72) $ (c9, c9), (line 4, 6) fpma f74 = f72, c9, f71 fpnma f72 = f72, c9, f71 ;; // (f71, f94) = (f95, f94) $ (c10, c10), (line 5, 7) fpma f71 = f94, c10, f95 fpnma f94 = f94, c10, f95 ;; ;; // (f75, f72) = (f71, f72) $ (c11, c11), (line 5, 6) fpma f75 = f72, c11, f71 fpnma f72 = f72, c11, f71 ;; // (f71, f74) = (f73, f74) $ (c12, c12), (line 0, 4) fpma f71 = f74, c12, f73 fpnma f74 = f74, c12, f73 ;; // (f73, f94) = (f79, f94) $ (c14, c14), (line 3, 7) fpma f73 = f94, c14, f79 fpnma f94 = f94, c14, f79 ;; ;; // (f76, f75) = (f63, f75) $ (c13, c13), (line 1, 5) fpma f76 = f75, c13, f63 fpnma f75 = f75, c13, f63 ;; // (f63, f72) = (f70, f72) $ (c13, c13), (line 2, 6) fpma f63 = f72, c13, f70 fpnma f72 = f72, c13, f70 ;; // before post shuffle // 50 53 64 71 // 59 55 69 76 // 51 60 65 63 // 56 61 66 73 // 82 86 90 74 // 48 49 52 75 // 58 62 68 72 // 57 54 67 94 // after post shuffle // 50 53 64 71 // 59 55 69 76 // 51 60 65 72 // 56 61 66 73 // 82 86 90 94 // 48 49 52 63 // 58 62 
68 75 // 57 54 67 74 ;; fmix.r f70 = f50, f59 fmix.r f77 = f53, f55 fmix.r f78 = f64, f69 fmix.r f79 = f71, f76 fmix.r f80 = f51, f56 fmix.r f81 = f60, f61 fmix.r f83 = f65, f66 fmix.r f84 = f72, f73 fmix.r f85 = f82, f48 fmix.r f87 = f86, f49 fmix.r f88 = f90, f52 fmix.r f89 = f94, f63 fmix.r f91 = f58, f57 fmix.r f92 = f62, f54 fmix.r f93 = f68, f67 fmix.r f95 = f75, f74 ;; fmix.l f50 = f50, f59 fmix.l f53 = f53, f55 fmix.l f64 = f64, f69 fmix.l f71 = f71, f76 fmix.l f51 = f51, f56 fmix.l f60 = f60, f61 fmix.l f65 = f65, f66 fmix.l f72 = f72, f73 fmix.l f82 = f82, f48 fmix.l f86 = f86, f49 fmix.l f90 = f90, f52 fmix.l f94 = f94, f63 fmix.l f58 = f58, f57 fmix.l f62 = f62, f54 fmix.l f68 = f68, f67 fmix.l f75 = f75, f74 ;; // 50 51 82 58 // 70 80 85 91 // 53 60 86 62 // 77 81 87 92 // 64 65 90 68 // 78 83 88 93 // 71 72 94 75 // 79 84 89 95 mov addreg1 = in0 add addreg2 = 4, in0 ;; fpcvt.fx f50 = f50 fpcvt.fx f51 = f51 fpcvt.fx f82 = f82 fpcvt.fx f58 = f58 fpcvt.fx f70 = f70 fpcvt.fx f80 = f80 fpcvt.fx f85 = f85 fpcvt.fx f91 = f91 fpcvt.fx f53 = f53 fpcvt.fx f60 = f60 fpcvt.fx f86 = f86 fpcvt.fx f62 = f62 fpcvt.fx f77 = f77 fpcvt.fx f81 = f81 fpcvt.fx f87 = f87 fpcvt.fx f92 = f92 fpcvt.fx f64 = f64 fpcvt.fx f65 = f65 fpcvt.fx f90 = f90 fpcvt.fx f68 = f68 fpcvt.fx f78 = f78 fpcvt.fx f83 = f83 fpcvt.fx f88 = f88 fpcvt.fx f93 = f93 fpcvt.fx f71 = f71 fpcvt.fx f72 = f72 fpcvt.fx f94 = f94 fpcvt.fx f75 = f75 fpcvt.fx f79 = f79 fpcvt.fx f84 = f84 fpcvt.fx f89 = f89 fpcvt.fx f95 = f95 ;; getf.sig r33 = f50 getf.sig r34 = f51 getf.sig r35 = f82 getf.sig r36 = f58 getf.sig r37 = f70 getf.sig r38 = f80 getf.sig r39 = f85 getf.sig r40 = f91 getf.sig r41 = f53 getf.sig r42 = f60 getf.sig r43 = f86 getf.sig r44 = f62 getf.sig r45 = f77 getf.sig r46 = f81 getf.sig r47 = f87 getf.sig r48 = f92 getf.sig r49 = f64 getf.sig r50 = f65 getf.sig r51 = f90 getf.sig r52 = f68 getf.sig r53 = f78 getf.sig r54 = f83 getf.sig r55 = f88 getf.sig r56 = f93 getf.sig r57 = f71 getf.sig r58 = 
f72 getf.sig r59 = f94 getf.sig r60 = f75 getf.sig r61 = f79 getf.sig r62 = f84 getf.sig r63 = f89 getf.sig r64 = f95 ;; shl r33 = r33, 7 shl r34 = r34, 7 shl r35 = r35, 7 shl r36 = r36, 7 shl r37 = r37, 7 shl r38 = r38, 7 shl r39 = r39, 7 shl r40 = r40, 7 shl r41 = r41, 7 shl r42 = r42, 7 shl r43 = r43, 7 shl r44 = r44, 7 shl r45 = r45, 7 shl r46 = r46, 7 shl r47 = r47, 7 shl r48 = r48, 7 shl r49 = r49, 7 shl r50 = r50, 7 shl r51 = r51, 7 shl r52 = r52, 7 shl r53 = r53, 7 shl r54 = r54, 7 shl r55 = r55, 7 shl r56 = r56, 7 shl r57 = r57, 7 shl r58 = r58, 7 shl r59 = r59, 7 shl r60 = r60, 7 shl r61 = r61, 7 shl r62 = r62, 7 shl r63 = r63, 7 shl r64 = r64, 7 ;; pack4.sss r33 = r33, r0 pack4.sss r34 = r34, r0 pack4.sss r35 = r35, r0 pack4.sss r36 = r36, r0 pack4.sss r37 = r37, r0 pack4.sss r38 = r38, r0 pack4.sss r39 = r39, r0 pack4.sss r40 = r40, r0 pack4.sss r41 = r41, r0 pack4.sss r42 = r42, r0 pack4.sss r43 = r43, r0 pack4.sss r44 = r44, r0 pack4.sss r45 = r45, r0 pack4.sss r46 = r46, r0 pack4.sss r47 = r47, r0 pack4.sss r48 = r48, r0 pack4.sss r49 = r49, r0 pack4.sss r50 = r50, r0 pack4.sss r51 = r51, r0 pack4.sss r52 = r52, r0 pack4.sss r53 = r53, r0 pack4.sss r54 = r54, r0 pack4.sss r55 = r55, r0 pack4.sss r56 = r56, r0 pack4.sss r57 = r57, r0 pack4.sss r58 = r58, r0 pack4.sss r59 = r59, r0 pack4.sss r60 = r60, r0 pack4.sss r61 = r61, r0 pack4.sss r62 = r62, r0 pack4.sss r63 = r63, r0 pack4.sss r64 = r64, r0 ;; pshr2 r33 = r33, 7 pshr2 r34 = r34, 7 pshr2 r35 = r35, 7 pshr2 r36 = r36, 7 pshr2 r37 = r37, 7 pshr2 r38 = r38, 7 pshr2 r39 = r39, 7 pshr2 r40 = r40, 7 pshr2 r41 = r41, 7 pshr2 r42 = r42, 7 pshr2 r43 = r43, 7 pshr2 r44 = r44, 7 pshr2 r45 = r45, 7 pshr2 r46 = r46, 7 pshr2 r47 = r47, 7 pshr2 r48 = r48, 7 pshr2 r49 = r49, 7 pshr2 r50 = r50, 7 pshr2 r51 = r51, 7 pshr2 r52 = r52, 7 pshr2 r53 = r53, 7 pshr2 r54 = r54, 7 pshr2 r55 = r55, 7 pshr2 r56 = r56, 7 pshr2 r57 = r57, 7 pshr2 r58 = r58, 7 pshr2 r59 = r59, 7 pshr2 r60 = r60, 7 pshr2 r61 = r61, 7 pshr2 r62 
= r62, 7
	pshr2 r63 = r63, 7
	pshr2 r64 = r64, 7
	;;
	mux2 r33 = r33, 0xe1
	mux2 r34 = r34, 0xe1
	mux2 r35 = r35, 0xe1
	mux2 r36 = r36, 0xe1
	mux2 r37 = r37, 0xe1
	mux2 r38 = r38, 0xe1
	mux2 r39 = r39, 0xe1
	mux2 r40 = r40, 0xe1
	mux2 r41 = r41, 0xe1
	mux2 r42 = r42, 0xe1
	mux2 r43 = r43, 0xe1
	mux2 r44 = r44, 0xe1
	mux2 r45 = r45, 0xe1
	mux2 r46 = r46, 0xe1
	mux2 r47 = r47, 0xe1
	mux2 r48 = r48, 0xe1
	mux2 r49 = r49, 0xe1
	mux2 r50 = r50, 0xe1
	mux2 r51 = r51, 0xe1
	mux2 r52 = r52, 0xe1
	mux2 r53 = r53, 0xe1
	mux2 r54 = r54, 0xe1
	mux2 r55 = r55, 0xe1
	mux2 r56 = r56, 0xe1
	mux2 r57 = r57, 0xe1
	mux2 r58 = r58, 0xe1
	mux2 r59 = r59, 0xe1
	mux2 r60 = r60, 0xe1
	mux2 r61 = r61, 0xe1
	mux2 r62 = r62, 0xe1
	mux2 r63 = r63, 0xe1
	mux2 r64 = r64, 0xe1
	;;
	st4 [addreg1] = r33, 8
	st4 [addreg2] = r34, 8
	;;
	st4 [addreg1] = r35, 8
	st4 [addreg2] = r36, 8
	;;
	st4 [addreg1] = r37, 8
	st4 [addreg2] = r38, 8
	;;
	st4 [addreg1] = r39, 8
	st4 [addreg2] = r40, 8
	;;
	st4 [addreg1] = r41, 8
	st4 [addreg2] = r42, 8
	;;
	st4 [addreg1] = r43, 8
	st4 [addreg2] = r44, 8
	;;
	st4 [addreg1] = r45, 8
	st4 [addreg2] = r46, 8
	;;
	st4 [addreg1] = r47, 8
	st4 [addreg2] = r48, 8
	;;
	st4 [addreg1] = r49, 8
	st4 [addreg2] = r50, 8
	;;
	st4 [addreg1] = r51, 8
	st4 [addreg2] = r52, 8
	;;
	st4 [addreg1] = r53, 8
	st4 [addreg2] = r54, 8
	;;
	st4 [addreg1] = r55, 8
	st4 [addreg2] = r56, 8
	;;
	st4 [addreg1] = r57, 8
	st4 [addreg2] = r58, 8
	;;
	st4 [addreg1] = r59, 8
	st4 [addreg2] = r60, 8
	;;
	st4 [addreg1] = r61, 8
	st4 [addreg2] = r62, 8
	;;
	st4 [addreg1] = r63, 8
	st4 [addreg2] = r64, 8
	;;
	mov ar.pfs = r16
	br.ret.sptk.few b0
	.endp

// File: xvidcore/src/dct/ia64_asm/idct_ia64_ecc.s
// ****************************************************************************
// *
// *  XVID MPEG-4 VIDEO CODEC
// *  - IA64 inverse discrete cosine transform -
// *
// *  Copyright(C) 2002 Christian Schwarz, Haiko Gaisser, Sebastian Hack
// *
// *  This program is free software; you can redistribute it and/or modify it
// *  under the terms of the GNU General Public License as published by
// *  the Free Software Foundation; either version 2 of the License, or
// *  (at your option) any later version.
// *
// *  This program is distributed in the hope that it will be useful,
// *  but WITHOUT ANY WARRANTY; without even the implied warranty of
// *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// *  GNU General Public License for more details.
// *
// *  You should have received a copy of the GNU General Public License
// *  along with this program; if not, write to the Free Software
// *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
// *
// * $Id: idct_ia64_ecc.s,v 1.2 2009-02-19 17:07:29 Isibaar Exp $
// *
// ***************************************************************************/
//
// ****************************************************************************
// *
// *  idct_ia64_ecc.s, IA-64 optimized inverse DCT
// *
// *  This version was implemented during an IA-64 practical training at
// *  the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/)
// *
// ****************************************************************************
//

// Register aliases: two address registers and the sixteen constant registers.
addreg1 = r14
addreg2 = r15
c0 = f32
c1 = f33
c2 = f34
c3 = f35
c4 = f36
c5 = f37
c6 = f38
c7 = f39
c8 = f40
c9 = f41
c10 = f42
c11 = f43
c12 = f44
c13 = f45
c14 = f46
c15 = f47

// Each constant is stored twice so that it fills both single-precision
// halves of a parallel floating-point register.
.sdata
.align 16
.data_c0:	real4 0.353553390593273730857504233427, 0.353553390593273730857504233427
.data_c1:	real4 -2.414213562373094923430016933708, -2.414213562373094923430016933708
.align 16
.data_c2:	real4 -0.414213562373095034452319396223, -0.414213562373095034452319396223
.data_c3:	real4 0.198912367379658006072418174881, 0.198912367379658006072418174881
.align 16
.data_c4:	real4 5.027339492125848074977056967327, 5.027339492125848074977056967327
.data_c5:	real4 0.668178637919298878955487452913, 0.668178637919298878955487452913
.align 16
.data_c6:	real4 1.496605762665489169904731170391, 1.496605762665489169904731170391
.data_c7:	real4 0.461939766255643369241568052530, 0.461939766255643369241568052530
.align 16
.data_c8:	real4 0.191341716182544890889616340246, 0.191341716182544890889616340246
.data_c9:	real4 0.847759065022573476966272210120, 0.847759065022573476966272210120
.align 16
.data_c10:	real4 2.847759065022573476966272210120, 2.847759065022573476966272210120
.data_c11:	real4 5.027339492125848074977056967327, 5.027339492125848074977056967327
.align 16
.data_c12:	real4 0.490392640201615215289621119155, 0.490392640201615215289621119155
.data_c13:	real4 0.068974844820735750627882509889, 0.068974844820735750627882509889
.align 16
.data_c14:	real4 0.097545161008064124041894160655, 0.097545161008064124041894160655
.data_c15:	real4 1.000000000000000000000000000000, 1.000000000000000000000000000000

.text
.global idct_ia64
.global idct_ia64_init

.align 16
.proc idct_ia64_init
idct_ia64_init:
	br.ret.sptk.few b0
.endp

.align 16
.proc idct_ia64
idct_ia64:
	addreg3 = r20
	addreg4 = r21
	addreg5 = r22
	addreg6 = r23
	one = f30

	alloc r16 = ar.pfs, 1, 71, 0, 0
	// Load the sixteen packed constants c0..c15 from the .sdata table.
	addl addreg1 = @gprel(.data_c0#), gp
	addl addreg2 = @gprel(.data_c2#), gp
	;;
	add addreg3 = 32, addreg1
	add addreg4 = 32, addreg2
	add addreg5 = 64, addreg1
	add addreg6 = 64, addreg2
	;;
	ldfp8 c0, c1 = [addreg1]
	ldfp8 c2, c3 = [addreg2]
	;;
	ldfp8 c4, c5 = [addreg3], 16
	ldfp8 c6, c7 = [addreg4], 16
	add addreg1 = 96, addreg1
	add addreg2 = 96, addreg2
	;;
	ldfp8 c8, c9 = [addreg5], 16
	ldfp8 c10, c11 = [addreg6], 16
	;;
	ldfp8 c12, c13 = [addreg1]
	ldfp8 c14, c15 = [addreg2]
	;;
	// Load the 16-bit coefficients of the block through two interleaved
	// pointers (in0 and in0 + 2), each advancing by 4 bytes per load.
	mov addreg1 = in0
	fpack one = f1, f1	// one = packed (1.0, 1.0)
	add addreg2 = 2, in0
	;;
	ld2 r33 = [addreg1], 4
	ld2 r34 = [addreg2], 4
	;;
	ld2 r35 = [addreg1], 4
	ld2 r36 = [addreg2], 4
	;;
	ld2 r37 = [addreg1], 4
	ld2 r38 = [addreg2], 4
	;;
	ld2 r39 = [addreg1], 4
	ld2 r40 = [addreg2], 4
	;;
	ld2 r41 = [addreg1], 4
	ld2 r42 = [addreg2], 4
	;;
	ld2 r43 = [addreg1], 4
	ld2 r44 = [addreg2], 4
	;;
	ld2 r45 = [addreg1], 4
	ld2 r46 = [addreg2], 4
	;;
	ld2 r47 = [addreg1], 4
	ld2 r48 = [addreg2], 4
	;;
	ld2 r49 = [addreg1], 4
	ld2 r50 =
[addreg2], 4 ;; ld2 r51 = [addreg1], 4 ld2 r52 = [addreg2], 4 ;; ld2 r53 = [addreg1], 4 ld2 r54 = [addreg2], 4 ;; ld2 r55 = [addreg1], 4 ld2 r56 = [addreg2], 4 ;; ld2 r57 = [addreg1], 4 ld2 r58 = [addreg2], 4 ;; ld2 r59 = [addreg1], 4 ld2 r60 = [addreg2], 4 ;; ld2 r61 = [addreg1], 4 ld2 r62 = [addreg2], 4 ;; ld2 r63 = [addreg1], 4 ld2 r64 = [addreg2], 4 ;; ld2 r65 = [addreg1], 4 ld2 r66 = [addreg2], 4 ;; ld2 r67 = [addreg1], 4 ld2 r68 = [addreg2], 4 ;; ld2 r69 = [addreg1], 4 ld2 r70 = [addreg2], 4 ;; ld2 r71 = [addreg1], 4 ld2 r72 = [addreg2], 4 ;; ld2 r73 = [addreg1], 4 ld2 r74 = [addreg2], 4 ;; ld2 r75 = [addreg1], 4 ld2 r76 = [addreg2], 4 ;; ld2 r77 = [addreg1], 4 ld2 r78 = [addreg2], 4 ;; ld2 r79 = [addreg1], 4 ld2 r80 = [addreg2], 4 ;; ld2 r81 = [addreg1], 4 ld2 r82 = [addreg2], 4 ;; ld2 r83 = [addreg1], 4 ld2 r84 = [addreg2], 4 ;; ld2 r85 = [addreg1], 4 ld2 r86 = [addreg2], 4 ;; ld2 r87 = [addreg1], 4 ld2 r88 = [addreg2], 4 ;; ld2 r89 = [addreg1], 4 ld2 r90 = [addreg2], 4 ;; ld2 r91 = [addreg1], 4 ld2 r92 = [addreg2], 4 ;; ld2 r93 = [addreg1], 4 ld2 r94 = [addreg2], 4 ;; ld2 r95 = [addreg1], 4 ld2 r96 = [addreg2], 4 ;; sxt2 r33 = r33 sxt2 r34 = r34 sxt2 r35 = r35 sxt2 r36 = r36 sxt2 r37 = r37 sxt2 r38 = r38 sxt2 r39 = r39 sxt2 r40 = r40 sxt2 r41 = r41 sxt2 r42 = r42 sxt2 r43 = r43 sxt2 r44 = r44 sxt2 r45 = r45 sxt2 r46 = r46 sxt2 r47 = r47 sxt2 r48 = r48 sxt2 r49 = r49 sxt2 r50 = r50 sxt2 r51 = r51 sxt2 r52 = r52 sxt2 r53 = r53 sxt2 r54 = r54 sxt2 r55 = r55 sxt2 r56 = r56 sxt2 r57 = r57 sxt2 r58 = r58 sxt2 r59 = r59 sxt2 r60 = r60 sxt2 r61 = r61 sxt2 r62 = r62 sxt2 r63 = r63 sxt2 r64 = r64 sxt2 r65 = r65 sxt2 r66 = r66 sxt2 r67 = r67 sxt2 r68 = r68 sxt2 r69 = r69 sxt2 r70 = r70 sxt2 r71 = r71 sxt2 r72 = r72 sxt2 r73 = r73 sxt2 r74 = r74 sxt2 r75 = r75 sxt2 r76 = r76 sxt2 r77 = r77 sxt2 r78 = r78 sxt2 r79 = r79 sxt2 r80 = r80 sxt2 r81 = r81 sxt2 r82 = r82 sxt2 r83 = r83 sxt2 r84 = r84 sxt2 r85 = r85 sxt2 r86 = r86 sxt2 r87 = r87 sxt2 r88 = r88 sxt2 r89 = r89 
sxt2 r90 = r90 sxt2 r91 = r91 sxt2 r92 = r92 sxt2 r93 = r93 sxt2 r94 = r94 sxt2 r95 = r95 sxt2 r96 = r96 ;; setf.sig f48 = r33 setf.sig f49 = r34 setf.sig f50 = r35 setf.sig f51 = r36 setf.sig f52 = r37 setf.sig f53 = r38 setf.sig f54 = r39 setf.sig f55 = r40 setf.sig f56 = r41 setf.sig f57 = r42 setf.sig f58 = r43 setf.sig f59 = r44 setf.sig f60 = r45 setf.sig f61 = r46 setf.sig f62 = r47 setf.sig f63 = r48 setf.sig f64 = r49 setf.sig f65 = r50 setf.sig f66 = r51 setf.sig f67 = r52 setf.sig f68 = r53 setf.sig f69 = r54 setf.sig f70 = r55 setf.sig f71 = r56 setf.sig f72 = r57 setf.sig f73 = r58 setf.sig f74 = r59 setf.sig f75 = r60 setf.sig f76 = r61 setf.sig f77 = r62 setf.sig f78 = r63 setf.sig f79 = r64 setf.sig f80 = r65 setf.sig f81 = r66 setf.sig f82 = r67 setf.sig f83 = r68 setf.sig f84 = r69 setf.sig f85 = r70 setf.sig f86 = r71 setf.sig f87 = r72 setf.sig f88 = r73 setf.sig f89 = r74 setf.sig f90 = r75 setf.sig f91 = r76 setf.sig f92 = r77 setf.sig f93 = r78 setf.sig f94 = r79 setf.sig f95 = r80 setf.sig f96 = r81 setf.sig f97 = r82 setf.sig f98 = r83 setf.sig f99 = r84 setf.sig f100 = r85 setf.sig f101 = r86 setf.sig f102 = r87 setf.sig f103 = r88 setf.sig f104 = r89 setf.sig f105 = r90 setf.sig f106 = r91 setf.sig f107 = r92 setf.sig f108 = r93 setf.sig f109 = r94 setf.sig f110 = r95 setf.sig f111 = r96 ;; fcvt.xf f48 = f48 fcvt.xf f49 = f49 fcvt.xf f50 = f50 fcvt.xf f51 = f51 fcvt.xf f52 = f52 fcvt.xf f53 = f53 fcvt.xf f54 = f54 fcvt.xf f55 = f55 fcvt.xf f56 = f56 fcvt.xf f57 = f57 fcvt.xf f58 = f58 fcvt.xf f59 = f59 fcvt.xf f60 = f60 fcvt.xf f61 = f61 fcvt.xf f62 = f62 fcvt.xf f63 = f63 fcvt.xf f64 = f64 fcvt.xf f65 = f65 fcvt.xf f66 = f66 fcvt.xf f67 = f67 fcvt.xf f68 = f68 fcvt.xf f69 = f69 fcvt.xf f70 = f70 fcvt.xf f71 = f71 fcvt.xf f72 = f72 fcvt.xf f73 = f73 fcvt.xf f74 = f74 fcvt.xf f75 = f75 fcvt.xf f76 = f76 fcvt.xf f77 = f77 fcvt.xf f78 = f78 fcvt.xf f79 = f79 fcvt.xf f80 = f80 fcvt.xf f81 = f81 fcvt.xf f82 = f82 fcvt.xf f83 = f83 fcvt.xf f84 
= f84 fcvt.xf f85 = f85 fcvt.xf f86 = f86 fcvt.xf f87 = f87 fcvt.xf f88 = f88 fcvt.xf f89 = f89 fcvt.xf f90 = f90 fcvt.xf f91 = f91 fcvt.xf f92 = f92 fcvt.xf f93 = f93 fcvt.xf f94 = f94 fcvt.xf f95 = f95 fcvt.xf f96 = f96 fcvt.xf f97 = f97 fcvt.xf f98 = f98 fcvt.xf f99 = f99 fcvt.xf f100 = f100 fcvt.xf f101 = f101 fcvt.xf f102 = f102 fcvt.xf f103 = f103 fcvt.xf f104 = f104 fcvt.xf f105 = f105 fcvt.xf f106 = f106 fcvt.xf f107 = f107 fcvt.xf f108 = f108 fcvt.xf f109 = f109 fcvt.xf f110 = f110 fcvt.xf f111 = f111 ;; fpack f48 = f48, f49 ;; fpack f49 = f50, f51 ;; fpack f50 = f52, f53 ;; fpack f51 = f54, f55 ;; fpack f52 = f56, f57 ;; fpack f53 = f58, f59 ;; fpack f54 = f60, f61 ;; fpack f55 = f62, f63 ;; fpack f56 = f64, f65 ;; fpack f57 = f66, f67 ;; fpack f58 = f68, f69 ;; fpack f59 = f70, f71 ;; fpack f60 = f72, f73 ;; fpack f61 = f74, f75 ;; fpack f62 = f76, f77 ;; fpack f63 = f78, f79 ;; fpack f64 = f80, f81 ;; fpack f65 = f82, f83 ;; fpack f66 = f84, f85 ;; fpack f67 = f86, f87 ;; fpack f68 = f88, f89 ;; fpack f69 = f90, f91 ;; fpack f70 = f92, f93 ;; fpack f71 = f94, f95 ;; fpack f72 = f96, f97 ;; fpack f73 = f98, f99 ;; fpack f74 = f100, f101 ;; fpack f75 = f102, f103 ;; fpack f76 = f104, f105 ;; fpack f77 = f106, f107 ;; fpack f78 = f108, f109 ;; fpack f79 = f110, f111 ;; fpma f48 = f48, c0, f0 fpma f49 = f49, c0, f0 fpma f50 = f50, c0, f0 fpma f51 = f51, c0, f0 ;; // before pre shuffle // 48 49 50 51 // 52 53 54 55 // 56 57 58 59 // 60 61 62 63 // 64 65 66 67 // 68 69 70 71 // 72 73 74 75 // 76 77 78 79 // after pre shuffle // 48 49 50 51 // 64 53 54 55 // 56 57 58 59 // 72 61 62 63 // 52 65 66 67 // 76 69 70 71 // 60 73 74 75 // 68 77 78 79 // (f80, f64) = (f48, f64) $ (c0, c0), (line 0, 1) fpma f80 = f64, c0, f48 fpnma f64 = f64, c0, f48 ;; // (f48, f72) = (f56, f72) $ (c1, c2), (line 2, 3) fpma f48 = f72, c1, f56 fpnma f72 = f72, c2, f56 ;; // (f56, f76) = (f52, f76) $ (c3, c4), (line 4, 5) fpma f56 = f76, c3, f52 fpnma f76 = f76, c4, f52 ;; // (f52, f68) 
= (f60, f68) $ (c5, c6), (line 6, 7) fpma f52 = f68, c5, f60 fpnma f68 = f68, c6, f60 ;; ;; // (f60, f72) = (f80, f72) $ (c7, c7), (line 0, 3) fpma f60 = f72, c7, f80 fpnma f72 = f72, c7, f80 ;; // (f80, f48) = (f64, f48) $ (c8, c8), (line 1, 2) fpma f80 = f48, c8, f64 fpnma f48 = f48, c8, f64 ;; // (f64, f52) = (f56, f52) $ (c9, c9), (line 4, 6) fpma f64 = f52, c9, f56 fpnma f52 = f52, c9, f56 ;; // (f56, f68) = (f76, f68) $ (c10, c10), (line 5, 7) fpma f56 = f68, c10, f76 fpnma f68 = f68, c10, f76 ;; ;; // (f76, f52) = (f56, f52) $ (c11, c11), (line 5, 6) fpma f76 = f52, c11, f56 fpnma f52 = f52, c11, f56 ;; // (f56, f64) = (f60, f64) $ (c12, c12), (line 0, 4) fpma f56 = f64, c12, f60 fpnma f64 = f64, c12, f60 ;; // (f60, f68) = (f72, f68) $ (c14, c14), (line 3, 7) fpma f60 = f68, c14, f72 fpnma f68 = f68, c14, f72 ;; ;; // (f72, f76) = (f80, f76) $ (c13, c13), (line 1, 5) fpma f72 = f76, c13, f80 fpnma f76 = f76, c13, f80 ;; // (f80, f52) = (f48, f52) $ (c13, c13), (line 2, 6) fpma f80 = f52, c13, f48 fpnma f52 = f52, c13, f48 ;; // before post shuffle // 56 49 50 51 // 72 53 54 55 // 80 57 58 59 // 60 61 62 63 // 64 65 66 67 // 76 69 70 71 // 52 73 74 75 // 68 77 78 79 // after post shuffle // 56 49 50 51 // 72 53 54 55 // 52 57 58 59 // 60 61 62 63 // 68 65 66 67 // 80 69 70 71 // 76 73 74 75 // 64 77 78 79 // before pre shuffle // 56 49 50 51 // 72 53 54 55 // 52 57 58 59 // 60 61 62 63 // 68 65 66 67 // 80 69 70 71 // 76 73 74 75 // 64 77 78 79 // after pre shuffle // 56 49 50 51 // 72 65 54 55 // 52 57 58 59 // 60 73 62 63 // 68 53 66 67 // 80 77 70 71 // 76 61 74 75 // 64 69 78 79 // (f48, f65) = (f49, f65) $ (c0, c0), (line 0, 1) fpma f48 = f65, c0, f49 fpnma f65 = f65, c0, f49 ;; // (f49, f73) = (f57, f73) $ (c1, c2), (line 2, 3) fpma f49 = f73, c1, f57 fpnma f73 = f73, c2, f57 ;; // (f57, f77) = (f53, f77) $ (c3, c4), (line 4, 5) fpma f57 = f77, c3, f53 fpnma f77 = f77, c4, f53 ;; // (f53, f69) = (f61, f69) $ (c5, c6), (line 6, 7) fpma f53 = f69, c5, 
f61 fpnma f69 = f69, c6, f61 ;; ;; // (f61, f73) = (f48, f73) $ (c7, c7), (line 0, 3) fpma f61 = f73, c7, f48 fpnma f73 = f73, c7, f48 ;; // (f48, f49) = (f65, f49) $ (c8, c8), (line 1, 2) fpma f48 = f49, c8, f65 fpnma f49 = f49, c8, f65 ;; // (f65, f53) = (f57, f53) $ (c9, c9), (line 4, 6) fpma f65 = f53, c9, f57 fpnma f53 = f53, c9, f57 ;; // (f57, f69) = (f77, f69) $ (c10, c10), (line 5, 7) fpma f57 = f69, c10, f77 fpnma f69 = f69, c10, f77 ;; ;; // (f77, f53) = (f57, f53) $ (c11, c11), (line 5, 6) fpma f77 = f53, c11, f57 fpnma f53 = f53, c11, f57 ;; // (f57, f65) = (f61, f65) $ (c12, c12), (line 0, 4) fpma f57 = f65, c12, f61 fpnma f65 = f65, c12, f61 ;; // (f61, f69) = (f73, f69) $ (c14, c14), (line 3, 7) fpma f61 = f69, c14, f73 fpnma f69 = f69, c14, f73 ;; ;; // (f73, f77) = (f48, f77) $ (c13, c13), (line 1, 5) fpma f73 = f77, c13, f48 fpnma f77 = f77, c13, f48 ;; // (f48, f53) = (f49, f53) $ (c13, c13), (line 2, 6) fpma f48 = f53, c13, f49 fpnma f53 = f53, c13, f49 ;; // before post shuffle // 56 57 50 51 // 72 73 54 55 // 52 48 58 59 // 60 61 62 63 // 68 65 66 67 // 80 77 70 71 // 76 53 74 75 // 64 69 78 79 // after post shuffle // 56 57 50 51 // 72 73 54 55 // 52 53 58 59 // 60 61 62 63 // 68 69 66 67 // 80 48 70 71 // 76 77 74 75 // 64 65 78 79 // before pre shuffle // 56 57 50 51 // 72 73 54 55 // 52 53 58 59 // 60 61 62 63 // 68 69 66 67 // 80 48 70 71 // 76 77 74 75 // 64 65 78 79 // after pre shuffle // 56 57 50 51 // 72 73 66 55 // 52 53 58 59 // 60 61 74 63 // 68 69 54 67 // 80 48 78 71 // 76 77 62 75 // 64 65 70 79 // (f49, f66) = (f50, f66) $ (c0, c0), (line 0, 1) fpma f49 = f66, c0, f50 fpnma f66 = f66, c0, f50 ;; // (f50, f74) = (f58, f74) $ (c1, c2), (line 2, 3) fpma f50 = f74, c1, f58 fpnma f74 = f74, c2, f58 ;; // (f58, f78) = (f54, f78) $ (c3, c4), (line 4, 5) fpma f58 = f78, c3, f54 fpnma f78 = f78, c4, f54 ;; // (f54, f70) = (f62, f70) $ (c5, c6), (line 6, 7) fpma f54 = f70, c5, f62 fpnma f70 = f70, c6, f62 ;; ;; // (f62, f74) = (f49, 
f74) $ (c7, c7), (line 0, 3) fpma f62 = f74, c7, f49 fpnma f74 = f74, c7, f49 ;; // (f49, f50) = (f66, f50) $ (c8, c8), (line 1, 2) fpma f49 = f50, c8, f66 fpnma f50 = f50, c8, f66 ;; // (f66, f54) = (f58, f54) $ (c9, c9), (line 4, 6) fpma f66 = f54, c9, f58 fpnma f54 = f54, c9, f58 ;; // (f58, f70) = (f78, f70) $ (c10, c10), (line 5, 7) fpma f58 = f70, c10, f78 fpnma f70 = f70, c10, f78 ;; ;; // (f78, f54) = (f58, f54) $ (c11, c11), (line 5, 6) fpma f78 = f54, c11, f58 fpnma f54 = f54, c11, f58 ;; // (f58, f66) = (f62, f66) $ (c12, c12), (line 0, 4) fpma f58 = f66, c12, f62 fpnma f66 = f66, c12, f62 ;; // (f62, f70) = (f74, f70) $ (c14, c14), (line 3, 7) fpma f62 = f70, c14, f74 fpnma f70 = f70, c14, f74 ;; ;; // (f74, f78) = (f49, f78) $ (c13, c13), (line 1, 5) fpma f74 = f78, c13, f49 fpnma f78 = f78, c13, f49 ;; // (f49, f54) = (f50, f54) $ (c13, c13), (line 2, 6) fpma f49 = f54, c13, f50 fpnma f54 = f54, c13, f50 ;; // before post shuffle // 56 57 58 51 // 72 73 74 55 // 52 53 49 59 // 60 61 62 63 // 68 69 66 67 // 80 48 78 71 // 76 77 54 75 // 64 65 70 79 // after post shuffle // 56 57 58 51 // 72 73 74 55 // 52 53 54 59 // 60 61 62 63 // 68 69 70 67 // 80 48 49 71 // 76 77 78 75 // 64 65 66 79 // before pre shuffle // 56 57 58 51 // 72 73 74 55 // 52 53 54 59 // 60 61 62 63 // 68 69 70 67 // 80 48 49 71 // 76 77 78 75 // 64 65 66 79 // after pre shuffle // 56 57 58 51 // 72 73 74 67 // 52 53 54 59 // 60 61 62 75 // 68 69 70 55 // 80 48 49 79 // 76 77 78 63 // 64 65 66 71 // (f50, f67) = (f51, f67) $ (c0, c0), (line 0, 1) fpma f50 = f67, c0, f51 fpnma f67 = f67, c0, f51 ;; // (f51, f75) = (f59, f75) $ (c1, c2), (line 2, 3) fpma f51 = f75, c1, f59 fpnma f75 = f75, c2, f59 ;; // (f59, f79) = (f55, f79) $ (c3, c4), (line 4, 5) fpma f59 = f79, c3, f55 fpnma f79 = f79, c4, f55 ;; // (f55, f71) = (f63, f71) $ (c5, c6), (line 6, 7) fpma f55 = f71, c5, f63 fpnma f71 = f71, c6, f63 ;; ;; // (f63, f75) = (f50, f75) $ (c7, c7), (line 0, 3) fpma f63 = f75, c7, f50 fpnma 
f75 = f75, c7, f50 ;; // (f50, f51) = (f67, f51) $ (c8, c8), (line 1, 2) fpma f50 = f51, c8, f67 fpnma f51 = f51, c8, f67 ;; // (f67, f55) = (f59, f55) $ (c9, c9), (line 4, 6) fpma f67 = f55, c9, f59 fpnma f55 = f55, c9, f59 ;; // (f59, f71) = (f79, f71) $ (c10, c10), (line 5, 7) fpma f59 = f71, c10, f79 fpnma f71 = f71, c10, f79 ;; ;; // (f79, f55) = (f59, f55) $ (c11, c11), (line 5, 6) fpma f79 = f55, c11, f59 fpnma f55 = f55, c11, f59 ;; // (f59, f67) = (f63, f67) $ (c12, c12), (line 0, 4) fpma f59 = f67, c12, f63 fpnma f67 = f67, c12, f63 ;; // (f63, f71) = (f75, f71) $ (c14, c14), (line 3, 7) fpma f63 = f71, c14, f75 fpnma f71 = f71, c14, f75 ;; ;; // (f75, f79) = (f50, f79) $ (c13, c13), (line 1, 5) fpma f75 = f79, c13, f50 fpnma f79 = f79, c13, f50 ;; // (f50, f55) = (f51, f55) $ (c13, c13), (line 2, 6) fpma f50 = f55, c13, f51 fpnma f55 = f55, c13, f51 ;; // before post shuffle // 56 57 58 59 // 72 73 74 75 // 52 53 54 50 // 60 61 62 63 // 68 69 70 67 // 80 48 49 79 // 76 77 78 55 // 64 65 66 71 // after post shuffle // 56 57 58 59 // 72 73 74 75 // 52 53 54 55 // 60 61 62 63 // 68 69 70 71 // 80 48 49 50 // 76 77 78 79 // 64 65 66 67 ;; fmix.r f51 = f56, f72 fmix.r f81 = f57, f73 fmix.r f82 = f58, f74 fmix.r f83 = f59, f75 fmix.r f84 = f52, f60 fmix.r f85 = f53, f61 fmix.r f86 = f54, f62 fmix.r f87 = f55, f63 fmix.r f88 = f68, f80 fmix.r f89 = f69, f48 fmix.r f90 = f70, f49 fmix.r f91 = f71, f50 fmix.r f92 = f76, f64 fmix.r f93 = f77, f65 fmix.r f94 = f78, f66 fmix.r f95 = f79, f67 ;; fmix.l f56 = f56, f72 fmix.l f57 = f57, f73 fmix.l f58 = f58, f74 fmix.l f59 = f59, f75 fmix.l f52 = f52, f60 fmix.l f53 = f53, f61 fmix.l f54 = f54, f62 fmix.l f55 = f55, f63 fmix.l f68 = f68, f80 fmix.l f69 = f69, f48 fmix.l f70 = f70, f49 fmix.l f71 = f71, f50 fmix.l f76 = f76, f64 fmix.l f77 = f77, f65 fmix.l f78 = f78, f66 fmix.l f79 = f79, f67 ;; fpma f56 = f56, c0, f0 fpma f52 = f52, c0, f0 fpma f68 = f68, c0, f0 fpma f76 = f76, c0, f0 ;; // before pre shuffle // 56 52 
68 76 // 51 84 88 92 // 57 53 69 77 // 81 85 89 93 // 58 54 70 78 // 82 86 90 94 // 59 55 71 79 // 83 87 91 95 // after pre shuffle // 56 52 68 76 // 58 84 88 92 // 57 53 69 77 // 59 85 89 93 // 51 54 70 78 // 83 86 90 94 // 81 55 71 79 // 82 87 91 95 // (f48, f58) = (f56, f58) $ (c0, c0), (line 0, 1) fpma f48 = f58, c0, f56 fpnma f58 = f58, c0, f56 ;; // (f49, f59) = (f57, f59) $ (c1, c2), (line 2, 3) fpma f49 = f59, c1, f57 fpnma f59 = f59, c2, f57 ;; // (f50, f83) = (f51, f83) $ (c3, c4), (line 4, 5) fpma f50 = f83, c3, f51 fpnma f83 = f83, c4, f51 ;; // (f51, f82) = (f81, f82) $ (c5, c6), (line 6, 7) fpma f51 = f82, c5, f81 fpnma f82 = f82, c6, f81 ;; ;; // (f56, f59) = (f48, f59) $ (c7, c7), (line 0, 3) fpma f56 = f59, c7, f48 fpnma f59 = f59, c7, f48 ;; // (f48, f49) = (f58, f49) $ (c8, c8), (line 1, 2) fpma f48 = f49, c8, f58 fpnma f49 = f49, c8, f58 ;; // (f57, f51) = (f50, f51) $ (c9, c9), (line 4, 6) fpma f57 = f51, c9, f50 fpnma f51 = f51, c9, f50 ;; // (f50, f82) = (f83, f82) $ (c10, c10), (line 5, 7) fpma f50 = f82, c10, f83 fpnma f82 = f82, c10, f83 ;; ;; // (f58, f51) = (f50, f51) $ (c11, c11), (line 5, 6) fpma f58 = f51, c11, f50 fpnma f51 = f51, c11, f50 ;; // (f50, f57) = (f56, f57) $ (c12, c12), (line 0, 4) fpma f50 = f57, c12, f56 fpnma f57 = f57, c12, f56 ;; // (f56, f82) = (f59, f82) $ (c14, c14), (line 3, 7) fpma f56 = f82, c14, f59 fpnma f82 = f82, c14, f59 ;; ;; // (f59, f58) = (f48, f58) $ (c13, c13), (line 1, 5) fpma f59 = f58, c13, f48 fpnma f58 = f58, c13, f48 ;; // (f48, f51) = (f49, f51) $ (c13, c13), (line 2, 6) fpma f48 = f51, c13, f49 fpnma f51 = f51, c13, f49 ;; // before post shuffle // 50 52 68 76 // 59 84 88 92 // 48 53 69 77 // 56 85 89 93 // 57 54 70 78 // 58 86 90 94 // 51 55 71 79 // 82 87 91 95 // after post shuffle // 50 52 68 76 // 59 84 88 92 // 51 53 69 77 // 56 85 89 93 // 82 54 70 78 // 48 86 90 94 // 58 55 71 79 // 57 87 91 95 // before pre shuffle // 50 52 68 76 // 59 84 88 92 // 51 53 69 77 // 56 85 89 93 // 82 54 
70 78 // 48 86 90 94 // 58 55 71 79 // 57 87 91 95 // after pre shuffle // 50 52 68 76 // 59 54 88 92 // 51 53 69 77 // 56 55 89 93 // 82 84 70 78 // 48 87 90 94 // 58 85 71 79 // 57 86 91 95 // (f49, f54) = (f52, f54) $ (c0, c0), (line 0, 1) fpma f49 = f54, c0, f52 fpnma f54 = f54, c0, f52 ;; // (f52, f55) = (f53, f55) $ (c1, c2), (line 2, 3) fpma f52 = f55, c1, f53 fpnma f55 = f55, c2, f53 ;; // (f53, f87) = (f84, f87) $ (c3, c4), (line 4, 5) fpma f53 = f87, c3, f84 fpnma f87 = f87, c4, f84 ;; // (f60, f86) = (f85, f86) $ (c5, c6), (line 6, 7) fpma f60 = f86, c5, f85 fpnma f86 = f86, c6, f85 ;; ;; // (f61, f55) = (f49, f55) $ (c7, c7), (line 0, 3) fpma f61 = f55, c7, f49 fpnma f55 = f55, c7, f49 ;; // (f49, f52) = (f54, f52) $ (c8, c8), (line 1, 2) fpma f49 = f52, c8, f54 fpnma f52 = f52, c8, f54 ;; // (f54, f60) = (f53, f60) $ (c9, c9), (line 4, 6) fpma f54 = f60, c9, f53 fpnma f60 = f60, c9, f53 ;; // (f53, f86) = (f87, f86) $ (c10, c10), (line 5, 7) fpma f53 = f86, c10, f87 fpnma f86 = f86, c10, f87 ;; ;; // (f62, f60) = (f53, f60) $ (c11, c11), (line 5, 6) fpma f62 = f60, c11, f53 fpnma f60 = f60, c11, f53 ;; // (f53, f54) = (f61, f54) $ (c12, c12), (line 0, 4) fpma f53 = f54, c12, f61 fpnma f54 = f54, c12, f61 ;; // (f61, f86) = (f55, f86) $ (c14, c14), (line 3, 7) fpma f61 = f86, c14, f55 fpnma f86 = f86, c14, f55 ;; ;; // (f55, f62) = (f49, f62) $ (c13, c13), (line 1, 5) fpma f55 = f62, c13, f49 fpnma f62 = f62, c13, f49 ;; // (f49, f60) = (f52, f60) $ (c13, c13), (line 2, 6) fpma f49 = f60, c13, f52 fpnma f60 = f60, c13, f52 ;; // before post shuffle // 50 53 68 76 // 59 55 88 92 // 51 49 69 77 // 56 61 89 93 // 82 54 70 78 // 48 62 90 94 // 58 60 71 79 // 57 86 91 95 // after post shuffle // 50 53 68 76 // 59 55 88 92 // 51 60 69 77 // 56 61 89 93 // 82 86 70 78 // 48 49 90 94 // 58 62 71 79 // 57 54 91 95 // before pre shuffle // 50 53 68 76 // 59 55 88 92 // 51 60 69 77 // 56 61 89 93 // 82 86 70 78 // 48 49 90 94 // 58 62 71 79 // 57 54 91 95 // after 
pre shuffle // 50 53 68 76 // 59 55 70 92 // 51 60 69 77 // 56 61 71 93 // 82 86 88 78 // 48 49 91 94 // 58 62 89 79 // 57 54 90 95 // (f52, f70) = (f68, f70) $ (c0, c0), (line 0, 1) fpma f52 = f70, c0, f68 fpnma f70 = f70, c0, f68 ;; // (f63, f71) = (f69, f71) $ (c1, c2), (line 2, 3) fpma f63 = f71, c1, f69 fpnma f71 = f71, c2, f69 ;; // (f64, f91) = (f88, f91) $ (c3, c4), (line 4, 5) fpma f64 = f91, c3, f88 fpnma f91 = f91, c4, f88 ;; // (f65, f90) = (f89, f90) $ (c5, c6), (line 6, 7) fpma f65 = f90, c5, f89 fpnma f90 = f90, c6, f89 ;; ;; // (f66, f71) = (f52, f71) $ (c7, c7), (line 0, 3) fpma f66 = f71, c7, f52 fpnma f71 = f71, c7, f52 ;; // (f52, f63) = (f70, f63) $ (c8, c8), (line 1, 2) fpma f52 = f63, c8, f70 fpnma f63 = f63, c8, f70 ;; // (f67, f65) = (f64, f65) $ (c9, c9), (line 4, 6) fpma f67 = f65, c9, f64 fpnma f65 = f65, c9, f64 ;; // (f64, f90) = (f91, f90) $ (c10, c10), (line 5, 7) fpma f64 = f90, c10, f91 fpnma f90 = f90, c10, f91 ;; ;; // (f68, f65) = (f64, f65) $ (c11, c11), (line 5, 6) fpma f68 = f65, c11, f64 fpnma f65 = f65, c11, f64 ;; // (f64, f67) = (f66, f67) $ (c12, c12), (line 0, 4) fpma f64 = f67, c12, f66 fpnma f67 = f67, c12, f66 ;; // (f66, f90) = (f71, f90) $ (c14, c14), (line 3, 7) fpma f66 = f90, c14, f71 fpnma f90 = f90, c14, f71 ;; ;; // (f69, f68) = (f52, f68) $ (c13, c13), (line 1, 5) fpma f69 = f68, c13, f52 fpnma f68 = f68, c13, f52 ;; // (f52, f65) = (f63, f65) $ (c13, c13), (line 2, 6) fpma f52 = f65, c13, f63 fpnma f65 = f65, c13, f63 ;; // before post shuffle // 50 53 64 76 // 59 55 69 92 // 51 60 52 77 // 56 61 66 93 // 82 86 67 78 // 48 49 68 94 // 58 62 65 79 // 57 54 90 95 // after post shuffle // 50 53 64 76 // 59 55 69 92 // 51 60 65 77 // 56 61 66 93 // 82 86 90 78 // 48 49 52 94 // 58 62 68 79 // 57 54 67 95 // before pre shuffle // 50 53 64 76 // 59 55 69 92 // 51 60 65 77 // 56 61 66 93 // 82 86 90 78 // 48 49 52 94 // 58 62 68 79 // 57 54 67 95 // after pre shuffle // 50 53 64 76 // 59 55 69 78 // 51 60 65 77 // 
56 61 66 79 // 82 86 90 92 // 48 49 52 95 // 58 62 68 93 // 57 54 67 94 // (f63, f78) = (f76, f78) $ (c0, c0), (line 0, 1) fpma f63 = f78, c0, f76 fpnma f78 = f78, c0, f76 ;; // (f70, f79) = (f77, f79) $ (c1, c2), (line 2, 3) fpma f70 = f79, c1, f77 fpnma f79 = f79, c2, f77 ;; // (f71, f95) = (f92, f95) $ (c3, c4), (line 4, 5) fpma f71 = f95, c3, f92 fpnma f95 = f95, c4, f92 ;; // (f72, f94) = (f93, f94) $ (c5, c6), (line 6, 7) fpma f72 = f94, c5, f93 fpnma f94 = f94, c6, f93 ;; ;; // (f73, f79) = (f63, f79) $ (c7, c7), (line 0, 3) fpma f73 = f79, c7, f63 fpnma f79 = f79, c7, f63 ;; // (f63, f70) = (f78, f70) $ (c8, c8), (line 1, 2) fpma f63 = f70, c8, f78 fpnma f70 = f70, c8, f78 ;; // (f74, f72) = (f71, f72) $ (c9, c9), (line 4, 6) fpma f74 = f72, c9, f71 fpnma f72 = f72, c9, f71 ;; // (f71, f94) = (f95, f94) $ (c10, c10), (line 5, 7) fpma f71 = f94, c10, f95 fpnma f94 = f94, c10, f95 ;; ;; // (f75, f72) = (f71, f72) $ (c11, c11), (line 5, 6) fpma f75 = f72, c11, f71 fpnma f72 = f72, c11, f71 ;; // (f71, f74) = (f73, f74) $ (c12, c12), (line 0, 4) fpma f71 = f74, c12, f73 fpnma f74 = f74, c12, f73 ;; // (f73, f94) = (f79, f94) $ (c14, c14), (line 3, 7) fpma f73 = f94, c14, f79 fpnma f94 = f94, c14, f79 ;; ;; // (f76, f75) = (f63, f75) $ (c13, c13), (line 1, 5) fpma f76 = f75, c13, f63 fpnma f75 = f75, c13, f63 ;; // (f63, f72) = (f70, f72) $ (c13, c13), (line 2, 6) fpma f63 = f72, c13, f70 fpnma f72 = f72, c13, f70 ;; // before post shuffle // 50 53 64 71 // 59 55 69 76 // 51 60 65 63 // 56 61 66 73 // 82 86 90 74 // 48 49 52 75 // 58 62 68 72 // 57 54 67 94 // after post shuffle // 50 53 64 71 // 59 55 69 76 // 51 60 65 72 // 56 61 66 73 // 82 86 90 94 // 48 49 52 63 // 58 62 68 75 // 57 54 67 74 ;; fmix.r f70 = f50, f59 fmix.r f77 = f53, f55 fmix.r f78 = f64, f69 fmix.r f79 = f71, f76 fmix.r f80 = f51, f56 fmix.r f81 = f60, f61 fmix.r f83 = f65, f66 fmix.r f84 = f72, f73 fmix.r f85 = f82, f48 fmix.r f87 = f86, f49 fmix.r f88 = f90, f52 fmix.r f89 = f94, f63 
fmix.r f91 = f58, f57 fmix.r f92 = f62, f54 fmix.r f93 = f68, f67 fmix.r f95 = f75, f74 ;; fmix.l f50 = f50, f59 fmix.l f53 = f53, f55 fmix.l f64 = f64, f69 fmix.l f71 = f71, f76 fmix.l f51 = f51, f56 fmix.l f60 = f60, f61 fmix.l f65 = f65, f66 fmix.l f72 = f72, f73 fmix.l f82 = f82, f48 fmix.l f86 = f86, f49 fmix.l f90 = f90, f52 fmix.l f94 = f94, f63 fmix.l f58 = f58, f57 fmix.l f62 = f62, f54 fmix.l f68 = f68, f67 fmix.l f75 = f75, f74 ;; // 50 51 82 58 // 70 80 85 91 // 53 60 86 62 // 77 81 87 92 // 64 65 90 68 // 78 83 88 93 // 71 72 94 75 // 79 84 89 95 mov addreg1 = in0 add addreg2 = 4, in0 ;; fpcvt.fx f50 = f50 fpcvt.fx f51 = f51 fpcvt.fx f82 = f82 fpcvt.fx f58 = f58 fpcvt.fx f70 = f70 fpcvt.fx f80 = f80 fpcvt.fx f85 = f85 fpcvt.fx f91 = f91 fpcvt.fx f53 = f53 fpcvt.fx f60 = f60 fpcvt.fx f86 = f86 fpcvt.fx f62 = f62 fpcvt.fx f77 = f77 fpcvt.fx f81 = f81 fpcvt.fx f87 = f87 fpcvt.fx f92 = f92 fpcvt.fx f64 = f64 fpcvt.fx f65 = f65 fpcvt.fx f90 = f90 fpcvt.fx f68 = f68 fpcvt.fx f78 = f78 fpcvt.fx f83 = f83 fpcvt.fx f88 = f88 fpcvt.fx f93 = f93 fpcvt.fx f71 = f71 fpcvt.fx f72 = f72 fpcvt.fx f94 = f94 fpcvt.fx f75 = f75 fpcvt.fx f79 = f79 fpcvt.fx f84 = f84 fpcvt.fx f89 = f89 fpcvt.fx f95 = f95 ;; getf.sig r33 = f50 getf.sig r34 = f51 getf.sig r35 = f82 getf.sig r36 = f58 getf.sig r37 = f70 getf.sig r38 = f80 getf.sig r39 = f85 getf.sig r40 = f91 getf.sig r41 = f53 getf.sig r42 = f60 getf.sig r43 = f86 getf.sig r44 = f62 getf.sig r45 = f77 getf.sig r46 = f81 getf.sig r47 = f87 getf.sig r48 = f92 getf.sig r49 = f64 getf.sig r50 = f65 getf.sig r51 = f90 getf.sig r52 = f68 getf.sig r53 = f78 getf.sig r54 = f83 getf.sig r55 = f88 getf.sig r56 = f93 getf.sig r57 = f71 getf.sig r58 = f72 getf.sig r59 = f94 getf.sig r60 = f75 getf.sig r61 = f79 getf.sig r62 = f84 getf.sig r63 = f89 getf.sig r64 = f95 ;; shl r33 = r33, 7 shl r34 = r34, 7 shl r35 = r35, 7 shl r36 = r36, 7 shl r37 = r37, 7 shl r38 = r38, 7 shl r39 = r39, 7 shl r40 = r40, 7 shl r41 = r41, 7 shl r42 = r42, 7 
shl r43 = r43, 7 shl r44 = r44, 7 shl r45 = r45, 7 shl r46 = r46, 7 shl r47 = r47, 7 shl r48 = r48, 7 shl r49 = r49, 7 shl r50 = r50, 7 shl r51 = r51, 7 shl r52 = r52, 7 shl r53 = r53, 7 shl r54 = r54, 7 shl r55 = r55, 7 shl r56 = r56, 7 shl r57 = r57, 7 shl r58 = r58, 7 shl r59 = r59, 7 shl r60 = r60, 7 shl r61 = r61, 7 shl r62 = r62, 7 shl r63 = r63, 7 shl r64 = r64, 7 ;; pack4.sss r33 = r33, r0 pack4.sss r34 = r34, r0 pack4.sss r35 = r35, r0 pack4.sss r36 = r36, r0 pack4.sss r37 = r37, r0 pack4.sss r38 = r38, r0 pack4.sss r39 = r39, r0 pack4.sss r40 = r40, r0 pack4.sss r41 = r41, r0 pack4.sss r42 = r42, r0 pack4.sss r43 = r43, r0 pack4.sss r44 = r44, r0 pack4.sss r45 = r45, r0 pack4.sss r46 = r46, r0 pack4.sss r47 = r47, r0 pack4.sss r48 = r48, r0 pack4.sss r49 = r49, r0 pack4.sss r50 = r50, r0 pack4.sss r51 = r51, r0 pack4.sss r52 = r52, r0 pack4.sss r53 = r53, r0 pack4.sss r54 = r54, r0 pack4.sss r55 = r55, r0 pack4.sss r56 = r56, r0 pack4.sss r57 = r57, r0 pack4.sss r58 = r58, r0 pack4.sss r59 = r59, r0 pack4.sss r60 = r60, r0 pack4.sss r61 = r61, r0 pack4.sss r62 = r62, r0 pack4.sss r63 = r63, r0 pack4.sss r64 = r64, r0 ;; pshr2 r33 = r33, 7 pshr2 r34 = r34, 7 pshr2 r35 = r35, 7 pshr2 r36 = r36, 7 pshr2 r37 = r37, 7 pshr2 r38 = r38, 7 pshr2 r39 = r39, 7 pshr2 r40 = r40, 7 pshr2 r41 = r41, 7 pshr2 r42 = r42, 7 pshr2 r43 = r43, 7 pshr2 r44 = r44, 7 pshr2 r45 = r45, 7 pshr2 r46 = r46, 7 pshr2 r47 = r47, 7 pshr2 r48 = r48, 7 pshr2 r49 = r49, 7 pshr2 r50 = r50, 7 pshr2 r51 = r51, 7 pshr2 r52 = r52, 7 pshr2 r53 = r53, 7 pshr2 r54 = r54, 7 pshr2 r55 = r55, 7 pshr2 r56 = r56, 7 pshr2 r57 = r57, 7 pshr2 r58 = r58, 7 pshr2 r59 = r59, 7 pshr2 r60 = r60, 7 pshr2 r61 = r61, 7 pshr2 r62 = r62, 7 pshr2 r63 = r63, 7 pshr2 r64 = r64, 7 ;; mux2 r33 = r33, 0xe1 mux2 r34 = r34, 0xe1 mux2 r35 = r35, 0xe1 mux2 r36 = r36, 0xe1 mux2 r37 = r37, 0xe1 mux2 r38 = r38, 0xe1 mux2 r39 = r39, 0xe1 mux2 r40 = r40, 0xe1 mux2 r41 = r41, 0xe1 mux2 r42 = r42, 0xe1 mux2 r43 = r43, 0xe1 mux2 r44 
= r44, 0xe1 mux2 r45 = r45, 0xe1 mux2 r46 = r46, 0xe1 mux2 r47 = r47, 0xe1 mux2 r48 = r48, 0xe1 mux2 r49 = r49, 0xe1 mux2 r50 = r50, 0xe1 mux2 r51 = r51, 0xe1 mux2 r52 = r52, 0xe1 mux2 r53 = r53, 0xe1 mux2 r54 = r54, 0xe1 mux2 r55 = r55, 0xe1 mux2 r56 = r56, 0xe1 mux2 r57 = r57, 0xe1 mux2 r58 = r58, 0xe1 mux2 r59 = r59, 0xe1 mux2 r60 = r60, 0xe1 mux2 r61 = r61, 0xe1 mux2 r62 = r62, 0xe1 mux2 r63 = r63, 0xe1 mux2 r64 = r64, 0xe1 ;; st4 [addreg1] = r33, 8 st4 [addreg2] = r34, 8 ;; st4 [addreg1] = r35, 8 st4 [addreg2] = r36, 8 ;; st4 [addreg1] = r37, 8 st4 [addreg2] = r38, 8 ;; st4 [addreg1] = r39, 8 st4 [addreg2] = r40, 8 ;; st4 [addreg1] = r41, 8 st4 [addreg2] = r42, 8 ;; st4 [addreg1] = r43, 8 st4 [addreg2] = r44, 8 ;; st4 [addreg1] = r45, 8 st4 [addreg2] = r46, 8 ;; st4 [addreg1] = r47, 8 st4 [addreg2] = r48, 8 ;; st4 [addreg1] = r49, 8 st4 [addreg2] = r50, 8 ;; st4 [addreg1] = r51, 8 st4 [addreg2] = r52, 8 ;; st4 [addreg1] = r53, 8 st4 [addreg2] = r54, 8 ;; st4 [addreg1] = r55, 8 st4 [addreg2] = r56, 8 ;; st4 [addreg1] = r57, 8 st4 [addreg2] = r58, 8 ;; st4 [addreg1] = r59, 8 st4 [addreg2] = r60, 8 ;; st4 [addreg1] = r61, 8 st4 [addreg2] = r62, 8 ;; st4 [addreg1] = r63, 8 st4 [addreg2] = r64, 8 ;; mov ar.pfs = r16 br.ret.sptk.few b0 .endp xvidcore/src/dct/x86_asm/0000775000076500007650000000000011566427763016332 5ustar xvidbuildxvidbuildxvidcore/src/dct/x86_asm/fdct_mmx_skal.asm0000664000076500007650000004005611254216113021627 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - MMX and XMM forward discrete cosine transform - ; * ; * Copyright(C) 2002 Pascal Massimino ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. 
; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: fdct_mmx_skal.asm,v 1.12 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;;; Define this if you want an unrolled version of the code %define UNROLLED_LOOP ;============================================================================= ; ; Vertical pass is an implementation of the scheme: ; Loeffler C., Ligtenberg A., and Moschytz G.S.: ; Practical Fast 1D DCT Algorithm with Eleven Multiplications, ; Proc. ICASSP 1989, 988-991. ; ; Horizontal pass is a double 4x4 vector/matrix multiplication, ; (see also Intel's Application Note 922: ; http://developer.intel.com/vtune/cbts/strmsimd/922down.htm ; Copyright (C) 1999 Intel Corporation) ; ; Notes: ; * tan(3pi/16) is greater than 0.5, and would use the ; sign bit when turned into 16b fixed-point precision. So, ; we use the trick: x*tan3 = x*(tan3-1)+x ; ; * There's only one SSE-specific instruction (pshufw). ; Porting to SSE2 also seems straightforward. ; ; * There's still 1 or 2 ticks to save in fLLM_PASS, but ; I prefer having a readable code, instead of a tightly ; scheduled one... 
; ; * Quantization stage (as well as pre-transposition for the ; idct way back) can be included in the fTab* constants ; (with induced loss of precision, somehow) ; ; * Some more details at: http://skal.planet-d.net/coding/dct.html ; ;============================================================================= ; ; idct-like IEEE errors: ; ; ========================= ; Peak error: 1.0000 ; Peak MSE: 0.0365 ; Overall MSE: 0.0201 ; Peak ME: 0.0265 ; Overall ME: 0.0006 ; ; == Mean square errors == ; 0.000 0.001 0.001 0.002 0.000 0.002 0.001 0.000 [0.001] ; 0.035 0.029 0.032 0.032 0.031 0.032 0.034 0.035 [0.032] ; 0.026 0.028 0.027 0.027 0.025 0.028 0.028 0.025 [0.027] ; 0.037 0.032 0.031 0.030 0.028 0.029 0.026 0.031 [0.030] ; 0.000 0.001 0.001 0.002 0.000 0.002 0.001 0.001 [0.001] ; 0.025 0.024 0.022 0.022 0.022 0.022 0.023 0.023 [0.023] ; 0.026 0.028 0.025 0.028 0.030 0.025 0.026 0.027 [0.027] ; 0.021 0.020 0.020 0.022 0.020 0.022 0.017 0.019 [0.020] ; ; == Abs Mean errors == ; 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 [0.000] ; 0.020 0.001 0.003 0.003 0.000 0.004 0.002 0.003 [0.002] ; 0.000 0.001 0.001 0.001 0.001 0.004 0.000 0.000 [0.000] ; 0.027 0.001 0.000 0.002 0.002 0.002 0.001 0.000 [0.003] ; 0.000 0.000 0.000 0.000 0.000 0.001 0.000 0.001 [-0.000] ; 0.001 0.003 0.001 0.001 0.002 0.001 0.000 0.000 [-0.000] ; 0.000 0.002 0.002 0.001 0.001 0.002 0.001 0.000 [-0.000] ; 0.000 0.002 0.001 0.002 0.001 0.002 0.001 0.001 [-0.000] ; ;============================================================================= ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN tan1: dw 0x32ec,0x32ec,0x32ec,0x32ec ; tan( pi/16) tan2: dw 0x6a0a,0x6a0a,0x6a0a,0x6a0a ; tan(2pi/16) (=sqrt(2)-1) tan3: dw 0xab0e,0xab0e,0xab0e,0xab0e ; tan(3pi/16)-1 sqrt2: dw 0x5a82,0x5a82,0x5a82,0x5a82 ; 0.5/sqrt(2) ALIGN SECTION_ALIGN fdct_table: ;fTab1: 
dw 0x4000, 0x4000, 0x58c5, 0x4b42 dw 0x4000, 0x4000, 0x3249, 0x11a8 dw 0x539f, 0x22a3, 0x4b42, 0xee58 dw 0xdd5d, 0xac61, 0xa73b, 0xcdb7 dw 0x4000, 0xc000, 0x3249, 0xa73b dw 0xc000, 0x4000, 0x11a8, 0x4b42 dw 0x22a3, 0xac61, 0x11a8, 0xcdb7 dw 0x539f, 0xdd5d, 0x4b42, 0xa73b ;fTab2: dw 0x58c5, 0x58c5, 0x7b21, 0x6862 dw 0x58c5, 0x58c5, 0x45bf, 0x187e dw 0x73fc, 0x300b, 0x6862, 0xe782 dw 0xcff5, 0x8c04, 0x84df, 0xba41 dw 0x58c5, 0xa73b, 0x45bf, 0x84df dw 0xa73b, 0x58c5, 0x187e, 0x6862 dw 0x300b, 0x8c04, 0x187e, 0xba41 dw 0x73fc, 0xcff5, 0x6862, 0x84df ;fTab3: dw 0x539f, 0x539f, 0x73fc, 0x6254 dw 0x539f, 0x539f, 0x41b3, 0x1712 dw 0x6d41, 0x2d41, 0x6254, 0xe8ee dw 0xd2bf, 0x92bf, 0x8c04, 0xbe4d dw 0x539f, 0xac61, 0x41b3, 0x8c04 dw 0xac61, 0x539f, 0x1712, 0x6254 dw 0x2d41, 0x92bf, 0x1712, 0xbe4d dw 0x6d41, 0xd2bf, 0x6254, 0x8c04 ;fTab4: dw 0x4b42, 0x4b42, 0x6862, 0x587e dw 0x4b42, 0x4b42, 0x3b21, 0x14c3 dw 0x6254, 0x28ba, 0x587e, 0xeb3d dw 0xd746, 0x9dac, 0x979e, 0xc4df dw 0x4b42, 0xb4be, 0x3b21, 0x979e dw 0xb4be, 0x4b42, 0x14c3, 0x587e dw 0x28ba, 0x9dac, 0x14c3, 0xc4df dw 0x6254, 0xd746, 0x587e, 0x979e ;fTab1: dw 0x4000, 0x4000, 0x58c5, 0x4b42 dw 0x4000, 0x4000, 0x3249, 0x11a8 dw 0x539f, 0x22a3, 0x4b42, 0xee58 dw 0xdd5d, 0xac61, 0xa73b, 0xcdb7 dw 0x4000, 0xc000, 0x3249, 0xa73b dw 0xc000, 0x4000, 0x11a8, 0x4b42 dw 0x22a3, 0xac61, 0x11a8, 0xcdb7 dw 0x539f, 0xdd5d, 0x4b42, 0xa73b ;fTab4: dw 0x4b42, 0x4b42, 0x6862, 0x587e dw 0x4b42, 0x4b42, 0x3b21, 0x14c3 dw 0x6254, 0x28ba, 0x587e, 0xeb3d dw 0xd746, 0x9dac, 0x979e, 0xc4df dw 0x4b42, 0xb4be, 0x3b21, 0x979e dw 0xb4be, 0x4b42, 0x14c3, 0x587e dw 0x28ba, 0x9dac, 0x14c3, 0xc4df dw 0x6254, 0xd746, 0x587e, 0x979e ;fTab3: dw 0x539f, 0x539f, 0x73fc, 0x6254 dw 0x539f, 0x539f, 0x41b3, 0x1712 dw 0x6d41, 0x2d41, 0x6254, 0xe8ee dw 0xd2bf, 0x92bf, 0x8c04, 0xbe4d dw 0x539f, 0xac61, 0x41b3, 0x8c04 dw 0xac61, 0x539f, 0x1712, 0x6254 dw 0x2d41, 0x92bf, 0x1712, 0xbe4d dw 0x6d41, 0xd2bf, 0x6254, 0x8c04 ;fTab2: dw 0x58c5, 0x58c5, 0x7b21, 0x6862 dw 
0x58c5, 0x58c5, 0x45bf, 0x187e dw 0x73fc, 0x300b, 0x6862, 0xe782 dw 0xcff5, 0x8c04, 0x84df, 0xba41 dw 0x58c5, 0xa73b, 0x45bf, 0x84df dw 0xa73b, 0x58c5, 0x187e, 0x6862 dw 0x300b, 0x8c04, 0x187e, 0xba41 dw 0x73fc, 0xcff5, 0x6862, 0x84df ALIGN SECTION_ALIGN fdct_rounding_1: dw 6, 8, 8, 8 dw 10, 8, 8, 8 dw 8, 8, 8, 8 dw 8, 8, 8, 8 dw 6, 8, 8, 8 dw 8, 8, 8, 8 dw 8, 8, 8, 8 dw 8, 8, 8, 8 ALIGN SECTION_ALIGN fdct_rounding_2: dw 6, 8, 8, 8 dw 8, 8, 8, 8 dw 8, 8, 8, 8 dw 8, 8, 8, 8 dw 6, 8, 8, 8 dw 8, 8, 8, 8 dw 8, 8, 8, 8 dw 8, 8, 8, 8 ALIGN SECTION_ALIGN MMX_One: dw 1, 1, 1, 1 ;============================================================================= ; Helper Macros for real code ;============================================================================= ;----------------------------------------------------------------------------- ; FDCT LLM vertical pass (~39c) ; %1=dst, %2=src, %3:Shift ;----------------------------------------------------------------------------- %macro fLLM_PASS 3 movq mm0, [%2+0*16] ; In0 movq mm2, [%2+2*16] ; In2 movq mm3, mm0 movq mm4, mm2 movq mm7, [%2+7*16] ; In7 movq mm5, [%2+5*16] ; In5 psubsw mm0, mm7 ; t7 = In0-In7 paddsw mm7, mm3 ; t0 = In0+In7 psubsw mm2, mm5 ; t5 = In2-In5 paddsw mm5, mm4 ; t2 = In2+In5 movq mm3, [%2+3*16] ; In3 movq mm4, [%2+4*16] ; In4 movq mm1, mm3 psubsw mm3, mm4 ; t4 = In3-In4 paddsw mm4, mm1 ; t3 = In3+In4 movq mm6, [%2+6*16] ; In6 movq mm1, [%2+1*16] ; In1 psubsw mm1, mm6 ; t6 = In1-In6 paddsw mm6, [%2+1*16] ; t1 = In1+In6 psubsw mm7, mm4 ; tm03 = t0-t3 psubsw mm6, mm5 ; tm12 = t1-t2 paddsw mm4, mm4 ; 2.t3 paddsw mm5, mm5 ; 2.t2 paddsw mm4, mm7 ; tp03 = t0+t3 paddsw mm5, mm6 ; tp12 = t1+t2 psllw mm2, %3+1 ; shift t5 (shift +1 to.. 
psllw mm1, %3+1 ; shift t6 ..compensate cos4/2) psllw mm4, %3 ; shift t3 psllw mm5, %3 ; shift t2 psllw mm7, %3 ; shift t0 psllw mm6, %3 ; shift t1 psllw mm3, %3 ; shift t4 psllw mm0, %3 ; shift t7 psubsw mm4, mm5 ; out4 = tp03-tp12 psubsw mm1, mm2 ; mm1: t6-t5 paddsw mm5, mm5 paddsw mm2, mm2 paddsw mm5, mm4 ; out0 = tp03+tp12 movq [%1+4*16], mm4 ; => out4 paddsw mm2, mm1 ; mm2: t6+t5 movq [%1+0*16], mm5 ; => out0 movq mm4, [tan2] ; mm4 <= tan2 pmulhw mm4, mm7 ; tm03*tan2 movq mm5, [tan2] ; mm5 <= tan2 psubsw mm4, mm6 ; out6 = tm03*tan2 - tm12 pmulhw mm5, mm6 ; tm12*tan2 paddsw mm5, mm7 ; out2 = tm12*tan2 + tm03 movq mm6, [sqrt2] movq mm7, [MMX_One] pmulhw mm2, mm6 ; mm2: tp65 = (t6 + t5)*cos4 por mm5, mm7 ; correct out2 por mm4, mm7 ; correct out6 pmulhw mm1, mm6 ; mm1: tm65 = (t6 - t5)*cos4 por mm2, mm7 ; correct tp65 movq [%1+2*16], mm5 ; => out2 movq mm5, mm3 ; save t4 movq [%1+6*16], mm4 ; => out6 movq mm4, mm0 ; save t7 psubsw mm3, mm1 ; mm3: tm465 = t4 - tm65 psubsw mm0, mm2 ; mm0: tm765 = t7 - tp65 paddsw mm2, mm4 ; mm2: tp765 = t7 + tp65 paddsw mm1, mm5 ; mm1: tp465 = t4 + tm65 movq mm4, [tan3] ; tan3 - 1 movq mm5, [tan1] ; tan1 movq mm7, mm3 ; save tm465 pmulhw mm3, mm4 ; tm465*(tan3-1) movq mm6, mm1 ; save tp465 pmulhw mm1, mm5 ; tp465*tan1 paddsw mm3, mm7 ; tm465*tan3 pmulhw mm4, mm0 ; tm765*(tan3-1) paddsw mm4, mm0 ; tm765*tan3 pmulhw mm5, mm2 ; tp765*tan1 paddsw mm1, mm2 ; out1 = tp765 + tp465*tan1 psubsw mm0, mm3 ; out3 = tm765 - tm465*tan3 paddsw mm7, mm4 ; out5 = tm465 + tm765*tan3 psubsw mm5, mm6 ; out7 =-tp465 + tp765*tan1 movq [%1+1*16], mm1 ; => out1 movq [%1+3*16], mm0 ; => out3 movq [%1+5*16], mm7 ; => out5 movq [%1+7*16], mm5 ; => out7 %endmacro ;----------------------------------------------------------------------------- ; fMTX_MULT_XMM (~20c) ; %1=dst, %2=src, %3 = Coeffs, %4/%5=rounders ;----------------------------------------------------------------------------- %macro fMTX_MULT_XMM 5 movq mm0, [%2 + 0] ; mm0 = [0123] ; the 'pshufw' 
below is the only SSE instruction. ; For MMX-only version, it should be emulated with ; some 'punpck' soup... pshufw mm1, [%2 + 8], 00011011b ; mm1 = [7654] movq mm7, mm0 paddsw mm0, mm1 ; mm0 = [a0 a1 a2 a3] psubsw mm7, mm1 ; mm7 = [b0 b1 b2 b3] movq mm1, mm0 punpckldq mm0, mm7 ; mm0 = [a0 a1 b0 b1] punpckhdq mm1, mm7 ; mm1 = [b2 b3 a2 a3] movq mm2, qword [%3 + 0] ; [ M00 M01 M16 M17] movq mm3, qword [%3 + 8] ; [ M02 M03 M18 M19] pmaddwd mm2, mm0 ; [a0.M00+a1.M01 | b0.M16+b1.M17] movq mm4, qword [%3 + 16] ; [ M04 M05 M20 M21] pmaddwd mm3, mm1 ; [a2.M02+a3.M03 | b2.M18+b3.M19] movq mm5, qword [%3 + 24] ; [ M06 M07 M22 M23] pmaddwd mm4, mm0 ; [a0.M04+a1.M05 | b0.M20+b1.M21] movq mm6, qword [%3 + 32] ; [ M08 M09 M24 M25] pmaddwd mm5, mm1 ; [a2.M06+a3.M07 | b2.M22+b3.M23] movq mm7, qword [%3 + 40] ; [ M10 M11 M26 M27] pmaddwd mm6, mm0 ; [a0.M08+a1.M09 | b0.M24+b1.M25] paddd mm2, mm3 ; [ out0 | out1 ] pmaddwd mm7, mm1 ; [a0.M10+a1.M11 | b0.M26+b1.M27] psrad mm2, 16 pmaddwd mm0, [%3 + 48] ; [a0.M12+a1.M13 | b0.M28+b1.M29] paddd mm4, mm5 ; [ out2 | out3 ] pmaddwd mm1, [%3 + 56] ; [a0.M14+a1.M15 | b0.M30+b1.M31] psrad mm4, 16 paddd mm6, mm7 ; [ out4 | out5 ] psrad mm6, 16 paddd mm0, mm1 ; [ out6 | out7 ] psrad mm0, 16 packssdw mm2, mm4 ; [ out0|out1|out2|out3 ] paddsw mm2, [%4] ; Round packssdw mm6, mm0 ; [ out4|out5|out6|out7 ] paddsw mm6, [%5] ; Round psraw mm2, 4 ; => [-2048, 2047] psraw mm6, 4 movq [%1 + 0], mm2 movq [%1 + 8], mm6 %endmacro ;----------------------------------------------------------------------------- ; fMTX_MULT_MMX (~22c) ; %1=dst, %2=src, %3 = Coeffs, %4/%5=rounders ;----------------------------------------------------------------------------- %macro fMTX_MULT_MMX 5 ; MMX-only version (no 'pshufw'. ~10% overall slower than SSE) movd mm1, [%2 + 8 + 4] ; [67..] movq mm0, [%2 + 0] ; mm0 = [0123] movq mm7, mm0 punpcklwd mm1, [%2 + 8] ; [6475] movq mm2, mm1 psrlq mm1, 32 ; [75..] 
punpcklwd mm1,mm2 ; [7654] paddsw mm0, mm1 ; mm0 = [a0 a1 a2 a3] psubsw mm7, mm1 ; mm7 = [b0 b1 b2 b3] movq mm1, mm0 punpckldq mm0, mm7 ; mm0 = [a0 a1 b0 b1] punpckhdq mm1, mm7 ; mm1 = [b2 b3 a2 a3] movq mm2, qword [%3 + 0] ; [ M00 M01 M16 M17] movq mm3, qword [%3 + 8] ; [ M02 M03 M18 M19] pmaddwd mm2, mm0 ; [a0.M00+a1.M01 | b0.M16+b1.M17] movq mm4, qword [%3 + 16] ; [ M04 M05 M20 M21] pmaddwd mm3, mm1 ; [a2.M02+a3.M03 | b2.M18+b3.M19] movq mm5, qword [%3 + 24] ; [ M06 M07 M22 M23] pmaddwd mm4, mm0 ; [a0.M04+a1.M05 | b0.M20+b1.M21] movq mm6, qword [%3 + 32] ; [ M08 M09 M24 M25] pmaddwd mm5, mm1 ; [a2.M06+a3.M07 | b2.M22+b3.M23] movq mm7, qword [%3 + 40] ; [ M10 M11 M26 M27] pmaddwd mm6, mm0 ; [a0.M08+a1.M09 | b0.M24+b1.M25] paddd mm2, mm3 ; [ out0 | out1 ] pmaddwd mm7, mm1 ; [a0.M10+a1.M11 | b0.M26+b1.M27] psrad mm2, 16 pmaddwd mm0, [%3 + 48] ; [a0.M12+a1.M13 | b0.M28+b1.M29] paddd mm4, mm5 ; [ out2 | out3 ] pmaddwd mm1, [%3 + 56] ; [a0.M14+a1.M15 | b0.M30+b1.M31] psrad mm4, 16 paddd mm6, mm7 ; [ out4 | out5 ] psrad mm6, 16 paddd mm0, mm1 ; [ out6 | out7 ] psrad mm0, 16 packssdw mm2, mm4 ; [ out0|out1|out2|out3 ] paddsw mm2, [%4] ; Round packssdw mm6, mm0 ; [ out4|out5|out6|out7 ] paddsw mm6, [%5] ; Round psraw mm2, 4 ; => [-2048, 2047] psraw mm6, 4 movq [%1 + 0], mm2 movq [%1 + 8], mm6 %endmacro ;----------------------------------------------------------------------------- ; MAKE_FDCT_FUNC ; %1 funcname, %2 macro for row dct ;----------------------------------------------------------------------------- %macro MAKE_FDCT_FUNC 2 ALIGN SECTION_ALIGN cglobal %1 %1: mov TMP0, prm1 %ifndef UNROLLED_LOOP push _EBX push _EDI %endif fLLM_PASS TMP0+0, TMP0+0, 3 fLLM_PASS TMP0+8, TMP0+8, 3 %ifdef UNROLLED_LOOP %assign i 0 %rep 8 %2 TMP0+i*16, TMP0+i*16, fdct_table+i*64, fdct_rounding_1+i*8, fdct_rounding_2+i*8 %assign i i+1 %endrep %else mov _EAX, 8 mov TMP1, fdct_table mov _EBX, fdct_rounding_1 mov _EDI, fdct_rounding_2 .loop %2 TMP0, TMP0, TMP1, _EBX, _EDI add TMP0, 2*8 add 
TMP1, 2*32 add _EBX, 2*4 add _EDI, 2*4 dec _EAX jne .loop pop _EDI pop _EBX %endif ret ENDFUNC %endmacro ;============================================================================= ; Code ;============================================================================= TEXT ;----------------------------------------------------------------------------- ; void fdct_mmx_skal(int16_t block[64]); ;----------------------------------------------------------------------------- MAKE_FDCT_FUNC fdct_mmx_skal, fMTX_MULT_MMX ;----------------------------------------------------------------------------- ; void fdct_xmm_skal(int16_t block[64]); ;----------------------------------------------------------------------------- MAKE_FDCT_FUNC fdct_xmm_skal, fMTX_MULT_XMM NON_EXEC_STACK xvidcore/src/dct/x86_asm/idct_mmx.asm0000664000076500007650000005564211254216113020627 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - MMX and XMM inverse discrete cosine transform - ; * ; * Copyright(C) 2001 Peter Ross ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: idct_mmx.asm,v 1.15 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ ; **************************************************************************** ; ; Originally provided by Intel at AP-922 ; http://developer.intel.com/vtune/cbts/strmsimd/922down.htm ; (See more app notes at http://developer.intel.com/vtune/cbts/strmsimd/appnotes.htm) ; but in a limited edition. ; New macro implements a column part for precise iDCT ; The routine precision now satisfies IEEE standard 1180-1990. ; ; Copyright(C) 2000-2001 Peter Gubanov ; Rounding trick Copyright(C) 2000 Michel Lespinasse ; ; http://www.elecard.com/peter/idct.html ; http://www.linuxvideo.org/mpeg2dec/ ; ; ***************************************************************************/ ; ; These examples contain code fragments for first stage iDCT 8x8 ; (for rows) and first stage DCT 8x8 (for columns) ; ;============================================================================= ; Macros and other preprocessor constants ;============================================================================= %include "nasm.inc" %define BITS_INV_ACC 5 ; 4 or 5 for IEEE %define SHIFT_INV_ROW 16 - BITS_INV_ACC %define SHIFT_INV_COL 1 + BITS_INV_ACC %define RND_INV_ROW 1024 * (6 - BITS_INV_ACC) ; 1 << (SHIFT_INV_ROW-1) %define RND_INV_COL 16 * (BITS_INV_ACC - 3) ; 1 << (SHIFT_INV_COL-1) %define RND_INV_CORR RND_INV_COL - 1 ; correction -1.0 and round %define BITS_FRW_ACC 3 ; 2 or 3 for accuracy %define SHIFT_FRW_COL BITS_FRW_ACC %define SHIFT_FRW_ROW BITS_FRW_ACC + 17 %define RND_FRW_ROW 262144*(BITS_FRW_ACC - 1) ; 1 << (SHIFT_FRW_ROW-1) ;============================================================================= ; Local Data (Read Only) 
;============================================================================= DATA ;----------------------------------------------------------------------------- ; Various memory constants (trigonometric values or rounding values) ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN one_corr: dw 1, 1, 1, 1 round_inv_row: dd RND_INV_ROW, RND_INV_ROW round_inv_col: dw RND_INV_COL, RND_INV_COL, RND_INV_COL, RND_INV_COL round_inv_corr: dw RND_INV_CORR, RND_INV_CORR, RND_INV_CORR, RND_INV_CORR round_frw_row: dd RND_FRW_ROW, RND_FRW_ROW tg_1_16: dw 13036, 13036, 13036, 13036 ; tan(pi/16) * (1<<16), rounded tg_2_16: dw 27146, 27146, 27146, 27146 ; tan(2pi/16) * (1<<16), rounded tg_3_16: dw -21746, -21746, -21746, -21746 ; (tan(3pi/16) - 1) * (1<<16), rounded cos_4_16: dw -19195, -19195, -19195, -19195 ; (cos(4pi/16) - 1) * (1<<16), rounded ocos_4_16: dw 23170, 23170, 23170, 23170 ; cos(4pi/16) * (1<<15), rounded otg_3_16: dw 21895, 21895, 21895, 21895 ; tan(3pi/16) * (1<<15), rounded %if SHIFT_INV_ROW == 12 ; assume SHIFT_INV_ROW == 12 rounder_0: dd 65536, 65536 rounder_4: dd 0, 0 rounder_1: dd 7195, 7195 rounder_7: dd 1024, 1024 rounder_2: dd 4520, 4520 rounder_6: dd 1024, 1024 rounder_3: dd 2407, 2407 rounder_5: dd 240, 240 %elif SHIFT_INV_ROW == 11 ; assume SHIFT_INV_ROW == 11 rounder_0: dd 65536, 65536 rounder_4: dd 0, 0 rounder_1: dd 3597, 3597 rounder_7: dd 512, 512 rounder_2: dd 2260, 2260 rounder_6: dd 512, 512 rounder_3: dd 1203, 1203 rounder_5: dd 120, 120 %else %error invalid SHIFT_INV_ROW %endif ;----------------------------------------------------------------------------- ; ; The first stage iDCT 8x8 - inverse DCTs of rows ; ;----------------------------------------------------------------------------- ; The 8-point inverse DCT direct algorithm ;----------------------------------------------------------------------------- ; ; static const short w[32] = { ; FIX(cos_4_16), FIX(cos_2_16), FIX(cos_4_16), FIX(cos_6_16), ; FIX(cos_4_16), FIX(cos_6_16), -FIX(cos_4_16), -FIX(cos_2_16), ; 
FIX(cos_4_16), -FIX(cos_6_16), -FIX(cos_4_16), FIX(cos_2_16), ; FIX(cos_4_16), -FIX(cos_2_16), FIX(cos_4_16), -FIX(cos_6_16), ; FIX(cos_1_16), FIX(cos_3_16), FIX(cos_5_16), FIX(cos_7_16), ; FIX(cos_3_16), -FIX(cos_7_16), -FIX(cos_1_16), -FIX(cos_5_16), ; FIX(cos_5_16), -FIX(cos_1_16), FIX(cos_7_16), FIX(cos_3_16), ; FIX(cos_7_16), -FIX(cos_5_16), FIX(cos_3_16), -FIX(cos_1_16) }; ; ; #define DCT_8_INV_ROW(x, y) ; { ; int a0, a1, a2, a3, b0, b1, b2, b3; ; ; a0 =x[0]*w[0]+x[2]*w[1]+x[4]*w[2]+x[6]*w[3]; ; a1 =x[0]*w[4]+x[2]*w[5]+x[4]*w[6]+x[6]*w[7]; ; a2 = x[0] * w[ 8] + x[2] * w[ 9] + x[4] * w[10] + x[6] * w[11]; ; a3 = x[0] * w[12] + x[2] * w[13] + x[4] * w[14] + x[6] * w[15]; ; b0 = x[1] * w[16] + x[3] * w[17] + x[5] * w[18] + x[7] * w[19]; ; b1 = x[1] * w[20] + x[3] * w[21] + x[5] * w[22] + x[7] * w[23]; ; b2 = x[1] * w[24] + x[3] * w[25] + x[5] * w[26] + x[7] * w[27]; ; b3 = x[1] * w[28] + x[3] * w[29] + x[5] * w[30] + x[7] * w[31]; ; ; y[0] = SHIFT_ROUND ( a0 + b0 ); ; y[1] = SHIFT_ROUND ( a1 + b1 ); ; y[2] = SHIFT_ROUND ( a2 + b2 ); ; y[3] = SHIFT_ROUND ( a3 + b3 ); ; y[4] = SHIFT_ROUND ( a3 - b3 ); ; y[5] = SHIFT_ROUND ( a2 - b2 ); ; y[6] = SHIFT_ROUND ( a1 - b1 ); ; y[7] = SHIFT_ROUND ( a0 - b0 ); ; } ; ;----------------------------------------------------------------------------- ; ; In this implementation the outputs of the iDCT-1D are multiplied ; for rows 0,4 - by cos_4_16, ; for rows 1,7 - by cos_1_16, ; for rows 2,6 - by cos_2_16, ; for rows 3,5 - by cos_3_16 ; and are shifted to the left for better accuracy ; ; For the constants used, ; FIX(float_const) = (short) (float_const * (1<<15) + 0.5) ; ;----------------------------------------------------------------------------- ;----------------------------------------------------------------------------- ; Tables for mmx processors ;----------------------------------------------------------------------------- ; Table for rows 0,4 - constants are multiplied by cos_4_16 tab_i_04_mmx: dw 16384, 16384, 16384, 
-16384 ; movq-> w06 w04 w02 w00 dw 21407, 8867, 8867, -21407 ; w07 w05 w03 w01 dw 16384, -16384, 16384, 16384 ; w14 w12 w10 w08 dw -8867, 21407, -21407, -8867 ; w15 w13 w11 w09 dw 22725, 12873, 19266, -22725 ; w22 w20 w18 w16 dw 19266, 4520, -4520, -12873 ; w23 w21 w19 w17 dw 12873, 4520, 4520, 19266 ; w30 w28 w26 w24 dw -22725, 19266, -12873, -22725 ; w31 w29 w27 w25 ; Table for rows 1,7 - constants are multiplied by cos_1_16 tab_i_17_mmx: dw 22725, 22725, 22725, -22725 ; movq-> w06 w04 w02 w00 dw 29692, 12299, 12299, -29692 ; w07 w05 w03 w01 dw 22725, -22725, 22725, 22725 ; w14 w12 w10 w08 dw -12299, 29692, -29692, -12299 ; w15 w13 w11 w09 dw 31521, 17855, 26722, -31521 ; w22 w20 w18 w16 dw 26722, 6270, -6270, -17855 ; w23 w21 w19 w17 dw 17855, 6270, 6270, 26722 ; w30 w28 w26 w24 dw -31521, 26722, -17855, -31521 ; w31 w29 w27 w25 ; Table for rows 2,6 - constants are multiplied by cos_2_16 tab_i_26_mmx: dw 21407, 21407, 21407, -21407 ; movq-> w06 w04 w02 w00 dw 27969, 11585, 11585, -27969 ; w07 w05 w03 w01 dw 21407, -21407, 21407, 21407 ; w14 w12 w10 w08 dw -11585, 27969, -27969, -11585 ; w15 w13 w11 w09 dw 29692, 16819, 25172, -29692 ; w22 w20 w18 w16 dw 25172, 5906, -5906, -16819 ; w23 w21 w19 w17 dw 16819, 5906, 5906, 25172 ; w30 w28 w26 w24 dw -29692, 25172, -16819, -29692 ; w31 w29 w27 w25 ; Table for rows 3,5 - constants are multiplied by cos_3_16 tab_i_35_mmx: dw 19266, 19266, 19266, -19266 ; movq-> w06 w04 w02 w00 dw 25172, 10426, 10426, -25172 ; w07 w05 w03 w01 dw 19266, -19266, 19266, 19266 ; w14 w12 w10 w08 dw -10426, 25172, -25172, -10426 ; w15 w13 w11 w09 dw 26722, 15137, 22654, -26722 ; w22 w20 w18 w16 dw 22654, 5315, -5315, -15137 ; w23 w21 w19 w17 dw 15137, 5315, 5315, 22654 ; w30 w28 w26 w24 dw -26722, 22654, -15137, -26722 ; w31 w29 w27 w25 ;----------------------------------------------------------------------------- ; Tables for xmm processors ;----------------------------------------------------------------------------- ; %3 for rows 0,4 - 
constants are multiplied by cos_4_16 tab_i_04_xmm: dw 16384, 21407, 16384, 8867 ; movq-> w05 w04 w01 w00 dw 16384, 8867, -16384, -21407 ; w07 w06 w03 w02 dw 16384, -8867, 16384, -21407 ; w13 w12 w09 w08 dw -16384, 21407, 16384, -8867 ; w15 w14 w11 w10 dw 22725, 19266, 19266, -4520 ; w21 w20 w17 w16 dw 12873, 4520, -22725, -12873 ; w23 w22 w19 w18 dw 12873, -22725, 4520, -12873 ; w29 w28 w25 w24 dw 4520, 19266, 19266, -22725 ; w31 w30 w27 w26 ; %3 for rows 1,7 - constants are multiplied by cos_1_16 tab_i_17_xmm: dw 22725, 29692, 22725, 12299 ; movq-> w05 w04 w01 w00 dw 22725, 12299, -22725, -29692 ; w07 w06 w03 w02 dw 22725, -12299, 22725, -29692 ; w13 w12 w09 w08 dw -22725, 29692, 22725, -12299 ; w15 w14 w11 w10 dw 31521, 26722, 26722, -6270 ; w21 w20 w17 w16 dw 17855, 6270, -31521, -17855 ; w23 w22 w19 w18 dw 17855, -31521, 6270, -17855 ; w29 w28 w25 w24 dw 6270, 26722, 26722, -31521 ; w31 w30 w27 w26 ; %3 for rows 2,6 - constants are multiplied by cos_2_16 tab_i_26_xmm: dw 21407, 27969, 21407, 11585 ; movq-> w05 w04 w01 w00 dw 21407, 11585, -21407, -27969 ; w07 w06 w03 w02 dw 21407, -11585, 21407, -27969 ; w13 w12 w09 w08 dw -21407, 27969, 21407, -11585 ; w15 w14 w11 w10 dw 29692, 25172, 25172, -5906 ; w21 w20 w17 w16 dw 16819, 5906, -29692, -16819 ; w23 w22 w19 w18 dw 16819, -29692, 5906, -16819 ; w29 w28 w25 w24 dw 5906, 25172, 25172, -29692 ; w31 w30 w27 w26 ; %3 for rows 3,5 - constants are multiplied by cos_3_16 tab_i_35_xmm: dw 19266, 25172, 19266, 10426 ; movq-> w05 w04 w01 w00 dw 19266, 10426, -19266, -25172 ; w07 w06 w03 w02 dw 19266, -10426, 19266, -25172 ; w13 w12 w09 w08 dw -19266, 25172, 19266, -10426 ; w15 w14 w11 w10 dw 26722, 22654, 22654, -5315 ; w21 w20 w17 w16 dw 15137, 5315, -26722, -15137 ; w23 w22 w19 w18 dw 15137, -26722, 5315, -15137 ; w29 w28 w25 w24 dw 5315, 22654, 22654, -26722 ; w31 w30 w27 w26 ;============================================================================= ; Helper macros for the code 
;============================================================================= ;----------------------------------------------------------------------------- ; DCT_8_INV_ROW_MMX INP, OUT, TABLE, ROUNDER ;----------------------------------------------------------------------------- %macro DCT_8_INV_ROW_MMX 4 movq mm0, [%1] ; 0 ; x3 x2 x1 x0 movq mm1, [%1+8] ; 1 ; x7 x6 x5 x4 movq mm2, mm0 ; 2 ; x3 x2 x1 x0 movq mm3, [%3] ; 3 ; w06 w04 w02 w00 punpcklwd mm0, mm1 ; x5 x1 x4 x0 movq mm5, mm0 ; 5 ; x5 x1 x4 x0 punpckldq mm0, mm0 ; x4 x0 x4 x0 movq mm4, [%3+8] ; 4 ; w07 w05 w03 w01 punpckhwd mm2, mm1 ; 1 ; x7 x3 x6 x2 pmaddwd mm3, mm0 ; x4*w06+x0*w04 x4*w02+x0*w00 movq mm6, mm2 ; 6 ; x7 x3 x6 x2 movq mm1, [%3+32] ; 1 ; w22 w20 w18 w16 punpckldq mm2, mm2 ; x6 x2 x6 x2 pmaddwd mm4, mm2 ; x6*w07+x2*w05 x6*w03+x2*w01 punpckhdq mm5, mm5 ; x5 x1 x5 x1 pmaddwd mm0, [%3+16] ; x4*w14+x0*w12 x4*w10+x0*w08 punpckhdq mm6, mm6 ; x7 x3 x7 x3 movq mm7, [%3+40] ; 7 ; w23 w21 w19 w17 pmaddwd mm1, mm5 ; x5*w22+x1*w20 x5*w18+x1*w16 paddd mm3, [%4] ; +%4 pmaddwd mm7, mm6 ; x7*w23+x3*w21 x7*w19+x3*w17 pmaddwd mm2, [%3+24] ; x6*w15+x2*w13 x6*w11+x2*w09 paddd mm3, mm4 ; 4 ; a1=sum(even1) a0=sum(even0) pmaddwd mm5, [%3+48] ; x5*w30+x1*w28 x5*w26+x1*w24 movq mm4, mm3 ; 4 ; a1 a0 pmaddwd mm6, [%3+56] ; x7*w31+x3*w29 x7*w27+x3*w25 paddd mm1, mm7 ; 7 ; b1=sum(odd1) b0=sum(odd0) paddd mm0, [%4] ; +%4 psubd mm3, mm1 ; a1-b1 a0-b0 psrad mm3, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 paddd mm1, mm4 ; 4 ; a1+b1 a0+b0 paddd mm0, mm2 ; 2 ; a3=sum(even3) a2=sum(even2) psrad mm1, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 paddd mm5, mm6 ; 6 ; b3=sum(odd3) b2=sum(odd2) movq mm4, mm0 ; 4 ; a3 a2 paddd mm0, mm5 ; a3+b3 a2+b2 psubd mm4, mm5 ; 5 ; a3-b3 a2-b2 psrad mm0, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 psrad mm4, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 packssdw mm1, mm0 ; 0 ; y3 y2 y1 y0 packssdw mm4, mm3 ; 3 ; y6 y7 y4 y5 movq mm7, mm4 ; 7 ; y6 y7 y4 y5 psrld mm4, 16 ; 0 y6 0 y4 pslld mm7, 16 ; y7 0 y5 0 movq [%2], mm1 ; 1 ; save y3 
y2 y1 y0 por mm7, mm4 ; 4 ; y7 y6 y5 y4 movq [%2+8], mm7 ; 7 ; save y7 y6 y5 y4 %endmacro ;----------------------------------------------------------------------------- ; DCT_8_INV_ROW_XMM INP, OUT, TABLE, ROUNDER ;----------------------------------------------------------------------------- %macro DCT_8_INV_ROW_XMM 4 movq mm0, [%1] ; 0 ; x3 x2 x1 x0 movq mm1, [%1+8] ; 1 ; x7 x6 x5 x4 movq mm2, mm0 ; 2 ; x3 x2 x1 x0 movq mm3, [%3] ; 3 ; w05 w04 w01 w00 pshufw mm0, mm0, 10001000b ; x2 x0 x2 x0 movq mm4, [%3+8] ; 4 ; w07 w06 w03 w02 movq mm5, mm1 ; 5 ; x7 x6 x5 x4 pmaddwd mm3, mm0 ; x2*w05+x0*w04 x2*w01+x0*w00 movq mm6, [%3+32] ; 6 ; w21 w20 w17 w16 pshufw mm1, mm1, 10001000b ; x6 x4 x6 x4 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 movq mm7, [%3+40] ; 7 ; w23 w22 w19 w18 pshufw mm2, mm2, 11011101b ; x3 x1 x3 x1 pmaddwd mm6, mm2 ; x3*w21+x1*w20 x3*w17+x1*w16 pshufw mm5, mm5, 11011101b ; x7 x5 x7 x5 pmaddwd mm7, mm5 ; x7*w23+x5*w22 x7*w19+x5*w18 paddd mm3, [%4] ; +%4 pmaddwd mm0, [%3+16] ; x2*w13+x0*w12 x2*w09+x0*w08 paddd mm3, mm4 ; 4 ; a1=sum(even1) a0=sum(even0) pmaddwd mm1, [%3+24] ; x6*w15+x4*w14 x6*w11+x4*w10 movq mm4, mm3 ; 4 ; a1 a0 pmaddwd mm2, [%3+48] ; x3*w29+x1*w28 x3*w25+x1*w24 paddd mm6, mm7 ; 7 ; b1=sum(odd1) b0=sum(odd0) pmaddwd mm5, [%3+56] ; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm6 ; a1+b1 a0+b0 paddd mm0, [%4] ; +%4 psrad mm3, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 paddd mm0, mm1 ; 1 ; a3=sum(even3) a2=sum(even2) psubd mm4, mm6 ; 6 ; a1-b1 a0-b0 movq mm7, mm0 ; 7 ; a3 a2 paddd mm2, mm5 ; 5 ; b3=sum(odd3) b2=sum(odd2) paddd mm0, mm2 ; a3+b3 a2+b2 psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psubd mm7, mm2 ; 2 ; a3-b3 a2-b2 psrad mm0, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 packssdw mm3, mm0 ; 0 ; y3 y2 y1 y0 packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 movq [%2], mm3 ; 3 ; save y3 y2 y1 y0 pshufw mm7, mm7, 10110001b ; y7 y6 y5 y4 movq [%2+8], mm7 ; 7 ; save y7 y6 y5 y4 %endmacro 
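;-----------------------------------------------------------------------------
; Note on the pshufw selectors used by DCT_8_INV_ROW_XMM above.
;
; What follows is a minimal reference sketch in C, not code taken from this
; file; the function name pshufw_ref is hypothetical. Each 2-bit field of the
; immediate, read from the least significant bits upward, selects the source
; word that lands in the corresponding destination word.
;
; #include <stdint.h>
; #include <string.h>
;
; static void pshufw_ref(int16_t dst[4], const int16_t src[4], unsigned imm8)
; {
;   int16_t tmp[4];
;   int i;
;   for (i = 0; i < 4; i++)
;     tmp[i] = src[(imm8 >> (2 * i)) & 3];  /* field i picks a source word */
;   memcpy(dst, tmp, sizeof(tmp));          /* via tmp, so dst may alias src */
; }
;
; Under this model, with registers written high word first as in the
; comments of the macro:
;  10001000b picks words 0,2,0,2:  x3 x2 x1 x0 -> x2 x0 x2 x0
;  11011101b picks words 1,3,1,3:  x3 x2 x1 x0 -> x3 x1 x3 x1
;  10110001b swaps each word pair: y6 y7 y4 y5 -> y7 y6 y5 y4
;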
;-----------------------------------------------------------------------------
;
; The first stage DCT 8x8 - forward DCTs of columns
;
; The outputs are multiplied
;  for rows 0,4 - by cos_4_16,
;  for rows 1,7 - by cos_1_16,
;  for rows 2,6 - by cos_2_16,
;  for rows 3,5 - by cos_3_16
; and are shifted to the left for better accuracy
;
;-----------------------------------------------------------------------------
;
; The 8-point scaled forward DCT algorithm (26a8m)
;
;-----------------------------------------------------------------------------
;
; #define DCT_8_FRW_COL(x, y)
;{
;  short t0, t1, t2, t3, t4, t5, t6, t7;
;  short tp03, tm03, tp12, tm12, tp65, tm65;
;  short tp465, tm465, tp765, tm765;
;
;  t0 = LEFT_SHIFT ( x[0] + x[7] );
;  t1 = LEFT_SHIFT ( x[1] + x[6] );
;  t2 = LEFT_SHIFT ( x[2] + x[5] );
;  t3 = LEFT_SHIFT ( x[3] + x[4] );
;  t4 = LEFT_SHIFT ( x[3] - x[4] );
;  t5 = LEFT_SHIFT ( x[2] - x[5] );
;  t6 = LEFT_SHIFT ( x[1] - x[6] );
;  t7 = LEFT_SHIFT ( x[0] - x[7] );
;
;  tp03 = t0 + t3;
;  tm03 = t0 - t3;
;  tp12 = t1 + t2;
;  tm12 = t1 - t2;
;
;  y[0] = tp03 + tp12;
;  y[4] = tp03 - tp12;
;
;  y[2] = tm03 + tm12 * tg_2_16;
;  y[6] = tm03 * tg_2_16 - tm12;
;
;  tp65 = (t6 + t5) * cos_4_16;
;  tm65 = (t6 - t5) * cos_4_16;
;
;  tp765 = t7 + tp65;
;  tm765 = t7 - tp65;
;  tp465 = t4 + tm65;
;  tm465 = t4 - tm65;
;
;  y[1] = tp765 + tp465 * tg_1_16;
;  y[7] = tp765 * tg_1_16 - tp465;
;  y[5] = tm765 * tg_3_16 + tm465;
;  y[3] = tm765 - tm465 * tg_3_16;
;}
;
;-----------------------------------------------------------------------------
;-----------------------------------------------------------------------------
; DCT_8_INV_COL INP, OUT
;-----------------------------------------------------------------------------
%macro DCT_8_INV_COL 2
  movq mm0, [tg_3_16]
  movq mm3, [%1+16*3]
  movq mm1, mm0             ; tg_3_16
  movq mm5, [%1+16*5]
  pmulhw mm0, mm3           ; x3*(tg_3_16-1)
  movq mm4, [tg_1_16]
  pmulhw mm1, mm5           ; x5*(tg_3_16-1)
  movq mm7, [%1+16*7]
  movq mm2, mm4             ; tg_1_16
  movq mm6, [%1+16*1]
  pmulhw mm4, mm7           ;
x7*tg_1_16
  paddsw mm0, mm3           ; x3*tg_3_16
  pmulhw mm2, mm6           ; x1*tg_1_16
  paddsw mm1, mm3           ; x3+x5*(tg_3_16-1)
  psubsw mm0, mm5           ; x3*tg_3_16-x5 = tm35
  movq mm3, [ocos_4_16]
  paddsw mm1, mm5           ; x3+x5*tg_3_16 = tp35
  paddsw mm4, mm6           ; x1+tg_1_16*x7 = tp17
  psubsw mm2, mm7           ; x1*tg_1_16-x7 = tm17
  movq mm5, mm4             ; tp17
  movq mm6, mm2             ; tm17
  paddsw mm5, mm1           ; tp17+tp35 = b0
  psubsw mm6, mm0           ; tm17-tm35 = b3
  psubsw mm4, mm1           ; tp17-tp35 = t1
  paddsw mm2, mm0           ; tm17+tm35 = t2
  movq mm7, [tg_2_16]
  movq mm1, mm4             ; t1
;  movq [SCRATCH+0], mm5    ; save b0
  movq [%2+3*16], mm5       ; save b0
  paddsw mm1, mm2           ; t1+t2
;  movq [SCRATCH+8], mm6    ; save b3
  movq [%2+5*16], mm6       ; save b3
  psubsw mm4, mm2           ; t1-t2
  movq mm5, [%1+2*16]
  movq mm0, mm7             ; tg_2_16
  movq mm6, [%1+6*16]
  pmulhw mm0, mm5           ; x2*tg_2_16
  pmulhw mm7, mm6           ; x6*tg_2_16
; slot
  pmulhw mm1, mm3           ; ocos_4_16*(t1+t2) = b1/2
; slot
  movq mm2, [%1+0*16]
  pmulhw mm4, mm3           ; ocos_4_16*(t1-t2) = b2/2
  psubsw mm0, mm6           ; x2*tg_2_16-x6 = tm26
  movq mm3, mm2             ; x0
  movq mm6, [%1+4*16]
  paddsw mm7, mm5           ; x2+x6*tg_2_16 = tp26
  paddsw mm2, mm6           ; x0+x4 = tp04
  psubsw mm3, mm6           ; x0-x4 = tm04
  movq mm5, mm2             ; tp04
  movq mm6, mm3             ; tm04
  psubsw mm2, mm7           ; tp04-tp26 = a3
  paddsw mm3, mm0           ; tm04+tm26 = a1
  paddsw mm1, mm1           ; b1
  paddsw mm4, mm4           ; b2
  paddsw mm5, mm7           ; tp04+tp26 = a0
  psubsw mm6, mm0           ; tm04-tm26 = a2
  movq mm7, mm3             ; a1
  movq mm0, mm6             ; a2
  paddsw mm3, mm1           ; a1+b1
  paddsw mm6, mm4           ; a2+b2
  psraw mm3, SHIFT_INV_COL  ; dst1
  psubsw mm7, mm1           ; a1-b1
  psraw mm6, SHIFT_INV_COL  ; dst2
  psubsw mm0, mm4           ; a2-b2
;  movq mm1, [SCRATCH+0]    ; load b0
  movq mm1, [%2+3*16]       ; load b0
  psraw mm7, SHIFT_INV_COL  ; dst6
  movq mm4, mm5             ; a0
  psraw mm0, SHIFT_INV_COL  ; dst5
  movq [%2+1*16], mm3
  paddsw mm5, mm1           ; a0+b0
  movq [%2+2*16], mm6
  psubsw mm4, mm1           ; a0-b0
;  movq mm3, [SCRATCH+8]    ; load b3
  movq mm3, [%2+5*16]       ; load b3
  psraw mm5, SHIFT_INV_COL  ; dst0
  movq mm6, mm2             ; a3
  psraw mm4, SHIFT_INV_COL  ; dst7
  movq [%2+5*16], mm0
  paddsw mm2, mm3           ; a3+b3
  movq [%2+6*16], mm7
  psubsw mm6, mm3           ; a3-b3
  movq [%2+0*16], mm5
  psraw mm2, SHIFT_INV_COL  ; dst3
  movq [%2+7*16], mm4
  psraw mm6, SHIFT_INV_COL  ; dst4
  movq [%2+3*16], mm2
  movq [%2+4*16], mm6
%endmacro

;=============================================================================
; Code
;=============================================================================

TEXT

cglobal idct_mmx
cglobal idct_xmm

;-----------------------------------------------------------------------------
; void idct_mmx(uint16_t block[64]);
;-----------------------------------------------------------------------------

ALIGN SECTION_ALIGN
idct_mmx:

  mov TMP0, prm1

  ;; Process each row
  DCT_8_INV_ROW_MMX TMP0+0*16, TMP0+0*16, tab_i_04_mmx, rounder_0
  DCT_8_INV_ROW_MMX TMP0+1*16, TMP0+1*16, tab_i_17_mmx, rounder_1
  DCT_8_INV_ROW_MMX TMP0+2*16, TMP0+2*16, tab_i_26_mmx, rounder_2
  DCT_8_INV_ROW_MMX TMP0+3*16, TMP0+3*16, tab_i_35_mmx, rounder_3
  DCT_8_INV_ROW_MMX TMP0+4*16, TMP0+4*16, tab_i_04_mmx, rounder_4
  DCT_8_INV_ROW_MMX TMP0+5*16, TMP0+5*16, tab_i_35_mmx, rounder_5
  DCT_8_INV_ROW_MMX TMP0+6*16, TMP0+6*16, tab_i_26_mmx, rounder_6
  DCT_8_INV_ROW_MMX TMP0+7*16, TMP0+7*16, tab_i_17_mmx, rounder_7

  ;; Process the columns (4 at a time)
  DCT_8_INV_COL TMP0+0, TMP0+0
  DCT_8_INV_COL TMP0+8, TMP0+8

  ret
ENDFUNC

;-----------------------------------------------------------------------------
; void idct_xmm(uint16_t block[64]);
;-----------------------------------------------------------------------------

ALIGN SECTION_ALIGN
idct_xmm:

  mov TMP0, prm1

  ;; Process each row
  DCT_8_INV_ROW_XMM TMP0+0*16, TMP0+0*16, tab_i_04_xmm, rounder_0
  DCT_8_INV_ROW_XMM TMP0+1*16, TMP0+1*16, tab_i_17_xmm, rounder_1
  DCT_8_INV_ROW_XMM TMP0+2*16, TMP0+2*16, tab_i_26_xmm, rounder_2
  DCT_8_INV_ROW_XMM TMP0+3*16, TMP0+3*16, tab_i_35_xmm, rounder_3
  DCT_8_INV_ROW_XMM TMP0+4*16, TMP0+4*16, tab_i_04_xmm, rounder_4
  DCT_8_INV_ROW_XMM TMP0+5*16, TMP0+5*16, tab_i_35_xmm, rounder_5
  DCT_8_INV_ROW_XMM TMP0+6*16, TMP0+6*16, tab_i_26_xmm, rounder_6
  DCT_8_INV_ROW_XMM TMP0+7*16, TMP0+7*16, tab_i_17_xmm, rounder_7

  ;; Process
the columns (4 at a time)
  DCT_8_INV_COL TMP0+0, TMP0+0
  DCT_8_INV_COL TMP0+8, TMP0+8

  ret
ENDFUNC

NON_EXEC_STACK
xvidcore/src/dct/x86_asm/idct_3dne.asm
;/****************************************************************************
; *
; *  XVID MPEG-4 VIDEO CODEC
; *  - MMX and XMM inverse discrete cosine transform -
; *
; *  Copyright(C) 2001 Peter Ross
; *               2002 Jaan Kalda
; *
; *  This program is free software; you can redistribute it and/or modify it
; *  under the terms of the GNU General Public License as published by
; *  the Free Software Foundation; either version 2 of the License, or
; *  (at your option) any later version.
; *
; *  This program is distributed in the hope that it will be useful,
; *  but WITHOUT ANY WARRANTY; without even the implied warranty of
; *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
; *  GNU General Public License for more details.
; *
; *  You should have received a copy of the GNU General Public License
; *  along with this program; if not, write to the Free Software
; *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
; *
; * $Id: idct_3dne.asm,v 1.11 2009-09-16 17:07:58 Isibaar Exp $
; *
; ***************************************************************************/

; ****************************************************************************
;
; Originally provided by Intel at AP-922
; http://developer.intel.com/vtune/cbts/strmsimd/922down.htm
; (See more app notes at http://developer.intel.com/vtune/cbts/strmsimd/appnotes.htm)
; but in a limited edition.
; New macro implements a column part for precise iDCT
; The routine precision now satisfies IEEE standard 1180-1990.
;
; Copyright(C) 2000-2001 Peter Gubanov
; Rounding trick Copyright(C) 2000 Michel Lespinasse
;
; http://www.elecard.com/peter/idct.html
; http://www.linuxvideo.org/mpeg2dec/
;
; ***************************************************************************/
;
; These examples contain code fragments for first stage iDCT 8x8
; (for rows) and first stage DCT 8x8 (for columns)
;
; ***************************************************************************/

; This 3dne function is compatible with iSSE but is optimized specifically
; for K7 pipelines (ca. 5% gain). For implementation details, see the
; idct_mmx.asm file.
;
; ----------------------------------------------------------------------------
; Athlon optimizations contributed by Jaan Kalda
;-----------------------------------------------------------------------------

;=============================================================================
; Macros and other preprocessor constants
;=============================================================================

%include "nasm.inc"

%define BITS_INV_ACC    5                         ; 4 or 5 for IEEE
%define SHIFT_INV_ROW   16 - BITS_INV_ACC
%define SHIFT_INV_COL   1 + BITS_INV_ACC
%define RND_INV_ROW     1024 * (6 - BITS_INV_ACC) ; 1 << (SHIFT_INV_ROW-1)
%define RND_INV_COL     16 * (BITS_INV_ACC - 3)   ; 1 << (SHIFT_INV_COL-1)
%define RND_INV_CORR    RND_INV_COL - 1           ; correction -1.0 and round

%define BITS_FRW_ACC    3                         ; 2 or 3 for accuracy
%define SHIFT_FRW_COL   BITS_FRW_ACC
%define SHIFT_FRW_ROW   BITS_FRW_ACC + 17
%define RND_FRW_ROW     262144*(BITS_FRW_ACC - 1) ; 1 << (SHIFT_FRW_ROW-1)

;=============================================================================
; Local Data (Read Only)
;=============================================================================

DATA

;-----------------------------------------------------------------------------
; Various memory constants (trigonometric values or rounding values)
;-----------------------------------------------------------------------------

ALIGN SECTION_ALIGN
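; The word constants that follow are 16-bit fixed-point values consumed by
; pmulhw, which keeps the high 16 bits of a signed 16x16 multiply, i.e.
; (a*b) >> 16. A minimal sketch in C of how they can be reproduced is given
; here as an illustration only, not as code from this file: the helper names
; fix16 and pmulhw_ref are hypothetical, M_PI is assumed to be provided by
; <math.h>, and the narrowing casts and the signed right shift assume the
; usual two's-complement behaviour.
;
; #include <math.h>
; #include <stdint.h>
;
; static int16_t fix16(double v, int bits)
; {
;   /* scale, round, then wrap into a signed 16-bit word */
;   return (int16_t)(uint16_t)(v * (double)(1 << bits) + 0.5);
; }
;
; static int16_t pmulhw_ref(int16_t a, int16_t b)
; {
;   return (int16_t)(((int32_t)a * b) >> 16);
; }
;
;  fix16(tan(1 * M_PI / 16), 16) gives  13036  (tg_1_16)
;  fix16(tan(2 * M_PI / 16), 16) gives  27146  (tg_2_16)
;  fix16(tan(3 * M_PI / 16), 16) gives -21746  (tg_3_16, 43790 wrapped)
;  fix16(cos(4 * M_PI / 16), 16) gives -19195  (cos_4_16, 46341 wrapped)
;  fix16(cos(4 * M_PI / 16), 15) gives  23170  (ocos_4_16)
;  fix16(tan(3 * M_PI / 16), 15) gives  21895  (otg_3_16)
;
; A wrapped constant c stands for (value - 1.0), so the full product is
; recovered as pmulhw_ref(x, c) + x. This is why the column pass follows the
; pmulhw by tg_3_16 with a paddsw of the original operand.
;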
one_corr:
  dw 1, 1, 1, 1
round_inv_row:
  dd RND_INV_ROW, RND_INV_ROW
round_inv_col:
  dw RND_INV_COL, RND_INV_COL, RND_INV_COL, RND_INV_COL
round_inv_corr:
  dw RND_INV_CORR, RND_INV_CORR, RND_INV_CORR, RND_INV_CORR
round_frw_row:
  dd RND_FRW_ROW, RND_FRW_ROW
tg_1_16:
  dw 13036, 13036, 13036, 13036     ; tg(1*pi/16) * (1<<16) + 0.5
tg_2_16:
  dw 27146, 27146, 27146, 27146     ; tg(2*pi/16) * (1<<16) + 0.5
tg_3_16:
  dw -21746, -21746, -21746, -21746 ; tg(3*pi/16) * (1<<16) + 0.5, wrapped to signed 16 bits
cos_4_16:
  dw -19195, -19195, -19195, -19195 ; cos(4*pi/16) * (1<<16) + 0.5, wrapped to signed 16 bits
ocos_4_16:
  dw 23170, 23170, 23170, 23170     ; cos(4*pi/16) * (1<<15) + 0.5
otg_3_16:
  dw 21895, 21895, 21895, 21895     ; tg(3*pi/16) * (1<<15) + 0.5

%if SHIFT_INV_ROW == 12   ; assume SHIFT_INV_ROW == 12
rounder_0:
  dd 65536, 65536
rounder_4:
  dd 0, 0
rounder_1:
  dd 7195, 7195
rounder_7:
  dd 1024, 1024
rounder_2:
  dd 4520, 4520
rounder_6:
  dd 1024, 1024
rounder_3:
  dd 2407, 2407
rounder_5:
  dd 240, 240

%elif SHIFT_INV_ROW == 11   ; assume SHIFT_INV_ROW == 11
rounder_0:
  dd 65536, 65536
rounder_4:
  dd 0, 0
rounder_1:
  dd 3597, 3597
rounder_7:
  dd 512, 512
rounder_2:
  dd 2260, 2260
rounder_6:
  dd 512, 512
rounder_3:
  dd 1203, 1203
rounder_5:
  dd 120, 120
%else

%error invalid SHIFT_INV_ROW

%endif

;-----------------------------------------------------------------------------
; Tables for xmm processors
;-----------------------------------------------------------------------------

; Table for rows 0,4 - constants are multiplied by cos_4_16
tab_i_04_xmm:
  dw  16384,  21407,  16384,   8867 ; movq-> w05 w04 w01 w00
  dw  16384,   8867, -16384, -21407 ; w07 w06 w03 w02
  dw  16384,  -8867,  16384, -21407 ; w13 w12 w09 w08
  dw -16384,  21407,  16384,  -8867 ; w15 w14 w11 w10
  dw  22725,  19266,  19266,  -4520 ; w21 w20 w17 w16
  dw  12873,   4520, -22725, -12873 ; w23 w22 w19 w18
  dw  12873, -22725,   4520, -12873 ; w29 w28 w25 w24
  dw   4520,  19266,  19266, -22725 ; w31 w30 w27 w26

; Table for rows 1,7 - constants are multiplied by cos_1_16
tab_i_17_xmm:
  dw  22725,  29692,  22725,  12299 ; movq-> w05 w04 w01 w00
  dw  22725,  12299, -22725, -29692 ; w07 w06 w03 w02
  dw  22725,
-12299, 22725, -29692 ; w13 w12 w09 w08 dw -22725, 29692, 22725, -12299 ; w15 w14 w11 w10 dw 31521, 26722, 26722, -6270 ; w21 w20 w17 w16 dw 17855, 6270, -31521, -17855 ; w23 w22 w19 w18 dw 17855, -31521, 6270, -17855 ; w29 w28 w25 w24 dw 6270, 26722, 26722, -31521 ; w31 w30 w27 w26 ; %3 for rows 2,6 - constants are multiplied by cos_2_16 tab_i_26_xmm: dw 21407, 27969, 21407, 11585 ; movq-> w05 w04 w01 w00 dw 21407, 11585, -21407, -27969 ; w07 w06 w03 w02 dw 21407, -11585, 21407, -27969 ; w13 w12 w09 w08 dw -21407, 27969, 21407, -11585 ; w15 w14 w11 w10 dw 29692, 25172, 25172, -5906 ; w21 w20 w17 w16 dw 16819, 5906, -29692, -16819 ; w23 w22 w19 w18 dw 16819, -29692, 5906, -16819 ; w29 w28 w25 w24 dw 5906, 25172, 25172, -29692 ; w31 w30 w27 w26 ; %3 for rows 3,5 - constants are multiplied by cos_3_16 tab_i_35_xmm: dw 19266, 25172, 19266, 10426 ; movq-> w05 w04 w01 w00 dw 19266, 10426, -19266, -25172 ; w07 w06 w03 w02 dw 19266, -10426, 19266, -25172 ; w13 w12 w09 w08 dw -19266, 25172, 19266, -10426 ; w15 w14 w11 w10 dw 26722, 22654, 22654, -5315 ; w21 w20 w17 w16 dw 15137, 5315, -26722, -15137 ; w23 w22 w19 w18 dw 15137, -26722, 5315, -15137 ; w29 w28 w25 w24 dw 5315, 22654, 22654, -26722 ; w31 w30 w27 w26 ;============================================================================= ; Code ;============================================================================= TEXT cglobal idct_3dne ;----------------------------------------------------------------------------- ; void idct_3dne(uint16_t block[64]); ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN idct_3dne: mov _ECX, prm1 ; DCT_8_INV_ROW_1_s [_ECX+64], [_ECX+64], tab_i_04_sse, rounder_4 ;rounder_4=0 pshufw mm0, [_ECX+64],10001000b ; x2 x0 x2 x0 movq mm3, [tab_i_04_xmm] ; 3 ; w05 w04 w01 w00 pshufw mm1, [_ECX+64+8],10001000b ; x6 x4 x6 x4 movq mm4, [tab_i_04_xmm+8] ; 4 ; w07 w06 w03 w02 pshufw mm2, [_ECX+64],11011101b ; x3 x1 x3 x1 pshufw mm5, 
[_ECX+64+8],11011101b ; x7 x5 x7 x5 movq mm6, [tab_i_04_xmm+32] ; 6 ; w21 w20 w17 w16 pmaddwd mm3, mm0 ; x2*w05+x0*w04 x2*w01+x0*w00 movq mm7, [tab_i_04_xmm+40] ; 7 ; w23 w22 w19 w18 ; pmaddwd mm0, [tab_i_04_xmm+16]; x2*w13+x0*w12 x2*w09+x0*w08 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 pmaddwd mm1, [tab_i_04_xmm+24]; x6*w15+x4*w14 x6*w11+x4*w10 pmaddwd mm6, mm2 ; x3*w21+x1*w20 x3*w17+x1*w16 pmaddwd mm2, [tab_i_04_xmm+48]; x3*w29+x1*w28 x3*w25+x1*w24 pmaddwd mm7, mm5 ; 7 ; x7*w23+x5*w22 x7*w19+x5*w18 ; w23 w22 w19 w18 pmaddwd mm5, [tab_i_04_xmm+56]; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm4 ; 4 free ; a1=sum(even1) a0=sum(even0) paddd mm0, mm1 ; 1 free ; a3=sum(even3) a2=sum(even2) pshufw mm1, [_ECX+80+8],10001000b ; x6 x4 x6 x4 movq mm4, mm3 ; 4 ; a1 a0 paddd mm6, mm7 ; 7 free ; b1=sum(odd1) b0=sum(odd0) paddd mm2, mm5 ; 5 free ; b3=sum(odd3) b2=sum(odd2) pshufw mm5, [_ECX+80],10001000b; x2 x0 x2 x0 mm5 & mm0 exchanged for next cycle movq mm7, mm0 ; 7 ; a3 a2 psubd mm4, mm6 ; 6 free ; a1-b1 a0-b0 paddd mm6, mm3 ; mm6 = mm3+mm6+mm5+mm4; a1+b1 a0+b0 movq mm3, [tab_i_35_xmm] ; 3 ; w05 w04 w01 w00 psubd mm7, mm2 ; ; a3-b3 a2-b2 paddd mm0, mm2 ; 0 free a3+b3 a2+b2 pshufw mm2, [_ECX+80],11011101b; x3 x1 x3 x1 pmaddwd mm3, mm5 ; x2*w05+x0*w04 x2*w01+x0*w00 pmaddwd mm5, [tab_i_35_xmm+16]; x2*w13+x0*w12 x2*w09+x0*w08 psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 psrad mm6, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 psrad mm0, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 packssdw mm6, mm0 ; 0 free ; y3 y2 y1 y0 pshufw mm0, [_ECX+80+8],11011101b ; x7 x5 x7 x5 movq [_ECX+64], mm6 ; 3 ; save y3 y2 y1 y0 stall2 ; DCT_8_INV_ROW_1_s [_ECX+80], [_ECX+80], tab_i_35_xmm, rounder_5 movq mm4, [tab_i_35_xmm+8] ; 4 ; w07 w06 w03 w02 movq mm6, [tab_i_35_xmm+32] ; 6 ; w21 w20 w17 w16 pshufw mm7, mm7, 10110001b ; y7 y6 y5 y4 paddd mm3, [rounder_5] ; +rounder stall 6 paddd mm5, [rounder_5] ; +rounder movq [_ECX+64+8], mm7 ; 7 
; save y7 y6 y5 y4 movq mm7, [tab_i_35_xmm+40] ; 7 ; w23 w22 w19 w18 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 pmaddwd mm1, [tab_i_35_xmm+24]; x6*w15+x4*w14 x6*w11+x4*w10 pmaddwd mm6, mm2 ; x3*w21+x1*w20 x3*w17+x1*w16 pmaddwd mm2, [tab_i_35_xmm+48]; x3*w29+x1*w28 x3*w25+x1*w24 pmaddwd mm7, mm0 ; 7 ; x7*w23+x5*w22 x7*w19+x5*w18 ; w23 w22 w19 w18 pmaddwd mm0, [tab_i_35_xmm+56]; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm4 ; 4 free ; a1=sum(even1) a0=sum(even0) paddd mm5, mm1 ; 1 free ; a3=sum(even3) a2=sum(even2) pshufw mm1, [_ECX+96+8],10001000b ; x6 x4 x6 x4 movq mm4, mm3 ; 4 ; a1 a0 paddd mm6, mm7 ; 7 free ; b1=sum(odd1) b0=sum(odd0) paddd mm2, mm0 ; 5 free ; b3=sum(odd3) b2=sum(odd2) pshufw mm0, [_ECX+96],10001000b ; x2 x0 x2 x0 movq mm7, mm5 ; 7 ; a3 a2 psubd mm4, mm6 ; 6 free ; a1-b1 a0-b0 stall 5 paddd mm6, mm3 ; mm3 = mm3+mm6+mm5+mm4; a1+b1 a0+b0 movq mm3, [tab_i_26_xmm] ; 3 ; w05 w04 w01 w00 psubd mm7, mm2 ; ; a3-b3 a2-b2 paddd mm5, mm2 ; 0 free a3+b3 a2+b2 pshufw mm2, [_ECX+96],11011101b; x3 x1 x3 x1 pmaddwd mm3, mm0 ; x2*w05+x0*w04 x2*w01+x0*w00 pmaddwd mm0, [tab_i_26_xmm+16]; x2*w13+x0*w12 x2*w09+x0*w08 psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 psrad mm6, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 psrad mm5, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 packssdw mm6, mm5 ; 0 free ; y3 y2 y1 y0 pshufw mm5, [_ECX+96+8],11011101b ; x7 x5 x7 x5 movq [_ECX+80], mm6 ; 3 ; save y3 y2 y1 y0 ; DCT_8_INV_ROW_1_s [_ECX+96], [_ECX+96], tab_i_26_xmm, rounder_6 movq mm4, [tab_i_26_xmm+8] ; 4 ; w07 w06 w03 w02 movq mm6, [tab_i_26_xmm+32] ; 6 ; w21 w20 w17 w16 pshufw mm7, mm7, 10110001b ; y7 y6 y5 y4 STALL 6 paddd mm3, [rounder_6] ; +rounder paddd mm0, [rounder_6] ; +rounder movq [_ECX+80+8], mm7 ; 7 ; save y7 y6 movq mm7, [tab_i_26_xmm+40] ; 7 ; w23 w22 w19 w18 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 pmaddwd mm1, [tab_i_26_xmm+24]; x6*w15+x4*w14 x6*w11+x4*w10 pmaddwd mm6, mm2 ; x3*w21+x1*w20 
x3*w17+x1*w16 pmaddwd mm2, [tab_i_26_xmm+48]; x3*w29+x1*w28 x3*w25+x1*w24 pmaddwd mm7, mm5 ; 7 ; x7*w23+x5*w22 x7*w19+x5*w18 ; w23 w22 w19 w18 pmaddwd mm5, [tab_i_26_xmm+56]; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm4 ; 4 free ; a1=sum(even1) a0=sum(even0) paddd mm0, mm1 ; 1 free ; a3=sum(even3) a2=sum(even2) pshufw mm1, [_ECX+112+8],10001000b ; x6 x4 x6 x4 movq mm4, mm3 ; 4 ; a1 a0 paddd mm6, mm7 ; 7 free ; b1=sum(odd1) b0=sum(odd0) paddd mm2, mm5 ; 5 free ; b3=sum(odd3) b2=sum(odd2) pshufw mm5, [_ECX+112],10001000b; x2 x0 x2 x0 mm5 & mm0 exchanged for next cycle movq mm7, mm0 ; 7 ; a3 a2 psubd mm4, mm6 ; 6 free ; a1-b1 a0-b0 paddd mm6, mm3 ; mm6 = mm3+mm6+mm5+mm4; a1+b1 a0+b0 movq mm3, [tab_i_17_xmm] ; 3 ; w05 w04 w01 w00 psubd mm7, mm2 ; ; a3-b3 a2-b2 paddd mm0, mm2 ; 0 free a3+b3 a2+b2 pshufw mm2, [_ECX+112],11011101b; x3 x1 x3 x1 pmaddwd mm3, mm5 ; x2*w05+x0*w04 x2*w01+x0*w00 pmaddwd mm5, [tab_i_17_xmm+16]; x2*w13+x0*w12 x2*w09+x0*w08 psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 psrad mm6, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 psrad mm0, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 packssdw mm6, mm0 ; 0 free ; y3 y2 y1 y0 pshufw mm0, [_ECX+112+8],11011101b ; x7 x5 x7 x5 movq [_ECX+96], mm6 ; 3 ; save y3 y2 y1 y0 stall2 ; DCT_8_INV_ROW_1_s [_ECX+112], [_ECX+112], tab_i_17_xmm, rounder_7 movq mm4, [tab_i_17_xmm+8] ; 4 ; w07 w06 w03 w02 movq mm6, [tab_i_17_xmm+32] ; 6 ; w21 w20 w17 w16 pshufw mm7, mm7, 10110001b ; y7 y6 y5 y4 paddd mm3, [rounder_7] ; +rounder stall 6 paddd mm5, [rounder_7] ; +rounder movq [_ECX+96+8], mm7 ; 7 ; save y7 y6 y5 y4 movq mm7, [tab_i_17_xmm+40] ; 7 ; w23 w22 w19 w18 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 pmaddwd mm1, [tab_i_17_xmm+24]; x6*w15+x4*w14 x6*w11+x4*w10 pmaddwd mm6, mm2 ; x3*w21+x1*w20 x3*w17+x1*w16 pmaddwd mm2, [tab_i_17_xmm+48]; x3*w29+x1*w28 x3*w25+x1*w24 pmaddwd mm7, mm0 ; 7 ; x7*w23+x5*w22 x7*w19+x5*w18 ; w23 w22 w19 w18 pmaddwd mm0, 
[tab_i_17_xmm+56]; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm4 ; 4 free ; a1=sum(even1) a0=sum(even0) paddd mm5, mm1 ; 1 free ; a3=sum(even3) a2=sum(even2) pshufw mm1, [_ECX+0+8],10001000b; x6 x4 x6 x4 movq mm4, mm3 ; 4 ; a1 a0 paddd mm6, mm7 ; 7 free ; b1=sum(odd1) b0=sum(odd0) paddd mm2, mm0 ; 5 free ; b3=sum(odd3) b2=sum(odd2) pshufw mm0, [_ECX+0],10001000b ; x2 x0 x2 x0 movq mm7, mm5 ; 7 ; a3 a2 psubd mm4, mm6 ; 6 free ; a1-b1 a0-b0 stall 5 paddd mm6, mm3 ; mm3 = mm3+mm6+mm5+mm4; a1+b1 a0+b0 movq mm3, [tab_i_04_xmm] ; 3 ; w05 w04 w01 w00 psubd mm7, mm2 ; ; a3-b3 a2-b2 paddd mm5, mm2 ; 0 free a3+b3 a2+b2 pshufw mm2, [_ECX+0],11011101b ; x3 x1 x3 x1 pmaddwd mm3, mm0 ; x2*w05+x0*w04 x2*w01+x0*w00 pmaddwd mm0, [tab_i_04_xmm+16]; x2*w13+x0*w12 x2*w09+x0*w08 psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 psrad mm6, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 psrad mm5, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 packssdw mm6, mm5 ; 0 free ; y3 y2 y1 y0 pshufw mm5, [_ECX+0+8],11011101b; x7 x5 x7 x5 movq [_ECX+112], mm6 ; 3 ; save y3 y2 y1 y0 ; DCT_8_INV_ROW_1_s [_ECX+0], 0, tab_i_04_xmm, rounder_0 movq mm4, [tab_i_04_xmm+8] ; 4 ; w07 w06 w03 w02 movq mm6, [tab_i_04_xmm+32] ; 6 ; w21 w20 w17 w16 pshufw mm7, mm7, 10110001b ; y7 y6 y5 y4 STALL 6 paddd mm3, [rounder_0] ; +rounder paddd mm0, [rounder_0] ; +rounder movq [_ECX+112+8], mm7 ; 7 ; save y7 y6 movq mm7, [tab_i_04_xmm+40] ; 7 ; w23 w22 w19 w18 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 pmaddwd mm1, [tab_i_04_xmm+24]; x6*w15+x4*w14 x6*w11+x4*w10 pmaddwd mm6, mm2 ; x3*w21+x1*w20 x3*w17+x1*w16 pmaddwd mm2, [tab_i_04_xmm+48]; x3*w29+x1*w28 x3*w25+x1*w24 pmaddwd mm7, mm5 ; 7 ; x7*w23+x5*w22 x7*w19+x5*w18 ; w23 w22 w19 w18 pmaddwd mm5, [tab_i_04_xmm+56]; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm4 ; 4 free ; a1=sum(even1) a0=sum(even0) paddd mm0, mm1 ; 1 pshufw mm1, [_ECX+16+8],10001000b ; x6 x4 x6 x4 movq mm4, mm3 ; 4 ; a1 a0 paddd mm6, mm7 ; 7 free ; 
b1=sum(odd1) b0=sum(odd0) paddd mm2, mm5 ; 5 free ; b3=sum(odd3) b2=sum(odd2) pshufw mm5, [_ECX+16],10001000b; x2 x0 x2 x0 mm5 & mm0 exchanged for next cycle movq mm7, mm0 ; 7 ; a3 a2 psubd mm4, mm6 ; 6 free ; a1-b1 a0-b0 paddd mm6, mm3 ; mm6 = mm3+mm6+mm5+mm4; a1+b1 a0+b0 movq mm3, [tab_i_17_xmm] ; 3 ; w05 w04 w01 w00 psubd mm7, mm2 ; ; a3-b3 a2-b2 paddd mm0, mm2 ; 0 free a3+b3 a2+b2 pshufw mm2, [_ECX+16],11011101b; x3 x1 x3 x1 pmaddwd mm3, mm5 ; x2*w05+x0*w04 x2*w01+x0*w00 pmaddwd mm5, [tab_i_17_xmm+16]; x2*w13+x0*w12 x2*w09+x0*w08 psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 psrad mm6, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 psrad mm0, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 packssdw mm6, mm0 ; 0 free ; y3 y2 y1 y0 pshufw mm0, [_ECX+16+8],11011101b ; x7 x5 x7 x5 movq [_ECX+0], mm6 ; 3 ; save y3 y2 y1 y0 stall2 ; DCT_8_INV_ROW_1_s [_ECX+16], 16, tab_i_17_xmm, rounder_1 movq mm4, [tab_i_17_xmm+8] ; 4 ; w07 w06 w03 w02 movq mm6, [tab_i_17_xmm+32] ; 6 ; w21 w20 w17 w16 pshufw mm7, mm7, 10110001b ; y7 y6 y5 y4 paddd mm3, [rounder_1] ; +rounder stall 6 paddd mm5, [rounder_1] ; +rounder movq [_ECX+0+8], mm7 ; 7 ; save y7 y6 y5 y4 movq mm7, [tab_i_17_xmm+40] ; 7 ; w23 w22 w19 w18 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 pmaddwd mm1, [tab_i_17_xmm+24]; x6*w15+x4*w14 x6*w11+x4*w10 pmaddwd mm6, mm2 ; x3*w21+x1*w20 x3*w17+x1*w16 pmaddwd mm2, [tab_i_17_xmm+48]; x3*w29+x1*w28 x3*w25+x1*w24 pmaddwd mm7, mm0 ; 7 ; x7*w23+x5*w22 x7*w19+x5*w18 ; w23 w22 w19 w18 pmaddwd mm0, [tab_i_17_xmm+56]; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm4 ; 4 free ; a1=sum(even1) a0=sum(even0) paddd mm5, mm1 ; 1 free ; a3=sum(even3) a2=sum(even2) pshufw mm1, [_ECX+32+8],10001000b ; x6 x4 x6 x4 movq mm4, mm3 ; 4 ; a1 a0 paddd mm6, mm7 ; 7 free ; b1=sum(odd1) b0=sum(odd0) paddd mm2, mm0 ; 5 free ; b3=sum(odd3) b2=sum(odd2) pshufw mm0, [_ECX+32],10001000b; x2 x0 x2 x0 movq mm7, mm5 ; 7 ; a3 a2 psubd mm4, mm6 ; 6 free ; a1-b1 
a0-b0 stall 5 paddd mm6, mm3 ; mm3 = mm3+mm6+mm5+mm4; a1+b1 a0+b0 movq mm3, [tab_i_26_xmm] ; 3 ; w05 w04 w01 w00 psubd mm7, mm2 ; ; a3-b3 a2-b2 paddd mm5, mm2 ; 0 free a3+b3 a2+b2 pshufw mm2, [_ECX+32],11011101b; x3 x1 x3 x1 pmaddwd mm3, mm0 ; x2*w05+x0*w04 x2*w01+x0*w00 pmaddwd mm0, [tab_i_26_xmm+16]; x2*w13+x0*w12 x2*w09+x0*w08 psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 psrad mm6, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 psrad mm5, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 packssdw mm6, mm5 ; 0 free ; y3 y2 y1 y0 pshufw mm5, [_ECX+32+8],11011101b ; x7 x5 x7 x5 movq [_ECX+16], mm6 ; 3 ; save y3 y2 y1 y0 ; DCT_8_INV_ROW_1_s [_ECX+32], 32, tab_i_26_xmm, rounder_2 movq mm4, [tab_i_26_xmm+8] ; 4 ; w07 w06 w03 w02 movq mm6, [tab_i_26_xmm+32] ; 6 ; w21 w20 w17 w16 pshufw mm7, mm7, 10110001b ; y7 y6 y5 y4 STALL 6 paddd mm3, [rounder_2] ; +rounder paddd mm0, [rounder_2] ; +rounder movq [_ECX+16+8], mm7 ; 7 ; save y7 y6 movq mm7, [tab_i_26_xmm+40] ; 7 ; w23 w22 w19 w18 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 pmaddwd mm1, [tab_i_26_xmm+24]; x6*w15+x4*w14 x6*w11+x4*w10 pmaddwd mm6, mm2 ; x3*w21+x1*w20 x3*w17+x1*w16 pmaddwd mm2, [tab_i_26_xmm+48]; x3*w29+x1*w28 x3*w25+x1*w24 pmaddwd mm7, mm5 ; 7 ; x7*w23+x5*w22 x7*w19+x5*w18 ; w23 w22 w19 w18 pmaddwd mm5, [tab_i_26_xmm+56]; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm4 ; 4 free ; a1=sum(even1) a0=sum(even0) paddd mm0, mm1 ; 1 free ; a3=sum(even3) a2=sum(even2) pshufw mm1, [_ECX+48+8],10001000b ; x6 x4 x6 x4 movq mm4, mm3 ; 4 ; a1 a0 paddd mm6, mm7 ; 7 free ; b1=sum(odd1) b0=sum(odd0) paddd mm2, mm5 ; 5 free ; b3=sum(odd3) b2=sum(odd2) pshufw mm5, [_ECX+48],10001000b; x2 x0 x2 x0 mm5 & mm0 exchanged for next cycle movq mm7, mm0 ; 7 ; a3 a2 psubd mm4, mm6 ; 6 free ; a1-b1 a0-b0 paddd mm6, mm3 ; mm6 = mm3+mm6+mm5+mm4; a1+b1 a0+b0 movq mm3, [tab_i_35_xmm] ; 3 ; w05 w04 w01 w00 psubd mm7, mm2 ; ; a3-b3 a2-b2 paddd mm0, mm2 ; 0 free a3+b3 a2+b2 pshufw mm2, 
[_ECX+48],11011101b; x3 x1 x3 x1 pmaddwd mm3, mm5 ; x2*w05+x0*w04 x2*w01+x0*w00 pmaddwd mm5, [tab_i_35_xmm+16]; x2*w13+x0*w12 x2*w09+x0*w08 psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 psrad mm6, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 psrad mm0, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 packssdw mm6, mm0 ; 0 free ; y3 y2 y1 y0 pshufw mm0, [_ECX+48+8],11011101b ; x7 x5 x7 x5 movq [_ECX+32], mm6 ; 3 ; save y3 y2 y1 y0 stall2 ; DCT_8_INV_ROW_1_s [_ECX+48], [_ECX+48], tab_i_35_xmm, rounder_3 movq mm4, [tab_i_35_xmm+8] ; 4 ; w07 w06 w03 w02 movq mm6, [tab_i_35_xmm+32] ; 6 ; w21 w20 w17 w16 pshufw mm7, mm7, 10110001b ; y7 y6 y5 y4 paddd mm3, [rounder_3] ; +rounder stall 6 paddd mm5, [rounder_3] ; +rounder movq [_ECX+32+8], mm7 ; 7 ; save y7 y6 y5 y4 movq mm7, [tab_i_35_xmm+40] ; 7 ; w23 w22 w19 w18 pmaddwd mm4, mm1 ; x6*w07+x4*w06 x6*w03+x4*w02 pmaddwd mm1, [tab_i_35_xmm+24]; x6*w15+x4*w14 x6*w11+x4*w10 pmaddwd mm6, mm2 ; x3*w21+x1*w20 x3*w17+x1*w16 pmaddwd mm2, [tab_i_35_xmm+48]; x3*w29+x1*w28 x3*w25+x1*w24 pmaddwd mm7, mm0 ; 7 ; x7*w23+x5*w22 x7*w19+x5*w18 ; w23 w22 w19 w18 pmaddwd mm0, [tab_i_35_xmm+56]; x7*w31+x5*w30 x7*w27+x5*w26 paddd mm3, mm4 ; 4 free ; a1=sum(even1) a0=sum(even0) paddd mm5, mm1 ; mm1 free ; a3=sum(even3) a2=sum(even2) movq mm1, [tg_3_16] movq mm4, mm3 ; 4 ; a1 a0 paddd mm6, mm7 ; 7 free ; b1=sum(odd1) b0=sum(odd0) paddd mm2, mm0 ; 5 free ; b3=sum(odd3) b2=sum(odd2) movq mm0, [tg_3_16] movq mm7, mm5 ; 7 ; a3 a2 psubd mm4, mm6 ; 6 free ; a1-b1 a0-b0 paddd mm3, mm6 ; mm3 = mm3+mm6+mm5+mm4; a1+b1 a0+b0 psubd mm7, mm2 ; ; a3-b3 a2-b2 paddd mm2, mm5 ; 0 free a3+b3 a2+b2 movq mm5, [_ECX+16*5] psrad mm4, SHIFT_INV_ROW ; y6=a1-b1 y7=a0-b0 psrad mm7, SHIFT_INV_ROW ; y4=a3-b3 y5=a2-b2 psrad mm3, SHIFT_INV_ROW ; y1=a1+b1 y0=a0+b0 psrad mm2, SHIFT_INV_ROW ; y3=a3+b3 y2=a2+b2 movq mm6, [_ECX+16*1] packssdw mm7, mm4 ; 4 ; y6 y7 y4 y5 movq mm4, [tg_1_16] packssdw mm3, mm2 ; 0 free ; y3 y2 y1 y0 
pshufw mm2, mm7, 10110001b ; y7 y6 y5 y4 ; DCT_8_INV_COL_4 [_ECX+0],[_ECX+0] ; movq mm3,mmword ptr [_ECX+16*3] movq mm7, [_ECX+16*7] pmulhw mm0, mm3 ; x3*(tg_3_16-1) pmulhw mm1, mm5 ; x5*(tg_3_16-1) movq [_ECX+48+8], mm2 ; 7 ; save y7 y6 y5 y4 movq mm2, mm4 ; tg_1_16 pmulhw mm4, mm7 ; x7*tg_1_16 paddsw mm0, mm3 ; x3*tg_3_16 pmulhw mm2, mm6 ; x1*tg_1_16 paddsw mm1, mm3 ; x3+x5*(tg_3_16-1) psubsw mm0, mm5 ; x3*tg_3_16-x5 = tm35 movq [_ECX+48], mm3 ; 3 ; save y3 y2 y1 y0 movq mm3, [ocos_4_16] paddsw mm1, mm5 ; x3+x5*tg_3_16 = tp35 paddsw mm4, mm6 ; x1+tg_1_16*x7 = tp17 psubsw mm2, mm7 ; x1*tg_1_16-x7 = tm17 movq mm5, mm4 ; tp17 movq mm6, mm2 ; tm17 paddsw mm5, mm1 ; tp17+tp35 = b0 psubsw mm6, mm0 ; tm17-tm35 = b3 psubsw mm4, mm1 ; tp17-tp35 = t1 paddsw mm2, mm0 ; tm17+tm35 = t2 movq mm7, [tg_2_16] movq mm1, mm4 ; t1 movq [_ECX+3*16], mm5 ; save b0 paddsw mm1, mm2 ; t1+t2 movq [_ECX+5*16], mm6 ; save b3 psubsw mm4, mm2 ; t1-t2 movq mm5, [_ECX+2*16] movq mm0, mm7 ; tg_2_16 movq mm6, [_ECX+6*16] pmulhw mm0, mm5 ; x2*tg_2_16 pmulhw mm7, mm6 ; x6*tg_2_16 ; slot pmulhw mm1, mm3 ; ocos_4_16*(t1+t2) = b1/2 ; slot movq mm2, [_ECX+0*16] pmulhw mm4, mm3 ; ocos_4_16*(t1-t2) = b2/2 psubsw mm0, mm6 ; t2*tg_2_16-x6 = tm26 movq mm3, [_ECX+0*16] ; x0 movq mm6, [_ECX+4*16] paddsw mm7, mm5 ; x2+x6*tg_2_16 = tp26 paddsw mm2, mm6 ; x0+x4 = tp04 psubsw mm3, mm6 ; x0-x4 = tm04 movq mm5, mm2 ; tp04 movq mm6, mm3 ; tm04 psubsw mm2, mm7 ; tp04-tp26 = a3 paddsw mm3, mm0 ; tm04+tm26 = a1 paddsw mm1, mm1 ; b1 paddsw mm4, mm4 ; b2 paddsw mm5, mm7 ; tp04+tp26 = a0 psubsw mm6, mm0 ; tm04-tm26 = a2 movq mm7, mm3 ; a1 movq mm0, mm6 ; a2 paddsw mm3, mm1 ; a1+b1 paddsw mm6, mm4 ; a2+b2 psraw mm3, SHIFT_INV_COL ; dst1 psubsw mm7, mm1 ; a1-b1 psraw mm6, SHIFT_INV_COL ; dst2 psubsw mm0, mm4 ; a2-b2 movq mm1, [_ECX+3*16] ; load b0 psraw mm7, SHIFT_INV_COL ; dst6 movq mm4, mm5 ; a0 psraw mm0, SHIFT_INV_COL ; dst5 movq [_ECX+1*16], mm3 paddsw mm5, mm1 ; a0+b0 movq [_ECX+2*16], mm6 psubsw mm4, mm1 ; a0-b0 movq 
mm3, [_ECX+5*16] ; load b3 psraw mm5, SHIFT_INV_COL ; dst0 movq mm6, mm2 ; a3 psraw mm4, SHIFT_INV_COL ; dst7 movq [_ECX+5*16], mm0 movq mm0, [tg_3_16] paddsw mm2, mm3 ; a3+b3 movq [_ECX+6*16], mm7 psubsw mm6, mm3 ; a3-b3 movq mm3, [_ECX+8+16*3] movq [_ECX+0*16], mm5 psraw mm2, SHIFT_INV_COL ; dst3 movq [_ECX+7*16], mm4 ; DCT_8_INV_COL_4 [_ECX+8],[_ECX+8] movq mm1, mm0 ; tg_3_16 movq mm5, [_ECX+8+16*5] psraw mm6, SHIFT_INV_COL ; dst4 pmulhw mm0, mm3 ; x3*(tg_3_16-1) movq mm4, [tg_1_16] pmulhw mm1, mm5 ; x5*(tg_3_16-1) movq mm7, [_ECX+8+16*7] movq [_ECX+3*16], mm2 movq mm2, mm4 ; tg_1_16 movq [_ECX+4*16], mm6 movq mm6, [_ECX+8+16*1] pmulhw mm4, mm7 ; x7*tg_1_16 paddsw mm0, mm3 ; x3*tg_3_16 pmulhw mm2, mm6 ; x1*tg_1_16 paddsw mm1, mm3 ; x3+x5*(tg_3_16-1) psubsw mm0, mm5 ; x3*tg_3_16-x5 = tm35 movq mm3, [ocos_4_16] paddsw mm1, mm5 ; x3+x5*tg_3_16 = tp35 paddsw mm4, mm6 ; x1+tg_1_16*x7 = tp17 psubsw mm2, mm7 ; x1*tg_1_16-x7 = tm17 movq mm5, mm4 ; tp17 movq mm6, mm2 ; tm17 paddsw mm5, mm1 ; tp17+tp35 = b0 psubsw mm4, mm1 ; tp17-tp35 = t1 paddsw mm2, mm0 ; tm17+tm35 = t2 movq mm7, [tg_2_16] movq mm1, mm4 ; t1 psubsw mm6, mm0 ; tm17-tm35 = b3 movq [_ECX+8+3*16], mm5 ; save b0 movq [_ECX+8+5*16], mm6 ; save b3 psubsw mm4, mm2 ; t1-t2 movq mm5, [_ECX+8+2*16] movq mm0, mm7 ; tg_2_16 movq mm6, [_ECX+8+6*16] paddsw mm1, mm2 ; t1+t2 pmulhw mm0, mm5 ; x2*tg_2_16 pmulhw mm7, mm6 ; x6*tg_2_16 movq mm2, [_ECX+8+0*16] pmulhw mm4, mm3 ; ocos_4_16*(t1-t2) = b2/2 psubsw mm0, mm6 ; t2*tg_2_16-x6 = tm26 ; slot pmulhw mm1, mm3 ; ocos_4_16*(t1+t2) = b1/2 ; slot movq mm3, [_ECX+8+0*16] ; x0 movq mm6, [_ECX+8+4*16] paddsw mm7, mm5 ; x2+x6*tg_2_16 = tp26 paddsw mm2, mm6 ; x0+x4 = tp04 psubsw mm3, mm6 ; x0-x4 = tm04 movq mm5, mm2 ; tp04 movq mm6, mm3 ; tm04 psubsw mm2, mm7 ; tp04-tp26 = a3 paddsw mm3, mm0 ; tm04+tm26 = a1 paddsw mm1, mm1 ; b1 paddsw mm4, mm4 ; b2 paddsw mm5, mm7 ; tp04+tp26 = a0 psubsw mm6, mm0 ; tm04-tm26 = a2 movq mm7, mm3 ; a1 movq mm0, mm6 ; a2 paddsw mm3, mm1 ; a1+b1 
paddsw mm6, mm4 ; a2+b2 psraw mm3, SHIFT_INV_COL ; dst1 psubsw mm7, mm1 ; a1-b1 psraw mm6, SHIFT_INV_COL ; dst2 psubsw mm0, mm4 ; a2-b2 movq mm1, [_ECX+8+3*16] ; load b0 psraw mm7, SHIFT_INV_COL ; dst6 movq mm4, mm5 ; a0 psraw mm0, SHIFT_INV_COL ; dst5 movq [_ECX+8+1*16], mm3 paddsw mm5, mm1 ; a0+b0 movq [_ECX+8+2*16], mm6 psubsw mm4, mm1 ; a0-b0 movq mm3, [_ECX+8+5*16] ; load b3 psraw mm5, SHIFT_INV_COL ; dst0 movq mm6, mm2 ; a3 psraw mm4, SHIFT_INV_COL ; dst7 movq [_ECX+8+5*16], mm0 paddsw mm2, mm3 ; a3+b3 movq [_ECX+8+6*16], mm7 psubsw mm6, mm3 ; a3-b3 movq [_ECX+8+0*16], mm5 psraw mm2, SHIFT_INV_COL ; dst3 movq [_ECX+8+7*16], mm4 psraw mm6, SHIFT_INV_COL ; dst4 movq [_ECX+8+3*16], mm2 movq [_ECX+8+4*16], mm6 ret ENDFUNC NON_EXEC_STACK xvidcore/src/dct/x86_asm/fdct_sse2_skal.asm0000664000076500007650000004365411254216113021711 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - SSE2 forward discrete cosine transform - ; * ; * Copyright(C) 2003 Pascal Massimino ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; *
; * You should have received a copy of the GNU General Public License
; * along with this program; if not, write to the Free Software
; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
; *
; * $Id: fdct_sse2_skal.asm,v 1.15 2009-09-16 17:07:58 Isibaar Exp $
; *
; ***************************************************************************/

%include "nasm.inc"

;-----------------------------------------------------------------------------
;
;                          -=FDCT=-
;
; Vertical pass is an implementation of the scheme:
;  Loeffler C., Ligtenberg A., and Moschytz G.S.:
;  Practical Fast 1D DCT Algorithm with Eleven Multiplications,
;  Proc. ICASSP 1989, 988-991.
;
; Horizontal pass is a double 4x4 vector/matrix multiplication,
; (see also Intel's Application Note 922:
;  http://developer.intel.com/vtune/cbts/strmsimd/922down.htm
;  Copyright (C) 1999 Intel Corporation)
;
; Notes:
;  * tan(3pi/16) is greater than 0.5, and would use the
;    sign bit when turned into 16b fixed-point precision. So,
;    we use the trick: x*tan3 = x*(tan3-1)+x
;
;  * There's only one SSE-specific instruction (pshufw).
;
;  * There are still 1 or 2 ticks to save in fLLM_PASS, but
;    I prefer readable code to tightly scheduled code.
;
;  * The quantization stage (as well as the pre-transposition for the
;    idct on the way back) can be folded into the fTab* constants
;    (at the cost of some loss of precision).
;
;  * Some more details at: http://skal.planet-d.net/coding/dct.html
;
;-----------------------------------------------------------------------------
;
;                          -=IDCT=-
;
; A little slower than fdct, because the final stages (butterflies and
; descaling) require some unpairable shifting and packing, all on
; the same CPU unit.
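;
; Worked example for the x*tan3 = x*(tan3-1)+x trick from the FDCT notes
; (iLLM_PASS relies on it too). This is an illustrative sketch, not part
; of the original notes. It assumes the constants are Q16 fractions fed
; to pmulhw, i.e. pmulhw(x, c) = floor(x*c / 65536) with c read as a
; signed 16-bit word:
;
;   tan(3pi/16)     ~  0.66818  ->  0.66818*65536 ~  43790 = 0xab0e
;                      (above 0x7fff, so the word would read as negative)
;   tan(3pi/16) - 1 ~ -0.33182  -> -0.33182*65536 ~ -21746 = 0xab0e (signed)
;
; Both values share one bit pattern. pmulhw interprets it as -21746,
; so x has to be added back after the multiply:
;
;   pmulhw  xmm0, xmm3        ; x3*(t3-1)
;   paddsw  xmm0, xmm3        ; x3*(t3-1) + x3 = x3*t3
;
; For x = 1000: floor(1000 * -21746 / 65536) = -332, and -332 + 1000 = 668,
; which matches 1000 * 0.66818 = 668.18 to within rounding.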
; ;----------------------------------------------------------------------------- ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN tan1: times 8 dw 0x32ec ; tan( pi/16) tan2: times 8 dw 0x6a0a ; tan(2pi/16) (=sqrt(2)-1) tan3: times 8 dw 0xab0e ; tan(3pi/16)-1 sqrt2: times 8 dw 0x5a82 ; 0.5/sqrt(2) ;----------------------------------------------------------------------------- ; Inverse DCT tables ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN iTab1: dw 0x4000, 0x539f, 0x4000, 0x22a3 dw 0x4000, 0xdd5d, 0x4000, 0xac61 dw 0x4000, 0x22a3, 0xc000, 0xac61 dw 0xc000, 0x539f, 0x4000, 0xdd5d dw 0x58c5, 0x4b42, 0x4b42, 0xee58 dw 0x3249, 0xa73b, 0x11a8, 0xcdb7 dw 0x3249, 0x11a8, 0xa73b, 0xcdb7 dw 0x11a8, 0x4b42, 0x4b42, 0xa73b iTab2: dw 0x58c5, 0x73fc, 0x58c5, 0x300b dw 0x58c5, 0xcff5, 0x58c5, 0x8c04 dw 0x58c5, 0x300b, 0xa73b, 0x8c04 dw 0xa73b, 0x73fc, 0x58c5, 0xcff5 dw 0x7b21, 0x6862, 0x6862, 0xe782 dw 0x45bf, 0x84df, 0x187e, 0xba41 dw 0x45bf, 0x187e, 0x84df, 0xba41 dw 0x187e, 0x6862, 0x6862, 0x84df iTab3: dw 0x539f, 0x6d41, 0x539f, 0x2d41 dw 0x539f, 0xd2bf, 0x539f, 0x92bf dw 0x539f, 0x2d41, 0xac61, 0x92bf dw 0xac61, 0x6d41, 0x539f, 0xd2bf dw 0x73fc, 0x6254, 0x6254, 0xe8ee dw 0x41b3, 0x8c04, 0x1712, 0xbe4d dw 0x41b3, 0x1712, 0x8c04, 0xbe4d dw 0x1712, 0x6254, 0x6254, 0x8c04 iTab4: dw 0x4b42, 0x6254, 0x4b42, 0x28ba dw 0x4b42, 0xd746, 0x4b42, 0x9dac dw 0x4b42, 0x28ba, 0xb4be, 0x9dac dw 0xb4be, 0x6254, 0x4b42, 0xd746 dw 0x6862, 0x587e, 0x587e, 0xeb3d dw 0x3b21, 0x979e, 0x14c3, 0xc4df dw 0x3b21, 0x14c3, 0x979e, 0xc4df dw 0x14c3, 0x587e, 0x587e, 0x979e ALIGN SECTION_ALIGN Walken_Idct_Rounders: dd 65536, 65536, 65536, 65536 dd 3597, 3597, 3597, 3597 dd 2260, 2260, 2260, 2260 dd 1203, 1203, 1203, 1203 dd 0, 0, 0, 0 dd 120, 120, 120, 120 dd 512, 512, 512, 512 dd 512, 512, 512, 512 times 
8 dw (65536>>11) times 8 dw ( 3597>>11) times 8 dw ( 2260>>11) ; other rounders are zero... ;----------------------------------------------------------------------------- ; Forward DCT tables ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN fTab1: dw 0x4000, 0x4000, 0x58c5, 0x4b42, dw 0xdd5d, 0xac61, 0xa73b, 0xcdb7, dw 0x4000, 0x4000, 0x3249, 0x11a8, dw 0x539f, 0x22a3, 0x4b42, 0xee58, dw 0x4000, 0xc000, 0x3249, 0xa73b, dw 0x539f, 0xdd5d, 0x4b42, 0xa73b, dw 0xc000, 0x4000, 0x11a8, 0x4b42, dw 0x22a3, 0xac61, 0x11a8, 0xcdb7 fTab2: dw 0x58c5, 0x58c5, 0x7b21, 0x6862, dw 0xcff5, 0x8c04, 0x84df, 0xba41, dw 0x58c5, 0x58c5, 0x45bf, 0x187e, dw 0x73fc, 0x300b, 0x6862, 0xe782, dw 0x58c5, 0xa73b, 0x45bf, 0x84df, dw 0x73fc, 0xcff5, 0x6862, 0x84df, dw 0xa73b, 0x58c5, 0x187e, 0x6862, dw 0x300b, 0x8c04, 0x187e, 0xba41 fTab3: dw 0x539f, 0x539f, 0x73fc, 0x6254, dw 0xd2bf, 0x92bf, 0x8c04, 0xbe4d, dw 0x539f, 0x539f, 0x41b3, 0x1712, dw 0x6d41, 0x2d41, 0x6254, 0xe8ee, dw 0x539f, 0xac61, 0x41b3, 0x8c04, dw 0x6d41, 0xd2bf, 0x6254, 0x8c04, dw 0xac61, 0x539f, 0x1712, 0x6254, dw 0x2d41, 0x92bf, 0x1712, 0xbe4d fTab4: dw 0x4b42, 0x4b42, 0x6862, 0x587e, dw 0xd746, 0x9dac, 0x979e, 0xc4df, dw 0x4b42, 0x4b42, 0x3b21, 0x14c3, dw 0x6254, 0x28ba, 0x587e, 0xeb3d, dw 0x4b42, 0xb4be, 0x3b21, 0x979e, dw 0x6254, 0xd746, 0x587e, 0x979e, dw 0xb4be, 0x4b42, 0x14c3, 0x587e, dw 0x28ba, 0x9dac, 0x14c3, 0xc4df ALIGN SECTION_ALIGN Fdct_Rnd0: dw 6,8,8,8, 6,8,8,8 Fdct_Rnd1: dw 8,8,8,8, 8,8,8,8 Fdct_Rnd2: dw 10,8,8,8, 8,8,8,8 Rounder1: dw 1,1,1,1, 1,1,1,1 ;============================================================================= ; Code ;============================================================================= TEXT cglobal idct_sse2_skal cglobal fdct_sse2_skal ;----------------------------------------------------------------------------- ; Helper macro iMTX_MULT ;----------------------------------------------------------------------------- %macro iMTX_MULT 4 ; 
%1=src, %2 = Table to use, %3=rounder, %4=Shift movdqa xmm0, [_ECX+%1*16] ; xmm0 = [01234567] pshuflw xmm0, xmm0, 11011000b ; [02134567] ; these two shufflings could be pshufhw xmm0, xmm0, 11011000b ; [02134657] ; integrated in zig-zag orders pshufd xmm4, xmm0, 00000000b ; [02020202] pshufd xmm5, xmm0, 10101010b ; [46464646] pshufd xmm6, xmm0, 01010101b ; [13131313] pshufd xmm7, xmm0, 11111111b ; [57575757] pmaddwd xmm4, [%2+ 0] ; dot [M00,M01][M04,M05][M08,M09][M12,M13] pmaddwd xmm5, [%2+16] ; dot [M02,M03][M06,M07][M10,M11][M14,M15] pmaddwd xmm6, [%2+32] ; dot [M16,M17][M20,M21][M24,M25][M28,M29] pmaddwd xmm7, [%2+48] ; dot [M18,M19][M22,M23][M26,M27][M30,M31] paddd xmm4, [%3] ; Round paddd xmm6, xmm7 ; [b0|b1|b2|b3] paddd xmm4, xmm5 ; [a0|a1|a2|a3] movdqa xmm7, xmm6 paddd xmm6, xmm4 ; mm6=a+b psubd xmm4, xmm7 ; mm4=a-b psrad xmm6, %4 ; => out [0123] psrad xmm4, %4 ; => out [7654] packssdw xmm6, xmm4 ; [01237654] pshufhw xmm6, xmm6, 00011011b ; [01234567] movdqa [_ECX+%1*16], xmm6 %endmacro ;----------------------------------------------------------------------------- ; Helper macro iLLM_PASS ;----------------------------------------------------------------------------- %macro iLLM_PASS 1 ; %1: src/dst movdqa xmm0, [tan3] ; t3-1 movdqa xmm3, [%1+16*3] ; x3 movdqa xmm1, xmm0 ; t3-1 movdqa xmm5, [%1+16*5] ; x5 movdqa xmm4, [tan1] ; t1 movdqa xmm6, [%1+16*1] ; x1 movdqa xmm7, [%1+16*7] ; x7 movdqa xmm2, xmm4 ; t1 pmulhw xmm0, xmm3 ; x3*(t3-1) pmulhw xmm1, xmm5 ; x5*(t3-1) paddsw xmm0, xmm3 ; x3*t3 paddsw xmm1, xmm5 ; x5*t3 psubsw xmm0, xmm5 ; x3*t3-x5 = tm35 paddsw xmm1, xmm3 ; x3+x5*t3 = tp35 pmulhw xmm4, xmm7 ; x7*t1 pmulhw xmm2, xmm6 ; x1*t1 paddsw xmm4, xmm6 ; x1+t1*x7 = tp17 psubsw xmm2, xmm7 ; x1*t1-x7 = tm17 movdqa xmm3, [sqrt2] movdqa xmm7, xmm4 movdqa xmm6, xmm2 psubsw xmm4, xmm1 ; tp17-tp35 = t1 psubsw xmm2, xmm0 ; tm17-tm35 = b3 paddsw xmm1, xmm7 ; tp17+tp35 = b0 paddsw xmm0, xmm6 ; tm17+tm35 = t2 ; xmm1 = b0, xmm2 = b3. 
preserved movdqa xmm6, xmm4 psubsw xmm4, xmm0 ; t1-t2 paddsw xmm0, xmm6 ; t1+t2 pmulhw xmm4, xmm3 ; (t1-t2)/(2.sqrt2) pmulhw xmm0, xmm3 ; (t1+t2)/(2.sqrt2) paddsw xmm0, xmm0 ; 2.(t1+t2) = b1 paddsw xmm4, xmm4 ; 2.(t1-t2) = b2 movdqa xmm7, [tan2] ; t2 movdqa xmm3, [%1+2*16] ; x2 movdqa xmm6, [%1+6*16] ; x6 movdqa xmm5, xmm7 ; t2 pmulhw xmm7, xmm6 ; x6*t2 pmulhw xmm5, xmm3 ; x2*t2 paddsw xmm7, xmm3 ; x2+x6*t2 = tp26 psubsw xmm5, xmm6 ; x2*t2-x6 = tm26 ; use:xmm3,xmm5,xmm6,xmm7 frozen: xmm0,xmm4,xmm1,xmm2 movdqa xmm3, [%1+0*16] ; x0 movdqa xmm6, [%1+4*16] ; x4 movdqa [%1 ], xmm2 ; we spill 1 reg to perform safe butterflies movdqa xmm2, xmm3 psubsw xmm3, xmm6 ; x0-x4 = tm04 paddsw xmm6, xmm2 ; x0+x4 = tp04 movdqa xmm2, xmm6 psubsw xmm6, xmm7 paddsw xmm7, xmm2 movdqa xmm2, xmm3 psubsw xmm3, xmm5 paddsw xmm5, xmm2 movdqa xmm2, xmm5 psubsw xmm5, xmm0 paddsw xmm0, xmm2 movdqa xmm2, xmm3 psubsw xmm3, xmm4 paddsw xmm4, xmm2 movdqa xmm2, [%1] psraw xmm5, 6 ; out6 psraw xmm3, 6 ; out5 psraw xmm0, 6 ; out1 psraw xmm4, 6 ; out2 movdqa [%1+6*16], xmm5 movdqa [%1+5*16], xmm3 movdqa [%1+1*16], xmm0 movdqa [%1+2*16], xmm4 ; reminder: xmm1=b0, xmm2=b3, xmm7=a0, xmm6=a3 movdqa xmm0, xmm7 movdqa xmm4, xmm6 psubsw xmm7, xmm1 ; a0-b0 psubsw xmm6, xmm2 ; a3-b3 paddsw xmm1, xmm0 ; a0+b0 paddsw xmm2, xmm4 ; a3+b3 psraw xmm1, 6 ; out0 psraw xmm7, 6 ; out7 psraw xmm2, 6 ; out3 psraw xmm6, 6 ; out4 ; store result movdqa [%1+0*16], xmm1 movdqa [%1+3*16], xmm2 movdqa [%1+4*16], xmm6 movdqa [%1+7*16], xmm7 %endmacro ;----------------------------------------------------------------------------- ; Helper macro TEST_ROW (test a null row) ;----------------------------------------------------------------------------- %macro TEST_ROW 2 ; %1:src, %2:label x8 mov _EAX, [%1 ] mov _EDX, [%1+ 8] or _EAX, [%1+ 4] or _EDX, [%1+12] or _EAX, _EDX jz near %2 %endmacro ;----------------------------------------------------------------------------- ; Function idct (this one skips null rows) 
;----------------------------------------------------------------------------- ; IEEE1180 and Walken compatible version ALIGN SECTION_ALIGN idct_sse2_skal: PUSH_XMM6_XMM7 mov _ECX, prm1 ; Src TEST_ROW _ECX, .Row0_Round iMTX_MULT 0, iTab1, Walken_Idct_Rounders + 16*0, 11 jmp .Row1 .Row0_Round: movdqa xmm0, [Walken_Idct_Rounders + 16*8 + 8*0] movdqa [_ECX ], xmm0 .Row1: TEST_ROW _ECX+16, .Row1_Round iMTX_MULT 1, iTab2, Walken_Idct_Rounders + 16*1, 11 jmp .Row2 .Row1_Round: movdqa xmm0, [Walken_Idct_Rounders + 16*8 + 16*1] movdqa [_ECX+16 ], xmm0 .Row2: TEST_ROW _ECX+32, .Row2_Round iMTX_MULT 2, iTab3, Walken_Idct_Rounders + 16*2, 11 jmp .Row3 .Row2_Round: movdqa xmm0, [Walken_Idct_Rounders + 16*8 + 16*2] movdqa [_ECX+32 ], xmm0 .Row3: TEST_ROW _ECX+48, .Row4 iMTX_MULT 3, iTab4, Walken_Idct_Rounders + 16*3, 11 .Row4: TEST_ROW _ECX+64, .Row5 iMTX_MULT 4, iTab1, Walken_Idct_Rounders + 16*4, 11 .Row5: TEST_ROW _ECX+80, .Row6 iMTX_MULT 5, iTab4, Walken_Idct_Rounders + 16*5, 11 .Row6: TEST_ROW _ECX+96, .Row7 iMTX_MULT 6, iTab3, Walken_Idct_Rounders + 16*6, 11 .Row7: TEST_ROW _ECX+112, .End iMTX_MULT 7, iTab2, Walken_Idct_Rounders + 16*7, 11 .End: iLLM_PASS _ECX POP_XMM6_XMM7 ret ENDFUNC ;----------------------------------------------------------------------------- ; Helper macro fLLM_PASS ;----------------------------------------------------------------------------- %macro fLLM_PASS 2 ; %1: src/dst, %2:Shift movdqa xmm0, [%1+0*16] ; In0 movdqa xmm2, [%1+2*16] ; In2 movdqa xmm3, xmm0 movdqa xmm4, xmm2 movdqa xmm7, [%1+7*16] ; In7 movdqa xmm5, [%1+5*16] ; In5 psubsw xmm0, xmm7 ; t7 = In0-In7 paddsw xmm7, xmm3 ; t0 = In0+In7 psubsw xmm2, xmm5 ; t5 = In2-In5 paddsw xmm5, xmm4 ; t2 = In2+In5 movdqa xmm3, [%1+3*16] ; In3 movdqa xmm4, [%1+4*16] ; In4 movdqa xmm1, xmm3 psubsw xmm3, xmm4 ; t4 = In3-In4 paddsw xmm4, xmm1 ; t3 = In3+In4 movdqa xmm6, [%1+6*16] ; In6 movdqa xmm1, [%1+1*16] ; In1 psubsw xmm1, xmm6 ; t6 = In1-In6 paddsw xmm6, [%1+1*16] ; t1 = In1+In6 psubsw xmm7, xmm4 ; 
tm03 = t0-t3 psubsw xmm6, xmm5 ; tm12 = t1-t2 paddsw xmm4, xmm4 ; 2.t3 paddsw xmm5, xmm5 ; 2.t2 paddsw xmm4, xmm7 ; tp03 = t0+t3 paddsw xmm5, xmm6 ; tp12 = t1+t2 psllw xmm2, %2+1 ; shift t5 (shift +1 to.. psllw xmm1, %2+1 ; shift t6 ..compensate cos4/2) psllw xmm4, %2 ; shift t3 psllw xmm5, %2 ; shift t2 psllw xmm7, %2 ; shift t0 psllw xmm6, %2 ; shift t1 psllw xmm3, %2 ; shift t4 psllw xmm0, %2 ; shift t7 psubsw xmm4, xmm5 ; out4 = tp03-tp12 psubsw xmm1, xmm2 ; xmm1: t6-t5 paddsw xmm5, xmm5 paddsw xmm2, xmm2 paddsw xmm5, xmm4 ; out0 = tp03+tp12 movdqa [%1+4*16], xmm4 ; => out4 paddsw xmm2, xmm1 ; xmm2: t6+t5 movdqa [%1+0*16], xmm5 ; => out0 movdqa xmm4, [tan2] ; xmm4 <= tan2 pmulhw xmm4, xmm7 ; tm03*tan2 movdqa xmm5, [tan2] ; xmm5 <= tan2 psubsw xmm4, xmm6 ; out6 = tm03*tan2 - tm12 pmulhw xmm5, xmm6 ; tm12*tan2 paddsw xmm5, xmm7 ; out2 = tm12*tan2 + tm03 movdqa xmm6, [sqrt2] movdqa xmm7, [Rounder1] pmulhw xmm2, xmm6 ; xmm2: tp65 = (t6 + t5)*cos4 por xmm5, xmm7 ; correct out2 por xmm4, xmm7 ; correct out6 pmulhw xmm1, xmm6 ; xmm1: tm65 = (t6 - t5)*cos4 por xmm2, xmm7 ; correct tp65 movdqa [%1+2*16], xmm5 ; => out2 movdqa xmm5, xmm3 ; save t4 movdqa [%1+6*16], xmm4 ; => out6 movdqa xmm4, xmm0 ; save t7 psubsw xmm3, xmm1 ; xmm3: tm465 = t4 - tm65 psubsw xmm0, xmm2 ; xmm0: tm765 = t7 - tp65 paddsw xmm2, xmm4 ; xmm2: tp765 = t7 + tp65 paddsw xmm1, xmm5 ; xmm1: tp465 = t4 + tm65 movdqa xmm4, [tan3] ; tan3 - 1 movdqa xmm5, [tan1] ; tan1 movdqa xmm7, xmm3 ; save tm465 pmulhw xmm3, xmm4 ; tm465*(tan3-1) movdqa xmm6, xmm1 ; save tp465 pmulhw xmm1, xmm5 ; tp465*tan1 paddsw xmm3, xmm7 ; tm465*tan3 pmulhw xmm4, xmm0 ; tm765*(tan3-1) paddsw xmm4, xmm0 ; tm765*tan3 pmulhw xmm5, xmm2 ; tp765*tan1 paddsw xmm1, xmm2 ; out1 = tp765 + tp465*tan1 psubsw xmm0, xmm3 ; out3 = tm765 - tm465*tan3 paddsw xmm7, xmm4 ; out5 = tm465 + tm765*tan3 psubsw xmm5, xmm6 ; out7 =-tp465 + tp765*tan1 movdqa [%1+1*16], xmm1 ; => out1 movdqa [%1+3*16], xmm0 ; => out3 movdqa [%1+5*16], xmm7 ; => out5 
movdqa [%1+7*16], xmm5 ; => out7 %endmacro ;----------------------------------------------------------------------------- ;Helper macro fMTX_MULT ;----------------------------------------------------------------------------- %macro fMTX_MULT 3 ; %1=src, %2 = Coeffs, %3=rounders movdqa xmm0, [_ECX+%1*16+0] ; xmm0 = [0123][4567] pshufhw xmm1, xmm0, 00011011b ; xmm1 = [----][7654] pshufd xmm0, xmm0, 01000100b pshufd xmm1, xmm1, 11101110b movdqa xmm2, xmm0 paddsw xmm0, xmm1 ; xmm0 = [a0 a1 a2 a3] psubsw xmm2, xmm1 ; xmm2 = [b0 b1 b2 b3] punpckldq xmm0, xmm2 ; xmm0 = [a0 a1 b0 b1][a2 a3 b2 b3] pshufd xmm2, xmm0, 01001110b ; xmm2 = [a2 a3 b2 b3][a0 a1 b0 b1] ; [M00 M01 M16 M17] [M06 M07 M22 M23] x mm0 = [0 /1 /2'/3'] ; [M02 M03 M18 M19] [M04 M05 M20 M21] x mm2 = [0'/1'/2 /3 ] ; [M08 M09 M24 M25] [M14 M15 M30 M31] x mm0 = [4 /5 /6'/7'] ; [M10 M11 M26 M27] [M12 M13 M28 M29] x mm2 = [4'/5'/6 /7 ] movdqa xmm1, [%2+16] movdqa xmm3, [%2+32] pmaddwd xmm1, xmm2 pmaddwd xmm3, xmm0 pmaddwd xmm2, [%2+48] pmaddwd xmm0, [%2+ 0] paddd xmm0, xmm1 ; [ out0 | out1 ][ out2 | out3 ] paddd xmm2, xmm3 ; [ out4 | out5 ][ out6 | out7 ] psrad xmm0, 16 psrad xmm2, 16 packssdw xmm0, xmm2 ; [ out0 .. 
out7 ] paddsw xmm0, [%3] ; Round psraw xmm0, 4 ; => [-2048, 2047] movdqa [_ECX+%1*16+0], xmm0 %endmacro ;----------------------------------------------------------------------------- ; Function Forward DCT ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN fdct_sse2_skal: PUSH_XMM6_XMM7 mov _ECX, prm1 fLLM_PASS _ECX+0, 3 fMTX_MULT 0, fTab1, Fdct_Rnd0 fMTX_MULT 1, fTab2, Fdct_Rnd2 fMTX_MULT 2, fTab3, Fdct_Rnd1 fMTX_MULT 3, fTab4, Fdct_Rnd1 fMTX_MULT 4, fTab1, Fdct_Rnd0 fMTX_MULT 5, fTab4, Fdct_Rnd1 fMTX_MULT 6, fTab3, Fdct_Rnd1 fMTX_MULT 7, fTab2, Fdct_Rnd1 POP_XMM6_XMM7 ret ENDFUNC ; Mac-specific workaround for misaligned DCT tables ALIGN SECTION_ALIGN times 8 dw 0 NON_EXEC_STACK xvidcore/src/dct/x86_asm/fdct_mmx_ffmpeg.asm0000664000076500007650000002704411254216113022143 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - MMX and XMM forward discrete cosine transform - ; * ; * Copyright(C) 2003 Edouard Gomez ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; *
; * You should have received a copy of the GNU General Public License
; * along with this program; if not, write to the Free Software
; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
; *
; * $Id: fdct_mmx_ffmpeg.asm,v 1.10 2009-09-16 17:07:58 Isibaar Exp $
; *
; ***************************************************************************/

;/****************************************************************************
; *
; * Initial but incomplete version provided by Intel in AppNote AP-922
; *   http://developer.intel.com/vtune/cbts/strmsimd/922down.htm
; *   Copyright (C) 1999 Intel Corporation
; *
; * Completed and corrected in fdctmm32.c/fdctmm32.doc
; *   http://members.tripod.com/~liaor/
; *   Copyright (C) 2000 - Royce Shih-Wea Liao
; *
; * Coefficient reordering minimized by changing the order of the table
; * constants
; *   http://ffmpeg.sourceforge.net/
; *   Copyright (C) 2001 Fabrice Bellard.
; *
; * The version coded here is just a port of the FFMPEG version to NASM
; * syntax, so all credit goes to the previous authors for their
; * respective work on a nice, fast MMX fDCT.
; ***************************************************************************/ ;============================================================================= ; Macros and other preprocessor constants ;============================================================================= %include "nasm.inc" ;;; Define this if you want an unrolled version of the code %define UNROLLED_LOOP %define BITS_FRW_ACC 3 %define SHIFT_FRW_COL BITS_FRW_ACC %define SHIFT_FRW_ROW (BITS_FRW_ACC + 17) %define RND_FRW_ROW (1 << (SHIFT_FRW_ROW-1)) %define RND_FRW_COL (1 << (SHIFT_FRW_COL-1)) ;============================================================================= ; Local Data (Read Only) ;============================================================================= DATA ALIGN SECTION_ALIGN tab_frw_01234567: dw 16384, 16384, -8867, -21407 dw 16384, 16384, 21407, 8867 dw 16384, -16384, 21407, -8867 dw -16384, 16384, 8867, -21407 dw 22725, 19266, -22725, -12873 dw 12873, 4520, 19266, -4520 dw 12873, -22725, 19266, -22725 dw 4520, 19266, 4520, -12873 dw 22725, 22725, -12299, -29692 dw 22725, 22725, 29692, 12299 dw 22725, -22725, 29692, -12299 dw -22725, 22725, 12299, -29692 dw 31521, 26722, -31521, -17855 dw 17855, 6270, 26722, -6270 dw 17855, -31521, 26722, -31521 dw 6270, 26722, 6270, -17855 dw 21407, 21407, -11585, -27969 dw 21407, 21407, 27969, 11585 dw 21407, -21407, 27969, -11585 dw -21407, 21407, 11585, -27969 dw 29692, 25172, -29692, -16819 dw 16819, 5906, 25172, -5906 dw 16819, -29692, 25172, -29692 dw 5906, 25172, 5906, -16819 dw 19266, 19266, -10426, -25172 dw 19266, 19266, 25172, 10426 dw 19266, -19266, 25172, -10426 dw -19266, 19266, 10426, -25172 dw 26722, 22654, -26722, -15137 dw 15137, 5315, 22654, -5315 dw 15137, -26722, 22654, -26722 dw 5315, 22654, 5315, -15137 dw 16384, 16384, -8867, -21407 dw 16384, 16384, 21407, 8867 dw 16384, -16384, 21407, -8867 dw -16384, 16384, 8867, -21407 dw 22725, 19266, -22725, -12873 dw 12873, 4520, 19266, -4520 dw 12873, -22725, 19266, -22725 
dw 4520, 19266, 4520, -12873 dw 19266, 19266, -10426, -25172 dw 19266, 19266, 25172, 10426 dw 19266, -19266, 25172, -10426 dw -19266, 19266, 10426, -25172 dw 26722, 22654, -26722, -15137 dw 15137, 5315, 22654, -5315 dw 15137, -26722, 22654, -26722 dw 5315, 22654, 5315, -15137 dw 21407, 21407, -11585, -27969 dw 21407, 21407, 27969, 11585 dw 21407, -21407, 27969, -11585 dw -21407, 21407, 11585, -27969 dw 29692, 25172, -29692, -16819 dw 16819, 5906, 25172, -5906 dw 16819, -29692, 25172, -29692 dw 5906, 25172, 5906, -16819, dw 22725, 22725, -12299, -29692 dw 22725, 22725, 29692, 12299 dw 22725, -22725, 29692, -12299 dw -22725, 22725, 12299, -29692 dw 31521, 26722, -31521, -17855 dw 17855, 6270, 26722, -6270 dw 17855, -31521, 26722, -31521 dw 6270, 26722, 6270, -17855 ALIGN SECTION_ALIGN fdct_one_corr: dw 1, 1, 1, 1 ALIGN SECTION_ALIGN fdct_tg_all_16: dw 13036, 13036, 13036, 13036 dw 27146, 27146, 27146, 27146 dw -21746, -21746, -21746, -21746 ALIGN SECTION_ALIGN cos_4_16: dw -19195, -19195, -19195, -19195 ALIGN SECTION_ALIGN ocos_4_16: dw 23170, 23170, 23170, 23170 ALIGN SECTION_ALIGN fdct_r_row: dd RND_FRW_ROW, RND_FRW_ROW ;============================================================================= ; Factorized parts of the code turned into macros for better understanding ;============================================================================= ;; Macro for column DCT ;; FDCT_COLUMN_MMX(int16_t *out, const int16_t *in, int offset); ;; - out, register name holding the out address ;; - in, register name holding the in address ;; - column number to process %macro FDCT_COLUMN_COMMON 3 movq mm0, [%2 + %3*2 + 1*16] movq mm1, [%2 + %3*2 + 6*16] movq mm2, mm0 movq mm3, [%2 + %3*2 + 2*16] paddsw mm0, mm1 movq mm4, [%2 + %3*2 + 5*16] psllw mm0, SHIFT_FRW_COL movq mm5, [%2 + %3*2 + 0*16] paddsw mm4, mm3 paddsw mm5, [%2 + %3*2 + 7*16] psllw mm4, SHIFT_FRW_COL movq mm6, mm0 psubsw mm2, mm1 movq mm1, [fdct_tg_all_16 + 4*2] psubsw mm0, mm4 movq mm7, [%2 + %3*2 + 3*16] pmulhw 
mm1, mm0 paddsw mm7, [%2 + %3*2 + 4*16] psllw mm5, SHIFT_FRW_COL paddsw mm6, mm4 psllw mm7, SHIFT_FRW_COL movq mm4, mm5 psubsw mm5, mm7 paddsw mm1, mm5 paddsw mm4, mm7 por mm1, [fdct_one_corr] psllw mm2, SHIFT_FRW_COL + 1 pmulhw mm5, [fdct_tg_all_16 + 4*2] movq mm7, mm4 psubsw mm3, [%2 + %3*2 + 5*16] psubsw mm4, mm6 movq [%1 + %3*2 + 2*16], mm1 paddsw mm7, mm6 movq mm1, [%2 + %3*2 + 3*16] psllw mm3, SHIFT_FRW_COL + 1 psubsw mm1, [%2 + %3*2 + 4*16] movq mm6, mm2 movq [%1 + %3*2 + 4*16], mm4 paddsw mm2, mm3 pmulhw mm2, [ocos_4_16] psubsw mm6, mm3 pmulhw mm6, [ocos_4_16] psubsw mm5, mm0 por mm5, [fdct_one_corr] psllw mm1, SHIFT_FRW_COL por mm2, [fdct_one_corr] movq mm4, mm1 movq mm3, [%2 + %3*2 + 0*16] paddsw mm1, mm6 psubsw mm3, [%2 + %3*2 + 7*16] psubsw mm4, mm6 movq mm0, [fdct_tg_all_16 + 0*2] psllw mm3, SHIFT_FRW_COL movq mm6, [fdct_tg_all_16 + 8*2] pmulhw mm0, mm1 movq [%1 + %3*2 + 0*16], mm7 pmulhw mm6, mm4 movq [%1 + %3*2 + 6*16], mm5 movq mm7, mm3 movq mm5, [fdct_tg_all_16 + 8*2] psubsw mm7, mm2 paddsw mm3, mm2 pmulhw mm5, mm7 paddsw mm0, mm3 paddsw mm6, mm4 pmulhw mm3, [fdct_tg_all_16 + 0*2] por mm0, [fdct_one_corr] paddsw mm5, mm7 psubsw mm7, mm6 movq [%1 + %3*2 + 1*16], mm0 paddsw mm5, mm4 movq [%1 + %3*2 + 3*16], mm7 psubsw mm3, mm1 movq [%1 + %3*2 + 5*16], mm5 movq [%1 + %3*2 + 7*16], mm3 %endmacro ;; Macro for row DCT using MMX punpcklw instructions ;; FDCT_ROW_MMX(int16_t *out, const int16_t *in, const int16_t *table); ;; - out, register name holding the out address ;; - in, register name holding the in address ;; - table coefficients address (register or absolute) %macro FDCT_ROW_MMX 3 movd mm1, [%2 + 6*2] punpcklwd mm1, [%2 + 4*2] movq mm2, mm1 psrlq mm1, 0x20 movq mm0, [%2 + 0*2] punpcklwd mm1, mm2 movq mm5, mm0 paddsw mm0, mm1 psubsw mm5, mm1 movq mm1, mm0 movq mm6, mm5 punpckldq mm3, mm5 punpckhdq mm6, mm3 movq mm3, [%3 + 0*2] movq mm4, [%3 + 4*2] punpckldq mm2, mm0 pmaddwd mm3, mm0 punpckhdq mm1, mm2 movq mm2, [%3 + 16*2] pmaddwd mm4, mm1 pmaddwd 
mm0, [%3 + 8*2]
  movq mm7, [%3 + 20*2]
  pmaddwd mm2, mm5
  paddd mm3, [fdct_r_row]
  pmaddwd mm7, mm6
  pmaddwd mm1, [%3 + 12*2]
  paddd mm3, mm4
  pmaddwd mm5, [%3 + 24*2]
  pmaddwd mm6, [%3 + 28*2]
  paddd mm2, mm7
  paddd mm0, [fdct_r_row]
  psrad mm3, SHIFT_FRW_ROW
  paddd mm2, [fdct_r_row]
  paddd mm0, mm1
  paddd mm5, [fdct_r_row]
  psrad mm2, SHIFT_FRW_ROW
  paddd mm5, mm6
  psrad mm0, SHIFT_FRW_ROW
  psrad mm5, SHIFT_FRW_ROW
  packssdw mm3, mm0
  packssdw mm2, mm5
  movq mm6, mm3
  punpcklwd mm3, mm2
  punpckhwd mm6, mm2
  movq [%1 + 0*2], mm3
  movq [%1 + 4*2], mm6
%endmacro

;; Macro for row DCT using the XMM instruction pshufw
;; FDCT_ROW_XMM(int16_t *out, const int16_t *in, const int16_t *table);
;;  - out, register name holding the out address
;;  - in, register name holding the in address
;;  - table coefficient address
%macro FDCT_ROW_XMM 3
  ;; fdct_row_mmx2(const int16_t *in, int16_t *out, const int16_t *table)
  pshufw mm5, [%2 + 4*2], 0x1B
  movq mm0, [%2 + 0*2]
  movq mm1, mm0
  paddsw mm0, mm5
  psubsw mm1, mm5
  pshufw mm2, mm0, 0x4E
  pshufw mm3, mm1, 0x4E
  movq mm4, [%3 + 0*2]
  movq mm6, [%3 + 4*2]
  movq mm5, [%3 + 16*2]
  movq mm7, [%3 + 20*2]
  pmaddwd mm4, mm0
  pmaddwd mm5, mm1
  pmaddwd mm6, mm2
  pmaddwd mm7, mm3
  pmaddwd mm0, [%3 + 8*2]
  pmaddwd mm2, [%3 + 12*2]
  pmaddwd mm1, [%3 + 24*2]
  pmaddwd mm3, [%3 + 28*2]
  paddd mm4, mm6
  paddd mm5, mm7
  paddd mm0, mm2
  paddd mm1, mm3
  movq mm7, [fdct_r_row]
  paddd mm4, mm7
  paddd mm5, mm7
  paddd mm0, mm7
  paddd mm1, mm7
  psrad mm4, SHIFT_FRW_ROW
  psrad mm5, SHIFT_FRW_ROW
  psrad mm0, SHIFT_FRW_ROW
  psrad mm1, SHIFT_FRW_ROW
  packssdw mm4, mm0
  packssdw mm5, mm1
  movq mm2, mm4
  punpcklwd mm4, mm5
  punpckhwd mm2, mm5
  movq [%1 + 0*2], mm4
  movq [%1 + 4*2], mm2
%endmacro

%macro MAKE_FDCT_FUNC 2
ALIGN SECTION_ALIGN
cglobal %1
%1:
  ;; Move the destination/source address to the eax register
  mov _EAX, prm1

  ;; Process the columns (4 at a time)
  FDCT_COLUMN_COMMON _EAX, _EAX, 0 ; columns 0..3
  FDCT_COLUMN_COMMON _EAX, _EAX, 4 ; columns 4..7

%ifdef UNROLLED_LOOP
  ; Unrolled loop version
%assign i 0
%rep 8
  ;;
Process the 'i'th row %2 _EAX+2*i*8, _EAX+2*i*8, tab_frw_01234567+2*32*i %assign i i+1 %endrep %else mov _ECX, 8 mov _EDX, tab_frw_01234567 ALIGN SECTION_ALIGN .loop %2 _EAX, _EAX,_EDX add _EAX, 2*8 add _EDX, 2*32 dec _ECX jne .loop %endif ret ENDFUNC %endmacro ;============================================================================= ; Code ;============================================================================= TEXT ;----------------------------------------------------------------------------- ; void fdct_mmx_ffmpeg(int16_t block[64]); ;----------------------------------------------------------------------------- MAKE_FDCT_FUNC fdct_mmx_ffmpeg, FDCT_ROW_MMX ;----------------------------------------------------------------------------- ; void fdct_xmm_ffmpeg(int16_t block[64]); ;----------------------------------------------------------------------------- MAKE_FDCT_FUNC fdct_xmm_ffmpeg, FDCT_ROW_XMM NON_EXEC_STACK xvidcore/src/dct/x86_asm/idct_sse2_dmitry.asm0000664000076500007650000002604011254216113022260 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - SSE2 inverse discrete cosine transform - ; * ; * Copyright(C) 2002 Dmitry Rozhdestvensky ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: idct_sse2_dmitry.asm,v 1.11 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ ;============================================================================= ; Macros and other preprocessor constants ;============================================================================= %include "nasm.inc" %define BITS_INV_ACC 5 ; 4 or 5 for IEEE %define SHIFT_INV_ROW 16 - BITS_INV_ACC %define SHIFT_INV_COL 1 + BITS_INV_ACC %define RND_INV_ROW 1024 * (6 - BITS_INV_ACC) ; 1 << (SHIFT_INV_ROW-1) %define RND_INV_COL 16 * (BITS_INV_ACC - 3) ; 1 << (SHIFT_INV_COL-1) %define RND_INV_CORR RND_INV_COL - 1 ; correction -1.0 and round %define BITS_FRW_ACC 3 ; 2 or 3 for accuracy %define SHIFT_FRW_COL BITS_FRW_ACC %define SHIFT_FRW_ROW BITS_FRW_ACC + 17 %define RND_FRW_ROW 262144 * (BITS_FRW_ACC - 1) ; 1 << (SHIFT_FRW_ROW-1) ;============================================================================= ; Local Data (Read Only) ;============================================================================= DATA ALIGN SECTION_ALIGN tab_i_04: dw 16384, 21407, 16384, 8867 ; movq-> w05 w04 w01 w00 dw 16384, -8867, 16384, -21407 ; w13 w12 w09 w08 dw 16384, 8867, -16384, -21407 ; w07 w06 w03 w02 dw -16384, 21407, 16384, -8867 ; w15 w14 w11 w10 dw 22725, 19266, 19266, -4520 ; w21 w20 w17 w16 dw 12873, -22725, 4520, -12873 ; w29 w28 w25 w24 dw 12873, 4520, -22725, -12873 ; w23 w22 w19 w18 dw 4520, 19266, 19266, -22725 ; w31 w30 w27 w26 ; Table for rows 1,7 - constants are multiplied by cos_1_16 tab_i_17: dw 22725, 29692, 22725, 12299 ; movq-> w05 w04 w01 w00 dw 22725, -12299, 22725, -29692 ; w13 w12 w09 w08 dw 22725, 12299, -22725, -29692 ; w07 w06 w03 w02 dw -22725, 29692, 22725, -12299 ; w15 w14 w11 w10 
dw 31521, 26722, 26722, -6270 ; w21 w20 w17 w16 dw 17855, -31521, 6270, -17855 ; w29 w28 w25 w24 dw 17855, 6270, -31521, -17855 ; w23 w22 w19 w18 dw 6270, 26722, 26722, -31521 ; w31 w30 w27 w26 ; Table for rows 2,6 - constants are multiplied by cos_2_16 tab_i_26: dw 21407, 27969, 21407, 11585 ; movq-> w05 w04 w01 w00 dw 21407, -11585, 21407, -27969 ; w13 w12 w09 w08 dw 21407, 11585, -21407, -27969 ; w07 w06 w03 w02 dw -21407, 27969, 21407, -11585 ; w15 w14 w11 w10 dw 29692, 25172, 25172, -5906 ; w21 w20 w17 w16 dw 16819, -29692, 5906, -16819 ; w29 w28 w25 w24 dw 16819, 5906, -29692, -16819 ; w23 w22 w19 w18 dw 5906, 25172, 25172, -29692 ; w31 w30 w27 w26 ; Table for rows 3,5 - constants are multiplied by cos_3_16 tab_i_35: dw 19266, 25172, 19266, 10426 ; movq-> w05 w04 w01 w00 dw 19266, -10426, 19266, -25172 ; w13 w12 w09 w08 dw 19266, 10426, -19266, -25172 ; w07 w06 w03 w02 dw -19266, 25172, 19266, -10426 ; w15 w14 w11 w10 dw 26722, 22654, 22654, -5315 ; w21 w20 w17 w16 dw 15137, -26722, 5315, -15137 ; w29 w28 w25 w24 dw 15137, 5315, -26722, -15137 ; w23 w22 w19 w18 dw 5315, 22654, 22654, -26722 ; w31 w30 w27 w26 %if SHIFT_INV_ROW == 12 ; assume SHIFT_INV_ROW == 12 rounder_2_0: dd 65536, 65536 dd 65536, 65536 rounder_2_4: dd 0, 0 dd 0, 0 rounder_2_1: dd 7195, 7195 dd 7195, 7195 rounder_2_7: dd 1024, 1024 dd 1024, 1024 rounder_2_2: dd 4520, 4520 dd 4520, 4520 rounder_2_6: dd 1024, 1024 dd 1024, 1024 rounder_2_3: dd 2407, 2407 dd 2407, 2407 rounder_2_5: dd 240, 240 dd 240, 240 %elif SHIFT_INV_ROW == 11 ; assume SHIFT_INV_ROW == 11 rounder_2_0: dd 65536, 65536 dd 65536, 65536 rounder_2_4: dd 0, 0 dd 0, 0 rounder_2_1: dd 3597, 3597 dd 3597, 3597 rounder_2_7: dd 512, 512 dd 512, 512 rounder_2_2: dd 2260, 2260 dd 2260, 2260 rounder_2_6: dd 512, 512 dd 512, 512 rounder_2_3: dd 1203, 1203 dd 1203, 1203 rounder_2_5: dd 120, 120 dd 120, 120 %else %error Invalid SHIFT_INV_ROW specified %endif tg_1_16: dw 13036, 13036, 13036, 13036 ; tg * (2<<16) + 0.5 dw 13036, 13036, 13036, 
13036

tg_2_16:
  dw 27146, 27146, 27146, 27146 ; tg * (1<<16) + 0.5
  dw 27146, 27146, 27146, 27146

tg_3_16:
  dw -21746, -21746, -21746, -21746 ; tg * (1<<16) + 0.5, minus 65536 to fit a signed word
  dw -21746, -21746, -21746, -21746

ocos_4_16:
  dw 23170, 23170, 23170, 23170 ; cos * (1<<15) + 0.5
  dw 23170, 23170, 23170, 23170

;=============================================================================
; Code
;=============================================================================

TEXT

cglobal idct_sse2_dmitry

;-----------------------------------------------------------------------------
; Helper macro - ROW iDCT
;-----------------------------------------------------------------------------

%macro DCT_8_INV_ROW_1_SSE2 4
  pshufhw xmm1, [%1], 11011000b ;x 75643210
  pshuflw xmm1, xmm1, 11011000b ;x 75643120
  pshufd xmm0, xmm1, 00000000b ;x 20202020
  pmaddwd xmm0, [%3] ;w 13 12 9 8 5410
  ;a 3210 first part
  pshufd xmm2, xmm1, 10101010b ;x 64646464
  pmaddwd xmm2, [%3+16] ;w 15 14 11 10 7632
  ;a 3210 second part
  paddd xmm2, xmm0 ;a 3210 ready
  paddd xmm2, [%4] ;must be 4 dwords long, not 2 as for sse1
  movdqa xmm5, xmm2

  pshufd xmm3, xmm1, 01010101b ;x 31313131
  pmaddwd xmm3, [%3+32] ;w 29 28 25 24 21 20 17 16
  ;b 3210 first part
  pshufd xmm4, xmm1, 11111111b ;x 75757575
  pmaddwd xmm4, [%3+48] ;w 31 30 27 26 23 22 19 18
  ;b 3210 second part
  paddd xmm3,xmm4 ;b 3210 ready

  paddd xmm2, xmm3 ;will be y 3210
  psubd xmm5, xmm3 ;will be y 4567
  psrad xmm2, SHIFT_INV_ROW
  psrad xmm5, SHIFT_INV_ROW
  packssdw xmm2, xmm5 ;y 45673210
  pshufhw xmm6, xmm2, 00011011b ;y 76543210
  movdqa [%2], xmm6
%endmacro

;-----------------------------------------------------------------------------
; Helper macro - Columns iDCT
;-----------------------------------------------------------------------------

%macro DCT_8_INV_COL_4_SSE2 2
  movdqa xmm0, [%1+16*0] ;x0 (all columns)
  movdqa xmm2, [%1+16*4] ;x4
  movdqa xmm1, xmm0

  movdqa xmm4, [%1+16*2] ;x2
  movdqa xmm5, [%1+16*6] ;x6
  movdqa xmm6, [tg_2_16]
  movdqa xmm7, xmm6

  paddsw xmm0, xmm2 ;u04=x0+x4
  psubsw xmm1, 
xmm2 ;v04=x0-x4 movdqa xmm3, xmm0 movdqa xmm2, xmm1 pmulhw xmm6, xmm4 pmulhw xmm7, xmm5 psubsw xmm6, xmm5 ;v26=x2*T2-x6 paddsw xmm7, xmm4 ;u26=x6*T2+x2 paddsw xmm1, xmm6 ;a1=v04+v26 paddsw xmm0, xmm7 ;a0=u04+u26 psubsw xmm2, xmm6 ;a2=v04-v26 psubsw xmm3, xmm7 ;a3=u04-u26 movdqa [%2+16*0], xmm0 ;store a3-a0 to movdqa [%2+16*6], xmm1 ;free registers movdqa [%2+16*2], xmm2 movdqa [%2+16*4], xmm3 movdqa xmm0, [%1+16*1] ;x1 movdqa xmm1, [%1+16*7] ;x7 movdqa xmm2, [tg_1_16] movdqa xmm3, xmm2 movdqa xmm4, [%1+16*3] ;x3 movdqa xmm5, [%1+16*5] ;x5 movdqa xmm6, [tg_3_16] movdqa xmm7, xmm6 pmulhw xmm2, xmm0 pmulhw xmm3, xmm1 psubsw xmm2, xmm1 ;v17=x1*T1-x7 paddsw xmm3, xmm0 ;u17=x7*T1+x1 movdqa xmm0, xmm3 ;u17 movdqa xmm1, xmm2 ;v17 pmulhw xmm6, xmm4 ;x3*(t3-1) pmulhw xmm7, xmm5 ;x5*(t3-1) paddsw xmm6, xmm4 paddsw xmm7, xmm5 psubsw xmm6, xmm5 ;v35=x3*T3-x5 paddsw xmm7, xmm4 ;u35=x5*T3+x3 movdqa xmm4, [ocos_4_16] paddsw xmm0, xmm7 ;b0=u17+u35 psubsw xmm1, xmm6 ;b3=v17-v35 psubsw xmm3, xmm7 ;u12=u17-v35 paddsw xmm2, xmm6 ;v12=v17+v35 movdqa xmm5, xmm3 paddsw xmm3, xmm2 ;tb1 psubsw xmm5, xmm2 ;tb2 pmulhw xmm5, xmm4 pmulhw xmm4, xmm3 paddsw xmm5, xmm5 paddsw xmm4, xmm4 movdqa xmm6, [%2+16*0] ;a0 movdqa xmm7, xmm6 movdqa xmm2, [%2+16*4] ;a3 movdqa xmm3, xmm2 paddsw xmm6, xmm0 psubsw xmm7, xmm0 psraw xmm6, SHIFT_INV_COL ;y0=a0+b0 psraw xmm7, SHIFT_INV_COL ;y7=a0-b0 movdqa [%2+16*0], xmm6 movdqa [%2+16*7], xmm7 paddsw xmm2, xmm1 psubsw xmm3, xmm1 psraw xmm2, SHIFT_INV_COL ;y3=a3+b3 psraw xmm3, SHIFT_INV_COL ;y4=a3-b3 movdqa [%2+16*3], xmm2 movdqa [%2+16*4], xmm3 movdqa xmm0, [%2+16*6] ;a1 movdqa xmm1, xmm0 movdqa xmm6, [%2+16*2] ;a2 movdqa xmm7, xmm6 paddsw xmm0, xmm4 psubsw xmm1, xmm4 psraw xmm0, SHIFT_INV_COL ;y1=a1+b1 psraw xmm1, SHIFT_INV_COL ;y6=a1-b1 movdqa [%2+16*1], xmm0 movdqa [%2+16*6], xmm1 paddsw xmm6, xmm5 psubsw xmm7, xmm5 psraw xmm6, SHIFT_INV_COL ;y2=a2+b2 psraw xmm7, SHIFT_INV_COL ;y5=a2-b2 movdqa [%2+16*2], xmm6 movdqa [%2+16*5], xmm7 %endmacro 
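; Illustration only, not part of the Xvid sources: a C sketch of the
; fixed-point multiply that the column macro above relies on. pmulhw keeps
; the high 16 bits of each signed 16x16-bit product, so a constant stored as
; c * 65536 scales its operand by c. The tg_3_16 entry is the interesting
; case: tan(3*pi/16) * 65536 is about 43790, which does not fit in a signed
; word, so the table holds 43790 - 65536 = -21746 and the macro adds the
; operand back afterwards (the ";x3*(t3-1)" pmulhw followed by paddsw).
; The helper names mulhi16, mul_tg1 and mul_tg3 are made up for this sketch,
; and the saturation performed by paddsw is ignored.

```c
#include <stdint.h>

/* Values copied from the tg_1_16 and tg_3_16 tables. */
#define TG_1_16   13036    /* tan(1*pi/16) * 65536, rounded             */
#define TG_3_16 (-21746)   /* tan(3*pi/16) * 65536 - 65536 (fits int16) */

/*
 * One lane of pmulhw: the high 16 bits of the signed 16x16-bit product.
 * Assumption: >> on a negative int is an arithmetic shift. That is
 * implementation-defined in C, but it is what GCC, Clang and MSVC do.
 */
static int16_t mulhi16(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 16);
}

/* x * tan(pi/16): the constant fits in a signed word, one multiply. */
static int16_t mul_tg1(int16_t x)
{
    return mulhi16(x, TG_1_16);
}

/*
 * x * tan(3*pi/16): multiply by (tan - 1), then add x back.
 * Hypothetical helper for small inputs; no saturation is modelled.
 */
static int16_t mul_tg3(int16_t x)
{
    return (int16_t)(mulhi16(x, TG_3_16) + x);
}
```

; For x = 1000 the two helpers return 198 and 668, against the exact
; products 198.9 and 668.2; the truncation toward minus infinity is the
; precision loss inherent in pmulhw.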
;-----------------------------------------------------------------------------
; void idct_sse2_dmitry(int16_t coeff[64]);
;-----------------------------------------------------------------------------

ALIGN SECTION_ALIGN
idct_sse2_dmitry:

  PUSH_XMM6_XMM7

  mov _ECX, prm1

  DCT_8_INV_ROW_1_SSE2 _ECX+ 0, _ECX+ 0, tab_i_04, rounder_2_0
  DCT_8_INV_ROW_1_SSE2 _ECX+ 16, _ECX+ 16, tab_i_17, rounder_2_1
  DCT_8_INV_ROW_1_SSE2 _ECX+ 32, _ECX+ 32, tab_i_26, rounder_2_2
  DCT_8_INV_ROW_1_SSE2 _ECX+ 48, _ECX+ 48, tab_i_35, rounder_2_3
  DCT_8_INV_ROW_1_SSE2 _ECX+ 64, _ECX+ 64, tab_i_04, rounder_2_4
  DCT_8_INV_ROW_1_SSE2 _ECX+ 80, _ECX+ 80, tab_i_35, rounder_2_5
  DCT_8_INV_ROW_1_SSE2 _ECX+ 96, _ECX+ 96, tab_i_26, rounder_2_6
  DCT_8_INV_ROW_1_SSE2 _ECX+112, _ECX+112, tab_i_17, rounder_2_7

  DCT_8_INV_COL_4_SSE2 _ECX, _ECX

  POP_XMM6_XMM7
  ret
ENDFUNC

NON_EXEC_STACK

xvidcore/src/dct/simple_idct.c

/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - Inverse DCT (More precise version) -
 *
 * Copyright (c) 2001 Michael Niedermayer
 *
 * Originally distributed under the GNU LGPL License (ffmpeg).
 * It is licensed under the GNU GPL for the Xvid tree.
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: simple_idct.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

/*
  based upon some commented-out C code from mpeg2dec (idct_mmx.c
  written by Aaron Holtzman)
 */

#include "../portab.h"
#include "idct.h"

#if 0
#define W1 2841 /* 2048*sqrt (2)*cos (1*pi/16) */
#define W2 2676 /* 2048*sqrt (2)*cos (2*pi/16) */
#define W3 2408 /* 2048*sqrt (2)*cos (3*pi/16) */
#define W4 2048 /* 2048*sqrt (2)*cos (4*pi/16) */
#define W5 1609 /* 2048*sqrt (2)*cos (5*pi/16) */
#define W6 1108 /* 2048*sqrt (2)*cos (6*pi/16) */
#define W7 565 /* 2048*sqrt (2)*cos (7*pi/16) */
#define ROW_SHIFT 8
#define COL_SHIFT 17
#else
#define W1 22725 /* cos(i*M_PI/16)*sqrt(2)*(1<<14) + 0.5 */
#define W2 21407 /* cos(i*M_PI/16)*sqrt(2)*(1<<14) + 0.5 */
#define W3 19266 /* cos(i*M_PI/16)*sqrt(2)*(1<<14) + 0.5 */
#define W4 16383 /* cos(i*M_PI/16)*sqrt(2)*(1<<14) + 0.5 */
#define W5 12873 /* cos(i*M_PI/16)*sqrt(2)*(1<<14) + 0.5 */
#define W6 8867 /* cos(i*M_PI/16)*sqrt(2)*(1<<14) + 0.5 */
#define W7 4520 /* cos(i*M_PI/16)*sqrt(2)*(1<<14) + 0.5 */
#define ROW_SHIFT 11
#define COL_SHIFT 20 /* 6 */
#endif

/* PPC mac operation.
Causes compile problems on newer ppc targets Was originally: #if defined(ARCH_IS_PPC) */ #if 0 /* signed 16x16 -> 32 multiply add accumulate */ #define MAC16(rt, ra, rb) \ asm ("maclhw %0, %2, %3" : "=r" (rt) : "0" (rt), "r" (ra), "r" (rb)); /* signed 16x16 -> 32 multiply */ #define MUL16(rt, ra, rb) \ asm ("mullhw %0, %1, %2" : "=r" (rt) : "r" (ra), "r" (rb)); #else /* signed 16x16 -> 32 multiply add accumulate */ #define MAC16(rt, ra, rb) rt += (ra) * (rb) /* signed 16x16 -> 32 multiply */ #define MUL16(rt, ra, rb) rt = (ra) * (rb) #endif static __inline void idctRowCondDC (int16_t * const row) { int a0, a1, a2, a3, b0, b1, b2, b3; #ifdef FAST_64BIT uint64_t temp; #else uint32_t temp; #endif #ifdef FAST_64BIT #ifdef ARCH_IS_BIG_ENDIAN #define ROW0_MASK 0xffff000000000000LL #else #define ROW0_MASK 0xffffLL #endif if ( ((((uint64_t *)row)[0] & ~ROW0_MASK) | ((uint64_t *)row)[1]) == 0) { temp = (row[0] << 3) & 0xffff; temp += temp << 16; temp += temp << 32; ((uint64_t *)row)[0] = temp; ((uint64_t *)row)[1] = temp; return; } #else if (!(((uint32_t*)row)[1] | ((uint32_t*)row)[2] | ((uint32_t*)row)[3] | row[1])) { temp = (row[0] << 3) & 0xffff; temp += temp << 16; ((uint32_t*)row)[0]=((uint32_t*)row)[1] = ((uint32_t*)row)[2]=((uint32_t*)row)[3] = temp; return; } #endif a0 = (W4 * row[0]) + (1 << (ROW_SHIFT - 1)); a1 = a0; a2 = a0; a3 = a0; /* no need to optimize : gcc does it */ a0 += W2 * row[2]; a1 += W6 * row[2]; a2 -= W6 * row[2]; a3 -= W2 * row[2]; MUL16(b0, W1, row[1]); MAC16(b0, W3, row[3]); MUL16(b1, W3, row[1]); MAC16(b1, -W7, row[3]); MUL16(b2, W5, row[1]); MAC16(b2, -W1, row[3]); MUL16(b3, W7, row[1]); MAC16(b3, -W5, row[3]); #ifdef FAST_64BIT temp = ((uint64_t*)row)[1]; #else temp = ((uint32_t*)row)[2] | ((uint32_t*)row)[3]; #endif if (temp != 0) { a0 += W4*row[4] + W6*row[6]; a1 += - W4*row[4] - W2*row[6]; a2 += - W4*row[4] + W2*row[6]; a3 += W4*row[4] - W6*row[6]; MAC16(b0, W5, row[5]); MAC16(b0, W7, row[7]); MAC16(b1, -W1, row[5]); MAC16(b1, -W5, 
row[7]); MAC16(b2, W7, row[5]); MAC16(b2, W3, row[7]); MAC16(b3, W3, row[5]); MAC16(b3, -W1, row[7]); } row[0] = (a0 + b0) >> ROW_SHIFT; row[7] = (a0 - b0) >> ROW_SHIFT; row[1] = (a1 + b1) >> ROW_SHIFT; row[6] = (a1 - b1) >> ROW_SHIFT; row[2] = (a2 + b2) >> ROW_SHIFT; row[5] = (a2 - b2) >> ROW_SHIFT; row[3] = (a3 + b3) >> ROW_SHIFT; row[4] = (a3 - b3) >> ROW_SHIFT; } static __inline void idctSparseCol (int16_t * const col) { int a0, a1, a2, a3, b0, b1, b2, b3; /* XXX: I did that only to give same values as previous code */ a0 = W4 * (col[8*0] + ((1<<(COL_SHIFT-1))/W4)); a1 = a0; a2 = a0; a3 = a0; a0 += + W2*col[8*2]; a1 += + W6*col[8*2]; a2 += - W6*col[8*2]; a3 += - W2*col[8*2]; MUL16(b0, W1, col[8*1]); MUL16(b1, W3, col[8*1]); MUL16(b2, W5, col[8*1]); MUL16(b3, W7, col[8*1]); MAC16(b0, + W3, col[8*3]); MAC16(b1, - W7, col[8*3]); MAC16(b2, - W1, col[8*3]); MAC16(b3, - W5, col[8*3]); if(col[8*4]){ a0 += + W4*col[8*4]; a1 += - W4*col[8*4]; a2 += - W4*col[8*4]; a3 += + W4*col[8*4]; } if (col[8*5]) { MAC16(b0, + W5, col[8*5]); MAC16(b1, - W1, col[8*5]); MAC16(b2, + W7, col[8*5]); MAC16(b3, + W3, col[8*5]); } if(col[8*6]){ a0 += + W6*col[8*6]; a1 += - W2*col[8*6]; a2 += + W2*col[8*6]; a3 += - W6*col[8*6]; } if (col[8*7]) { MAC16(b0, + W7, col[8*7]); MAC16(b1, - W5, col[8*7]); MAC16(b2, + W3, col[8*7]); MAC16(b3, - W1, col[8*7]); } col[0 ] = ((a0 + b0) >> COL_SHIFT); col[8 ] = ((a1 + b1) >> COL_SHIFT); col[16] = ((a2 + b2) >> COL_SHIFT); col[24] = ((a3 + b3) >> COL_SHIFT); col[32] = ((a3 - b3) >> COL_SHIFT); col[40] = ((a2 - b2) >> COL_SHIFT); col[48] = ((a1 - b1) >> COL_SHIFT); col[56] = ((a0 - b0) >> COL_SHIFT); } void simple_idct_c(int16_t * const block) { int i; for(i=0; i<8; i++) idctRowCondDC(block + i*8); for(i=0; i<8; i++) idctSparseCol(block + i); } xvidcore/src/dct/fdct.c0000664000076500007650000002117011564706134016120 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Forward 
DCT - * * Copyright (C) 2006-2011 Xvid Solutions GmbH * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: fdct.c 1986 2011-05-18 09:07:40Z Isibaar $ * ****************************************************************************/ /* * Authors: Skal * * "Fast and precise" LLM implementation of FDCT/IDCT, where * rotations are decomposed using: * tmp = (x+y).cos t * x' = tmp + y.(sin t - cos t) * y' = tmp - x.(sin t + cos t) * * See details at the end of this file... * * Reference (e.g.): * Loeffler C., Ligtenberg A., and Moschytz C.S.: * Practical Fast 1D DCT Algorithm with Eleven Multiplications, * Proc. ICASSP 1989, 988-991. 
* * IEEE-1180-like error specs for FDCT: * Peak error: 1.0000 * Peak MSE: 0.0340 * Overall MSE: 0.0200 * Peak ME: 0.0191 * Overall ME: -0.0033 * ********************************************************/ #include "fdct.h" /* function pointer */ fdctFuncPtr fdct; /* ////////////////////////////////////////////////////////// */ #define BUTF(a, b, tmp) \ (tmp) = (a)+(b); \ (b) = (a)-(b); \ (a) = (tmp) #define LOAD_BUTF(m1, m2, a, b, tmp, S) \ (m1) = (S)[(a)] + (S)[(b)]; \ (m2) = (S)[(a)] - (S)[(b)] #define ROTATE(m1,m2,c,k1,k2,tmp,Fix,Rnd) \ (tmp) = ( (m1) + (m2) )*(c); \ (m1) *= k1; \ (m2) *= k2; \ (tmp) += (Rnd); \ (m1) = ((m1)+(tmp))>>(Fix); \ (m2) = ((m2)+(tmp))>>(Fix); #define ROTATE2(m1,m2,c,k1,k2,tmp) \ (tmp) = ( (m1) + (m2) )*(c); \ (m1) *= k1; \ (m2) *= k2; \ (m1) = (m1)+(tmp); \ (m2) = (m2)+(tmp); #define ROTATE0(m1,m2,c,k1,k2,tmp) \ (m1) = ( (m2) )*(c); \ (m2) = (m2)*k2+(m1); #define SHIFTL(x,n) ((x)<<(n)) #define SHIFTR(x, n) ((x)>>(n)) #define HALF(n) (1<<((n)-1)) #define IPASS 3 #define FPASS 2 #define FIX 16 #if 1 #define ROT6_C 35468 #define ROT6_SmC 50159 #define ROT6_SpC 121095 #define ROT17_C 77062 #define ROT17_SmC 25571 #define ROT17_SpC 128553 #define ROT37_C 58981 #define ROT37_SmC 98391 #define ROT37_SpC 19571 #define ROT13_C 167963 #define ROT13_SmC 134553 #define ROT13_SpC 201373 #else #include #define FX(x) ( (int)floor((x)*(1<0; --i) { int mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7, Spill; // even LOAD_BUTF(mm1,mm6, 1, 6, mm0, pIn); LOAD_BUTF(mm2,mm5, 2, 5, mm0, pIn); LOAD_BUTF(mm3,mm4, 3, 4, mm0, pIn); LOAD_BUTF(mm0,mm7, 0, 7, Spill, pIn); BUTF(mm1, mm2, Spill); BUTF(mm0, mm3, Spill); ROTATE(mm3, mm2, ROT6_C, ROT6_SmC, -ROT6_SpC, Spill, FIX-FPASS, HALF(FIX-FPASS)); pIn[2] = mm3; pIn[6] = mm2; BUTF(mm0, mm1, Spill); pIn[0] = SHIFTL(mm0, FPASS); pIn[4] = SHIFTL(mm1, FPASS); // odd mm3 = mm5 + mm7; mm2 = mm4 + mm6; ROTATE(mm2, mm3, ROT17_C, -ROT17_SpC, -ROT17_SmC, mm0, FIX-FPASS, HALF(FIX-FPASS)); ROTATE(mm4, mm7, -ROT37_C, ROT37_SpC, ROT37_SmC, 
mm0, FIX-FPASS, HALF(FIX-FPASS)); mm7 += mm3; mm4 += mm2; pIn[1] = mm7; pIn[7] = mm4; ROTATE(mm5, mm6, -ROT13_C, ROT13_SmC, ROT13_SpC, mm0, FIX-FPASS, HALF(FIX-FPASS)); mm5 += mm3; mm6 += mm2; pIn[3] = mm6; pIn[5] = mm5; pIn += 8; } pIn = In; for(i=8; i>0; --i) { int mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7, Spill; // even LOAD_BUTF(mm1,mm6, 1*8, 6*8, mm0, pIn); LOAD_BUTF(mm2,mm5, 2*8, 5*8, mm0, pIn); BUTF(mm1, mm2, mm0); LOAD_BUTF(mm3,mm4, 3*8, 4*8, mm0, pIn); LOAD_BUTF(mm0,mm7, 0*8, 7*8, Spill, pIn); BUTF(mm0, mm3, Spill); ROTATE(mm3, mm2, ROT6_C, ROT6_SmC, -ROT6_SpC, Spill, 0, HALF(FIX+FPASS+3)); pIn[2*8] = (int16_t)SHIFTR(mm3,FIX+FPASS+3); pIn[6*8] = (int16_t)SHIFTR(mm2,FIX+FPASS+3); mm0 += HALF(FPASS+3) - 1; BUTF(mm0, mm1, Spill); pIn[0*8] = (int16_t)SHIFTR(mm0, FPASS+3); pIn[4*8] = (int16_t)SHIFTR(mm1, FPASS+3); // odd mm3 = mm5 + mm7; mm2 = mm4 + mm6; ROTATE(mm2, mm3, ROT17_C, -ROT17_SpC, -ROT17_SmC, mm0, 0, HALF(FIX+FPASS+3)); ROTATE2(mm4, mm7, -ROT37_C, ROT37_SpC, ROT37_SmC, mm0); mm7 += mm3; mm4 += mm2; pIn[7*8] = (int16_t)SHIFTR(mm4,FIX+FPASS+3); pIn[1*8] = (int16_t)SHIFTR(mm7,FIX+FPASS+3); ROTATE2(mm5, mm6, -ROT13_C, ROT13_SmC, ROT13_SpC, mm0); mm5 += mm3; mm6 += mm2; pIn[5*8] = (int16_t)SHIFTR(mm5,FIX+FPASS+3); pIn[3*8] = (int16_t)SHIFTR(mm6,FIX+FPASS+3); pIn++; } } #undef FIX #undef FPASS #undef IPASS #undef BUTF #undef LOAD_BUTF #undef ROTATE #undef ROTATE2 #undef SHIFTL #undef SHIFTR ////////////////////////////////////////////////////////// // - Data flow schematics for FDCT - // Output is scaled by 2.sqrt(2) // Initial butterflies (in0/in7, etc.) are not fully depicted. // Note: Rot6 coeffs are multiplied by sqrt(2). 
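// Illustration only, not part of the Xvid sources: a sketch of the
// three-multiplication rotation described in the header comment of this file
// (tmp = (x+y).cos t, x' = tmp + y.(sin t - cos t), y' = tmp - x.(sin t + cos t)),
// checked against the plain four-multiplication form. It reuses the ROT6_*
// constants defined above, which already include the sqrt(2) factor. The
// names ROT6_S, vec2, rotate_4mul and rotate_3mul are made up for the sketch,
// and the ROTATE macro applies the same idea with its own operand order,
// signs, rounding and shift, none of which are modelled here.

```c
#include <stdint.h>

/* Rot6 constants as defined earlier in this file (16-bit fixed point). */
#define ROT6_C    35468   /* sqrt(2) * cos(6*pi/16)          */
#define ROT6_SmC  50159   /* sqrt(2) * (sin - cos)(6*pi/16)  */
#define ROT6_SpC 121095   /* sqrt(2) * (sin + cos)(6*pi/16)  */

/* sin term recovered from the two constants above (illustrative name). */
#define ROT6_S   (ROT6_C + ROT6_SmC)

typedef struct { int32_t x, y; } vec2;

/* Plain rotation: four multiplications. */
static vec2 rotate_4mul(int32_t x, int32_t y)
{
    vec2 r;
    r.x = x * ROT6_C + y * ROT6_S;
    r.y = y * ROT6_C - x * ROT6_S;
    return r;
}

/* Decomposed rotation: one shared product plus two more multiplications. */
static vec2 rotate_3mul(int32_t x, int32_t y)
{
    vec2 r;
    int32_t tmp = (x + y) * ROT6_C;   /* shared term */
    r.x = tmp + y * ROT6_SmC;         /* = x*C + y*S */
    r.y = tmp - x * ROT6_SpC;         /* = y*C - x*S */
    return r;
}
```

// Because ROT6_SpC equals ROT6_S + ROT6_C exactly, both forms agree bit for
// bit in integer arithmetic as long as the products stay within 32 bits.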
////////////////////////////////////////////////////////// /* <---------Stage1 =even part=-----------> in3 mm3 +_____.___-___________.____* out6 x \ / | in4 mm4 \ | / \ | in0 mm0 +_____o___+__.___-___ | ___* out4 x \ / | in7 mm7 \ (Rot6) / \ | in1 mm1 +_____o___+__o___+___ | ___* out0 x \ / | in6 mm6 / | / \ | in2 mm2 +_____.___-___________o____* out2 x in5 mm5 <---------Stage2 =odd part=----------------> mm7*___._________.___-___[xSqrt2]___* out3 | \ / (Rot3) \ | / \ mm5*__ | ___o____o___+___.___-______* out7 | | \ / | (Rot1) \ | | / \ mm6*__ |____.____o___+___o___+______* out1 | \ / | / | / \ mm4*___o_________.___-___[xSqrt2]___* out5 Alternative schematics for stage 2: ----------------------------------- mm7 *___[xSqrt2]____o___+____o_______* out1 \ / | / (Rot1) / \ | mm6 *____o___+______.___-___ | __.___* out5 \ / | | / | | / \ | | mm5 *____.___-______.___-____.__ | __* out7 \ / | \ (Rot3) / \ | mm4 *___[xSqrt2]____o___+________o___* out3 */ xvidcore/src/dct/idct.h0000664000076500007650000000362111564706134016131 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Inverse DCT header - * * Copyright(C) 2001-2011 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: idct.h 1986 2011-05-18 09:07:40Z Isibaar $
 *
 ****************************************************************************/

#ifndef _IDCT_H_
#define _IDCT_H_

#include "../portab.h"

void idct_int32_init();
void idct_ia64_init();

typedef void (idctFunc) (short *const block);
typedef idctFunc *idctFuncPtr;

extern idctFuncPtr idct;

idctFunc idct_int32;
idctFunc simple_idct_c; /* Michael Niedermayer */

#if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64)
idctFunc idct_mmx; /* AP-992, Peter Gubanov, Michel Lespinasse */
idctFunc idct_xmm; /* AP-992, Peter Gubanov, Michel Lespinasse */
idctFunc idct_3dne; /* AP-992, Peter Gubanov, Michel Lespinasse, Jaan Kalda */
idctFunc idct_sse2_skal; /* Skal's one, IEEE 1180 compliant */
idctFunc idct_sse2_dmitry; /* Dmitry Rozhdestvensky */
#endif

#ifdef ARCH_IS_IA64
idctFunc idct_ia64;
#endif

#ifdef ARCH_IS_PPC
idctFunc idct_altivec_c;
#endif

#endif /* _IDCT_H_ */

xvidcore/src/dct/idct.c

/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - Inverse DCT -
 *
 * Copyright (C) 2006-2011 Xvid Solutions GmbH
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: idct.c 1986 2011-05-18 09:07:40Z Isibaar $
 *
 ****************************************************************************/

/*
 * Authors: Skal
 *
 * Walken IDCT
 * Alternative idct implementations for decoding compatibility
 *
 * NOTE: this "C" version is not the original one,
 * but is modified to yield the same error profile
 * as the MMX version.
 *
 ************************************************************************/

#include "idct.h"

/* function pointer */
idctFuncPtr idct;

#define XVID_DSP_CLIP_255(x) ( ((x)&~255) ? ((-(x)) >> (8*sizeof((x))-1))&0xff : (x) )

#define ROW_SHIFT 11
#define COL_SHIFT 6

// #define FIX(x) (int)((x) * (1<> ROW_SHIFT;
      In[1] = (a1 + b1) >> ROW_SHIFT;
      In[2] = (a2 + b2) >> ROW_SHIFT;
      In[3] = (a3 + b3) >> ROW_SHIFT;
      In[4] = (a3 - b3) >> ROW_SHIFT;
      In[5] = (a2 - b2) >> ROW_SHIFT;
      In[6] = (a1 - b1) >> ROW_SHIFT;
      In[7] = (a0 - b0) >> ROW_SHIFT;
    }
    else {
      const int a0 = K >> ROW_SHIFT;
      if (a0) {
        In[0] = In[1] = In[2] = In[3] = In[4] = In[5] = In[6] = In[7] = a0;
      }
      else return 0;
    }
  }
  else if (!(Left|Right)) {
    const int a0 = (Rnd + C4*(In[0]+In[4])) >> ROW_SHIFT;
    const int a1 = (Rnd + C4*(In[0]-In[4])) >> ROW_SHIFT;

    In[0] = a0;
    In[3] = a0;
    In[4] = a0;
    In[7] = a0;
    In[1] = a1;
    In[2] = a1;
    In[5] = a1;
    In[6] = a1;
  }
  else {
    const int K = C4*In[0] + Rnd;
    const int a0 = K + C2*In[2] + C4*In[4] + C6*In[6];
    const int a1 = K + C6*In[2] - C4*In[4] - C2*In[6];
    const int a2 = K - C6*In[2] - C4*In[4] + C2*In[6];
    const int a3 = K - C2*In[2] + C4*In[4] - C6*In[6];

    const int b0 = C1*In[1] + C3*In[3] + C5*In[5] + C7*In[7];
    const int b1 = C3*In[1] - C7*In[3] - C1*In[5] - C5*In[7];
    const int b2 = C5*In[1] - C1*In[3] + C7*In[5] + C3*In[7];
    const int b3 = C7*In[1] - C5*In[3] + C3*In[5] - C1*In[7];

    In[0] = (a0 + b0) >> ROW_SHIFT;
    In[1] = (a1 + b1) >> 
ROW_SHIFT; In[2] = (a2 + b2) >> ROW_SHIFT; In[3] = (a3 + b3) >> ROW_SHIFT; In[4] = (a3 - b3) >> ROW_SHIFT; In[5] = (a2 - b2) >> ROW_SHIFT; In[6] = (a1 - b1) >> ROW_SHIFT; In[7] = (a0 - b0) >> ROW_SHIFT; } return 1; } #define Tan1 0x32ec #define Tan2 0x6a0a #define Tan3 0xab0e #define Sqrt2 0x5a82 #define MULT(c,x, n) ( ((c) * (x)) >> (n) ) // 12b version => #define MULT(c,x, n) ( (((c)>>3) * (x)) >> ((n)-3) ) // 12b zero-testing version: #define BUTF(a, b, tmp) \ (tmp) = (a)+(b); \ (b) = (a)-(b); \ (a) = (tmp) #define LOAD_BUTF(m1, m2, a, b, tmp, S) \ (m1) = (S)[(a)] + (S)[(b)]; \ (m2) = (S)[(a)] - (S)[(b)] static void Idct_Col_8(short * const In) { int mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7, Spill; // odd mm4 = (int)In[7*8]; mm5 = (int)In[5*8]; mm6 = (int)In[3*8]; mm7 = (int)In[1*8]; mm0 = MULT(Tan1, mm4, 16) + mm7; mm1 = MULT(Tan1, mm7, 16) - mm4; mm2 = MULT(Tan3, mm5, 16) + mm6; mm3 = MULT(Tan3, mm6, 16) - mm5; mm7 = mm0 + mm2; mm4 = mm1 - mm3; mm0 = mm0 - mm2; mm1 = mm1 + mm3; mm6 = mm0 + mm1; mm5 = mm0 - mm1; mm5 = 2*MULT(Sqrt2, mm5, 16); // 2*sqrt2 mm6 = 2*MULT(Sqrt2, mm6, 16); // Watch out: precision loss but done to match // the pmulhw used in mmx/sse versions // even mm1 = (int)In[2*8]; mm2 = (int)In[6*8]; mm3 = MULT(Tan2,mm2, 16) + mm1; mm2 = MULT(Tan2,mm1, 16) - mm2; LOAD_BUTF(mm0, mm1, 0*8, 4*8, Spill, In); BUTF(mm0, mm3, Spill); BUTF(mm0, mm7, Spill); In[8*0] = (int16_t) (mm0 >> COL_SHIFT); In[8*7] = (int16_t) (mm7 >> COL_SHIFT); BUTF(mm3, mm4, mm0); In[8*3] = (int16_t) (mm3 >> COL_SHIFT); In[8*4] = (int16_t) (mm4 >> COL_SHIFT); BUTF(mm1, mm2, mm0); BUTF(mm1, mm6, mm0); In[8*1] = (int16_t) (mm1 >> COL_SHIFT); In[8*6] = (int16_t) (mm6 >> COL_SHIFT); BUTF(mm2, mm5, mm0); In[8*2] = (int16_t) (mm2 >> COL_SHIFT); In[8*5] = (int16_t) (mm5 >> COL_SHIFT); } static void Idct_Col_4(short * const In) { int mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7, Spill; // odd mm0 = (int)In[1*8]; mm2 = (int)In[3*8]; mm1 = MULT(Tan1, mm0, 16); mm3 = MULT(Tan3, mm2, 16); mm7 = mm0 + 
mm2; mm4 = mm1 - mm3; mm0 = mm0 - mm2; mm1 = mm1 + mm3; mm6 = mm0 + mm1; mm5 = mm0 - mm1; mm6 = 2*MULT(Sqrt2, mm6, 16); // 2*sqrt2 mm5 = 2*MULT(Sqrt2, mm5, 16); // even mm0 = mm1 = (int)In[0*8]; mm3 = (int)In[2*8]; mm2 = MULT(Tan2,mm3, 16); BUTF(mm0, mm3, Spill); BUTF(mm0, mm7, Spill); In[8*0] = (int16_t) (mm0 >> COL_SHIFT); In[8*7] = (int16_t) (mm7 >> COL_SHIFT); BUTF(mm3, mm4, mm0); In[8*3] = (int16_t) (mm3 >> COL_SHIFT); In[8*4] = (int16_t) (mm4 >> COL_SHIFT); BUTF(mm1, mm2, mm0); BUTF(mm1, mm6, mm0); In[8*1] = (int16_t) (mm1 >> COL_SHIFT); In[8*6] = (int16_t) (mm6 >> COL_SHIFT); BUTF(mm2, mm5, mm0); In[8*2] = (int16_t) (mm2 >> COL_SHIFT); In[8*5] = (int16_t) (mm5 >> COL_SHIFT); } static void Idct_Col_3(short * const In) { int mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7, Spill; // odd mm7 = (int)In[1*8]; mm4 = MULT(Tan1, mm7, 16); mm6 = mm7 + mm4; mm5 = mm7 - mm4; mm6 = 2*MULT(Sqrt2, mm6, 16); // 2*sqrt2 mm5 = 2*MULT(Sqrt2, mm5, 16); // even mm0 = mm1 = (int)In[0*8]; mm3 = (int)In[2*8]; mm2 = MULT(Tan2,mm3, 16); BUTF(mm0, mm3, Spill); BUTF(mm0, mm7, Spill); In[8*0] = (int16_t) (mm0 >> COL_SHIFT); In[8*7] = (int16_t) (mm7 >> COL_SHIFT); BUTF(mm3, mm4, mm0); In[8*3] = (int16_t) (mm3 >> COL_SHIFT); In[8*4] = (int16_t) (mm4 >> COL_SHIFT); BUTF(mm1, mm2, mm0); BUTF(mm1, mm6, mm0); In[8*1] = (int16_t) (mm1 >> COL_SHIFT); In[8*6] = (int16_t) (mm6 >> COL_SHIFT); BUTF(mm2, mm5, mm0); In[8*2] = (int16_t) (mm2 >> COL_SHIFT); In[8*5] = (int16_t) (mm5 >> COL_SHIFT); } #undef Tan1 #undef Tan2 #undef Tan3 #undef Sqrt2 #undef ROW_SHIFT #undef COL_SHIFT ////////////////////////////////////////////////////////// void idct_int32(short *const In) { int i, Rows = 0x07; Idct_Row(In + 0*8, Tab04, Rnd0); Idct_Row(In + 1*8, Tab17, Rnd1); Idct_Row(In + 2*8, Tab26, Rnd2); if (Idct_Row(In + 3*8, Tab35, Rnd3)) Rows |= 0x08; if (Idct_Row(In + 4*8, Tab04, Rnd4)) Rows |= 0x10; if (Idct_Row(In + 5*8, Tab35, Rnd5)) Rows |= 0x20; if (Idct_Row(In + 6*8, Tab26, Rnd6)) Rows |= 0x40; if (Idct_Row(In + 
7*8, Tab17, Rnd7)) Rows |= 0x80; if (Rows&0xf0) { for(i=0; i<8; i++) Idct_Col_8(In + i); } else if (Rows&0x08) { for(i=0; i<8; i++) Idct_Col_4(In + i); } else { for(i=0; i<8; i++) Idct_Col_3(In + i); } } xvidcore/src/decoder.c0000664000076500007650000015637311564705453016054 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Decoder Module - * * Copyright(C) 2002 MinChen * 2002-2010 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: decoder.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include #include #include #ifdef BFRAMES_DEC_DEBUG #define BFRAMES_DEC #endif #include "xvid.h" #include "portab.h" #include "global.h" #include "decoder.h" #include "bitstream/bitstream.h" #include "bitstream/mbcoding.h" #include "quant/quant.h" #include "quant/quant_matrix.h" #include "dct/idct.h" #include "dct/fdct.h" #include "utils/mem_transfer.h" #include "image/interpolate8x8.h" #include "image/font.h" #include "image/qpel.h" #include "bitstream/mbcoding.h" #include "prediction/mbprediction.h" #include "utils/timer.h" #include "utils/emms.h" #include "motion/motion.h" #include "motion/gmc.h" #include "image/image.h" #include "image/colorspace.h" #include 
"image/postprocessing.h" #include "utils/mem_align.h" #define DIV2ROUND(n) (((n)>>1)|((n)&1)) #define DIV2(n) ((n)>>1) #define DIVUVMOV(n) (((n) >> 1) + roundtab_79[(n) & 0x3]) // static int decoder_resize(DECODER * dec) { /* free existing */ image_destroy(&dec->cur, dec->edged_width, dec->edged_height); image_destroy(&dec->refn[0], dec->edged_width, dec->edged_height); image_destroy(&dec->refn[1], dec->edged_width, dec->edged_height); image_destroy(&dec->tmp, dec->edged_width, dec->edged_height); image_destroy(&dec->qtmp, dec->edged_width, dec->edged_height); image_destroy(&dec->gmc, dec->edged_width, dec->edged_height); image_null(&dec->cur); image_null(&dec->refn[0]); image_null(&dec->refn[1]); image_null(&dec->tmp); image_null(&dec->qtmp); image_null(&dec->gmc); xvid_free(dec->last_mbs); xvid_free(dec->mbs); xvid_free(dec->qscale); dec->last_mbs = NULL; dec->mbs = NULL; dec->qscale = NULL; /* realloc */ dec->mb_width = (dec->width + 15) / 16; dec->mb_height = (dec->height + 15) / 16; dec->edged_width = 16 * dec->mb_width + 2 * EDGE_SIZE; dec->edged_height = 16 * dec->mb_height + 2 * EDGE_SIZE; if ( image_create(&dec->cur, dec->edged_width, dec->edged_height) || image_create(&dec->refn[0], dec->edged_width, dec->edged_height) || image_create(&dec->refn[1], dec->edged_width, dec->edged_height) /* Support B-frame to reference last 2 frame */ || image_create(&dec->tmp, dec->edged_width, dec->edged_height) || image_create(&dec->qtmp, dec->edged_width, dec->edged_height) || image_create(&dec->gmc, dec->edged_width, dec->edged_height) ) goto memory_error; dec->mbs = xvid_malloc(sizeof(MACROBLOCK) * dec->mb_width * dec->mb_height, CACHE_LINE); if (dec->mbs == NULL) goto memory_error; memset(dec->mbs, 0, sizeof(MACROBLOCK) * dec->mb_width * dec->mb_height); /* For skip MB flag */ dec->last_mbs = xvid_malloc(sizeof(MACROBLOCK) * dec->mb_width * dec->mb_height, CACHE_LINE); if (dec->last_mbs == NULL) goto memory_error; memset(dec->last_mbs, 0, sizeof(MACROBLOCK) * 
dec->mb_width * dec->mb_height); /* nothing happens if that fails */ dec->qscale = xvid_malloc(sizeof(int) * dec->mb_width * dec->mb_height, CACHE_LINE); if (dec->qscale) memset(dec->qscale, 0, sizeof(int) * dec->mb_width * dec->mb_height); return 0; memory_error: /* Most structures were deallocated / nullified, so it should be safe */ /* decoder_destroy(dec) minus the write_timer */ xvid_free(dec->mbs); image_destroy(&dec->cur, dec->edged_width, dec->edged_height); image_destroy(&dec->refn[0], dec->edged_width, dec->edged_height); image_destroy(&dec->refn[1], dec->edged_width, dec->edged_height); image_destroy(&dec->tmp, dec->edged_width, dec->edged_height); image_destroy(&dec->qtmp, dec->edged_width, dec->edged_height); xvid_free(dec); return XVID_ERR_MEMORY; } int decoder_create(xvid_dec_create_t * create) { DECODER *dec; if (XVID_VERSION_MAJOR(create->version) != 1) /* v1.x.x */ return XVID_ERR_VERSION; dec = xvid_malloc(sizeof(DECODER), CACHE_LINE); if (dec == NULL) { return XVID_ERR_MEMORY; } memset(dec, 0, sizeof(DECODER)); dec->mpeg_quant_matrices = xvid_malloc(sizeof(uint16_t) * 64 * 8, CACHE_LINE); if (dec->mpeg_quant_matrices == NULL) { xvid_free(dec); return XVID_ERR_MEMORY; } create->handle = dec; dec->width = create->width; dec->height = create->height; dec->num_threads = MAX(0, create->num_threads); image_null(&dec->cur); image_null(&dec->refn[0]); image_null(&dec->refn[1]); image_null(&dec->tmp); image_null(&dec->qtmp); /* image based GMC */ image_null(&dec->gmc); dec->mbs = NULL; dec->last_mbs = NULL; dec->qscale = NULL; init_timer(); init_postproc(&dec->postproc); init_mpeg_matrix(dec->mpeg_quant_matrices); /* For B-frame support (used to save reference frame's time) */ dec->frames = 0; dec->time = dec->time_base = dec->last_time_base = 0; dec->low_delay = 0; dec->packed_mode = 0; dec->time_inc_resolution = 1; /* until VOL header says otherwise */ dec->ver_id = 1; if (create->fourcc == ((int)('X')|((int)('V')<<8)|
((int)('I')<<16)|((int)('D')<<24))) { /* XVID */ dec->bs_version = 0; /* Initially assume oldest xvid version */ } else { dec->bs_version = 0xffff; /* Initialize to very high value -> assume bugfree stream */ } dec->fixed_dimensions = (dec->width > 0 && dec->height > 0); if (dec->fixed_dimensions) { int ret = decoder_resize(dec); if (ret == XVID_ERR_MEMORY) create->handle = NULL; return ret; } else return 0; } int decoder_destroy(DECODER * dec) { xvid_free(dec->last_mbs); xvid_free(dec->mbs); xvid_free(dec->qscale); /* image based GMC */ image_destroy(&dec->gmc, dec->edged_width, dec->edged_height); image_destroy(&dec->refn[0], dec->edged_width, dec->edged_height); image_destroy(&dec->refn[1], dec->edged_width, dec->edged_height); image_destroy(&dec->tmp, dec->edged_width, dec->edged_height); image_destroy(&dec->qtmp, dec->edged_width, dec->edged_height); image_destroy(&dec->cur, dec->edged_width, dec->edged_height); xvid_free(dec->mpeg_quant_matrices); xvid_free(dec); write_timer(); return 0; } static const int32_t dquant_table[4] = { -1, -2, 1, 2 }; /* decode an intra macroblock */ static void decoder_mbintra(DECODER * dec, MACROBLOCK * pMB, const uint32_t x_pos, const uint32_t y_pos, const uint32_t acpred_flag, const uint32_t cbp, Bitstream * bs, const uint32_t quant, const uint32_t intra_dc_threshold, const unsigned int bound) { DECLARE_ALIGNED_MATRIX(block, 6, 64, int16_t, CACHE_LINE); DECLARE_ALIGNED_MATRIX(data, 6, 64, int16_t, CACHE_LINE); uint32_t stride = dec->edged_width; uint32_t stride2 = stride / 2; uint32_t next_block = stride * 8; uint32_t i; uint32_t iQuant = pMB->quant; uint8_t *pY_Cur, *pU_Cur, *pV_Cur; pY_Cur = dec->cur.y + (y_pos << 4) * stride + (x_pos << 4); pU_Cur = dec->cur.u + (y_pos << 3) * stride2 + (x_pos << 3); pV_Cur = dec->cur.v + (y_pos << 3) * stride2 + (x_pos << 3); memset(block, 0, 6 * 64 * sizeof(int16_t)); /* clear */ for (i = 0; i < 6; i++) { uint32_t iDcScaler = get_dc_scaler(iQuant, i < 4); int16_t predictors[8]; int 
start_coeff; start_timer(); predict_acdc(dec->mbs, x_pos, y_pos, dec->mb_width, i, &block[i * 64], iQuant, iDcScaler, predictors, bound); if (!acpred_flag) { pMB->acpred_directions[i] = 0; } stop_prediction_timer(); if (quant < intra_dc_threshold) { int dc_size; int dc_dif; dc_size = i < 4 ? get_dc_size_lum(bs) : get_dc_size_chrom(bs); dc_dif = dc_size ? get_dc_dif(bs, dc_size) : 0; if (dc_size > 8) { BitstreamSkip(bs, 1); /* marker */ } block[i * 64 + 0] = dc_dif; start_coeff = 1; DPRINTF(XVID_DEBUG_COEFF,"block[0] %i\n", dc_dif); } else { start_coeff = 0; } start_timer(); if (cbp & (1 << (5 - i))) /* coded */ { int direction = dec->alternate_vertical_scan ? 2 : pMB->acpred_directions[i]; get_intra_block(bs, &block[i * 64], direction, start_coeff); } stop_coding_timer(); start_timer(); add_acdc(pMB, i, &block[i * 64], iDcScaler, predictors, dec->bs_version); stop_prediction_timer(); start_timer(); if (dec->quant_type == 0) { dequant_h263_intra(&data[i * 64], &block[i * 64], iQuant, iDcScaler, dec->mpeg_quant_matrices); } else { dequant_mpeg_intra(&data[i * 64], &block[i * 64], iQuant, iDcScaler, dec->mpeg_quant_matrices); } stop_iquant_timer(); start_timer(); idct((short * const)&data[i * 64]); stop_idct_timer(); } if (dec->interlacing && pMB->field_dct) { next_block = stride; stride *= 2; } start_timer(); transfer_16to8copy(pY_Cur, &data[0 * 64], stride); transfer_16to8copy(pY_Cur + 8, &data[1 * 64], stride); transfer_16to8copy(pY_Cur + next_block, &data[2 * 64], stride); transfer_16to8copy(pY_Cur + 8 + next_block, &data[3 * 64], stride); transfer_16to8copy(pU_Cur, &data[4 * 64], stride2); transfer_16to8copy(pV_Cur, &data[5 * 64], stride2); stop_transfer_timer(); } static void decoder_mb_decode(DECODER * dec, const uint32_t cbp, Bitstream * bs, uint8_t * pY_Cur, uint8_t * pU_Cur, uint8_t * pV_Cur, const MACROBLOCK * pMB) { DECLARE_ALIGNED_MATRIX(data, 1, 64, int16_t, CACHE_LINE); int stride = dec->edged_width; int i; const uint32_t iQuant = pMB->quant; const int 
direction = dec->alternate_vertical_scan ? 2 : 0; typedef void (*get_inter_block_function_t)( Bitstream * bs, int16_t * block, int direction, const int quant, const uint16_t *matrix); typedef void (*add_residual_function_t)( uint8_t *predicted_block, const int16_t *residual, int stride); const get_inter_block_function_t get_inter_block = (dec->quant_type == 0) ? (get_inter_block_function_t)get_inter_block_h263 : (get_inter_block_function_t)get_inter_block_mpeg; uint8_t *dst[6]; int strides[6]; if (dec->interlacing && pMB->field_dct) { dst[0] = pY_Cur; dst[1] = pY_Cur + 8; dst[2] = pY_Cur + stride; dst[3] = dst[2] + 8; dst[4] = pU_Cur; dst[5] = pV_Cur; strides[0] = strides[1] = strides[2] = strides[3] = stride*2; strides[4] = stride/2; strides[5] = stride/2; } else { dst[0] = pY_Cur; dst[1] = pY_Cur + 8; dst[2] = pY_Cur + 8*stride; dst[3] = dst[2] + 8; dst[4] = pU_Cur; dst[5] = pV_Cur; strides[0] = strides[1] = strides[2] = strides[3] = stride; strides[4] = stride/2; strides[5] = stride/2; } for (i = 0; i < 6; i++) { /* Process only coded blocks */ if (cbp & (1 << (5 - i))) { /* Clear the block */ memset(&data[0], 0, 64*sizeof(int16_t)); /* Decode coeffs and dequantize on the fly */ start_timer(); get_inter_block(bs, &data[0], direction, iQuant, get_inter_matrix(dec->mpeg_quant_matrices)); stop_coding_timer(); /* iDCT */ start_timer(); idct((short * const)&data[0]); stop_idct_timer(); /* Add this residual to the predicted block */ start_timer(); transfer_16to8add(dst[i], &data[0], strides[i]); stop_transfer_timer(); } } } static void __inline validate_vector(VECTOR * mv, unsigned int x_pos, unsigned int y_pos, const DECODER * dec) { /* clip a vector to valid range prevents crashes if bitstream is broken */ int shift = 5 + dec->quarterpel; int xborder_high = (int)(dec->mb_width - x_pos) << shift; int xborder_low = (-(int)x_pos-1) << shift; int yborder_high = (int)(dec->mb_height - y_pos) << shift; int yborder_low = (-(int)y_pos-1) << shift; #define CHECK_MV(mv) \ do 
{ \ if ((mv).x > xborder_high) { \ DPRINTF(XVID_DEBUG_MV, "mv.x > max -- %d > %d, MB %d, %d", (mv).x, xborder_high, x_pos, y_pos); \ (mv).x = xborder_high; \ } else if ((mv).x < xborder_low) { \ DPRINTF(XVID_DEBUG_MV, "mv.x < min -- %d < %d, MB %d, %d", (mv).x, xborder_low, x_pos, y_pos); \ (mv).x = xborder_low; \ } \ if ((mv).y > yborder_high) { \ DPRINTF(XVID_DEBUG_MV, "mv.y > max -- %d > %d, MB %d, %d", (mv).y, yborder_high, x_pos, y_pos); \ (mv).y = yborder_high; \ } else if ((mv).y < yborder_low) { \ DPRINTF(XVID_DEBUG_MV, "mv.y < min -- %d < %d, MB %d, %d", (mv).y, yborder_low, x_pos, y_pos); \ (mv).y = yborder_low; \ } \ } while (0) CHECK_MV(mv[0]); CHECK_MV(mv[1]); CHECK_MV(mv[2]); CHECK_MV(mv[3]); } /* Up to this version, chroma rounding was wrong with qpel. * So we try to be backward compatible to avoid artifacts */ #define BS_VERSION_BUGGY_CHROMA_ROUNDING 1 /* decode an inter macroblock */ static void decoder_mbinter(DECODER * dec, const MACROBLOCK * pMB, const uint32_t x_pos, const uint32_t y_pos, const uint32_t cbp, Bitstream * bs, const uint32_t rounding, const int ref, const int bvop) { uint32_t stride = dec->edged_width; uint32_t stride2 = stride / 2; uint32_t i; uint8_t *pY_Cur, *pU_Cur, *pV_Cur; int uv_dx, uv_dy; VECTOR mv[4]; /* local copy of mvs */ pY_Cur = dec->cur.y + (y_pos << 4) * stride + (x_pos << 4); pU_Cur = dec->cur.u + (y_pos << 3) * stride2 + (x_pos << 3); pV_Cur = dec->cur.v + (y_pos << 3) * stride2 + (x_pos << 3); for (i = 0; i < 4; i++) mv[i] = pMB->mvs[i]; validate_vector(mv, x_pos, y_pos, dec); start_timer(); if ((pMB->mode != MODE_INTER4V) || (bvop)) { /* INTER, INTER_Q, NOT_CODED, FORWARD, BACKWARD */ uv_dx = mv[0].x; uv_dy = mv[0].y; if (dec->quarterpel) { if (dec->bs_version <= BS_VERSION_BUGGY_CHROMA_ROUNDING) { uv_dx = (uv_dx>>1) | (uv_dx&1); uv_dy = (uv_dy>>1) | (uv_dy&1); } else { uv_dx /= 2; uv_dy /= 2; } } uv_dx = (uv_dx >> 1) + roundtab_79[uv_dx & 0x3]; uv_dy = (uv_dy >> 1) + roundtab_79[uv_dy & 0x3]; if 
(dec->quarterpel) interpolate16x16_quarterpel(dec->cur.y, dec->refn[ref].y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos, mv[0].x, mv[0].y, stride, rounding); else interpolate16x16_switch(dec->cur.y, dec->refn[ref].y, 16*x_pos, 16*y_pos, mv[0].x, mv[0].y, stride, rounding); } else { /* MODE_INTER4V */ if(dec->quarterpel) { if (dec->bs_version <= BS_VERSION_BUGGY_CHROMA_ROUNDING) { int z; uv_dx = 0; uv_dy = 0; for (z = 0; z < 4; z++) { uv_dx += ((mv[z].x>>1) | (mv[z].x&1)); uv_dy += ((mv[z].y>>1) | (mv[z].y&1)); } } else { uv_dx = (mv[0].x / 2) + (mv[1].x / 2) + (mv[2].x / 2) + (mv[3].x / 2); uv_dy = (mv[0].y / 2) + (mv[1].y / 2) + (mv[2].y / 2) + (mv[3].y / 2); } } else { uv_dx = mv[0].x + mv[1].x + mv[2].x + mv[3].x; uv_dy = mv[0].y + mv[1].y + mv[2].y + mv[3].y; } uv_dx = (uv_dx >> 3) + roundtab_76[uv_dx & 0xf]; uv_dy = (uv_dy >> 3) + roundtab_76[uv_dy & 0xf]; if (dec->quarterpel) { interpolate8x8_quarterpel(dec->cur.y, dec->refn[0].y , dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos, mv[0].x, mv[0].y, stride, rounding); interpolate8x8_quarterpel(dec->cur.y, dec->refn[0].y , dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos + 8, 16*y_pos, mv[1].x, mv[1].y, stride, rounding); interpolate8x8_quarterpel(dec->cur.y, dec->refn[0].y , dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos + 8, mv[2].x, mv[2].y, stride, rounding); interpolate8x8_quarterpel(dec->cur.y, dec->refn[0].y , dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos + 8, 16*y_pos + 8, mv[3].x, mv[3].y, stride, rounding); } else { interpolate8x8_switch(dec->cur.y, dec->refn[0].y , 16*x_pos, 16*y_pos, mv[0].x, mv[0].y, stride, rounding); interpolate8x8_switch(dec->cur.y, dec->refn[0].y , 16*x_pos + 8, 16*y_pos, mv[1].x, mv[1].y, stride, rounding); interpolate8x8_switch(dec->cur.y, dec->refn[0].y , 16*x_pos, 16*y_pos + 8, mv[2].x, mv[2].y, stride, rounding); interpolate8x8_switch(dec->cur.y, dec->refn[0].y , 16*x_pos + 8, 
16*y_pos + 8, mv[3].x, mv[3].y, stride, rounding); } } /* chroma */ interpolate8x8_switch(dec->cur.u, dec->refn[ref].u, 8 * x_pos, 8 * y_pos, uv_dx, uv_dy, stride2, rounding); interpolate8x8_switch(dec->cur.v, dec->refn[ref].v, 8 * x_pos, 8 * y_pos, uv_dx, uv_dy, stride2, rounding); stop_comp_timer(); if (cbp) decoder_mb_decode(dec, cbp, bs, pY_Cur, pU_Cur, pV_Cur, pMB); } /* decode an inter macroblock in field mode */ static void decoder_mbinter_field(DECODER * dec, const MACROBLOCK * pMB, const uint32_t x_pos, const uint32_t y_pos, const uint32_t cbp, Bitstream * bs, const uint32_t rounding, const int ref, const int bvop) { uint32_t stride = dec->edged_width; uint32_t stride2 = stride / 2; uint8_t *pY_Cur, *pU_Cur, *pV_Cur; int uvtop_dx, uvtop_dy; int uvbot_dx, uvbot_dy; VECTOR mv[4]; /* local copy of mvs */ /* Get pointer to memory areas */ pY_Cur = dec->cur.y + (y_pos << 4) * stride + (x_pos << 4); pU_Cur = dec->cur.u + (y_pos << 3) * stride2 + (x_pos << 3); pV_Cur = dec->cur.v + (y_pos << 3) * stride2 + (x_pos << 3); mv[0] = pMB->mvs[0]; mv[1] = pMB->mvs[1]; memset(&mv[2],0,2*sizeof(VECTOR)); validate_vector(mv, x_pos, y_pos, dec); start_timer(); if((pMB->mode!=MODE_INTER4V) || (bvop)) /* INTER, INTER_Q, NOT_CODED, FORWARD, BACKWARD */ { /* Prepare top field vector */ uvtop_dx = DIV2ROUND(mv[0].x); uvtop_dy = DIV2ROUND(mv[0].y); /* Prepare bottom field vector */ uvbot_dx = DIV2ROUND(mv[1].x); uvbot_dy = DIV2ROUND(mv[1].y); if(dec->quarterpel) { /* NOT supported */ } else { /* Interpolate top field left part(we use double stride for every 2nd line) */ interpolate8x8_switch(dec->cur.y,dec->refn[ref].y+pMB->field_for_top*stride, 16*x_pos,8*y_pos,mv[0].x, mv[0].y>>1,2*stride, rounding); /* top field right part */ interpolate8x8_switch(dec->cur.y,dec->refn[ref].y+pMB->field_for_top*stride, 16*x_pos+8,8*y_pos,mv[0].x, mv[0].y>>1,2*stride, rounding); /* Interpolate bottom field left part(we use double stride for every 2nd line) */ 
interpolate8x8_switch(dec->cur.y+stride,dec->refn[ref].y+pMB->field_for_bot*stride, 16*x_pos,8*y_pos,mv[1].x, mv[1].y>>1,2*stride, rounding); /* Bottom field right part */ interpolate8x8_switch(dec->cur.y+stride,dec->refn[ref].y+pMB->field_for_bot*stride, 16*x_pos+8,8*y_pos,mv[1].x, mv[1].y>>1,2*stride, rounding); /* Interpolate field1 U */ interpolate8x4_switch(dec->cur.u,dec->refn[ref].u+pMB->field_for_top*stride2, 8*x_pos,4*y_pos,uvtop_dx,DIV2ROUND(uvtop_dy),stride,rounding); /* Interpolate field1 V */ interpolate8x4_switch(dec->cur.v,dec->refn[ref].v+pMB->field_for_top*stride2, 8*x_pos,4*y_pos,uvtop_dx,DIV2ROUND(uvtop_dy),stride,rounding); /* Interpolate field2 U */ interpolate8x4_switch(dec->cur.u+stride2,dec->refn[ref].u+pMB->field_for_bot*stride2, 8*x_pos,4*y_pos,uvbot_dx,DIV2ROUND(uvbot_dy),stride,rounding); /* Interpolate field2 V */ interpolate8x4_switch(dec->cur.v+stride2,dec->refn[ref].v+pMB->field_for_bot*stride2, 8*x_pos,4*y_pos,uvbot_dx,DIV2ROUND(uvbot_dy),stride,rounding); } } else { /* We don't expect 4 motion vectors in interlaced mode */ } stop_comp_timer(); /* Must add error correction? 
*/ if(cbp) decoder_mb_decode(dec, cbp, bs, pY_Cur, pU_Cur, pV_Cur, pMB); } static void decoder_mbgmc(DECODER * dec, MACROBLOCK * const pMB, const uint32_t x_pos, const uint32_t y_pos, const uint32_t fcode, const uint32_t cbp, Bitstream * bs, const uint32_t rounding) { const uint32_t stride = dec->edged_width; const uint32_t stride2 = stride / 2; uint8_t *const pY_Cur=dec->cur.y + (y_pos << 4) * stride + (x_pos << 4); uint8_t *const pU_Cur=dec->cur.u + (y_pos << 3) * stride2 + (x_pos << 3); uint8_t *const pV_Cur=dec->cur.v + (y_pos << 3) * stride2 + (x_pos << 3); NEW_GMC_DATA * gmc_data = &dec->new_gmc_data; pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = pMB->amv; start_timer(); /* this is where the calculations are done */ gmc_data->predict_16x16(gmc_data, dec->cur.y + y_pos*16*stride + x_pos*16, dec->refn[0].y, stride, stride, x_pos, y_pos, rounding); gmc_data->predict_8x8(gmc_data, dec->cur.u + y_pos*8*stride2 + x_pos*8, dec->refn[0].u, dec->cur.v + y_pos*8*stride2 + x_pos*8, dec->refn[0].v, stride2, stride2, x_pos, y_pos, rounding); gmc_data->get_average_mv(gmc_data, &pMB->amv, x_pos, y_pos, dec->quarterpel); pMB->amv.x = gmc_sanitize(pMB->amv.x, dec->quarterpel, fcode); pMB->amv.y = gmc_sanitize(pMB->amv.y, dec->quarterpel, fcode); pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = pMB->amv; stop_transfer_timer(); if (cbp) decoder_mb_decode(dec, cbp, bs, pY_Cur, pU_Cur, pV_Cur, pMB); } static void decoder_iframe(DECODER * dec, Bitstream * bs, int quant, int intra_dc_threshold) { uint32_t bound; uint32_t x, y; const uint32_t mb_width = dec->mb_width; const uint32_t mb_height = dec->mb_height; bound = 0; for (y = 0; y < mb_height; y++) { for (x = 0; x < mb_width; x++) { MACROBLOCK *mb; uint32_t mcbpc; uint32_t cbpc; uint32_t acpred_flag; uint32_t cbpy; uint32_t cbp; while (BitstreamShowBits(bs, 9) == 1) BitstreamSkip(bs, 9); if (check_resync_marker(bs, 0)) { bound = read_video_packet_header(bs, dec, 0, &quant, NULL, NULL, &intra_dc_threshold); x = 
bound % mb_width; y = MIN((bound / mb_width), (mb_height-1)); } mb = &dec->mbs[y * dec->mb_width + x]; DPRINTF(XVID_DEBUG_MB, "macroblock (%i,%i) %08x\n", x, y, BitstreamShowBits(bs, 32)); mcbpc = get_mcbpc_intra(bs); mb->mode = mcbpc & 7; cbpc = (mcbpc >> 4); acpred_flag = BitstreamGetBit(bs); cbpy = get_cbpy(bs, 1); cbp = (cbpy << 2) | cbpc; if (mb->mode == MODE_INTRA_Q) { quant += dquant_table[BitstreamGetBits(bs, 2)]; if (quant > 31) { quant = 31; } else if (quant < 1) { quant = 1; } } mb->quant = quant; mb->mvs[0].x = mb->mvs[0].y = mb->mvs[1].x = mb->mvs[1].y = mb->mvs[2].x = mb->mvs[2].y = mb->mvs[3].x = mb->mvs[3].y =0; if (dec->interlacing) { mb->field_dct = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB,"deci: field_dct: %i\n", mb->field_dct); } decoder_mbintra(dec, mb, x, y, acpred_flag, cbp, bs, quant, intra_dc_threshold, bound); } if(dec->out_frm) output_slice(&dec->cur, dec->edged_width,dec->width,dec->out_frm,0,y,mb_width); } } static void get_motion_vector(DECODER * dec, Bitstream * bs, int x, int y, int k, VECTOR * ret_mv, int fcode, const int bound) { const int scale_fac = 1 << (fcode - 1); const int high = (32 * scale_fac) - 1; const int low = ((-32) * scale_fac); const int range = (64 * scale_fac); const VECTOR pmv = get_pmv2(dec->mbs, dec->mb_width, bound, x, y, k); VECTOR mv; mv.x = get_mv(bs, fcode); mv.y = get_mv(bs, fcode); DPRINTF(XVID_DEBUG_MV,"mv_diff (%i,%i) pred (%i,%i) result (%i,%i)\n", mv.x, mv.y, pmv.x, pmv.y, mv.x+pmv.x, mv.y+pmv.y); mv.x += pmv.x; mv.y += pmv.y; if (mv.x < low) { mv.x += range; } else if (mv.x > high) { mv.x -= range; } if (mv.y < low) { mv.y += range; } else if (mv.y > high) { mv.y -= range; } ret_mv->x = mv.x; ret_mv->y = mv.y; } /* We use this when decoder runs interlaced -> different prediction */ static void get_motion_vector_interlaced(DECODER * dec, Bitstream * bs, int x, int y, int k, MACROBLOCK *pMB, int fcode, const int bound) { const int scale_fac = 1 << (fcode - 1); const int high = (32 * scale_fac) - 1; 
const int low = ((-32) * scale_fac); const int range = (64 * scale_fac); /* Get interlaced prediction */ const VECTOR pmv=get_pmv2_interlaced(dec->mbs,dec->mb_width,bound,x,y,k); VECTOR mv,mvf1,mvf2; if(!pMB->field_pred) { mv.x = get_mv(bs,fcode); mv.y = get_mv(bs,fcode); mv.x += pmv.x; mv.y += pmv.y; if(mv.x<low) { mv.x+=range; } else if(mv.x>high) { mv.x-=range; } if (mv.y < low) { mv.y += range; } else if (mv.y > high) { mv.y -= range; } pMB->mvs[0]=pMB->mvs[1]=pMB->mvs[2]=pMB->mvs[3]=mv; } else { mvf1.x = get_mv(bs, fcode); mvf1.y = get_mv(bs, fcode); mvf1.x += pmv.x; mvf1.y = 2*(mvf1.y+pmv.y/2); /* It's multiple of 2 */ if (mvf1.x < low) { mvf1.x += range; } else if (mvf1.x > high) { mvf1.x -= range; } if (mvf1.y < low) { mvf1.y += range; } else if (mvf1.y > high) { mvf1.y -= range; } mvf2.x = get_mv(bs, fcode); mvf2.y = get_mv(bs, fcode); mvf2.x += pmv.x; mvf2.y = 2*(mvf2.y+pmv.y/2); /* It's multiple of 2 */ if (mvf2.x < low) { mvf2.x += range; } else if (mvf2.x > high) { mvf2.x -= range; } if (mvf2.y < low) { mvf2.y += range; } else if (mvf2.y > high) { mvf2.y -= range; } pMB->mvs[0]=mvf1; pMB->mvs[1]=mvf2; pMB->mvs[2].x=pMB->mvs[3].x=0; pMB->mvs[2].y=pMB->mvs[3].y=0; /* Calculate the average, as it is field predicted */ pMB->mvs_avg.x=DIV2ROUND(pMB->mvs[0].x+pMB->mvs[1].x); pMB->mvs_avg.y=DIV2ROUND(pMB->mvs[0].y+pMB->mvs[1].y); } } /* for P_VOP set gmc_warp to NULL */ static void decoder_pframe(DECODER * dec, Bitstream * bs, int rounding, int quant, int fcode, int intra_dc_threshold, const WARPPOINTS *const gmc_warp) { uint32_t x, y; uint32_t bound; int cp_mb, st_mb; const uint32_t mb_width = dec->mb_width; const uint32_t mb_height = dec->mb_height; if (!dec->is_edged[0]) { start_timer(); image_setedges(&dec->refn[0], dec->edged_width, dec->edged_height, dec->width, dec->height, dec->bs_version); dec->is_edged[0] = 1; stop_edges_timer(); } if (gmc_warp) { /* accuracy: 0==1/2, 1=1/4, 2=1/8, 3=1/16 */ generate_GMCparameters( dec->sprite_warping_points, dec->sprite_warping_accuracy, gmc_warp,
dec->width, dec->height, &dec->new_gmc_data); /* image warping is done block-based in decoder_mbgmc(), now */ } bound = 0; for (y = 0; y < mb_height; y++) { cp_mb = st_mb = 0; for (x = 0; x < mb_width; x++) { MACROBLOCK *mb; /* skip stuffing */ while (BitstreamShowBits(bs, 10) == 1) BitstreamSkip(bs, 10); if (check_resync_marker(bs, fcode - 1)) { bound = read_video_packet_header(bs, dec, fcode - 1, &quant, &fcode, NULL, &intra_dc_threshold); x = bound % mb_width; y = MIN((bound / mb_width), (mb_height-1)); } mb = &dec->mbs[y * dec->mb_width + x]; DPRINTF(XVID_DEBUG_MB, "macroblock (%i,%i) %08x\n", x, y, BitstreamShowBits(bs, 32)); if (!(BitstreamGetBit(bs))) { /* block _is_ coded */ uint32_t mcbpc, cbpc, cbpy, cbp; uint32_t intra, acpred_flag = 0; int mcsel = 0; /* mcsel: '0'=local motion, '1'=GMC */ cp_mb++; mcbpc = get_mcbpc_inter(bs); mb->mode = mcbpc & 7; cbpc = (mcbpc >> 4); DPRINTF(XVID_DEBUG_MB, "mode %i\n", mb->mode); DPRINTF(XVID_DEBUG_MB, "cbpc %i\n", cbpc); intra = (mb->mode == MODE_INTRA || mb->mode == MODE_INTRA_Q); if (gmc_warp && (mb->mode == MODE_INTER || mb->mode == MODE_INTER_Q)) mcsel = BitstreamGetBit(bs); else if (intra) acpred_flag = BitstreamGetBit(bs); cbpy = get_cbpy(bs, intra); DPRINTF(XVID_DEBUG_MB, "cbpy %i mcsel %i \n", cbpy,mcsel); cbp = (cbpy << 2) | cbpc; if (mb->mode == MODE_INTER_Q || mb->mode == MODE_INTRA_Q) { int dquant = dquant_table[BitstreamGetBits(bs, 2)]; DPRINTF(XVID_DEBUG_MB, "dquant %i\n", dquant); quant += dquant; if (quant > 31) { quant = 31; } else if (quant < 1) { quant = 1; } DPRINTF(XVID_DEBUG_MB, "quant %i\n", quant); } mb->quant = quant; mb->field_pred=0; if (dec->interlacing) { if (cbp || intra) { mb->field_dct = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB,"decp: field_dct: %i\n", mb->field_dct); } if ((mb->mode == MODE_INTER || mb->mode == MODE_INTER_Q) && !mcsel) { mb->field_pred = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB, "decp: field_pred: %i\n", mb->field_pred); if (mb->field_pred) { mb->field_for_top = 
BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB,"decp: field_for_top: %i\n", mb->field_for_top); mb->field_for_bot = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB,"decp: field_for_bot: %i\n", mb->field_for_bot); } } } if (mcsel) { decoder_mbgmc(dec, mb, x, y, fcode, cbp, bs, rounding); continue; } else if (mb->mode == MODE_INTER || mb->mode == MODE_INTER_Q) { if(dec->interlacing) { /* Get motion vectors interlaced, field_pred is handled there */ get_motion_vector_interlaced(dec, bs, x, y, 0, mb, fcode, bound); } else { get_motion_vector(dec, bs, x, y, 0, &mb->mvs[0], fcode, bound); mb->mvs[1] = mb->mvs[2] = mb->mvs[3] = mb->mvs[0]; } } else if (mb->mode == MODE_INTER4V ) { /* interlaced missing here */ get_motion_vector(dec, bs, x, y, 0, &mb->mvs[0], fcode, bound); get_motion_vector(dec, bs, x, y, 1, &mb->mvs[1], fcode, bound); get_motion_vector(dec, bs, x, y, 2, &mb->mvs[2], fcode, bound); get_motion_vector(dec, bs, x, y, 3, &mb->mvs[3], fcode, bound); } else { /* MODE_INTRA, MODE_INTRA_Q */ mb->mvs[0].x = mb->mvs[1].x = mb->mvs[2].x = mb->mvs[3].x = 0; mb->mvs[0].y = mb->mvs[1].y = mb->mvs[2].y = mb->mvs[3].y = 0; decoder_mbintra(dec, mb, x, y, acpred_flag, cbp, bs, quant, intra_dc_threshold, bound); continue; } /* See how to decode */ if(!mb->field_pred) decoder_mbinter(dec, mb, x, y, cbp, bs, rounding, 0, 0); else decoder_mbinter_field(dec, mb, x, y, cbp, bs, rounding, 0, 0); } else if (gmc_warp) { /* a not coded S(GMC)-VOP macroblock */ mb->mode = MODE_NOT_CODED_GMC; mb->quant = quant; decoder_mbgmc(dec, mb, x, y, fcode, 0x00, bs, rounding); if(dec->out_frm && cp_mb > 0) { output_slice(&dec->cur, dec->edged_width,dec->width,dec->out_frm,st_mb,y,cp_mb); cp_mb = 0; } st_mb = x+1; } else { /* not coded P_VOP macroblock */ mb->mode = MODE_NOT_CODED; mb->quant = quant; mb->mvs[0].x = mb->mvs[1].x = mb->mvs[2].x = mb->mvs[3].x = 0; mb->mvs[0].y = mb->mvs[1].y = mb->mvs[2].y = mb->mvs[3].y = 0; mb->field_pred=0; /* (!) 
*/ decoder_mbinter(dec, mb, x, y, 0, bs, rounding, 0, 0); if(dec->out_frm && cp_mb > 0) { output_slice(&dec->cur, dec->edged_width,dec->width,dec->out_frm,st_mb,y,cp_mb); cp_mb = 0; } st_mb = x+1; } } if(dec->out_frm && cp_mb > 0) output_slice(&dec->cur, dec->edged_width,dec->width,dec->out_frm,st_mb,y,cp_mb); } } /* decode B-frame motion vector */ static void get_b_motion_vector(Bitstream * bs, VECTOR * mv, int fcode, const VECTOR pmv, const DECODER * const dec, const int x, const int y) { const int scale_fac = 1 << (fcode - 1); const int high = (32 * scale_fac) - 1; const int low = ((-32) * scale_fac); const int range = (64 * scale_fac); int mv_x = get_mv(bs, fcode); int mv_y = get_mv(bs, fcode); mv_x += pmv.x; mv_y += pmv.y; if (mv_x < low) mv_x += range; else if (mv_x > high) mv_x -= range; if (mv_y < low) mv_y += range; else if (mv_y > high) mv_y -= range; mv->x = mv_x; mv->y = mv_y; } /* decode an B-frame direct & interpolate macroblock */ static void decoder_bf_interpolate_mbinter(DECODER * dec, IMAGE forward, IMAGE backward, MACROBLOCK * pMB, const uint32_t x_pos, const uint32_t y_pos, Bitstream * bs, const int direct) { uint32_t stride = dec->edged_width; uint32_t stride2 = stride / 2; int uv_dx, uv_dy; int b_uv_dx, b_uv_dy; uint8_t *pY_Cur, *pU_Cur, *pV_Cur; const uint32_t cbp = pMB->cbp; pY_Cur = dec->cur.y + (y_pos << 4) * stride + (x_pos << 4); pU_Cur = dec->cur.u + (y_pos << 3) * stride2 + (x_pos << 3); pV_Cur = dec->cur.v + (y_pos << 3) * stride2 + (x_pos << 3); validate_vector(pMB->mvs, x_pos, y_pos, dec); validate_vector(pMB->b_mvs, x_pos, y_pos, dec); if (!direct) { uv_dx = pMB->mvs[0].x; uv_dy = pMB->mvs[0].y; b_uv_dx = pMB->b_mvs[0].x; b_uv_dy = pMB->b_mvs[0].y; if (dec->quarterpel) { if (dec->bs_version <= BS_VERSION_BUGGY_CHROMA_ROUNDING) { uv_dx = (uv_dx>>1) | (uv_dx&1); uv_dy = (uv_dy>>1) | (uv_dy&1); b_uv_dx = (b_uv_dx>>1) | (b_uv_dx&1); b_uv_dy = (b_uv_dy>>1) | (b_uv_dy&1); } else { uv_dx /= 2; uv_dy /= 2; b_uv_dx /= 2; b_uv_dy /= 2; } } 
uv_dx = (uv_dx >> 1) + roundtab_79[uv_dx & 0x3]; uv_dy = (uv_dy >> 1) + roundtab_79[uv_dy & 0x3]; b_uv_dx = (b_uv_dx >> 1) + roundtab_79[b_uv_dx & 0x3]; b_uv_dy = (b_uv_dy >> 1) + roundtab_79[b_uv_dy & 0x3]; } else { if (dec->quarterpel) { /* for qpel the /2 shall be done before summation. We've done it right in the encoder in the past. */ /* TODO: figure out if we ever did it wrong on the encoder side. If yes, add some workaround */ if (dec->bs_version <= BS_VERSION_BUGGY_CHROMA_ROUNDING) { int z; uv_dx = 0; uv_dy = 0; b_uv_dx = 0; b_uv_dy = 0; for (z = 0; z < 4; z++) { uv_dx += ((pMB->mvs[z].x>>1) | (pMB->mvs[z].x&1)); uv_dy += ((pMB->mvs[z].y>>1) | (pMB->mvs[z].y&1)); b_uv_dx += ((pMB->b_mvs[z].x>>1) | (pMB->b_mvs[z].x&1)); b_uv_dy += ((pMB->b_mvs[z].y>>1) | (pMB->b_mvs[z].y&1)); } } else { uv_dx = (pMB->mvs[0].x / 2) + (pMB->mvs[1].x / 2) + (pMB->mvs[2].x / 2) + (pMB->mvs[3].x / 2); uv_dy = (pMB->mvs[0].y / 2) + (pMB->mvs[1].y / 2) + (pMB->mvs[2].y / 2) + (pMB->mvs[3].y / 2); b_uv_dx = (pMB->b_mvs[0].x / 2) + (pMB->b_mvs[1].x / 2) + (pMB->b_mvs[2].x / 2) + (pMB->b_mvs[3].x / 2); b_uv_dy = (pMB->b_mvs[0].y / 2) + (pMB->b_mvs[1].y / 2) + (pMB->b_mvs[2].y / 2) + (pMB->b_mvs[3].y / 2); } } else { uv_dx = pMB->mvs[0].x + pMB->mvs[1].x + pMB->mvs[2].x + pMB->mvs[3].x; uv_dy = pMB->mvs[0].y + pMB->mvs[1].y + pMB->mvs[2].y + pMB->mvs[3].y; b_uv_dx = pMB->b_mvs[0].x + pMB->b_mvs[1].x + pMB->b_mvs[2].x + pMB->b_mvs[3].x; b_uv_dy = pMB->b_mvs[0].y + pMB->b_mvs[1].y + pMB->b_mvs[2].y + pMB->b_mvs[3].y; } uv_dx = (uv_dx >> 3) + roundtab_76[uv_dx & 0xf]; uv_dy = (uv_dy >> 3) + roundtab_76[uv_dy & 0xf]; b_uv_dx = (b_uv_dx >> 3) + roundtab_76[b_uv_dx & 0xf]; b_uv_dy = (b_uv_dy >> 3) + roundtab_76[b_uv_dy & 0xf]; } start_timer(); if(dec->quarterpel) { if(!direct) { interpolate16x16_quarterpel(dec->cur.y, forward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos, pMB->mvs[0].x, pMB->mvs[0].y, stride, 0); } else { interpolate8x8_quarterpel(dec->cur.y, 
forward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos, pMB->mvs[0].x, pMB->mvs[0].y, stride, 0); interpolate8x8_quarterpel(dec->cur.y, forward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos + 8, 16*y_pos, pMB->mvs[1].x, pMB->mvs[1].y, stride, 0); interpolate8x8_quarterpel(dec->cur.y, forward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos + 8, pMB->mvs[2].x, pMB->mvs[2].y, stride, 0); interpolate8x8_quarterpel(dec->cur.y, forward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos + 8, 16*y_pos + 8, pMB->mvs[3].x, pMB->mvs[3].y, stride, 0); } } else { interpolate8x8_switch(dec->cur.y, forward.y, 16 * x_pos, 16 * y_pos, pMB->mvs[0].x, pMB->mvs[0].y, stride, 0); interpolate8x8_switch(dec->cur.y, forward.y, 16 * x_pos + 8, 16 * y_pos, pMB->mvs[1].x, pMB->mvs[1].y, stride, 0); interpolate8x8_switch(dec->cur.y, forward.y, 16 * x_pos, 16 * y_pos + 8, pMB->mvs[2].x, pMB->mvs[2].y, stride, 0); interpolate8x8_switch(dec->cur.y, forward.y, 16 * x_pos + 8, 16 * y_pos + 8, pMB->mvs[3].x, pMB->mvs[3].y, stride, 0); } interpolate8x8_switch(dec->cur.u, forward.u, 8 * x_pos, 8 * y_pos, uv_dx, uv_dy, stride2, 0); interpolate8x8_switch(dec->cur.v, forward.v, 8 * x_pos, 8 * y_pos, uv_dx, uv_dy, stride2, 0); if(dec->quarterpel) { if(!direct) { interpolate16x16_add_quarterpel(dec->cur.y, backward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos, pMB->b_mvs[0].x, pMB->b_mvs[0].y, stride, 0); } else { interpolate8x8_add_quarterpel(dec->cur.y, backward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos, pMB->b_mvs[0].x, pMB->b_mvs[0].y, stride, 0); interpolate8x8_add_quarterpel(dec->cur.y, backward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos + 8, 16*y_pos, pMB->b_mvs[1].x, pMB->b_mvs[1].y, stride, 0); interpolate8x8_add_quarterpel(dec->cur.y, backward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos, 16*y_pos + 8, pMB->b_mvs[2].x, 
pMB->b_mvs[2].y, stride, 0); interpolate8x8_add_quarterpel(dec->cur.y, backward.y, dec->qtmp.y, dec->qtmp.y + 64, dec->qtmp.y + 128, 16*x_pos + 8, 16*y_pos + 8, pMB->b_mvs[3].x, pMB->b_mvs[3].y, stride, 0); } } else { interpolate8x8_add_switch(dec->cur.y, backward.y, 16 * x_pos, 16 * y_pos, pMB->b_mvs[0].x, pMB->b_mvs[0].y, stride, 0); interpolate8x8_add_switch(dec->cur.y, backward.y, 16 * x_pos + 8, 16 * y_pos, pMB->b_mvs[1].x, pMB->b_mvs[1].y, stride, 0); interpolate8x8_add_switch(dec->cur.y, backward.y, 16 * x_pos, 16 * y_pos + 8, pMB->b_mvs[2].x, pMB->b_mvs[2].y, stride, 0); interpolate8x8_add_switch(dec->cur.y, backward.y, 16 * x_pos + 8, 16 * y_pos + 8, pMB->b_mvs[3].x, pMB->b_mvs[3].y, stride, 0); } interpolate8x8_add_switch(dec->cur.u, backward.u, 8 * x_pos, 8 * y_pos, b_uv_dx, b_uv_dy, stride2, 0); interpolate8x8_add_switch(dec->cur.v, backward.v, 8 * x_pos, 8 * y_pos, b_uv_dx, b_uv_dy, stride2, 0); stop_comp_timer(); if (cbp) decoder_mb_decode(dec, cbp, bs, pY_Cur, pU_Cur, pV_Cur, pMB); } /* for decode B-frame dbquant */ static __inline int32_t get_dbquant(Bitstream * bs) { if (!BitstreamGetBit(bs)) /* '0' */ return (0); else if (!BitstreamGetBit(bs)) /* '10' */ return (-2); else /* '11' */ return (2); } /* * decode B-frame mb_type * bit ret_value * 1 0 * 01 1 * 001 2 * 0001 3 */ static int32_t __inline get_mbtype(Bitstream * bs) { int32_t mb_type; for (mb_type = 0; mb_type <= 3; mb_type++) if (BitstreamGetBit(bs)) return (mb_type); return -1; } static int __inline get_resync_len_b(const int fcode_backward, const int fcode_forward) { int resync_len = ((fcode_forward>fcode_backward) ? 
fcode_forward : fcode_backward) - 1; if (resync_len < 1) resync_len = 1; return resync_len; } static void decoder_bframe(DECODER * dec, Bitstream * bs, int quant, int fcode_forward, int fcode_backward) { uint32_t x, y; VECTOR mv; const VECTOR zeromv = {0,0}; int i; int resync_len; if (!dec->is_edged[0]) { start_timer(); image_setedges(&dec->refn[0], dec->edged_width, dec->edged_height, dec->width, dec->height, dec->bs_version); dec->is_edged[0] = 1; stop_edges_timer(); } if (!dec->is_edged[1]) { start_timer(); image_setedges(&dec->refn[1], dec->edged_width, dec->edged_height, dec->width, dec->height, dec->bs_version); dec->is_edged[1] = 1; stop_edges_timer(); } resync_len = get_resync_len_b(fcode_backward, fcode_forward); for (y = 0; y < dec->mb_height; y++) { /* Initialize Pred Motion Vector */ dec->p_fmv = dec->p_bmv = zeromv; for (x = 0; x < dec->mb_width; x++) { MACROBLOCK *mb = &dec->mbs[y * dec->mb_width + x]; MACROBLOCK *last_mb = &dec->last_mbs[y * dec->mb_width + x]; int intra_dc_threshold; /* fake variable */ mv = mb->b_mvs[0] = mb->b_mvs[1] = mb->b_mvs[2] = mb->b_mvs[3] = mb->mvs[0] = mb->mvs[1] = mb->mvs[2] = mb->mvs[3] = zeromv; mb->quant = quant; /* * skip if the co-located P_VOP macroblock is not coded; * a macroblock that is not coded in a co-located S_VOP is _not_ * automatically skipped */ if (last_mb->mode == MODE_NOT_CODED) { mb->cbp = 0; mb->mode = MODE_FORWARD; decoder_mbinter(dec, mb, x, y, mb->cbp, bs, 0, 1, 1); continue; } if (check_resync_marker(bs, resync_len)) { int bound = read_video_packet_header(bs, dec, resync_len, &quant, &fcode_forward, &fcode_backward, &intra_dc_threshold); bound = MAX(0, bound-1); /* valid bound must always be >0 */ x = bound % dec->mb_width; y = MIN((bound / dec->mb_width), (dec->mb_height-1)); /* reset predicted macroblocks */ dec->p_fmv = dec->p_bmv = zeromv; /* update resync len with new fcodes */ resync_len = get_resync_len_b(fcode_backward, fcode_forward); continue; /* re-init loop */ } if (!BitstreamGetBit(bs)) { /* 
modb=='0' */ const uint8_t modb2 = BitstreamGetBit(bs); mb->mode = get_mbtype(bs); if (!modb2) /* modb=='00' */ mb->cbp = BitstreamGetBits(bs, 6); else mb->cbp = 0; if (mb->mode && mb->cbp) { quant += get_dbquant(bs); if (quant > 31) quant = 31; else if (quant < 1) quant = 1; } mb->quant = quant; if (dec->interlacing) { if (mb->cbp) { mb->field_dct = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB,"decp: field_dct: %i\n", mb->field_dct); } if (mb->mode) { mb->field_pred = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB, "decp: field_pred: %i\n", mb->field_pred); if (mb->field_pred) { mb->field_for_top = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB,"decp: field_for_top: %i\n", mb->field_for_top); mb->field_for_bot = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_MB,"decp: field_for_bot: %i\n", mb->field_for_bot); } } } } else { mb->mode = MODE_DIRECT_NONE_MV; mb->cbp = 0; } switch (mb->mode) { case MODE_DIRECT: get_b_motion_vector(bs, &mv, 1, zeromv, dec, x, y); case MODE_DIRECT_NONE_MV: for (i = 0; i < 4; i++) { mb->mvs[i].x = last_mb->mvs[i].x*dec->time_bp/dec->time_pp + mv.x; mb->mvs[i].y = last_mb->mvs[i].y*dec->time_bp/dec->time_pp + mv.y; mb->b_mvs[i].x = (mv.x) ? mb->mvs[i].x - last_mb->mvs[i].x : last_mb->mvs[i].x*(dec->time_bp - dec->time_pp)/dec->time_pp; mb->b_mvs[i].y = (mv.y) ? 
mb->mvs[i].y - last_mb->mvs[i].y : last_mb->mvs[i].y*(dec->time_bp - dec->time_pp)/dec->time_pp; } decoder_bf_interpolate_mbinter(dec, dec->refn[1], dec->refn[0], mb, x, y, bs, 1); break; case MODE_INTERPOLATE: get_b_motion_vector(bs, &mb->mvs[0], fcode_forward, dec->p_fmv, dec, x, y); dec->p_fmv = mb->mvs[1] = mb->mvs[2] = mb->mvs[3] = mb->mvs[0]; get_b_motion_vector(bs, &mb->b_mvs[0], fcode_backward, dec->p_bmv, dec, x, y); dec->p_bmv = mb->b_mvs[1] = mb->b_mvs[2] = mb->b_mvs[3] = mb->b_mvs[0]; decoder_bf_interpolate_mbinter(dec, dec->refn[1], dec->refn[0], mb, x, y, bs, 0); break; case MODE_BACKWARD: get_b_motion_vector(bs, &mb->mvs[0], fcode_backward, dec->p_bmv, dec, x, y); dec->p_bmv = mb->mvs[1] = mb->mvs[2] = mb->mvs[3] = mb->mvs[0]; decoder_mbinter(dec, mb, x, y, mb->cbp, bs, 0, 0, 1); break; case MODE_FORWARD: get_b_motion_vector(bs, &mb->mvs[0], fcode_forward, dec->p_fmv, dec, x, y); dec->p_fmv = mb->mvs[1] = mb->mvs[2] = mb->mvs[3] = mb->mvs[0]; decoder_mbinter(dec, mb, x, y, mb->cbp, bs, 0, 1, 1); break; default: DPRINTF(XVID_DEBUG_ERROR,"Not supported B-frame mb_type = %i\n", mb->mode); } } /* End of for */ } } /* perform post processing if necessary, and output the image */ static void decoder_output(DECODER * dec, IMAGE * img, MACROBLOCK * mbs, xvid_dec_frame_t * frame, xvid_dec_stats_t * stats, int coding_type, int quant) { const int brightness = XVID_VERSION_MINOR(frame->version) >= 1 ? 
frame->brightness : 0; if (dec->cartoon_mode) frame->general &= ~XVID_FILMEFFECT; if ((frame->general & (XVID_DEBLOCKY|XVID_DEBLOCKUV|XVID_FILMEFFECT) || brightness!=0) && mbs != NULL) /* post process */ { /* note: image is stored to tmp */ image_copy(&dec->tmp, img, dec->edged_width, dec->height); image_postproc(&dec->postproc, &dec->tmp, dec->edged_width, mbs, dec->mb_width, dec->mb_height, dec->mb_width, frame->general, brightness, dec->frames, (coding_type == B_VOP), dec->num_threads); img = &dec->tmp; } image_output(img, dec->width, dec->height, dec->edged_width, (uint8_t**)frame->output.plane, frame->output.stride, frame->output.csp, dec->interlacing); if (stats) { stats->type = coding2type(coding_type); stats->data.vop.time_base = (int)dec->time_base; stats->data.vop.time_increment = 0; /* XXX: todo */ stats->data.vop.qscale_stride = dec->mb_width; stats->data.vop.qscale = dec->qscale; if (stats->data.vop.qscale != NULL && mbs != NULL) { unsigned int i; for (i = 0; i < dec->mb_width*dec->mb_height; i++) stats->data.vop.qscale[i] = mbs[i].quant; } else stats->data.vop.qscale = NULL; } } int decoder_decode(DECODER * dec, xvid_dec_frame_t * frame, xvid_dec_stats_t * stats) { Bitstream bs; uint32_t rounding; uint32_t quant = 2; uint32_t fcode_forward; uint32_t fcode_backward; uint32_t intra_dc_threshold; WARPPOINTS gmc_warp; int coding_type; int success, output, seen_something; if (XVID_VERSION_MAJOR(frame->version) != 1 || (stats && XVID_VERSION_MAJOR(stats->version) != 1)) /* v1.x.x */ return XVID_ERR_VERSION; start_global_timer(); dec->low_delay_default = (frame->general & XVID_LOWDELAY); if ((frame->general & XVID_DISCONTINUITY)) dec->frames = 0; dec->out_frm = (frame->output.csp == XVID_CSP_SLICE) ? 
&frame->output : NULL; if(frame->length<0) { /* decoder flush */ int ret; /* if not decoding "low_delay/packed", and this isn't low_delay and we have a reference frame, then output the reference frame */ if (!(dec->low_delay_default && dec->packed_mode) && !dec->low_delay && dec->frames>0) { decoder_output(dec, &dec->refn[0], dec->last_mbs, frame, stats, dec->last_coding_type, quant); dec->frames = 0; ret = 0; } else { if (stats) stats->type = XVID_TYPE_NOTHING; ret = XVID_ERR_END; } emms(); stop_global_timer(); return ret; } BitstreamInit(&bs, frame->bitstream, frame->length); /* XXX: 0x7f is only valid whilst decoding vfw xvid/divx5 avi's */ if(dec->low_delay_default && frame->length == 1 && BitstreamShowBits(&bs, 8) == 0x7f) { image_output(&dec->refn[0], dec->width, dec->height, dec->edged_width, (uint8_t**)frame->output.plane, frame->output.stride, frame->output.csp, dec->interlacing); if (stats) stats->type = XVID_TYPE_NOTHING; emms(); return 1; /* one byte consumed */ } success = 0; output = 0; seen_something = 0; repeat: coding_type = BitstreamReadHeaders(&bs, dec, &rounding, &quant, &fcode_forward, &fcode_backward, &intra_dc_threshold, &gmc_warp); DPRINTF(XVID_DEBUG_HEADER, "coding_type=%i, packed=%i, time=%" #if defined(_MSC_VER) "I64" #else "ll" #endif "i, time_pp=%i, time_bp=%i\n", coding_type, dec->packed_mode, dec->time, dec->time_pp, dec->time_bp); if (coding_type == -1) { /* nothing */ if (success) goto done; if (stats) stats->type = XVID_TYPE_NOTHING; emms(); return BitstreamPos(&bs)/8; } if (coding_type == -2 || coding_type == -3) { /* vol and/or resize */ if (coding_type == -3) if (decoder_resize(dec)) return XVID_ERR_MEMORY; if(stats) { stats->type = XVID_TYPE_VOL; stats->data.vol.general = 0; /*XXX: if (dec->interlacing) stats->data.vol.general |= ++INTERLACING; */ stats->data.vol.width = dec->width; stats->data.vol.height = dec->height; stats->data.vol.par = dec->aspect_ratio; stats->data.vol.par_width = dec->par_width; 
stats->data.vol.par_height = dec->par_height; emms(); return BitstreamPos(&bs)/8; /* number of bytes consumed */ } goto repeat; } if(dec->frames == 0 && coding_type != I_VOP) { /* 1st frame is not an i-vop */ goto repeat; } dec->p_bmv.x = dec->p_bmv.y = dec->p_fmv.x = dec->p_fmv.y = 0; /* init pred vector to 0 */ /* packed_mode: special-N_VOP treatment */ if (dec->packed_mode && coding_type == N_VOP) { if (dec->low_delay_default && dec->frames > 0) { decoder_output(dec, &dec->refn[0], dec->last_mbs, frame, stats, dec->last_coding_type, quant); output = 1; } /* ignore otherwise */ } else if (coding_type != B_VOP) { switch(coding_type) { case I_VOP : decoder_iframe(dec, &bs, quant, intra_dc_threshold); break; case P_VOP : decoder_pframe(dec, &bs, rounding, quant, fcode_forward, intra_dc_threshold, NULL); break; case S_VOP : decoder_pframe(dec, &bs, rounding, quant, fcode_forward, intra_dc_threshold, &gmc_warp); break; case N_VOP : /* XXX: not_coded vops are not used for forward prediction */ /* we should not swap(last_mbs,mbs) */ image_copy(&dec->cur, &dec->refn[0], dec->edged_width, dec->height); SWAP(MACROBLOCK *, dec->mbs, dec->last_mbs); /* it will be swapped back */ break; } /* note: for packed_mode, output is performed when the special-N_VOP is decoded */ if (!(dec->low_delay_default && dec->packed_mode)) { if(dec->low_delay) { decoder_output(dec, &dec->cur, dec->mbs, frame, stats, coding_type, quant); output = 1; } else if (dec->frames > 0) { /* is the reference frame valid? 
*/ /* output the reference frame */ decoder_output(dec, &dec->refn[0], dec->last_mbs, frame, stats, dec->last_coding_type, quant); output = 1; } } image_swap(&dec->refn[0], &dec->refn[1]); dec->is_edged[1] = dec->is_edged[0]; image_swap(&dec->cur, &dec->refn[0]); dec->is_edged[0] = 0; SWAP(MACROBLOCK *, dec->mbs, dec->last_mbs); dec->last_coding_type = coding_type; dec->frames++; seen_something = 1; } else { /* B_VOP */ if (dec->low_delay) { DPRINTF(XVID_DEBUG_ERROR, "warning: bvop found in low_delay==1 stream\n"); dec->low_delay = 0; } if (dec->frames < 2) { /* attempting to decode a bvop without at least 2 reference frames */ image_printf(&dec->cur, dec->edged_width, dec->height, 16, 16, "broken b-frame, mising ref frames"); if (stats) stats->type = XVID_TYPE_NOTHING; } else if (dec->time_pp <= dec->time_bp) { /* this occurs when dx50_bvop_compatibility==0 sequences are decoded in vfw. */ image_printf(&dec->cur, dec->edged_width, dec->height, 16, 16, "broken b-frame, tpp=%i tbp=%i", dec->time_pp, dec->time_bp); if (stats) stats->type = XVID_TYPE_NOTHING; } else { decoder_bframe(dec, &bs, quant, fcode_forward, fcode_backward); decoder_output(dec, &dec->cur, dec->mbs, frame, stats, coding_type, quant); } output = 1; dec->frames++; } #if 0 /* Avoids reading too much data because of 32bit reads in our BS functions */ BitstreamByteAlign(&bs); #endif /* low_delay_default mode: repeat in packed_mode */ if (dec->low_delay_default && dec->packed_mode && output == 0 && success == 0) { success = 1; goto repeat; } done : /* if we reach here without outputting anything _and_ the calling application has specified low_delay_default, we *must* output something. this always occurs on the first call to decode() when bframes are present in the bitstream. it may also occur if no vops were seen in the bitstream. if packed_mode is enabled, then we output the recently decoded frame (the very first ivop). otherwise we have nothing to display, and therefore output a black screen. 
*/ if (dec->low_delay_default && output == 0) { if (dec->packed_mode && seen_something) { decoder_output(dec, &dec->refn[0], dec->last_mbs, frame, stats, dec->last_coding_type, quant); } else { image_clear(&dec->cur, dec->width, dec->height, dec->edged_width, 0, 128, 128); decoder_output(dec, &dec->cur, NULL, frame, stats, P_VOP, quant); if (stats) stats->type = XVID_TYPE_NOTHING; } } emms(); stop_global_timer(); return (BitstreamPos(&bs)+7)/8; /* number of bytes consumed */ } /* xvidcore/src/utils/emms.h */ /***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - emms related header - * * Copyright(C) 2002 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: emms.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _EMMS_H_ #define _EMMS_H_ /***************************************************************************** * emms API ****************************************************************************/ typedef void (emmsFunc) (); typedef emmsFunc *emmsFuncPtr; /* Our global function pointer - defined in emms.c */ extern emmsFuncPtr emms; /* Implemented functions */ emmsFunc emms_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) emmsFunc emms_mmx; emmsFunc emms_3dn; #endif /***************************************************************************** * Prototypes ****************************************************************************/ #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) /* cpu_flag detection helper functions */ extern int check_cpu_features(void); extern void sse_os_trigger(void); extern void sse2_os_trigger(void); #endif #if defined(ARCH_IS_X86_64) && defined(WIN32) extern void prime_xmm(void *); extern void get_xmm(void *); #endif #ifdef ARCH_IS_PPC extern void altivec_trigger(void); #endif #endif /* xvidcore/src/utils/mem_align.h */ /***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Aligned Memory Allocator header - * * Copyright(C) 2002-2003 Edouard Gomez * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. 
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mem_align.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _MEM_ALIGN_H_ #define _MEM_ALIGN_H_ #include "../portab.h" void *xvid_malloc(size_t size, uint8_t alignment); void xvid_free(void *mem_ptr); #endif /* _MEM_ALIGN_H_ */ /* xvidcore/src/utils/mbtransquant.c */ /***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - MB Transfer/Quantization functions - * * Copyright(C) 2001-2010 Peter Ross * 2001-2010 Michael Militzer * 2003 Edouard Gomez * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mbtransquant.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "../portab.h" #include "mbfunctions.h" #include "../global.h" #include "mem_transfer.h" #include "timer.h" #include "../bitstream/mbcoding.h" #include "../bitstream/zigzag.h" #include "../dct/fdct.h" #include "../dct/idct.h" #include "../quant/quant.h" #include "../motion/sad.h" #include "../encoder.h" #include "../quant/quant_matrix.h" MBFIELDTEST_PTR MBFieldTest; /* * Skip blocks having a coefficient sum below this value. This value will be * corrected according to the MB quantizer to avoid artifacts for quant==1 */ #define PVOP_TOOSMALL_LIMIT 1 #define BVOP_TOOSMALL_LIMIT 3 /***************************************************************************** * Local functions ****************************************************************************/ /* permute block and return field dct choice */ static __inline uint32_t MBDecideFieldDCT(int16_t data[6 * 64]) { uint32_t field = MBFieldTest(data); if (field) MBFrameToField(data); return field; } /* Performs Forward DCT on all blocks */ static __inline void MBfDCT(const MBParam * const pParam, const FRAMEINFO * const frame, MACROBLOCK * const pMB, uint32_t x_pos, uint32_t y_pos, int16_t data[6 * 64]) { /* Handles interlacing */ start_timer(); pMB->field_dct = 0; if ((frame->vol_flags & XVID_VOL_INTERLACING) && (x_pos>0) && (x_pos<pParam->mb_width-1) && (y_pos>0) && (y_pos<pParam->mb_height-1)) { pMB->field_dct = MBDecideFieldDCT(data); } stop_interlacing_timer(); /* Perform DCT */ start_timer(); fdct((short * const)&data[0 * 64]); fdct((short * const)&data[1 * 64]); fdct((short * const)&data[2 * 64]); fdct((short * const)&data[3 * 64]); fdct((short * 
const)&data[4 * 64]); fdct((short * const)&data[5 * 64]); stop_dct_timer(); } /* Performs Inverse DCT on all blocks */ static __inline void MBiDCT(int16_t data[6 * 64], const uint8_t cbp) { start_timer(); if(cbp & (1 << (5 - 0))) idct((short * const)&data[0 * 64]); if(cbp & (1 << (5 - 1))) idct((short * const)&data[1 * 64]); if(cbp & (1 << (5 - 2))) idct((short * const)&data[2 * 64]); if(cbp & (1 << (5 - 3))) idct((short * const)&data[3 * 64]); if(cbp & (1 << (5 - 4))) idct((short * const)&data[4 * 64]); if(cbp & (1 << (5 - 5))) idct((short * const)&data[5 * 64]); stop_idct_timer(); } /* Quantize all blocks -- Intra mode */ static __inline void MBQuantIntra(const MBParam * pParam, const FRAMEINFO * const frame, const MACROBLOCK * pMB, int16_t qcoeff[6 * 64], int16_t data[6*64]) { int scaler_lum, scaler_chr; quant_intraFuncPtr quant; /* check if quant matrices need to be re-initialized with new quant */ if (pParam->vol_flags & XVID_VOL_MPEGQUANT) { if (pParam->last_quant_initialized_intra != pMB->quant) { init_intra_matrix(pParam->mpeg_quant_matrices, pMB->quant); } quant = quant_mpeg_intra; } else { quant = quant_h263_intra; } scaler_lum = get_dc_scaler(pMB->quant, 1); scaler_chr = get_dc_scaler(pMB->quant, 0); /* Quantize the block */ start_timer(); quant(&data[0 * 64], &qcoeff[0 * 64], pMB->quant, scaler_lum, pParam->mpeg_quant_matrices); quant(&data[1 * 64], &qcoeff[1 * 64], pMB->quant, scaler_lum, pParam->mpeg_quant_matrices); quant(&data[2 * 64], &qcoeff[2 * 64], pMB->quant, scaler_lum, pParam->mpeg_quant_matrices); quant(&data[3 * 64], &qcoeff[3 * 64], pMB->quant, scaler_lum, pParam->mpeg_quant_matrices); quant(&data[4 * 64], &qcoeff[4 * 64], pMB->quant, scaler_chr, pParam->mpeg_quant_matrices); quant(&data[5 * 64], &qcoeff[5 * 64], pMB->quant, scaler_chr, pParam->mpeg_quant_matrices); stop_quant_timer(); } /* DeQuantize all blocks -- Intra mode */ static __inline void MBDeQuantIntra(const MBParam * pParam, const int iQuant, int16_t qcoeff[6 * 64], int16_t 
data[6*64]) { int mpeg; int scaler_lum, scaler_chr; quant_intraFuncPtr const dequant[2] = { dequant_h263_intra, dequant_mpeg_intra }; mpeg = !!(pParam->vol_flags & XVID_VOL_MPEGQUANT); scaler_lum = get_dc_scaler(iQuant, 1); scaler_chr = get_dc_scaler(iQuant, 0); start_timer(); dequant[mpeg](&qcoeff[0 * 64], &data[0 * 64], iQuant, scaler_lum, pParam->mpeg_quant_matrices); dequant[mpeg](&qcoeff[1 * 64], &data[1 * 64], iQuant, scaler_lum, pParam->mpeg_quant_matrices); dequant[mpeg](&qcoeff[2 * 64], &data[2 * 64], iQuant, scaler_lum, pParam->mpeg_quant_matrices); dequant[mpeg](&qcoeff[3 * 64], &data[3 * 64], iQuant, scaler_lum, pParam->mpeg_quant_matrices); dequant[mpeg](&qcoeff[4 * 64], &data[4 * 64], iQuant, scaler_chr, pParam->mpeg_quant_matrices); dequant[mpeg](&qcoeff[5 * 64], &data[5 * 64], iQuant, scaler_chr, pParam->mpeg_quant_matrices); stop_iquant_timer(); } static int dct_quantize_trellis_c(int16_t *const Out, const int16_t *const In, int Q, const uint16_t * const Zigzag, const uint16_t * const QuantMatrix, int Non_Zero, int Sum, int Lambda_Mod, const uint32_t rel_var8, const int Metric); /* Quantize all blocks -- Inter mode */ static __inline uint8_t MBQuantInter(const MBParam * pParam, const FRAMEINFO * const frame, const MACROBLOCK * pMB, int16_t data[6 * 64], int16_t qcoeff[6 * 64], int bvop, int limit) { int i; uint8_t cbp = 0; int sum; int code_block, mpeg; quant_interFuncPtr const quant[2] = { quant_h263_inter, quant_mpeg_inter }; mpeg = !!(pParam->vol_flags & XVID_VOL_MPEGQUANT); for (i = 0; i < 6; i++) { /* Quantize the block */ start_timer(); sum = quant[mpeg](&qcoeff[i*64], &data[i*64], pMB->quant, pParam->mpeg_quant_matrices); if(sum && (pMB->quant > 2) && (frame->vop_flags & XVID_VOP_TRELLISQUANT)) { const uint16_t *matrix; const static uint16_t h263matrix[] = { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 
16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16 }; matrix = (mpeg)?get_inter_matrix(pParam->mpeg_quant_matrices):h263matrix; sum = dct_quantize_trellis_c(&qcoeff[i*64], &data[i*64], pMB->quant, &scan_tables[0][0], matrix, 63, sum, pMB->lambda[i], pMB->rel_var8[i], !!(frame->vop_flags & XVID_VOP_RD_PSNRHVSM)); } stop_quant_timer(); /* * We code the block if the sum is higher than the limit and if the first * two AC coefficients in zig zag order are not zero. */ code_block = 0; if ((sum >= limit) || (qcoeff[i*64+1] != 0) || (qcoeff[i*64+8] != 0)) { code_block = 1; } else { if (bvop && (pMB->mode == MODE_DIRECT || pMB->mode == MODE_DIRECT_NO4V)) { /* dark blocks prevention for direct mode */ if ((qcoeff[i*64] < -1) || (qcoeff[i*64] > 0)) code_block = 1; } else { /* not direct mode */ if (qcoeff[i*64] != 0) code_block = 1; } } /* Set the corresponding cbp bit */ cbp |= code_block << (5 - i); } return(cbp); } /* DeQuantize all blocks -- Inter mode */ static __inline void MBDeQuantInter(const MBParam * pParam, const int iQuant, int16_t data[6 * 64], int16_t qcoeff[6 * 64], const uint8_t cbp) { int mpeg; quant_interFuncPtr const dequant[2] = { dequant_h263_inter, dequant_mpeg_inter }; mpeg = !!(pParam->vol_flags & XVID_VOL_MPEGQUANT); start_timer(); if(cbp & (1 << (5 - 0))) dequant[mpeg](&data[0 * 64], &qcoeff[0 * 64], iQuant, pParam->mpeg_quant_matrices); if(cbp & (1 << (5 - 1))) dequant[mpeg](&data[1 * 64], &qcoeff[1 * 64], iQuant, pParam->mpeg_quant_matrices); if(cbp & (1 << (5 - 2))) dequant[mpeg](&data[2 * 64], &qcoeff[2 * 64], iQuant, pParam->mpeg_quant_matrices); if(cbp & (1 << (5 - 3))) dequant[mpeg](&data[3 * 64], &qcoeff[3 * 64], iQuant, pParam->mpeg_quant_matrices); if(cbp & (1 << (5 - 4))) dequant[mpeg](&data[4 * 64], &qcoeff[4 * 64], iQuant, pParam->mpeg_quant_matrices); if(cbp & (1 << (5 - 5))) dequant[mpeg](&data[5 * 64], &qcoeff[5 * 64], iQuant, pParam->mpeg_quant_matrices); stop_iquant_timer(); } typedef void 
(transfer_operation_8to16_t) (int16_t *Dst, const uint8_t *Src, int BpS); typedef void (transfer_operation_16to8_t) (uint8_t *Dst, const int16_t *Src, int BpS); static __inline void MBTrans8to16(const MBParam * const pParam, const FRAMEINFO * const frame, const MACROBLOCK * const pMB, const uint32_t x_pos, const uint32_t y_pos, int16_t data[6 * 64]) { uint32_t stride = pParam->edged_width; uint32_t stride2 = stride / 2; uint32_t next_block = stride * 8; uint8_t *pY_Cur, *pU_Cur, *pV_Cur; const IMAGE * const pCurrent = &frame->image; /* Image pointers */ pY_Cur = pCurrent->y + (y_pos << 4) * stride + (x_pos << 4); pU_Cur = pCurrent->u + (y_pos << 3) * stride2 + (x_pos << 3); pV_Cur = pCurrent->v + (y_pos << 3) * stride2 + (x_pos << 3); /* Do the transfer */ start_timer(); transfer_8to16copy(&data[0 * 64], pY_Cur, stride); transfer_8to16copy(&data[1 * 64], pY_Cur + 8, stride); transfer_8to16copy(&data[2 * 64], pY_Cur + next_block, stride); transfer_8to16copy(&data[3 * 64], pY_Cur + next_block + 8, stride); transfer_8to16copy(&data[4 * 64], pU_Cur, stride2); transfer_8to16copy(&data[5 * 64], pV_Cur, stride2); stop_transfer_timer(); } static __inline void MBTrans16to8(const MBParam * const pParam, const FRAMEINFO * const frame, const MACROBLOCK * const pMB, const uint32_t x_pos, const uint32_t y_pos, int16_t data[6 * 64], const uint32_t add, /* Must be 1 or 0 */ const uint8_t cbp) { uint8_t *pY_Cur, *pU_Cur, *pV_Cur; uint32_t stride = pParam->edged_width; uint32_t stride2 = stride / 2; uint32_t next_block = stride * 8; const IMAGE * const pCurrent = &frame->image; /* Array of function pointers, indexed by [add] */ transfer_operation_16to8_t * const functions[2] = { (transfer_operation_16to8_t*)transfer_16to8copy, (transfer_operation_16to8_t*)transfer_16to8add, }; transfer_operation_16to8_t *transfer_op = NULL; /* Image pointers */ pY_Cur = pCurrent->y + (y_pos << 4) * stride + (x_pos << 4); pU_Cur = pCurrent->u + (y_pos << 3) * stride2 + (x_pos << 3); pV_Cur = 
pCurrent->v + (y_pos << 3) * stride2 + (x_pos << 3); if (pMB->field_dct) { next_block = stride; stride *= 2; } /* Operation function */ transfer_op = functions[add]; /* Do the operation */ start_timer(); if (cbp&32) transfer_op(pY_Cur, &data[0 * 64], stride); if (cbp&16) transfer_op(pY_Cur + 8, &data[1 * 64], stride); if (cbp& 8) transfer_op(pY_Cur + next_block, &data[2 * 64], stride); if (cbp& 4) transfer_op(pY_Cur + next_block + 8, &data[3 * 64], stride); if (cbp& 2) transfer_op(pU_Cur, &data[4 * 64], stride2); if (cbp& 1) transfer_op(pV_Cur, &data[5 * 64], stride2); stop_transfer_timer(); } /***************************************************************************** * Module functions ****************************************************************************/ void MBTransQuantIntra(const MBParam * const pParam, const FRAMEINFO * const frame, MACROBLOCK * const pMB, const uint32_t x_pos, const uint32_t y_pos, int16_t data[6 * 64], int16_t qcoeff[6 * 64]) { /* Transfer data */ MBTrans8to16(pParam, frame, pMB, x_pos, y_pos, data); /* Perform DCT (and field decision) */ MBfDCT(pParam, frame, pMB, x_pos, y_pos, data); /* Quantize the block */ MBQuantIntra(pParam, frame, pMB, data, qcoeff); /* DeQuantize the block */ MBDeQuantIntra(pParam, pMB->quant, data, qcoeff); /* Perform inverse DCT*/ MBiDCT(data, 0x3F); /* Transfer back the data -- Don't add data */ MBTrans16to8(pParam, frame, pMB, x_pos, y_pos, data, 0, 0x3F); } uint8_t MBTransQuantInter(const MBParam * const pParam, const FRAMEINFO * const frame, MACROBLOCK * const pMB, const uint32_t x_pos, const uint32_t y_pos, int16_t data[6 * 64], int16_t qcoeff[6 * 64]) { uint8_t cbp; uint32_t limit; /* There is no MBTrans8to16 for Inter block, that's done in motion compensation * already */ /* Perform DCT (and field decision) */ MBfDCT(pParam, frame, pMB, x_pos, y_pos, data); /* Set the limit threshold */ limit = PVOP_TOOSMALL_LIMIT + ((pMB->quant == 1)? 
1 : 0); if (frame->vop_flags & XVID_VOP_CARTOON) limit *= 3; /* Quantize the block */ cbp = MBQuantInter(pParam, frame, pMB, data, qcoeff, 0, limit); /* DeQuantize the block */ MBDeQuantInter(pParam, pMB->quant, data, qcoeff, cbp); /* Perform inverse DCT*/ MBiDCT(data, cbp); /* Transfer back the data -- Add the data */ MBTrans16to8(pParam, frame, pMB, x_pos, y_pos, data, 1, cbp); return(cbp); } uint8_t MBTransQuantInterBVOP(const MBParam * pParam, FRAMEINFO * frame, MACROBLOCK * pMB, const uint32_t x_pos, const uint32_t y_pos, int16_t data[6 * 64], int16_t qcoeff[6 * 64]) { uint8_t cbp; uint32_t limit; /* There is no MBTrans8to16 for Inter block, that's done in motion compensation * already */ /* Perform DCT (and field decision) */ MBfDCT(pParam, frame, pMB, x_pos, y_pos, data); /* Set the limit threshold */ limit = BVOP_TOOSMALL_LIMIT; if (frame->vop_flags & XVID_VOP_CARTOON) limit *= 2; /* Quantize the block */ cbp = MBQuantInter(pParam, frame, pMB, data, qcoeff, 1, limit); /* * History comment: * We don't have to DeQuant, iDCT and Transfer back data for B-frames. 
* * BUT some plugins require the rebuilt original frame to be passed so we * have to take care of that here */ if((pParam->plugin_flags & XVID_REQORIGINAL)) { /* DeQuantize the block */ MBDeQuantInter(pParam, pMB->quant, data, qcoeff, cbp); /* Perform inverse DCT*/ MBiDCT(data, cbp); /* Transfer back the data -- Add the data */ MBTrans16to8(pParam, frame, pMB, x_pos, y_pos, data, 1, cbp); } return(cbp); } /* if sum(diff between field lines) < sum(diff between frame lines), use field dct */ uint32_t MBFieldTest_c(int16_t data[6 * 64]) { const uint8_t blocks[] = { 0 * 64, 0 * 64, 0 * 64, 0 * 64, 2 * 64, 2 * 64, 2 * 64, 2 * 64 }; const uint8_t lines[] = { 0, 16, 32, 48, 0, 16, 32, 48 }; int frame = 0, field = 0; int i, j; for (i = 0; i < 7; ++i) { for (j = 0; j < 8; ++j) { frame += abs(data[0 * 64 + (i + 1) * 8 + j] - data[0 * 64 + i * 8 + j]); frame += abs(data[1 * 64 + (i + 1) * 8 + j] - data[1 * 64 + i * 8 + j]); frame += abs(data[2 * 64 + (i + 1) * 8 + j] - data[2 * 64 + i * 8 + j]); frame += abs(data[3 * 64 + (i + 1) * 8 + j] - data[3 * 64 + i * 8 + j]); field += abs(data[blocks[i + 1] + lines[i + 1] + j] - data[blocks[i] + lines[i] + j]); field += abs(data[blocks[i + 1] + lines[i + 1] + 8 + j] - data[blocks[i] + lines[i] + 8 + j]); field += abs(data[blocks[i + 1] + 64 + lines[i + 1] + j] - data[blocks[i] + 64 + lines[i] + j]); field += abs(data[blocks[i + 1] + 64 + lines[i + 1] + 8 + j] - data[blocks[i] + 64 + lines[i] + 8 + j]); } } return (frame >= (field + 350)); } /* deinterlace Y blocks vertically */ #define MOVLINE(X,Y) memcpy(X, Y, sizeof(tmp)) #define LINE(X,Y) &data[X*64 + Y*8] void MBFrameToField(int16_t data[6 * 64]) { int16_t tmp[8]; /* left blocks */ /* 1=2, 2=4, 4=8, 8=1 */ MOVLINE(tmp, LINE(0, 1)); MOVLINE(LINE(0, 1), LINE(0, 2)); MOVLINE(LINE(0, 2), LINE(0, 4)); MOVLINE(LINE(0, 4), LINE(2, 0)); MOVLINE(LINE(2, 0), tmp); /* 3=6, 6=12, 12=9, 9=3 */ MOVLINE(tmp, LINE(0, 3)); MOVLINE(LINE(0, 3), LINE(0, 6)); MOVLINE(LINE(0, 6), LINE(2, 4)); 
MOVLINE(LINE(2, 4), LINE(2, 1)); MOVLINE(LINE(2, 1), tmp); /* 5=10, 10=5 */ MOVLINE(tmp, LINE(0, 5)); MOVLINE(LINE(0, 5), LINE(2, 2)); MOVLINE(LINE(2, 2), tmp); /* 7=14, 14=13, 13=11, 11=7 */ MOVLINE(tmp, LINE(0, 7)); MOVLINE(LINE(0, 7), LINE(2, 6)); MOVLINE(LINE(2, 6), LINE(2, 5)); MOVLINE(LINE(2, 5), LINE(2, 3)); MOVLINE(LINE(2, 3), tmp); /* right blocks */ /* 1=2, 2=4, 4=8, 8=1 */ MOVLINE(tmp, LINE(1, 1)); MOVLINE(LINE(1, 1), LINE(1, 2)); MOVLINE(LINE(1, 2), LINE(1, 4)); MOVLINE(LINE(1, 4), LINE(3, 0)); MOVLINE(LINE(3, 0), tmp); /* 3=6, 6=12, 12=9, 9=3 */ MOVLINE(tmp, LINE(1, 3)); MOVLINE(LINE(1, 3), LINE(1, 6)); MOVLINE(LINE(1, 6), LINE(3, 4)); MOVLINE(LINE(3, 4), LINE(3, 1)); MOVLINE(LINE(3, 1), tmp); /* 5=10, 10=5 */ MOVLINE(tmp, LINE(1, 5)); MOVLINE(LINE(1, 5), LINE(3, 2)); MOVLINE(LINE(3, 2), tmp); /* 7=14, 14=13, 13=11, 11=7 */ MOVLINE(tmp, LINE(1, 7)); MOVLINE(LINE(1, 7), LINE(3, 6)); MOVLINE(LINE(3, 6), LINE(3, 5)); MOVLINE(LINE(3, 5), LINE(3, 3)); MOVLINE(LINE(3, 3), tmp); } /***************************************************************************** * Trellis based R-D optimal quantization * * Trellis Quant code (C) 2003 Pascal Massimino skal(at)planet-d.net * ****************************************************************************/ /*---------------------------------------------------------------------------- * * Trellis-Based quantization * * As far as I understand this paper: * * "Trellis-Based R-D Optimal Quantization in H.263+" * J.Wen, M.Luttrell, J.Villasenor * IEEE Transactions on Image Processing, Vol.9, No.8, Aug. 2000. * * we are dealing with a simplified Bellman-Ford / Dijkstra Single * Source Shortest Path algo. But due to the underlying graph structure * ("Trellis"), it can be turned into a dynamic programming algo, * partially saving the explicit graph's nodes representation. And * without using a heap, since the open frontier of the DAG is always * known, and of fixed size. 
*--------------------------------------------------------------------------*/ /* Codes lengths for relevant levels. */ /* let's factorize: */ static const uint8_t Code_Len0[64] = { 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len1[64] = { 20,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len2[64] = { 19,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len3[64] = { 18,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len4[64] = { 17,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len5[64] = { 16,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len6[64] = { 15,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len7[64] = { 13,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 
30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len8[64] = { 11,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len9[64] = { 12,21,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len10[64] = { 12,20,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len11[64] = { 12,19,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len12[64] = { 11,17,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len13[64] = { 11,15,21,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len14[64] = { 10,12,19,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len15[64] = { 10,13,17,19,21,21,21,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const 
uint8_t Code_Len16[64] = { 9,12,13,18,18,19,19,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30}; static const uint8_t Code_Len17[64] = { 8,11,13,14,14,14,15,19,19,19,21,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len18[64] = { 7, 9,11,11,13,13,13,15,15,15,16,22,22,22,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len19[64] = { 5, 7, 9,10,10,11,11,11,11,11,13,14,16,17,17,18,18,18,18,18,18,18,18,20,20,21,21,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30 }; static const uint8_t Code_Len20[64] = { 3, 4, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9,10,10,10,10,10,10,10,10,12,12,13,13,12,13,14,15,15, 15,16,16,16,16,17,17,17,18,18,19,19,19,19,19,19,19,19,21,21,22,22,30,30,30,30,30,30,30,30,30,30 }; /* a few more table for LAST table: */ static const uint8_t Code_Len21[64] = { 13,20,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30}; static const uint8_t Code_Len22[64] = { 12,15,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30}; static const uint8_t Code_Len23[64] = { 10,12,15,15,15,16,16,16,16,17,17,17,17,17,17,17,17,18,18,18,18,18,18,18,18,19,19,19,19,20,20,20, 20,21,21,21,21,21,21,21,21,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30}; static const uint8_t Code_Len24[64] = { 5, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 
9,10,10,10,10,10,10,10,10,11,11,11,11,12,12,12, 12,13,13,13,13,13,13,13,13,14,16,16,16,16,17,17,17,17,18,18,18,18,18,18,18,18,19,19,19,19,19,19}; static const uint8_t * const B16_17_Code_Len[24] = { /* levels [1..24] */ Code_Len20,Code_Len19,Code_Len18,Code_Len17, Code_Len16,Code_Len15,Code_Len14,Code_Len13, Code_Len12,Code_Len11,Code_Len10,Code_Len9, Code_Len8, Code_Len7 ,Code_Len6 ,Code_Len5, Code_Len4, Code_Len3, Code_Len3 ,Code_Len2, Code_Len2, Code_Len1, Code_Len1, Code_Len1, }; static const uint8_t * const B16_17_Code_Len_Last[6] = { /* levels [1..6] */ Code_Len24,Code_Len23,Code_Len22,Code_Len21, Code_Len3, Code_Len1, }; /* TL_SHIFT controls the precision of the RD optimizations in trellis * valid range is [10..16]. The bigger, the more trellis is vulnerable * to overflows in cost formulas. * - 10 allows ac values up to 2^11 == 2048 * - 16 allows ac values up to 2^8 == 256 */ #define TL_SHIFT 11 #define TL(q) ((0xfe00>>(16-TL_SHIFT))/(q*q)) static const int Trellis_Lambda_Tabs[31] = { TL( 1),TL( 2),TL( 3),TL( 4),TL( 5),TL( 6), TL( 7), TL( 8),TL( 9),TL(10),TL(11),TL(12),TL(13),TL(14), TL(15), TL(16),TL(17),TL(18),TL(19),TL(20),TL(21),TL(22), TL(23), TL(24),TL(25),TL(26),TL(27),TL(28),TL(29),TL(30), TL(31) }; #undef TL static int __inline Find_Last(const int16_t *C, const uint16_t *Zigzag, int i) { while(i>=0) if (C[Zigzag[i]]) return i; else i--; return -1; } #define TRELLIS_MIN_EFFORT 3 static __inline uint32_t calc_mseh(int16_t dQ, uint16_t mask, const int index, const int Lambda) { uint32_t t = (mask * Inv_iMask_Coeff[index] + 32) >> 7; uint16_t u = abs(dQ) << 4; uint16_t thresh = (t < 65536) ? 
t : 65535; if (u <= thresh) u = 0; /* The error is not perceivable */ else u -= thresh; u = ((u + iCSF_Round[index]) * iCSF_Coeff[index]) >> 16; return (((Lambda*u*u)>>4) + 4*Lambda*dQ*dQ) / 5; } /* this routine has been stripped of all debug code */ static int dct_quantize_trellis_c(int16_t *const Out, const int16_t *const In, int Q, const uint16_t * const Zigzag, const uint16_t * const QuantMatrix, int Non_Zero, int Sum, int Lambda_Mod, const uint32_t rel_var8, const int Metric) { /* Note: We should search last non-zero coeffs on *real* DCT input coeffs * (In[]), not quantized one (Out[]). However, it only improves the result * *very* slightly (~0.01dB), whereas speed drops to crawling level :) * Well, actually, taking 1 more coeff past Non_Zero into account sometimes * helps. */ typedef struct { int16_t Run, Level; } NODE; NODE Nodes[65], Last = { 0, 0}; uint32_t Run_Costs0[64+1]; uint32_t * const Run_Costs = Run_Costs0 + 1; /* it's 1/lambda, actually */ const int Lambda = (Lambda_Mod*Trellis_Lambda_Tabs[Q-1])>>LAMBDA_EXP; int Run_Start = -1; uint32_t Min_Cost = 2<> 6) : 0; /* source (w/ CBP penalty) */ Run_Costs[-1] = 2<>4); const int Mult = 2*q; const int Bias = (q-1) | 1; const int Lev0 = Mult + Bias; const int AC = In[Zigzag[i]]; const int Level1 = Out[Zigzag[i]]; const unsigned int Dist0 = (Metric) ? (calc_mseh(AC, mask, Zigzag[i], Lambda)) : (Lambda* AC*AC); uint32_t Best_Cost = 0xf0000000; Last_Cost += Dist0; /* very specialized loop for -1,0,+1 */ if ((uint32_t)(Level1+1)<3) { int dQ; int Run; uint32_t Cost0; if (AC<0) { Nodes[i].Level = -1; dQ = Lev0 + AC; } else { Nodes[i].Level = 1; dQ = Lev0 - AC; } Cost0 = (Metric) ?
(calc_mseh(dQ, mask, Zigzag[i], Lambda)) : (Lambda*dQ*dQ); Nodes[i].Run = 1; Best_Cost = (Code_Len20[0]<0; --Run) { const uint32_t Cost_Base = Cost0 + Run_Costs[i-Run]; const uint32_t Cost = Cost_Base + (Code_Len20[Run-1]< hifreq errors (HVS) */ if (Cost(uint32_t)(Level1+25)) { /* "big" levels (not less than ESC3, though) */ const uint8_t *Tbl_L1, *Tbl_L2, *Tbl_L1_Last, *Tbl_L2_Last; int Level2; int dQ1, dQ2; int Run; uint32_t Dist1,Dist2; int dDist21; if (Level1>1) { dQ1 = Level1*Mult-AC + Bias; dQ2 = dQ1 - Mult; Level2 = Level1-1; Tbl_L1 = (Level1<=24) ? B16_17_Code_Len[Level1-1] : Code_Len0; Tbl_L2 = (Level2<=24) ? B16_17_Code_Len[Level2-1] : Code_Len0; Tbl_L1_Last = (Level1<=6) ? B16_17_Code_Len_Last[Level1-1] : Code_Len0; Tbl_L2_Last = (Level2<=6) ? B16_17_Code_Len_Last[Level2-1] : Code_Len0; } else { /* Level1<-1 */ dQ1 = Level1*Mult-AC - Bias; dQ2 = dQ1 + Mult; Level2 = Level1 + 1; Tbl_L1 = (Level1>=-24) ? B16_17_Code_Len[Level1^-1] : Code_Len0; Tbl_L2 = (Level2>=-24) ? B16_17_Code_Len[Level2^-1] : Code_Len0; Tbl_L1_Last = (Level1>=- 6) ? B16_17_Code_Len_Last[Level1^-1] : Code_Len0; Tbl_L2_Last = (Level2>=- 6) ? B16_17_Code_Len_Last[Level2^-1] : Code_Len0; } if (Metric) { Dist1 = calc_mseh(dQ1, mask, Zigzag[i], Lambda); Dist2 = calc_mseh(dQ2, mask, Zigzag[i], Lambda); } else { Dist1 = Lambda*dQ1*dQ1; Dist2 = Lambda*dQ2*dQ2; } dDist21 = Dist2-Dist1; for(Run=i-Run_Start; Run>0; --Run) { const uint32_t Cost_Base = Dist1 + Run_Costs[i-Run]; uint32_t Cost1, Cost2; int bLevel; /* for sub-optimal (but slightly worth it, speed-wise) search, * uncomment the following: * if (Cost_Base>=Best_Cost) continue; * (? doesn't seem to have any effect -- gruel ) */ Cost1 = Cost_Base + (Tbl_L1[Run-1]< Simply pick best Run. 
*/ int Run; for(Run=i-Run_Start; Run>0; --Run) { /* 30 bits + no distortion */ const uint32_t Cost = (30<Min_Cost+(1<=0) { Out[Zigzag[i]] = Nodes[i].Level; Sum += abs(Nodes[i].Level); i -= Nodes[i].Run; } return Sum; } /* original version including heavy debugging info */ #ifdef DBGTRELL #define DBG 0 static __inline uint32_t Evaluate_Cost(const int16_t *C, int Mult, int Bias, const uint16_t * Zigzag, int Max, int Lambda) { #if (DBG>0) const int16_t * const Ref = C + 6*64; int Last = Max; int Bits = 0; int Dist = 0; int i; uint32_t Cost; while(Last>=0 && C[Zigzag[Last]]==0) Last--; if (Last>=0) { int j=0, j0=0; int Run, Level; Bits = 2; /* CBP */ while(j=-24 && Level<=24) Bits += B16_17_Code_Len[(Level<0) ? -Level-1 : Level-1][Run]; else Bits += 30; } Level = C[Zigzag[Last]]; Run = j - j0; if (Level>=-6 && Level<=6) Bits += B16_17_Code_Len_Last[(Level<0) ? -Level-1 : Level-1][Run]; else Bits += 30; } for(i=0; i<=Last; ++i) { int V = C[Zigzag[i]]*Mult; if (V>0) V += Bias; else if (V<0) V -= Bias; V -= Ref[Zigzag[i]]; Dist += V*V; } Cost = Lambda*Dist + (Bits<>12= %d ", Last,Max, Bits, Dist, Cost, Cost>>12 ); return Cost; #else return 0; #endif } static int dct_quantize_trellis_h263_c(int16_t *const Out, const int16_t *const In, int Q, const uint16_t * const Zigzag, int Non_Zero) { /* * Note: We should search last non-zero coeffs on *real* DCT input coeffs (In[]), * not quantized one (Out[]). However, it only improves the result *very* * slightly (~0.01dB), whereas speed drops to crawling level :) * Well, actually, taking 1 more coeff past Non_Zero into account sometimes helps. 
*/ typedef struct { int16_t Run, Level; } NODE; NODE Nodes[65], Last; uint32_t Run_Costs0[64+1]; uint32_t * const Run_Costs = Run_Costs0 + 1; const int Mult = 2*Q; const int Bias = (Q-1) | 1; const int Lev0 = Mult + Bias; const int Lambda = Trellis_Lambda_Tabs[Q-1]; /* it's 1/lambda, actually */ int Run_Start = -1; Run_Costs[-1] = 2<0) Last.Level = 0; Last.Run = -1; /* just initialize to smthg */ #endif Non_Zero = Find_Last(Out, Zigzag, Non_Zero); if (Non_Zero<0) return -1; for(i=0; i<=Non_Zero; i++) { const int AC = In[Zigzag[i]]; const int Level1 = Out[Zigzag[i]]; const int Dist0 = Lambda* AC*AC; uint32_t Best_Cost = 0xf0000000; Last_Cost += Dist0; if ((uint32_t)(Level1+1)<3) /* very specialized loop for -1,0,+1 */ { int dQ; int Run; uint32_t Cost0; if (AC<0) { Nodes[i].Level = -1; dQ = Lev0 + AC; } else { Nodes[i].Level = 1; dQ = Lev0 - AC; } Cost0 = Lambda*dQ*dQ; Nodes[i].Run = 1; Best_Cost = (Code_Len20[0]<0; --Run) { const uint32_t Cost_Base = Cost0 + Run_Costs[i-Run]; const uint32_t Cost = Cost_Base + (Code_Len20[Run-1]<>12 ); else if (j>Run_Start && j>12 ); else if (j==i) printf( "(%3.0d)", Run_Costs[j]>>12 ); else printf( " - |" ); } printf( "<%3.0d %2d %d>", Min_Cost>>12, Nodes[i].Level, Nodes[i].Run ); printf( " Last:#%2d {%3.0d %2d %d}", Last_Node, Last_Cost>>12, Last.Level, Last.Run ); printf( " AC:%3.0d Dist0:%3d Dist(%d)=%d", AC, Dist0>>12, Nodes[i].Level, Cost0>>12 ); printf( "\n" ); } } else /* "big" levels */ { const uint8_t *Tbl_L1, *Tbl_L2, *Tbl_L1_Last, *Tbl_L2_Last; int Level2; int dQ1, dQ2; int Run; uint32_t Dist1,Dist2; int dDist21; if (Level1>1) { dQ1 = Level1*Mult-AC + Bias; dQ2 = dQ1 - Mult; Level2 = Level1-1; Tbl_L1 = (Level1<=24) ? B16_17_Code_Len[Level1-1] : Code_Len0; Tbl_L2 = (Level2<=24) ? B16_17_Code_Len[Level2-1] : Code_Len0; Tbl_L1_Last = (Level1<=6) ? B16_17_Code_Len_Last[Level1-1] : Code_Len0; Tbl_L2_Last = (Level2<=6) ? 
B16_17_Code_Len_Last[Level2-1] : Code_Len0; } else { /* Level1<-1 */ dQ1 = Level1*Mult-AC - Bias; dQ2 = dQ1 + Mult; Level2 = Level1 + 1; Tbl_L1 = (Level1>=-24) ? B16_17_Code_Len[Level1^-1] : Code_Len0; Tbl_L2 = (Level2>=-24) ? B16_17_Code_Len[Level2^-1] : Code_Len0; Tbl_L1_Last = (Level1>=- 6) ? B16_17_Code_Len_Last[Level1^-1] : Code_Len0; Tbl_L2_Last = (Level2>=- 6) ? B16_17_Code_Len_Last[Level2^-1] : Code_Len0; } Dist1 = Lambda*dQ1*dQ1; Dist2 = Lambda*dQ2*dQ2; dDist21 = Dist2-Dist1; for(Run=i-Run_Start; Run>0; --Run) { const uint32_t Cost_Base = Dist1 + Run_Costs[i-Run]; uint32_t Cost1, Cost2; int bLevel; /* * for sub-optimal (but slightly worth it, speed-wise) search, uncomment the following: * if (Cost_Base>=Best_Cost) continue; */ Cost1 = Cost_Base + (Tbl_L1[Run-1]<>12 ); else if (j>Run_Start && j>12 ); else if (j==i) printf( "(%3.0d)", Run_Costs[j]>>12 ); else printf( " - |" ); } printf( "<%3.0d %2d %d>", Min_Cost>>12, Nodes[i].Level, Nodes[i].Run ); printf( " Last:#%2d {%3.0d %2d %d}", Last_Node, Last_Cost>>12, Last.Level, Last.Run ); printf( " AC:%3.0d Dist0:%3d Dist(%2d):%3d Dist(%2d):%3d", AC, Dist0>>12, Level1, Dist1>>12, Level2, Dist2>>12 ); printf( "\n" ); } } Run_Costs[i] = Best_Cost; if (Best_Cost < Min_Cost + Dist0) { Min_Cost = Best_Cost; Run_Start = i; } else { /* * as noticed by Michael Niedermayer (michaelni at gmx.at), there's * a code shorter by 1 bit for a larger run (!), same level. We give * it a chance by not moving the left barrier too much. 
*/ while( Run_Costs[Run_Start]>Min_Cost+(1< " ); for(i=0; i<=Non_Zero; ++i) printf( "[%3.0d] ", Out[Zigzag[i]] ); printf( "\n" ); } } if (Last_Node<0) return -1; /* reconstruct optimal sequence backward with surviving paths */ memset(Out, 0x00, 64*sizeof(*Out)); Out[Zigzag[Last_Node]] = Last.Level; i = Last_Node - Last.Run; while(i>=0) { Out[Zigzag[i]] = Nodes[i].Level; i -= Nodes[i].Run; } if (DBG) { uint32_t Cost = Evaluate_Cost(Out,Mult,Bias, Zigzag,Non_Zero, Lambda); if (DBG==1) { printf( "<= " ); for(i=0; i<=Last_Node; ++i) printf( "[%3.0d] ", Out[Zigzag[i]] ); printf( "\n--------------------------------\n" ); } if (Cost>Last_Cost) printf( "!!! %u > %u\n", Cost, Last_Cost ); } return Last_Node; } #undef DBG #endif xvidcore/src/utils/mem_transfer.h0000664000076500007650000002360711564705453020267 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - 8<->16 bit buffer transfer header - * * Copyright(C) 2001-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mem_transfer.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _MEM_TRANSFER_H #define _MEM_TRANSFER_H /***************************************************************************** * transfer8to16 API ****************************************************************************/ typedef void (TRANSFER_8TO16COPY) (int16_t * const dst, const uint8_t * const src, uint32_t stride); typedef TRANSFER_8TO16COPY *TRANSFER_8TO16COPY_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER_8TO16COPY_PTR transfer_8to16copy; /* Implemented functions */ extern TRANSFER_8TO16COPY transfer_8to16copy_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER_8TO16COPY transfer_8to16copy_mmx; extern TRANSFER_8TO16COPY transfer_8to16copy_3dne; #endif #ifdef ARCH_IS_IA64 extern TRANSFER_8TO16COPY transfer_8to16copy_ia64; #endif #ifdef ARCH_IS_PPC extern TRANSFER_8TO16COPY transfer_8to16copy_altivec_c; #endif /***************************************************************************** * transfer16to8 API ****************************************************************************/ typedef void (TRANSFER_16TO8COPY) (uint8_t * const dst, const int16_t * const src, uint32_t stride); typedef TRANSFER_16TO8COPY *TRANSFER_16TO8COPY_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER_16TO8COPY_PTR transfer_16to8copy; /* Implemented functions */ extern TRANSFER_16TO8COPY transfer_16to8copy_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER_16TO8COPY transfer_16to8copy_mmx; extern TRANSFER_16TO8COPY transfer_16to8copy_3dne; #endif #ifdef ARCH_IS_IA64 extern TRANSFER_16TO8COPY transfer_16to8copy_ia64; #endif #ifdef 
ARCH_IS_PPC extern TRANSFER_16TO8COPY transfer_16to8copy_altivec_c; #endif #ifdef ARCH_IS_X86_64 extern TRANSFER_16TO8COPY transfer_16to8copy_x86_64; #endif /***************************************************************************** * transfer8to16 + subtraction *writeback* op API ****************************************************************************/ typedef void (TRANSFER_8TO16SUB) (int16_t * const dct, uint8_t * const cur, const uint8_t * ref, const uint32_t stride); typedef TRANSFER_8TO16SUB *TRANSFER_8TO16SUB_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER_8TO16SUB_PTR transfer_8to16sub; /* Implemented functions */ extern TRANSFER_8TO16SUB transfer_8to16sub_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER_8TO16SUB transfer_8to16sub_mmx; extern TRANSFER_8TO16SUB transfer_8to16sub_3dne; #endif #ifdef ARCH_IS_IA64 extern TRANSFER_8TO16SUB transfer_8to16sub_ia64; #endif #ifdef ARCH_IS_PPC extern TRANSFER_8TO16SUB transfer_8to16sub_altivec_c; #endif #ifdef ARCH_IS_X86_64 extern TRANSFER_8TO16SUB transfer_8to16sub_x86_64; #endif /***************************************************************************** * transfer8to16 + subtraction *readonly* op API ****************************************************************************/ typedef void (TRANSFER_8TO16SUBRO) (int16_t * const dct, const uint8_t * const cur, const uint8_t * ref, const uint32_t stride); typedef TRANSFER_8TO16SUBRO *TRANSFER_8TO16SUBRO_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER_8TO16SUBRO_PTR transfer_8to16subro; /* Implemented functions */ extern TRANSFER_8TO16SUBRO transfer_8to16subro_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER_8TO16SUBRO transfer_8to16subro_mmx; extern TRANSFER_8TO16SUBRO transfer_8to16subro_3dne; #endif #ifdef ARCH_IS_PPC extern TRANSFER_8TO16SUBRO transfer_8to16subro_altivec_c; #endif #ifdef ARCH_IS_X86_64 extern TRANSFER_8TO16SUBRO
transfer_8to16subro_x86_64; #endif /***************************************************************************** * transfer8to16 + subtraction op API - Bidirectional Version ****************************************************************************/ typedef void (TRANSFER_8TO16SUB2) (int16_t * const dct, uint8_t * const cur, const uint8_t * ref1, const uint8_t * ref2, const uint32_t stride); typedef TRANSFER_8TO16SUB2 *TRANSFER_8TO16SUB2_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER_8TO16SUB2_PTR transfer_8to16sub2; /* Implemented functions */ TRANSFER_8TO16SUB2 transfer_8to16sub2_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER_8TO16SUB2 transfer_8to16sub2_mmx; extern TRANSFER_8TO16SUB2 transfer_8to16sub2_xmm; extern TRANSFER_8TO16SUB2 transfer_8to16sub2_3dne; #endif #ifdef ARCH_IS_IA64 extern TRANSFER_8TO16SUB2 transfer_8to16sub2_ia64; #endif #ifdef ARCH_IS_PPC extern TRANSFER_8TO16SUB2 transfer_8to16sub2_altivec_c; #endif #ifdef ARCH_IS_X86_64 extern TRANSFER_8TO16SUB2 transfer_8to16sub2_x86_64; #endif /***************************************************************************** * transfer8to16 + subtraction op API - Bidirectional Version *readonly* ****************************************************************************/ typedef void (TRANSFER_8TO16SUB2RO) (int16_t * const dct, const uint8_t * const cur, const uint8_t * ref1, const uint8_t * ref2, const uint32_t stride); typedef TRANSFER_8TO16SUB2RO *TRANSFER_8TO16SUB2RO_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER_8TO16SUB2RO_PTR transfer_8to16sub2ro; /* Implemented functions */ TRANSFER_8TO16SUB2RO transfer_8to16sub2ro_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER_8TO16SUB2RO transfer_8to16sub2ro_xmm; #endif #ifdef ARCH_IS_X86_64 extern TRANSFER_8TO16SUB2RO transfer_8to16sub2ro_x86_64; #endif /***************************************************************************** *
transfer16to8 + addition op API ****************************************************************************/ typedef void (TRANSFER_16TO8ADD) (uint8_t * const dst, const int16_t * const src, uint32_t stride); typedef TRANSFER_16TO8ADD *TRANSFER_16TO8ADD_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER_16TO8ADD_PTR transfer_16to8add; /* Implemented functions */ extern TRANSFER_16TO8ADD transfer_16to8add_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER_16TO8ADD transfer_16to8add_mmx; extern TRANSFER_16TO8ADD transfer_16to8add_3dne; #endif #ifdef ARCH_IS_IA64 extern TRANSFER_16TO8ADD transfer_16to8add_ia64; #endif #ifdef ARCH_IS_PPC extern TRANSFER_16TO8ADD transfer_16to8add_altivec_c; #endif #ifdef ARCH_IS_X86_64 extern TRANSFER_16TO8ADD transfer_16to8add_x86_64; #endif /***************************************************************************** * transfer8to8 + no op ****************************************************************************/ typedef void (TRANSFER8X8_COPY) (uint8_t * const dst, const uint8_t * const src, const uint32_t stride); typedef TRANSFER8X8_COPY *TRANSFER8X8_COPY_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER8X8_COPY_PTR transfer8x8_copy; /* Implemented functions */ extern TRANSFER8X8_COPY transfer8x8_copy_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER8X8_COPY transfer8x8_copy_mmx; extern TRANSFER8X8_COPY transfer8x8_copy_3dne; #endif #ifdef ARCH_IS_IA64 extern TRANSFER8X8_COPY transfer8x8_copy_ia64; #endif #ifdef ARCH_IS_PPC extern TRANSFER8X8_COPY transfer8x8_copy_altivec_c; #endif #ifdef ARCH_IS_X86_64 extern TRANSFER8X8_COPY transfer8x8_copy_x86_64; #endif /***************************************************************************** * transfer8to4 + no op ****************************************************************************/ typedef void (TRANSFER8X4_COPY) (uint8_t * const dst, const uint8_t * const src, const 
uint32_t stride); typedef TRANSFER8X4_COPY *TRANSFER8X4_COPY_PTR; /* Our global function pointer - Initialized in xvid.c */ extern TRANSFER8X4_COPY_PTR transfer8x4_copy; /* Implemented functions */ extern TRANSFER8X4_COPY transfer8x4_copy_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern TRANSFER8X4_COPY transfer8x4_copy_mmx; extern TRANSFER8X4_COPY transfer8x4_copy_3dne; #endif static __inline void transfer16x16_copy(uint8_t * const dst, const uint8_t * const src, const uint32_t stride) { transfer8x8_copy(dst, src, stride); transfer8x8_copy(dst + 8, src + 8, stride); transfer8x8_copy(dst + 8*stride, src + 8*stride, stride); transfer8x8_copy(dst + 8*stride + 8, src + 8*stride + 8, stride); } static __inline void transfer32x32_copy(uint8_t * const dst, const uint8_t * const src, const uint32_t stride) { transfer16x16_copy(dst, src, stride); transfer16x16_copy(dst + 16, src + 16, stride); transfer16x16_copy(dst + 16*stride, src + 16*stride, stride); transfer16x16_copy(dst + 16*stride + 16, src + 16*stride + 16, stride); } #endif xvidcore/src/utils/ppc_asm/0000775000076500007650000000000011566427763017055 5ustar xvidbuildxvidbuildxvidcore/src/utils/ppc_asm/mem_transfer_altivec.c0000664000076500007650000002405111564705453023405 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Altivec 8bit<->16bit transfer - * * Copyright(C) 2004 Christoph Naegeli * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mem_transfer_altivec.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifdef HAVE_ALTIVEC_H #include <altivec.h> #endif #include "../../portab.h" /* Turn this on if you like debugging the alignment */ #undef DEBUG #include <stdio.h> /* This function assumes: * dst: 16 byte aligned */ #define COPY8TO16() \ s = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src));\ vec_st((vector signed short)vec_mergeh(zerovec,s),0,dst);\ src += stride;\ dst += 8 void transfer_8to16copy_altivec_c(int16_t *dst, uint8_t * src, uint32_t stride) { register vector unsigned char s; register vector unsigned char zerovec; #ifdef DEBUG /* Check the alignment */ if((long)dst & 0xf) fprintf(stderr, "transfer_8to16copy_altivec_c:incorrect align, dst: %lx\n", (long)dst); #endif /* initialization */ zerovec = vec_splat_u8(0); COPY8TO16(); COPY8TO16(); COPY8TO16(); COPY8TO16(); COPY8TO16(); COPY8TO16(); COPY8TO16(); COPY8TO16(); } /* * This function assumes dst is 8 byte aligned and stride is a multiple of 8 * src may be unaligned */ #define COPY16TO8() \ s = vec_perm(src[0], src[1], load_src_perm); \ packed = vec_packsu(s, vec_splat_s16(0)); \ mask = vec_perm(mask_stencil, mask_stencil, vec_lvsl(0, dst)); \ packed = vec_perm(packed, packed, vec_lvsl(0, dst)); \ packed = vec_sel(packed, vec_ld(0, dst), mask); \ vec_st(packed, 0, dst); \ src++; \ dst += stride void transfer_16to8copy_altivec_c(uint8_t *dst, vector signed short *src, uint32_t stride) { register vector signed short s; register vector unsigned char packed; register vector unsigned char mask_stencil; register vector unsigned char mask; register vector unsigned char load_src_perm; #ifdef DEBUG /* if this is on, print alignment errors */ if(((unsigned long) dst) & 0x7)
fprintf(stderr, "transfer_16to8copy_altivec:incorrect align, dst %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "transfer_16to8copy_altivec:incorrect align, stride %u\n", stride); #endif /* Initialisation stuff */ load_src_perm = vec_lvsl(0, (unsigned char*)src); mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); COPY16TO8(); COPY16TO8(); COPY16TO8(); COPY16TO8(); COPY16TO8(); COPY16TO8(); COPY16TO8(); COPY16TO8(); } /* * This function assumes dst is 8 byte aligned and src is unaligned. Stride has * to be a multiple of 8 */ #define COPY8TO8() \ tmp = vec_perm(vec_ld(0, src), vec_ld(16, src), vec_lvsl(0, src)); \ t0 = vec_perm(tmp, tmp, vec_lvsl(0, dst));\ t1 = vec_perm(mask, mask, vec_lvsl(0, dst));\ tmp = vec_sel(t0, vec_ld(0, dst), t1);\ vec_st(tmp, 0, dst);\ dst += stride;\ src += stride void transfer8x8_copy_altivec_c( uint8_t * dst, uint8_t * src, uint32_t stride) { register vector unsigned char tmp; register vector unsigned char mask; register vector unsigned char t0, t1; #ifdef DEBUG if(((unsigned long)dst) & 0x7) fprintf(stderr, "transfer8x8_copy_altivec:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "transfer8x8_copy_altivec:incorrect stride, stride: %u\n", stride); #endif mask = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); COPY8TO8(); COPY8TO8(); COPY8TO8(); COPY8TO8(); COPY8TO8(); COPY8TO8(); COPY8TO8(); COPY8TO8(); } #define SUB8TO16() \ c = vec_perm(vec_ld(0,cur),vec_ld(16,cur),vec_lvsl(0,cur));\ r = vec_perm(vec_ld(0,ref),vec_ld(16,ref),vec_lvsl(0,ref));\ cs = (vector signed short)vec_mergeh(ox00,c);\ rs = (vector signed short)vec_mergeh(ox00,r);\ \ c = vec_lvsr(0,cur);\ mask = vec_perm(mask_00ff, mask_00ff, c);\ r = vec_perm(r, r, c);\ r = vec_sel(r, vec_ld(0,cur), mask);\ vec_st(r,0,cur);\ vec_st( vec_sub(cs,rs), 0, dct );\ \ dct += 8;\ cur += stride;\ ref += stride /* This function assumes: * dct: 16 Byte aligned * cur: 8 Byte aligned * stride: multiple of 8 */ void transfer_8to16sub_altivec_c(int16_t 
* dct, uint8_t * cur, uint8_t * ref, const uint32_t stride) { register vector unsigned char c,r; register vector unsigned char ox00; register vector unsigned char mask_00ff; register vector unsigned char mask; register vector signed short cs,rs; #ifdef DEBUG if((long)dct & 0xf) fprintf(stderr, "transfer_8to16sub_altivec_c:incorrect align, dct: %lx\n", (long)dct); if((long)cur & 0x7) fprintf(stderr, "transfer_8to16sub_altivec_c:incorrect align, cur: %lx\n", (long)cur); if(stride & 0x7) fprintf(stderr, "transfer_8to16sub_altivec_c:incorrect stride, stride: %lu\n", (long)stride); #endif /* initialize */ ox00 = vec_splat_u8(0); mask_00ff = vec_pack((vector unsigned short)ox00,vec_splat_u16(-1)); SUB8TO16(); SUB8TO16(); SUB8TO16(); SUB8TO16(); SUB8TO16(); SUB8TO16(); SUB8TO16(); SUB8TO16(); } #define SUBRO8TO16() \ c = vec_perm(vec_ld(0,cur),vec_ld(16,cur),vec_lvsl(0,cur));\ r = vec_perm(vec_ld(0,ref),vec_ld(16,ref),vec_lvsl(0,ref));\ cs = (vector signed short)vec_mergeh(z,c);\ rs = (vector signed short)vec_mergeh(z,r);\ vec_st( vec_sub(cs,rs), 0, dct );\ dct += 8;\ cur += stride;\ ref += stride /* This function assumes: * dct: 16 Byte aligned */ void transfer_8to16subro_altivec_c(int16_t * dct, const uint8_t * cur, const uint8_t * ref, const uint32_t stride) { register vector unsigned char c; register vector unsigned char r; register vector unsigned char z; register vector signed short cs; register vector signed short rs; #ifdef DEBUG /* Check the alignment assumptions if this is on */ if((long)dct & 0xf) fprintf(stderr, "transfer_8to16subro_altivec_c:incorrect align, dct: %lx\n", (long)dct); #endif /* initialize */ z = vec_splat_u8(0); SUBRO8TO16(); SUBRO8TO16(); SUBRO8TO16(); SUBRO8TO16(); SUBRO8TO16(); SUBRO8TO16(); SUBRO8TO16(); SUBRO8TO16(); } /* * This function assumes: * dct: 16 bytes alignment * cur: 8 bytes alignment * ref1: unaligned * ref2: unaligned * stride: multiple of 8 */ #define SUB28TO16() \ r1 = vec_perm(vec_ld(0, ref1), vec_ld(16, ref1), vec_lvsl(0, 
ref1)); \ r2 = vec_perm(vec_ld(0, ref2), vec_ld(16, ref2), vec_lvsl(0, ref2)); \ c = vec_perm(vec_ld(0, cur), vec_ld(16, cur), vec_lvsl(0, cur)); \ r = vec_avg(r1, r2); \ cs = (vector signed short)vec_mergeh(vec_splat_u8(0), c); \ rs = (vector signed short)vec_mergeh(vec_splat_u8(0), r); \ c = vec_perm(mask, mask, vec_lvsl(0, cur));\ r = vec_sel(r, vec_ld(0, cur), c);\ vec_st(r, 0, cur); \ *dct++ = vec_sub(cs, rs); \ cur += stride; \ ref1 += stride; \ ref2 += stride void transfer_8to16sub2_altivec_c(vector signed short *dct, uint8_t *cur, uint8_t *ref1, uint8_t *ref2, const uint32_t stride) { vector unsigned char r1; vector unsigned char r2; vector unsigned char r; vector unsigned char c; vector unsigned char mask; vector signed short cs; vector signed short rs; #ifdef DEBUG /* Dump alignment errors if DEBUG is set */ if(((unsigned long)dct) & 0xf) fprintf(stderr, "transfer_8to16sub2_altivec_c:incorrect align, dct: %lx\n", (long)dct); if(((unsigned long)cur) & 0x7) fprintf(stderr, "transfer_8to16sub2_altivec_c:incorrect align, cur: %lx\n", (long)cur); if(stride & 0x7) fprintf(stderr, "transfer_8to16sub2_altivec_c:incorrect stride, stride: %u\n", stride); #endif /* Initialisation */ mask = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); SUB28TO16(); SUB28TO16(); SUB28TO16(); SUB28TO16(); SUB28TO16(); SUB28TO16(); SUB28TO16(); SUB28TO16(); } /* * This function assumes: * dst: 8 byte aligned * src: unaligned * stride: multiple of 8 */ #define ADD16TO8() \ s = vec_perm(vec_ld(0, src), vec_ld(16, src), vec_lvsl(0, src)); \ d = vec_perm(vec_ld(0, dst), vec_ld(16, dst), vec_lvsl(0, dst)); \ ds = (vector signed short)vec_mergeh(vec_splat_u8(0), d); \ ds = vec_add(ds, s); \ packed = vec_packsu(ds, vec_splat_s16(0)); \ mask = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); \ mask = vec_perm(mask, mask, vec_lvsl(0, dst)); \ packed = vec_perm(packed, packed, vec_lvsl(0, dst)); \ packed = vec_sel(packed, vec_ld(0, dst), mask); \ vec_st(packed, 0, dst); \ src += 8; \ dst += stride void
transfer_16to8add_altivec_c(uint8_t *dst, int16_t *src, uint32_t stride) { vector signed short s; vector signed short ds; vector unsigned char d; vector unsigned char packed; vector unsigned char mask; #ifdef DEBUG /* if this is set, dump alignment errors */ if(((unsigned long)dst) & 0x7) fprintf(stderr, "transfer_16to8add_altivec_c:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "transfer_16to8add_altivec_c:incorrect stride, stride: %u\n", stride); #endif ADD16TO8(); ADD16TO8(); ADD16TO8(); ADD16TO8(); ADD16TO8(); ADD16TO8(); ADD16TO8(); ADD16TO8(); } xvidcore/src/utils/ppc_asm/altivec_trigger.c0000664000076500007650000000256211564705453022371 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Altivec SIGILL trigger - * * Copyright(C) 2004 Edouard Gomez * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details.
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: altivec_trigger.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifdef HAVE_ALTIVEC_H #include <altivec.h> #endif #include "../../portab.h" #include "../emms.h" void altivec_trigger(void) { #if !defined(__amigaos4__) vector unsigned int var1 = (vector unsigned int)AVV(0); vector unsigned int var2 = (vector unsigned int)AVV(1); var1 = vec_add(var1, var2); #endif return; } xvidcore/src/utils/timer.c0000664000076500007650000001550711564705453016720 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Timer functions (used for internal debugging) - * * Copyright(C) 2002 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details.
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: timer.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdio.h> #include <time.h> #include "timer.h" #if defined(_PROFILING_) struct ts { int64_t current; int64_t global; int64_t overall; int64_t dct; int64_t idct; int64_t quant; int64_t iquant; int64_t motion; int64_t comp; int64_t edges; int64_t inter; int64_t conv; int64_t trans; int64_t prediction; int64_t coding; int64_t interlacing; }; struct ts tim; double frequency = 0.0; /* determine the cpu frequency in counter ticks per millisecond (not very precise, but sufficient) */ double get_freq() { int64_t x, y; int32_t i; i = time(NULL); while (i == time(NULL)); x = read_counter(); i++; while (i == time(NULL)); y = read_counter(); return (double) (y - x) / 1000.; } /* set everything to zero */ void init_timer() { frequency = get_freq(); count_frames = 0; tim.dct = tim.quant = tim.idct = tim.iquant = tim.motion = tim.conv = tim.edges = tim.inter = tim.interlacing = tim.trans = tim.comp = tim.prediction = tim.coding = tim.global = tim.overall = 0; } void start_timer() { tim.current = read_counter(); } void start_global_timer() { tim.global = read_counter(); } void stop_dct_timer() { tim.dct += (read_counter() - tim.current); } void stop_idct_timer() { tim.idct += (read_counter() - tim.current); } void stop_quant_timer() { tim.quant += (read_counter() - tim.current); } void stop_iquant_timer() { tim.iquant += (read_counter() - tim.current); } void stop_motion_timer() { tim.motion += (read_counter() - tim.current); } void stop_comp_timer() { tim.comp += (read_counter() - tim.current); } void stop_edges_timer() { tim.edges += (read_counter() - tim.current); } void stop_inter_timer() { tim.inter += (read_counter() - tim.current); } void stop_conv_timer() { tim.conv += (read_counter() - tim.current); }
void stop_transfer_timer() { tim.trans += (read_counter() - tim.current); } void stop_prediction_timer() { tim.prediction += (read_counter() - tim.current); } void stop_coding_timer() { tim.coding += (read_counter() - tim.current); } void stop_interlacing_timer() { tim.interlacing += (read_counter() - tim.current); } void stop_global_timer() { tim.overall += (read_counter() - tim.global); } /* write log file with some timer information */ void write_timer() { float dct_per, quant_per, idct_per, iquant_per, mot_per, comp_per, interlacing_per; float edges_per, inter_per, conv_per, trans_per, pred_per, cod_per, measured; int64_t sum_ticks = 0; count_frames++; // only write log file every 50 processed frames // if (count_frames % 50) { FILE *fp; fp = fopen("encoder.log", "w+"); dct_per = (float) (((float) ((float) tim.dct / (float) tim.overall)) * 100.0); quant_per = (float) (((float) ((float) tim.quant / (float) tim.overall)) * 100.0); idct_per = (float) (((float) ((float) tim.idct / (float) tim.overall)) * 100.0); iquant_per = (float) (((float) ((float) tim.iquant / (float) tim.overall)) * 100.0); mot_per = (float) (((float) ((float) tim.motion / (float) tim.overall)) * 100.0); comp_per = (float) (((float) ((float) tim.comp / (float) tim.overall)) * 100.0); edges_per = (float) (((float) ((float) tim.edges / (float) tim.overall)) * 100.0); inter_per = (float) (((float) ((float) tim.inter / (float) tim.overall)) * 100.0); conv_per = (float) (((float) ((float) tim.conv / (float) tim.overall)) * 100.0); trans_per = (float) (((float) ((float) tim.trans / (float) tim.overall)) * 100.0); pred_per = (float) (((float) ((float) tim.prediction / (float) tim.overall)) * 100.0); cod_per = (float) (((float) ((float) tim.coding / (float) tim.overall)) * 100.0); interlacing_per = (float) (((float) ((float) tim.interlacing / (float) tim.overall)) * 100.0); sum_ticks = tim.coding + tim.conv + tim.dct + tim.idct + tim.interlacing + tim.edges + tim.inter + tim.iquant + tim.motion + 
tim.trans + tim.quant + tim.comp + tim.prediction; measured = (float) (((float) ((float) sum_ticks / (float) tim.overall)) * 100.0); fprintf(fp, "DCT:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "Quant:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "IDCT:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "IQuant:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "Mot estimation:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "Mot compensation:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "Edges:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "Interpolation:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "RGB2YUV:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "Transfer:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "Prediction:\nTotal time: %f ms (%3f percent of total encoding time)\n\n" "Coding:\nTotal time: %f ms (%3f percent of total encoding time)\n\n\n" "Interlacing:\nTotal time: %f ms (%3f percent of total encoding time)\n\n\n" "Overall encoding time: %f ms, we measured %f ms (%3f percent)\n", (float) (tim.dct / frequency), dct_per, (float) (tim.quant / frequency), quant_per, (float) (tim.idct / frequency), idct_per, (float) (tim.iquant / frequency), iquant_per, (float) (tim.motion / frequency), mot_per, (float) (tim.comp / frequency), comp_per, (float) (tim.edges / frequency), edges_per, (float) (tim.inter / frequency), inter_per, (float) (tim.conv / frequency), conv_per, (float) (tim.trans / frequency), trans_per, (float) (tim.prediction / frequency), pred_per, (float) (tim.coding / frequency), cod_per, (float) (tim.interlacing / frequency), interlacing_per, (float) (tim.overall / frequency), (float) (sum_ticks / frequency), measured); fclose(fp); } } #endif xvidcore/src/utils/mem_align.c0000664000076500007650000000722311564705453017524 0ustar
xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Aligned Memory Allocator - * * Copyright(C) 2002-2003 Edouard Gomez * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mem_align.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdlib.h> #include <stdio.h> #include "mem_align.h" /***************************************************************************** * xvid_malloc * * This function allocates 'size' bytes (usable by the user) on the heap and * takes care of the requested 'alignment'. * In order to align the allocated memory block, the xvid_malloc allocates * 'size' bytes + 'alignment' bytes. So try to keep alignment very small * when allocating small pieces of memory.
* * NB : a block allocated by xvid_malloc _must_ be freed with xvid_free * (passing it to the libc free is an error) * * Returned value : - NULL on error * - Pointer to the allocated aligned block * ****************************************************************************/ void * xvid_malloc(size_t size, uint8_t alignment) { uint8_t *mem_ptr; if (!alignment) { /* No alignment has to be satisfied */ if ((mem_ptr = (uint8_t *) malloc(size + 1)) != NULL) { /* Store (mem_ptr - "real allocated memory") in *(mem_ptr-1) */ *mem_ptr = (uint8_t)1; /* Return the mem_ptr pointer */ return ((void *)(mem_ptr+1)); } } else { uint8_t *tmp; /* Allocate the required size memory + alignment so we * can realign the data if necessary */ if ((tmp = (uint8_t *) malloc(size + alignment)) != NULL) { /* Align the tmp pointer */ mem_ptr = (uint8_t *) ((ptr_t) (tmp + alignment - 1) & (~(ptr_t) (alignment - 1))); /* Special case where malloc has already satisfied the alignment * We must add alignment to mem_ptr because we must store * (mem_ptr - tmp) in *(mem_ptr-1) * If we do not add alignment to mem_ptr then *(mem_ptr-1) points * to a forbidden memory space */ if (mem_ptr == tmp) mem_ptr += alignment; /* (mem_ptr - tmp) is stored in *(mem_ptr-1) so we are able to retrieve * the real malloc block allocated and free it in xvid_free */ *(mem_ptr - 1) = (uint8_t) (mem_ptr - tmp); /* Return the aligned pointer */ return ((void *)mem_ptr); } } return(NULL); } /***************************************************************************** * xvid_free * * Free a previously 'xvid_malloc' allocated block. Does not free NULL * references. * * Returned value : None.
* ****************************************************************************/ void xvid_free(void *mem_ptr) { uint8_t *ptr; if (mem_ptr == NULL) return; /* Aligned pointer */ ptr = mem_ptr; /* *(ptr - 1) holds the offset to the real allocated block * we subtract that offset so that we free the real pointer */ ptr -= *(ptr - 1); /* Free the memory */ free(ptr); } xvidcore/src/utils/ia64_asm/0000775000076500007650000000000011566427763017036 5ustar xvidbuildxvidbuildxvidcore/src/utils/ia64_asm/mem_transfer_ia64.s0000664000076500007650000005642411147310721022516 0ustar xvidbuildxvidbuild// **************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 8bit<->16bit transfer - // * // * Copyright(C) 2002 Sebastian Felis, Max Stengel // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details.
// * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: mem_transfer_ia64.s,v 1.6 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * mem_transfer_ia64.s, IA-64 8bit<->16bit transfer // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** /////////////////////////////////////////////////////////////////////////////// // // mem_transfer.c optimized for ia-64 by Sebastian Felis and Max Stengel, // University of Karlsruhe, Germany, 03.06.2002, during the laboratory // "IA-64 Video Codec Assembler Praktikum" at IPD Goos. ///// History ///////////////////////////////////////////////////////////////// // // - 16.07.2002: several minor changes for ecc-conformity // - 03.06.2002: initial version // /////////////////////////////////////////////////////////////////////////////// // // Annotations: // =========== // // - All functions work on 8x8 matrices. While the C functions treat each // element separately, the functions in this assembler code treat a whole // line simultaneously, so one loop is saved. // The remaining loop is realized using software pipelining with rotating // registers. // - Register renaming is used for better readability. // - To load 8 bytes of misaligned data, two 8-byte blocks are loaded, both // parts are shifted and joined together with an "OR" instruction. // - The first parameter is stored in GR 32, the next in GR 33, and so on. They // must be saved, as these GRs are used for register rotation.
// - Some comments in the code were originally written in German during // development. They shouldn't bother anyone. // /////////////////////////////////////////////////////////////////////////////// // *** define Latencies for software pipelines *** LL = 3 // Load SL = 3 // Store PL = 1 // Pack SHL = 1 // Shift OL = 1 // Or UL = 1 // Unpack PAL = 1 // Parallel Add PSL = 1 // Parallel Subtract PAVGL = 1 // Parallel Average .text /////////////////////////////////////////////////////////////////////////////// // // transfer8x8_copy_ia64 // // SRC is misaligned. To align the source, load two 8-byte words, shift them, // join them and store the aligned result at the destination address.
// /////////////////////////////////////////////////////////////////////////////// .align 16 .global transfer8x8_copy_ia64# .proc transfer8x8_copy_ia64# transfer8x8_copy_ia64: .prologue // *** register renaming *** zero = r0 oldLC = r2 oldPR = r3 src_1 = r14 // left aligned address of src src_2 = r15 // right aligned address of src dst = r16 // destination address stride = r17 offset = r18 // shift right offset aoffset = r19 // shift left offset // *** Saving old Loop-Counter (LC) and Predicate Registers (PR) *** .save ar.lc, oldLC mov oldLC = ar.lc mov oldPR = pr .body // *** Allocating new stackframe, initialize LC, Epilogue-Counter and PR *** alloc r9 = ar.pfs, 3, 29, 0, 32 // *** Saving Parameters *** mov dst = r32 mov stride = r34 // *** Misalignment treatment *** and src_1 = -8, r33 // Computing address of first aligned block containing src-values dep offset = r33, zero, 3, 3 // Extracting offset for shr from src-address ;; sub aoffset = 64, offset // Computing counterpart of offset ("anti-offset"), used for shl add src_2 = 8, src_1 // Computing address of second aligned block containing src-values // *** init loop: set loop counter, epilog counter, predicates *** mov ar.lc = 7 mov ar.ec = LL + SHL + OL + 1 mov pr.rot = 1 << 16 ;; // *** define register arrays and predicate array for software pipeline *** // src_v1 = source value 1, shd_r = shifted right, shd_l = shifted left .rotr src_v1[LL+1], src_v2[LL+1], shd_r[SHL+1], shd_l[SHL+1], value[OL+1] .rotp ld_stage[LL], sh_stage[SHL], or_stage[OL], st_stage[1] // Software pipelined loop: // Stage 1: Load 8 bytes each from src_1 and src_2 into src_v1 and src_v2 // Stage 2: Shift both values of source to SHD_R and SHD_L // Stage 3: Join both parts together with OR // Stage 4: Store aligned data to destination and add stride to destination address .Loop_8x8copy: {.mii (ld_stage[0]) ld8 src_v1[0] = [src_1], stride (sh_stage[0]) shr.u shd_r[0] = src_v1[LL], offset } {.mii (ld_stage[0]) ld8 src_v2[0] = [src_2], stride
(sh_stage[0]) shl shd_l[0] = src_v2[LL], aoffset (or_stage[0]) or value[0] = shd_l[SHL], shd_r[SHL] } {.mib (st_stage[0]) st8 [dst] = value[OL] (st_stage[0]) add dst = dst, stride br.ctop.sptk.few .Loop_8x8copy ;; } // *** Restore old LC and PRs *** mov ar.lc = oldLC mov pr = oldPR, -1 br.ret.sptk.many b0 .endp transfer8x8_copy_ia64# /////////////////////////////////////////////////////////////////////////////// // // transfer_8to16copy_ia64 // // SRC is aligned. To convert 8 bit unsigned values to 16 bit signed values, // UNPACK is used. So 8 bytes are loaded from source, unpacked to two // 4 x 16 bit values and stored to the destination. Destination is a continuous // array of 64 x 16 bit signed data. To store the next line, only 16 must be // added to the destination address. /////////////////////////////////////////////////////////////////////////////// .align 16 .global transfer_8to16copy_ia64# .proc transfer_8to16copy_ia64# transfer_8to16copy_ia64: .prologue // *** register renaming *** oldLC = r2 oldPR = r3 zero = r0 // the constant zero (0) dst_1 = r14 // destination address for first 4 x 16 bit values dst_2 = r15 // destination address for second 4 x 16 bit values src = r16 stride = r17 // *** Saving old Loop-Counter (LC) and Predicate Registers (PR) *** .save ar.lc, oldLC mov oldLC = ar.lc mov oldPR = pr .body // *** Allocating new stackframe, define rotating registers *** alloc r9 = ar.pfs, 4, 92, 0, 96 // *** Saving Parameters *** mov dst_1 = r32 // first 4 x 16 bit values add dst_2 = 8, r32 // second 4 x 16 bit values mov src = r33 mov stride = r34 // *** init loop: set loop counter, epilog counter, predicates *** mov ar.lc = 7 mov ar.ec = LL + UL + 1 mov pr.rot = 1 << 16 ;; // *** define register arrays and predicate array for software pipeline *** // src_v = source value, dst_v1 = destination value 1 .rotr src_v[LL+1], dst_v1[UL+1], dst_v2[UL+1] .rotp ld_stage[LL], upack_stage[UL], st_stage[1] // Software pipelined loop: // Stage 1:
Load value of SRC // Stage 2: Unpack the SRC_V to two 4 x 16 bit signed data // Stage 3: Store both 8 byte of 16 bit data .Loop_8to16copy: {.mii (ld_stage[0]) ld8 src_v[0] = [src], stride (upack_stage[0]) unpack1.l dst_v1[0] = zero, src_v[LL] (upack_stage[0]) unpack1.h dst_v2[0] = zero, src_v[LL] } {.mmb (st_stage[0]) st8 [dst_1] = dst_v1[UL], 16 (st_stage[0]) st8 [dst_2] = dst_v2[UL], 16 br.ctop.sptk.few .Loop_8to16copy ;; } // *** Restore old LC and PRs *** mov ar.lc = oldLC mov pr = oldPR, -1 br.ret.sptk.many b0 .endp transfer_8to16copy_ia64# /////////////////////////////////////////////////////////////////////////////// // // transfer_16to8copy_ia64 // // src is a 64 x 16 bit signed continuous array. To convert the 16 bit // values to 8 bit unsigned data, PACK is used. So two 8-byte words of // 4 x 16 bit signed data are loaded, packed together and stored as one // 8-byte word of 8 x 8 bit unsigned data at the destination. /////////////////////////////////////////////////////////////////////////////// .align 16 .global transfer_16to8copy_ia64# .proc transfer_16to8copy_ia64# transfer_16to8copy_ia64: .prologue // *** register renaming *** dst = r14 src_1 = r15 src_2 = r17 stride = r16 // *** Saving old Loop-Counter (LC) and Predicate Registers (PR) *** .save ar.lc, oldLC mov oldLC = ar.lc mov oldPR = pr .body // *** Allocating new stackframe, define rotating registers *** alloc r9 = ar.pfs, 4, 92, 0, 96 // *** Saving Parameters *** mov dst = r32 mov src_1 = r33 add src_2 = 8, r33 mov stride = r34 // *** init loop: set loop counter, epilog counter, predicates *** mov ar.lc = 7 mov ar.ec = LL + PL + 1 mov pr.rot = 1 << 16 ;; // *** define register arrays and predicate array for software pipeline *** // src_v1 = source value 1, dst_v = destination value .rotr src_v1[LL+1], src_v2[LL+1], dst_v[PL+1] .rotp ld_stage[LL], pack_stage[PL], st_stage[1] // Software pipelined loop: // Stage 1: Load two 8-byte-words of 4 x 16 bit signed source data // Stage 2: Pack them together to
one 8 byte 8 x 8 bit unsigned data // Stage 3: Store the 8 byte to the destination address and add stride to // destination address (to get the next 8 byte line of destination) .Loop_16to8copy: {.mmi (ld_stage[0]) ld8 src_v1[0] = [src_1], 16 (ld_stage[0]) ld8 src_v2[0] = [src_2], 16 (pack_stage[0]) pack2.uss dst_v[0] = src_v1[LL], src_v2[LL] } {.mib (st_stage[0]) st8 [dst] = dst_v[PL] (st_stage[0]) add dst = dst, stride br.ctop.sptk.few .Loop_16to8copy ;; } // *** Restore old LC and PRs *** mov ar.lc = oldLC mov pr = oldPR, -1 br.ret.sptk.many b0 .endp transfer_16to8copy_ia64# /////////////////////////////////////////////////////////////////////////////// // // transfer_16to8add_ia64 // // The 8-bit values of dst are "unpacked" into two 8-byte blocks containing // 16-bit values. These are "parallel-added" to the values of src. The result is // converted into 8-bit values using "PACK" and stored at the address of dst. // We assume that there is no misalignment. // /////////////////////////////////////////////////////////////////////////////// .align 16 .global transfer_16to8add_ia64# .proc transfer_16to8add_ia64# transfer_16to8add_ia64: .prologue // *** register renaming *** dst = r14 src = r15 stride = r16 _src = r17 // *** Saving old Loop-Counter (LC) and Predicate Registers (PR) *** .save ar.lc, r2 mov oldLC = ar.lc mov oldPR = pr .body // *** Allocating new stackframe, initialize LC, Epilogue-Counter and PR *** alloc r9 = ar.pfs, 4, 92, 0, 96 // *** Saving Parameters *** mov dst = r32 mov src = r33 mov stride = r34 add _src = 8, r33 // *** init loop: set loop counter, epilog counter, predicates *** mov ar.lc = 7 mov ar.ec = LL + UL + PAL + PL + 1 mov pr.rot = 1 << 16 ;; // *** define register arrays and predicate array for software pipeline *** .rotr _dst[LL+UL+PAL+PL+1], dst8[PL+1], pixel_1[PAL+1], pixel_2[PAL+1], w_dst16_1[UL+1], w_src_1[LL+UL+1], w_dst16_2[UL+1], w_src_2[LL+UL+1], w_dst8[LL+1] .rotp s1_p[LL], s2_p[UL], s3_p[PAL], s4_p[PL], s5_p[1] // Software
pipelined loop: // s1_p: The values of src and dst are loaded // s2_p: The dst values are converted to 16-bit values // s3_p: The values of src and dst are added // s4_p: The results are packed into 8-bit values // s5_p: The 8-bit values are stored at the dst addresses .Loop_16to8add: {.mii (s1_p[0]) ld8 w_src_1[0] = [src], 16 // load the 1st half of row j of src (i = 0..3) (s1_p[0]) mov _dst[0] = dst // save the current dst address for the store stage (s3_p[0]) padd2.sss pixel_1[0] = w_dst16_1[UL], w_src_1[LL+UL] // parallel addition of src and dst } {.mii (s1_p[0]) ld8 w_dst8[0] = [dst], stride // load row j of dst (s2_p[0]) unpack1.l w_dst16_1[0] = r0, w_dst8[LL]; // convert dst to 16 bit for i = 0..3 (s2_p[0]) unpack1.h w_dst16_2[0] = r0, w_dst8[LL]; // convert dst to 16 bit for i = 4..7 } {.mii (s1_p[0]) ld8 w_src_2[0] = [_src], 16 // load the 2nd half of row j of src (i = 4..7) (s3_p[0]) padd2.sss pixel_2[0] = w_dst16_2[UL], w_src_2[LL+UL] // parallel addition of src and dst (s4_p[0]) pack2.uss dst8[0] = pixel_1[PAL], pixel_2[PAL] // convert the sums (pixels) to 8-bit values; the value range is checked automatically (saturation) } {.mmb (s5_p[0]) st8 [_dst[LL+UL+PAL+PL]] = dst8[PL] // store dst (s1_p[0]) nop.m 0 br.ctop.sptk.few .Loop_16to8add ;; } // *** Restore old LC and PRs *** mov ar.lc = oldLC mov pr = oldPR, -1 br.ret.sptk.many b0 .endp transfer_16to8add_ia64# /////////////////////////////////////////////////////////////////////////////// // // transfer_8to16sub_ia64 // // The 8-bit values of ref and cur are loaded and converted to 16-bit. The // difference of cur and ref is stored at the dct addresses and ref is copied // into the cur array. // // You must assume that the data addressed by 'ref' are misaligned in memory. // But you can assume that the other data are aligned (at least I hope so).
// /////////////////////////////////////////////////////////////////////////////// .align 16 .global transfer_8to16sub_ia64# .proc transfer_8to16sub_ia64# transfer_8to16sub_ia64: .prologue // *** register renaming *** oldLC = r2 oldPR = r3 zero = r0 // the constant zero (0) // The following registers get the same names as the variables in the C reference code dct = r14 cur = r15 ref = r34 // does not need to be saved separately, so the parameter register stays in this list stride = r16 offset = r17 // offset of the misaligned data, used to shift it into place aoffset = r18 // counterpart of offset ref_a1 = r19 // address of the first 64-bit block of ref ref_a2 = r20 // address of the second 64-bit block of ref _dct = r21 // register for the destination addresses of the 2nd dct block // *** Saving old Loop-Counter (LC) and Predicate Registers (PR) *** .save ar.lc, r2 mov oldLC = ar.lc mov oldPR = pr .body // *** Allocating new stackframe, define rotating registers *** alloc r9 = ar.pfs, 4, 92, 0, 96 // *** Saving Parameters *** mov dct = r32 mov cur = r33 // mov ref = r34: ref is unaligned, get aligned ref below... mov stride = r35 and ref_a1 = -8, ref // compute the address of the first 64-bit block containing ref (rounds down to a multiple of 8) dep offset = ref, zero, 3, 3 ;; add ref_a2 = 8, ref_a1 sub aoffset = 64, offset // compute the counterpart of offset add _dct = 8, dct // compute the address of the 2nd dct block, 8 bytes (= 64 bits) above the 1st
block // *** init loop: set loop counter, epilog counter, predicates *** mov ar.lc = 7 mov ar.ec = LL + SHL + OL + UL + PSL + 1 mov pr.rot = 1 << 16 ;; // *** define register arrays and predicate array for software pipeline *** .rotr c[LL+1], ref_v1[LL+1], ref_v2[LL+1], c16_1[SHL+OL+UL+1], c16_2[SHL+OL+UL+1], ref_shdr[SHL+1], ref_shdl[SHL+1], r[OL+1], r16_1[UL+1], r16_2[UL+1], dct_1[PSL+1], dct_2[PSL+1], _cur[LL+SHL+OL+UL+1] .rotp s1_p[LL], s2_p[SHL], s3_p[OL], s4_p[UL], s5_p[PSL], s6_p[1] // Software pipelined loop: // s1_p: The values of ref and cur are loaded, and the cur address is saved. // s2_p: cur is converted to 16-bit and the misaligned values of ref are // shifted... // s3_p: ... and joined together. // s4_p: This ref value is converted to 16-bit. The ref values are stored // at the cur addresses. // s5_p: The ref values are subtracted from the cur values... // s6_p: ...and the result is stored at the dct addresses. loop_8to16sub: {.mii (s1_p[0]) ld8 ref_v1[0] = [ref_a1], stride // load the 1st 64-bit block containing part of the ref data (s1_p[0]) mov _cur[0] = cur // save the cur address for later use (s2_p[0]) shr.u ref_shdr[0] = ref_v1[LL], offset // shift the right half into place } {.mii (s1_p[0]) ld8 ref_v2[0] = [ref_a2], stride // load the 2nd 64-bit block (s2_p[0]) shl ref_shdl[0] = ref_v2[LL], aoffset // shift the left half into place (s3_p[0]) or r[0] = ref_shdr[SHL], ref_shdl[SHL] // join the shifted parts in r } {.mii (s1_p[0]) ld8 c[0] = [cur], stride // load the complete
row j of cur (s2_p[0]) unpack1.l c16_1[0] = zero, c[LL]; // convert c to 16 bit for i = 0..3 (s2_p[0]) unpack1.h c16_2[0] = zero, c[LL]; // convert c to 16 bit for i = 4..7 } {.mii (s4_p[0]) st8 [_cur[LL+SHL+OL]] = r[OL] // set cur to the value of r // Convert the 8-bit r and c values to 16-bit values (s4_p[0]) unpack1.l r16_1[0] = zero, r[OL]; // convert r to 16 bit for i = 0..3 (s4_p[0]) unpack1.h r16_2[0] = zero, r[OL]; // convert r to 16 bit for i = 4..7 } {.mii (s5_p[0]) psub2.sss dct_1[0] = c16_1[SHL+OL+UL], r16_1[UL] // subtract the 1st half of row j (s5_p[0]) psub2.sss dct_2[0] = c16_2[SHL+OL+UL], r16_2[UL] // subtract the 2nd half } {.mmb (s6_p[0]) st8 [dct] = dct_1[PSL], 16 // store the 1st 64-bit block at its destination address, then advance the address by 16 bytes for the next value (s6_p[0]) st8 [_dct] = dct_2[PSL], 16 // store the 2nd 64-bit block at its destination address, then advance the address by 16 bytes for the next value br.ctop.sptk.few loop_8to16sub // loop ;; } // *** Restore old LC and PRs *** mov ar.lc = oldLC mov pr = oldPR, -1 br.ret.sptk.many b0 .endp transfer_8to16sub_ia64# /////////////////////////////////////////////////////////////////////////////// // // transfer_8to16sub2_ia64 // // At the time this function was written, it was not yet in use. // We assume that the values of ref1/2 are misaligned. // // The values of ref1/2 and cur are loaded; the ref values need misalignment // treatment. The values are converted to 16-bit using unpack. The average of // ref1 and ref2 is computed with pavg and subtracted from cur. The results are // stored at the dct addresses. // pavg1.raz is used to get the same results as the C function.
// /////////////////////////////////////////////////////////////////////////////// .text .align 16 .global transfer_8to16sub2_ia64# .proc transfer_8to16sub2_ia64# transfer_8to16sub2_ia64: .prologue // *** register renaming *** // We've tried to keep the C-Code names wherever possible, at least as // part of register-names oldLC = r2 oldPR = r3 zero = r0 dct_al = r14 // dct: address of left block in one line dct_ar = r15 // dct: address of right block in one line cur = r16 ref1_al = r17 // ref1: aligned address of lower part ref1_ah = r18 // ref1: aligned address of higher part ref2_al = r19 // ref2: aligned address of lower part ref2_ah = r20 // ref2: aligned address of higher part stride = r21 offset_1 = r22 offset_2 = r23 aoffset_1 = r24 aoffset_2 = r25 // *** Saving old Loop-Counter (LC) and Predicate Registers (PR) *** .save ar.lc, r2 mov oldLC = ar.lc mov oldPR = pr .body // *** Saving Parameters *** // *** (as input registers r32+ are needed for register-rotation) *** mov dct_ar = r32 add dct_al = 8, r32 mov cur = r33 and ref1_al = -8, r34 and ref2_al = -8, r35 // ref2 aligned address of lower part mov stride = r36 // *** Calculations for Misalignment-Handling *** dep offset_1 = r34, zero, 3, 3 dep offset_2 = r35, zero, 3, 3 ;; add ref1_ah = 8, ref1_al add ref2_ah = 8, ref2_al sub aoffset_1 = 64, offset_1 sub aoffset_2 = 64, offset_2 ;; // *** Allocating new stackframe, define rotating registers *** alloc r9 = ar.pfs, 5, 91, 0, 96 // *** init loop: set loop counter, epilog counter, predicates *** mov ar.lc = 7 mov ar.ec = LL + SHL + OL + PAVGL + UL +PSL + 1 mov pr.rot = 1 << 16 ;; // *** define register arrays and predicate array for software pipeline *** .rotr ref1_vl[LL+1], ref1_vh[LL+1], ref2_vl[LL+1], ref2_vh[LL+1], c[LL+SHL+OL+PAVGL+1], ref1_l[SHL+1], ref1_h[SHL+1], ref2_l[SHL+1], ref2_h[SHL+1], ref1_aligned[OL+1], ref2_aligned[OL+1], r[PAVGL+1], r16_l[UL+1], r16_r[UL+1], c16_l[UL+1], c16_r[UL+1], dct16_l[PSL+1], dct16_r[PSL+1] .rotp ld_stage[LL], 
sh_stage[SHL], or_stage[OL], pavg_stage[PAVGL], up_stage[UL], psub_stage[PSL], st_stage[1] // software pipelined loop: // ld_stage: The values of ref1, ref2, cur are loaded // sh_stage: The misaligned values of ref1/2 are shifted... // or_stage: ...and copied together. // pavg_stage: The average of ref1 and ref2 is computed. // up_stage: The result and the cur-values are converted to 16-bit. // psub_stage: Those values are subtracted... // st_stage: ...and stored at the dct-addresses. .Loop_8to16sub2: {.mii (ld_stage[0]) ld8 c[0] = [cur], stride (sh_stage[0]) shr.u ref1_l[0] = ref1_vl[LL], offset_1 (sh_stage[0]) shl ref1_h[0] = ref1_vh[LL], aoffset_1 } {.mii (ld_stage[0]) ld8 ref1_vl[0] = [ref1_al], stride (sh_stage[0]) shr.u ref2_l[0] = ref2_vl[LL], offset_2 (sh_stage[0]) shl ref2_h[0] = ref2_vh[LL], aoffset_2 } {.mii (ld_stage[0]) ld8 ref1_vh[0] = [ref1_ah], stride (or_stage[0]) or ref1_aligned[0] = ref1_h[SHL], ref1_l[SHL] (or_stage[0]) or ref2_aligned[0] = ref2_h[SHL], ref2_l[SHL] } {.mii (ld_stage[0]) ld8 ref2_vl[0] = [ref2_al], stride (pavg_stage[0]) pavg1.raz r[0] = ref1_aligned[OL], ref2_aligned[OL] (up_stage[0]) unpack1.l r16_r[0] = zero, r[PAVGL] } {.mii (ld_stage[0]) ld8 ref2_vh[0] = [ref2_ah], stride (up_stage[0]) unpack1.h r16_l[0] = zero, r[PAVGL] (up_stage[0]) unpack1.l c16_r[0] = zero, c[LL+SHL+OL+PAVGL] } {.mii (st_stage[0]) st8 [dct_ar] = dct16_r[PSL], 16 (up_stage[0]) unpack1.h c16_l[0] = zero, c[LL+SHL+OL+PAVGL] (psub_stage[0]) psub2.sss dct16_l[0] = c16_l[UL], r16_l[UL] } {.mib (st_stage[0]) st8 [dct_al] = dct16_l[PSL], 16 (psub_stage[0]) psub2.sss dct16_r[0] = c16_r[UL], r16_r[UL] br.ctop.sptk.few .Loop_8to16sub2 // branch back to the loop top ;; } // *** Restore old LC and PRs *** mov ar.lc = oldLC mov pr = oldPR, -1 br.ret.sptk.many b0 .endp transfer_8to16sub2_ia64# xvidcore/src/utils/mem_transfer.c0000664000076500007650000001605611564705453020262 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID 
MPEG-4 VIDEO CODEC * - 8bit<->16bit transfer - * * Copyright(C) 2001-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mem_transfer.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "../global.h" #include "mem_transfer.h" /* Function pointers - Initialized in the xvid.c module */ TRANSFER_8TO16COPY_PTR transfer_8to16copy; TRANSFER_16TO8COPY_PTR transfer_16to8copy; TRANSFER_8TO16SUB_PTR transfer_8to16sub; TRANSFER_8TO16SUBRO_PTR transfer_8to16subro; TRANSFER_8TO16SUB2_PTR transfer_8to16sub2; TRANSFER_8TO16SUB2RO_PTR transfer_8to16sub2ro; TRANSFER_16TO8ADD_PTR transfer_16to8add; TRANSFER8X8_COPY_PTR transfer8x8_copy; TRANSFER8X4_COPY_PTR transfer8x4_copy; #define USE_REFERENCE_C /***************************************************************************** * * All these functions are used to transfer data from an 8 bit data array * to a 16 bit data array. * * This is typically used during motion compensation, that's why some * functions also do the addition/subtraction of another buffer during the * so-called transfer. 
 * ****************************************************************************/ /* * SRC - the source buffer * DST - the destination buffer * * Then the function does the 8->16 bit transfer and this series of operations : * * SRC (8bit) = SRC * DST (16bit) = SRC */ void transfer_8to16copy_c(int16_t * const dst, const uint8_t * const src, uint32_t stride) { int i, j; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { dst[j * 8 + i] = (int16_t) src[j * stride + i]; } } } /* * SRC - the source buffer * DST - the destination buffer * * Then the function does the 16->8 bit transfer and this series of operations : * * SRC (16bit) = SRC * DST (8bit) = max(min(SRC, 255), 0) */ void transfer_16to8copy_c(uint8_t * const dst, const int16_t * const src, uint32_t stride) { int i, j; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { #ifdef USE_REFERENCE_C int16_t pixel = src[j * 8 + i]; if (pixel < 0) { pixel = 0; } else if (pixel > 255) { pixel = 255; } dst[j * stride + i] = (uint8_t) pixel; #else const int16_t pixel = src[j * 8 + i]; const uint8_t value = (uint8_t)( (pixel&~255) ? 
(-pixel)>>(8*sizeof(pixel)-1) : pixel ); dst[j*stride + i] = value; #endif } } } /* * C - the current buffer * R - the reference buffer * DCT - the dct coefficient buffer * * Then the function does the 8->16 bit transfer and this series of operations : * * R (8bit) = R * DCT (16bit) = C - R * C (8bit) = R */ void transfer_8to16sub_c(int16_t * const dct, uint8_t * const cur, const uint8_t * ref, const uint32_t stride) { int i, j; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { const uint8_t c = cur[j * stride + i]; const uint8_t r = ref[j * stride + i]; cur[j * stride + i] = r; dct[j * 8 + i] = (int16_t) c - (int16_t) r; } } } void transfer_8to16subro_c(int16_t * const dct, const uint8_t * const cur, const uint8_t * ref, const uint32_t stride) { int i, j; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { const uint8_t c = cur[j * stride + i]; const uint8_t r = ref[j * stride + i]; dct[j * 8 + i] = (int16_t) c - (int16_t) r; } } } /* * C - the current buffer * R1 - the 1st reference buffer * R2 - the 2nd reference buffer * DCT - the dct coefficient buffer * * Then the function does the 8->16 bit transfer and this series of operations : * * R1 (8bit) = R1 * R2 (8bit) = R2 * R (temp) = min((R1 + R2 + 1)/2, 255) * DCT (16bit)= C - R * C (8bit) = R */ void transfer_8to16sub2_c(int16_t * const dct, uint8_t * const cur, const uint8_t * ref1, const uint8_t * ref2, const uint32_t stride) { uint32_t i, j; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { const uint8_t c = cur[j * stride + i]; const uint8_t r = (ref1[j * stride + i] + ref2[j * stride + i] + 1) >> 1; cur[j * stride + i] = r; dct[j * 8 + i] = (int16_t) c - (int16_t) r; } } } void transfer_8to16sub2ro_c(int16_t * const dct, const uint8_t * const cur, const uint8_t * ref1, const uint8_t * ref2, const uint32_t stride) { uint32_t i, j; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { const uint8_t c = cur[j * stride + i]; const uint8_t r = (ref1[j * stride + i] + ref2[j * stride + i] + 1) >> 1; dct[j * 8 + i] = 
(int16_t) c - (int16_t) r; } } } /* * SRC - the source buffer * DST - the destination buffer * * Then the function does the 16->8 bit transfer and this series of operations : * * SRC (16bit) = SRC * DST (8bit) = max(min(DST+SRC, 255), 0) */ void transfer_16to8add_c(uint8_t * const dst, const int16_t * const src, uint32_t stride) { int i, j; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { #ifdef USE_REFERENCE_C int16_t pixel = (int16_t) dst[j * stride + i] + src[j * 8 + i]; if (pixel < 0) { pixel = 0; } else if (pixel > 255) { pixel = 255; } dst[j * stride + i] = (uint8_t) pixel; #else const int16_t pixel = (int16_t) dst[j * stride + i] + src[j * 8 + i]; const uint8_t value = (uint8_t)( (pixel&~255) ? (-pixel)>>(8*sizeof(pixel)-1) : pixel ); dst[j*stride + i] = value; #endif } } } /* * SRC - the source buffer * DST - the destination buffer * * Then the function does the 8->8 bit transfer and this series of operations : * * SRC (8bit) = SRC * DST (8bit) = SRC */ void transfer8x8_copy_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride) { int j, i; for (j = 0; j < 8; ++j) { uint8_t *d = dst + j*stride; const uint8_t *s = src + j*stride; for (i = 0; i < 8; ++i) { *d++ = *s++; } } } /* * SRC - the source buffer * DST - the destination buffer * * Then the function does the 8->8 bit transfer and this series of operations : * * SRC (8bit) = SRC * DST (8bit) = SRC */ void transfer8x4_copy_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride) { uint32_t j; for (j = 0; j < 4; j++) { uint32_t *d= (uint32_t*)(dst + j*stride); const uint32_t *s = (const uint32_t*)(src + j*stride); *(d+0) = *(s+0); *(d+1) = *(s+1); } } xvidcore/src/utils/emms.c0000664000076500007650000000323511564705453016534 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - emms C wrapper - * * Copyright(C) 2002 Michael Militzer * * This program is free software ; you can redistribute it 
and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: emms.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "emms.h" #include "../portab.h" /***************************************************************************** * Library data, declared here ****************************************************************************/ emmsFuncPtr emms; /***************************************************************************** * emms functions * * emms functions are used to restore the fpu context after mmx operations * because mmx and fpu share their registers/context * ****************************************************************************/ /* The no op wrapper for non MMX platforms */ void emms_c(void) { } xvidcore/src/utils/x86_asm/0000775000076500007650000000000011566427763016720 5ustar xvidbuildxvidbuildxvidcore/src/utils/x86_asm/mem_transfer_3dne.asm0000664000076500007650000002523011254216113022772 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - 8<->16 bit transfer functions - ; * ; * Copyright (C) 2002 Jaan Kalda ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either 
version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: mem_transfer_3dne.asm,v 1.13 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ ; these 3dne functions are compatible with iSSE, but are optimized specifically ; for K7 pipelines %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN mm_zero: dd 0,0 ;============================================================================= ; Macros ;============================================================================= %ifdef ARCH_IS_X86_64 %define nop4 %else %macro nop4 0 db 08Dh, 074h, 026h, 0 %endmacro %endif ;============================================================================= ; Code ;============================================================================= TEXT cglobal transfer_8to16copy_3dne cglobal transfer_16to8copy_3dne cglobal transfer_8to16sub_3dne cglobal transfer_8to16subro_3dne cglobal transfer_8to16sub2_3dne cglobal transfer_16to8add_3dne cglobal transfer8x8_copy_3dne cglobal transfer8x4_copy_3dne ;----------------------------------------------------------------------------- ; ; void transfer_8to16copy_3dne(int16_t * const dst, ; const uint8_t * const src, ; uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN 
transfer_8to16copy_3dne: mov _EAX, prm2 ; Src mov TMP1, prm3 ; Stride mov TMP0, prm1 ; Dst punpcklbw mm0, [byte _EAX] punpcklbw mm1, [_EAX+4] movq mm2, [_EAX+TMP1] movq mm3, [_EAX+TMP1] pxor mm7, mm7 lea _EAX, [_EAX+2*TMP1] punpcklbw mm2, mm7 punpckhbw mm3, mm7 psrlw mm0, 8 psrlw mm1, 8 punpcklbw mm4, [_EAX] punpcklbw mm5, [_EAX+TMP1+4] movq [byte TMP0+0*64], mm0 movq [TMP0+0*64+8], mm1 punpcklbw mm6, [_EAX+TMP1] punpcklbw mm7, [_EAX+4] lea _EAX, [byte _EAX+2*TMP1] psrlw mm4, 8 psrlw mm5, 8 punpcklbw mm0, [_EAX] punpcklbw mm1, [_EAX+TMP1+4] movq [TMP0+0*64+16], mm2 movq [TMP0+0*64+24], mm3 psrlw mm6, 8 psrlw mm7, 8 punpcklbw mm2, [_EAX+TMP1] punpcklbw mm3, [_EAX+4] lea _EAX, [byte _EAX+2*TMP1] movq [byte TMP0+0*64+32], mm4 movq [TMP0+0*64+56], mm5 psrlw mm0, 8 psrlw mm1, 8 punpcklbw mm4, [_EAX] punpcklbw mm5, [_EAX+TMP1+4] movq [byte TMP0+0*64+48], mm6 movq [TMP0+0*64+40], mm7 psrlw mm2, 8 psrlw mm3, 8 punpcklbw mm6, [_EAX+TMP1] punpcklbw mm7, [_EAX+4] movq [byte TMP0+1*64], mm0 movq [TMP0+1*64+24], mm1 psrlw mm4, 8 psrlw mm5, 8 movq [TMP0+1*64+16], mm2 movq [TMP0+1*64+8], mm3 psrlw mm6, 8 psrlw mm7, 8 movq [byte TMP0+1*64+32], mm4 movq [TMP0+1*64+56], mm5 movq [byte TMP0+1*64+48], mm6 movq [TMP0+1*64+40], mm7 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_16to8copy_3dne(uint8_t * const dst, ; const int16_t * const src, ; uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN transfer_16to8copy_3dne: mov _EAX, prm2 ; Src mov TMP0, prm1 ; Dst mov TMP1, prm3 ; Stride movq mm0, [byte _EAX+0*32] packuswb mm0, [_EAX+0*32+8] movq mm1, [_EAX+0*32+16] packuswb mm1, [_EAX+0*32+24] movq mm5, [_EAX+2*32+16] movq mm2, [_EAX+1*32] packuswb mm2, [_EAX+1*32+8] movq mm3, [_EAX+1*32+16] packuswb mm3, [_EAX+1*32+24] movq mm6, [_EAX+3*32] movq mm4, [_EAX+2*32] packuswb mm4, [_EAX+2*32+8] packuswb mm5, [_EAX+2*32+24] movq mm7, [_EAX+3*32+16] packuswb mm7, 
[_EAX+3*32+24] packuswb mm6, [_EAX+3*32+8] movq [TMP0], mm0 lea _EAX, [3*TMP1] add _EAX, TMP0 movq [TMP0+TMP1], mm1 movq [TMP0+2*TMP1], mm2 movq [byte _EAX], mm3 movq [TMP0+4*TMP1], mm4 lea TMP0, [byte TMP0+4*TMP1] movq [_EAX+2*TMP1], mm5 movq [_EAX+4*TMP1], mm7 movq [TMP0+2*TMP1], mm6 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_8to16sub_3dne(int16_t * const dct, ; uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ; when second argument == 1, reference (TMP0) block is copied to current (_EAX) %macro COPY_8_TO_16_SUB 2 movq mm1, [_EAX] ; cur movq mm0, mm1 movq mm4, [TMP0] ; ref movq mm6, mm4 %if %2 == 1 movq [_EAX], mm4 %endif punpckhbw mm1, mm7 punpckhbw mm6, mm7 punpcklbw mm4, mm7 ALIGN SECTION_ALIGN movq mm2, [byte _EAX+TMP1] punpcklbw mm0, mm7 movq mm3, [byte _EAX+TMP1] punpcklbw mm2, mm7 movq mm5, [byte TMP0+TMP1] ; ref punpckhbw mm3, mm7 %if %2 == 1 movq [byte _EAX+TMP1], mm5 %endif psubsw mm1, mm6 movq mm6, mm5 psubsw mm0, mm4 %if (%1 < 3) lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] %else mov TMP0,[_ESP] add _ESP,byte PTR_SIZE %endif movq [_EDI+%1*32+ 8], mm1 movq [byte _EDI+%1*32+ 0], mm0 ; dst punpcklbw mm5, mm7 punpckhbw mm6, mm7 psubsw mm2, mm5 psubsw mm3, mm6 movq [_EDI+%1*32+16], mm2 movq [_EDI+%1*32+24], mm3 %endmacro ALIGN SECTION_ALIGN transfer_8to16sub_3dne: mov _EAX, prm2 ; Cur mov TMP0, prm3 ; Ref mov TMP1, prm4 ; Stride push _EDI %ifdef ARCH_IS_X86_64 mov _EDI, prm1 %else mov _EDI, [_ESP+4+4] ; Dst %endif pxor mm7, mm7 nop ALIGN SECTION_ALIGN COPY_8_TO_16_SUB 0, 1 COPY_8_TO_16_SUB 1, 1 COPY_8_TO_16_SUB 2, 1 COPY_8_TO_16_SUB 3, 1 mov _EDI, TMP0 ret ENDFUNC ALIGN SECTION_ALIGN transfer_8to16subro_3dne: mov _EAX, prm2 ; Cur mov TMP0, prm3 ; Ref mov TMP1, prm4 ; Stride push _EDI %ifdef ARCH_IS_X86_64 mov _EDI, prm1 %else mov _EDI, [_ESP+4+ 4] ; Dst %endif pxor mm7, mm7 nop ALIGN 
SECTION_ALIGN COPY_8_TO_16_SUB 0, 0 COPY_8_TO_16_SUB 1, 0 COPY_8_TO_16_SUB 2, 0 COPY_8_TO_16_SUB 3, 0 mov _EDI, TMP0 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_8to16sub2_3dne(int16_t * const dct, ; uint8_t * const cur, ; const uint8_t * ref1, ; const uint8_t * ref2, ; const uint32_t stride) ; ;----------------------------------------------------------------------------- %macro COPY_8_TO_16_SUB2_SSE 1 db 0Fh, 6Fh, 44h, 20h, 00 ;movq mm0, [byte _EAX] ; cur punpcklbw mm0, mm7 movq mm2, [byte _EAX+TMP1] punpcklbw mm2, mm7 db 0Fh, 6Fh, 4ch, 20h, 00 ;movq mm1, [byte _EAX] punpckhbw mm1, mm7 movq mm3, [byte _EAX+TMP1] punpckhbw mm3, mm7 movq mm4, [byte _EBX] ; ref1 pavgb mm4, [byte _ESI] ; ref2 movq [_EAX], mm4 movq mm5, [_EBX+TMP1] ; ref pavgb mm5, [_ESI+TMP1] ; ref2 movq [_EAX+TMP1], mm5 movq mm6, mm4 punpcklbw mm4, mm7 punpckhbw mm6, mm7 %if (%1 < 3) lea _ESI,[_ESI+2*TMP1] lea _EBX,[byte _EBX+2*TMP1] lea _EAX,[_EAX+2*TMP1] %else mov _ESI,[_ESP] mov _EBX,[_ESP+PTR_SIZE] add _ESP,byte 2*PTR_SIZE %endif psubsw mm0, mm4 psubsw mm1, mm6 movq mm6, mm5 punpcklbw mm5, mm7 punpckhbw mm6, mm7 psubsw mm2, mm5 psubsw mm3, mm6 movq [byte TMP0+%1*32+ 0], mm0 ; dst movq [TMP0+%1*32+ 8], mm1 movq [TMP0+%1*32+16], mm2 movq [TMP0+%1*32+24], mm3 %endmacro ALIGN SECTION_ALIGN transfer_8to16sub2_3dne: mov TMP1d, prm5d ; Stride mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Cur push _EBX lea _EBP,[byte _EBP] %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref1 %endif push _ESI pxor mm7, mm7 %ifdef ARCH_IS_X86_64 mov _ESI, prm4 %else mov _ESI, [_ESP+8+16] ; Ref2 %endif nop4 COPY_8_TO_16_SUB2_SSE 0 COPY_8_TO_16_SUB2_SSE 1 COPY_8_TO_16_SUB2_SSE 2 COPY_8_TO_16_SUB2_SSE 3 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_16to8add_3dne(uint8_t * const dst, ; const int16_t * const src, ; uint32_t stride); ; 
;----------------------------------------------------------------------------- %macro COPY_16_TO_8_ADD 1 movq mm0, [byte TMP0] punpcklbw mm0, mm7 movq mm2, [byte TMP0+TMP1] punpcklbw mm2, mm7 movq mm1, [byte TMP0] punpckhbw mm1, mm7 movq mm3, [byte TMP0+TMP1] punpckhbw mm3, mm7 paddsw mm0, [byte _EAX+%1*32+ 0] paddsw mm1, [_EAX+%1*32+ 8] paddsw mm2, [_EAX+%1*32+16] paddsw mm3, [_EAX+%1*32+24] packuswb mm0, mm1 packuswb mm2, mm3 mov _ESP, _ESP movq [byte TMP0], mm0 movq [TMP0+TMP1], mm2 %endmacro ALIGN SECTION_ALIGN transfer_16to8add_3dne: mov TMP0, prm1 ; Dst mov TMP1, prm3 ; Stride mov _EAX, prm2 ; Src pxor mm7, mm7 nop COPY_16_TO_8_ADD 0 lea TMP0,[byte TMP0+2*TMP1] COPY_16_TO_8_ADD 1 lea TMP0,[byte TMP0+2*TMP1] COPY_16_TO_8_ADD 2 lea TMP0,[byte TMP0+2*TMP1] COPY_16_TO_8_ADD 3 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer8x8_copy_3dne(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride); ; ; ;----------------------------------------------------------------------------- %macro COPY_8_TO_8 0 movq mm0, [byte _EAX] movq mm1, [_EAX+TMP1] movq [byte TMP0], mm0 lea _EAX,[byte _EAX+2*TMP1] movq [TMP0+TMP1], mm1 %endmacro ALIGN SECTION_ALIGN transfer8x8_copy_3dne: mov _EAX, prm2 ; Src mov TMP1, prm3 ; Stride mov TMP0, prm1 ; Dst COPY_8_TO_8 lea TMP0,[byte TMP0+2*TMP1] COPY_8_TO_8 lea TMP0,[byte TMP0+2*TMP1] COPY_8_TO_8 lea TMP0,[byte TMP0+2*TMP1] COPY_8_TO_8 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer8x4_copy_3dne(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride); ; ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN transfer8x4_copy_3dne: mov _EAX, prm2 ; Src mov TMP1, prm3 ; Stride mov TMP0, prm1 ; Dst COPY_8_TO_8 lea TMP0,[byte TMP0+2*TMP1] COPY_8_TO_8 ret ENDFUNC NON_EXEC_STACK 
xvidcore/src/utils/x86_asm/cpuid.asm0000664000076500007650000001227111254216113020504 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - CPUID check processors capabilities - ; * ; * Copyright (C) 2001-2008 Michael Militzer ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: cpuid.asm,v 1.19 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Constants ;============================================================================= %define CPUID_TSC 0x00000010 %define CPUID_MMX 0x00800000 %define CPUID_SSE 0x02000000 %define CPUID_SSE2 0x04000000 %define CPUID_SSE3 0x00000001 %define CPUID_SSE41 0x00080000 %define EXT_CPUID_3DNOW 0x80000000 %define EXT_CPUID_AMD_3DNOWEXT 0x40000000 %define EXT_CPUID_AMD_MMXEXT 0x00400000 ;;; NB: Make sure these defines match the ones defined in xvid.h %define XVID_CPU_MMX (1<< 0) %define XVID_CPU_MMXEXT (1<< 1) %define XVID_CPU_SSE (1<< 2) %define XVID_CPU_SSE2 (1<< 3) %define XVID_CPU_SSE3 (1<< 8) %define XVID_CPU_SSE41 (1<< 9) %define XVID_CPU_3DNOW (1<< 4) %define XVID_CPU_3DNOWEXT (1<< 5) %define 
XVID_CPU_TSC (1<< 6) ;============================================================================= ; Read only data ;============================================================================= ALIGN SECTION_ALIGN DATA vendorAMD: db "AuthenticAMD" ;============================================================================= ; Macros ;============================================================================= %macro CHECK_FEATURE 4 mov eax, %1 and eax, %4 neg eax sbb eax, eax and eax, %2 or %3, eax %endmacro ;============================================================================= ; Code ;============================================================================= %ifdef ARCH_IS_X86_64 %define XVID_PUSHFD pushfq %define XVID_POPFD popfq %else %define XVID_PUSHFD pushfd %define XVID_POPFD popfd %endif TEXT ; int check_cpu_feature(void) cglobal check_cpu_features check_cpu_features: push _EBX push _ESI push _EDI push _EBP sub _ESP, 12 ; Stack space for vendor name xor ebp, ebp ; CPUID command ? XVID_PUSHFD pop _EAX mov ecx, eax xor eax, 0x200000 push _EAX XVID_POPFD XVID_PUSHFD pop _EAX cmp eax, ecx jz near .cpu_quit ; no CPUID command -> exit ; get vendor string, used later xor eax, eax cpuid mov [_ESP], ebx ; vendor string mov [_ESP+4], edx mov [_ESP+8], ecx test eax, eax jz near .cpu_quit mov eax, 1 cpuid ; RDTSC command ? CHECK_FEATURE CPUID_TSC, XVID_CPU_TSC, ebp, edx ; MMX support ? CHECK_FEATURE CPUID_MMX, XVID_CPU_MMX, ebp, edx ; SSE support ? CHECK_FEATURE CPUID_SSE, (XVID_CPU_MMXEXT|XVID_CPU_SSE), ebp, edx ; SSE2 support? CHECK_FEATURE CPUID_SSE2, XVID_CPU_SSE2, ebp, edx ; SSE3 support? CHECK_FEATURE CPUID_SSE3, XVID_CPU_SSE3, ebp, ecx ; SSE41 support? CHECK_FEATURE CPUID_SSE41, XVID_CPU_SSE41, ebp, ecx ; extended functions? mov eax, 0x80000000 cpuid cmp eax, 0x80000000 jbe near .cpu_quit mov eax, 0x80000001 cpuid ; AMD cpu ? lea _ESI, [vendorAMD] lea _EDI, [_ESP] mov ecx, 12 cld repe cmpsb jnz .cpu_quit ; 3DNow! support ? 
CHECK_FEATURE EXT_CPUID_3DNOW, XVID_CPU_3DNOW, ebp, edx ; 3DNOW extended ? CHECK_FEATURE EXT_CPUID_AMD_3DNOWEXT, XVID_CPU_3DNOWEXT, ebp, edx ; extended MMX ? CHECK_FEATURE EXT_CPUID_AMD_MMXEXT, XVID_CPU_MMXEXT, ebp, edx .cpu_quit: mov eax, ebp add _ESP, 12 pop _EBP pop _EDI pop _ESI pop _EBX ret ENDFUNC ; sse/sse2 operating system support detection routines ; these will trigger an invalid instruction signal if not supported. ALIGN SECTION_ALIGN cglobal sse_os_trigger sse_os_trigger: xorps xmm0, xmm0 ret ENDFUNC ALIGN SECTION_ALIGN cglobal sse2_os_trigger sse2_os_trigger: xorpd xmm0, xmm0 ret ENDFUNC ; enter/exit mmx state ALIGN SECTION_ALIGN cglobal emms_mmx emms_mmx: emms ret ENDFUNC ; faster enter/exit mmx state ALIGN SECTION_ALIGN cglobal emms_3dn emms_3dn: femms ret ENDFUNC %ifdef ARCH_IS_X86_64 %ifdef WINDOWS cglobal prime_xmm prime_xmm: movdqa xmm6, [prm1] movdqa xmm7, [prm1+16] ret ENDFUNC cglobal get_xmm get_xmm: movdqa [prm1], xmm6 movdqa [prm1+16], xmm7 ret ENDFUNC %endif %endif NON_EXEC_STACK xvidcore/src/utils/x86_asm/mem_transfer_mmx.asm0000664000076500007650000003064211254216113022745 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - 8<->16 bit transfer functions - ; * ; * Copyright (C) 2001 Peter Ross ; * 2001-2008 Michael Militzer ; * 2002 Pascal Massimino ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: mem_transfer_mmx.asm,v 1.22 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN mmx_one: dw 1, 1, 1, 1 ;============================================================================= ; Code ;============================================================================= TEXT cglobal transfer_8to16copy_mmx cglobal transfer_16to8copy_mmx cglobal transfer_8to16sub_mmx cglobal transfer_8to16subro_mmx cglobal transfer_8to16sub2_mmx cglobal transfer_8to16sub2_xmm cglobal transfer_8to16sub2ro_xmm cglobal transfer_16to8add_mmx cglobal transfer8x8_copy_mmx cglobal transfer8x4_copy_mmx ;----------------------------------------------------------------------------- ; ; void transfer_8to16copy_mmx(int16_t * const dst, ; const uint8_t * const src, ; uint32_t stride); ; ;----------------------------------------------------------------------------- %macro COPY_8_TO_16 1 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] movq mm2, mm0 movq mm3, mm1 punpcklbw mm0, mm7 movq [TMP0+%1*32], mm0 punpcklbw mm1, mm7 movq [TMP0+%1*32+16], mm1 punpckhbw mm2, mm7 punpckhbw mm3, mm7 lea _EAX, [_EAX+2*TMP1] movq [TMP0+%1*32+8], mm2 movq [TMP0+%1*32+24], mm3 %endmacro ALIGN SECTION_ALIGN transfer_8to16copy_mmx: mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Src mov TMP1, prm3 ; Stride pxor mm7, mm7 COPY_8_TO_16 0 COPY_8_TO_16 1 COPY_8_TO_16 2 COPY_8_TO_16 3 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_16to8copy_mmx(uint8_t * const dst, ; const int16_t * 
const src, ; uint32_t stride); ; ;----------------------------------------------------------------------------- %macro COPY_16_TO_8 1 movq mm0, [_EAX+%1*32] movq mm1, [_EAX+%1*32+8] packuswb mm0, mm1 movq [TMP0], mm0 movq mm2, [_EAX+%1*32+16] movq mm3, [_EAX+%1*32+24] packuswb mm2, mm3 movq [TMP0+TMP1], mm2 %endmacro ALIGN SECTION_ALIGN transfer_16to8copy_mmx: mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Src mov TMP1, prm3 ; Stride COPY_16_TO_8 0 lea TMP0,[TMP0+2*TMP1] COPY_16_TO_8 1 lea TMP0,[TMP0+2*TMP1] COPY_16_TO_8 2 lea TMP0,[TMP0+2*TMP1] COPY_16_TO_8 3 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_8to16sub_mmx(int16_t * const dct, ; uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ; when second argument == 1, reference (_EBX) block is copied to current (_EAX) %macro COPY_8_TO_16_SUB 2 movq mm0, [_EAX] ; cur movq mm2, [_EAX+TMP1] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm7 punpcklbw mm2, mm7 movq mm4, [_EBX] ; ref punpckhbw mm1, mm7 punpckhbw mm3, mm7 movq mm5, [_EBX+TMP1] ; ref movq mm6, mm4 %if %2 == 1 movq [_EAX], mm4 movq [_EAX+TMP1], mm5 %endif punpcklbw mm4, mm7 punpckhbw mm6, mm7 psubsw mm0, mm4 psubsw mm1, mm6 movq mm6, mm5 punpcklbw mm5, mm7 punpckhbw mm6, mm7 psubsw mm2, mm5 lea _EAX, [_EAX+2*TMP1] psubsw mm3, mm6 lea _EBX,[_EBX+2*TMP1] movq [TMP0+%1*32+ 0], mm0 ; dst movq [TMP0+%1*32+ 8], mm1 movq [TMP0+%1*32+16], mm2 movq [TMP0+%1*32+24], mm3 %endmacro ALIGN SECTION_ALIGN transfer_8to16sub_mmx: mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Cur mov TMP1, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref %endif pxor mm7, mm7 COPY_8_TO_16_SUB 0, 1 COPY_8_TO_16_SUB 1, 1 COPY_8_TO_16_SUB 2, 1 COPY_8_TO_16_SUB 3, 1 pop _EBX ret ENDFUNC ALIGN SECTION_ALIGN transfer_8to16subro_mmx: mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Cur mov TMP1, prm4 ; Stride push _EBX %ifdef 
ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref %endif pxor mm7, mm7 COPY_8_TO_16_SUB 0, 0 COPY_8_TO_16_SUB 1, 0 COPY_8_TO_16_SUB 2, 0 COPY_8_TO_16_SUB 3, 0 pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_8to16sub2_mmx(int16_t * const dct, ; uint8_t * const cur, ; const uint8_t * ref1, ; const uint8_t * ref2, ; const uint32_t stride) ; ;----------------------------------------------------------------------------- %macro COPY_8_TO_16_SUB2_MMX 1 movq mm0, [_EAX] ; cur movq mm2, [_EAX+TMP1] ; mm4 <- (ref1+ref2+1) / 2 movq mm4, [_EBX] ; ref1 movq mm1, [_ESI] ; ref2 movq mm6, mm4 movq mm3, mm1 punpcklbw mm4, mm7 punpcklbw mm1, mm7 punpckhbw mm6, mm7 punpckhbw mm3, mm7 paddusw mm4, mm1 paddusw mm6, mm3 paddusw mm4, [mmx_one] paddusw mm6, [mmx_one] psrlw mm4, 1 psrlw mm6, 1 packuswb mm4, mm6 movq [_EAX], mm4 ; mm5 <- (ref1+ref2+1) / 2 movq mm5, [_EBX+TMP1] ; ref1 movq mm1, [_ESI+TMP1] ; ref2 movq mm6, mm5 movq mm3, mm1 punpcklbw mm5, mm7 punpcklbw mm1, mm7 punpckhbw mm6, mm7 punpckhbw mm3, mm7 paddusw mm5, mm1 paddusw mm6, mm3 paddusw mm5, [mmx_one] paddusw mm6, [mmx_one] lea _ESI, [_ESI+2*TMP1] psrlw mm5, 1 psrlw mm6, 1 packuswb mm5, mm6 movq [_EAX+TMP1], mm5 movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm7 punpcklbw mm2, mm7 punpckhbw mm1, mm7 punpckhbw mm3, mm7 movq mm6, mm4 punpcklbw mm4, mm7 punpckhbw mm6, mm7 psubsw mm0, mm4 psubsw mm1, mm6 movq mm6, mm5 punpcklbw mm5, mm7 punpckhbw mm6, mm7 psubsw mm2, mm5 lea _EAX, [_EAX+2*TMP1] psubsw mm3, mm6 lea _EBX, [_EBX+2*TMP1] movq [TMP0+%1*32+ 0], mm0 ; dst movq [TMP0+%1*32+ 8], mm1 movq [TMP0+%1*32+16], mm2 movq [TMP0+%1*32+24], mm3 %endmacro ALIGN SECTION_ALIGN transfer_8to16sub2_mmx: mov TMP0, prm1 ; Dst mov TMP1d, prm5d ; Stride mov _EAX, prm2 ; Cur push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref1 %endif push _ESI %ifdef ARCH_IS_X86_64 mov _ESI, prm4 %else mov _ESI, [_ESP+8+16] ; Ref2 %endif pxor mm7, 
mm7 COPY_8_TO_16_SUB2_MMX 0 COPY_8_TO_16_SUB2_MMX 1 COPY_8_TO_16_SUB2_MMX 2 COPY_8_TO_16_SUB2_MMX 3 pop _ESI pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_8to16sub2_xmm(int16_t * const dct, ; uint8_t * const cur, ; const uint8_t * ref1, ; const uint8_t * ref2, ; const uint32_t stride) ; ;----------------------------------------------------------------------------- %macro COPY_8_TO_16_SUB2_SSE 1 movq mm0, [_EAX] ; cur movq mm2, [_EAX+TMP1] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm7 punpcklbw mm2, mm7 movq mm4, [_EBX] ; ref1 pavgb mm4, [_ESI] ; ref2 movq [_EAX], mm4 punpckhbw mm1, mm7 punpckhbw mm3, mm7 movq mm5, [_EBX+TMP1] ; ref1 pavgb mm5, [_ESI+TMP1] ; ref2 movq [_EAX+TMP1], mm5 movq mm6, mm4 punpcklbw mm4, mm7 punpckhbw mm6, mm7 psubsw mm0, mm4 psubsw mm1, mm6 lea _ESI, [_ESI+2*TMP1] movq mm6, mm5 punpcklbw mm5, mm7 punpckhbw mm6, mm7 psubsw mm2, mm5 lea _EAX, [_EAX+2*TMP1] psubsw mm3, mm6 lea _EBX, [_EBX+2*TMP1] movq [TMP0+%1*32+ 0], mm0 ; dst movq [TMP0+%1*32+ 8], mm1 movq [TMP0+%1*32+16], mm2 movq [TMP0+%1*32+24], mm3 %endmacro ALIGN SECTION_ALIGN transfer_8to16sub2_xmm: mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Cur mov TMP1d, prm5d ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 ; Ref1 %else mov _EBX, [_ESP+4+12] ; Ref1 %endif push _ESI %ifdef ARCH_IS_X86_64 mov _ESI, prm4 ; Ref2 %else mov _ESI, [_ESP+8+16] ; Ref2 %endif pxor mm7, mm7 COPY_8_TO_16_SUB2_SSE 0 COPY_8_TO_16_SUB2_SSE 1 COPY_8_TO_16_SUB2_SSE 2 COPY_8_TO_16_SUB2_SSE 3 pop _ESI pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_8to16sub2ro_xmm(int16_t * const dct, ; const uint8_t * const cur, ; const uint8_t * ref1, ; const uint8_t * ref2, ; const uint32_t stride) ; ;----------------------------------------------------------------------------- %macro COPY_8_TO_16_SUB2RO_SSE 1 movq mm0, [_EAX] ; cur movq mm2, [_EAX+TMP1] movq mm1, mm0 movq mm3, mm2 
punpcklbw mm0, mm7 punpcklbw mm2, mm7 movq mm4, [_EBX] ; ref1 pavgb mm4, [_ESI] ; ref2 punpckhbw mm1, mm7 punpckhbw mm3, mm7 movq mm5, [_EBX+TMP1] ; ref pavgb mm5, [_ESI+TMP1] ; ref2 movq mm6, mm4 punpcklbw mm4, mm7 punpckhbw mm6, mm7 psubsw mm0, mm4 psubsw mm1, mm6 lea _ESI, [_ESI+2*TMP1] movq mm6, mm5 punpcklbw mm5, mm7 punpckhbw mm6, mm7 psubsw mm2, mm5 lea _EAX, [_EAX+2*TMP1] psubsw mm3, mm6 lea _EBX, [_EBX+2*TMP1] movq [TMP0+%1*32+ 0], mm0 ; dst movq [TMP0+%1*32+ 8], mm1 movq [TMP0+%1*32+16], mm2 movq [TMP0+%1*32+24], mm3 %endmacro ALIGN SECTION_ALIGN transfer_8to16sub2ro_xmm: pxor mm7, mm7 mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Cur mov TMP1d, prm5d ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref1 %endif push _ESI %ifdef ARCH_IS_X86_64 mov _ESI, prm4 %else mov _ESI, [_ESP+8+16] ; Ref2 %endif COPY_8_TO_16_SUB2RO_SSE 0 COPY_8_TO_16_SUB2RO_SSE 1 COPY_8_TO_16_SUB2RO_SSE 2 COPY_8_TO_16_SUB2RO_SSE 3 pop _ESI pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer_16to8add_mmx(uint8_t * const dst, ; const int16_t * const src, ; uint32_t stride); ; ;----------------------------------------------------------------------------- %macro COPY_16_TO_8_ADD 1 movq mm0, [TMP0] movq mm2, [TMP0+TMP1] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm7 punpcklbw mm2, mm7 punpckhbw mm1, mm7 punpckhbw mm3, mm7 paddsw mm0, [_EAX+%1*32+ 0] paddsw mm1, [_EAX+%1*32+ 8] paddsw mm2, [_EAX+%1*32+16] paddsw mm3, [_EAX+%1*32+24] packuswb mm0, mm1 movq [TMP0], mm0 packuswb mm2, mm3 movq [TMP0+TMP1], mm2 %endmacro ALIGN SECTION_ALIGN transfer_16to8add_mmx: mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Src mov TMP1, prm3 ; Stride pxor mm7, mm7 COPY_16_TO_8_ADD 0 lea TMP0,[TMP0+2*TMP1] COPY_16_TO_8_ADD 1 lea TMP0,[TMP0+2*TMP1] COPY_16_TO_8_ADD 2 lea TMP0,[TMP0+2*TMP1] COPY_16_TO_8_ADD 3 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void 
transfer8x8_copy_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride); ; ; ;----------------------------------------------------------------------------- %macro COPY_8_TO_8 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] movq [TMP0], mm0 lea _EAX, [_EAX+2*TMP1] movq [TMP0+TMP1], mm1 %endmacro ALIGN SECTION_ALIGN transfer8x8_copy_mmx: mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Src mov TMP1, prm3 ; Stride COPY_8_TO_8 lea TMP0,[TMP0+2*TMP1] COPY_8_TO_8 lea TMP0,[TMP0+2*TMP1] COPY_8_TO_8 lea TMP0,[TMP0+2*TMP1] COPY_8_TO_8 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void transfer8x4_copy_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride); ; ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN transfer8x4_copy_mmx: mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Src mov TMP1, prm3 ; Stride COPY_8_TO_8 lea TMP0,[TMP0+2*TMP1] COPY_8_TO_8 ret ENDFUNC NON_EXEC_STACK xvidcore/src/utils/x86_asm/interlacing_mmx.asm0000664000076500007650000001257311254216113022565 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - Interlacing Field test - ; * ; * Copyright(C) 2002 Daniel Smith ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: interlacing_mmx.asm,v 1.12 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ; advances to next block on right ALIGN SECTION_ALIGN nexts: dd 0, 0, 8, 120, 8 ; multiply word sums into dwords ALIGN SECTION_ALIGN ones: times 4 dw 1 ;============================================================================= ; Code ;============================================================================= TEXT cglobal MBFieldTest_mmx ; neater %define line0 _ESI %define line1 _ESI+16 %define line2 _ESI+32 %define line3 _ESI+48 %define line4 _ESI+64 %define line5 _ESI+80 %define line6 _ESI+96 %define line7 _ESI+112 %define line8 _EDI %define line9 _EDI+16 %define line10 _EDI+32 %define line11 _EDI+48 %define line12 _EDI+64 %define line13 _EDI+80 %define line14 _EDI+96 %define line15 _EDI+112 ; keep from losing track which reg holds which line - these never overlap %define m00 mm0 %define m01 mm1 %define m02 mm2 %define m03 mm0 %define m04 mm1 %define m05 mm2 %define m06 mm0 %define m07 mm1 %define m08 mm2 %define m09 mm0 %define m10 mm1 %define m11 mm2 %define m12 mm0 %define m13 mm1 %define m14 mm2 %define m15 mm0 ; gets diff between three lines low(%2),mid(%3),hi(%4): frame = mid-low, field = hi-low %macro ABS8 4 movq %4, [%1] ; m02 = hi movq mm3, %2 ; mm3 = low copy pxor mm4, mm4 ; mm4 = 0 pxor mm5, mm5 ; mm5 = 0 psubw %2, %3 ; diff(med,low) for frame psubw mm3, %4 ; diff(hi,low) for field pcmpgtw mm4, %2 ; if (diff<0), mm4 will be all 1's, else all 0's pcmpgtw mm5, mm3 pxor %2, mm4 ; this 
will get abs(), but off by 1 if (diff<0) pxor mm3, mm5 psubw %2, mm4 ; correct abs being off by 1 when (diff<0) psubw mm3, mm5 paddw mm6, %2 ; add to totals paddw mm7, mm3 %endmacro ;----------------------------------------------------------------------------- ; ; uint32_t MBFieldTest_mmx(int16_t * const data); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN MBFieldTest_mmx: mov _EAX, prm1 push _ESI push _EDI mov _ESI, _EAX ; _ESI = top left block mov _EDI, _ESI add _EDI, 256 ; _EDI = bottom left block pxor mm6, mm6 ; frame total pxor mm7, mm7 ; field total mov _EAX, 4 ; we do left 8 bytes of data[0*64], then right 8 bytes ; then left 8 bytes of data[1*64], then last 8 bytes .loop: movq m00, [line0] ; line0 movq m01, [line1] ; line1 ABS8 line2, m00, m01, m02 ; frame += (line2-line1), field += (line2-line0) ABS8 line3, m01, m02, m03 ABS8 line4, m02, m03, m04 ABS8 line5, m03, m04, m05 ABS8 line6, m04, m05, m06 ABS8 line7, m05, m06, m07 ABS8 line8, m06, m07, m08 movq m09, [line9] ; line9-line7, no frame comp for line9-line8! pxor mm4, mm4 psubw m07, m09 pcmpgtw mm4, mm1 pxor m07, mm4 psubw m07, mm4 paddw mm7, m07 ; add to field total ABS8 line10, m08, m09, m10 ; frame += (line10-line9), field += (line10-line8) ABS8 line11, m09, m10, m11 ABS8 line12, m10, m11, m12 ABS8 line13, m11, m12, m13 ABS8 line14, m12, m13, m14 ABS8 line15, m13, m14, m15 pxor mm4, mm4 ; line15-line14, we're done with field comps! 
psubw m14, m15 pcmpgtw mm4, m14 pxor m14, mm4 psubw m14, mm4 paddw mm6, m14 ; add to frame total lea TMP0, [nexts] mov TMP0d, dword [TMP0+_EAX*4] ; move _ESI/_EDI 8 pixels to the right add _ESI, TMP0 add _EDI, TMP0 dec _EAX jnz near .loop .decide: movq mm0, [ones] ; add packed words into single dwords pmaddwd mm6, mm0 pmaddwd mm7, mm0 movq mm0, mm6 ; TMP0 will be frame total, TMP1 field movq mm1, mm7 psrlq mm0, 32 psrlq mm1, 32 paddd mm0, mm6 paddd mm1, mm7 movd TMP0d, mm0 movd TMP1d, mm1 add TMP1, 350 ; add bias against field decision cmp TMP0, TMP1 jb .end ; if frame<field, don't use field dct inc _EAX ; if frame>=field, use field dct (return 1) .end: pop _EDI pop _ESI ret ENDFUNC NON_EXEC_STACK xvidcore/src/utils/timer.h0000664000076500007650000000533211564705453016720 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Timer related header (used for internal debugging) - * * Copyright(C) 2002 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: timer.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _ENCORE_TIMER_H #define _ENCORE_TIMER_H #if defined(_PROFILING_) #include "../portab.h" uint64_t count_frames; extern void start_timer(void); extern void start_global_timer(void); extern void stop_dct_timer(void); extern void stop_idct_timer(void); extern void stop_motion_timer(void); extern void stop_comp_timer(void); extern void stop_edges_timer(void); extern void stop_inter_timer(void); extern void stop_quant_timer(void); extern void stop_iquant_timer(void); extern void stop_conv_timer(void); extern void stop_transfer_timer(void); extern void stop_coding_timer(void); extern void stop_prediction_timer(void); extern void stop_interlacing_timer(void); extern void stop_global_timer(void); extern void init_timer(void); extern void write_timer(void); #else static __inline void start_timer(void) { } static __inline void start_global_timer(void) { } static __inline void stop_dct_timer(void) { } static __inline void stop_idct_timer(void) { } static __inline void stop_motion_timer(void) { } static __inline void stop_comp_timer(void) { } static __inline void stop_edges_timer(void) { } static __inline void stop_inter_timer(void) { } static __inline void stop_quant_timer(void) { } static __inline void stop_iquant_timer(void) { } static __inline void stop_conv_timer(void) { } static __inline void stop_transfer_timer(void) { } static __inline void init_timer(void) { } static __inline void write_timer(void) { } static __inline void stop_coding_timer(void) { } static __inline void stop_interlacing_timer(void) { } static __inline void stop_prediction_timer(void) { } static __inline void stop_global_timer(void) { } #endif #endif /* 
_TIMER_H_ */ xvidcore/src/utils/mbfunctions.h0000664000076500007650000000553611564705453020135 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - MB related header - * * Copyright(C) 2001 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mbfunctions.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _ENCORE_BLOCK_H #define _ENCORE_BLOCK_H #include "../encoder.h" #include "../bitstream/bitstream.h" /** MBTransQuant.c **/ void MBTransQuantIntra(const MBParam * const pParam, const FRAMEINFO * const frame, MACROBLOCK * const pMB, const uint32_t x_pos, /* <-- The x position of the MB to be searched */ const uint32_t y_pos, /* <-- The y position of the MB to be searched */ int16_t data[6 * 64], /* <-> the data of the MB to be coded */ int16_t qcoeff[6 * 64]); /* <-> the quantized DCT coefficients */ uint8_t MBTransQuantInter(const MBParam * const pParam, const FRAMEINFO * const frame, MACROBLOCK * const pMB, const uint32_t x_pos, const uint32_t y_pos, int16_t data[6 * 64], int16_t qcoeff[6 * 64]); uint8_t MBTransQuantInterBVOP(const MBParam * pParam, FRAMEINFO * frame, MACROBLOCK * pMB, const uint32_t x_pos, const uint32_t y_pos, int16_t data[6 * 
64], int16_t qcoeff[6 * 64]); typedef uint32_t (MBFIELDTEST) (int16_t data[6 * 64]); /* function pointer for field test */ typedef MBFIELDTEST *MBFIELDTEST_PTR; /* global field test pointer for xvid.c */ extern MBFIELDTEST_PTR MBFieldTest; /* field test implementations */ MBFIELDTEST MBFieldTest_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) MBFIELDTEST MBFieldTest_mmx; #endif void MBFrameToField(int16_t data[6 * 64]); /* de-interlace vertical Y blocks */ /** MBCoding.c **/ void MBCoding(const FRAMEINFO * const frame, /* <-- the parameter for coding of the bitstream */ MACROBLOCK * pMB, /* <-- Info of the MB to be coded */ int16_t qcoeff[6 * 64], /* <-- the quantized DCT coefficients */ Bitstream * bs, /* <-> the bitstream */ Statistics * pStat); /* <-> statistical data collected for current frame */ #endif xvidcore/src/bitstream/0000775000076500007650000000000011566427763016265 5ustar xvidbuildxvidbuildxvidcore/src/bitstream/bitstream.h0000664000076500007650000002700111564705453020421 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Bitstream reader/writer inlined functions and constants- * * Copyright (C) 2001-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: bitstream.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _BITSTREAM_H_ #define _BITSTREAM_H_ #include "../portab.h" #include "../decoder.h" #include "../encoder.h" /***************************************************************************** * Constants ****************************************************************************/ /* comment any #defs we dont use */ #define VIDOBJ_START_CODE 0x00000100 /* ..0x0000011f */ #define VIDOBJLAY_START_CODE 0x00000120 /* ..0x0000012f */ #define VISOBJSEQ_START_CODE 0x000001b0 #define VISOBJSEQ_STOP_CODE 0x000001b1 /* ??? */ #define USERDATA_START_CODE 0x000001b2 #define GRPOFVOP_START_CODE 0x000001b3 /*#define VIDSESERR_ERROR_CODE 0x000001b4 */ #define VISOBJ_START_CODE 0x000001b5 #define VOP_START_CODE 0x000001b6 /*#define STUFFING_START_CODE 0x000001c3 */ #define VISOBJ_TYPE_VIDEO 1 /*#define VISOBJ_TYPE_STILLTEXTURE 2 */ /*#define VISOBJ_TYPE_MESH 3 */ /*#define VISOBJ_TYPE_FBA 4 */ /*#define VISOBJ_TYPE_3DMESH 5 */ #define VIDOBJLAY_TYPE_SIMPLE 1 /*#define VIDOBJLAY_TYPE_SIMPLE_SCALABLE 2 */ /*#define VIDOBJLAY_TYPE_CORE 3 */ /*#define VIDOBJLAY_TYPE_MAIN 4 */ /*#define VIDOBJLAY_TYPE_NBIT 5 */ /*#define VIDOBJLAY_TYPE_ANIM_TEXT 6 */ /*#define VIDOBJLAY_TYPE_ANIM_MESH 7 */ /*#define VIDOBJLAY_TYPE_SIMPLE_FACE 8 */ /*#define VIDOBJLAY_TYPE_STILL_SCALABLE 9 */ #define VIDOBJLAY_TYPE_ART_SIMPLE 10 /*#define VIDOBJLAY_TYPE_CORE_SCALABLE 11 */ /*#define VIDOBJLAY_TYPE_ACE 12 */ /*#define VIDOBJLAY_TYPE_ADVANCED_SCALABLE_TEXTURE 13 */ /*#define VIDOBJLAY_TYPE_SIMPLE_FBA 14 */ /*#define VIDEOJLAY_TYPE_SIMPLE_STUDIO 15*/ /*#define VIDEOJLAY_TYPE_CORE_STUDIO 16*/ #define VIDOBJLAY_TYPE_ASP 17 /*#define VIDOBJLAY_TYPE_FGS 18*/ /*#define 
VIDOBJLAY_AR_SQUARE 1 */ /*#define VIDOBJLAY_AR_625TYPE_43 2 */ /*#define VIDOBJLAY_AR_525TYPE_43 3 */ /*#define VIDOBJLAY_AR_625TYPE_169 8 */ /*#define VIDOBJLAY_AR_525TYPE_169 9 */ #define VIDOBJLAY_AR_EXTPAR 15 #define VIDOBJLAY_SHAPE_RECTANGULAR 0 #define VIDOBJLAY_SHAPE_BINARY 1 #define VIDOBJLAY_SHAPE_BINARY_ONLY 2 #define VIDOBJLAY_SHAPE_GRAYSCALE 3 #define SPRITE_NONE 0 #define SPRITE_STATIC 1 #define SPRITE_GMC 2 #define READ_MARKER() BitstreamSkip(bs, 1) #define WRITE_MARKER() BitstreamPutBit(bs, 1) /* vop coding types */ /* intra, prediction, backward, sprite, not_coded */ #define I_VOP 0 #define P_VOP 1 #define B_VOP 2 #define S_VOP 3 #define N_VOP 4 /* resync-specific */ #define NUMBITS_VP_RESYNC_MARKER 17 #define RESYNC_MARKER 1 /***************************************************************************** * Prototypes ****************************************************************************/ int read_video_packet_header(Bitstream *bs, DECODER * dec, const int addbits, int *quant, int *fcode_forward, int *fcode_backward, int *intra_dc_threshold); /* header stuff */ int BitstreamReadHeaders(Bitstream * bs, DECODER * dec, uint32_t * rounding, uint32_t * quant, uint32_t * fcode_forward, uint32_t * fcode_backward, uint32_t * intra_dc_threshold, WARPPOINTS * gmc_warp); void BitstreamWriteVolHeader(Bitstream * const bs, const MBParam * pParam, const FRAMEINFO * const frame, const int num_slices); void BitstreamWriteVopHeader(Bitstream * const bs, const MBParam * pParam, const FRAMEINFO * const frame, int vop_coded, unsigned int quant); void BitstreamWriteUserData(Bitstream * const bs, const char *data, const unsigned int length); void BitstreamWriteEndOfSequence(Bitstream * const bs); void BitstreamWriteGroupOfVopHeader(Bitstream * const bs, const MBParam * pParam, uint32_t is_closed_gov); void write_video_packet_header(Bitstream * const bs, const MBParam * pParam, const FRAMEINFO * const frame, int mbnum); 
/***************************************************************************** * Bitstream ****************************************************************************/ /* Input buffer should be readable as full chunks of 8bytes, including the end of the buffer. Padding might be appropriate. If only chunks of 4bytes are applicable, define XVID_SAFE_BS_TAIL. Note that this will slow decoding, so consider this as a last-resort solution */ /* #define XVID_SAFE_BS_TAIL */ /* initialise bitstream structure */ static void __inline BitstreamInit(Bitstream * const bs, void *const bitstream, uint32_t length) { uint32_t tmp; size_t bitpos; ptr_t adjbitstream = (ptr_t)bitstream; /* * Start the stream on a uint32_t boundary, by rounding down to the * previous uint32_t and skipping the intervening bytes. */ bitpos = ((sizeof(uint32_t)-1) & (size_t)bitstream); adjbitstream = adjbitstream - bitpos; bs->start = bs->tail = (uint32_t *) adjbitstream; tmp = *bs->start; #ifndef ARCH_IS_BIG_ENDIAN BSWAP(tmp); #endif bs->bufa = tmp; tmp = *(bs->start + 1); #ifndef ARCH_IS_BIG_ENDIAN BSWAP(tmp); #endif bs->bufb = tmp; bs->pos = bs->initpos = (uint32_t) bitpos*8; /* preserve the intervening bytes */ if (bs->initpos > 0) bs->buf = bs->bufa & (0xffffffff << (32 - bs->initpos)); else bs->buf = 0; bs->length = length; } /* reset bitstream state */ static void __inline BitstreamReset(Bitstream * const bs) { uint32_t tmp; bs->tail = bs->start; tmp = *bs->start; #ifndef ARCH_IS_BIG_ENDIAN BSWAP(tmp); #endif bs->bufa = tmp; tmp = *(bs->start + 1); #ifndef ARCH_IS_BIG_ENDIAN BSWAP(tmp); #endif bs->bufb = tmp; /* preserve the intervening bytes */ if (bs->initpos > 0) bs->buf = bs->bufa & (0xffffffff << (32 - bs->initpos)); else bs->buf = 0; bs->pos = bs->initpos; } /* reads n bits from bitstream without changing the stream pos */ static uint32_t __inline BitstreamShowBits(Bitstream * const bs, const uint32_t bits) { int nbit = (bits + bs->pos) - 32; if (nbit > 0) { return ((bs->bufa & (0xffffffff 
>> bs->pos)) << nbit) | (bs-> bufb >> (32 - nbit)); } else { return (bs->bufa & (0xffffffff >> bs->pos)) >> (32 - bs->pos - bits); } } /* skip n bits forward in bitstream */ static __inline void BitstreamSkip(Bitstream * const bs, const uint32_t bits) { bs->pos += bits; if (bs->pos >= 32) { uint32_t tmp; bs->bufa = bs->bufb; #if defined(XVID_SAFE_BS_TAIL) if (bs->tail<(bs->start+((bs->length+3)>>2))) #endif { tmp = *((uint32_t *) bs->tail + 2); #ifndef ARCH_IS_BIG_ENDIAN BSWAP(tmp); #endif bs->bufb = tmp; bs->tail++; } #if defined(XVID_SAFE_BS_TAIL) else { bs->bufb = 0; } #endif bs->pos -= 32; } } /* number of bits to next byte alignment */ static __inline uint32_t BitstreamNumBitsToByteAlign(Bitstream *bs) { uint32_t n = (32 - bs->pos) % 8; return n == 0 ? 8 : n; } /* show nbits from next byte alignment */ static __inline uint32_t BitstreamShowBitsFromByteAlign(Bitstream *bs, int bits) { int bspos = bs->pos + BitstreamNumBitsToByteAlign(bs); int nbit = (bits + bspos) - 32; if (bspos >= 32) { return bs->bufb >> (32 - nbit); } else if (nbit > 0) { return ((bs->bufa & (0xffffffff >> bspos)) << nbit) | (bs-> bufb >> (32 - nbit)); } else { return (bs->bufa & (0xffffffff >> bspos)) >> (32 - bspos - bits); } } /* move forward to the next byte boundary */ static __inline void BitstreamByteAlign(Bitstream * const bs) { uint32_t remainder = bs->pos % 8; if (remainder) { BitstreamSkip(bs, 8 - remainder); } } /* bitstream length (unit bits) */ static uint32_t __inline BitstreamPos(const Bitstream * const bs) { return((uint32_t)(8*((ptr_t)bs->tail - (ptr_t)bs->start) + bs->pos - bs->initpos)); } /* * flush the bitstream & return length (unit bytes) * NOTE: assumes no futher bitstream functions will be called. 
*/ static uint32_t __inline BitstreamLength(Bitstream * const bs) { uint32_t len = (uint32_t)((ptr_t)bs->tail - (ptr_t)bs->start); if (bs->pos) { uint32_t b = bs->buf; #ifndef ARCH_IS_BIG_ENDIAN BSWAP(b); #endif *bs->tail = b; len += (bs->pos + 7) / 8; } /* initpos is always on a byte boundary */ if (bs->initpos) len -= bs->initpos/8; return len; } /* move bitstream position forward by n bits and write out buffer if needed */ static void __inline BitstreamForward(Bitstream * const bs, const uint32_t bits) { bs->pos += bits; if (bs->pos >= 32) { uint32_t b = bs->buf; #ifndef ARCH_IS_BIG_ENDIAN BSWAP(b); #endif *bs->tail++ = b; bs->buf = 0; bs->pos -= 32; } } /* read n bits from bitstream */ static uint32_t __inline BitstreamGetBits(Bitstream * const bs, const uint32_t n) { uint32_t ret = BitstreamShowBits(bs, n); BitstreamSkip(bs, n); return ret; } /* read single bit from bitstream */ static uint32_t __inline BitstreamGetBit(Bitstream * const bs) { return BitstreamGetBits(bs, 1); } /* write single bit to bitstream */ static void __inline BitstreamPutBit(Bitstream * const bs, const uint32_t bit) { if (bit) bs->buf |= (0x80000000 >> bs->pos); BitstreamForward(bs, 1); } /* write n bits to bitstream */ static void __inline BitstreamPutBits(Bitstream * const bs, const uint32_t value, const uint32_t size) { uint32_t shift = 32 - bs->pos - size; if (shift <= 32) { bs->buf |= value << shift; BitstreamForward(bs, size); } else { uint32_t remainder; shift = size - (32 - bs->pos); bs->buf |= value >> shift; BitstreamForward(bs, size - shift); remainder = shift; shift = 32 - shift; bs->buf |= value << shift; BitstreamForward(bs, remainder); } } static const int stuffing_codes[8] = { /* nbits stuffing code */ 0, /* 1 0 */ 1, /* 2 01 */ 3, /* 3 011 */ 7, /* 4 0111 */ 0xf, /* 5 01111 */ 0x1f, /* 6 011111 */ 0x3f, /* 7 0111111 */ 0x7f, /* 8 01111111 */ }; /* pad bitstream to the next byte boundary */ static void __inline BitstreamPad(Bitstream * const bs) { int bits = 8 - (bs->pos 
% 8); if (bits < 8) BitstreamPutBits(bs, stuffing_codes[bits - 1], bits); } /* * pad bitstream to the next byte boundary * always pad: even if currently at the byte boundary */ static void __inline BitstreamPadAlways(Bitstream * const bs) { int bits = 8 - (bs->pos % 8); BitstreamPutBits(bs, stuffing_codes[bits - 1], bits); } #endif /* _BITSTREAM_H_ */ xvidcore/src/bitstream/vlc_codes.h0000664000076500007650000000453711564705453020401 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Variable Length Code header - * * Copyright(C) 2002 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: vlc_codes.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _VLC_CODES_H_ #define _VLC_CODES_H_ #include "../portab.h" #define VLC_ERROR (-1) #define ESCAPE 3 #define ESCAPE1 6 #define ESCAPE2 14 #define ESCAPE3 15 typedef struct { uint32_t code; uint8_t len; } VLC; typedef struct { uint8_t last; uint8_t run; int8_t level; } EVENT; typedef struct { uint8_t len; EVENT event; } REVERSE_EVENT; typedef struct { VLC vlc; EVENT event; } VLC_TABLE; /****************************************************************** * common tables between encoder/decoder * ******************************************************************/ extern VLC const dc_lum_tab[]; extern short const dc_threshold[]; extern VLC_TABLE const coeff_tab[2][102]; extern uint8_t const max_level[2][2][64]; extern uint8_t const max_run[2][2][64]; extern VLC sprite_trajectory_code[32768]; extern VLC sprite_trajectory_len[15]; extern VLC mcbpc_intra_tab[15]; extern VLC mcbpc_inter_tab[29]; extern const VLC xvid_cbpy_tab[16]; extern const VLC dcy_tab[511]; extern const VLC dcc_tab[511]; extern const VLC mb_motion_table[65]; extern VLC const mcbpc_intra_table[64]; extern VLC const mcbpc_inter_table[257]; extern VLC const cbpy_table[64]; extern VLC const TMNMVtab0[]; extern VLC const TMNMVtab1[]; extern VLC const TMNMVtab2[]; #endif /* _VLC_CODES_H */ xvidcore/src/bitstream/mbcoding.c0000664000076500007650000015731011564705453020213 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - MB coding - * * Copyright (C) 2002 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU 
General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mbcoding.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "../portab.h" #include "../global.h" #include "bitstream.h" #include "zigzag.h" #include "vlc_codes.h" #include "mbcoding.h" #include "../utils/mbfunctions.h" #ifdef _DEBUG # include "../motion/estimation.h" # include "../motion/motion_inlines.h" # include <assert.h> #endif #define LEVELOFFSET 32 /* Initialized once during xvid_global call * RO access is thread safe */ static REVERSE_EVENT DCT3D[2][4096]; static VLC coeff_VLC[2][2][64][64]; /* not really MB related, but VLCs are only available here */ void bs_put_spritetrajectory(Bitstream * bs, const int val) { const int code = sprite_trajectory_code[val+16384].code; const int len = sprite_trajectory_code[val+16384].len; const int code2 = sprite_trajectory_len[len].code; const int len2 = sprite_trajectory_len[len].len; #if 0 printf("GMC=%d Code/Len = %d / %d ",val, code,len); printf("Code2 / Len2 = %d / %d \n",code2,len2); #endif BitstreamPutBits(bs, code2, len2); if (len) BitstreamPutBits(bs, code, len); } int bs_get_spritetrajectory(Bitstream * bs) { int i; for (i = 0; i < 12; i++) { if (BitstreamShowBits(bs, sprite_trajectory_len[i].len) == sprite_trajectory_len[i].code) { BitstreamSkip(bs, sprite_trajectory_len[i].len); return i; } } 
return -1; } void 
init_vlc_tables(void) { uint32_t i, j, k, intra, last, run, run_esc, level, level_esc, escape, escape_len, offset; int32_t l; for (intra = 0; intra < 2; intra++) for (i = 0; i < 4096; i++) DCT3D[intra][i].event.level = 0; for (intra = 0; intra < 2; intra++) { for (last = 0; last < 2; last++) { for (run = 0; run < 63 + last; run++) { for (level = 0; level < (uint32_t)(32 << intra); level++) { offset = !intra * LEVELOFFSET; coeff_VLC[intra][last][level + offset][run].len = 128; } } } } for (intra = 0; intra < 2; intra++) { for (i = 0; i < 102; i++) { offset = !intra * LEVELOFFSET; for (j = 0; j < (uint32_t)(1 << (12 - coeff_tab[intra][i].vlc.len)); j++) { DCT3D[intra][(coeff_tab[intra][i].vlc.code << (12 - coeff_tab[intra][i].vlc.len)) | j].len = coeff_tab[intra][i].vlc.len; DCT3D[intra][(coeff_tab[intra][i].vlc.code << (12 - coeff_tab[intra][i].vlc.len)) | j].event = coeff_tab[intra][i].event; } coeff_VLC[intra][coeff_tab[intra][i].event.last][coeff_tab[intra][i].event.level + offset][coeff_tab[intra][i].event.run].code = coeff_tab[intra][i].vlc.code << 1; coeff_VLC[intra][coeff_tab[intra][i].event.last][coeff_tab[intra][i].event.level + offset][coeff_tab[intra][i].event.run].len = coeff_tab[intra][i].vlc.len + 1; if (!intra) { coeff_VLC[intra][coeff_tab[intra][i].event.last][offset - coeff_tab[intra][i].event.level][coeff_tab[intra][i].event.run].code = (coeff_tab[intra][i].vlc.code << 1) | 1; coeff_VLC[intra][coeff_tab[intra][i].event.last][offset - coeff_tab[intra][i].event.level][coeff_tab[intra][i].event.run].len = coeff_tab[intra][i].vlc.len + 1; } } } for (intra = 0; intra < 2; intra++) { for (last = 0; last < 2; last++) { for (run = 0; run < 63 + last; run++) { for (level = 1; level < (uint32_t)(32 << intra); level++) { if (level <= max_level[intra][last][run] && run <= max_run[intra][last][level]) continue; offset = !intra * LEVELOFFSET; level_esc = level - max_level[intra][last][run]; run_esc = run - 1 - max_run[intra][last][level]; if (level_esc <= 
max_level[intra][last][run] && run <= max_run[intra][last][level_esc]) { escape = ESCAPE1; escape_len = 7 + 1; run_esc = run; } else { if (run_esc <= max_run[intra][last][level] && level <= max_level[intra][last][run_esc]) { escape = ESCAPE2; escape_len = 7 + 2; level_esc = level; } else { if (!intra) { coeff_VLC[intra][last][level + offset][run].code = (ESCAPE3 << 21) | (last << 20) | (run << 14) | (1 << 13) | ((level & 0xfff) << 1) | 1; coeff_VLC[intra][last][level + offset][run].len = 30; coeff_VLC[intra][last][offset - level][run].code = (ESCAPE3 << 21) | (last << 20) | (run << 14) | (1 << 13) | ((-(int32_t)level & 0xfff) << 1) | 1; coeff_VLC[intra][last][offset - level][run].len = 30; } continue; } } coeff_VLC[intra][last][level + offset][run].code = (escape << coeff_VLC[intra][last][level_esc + offset][run_esc].len) | coeff_VLC[intra][last][level_esc + offset][run_esc].code; coeff_VLC[intra][last][level + offset][run].len = coeff_VLC[intra][last][level_esc + offset][run_esc].len + escape_len; if (!intra) { coeff_VLC[intra][last][offset - level][run].code = (escape << coeff_VLC[intra][last][level_esc + offset][run_esc].len) | coeff_VLC[intra][last][level_esc + offset][run_esc].code | 1; coeff_VLC[intra][last][offset - level][run].len = coeff_VLC[intra][last][level_esc + offset][run_esc].len + escape_len; } } if (!intra) { coeff_VLC[intra][last][0][run].code = (ESCAPE3 << 21) | (last << 20) | (run << 14) | (1 << 13) | ((-32 & 0xfff) << 1) | 1; coeff_VLC[intra][last][0][run].len = 30; } } } } /* init sprite_trajectory tables * even if GMC is not specified (it might be used later...) 
*/ sprite_trajectory_code[0+16384].code = 0; sprite_trajectory_code[0+16384].len = 0; for (k=0;k<14;k++) { int limit = (1< (cmp - 1)) value -= 64 * scale_factor; if (value == 0) { BitstreamPutBits(bs, mb_motion_table[32].code, mb_motion_table[32].len); } else { uint16_t length, code, mv_res, sign; length = 16 << f_code; f_code--; sign = (value < 0); if (value >= length) value -= 2 * length; else if (value < -length) value += 2 * length; if (sign) value = -value; value--; mv_res = value & ((1 << f_code) - 1); code = ((value - mv_res) >> f_code) + 1; if (sign) code = -code; code += 32; BitstreamPutBits(bs, mb_motion_table[code].code, mb_motion_table[code].len); if (f_code) BitstreamPutBits(bs, mv_res, f_code); } } static __inline void CodeCoeffInter(Bitstream * bs, const int16_t qcoeff[64], const uint16_t * zigzag) { uint32_t i, run, prev_run, code, len; int32_t level, prev_level, level_shifted; i = 0; run = 0; while (!(level = qcoeff[zigzag[i++]])) run++; prev_level = level; prev_run = run; run = 0; while (i < 64) { if ((level = qcoeff[zigzag[i++]]) != 0) { level_shifted = prev_level + 32; if (!(level_shifted & -64)) { code = coeff_VLC[0][0][level_shifted][prev_run].code; len = coeff_VLC[0][0][level_shifted][prev_run].len; } else { code = (ESCAPE3 << 21) | (prev_run << 14) | (1 << 13) | ((prev_level & 0xfff) << 1) | 1; len = 30; } BitstreamPutBits(bs, code, len); prev_level = level; prev_run = run; run = 0; } else run++; } level_shifted = prev_level + 32; if (!(level_shifted & -64)) { code = coeff_VLC[0][1][level_shifted][prev_run].code; len = coeff_VLC[0][1][level_shifted][prev_run].len; } else { code = (ESCAPE3 << 21) | (1 << 20) | (prev_run << 14) | (1 << 13) | ((prev_level & 0xfff) << 1) | 1; len = 30; } BitstreamPutBits(bs, code, len); } static __inline void CodeCoeffIntra(Bitstream * bs, const int16_t qcoeff[64], const uint16_t * zigzag) { uint32_t i, abs_level, run, prev_run, code, len; int32_t level, prev_level; i = 1; run = 0; while (i<64 && !(level = 
qcoeff[zigzag[i++]])) run++; prev_level = level; prev_run = run; run = 0; while (i < 64) { if ((level = qcoeff[zigzag[i++]]) != 0) { abs_level = abs(prev_level); abs_level = abs_level < 64 ? abs_level : 0; code = coeff_VLC[1][0][abs_level][prev_run].code; len = coeff_VLC[1][0][abs_level][prev_run].len; if (len != 128) code |= (prev_level < 0); else { code = (ESCAPE3 << 21) | (prev_run << 14) | (1 << 13) | ((prev_level & 0xfff) << 1) | 1; len = 30; } BitstreamPutBits(bs, code, len); prev_level = level; prev_run = run; run = 0; } else run++; } abs_level = abs(prev_level); abs_level = abs_level < 64 ? abs_level : 0; code = coeff_VLC[1][1][abs_level][prev_run].code; len = coeff_VLC[1][1][abs_level][prev_run].len; if (len != 128) code |= (prev_level < 0); else { code = (ESCAPE3 << 21) | (1 << 20) | (prev_run << 14) | (1 << 13) | ((prev_level & 0xfff) << 1) | 1; len = 30; } BitstreamPutBits(bs, code, len); } /* returns the number of bits required to encode qcoeff */ int CodeCoeffIntra_CalcBits(const int16_t qcoeff[64], const uint16_t * zigzag) { int bits = 0; uint32_t i, abs_level, run, prev_run, len; int32_t level, prev_level; i = 1; run = 0; while (i<64 && !(level = qcoeff[zigzag[i++]])) run++; if (i >= 64) return 0; /* empty block */ prev_level = level; prev_run = run; run = 0; while (i < 64) { if ((level = qcoeff[zigzag[i++]]) != 0) { abs_level = abs(prev_level); abs_level = abs_level < 64 ? abs_level : 0; len = coeff_VLC[1][0][abs_level][prev_run].len; bits += len!=128 ? len : 30; prev_level = level; prev_run = run; run = 0; } else run++; } abs_level = abs(prev_level); abs_level = abs_level < 64 ? abs_level : 0; len = coeff_VLC[1][1][abs_level][prev_run].len; bits += len!=128 ? 
		len : 30;

	return bits;
}

int
CodeCoeffInter_CalcBits(const int16_t qcoeff[64], const uint16_t * zigzag)
{
	uint32_t i, run, prev_run, len;
	int32_t level, prev_level, level_shifted;
	int bits = 0;

	i = 0;
	run = 0;

	while (!(level = qcoeff[zigzag[i++]]))
		run++;

	prev_level = level;
	prev_run = run;
	run = 0;

	while (i < 64) {
		if ((level = qcoeff[zigzag[i++]]) != 0) {
			level_shifted = prev_level + 32;
			if (!(level_shifted & -64))
				len = coeff_VLC[0][0][level_shifted][prev_run].len;
			else
				len = 30;

			bits += len;
			prev_level = level;
			prev_run = run;
			run = 0;
		}
		else
			run++;
	}

	level_shifted = prev_level + 32;
	if (!(level_shifted & -64))
		len = coeff_VLC[0][1][level_shifted][prev_run].len;
	else
		len = 30;
	bits += len;

	return bits;
}

static const int iDQtab[5] = {
	1, 0, -1 /* no change */, 2, 3
};
#define DQ_VALUE2INDEX(value) iDQtab[(value)+2]

static __inline void
CodeBlockIntra(const FRAMEINFO * const frame,
			   const MACROBLOCK * pMB,
			   int16_t qcoeff[6 * 64],
			   Bitstream * bs,
			   Statistics * pStat)
{
	uint32_t i, mcbpc, cbpy, bits;

	cbpy = pMB->cbp >> 2;

	/* write mcbpc */
	if (frame->coding_type == I_VOP) {
		mcbpc = ((pMB->mode >> 1) & 3) | ((pMB->cbp & 3) << 2);
		BitstreamPutBits(bs, mcbpc_intra_tab[mcbpc].code, mcbpc_intra_tab[mcbpc].len);
	} else {
		mcbpc = (pMB->mode & 7) | ((pMB->cbp & 3) << 3);
		BitstreamPutBits(bs, mcbpc_inter_tab[mcbpc].code, mcbpc_inter_tab[mcbpc].len);
	}

	/* ac prediction flag */
	if (pMB->acpred_directions[0])
		BitstreamPutBits(bs, 1, 1);
	else
		BitstreamPutBits(bs, 0, 1);

	/* write cbpy */
	BitstreamPutBits(bs, xvid_cbpy_tab[cbpy].code, xvid_cbpy_tab[cbpy].len);

	/* write dquant */
	if (pMB->mode == MODE_INTRA_Q)
		BitstreamPutBits(bs, DQ_VALUE2INDEX(pMB->dquant), 2);

	/* write interlacing */
	if (frame->vol_flags & XVID_VOL_INTERLACING) {
		BitstreamPutBit(bs, pMB->field_dct);
	}

	/* code block coeffs */
	for (i = 0; i < 6; i++) {
		if (i < 4)
			BitstreamPutBits(bs, dcy_tab[qcoeff[i * 64 + 0] + 255].code,
							 dcy_tab[qcoeff[i * 64 + 0] + 255].len);
		else
			BitstreamPutBits(bs, dcc_tab[qcoeff[i * 64 + 0]
							 + 255].code,
							 dcc_tab[qcoeff[i * 64 + 0] + 255].len);

		if (pMB->cbp & (1 << (5 - i))) {
			const uint16_t *scan_table =
				frame->vop_flags & XVID_VOP_ALTERNATESCAN ?
				scan_tables[2] : scan_tables[pMB->acpred_directions[i]];

			bits = BitstreamPos(bs);

			CodeCoeffIntra(bs, &qcoeff[i * 64], scan_table);

			bits = BitstreamPos(bs) - bits;
			pStat->iTextBits += bits;
		}
	}
}

static void
CodeBlockInter(const FRAMEINFO * const frame,
			   const MACROBLOCK * pMB,
			   int16_t qcoeff[6 * 64],
			   Bitstream * bs,
			   Statistics * pStat)
{
	int32_t i;
	uint32_t bits, mcbpc, cbpy;

	mcbpc = (pMB->mode & 7) | ((pMB->cbp & 3) << 3);
	cbpy = 15 - (pMB->cbp >> 2);

	/* write mcbpc */
	BitstreamPutBits(bs, mcbpc_inter_tab[mcbpc].code, mcbpc_inter_tab[mcbpc].len);

	if ( (frame->coding_type == S_VOP) && (pMB->mode == MODE_INTER || pMB->mode == MODE_INTER_Q) )
		BitstreamPutBit(bs, pMB->mcsel);	/* mcsel: '0'=local motion, '1'=GMC */

	/* write cbpy */
	BitstreamPutBits(bs, xvid_cbpy_tab[cbpy].code, xvid_cbpy_tab[cbpy].len);

	/* write dquant */
	if (pMB->mode == MODE_INTER_Q)
		BitstreamPutBits(bs, DQ_VALUE2INDEX(pMB->dquant), 2);

	/* interlacing */
	if (frame->vol_flags & XVID_VOL_INTERLACING) {
		if (pMB->cbp) {
			BitstreamPutBit(bs, pMB->field_dct);
			DPRINTF(XVID_DEBUG_MB,"codep: field_dct: %i\n", pMB->field_dct);
		}

		/* if inter block, write field ME flag */
		if ((pMB->mode == MODE_INTER || pMB->mode == MODE_INTER_Q) && (pMB->mcsel == 0)) {
			BitstreamPutBit(bs, 0 /*pMB->field_pred*/); /* not implemented yet */
			DPRINTF(XVID_DEBUG_MB,"codep: field_pred: %i\n", pMB->field_pred);

			/* write field prediction references */
#if 0 /* Remove the #if once field_pred is supported */
			if (pMB->field_pred) {
				BitstreamPutBit(bs, pMB->field_for_top);
				BitstreamPutBit(bs, pMB->field_for_bot);
			}
#endif
		}
	}

	bits = BitstreamPos(bs);

	/* code motion vector(s) if motion is local */
	if (!pMB->mcsel)
		for (i = 0; i < (pMB->mode == MODE_INTER4V ?
			4 : 1); i++) {
			CodeVector(bs, pMB->pmvs[i].x, frame->fcode);
			CodeVector(bs, pMB->pmvs[i].y, frame->fcode);

#if 0 /* #ifdef _DEBUG */
			if (i == 0) /* for simplicity */ {
				int coded_length = BitstreamPos(bs) - bits;
				int estimated_length = d_mv_bits(pMB->pmvs[i].x, pMB->pmvs[i].y, zeroMV, frame->fcode, 0);
				assert(estimated_length == coded_length);
				d_mv_bits(pMB->pmvs[i].x, pMB->pmvs[i].y, zeroMV, frame->fcode, 0);
			}
#endif
		}

	bits = BitstreamPos(bs) - bits;
	pStat->iMVBits += bits;

	bits = BitstreamPos(bs);

	/* code block coeffs */
	for (i = 0; i < 6; i++)
		if (pMB->cbp & (1 << (5 - i))) {
			const uint16_t *scan_table =
				frame->vop_flags & XVID_VOP_ALTERNATESCAN ?
				scan_tables[2] : scan_tables[0];

			CodeCoeffInter(bs, &qcoeff[i * 64], scan_table);
		}

	bits = BitstreamPos(bs) - bits;
	pStat->iTextBits += bits;
}

void
MBCoding(const FRAMEINFO * const frame,
		 MACROBLOCK * pMB,
		 int16_t qcoeff[6 * 64],
		 Bitstream * bs,
		 Statistics * pStat)
{
	if (frame->coding_type != I_VOP)
		BitstreamPutBit(bs, 0);	/* not_coded */

	if (frame->vop_flags & XVID_VOP_GREYSCALE) {
		pMB->cbp &= 0x3C;		/* keep only bits 5-2 */
		qcoeff[4*64+0]=0;		/* for INTRA DC value is saved */
		qcoeff[5*64+0]=0;
	}

	if (pMB->mode == MODE_INTRA || pMB->mode == MODE_INTRA_Q)
		CodeBlockIntra(frame, pMB, qcoeff, bs, pStat);
	else
		CodeBlockInter(frame, pMB, qcoeff, bs, pStat);
}

/***************************************************************
 * bframe encoding start
 ***************************************************************/

/*
	mbtype
	0	1b		direct(h263)		mvdb
	1	01b		interpolate mc+q	dbquant, mvdf, mvdb
	2	001b	backward mc+q		dbquant, mvdb
	3	0001b	forward mc+q		dbquant, mvdf
*/

static __inline void
put_bvop_mbtype(Bitstream * bs,
				int value)
{
	switch (value) {
		case MODE_FORWARD:
			BitstreamPutBit(bs, 0);
		case MODE_BACKWARD:
			BitstreamPutBit(bs, 0);
		case MODE_INTERPOLATE:
			BitstreamPutBit(bs, 0);
		case MODE_DIRECT:
			BitstreamPutBit(bs, 1);
		default:
			break;
	}
}

/*
	dbquant
	-2	10b
	0	0b
	+2	11b
*/

static __inline void
put_bvop_dbquant(Bitstream * bs,
				 int value)
{
	switch (value) {
	case 0:
		BitstreamPutBit(bs, 0);
		return;

	case -2:
		BitstreamPutBit(bs, 1);
		BitstreamPutBit(bs, 0);
		return;

	case 2:
		BitstreamPutBit(bs, 1);
		BitstreamPutBit(bs, 1);
		return;

	default:;	/* invalid */
	}
}

void
MBCodingBVOP(const FRAMEINFO * const frame,
			 const MACROBLOCK * mb,
			 const int16_t qcoeff[6 * 64],
			 const int32_t fcode,
			 const int32_t bcode,
			 Bitstream * bs,
			 Statistics * pStat)
{
	int vcode = fcode;
	unsigned int i;

	const uint16_t *scan_table =
		frame->vop_flags & XVID_VOP_ALTERNATESCAN ?
		scan_tables[2] : scan_tables[0];
	int bits;

	/*	------------------------------------------------------------------
		when a block is skipped it is decoded DIRECT(0,0)
		hence is interpolated from forward & backward frames
		------------------------------------------------------------------ */

	if (mb->mode == MODE_DIRECT_NONE_MV) {
		BitstreamPutBit(bs, 1);	/* skipped */
		return;
	}

	BitstreamPutBit(bs, 0);	/* not skipped */

	if (mb->cbp == 0) {
		BitstreamPutBit(bs, 1);	/* cbp == 0 */
	} else {
		BitstreamPutBit(bs, 0);	/* cbp == xxx */
	}

	put_bvop_mbtype(bs, mb->mode);

	if (mb->cbp) {
		BitstreamPutBits(bs, mb->cbp, 6);
	}

	if (mb->mode != MODE_DIRECT && mb->cbp != 0) {
		put_bvop_dbquant(bs, 0);	/* todo: mb->dquant = 0 */
	}

	if (frame->vol_flags & XVID_VOL_INTERLACING) {
		if (mb->cbp) {
			BitstreamPutBit(bs, mb->field_dct);
			DPRINTF(XVID_DEBUG_MB,"codep: field_dct: %i\n", mb->field_dct);
		}

		/* if not direct block, write field ME flag */
		if (mb->mode != MODE_DIRECT) {
			BitstreamPutBit(bs, 0 /*mb->field_pred*/); /* field ME not implemented */

			/* write field prediction references */
#if 0 /* Remove the #if once field_pred is supported */
			if (mb->field_pred) {
				BitstreamPutBit(bs, mb->field_for_top);
				BitstreamPutBit(bs, mb->field_for_bot);
			}
#endif
		}
	}

	bits = BitstreamPos(bs);

	switch (mb->mode) {
		case MODE_INTERPOLATE:
			CodeVector(bs, mb->pmvs[1].x, vcode); /* forward vector of interpolate mode */
			CodeVector(bs, mb->pmvs[1].y, vcode);
		case MODE_BACKWARD:
			vcode = bcode;
		case MODE_FORWARD:
			CodeVector(bs,
mb->pmvs[0].x, vcode); CodeVector(bs, mb->pmvs[0].y, vcode); break; case MODE_DIRECT: CodeVector(bs, mb->pmvs[3].x, 1); /* fcode is always 1 for delta vector */ CodeVector(bs, mb->pmvs[3].y, 1); /* prediction is always (0,0) */ default: break; } pStat->iMVBits += BitstreamPos(bs) - bits; bits = BitstreamPos(bs); for (i = 0; i < 6; i++) { if (mb->cbp & (1 << (5 - i))) { CodeCoeffInter(bs, &qcoeff[i * 64], scan_table); } } pStat->iTextBits += BitstreamPos(bs) - bits; } /*************************************************************** * decoding stuff starts here * ***************************************************************/ /* * for IVOP addbits == 0 * for PVOP addbits == fcode - 1 * for BVOP addbits == max(fcode,bcode) - 1 * returns true or false */ int check_resync_marker(Bitstream * bs, int addbits) { uint32_t nbits; uint32_t code; uint32_t nbitsresyncmarker = NUMBITS_VP_RESYNC_MARKER + addbits; nbits = BitstreamNumBitsToByteAlign(bs); code = BitstreamShowBits(bs, nbits); if (code == (((uint32_t)1 << (nbits - 1)) - 1)) { return BitstreamShowBitsFromByteAlign(bs, nbitsresyncmarker) == RESYNC_MARKER; } return 0; } int get_mcbpc_intra(Bitstream * bs) { uint32_t index; index = BitstreamShowBits(bs, 9); index >>= 3; BitstreamSkip(bs, mcbpc_intra_table[index].len); return mcbpc_intra_table[index].code; } int get_mcbpc_inter(Bitstream * bs) { uint32_t index; index = MIN(BitstreamShowBits(bs, 9), 256); BitstreamSkip(bs, mcbpc_inter_table[index].len); return mcbpc_inter_table[index].code; } int get_cbpy(Bitstream * bs, int intra) { int cbpy; uint32_t index = BitstreamShowBits(bs, 6); BitstreamSkip(bs, cbpy_table[index].len); cbpy = cbpy_table[index].code; if (!intra) cbpy = 15 - cbpy; return cbpy; } static __inline int get_mv_data(Bitstream * bs) { uint32_t index; if (BitstreamGetBit(bs)) return 0; index = BitstreamShowBits(bs, 12); if (index >= 512) { index = (index >> 8) - 2; BitstreamSkip(bs, TMNMVtab0[index].len); return TMNMVtab0[index].code; } if (index >= 128) { 
		/* 128 <= index < 512: drop the two low bits and rebase, so that (index >> 2) - 32 starts at entry 0 of TMNMVtab1 */
		index = (index >> 2) - 32;

		BitstreamSkip(bs, TMNMVtab1[index].len);
		return TMNMVtab1[index].code;
	}

	index -= 4;

	BitstreamSkip(bs, TMNMVtab2[index].len);
	return TMNMVtab2[index].code;
}

int
get_mv(Bitstream * bs,
	   int fcode)
{
	int data;
	int res;
	int mv;
	int scale_fac = 1 << (fcode - 1);

	data = get_mv_data(bs);

	if (scale_fac == 1 || data == 0)
		return data;

	res = BitstreamGetBits(bs, fcode - 1);
	mv = ((abs(data) - 1) * scale_fac) + res + 1;

	return data < 0 ? -mv : mv;
}

int
get_dc_dif(Bitstream * bs,
		   uint32_t dc_size)
{
	int code = BitstreamGetBits(bs, dc_size);
	int msb = code >> (dc_size - 1);

	if (msb == 0)
		return (-1 * (code ^ ((1 << dc_size) - 1)));

	return code;
}

int
get_dc_size_lum(Bitstream * bs)
{
	int code, i;

	code = BitstreamShowBits(bs, 11);

	for (i = 11; i > 3; i--) {
		if (code == 1) {
			BitstreamSkip(bs, i);
			return i + 1;
		}
		code >>= 1;
	}

	BitstreamSkip(bs, dc_lum_tab[code].len);
	return dc_lum_tab[code].code;
}

int
get_dc_size_chrom(Bitstream * bs)
{
	uint32_t code, i;

	code = BitstreamShowBits(bs, 12);

	for (i = 12; i > 2; i--) {
		if (code == 1) {
			BitstreamSkip(bs, i);
			return i;
		}
		code >>= 1;
	}

	return 3 - BitstreamGetBits(bs, 2);
}

#define GET_BITS(cache, n) ((cache)>>(32-(n)))

static __inline int
get_coeff(Bitstream * bs,
		  int *run,
		  int *last,
		  int intra,
		  int short_video_header)
{
	uint32_t mode;
	int32_t level;
	REVERSE_EVENT *reverse_event;

	uint32_t cache = BitstreamShowBits(bs, 32);

	if (short_video_header)	/* inter-VLCs will be used for both intra and inter blocks */
		intra = 0;

	if (GET_BITS(cache, 7) != ESCAPE) {
		reverse_event = &DCT3D[intra][GET_BITS(cache, 12)];

		if ((level = reverse_event->event.level) == 0)
			goto error;

		*last = reverse_event->event.last;
		*run = reverse_event->event.run;

		/* Don't forget to update the bitstream position */
		BitstreamSkip(bs, reverse_event->len+1);

		return (GET_BITS(cache, reverse_event->len+1)&0x01) ?
-level : level; } /* flush 7bits of cache */ cache <<= 7; if (short_video_header) { /* escape mode 4 - H.263 type, only used if short_video_header = 1 */ *last = GET_BITS(cache, 1); *run = (GET_BITS(cache, 7) &0x3f); level = (GET_BITS(cache, 15)&0xff); if (level == 0 || level == 128) DPRINTF(XVID_DEBUG_ERROR, "Illegal LEVEL for ESCAPE mode 4: %d\n", level); /* We've "eaten" 22 bits */ BitstreamSkip(bs, 22); return (level << 24) >> 24; } if ((mode = GET_BITS(cache, 2)) < 3) { const int skip[3] = {1, 1, 2}; cache <<= skip[mode]; reverse_event = &DCT3D[intra][GET_BITS(cache, 12)]; if ((level = reverse_event->event.level) == 0) goto error; *last = reverse_event->event.last; *run = reverse_event->event.run; if (mode < 2) { /* first escape mode, level is offset */ level += max_level[intra][*last][*run]; } else { /* second escape mode, run is offset */ *run += max_run[intra][*last][level] + 1; } /* Update bitstream position */ BitstreamSkip(bs, 7 + skip[mode] + reverse_event->len + 1); return (GET_BITS(cache, reverse_event->len+1)&0x01) ? 
			-level : level;
	}

	/* third escape mode - fixed length codes */
	cache <<= 2;
	*last = GET_BITS(cache, 1);
	*run = (GET_BITS(cache, 7)&0x3f);
	level = (GET_BITS(cache, 20)&0xfff);

	/* Update bitstream position */
	BitstreamSkip(bs, 30);

	return (level << 20) >> 20;

  error:
	*run = 64;
	return 0;
}

void
get_intra_block(Bitstream * bs,
				int16_t * block,
				int direction,
				int coeff)
{
	const uint16_t *scan = scan_tables[direction];
	int level, run, last = 0;

	do {
		level = get_coeff(bs, &run, &last, 1, 0);
		coeff += run;
		if (coeff & ~63) {
			DPRINTF(XVID_DEBUG_ERROR,"fatal: invalid run or index");
			break;
		}

		block[scan[coeff]] = level;

		DPRINTF(XVID_DEBUG_COEFF,"block[%i] %i\n", scan[coeff], level);
#if 0
		DPRINTF(XVID_DEBUG_COEFF,"block[%i] %i %08x\n", scan[coeff], level, BitstreamShowBits(bs, 32));
#endif

		if (level < -2047 || level > 2047) {
			DPRINTF(XVID_DEBUG_ERROR,"warning: intra_overflow %i\n", level);
		}
		coeff++;
	} while (!last);
}

void
get_inter_block_h263(
		Bitstream * bs,
		int16_t * block,
		int direction,
		const int quant,
		const uint16_t *matrix)
{
	const uint16_t *scan = scan_tables[direction];
	const uint16_t quant_m_2 = quant << 1;
	const uint16_t quant_add = (quant & 1 ? quant : quant - 1);
	int p;
	int level;
	int run;
	int last = 0;

	p = 0;
	do {
		level = get_coeff(bs, &run, &last, 0, 0);
		p += run;
		if (p & ~63) {
			DPRINTF(XVID_DEBUG_ERROR,"fatal: invalid run or index");
			break;
		}

		if (level < 0) {
			level = level*quant_m_2 - quant_add;
			block[scan[p]] = (level >= -2048 ? level : -2048);
		} else {
			level = level * quant_m_2 + quant_add;
			block[scan[p]] = (level <= 2047 ?
level : 2047); } p++; } while (!last); } void get_inter_block_mpeg( Bitstream * bs, int16_t * block, int direction, const int quant, const uint16_t *matrix) { const uint16_t *scan = scan_tables[direction]; uint32_t sum = 0; int p; int level; int run; int last = 0; p = 0; do { level = get_coeff(bs, &run, &last, 0, 0); p += run; if (p & ~63) { DPRINTF(XVID_DEBUG_ERROR,"fatal: invalid run or index"); break; } if (level < 0) { level = ((2 * -level + 1) * matrix[scan[p]] * quant) >> 4; block[scan[p]] = (level <= 2048 ? -level : -2048); } else { level = ((2 * level + 1) * matrix[scan[p]] * quant) >> 4; block[scan[p]] = (level <= 2047 ? level : 2047); } sum ^= block[scan[p]]; p++; } while (!last); /* mismatch control */ if ((sum & 1) == 0) { block[63] ^= 1; } } /***************************************************************************** * VLC tables and other constant arrays ****************************************************************************/ VLC_TABLE const coeff_tab[2][102] = { /* intra = 0 */ { {{ 2, 2}, {0, 0, 1}}, {{15, 4}, {0, 0, 2}}, {{21, 6}, {0, 0, 3}}, {{23, 7}, {0, 0, 4}}, {{31, 8}, {0, 0, 5}}, {{37, 9}, {0, 0, 6}}, {{36, 9}, {0, 0, 7}}, {{33, 10}, {0, 0, 8}}, {{32, 10}, {0, 0, 9}}, {{ 7, 11}, {0, 0, 10}}, {{ 6, 11}, {0, 0, 11}}, {{32, 11}, {0, 0, 12}}, {{ 6, 3}, {0, 1, 1}}, {{20, 6}, {0, 1, 2}}, {{30, 8}, {0, 1, 3}}, {{15, 10}, {0, 1, 4}}, {{33, 11}, {0, 1, 5}}, {{80, 12}, {0, 1, 6}}, {{14, 4}, {0, 2, 1}}, {{29, 8}, {0, 2, 2}}, {{14, 10}, {0, 2, 3}}, {{81, 12}, {0, 2, 4}}, {{13, 5}, {0, 3, 1}}, {{35, 9}, {0, 3, 2}}, {{13, 10}, {0, 3, 3}}, {{12, 5}, {0, 4, 1}}, {{34, 9}, {0, 4, 2}}, {{82, 12}, {0, 4, 3}}, {{11, 5}, {0, 5, 1}}, {{12, 10}, {0, 5, 2}}, {{83, 12}, {0, 5, 3}}, {{19, 6}, {0, 6, 1}}, {{11, 10}, {0, 6, 2}}, {{84, 12}, {0, 6, 3}}, {{18, 6}, {0, 7, 1}}, {{10, 10}, {0, 7, 2}}, {{17, 6}, {0, 8, 1}}, {{ 9, 10}, {0, 8, 2}}, {{16, 6}, {0, 9, 1}}, {{ 8, 10}, {0, 9, 2}}, {{22, 7}, {0, 10, 1}}, {{85, 12}, {0, 10, 2}}, {{21, 7}, {0, 11, 1}}, {{20, 7}, 
{0, 12, 1}}, {{28, 8}, {0, 13, 1}}, {{27, 8}, {0, 14, 1}}, {{33, 9}, {0, 15, 1}}, {{32, 9}, {0, 16, 1}}, {{31, 9}, {0, 17, 1}}, {{30, 9}, {0, 18, 1}}, {{29, 9}, {0, 19, 1}}, {{28, 9}, {0, 20, 1}}, {{27, 9}, {0, 21, 1}}, {{26, 9}, {0, 22, 1}}, {{34, 11}, {0, 23, 1}}, {{35, 11}, {0, 24, 1}}, {{86, 12}, {0, 25, 1}}, {{87, 12}, {0, 26, 1}}, {{ 7, 4}, {1, 0, 1}}, {{25, 9}, {1, 0, 2}}, {{ 5, 11}, {1, 0, 3}}, {{15, 6}, {1, 1, 1}}, {{ 4, 11}, {1, 1, 2}}, {{14, 6}, {1, 2, 1}}, {{13, 6}, {1, 3, 1}}, {{12, 6}, {1, 4, 1}}, {{19, 7}, {1, 5, 1}}, {{18, 7}, {1, 6, 1}}, {{17, 7}, {1, 7, 1}}, {{16, 7}, {1, 8, 1}}, {{26, 8}, {1, 9, 1}}, {{25, 8}, {1, 10, 1}}, {{24, 8}, {1, 11, 1}}, {{23, 8}, {1, 12, 1}}, {{22, 8}, {1, 13, 1}}, {{21, 8}, {1, 14, 1}}, {{20, 8}, {1, 15, 1}}, {{19, 8}, {1, 16, 1}}, {{24, 9}, {1, 17, 1}}, {{23, 9}, {1, 18, 1}}, {{22, 9}, {1, 19, 1}}, {{21, 9}, {1, 20, 1}}, {{20, 9}, {1, 21, 1}}, {{19, 9}, {1, 22, 1}}, {{18, 9}, {1, 23, 1}}, {{17, 9}, {1, 24, 1}}, {{ 7, 10}, {1, 25, 1}}, {{ 6, 10}, {1, 26, 1}}, {{ 5, 10}, {1, 27, 1}}, {{ 4, 10}, {1, 28, 1}}, {{36, 11}, {1, 29, 1}}, {{37, 11}, {1, 30, 1}}, {{38, 11}, {1, 31, 1}}, {{39, 11}, {1, 32, 1}}, {{88, 12}, {1, 33, 1}}, {{89, 12}, {1, 34, 1}}, {{90, 12}, {1, 35, 1}}, {{91, 12}, {1, 36, 1}}, {{92, 12}, {1, 37, 1}}, {{93, 12}, {1, 38, 1}}, {{94, 12}, {1, 39, 1}}, {{95, 12}, {1, 40, 1}} }, /* intra = 1 */ { {{ 2, 2}, {0, 0, 1}}, {{15, 4}, {0, 0, 3}}, {{21, 6}, {0, 0, 6}}, {{23, 7}, {0, 0, 9}}, {{31, 8}, {0, 0, 10}}, {{37, 9}, {0, 0, 13}}, {{36, 9}, {0, 0, 14}}, {{33, 10}, {0, 0, 17}}, {{32, 10}, {0, 0, 18}}, {{ 7, 11}, {0, 0, 21}}, {{ 6, 11}, {0, 0, 22}}, {{32, 11}, {0, 0, 23}}, {{ 6, 3}, {0, 0, 2}}, {{20, 6}, {0, 1, 2}}, {{30, 8}, {0, 0, 11}}, {{15, 10}, {0, 0, 19}}, {{33, 11}, {0, 0, 24}}, {{80, 12}, {0, 0, 25}}, {{14, 4}, {0, 1, 1}}, {{29, 8}, {0, 0, 12}}, {{14, 10}, {0, 0, 20}}, {{81, 12}, {0, 0, 26}}, {{13, 5}, {0, 0, 4}}, {{35, 9}, {0, 0, 15}}, {{13, 10}, {0, 1, 7}}, {{12, 5}, {0, 0, 5}}, {{34, 9}, {0, 4, 2}}, 
{{82, 12}, {0, 0, 27}}, {{11, 5}, {0, 2, 1}}, {{12, 10}, {0, 2, 4}}, {{83, 12}, {0, 1, 9}}, {{19, 6}, {0, 0, 7}}, {{11, 10}, {0, 3, 4}}, {{84, 12}, {0, 6, 3}}, {{18, 6}, {0, 0, 8}}, {{10, 10}, {0, 4, 3}}, {{17, 6}, {0, 3, 1}}, {{ 9, 10}, {0, 8, 2}}, {{16, 6}, {0, 4, 1}}, {{ 8, 10}, {0, 5, 3}}, {{22, 7}, {0, 1, 3}}, {{85, 12}, {0, 1, 10}}, {{21, 7}, {0, 2, 2}}, {{20, 7}, {0, 7, 1}}, {{28, 8}, {0, 1, 4}}, {{27, 8}, {0, 3, 2}}, {{33, 9}, {0, 0, 16}}, {{32, 9}, {0, 1, 5}}, {{31, 9}, {0, 1, 6}}, {{30, 9}, {0, 2, 3}}, {{29, 9}, {0, 3, 3}}, {{28, 9}, {0, 5, 2}}, {{27, 9}, {0, 6, 2}}, {{26, 9}, {0, 7, 2}}, {{34, 11}, {0, 1, 8}}, {{35, 11}, {0, 9, 2}}, {{86, 12}, {0, 2, 5}}, {{87, 12}, {0, 7, 3}}, {{ 7, 4}, {1, 0, 1}}, {{25, 9}, {0, 11, 1}}, {{ 5, 11}, {1, 0, 6}}, {{15, 6}, {1, 1, 1}}, {{ 4, 11}, {1, 0, 7}}, {{14, 6}, {1, 2, 1}}, {{13, 6}, {0, 5, 1}}, {{12, 6}, {1, 0, 2}}, {{19, 7}, {1, 5, 1}}, {{18, 7}, {0, 6, 1}}, {{17, 7}, {1, 3, 1}}, {{16, 7}, {1, 4, 1}}, {{26, 8}, {1, 9, 1}}, {{25, 8}, {0, 8, 1}}, {{24, 8}, {0, 9, 1}}, {{23, 8}, {0, 10, 1}}, {{22, 8}, {1, 0, 3}}, {{21, 8}, {1, 6, 1}}, {{20, 8}, {1, 7, 1}}, {{19, 8}, {1, 8, 1}}, {{24, 9}, {0, 12, 1}}, {{23, 9}, {1, 0, 4}}, {{22, 9}, {1, 1, 2}}, {{21, 9}, {1, 10, 1}}, {{20, 9}, {1, 11, 1}}, {{19, 9}, {1, 12, 1}}, {{18, 9}, {1, 13, 1}}, {{17, 9}, {1, 14, 1}}, {{ 7, 10}, {0, 13, 1}}, {{ 6, 10}, {1, 0, 5}}, {{ 5, 10}, {1, 1, 3}}, {{ 4, 10}, {1, 2, 2}}, {{36, 11}, {1, 3, 2}}, {{37, 11}, {1, 4, 2}}, {{38, 11}, {1, 15, 1}}, {{39, 11}, {1, 16, 1}}, {{88, 12}, {0, 14, 1}}, {{89, 12}, {1, 0, 8}}, {{90, 12}, {1, 5, 2}}, {{91, 12}, {1, 6, 2}}, {{92, 12}, {1, 17, 1}}, {{93, 12}, {1, 18, 1}}, {{94, 12}, {1, 19, 1}}, {{95, 12}, {1, 20, 1}} } }; /* constants taken from momusys/vm_common/inlcude/max_level.h */ uint8_t const max_level[2][2][64] = { { /* intra = 0, last = 0 */ { 12, 6, 4, 3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, /* intra = 0, last = 1 */ { 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } }, { /* intra = 1, last = 0 */ { 27, 10, 5, 4, 3, 3, 3, 3, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, /* intra = 1, last = 1 */ { 8, 3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } } }; uint8_t const max_run[2][2][64] = { { /* intra = 0, last = 0 */ { 0, 26, 10, 6, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, }, /* intra = 0, last = 1 */ { 0, 40, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, } }, { /* intra = 1, last = 0 */ { 0, 14, 9, 7, 3, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, }, /* intra = 1, last = 1 */ { 0, 20, 6, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, } } }; /****************************************************************** * encoder tables * ******************************************************************/ VLC sprite_trajectory_code[32768]; VLC sprite_trajectory_len[15] = { { 0x00 , 2}, { 0x02 , 3}, { 0x03, 3}, { 0x04, 3}, { 0x05, 3}, { 0x06, 3}, { 0x0E , 4}, { 0x1E, 5}, { 0x3E, 6}, { 0x7E, 7}, { 
0xFE, 8}, { 0x1FE, 9}, {0x3FE,10}, {0x7FE,11}, {0xFFE,12} }; /* DCT coefficients. Four tables, two for last = 0, two for last = 1. the sign bit must be added afterwards. */ /* MCBPC Indexing by cbpc in first two bits, mode in last two. CBPC as in table 4/H.263, MB type (mode): 3 = 01, 4 = 10. Example: cbpc = 01 and mode = 4 gives index = 0110 = 6. */ VLC mcbpc_intra_tab[15] = { {0x01, 9}, {0x01, 1}, {0x01, 4}, {0x00, 0}, {0x00, 0}, {0x01, 3}, {0x01, 6}, {0x00, 0}, {0x00, 0}, {0x02, 3}, {0x02, 6}, {0x00, 0}, {0x00, 0}, {0x03, 3}, {0x03, 6} }; /* MCBPC inter. Addressing: 5 bit ccmmm (cc = CBPC, mmm = mode (1-4 binary)) */ VLC mcbpc_inter_tab[29] = { {1, 1}, {3, 3}, {2, 3}, {3, 5}, {4, 6}, {1, 9}, {0, 0}, {0, 0}, {3, 4}, {7, 7}, {5, 7}, {4, 8}, {4, 9}, {0, 0}, {0, 0}, {0, 0}, {2, 4}, {6, 7}, {4, 7}, {3, 8}, {3, 9}, {0, 0}, {0, 0}, {0, 0}, {5, 6}, {5, 9}, {5, 8}, {3, 7}, {2, 9} }; const VLC xvid_cbpy_tab[16] = { {3, 4}, {5, 5}, {4, 5}, {9, 4}, {3, 5}, {7, 4}, {2, 6}, {11, 4}, {2, 5}, {3, 6}, {5, 4}, {10, 4}, {4, 4}, {8, 4}, {6, 4}, {3, 2} }; const VLC dcy_tab[511] = { {0x100, 15}, {0x101, 15}, {0x102, 15}, {0x103, 15}, {0x104, 15}, {0x105, 15}, {0x106, 15}, {0x107, 15}, {0x108, 15}, {0x109, 15}, {0x10a, 15}, {0x10b, 15}, {0x10c, 15}, {0x10d, 15}, {0x10e, 15}, {0x10f, 15}, {0x110, 15}, {0x111, 15}, {0x112, 15}, {0x113, 15}, {0x114, 15}, {0x115, 15}, {0x116, 15}, {0x117, 15}, {0x118, 15}, {0x119, 15}, {0x11a, 15}, {0x11b, 15}, {0x11c, 15}, {0x11d, 15}, {0x11e, 15}, {0x11f, 15}, {0x120, 15}, {0x121, 15}, {0x122, 15}, {0x123, 15}, {0x124, 15}, {0x125, 15}, {0x126, 15}, {0x127, 15}, {0x128, 15}, {0x129, 15}, {0x12a, 15}, {0x12b, 15}, {0x12c, 15}, {0x12d, 15}, {0x12e, 15}, {0x12f, 15}, {0x130, 15}, {0x131, 15}, {0x132, 15}, {0x133, 15}, {0x134, 15}, {0x135, 15}, {0x136, 15}, {0x137, 15}, {0x138, 15}, {0x139, 15}, {0x13a, 15}, {0x13b, 15}, {0x13c, 15}, {0x13d, 15}, {0x13e, 15}, {0x13f, 15}, {0x140, 15}, {0x141, 15}, {0x142, 15}, {0x143, 15}, {0x144, 15}, {0x145, 15}, {0x146, 
15}, {0x147, 15}, {0x148, 15}, {0x149, 15}, {0x14a, 15}, {0x14b, 15}, {0x14c, 15}, {0x14d, 15}, {0x14e, 15}, {0x14f, 15}, {0x150, 15}, {0x151, 15}, {0x152, 15}, {0x153, 15}, {0x154, 15}, {0x155, 15}, {0x156, 15}, {0x157, 15}, {0x158, 15}, {0x159, 15}, {0x15a, 15}, {0x15b, 15}, {0x15c, 15}, {0x15d, 15}, {0x15e, 15}, {0x15f, 15}, {0x160, 15}, {0x161, 15}, {0x162, 15}, {0x163, 15}, {0x164, 15}, {0x165, 15}, {0x166, 15}, {0x167, 15}, {0x168, 15}, {0x169, 15}, {0x16a, 15}, {0x16b, 15}, {0x16c, 15}, {0x16d, 15}, {0x16e, 15}, {0x16f, 15}, {0x170, 15}, {0x171, 15}, {0x172, 15}, {0x173, 15}, {0x174, 15}, {0x175, 15}, {0x176, 15}, {0x177, 15}, {0x178, 15}, {0x179, 15}, {0x17a, 15}, {0x17b, 15}, {0x17c, 15}, {0x17d, 15}, {0x17e, 15}, {0x17f, 15}, {0x80, 13}, {0x81, 13}, {0x82, 13}, {0x83, 13}, {0x84, 13}, {0x85, 13}, {0x86, 13}, {0x87, 13}, {0x88, 13}, {0x89, 13}, {0x8a, 13}, {0x8b, 13}, {0x8c, 13}, {0x8d, 13}, {0x8e, 13}, {0x8f, 13}, {0x90, 13}, {0x91, 13}, {0x92, 13}, {0x93, 13}, {0x94, 13}, {0x95, 13}, {0x96, 13}, {0x97, 13}, {0x98, 13}, {0x99, 13}, {0x9a, 13}, {0x9b, 13}, {0x9c, 13}, {0x9d, 13}, {0x9e, 13}, {0x9f, 13}, {0xa0, 13}, {0xa1, 13}, {0xa2, 13}, {0xa3, 13}, {0xa4, 13}, {0xa5, 13}, {0xa6, 13}, {0xa7, 13}, {0xa8, 13}, {0xa9, 13}, {0xaa, 13}, {0xab, 13}, {0xac, 13}, {0xad, 13}, {0xae, 13}, {0xaf, 13}, {0xb0, 13}, {0xb1, 13}, {0xb2, 13}, {0xb3, 13}, {0xb4, 13}, {0xb5, 13}, {0xb6, 13}, {0xb7, 13}, {0xb8, 13}, {0xb9, 13}, {0xba, 13}, {0xbb, 13}, {0xbc, 13}, {0xbd, 13}, {0xbe, 13}, {0xbf, 13}, {0x40, 11}, {0x41, 11}, {0x42, 11}, {0x43, 11}, {0x44, 11}, {0x45, 11}, {0x46, 11}, {0x47, 11}, {0x48, 11}, {0x49, 11}, {0x4a, 11}, {0x4b, 11}, {0x4c, 11}, {0x4d, 11}, {0x4e, 11}, {0x4f, 11}, {0x50, 11}, {0x51, 11}, {0x52, 11}, {0x53, 11}, {0x54, 11}, {0x55, 11}, {0x56, 11}, {0x57, 11}, {0x58, 11}, {0x59, 11}, {0x5a, 11}, {0x5b, 11}, {0x5c, 11}, {0x5d, 11}, {0x5e, 11}, {0x5f, 11}, {0x20, 9}, {0x21, 9}, {0x22, 9}, {0x23, 9}, {0x24, 9}, {0x25, 9}, {0x26, 9}, {0x27, 9}, {0x28, 9}, 
{0x29, 9}, {0x2a, 9}, {0x2b, 9}, {0x2c, 9}, {0x2d, 9}, {0x2e, 9}, {0x2f, 9}, {0x10, 7}, {0x11, 7}, {0x12, 7}, {0x13, 7}, {0x14, 7}, {0x15, 7}, {0x16, 7}, {0x17, 7}, {0x10, 6}, {0x11, 6}, {0x12, 6}, {0x13, 6}, {0x08, 4}, {0x09, 4}, {0x06, 3}, {0x03, 3}, {0x07, 3}, {0x0a, 4}, {0x0b, 4}, {0x14, 6}, {0x15, 6}, {0x16, 6}, {0x17, 6}, {0x18, 7}, {0x19, 7}, {0x1a, 7}, {0x1b, 7}, {0x1c, 7}, {0x1d, 7}, {0x1e, 7}, {0x1f, 7}, {0x30, 9}, {0x31, 9}, {0x32, 9}, {0x33, 9}, {0x34, 9}, {0x35, 9}, {0x36, 9}, {0x37, 9}, {0x38, 9}, {0x39, 9}, {0x3a, 9}, {0x3b, 9}, {0x3c, 9}, {0x3d, 9}, {0x3e, 9}, {0x3f, 9}, {0x60, 11}, {0x61, 11}, {0x62, 11}, {0x63, 11}, {0x64, 11}, {0x65, 11}, {0x66, 11}, {0x67, 11}, {0x68, 11}, {0x69, 11}, {0x6a, 11}, {0x6b, 11}, {0x6c, 11}, {0x6d, 11}, {0x6e, 11}, {0x6f, 11}, {0x70, 11}, {0x71, 11}, {0x72, 11}, {0x73, 11}, {0x74, 11}, {0x75, 11}, {0x76, 11}, {0x77, 11}, {0x78, 11}, {0x79, 11}, {0x7a, 11}, {0x7b, 11}, {0x7c, 11}, {0x7d, 11}, {0x7e, 11}, {0x7f, 11}, {0xc0, 13}, {0xc1, 13}, {0xc2, 13}, {0xc3, 13}, {0xc4, 13}, {0xc5, 13}, {0xc6, 13}, {0xc7, 13}, {0xc8, 13}, {0xc9, 13}, {0xca, 13}, {0xcb, 13}, {0xcc, 13}, {0xcd, 13}, {0xce, 13}, {0xcf, 13}, {0xd0, 13}, {0xd1, 13}, {0xd2, 13}, {0xd3, 13}, {0xd4, 13}, {0xd5, 13}, {0xd6, 13}, {0xd7, 13}, {0xd8, 13}, {0xd9, 13}, {0xda, 13}, {0xdb, 13}, {0xdc, 13}, {0xdd, 13}, {0xde, 13}, {0xdf, 13}, {0xe0, 13}, {0xe1, 13}, {0xe2, 13}, {0xe3, 13}, {0xe4, 13}, {0xe5, 13}, {0xe6, 13}, {0xe7, 13}, {0xe8, 13}, {0xe9, 13}, {0xea, 13}, {0xeb, 13}, {0xec, 13}, {0xed, 13}, {0xee, 13}, {0xef, 13}, {0xf0, 13}, {0xf1, 13}, {0xf2, 13}, {0xf3, 13}, {0xf4, 13}, {0xf5, 13}, {0xf6, 13}, {0xf7, 13}, {0xf8, 13}, {0xf9, 13}, {0xfa, 13}, {0xfb, 13}, {0xfc, 13}, {0xfd, 13}, {0xfe, 13}, {0xff, 13}, {0x180, 15}, {0x181, 15}, {0x182, 15}, {0x183, 15}, {0x184, 15}, {0x185, 15}, {0x186, 15}, {0x187, 15}, {0x188, 15}, {0x189, 15}, {0x18a, 15}, {0x18b, 15}, {0x18c, 15}, {0x18d, 15}, {0x18e, 15}, {0x18f, 15}, {0x190, 15}, {0x191, 15}, {0x192, 15}, 
{0x193, 15}, {0x194, 15}, {0x195, 15}, {0x196, 15}, {0x197, 15}, {0x198, 15}, {0x199, 15}, {0x19a, 15}, {0x19b, 15}, {0x19c, 15}, {0x19d, 15}, {0x19e, 15}, {0x19f, 15}, {0x1a0, 15}, {0x1a1, 15}, {0x1a2, 15}, {0x1a3, 15}, {0x1a4, 15}, {0x1a5, 15}, {0x1a6, 15}, {0x1a7, 15}, {0x1a8, 15}, {0x1a9, 15}, {0x1aa, 15}, {0x1ab, 15}, {0x1ac, 15}, {0x1ad, 15}, {0x1ae, 15}, {0x1af, 15}, {0x1b0, 15}, {0x1b1, 15}, {0x1b2, 15}, {0x1b3, 15}, {0x1b4, 15}, {0x1b5, 15}, {0x1b6, 15}, {0x1b7, 15}, {0x1b8, 15}, {0x1b9, 15}, {0x1ba, 15}, {0x1bb, 15}, {0x1bc, 15}, {0x1bd, 15}, {0x1be, 15}, {0x1bf, 15}, {0x1c0, 15}, {0x1c1, 15}, {0x1c2, 15}, {0x1c3, 15}, {0x1c4, 15}, {0x1c5, 15}, {0x1c6, 15}, {0x1c7, 15}, {0x1c8, 15}, {0x1c9, 15}, {0x1ca, 15}, {0x1cb, 15}, {0x1cc, 15}, {0x1cd, 15}, {0x1ce, 15}, {0x1cf, 15}, {0x1d0, 15}, {0x1d1, 15}, {0x1d2, 15}, {0x1d3, 15}, {0x1d4, 15}, {0x1d5, 15}, {0x1d6, 15}, {0x1d7, 15}, {0x1d8, 15}, {0x1d9, 15}, {0x1da, 15}, {0x1db, 15}, {0x1dc, 15}, {0x1dd, 15}, {0x1de, 15}, {0x1df, 15}, {0x1e0, 15}, {0x1e1, 15}, {0x1e2, 15}, {0x1e3, 15}, {0x1e4, 15}, {0x1e5, 15}, {0x1e6, 15}, {0x1e7, 15}, {0x1e8, 15}, {0x1e9, 15}, {0x1ea, 15}, {0x1eb, 15}, {0x1ec, 15}, {0x1ed, 15}, {0x1ee, 15}, {0x1ef, 15}, {0x1f0, 15}, {0x1f1, 15}, {0x1f2, 15}, {0x1f3, 15}, {0x1f4, 15}, {0x1f5, 15}, {0x1f6, 15}, {0x1f7, 15}, {0x1f8, 15}, {0x1f9, 15}, {0x1fa, 15}, {0x1fb, 15}, {0x1fc, 15}, {0x1fd, 15}, {0x1fe, 15}, {0x1ff, 15}, }; const VLC dcc_tab[511] = { {0x100, 16}, {0x101, 16}, {0x102, 16}, {0x103, 16}, {0x104, 16}, {0x105, 16}, {0x106, 16}, {0x107, 16}, {0x108, 16}, {0x109, 16}, {0x10a, 16}, {0x10b, 16}, {0x10c, 16}, {0x10d, 16}, {0x10e, 16}, {0x10f, 16}, {0x110, 16}, {0x111, 16}, {0x112, 16}, {0x113, 16}, {0x114, 16}, {0x115, 16}, {0x116, 16}, {0x117, 16}, {0x118, 16}, {0x119, 16}, {0x11a, 16}, {0x11b, 16}, {0x11c, 16}, {0x11d, 16}, {0x11e, 16}, {0x11f, 16}, {0x120, 16}, {0x121, 16}, {0x122, 16}, {0x123, 16}, {0x124, 16}, {0x125, 16}, {0x126, 16}, {0x127, 16}, {0x128, 16}, {0x129, 16}, 
{0x12a, 16}, {0x12b, 16}, {0x12c, 16}, {0x12d, 16}, {0x12e, 16}, {0x12f, 16}, {0x130, 16}, {0x131, 16}, {0x132, 16}, {0x133, 16}, {0x134, 16}, {0x135, 16}, {0x136, 16}, {0x137, 16}, {0x138, 16}, {0x139, 16}, {0x13a, 16}, {0x13b, 16}, {0x13c, 16}, {0x13d, 16}, {0x13e, 16}, {0x13f, 16}, {0x140, 16}, {0x141, 16}, {0x142, 16}, {0x143, 16}, {0x144, 16}, {0x145, 16}, {0x146, 16}, {0x147, 16}, {0x148, 16}, {0x149, 16}, {0x14a, 16}, {0x14b, 16}, {0x14c, 16}, {0x14d, 16}, {0x14e, 16}, {0x14f, 16}, {0x150, 16}, {0x151, 16}, {0x152, 16}, {0x153, 16}, {0x154, 16}, {0x155, 16}, {0x156, 16}, {0x157, 16}, {0x158, 16}, {0x159, 16}, {0x15a, 16}, {0x15b, 16}, {0x15c, 16}, {0x15d, 16}, {0x15e, 16}, {0x15f, 16}, {0x160, 16}, {0x161, 16}, {0x162, 16}, {0x163, 16}, {0x164, 16}, {0x165, 16}, {0x166, 16}, {0x167, 16}, {0x168, 16}, {0x169, 16}, {0x16a, 16}, {0x16b, 16}, {0x16c, 16}, {0x16d, 16}, {0x16e, 16}, {0x16f, 16}, {0x170, 16}, {0x171, 16}, {0x172, 16}, {0x173, 16}, {0x174, 16}, {0x175, 16}, {0x176, 16}, {0x177, 16}, {0x178, 16}, {0x179, 16}, {0x17a, 16}, {0x17b, 16}, {0x17c, 16}, {0x17d, 16}, {0x17e, 16}, {0x17f, 16}, {0x80, 14}, {0x81, 14}, {0x82, 14}, {0x83, 14}, {0x84, 14}, {0x85, 14}, {0x86, 14}, {0x87, 14}, {0x88, 14}, {0x89, 14}, {0x8a, 14}, {0x8b, 14}, {0x8c, 14}, {0x8d, 14}, {0x8e, 14}, {0x8f, 14}, {0x90, 14}, {0x91, 14}, {0x92, 14}, {0x93, 14}, {0x94, 14}, {0x95, 14}, {0x96, 14}, {0x97, 14}, {0x98, 14}, {0x99, 14}, {0x9a, 14}, {0x9b, 14}, {0x9c, 14}, {0x9d, 14}, {0x9e, 14}, {0x9f, 14}, {0xa0, 14}, {0xa1, 14}, {0xa2, 14}, {0xa3, 14}, {0xa4, 14}, {0xa5, 14}, {0xa6, 14}, {0xa7, 14}, {0xa8, 14}, {0xa9, 14}, {0xaa, 14}, {0xab, 14}, {0xac, 14}, {0xad, 14}, {0xae, 14}, {0xaf, 14}, {0xb0, 14}, {0xb1, 14}, {0xb2, 14}, {0xb3, 14}, {0xb4, 14}, {0xb5, 14}, {0xb6, 14}, {0xb7, 14}, {0xb8, 14}, {0xb9, 14}, {0xba, 14}, {0xbb, 14}, {0xbc, 14}, {0xbd, 14}, {0xbe, 14}, {0xbf, 14}, {0x40, 12}, {0x41, 12}, {0x42, 12}, {0x43, 12}, {0x44, 12}, {0x45, 12}, {0x46, 12}, {0x47, 12}, {0x48, 12}, 
{0x49, 12}, {0x4a, 12}, {0x4b, 12}, {0x4c, 12}, {0x4d, 12}, {0x4e, 12}, {0x4f, 12}, {0x50, 12}, {0x51, 12}, {0x52, 12}, {0x53, 12}, {0x54, 12}, {0x55, 12}, {0x56, 12}, {0x57, 12}, {0x58, 12}, {0x59, 12}, {0x5a, 12}, {0x5b, 12}, {0x5c, 12}, {0x5d, 12}, {0x5e, 12}, {0x5f, 12}, {0x20, 10}, {0x21, 10}, {0x22, 10}, {0x23, 10}, {0x24, 10}, {0x25, 10}, {0x26, 10}, {0x27, 10}, {0x28, 10}, {0x29, 10}, {0x2a, 10}, {0x2b, 10}, {0x2c, 10}, {0x2d, 10}, {0x2e, 10}, {0x2f, 10}, {0x10, 8}, {0x11, 8}, {0x12, 8}, {0x13, 8}, {0x14, 8}, {0x15, 8}, {0x16, 8}, {0x17, 8}, {0x08, 6}, {0x09, 6}, {0x0a, 6}, {0x0b, 6}, {0x04, 4}, {0x05, 4}, {0x04, 3}, {0x03, 2}, {0x05, 3}, {0x06, 4}, {0x07, 4}, {0x0c, 6}, {0x0d, 6}, {0x0e, 6}, {0x0f, 6}, {0x18, 8}, {0x19, 8}, {0x1a, 8}, {0x1b, 8}, {0x1c, 8}, {0x1d, 8}, {0x1e, 8}, {0x1f, 8}, {0x30, 10}, {0x31, 10}, {0x32, 10}, {0x33, 10}, {0x34, 10}, {0x35, 10}, {0x36, 10}, {0x37, 10}, {0x38, 10}, {0x39, 10}, {0x3a, 10}, {0x3b, 10}, {0x3c, 10}, {0x3d, 10}, {0x3e, 10}, {0x3f, 10}, {0x60, 12}, {0x61, 12}, {0x62, 12}, {0x63, 12}, {0x64, 12}, {0x65, 12}, {0x66, 12}, {0x67, 12}, {0x68, 12}, {0x69, 12}, {0x6a, 12}, {0x6b, 12}, {0x6c, 12}, {0x6d, 12}, {0x6e, 12}, {0x6f, 12}, {0x70, 12}, {0x71, 12}, {0x72, 12}, {0x73, 12}, {0x74, 12}, {0x75, 12}, {0x76, 12}, {0x77, 12}, {0x78, 12}, {0x79, 12}, {0x7a, 12}, {0x7b, 12}, {0x7c, 12}, {0x7d, 12}, {0x7e, 12}, {0x7f, 12}, {0xc0, 14}, {0xc1, 14}, {0xc2, 14}, {0xc3, 14}, {0xc4, 14}, {0xc5, 14}, {0xc6, 14}, {0xc7, 14}, {0xc8, 14}, {0xc9, 14}, {0xca, 14}, {0xcb, 14}, {0xcc, 14}, {0xcd, 14}, {0xce, 14}, {0xcf, 14}, {0xd0, 14}, {0xd1, 14}, {0xd2, 14}, {0xd3, 14}, {0xd4, 14}, {0xd5, 14}, {0xd6, 14}, {0xd7, 14}, {0xd8, 14}, {0xd9, 14}, {0xda, 14}, {0xdb, 14}, {0xdc, 14}, {0xdd, 14}, {0xde, 14}, {0xdf, 14}, {0xe0, 14}, {0xe1, 14}, {0xe2, 14}, {0xe3, 14}, {0xe4, 14}, {0xe5, 14}, {0xe6, 14}, {0xe7, 14}, {0xe8, 14}, {0xe9, 14}, {0xea, 14}, {0xeb, 14}, {0xec, 14}, {0xed, 14}, {0xee, 14}, {0xef, 14}, {0xf0, 14}, {0xf1, 14}, {0xf2, 14}, 
{0xf3, 14}, {0xf4, 14}, {0xf5, 14}, {0xf6, 14}, {0xf7, 14}, {0xf8, 14}, {0xf9, 14}, {0xfa, 14}, {0xfb, 14}, {0xfc, 14}, {0xfd, 14}, {0xfe, 14}, {0xff, 14}, {0x180, 16}, {0x181, 16}, {0x182, 16}, {0x183, 16}, {0x184, 16}, {0x185, 16}, {0x186, 16}, {0x187, 16}, {0x188, 16}, {0x189, 16}, {0x18a, 16}, {0x18b, 16}, {0x18c, 16}, {0x18d, 16}, {0x18e, 16}, {0x18f, 16}, {0x190, 16}, {0x191, 16}, {0x192, 16}, {0x193, 16}, {0x194, 16}, {0x195, 16}, {0x196, 16}, {0x197, 16}, {0x198, 16}, {0x199, 16}, {0x19a, 16}, {0x19b, 16}, {0x19c, 16}, {0x19d, 16}, {0x19e, 16}, {0x19f, 16}, {0x1a0, 16}, {0x1a1, 16}, {0x1a2, 16}, {0x1a3, 16}, {0x1a4, 16}, {0x1a5, 16}, {0x1a6, 16}, {0x1a7, 16}, {0x1a8, 16}, {0x1a9, 16}, {0x1aa, 16}, {0x1ab, 16}, {0x1ac, 16}, {0x1ad, 16}, {0x1ae, 16}, {0x1af, 16}, {0x1b0, 16}, {0x1b1, 16}, {0x1b2, 16}, {0x1b3, 16}, {0x1b4, 16}, {0x1b5, 16}, {0x1b6, 16}, {0x1b7, 16}, {0x1b8, 16}, {0x1b9, 16}, {0x1ba, 16}, {0x1bb, 16}, {0x1bc, 16}, {0x1bd, 16}, {0x1be, 16}, {0x1bf, 16}, {0x1c0, 16}, {0x1c1, 16}, {0x1c2, 16}, {0x1c3, 16}, {0x1c4, 16}, {0x1c5, 16}, {0x1c6, 16}, {0x1c7, 16}, {0x1c8, 16}, {0x1c9, 16}, {0x1ca, 16}, {0x1cb, 16}, {0x1cc, 16}, {0x1cd, 16}, {0x1ce, 16}, {0x1cf, 16}, {0x1d0, 16}, {0x1d1, 16}, {0x1d2, 16}, {0x1d3, 16}, {0x1d4, 16}, {0x1d5, 16}, {0x1d6, 16}, {0x1d7, 16}, {0x1d8, 16}, {0x1d9, 16}, {0x1da, 16}, {0x1db, 16}, {0x1dc, 16}, {0x1dd, 16}, {0x1de, 16}, {0x1df, 16}, {0x1e0, 16}, {0x1e1, 16}, {0x1e2, 16}, {0x1e3, 16}, {0x1e4, 16}, {0x1e5, 16}, {0x1e6, 16}, {0x1e7, 16}, {0x1e8, 16}, {0x1e9, 16}, {0x1ea, 16}, {0x1eb, 16}, {0x1ec, 16}, {0x1ed, 16}, {0x1ee, 16}, {0x1ef, 16}, {0x1f0, 16}, {0x1f1, 16}, {0x1f2, 16}, {0x1f3, 16}, {0x1f4, 16}, {0x1f5, 16}, {0x1f6, 16}, {0x1f7, 16}, {0x1f8, 16}, {0x1f9, 16}, {0x1fa, 16}, {0x1fb, 16}, {0x1fc, 16}, {0x1fd, 16}, {0x1fe, 16}, {0x1ff, 16}, }; const VLC mb_motion_table[65] = { {0x05, 13}, {0x07, 13}, {0x05, 12}, {0x07, 12}, {0x09, 12}, {0x0b, 12}, {0x0d, 12}, {0x0f, 12}, {0x09, 11}, {0x0b, 11}, {0x0d, 11}, {0x0f, 
11}, {0x11, 11}, {0x13, 11}, {0x15, 11}, {0x17, 11}, {0x19, 11}, {0x1b, 11}, {0x1d, 11}, {0x1f, 11}, {0x21, 11}, {0x23, 11}, {0x13, 10}, {0x15, 10}, {0x17, 10}, {0x07, 8}, {0x09, 8}, {0x0b, 8}, {0x07, 7}, {0x03, 5}, {0x03, 4}, {0x03, 3}, {0x01, 1}, {0x02, 3}, {0x02, 4}, {0x02, 5}, {0x06, 7}, {0x0a, 8}, {0x08, 8}, {0x06, 8}, {0x16, 10}, {0x14, 10}, {0x12, 10}, {0x22, 11}, {0x20, 11}, {0x1e, 11}, {0x1c, 11}, {0x1a, 11}, {0x18, 11}, {0x16, 11}, {0x14, 11}, {0x12, 11}, {0x10, 11}, {0x0e, 11}, {0x0c, 11}, {0x0a, 11}, {0x08, 11}, {0x0e, 12}, {0x0c, 12}, {0x0a, 12}, {0x08, 12}, {0x06, 12}, {0x04, 12}, {0x06, 13}, {0x04, 13} }; /****************************************************************** * decoder tables * ******************************************************************/ VLC const mcbpc_intra_table[64] = { {-1, 0}, {20, 6}, {36, 6}, {52, 6}, {4, 4}, {4, 4}, {4, 4}, {4, 4}, {19, 3}, {19, 3}, {19, 3}, {19, 3}, {19, 3}, {19, 3}, {19, 3}, {19, 3}, {35, 3}, {35, 3}, {35, 3}, {35, 3}, {35, 3}, {35, 3}, {35, 3}, {35, 3}, {51, 3}, {51, 3}, {51, 3}, {51, 3}, {51, 3}, {51, 3}, {51, 3}, {51, 3}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1}, {3, 1} }; VLC const mcbpc_inter_table[257] = { {VLC_ERROR, 0}, {255, 9}, {52, 9}, {36, 9}, {20, 9}, {49, 9}, {35, 8}, {35, 8}, {19, 8}, {19, 8}, {50, 8}, {50, 8}, {51, 7}, {51, 7}, {51, 7}, {51, 7}, {34, 7}, {34, 7}, {34, 7}, {34, 7}, {18, 7}, {18, 7}, {18, 7}, {18, 7}, {33, 7}, {33, 7}, {33, 7}, {33, 7}, {17, 7}, {17, 7}, {17, 7}, {17, 7}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {48, 6}, {48, 6}, {48, 6}, {48, 6}, {48, 6}, {48, 6}, {48, 6}, {48, 6}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {3, 5}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 
4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {32, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {16, 4}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {1, 3}, {0, 1} }; VLC const cbpy_table[64] = { {-1, 0}, {-1, 0}, {6, 6}, {9, 6}, {8, 5}, {8, 5}, {4, 5}, {4, 5}, {2, 5}, {2, 5}, {1, 5}, {1, 5}, {0, 4}, {0, 4}, {0, 4}, {0, 4}, {12, 4}, {12, 4}, {12, 4}, {12, 4}, {10, 4}, {10, 4}, {10, 4}, {10, 4}, {14, 4}, {14, 4}, {14, 4}, {14, 4}, {5, 4}, {5, 4}, {5, 4}, {5, 4}, {13, 4}, {13, 4}, {13, 4}, {13, 4}, {3, 4}, {3, 4}, {3, 4}, {3, 4}, {11, 4}, {11, 4}, {11, 4}, {11, 4}, {7, 4}, {7, 4}, {7, 4}, {7, 
4}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2}, {15, 2} }; VLC const TMNMVtab0[] = { {3, 4}, {-3, 4}, {2, 3}, {2, 3}, {-2, 3}, {-2, 3}, {1, 2}, {1, 2}, {1, 2}, {1, 2}, {-1, 2}, {-1, 2}, {-1, 2}, {-1, 2} }; VLC const TMNMVtab1[] = { {12, 10}, {-12, 10}, {11, 10}, {-11, 10}, {10, 9}, {10, 9}, {-10, 9}, {-10, 9}, {9, 9}, {9, 9}, {-9, 9}, {-9, 9}, {8, 9}, {8, 9}, {-8, 9}, {-8, 9}, {7, 7}, {7, 7}, {7, 7}, {7, 7}, {7, 7}, {7, 7}, {7, 7}, {7, 7}, {-7, 7}, {-7, 7}, {-7, 7}, {-7, 7}, {-7, 7}, {-7, 7}, {-7, 7}, {-7, 7}, {6, 7}, {6, 7}, {6, 7}, {6, 7}, {6, 7}, {6, 7}, {6, 7}, {6, 7}, {-6, 7}, {-6, 7}, {-6, 7}, {-6, 7}, {-6, 7}, {-6, 7}, {-6, 7}, {-6, 7}, {5, 7}, {5, 7}, {5, 7}, {5, 7}, {5, 7}, {5, 7}, {5, 7}, {5, 7}, {-5, 7}, {-5, 7}, {-5, 7}, {-5, 7}, {-5, 7}, {-5, 7}, {-5, 7}, {-5, 7}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6}, {-4, 6} }; VLC const TMNMVtab2[] = { {32, 12}, {-32, 12}, {31, 12}, {-31, 12}, {30, 11}, {30, 11}, {-30, 11}, {-30, 11}, {29, 11}, {29, 11}, {-29, 11}, {-29, 11}, {28, 11}, {28, 11}, {-28, 11}, {-28, 11}, {27, 11}, {27, 11}, {-27, 11}, {-27, 11}, {26, 11}, {26, 11}, {-26, 11}, {-26, 11}, {25, 11}, {25, 11}, {-25, 11}, {-25, 11}, {24, 10}, {24, 10}, {24, 10}, {24, 10}, {-24, 10}, {-24, 10}, {-24, 10}, {-24, 10}, {23, 10}, {23, 10}, {23, 10}, {23, 10}, {-23, 10}, {-23, 10}, {-23, 10}, {-23, 10}, {22, 10}, {22, 10}, {22, 10}, {22, 10}, {-22, 10}, {-22, 10}, {-22, 10}, {-22, 10}, {21, 10}, {21, 10}, {21, 10}, {21, 10}, {-21, 10}, {-21, 10}, {-21, 10}, {-21, 10}, {20, 10}, {20, 10}, {20, 10}, {20, 10}, {-20, 10}, {-20, 10}, {-20, 10}, {-20, 10}, {19, 10}, {19, 10}, {19, 10}, {19, 10}, {-19, 10}, {-19, 10}, {-19, 10}, {-19, 10}, {18, 10}, {18, 10}, 
{18, 10}, {18, 10}, {-18, 10}, {-18, 10}, {-18, 10}, {-18, 10}, {17, 10}, {17, 10}, {17, 10}, {17, 10}, {-17, 10}, {-17, 10}, {-17, 10}, {-17, 10}, {16, 10}, {16, 10}, {16, 10}, {16, 10}, {-16, 10}, {-16, 10}, {-16, 10}, {-16, 10}, {15, 10}, {15, 10}, {15, 10}, {15, 10}, {-15, 10}, {-15, 10}, {-15, 10}, {-15, 10}, {14, 10}, {14, 10}, {14, 10}, {14, 10}, {-14, 10}, {-14, 10}, {-14, 10}, {-14, 10}, {13, 10}, {13, 10}, {13, 10}, {13, 10}, {-13, 10}, {-13, 10}, {-13, 10}, {-13, 10} }; short const dc_threshold[] = { 26708, 29545, 29472, 26223, 30580, 29281, 8293, 29545, 25632, 29285, 30313, 25701, 26144, 28530, 8301, 26740, 8293, 20039, 8277, 20551, 8268, 30296, 17513, 25376, 25711, 25445, 10272, 11825, 11827, 10544, 2606, 28505, 29301, 29472, 26223, 30580, 29281, 8293, 26980, 29811, 26994, 30050, 28532, 8306, 24936, 8307, 28532, 26400, 30313, 8293, 25441, 25955, 29555, 29728, 8303, 29801, 8307, 28531, 29301, 25955, 25376, 25711, 11877, 10 }; VLC const dc_lum_tab[] = { {0, 0}, {4, 3}, {3, 3}, {0, 3}, {2, 2}, {2, 2}, {1, 2}, {1, 2}, }; xvidcore/src/bitstream/cbp.h0000664000076500007650000000275711564705453017206 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - CBP related header - * * Copyright(C) 2002-2003 Edouard Gomez * 2003 Christoph Lampert * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: cbp.h 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#ifndef _ENCODER_CBP_H_
#define _ENCODER_CBP_H_

#include "../portab.h"

typedef uint32_t(cbpFunc) (const int16_t * codes);

typedef cbpFunc *cbpFuncPtr;

extern cbpFuncPtr calc_cbp;

extern cbpFunc calc_cbp_c;
extern cbpFunc calc_cbp_plain;

#if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64)
extern cbpFunc calc_cbp_mmx;
extern cbpFunc calc_cbp_sse2;
#endif

#endif /* _ENCODER_CBP_H_ */
xvidcore/src/bitstream/x86_asm/0000775000076500007650000000000011566427763017552 5ustar xvidbuildxvidbuild
xvidcore/src/bitstream/x86_asm/cbp_mmx.asm0000664000076500007650000000656011254216113021663 0ustar xvidbuildxvidbuild
;/****************************************************************************
; *
; * XVID MPEG-4 VIDEO CODEC
; * - MMX CBP computation -
; *
; * Copyright (C) 2005 Carlo Bramini
; *               2001-2003 Peter Ross
; *               2002-2003 Pascal Massimino
; *
; * This program is free software ; you can redistribute it and/or modify
; * it under the terms of the GNU General Public License as published by
; * the Free Software Foundation ; either version 2 of the License, or
; * (at your option) any later version.
; *
; * This program is distributed in the hope that it will be useful,
; * but WITHOUT ANY WARRANTY ; without even the implied warranty of
; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
; * GNU General Public License for more details.
; *
; * You should have received a copy of the GNU General Public License
; * along with this program ; if not, write to the Free Software
; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
; *
; * $Id: cbp_mmx.asm,v 1.19 2009-09-16 17:07:58 Isibaar Exp $
; *
; ***************************************************************************/

;=============================================================================
; Macros
;=============================================================================

%include "nasm.inc"

;=============================================================================
; Local data
;=============================================================================

DATA

ALIGN SECTION_ALIGN

mult_mask:
  db 0x10,0x20,0x04,0x08,0x01,0x02,0x00,0x00
ignore_dc:
  dw 0, -1, -1, -1

;=============================================================================
; Code
;=============================================================================

TEXT

cglobal calc_cbp_mmx

;-----------------------------------------------------------------------------
; uint32_t calc_cbp_mmx(const int16_t coeff[6][64]);
;-----------------------------------------------------------------------------

%macro MAKE_LOAD 2
  por mm0, [%2-128*1+%1*8]
  por mm1, [%2+128*0+%1*8]
  por mm2, [%2+128*1+%1*8]
  por mm3, [%2+128*2+%1*8]
  por mm4, [%2+128*3+%1*8]
  por mm5, [%2+128*4+%1*8]
%endmacro

ALIGN SECTION_ALIGN
calc_cbp_mmx:
  mov _EAX, prm1 ; coeff

  movq mm7, [ignore_dc]
  pxor mm6, mm6 ; used only for comparing
  movq mm0, [_EAX+128*0]
  movq mm1, [_EAX+128*1]
  movq mm2, [_EAX+128*2]
  movq mm3, [_EAX+128*3]
  movq mm4, [_EAX+128*4]
  movq mm5, [_EAX+128*5]
  add _EAX, 8+128
  pand mm0, mm7
  pand mm1, mm7
  pand mm2, mm7
  pand mm3, mm7
  pand mm4, mm7
  pand mm5, mm7

  MAKE_LOAD 0, _EAX
  MAKE_LOAD 1, _EAX
  MAKE_LOAD 2, _EAX
  MAKE_LOAD 3, _EAX
  MAKE_LOAD 4, _EAX
  MAKE_LOAD 5, _EAX
  MAKE_LOAD 6, _EAX
  MAKE_LOAD 7, _EAX
  MAKE_LOAD 8, _EAX
  MAKE_LOAD 9, _EAX
  MAKE_LOAD 10, _EAX
  MAKE_LOAD 11, _EAX
  MAKE_LOAD 12, _EAX
  MAKE_LOAD 13, _EAX
  MAKE_LOAD 14, _EAX

  movq mm7, [mult_mask]
  packssdw mm0, mm1
  packssdw mm2, mm3
  packssdw mm4, mm5
  packssdw mm0, mm2
  packssdw mm4, mm6
  pcmpeqw mm0, mm6
  pcmpeqw mm4, mm6
  pcmpeqw mm0, mm6
  pcmpeqw mm4, mm6
  psrlw mm0, 15
  psrlw mm4, 15
  packuswb mm0, mm4
  pmaddwd mm0, mm7

  movq mm1, mm0
  psrlq mm1, 32
  paddusb mm0, mm1

  movd eax, mm0
  shr _EAX, 8
  and _EAX, 0x3F

  ret
ENDFUNC

NON_EXEC_STACK
xvidcore/src/bitstream/x86_asm/cbp_sse2.asm0000664000076500007650000000614311254216113021733 0ustar xvidbuildxvidbuild
;/****************************************************************************
; *
; * XVID MPEG-4 VIDEO CODEC
; * - SSE2 CBP computation -
; *
; * Copyright (C) 2002 Daniel Smith
; *               2002 Pascal Massimino
; *
; * This program is free software ; you can redistribute it and/or modify
; * it under the terms of the GNU General Public License as published by
; * the Free Software Foundation ; either version 2 of the License, or
; * (at your option) any later version.
; *
; * This program is distributed in the hope that it will be useful,
; * but WITHOUT ANY WARRANTY ; without even the implied warranty of
; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
; * GNU General Public License for more details.
; *
; * You should have received a copy of the GNU General Public License
; * along with this program ; if not, write to the Free Software
; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
; *
; * $Id: cbp_sse2.asm,v 1.14 2009-09-16 17:07:58 Isibaar Exp $
; *
; ***************************************************************************/

;=============================================================================
; Macros
;=============================================================================

%include "nasm.inc"

%macro LOOP_SSE2 2
  movdqa xmm0, [%2+(%1)*128]
  pand xmm0, xmm3
  movdqa xmm1, [%2+(%1)*128+16]
  por xmm0, [%2+(%1)*128+32]
  por xmm1, [%2+(%1)*128+48]
  por xmm0, [%2+(%1)*128+64]
  por xmm1, [%2+(%1)*128+80]
  por xmm0, [%2+(%1)*128+96]
  por xmm1, [%2+(%1)*128+112]

  por xmm0, xmm1 ; xmm0 = xmm1 = 128 bits worth of info
  psadbw xmm0, xmm2 ; contains 2 dwords with sums
  movhlps xmm1, xmm0 ; move high dword from xmm0 to low xmm1
  por xmm0, xmm1 ; combine
  movd ecx, xmm0 ; if ecx set, values were found
  test _ECX, _ECX
%endmacro

;=============================================================================
; Data (Read Only)
;=============================================================================

DATA

ALIGN SECTION_ALIGN
ignore_dc:
  dw 0, -1, -1, -1, -1, -1, -1, -1

;=============================================================================
; Code
;=============================================================================

TEXT

;-----------------------------------------------------------------------------
; uint32_t calc_cbp_sse2(const int16_t coeff[6*64]);
;-----------------------------------------------------------------------------

ALIGN SECTION_ALIGN
cglobal calc_cbp_sse2
calc_cbp_sse2:
  mov _EDX, prm1 ; coeff[]
  xor _EAX, _EAX ; cbp = 0

  movdqu xmm3, [ignore_dc] ; mask to ignore dc value
  pxor xmm2, xmm2 ; zero

  LOOP_SSE2 0, _EDX
  jz .blk2
  or _EAX, (1<<5)

.blk2:
  LOOP_SSE2 1, _EDX
  jz .blk3
  or _EAX, (1<<4)

.blk3:
  LOOP_SSE2 2, _EDX
  jz .blk4
  or _EAX, (1<<3)

.blk4:
  LOOP_SSE2 3, _EDX
  jz .blk5
  or _EAX, (1<<2)

.blk5:
  LOOP_SSE2 4, _EDX
  jz .blk6
  or _EAX, (1<<1)

.blk6:
  LOOP_SSE2 5, _EDX
  jz .finished
  or _EAX, (1<<0)

.finished:
  ret
ENDFUNC

NON_EXEC_STACK
xvidcore/src/bitstream/zigzag.h0000664000076500007650000000403611564705453017725 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - ZigZag order matrices -
 *
 * Copyright(C) 2001-2002 Michael Militzer
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: zigzag.h 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#ifndef _ZIGZAG_H_
#define _ZIGZAG_H_

static const uint16_t scan_tables[3][64] = {
	/* zig_zag_scan */
	{ 0,  1,  8, 16,  9,  2,  3, 10,
	 17, 24, 32, 25, 18, 11,  4,  5,
	 12, 19, 26, 33, 40, 48, 41, 34,
	 27, 20, 13,  6,  7, 14, 21, 28,
	 35, 42, 49, 56, 57, 50, 43, 36,
	 29, 22, 15, 23, 30, 37, 44, 51,
	 58, 59, 52, 45, 38, 31, 39, 46,
	 53, 60, 61, 54, 47, 55, 62, 63},

	/* horizontal_scan */
	{ 0,  1,  2,  3,  8,  9, 16, 17,
	 10, 11,  4,  5,  6,  7, 15, 14,
	 13, 12, 19, 18, 24, 25, 32, 33,
	 26, 27, 20, 21, 22, 23, 28, 29,
	 30, 31, 34, 35, 40, 41, 48, 49,
	 42, 43, 36, 37, 38, 39, 44, 45,
	 46, 47, 50, 51, 56, 57, 58, 59,
	 52, 53, 54, 55, 60, 61, 62, 63},

	/* vertical_scan */
	{ 0,  8, 16, 24,  1,  9,  2, 10,
	 17, 25, 32, 40, 48, 56, 57, 49,
	 41, 33, 26, 18,  3, 11,  4, 12,
	 19, 27, 34, 42, 50, 58, 35, 43,
	 51, 59, 20, 28,  5, 13,  6, 14,
	 21, 29, 36, 44, 52, 60, 37, 45,
	 53, 61, 22, 30,  7, 15, 23, 31,
	 38, 46, 54, 62, 39, 47, 55, 63}
};

#endif /* _ZIGZAG_H_ */
xvidcore/src/bitstream/bitstream.c0000664000076500007650000012707011564705453020423 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - Bitstream reader/writer -
 *
 * Copyright (C) 2001-2003 Peter Ross
 *               2003 Cristoph Lampert
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: bitstream.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#include <string.h>
#include <stdio.h>

#include "bitstream.h"
#include "zigzag.h"
#include "../quant/quant_matrix.h"
#include "mbcoding.h"

static const uint8_t log2_tab_16[16] = { 0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4 };

static uint32_t __inline log2bin(uint32_t value)
{
	int n = 0;
	if (value & 0xffff0000) {
		value >>= 16;
		n += 16;
	}
	if (value & 0xff00) {
		value >>= 8;
		n += 8;
	}
	if (value & 0xf0) {
		value >>= 4;
		n += 4;
	}
	return n + log2_tab_16[value];
}

static const uint32_t intra_dc_threshold_table[] = {
	32, /* never use */
	13,
	15,
	17,
	19,
	21,
	23,
	1,
};

static void
bs_get_matrix(Bitstream * bs,
			  uint8_t * matrix)
{
	int i = 0;
	int last, value = 0;

	do {
		last = value;
		value = BitstreamGetBits(bs, 8);
		matrix[scan_tables[0][i++]] = value;
	}
	while (value != 0 && i < 64);

	if (value != 0) return;

	i--;
	while (i < 64) {
		matrix[scan_tables[0][i++]] = last;
	}
}

/*
 * for PVOP addbits == fcode - 1
 * for BVOP addbits == max(fcode,bcode) - 1
 * returns mbpos
 */
int
read_video_packet_header(Bitstream *bs,
						 DECODER * dec,
						 const int addbits,
						 int * quant,
						 int * fcode_forward,
						 int * fcode_backward,
						 int * intra_dc_threshold)
{
	int startcode_bits = NUMBITS_VP_RESYNC_MARKER + addbits;
	int mbnum_bits = log2bin(dec->mb_width * dec->mb_height - 1);
	int mbnum;
	int hec = 0;

	BitstreamSkip(bs, BitstreamNumBitsToByteAlign(bs));
	BitstreamSkip(bs, startcode_bits);

	DPRINTF(XVID_DEBUG_STARTCODE, "\n");

	if (dec->shape != VIDOBJLAY_SHAPE_RECTANGULAR)
	{
		hec = BitstreamGetBit(bs); /* header_extension_code */
		if (hec && !(dec->sprite_enable == SPRITE_STATIC /* && current_coding_type = I_VOP */))
		{
			BitstreamSkip(bs, 13); /* vop_width */
			READ_MARKER();
			BitstreamSkip(bs, 13); /* vop_height */
			READ_MARKER();
			BitstreamSkip(bs, 13); /* vop_horizontal_mc_spatial_ref */
			READ_MARKER();
			BitstreamSkip(bs, 13); /* vop_vertical_mc_spatial_ref */
			READ_MARKER();
		}
	}

	mbnum = BitstreamGetBits(bs, mbnum_bits); /* macroblock_number */
	DPRINTF(XVID_DEBUG_HEADER, "mbnum %i\n", mbnum);

	if (dec->shape != VIDOBJLAY_SHAPE_BINARY_ONLY)
	{
		*quant = BitstreamGetBits(bs, dec->quant_bits); /* quant_scale */
		DPRINTF(XVID_DEBUG_HEADER, "quant %i\n", *quant);
	}

	if (dec->shape == VIDOBJLAY_SHAPE_RECTANGULAR)
		hec = BitstreamGetBit(bs); /* header_extension_code */

	DPRINTF(XVID_DEBUG_HEADER, "header_extension_code %i\n", hec);
	if (hec)
	{
		int time_base;
		int time_increment = 0; /* stays 0 when time_inc_bits == 0, so the DPRINTF below never reads an uninitialized value */
		int coding_type;

		for (time_base=0; BitstreamGetBit(bs)!=0; time_base++); /* modulo_time_base */
		READ_MARKER();
		if (dec->time_inc_bits)
			time_increment = (BitstreamGetBits(bs, dec->time_inc_bits)); /* vop_time_increment */
		READ_MARKER();
		DPRINTF(XVID_DEBUG_HEADER,"time %i:%i\n", time_base, time_increment);

		coding_type = BitstreamGetBits(bs, 2);
		DPRINTF(XVID_DEBUG_HEADER,"coding_type %i\n", coding_type);

		if (dec->shape != VIDOBJLAY_SHAPE_RECTANGULAR)
		{
			BitstreamSkip(bs, 1); /* change_conv_ratio_disable */
			if (coding_type != I_VOP)
				BitstreamSkip(bs, 1); /* vop_shape_coding_type */
		}

		if (dec->shape != VIDOBJLAY_SHAPE_BINARY_ONLY)
		{
			*intra_dc_threshold = intra_dc_threshold_table[BitstreamGetBits(bs, 3)];

			if (dec->sprite_enable == SPRITE_GMC && coding_type == S_VOP &&
				dec->sprite_warping_points > 0)
			{
				/* TODO: sprite trajectory */
			}
			if (dec->reduced_resolution_enable &&
				dec->shape == VIDOBJLAY_SHAPE_RECTANGULAR &&
				(coding_type == P_VOP || coding_type == I_VOP))
			{
				BitstreamSkip(bs, 1); /* XXX: vop_reduced_resolution */
			}

			if (coding_type != I_VOP && fcode_forward)
			{
				*fcode_forward = BitstreamGetBits(bs, 3);
				DPRINTF(XVID_DEBUG_HEADER,"fcode_forward %i\n", *fcode_forward);
			}

			if (coding_type == B_VOP && fcode_backward)
			{
				*fcode_backward = BitstreamGetBits(bs, 3);
				DPRINTF(XVID_DEBUG_HEADER,"fcode_backward %i\n",
*fcode_backward); } } } if (dec->newpred_enable) { int vop_id; int vop_id_for_prediction; vop_id = BitstreamGetBits(bs, MIN(dec->time_inc_bits + 3, 15)); DPRINTF(XVID_DEBUG_HEADER, "vop_id %i\n", vop_id); if (BitstreamGetBit(bs)) /* vop_id_for_prediction_indication */ { vop_id_for_prediction = BitstreamGetBits(bs, MIN(dec->time_inc_bits + 3, 15)); DPRINTF(XVID_DEBUG_HEADER, "vop_id_for_prediction %i\n", vop_id_for_prediction); } READ_MARKER(); } return mbnum; } /* vol estimation header */ static void read_vol_complexity_estimation_header(Bitstream * bs, DECODER * dec) { ESTIMATION * e = &dec->estimation; e->method = BitstreamGetBits(bs, 2); /* estimation_method */ DPRINTF(XVID_DEBUG_HEADER,"+ complexity_estimation_header; method=%i\n", e->method); if (e->method == 0 || e->method == 1) { if (!BitstreamGetBit(bs)) /* shape_complexity_estimation_disable */ { e->opaque = BitstreamGetBit(bs); /* opaque */ e->transparent = BitstreamGetBit(bs); /* transparent */ e->intra_cae = BitstreamGetBit(bs); /* intra_cae */ e->inter_cae = BitstreamGetBit(bs); /* inter_cae */ e->no_update = BitstreamGetBit(bs); /* no_update */ e->upsampling = BitstreamGetBit(bs); /* upsampling */ } if (!BitstreamGetBit(bs)) /* texture_complexity_estimation_set_1_disable */ { e->intra_blocks = BitstreamGetBit(bs); /* intra_blocks */ e->inter_blocks = BitstreamGetBit(bs); /* inter_blocks */ e->inter4v_blocks = BitstreamGetBit(bs); /* inter4v_blocks */ e->not_coded_blocks = BitstreamGetBit(bs); /* not_coded_blocks */ } } READ_MARKER(); if (!BitstreamGetBit(bs)) /* texture_complexity_estimation_set_2_disable */ { e->dct_coefs = BitstreamGetBit(bs); /* dct_coefs */ e->dct_lines = BitstreamGetBit(bs); /* dct_lines */ e->vlc_symbols = BitstreamGetBit(bs); /* vlc_symbols */ e->vlc_bits = BitstreamGetBit(bs); /* vlc_bits */ } if (!BitstreamGetBit(bs)) /* motion_compensation_complexity_disable */ { e->apm = BitstreamGetBit(bs); /* apm */ e->npm = BitstreamGetBit(bs); /* npm */ e->interpolate_mc_q = 
BitstreamGetBit(bs); /* interpolate_mc_q */ e->forw_back_mc_q = BitstreamGetBit(bs); /* forw_back_mc_q */ e->halfpel2 = BitstreamGetBit(bs); /* halfpel2 */ e->halfpel4 = BitstreamGetBit(bs); /* halfpel4 */ } READ_MARKER(); if (e->method == 1) { if (!BitstreamGetBit(bs)) /* version2_complexity_estimation_disable */ { e->sadct = BitstreamGetBit(bs); /* sadct */ e->quarterpel = BitstreamGetBit(bs); /* quarterpel */ } } } /* vop estimation header */ static void read_vop_complexity_estimation_header(Bitstream * bs, DECODER * dec, int coding_type) { ESTIMATION * e = &dec->estimation; if (e->method == 0 || e->method == 1) { if (coding_type == I_VOP) { if (e->opaque) BitstreamSkip(bs, 8); /* dcecs_opaque */ if (e->transparent) BitstreamSkip(bs, 8); /* */ if (e->intra_cae) BitstreamSkip(bs, 8); /* */ if (e->inter_cae) BitstreamSkip(bs, 8); /* */ if (e->no_update) BitstreamSkip(bs, 8); /* */ if (e->upsampling) BitstreamSkip(bs, 8); /* */ if (e->intra_blocks) BitstreamSkip(bs, 8); /* */ if (e->not_coded_blocks) BitstreamSkip(bs, 8); /* */ if (e->dct_coefs) BitstreamSkip(bs, 8); /* */ if (e->dct_lines) BitstreamSkip(bs, 8); /* */ if (e->vlc_symbols) BitstreamSkip(bs, 8); /* */ if (e->vlc_bits) BitstreamSkip(bs, 8); /* */ if (e->sadct) BitstreamSkip(bs, 8); /* */ } if (coding_type == P_VOP) { if (e->opaque) BitstreamSkip(bs, 8); /* */ if (e->transparent) BitstreamSkip(bs, 8); /* */ if (e->intra_cae) BitstreamSkip(bs, 8); /* */ if (e->inter_cae) BitstreamSkip(bs, 8); /* */ if (e->no_update) BitstreamSkip(bs, 8); /* */ if (e->upsampling) BitstreamSkip(bs, 8); /* */ if (e->intra_blocks) BitstreamSkip(bs, 8); /* */ if (e->not_coded_blocks) BitstreamSkip(bs, 8); /* */ if (e->dct_coefs) BitstreamSkip(bs, 8); /* */ if (e->dct_lines) BitstreamSkip(bs, 8); /* */ if (e->vlc_symbols) BitstreamSkip(bs, 8); /* */ if (e->vlc_bits) BitstreamSkip(bs, 8); /* */ if (e->inter_blocks) BitstreamSkip(bs, 8); /* */ if (e->inter4v_blocks) BitstreamSkip(bs, 8); /* */ if (e->apm) BitstreamSkip(bs, 8); 
/* */ if (e->npm) BitstreamSkip(bs, 8); /* */ if (e->forw_back_mc_q) BitstreamSkip(bs, 8); /* */ if (e->halfpel2) BitstreamSkip(bs, 8); /* */ if (e->halfpel4) BitstreamSkip(bs, 8); /* */ if (e->sadct) BitstreamSkip(bs, 8); /* */ if (e->quarterpel) BitstreamSkip(bs, 8); /* */ } if (coding_type == B_VOP) { if (e->opaque) BitstreamSkip(bs, 8); /* */ if (e->transparent) BitstreamSkip(bs, 8); /* */ if (e->intra_cae) BitstreamSkip(bs, 8); /* */ if (e->inter_cae) BitstreamSkip(bs, 8); /* */ if (e->no_update) BitstreamSkip(bs, 8); /* */ if (e->upsampling) BitstreamSkip(bs, 8); /* */ if (e->intra_blocks) BitstreamSkip(bs, 8); /* */ if (e->not_coded_blocks) BitstreamSkip(bs, 8); /* */ if (e->dct_coefs) BitstreamSkip(bs, 8); /* */ if (e->dct_lines) BitstreamSkip(bs, 8); /* */ if (e->vlc_symbols) BitstreamSkip(bs, 8); /* */ if (e->vlc_bits) BitstreamSkip(bs, 8); /* */ if (e->inter_blocks) BitstreamSkip(bs, 8); /* */ if (e->inter4v_blocks) BitstreamSkip(bs, 8); /* */ if (e->apm) BitstreamSkip(bs, 8); /* */ if (e->npm) BitstreamSkip(bs, 8); /* */ if (e->forw_back_mc_q) BitstreamSkip(bs, 8); /* */ if (e->halfpel2) BitstreamSkip(bs, 8); /* */ if (e->halfpel4) BitstreamSkip(bs, 8); /* */ if (e->interpolate_mc_q) BitstreamSkip(bs, 8); /* */ if (e->sadct) BitstreamSkip(bs, 8); /* */ if (e->quarterpel) BitstreamSkip(bs, 8); /* */ } if (coding_type == S_VOP && dec->sprite_enable == SPRITE_STATIC) { if (e->intra_blocks) BitstreamSkip(bs, 8); /* */ if (e->not_coded_blocks) BitstreamSkip(bs, 8); /* */ if (e->dct_coefs) BitstreamSkip(bs, 8); /* */ if (e->dct_lines) BitstreamSkip(bs, 8); /* */ if (e->vlc_symbols) BitstreamSkip(bs, 8); /* */ if (e->vlc_bits) BitstreamSkip(bs, 8); /* */ if (e->inter_blocks) BitstreamSkip(bs, 8); /* */ if (e->inter4v_blocks) BitstreamSkip(bs, 8); /* */ if (e->apm) BitstreamSkip(bs, 8); /* */ if (e->npm) BitstreamSkip(bs, 8); /* */ if (e->forw_back_mc_q) BitstreamSkip(bs, 8); /* */ if (e->halfpel2) BitstreamSkip(bs, 8); /* */ if (e->halfpel4) BitstreamSkip(bs, 
8); /* */
			if (e->interpolate_mc_q) BitstreamSkip(bs, 8); /* */
		}
	}
}

/* decode headers
   returns coding_type, or -1 if error
*/

#define VIDOBJ_START_CODE_MASK		0x0000001f
#define VIDOBJLAY_START_CODE_MASK	0x0000000f

int
BitstreamReadHeaders(Bitstream * bs,
					 DECODER * dec,
					 uint32_t * rounding,
					 uint32_t * quant,
					 uint32_t * fcode_forward,
					 uint32_t * fcode_backward,
					 uint32_t * intra_dc_threshold,
					 WARPPOINTS *gmc_warp)
{
	uint32_t vol_ver_id;
	uint32_t coding_type;
	uint32_t start_code;
	uint32_t time_incr = 0;
	int32_t time_increment = 0;
	int resize = 0;

	while ((BitstreamPos(bs) >> 3) + 4 <= bs->length) {

		BitstreamByteAlign(bs);
		start_code = BitstreamShowBits(bs, 32);

		if (start_code == VISOBJSEQ_START_CODE) {

			int profile;

			DPRINTF(XVID_DEBUG_STARTCODE, "\n");

			BitstreamSkip(bs, 32);	/* visual_object_sequence_start_code */
			profile = BitstreamGetBits(bs, 8);	/* profile_and_level_indication */

			DPRINTF(XVID_DEBUG_HEADER, "profile_and_level_indication %i\n", profile);

		} else if (start_code == VISOBJSEQ_STOP_CODE) {

			BitstreamSkip(bs, 32);	/* visual_object_sequence_stop_code */

			DPRINTF(XVID_DEBUG_STARTCODE, "\n");

		} else if (start_code == VISOBJ_START_CODE) {

			DPRINTF(XVID_DEBUG_STARTCODE, "\n");

			BitstreamSkip(bs, 32);	/* visual_object_start_code */
			if (BitstreamGetBit(bs))	/* is_visual_object_identified */
			{
				dec->ver_id = BitstreamGetBits(bs, 4);	/* visual_object_ver_id */
				DPRINTF(XVID_DEBUG_HEADER,"visobj_ver_id %i\n", dec->ver_id);
				BitstreamSkip(bs, 3);	/* visual_object_priority */
			} else {
				dec->ver_id = 1;
			}

			if (BitstreamShowBits(bs, 4) != VISOBJ_TYPE_VIDEO)	/* visual_object_type */
			{
				DPRINTF(XVID_DEBUG_ERROR, "visual_object_type != video\n");
				return -1;
			}
			BitstreamSkip(bs, 4);

			/* video_signal_type */

			if (BitstreamGetBit(bs))	/* video_signal_type */
			{
				DPRINTF(XVID_DEBUG_HEADER,"+ video_signal_type\n");
				BitstreamSkip(bs, 3);	/* video_format */
				BitstreamSkip(bs, 1);	/* video_range */
				if (BitstreamGetBit(bs))	/* color_description */
				{
					DPRINTF(XVID_DEBUG_HEADER,"+ color_description");
BitstreamSkip(bs, 8); /* color_primaries */ BitstreamSkip(bs, 8); /* transfer_characteristics */ BitstreamSkip(bs, 8); /* matrix_coefficients */ } } } else if ((start_code & ~VIDOBJ_START_CODE_MASK) == VIDOBJ_START_CODE) { DPRINTF(XVID_DEBUG_STARTCODE, "\n"); DPRINTF(XVID_DEBUG_HEADER, "vo id %i\n", start_code & VIDOBJ_START_CODE_MASK); BitstreamSkip(bs, 32); /* video_object_start_code */ } else if ((start_code & ~VIDOBJLAY_START_CODE_MASK) == VIDOBJLAY_START_CODE) { DPRINTF(XVID_DEBUG_STARTCODE, "\n"); DPRINTF(XVID_DEBUG_HEADER, "vol id %i\n", start_code & VIDOBJLAY_START_CODE_MASK); BitstreamSkip(bs, 32); /* video_object_layer_start_code */ BitstreamSkip(bs, 1); /* random_accessible_vol */ BitstreamSkip(bs, 8); /* video_object_type_indication */ if (BitstreamGetBit(bs)) /* is_object_layer_identifier */ { DPRINTF(XVID_DEBUG_HEADER, "+ is_object_layer_identifier\n"); vol_ver_id = BitstreamGetBits(bs, 4); /* video_object_layer_verid */ DPRINTF(XVID_DEBUG_HEADER,"ver_id %i\n", vol_ver_id); BitstreamSkip(bs, 3); /* video_object_layer_priority */ } else { vol_ver_id = dec->ver_id; } dec->aspect_ratio = BitstreamGetBits(bs, 4); if (dec->aspect_ratio == VIDOBJLAY_AR_EXTPAR) /* aspect_ratio_info */ { DPRINTF(XVID_DEBUG_HEADER, "+ aspect_ratio_info\n"); dec->par_width = BitstreamGetBits(bs, 8); /* par_width */ dec->par_height = BitstreamGetBits(bs, 8); /* par_height */ } if (BitstreamGetBit(bs)) /* vol_control_parameters */ { DPRINTF(XVID_DEBUG_HEADER, "+ vol_control_parameters\n"); BitstreamSkip(bs, 2); /* chroma_format */ dec->low_delay = BitstreamGetBit(bs); /* low_delay */ DPRINTF(XVID_DEBUG_HEADER, "low_delay %i\n", dec->low_delay); if (BitstreamGetBit(bs)) /* vbv_parameters */ { unsigned int bitrate; unsigned int buffer_size; unsigned int occupancy; DPRINTF(XVID_DEBUG_HEADER,"+ vbv_parameters\n"); bitrate = BitstreamGetBits(bs,15) << 15; /* first_half_bit_rate */ READ_MARKER(); bitrate |= BitstreamGetBits(bs,15); /* latter_half_bit_rate */ READ_MARKER(); buffer_size 
= BitstreamGetBits(bs, 15) << 3; /* first_half_vbv_buffer_size */ READ_MARKER(); buffer_size |= BitstreamGetBits(bs, 3); /* latter_half_vbv_buffer_size */ occupancy = BitstreamGetBits(bs, 11) << 15; /* first_half_vbv_occupancy */ READ_MARKER(); occupancy |= BitstreamGetBits(bs, 15); /* latter_half_vbv_occupancy */ READ_MARKER(); DPRINTF(XVID_DEBUG_HEADER,"bitrate %d (unit=400 bps)\n", bitrate); DPRINTF(XVID_DEBUG_HEADER,"buffer_size %d (unit=16384 bits)\n", buffer_size); DPRINTF(XVID_DEBUG_HEADER,"occupancy %d (unit=64 bits)\n", occupancy); } }else{ dec->low_delay = dec->low_delay_default; } dec->shape = BitstreamGetBits(bs, 2); /* video_object_layer_shape */ DPRINTF(XVID_DEBUG_HEADER, "shape %i\n", dec->shape); if (dec->shape != VIDOBJLAY_SHAPE_RECTANGULAR) { DPRINTF(XVID_DEBUG_ERROR,"non-rectangular shapes are not supported\n"); } if (dec->shape == VIDOBJLAY_SHAPE_GRAYSCALE && vol_ver_id != 1) { BitstreamSkip(bs, 4); /* video_object_layer_shape_extension */ } READ_MARKER(); /********************** for decode B-frame time ***********************/ dec->time_inc_resolution = BitstreamGetBits(bs, 16); /* vop_time_increment_resolution */ DPRINTF(XVID_DEBUG_HEADER,"vop_time_increment_resolution %i\n", dec->time_inc_resolution); if (dec->time_inc_resolution > 0) { dec->time_inc_bits = MAX(log2bin(dec->time_inc_resolution-1), 1); } else { /* for "old" xvid compatibility, set time_inc_bits = 1 */ dec->time_inc_bits = 1; } READ_MARKER(); if (BitstreamGetBit(bs)) /* fixed_vop_rate */ { DPRINTF(XVID_DEBUG_HEADER, "+ fixed_vop_rate\n"); BitstreamSkip(bs, dec->time_inc_bits); /* fixed_vop_time_increment */ } if (dec->shape != VIDOBJLAY_SHAPE_BINARY_ONLY) { if (dec->shape == VIDOBJLAY_SHAPE_RECTANGULAR) { uint32_t width, height; READ_MARKER(); width = BitstreamGetBits(bs, 13); /* video_object_layer_width */ READ_MARKER(); height = BitstreamGetBits(bs, 13); /* video_object_layer_height */ READ_MARKER(); DPRINTF(XVID_DEBUG_HEADER, "width %i\n", width); DPRINTF(XVID_DEBUG_HEADER, 
"height %i\n", height); if (dec->width != width || dec->height != height) { if (dec->fixed_dimensions) { DPRINTF(XVID_DEBUG_ERROR, "decoder width/height does not match bitstream\n"); return -1; } resize = 1; dec->width = width; dec->height = height; } } dec->interlacing = BitstreamGetBit(bs); DPRINTF(XVID_DEBUG_HEADER, "interlacing %i\n", dec->interlacing); if (!BitstreamGetBit(bs)) /* obmc_disable */ { DPRINTF(XVID_DEBUG_ERROR, "obmc_disabled==false not supported\n"); /* TODO */ /* fucking divx4.02 has this enabled */ } dec->sprite_enable = BitstreamGetBits(bs, (vol_ver_id == 1 ? 1 : 2)); /* sprite_enable */ if (dec->sprite_enable == SPRITE_STATIC || dec->sprite_enable == SPRITE_GMC) { int low_latency_sprite_enable; if (dec->sprite_enable != SPRITE_GMC) { int sprite_width; int sprite_height; int sprite_left_coord; int sprite_top_coord; sprite_width = BitstreamGetBits(bs, 13); /* sprite_width */ READ_MARKER(); sprite_height = BitstreamGetBits(bs, 13); /* sprite_height */ READ_MARKER(); sprite_left_coord = BitstreamGetBits(bs, 13); /* sprite_left_coordinate */ READ_MARKER(); sprite_top_coord = BitstreamGetBits(bs, 13); /* sprite_top_coordinate */ READ_MARKER(); } dec->sprite_warping_points = BitstreamGetBits(bs, 6); /* no_of_sprite_warping_points */ dec->sprite_warping_accuracy = BitstreamGetBits(bs, 2); /* sprite_warping_accuracy */ dec->sprite_brightness_change = BitstreamGetBits(bs, 1); /* brightness_change */ if (dec->sprite_enable != SPRITE_GMC) { low_latency_sprite_enable = BitstreamGetBits(bs, 1); /* low_latency_sprite_enable */ } } if (vol_ver_id != 1 && dec->shape != VIDOBJLAY_SHAPE_RECTANGULAR) { BitstreamSkip(bs, 1); /* sadct_disable */ } if (BitstreamGetBit(bs)) /* not_8_bit */ { DPRINTF(XVID_DEBUG_HEADER, "not_8_bit==true (ignored)\n"); dec->quant_bits = BitstreamGetBits(bs, 4); /* quant_precision */ BitstreamSkip(bs, 4); /* bits_per_pixel */ } else { dec->quant_bits = 5; } if (dec->shape == VIDOBJLAY_SHAPE_GRAYSCALE) { BitstreamSkip(bs, 1); /* 
no_gray_quant_update */ BitstreamSkip(bs, 1); /* composition_method */ BitstreamSkip(bs, 1); /* linear_composition */ } dec->quant_type = BitstreamGetBit(bs); /* quant_type */ DPRINTF(XVID_DEBUG_HEADER, "quant_type %i\n", dec->quant_type); if (dec->quant_type) { if (BitstreamGetBit(bs)) /* load_intra_quant_mat */ { uint8_t matrix[64]; DPRINTF(XVID_DEBUG_HEADER, "load_intra_quant_mat\n"); bs_get_matrix(bs, matrix); set_intra_matrix(dec->mpeg_quant_matrices, matrix); } else set_intra_matrix(dec->mpeg_quant_matrices, get_default_intra_matrix()); if (BitstreamGetBit(bs)) /* load_inter_quant_mat */ { uint8_t matrix[64]; DPRINTF(XVID_DEBUG_HEADER, "load_inter_quant_mat\n"); bs_get_matrix(bs, matrix); set_inter_matrix(dec->mpeg_quant_matrices, matrix); } else set_inter_matrix(dec->mpeg_quant_matrices, get_default_inter_matrix()); if (dec->shape == VIDOBJLAY_SHAPE_GRAYSCALE) { DPRINTF(XVID_DEBUG_ERROR, "greyscale matrix not supported\n"); return -1; } } if (vol_ver_id != 1) { dec->quarterpel = BitstreamGetBit(bs); /* quarter_sample */ DPRINTF(XVID_DEBUG_HEADER,"quarterpel %i\n", dec->quarterpel); } else dec->quarterpel = 0; dec->complexity_estimation_disable = BitstreamGetBit(bs); /* complexity estimation disable */ if (!dec->complexity_estimation_disable) { read_vol_complexity_estimation_header(bs, dec); } BitstreamSkip(bs, 1); /* resync_marker_disable */ if (BitstreamGetBit(bs)) /* data_partitioned */ { DPRINTF(XVID_DEBUG_ERROR, "data_partitioned not supported\n"); BitstreamSkip(bs, 1); /* reversible_vlc */ } if (vol_ver_id != 1) { dec->newpred_enable = BitstreamGetBit(bs); if (dec->newpred_enable) /* newpred_enable */ { DPRINTF(XVID_DEBUG_HEADER, "+ newpred_enable\n"); BitstreamSkip(bs, 2); /* requested_upstream_message_type */ BitstreamSkip(bs, 1); /* newpred_segment_type */ } dec->reduced_resolution_enable = BitstreamGetBit(bs); /* reduced_resolution_vop_enable */ DPRINTF(XVID_DEBUG_HEADER, "reduced_resolution_enable %i\n", dec->reduced_resolution_enable); } else { 
dec->newpred_enable = 0; dec->reduced_resolution_enable = 0; } dec->scalability = BitstreamGetBit(bs); /* scalability */ if (dec->scalability) { DPRINTF(XVID_DEBUG_ERROR, "scalability not supported\n"); BitstreamSkip(bs, 1); /* hierarchy_type */ BitstreamSkip(bs, 4); /* ref_layer_id */ BitstreamSkip(bs, 1); /* ref_layer_sampling_direc */ BitstreamSkip(bs, 5); /* hor_sampling_factor_n */ BitstreamSkip(bs, 5); /* hor_sampling_factor_m */ BitstreamSkip(bs, 5); /* vert_sampling_factor_n */ BitstreamSkip(bs, 5); /* vert_sampling_factor_m */ BitstreamSkip(bs, 1); /* enhancement_type */ if(dec->shape == VIDOBJLAY_SHAPE_BINARY /* && hierarchy_type==0 */) { BitstreamSkip(bs, 1); /* use_ref_shape */ BitstreamSkip(bs, 1); /* use_ref_texture */ BitstreamSkip(bs, 5); /* shape_hor_sampling_factor_n */ BitstreamSkip(bs, 5); /* shape_hor_sampling_factor_m */ BitstreamSkip(bs, 5); /* shape_vert_sampling_factor_n */ BitstreamSkip(bs, 5); /* shape_vert_sampling_factor_m */ } return -1; } } else /* dec->shape == BINARY_ONLY */ { if (vol_ver_id != 1) { dec->scalability = BitstreamGetBit(bs); /* scalability */ if (dec->scalability) { DPRINTF(XVID_DEBUG_ERROR, "scalability not supported\n"); BitstreamSkip(bs, 4); /* ref_layer_id */ BitstreamSkip(bs, 5); /* hor_sampling_factor_n */ BitstreamSkip(bs, 5); /* hor_sampling_factor_m */ BitstreamSkip(bs, 5); /* vert_sampling_factor_n */ BitstreamSkip(bs, 5); /* vert_sampling_factor_m */ return -1; } } BitstreamSkip(bs, 1); /* resync_marker_disable */ } return (resize ? 
-3 : -2 ); /* VOL */ } else if (start_code == GRPOFVOP_START_CODE) { DPRINTF(XVID_DEBUG_STARTCODE, "\n"); BitstreamSkip(bs, 32); { int hours, minutes, seconds; hours = BitstreamGetBits(bs, 5); minutes = BitstreamGetBits(bs, 6); READ_MARKER(); seconds = BitstreamGetBits(bs, 6); DPRINTF(XVID_DEBUG_HEADER, "time %ih%im%is\n", hours,minutes,seconds); } BitstreamSkip(bs, 1); /* closed_gov */ BitstreamSkip(bs, 1); /* broken_link */ } else if (start_code == VOP_START_CODE) { DPRINTF(XVID_DEBUG_STARTCODE, "\n"); BitstreamSkip(bs, 32); /* vop_start_code */ coding_type = BitstreamGetBits(bs, 2); /* vop_coding_type */ DPRINTF(XVID_DEBUG_HEADER, "coding_type %i\n", coding_type); /*********************** for decode B-frame time ***********************/ while (BitstreamGetBit(bs) != 0) /* time_base */ time_incr++; READ_MARKER(); if (dec->time_inc_bits) { time_increment = (BitstreamGetBits(bs, dec->time_inc_bits)); /* vop_time_increment */ } DPRINTF(XVID_DEBUG_HEADER, "time_base %i\n", time_incr); DPRINTF(XVID_DEBUG_HEADER, "time_increment %i\n", time_increment); DPRINTF(XVID_DEBUG_TIMECODE, "%c %i:%i\n", coding_type == I_VOP ? 'I' : coding_type == P_VOP ? 'P' : coding_type == B_VOP ? 
				'B' : 'S', time_incr, time_increment);

			if (coding_type != B_VOP) {
				dec->last_time_base = dec->time_base;
				dec->time_base += time_incr;
				dec->time = dec->time_base*dec->time_inc_resolution + time_increment;
				dec->time_pp = (int32_t)(dec->time - dec->last_non_b_time);
				dec->last_non_b_time = dec->time;
			} else {
				dec->time = (dec->last_time_base + time_incr)*dec->time_inc_resolution + time_increment;
				dec->time_bp = dec->time_pp - (int32_t)(dec->last_non_b_time - dec->time);
			}
			if (dec->time_pp <= 0) dec->time_pp = 1;
			DPRINTF(XVID_DEBUG_HEADER,"time_pp=%i\n", dec->time_pp);
			DPRINTF(XVID_DEBUG_HEADER,"time_bp=%i\n", dec->time_bp);

			READ_MARKER();

			if (!BitstreamGetBit(bs))	/* vop_coded */
			{
				DPRINTF(XVID_DEBUG_HEADER, "vop_coded==false\n");
				return N_VOP;
			}

			if (dec->newpred_enable)
			{
				int vop_id;
				int vop_id_for_prediction;

				vop_id = BitstreamGetBits(bs, MIN(dec->time_inc_bits + 3, 15));
				DPRINTF(XVID_DEBUG_HEADER, "vop_id %i\n", vop_id);
				if (BitstreamGetBit(bs))	/* vop_id_for_prediction_indication */
				{
					vop_id_for_prediction = BitstreamGetBits(bs, MIN(dec->time_inc_bits + 3, 15));
					DPRINTF(XVID_DEBUG_HEADER, "vop_id_for_prediction %i\n", vop_id_for_prediction);
				}
				READ_MARKER();
			}

			/* fix a little bug by MinChen */
			if ((dec->shape != VIDOBJLAY_SHAPE_BINARY_ONLY) &&
				( (coding_type == P_VOP) || (coding_type == S_VOP && dec->sprite_enable == SPRITE_GMC) ) ) {
				*rounding = BitstreamGetBit(bs);	/* rounding_type */
				DPRINTF(XVID_DEBUG_HEADER, "rounding %i\n", *rounding);
			}

			if (dec->reduced_resolution_enable &&
				dec->shape == VIDOBJLAY_SHAPE_RECTANGULAR &&
				(coding_type == P_VOP || coding_type == I_VOP)) {

				/* vop_reduced_resolution: a stray ';' after the if used to make
				 * this message print unconditionally; report it only when the
				 * flag is actually set */
				if (BitstreamGetBit(bs)) {
					DPRINTF(XVID_DEBUG_ERROR, "RRV not supported (anymore)\n");
				}
			}

			if (dec->shape != VIDOBJLAY_SHAPE_RECTANGULAR) {
				if(!(dec->sprite_enable == SPRITE_STATIC && coding_type == I_VOP)) {

					uint32_t width, height;
					uint32_t horiz_mc_ref, vert_mc_ref;

					width = BitstreamGetBits(bs, 13);
					READ_MARKER();
					height = BitstreamGetBits(bs, 13);
					READ_MARKER();
					horiz_mc_ref =
						BitstreamGetBits(bs, 13);
					READ_MARKER();
					vert_mc_ref = BitstreamGetBits(bs, 13);
					READ_MARKER();

					DPRINTF(XVID_DEBUG_HEADER, "width %i\n", width);
					DPRINTF(XVID_DEBUG_HEADER, "height %i\n", height);
					DPRINTF(XVID_DEBUG_HEADER, "horiz_mc_ref %i\n", horiz_mc_ref);
					DPRINTF(XVID_DEBUG_HEADER, "vert_mc_ref %i\n", vert_mc_ref);
				}

				BitstreamSkip(bs, 1);	/* change_conv_ratio_disable */
				if (BitstreamGetBit(bs))	/* vop_constant_alpha */
				{
					BitstreamSkip(bs, 8);	/* vop_constant_alpha_value */
				}
			}

			if (dec->shape != VIDOBJLAY_SHAPE_BINARY_ONLY) {

				if (!dec->complexity_estimation_disable)
				{
					read_vop_complexity_estimation_header(bs, dec, coding_type);
				}

				/* intra_dc_vlc_threshold */
				*intra_dc_threshold =
					intra_dc_threshold_table[BitstreamGetBits(bs, 3)];

				dec->top_field_first = 0;
				dec->alternate_vertical_scan = 0;

				if (dec->interlacing) {
					dec->top_field_first = BitstreamGetBit(bs);
					DPRINTF(XVID_DEBUG_HEADER, "interlace top_field_first %i\n", dec->top_field_first);
					dec->alternate_vertical_scan = BitstreamGetBit(bs);
					DPRINTF(XVID_DEBUG_HEADER, "interlace alternate_vertical_scan %i\n", dec->alternate_vertical_scan);
				}
			}

			if ((dec->sprite_enable == SPRITE_STATIC || dec->sprite_enable== SPRITE_GMC) && coding_type == S_VOP)
			{
				int i;

				for (i = 0 ; i < dec->sprite_warping_points; i++)
				{
					int length;
					int x = 0, y = 0;

					/* sprite code borrowed from ffmpeg; thx Michael Niedermayer */
					length = bs_get_spritetrajectory(bs);
					if(length){
						x= BitstreamGetBits(bs, length);
						if ((x >> (length - 1)) == 0) /* if MSB not set it is negative*/
							x = - (x ^ ((1 << length) - 1));
					}
					READ_MARKER();

					length = bs_get_spritetrajectory(bs);
					if(length){
						y = BitstreamGetBits(bs, length);
						if ((y >> (length - 1)) == 0) /* if MSB not set it is negative*/
							y = - (y ^ ((1 << length) - 1));
					}
					READ_MARKER();

					gmc_warp->duv[i].x = x;
					gmc_warp->duv[i].y = y;

					DPRINTF(XVID_DEBUG_HEADER,"sprite_warping_point[%i] xy=(%i,%i)\n", i, x, y);
				}

				if (dec->sprite_brightness_change)
				{
					/* XXX: brightness_change_factor() */
				}
				if (dec->sprite_enable ==
					SPRITE_STATIC)
				{
					/* XXX: todo */
				}

			}

			if ((*quant = BitstreamGetBits(bs, dec->quant_bits)) < 1)	/* vop_quant */
				*quant = 1;
			DPRINTF(XVID_DEBUG_HEADER, "quant %i\n", *quant);

			if (coding_type != I_VOP) {
				*fcode_forward = BitstreamGetBits(bs, 3);	/* fcode_forward */
				DPRINTF(XVID_DEBUG_HEADER, "fcode_forward %i\n", *fcode_forward);
			}

			if (coding_type == B_VOP) {
				*fcode_backward = BitstreamGetBits(bs, 3);	/* fcode_backward */
				DPRINTF(XVID_DEBUG_HEADER, "fcode_backward %i\n", *fcode_backward);
			}
			if (!dec->scalability) {
				if ((dec->shape != VIDOBJLAY_SHAPE_RECTANGULAR) &&
					(coding_type != I_VOP)) {
					BitstreamSkip(bs, 1);	/* vop_shape_coding_type */
				}
			}
			return coding_type;

		} else if (start_code == USERDATA_START_CODE) {
			char tmp[256];
			/* version/build start at 0: the build check at the end of this
			 * branch also runs when neither DivX pattern matched */
			int i, version = 0, build = 0;
			char packed;

			BitstreamSkip(bs, 32);	/* user_data_start_code */

			memset(tmp, 0, 256);
			tmp[0] = BitstreamShowBits(bs, 8);

			for(i = 1; i < 256; i++){
				tmp[i] = (BitstreamShowBits(bs, 16) & 0xFF);

				if(tmp[i] == 0)
					break;

				BitstreamSkip(bs, 8);
			}

			DPRINTF(XVID_DEBUG_STARTCODE, ": %s\n", tmp);

			/* read xvid bitstream version */
			if(strncmp(tmp, "XviD", 4) == 0) {
				if (tmp[strlen(tmp)-1] == 'C') {
					sscanf(tmp, "XviD%dC", &dec->bs_version);
					dec->cartoon_mode = 1;
				}
				else
					sscanf(tmp, "XviD%d", &dec->bs_version);

				DPRINTF(XVID_DEBUG_HEADER, "xvid bitstream version=%i\n", dec->bs_version);
			}

			/* divx detection */
			i = sscanf(tmp, "DivX%dBuild%d%c", &version, &build, &packed);
			if (i < 2)
				i = sscanf(tmp, "DivX%db%d%c", &version, &build, &packed);

			if (i >= 2)
			{
				dec->packed_mode = (i == 3 && packed == 'p');
				DPRINTF(XVID_DEBUG_HEADER, "divx version=%i, build=%i packed=%i\n",
						version, build, dec->packed_mode);
			}

			if ((dec->bs_version == 0) && (build > 0) && (build != 1393))
			{
				/* non-xvid stream with xvid fourcc */
				dec->bs_version = 0xffff;
			}

		} else					/* start_code == ?
*/
		{
			if (BitstreamShowBits(bs, 24) == 0x000001) {
				/* the format string needs a conversion for the start code argument */
				DPRINTF(XVID_DEBUG_STARTCODE, "unknown start code: %x\n", BitstreamShowBits(bs, 32));
			}
			BitstreamSkip(bs, 8);
		}
	}

#if 0
	DPRINTF("*** WARNING: no vop_start_code found");
#endif
	return -1;					/* ignore it */
}


/* write custom quant matrix */

static void
bs_put_matrix(Bitstream * bs,
			  const uint16_t * matrix)
{
	int i, j;
	const int last = matrix[scan_tables[0][63]];

	/* find the start of the trailing run of identical values (zigzag order) */
	for (j = 63; j > 0 && matrix[scan_tables[0][j - 1]] == last; j--);

	for (i = 0; i <= j; i++) {
		BitstreamPutBits(bs, matrix[scan_tables[0][i]], 8);
	}

	/* a zero byte terminates a matrix that was cut short */
	if (j < 63) {
		BitstreamPutBits(bs, 0, 8);
	}
}


/*
	write vol header
*/
void
BitstreamWriteVolHeader(Bitstream * const bs,
						const MBParam * pParam,
						const FRAMEINFO * const frame,
						const int num_slices)
{
	static const unsigned int vo_id = 0;
	static const unsigned int vol_id = 0;
	int vol_ver_id = 1;
	int vol_type_ind = VIDOBJLAY_TYPE_SIMPLE;
	int vol_profile = pParam->profile;

	if ( (pParam->vol_flags & XVID_VOL_QUARTERPEL) ||
		 (pParam->vol_flags & XVID_VOL_GMC))
		vol_ver_id = 2;

	if ((pParam->vol_flags & (XVID_VOL_MPEGQUANT|XVID_VOL_QUARTERPEL|XVID_VOL_GMC|XVID_VOL_INTERLACING)) ||
		 pParam->max_bframes>0) {
		vol_type_ind = VIDOBJLAY_TYPE_ASP;
	}

	/* visual_object_sequence_start_code */
#if 0
	BitstreamPad(bs);
#endif

	/*
	 * no padding here, anymore.
You have to make sure that you are * byte aligned, and that always 1-8 padding bits have been written */ if (!vol_profile) { /* Profile was not set by client app, use the more permissive profile * compatible with the vol_type_id */ switch(vol_type_ind) { case VIDOBJLAY_TYPE_ASP: vol_profile = 0xf5; /* ASP level 5 */ break; case VIDOBJLAY_TYPE_ART_SIMPLE: vol_profile = 0x94; /* ARTS level 4 */ break; default: vol_profile = 0x03; /* Simple level 3 */ break; } } /* Write the VOS header */ BitstreamPutBits(bs, VISOBJSEQ_START_CODE, 32); BitstreamPutBits(bs, vol_profile, 8); /* profile_and_level_indication */ /* visual_object_start_code */ BitstreamPad(bs); BitstreamPutBits(bs, VISOBJ_START_CODE, 32); BitstreamPutBits(bs, 0, 1); /* is_visual_object_identifier */ /* Video type */ BitstreamPutBits(bs, VISOBJ_TYPE_VIDEO, 4); /* visual_object_type */ BitstreamPutBit(bs, 0); /* video_signal_type */ /* video object_start_code & vo_id */ BitstreamPadAlways(bs); /* next_start_code() */ BitstreamPutBits(bs, VIDOBJ_START_CODE|(vo_id&0x5), 32); /* video_object_layer_start_code & vol_id */ BitstreamPad(bs); BitstreamPutBits(bs, VIDOBJLAY_START_CODE|(vol_id&0x4), 32); BitstreamPutBit(bs, 0); /* random_accessible_vol */ BitstreamPutBits(bs, vol_type_ind, 8); /* video_object_type_indication */ if (vol_ver_id == 1) { BitstreamPutBit(bs, 0); /* is_object_layer_identified (0=not given) */ } else { BitstreamPutBit(bs, 1); /* is_object_layer_identified */ BitstreamPutBits(bs, vol_ver_id, 4); /* vol_ver_id == 2 */ BitstreamPutBits(bs, 4, 3); /* vol_ver_priority (1==highest, 7==lowest) */ } /* Aspect ratio */ BitstreamPutBits(bs, pParam->par, 4); /* aspect_ratio_info (1=1:1) */ if(pParam->par == XVID_PAR_EXT) { BitstreamPutBits(bs, pParam->par_width, 8); BitstreamPutBits(bs, pParam->par_height, 8); } BitstreamPutBit(bs, 1); /* vol_control_parameters */ BitstreamPutBits(bs, 1, 2); /* chroma_format 1="4:2:0" */ if (pParam->max_bframes > 0) { BitstreamPutBit(bs, 0); /* low_delay */ } else { 
		BitstreamPutBit(bs, 1);	/* low_delay */
	}
	BitstreamPutBit(bs, 0);	/* vbv_parameters (0=not given) */

	BitstreamPutBits(bs, 0, 2);	/* video_object_layer_shape (0=rectangular) */

	WRITE_MARKER();

	/*
	 * time_inc_resolution; ignored by current decore versions
	 *  eg. 2fps     res=2       inc=1
	 *      25fps    res=25      inc=1
	 *      29.97fps res=30000   inc=1001
	 */
	BitstreamPutBits(bs, pParam->fbase, 16);

	WRITE_MARKER();

	if (pParam->fincr>0) {
		BitstreamPutBit(bs, 1);		/* fixed_vop_rate = 1 */
		BitstreamPutBits(bs, pParam->fincr, MAX(log2bin(pParam->fbase-1),1));	/* fixed_vop_time_increment */
	}else{
		BitstreamPutBit(bs, 0);		/* fixed_vop_rate = 0 */
	}

	WRITE_MARKER();
	BitstreamPutBits(bs, pParam->width, 13);	/* width */
	WRITE_MARKER();
	BitstreamPutBits(bs, pParam->height, 13);	/* height */
	WRITE_MARKER();

	BitstreamPutBit(bs, pParam->vol_flags & XVID_VOL_INTERLACING);	/* interlace */
	BitstreamPutBit(bs, 1);		/* obmc_disable (overlapped block motion compensation) */

	if (vol_ver_id != 1)
	{
		if ((pParam->vol_flags & XVID_VOL_GMC))
		{
			BitstreamPutBits(bs, 2, 2);	/* sprite_enable=='GMC' */
			BitstreamPutBits(bs, 3, 6);	/* no_of_sprite_warping_points */
			BitstreamPutBits(bs, 3, 2);	/* sprite_warping_accuracy 0==1/2, 1=1/4, 2=1/8, 3=1/16 */
			BitstreamPutBit(bs, 0);		/* sprite_brightness_change (not supported) */

			/*
			 * currently we use no_of_sprite_warping_points==2, sprite_warping_accuracy==3
			 * for DivX5 compatibility
			 */
		} else
			BitstreamPutBits(bs, 0, 2);	/* sprite_enable==off */
	} else
		BitstreamPutBit(bs, 0);	/* sprite_enable==off */

	BitstreamPutBit(bs, 0);	/* not_8_bit */

	/* quant_type   0=h.263  1=mpeg4(quantizer tables) */
	BitstreamPutBit(bs, pParam->vol_flags & XVID_VOL_MPEGQUANT);

	if ((pParam->vol_flags & XVID_VOL_MPEGQUANT)) {
		BitstreamPutBit(bs, is_custom_intra_matrix(pParam->mpeg_quant_matrices));	/* load_intra_quant_mat */
		if(is_custom_intra_matrix(pParam->mpeg_quant_matrices))
			bs_put_matrix(bs, get_intra_matrix(pParam->mpeg_quant_matrices));

		BitstreamPutBit(bs, is_custom_inter_matrix(pParam->mpeg_quant_matrices));
/* load_inter_quant_mat */ if(is_custom_inter_matrix(pParam->mpeg_quant_matrices)) bs_put_matrix(bs, get_inter_matrix(pParam->mpeg_quant_matrices)); } if (vol_ver_id != 1) { if ((pParam->vol_flags & XVID_VOL_QUARTERPEL)) BitstreamPutBit(bs, 1); /* quarterpel */ else BitstreamPutBit(bs, 0); /* no quarterpel */ } BitstreamPutBit(bs, 1); /* complexity_estimation_disable */ if (num_slices > 1) BitstreamPutBit(bs, 0); /* resync_marker_enabled */ else BitstreamPutBit(bs, 1); /* resync_marker_disabled */ BitstreamPutBit(bs, 0); /* data_partitioned */ if (vol_ver_id != 1) { BitstreamPutBit(bs, 0); /* newpred_enable */ BitstreamPutBit(bs, 0); /* reduced_resolution_vop_enabled */ } BitstreamPutBit(bs, 0); /* scalability */ BitstreamPadAlways(bs); /* next_start_code(); */ /* divx5 userdata string */ #define DIVX5_ID ((char *)"DivX503b1393") if ((pParam->global_flags & XVID_GLOBAL_DIVX5_USERDATA)) { BitstreamWriteUserData(bs, DIVX5_ID, (uint32_t) strlen(DIVX5_ID)); if (pParam->max_bframes > 0 && (pParam->global_flags & XVID_GLOBAL_PACKED)) BitstreamPutBits(bs, 'p', 8); } /* xvid id */ { const char xvid_user_format[] = "XviD%04d%c"; char xvid_user_data[100]; sprintf(xvid_user_data, xvid_user_format, XVID_BS_VERSION, (frame->vop_flags & XVID_VOP_CARTOON)?'C':'\0'); BitstreamWriteUserData(bs, xvid_user_data, (uint32_t) strlen(xvid_user_data)); } } /* write vop header */ void BitstreamWriteVopHeader( Bitstream * const bs, const MBParam * pParam, const FRAMEINFO * const frame, int vop_coded, unsigned int quant) { uint32_t i; #if 0 BitstreamPad(bs); #endif /* * no padding here, anymore. 
You have to make sure that you are * byte aligned, and that always 1-8 padding bits have been written */ BitstreamPutBits(bs, VOP_START_CODE, 32); BitstreamPutBits(bs, frame->coding_type, 2); #if 0 DPRINTF(XVID_DEBUG_HEADER, "coding_type = %i\n", frame->coding_type); #endif for (i = 0; i < frame->seconds; i++) { BitstreamPutBit(bs, 1); } BitstreamPutBit(bs, 0); WRITE_MARKER(); /* time_increment: value=nth_of_sec, nbits = log2(resolution) */ BitstreamPutBits(bs, frame->ticks, MAX(log2bin(pParam->fbase-1), 1)); #if 0 DPRINTF("[%i:%i] %c", frame->seconds, frame->ticks, frame->coding_type == I_VOP ? 'I' : frame->coding_type == P_VOP ? 'P' : frame->coding_type == S_VOP ? 'S' : 'B'); #endif WRITE_MARKER(); if (!vop_coded) { BitstreamPutBits(bs, 0, 1); #if 0 BitstreamPadAlways(bs); /* next_start_code() */ #endif /* NB: It's up to the function caller to write the next_start_code(). * At the moment encoder.c respects that requisite because a VOP * always ends with a next_start_code either if it's coded or not * and encoder.c terminates a frame with a next_start_code in whatever * case */ return; } BitstreamPutBits(bs, 1, 1); /* vop_coded */ if ( (frame->coding_type == P_VOP) || (frame->coding_type == S_VOP) ) BitstreamPutBits(bs, frame->rounding_type, 1); BitstreamPutBits(bs, 0, 3); /* intra_dc_vlc_threshold */ if ((frame->vol_flags & XVID_VOL_INTERLACING)) { BitstreamPutBit(bs, (frame->vop_flags & XVID_VOP_TOPFIELDFIRST)); BitstreamPutBit(bs, (frame->vop_flags & XVID_VOP_ALTERNATESCAN)); } if (frame->coding_type == S_VOP) { if (1) { /* no_of_sprite_warping_points>=1 (we use 2!) 
*/ int k; for (k=0;k<3;k++) { bs_put_spritetrajectory(bs, frame->warp.duv[k].x ); /* du[k] */ WRITE_MARKER(); bs_put_spritetrajectory(bs, frame->warp.duv[k].y ); /* dv[k] */ WRITE_MARKER(); if ((frame->vol_flags & XVID_VOL_QUARTERPEL)) { DPRINTF(XVID_DEBUG_HEADER,"sprite_warping_point[%i] xy=(%i,%i) *QPEL*\n", k, frame->warp.duv[k].x/2, frame->warp.duv[k].y/2); } else { DPRINTF(XVID_DEBUG_HEADER,"sprite_warping_point[%i] xy=(%i,%i)\n", k, frame->warp.duv[k].x, frame->warp.duv[k].y); } } } } #if 0 DPRINTF(XVID_DEBUG_HEADER, "quant = %i\n", quant); #endif BitstreamPutBits(bs, quant, 5); /* quantizer */ if (frame->coding_type != I_VOP) BitstreamPutBits(bs, frame->fcode, 3); /* forward_fixed_code */ if (frame->coding_type == B_VOP) BitstreamPutBits(bs, frame->bcode, 3); /* backward_fixed_code */ } void BitstreamWriteUserData(Bitstream * const bs, const char *data, const unsigned int length) { unsigned int i; BitstreamPad(bs); BitstreamPutBits(bs, USERDATA_START_CODE, 32); for (i = 0; i < length; i++) { BitstreamPutBits(bs, data[i], 8); } } /* * Group of VOP */ void BitstreamWriteGroupOfVopHeader(Bitstream * const bs, const MBParam * pParam, uint32_t is_closed_gov) { int64_t time = (pParam->m_stamp + (pParam->fbase/2)) / pParam->fbase; int hours, minutes, seconds; /* compute time_code */ seconds = time % 60; time /= 60; minutes = time % 60; time /= 60; hours = time % 24; /* don't overflow */ BitstreamPutBits(bs, GRPOFVOP_START_CODE, 32); BitstreamPutBits(bs, hours, 5); BitstreamPutBits(bs, minutes, 6); BitstreamPutBit(bs, 1); BitstreamPutBits(bs, seconds, 6); BitstreamPutBits(bs, is_closed_gov, 1); BitstreamPutBits(bs, 0, 1); /* broken_link */ } /* * End of Sequence */ void BitstreamWriteEndOfSequence(Bitstream * const bs) { BitstreamPadAlways(bs); BitstreamPutBits(bs, VISOBJSEQ_STOP_CODE, 32); } /* * Video Packet (resync marker) */ void write_video_packet_header(Bitstream * const bs, const MBParam * pParam, const FRAMEINFO * const frame, int mbnum) { const int 
mbnum_bits = log2bin(pParam->mb_width * pParam->mb_height - 1);
    uint32_t nbitsresyncmarker;

    if (frame->coding_type == I_VOP)
        nbitsresyncmarker = NUMBITS_VP_RESYNC_MARKER; /* 16 zeros followed by a 1. */
    else if (frame->coding_type == B_VOP) /* B_VOP */
        nbitsresyncmarker = MAX(NUMBITS_VP_RESYNC_MARKER+1,
                                NUMBITS_VP_RESYNC_MARKER + MAX(frame->fcode, frame->bcode) - 1);
    else /*(frame->coding_type == P_VOP)*/
        nbitsresyncmarker = NUMBITS_VP_RESYNC_MARKER + frame->fcode - 1;

    BitstreamPutBits(bs, RESYNC_MARKER, nbitsresyncmarker);
    BitstreamPutBits(bs, mbnum, mbnum_bits);
    BitstreamPutBits(bs, frame->quant, 5);
    BitstreamPutBit(bs, 0); /* hec */
}

xvidcore/src/bitstream/cbp.c

/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - CBP related function -
 *
 * Copyright(C) 2002-2003 Edouard Gomez
 *              2003 Christoph Lampert
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: cbp.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#include "../portab.h"
#include "cbp.h"

cbpFuncPtr calc_cbp;

/*
 * Returns a bit field that indicates which blocks of this macroblock
 * contain non-zero AC coefficients
 */

/* naive C */
uint32_t
calc_cbp_plain(const int16_t codes[6 * 64])
{
    int i, j;
    uint32_t cbp = 0;

    for (i = 0; i < 6; i++) {
        for (j=1; j<64;j++) {
            if (codes[64*i+j]) {
                cbp |= 1 << (5-i);
                break;
            }
        }
    }

    return cbp;
}

/* optimized C */
uint32_t
calc_cbp_c(const int16_t codes[6 * 64])
{
    unsigned int i=6;
    uint32_t cbp = 0;

    /* uses fixed relation: 4*codes = 1*codes64 */
    /* if prototype is changed (e.g. from int16_t to something like int32)
       this routine has to be changed! */

    do {
        uint64_t *codes64 = (uint64_t*)codes; /* the compiler doesn't really make this */
        uint32_t *codes32 = (uint32_t*)codes; /* variables, just "addressing modes" */

        cbp += cbp;
        if (codes[1] || codes32[1]) {
            cbp++;
        }
        else if (codes64[1] | codes64[2] | codes64[3]) {
            cbp++;
        }
        else if (codes64[4] | codes64[5] | codes64[6] | codes64[7]) {
            cbp++;
        }
        else if (codes64[8] | codes64[9] | codes64[10] | codes64[11]) {
            cbp++;
        }
        else if (codes64[12] | codes64[13] | codes64[14] | codes64[15]) {
            cbp++;
        }

        codes += 64;
        i--;
    } while (i != 0);

    return cbp;
}

/* older code, maybe better on some platforms?
*/
#if 0
    for (i = 5; i >= 0; i--) {
        if (codes[1] | codes[2] | codes[3])
            cbp |= 1 << i;
        else {
            for (j = 4; j <= 56; j+=4) /* [60],[61],[62],[63] are last */
                if (codes[j] | codes[j+1] | codes[j+2] | codes[j+3]) {
                    cbp |= 1 << i;
                    break;
                }
        }
        codes += 64;
    }

    return cbp;
#endif

xvidcore/src/bitstream/mbcoding.h

/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - MB coding header -
 *
 * Copyright (C) 2002 Michael Militzer
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: mbcoding.h 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#ifndef _MB_CODING_H_
#define _MB_CODING_H_

#include "../portab.h"
#include "../global.h"
#include "vlc_codes.h"
#include "bitstream.h"

void init_vlc_tables(void);
int check_resync_marker(Bitstream * bs, int addbits);

void bs_put_spritetrajectory(Bitstream * bs, const int val);
int bs_get_spritetrajectory(Bitstream * bs);

int get_mcbpc_intra(Bitstream * bs);
int get_mcbpc_inter(Bitstream * bs);
int get_cbpy(Bitstream * bs, int intra);
int get_mv(Bitstream * bs, int fcode);

int get_dc_dif(Bitstream * bs, uint32_t dc_size);
int get_dc_size_lum(Bitstream * bs);
int get_dc_size_chrom(Bitstream * bs);

void get_intra_block(Bitstream * bs, int16_t * block, int direction, int coeff);
void get_inter_block_h263(
        Bitstream * bs,
        int16_t * block,
        int direction,
        const int quant,
        const uint16_t *matrix);
void get_inter_block_mpeg(
        Bitstream * bs,
        int16_t * block,
        int direction,
        const int quant,
        const uint16_t *matrix);

void MBCodingBVOP(const FRAMEINFO * const frame,
                  const MACROBLOCK * mb,
                  const int16_t qcoeff[6 * 64],
                  const int32_t fcode,
                  const int32_t bcode,
                  Bitstream * bs,
                  Statistics * pStat);

static __inline void
MBSkip(Bitstream * bs)
{
    BitstreamPutBit(bs, 1); /* not coded */
}

int CodeCoeffIntra_CalcBits(const int16_t qcoeff[64], const uint16_t * zigzag);
int CodeCoeffInter_CalcBits(const int16_t qcoeff[64], const uint16_t * zigzag);

#endif /* _MB_CODING_H_ */

xvidcore/src/motion/

xvidcore/src/motion/sad.h

/*****************************************************************************
 *
 *
XVID MPEG-4 VIDEO CODEC * - Sum Of Absolute Difference header - * * Copyright(C) 2001-2010 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: sad.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _ENCODER_SAD_H_ #define _ENCODER_SAD_H_ #include "../portab.h" typedef void (sadInitFunc) (void); typedef sadInitFunc *sadInitFuncPtr; extern sadInitFuncPtr sadInit; sadInitFunc sadInit_altivec; typedef uint32_t(sad16Func) (const uint8_t * const cur, const uint8_t * const ref, const uint32_t stride, const uint32_t best_sad); typedef sad16Func *sad16FuncPtr; extern sad16FuncPtr sad16; sad16Func sad16_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) sad16Func sad16_mmx; sad16Func sad16_xmm; sad16Func sad16_3dne; sad16Func sad16_sse2; sad16Func sad16_sse3; #endif #ifdef ARCH_IS_IA64 sad16Func sad16_ia64; #endif #ifdef ARCH_IS_PPC sad16Func sad16_altivec_c; #endif sad16Func mrsad16_c; typedef uint32_t(sad8Func) (const uint8_t * const cur, const uint8_t * const ref, const uint32_t stride); typedef sad8Func *sad8FuncPtr; extern sad8FuncPtr sad8; sad8Func sad8_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) sad8Func sad8_mmx; sad8Func sad8_xmm; sad8Func sad8_3dne; #endif #ifdef ARCH_IS_IA64 sad8Func sad8_ia64; #endif #ifdef 
ARCH_IS_PPC sad8Func sad8_altivec_c; #endif typedef uint32_t(sad16biFunc) (const uint8_t * const cur, const uint8_t * const ref1, const uint8_t * const ref2, const uint32_t stride); typedef sad16biFunc *sad16biFuncPtr; extern sad16biFuncPtr sad16bi; sad16biFunc sad16bi_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) sad16biFunc sad16bi_mmx; sad16biFunc sad16bi_xmm; sad16biFunc sad16bi_3dne; sad16biFunc sad16bi_3dn; #endif #ifdef ARCH_IS_IA64 sad16biFunc sad16bi_ia64; #endif #ifdef ARCH_IS_PPC sad16biFunc sad16bi_altivec_c; #endif typedef uint32_t(sad8biFunc) (const uint8_t * const cur, const uint8_t * const ref1, const uint8_t * const ref2, const uint32_t stride); typedef sad8biFunc *sad8biFuncPtr; extern sad8biFuncPtr sad8bi; sad8biFunc sad8bi_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) sad8biFunc sad8bi_mmx; sad8biFunc sad8bi_xmm; sad8biFunc sad8bi_3dne; sad8biFunc sad8bi_3dn; #endif typedef uint32_t(dev16Func) (const uint8_t * const cur, const uint32_t stride); typedef dev16Func *dev16FuncPtr; extern dev16FuncPtr dev16; dev16Func dev16_c; typedef uint32_t (sad16vFunc)( const uint8_t * const cur, const uint8_t * const ref, const uint32_t stride, int32_t *sad8); typedef sad16vFunc *sad16vFuncPtr; extern sad16vFuncPtr sad16v; sad16vFunc sad16v_c; sad16vFunc sad32v_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) dev16Func dev16_mmx; dev16Func dev16_xmm; dev16Func dev16_3dne; dev16Func dev16_sse2; dev16Func dev16_sse3; sad16vFunc sad16v_xmm; sad16vFunc sad16v_mmx; #endif #ifdef ARCH_IS_IA64 dev16Func dev16_ia64; #endif #ifdef ARCH_IS_PPC dev16Func dev16_altivec_c; #endif /* This function assumes blocks use 16bit signed elements */ typedef uint32_t (sse8Func_16bit)(const int16_t * cur, const int16_t * ref, const uint32_t stride); typedef sse8Func_16bit *sse8Func_16bitPtr; extern sse8Func_16bitPtr sse8_16bit; sse8Func_16bit sse8_16bit_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) sse8Func_16bit sse8_16bit_mmx; #endif #ifdef 
ARCH_IS_PPC sse8Func_16bit sse8_16bit_altivec_c; #endif /* This function assumes blocks use 8bit *un*signed elements */ typedef uint32_t (sse8Func_8bit)(const uint8_t * cur, const uint8_t * ref, const uint32_t stride); typedef sse8Func_8bit *sse8Func_8bitPtr; extern sse8Func_8bitPtr sse8_8bit; sse8Func_8bit sse8_8bit_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) sse8Func_8bit sse8_8bit_mmx; #endif typedef uint32_t (sseh8Func_16bit)(const int16_t * cur, const int16_t * ref, uint16_t mask); typedef sseh8Func_16bit *sseh8Func_16bitPtr; extern sseh8Func_16bitPtr sseh8_16bit; sseh8Func_16bit sseh8_16bit_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) sseh8Func_16bit sseh8_16bit_sse2; #endif typedef uint32_t (coeff8_energyFunc)(const int16_t * cur); typedef coeff8_energyFunc *coeff8_energyFunc_Ptr; extern coeff8_energyFunc_Ptr coeff8_energy; coeff8_energyFunc coeff8_energy_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) coeff8_energyFunc coeff8_energy_sse2; #endif typedef uint32_t (blocksum8Func)(const uint8_t * cur, int stride, uint16_t sums[4], uint32_t squares[4]); typedef blocksum8Func *blocksum8Func_Ptr; extern blocksum8Func_Ptr blocksum8; blocksum8Func blocksum8_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) blocksum8Func blocksum8_sse2; #endif /* Coeffs for MSE_H calculation */ static const int16_t Inv_iMask_Coeff[64] = { 0, 155, 128, 328, 737, 2048, 3329, 4763, 184, 184, 251, 462, 865, 4306, 4608, 3872, 251, 216, 328, 737, 2048, 4159, 6094, 4014, 251, 370, 620, 1076, 3329, 9688, 8192, 4920, 415, 620, 1752, 4014, 5919, 15207, 13579, 7589, 737, 1568, 3872, 5243, 8398, 13844, 16345, 10834, 3073, 5243, 7787, 9688, 13579, 18741, 18433, 13057, 6636, 10834, 11552, 12294, 16056, 12800, 13579, 12545 }; static const uint16_t iCSF_Coeff[64] = { 26353, 38331, 42164, 26353, 17568, 10541, 8268, 6912, 35137, 35137, 30117, 22192, 16217, 7270, 7027, 7666, 30117, 32434, 26353, 17568, 10541, 7397, 6111, 7529, 30117, 24803, 19166, 14539, 8268, 
4846, 5271, 6801,
    23425, 19166, 11396, 7529, 6201, 3868, 4094, 5476,
    17568, 12047, 7666, 6588, 5205, 4054, 3731, 4583,
    8605, 6588, 5406, 4846, 4094, 3485, 3514, 4175,
    5856, 4583, 4438, 4302, 3765, 4216, 4094, 4259
};

static const uint16_t iCSF_Round[64] = {
    1, 1, 1, 1, 2, 3, 4, 5,
    1, 1, 1, 1, 2, 5, 5, 4,
    1, 1, 1, 2, 3, 4, 5, 4,
    1, 1, 2, 2, 4, 7, 6, 5,
    1, 2, 3, 4, 5, 8, 8, 6,
    2, 3, 4, 5, 6, 8, 9, 7,
    4, 5, 6, 7, 8, 9, 9, 8,
    6, 7, 7, 8, 9, 8, 8, 8
};

#endif /* _ENCODER_SAD_H_ */

xvidcore/src/motion/estimation_pvop.c

/*****************************************************************************
 *
 * XVID MPEG-4 VIDEO CODEC
 * - Motion Estimation for P- and S- VOPs -
 *
 * Copyright(C) 2002 Christoph Lampert
 *              2002-2010 Michael Militzer
 *              2002-2003 Radoslaw Czyz
 *
 * This program is free software ; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation ; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: estimation_pvop.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* memcpy */

#include "../encoder.h"
#include "../prediction/mbprediction.h"
#include "../global.h"
#include "../utils/timer.h"
#include "../image/interpolate8x8.h"
#include "estimation.h"
#include "motion.h"
#include "sad.h"
#include "motion_inlines.h"
#include "motion_smp.h"

static const int xvid_me_lambda_vec8[32] = {
    0,
    (int)(1.0 * NEIGH_TEND_8X8 + 0.5),
    (int)(2.0*NEIGH_TEND_8X8 + 0.5),
    (int)(3.0*NEIGH_TEND_8X8 + 0.5),
    (int)(4.0*NEIGH_TEND_8X8 + 0.5),
    (int)(5.0*NEIGH_TEND_8X8 + 0.5),
    (int)(6.0*NEIGH_TEND_8X8 + 0.5),
    (int)(7.0*NEIGH_TEND_8X8 + 0.5),
    (int)(8.0*NEIGH_TEND_8X8 + 0.5),
    (int)(9.0*NEIGH_TEND_8X8 + 0.5),
    (int)(10.0*NEIGH_TEND_8X8 + 0.5),
    (int)(11.0*NEIGH_TEND_8X8 + 0.5),
    (int)(12.0*NEIGH_TEND_8X8 + 0.5),
    (int)(13.0*NEIGH_TEND_8X8 + 0.5),
    (int)(14.0*NEIGH_TEND_8X8 + 0.5),
    (int)(15.0*NEIGH_TEND_8X8 + 0.5),
    (int)(16.0*NEIGH_TEND_8X8 + 0.5),
    (int)(17.0*NEIGH_TEND_8X8 + 0.5),
    (int)(18.0*NEIGH_TEND_8X8 + 0.5),
    (int)(19.0*NEIGH_TEND_8X8 + 0.5),
    (int)(20.0*NEIGH_TEND_8X8 + 0.5),
    (int)(21.0*NEIGH_TEND_8X8 + 0.5),
    (int)(22.0*NEIGH_TEND_8X8 + 0.5),
    (int)(23.0*NEIGH_TEND_8X8 + 0.5),
    (int)(24.0*NEIGH_TEND_8X8 + 0.5),
    (int)(25.0*NEIGH_TEND_8X8 + 0.5),
    (int)(26.0*NEIGH_TEND_8X8 + 0.5),
    (int)(27.0*NEIGH_TEND_8X8 + 0.5),
    (int)(28.0*NEIGH_TEND_8X8 + 0.5),
    (int)(29.0*NEIGH_TEND_8X8 + 0.5),
    (int)(30.0*NEIGH_TEND_8X8 + 0.5),
    (int)(31.0*NEIGH_TEND_8X8 + 0.5)
};

static void
CheckCandidate16(const int x, const int y, SearchData * const data, const unsigned int Direction)
{
    const uint8_t * Reference;
    int32_t sad, xc, yc;
    uint32_t t;
    VECTOR * current;

    if ( (x > data->max_dx) || (x < data->min_dx) ||
(y > data->max_dy) || (y < data->min_dy) ) return; if (data->qpel_precision) { /* x and y are in 1/4 precision */ Reference = xvid_me_interpolate16x16qpel(x, y, 0, data); current = data->currentQMV; xc = x/2; yc = y/2; } else { Reference = GetReference(x, y, data); current = data->currentMV; xc = x; yc = y; } sad = sad16v(data->Cur, Reference, data->iEdgedWidth, data->temp); t = d_mv_bits(x, y, data->predMV, data->iFcode, data->qpel^data->qpel_precision); sad += (data->lambda16 * t); data->temp[0] += (data->lambda8 * t); if (data->chroma) { if (sad >= data->iMinSAD[0]) goto no16; sad += xvid_me_ChromaSAD((xc >> 1) + roundtab_79[xc & 0x3], (yc >> 1) + roundtab_79[yc & 0x3], data); } if (sad < data->iMinSAD[0]) { data->iMinSAD[0] = sad; current[0].x = x; current[0].y = y; data->dir = Direction; } no16: if (data->temp[0] < data->iMinSAD[1]) { data->iMinSAD[1] = data->temp[0]; current[1].x = x; current[1].y = y; } if (data->temp[1] < data->iMinSAD[2]) { data->iMinSAD[2] = data->temp[1]; current[2].x = x; current[2].y = y; } if (data->temp[2] < data->iMinSAD[3]) { data->iMinSAD[3] = data->temp[2]; current[3].x = x; current[3].y = y; } if (data->temp[3] < data->iMinSAD[4]) { data->iMinSAD[4] = data->temp[3]; current[4].x = x; current[4].y = y; } } static void CheckCandidate8(const int x, const int y, SearchData * const data, const unsigned int Direction) { int32_t sad; uint32_t t; const uint8_t * Reference; VECTOR * current; if ( (x > data->max_dx) || (x < data->min_dx) || (y > data->max_dy) || (y < data->min_dy) ) return; if (!data->qpel_precision) { Reference = GetReference(x, y, data); current = data->currentMV; } else { /* x and y are in 1/4 precision */ Reference = xvid_me_interpolate8x8qpel(x, y, 0, 0, data); current = data->currentQMV; } sad = sad8(data->Cur, Reference, data->iEdgedWidth); t = d_mv_bits(x, y, data->predMV, data->iFcode, data->qpel^data->qpel_precision); sad += (data->lambda8 * t); if (sad < *(data->iMinSAD)) { *(data->iMinSAD) = sad; current->x = 
x; current->y = y; data->dir = Direction; } } int xvid_me_SkipDecisionP(const IMAGE * current, const IMAGE * reference, const int x, const int y, const uint32_t stride, const uint32_t iQuant) { int offset = (x + y*stride)*8; uint32_t sadC = sad8(current->u + offset, reference->u + offset, stride); if (sadC > iQuant * MAX_CHROMA_SAD_FOR_SKIP) return 0; sadC += sad8(current->v + offset, reference->v + offset, stride); if (sadC > iQuant * MAX_CHROMA_SAD_FOR_SKIP) return 0; return 1; } /* * pmv are filled with: * [0]: Median (or whatever is correct in a special case) * [1]: left neighbour * [2]: top neighbour * [3]: topright neighbour * psad are filled with: * [0]: minimum of [1] to [3] * [1]: left neighbour's SAD (NB:[1] to [3] are actually not needed) * [2]: top neighbour's SAD * [3]: topright neighbour's SAD */ static __inline void get_pmvdata2(const MACROBLOCK * const mbs, const int mb_width, const int bound, const int x, const int y, VECTOR * const pmv, int32_t * const psad) { int lx, ly, lz; /* left */ int tx, ty, tz; /* top */ int rx, ry, rz; /* top-right */ int lpos, tpos, rpos; int num_cand = 0, last_cand = 1; lx = x - 1; ly = y; lz = 1; tx = x; ty = y - 1; tz = 2; rx = x + 1; ry = y - 1; rz = 2; lpos = lx + ly * mb_width; rpos = rx + ry * mb_width; tpos = tx + ty * mb_width; if (lpos >= bound && lx >= 0) { num_cand++; last_cand = 1; pmv[1] = mbs[lpos].mvs[lz]; psad[1] = mbs[lpos].sad8[lz]; } else { pmv[1] = zeroMV; psad[1] = MV_MAX_ERROR; } if (tpos >= bound) { num_cand++; last_cand = 2; pmv[2]= mbs[tpos].mvs[tz]; psad[2] = mbs[tpos].sad8[tz]; } else { pmv[2] = zeroMV; psad[2] = MV_MAX_ERROR; } if (rpos >= bound && rx < mb_width) { num_cand++; last_cand = 3; pmv[3] = mbs[rpos].mvs[rz]; psad[3] = mbs[rpos].sad8[rz]; } else { pmv[3] = zeroMV; psad[3] = MV_MAX_ERROR; } /* original pmvdata() compatibility hack */ if (x == 0 && y == 0) { pmv[0] = pmv[1] = pmv[2] = pmv[3] = zeroMV; psad[0] = 0; psad[1] = psad[2] = psad[3] = MV_MAX_ERROR; return; } /* if only one 
valid candidate predictor, the invalid candidates are set to the candidate */
    if (num_cand == 1) {
        pmv[0] = pmv[last_cand];
        psad[0] = psad[last_cand];
        return;
    }

    if ((MVequal(pmv[1], pmv[2])) && (MVequal(pmv[1], pmv[3]))) {
        pmv[0] = pmv[1];
        psad[0] = MIN(MIN(psad[1], psad[2]), psad[3]);
        return;
    }

    /* set median, minimum */
    pmv[0].x =
        MIN(MAX(pmv[1].x, pmv[2].x),
            MIN(MAX(pmv[2].x, pmv[3].x), MAX(pmv[1].x, pmv[3].x)));
    pmv[0].y =
        MIN(MAX(pmv[1].y, pmv[2].y),
            MIN(MAX(pmv[2].y, pmv[3].y), MAX(pmv[1].y, pmv[3].y)));

    psad[0] = MIN(MIN(psad[1], psad[2]), psad[3]);
}

static void
ModeDecision_SAD(SearchData * const Data,
                 MACROBLOCK * const pMB,
                 const MACROBLOCK * const pMBs,
                 const int x, const int y,
                 const MBParam * const pParam,
                 const uint32_t MotionFlags,
                 const uint32_t VopFlags,
                 const uint32_t VolFlags,
                 const IMAGE * const pCurrent,
                 const IMAGE * const pRef,
                 const IMAGE * const vGMC,
                 const int coding_type,
                 const int skip_sad)
{
    int mode = MODE_INTER;
    int mcsel = 0;
    int inter4v = (VopFlags & XVID_VOP_INTER4V) && (pMB->dquant == 0);
    const uint32_t iQuant = pMB->quant;
    const int skip_possible = (coding_type == P_VOP) && (pMB->dquant == 0);
    int sad;
    int InterBias = MV16_INTER_BIAS;

    pMB->mcsel = 0;

    if (inter4v == 0 || Data->iMinSAD[0] < Data->iMinSAD[1] + Data->iMinSAD[2] +
        Data->iMinSAD[3] + Data->iMinSAD[4] + IMV16X16 * (int32_t)iQuant) {
        mode = MODE_INTER;
        sad = Data->iMinSAD[0];
    } else {
        mode = MODE_INTER4V;
        sad = Data->iMinSAD[1] + Data->iMinSAD[2] +
              Data->iMinSAD[3] + Data->iMinSAD[4] + IMV16X16 * (int32_t)iQuant;
        Data->iMinSAD[0] = sad;
    }

    /* final skip decision, a.k.a. "the vector you found, really that good?"
*/ if (skip_possible && (skip_sad < (int)iQuant * MAX_SAD00_FOR_SKIP)) if ( (100*skip_sad)/(pMB->sad16+1) < FINAL_SKIP_THRESH) if (Data->chroma || xvid_me_SkipDecisionP(pCurrent, pRef, x, y, Data->iEdgedWidth/2, iQuant)) { mode = MODE_NOT_CODED; sad = 0; } /* mcsel */ if (coding_type == S_VOP) { int32_t iSAD = sad16(Data->Cur, vGMC->y + 16*y*Data->iEdgedWidth + 16*x, Data->iEdgedWidth, 65536); if (Data->chroma) { iSAD += sad8(Data->CurU, vGMC->u + 8*y*(Data->iEdgedWidth/2) + 8*x, Data->iEdgedWidth/2); iSAD += sad8(Data->CurV, vGMC->v + 8*y*(Data->iEdgedWidth/2) + 8*x, Data->iEdgedWidth/2); } if (iSAD <= sad) { /* mode decision GMC */ mode = MODE_INTER; mcsel = 1; sad = iSAD; } } /* intra decision */ if (iQuant > 10) InterBias += 60 * (iQuant - 10); /* to make high quants work */ if (y != 0) if ((pMB - pParam->mb_width)->mode == MODE_INTRA ) InterBias -= 80; if (x != 0) if ((pMB - 1)->mode == MODE_INTRA ) InterBias -= 80; if (Data->chroma) InterBias += 50; /* dev8(chroma) ??? <-- yes, we need dev8 (no big difference though) */ if (InterBias < sad) { int32_t deviation = dev16(Data->Cur, Data->iEdgedWidth); if (deviation < (sad - InterBias)) mode = MODE_INTRA; } pMB->cbp = 63; pMB->sad16 = pMB->sad8[0] = pMB->sad8[1] = pMB->sad8[2] = pMB->sad8[3] = sad; if (mode == MODE_INTER && mcsel == 0) { pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = Data->currentMV[0]; if(Data->qpel) { pMB->qmvs[0] = pMB->qmvs[1] = pMB->qmvs[2] = pMB->qmvs[3] = Data->currentQMV[0]; pMB->pmvs[0].x = Data->currentQMV[0].x - Data->predMV.x; pMB->pmvs[0].y = Data->currentQMV[0].y - Data->predMV.y; } else { pMB->pmvs[0].x = Data->currentMV[0].x - Data->predMV.x; pMB->pmvs[0].y = Data->currentMV[0].y - Data->predMV.y; } } else if (mode == MODE_INTER ) { /* but mcsel == 1 */ pMB->mcsel = 1; if (Data->qpel) { pMB->qmvs[0] = pMB->qmvs[1] = pMB->qmvs[2] = pMB->qmvs[3] = pMB->amv; pMB->mvs[0].x = pMB->mvs[1].x = pMB->mvs[2].x = pMB->mvs[3].x = pMB->amv.x/2; pMB->mvs[0].y = pMB->mvs[1].y = 
pMB->mvs[2].y = pMB->mvs[3].y = pMB->amv.y/2; } else pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = pMB->amv; } else if (mode == MODE_INTER4V) ; /* anything here? */ else /* INTRA, NOT_CODED */ ZeroMacroblockP(pMB, 0); pMB->mode = mode; } static __inline void PreparePredictionsP(VECTOR * const pmv, int x, int y, int iWcount, int iHcount, const MACROBLOCK * const prevMB) { if ( (y != 0) && (x < (iWcount-1)) ) { /* [5] top-right neighbour */ pmv[5].x = EVEN(pmv[3].x); pmv[5].y = EVEN(pmv[3].y); } else pmv[5].x = pmv[5].y = 0; if (x != 0) { pmv[3].x = EVEN(pmv[1].x); pmv[3].y = EVEN(pmv[1].y); }/* pmv[3] is left neighbour */ else pmv[3].x = pmv[3].y = 0; if (y != 0) { pmv[4].x = EVEN(pmv[2].x); pmv[4].y = EVEN(pmv[2].y); }/* [4] top neighbour */ else pmv[4].x = pmv[4].y = 0; /* [1] median prediction */ pmv[1].x = EVEN(pmv[0].x); pmv[1].y = EVEN(pmv[0].y); pmv[0].x = pmv[0].y = 0; /* [0] is zero; not used in the loop (checked before) but needed here for make_mask */ pmv[2].x = EVEN(prevMB->mvs[0].x); /* [2] is last frame */ pmv[2].y = EVEN(prevMB->mvs[0].y); if ((x < iWcount-1) && (y < iHcount-1)) { pmv[6].x = EVEN((prevMB+1+iWcount)->mvs[0].x); /* [6] right-down neighbour in last frame */ pmv[6].y = EVEN((prevMB+1+iWcount)->mvs[0].y); } else pmv[6].x = pmv[6].y = 0; } static void Search8(SearchData * const OldData, const int x, const int y, const uint32_t MotionFlags, const MBParam * const pParam, MACROBLOCK * const pMB, const MACROBLOCK * const pMBs, const int block, SearchData * const Data, const int bound) { int i = 0; VECTOR vbest_q; int32_t sbest_q; *Data->iMinSAD = *(OldData->iMinSAD + 1 + block); *Data->currentMV = *(OldData->currentMV + 1 + block); *Data->currentQMV = *(OldData->currentQMV + 1 + block); if(Data->qpel) { Data->predMV = get_qpmv2(pMBs, pParam->mb_width, bound, x/2, y/2, block); if (block != 0) i = d_mv_bits( Data->currentQMV->x, Data->currentQMV->y, Data->predMV, Data->iFcode, 0); } else { Data->predMV = get_pmv2(pMBs, pParam->mb_width, 
bound, x/2, y/2, block); if (block != 0) i = d_mv_bits( Data->currentMV->x, Data->currentMV->y, Data->predMV, Data->iFcode, 0); } *(Data->iMinSAD) += (Data->lambda8 * i); if (MotionFlags & (XVID_ME_EXTSEARCH8|XVID_ME_HALFPELREFINE8|XVID_ME_QUARTERPELREFINE8)) { vbest_q = Data->currentQMV[0]; sbest_q = Data->iMinSAD[0]; Data->RefP[0] = OldData->RefP[0] + 8 * ((block&1) + Data->iEdgedWidth*(block>>1)); Data->RefP[1] = OldData->RefP[1] + 8 * ((block&1) + Data->iEdgedWidth*(block>>1)); Data->RefP[2] = OldData->RefP[2] + 8 * ((block&1) + Data->iEdgedWidth*(block>>1)); Data->RefP[3] = OldData->RefP[3] + 8 * ((block&1) + Data->iEdgedWidth*(block>>1)); Data->Cur = OldData->Cur + 8 * ((block&1) + Data->iEdgedWidth*(block>>1)); Data->qpel_precision = 0; get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 3, pParam->width, pParam->height, Data->iFcode - Data->qpel, 1); if (MotionFlags & XVID_ME_EXTSEARCH8 && (!(MotionFlags & XVID_ME_EXTSEARCH_RD))) { MainSearchFunc *MainSearchPtr; if (MotionFlags & XVID_ME_USESQUARES8) MainSearchPtr = xvid_me_SquareSearch; else if (MotionFlags & XVID_ME_ADVANCEDDIAMOND8) MainSearchPtr = xvid_me_AdvDiamondSearch; else MainSearchPtr = xvid_me_DiamondSearch; MainSearchPtr(Data->currentMV->x, Data->currentMV->y, Data, 255, CheckCandidate8); } if(!Data->qpel) { /* halfpel mode */ if (MotionFlags & XVID_ME_HALFPELREFINE8) /* perform halfpel refine of current best vector */ xvid_me_SubpelRefine(Data->currentMV[0], Data, CheckCandidate8, 0); } else { /* qpel mode */ Data->currentQMV->x = 2*Data->currentMV->x; Data->currentQMV->y = 2*Data->currentMV->y; if(MotionFlags & XVID_ME_FASTREFINE8) { /* fast */ get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 3, pParam->width, pParam->height, Data->iFcode, 2); FullRefine_Fast(Data, CheckCandidate8, 0); } else if(MotionFlags & XVID_ME_QUARTERPELREFINE8) { /* full */ if (MotionFlags & XVID_ME_HALFPELREFINE8) { xvid_me_SubpelRefine(Data->currentMV[0], Data, 
CheckCandidate8, 0); /* hpel part */ Data->currentQMV->x = 2*Data->currentMV->x; Data->currentQMV->y = 2*Data->currentMV->y; } get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 3, pParam->width, pParam->height, Data->iFcode, 2); Data->qpel_precision = 1; xvid_me_SubpelRefine(Data->currentQMV[0], Data, CheckCandidate8, 0); /* qpel part */ } } if (sbest_q <= Data->iMinSAD[0]) /* we have not found a better match */ Data->currentQMV[0] = vbest_q; } if(Data->qpel) { pMB->pmvs[block].x = Data->currentQMV->x - Data->predMV.x; pMB->pmvs[block].y = Data->currentQMV->y - Data->predMV.y; pMB->qmvs[block] = *Data->currentQMV; } else { pMB->pmvs[block].x = Data->currentMV->x - Data->predMV.x; pMB->pmvs[block].y = Data->currentMV->y - Data->predMV.y; } *(OldData->iMinSAD + 1 + block) = *Data->iMinSAD; *(OldData->currentMV + 1 + block) = *Data->currentMV; *(OldData->currentQMV + 1 + block) = *Data->currentQMV; pMB->mvs[block] = *Data->currentMV; pMB->sad8[block] = 4 * *Data->iMinSAD; } static void SearchP(const IMAGE * const pRef, const uint8_t * const pRefH, const uint8_t * const pRefV, const uint8_t * const pRefHV, const IMAGE * const pCur, const int x, const int y, const uint32_t MotionFlags, const uint32_t VopFlags, SearchData * const Data, const MBParam * const pParam, const MACROBLOCK * const pMBs, const MACROBLOCK * const prevMBs, MACROBLOCK * const pMB, const int bound) { int i, threshA; VECTOR pmv[7]; int inter4v = (VopFlags & XVID_VOP_INTER4V) && (pMB->dquant == 0); CheckFunc * CheckCandidate; get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode - Data->qpel, 1); get_pmvdata2(pMBs, pParam->mb_width, bound, x, y, pmv, Data->temp); Data->chromaX = Data->chromaY = 0; /* chroma-sad cache */ Data->Cur = pCur->y + (x + y * Data->iEdgedWidth) * 16; Data->CurV = pCur->v + (x + y * (Data->iEdgedWidth/2)) * 8; Data->CurU = pCur->u + (x + y * (Data->iEdgedWidth/2)) * 8; Data->RefP[0] = 
pRef->y + (x + Data->iEdgedWidth*y) * 16; Data->RefP[2] = pRefH + (x + Data->iEdgedWidth*y) * 16; Data->RefP[1] = pRefV + (x + Data->iEdgedWidth*y) * 16; Data->RefP[3] = pRefHV + (x + Data->iEdgedWidth*y) * 16; Data->RefP[4] = pRef->u + (x + y * (Data->iEdgedWidth/2)) * 8; Data->RefP[5] = pRef->v + (x + y * (Data->iEdgedWidth/2)) * 8; Data->lambda16 = xvid_me_lambda_vec16[pMB->quant]; Data->lambda8 = xvid_me_lambda_vec8[pMB->quant]; Data->qpel_precision = 0; Data->dir = 0; memset(Data->currentMV, 0, 5*sizeof(VECTOR)); if (Data->qpel) Data->predMV = get_qpmv2(pMBs, pParam->mb_width, bound, x, y, 0); else Data->predMV = pmv[0]; i = d_mv_bits(0, 0, Data->predMV, Data->iFcode, 0); Data->iMinSAD[0] = pMB->sad16 + (Data->lambda16 * i); Data->iMinSAD[1] = pMB->sad8[0] + (Data->lambda8 * i); Data->iMinSAD[2] = pMB->sad8[1]; Data->iMinSAD[3] = pMB->sad8[2]; Data->iMinSAD[4] = pMB->sad8[3]; if ((!(VopFlags & XVID_VOP_MODEDECISION_RD)) && (x | y)) { threshA = Data->temp[0]; /* that's where we keep this SAD atm */ if (threshA < 512) threshA = 512; else if (threshA > 1024) threshA = 1024; } else threshA = 512; PreparePredictionsP(pmv, x, y, pParam->mb_width, pParam->mb_height, prevMBs + x + y * pParam->mb_width); if (inter4v) CheckCandidate = CheckCandidate16; else CheckCandidate = CheckCandidate16no4v; /* for extra speed */ /* main loop. 
checking all predictions (except the first, which is 0,0 and has already been checked in MotionEstimation()) */

    for (i = 1; i < 7; i++)
        if (!vector_repeats(pmv, i)) {
            CheckCandidate(pmv[i].x, pmv[i].y, Data, i);
            if (Data->iMinSAD[0] <= threshA) { i++; break; }
        }

    if ((Data->iMinSAD[0] <= threshA) ||
        (MVequal(Data->currentMV[0], (prevMBs+x+y*pParam->mb_width)->mvs[0]) &&
         (Data->iMinSAD[0] < (prevMBs+x+y*pParam->mb_width)->sad16)))
        inter4v = 0;
    else {
        MainSearchFunc * MainSearchPtr;
        int mask = make_mask(pmv, i, Data->dir); /* all vectors pmv[0..i-1] have been checked */

        if (MotionFlags & XVID_ME_USESQUARES16) MainSearchPtr = xvid_me_SquareSearch;
        else if (MotionFlags & XVID_ME_ADVANCEDDIAMOND16) MainSearchPtr = xvid_me_AdvDiamondSearch;
        else MainSearchPtr = xvid_me_DiamondSearch;

        MainSearchPtr(Data->currentMV->x, Data->currentMV->y, Data, mask, CheckCandidate);

        /* extended search, diamond starting in 0,0 and in prediction.
           note that this search is/might be done in halfpel positions,
           which makes it differ more from the diamond above */

        if (MotionFlags & XVID_ME_EXTSEARCH16) {
            int32_t bSAD;
            VECTOR startMV = Data->predMV, backupMV = Data->currentMV[0];
            if (Data->qpel) {
                startMV.x /= 2;
                startMV.y /= 2;
            }
            if (!(MVequal(startMV, backupMV))) {
                bSAD = Data->iMinSAD[0];
                Data->iMinSAD[0] = MV_MAX_ERROR;

                CheckCandidate(startMV.x, startMV.y, Data, 255);
                xvid_me_DiamondSearch(startMV.x, startMV.y, Data, 255, CheckCandidate);
                if (bSAD < Data->iMinSAD[0]) {
                    Data->currentMV[0] = backupMV;
                    Data->iMinSAD[0] = bSAD;
                }
            }

            backupMV = Data->currentMV[0];
            startMV.x = startMV.y = 1;
            if (!(MVequal(startMV, backupMV))) {
                bSAD = Data->iMinSAD[0];
                Data->iMinSAD[0] = MV_MAX_ERROR;

                CheckCandidate(startMV.x, startMV.y, Data, 255);
                xvid_me_DiamondSearch(startMV.x, startMV.y, Data, 255, CheckCandidate);
                if (bSAD < Data->iMinSAD[0]) {
                    Data->currentMV[0] = backupMV;
                    Data->iMinSAD[0] = bSAD;
                }
            }
        }
    }

    if(!Data->qpel) {
        /* halfpel mode */
        if (MotionFlags & XVID_ME_HALFPELREFINE16)
xvid_me_SubpelRefine(Data->currentMV[0], Data, CheckCandidate, 0); } else { /* qpel mode */ for(i = 0; i < 5; i++) { Data->currentQMV[i].x = 2 * Data->currentMV[i].x; /* initialize qpel vectors */ Data->currentQMV[i].y = 2 * Data->currentMV[i].y; } if(MotionFlags & XVID_ME_FASTREFINE16 && MotionFlags & XVID_ME_QUARTERPELREFINE16) { /* fast */ get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode, 2); FullRefine_Fast(Data, CheckCandidate, 0); } else { if(MotionFlags & (XVID_ME_QUARTERPELREFINE16 | XVID_ME_QUARTERPELREFINE16_RD)) { /* full */ if (MotionFlags & XVID_ME_HALFPELREFINE16) { xvid_me_SubpelRefine(Data->currentMV[0], Data, CheckCandidate, 0); /* hpel part */ for(i = 0; i < 5; i++) { Data->currentQMV[i].x = 2 * Data->currentMV[i].x; Data->currentQMV[i].y = 2 * Data->currentMV[i].y; } } get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode, 2); Data->qpel_precision = 1; if(MotionFlags & XVID_ME_QUARTERPELREFINE16) xvid_me_SubpelRefine(Data->currentQMV[0], Data, CheckCandidate, 0); /* qpel part */ } } } if (Data->iMinSAD[0] < (int32_t)pMB->quant * 30 * ((MotionFlags & XVID_ME_FASTREFINE16) ? 8 : 1)) inter4v = 0; if (inter4v) { SearchData Data8; memcpy(&Data8, Data, sizeof(SearchData)); /* quick copy of common data */ Search8(Data, 2*x, 2*y, MotionFlags, pParam, pMB, pMBs, 0, &Data8, bound); Search8(Data, 2*x + 1, 2*y, MotionFlags, pParam, pMB, pMBs, 1, &Data8, bound); Search8(Data, 2*x, 2*y + 1, MotionFlags, pParam, pMB, pMBs, 2, &Data8, bound); Search8(Data, 2*x + 1, 2*y + 1, MotionFlags, pParam, pMB, pMBs, 3, &Data8, bound); if ((Data->chroma) && (!(VopFlags & XVID_VOP_MODEDECISION_RD))) { /* chroma is only used for comparison to INTER. 
if the comparison will be done in RD domain, it will not be used */ int sumx = 0, sumy = 0; if (Data->qpel) for (i = 1; i < 5; i++) { sumx += Data->currentQMV[i].x/2; sumy += Data->currentQMV[i].y/2; } else for (i = 1; i < 5; i++) { sumx += Data->currentMV[i].x; sumy += Data->currentMV[i].y; } Data->iMinSAD[1] += xvid_me_ChromaSAD((sumx >> 3) + roundtab_76[sumx & 0xf], (sumy >> 3) + roundtab_76[sumy & 0xf], Data); } } else Data->iMinSAD[1] = 4096*256; } static int InitialSkipDecisionP(int sad00, const MBParam * pParam, const FRAMEINFO * current, MACROBLOCK * pMB, const MACROBLOCK * prevMB, int x, int y, const SearchData * Data, const IMAGE * const pGMC, const IMAGE * const pCurrent, const IMAGE * const pRef, const uint32_t MotionFlags, const int bound) { const unsigned int iEdgedWidth = pParam->edged_width; int skip_thresh = INITIAL_SKIP_THRESH * \ (current->vop_flags & XVID_VOP_MODEDECISION_RD ? 2:1); int stat_thresh = 0; /* initial skip decision */ if (current->coding_type != S_VOP) { /* no fast SKIP for S(GMC)-VOPs */ if (pMB->dquant == 0 && sad00 < pMB->quant * skip_thresh) if (Data->chroma || xvid_me_SkipDecisionP(pCurrent, pRef, x, y, iEdgedWidth/2, pMB->quant)) { ZeroMacroblockP(pMB, sad00); pMB->mode = MODE_NOT_CODED; return 1; } } if(MotionFlags & XVID_ME_DETECT_STATIC_MOTION) { VECTOR *cmpMV; VECTOR staticMV = { 0, 0 }; const MACROBLOCK * pMBs = current->mbs; if (current->coding_type == S_VOP) cmpMV = &pMB->amv; else cmpMV = &staticMV; if(x > 0 && y > 0 && x < (int) pParam->mb_width) { if(MVequal((&pMBs[(x-1) + y * pParam->mb_width])->mvs[0], *cmpMV) && MVequal((&pMBs[x + (y-1) * pParam->mb_width])->mvs[0], *cmpMV) && MVequal((&pMBs[(x+1) + (y-1) * pParam->mb_width])->mvs[0], *cmpMV) && MVequal(prevMB->mvs[0], *cmpMV)) { stat_thresh = MAX((&pMBs[(x-1) + y * pParam->mb_width])->sad16, MAX((&pMBs[x + (y-1) * pParam->mb_width])->sad16, MAX((&pMBs[(x+1) + (y-1) * pParam->mb_width])->sad16, prevMB->sad16))); } else { stat_thresh = MIN((&pMBs[(x-1) + y * 
pParam->mb_width])->sad16, MIN((&pMBs[x + (y-1) * pParam->mb_width])->sad16, MIN((&pMBs[(x+1) + (y-1) * pParam->mb_width])->sad16, prevMB->sad16))); } } } /* favor (0,0) or global vector for cartoons */ if (current->vop_flags & XVID_VOP_CARTOON) { if (current->coding_type == S_VOP) { int32_t iSAD = sad16(pCurrent->y + (x + y * iEdgedWidth) * 16, pGMC->y + 16*y*iEdgedWidth + 16*x, iEdgedWidth, 65536); if (Data->chroma) { iSAD += sad8(pCurrent->u + x*8 + y*(iEdgedWidth/2)*8, pGMC->u + 8*y*(iEdgedWidth/2) + 8*x, iEdgedWidth/2); iSAD += sad8(pCurrent->v + (x + y*(iEdgedWidth/2))*8, pGMC->v + 8*y*(iEdgedWidth/2) + 8*x, iEdgedWidth/2); } if (iSAD <= stat_thresh) { /* mode decision GMC */ pMB->mode = MODE_INTER; pMB->sad16 = pMB->sad8[0] = pMB->sad8[1] = pMB->sad8[2] = pMB->sad8[3] = iSAD; pMB->mcsel = 1; if (Data->qpel) { pMB->qmvs[0] = pMB->qmvs[1] = pMB->qmvs[2] = pMB->qmvs[3] = pMB->amv; pMB->mvs[0].x = pMB->mvs[1].x = pMB->mvs[2].x = pMB->mvs[3].x = pMB->amv.x/2; pMB->mvs[0].y = pMB->mvs[1].y = pMB->mvs[2].y = pMB->mvs[3].y = pMB->amv.y/2; } else pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = pMB->amv; return 1; } } else if (sad00 < stat_thresh) { VECTOR predMV; if (Data->qpel) predMV = get_qpmv2(current->mbs, pParam->mb_width, bound, x, y, 0); else predMV = get_pmv2(current->mbs, pParam->mb_width, bound, x, y, 0); ZeroMacroblockP(pMB, sad00); pMB->cbp = 0x3f; pMB->pmvs[0].x = - predMV.x; pMB->pmvs[0].y = - predMV.y; return 1; } } return 0; } static __inline uint32_t MakeGoodMotionFlags(const uint32_t MotionFlags, const uint32_t VopFlags, const uint32_t VolFlags) { uint32_t Flags = MotionFlags; if (!(VopFlags & XVID_VOP_MODEDECISION_RD)) Flags &= ~(XVID_ME_QUARTERPELREFINE16_RD+XVID_ME_QUARTERPELREFINE8_RD+XVID_ME_HALFPELREFINE16_RD+XVID_ME_HALFPELREFINE8_RD+XVID_ME_EXTSEARCH_RD); if (Flags & XVID_ME_EXTSEARCH_RD) Flags |= XVID_ME_HALFPELREFINE16_RD; if (Flags & XVID_ME_EXTSEARCH_RD && MotionFlags & XVID_ME_EXTSEARCH8) Flags |= XVID_ME_HALFPELREFINE8_RD; 
if (Flags & XVID_ME_HALFPELREFINE16_RD) Flags |= XVID_ME_QUARTERPELREFINE16_RD; if (Flags & XVID_ME_HALFPELREFINE8_RD) { Flags |= XVID_ME_QUARTERPELREFINE8_RD; Flags &= ~XVID_ME_HALFPELREFINE8; } if (Flags & XVID_ME_QUARTERPELREFINE8_RD) Flags &= ~XVID_ME_QUARTERPELREFINE8; if (Flags & XVID_ME_QUARTERPELREFINE16_RD) Flags &= ~XVID_ME_QUARTERPELREFINE16; if (!(VolFlags & XVID_VOL_QUARTERPEL)) Flags &= ~(XVID_ME_QUARTERPELREFINE16+XVID_ME_QUARTERPELREFINE8+XVID_ME_QUARTERPELREFINE16_RD+XVID_ME_QUARTERPELREFINE8_RD); if (!(VopFlags & XVID_VOP_HALFPEL)) Flags &= ~(XVID_ME_EXTSEARCH16+XVID_ME_HALFPELREFINE16+XVID_ME_HALFPELREFINE8+XVID_ME_HALFPELREFINE16_RD+XVID_ME_HALFPELREFINE8_RD); if (VopFlags & XVID_VOP_GREYSCALE) Flags &= ~(XVID_ME_CHROMA_PVOP + XVID_ME_CHROMA_BVOP); if (Flags & XVID_ME_FASTREFINE8) Flags &= ~XVID_ME_HALFPELREFINE8_RD; if (Flags & XVID_ME_FASTREFINE16) Flags &= ~XVID_ME_HALFPELREFINE16_RD; return Flags; } static __inline void motionStatsPVOP(int * const MVmax, int * const mvCount, int * const mvSum, const MACROBLOCK * const pMB, const int qpel) { const VECTOR * const mv = qpel ? 
pMB->qmvs : pMB->mvs; int i; int max = *MVmax; switch (pMB->mode) { case MODE_INTER4V: *mvCount += 3; for(i = 3; i; i--) { if (mv[i].x > max) max = mv[i].x; else if (-mv[i].x - 1 > max) max = -mv[i].x - 1; *mvSum += mv[i].x * mv[i].x; if (mv[i].y > max) max = mv[i].y; else if (-mv[i].y - 1 > max) max = -mv[i].y - 1; *mvSum += mv[i].y * mv[i].y; } /* fall through: mv[0] is counted by the MODE_INTER case */ case MODE_INTER: (*mvCount)++; *mvSum += mv[0].x * mv[0].x; *mvSum += mv[0].y * mv[0].y; if (mv[0].x > max) max = mv[0].x; else if (-mv[0].x - 1 > max) max = -mv[0].x - 1; if (mv[0].y > max) max = mv[0].y; else if (-mv[0].y - 1 > max) max = -mv[0].y - 1; *MVmax = max; default: break; } } void MotionEstimation(MBParam * const pParam, FRAMEINFO * const current, FRAMEINFO * const reference, const IMAGE * const pRefH, const IMAGE * const pRefV, const IMAGE * const pRefHV, const IMAGE * const pGMC, const uint32_t iLimit, const int num_slices) { MACROBLOCK *const pMBs = current->mbs; const IMAGE *const pCurrent = &current->image; const IMAGE *const pRef = &reference->image; const uint32_t mb_width = pParam->mb_width; const uint32_t mb_height = pParam->mb_height; const uint32_t iEdgedWidth = pParam->edged_width; const uint32_t MotionFlags = MakeGoodMotionFlags(current->motion_flags, current->vop_flags, current->vol_flags); int bound = 0; int MVmax = 0, mvSum = 0, mvCount = 0; uint32_t x, y; int sad00; int block = 0; /* some pre-initialized thingies for SearchP */ DECLARE_ALIGNED_MATRIX(dct_space, 3, 64, int16_t, CACHE_LINE); SearchData Data; memset(&Data, 0, sizeof(SearchData)); Data.iEdgedWidth = iEdgedWidth; Data.iFcode = current->fcode; Data.rounding = pParam->m_rounding_type; Data.qpel = (current->vol_flags & XVID_VOL_QUARTERPEL ? 
1:0); Data.chroma = MotionFlags & XVID_ME_CHROMA_PVOP; Data.dctSpace = dct_space; Data.quant_type = !(pParam->vol_flags & XVID_VOL_MPEGQUANT); Data.mpeg_quant_matrices = pParam->mpeg_quant_matrices; Data.RefQ = pRefV->u; /* a good place, also used in MC (for similar purpose) */ if (sadInit) (*sadInit) (); for (y = 0; y < mb_height; y++) { bound = mb_width * ((((y*num_slices) / mb_height) * mb_height + (num_slices-1)) / num_slices); for (x = 0; x < mb_width; x++) { MACROBLOCK *pMB = &pMBs[block]; MACROBLOCK *prevMB = &reference->mbs[block]; int skip; block++; pMB->sad16 = sad16v(pCurrent->y + (x + y * iEdgedWidth) * 16, pRef->y + (x + y * iEdgedWidth) * 16, pParam->edged_width, pMB->sad8); sad00 = 4*MAX(MAX(pMB->sad8[0], pMB->sad8[1]), MAX(pMB->sad8[2], pMB->sad8[3])); if (Data.chroma) { Data.chromaSAD = sad8(pCurrent->u + x*8 + y*(iEdgedWidth/2)*8, pRef->u + x*8 + y*(iEdgedWidth/2)*8, iEdgedWidth/2) + sad8(pCurrent->v + (x + y*(iEdgedWidth/2))*8, pRef->v + (x + y*(iEdgedWidth/2))*8, iEdgedWidth/2); pMB->sad16 += Data.chromaSAD; sad00 += Data.chromaSAD; } skip = InitialSkipDecisionP(sad00, pParam, current, pMB, prevMB, x, y, &Data, pGMC, pCurrent, pRef, MotionFlags, bound); if (skip) continue; SearchP(pRef, pRefH->y, pRefV->y, pRefHV->y, pCurrent, x, y, MotionFlags, current->vop_flags, &Data, pParam, pMBs, reference->mbs, pMB, bound); if (current->vop_flags & XVID_VOP_MODEDECISION_RD) xvid_me_ModeDecision_RD(&Data, pMB, pMBs, x, y, pParam, MotionFlags, current->vop_flags, current->vol_flags, pCurrent, pRef, pGMC, current->coding_type, bound); else if (current->vop_flags & XVID_VOP_FAST_MODEDECISION_RD) xvid_me_ModeDecision_Fast(&Data, pMB, pMBs, x, y, pParam, MotionFlags, current->vop_flags, current->vol_flags, pCurrent, pRef, pGMC, current->coding_type, bound); else ModeDecision_SAD(&Data, pMB, pMBs, x, y, pParam, MotionFlags, current->vop_flags, current->vol_flags, pCurrent, pRef, pGMC, current->coding_type, sad00); motionStatsPVOP(&MVmax, &mvCount, &mvSum, pMB, 
Data.qpel); } } current->fcode = getMinFcode(MVmax); current->sStat.iMvSum = mvSum; current->sStat.iMvCount = mvCount; } void MotionEstimateSMP(SMPData * h) { Encoder *pEnc = (Encoder *) h->pEnc; const MBParam * const pParam = &pEnc->mbParam; const FRAMEINFO * const current = pEnc->current; const FRAMEINFO * const reference = pEnc->reference; const IMAGE * const pRefH = &pEnc->vInterH; const IMAGE * const pRefV = &pEnc->vInterV; const IMAGE * const pRefHV = &pEnc->vInterHV; const IMAGE * const pGMC = &pEnc->vGMC; uint32_t MotionFlags = MakeGoodMotionFlags(current->motion_flags, current->vop_flags, current->vol_flags); MACROBLOCK *const pMBs = current->mbs; const IMAGE *const pCurrent = &current->image; const IMAGE *const pRef = &reference->image; const int mb_width = pParam->mb_width; const int mb_height = pParam->mb_height; const uint32_t iEdgedWidth = pParam->edged_width; int bound = 0; int num_slices = pEnc->num_slices; int y_step = h->y_step; int y_row = h->y_row; int start_y = h->start_y; int stop_y = h->stop_y; int MVmax = 0, mvSum = 0, mvCount = 0; int x, y; int sad00; int block = (start_y+y_row)*mb_width; int * complete_count_self = h->complete_count_self; const volatile int * complete_count_above = h->complete_count_above; int max_mbs; int current_mb = 0; /* some pre-initialized thingies for SearchP */ DECLARE_ALIGNED_MATRIX(dct_space, 3, 64, int16_t, CACHE_LINE); SearchData Data; memset(&Data, 0, sizeof(SearchData)); Data.iEdgedWidth = iEdgedWidth; Data.iFcode = current->fcode; Data.rounding = pParam->m_rounding_type; Data.qpel = (current->vol_flags & XVID_VOL_QUARTERPEL ? 
1:0); Data.chroma = MotionFlags & XVID_ME_CHROMA_PVOP; Data.dctSpace = dct_space; Data.quant_type = !(pParam->vol_flags & XVID_VOL_MPEGQUANT); Data.mpeg_quant_matrices = pParam->mpeg_quant_matrices; /* todo: sort out temp memory space */ Data.RefQ = h->RefQ; if (sadInit) (*sadInit) (); max_mbs = 0; for (y = (start_y + y_row); y < stop_y; y += y_step) { bound = mb_width * ((((y*num_slices) / mb_height) * mb_height + (num_slices-1)) / num_slices); if (y == start_y) max_mbs = mb_width; /* we can process all blocks of the first row */ for (x = 0; x < mb_width; x++) { MACROBLOCK *pMB, *prevMB; int skip; if (current_mb >= max_mbs) { /* we ME-ed all macroblocks we safely could. grab next portion */ int above_count = *complete_count_above; /* sync point */ if (above_count == mb_width) { /* full line above is ready */ above_count = mb_width+1; if (y < (stop_y-y_step)) { /* this is not last line, grab a portion of MBs from the next line too */ above_count += MAX(0, complete_count_above[1] - 1); } } max_mbs = current_mb + above_count - x - 1; if (current_mb >= max_mbs) { /* current workload is zero */ x--; sched_yield(); continue; } } pMB = &pMBs[block]; prevMB = &reference->mbs[block]; pMB->sad16 = sad16v(pCurrent->y + (x + y * iEdgedWidth) * 16, pRef->y + (x + y * iEdgedWidth) * 16, pParam->edged_width, pMB->sad8); sad00 = 4*MAX(MAX(pMB->sad8[0], pMB->sad8[1]), MAX(pMB->sad8[2], pMB->sad8[3])); if (Data.chroma) { Data.chromaSAD = sad8(pCurrent->u + x*8 + y*(iEdgedWidth/2)*8, pRef->u + x*8 + y*(iEdgedWidth/2)*8, iEdgedWidth/2) + sad8(pCurrent->v + (x + y*(iEdgedWidth/2))*8, pRef->v + (x + y*(iEdgedWidth/2))*8, iEdgedWidth/2); pMB->sad16 += Data.chromaSAD; sad00 += Data.chromaSAD; } skip = InitialSkipDecisionP(sad00, pParam, current, pMB, prevMB, x, y, &Data, pGMC, pCurrent, pRef, MotionFlags, bound); if (skip) { current_mb++; block++; *complete_count_self = x+1; continue; } SearchP(pRef, pRefH->y, pRefV->y, pRefHV->y, pCurrent, x, y, MotionFlags, current->vop_flags, &Data, 
pParam, pMBs, reference->mbs, pMB, bound); if (current->vop_flags & XVID_VOP_MODEDECISION_RD) xvid_me_ModeDecision_RD(&Data, pMB, pMBs, x, y, pParam, MotionFlags, current->vop_flags, current->vol_flags, pCurrent, pRef, pGMC, current->coding_type, bound); else if (current->vop_flags & XVID_VOP_FAST_MODEDECISION_RD) xvid_me_ModeDecision_Fast(&Data, pMB, pMBs, x, y, pParam, MotionFlags, current->vop_flags, current->vol_flags, pCurrent, pRef, pGMC, current->coding_type, bound); else ModeDecision_SAD(&Data, pMB, pMBs, x, y, pParam, MotionFlags, current->vop_flags, current->vol_flags, pCurrent, pRef, pGMC, current->coding_type, sad00); *complete_count_self = x+1; current_mb++; block++; motionStatsPVOP(&MVmax, &mvCount, &mvSum, pMB, Data.qpel); } block += (y_step-1)*pParam->mb_width; complete_count_self++; complete_count_above++; } h->minfcode = getMinFcode(MVmax); h->MVmax = MVmax; h->mvSum = mvSum; h->mvCount = mvCount; } xvidcore/src/motion/motion_smp.h0000664000076500007650000000327211564705453020132 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - multithreaded Motion Estimation header - * * Copyright(C) 2005 Radoslaw Czyz * * significant portions derived from x264 project, * original authors: Trax, Gianluigi Tiesi, Eric Petit * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: motion_smp.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef SMP_MOTION_H #define SMP_MOTION_H typedef struct { pthread_t handle; /* thread's handle */ const FRAMEINFO *current; uint8_t * RefQ; int y_row; int y_step; int start_y; int stop_y; int * complete_count_self; int * complete_count_above; int MVmax, mvSum, mvCount; /* out */ uint32_t minfcode, minbcode; uint8_t *tmp_buffer; Bitstream *bs; Statistics *sStat; void *pEnc; } SMPData; void MotionEstimateSMP(SMPData * h); void SMPMotionEstimationBVOP(SMPData * h); #endif /* SMP_MOTION_H */ xvidcore/src/motion/estimation_gmc.c0000664000076500007650000004122411564705453020742 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Global Motion Estimation - * * Copyright(C) 2003 Christoph Lampert * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: estimation_gmc.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include #include #include #include #include #include "../encoder.h" #include "../prediction/mbprediction.h" #include "estimation.h" #include "motion.h" #include "sad.h" #include "gmc.h" #include "../utils/emms.h" #include "motion_inlines.h" static void CheckCandidate16I(const int x, const int y, SearchData * const data, const unsigned int Direction) { int sad; const uint8_t * Reference; if ( (x > data->max_dx) || ( x < data->min_dx) || (y > data->max_dy) || (y < data->min_dy) ) return; Reference = GetReference(x, y, data); sad = sad16(data->Cur, Reference, data->iEdgedWidth, 256*4096); if (sad < data->iMinSAD[0]) { data->iMinSAD[0] = sad; data->currentMV[0].x = x; data->currentMV[0].y = y; data->dir = Direction; } } static __inline void GMEanalyzeMB ( const uint8_t * const pCur, const uint8_t * const pRef, const uint8_t * const pRefH, const uint8_t * const pRefV, const uint8_t * const pRefHV, const int x, const int y, const MBParam * const pParam, MACROBLOCK * const pMBs, SearchData * const Data, const int bound) { MACROBLOCK * const pMB = &pMBs[x + y * pParam->mb_width]; Data->iMinSAD[0] = MV_MAX_ERROR; Data->predMV = get_pmv2(pMBs, pParam->mb_width, bound, x, y, 0); get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, 16, 1); Data->Cur = pCur + 16*(x + y * pParam->edged_width); Data->RefP[0] = pRef + 16*(x + y * pParam->edged_width); Data->RefP[1] = pRefV + 16*(x + y * pParam->edged_width); Data->RefP[2] = pRefH + 16*(x + y * pParam->edged_width); Data->RefP[3] = pRefHV + 16*(x + y * pParam->edged_width); Data->currentMV[0].x = Data->currentMV[0].y = 0; 
CheckCandidate16I(0, 0, Data, 255); if ( (Data->predMV.x !=0) || (Data->predMV.y != 0) ) CheckCandidate16I(Data->predMV.x, Data->predMV.y, Data, 255); xvid_me_DiamondSearch(Data->currentMV[0].x, Data->currentMV[0].y, Data, 255, CheckCandidate16I); xvid_me_SubpelRefine(Data->currentMV[0], Data, CheckCandidate16I, 0); /* for QPel halfpel positions are worse than in halfpel mode :( */ /* if (Data->qpel) { Data->currentQMV->x = 2*Data->currentMV->x; Data->currentQMV->y = 2*Data->currentMV->y; Data->qpel_precision = 1; get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, iFcode, 2, 0); SubpelRefine(Data); } */ pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = Data->currentMV[0]; pMB->sad16 = Data->iMinSAD[0]; pMB->mode = MODE_INTER; pMB->sad16 += 10*d_mv_bits(pMB->mvs[0].x, pMB->mvs[0].y, Data->predMV, Data->iFcode, 0); return; } void GMEanalysis(const MBParam * const pParam, const FRAMEINFO * const current, const FRAMEINFO * const reference, const IMAGE * const pRefH, const IMAGE * const pRefV, const IMAGE * const pRefHV, const int num_slices) { uint32_t x, y; MACROBLOCK * const pMBs = current->mbs; const IMAGE * const pCurrent = &current->image; const IMAGE * const pReference = &reference->image; int bound = 0; const uint32_t mb_width = pParam->mb_width; const uint32_t mb_height = pParam->mb_height; SearchData Data; memset(&Data, 0, sizeof(SearchData)); Data.iEdgedWidth = pParam->edged_width; Data.rounding = pParam->m_rounding_type; Data.iFcode = current->fcode; if (sadInit) (*sadInit) (); for (y = 0; y < pParam->mb_height; y ++) { bound = mb_width * ((((y*num_slices) / mb_height) * mb_height + (num_slices-1))/ num_slices); for (x = 0; x < pParam->mb_width; x ++) { GMEanalyzeMB(pCurrent->y, pReference->y, pRefH->y, pRefV->y, pRefHV->y, x, y, pParam, pMBs, &Data, bound); } } return; } WARPPOINTS GlobalMotionEst(MACROBLOCK * const pMBs, const MBParam * const pParam, const FRAMEINFO * const current, const FRAMEINFO 
* const reference, const IMAGE * const pRefH, const IMAGE * const pRefV, const IMAGE * const pRefHV, const int num_slices) { const int deltax=8; /* upper bound for difference between a MV and its neighbour MVs */ const int deltay=8; const unsigned int gradx=512; /* lower bound for gradient in MB (ignore "flat" blocks) */ const unsigned int grady=512; double sol[4] = { 0., 0., 0., 0. }; WARPPOINTS gmc; uint32_t mx, my; int MBh = pParam->mb_height; int MBw = pParam->mb_width; const int minblocks = 9; /* was = /MBh*MBw/32+3 */ /* just some reasonable number 3% + 3 */ const int maxblocks = MBh*MBw/4; /* just some reasonable number: 25% of all blocks */ int num=0; int oldnum; gmc.duv[0].x = gmc.duv[0].y = gmc.duv[1].x = gmc.duv[1].y = gmc.duv[2].x = gmc.duv[2].y = 0; GMEanalysis(pParam,current, reference, pRefH, pRefV, pRefHV, num_slices); /* block based ME isn't done, yet, so do a quick presearch */ /* filter mask of all blocks */ for (my = 0; my < (uint32_t)MBh; my++) for (mx = 0; mx < (uint32_t)MBw; mx++) { const int mbnum = mx + my * MBw; pMBs[mbnum].mcsel = 0; } for (my = 1; my < (uint32_t)MBh-1; my++) /* ignore boundary blocks */ for (mx = 1; mx < (uint32_t)MBw-1; mx++) /* their MVs are often wrong */ { const int mbnum = mx + my * MBw; MACROBLOCK *const pMB = &pMBs[mbnum]; const VECTOR mv = pMB->mvs[0]; /* don't use object boundaries */ if ( (abs(mv.x - (pMB-1)->mvs[0].x) < deltax) && (abs(mv.y - (pMB-1)->mvs[0].y) < deltay) && (abs(mv.x - (pMB+1)->mvs[0].x) < deltax) && (abs(mv.y - (pMB+1)->mvs[0].y) < deltay) && (abs(mv.x - (pMB-MBw)->mvs[0].x) < deltax) && (abs(mv.y - (pMB-MBw)->mvs[0].y) < deltay) && (abs(mv.x - (pMB+MBw)->mvs[0].x) < deltax) && (abs(mv.y - (pMB+MBw)->mvs[0].y) < deltay) ) { const int iEdgedWidth = pParam->edged_width; const uint8_t *const pCur = current->image.y + 16*(my*iEdgedWidth + mx); if ( (sad16 ( pCur, pCur+1 , iEdgedWidth, 65536) >= gradx ) && (sad16 ( pCur, pCur+iEdgedWidth, iEdgedWidth, 65536) >= grady ) ) { pMB->mcsel = 1; num++; } /* only 
use "structured" blocks */ } } emms(); /* further filtering would be possible, but during iteration, remaining outliers usually are removed, too */ if (num>= minblocks) do { /* until convergence */ double DtimesF[4]; double a,b,c,n,invdenom; double meanx,meany; a = b = c = n = 0; DtimesF[0] = DtimesF[1] = DtimesF[2] = DtimesF[3] = 0.; for (my = 1; my < (uint32_t)MBh-1; my++) for (mx = 1; mx < (uint32_t)MBw-1; mx++) { const int mbnum = mx + my * MBw; const VECTOR mv = pMBs[mbnum].mvs[0]; if (!pMBs[mbnum].mcsel) continue; n++; a += 16*mx+8; b += 16*my+8; c += (16*mx+8)*(16*mx+8)+(16*my+8)*(16*my+8); DtimesF[0] += (double)mv.x; DtimesF[1] += (double)mv.x*(16*mx+8) + (double)mv.y*(16*my+8); DtimesF[2] += (double)mv.x*(16*my+8) - (double)mv.y*(16*mx+8); DtimesF[3] += (double)mv.y; } invdenom = a*a+b*b-c*n; /* Solve the system: sol = (D'*E*D)^{-1} D'*E*F */ /* D'*E*F has been calculated in the same loop as matrix */ sol[0] = -c*DtimesF[0] + a*DtimesF[1] + b*DtimesF[2]; sol[1] = a*DtimesF[0] - n*DtimesF[1] + b*DtimesF[3]; sol[2] = b*DtimesF[0] - n*DtimesF[2] - a*DtimesF[3]; sol[3] = b*DtimesF[1] - a*DtimesF[2] - c*DtimesF[3]; sol[0] /= invdenom; sol[1] /= invdenom; sol[2] /= invdenom; sol[3] /= invdenom; meanx = meany = 0.; oldnum = 0; for (my = 1; my < (uint32_t)MBh-1; my++) for (mx = 1; mx < (uint32_t)MBw-1; mx++) { const int mbnum = mx + my * MBw; const VECTOR mv = pMBs[mbnum].mvs[0]; if (!pMBs[mbnum].mcsel) continue; oldnum++; meanx += fabs(( sol[0] + (16*mx+8)*sol[1] + (16*my+8)*sol[2] ) - (double)mv.x ); meany += fabs(( sol[3] - (16*mx+8)*sol[2] + (16*my+8)*sol[1] ) - (double)mv.y ); } if (4*meanx > oldnum) /* better fit than 0.25 (=1/4pel) is useless */ meanx /= oldnum; else meanx = 0.25; if (4*meany > oldnum) meany /= oldnum; else meany = 0.25; num = 0; for (my = 0; my < (uint32_t)MBh; my++) for (mx = 0; mx < (uint32_t)MBw; mx++) { const int mbnum = mx + my * MBw; const VECTOR mv = pMBs[mbnum].mvs[0]; if (!pMBs[mbnum].mcsel) continue; if ( ( fabs(( sol[0] + 
(16*mx+8)*sol[1] + (16*my+8)*sol[2] ) - (double)mv.x ) > meanx ) || ( fabs(( sol[3] - (16*mx+8)*sol[2] + (16*my+8)*sol[1] ) - (double)mv.y ) > meany ) ) pMBs[mbnum].mcsel=0; else num++; } } while ( (oldnum != num) && (num>= minblocks) ); if (num < minblocks) { const int iEdgedWidth = pParam->edged_width; num = 0; /* fprintf(stderr,"Warning! Unreliable GME (%d/%d blocks), falling back to translation.\n",num,MBh*MBw); */ gmc.duv[0].x= gmc.duv[0].y= gmc.duv[1].x= gmc.duv[1].y= gmc.duv[2].x= gmc.duv[2].y=0; if (!(current->motion_flags & XVID_ME_GME_REFINE)) return gmc; for (my = 1; my < (uint32_t)MBh-1; my++) /* ignore boundary blocks */ for (mx = 1; mx < (uint32_t)MBw-1; mx++) /* their MVs are often wrong */ { const int mbnum = mx + my * MBw; MACROBLOCK *const pMB = &pMBs[mbnum]; const uint8_t *const pCur = current->image.y + 16*(my*iEdgedWidth + mx); if ( (sad16 ( pCur, pCur+1 , iEdgedWidth, 65536) >= gradx ) && (sad16 ( pCur, pCur+iEdgedWidth, iEdgedWidth, 65536) >= grady ) ) { pMB->mcsel = 1; gmc.duv[0].x += pMB->mvs[0].x; gmc.duv[0].y += pMB->mvs[0].y; num++; } } if (gmc.duv[0].x) gmc.duv[0].x /= num; if (gmc.duv[0].y) gmc.duv[0].y /= num; } else { gmc.duv[0].x=(int)(sol[0]+0.5); gmc.duv[0].y=(int)(sol[3]+0.5); gmc.duv[1].x=(int)(sol[1]*pParam->width+0.5); gmc.duv[1].y=(int)(-sol[2]*pParam->width+0.5); gmc.duv[2].x=-gmc.duv[1].y; /* two warp points only */ gmc.duv[2].y=gmc.duv[1].x; } if (num>maxblocks) { for (my = 1; my < (uint32_t)MBh-1; my++) for (mx = 1; mx < (uint32_t)MBw-1; mx++) { const int mbnum = mx + my * MBw; if (pMBs[mbnum-1].mcsel) pMBs[mbnum].mcsel=0; else if (pMBs[mbnum-MBw].mcsel) pMBs[mbnum].mcsel=0; } } return gmc; } int GlobalMotionEstRefine( WARPPOINTS *const startwp, MACROBLOCK * const pMBs, const MBParam * const pParam, const FRAMEINFO * const current, const FRAMEINFO * const reference, const IMAGE * const pCurr, const IMAGE * const pRef, const IMAGE * const pRefH, const IMAGE * const pRefV, const IMAGE * const pRefHV) { uint8_t* GMCblock = 
(uint8_t*)malloc(16*pParam->edged_width); WARPPOINTS bestwp=*startwp; WARPPOINTS centerwp,currwp; int gmcminSAD=0; int gmcSAD=0; int direction; #if 0 int mx,my; #endif #if 0 /* use many blocks... */ for (my = 0; my < (uint32_t)pParam->mb_height; my++) { for (mx = 0; mx < (uint32_t)pParam->mb_width; mx++) { const int mbnum = mx + my * pParam->mb_width; pMBs[mbnum].mcsel=1; } } #endif #if 0 /* or rather don't use too many blocks... */ for (my = 1; my < (uint32_t)MBh-1; my++) { for (mx = 1; mx < (uint32_t)MBw-1; mx++) { const int mbnum = mx + my * MBw; if (MBmask[mbnum-1]) MBmask[mbnum-1]=0; else if (MBmask[mbnum-MBw]) MBmask[mbnum-1]=0; } } #endif gmcminSAD = globalSAD(&bestwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if ( (reference->coding_type == S_VOP) && ( (reference->warp.duv[1].x != bestwp.duv[1].x) || (reference->warp.duv[1].y != bestwp.duv[1].y) || (reference->warp.duv[0].x != bestwp.duv[0].x) || (reference->warp.duv[0].y != bestwp.duv[0].y) || (reference->warp.duv[2].x != bestwp.duv[2].x) || (reference->warp.duv[2].y != bestwp.duv[2].y) ) ) { gmcSAD = globalSAD(&reference->warp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = reference->warp; gmcminSAD = gmcSAD; } } do { direction = 0; centerwp = bestwp; currwp = centerwp; currwp.duv[0].x--; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 1; } else { currwp = centerwp; currwp.duv[0].x++; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 2; } } if (direction) continue; currwp = centerwp; currwp.duv[0].y--; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 4; } else { currwp = centerwp; currwp.duv[0].y++; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, 
GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 8; } } if (direction) continue; currwp = centerwp; currwp.duv[1].x++; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 32; } currwp.duv[2].y++; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 1024; } currwp = centerwp; currwp.duv[1].x--; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 16; } else { currwp = centerwp; currwp.duv[1].x++; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 32; } } if (direction) continue; currwp = centerwp; currwp.duv[1].y--; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 64; } else { currwp = centerwp; currwp.duv[1].y++; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 128; } } if (direction) continue; currwp = centerwp; currwp.duv[2].x--; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 256; } else { currwp = centerwp; currwp.duv[2].x++; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 512; } } if (direction) continue; currwp = centerwp; currwp.duv[2].y--; gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 1024; } else { currwp = centerwp; currwp.duv[2].y++; 
gmcSAD = globalSAD(&currwp, pParam, pMBs, current, pRef, pCurr, GMCblock); if (gmcSAD < gmcminSAD) { bestwp = currwp; gmcminSAD = gmcSAD; direction = 2048; } } } while (direction); free(GMCblock); *startwp = bestwp; return gmcminSAD; } int globalSAD(const WARPPOINTS *const wp, const MBParam * const pParam, const MACROBLOCK * const pMBs, const FRAMEINFO * const current, const IMAGE * const pRef, const IMAGE * const pCurr, uint8_t *const GMCblock) { NEW_GMC_DATA gmc_data; int iSAD, gmcSAD=0; int num=0; unsigned int mx, my; generate_GMCparameters( 3, 3, wp, pParam->width, pParam->height, &gmc_data); for (my = 0; my < (uint32_t)pParam->mb_height; my++) for (mx = 0; mx < (uint32_t)pParam->mb_width; mx++) { const int mbnum = mx + my * pParam->mb_width; const int iEdgedWidth = pParam->edged_width; if (!pMBs[mbnum].mcsel) continue; gmc_data.predict_16x16(&gmc_data, GMCblock, pRef->y, iEdgedWidth, iEdgedWidth, mx, my, pParam->m_rounding_type); iSAD = sad16 ( pCurr->y + 16*(my*iEdgedWidth + mx), GMCblock , iEdgedWidth, 65536); iSAD -= pMBs[mbnum].sad16; if (iSAD<0) gmcSAD += iSAD; num++; } return gmcSAD; } xvidcore/src/motion/ppc_asm/0000775000076500007650000000000011566427763017222 5ustar xvidbuildxvidbuildxvidcore/src/motion/ppc_asm/sad_altivec.c0000664000076500007650000002070211564705453021636 0ustar xvidbuildxvidbuild/* Copyright (C) 2002 Benjamin Herrenschmidt This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

    $Id: sad_altivec.c 1985 2011-05-18 09:02:35Z Isibaar $
*/

#ifdef HAVE_ALTIVEC_H
#include <altivec.h>
#endif

#include "../../portab.h"

/* no debugging by default */
#undef DEBUG

#include <stdio.h>

#define SAD16() \
    t1 = vec_perm(ref[0], ref[1], perm);  /* align current vector */ \
    t2 = vec_max(t1, *cur);  /* find largest of two */ \
    t1 = vec_min(t1, *cur);  /* find smaller of two */ \
    t1 = vec_sub(t2, t1);  /* find absolute difference */ \
    sad = vec_sum4s(t1, vec_splat_u32(0));  /* sum of differences */ \
    sumdiffs = (vector unsigned int)vec_sums((vector signed int)sad, (vector signed int)sumdiffs);  /* accumulate sumdiffs */ \
    if(vec_any_ge(sumdiffs, best_vec)) \
        goto bail; \
    cur += stride; ref += stride;

/*
 * This function assumes cur and stride are 16 bytes aligned and ref is unaligned
 */

uint32_t
sad16_altivec_c(vector unsigned char *cur,
                vector unsigned char *ref,
                uint32_t stride,
                const uint32_t best_sad)
{
    vector unsigned char perm;
    vector unsigned char t1, t2;
    vector unsigned int sad;
    vector unsigned int sumdiffs;
    vector unsigned int best_vec;
    uint32_t result;

#ifdef DEBUG
    /* print alignment errors if DEBUG is on */
    if (((unsigned long) cur) & 0xf)
        fprintf(stderr, "sad16_altivec:incorrect align, cur: %lx\n", (unsigned long)cur);
    if (stride & 0xf)
        fprintf(stderr, "sad16_altivec:incorrect align, stride: %lu\n", (unsigned long)stride);
#endif

    /* initialization */
    sad = vec_splat_u32(0);
    sumdiffs = sad;
    stride >>= 4;
    perm = vec_lvsl(0, (unsigned char *) ref);
    *((uint32_t*)&best_vec) = best_sad;
    best_vec = vec_splat(best_vec, 0);

    /* perform sum of differences between current and previous */
    SAD16();
    SAD16();
    SAD16();
    SAD16();

    SAD16();
    SAD16();
    SAD16();
    SAD16();

    SAD16();
    SAD16();
    SAD16();
    SAD16();

    SAD16();
    SAD16();
    SAD16();
    SAD16();

bail:
    /* copy vector sum into unaligned result */
    sumdiffs = vec_splat(sumdiffs, 3);
    vec_ste(sumdiffs, 0, (uint32_t*) &result);
    return result;
}

#define SAD8() \
    c = vec_perm(vec_ld(0,cur),vec_ld(16,cur),vec_lvsl(0,cur));\
    r = vec_perm(vec_ld(0,ref),vec_ld(16,ref),vec_lvsl(0,ref));\
    c = vec_sub(vec_max(c,r),vec_min(c,r));\
    sad = vec_sum4s(c,sad);\
    cur += stride;\
    ref += stride

/*
 * This function assumes nothing
 */

uint32_t
sad8_altivec_c(const uint8_t * cur,
               const uint8_t *ref,
               const uint32_t stride)
{
    uint32_t result = 0;

    register vector unsigned int sad;
    register vector unsigned char c;
    register vector unsigned char r;

    /* initialize */
    sad = vec_splat_u32(0);

    /* Perform sad operations */
    SAD8();
    SAD8();
    SAD8();
    SAD8();

    SAD8();
    SAD8();
    SAD8();
    SAD8();

    /* finish addition, add the first 2 together */
    sad = vec_and(sad, (vector unsigned int)vec_pack(vec_splat_u16(-1),vec_splat_u16(0)));
    sad = (vector unsigned int)vec_sums((vector signed int)sad, vec_splat_s32(0));
    sad = vec_splat(sad,3);
    vec_ste(sad, 0, &result);

    return result;
}

#define MEAN16() \
    mean = vec_sum4s(*ptr,mean);\
    ptr += stride

#define DEV16() \
    t2 = vec_max(*ptr, mn);  /* find largest of two */ \
    t3 = vec_min(*ptr, mn);  /* find smaller of two */ \
    t2 = vec_sub(t2, t3);  /* find absolute difference */ \
    dev = vec_sum4s(t2, dev); \
    ptr += stride

/*
 * This function assumes cur is 16 bytes aligned and stride is 16 bytes
 * aligned
 */

uint32_t
dev16_altivec_c(vector unsigned char *cur, uint32_t stride)
{
    vector unsigned char t2, t3, mn;
    vector unsigned int mean, dev;
    vector unsigned int sumdiffs;
    vector unsigned char *ptr;
    uint32_t result;

#ifdef DEBUG
    /* print alignment errors if DEBUG is on */
    if(((unsigned long)cur) & 0xf)
        fprintf(stderr, "dev16_altivec:incorrect align, cur: %lx\n", (unsigned long)cur);
    if(stride & 0xf)
        fprintf(stderr, "dev16_altivec:incorrect align, stride: %lu\n", (unsigned long)stride);
#endif

    dev = mean = vec_splat_u32(0);
    stride >>= 4;

    /* set pointer to iterate through cur */
    ptr = cur;

    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();
    MEAN16();

    /* Add all together in sumdiffs */
    sumdiffs = (vector unsigned int)vec_sums((vector signed int) mean, vec_splat_s32(0));

    /* divide by 16 * 16 */
    mn = vec_perm((vector unsigned char)sumdiffs, (vector unsigned char)sumdiffs, vec_splat_u8(14));

    /* set pointer to iterate through cur */
    ptr = cur;

    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();
    DEV16();

    /* sum all parts of difference into one 32 bit quantity */
    sumdiffs = (vector unsigned int)vec_sums((vector signed int) dev, vec_splat_s32(0));

    /* copy vector sum into unaligned result */
    sumdiffs = vec_splat(sumdiffs, 3);
    vec_ste(sumdiffs, 0, (uint32_t*) &result);
    return result;
}

#define SAD16BI() \
    t1 = vec_perm(ref1[0], ref1[1], mask1); \
    t2 = vec_perm(ref2[0], ref2[1], mask2); \
    t1 = vec_avg(t1, t2); \
    t2 = vec_max(t1, *cur); \
    t1 = vec_min(t1, *cur); \
    sad = vec_sub(t2, t1); \
    sum = vec_sum4s(sad, sum); \
    cur += stride; \
    ref1 += stride; \
    ref2 += stride

/*
 * This function assumes cur is 16 bytes aligned, stride is 16 bytes
 * aligned and ref1 and ref2 are unaligned
 */

uint32_t
sad16bi_altivec_c(vector unsigned char *cur,
                  vector unsigned char *ref1,
                  vector unsigned char *ref2,
                  uint32_t stride)
{
    vector unsigned char t1, t2;
    vector unsigned char mask1, mask2;
    vector unsigned char sad;
    vector unsigned int sum;
    uint32_t result;

#ifdef DEBUG
    /* print alignment errors if this is on */
    if((long)cur & 0xf)
        fprintf(stderr, "sad16bi_altivec:incorrect align, cur: %lx\n", (unsigned long)cur);
    if(stride & 0xf)
        fprintf(stderr, "sad16bi_altivec:incorrect align, stride: %lu\n", (unsigned long)stride);
#endif

    /* Initialisation stuff */
    stride >>= 4;
    mask1 = vec_lvsl(0, (unsigned char*)ref1);
    mask2 = vec_lvsl(0, (unsigned char*)ref2);
    sad = vec_splat_u8(0);
    sum = (vector unsigned int)sad;

    SAD16BI();
    SAD16BI();
    SAD16BI();
    SAD16BI();

    SAD16BI();
    SAD16BI();
    SAD16BI();
    SAD16BI();

    SAD16BI();
    SAD16BI();
    SAD16BI();
SAD16BI(); SAD16BI(); SAD16BI(); SAD16BI(); SAD16BI(); sum = (vector unsigned int)vec_sums((vector signed int)sum, vec_splat_s32(0)); sum = vec_splat(sum, 3); vec_ste(sum, 0, (uint32_t*)&result); return result; } #define SSE8_16BIT() \ b1_vec = vec_perm(vec_ld(0,b1), vec_ld(16,b1), vec_lvsl(0,b1)); \ b2_vec = vec_perm(vec_ld(0,b2), vec_ld(16,b2), vec_lvsl(0,b2)); \ diff = vec_sub(b1_vec,b2_vec); \ sum = vec_msum(diff,diff,sum); \ b1 = (const int16_t*)((int8_t*)b1+stride); \ b2 = (const int16_t*)((int8_t*)b2+stride) uint32_t sse8_16bit_altivec_c(const int16_t * b1, const int16_t * b2, const uint32_t stride) { register vector signed short b1_vec; register vector signed short b2_vec; register vector signed short diff; register vector signed int sum; uint32_t result; /* initialize */ sum = vec_splat_s32(0); SSE8_16BIT(); SSE8_16BIT(); SSE8_16BIT(); SSE8_16BIT(); SSE8_16BIT(); SSE8_16BIT(); SSE8_16BIT(); SSE8_16BIT(); /* sum the vector */ sum = vec_sums(sum, vec_splat_s32(0)); sum = vec_splat(sum,3); vec_ste(sum,0,(int*)&result); /* and return */ return result; } xvidcore/src/motion/estimation.h0000664000076500007650000001631311564705453020122 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Motion Estimation related header - * * Copyright(C) 2002 Christoph Lampert * 2002-2010 Michael Militzer * 2002-2003 Radoslaw Czyz * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: estimation.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _ESTIMATION_H_ #define _ESTIMATION_H_ #include "../portab.h" #include "../global.h" #include "sad.h" /* hard coded motion search parameters */ /* very large value */ #define MV_MAX_ERROR (4096 * 256) /* INTER bias for INTER/INTRA decision; mpeg4 spec suggests 2*nb */ #define MV16_INTER_BIAS 450 /* vector map (vlc delta size) smoother parameters ! float !*/ #define NEIGH_TEND_16X16 0.6 #define NEIGH_TEND_8X8 1.5 #define NEIGH_8X8_BIAS 40 #define BITS_MULT 16 #define INITIAL_SKIP_THRESH 6 #define FINAL_SKIP_THRESH 50 #define MAX_SAD00_FOR_SKIP 20 #define MAX_CHROMA_SAD_FOR_SKIP 22 /* Parameters which control inter/inter4v decision */ #define IMV16X16 2 extern const int xvid_me_lambda_vec16[32]; #define CHECK_CANDIDATE(X,Y,D) { \ CheckCandidate((X),(Y), data, (D) ); } /* fast ((A)/2)*2 */ #define EVEN(A) (((A)<0?(A)+1:(A)) & ~1) #define MVequal(A,B) ( ((A).x)==((B).x) && ((A).y)==((B).y) ) static const VECTOR zeroMV = { 0, 0 }; typedef struct { int max_dx, min_dx, max_dy, min_dy; /* maximum search range */ /* data modified by CheckCandidates */ int32_t iMinSAD[5]; /* smallest SADs found so far */ VECTOR currentMV[5]; /* best vectors found so far */ VECTOR currentQMV[5]; /* as above, but used during qpel search */ int temp[4]; /* temporary space */ unsigned int dir; /* 'direction', set when better vector is found */ int chromaX, chromaY, chromaSAD; /* info to make ChromaSAD faster */ /* general fields */ uint32_t rounding; /* rounding type in use */ VECTOR predMV; /* vector which predicts current vector */ const uint8_t * RefP[6]; /* reference pictures - N, V, H, HV, cU, cV */ const uint8_t * Cur; /* current picture */ 
const uint8_t *CurU, *CurV; /* current picture - chroma planes */ uint8_t * RefQ; /* temporary space for interpolations */ uint32_t lambda16; /* how much vector bits weight */ uint32_t lambda8; /* as above - for inter4v mode */ uint32_t iEdgedWidth; /* picture's stride */ uint32_t iFcode; /* current fcode */ int qpel; /* if we're coding in qpel mode */ int qpel_precision; /* if X and Y are in qpel precision (refinement probably) */ int chroma; /* should we include chroma SAD? */ /* fields for interpolate and direct modes */ const uint8_t * b_RefP[6]; /* backward reference pictures - N, V, H, HV, cU, cV */ VECTOR bpredMV; /* backward prediction - used in Interpolate-mode search only */ uint32_t bFcode; /* backward fcode - used in Interpolate-mode search only */ int b_chromaX, b_chromaY; /* fields for direct mode */ VECTOR directmvF[4]; /* scaled reference vectors */ VECTOR directmvB[4]; const VECTOR * referencemv; /* pointer to not-scaled reference vectors */ /* BITS/R-D stuff */ int16_t * dctSpace; /* temporary space for dct */ uint32_t iQuant; /* current quant */ uint32_t quant_type; /* current quant type */ unsigned int cbp[2]; /* CBP of the best vector found so far + cbp for inter4v search */ const uint16_t * scan_table; /* current scan table */ const uint16_t * mpeg_quant_matrices; /* current MPEG quantization matrices */ int lambda[6]; /* R-D lambdas for all 6 blocks */ unsigned int quant_sq; /* quant squared - saves many multiplications in VHQ */ uint32_t rel_var8[6]; /* relative variances for all 6 sub-blocks */ int metric; /* distortion metric for R-D optimizations, currently: PSNR=0, PSNRHVSM=1 */ } SearchData; static __inline uint32_t masked_sseh8_16bit(int16_t * const orig, int16_t * const rec, const uint32_t rel_var8) { uint16_t mask = ((isqrt(2*coeff8_energy(orig)*rel_var8) + 48) >> 6); return (5*sseh8_16bit(orig, rec, (uint16_t) mask)) >> 7; } typedef void(CheckFunc)(const int x, const int y, SearchData * const Data, const unsigned int Direction); 
CheckFunc CheckCandidate16no4v; /* shared between p-vop and b-vop search */ uint8_t * xvid_me_interpolate8x8qpel(const int x, const int y, const uint32_t block, const uint32_t dir, const SearchData * const data); uint8_t * xvid_me_interpolate16x16qpel(const int x, const int y, const uint32_t dir, const SearchData * const data); int32_t xvid_me_ChromaSAD(const int dx, const int dy, SearchData * const data); int xvid_me_SkipDecisionP(const IMAGE * current, const IMAGE * reference, const int x, const int y, const uint32_t stride, const uint32_t iQuant); #define iDiamondSize 2 typedef void MainSearchFunc(int x, int y, SearchData * const Data, int bDirection, CheckFunc * const CheckCandidate); MainSearchFunc xvid_me_DiamondSearch, xvid_me_AdvDiamondSearch, xvid_me_SquareSearch; void xvid_me_SubpelRefine(VECTOR centerMV, SearchData * const data, CheckFunc * const CheckCandidate, int dir); void FullRefine_Fast(SearchData * data, CheckFunc * CheckCandidate, int direction); void xvid_me_ModeDecision_RD(SearchData * const Data, MACROBLOCK * const pMB, const MACROBLOCK * const pMBs, const int x, const int y, const MBParam * const pParam, const uint32_t MotionFlags, const uint32_t VopFlags, const uint32_t VolFlags, const IMAGE * const pCurrent, const IMAGE * const pRef, const IMAGE * const vGMC, const int coding_type, const int bound); void xvid_me_ModeDecision_Fast(SearchData * const Data, MACROBLOCK * const pMB, const MACROBLOCK * const pMBs, const int x, const int y, const MBParam * const pParam, const uint32_t MotionFlags, const uint32_t VopFlags, const uint32_t VolFlags, const IMAGE * const pCurrent, const IMAGE * const pRef, const IMAGE * const vGMC, const int coding_type, const int bound); void ModeDecision_BVOP_RD(SearchData * const Data_d, SearchData * const Data_b, SearchData * const Data_f, SearchData * const Data_i, MACROBLOCK * const pMB, const MACROBLOCK * const b_mb, VECTOR * f_predMV, VECTOR * b_predMV, const uint32_t MotionFlags, const uint32_t VopFlags, const 
MBParam * const pParam, int x, int y, int best_sad, int force_direct); unsigned int getMinFcode(const int MVmax); #endif /* _ESTIMATION_H_ */ xvidcore/src/motion/ia64_asm/0000775000076500007650000000000011566427763017203 5ustar xvidbuildxvidbuildxvidcore/src/motion/ia64_asm/calc_delta_1.s0000664000076500007650000000450211147310721021637 0ustar xvidbuildxvidbuild// **************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 halfpel refinement - // * // * Copyright(C) 2002 Johannes Singler, Daniel Winkler // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. 
// *
// *  You should have received a copy of the GNU General Public License
// *  along with this program; if not, write to the Free Software
// *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
// *
// * $Id: calc_delta_1.s,v 1.2 2009-02-19 17:07:29 Isibaar Exp $
// *
// ***************************************************************************/
//
// ****************************************************************************
// *
// *  calc_delta_1.s, IA-64 halfpel refinement
// *
// *  This version was implemented during an IA-64 practical training at
// *  the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/)
// *
// ****************************************************************************

    ;;
    getf.sig ret0 = fmv
    add mpr[0] = mpr[0], mpr[1]
    add mpr[2] = mpr[2], mpr[3]
    add mpr[4] = mpr[4], mpr[5]
    add mpr[6] = mpr[6], mpr[7]
    ;;
    add mpr[0] = mpr[0], mpr[2]
    add mpr[4] = mpr[4], mpr[6]
    mov component[0] = dx
    mov component[1] = dy
    cmp.ne non0_2, p0 = 0, dx
    cmp.gt neg_2, p0 = 0, dx
    .pred.rel "mutex", p32, p36    //non0_0, neg_0
    cmp.ne non0_3, p0 = 0, dy
    cmp.gt neg_3, p0 = 0, dy
    ;;
    .pred.rel "mutex", p33, p37    //non0_1, neg_1
    add iSAD = iSAD, ret0
    add mpr[8] = mpr[0], mpr[4]
(neg_2) sub component[0] = 0, component[0]    //abs
(neg_3) sub component[1] = 0, component[1]    //abs
    ;;
xvidcore/src/motion/ia64_asm/sad_ia64.s0000664000076500007650000006423411147310721020746 0ustar xvidbuildxvidbuild
// ****************************************************************************
// *
// *  XVID MPEG-4 VIDEO CODEC
// *  - IA64 sum of absolute differences -
// *
// *  Copyright(C) 2002 Hannes Jütting, Christopher Özbek
// *
// *  This program is free software; you can redistribute it and/or modify it
// *  under the terms of the GNU General Public License as published by
// *  the Free Software Foundation; either version 2 of the License, or
// *  (at your option) any later version.
// *
// *  This program is distributed in the hope that it will be useful,
// *  but WITHOUT ANY WARRANTY; without even the implied warranty of
// *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// *  GNU General Public License for more details.
// *
// *  You should have received a copy of the GNU General Public License
// *  along with this program; if not, write to the Free Software
// *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
// *
// * $Id: sad_ia64.s,v 1.8 2009-02-19 17:07:29 Isibaar Exp $
// *
// ***************************************************************************/
//
// ****************************************************************************
// *
// *  sad_ia64.s, IA-64 sum of absolute differences
// *
// *  This version was implemented during an IA-64 practical training at
// *  the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/)
// *
// ****************************************************************************

// ------------------------------------------------------------------------------
// *
// * Optimized Assembler Versions of sad8 and sad16
// *
// ------------------------------------------------------------------------------
// *
// * Hannes Jütting and Christopher Özbek
// * {s_juetti,s_oezbek}@ira.uka.de
// *
// * Programmed for the IA64 laboratory held at University Karlsruhe 2002
// * http://www.info.uni-karlsruhe.de/~rubino/ia64p/
// *
// ------------------------------------------------------------------------------
// *
// * These are the optimized assembler versions of sad8 and sad16, which calculate
// * the sum of absolute differences between two 8x8/16x16 block matrices.
// *
// * Our approach uses:
// *   - The Itanium instruction psad1, which solves the problem in hardware.
// * - Modulo-Scheduled Loops as the best way to loop unrolling on the IA64 // * EPIC architecture // * - Alignment resolving to avoid memory faults // * // ------------------------------------------------------------------------------ .common sad16bi#,8,8 .align 16 .global sad16bi_ia64# .proc sad16bi_ia64# sad16bi_ia64: .prologue .save ar.lc, r2 mov r2 = ar.lc .body zxt4 r35 = r35 mov r8 = r0 mov r23 = r0 addl r22 = 255, r0 .L21: addl r14 = 7, r0 mov r19 = r32 mov r21 = r34 mov r20 = r33 ;; mov ar.lc = r14 ;; .L105: mov r17 = r20 mov r18 = r21 ;; ld1 r14 = [r17], 1 ld1 r15 = [r18], 1 ;; add r14 = r14, r15 ;; adds r14 = 1, r14 ;; shr.u r16 = r14, 1 ;; cmp4.le p6, p7 = r0, r16 ;; (p7) mov r16 = r0 (p7) br.cond.dpnt .L96 ;; cmp4.ge p6, p7 = r22, r16 ;; (p7) addl r16 = 255, r0 .L96: ld1 r14 = [r19] adds r20 = 2, r20 adds r21 = 2, r21 ;; sub r15 = r14, r16 ;; cmp4.ge p6, p7 = 0, r15 ;; (p6) sub r14 = r16, r14 (p7) add r8 = r8, r15 ;; (p6) add r8 = r8, r14 ld1 r15 = [r18] ld1 r14 = [r17] ;; add r14 = r14, r15 adds r17 = 1, r19 ;; adds r14 = 1, r14 ;; shr.u r16 = r14, 1 ;; cmp4.le p6, p7 = r0, r16 ;; (p7) mov r16 = r0 (p7) br.cond.dpnt .L102 ;; cmp4.ge p6, p7 = r22, r16 ;; (p7) addl r16 = 255, r0 .L102: ld1 r14 = [r17] adds r19 = 2, r19 ;; sub r15 = r14, r16 ;; cmp4.ge p6, p7 = 0, r15 ;; (p7) add r8 = r8, r15 (p6) sub r14 = r16, r14 ;; (p6) add r8 = r8, r14 br.cloop.sptk.few .L105 adds r23 = 1, r23 add r32 = r32, r35 add r33 = r33, r35 add r34 = r34, r35 ;; cmp4.geu p6, p7 = 15, r23 (p6) br.cond.dptk .L21 mov ar.lc = r2 br.ret.sptk.many b0 .endp sad16bi_ia64# .text .align 16 .global dev16_ia64# .proc dev16_ia64# .auto dev16_ia64: // renamings for better readability stride = r18 pfs = r19 //for saving previous function state cura0 = r20 //address of first 8-byte block of cur cura1 = r21 //address of second 8-byte block of cur mean0 = r22 //registers for calculating the sum in parallel mean1 = r23 mean2 = r24 mean3 = r25 dev0 = r26 //same for the deviation dev1 = r27 dev2 
= r28 dev3 = r29 .body alloc pfs = ar.pfs, 2, 38, 0, 40 mov cura0 = in0 mov stride = in1 add cura1 = 8, cura0 .rotr c[32], psad[8] // just using rotating registers to get an array ;-) .explicit {.mmi ld8 c[0] = [cura0], stride // load them ... ld8 c[1] = [cura1], stride ;; } {.mmi ld8 c[2] = [cura0], stride ld8 c[3] = [cura1], stride ;; } {.mmi ld8 c[4] = [cura0], stride ld8 c[5] = [cura1], stride ;; } {.mmi ld8 c[6] = [cura0], stride ld8 c[7] = [cura1], stride ;; } {.mmi ld8 c[8] = [cura0], stride ld8 c[9] = [cura1], stride ;; } {.mmi ld8 c[10] = [cura0], stride ld8 c[11] = [cura1], stride ;; } {.mii ld8 c[12] = [cura0], stride psad1 mean0 = c[0], r0 // get the sum of them ... psad1 mean1 = c[1], r0 } {.mmi ld8 c[13] = [cura1], stride ;; ld8 c[14] = [cura0], stride psad1 mean2 = c[2], r0 } {.mii ld8 c[15] = [cura1], stride psad1 mean3 = c[3], r0 ;; psad1 psad[0] = c[4], r0 } {.mmi ld8 c[16] = [cura0], stride ld8 c[17] = [cura1], stride psad1 psad[1] = c[5], r0 ;; } {.mii ld8 c[18] = [cura0], stride psad1 psad[2] = c[6], r0 psad1 psad[3] = c[7], r0 } {.mmi ld8 c[19] = [cura1], stride ;; ld8 c[20] = [cura0], stride psad1 psad[4] = c[8], r0 } {.mii ld8 c[21] = [cura1], stride psad1 psad[5] = c[9], r0 ;; add mean0 = mean0, psad[0] } {.mmi ld8 c[22] = [cura0], stride ld8 c[23] = [cura1], stride add mean1 = mean1, psad[1] ;; } {.mii ld8 c[24] = [cura0], stride psad1 psad[0] = c[10], r0 psad1 psad[1] = c[11], r0 } {.mmi ld8 c[25] = [cura1], stride ;; ld8 c[26] = [cura0], stride add mean2 = mean2, psad[2] } {.mii ld8 c[27] = [cura1], stride add mean3 = mean3, psad[3] ;; psad1 psad[2] = c[12], r0 } {.mmi ld8 c[28] = [cura0], stride ld8 c[29] = [cura1], stride psad1 psad[3] = c[13], r0 ;; } {.mii ld8 c[30] = [cura0] psad1 psad[6] = c[14], r0 psad1 psad[7] = c[15], r0 } {.mmi ld8 c[31] = [cura1] ;; add mean0 = mean0, psad[0] add mean1 = mean1, psad[1] } {.mii add mean2 = mean2, psad[4] add mean3 = mean3, psad[5] ;; psad1 psad[0] = c[16], r0 } {.mmi add mean0 = mean0, psad[2] 
add mean1 = mean1, psad[3] psad1 psad[1] = c[17], r0 ;; } {.mii add mean2 = mean2, psad[6] psad1 psad[2] = c[18], r0 psad1 psad[3] = c[19], r0 } {.mmi add mean3 = mean3, psad[7] ;; add mean0 = mean0, psad[0] psad1 psad[4] = c[20], r0 } {.mii add mean1 = mean1, psad[1] psad1 psad[5] = c[21], r0 ;; psad1 psad[6] = c[22], r0 } {.mmi add mean2 = mean2, psad[2] add mean3 = mean3, psad[3] psad1 psad[7] = c[23], r0 ;; } {.mii add mean0 = mean0, psad[4] psad1 psad[0] = c[24], r0 psad1 psad[1] = c[25], r0 } {.mmi add mean1 = mean1, psad[5] ;; add mean2 = mean2, psad[6] psad1 psad[2] = c[26], r0 } {.mii add mean3 = mean3, psad[7] psad1 psad[3] = c[27], r0 ;; psad1 psad[4] = c[28], r0 } {.mmi add mean0 = mean0, psad[0] add mean1 = mean1, psad[1] psad1 psad[5] = c[29], r0 ;; } {.mii add mean2 = mean2, psad[2] psad1 psad[6] = c[30], r0 psad1 psad[7] = c[31], r0 } {.mmi add mean3 = mean3, psad[3] ;; add mean0 = mean0, psad[4] add mean1 = mean1, psad[5] } {.mbb add mean2 = mean2, mean3 nop.b 1 nop.b 1 ;; } {.mib add mean0 = mean0, psad[6] add mean1 = mean1, psad[7] nop.b 1 ;; } {.mib add mean0 = mean0, mean1 // add mean2 = 127, mean2 // this could make our division more exactly, but does not help much ;; } {.mib add mean0 = mean0, mean2 ;; } {.mib shr.u mean0 = mean0, 8 // divide them ... ;; } {.mib mux1 mean0 = mean0, @brcst ;; } {.mii nop.m 0 psad1 dev0 = c[0], mean0 // and do a sad again ... 
psad1 dev1 = c[1], mean0 } {.mii nop.m 0 psad1 dev2 = c[2], mean0 psad1 dev3 = c[3], mean0 } {.mii nop.m 0 psad1 psad[0] = c[4], mean0 psad1 psad[1] = c[5], mean0 } {.mii nop.m 0 psad1 psad[2] = c[6], mean0 psad1 psad[3] = c[7], mean0 } {.mii nop.m 0 psad1 psad[4] = c[8], mean0 psad1 psad[5] = c[9], mean0 ;; } {.mii add dev0 = dev0, psad[0] psad1 psad[6] = c[10], mean0 psad1 psad[7] = c[11], mean0 } {.mmi add dev1 = dev1, psad[1] add dev2 = dev2, psad[2] psad1 psad[0] = c[12], mean0 } {.mii add dev3 = dev3, psad[3] psad1 psad[1] = c[13], mean0 ;; psad1 psad[2] = c[14], mean0 } {.mmi add dev0 = dev0, psad[4] add dev1 = dev1, psad[5] psad1 psad[3] = c[15], mean0 } {.mii add dev2 = dev2, psad[6] psad1 psad[4] = c[16], mean0 psad1 psad[5] = c[17], mean0 } {.mmi add dev3 = dev3, psad[7] ;; add dev0 = dev0, psad[0] psad1 psad[6] = c[18], mean0 } {.mii add dev1 = dev1, psad[1] psad1 psad[7] = c[19], mean0 psad1 psad[0] = c[20], mean0 } {.mmi add dev2 = dev2, psad[2] add dev3 = dev3, psad[3] psad1 psad[1] = c[21], mean0 ;; } {.mii add dev0 = dev0, psad[4] psad1 psad[2] = c[22], mean0 psad1 psad[3] = c[23], mean0 } {.mmi add dev1 = dev1, psad[5] add dev2 = dev2, psad[6] psad1 psad[4] = c[24], mean0 } {.mii add dev3 = dev3, psad[7] psad1 psad[5] = c[25], mean0 ;; psad1 psad[6] = c[26], mean0 } {.mmi add dev0 = dev0, psad[0] add dev1 = dev1, psad[1] psad1 psad[7] = c[27], mean0 } {.mii add dev2 = dev2, psad[2] psad1 psad[0] = c[28], mean0 psad1 psad[1] = c[29], mean0 } {.mmi add dev3 = dev3, psad[3] ;; add dev0 = dev0, psad[4] psad1 psad[2] = c[30], mean0 } {.mii add dev1 = dev1, psad[5] psad1 psad[3] = c[31], mean0 ;; add dev2 = dev2, psad[6] } {.mmi add dev3 = dev3, psad[7] add dev0 = dev0, psad[0] add dev1 = dev1, psad[1] ;; } {.mii add dev2 = dev2, psad[2] add dev3 = dev3, psad[3] add ret0 = dev0, dev1 ;; } {.mib add dev2 = dev2, dev3 nop.i 1 nop.b 1 ;; } {.mib add ret0 = ret0, dev2 nop.i 1 br.ret.sptk.many b0 } .endp dev16_ia64# // 
// ###########################################################
// ###########################################################
// New version by group 01 ###################################
// ###########################################################
// ###########################################################

    .text
    .align 16
    .global sad16_ia64#
    .proc sad16_ia64#
sad16_ia64:
    alloc r1 = ar.pfs, 4, 76, 0, 0
    mov r2 = pr
    dep r14 = r0, r33, 0, 3        // r14 = (r33 div 8)*8 (aligned version of ref)
    dep.z r31 = r33, 0, 3          // r31 = r33 mod 8 (misalignment of ref)
    ;;
    mov r64 = r34                  //(1) calculate multiples of stride
    shl r65 = r34, 1               //(2) for being able to load all the
    shladd r66 = r34, 1, r34       //(3) data at once
    shl r67 = r34, 2               //(4)
    shladd r68 = r34, 2, r34       //(5)
    shl r71 = r34, 3               //(8)
    shladd r72 = r34, 3, r34       //(9)
    ;;
    shl r69 = r66, 1               //(6)
    shladd r70 = r66, 1, r34       //(7)
    shl r73 = r68, 1               //(10)
    shladd r74 = r68, 1, r34       //(11)
    shl r75 = r66, 2               //(12)
    shladd r76 = r66, 2, r34       //(13)
    shladd r77 = r66, 2, r65       //(14)
    shladd r78 = r66, 2, r66       //(15)
    ;;
    cmp.eq p16, p17 = 0, r31       // prepare predicates according to the misalignment
    cmp.eq p18, p19 = 2, r31       // of ref
    cmp.eq p20, p21 = 4, r31
    cmp.eq p22, p23 = 6, r31
    cmp.eq p24, p25 = 1, r31
    cmp.eq p26, p27 = 3, r31
    cmp.eq p28, p29 = 5, r31
    mov r96 = r14                  // and calculate all the addresses where we have
    mov r33 = r32                  // to load from
    add r97 = r14, r64
    add r35 = r32, r64
    add r98 = r14, r65
    add r37 = r32, r65
    add r99 = r14, r66
    add r39 = r32, r66
    add r100 = r14, r67
    add r41 = r32, r67
    add r101 = r14, r68
    add r43 = r32, r68
    add r102 = r14, r69
    add r45 = r32, r69
    add r103 = r14, r70
    add r47 = r32, r70
    add r104 = r14, r71
    add r49 = r32, r71
    add r105 = r14, r72
    add r51 = r32, r72
    add r106 = r14, r73
    add r53 = r32, r73
    add r107 = r14, r74
    add r55 = r32, r74
    add r108 = r14, r75
    add r57 = r32, r75
    add r109 = r14, r76
    add r59 = r32, r76
    add r110 = r14, r77
    add r61 = r32, r77
    add r111 = r14, r78
    add r63 = r32, r78
    ;;
    ld8 r32 = [r33], 8             // Load all the
                                   // data which is needed for the sad
    ld8 r34 = [r35], 8             // in the registers. The goal is to have the array
    ld8 r36 = [r37], 8             // addressed by cur in the registers r32 - r63 and
    ld8 r38 = [r39], 8             // the array addressed by ref in the registers
    ld8 r40 = [r41], 8             // r64 - r95. The registers r96 - r111 are needed
    ld8 r42 = [r43], 8             // to load the aligned 24 bytes in which the
    ld8 r44 = [r45], 8             // needed misaligned 16 bytes must be.
    ld8 r46 = [r47], 8             // After loading we start a preprocessing step which
    ld8 r48 = [r49], 8             // guarantees that the data addressed by ref is in
    ld8 r50 = [r51], 8             // the registers r64 - r95.
    ld8 r52 = [r53], 8
    ld8 r54 = [r55], 8
    ld8 r56 = [r57], 8
    ld8 r58 = [r59], 8
    ld8 r60 = [r61], 8
    ld8 r62 = [r63], 8
    ld8 r64 = [r96], 8
    ld8 r66 = [r97], 8
    ld8 r68 = [r98], 8
    ld8 r70 = [r99], 8
    ld8 r72 = [r100], 8
    ld8 r74 = [r101], 8
    ld8 r76 = [r102], 8
    ld8 r78 = [r103], 8
    ld8 r80 = [r104], 8
    ld8 r82 = [r105], 8
    ld8 r84 = [r106], 8
    ld8 r86 = [r107], 8
    ld8 r88 = [r108], 8
    ld8 r90 = [r109], 8
    ld8 r92 = [r110], 8
    ld8 r94 = [r111], 8
    ;;
    ld8 r33 = [r33]
    ld8 r35 = [r35]
    ld8 r37 = [r37]
    ld8 r39 = [r39]
    ld8 r41 = [r41]
    ld8 r43 = [r43]
    ld8 r45 = [r45]
    ld8 r47 = [r47]
    ld8 r49 = [r49]
    ld8 r51 = [r51]
    ld8 r53 = [r53]
    ld8 r55 = [r55]
    ld8 r57 = [r57]
    ld8 r59 = [r59]
    ld8 r61 = [r61]
    ld8 r63 = [r63]
    ld8 r65 = [r96], 8
    ld8 r67 = [r97], 8
    ld8 r69 = [r98], 8
    ld8 r71 = [r99], 8
    ld8 r73 = [r100], 8
    ld8 r75 = [r101], 8
    ld8 r77 = [r102], 8
    ld8 r79 = [r103], 8
    ld8 r81 = [r104], 8
    ld8 r83 = [r105], 8
    ld8 r85 = [r106], 8
    ld8 r87 = [r107], 8
    ld8 r89 = [r108], 8
    ld8 r91 = [r109], 8
    ld8 r93 = [r110], 8
    ld8 r95 = [r111], 8
(p16) br.cond.dptk.many .Lber      // If ref is aligned, everything is loaded and we can start the calculation
    ;;
    ld8 r96 = [r96]                // If not, we have to load a bit more
    ld8 r97 = [r97]
    ld8 r98 = [r98]
    ld8 r99 = [r99]
    ld8 r100 = [r100]
    ld8 r101 = [r101]
    ld8 r102 = [r102]
    ld8 r103 = [r103]
    ld8 r104 = [r104]
    ld8 r105 = [r105]
    ld8 r106 = [r106]
    ld8 r107 = [r107]
    ld8 r108 = [r108]
    ld8 r109 =
[r109] ld8 r110 = [r110] ld8 r111 = [r111] (p24) br.cond.dptk.many .Lmod1 // according to the misalignment, we have (p18) br.cond.dpnt.many .Lmod2 // to jump to different preprocessing routines (p26) br.cond.dpnt.many .Lmod3 (p20) br.cond.dpnt.many .Lmod4 (p28) br.cond.dpnt.many .Lmod5 (p22) br.cond.dpnt.many .Lmod6 ;; .Lmod7: // this jump point is not needed shrp r64 = r65, r64, 56 // in these blocks, we do the preprocessing shrp r65 = r96, r65, 56 shrp r66 = r67, r66, 56 shrp r67 = r97, r67, 56 shrp r68 = r69, r68, 56 shrp r69 = r98, r69, 56 shrp r70 = r71, r70, 56 shrp r71 = r99, r71, 56 shrp r72 = r73, r72, 56 shrp r73 = r100, r73, 56 shrp r74 = r75, r74, 56 shrp r75 = r101, r75, 56 shrp r76 = r77, r76, 56 shrp r77 = r102, r77, 56 shrp r78 = r79, r78, 56 shrp r79 = r103, r79, 56 shrp r80 = r81, r80, 56 shrp r81 = r104, r81, 56 shrp r82 = r83, r82, 56 shrp r83 = r105, r83, 56 shrp r84 = r85, r84, 56 shrp r85 = r106, r85, 56 shrp r86 = r87, r86, 56 shrp r87 = r107, r87, 56 shrp r88 = r89, r88, 56 shrp r89 = r108, r89, 56 shrp r90 = r91, r90, 56 shrp r91 = r109, r91, 56 shrp r92 = r93, r92, 56 shrp r93 = r110, r93, 56 shrp r94 = r95, r94, 56 shrp r95 = r111, r95, 56 br.cond.sptk.many .Lber // and then we jump to the calculation ;; .Lmod6: shrp r64 = r65, r64, 48 shrp r65 = r96, r65, 48 shrp r66 = r67, r66, 48 shrp r67 = r97, r67, 48 shrp r68 = r69, r68, 48 shrp r69 = r98, r69, 48 shrp r70 = r71, r70, 48 shrp r71 = r99, r71, 48 shrp r72 = r73, r72, 48 shrp r73 = r100, r73, 48 shrp r74 = r75, r74, 48 shrp r75 = r101, r75, 48 shrp r76 = r77, r76, 48 shrp r77 = r102, r77, 48 shrp r78 = r79, r78, 48 shrp r79 = r103, r79, 48 shrp r80 = r81, r80, 48 shrp r81 = r104, r81, 48 shrp r82 = r83, r82, 48 shrp r83 = r105, r83, 48 shrp r84 = r85, r84, 48 shrp r85 = r106, r85, 48 shrp r86 = r87, r86, 48 shrp r87 = r107, r87, 48 shrp r88 = r89, r88, 48 shrp r89 = r108, r89, 48 shrp r90 = r91, r90, 48 shrp r91 = r109, r91, 48 shrp r92 = r93, r92, 48 shrp r93 = r110, r93, 48 shrp r94 
= r95, r94, 48 shrp r95 = r111, r95, 48 br.cond.sptk.many .Lber ;; .Lmod5: shrp r64 = r65, r64, 40 shrp r65 = r96, r65, 40 shrp r66 = r67, r66, 40 shrp r67 = r97, r67, 40 shrp r68 = r69, r68, 40 shrp r69 = r98, r69, 40 shrp r70 = r71, r70, 40 shrp r71 = r99, r71, 40 shrp r72 = r73, r72, 40 shrp r73 = r100, r73, 40 shrp r74 = r75, r74, 40 shrp r75 = r101, r75, 40 shrp r76 = r77, r76, 40 shrp r77 = r102, r77, 40 shrp r78 = r79, r78, 40 shrp r79 = r103, r79, 40 shrp r80 = r81, r80, 40 shrp r81 = r104, r81, 40 shrp r82 = r83, r82, 40 shrp r83 = r105, r83, 40 shrp r84 = r85, r84, 40 shrp r85 = r106, r85, 40 shrp r86 = r87, r86, 40 shrp r87 = r107, r87, 40 shrp r88 = r89, r88, 40 shrp r89 = r108, r89, 40 shrp r90 = r91, r90, 40 shrp r91 = r109, r91, 40 shrp r92 = r93, r92, 40 shrp r93 = r110, r93, 40 shrp r94 = r95, r94, 40 shrp r95 = r111, r95, 40 br.cond.sptk.many .Lber ;; .Lmod4: shrp r64 = r65, r64, 32 shrp r65 = r96, r65, 32 shrp r66 = r67, r66, 32 shrp r67 = r97, r67, 32 shrp r68 = r69, r68, 32 shrp r69 = r98, r69, 32 shrp r70 = r71, r70, 32 shrp r71 = r99, r71, 32 shrp r72 = r73, r72, 32 shrp r73 = r100, r73, 32 shrp r74 = r75, r74, 32 shrp r75 = r101, r75, 32 shrp r76 = r77, r76, 32 shrp r77 = r102, r77, 32 shrp r78 = r79, r78, 32 shrp r79 = r103, r79, 32 shrp r80 = r81, r80, 32 shrp r81 = r104, r81, 32 shrp r82 = r83, r82, 32 shrp r83 = r105, r83, 32 shrp r84 = r85, r84, 32 shrp r85 = r106, r85, 32 shrp r86 = r87, r86, 32 shrp r87 = r107, r87, 32 shrp r88 = r89, r88, 32 shrp r89 = r108, r89, 32 shrp r90 = r91, r90, 32 shrp r91 = r109, r91, 32 shrp r92 = r93, r92, 32 shrp r93 = r110, r93, 32 shrp r94 = r95, r94, 32 shrp r95 = r111, r95, 32 br.cond.sptk.many .Lber ;; .Lmod3: shrp r64 = r65, r64, 24 shrp r65 = r96, r65, 24 shrp r66 = r67, r66, 24 shrp r67 = r97, r67, 24 shrp r68 = r69, r68, 24 shrp r69 = r98, r69, 24 shrp r70 = r71, r70, 24 shrp r71 = r99, r71, 24 shrp r72 = r73, r72, 24 shrp r73 = r100, r73, 24 shrp r74 = r75, r74, 24 shrp r75 = r101, r75, 24 shrp 
r76 = r77, r76, 24 shrp r77 = r102, r77, 24 shrp r78 = r79, r78, 24 shrp r79 = r103, r79, 24 shrp r80 = r81, r80, 24 shrp r81 = r104, r81, 24 shrp r82 = r83, r82, 24 shrp r83 = r105, r83, 24 shrp r84 = r85, r84, 24 shrp r85 = r106, r85, 24 shrp r86 = r87, r86, 24 shrp r87 = r107, r87, 24 shrp r88 = r89, r88, 24 shrp r89 = r108, r89, 24 shrp r90 = r91, r90, 24 shrp r91 = r109, r91, 24 shrp r92 = r93, r92, 24 shrp r93 = r110, r93, 24 shrp r94 = r95, r94, 24 shrp r95 = r111, r95, 24 br.cond.sptk.many .Lber ;; .Lmod2: shrp r64 = r65, r64, 16 shrp r65 = r96, r65, 16 shrp r66 = r67, r66, 16 shrp r67 = r97, r67, 16 shrp r68 = r69, r68, 16 shrp r69 = r98, r69, 16 shrp r70 = r71, r70, 16 shrp r71 = r99, r71, 16 shrp r72 = r73, r72, 16 shrp r73 = r100, r73, 16 shrp r74 = r75, r74, 16 shrp r75 = r101, r75, 16 shrp r76 = r77, r76, 16 shrp r77 = r102, r77, 16 shrp r78 = r79, r78, 16 shrp r79 = r103, r79, 16 shrp r80 = r81, r80, 16 shrp r81 = r104, r81, 16 shrp r82 = r83, r82, 16 shrp r83 = r105, r83, 16 shrp r84 = r85, r84, 16 shrp r85 = r106, r85, 16 shrp r86 = r87, r86, 16 shrp r87 = r107, r87, 16 shrp r88 = r89, r88, 16 shrp r89 = r108, r89, 16 shrp r90 = r91, r90, 16 shrp r91 = r109, r91, 16 shrp r92 = r93, r92, 16 shrp r93 = r110, r93, 16 shrp r94 = r95, r94, 16 shrp r95 = r111, r95, 16 br.cond.sptk.many .Lber ;; .Lmod1: shrp r64 = r65, r64, 8 shrp r65 = r96, r65, 8 shrp r66 = r67, r66, 8 shrp r67 = r97, r67, 8 shrp r68 = r69, r68, 8 shrp r69 = r98, r69, 8 shrp r70 = r71, r70, 8 shrp r71 = r99, r71, 8 shrp r72 = r73, r72, 8 shrp r73 = r100, r73, 8 shrp r74 = r75, r74, 8 shrp r75 = r101, r75, 8 shrp r76 = r77, r76, 8 shrp r77 = r102, r77, 8 shrp r78 = r79, r78, 8 shrp r79 = r103, r79, 8 shrp r80 = r81, r80, 8 shrp r81 = r104, r81, 8 shrp r82 = r83, r82, 8 shrp r83 = r105, r83, 8 shrp r84 = r85, r84, 8 shrp r85 = r106, r85, 8 shrp r86 = r87, r86, 8 shrp r87 = r107, r87, 8 shrp r88 = r89, r88, 8 shrp r89 = r108, r89, 8 shrp r90 = r91, r90, 8 shrp r91 = r109, r91, 8 shrp r92 = 
r93, r92, 8 shrp r93 = r110, r93, 8 shrp r94 = r95, r94, 8 shrp r95 = r111, r95, 8 .Lber: ;; psad1 r32 = r32, r64 // Here we do the calculation. psad1 r33 = r33, r65 // The machine is providing a fast method psad1 r34 = r34, r66 // for calculating sad, so we use it psad1 r35 = r35, r67 psad1 r36 = r36, r68 psad1 r37 = r37, r69 psad1 r38 = r38, r70 psad1 r39 = r39, r71 psad1 r40 = r40, r72 psad1 r41 = r41, r73 psad1 r42 = r42, r74 psad1 r43 = r43, r75 psad1 r44 = r44, r76 psad1 r45 = r45, r77 psad1 r46 = r46, r78 psad1 r47 = r47, r79 psad1 r48 = r48, r80 psad1 r49 = r49, r81 psad1 r50 = r50, r82 psad1 r51 = r51, r83 psad1 r52 = r52, r84 psad1 r53 = r53, r85 psad1 r54 = r54, r86 psad1 r55 = r55, r87 psad1 r56 = r56, r88 psad1 r57 = r57, r89 psad1 r58 = r58, r90 psad1 r59 = r59, r91 psad1 r60 = r60, r92 psad1 r61 = r61, r93 psad1 r62 = r62, r94 psad1 r63 = r63, r95 ;; add r32 = r32, r63 // at last, we have to sum up add r33 = r33, r62 // in 5 stages add r34 = r34, r61 add r35 = r35, r60 add r36 = r36, r59 add r37 = r37, r58 add r38 = r38, r57 add r39 = r39, r56 add r40 = r40, r55 add r41 = r41, r54 add r42 = r42, r53 add r43 = r43, r52 add r44 = r44, r51 add r45 = r45, r50 add r46 = r46, r49 add r47 = r47, r48 ;; add r32 = r32, r47 add r33 = r33, r46 add r34 = r34, r45 add r35 = r35, r44 add r36 = r36, r43 add r37 = r37, r42 add r38 = r38, r41 add r39 = r39, r40 ;; add r32 = r32, r39 add r33 = r33, r38 add r34 = r34, r37 add r35 = r35, r36 ;; add r32 = r32, r35 add r33 = r33, r34 ;; add r8 = r32, r33 // and store the result in r8 mov pr = r2, -1 mov ar.pfs = r1 br.ret.sptk.many b0 .endp sad16_ia64# .align 16 .global sad8_ia64# .proc sad8_ia64# sad8_ia64: alloc r1 = ar.pfs, 3, 21, 0, 0 mov r2 = pr dep r14 = r0, r33, 0, 3 // calculate aligned version of ref dep.z r31 = r33, 0, 3 // calculate misalignment of ref ;; mov r40 = r34 //(1) calculate multiples of stride shl r41 = r34, 1 //(2) shladd r42 = r34, 1, r34 //(3) shl r43 = r34, 2 //(4) shladd r44 = r34, 2, r34 //(5) 
;; cmp.eq p16, p17 = 0, r31 // set predicates according to the misalignment of ref cmp.eq p18, p19 = 2, r31 shl r45 = r42, 1 //(6) cmp.eq p20, p21 = 4, r31 cmp.eq p22, p23 = 6, r31 shladd r46 = r42, 1, r34 //(7) cmp.eq p24, p25 = 1, r31 cmp.eq p26, p27 = 3, r31 cmp.eq p28, p29 = 5, r31 ;; mov r48 = r14 // calculate memory addresses of data add r33 = r32, r40 add r49 = r14, r40 add r34 = r32, r41 add r50 = r14, r41 add r35 = r32, r42 add r51 = r14, r42 add r36 = r32, r43 add r52 = r14, r43 add r37 = r32, r44 add r53 = r14, r44 add r38 = r32, r45 add r54 = r14, r45 add r39 = r32, r46 add r55 = r14, r46 ;; ld8 r32 = [r32] // load everything ld8 r33 = [r33] // cur is located in r32 - r39 ld8 r34 = [r34] // ref in r40 - r47 ld8 r35 = [r35] ld8 r36 = [r36] ld8 r37 = [r37] ld8 r38 = [r38] ld8 r39 = [r39] ld8 r40 = [r48] ,8 ld8 r41 = [r49] ,8 ld8 r42 = [r50] ,8 ld8 r43 = [r51] ,8 ld8 r44 = [r52] ,8 ld8 r45 = [r53] ,8 ld8 r46 = [r54] ,8 ld8 r47 = [r55] ,8 (p16) br.cond.dptk.many .Lber2 // if ref is aligned, we can start the calculation ;; ld8 r48 = [r48] // if not, we have to load some more ld8 r49 = [r49] // because of the alignment of ld8 ld8 r50 = [r50] ld8 r51 = [r51] ld8 r52 = [r52] ld8 r53 = [r53] ld8 r54 = [r54] ld8 r55 = [r55] (p24) br.cond.dptk.many .Lmode1 (p18) br.cond.dpnt.many .Lmode2 (p26) br.cond.dpnt.many .Lmode3 (p20) br.cond.dpnt.many .Lmode4 (p28) br.cond.dpnt.many .Lmode5 (p22) br.cond.dpnt.many .Lmode6 ;; .Lmode7: // this jump point is not needed; it is only here for readability shrp r40 = r48, r40, 56 // here we do some preprocessing on the data shrp r41 = r49, r41, 56 // this is because of the alignment problem of ref shrp r42 = r50, r42, 56 shrp r43 = r51, r43, 56 shrp r44 = r52, r44, 56 shrp r45 = r53, r45, 56 shrp r46 = r54, r46, 56 shrp r47 = r55, r47, 56 br.cond.sptk.many .Lber2 ;; .Lmode6: shrp r40 = r48, r40, 48 shrp r41 = r49, r41, 48 shrp r42 = r50, r42, 48 shrp r43 = r51, r43, 48 shrp r44 = r52, r44, 48 shrp r45 = r53, 
r45, 48 shrp r46 = r54, r46, 48 shrp r47 = r55, r47, 48 br.cond.sptk.many .Lber2 ;; .Lmode5: shrp r40 = r48, r40, 40 shrp r41 = r49, r41, 40 shrp r42 = r50, r42, 40 shrp r43 = r51, r43, 40 shrp r44 = r52, r44, 40 shrp r45 = r53, r45, 40 shrp r46 = r54, r46, 40 shrp r47 = r55, r47, 40 br.cond.sptk.many .Lber2 ;; .Lmode4: shrp r40 = r48, r40, 32 shrp r41 = r49, r41, 32 shrp r42 = r50, r42, 32 shrp r43 = r51, r43, 32 shrp r44 = r52, r44, 32 shrp r45 = r53, r45, 32 shrp r46 = r54, r46, 32 shrp r47 = r55, r47, 32 br.cond.sptk.many .Lber2 ;; .Lmode3: shrp r40 = r48, r40, 24 shrp r41 = r49, r41, 24 shrp r42 = r50, r42, 24 shrp r43 = r51, r43, 24 shrp r44 = r52, r44, 24 shrp r45 = r53, r45, 24 shrp r46 = r54, r46, 24 shrp r47 = r55, r47, 24 br.cond.sptk.many .Lber2 ;; .Lmode2: shrp r40 = r48, r40, 16 shrp r41 = r49, r41, 16 shrp r42 = r50, r42, 16 shrp r43 = r51, r43, 16 shrp r44 = r52, r44, 16 shrp r45 = r53, r45, 16 shrp r46 = r54, r46, 16 shrp r47 = r55, r47, 16 br.cond.sptk.many .Lber2 ;; .Lmode1: shrp r40 = r48, r40, 8 shrp r41 = r49, r41, 8 shrp r42 = r50, r42, 8 shrp r43 = r51, r43, 8 shrp r44 = r52, r44, 8 shrp r45 = r53, r45, 8 shrp r46 = r54, r46, 8 shrp r47 = r55, r47, 8 .Lber2: ;; psad1 r32 = r32, r40 // we start calculating sad psad1 r33 = r33, r41 // using the psad1 instruction of IA-64 psad1 r34 = r34, r42 psad1 r35 = r35, r43 psad1 r36 = r36, r44 psad1 r37 = r37, r45 psad1 r38 = r38, r46 psad1 r39 = r39, r47 ;; add r32 = r32, r33 // then we sum up everything add r33 = r34, r35 add r34 = r36, r37 add r35 = r38, r39 ;; add r32 = r32, r33 add r33 = r34, r35 ;; add r8 = r32, r33 // and store the result in r8 mov pr = r2, -1 mov ar.pfs = r1 br.ret.sptk.many b0 .endp sad8_ia64# xvidcore/src/motion/ia64_asm/calc_delta_2.s0000664000076500007650000001040011147310721021632 0ustar xvidbuildxvidbuild// **************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 halfpel refinement - // * // * Copyright(C) 2002 
Johannes Singler, Daniel Winkler // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. // * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: calc_delta_2.s,v 1.2 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * calc_delta_2.s, IA-64 halfpel refinement // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** (non0_2) mov sc[0] = 1 (non0_3) mov sc[1] = 1 ;; add mpr[0] = mpr[0], mpr[1] (non0_2) shl sc[0] = sc[0], iFcode add mpr[2] = mpr[2], mpr[3] (non0_3) shl sc[1] = sc[1], iFcode add mpr[4] = mpr[4], mpr[5] add mpr[6] = mpr[6], mpr[7] ;; (non0_2) add sc[0] = -1, sc[0] (non0_3) add sc[1] = -1, sc[1] mov ret0 = 2 ;; (non0_2) add component[0] = component[0], sc[0] (non0_3) add component[1] = component[1], sc[1] ;; (non0_2) shr component[0] = component[0], iFcode (non0_3) shr component[1] = component[1], iFcode add mpr[0] = mpr[0], mpr[2] add mpr[4] = mpr[4], mpr[6] ;; (non0_2) cmp.lt cg32_0, p0 = 32, component[0] (non0_3) cmp.lt cg32_1, p0 = 32, component[1] ;; (cg32_0) mov component[0] = 32 (cg32_1) 
mov component[1] = 32 ;; (non0_2) addl tabaddress[0] = @gprel(mvtab#), gp (non0_3) addl tabaddress[1] = @gprel(mvtab#), gp ;; (non0_2) shladd tabaddress[0] = component[0], 2, tabaddress[0] (non0_3) shladd tabaddress[1] = component[1], 2, tabaddress[1] ;; (non0_2) ld4 sc[0] = [tabaddress[0]] (non0_3) ld4 sc[1] = [tabaddress[1]] mov component[0] = dx mov component[1] = dy cmp.ne non0_0, p0 = 0, dx cmp.gt neg_0, p0 = 0, dx .pred.rel "mutex", p30, p34 //non0_0, neg_0 cmp.ne non0_1, p0 = 0, dy cmp.gt neg_1, p0 = 0, dy ;; .pred.rel "mutex", p31, p35 //non0_1, neg_1 (non0_2) add sc[0] = iFcode, sc[0] (non0_3) add sc[1] = iFcode, sc[1] ;; (non0_2) add ret0 = ret0, sc[0] (neg_0) sub component[0] = 0, component[0] //abs (neg_1) sub component[1] = 0, component[1] //abs ;; (non0_3) add ret0 = ret0, sc[1] add iSAD = mpr[0], mpr[4] ;; .explicit {.mii setf.sig fmv = ret0 (non0_0) mov sc[0] = 1 (non0_1) mov sc[1] = 1 ;; } {.mfb xmpy.l fmv = fmv, fQuant } {.mii (non0_0) shl sc[0] = sc[0], iFcode (non0_1) shl sc[1] = sc[1], iFcode ;; } .default (non0_0) add sc[0] = -1, sc[0] (non0_1) add sc[1] = -1, sc[1] ;; (non0_0) add component[0] = component[0], sc[0] (non0_1) add component[1] = component[1], sc[1] ;; (non0_0) shr component[0] = component[0], iFcode (non0_1) shr component[1] = component[1], iFcode ;; (non0_0) cmp.lt cg32_0, p0 = 32, component[0] (non0_1) cmp.lt cg32_1, p0 = 32, component[1] ;; (cg32_0) mov component[0] = 32 (cg32_1) mov component[1] = 32 ;; (non0_0) addl tabaddress[0] = @gprel(mvtab#), gp (non0_1) addl tabaddress[1] = @gprel(mvtab#), gp ;; (non0_0) shladd tabaddress[0] = component[0], 2, tabaddress[0] (non0_1) shladd tabaddress[1] = component[1], 2, tabaddress[1] getf.sig ret0 = fmv ;; (non0_0) ld4 sc[0] = [tabaddress[0]] (non0_1) ld4 sc[1] = [tabaddress[1]] add mpr[8] = mpr[8], ret0 ;; xvidcore/src/motion/ia64_asm/halfpel8_refine_ia64.s0000664000076500007650000005007511147310721023230 0ustar xvidbuildxvidbuild// 
**************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 halfpel refinement - // * // * Copyright(C) 2002 Johannes Singler, Daniel Winkler // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. // * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: halfpel8_refine_ia64.s,v 1.4 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * halfpel8_refine_ia64.s, IA-64 halfpel refinement // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** // ------------------------------------------------------------------------------ // * Programmed by // * Johannes Singler (email@jsingler.de), Daniel Winkler (infostudent@uni.de) // * // * Programmed for the IA64 laboratory held at University Karlsruhe 2002 // * http://www.info.uni-karlsruhe.de/~rubino/ia64p/ // * // ------------------------------------------------------------------------------ // * // * This is the optimized assembler version of Halfpel8_Refine. 
This function // * is worth optimizing for the IA-64 architecture because of the huge // * register set. We can hold all necessary data in general-purpose registers // * and reuse it. // * // * Our approach uses: // * - The Itanium instruction psad1, which solves the problem in hardware. // * - Alignment resolving to avoid memory faults // * - Massive loop unrolling // * // ------------------------------------------------------------------------------ // * // * ------- Half-pixel steps around the center (*) and corresponding // * |0|1|0| register set parts. // * ------- // * |2|*|2| // * ------- // * |0|1|0| // * ------- // * // ------------------------------------------------------------------------------ // * calc_delta is split up in three parts which are included from // * // * calc_delta_1.s // * calc_delta_2.s // * calc_delta_3.s // * // ------------------------------------------------------------------------------ // * We assume min_dx <= currX <= max_dx && min_dy <= currY <= max_dy .sdata .align 4 .type lambda_vec8#,@object .size lambda_vec8#,128 lambda_vec8: data4 0 data4 1 data4 1 data4 1 data4 1 data4 2 data4 2 data4 2 data4 2 data4 3 data4 3 data4 3 data4 4 data4 4 data4 4 data4 5 data4 5 data4 6 data4 7 data4 7 data4 8 data4 9 data4 10 data4 11 data4 13 data4 14 data4 16 data4 18 data4 21 data4 25 data4 30 data4 36 .type mvtab#,@object .size mvtab#,132 mvtab: data4 1 data4 2 data4 3 data4 4 data4 6 data4 7 data4 7 data4 7 data4 9 data4 9 data4 9 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 10 data4 11 data4 11 data4 11 data4 11 data4 11 data4 11 data4 12 data4 12 .text .align 16 .global Halfpel8_Refine_ia64# .proc Halfpel8_Refine_ia64# Halfpel8_Refine_ia64: pfs = r14 prsave = r15 // Save important registers alloc pfs = ar.pfs, 18, 74, 4, 96 mov prsave = pr // Naming registers for better readability pRef = in0 pRefH = in1 pRefV = in2 pRefHV = in3 cura = in4 x = in5 y = in6 
currMV = in7 iMinSAD = in8 dx = in9 dy = in10 min_dx = in11 max_dx = in12 min_dy = in13 max_dy = in14 iFcode = in15 iQuant = in16 iEdgedWidth = in17 iSAD = r17 backupX = r18 backupY = r19 currX = r20 currY = r21 currYAddress = r22 bitX0 = r23 bitY0 = r24 dxd2 = r25 dyd2 = r26 offset = r27 block = r28 nob02 = r29 nob1 = r30 nob64m02 = r31 nob64m1 = r127 const7 = r126 nob56m02 = r125 oldX = r124 oldY = r123 .rotr inregisters[18], refaa[3], refab[3], cur[8], ref0a[9], ref0b[9], ref1a[9], mpr[9], ref2a[8], ref2b[8], component[2], sc[2], tabaddress[2] fx = f8 fy = f9 fblock = f10 fiEdgedWidth = f11 fdxd2 = f12 fdyd2 = f13 foffset = f14 fydiEdgedWidth = f15 fQuant = f32 fmv = f33 n = p16 h = p17 v = p18 hv = p19 l = p20 r = p21 t = p22 b = p23 lt = p24 lb = p25 rt = p26 rb = p27 fb = p28 non0_0 = p30 non0_1 = p31 non0_2 = p32 non0_3 = p33 neg_0 = p34 neg_1 = p35 neg_2 = p36 neg_3 = p37 cg32_0 = p29 cg32_1 = p38 // Initialize input variables add sp = 16, sp ;; ld4 iMinSAD = [sp], 8 ;; sxt4 iMinSAD = iMinSAD ld4 dx = [sp], 8 ;; sxt4 dx = dx ld4 dy = [sp], 8 ;; sxt4 dy = dy ld4 min_dx = [sp], 8 ;; sxt4 min_dx = min_dx ld4 max_dx = [sp], 8 ;; sxt4 max_dx = max_dx ld4 min_dy = [sp], 8 ;; sxt4 min_dy = min_dy ld4 max_dy = [sp], 8 ;; sxt4 max_dy = max_dy ld4 iFcode = [sp], 8 ;; sxt4 iFcode = iFcode ld4 iQuant = [sp], 8 add tabaddress[0] = @gprel(lambda_vec8#), gp ;; shladd tabaddress[0] = iQuant, 2, tabaddress[0] ;; ld4 iQuant = [tabaddress[0]] ;; sxt4 iQuant = iQuant ;; add iFcode = -1, iFcode //only used in decreased version shl iQuant = iQuant, 1 ;; setf.sig fQuant = iQuant ld4 iEdgedWidth = [sp] add sp = -88, sp // Initialize local variables ld4 currX = [currMV] add currYAddress = 4, currMV ;; sxt4 currX = currX ld4 currY = [currYAddress] ;; sxt4 currY = currY ;; // Calculate references cmp.gt l, p0 = currX, min_dx cmp.lt r, p0 = currX, max_dx cmp.gt t, p0 = currY, min_dy cmp.lt b, p0 = currY, max_dy add backupX = -1, currX //move to left upper corner of quadrate add 
backupY = -1, currY ;; (b) cmp.gt.unc lb, p0 = currX, min_dx (t) cmp.lt.unc rt, p0 = currX, max_dx (l) cmp.gt.unc lt, p0 = currY, min_dy (r) cmp.lt.unc rb, p0 = currY, max_dy and bitX0 = 1, backupX and bitY0 = 1, backupY ;; cmp.eq n, p0 = 0, bitX0 cmp.eq h, p0 = 1, bitX0 cmp.eq v, p0 = 0, bitX0 cmp.eq hv, p0 = 1, bitX0 ;; cmp.eq.and n, p0 = 0, bitY0 cmp.eq.and h, p0 = 0, bitY0 cmp.eq.and v, p0 = 1, bitY0 cmp.eq.and hv, p0 = 1, bitY0 ;; .pred.rel "mutex", p16, p17, p18, p19 //n, h, v, hv (n) mov refaa[0] = pRef (h) mov refaa[0] = pRefH (v) mov refaa[0] = pRefV (hv) mov refaa[0] = pRefHV (n) mov refaa[1] = pRefH (h) mov refaa[1] = pRef (v) mov refaa[1] = pRefHV (hv) mov refaa[1] = pRefV (n) mov refaa[2] = pRefV (h) mov refaa[2] = pRefHV (v) mov refaa[2] = pRef (hv) mov refaa[2] = pRefH // Calculate offset (integer multiplication on IA-64 sucks!) mov block = 8 shr dxd2 = backupX, 1 shr dyd2 = backupY, 1 setf.sig fx = x setf.sig fy = y ;; setf.sig fblock = block setf.sig fiEdgedWidth = iEdgedWidth ;; setf.sig fdxd2 = dxd2 setf.sig fdyd2 = dyd2 ;; xma.l foffset = fx, fblock, fdxd2 xma.l fydiEdgedWidth = fy, fblock, fdyd2 ;; xma.l foffset = fydiEdgedWidth, fiEdgedWidth, foffset ;; getf.sig offset = foffset ;; add refaa[0] = refaa[0], offset add refaa[1] = refaa[1], offset add refaa[2] = refaa[2], offset ;; (h) add refaa[1] = 1, refaa[1] (hv) add refaa[1] = 1, refaa[1] (v) add refaa[2] = iEdgedWidth, refaa[2] (hv) add refaa[2] = iEdgedWidth, refaa[2] // Load respecting misalignment of refx... 
mov const7 = 7 ;; dep.z nob02 = refaa[0], 3, 3 dep.z nob1 = refaa[1], 3, 3 ;; andcm refaa[0] = refaa[0], const7 // set last 3 bits = 0 andcm refaa[1] = refaa[1], const7 andcm refaa[2] = refaa[2], const7 ;; add refab[0] = 8, refaa[0] add refab[1] = 8, refaa[1] add refab[2] = 8, refaa[2] ;; ld8 cur[0] = [cura], iEdgedWidth ld8 ref0a[0] = [refaa[0]], iEdgedWidth sub nob64m02 = 64, nob02 // 64 - nob ld8 ref0b[0] = [refab[0]], iEdgedWidth ld8 ref1a[0] = [refaa[1]], iEdgedWidth sub nob56m02 = 56, nob02 // 56 - nob ld8 mpr[0] = [refab[1]], iEdgedWidth ld8 ref2a[0] = [refaa[2]], iEdgedWidth sub nob64m1 = 64, nob1 ld8 ref2b[0] = [refab[2]], iEdgedWidth ;; ld8 cur[1] = [cura], iEdgedWidth ld8 ref0a[1] = [refaa[0]], iEdgedWidth ld8 ref0b[1] = [refab[0]], iEdgedWidth ld8 ref1a[1] = [refaa[1]], iEdgedWidth ld8 mpr[1] = [refab[1]], iEdgedWidth ld8 ref2a[1] = [refaa[2]], iEdgedWidth ld8 ref2b[1] = [refab[2]], iEdgedWidth ;; ld8 cur[2] = [cura], iEdgedWidth ld8 ref0a[2] = [refaa[0]], iEdgedWidth ld8 ref0b[2] = [refab[0]], iEdgedWidth ld8 ref1a[2] = [refaa[1]], iEdgedWidth ld8 mpr[2] = [refab[1]], iEdgedWidth ld8 ref2a[2] = [refaa[2]], iEdgedWidth ld8 ref2b[2] = [refab[2]], iEdgedWidth ;; ld8 cur[3] = [cura], iEdgedWidth ld8 ref0a[3] = [refaa[0]], iEdgedWidth ld8 ref0b[3] = [refab[0]], iEdgedWidth ld8 ref1a[3] = [refaa[1]], iEdgedWidth ld8 mpr[3] = [refab[1]], iEdgedWidth ld8 ref2a[3] = [refaa[2]], iEdgedWidth ld8 ref2b[3] = [refab[2]], iEdgedWidth ;; ld8 cur[4] = [cura], iEdgedWidth ld8 ref0a[4] = [refaa[0]], iEdgedWidth ld8 ref0b[4] = [refab[0]], iEdgedWidth ld8 ref1a[4] = [refaa[1]], iEdgedWidth ld8 mpr[4] = [refab[1]], iEdgedWidth ld8 ref2a[4] = [refaa[2]], iEdgedWidth ld8 ref2b[4] = [refab[2]], iEdgedWidth ;; ld8 cur[5] = [cura], iEdgedWidth ld8 ref0a[5] = [refaa[0]], iEdgedWidth ld8 ref0b[5] = [refab[0]], iEdgedWidth ld8 ref1a[5] = [refaa[1]], iEdgedWidth ld8 mpr[5] = [refab[1]], iEdgedWidth ld8 ref2a[5] = [refaa[2]], iEdgedWidth ld8 ref2b[5] = [refab[2]], iEdgedWidth ;; ld8 
cur[6] = [cura], iEdgedWidth ld8 ref0a[6] = [refaa[0]], iEdgedWidth ld8 ref0b[6] = [refab[0]], iEdgedWidth ld8 ref1a[6] = [refaa[1]], iEdgedWidth ld8 mpr[6] = [refab[1]], iEdgedWidth ld8 ref2a[6] = [refaa[2]], iEdgedWidth ld8 ref2b[6] = [refab[2]], iEdgedWidth ;; ld8 cur[7] = [cura] ld8 ref0a[7] = [refaa[0]], iEdgedWidth ld8 ref0b[7] = [refab[0]], iEdgedWidth ld8 ref1a[7] = [refaa[1]], iEdgedWidth ld8 mpr[7] = [refab[1]], iEdgedWidth ld8 ref2a[7] = [refaa[2]] ld8 ref2b[7] = [refab[2]] ;; ld8 ref0a[8] = [refaa[0]] ld8 ref0b[8] = [refab[0]] ld8 ref1a[8] = [refaa[1]] ld8 mpr[8] = [refab[1]] ;; // Align ref1 shr.u ref1a[0] = ref1a[0], nob1 shr.u ref1a[1] = ref1a[1], nob1 shr.u ref1a[2] = ref1a[2], nob1 shr.u ref1a[3] = ref1a[3], nob1 shr.u ref1a[4] = ref1a[4], nob1 shr.u ref1a[5] = ref1a[5], nob1 shr.u ref1a[6] = ref1a[6], nob1 shr.u ref1a[7] = ref1a[7], nob1 shr.u ref1a[8] = ref1a[8], nob1 shl mpr[0] = mpr[0], nob64m1 shl mpr[1] = mpr[1], nob64m1 shl mpr[2] = mpr[2], nob64m1 shl mpr[3] = mpr[3], nob64m1 shl mpr[4] = mpr[4], nob64m1 shl mpr[5] = mpr[5], nob64m1 shl mpr[6] = mpr[6], nob64m1 shl mpr[7] = mpr[7], nob64m1 shl mpr[8] = mpr[8], nob64m1 ;; .explicit {.mii or ref1a[0] = ref1a[0], mpr[0] shr.u ref0a[0] = ref0a[0], nob02 shr.u ref0a[1] = ref0a[1], nob02 } {.mmi or ref1a[1] = ref1a[1], mpr[1] or ref1a[2] = ref1a[2], mpr[2] shr.u ref0a[2] = ref0a[2], nob02 } {.mii or ref1a[3] = ref1a[3], mpr[3] shr.u ref0a[3] = ref0a[3], nob02 shr.u ref0a[4] = ref0a[4], nob02 } {.mmi or ref1a[4] = ref1a[4], mpr[4] or ref1a[5] = ref1a[5], mpr[5] shr.u ref0a[5] = ref0a[5], nob02 } {.mii or ref1a[6] = ref1a[6], mpr[6] shr.u ref0a[6] = ref0a[6], nob02 shr.u ref0a[7] = ref0a[7], nob02 } {.mii or ref1a[7] = ref1a[7], mpr[7] or ref1a[8] = ref1a[8], mpr[8] shr.u ref0a[8] = ref0a[8], nob02 } .default // ref1a[] now contains center position values // mpr[] not used any more // Align ref0 left ;; shl mpr[0] = ref0b[0], nob56m02 shl mpr[1] = ref0b[1], nob56m02 shl mpr[2] = ref0b[2], nob56m02 
shl mpr[3] = ref0b[3], nob56m02 shl mpr[4] = ref0b[4], nob56m02 shl mpr[5] = ref0b[5], nob56m02 shl mpr[6] = ref0b[6], nob56m02 shl mpr[7] = ref0b[7], nob56m02 shl mpr[8] = ref0b[8], nob56m02 shl ref0b[0] = ref0b[0], nob64m02 shl ref0b[1] = ref0b[1], nob64m02 shl ref0b[2] = ref0b[2], nob64m02 shl ref0b[3] = ref0b[3], nob64m02 shl ref0b[4] = ref0b[4], nob64m02 shl ref0b[5] = ref0b[5], nob64m02 shl ref0b[6] = ref0b[6], nob64m02 shl ref0b[7] = ref0b[7], nob64m02 shl ref0b[8] = ref0b[8], nob64m02 ;; or ref0a[0] = ref0a[0], ref0b[0] or ref0a[1] = ref0a[1], ref0b[1] or ref0a[2] = ref0a[2], ref0b[2] or ref0a[3] = ref0a[3], ref0b[3] or ref0a[4] = ref0a[4], ref0b[4] or ref0a[5] = ref0a[5], ref0b[5] or ref0a[6] = ref0a[6], ref0b[6] or ref0a[7] = ref0a[7], ref0b[7] or ref0a[8] = ref0a[8], ref0b[8] ;; // ref0a[] now contains left position values // mpr[] contains intermediate result for right position values (former ref0a << 56 - nob02) // Align ref0 right // Shift one byte more to the right (seen as big-endian) shr.u ref0b[0] = ref0a[0], 8 shr.u ref0b[1] = ref0a[1], 8 shr.u ref0b[2] = ref0a[2], 8 shr.u ref0b[3] = ref0a[3], 8 shr.u ref0b[4] = ref0a[4], 8 shr.u ref0b[5] = ref0a[5], 8 shr.u ref0b[6] = ref0a[6], 8 shr.u ref0b[7] = ref0a[7], 8 shr.u ref0b[8] = ref0a[8], 8 ;; .explicit {.mii or ref0b[0] = ref0b[0], mpr[0] shr.u ref2a[0] = ref2a[0], nob02 shr.u ref2a[1] = ref2a[1], nob02 } {.mmi or ref0b[1] = ref0b[1], mpr[1] or ref0b[2] = ref0b[2], mpr[2] shr.u ref2a[2] = ref2a[2], nob02 } {.mii or ref0b[3] = ref0b[3], mpr[3] shr.u ref2a[3] = ref2a[3], nob02 shr.u ref2a[4] = ref2a[4], nob02 } {.mmi or ref0b[4] = ref0b[4], mpr[4] or ref0b[5] = ref0b[5], mpr[5] shr.u ref2a[5] = ref2a[5], nob02 } {.mii or ref0b[6] = ref0b[6], mpr[6] shr.u ref2a[6] = ref2a[6], nob02 shr.u ref2a[7] = ref2a[7], nob02 } .default or ref0b[7] = ref0b[7], mpr[7] or ref0b[8] = ref0b[8], mpr[8] // ref0b[] now contains right position values // mpr[] not needed any more // Align ref2 left ;; shl mpr[0] = 
ref2b[0], nob56m02 shl mpr[1] = ref2b[1], nob56m02 shl mpr[2] = ref2b[2], nob56m02 shl mpr[3] = ref2b[3], nob56m02 shl mpr[4] = ref2b[4], nob56m02 shl mpr[5] = ref2b[5], nob56m02 shl mpr[6] = ref2b[6], nob56m02 shl mpr[7] = ref2b[7], nob56m02 shl ref2b[0] = ref2b[0], nob64m02 shl ref2b[1] = ref2b[1], nob64m02 shl ref2b[2] = ref2b[2], nob64m02 shl ref2b[3] = ref2b[3], nob64m02 shl ref2b[4] = ref2b[4], nob64m02 shl ref2b[5] = ref2b[5], nob64m02 shl ref2b[6] = ref2b[6], nob64m02 shl ref2b[7] = ref2b[7], nob64m02 ;; or ref2a[0] = ref2a[0], ref2b[0] or ref2a[1] = ref2a[1], ref2b[1] or ref2a[2] = ref2a[2], ref2b[2] or ref2a[3] = ref2a[3], ref2b[3] or ref2a[4] = ref2a[4], ref2b[4] or ref2a[5] = ref2a[5], ref2b[5] or ref2a[6] = ref2a[6], ref2b[6] or ref2a[7] = ref2a[7], ref2b[7] ;; // ref2a[] now contains left position values // mpr[] contains intermediate result for right position values (former ref2a << 56 - nob02) // Align ref2 right // Shift one byte more to the right (seen as big-endian) shr.u ref2b[0] = ref2a[0], 8 shr.u ref2b[1] = ref2a[1], 8 shr.u ref2b[2] = ref2a[2], 8 shr.u ref2b[3] = ref2a[3], 8 shr.u ref2b[4] = ref2a[4], 8 shr.u ref2b[5] = ref2a[5], 8 shr.u ref2b[6] = ref2a[6], 8 shr.u ref2b[7] = ref2a[7], 8 ;; or ref2b[0] = ref2b[0], mpr[0] or ref2b[1] = ref2b[1], mpr[1] or ref2b[2] = ref2b[2], mpr[2] or ref2b[3] = ref2b[3], mpr[3] or ref2b[4] = ref2b[4], mpr[4] or ref2b[5] = ref2b[5], mpr[5] or ref2b[6] = ref2b[6], mpr[6] or ref2b[7] = ref2b[7], mpr[7] // ref2b[] now contains right position values // mpr[] not needed any more // Let's SAD // Left top corner sub dx = backupX, dx psad1 mpr[0] = cur[0], ref0a[0] psad1 mpr[1] = cur[1], ref0a[1] sub dy = backupY, dy psad1 mpr[2] = cur[2], ref0a[2] psad1 mpr[3] = cur[3], ref0a[3] psad1 mpr[4] = cur[4], ref0a[4] psad1 mpr[5] = cur[5], ref0a[5] psad1 mpr[6] = cur[6], ref0a[6] psad1 mpr[7] = cur[7], ref0a[7] ;; .include "../../src/motion/ia64_asm/calc_delta_1.s" // Top edge psad1 mpr[0] = cur[0], ref1a[0] psad1 
mpr[1] = cur[1], ref1a[1] psad1 mpr[2] = cur[2], ref1a[2] psad1 mpr[3] = cur[3], ref1a[3] psad1 mpr[4] = cur[4], ref1a[4] add dx = 1, dx psad1 mpr[5] = cur[5], ref1a[5] psad1 mpr[6] = cur[6], ref1a[6] psad1 mpr[7] = cur[7], ref1a[7] ;; .include "../../src/motion/ia64_asm/calc_delta_2.s" (lt) cmp.lt.unc fb, p0 = mpr[8], iMinSAD .include "../../src/motion/ia64_asm/calc_delta_3.s" // Right top corner psad1 mpr[0] = cur[0], ref0b[0] psad1 mpr[1] = cur[1], ref0b[1] psad1 mpr[2] = cur[2], ref0b[2] psad1 mpr[3] = cur[3], ref0b[3] psad1 mpr[4] = cur[4], ref0b[4] add backupX = 1, backupX psad1 mpr[5] = cur[5], ref0b[5] psad1 mpr[6] = cur[6], ref0b[6] add dx = 1, dx psad1 mpr[7] = cur[7], ref0b[7] ;; .include "../../src/motion/ia64_asm/calc_delta_1.s" (t) cmp.lt.unc fb, p0 = iSAD, iMinSAD ;; // Left edge (fb) mov iMinSAD = iSAD psad1 mpr[0] = cur[0], ref2a[0] (fb) mov currX = backupX psad1 mpr[1] = cur[1], ref2a[1] psad1 mpr[2] = cur[2], ref2a[2] (fb) mov currY = backupY psad1 mpr[3] = cur[3], ref2a[3] psad1 mpr[4] = cur[4], ref2a[4] add backupX = 1, backupX psad1 mpr[5] = cur[5], ref2a[5] psad1 mpr[6] = cur[6], ref2a[6] psad1 mpr[7] = cur[7], ref2a[7] add dx = -2, dx add dy = 1, dy ;; .include "../../src/motion/ia64_asm/calc_delta_2.s" (rt) cmp.lt.unc fb, p0 = mpr[8], iMinSAD .include "../../src/motion/ia64_asm/calc_delta_3.s" // Right edge psad1 mpr[0] = cur[0], ref2b[0] psad1 mpr[1] = cur[1], ref2b[1] psad1 mpr[2] = cur[2], ref2b[2] psad1 mpr[3] = cur[3], ref2b[3] psad1 mpr[4] = cur[4], ref2b[4] add backupX = -2, backupX psad1 mpr[5] = cur[5], ref2b[5] psad1 mpr[6] = cur[6], ref2b[6] add backupY = 1, backupY add dx = 2, dx psad1 mpr[7] = cur[7], ref2b[7] ;; .include "../../src/motion/ia64_asm/calc_delta_1.s" (l) cmp.lt.unc fb, p0 = iSAD, iMinSAD ;; // Left bottom corner (fb) mov iMinSAD = iSAD psad1 mpr[0] = cur[0], ref0a[1] (fb) mov currX = backupX psad1 mpr[1] = cur[1], ref0a[2] psad1 mpr[2] = cur[2], ref0a[3] (fb) mov currY = backupY psad1 mpr[3] = cur[3], ref0a[4] 
psad1 mpr[4] = cur[4], ref0a[5] add backupX = 2, backupX psad1 mpr[5] = cur[5], ref0a[6] psad1 mpr[6] = cur[6], ref0a[7] psad1 mpr[7] = cur[7], ref0a[8] add dx = -2, dx add dy = 1, dy ;; .include "../../src/motion/ia64_asm/calc_delta_2.s" (r) cmp.lt.unc fb, p0 = mpr[8], iMinSAD .include "../../src/motion/ia64_asm/calc_delta_3.s" // Bottom edge psad1 mpr[0] = cur[0], ref1a[1] psad1 mpr[1] = cur[1], ref1a[2] psad1 mpr[2] = cur[2], ref1a[3] psad1 mpr[3] = cur[3], ref1a[4] psad1 mpr[4] = cur[4], ref1a[5] add backupX = -2, backupX psad1 mpr[5] = cur[5], ref1a[6] psad1 mpr[6] = cur[6], ref1a[7] add backupY = 1, backupY add dx = 1, dx psad1 mpr[7] = cur[7], ref1a[8] ;; .include "../../src/motion/ia64_asm/calc_delta_1.s" (lb) cmp.lt.unc fb, p0 = iSAD, iMinSAD ;; // Right bottom corner (fb) mov iMinSAD = iSAD psad1 mpr[0] = cur[0], ref0b[1] (fb) mov currX = backupX psad1 mpr[1] = cur[1], ref0b[2] psad1 mpr[2] = cur[2], ref0b[3] (fb) mov currY = backupY psad1 mpr[3] = cur[3], ref0b[4] psad1 mpr[4] = cur[4], ref0b[5] add backupX = 1, backupX psad1 mpr[5] = cur[5], ref0b[6] psad1 mpr[6] = cur[6], ref0b[7] add dx = 1, dx psad1 mpr[7] = cur[7], ref0b[8] ;; .include "../../src/motion/ia64_asm/calc_delta_2.s" (b) cmp.lt.unc fb, p0 = mpr[8], iMinSAD .include "../../src/motion/ia64_asm/calc_delta_3.s" (rb) getf.sig ret0 = fmv add backupX = 1, backupX ;; (rb) add iSAD = iSAD, ret0 ;; (rb) cmp.lt.unc fb, p0 = iSAD, iMinSAD ;; (fb) mov iMinSAD = iSAD (fb) mov currX = backupX (fb) mov currY = backupY ;; // Write back result st4 [currMV] = currX st4 [currYAddress] = currY mov ret0 = iMinSAD // Restore important registers ;; mov pr = prsave, -1 mov ar.pfs = pfs br.ret.sptk.many b0 .endp Halfpel8_Refine_ia64# xvidcore/src/motion/ia64_asm/calc_delta_3.s0000664000076500007650000000401311147310721021636 0ustar xvidbuildxvidbuild// **************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 halfpel refinement - // * // * 
Copyright(C) 2002 Johannes Singler, Daniel Winkler // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. // * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: calc_delta_3.s,v 1.2 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * calc_delta_3.s, IA-64 halfpel refinement // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** ;; (fb) mov iMinSAD = mpr[8] (fb) mov currX = backupX (fb) mov currY = backupY mov ret0 = 2 (non0_0) add sc[0] = iFcode, sc[0] (non0_1) add sc[1] = iFcode, sc[1] ;; (non0_0) add ret0 = ret0, sc[0] ;; (non0_1) add ret0 = ret0, sc[1] ;; setf.sig fmv = ret0 ;; xmpy.l fmv = fmv, fQuant ;; xvidcore/src/motion/estimation_bvop.c0000664000076500007650000012457011564705453021150 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Motion Estimation for B-VOPs - * * Copyright(C) 2002 Christoph Lampert * 2002-2010 Michael Militzer * 2002-2003 Radoslaw Czyz * * This program is free software ; you can 
redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: estimation_bvop.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <assert.h> #include <stdio.h> #include <stdlib.h> #include <string.h> /* memcpy */ #include "../encoder.h" #include "../global.h" #include "../image/interpolate8x8.h" #include "estimation.h" #include "motion.h" #include "sad.h" #include "motion_inlines.h" static int32_t ChromaSAD2(const int fx, const int fy, const int bx, const int by, SearchData * const data) { int sad; const uint32_t stride = data->iEdgedWidth/2; uint8_t *f_refu, *f_refv, *b_refu, *b_refv; int offset, filter; const INTERPOLATE8X8_PTR interpolate8x8_halfpel[] = { NULL, interpolate8x8_halfpel_v, interpolate8x8_halfpel_h, interpolate8x8_halfpel_hv }; if (data->chromaX == fx && data->chromaY == fy && data->b_chromaX == bx && data->b_chromaY == by) return data->chromaSAD; offset = (fx>>1) + (fy>>1)*stride; filter = ((fx & 1) << 1) | (fy & 1); if (filter != 0) { f_refu = data->RefQ + 64; f_refv = data->RefQ + 64 + 8; if (data->chromaX != fx || data->chromaY != fy) { interpolate8x8_halfpel[filter](f_refu, data->RefP[4] + offset, stride, data->rounding); interpolate8x8_halfpel[filter](f_refv, data->RefP[5] + offset, stride, data->rounding); } } else { f_refu = (uint8_t*)data->RefP[4] + offset; f_refv = (uint8_t*)data->RefP[5] + offset; } data->chromaX 
= fx; data->chromaY = fy; offset = (bx>>1) + (by>>1)*stride; filter = ((bx & 1) << 1) | (by & 1); if (filter != 0) { b_refu = data->RefQ + 64 + 16; b_refv = data->RefQ + 64 + 24; if (data->b_chromaX != bx || data->b_chromaY != by) { interpolate8x8_halfpel[filter](b_refu, data->b_RefP[4] + offset, stride, data->rounding); interpolate8x8_halfpel[filter](b_refv, data->b_RefP[5] + offset, stride, data->rounding); } } else { b_refu = (uint8_t*)data->b_RefP[4] + offset; b_refv = (uint8_t*)data->b_RefP[5] + offset; } data->b_chromaX = bx; data->b_chromaY = by; sad = sad8bi(data->CurU, b_refu, f_refu, stride); sad += sad8bi(data->CurV, b_refv, f_refv, stride); data->chromaSAD = sad; return sad; } static void CheckCandidateInt(const int x, const int y, SearchData * const data, const unsigned int Direction) { int32_t sad, xf, yf, xb, yb, xcf, ycf, xcb, ycb; uint32_t t; const uint8_t *ReferenceF, *ReferenceB; VECTOR *current; if ((x > data->max_dx) || (x < data->min_dx) || (y > data->max_dy) || (y < data->min_dy)) return; if (Direction == 1) { /* x and y mean forward vector */ VECTOR backward = data->qpel_precision ? data->currentQMV[1] : data->currentMV[1]; xb = backward.x; yb = backward.y; xf = x; yf = y; } else { /* x and y mean backward vector */ VECTOR forward = data->qpel_precision ? 
data->currentQMV[0] : data->currentMV[0]; xf = forward.x; yf = forward.y; xb = x; yb = y; } if (!data->qpel_precision) { ReferenceF = GetReference(xf, yf, data); ReferenceB = GetReferenceB(xb, yb, 1, data); current = data->currentMV + Direction - 1; xcf = xf; ycf = yf; xcb = xb; ycb = yb; } else { ReferenceF = xvid_me_interpolate16x16qpel(xf, yf, 0, data); current = data->currentQMV + Direction - 1; ReferenceB = xvid_me_interpolate16x16qpel(xb, yb, 1, data); xcf = xf/2; ycf = yf/2; xcb = xb/2; ycb = yb/2; } t = d_mv_bits(xf, yf, data->predMV, data->iFcode, data->qpel^data->qpel_precision) + d_mv_bits(xb, yb, data->bpredMV, data->iFcode, data->qpel^data->qpel_precision); sad = sad16bi(data->Cur, ReferenceF, ReferenceB, data->iEdgedWidth); sad += (data->lambda16 * t); if (data->chroma && sad < *data->iMinSAD) sad += ChromaSAD2((xcf >> 1) + roundtab_79[xcf & 0x3], (ycf >> 1) + roundtab_79[ycf & 0x3], (xcb >> 1) + roundtab_79[xcb & 0x3], (ycb >> 1) + roundtab_79[ycb & 0x3], data); if (sad < *(data->iMinSAD)) { *data->iMinSAD = sad; current->x = x; current->y = y; data->dir = Direction; } } static void CheckCandidateDirect(const int x, const int y, SearchData * const data, const unsigned int Direction) { int32_t sad = 0, xcf = 0, ycf = 0, xcb = 0, ycb = 0; uint32_t k; const uint8_t *ReferenceF; const uint8_t *ReferenceB; VECTOR mvs, b_mvs; const int blocks[4] = {0, 8, 8*data->iEdgedWidth, 8*data->iEdgedWidth+8}; if (( x > 31) || ( x < -32) || ( y > 31) || (y < -32)) return; for (k = 0; k < 4; k++) { mvs.x = data->directmvF[k].x + x; b_mvs.x = ((x == 0) ? data->directmvB[k].x : mvs.x - data->referencemv[k].x); mvs.y = data->directmvF[k].y + y; b_mvs.y = ((y == 0) ? 
data->directmvB[k].y : mvs.y - data->referencemv[k].y); if ((mvs.x > data->max_dx) || (mvs.x < data->min_dx) || (mvs.y > data->max_dy) || (mvs.y < data->min_dy) || (b_mvs.x > data->max_dx) || (b_mvs.x < data->min_dx) || (b_mvs.y > data->max_dy) || (b_mvs.y < data->min_dy) ) return; if (data->qpel) { xcf += mvs.x/2; ycf += mvs.y/2; xcb += b_mvs.x/2; ycb += b_mvs.y/2; if (data->qpel_precision) { ReferenceF = xvid_me_interpolate8x8qpel(mvs.x, mvs.y, k, 0, data); ReferenceB = xvid_me_interpolate8x8qpel(b_mvs.x, b_mvs.y, k, 1, data); goto done; } mvs.x >>=1; mvs.y >>=1; b_mvs.x >>=1; b_mvs.y >>=1; /* qpel->hpel */ } else { xcf += mvs.x; ycf += mvs.y; xcb += b_mvs.x; ycb += b_mvs.y; } ReferenceF = GetReference(mvs.x, mvs.y, data) + blocks[k]; ReferenceB = GetReferenceB(b_mvs.x, b_mvs.y, 1, data) + blocks[k]; done: sad += data->iMinSAD[k+1] = sad8bi(data->Cur + blocks[k], ReferenceF, ReferenceB, data->iEdgedWidth); if (sad > *(data->iMinSAD)) return; } sad += (data->lambda16 * d_mv_bits(x, y, zeroMV, 1, 0)); if (data->chroma && sad < *data->iMinSAD) sad += ChromaSAD2((xcf >> 3) + roundtab_76[xcf & 0xf], (ycf >> 3) + roundtab_76[ycf & 0xf], (xcb >> 3) + roundtab_76[xcb & 0xf], (ycb >> 3) + roundtab_76[ycb & 0xf], data); if (sad < *(data->iMinSAD)) { data->iMinSAD[0] = sad; data->currentMV->x = x; data->currentMV->y = y; data->dir = Direction; } } static void CheckCandidateDirectno4v(const int x, const int y, SearchData * const data, const unsigned int Direction) { int32_t sad, xcf, ycf, xcb, ycb; const uint8_t *ReferenceF; const uint8_t *ReferenceB; VECTOR mvs, b_mvs; if (( x > 31) || ( x < -32) || ( y > 31) || (y < -32)) return; mvs.x = data->directmvF[0].x + x; b_mvs.x = ((x == 0) ? data->directmvB[0].x : mvs.x - data->referencemv[0].x); mvs.y = data->directmvF[0].y + y; b_mvs.y = ((y == 0) ? 
data->directmvB[0].y : mvs.y - data->referencemv[0].y); if ( (mvs.x > data->max_dx) || (mvs.x < data->min_dx) || (mvs.y > data->max_dy) || (mvs.y < data->min_dy) || (b_mvs.x > data->max_dx) || (b_mvs.x < data->min_dx) || (b_mvs.y > data->max_dy) || (b_mvs.y < data->min_dy) ) return; if (data->qpel) { xcf = 4*(mvs.x/2); ycf = 4*(mvs.y/2); xcb = 4*(b_mvs.x/2); ycb = 4*(b_mvs.y/2); if (data->qpel_precision) { ReferenceF = xvid_me_interpolate16x16qpel(mvs.x, mvs.y, 0, data); ReferenceB = xvid_me_interpolate16x16qpel(b_mvs.x, b_mvs.y, 1, data); goto done; } mvs.x >>=1; mvs.y >>=1; b_mvs.x >>=1; b_mvs.y >>=1; /* qpel->hpel */ } else { xcf = 4*mvs.x; ycf = 4*mvs.y; xcb = 4*b_mvs.x; ycb = 4*b_mvs.y; } ReferenceF = GetReference(mvs.x, mvs.y, data); ReferenceB = GetReferenceB(b_mvs.x, b_mvs.y, 1, data); done: sad = sad16bi(data->Cur, ReferenceF, ReferenceB, data->iEdgedWidth); sad += (data->lambda16 * d_mv_bits(x, y, zeroMV, 1, 0)); if (data->chroma && sad < *data->iMinSAD) sad += ChromaSAD2((xcf >> 3) + roundtab_76[xcf & 0xf], (ycf >> 3) + roundtab_76[ycf & 0xf], (xcb >> 3) + roundtab_76[xcb & 0xf], (ycb >> 3) + roundtab_76[ycb & 0xf], data); if (sad < *(data->iMinSAD)) { *(data->iMinSAD) = sad; data->currentMV->x = x; data->currentMV->y = y; data->dir = Direction; } } void CheckCandidate16no4v(const int x, const int y, SearchData * const data, const unsigned int Direction) { int32_t sad, xc, yc; const uint8_t * Reference; uint32_t t; VECTOR * current; if ( (x > data->max_dx) || ( x < data->min_dx) || (y > data->max_dy) || (y < data->min_dy) ) return; if (data->qpel_precision) { /* x and y are in 1/4 precision */ Reference = xvid_me_interpolate16x16qpel(x, y, 0, data); current = data->currentQMV; xc = x/2; yc = y/2; } else { Reference = GetReference(x, y, data); current = data->currentMV; xc = x; yc = y; } t = d_mv_bits(x, y, data->predMV, data->iFcode, data->qpel^data->qpel_precision); sad = sad16(data->Cur, Reference, data->iEdgedWidth, 256*4096); sad += (data->lambda16 * 
t); if (data->chroma && sad < *data->iMinSAD) sad += xvid_me_ChromaSAD((xc >> 1) + roundtab_79[xc & 0x3], (yc >> 1) + roundtab_79[yc & 0x3], data); if (sad < *(data->iMinSAD)) { *(data->iMinSAD) = sad; current->x = x; current->y = y; data->dir = Direction; } } static void initialize_searchData(SearchData * Data_d, SearchData * Data_f, SearchData * Data_b, SearchData * Data_i, int x, int y, const IMAGE * const f_Ref, const uint8_t * const f_RefH, const uint8_t * const f_RefV, const uint8_t * const f_RefHV, const IMAGE * const b_Ref, const uint8_t * const b_RefH, const uint8_t * const b_RefV, const uint8_t * const b_RefHV, const IMAGE * const pCur, const MACROBLOCK * const b_mb) { /* per-macroblock SearchData initialization - too many things would be repeated 4 times */ const uint8_t * RefP[6], * b_RefP[6], * Cur[3]; const uint32_t iEdgedWidth = Data_d->iEdgedWidth; unsigned int lambda; int i; /* luma */ int offset = (x + iEdgedWidth*y) * 16; RefP[0] = f_Ref->y + offset; RefP[2] = f_RefH + offset; RefP[1] = f_RefV + offset; RefP[3] = f_RefHV + offset; b_RefP[0] = b_Ref->y + offset; b_RefP[2] = b_RefH + offset; b_RefP[1] = b_RefV + offset; b_RefP[3] = b_RefHV + offset; Cur[0] = pCur->y + offset; /* chroma */ offset = (x + (iEdgedWidth/2)*y) * 8; RefP[4] = f_Ref->u + offset; RefP[5] = f_Ref->v + offset; b_RefP[4] = b_Ref->u + offset; b_RefP[5] = b_Ref->v + offset; Cur[1] = pCur->u + offset; Cur[2] = pCur->v + offset; lambda = xvid_me_lambda_vec16[b_mb->quant]; for (i = 0; i < 6; i++) { Data_d->RefP[i] = Data_f->RefP[i] = Data_i->RefP[i] = RefP[i]; Data_d->b_RefP[i] = Data_b->RefP[i] = Data_i->b_RefP[i] = b_RefP[i]; } Data_d->Cur = Data_f->Cur = Data_b->Cur = Data_i->Cur = Cur[0]; Data_d->CurU = Data_f->CurU = Data_b->CurU = Data_i->CurU = Cur[1]; Data_d->CurV = Data_f->CurV = Data_b->CurV = Data_i->CurV = Cur[2]; Data_d->lambda16 = Data_f->lambda16 = Data_b->lambda16 = Data_i->lambda16 = lambda; /* reset chroma-sad cache */ Data_d->b_chromaX = Data_d->b_chromaY = 
Data_d->chromaX = Data_d->chromaY = Data_d->chromaSAD = 256*4096; Data_i->b_chromaX = Data_i->b_chromaY = Data_i->chromaX = Data_i->chromaY = Data_i->chromaSAD = 256*4096; Data_f->chromaX = Data_f->chromaY = Data_f->chromaSAD = 256*4096; Data_b->chromaX = Data_b->chromaY = Data_b->chromaSAD = 256*4096; *Data_d->iMinSAD = *Data_b->iMinSAD = *Data_f->iMinSAD = *Data_i->iMinSAD = 4096*256; } static __inline VECTOR ChoosePred(const MACROBLOCK * const pMB, const uint32_t mode) { /* the stupidest function ever */ return (mode == MODE_FORWARD ? pMB->mvs[0] : pMB->b_mvs[0]); } static void __inline PreparePredictionsBF(VECTOR * const pmv, const int x, const int y, const uint32_t iWcount, const MACROBLOCK * const pMB, const uint32_t mode_curr, const VECTOR hint, const int bound) { int lx, ly; /* left */ int tx, ty; /* top */ int rtx, rty; /* top-right */ int ltx, lty; /* top-left */ int lpos, tpos, rtpos, ltpos; lx = x - 1; ly = y; tx = x; ty = y - 1; rtx = x + 1; rty = y - 1; ltx = x - 1; lty = y - 1; lpos = lx + ly * iWcount; rtpos = rtx + rty * iWcount; tpos = tx + ty * iWcount; ltpos = ltx + lty * iWcount; /* [0] is prediction */ /* [1] is zero */ pmv[1].x = pmv[1].y = 0; pmv[2].x = hint.x; pmv[2].y = hint.y; if (rtpos >= bound && rtx < (int)iWcount) { /* [3] top-right neighbour */ pmv[3] = ChoosePred(pMB+1-iWcount, mode_curr); } else pmv[3].x = pmv[3].y = 0; if (tpos >= bound) { pmv[4] = ChoosePred(pMB-iWcount, mode_curr); /* [4] top */ } else pmv[4].x = pmv[4].y = 0; if (lpos >= bound && lx >= 0) { pmv[5] = ChoosePred(pMB-1, mode_curr); /* [5] left */ } else pmv[5].x = pmv[5].y = 0; if (ltpos >= bound && ltx >= 0) { pmv[6] = ChoosePred(pMB-1-iWcount, mode_curr); /* [6] top-left */ } else pmv[6].x = pmv[6].y = 0; } /* search backward or forward */ static void SearchBF_initial(const int x, const int y, const uint32_t MotionFlags, const uint32_t iFcode, const MBParam * const pParam, MACROBLOCK * const pMB, const VECTOR * const predMV, int32_t * const best_sad, const 
int32_t mode_current, SearchData * const Data, VECTOR hint, const int bound) { int i; VECTOR pmv[7]; *Data->iMinSAD = MV_MAX_ERROR; Data->qpel_precision = 0; Data->predMV = *predMV; get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, iFcode - Data->qpel, 1); pmv[0] = Data->predMV; if (Data->qpel) { pmv[0].x /= 2; pmv[0].y /= 2; hint.x /= 2; hint.y /= 2; } PreparePredictionsBF(pmv, x, y, pParam->mb_width, pMB, mode_current, hint, bound); Data->currentMV->x = Data->currentMV->y = 0; /* main loop. checking all predictions */ for (i = 0; i < 7; i++) if (!vector_repeats(pmv, i) ) CheckCandidate16no4v(pmv[i].x, pmv[i].y, Data, i); if (*Data->iMinSAD > 512) { unsigned int mask = make_mask(pmv, 7, Data->dir); MainSearchFunc *MainSearchPtr; if (MotionFlags & XVID_ME_USESQUARES16) MainSearchPtr = xvid_me_SquareSearch; else if (MotionFlags & XVID_ME_ADVANCEDDIAMOND16) MainSearchPtr = xvid_me_AdvDiamondSearch; else MainSearchPtr = xvid_me_DiamondSearch; MainSearchPtr(Data->currentMV->x, Data->currentMV->y, Data, mask, CheckCandidate16no4v); } if (Data->iMinSAD[0] < *best_sad) *best_sad = Data->iMinSAD[0]; } static void SearchBF_final(const int x, const int y, const uint32_t MotionFlags, const MBParam * const pParam, int32_t * const best_sad, SearchData * const Data) { if(!Data->qpel) { /* halfpel mode */ if (MotionFlags & XVID_ME_HALFPELREFINE16) xvid_me_SubpelRefine(Data->currentMV[0], Data, CheckCandidate16no4v, 0); } else { /* qpel mode */ if(MotionFlags & XVID_ME_FASTREFINE16) { /* fast */ get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode, 2); FullRefine_Fast(Data, CheckCandidate16no4v, 0); } else { Data->currentQMV->x = 2*Data->currentMV->x; Data->currentQMV->y = 2*Data->currentMV->y; if(MotionFlags & XVID_ME_QUARTERPELREFINE16) { /* full */ if (MotionFlags & XVID_ME_HALFPELREFINE16) { xvid_me_SubpelRefine(Data->currentMV[0], Data, 
CheckCandidate16no4v, 0); /* hpel part */ Data->currentQMV->x = 2*Data->currentMV->x; Data->currentQMV->y = 2*Data->currentMV->y; } get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode, 2); Data->qpel_precision = 1; xvid_me_SubpelRefine(Data->currentQMV[0], Data, CheckCandidate16no4v, 0); /* qpel part */ } } } if (Data->iMinSAD[0] < *best_sad) *best_sad = Data->iMinSAD[0]; } static void SkipDecisionB(MACROBLOCK * const pMB, const SearchData * const Data) { int k; if (!Data->chroma) { int dx = 0, dy = 0, b_dx = 0, b_dy = 0; int32_t sum; const uint32_t stride = Data->iEdgedWidth/2; /* this is not full chroma compensation, only its fullpel approximation. should work though */ for (k = 0; k < 4; k++) { dy += Data->directmvF[k].y >> Data->qpel; dx += Data->directmvF[k].x >> Data->qpel; b_dy += Data->directmvB[k].y >> Data->qpel; b_dx += Data->directmvB[k].x >> Data->qpel; } dy = (dy >> 3) + roundtab_76[dy & 0xf]; dx = (dx >> 3) + roundtab_76[dx & 0xf]; b_dy = (b_dy >> 3) + roundtab_76[b_dy & 0xf]; b_dx = (b_dx >> 3) + roundtab_76[b_dx & 0xf]; sum = sad8bi(Data->CurU, Data->RefP[4] + (dy/2) * (int)stride + dx/2, Data->b_RefP[4] + (b_dy/2) * (int)stride + b_dx/2, stride); if (sum >= MAX_CHROMA_SAD_FOR_SKIP * (int)Data->iQuant) return; /* no skip */ sum += sad8bi(Data->CurV, Data->RefP[5] + (dy/2) * (int)stride + dx/2, Data->b_RefP[5] + (b_dy/2) * (int)stride + b_dx/2, stride); if (sum >= MAX_CHROMA_SAD_FOR_SKIP * (int)Data->iQuant) return; /* no skip */ } else { int sum = Data->chromaSAD; /* chroma SAD caching keeps it there */ if (sum >= MAX_CHROMA_SAD_FOR_SKIP * (int)Data->iQuant) return; /* no skip */ } /* skip */ pMB->mode = MODE_DIRECT_NONE_MV; /* skipped */ for (k = 0; k < 4; k++) { pMB->qmvs[k] = pMB->mvs[k] = Data->directmvF[k]; pMB->b_qmvs[k] = pMB->b_mvs[k] = Data->directmvB[k]; if (Data->qpel) { pMB->mvs[k].x /= 2; pMB->mvs[k].y /= 2; /* it's a hint for future searches */ pMB->b_mvs[k].x 
/= 2; pMB->b_mvs[k].y /= 2; } } } static uint32_t SearchDirect_initial(const int x, const int y, const uint32_t MotionFlags, const int32_t TRB, const int32_t TRD, const MBParam * const pParam, MACROBLOCK * const pMB, const MACROBLOCK * const b_mb, int32_t * const best_sad, SearchData * const Data) { int32_t skip_sad; int k = (x + Data->iEdgedWidth*y) * 16; k = Data->qpel ? 4 : 2; Data->max_dx = k * (pParam->width - x * 16); Data->max_dy = k * (pParam->height - y * 16); Data->min_dx = -k * (16 + x * 16); Data->min_dy = -k * (16 + y * 16); Data->referencemv = Data->qpel ? b_mb->qmvs : b_mb->mvs; for (k = 0; k < 4; k++) { Data->directmvF[k].x = ((TRB * Data->referencemv[k].x) / TRD); Data->directmvB[k].x = ((TRB - TRD) * Data->referencemv[k].x) / TRD; Data->directmvF[k].y = ((TRB * Data->referencemv[k].y) / TRD); Data->directmvB[k].y = ((TRB - TRD) * Data->referencemv[k].y) / TRD; if ( (Data->directmvB[k].x > Data->max_dx) | (Data->directmvB[k].x < Data->min_dx) | (Data->directmvB[k].y > Data->max_dy) | (Data->directmvB[k].y < Data->min_dy) ) { Data->iMinSAD[0] = *best_sad = 256*4096; /* in that case, we won't use direct mode */ return 256*4096; } if (b_mb->mode != MODE_INTER4V) { Data->directmvF[1] = Data->directmvF[2] = Data->directmvF[3] = Data->directmvF[0]; Data->directmvB[1] = Data->directmvB[2] = Data->directmvB[3] = Data->directmvB[0]; break; } } Data->qpel_precision = Data->qpel; /* this initial check is done with full precision, to find real SKIP sad */ CheckCandidateDirect(0, 0, Data, 255); /* will also fill iMinSAD[1..4] with 8x8 SADs */ /* initial (fast) skip decision */ if (Data->iMinSAD[1] < (int)Data->iQuant * INITIAL_SKIP_THRESH && Data->iMinSAD[2] < (int)Data->iQuant * INITIAL_SKIP_THRESH && Data->iMinSAD[3] < (int)Data->iQuant * INITIAL_SKIP_THRESH && Data->iMinSAD[4] < (int)Data->iQuant * INITIAL_SKIP_THRESH) { /* possible skip */ SkipDecisionB(pMB, Data); if (pMB->mode == MODE_DIRECT_NONE_MV) return *Data->iMinSAD; /* skipped */ } if (Data->chroma 
&& Data->chromaSAD >= MAX_CHROMA_SAD_FOR_SKIP * (int)Data->iQuant) /* chroma doesn't allow skip */ skip_sad = 256*4096; else skip_sad = 4*MAX(MAX(Data->iMinSAD[1],Data->iMinSAD[2]), MAX(Data->iMinSAD[3],Data->iMinSAD[4])); Data->currentMV[1].x = Data->directmvF[0].x + Data->currentMV->x; /* hints for forward and backward searches */ Data->currentMV[1].y = Data->directmvF[0].y + Data->currentMV->y; Data->currentMV[2].x = ((Data->currentMV->x == 0) ? Data->directmvB[0].x : Data->currentMV[1].x - Data->referencemv[0].x); Data->currentMV[2].y = ((Data->currentMV->y == 0) ? Data->directmvB[0].y : Data->currentMV[1].y - Data->referencemv[0].y); *best_sad = Data->iMinSAD[0]; return skip_sad; } static void SearchDirect_final( const uint32_t MotionFlags, const MACROBLOCK * const b_mb, int32_t * const best_sad, SearchData * const Data) { CheckFunc * CheckCandidate = b_mb->mode == MODE_INTER4V ? CheckCandidateDirect : CheckCandidateDirectno4v; MainSearchFunc *MainSearchPtr; if (MotionFlags & XVID_ME_USESQUARES16) MainSearchPtr = xvid_me_SquareSearch; else if (MotionFlags & XVID_ME_ADVANCEDDIAMOND16) MainSearchPtr = xvid_me_AdvDiamondSearch; else MainSearchPtr = xvid_me_DiamondSearch; Data->qpel_precision = 0; MainSearchPtr(0, 0, Data, 255, CheckCandidate); Data->qpel_precision = Data->qpel; if(Data->qpel) { *Data->iMinSAD = 256*4096; /* this old SAD was not real, it was in hpel precision */ CheckCandidate(Data->currentMV->x, Data->currentMV->y, Data, 255); } xvid_me_SubpelRefine(Data->currentMV[0], Data, CheckCandidate, 0); if (Data->iMinSAD[0] < *best_sad) { *best_sad = Data->iMinSAD[0]; } } static __inline void set_range(int * range, SearchData * Data) { Data->min_dx = range[0]; Data->max_dx = range[1]; Data->min_dy = range[2]; Data->max_dy = range[3]; } static void SearchInterpolate_initial( const int x, const int y, const uint32_t MotionFlags, const MBParam * const pParam, const VECTOR * const f_predMV, const VECTOR * const b_predMV, int32_t * const best_sad, SearchData * 
const Data, const VECTOR startF, const VECTOR startB) { int b_range[4], f_range[4]; Data->qpel_precision = 0; Data->predMV = *f_predMV; Data->bpredMV = *b_predMV; Data->currentMV[0] = startF; Data->currentMV[1] = startB; get_range(f_range, f_range+1, f_range+2, f_range+3, x, y, 4, pParam->width, pParam->height, Data->iFcode - Data->qpel, 1); get_range(b_range, b_range+1, b_range+2, b_range+3, x, y, 4, pParam->width, pParam->height, Data->bFcode - Data->qpel, 1); if (Data->currentMV[0].x > f_range[1]) Data->currentMV[0].x = f_range[1]; if (Data->currentMV[0].x < f_range[0]) Data->currentMV[0].x = f_range[0]; if (Data->currentMV[0].y > f_range[3]) Data->currentMV[0].y = f_range[3]; if (Data->currentMV[0].y < f_range[2]) Data->currentMV[0].y = f_range[2]; if (Data->currentMV[1].x > b_range[1]) Data->currentMV[1].x = b_range[1]; if (Data->currentMV[1].x < b_range[0]) Data->currentMV[1].x = b_range[0]; if (Data->currentMV[1].y > b_range[3]) Data->currentMV[1].y = b_range[3]; if (Data->currentMV[1].y < b_range[2]) Data->currentMV[1].y = b_range[2]; set_range(f_range, Data); CheckCandidateInt(Data->currentMV[0].x, Data->currentMV[0].y, Data, 1); if (Data->iMinSAD[0] < *best_sad) *best_sad = Data->iMinSAD[0]; } static void SearchInterpolate_final(const int x, const int y, const uint32_t MotionFlags, const MBParam * const pParam, int32_t * const best_sad, SearchData * const Data) { int i, j; int b_range[4], f_range[4]; get_range(f_range, f_range+1, f_range+2, f_range+3, x, y, 4, pParam->width, pParam->height, Data->iFcode - Data->qpel, 1); get_range(b_range, b_range+1, b_range+2, b_range+3, x, y, 4, pParam->width, pParam->height, Data->bFcode - Data->qpel, 1); /* diamond */ do { Data->dir = 0; /* forward MV moves */ i = Data->currentMV[0].x; j = Data->currentMV[0].y; CheckCandidateInt(i + 1, j, Data, 1); CheckCandidateInt(i, j + 1, Data, 1); CheckCandidateInt(i - 1, j, Data, 1); CheckCandidateInt(i, j - 1, Data, 1); /* backward MV moves */ set_range(b_range, Data); i = 
Data->currentMV[1].x; j = Data->currentMV[1].y; CheckCandidateInt(i + 1, j, Data, 2); CheckCandidateInt(i, j + 1, Data, 2); CheckCandidateInt(i - 1, j, Data, 2); CheckCandidateInt(i, j - 1, Data, 2); set_range(f_range, Data); } while (Data->dir != 0); /* qpel refinement */ if (Data->qpel) { Data->qpel_precision = 1; get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode, 2); Data->currentQMV[0].x = 2 * Data->currentMV[0].x; Data->currentQMV[0].y = 2 * Data->currentMV[0].y; Data->currentQMV[1].x = 2 * Data->currentMV[1].x; Data->currentQMV[1].y = 2 * Data->currentMV[1].y; if (MotionFlags & XVID_ME_QUARTERPELREFINE16) { xvid_me_SubpelRefine(Data->currentQMV[0], Data, CheckCandidateInt, 1); get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->bFcode, 2); xvid_me_SubpelRefine(Data->currentQMV[1], Data, CheckCandidateInt, 2); } } if (Data->iMinSAD[0] < *best_sad) *best_sad = Data->iMinSAD[0]; } static void ModeDecision_BVOP_SAD(const SearchData * const Data_d, const SearchData * const Data_b, const SearchData * const Data_f, const SearchData * const Data_i, MACROBLOCK * const pMB, const MACROBLOCK * const b_mb, VECTOR * f_predMV, VECTOR * b_predMV, int force_direct) { int mode = MODE_DIRECT, k; int best_sad, f_sad, b_sad, i_sad; const int qpel = Data_d->qpel; /* evaluate cost of all modes - quite simple in SAD */ best_sad = Data_d->iMinSAD[0] + 1*Data_d->lambda16; b_sad = Data_b->iMinSAD[0] + 3*Data_d->lambda16; f_sad = Data_f->iMinSAD[0] + 4*Data_d->lambda16; i_sad = Data_i->iMinSAD[0] + 2*Data_d->lambda16; if (force_direct) goto set_mode; /* bypass checks for non-direct modes */ if (b_sad < best_sad) { mode = MODE_BACKWARD; best_sad = b_sad; } if (f_sad < best_sad) { mode = MODE_FORWARD; best_sad = f_sad; } if (i_sad < best_sad) { mode = MODE_INTERPOLATE; best_sad = i_sad; } set_mode: pMB->sad16 = best_sad; pMB->mode = mode; pMB->cbp = 63; 
switch (mode) { case MODE_DIRECT: if (!qpel && b_mb->mode != MODE_INTER4V) pMB->mode = MODE_DIRECT_NO4V; /* for faster compensation */ pMB->pmvs[3] = Data_d->currentMV[0]; for (k = 0; k < 4; k++) { pMB->mvs[k].x = Data_d->directmvF[k].x + Data_d->currentMV->x; pMB->b_mvs[k].x = ( (Data_d->currentMV->x == 0) ? Data_d->directmvB[k].x :pMB->mvs[k].x - Data_d->referencemv[k].x); pMB->mvs[k].y = (Data_d->directmvF[k].y + Data_d->currentMV->y); pMB->b_mvs[k].y = ((Data_d->currentMV->y == 0) ? Data_d->directmvB[k].y : pMB->mvs[k].y - Data_d->referencemv[k].y); if (qpel) { pMB->qmvs[k].x = pMB->mvs[k].x; pMB->mvs[k].x /= 2; pMB->b_qmvs[k].x = pMB->b_mvs[k].x; pMB->b_mvs[k].x /= 2; pMB->qmvs[k].y = pMB->mvs[k].y; pMB->mvs[k].y /= 2; pMB->b_qmvs[k].y = pMB->b_mvs[k].y; pMB->b_mvs[k].y /= 2; } if (b_mb->mode != MODE_INTER4V) { pMB->mvs[3] = pMB->mvs[2] = pMB->mvs[1] = pMB->mvs[0]; pMB->b_mvs[3] = pMB->b_mvs[2] = pMB->b_mvs[1] = pMB->b_mvs[0]; pMB->qmvs[3] = pMB->qmvs[2] = pMB->qmvs[1] = pMB->qmvs[0]; pMB->b_qmvs[3] = pMB->b_qmvs[2] = pMB->b_qmvs[1] = pMB->b_qmvs[0]; break; } } break; case MODE_FORWARD: if (qpel) { pMB->pmvs[0].x = Data_f->currentQMV->x - f_predMV->x; pMB->pmvs[0].y = Data_f->currentQMV->y - f_predMV->y; pMB->qmvs[0] = *Data_f->currentQMV; *f_predMV = Data_f->currentQMV[0]; } else { pMB->pmvs[0].x = Data_f->currentMV->x - f_predMV->x; pMB->pmvs[0].y = Data_f->currentMV->y - f_predMV->y; *f_predMV = Data_f->currentMV[0]; } pMB->mvs[0] = *Data_f->currentMV; pMB->b_mvs[0] = *Data_b->currentMV; /* hint for future searches */ break; case MODE_BACKWARD: if (qpel) { pMB->pmvs[0].x = Data_b->currentQMV->x - b_predMV->x; pMB->pmvs[0].y = Data_b->currentQMV->y - b_predMV->y; pMB->b_qmvs[0] = *Data_b->currentQMV; *b_predMV = Data_b->currentQMV[0]; } else { pMB->pmvs[0].x = Data_b->currentMV->x - b_predMV->x; pMB->pmvs[0].y = Data_b->currentMV->y - b_predMV->y; *b_predMV = Data_b->currentMV[0]; } pMB->b_mvs[0] = *Data_b->currentMV; pMB->mvs[0] = *Data_f->currentMV; /* 
hint for future searches */ break; case MODE_INTERPOLATE: pMB->mvs[0] = Data_i->currentMV[0]; pMB->b_mvs[0] = Data_i->currentMV[1]; if (qpel) { pMB->qmvs[0] = Data_i->currentQMV[0]; pMB->b_qmvs[0] = Data_i->currentQMV[1]; pMB->pmvs[1].x = pMB->qmvs[0].x - f_predMV->x; pMB->pmvs[1].y = pMB->qmvs[0].y - f_predMV->y; pMB->pmvs[0].x = pMB->b_qmvs[0].x - b_predMV->x; pMB->pmvs[0].y = pMB->b_qmvs[0].y - b_predMV->y; *f_predMV = Data_i->currentQMV[0]; *b_predMV = Data_i->currentQMV[1]; } else { pMB->pmvs[1].x = pMB->mvs[0].x - f_predMV->x; pMB->pmvs[1].y = pMB->mvs[0].y - f_predMV->y; pMB->pmvs[0].x = pMB->b_mvs[0].x - b_predMV->x; pMB->pmvs[0].y = pMB->b_mvs[0].y - b_predMV->y; *f_predMV = Data_i->currentMV[0]; *b_predMV = Data_i->currentMV[1]; } break; } } static __inline void maxMotionBVOP(int * const MVmaxF, int * const MVmaxB, const MACROBLOCK * const pMB, const int qpel) { if (pMB->mode == MODE_FORWARD || pMB->mode == MODE_INTERPOLATE) { const VECTOR * const mv = qpel ? pMB->qmvs : pMB->mvs; int max = *MVmaxF; if (mv[0].x > max) max = mv[0].x; else if (-mv[0].x - 1 > max) max = -mv[0].x - 1; if (mv[0].y > max) max = mv[0].y; else if (-mv[0].y - 1 > max) max = -mv[0].y - 1; *MVmaxF = max; } if (pMB->mode == MODE_BACKWARD || pMB->mode == MODE_INTERPOLATE) { const VECTOR * const mv = qpel ? 
pMB->b_qmvs : pMB->b_mvs; int max = *MVmaxB; if (mv[0].x > max) max = mv[0].x; else if (-mv[0].x - 1 > max) max = -mv[0].x - 1; if (mv[0].y > max) max = mv[0].y; else if (-mv[0].y - 1 > max) max = -mv[0].y - 1; *MVmaxB = max; } } void MotionEstimationBVOP(MBParam * const pParam, FRAMEINFO * const frame, const int32_t time_bp, const int32_t time_pp, /* forward (past) reference */ const MACROBLOCK * const f_mbs, const IMAGE * const f_ref, const IMAGE * const f_refH, const IMAGE * const f_refV, const IMAGE * const f_refHV, /* backward (future) reference */ const FRAMEINFO * const b_reference, const IMAGE * const b_ref, const IMAGE * const b_refH, const IMAGE * const b_refV, const IMAGE * const b_refHV, const int num_slices) { uint32_t i, j; int32_t best_sad = 256*4096; uint32_t skip_sad; int fb_thresh; const MACROBLOCK * const b_mbs = b_reference->mbs; VECTOR f_predMV, b_predMV; int mb_width = pParam->mb_width; int mb_height = pParam->mb_height; int MVmaxF = 0, MVmaxB = 0; const int32_t TRB = time_pp - time_bp; const int32_t TRD = time_pp; DECLARE_ALIGNED_MATRIX(dct_space, 3, 64, int16_t, CACHE_LINE); /* some pre-initialized data for the rest of the search */ SearchData Data_d, Data_f, Data_b, Data_i; memset(&Data_d, 0, sizeof(SearchData)); Data_d.iEdgedWidth = pParam->edged_width; Data_d.qpel = pParam->vol_flags & XVID_VOL_QUARTERPEL ? 
1 : 0; Data_d.rounding = 0; Data_d.chroma = frame->motion_flags & XVID_ME_CHROMA_BVOP; Data_d.iQuant = frame->quant; Data_d.quant_sq = frame->quant*frame->quant; Data_d.dctSpace = dct_space; Data_d.quant_type = !(pParam->vol_flags & XVID_VOL_MPEGQUANT); Data_d.mpeg_quant_matrices = pParam->mpeg_quant_matrices; Data_d.RefQ = f_refV->u; /* a good place, also used in MC (for similar purpose) */ memcpy(&Data_f, &Data_d, sizeof(SearchData)); memcpy(&Data_b, &Data_d, sizeof(SearchData)); memcpy(&Data_i, &Data_d, sizeof(SearchData)); Data_f.iFcode = Data_i.iFcode = frame->fcode = b_reference->fcode; Data_b.iFcode = Data_i.bFcode = frame->bcode = b_reference->fcode; for (j = 0; j < pParam->mb_height; j++) { int new_bound = mb_width * ((((j*num_slices) / mb_height) * mb_height + (num_slices-1)) / num_slices); f_predMV = b_predMV = zeroMV; /* prediction is reset at left boundary */ for (i = 0; i < pParam->mb_width; i++) { MACROBLOCK * const pMB = frame->mbs + i + j * pParam->mb_width; const MACROBLOCK * const b_mb = b_mbs + i + j * pParam->mb_width; int force_direct = (((j*mb_width+i)==new_bound) && (j > 0)) ? 1 : 0; /* MTK decoder chipsets do NOT reset predMVs upon resync marker in BVOPs. 
We work around this problem by placing the slice border on the second MB in a row and then forcing the first MB to be direct mode */ pMB->mode = -1; initialize_searchData(&Data_d, &Data_f, &Data_b, &Data_i, i, j, f_ref, f_refH->y, f_refV->y, f_refHV->y, b_ref, b_refH->y, b_refV->y, b_refHV->y, &frame->image, b_mb); /* special case, if collocated block is SKIPed in P-VOP: encoding is forward (0,0), cbp=0 without further ado */ if (b_reference->coding_type != S_VOP) if (b_mb->mode == MODE_NOT_CODED) { pMB->mode = MODE_NOT_CODED; pMB->mvs[0] = pMB->b_mvs[0] = zeroMV; pMB->sad16 = 0; continue; } /* direct search comes first, because it (1) checks for SKIP-mode and (2) sets very good predictions for forward and backward search */ skip_sad = SearchDirect_initial(i, j, frame->motion_flags, TRB, TRD, pParam, pMB, b_mb, &best_sad, &Data_d); if (pMB->mode == MODE_DIRECT_NONE_MV) { pMB->sad16 = best_sad; pMB->cbp = 0; continue; } SearchBF_initial(i, j, frame->motion_flags, frame->fcode, pParam, pMB, &f_predMV, &best_sad, MODE_FORWARD, &Data_f, Data_d.currentMV[1], new_bound); SearchBF_initial(i, j, frame->motion_flags, frame->bcode, pParam, pMB, &b_predMV, &best_sad, MODE_BACKWARD, &Data_b, Data_d.currentMV[2], new_bound); if (frame->motion_flags&XVID_ME_BFRAME_EARLYSTOP) fb_thresh = best_sad; else fb_thresh = best_sad + (best_sad>>1); if (Data_f.iMinSAD[0] <= fb_thresh) SearchBF_final(i, j, frame->motion_flags, pParam, &best_sad, &Data_f); if (Data_b.iMinSAD[0] <= fb_thresh) SearchBF_final(i, j, frame->motion_flags, pParam, &best_sad, &Data_b); SearchInterpolate_initial(i, j, frame->motion_flags, pParam, &f_predMV, &b_predMV, &best_sad, &Data_i, Data_f.currentMV[0], Data_b.currentMV[0]); if (((Data_i.iMinSAD[0] < best_sad +(best_sad>>3)) && !(frame->motion_flags&XVID_ME_FAST_MODEINTERPOLATE)) || Data_i.iMinSAD[0] <= best_sad) SearchInterpolate_final(i, j, frame->motion_flags, pParam, &best_sad, &Data_i); if (Data_d.iMinSAD[0] <= 2*best_sad) if 
((!(frame->motion_flags&XVID_ME_SKIP_DELTASEARCH) && (best_sad > 750)) || (best_sad > 1000)) SearchDirect_final(frame->motion_flags, b_mb, &best_sad, &Data_d); /* final skip decision */ if ( (skip_sad < 2 * Data_d.iQuant * MAX_SAD00_FOR_SKIP ) && ((100*best_sad)/(skip_sad+1) > FINAL_SKIP_THRESH) ) { Data_d.chromaSAD = 0; /* green light for chroma check */ SkipDecisionB(pMB, &Data_d); if (pMB->mode == MODE_DIRECT_NONE_MV) { /* skipped? */ pMB->sad16 = skip_sad; pMB->cbp = 0; continue; } } if (frame->vop_flags & XVID_VOP_RD_BVOP) ModeDecision_BVOP_RD(&Data_d, &Data_b, &Data_f, &Data_i, pMB, b_mb, &f_predMV, &b_predMV, frame->motion_flags, frame->vop_flags, pParam, i, j, best_sad, force_direct); else ModeDecision_BVOP_SAD(&Data_d, &Data_b, &Data_f, &Data_i, pMB, b_mb, &f_predMV, &b_predMV, force_direct); maxMotionBVOP(&MVmaxF, &MVmaxB, pMB, Data_d.qpel); } } frame->fcode = getMinFcode(MVmaxF); frame->bcode = getMinFcode(MVmaxB); } void SMPMotionEstimationBVOP(SMPData * h) { Encoder *pEnc = (Encoder *) h->pEnc; const MBParam * const pParam = &pEnc->mbParam; const FRAMEINFO * const frame = h->current; const int32_t time_bp = (int32_t)(pEnc->current->stamp - frame->stamp); const int32_t time_pp = (int32_t)(pEnc->current->stamp - pEnc->reference->stamp); /* forward (past) reference */ const IMAGE * const f_ref = &pEnc->reference->image; const IMAGE * const f_refH = &pEnc->f_refh; const IMAGE * const f_refV = &pEnc->f_refv; const IMAGE * const f_refHV = &pEnc->f_refhv; /* backward (future) reference */ const FRAMEINFO * const b_reference = pEnc->current; const IMAGE * const b_ref = &pEnc->current->image; const IMAGE * const b_refH = &pEnc->vInterH; const IMAGE * const b_refV = &pEnc->vInterV; const IMAGE * const b_refHV = &pEnc->vInterHV; int mb_width = pParam->mb_width; int mb_height = pParam->mb_height; int num_slices = pEnc->num_slices; int y_row = h->y_row; int y_step = h->y_step; int start_y = h->start_y; int stop_y = h->stop_y; int * complete_count_self = 
h->complete_count_self; const int * complete_count_above = h->complete_count_above; int max_mbs; int current_mb = 0; int32_t i, j; int32_t best_sad = 256*4096; uint32_t skip_sad; int fb_thresh; const MACROBLOCK * const b_mbs = b_reference->mbs; VECTOR f_predMV, b_predMV; int MVmaxF = 0, MVmaxB = 0; const int32_t TRB = time_pp - time_bp; const int32_t TRD = time_pp; DECLARE_ALIGNED_MATRIX(dct_space, 3, 64, int16_t, CACHE_LINE); /* some pre-initialized data for the rest of the search */ SearchData Data_d, Data_f, Data_b, Data_i; memset(&Data_d, 0, sizeof(SearchData)); Data_d.iEdgedWidth = pParam->edged_width; Data_d.qpel = pParam->vol_flags & XVID_VOL_QUARTERPEL ? 1 : 0; Data_d.rounding = 0; Data_d.chroma = frame->motion_flags & XVID_ME_CHROMA_BVOP; Data_d.iQuant = frame->quant; Data_d.quant_sq = frame->quant*frame->quant; Data_d.dctSpace = dct_space; Data_d.quant_type = !(pParam->vol_flags & XVID_VOL_MPEGQUANT); Data_d.mpeg_quant_matrices = pParam->mpeg_quant_matrices; Data_d.RefQ = h->RefQ; memcpy(&Data_f, &Data_d, sizeof(SearchData)); memcpy(&Data_b, &Data_d, sizeof(SearchData)); memcpy(&Data_i, &Data_d, sizeof(SearchData)); Data_f.iFcode = Data_i.iFcode = frame->fcode; Data_b.iFcode = Data_i.bFcode = frame->bcode; max_mbs = 0; for (j = (start_y+y_row); j < stop_y; j += y_step) { int new_bound = mb_width * ((((j*num_slices) / mb_height) * mb_height + (num_slices-1)) / num_slices); if (j == start_y) max_mbs = pParam->mb_width; /* we can process all blocks of the first row */ f_predMV = b_predMV = zeroMV; /* prediction is reset at left boundary */ for (i = 0; i < (int) pParam->mb_width; i++) { MACROBLOCK * const pMB = frame->mbs + i + j * pParam->mb_width; const MACROBLOCK * const b_mb = b_mbs + i + j * pParam->mb_width; int force_direct = (((j*mb_width+i)==new_bound) && (j > 0)) ? 1 : 0; /* MTK decoder chipsets do NOT reset predMVs upon resync marker in BVOPs. 
We work around this problem by placing the slice border on the second MB in a row and then forcing the first MB to be direct mode */ pMB->mode = -1; initialize_searchData(&Data_d, &Data_f, &Data_b, &Data_i, i, j, f_ref, f_refH->y, f_refV->y, f_refHV->y, b_ref, b_refH->y, b_refV->y, b_refHV->y, &frame->image, b_mb); if (current_mb >= max_mbs) { /* we ME-ed all macroblocks we safely could. grab next portion */ int above_count = *complete_count_above; /* sync point */ if (above_count == pParam->mb_width) { /* full line above is ready */ above_count = pParam->mb_width+1; if (j < stop_y-y_step) { /* this is not last line, grab a portion of MBs from the next line too */ above_count += MAX(0, complete_count_above[1] - 1); } } max_mbs = current_mb + above_count - i - 1; if (current_mb >= max_mbs) { /* current workload is zero */ i--; sched_yield(); continue; } } /* special case, if collocated block is SKIPed in P-VOP: encoding is forward (0,0), cbp=0 without further ado */ if (b_reference->coding_type != S_VOP) if (b_mb->mode == MODE_NOT_CODED) { pMB->mode = MODE_NOT_CODED; pMB->mvs[0] = pMB->b_mvs[0] = zeroMV; pMB->sad16 = 0; *complete_count_self = i+1; current_mb++; continue; } /* direct search comes first, because it (1) checks for SKIP-mode and (2) sets very good predictions for forward and backward search */ skip_sad = SearchDirect_initial(i, j, frame->motion_flags, TRB, TRD, pParam, pMB, b_mb, &best_sad, &Data_d); if (pMB->mode == MODE_DIRECT_NONE_MV) { pMB->sad16 = best_sad; pMB->cbp = 0; *complete_count_self = i+1; current_mb++; continue; } SearchBF_initial(i, j, frame->motion_flags, frame->fcode, pParam, pMB, &f_predMV, &best_sad, MODE_FORWARD, &Data_f, Data_d.currentMV[1], new_bound); SearchBF_initial(i, j, frame->motion_flags, frame->bcode, pParam, pMB, &b_predMV, &best_sad, MODE_BACKWARD, &Data_b, Data_d.currentMV[2], new_bound); if (frame->motion_flags&XVID_ME_BFRAME_EARLYSTOP) fb_thresh = best_sad; else fb_thresh = best_sad + (best_sad>>1); if (Data_f.iMinSAD[0] <= 
fb_thresh) SearchBF_final(i, j, frame->motion_flags, pParam, &best_sad, &Data_f); if (Data_b.iMinSAD[0] <= fb_thresh) SearchBF_final(i, j, frame->motion_flags, pParam, &best_sad, &Data_b); SearchInterpolate_initial(i, j, frame->motion_flags, pParam, &f_predMV, &b_predMV, &best_sad, &Data_i, Data_f.currentMV[0], Data_b.currentMV[0]); if (((Data_i.iMinSAD[0] < best_sad +(best_sad>>3)) && !(frame->motion_flags&XVID_ME_FAST_MODEINTERPOLATE)) || Data_i.iMinSAD[0] <= best_sad) SearchInterpolate_final(i, j, frame->motion_flags, pParam, &best_sad, &Data_i); if (Data_d.iMinSAD[0] <= 2*best_sad) if ((!(frame->motion_flags&XVID_ME_SKIP_DELTASEARCH) && (best_sad > 750)) || (best_sad > 1000)) SearchDirect_final(frame->motion_flags, b_mb, &best_sad, &Data_d); /* final skip decision */ if ( (skip_sad < 2 * Data_d.iQuant * MAX_SAD00_FOR_SKIP ) && ((100*best_sad)/(skip_sad+1) > FINAL_SKIP_THRESH) ) { Data_d.chromaSAD = 0; /* green light for chroma check */ SkipDecisionB(pMB, &Data_d); if (pMB->mode == MODE_DIRECT_NONE_MV) { /* skipped? 
*/ pMB->sad16 = skip_sad; pMB->cbp = 0; *complete_count_self = i+1; current_mb++; continue; } } if (frame->vop_flags & XVID_VOP_RD_BVOP) ModeDecision_BVOP_RD(&Data_d, &Data_b, &Data_f, &Data_i, pMB, b_mb, &f_predMV, &b_predMV, frame->motion_flags, frame->vop_flags, pParam, i, j, best_sad, force_direct); else ModeDecision_BVOP_SAD(&Data_d, &Data_b, &Data_f, &Data_i, pMB, b_mb, &f_predMV, &b_predMV, force_direct); *complete_count_self = i+1; current_mb++; maxMotionBVOP(&MVmaxF, &MVmaxB, pMB, Data_d.qpel); } complete_count_self++; complete_count_above++; } h->minfcode = getMinFcode(MVmaxF); h->minbcode = getMinFcode(MVmaxB); } xvidcore/src/motion/gmc.h0000664000076500007650000000523711564705453016517 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - GMC interpolation module header - * * Copyright(C) 2002-2003 Pascal Massimino * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: gmc.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "../portab.h" #include "../global.h" /* ************************************************************* * will initialize internal pointers */ void init_GMC(const unsigned int cpu_flags); /* ************************************************************* * Warning! It's Accuracy being passed, not 'resolution'! */ void generate_GMCparameters(int nb_pts, const int accuracy, const WARPPOINTS *const pts, const int width, const int height, NEW_GMC_DATA *const gmc); /* ******************************************************************* */ void generate_GMCimage( const NEW_GMC_DATA *const gmc_data, /* [input] precalculated data */ const IMAGE *const pRef, /* [input] */ const int mb_width, const int mb_height, const int stride, const int stride2, const int fcode, /* [input] some parameters... 
*/ const int32_t quarterpel, /* [input] for rounding avgMV */ const int reduced_resolution, /* [input] ignored */ const int32_t rounding, /* [input] for rounding image data */ MACROBLOCK *const pMBs, /* [output] average motion vectors */ IMAGE *const pGMC); /* [output] full warped image */ /* ************************************************************* * utils */ /* This is borrowed from decoder.c */ static __inline int gmc_sanitize(int value, int quarterpel, int fcode) { int length = 1 << (fcode+4); #if 0 if (quarterpel) value *= 2; #endif if (value < -length) return -length; else if (value >= length) return length-1; else return value; } /* *************************************************************/ xvidcore/src/motion/x86_asm/0000775000076500007650000000000011566427763017065 5ustar xvidbuildxvidbuildxvidcore/src/motion/x86_asm/sad_mmx.asm0000664000076500007650000003342511254216113021201 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - K7 optimized SAD operators - ; * ; * Copyright(C) 2001 Peter Ross ; * 2002 Pascal Massimino ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: sad_mmx.asm,v 1.22 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN mmx_one: times 4 dw 1 ;============================================================================= ; Helper macros ;============================================================================= %macro SAD_16x16_MMX 0 movq mm0, [_EAX] movq mm1, [TMP1] movq mm2, [_EAX+8] movq mm3, [TMP1+8] movq mm4, mm0 psubusb mm0, mm1 lea _EAX, [_EAX+TMP0] movq mm5, mm2 psubusb mm2, mm3 psubusb mm1, mm4 psubusb mm3, mm5 por mm0, mm1 por mm2, mm3 movq mm1, mm0 punpcklbw mm0,mm7 movq mm3, mm2 punpckhbw mm1,mm7 lea TMP1, [TMP1+TMP0] punpcklbw mm2,mm7 paddusw mm0, mm1 punpckhbw mm3,mm7 paddusw mm6, mm0 paddusw mm2, mm3 paddusw mm6, mm2 %endmacro %macro SAD_8x8_MMX 0 movq mm0, [_EAX] movq mm1, [TMP1] movq mm2, [_EAX+TMP0] movq mm3, [TMP1+TMP0] lea _EAX,[_EAX+2*TMP0] lea TMP1,[TMP1+2*TMP0] movq mm4, mm0 psubusb mm0, mm1 movq mm5, mm2 psubusb mm2, mm3 psubusb mm1, mm4 psubusb mm3, mm5 por mm0, mm1 por mm2, mm3 movq mm1,mm0 punpcklbw mm0,mm7 movq mm3,mm2 punpckhbw mm1,mm7 punpcklbw mm2,mm7 paddusw mm0,mm1 punpckhbw mm3,mm7 paddusw mm6,mm0 paddusw mm2,mm3 paddusw mm6,mm2 %endmacro %macro SADV_16x16_MMX 0 movq mm0, [_EAX] movq mm1, [TMP1] movq mm2, [_EAX+8] movq mm4, mm0 movq mm3, [TMP1+8] psubusb mm0, mm1 psubusb mm1, mm4 lea _EAX,[_EAX+TMP0] por mm0, mm1 movq mm4, mm2 psubusb mm2, mm3 psubusb mm3, mm4 por mm2, mm3 movq mm1,mm0 punpcklbw mm0,mm7 movq mm3,mm2 punpckhbw mm1,mm7 punpcklbw mm2,mm7 paddusw mm0,mm1 
punpckhbw mm3,mm7 paddusw mm5, mm0 paddusw mm2,mm3 lea TMP1,[TMP1+TMP0] paddusw mm6, mm2 %endmacro %macro SADBI_16x16_MMX 2 ; SADBI_16x16_MMX( int_ptr_offset, bool_increment_ptr ); movq mm0, [TMP1+%1] movq mm2, [_EBX+%1] movq mm1, mm0 movq mm3, mm2 %if %2 != 0 add TMP1, TMP0 %endif punpcklbw mm0, mm7 punpckhbw mm1, mm7 punpcklbw mm2, mm7 punpckhbw mm3, mm7 %if %2 != 0 add _EBX, TMP0 %endif paddusw mm0, mm2 ; mm01 = ref1 + ref2 paddusw mm1, mm3 paddusw mm0, [mmx_one] ; mm01 += 1 paddusw mm1, [mmx_one] psrlw mm0, 1 ; mm01 >>= 1 psrlw mm1, 1 movq mm2, [_EAX+%1] movq mm3, mm2 punpcklbw mm2, mm7 ; mm23 = src punpckhbw mm3, mm7 %if %2 != 0 add _EAX, TMP0 %endif movq mm4, mm0 movq mm5, mm1 psubusw mm0, mm2 psubusw mm1, mm3 psubusw mm2, mm4 psubusw mm3, mm5 por mm0, mm2 ; mm01 = ABS(mm01 - mm23) por mm1, mm3 paddusw mm6, mm0 ; mm6 += mm01 paddusw mm6, mm1 %endmacro %macro MEAN_16x16_MMX 0 movq mm0, [_EAX] movq mm2, [_EAX+8] lea _EAX, [_EAX+TMP0] movq mm1, mm0 punpcklbw mm0, mm7 movq mm3, mm2 punpckhbw mm1, mm7 paddw mm5, mm0 punpcklbw mm2, mm7 paddw mm6, mm1 punpckhbw mm3, mm7 paddw mm5, mm2 paddw mm6, mm3 %endmacro %macro ABS_16x16_MMX 0 movq mm0, [_EAX] movq mm2, [_EAX+8] lea _EAX, [_EAX+TMP0] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm7 punpcklbw mm2, mm7 punpckhbw mm1, mm7 punpckhbw mm3, mm7 movq mm4, mm6 psubusw mm4, mm0 psubusw mm0, mm6 por mm0, mm4 movq mm4, mm6 psubusw mm4, mm1 psubusw mm1, mm6 por mm1, mm4 movq mm4, mm6 psubusw mm4, mm2 psubusw mm2, mm6 por mm2, mm4 movq mm4, mm6 psubusw mm4, mm3 psubusw mm3, mm6 por mm3, mm4 paddw mm0, mm1 paddw mm2, mm3 paddw mm5, mm0 paddw mm5, mm2 %endmacro ;============================================================================= ; Code ;============================================================================= TEXT cglobal sad16_mmx cglobal sad16v_mmx cglobal sad8_mmx cglobal sad16bi_mmx cglobal sad8bi_mmx cglobal dev16_mmx cglobal sse8_16bit_mmx cglobal sse8_8bit_mmx 
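;-----------------------------------------------------------------------------
; Reference sketch (not part of the original source).
;
; A scalar C model of what the SAD_16x16_MMX and SAD_8x8_MMX helper macros
; above appear to compute, given here only as a reading aid. The name sad_ref
; and its 'size' parameter are hypothetical. psubusb saturates at zero, so
; (a -sat b) | (b -sat a) yields |a - b| for unsigned bytes; the macros then
; widen the bytes to 16-bit words against the zero register mm7 and
; accumulate into mm6.
;
;   #include <stdint.h>
;
;   /* size is 16 for the 16x16 case, 8 for the 8x8 case (assumption) */
;   static uint32_t sad_ref(const uint8_t *cur, const uint8_t *ref,
;                           uint32_t stride, int size)
;   {
;       uint32_t sad = 0;
;       int x, y;
;       for (y = 0; y < size; y++) {
;           for (x = 0; x < size; x++) {
;               uint8_t a  = cur[x], b = ref[x];
;               uint8_t d0 = (a > b) ? (uint8_t)(a - b) : 0; /* psubusb a,b */
;               uint8_t d1 = (b > a) ? (uint8_t)(b - a) : 0; /* psubusb b,a */
;               sad += (uint8_t)(d0 | d1);                   /* por: |a-b|  */
;           }
;           cur += stride;
;           ref += stride;
;       }
;       return sad;
;   }
;
; For a 16x16 block each 16-bit lane of mm6 receives at most 64 differences
; of at most 255 each (16320), so the saturating paddusw adds never clip and
; the scalar model should match the final pmaddwd/paddd collapse exactly.
;-----------------------------------------------------------------------------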
;----------------------------------------------------------------------------- ; ; uint32_t sad16_mmx(const uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride, ; const uint32_t best_sad); ; ; (early termination ignored; slows this down) ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad16_mmx: mov _EAX, prm1 ; Src1 mov TMP1, prm2 ; Src2 mov TMP0, prm3 ; Stride pxor mm6, mm6 ; accum pxor mm7, mm7 ; zero SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX SAD_16x16_MMX pmaddwd mm6, [mmx_one] ; collapse movq mm7, mm6 psrlq mm7, 32 paddd mm6, mm7 movd eax, mm6 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad8_mmx(const uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad8_mmx: mov _EAX, prm1 ; Src1 mov TMP1, prm2 ; Src2 mov TMP0, prm3 ; Stride pxor mm6, mm6 ; accum pxor mm7, mm7 ; zero SAD_8x8_MMX SAD_8x8_MMX SAD_8x8_MMX SAD_8x8_MMX pmaddwd mm6, [mmx_one] ; collapse movq mm7, mm6 psrlq mm7, 32 paddd mm6, mm7 movd eax, mm6 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad16v_mmx(const uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride, ; int32_t *sad); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad16v_mmx: mov _EAX, prm1 ; Src1 mov TMP1, prm2 ; Src2 mov TMP0, prm3 ; Stride push _EBX push _EDI %ifdef ARCH_IS_X86_64 mov _EBX, prm4 %else mov _EBX, [_ESP + 8 + 16] ; sad ptr %endif pxor mm5, mm5 ; accum pxor mm6, mm6 ; accum pxor mm7, mm7 ; zero SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX 
SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX pmaddwd mm5, [mmx_one] ; collapse pmaddwd mm6, [mmx_one] ; collapse movq mm2, mm5 movq mm3, mm6 psrlq mm2, 32 psrlq mm3, 32 paddd mm5, mm2 paddd mm6, mm3 movd [_EBX], mm5 movd [_EBX + 4], mm6 paddd mm5, mm6 movd edi, mm5 pxor mm5, mm5 pxor mm6, mm6 SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX SADV_16x16_MMX pmaddwd mm5, [mmx_one] ; collapse pmaddwd mm6, [mmx_one] ; collapse movq mm2, mm5 movq mm3, mm6 psrlq mm2, 32 psrlq mm3, 32 paddd mm5, mm2 paddd mm6, mm3 movd [_EBX + 8], mm5 movd [_EBX + 12], mm6 paddd mm5, mm6 movd eax, mm5 add _EAX, _EDI pop _EDI pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad16bi_mmx(const uint8_t * const cur, ; const uint8_t * const ref1, ; const uint8_t * const ref2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad16bi_mmx: mov _EAX, prm1 ; Src mov TMP1, prm2 ; Ref1 mov TMP0, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 ; Ref2 %else mov _EBX, [_ESP+4+12] ; Ref2 %endif pxor mm6, mm6 ; accum2 pxor mm7, mm7 .Loop: SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 SADBI_16x16_MMX 0, 0 SADBI_16x16_MMX 8, 1 pmaddwd mm6, [mmx_one] ; collapse movq mm7, mm6 psrlq mm7, 32 paddd mm6, mm7 movd eax, mm6 pop _EBX ret 
ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad8bi_mmx(const uint8_t * const cur, ; const uint8_t * const ref1, ; const uint8_t * const ref2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad8bi_mmx: mov _EAX, prm1 ; Src mov TMP1, prm2 ; Ref1 mov TMP0, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref2 %endif pxor mm6, mm6 ; accum2 pxor mm7, mm7 .Loop: SADBI_16x16_MMX 0, 1 SADBI_16x16_MMX 0, 1 SADBI_16x16_MMX 0, 1 SADBI_16x16_MMX 0, 1 SADBI_16x16_MMX 0, 1 SADBI_16x16_MMX 0, 1 SADBI_16x16_MMX 0, 1 SADBI_16x16_MMX 0, 1 pmaddwd mm6, [mmx_one] ; collapse movq mm7, mm6 psrlq mm7, 32 paddd mm6, mm7 movd eax, mm6 pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dev16_mmx(const uint8_t * const cur, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN dev16_mmx: mov _EAX, prm1 ; Src mov TMP0, prm2 ; Stride pxor mm7, mm7 ; zero pxor mm5, mm5 ; accum1 pxor mm6, mm6 ; accum2 MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX MEAN_16x16_MMX paddusw mm6, mm5 pmaddwd mm6, [mmx_one] ; collapse movq mm5, mm6 psrlq mm5, 32 paddd mm6, mm5 psllq mm6, 32 ; blank upper dword psrlq mm6, 32 + 8 ; /= (16*16) punpckldq mm6, mm6 packssdw mm6, mm6 ; mm6 contains the mean ; mm5 is the new accum pxor mm5, mm5 mov _EAX, prm1 ; Src ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX ABS_16x16_MMX pmaddwd mm5, [mmx_one] ; collapse movq mm6, mm5 psrlq mm6, 32 
paddd mm6, mm5 movd eax, mm6 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sse8_16bit_mmx(const int16_t *b1, ; const int16_t *b2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- %macro ROW_SSE_16bit_MMX 2 movq mm0, [%1] movq mm1, [%1+8] psubw mm0, [%2] psubw mm1, [%2+8] pmaddwd mm0, mm0 pmaddwd mm1, mm1 paddd mm2, mm0 paddd mm2, mm1 %endmacro sse8_16bit_mmx: ;; Load the function params mov _EAX, prm1 mov TMP0, prm2 mov TMP1, prm3 ;; Reset the sse accumulator pxor mm2, mm2 ;; Let's go %rep 8 ROW_SSE_16bit_MMX _EAX, TMP0 lea _EAX, [_EAX+TMP1] lea TMP0, [TMP0+TMP1] %endrep ;; Finish adding each dword of the accumulator movq mm3, mm2 psrlq mm2, 32 paddd mm2, mm3 movd eax, mm2 ;; All done ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sse8_8bit_mmx(const int8_t *b1, ; const int8_t *b2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- %macro ROW_SSE_8bit_MMX 2 movq mm0, [%1] ; load a row movq mm2, [%2] ; load a row movq mm1, mm0 ; copy row movq mm3, mm2 ; copy row punpcklbw mm0, mm7 ; turn the 4low elements into 16bit punpckhbw mm1, mm7 ; turn the 4high elements into 16bit punpcklbw mm2, mm7 ; turn the 4low elements into 16bit punpckhbw mm3, mm7 ; turn the 4high elements into 16bit psubw mm0, mm2 ; low part of src-dst psubw mm1, mm3 ; high part of src-dst pmaddwd mm0, mm0 ; compute the square sum pmaddwd mm1, mm1 ; compute the square sum paddd mm6, mm0 ; add to the accumulator paddd mm6, mm1 ; add to the accumulator %endmacro sse8_8bit_mmx: ;; Load the function params mov _EAX, prm1 mov TMP0, prm2 mov TMP1, prm3 ;; Reset the sse accumulator pxor mm6, mm6 ;; Used to interleave 8bit data with 0x00 values pxor mm7, mm7 ;; Let's go %rep 8 ROW_SSE_8bit_MMX _EAX, TMP0 lea _EAX, [_EAX+TMP1] lea TMP0, [TMP0+TMP1] %endrep ;; Finish adding each dword of 
the accumulator movq mm7, mm6 psrlq mm6, 32 paddd mm6, mm7 movd eax, mm6 ;; All done ret ENDFUNC NON_EXEC_STACK xvidcore/src/motion/x86_asm/sad_3dn.asm0000664000076500007650000001104711254216113021060 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - 3DNow sad operators w/o XMM instructions - ; * ; * Copyright(C) 2002 Peter ross ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: sad_3dn.asm,v 1.14 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN mmx_one: times 4 dw 1 ;============================================================================= ; Helper macros ;============================================================================= %macro SADBI_16x16_3DN 0 movq mm0, [_EAX] ; src movq mm2, [_EAX+8] movq mm1, [TMP1] ; ref1 movq mm3, [TMP1+8] pavgusb mm1, [_EBX] ; ref2 lea TMP1, [TMP1+TMP0] pavgusb mm3, [_EBX+8] lea _EBX, [_EBX+TMP0] movq mm4, mm0 lea _EAX, [_EAX+TMP0] psubusb mm0, mm1 movq mm5, mm2 psubusb mm2, 
mm3 psubusb mm1, mm4 por mm0, mm1 psubusb mm3, mm5 por mm2, mm3 movq mm1, mm0 movq mm3, mm2 punpcklbw mm0,mm7 punpckhbw mm1,mm7 punpcklbw mm2,mm7 punpckhbw mm3,mm7 paddusw mm0,mm1 paddusw mm2,mm3 paddusw mm6,mm0 paddusw mm6,mm2 %endmacro %macro SADBI_8x8_3DN 0 movq mm0, [_EAX] ; src movq mm2, [_EAX+TMP0] movq mm1, [TMP1] ; ref1 movq mm3, [TMP1+TMP0] pavgusb mm1, [_EBX] ; ref2 lea TMP1, [TMP1+2*TMP0] pavgusb mm3, [_EBX+TMP0] lea _EBX, [_EBX+2*TMP0] movq mm4, mm0 lea _EAX, [_EAX+2*TMP0] psubusb mm0, mm1 movq mm5, mm2 psubusb mm2, mm3 psubusb mm1, mm4 por mm0, mm1 psubusb mm3, mm5 por mm2, mm3 movq mm1, mm0 movq mm3, mm2 punpcklbw mm0,mm7 punpckhbw mm1,mm7 punpcklbw mm2,mm7 punpckhbw mm3,mm7 paddusw mm0,mm1 paddusw mm2,mm3 paddusw mm6,mm0 paddusw mm6,mm2 %endmacro ;============================================================================= ; Code ;============================================================================= TEXT cglobal sad16bi_3dn cglobal sad8bi_3dn ;----------------------------------------------------------------------------- ; ; uint32_t sad16bi_3dn(const uint8_t * const cur, ; const uint8_t * const ref1, ; const uint8_t * const ref2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad16bi_3dn: mov _EAX, prm1 ; Src mov TMP1, prm2 ; Ref1 mov TMP0, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref2 %endif pxor mm6, mm6 ; accum2 pxor mm7, mm7 .Loop: SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN SADBI_16x16_3DN pmaddwd mm6, [mmx_one] ; collapse movq mm7, mm6 psrlq mm7, 32 paddd mm6, mm7 movd eax, mm6 pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad8bi_3dn(const uint8_t * 
const cur, ; const uint8_t * const ref1, ; const uint8_t * const ref2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad8bi_3dn: mov _EAX, prm1 ; Src mov TMP1, prm2 ; Ref1 mov TMP0, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref2 %endif pxor mm6, mm6 ; accum2 pxor mm7, mm7 .Loop: SADBI_8x8_3DN SADBI_8x8_3DN SADBI_8x8_3DN SADBI_8x8_3DN pmaddwd mm6, [mmx_one] ; collapse movq mm7, mm6 psrlq mm7, 32 paddd mm6, mm7 movd eax, mm6 pop _EBX ret ENDFUNC NON_EXEC_STACK xvidcore/src/motion/x86_asm/sad_sse2.asm0000664000076500007650000002736311474471353021275 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - SSE2 optimized SAD operators - ; * ; * Copyright(C) 2003-2010 Pascal Massimino ; * 2008-2010 Michael Militzer ; * ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: sad_sse2.asm,v 1.21 2010-11-28 15:18:21 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN zero times 4 dd 0 ALIGN SECTION_ALIGN ones times 8 dw 1 ALIGN SECTION_ALIGN round32 times 4 dd 32 ;============================================================================= ; Coeffs for MSE_H calculation ;============================================================================= ALIGN SECTION_ALIGN iMask_Coeff: dw 0, 29788, 32767, 20479, 13653, 8192, 6425, 5372, dw 27306, 27306, 23405, 17246, 12603, 5650, 5461, 5958, dw 23405, 25205, 20479, 13653, 8192, 5749, 4749, 5851, dw 23405, 19275, 14894, 11299, 6425, 3766, 4096, 5285, dw 18204, 14894, 8856, 5851, 4819, 3006, 3181, 4255, dw 13653, 9362, 5958, 5120, 4045, 3151, 2900, 3562, dw 6687, 5120, 4201, 3766, 3181, 2708, 2730, 3244, dw 4551, 3562, 3449, 3344, 2926, 3277, 3181, 3310 ALIGN SECTION_ALIGN Inv_iMask_Coeff: dd 0, 155, 128, 328, 737, 2048, 3329, 4763, dd 184, 184, 251, 462, 865, 4306, 4608, 3872, dd 251, 216, 328, 737, 2048, 4159, 6094, 4014, dd 251, 370, 620, 1076, 3329, 9688, 8192, 4920, dd 415, 620, 1752, 4014, 5919, 15207, 13579, 7589, dd 737, 1568, 3872, 5243, 8398, 13844, 16345, 10834, dd 3073, 5243, 7787, 9688, 13579, 18741, 18433, 13057, dd 6636, 10834, 11552, 12294, 16056, 12800, 13579, 12545 ALIGN SECTION_ALIGN iCSF_Coeff: dw 26353, 38331, 42164, 26353, 17568, 10541, 8268, 6912, dw 35137, 35137, 30117, 22192, 16217, 7270, 7027, 7666, dw 30117, 32434, 26353, 17568, 10541, 7397, 6111, 7529, dw 30117, 24803, 19166, 14539, 
8268, 4846, 5271, 6801, dw 23425, 19166, 11396, 7529, 6201, 3868, 4094, 5476, dw 17568, 12047, 7666, 6588, 5205, 4054, 3731, 4583, dw 8605, 6588, 5406, 4846, 4094, 3485, 3514, 4175, dw 5856, 4583, 4438, 4302, 3765, 4216, 4094, 4259 ALIGN SECTION_ALIGN iCSF_Round: dw 1, 1, 1, 1, 2, 3, 4, 5, dw 1, 1, 1, 1, 2, 5, 5, 4, dw 1, 1, 1, 2, 3, 4, 5, 4, dw 1, 1, 2, 2, 4, 7, 6, 5, dw 1, 2, 3, 4, 5, 8, 8, 6, dw 2, 3, 4, 5, 6, 8, 9, 7, dw 4, 5, 6, 7, 8, 9, 9, 8, dw 6, 7, 7, 8, 9, 8, 8, 8 ;============================================================================= ; Code ;============================================================================= TEXT cglobal sad16_sse2 cglobal dev16_sse2 cglobal sad16_sse3 cglobal dev16_sse3 cglobal sseh8_16bit_sse2 cglobal coeff8_energy_sse2 cglobal blocksum8_sse2 ;----------------------------------------------------------------------------- ; uint32_t sad16_sse2 (const uint8_t * const cur, <- assumed aligned! ; const uint8_t * const ref, ; const uint32_t stride, ; const uint32_t /*ignored*/); ;----------------------------------------------------------------------------- %macro SAD_16x16_SSE2 1 %1 xmm0, [TMP1] %1 xmm1, [TMP1+TMP0] lea TMP1,[TMP1+2*TMP0] movdqa xmm2, [_EAX] movdqa xmm3, [_EAX+TMP0] lea _EAX,[_EAX+2*TMP0] psadbw xmm0, xmm2 paddusw xmm4,xmm0 psadbw xmm1, xmm3 paddusw xmm4,xmm1 %endmacro %macro SAD16_SSE2_SSE3 1 mov _EAX, prm1 ; cur (assumed aligned) mov TMP1, prm2 ; ref mov TMP0, prm3 ; stride pxor xmm4, xmm4 ; accum SAD_16x16_SSE2 %1 SAD_16x16_SSE2 %1 SAD_16x16_SSE2 %1 SAD_16x16_SSE2 %1 SAD_16x16_SSE2 %1 SAD_16x16_SSE2 %1 SAD_16x16_SSE2 %1 SAD_16x16_SSE2 %1 pshufd xmm5, xmm4, 00000010b paddusw xmm4, xmm5 pextrw eax, xmm4, 0 ret %endmacro ALIGN SECTION_ALIGN sad16_sse2: SAD16_SSE2_SSE3 movdqu ENDFUNC ALIGN SECTION_ALIGN sad16_sse3: SAD16_SSE2_SSE3 lddqu ENDFUNC ;----------------------------------------------------------------------------- ; uint32_t dev16_sse2(const uint8_t * const cur, const uint32_t stride); 
;-----------------------------------------------------------------------------

%macro MEAN_16x16_SSE2 1  ; _EAX: src, TMP0: stride, xmm5: zero or mean => xmm4: result
  %1 xmm0, [_EAX]
  %1 xmm1, [_EAX+TMP0]
  lea _EAX, [_EAX+2*TMP0]    ; + 2*stride
  psadbw xmm0, xmm5
  paddusw xmm4, xmm0
  psadbw xmm1, xmm5
  paddusw xmm4, xmm1
%endmacro

%macro MEAN16_SSE2_SSE3 1
  mov _EAX, prm1   ; src
  mov TMP0, prm2   ; stride

  pxor xmm4, xmm4     ; accum
  pxor xmm5, xmm5     ; zero

  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1

  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1

  mov _EAX, prm1       ; src again

  pshufd   xmm5, xmm4, 10b
  paddusw  xmm5, xmm4
  pxor     xmm4, xmm4     ; zero accum
  psrlw    xmm5, 8        ; => Mean
  pshuflw  xmm5, xmm5, 0  ; replicate Mean
  packuswb xmm5, xmm5
  pshufd   xmm5, xmm5, 00000000b

  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1

  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1
  MEAN_16x16_SSE2 %1

  pshufd   xmm5, xmm4, 10b
  paddusw  xmm5, xmm4
  pextrw eax, xmm5, 0

  ret
%endmacro

ALIGN SECTION_ALIGN
dev16_sse2:
  MEAN16_SSE2_SSE3 movdqu
ENDFUNC

ALIGN SECTION_ALIGN
dev16_sse3:
  MEAN16_SSE2_SSE3 lddqu
ENDFUNC

;-----------------------------------------------------------------------------
; uint32_t coeff8_energy_sse2(const int16_t * dct);
;-----------------------------------------------------------------------------

%macro DCT_ENERGY_SSE2 4

  movdqa %1, [%3 + %4]
  movdqa %2, [%3 + %4 + 16]

  psllw %1, 4
  psllw %2, 4

  pmulhw %1, [iMask_Coeff + %4]
  pmulhw %2, [iMask_Coeff + %4 + 16]

  pmaddwd %1, %1
  pmaddwd %2, %2

  paddd %1, %2
  psrld %1, 3

%endmacro

ALIGN SECTION_ALIGN
coeff8_energy_sse2:

  mov TMP0, prm1  ; DCT_A

  DCT_ENERGY_SSE2 xmm0, xmm1, TMP0, 0
  DCT_ENERGY_SSE2 xmm1, xmm2, TMP0, 32

  DCT_ENERGY_SSE2 xmm2, xmm3, TMP0, 64
  DCT_ENERGY_SSE2 xmm3, xmm4, TMP0, 96

  paddd xmm0, xmm1
  paddd xmm2, xmm3

  paddd xmm0, xmm2 ; A B C D

  ; convolute
  pshufd xmm1, xmm0, 238
  paddd xmm0, xmm1

  pshufd xmm2, xmm0, 85
  paddd xmm0, xmm2

  movd eax, xmm0

  ret
ENDFUNC
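
;-----------------------------------------------------------------------------
; Editorial sketch (not part of the original source): a scalar C model of
; what coeff8_energy_sse2 appears to compute, read off the instruction
; sequence above. The name coeff8_energy_ref and the imask parameter (meant
; to point at the 64-entry iMask_Coeff table) are illustrative assumptions.
; It assumes arithmetic right shift of negative values and two's-complement
; wrap on the int16_t cast, which is what psllw/pmulhw do in hardware.
;
;   #include <stdint.h>
;
;   static uint32_t coeff8_energy_ref(const int16_t *dct,
;                                     const int16_t *imask)
;   {
;       uint32_t total = 0;
;       for (int row = 0; row < 8; row += 2) {   /* two rows per macro call */
;           for (int k = 0; k < 8; k += 2) {     /* one 32-bit lane         */
;               uint32_t lane = 0;
;               for (int r = row; r < row + 2; r++) {
;                   for (int c = k; c < k + 2; c++) {
;                       int i = 8 * r + c;
;                       /* psllw 4: shift with 16-bit wrap-around */
;                       int16_t s = (int16_t)((uint16_t)dct[i] << 4);
;                       /* pmulhw: high 16 bits of the signed product */
;                       int32_t t = ((int32_t)s * imask[i]) >> 16;
;                       /* pmaddwd x,x followed by paddd: sum of squares */
;                       lane += (uint32_t)(t * t);
;                   }
;               }
;               total += lane >> 3;              /* psrld 3 per lane */
;           }
;       }
;       return total;
;   }
;
; Note that the shift by 3 is applied to each lane before the final
; horizontal sum, so the result is not simply (sum of squares) >> 3.
;-----------------------------------------------------------------------------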
;-----------------------------------------------------------------------------------
; uint32_t sseh8_16bit_sse2(const int16_t * cur, const int16_t * ref, uint16_t mask)
;-----------------------------------------------------------------------------------

%macro SSEH_SSE2 4
  movdqa xmm0, [%1 + %3]
  movdqa xmm1, [%2 + %3]
  movdqa xmm2, [%1 + %3 + 16]
  movdqa xmm3, [%2 + %3 + 16]

  movdqa xmm4, xmm7 ; MASK
  movdqa xmm5, xmm7

  psubsw xmm0, xmm1 ; A - B
  psubsw xmm2, xmm3

  ; ABS
  pxor xmm1, xmm1
  pxor xmm3, xmm3

  pcmpgtw xmm1, xmm0
  pcmpgtw xmm3, xmm2

  pxor xmm0, xmm1     ; change sign if negative
  pxor xmm2, xmm3     ;

  psubw xmm0, xmm1    ; ABS (A - B)
  psubw xmm2, xmm3    ; ABS (A - B)

  movdqa xmm1, xmm7 ; MASK
  movdqa xmm3, xmm7

  pmaddwd xmm4, [Inv_iMask_Coeff + 2*(%3)]
  pmaddwd xmm5, [Inv_iMask_Coeff + 2*(%3) + 16]
  pmaddwd xmm1, [Inv_iMask_Coeff + 2*(%3) + 32]
  pmaddwd xmm3, [Inv_iMask_Coeff + 2*(%3) + 48]

  psllw xmm0, 4
  psllw xmm2, 4

  paddd xmm4, [round32]
  paddd xmm5, [round32]
  paddd xmm1, [round32]
  paddd xmm3, [round32]

  psrad xmm4, 7
  psrad xmm5, 7
  psrad xmm1, 7
  psrad xmm3, 7

  packssdw xmm4, xmm5 ; Thresh
  packssdw xmm1, xmm3 ; Thresh

  psubusw xmm0, xmm4 ; Decimate by masking effect
  psubusw xmm2, xmm1

  paddusw xmm0, [iCSF_Round + %3]
  paddusw xmm2, [iCSF_Round + %3 + 16]

  pmulhuw xmm0, [iCSF_Coeff + %3]
  pmulhuw xmm2, [iCSF_Coeff + %3 + 16]

  pmaddwd xmm0, xmm0
  pmaddwd xmm2, xmm2

  paddd xmm0, xmm2
%endmacro

ALIGN SECTION_ALIGN
sseh8_16bit_sse2:

  PUSH_XMM6_XMM7

  mov TMP0, prm1  ; DCT_A
  mov TMP1, prm2  ; DCT_B
  mov _EAX, prm3  ; MASK

  movd xmm7, eax
  pshufd xmm7, xmm7, 0

  SSEH_SSE2 TMP0, TMP1,   0, xmm7
  movdqa xmm6, xmm0
  SSEH_SSE2 TMP0, TMP1,  32, xmm7
  paddd xmm6, xmm0
  SSEH_SSE2 TMP0, TMP1,  64, xmm7
  paddd xmm6, xmm0
  SSEH_SSE2 TMP0, TMP1,  96, xmm7
  paddd xmm6, xmm0

  ; convolute
  pshufd xmm1, xmm6, 238
  paddd xmm6, xmm1

  pshufd xmm2, xmm6, 85
  paddd xmm6, xmm2

  movd eax, xmm6

  POP_XMM6_XMM7
  ret
ENDFUNC

;--------------------------------------------------------------------------------------------
; uint32_t blocksum8_sse2(const int8_t *
cur, int stride, uint16_t sums[4], uint32_t squares[4]) ;-------------------------------------------------------------------------------------------- %macro BLOCKSUM_SSE2 3 movq xmm0, [%1 ] ; 0 0 B A movq xmm2, [%1 + %2] ; 0 0 B A movq xmm1, [%1 + 2*%2] movq xmm3, [%1 + %3] punpckldq xmm0, xmm2 ; B B A A punpckldq xmm1, xmm3 ; B B A A movdqa xmm2, xmm0 movdqa xmm3, xmm1 psadbw xmm0, xmm7 ; 000b000a psadbw xmm1, xmm7 movdqa xmm4, xmm2 movdqa xmm5, xmm3 punpcklbw xmm2, xmm7 ; aaaaaaaa punpcklbw xmm3, xmm7 punpckhbw xmm4, xmm7 ; bbbbbbbb punpckhbw xmm5, xmm7 pmaddwd xmm2, xmm2 ; a*a+a*a a*a+a*a a*a+a*a a*a+a*a pmaddwd xmm3, xmm3 pmaddwd xmm4, xmm4 ; b*b+b*b b*b+b*b b*b+b*b b*b+b*b pmaddwd xmm5, xmm5 paddd xmm2, xmm3 paddd xmm4, xmm5 movdqa xmm3, xmm2 punpckldq xmm2, xmm4 ; BABA punpckhdq xmm3, xmm4 ; BABA paddd xmm2, xmm3 lea %1, [%1 + 4*%2] movdqa xmm4, xmm2 punpckhqdq xmm4, xmm7 ; paddd xmm2, xmm4 ; movq xmm3, [%1 ] ; 0 0 D C movq xmm5, [%1 + %2] ; 0 0 D C movq xmm4, [%1 + 2*%2] movq xmm6, [%1 + %3] punpckldq xmm3, xmm5 ; D D C C punpckldq xmm4, xmm6 ; D D C C movdqa xmm5, xmm3 movdqa xmm6, xmm4 psadbw xmm3, xmm7 ; 000d000c psadbw xmm4, xmm7 packssdw xmm0, xmm3 ; 0d0c0b0a packssdw xmm1, xmm4 ; paddusw xmm0, xmm1 packssdw xmm0, xmm7 ; 0000dcba movdqa xmm3, xmm5 movdqa xmm4, xmm6 punpcklbw xmm3, xmm7 punpcklbw xmm4, xmm7 punpckhbw xmm5, xmm7 punpckhbw xmm6, xmm7 pmaddwd xmm3, xmm3 ; C*C+C*C pmaddwd xmm4, xmm4 pmaddwd xmm5, xmm5 ; D*D+D*D pmaddwd xmm6, xmm6 paddd xmm3, xmm4 paddd xmm5, xmm6 movdqa xmm1, xmm3 punpckldq xmm3, xmm5 ; DCDC punpckhdq xmm1, xmm5 ; DCDC paddd xmm3, xmm1 movdqa xmm4, xmm3 punpckhqdq xmm4, xmm7 ; paddd xmm3, xmm4 punpcklqdq xmm2, xmm3 %endmacro ALIGN SECTION_ALIGN blocksum8_sse2: PUSH_XMM6_XMM7 mov TMP0, prm1 ; cur mov TMP1, prm2 ; stride mov _EAX, prm3 ; sums push _EBP lea _EBP, [TMP1 + 2*TMP1] pxor xmm7, xmm7 BLOCKSUM_SSE2 TMP0, TMP1, _EBP pop _EBP mov TMP0, prm4 ; squares movq [_EAX], xmm0 ; sums of the 4x4 sub-blocks movdqa [TMP0], xmm2 ; 
squares of the 4x4 sub-blocks pmaddwd xmm0, [ones] packssdw xmm0, xmm7 pmaddwd xmm0, [ones] movd eax, xmm0 POP_XMM6_XMM7 ret ENDFUNC NON_EXEC_STACK xvidcore/src/motion/x86_asm/sad_3dne.asm0000664000076500007650000002345711254216113021235 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - K7 optimized SAD operators - ; * ; * Copyright(C) 2002 Jaan Kalda ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: sad_3dne.asm,v 1.12 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ ; these 3dne functions are compatible with iSSE, but are optimized specifically ; for K7 pipelines %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN mmx_one: times 4 dw 1 ;============================================================================= ; Helper macros ;============================================================================= ;; %1 block number (0..4) %macro SAD_16x16_SSE 1 movq mm7, [_EAX] movq mm6, [_EAX+8] psadbw mm7, [TMP1] psadbw mm6, [TMP1+8] %if (%1) paddd mm1, mm5 %endif movq 
mm5, [_EAX+TMP0] movq mm4, [_EAX+TMP0+8] psadbw mm5, [TMP1+TMP0] psadbw mm4, [TMP1+TMP0+8] movq mm3, [_EAX+2*TMP0] movq mm2, [_EAX+2*TMP0+8] psadbw mm3, [TMP1+2*TMP0] psadbw mm2, [TMP1+2*TMP0+8] %if (%1) movd [_ESP+4*(%1-1)], mm1 %else sub _ESP, byte 12 %endif movq mm1, [_EAX+_EBX] movq mm0, [_EAX+_EBX+8] psadbw mm1, [TMP1+_EBX] psadbw mm0, [TMP1+_EBX+8] lea _EAX, [_EAX+4*TMP0] lea TMP1, [TMP1+4*TMP0] paddd mm7, mm6 paddd mm5, mm4 paddd mm3, mm2 paddd mm1, mm0 paddd mm5, mm7 paddd mm1, mm3 %endmacro %macro SADBI_16x16_SSE0 0 movq mm2, [TMP1] movq mm3, [TMP1+8] movq mm5, [byte _EAX] movq mm6, [_EAX+8] pavgb mm2, [byte _EBX] pavgb mm3, [_EBX+8] add TMP1, TMP0 psadbw mm5, mm2 psadbw mm6, mm3 add _EAX, TMP0 add _EBX, TMP0 movq mm2, [byte TMP1] movq mm3, [TMP1+8] movq mm0, [byte _EAX] movq mm1, [_EAX+8] pavgb mm2, [byte _EBX] pavgb mm3, [_EBX+8] add TMP1, TMP0 add _EAX, TMP0 add _EBX, TMP0 psadbw mm0, mm2 psadbw mm1, mm3 %endmacro %macro SADBI_16x16_SSE 0 movq mm2, [byte TMP1] movq mm3, [TMP1+8] paddusw mm5, mm0 paddusw mm6, mm1 movq mm0, [_EAX] movq mm1, [_EAX+8] pavgb mm2, [_EBX] pavgb mm3, [_EBX+8] add TMP1, TMP0 add _EAX, TMP0 add _EBX, TMP0 psadbw mm0, mm2 psadbw mm1, mm3 %endmacro %macro SADBI_8x8_3dne 0 movq mm2, [TMP1] movq mm3, [TMP1+TMP0] pavgb mm2, [_EAX] pavgb mm3, [_EAX+TMP0] lea TMP1, [TMP1+2*TMP0] lea _EAX, [_EAX+2*TMP0] paddusw mm5, mm0 paddusw mm6, mm1 movq mm0, [_EBX] movq mm1, [_EBX+TMP0] lea _EBX, [_EBX+2*TMP0] psadbw mm0, mm2 psadbw mm1, mm3 %endmacro %macro ABS_16x16_SSE 1 %if (%1 == 0) movq mm7, [_EAX] psadbw mm7, mm4 mov esi, esi movq mm6, [_EAX+8] movq mm5, [_EAX+TMP0] movq mm3, [_EAX+TMP0+8] psadbw mm6, mm4 movq mm2, [byte _EAX+2*TMP0] psadbw mm5, mm4 movq mm1, [_EAX+2*TMP0+8] psadbw mm3, mm4 movq mm0, [_EAX+TMP1+0] psadbw mm2, mm4 add _EAX, TMP1 psadbw mm1, mm4 %endif %if (%1 == 1) psadbw mm0, mm4 paddd mm7, mm0 movq mm0, [_EAX+8] psadbw mm0, mm4 paddd mm6, mm0 movq mm0, [byte _EAX+TMP0] psadbw mm0, mm4 paddd mm5, mm0 movq mm0, [_EAX+TMP0+8] 
psadbw mm0, mm4 paddd mm3, mm0 movq mm0, [_EAX+2*TMP0] psadbw mm0, mm4 paddd mm2, mm0 movq mm0, [_EAX+2*TMP0+8] add _EAX, TMP1 psadbw mm0, mm4 paddd mm1, mm0 movq mm0, [_EAX] %endif %if (%1 == 2) psadbw mm0, mm4 paddd mm7, mm0 movq mm0, [_EAX+8] psadbw mm0, mm4 paddd mm6, mm0 %endif %endmacro ;============================================================================= ; Code ;============================================================================= TEXT cglobal sad16_3dne cglobal sad8_3dne cglobal sad16bi_3dne cglobal sad8bi_3dne cglobal dev16_3dne ;----------------------------------------------------------------------------- ; ; uint32_t sad16_3dne(const uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride, ; const uint32_t best_sad); ; ;----------------------------------------------------------------------------- ; optimization: 21% faster ALIGN SECTION_ALIGN sad16_3dne: mov _EAX, prm1 ; Src1 mov TMP1, prm2 ; Src2 mov TMP0, prm3 ; Stride push _EBX lea _EBX, [2*TMP0+TMP0] SAD_16x16_SSE 0 SAD_16x16_SSE 1 SAD_16x16_SSE 2 SAD_16x16_SSE 3 paddd mm1, mm5 movd eax, mm1 add eax, dword [_ESP] add eax, dword [_ESP+4] add eax, dword [_ESP+8] mov _EBX, [_ESP+12] add _ESP, byte PTR_SIZE+12 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad8_3dne(const uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad8_3dne: mov _EAX, prm1 ; Src1 mov TMP0, prm3 ; Stride mov TMP1, prm2 ; Src2 push _EBX lea _EBX, [TMP0+2*TMP0] movq mm0, [byte _EAX] ;0 psadbw mm0, [byte TMP1] movq mm1, [_EAX+TMP0] ;1 psadbw mm1, [TMP1+TMP0] movq mm2, [_EAX+2*TMP0] ;2 psadbw mm2, [TMP1+2*TMP0] movq mm3, [_EAX+_EBX] ;3 psadbw mm3, [TMP1+_EBX] paddd mm0, mm1 movq mm4, [byte _EAX+4*TMP0];4 psadbw mm4, [TMP1+4*TMP0] movq mm5, [_EAX+2*_EBX] ;6 psadbw mm5, [TMP1+2*_EBX] paddd mm2, mm3 paddd mm0, mm2 lea _EBX, 
[_EBX+4*TMP0] ;3+4=7 lea TMP0, [TMP0+4*TMP0] ; 5 movq mm6, [_EAX+TMP0] ;5 psadbw mm6, [TMP1+TMP0] movq mm7, [_EAX+_EBX] ;7 psadbw mm7, [TMP1+_EBX] paddd mm4, mm5 paddd mm6, mm7 paddd mm0, mm4 mov _EBX, [_ESP] add _ESP, byte PTR_SIZE paddd mm0, mm6 movd eax, mm0 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad16bi_3dne(const uint8_t * const cur, ; const uint8_t * const ref1, ; const uint8_t * const ref2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ;optimization: 14% faster ALIGN SECTION_ALIGN sad16bi_3dne: mov _EAX, prm1 ; Src mov TMP1, prm2 ; Ref1 mov TMP0, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref2 %endif SADBI_16x16_SSE0 SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE paddusw mm5,mm0 paddusw mm6,mm1 pop _EBX paddusw mm6,mm5 movd eax, mm6 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad8bi_3dne(const uint8_t * const cur, ; const uint8_t * const ref1, ; const uint8_t * const ref2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad8bi_3dne: mov _EAX, prm3 ; Ref2 mov TMP1, prm2 ; Ref1 mov TMP0, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm1 %else mov _EBX, [_ESP+4+ 4] ; Src %endif movq mm2, [TMP1] movq mm3, [TMP1+TMP0] pavgb mm2, [_EAX] pavgb mm3, [_EAX+TMP0] lea TMP1, [TMP1+2*TMP0] lea _EAX, [_EAX+2*TMP0] movq mm5, [_EBX] movq mm6, [_EBX+TMP0] lea _EBX, [_EBX+2*TMP0] psadbw mm5, mm2 psadbw mm6, mm3 movq mm2, [TMP1] movq mm3, [TMP1+TMP0] pavgb mm2, [_EAX] pavgb mm3, [_EAX+TMP0] lea TMP1, [TMP1+2*TMP0] lea _EAX, [_EAX+2*TMP0] movq mm0, [_EBX] movq mm1, [_EBX+TMP0] lea _EBX, 
[_EBX+2*TMP0] psadbw mm0, mm2 psadbw mm1, mm3 movq mm2, [TMP1] movq mm3, [TMP1+TMP0] pavgb mm2, [_EAX] pavgb mm3, [_EAX+TMP0] lea TMP1, [TMP1+2*TMP0] lea _EAX, [_EAX+2*TMP0] paddusw mm5,mm0 paddusw mm6,mm1 movq mm0, [_EBX] movq mm1, [_EBX+TMP0] lea _EBX, [_EBX+2*TMP0] psadbw mm0, mm2 psadbw mm1, mm3 movq mm2, [TMP1] movq mm3, [TMP1+TMP0] pavgb mm2, [_EAX] pavgb mm3, [_EAX+TMP0] paddusw mm5,mm0 paddusw mm6,mm1 movq mm0, [_EBX] movq mm1, [_EBX+TMP0] psadbw mm0, mm2 psadbw mm1, mm3 paddusw mm5,mm0 paddusw mm6,mm1 paddusw mm6,mm5 mov _EBX,[_ESP] add _ESP,byte PTR_SIZE movd eax, mm6 ret ENDFUNC ;=========================================================================== ; ; uint32_t dev16_3dne(const uint8_t * const cur, ; const uint32_t stride); ; ;=========================================================================== ; optimization: 25 % faster ALIGN SECTION_ALIGN dev16_3dne: mov _EAX, prm1 ; Src mov TMP0, prm2 ; Stride lea TMP1, [TMP0+2*TMP0] pxor mm4, mm4 ALIGN SECTION_ALIGN ABS_16x16_SSE 0 ABS_16x16_SSE 1 ABS_16x16_SSE 1 ABS_16x16_SSE 1 ABS_16x16_SSE 1 paddd mm1, mm2 paddd mm3, mm5 ABS_16x16_SSE 2 paddd mm7, mm6 paddd mm1, mm3 mov _EAX, prm1 ; Src paddd mm7, mm1 punpcklbw mm7, mm7 ;xxyyaazz pshufw mm4, mm7, 055h ; mm4 contains the mean pxor mm1, mm1 ABS_16x16_SSE 0 ABS_16x16_SSE 1 ABS_16x16_SSE 1 ABS_16x16_SSE 1 ABS_16x16_SSE 1 paddd mm1, mm2 paddd mm3, mm5 ABS_16x16_SSE 2 paddd mm7, mm6 paddd mm1, mm3 paddd mm7, mm1 movd eax, mm7 ret ENDFUNC NON_EXEC_STACK xvidcore/src/motion/x86_asm/sad_xmm.asm0000664000076500007650000002074711254216113021204 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - K7 optimized SAD operators - ; * ; * Copyright(C) 2001 Peter Ross ; * 2001-2008 Michael Militzer ; * 2002 Pascal Massimino ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the 
Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: sad_xmm.asm,v 1.15 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN mmx_one: times 4 dw 1 ;============================================================================= ; Helper macros ;============================================================================= %macro SAD_16x16_SSE 0 movq mm0, [_EAX] psadbw mm0, [TMP1] movq mm1, [_EAX+8] add _EAX, TMP0 psadbw mm1, [TMP1+8] paddusw mm5, mm0 add TMP1, TMP0 paddusw mm6, mm1 %endmacro %macro SAD_8x8_SSE 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP0] psadbw mm0, [TMP1] psadbw mm1, [TMP1+TMP0] add _EAX, _EBX add TMP1, _EBX paddusw mm5, mm0 paddusw mm6, mm1 %endmacro %macro SADBI_16x16_SSE 0 movq mm0, [_EAX] movq mm1, [_EAX+8] movq mm2, [TMP1] movq mm3, [TMP1+8] pavgb mm2, [_EBX] add TMP1, TMP0 pavgb mm3, [_EBX+8] add _EBX, TMP0 psadbw mm0, mm2 add _EAX, TMP0 psadbw mm1, mm3 paddusw mm5, mm0 paddusw mm6, mm1 %endmacro %macro SADBI_8x8_XMM 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP0] movq mm2, [TMP1] movq mm3, [TMP1+TMP0] pavgb mm2, [_EBX] lea TMP1, [TMP1+2*TMP0] pavgb mm3, [_EBX+TMP0] lea _EBX, [_EBX+2*TMP0] psadbw mm0, mm2 lea _EAX, [_EAX+2*TMP0] psadbw mm1, mm3 paddusw mm5, 
mm0 paddusw mm6, mm1 %endmacro %macro MEAN_16x16_SSE 0 movq mm0, [_EAX] movq mm1, [_EAX+8] psadbw mm0, mm7 psadbw mm1, mm7 add _EAX, TMP0 paddw mm5, mm0 paddw mm6, mm1 %endmacro %macro ABS_16x16_SSE 0 movq mm0, [_EAX] movq mm1, [_EAX+8] psadbw mm0, mm4 psadbw mm1, mm4 lea _EAX, [_EAX+TMP0] paddw mm5, mm0 paddw mm6, mm1 %endmacro ;============================================================================= ; Code ;============================================================================= TEXT cglobal sad16_xmm cglobal sad8_xmm cglobal sad16bi_xmm cglobal sad8bi_xmm cglobal dev16_xmm cglobal sad16v_xmm ;----------------------------------------------------------------------------- ; ; uint32_t sad16_xmm(const uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride, ; const uint32_t best_sad); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad16_xmm: mov _EAX, prm1 ; Src1 mov TMP1, prm2 ; Src2 mov TMP0, prm3 ; Stride pxor mm5, mm5 ; accum1 pxor mm6, mm6 ; accum2 SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE paddusw mm6,mm5 movd eax, mm6 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad8_xmm(const uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad8_xmm: mov _EAX, prm1 ; Src1 mov TMP1, prm2 ; Src2 mov TMP0, prm3 ; Stride push _EBX lea _EBX, [TMP0+TMP0] pxor mm5, mm5 ; accum1 pxor mm6, mm6 ; accum2 SAD_8x8_SSE SAD_8x8_SSE SAD_8x8_SSE movq mm0, [_EAX] movq mm1, [_EAX+TMP0] psadbw mm0, [TMP1] psadbw mm1, [TMP1+TMP0] pop _EBX paddusw mm5,mm0 paddusw mm6,mm1 paddusw mm6,mm5 movd eax, mm6 ret ENDFUNC 
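
;-----------------------------------------------------------------------------
; Editorial sketch (not part of the original source): a scalar C model of
; the result returned by sad16_xmm and sad8_xmm above. The name sad_ref and
; the block-size parameter n are illustrative assumptions. Each psadbw sums
; the absolute differences of eight byte pairs, which the inner loop below
; spells out one pixel at a time.
;
;   #include <stdint.h>
;   #include <stdlib.h>
;
;   static uint32_t sad_ref(const uint8_t *cur, const uint8_t *ref,
;                           uint32_t stride, int n)   /* n = 16 or 8 */
;   {
;       uint32_t sad = 0;
;       for (int y = 0; y < n; y++) {
;           for (int x = 0; x < n; x++)
;               sad += (uint32_t)abs(cur[x] - ref[x]);
;           cur += stride;    /* advance both blocks by one row */
;           ref += stride;
;       }
;       return sad;
;   }
;
; The assembly accumulates with saturating 16-bit adds (paddusw). For a
; 16x16 block the largest possible sum is 256 * 255 = 65280, which still
; fits in 16 bits, so saturation cannot change the result. As written,
; sad16_xmm reads only its first three parameters; best_sad is not used
; for early termination.
;-----------------------------------------------------------------------------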
;----------------------------------------------------------------------------- ; ; uint32_t sad16bi_xmm(const uint8_t * const cur, ; const uint8_t * const ref1, ; const uint8_t * const ref2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad16bi_xmm: mov _EAX, prm1 ; Src mov TMP1, prm2 ; Ref1 mov TMP0, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref2 %endif pxor mm5, mm5 ; accum1 pxor mm6, mm6 ; accum2 SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE SADBI_16x16_SSE paddusw mm6,mm5 movd eax, mm6 pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t sad8bi_xmm(const uint8_t * const cur, ; const uint8_t * const ref1, ; const uint8_t * const ref2, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad8bi_xmm: mov _EAX, prm1 ; Src mov TMP1, prm2 ; Ref1 mov TMP0, prm4 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [_ESP+4+12] ; Ref2 %endif pxor mm5, mm5 ; accum1 pxor mm6, mm6 ; accum2 .Loop: SADBI_8x8_XMM SADBI_8x8_XMM SADBI_8x8_XMM SADBI_8x8_XMM paddusw mm6,mm5 movd eax, mm6 pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dev16_xmm(const uint8_t * const cur, ; const uint32_t stride); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN dev16_xmm: mov _EAX, prm1 ; Src mov TMP0, prm2 ; Stride pxor mm7, mm7 ; zero pxor mm5, mm5 ; mean accums pxor mm6, mm6 MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE 
MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE MEAN_16x16_SSE paddusw mm6, mm5 movq mm4, mm6 psllq mm4, 32 paddd mm4, mm6 psrld mm4, 8 ; /= (16*16) packssdw mm4, mm4 packuswb mm4, mm4 ; mm4 contains the mean mov _EAX, prm1 ; Src pxor mm5, mm5 ; sums pxor mm6, mm6 ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE ABS_16x16_SSE paddusw mm6, mm5 movq mm7, mm6 psllq mm7, 32 paddd mm6, mm7 movd eax, mm6 ret ENDFUNC ;----------------------------------------------------------------------------- ;int sad16v_xmm(const uint8_t * const cur, ; const uint8_t * const ref, ; const uint32_t stride, ; int* sad8); ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN sad16v_xmm: mov _EAX, prm1 ; Src1 mov TMP1, prm2 ; Src2 mov TMP0, prm3 ; Stride push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm4 %else mov _EBX, [_ESP+4+16] ; sad ptr %endif pxor mm5, mm5 ; accum1 pxor mm6, mm6 ; accum2 pxor mm7, mm7 ; total SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE paddusw mm7, mm5 paddusw mm7, mm6 movd [_EBX], mm5 movd [_EBX+4], mm6 pxor mm5, mm5 ; accum1 pxor mm6, mm6 ; accum2 SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE SAD_16x16_SSE paddusw mm7, mm5 paddusw mm7, mm6 movd [_EBX+8], mm5 movd [_EBX+12], mm6 movd eax, mm7 pop _EBX ret ENDFUNC NON_EXEC_STACK xvidcore/src/motion/estimation_rd_based.c0000664000076500007650000010325311564705453021740 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Rate-Distortion Based Motion Estimation for P- and S- VOPs - * * Copyright(C) 2003 Radoslaw Czyz * 2003-2010 Michael Militzer * * This program is free software ; 
you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: estimation_rd_based.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ /* RD mode decision and search */ #include #include #include #include /* memcpy */ #include "../encoder.h" #include "../bitstream/mbcoding.h" #include "../prediction/mbprediction.h" #include "../global.h" #include "../image/interpolate8x8.h" #include "estimation.h" #include "motion.h" #include "sad.h" #include "../bitstream/zigzag.h" #include "../quant/quant.h" #include "../bitstream/vlc_codes.h" #include "../dct/fdct.h" #include "motion_inlines.h" /* rd = BITS_MULT*bits + LAMBDA*distortion */ #define LAMBDA ( (int)(BITS_MULT*1.0) ) static __inline unsigned int Block_CalcBits( int16_t * const coeff, int16_t * const data, int16_t * const dqcoeff, const uint32_t quant, const int quant_type, uint32_t * cbp, const int block, const uint16_t * scan_table, const unsigned int lambda, const uint16_t * mpeg_quant_matrices, const unsigned int quant_sq, const unsigned int rel_var8, const unsigned int metric) { int sum; int bits; int distortion = 0; fdct((short * const)data); if (quant_type) sum = quant_h263_inter(coeff, data, quant, mpeg_quant_matrices); else sum = quant_mpeg_inter(coeff, data, quant, mpeg_quant_matrices); if (sum > 0) { *cbp |= 1 << (5 - block); bits = 
BITS_MULT * CodeCoeffInter_CalcBits(coeff, scan_table); if (quant_type) dequant_h263_inter(dqcoeff, coeff, quant, mpeg_quant_matrices); else dequant_mpeg_inter(dqcoeff, coeff, quant, mpeg_quant_matrices); if (metric) distortion = masked_sseh8_16bit(data, dqcoeff, rel_var8); else distortion = sse8_16bit(data, dqcoeff, 8*sizeof(int16_t)); } else { const static int16_t zero_block[64] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, }; bits = 0; if (metric) distortion = masked_sseh8_16bit(data, (int16_t * const) zero_block, rel_var8); else distortion = sse8_16bit(data, (int16_t * const) zero_block, 8*sizeof(int16_t)); } return bits + (lambda*distortion)/quant_sq; } static __inline unsigned int Block_CalcBitsIntra(MACROBLOCK * pMB, const unsigned int x, const unsigned int y, const unsigned int mb_width, const uint32_t block, int16_t coeff[64], int16_t qcoeff[64], int16_t dqcoeff[64], int16_t predictors[8], const uint32_t quant, const int quant_type, unsigned int bits[2], unsigned int cbp[2], unsigned int lambda, const uint16_t * mpeg_quant_matrices, const unsigned int quant_sq, const unsigned int metric, const int bound) { int direction; int16_t *pCurrent; unsigned int i, coded; unsigned int distortion = 0; const uint32_t iDcScaler = get_dc_scaler(quant, block < 4); fdct((short * const)coeff); if (quant_type) { quant_h263_intra(qcoeff, coeff, quant, iDcScaler, mpeg_quant_matrices); dequant_h263_intra(dqcoeff, qcoeff, quant, iDcScaler, mpeg_quant_matrices); } else { quant_mpeg_intra(qcoeff, coeff, quant, iDcScaler, mpeg_quant_matrices); dequant_mpeg_intra(dqcoeff, qcoeff, quant, iDcScaler, mpeg_quant_matrices); } predict_acdc(pMB-(x+mb_width*y), x, y, mb_width, block, qcoeff, quant, iDcScaler, predictors, bound); direction = pMB->acpred_directions[block]; pCurrent = (int16_t*)pMB->pred_values[block]; /* store current coeffs to 
pred_values[] for future prediction */

	pCurrent[0] = qcoeff[0] * iDcScaler;
	pCurrent[0] = CLIP(pCurrent[0], -2048, 2047);
	for (i = 1; i < 8; i++) {
		pCurrent[i] = qcoeff[i];
		pCurrent[i + 7] = qcoeff[i * 8];
	}

	/* dc prediction */
	qcoeff[0] = qcoeff[0] - predictors[0];

	if (block < 4) bits[1] = bits[0] = dcy_tab[qcoeff[0] + 255].len - 3; /* 3 bits added before (4 times) */
	else bits[1] = bits[0] = dcc_tab[qcoeff[0] + 255].len - 2; /* 2 bits added before (2 times) */

	/* calc cost before ac prediction */
	bits[0] += coded = CodeCoeffIntra_CalcBits(qcoeff, scan_tables[0]);
	if (coded > 0) cbp[0] |= 1 << (5 - block);

	/* apply ac prediction & calc cost */
	if (direction == 1) {
		for (i = 1; i < 8; i++) {
			qcoeff[i] -= predictors[i];
			predictors[i] = qcoeff[i];
		}
	} else { /* acpred_direction == 2 */
		for (i = 1; i < 8; i++) {
			qcoeff[i*8] -= predictors[i];
			predictors[i] = qcoeff[i*8];
		}
	}

	bits[1] += coded = CodeCoeffIntra_CalcBits(qcoeff, scan_tables[direction]);
	if (coded > 0) cbp[1] |= 1 << (5 - block);

	if (metric) distortion = masked_sseh8_16bit(coeff, dqcoeff, pMB->rel_var8[block]);
	else distortion = sse8_16bit(coeff, dqcoeff, 8*sizeof(int16_t));

	return (lambda*distortion)/quant_sq;
}

static void
CheckCandidateRD16(const int x, const int y, SearchData * const data, const unsigned int Direction)
{

	int16_t *in = data->dctSpace, *coeff = data->dctSpace + 64;
	/* minimum number of bits INTER can take is 1 (mcbpc) + 2 (cbpy) + 2 (vector) */
	int32_t rd = BITS_MULT * (1+2+2);
	VECTOR * current;
	const uint8_t * ptr;
	int i, t, xc, yc;
	unsigned cbp = 0;

	if ( (x > data->max_dx) || (x < data->min_dx)
		|| (y > data->max_dy) || (y < data->min_dy) ) return;

	if (!data->qpel_precision) {
		ptr = GetReference(x, y, data);
		current = data->currentMV;
		xc = x; yc = y;
	} else { /* x and y are in 1/4 precision */
		ptr = xvid_me_interpolate16x16qpel(x, y, 0, data);
		current = data->currentQMV;
		xc = x/2; yc = y/2;
	}

	for(i = 0; i < 4; i++) {
		int s = 8*((i&1) + (i>>1)*data->iEdgedWidth);
		transfer_8to16subro(in,
data->Cur + s, ptr + s, data->iEdgedWidth); rd += data->temp[i] = Block_CalcBits(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, i, data->scan_table, data->lambda[i], data->mpeg_quant_matrices, data->quant_sq, data->rel_var8[i], data->metric); } rd += t = BITS_MULT * (d_mv_bits(x, y, data->predMV, data->iFcode, data->qpel^data->qpel_precision) - 2); if (data->temp[0] + t < data->iMinSAD[1]) { data->iMinSAD[1] = data->temp[0] + t; current[1].x = x; current[1].y = y; data->cbp[1] = (data->cbp[1]&~32) | (cbp&32); } if (data->temp[1] < data->iMinSAD[2]) { data->iMinSAD[2] = data->temp[1]; current[2].x = x; current[2].y = y; data->cbp[1] = (data->cbp[1]&~16) | (cbp&16); } if (data->temp[2] < data->iMinSAD[3]) { data->iMinSAD[3] = data->temp[2]; current[3].x = x; current[3].y = y; data->cbp[1] = (data->cbp[1]&~8) | (cbp&8); } if (data->temp[3] < data->iMinSAD[4]) { data->iMinSAD[4] = data->temp[3]; current[4].x = x; current[4].y = y; data->cbp[1] = (data->cbp[1]&~4) | (cbp&4); } rd += BITS_MULT * (xvid_cbpy_tab[15-(cbp>>2)].len - 2); if (rd >= data->iMinSAD[0]) return; /* chroma */ xc = (xc >> 1) + roundtab_79[xc & 0x3]; yc = (yc >> 1) + roundtab_79[yc & 0x3]; /* chroma U */ ptr = interpolate8x8_switch2(data->RefQ, data->RefP[4], 0, 0, xc, yc, data->iEdgedWidth/2, data->rounding); transfer_8to16subro(in, data->CurU, ptr, data->iEdgedWidth/2); rd += Block_CalcBits(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 4, data->scan_table, data->lambda[4], data->mpeg_quant_matrices, data->quant_sq, data->rel_var8[4], data->metric); if (rd >= data->iMinSAD[0]) return; /* chroma V */ ptr = interpolate8x8_switch2(data->RefQ, data->RefP[5], 0, 0, xc, yc, data->iEdgedWidth/2, data->rounding); transfer_8to16subro(in, data->CurV, ptr, data->iEdgedWidth/2); rd += Block_CalcBits(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 5, data->scan_table, data->lambda[5], data->mpeg_quant_matrices, data->quant_sq, 
data->rel_var8[5], data->metric); rd += BITS_MULT * (mcbpc_inter_tab[(MODE_INTER & 7) | ((cbp & 3) << 3)].len - 1); /* one was added before */ if (rd < data->iMinSAD[0]) { data->iMinSAD[0] = rd; current[0].x = x; current[0].y = y; data->dir = Direction; *data->cbp = cbp; } } static void CheckCandidateRD8(const int x, const int y, SearchData * const data, const unsigned int Direction) { int16_t *in = data->dctSpace, *coeff = data->dctSpace + 64; int32_t rd; VECTOR * current; const uint8_t * ptr; unsigned int cbp = 0; if ( (x > data->max_dx) || (x < data->min_dx) || (y > data->max_dy) || (y < data->min_dy) ) return; if (!data->qpel_precision) { ptr = GetReference(x, y, data); current = data->currentMV; } else { /* x and y are in 1/4 precision */ ptr = xvid_me_interpolate8x8qpel(x, y, 0, 0, data); current = data->currentQMV; } transfer_8to16subro(in, data->Cur, ptr, data->iEdgedWidth); rd = Block_CalcBits(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 5, data->scan_table, data->lambda[0], data->mpeg_quant_matrices, data->quant_sq, data->rel_var8[0], data->metric); /* we took 2 bits into account before */ rd += BITS_MULT * (d_mv_bits(x, y, data->predMV, data->iFcode, data->qpel^data->qpel_precision) - 2); if (rd < data->iMinSAD[0]) { *data->cbp = cbp; data->iMinSAD[0] = rd; current[0].x = x; current[0].y = y; data->dir = Direction; } } static int findRD_inter(SearchData * const Data, const int x, const int y, const MBParam * const pParam, const uint32_t MotionFlags) { int i; int32_t bsad[5]; if (Data->qpel) { for(i = 0; i < 5; i++) { Data->currentMV[i].x = Data->currentQMV[i].x/2; Data->currentMV[i].y = Data->currentQMV[i].y/2; } Data->qpel_precision = 1; CheckCandidateRD16(Data->currentQMV[0].x, Data->currentQMV[0].y, Data, 255); if (MotionFlags & (XVID_ME_HALFPELREFINE16_RD | XVID_ME_EXTSEARCH_RD)) { /* we have to prepare for halfpixel-precision search */ for(i = 0; i < 5; i++) bsad[i] = Data->iMinSAD[i]; get_range(&Data->min_dx, 
&Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode - Data->qpel, 1); Data->qpel_precision = 0; if (Data->currentQMV->x & 1 || Data->currentQMV->y & 1) CheckCandidateRD16(Data->currentMV[0].x, Data->currentMV[0].y, Data, 255); } } else { /* not qpel */ CheckCandidateRD16(Data->currentMV[0].x, Data->currentMV[0].y, Data, 255); } if (MotionFlags&XVID_ME_EXTSEARCH_RD) xvid_me_SquareSearch(Data->currentMV->x, Data->currentMV->y, Data, 255, CheckCandidateRD16); if (MotionFlags&XVID_ME_HALFPELREFINE16_RD) xvid_me_SubpelRefine(Data->currentMV[0], Data, CheckCandidateRD16, 0); if (Data->qpel) { if (MotionFlags&(XVID_ME_EXTSEARCH_RD | XVID_ME_HALFPELREFINE16_RD)) { /* there was halfpel-precision search */ for(i = 0; i < 5; i++) if (bsad[i] > Data->iMinSAD[i]) { Data->currentQMV[i].x = 2 * Data->currentMV[i].x; /* we have found a better match */ Data->currentQMV[i].y = 2 * Data->currentMV[i].y; } /* preparing for qpel-precision search */ Data->qpel_precision = 1; get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode, 2); } if (MotionFlags & XVID_ME_QUARTERPELREFINE16_RD) { if (MotionFlags & XVID_ME_FASTREFINE16) FullRefine_Fast(Data, CheckCandidateRD16, 0); else xvid_me_SubpelRefine(Data->currentQMV[0], Data, CheckCandidateRD16, 0); } } if (MotionFlags&XVID_ME_CHECKPREDICTION_RD) { /* let's check vector equal to prediction */ VECTOR * v = Data->qpel ? 
Data->currentQMV : Data->currentMV; if (!MVequal(Data->predMV, *v)) CheckCandidateRD16(Data->predMV.x, Data->predMV.y, Data, 255); } return Data->iMinSAD[0]; } static int findRD_inter4v(SearchData * const Data, MACROBLOCK * const pMB, const MACROBLOCK * const pMBs, const int x, const int y, const MBParam * const pParam, const uint32_t MotionFlags, const VECTOR * const backup, const int bound) { unsigned int cbp = 0, t = 0, i; /* minimum number of bits INTER4V can take is 2 (cbpy) + 3 (mcbpc) + 4*2 (vectors)*/ int bits = (2+3+4*2)*BITS_MULT; SearchData Data2, *Data8 = &Data2; int sumx = 0, sumy = 0; int16_t *in = Data->dctSpace, *coeff = Data->dctSpace + 64; uint8_t * ptr; memcpy(Data8, Data, sizeof(SearchData)); for (i = 0; i < 4; i++) { /* for all luma blocks */ *Data8->iMinSAD = *(Data->iMinSAD + i + 1); *Data8->currentMV = *(Data->currentMV + i + 1); *Data8->currentQMV = *(Data->currentQMV + i + 1); Data8->Cur = Data->Cur + 8*((i&1) + (i>>1)*Data->iEdgedWidth); Data8->RefP[0] = Data->RefP[0] + 8*((i&1) + (i>>1)*Data->iEdgedWidth); Data8->RefP[2] = Data->RefP[2] + 8*((i&1) + (i>>1)*Data->iEdgedWidth); Data8->RefP[1] = Data->RefP[1] + 8*((i&1) + (i>>1)*Data->iEdgedWidth); Data8->RefP[3] = Data->RefP[3] + 8*((i&1) + (i>>1)*Data->iEdgedWidth); *Data8->cbp = (Data->cbp[1] & (1<<(5-i))) ? 
1:0; /* copy corresponding cbp bit */ Data8->lambda[0] = Data->lambda[i]; Data8->rel_var8[0] = Data->rel_var8[i]; if(Data->qpel) { Data8->predMV = get_qpmv2(pMBs, pParam->mb_width, bound, x, y, i); if (i != 0) t = d_mv_bits( Data8->currentQMV->x, Data8->currentQMV->y, Data8->predMV, Data8->iFcode, 0) - 2; } else { Data8->predMV = get_pmv2(pMBs, pParam->mb_width, bound, x, y, i); if (i != 0) t = d_mv_bits( Data8->currentMV->x, Data8->currentMV->y, Data8->predMV, Data8->iFcode, 0) - 2; } get_range(&Data8->min_dx, &Data8->max_dx, &Data8->min_dy, &Data8->max_dy, 2*x + (i&1), 2*y + (i>>1), 3, pParam->width, pParam->height, Data8->iFcode, Data8->qpel+1); *Data8->iMinSAD += BITS_MULT * t; Data8->qpel_precision = Data8->qpel; /* checking the vector which has been found by SAD-based 8x8 search (if it's different than the one found so far) */ { VECTOR *v = Data8->qpel ? Data8->currentQMV : Data8->currentMV; if (!MVequal (*v, backup[i+1]) ) CheckCandidateRD8(backup[i+1].x, backup[i+1].y, Data8, 255); } if (Data8->qpel) { int bsad = Data8->iMinSAD[0]; int bx = Data8->currentQMV->x; int by = Data8->currentQMV->y; Data8->currentMV->x = Data8->currentQMV->x/2; Data8->currentMV->y = Data8->currentQMV->y/2; if (MotionFlags&XVID_ME_HALFPELREFINE8_RD || (MotionFlags&XVID_ME_EXTSEARCH8 && MotionFlags&XVID_ME_EXTSEARCH_RD)) { /* halfpixel motion search follows */ Data8->qpel_precision = 0; get_range(&Data8->min_dx, &Data8->max_dx, &Data8->min_dy, &Data8->max_dy, 2*x + (i&1), 2*y + (i>>1), 3, pParam->width, pParam->height, Data8->iFcode - 1, 1); if (Data8->currentQMV->x & 1 || Data8->currentQMV->y & 1) CheckCandidateRD8(Data8->currentMV->x, Data8->currentMV->y, Data8, 255); if (MotionFlags & XVID_ME_EXTSEARCH8 && MotionFlags & XVID_ME_EXTSEARCH_RD) xvid_me_SquareSearch(Data8->currentMV->x, Data8->currentMV->y, Data8, 255, CheckCandidateRD8); if (MotionFlags & XVID_ME_HALFPELREFINE8_RD) xvid_me_SubpelRefine(Data->currentMV[0], Data8, CheckCandidateRD8, 0); if(bsad > *Data8->iMinSAD) { /* 
we have found a better match */ bx = Data8->currentQMV->x = 2*Data8->currentMV->x; by = Data8->currentQMV->y = 2*Data8->currentMV->y; bsad = Data8->iMinSAD[0]; } Data8->qpel_precision = 1; get_range(&Data8->min_dx, &Data8->max_dx, &Data8->min_dy, &Data8->max_dy, 2*x + (i&1), 2*y + (i>>1), 3, pParam->width, pParam->height, Data8->iFcode, 2); } if (MotionFlags & XVID_ME_QUARTERPELREFINE8_RD) { if (MotionFlags & XVID_ME_FASTREFINE8) FullRefine_Fast(Data8, CheckCandidateRD8, 0); else xvid_me_SubpelRefine(Data->currentQMV[0], Data8, CheckCandidateRD8, 0); } if (bsad <= Data->iMinSAD[0]) { /* we have not found a better match */ Data8->iMinSAD[0] = bsad; Data8->currentQMV->x = bx; Data8->currentQMV->y = by; } } else { /* not qpel */ if (MotionFlags & XVID_ME_EXTSEARCH8 && MotionFlags & XVID_ME_EXTSEARCH_RD) /* extsearch */ xvid_me_SquareSearch(Data8->currentMV->x, Data8->currentMV->y, Data8, 255, CheckCandidateRD8); if (MotionFlags & XVID_ME_HALFPELREFINE8_RD) xvid_me_SubpelRefine(Data->currentMV[0], Data8, CheckCandidateRD8, 0); /* halfpel refinement */ } /* checking vector equal to prediction */ if (i != 0 && MotionFlags & XVID_ME_CHECKPREDICTION_RD) { const VECTOR * v = Data->qpel ? 
Data8->currentQMV : Data8->currentMV; if (!MVequal(*v, Data8->predMV)) CheckCandidateRD8(Data8->predMV.x, Data8->predMV.y, Data8, 255); } bits += *Data8->iMinSAD; if (bits >= Data->iMinSAD[0]) return bits; /* no chances for INTER4V */ /* MB structures for INTER4V mode; we have to set them here, we don't have predictor anywhere else */ if(Data->qpel) { pMB->pmvs[i].x = Data8->currentQMV->x - Data8->predMV.x; pMB->pmvs[i].y = Data8->currentQMV->y - Data8->predMV.y; pMB->qmvs[i] = *Data8->currentQMV; sumx += Data8->currentQMV->x/2; sumy += Data8->currentQMV->y/2; } else { pMB->pmvs[i].x = Data8->currentMV->x - Data8->predMV.x; pMB->pmvs[i].y = Data8->currentMV->y - Data8->predMV.y; sumx += Data8->currentMV->x; sumy += Data8->currentMV->y; } pMB->mvs[i] = *Data8->currentMV; pMB->sad8[i] = 4 * *Data8->iMinSAD; if (Data8->cbp[0]) cbp |= 1 << (5 - i); } /* end - for all luma blocks */ bits += BITS_MULT * (xvid_cbpy_tab[15-(cbp>>2)].len - 2); /* 2 were added before */ /* let's check chroma */ sumx = (sumx >> 3) + roundtab_76[sumx & 0xf]; sumy = (sumy >> 3) + roundtab_76[sumy & 0xf]; /* chroma U */ ptr = interpolate8x8_switch2(Data->RefQ + 64, Data->RefP[4], 0, 0, sumx, sumy, Data->iEdgedWidth/2, Data->rounding); transfer_8to16subro(in, Data->CurU, ptr, Data->iEdgedWidth/2); bits += Block_CalcBits(coeff, in, Data->dctSpace + 128, Data->iQuant, Data->quant_type, &cbp, 4, Data->scan_table, Data->lambda[4], Data->mpeg_quant_matrices, Data->quant_sq, Data->rel_var8[4], Data->metric); if (bits >= *Data->iMinSAD) return bits; /* chroma V */ ptr = interpolate8x8_switch2(Data->RefQ + 64, Data->RefP[5], 0, 0, sumx, sumy, Data->iEdgedWidth/2, Data->rounding); transfer_8to16subro(in, Data->CurV, ptr, Data->iEdgedWidth/2); bits += Block_CalcBits(coeff, in, Data->dctSpace + 128, Data->iQuant, Data->quant_type, &cbp, 5, Data->scan_table, Data->lambda[5], Data->mpeg_quant_matrices, Data->quant_sq, Data->rel_var8[5], Data->metric); bits += BITS_MULT*(mcbpc_inter_tab[(MODE_INTER4V & 7) | 
((cbp & 3) << 3)].len - 3); /* 3 were added before */ *Data->cbp = cbp; return bits; } static int findRD_intra(SearchData * const Data, MACROBLOCK * pMB, const int x, const int y, const int mb_width, const int bound) { unsigned int cbp[2] = {0, 0}, bits[2], i; /* minimum number of bits that WILL be coded in intra - mcbpc 5, cby 2 acdc flag - 1 and DC coeffs - 4*3+2*2 */ int bits1 = BITS_MULT*(5+2+1+4*3+2*2), bits2 = BITS_MULT*(5+2+1+4*3+2*2); unsigned int distortion = 0; int16_t *in = Data->dctSpace, * coeff = Data->dctSpace + 64, * dqcoeff = Data->dctSpace + 128; const uint32_t iQuant = Data->iQuant; int16_t predictors[6][8]; for(i = 0; i < 4; i++) { int s = 8*((i&1) + (i>>1)*Data->iEdgedWidth); transfer_8to16copy(in, Data->Cur + s, Data->iEdgedWidth); distortion = Block_CalcBitsIntra(pMB, x, y, mb_width, i, in, coeff, dqcoeff, predictors[i], iQuant, Data->quant_type, bits, cbp, Data->lambda[i], Data->mpeg_quant_matrices, Data->quant_sq, Data->metric, bound); bits1 += distortion + BITS_MULT * bits[0]; bits2 += distortion + BITS_MULT * bits[1]; if (bits1 >= Data->iMinSAD[0] && bits2 >= Data->iMinSAD[0]) return bits1; } bits1 += BITS_MULT * (xvid_cbpy_tab[cbp[0]>>2].len - 2); /* two bits were added before */ bits2 += BITS_MULT * (xvid_cbpy_tab[cbp[1]>>2].len - 2); /*chroma U */ transfer_8to16copy(in, Data->CurU, Data->iEdgedWidth/2); distortion = Block_CalcBitsIntra(pMB, x, y, mb_width, 4, in, coeff, dqcoeff, predictors[4], iQuant, Data->quant_type, bits, cbp, Data->lambda[4], Data->mpeg_quant_matrices, Data->quant_sq, Data->metric, bound); bits1 += distortion + BITS_MULT * bits[0]; bits2 += distortion + BITS_MULT * bits[1]; if (bits1 >= Data->iMinSAD[0] && bits2 >= Data->iMinSAD[0]) return bits1; /* chroma V */ transfer_8to16copy(in, Data->CurV, Data->iEdgedWidth/2); distortion = Block_CalcBitsIntra(pMB, x, y, mb_width, 5, in, coeff, dqcoeff, predictors[5], iQuant, Data->quant_type, bits, cbp, Data->lambda[5], Data->mpeg_quant_matrices, Data->quant_sq, 
Data->metric, bound); bits1 += distortion + BITS_MULT * bits[0]; bits2 += distortion + BITS_MULT * bits[1]; bits1 += BITS_MULT * (mcbpc_inter_tab[(MODE_INTRA & 7) | ((cbp[0] & 3) << 3)].len - 5); /* 5 bits were added before */ bits2 += BITS_MULT * (mcbpc_inter_tab[(MODE_INTRA & 7) | ((cbp[1] & 3) << 3)].len - 5); *Data->cbp = bits1 <= bits2 ? cbp[0] : cbp[1]; return MIN(bits1, bits2); } static int findRD_gmc(SearchData * const Data, const IMAGE * const vGMC, const int x, const int y) { /* minimum number of bits - 1 (mcbpc) + 2 (cbpy) + 1 (mcsel) */ int bits = BITS_MULT * (1+2+1); unsigned int cbp = 0, i; int16_t *in = Data->dctSpace, * coeff = Data->dctSpace + 64; for(i = 0; i < 4; i++) { int s = 8*((i&1) + (i>>1)*Data->iEdgedWidth); transfer_8to16subro(in, Data->Cur + s, vGMC->y + s + 16*(x+y*Data->iEdgedWidth), Data->iEdgedWidth); bits += Block_CalcBits(coeff, in, Data->dctSpace + 128, Data->iQuant, Data->quant_type, &cbp, i, Data->scan_table, Data->lambda[i], Data->mpeg_quant_matrices, Data->quant_sq, Data->rel_var8[i], Data->metric); if (bits >= Data->iMinSAD[0]) return bits; } bits += BITS_MULT * (xvid_cbpy_tab[15-(cbp>>2)].len - 2); /*chroma U */ transfer_8to16subro(in, Data->CurU, vGMC->u + 8*(x+y*(Data->iEdgedWidth/2)), Data->iEdgedWidth/2); bits += Block_CalcBits(coeff, in, Data->dctSpace + 128, Data->iQuant, Data->quant_type, &cbp, 4, Data->scan_table, Data->lambda[4], Data->mpeg_quant_matrices, Data->quant_sq, Data->rel_var8[4], Data->metric); if (bits >= Data->iMinSAD[0]) return bits; /* chroma V */ transfer_8to16subro(in, Data->CurV , vGMC->v + 8*(x+y*(Data->iEdgedWidth/2)), Data->iEdgedWidth/2); bits += Block_CalcBits(coeff, in, Data->dctSpace + 128, Data->iQuant, Data->quant_type, &cbp, 5, Data->scan_table, Data->lambda[5], Data->mpeg_quant_matrices, Data->quant_sq, Data->rel_var8[5], Data->metric); bits += BITS_MULT * (mcbpc_inter_tab[(MODE_INTER & 7) | ((cbp & 3) << 3)].len - 1); *Data->cbp = cbp; return bits; } void 
xvid_me_ModeDecision_RD(SearchData * const Data, MACROBLOCK * const pMB, const MACROBLOCK * const pMBs, const int x, const int y, const MBParam * const pParam, const uint32_t MotionFlags, const uint32_t VopFlags, const uint32_t VolFlags, const IMAGE * const pCurrent, const IMAGE * const pRef, const IMAGE * const vGMC, const int coding_type, const int bound) { int mode = MODE_INTER; int mcsel = 0; int inter4v = (VopFlags & XVID_VOP_INTER4V) && (pMB->dquant == 0); const uint32_t iQuant = pMB->quant; int min_rd, intra_rd, i, cbp; VECTOR backup[5], *v; Data->iQuant = iQuant; Data->quant_sq = iQuant*iQuant; Data->scan_table = VopFlags & XVID_VOP_ALTERNATESCAN ? scan_tables[2] : scan_tables[0]; Data->metric = !!(VopFlags & XVID_VOP_RD_PSNRHVSM); pMB->mcsel = 0; v = Data->qpel ? Data->currentQMV : Data->currentMV; for (i = 0; i < 5; i++) { Data->iMinSAD[i] = 256*4096; backup[i] = v[i]; } for (i = 0; i < 6; i++) { Data->lambda[i] = (LAMBDA*pMB->lambda[i])>>LAMBDA_EXP; Data->rel_var8[i] = pMB->rel_var8[i]; } min_rd = findRD_inter(Data, x, y, pParam, MotionFlags); cbp = *Data->cbp; if (coding_type == S_VOP) { int gmc_rd; *Data->iMinSAD = min_rd += BITS_MULT*1; /* mcsel */ gmc_rd = findRD_gmc(Data, vGMC, x, y); if (gmc_rd < min_rd) { mcsel = 1; *Data->iMinSAD = min_rd = gmc_rd; mode = MODE_INTER; cbp = *Data->cbp; } } if (inter4v) { int v4_rd; v4_rd = findRD_inter4v(Data, pMB, pMBs, x, y, pParam, MotionFlags, backup, bound); if (v4_rd < min_rd) { Data->iMinSAD[0] = min_rd = v4_rd; mode = MODE_INTER4V; cbp = *Data->cbp; } } /* there is no way for INTRA to take less than 24 bits - go to findRD_intra() for calculations */ if (min_rd > 24*BITS_MULT) { intra_rd = findRD_intra(Data, pMB, x, y, pParam->mb_width, bound); if (intra_rd < min_rd) { *Data->iMinSAD = min_rd = intra_rd; mode = MODE_INTRA; cbp = *Data->cbp; } } pMB->sad16 = pMB->sad8[0] = pMB->sad8[1] = pMB->sad8[2] = pMB->sad8[3] = 0; pMB->cbp = cbp; if (mode == MODE_INTER && mcsel == 0) { pMB->mvs[0] = pMB->mvs[1] = 
pMB->mvs[2] = pMB->mvs[3] = Data->currentMV[0]; if(Data->qpel) { pMB->qmvs[0] = pMB->qmvs[1] = pMB->qmvs[2] = pMB->qmvs[3] = Data->currentQMV[0]; pMB->pmvs[0].x = Data->currentQMV[0].x - Data->predMV.x; pMB->pmvs[0].y = Data->currentQMV[0].y - Data->predMV.y; } else { pMB->pmvs[0].x = Data->currentMV[0].x - Data->predMV.x; pMB->pmvs[0].y = Data->currentMV[0].y - Data->predMV.y; } } else if (mode == MODE_INTER ) { /* but mcsel == 1 */ pMB->mcsel = 1; if (Data->qpel) { pMB->qmvs[0] = pMB->qmvs[1] = pMB->qmvs[2] = pMB->qmvs[3] = pMB->amv; pMB->mvs[0].x = pMB->mvs[1].x = pMB->mvs[2].x = pMB->mvs[3].x = pMB->amv.x/2; pMB->mvs[0].y = pMB->mvs[1].y = pMB->mvs[2].y = pMB->mvs[3].y = pMB->amv.y/2; } else pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = pMB->amv; } else if (mode == MODE_INTER4V) ; /* anything here? */ else /* INTRA, NOT_CODED */ ZeroMacroblockP(pMB, 0); pMB->mode = mode; } void xvid_me_ModeDecision_Fast(SearchData * const Data, MACROBLOCK * const pMB, const MACROBLOCK * const pMBs, const int x, const int y, const MBParam * const pParam, const uint32_t MotionFlags, const uint32_t VopFlags, const uint32_t VolFlags, const IMAGE * const pCurrent, const IMAGE * const pRef, const IMAGE * const vGMC, const int coding_type, const int bound) { int mode = MODE_INTER; int mcsel = 0; int inter4v = (VopFlags & XVID_VOP_INTER4V) && (pMB->dquant == 0); const uint32_t iQuant = pMB->quant; const int skip_possible = (coding_type == P_VOP) && (pMB->dquant == 0); int sad; int min_rd = -1, intra_rd, i, cbp = 63; VECTOR backup[5], *v; int sad_backup[5]; int InterBias = MV16_INTER_BIAS; int thresh = 0; int top = 0, top_right = 0, left = 0; Data->scan_table = VopFlags & XVID_VOP_ALTERNATESCAN ? 
scan_tables[2] : scan_tables[0]; Data->metric = !!(VopFlags & XVID_VOP_RD_PSNRHVSM); pMB->mcsel = 0; Data->iQuant = iQuant; Data->quant_sq = iQuant*iQuant; for (i = 0; i < 6; i++) { Data->lambda[i] = (LAMBDA*pMB->lambda[i])>>LAMBDA_EXP; Data->rel_var8[i] = pMB->rel_var8[i]; } /* INTER <-> INTER4V decision */ if ((Data->iMinSAD[0] + 75 < Data->iMinSAD[1] + Data->iMinSAD[2] + Data->iMinSAD[3] + Data->iMinSAD[4])) { /* normal, fast, SAD-based mode decision */ if (inter4v == 0 || Data->iMinSAD[0] < Data->iMinSAD[1] + Data->iMinSAD[2] + Data->iMinSAD[3] + Data->iMinSAD[4] + IMV16X16 * (int32_t)iQuant) { mode = MODE_INTER; sad = Data->iMinSAD[0]; } else { mode = MODE_INTER4V; sad = Data->iMinSAD[1] + Data->iMinSAD[2] + Data->iMinSAD[3] + Data->iMinSAD[4] + IMV16X16 * (int32_t)iQuant; Data->iMinSAD[0] = sad; } /* final skip decision, a.k.a. "the vector you found, really that good?" */ if (skip_possible && (pMB->sad16 < (int)iQuant * MAX_SAD00_FOR_SKIP)) if ( (100*sad)/(pMB->sad16+1) > FINAL_SKIP_THRESH) if (Data->chroma || xvid_me_SkipDecisionP(pCurrent, pRef, x, y, Data->iEdgedWidth/2, iQuant)) { mode = MODE_NOT_CODED; sad = 0; /* Compiler warning */ goto early_out; } /* mcsel */ if (coding_type == S_VOP) { int32_t iSAD = sad16(Data->Cur, vGMC->y + 16*y*Data->iEdgedWidth + 16*x, Data->iEdgedWidth, 65536); if (Data->chroma) { iSAD += sad8(Data->CurU, vGMC->u + 8*y*(Data->iEdgedWidth/2) + 8*x, Data->iEdgedWidth/2); iSAD += sad8(Data->CurV, vGMC->v + 8*y*(Data->iEdgedWidth/2) + 8*x, Data->iEdgedWidth/2); } if (iSAD <= sad) { /* mode decision GMC */ mode = MODE_INTER; mcsel = 1; sad = iSAD; } } } else { /* Rate-Distortion INTER<->INTER4V */ Data->iQuant = iQuant; v = Data->qpel ? Data->currentQMV : Data->currentMV; /* final skip decision, a.k.a. "the vector you found, really that good?" 
*/ if (skip_possible && (pMB->sad16 < (int)iQuant * MAX_SAD00_FOR_SKIP)) if ( (100*Data->iMinSAD[0])/(pMB->sad16+1) > FINAL_SKIP_THRESH) if (Data->chroma || xvid_me_SkipDecisionP(pCurrent, pRef, x, y, Data->iEdgedWidth/2, iQuant)) { mode = MODE_NOT_CODED; sad = 0; /* Compiler warning */ goto early_out; } for (i = 0; i < 5; i++) { sad_backup[i] = Data->iMinSAD[i]; Data->iMinSAD[i] = 256*4096; backup[i] = v[i]; } min_rd = findRD_inter(Data, x, y, pParam, MotionFlags); cbp = *Data->cbp; sad = sad_backup[0]; if (coding_type == S_VOP) { int gmc_rd; *Data->iMinSAD = min_rd += BITS_MULT*1; /* mcsel */ gmc_rd = findRD_gmc(Data, vGMC, x, y); if (gmc_rd < min_rd) { mcsel = 1; *Data->iMinSAD = min_rd = gmc_rd; mode = MODE_INTER; cbp = *Data->cbp; sad = sad16(Data->Cur, vGMC->y + 16*y*Data->iEdgedWidth + 16*x, Data->iEdgedWidth, 65536); if (Data->chroma) { sad += sad8(Data->CurU, vGMC->u + 8*y*(Data->iEdgedWidth/2) + 8*x, Data->iEdgedWidth/2); sad += sad8(Data->CurV, vGMC->v + 8*y*(Data->iEdgedWidth/2) + 8*x, Data->iEdgedWidth/2); } } } if (inter4v) { int v4_rd; v4_rd = findRD_inter4v(Data, pMB, pMBs, x, y, pParam, MotionFlags, backup, bound); if (v4_rd < min_rd) { Data->iMinSAD[0] = min_rd = v4_rd; mode = MODE_INTER4V; cbp = *Data->cbp; sad = sad_backup[1] + sad_backup[2] + sad_backup[3] + sad_backup[4] + IMV16X16 * (int32_t)iQuant; } } } left = top = top_right = -1; thresh = 0; if((x > 0) && (y > 0) && (x < (int32_t) pParam->mb_width)) { left = (&pMBs[(x-1) + y * pParam->mb_width])->sad16; /* left */ top = (&pMBs[x + (y-1) * pParam->mb_width])->sad16; /* top */ top_right = (&pMBs[(x+1) + (y-1) * pParam->mb_width])->sad16; /* top right */ if(((&pMBs[(x-1) + y * pParam->mb_width])->mode != MODE_INTRA) && ((&pMBs[x + (y-1) * pParam->mb_width])->mode != MODE_INTRA) && ((&pMBs[(x+1) + (y-1) * pParam->mb_width])->mode != MODE_INTRA)) { thresh = MAX(MAX(top, left), top_right); } else thresh = MIN(MIN(top, left), top_right); } /* INTRA <-> INTER decision */ if (sad < thresh) { /* 
normal, fast, SAD-based mode decision */ /* intra decision */ if (iQuant > 8) InterBias += 100 * (iQuant - 8); /* to make high quants work */ if (y != 0) if ((pMB - pParam->mb_width)->mode == MODE_INTRA ) InterBias -= 80; if (x != 0) if ((pMB - 1)->mode == MODE_INTRA ) InterBias -= 80; if (Data->chroma) InterBias += 50; /* dev8(chroma) ??? <-- yes, we need dev8 (no big difference though) */ if (InterBias < sad) { int32_t deviation = dev16(Data->Cur, Data->iEdgedWidth); if (deviation < (sad - InterBias)) mode = MODE_INTRA; } pMB->cbp = 63; } else { /* Rate-Distortion INTRA<->INTER */ if(min_rd < 0) { Data->iQuant = iQuant; v = Data->qpel ? Data->currentQMV : Data->currentMV; for (i = 0; i < 5; i++) { Data->iMinSAD[i] = 256*4096; backup[i] = v[i]; } if(mode == MODE_INTER) { min_rd = findRD_inter(Data, x, y, pParam, MotionFlags); cbp = *Data->cbp; if (coding_type == S_VOP) { int gmc_rd; *Data->iMinSAD = min_rd += BITS_MULT*1; /* mcsel */ gmc_rd = findRD_gmc(Data, vGMC, x, y); if (gmc_rd < min_rd) { mcsel = 1; *Data->iMinSAD = min_rd = gmc_rd; mode = MODE_INTER; cbp = *Data->cbp; } } } if(mode == MODE_INTER4V) { int v4_rd; v4_rd = findRD_inter4v(Data, pMB, pMBs, x, y, pParam, MotionFlags, backup, bound); if (v4_rd < min_rd) { Data->iMinSAD[0] = min_rd = v4_rd; mode = MODE_INTER4V; cbp = *Data->cbp; } } } intra_rd = findRD_intra(Data, pMB, x, y, pParam->mb_width, bound); if (intra_rd < min_rd) { *Data->iMinSAD = min_rd = intra_rd; mode = MODE_INTRA; } pMB->cbp = cbp; } early_out: pMB->sad16 = pMB->sad8[0] = pMB->sad8[1] = pMB->sad8[2] = pMB->sad8[3] = sad; if (mode == MODE_INTER && mcsel == 0) { pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = Data->currentMV[0]; if(Data->qpel) { pMB->qmvs[0] = pMB->qmvs[1] = pMB->qmvs[2] = pMB->qmvs[3] = Data->currentQMV[0]; pMB->pmvs[0].x = Data->currentQMV[0].x - Data->predMV.x; pMB->pmvs[0].y = Data->currentQMV[0].y - Data->predMV.y; } else { pMB->pmvs[0].x = Data->currentMV[0].x - Data->predMV.x; pMB->pmvs[0].y = 
Data->currentMV[0].y - Data->predMV.y; } } else if (mode == MODE_INTER ) { /* but mcsel == 1 */ pMB->mcsel = 1; if (Data->qpel) { pMB->qmvs[0] = pMB->qmvs[1] = pMB->qmvs[2] = pMB->qmvs[3] = pMB->amv; pMB->mvs[0].x = pMB->mvs[1].x = pMB->mvs[2].x = pMB->mvs[3].x = pMB->amv.x/2; pMB->mvs[0].y = pMB->mvs[1].y = pMB->mvs[2].y = pMB->mvs[3].y = pMB->amv.y/2; } else pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = pMB->amv; } else if (mode == MODE_INTER4V) ; /* anything here? */ else /* INTRA, NOT_CODED */ ZeroMacroblockP(pMB, 0); pMB->mode = mode; } xvidcore/src/motion/gmc.c0000664000076500007650000005242111564705453016507 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - GMC interpolation module - * * Copyright(C) 2002-2003 Pascal Massimino * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: gmc.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "../portab.h" #include "../global.h" #include "../encoder.h" #include "gmc.h" #include "../utils/emms.h" #include /* initialized by init_GMC(), for 3points */ static void (*Predict_16x16_func)(const NEW_GMC_DATA * const This, uint8_t *dst, const uint8_t *src, int dststride, int srcstride, int x, int y, int rounding) = 0; static void (*Predict_8x8_func)(const NEW_GMC_DATA * const This, uint8_t *uDst, const uint8_t *uSrc, uint8_t *vDst, const uint8_t *vSrc, int dststride, int srcstride, int x, int y, int rounding) = 0; /****************************************************************************/ /* this is borrowed from bitstream.c until we find a common solution */ static uint32_t __inline log2bin(uint32_t value) { /* Changed by Chenm001 */ #if !defined(_MSC_VER) || defined(ARCH_IS_X86_64) int n = 0; while (value) { value >>= 1; n++; } return n; #else __asm { bsr eax, value inc eax } #endif } /* 16*sizeof(int) -> 1 or 2 cachelines */ /* table lookup might be faster! (still to be benchmarked) */ /* static int log2bin_table[16] = { 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4}; */ /* 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 */ #define RDIV(a,b) (((a)>0 ? (a) + ((b)>>1) : (a) - ((b)>>1))/(b)) #define RSHIFT(a,b) ( (a)>0 ? ((a) + (1<<((b)-1)))>>(b) : ((a) + (1<<((b)-1))-1)>>(b)) #define MLT(i) (((16-(i))<<16) + (i)) static const uint32_t MTab[16] = { MLT( 0), MLT( 1), MLT( 2), MLT( 3), MLT( 4), MLT( 5), MLT( 6), MLT( 7), MLT( 8), MLT( 9), MLT(10), MLT(11), MLT(12), MLT(13), MLT(14), MLT(15) }; #undef MLT /* ************************************************************ * Pts = 2 or 3 * * Warning! 
*src is the global frame pointer (that is: address * of pixel 0,0), not the macroblock one. * Conversely, *dst is the macroblock top-left address. */ static void Predict_16x16_C(const NEW_GMC_DATA * const This, uint8_t *dst, const uint8_t *src, int dststride, int srcstride, int x, int y, int rounding) { const int W = This->sW; const int H = This->sH; const int rho = 3 - This->accuracy; const int Rounder = ( (1<<7) - (rounding<<(2*rho)) ) << 16; const int dUx = This->dU[0]; const int dVx = This->dV[0]; const int dUy = This->dU[1]; const int dVy = This->dV[1]; int Uo = This->Uo + 16*(dUy*y + dUx*x); int Vo = This->Vo + 16*(dVy*y + dVx*x); int i, j; dst += 16; for (j=16; j>0; --j) { int U = Uo, V = Vo; Uo += dUy; Vo += dVy; for (i=-16; i<0; ++i) { unsigned int f0, f1, ri = 16, rj = 16; int Offset; int u = ( U >> 16 ) << rho; int v = ( V >> 16 ) << rho; U += dUx; V += dVx; if (u > 0 && u <= W) { ri = MTab[u&15]; Offset = u>>4; } else { if (u > W) Offset = W>>4; else Offset = 0; ri = MTab[0]; } if (v > 0 && v <= H) { rj = MTab[v&15]; Offset += (v>>4)*srcstride; } else { if (v > H) Offset += (H>>4)*srcstride; rj = MTab[0]; } f0 = src[Offset + 0]; f0 |= src[Offset + 1] << 16; f1 = src[Offset + srcstride + 0]; f1 |= src[Offset + srcstride + 1] << 16; f0 = (ri*f0)>>16; f1 = (ri*f1) & 0x0fff0000; f0 |= f1; f0 = (rj*f0 + Rounder) >> 24; dst[i] = (uint8_t)f0; } dst += dststride; } } static void Predict_8x8_C(const NEW_GMC_DATA * const This, uint8_t *uDst, const uint8_t *uSrc, uint8_t *vDst, const uint8_t *vSrc, int dststride, int srcstride, int x, int y, int rounding) { const int W = This->sW >> 1; const int H = This->sH >> 1; const int rho = 3-This->accuracy; const int32_t Rounder = ( 128 - (rounding<<(2*rho)) ) << 16; const int32_t dUx = This->dU[0]; const int32_t dVx = This->dV[0]; const int32_t dUy = This->dU[1]; const int32_t dVy = This->dV[1]; int32_t Uo = This->Uco + 8*(dUy*y + dUx*x); int32_t Vo = This->Vco + 8*(dVy*y + dVx*x); int i, j; uDst += 8; vDst += 8; for (j=8; 
j>0; --j) { int32_t U = Uo, V = Vo; Uo += dUy; Vo += dVy; for (i=-8; i<0; ++i) { int Offset; uint32_t f0, f1, ri, rj; int32_t u, v; u = ( U >> 16 ) << rho; v = ( V >> 16 ) << rho; U += dUx; V += dVx; if (u > 0 && u <= W) { ri = MTab[u&15]; Offset = u>>4; } else { if (u>W) Offset = W>>4; else Offset = 0; ri = MTab[0]; } if (v > 0 && v <= H) { rj = MTab[v&15]; Offset += (v>>4)*srcstride; } else { if (v>H) Offset += (H>>4)*srcstride; rj = MTab[0]; } f0 = uSrc[Offset + 0]; f0 |= uSrc[Offset + 1] << 16; f1 = uSrc[Offset + srcstride + 0]; f1 |= uSrc[Offset + srcstride + 1] << 16; f0 = (ri*f0)>>16; f1 = (ri*f1) & 0x0fff0000; f0 |= f1; f0 = (rj*f0 + Rounder) >> 24; uDst[i] = (uint8_t)f0; f0 = vSrc[Offset + 0]; f0 |= vSrc[Offset + 1] << 16; f1 = vSrc[Offset + srcstride + 0]; f1 |= vSrc[Offset + srcstride + 1] << 16; f0 = (ri*f0)>>16; f1 = (ri*f1) & 0x0fff0000; f0 |= f1; f0 = (rj*f0 + Rounder) >> 24; vDst[i] = (uint8_t)f0; } uDst += dststride; vDst += dststride; } } static void get_average_mv_C(const NEW_GMC_DATA * const Dsp, VECTOR * const mv, int x, int y, int qpel) { int i, j; int vx = 0, vy = 0; int32_t uo = Dsp->Uo + 16*(Dsp->dU[1]*y + Dsp->dU[0]*x); int32_t vo = Dsp->Vo + 16*(Dsp->dV[1]*y + Dsp->dV[0]*x); for (j=16; j>0; --j) { int32_t U, V; U = uo; uo += Dsp->dU[1]; V = vo; vo += Dsp->dV[1]; for (i=16; i>0; --i) { int32_t u,v; u = U >> 16; U += Dsp->dU[0]; vx += u; v = V >> 16; V += Dsp->dV[0]; vy += v; } } vx -= (256*x+120) << (5+Dsp->accuracy); /* 120 = 15*16/2 */ vy -= (256*y+120) << (5+Dsp->accuracy); mv->x = RSHIFT( vx, 8+Dsp->accuracy - qpel ); mv->y = RSHIFT( vy, 8+Dsp->accuracy - qpel ); } /* ************************************************************ * simplified version for 1 warp point */ static void Predict_1pt_16x16_C(const NEW_GMC_DATA * const This, uint8_t *Dst, const uint8_t *Src, int dststride, int srcstride, int x, int y, int rounding) { const int W = This->sW; const int H = This->sH; const int rho = 3-This->accuracy; const int32_t Rounder = ( 128 - 
(rounding<<(2*rho)) ) << 16; int32_t uo = This->Uo + (x<<8); /* ((16*x)<<4) */ int32_t vo = This->Vo + (y<<8); uint32_t ri = MTab[uo & 15]; uint32_t rj = MTab[vo & 15]; int i, j; int32_t Offset; if (vo>=(-16<<4) && vo<=H) Offset = (vo>>4)*srcstride; else { if (vo>H) Offset = ( H>>4)*srcstride; else Offset =-16*srcstride; rj = MTab[0]; } if (uo>=(-16<<4) && uo<=W) Offset += (uo>>4); else { if (uo>W) Offset += (W>>4); else Offset -= 16; ri = MTab[0]; } Dst += 16; for(j=16; j>0; --j, Offset+=srcstride-16) { for(i=-16; i<0; ++i, ++Offset) { uint32_t f0, f1; f0 = Src[ Offset +0 ]; f0 |= Src[ Offset +1 ] << 16; f1 = Src[ Offset+srcstride +0 ]; f1 |= Src[ Offset+srcstride +1 ] << 16; f0 = (ri*f0)>>16; f1 = (ri*f1) & 0x0fff0000; f0 |= f1; f0 = ( rj*f0 + Rounder ) >> 24; Dst[i] = (uint8_t)f0; } Dst += dststride; } } static void Predict_1pt_8x8_C(const NEW_GMC_DATA * const This, uint8_t *uDst, const uint8_t *uSrc, uint8_t *vDst, const uint8_t *vSrc, int dststride, int srcstride, int x, int y, int rounding) { const int W = This->sW >> 1; const int H = This->sH >> 1; const int rho = 3-This->accuracy; const int32_t Rounder = ( 128 - (rounding<<(2*rho)) ) << 16; int32_t uo = This->Uco + (x<<7); int32_t vo = This->Vco + (y<<7); uint32_t rri = MTab[uo & 15]; uint32_t rrj = MTab[vo & 15]; int i, j; int32_t Offset; if (vo>=(-8<<4) && vo<=H) Offset = (vo>>4)*srcstride; else { if (vo>H) Offset = ( H>>4)*srcstride; else Offset =-8*srcstride; rrj = MTab[0]; } if (uo>=(-8<<4) && uo<=W) Offset += (uo>>4); else { if (uo>W) Offset += ( W>>4); else Offset -= 8; rri = MTab[0]; } uDst += 8; vDst += 8; for(j=8; j>0; --j, Offset+=srcstride-8) { for(i=-8; i<0; ++i, Offset++) { uint32_t f0, f1; f0 = uSrc[ Offset + 0 ]; f0 |= uSrc[ Offset + 1 ] << 16; f1 = uSrc[ Offset + srcstride + 0 ]; f1 |= uSrc[ Offset + srcstride + 1 ] << 16; f0 = (rri*f0)>>16; f1 = (rri*f1) & 0x0fff0000; f0 |= f1; f0 = ( rrj*f0 + Rounder ) >> 24; uDst[i] = (uint8_t)f0; f0 = vSrc[ Offset + 0 ]; f0 |= vSrc[ Offset + 1 ] << 16; 
f1 = vSrc[ Offset + srcstride + 0 ]; f1 |= vSrc[ Offset + srcstride + 1 ] << 16; f0 = (rri*f0)>>16; f1 = (rri*f1) & 0x0fff0000; f0 |= f1; f0 = ( rrj*f0 + Rounder ) >> 24; vDst[i] = (uint8_t)f0; } uDst += dststride; vDst += dststride; } } static void get_average_mv_1pt_C(const NEW_GMC_DATA * const Dsp, VECTOR * const mv, int x, int y, int qpel) { mv->x = RSHIFT(Dsp->Uo<y = RSHIFT(Dsp->Vo<>4) + (v>>4)*srcstride; f0 = Src[0]; f0 |= Src[1] << 16; f1 = Src[srcstride +0]; f1 |= Src[srcstride +1] << 16; f0 = (ri*f0)>>16; f1 = (ri*f1) & 0x0fff0000; f0 |= f1; f0 = ( rj*f0 + Rounder ) >> 24; Dst[i] = (uint8_t)f0; } } ////////////////////////////////////////////////////////// static void Predict_16x16_mmx(const NEW_GMC_DATA * const This, uint8_t *dst, const uint8_t *src, int dststride, int srcstride, int x, int y, int rounding) { const int W = This->sW; const int H = This->sH; const int rho = 3 - This->accuracy; const int Rounder = ( 128 - (rounding<<(2*rho)) ) << 16; const uint32_t W2 = W<<(16-rho); const uint32_t H2 = H<<(16-rho); const int dUx = This->dU[0]; const int dVx = This->dV[0]; const int dUy = This->dU[1]; const int dVy = This->dV[1]; int Uo = This->Uo + 16*(dUy*y + dUx*x); int Vo = This->Vo + 16*(dVy*y + dVx*x); int i, j; DECLARE_ALIGNED_MATRIX(Offsets, 2,16, uint16_t, CACHE_LINE); for(j=16; j>0; --j) { int32_t U = Uo, V = Vo; Uo += dUy; Vo += dVy; if ( W2>(uint32_t)U && W2>(uint32_t)(U+15*dUx) && H2>(uint32_t)V && H2>(uint32_t)(V+15*dVx) ) { uint32_t UV1, UV2; for(i=0; i<16; ++i) { uint32_t u = ( U >> 16 ) << rho; uint32_t v = ( V >> 16 ) << rho; U += dUx; V += dVx; Offsets[ i] = u; Offsets[16+i] = v; } // batch 8 input pixels when linearity says it's ok UV1 = (Offsets[0] | (Offsets[16]<<16)) & 0xfff0fff0U; UV2 = (Offsets[7] | (Offsets[23]<<16)) & 0xfff0fff0U; if (UV1+7*16==UV2) GMC_Core_Lin_8(dst, Offsets, src + (Offsets[0]>>4) + (Offsets[16]>>4)*srcstride, srcstride, Rounder); else GMC_Core_Non_Lin_8(dst, Offsets, src, srcstride, Rounder); UV1 = (Offsets[ 8] | 
(Offsets[24]<<16)) & 0xfff0fff0U; UV2 = (Offsets[15] | (Offsets[31]<<16)) & 0xfff0fff0U; if (UV1+7*16==UV2) GMC_Core_Lin_8(dst+8, Offsets+8, src + (Offsets[8]>>4) + (Offsets[24]>>4)*srcstride, srcstride, Rounder); else GMC_Core_Non_Lin_8(dst+8, Offsets+8, src, srcstride, Rounder); } else { for(i=0; i<16; ++i) { int u = ( U >> 16 ) << rho; int v = ( V >> 16 ) << rho; U += dUx; V += dVx; Offsets[ i] = (u<0) ? 0 : (u>=W) ? W : u; Offsets[16+i] = (v<0) ? 0 : (v>=H) ? H : v; } // due to boundary clipping, we cannot infer the 8-pixels batchability // simply by using the linearity. Oh well, not a big deal... GMC_Core_Non_Lin_8(dst, Offsets, src, srcstride, Rounder); GMC_Core_Non_Lin_8(dst+8, Offsets+8, src, srcstride, Rounder); } dst += dststride; } } static void Predict_8x8_mmx(const NEW_GMC_DATA * const This, uint8_t *uDst, const uint8_t *uSrc, uint8_t *vDst, const uint8_t *vSrc, int dststride, int srcstride, int x, int y, int rounding) { const int W = This->sW >> 1; const int H = This->sH >> 1; const int rho = 3-This->accuracy; const int32_t Rounder = ( 128 - (rounding<<(2*rho)) ) << 16; const uint32_t W2 = W<<(16-rho); const uint32_t H2 = H<<(16-rho); const int dUx = This->dU[0]; const int dVx = This->dV[0]; const int dUy = This->dU[1]; const int dVy = This->dV[1]; int Uo = This->Uco + 8*(dUy*y + dUx*x); int Vo = This->Vco + 8*(dVy*y + dVx*x); DECLARE_ALIGNED_MATRIX(Offsets, 2,16, uint16_t, CACHE_LINE); int i, j; for(j=8; j>0; --j) { int32_t U = Uo, V = Vo; Uo += dUy; Vo += dVy; if ( W2>(uint32_t)U && W2>(uint32_t)(U+15*dUx) && H2>(uint32_t)V && H2>(uint32_t)(V+15*dVx) ) { uint32_t UV1, UV2; for(i=0; i<8; ++i) { int32_t u = ( U >> 16 ) << rho; int32_t v = ( V >> 16 ) << rho; U += dUx; V += dVx; Offsets[ i] = u; Offsets[16+i] = v; } // batch 8 input pixels when linearity says it's ok UV1 = (Offsets[ 0] | (Offsets[16]<<16)) & 0xfff0fff0U; UV2 = (Offsets[ 7] | (Offsets[23]<<16)) & 0xfff0fff0U; if (UV1+7*16==UV2) { const uint32_t Off = (Offsets[0]>>4) + 
(Offsets[16]>>4)*srcstride; GMC_Core_Lin_8(uDst, Offsets, uSrc+Off, srcstride, Rounder); GMC_Core_Lin_8(vDst, Offsets, vSrc+Off, srcstride, Rounder); } else { GMC_Core_Non_Lin_8(uDst, Offsets, uSrc, srcstride, Rounder); GMC_Core_Non_Lin_8(vDst, Offsets, vSrc, srcstride, Rounder); } } else { for(i=0; i<8; ++i) { int u = ( U >> 16 ) << rho; int v = ( V >> 16 ) << rho; U += dUx; V += dVx; Offsets[ i] = (u<0) ? 0 : (u>=W) ? W : u; Offsets[16+i] = (v<0) ? 0 : (v>=H) ? H : v; } GMC_Core_Non_Lin_8(uDst, Offsets, uSrc, srcstride, Rounder); GMC_Core_Non_Lin_8(vDst, Offsets, vSrc, srcstride, Rounder); } uDst += dststride; vDst += dststride; } } #endif /* ARCH_IS_IA32 */ /* ************************************************************* * will initialize internal pointers */ void init_GMC(const unsigned int cpu_flags) { Predict_16x16_func = Predict_16x16_C; Predict_8x8_func = Predict_8x8_C; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) if ((cpu_flags & XVID_CPU_MMX) || (cpu_flags & XVID_CPU_MMXEXT) || (cpu_flags & XVID_CPU_3DNOW) || (cpu_flags & XVID_CPU_3DNOWEXT) || (cpu_flags & XVID_CPU_SSE) || (cpu_flags & XVID_CPU_SSE2) || (cpu_flags & XVID_CPU_SSE3) || (cpu_flags & XVID_CPU_SSE41)) { Predict_16x16_func = Predict_16x16_mmx; Predict_8x8_func = Predict_8x8_mmx; if (cpu_flags & XVID_CPU_SSE41) GMC_Core_Lin_8 = xvid_GMC_Core_Lin_8_sse41; else if (cpu_flags & XVID_CPU_SSE2) GMC_Core_Lin_8 = xvid_GMC_Core_Lin_8_sse2; else GMC_Core_Lin_8 = xvid_GMC_Core_Lin_8_mmx; } #endif } /* ************************************************************* * Warning! It's Accuracy being passed, not 'resolution'! 
 */ void generate_GMCparameters( int nb_pts, const int accuracy, const WARPPOINTS *const pts, const int width, const int height, NEW_GMC_DATA *const gmc) { gmc->sW = width << 4; gmc->sH = height << 4; gmc->accuracy = accuracy; gmc->num_wp = nb_pts; /* reduce the number of points, if possible */ if (nb_pts<2 || (pts->duv[2].x==0 && pts->duv[2].y==0 && pts->duv[1].x==0 && pts->duv[1].y==0 )) { if (nb_pts<2 || (pts->duv[1].x==0 && pts->duv[1].y==0)) { if (nb_pts<1 || (pts->duv[0].x==0 && pts->duv[0].y==0)) { nb_pts = 0; } else nb_pts = 1; } else nb_pts = 2; } /* now, nb_pts stores the actual number of points required for interpolation */ if (nb_pts<=1) { if (nb_pts==1) { /* store as 4b fixed point */ gmc->Uo = pts->duv[0].x << accuracy; gmc->Vo = pts->duv[0].y << accuracy; gmc->Uco = ((pts->duv[0].x>>1) | (pts->duv[0].x&1)) << accuracy; /* DIV2RND() */ gmc->Vco = ((pts->duv[0].y>>1) | (pts->duv[0].y&1)) << accuracy; /* DIV2RND() */ } else { /* zero points?! */ gmc->Uo = gmc->Vo = 0; gmc->Uco = gmc->Vco = 0; } gmc->predict_16x16 = Predict_1pt_16x16_C; gmc->predict_8x8 = Predict_1pt_8x8_C; gmc->get_average_mv = get_average_mv_1pt_C; } else { /* 2 or 3 points */ const int rho = 3 - accuracy; /* = {3,2,1,0} for Acc={0,1,2,3} */ int Alpha = log2bin(width-1); int Ws = 1 << Alpha; gmc->dU[0] = 16*Ws + RDIV( 8*Ws*pts->duv[1].x, width ); /* dU/dx */ gmc->dV[0] = RDIV( 8*Ws*pts->duv[1].y, width ); /* dV/dx */ if (nb_pts==2) { gmc->dU[1] = -gmc->dV[0]; /* -Sin */ gmc->dV[1] = gmc->dU[0] ; /* Cos */ } else { const int Beta = log2bin(height-1); const int Hs = 1<<Beta; gmc->dU[1] = RDIV( 8*Hs*pts->duv[2].x, height ); /* dU/dy */ gmc->dV[1] = 16*Hs + RDIV( 8*Hs*pts->duv[2].y, height ); /* dV/dy */ if (Beta>Alpha) { gmc->dU[0] <<= (Beta-Alpha); gmc->dV[0] <<= (Beta-Alpha); Alpha = Beta; Ws = Hs; } else { gmc->dU[1] <<= Alpha - Beta; gmc->dV[1] <<= Alpha - Beta; } } /* upscale to 16b fixed-point */ gmc->dU[0] <<= (16-Alpha - rho); gmc->dU[1] <<= (16-Alpha - rho); gmc->dV[0] <<= (16-Alpha - rho); 
gmc->dV[1] <<= (16-Alpha - rho); gmc->Uo = ( pts->duv[0].x <<(16+ accuracy)) + (1<<15); gmc->Vo = ( pts->duv[0].y <<(16+ accuracy)) + (1<<15); gmc->Uco = ((pts->duv[0].x-1)<<(17+ accuracy)) + (1<<17); gmc->Vco = ((pts->duv[0].y-1)<<(17+ accuracy)) + (1<<17); gmc->Uco = (gmc->Uco + gmc->dU[0] + gmc->dU[1])>>2; gmc->Vco = (gmc->Vco + gmc->dV[0] + gmc->dV[1])>>2; gmc->predict_16x16 = Predict_16x16_func; gmc->predict_8x8 = Predict_8x8_func; gmc->get_average_mv = get_average_mv_C; } } /* ******************************************************************* * quick and dirty routine to generate the full warped image * (pGMC != NULL) or just all average Motion Vectors (pGMC == NULL) */ void generate_GMCimage( const NEW_GMC_DATA *const gmc_data, /* [input] precalculated data */ const IMAGE *const pRef, /* [input] */ const int mb_width, const int mb_height, const int stride, const int stride2, const int fcode, /* [input] some parameters... */ const int32_t quarterpel, /* [input] for rounding avgMV */ const int reduced_resolution, /* [input] ignored */ const int32_t rounding, /* [input] for rounding image data */ MACROBLOCK *const pMBs, /* [output] average motion vectors */ IMAGE *const pGMC) /* [output] full warped image */ { unsigned int mj,mi; VECTOR avgMV; for (mj = 0; mj < (unsigned int)mb_height; mj++) for (mi = 0; mi < (unsigned int)mb_width; mi++) { const int mbnum = mj*mb_width+mi; if (pGMC) { gmc_data->predict_16x16(gmc_data, pGMC->y + mj*16*stride + mi*16, pRef->y, stride, stride, mi, mj, rounding); gmc_data->predict_8x8(gmc_data, pGMC->u + mj*8*stride2 + mi*8, pRef->u, pGMC->v + mj*8*stride2 + mi*8, pRef->v, stride2, stride2, mi, mj, rounding); } gmc_data->get_average_mv(gmc_data, &avgMV, mi, mj, quarterpel); pMBs[mbnum].amv.x = gmc_sanitize(avgMV.x, quarterpel, fcode); pMBs[mbnum].amv.y = gmc_sanitize(avgMV.y, quarterpel, fcode); pMBs[mbnum].mcsel = 0; /* until mode decision */ } emms(); } 
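/* *******************************************************************
 * Illustrative sketch, not part of xvidcore. It restates, in isolation,
 * the "reduce the number of points, if possible" test at the top of
 * generate_GMCparameters() and the DIV2RND() halving used for the
 * 1-point chroma offsets. The names vec2_sketch, warp_sketch,
 * effective_warp_points and div2rnd_sketch are hypothetical stand-ins
 * introduced here; the real code works on VECTOR / WARPPOINTS.
 */

```c
#include <assert.h>

/* Hypothetical stand-ins for xvid's VECTOR and WARPPOINTS (assumption:
 * only the duv[] displacement array matters for this test). */
typedef struct { int x, y; } vec2_sketch;
typedef struct { vec2_sketch duv[3]; } warp_sketch;

static int is_zero_sketch(const vec2_sketch *v)
{
    return v->x == 0 && v->y == 0;
}

/* Same nested conditions as in generate_GMCparameters(): the count is
 * only lowered when the trailing warp points carry no displacement. */
static int effective_warp_points(int nb_pts, const warp_sketch *pts)
{
    if (nb_pts < 2 ||
        (is_zero_sketch(&pts->duv[2]) && is_zero_sketch(&pts->duv[1]))) {
        if (nb_pts < 2 || is_zero_sketch(&pts->duv[1])) {
            if (nb_pts < 1 || is_zero_sketch(&pts->duv[0]))
                nb_pts = 0;   /* no motion at all */
            else
                nb_pts = 1;   /* pure translation */
        } else {
            nb_pts = 2;
        }
    }
    return nb_pts;
}

/* Halve a luma displacement for chroma, keeping the low bit set when the
 * input was odd: the expression tagged DIV2RND() in the source.
 * Right-shifting a negative int is implementation-defined in C, so this
 * sketch is only exercised with non-negative inputs. */
static int div2rnd_sketch(int v)
{
    return (v >> 1) | (v & 1);
}
```

/* Usage, under the same assumptions: a warp_sketch whose duv[1] and
 * duv[2] are both zero and whose duv[0] is non-zero yields 1, which is
 * the case where the 1-point predictors (Predict_1pt_16x16_C and
 * Predict_1pt_8x8_C) get selected. */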
xvidcore/src/motion/motion.h0000664000076500007650000001150211564705453017246 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Motion module header - * * Copyright(C) 2002-2003 Radoslaw Czyz * 2002 Michael Militzer * * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: motion.h 1985 2011-05-18 09:02:35Z Isibaar $ * ***************************************************************************/ #ifndef _MOTION_H_ #define _MOTION_H_ #include "../portab.h" #include "../global.h" /***************************************************************************** * Modified rounding tables -- defined in estimation_common.c * Original tables see ISO spec tables 7-6 -> 7-9 ****************************************************************************/ extern const uint32_t roundtab[16]; /* K = 4 */ extern const uint32_t roundtab_76[16]; /* K = 2 */ extern const uint32_t roundtab_78[8]; /* K = 1 */ extern const uint32_t roundtab_79[4]; /** MotionEstimation **/ void MotionEstimation(MBParam * const pParam, FRAMEINFO * const current, FRAMEINFO * const reference, const IMAGE * const pRefH, const IMAGE * const pRefV, const IMAGE * const pRefHV, const IMAGE * const pGMC, const uint32_t iLimit, const int num_slices); void MotionEstimationBVOP(MBParam * 
const pParam, FRAMEINFO * const frame, const int32_t time_bp, const int32_t time_pp, const MACROBLOCK * const f_mbs, const IMAGE * const f_ref, const IMAGE * const f_refH, const IMAGE * const f_refV, const IMAGE * const f_refHV, const FRAMEINFO * const b_reference, const IMAGE * const b_ref, const IMAGE * const b_refH, const IMAGE * const b_refV, const IMAGE * const b_refHV, const int num_slices); void GMEanalysis(const MBParam * const pParam, const FRAMEINFO * const current, const FRAMEINFO * const reference, const IMAGE * const pRefH, const IMAGE * const pRefV, const IMAGE * const pRefHV, const int num_slices); WARPPOINTS GlobalMotionEst(MACROBLOCK * const pMBs, const MBParam * const pParam, const FRAMEINFO * const current, const FRAMEINFO * const reference, const IMAGE * const pRefH, const IMAGE * const pRefV, const IMAGE * const pRefHV, const int num_slices); int GlobalMotionEstRefine( WARPPOINTS *const startwp, MACROBLOCK * const pMBs, const MBParam * const pParam, const FRAMEINFO * const current, const FRAMEINFO * const reference, const IMAGE * const pCurr, const IMAGE * const pRef, const IMAGE * const pRefH, const IMAGE * const pRefV, const IMAGE * const pRefHV); int globalSAD(const WARPPOINTS *const wp, const MBParam * const pParam, const MACROBLOCK * const pMBs, const FRAMEINFO * const current, const IMAGE * const pRef, const IMAGE * const pCurr, uint8_t *const GMCblock); int MEanalysis( const IMAGE * const pRef, const FRAMEINFO * const Current, const MBParam * const pParam, const int maxIntra, const int intraCount, const int bCount, const int b_thresh, const MACROBLOCK * const prev_mbs); /** MotionCompensation **/ void MBMotionCompensation(MACROBLOCK * const mb, const uint32_t i, const uint32_t j, const IMAGE * const ref, const IMAGE * const refh, const IMAGE * const refv, const IMAGE * const refhv, const IMAGE * const refGMC, IMAGE * const cur, int16_t * dct_codes, const uint32_t width, const uint32_t height, const uint32_t edged_width, const int32_t 
quarterpel, const int32_t rounding, uint8_t * const tmp); void MBMotionCompensationBVOP(MBParam * pParam, MACROBLOCK * const mb, const uint32_t i, const uint32_t j, IMAGE * const cur, const IMAGE * const f_ref, const IMAGE * const f_refh, const IMAGE * const f_refv, const IMAGE * const f_refhv, const IMAGE * const b_ref, const IMAGE * const b_refh, const IMAGE * const b_refv, const IMAGE * const b_refhv, int16_t * dct_codes, uint8_t * const tmp); #endif /* _MOTION_H_ */ xvidcore/src/motion/estimation_rd_based_bvop.c0000664000076500007650000005504311564705453022771 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Rate-Distortion Based Motion Estimation for B- VOPs - * * Copyright(C) 2004 Radoslaw Czyz * Copyright(C) 2010 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: estimation_rd_based_bvop.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <assert.h> #include <stdio.h> #include <stdlib.h> #include <string.h> /* memcpy */ #include "../encoder.h" #include "../bitstream/mbcoding.h" #include "../prediction/mbprediction.h" #include "../global.h" #include "../image/interpolate8x8.h" #include "estimation.h" #include "motion.h" #include "sad.h" #include "../bitstream/zigzag.h" #include "../quant/quant.h" #include "../bitstream/vlc_codes.h" #include "../dct/fdct.h" #include "motion_inlines.h" /* rd = BITS_MULT*bits + LAMBDA*distortion */ #define LAMBDA ( (int)(BITS_MULT*1.0) ) static __inline unsigned int Block_CalcBits_BVOP(int16_t * const coeff, int16_t * const data, int16_t * const dqcoeff, const uint32_t quant, const int quant_type, uint32_t * cbp, const int block, const uint16_t * scan_table, const unsigned int lambda, const uint16_t * mpeg_quant_matrices, const unsigned int quant_sq, int * const cbpcost, const unsigned int rel_var8, const unsigned int metric) { int sum; int bits; int distortion = 0; fdct((short * const)data); if (quant_type) sum = quant_h263_inter(coeff, data, quant, mpeg_quant_matrices); else sum = quant_mpeg_inter(coeff, data, quant, mpeg_quant_matrices); if ((sum >= 3) || (coeff[1] != 0) || (coeff[8] != 0) || (coeff[0] != 0)) { *cbp |= 1 << (5 - block); bits = BITS_MULT * CodeCoeffInter_CalcBits(coeff, scan_table); bits += *cbpcost; *cbpcost = 0; /* don't add cbp cost second time */ if (quant_type) dequant_h263_inter(dqcoeff, coeff, quant, mpeg_quant_matrices); else dequant_mpeg_inter(dqcoeff, coeff, quant, mpeg_quant_matrices); if (metric) distortion = masked_sseh8_16bit(data, dqcoeff, rel_var8); else distortion = sse8_16bit(data, dqcoeff, 8*sizeof(int16_t)); } 
else { const static int16_t zero_block[64] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, }; bits = 0; if (metric) distortion = masked_sseh8_16bit(data, (int16_t * const) zero_block, rel_var8); else distortion = sse8_16bit(data, (int16_t * const) zero_block, 8*sizeof(int16_t)); } return bits + (lambda*distortion)/quant_sq; } static __inline unsigned int Block_CalcBits_BVOP_direct(int16_t * const coeff, int16_t * const data, int16_t * const dqcoeff, const uint32_t quant, const int quant_type, uint32_t * cbp, const int block, const uint16_t * scan_table, const unsigned int lambda, const uint16_t * mpeg_quant_matrices, const unsigned int quant_sq, int * const cbpcost, const unsigned int rel_var8, const unsigned int metric) { int sum; int bits; int distortion = 0; fdct((short * const)data); if (quant_type) sum = quant_h263_inter(coeff, data, quant, mpeg_quant_matrices); else sum = quant_mpeg_inter(coeff, data, quant, mpeg_quant_matrices); if ((sum >= 3) || (coeff[1] != 0) || (coeff[8] != 0) || (coeff[0] > 0) || (coeff[0] < -1)) { *cbp |= 1 << (5 - block); bits = BITS_MULT * CodeCoeffInter_CalcBits(coeff, scan_table); bits += *cbpcost; *cbpcost = 0; if (quant_type) dequant_h263_inter(dqcoeff, coeff, quant, mpeg_quant_matrices); else dequant_mpeg_inter(dqcoeff, coeff, quant, mpeg_quant_matrices); if (metric) distortion = masked_sseh8_16bit(data, dqcoeff, rel_var8); else distortion = sse8_16bit(data, dqcoeff, 8*sizeof(int16_t)); } else { const static int16_t zero_block[64] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, }; bits = 0; if (metric) distortion = masked_sseh8_16bit(data, (int16_t * const) zero_block, rel_var8); else distortion = sse8_16bit(data, (int16_t * const) 
zero_block, 8*sizeof(int16_t)); } return bits + (lambda*distortion)/quant_sq; } static void CheckCandidateRDBF(const int x, const int y, SearchData * const data, const unsigned int Direction) { int16_t *in = data->dctSpace, *coeff = data->dctSpace + 64; int32_t rd = (3+2)*BITS_MULT; /* 3 bits for mode + 2 for vector (minimum) */ VECTOR * current; const uint8_t * ptr; int i, xc, yc; unsigned cbp = 0; int cbpcost = 7*BITS_MULT; /* how much to add if cbp turns out to be non-zero */ if ( (x > data->max_dx) || (x < data->min_dx) || (y > data->max_dy) || (y < data->min_dy) ) return; if (!data->qpel_precision) { ptr = GetReference(x, y, data); current = data->currentMV; xc = x; yc = y; } else { /* x and y are in 1/4 precision */ ptr = xvid_me_interpolate16x16qpel(x, y, 0, data); current = data->currentQMV; xc = x/2; yc = y/2; } rd += BITS_MULT*(d_mv_bits(x, y, data->predMV, data->iFcode, data->qpel^data->qpel_precision)-2); for(i = 0; i < 4; i++) { int s = 8*((i&1) + (i>>1)*data->iEdgedWidth); transfer_8to16subro(in, data->Cur + s, ptr + s, data->iEdgedWidth); rd += Block_CalcBits_BVOP(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, i, data->scan_table, data->lambda[i], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[i], data->metric); if (rd >= data->iMinSAD[0]) return; } /* chroma */ xc = (xc >> 1) + roundtab_79[xc & 0x3]; yc = (yc >> 1) + roundtab_79[yc & 0x3]; /* chroma U */ ptr = interpolate8x8_switch2(data->RefQ, data->RefP[4], 0, 0, xc, yc, data->iEdgedWidth/2, data->rounding); transfer_8to16subro(in, data->CurU, ptr, data->iEdgedWidth/2); rd += Block_CalcBits_BVOP(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 4, data->scan_table, data->lambda[4], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[4], data->metric); if (rd >= data->iMinSAD[0]) return; /* chroma V */ ptr = interpolate8x8_switch2(data->RefQ, data->RefP[5], 0, 0, xc, yc, data->iEdgedWidth/2, data->rounding); 
transfer_8to16subro(in, data->CurV, ptr, data->iEdgedWidth/2); rd += Block_CalcBits_BVOP(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 5, data->scan_table, data->lambda[5], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[5], data->metric); if (rd < data->iMinSAD[0]) { data->iMinSAD[0] = rd; current[0].x = x; current[0].y = y; data->dir = Direction; *data->cbp = cbp; } } static void CheckCandidateRDDirect(const int x, const int y, SearchData * const data, const unsigned int Direction) { int32_t xcf = 0, ycf = 0, xcb = 0, ycb = 0; int32_t rd = 1*BITS_MULT; int16_t *in = data->dctSpace, *coeff = data->dctSpace + 64; unsigned int cbp = 0; unsigned int k; VECTOR mvs, b_mvs; int cbpcost = 6*BITS_MULT; /* how much to add if cbp turns out to be non-zero */ const uint8_t *ReferenceF, *ReferenceB; if (( x > 31) || ( x < -32) || ( y > 31) || (y < -32)) return; for (k = 0; k < 4; k++) { int s = 8*((k&1) + (k>>1)*data->iEdgedWidth); mvs.x = data->directmvF[k].x + x; b_mvs.x = ((x == 0) ? data->directmvB[k].x : mvs.x - data->referencemv[k].x); mvs.y = data->directmvF[k].y + y; b_mvs.y = ((y == 0) ? 
data->directmvB[k].y : mvs.y - data->referencemv[k].y); if ((mvs.x > data->max_dx) || (mvs.x < data->min_dx) || (mvs.y > data->max_dy) || (mvs.y < data->min_dy) || (b_mvs.x > data->max_dx) || (b_mvs.x < data->min_dx) || (b_mvs.y > data->max_dy) || (b_mvs.y < data->min_dy) ) return; if (data->qpel) { xcf += mvs.x/2; ycf += mvs.y/2; xcb += b_mvs.x/2; ycb += b_mvs.y/2; ReferenceF = xvid_me_interpolate8x8qpel(mvs.x, mvs.y, k, 0, data); ReferenceB = xvid_me_interpolate8x8qpel(b_mvs.x, b_mvs.y, k, 1, data); } else { xcf += mvs.x; ycf += mvs.y; xcb += b_mvs.x; ycb += b_mvs.y; ReferenceF = GetReference(mvs.x, mvs.y, data) + s; ReferenceB = GetReferenceB(b_mvs.x, b_mvs.y, 1, data) + s; } transfer_8to16sub2ro(in, data->Cur + s, ReferenceF, ReferenceB, data->iEdgedWidth); rd += Block_CalcBits_BVOP_direct(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, k, data->scan_table, data->lambda[k], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[k], data->metric); if (rd > *(data->iMinSAD)) return; } /* chroma */ xcf = (xcf >> 3) + roundtab_76[xcf & 0xf]; ycf = (ycf >> 3) + roundtab_76[ycf & 0xf]; xcb = (xcb >> 3) + roundtab_76[xcb & 0xf]; ycb = (ycb >> 3) + roundtab_76[ycb & 0xf]; /* chroma U */ ReferenceF = interpolate8x8_switch2(data->RefQ, data->RefP[4], 0, 0, xcf, ycf, data->iEdgedWidth/2, data->rounding); ReferenceB = interpolate8x8_switch2(data->RefQ + 16, data->b_RefP[4], 0, 0, xcb, ycb, data->iEdgedWidth/2, data->rounding); transfer_8to16sub2ro(in, data->CurU, ReferenceF, ReferenceB, data->iEdgedWidth/2); rd += Block_CalcBits_BVOP_direct(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 4, data->scan_table, data->lambda[4], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[4], data->metric); if (rd >= data->iMinSAD[0]) return; /* chroma V */ ReferenceF = interpolate8x8_switch2(data->RefQ, data->RefP[5], 0, 0, xcf, ycf, data->iEdgedWidth/2, data->rounding); ReferenceB = 
interpolate8x8_switch2(data->RefQ + 16, data->b_RefP[5], 0, 0, xcb, ycb, data->iEdgedWidth/2, data->rounding); transfer_8to16sub2ro(in, data->CurV, ReferenceF, ReferenceB, data->iEdgedWidth/2); rd += Block_CalcBits_BVOP_direct(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 5, data->scan_table, data->lambda[5], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[5], data->metric); if (cbp || x != 0 || y != 0) rd += BITS_MULT * d_mv_bits(x, y, zeroMV, 1, 0); if (rd < *(data->iMinSAD)) { *data->iMinSAD = rd; data->currentMV->x = x; data->currentMV->y = y; data->dir = Direction; *data->cbp = cbp; } } static void CheckCandidateRDInt(const int x, const int y, SearchData * const data, const unsigned int Direction) { int32_t xf, yf, xb, yb, xcf, ycf, xcb, ycb; int32_t rd = 2*BITS_MULT; int16_t *in = data->dctSpace, *coeff = data->dctSpace + 64; unsigned int cbp = 0; unsigned int i; int cbpcost = 7*BITS_MULT; /* how much to add if cbp turns out to be non-zero */ const uint8_t *ReferenceF, *ReferenceB; VECTOR *current; if ((x > data->max_dx) || (x < data->min_dx) || (y > data->max_dy) || (y < data->min_dy)) return; if (Direction == 1) { /* x and y mean forward vector */ VECTOR backward = data->qpel_precision ? data->currentQMV[1] : data->currentMV[1]; xb = backward.x; yb = backward.y; xf = x; yf = y; } else { /* x and y mean backward vector */ VECTOR forward = data->qpel_precision ? 
data->currentQMV[0] : data->currentMV[0]; xf = forward.x; yf = forward.y; xb = x; yb = y; } if (!data->qpel_precision) { ReferenceF = GetReference(xf, yf, data); ReferenceB = GetReferenceB(xb, yb, 1, data); current = data->currentMV + Direction - 1; xcf = xf; ycf = yf; xcb = xb; ycb = yb; } else { ReferenceF = xvid_me_interpolate16x16qpel(xf, yf, 0, data); current = data->currentQMV + Direction - 1; ReferenceB = xvid_me_interpolate16x16qpel(xb, yb, 1, data); xcf = xf/2; ycf = yf/2; xcb = xb/2; ycb = yb/2; } rd += BITS_MULT * (d_mv_bits(xf, yf, data->predMV, data->iFcode, data->qpel^data->qpel_precision) + d_mv_bits(xb, yb, data->bpredMV, data->iFcode, data->qpel^data->qpel_precision)); for(i = 0; i < 4; i++) { int s = 8*((i&1) + (i>>1)*data->iEdgedWidth); if (rd >= *data->iMinSAD) return; transfer_8to16sub2ro(in, data->Cur + s, ReferenceF + s, ReferenceB + s, data->iEdgedWidth); rd += Block_CalcBits_BVOP(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, i, data->scan_table, data->lambda[i], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[i], data->metric); } /* chroma */ xcf = (xcf >> 1) + roundtab_79[xcf & 0x3]; ycf = (ycf >> 1) + roundtab_79[ycf & 0x3]; xcb = (xcb >> 1) + roundtab_79[xcb & 0x3]; ycb = (ycb >> 1) + roundtab_79[ycb & 0x3]; /* chroma U */ ReferenceF = interpolate8x8_switch2(data->RefQ, data->RefP[4], 0, 0, xcf, ycf, data->iEdgedWidth/2, data->rounding); ReferenceB = interpolate8x8_switch2(data->RefQ + 16, data->b_RefP[4], 0, 0, xcb, ycb, data->iEdgedWidth/2, data->rounding); transfer_8to16sub2ro(in, data->CurU, ReferenceF, ReferenceB, data->iEdgedWidth/2); rd += Block_CalcBits_BVOP(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 4, data->scan_table, data->lambda[4], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[4], data->metric); if (rd >= data->iMinSAD[0]) return; /* chroma V */ ReferenceF = interpolate8x8_switch2(data->RefQ, data->RefP[5], 0, 0, xcf, ycf, 
data->iEdgedWidth/2, data->rounding); ReferenceB = interpolate8x8_switch2(data->RefQ + 16, data->b_RefP[5], 0, 0, xcb, ycb, data->iEdgedWidth/2, data->rounding); transfer_8to16sub2ro(in, data->CurV, ReferenceF, ReferenceB, data->iEdgedWidth/2); rd += Block_CalcBits_BVOP(coeff, in, data->dctSpace + 128, data->iQuant, data->quant_type, &cbp, 5, data->scan_table, data->lambda[5], data->mpeg_quant_matrices, data->quant_sq, &cbpcost, data->rel_var8[5], data->metric); if (rd < *(data->iMinSAD)) { *data->iMinSAD = rd; current->x = x; current->y = y; data->dir = Direction; *data->cbp = cbp; } } static int SearchInterpolate_RD(const int x, const int y, const uint32_t MotionFlags, const MBParam * const pParam, int32_t * const best_sad, SearchData * const Data) { int i, j; Data->iMinSAD[0] = *best_sad; get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode, 1 + Data->qpel); Data->qpel_precision = Data->qpel; if (Data->qpel) { i = Data->currentQMV[0].x; j = Data->currentQMV[0].y; } else { i = Data->currentMV[0].x; j = Data->currentMV[0].y; } CheckCandidateRDInt(i, j, Data, 1); return Data->iMinSAD[0]; } static int SearchDirect_RD(const int x, const int y, const uint32_t MotionFlags, const MBParam * const pParam, int32_t * const best_sad, SearchData * const Data) { Data->iMinSAD[0] = *best_sad; Data->qpel_precision = Data->qpel; CheckCandidateRDDirect(Data->currentMV->x, Data->currentMV->y, Data, 255); return Data->iMinSAD[0]; } static int SearchBF_RD(const int x, const int y, const uint32_t MotionFlags, const MBParam * const pParam, int32_t * const best_sad, SearchData * const Data) { int i, j; Data->iMinSAD[0] = *best_sad; get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4, pParam->width, pParam->height, Data->iFcode, 1 + Data->qpel); Data->qpel_precision = Data->qpel; if (Data->qpel) { i = Data->currentQMV[0].x; j = Data->currentQMV[0].y; } else { i = Data->currentMV[0].x; j = 
Data->currentMV[0].y; } CheckCandidateRDBF(i, j, Data, 1); return Data->iMinSAD[0]; } static int get_sad_for_mode(int mode, SearchData * const Data_d, SearchData * const Data_b, SearchData * const Data_f, SearchData * const Data_i) { switch(mode) { case MODE_DIRECT: return Data_d->iMinSAD[0]; case MODE_FORWARD: return Data_f->iMinSAD[0]; case MODE_BACKWARD: return Data_b->iMinSAD[0]; default: case MODE_INTERPOLATE: return Data_i->iMinSAD[0]; } } void ModeDecision_BVOP_RD(SearchData * const Data_d, SearchData * const Data_b, SearchData * const Data_f, SearchData * const Data_i, MACROBLOCK * const pMB, const MACROBLOCK * const b_mb, VECTOR * f_predMV, VECTOR * b_predMV, const uint32_t MotionFlags, const uint32_t VopFlags, const MBParam * const pParam, int x, int y, int best_sad, int force_direct) { int mode = MODE_DIRECT, k; int f_rd, b_rd, i_rd, d_rd, best_rd; const int qpel = Data_d->qpel; const uint32_t iQuant = Data_d->iQuant; int i; int ref_quant = b_mb->quant; int no_of_checks = 0; int order[4] = {MODE_DIRECT, MODE_FORWARD, MODE_BACKWARD, MODE_INTERPOLATE}; Data_d->metric = Data_b->metric = Data_f->metric = Data_i->metric = !!(VopFlags & XVID_VOP_RD_PSNRHVSM); Data_d->scan_table = Data_b->scan_table = Data_f->scan_table = Data_i->scan_table = /*VopFlags & XVID_VOP_ALTERNATESCAN ? 
scan_tables[2] : */scan_tables[0]; *Data_f->cbp = *Data_b->cbp = *Data_i->cbp = *Data_d->cbp = 63; f_rd = b_rd = i_rd = d_rd = best_rd = 256*4096; for (i = 0; i < 6; i++) { /* re-calculate as if it was p-frame's quant +.5 */ int lam = (pMB->lambda[i]*LAMBDA*iQuant*iQuant)/(ref_quant*(ref_quant+1)); lam >>= LAMBDA_EXP; Data_d->lambda[i] = lam; Data_b->lambda[i] = lam; Data_f->lambda[i] = lam; Data_i->lambda[i] = lam; Data_d->rel_var8[i] = pMB->rel_var8[i]; Data_b->rel_var8[i] = pMB->rel_var8[i]; Data_f->rel_var8[i] = pMB->rel_var8[i]; Data_i->rel_var8[i] = pMB->rel_var8[i]; } if (force_direct) { best_rd = 0; goto set_mode; /* bypass checks for non-direct modes */ } /* find the best order of evaluation - smallest SAD comes first, because *if* it means smaller RD, early-stops will activate sooner */ for (i = 3; i >= 0; i--) { int j; for (j = 0; j < i; j++) { int sad1 = get_sad_for_mode(order[j], Data_d, Data_b, Data_f, Data_i); int sad2 = get_sad_for_mode(order[j+1], Data_d, Data_b, Data_f, Data_i); if (sad1 > sad2) { int t = order[j]; order[j] = order[j+1]; order[j+1] = t; } } } for(i = 0; i < 4; i++) if (get_sad_for_mode(order[i], Data_d, Data_b, Data_f, Data_i) < 2*best_sad) no_of_checks++; if (no_of_checks > 1) { /* evaluate cost of all modes */ for (i = 0; i < no_of_checks; i++) { int rd; if (2*best_sad < get_sad_for_mode(order[i], Data_d, Data_b, Data_f, Data_i)) break; /* further SADs are too big */ switch (order[i]) { case MODE_DIRECT: rd = d_rd = SearchDirect_RD(x, y, MotionFlags, pParam, &best_rd, Data_d); break; case MODE_FORWARD: rd = f_rd = SearchBF_RD(x, y, MotionFlags, pParam, &best_rd, Data_f) + 1*BITS_MULT; /* extra one bit for FORWARD vs BACKWARD */ break; case MODE_BACKWARD: rd = b_rd = SearchBF_RD(x, y, MotionFlags, pParam, &best_rd, Data_b); break; default: case MODE_INTERPOLATE: rd = i_rd = SearchInterpolate_RD(x, y, MotionFlags, pParam, &best_rd, Data_i); break; } if (rd < best_rd) { mode = order[i]; best_rd = rd; } } } else { /* only 1 mode is 
below the threshold */ mode = order[0]; best_rd = 0; } set_mode: pMB->sad16 = best_rd; pMB->mode = mode; switch (mode) { case MODE_DIRECT: if (!qpel && b_mb->mode != MODE_INTER4V) pMB->mode = MODE_DIRECT_NO4V; /* for faster compensation */ pMB->pmvs[3] = Data_d->currentMV[0]; pMB->cbp = *Data_d->cbp; for (k = 0; k < 4; k++) { pMB->mvs[k].x = Data_d->directmvF[k].x + Data_d->currentMV->x; pMB->b_mvs[k].x = ( (Data_d->currentMV->x == 0) ? Data_d->directmvB[k].x :pMB->mvs[k].x - Data_d->referencemv[k].x); pMB->mvs[k].y = (Data_d->directmvF[k].y + Data_d->currentMV->y); pMB->b_mvs[k].y = ((Data_d->currentMV->y == 0) ? Data_d->directmvB[k].y : pMB->mvs[k].y - Data_d->referencemv[k].y); if (qpel) { pMB->qmvs[k].x = pMB->mvs[k].x; pMB->mvs[k].x /= 2; pMB->b_qmvs[k].x = pMB->b_mvs[k].x; pMB->b_mvs[k].x /= 2; pMB->qmvs[k].y = pMB->mvs[k].y; pMB->mvs[k].y /= 2; pMB->b_qmvs[k].y = pMB->b_mvs[k].y; pMB->b_mvs[k].y /= 2; } if (b_mb->mode != MODE_INTER4V) { pMB->mvs[3] = pMB->mvs[2] = pMB->mvs[1] = pMB->mvs[0]; pMB->b_mvs[3] = pMB->b_mvs[2] = pMB->b_mvs[1] = pMB->b_mvs[0]; pMB->qmvs[3] = pMB->qmvs[2] = pMB->qmvs[1] = pMB->qmvs[0]; pMB->b_qmvs[3] = pMB->b_qmvs[2] = pMB->b_qmvs[1] = pMB->b_qmvs[0]; break; } } break; case MODE_FORWARD: if (qpel) { pMB->pmvs[0].x = Data_f->currentQMV->x - f_predMV->x; pMB->pmvs[0].y = Data_f->currentQMV->y - f_predMV->y; pMB->qmvs[0] = *Data_f->currentQMV; *f_predMV = Data_f->currentQMV[0]; } else { pMB->pmvs[0].x = Data_f->currentMV->x - f_predMV->x; pMB->pmvs[0].y = Data_f->currentMV->y - f_predMV->y; *f_predMV = Data_f->currentMV[0]; } pMB->mvs[0] = *Data_f->currentMV; pMB->cbp = *Data_f->cbp; pMB->b_mvs[0] = *Data_b->currentMV; /* hint for future searches */ break; case MODE_BACKWARD: if (qpel) { pMB->pmvs[0].x = Data_b->currentQMV->x - b_predMV->x; pMB->pmvs[0].y = Data_b->currentQMV->y - b_predMV->y; pMB->b_qmvs[0] = *Data_b->currentQMV; *b_predMV = Data_b->currentQMV[0]; } else { pMB->pmvs[0].x = Data_b->currentMV->x - b_predMV->x; 
pMB->pmvs[0].y = Data_b->currentMV->y - b_predMV->y; *b_predMV = Data_b->currentMV[0]; } pMB->b_mvs[0] = *Data_b->currentMV; pMB->cbp = *Data_b->cbp; pMB->mvs[0] = *Data_f->currentMV; /* hint for future searches */ break; case MODE_INTERPOLATE: pMB->mvs[0] = Data_i->currentMV[0]; pMB->b_mvs[0] = Data_i->currentMV[1]; if (qpel) { pMB->qmvs[0] = Data_i->currentQMV[0]; pMB->b_qmvs[0] = Data_i->currentQMV[1]; pMB->pmvs[1].x = pMB->qmvs[0].x - f_predMV->x; pMB->pmvs[1].y = pMB->qmvs[0].y - f_predMV->y; pMB->pmvs[0].x = pMB->b_qmvs[0].x - b_predMV->x; pMB->pmvs[0].y = pMB->b_qmvs[0].y - b_predMV->y; *f_predMV = Data_i->currentQMV[0]; *b_predMV = Data_i->currentQMV[1]; } else { pMB->pmvs[1].x = pMB->mvs[0].x - f_predMV->x; pMB->pmvs[1].y = pMB->mvs[0].y - f_predMV->y; pMB->pmvs[0].x = pMB->b_mvs[0].x - b_predMV->x; pMB->pmvs[0].y = pMB->b_mvs[0].y - b_predMV->y; *f_predMV = Data_i->currentMV[0]; *b_predMV = Data_i->currentMV[1]; } pMB->cbp = *Data_i->cbp; break; } } xvidcore/src/motion/vop_type_decision.c0000664000076500007650000001762211564705453021467 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - ME-based Frame Type Decision - * * Copyright(C) 2002-2003 Radoslaw Czyz * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: vop_type_decision.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#include "../encoder.h"
#include "../prediction/mbprediction.h"
#include "estimation.h"
#include "motion.h"
#include "sad.h"
#include "gmc.h"
#include "../utils/emms.h"
#include "motion_inlines.h"

#define INTRA_THRESH	2000
#define INTER_THRESH	40
#define INTRA_THRESH2	90

/* when we are in 1/I_SENS_TH before forced keyframe, we start to decrease i-frame threshold */
#define I_SENS_TH		3

/* how much we subtract from each p-frame threshold for 2nd, 3rd etc. b-frame in a row */
#define P_SENS_BIAS		18

/* .. but never below INTER_THRESH_MIN */
#define INTER_THRESH_MIN 5

static void
CheckCandidate32I(const int x, const int y, SearchData * const data, const unsigned int Direction)
{
	/* maximum speed */
	int32_t sad;

	if ( (x > data->max_dx) || (x < data->min_dx)
		|| (y > data->max_dy) || (y < data->min_dy) ) return;

	sad = sad32v_c(data->Cur, data->RefP[0] + x + y*((int)data->iEdgedWidth),
					data->iEdgedWidth, data->temp);

	if (sad < *(data->iMinSAD)) {
		*(data->iMinSAD) = sad;
		data->currentMV[0].x = x; data->currentMV[0].y = y;
		data->dir = Direction;
	}
	if (data->temp[0] < data->iMinSAD[1]) {
		data->iMinSAD[1] = data->temp[0]; data->currentMV[1].x = x; data->currentMV[1].y = y; }
	if (data->temp[1] < data->iMinSAD[2]) {
		data->iMinSAD[2] = data->temp[1]; data->currentMV[2].x = x; data->currentMV[2].y = y; }
	if (data->temp[2] < data->iMinSAD[3]) {
		data->iMinSAD[3] = data->temp[2]; data->currentMV[3].x = x; data->currentMV[3].y = y; }
	if (data->temp[3] < data->iMinSAD[4]) {
		data->iMinSAD[4] = data->temp[3]; data->currentMV[4].x = x; data->currentMV[4].y = y; }
}

static __inline void
MEanalyzeMB (	const uint8_t * const pRef,
				const uint8_t *
				const pCur,
				const int x,
				const int y,
				const MBParam * const pParam,
				MACROBLOCK * const pMBs,
				SearchData * const Data)
{

	int i;
	VECTOR pmv[3];
	MACROBLOCK * const pMB = &pMBs[x + y * pParam->mb_width];

	unsigned int simplicity = 0;

	for (i = 0; i < 5; i++) Data->iMinSAD[i] = MV_MAX_ERROR;

	get_range(&Data->min_dx, &Data->max_dx, &Data->min_dy, &Data->max_dy, x, y, 4,
			pParam->width, pParam->height, Data->iFcode - Data->qpel - 1, 0);

	Data->Cur = pCur + (x + y * pParam->edged_width) * 16;
	Data->RefP[0] = pRef + (x + y * pParam->edged_width) * 16;

	pmv[0].x = pMB->mvs[0].x;
	pmv[0].y = pMB->mvs[0].y;

	CheckCandidate32I(pmv[0].x, pmv[0].y, Data, 0);

	if (*Data->iMinSAD > 200) {

		pmv[1].x = pmv[1].y = 0;

		/* median is only used as prediction. it doesn't have to be real */
		if (x == 1 && y == 1) Data->predMV.x = Data->predMV.y = 0;
		else
			if (x == 1) /* left macroblock does not have any vector now */
				Data->predMV = (pMB - pParam->mb_width)->mvs[0]; /* top instead of median */
			else if (y == 1) /* top macroblock doesn't have its vector */
				Data->predMV = (pMB - 1)->mvs[0]; /* left instead of median */
			else Data->predMV = get_pmv2(pMBs, pParam->mb_width, 0, x, y, 0); /* else median */

		pmv[2].x = Data->predMV.x;
		pmv[2].y = Data->predMV.y;

		if (!vector_repeats(pmv, 1))
			CheckCandidate32I(pmv[1].x, pmv[1].y, Data, 1);
		if (!vector_repeats(pmv, 2))
			CheckCandidate32I(pmv[2].x, pmv[2].y, Data, 2);

		if (*Data->iMinSAD > 500) { /* diamond only if needed */
			unsigned int mask = make_mask(pmv, 3, Data->dir);
			xvid_me_DiamondSearch(Data->currentMV->x, Data->currentMV->y, Data, mask, CheckCandidate32I);
		} else simplicity++;

		if (*Data->iMinSAD > 500) /* refinement from 2-pixel to 1-pixel */
			xvid_me_SubpelRefine(Data->currentMV[0], Data, CheckCandidate32I, 0);
		else simplicity++;
	} else simplicity++;

	for (i = 0; i < 4; i++) {
		MACROBLOCK * MB = &pMBs[x + (i&1) + (y+(i>>1)) * pParam->mb_width];
		MB->mvs[0] = MB->mvs[1] = MB->mvs[2] = MB->mvs[3] = Data->currentMV[i];
		MB->mode = MODE_INTER;
		/* if we skipped some
		   search steps, we have to assume that SAD would be lower with them */
		MB->sad16 = Data->iMinSAD[i+1] - (simplicity<<7);
		if (MB->sad16 < 0) MB->sad16 = 0;
	}
}

int
MEanalysis(	const IMAGE * const pRef,
			const FRAMEINFO * const Current,
			const MBParam * const pParam,
			const int maxIntra, /* maximum number of non-I frames */
			const int intraCount, /* number of non-I frames after last I frame; 0 if we force P/B frame */
			const int bCount, /* number of B frames in a row */
			const int b_thresh,
			const MACROBLOCK * const prev_mbs)
{
	uint32_t x, y, intra = 0;
	int sSAD = 0;
	MACROBLOCK * const pMBs = Current->mbs;
	const IMAGE * const pCurrent = &Current->image;
	int IntraThresh = INTRA_THRESH,
		InterThresh = INTER_THRESH + b_thresh,
		IntraThresh2 = INTRA_THRESH2;

	int blocks = 10;
	int complexity = 0;

	SearchData Data;
	Data.iEdgedWidth = pParam->edged_width;
	Data.iFcode = Current->fcode;
	Data.qpel = (pParam->vol_flags & XVID_VOL_QUARTERPEL)? 1: 0;
	Data.qpel_precision = 0;

	if (intraCount != 0) {
		if (intraCount < 30) {
			/* we're right after an I frame; we increase thresholds to prevent consecutive i-frames */
			if (intraCount < 10) IntraThresh += 15*(10 - intraCount)*(10 - intraCount);
			IntraThresh2 += 4*(30 - intraCount);
		} else if (I_SENS_TH*(maxIntra - intraCount) < maxIntra) {
			/* we're close to maximum.
we decrease thresholds to catch any good keyframe */ IntraThresh -= IntraThresh*((maxIntra - I_SENS_TH*(maxIntra - intraCount))/maxIntra); IntraThresh2 -= IntraThresh2*((maxIntra - I_SENS_TH*(maxIntra - intraCount))/maxIntra); } } InterThresh -= P_SENS_BIAS * bCount; if (InterThresh < INTER_THRESH_MIN) InterThresh = INTER_THRESH_MIN; if (sadInit) (*sadInit) (); for (y = 1; y < pParam->mb_height-1; y += 2) { for (x = 1; x < pParam->mb_width-1; x += 2) { int i; blocks += 10; if (bCount == 0) pMBs[x + y * pParam->mb_width].mvs[0] = zeroMV; else { /* extrapolation of the vector found for last frame */ pMBs[x + y * pParam->mb_width].mvs[0].x = (prev_mbs[x + y * pParam->mb_width].mvs[0].x * (bCount+1) ) / bCount; pMBs[x + y * pParam->mb_width].mvs[0].y = (prev_mbs[x + y * pParam->mb_width].mvs[0].y * (bCount+1) ) / bCount; } MEanalyzeMB(pRef->y, pCurrent->y, x, y, pParam, pMBs, &Data); for (i = 0; i < 4; i++) { int dev; MACROBLOCK *pMB = &pMBs[x+(i&1) + (y+(i>>1)) * pParam->mb_width]; dev = dev16(pCurrent->y + (x + (i&1) + (y + (i>>1)) * pParam->edged_width) * 16, pParam->edged_width); complexity += MAX(dev, 300); if (dev + IntraThresh < pMB->sad16) { pMB->mode = MODE_INTRA; if (++intra > ((pParam->mb_height-2)*(pParam->mb_width-2))/2) return I_VOP; } if (pMB->mvs[0].x == 0 && pMB->mvs[0].y == 0) if (dev > 1000 && pMB->sad16 < 1000) sSAD += 512; sSAD += (dev < 3000) ? 
pMB->sad16 : pMB->sad16/2; /* blocks with big contrast differences usually have large SAD - while they look very good in b-frames */ } } } complexity >>= 7; sSAD /= complexity + 4*blocks; if (sSAD > IntraThresh2) return I_VOP; if (sSAD > InterThresh) return P_VOP; emms(); return B_VOP; } xvidcore/src/motion/motion_inlines.h0000664000076500007650000001124511564705453020773 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Motion Estimation shared functions - * * Copyright(C) 2002 Christoph Lampert * 2002 Michael Militzer * 2002-2003 Radoslaw Czyz * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: motion_inlines.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _MOTION_INLINES_ #define _MOTION_INLINES_ #include /* * Calculate the min/max range * relative to the _MACROBLOCK_ position */ static void __inline get_range(int32_t * const min_dx, int32_t * const max_dx, int32_t * const min_dy, int32_t * const max_dy, const uint32_t x, const uint32_t y, uint32_t block_sz, /* block dimension, 3(8) or 4(16) */ const uint32_t width, const uint32_t height, const int fcode, const int precision) /* 2 for qpel, 1 for halfpel */ { int k; const int search_range = 1 << (4+fcode); int high = search_range - 1; int low = -search_range; k = (int)(width - (x<>= (iFcode - 1); bits += r_mvtab[x+64]; y -= pred.y; bits += (y != 0 ? iFcode:0); y = -abs(y); y >>= (iFcode - 1); bits += r_mvtab[y+64]; return bits; } static __inline const uint8_t * GetReference(const int x, const int y, const SearchData * const data) { const int picture = ((x&1)<<1) | (y&1); const int offset = (x>>1) + (y>>1)*data->iEdgedWidth; return data->RefP[picture] + offset; } static __inline const uint8_t * GetReferenceB(const int x, const int y, const uint32_t dir, const SearchData * const data) { /* dir : 0 = forward, 1 = backward */ const uint8_t *const *const direction = ( dir == 0 ? 
data->RefP : data->b_RefP ); const int picture = ((x&1)<<1) | (y&1); const int offset = (x>>1) + (y>>1)*data->iEdgedWidth; return direction[picture] + offset; } static __inline void ZeroMacroblockP(MACROBLOCK *pMB, const int32_t sad) { pMB->mode = MODE_INTER; pMB->mvs[0] = pMB->mvs[1] = pMB->mvs[2] = pMB->mvs[3] = zeroMV; pMB->qmvs[0] = pMB->qmvs[1] = pMB->qmvs[2] = pMB->qmvs[3] = zeroMV; pMB->sad16 = pMB->sad8[0] = pMB->sad8[1] = pMB->sad8[2] = pMB->sad8[3] = sad; pMB->mcsel = 0; pMB->cbp = 0; } /* check if given vector is equal to any vector checked before */ static __inline int vector_repeats(const VECTOR * const pmv, const unsigned int i) { unsigned int j; for (j = 0; j < i; j++) if (MVequal(pmv[i], pmv[j])) return 1; /* same vector has been checked already */ return 0; } /* make a binary mask that prevents diamonds/squares from checking a vector which has been checked as a prediction */ static __inline int make_mask(const VECTOR * const pmv, const unsigned int i, const unsigned int current) { unsigned int mask = 255, j; for (j = 0; j < i; j++) { if (pmv[current].x == pmv[j].x) { if (pmv[current].y == pmv[j].y + iDiamondSize) mask &= ~4; else if (pmv[current].y == pmv[j].y - iDiamondSize) mask &= ~8; } else if (pmv[current].y == pmv[j].y) { if (pmv[current].x == pmv[j].x + iDiamondSize) mask &= ~1; else if (pmv[current].x == pmv[j].x - iDiamondSize) mask &= ~2; } } return mask; } #endif /* _MOTION_INLINES_ */ xvidcore/src/motion/estimation_common.c0000664000076500007650000004512611564705453021471 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Motion Estimation shared functions - * * Copyright(C) 2002 Christoph Lampert * 2002 Michael Militzer * 2002-2003 Radoslaw Czyz * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, 
or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: estimation_common.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "../encoder.h" #include "../global.h" #include "../image/interpolate8x8.h" #include "estimation.h" #include "motion.h" #include "sad.h" #include "motion_inlines.h" /***************************************************************************** * Modified rounding tables * Original tables see ISO spec tables 7-6 -> 7-9 ****************************************************************************/ const uint32_t roundtab[16] = {0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2 }; /* K = 4 */ const uint32_t roundtab_76[16] = { 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1 }; /* K = 2 */ const uint32_t roundtab_78[8] = { 0, 0, 1, 1, 0, 0, 0, 1 }; /* K = 1 */ const uint32_t roundtab_79[4] = { 0, 1, 0, 0 }; const int xvid_me_lambda_vec16[32] = { 0 ,(int)(1.0 * NEIGH_TEND_16X16 + 0.5), (int)(2.0*NEIGH_TEND_16X16 + 0.5), (int)(3.0*NEIGH_TEND_16X16 + 0.5), (int)(4.0*NEIGH_TEND_16X16 + 0.5), (int)(5.0*NEIGH_TEND_16X16 + 0.5), (int)(6.0*NEIGH_TEND_16X16 + 0.5), (int)(7.0*NEIGH_TEND_16X16 + 0.5), (int)(8.0*NEIGH_TEND_16X16 + 0.5), (int)(9.0*NEIGH_TEND_16X16 + 0.5), (int)(10.0*NEIGH_TEND_16X16 + 0.5), (int)(11.0*NEIGH_TEND_16X16 + 0.5), (int)(12.0*NEIGH_TEND_16X16 + 0.5), (int)(13.0*NEIGH_TEND_16X16 + 0.5), (int)(14.0*NEIGH_TEND_16X16 + 0.5), (int)(15.0*NEIGH_TEND_16X16 + 0.5), (int)(16.0*NEIGH_TEND_16X16 + 0.5), (int)(17.0*NEIGH_TEND_16X16 + 0.5), 
(int)(18.0*NEIGH_TEND_16X16 + 0.5), (int)(19.0*NEIGH_TEND_16X16 + 0.5), (int)(20.0*NEIGH_TEND_16X16 + 0.5), (int)(21.0*NEIGH_TEND_16X16 + 0.5), (int)(22.0*NEIGH_TEND_16X16 + 0.5), (int)(23.0*NEIGH_TEND_16X16 + 0.5), (int)(24.0*NEIGH_TEND_16X16 + 0.5), (int)(25.0*NEIGH_TEND_16X16 + 0.5), (int)(26.0*NEIGH_TEND_16X16 + 0.5), (int)(27.0*NEIGH_TEND_16X16 + 0.5), (int)(28.0*NEIGH_TEND_16X16 + 0.5), (int)(29.0*NEIGH_TEND_16X16 + 0.5), (int)(30.0*NEIGH_TEND_16X16 + 0.5), (int)(31.0*NEIGH_TEND_16X16 + 0.5) }; /***************************************************************************** * Code ****************************************************************************/ int32_t xvid_me_ChromaSAD(const int dx, const int dy, SearchData * const data) { int sad; const uint32_t stride = data->iEdgedWidth/2; int offset = (dx>>1) + (dy>>1)*stride; int next = 1; if (dx == data->chromaX && dy == data->chromaY) return data->chromaSAD; /* it has been checked recently */ data->chromaX = dx; data->chromaY = dy; /* backup */ switch (((dx & 1) << 1) | (dy & 1)) { case 0: sad = sad8(data->CurU, data->RefP[4] + offset, stride); sad += sad8(data->CurV, data->RefP[5] + offset, stride); break; case 1: next = stride; case 2: sad = sad8bi(data->CurU, data->RefP[4] + offset, data->RefP[4] + offset + next, stride); sad += sad8bi(data->CurV, data->RefP[5] + offset, data->RefP[5] + offset + next, stride); break; default: interpolate8x8_halfpel_hv(data->RefQ, data->RefP[4] + offset, stride, data->rounding); sad = sad8(data->CurU, data->RefQ, stride); interpolate8x8_halfpel_hv(data->RefQ, data->RefP[5] + offset, stride, data->rounding); sad += sad8(data->CurV, data->RefQ, stride); break; } data->chromaSAD = sad; /* backup, part 2 */ return sad; } uint8_t * xvid_me_interpolate8x8qpel(const int x, const int y, const uint32_t block, const uint32_t dir, const SearchData * const data) { /* create or find a qpel-precision reference picture; return pointer to it */ uint8_t * Reference = data->RefQ + 16*dir; 
const uint32_t iEdgedWidth = data->iEdgedWidth; const uint32_t rounding = data->rounding; const int halfpel_x = x/2; const int halfpel_y = y/2; const uint8_t *ref1, *ref2, *ref3, *ref4; ref1 = GetReferenceB(halfpel_x, halfpel_y, dir, data); ref1 += 8 * (block&1) + 8 * (block>>1) * iEdgedWidth; switch( ((x&1)<<1) + (y&1) ) { case 3: /* x and y in qpel resolution - the "corners" (top left/right and */ /* bottom left/right) during qpel refinement */ ref2 = GetReferenceB(halfpel_x, y - halfpel_y, dir, data); ref3 = GetReferenceB(x - halfpel_x, halfpel_y, dir, data); ref4 = GetReferenceB(x - halfpel_x, y - halfpel_y, dir, data); ref2 += 8 * (block&1) + 8 * (block>>1) * iEdgedWidth; ref3 += 8 * (block&1) + 8 * (block>>1) * iEdgedWidth; ref4 += 8 * (block&1) + 8 * (block>>1) * iEdgedWidth; interpolate8x8_avg4(Reference, ref1, ref2, ref3, ref4, iEdgedWidth, rounding); break; case 1: /* x halfpel, y qpel - top or bottom during qpel refinement */ ref2 = GetReferenceB(halfpel_x, y - halfpel_y, dir, data); ref2 += 8 * (block&1) + 8 * (block>>1) * iEdgedWidth; interpolate8x8_avg2(Reference, ref1, ref2, iEdgedWidth, rounding, 8); break; case 2: /* x qpel, y halfpel - left or right during qpel refinement */ ref2 = GetReferenceB(x - halfpel_x, halfpel_y, dir, data); ref2 += 8 * (block&1) + 8 * (block>>1) * iEdgedWidth; interpolate8x8_avg2(Reference, ref1, ref2, iEdgedWidth, rounding, 8); break; default: /* pure halfpel position */ return (uint8_t *) ref1; } return Reference; } uint8_t * xvid_me_interpolate16x16qpel(const int x, const int y, const uint32_t dir, const SearchData * const data) { /* create or find a qpel-precision reference picture; return pointer to it */ uint8_t * Reference = data->RefQ + 16*dir; const uint32_t iEdgedWidth = data->iEdgedWidth; const uint32_t rounding = data->rounding; const int halfpel_x = x/2; const int halfpel_y = y/2; const uint8_t *ref1, *ref2, *ref3, *ref4; ref1 = GetReferenceB(halfpel_x, halfpel_y, dir, data); switch( ((x&1)<<1) + (y&1) ) { 
case 3: /* * x and y in qpel resolution - the "corners" (top left/right and * bottom left/right) during qpel refinement */ ref2 = GetReferenceB(halfpel_x, y - halfpel_y, dir, data); ref3 = GetReferenceB(x - halfpel_x, halfpel_y, dir, data); ref4 = GetReferenceB(x - halfpel_x, y - halfpel_y, dir, data); interpolate8x8_avg4(Reference, ref1, ref2, ref3, ref4, iEdgedWidth, rounding); interpolate8x8_avg4(Reference+8, ref1+8, ref2+8, ref3+8, ref4+8, iEdgedWidth, rounding); interpolate8x8_avg4(Reference+8*iEdgedWidth, ref1+8*iEdgedWidth, ref2+8*iEdgedWidth, ref3+8*iEdgedWidth, ref4+8*iEdgedWidth, iEdgedWidth, rounding); interpolate8x8_avg4(Reference+8*iEdgedWidth+8, ref1+8*iEdgedWidth+8, ref2+8*iEdgedWidth+8, ref3+8*iEdgedWidth+8, ref4+8*iEdgedWidth+8, iEdgedWidth, rounding); break; case 1: /* x halfpel, y qpel - top or bottom during qpel refinement */ ref2 = GetReferenceB(halfpel_x, y - halfpel_y, dir, data); interpolate8x8_avg2(Reference, ref1, ref2, iEdgedWidth, rounding, 8); interpolate8x8_avg2(Reference+8, ref1+8, ref2+8, iEdgedWidth, rounding, 8); interpolate8x8_avg2(Reference+8*iEdgedWidth, ref1+8*iEdgedWidth, ref2+8*iEdgedWidth, iEdgedWidth, rounding, 8); interpolate8x8_avg2(Reference+8*iEdgedWidth+8, ref1+8*iEdgedWidth+8, ref2+8*iEdgedWidth+8, iEdgedWidth, rounding, 8); break; case 2: /* x qpel, y halfpel - left or right during qpel refinement */ ref2 = GetReferenceB(x - halfpel_x, halfpel_y, dir, data); interpolate8x8_avg2(Reference, ref1, ref2, iEdgedWidth, rounding, 8); interpolate8x8_avg2(Reference+8, ref1+8, ref2+8, iEdgedWidth, rounding, 8); interpolate8x8_avg2(Reference+8*iEdgedWidth, ref1+8*iEdgedWidth, ref2+8*iEdgedWidth, iEdgedWidth, rounding, 8); interpolate8x8_avg2(Reference+8*iEdgedWidth+8, ref1+8*iEdgedWidth+8, ref2+8*iEdgedWidth+8, iEdgedWidth, rounding, 8); break; default: /* pure halfpel position */ return (uint8_t *) ref1; } return Reference; } void xvid_me_AdvDiamondSearch(int x, int y, SearchData * const data, int bDirection, CheckFunc * const 
CheckCandidate) { /* directions: 1 - left (x-1); 2 - right (x+1), 4 - up (y-1); 8 - down (y+1) */ unsigned int * const iDirection = &data->dir; for(;;) { /* forever */ *iDirection = 0; if (bDirection & 1) CHECK_CANDIDATE(x - iDiamondSize, y, 1); if (bDirection & 2) CHECK_CANDIDATE(x + iDiamondSize, y, 2); if (bDirection & 4) CHECK_CANDIDATE(x, y - iDiamondSize, 4); if (bDirection & 8) CHECK_CANDIDATE(x, y + iDiamondSize, 8); /* now we're doing diagonal checks near our candidate */ if (*iDirection) { /* if anything found */ bDirection = *iDirection; *iDirection = 0; x = data->currentMV->x; y = data->currentMV->y; if (bDirection & 3) { /* our candidate is left or right */ CHECK_CANDIDATE(x, y + iDiamondSize, 8); CHECK_CANDIDATE(x, y - iDiamondSize, 4); } else { /* what remains here is up or down */ CHECK_CANDIDATE(x + iDiamondSize, y, 2); CHECK_CANDIDATE(x - iDiamondSize, y, 1); } if (*iDirection) { bDirection += *iDirection; x = data->currentMV->x; y = data->currentMV->y; } } else { /* about to quit, eh? not so fast.... 
*/ switch (bDirection) { case 2: CHECK_CANDIDATE(x + iDiamondSize, y - iDiamondSize, 2 + 4); CHECK_CANDIDATE(x + iDiamondSize, y + iDiamondSize, 2 + 8); break; case 1: CHECK_CANDIDATE(x - iDiamondSize, y - iDiamondSize, 1 + 4); CHECK_CANDIDATE(x - iDiamondSize, y + iDiamondSize, 1 + 8); break; case 2 + 4: CHECK_CANDIDATE(x - iDiamondSize, y - iDiamondSize, 1 + 4); CHECK_CANDIDATE(x + iDiamondSize, y - iDiamondSize, 2 + 4); CHECK_CANDIDATE(x + iDiamondSize, y + iDiamondSize, 2 + 8); break; case 4: CHECK_CANDIDATE(x + iDiamondSize, y - iDiamondSize, 2 + 4); CHECK_CANDIDATE(x - iDiamondSize, y - iDiamondSize, 1 + 4); break; case 8: CHECK_CANDIDATE(x + iDiamondSize, y + iDiamondSize, 2 + 8); CHECK_CANDIDATE(x - iDiamondSize, y + iDiamondSize, 1 + 8); break; case 1 + 4: CHECK_CANDIDATE(x - iDiamondSize, y + iDiamondSize, 1 + 8); CHECK_CANDIDATE(x - iDiamondSize, y - iDiamondSize, 1 + 4); CHECK_CANDIDATE(x + iDiamondSize, y - iDiamondSize, 2 + 4); break; case 2 + 8: CHECK_CANDIDATE(x + iDiamondSize, y - iDiamondSize, 2 + 4); CHECK_CANDIDATE(x + iDiamondSize, y + iDiamondSize, 2 + 8); CHECK_CANDIDATE(x - iDiamondSize, y + iDiamondSize, 1 + 8); break; case 1 + 8: CHECK_CANDIDATE(x - iDiamondSize, y - iDiamondSize, 1 + 4); CHECK_CANDIDATE(x - iDiamondSize, y + iDiamondSize, 1 + 8); CHECK_CANDIDATE(x + iDiamondSize, y + iDiamondSize, 2 + 8); break; default: /* 1+2+4+8 == we didn't find anything at all */ CHECK_CANDIDATE(x - iDiamondSize, y - iDiamondSize, 1 + 4); CHECK_CANDIDATE(x - iDiamondSize, y + iDiamondSize, 1 + 8); CHECK_CANDIDATE(x + iDiamondSize, y - iDiamondSize, 2 + 4); CHECK_CANDIDATE(x + iDiamondSize, y + iDiamondSize, 2 + 8); break; } if (!*iDirection) break; /* ok, the end. 
really */ bDirection = *iDirection; x = data->currentMV->x; y = data->currentMV->y; } } } void xvid_me_SquareSearch(int x, int y, SearchData * const data, int bDirection, CheckFunc * const CheckCandidate) { unsigned int * const iDirection = &data->dir; do { *iDirection = 0; if (bDirection & 1) CHECK_CANDIDATE(x - iDiamondSize, y, 1+16+64); if (bDirection & 2) CHECK_CANDIDATE(x + iDiamondSize, y, 2+32+128); if (bDirection & 4) CHECK_CANDIDATE(x, y - iDiamondSize, 4+16+32); if (bDirection & 8) CHECK_CANDIDATE(x, y + iDiamondSize, 8+64+128); if (bDirection & 16) CHECK_CANDIDATE(x - iDiamondSize, y - iDiamondSize, 1+4+16+32+64); if (bDirection & 32) CHECK_CANDIDATE(x + iDiamondSize, y - iDiamondSize, 2+4+16+32+128); if (bDirection & 64) CHECK_CANDIDATE(x - iDiamondSize, y + iDiamondSize, 1+8+16+64+128); if (bDirection & 128) CHECK_CANDIDATE(x + iDiamondSize, y + iDiamondSize, 2+8+32+64+128); bDirection = *iDirection; x = data->currentMV->x; y = data->currentMV->y; } while (*iDirection); } void xvid_me_DiamondSearch(int x, int y, SearchData * const data, int bDirection, CheckFunc * const CheckCandidate) { /* directions: 1 - left (x-1); 2 - right (x+1), 4 - up (y-1); 8 - down (y+1) */ unsigned int * const iDirection = &data->dir; for (;;) { *iDirection = 0; if (bDirection & 1) CHECK_CANDIDATE(x - iDiamondSize, y, 1); if (bDirection & 2) CHECK_CANDIDATE(x + iDiamondSize, y, 2); if (bDirection & 4) CHECK_CANDIDATE(x, y - iDiamondSize, 4); if (bDirection & 8) CHECK_CANDIDATE(x, y + iDiamondSize, 8); if (*iDirection == 0) break; /* now we're doing diagonal checks near our candidate */ bDirection = *iDirection; x = data->currentMV->x; y = data->currentMV->y; if (bDirection & 3) { /* our candidate is left or right */ CHECK_CANDIDATE(x, y + iDiamondSize, 8); CHECK_CANDIDATE(x, y - iDiamondSize, 4); } else { /* what remains here is up or down */ CHECK_CANDIDATE(x + iDiamondSize, y, 2); CHECK_CANDIDATE(x - iDiamondSize, y, 1); } bDirection |= *iDirection; x = data->currentMV->x; 
		y = data->currentMV->y;
	}
}

void
xvid_me_SubpelRefine(VECTOR centerMV, SearchData * const data, CheckFunc * const CheckCandidate, int dir)
{
	/* Do a half-pel or q-pel refinement */

	CHECK_CANDIDATE(centerMV.x, centerMV.y - 1, dir);
	CHECK_CANDIDATE(centerMV.x + 1, centerMV.y - 1, dir);
	CHECK_CANDIDATE(centerMV.x + 1, centerMV.y, dir);
	CHECK_CANDIDATE(centerMV.x + 1, centerMV.y + 1, dir);
	CHECK_CANDIDATE(centerMV.x, centerMV.y + 1, dir);
	CHECK_CANDIDATE(centerMV.x - 1, centerMV.y + 1, dir);
	CHECK_CANDIDATE(centerMV.x - 1, centerMV.y, dir);
	CHECK_CANDIDATE(centerMV.x - 1, centerMV.y - 1, dir);
}

#define CHECK_CANDIDATE_2ndBEST(X, Y, DIR) { \
	*data->iMinSAD = s_best2; \
	CheckCandidate((X),(Y), data, direction); \
	if (data->iMinSAD[0] < s_best) { \
		s_best2 = s_best; \
		s_best = data->iMinSAD[0]; \
		v_best2 = v_best; \
		v_best.x = X; v_best.y = Y; \
		dir = DIR; \
	} else if (data->iMinSAD[0] < s_best2) { \
		s_best2 = data->iMinSAD[0]; \
		v_best2.x = X; v_best2.y = Y; \
	} \
}

void
FullRefine_Fast(SearchData * data, CheckFunc * CheckCandidate, int direction)
{
	/* Do a fast h-pel and then q-pel refinement */

	int32_t s_best = data->iMinSAD[0], s_best2 = 256*4096;
	VECTOR v_best, v_best2;
	int dir = 0, xo2, yo2, best_halfpel, b_cbp;

	int xo = 2*data->currentMV[0].x, yo = 2*data->currentMV[0].y;

	data->currentQMV[0].x = v_best.x = v_best2.x = xo;
	data->currentQMV[0].y = v_best.y = v_best2.y = yo;

	data->qpel_precision = 1;

	/* halfpel refinement: check 8 neighbours, but keep the second best SAD as well */
	CHECK_CANDIDATE_2ndBEST(xo - 2, yo, 1+16+64);
	CHECK_CANDIDATE_2ndBEST(xo + 2, yo, 2+32+128);
	CHECK_CANDIDATE_2ndBEST(xo, yo - 2, 4+16+32);
	CHECK_CANDIDATE_2ndBEST(xo, yo + 2, 8+64+128);
	CHECK_CANDIDATE_2ndBEST(xo - 2, yo - 2, 1+4+16+32+64);
	CHECK_CANDIDATE_2ndBEST(xo + 2, yo - 2, 2+4+16+32+128);
	CHECK_CANDIDATE_2ndBEST(xo - 2, yo + 2, 1+8+16+64+128);
	CHECK_CANDIDATE_2ndBEST(xo + 2, yo + 2, 2+8+32+64+128);

	xo = v_best.x; yo = v_best.y, b_cbp = data->cbp[0];

	/* we need all 8 neighbours *of
best hpel position found above* checked for 2nd best let's check the missing ones */ /* on rare occasions, 1st best and 2nd best are far away, and 2nd best is not 1st best's neighbour. to simplify stuff, we'll forget that evil 2nd best and make a full search for a new 2nd best */ /* todo. we should check the missing neighbours first, maybe they'll give us 2nd best which is even better than the infamous one. in that case, we will not have to re-check the other neighbours */ if (abs(v_best.x - v_best2.x) > 2 || abs(v_best.y - v_best2.y) > 2) { /* v_best2 is useless */ data->iMinSAD[0] = 256*4096; dir = ~0; /* all */ } else { data->iMinSAD[0] = s_best2; data->currentQMV[0] = v_best2; } if (dir & 1) CHECK_CANDIDATE( xo - 2, yo, direction); if (dir & 2) CHECK_CANDIDATE( xo + 2, yo, direction); if (dir & 4) CHECK_CANDIDATE( xo, yo - 2, direction); if (dir & 8) CHECK_CANDIDATE( xo, yo + 2, direction); if (dir & 16) CHECK_CANDIDATE( xo - 2, yo - 2, direction); if (dir & 32) CHECK_CANDIDATE( xo + 2, yo - 2, direction); if (dir & 64) CHECK_CANDIDATE( xo - 2, yo + 2, direction); if (dir & 128) CHECK_CANDIDATE( xo + 2, yo + 2, direction); /* read the position of 2nd best */ v_best2 = data->currentQMV[0]; /* after second_best has been found, go back to best vector */ data->currentQMV[0].x = xo; data->currentQMV[0].y = yo; data->cbp[0] = b_cbp; data->currentMV[0].x = xo/2; data->currentMV[0].y = yo/2; data->iMinSAD[0] = best_halfpel = s_best; xo2 = v_best2.x; yo2 = v_best2.y; s_best2 = 256*4096; if (yo == yo2) { CHECK_CANDIDATE_2ndBEST((xo+xo2)>>1, yo, 0); CHECK_CANDIDATE_2ndBEST(xo, yo-1, 0); CHECK_CANDIDATE_2ndBEST(xo, yo+1, 0); data->currentQMV[0] = v_best; data->iMinSAD[0] = s_best; if(best_halfpel <= s_best2) return; if(data->currentQMV[0].x == v_best2.x) { CHECK_CANDIDATE((xo+xo2)>>1, yo-1, direction); CHECK_CANDIDATE((xo+xo2)>>1, yo+1, direction); } else { CHECK_CANDIDATE((xo+xo2)>>1, (data->currentQMV[0].x == xo) ? 
data->currentQMV[0].y : v_best2.y, direction); } return; } if (xo == xo2) { CHECK_CANDIDATE_2ndBEST(xo, (yo+yo2)>>1, 0); CHECK_CANDIDATE_2ndBEST(xo-1, yo, 0); CHECK_CANDIDATE_2ndBEST(xo+1, yo, 0); data->currentQMV[0] = v_best; data->iMinSAD[0] = s_best; if(best_halfpel <= s_best2) return; if(data->currentQMV[0].y == v_best2.y) { CHECK_CANDIDATE(xo-1, (yo+yo2)>>1, direction); CHECK_CANDIDATE(xo+1, (yo+yo2)>>1, direction); } else { CHECK_CANDIDATE((data->currentQMV[0].y == yo) ? data->currentQMV[0].x : v_best2.x, (yo+yo2)>>1, direction); } return; } CHECK_CANDIDATE_2ndBEST(xo, (yo+yo2)>>1, 0); CHECK_CANDIDATE_2ndBEST((xo+xo2)>>1, yo, 0); data->currentQMV[0] = v_best; data->iMinSAD[0] = s_best; if(best_halfpel <= s_best2) return; CHECK_CANDIDATE((xo+xo2)>>1, (yo+yo2)>>1, direction); } /* it's the positive max, so "32" needs fcode of 2, not 1 */ unsigned int getMinFcode(const int MVmax) { unsigned int fcode; for (fcode = 1; (16 << fcode) <= MVmax; fcode++); return fcode; } xvidcore/src/motion/motion_comp.c0000664000076500007650000003331311564705453020263 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Motion Compensation related code - * * Copyright(C) 2002 Peter Ross * 2003 Christoph Lampert * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: motion_comp.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#include <stdio.h>

#include "../encoder.h"
#include "../utils/mbfunctions.h"
#include "../image/interpolate8x8.h"
#include "../image/qpel.h"
#include "../utils/timer.h"
#include "motion.h"

/*
 * get_ref: calculate the reference image pointer.
 * The choice between the h/v/hv interpolated planes and the plain image
 * is based on the low (half-pel) bits of dx and dy.
 */

static __inline const uint8_t *
get_ref(const uint8_t * const refn,
        const uint8_t * const refh,
        const uint8_t * const refv,
        const uint8_t * const refhv,
        const uint32_t x,
        const uint32_t y,
        const uint32_t block,
        const int32_t dx,
        const int32_t dy,
        const int32_t stride)
{
    switch (((dx & 1) << 1) + (dy & 1)) {
    case 0:
        return refn + (int) (((int)x * (int)block + dx / 2) + ((int)y * (int)block + dy / 2) * (int)stride);
    case 1:
        return refv + (int) (((int)x * (int)block + dx / 2) + ((int)y * (int)block + (dy - 1) / 2) * (int)stride);
    case 2:
        return refh + (int) (((int)x * (int)block + (dx - 1) / 2) + ((int)y * (int)block + dy / 2) * (int)stride);
    default:
        return refhv + (int) (((int)x * (int)block + (dx - 1) / 2) + ((int)y * (int)block + (dy - 1) / 2) * (int)stride);
    }
}

static __inline void
compensate16x16_interpolate(int16_t * const dct_codes,
                            uint8_t * const cur,
                            const uint8_t * const ref,
                            const uint8_t * const refh,
                            const uint8_t * const refv,
                            const uint8_t * const refhv,
                            uint8_t * const tmp,
                            uint32_t x,
                            uint32_t y,
                            const int32_t dx,
                            const int32_t dy,
                            const int32_t stride,
                            const int quarterpel,
                            const int32_t rounding)
{
    const uint8_t * ptr;

    if(quarterpel) {
        if ((dx&3) | (dy&3)) {
            interpolate16x16_quarterpel(tmp - y * stride - x,
                                        (uint8_t *) ref, tmp + 32,
                                        tmp + 64, tmp + 96, x, y, dx, dy, stride, rounding);
            ptr = tmp;
        }
else ptr = ref + ((int)y + dy/4)*(int)stride + (int)x + dx/4; /* fullpixel position */ } else ptr = get_ref(ref, refh, refv, refhv, x, y, 1, dx, dy, stride); transfer_8to16sub(dct_codes, cur + y * stride + x, ptr, stride); transfer_8to16sub(dct_codes+64, cur + y * stride + x + 8, ptr + 8, stride); transfer_8to16sub(dct_codes+128, cur + y * stride + x + 8*stride, ptr + 8*stride, stride); transfer_8to16sub(dct_codes+192, cur + y * stride + x + 8*stride+8, ptr + 8*stride + 8, stride); } static __inline void compensate8x8_interpolate( int16_t * const dct_codes, uint8_t * const cur, const uint8_t * const ref, const uint8_t * const refh, const uint8_t * const refv, const uint8_t * const refhv, uint8_t * const tmp, uint32_t x, uint32_t y, const int32_t dx, const int32_t dy, const int32_t stride, const int32_t quarterpel, const int32_t rounding) { const uint8_t * ptr; if(quarterpel) { if ((dx&3) | (dy&3)) { interpolate8x8_quarterpel(tmp - y*stride - x, (uint8_t *) ref, tmp + 32, tmp + 64, tmp + 96, x, y, dx, dy, stride, rounding); ptr = tmp; } else ptr = ref + ((int)y + dy/4)*(int)stride + (int)x + dx/4; /* fullpixel position */ } else ptr = get_ref(ref, refh, refv, refhv, x, y, 1, dx, dy, stride); transfer_8to16sub(dct_codes, cur + y * stride + x, ptr, stride); } static void CompensateChroma( int dx, int dy, const int i, const int j, IMAGE * const Cur, const IMAGE * const Ref, uint8_t * const temp, int16_t * const coeff, const int32_t stride, const int rounding) { /* uv-block-based compensation */ transfer_8to16sub(coeff, Cur->u + 8 * j * stride + 8 * i, interpolate8x8_switch2(temp, Ref->u, 8 * i, 8 * j, dx, dy, stride, rounding), stride); transfer_8to16sub(coeff + 64, Cur->v + 8 * j * stride + 8 * i, interpolate8x8_switch2(temp, Ref->v, 8 * i, 8 * j, dx, dy, stride, rounding), stride); } void MBMotionCompensation(MACROBLOCK * const mb, const uint32_t i, const uint32_t j, const IMAGE * const ref, const IMAGE * const refh, const IMAGE * const refv, const IMAGE * const 
refhv, const IMAGE * const refGMC, IMAGE * const cur, int16_t * dct_codes, const uint32_t width, const uint32_t height, const uint32_t edged_width, const int32_t quarterpel, const int32_t rounding, uint8_t * const tmp) { int32_t dx; int32_t dy; if (mb->mode == MODE_NOT_CODED) { /* quick copy for early SKIP */ /* early SKIP is only activated in P-VOPs, not in S-VOPs, so mcsel can never be 1 */ transfer16x16_copy(cur->y + 16 * (i + j * edged_width), ref->y + 16 * (i + j * edged_width), edged_width); transfer8x8_copy(cur->u + 8 * (i + j * edged_width/2), ref->u + 8 * (i + j * edged_width/2), edged_width / 2); transfer8x8_copy(cur->v + 8 * (i + j * edged_width/2), ref->v + 8 * (i + j * edged_width/2), edged_width / 2); return; } if ((mb->mode == MODE_NOT_CODED || mb->mode == MODE_INTER || mb->mode == MODE_INTER_Q)) { if (mb->mcsel) { /* call normal routine once, easier than "if (mcsel)"ing all the time */ transfer_8to16sub(&dct_codes[0*64], cur->y + 16*j*edged_width + 16*i, refGMC->y + 16*j*edged_width + 16*i, edged_width); transfer_8to16sub(&dct_codes[1*64], cur->y + 16*j*edged_width + 16*i+8, refGMC->y + 16*j*edged_width + 16*i+8, edged_width); transfer_8to16sub(&dct_codes[2*64], cur->y + (16*j+8)*edged_width + 16*i, refGMC->y + (16*j+8)*edged_width + 16*i, edged_width); transfer_8to16sub(&dct_codes[3*64], cur->y + (16*j+8)*edged_width + 16*i+8, refGMC->y + (16*j+8)*edged_width + 16*i+8, edged_width); transfer_8to16sub(&dct_codes[4 * 64], cur->u + 8 *j*edged_width/2 + 8*i, refGMC->u + 8 *j*edged_width/2 + 8*i, edged_width/2); transfer_8to16sub(&dct_codes[5 * 64], cur->v + 8*j* edged_width/2 + 8*i, refGMC->v + 8*j* edged_width/2 + 8*i, edged_width/2); return; } /* ordinary compensation */ dx = (quarterpel ? mb->qmvs[0].x : mb->mvs[0].x); dy = (quarterpel ? 
mb->qmvs[0].y : mb->mvs[0].y); compensate16x16_interpolate(&dct_codes[0 * 64], cur->y, ref->y, refh->y, refv->y, refhv->y, tmp, 16 * i, 16 * j, dx, dy, edged_width, quarterpel, rounding); if (quarterpel) { dx /= 2; dy /= 2; } dx = (dx >> 1) + roundtab_79[dx & 0x3]; dy = (dy >> 1) + roundtab_79[dy & 0x3]; } else { /* mode == MODE_INTER4V */ int k, sumx = 0, sumy = 0; const VECTOR * const mvs = (quarterpel ? mb->qmvs : mb->mvs); for (k = 0; k < 4; k++) { dx = mvs[k].x; dy = mvs[k].y; sumx += quarterpel ? dx/2 : dx; sumy += quarterpel ? dy/2 : dy; compensate8x8_interpolate(&dct_codes[k * 64], cur->y, ref->y, refh->y, refv->y, refhv->y, tmp, 16 * i + 8*(k&1), 16 * j + 8*(k>>1), dx, dy, edged_width, quarterpel, rounding); } dx = (sumx >> 3) + roundtab_76[sumx & 0xf]; dy = (sumy >> 3) + roundtab_76[sumy & 0xf]; } CompensateChroma(dx, dy, i, j, cur, ref, tmp, &dct_codes[4 * 64], edged_width / 2, rounding); } void MBMotionCompensationBVOP(MBParam * pParam, MACROBLOCK * const mb, const uint32_t i, const uint32_t j, IMAGE * const cur, const IMAGE * const f_ref, const IMAGE * const f_refh, const IMAGE * const f_refv, const IMAGE * const f_refhv, const IMAGE * const b_ref, const IMAGE * const b_refh, const IMAGE * const b_refv, const IMAGE * const b_refhv, int16_t * dct_codes, uint8_t * const tmp) { const uint32_t edged_width = pParam->edged_width; int32_t dx, dy, b_dx, b_dy, sumx, sumy, b_sumx, b_sumy; int k; const int quarterpel = pParam->vol_flags & XVID_VOL_QUARTERPEL; const uint8_t * ptr1, * ptr2; const VECTOR * const fmvs = (quarterpel ? mb->qmvs : mb->mvs); const VECTOR * const bmvs = (quarterpel ? 
mb->b_qmvs : mb->b_mvs); switch (mb->mode) { case MODE_FORWARD: dx = fmvs->x; dy = fmvs->y; compensate16x16_interpolate(&dct_codes[0 * 64], cur->y, f_ref->y, f_refh->y, f_refv->y, f_refhv->y, tmp, 16 * i, 16 * j, dx, dy, edged_width, quarterpel, 0); if (quarterpel) { dx /= 2; dy /= 2; } CompensateChroma( (dx >> 1) + roundtab_79[dx & 0x3], (dy >> 1) + roundtab_79[dy & 0x3], i, j, cur, f_ref, tmp, &dct_codes[4 * 64], edged_width / 2, 0); return; case MODE_BACKWARD: b_dx = bmvs->x; b_dy = bmvs->y; compensate16x16_interpolate(&dct_codes[0 * 64], cur->y, b_ref->y, b_refh->y, b_refv->y, b_refhv->y, tmp, 16 * i, 16 * j, b_dx, b_dy, edged_width, quarterpel, 0); if (quarterpel) { b_dx /= 2; b_dy /= 2; } CompensateChroma( (b_dx >> 1) + roundtab_79[b_dx & 0x3], (b_dy >> 1) + roundtab_79[b_dy & 0x3], i, j, cur, b_ref, tmp, &dct_codes[4 * 64], edged_width / 2, 0); return; case MODE_INTERPOLATE: case MODE_DIRECT_NO4V: dx = fmvs->x; dy = fmvs->y; b_dx = bmvs->x; b_dy = bmvs->y; if (quarterpel) { if ((dx&3) | (dy&3)) { interpolate16x16_quarterpel(tmp - i * 16 - j * 16 * edged_width, (uint8_t *) f_ref->y, tmp + 32, tmp + 64, tmp + 96, 16*i, 16*j, dx, dy, edged_width, 0); ptr1 = tmp; } else ptr1 = f_ref->y + (16*(int)j + dy/4)*(int)edged_width + 16*(int)i + dx/4; /* fullpixel position */ if ((b_dx&3) | (b_dy&3)) { interpolate16x16_quarterpel(tmp - i * 16 - j * 16 * edged_width + 16, (uint8_t *) b_ref->y, tmp + 32, tmp + 64, tmp + 96, 16*i, 16*j, b_dx, b_dy, edged_width, 0); ptr2 = tmp + 16; } else ptr2 = b_ref->y + (16*(int)j + b_dy/4)*(int)edged_width + 16*(int)i + b_dx/4; /* fullpixel position */ b_dx /= 2; b_dy /= 2; dx /= 2; dy /= 2; } else { ptr1 = get_ref(f_ref->y, f_refh->y, f_refv->y, f_refhv->y, i, j, 16, dx, dy, edged_width); ptr2 = get_ref(b_ref->y, b_refh->y, b_refv->y, b_refhv->y, i, j, 16, b_dx, b_dy, edged_width); } for (k = 0; k < 4; k++) transfer_8to16sub2(&dct_codes[k * 64], cur->y + (i * 16+(k&1)*8) + (j * 16+((k>>1)*8)) * edged_width, ptr1 + (k&1)*8 + 
(k>>1)*8*edged_width, ptr2 + (k&1)*8 + (k>>1)*8*edged_width, edged_width); dx = (dx >> 1) + roundtab_79[dx & 0x3]; dy = (dy >> 1) + roundtab_79[dy & 0x3]; b_dx = (b_dx >> 1) + roundtab_79[b_dx & 0x3]; b_dy = (b_dy >> 1) + roundtab_79[b_dy & 0x3]; break; default: /* MODE_DIRECT (or MODE_DIRECT_NONE_MV in case of bframes decoding) */ sumx = sumy = b_sumx = b_sumy = 0; for (k = 0; k < 4; k++) { dx = fmvs[k].x; dy = fmvs[k].y; b_dx = bmvs[k].x; b_dy = bmvs[k].y; if (quarterpel) { sumx += dx/2; sumy += dy/2; b_sumx += b_dx/2; b_sumy += b_dy/2; if ((dx&3) | (dy&3)) { interpolate8x8_quarterpel(tmp - (i * 16+(k&1)*8) - (j * 16+((k>>1)*8)) * edged_width, (uint8_t *) f_ref->y, tmp + 32, tmp + 64, tmp + 96, 16*i + (k&1)*8, 16*j + (k>>1)*8, dx, dy, edged_width, 0); ptr1 = tmp; } else ptr1 = f_ref->y + (16*(int)j + (k>>1)*8 + dy/4)*(int)edged_width + 16*(int)i + (k&1)*8 + dx/4; if ((b_dx&3) | (b_dy&3)) { interpolate8x8_quarterpel(tmp - (i * 16+(k&1)*8) - (j * 16+((k>>1)*8)) * edged_width + 16, (uint8_t *) b_ref->y, tmp + 16, tmp + 32, tmp + 48, 16*i + (k&1)*8, 16*j + (k>>1)*8, b_dx, b_dy, edged_width, 0); ptr2 = tmp + 16; } else ptr2 = b_ref->y + (16*(int)j + (k>>1)*8 + b_dy/4)*(int)edged_width + 16*(int)i + (k&1)*8 + b_dx/4; } else { sumx += dx; sumy += dy; b_sumx += b_dx; b_sumy += b_dy; ptr1 = get_ref(f_ref->y, f_refh->y, f_refv->y, f_refhv->y, 2*i + (k&1), 2*j + (k>>1), 8, dx, dy, edged_width); ptr2 = get_ref(b_ref->y, b_refh->y, b_refv->y, b_refhv->y, 2*i + (k&1), 2*j + (k>>1), 8, b_dx, b_dy, edged_width); } transfer_8to16sub2(&dct_codes[k * 64], cur->y + (i * 16+(k&1)*8) + (j * 16+((k>>1)*8)) * edged_width, ptr1, ptr2, edged_width); } dx = (sumx >> 3) + roundtab_76[sumx & 0xf]; dy = (sumy >> 3) + roundtab_76[sumy & 0xf]; b_dx = (b_sumx >> 3) + roundtab_76[b_sumx & 0xf]; b_dy = (b_sumy >> 3) + roundtab_76[b_sumy & 0xf]; break; } /* block-based chroma interpolation for direct and interpolate modes */ transfer_8to16sub2(&dct_codes[4 * 64], cur->u + (j * 8) * edged_width / 2 
+ (i * 8), interpolate8x8_switch2(tmp, b_ref->u, 8 * i, 8 * j, b_dx, b_dy, edged_width / 2, 0), interpolate8x8_switch2(tmp + 8, f_ref->u, 8 * i, 8 * j, dx, dy, edged_width / 2, 0), edged_width / 2); transfer_8to16sub2(&dct_codes[5 * 64], cur->v + (j * 8) * edged_width / 2 + (i * 8), interpolate8x8_switch2(tmp, b_ref->v, 8 * i, 8 * j, b_dx, b_dy, edged_width / 2, 0), interpolate8x8_switch2(tmp + 8, f_ref->v, 8 * i, 8 * j, dx, dy, edged_width / 2, 0), edged_width / 2); } xvidcore/src/motion/sad.c0000664000076500007650000002244711564705453016515 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Sum Of Absolute Difference related code - * * Copyright(C) 2001-2010 Peter Ross * 2010 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: sad.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#include "../portab.h"
#include "../global.h"
#include "sad.h"

#include <stdlib.h> /* abs() */

sad16FuncPtr sad16;
sad8FuncPtr sad8;
sad16biFuncPtr sad16bi;
sad8biFuncPtr sad8bi; /* not really sad16, but no difference in prototype */
dev16FuncPtr dev16;
sad16vFuncPtr sad16v;
sse8Func_16bitPtr sse8_16bit;
sse8Func_8bitPtr sse8_8bit;

sseh8Func_16bitPtr sseh8_16bit;
coeff8_energyFunc_Ptr coeff8_energy;
blocksum8Func_Ptr blocksum8;

sadInitFuncPtr sadInit;

uint32_t
sad16_c(const uint8_t * const cur,
        const uint8_t * const ref,
        const uint32_t stride,
        const uint32_t best_sad)
{
    uint32_t sad = 0;
    uint32_t j;
    uint8_t const *ptr_cur = cur;
    uint8_t const *ptr_ref = ref;

    for (j = 0; j < 16; j++) {
        sad += abs(ptr_cur[0] - ptr_ref[0]);
        sad += abs(ptr_cur[1] - ptr_ref[1]);
        sad += abs(ptr_cur[2] - ptr_ref[2]);
        sad += abs(ptr_cur[3] - ptr_ref[3]);
        sad += abs(ptr_cur[4] - ptr_ref[4]);
        sad += abs(ptr_cur[5] - ptr_ref[5]);
        sad += abs(ptr_cur[6] - ptr_ref[6]);
        sad += abs(ptr_cur[7] - ptr_ref[7]);
        sad += abs(ptr_cur[8] - ptr_ref[8]);
        sad += abs(ptr_cur[9] - ptr_ref[9]);
        sad += abs(ptr_cur[10] - ptr_ref[10]);
        sad += abs(ptr_cur[11] - ptr_ref[11]);
        sad += abs(ptr_cur[12] - ptr_ref[12]);
        sad += abs(ptr_cur[13] - ptr_ref[13]);
        sad += abs(ptr_cur[14] - ptr_ref[14]);
        sad += abs(ptr_cur[15] - ptr_ref[15]);

        /* early exit: this candidate can no longer beat the best SAD so far */
        if (sad >= best_sad)
            return sad;

        ptr_cur += stride;
        ptr_ref += stride;
    }

    return sad;
}

uint32_t
sad16bi_c(const uint8_t * const cur,
          const uint8_t * const ref1,
          const uint8_t * const ref2,
          const uint32_t stride)
{
    uint32_t sad = 0;
    uint32_t i, j;
    uint8_t const *ptr_cur = cur;
    uint8_t const *ptr_ref1 = ref1;
    uint8_t const *ptr_ref2 = ref2;

    for (j = 0; j < 16; j++) {
        for (i = 0; i <
16; i++) { int pixel = (ptr_ref1[i] + ptr_ref2[i] + 1) / 2; sad += abs(ptr_cur[i] - pixel); } ptr_cur += stride; ptr_ref1 += stride; ptr_ref2 += stride; } return sad; } uint32_t sad8bi_c(const uint8_t * const cur, const uint8_t * const ref1, const uint8_t * const ref2, const uint32_t stride) { uint32_t sad = 0; uint32_t i, j; uint8_t const *ptr_cur = cur; uint8_t const *ptr_ref1 = ref1; uint8_t const *ptr_ref2 = ref2; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { int pixel = (ptr_ref1[i] + ptr_ref2[i] + 1) / 2; sad += abs(ptr_cur[i] - pixel); } ptr_cur += stride; ptr_ref1 += stride; ptr_ref2 += stride; } return sad; } uint32_t sad8_c(const uint8_t * const cur, const uint8_t * const ref, const uint32_t stride) { uint32_t sad = 0; uint32_t j; uint8_t const *ptr_cur = cur; uint8_t const *ptr_ref = ref; for (j = 0; j < 8; j++) { sad += abs(ptr_cur[0] - ptr_ref[0]); sad += abs(ptr_cur[1] - ptr_ref[1]); sad += abs(ptr_cur[2] - ptr_ref[2]); sad += abs(ptr_cur[3] - ptr_ref[3]); sad += abs(ptr_cur[4] - ptr_ref[4]); sad += abs(ptr_cur[5] - ptr_ref[5]); sad += abs(ptr_cur[6] - ptr_ref[6]); sad += abs(ptr_cur[7] - ptr_ref[7]); ptr_cur += stride; ptr_ref += stride; } return sad; } /* average deviation from mean */ uint32_t dev16_c(const uint8_t * const cur, const uint32_t stride) { uint32_t mean = 0; uint32_t dev = 0; uint32_t i, j; uint8_t const *ptr_cur = cur; for (j = 0; j < 16; j++) { for (i = 0; i < 16; i++) mean += *(ptr_cur + i); ptr_cur += stride; } mean /= (16 * 16); ptr_cur = cur; for (j = 0; j < 16; j++) { for (i = 0; i < 16; i++) dev += abs(*(ptr_cur + i) - (int32_t) mean); ptr_cur += stride; } return dev; } uint32_t sad16v_c(const uint8_t * const cur, const uint8_t * const ref, const uint32_t stride, int32_t *sad) { sad[0] = sad8(cur, ref, stride); sad[1] = sad8(cur + 8, ref + 8, stride); sad[2] = sad8(cur + 8*stride, ref + 8*stride, stride); sad[3] = sad8(cur + 8*stride + 8, ref + 8*stride + 8, stride); return sad[0]+sad[1]+sad[2]+sad[3]; } uint32_t 
sad32v_c(const uint8_t * const cur, const uint8_t * const ref, const uint32_t stride, int32_t *sad) { sad[0] = sad16(cur, ref, stride, 256*4096); sad[1] = sad16(cur + 16, ref + 16, stride, 256*4096); sad[2] = sad16(cur + 16*stride, ref + 16*stride, stride, 256*4096); sad[3] = sad16(cur + 16*stride + 16, ref + 16*stride + 16, stride, 256*4096); return sad[0]+sad[1]+sad[2]+sad[3]; } #define MRSAD16_CORRFACTOR 8 uint32_t mrsad16_c(const uint8_t * const cur, const uint8_t * const ref, const uint32_t stride, const uint32_t best_sad) { uint32_t sad = 0; int32_t mean = 0; uint32_t i, j; uint8_t const *ptr_cur = cur; uint8_t const *ptr_ref = ref; for (j = 0; j < 16; j++) { for (i = 0; i < 16; i++) { mean += ((int) *(ptr_cur + i) - (int) *(ptr_ref + i)); } ptr_cur += stride; ptr_ref += stride; } mean /= 256; for (j = 0; j < 16; j++) { ptr_cur -= stride; ptr_ref -= stride; for (i = 0; i < 16; i++) { sad += abs(*(ptr_cur + i) - *(ptr_ref + i) - mean); if (sad >= best_sad) { return MRSAD16_CORRFACTOR * sad; } } } return MRSAD16_CORRFACTOR * sad; } uint32_t sse8_16bit_c(const int16_t * b1, const int16_t * b2, const uint32_t stride) { int i; int sse = 0; for (i=0; i<8; i++) { sse += (b1[0] - b2[0])*(b1[0] - b2[0]); sse += (b1[1] - b2[1])*(b1[1] - b2[1]); sse += (b1[2] - b2[2])*(b1[2] - b2[2]); sse += (b1[3] - b2[3])*(b1[3] - b2[3]); sse += (b1[4] - b2[4])*(b1[4] - b2[4]); sse += (b1[5] - b2[5])*(b1[5] - b2[5]); sse += (b1[6] - b2[6])*(b1[6] - b2[6]); sse += (b1[7] - b2[7])*(b1[7] - b2[7]); b1 = (const int16_t*)((int8_t*)b1+stride); b2 = (const int16_t*)((int8_t*)b2+stride); } return(sse); } uint32_t sse8_8bit_c(const uint8_t * b1, const uint8_t * b2, const uint32_t stride) { int i; int sse = 0; for (i=0; i<8; i++) { sse += (b1[0] - b2[0])*(b1[0] - b2[0]); sse += (b1[1] - b2[1])*(b1[1] - b2[1]); sse += (b1[2] - b2[2])*(b1[2] - b2[2]); sse += (b1[3] - b2[3])*(b1[3] - b2[3]); sse += (b1[4] - b2[4])*(b1[4] - b2[4]); sse += (b1[5] - b2[5])*(b1[5] - b2[5]); sse += (b1[6] - 
b2[6])*(b1[6] - b2[6]); sse += (b1[7] - b2[7])*(b1[7] - b2[7]); b1 = b1+stride; b2 = b2+stride; } return(sse); } /* PSNR-HVS-M helper functions */ static const int16_t iMask_Coeff[64] = { 0, 29788, 32767, 20479, 13653, 8192, 6425, 5372, 27306, 27306, 23405, 17246, 12603, 5650, 5461, 5958, 23405, 25205, 20479, 13653, 8192, 5749, 4749, 5851, 23405, 19275, 14894, 11299, 6425, 3766, 4096, 5285, 18204, 14894, 8856, 5851, 4819, 3006, 3181, 4255, 13653, 9362, 5958, 5120, 4045, 3151, 2900, 3562, 6687, 5120, 4201, 3766, 3181, 2708, 2730, 3244, 4551, 3562, 3449, 3344, 2926, 3277, 3181, 3310 }; /* Calculate CSF weighted energy of DCT coefficients */ uint32_t coeff8_energy_c(const int16_t * dct) { int x, y; uint32_t sum_a = 0; for (y = 0; y < 8; y += 2) { for (x = 0; x < 8; x += 2) { int16_t a0 = ((dct[y*8+x]<<4) * iMask_Coeff[y*8+x]) >> 16; int16_t a1 = ((dct[y*8+x+1]<<4) * iMask_Coeff[y*8+x+1]) >> 16; int16_t a2 = ((dct[(y+1)*8+x]<<4) * iMask_Coeff[(y+1)*8+x]) >> 16; int16_t a3 = ((dct[(y+1)*8+x+1]<<4) * iMask_Coeff[(y+1)*8+x+1]) >> 16; sum_a += ((a0*a0 + a1*a1 + a2*a2 + a3*a3) >> 3); } } return sum_a; } /* Calculate MSE of DCT coeffs reduced by masking effect */ uint32_t sseh8_16bit_c(const int16_t * cur, const int16_t * ref, uint16_t mask) { int j, i; uint32_t mse_h = 0; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { uint32_t t = (mask * Inv_iMask_Coeff[j*8+i] + 32) >> 7; uint16_t u = abs(cur[j*8+i] - ref[j*8+i]) << 4; uint16_t thresh = (t < 65536) ? 
t : 65535; if (u < thresh) u = 0; /* The error is not perceivable */ else u -= thresh; u = ((u + iCSF_Round[j*8 + i]) * iCSF_Coeff[j*8 + i]) >> 16; mse_h += (uint32_t) (u * u); } } return mse_h; } /* Sums all pixels of 8x8 block */ uint32_t blocksum8_c(const uint8_t * cur, int stride, uint16_t sums[4], uint32_t squares[4]) { int i, j; uint32_t sum = 0; sums[0] = sums[1] = sums[2] = sums[3] = 0; squares[0] = squares[1] = squares[2] = squares[3] = 0; for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { uint8_t p = cur[j*stride + i]; sums[(j>>2)*2 + (i>>2)] += p; squares[(j>>2)*2 + (i>>2)] += p*p; sum += p; } } return sum; } xvidcore/src/nasm.inc0000664000076500007650000001043111531721471015704 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - NASM common header - ; * ; * Copyright (C) 2008 Michael Militzer ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: nasm.inc,v 1.7.2.2 2011-02-25 12:40:25 Isibaar Exp $ ; * ; ***************************************************************************/ %ifdef ARCH_IS_X86_64 BITS 64 DEFAULT REL %define SECTION_ALIGN 32 %ifdef WINDOWS %define prm1 rcx %define prm2 rdx %define prm3 r8 %define prm4 r9 %define prm5 [rsp+40] %define prm6 [rsp+48] %define prm7 [rsp+56] %define prm8 [rsp+64] %define prm1d ecx %define prm2d edx %define prm3d r8d %define prm4d r9d %define prm5d dword prm5 %define prm6d dword prm6 %define prm7d dword prm7 %define prm8d dword prm8 %macro PUSH_XMM6_XMM7 0 movdqa [_ESP+PTR_SIZE], xmm6 movdqa [_ESP+PTR_SIZE+16], xmm7 %endmacro %macro POP_XMM6_XMM7 0 movdqa xmm6, [_ESP+PTR_SIZE] movdqa xmm7, [_ESP+PTR_SIZE+16] %endmacro %else ; Linux %define prm1 rdi %define prm2 rsi %define prm3 rdx %define prm4 rcx %define prm5 r8 %define prm6 r9 %define prm7 [rsp+8] %define prm8 [rsp+16] %define prm1d edi %define prm2d esi %define prm3d edx %define prm4d ecx %define prm5d r8d %define prm6d r9d %define prm7d dword prm7 %define prm8d dword prm8 %define PUSH_XMM6_XMM7 %define POP_XMM6_XMM7 %endif %define _EAX rax %define _EBX rbx %define _ECX rcx %define _EDX rdx %define _ESI rsi %define _EDI rdi %define _EBP rbp %define _ESP rsp %define TMP0 r10 %define TMP1 r11 %define TMP0d r10d %define TMP1d r11d %define PTR_SIZE 8 %define PTR_TYPE qword %ifdef __YASM_VERSION_ID__ %define XVID_MOVSXD movsxd %else %define XVID_MOVSXD movsx %endif %else %define SECTION_ALIGN 16 BITS 32 %define prm1 [esp + 4] %define prm2 [esp + 8] %define prm3 [esp + 12] %define prm4 [esp + 16] %define prm5 [esp + 20] %define prm6 [esp + 24] %define prm7 [esp + 28] %define prm8 [esp + 32] %define prm1d dword prm1 %define prm2d dword prm2 %define prm3d dword prm3 %define prm4d dword prm4 
%define prm5d dword prm5 %define prm6d dword prm6 %define prm7d dword prm7 %define prm8d dword prm8 %define _EAX eax %define _EBX ebx %define _ECX ecx %define _EDX edx %define _ESI esi %define _EDI edi %define _EBP ebp %define _ESP esp %define TMP0 ecx %define TMP1 edx %define TMP0d ecx %define TMP1d edx %define PTR_SIZE 4 %define PTR_TYPE dword %define PUSH_XMM6_XMM7 %define POP_XMM6_XMM7 %define XVID_MOVSXD movsx %endif %ifdef WINDOWS %define PREFIX %endif %ifdef NO_PREFIX %undef PREFIX %endif %macro DATA 0 %ifdef FORMAT_COFF SECTION .rodata %else SECTION .rodata align=SECTION_ALIGN %endif %endmacro %macro TEXT 0 %ifidn __OUTPUT_FORMAT__,macho32 SECTION .text align=SECTION_ALIGN %else %ifidn __OUTPUT_FORMAT__,macho64 SECTION .text align=SECTION_ALIGN %else SECTION .rotext align=SECTION_ALIGN %endif %endif %endmacro %macro cglobal 1 %ifdef PREFIX %ifdef MARK_FUNCS global _%1:function %1.endfunc-%1 %define %1 _%1:function %1.endfunc-%1 %define ENDFUNC .endfunc: %else global _%1 %define %1 _%1 %define ENDFUNC %endif %else %ifdef MARK_FUNCS global %1:function %1.endfunc-%1 %define ENDFUNC .endfunc: %else global %1 %define ENDFUNC %endif %endif %endmacro %macro NON_EXEC_STACK 0 %ifidn __OUTPUT_FORMAT__,elf section .note.GNU-stack noalloc noexec nowrite progbits %endif %ifidn __OUTPUT_FORMAT__,elf32 section .note.GNU-stack noalloc noexec nowrite progbits %endif %ifidn __OUTPUT_FORMAT__,elf64 section .note.GNU-stack noalloc noexec nowrite progbits %endif %endmacro xvidcore/src/xvid.c0000664000076500007650000006363411566410431015405 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Native API implementation - * * Copyright(C) 2001-2011 Peter Ross * 2002-2011 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * 
(at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY ; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: xvid.c 2011 2011-05-23 07:47:37Z Isibaar $
 *
 ****************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#if !defined(_WIN32)
#include <unistd.h>
#endif

#if defined(__APPLE__) && defined(__MACH__) && !defined(_SC_NPROCESSORS_CONF)
#include <sys/types.h>
#include <sys/sysctl.h>
#ifdef MAX
#undef MAX
#endif
#ifdef MIN
#undef MIN
#endif
#endif

#if defined(__amigaos4__)
#include <exec/exec.h>
#include <proto/exec.h>
#endif

#include "xvid.h"
#include "decoder.h"
#include "encoder.h"
#include "bitstream/cbp.h"
#include "dct/idct.h"
#include "dct/fdct.h"
#include "image/colorspace.h"
#include "image/interpolate8x8.h"
#include "utils/mem_transfer.h"
#include "utils/mbfunctions.h"
#include "quant/quant.h"
#include "motion/motion.h"
#include "motion/gmc.h"
#include "motion/sad.h"
#include "utils/emms.h"
#include "utils/timer.h"
#include "bitstream/mbcoding.h"
#include "image/qpel.h"
#include "image/postprocessing.h"

#if defined(_DEBUG)
unsigned int xvid_debug = 0; /* xvid debug mask */
#endif

#if (defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64)) && defined(_MSC_VER)
# include <windows.h>
#elif defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) || (defined(ARCH_IS_PPC) && !defined(__amigaos4__))
# include <signal.h>
# include <setjmp.h>

static jmp_buf mark;

static void
sigill_handler(int signal)
{
    longjmp(mark, 1);
}
#endif

/*
 * Calls the funcptr, and returns whether SIGILL (illegal instruction) was
 * signalled
 *
 * Return values:
 *  -1 : could not determine
 *   0 : SIGILL was *not* signalled
 *   1 : SIGILL was signalled
 */
#if
(defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64)) && defined(_MSC_VER) static int sigill_check(void (*func)()) { _try { func(); } _except(EXCEPTION_EXECUTE_HANDLER) { if (_exception_code() == STATUS_ILLEGAL_INSTRUCTION) return(1); } return(0); } #elif defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) || (defined(ARCH_IS_PPC) && !defined(__amigaos4__)) static int sigill_check(void (*func)()) { void *old_handler; int jmpret; /* Set our SIGILL handler */ old_handler = signal(SIGILL, sigill_handler); /* Check for error */ if (old_handler == SIG_ERR) { return(-1); } /* Save stack context, so if func triggers a SIGILL, we can still roll * back to a valid CPU state */ jmpret = setjmp(mark); /* If setjmp returned directly, then its returned value is 0, and we still * have to test the passed func. Otherwise it means the stack context has * been restored by a longjmp() call, which in our case happens only in the * signal handler */ if (jmpret == 0) { func(); } /* Restore old signal handler */ signal(SIGILL, old_handler); return(jmpret); } #endif /* detect cpu flags */ static unsigned int detect_cpu_flags(void) { /* enable native assembly optimizations by default */ unsigned int cpu_flags = XVID_CPU_ASM; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) cpu_flags |= check_cpu_features(); if ((cpu_flags & XVID_CPU_SSE) && sigill_check(sse_os_trigger)) cpu_flags &= ~XVID_CPU_SSE; if ((cpu_flags & (XVID_CPU_SSE2|XVID_CPU_SSE3|XVID_CPU_SSE41)) && sigill_check(sse2_os_trigger)) cpu_flags &= ~(XVID_CPU_SSE2|XVID_CPU_SSE3|XVID_CPU_SSE41); #endif #if defined(ARCH_IS_PPC) #if defined(__amigaos4__) { uint32_t vector_unit = VECTORTYPE_NONE; IExec->GetCPUInfoTags(GCIT_VectorUnit, &vector_unit, TAG_END); if (vector_unit == VECTORTYPE_ALTIVEC) { cpu_flags |= XVID_CPU_ALTIVEC; } } #else if (!sigill_check(altivec_trigger)) cpu_flags |= XVID_CPU_ALTIVEC; #endif #endif return cpu_flags; } /***************************************************************************** * Xvid Init Entry point 
 *
 * This function initializes all internal function pointers according
 * to the CPU features forced by the library client or autodetected (depending
 * on the XVID_CPU_FORCE flag). It also initializes the VLC coding tables and
 * all image colorspace transformation tables.
 *
 * Returned value : XVID_ERR_OK
 *                  + API_VERSION in the input XVID_INIT_PARAM structure
 *                  + core build " " " " "
 *
 ****************************************************************************/

static
int xvid_gbl_init(xvid_gbl_init_t * init)
{
    unsigned int cpu_flags;

    if (XVID_VERSION_MAJOR(init->version) != 1) /* v1.x.x */
        return XVID_ERR_VERSION;

    cpu_flags = (init->cpu_flags & XVID_CPU_FORCE) ? init->cpu_flags : detect_cpu_flags();

    /* Initialize the function pointers */
    init_vlc_tables();

    /* Fixed Point Forward/Inverse DCT transformations */
    fdct = fdct_int32;
    idct = idct_int32;

    /* Only needed on PPC Altivec archs */
    sadInit = NULL;

    /* Restore FPU context : emms_c is a no-op function */
    emms = emms_c;

    /* Qpel stuff */
    xvid_QP_Funcs = &xvid_QP_Funcs_C;
    xvid_QP_Add_Funcs = &xvid_QP_Add_Funcs_C;
    xvid_Init_QP();

    /* Quantization functions */
    quant_h263_intra = quant_h263_intra_c;
    quant_h263_inter = quant_h263_inter_c;
    dequant_h263_intra = dequant_h263_intra_c;
    dequant_h263_inter = dequant_h263_inter_c;

    quant_mpeg_intra = quant_mpeg_intra_c;
    quant_mpeg_inter = quant_mpeg_inter_c;
    dequant_mpeg_intra = dequant_mpeg_intra_c;
    dequant_mpeg_inter = dequant_mpeg_inter_c;

    /* Block transfer related functions */
    transfer_8to16copy = transfer_8to16copy_c;
    transfer_16to8copy = transfer_16to8copy_c;
    transfer_8to16sub = transfer_8to16sub_c;
    transfer_8to16subro = transfer_8to16subro_c;
    transfer_8to16sub2 = transfer_8to16sub2_c;
    transfer_8to16sub2ro = transfer_8to16sub2ro_c;
    transfer_16to8add = transfer_16to8add_c;
    transfer8x8_copy = transfer8x8_copy_c;
    transfer8x4_copy = transfer8x4_copy_c;

    /* Interlacing functions */
    MBFieldTest = MBFieldTest_c;

    /* Image interpolation related functions */
    interpolate8x8_halfpel_h =
interpolate8x8_halfpel_h_c; interpolate8x8_halfpel_v = interpolate8x8_halfpel_v_c; interpolate8x8_halfpel_hv = interpolate8x8_halfpel_hv_c; interpolate8x4_halfpel_h = interpolate8x4_halfpel_h_c; interpolate8x4_halfpel_v = interpolate8x4_halfpel_v_c; interpolate8x4_halfpel_hv = interpolate8x4_halfpel_hv_c; interpolate8x8_halfpel_add = interpolate8x8_halfpel_add_c; interpolate8x8_halfpel_h_add = interpolate8x8_halfpel_h_add_c; interpolate8x8_halfpel_v_add = interpolate8x8_halfpel_v_add_c; interpolate8x8_halfpel_hv_add = interpolate8x8_halfpel_hv_add_c; interpolate16x16_lowpass_h = interpolate16x16_lowpass_h_c; interpolate16x16_lowpass_v = interpolate16x16_lowpass_v_c; interpolate16x16_lowpass_hv = interpolate16x16_lowpass_hv_c; interpolate8x8_lowpass_h = interpolate8x8_lowpass_h_c; interpolate8x8_lowpass_v = interpolate8x8_lowpass_v_c; interpolate8x8_lowpass_hv = interpolate8x8_lowpass_hv_c; interpolate8x8_6tap_lowpass_h = interpolate8x8_6tap_lowpass_h_c; interpolate8x8_6tap_lowpass_v = interpolate8x8_6tap_lowpass_v_c; interpolate8x8_avg2 = interpolate8x8_avg2_c; interpolate8x8_avg4 = interpolate8x8_avg4_c; /* postprocessing */ image_brightness = image_brightness_c; /* Initialize internal colorspace transformation tables */ colorspace_init(); /* All colorspace transformation functions User Format->YV12 */ yv12_to_yv12 = yv12_to_yv12_c; rgb555_to_yv12 = rgb555_to_yv12_c; rgb565_to_yv12 = rgb565_to_yv12_c; rgb_to_yv12 = rgb_to_yv12_c; bgr_to_yv12 = bgr_to_yv12_c; bgra_to_yv12 = bgra_to_yv12_c; abgr_to_yv12 = abgr_to_yv12_c; rgba_to_yv12 = rgba_to_yv12_c; argb_to_yv12 = argb_to_yv12_c; yuyv_to_yv12 = yuyv_to_yv12_c; uyvy_to_yv12 = uyvy_to_yv12_c; rgb555i_to_yv12 = rgb555i_to_yv12_c; rgb565i_to_yv12 = rgb565i_to_yv12_c; bgri_to_yv12 = bgri_to_yv12_c; bgrai_to_yv12 = bgrai_to_yv12_c; abgri_to_yv12 = abgri_to_yv12_c; rgbai_to_yv12 = rgbai_to_yv12_c; argbi_to_yv12 = argbi_to_yv12_c; yuyvi_to_yv12 = yuyvi_to_yv12_c; uyvyi_to_yv12 = uyvyi_to_yv12_c; /* All colorspace 
transformation functions YV12->User format */ yv12_to_rgb555 = yv12_to_rgb555_c; yv12_to_rgb565 = yv12_to_rgb565_c; yv12_to_rgb = yv12_to_rgb_c; yv12_to_bgr = yv12_to_bgr_c; yv12_to_bgra = yv12_to_bgra_c; yv12_to_abgr = yv12_to_abgr_c; yv12_to_rgba = yv12_to_rgba_c; yv12_to_argb = yv12_to_argb_c; yv12_to_yuyv = yv12_to_yuyv_c; yv12_to_uyvy = yv12_to_uyvy_c; yv12_to_rgb555i = yv12_to_rgb555i_c; yv12_to_rgb565i = yv12_to_rgb565i_c; yv12_to_bgri = yv12_to_bgri_c; yv12_to_bgrai = yv12_to_bgrai_c; yv12_to_abgri = yv12_to_abgri_c; yv12_to_rgbai = yv12_to_rgbai_c; yv12_to_argbi = yv12_to_argbi_c; yv12_to_yuyvi = yv12_to_yuyvi_c; yv12_to_uyvyi = yv12_to_uyvyi_c; /* Functions used in motion estimation algorithms */ calc_cbp = calc_cbp_c; sad16 = sad16_c; sad8 = sad8_c; sad16bi = sad16bi_c; sad8bi = sad8bi_c; dev16 = dev16_c; sad16v = sad16v_c; sse8_16bit = sse8_16bit_c; sse8_8bit = sse8_8bit_c; sseh8_16bit = sseh8_16bit_c; coeff8_energy = coeff8_energy_c; blocksum8 = blocksum8_c; init_GMC(cpu_flags); #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) if ((cpu_flags & XVID_CPU_MMX) || (cpu_flags & XVID_CPU_MMXEXT) || (cpu_flags & XVID_CPU_3DNOW) || (cpu_flags & XVID_CPU_3DNOWEXT) || (cpu_flags & XVID_CPU_SSE) || (cpu_flags & XVID_CPU_SSE2) || (cpu_flags & XVID_CPU_SSE3) || (cpu_flags & XVID_CPU_SSE41)) { /* Restore FPU context : emms_c is a nop functions */ emms = emms_mmx; } if ((cpu_flags & XVID_CPU_MMX)) { /* Forward and Inverse Discrete Cosine Transformation functions */ fdct = fdct_mmx_skal; idct = idct_mmx; /* Qpel stuff */ xvid_QP_Funcs = &xvid_QP_Funcs_mmx; xvid_QP_Add_Funcs = &xvid_QP_Add_Funcs_mmx; /* Quantization related functions */ quant_h263_intra = quant_h263_intra_mmx; quant_h263_inter = quant_h263_inter_mmx; dequant_h263_intra = dequant_h263_intra_mmx; dequant_h263_inter = dequant_h263_inter_mmx; quant_mpeg_intra = quant_mpeg_intra_mmx; quant_mpeg_inter = quant_mpeg_inter_mmx; dequant_mpeg_intra = dequant_mpeg_intra_mmx; dequant_mpeg_inter = 
dequant_mpeg_inter_mmx; /* Block related functions */ transfer_8to16copy = transfer_8to16copy_mmx; transfer_16to8copy = transfer_16to8copy_mmx; transfer_8to16sub = transfer_8to16sub_mmx; transfer_8to16subro = transfer_8to16subro_mmx; transfer_8to16sub2 = transfer_8to16sub2_mmx; transfer_16to8add = transfer_16to8add_mmx; transfer8x8_copy = transfer8x8_copy_mmx; transfer8x4_copy = transfer8x4_copy_mmx; /* Interlacing Functions */ MBFieldTest = MBFieldTest_mmx; /* Image Interpolation related functions */ interpolate8x8_halfpel_h = interpolate8x8_halfpel_h_mmx; interpolate8x8_halfpel_v = interpolate8x8_halfpel_v_mmx; interpolate8x8_halfpel_hv = interpolate8x8_halfpel_hv_mmx; interpolate8x4_halfpel_h = interpolate8x4_halfpel_h_mmx; interpolate8x4_halfpel_v = interpolate8x4_halfpel_v_mmx; interpolate8x4_halfpel_hv = interpolate8x4_halfpel_hv_mmx; interpolate8x8_halfpel_add = interpolate8x8_halfpel_add_mmx; interpolate8x8_halfpel_h_add = interpolate8x8_halfpel_h_add_mmx; interpolate8x8_halfpel_v_add = interpolate8x8_halfpel_v_add_mmx; interpolate8x8_halfpel_hv_add = interpolate8x8_halfpel_hv_add_mmx; interpolate8x8_6tap_lowpass_h = interpolate8x8_6tap_lowpass_h_mmx; interpolate8x8_6tap_lowpass_v = interpolate8x8_6tap_lowpass_v_mmx; interpolate8x8_avg2 = interpolate8x8_avg2_mmx; interpolate8x8_avg4 = interpolate8x8_avg4_mmx; /* postprocessing */ image_brightness = image_brightness_mmx; /* image input xxx_to_yv12 related functions */ yv12_to_yv12 = yv12_to_yv12_mmx; bgr_to_yv12 = bgr_to_yv12_mmx; rgb_to_yv12 = rgb_to_yv12_mmx; bgra_to_yv12 = bgra_to_yv12_mmx; rgba_to_yv12 = rgba_to_yv12_mmx; yuyv_to_yv12 = yuyv_to_yv12_mmx; uyvy_to_yv12 = uyvy_to_yv12_mmx; /* image output yv12_to_xxx related functions */ yv12_to_bgr = yv12_to_bgr_mmx; yv12_to_bgra = yv12_to_bgra_mmx; yv12_to_yuyv = yv12_to_yuyv_mmx; yv12_to_uyvy = yv12_to_uyvy_mmx; yv12_to_yuyvi = yv12_to_yuyvi_mmx; yv12_to_uyvyi = yv12_to_uyvyi_mmx; /* Motion estimation related functions */ calc_cbp = calc_cbp_mmx; sad16 = 
sad16_mmx; sad8 = sad8_mmx; sad16bi = sad16bi_mmx; sad8bi = sad8bi_mmx; dev16 = dev16_mmx; sad16v = sad16v_mmx; sse8_16bit = sse8_16bit_mmx; sse8_8bit = sse8_8bit_mmx; } /* these 3dnow functions are faster than mmx, but slower than xmm. */ if ((cpu_flags & XVID_CPU_3DNOW)) { emms = emms_3dn; /* ME functions */ sad16bi = sad16bi_3dn; sad8bi = sad8bi_3dn; yuyv_to_yv12 = yuyv_to_yv12_3dn; uyvy_to_yv12 = uyvy_to_yv12_3dn; } if ((cpu_flags & XVID_CPU_MMXEXT)) { /* DCT */ fdct = fdct_xmm_skal; idct = idct_xmm; /* Interpolation */ interpolate8x8_halfpel_h = interpolate8x8_halfpel_h_xmm; interpolate8x8_halfpel_v = interpolate8x8_halfpel_v_xmm; interpolate8x8_halfpel_hv = interpolate8x8_halfpel_hv_xmm; interpolate8x4_halfpel_h = interpolate8x4_halfpel_h_xmm; interpolate8x4_halfpel_v = interpolate8x4_halfpel_v_xmm; interpolate8x4_halfpel_hv = interpolate8x4_halfpel_hv_xmm; interpolate8x8_halfpel_add = interpolate8x8_halfpel_add_xmm; interpolate8x8_halfpel_h_add = interpolate8x8_halfpel_h_add_xmm; interpolate8x8_halfpel_v_add = interpolate8x8_halfpel_v_add_xmm; interpolate8x8_halfpel_hv_add = interpolate8x8_halfpel_hv_add_xmm; /* Quantization */ quant_mpeg_inter = quant_mpeg_inter_xmm; dequant_h263_intra = dequant_h263_intra_xmm; dequant_h263_inter = dequant_h263_inter_xmm; /* Buffer transfer */ transfer_8to16sub2 = transfer_8to16sub2_xmm; transfer_8to16sub2ro = transfer_8to16sub2ro_xmm; /* Colorspace transformation */ /* yv12_to_yv12 = yv12_to_yv12_xmm; */ /* appears to be slow on many machines */ yuyv_to_yv12 = yuyv_to_yv12_xmm; uyvy_to_yv12 = uyvy_to_yv12_xmm; /* ME functions */ sad16 = sad16_xmm; sad8 = sad8_xmm; sad16bi = sad16bi_xmm; sad8bi = sad8bi_xmm; dev16 = dev16_xmm; sad16v = sad16v_xmm; } if ((cpu_flags & XVID_CPU_3DNOW)) { /* Interpolation */ interpolate8x8_halfpel_h = interpolate8x8_halfpel_h_3dn; interpolate8x8_halfpel_v = interpolate8x8_halfpel_v_3dn; interpolate8x8_halfpel_hv = interpolate8x8_halfpel_hv_3dn; interpolate8x4_halfpel_h = 
interpolate8x4_halfpel_h_3dn; interpolate8x4_halfpel_v = interpolate8x4_halfpel_v_3dn; interpolate8x4_halfpel_hv = interpolate8x4_halfpel_hv_3dn; } if ((cpu_flags & XVID_CPU_3DNOWEXT)) { /* Buffer transfer */ transfer_8to16copy = transfer_8to16copy_3dne; transfer_16to8copy = transfer_16to8copy_3dne; transfer_8to16sub = transfer_8to16sub_3dne; transfer_8to16subro = transfer_8to16subro_3dne; transfer_16to8add = transfer_16to8add_3dne; transfer8x8_copy = transfer8x8_copy_3dne; transfer8x4_copy = transfer8x4_copy_3dne; if ((cpu_flags & XVID_CPU_MMXEXT)) { /* Inverse DCT */ idct = idct_3dne; /* Buffer transfer */ transfer_8to16sub2 = transfer_8to16sub2_3dne; /* Interpolation */ interpolate8x8_halfpel_h = interpolate8x8_halfpel_h_3dne; interpolate8x8_halfpel_v = interpolate8x8_halfpel_v_3dne; interpolate8x8_halfpel_hv = interpolate8x8_halfpel_hv_3dne; interpolate8x4_halfpel_h = interpolate8x4_halfpel_h_3dne; interpolate8x4_halfpel_v = interpolate8x4_halfpel_v_3dne; interpolate8x4_halfpel_hv = interpolate8x4_halfpel_hv_3dne; /* Quantization */ quant_h263_intra = quant_h263_intra_3dne; /* cmov only */ quant_h263_inter = quant_h263_inter_3dne; dequant_mpeg_intra = dequant_mpeg_intra_3dne; /* cmov only */ dequant_mpeg_inter = dequant_mpeg_inter_3dne; dequant_h263_intra = dequant_h263_intra_3dne; dequant_h263_inter = dequant_h263_inter_3dne; /* ME functions */ sad16 = sad16_3dne; sad8 = sad8_3dne; sad16bi = sad16bi_3dne; sad8bi = sad8bi_3dne; dev16 = dev16_3dne; } } if ((cpu_flags & XVID_CPU_SSE2)) { calc_cbp = calc_cbp_sse2; /* Quantization */ quant_h263_intra = quant_h263_intra_sse2; quant_h263_inter = quant_h263_inter_sse2; dequant_h263_intra = dequant_h263_intra_sse2; dequant_h263_inter = dequant_h263_inter_sse2; /* SAD operators */ sad16 = sad16_sse2; dev16 = dev16_sse2; /* PSNR-HVS-M distortion metric */ sseh8_16bit = sseh8_16bit_sse2; coeff8_energy = coeff8_energy_sse2; blocksum8 = blocksum8_sse2; /* DCT operators */ fdct = fdct_sse2_skal; idct = idct_sse2_skal; /* Is 
now IEEE1180 and Walken compliant. */ /* postprocessing */ image_brightness = image_brightness_sse2; } if ((cpu_flags & XVID_CPU_SSE3)) { /* SAD operators */ sad16 = sad16_sse3; dev16 = dev16_sse3; } #endif /* ARCH_IS_IA32 */ #if defined(ARCH_IS_IA64) if ((cpu_flags & XVID_CPU_ASM)) { /* use assembler routines? */ idct_ia64_init(); fdct = fdct_ia64; idct = idct_ia64; /*not yet working, crashes */ interpolate8x8_halfpel_h = interpolate8x8_halfpel_h_ia64; interpolate8x8_halfpel_v = interpolate8x8_halfpel_v_ia64; interpolate8x8_halfpel_hv = interpolate8x8_halfpel_hv_ia64; sad16 = sad16_ia64; sad16bi = sad16bi_ia64; sad8 = sad8_ia64; dev16 = dev16_ia64; /* Halfpel8_Refine = Halfpel8_Refine_ia64; */ quant_h263_intra = quant_h263_intra_ia64; quant_h263_inter = quant_h263_inter_ia64; dequant_h263_intra = dequant_h263_intra_ia64; dequant_h263_inter = dequant_h263_inter_ia64; transfer_8to16copy = transfer_8to16copy_ia64; transfer_16to8copy = transfer_16to8copy_ia64; transfer_8to16sub = transfer_8to16sub_ia64; transfer_8to16sub2 = transfer_8to16sub2_ia64; transfer_16to8add = transfer_16to8add_ia64; transfer8x8_copy = transfer8x8_copy_ia64; } #endif #if defined(ARCH_IS_PPC) if ((cpu_flags & XVID_CPU_ALTIVEC)) { /* sad operators */ sad16 = sad16_altivec_c; sad16bi = sad16bi_altivec_c; sad8 = sad8_altivec_c; dev16 = dev16_altivec_c; sse8_16bit = sse8_16bit_altivec_c; /* mem transfer */ transfer_8to16copy = transfer_8to16copy_altivec_c; transfer_16to8copy = transfer_16to8copy_altivec_c; transfer_8to16sub = transfer_8to16sub_altivec_c; transfer_8to16subro = transfer_8to16subro_altivec_c; transfer_8to16sub2 = transfer_8to16sub2_altivec_c; transfer_16to8add = transfer_16to8add_altivec_c; transfer8x8_copy = transfer8x8_copy_altivec_c; /* Inverse DCT */ idct = idct_altivec_c; /* Interpolation */ interpolate8x8_halfpel_h = interpolate8x8_halfpel_h_altivec_c; interpolate8x8_halfpel_v = interpolate8x8_halfpel_v_altivec_c; interpolate8x8_halfpel_hv = interpolate8x8_halfpel_hv_altivec_c; 
interpolate8x8_avg2 = interpolate8x8_avg2_altivec_c; interpolate8x8_avg4 = interpolate8x8_avg4_altivec_c; interpolate8x8_halfpel_add = interpolate8x8_halfpel_add_altivec_c; interpolate8x8_halfpel_h_add = interpolate8x8_halfpel_h_add_altivec_c; interpolate8x8_halfpel_v_add = interpolate8x8_halfpel_v_add_altivec_c; interpolate8x8_halfpel_hv_add = interpolate8x8_halfpel_hv_add_altivec_c; /* Colorspace conversion */ bgra_to_yv12 = bgra_to_yv12_altivec_c; abgr_to_yv12 = abgr_to_yv12_altivec_c; rgba_to_yv12 = rgba_to_yv12_altivec_c; argb_to_yv12 = argb_to_yv12_altivec_c; yuyv_to_yv12 = yuyv_to_yv12_altivec_c; uyvy_to_yv12 = uyvy_to_yv12_altivec_c; yv12_to_yuyv = yv12_to_yuyv_altivec_c; yv12_to_uyvy = yv12_to_uyvy_altivec_c; /* Quantization */ quant_h263_intra = quant_h263_intra_altivec_c; quant_h263_inter = quant_h263_inter_altivec_c; dequant_h263_intra = dequant_h263_intra_altivec_c; dequant_h263_inter = dequant_h263_inter_altivec_c; dequant_mpeg_intra = dequant_mpeg_intra_altivec_c; dequant_mpeg_inter = dequant_mpeg_inter_altivec_c; /* Qpel stuff */ xvid_QP_Funcs = &xvid_QP_Funcs_Altivec_C; xvid_QP_Add_Funcs = &xvid_QP_Add_Funcs_Altivec_C; } #endif #if defined(_DEBUG) xvid_debug = init->debug; #endif return(0); } static int xvid_gbl_info(xvid_gbl_info_t * info) { if (XVID_VERSION_MAJOR(info->version) != 1) /* v1.x.x */ return XVID_ERR_VERSION; info->actual_version = XVID_VERSION; info->build = "xvid-1.3.2"; info->cpu_flags = detect_cpu_flags(); info->num_threads = 0; /* single-thread */ #if defined(_WIN32) { SYSTEM_INFO siSysInfo; GetSystemInfo(&siSysInfo); info->num_threads = siSysInfo.dwNumberOfProcessors; /* number of _logical_ cores */ } #elif defined(_SC_NPROCESSORS_CONF) /* should be available on Apple too actually */ info->num_threads = sysconf(_SC_NPROCESSORS_CONF); #elif defined(__APPLE__) && defined(__MACH__) { size_t len; int mib[2], ncpu; mib[0] = CTL_HW; mib[1] = HW_NCPU; len = sizeof(ncpu); if (sysctl(mib, 2, &ncpu, &len, NULL, 0) == 0) info -> 
num_threads = ncpu; else info -> num_threads = 1; } #elif defined(__amigaos4__) { uint32_t num_threads = 1; IExec->GetCPUInfoTags(GCIT_NumberOfCPUs, &num_threads, TAG_END); info->num_threads = num_threads; } #endif return 0; } static int xvid_gbl_convert(xvid_gbl_convert_t* convert) { int width; int height; int width2; int height2; IMAGE img; if (XVID_VERSION_MAJOR(convert->version) != 1) /* v1.x.x */ return XVID_ERR_VERSION; #if 0 const int flip1 = (convert->input.colorspace & XVID_CSP_VFLIP) ^ (convert->output.colorspace & XVID_CSP_VFLIP); #endif width = convert->width; height = convert->height; width2 = convert->width/2; height2 = convert->height/2; switch (convert->input.csp & ~XVID_CSP_VFLIP) { case XVID_CSP_YV12 : img.y = convert->input.plane[0]; img.v = (uint8_t*)convert->input.plane[0] + convert->input.stride[0]*height; img.u = (uint8_t*)convert->input.plane[0] + convert->input.stride[0]*height + (convert->input.stride[0]/2)*height2; image_output(&img, width, height, width, (uint8_t**)convert->output.plane, convert->output.stride, convert->output.csp, convert->interlacing); break; case XVID_CSP_INTERNAL : img.y = (uint8_t*)convert->input.plane[0]; img.u = (uint8_t*)convert->input.plane[1]; img.v = (uint8_t*)convert->input.plane[2]; image_output(&img, width, height, convert->input.stride[0], (uint8_t**)convert->output.plane, convert->output.stride, convert->output.csp, convert->interlacing); break; default : return XVID_ERR_FORMAT; } emms(); return 0; } /***************************************************************************** * Xvid Global Entry point * * This function dispatches the requested global operation (XVID_GBL_INIT, * XVID_GBL_INFO or XVID_GBL_CONVERT) to its handler. * * Returned values : XVID_ERR_FAIL when opt is invalid * else returns the wrapped function result
* ****************************************************************************/ int xvid_global(void *handle, int opt, void *param1, void *param2) { switch(opt) { case XVID_GBL_INIT : return xvid_gbl_init((xvid_gbl_init_t*)param1); case XVID_GBL_INFO : return xvid_gbl_info((xvid_gbl_info_t*)param1); case XVID_GBL_CONVERT : return xvid_gbl_convert((xvid_gbl_convert_t*)param1); default : return XVID_ERR_FAIL; } } /***************************************************************************** * Xvid Native decoder entry point * * This function is just a wrapper to all the option cases. * * Returned values : XVID_ERR_FAIL when opt is invalid * else returns the wrapped function result * ****************************************************************************/ int xvid_decore(void *handle, int opt, void *param1, void *param2) { switch (opt) { case XVID_DEC_CREATE: return decoder_create((xvid_dec_create_t *) param1); case XVID_DEC_DESTROY: return decoder_destroy((DECODER *) handle); case XVID_DEC_DECODE: return decoder_decode((DECODER *) handle, (xvid_dec_frame_t *) param1, (xvid_dec_stats_t*) param2); default: return XVID_ERR_FAIL; } } /***************************************************************************** * Xvid Native encoder entry point * * This function is just a wrapper to all the option cases. 
* * Returned values : XVID_ERR_FAIL when opt is invalid * else returns the wrapped function result * ****************************************************************************/ int xvid_encore(void *handle, int opt, void *param1, void *param2) { switch (opt) { case XVID_ENC_ENCODE: return enc_encode((Encoder *) handle, (xvid_enc_frame_t *) param1, (xvid_enc_stats_t *) param2); case XVID_ENC_CREATE: return enc_create((xvid_enc_create_t *) param1); case XVID_ENC_DESTROY: return enc_destroy((Encoder *) handle); default: return XVID_ERR_FAIL; } } xvidcore/src/quant/0000775000076500007650000000000011566427762015422 5ustar xvidbuildxvidbuildxvidcore/src/quant/quant_mpeg.c0000664000076500007650000001274311564705453017727 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - MPEG4 Quantization related header - * * Copyright(C) 2001-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: quant_mpeg.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "../global.h" #include "quant.h" #include "quant_matrix.h" /***************************************************************************** * Global function pointers ****************************************************************************/ /* Quant */ quant_intraFuncPtr quant_mpeg_intra; quant_interFuncPtr quant_mpeg_inter; /* DeQuant */ quant_intraFuncPtr dequant_mpeg_intra; quant_interFuncPtr dequant_mpeg_inter; /***************************************************************************** * Local data ****************************************************************************/ /* divide-by-multiply table * needs 17 bit shift (16 causes slight errors when q > 19) */ #define FIX(X) ((1UL << SCALEBITS) / (X) + 1) static const uint32_t multipliers[32] = { 0, FIX(2), FIX(4), FIX(6), FIX(8), FIX(10), FIX(12), FIX(14), FIX(16), FIX(18), FIX(20), FIX(22), FIX(24), FIX(26), FIX(28), FIX(30), FIX(32), FIX(34), FIX(36), FIX(38), FIX(40), FIX(42), FIX(44), FIX(46), FIX(48), FIX(50), FIX(52), FIX(54), FIX(56), FIX(58), FIX(60), FIX(62) }; /***************************************************************************** * Function definitions ****************************************************************************/ /* quantize intra-block */ uint32_t quant_mpeg_intra_c(int16_t * coeff, const int16_t * data, const uint32_t quant, const uint32_t dcscalar, const uint16_t * mpeg_quant_matrices) { const uint16_t * intra_matrix_rec = mpeg_quant_matrices + 1*64; int i; int rounding = 1<<(SCALEBITS-1-3); coeff[0] = DIV_DIV(data[0], (int32_t) dcscalar); for (i = 1; i < 64; i++) { int32_t level = data[i]; level *= 
intra_matrix_rec[i]; level = (level + rounding)>>(SCALEBITS-3); coeff[i] = level; } return(0); } /* quantize inter-block * * level = DIV_DIV(16 * data[i], default_intra_matrix[i]); * coeff[i] = (level + quantd) / quant2; * sum += abs(level); */ uint32_t quant_mpeg_inter_c(int16_t * coeff, const int16_t * data, const uint32_t quant, const uint16_t * mpeg_quant_matrices) { const uint32_t mult = multipliers[quant]; const uint16_t *inter_matrix = get_inter_matrix(mpeg_quant_matrices); uint32_t sum = 0; int i; for (i = 0; i < 64; i++) { if (data[i] < 0) { uint32_t level = -data[i]; level = ((level << 4) + (inter_matrix[i] >> 1)) / inter_matrix[i]; level = (level * mult) >> 17; sum += level; coeff[i] = -(int16_t) level; } else if (data[i] > 0) { uint32_t level = data[i]; level = ((level << 4) + (inter_matrix[i] >> 1)) / inter_matrix[i]; level = (level * mult) >> 17; sum += level; coeff[i] = level; } else { coeff[i] = 0; } } return(sum); } /* dequantize intra-block & clamp to [-2048,2047] * * data[i] = (coeff[i] * default_intra_matrix[i] * quant2) >> 4; */ uint32_t dequant_mpeg_intra_c(int16_t * data, const int16_t * coeff, const uint32_t quant, const uint32_t dcscalar, const uint16_t * mpeg_quant_matrices) { const uint16_t *intra_matrix = get_intra_matrix(mpeg_quant_matrices); int i; data[0] = coeff[0] * dcscalar; if (data[0] < -2048) { data[0] = -2048; } else if (data[0] > 2047) { data[0] = 2047; } for (i = 1; i < 64; i++) { if (coeff[i] == 0) { data[i] = 0; } else if (coeff[i] < 0) { uint32_t level = -coeff[i]; level = (level * intra_matrix[i] * quant) >> 3; data[i] = (level <= 2048 ? -(int16_t) level : -2048); } else { uint32_t level = coeff[i]; level = (level * intra_matrix[i] * quant) >> 3; data[i] = (level <= 2047 ? 
level : 2047); } } return(0); } /* dequantize inter-block & clamp to [-2048,2047] * data = ((2 * coeff + SIGN(coeff)) * inter_matrix[i] * quant) / 16 */ uint32_t dequant_mpeg_inter_c(int16_t * data, const int16_t * coeff, const uint32_t quant, const uint16_t * mpeg_quant_matrices) { uint32_t sum = 0; const uint16_t *inter_matrix = get_inter_matrix(mpeg_quant_matrices); int i; for (i = 0; i < 64; i++) { if (coeff[i] == 0) { data[i] = 0; } else if (coeff[i] < 0) { int32_t level = -coeff[i]; level = ((2 * level + 1) * inter_matrix[i] * quant) >> 4; data[i] = (level <= 2048 ? -level : -2048); } else { uint32_t level = coeff[i]; level = ((2 * level + 1) * inter_matrix[i] * quant) >> 4; data[i] = (level <= 2047 ? level : 2047); } sum ^= data[i]; } /* mismatch control */ if ((sum & 1) == 0) { data[63] ^= 1; } return(0); } xvidcore/src/quant/quant_h263.c0000664000076500007650000001246611564705453017463 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - MPEG4 Quantization H263 implementation - * * Copyright(C) 2001-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: quant_h263.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "../global.h" #include "quant.h" /***************************************************************************** * Global function pointers ****************************************************************************/ /* Quant */ quant_intraFuncPtr quant_h263_intra; quant_interFuncPtr quant_h263_inter; /* DeQuant */ quant_intraFuncPtr dequant_h263_intra; quant_interFuncPtr dequant_h263_inter; /***************************************************************************** * Local data ****************************************************************************/ /* divide-by-multiply table * a 16-bit shift is enough in this case */ #define SCALEBITS 16 #define FIX(X) ((1L << SCALEBITS) / (X) + 1) static const uint32_t multipliers[32] = { 0, FIX(2), FIX(4), FIX(6), FIX(8), FIX(10), FIX(12), FIX(14), FIX(16), FIX(18), FIX(20), FIX(22), FIX(24), FIX(26), FIX(28), FIX(30), FIX(32), FIX(34), FIX(36), FIX(38), FIX(40), FIX(42), FIX(44), FIX(46), FIX(48), FIX(50), FIX(52), FIX(54), FIX(56), FIX(58), FIX(60), FIX(62) }; /***************************************************************************** * Function definitions ****************************************************************************/ /* quantize intra-block */ uint32_t quant_h263_intra_c(int16_t * coeff, const int16_t * data, const uint32_t quant, const uint32_t dcscalar, const uint16_t * mpeg_quant_matrices) { const uint32_t mult = multipliers[quant]; const uint16_t quant_m_2 = quant << 1; int i; coeff[0] = DIV_DIV(data[0], (int32_t) dcscalar); for (i = 1; i < 64; i++) { int16_t acLevel = data[i]; if (acLevel < 0) { acLevel = -acLevel; if (acLevel < quant_m_2) {
coeff[i] = 0; continue; } acLevel = (acLevel * mult) >> SCALEBITS; coeff[i] = -acLevel; } else { if (acLevel < quant_m_2) { coeff[i] = 0; continue; } acLevel = (acLevel * mult) >> SCALEBITS; coeff[i] = acLevel; } } return(0); } /* quantize inter-block */ uint32_t quant_h263_inter_c(int16_t * coeff, const int16_t * data, const uint32_t quant, const uint16_t * mpeg_quant_matrices) { const uint32_t mult = multipliers[quant]; const uint16_t quant_m_2 = quant << 1; const uint16_t quant_d_2 = quant >> 1; uint32_t sum = 0; uint32_t i; for (i = 0; i < 64; i++) { int16_t acLevel = data[i]; if (acLevel < 0) { acLevel = (-acLevel) - quant_d_2; if (acLevel < quant_m_2) { coeff[i] = 0; continue; } acLevel = (acLevel * mult) >> SCALEBITS; sum += acLevel; /* sum += |acLevel| */ coeff[i] = -acLevel; } else { acLevel -= quant_d_2; if (acLevel < quant_m_2) { coeff[i] = 0; continue; } acLevel = (acLevel * mult) >> SCALEBITS; sum += acLevel; coeff[i] = acLevel; } } return(sum); } /* dequantize intra-block & clamp to [-2048,2047] */ uint32_t dequant_h263_intra_c(int16_t * data, const int16_t * coeff, const uint32_t quant, const uint32_t dcscalar, const uint16_t * mpeg_quant_matrices) { const int32_t quant_m_2 = quant << 1; const int32_t quant_add = (quant & 1 ? quant : quant - 1); int i; data[0] = coeff[0] * dcscalar; if (data[0] < -2048) { data[0] = -2048; } else if (data[0] > 2047) { data[0] = 2047; } for (i = 1; i < 64; i++) { int32_t acLevel = coeff[i]; if (acLevel == 0) { data[i] = 0; } else if (acLevel < 0) { acLevel = quant_m_2 * -acLevel + quant_add; data[i] = (acLevel <= 2048 ? -acLevel : -2048); } else { acLevel = quant_m_2 * acLevel + quant_add; data[i] = (acLevel <= 2047 ? acLevel : 2047); } } return(0); } /* dequantize inter-block & clamp to [-2048,2047] */ uint32_t dequant_h263_inter_c(int16_t * data, const int16_t * coeff, const uint32_t quant, const uint16_t * mpeg_quant_matrices) { const uint16_t quant_m_2 = quant << 1; const uint16_t quant_add = (quant & 1 ? 
quant : quant - 1); int i; for (i = 0; i < 64; i++) { int16_t acLevel = coeff[i]; if (acLevel == 0) { data[i] = 0; } else if (acLevel < 0) { acLevel = acLevel * quant_m_2 - quant_add; data[i] = (acLevel >= -2048 ? acLevel : -2048); } else { acLevel = acLevel * quant_m_2 + quant_add; data[i] = (acLevel <= 2047 ? acLevel : 2047); } } return(0); } xvidcore/src/quant/ppc_asm/0000775000076500007650000000000011566427762017044 5ustar xvidbuildxvidbuildxvidcore/src/quant/ppc_asm/quant_h263_altivec.c0000664000076500007650000003437411564705453022616 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - MPEG4 Quantization H263 implementation with altivec optimization - * * Copyright(C) 2004 Christoph Naegeli * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: quant_h263_altivec.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifdef HAVE_ALTIVEC_H #include <altivec.h> #endif #include "../../portab.h" #include "../../global.h" #undef DEBUG #include <stdio.h> /***************************************************************************** * Local data ****************************************************************************/ /* divide-by-multiply table * a 16-bit shift is enough in this case */ #define SCALEBITS 16 #define FIX(X) ((1L << SCALEBITS) / (X) + 1) static const uint32_t multipliers[32] = { 0, FIX(2), FIX(4), FIX(6), FIX(8), FIX(10), FIX(12), FIX(14), FIX(16), FIX(18), FIX(20), FIX(22), FIX(24), FIX(26), FIX(28), FIX(30), FIX(32), FIX(34), FIX(36), FIX(38), FIX(40), FIX(42), FIX(44), FIX(46), FIX(48), FIX(50), FIX(52), FIX(54), FIX(56), FIX(58), FIX(60), FIX(62) }; /***************************************************************************** * Function definitions ****************************************************************************/ /* quantize intra-block */ #define QUANT_H263_INTRA_ALTIVEC() \ acLevel = vec_perm(vec_ld(0, data), vec_ld(16, data), vec_lvsl(0, data)); \ zero_mask = vec_cmplt(acLevel, (vector signed short)zerovec); \ acLevel = vec_abs(acLevel); \ \ m2_mask = vec_cmpgt(quant_m_2, (vector unsigned short)acLevel); \ acLevel = vec_sel(acLevel, (vector signed short)zerovec, m2_mask); \ \ even = vec_mule(mult, (vector unsigned short)acLevel); \ odd = vec_mulo(mult, (vector unsigned short)acLevel); \ \ even = vec_sr(even, vec_add(vec_splat_u32(8), vec_splat_u32(8))); \ odd = vec_sr(odd, vec_add(vec_splat_u32(8), vec_splat_u32(8))); \ \ acLevel = (vector signed short)vec_pack(vec_mergeh(even, odd), vec_mergel(even, odd)); \ acLevel =
vec_xor(acLevel, zero_mask); \ acLevel = vec_add(acLevel, vec_and(zero_mask, vec_splat_s16(1))); \ vec_st(acLevel, 0, coeff); \ \ coeff += 8; \ data += 8 /* This function assumes: * coeff is 16 byte aligned * data is unaligned */ uint32_t quant_h263_intra_altivec_c(int16_t *coeff, int16_t *data, const uint32_t quant, const uint32_t dcscalar, const uint16_t *mpeg_quant_matrices) { vector unsigned char zerovec; vector unsigned short mult; vector unsigned short quant_m_2; vector signed short acLevel; register vector unsigned int even; register vector unsigned int odd; vector bool short zero_mask; vector bool short m2_mask; register int16_t *origin_coeff = coeff; register int16_t *origin_data = data; #ifdef DEBUG if(((unsigned)coeff) & 15) fprintf(stderr, "quant_h263_intra_altivec_c:incorrect align, coeff: %lx\n", (long)coeff); #endif zerovec = vec_splat_u8(0); *((unsigned short*)&mult) = (unsigned short)multipliers[quant]; mult = vec_splat(mult, 0); *((unsigned short*)&quant_m_2) = (unsigned short)quant; quant_m_2 = vec_splat(quant_m_2, 0); quant_m_2 = vec_sl(quant_m_2, vec_splat_u16(1)); QUANT_H263_INTRA_ALTIVEC(); QUANT_H263_INTRA_ALTIVEC(); QUANT_H263_INTRA_ALTIVEC(); QUANT_H263_INTRA_ALTIVEC(); QUANT_H263_INTRA_ALTIVEC(); QUANT_H263_INTRA_ALTIVEC(); QUANT_H263_INTRA_ALTIVEC(); QUANT_H263_INTRA_ALTIVEC(); /* the first (DC) coefficient still has to be set */ origin_coeff[0] = DIV_DIV(origin_data[0], (int32_t)dcscalar); return 0; } #define QUANT_H263_INTER_ALTIVEC() \ acLevel = vec_perm(vec_ld(0, data), vec_ld(16, data), vec_lvsl(0, data)); \ zero_mask = vec_cmplt(acLevel, (vector signed short)zerovec); \ acLevel = vec_abs(acLevel); \ acLevel = (vector signed short)vec_sub((vector unsigned short)acLevel, quant_d_2); \ \ m2_mask = vec_cmpgt((vector signed short)quant_m_2, acLevel); \ acLevel = vec_sel(acLevel, (vector signed short)zerovec, m2_mask); \ \ even = vec_mule((vector unsigned short)acLevel, mult); \ odd = vec_mulo((vector unsigned short)acLevel, mult); \ \ even = vec_sr(even,
vec_add(vec_splat_u32(8), vec_splat_u32(8))); \ odd = vec_sr(odd, vec_add(vec_splat_u32(8), vec_splat_u32(8))); \ \ acLevel = (vector signed short)vec_pack(vec_mergeh(even, odd), vec_mergel(even, odd)); \ sum_short = vec_add(sum_short, (vector unsigned short)acLevel); \ \ acLevel = vec_xor(acLevel, zero_mask); \ acLevel = vec_add(acLevel, vec_and(zero_mask, vec_splat_s16(1))); \ \ vec_st(acLevel, 0, coeff); \ \ coeff += 8; \ data += 8 /* This function assumes: * coeff is 16 byte aligned * data is unaligned */ uint32_t quant_h263_inter_altivec_c(int16_t *coeff, int16_t *data, const uint32_t quant, const uint16_t *mpeg_quant_matrices) { vector unsigned char zerovec; vector unsigned short mult; vector unsigned short quant_m_2; vector unsigned short quant_d_2; vector unsigned short sum_short; vector signed short acLevel; vector unsigned int even; vector unsigned int odd; vector bool short m2_mask; vector bool short zero_mask; uint32_t result; #ifdef DEBUG if(((unsigned)coeff) & 15) fprintf(stderr, "quant_h263_inter_altivec_c:incorrect align, coeff: %lx\n", (long)coeff); #endif /* initialisation stuff */ zerovec = vec_splat_u8(0); *((unsigned short*)&mult) = (unsigned short)multipliers[quant]; mult = vec_splat(mult, 0); *((unsigned short*)&quant_m_2) = (unsigned short)quant; quant_m_2 = vec_splat(quant_m_2, 0); quant_m_2 = vec_sl(quant_m_2, vec_splat_u16(1)); *((unsigned short*)&quant_d_2) = (unsigned short)quant; quant_d_2 = vec_splat(quant_d_2, 0); quant_d_2 = vec_sr(quant_d_2, vec_splat_u16(1)); sum_short = (vector unsigned short)zerovec; /* Quantize */ QUANT_H263_INTER_ALTIVEC(); QUANT_H263_INTER_ALTIVEC(); QUANT_H263_INTER_ALTIVEC(); QUANT_H263_INTER_ALTIVEC(); QUANT_H263_INTER_ALTIVEC(); QUANT_H263_INTER_ALTIVEC(); QUANT_H263_INTER_ALTIVEC(); QUANT_H263_INTER_ALTIVEC(); /* Calculate the return value */ even = (vector unsigned int)vec_sum4s((vector signed short)sum_short, (vector signed int)zerovec); even = (vector unsigned int)vec_sums((vector signed int)even,
(vector signed int)zerovec); even = vec_splat(even, 3); vec_ste(even, 0, &result); return result; } /* dequantize intra-block & clamp to [-2048,2047] */ #define DEQUANT_H263_INTRA_ALTIVEC() \ acLevel = vec_perm(vec_ld(0,coeff_ptr), vec_ld(16,coeff_ptr), vec_lvsl(0,coeff_ptr)); \ equal_zero = vec_cmpeq(acLevel, (vector signed short)zerovec); \ less_zero = vec_cmplt(acLevel, (vector signed short)zerovec); \ acLevel = vec_abs(acLevel); \ \ even = vec_mule((vector unsigned short)acLevel, quant_m_2); \ odd = vec_mulo((vector unsigned short)acLevel, quant_m_2); \ \ high = vec_mergeh(even,odd); \ low = vec_mergel(even,odd); \ \ t = vec_sel(quant_add, (vector unsigned short)zerovec, equal_zero); \ high = vec_add(high, (vector unsigned int)vec_mergeh((vector unsigned short)zerovec, t)); \ low = vec_add(low, (vector unsigned int)vec_mergel((vector unsigned short)zerovec, t)); \ \ acLevel = vec_packs((vector signed int)high, (vector signed int)low); \ \ overflow = vec_cmpgt(acLevel, vec_2048); \ acLevel = vec_sel(acLevel, vec_2048, overflow); \ overflow = (vector bool short)vec_and(overflow, vec_xor(less_zero, vec_splat_s16(-1))); \ overflow = (vector bool short)vec_and(overflow, vec_splat_s16(1)); \ acLevel = vec_sub(acLevel, (vector signed short)overflow); \ \ acLevel = vec_xor(acLevel, less_zero); \ acLevel = vec_add(acLevel, vec_and(less_zero, vec_splat_s16(1))); \ \ vec_st(acLevel, 0, data_ptr); \ \ data_ptr += 8; \ coeff_ptr += 8 /* This function assumes: * data is 16 byte aligned * coeff is unaligned */ uint32_t dequant_h263_intra_altivec_c(int16_t *data, const int16_t *coeff, const uint32_t quant, const uint32_t dcscalar, const uint16_t *mpeg_quant_matrices) { vector signed short acLevel; vector signed short vec_2048; vector unsigned short quant_add; vector unsigned short quant_m_2; vector unsigned short t; vector bool short equal_zero; vector bool short less_zero; vector bool short overflow; register vector unsigned int even; register vector unsigned int odd; 
register vector unsigned int high; register vector unsigned int low; register vector unsigned char zerovec; register int16_t *data_ptr; register int16_t *coeff_ptr; #ifdef DEBUG if(((unsigned)data) & 15) fprintf(stderr, "dequant_h263_intra_altivec_c:incorrect align, data: %lx\n", (long)data); #endif /* initialize */ *((unsigned short*)&quant_add) = (unsigned short)(quant & 1 ? quant : quant - 1); quant_add = vec_splat(quant_add,0); *((unsigned short*)&quant_m_2) = (unsigned short)(quant << 1); quant_m_2 = vec_splat(quant_m_2,0); vec_2048 = vec_sl(vec_splat_s16(1), vec_splat_u16(11)); zerovec = vec_splat_u8(0); data_ptr = (int16_t*)data; coeff_ptr = (int16_t*)coeff; /* dequant */ DEQUANT_H263_INTRA_ALTIVEC(); DEQUANT_H263_INTRA_ALTIVEC(); DEQUANT_H263_INTRA_ALTIVEC(); DEQUANT_H263_INTRA_ALTIVEC(); DEQUANT_H263_INTRA_ALTIVEC(); DEQUANT_H263_INTRA_ALTIVEC(); DEQUANT_H263_INTRA_ALTIVEC(); DEQUANT_H263_INTRA_ALTIVEC(); /* data[0] is special */ data[0] = coeff[0] * dcscalar; if(data[0] < -2048) data[0] = -2048; else if(data[0] > 2047) data[0] = 2047; return 0; } /* dequantize inter-block & clamp to [-2048,2047] */ #define DEQUANT_H263_INTER_ALTIVEC() \ acLevel = vec_perm(vec_ld(0,coeff), vec_ld(16,coeff), vec_lvsl(0,coeff)); \ equal_zero = vec_cmpeq(acLevel, (vector signed short)zerovec); \ less_zero = vec_cmplt(acLevel, (vector signed short)zerovec); \ acLevel = vec_abs(acLevel); \ \ even = vec_mule((vector unsigned short)acLevel, quant_m_2); \ odd = vec_mulo((vector unsigned short)acLevel, quant_m_2); \ high = vec_mergeh(even,odd); \ low = vec_mergel(even,odd); \ \ t = vec_sel(quant_add, (vector unsigned short)zerovec, equal_zero); \ high = vec_add(high, (vector unsigned int)vec_mergeh((vector unsigned short)zerovec, t)); \ low = vec_add(low, (vector unsigned int)vec_mergel((vector unsigned short)zerovec, t)); \ acLevel = vec_packs((vector signed int)high, (vector signed int)low); \ \ overflow = vec_cmpgt(acLevel,vec_2048); \ acLevel = vec_sel(acLevel, vec_2048, 
overflow); \ overflow = (vector bool short)vec_and(overflow, vec_xor(less_zero, vec_splat_s16(-1))); \ overflow = (vector bool short)vec_and(overflow, vec_splat_s16(1)); \ acLevel = vec_sub(acLevel, (vector signed short)overflow); \ \ acLevel = vec_xor(acLevel, less_zero); \ acLevel = vec_add(acLevel, vec_and(less_zero, vec_splat_s16(1))); \ \ vec_st(acLevel, 0, data); \ data += 8; \ coeff += 8 /* This function assumes: * data is 16 byte aligned * coeff is unaligned */ uint32_t dequant_h263_inter_altivec_c(int16_t *data, int16_t *coeff, const uint32_t quant, const uint16_t *mpeg_quant_matrices) { vector signed short acLevel; vector signed short vec_2048; vector unsigned short quant_m_2; vector unsigned short quant_add; vector unsigned short t; register vector unsigned int even; register vector unsigned int odd; register vector unsigned int high; register vector unsigned int low; register vector unsigned char zerovec; vector bool short equal_zero; vector bool short less_zero; vector bool short overflow; #ifdef DEBUG /* print alignment errors if this is on */ if(((unsigned)data) & 15) fprintf(stderr, "dequant_h263_inter_altivec_c:incorrect align, data: %lx\n", (long)data); #endif /* initialize */ *((unsigned short*)&quant_m_2) = (unsigned short)(quant << 1); quant_m_2 = vec_splat(quant_m_2,0); *((unsigned short*)&quant_add) = (unsigned short)(quant & 1 ? 
quant : quant - 1); quant_add = vec_splat(quant_add,0); vec_2048 = vec_sl(vec_splat_s16(1), vec_splat_u16(11)); zerovec = vec_splat_u8(0); /* dequant */ DEQUANT_H263_INTER_ALTIVEC(); DEQUANT_H263_INTER_ALTIVEC(); DEQUANT_H263_INTER_ALTIVEC(); DEQUANT_H263_INTER_ALTIVEC(); DEQUANT_H263_INTER_ALTIVEC(); DEQUANT_H263_INTER_ALTIVEC(); DEQUANT_H263_INTER_ALTIVEC(); DEQUANT_H263_INTER_ALTIVEC(); return 0; } xvidcore/src/quant/ppc_asm/quant_mpeg_altivec.c0000664000076500007650000002106311564705453023053 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - MPEG4 Quantization MPEG implementation with altivec optimization - * * Copyright(C) 2005 Christoph Naegeli * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: quant_mpeg_altivec.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifdef HAVE_ALTIVEC_H #include <altivec.h> #endif #include "../../portab.h" #include "../../global.h" #include "../quant.h" #include "../quant_matrix.h" #undef DEBUG #include <stdio.h> /* Some useful typedefs */ typedef vector unsigned char vec_uint8_t; typedef vector unsigned short vec_uint16_t; typedef vector unsigned int vec_uint32_t; typedef vector signed char vec_sint8_t; typedef vector signed short vec_sint16_t; typedef vector signed int vec_sint32_t; /***************************************************************************** * Local data ****************************************************************************/ #define VM18P 3 #define VM18Q 4 /* divide-by-multiply table * needs 17 bit shift (16 causes slight errors when q > 19) */ #define SCALEBITS 17 #define DEQUANT_MPEG_INTRA() \ level = vec_perm(vec_ld(0,coeff_ptr),vec_ld(16,coeff_ptr),vec_lvsl(0,coeff_ptr));\ zero_less = vec_cmplt(level,ox00);\ level = vec_abs(level);\ vintra = vec_perm(vec_ld(0,intra_matrix),vec_ld(16,intra_matrix),vec_lvsl(0,intra_matrix));\ even = vec_mule((vec_uint16_t)level,vintra);\ odd = vec_mulo((vec_uint16_t)level,vintra);\ t = vec_splat_u32(-16);\ et = vec_msum( (vec_uint16_t)even, (vec_uint16_t)swap, (vec_uint32_t)ox00);\ ot = vec_msum( (vec_uint16_t)odd, (vec_uint16_t)swap, (vec_uint32_t)ox00);\ et = vec_sl(et, t);\ ot = vec_sl(ot, t);\ even = vec_mulo( (vec_uint16_t)even, (vec_uint16_t)vquant);\ odd = vec_mulo( (vec_uint16_t)odd, (vec_uint16_t)vquant);\ t = vec_splat_u32(3);\ even = vec_add(even,et);\ odd = vec_add(odd,ot);\ even = vec_sr(even,t);\ odd = vec_sr(odd,t);\ /* Pack & Clamp to [-2048,2047] */\ level = vec_packs( 
(vec_sint32_t)vec_mergeh(even,odd), (vec_sint32_t)vec_mergel(even,odd) );\ t = (vec_uint32_t)vec_splat_s16(-1);\ overflow = vec_cmpgt(level, vec_add(vec_2048, (vec_sint16_t)t));\ level = vec_sel(level, vec_2048, overflow);\ overflow = (vector bool short)vec_and(overflow, vec_xor(zero_less, (vec_sint16_t)t));\ t = (vec_uint32_t)vec_splat_s16(1);\ overflow = (vector bool short)vec_and(overflow, (vec_sint16_t)t);\ level = vec_sub(level, (vec_sint16_t)overflow);\ /* Invert the negative ones */\ level = vec_xor(level, zero_less);\ level = vec_add(level, (vec_sint16_t)vec_and(zero_less, (vec_uint16_t)t));\ /* Save & Advance Pointers */\ vec_st(level,0,data_ptr);\ data_ptr += 8;\ intra_matrix += 8;\ coeff_ptr += 8 /* This function assumes: * data: 16 Byte aligned */ uint32_t dequant_mpeg_intra_altivec_c(int16_t * data, const int16_t * coeff, const uint32_t quant, const uint32_t dcscalar, const uint16_t * mpeg_quant_matrices) { register const uint16_t *intra_matrix = get_intra_matrix(mpeg_quant_matrices); register const int16_t *coeff_ptr = coeff; register int16_t *data_ptr = data; register vec_sint16_t ox00; register vec_sint16_t level; register vec_sint16_t vec_2048; register vec_uint16_t vintra; register vec_uint32_t swap; register vec_uint32_t even,odd; register vec_uint32_t et,ot,t; vec_uint32_t vquant; vector bool short zero_less; vector bool short overflow; #ifdef DEBUG if((long)data & 0xf) fprintf(stderr, "xvidcore: error in dequant_mpeg_intra_altivec_c, incorrect align: %lx\n", (long)data); #endif /* Initialize */ ox00 = vec_splat_s16(0); *((uint32_t*)&vquant) = quant; vquant = vec_splat(vquant,0); swap = vec_rl(vquant, vec_splat_u32(-16)); vec_2048 = (vec_sint16_t)vec_rl(vec_splat_u16(8),vec_splat_u16(8)); DEQUANT_MPEG_INTRA(); DEQUANT_MPEG_INTRA(); DEQUANT_MPEG_INTRA(); DEQUANT_MPEG_INTRA(); DEQUANT_MPEG_INTRA(); DEQUANT_MPEG_INTRA(); DEQUANT_MPEG_INTRA(); DEQUANT_MPEG_INTRA(); /* Process the first */ data[0] = coeff[0] * dcscalar; if (data[0] < -2048) { data[0] = 
-2048; } else if (data[0] > 2047) { data[0] = 2047; } return 0; } #define DEQUANT_MPEG_INTER() \ level = vec_perm(vec_ld(0,coeff),vec_ld(16,coeff),vec_lvsl(0,coeff));\ zero_eq = vec_cmpeq(level,ox00);\ zero_less = vec_cmplt(level,ox00);\ level = vec_abs(level);\ vinter = vec_perm(vec_ld(0,inter_matrix),vec_ld(16,inter_matrix),vec_lvsl(0,inter_matrix));\ t = vec_splat_u32(1);\ hi = (vec_uint32_t)vec_unpackh(level);\ lo = (vec_uint32_t)vec_unpackl(level);\ hi = vec_sl(hi, t);\ lo = vec_sl(lo, t);\ hi = vec_add(hi, t);\ lo = vec_add(lo, t);\ /* Multiplication with vinter */\ sw_hi = vec_rl(hi, v16);\ sw_lo = vec_rl(lo, v16);\ hi = vec_mulo((vec_uint16_t)hi, vec_mergeh((vec_uint16_t)ox00,vinter));\ lo = vec_mulo((vec_uint16_t)lo, vec_mergel((vec_uint16_t)ox00,vinter));\ sw_hi = vec_mulo((vec_uint16_t)sw_hi, vec_mergeh((vec_uint16_t)ox00,vinter));\ sw_lo = vec_mulo((vec_uint16_t)sw_lo, vec_mergeh((vec_uint16_t)ox00,vinter));\ hi = vec_add(hi, vec_sl(sw_hi,v16));\ lo = vec_add(lo, vec_sl(sw_lo,v16));\ /* Multiplication with Quant */\ t = vec_splat_u32(4);\ sw_hi = vec_msum( (vec_uint16_t)hi, (vec_uint16_t)swap, (vec_uint32_t)ox00 );\ sw_lo = vec_msum( (vec_uint16_t)lo, (vec_uint16_t)swap, (vec_uint32_t)ox00 );\ hi = vec_mulo( (vec_uint16_t)hi, (vec_uint16_t)vquant );\ lo = vec_mulo( (vec_uint16_t)lo, (vec_uint16_t)vquant );\ hi = vec_add(hi, vec_sl(sw_hi,v16));\ lo = vec_add(lo, vec_sl(sw_lo,v16));\ hi = vec_sr(hi, t);\ lo = vec_sr(lo, t);\ /* Pack & Clamp to [-2048,2047] */\ t = (vec_uint32_t)vec_splat_s16(-1);\ level = vec_packs((vec_sint32_t)hi, (vec_sint32_t)lo);\ overflow = vec_cmpgt(level, vec_add(v2048, (vec_sint16_t)t));\ level = vec_sel(level, v2048, overflow);\ overflow = (vector bool short)vec_and(overflow, vec_xor(zero_less, (vec_sint16_t)t));\ t = (vec_uint32_t)vec_splat_s16(1);\ overflow = (vector bool short)vec_and(overflow, (vec_sint16_t)t);\ level = vec_sub(level, (vec_sint16_t)overflow);\ level = vec_sel(level, ox00, zero_eq);\ level = vec_xor(level, 
zero_less);\ level = vec_add(level, (vec_sint16_t)vec_and(zero_less, (vec_uint16_t)t));\ /* Get vsum */\ vsum = vec_xor(vsum, (vec_uint32_t)vec_unpackh(level));\ vsum = vec_xor(vsum, (vec_uint32_t)vec_unpackl(level));\ /* Save & Advance Pointers */\ vec_st(level,0,data);\ data+=8;\ inter_matrix+=8;\ coeff+=8 /* This function assumes: * data: 16 Byte aligned */ uint32_t dequant_mpeg_inter_altivec_c(int16_t * data, const int16_t * coeff, const uint32_t quant, const uint16_t * mpeg_quant_matrices) { register uint32_t sum; register const uint16_t *inter_matrix = get_inter_matrix(mpeg_quant_matrices); register vec_sint16_t ox00; register vec_sint16_t v2048; register vec_sint16_t level; register vec_uint16_t vinter; register vec_uint32_t hi,lo; register vec_uint32_t sw_hi,sw_lo; register vec_uint32_t swap; register vec_uint32_t t,v16; vec_uint32_t vsum; vec_uint32_t vquant; vector bool short zero_eq; vector bool short zero_less; vector bool short overflow; #ifdef DEBUG if((long)data & 0xf) fprintf(stderr, "xvidcore: error in dequant_mpeg_inter_altivec_c, incorrect align: %lx\n", (long)data); #endif /* Initialization */ ox00 = vec_splat_s16(0); v16 = vec_splat_u32(-16); v2048 = vec_rl(vec_splat_s16(8),vec_splat_u16(8)); vsum = (vec_uint32_t)ox00; *((uint32_t*)&vquant) = quant; vquant = vec_splat(vquant,0); swap = vec_rl(vquant,v16); DEQUANT_MPEG_INTER(); DEQUANT_MPEG_INTER(); DEQUANT_MPEG_INTER(); DEQUANT_MPEG_INTER(); DEQUANT_MPEG_INTER(); DEQUANT_MPEG_INTER(); DEQUANT_MPEG_INTER(); DEQUANT_MPEG_INTER(); sum = ((uint32_t*)&vsum)[0]; sum ^= ((uint32_t*)&vsum)[1]; sum ^= ((uint32_t*)&vsum)[2]; sum ^= ((uint32_t*)&vsum)[3]; /* mismatch control */ if((sum & 1) == 0) { data -= 1; *data ^= 1; } return 0; } xvidcore/src/quant/ia64_asm/0000775000076500007650000000000011566427762017025 5ustar xvidbuildxvidbuildxvidcore/src/quant/ia64_asm/quant_h263_ia64.s0000664000076500007650000003220011147310721021700 0ustar xvidbuildxvidbuild// 
**************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 h.263 quantization - // * // * Copyright(C) 2002 Christian Engel, Hans-Joachim Daniels // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. // * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: quant_h263_ia64.s,v 1.7 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * quant_h263_ia64.s, IA-64 h.263 quantization // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** // ***************************************************************************** // * // * functions quant_inter and dequant_inter have been software-pipelined // * use was made of the pmpyshr2 instruction // * // * by Christian Engel and Hans-Joachim Daniels // * christian.engel@ira.uka.de hans-joachim.daniels@ira.uka.de // * // * This was made for the ia64 DivX laboratory (yes, it was really called // * this way; originally OpenDivX was intended, but it died shortly before our // * work started (you will probably 
already know ...)) // * at the Universitat Karlsruhe (TH) held between April and July 2002 // * http://www.info.uni-karlsruhe.de/~rubino/ia64p/ // * // *****************************************************************************/ .file "quant_h263_ia64.s" .pred.safe_across_calls p1-p5,p16-p63 .section .rodata .align 4 .type multipliers#,@object .size multipliers#,128 multipliers: data4 0 data4 32769 data4 16385 data4 10923 data4 8193 data4 6554 data4 5462 data4 4682 data4 4097 data4 3641 data4 3277 data4 2979 data4 2731 data4 2521 data4 2341 data4 2185 data4 2049 data4 1928 data4 1821 data4 1725 data4 1639 data4 1561 data4 1490 data4 1425 data4 1366 data4 1311 data4 1261 data4 1214 data4 1171 data4 1130 data4 1093 data4 1058 .global __divdi3# .text .align 16 .global quant_h263_intra_ia64# .proc quant_h263_intra_ia64# quant_h263_intra_ia64: .prologue .save ar.pfs, r38 alloc r38 = ar.pfs, 4, 3, 2, 0 adds r16 = -8, r12 .fframe 32 adds r12 = -32, r12 mov r17 = ar.lc addl r14 = @ltoff(multipliers#), gp ld2 r15 = [r33] ;; .savesp ar.lc, 24 st8 [r16] = r17, 8 ld8 r14 = [r14] sxt2 r15 = r15 ;; .save.f 0x1 stf.spill [r16] = f2 .save rp, r37 mov r37 = b0 .body dep.z r36 = r34, 1, 15 dep.z r16 = r34, 2, 32 cmp4.ge p6, p7 = 0, r15 ;; add r16 = r16, r14 ;; ld4 r16 = [r16] ;; setf.sig f2 = r16 (p6) br.cond.dptk .L8 extr r39 = r35, 1, 31 sxt4 r40 = r35 ;; add r39 = r39, r15 br .L21 ;; .L8: extr r39 = r35, 1, 31 sxt4 r40 = r35 ;; sub r39 = r15, r39 ;; .L21: sxt4 r39 = r39 br.call.sptk.many b0 = __divdi3# ;; addl r14 = 62, r0 st2 [r32] = r8 addl r19 = 1, r0 ;; mov ar.lc = r14 ;; .L20: dep.z r17 = r19, 1, 32 ;; add r15 = r17, r33 adds r19 = 1, r19 ;; ld2 r14 = [r15] ;; sxt2 r14 = r14 ;; mov r16 = r14 mov r18 = r14 ;; sub r15 = r0, r16 cmp4.le p8, p9 = r36, r16 cmp4.le p6, p7 = r0, r16 ;; sxt2 r14 = r15 (p6) br.cond.dptk .L14 ;; mov r16 = r14 add r18 = r17, r32 ;; setf.sig f6 = r16 cmp4.le p6, p7 = r36, r16 mov r15 = r18 ;; xma.l f6 = f6, f2, f0 (p7) st2 [r18] = r0 ;; getf.sig r14 = 
f6 ;; extr r14 = r14, 16, 16 ;; sub r14 = r0, r14 ;; (p6) st2 [r15] = r14 br .L12 .L14: .pred.rel "mutex", p8, p9 setf.sig f6 = r18 add r16 = r17, r32 ;; xma.l f6 = f6, f2, f0 mov r15 = r16 (p9) st2 [r16] = r0 ;; getf.sig r14 = f6 ;; extr r14 = r14, 16, 16 ;; (p8) st2 [r15] = r14 .L12: br.cloop.sptk.few .L20 adds r18 = 24, r12 ;; ld8 r19 = [r18], 8 mov ar.pfs = r38 mov b0 = r37 ;; mov ar.lc = r19 ldf.fill f2 = [r18] .restore sp adds r12 = 32, r12 br.ret.sptk.many b0 .endp quant_h263_intra_ia64# .common quant_h263_intra#,8,8 .common dequant_h263_intra#,8,8 .align 16 .global dequant_h263_intra_ia64# .proc dequant_h263_intra_ia64# dequant_h263_intra_ia64: .prologue ld2 r14 = [r33] andcm r15 = 1, r34 setf.sig f8 = r35 ;; sxt2 r14 = r14 sub r15 = r34, r15 addl r16 = -2048, r0 ;; setf.sig f6 = r14 setf.sig f7 = r15 shladd r34 = r34, 1, r0 ;; xma.l f8 = f6, f8, f0 .save ar.lc, r2 mov r2 = ar.lc ;; .body getf.sig r14 = f8 setf.sig f6 = r34 ;; sxt2 r15 = r14 st2 [r32] = r14 ;; cmp4.le p6, p7 = r16, r15 ;; (p7) st2 [r32] = r16 (p7) br.cond.dptk .L32 addl r14 = 2047, r0 ;; cmp4.ge p6, p7 = r14, r15 ;; (p7) st2 [r32] = r14 .L32: addl r14 = 62, r0 addl r19 = 1, r0 addl r22 = 2048, r0 addl r21 = -2048, r0 addl r20 = 2047, r0 ;; mov ar.lc = r14 ;; .L56: dep.z r16 = r19, 1, 32 ;; add r14 = r16, r33 add r17 = r16, r32 adds r19 = 1, r19 ;; ld2 r15 = [r14] ;; sxt2 r15 = r15 ;; cmp4.ne p6, p7 = 0, r15 cmp4.le p8, p9 = r0, r15 ;; (p7) st2 [r17] = r0 (p7) br.cond.dpnt .L36 add r18 = r16, r32 sub r17 = r0, r15 ;; mov r14 = r18 (p8) br.cond.dptk .L40 setf.sig f8 = r17 ;; xma.l f8 = f6, f8, f7 ;; getf.sig r15 = f8 ;; cmp4.lt p6, p7 = r22, r15 sub r16 = r0, r15 ;; (p7) st2 [r14] = r16 (p6) st2 [r14] = r21 br .L36 .L40: setf.sig f8 = r15 ;; xma.l f8 = f6, f8, f7 ;; getf.sig r15 = f8 ;; cmp4.le p6, p7 = r20, r15 ;; (p6) mov r14 = r20 (p7) mov r14 = r15 ;; st2 [r18] = r14 .L36: br.cloop.sptk.few .L56 ;; mov ar.lc = r2 br.ret.sptk.many b0 .endp dequant_h263_intra_ia64# // uint32_t 
quant_h263_inter_ia64(int16_t *coeff, const int16_t *data, const uint32_t quant) .common quant_h263_inter#,8,8 .align 16 .global quant_h263_inter_ia64# .proc quant_h263_inter_ia64# quant_h263_inter_ia64: //******************************************************* //* * //* const uint32_t mult = multipliers[quant]; * //* const uint16_t quant_m_2 = quant << 1; * //* const uint16_t quant_d_2 = quant >> 1; * //* int sum = 0; * //* uint32_t i; * //* int16_t acLevel,acL; * //* * //*******************************************************/ LL=3 // LL = load latency //if LL is changed, you'll also have to change the .pred.rel... parts below! .prologue addl r14 = @ltoff(multipliers#), gp dep.z r15 = r34, 2, 32 .save ar.lc, r2 mov r2 = ar.lc ;; .body alloc r9=ar.pfs,0,24,0,24 mov r17 = ar.ec mov r10 = pr ld8 r14 = [r14] extr.u r16 = r34, 1, 16 //r16 = quant_d_2 dep.z r20 = r34, 1, 15 //r20 = quant_m_2 ;; add r15 = r15, r14 mov r21 = r16 //r21 = quant_d_2 mov r8 = r0 //r8 = sum = 0 mov pr.rot = 0 //p16-p63 = 0 ;; ld4 r15 = [r15] addl r14 = 63, r0 mov pr.rot = 1 << 16 //p16=1 ;; mov ar.lc = r14 mov ar.ec = LL+9 mov r29 = r15 ;; mov r15 = r33 //r15 = data mov r18 = r32 //r18 = coeff ;; .rotr ac1[LL+3], ac2[8], ac3[2] .rotp p[LL+9], cmp1[8], cmp1neg[8],cmp2[5], cmp2neg[2] //******************************************************************************* //* * //* for (i = 0; i < 64; i++) { * //* acL=acLevel = data[i]; * //* acLevel = ((acLevel < 0)?-acLevel:acLevel) - quant_d_2; * //* if (acLevel < quant_m_2){ * //* acLevel = 0; * //* } * //* acLevel = (acLevel * mult) >> SCALEBITS; * //* sum += acLevel; * //* coeff[i] = ((acL < 0)?-acLevel:acLevel); * //* } * //* * //*******************************************************************************/ .explicit .L58: .pred.rel "clear", p29, p37 .pred.rel "mutex", p29, p37 //pipeline stage {.mmi (p[0]) ld2 ac1[0] = [r15],2 // 0 acL=acLevel = data[i]; (p[LL+1]) sub ac2[0] = r0, ac1[LL+1] // LL+1 ac2=-acLevel (p[LL]) sxt2 ac1[LL] = ac1[LL] 
// LL } {.mmi (p[LL+1]) cmp4.le cmp1[0], cmp1neg[0] = r0, ac1[LL+1] // LL+1 cmp1 = (0<=acLevel) ; cmp1neg = !(0<=acLevel) (p[LL+4]) cmp4.le cmp2[0], cmp2neg[0] = r20, ac2[3] // LL+4 cmp2 = (quant_m_2 < acLevel) ; cmp2neg = !(quant_m_2 < acLevel) (cmp1[1]) sub ac2[1] = ac1[LL+2], r21 // LL+2 acLevel = acLevel - quant_d_2; } {.mmi (cmp2neg[1]) mov ac2[4] = r0 // LL+5 if (acLevel < quant_m_2) acLevel=0; (cmp1neg[1]) sub ac2[1] = ac2[1], r21 // LL+2 acLevel = ac2 - quant_d_2; (p[LL+3]) sxt2 ac2[2] = ac2[2] // LL+3 } {.mmi .pred.rel "mutex", p34, p42 (cmp1[6]) mov ac3[0] = ac2[6] // LL+7 ac3 = acLevel; (cmp1neg[6]) sub ac3[0] = r0, ac2[6] // LL+7 ac3 = -acLevel; (p[LL+6]) pmpyshr2.u ac2[5] = r29, ac2[5], 16 // LL+6 acLevel = (acLevel * mult) >> SCALEBITS; } {.mib (p[LL+8]) st2 [r18] = ac3[1] , 2 // LL+8 coeff[i] = ac3; (cmp2[4]) add r8 = r8, ac2[7] // LL+8 sum += acLevel; br.ctop.sptk.few .L58 ;; } .pred.rel "clear", p29, p37 .default mov ar.ec = r17 ;; mov ar.lc = r2 mov pr = r10, -1 mov ar.pfs = r9 br.ret.sptk.many b0 .endp quant_h263_inter_ia64# // void dequant_h263_inter_ia64(int16_t *data, const int16_t *coeff, const uint32_t quant) .common dequant_h263_inter#,8,8 .align 16 .global dequant_h263_inter_ia64# .proc dequant_h263_inter_ia64# dequant_h263_inter_ia64: //*********************************************************************** //* * //* const uint16_t quant_m_2 = quant << 1; * //* const uint16_t quant_add = (quant & 1 ? 
quant : quant - 1); * //* uint32_t i; * //* * //*********************************************************************** .prologue andcm r14 = 1, r34 dep.z r29 = r34, 1, 15 alloc r9=ar.pfs,0,32,0,32 .save ar.lc, r2 mov r2 = ar.lc ;; .body sub r15 = r34, r14 // r15 = quant addl r14 = 63, r0 addl r21 = -2048, r0 addl r20 = 2047, r0 mov r16 = ar.ec mov r17 = pr ;; zxt2 r15 = r15 mov ar.lc = r14 mov pr.rot = 0 ;; adds r14 = 0, r33 // r14 = coeff mov r18 = r32 // r18 = data mov ar.ec = LL+10 mov pr.rot = 1 << 16 ;; //******************************************************************************* //* * //*for (i = 0; i < 64; i++) { * //* int16_t acLevel = coeff[i]; * //* * //* if (acLevel == 0) * //* { * //* data[i] = 0; * //* } * //* else if (acLevel < 0) * //* { * //* acLevel = acLevel * quant_m_2 - quant_add; * //* data[i] = (acLevel >= -2048 ? acLevel : -2048); * //* } * //* else // if (acLevel > 0) * //* { * //* acLevel = acLevel * quant_m_2 + quant_add; * //* data[i] = (acLevel <= 2047 ? acLevel : 2047); * //* } * //* } * //* * //*******************************************************************************/ LL=2 // LL := load latency //if LL is changed, you'll also have to change the .pred.rel... parts below! 
.rotr ac1[LL+10], x[5], y1[3], y2[3] .rotp p[LL+10] , cmp1neg[8], cmp2[5], cmp2neg[5],cmp3[2], cmp3neg[2] .explicit //pipeline stage .L60: .pred.rel "clear", p36 .pred.rel "mutex", p47, p49 .pred.rel "mutex", p46, p48 .pred.rel "mutex", p40, p45 .pred.rel "mutex", p39, p44 .pred.rel "mutex", p38, p43 .pred.rel "mutex", p37, p42 .pred.rel "mutex", p36, p41 {.mmi (p[0])ld2 ac1[0] = [r14] ,2 // 0 acLevel = coeff[i]; (p[LL+1])cmp4.ne p6, cmp1neg[0] = 0, ac1[LL+1] // LL+1 (p[LL])sxt2 ac1[LL] = ac1[LL] // LL } {.mmi (p[LL+1])cmp4.le cmp2[0], cmp2neg[0] = r0, ac1[LL+1] // LL+1 (cmp2[1]) mov x[0] = r20 // LL+2 (p[LL+2])pmpyshr2.u ac1[LL+2] = r29, ac1[LL+2], 0 // LL+2 } {.mmi (cmp2neg[1]) mov x[0] = r21 // LL+2 (cmp2[2]) add ac1[LL+3] = ac1[LL+3], r15 // LL+3 (cmp2neg[2]) sub ac1[LL+3] = ac1[LL+3], r15 // LL+3 } {.mmi (cmp2neg[4]) mov y1[0] = ac1[LL+5] // LL+5 (cmp2neg[4]) mov y2[0] = x[3] // LL+5 (p[LL+4])sxt2 ac1[LL+4] = ac1[LL+4] // LL+4 } {.mmi (cmp2[4]) mov y1[0] = x[3] // LL+5 (cmp2[4]) mov y2[0] = ac1[LL+5] // LL+5 (p[LL+6])cmp4.le cmp3[0], cmp3neg[0] = x[4], ac1[LL+6] // LL+6 } {.mmi (cmp3[1]) mov ac1[LL+7] = y1[2] // LL+7 (cmp3neg[1]) mov ac1[LL+7] = y2[2] // LL+7 (cmp1neg[7]) mov ac1[LL+8] = r0 // LL+8 } {.mbb (p[LL+9])st2 [r18] = ac1[LL+9] ,2 // LL+9 nop.b 0x0 br.ctop.sptk.few .L60 ;; } .pred.rel "clear", p36 .default mov ar.lc = r2 mov ar.pfs = r9 mov ar.ec = r16 mov pr = r17, -1 ;; mov ar.lc = r2 br.ret.sptk.many b0 .endp dequant_h263_inter_ia64# .ident "GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.1 2.96-85)" xvidcore/src/quant/x86_asm/0000775000076500007650000000000011566427762016707 5ustar xvidbuildxvidbuildxvidcore/src/quant/x86_asm/quantize_h263_mmx.asm0000664000076500007650000006004111254216113022651 0ustar xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - MPEG4 Quantization H263 implementation / MMX optimized - ; * ; * Copyright(C) 2001-2003 Peter Ross ; * 2002-2003 Pascal 
Massimino ; * 2004 Jean-Marc Bastide ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: quantize_h263_mmx.asm,v 1.16 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ****************************************************************************/ ; enable dequant saturate [-2048,2047], test purposes only. %define SATURATE %include "nasm.inc" ;============================================================================= ; Read only Local data ;============================================================================= DATA ALIGN SECTION_ALIGN plus_one: times 8 dw 1 ;----------------------------------------------------------------------------- ; ; quant table ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_quant: %assign quant 0 %rep 32 times 4 dw quant %assign quant quant+1 %endrep ;----------------------------------------------------------------------------- ; ; subtract by Q/2 table ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_sub: %assign quant 1 %rep 31 times 4 dw quant / 2 %assign quant quant+1 %endrep ;----------------------------------------------------------------------------- ; ; divide by 2Q table ; ; use a shift of 16 to take full advantage of _pmulhw_ ; for q=1, 
_pmulhw_ will overflow so it is treated separately ; (3dnow2 provides _pmulhuw_ which won't cause overflow) ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_div: %assign quant 1 %rep 31 times 4 dw (1<<16) / (quant*2) + 1 %assign quant quant+1 %endrep ;============================================================================= ; Code ;============================================================================= TEXT cglobal quant_h263_intra_mmx cglobal quant_h263_intra_sse2 cglobal quant_h263_inter_mmx cglobal quant_h263_inter_sse2 cglobal dequant_h263_intra_mmx cglobal dequant_h263_intra_xmm cglobal dequant_h263_intra_sse2 cglobal dequant_h263_inter_mmx cglobal dequant_h263_inter_xmm cglobal dequant_h263_inter_sse2 ;----------------------------------------------------------------------------- ; ; uint32_t quant_h263_intra_mmx(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN quant_h263_intra_mmx: mov _EAX, prm2 ; data mov TMP0, prm4 ; dcscalar movsx _EAX, word [_EAX] ; data[0] sar TMP0, 1 ; dcscalar /2 mov TMP1, _EAX sar TMP1, 31 ; sgn(data[0]) xor TMP0,TMP1 ; *sgn(data[0]) sub _EAX,TMP1 add _EAX,TMP0 ; + (dcscalar/2)*sgn(data[0]) mov TMP0, prm3 ; quant lea TMP1, [mmx_div] movq mm7, [TMP1+TMP0 * 8 - 8] %ifdef ARCH_IS_X86_64 %ifdef WINDOWS mov TMP1, prm2 %endif %endif cdq idiv prm4d ; dcscalar %ifdef ARCH_IS_X86_64 %ifdef WINDOWS mov prm2, TMP1 %endif %endif cmp TMP0, 1 mov TMP1, prm1 ; coeff je .low mov TMP0, prm2 ; data push _EAX ; DC mov _EAX, TMP0 mov TMP0,4 .loop: movq mm0, [_EAX] ; data pxor mm4,mm4 movq mm1, [_EAX + 8] pcmpgtw mm4,mm0 ; (data<0) pxor mm5,mm5 pmulhw mm0,mm7 ; /(2*quant) pcmpgtw mm5,mm1 movq mm2, [_EAX+16] psubw mm0,mm4 ; +(data<0) pmulhw mm1,mm7 pxor mm4,mm4 movq mm3,[_EAX+24] pcmpgtw mm4,mm2 psubw mm1,mm5 
pmulhw mm2,mm7 pxor mm5,mm5 pcmpgtw mm5,mm3 pmulhw mm3,mm7 psubw mm2,mm4 psubw mm3,mm5 movq [TMP1], mm0 lea _EAX, [_EAX+32] movq [TMP1 + 8], mm1 movq [TMP1 + 16], mm2 movq [TMP1 + 24], mm3 dec TMP0 lea TMP1, [TMP1+32] jne .loop jmp .end .low: movd mm7,TMP0d mov TMP0, prm2 push _EAX mov _EAX, TMP0 mov TMP0,4 .loop_low: movq mm0, [_EAX] pxor mm4,mm4 movq mm1, [_EAX + 8] pcmpgtw mm4,mm0 pxor mm5,mm5 psubw mm0,mm4 pcmpgtw mm5,mm1 psraw mm0,mm7 psubw mm1,mm5 movq mm2,[_EAX+16] pxor mm4,mm4 psraw mm1,mm7 pcmpgtw mm4,mm2 pxor mm5,mm5 psubw mm2,mm4 movq mm3,[_EAX+24] pcmpgtw mm5,mm3 psraw mm2,mm7 psubw mm3,mm5 movq [TMP1], mm0 psraw mm3,mm7 movq [TMP1 + 8], mm1 movq [TMP1+16],mm2 lea _EAX, [_EAX+32] movq [TMP1+24],mm3 dec TMP0 lea TMP1, [TMP1+32] jne .loop_low .end: pop _EAX mov TMP1, prm1 ; coeff mov [TMP1],ax xor _EAX,_EAX ; return 0 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t quant_h263_intra_sse2(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN quant_h263_intra_sse2: PUSH_XMM6_XMM7 mov _EAX, prm2 ; data movsx _EAX, word [_EAX] ; data[0] mov TMP0,prm4 ; dcscalar mov TMP1, _EAX sar TMP0,1 add _EAX,TMP0 sub TMP1,TMP0 cmovl _EAX,TMP1 ; +/- dcscalar/2 mov TMP0, prm3 ; quant lea TMP1, [mmx_div] movq xmm7, [TMP1+TMP0 * 8 - 8] %ifdef ARCH_IS_X86_64 %ifdef WINDOWS mov TMP1, prm2 %endif %endif cdq idiv prm4d ; dcscalar %ifdef ARCH_IS_X86_64 %ifdef WINDOWS mov prm2, TMP1 %endif %endif cmp TMP0, 1 mov TMP1, prm1 ; coeff je near .low mov TMP0, prm2 push _EAX ; DC mov _EAX, TMP0 mov TMP0,2 movlhps xmm7,xmm7 .loop: movdqa xmm0, [_EAX] pxor xmm4,xmm4 movdqa xmm1, [_EAX + 16] pcmpgtw xmm4,xmm0 pxor xmm5,xmm5 pmulhw xmm0,xmm7 pcmpgtw xmm5,xmm1 movdqa xmm2, [_EAX+32] psubw xmm0,xmm4 pmulhw xmm1,xmm7 pxor xmm4,xmm4 movdqa xmm3,[_EAX+48] pcmpgtw 
xmm4,xmm2 psubw xmm1,xmm5 pmulhw xmm2,xmm7 pxor xmm5,xmm5 pcmpgtw xmm5,xmm3 pmulhw xmm3,xmm7 psubw xmm2,xmm4 psubw xmm3,xmm5 movdqa [TMP1], xmm0 lea _EAX, [_EAX+64] movdqa [TMP1 + 16], xmm1 movdqa [TMP1 + 32], xmm2 movdqa [TMP1 + 48], xmm3 dec TMP0 lea TMP1, [TMP1+64] jne .loop jmp .end .low: movd xmm7,TMP0d mov TMP0, prm2 push _EAX ; DC mov _EAX, TMP0 mov TMP0,2 .loop_low: movdqa xmm0, [_EAX] pxor xmm4,xmm4 movdqa xmm1, [_EAX + 16] pcmpgtw xmm4,xmm0 pxor xmm5,xmm5 psubw xmm0,xmm4 pcmpgtw xmm5,xmm1 psraw xmm0,xmm7 psubw xmm1,xmm5 movdqa xmm2,[_EAX+32] pxor xmm4,xmm4 psraw xmm1,xmm7 pcmpgtw xmm4,xmm2 pxor xmm5,xmm5 psubw xmm2,xmm4 movdqa xmm3,[_EAX+48] pcmpgtw xmm5,xmm3 psraw xmm2,xmm7 psubw xmm3,xmm5 movdqa [TMP1], xmm0 psraw xmm3,xmm7 movdqa [TMP1+16], xmm1 movdqa [TMP1+32],xmm2 lea _EAX, [_EAX+64] movdqa [TMP1+48],xmm3 dec TMP0 lea TMP1, [TMP1+64] jne .loop_low .end: pop _EAX mov TMP1, prm1 ; coeff mov [TMP1],ax xor _EAX,_EAX ; return 0 POP_XMM6_XMM7 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t quant_h263_inter_mmx(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN quant_h263_inter_mmx: mov TMP1, prm1 ; coeff mov _EAX, prm3 ; quant pxor mm5, mm5 ; sum lea TMP0, [mmx_sub] movq mm6, [TMP0 + _EAX * 8 - 8] ; sub cmp al, 1 jz near .q1routine lea TMP0, [mmx_div] movq mm7, [TMP0 + _EAX * 8 - 8] ; divider xor TMP0, TMP0 mov _EAX, prm2 ; data ALIGN SECTION_ALIGN .loop: movq mm0, [_EAX + 8*TMP0] ; mm0 = [1st] movq mm3, [_EAX + 8*TMP0 + 8] pxor mm1, mm1 ; mm1 = 0 pxor mm4, mm4 ; pcmpgtw mm1, mm0 ; mm1 = (0 > mm0) pcmpgtw mm4, mm3 ; pxor mm0, mm1 ; mm0 = |mm0| pxor mm3, mm4 ; psubw mm0, mm1 ; displace psubw mm3, mm4 ; psubusw mm0, mm6 ; mm0 -= sub (unsigned, dont go < 0) psubusw mm3, mm6 ; pmulhw mm0, mm7 ; mm0 = (mm0 / 2Q) >> 16 pmulhw mm3, mm7 ; paddw mm5, mm0 ; 
sum += mm0 pxor mm0, mm1 ; mm0 *= sign(mm0) paddw mm5, mm3 ; pxor mm3, mm4 ; psubw mm0, mm1 ; undisplace psubw mm3, mm4 movq [TMP1 + 8*TMP0], mm0 movq [TMP1 + 8*TMP0 + 8], mm3 add TMP0, 2 cmp TMP0, 16 jnz .loop .done: pmaddwd mm5, [plus_one] movq mm0, mm5 psrlq mm5, 32 paddd mm0, mm5 movd eax, mm0 ; return sum ret .q1routine: xor TMP0, TMP0 mov _EAX, prm2 ; data ALIGN SECTION_ALIGN .q1loop: movq mm0, [_EAX + 8*TMP0] ; mm0 = [1st] movq mm3, [_EAX + 8*TMP0+ 8] ; pxor mm1, mm1 ; mm1 = 0 pxor mm4, mm4 ; pcmpgtw mm1, mm0 ; mm1 = (0 > mm0) pcmpgtw mm4, mm3 ; pxor mm0, mm1 ; mm0 = |mm0| pxor mm3, mm4 ; psubw mm0, mm1 ; displace psubw mm3, mm4 ; psubusw mm0, mm6 ; mm0 -= sub (unsigned, dont go < 0) psubusw mm3, mm6 ; psrlw mm0, 1 ; mm0 >>= 1 (/2) psrlw mm3, 1 ; paddw mm5, mm0 ; sum += mm0 pxor mm0, mm1 ; mm0 *= sign(mm0) paddw mm5, mm3 ; pxor mm3, mm4 ; psubw mm0, mm1 ; undisplace psubw mm3, mm4 movq [TMP1 + 8*TMP0], mm0 movq [TMP1 + 8*TMP0 + 8], mm3 add TMP0, 2 cmp TMP0, 16 jnz .q1loop jmp .done ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t quant_h263_inter_sse2(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN quant_h263_inter_sse2: PUSH_XMM6_XMM7 mov TMP1, prm1 ; coeff mov _EAX, prm3 ; quant pxor xmm5, xmm5 ; sum lea TMP0, [mmx_sub] movq xmm6, [TMP0 + _EAX*8 - 8] ; sub movlhps xmm6, xmm6 ; duplicate into high 8 bytes cmp al, 1 jz near .qes2_q1_routine .qes2_not1: lea TMP0, [mmx_div] movq xmm7, [TMP0 + _EAX*8 - 8] ; divider xor TMP0, TMP0 mov _EAX, prm2 ; data movlhps xmm7, xmm7 ALIGN SECTION_ALIGN .qes2_loop: movdqa xmm0, [_EAX + TMP0*8] ; xmm0 = [1st] movdqa xmm3, [_EAX + TMP0*8 + 16] ; xmm3 = [2nd] pxor xmm1, xmm1 pxor xmm4, xmm4 pcmpgtw xmm1, xmm0 pcmpgtw xmm4, xmm3 pxor xmm0, xmm1 pxor xmm3, xmm4 psubw xmm0, xmm1 psubw xmm3, xmm4 psubusw xmm0, xmm6 psubusw 
xmm3, xmm6 pmulhw xmm0, xmm7 pmulhw xmm3, xmm7 paddw xmm5, xmm0 pxor xmm0, xmm1 paddw xmm5, xmm3 pxor xmm3, xmm4 psubw xmm0, xmm1 psubw xmm3, xmm4 movdqa [TMP1 + TMP0*8], xmm0 movdqa [TMP1 + TMP0*8 + 16], xmm3 add TMP0, 4 cmp TMP0, 16 jnz .qes2_loop .qes2_done: movdqa xmm6, [plus_one] pmaddwd xmm5, xmm6 movhlps xmm6, xmm5 paddd xmm5, xmm6 movdq2q mm0, xmm5 movq mm5, mm0 psrlq mm5, 32 paddd mm0, mm5 movd eax, mm0 ; return sum POP_XMM6_XMM7 ret .qes2_q1_routine: xor TMP0, TMP0 mov _EAX, prm2 ; data ALIGN SECTION_ALIGN .qes2_q1loop: movdqa xmm0, [_EAX + TMP0*8] ; xmm0 = [1st] movdqa xmm3, [_EAX + TMP0*8 + 16] ; xmm3 = [2nd] pxor xmm1, xmm1 pxor xmm4, xmm4 pcmpgtw xmm1, xmm0 pcmpgtw xmm4, xmm3 pxor xmm0, xmm1 pxor xmm3, xmm4 psubw xmm0, xmm1 psubw xmm3, xmm4 psubusw xmm0, xmm6 psubusw xmm3, xmm6 psrlw xmm0, 1 psrlw xmm3, 1 paddw xmm5, xmm0 pxor xmm0, xmm1 paddw xmm5, xmm3 pxor xmm3, xmm4 psubw xmm0, xmm1 psubw xmm3, xmm4 movdqa [TMP1 + TMP0*8], xmm0 movdqa [TMP1 + TMP0*8 + 16], xmm3 add TMP0, 4 cmp TMP0, 16 jnz .qes2_q1loop jmp .qes2_done ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_h263_intra_mmx(int16_t *data, ; const int16_t const *coeff, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN dequant_h263_intra_mmx: mov TMP0, prm3 ; quant mov _EAX, prm2 ; coeff pcmpeqw mm0,mm0 lea TMP1, [mmx_quant] movq mm6, [TMP1 + TMP0*8] ; quant shl TMP0,31 ; quant & 1 ? 
0 : - 1 movq mm7,mm6 movq mm5,mm0 movd mm1,TMP0d mov TMP1, prm1 ; data psllw mm0,mm1 paddw mm7,mm7 ; 2*quant paddw mm6,mm0 ; quant-1 psllw mm5,12 mov TMP0,8 psrlw mm5,1 .loop: movq mm0,[_EAX] pxor mm2,mm2 pxor mm4,mm4 pcmpgtw mm2,mm0 pcmpeqw mm4,mm0 pmullw mm0,mm7 ; * 2 * quant movq mm1,[_EAX+8] psubw mm0,mm2 pxor mm2,mm6 pxor mm3,mm3 pandn mm4,mm2 pxor mm2,mm2 pcmpgtw mm3,mm1 pcmpeqw mm2,mm1 pmullw mm1,mm7 paddw mm0,mm4 psubw mm1,mm3 pxor mm3,mm6 pandn mm2,mm3 paddsw mm0, mm5 ; saturate paddw mm1,mm2 paddsw mm1, mm5 psubsw mm0, mm5 psubsw mm1, mm5 psubsw mm0, mm5 psubsw mm1, mm5 paddsw mm0, mm5 paddsw mm1, mm5 movq [TMP1],mm0 lea _EAX,[_EAX+16] movq [TMP1+8],mm1 dec TMP0 lea TMP1,[TMP1+16] jne .loop ; deal with DC mov _EAX, prm2 ; coeff movd mm1,prm4d ; dcscalar movd mm0,[_EAX] ; coeff[0] pmullw mm0,mm1 ; * dcscalar mov TMP1, prm1 ; data paddsw mm0, mm5 ; saturate + psubsw mm0, mm5 psubsw mm0, mm5 ; saturate - paddsw mm0, mm5 movd eax,mm0 mov [TMP1], ax xor _EAX, _EAX ; return 0 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_h263_intra_xmm(int16_t *data, ; const int16_t const *coeff, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN dequant_h263_intra_xmm: mov TMP0, prm3 ; quant mov _EAX, prm2 ; coeff movd mm6,TMP0d ; quant pcmpeqw mm0,mm0 pshufw mm6,mm6,0 ; all quant shl TMP0,31 movq mm5,mm0 movq mm7,mm6 movd mm1,TMP0d mov TMP1, prm1 ; data psllw mm0,mm1 ; quant & 1 ? 0 : - 1 movq mm4,mm5 paddw mm7,mm7 ; quant*2 paddw mm6,mm0 ; quant-1 psrlw mm4,5 ; mm4=2047 mov TMP0,8 pxor mm5,mm4 ; mm5=-2048 .loop: movq mm0,[_EAX] pxor mm2,mm2 pxor mm3,mm3 pcmpgtw mm2,mm0 pcmpeqw mm3,mm0 ; if coeff==0... 
pmullw mm0,mm7 ; * 2 * quant movq mm1,[_EAX+8] psubw mm0,mm2 pxor mm2,mm6 pandn mm3,mm2 ; ...then data=0 pxor mm2,mm2 paddw mm0,mm3 pxor mm3,mm3 pcmpeqw mm2,mm1 pcmpgtw mm3,mm1 pmullw mm1,mm7 pminsw mm0,mm4 psubw mm1,mm3 pxor mm3,mm6 pandn mm2,mm3 paddw mm1,mm2 pmaxsw mm0,mm5 pminsw mm1,mm4 movq [TMP1],mm0 pmaxsw mm1,mm5 lea _EAX,[_EAX+16] movq [TMP1+8],mm1 dec TMP0 lea TMP1,[TMP1+16] jne .loop ; deal with DC mov _EAX, prm2 ; coeff movd mm1,prm4d ; dcscalar movd mm0, [_EAX] pmullw mm0, mm1 mov TMP1, prm1 ; data pminsw mm0,mm4 pmaxsw mm0,mm5 movd eax, mm0 mov [TMP1], ax xor _EAX, _EAX ; return 0 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_h263_intra_sse2(int16_t *data, ; const int16_t const *coeff, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN dequant_h263_intra_sse2: PUSH_XMM6_XMM7 mov TMP0, prm3 ; quant mov _EAX, prm2 ; coeff movd xmm6,TMP0d ; quant shl TMP0,31 pshuflw xmm6,xmm6,0 pcmpeqw xmm0,xmm0 movlhps xmm6,xmm6 ; all quant movd xmm1,TMP0d movdqa xmm5,xmm0 movdqa xmm7,xmm6 mov TMP1, prm1 ; data paddw xmm7,xmm7 ; quant *2 psllw xmm0,xmm1 ; quant & 1 ? 
0 : - 1 movdqa xmm4,xmm5 paddw xmm6,xmm0 ; quant-1 psrlw xmm4,5 ; 2047 mov TMP0,4 pxor xmm5,xmm4 ; xmm5=-2048 .loop: movdqa xmm0,[_EAX] pxor xmm2,xmm2 pxor xmm3,xmm3 pcmpgtw xmm2,xmm0 pcmpeqw xmm3,xmm0 pmullw xmm0,xmm7 ; * 2 * quant movdqa xmm1,[_EAX+16] psubw xmm0,xmm2 pxor xmm2,xmm6 pandn xmm3,xmm2 pxor xmm2,xmm2 paddw xmm0,xmm3 pxor xmm3,xmm3 pcmpeqw xmm2,xmm1 pcmpgtw xmm3,xmm1 pmullw xmm1,xmm7 pminsw xmm0,xmm4 psubw xmm1,xmm3 pxor xmm3,xmm6 pandn xmm2,xmm3 paddw xmm1,xmm2 pmaxsw xmm0,xmm5 pminsw xmm1,xmm4 movdqa [TMP1],xmm0 pmaxsw xmm1,xmm5 lea _EAX,[_EAX+32] movdqa [TMP1+16],xmm1 dec TMP0 lea TMP1,[TMP1+32] jne .loop ; deal with DC mov _EAX, prm2 ; coeff movsx _EAX,word [_EAX] imul prm4d ; dcscalar mov TMP1, prm1 ; data movd xmm0,eax pminsw xmm0,xmm4 pmaxsw xmm0,xmm5 movd eax,xmm0 mov [TMP1], ax xor _EAX, _EAX ; return 0 POP_XMM6_XMM7 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_h263_inter_mmx(int16_t * data, ; const int16_t * const coeff, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN dequant_h263_inter_mmx: mov TMP0, prm3 ; quant mov _EAX, prm2 ; coeff pcmpeqw mm0,mm0 lea TMP1, [mmx_quant] movq mm6, [TMP1 + TMP0*8] ; quant shl TMP0,31 ; odd/even movq mm7,mm6 movd mm1,TMP0d mov TMP1, prm1 ; data movq mm5,mm0 psllw mm0,mm1 ; quant & 1 ? 0 : - 1 paddw mm7,mm7 ; quant*2 paddw mm6,mm0 ; quant & 1 ? quant : quant - 1 psllw mm5,12 mov TMP0,8 psrlw mm5,1 ; 32767-2047 (32768-2048) .loop: movq mm0,[_EAX] pxor mm4,mm4 pxor mm2,mm2 pcmpeqw mm4,mm0 ; if coeff==0... pcmpgtw mm2,mm0 pmullw mm0,mm7 ; * 2 * quant pxor mm3,mm3 psubw mm0,mm2 movq mm1,[_EAX+8] pxor mm2,mm6 pcmpgtw mm3,mm1 pandn mm4,mm2 ; ... 
then data==0 pmullw mm1,mm7 pxor mm2,mm2 pcmpeqw mm2,mm1 psubw mm1,mm3 pxor mm3,mm6 pandn mm2,mm3 paddw mm0,mm4 paddw mm1,mm2 paddsw mm0, mm5 ; saturate paddsw mm1, mm5 psubsw mm0, mm5 psubsw mm1, mm5 psubsw mm0, mm5 psubsw mm1, mm5 paddsw mm0, mm5 paddsw mm1, mm5 movq [TMP1],mm0 lea _EAX,[_EAX+16] movq [TMP1+8],mm1 dec TMP0 lea TMP1,[TMP1+16] jne .loop xor _EAX, _EAX ; return 0 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_h263_inter_xmm(int16_t * data, ; const int16_t * const coeff, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN dequant_h263_inter_xmm: mov TMP0, prm3 ; quant mov _EAX, prm2 ; coeff pcmpeqw mm0,mm0 lea TMP1, [mmx_quant] movq mm6, [TMP1 + TMP0*8] ; quant shl TMP0,31 movq mm5,mm0 movd mm1,TMP0d movq mm7,mm6 psllw mm0,mm1 mov TMP1, prm1 ; data movq mm4,mm5 paddw mm7,mm7 paddw mm6,mm0 ; quant-1 psrlw mm4,5 mov TMP0,8 pxor mm5,mm4 ; mm5=-2048 .loop: movq mm0,[_EAX] pxor mm3,mm3 pxor mm2,mm2 pcmpeqw mm3,mm0 pcmpgtw mm2,mm0 pmullw mm0,mm7 ; * 2 * quant pandn mm3,mm6 movq mm1,[_EAX+8] psubw mm0,mm2 pxor mm2,mm3 pxor mm3,mm3 paddw mm0,mm2 pxor mm2,mm2 pcmpgtw mm3,mm1 pcmpeqw mm2,mm1 pmullw mm1,mm7 pandn mm2,mm6 psubw mm1,mm3 pxor mm3,mm2 paddw mm1,mm3 pminsw mm0,mm4 pminsw mm1,mm4 pmaxsw mm0,mm5 pmaxsw mm1,mm5 movq [TMP1],mm0 lea _EAX,[_EAX+16] movq [TMP1+8],mm1 dec TMP0 lea TMP1,[TMP1+16] jne .loop xor _EAX, _EAX ; return 0 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_h263_inter_sse2(int16_t * data, ; const int16_t * const coeff, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN dequant_h263_inter_sse2: PUSH_XMM6_XMM7 mov TMP0, prm3 ; quant mov _EAX, prm2 ; coeff lea TMP1, [mmx_quant] movq xmm6, [TMP1 + 
TMP0*8] ; quant inc TMP0 pcmpeqw xmm5,xmm5 and TMP0,1 movlhps xmm6,xmm6 movd xmm0,TMP0d movdqa xmm7,xmm6 pshuflw xmm0,xmm0,0 movdqa xmm4,xmm5 mov TMP1, prm1 ; data movlhps xmm0,xmm0 paddw xmm7,xmm7 psubw xmm6,xmm0 psrlw xmm4,5 ; 2047 mov TMP0,4 pxor xmm5,xmm4 ; mm5=-2048 .loop: movdqa xmm0,[_EAX] pxor xmm3,xmm3 pxor xmm2,xmm2 pcmpeqw xmm3,xmm0 pcmpgtw xmm2,xmm0 pmullw xmm0,xmm7 ; * 2 * quant pandn xmm3,xmm6 movdqa xmm1,[_EAX+16] psubw xmm0,xmm2 pxor xmm2,xmm3 pxor xmm3,xmm3 paddw xmm0,xmm2 pxor xmm2,xmm2 pcmpgtw xmm3,xmm1 pcmpeqw xmm2,xmm1 pmullw xmm1,xmm7 pandn xmm2,xmm6 psubw xmm1,xmm3 pxor xmm3,xmm2 paddw xmm1,xmm3 pminsw xmm0,xmm4 pminsw xmm1,xmm4 pmaxsw xmm0,xmm5 pmaxsw xmm1,xmm5 movdqa [TMP1],xmm0 lea _EAX,[_EAX+32] movdqa [TMP1+16],xmm1 dec TMP0 lea TMP1,[TMP1+32] jne .loop xor _EAX, _EAX ; return 0 POP_XMM6_XMM7 ret ENDFUNC NON_EXEC_STACK xvidcore/src/quant/x86_asm/quantize_mpeg_mmx.asm0000664000076500007650000004224311254216113023123 0ustar xvidbuildxvidbuild;/************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - 3dne Quantization/Dequantization - ; * ; * Copyright (C) 2002-2003 Peter Ross ; * 2002-2008 Michael Militzer ; * 2002-2003 Pascal Massimino ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: quantize_mpeg_mmx.asm,v 1.16 2009-09-16 17:07:58 Isibaar Exp $ ; * ; *************************************************************************/ %define SATURATE %include "nasm.inc" ;============================================================================= ; Local data (Read Only) ;============================================================================= DATA mmx_one: times 4 dw 1 ;----------------------------------------------------------------------------- ; divide by 2Q table ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_div: times 4 dw 65535 ; the div by 2 formula will overflow for the case ; quant=1 but we don't care much because quant=1 ; is handled by a different piece of code that ; doesn't use this table. 
%assign quant 2 %rep 30 times 4 dw (1<<17) / (quant*2) + 1 %assign quant quant+1 %endrep %define VM18P 3 %define VM18Q 4 ;----------------------------------------------------------------------------- ; quantd table ;----------------------------------------------------------------------------- quantd: %assign quant 1 %rep 31 times 4 dw ((VM18P*quant) + (VM18Q/2)) / VM18Q %assign quant quant+1 %endrep ;----------------------------------------------------------------------------- ; multiply by 2Q table ;----------------------------------------------------------------------------- mmx_mul_quant: %assign quant 1 %rep 31 times 4 dw quant %assign quant quant+1 %endrep ;----------------------------------------------------------------------------- ; saturation limits ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_32767_minus_2047: times 4 dw (32767-2047) mmx_32768_minus_2048: times 4 dw (32768-2048) mmx_2047: times 4 dw 2047 mmx_minus_2048: times 4 dw (-2048) zero: times 4 dw 0 ;============================================================================= ; rounding ;============================================================================= ALIGN SECTION_ALIGN mmx_rounding: dd (1<<13) dd (1<<13) ;============================================================================= ; Code ;============================================================================= TEXT cglobal quant_mpeg_intra_mmx cglobal quant_mpeg_inter_mmx cglobal dequant_mpeg_intra_mmx cglobal dequant_mpeg_inter_mmx %macro QUANT_MMX 1 movq mm0, [_EAX + 16*(%1)] ; data movq mm2, [TMP0 + 16*(%1) + 128] ; intra_matrix_rec movq mm4, [_EAX + 16*(%1) + 8] ; data movq mm6, [TMP0 + 16*(%1) + 128 + 8] ; intra_matrix_rec movq mm1, mm0 movq mm5, mm4 pmullw mm0, mm2 ; low results pmulhw mm1, mm2 ; high results pmullw mm4, mm6 ; low results pmulhw mm5, mm6 ; high results movq mm2, mm0 movq mm6, mm4 punpckhwd mm0, mm1 punpcklwd mm2, mm1 punpckhwd mm4, mm5 punpcklwd 
mm6, mm5 paddd mm2, mm7 paddd mm0, mm7 paddd mm6, mm7 paddd mm4, mm7 psrad mm2, 14 psrad mm0, 14 psrad mm6, 14 psrad mm4, 14 packssdw mm2, mm0 packssdw mm6, mm4 movq [TMP1 + 16*(%1)], mm2 movq [TMP1 + 16*(%1)+8], mm6 %endmacro ;----------------------------------------------------------------------------- ; ; uint32_t quant_mpeg_intra_mmx(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN quant_mpeg_intra_mmx: mov _EAX, prm2 ; data mov TMP0, prm5 ; mpeg_quant_matrices mov TMP1, prm1 ; coeff movq mm7, [mmx_rounding] QUANT_MMX(0) QUANT_MMX(1) QUANT_MMX(2) QUANT_MMX(3) QUANT_MMX(4) QUANT_MMX(5) QUANT_MMX(6) QUANT_MMX(7) ; calculate DC movsx _EAX, word [_EAX] ; data[0] mov TMP0, prm4 ; dcscalar mov _EDX, _EAX shr TMP0, 1 ; TMP0 = dcscalar/2 sar _EDX, 31 ; TMP1 = sign extend of _EAX (ready for division too) xor TMP0, _EDX ; adjust TMP0 according to the sign of data[0] sub TMP0, _EDX add _EAX, TMP0 mov TMP0, prm4 ; dcscalar idiv TMP0 ; _EAX = _EDX:_EAX / dcscalar mov _EDX, prm1 ; coeff again mov word [_EDX], ax ; coeff[0] = ax xor _EAX, _EAX ; return(0); ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t quant_mpeg_inter_mmx(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN quant_mpeg_inter_mmx: mov TMP1, prm1 ; coeff mov _EAX, prm3 ; quant mov TMP0, prm4 ; mpeg_quant_matrices push _ESI %ifdef ARCH_IS_X86_64 mov _ESI, prm2 ; data %else mov _ESI, [_ESP + 4 + 8] ; data %endif push _EBX xor _EBX, _EBX pxor mm5, mm5 ; sum cmp al, 1 jz near .q1loop cmp al, 2 jz near .q2loop %ifdef ARCH_IS_X86_64 lea r9, [mmx_div] movq mm7, [r9 + _EAX * 8 - 8] %else movq mm7, [mmx_div + _EAX * 8 - 8] ; divider 
%endif ALIGN SECTION_ALIGN .loop: movq mm0, [_ESI + 8*_EBX] ; mm0 = [1st] movq mm3, [_ESI + 8*_EBX + 8] ; pxor mm1, mm1 ; mm1 = 0 pxor mm4, mm4 ; pcmpgtw mm1, mm0 ; mm1 = (0 > mm0) pcmpgtw mm4, mm3 ; pxor mm0, mm1 ; mm0 = |mm0| pxor mm3, mm4 ; psubw mm0, mm1 ; displace psubw mm3, mm4 ; psllw mm0, 4 psllw mm3, 4 movq mm2, [TMP0 + 512 + 8*_EBX] psrlw mm2, 1 paddw mm0, mm2 movq mm2, [TMP0 + 768 + _EBX*8] pmulhw mm0, mm2 ; (level<<4 + inter_matrix[i]>>1) / inter_matrix[i] movq mm2, [TMP0 + 512 + 8*_EBX + 8] psrlw mm2, 1 paddw mm3, mm2 movq mm2, [TMP0 + 768 + _EBX*8 + 8] pmulhw mm3, mm2 pmulhw mm0, mm7 ; mm0 = (mm0 / 2Q) >> 16 pmulhw mm3, mm7 ; psrlw mm0, 1 ; additional shift by 1 => 16 + 1 = 17 psrlw mm3, 1 paddw mm5, mm0 ; sum += mm0 pxor mm0, mm1 ; mm0 *= sign(mm0) paddw mm5, mm3 ; pxor mm3, mm4 ; psubw mm0, mm1 ; undisplace psubw mm3, mm4 movq [TMP1 + 8*_EBX], mm0 movq [TMP1 + 8*_EBX + 8], mm3 add _EBX, 2 cmp _EBX, 16 jnz near .loop .done: pmaddwd mm5, [mmx_one] movq mm0, mm5 psrlq mm5, 32 paddd mm0, mm5 movd eax, mm0 ; return sum pop _EBX pop _ESI ret ALIGN SECTION_ALIGN .q1loop: movq mm0, [_ESI + 8*_EBX] ; mm0 = [1st] movq mm3, [_ESI + 8*_EBX+ 8] pxor mm1, mm1 ; mm1 = 0 pxor mm4, mm4 ; pcmpgtw mm1, mm0 ; mm1 = (0 > mm0) pcmpgtw mm4, mm3 ; pxor mm0, mm1 ; mm0 = |mm0| pxor mm3, mm4 ; psubw mm0, mm1 ; displace psubw mm3, mm4 ; psllw mm0, 4 psllw mm3, 4 movq mm2, [TMP0 + 512 + 8*_EBX] psrlw mm2, 1 paddw mm0, mm2 movq mm2, [TMP0 + 768 + _EBX*8] pmulhw mm0, mm2 ; (level<<4 + inter_matrix[i]>>1) / inter_matrix[i] movq mm2, [TMP0 + 512 + 8*_EBX + 8] psrlw mm2, 1 paddw mm3, mm2 movq mm2, [TMP0 + 768 + _EBX*8 + 8] pmulhw mm3, mm2 psrlw mm0, 1 ; mm0 >>= 1 (/2) psrlw mm3, 1 ; paddw mm5, mm0 ; sum += mm0 pxor mm0, mm1 ; mm0 *= sign(mm0) paddw mm5, mm3 ; pxor mm3, mm4 ; psubw mm0, mm1 ; undisplace psubw mm3, mm4 movq [TMP1 + 8*_EBX], mm0 movq [TMP1 + 8*_EBX + 8], mm3 add _EBX, 2 cmp _EBX, 16 jnz near .q1loop jmp .done ALIGN SECTION_ALIGN .q2loop: movq mm0, [_ESI + 8*_EBX] ; mm0 
= [1st] movq mm3, [_ESI + 8*_EBX+ 8] pxor mm1, mm1 ; mm1 = 0 pxor mm4, mm4 ; pcmpgtw mm1, mm0 ; mm1 = (0 > mm0) pcmpgtw mm4, mm3 ; pxor mm0, mm1 ; mm0 = |mm0| pxor mm3, mm4 ; psubw mm0, mm1 ; displace psubw mm3, mm4 ; psllw mm0, 4 psllw mm3, 4 movq mm2, [TMP0 + 512 + 8*_EBX] psrlw mm2, 1 paddw mm0, mm2 movq mm2, [TMP0 + 768 + _EBX*8] pmulhw mm0, mm2 ; (level<<4 + inter_matrix[i]>>1) / inter_matrix[i] movq mm2, [TMP0 + 512 + 8*_EBX + 8] psrlw mm2, 1 paddw mm3, mm2 movq mm2, [TMP0 + 768 + _EBX*8 + 8] pmulhw mm3, mm2 psrlw mm0, 2 ; mm0 >>= 2 (/4) psrlw mm3, 2 ; paddw mm5, mm0 ; sum += mm0 pxor mm0, mm1 ; mm0 *= sign(mm0) paddw mm5, mm3 ; pxor mm3, mm4 ; psubw mm0, mm1 ; undisplace psubw mm3, mm4 movq [TMP1 + 8*_EBX], mm0 movq [TMP1 + 8*_EBX + 8], mm3 add _EBX, 2 cmp _EBX, 16 jnz near .q2loop jmp .done ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_mpeg_intra_mmx(int16_t *data, ; const int16_t const *coeff, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ; Note: in order to saturate 'easily', we pre-shift the quantizer ; by 4. Then, the high-word of (coeff[]*matrix[i]*quant) is used to ; build a saturating mask. It is non-zero only when an overflow occurred. ; We thus avoid packing/unpacking toward double-word. ; Moreover, we perform the mult (matrix[i]*quant) first, instead of, e.g., ; (coeff[i]*matrix[i]). This is less prone to overflow if coeff[] are not ; checked. Input ranges are: coeff in [-127,127], inter_matrix in [1..255], ; and quant in [1..31]. ; ; The original loop is: ; %if 0 movq mm0, [TMP0+8*_EAX + 8*16] ; mm0 = coeff[i] pxor mm1, mm1 pcmpgtw mm1, mm0 pxor mm0, mm1 ; change sign if negative psubw mm0, mm1 ; -> mm0 = abs(coeff[i]), mm1 = sign of coeff[i] movq mm2, mm7 ; mm2 = quant pmullw mm2, [_EBX + 8*_EAX + 8*16 ] ; matrix[i]*quant. 
movq mm6, mm2 pmulhw mm2, mm0 ; high of coeff*(matrix*quant) (should be 0 if no overflow) pmullw mm0, mm6 ; low of coeff*(matrix*quant) pxor mm5, mm5 pcmpgtw mm2, mm5 ; overflow? psrlw mm2, 5 ; =0 if no clamp, 2047 otherwise psrlw mm0, 5 paddw mm0, mm1 ; start restoring sign por mm0, mm2 ; saturate to 2047 if needed pxor mm0, mm1 ; finish negating back movq [TMP1 + 8*_EAX + 8*16], mm0 ; data[i] add _EAX, 1 %endif ;******************************************************************** ALIGN SECTION_ALIGN dequant_mpeg_intra_mmx: mov TMP1, prm1 ; data mov TMP0, prm2 ; coeff mov _EAX, prm5 ; mpeg_quant_matrices push _EBX mov _EBX, _EAX %ifdef ARCH_IS_X86_64 mov _EAX, prm3 lea prm1, [mmx_mul_quant] movq mm7, [prm1 + _EAX*8 - 8] %else mov _EAX, [_ESP + 4 + 12] ; quant movq mm7, [mmx_mul_quant + _EAX*8 - 8] %endif mov _EAX, -16 ; to keep ALIGNed, we regularly process coeff[0] psllw mm7, 2 ; << 2. See comment. pxor mm6, mm6 ; this is a NOP ALIGN SECTION_ALIGN .loop: movq mm0, [TMP0+8*_EAX + 8*16] ; mm0 = c = coeff[i] movq mm3, [TMP0+8*_EAX + 8*16 +8]; mm3 = c' = coeff[i+1] pxor mm1, mm1 pxor mm4, mm4 pcmpgtw mm1, mm0 ; mm1 = sgn(c) movq mm2, mm7 ; mm2 = quant pcmpgtw mm4, mm3 ; mm4 = sgn(c') pmullw mm2, [_EBX + 8*_EAX + 8*16 ] ; matrix[i]*quant pxor mm0, mm1 ; negate if negative pxor mm3, mm4 ; negate if negative psubw mm0, mm1 psubw mm3, mm4 ; we're short on register, here. Poor pairing... 
movq mm5, mm2 pmullw mm2, mm0 ; low of coeff*(matrix*quant) pmulhw mm0, mm5 ; high of coeff*(matrix*quant) movq mm5, mm7 ; mm5 = quant pmullw mm5, [_EBX + 8*_EAX + 8*16 +8] ; matrix[i+1]*quant movq mm6, mm5 add _EAX,2 ; z-flag will be tested later pmullw mm6, mm3 ; low of coeff*(matrix*quant) pmulhw mm3, mm5 ; high of coeff*(matrix*quant) pcmpgtw mm0, [zero] paddusw mm2, mm0 psrlw mm2, 5 pcmpgtw mm3, [zero] paddusw mm6, mm3 psrlw mm6, 5 pxor mm2, mm1 ; start negating back pxor mm6, mm4 ; start negating back psubusw mm1, mm0 psubusw mm4, mm3 psubw mm2, mm1 ; finish negating back psubw mm6, mm4 ; finish negating back movq [TMP1 + 8*_EAX + 8*16 -2*8 ], mm2 ; data[i] movq [TMP1 + 8*_EAX + 8*16 -2*8 +8], mm6 ; data[i+1] jnz near .loop pop _EBX ; deal with DC movd mm0, [TMP0] %ifdef ARCH_IS_X86_64 movq mm6, prm4 pmullw mm0, mm6 %else pmullw mm0, prm4 ; dcscalar %endif movq mm2, [mmx_32767_minus_2047] paddsw mm0, mm2 psubsw mm0, mm2 movq mm2, [mmx_32768_minus_2048] psubsw mm0, mm2 paddsw mm0, mm2 movd eax, mm0 mov [TMP1], ax xor _EAX, _EAX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_mpeg_inter_mmx(int16_t * data, ; const int16_t * const coeff, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ; Note: We use (2*c + sgn(c) - sgn(-c)) as multiplier ; so we handle the 3 cases: c<0, c==0, and c>0 in one shot. ; sgn(x) is the result of 'pcmpgtw 0,x': 0 if x>=0, -1 if x<0. ; It's mixed with the extraction of the absolute value. 
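; Worked example for the note above (hand-traced from the instruction
; sequence that follows; an illustration only, not part of the original
; source). The values quant=2 and matrix[i]=16 are arbitrary assumptions,
; and sgn() follows the note's convention (0 or -1).
;
;   c = +3: sgn(c) =  0  ->  c + sgn(c) =  3  ->  *2 =  6
;           sgn(-c) = -1 ->  6 - (-1)   =  7  ->  xor sgn(c) =  7
;   c = -3: sgn(c) = -1  ->  c + sgn(c) = -4  ->  *2 = -8
;           sgn(-c) =  0 ->  -8 - 0     = -8  ->  xor sgn(c) =  7
;   c =  0: every step yields 0, so data[i] stays 0
;
; In both non-zero cases the multiplier is 2*|c| + 1 = 7. With mm7
; holding 2*quant = 4:
;
;   7 * (matrix[i] * 2*quant) = 7 * 64 = 448 ;  448 >> 5 = 14
;
; which equals (2*|c| + 1) * matrix[i] * quant / 16. The sign of c is
; then restored, giving +14 and -14 respectively.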
ALIGN SECTION_ALIGN dequant_mpeg_inter_mmx: mov TMP1, prm1 ; data mov TMP0, prm2 ; coeff mov _EAX, prm3 ; quant push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm4 lea r9, [mmx_mul_quant] movq mm7, [r9 + _EAX*8 - 8] %else mov _EBX, [_ESP + 4 + 16] ; mpeg_quant_matrices movq mm7, [mmx_mul_quant + _EAX*8 - 8] %endif mov _EAX, -16 paddw mm7, mm7 ; << 1 pxor mm6, mm6 ; mismatch sum ALIGN SECTION_ALIGN .loop: movq mm0, [TMP0+8*_EAX + 8*16 ] ; mm0 = coeff[i] movq mm2, [TMP0+8*_EAX + 8*16 +8] ; mm2 = coeff[i+1] add _EAX, 2 pxor mm1, mm1 pxor mm3, mm3 pcmpgtw mm1, mm0 ; mm1 = sgn(c) (preserved) pcmpgtw mm3, mm2 ; mm3 = sgn(c') (preserved) paddsw mm0, mm1 ; c += sgn(c) paddsw mm2, mm3 ; c += sgn(c') paddw mm0, mm0 ; c *= 2 paddw mm2, mm2 ; c'*= 2 pxor mm4, mm4 pxor mm5, mm5 psubw mm4, mm0 ; -c psubw mm5, mm2 ; -c' psraw mm4, 16 ; mm4 = sgn(-c) psraw mm5, 16 ; mm5 = sgn(-c') psubsw mm0, mm4 ; c -= sgn(-c) psubsw mm2, mm5 ; c' -= sgn(-c') pxor mm0, mm1 ; finish changing sign if needed pxor mm2, mm3 ; finish changing sign if needed ; we're short on register, here. Poor pairing... 
movq mm4, mm7 ; (matrix*quant) pmullw mm4, [_EBX + 512 + 8*_EAX + 8*16 -2*8] movq mm5, mm4 pmulhw mm5, mm0 ; high of c*(matrix*quant) pmullw mm0, mm4 ; low of c*(matrix*quant) movq mm4, mm7 ; (matrix*quant) pmullw mm4, [_EBX + 512 + 8*_EAX + 8*16 -2*8 + 8] pcmpgtw mm5, [zero] paddusw mm0, mm5 psrlw mm0, 5 pxor mm0, mm1 ; start restoring sign psubusw mm1, mm5 movq mm5, mm4 pmulhw mm5, mm2 ; high of c*(matrix*quant) pmullw mm2, mm4 ; low of c*(matrix*quant) psubw mm0, mm1 ; finish restoring sign pcmpgtw mm5, [zero] paddusw mm2, mm5 psrlw mm2, 5 pxor mm2, mm3 ; start restoring sign psubusw mm3, mm5 psubw mm2, mm3 ; finish restoring sign pxor mm6, mm0 ; mismatch control movq [TMP1 + 8*_EAX + 8*16 -2*8 ], mm0 ; data[i] pxor mm6, mm2 ; mismatch control movq [TMP1 + 8*_EAX + 8*16 -2*8 +8], mm2 ; data[i+1] jnz near .loop ; mismatch control movq mm0, mm6 psrlq mm0, 48 movq mm1, mm6 movq mm2, mm6 psrlq mm1, 32 pxor mm6, mm0 psrlq mm2, 16 pxor mm6, mm1 pxor mm6, mm2 movd eax, mm6 and _EAX, 1 xor _EAX, 1 xor word [TMP1 + 2*63], ax xor _EAX, _EAX pop _EBX ret ENDFUNC NON_EXEC_STACK xvidcore/src/quant/x86_asm/quantize_mpeg_xmm.asm0000664000076500007650000004046711254216113023131 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - 3dne Quantization/Dequantization - ; * ; * Copyright (C) 2002-2003 Peter Ross ; * 2002 Jaan Kalda ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: quantize_mpeg_xmm.asm,v 1.13 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ ; _3dne functions are compatible with iSSE, but are optimized specifically ; for K7 pipelines %define SATURATE %include "nasm.inc" ;============================================================================= ; Local data ;============================================================================= DATA ALIGN SECTION_ALIGN mmzero: dd 0,0 mmx_one: times 4 dw 1 ;----------------------------------------------------------------------------- ; divide by 2Q table ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_divs: ;i>2 %assign i 1 %rep 31 times 4 dw ((1 << 15) / i + 1) %assign i i+1 %endrep ALIGN SECTION_ALIGN mmx_div: ;quant>2 times 4 dw 65535 ; the div by 2 formula will overflow for the case ; quant=1 but we don't care much because quant=1 ; is handled by a different piece of code that ; doesn't use this table. 
%assign quant 2 %rep 31 times 4 dw ((1 << 16) / quant + 1) %assign quant quant+1 %endrep %macro FIXX 1 dw (1 << 16) / (%1) + 1 %endmacro %ifndef ARCH_IS_X86_64 %define nop4 db 08Dh, 074h, 026h,0 %define nop7 db 08dh, 02ch, 02dh,0,0,0,0 %else %define nop4 %define nop7 %endif %define nop3 add _ESP, byte 0 %define nop2 mov _ESP, _ESP %define nop6 add _EBP, dword 0 ;----------------------------------------------------------------------------- ; quantd table ;----------------------------------------------------------------------------- %define VM18P 3 %define VM18Q 4 ALIGN SECTION_ALIGN quantd: %assign i 1 %rep 31 times 4 dw (((VM18P*i) + (VM18Q/2)) / VM18Q) %assign i i+1 %endrep ;----------------------------------------------------------------------------- ; multiply by 2Q table ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_mul_quant: %assign i 1 %rep 31 times 4 dw i %assign i i+1 %endrep ;----------------------------------------------------------------------------- ; saturation limits ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_32767_minus_2047: times 4 dw (32767-2047) mmx_32768_minus_2048: times 4 dw (32768-2048) mmx_2047: times 4 dw 2047 mmx_minus_2048: times 4 dw (-2048) zero: times 4 dw 0 int_div: dd 0 %assign i 1 %rep 255 dd (1 << 17) / ( i) + 1 %assign i i+1 %endrep ;============================================================================= ; Code ;============================================================================= TEXT cglobal quant_mpeg_inter_xmm cglobal dequant_mpeg_intra_3dne cglobal dequant_mpeg_inter_3dne ;----------------------------------------------------------------------------- ; ; uint32_t quant_mpeg_inter_xmm(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN 
quant_mpeg_inter_xmm: mov _EAX, prm2 ; data mov TMP0, prm3 ; quant mov TMP1, prm1 ; coeff push _ESI push _EDI push _EBX nop %ifdef ARCH_IS_X86_64 mov _EDI, prm4 %else mov _EDI, [_ESP + 12 + 16] %endif mov _ESI, -14 mov _EBX, _ESP sub _ESP, byte 24 lea _EBX, [_ESP+8] and _EBX, byte -8 ;ALIGN 8 pxor mm0, mm0 pxor mm3, mm3 movq [byte _EBX],mm0 movq [_EBX+8],mm0 cmp TMP0, byte 1 je near .q1loop cmp TMP0, byte 19 jg near .lloop nop ALIGN SECTION_ALIGN .loop: movq mm1, [_EAX + 8*_ESI+112] ; mm0 = [1st] psubw mm0, mm1 ;-mm1 movq mm4, [_EAX + 8*_ESI + 120] ; psubw mm3, mm4 ;-mm4 pmaxsw mm0, mm1 ;|src| pmaxsw mm3, mm4 nop2 psraw mm1, 15 ;sign src psraw mm4, 15 psllw mm0, 4 ; level << 4 psllw mm3, 4 ; paddw mm0, [_EDI + 640 + 8*_ESI+112] paddw mm3, [_EDI + 640 + 8*_ESI+120] movq mm5, [_EDI + 896 + 8*_ESI+112] movq mm7, [_EDI + 896 + 8*_ESI+120] pmulhuw mm5, mm0 pmulhuw mm7, mm3 mov _ESP, _ESP movq mm2, [_EDI + 512 + 8*_ESI+112] movq mm6, [_EDI + 512 + 8*_ESI+120] pmullw mm2, mm5 pmullw mm6, mm7 psubw mm0, mm2 psubw mm3, mm6 movq mm2, [byte _EBX] %ifdef ARCH_IS_X86_64 lea r9, [mmx_divs] movq mm6, [r9 + TMP0 * 8 - 8] %else movq mm6, [mmx_divs + TMP0 * 8 - 8] %endif pmulhuw mm0, [_EDI + 768 + 8*_ESI+112] pmulhuw mm3, [_EDI + 768 + 8*_ESI+120] paddw mm2, [_EBX+8] ;sum paddw mm5, mm0 paddw mm7, mm3 pxor mm0, mm0 pxor mm3, mm3 pmulhuw mm5, mm6 ; mm0 = (mm0 / 2Q) >> 16 pmulhuw mm7, mm6 ; (level ) / quant (0> 16 pmulhuw mm7, mm6 ; (level ) / quant (00 in one shot. ; sgn(x) is the result of 'pcmpgtw 0,x': 0 if x>=0, -1 if x<0. ; It's mixed with the extraction of the absolute value. 
ALIGN SECTION_ALIGN dequant_mpeg_inter_3dne: mov _EAX, prm3 ; quant %ifdef ARCH_IS_X86_64 lea TMP0, [mmx_mul_quant] movq mm7, [TMP0 + _EAX*8 - 8] %else movq mm7, [mmx_mul_quant + _EAX*8 - 8] %endif mov TMP1, prm1 ; data mov TMP0, prm2 ; coeff mov _EAX, -14 paddw mm7, mm7 ; << 1 pxor mm6, mm6 ; mismatch sum push _ESI push _EDI lea _ESI, [mmzero] pxor mm1, mm1 pxor mm3, mm3 %ifdef ARCH_IS_X86_64 mov _EDI, prm4 %else mov _EDI, [_ESP + 8 + 16] ; mpeg_quant_matrices %endif nop nop4 ALIGN SECTION_ALIGN .loop: movq mm0, [TMP0+8*_EAX + 7*16 ] ; mm0 = coeff[i] pcmpgtw mm1, mm0 ; mm1 = sgn(c) (preserved) movq mm2, [TMP0+8*_EAX + 7*16 +8] ; mm2 = coeff[i+1] pcmpgtw mm3, mm2 ; mm3 = sgn(c') (preserved) paddsw mm0, mm1 ; c += sgn(c) paddsw mm2, mm3 ; c += sgn(c') paddw mm0, mm0 ; c *= 2 paddw mm2, mm2 ; c'*= 2 movq mm4, [_ESI] movq mm5, [_ESI] psubw mm4, mm0 ; -c psubw mm5, mm2 ; -c' psraw mm4, 16 ; mm4 = sgn(-c) psraw mm5, 16 ; mm5 = sgn(-c') psubsw mm0, mm4 ; c -= sgn(-c) psubsw mm2, mm5 ; c' -= sgn(-c') pxor mm0, mm1 ; finish changing sign if needed pxor mm2, mm3 ; finish changing sign if needed ; we're short on register, here. Poor pairing... 
movq mm4, mm7 ; (matrix*quant) nop pmullw mm4, [_EDI + 512 + 8*_EAX + 7*16] movq mm5, mm4 pmulhw mm5, mm0 ; high of c*(matrix*quant) pmullw mm0, mm4 ; low of c*(matrix*quant) movq mm4, mm7 ; (matrix*quant) pmullw mm4, [_EDI + 512 + 8*_EAX + 7*16 + 8] add _EAX, byte 2 pcmpgtw mm5, [_ESI] paddusw mm0, mm5 psrlw mm0, 5 pxor mm0, mm1 ; start restoring sign psubusw mm1, mm5 movq mm5, mm4 pmulhw mm5, mm2 ; high of c*(matrix*quant) pmullw mm2, mm4 ; low of c*(matrix*quant) psubw mm0, mm1 ; finish restoring sign pcmpgtw mm5, [_ESI] paddusw mm2, mm5 psrlw mm2, 5 pxor mm2, mm3 ; start restoring sign psubusw mm3, mm5 psubw mm2, mm3 ; finish restoring sign movq mm1, [_ESI] movq mm3, [byte _ESI] pxor mm6, mm0 ; mismatch control movq [TMP1 + 8*_EAX + 7*16 -2*8 ], mm0 ; data[i] pxor mm6, mm2 ; mismatch control movq [TMP1 + 8*_EAX + 7*16 -2*8 +8], mm2 ; data[i+1] jng .loop nop ; mismatch control pshufw mm0, mm6, 01010101b pshufw mm1, mm6, 10101010b pshufw mm2, mm6, 11111111b pxor mm6, mm0 pxor mm1, mm2 pxor mm6, mm1 movd eax, mm6 pop _EDI and _EAX, byte 1 xor _EAX, byte 1 mov _ESI, [_ESP] add _ESP, byte PTR_SIZE xor word [TMP1 + 2*63], ax xor _EAX, _EAX ret ENDFUNC NON_EXEC_STACK xvidcore/src/quant/x86_asm/quantize_h263_3dne.asm0000664000076500007650000005641611254216113022714 0ustar xvidbuildxvidbuild;/************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - 3dne Quantization/Dequantization - ; * ; * Copyright(C) 2002-2003 Jaan Kalda ; * ; * This program is free software ; you can r_EDIstribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: quantize_h263_3dne.asm,v 1.12 2009-09-16 17:07:58 Isibaar Exp $ ; * ; *************************************************************************/ ; ; these 3dne functions are compatible with iSSE, but are optimized specifically for ; K7 pipelines ; enable dequant saturate [-2048,2047], test purposes only. %define SATURATE %include "nasm.inc" ;============================================================================= ; Local data ;============================================================================= DATA align SECTION_ALIGN int_div: dd 0 %assign i 1 %rep 255 dd (1 << 16) / (i) + 1 %assign i i+1 %endrep ALIGN SECTION_ALIGN plus_one: times 8 dw 1 ;----------------------------------------------------------------------------- ; subtract by Q/2 table ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_sub: %assign i 1 %rep 31 times 4 dw i / 2 %assign i i+1 %endrep ;----------------------------------------------------------------------------- ; ; divide by 2Q table ; ; use a shift of 16 to take full advantage of _pmulhw_ ; for q=1, _pmulhw_ will overflow so it is treated separately ; (3dnow2 provides _pmulhuw_ which won't cause overflow) ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_div: %assign i 1 %rep 31 times 4 dw (1 << 16) / (i * 2) + 1 %assign i i+1 %endrep ;----------------------------------------------------------------------------- ; add by (odd(Q) ? 
Q : Q - 1) table ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_add: %assign i 1 %rep 31 %if i % 2 != 0 times 4 dw i %else times 4 dw i - 1 %endif %assign i i+1 %endrep ;----------------------------------------------------------------------------- ; multiple by 2Q table ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_mul: %assign i 1 %rep 31 times 4 dw i * 2 %assign i i+1 %endrep ;----------------------------------------------------------------------------- ; saturation limits ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN mmx_32768_minus_2048: times 4 dw (32768-2048) mmx_32767_minus_2047: times 4 dw (32767-2047) ALIGN SECTION_ALIGN mmx_2047: times 4 dw 2047 ALIGN SECTION_ALIGN mmzero: dd 0, 0 int2047: dd 2047 int_2048: dd -2048 ;============================================================================= ; Code ;============================================================================= TEXT ;----------------------------------------------------------------------------- ; ; uint32_t quant_h263_intra_3dne(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ;This is Athlon-optimized code (ca 70 clk per call) %macro quant_intra1 1 psubw mm1, mm0 ;A3 psubw mm3, mm2 ;B3 %if (%1) psubw mm5, mm4 ;C8 psubw mm7, mm6 ;D8 %endif ALIGN SECTION_ALIGN movq mm4, [_ECX + %1 * 32 +16] ;C1 pmaxsw mm1, mm0 ;A4 movq mm6, [_ECX + %1 * 32 +24] ;D1 pmaxsw mm3, mm2 ;B4 psraw mm0, 15 ;A5 psraw mm2, 15 ;B5 %if (%1) movq [_EDX + %1 * 32 + 16-32], mm5 ;C9 movq [_EDX + %1 * 32 + 24-32], mm7 ;D9 %endif psrlw mm1, 1 ;A6 psrlw mm3, 1 ;B6 movq mm5, [_EBX] ;C2 movq mm7, [_EBX] ;D2 pxor mm1, mm0 ;A7 pxor mm3, mm2 ;B7 psubw mm5, mm4 ;C3 psubw mm7, mm6 ;D3 psubw mm1, mm0 ;A8 psubw 
mm3, mm2 ;B8 %if (%1 == 0) push _EBP movq mm0, [_ECX + %1 * 32 +32] %elif (%1 < 3) movq mm0, [_ECX + %1 * 32 +32] ;A1 %endif pmaxsw mm5, mm4 ;C4 %if (%1 < 3) movq mm2, [_ECX + %1 * 32 +8+32] ;B1 %else cmp _ESP, _ESP %endif pmaxsw mm7, mm6 ;D4 psraw mm4, 15 ;C5 psraw mm6, 15 ;D5 movq [byte _EDX + %1 * 32], mm1 ;A9 movq [_EDX + %1 * 32+8], mm3 ;B9 psrlw mm5, 1 ;C6 psrlw mm7, 1 ;D6 %if (%1 < 3) movq mm1, [_EBX] ;A2 movq mm3, [_EBX] ;B2 %endif %if (%1 == 3) %ifdef ARCH_IS_X86_64 lea r9, [int_div] imul eax, dword [r9+4*_EDI] %else imul _EAX, [int_div+4*_EDI] %endif %endif pxor mm5, mm4 ;C7 pxor mm7, mm6 ;D7 %endm %macro quant_intra 1 ; Rules for athlon: ; 1) schedule latencies ; 2) add/mul and load/store in 2:1 proportion ; 3) avoid spliting >3byte instructions over 8byte boundaries psubw mm1, mm0 ;A3 psubw mm3, mm2 ;B3 %if (%1) psubw mm5, mm4 ;C8 psubw mm7, mm6 ;D8 %endif ALIGN SECTION_ALIGN movq mm4, [_ECX + %1 * 32 +16] ;C1 pmaxsw mm1, mm0 ;A4 movq mm6, [_ECX + %1 * 32 +24] ;D1 pmaxsw mm3, mm2 ;B4 psraw mm0, 15 ;A5 psraw mm2, 15 ;B5 %if (%1) movq [_EDX + %1 * 32 + 16-32], mm5 ;C9 movq [_EDX + %1 * 32 + 24-32], mm7 ;D9 %endif pmulhw mm1, [_ESI] ;A6 pmulhw mm3, [_ESI] ;B6 movq mm5, [_EBX] ;C2 movq mm7, [_EBX] ;D2 nop nop pxor mm1, mm0 ;A7 pxor mm3, mm2 ;B7 psubw mm5, mm4 ;C3 psubw mm7, mm6 ;D3 psubw mm1, mm0 ;A8 psubw mm3, mm2 ;B8 %if (%1 < 3) movq mm0, [_ECX + %1 * 32 +32] ;A1 %endif pmaxsw mm5, mm4 ;C4 %if (%1 < 3) movq mm2, [_ECX + %1 * 32 +8+32] ;B1 %else cmp _ESP, _ESP %endif pmaxsw mm7,mm6 ;D4 psraw mm4, 15 ;C5 psraw mm6, 15 ;D5 movq [byte _EDX + %1 * 32], mm1 ;A9 movq [_EDX + %1 * 32+8], mm3 ;B9 pmulhw mm5, [_ESI] ;C6 pmulhw mm7, [_ESI] ;D6 %if (%1 < 3) movq mm1, [_EBX] ;A2 movq mm3, [_EBX] ;B2 %endif %if (%1 == 0) push _EBP %elif (%1 < 3) nop %endif nop %if (%1 == 3) %ifdef ARCH_IS_X86_64 lea r9, [int_div] imul eax, dword [r9+4*_EDI] %else imul _EAX, [int_div+4*_EDI] %endif %endif pxor mm5, mm4 ;C7 pxor mm7, mm6 ;D7 %endmacro ALIGN SECTION_ALIGN cglobal 
quant_h263_intra_3dne quant_h263_intra_3dne: %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] add _ESP, PTR_SIZE %ifndef WINDOWS push prm6 push prm5 %endif push prm4 push prm3 push prm2 push prm1 sub _ESP, PTR_SIZE mov [_ESP], TMP0 %endif mov _EAX, [_ESP + 3*PTR_SIZE] ; quant mov _ECX, [_ESP + 2*PTR_SIZE] ; data mov _EDX, [_ESP + 1*PTR_SIZE] ; coeff cmp al, 1 pxor mm1, mm1 pxor mm3, mm3 movq mm0, [_ECX] ; mm0 = [1st] movq mm2, [_ECX + 8] push _ESI %ifdef ARCH_IS_X86_64 lea _ESI, [mmx_div] lea _ESI, [_ESI + _EAX*8 - 8] %else lea _ESI, [mmx_div + _EAX*8 - 8] %endif push _EBX lea _EBX, [mmzero] push _EDI jz near .q1loop quant_intra 0 mov _EBP, [_ESP + (4+4)*PTR_SIZE] ; dcscalar ; NB -- there are 3 pushes in the function preambule and one more ; in "quant_intra 0", thus an added offset of 16 bytes movsx _EAX, word [byte _ECX] ; DC quant_intra 1 mov _EDI, _EAX sar _EDI, 31 ; sign(DC) shr _EBP, byte 1 ; _EBP = dcscalar/2 quant_intra 2 sub _EAX, _EDI ; DC (+1) xor _EBP, _EDI ; sign(DC) dcscalar /2 (-1) mov _EDI, [_ESP + (4+4)*PTR_SIZE] ; dscalar lea _EAX, [byte _EAX + _EBP] ; DC + sign(DC) dcscalar/2 mov _EBP, [byte _ESP] quant_intra 3 psubw mm5, mm4 ;C8 mov _ESI, [_ESP + 3*PTR_SIZE] ; pop back the register value mov _EDI, [_ESP + 1*PTR_SIZE] ; pop back the register value sar _EAX, 16 lea _EBX, [byte _EAX + 1] ; workaround for _EAX < 0 cmovs _EAX, _EBX ; conditionnaly move the corrected value mov [_EDX], ax ; coeff[0] = ax mov _EBX, [_ESP + 2*PTR_SIZE] ; pop back the register value add _ESP, byte 4*PTR_SIZE ; "quant_intra 0" pushed _EBP, but we don't restore that one, just correct the stack offset by 16 psubw mm7, mm6 ;D8 movq [_EDX + 3 * 32 + 16], mm5 ;C9 movq [_EDX + 3 * 32 + 24], mm7 ;D9 xor _EAX, _EAX %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] %ifndef WINDOWS add _ESP, 6*PTR_SIZE %else add _ESP, 4*PTR_SIZE %endif mov [_ESP], TMP0 %endif ret ALIGN SECTION_ALIGN .q1loop: quant_intra1 0 mov _EBP, [_ESP + (4+4)*PTR_SIZE] ; dcscalar movsx _EAX, word [byte _ECX] ; DC quant_intra1 1 mov 
_EDI, _EAX sar _EDI, 31 ; sign(DC) shr _EBP, byte 1 ; _EBP = dcscalar /2 quant_intra1 2 sub _EAX, _EDI ; DC (+1) xor _EBP, _EDI ; sign(DC) dcscalar /2 (-1) mov _EDI, [_ESP + (4+4)*PTR_SIZE] ; dcscalar lea _EAX, [byte _EAX + _EBP] ; DC + sign(DC) dcscalar /2 mov _EBP, [byte _ESP] quant_intra1 3 psubw mm5, mm4 ;C8 mov _ESI, [_ESP + 3*PTR_SIZE] ; pop back the register value mov _EDI, [_ESP + 1*PTR_SIZE] ; pop back the register value sar _EAX, 16 lea _EBX, [byte _EAX + 1] ; workaround for _EAX < 0 cmovs _EAX, _EBX ; conditionnaly move the corrected value mov [_EDX], ax ; coeff[0] = ax mov _EBX, [_ESP + 2*PTR_SIZE] ; pop back the register value add _ESP, byte 4*PTR_SIZE ; "quant_intra 0" pushed _EBP, but we don't restore that one, just correct the stack offset by 16 psubw mm7, mm6 ;D8 movq [_EDX + 3 * 32 + 16], mm5 ;C9 movq [_EDX + 3 * 32 + 24], mm7 ;D9 xor _EAX, _EAX %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] %ifndef WINDOWS add _ESP, 6*PTR_SIZE %else add _ESP, 4*PTR_SIZE %endif mov [_ESP], TMP0 %endif ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t quant_h263_inter_3dne(int16_t * coeff, ; const int16_t const * data, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ;This is Athlon-optimized code (ca 90 clk per call) ;Optimized by Jaan, 30 Nov 2002 %macro quantinter 1 movq mm1, [_EAX] ;A2 psraw mm3, 15 ;B6 %if (%1) psubw mm2, mm6 ;C10 %endif psubw mm1, mm0 ;A3 pmulhw mm4, mm7 ;B7 movq mm6, [_ECX + %1*24+16] ;C1 pmaxsw mm1, mm0 ;A4 paddw mm5, mm4 ;B8 %if (%1) movq [_EDX + %1*24+16-24], mm2 ;C11 %endif psubusw mm1, [_EBX] ;A5 mm0 -= sub (unsigned, dont go < 0) pxor mm4, mm3 ;B9 movq mm2, [_EAX] ;C2 psraw mm0, 15 ;A6 psubw mm4, mm3 ;B10 psubw mm2, mm6 ;C3 pmulhw mm1, mm7 ;A7 mm0 = (mm0 / 2Q) >> 24 movq mm3, [_ECX + %1*24+8] ;B1 pmaxsw mm2, mm6 ;C4 paddw mm5, mm1 ;A8 sum += mm0 %if (%1) movq [_EDX + %1*24+8-24], mm4 ;B11 %else movq 
[_EDX + 120], mm4 ;B11 %endif psubusw mm2, [_EBX] ;C5 pxor mm1, mm0 ;A9 mm0 *= sign(mm0) movq mm4, [_EAX] ;B2 psraw mm6, 15 ;C6 psubw mm1, mm0 ;A10 undisplace psubw mm4, mm3 ;B3 pmulhw mm2, mm7 ;C7 movq mm0, [_ECX + %1*24+24] ;A1 mm0 = [1st] pmaxsw mm4, mm3 ;B4 paddw mm5, mm2 ;C8 movq [byte _EDX + %1*24], mm1 ;A11 psubusw mm4, [_EBX] ;B5 pxor mm2, mm6 ;C9 %endmacro %macro quantinter1 1 movq mm0, [byte _ECX + %1*16] ;mm0 = [1st] movq mm3, [_ECX + %1*16+8] ; movq mm1, [_EAX] movq mm4, [_EAX] psubw mm1, mm0 psubw mm4, mm3 pmaxsw mm1, mm0 pmaxsw mm4, mm3 psubusw mm1, mm6 ; mm0 -= sub (unsigned, dont go < 0) psubusw mm4, mm6 ; psraw mm0, 15 psraw mm3, 15 psrlw mm1, 1 ; mm0 = (mm0 / 2Q) >> 16 psrlw mm4, 1 ; paddw mm5, mm1 ; sum += mm0 pxor mm1, mm0 ; mm0 *= sign(mm0) paddw mm5, mm4 pxor mm4, mm3 ; psubw mm1, mm0 ; undisplace psubw mm4, mm3 cmp _ESP, _ESP movq [byte _EDX + %1*16], mm1 movq [_EDX + %1*16+8], mm4 %endmacro ALIGN SECTION_ALIGN cglobal quant_h263_inter_3dne quant_h263_inter_3dne: %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] add _ESP, PTR_SIZE %ifndef WINDOWS push prm6 push prm5 %endif push prm4 push prm3 push prm2 push prm1 sub _ESP, PTR_SIZE mov [_ESP], TMP0 %endif mov _EDX, [_ESP + 1*PTR_SIZE] ; coeff mov _ECX, [_ESP + 2*PTR_SIZE] ; data mov _EAX, [_ESP + 3*PTR_SIZE] ; quant push _EBX pxor mm5, mm5 ; sum nop %ifdef ARCH_IS_X86_64 lea _EBX, [mmx_div] movq mm7, [_EBX + _EAX * 8 - 8] lea _EBX, [mmx_sub] lea _EBX, [_EBX + _EAX * 8 - 8] %else lea _EBX,[mmx_sub + _EAX * 8 - 8] ; sub movq mm7, [mmx_div + _EAX * 8 - 8] ; divider %endif cmp al, 1 lea _EAX, [mmzero] jz near .q1loop cmp _ESP, _ESP ALIGN SECTION_ALIGN movq mm3, [_ECX + 120] ;B1 pxor mm4, mm4 ;B2 psubw mm4, mm3 ;B3 movq mm0, [_ECX] ;A1 mm0 = [1st] pmaxsw mm4, mm3 ;B4 psubusw mm4, [_EBX] ;B5 quantinter 0 quantinter 1 quantinter 2 quantinter 3 quantinter 4 psraw mm3, 15 ;B6 psubw mm2, mm6 ;C10 pmulhw mm4, mm7 ;B7 paddw mm5, mm4 ;B8 pxor mm4, mm3 ;B9 psubw mm4, mm3 ;B10 movq [_EDX + 4*24+16], mm2 ;C11 pop _EBX 
movq [_EDX + 4*24+8], mm4 ;B11 pmaddwd mm5, [plus_one] movq mm0, mm5 punpckhdq mm5, mm5 paddd mm0, mm5 movd eax, mm0 ; return sum %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] %ifndef WINDOWS add _ESP, 6*PTR_SIZE %else add _ESP, 4*PTR_SIZE %endif mov [_ESP], TMP0 %endif ret ALIGN SECTION_ALIGN .q1loop: movq mm6, [byte _EBX] quantinter1 0 quantinter1 1 quantinter1 2 quantinter1 3 quantinter1 4 quantinter1 5 quantinter1 6 quantinter1 7 pmaddwd mm5, [plus_one] movq mm0, mm5 psrlq mm5, 32 paddd mm0, mm5 movd eax, mm0 ; return sum pop _EBX %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] %ifndef WINDOWS add _ESP, 6*PTR_SIZE %else add _ESP, 4*PTR_SIZE %endif mov [_ESP], TMP0 %endif ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_h263_intra_3dne(int16_t *data, ; const int16_t const *coeff, ; const uint32_t quant, ; const uint32_t dcscalar, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ; this is the same as dequant_inter_3dne, except that we're ; saturating using 'pminsw' (saves 2 cycles/loop => ~5% faster) ;This is Athlon-optimized code (ca 106 clk per call) %macro dequant 1 movq mm1, [_ECX+%1*24] ; c = coeff[i] ;A2 psubw mm0, mm1 ;-c ;A3 (1st dep) %if (%1) paddw mm4, mm6 ;C11 mm6 free (4th+) %endif pmaxsw mm0, mm1 ;|c| ;A4 (2nd) %if (%1) mov _EBP, _EBP pminsw mm4, [_EBX] ;C12 saturates to +2047 (5th+) later %endif movq mm6, [_ESI] ;0 ;A5 mm6 in use pandn mm7, [_EAX] ;B9 offset = isZero ? 0 : quant_add (2nd) %if (%1) pxor mm5, mm4 ;C13 (6th+) 1later %endif movq mm4, [_ESI] ;C1 ;0 mov _ESP, _ESP pcmpeqw mm6, [_ECX+%1*24] ;A6 (c ==0) ? 
-1 : 0 (1st) ALIGN SECTION_ALIGN psraw mm1, 15 ; sign(c) ;A7 (2nd) %if (%1) movq [_EDX+%1*24+16-24], mm5 ; C14 (7th) 2later %endif paddw mm7, mm3 ;B10 offset +negate back (3rd) pmullw mm0, [_EDI] ;*= 2Q ;A8 (3rd+) paddw mm2, mm7 ;B11 mm7 free (4th+) lea _EBP, [byte _EBP] movq mm5, [_ECX+%1*24+16] ;C2 ; c = coeff[i] psubw mm4, mm5 ;-c ;C3 (1st dep) pandn mm6, [_EAX] ;A9 offset = isZero ? 0 : quant_add (2nd) pminsw mm2, [_EBX] ;B12 saturates to +2047 (5th+) pxor mm3, mm2 ;B13 (6th+) movq mm2, [byte _ESI] ;B1 ;0 %if (%1) movq [_EDX+%1*24+8-24], mm3 ;B14 (7th) %else movq [_EDX+120], mm3 %endif pmaxsw mm4, mm5 ;|c| ;C4 (2nd) paddw mm6, mm1 ;A10 offset +negate back (3rd) movq mm3, [_ECX+%1*24 + 8] ;B2 ; c = coeff[i] psubw mm2, mm3 ;-c ;B3 (1st dep) paddw mm0, mm6 ;A11 mm6 free (4th+) movq mm6, [byte _ESI] ;0 ;C5 mm6 in use pcmpeqw mm6, [_ECX+%1*24+16] ;C6 (c ==0) ? -1 : 0 (1st) pminsw mm0, [_EBX] ;A12 saturates to +2047 (5th+) pmaxsw mm2, mm3 ;|c| ;B4 (2nd) pxor mm1, mm0 ;A13 (6th+) pmullw mm4, [_EDI] ;*= 2Q ;C8 (3rd+) psraw mm5, 15 ; sign(c) ;C7 (2nd) movq mm7, [byte _ESI] ;0 ;B5 mm7 in use pcmpeqw mm7, [_ECX+%1*24 + 8] ;B6 (c ==0) ? -1 : 0 (1st) %if (%1 < 4) movq mm0, [byte _ESI] ;A1 ;0 %endif pandn mm6, [byte _EAX] ;C9 offset = isZero ? 
0 : quant_add (2nd) psraw mm3, 15 ;sign(c) ;B7 (2nd) movq [byte _EDX+%1*24], mm1 ;A14 (7th) paddw mm6, mm5 ;C10 offset +negate back (3rd) pmullw mm2, [_EDI] ;*= 2Q ;B8 (3rd+) mov _ESP, _ESP %endmacro ALIGN SECTION_ALIGN cglobal dequant_h263_intra_3dne dequant_h263_intra_3dne: %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] add _ESP, PTR_SIZE %ifndef WINDOWS push prm6 push prm5 %endif push prm4 push prm3 push prm2 push prm1 sub _ESP, PTR_SIZE mov [_ESP], TMP0 %endif mov _ECX, [_ESP+ 2*PTR_SIZE] ; coeff mov _EAX, [_ESP+ 3*PTR_SIZE] ; quant pxor mm0, mm0 pxor mm2, mm2 push _EDI push _EBX %ifdef ARCH_IS_X86_64 lea _EDI, [mmx_mul] lea _EDI, [_EDI + _EAX*8 - 8] ; 2*quant %else lea _EDI, [mmx_mul + _EAX*8 - 8] ; 2*quant %endif push _EBP lea _EBX, [mmx_2047] movsx _EBP, word [_ECX] %ifdef ARCH_IS_X86_64 lea r9, [mmx_add] lea _EAX, [r9 + _EAX*8 - 8] ; quant or quant-1 %else lea _EAX, [mmx_add + _EAX*8 - 8] ; quant or quant-1 %endif push _ESI lea _ESI, [mmzero] pxor mm7, mm7 movq mm3, [_ECX+120] ;B2 ; c = coeff[i] pcmpeqw mm7, [_ECX+120] ;B6 (c ==0) ? -1 : 0 (1st) imul _EBP, [_ESP+(4+4)*PTR_SIZE] ; dcscalar psubw mm2, mm3 ;-c ;B3 (1st dep) pmaxsw mm2, mm3 ;|c| ;B4 (2nd) pmullw mm2, [_EDI] ;*= 2Q ;B8 (3rd+) psraw mm3, 15 ; sign(c) ;B7 (2nd) mov _EDX, [_ESP+ (1+4)*PTR_SIZE] ; data ALIGN SECTION_ALIGN dequant 0 cmp _EBP, -2048 mov _ESP, _ESP dequant 1 cmovl _EBP, [int_2048] nop dequant 2 cmp _EBP, 2047 mov _ESP, _ESP dequant 3 cmovg _EBP, [int2047] nop dequant 4 paddw mm4, mm6 ;C11 mm6 free (4th+) pminsw mm4, [_EBX] ;C12 saturates to +2047 (5th+) pandn mm7, [_EAX] ;B9 offset = isZero ? 
0 : quant_add (2nd) mov _EAX, _EBP mov _ESI, [_ESP] mov _EBP, [_ESP+PTR_SIZE] pxor mm5, mm4 ;C13 (6th+) paddw mm7, mm3 ;B10 offset +negate back (3rd) movq [_EDX+4*24+16], mm5 ;C14 (7th) paddw mm2, mm7 ;B11 mm7 free (4th+) pminsw mm2, [_EBX] ;B12 saturates to +2047 (5th+) mov _EBX, [_ESP+2*PTR_SIZE] mov _EDI, [_ESP+3*PTR_SIZE] add _ESP, byte 4*PTR_SIZE pxor mm3, mm2 ;B13 (6th+) movq [_EDX+4*24+8], mm3 ;B14 (7th) mov [_EDX], ax xor _EAX, _EAX %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] %ifndef WINDOWS add _ESP, 6*PTR_SIZE %else add _ESP, 4*PTR_SIZE %endif mov [_ESP], TMP0 %endif ret ENDFUNC ;----------------------------------------------------------------------------- ; ; uint32_t dequant_h263_inter_3dne(int16_t * data, ; const int16_t * const coeff, ; const uint32_t quant, ; const uint16_t *mpeg_matrices); ; ;----------------------------------------------------------------------------- ; this is the same as dequant_inter_3dne, ; except that we're saturating using 'pminsw' (saves 2 cycles/loop) ; This is Athlon-optimized code (ca 100 clk per call) ALIGN SECTION_ALIGN cglobal dequant_h263_inter_3dne dequant_h263_inter_3dne: %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] add _ESP, PTR_SIZE %ifndef WINDOWS push prm6 push prm5 %endif push prm4 push prm3 push prm2 push prm1 sub _ESP, PTR_SIZE mov [_ESP], TMP0 %endif mov _ECX, [_ESP+ 2*PTR_SIZE] ; coeff mov _EAX, [_ESP+ 3*PTR_SIZE] ; quant pxor mm0, mm0 pxor mm2, mm2 push _EDI push _EBX push _ESI %ifdef ARCH_IS_X86_64 lea _EDI, [mmx_mul] lea _EDI, [_EDI + _EAX*8 - 8] ; 2*quant %else lea _EDI, [mmx_mul + _EAX*8 - 8] ; 2*quant %endif lea _EBX, [mmx_2047] pxor mm7, mm7 movq mm3, [_ECX+120] ;B2 ; c = coeff[i] pcmpeqw mm7, [_ECX+120] ;B6 (c ==0) ? 
-1 : 0 (1st) %ifdef ARCH_IS_X86_64 lea r9, [mmx_add] lea _EAX, [r9 + _EAX*8 - 8] ; quant or quant-1 %else lea _EAX, [mmx_add + _EAX*8 - 8] ; quant or quant-1 %endif psubw mm2, mm3 ;-c ;B3 (1st dep) lea _ESI, [mmzero] pmaxsw mm2, mm3 ;|c| ;B4 (2nd) pmullw mm2, [_EDI] ;*= 2Q ;B8 (3rd+) psraw mm3, 15 ; sign(c) ;B7 (2nd) mov _EDX, [_ESP+ (1+3)*PTR_SIZE] ; data ALIGN SECTION_ALIGN dequant 0 dequant 1 dequant 2 dequant 3 dequant 4 paddw mm4, mm6 ;C11 mm6 free (4th+) pminsw mm4, [_EBX] ;C12 saturates to +2047 (5th+) pandn mm7, [_EAX] ;B9 offset = isZero ? 0 : quant_add (2nd) mov _ESI, [_ESP] pxor mm5, mm4 ;C13 (6th+) paddw mm7, mm3 ;B10 offset +negate back (3rd) movq [_EDX+4*24+16], mm5 ;C14 (7th) paddw mm2, mm7 ;B11 mm7 free (4th+) pminsw mm2, [_EBX] ;B12 saturates to +2047 (5th+) mov _EBX, [_ESP+PTR_SIZE] mov _EDI, [_ESP+2*PTR_SIZE] add _ESP, byte 3*PTR_SIZE pxor mm3, mm2 ;B13 (6th+) movq [_EDX+4*24+8], mm3 ;B14 (7th) xor _EAX, _EAX %ifdef ARCH_IS_X86_64 mov TMP0, [_ESP] %ifndef WINDOWS add _ESP, 6*PTR_SIZE %else add _ESP, 4*PTR_SIZE %endif mov [_ESP], TMP0 %endif ret ENDFUNC NON_EXEC_STACK xvidcore/src/quant/quant_matrix.c0000664000076500007650000001055511564705453020302 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Quantization matrix management code - * * Copyright(C) 2002 Michael Militzer * 2002 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: quant_matrix.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "quant_matrix.h" #define FIX(X) (((X)==1) ? 0xFFFF : ((1UL << 16) / (X) + 1)) #define FIXL(X) ((1UL << 16) / (X) - 1) /***************************************************************************** * Default matrices ****************************************************************************/ static const uint8_t default_intra_matrix[64] = { 8, 17, 18, 19, 21, 23, 25, 27, 17, 18, 19, 21, 23, 25, 27, 28, 20, 21, 22, 23, 24, 26, 28, 30, 21, 22, 23, 24, 26, 28, 30, 32, 22, 23, 24, 26, 28, 30, 32, 35, 23, 24, 26, 28, 30, 32, 35, 38, 25, 26, 28, 30, 32, 35, 38, 41, 27, 28, 30, 32, 35, 38, 41, 45 }; static const uint8_t default_inter_matrix[64] = { 16, 17, 18, 19, 20, 21, 22, 23, 17, 18, 19, 20, 21, 22, 23, 24, 18, 19, 20, 21, 22, 23, 24, 25, 19, 20, 21, 22, 23, 24, 26, 27, 20, 21, 22, 23, 25, 26, 27, 28, 21, 22, 23, 24, 26, 27, 28, 30, 22, 23, 24, 26, 27, 28, 30, 31, 23, 24, 25, 27, 28, 30, 31, 33 }; const uint16_t * get_intra_matrix(const uint16_t * mpeg_quant_matrices) { return(mpeg_quant_matrices + 0*64); } const uint16_t * get_inter_matrix(const uint16_t * mpeg_quant_matrices) { return(mpeg_quant_matrices + 4*64); } const uint8_t * get_default_intra_matrix(void) { return default_intra_matrix; } const uint8_t * get_default_inter_matrix(void) { return default_inter_matrix; } int is_custom_intra_matrix(const uint16_t * mpeg_quant_matrices) { int i; const uint16_t *intra_matrix = get_intra_matrix(mpeg_quant_matrices); const uint8_t *def_intra_matrix = get_default_intra_matrix(); for (i = 0; i < 64; i++) { if(intra_matrix[i] != def_intra_matrix[i]) return 1; } return 0; } int is_custom_inter_matrix(const uint16_t * 
mpeg_quant_matrices) { int i; const uint16_t *inter_matrix = get_inter_matrix(mpeg_quant_matrices); const uint8_t *def_inter_matrix = get_default_inter_matrix(); for (i = 0; i < 64; i++) { if(inter_matrix[i] != (uint16_t)def_inter_matrix[i]) return 1; } return 0; } void set_intra_matrix(uint16_t * mpeg_quant_matrices, const uint8_t * matrix) { int i; uint16_t *intra_matrix = mpeg_quant_matrices + 0*64; for (i = 0; i < 64; i++) { intra_matrix[i] = (!i) ? (uint16_t)8: (uint16_t)matrix[i]; } } void init_intra_matrix(uint16_t * mpeg_quant_matrices, uint32_t quant) { int i; uint16_t *intra_matrix = mpeg_quant_matrices + 0*64; uint16_t *intra_matrix_rec = mpeg_quant_matrices + 1*64; for (i = 0; i < 64; i++) { uint32_t div = intra_matrix[i]*quant; intra_matrix_rec[i] = ((uint32_t)((1<<SCALEBITS) + (div>>1)))/div; } } void set_inter_matrix(uint16_t * mpeg_quant_matrices, const uint8_t * matrix) { int i; uint16_t *inter_matrix = mpeg_quant_matrices + 4*64; uint16_t *inter_matrix1 = mpeg_quant_matrices + 5*64; uint16_t *inter_matrix_fix = mpeg_quant_matrices + 6*64; uint16_t *inter_matrix_fixl = mpeg_quant_matrices + 7*64; for (i = 0; i < 64; i++) { inter_matrix1[i] = ((inter_matrix[i] = (int16_t) matrix[i])>>1); inter_matrix1[i] += ((inter_matrix[i] == 1) ? 
1: 0); inter_matrix_fix[i] = (uint16_t) FIX(inter_matrix[i]); inter_matrix_fixl[i] = (uint16_t) FIXL(inter_matrix[i]); } } void init_mpeg_matrix(uint16_t * mpeg_quant_matrices) { set_intra_matrix(mpeg_quant_matrices, default_intra_matrix); set_inter_matrix(mpeg_quant_matrices, default_inter_matrix); } xvidcore/src/quant/quant.h0000664000076500007650000001152311564705453016717 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - (de)Quantization related header - * * Copyright(C) 2003 Edouard Gomez * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: quant.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _QUANT_H_ #define _QUANT_H_ #include "../portab.h" /***************************************************************************** * Common API for Intra (de)Quant functions ****************************************************************************/ typedef uint32_t (quant_intraFunc) (int16_t * coeff, const int16_t * data, const uint32_t quant, const uint32_t dcscalar, const uint16_t * mpeg_quant_matrices); typedef quant_intraFunc *quant_intraFuncPtr; /* Global function pointers */ extern quant_intraFuncPtr quant_h263_intra; extern quant_intraFuncPtr quant_mpeg_intra; extern quant_intraFuncPtr dequant_h263_intra; extern quant_intraFuncPtr dequant_mpeg_intra; /***************************************************************************** * Known implementation of Intra (de)Quant functions ****************************************************************************/ /* Quant functions */ quant_intraFunc quant_h263_intra_c; quant_intraFunc quant_mpeg_intra_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) quant_intraFunc quant_h263_intra_mmx; quant_intraFunc quant_h263_intra_3dne; quant_intraFunc quant_h263_intra_sse2; quant_intraFunc quant_mpeg_intra_mmx; #endif #ifdef ARCH_IS_IA64 quant_intraFunc quant_h263_intra_ia64; #endif #ifdef ARCH_IS_PPC quant_intraFunc quant_h263_intra_altivec_c; #endif /* DeQuant functions */ quant_intraFunc dequant_h263_intra_c; quant_intraFunc dequant_mpeg_intra_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) quant_intraFunc dequant_h263_intra_mmx; quant_intraFunc dequant_h263_intra_xmm; quant_intraFunc dequant_h263_intra_3dne; quant_intraFunc dequant_h263_intra_sse2; quant_intraFunc 
dequant_mpeg_intra_mmx; quant_intraFunc dequant_mpeg_intra_3dne; #endif #ifdef ARCH_IS_IA64 quant_intraFunc dequant_h263_intra_ia64; #endif #ifdef ARCH_IS_PPC quant_intraFunc dequant_h263_intra_altivec_c; quant_intraFunc dequant_mpeg_intra_altivec_c; #endif /***************************************************************************** * Common API for Inter (de)Quant functions ****************************************************************************/ typedef uint32_t (quant_interFunc) (int16_t * coeff, const int16_t * data, const uint32_t quant, const uint16_t * mpeg_quant_matrices); typedef quant_interFunc *quant_interFuncPtr; /* Global function pointers */ extern quant_interFuncPtr quant_h263_inter; extern quant_interFuncPtr quant_mpeg_inter; extern quant_interFuncPtr dequant_h263_inter; extern quant_interFuncPtr dequant_mpeg_inter; /***************************************************************************** * Known implementation of Inter (de)Quant functions ****************************************************************************/ quant_interFunc quant_h263_inter_c; quant_interFunc quant_mpeg_inter_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) quant_interFunc quant_h263_inter_mmx; quant_interFunc quant_h263_inter_3dne; quant_interFunc quant_h263_inter_sse2; quant_interFunc quant_mpeg_inter_mmx; quant_interFunc quant_mpeg_inter_xmm; #endif #ifdef ARCH_IS_IA64 quant_interFunc quant_h263_inter_ia64; #endif #ifdef ARCH_IS_PPC quant_interFunc quant_h263_inter_altivec_c; #endif quant_interFunc dequant_h263_inter_c; quant_interFunc dequant_mpeg_inter_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) quant_interFunc dequant_h263_inter_mmx; quant_interFunc dequant_h263_inter_xmm; quant_interFunc dequant_h263_inter_3dne; quant_interFunc dequant_h263_inter_sse2; quant_interFunc dequant_mpeg_inter_mmx; quant_interFunc dequant_mpeg_inter_3dne; #endif #ifdef ARCH_IS_IA64 quant_interFunc dequant_h263_inter_ia64; #endif #ifdef ARCH_IS_PPC 
quant_interFunc dequant_h263_inter_altivec_c; quant_interFunc dequant_mpeg_inter_altivec_c; #endif #endif /* _QUANT_H_ */ xvidcore/src/quant/quant_matrix.h0000664000076500007650000000361511564705453020306 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Quantization matrix management header - * * Copyright(C) 2002 Michael Militzer * 2002 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: quant_matrix.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _QUANT_MATRIX_H_ #define _QUANT_MATRIX_H_ #include "../portab.h" #define SCALEBITS 17 void init_mpeg_matrix(uint16_t * mpeg_quant_matrices); int is_custom_intra_matrix(const uint16_t * mpeg_quant_matrices); int is_custom_inter_matrix(const uint16_t * mpeg_quant_matrices); void set_intra_matrix(uint16_t *mpeg_quant_matrices, const uint8_t * matrix); void set_inter_matrix(uint16_t *mpeg_quant_matrices, const uint8_t * matrix); void init_intra_matrix(uint16_t * mpeg_quant_matrices, uint32_t quant); const uint16_t *get_intra_matrix(const uint16_t *mpeg_quant_matrices); const uint16_t *get_inter_matrix(const uint16_t *mpeg_quant_matrices); const uint8_t *get_default_intra_matrix(void); const uint8_t 
*get_default_inter_matrix(void); #endif /* _QUANT_MATRIX_H_ */ xvidcore/src/encoder.c0000664000076500007650000022405011564705453016052 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Encoder main module - * * Copyright(C) 2002-2010 Michael Militzer * 2002-2003 Peter Ross * 2002 Daniel Smith * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: encoder.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include #include #include #include #include "encoder.h" #include "prediction/mbprediction.h" #include "global.h" #include "utils/timer.h" #include "image/image.h" #include "image/font.h" #include "motion/sad.h" #include "motion/motion.h" #include "motion/gmc.h" #include "bitstream/cbp.h" #include "utils/mbfunctions.h" #include "bitstream/bitstream.h" #include "bitstream/mbcoding.h" #include "utils/emms.h" #include "bitstream/mbcoding.h" #include "quant/quant_matrix.h" #include "utils/mem_align.h" # include "motion/motion_smp.h" /***************************************************************************** * Local function prototypes ****************************************************************************/ static int FrameCodeI(Encoder * pEnc, Bitstream * bs); static 
int FrameCodeP(Encoder * pEnc, Bitstream * bs); static void FrameCodeB(Encoder * pEnc, FRAMEINFO * frame, Bitstream * bs); /***************************************************************************** * Encoder creation * * This function creates an Encoder instance, it allocates all necessary * image buffers (reference, current and bframes) and initializes the internal * xvid encoder parameters according to the xvid_enc_create_t input parameter. * * The code seems to be very long but is very basic, mainly memory allocation * and cleaning code. * * Returned values : * - 0 - no errors * - XVID_ERR_MEMORY - the libc could not allocate memory, the function * cleans the structure before exiting. * create->handle is also set to NULL. * ****************************************************************************/ /* * Simplify the "fincr/fbase" fraction */ static int gcd(int a, int b) { int r ; if (b > a) { r = a; a = b; b = r; } while ((r = a % b)) { a = b; b = r; } return b; } static void simplify_time(int *inc, int *base) { /* common factor */ const int s = gcd(*inc, *base); *inc /= s; *base /= s; if (*base > 65535 || *inc > 65535) { int *biggest; int *other; float div; if (*base > *inc) { biggest = base; other = inc; } else { biggest = inc; other = base; } div = ((float)*biggest)/((float)65535); *biggest = (unsigned int)(((float)*biggest)/div); *other = (unsigned int)(((float)*other)/div); } } int enc_create(xvid_enc_create_t * create) { Encoder *pEnc; int n; if (XVID_VERSION_MAJOR(create->version) != 1) /* v1.x.x */ return XVID_ERR_VERSION; if (create->width%2 || create->height%2) return XVID_ERR_FAIL; if (create->width<=0 || create->height<=0) return XVID_ERR_FAIL; /* allocate encoder struct */ pEnc = (Encoder *) xvid_malloc(sizeof(Encoder), CACHE_LINE); if (pEnc == NULL) return XVID_ERR_MEMORY; memset(pEnc, 0, sizeof(Encoder)); pEnc->mbParam.profile = create->profile; /* global flags */ pEnc->mbParam.global_flags = create->global; if ((pEnc->mbParam.global_flags &
XVID_GLOBAL_PACKED)) pEnc->mbParam.global_flags |= XVID_GLOBAL_DIVX5_USERDATA; /* width, height */ pEnc->mbParam.width = create->width; pEnc->mbParam.height = create->height; pEnc->mbParam.mb_width = (pEnc->mbParam.width + 15) / 16; pEnc->mbParam.mb_height = (pEnc->mbParam.height + 15) / 16; pEnc->mbParam.edged_width = 16 * pEnc->mbParam.mb_width + 2 * EDGE_SIZE; pEnc->mbParam.edged_height = 16 * pEnc->mbParam.mb_height + 2 * EDGE_SIZE; /* framerate */ pEnc->mbParam.fincr = MAX(create->fincr, 0); pEnc->mbParam.fbase = create->fincr <= 0 ? 25 : create->fbase; if (pEnc->mbParam.fincr>0) simplify_time((int*)&pEnc->mbParam.fincr, (int*)&pEnc->mbParam.fbase); /* zones */ if(create->num_zones > 0) { pEnc->num_zones = create->num_zones; pEnc->zones = xvid_malloc(sizeof(xvid_enc_zone_t) * pEnc->num_zones, CACHE_LINE); if (pEnc->zones == NULL) goto xvid_err_memory0; memcpy(pEnc->zones, create->zones, sizeof(xvid_enc_zone_t) * pEnc->num_zones); } else { pEnc->num_zones = 0; pEnc->zones = NULL; } /* plugins */ if(create->num_plugins > 0) { pEnc->num_plugins = create->num_plugins; pEnc->plugins = xvid_malloc(sizeof(xvid_enc_plugin_t) * pEnc->num_plugins, CACHE_LINE); if (pEnc->plugins == NULL) goto xvid_err_memory0; } else { pEnc->num_plugins = 0; pEnc->plugins = NULL; } for (n=0; n<pEnc->num_plugins;n++) { xvid_plg_create_t pcreate; xvid_plg_info_t pinfo; memset(&pinfo, 0, sizeof(xvid_plg_info_t)); pinfo.version = XVID_VERSION; if (create->plugins[n].func(NULL, XVID_PLG_INFO, &pinfo, NULL) >= 0) { pEnc->mbParam.plugin_flags |= pinfo.flags; } memset(&pcreate, 0, sizeof(xvid_plg_create_t)); pcreate.version = XVID_VERSION; pcreate.num_zones = pEnc->num_zones; pcreate.zones = pEnc->zones; pcreate.width = pEnc->mbParam.width; pcreate.height = pEnc->mbParam.height; pcreate.mb_width = pEnc->mbParam.mb_width; pcreate.mb_height = pEnc->mbParam.mb_height; pcreate.fincr = pEnc->mbParam.fincr; pcreate.fbase = pEnc->mbParam.fbase; pcreate.param = create->plugins[n].param; pEnc->plugins[n].func =
NULL; /* disable plugins that fail */ if (create->plugins[n].func(NULL, XVID_PLG_CREATE, &pcreate, &pEnc->plugins[n].param) >= 0) { pEnc->plugins[n].func = create->plugins[n].func; } } if ((pEnc->mbParam.global_flags & XVID_GLOBAL_EXTRASTATS_ENABLE) || (pEnc->mbParam.plugin_flags & XVID_REQPSNR)) { pEnc->mbParam.plugin_flags |= XVID_REQORIGINAL; /* psnr calculation requires the original */ } /* temp dquants */ if ((pEnc->mbParam.plugin_flags & XVID_REQDQUANTS)) { pEnc->temp_dquants = (int *) xvid_malloc(pEnc->mbParam.mb_width * pEnc->mbParam.mb_height * sizeof(int), CACHE_LINE); if (pEnc->temp_dquants==NULL) goto xvid_err_memory1a; } /* temp lambdas */ if (pEnc->mbParam.plugin_flags & XVID_REQLAMBDA) { pEnc->temp_lambda = (float *) xvid_malloc(pEnc->mbParam.mb_width * pEnc->mbParam.mb_height * 6 * sizeof(float), CACHE_LINE); if (pEnc->temp_lambda == NULL) goto xvid_err_memory1a; } /* bframes */ pEnc->mbParam.max_bframes = MAX(create->max_bframes, 0); pEnc->mbParam.bquant_ratio = MAX(create->bquant_ratio, 0); pEnc->mbParam.bquant_offset = create->bquant_offset; /* min/max quant */ for (n=0; n<3; n++) { pEnc->mbParam.min_quant[n] = create->min_quant[n] > 0 ? create->min_quant[n] : 2; pEnc->mbParam.max_quant[n] = create->max_quant[n] > 0 ? create->max_quant[n] : 31; } /* frame drop ratio */ pEnc->mbParam.frame_drop_ratio = MAX(create->frame_drop_ratio, 0); /* max keyframe interval */ pEnc->mbParam.iMaxKeyInterval = create->max_key_interval <= 0 ? 
(10 * (int)pEnc->mbParam.fbase) / (int)pEnc->mbParam.fincr : create->max_key_interval; /* allocate working frame-image memory */ pEnc->current = xvid_malloc(sizeof(FRAMEINFO), CACHE_LINE); pEnc->reference = xvid_malloc(sizeof(FRAMEINFO), CACHE_LINE); if (pEnc->current == NULL || pEnc->reference == NULL) goto xvid_err_memory1; /* allocate macroblock memory */ pEnc->current->mbs = xvid_malloc(sizeof(MACROBLOCK) * pEnc->mbParam.mb_width * pEnc->mbParam.mb_height, CACHE_LINE); pEnc->reference->mbs = xvid_malloc(sizeof(MACROBLOCK) * pEnc->mbParam.mb_width * pEnc->mbParam.mb_height, CACHE_LINE); if (pEnc->current->mbs == NULL || pEnc->reference->mbs == NULL) goto xvid_err_memory2; /* allocate quant matrix memory */ pEnc->mbParam.mpeg_quant_matrices = xvid_malloc(sizeof(uint16_t) * 64 * 8, CACHE_LINE); if (pEnc->mbParam.mpeg_quant_matrices == NULL) goto xvid_err_memory2a; /* allocate interpolation image memory */ if ((pEnc->mbParam.plugin_flags & XVID_REQORIGINAL)) { image_null(&pEnc->sOriginal); image_null(&pEnc->sOriginal2); } image_null(&pEnc->f_refh); image_null(&pEnc->f_refv); image_null(&pEnc->f_refhv); image_null(&pEnc->current->image); image_null(&pEnc->reference->image); image_null(&pEnc->vInterH); image_null(&pEnc->vInterV); image_null(&pEnc->vInterHV); if ((pEnc->mbParam.plugin_flags & XVID_REQORIGINAL)) { if (image_create (&pEnc->sOriginal, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; if (image_create (&pEnc->sOriginal2, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; } if (image_create (&pEnc->f_refh, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; if (image_create (&pEnc->f_refv, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; if (image_create (&pEnc->f_refhv, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; if (image_create (&pEnc->current->image, pEnc->mbParam.edged_width, 
pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; if (image_create (&pEnc->reference->image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; if (image_create (&pEnc->vInterH, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; if (image_create (&pEnc->vInterV, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; if (image_create (&pEnc->vInterHV, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; /* Create full bitplane for GMC, this might be wasteful */ if (image_create (&pEnc->vGMC, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory3; /* init bframe image buffers */ pEnc->bframenum_head = 0; pEnc->bframenum_tail = 0; pEnc->flush_bframes = 0; pEnc->closed_bframenum = -1; /* B Frames specific init */ pEnc->bframes = NULL; if (pEnc->mbParam.max_bframes > 0) { pEnc->bframes = xvid_malloc(pEnc->mbParam.max_bframes * sizeof(FRAMEINFO *), CACHE_LINE); if (pEnc->bframes == NULL) goto xvid_err_memory3; for (n = 0; n < pEnc->mbParam.max_bframes; n++) pEnc->bframes[n] = NULL; for (n = 0; n < pEnc->mbParam.max_bframes; n++) { pEnc->bframes[n] = xvid_malloc(sizeof(FRAMEINFO), CACHE_LINE); if (pEnc->bframes[n] == NULL) goto xvid_err_memory4; pEnc->bframes[n]->mbs = xvid_malloc(sizeof(MACROBLOCK) * pEnc->mbParam.mb_width * pEnc->mbParam.mb_height, CACHE_LINE); if (pEnc->bframes[n]->mbs == NULL) goto xvid_err_memory4; image_null(&pEnc->bframes[n]->image); if (image_create (&pEnc->bframes[n]->image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory4; } } /* init incoming frame queue */ pEnc->queue_head = 0; pEnc->queue_tail = 0; pEnc->queue_size = 0; pEnc->queue = xvid_malloc((pEnc->mbParam.max_bframes+1) * sizeof(QUEUEINFO), CACHE_LINE); if (pEnc->queue == NULL) goto xvid_err_memory4; for (n = 0; n < pEnc->mbParam.max_bframes+1; n++) image_null(&pEnc->queue[n].image); for (n = 0; n < 
pEnc->mbParam.max_bframes+1; n++) { if (image_create (&pEnc->queue[n].image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height) < 0) goto xvid_err_memory5; } /* timestamp stuff */ pEnc->mbParam.m_stamp = 0; pEnc->m_framenum = create->start_frame_num; pEnc->current->stamp = 0; pEnc->reference->stamp = 0; /* other stuff */ pEnc->iFrameNum = 0; pEnc->fMvPrevSigma = -1; /* slices */ pEnc->num_slices = MIN(MAX(1, create->num_slices), (int) pEnc->mbParam.mb_height); /* multithreaded stuff */ if (create->num_threads > 0) { #ifndef HAVE_PTHREAD int t = MAX(1, create->num_threads); #else int t = MIN(create->num_threads, (int) (pEnc->mbParam.mb_height>>1)); /* at least two rows per thread */ #endif int threads_per_slice = MAX(1, (t / pEnc->num_slices)); int rows_per_thread = (pEnc->mbParam.mb_height + threads_per_slice - 1) / threads_per_slice; pEnc->num_threads = t; pEnc->smpData = xvid_malloc(t*sizeof(SMPData), CACHE_LINE); if (!pEnc->smpData) goto xvid_err_nosmp; /* tmp bitstream buffer for slice coding */ pEnc->smpData[0].tmp_buffer = xvid_malloc(16*pEnc->mbParam.edged_width*pEnc->mbParam.mb_height*sizeof(uint8_t), CACHE_LINE); if (! 
pEnc->smpData[0].tmp_buffer) goto xvid_err_nosmp; for (n = 0; n < t; n++) { int s = MIN(pEnc->num_threads, pEnc->num_slices); pEnc->smpData[n].complete_count_self = xvid_malloc(rows_per_thread * sizeof(int), CACHE_LINE); if (!pEnc->smpData[n].complete_count_self) goto xvid_err_nosmp; if (n > 0 && n < s) { pEnc->smpData[n].bs = (Bitstream *) xvid_malloc(sizeof(Bitstream), CACHE_LINE); if (!pEnc->smpData[n].bs) goto xvid_err_nosmp; pEnc->smpData[n].sStat = (Statistics *) xvid_malloc(sizeof(Statistics), CACHE_LINE); if (!pEnc->smpData[n].sStat) goto xvid_err_nosmp; pEnc->smpData[n].tmp_buffer = pEnc->smpData[0].tmp_buffer + 16*(((n-1)*pEnc->mbParam.edged_width*pEnc->mbParam.mb_height)/s); BitstreamInit(pEnc->smpData[n].bs, pEnc->smpData[n].tmp_buffer, 0); } if (n != 0) pEnc->smpData[n].complete_count_above = pEnc->smpData[n-1].complete_count_self; } pEnc->smpData[0].complete_count_above = pEnc->smpData[t-1].complete_count_self - 1; } else { xvid_err_nosmp: /* no SMP */ if (pEnc->smpData) { if (pEnc->smpData[0].tmp_buffer) xvid_free(pEnc->smpData[0].tmp_buffer); } else { pEnc->smpData = xvid_malloc(1*sizeof(SMPData), CACHE_LINE); if (pEnc->smpData == NULL) goto xvid_err_memory5; } create->num_threads = 0; } create->handle = (void *) pEnc; init_timer(); init_mpeg_matrix(pEnc->mbParam.mpeg_quant_matrices); return 0; /* ok */ /* * We handle all XVID_ERR_MEMORY here, this makes the code lighter */ xvid_err_memory5: for (n = 0; n < pEnc->mbParam.max_bframes+1; n++) { image_destroy(&pEnc->queue[n].image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); } xvid_free(pEnc->queue); xvid_err_memory4: if (pEnc->mbParam.max_bframes > 0) { int i; for (i = 0; i < pEnc->mbParam.max_bframes; i++) { if (pEnc->bframes[i] == NULL) continue; image_destroy(&pEnc->bframes[i]->image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); xvid_free(pEnc->bframes[i]->mbs); xvid_free(pEnc->bframes[i]); } xvid_free(pEnc->bframes); } xvid_err_memory3: if ((pEnc->mbParam.plugin_flags & 
XVID_REQORIGINAL)) { image_destroy(&pEnc->sOriginal, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->sOriginal2, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); } image_destroy(&pEnc->f_refh, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->f_refv, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->f_refhv, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->reference->image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->vInterH, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->vInterV, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->vInterHV, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); /* destroy GMC image */ image_destroy(&pEnc->vGMC, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); xvid_err_memory2a: xvid_free(pEnc->mbParam.mpeg_quant_matrices); xvid_err_memory2: xvid_free(pEnc->current->mbs); xvid_free(pEnc->reference->mbs); xvid_err_memory1: xvid_free(pEnc->current); xvid_free(pEnc->reference); xvid_err_memory1a: if ((pEnc->mbParam.plugin_flags & XVID_REQDQUANTS)) { xvid_free(pEnc->temp_dquants); } if(pEnc->mbParam.plugin_flags & XVID_REQLAMBDA) { xvid_free(pEnc->temp_lambda); } xvid_err_memory0: for (n=0; n<pEnc->num_plugins;n++) { if (pEnc->plugins[n].func) { pEnc->plugins[n].func(pEnc->plugins[n].param, XVID_PLG_DESTROY, NULL, NULL); } } xvid_free(pEnc->plugins); xvid_free(pEnc->zones); xvid_free(pEnc); create->handle = NULL; return XVID_ERR_MEMORY; } /***************************************************************************** * Encoder destruction * * This function destroys the entire encoder structure created by a previous * successful enc_create call.
* * Returned values (for now only one returned value) : * - 0 - no errors * ****************************************************************************/ int enc_destroy(Encoder * pEnc) { int i; /* B Frames specific */ for (i = 0; i < pEnc->mbParam.max_bframes+1; i++) { image_destroy(&pEnc->queue[i].image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); } xvid_free(pEnc->queue); if (pEnc->mbParam.max_bframes > 0) { for (i = 0; i < pEnc->mbParam.max_bframes; i++) { if (pEnc->bframes[i] == NULL) continue; image_destroy(&pEnc->bframes[i]->image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); xvid_free(pEnc->bframes[i]->mbs); xvid_free(pEnc->bframes[i]); } xvid_free(pEnc->bframes); } /* All images, reference, current etc ... */ image_destroy(&pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->reference->image, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->vInterH, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->vInterV, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->vInterHV, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->f_refh, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->f_refv, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->f_refhv, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->vGMC, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); if ((pEnc->mbParam.plugin_flags & XVID_REQORIGINAL)) { image_destroy(&pEnc->sOriginal, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); image_destroy(&pEnc->sOriginal2, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height); } /* Encoder structure */ xvid_free(pEnc->current->mbs); xvid_free(pEnc->current); xvid_free(pEnc->reference->mbs); xvid_free(pEnc->reference); if ((pEnc->mbParam.plugin_flags & XVID_REQDQUANTS)) { xvid_free(pEnc->temp_dquants); } 
if ((pEnc->mbParam.plugin_flags & XVID_REQLAMBDA)) { xvid_free(pEnc->temp_lambda); } if (pEnc->num_plugins>0) { xvid_plg_destroy_t pdestroy; memset(&pdestroy, 0, sizeof(xvid_plg_destroy_t)); pdestroy.version = XVID_VERSION; pdestroy.num_frames = pEnc->m_framenum; for (i=0; i<pEnc->num_plugins;i++) { if (pEnc->plugins[i].func) { pEnc->plugins[i].func(pEnc->plugins[i].param, XVID_PLG_DESTROY, &pdestroy, NULL); } } xvid_free(pEnc->plugins); } xvid_free(pEnc->mbParam.mpeg_quant_matrices); if (pEnc->num_zones > 0) xvid_free(pEnc->zones); if (pEnc->num_threads > 0) { for (i = 1; i < MAX(1, MIN(pEnc->num_threads, pEnc->num_slices)); i++) { xvid_free(pEnc->smpData[i].bs); xvid_free(pEnc->smpData[i].sStat); } if (pEnc->smpData[0].tmp_buffer) xvid_free(pEnc->smpData[0].tmp_buffer); for (i = 0; i < pEnc->num_threads; i++) xvid_free(pEnc->smpData[i].complete_count_self); } xvid_free(pEnc->smpData); xvid_free(pEnc); return 0; /* ok */ } /* call the plugins */ static void call_plugins(Encoder * pEnc, FRAMEINFO * frame, IMAGE * original, int opt, int * type, int * quant, xvid_enc_stats_t * stats) { unsigned int i, j, k; xvid_plg_data_t data; /* set data struct */ memset(&data, 0, sizeof(xvid_plg_data_t)); data.version = XVID_VERSION; /* find zone */ for(i=0; i<pEnc->num_zones && pEnc->zones[i].frame<=frame->frame_num; i++) ; data.zone = i>0 ?
&pEnc->zones[i-1] : NULL; data.width = pEnc->mbParam.width; data.height = pEnc->mbParam.height; data.mb_width = pEnc->mbParam.mb_width; data.mb_height = pEnc->mbParam.mb_height; data.fincr = frame->fincr; data.fbase = pEnc->mbParam.fbase; data.bquant_ratio = pEnc->mbParam.bquant_ratio; data.bquant_offset = pEnc->mbParam.bquant_offset; for (i=0; i<3; i++) { data.min_quant[i] = pEnc->mbParam.min_quant[i]; data.max_quant[i] = pEnc->mbParam.max_quant[i]; } data.reference.csp = XVID_CSP_PLANAR; data.reference.plane[0] = pEnc->reference->image.y; data.reference.plane[1] = pEnc->reference->image.u; data.reference.plane[2] = pEnc->reference->image.v; data.reference.stride[0] = pEnc->mbParam.edged_width; data.reference.stride[1] = pEnc->mbParam.edged_width/2; data.reference.stride[2] = pEnc->mbParam.edged_width/2; data.current.csp = XVID_CSP_PLANAR; data.current.plane[0] = frame->image.y; data.current.plane[1] = frame->image.u; data.current.plane[2] = frame->image.v; data.current.stride[0] = pEnc->mbParam.edged_width; data.current.stride[1] = pEnc->mbParam.edged_width/2; data.current.stride[2] = pEnc->mbParam.edged_width/2; data.frame_num = frame->frame_num; if (opt == XVID_PLG_BEFORE) { data.type = *type; data.quant = *quant; data.vol_flags = frame->vol_flags; data.vop_flags = frame->vop_flags; data.motion_flags = frame->motion_flags; } else if (opt == XVID_PLG_FRAME) { data.type = coding2type(frame->coding_type); data.quant = frame->quant; if ((pEnc->mbParam.plugin_flags & XVID_REQDQUANTS)) { data.dquant = pEnc->temp_dquants; data.dquant_stride = pEnc->mbParam.mb_width; memset(data.dquant, 0, data.mb_width*data.mb_height*sizeof(int)); } if(pEnc->mbParam.plugin_flags & XVID_REQLAMBDA) { int block = 0; emms(); data.lambda = pEnc->temp_lambda; for(i = 0;i < pEnc->mbParam.mb_height; i++) for(j = 0;j < pEnc->mbParam.mb_width; j++) for (k = 0; k < 6; k++) data.lambda[block++] = 1.0f; } } else { /* XVID_PLG_AFTER */ if ((pEnc->mbParam.plugin_flags & XVID_REQORIGINAL)) { 
data.original.csp = XVID_CSP_PLANAR; data.original.plane[0] = original->y; data.original.plane[1] = original->u; data.original.plane[2] = original->v; data.original.stride[0] = pEnc->mbParam.edged_width; data.original.stride[1] = pEnc->mbParam.edged_width/2; data.original.stride[2] = pEnc->mbParam.edged_width/2; } if ((frame->vol_flags & XVID_VOL_EXTRASTATS) || (pEnc->mbParam.plugin_flags & XVID_REQPSNR)) { data.sse_y = plane_sse( original->y, frame->image.y, pEnc->mbParam.edged_width, pEnc->mbParam.width, pEnc->mbParam.height); data.sse_u = plane_sse( original->u, frame->image.u, pEnc->mbParam.edged_width/2, pEnc->mbParam.width/2, pEnc->mbParam.height/2); data.sse_v = plane_sse( original->v, frame->image.v, pEnc->mbParam.edged_width/2, pEnc->mbParam.width/2, pEnc->mbParam.height/2); } data.type = coding2type(frame->coding_type); data.quant = frame->quant; if ((pEnc->mbParam.plugin_flags & XVID_REQDQUANTS)) { data.dquant = pEnc->temp_dquants; data.dquant_stride = pEnc->mbParam.mb_width; for (j=0; j<pEnc->mbParam.mb_height; j++) for (i=0; i<pEnc->mbParam.mb_width; i++) { data.dquant[j*data.dquant_stride + i] = frame->mbs[j*pEnc->mbParam.mb_width + i].dquant; } } data.vol_flags = frame->vol_flags; data.vop_flags = frame->vop_flags; data.motion_flags = frame->motion_flags; data.length = frame->length; data.kblks = frame->sStat.kblks; data.mblks = frame->sStat.mblks; data.ublks = frame->sStat.ublks; /* New code */ data.stats.type = coding2type(frame->coding_type); data.stats.quant = frame->quant; data.stats.vol_flags = frame->vol_flags; data.stats.vop_flags = frame->vop_flags; data.stats.length = frame->length; data.stats.hlength = frame->length - (frame->sStat.iTextBits / 8); data.stats.kblks = frame->sStat.kblks; data.stats.mblks = frame->sStat.mblks; data.stats.ublks = frame->sStat.ublks; data.stats.sse_y = data.sse_y; data.stats.sse_u = data.sse_u; data.stats.sse_v = data.sse_v; if (stats) *stats = data.stats; } /* call plugins */ for (i=0; i<(unsigned int)pEnc->num_plugins;i++)
{ emms(); if (pEnc->plugins[i].func) { if (pEnc->plugins[i].func(pEnc->plugins[i].param, opt, &data, NULL) < 0) { continue; } } } emms(); /* copy modified values back into frame*/ if (opt == XVID_PLG_BEFORE) { *type = data.type; *quant = data.quant > 0 ? data.quant : 2; /* default */ frame->vol_flags = data.vol_flags; frame->vop_flags = data.vop_flags; frame->motion_flags = data.motion_flags; } else if (opt == XVID_PLG_FRAME) { if ((pEnc->mbParam.plugin_flags & XVID_REQDQUANTS)) { for (j=0; j<pEnc->mbParam.mb_height; j++) for (i=0; i<pEnc->mbParam.mb_width; i++) { frame->mbs[j*pEnc->mbParam.mb_width + i].dquant = data.dquant[j*data.mb_width + i]; } } else { for (j=0; j<pEnc->mbParam.mb_height; j++) for (i=0; i<pEnc->mbParam.mb_width; i++) { frame->mbs[j*pEnc->mbParam.mb_width + i].dquant = 0; } } if (pEnc->mbParam.plugin_flags & XVID_REQLAMBDA) { for (j = 0; j < pEnc->mbParam.mb_height; j++) for (i = 0; i < pEnc->mbParam.mb_width; i++) for (k = 0; k < 6; k++) { frame->mbs[j*pEnc->mbParam.mb_width + i].lambda[k] = (int) ((float)(1<<LAMBDA_EXP) * data.lambda[6 * (j * data.mb_width + i) + k]); } } else { for (j = 0; j<pEnc->mbParam.mb_height; j++) for (i = 0; i<pEnc->mbParam.mb_width; i++) for (k = 0; k < 6; k++) { frame->mbs[j*pEnc->mbParam.mb_width + i].lambda[k] = 1<<LAMBDA_EXP; } } frame->mbs[0].quant = data.quant; /* FRAME will not affect the quant in stats */ } } static __inline void inc_frame_num(Encoder * pEnc) { pEnc->current->frame_num = pEnc->m_framenum; pEnc->current->stamp = pEnc->mbParam.m_stamp; /* first frame is zero */ pEnc->mbParam.m_stamp += pEnc->current->fincr; pEnc->m_framenum++; /* debug ticker */ } static __inline void dec_frame_num(Encoder * pEnc) { pEnc->mbParam.m_stamp -= pEnc->mbParam.fincr; pEnc->m_framenum--; /* debug ticker */ } static __inline void MBSetDquant(MACROBLOCK * pMB, int x, int y, MBParam * mbParam) { if (pMB->cbp == 0) { /* we want to code dquant but the quantizer value will not be used yet let's find out if we can postpone dquant to next MB */ if (x == mbParam->mb_width-1 && y == mbParam->mb_height-1) { pMB->dquant = 0; /* it's the last MB of all, the easiest case */
return; } else { MACROBLOCK * next = pMB + 1; const MACROBLOCK * prev = pMB - 1; if (next->mode != MODE_INTER4V && next->mode != MODE_NOT_CODED) /* mode allows dquant change in the future */ if (abs(next->quant - prev->quant) <= 2) { /* quant change is not out of range */ pMB->quant = prev->quant; pMB->dquant = 0; next->dquant = next->quant - prev->quant; return; } } } /* couldn't skip this dquant */ pMB->mode = MODE_INTER_Q; } static __inline void set_timecodes(FRAMEINFO* pCur,FRAMEINFO *pRef, int32_t time_base) { pCur->ticks = (int32_t)pCur->stamp % time_base; pCur->seconds = ((int32_t)pCur->stamp / time_base) - ((int32_t)pRef->stamp / time_base) ; #if 0 /* HEAVY DEBUG OUTPUT */ fprintf(stderr,"WriteVop: %d - %d \n", ((int32_t)pCur->stamp / time_base), ((int32_t)pRef->stamp / time_base)); fprintf(stderr,"set_timecodes: VOP %1d stamp=%lld ref_stamp=%lld base=%d\n", pCur->coding_type, pCur->stamp, pRef->stamp, time_base); fprintf(stderr,"set_timecodes: VOP %1d seconds=%d ticks=%d (ref-sec=%d ref-tick=%d)\n", pCur->coding_type, pCur->seconds, pCur->ticks, pRef->seconds, pRef->ticks); #endif } static void simplify_par(int *par_width, int *par_height) { int _par_width = (!*par_width) ? 1 : (*par_width<0) ? -*par_width: *par_width; int _par_height = (!*par_height) ? 1 : (*par_height<0) ? 
-*par_height: *par_height; int divisor = gcd(_par_width, _par_height); _par_width /= divisor; _par_height /= divisor; /* 2^8 precision maximum */ if (_par_width>255 || _par_height>255) { float div; emms(); if (_par_width>_par_height) div = (float)_par_width/255; else div = (float)_par_height/255; _par_width = (int)((float)_par_width/div); _par_height = (int)((float)_par_height/div); } *par_width = _par_width; *par_height = _par_height; return; } /***************************************************************************** * IPB frame encoder entry point * * Returned values : * - >0 - output bytes * - 0 - no output * - XVID_ERR_VERSION - wrong version passed to core * - XVID_ERR_END - End of stream reached before end of coding * - XVID_ERR_FORMAT - the image subsystem reported the image had a wrong * format ****************************************************************************/ int enc_encode(Encoder * pEnc, xvid_enc_frame_t * xFrame, xvid_enc_stats_t * stats) { xvid_enc_frame_t * frame; int type; Bitstream bs; if (XVID_VERSION_MAJOR(xFrame->version) != 1 || (stats && XVID_VERSION_MAJOR(stats->version) != 1)) /* v1.x.x */ return XVID_ERR_VERSION; xFrame->out_flags = 0; start_global_timer(); BitstreamInit(&bs, xFrame->bitstream, 0); /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * enqueue image to the encoding-queue * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ if (xFrame->input.csp != XVID_CSP_NULL) { QUEUEINFO * q = &pEnc->queue[pEnc->queue_tail]; start_timer(); if (image_input (&q->image, pEnc->mbParam.width, pEnc->mbParam.height, pEnc->mbParam.edged_width, (uint8_t**)xFrame->input.plane, xFrame->input.stride, xFrame->input.csp, xFrame->vol_flags & XVID_VOL_INTERLACING)) { emms(); return XVID_ERR_FORMAT; } stop_conv_timer(); if ((xFrame->vop_flags & XVID_VOP_CHROMAOPT)) { image_chroma_optimize(&q->image, pEnc->mbParam.width, pEnc->mbParam.height, pEnc->mbParam.edged_width); } q->frame = *xFrame; if 
(xFrame->quant_intra_matrix) { memcpy(q->quant_intra_matrix, xFrame->quant_intra_matrix, 64*sizeof(unsigned char)); q->frame.quant_intra_matrix = q->quant_intra_matrix; } if (xFrame->quant_inter_matrix) { memcpy(q->quant_inter_matrix, xFrame->quant_inter_matrix, 64*sizeof(unsigned char)); q->frame.quant_inter_matrix = q->quant_inter_matrix; } pEnc->queue_tail = (pEnc->queue_tail + 1) % (pEnc->mbParam.max_bframes+1); pEnc->queue_size++; } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * bframe flush code * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ repeat: if (pEnc->flush_bframes) { if (pEnc->bframenum_head < pEnc->bframenum_tail) { DPRINTF(XVID_DEBUG_DEBUG,"*** BFRAME (flush) bf: head=%i tail=%i queue: head=%i tail=%i size=%i\n", pEnc->bframenum_head, pEnc->bframenum_tail, pEnc->queue_head, pEnc->queue_tail, pEnc->queue_size); if ((pEnc->mbParam.plugin_flags & XVID_REQORIGINAL)) { image_copy(&pEnc->sOriginal2, &pEnc->bframes[pEnc->bframenum_head]->image, pEnc->mbParam.edged_width, pEnc->mbParam.height); } FrameCodeB(pEnc, pEnc->bframes[pEnc->bframenum_head], &bs); call_plugins(pEnc, pEnc->bframes[pEnc->bframenum_head], &pEnc->sOriginal2, XVID_PLG_AFTER, NULL, NULL, stats); pEnc->bframenum_head++; goto done; } /* Write an empty marker to the bitstream. For divx5 decoder compatibility, this marker must consist of a not-coded p-vop, with a time_base of zero, and time_increment identical to the future-reference frame.
*/ if ((pEnc->mbParam.global_flags & XVID_GLOBAL_PACKED && pEnc->bframenum_tail > 0)) { int tmp; int bits; DPRINTF(XVID_DEBUG_DEBUG,"*** EMPTY bf: head=%i tail=%i queue: head=%i tail=%i size=%i\n", pEnc->bframenum_head, pEnc->bframenum_tail, pEnc->queue_head, pEnc->queue_tail, pEnc->queue_size); bits = BitstreamPos(&bs); tmp = pEnc->current->seconds; pEnc->current->seconds = 0; /* force time_base = 0 */ BitstreamWriteVopHeader(&bs, &pEnc->mbParam, pEnc->current, 0, pEnc->current->quant); BitstreamPad(&bs); pEnc->current->seconds = tmp; /* add the not-coded length to the reference frame size */ pEnc->current->length += (BitstreamPos(&bs) - bits) / 8; call_plugins(pEnc, pEnc->current, &pEnc->sOriginal, XVID_PLG_AFTER, NULL, NULL, stats); /* flush complete: reset counters */ pEnc->flush_bframes = 0; pEnc->bframenum_head = pEnc->bframenum_tail = 0; goto done; } /* flush complete: reset counters */ pEnc->flush_bframes = 0; pEnc->bframenum_head = pEnc->bframenum_tail = 0; } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * dequeue frame from the encoding queue * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ if (pEnc->queue_size == 0) /* empty */ { if (xFrame->input.csp == XVID_CSP_NULL) /* no further input */ { DPRINTF(XVID_DEBUG_DEBUG,"*** FINISH bf: head=%i tail=%i queue: head=%i tail=%i size=%i\n", pEnc->bframenum_head, pEnc->bframenum_tail, pEnc->queue_head, pEnc->queue_tail, pEnc->queue_size); if (!(pEnc->mbParam.global_flags & XVID_GLOBAL_PACKED) && pEnc->mbParam.max_bframes > 0) { call_plugins(pEnc, pEnc->current, &pEnc->sOriginal, XVID_PLG_AFTER, NULL, NULL, stats); } /* if the very last frame is to be b-vop, we must change it to a p-vop */ if (pEnc->bframenum_tail > 0) { SWAP(FRAMEINFO*, pEnc->current, pEnc->reference); pEnc->bframenum_tail--; SWAP(FRAMEINFO*, pEnc->current, pEnc->bframes[pEnc->bframenum_tail]); /* convert B-VOP to P-VOP */ pEnc->current->quant = 100*pEnc->current->quant - pEnc->mbParam.bquant_offset; 
pEnc->current->quant += pEnc->mbParam.bquant_ratio - 1; /* to avoid rounding issues */ pEnc->current->quant /= pEnc->mbParam.bquant_ratio; if ((pEnc->mbParam.plugin_flags & XVID_REQORIGINAL)) { image_copy(&pEnc->sOriginal, &pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.height); } DPRINTF(XVID_DEBUG_DEBUG,"*** PFRAME bf: head=%i tail=%i queue: head=%i tail=%i size=%i\n", pEnc->bframenum_head, pEnc->bframenum_tail, pEnc->queue_head, pEnc->queue_tail, pEnc->queue_size); pEnc->mbParam.frame_drop_ratio = -1; /* it must be a coded vop */ FrameCodeP(pEnc, &bs); if ((pEnc->mbParam.global_flags & XVID_GLOBAL_PACKED) && pEnc->bframenum_tail==0) { call_plugins(pEnc, pEnc->current, &pEnc->sOriginal, XVID_PLG_AFTER, NULL, NULL, stats); }else{ pEnc->flush_bframes = 1; goto done; } } DPRINTF(XVID_DEBUG_DEBUG, "*** END\n"); emms(); return XVID_ERR_END; /* end of stream reached */ } goto done; /* nothing to encode yet; encoder lag */ } /* the current FRAME becomes the reference */ SWAP(FRAMEINFO*, pEnc->current, pEnc->reference); /* remove frame from encoding-queue (head), and move it into the current */ image_swap(&pEnc->current->image, &pEnc->queue[pEnc->queue_head].image); frame = &pEnc->queue[pEnc->queue_head].frame; pEnc->queue_head = (pEnc->queue_head + 1) % (pEnc->mbParam.max_bframes+1); pEnc->queue_size--; /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * init pEnc->current fields * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ pEnc->current->fincr = pEnc->mbParam.fincr>0 ? 
pEnc->mbParam.fincr : frame->fincr; inc_frame_num(pEnc); pEnc->current->vol_flags = frame->vol_flags; pEnc->current->vop_flags = frame->vop_flags; pEnc->current->motion_flags = frame->motion; pEnc->current->fcode = pEnc->mbParam.m_fcode; pEnc->current->bcode = pEnc->mbParam.m_fcode; if ((xFrame->vop_flags & XVID_VOP_CHROMAOPT)) { image_chroma_optimize(&pEnc->current->image, pEnc->mbParam.width, pEnc->mbParam.height, pEnc->mbParam.edged_width); } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * frame type & quant selection * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ type = frame->type; pEnc->current->quant = frame->quant; call_plugins(pEnc, pEnc->current, NULL, XVID_PLG_BEFORE, &type, (int*)&pEnc->current->quant, stats); if (type > 0){ /* XVID_TYPE_?VOP */ type = type2coding(type); /* convert XVID_TYPE_?VOP to bitstream coding type */ } else{ /* XVID_TYPE_AUTO */ if (pEnc->iFrameNum == 0 || (pEnc->mbParam.iMaxKeyInterval > 0 && pEnc->iFrameNum >= pEnc->mbParam.iMaxKeyInterval)){ pEnc->iFrameNum = 0; type = I_VOP; }else{ type = MEanalysis(&pEnc->reference->image, pEnc->current, &pEnc->mbParam, pEnc->mbParam.iMaxKeyInterval, pEnc->iFrameNum, pEnc->bframenum_tail, xFrame->bframe_threshold, (pEnc->bframes) ? 
pEnc->bframes[pEnc->bframenum_head]->mbs: NULL); } } if (type != I_VOP) pEnc->current->vol_flags = pEnc->mbParam.vol_flags; /* don't allow VOL changes here */ /* bframes buffer overflow check */ if (type == B_VOP && pEnc->bframenum_tail >= pEnc->mbParam.max_bframes) { type = P_VOP; } pEnc->iFrameNum++; if ((pEnc->current->vop_flags & XVID_VOP_DEBUG)) { image_printf(&pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.height, 5, 5, "%d st:%lld if:%d", pEnc->current->frame_num, pEnc->current->stamp, pEnc->iFrameNum); } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * encode this frame as a b-vop * (we don't encode here, rather we store the frame in the bframes queue, to be encoded later) * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ if (type == B_VOP) { if ((pEnc->current->vop_flags & XVID_VOP_DEBUG)) { image_printf(&pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.height, 5, 200, "BVOP"); } if (frame->quant < 1) { pEnc->current->quant = ((((pEnc->reference->quant + pEnc->current->quant) * pEnc->mbParam.bquant_ratio) / 2) + pEnc->mbParam.bquant_offset)/100; } else { pEnc->current->quant = frame->quant; } if (pEnc->current->quant < 1) pEnc->current->quant = 1; else if (pEnc->current->quant > 31) pEnc->current->quant = 31; DPRINTF(XVID_DEBUG_DEBUG,"*** BFRAME (store) bf: head=%i tail=%i queue: head=%i tail=%i size=%i quant=%i\n", pEnc->bframenum_head, pEnc->bframenum_tail, pEnc->queue_head, pEnc->queue_tail, pEnc->queue_size,pEnc->current->quant); /* store frame into bframe buffer & swap ref back to current */ SWAP(FRAMEINFO*, pEnc->current, pEnc->bframes[pEnc->bframenum_tail]); SWAP(FRAMEINFO*, pEnc->current, pEnc->reference); pEnc->bframenum_tail++; goto repeat; } DPRINTF(XVID_DEBUG_DEBUG,"*** XXXXXX bf: head=%i tail=%i queue: head=%i tail=%i size=%i\n", pEnc->bframenum_head, pEnc->bframenum_tail, pEnc->queue_head, pEnc->queue_tail, pEnc->queue_size); /* for unpacked bframes, output the stats for 
the last encoded frame */ if (!(pEnc->mbParam.global_flags & XVID_GLOBAL_PACKED) && pEnc->mbParam.max_bframes > 0) { if (pEnc->current->stamp > 0) { call_plugins(pEnc, pEnc->reference, &pEnc->sOriginal, XVID_PLG_AFTER, NULL, NULL, stats); } else if (stats) { stats->type = XVID_TYPE_NOTHING; } } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * closed-gop * if the frame prior to an iframe is scheduled as a bframe, we must change it to a pframe * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ if (type == I_VOP && (pEnc->mbParam.global_flags & XVID_GLOBAL_CLOSED_GOP) && pEnc->bframenum_tail > 0) { /* place this frame back on the encoding-queue (head) */ /* we will deal with it next time */ dec_frame_num(pEnc); pEnc->iFrameNum--; pEnc->queue_head = (pEnc->queue_head + (pEnc->mbParam.max_bframes+1) - 1) % (pEnc->mbParam.max_bframes+1); pEnc->queue_size++; image_swap(&pEnc->current->image, &pEnc->queue[pEnc->queue_head].image); /* grab the last frame from the bframe-queue */ pEnc->bframenum_tail--; SWAP(FRAMEINFO*, pEnc->current, pEnc->bframes[pEnc->bframenum_tail]); if ((pEnc->current->vop_flags & XVID_VOP_DEBUG)) { image_printf(&pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.height, 5, 100, "CLOSED GOP BVOP->PVOP"); } /* convert B-VOP quant to P-VOP */ pEnc->current->quant = 100*pEnc->current->quant - pEnc->mbParam.bquant_offset; pEnc->current->quant += pEnc->mbParam.bquant_ratio - 1; /* to avoid rounding issues */ pEnc->current->quant /= pEnc->mbParam.bquant_ratio; type = P_VOP; } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * encode this frame as an i-vop * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ if (type == I_VOP) { DPRINTF(XVID_DEBUG_DEBUG,"*** IFRAME bf: head=%i tail=%i queue: head=%i tail=%i size=%i\n", pEnc->bframenum_head, pEnc->bframenum_tail, pEnc->queue_head, pEnc->queue_tail, pEnc->queue_size); if ((pEnc->current->vop_flags & XVID_VOP_DEBUG)) { 
image_printf(&pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.height, 5, 200, "IVOP"); } pEnc->iFrameNum = 1; /* ---- update vol flags at IVOP ----------- */ pEnc->mbParam.vol_flags = pEnc->current->vol_flags; /* Aspect ratio */ switch(frame->par) { case XVID_PAR_11_VGA: case XVID_PAR_43_PAL: case XVID_PAR_43_NTSC: case XVID_PAR_169_PAL: case XVID_PAR_169_NTSC: case XVID_PAR_EXT: pEnc->mbParam.par = frame->par; break; default: pEnc->mbParam.par = XVID_PAR_11_VGA; break; } /* For extended PAR only, we try to sanitise/simplify par values */ if (pEnc->mbParam.par == XVID_PAR_EXT) { pEnc->mbParam.par_width = frame->par_width; pEnc->mbParam.par_height = frame->par_height; simplify_par(&pEnc->mbParam.par_width, &pEnc->mbParam.par_height); } if ((pEnc->mbParam.vol_flags & XVID_VOL_MPEGQUANT)) { if (frame->quant_intra_matrix != NULL) set_intra_matrix(pEnc->mbParam.mpeg_quant_matrices, frame->quant_intra_matrix); if (frame->quant_inter_matrix != NULL) set_inter_matrix(pEnc->mbParam.mpeg_quant_matrices, frame->quant_inter_matrix); } /* prevent vol/vop misuse */ if (!(pEnc->current->vol_flags & XVID_VOL_INTERLACING)) pEnc->current->vop_flags &= ~(XVID_VOP_TOPFIELDFIRST|XVID_VOP_ALTERNATESCAN); /* ^^^------------------------ */ if ((pEnc->mbParam.plugin_flags & XVID_REQORIGINAL)) { image_copy(&pEnc->sOriginal, &pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.height); } FrameCodeI(pEnc, &bs); xFrame->out_flags |= XVID_KEYFRAME; /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * encode this frame as a p-vop * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ } else { /* (type == P_VOP || type == S_VOP) */ DPRINTF(XVID_DEBUG_DEBUG,"*** PFRAME bf: head=%i tail=%i queue: head=%i tail=%i size=%i\n", pEnc->bframenum_head, pEnc->bframenum_tail, pEnc->queue_head, pEnc->queue_tail, pEnc->queue_size); if ((pEnc->current->vop_flags & XVID_VOP_DEBUG)) { image_printf(&pEnc->current->image, pEnc->mbParam.edged_width, 
pEnc->mbParam.height, 5, 200, "PVOP"); } if ((pEnc->mbParam.plugin_flags & XVID_REQORIGINAL)) { image_copy(&pEnc->sOriginal, &pEnc->current->image, pEnc->mbParam.edged_width, pEnc->mbParam.height); } if ( FrameCodeP(pEnc, &bs) == 0 ) { /* N-VOP, we mustn't code b-frames yet */ if ((pEnc->mbParam.global_flags & XVID_GLOBAL_PACKED) || pEnc->mbParam.max_bframes == 0) call_plugins(pEnc, pEnc->current, &pEnc->sOriginal, XVID_PLG_AFTER, NULL, NULL, stats); goto done; } } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * on next enc_encode call we must flush bframes * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ /*done_flush:*/ pEnc->flush_bframes = 1; /* packed & queued_bframes: don't bother outputting stats here, we do so after the flush */ if ((pEnc->mbParam.global_flags & XVID_GLOBAL_PACKED) && pEnc->bframenum_tail > 0) { goto repeat; } /* packed or no-bframes or no-bframes-queued: output stats */ if ((pEnc->mbParam.global_flags & XVID_GLOBAL_PACKED) || pEnc->mbParam.max_bframes == 0 ) { call_plugins(pEnc, pEnc->current, &pEnc->sOriginal, XVID_PLG_AFTER, NULL, NULL, stats); } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% * done; return number of bytes consumed * %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ done: stop_global_timer(); write_timer(); emms(); return BitstreamLength(&bs); } static void SetMacroblockQuants(MBParam * const pParam, FRAMEINFO * frame) { unsigned int i; MACROBLOCK * pMB = frame->mbs; int quant = frame->mbs[0].quant; /* set by XVID_PLG_FRAME */ if (quant > 31) frame->quant = quant = 31; else if (quant < 1) frame->quant = quant = 1; for (i = 0; i < pParam->mb_height * pParam->mb_width; i++) { quant += pMB->dquant; if (quant > 31) quant = 31; else if (quant < 1) quant = 1; pMB->quant = quant; pMB++; } } static __inline void CodeIntraMB(MACROBLOCK * pMB) { pMB->mode = MODE_INTRA; /* zero mv statistics */ pMB->mvs[0].x = pMB->mvs[1].x = pMB->mvs[2].x = 
pMB->mvs[3].x = 0; pMB->mvs[0].y = pMB->mvs[1].y = pMB->mvs[2].y = pMB->mvs[3].y = 0; pMB->sad8[0] = pMB->sad8[1] = pMB->sad8[2] = pMB->sad8[3] = 0; pMB->sad16 = 0; if (pMB->dquant != 0) { pMB->mode = MODE_INTRA_Q; } } static void SliceCodeI(SMPData *data) { Encoder *pEnc = (Encoder *) data->pEnc; Bitstream *bs = (Bitstream *) data->bs; uint16_t x, y; int mb_width = pEnc->mbParam.mb_width; int mb_height = pEnc->mbParam.mb_height; int bound = 0, num_slices = pEnc->num_slices; FRAMEINFO *const current = pEnc->current; DECLARE_ALIGNED_MATRIX(dct_codes, 6, 64, int16_t, CACHE_LINE); DECLARE_ALIGNED_MATRIX(qcoeff, 6, 64, int16_t, CACHE_LINE); if (data->start_y > 0) { /* write resync marker */ bound = data->start_y*mb_width; write_video_packet_header(bs, &pEnc->mbParam, current, bound); } for (y = data->start_y; y < data->stop_y; y++) { int new_bound = mb_width * ((((y*num_slices) / mb_height) * mb_height + (num_slices-1)) / num_slices); if (new_bound > bound) { bound = new_bound; BitstreamPadAlways(bs); write_video_packet_header(bs, &pEnc->mbParam, current, bound); } for (x = 0; x < mb_width; x++) { MACROBLOCK *pMB = &current->mbs[x + y * mb_width]; CodeIntraMB(pMB); MBTransQuantIntra(&pEnc->mbParam, current, pMB, x, y, dct_codes, qcoeff); start_timer(); MBPrediction(current, x, y, mb_width, qcoeff, bound); stop_prediction_timer(); start_timer(); MBCoding(current, pMB, qcoeff, bs, data->sStat); stop_coding_timer(); } } emms(); BitstreamPadAlways(bs); } static __inline void SerializeBitstreams(Encoder *pEnc, FRAMEINFO *current, Bitstream *bs, int num_threads) { int k; uint32_t pos = BitstreamLength(bs); for (k = 1; k < num_threads; k++) { uint32_t len = BitstreamLength(pEnc->smpData[k].bs); memcpy((void *)((ptr_t)bs->start + pos), (void *)((ptr_t)pEnc->smpData[k].bs->start), len); current->length += len; pos += len; /* collect stats */ current->sStat.iTextBits += pEnc->smpData[k].sStat->iTextBits; current->sStat.kblks += pEnc->smpData[k].sStat->kblks; current->sStat.mblks += 
pEnc->smpData[k].sStat->mblks; current->sStat.ublks += pEnc->smpData[k].sStat->ublks; current->sStat.iMVBits += pEnc->smpData[k].sStat->iMVBits; } if (num_threads > 1) { uint32_t pos32 = pos>>2; bs->tail = bs->start + pos32; bs->pos = 8*(pos - (pos32<<2)); bs->buf = 0; if (bs->pos > 0) { uint32_t pos8 = bs->pos/8; memset((void *)((ptr_t)bs->tail+pos8), 0, (4-pos8)); pos = *bs->tail; #ifndef ARCH_IS_BIG_ENDIAN BSWAP(pos); #endif bs->buf = pos; } } } static int FrameCodeI(Encoder * pEnc, Bitstream * bs) { int bits = BitstreamPos(bs); int bound = 0, num_slices = pEnc->num_slices; int num_threads = MAX(1, MIN(pEnc->num_threads, num_slices)); int slices_per_thread = (num_slices*1024 / num_threads); int mb_height = pEnc->mbParam.mb_height; #ifdef HAVE_PTHREAD void * status = NULL; #endif uint16_t k; pEnc->mbParam.m_rounding_type = 1; pEnc->current->rounding_type = pEnc->mbParam.m_rounding_type; pEnc->current->coding_type = I_VOP; call_plugins(pEnc, pEnc->current, NULL, XVID_PLG_FRAME, NULL, NULL, NULL); SetMacroblockQuants(&pEnc->mbParam, pEnc->current); BitstreamWriteVolHeader(bs, &pEnc->mbParam, pEnc->current, num_slices); set_timecodes(pEnc->current,pEnc->reference,pEnc->mbParam.fbase); BitstreamPad(bs); BitstreamWriteVopHeader(bs, &pEnc->mbParam, pEnc->current, 1, pEnc->current->mbs[0].quant); pEnc->current->sStat.iTextBits = 0; /* multithreaded intra coding - dispatch threads */ for (k = 0; k < num_threads; k++) { int add = ((slices_per_thread + 512) >> 10); slices_per_thread += ((num_slices*1024 / num_threads) - add*1024); pEnc->smpData[k].pEnc = (void *) pEnc; pEnc->smpData[k].stop_y = (((bound+add) * mb_height + (num_slices-1)) / num_slices); pEnc->smpData[k].start_y = ((bound * mb_height + (num_slices-1)) / num_slices); bound += add; if (k > 0) { BitstreamReset(pEnc->smpData[k].bs); pEnc->smpData[k].sStat->iTextBits = 0; } } pEnc->smpData[0].bs = bs; pEnc->smpData[0].sStat = &pEnc->current->sStat; #ifdef HAVE_PTHREAD /* create threads */ for (k = 1; k < 
num_threads; k++) { pthread_create(&pEnc->smpData[k].handle, NULL, (void*)SliceCodeI, (void*)&pEnc->smpData[k]); } #endif SliceCodeI(&pEnc->smpData[0]); #ifdef HAVE_PTHREAD /* wait until all threads are finished */ for (k = 1; k < num_threads; k++) { pthread_join(pEnc->smpData[k].handle, &status); } #endif pEnc->current->length = BitstreamLength(bs) - (bits/8); /* reassemble the pieces together */ SerializeBitstreams(pEnc, pEnc->current, bs, num_threads); pEnc->current->sStat.iMVBits = 0; pEnc->current->sStat.mblks = pEnc->current->sStat.ublks = 0; pEnc->current->sStat.kblks = pEnc->mbParam.mb_width * pEnc->mbParam.mb_height; pEnc->fMvPrevSigma = -1; pEnc->mbParam.m_fcode = 2; pEnc->current->is_edged = 0; /* not edged */ pEnc->current->is_interpolated = -1; /* not interpolated (fake rounding -1) */ return 1; /* intra */ } static __inline void updateFcode(Statistics * sStat, Encoder * pEnc) { float fSigma; int iSearchRange; if (sStat->iMvCount == 0) sStat->iMvCount = 1; fSigma = (float) sqrt((float) sStat->iMvSum / sStat->iMvCount); iSearchRange = 16 << pEnc->mbParam.m_fcode; if ((3.0 * fSigma > iSearchRange) && (pEnc->mbParam.m_fcode <= 5) ) pEnc->mbParam.m_fcode++; else if ((5.0 * fSigma < iSearchRange) && (4.0 * pEnc->fMvPrevSigma < iSearchRange) && (pEnc->mbParam.m_fcode >= 2) ) pEnc->mbParam.m_fcode--; pEnc->fMvPrevSigma = fSigma; } #define BFRAME_SKIP_THRESHHOLD 30 static void SliceCodeP(SMPData *data) { Encoder *pEnc = (Encoder *) data->pEnc; Bitstream *bs = (Bitstream *) data->bs; int x, y, k; FRAMEINFO *const current = pEnc->current; FRAMEINFO *const reference = pEnc->reference; MBParam * const pParam = &pEnc->mbParam; int mb_width = pParam->mb_width; int mb_height = pParam->mb_height; DECLARE_ALIGNED_MATRIX(dct_codes, 6, 64, int16_t, CACHE_LINE); DECLARE_ALIGNED_MATRIX(qcoeff, 6, 64, int16_t, CACHE_LINE); int bound = 0, num_slices = pEnc->num_slices; if (data->start_y > 0) { /* write resync marker */ bound = data->start_y*mb_width; 
write_video_packet_header(bs, pParam, current, bound); } for (y = data->start_y; y < data->stop_y; y++) { int new_bound = mb_width * ((((y*num_slices) / mb_height) * mb_height + (num_slices-1)) / num_slices); if (new_bound > bound) { bound = new_bound; BitstreamPadAlways(bs); write_video_packet_header(bs, pParam, current, bound); } for (x = 0; x < mb_width; x++) { MACROBLOCK *pMB = &current->mbs[x + y * pParam->mb_width]; int skip_possible; if (pMB->mode == MODE_INTRA || pMB->mode == MODE_INTRA_Q) { CodeIntraMB(pMB); MBTransQuantIntra(pParam, current, pMB, x, y, dct_codes, qcoeff); start_timer(); MBPrediction(current, x, y, pParam->mb_width, qcoeff, bound); stop_prediction_timer(); data->sStat->kblks++; MBCoding(current, pMB, qcoeff, bs, data->sStat); stop_coding_timer(); continue; } start_timer(); MBMotionCompensation(pMB, x, y, &reference->image, &pEnc->vInterH, &pEnc->vInterV, &pEnc->vInterHV, &pEnc->vGMC, &current->image, dct_codes, pParam->width, pParam->height, pParam->edged_width, (current->vol_flags & XVID_VOL_QUARTERPEL), current->rounding_type, data->RefQ); stop_comp_timer(); pMB->field_pred = 0; if (pMB->cbp != 0) { pMB->cbp = MBTransQuantInter(pParam, current, pMB, x, y, dct_codes, qcoeff); } if (pMB->dquant != 0) MBSetDquant(pMB, x, y, pParam); if (pMB->cbp || pMB->mvs[0].x || pMB->mvs[0].y || pMB->mvs[1].x || pMB->mvs[1].y || pMB->mvs[2].x || pMB->mvs[2].y || pMB->mvs[3].x || pMB->mvs[3].y) { data->sStat->mblks++; } else { data->sStat->ublks++; } start_timer(); /* Finished processing the MB, now check if to CODE or SKIP */ skip_possible = (pMB->cbp == 0) && (pMB->mode == MODE_INTER); if (current->coding_type == S_VOP) skip_possible &= (pMB->mcsel == 1); else { /* PVOP */ const VECTOR * const mv = (pParam->vol_flags & XVID_VOL_QUARTERPEL) ? 
pMB->qmvs : pMB->mvs; skip_possible &= ((mv->x|mv->y) == 0); } if ((pMB->mode == MODE_NOT_CODED) || (skip_possible)) { /* This is a candidate for SKIPping, but for P-VOPs check intermediate B-frames first */ int bSkip = 1; if (current->coding_type == P_VOP) { /* special rule for P-VOP's SKIP */ for (k = pEnc->bframenum_head; k < pEnc->bframenum_tail; k++) { int iSAD; iSAD = sad16(reference->image.y + 16*y*pParam->edged_width + 16*x, pEnc->bframes[k]->image.y + 16*y*pParam->edged_width + 16*x, pParam->edged_width, BFRAME_SKIP_THRESHHOLD * pMB->quant); if (iSAD >= BFRAME_SKIP_THRESHHOLD * pMB->quant || ((bound > 1) && ((y*mb_width+x == bound) || (y*mb_width+x == bound+1)))) { /* Some third-party decoders have problems with coloc skip MB before or after resync marker in BVOP. We avoid any ambiguity and force no skip at slice boundary */ bSkip = 0; /* could not SKIP */ if (pParam->vol_flags & XVID_VOL_QUARTERPEL) { VECTOR predMV = get_qpmv2(current->mbs, pParam->mb_width, bound, x, y, 0); pMB->pmvs[0].x = - predMV.x; pMB->pmvs[0].y = - predMV.y; } else { VECTOR predMV = get_pmv2(current->mbs, pParam->mb_width, bound, x, y, 0); pMB->pmvs[0].x = - predMV.x; pMB->pmvs[0].y = - predMV.y; } pMB->mode = MODE_INTER; pMB->cbp = 0; break; } } } if (bSkip) { /* do SKIP */ pMB->mode = MODE_NOT_CODED; MBSkip(bs); stop_coding_timer(); continue; /* next MB */ } } /* ordinary case: normal coded INTER/INTER4V block */ MBCoding(current, pMB, qcoeff, bs, data->sStat); stop_coding_timer(); } } BitstreamPadAlways(bs); /* next_start_code() at the end of VideoObjectPlane() */ emms(); } /* FrameCodeP also handles S(GMC)-VOPs */ static int FrameCodeP(Encoder * pEnc, Bitstream * bs) { int bits = BitstreamPos(bs); FRAMEINFO *const current = pEnc->current; FRAMEINFO *const reference = pEnc->reference; MBParam * const pParam = &pEnc->mbParam; int mb_width = pParam->mb_width; int mb_height = pParam->mb_height; int coded = 1; int k = 0, bound = 0, num_slices = pEnc->num_slices; int num_threads = 
MAX(1, MIN(pEnc->num_threads, num_slices)); #ifdef HAVE_PTHREAD void * status = NULL; int threads_per_slice = (pEnc->num_threads*1024 / num_threads); #endif int slices_per_thread = (num_slices*1024 / num_threads); IMAGE *pRef = &reference->image; if (!reference->is_edged) { start_timer(); image_setedges(pRef, pParam->edged_width, pParam->edged_height, pParam->width, pParam->height, XVID_BS_VERSION); stop_edges_timer(); reference->is_edged = 1; } pParam->m_rounding_type = 1 - pParam->m_rounding_type; current->rounding_type = pParam->m_rounding_type; current->fcode = pParam->m_fcode; if ((current->vop_flags & XVID_VOP_HALFPEL)) { if (reference->is_interpolated != current->rounding_type) { start_timer(); image_interpolate(pRef->y, pEnc->vInterH.y, pEnc->vInterV.y, pEnc->vInterHV.y, pParam->edged_width, pParam->edged_height, (pParam->vol_flags & XVID_VOL_QUARTERPEL), current->rounding_type); stop_inter_timer(); reference->is_interpolated = current->rounding_type; } } current->sStat.iTextBits = current->sStat.iMvSum = current->sStat.iMvCount = current->sStat.kblks = current->sStat.mblks = current->sStat.ublks = current->sStat.iMVBits = 0; current->coding_type = P_VOP; if (current->vop_flags & XVID_VOP_RD_PSNRHVSM) { image_block_variance(&current->image, pParam->edged_width, current->mbs, pParam->mb_width, pParam->mb_height); } call_plugins(pEnc, pEnc->current, NULL, XVID_PLG_FRAME, NULL, NULL, NULL); SetMacroblockQuants(&pEnc->mbParam, current); start_timer(); if (current->vol_flags & XVID_VOL_GMC) /* GMC only for S(GMC)-VOPs */ { int gmcval; current->warp = GlobalMotionEst( current->mbs, pParam, current, reference, &pEnc->vInterH, &pEnc->vInterV, &pEnc->vInterHV, num_slices); if (current->motion_flags & XVID_ME_GME_REFINE) { gmcval = GlobalMotionEstRefine(&current->warp, current->mbs, pParam, current, reference, &current->image, &reference->image, &pEnc->vInterH, &pEnc->vInterV, &pEnc->vInterHV); } else { gmcval = globalSAD(&current->warp, pParam, current->mbs, current, &reference->image, 
&current->image, pEnc->vGMC.y); } gmcval += /*current->quant*/ 2 * (int)(pParam->mb_width*pParam->mb_height); /* 1st '3': 3 warp points, 2nd '3': 16th pel res (2<<3) */ generate_GMCparameters( 3, 3, &current->warp, pParam->width, pParam->height, &current->new_gmc_data); if ( (gmcval<0) && ( (current->warp.duv[1].x != 0) || (current->warp.duv[1].y != 0) || (current->warp.duv[2].x != 0) || (current->warp.duv[2].y != 0) ) ) { current->coding_type = S_VOP; generate_GMCimage(&current->new_gmc_data, &reference->image, pParam->mb_width, pParam->mb_height, pParam->edged_width, pParam->edged_width/2, pParam->m_fcode, ((pParam->vol_flags & XVID_VOL_QUARTERPEL)?1:0), 0, current->rounding_type, current->mbs, &pEnc->vGMC); } else { generate_GMCimage(&current->new_gmc_data, &reference->image, pParam->mb_width, pParam->mb_height, pParam->edged_width, pParam->edged_width/2, pParam->m_fcode, ((pParam->vol_flags & XVID_VOL_QUARTERPEL)?1:0), 0, current->rounding_type, current->mbs, NULL); /* no warping, just AMV */ } } #ifdef HAVE_PTHREAD if (pEnc->num_threads > 0) { /* multithreaded motion estimation - dispatch threads */ while (k < pEnc->num_threads) { int i, add_s = (slices_per_thread + 512) >> 10; int add_t = (threads_per_slice + 512) >> 10; int start_y = (bound * mb_height + (num_slices-1)) / num_slices; int stop_y = ((bound+add_s) * mb_height + (num_slices-1)) / num_slices; int rows_per_thread = (stop_y - start_y + add_t - 1) / add_t; slices_per_thread += ((num_slices*1024 / num_threads) - add_s*1024); threads_per_slice += ((pEnc->num_threads*1024 / num_threads) - add_t*1024); for (i = 0; i < add_t; i++) { memset(pEnc->smpData[k+i].complete_count_self, 0, rows_per_thread * sizeof(int)); pEnc->smpData[k+i].pEnc = (void *) pEnc; pEnc->smpData[k+i].y_row = i; pEnc->smpData[k+i].y_step = add_t; pEnc->smpData[k+i].stop_y = stop_y; pEnc->smpData[k+i].start_y = start_y; /* todo: sort out temp space once and for all */ pEnc->smpData[k+i].RefQ = (((k+i)&1) ? 
pEnc->vInterV.u : pEnc->vInterV.v) + 16*((k+i)>>1)*pParam->edged_width; } pEnc->smpData[k].complete_count_above = pEnc->smpData[k+add_t-1].complete_count_self - 1; bound += add_s; k += add_t; } for (k = 1; k < pEnc->num_threads; k++) { pthread_create(&pEnc->smpData[k].handle, NULL, (void*)MotionEstimateSMP, (void*)&pEnc->smpData[k]); } MotionEstimateSMP(&pEnc->smpData[0]); for (k = 1; k < pEnc->num_threads; k++) { pthread_join(pEnc->smpData[k].handle, &status); } current->fcode = 0; for (k = 0; k < pEnc->num_threads; k++) { current->sStat.iMvSum += pEnc->smpData[k].mvSum; current->sStat.iMvCount += pEnc->smpData[k].mvCount; if (pEnc->smpData[k].minfcode > current->fcode) current->fcode = pEnc->smpData[k].minfcode; } } else #endif { /* regular ME */ MotionEstimation(&pEnc->mbParam, current, reference, &pEnc->vInterH, &pEnc->vInterV, &pEnc->vInterHV, &pEnc->vGMC, 256*4096, num_slices); } stop_motion_timer(); set_timecodes(current,reference,pParam->fbase); BitstreamWriteVopHeader(bs, &pEnc->mbParam, current, 1, current->mbs[0].quant); /* multithreaded inter coding - dispatch threads */ bound = 0; slices_per_thread = (num_slices*1024 / num_threads); for (k = 0; k < num_threads; k++) { int add = ((slices_per_thread + 512) >> 10); slices_per_thread += ((num_slices*1024 / num_threads) - add*1024); pEnc->smpData[k].pEnc = (void *) pEnc; pEnc->smpData[k].stop_y = (((bound+add) * mb_height + (num_slices-1)) / num_slices); pEnc->smpData[k].start_y = ((bound * mb_height + (num_slices-1)) / num_slices); pEnc->smpData[k].RefQ = ((k&1) ? 
pEnc->vInterV.u : pEnc->vInterV.v) + 16*(k>>1)*pParam->edged_width; bound += add; if (k > 0) { pEnc->smpData[k].sStat->iTextBits = pEnc->smpData[k].sStat->kblks = pEnc->smpData[k].sStat->mblks = pEnc->smpData[k].sStat->ublks = pEnc->smpData[k].sStat->iMVBits = 0; BitstreamReset(pEnc->smpData[k].bs); } } pEnc->smpData[0].bs = bs; pEnc->smpData[0].sStat = &current->sStat; #ifdef HAVE_PTHREAD /* create threads */ for (k = 1; k < num_threads; k++) { pthread_create(&pEnc->smpData[k].handle, NULL, (void*)SliceCodeP, (void*)&pEnc->smpData[k]); } #endif SliceCodeP(&pEnc->smpData[0]); #ifdef HAVE_PTHREAD /* wait until all threads are finished */ for (k = 1; k < num_threads; k++) { pthread_join(pEnc->smpData[k].handle, &status); } #endif current->length = BitstreamLength(bs) - (bits/8); /* reassemble the pieces together */ SerializeBitstreams(pEnc, pEnc->current, bs, num_threads); updateFcode(&current->sStat, pEnc); /* frame drop code */ #if 0 DPRINTF(XVID_DEBUG_DEBUG, "kmu %i %i %i\n", current->sStat.kblks, current->sStat.mblks, current->sStat.ublks); #endif if (current->sStat.kblks + current->sStat.mblks < (pParam->frame_drop_ratio * mb_width * mb_height) / 100 && ( (pEnc->bframenum_head >= pEnc->bframenum_tail) || !(pEnc->mbParam.global_flags & XVID_GLOBAL_CLOSED_GOP)) && (current->coding_type == P_VOP) ) { current->sStat.kblks = current->sStat.mblks = current->sStat.iTextBits = 0; current->sStat.ublks = mb_width * mb_height; BitstreamReset(bs); set_timecodes(current,reference,pParam->fbase); BitstreamWriteVopHeader(bs, &pEnc->mbParam, current, 0, current->mbs[0].quant); /* copy reference frame details into the current frame */ current->quant = reference->quant; current->motion_flags = reference->motion_flags; current->rounding_type = reference->rounding_type; current->fcode = reference->fcode; current->bcode = reference->bcode; current->stamp = reference->stamp; image_copy(&current->image, &reference->image, pParam->edged_width, pParam->height); memcpy(current->mbs, reference->mbs, 
sizeof(MACROBLOCK) * mb_width * mb_height); coded = 0; BitstreamPadAlways(bs); /* next_start_code() at the end of VideoObjectPlane() */ current->length = (BitstreamPos(bs) - bits) / 8; } else { pEnc->current->is_edged = 0; /* not edged */ pEnc->current->is_interpolated = -1; /* not interpolated (fake rounding -1) */ /* what was this frame's interpolated reference will become forward (past) reference in b-frame coding */ image_swap(&pEnc->vInterH, &pEnc->f_refh); image_swap(&pEnc->vInterV, &pEnc->f_refv); image_swap(&pEnc->vInterHV, &pEnc->f_refhv); } /* XXX: debug { char s[100]; sprintf(s, "\\%05i_cur.pgm", pEnc->m_framenum); image_dump_yuvpgm(&current->image, pParam->edged_width, pParam->width, pParam->height, s); sprintf(s, "\\%05i_ref.pgm", pEnc->m_framenum); image_dump_yuvpgm(&reference->image, pParam->edged_width, pParam->width, pParam->height, s); } */ return coded; } static void SliceCodeB(SMPData *data) { Encoder *pEnc = (Encoder *) data->pEnc; Bitstream *bs = (Bitstream *) data->bs; DECLARE_ALIGNED_MATRIX(dct_codes, 6, 64, int16_t, CACHE_LINE); DECLARE_ALIGNED_MATRIX(qcoeff, 6, 64, int16_t, CACHE_LINE); int x, y; FRAMEINFO * const frame = (FRAMEINFO * const) data->current; MBParam * const pParam = &pEnc->mbParam; int mb_width = pParam->mb_width; int mb_height = pParam->mb_height; IMAGE *f_ref = &pEnc->reference->image; IMAGE *b_ref = &pEnc->current->image; int bound = data->start_y*mb_width; int num_slices = pEnc->num_slices; if (data->start_y > 0) { /* write resync marker */ write_video_packet_header(bs, pParam, frame, bound+1); } for (y = data->start_y; y < MIN(data->stop_y+1, mb_height); y++) { int new_bound = mb_width * ((((y*num_slices) / mb_height) * mb_height + (num_slices-1)) / num_slices); int stop_x = (y == data->stop_y) ? 1 : mb_width; int start_x = (y == data->start_y && y > 0) ? 
1 : 0; for (x = start_x; x < stop_x; x++) { MACROBLOCK * const mb = &frame->mbs[x + y * pEnc->mbParam.mb_width]; /* decoder ignores mb when reference block is INTER(0,0), CBP=0 */ if (mb->mode == MODE_NOT_CODED) { if (pParam->plugin_flags & XVID_REQORIGINAL) { MBMotionCompensation(mb, x, y, f_ref, NULL, f_ref, NULL, NULL, &frame->image, NULL, 0, 0, pParam->edged_width, 0, 0, data->RefQ); } continue; } if (new_bound > bound && x > 0) { bound = new_bound; BitstreamPadAlways(bs); write_video_packet_header(bs, pParam, frame, y*mb_width+x); } mb->quant = frame->quant; if (mb->cbp != 0 || pParam->plugin_flags & XVID_REQORIGINAL) { /* we have to motion-compensate, transfer etc, because there might be blocks to code */ MBMotionCompensationBVOP(pParam, mb, x, y, &frame->image, f_ref, &pEnc->f_refh, &pEnc->f_refv, &pEnc->f_refhv, b_ref, &pEnc->vInterH, &pEnc->vInterV, &pEnc->vInterHV, dct_codes, data->RefQ); mb->cbp = MBTransQuantInterBVOP(pParam, frame, mb, x, y, dct_codes, qcoeff); } if (mb->mode == MODE_DIRECT_NO4V) mb->mode = MODE_DIRECT; if (mb->mode == MODE_DIRECT && (mb->cbp | mb->pmvs[3].x | mb->pmvs[3].y) == 0) mb->mode = MODE_DIRECT_NONE_MV; /* skipped */ else if (frame->vop_flags & XVID_VOP_GREYSCALE) /* keep only bits 5-2 -- Chroma blocks will just be skipped by MBCodingBVOP */ mb->cbp &= 0x3C; start_timer(); MBCodingBVOP(frame, mb, qcoeff, frame->fcode, frame->bcode, bs, data->sStat); stop_coding_timer(); } } BitstreamPadAlways(bs); /* next_start_code() at the end of VideoObjectPlane() */ emms(); } static void FrameCodeB(Encoder * pEnc, FRAMEINFO * frame, Bitstream * bs) { int bits = BitstreamPos(bs); int k = 0, bound = 0, num_slices = pEnc->num_slices; int num_threads = MAX(1, MIN(pEnc->num_threads, num_slices)); #ifdef HAVE_PTHREAD void * status = NULL; int threads_per_slice = (pEnc->num_threads*1024 / num_threads); #endif int slices_per_thread = (num_slices*1024 / num_threads); IMAGE *f_ref = &pEnc->reference->image; IMAGE *b_ref = &pEnc->current->image; 
MBParam * const pParam = &pEnc->mbParam; int mb_height = pParam->mb_height; #ifdef BFRAMES_DEC_DEBUG FILE *fp; static char first=0; #define BFRAME_DEBUG if (!first && fp){ \ fprintf(fp,"Y=%3d X=%3d MB=%2d CBP=%02X\n",y,x,mb->mode,mb->cbp); \ } if (!first){ fp=fopen("C:\\XVIDDBGE.TXT","w"); } #endif /* forward */ if (!pEnc->reference->is_edged) { image_setedges(f_ref, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height, pEnc->mbParam.width, pEnc->mbParam.height, XVID_BS_VERSION); pEnc->reference->is_edged = 1; } if (pEnc->reference->is_interpolated != 0) { start_timer(); image_interpolate(f_ref->y, pEnc->f_refh.y, pEnc->f_refv.y, pEnc->f_refhv.y, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height, (pEnc->mbParam.vol_flags & XVID_VOL_QUARTERPEL), 0); stop_inter_timer(); pEnc->reference->is_interpolated = 0; } /* backward */ if (!pEnc->current->is_edged) { image_setedges(b_ref, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height, pEnc->mbParam.width, pEnc->mbParam.height, XVID_BS_VERSION); pEnc->current->is_edged = 1; } if (pEnc->current->is_interpolated != 0) { start_timer(); image_interpolate(b_ref->y, pEnc->vInterH.y, pEnc->vInterV.y, pEnc->vInterHV.y, pEnc->mbParam.edged_width, pEnc->mbParam.edged_height, (pEnc->mbParam.vol_flags & XVID_VOL_QUARTERPEL), 0); stop_inter_timer(); pEnc->current->is_interpolated = 0; } frame->coding_type = B_VOP; if ((frame->vop_flags & XVID_VOP_RD_PSNRHVSM) && (frame->vop_flags & XVID_VOP_RD_BVOP)) { image_block_variance(&frame->image, pEnc->mbParam.edged_width, frame->mbs, pEnc->mbParam.mb_width, pEnc->mbParam.mb_height); } call_plugins(pEnc, frame, NULL, XVID_PLG_FRAME, NULL, NULL, NULL); frame->fcode = frame->bcode = pEnc->current->fcode; start_timer(); #ifdef HAVE_PTHREAD if (pEnc->num_threads > 0) { /* multithreaded motion estimation - dispatch threads */ while (k < pEnc->num_threads) { int i, add_s = (slices_per_thread + 512) >> 10; int add_t = (threads_per_slice + 512) >> 10; int start_y = (bound * mb_height + 
(num_slices-1)) / num_slices; int stop_y = ((bound+add_s) * mb_height + (num_slices-1)) / num_slices; int rows_per_thread = (stop_y - start_y + add_t - 1) / add_t; slices_per_thread += ((num_slices*1024 / num_threads) - add_s*1024); threads_per_slice += ((pEnc->num_threads*1024 / num_threads) - add_t*1024); for (i = 0; i < add_t; i++) { memset(pEnc->smpData[k+i].complete_count_self, 0, rows_per_thread * sizeof(int)); pEnc->smpData[k+i].pEnc = (void *) pEnc; pEnc->smpData[k+i].current = frame; pEnc->smpData[k+i].y_row = i; pEnc->smpData[k+i].y_step = add_t; pEnc->smpData[k+i].stop_y = stop_y; pEnc->smpData[k+i].start_y = start_y; /* todo: sort out temp space once and for all */ pEnc->smpData[k+i].RefQ = (((k+i)&1) ? pEnc->vInterV.u : pEnc->vInterV.v) + 16*((k+i)>>1)*pParam->edged_width; } pEnc->smpData[k].complete_count_above = pEnc->smpData[k+add_t-1].complete_count_self - 1; bound += add_s; k += add_t; } for (k = 1; k < pEnc->num_threads; k++) { pthread_create(&pEnc->smpData[k].handle, NULL, (void*)SMPMotionEstimationBVOP, (void*)&pEnc->smpData[k]); } SMPMotionEstimationBVOP(&pEnc->smpData[0]); for (k = 1; k < pEnc->num_threads; k++) { pthread_join(pEnc->smpData[k].handle, &status); } frame->fcode = frame->bcode = 0; for (k = 0; k < pEnc->num_threads; k++) { if (pEnc->smpData[k].minfcode > frame->fcode) frame->fcode = pEnc->smpData[k].minfcode; if (pEnc->smpData[k].minbcode > frame->bcode) frame->bcode = pEnc->smpData[k].minbcode; } } else #endif { MotionEstimationBVOP(&pEnc->mbParam, frame, ((int32_t)(pEnc->current->stamp - frame->stamp)), /* time_bp */ ((int32_t)(pEnc->current->stamp - pEnc->reference->stamp)), /* time_pp */ pEnc->reference->mbs, f_ref, &pEnc->f_refh, &pEnc->f_refv, &pEnc->f_refhv, pEnc->current, b_ref, &pEnc->vInterH, &pEnc->vInterV, &pEnc->vInterHV, pEnc->num_slices); } stop_motion_timer(); set_timecodes(frame, pEnc->reference,pEnc->mbParam.fbase); BitstreamWriteVopHeader(bs, &pEnc->mbParam, frame, 1, frame->quant); /* reset stats */ 
frame->sStat.iTextBits = 0; frame->sStat.iMVBits = 0; frame->sStat.iMvSum = 0; frame->sStat.iMvCount = 0; frame->sStat.kblks = frame->sStat.mblks = frame->sStat.ublks = 0; frame->sStat.mblks = pEnc->mbParam.mb_width * pEnc->mbParam.mb_height; frame->sStat.kblks = frame->sStat.ublks = 0; /* multithreaded inter coding - dispatch threads */ bound = 0; slices_per_thread = (num_slices*1024 / num_threads); for (k = 0; k < num_threads; k++) { int add = ((slices_per_thread + 512) >> 10); slices_per_thread += ((num_slices*1024 / num_threads) - add*1024); pEnc->smpData[k].pEnc = (void *) pEnc; pEnc->smpData[k].current = frame; pEnc->smpData[k].stop_y = (((bound+add) * mb_height + (num_slices-1)) / num_slices); pEnc->smpData[k].start_y = ((bound * mb_height + (num_slices-1)) / num_slices); bound += add; /* todo: sort out temp space once and for all */ pEnc->smpData[k].RefQ = ((k&1) ? pEnc->vInterV.u : pEnc->vInterV.v) + 16*(k>>1)*pParam->edged_width; if (k > 0) { BitstreamReset(pEnc->smpData[k].bs); pEnc->smpData[k].sStat->iTextBits = pEnc->smpData[k].sStat->kblks = pEnc->smpData[k].sStat->mblks = pEnc->smpData[k].sStat->ublks = pEnc->smpData[k].sStat->iMVBits = 0; } } #ifdef HAVE_PTHREAD for (k = 1; k < num_threads; k++) { pthread_create(&pEnc->smpData[k].handle, NULL, (void*)SliceCodeB, (void*)&pEnc->smpData[k]); } #endif pEnc->smpData[0].bs = bs; pEnc->smpData[0].sStat = &frame->sStat; SliceCodeB(&pEnc->smpData[0]); #ifdef HAVE_PTHREAD for (k = 1; k < num_threads; k++) { pthread_join(pEnc->smpData[k].handle, &status); } #endif frame->length = BitstreamLength(bs) - (bits/8); /* reassemble the pieces together */ SerializeBitstreams(pEnc, frame, bs, num_threads); #ifdef BFRAMES_DEC_DEBUG if (!first){ first=1; if (fp) fclose(fp); } #endif } xvidcore/src/prediction/0000775000076500007650000000000011566427763016433 5ustar xvidbuildxvidbuildxvidcore/src/prediction/mbprediction.h0000664000076500007650000000507511564705453021263 0ustar 
xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Prediction header - * * Copyright(C) 2002-2010 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mbprediction.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _MBPREDICTION_H_ #define _MBPREDICTION_H_ #include "../portab.h" #include "../decoder.h" #include "../global.h" #define MIN(X, Y) ((X)<(Y)?(X):(Y)) #define MAX(X, Y) ((X)>(Y)?(X):(Y)) /* very large value */ #define MV_MAX_ERROR (4096 * 256) #define MVequal(A,B) ( ((A).x)==((B).x) && ((A).y)==((B).y) ) void MBPrediction(FRAMEINFO * frame, /* <-- The parameter for ACDC and MV prediction */ uint32_t x_pos, /* <-- The x position of the MB to be searched */ uint32_t y_pos, /* <-- The y position of the MB to be searched */ uint32_t x_dim, /* <-- Number of macroblocks in a row */ int16_t * qcoeff, /* <-> The quantized DCT coefficients */ const int bound); void add_acdc(MACROBLOCK * pMB, uint32_t block, int16_t dct_codes[64], uint32_t iDcScaler, int16_t predictors[8], const int bsversion); void predict_acdc(MACROBLOCK * pMBs, uint32_t x, uint32_t y, uint32_t mb_width, uint32_t block, int16_t qcoeff[64], uint32_t current_quant, int32_t iDcScaler, 
int16_t predictors[8], const int bound); VECTOR get_pmv2(const MACROBLOCK * const mbs, const int mb_width, const int bound, const int x, const int y, const int block); VECTOR get_pmv2_interlaced(const MACROBLOCK * const mbs, const int mb_width, const int bound, const int x, const int y, const int block); VECTOR get_qpmv2(const MACROBLOCK * const mbs, const int mb_width, const int bound, const int x, const int y, const int block); #endif /* _MBPREDICTION_H_ */ xvidcore/src/prediction/mbprediction.c0000664000076500007650000003606011564705453021254 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Prediction module - * * Copyright (C) 2001-2003 Michael Militzer * 2001-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: mbprediction.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdlib.h> #include "../global.h" #include "../encoder.h" #include "mbprediction.h" #include "../utils/mbfunctions.h" #include "../bitstream/cbp.h" #include "../bitstream/mbcoding.h" #include "../bitstream/zigzag.h" static int __inline rescale(int predict_quant, int current_quant, int coeff) { return (coeff != 0) ? 
DIV_DIV((coeff) * (predict_quant), (current_quant)) : 0; } static const int16_t default_acdc_values[15] = { 1024, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* get dc/ac prediction direction for a single block and place predictor values into MB->pred_values[j][..] */ void predict_acdc(MACROBLOCK * pMBs, uint32_t x, uint32_t y, uint32_t mb_width, uint32_t block, int16_t qcoeff[64], uint32_t current_quant, int32_t iDcScaler, int16_t predictors[8], const int bound) { const int mbpos = (y * mb_width) + x; int16_t *left, *top, *diag, *current; int32_t left_quant = current_quant; int32_t top_quant = current_quant; const int16_t *pLeft = default_acdc_values; const int16_t *pTop = default_acdc_values; const int16_t *pDiag = default_acdc_values; uint32_t index = x + y * mb_width; /* current macroblock */ int *acpred_direction = &pMBs[index].acpred_directions[block]; uint32_t i; left = top = diag = current = NULL; /* grab left,top and diag macroblocks */ /* left macroblock */ if (x && mbpos >= bound + 1 && (pMBs[index - 1].mode == MODE_INTRA || pMBs[index - 1].mode == MODE_INTRA_Q)) { left = (int16_t*)pMBs[index - 1].pred_values[0]; left_quant = pMBs[index - 1].quant; } /* top macroblock */ if (mbpos >= bound + (int)mb_width && (pMBs[index - mb_width].mode == MODE_INTRA || pMBs[index - mb_width].mode == MODE_INTRA_Q)) { top = (int16_t*)pMBs[index - mb_width].pred_values[0]; top_quant = pMBs[index - mb_width].quant; } /* diag macroblock */ if (x && mbpos >= bound + (int)mb_width + 1 && (pMBs[index - 1 - mb_width].mode == MODE_INTRA || pMBs[index - 1 - mb_width].mode == MODE_INTRA_Q)) { diag = (int16_t*)pMBs[index - 1 - mb_width].pred_values[0]; } current = (int16_t*)pMBs[index].pred_values[0]; /* now grab pLeft, pTop, pDiag _blocks_ */ switch (block) { case 0: if (left) pLeft = left + MBPRED_SIZE; if (top) pTop = top + (MBPRED_SIZE << 1); if (diag) pDiag = diag + 3 * MBPRED_SIZE; break; case 1: pLeft = current; left_quant = current_quant; if (top) { pTop = top + 3 * 
MBPRED_SIZE; pDiag = top + (MBPRED_SIZE << 1); } break; case 2: if (left) { pLeft = left + 3 * MBPRED_SIZE; pDiag = left + MBPRED_SIZE; } pTop = current; top_quant = current_quant; break; case 3: pLeft = current + (MBPRED_SIZE << 1); left_quant = current_quant; pTop = current + MBPRED_SIZE; top_quant = current_quant; pDiag = current; break; case 4: if (left) pLeft = left + (MBPRED_SIZE << 2); if (top) pTop = top + (MBPRED_SIZE << 2); if (diag) pDiag = diag + (MBPRED_SIZE << 2); break; case 5: if (left) pLeft = left + 5 * MBPRED_SIZE; if (top) pTop = top + 5 * MBPRED_SIZE; if (diag) pDiag = diag + 5 * MBPRED_SIZE; break; } /* determine ac prediction direction & ac/dc predictor place rescaled ac/dc * predictions into predictors[] for later use */ if (abs(pLeft[0] - pDiag[0]) < abs(pDiag[0] - pTop[0])) { *acpred_direction = 1; /* vertical */ predictors[0] = DIV_DIV(pTop[0], iDcScaler); for (i = 1; i < 8; i++) { predictors[i] = rescale(top_quant, current_quant, pTop[i]); } } else { *acpred_direction = 2; /* horizontal */ predictors[0] = DIV_DIV(pLeft[0], iDcScaler); for (i = 1; i < 8; i++) { predictors[i] = rescale(left_quant, current_quant, pLeft[i + 7]); } } } /* decoder: add predictors to dct_codes[] and store current coeffs to pred_values[] for future prediction */ /* Up to this version, no DC clipping was performed, so we try to be backward * compatible to avoid artifacts */ #define BS_VERSION_BUGGY_DC_CLIPPING 34 void add_acdc(MACROBLOCK * pMB, uint32_t block, int16_t dct_codes[64], uint32_t iDcScaler, int16_t predictors[8], const int bsversion) { uint8_t acpred_direction = pMB->acpred_directions[block]; int16_t *pCurrent = (int16_t*)pMB->pred_values[block]; uint32_t i; DPRINTF(XVID_DEBUG_COEFF,"predictor[0] %i\n", predictors[0]); dct_codes[0] += predictors[0]; /* dc prediction */ pCurrent[0] = dct_codes[0]*iDcScaler; if (bsversion > BS_VERSION_BUGGY_DC_CLIPPING) { pCurrent[0] = CLIP(pCurrent[0], -2048, 2047); } if (acpred_direction == 1) { for (i = 1; i < 8; 
i++) { int level = dct_codes[i] + predictors[i]; DPRINTF(XVID_DEBUG_COEFF,"predictor[%i] %i\n",i, predictors[i]); dct_codes[i] = level; pCurrent[i] = level; pCurrent[i + 7] = dct_codes[i * 8]; } } else if (acpred_direction == 2) { for (i = 1; i < 8; i++) { int level = dct_codes[i * 8] + predictors[i]; DPRINTF(XVID_DEBUG_COEFF,"predictor[%i] %i\n",i*8, predictors[i]); dct_codes[i * 8] = level; pCurrent[i + 7] = level; pCurrent[i] = dct_codes[i]; } } else { for (i = 1; i < 8; i++) { pCurrent[i] = dct_codes[i]; pCurrent[i + 7] = dct_codes[i * 8]; } } } /***************************************************************************** ****************************************************************************/ /* encoder: subtract predictors from qcoeff[] and calculate S1/S2 returns sum of coefficients *saved* if prediction is enabled S1 = sum of all (qcoeff - prediction) S2 = sum of all qcoeff */ static int calc_acdc_coeff(MACROBLOCK * pMB, uint32_t block, int16_t qcoeff[64], uint32_t iDcScaler, int16_t predictors[8]) { int16_t *pCurrent = (int16_t*)pMB->pred_values[block]; uint32_t i; int S1 = 0, S2 = 0; /* store current coeffs to pred_values[] for future prediction */ pCurrent[0] = qcoeff[0] * iDcScaler; pCurrent[0] = CLIP(pCurrent[0], -2048, 2047); for (i = 1; i < 8; i++) { pCurrent[i] = qcoeff[i]; pCurrent[i + 7] = qcoeff[i * 8]; } /* subtract predictors and store back in predictors[] */ qcoeff[0] = qcoeff[0] - predictors[0]; if (pMB->acpred_directions[block] == 1) { for (i = 1; i < 8; i++) { int16_t level; level = qcoeff[i]; S2 += abs(level); level -= predictors[i]; S1 += abs(level); predictors[i] = level; } } else /* acpred_direction == 2 */ { for (i = 1; i < 8; i++) { int16_t level; level = qcoeff[i * 8]; S2 += abs(level); level -= predictors[i]; S1 += abs(level); predictors[i] = level; } } return S2 - S1; } /* returns the bits *saved* if prediction is enabled */ static int calc_acdc_bits(MACROBLOCK * pMB, uint32_t block, int16_t qcoeff[64], uint32_t iDcScaler, 
int16_t predictors[8]) { const int direction = pMB->acpred_directions[block]; int16_t *pCurrent = (int16_t*)pMB->pred_values[block]; int16_t tmp[8]; unsigned int i; int Z1, Z2; /* store current coeffs to pred_values[] for future prediction */ pCurrent[0] = qcoeff[0] * iDcScaler; pCurrent[0] = CLIP(pCurrent[0], -2048, 2047); for (i = 1; i < 8; i++) { pCurrent[i] = qcoeff[i]; pCurrent[i + 7] = qcoeff[i * 8]; } /* dc prediction */ qcoeff[0] = qcoeff[0] - predictors[0]; /* calc cost before ac prediction */ Z2 = CodeCoeffIntra_CalcBits(qcoeff, scan_tables[0]); /* apply ac prediction & calc cost*/ if (direction == 1) { for (i = 1; i < 8; i++) { tmp[i] = qcoeff[i]; qcoeff[i] -= predictors[i]; predictors[i] = qcoeff[i]; } }else{ /* acpred_direction == 2 */ for (i = 1; i < 8; i++) { tmp[i] = qcoeff[i*8]; qcoeff[i*8] -= predictors[i]; predictors[i] = qcoeff[i*8]; } } Z1 = CodeCoeffIntra_CalcBits(qcoeff, scan_tables[direction]); /* undo prediction */ if (direction == 1) { for (i = 1; i < 8; i++) qcoeff[i] = tmp[i]; }else{ /* acpred_direction == 2 */ for (i = 1; i < 8; i++) qcoeff[i*8] = tmp[i]; } return Z2-Z1; } /* apply predictors[] to qcoeff */ static void apply_acdc(MACROBLOCK * pMB, uint32_t block, int16_t qcoeff[64], int16_t predictors[8]) { unsigned int i; if (pMB->acpred_directions[block] == 1) { for (i = 1; i < 8; i++) qcoeff[i] = predictors[i]; } else { for (i = 1; i < 8; i++) qcoeff[i * 8] = predictors[i]; } } void MBPrediction(FRAMEINFO * frame, uint32_t x, uint32_t y, uint32_t mb_width, int16_t qcoeff[6 * 64], const int bound) { int32_t j; int32_t iDcScaler, iQuant; int S = 0; int16_t predictors[6][8]; MACROBLOCK *pMB = &frame->mbs[x + y * mb_width]; iQuant = pMB->quant; if ((pMB->mode == MODE_INTRA) || (pMB->mode == MODE_INTRA_Q)) { for (j = 0; j < 6; j++) { iDcScaler = get_dc_scaler(iQuant, j<4); predict_acdc(frame->mbs, x, y, mb_width, j, &qcoeff[j * 64], iQuant, iDcScaler, predictors[j], bound); if ((frame->vop_flags & XVID_VOP_HQACPRED)) S += 
calc_acdc_bits(pMB, j, &qcoeff[j * 64], iDcScaler, predictors[j]); else S += calc_acdc_coeff(pMB, j, &qcoeff[j * 64], iDcScaler, predictors[j]); } if (S<=0) { /* dont predict */ for (j = 0; j < 6; j++) pMB->acpred_directions[j] = 0; }else{ for (j = 0; j < 6; j++) apply_acdc(pMB, j, &qcoeff[j * 64], predictors[j]); } pMB->cbp = calc_cbp(qcoeff); } } static const VECTOR zeroMV = { 0, 0 }; VECTOR get_pmv2(const MACROBLOCK * const mbs, const int mb_width, const int bound, const int x, const int y, const int block) { int lx, ly, lz; /* left */ int tx, ty, tz; /* top */ int rx, ry, rz; /* top-right */ int lpos, tpos, rpos; int num_cand = 0, last_cand = 1; VECTOR pmv[4]; /* left neighbour, top neighbour, top-right neighbour */ switch (block) { case 0: lx = x - 1; ly = y; lz = 1; tx = x; ty = y - 1; tz = 2; rx = x + 1; ry = y - 1; rz = 2; break; case 1: lx = x; ly = y; lz = 0; tx = x; ty = y - 1; tz = 3; rx = x + 1; ry = y - 1; rz = 2; break; case 2: lx = x - 1; ly = y; lz = 3; tx = x; ty = y; tz = 0; rx = x; ry = y; rz = 1; break; default: lx = x; ly = y; lz = 2; tx = x; ty = y; tz = 0; rx = x; ry = y; rz = 1; } lpos = lx + ly * mb_width; rpos = rx + ry * mb_width; tpos = tx + ty * mb_width; if (lpos >= bound && lx >= 0) { num_cand++; pmv[1] = mbs[lpos].mvs[lz]; } else pmv[1] = zeroMV; if (tpos >= bound) { num_cand++; last_cand = 2; pmv[2] = mbs[tpos].mvs[tz]; } else pmv[2] = zeroMV; if (rpos >= bound && rx < mb_width) { num_cand++; last_cand = 3; pmv[3] = mbs[rpos].mvs[rz]; } else pmv[3] = zeroMV; /* If there're more than one candidate, we return the median vector */ if (num_cand > 1) { /* set median */ pmv[0].x = MIN(MAX(pmv[1].x, pmv[2].x), MIN(MAX(pmv[2].x, pmv[3].x), MAX(pmv[1].x, pmv[3].x))); pmv[0].y = MIN(MAX(pmv[1].y, pmv[2].y), MIN(MAX(pmv[2].y, pmv[3].y), MAX(pmv[1].y, pmv[3].y))); return pmv[0]; } return pmv[last_cand]; /* no point calculating median mv */ } VECTOR get_pmv2_interlaced(const MACROBLOCK * const mbs, const int mb_width, const int bound, const int 
x, const int y, const int block) { int lx, ly, lz; /* left */ int tx, ty, tz; /* top */ int rx, ry, rz; /* top-right */ int lpos, tpos, rpos; int num_cand = 0, last_cand = 1; VECTOR pmv[4]; /* left neighbour, top neighbour, top-right neighbour */ lx=x-1; ly=y; lz=1; tx=x; ty=y-1; tz=2; rx=x+1; ry=y-1; rz=2; lpos=lx+ly*mb_width; rpos=rx+ry*mb_width; tpos=tx+ty*mb_width; if(lx>=0 && lpos>=bound) { num_cand++; if(mbs[lpos].field_pred) pmv[1] = mbs[lpos].mvs_avg; else pmv[1] = mbs[lpos].mvs[lz]; } else { pmv[1] = zeroMV; } if(tpos>=bound) { num_cand++; last_cand=2; if(mbs[tpos].field_pred) pmv[2] = mbs[tpos].mvs_avg; else pmv[2] = mbs[tpos].mvs[tz]; } else { pmv[2] = zeroMV; } if(rx<mb_width && rpos>=bound) { num_cand++; last_cand = 3; if(mbs[rpos].field_pred) pmv[3] = mbs[rpos].mvs_avg; else pmv[3] = mbs[rpos].mvs[rz]; } else { pmv[3] = zeroMV; } /* If there is more than one candidate, we return the median vector */ if(num_cand>1) { /* set median */ pmv[0].x = MIN(MAX(pmv[1].x, pmv[2].x), MIN(MAX(pmv[2].x, pmv[3].x), MAX(pmv[1].x, pmv[3].x))); pmv[0].y = MIN(MAX(pmv[1].y, pmv[2].y), MIN(MAX(pmv[2].y, pmv[3].y), MAX(pmv[1].y, pmv[3].y))); return pmv[0]; } return pmv[last_cand]; /* no point calculating median mv */ } VECTOR get_qpmv2(const MACROBLOCK * const mbs, const int mb_width, const int bound, const int x, const int y, const int block) { int lx, ly, lz; /* left */ int tx, ty, tz; /* top */ int rx, ry, rz; /* top-right */ int lpos, tpos, rpos; int num_cand = 0, last_cand = 1; VECTOR pmv[4]; /* left neighbour, top neighbour, top-right neighbour */ switch (block) { case 0: lx = x - 1; ly = y; lz = 1; tx = x; ty = y - 1; tz = 2; rx = x + 1; ry = y - 1; rz = 2; break; case 1: lx = x; ly = y; lz = 0; tx = x; ty = y - 1; tz = 3; rx = x + 1; ry = y - 1; rz = 2; break; case 2: lx = x - 1; ly = y; lz = 3; tx = x; ty = y; tz = 0; rx = x; ry = y; rz = 1; break; default: lx = x; ly = y; lz = 2; tx = x; ty = y; tz = 0; rx = x; ry = y; rz = 1; } lpos = lx + ly * mb_width; rpos = rx + ry * 
mb_width; tpos = tx + ty * mb_width; if (lpos >= bound && lx >= 0) { num_cand++; pmv[1] = mbs[lpos].qmvs[lz]; } else pmv[1] = zeroMV; if (tpos >= bound) { num_cand++; last_cand = 2; pmv[2] = mbs[tpos].qmvs[tz]; } else pmv[2] = zeroMV; if (rpos >= bound && rx < mb_width) { num_cand++; last_cand = 3; pmv[3] = mbs[rpos].qmvs[rz]; } else pmv[3] = zeroMV; /* If there're more than one candidate, we return the median vector */ if (num_cand > 1) { /* set median */ pmv[0].x = MIN(MAX(pmv[1].x, pmv[2].x), MIN(MAX(pmv[2].x, pmv[3].x), MAX(pmv[1].x, pmv[3].x))); pmv[0].y = MIN(MAX(pmv[1].y, pmv[2].y), MIN(MAX(pmv[2].y, pmv[3].y), MAX(pmv[1].y, pmv[3].y))); return pmv[0]; } return pmv[last_cand]; /* no point calculating median mv */ } xvidcore/src/decoder.h0000664000076500007650000001076511564705453016053 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Decoder related header - * * Copyright(C) 2002-2010 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: decoder.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _DECODER_H_ #define _DECODER_H_ #include "xvid.h" #include "portab.h" #include "global.h" #include "image/image.h" #include "image/postprocessing.h" /***************************************************************************** * Structures ****************************************************************************/ /* complexity estimation toggles */ typedef struct { int method; int opaque; int transparent; int intra_cae; int inter_cae; int no_update; int upsampling; int intra_blocks; int inter_blocks; int inter4v_blocks; int gmc_blocks; int not_coded_blocks; int dct_coefs; int dct_lines; int vlc_symbols; int vlc_bits; int apm; int npm; int interpolate_mc_q; int forw_back_mc_q; int halfpel2; int halfpel4; int sadct; int quarterpel; } ESTIMATION; typedef struct { /* vol bitstream */ int time_inc_resolution; int fixed_time_inc; uint32_t time_inc_bits; uint32_t shape; int ver_id; uint32_t quant_bits; uint32_t quant_type; uint16_t *mpeg_quant_matrices; int32_t quarterpel; int32_t cartoon_mode; int complexity_estimation_disable; ESTIMATION estimation; int interlacing; uint32_t top_field_first; uint32_t alternate_vertical_scan; int aspect_ratio; int par_width; int par_height; int sprite_enable; int sprite_warping_points; int sprite_warping_accuracy; int sprite_brightness_change; int newpred_enable; int reduced_resolution_enable; /* The bitstream version if it's a Xvid stream */ int bs_version; /* image */ int fixed_dimensions; uint32_t width; uint32_t height; uint32_t edged_width; uint32_t edged_height; IMAGE cur; IMAGE refn[2]; /* 0 -- last I or P VOP */ /* 1 -- first I or P */ IMAGE tmp; /* bframe interpolation, and post 
processing tmp buffer */ IMAGE qtmp; /* quarter pel tmp buffer */ /* postprocessing */ XVID_POSTPROC postproc; /* macroblock */ uint32_t mb_width; uint32_t mb_height; MACROBLOCK *mbs; /* * for B-frame & low_delay==0 * XXX: should move frame based stuff into a DECODER_FRAMEINFO struct */ MACROBLOCK *last_mbs; /* last MB */ int last_coding_type; /* last coding type value */ int last_reduced_resolution; /* last reduced_resolution value */ int32_t frames; /* total frame number */ int32_t packed_mode; /* bframes packed bitstream? (1 = yes) */ int8_t scalability; VECTOR p_fmv, p_bmv; /* pred forward & backward motion vector */ int64_t time; /* for record time */ int64_t time_base; int64_t last_time_base; int64_t last_non_b_time; int32_t time_pp; int32_t time_bp; uint32_t low_delay; /* low_delay flag (1 means no B_VOP) */ uint32_t low_delay_default; /* default value for low_delay flag */ /* for GMC: central place for all parameters */ IMAGE gmc; /* gmc tmp buffer, remove for blockbased compensation */ GMC_DATA gmc_data; NEW_GMC_DATA new_gmc_data; xvid_image_t* out_frm; /* This is used for slice rendering */ int * qscale; /* quantization table for decoder's stats */ /* Tells if the reference image is edged or not */ int is_edged[2]; int num_threads; } DECODER; /***************************************************************************** * Decoder prototypes ****************************************************************************/ void init_decoder(uint32_t cpu_flags); int decoder_create(xvid_dec_create_t * param); int decoder_destroy(DECODER * dec); int decoder_decode(DECODER * dec, xvid_dec_frame_t * frame, xvid_dec_stats_t * stats); #endif xvidcore/src/plugins/0000775000076500007650000000000011566427762015753 5ustar xvidbuildxvidbuildxvidcore/src/plugins/plugin_ssim.h0000664000076500007650000000327711564705453020460 0ustar xvidbuildxvidbuild /***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - SSIM plugin: 
computes the SSIM metric - * * Copyright(C) 2005 Johannes Reinhardt * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * * ****************************************************************************/ #ifndef SSIM_H #define SSIM_H /*Plugin for calculating and dumping the ssim quality metric according to http://www.cns.nyu.edu/~lcv/ssim/ there is an accurate (but very slow) implementation, using an 8x8 gaussian weighting window, that is quite close to the paper, and a faster unweighted implementation*/ typedef struct{ /*stat output*/ int b_printstat; char* stat_path; /*visualize*/ int b_visualize; /*accuracy 0 gaussian weighted (original, as in paper, very slow) <=4 unweighted, 1 slow 4 fastest*/ int acc; int cpu_flags; /* XVID_CPU_XXX flags */ } plg_ssim_param_t; int plugin_ssim(void * handle, int opt, void * param1, void * param2); #endif xvidcore/src/plugins/plugin_psnr.c0000664000076500007650000000364011564705453020454 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Xvid plugin: outputs PSNR to stdout (should disappear soon) - * * Copyright(C) 2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; 
either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: plugin_psnr.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdio.h> #include "../xvid.h" #include "../image/image.h" int xvid_plugin_psnr(void * handle, int opt, void * param1, void * param2) { switch(opt) { case XVID_PLG_INFO: { xvid_plg_info_t * info = (xvid_plg_info_t*)param1; info->flags = XVID_REQPSNR; return(0); } case XVID_PLG_CREATE: *((void**)param2) = NULL; /* We don't have any private data to bind here */ case XVID_PLG_DESTROY: case XVID_PLG_BEFORE: case XVID_PLG_FRAME: return(0); case XVID_PLG_AFTER: { xvid_plg_data_t * data = (xvid_plg_data_t*)param1; printf("y_psnr=%2.2f u_psnr=%2.2f v_psnr=%2.2f\n", sse_to_PSNR(data->sse_y, data->width*data->height), sse_to_PSNR(data->sse_u, data->width*data->height/4), sse_to_PSNR(data->sse_v, data->width*data->height/4)); return(0); } } return XVID_ERR_FAIL; } xvidcore/src/plugins/plugin_2pass2.c0000664000076500007650000016064611564705453020616 0ustar xvidbuildxvidbuild/****************************************************************************** * * Xvid Bit Rate Controller Library * - VBR 2 pass bitrate controller implementation - * * Copyright (C) 2002 Benjamin Lambert * 2002 Dirk Knop * 2002-2003 Edouard Gomez * 2003 Pete Ross * * This curve treatment algorithm is the one originally implemented by Foxer * and tuned by Dirk Knop for the Xvid vfw frontend. 
* * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: plugin_2pass2.c 1985 2011-05-18 09:02:35Z Isibaar $ * *****************************************************************************/ #define BQUANT_PRESCALE #undef COMPENSATE_FORMULA /* forces second pass not to be bigger than first */ #undef PASS_SMALLER /* automatically alters overflow controls (strength and improvement/degradation) to fight most common problems without user's knowledge */ #define SMART_OVERFLOW_SETTING #include <stdio.h> #include <math.h> #include <limits.h> #include "../xvid.h" #include "../image/image.h" /***************************************************************************** * Some default settings ****************************************************************************/ #define DEFAULT_KEYFRAME_BOOST 0 #define DEFAULT_OVERFLOW_CONTROL_STRENGTH 10 #define DEFAULT_CURVE_COMPRESSION_HIGH 0 #define DEFAULT_CURVE_COMPRESSION_LOW 0 #define DEFAULT_MAX_OVERFLOW_IMPROVEMENT 10 #define DEFAULT_MAX_OVERFLOW_DEGRADATION 10 /* Keyframe settings */ #define DEFAULT_KFREDUCTION 20 #define DEFAULT_KFTHRESHOLD 1 /***************************************************************************** * Some default constants (can be tuned) ****************************************************************************/ /* Specify the invariant part of the headers bits (header+MV) * as hlength/cst 
*/ #define INVARIANT_HEADER_PART_IVOP 1 /* factor 1.0f */ #define INVARIANT_HEADER_PART_PVOP 2 /* factor 0.5f */ #define INVARIANT_HEADER_PART_BVOP 8 /* factor 0.125f */ /***************************************************************************** * Structures ****************************************************************************/ /* Statistics */ typedef struct { int type; /* first pass type */ int quant; /* first pass quant */ int blks[3]; /* k,m,y blks */ int length; /* first pass length */ int invariant; /* what we assume as being invariant between the two passes, it's a sub part of header + MV bits */ int scaled_length; /* scaled length */ int desired_length; /* desired length; calculated during encoding */ int error; int zone_mode; /* XVID_ZONE_xxx */ double weight; } twopass_stat_t; /* Context struct */ typedef struct { xvid_plugin_2pass2_t param; /*---------------------------------- * constant statistical data *--------------------------------*/ /* Number of frames of the sequence */ int num_frames; /* Number of Intra frames of the sequence */ int num_keyframes; /* Target filesize to reach */ uint64_t target; /* Count of each frame type */ int count[3]; /* Total length of each frame type (1st pass) */ uint64_t tot_length[3]; uint64_t tot_invariant[3]; /* Average length of each frame type (used first for 1st pass data and * then for scaled averages) */ double avg_length[3]; /* Minimum frame length allowed for each frame type */ int min_length[3]; /* Total bytes per frame type once the curve has been scaled * NB: advanced parameters do not change this value.
This field * represents the total scaled w/o any advanced settings */ uint64_t tot_scaled_length[3]; /* Maximum frame size observed during the first pass; the RC * will try to force all frame sizes in the second pass to be under that * limit */ int max_length; /*---------------------------------- * Zones statistical data *--------------------------------*/ /* Total length used by XVID_ZONE_QUANT zones */ uint64_t tot_quant; uint64_t tot_quant_invariant; /* Holds the total amount of frame bytes, zone weighted (only scalable * part of frame bytes) */ uint64_t tot_weighted; /*---------------------------------- * Advanced settings helper ratios *--------------------------------*/ /* This is the ratio that has to be applied to all p/b frames in order * to reserve/retrieve bits for/from keyframe boosting and consecutive * keyframe penalty */ double pb_iboost_tax_ratio; /* This is the ratio to apply to all b/p frames in order to respect the * asymmetric curve compression while respecting a target filesize * NB: The asymmetric delta gain has to be computed before this ratio * is applied, and then the delta is added to the scaled size */ double assymetric_tax_ratio; /*---------------------------------- * Data from the stats file kept * in RAM for easy access *--------------------------------*/ /* Array of keyframe locations * eg: rc->keyframe_locations[100] returns the frame number of the 100th * keyframe */ int *keyframe_locations; /* Index of the last keyframe used in the keyframe_locations array */ int KF_idx; /* Array of all 1st pass data file -- see the twopass_stat_t structure * definition for more details */ twopass_stat_t * stats; /*---------------------------------- * Hysteresis helpers *--------------------------------*/ /* This field holds the int2float conversion errors of each quant per * frame type, this allows the RC to keep track of rounding errors and thus * increase or decrease the chosen quant according to this residue */ double quant_error[3][32]; /* This
field stores the count of each quant usage per frame type * No real role but for debugging */ int quant_count[3][32]; /* Last valid quantizer used per frame type, it allows quantizer * increment/decrement limitation in order to avoid big image quality * "jumps" */ int last_quant[3]; /*---------------------------------- * Overflow control *--------------------------------*/ /* Current overflow that has to be distributed to p/b frames */ double overflow; /* Total overflow for keyframes -- not distributed directly */ double KFoverflow; /* Amount of keyframe overflow to introduce to the global p/b frame * overflow counter at each encoded frame */ double KFoverflow_partial; /* Fractional quantizer residue carried from frame to frame in * XVID_ZONE_QUANT zones (see rc_2pass2_before) */ double fq_error; int min_quant; /* internal minimal quant, prevents wrong quants from being used */ /*---------------------------------- * Debug *--------------------------------*/ double desired_total; double real_total; int scaled_frames; } rc_2pass2_t; /***************************************************************************** * Sub plugin functions prototypes ****************************************************************************/ static int rc_2pass2_create(xvid_plg_create_t * create, rc_2pass2_t ** handle); static int rc_2pass2_before(rc_2pass2_t * rc, xvid_plg_data_t * data); static int rc_2pass2_after(rc_2pass2_t * rc, xvid_plg_data_t * data); static int rc_2pass2_destroy(rc_2pass2_t * rc, xvid_plg_destroy_t * destroy); /***************************************************************************** * Plugin definition ****************************************************************************/ int xvid_plugin_2pass2(void * handle, int opt, void * param1, void * param2) { switch(opt) { case XVID_PLG_INFO : case XVID_PLG_FRAME : return 0; case XVID_PLG_CREATE : return rc_2pass2_create((xvid_plg_create_t*)param1, param2); case XVID_PLG_DESTROY : return rc_2pass2_destroy((rc_2pass2_t*)handle, (xvid_plg_destroy_t*)param1); case XVID_PLG_BEFORE :
return rc_2pass2_before((rc_2pass2_t*)handle, (xvid_plg_data_t*)param1); case XVID_PLG_AFTER : return rc_2pass2_after((rc_2pass2_t*)handle, (xvid_plg_data_t*)param1); } return XVID_ERR_FAIL; } /***************************************************************************** * Sub plugin functions definitions ****************************************************************************/ /* First a few local helping function prototypes */ static int statsfile_count_frames(rc_2pass2_t * rc, char * filename); static int statsfile_load(rc_2pass2_t *rc, char * filename); static void zone_process(rc_2pass2_t *rc, const xvid_plg_create_t * create); static void first_pass_stats_prepare_data(rc_2pass2_t * rc); static void first_pass_scale_curve_internal(rc_2pass2_t *rc); static void scaled_curve_apply_advanced_parameters(rc_2pass2_t * rc); static int check_curve_for_vbv_compliancy(rc_2pass2_t * rc, const float fps); static int scale_curve_for_vbv_compliancy(rc_2pass2_t * rc, const float fps); #if 0 static void stats_print(rc_2pass2_t * rc); #endif /*---------------------------------------------------------------------------- *--------------------------------------------------------------------------*/ static int rc_2pass2_create(xvid_plg_create_t * create, rc_2pass2_t **handle) { xvid_plugin_2pass2_t * param = (xvid_plugin_2pass2_t *)create->param; rc_2pass2_t * rc; int i; rc = malloc(sizeof(rc_2pass2_t)); if (rc == NULL) return XVID_ERR_MEMORY; /* v1.0.x */ rc->param.version = param->version; rc->param.bitrate = param->bitrate; rc->param.filename = param->filename; rc->param.keyframe_boost = param->keyframe_boost; rc->param.curve_compression_high = param->curve_compression_high; rc->param.curve_compression_low = param->curve_compression_low; rc->param.overflow_control_strength = param->overflow_control_strength; rc->param.max_overflow_improvement = param->max_overflow_improvement; rc->param.max_overflow_degradation = param->max_overflow_degradation; rc->param.kfreduction = 
param->kfreduction; rc->param.kfthreshold = param->kfthreshold; rc->param.container_frame_overhead = param->container_frame_overhead; if (XVID_VERSION_MINOR(param->version) >= 1) { rc->param.vbv_size = param->vbv_size; rc->param.vbv_initial = param->vbv_initial; rc->param.vbv_maxrate = param->vbv_maxrate; rc->param.vbv_peakrate = param->vbv_peakrate; }else{ rc->param.vbv_size = rc->param.vbv_initial = rc->param.vbv_maxrate = rc->param.vbv_peakrate = 0; } /* Initialize all defaults */ #define _INIT(a, b) if((a) <= 0) (a) = (b) /* Let's set our defaults if needed */ _INIT(rc->param.keyframe_boost, DEFAULT_KEYFRAME_BOOST); _INIT(rc->param.overflow_control_strength, DEFAULT_OVERFLOW_CONTROL_STRENGTH); _INIT(rc->param.curve_compression_high, DEFAULT_CURVE_COMPRESSION_HIGH); _INIT(rc->param.curve_compression_low, DEFAULT_CURVE_COMPRESSION_LOW); _INIT(rc->param.max_overflow_improvement, DEFAULT_MAX_OVERFLOW_IMPROVEMENT); _INIT(rc->param.max_overflow_degradation, DEFAULT_MAX_OVERFLOW_DEGRADATION); /* Keyframe settings */ _INIT(rc->param.kfreduction, DEFAULT_KFREDUCTION); _INIT(rc->param.kfthreshold, DEFAULT_KFTHRESHOLD); #undef _INIT /* Initialize some stuff to zero */ for(i=0; i<3; i++) { int j; for (j=0; j<32; j++) { rc->quant_error[i][j] = 0; rc->quant_count[i][j] = 0; } } for (i=0; i<3; i++) rc->last_quant[i] = 0; rc->fq_error = 0; rc->min_quant = 1; rc->scaled_frames = 0; /* Count frames (and intra frames) in the stats file, store the result into * the rc structure */ if (statsfile_count_frames(rc, param->filename) == -1) { DPRINTF(XVID_DEBUG_RC,"[xvid rc] -- ERROR: fopen %s failed\n", param->filename); free(rc); return(XVID_ERR_FAIL); } /* Allocate the stats' memory */ if ((rc->stats = malloc(rc->num_frames * sizeof(twopass_stat_t))) == NULL) { free(rc); return(XVID_ERR_MEMORY); } /* Allocate keyframes location's memory * PS: see comment in pre_process0 for the +1 location requirement */ rc->keyframe_locations = malloc((rc->num_keyframes + 1) * sizeof(int)); if 
(rc->keyframe_locations == NULL) { free(rc->stats); free(rc); return(XVID_ERR_MEMORY); } /* Load the first pass stats */ if (statsfile_load(rc, param->filename) == -1) { DPRINTF(XVID_DEBUG_RC,"[xvid rc] -- ERROR: fopen %s failed\n", param->filename); free(rc->keyframe_locations); free(rc->stats); free(rc); return XVID_ERR_FAIL; } /* Compute the target filesize */ if (rc->param.bitrate<0) { /* if negative, bitrate equals the target (in kbytes) */ rc->target = ((uint64_t)(-rc->param.bitrate)) * 1024; } else if (rc->num_frames < create->fbase/create->fincr) { /* Source sequence is less than 1s long, we do as if it was 1s long */ rc->target = rc->param.bitrate / 8; } else { /* Target filesize = bitrate/8 * numframes / framerate */ rc->target = ((uint64_t)rc->param.bitrate * (uint64_t)rc->num_frames * \ (uint64_t)create->fincr) / \ ((uint64_t)create->fbase * 8); } DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Frame rate: %d/%d (%ffps)\n", create->fbase, create->fincr, (double)create->fbase/(double)create->fincr); DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Number of frames: %d\n", rc->num_frames); if(rc->param.bitrate>=0) DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Target bitrate: %ld\n", rc->param.bitrate); DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Target filesize: %lld\n", rc->target); /* Compensate the average frame overhead caused by the container */ rc->target -= rc->num_frames*rc->param.container_frame_overhead; DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Container Frame overhead: %d\n", rc->param.container_frame_overhead); if(rc->param.container_frame_overhead) DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- New target filesize after container compensation: %lld\n", rc->target); /* When bitrate is not given it means it has been scaled by an external * application */ if (rc->param.bitrate) { /* Apply zone settings * - set rc->tot_quant which represents the total num of bytes spent in * fixed quant zones * - set rc->tot_weighted which represents the total amount of bytes * spent in normal or weighted zones in 
first pass (normal zones can * be considered weight=1) * - set rc->tot_quant_invariant which represents the total num of bytes * spent in fixed quant zones for headers */ zone_process(rc, create); } else { /* External scaling -- zones are ignored */ for (i=0;i<rc->num_frames;i++) { rc->stats[i].zone_mode = XVID_ZONE_WEIGHT; rc->stats[i].weight = 1.0; } rc->tot_quant = 0; } /* Gathers some information about first pass stats: * - finds the minimum frame length for each frame type during 1st pass. * rc->min_length[] * - determines the maximum frame length observed (no frame type distinction). * rc->max_length * - count how many times each frame type has been used. * rc->count[] * - total bytes used per frame type * rc->tot_length[] * - total bytes considered invariant between the 2 passes * - store keyframe location * rc->keyframe_locations[] */ first_pass_stats_prepare_data(rc); /* If we have a user bitrate, it means it's an internal curve scaling */ if (rc->param.bitrate) { /* Perform internal curve scaling */ first_pass_scale_curve_internal(rc); } /* Apply advanced curve options, and compute some parameters in order to * shape the curve in the BEFORE/AFTER pair of functions */ scaled_curve_apply_advanced_parameters(rc); /* Check curve for VBV compliancy and rescale if necessary */ #ifdef VBV_FORCE if (rc->param.vbv_size==0) { rc->param.vbv_size = 3145728; rc->param.vbv_initial = 2359296; rc->param.vbv_maxrate = 4854000; rc->param.vbv_peakrate = 8000000; } #endif /* vbv_size==0 switches VBV check off */ if (rc->param.vbv_size > 0) { const float fps = (float)((double)create->fbase/(double)create->fincr); int status = check_curve_for_vbv_compliancy(rc, fps); if (status) { DPRINTF(XVID_DEBUG_RC, "[xvid rc] Underflow detected - Scaling Curve for compliancy.\n"); } status = scale_curve_for_vbv_compliancy(rc, fps); if (status == 0) { DPRINTF(XVID_DEBUG_RC, "[xvid rc] VBV compliant curve scaling done.\n"); } else { DPRINTF(XVID_DEBUG_RC, "[xvid rc] VBV compliant curve scaling
impossible.\n"); } } *handle = rc; return(0); } /*---------------------------------------------------------------------------- *--------------------------------------------------------------------------*/ static int rc_2pass2_destroy(rc_2pass2_t * rc, xvid_plg_destroy_t * destroy) { DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- target_total:%lld desired_total:%.2f (%.2f%%) actual_total:%.2f (%.2f%%)\n", rc->target, rc->desired_total, 100*rc->desired_total/(double)rc->target, rc->real_total, 100*rc->real_total/(double)rc->target); free(rc->keyframe_locations); free(rc->stats); free(rc); return(0); } /*---------------------------------------------------------------------------- *--------------------------------------------------------------------------*/ static int rc_2pass2_before(rc_2pass2_t * rc, xvid_plg_data_t * data) { twopass_stat_t * s = &rc->stats[data->frame_num]; double dbytes; double scaled_quant; double overflow; int capped_to_max_framesize = 0; /* This function is quite long but easy to understand. In order to simplify * the code path (a bit), we treat 3 cases that can return immediately.
*/ /* First case: Another plugin has already set a quantizer */ if (data->quant > 0) return(0); /* Second case: insufficient stats data * We can't guess much what we should do, let core decide all alone */ if (data->frame_num >= rc->num_frames) { DPRINTF(XVID_DEBUG_RC,"[xvid rc] -- stats file too short (now processing frame %d)", data->frame_num); return(0); } /* Third case: We are in a Quant zone * Quant zones must just ensure we use the same settings as first pass * So set the quantizer and the type */ if (s->zone_mode == XVID_ZONE_QUANT) { /* Quant stuff */ rc->fq_error += s->weight; data->quant = (int)rc->fq_error; rc->fq_error -= data->quant; /* The type stuff */ data->type = s->type; /* The only required data for AFTER step is this one for the overflow * control */ s->desired_length = s->length; return(0); } /*************************************************************************/ /*************************************************************************/ /*************************************************************************/ /*------------------------------------------------------------------------- * Frame bit allocation first part * * First steps apply user settings, just like it is done in the theoretical * scaled_curve_apply_advanced_parameters *-----------------------------------------------------------------------*/ /* Set desired to what we are wanting to obtain for this frame */ dbytes = (double)s->scaled_length; /* IFrame user settings*/ if (s->type == XVID_TYPE_IVOP) { /* Keyframe boosting -- All keyframes benefit from it */ dbytes += dbytes*rc->param.keyframe_boost / 100; #if 0 /* ToDo: decide how to apply kfthresholding */ #endif } else { /* P/S/B frames must reserve some bits for iframe boosting */ dbytes *= rc->pb_iboost_tax_ratio; /* Apply asymmetric curve compression */ if (rc->param.curve_compression_high || rc->param.curve_compression_low) { double assymetric_delta; /* Compute the asymmetric delta, this is computed before applying *
the tax, as done in the pre_process function */ if (dbytes > rc->avg_length[s->type-1]) assymetric_delta = (rc->avg_length[s->type-1] - dbytes) * rc->param.curve_compression_high / 100.0; else assymetric_delta = (rc->avg_length[s->type-1] - dbytes) * rc->param.curve_compression_low / 100.0; /* Now we must apply the asymmetric tax, else our curve compression * would not give a theoretical target size equal to what is * expected */ dbytes *= rc->assymetric_tax_ratio; /* Now we can add the asymmetric delta */ dbytes += assymetric_delta; } } /* That is what we would like to have -- Don't put that chunk after * overflow control, otherwise, overflow is counted twice and you obtain * half sized bitrate sequences */ s->desired_length = (int)dbytes; rc->desired_total += dbytes; /*------------------------------------------------------------------------ * Frame bit allocation: overflow control part. * * Unlike the theoretical scaled_curve_apply_advanced_parameters, here * it's real encoding and we need to make sure we don't go too far from * what is our ideal scaled curve.
*-----------------------------------------------------------------------*/ /* Compute the overflow we should compensate */ if (s->type != XVID_TYPE_IVOP || rc->overflow > 0) { double frametype_factor; double framesize_factor; /* Take only the desired part of overflow */ overflow = rc->overflow; /* Factor that will take care to decrease the overflow applied * according to the importance of this frame type in term of * overall size */ frametype_factor = rc->count[XVID_TYPE_IVOP-1]*rc->avg_length[XVID_TYPE_IVOP-1]; frametype_factor += rc->count[XVID_TYPE_PVOP-1]*rc->avg_length[XVID_TYPE_PVOP-1]; frametype_factor += rc->count[XVID_TYPE_BVOP-1]*rc->avg_length[XVID_TYPE_BVOP-1]; frametype_factor /= rc->count[s->type-1]*rc->avg_length[s->type-1]; frametype_factor = 1/frametype_factor; /* Factor that will take care not to compensate too much for this frame * size */ framesize_factor = dbytes; framesize_factor /= rc->avg_length[s->type-1]; /* Treat only the overflow part concerned by this frame type and size */ overflow *= frametype_factor; #if 0 /* Leave this one alone, as it impacts badly on quality */ overflow *= framesize_factor; #endif /* Apply the overflow strength imposed by the user */ overflow *= (rc->param.overflow_control_strength/100.0f); } else { /* no negative overflow applied in IFrames because: * - their role is important as they're references for P/BFrames. 
* - there aren't many in typical sequences, so if an IFrame overflows too * much, this overflow may impact the next IFrame too much and generate * a sequence of poor quality frames */ overflow = 0; } /* Make sure we are not trying to compensate more overflow than we even have */ if (fabs(overflow) > fabs(rc->overflow)) overflow = rc->overflow; /* Make sure the overflow doesn't push the frame size out of the range * [-max_degradation..+max_improvement] */ if (overflow > dbytes*rc->param.max_overflow_improvement / 100) { if(overflow <= dbytes) dbytes += dbytes * rc->param.max_overflow_improvement / 100; else dbytes += overflow * rc->param.max_overflow_improvement / 100; } else if (overflow < - dbytes * rc->param.max_overflow_degradation / 100) { dbytes -= dbytes * rc->param.max_overflow_degradation / 100; } else { dbytes += overflow; } /*------------------------------------------------------------------------- * Frame bit allocation last part: * * Cap frame length so we reach neither bigger frame sizes than first * pass nor smaller than the allowed minimum.
*-----------------------------------------------------------------------*/ #ifdef PASS_SMALLER if (dbytes > s->length) { dbytes = s->length; } #endif /* Prevent stupid desired sizes under logical values */ if (dbytes < rc->min_length[s->type-1]) { dbytes = rc->min_length[s->type-1]; } /*------------------------------------------------------------------------ * Desired frame length <-> quantizer mapping *-----------------------------------------------------------------------*/ #ifdef BQUANT_PRESCALE /* For bframes we prescale the quantizer to avoid too high quant scaling */ if(s->type == XVID_TYPE_BVOP) { twopass_stat_t *b_ref = s; /* Find the reference frame */ while(b_ref != &rc->stats[0] && b_ref->type == XVID_TYPE_BVOP) b_ref--; /* Compute the original quant */ s->quant = 2*(100*s->quant - data->bquant_offset); s->quant += data->bquant_ratio - 1; /* to avoid rounding issues */ s->quant = s->quant/data->bquant_ratio - b_ref->quant; } #endif /* Don't laugh at this very 'simple' quant<->size relationship, it * proves to be accurate enough for our algorithm */ scaled_quant = (double)s->quant*(double)s->length/(double)dbytes; #ifdef COMPENSATE_FORMULA /* We know xvidcore will apply the bframe formula again, so we compensate * it right now to make sure we would not apply it twice */ if(s->type == XVID_TYPE_BVOP) { twopass_stat_t *b_ref = s; /* Find the reference frame */ while(b_ref != &rc->stats[0] && b_ref->type == XVID_TYPE_BVOP) b_ref--; /* Compute the quant it would be if the core did not apply the bframe * formula */ scaled_quant = 100*scaled_quant - data->bquant_offset; scaled_quant += data->bquant_ratio - 1; /* to avoid rounding issues */ scaled_quant /= data->bquant_ratio; } #endif /* Quantizer has been scaled using floating point operations/results, we * must cast it to integer */ data->quant = (int)scaled_quant; /* Let's clip the computed quantizer, if needed */ if (data->quant < 1) { data->quant = 1; } else if (data->quant > 31) { data->quant = 31; } else {
/* The frame quantizer has not been clipped, this appears to be a good * computed quantizer, do not lose the quantizer decimal part that we * accumulate for later reuse when its sum represents a complete * unit. */ rc->quant_error[s->type-1][data->quant] += scaled_quant - (double)data->quant; if (rc->quant_error[s->type-1][data->quant] >= 1.0) { rc->quant_error[s->type-1][data->quant] -= 1.0; data->quant++; } else if (rc->quant_error[s->type-1][data->quant] <= -1.0) { rc->quant_error[s->type-1][data->quant] += 1.0; data->quant--; } } /* Now we have a computed quant that is in the right quant range, with a * possible +1 correction due to cumulated error. We can now safely clip * the quantizer again with user's quant ranges. "Safely" means the Rate * Control could learn more about this quantizer, this knowledge is useful * for future frames even if this quantizer won't really be used at the * moment, that's why we don't perform this clipping earlier. */ if (data->quant < data->min_quant[s->type-1]) { data->quant = data->min_quant[s->type-1]; } else if (data->quant > data->max_quant[s->type-1]) { data->quant = data->max_quant[s->type-1]; } if (data->quant < rc->min_quant) data->quant = rc->min_quant; /* To avoid big quality jumps from frame to frame, we apply a "security" * rule that makes |last_quant - new_quant| <= 2.
This rule only applies * to predicted frames (P and B) */ if (s->type != XVID_TYPE_IVOP && rc->last_quant[s->type-1] && capped_to_max_framesize == 0) { if (data->quant > rc->last_quant[s->type-1] + 2) { data->quant = rc->last_quant[s->type-1] + 2; DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- frame %d p/b-frame quantizer prevented from rising too steeply\n", data->frame_num); } if (data->quant < rc->last_quant[s->type-1] - 2) { data->quant = rc->last_quant[s->type-1] - 2; DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- frame:%d p/b-frame quantizer prevented from falling too steeply\n", data->frame_num); } } /* We don't want to pollute the RC hysteresis when our computed quant has * been computed from a capped frame size */ if (capped_to_max_framesize == 0) rc->last_quant[s->type-1] = data->quant; /* Don't forget to force 1st pass frame type ;-) */ if (rc->scaled_frames) data->type = s->type; rc->scaled_frames++; return 0; } /*---------------------------------------------------------------------------- *--------------------------------------------------------------------------*/ static int rc_2pass2_after(rc_2pass2_t * rc, xvid_plg_data_t * data) { const char frame_type[4] = { 'i', 'p', 'b', 's'}; twopass_stat_t * s = &rc->stats[data->frame_num]; /* Insufficient stats data */ if (data->frame_num >= rc->num_frames) return 0; /* Update the quantizer counter */ rc->quant_count[s->type-1][data->quant]++; /* Update the frame type overflow */ if (data->type == XVID_TYPE_IVOP) { int kfdiff = 0; if(rc->KF_idx != rc->num_frames -1) { kfdiff = rc->keyframe_locations[rc->KF_idx+1]; kfdiff -= rc->keyframe_locations[rc->KF_idx]; } /* Flush Keyframe overflow accumulator */ rc->overflow += rc->KFoverflow; /* Store the frame overflow to the keyframe accumulator */ rc->KFoverflow = s->desired_length - data->length; if (kfdiff > 1) { /* Non-consecutive keyframes case: * We can then divide this total keyframe overflow into equal parts * that we will distribute into regular overflow at each frame * between
the sequence bounded by two IFrames */ rc->KFoverflow_partial = rc->KFoverflow / (kfdiff - 1); } else { /* Consecutive keyframes case: * Flush immediately the keyframe overflow and reset keyframe * overflow */ rc->overflow += rc->KFoverflow; rc->KFoverflow = 0; rc->KFoverflow_partial = 0; } rc->KF_idx++; } else { /* Accumulate the frame overflow */ rc->overflow += s->desired_length - data->length; /* Distribute part of the keyframe overflow */ rc->overflow += rc->KFoverflow_partial; /* Don't forget to subtract that same amount from the total keyframe * overflow */ rc->KFoverflow -= rc->KFoverflow_partial; } s->error = s->desired_length - data->length; rc->real_total += data->length; DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- frame:%d type:%c quant:%d stats:%d scaled:%d desired:%d actual:%d error:%d overflow:%.2f\n", data->frame_num, frame_type[data->type-1], data->quant, s->length, s->scaled_length, s->desired_length, s->desired_length - s->error, -s->error, rc->overflow); return(0); } /***************************************************************************** * Helper functions definition ****************************************************************************/ /* Default buffer size for reading lines */ #define BUF_SZ 1024 /* Helper functions for reading/parsing the stats file */ static char *skipspaces(char *string); static int iscomment(char *string); static char *readline(FILE *f); /* This function counts the number of frame entries in the stats file * It also counts the number of I Frames */ static int statsfile_count_frames(rc_2pass2_t * rc, char * filename) { FILE * f; char *line; int lines; rc->num_frames = 0; rc->num_keyframes = 0; if ((f = fopen(filename, "rb")) == NULL) return(-1); lines = 0; while ((line = readline(f)) != NULL) { char *ptr; char type; int fields; lines++; /* We skip spaces */ ptr = skipspaces(line); /* Skip comment lines or empty lines */ if(iscomment(ptr) || *ptr == '\0') { free(line); continue; } /* Read the stat line from buffer */
fields = sscanf(ptr, "%c", &type); /* Valid stats files have at least 7 fields */ if (fields == 1) { switch(type) { case 'i': case 'I': rc->num_keyframes++; case 'p': case 'P': case 'b': case 'B': case 's': case 'S': rc->num_frames++; break; default: DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- WARNING: L%d unknown frame type used (%c).\n", lines, type); } } else { DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- WARNING: L%d misses some stat fields (%d).\n", lines, 7-fields); } /* Free the line buffer */ free(line); } /* We are done with the file */ fclose(f); if (!rc->num_keyframes) return (-1); /* No keyframes? Then something is wrong */ else return(0); } /* open stats file(s) and read into rc->stats array */ static int statsfile_load(rc_2pass2_t *rc, char * filename) { FILE * f; int processed_entries; /* Opens the file */ if ((f = fopen(filename, "rb"))==NULL) return(-1); processed_entries = 0; while(processed_entries < rc->num_frames) { char type; int fields; twopass_stat_t * s = &rc->stats[processed_entries]; char *line, *ptr; /* Read the line from the file */ if((line = readline(f)) == NULL) break; /* We skip spaces */ ptr = skipspaces(line); /* Skip comment lines or empty lines */ if(iscomment(ptr) || *ptr == '\0') { free(line); continue; } /* Reset this field that is optional */ s->scaled_length = 0; /* Convert the fields */ fields = sscanf(ptr, "%c %d %d %d %d %d %d %d\n", &type, &s->quant, &s->blks[0], &s->blks[1], &s->blks[2], &s->length, &s->invariant /* not really yet */, &s->scaled_length); /* Free line buffer, we don't need it anymore */ free(line); /* Fail silently, this has probably been warned in * statsfile_count_frames */ if(fields != 7 && fields != 8) continue; /* Convert frame type and compute the invariant length part */ switch(type) { case 'i': case 'I': s->type = XVID_TYPE_IVOP; s->invariant /= INVARIANT_HEADER_PART_IVOP; break; case 'p': case 'P': case 's': case 'S': s->type = XVID_TYPE_PVOP; s->invariant /= INVARIANT_HEADER_PART_PVOP; break; case 'b': case 
'B': s->type = XVID_TYPE_BVOP; s->invariant /= INVARIANT_HEADER_PART_BVOP; break; default: /* Same as before, fail silently */ continue; } /* Ok it seems it's been processed correctly */ processed_entries++; } /* Close the file */ fclose(f); return(0); } /* pre-process the statistics data * - for each type, count, tot_length, min_length, max_length * - set keyframe_locations, tot_weighted */ static void first_pass_stats_prepare_data(rc_2pass2_t * rc) { int i,j; /* *rc fields initialization * NB: INT_MAX and INT_MIN are used in order to be immediately replaced * with real values of the 1pass */ for (i=0; i<3; i++) { rc->count[i]=0; rc->tot_length[i] = 0; rc->tot_invariant[i] = 0; rc->min_length[i] = INT_MAX; } rc->max_length = INT_MIN; rc->tot_weighted = 0; /* Loop through all frames and find/compute all the stuff this function * is supposed to do */ for (i=j=0; i<rc->num_frames; i++) { twopass_stat_t * s = &rc->stats[i]; rc->count[s->type-1]++; rc->tot_length[s->type-1] += s->length; rc->tot_invariant[s->type-1] += s->invariant; if (s->zone_mode != XVID_ZONE_QUANT) rc->tot_weighted += (int)(s->weight*(s->length - s->invariant)); if (s->length < rc->min_length[s->type-1]) { rc->min_length[s->type-1] = s->length; } if (s->length > rc->max_length) { rc->max_length = s->length; } if (s->type == XVID_TYPE_IVOP) { rc->keyframe_locations[j] = i; j++; } } /* NB: * The "per sequence" overflow system considers a natural sequence to be * formed by all frames between two iframes, so if we want to make sure * the system does not go nuts during last sequence, we force the last * frame to appear in the keyframe locations array.
*/ rc->keyframe_locations[j] = i; DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Min 1st pass IFrame length: %d\n", rc->min_length[0]); DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Min 1st pass PFrame length: %d\n", rc->min_length[1]); DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Min 1st pass BFrame length: %d\n", rc->min_length[2]); } /* calculate zone weight "center" */ static void zone_process(rc_2pass2_t *rc, const xvid_plg_create_t * create) { int i,j; int n = 0; rc->tot_quant = 0; rc->tot_quant_invariant = 0; if (create->num_zones == 0) { for (j = 0; j < rc->num_frames; j++) { rc->stats[j].zone_mode = XVID_ZONE_WEIGHT; rc->stats[j].weight = 1.0; } n += rc->num_frames; } for(i=0; i < create->num_zones; i++) { int next = (i+1<create->num_zones) ? create->zones[i+1].frame : rc->num_frames; /* Zero weight makes no sense */ if (create->zones[i].increment == 0) create->zones[i].increment = 1; /* And obviously an undetermined infinite makes even less sense */ if (create->zones[i].base == 0) create->zones[i].base = 1; if (i==0 && create->zones[i].frame > 0) { for (j = 0; j < create->zones[i].frame && j < rc->num_frames; j++) { rc->stats[j].zone_mode = XVID_ZONE_WEIGHT; rc->stats[j].weight = 1.0; } n += create->zones[i].frame; } if (create->zones[i].mode == XVID_ZONE_WEIGHT) { for (j = create->zones[i].frame; j < next && j < rc->num_frames; j++ ) { rc->stats[j].zone_mode = XVID_ZONE_WEIGHT; rc->stats[j].weight = (double)create->zones[i].increment / (double)create->zones[i].base; } next -= create->zones[i].frame; n += next; } else{ /* XVID_ZONE_QUANT */ for (j = create->zones[i].frame; j < next && j < rc->num_frames; j++ ) { rc->stats[j].zone_mode = XVID_ZONE_QUANT; rc->stats[j].weight = (double)create->zones[i].increment / (double)create->zones[i].base; rc->tot_quant += rc->stats[j].length; rc->tot_quant_invariant += rc->stats[j].invariant; } } } } /* scale the curve */ static void first_pass_scale_curve_internal(rc_2pass2_t *rc) { int64_t target; int64_t total_invariant; double scaler; int i, num_MBs;
/* We only scale texture data ! */ total_invariant = rc->tot_invariant[XVID_TYPE_IVOP-1]; total_invariant += rc->tot_invariant[XVID_TYPE_PVOP-1]; total_invariant += rc->tot_invariant[XVID_TYPE_BVOP-1]; /* don't forget to subtract header bytes used in quant zones, otherwise we * are counting them twice */ total_invariant -= rc->tot_quant_invariant; /* We remove the bytes used by the fixed quantizer zones during first pass * with the same quants, so we know very precisely how much that * represents */ target = rc->target; target -= rc->tot_quant; /* Let's compute a linear scaler in order to perform curve scaling */ scaler = (double)(target - total_invariant) / (double)(rc->tot_weighted); #ifdef SMART_OVERFLOW_SETTING if (scaler > 0.9) { rc->param.max_overflow_degradation *= 5; rc->param.max_overflow_improvement *= 5; rc->param.overflow_control_strength *= 3; } else if (scaler > 0.6) { rc->param.max_overflow_degradation *= 2; rc->param.max_overflow_improvement *= 2; rc->param.overflow_control_strength *= 2; } else { rc->min_quant = 2; } #endif /* Compute min frame lengths (for each frame type) according to the number * of MBs. We sum all block type counters of frame 0, this gives us the * number of MBs. * * We compare these hardcoded values with observed values in first pass * (determined in pre_process0). Then we keep the real minimum. */ /* Number of MBs */ num_MBs = rc->stats[0].blks[0]; num_MBs += rc->stats[0].blks[1]; num_MBs += rc->stats[0].blks[2]; /* Minimum for I frames */ if(rc->min_length[XVID_TYPE_IVOP-1] > ((num_MBs*22) + 240) / 8) rc->min_length[XVID_TYPE_IVOP-1] = ((num_MBs*22) + 240) / 8; /* Minimum for P/S frames */ if(rc->min_length[XVID_TYPE_PVOP-1] > ((num_MBs) + 88) / 8) rc->min_length[XVID_TYPE_PVOP-1] = ((num_MBs) + 88) / 8; /* Minimum for B frames */ if(rc->min_length[XVID_TYPE_BVOP-1] > 8) rc->min_length[XVID_TYPE_BVOP-1] = 8; /* Perform an initial scale pass. 
* * If a frame size is scaled underneath our hardcoded minimums, then we * force the frame size to the minimum, and deduct the original & scaled * frame length from the original and target total lengths */ for (i=0; i<rc->num_frames; i++) { twopass_stat_t * s = &rc->stats[i]; int len; /* No need to scale frame length for which a specific quantizer is * specified thanks to zones */ if (s->zone_mode == XVID_ZONE_QUANT) { s->scaled_length = s->length; continue; } /* Compute the scaled length -- only non invariant data length is scaled */ len = s->invariant + (int)((double)(s->length-s->invariant) * scaler * s->weight); /* Compare with the computed minimum */ if (len < rc->min_length[s->type-1]) { /* This is a 'forced size' frame, set its frame size to the * computed minimum */ s->scaled_length = rc->min_length[s->type-1]; /* Remove both scaled and original size from their respective * total counters, as we prepare a second pass for 'regular' * frames */ target -= s->scaled_length; } else { /* Do nothing for now, we'll scale this later */ s->scaled_length = 0; } } /* The first pass on data subtracted all 'forced size' frames from the * total counters. Now, it's possible to scale the 'regular' frames. 
*/ /* Scaling factor for 'regular' frames */ scaler = (double)(target - total_invariant) / (double)(rc->tot_weighted); /* Do another pass with the new scaler */ for (i=0; i<rc->num_frames; i++) { twopass_stat_t * s = &rc->stats[i]; /* Ignore frame with forced frame sizes */ if (s->scaled_length == 0) s->scaled_length = s->invariant + (int)((double)(s->length-s->invariant) * scaler * s->weight); } /* Job done */ return; } /* Apply all user settings to the scaled curve * This implies: * keyframe boosting * high/low compression */ static void scaled_curve_apply_advanced_parameters(rc_2pass2_t * rc) { int i; int64_t ivop_boost_total; /* Reset the rate controller (per frame type) total byte counters */ for (i=0; i<3; i++) rc->tot_scaled_length[i] = 0; /* Compute total bytes for each frame type */ for (i=0; i<rc->num_frames;i++) { twopass_stat_t *s = &rc->stats[i]; rc->tot_scaled_length[s->type-1] += s->scaled_length; } /* First we compute the total amount of bits needed, as being described by * the scaled distribution. 
During this pass over the complete stats data, * we see how many bits two user settings will get/give from/to p&b frames: * - keyframe boosting * - keyframe distance penalty */ rc->KF_idx = 0; ivop_boost_total = 0; for (i=0; i<rc->num_frames; i++) { twopass_stat_t * s = &rc->stats[i]; /* Some more work is needed for I frames */ if (s->type == XVID_TYPE_IVOP) { int ivop_boost; /* Accumulate bytes needed for keyframe boosting */ ivop_boost = s->scaled_length*rc->param.keyframe_boost/100; #if 0 /* ToDo: decide how to apply kfthresholding */ #endif /* If the frame size drops under the minimum length, then cap ivop_boost */ if (ivop_boost + s->scaled_length < rc->min_length[XVID_TYPE_IVOP-1]) ivop_boost = rc->min_length[XVID_TYPE_IVOP-1] - s->scaled_length; /* Accumulate the ivop boost */ ivop_boost_total += ivop_boost; /* Don't forget to update the keyframe index */ rc->KF_idx++; } } /* Initialize the IBoost tax ratio for P/S/B frames * * This ratio has to be applied to p/b/s frames in order to reserve * additional bits for keyframes (keyframe boosting) or if too much * keyframe distance is applied, bits retrieved from the keyframes. 
* * ie pb_length *= rc->pb_iboost_tax_ratio; * * gives the ideal length of a p/b frame */ /* Compute the total length of p/b/s frames (temporary storage into * movie_curve) */ rc->pb_iboost_tax_ratio = (double)rc->tot_scaled_length[XVID_TYPE_PVOP-1]; rc->pb_iboost_tax_ratio += (double)rc->tot_scaled_length[XVID_TYPE_BVOP-1]; /* Compute the ratio described above * taxed_total = sum(0, n, tax*scaled_length) * <=> taxed_total = tax.sum(0, n, scaled_length) * <=> tax = taxed_total / original_total */ rc->pb_iboost_tax_ratio = (rc->pb_iboost_tax_ratio - ivop_boost_total) / rc->pb_iboost_tax_ratio; DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- IFrame boost tax ratio:%.2f\n", rc->pb_iboost_tax_ratio); /* Compute the average size of frames per frame type */ for(i=0; i<3; i++) { /* Special case for missing type or weird case */ if (rc->count[i] == 0 || rc->pb_iboost_tax_ratio == 0) { rc->avg_length[i] = 1; } else { rc->avg_length[i] = (double)rc->tot_scaled_length[i]; if (i == (XVID_TYPE_IVOP-1)) { /* The boost total has to be added to the I frames total */ rc->avg_length[i] += ivop_boost_total; } else { /* P/B frames have to be taxed */ rc->avg_length[i] *= rc->pb_iboost_tax_ratio; } /* Finally compute the average frame size */ rc->avg_length[i] /= (double)rc->count[i]; } } /* Assymetric curve compression */ if (rc->param.curve_compression_high || rc->param.curve_compression_low) { double symetric_total; double assymetric_delta_total; /* Like I frame boosting, assymetric curve compression modifies the total * amount of needed bits, we must compute the ratio so we can prescale lengths */ symetric_total = 0; assymetric_delta_total = 0; for (i=0; i<rc->num_frames; i++) { double assymetric_delta; double dbytes; twopass_stat_t * s = &rc->stats[i]; /* I Frames are not concerned by assymetric scaling */ if (s->type == XVID_TYPE_IVOP) continue; /* During the real run, we would have to apply the iboost tax */ dbytes = s->scaled_length * rc->pb_iboost_tax_ratio; /* Update the symmetric curve compression 
total */ symetric_total += dbytes; /* Apply assymetric curve compression */ if (dbytes > rc->avg_length[s->type-1]) assymetric_delta = (rc->avg_length[s->type-1] - dbytes) * (double)rc->param.curve_compression_high / 100.0f; else assymetric_delta = (rc->avg_length[s->type-1] - dbytes) * (double)rc->param.curve_compression_low / 100.0f; /* Cap to the minimum frame size if needed */ if (dbytes + assymetric_delta < rc->min_length[s->type-1]) assymetric_delta = rc->min_length[s->type-1] - dbytes; /* Accumulate after assymetric curve compression */ assymetric_delta_total += assymetric_delta; } /* Compute the tax that all p/b frames have to pay in order to respect the * bit distribution changes that the assymetric compression curve imposes * We want assymetric_total = sum(0, n-1, tax.scaled_length) * ie assymetric_total = ratio.sum(0, n-1, scaled_length) * ratio = assymetric_total / symmetric_total */ rc->assymetric_tax_ratio = ((double)symetric_total - (double)assymetric_delta_total) / (double)symetric_total; } else { rc->assymetric_tax_ratio = 1.0f; } DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- Assymetric tax ratio:%.2f\n", rc->assymetric_tax_ratio); /* Last bits that need to be reset */ rc->overflow = 0; rc->KFoverflow = 0; rc->KFoverflow_partial = 0; rc->KF_idx = 0; rc->desired_total = 0; rc->real_total = 0; /* Job done */ return; } /***************************************************************************** * VBV compliancy check and scale * MPEG-4 standard specifies certain restrictions for bitrate/framesize in VBR * to enable playback on devices with limited readspeed and memory (and which * aren't...) 
* * DivX profiles have 2 criteria: VBV as in MPEG standard * a limit on peak bitrate for any 1 second * * But if VBV is fulfilled, peakrate is automatically fulfilled in any profile * defined so far, so we check for it (for completeness) but correct only VBV * *****************************************************************************/ #define VBV_COMPLIANT 0 #define VBV_UNDERFLOW 1 /* video buffer runs empty */ #define VBV_OVERFLOW 2 /* doesn't exist for VBR encoding */ #define VBV_PEAKRATE 4 /* peak bitrate (within 1s) violated */ static int check_curve_for_vbv_compliancy(rc_2pass2_t * rc, const float fps) { /* We do all calculations in float, for higher accuracy, * and in bytes for convenience. * * typical values from e.g. Home profile: * vbv_size= 384*1024 (384kB) * vbv_initial= 288*1024 (75% fill) * maxrate= 4854000 (4.854MBps) * peakrate= 8000000 (8MBps) * * PAL: offset1s = 25 (1 second of 25fps) * NTSC: offset1s = 30 (1 second of 29.97fps) or 24 (1 second of 23.976fps) */ const float vbv_size = (float)rc->param.vbv_size/8.f; float vbvfill = (float)rc->param.vbv_initial/8.f; float vbvmin; const float maxrate = (float)rc->param.vbv_maxrate; const float peakrate = (float)rc->param.vbv_peakrate; const float r0 = (int)(maxrate/fps+0.5)/8.f; int bytes1s = 0; int offset1s = (int)(1.f*fps+0.5); int i; /* 1Gbit should be enough to initialize the vbvmin * an arbitrary high value */ vbvmin = 1000*1000*1000; for (i=0; i<rc->num_frames; i++) { /* DivX 1s peak bitrate check */ bytes1s += rc->stats[i].scaled_length; if (i>=offset1s) bytes1s -= rc->stats[i-offset1s].scaled_length; /* ignore peakrate constraint if peakrate is <= 0.f */ if (peakrate>0.f && 8.f*bytes1s > peakrate) return(VBV_PEAKRATE); /* update vbv fill level */ vbvfill += r0 - rc->stats[i].scaled_length; /* this check is _NOT_ an "overflow"! only reading from disk stops then */ if (vbvfill > vbv_size) vbvfill = vbv_size; /* but THIS would be an underflow. report it! 
*/ if (vbvfill < 0) return(VBV_UNDERFLOW); /* Store the minimum buffer filling */ if (vbvfill < vbvmin) vbvmin = vbvfill; } DPRINTF(XVID_DEBUG_RC, "[xvid rc] Minimum buffer fill: %f bytes\n", vbvmin); return(VBV_COMPLIANT); } static int scale_curve_for_vbv_compliancy(rc_2pass2_t * rc, const float fps) { /* correct any VBV violations. Peak bitrate violations disappear * by this automatically * * This implementation follows * * Westerink, Rajagopalan, Gonzales "Two-pass MPEG-2 variable-bitrate encoding" * IBM J. RES. DEVELOP. VOL 43, No. 4, July 1999, p.471--488 * * Thanks, guys! This paper rocks!!! */ /* For each scene of len N, we have to check up to N^2 possible buffer fills. * This works well with MPEG-2 where N==12 or so, but for MPEG-4 it's a * little slow... * * TODO: Better control on VBVfill between scenes */ const float vbv_size = (float)rc->param.vbv_size/8.f; const float vbv_initial = (float)rc->param.vbv_initial/8.f; const float maxrate = 0.9f*rc->param.vbv_maxrate; const float vbv_low = 0.10f*vbv_size; const float r0 = (int)(maxrate/fps+0.5)/8.f; int i,k,l,n,violation = 0; float *scenefactor; int *scenestart; int *scenelength; /* first step: determine how many "scenes" there are and store their * boundaries we could get all this from existing keyframe_positions, * somehow, but there we don't have a min_scenelength, and it's no big * deal to get it again. */ const int min_scenelength = (int)(fps+0.5); int num_scenes = 0; int last_scene = -999; for (i=0; i<rc->num_frames; i++) { if ((rc->stats[i].type == XVID_TYPE_IVOP) && (i-last_scene>min_scenelength)) { last_scene = i; num_scenes++; } } scenefactor = (float*)malloc(num_scenes*sizeof(float)); scenestart = (int*)malloc(num_scenes*sizeof(int)); scenelength = (int*)malloc(num_scenes*sizeof(int)); if ((!scenefactor) || (!scenestart) || (!scenelength) ) { free(scenefactor); free(scenestart); free(scenelength); /* remember: free(0) is valid and does exactly nothing. 
*/ return(-1); } /* count again and save the length/position */ num_scenes = 0; last_scene = -999; for (i=0; i<rc->num_frames; i++) { if ((rc->stats[i].type == XVID_TYPE_IVOP) && (i-last_scene>min_scenelength)) { if (num_scenes>0) { scenelength[num_scenes-1]=i-last_scene; } scenestart[num_scenes]=i; num_scenes++; last_scene = i; } } scenelength[num_scenes-1]=i-last_scene; /* second step: check for each scene, how much we can scale its frames up or * down such that the VBV restriction is just fulfilled */ #define R(k,n) (((n)+1-(k))*r0) /* how much enters the buffer between frame k and n */ for (l=0; l<num_scenes; l++) { const int start = scenestart[l]; const int length = scenelength[l]; twopass_stat_t * frames = &rc->stats[start]; float S0n,Skn; float f,minf = 99999.f; S0n=0.; for (n=0;n<=length-1;n++) { S0n += frames[n].scaled_length; k = 0; Skn = S0n; f = (R(k,n-1) + (vbv_initial - vbv_low)) / Skn; if (f < minf) minf = f; for (k=1;k<=n;k++) { Skn -= frames[k].scaled_length; f = (R(k,n-1) + (vbv_size - vbv_low)) / Skn; if (f < minf) minf = f; } } /* special case: at the end, fill buffer up to vbv_initial again * * TODO: Allow other values for buffer fill between scenes * e.g. if n=N is smallest f-value, then check for better value */ n=length; k=0; Skn = S0n; f = R(k,n-1)/Skn; if (f < minf) minf = f; for (k=1;k<=n-1;k++) { Skn -= frames[k].scaled_length; f = (R(k,n-1) + (vbv_initial - vbv_low)) / Skn; if (f < minf) minf = f; } DPRINTF(XVID_DEBUG_RC, "[xvid rc] Scene %d (Frames %d-%d): VBVfactor %f\n", l, start, start+length-1 , minf); scenefactor[l] = minf; } #undef R /* last step: now we know of any scene how much it can be scaled up or down * without violating VBV. Next, distribute bits from the evil scenes to the * good ones */ do { float S_red = 0.f; /* how much to redistribute */ float S_elig = 0.f; /* sum of bit for those scenes you can still swallow something*/ float f_red; int l; /* check how much is wrong */ for (l=0; l<num_scenes; l++) { const int start = scenestart[l]; const int length = scenelength[l]; twopass_stat_t * frames = &rc->stats[start]; /* exactly 1 means "don't touch this anymore!" */ if (scenefactor[l] == 1.) continue; /* within limits */ if (scenefactor[l] > 1.) 
{ for (n= 0; n < length; n++) S_elig += frames[n].scaled_length; } else { /* underflowing segment */ for (n= 0; n < length; n++) { float newbytes = (float)frames[n].scaled_length * scenefactor[l]; S_red += (float)frames[n].scaled_length - (float)newbytes; frames[n].scaled_length =(int)newbytes; } scenefactor[l] = 1.f; } } /* no more underflows */ if (S_red < 1.f) break; if (S_elig < 1.f) { DPRINTF(XVID_DEBUG_RC, "[xvid rc] Everything underflowing.\n"); free(scenefactor); free(scenestart); free(scenelength); return(-2); } f_red = (1.f + S_red/S_elig); DPRINTF(XVID_DEBUG_RC, "[xvid rc] Moving %.0f kB to avoid buffer underflow, correction factor: %.5f\n", S_red/1024.f, f_red); violation=0; /* scale remaining scenes up to meet total size */ for (l=0; l<num_scenes; l++) { const int start = scenestart[l]; const int length = scenelength[l]; twopass_stat_t * frames = &rc->stats[start]; if (scenefactor[l] == 1.) continue; /* there shouldn't be any segments with factor<1 left, so all the rest is >1 */ for (n= 0; n < length; n++) { frames[n].scaled_length = (int)(frames[n].scaled_length * f_red + 0.5); } scenefactor[l] /= f_red; if (scenefactor[l] < 1.f) violation=1; } } while (violation); free(scenefactor); free(scenestart); free(scenelength); return(0); } /***************************************************************************** * Still more low level stuff (nothing to do with stats treatment) ****************************************************************************/ /* This function returns an allocated string containing a complete line read * from the file starting at the current position */ static char * readline(FILE *f) { char *buffer = NULL; int buffer_size = 0; int pos = 0; do { int c; /* Read a character from the stream */ c = fgetc(f); /* Is that EOF or new line ? */ if(c == EOF || c == '\n') break; /* Do we have to update buffer ? 
*/ if(pos >= buffer_size - 1) { buffer_size += BUF_SZ; buffer = (char*)realloc(buffer, buffer_size); if (buffer == NULL) return(NULL); } buffer[pos] = c; pos++; } while(1); /* Read \n or EOF */ if (buffer == NULL) { /* EOF, so we reached the end of the file, return NULL */ if(feof(f)) return(NULL); /* Just an empty line with just a newline, allocate a 1 byte buffer to * store a zero length string */ buffer = (char*)malloc(1); if(buffer == NULL) return(NULL); } /* Zero terminated string */ buffer[pos] = '\0'; return(buffer); } /* This function returns a pointer to the first non space char in the given * string */ static char * skipspaces(char *string) { const char spaces[] = { ' ','\t','\0' }; const char *spacechar = spaces; if (string == NULL) return(NULL); while (*string != '\0') { /* Test against space chars */ while (*spacechar != '\0') { if (*string == *spacechar) { string++; spacechar = spaces; break; } spacechar++; } /* No space char */ if (*spacechar == '\0') return(string); } return(string); } /* This function returns a boolean that tells if the string is only a * comment */ static int iscomment(char *string) { const char comments[] = { '#',';', '%', '\0' }; const char *cmtchar = comments; int iscomment = 0; if (string == NULL) return(1); string = skipspaces(string); while(*cmtchar != '\0') { if(*string == *cmtchar) { iscomment = 1; break; } cmtchar++; } return(iscomment); } #if 0 static void stats_print(rc_2pass2_t * rc) { int i; const char frame_type[4] = { 'i', 'p', 'b', 's'}; for (i=0; i<rc->num_frames; i++) { twopass_stat_t *s = &rc->stats[i]; DPRINTF(XVID_DEBUG_RC, "[xvid rc] -- frame:%d type:%c quant:%d stats:%d scaled:%d desired:%d actual:%d overflow(%c):%.2f\n", i, frame_type[s->type-1], -1, s->length, s->scaled_length, s->desired_length, -1, frame_type[s->type-1], -1.0f); } } #endif xvidcore/src/plugins/plugin_2pass1.c0000664000076500007650000001641511564705453020607 0ustar 
xvidbuildxvidbuild/****************************************************************************** * * Xvid Bit Rate Controller Library * - VBR 2 pass bitrate controller implementation - * * Copyright (C) 2002-2003 Edouard Gomez * * The curve treatment algorithm is the one implemented by Foxer and * Dirk Knop for the Xvid vfw dynamic library. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: plugin_2pass1.c 1985 2011-05-18 09:02:35Z Isibaar $ * *****************************************************************************/ #include <stdio.h> #include <errno.h> /* errno var (or function with recent libc) */ #include <string.h> /* strerror() */ #include "../xvid.h" #include "../image/image.h" /* This preprocessor constant controls whether or not first pass is done * using fast ME routines to speed up the 2pass process at the expense of * less precise first pass stats */ #define FAST1PASS #define FAST1PASS_QPEL_TOO /* context struct */ typedef struct { FILE * stat_file; double fq_error; } rc_2pass1_t; static int rc_2pass1_create(xvid_plg_create_t * create, rc_2pass1_t ** handle) { xvid_plugin_2pass1_t * param = (xvid_plugin_2pass1_t *)create->param; rc_2pass1_t * rc; /* check filename */ if ((param->filename == NULL) || (param->filename != NULL && param->filename[0] == '\0')) return XVID_ERR_FAIL; /* allocate context struct */ if((rc = 
malloc(sizeof(rc_2pass1_t))) == NULL) return(XVID_ERR_MEMORY); /* Initialize safe defaults for 2pass 1 */ rc->stat_file = NULL; /* Open the 1st pass file */ if((rc->stat_file = fopen(param->filename, "w+b")) == NULL) { free(rc); /* do not leak the context when the stats file cannot be opened */ return(XVID_ERR_FAIL); } /* I swear xvidcore isn't buggy, but when using mencoder+xvid4 I observe * this weird bug. * * Symptoms: The stats file grows until it's fclosed, but at this moment * a large part of the file is filled by 0x00 bytes w/o any * reasonable cause. The stats file is then completely unusable * * So far, I think I found "the why": * - take a MPEG stream containing 2 sequences (concatenate 2 MPEG files * together) * - Encode this MPEG file * * It should trigger the bug * * I think this is caused by some kind of race condition on mencoder module * start/stop. * - mencoder encodes the first sequence * + xvid4 module opens xvid-twopass.stats and writes stats in it. * - mencoder detects the second sequence, initializes a second * module and stops the old encoder * + new xvid4 module opens a new xvid-twopass.stats, old xvid4 * module closes it * * This is IT, got a race condition. * Unbuffered IO, may help ... 
*/ setbuf(rc->stat_file, NULL); /* * The File Header */ fprintf(rc->stat_file, "# XviD 2pass stat file (core version %d.%d.%d)\n", XVID_VERSION_MAJOR(XVID_VERSION), XVID_VERSION_MINOR(XVID_VERSION), XVID_VERSION_PATCH(XVID_VERSION)); fprintf(rc->stat_file, "# Please do not modify this file\n\n"); rc->fq_error = 0; *handle = rc; return(0); } static int rc_2pass1_destroy(rc_2pass1_t * rc, xvid_plg_destroy_t * destroy) { if (rc->stat_file) { if (fclose(rc->stat_file) == EOF) { DPRINTF(XVID_DEBUG_RC, "Error closing stats file (%s)", strerror(errno)); } } rc->stat_file = NULL; /* Just a paranoid reset */ free(rc); /* as the container structure is freed anyway */ return(0); } static int rc_2pass1_before(rc_2pass1_t * rc, xvid_plg_data_t * data) { if (data->quant <= 0) { if (data->zone && data->zone->mode == XVID_ZONE_QUANT) { /* We disable no options in quant zones, as their implementation is * based on the fact we do first pass exactly the same way as the * second one to have exact zone size */ rc->fq_error += (double)data->zone->increment / (double)data->zone->base; data->quant = (int)rc->fq_error; rc->fq_error -= data->quant; } else { data->quant = 2; #ifdef FAST1PASS /* Given the fact our 2pass algorithm is based on very simple * rules, we can disable some options that are too CPU intensive * and do not provide the 2nd pass any benefit */ /* First disable some motion flags */ data->motion_flags &= ~XVID_ME_CHROMA_PVOP; data->motion_flags &= ~XVID_ME_CHROMA_BVOP; data->motion_flags &= ~XVID_ME_USESQUARES16; data->motion_flags &= ~XVID_ME_ADVANCEDDIAMOND16; data->motion_flags &= ~XVID_ME_EXTSEARCH16; /* And enable fast replacements */ data->motion_flags |= XVID_ME_FAST_MODEINTERPOLATE; data->motion_flags |= XVID_ME_SKIP_DELTASEARCH; data->motion_flags |= XVID_ME_FASTREFINE16; data->motion_flags |= XVID_ME_BFRAME_EARLYSTOP; /* Now VOP flags (no fast replacements) */ data->vop_flags &= ~XVID_VOP_MODEDECISION_RD; data->vop_flags &= ~XVID_VOP_RD_BVOP; data->vop_flags &= 
~XVID_VOP_FAST_MODEDECISION_RD; data->vop_flags &= ~XVID_VOP_TRELLISQUANT; data->vop_flags &= ~XVID_VOP_INTER4V; data->vop_flags &= ~XVID_VOP_HQACPRED; /* Finally VOL flags * * NB: Qpel cannot be disabled because this option really changes * too much the texture data compressibility, and thus the * second pass gets confused by too much unpredictability * of frame sizes, and actually hurts quality */ #ifdef FAST1PASS_QPEL_TOO /* or maybe we can disable it after all? */ data->vol_flags &= ~XVID_VOL_QUARTERPEL; #endif data->vol_flags &= ~XVID_VOL_GMC; #endif } } return(0); } static int rc_2pass1_after(rc_2pass1_t * rc, xvid_plg_data_t * data) { char type; xvid_enc_stats_t *stats = &data->stats; /* Frame type in ascii I/P/B */ switch(stats->type) { case XVID_TYPE_IVOP: type = 'i'; break; case XVID_TYPE_PVOP: type = 'p'; break; case XVID_TYPE_BVOP: type = 'b'; break; case XVID_TYPE_SVOP: type = 's'; break; default: /* Should not go here */ return(XVID_ERR_FAIL); } /* write the resulting statistics */ fprintf(rc->stat_file, "%c %d %d %d %d %d %d\n", type, stats->quant, stats->kblks, stats->mblks, stats->ublks, stats->length, stats->hlength); return(0); } int xvid_plugin_2pass1(void * handle, int opt, void * param1, void * param2) { switch(opt) { case XVID_PLG_INFO : case XVID_PLG_FRAME : return 0; case XVID_PLG_CREATE : return rc_2pass1_create((xvid_plg_create_t*)param1, param2); case XVID_PLG_DESTROY : return rc_2pass1_destroy((rc_2pass1_t*)handle, (xvid_plg_destroy_t*)param1); case XVID_PLG_BEFORE : return rc_2pass1_before((rc_2pass1_t*)handle, (xvid_plg_data_t*)param1); case XVID_PLG_AFTER : return rc_2pass1_after((rc_2pass1_t*)handle, (xvid_plg_data_t*)param1); } return XVID_ERR_FAIL; } xvidcore/src/plugins/plugin_dump.c0000664000076500007650000000432011564705453020433 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Xvid plugin: dump pgm files of original and encoded frames - * * 
Copyright(C) 2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: plugin_dump.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdio.h> #include "../xvid.h" #include "../image/image.h" int xvid_plugin_dump(void * handle, int opt, void * param1, void * param2) { switch(opt) { case XVID_PLG_INFO : { xvid_plg_info_t * info = (xvid_plg_info_t*)param1; info->flags = XVID_REQORIGINAL; return(0); } case XVID_PLG_CREATE : *((void**)param2) = NULL; /* We don't have any private data to bind here */ case XVID_PLG_DESTROY : case XVID_PLG_BEFORE : case XVID_PLG_FRAME : return(0); case XVID_PLG_AFTER : { xvid_plg_data_t * data = (xvid_plg_data_t*)param1; IMAGE img; char tmp[100]; img.y = data->original.plane[0]; img.u = data->original.plane[1]; img.v = data->original.plane[2]; sprintf(tmp, "ori-%03i.pgm", data->frame_num); image_dump_yuvpgm(&img, data->original.stride[0], data->width, data->height, tmp); img.y = data->current.plane[0]; img.u = data->current.plane[1]; img.v = data->current.plane[2]; sprintf(tmp, "enc-%03i.pgm", data->frame_num); image_dump_yuvpgm(&img, data->current.stride[0], data->width, data->height, tmp); } return(0); } return XVID_ERR_FAIL; } xvidcore/src/plugins/x86_asm/0000775000076500007650000000000011566427762017240 5ustar 
xvidbuildxvidbuildxvidcore/src/plugins/x86_asm/plugin_ssim-a.asm0000664000076500007650000001103511254216113022465 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - optimized SSIM routines - ; * ; * Copyright(C) 2006 Johannes Reinhardt ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * ; ***************************************************************************/ BITS 32 %include "nasm.inc" %macro ACC_ROW 2 movq %1,[ TMP0] movq %2,[TMP0+TMP1] psadbw %1,mm0 psadbw %2,mm0 lea TMP0, [TMP0+2*TMP1] paddw %1, %2 %endmacro %macro CONSIM_1x8_SSE2 0 movdqu xmm0,[TMP0] movdqu xmm1,[TMP1] ;unpack to words punpcklbw xmm0,xmm2 punpcklbw xmm1,xmm2 movaps xmm3,xmm0 movaps xmm4,xmm1 pmaddwd xmm0,xmm0;orig pmaddwd xmm1,xmm1;comp pmaddwd xmm3,xmm4;corr paddd xmm5,xmm0 paddd xmm6,xmm1 paddd xmm7,xmm3 %endmacro %macro CONSIM_1x8_MMX 0 movq mm0,[TMP0];orig movq mm1,[TMP1];comp ;unpack low half of qw to words punpcklbw mm0,mm2 punpcklbw mm1,mm2 movq mm3,mm0 pmaddwd mm3,mm0 paddd mm5,mm3; movq mm4,mm1 pmaddwd mm4,mm1 paddd mm6,mm4; pmaddwd mm1,mm0 paddd mm7,mm1 movq mm0,[TMP0];orig movq mm1,[TMP1];comp ;unpack high half of qw to words punpckhbw mm0,mm2 punpckhbw mm1,mm2 movq mm3,mm0 pmaddwd mm3,mm0 paddd mm5,mm3; movq mm4,mm1 
pmaddwd mm4,mm1 paddd mm6,mm4; pmaddwd mm1,mm0 paddd mm7,mm1 %endmacro %macro CONSIM_WRITEOUT 3 mov eax,prm4d;lumo mul eax; lumo^2 add eax, 32 shr eax, 6; 64*lum0^2 movd TMP0d,%1 sub TMP0d, eax mov TMP1,prm6; pdevo mov dword [TMP1],TMP0d mov eax,prm5d ;lumc mul eax; lumc^2 add eax, 32 shr eax, 6; 64*lumc^2 movd TMP0d,%2 sub TMP0d, eax mov TMP1,prm7; pdevc mov dword [TMP1],TMP0d mov eax,prm4d;lumo mul prm5d; lumo*lumc, should fit in _EAX add eax, 32 shr eax, 6; 64*lumo*lumc movd TMP0d,%3 sub TMP0d, eax mov TMP1,prm8; pcorr mov dword [TMP1],TMP0d %endmacro TEXT cglobal lum_8x8_mmx cglobal consim_sse2 cglobal consim_mmx ;int lum_8x8_c(uint8_t* ptr, uint32_t stride) ALIGN SECTION_ALIGN lum_8x8_mmx: mov TMP0, prm1 ;ptr mov TMP1, prm2 ;stride pxor mm0,mm0 ACC_ROW mm1, mm2 ACC_ROW mm3, mm4 ACC_ROW mm5, mm6 ACC_ROW mm7, mm4 paddw mm1, mm3 paddw mm5, mm7 paddw mm1, mm5 movd eax,mm1 ret ENDFUNC ALIGN SECTION_ALIGN consim_sse2: PUSH_XMM6_XMM7 mov TMP0,prm1 ;ptro mov TMP1,prm2 ;ptrc mov _EAX, prm3 ;stride pxor xmm2,xmm2;null vektor pxor xmm5,xmm5;devo pxor xmm6,xmm6;devc pxor xmm7,xmm7;corr CONSIM_1x8_SSE2 add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_SSE2 add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_SSE2 add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_SSE2 add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_SSE2 add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_SSE2 add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_SSE2 add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_SSE2 ;accumulate xmm5-7 pshufd xmm0, xmm5, 0x0E paddd xmm5, xmm0 pshufd xmm0, xmm5, 0x01 paddd xmm5, xmm0 pshufd xmm1, xmm6, 0x0E paddd xmm6, xmm1 pshufd xmm1, xmm6, 0x01 paddd xmm6, xmm1 pshufd xmm2, xmm7, 0x0E paddd xmm7, xmm2 pshufd xmm2, xmm7, 0x01 paddd xmm7, xmm2 CONSIM_WRITEOUT xmm5,xmm6,xmm7 POP_XMM6_XMM7 ret ENDFUNC ALIGN SECTION_ALIGN consim_mmx: mov TMP0,prm1 ;ptro mov TMP1,prm2 ;ptrc mov _EAX,prm3;stride pxor mm2,mm2;null pxor mm5,mm5;devo pxor mm6,mm6;devc pxor mm7,mm7;corr CONSIM_1x8_MMX add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_MMX add TMP0,_EAX add TMP1,_EAX 
CONSIM_1x8_MMX add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_MMX add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_MMX add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_MMX add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_MMX add TMP0,_EAX add TMP1,_EAX CONSIM_1x8_MMX movq mm0,mm5 psrlq mm0,32 paddd mm5,mm0 movq mm1,mm6 psrlq mm1,32 paddd mm6,mm1 movq mm2,mm7 psrlq mm2,32 paddd mm7,mm2 CONSIM_WRITEOUT mm5,mm6,mm7 ret ENDFUNC NON_EXEC_STACK xvidcore/src/plugins/plugin_single.c0000664000076500007650000002075311564705453020757 0ustar xvidbuildxvidbuild/***************************************************************************** * * Xvid Standard Plugins * - single-pass bitrate controller implementation - * * Copyright(C) 2002-2004 Benjamin Lambert * 2002-2003 Edouard Gomez * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: plugin_single.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <limits.h> #include "../xvid.h" #include "../image/image.h" #define DEFAULT_INITIAL_QUANTIZER 8 #define DEFAULT_BITRATE 900000 /* 900kbps */ #define DEFAULT_DELAY_FACTOR 16 #define DEFAULT_AVERAGING_PERIOD 100 #define DEFAULT_BUFFER 100 typedef struct { int reaction_delay_factor; int averaging_period; int buffer; int bytes_per_sec; double target_framesize; double time; int64_t total_size; int rtn_quant; double sequence_quality; double avg_framesize; double quant_error[31]; double fq_error; } rc_single_t; static int get_initial_quant(unsigned int bitrate) { #if defined(DEFAULT_INITIAL_QUANTIZER) return (DEFAULT_INITIAL_QUANTIZER); #else int i; const unsigned int bitrate_quant[31] = { UINT_MAX }; for (i = 30; i >= 0; i--) { if (bitrate > bitrate_quant[i]) continue; } return (i + 1); #endif } static int rc_single_create(xvid_plg_create_t * create, rc_single_t ** handle) { xvid_plugin_single_t *param = (xvid_plugin_single_t *) create->param; rc_single_t *rc; int i; /* * single needs to calculate the average frame size. In order to do that, * we need a valid frame rate (a non-zero fincr) */ if (create->fincr == 0) { return XVID_ERR_FAIL; } /* Allocate context struct */ if ((rc = malloc(sizeof(rc_single_t))) == NULL) return (XVID_ERR_MEMORY); /* Constants */ rc->bytes_per_sec = (param->bitrate > 0) ? param->bitrate / 8 : DEFAULT_BITRATE / 8; rc->target_framesize =(double) rc->bytes_per_sec / ((double) create->fbase / create->fincr); rc->reaction_delay_factor = (param->reaction_delay_factor > 0) ? param->reaction_delay_factor : DEFAULT_DELAY_FACTOR; rc->averaging_period = (param->averaging_period > 0) ? 
param->averaging_period : DEFAULT_AVERAGING_PERIOD; rc->buffer = (param->buffer > 0) ? param->buffer : DEFAULT_BUFFER; rc->time = 0; rc->total_size = 0; rc->rtn_quant = get_initial_quant(param->bitrate); /* Reset quant error accumulators */ for (i = 0; i < 31; i++) rc->quant_error[i] = 0.0; /* Last bunch of variables */ rc->sequence_quality = 2.0 / (double) rc->rtn_quant; rc->avg_framesize = rc->target_framesize; rc->fq_error = 0; /* Bind the RC */ *handle = rc; /* A bit of debug info */ DPRINTF(XVID_DEBUG_RC, "bytes_per_sec: %i\n", rc->bytes_per_sec); DPRINTF(XVID_DEBUG_RC, "frame rate : %f\n", (double) create->fbase / create->fincr); DPRINTF(XVID_DEBUG_RC, "target_framesize: %f\n", rc->target_framesize); return (0); } static int rc_single_destroy(rc_single_t * rc, xvid_plg_destroy_t * destroy) { free(rc); return (0); } static int rc_single_before(rc_single_t * rc, xvid_plg_data_t * data) { if (data->quant <= 0) { if (data->zone && data->zone->mode == XVID_ZONE_QUANT) { rc->fq_error += (double)data->zone->increment / (double)data->zone->base; data->quant = (int)rc->fq_error; rc->fq_error -= data->quant; } else { int q = rc->rtn_quant; /* limit to min/max range we don't know frame type of the next frame, so we just use P-VOP's range... 
*/ if (q > data->max_quant[XVID_TYPE_PVOP-1]) q = data->max_quant[XVID_TYPE_PVOP-1]; else if (q < data->min_quant[XVID_TYPE_PVOP-1]) q = data->min_quant[XVID_TYPE_PVOP-1]; data->quant = q; } } return 0; } static int rc_single_after(rc_single_t * rc, xvid_plg_data_t * data) { int64_t deviation; int rtn_quant; double overflow; double averaging_period; double reaction_delay_factor; double quality_scale; double base_quality; double target_quality; /* Update internal values */ rc->time += (double) data->fincr / data->fbase; rc->total_size += data->length; /* Compute the deviation from expected total size */ deviation = (int64_t) (rc->total_size - rc->bytes_per_sec * rc->time); averaging_period = (double) rc->averaging_period; /* calculate the sequence quality */ rc->sequence_quality -= rc->sequence_quality / averaging_period; rc->sequence_quality += 2.0 / (double) data->quant / averaging_period; /* clamp the sequence quality to 10% to 100% * to try to avoid using the highest * and lowest quantizers 'too' much */ if (rc->sequence_quality < 0.1) rc->sequence_quality = 0.1; else if (rc->sequence_quality > 1.0) rc->sequence_quality = 1.0; /* factor this frame's size into the average framesize * but skip using ivops as they are usually very large * and as such, usually disrupt quantizer distribution */ if (data->type != XVID_TYPE_IVOP) { reaction_delay_factor = (double) rc->reaction_delay_factor; rc->avg_framesize -= rc->avg_framesize / reaction_delay_factor; rc->avg_framesize += data->length / reaction_delay_factor; } /* don't change the quantizer between pvops */ if (data->type == XVID_TYPE_BVOP) return (0); /* calculate the quality_scale which will be used * to drag the target quality up or down, depending * on if avg_framesize is >= target_framesize */ quality_scale = rc->target_framesize / rc->avg_framesize * rc->target_framesize / rc->avg_framesize; /* use the current sequence_quality as the * base_quality which will be dragged around * * 0.06452 = 6.452% quality 
(quant:31) */ base_quality = rc->sequence_quality; if (quality_scale >= 1.0) { base_quality = 1.0 - (1.0 - base_quality) / quality_scale; } else { base_quality = 0.06452 + (base_quality - 0.06452) * quality_scale; } overflow = -((double) deviation / (double) rc->buffer); /* clamp overflow to 1 buffer unit to avoid very * large bursts of bitrate following still scenes */ if (overflow > rc->target_framesize) overflow = rc->target_framesize; else if (overflow < -rc->target_framesize) overflow = -rc->target_framesize; /* apply overflow / buffer to get the target_quality */ target_quality = base_quality + (base_quality - 0.06452) * overflow / rc->target_framesize; /* clamp the target_quality to quant 1-31 * 2.0 = 200% quality (quant:1) */ if (target_quality > 2.0) target_quality = 2.0; else if (target_quality < 0.06452) target_quality = 0.06452; rtn_quant = (int) (2.0 / target_quality); /* accumulate quant <-> quality error and apply if >= 1.0 */ if (rtn_quant > 0 && rtn_quant < 31) { rc->quant_error[rtn_quant - 1] += 2.0 / target_quality - rtn_quant; if (rc->quant_error[rtn_quant - 1] >= 1.0) { rc->quant_error[rtn_quant - 1] -= 1.0; rtn_quant++; rc->rtn_quant++; } } /* prevent rapid quantization change */ if (rtn_quant > rc->rtn_quant + 1) { if (rtn_quant > rc->rtn_quant + 3) if (rtn_quant > rc->rtn_quant + 5) rtn_quant = rc->rtn_quant + 3; else rtn_quant = rc->rtn_quant + 2; else rtn_quant = rc->rtn_quant + 1; } else if (rtn_quant < rc->rtn_quant - 1) { if (rtn_quant < rc->rtn_quant - 3) if (rtn_quant < rc->rtn_quant - 5) rtn_quant = rc->rtn_quant - 3; else rtn_quant = rc->rtn_quant - 2; else rtn_quant = rc->rtn_quant - 1; } rc->rtn_quant = rtn_quant; return (0); } int xvid_plugin_single(void *handle, int opt, void *param1, void *param2) { switch (opt) { case XVID_PLG_INFO: case XVID_PLG_FRAME : return 0; case XVID_PLG_CREATE: return rc_single_create((xvid_plg_create_t *) param1, param2); case XVID_PLG_DESTROY: return rc_single_destroy((rc_single_t *) 
handle,(xvid_plg_destroy_t *) param1); case XVID_PLG_BEFORE: return rc_single_before((rc_single_t *) handle, (xvid_plg_data_t *) param1); case XVID_PLG_AFTER: return rc_single_after((rc_single_t *) handle, (xvid_plg_data_t *) param1); } return XVID_ERR_FAIL; } xvidcore/src/plugins/plugin_lumimasking.c0000664000076500007650000002353211564705453022014 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Xvid plugin: performs a lumimasking algorithm on encoded frame - * * Copyright(C) 2002-2003 Peter Ross * 2002 Christoph Lampert * 2008 Jason Garrett-Glaser * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: plugin_lumimasking.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include #include #include #include "../xvid.h" #include "../global.h" #include "../portab.h" #include "../utils/emms.h" /***************************************************************************** * Private data type ****************************************************************************/ typedef struct { float *quant; float *val; int method; } lumi_data_t; /***************************************************************************** * Sub plugin functions ****************************************************************************/ static int lumi_plg_info(xvid_plg_info_t *info); static int lumi_plg_create(xvid_plg_create_t *create, lumi_data_t **handle); static int lumi_plg_destroy(lumi_data_t *handle, xvid_plg_destroy_t * destroy); static int lumi_plg_frame(lumi_data_t *handle, xvid_plg_data_t *data); static int lumi_plg_after(lumi_data_t *handle, xvid_plg_data_t *data); /***************************************************************************** * The plugin entry function ****************************************************************************/ int xvid_plugin_lumimasking(void * handle, int opt, void * param1, void * param2) { switch(opt) { case XVID_PLG_INFO: return(lumi_plg_info((xvid_plg_info_t*)param1)); case XVID_PLG_CREATE: return(lumi_plg_create((xvid_plg_create_t *)param1, (lumi_data_t **)param2)); case XVID_PLG_DESTROY: return(lumi_plg_destroy((lumi_data_t *)handle, (xvid_plg_destroy_t*)param1)); case XVID_PLG_BEFORE : return 0; case XVID_PLG_FRAME : return(lumi_plg_frame((lumi_data_t *)handle, (xvid_plg_data_t *)param1)); case XVID_PLG_AFTER : return(lumi_plg_after((lumi_data_t *)handle, 
(xvid_plg_data_t *)param1)); } return(XVID_ERR_FAIL); } /*---------------------------------------------------------------------------- * Info plugin function *--------------------------------------------------------------------------*/ static int lumi_plg_info(xvid_plg_info_t *info) { /* We just require a diff quant array access */ info->flags = XVID_REQDQUANTS; return(0); } /*---------------------------------------------------------------------------- * Create plugin function * * Allocates the private plugin data arrays *--------------------------------------------------------------------------*/ static int lumi_plg_create(xvid_plg_create_t *create, lumi_data_t **handle) { lumi_data_t *lumi; xvid_plugin_lumimasking_t *param = (xvid_plugin_lumimasking_t *) create->param; if ((lumi = (lumi_data_t*)malloc(sizeof(lumi_data_t))) == NULL) return(XVID_ERR_MEMORY); lumi->method = 0; lumi->quant = (float*)malloc(create->mb_width*create->mb_height*sizeof(float)); if (lumi->quant == NULL) { free(lumi); return(XVID_ERR_MEMORY); } lumi->val = (float*)malloc(create->mb_width*create->mb_height*sizeof(float)); if (lumi->val == NULL) { free(lumi->quant); free(lumi); return(XVID_ERR_MEMORY); } if (param != NULL) lumi->method = param->method; /* Bind the data structure to the handle */ *handle = lumi; return(0); } /*---------------------------------------------------------------------------- * Destroy plugin function * * Free the private plugin data arrays *--------------------------------------------------------------------------*/ static int lumi_plg_destroy(lumi_data_t *handle, xvid_plg_destroy_t *destroy) { if (handle) { if (handle->quant) { free(handle->quant); handle->quant = NULL; } if (handle->val) { free(handle->val); handle->val = NULL; } free(handle); } return(0); } /*---------------------------------------------------------------------------- * Before plugin function * * Here is all the magic about lumimasking. 
*--------------------------------------------------------------------------*/ /* Helper function defined later */ static int normalize_quantizer_field(float *in, int *out, int num, int min_quant, int max_quant); static int lumi_plg_frame(lumi_data_t *handle, xvid_plg_data_t *data) { int i, j; float global = 0.0f; const float DarkAmpl = 14 / 4; const float BrightAmpl = 10 / 3; float DarkThres = 90; float BrightThres = 200; const float GlobalDarkThres = 60; const float GlobalBrightThres = 170; /* Arbitrary centerpoint for variance-based AQ. Roughly the same as used in x264. */ float center = 14000.f; /* Arbitrary strength for variance-based AQ. */ float strength = 0.2f; if (data->type == XVID_TYPE_BVOP) return 0; /* Do this for all macroblocks individually */ for (j = 0; j < data->mb_height; j++) { for (i = 0; i < data->mb_width; i++) { int k, l, sum = 0, sum_of_squares = 0; unsigned char *ptr; /* Initialize the current quant value to the frame quant */ handle->quant[j*data->mb_width + i] = (float)data->quant; /* Next steps compute the luminance-masking */ /* Get the MB address */ ptr = data->current.plane[0]; ptr += 16*j*data->current.stride[0] + 16*i; if (handle->method) { /* Variance masking mode */ int variance = 0; /* Accumulate sum and sum of squares over the MB */ for (k = 0; k < 16; k++) { for (l = 0; l < 16; l++) { int val = ptr[k*data->current.stride[0] + l]; sum += val; sum_of_squares += val * val; } } /* Variance = SSD - SAD^2 / (numpixels) */ variance = sum_of_squares - sum * sum / 256; handle->val[j*data->mb_width + i] = (float)variance; } else { /* Luminance masking mode */ /* Accumulate luminance */ for (k = 0; k < 16; k++) for (l = 0; l < 16; l++) sum += ptr[k*data->current.stride[0] + l]; handle->val[j*data->mb_width + i] = (float)sum/256.0f; /* Accumulate the global frame luminance */ global += (float)sum/256.0f; } } } if (handle->method) { /* Variance masking */ /* Apply the variance masking formula to all MBs */ for (i = 0; i < data->mb_height; 
i++) { for (j = 0; j < data->mb_width; j++) { float value = handle->val[i*data->mb_width + j]; float qscale_diff = strength * logf(value / center); handle->quant[i*data->mb_width + j] *= (1.0f + qscale_diff); } } } else { /* Luminance masking */ /* Normalize the global luminance accumulator */ global /= data->mb_width*data->mb_height; DarkThres = DarkThres*global/127.0f; BrightThres = BrightThres*global/127.0f; /* Apply luminance masking only to frames where the global luminance is * higher than DarkThreshold and lower than Bright Threshold */ if ((global < GlobalBrightThres) && (global > GlobalDarkThres)) { /* Apply the luminance masking formulas to all MBs */ for (i = 0; i < data->mb_height; i++) { for (j = 0; j < data->mb_width; j++) { if (handle->val[i*data->mb_width + j] < DarkThres) handle->quant[i*data->mb_width + j] *= 1 + DarkAmpl * (DarkThres - handle->val[i*data->mb_width + j]) / DarkThres; else if (handle->val[i*data->mb_width + j] > BrightThres) handle->quant[i*data->mb_width + j] *= 1 + BrightAmpl * (handle->val[i*data->mb_width + j] - BrightThres) / (255 - BrightThres); } } } } /* Normalize the quantizer field */ data->quant = normalize_quantizer_field(handle->quant, data->dquant, data->mb_width*data->mb_height, data->quant, MAX(2,data->quant + data->quant/2)); /* Plugin job finished */ return(0); } /*---------------------------------------------------------------------------- * After plugin function (dummy function) *--------------------------------------------------------------------------*/ static int lumi_plg_after(lumi_data_t *handle, xvid_plg_data_t *data) { return(0); } /***************************************************************************** * Helper functions ****************************************************************************/ #define RDIFF(a, b) ((int)(a+0.5)-(int)(b+0.5)) static int normalize_quantizer_field(float *in, int *out, int num, int min_quant, int max_quant) { int i; int finished; do { finished = 1; for (i = 1; i < 
num; i++) { if (RDIFF(in[i], in[i - 1]) > 2) { in[i] -= (float) 0.5; finished = 0; } else if (RDIFF(in[i], in[i - 1]) < -2) { in[i - 1] -= (float) 0.5; finished = 0; } if (in[i] > max_quant) { in[i] = (float) max_quant; finished = 0; } if (in[i] < min_quant) { in[i] = (float) min_quant; finished = 0; } if (in[i - 1] > max_quant) { in[i - 1] = (float) max_quant; finished = 0; } if (in[i - 1] < min_quant) { in[i - 1] = (float) min_quant; finished = 0; } } } while (!finished); out[0] = 0; for (i = 1; i < num; i++) out[i] = RDIFF(in[i], in[i - 1]); return (int) (in[0] + 0.5); } xvidcore/src/plugins/plugin_psnrhvsm.c0000664000076500007650000002646211564705453021361 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - PSNR-HVS-M plugin: computes the PSNR-HVS-M metric - * * Copyright(C) 2010 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: plugin_psnrhvsm.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ /***************************************************************************** * * The PSNR-HVS-M metric is described in the following paper: * * "On between-coefficient contrast masking of DCT basis functions", by * N. Ponomarenko, F. Silvestri, K. 
Egiazarian, M. Carli, J. Astola, V. Lukin, * in Proceedings of the Third International Workshop on Video Processing and * Quality Metrics for Consumer Electronics VPQM-07, January, 2007, 4 p. * * http://www.ponomarenko.info/psnrhvsm.htm * ****************************************************************************/ #include #include #include #include "../portab.h" #include "../xvid.h" #include "../dct/fdct.h" #include "../image/image.h" #include "../motion/sad.h" #include "../utils/mem_transfer.h" #include "../utils/emms.h" typedef struct { uint64_t mse_sum_y; /* for avrg psnr-hvs-m */ uint64_t mse_sum_u; uint64_t mse_sum_v; long frame_cnt; } psnrhvsm_data_t; /* internal plugin data */ #if 0 /* Floating-point implementation: Slow but accurate */ static const float CSF_Coeff[64] = { 1.608443f, 2.339554f, 2.573509f, 1.608443f, 1.072295f, 0.643377f, 0.504610f, 0.421887f, 2.144591f, 2.144591f, 1.838221f, 1.354478f, 0.989811f, 0.443708f, 0.428918f, 0.467911f, 1.838221f, 1.979622f, 1.608443f, 1.072295f, 0.643377f, 0.451493f, 0.372972f, 0.459555f, 1.838221f, 1.513829f, 1.169777f, 0.887417f, 0.504610f, 0.295806f, 0.321689f, 0.415082f, 1.429727f, 1.169777f, 0.695543f, 0.459555f, 0.378457f, 0.236102f, 0.249855f, 0.334222f, 1.072295f, 0.735288f, 0.467911f, 0.402111f, 0.317717f, 0.247453f, 0.227744f, 0.279729f, 0.525206f, 0.402111f, 0.329937f, 0.295806f, 0.249855f, 0.212687f, 0.214459f, 0.254803f, 0.357432f, 0.279729f, 0.270896f, 0.262603f, 0.229778f, 0.257351f, 0.249855f, 0.259950f }; static const float Mask_Coeff[64] = { 0.000000f, 0.826446f, 1.000000f, 0.390625f, 0.173611f, 0.062500f, 0.038447f, 0.026874f, 0.694444f, 0.694444f, 0.510204f, 0.277008f, 0.147929f, 0.029727f, 0.027778f, 0.033058f, 0.510204f, 0.591716f, 0.390625f, 0.173611f, 0.062500f, 0.030779f, 0.021004f, 0.031888f, 0.510204f, 0.346021f, 0.206612f, 0.118906f, 0.038447f, 0.013212f, 0.015625f, 0.026015f, 0.308642f, 0.206612f, 0.073046f, 0.031888f, 0.021626f, 0.008417f, 0.009426f, 0.016866f, 0.173611f, 0.081633f, 
0.033058f, 0.024414f, 0.015242f, 0.009246f, 0.007831f, 0.011815f, 0.041649f, 0.024414f, 0.016437f, 0.013212f, 0.009426f, 0.006830f, 0.006944f, 0.009803f, 0.019290f, 0.011815f, 0.011080f, 0.010412f, 0.007972f, 0.010000f, 0.009426f, 0.010203f }; static uint32_t calc_SSE_H(int16_t *DCT_A, int16_t *DCT_B, uint8_t *IMG_A, uint8_t *IMG_B, int stride) { int x, y, i, j; uint32_t Global_A, Global_B, Sum_A = 0, Sum_B = 0; uint32_t Local[8] = {0, 0, 0, 0, 0, 0, 0, 0}; uint32_t Local_Square[8] = {0, 0, 0, 0, 0, 0, 0, 0}; float MASK_A = 0.f, MASK_B = 0.f; float Mult1 = 1.f, Mult2 = 1.f; uint32_t MSE_H = 0; /* Step 1: Calculate CSF weighted energy of DCT coefficients */ for (y = 0; y < 8; y++) { for (x = 0; x < 8; x++) { MASK_A += (float)(DCT_A[y*8 + x]*DCT_A[y*8 + x])*Mask_Coeff[y*8 + x]; MASK_B += (float)(DCT_B[y*8 + x]*DCT_B[y*8 + x])*Mask_Coeff[y*8 + x]; } } /* Step 2: Determine local variances compared to entire block variance */ for (y = 0; y < 2; y++) { for (x = 0; x < 2; x++) { for (j = 0; j < 4; j++) { for (i = 0; i < 4; i++) { uint8_t A = IMG_A[(y*4+j)*stride + 4*x + i]; uint8_t B = IMG_B[(y*4+j)*stride + 4*x + i]; Local[y*2 + x] += A; Local[y*2 + x + 4] += B; Local_Square[y*2 + x] += A*A; Local_Square[y*2 + x + 4] += B*B; } } } } Global_A = Local[0] + Local[1] + Local[2] + Local[3]; Global_B = Local[4] + Local[5] + Local[6] + Local[7]; for (i = 0; i < 8; i++) Local[i] = (Local_Square[i]<<4) - (Local[i]*Local[i]); /* 16*Var(Di) */ Local_Square[0] += (Local_Square[1] + Local_Square[2] + Local_Square[3]); Local_Square[4] += (Local_Square[5] + Local_Square[6] + Local_Square[7]); Global_A = (Local_Square[0]<<6) - Global_A*Global_A; /* 64*Var(D) */ Global_B = (Local_Square[4]<<6) - Global_B*Global_B; /* 64*Var(D) */ /* Step 3: Calculate contrast masking threshold */ if (Global_A) Mult1 = (float)(Local[0]+Local[1]+Local[2]+Local[3])/((float)(Global_A)/4.f); if (Global_B) Mult2 = (float)(Local[4]+Local[5]+Local[6]+Local[7])/((float)(Global_B)/4.f); MASK_A = (float)sqrt(MASK_A 
* Mult1) / 32.f; MASK_B = (float)sqrt(MASK_B * Mult2) / 32.f; if (MASK_B > MASK_A) MASK_A = MASK_B; /* MAX(MASK_A, MASK_B) */ /* Step 4: Calculate MSE of DCT coeffs reduced by masking effect */ for (j = 0; j < 8; j++) { for (i = 0; i < 8; i++) { float u = (float)abs(DCT_A[j*8 + i] - DCT_B[j*8 + i]); if ((i|j)>0) { if (u < (MASK_A / Mask_Coeff[j*8 + i])) u = 0; /* The error is not perceivable */ else u -= (MASK_A / Mask_Coeff[j*8 + i]); } MSE_H += (uint32_t) ((16.f*(u * CSF_Coeff[j*8 + i])*(u * CSF_Coeff[j*8 + i])) + 0.5f); } } return MSE_H; /* Fixed-point value right-shifted by four */ } #else static uint32_t calc_SSE_H(int16_t *DCT_A, int16_t *DCT_B, uint8_t *IMG_A, uint8_t *IMG_B, int stride) { DECLARE_ALIGNED_MATRIX(sums, 1, 8, uint16_t, CACHE_LINE); DECLARE_ALIGNED_MATRIX(squares, 1, 8, uint32_t, CACHE_LINE); uint32_t i, Global_A, Global_B, Sum_A = 0, Sum_B = 0; uint32_t local[8], MASK_A, MASK_B, Mult1 = 64, Mult2 = 64; /* Step 1: Calculate CSF weighted energy of DCT coefficients */ Sum_A = coeff8_energy(DCT_A); Sum_B = coeff8_energy(DCT_B); /* Step 2: Determine local variances compared to entire block variance */ Global_A = blocksum8(IMG_A, stride, sums, squares); Global_B = blocksum8(IMG_B, stride, &sums[4], &squares[4]); for (i = 0; i < 8; i++) local[i] = (squares[i]<<4) - (sums[i]*sums[i]); /* 16*Var(Di) */ squares[0] += (squares[1] + squares[2] + squares[3]); squares[4] += (squares[5] + squares[6] + squares[7]); Global_A = (squares[0]<<6) - Global_A*Global_A; /* 64*Var(D) */ Global_B = (squares[4]<<6) - Global_B*Global_B; /* 64*Var(D) */ /* Step 3: Calculate contrast masking threshold */ if (Global_A) Mult1 = ((local[0]+local[1]+local[2]+local[3])<<8) / Global_A; if (Global_B) Mult2 = ((local[4]+local[5]+local[6]+local[7])<<8) / Global_B; MASK_A = isqrt(2*Sum_A*Mult1) + 16; MASK_B = isqrt(2*Sum_B*Mult2) + 16; if (MASK_B > MASK_A) /* MAX(MASK_A, MASK_B) */ MASK_A = ((MASK_B + 32) >> 6); else MASK_A = ((MASK_A + 32) >> 6); /* Step 4: Calculate MSE of DCT 
coeffs reduced by masking effect */ return sseh8_16bit(DCT_A, DCT_B, (uint16_t) MASK_A); } #endif static void psnrhvsm_after(xvid_plg_data_t *data, psnrhvsm_data_t *psnrhvsm) { DECLARE_ALIGNED_MATRIX(DCT, 2, 64, int16_t, CACHE_LINE); int32_t x, y, u, v; int16_t *DCT_A = &DCT[0], *DCT_B = &DCT[64]; uint64_t sse_y = 0, sse_u = 0, sse_v = 0; for (y = 0; y < data->height>>3; y++) { uint8_t *IMG_A = (uint8_t *) data->original.plane[0]; uint8_t *IMG_B = (uint8_t *) data->current.plane[0]; uint32_t stride = data->original.stride[0]; for (x = 0; x < data->width>>3; x++) { /* non multiple of 8 handling ?? */ int offset = (y<<3)*stride + (x<<3); emms(); /* Transfer data */ transfer_8to16copy(DCT_A, IMG_A + offset, stride); transfer_8to16copy(DCT_B, IMG_B + offset, stride); /* Perform DCT */ fdct(DCT_A); fdct(DCT_B); emms(); /* Calculate SSE_H reduced by contrast masking effect */ sse_y += calc_SSE_H(DCT_A, DCT_B, IMG_A + offset, IMG_B + offset, stride); } } for (y = 0; y < data->height>>4; y++) { uint8_t *U_A = (uint8_t *) data->original.plane[1]; uint8_t *V_A = (uint8_t *) data->original.plane[2]; uint8_t *U_B = (uint8_t *) data->current.plane[1]; uint8_t *V_B = (uint8_t *) data->current.plane[2]; uint32_t stride_uv = data->current.stride[1]; for (x = 0; x < data->width>>4; x++) { /* non multiple of 8 handling ?? 
 */ int offset = (y<<3)*stride_uv + (x<<3); emms(); /* Transfer data */ transfer_8to16copy(DCT_A, U_A + offset, stride_uv); transfer_8to16copy(DCT_B, U_B + offset, stride_uv); /* Perform DCT */ fdct(DCT_A); fdct(DCT_B); emms(); /* Calculate SSE_H reduced by contrast masking effect */ sse_u += calc_SSE_H(DCT_A, DCT_B, U_A + offset, U_B + offset, stride_uv); emms(); /* Transfer data */ transfer_8to16copy(DCT_A, V_A + offset, stride_uv); transfer_8to16copy(DCT_B, V_B + offset, stride_uv); /* Perform DCT */ fdct(DCT_A); fdct(DCT_B); emms(); /* Calculate SSE_H reduced by contrast masking effect */ sse_v += calc_SSE_H(DCT_A, DCT_B, V_A + offset, V_B + offset, stride_uv); } } y = (int32_t) ( 4*16*sse_y / (data->width * data->height)); u = (int32_t) (16*16*sse_u / (data->width * data->height)); v = (int32_t) (16*16*sse_v / (data->width * data->height)); psnrhvsm->mse_sum_y += y; psnrhvsm->mse_sum_u += u; psnrhvsm->mse_sum_v += v; psnrhvsm->frame_cnt++; printf(" psnrhvsm y: %2.2f, psnrhvsm u: %2.2f, psnrhvsm v: %2.2f\n", sse_to_PSNR(y, 1024), sse_to_PSNR(u, 1024), sse_to_PSNR(v, 1024)); } static int psnrhvsm_create(xvid_plg_create_t *create, void **handle) { psnrhvsm_data_t *psnrhvsm; psnrhvsm = (psnrhvsm_data_t *) malloc(sizeof(psnrhvsm_data_t)); /* Report allocation failure instead of dereferencing NULL */ if (psnrhvsm == NULL) return XVID_ERR_MEMORY; psnrhvsm->mse_sum_y = 0; psnrhvsm->mse_sum_u = 0; psnrhvsm->mse_sum_v = 0; psnrhvsm->frame_cnt = 0; *(handle) = (void*) psnrhvsm; return 0; } int xvid_plugin_psnrhvsm(void *handle, int opt, void *param1, void *param2) { switch(opt) { case(XVID_PLG_INFO): ((xvid_plg_info_t *)param1)->flags = XVID_REQORIGINAL; break; case(XVID_PLG_CREATE): /* propagate a failed allocation to the caller */ return psnrhvsm_create((xvid_plg_create_t *)param1,(void **)param2); case(XVID_PLG_BEFORE): case(XVID_PLG_FRAME): break; case(XVID_PLG_AFTER): psnrhvsm_after((xvid_plg_data_t *)param1, (psnrhvsm_data_t *)handle); break; case(XVID_PLG_DESTROY): { uint32_t y, u, v; psnrhvsm_data_t *psnrhvsm = (psnrhvsm_data_t *)handle; if (psnrhvsm) { y = (uint32_t) (psnrhvsm->mse_sum_y / psnrhvsm->frame_cnt); u = 
(uint32_t) (psnrhvsm->mse_sum_u / psnrhvsm->frame_cnt); v = (uint32_t) (psnrhvsm->mse_sum_v / psnrhvsm->frame_cnt); emms(); printf("Average psnrhvsm y: %2.2f, psnrhvsm u: %2.2f, psnrhvsm v: %2.2f\n", sse_to_PSNR(y, 1024), sse_to_PSNR(u, 1024), sse_to_PSNR(v, 1024)); free(psnrhvsm); } } break; default: break; } return 0; }; xvidcore/src/plugins/plugin_ssim.c0000664000076500007650000003435511564705453020454 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - SSIM plugin: computes the SSIM metric - * * Copyright(C) 2005 Johannes Reinhardt * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * * ****************************************************************************/ #include #include #include #include "../portab.h" #include "../xvid.h" #include "plugin_ssim.h" #include "../utils/emms.h" /* needed for visualisation of the error map with X display.h borrowed from x264 #include "display.h"*/ typedef struct framestat_t framestat_t; /*dev 1.0 gaussian weighting. the weight for the pixel x,y is w(x)*w(y)*/ static float mask8[8] = { 0.0069815f, 0.1402264f, 1.0361408f, 2.8165226f, 2.8165226f, 1.0361408f, 0.1402264f, 0.0069815f }; /* integer version. Norm: coeffs sums up to 4096. 
Define USE_INT_GAUSSIAN to use it as a replacement for the float version */ /* #define USE_INT_GAUSSIAN */ static const uint16_t imask8[8] = { 4, 72, 530, 1442, 1442, 530, 72, 4 }; #define GACCUM(X) ( ((X)+(1<<11)) >> 12 ) struct framestat_t{ int type; int quant; float ssim_min; float ssim_max; float ssim_avg; framestat_t* next; }; typedef int (*lumfunc)(uint8_t* ptr, int stride); typedef void (*csfunc)(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr); int lum_8x8_mmx(uint8_t* ptr, int stride); void consim_mmx(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr); void consim_sse2(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr); typedef struct{ plg_ssim_param_t* param; /* for error map visualisation uint8_t* errmap; */ int grid; /*for average SSIM*/ float ssim_sum; int frame_cnt; /*function pointers*/ lumfunc func8x8; lumfunc func2x8; csfunc consim; /*stats - for debugging*/ framestat_t* head; framestat_t* tail; } ssim_data_t; /* append the stats for another frame to the linked list*/ void framestat_append(ssim_data_t* ssim,int type, int quant, float min, float max, float avg){ framestat_t* act; act = (framestat_t*) malloc(sizeof(framestat_t)); /* drop this frame's stats if the allocation fails */ if(act == NULL) return; act->type = type; act->quant = quant; act->ssim_min = min; act->ssim_max = max; act->ssim_avg = avg; act->next = NULL; if(ssim->head == NULL){ ssim->head = act; ssim->tail = act; } else { ssim->tail->next = act; ssim->tail = act; } } /* destroy the whole list*/ void framestat_free(framestat_t* stat){ if(stat != NULL){ if(stat->next != NULL) framestat_free(stat->next); free(stat); } return; } /* write out the collected stats */ void framestat_write(ssim_data_t* ssim, char* path){ framestat_t* tmp = ssim->head; FILE* out = fopen(path,"w"); if(out==NULL){ printf("Cannot open %s in plugin_ssim\n",path); return; } fprintf(out,"SSIM Error Metric\n"); fprintf(out,"quant avg min max\n"); /* the loop below dereferences tmp->next, so it needs at least two entries */ if(tmp==NULL || tmp->next==NULL){ fclose(out); return; } while(tmp->next->next != 
NULL){ fprintf(out,"%3d %1.3f %1.3f %1.3f\n",tmp->quant,tmp->ssim_avg,tmp->ssim_min,tmp->ssim_max); tmp = tmp->next; } fclose(out); } /* write out the collected stats in Octave-readable format */ void framestat_write_oct(ssim_data_t* ssim, char* path){ framestat_t* tmp; FILE* out = fopen(path,"w"); if(out==NULL){ printf("Cannot open %s in plugin_ssim\n",path); return; } /* the loops below dereference tmp->next, so they need at least two entries */ if(ssim->head==NULL || ssim->head->next==NULL){ fclose(out); return; } fprintf(out,"quant = ["); tmp = ssim->head; while(tmp->next->next != NULL){ fprintf(out,"%d, ",tmp->quant); tmp = tmp->next; } fprintf(out,"%d];\n\n",tmp->quant); fprintf(out,"ssim_min = ["); tmp = ssim->head; while(tmp->next->next != NULL){ fprintf(out,"%f, ",tmp->ssim_min); tmp = tmp->next; } fprintf(out,"%f];\n\n",tmp->ssim_min); fprintf(out,"ssim_max = ["); tmp = ssim->head; while(tmp->next->next != NULL){ fprintf(out,"%f, ",tmp->ssim_max); tmp = tmp->next; } fprintf(out,"%f];\n\n",tmp->ssim_max); fprintf(out,"ssim_avg = ["); tmp = ssim->head; while(tmp->next->next != NULL){ fprintf(out,"%f, ",tmp->ssim_avg); tmp = tmp->next; } fprintf(out,"%f];\n\n",tmp->ssim_avg); fprintf(out,"ivop = ["); tmp = ssim->head; while(tmp->next->next != NULL){ if(tmp->type == XVID_TYPE_IVOP){ fprintf(out,"%d, ",tmp->quant); fprintf(out,"%f, ",tmp->ssim_avg); fprintf(out,"%f, ",tmp->ssim_min); fprintf(out,"%f; ",tmp->ssim_max); } tmp = tmp->next; } fprintf(out,"%d, ",tmp->quant); fprintf(out,"%f, ",tmp->ssim_avg); fprintf(out,"%f, ",tmp->ssim_min); fprintf(out,"%f];\n\n",tmp->ssim_max); fprintf(out,"pvop = ["); tmp = ssim->head; while(tmp->next->next != NULL){ if(tmp->type == XVID_TYPE_PVOP){ fprintf(out,"%d, ",tmp->quant); fprintf(out,"%f, ",tmp->ssim_avg); fprintf(out,"%f, ",tmp->ssim_min); fprintf(out,"%f; ",tmp->ssim_max); } tmp = tmp->next; } fprintf(out,"%d, ",tmp->quant); fprintf(out,"%f, ",tmp->ssim_avg); fprintf(out,"%f, ",tmp->ssim_min); fprintf(out,"%f];\n\n",tmp->ssim_max); fprintf(out,"bvop = ["); tmp = ssim->head; while(tmp->next->next != NULL){ if(tmp->type == XVID_TYPE_BVOP){ fprintf(out,"%d, ",tmp->quant); 
fprintf(out,"%f, ",tmp->ssim_avg); fprintf(out,"%f, ",tmp->ssim_min); fprintf(out,"%f; ",tmp->ssim_max); } tmp = tmp->next; } fprintf(out,"%d, ",tmp->quant); fprintf(out,"%f, ",tmp->ssim_avg); fprintf(out,"%f, ",tmp->ssim_min); fprintf(out,"%f];\n\n",tmp->ssim_max); fclose(out); } /*calculate the luminance of a 8x8 block*/ int lum_8x8_c(uint8_t* ptr, int stride){ int mean=0,i,j; for(i=0;i< 8;i++) for(j=0;j< 8;j++){ mean += ptr[i*stride + j]; } return mean; } int lum_8x8_gaussian(uint8_t* ptr, int stride){ float mean=0,sum; int i,j; for(i=0;i<8;i++){ sum = 0; for(j=0;j<8;j++) sum += ptr[i*stride + j]*mask8[j]; sum *=mask8[i]; mean += sum; } return (int) (mean + 0.5); } int lum_8x8_gaussian_int(uint8_t* ptr, int stride){ uint32_t mean; int i,j; mean = 0; for(i=0;i<8;i++){ uint32_t sum = 0; for(j=0;j<8;j++) sum += ptr[i*stride + j]*imask8[j]; sum = GACCUM(sum) * imask8[i]; mean += sum; } return (int)GACCUM(mean); } /*calculate the difference between two blocks next to each other on a row*/ int lum_2x8_c(uint8_t* ptr, int stride){ int mean=0,i; /*Luminance*/ for(i=0;i< 8;i++){ mean -= *(ptr-1); mean += *(ptr+ 8 - 1); ptr+=stride; } return mean; } /*calculate contrast and correlation of the two blocks*/ void consim_gaussian(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr){ unsigned int valo, valc,i,j,str; float devo=0, devc=0, corr=0,sumo,sumc,sumcorr; str = stride - 8; for(i=0;i< 8;i++){ sumo = 0; sumc = 0; sumcorr = 0; for(j=0;j< 8;j++){ valo = *ptro; valc = *ptrc; sumo += valo*valo*mask8[j]; sumc += valc*valc*mask8[j]; sumcorr += valo*valc*mask8[j]; ptro++; ptrc++; } devo += sumo*mask8[i]; devc += sumc*mask8[i]; corr += sumcorr*mask8[i]; ptro += str; ptrc += str; } *pdevo = (int) ((devo - ((lumo*lumo + 32) >> 6)) + 0.5); *pdevc = (int) ((devc - ((lumc*lumc + 32) >> 6)) + 0.5); *pcorr = (int) ((corr - ((lumo*lumc + 32) >> 6)) + 0.5); }; void consim_gaussian_int(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int 
lumc, int* pdevo, int* pdevc, int* pcorr) { unsigned int valo, valc,i,j,str; uint32_t devo=0, devc=0, corr=0; str = stride - 8; for(i=0;i< 8;i++){ uint32_t sumo = 0; uint32_t sumc = 0; uint32_t sumcorr = 0; for(j=0;j< 8;j++){ valo = *ptro; valc = *ptrc; sumo += valo*valo*imask8[j]; sumc += valc*valc*imask8[j]; sumcorr += valo*valc*imask8[j]; ptro++; ptrc++; } devo += GACCUM(sumo)*imask8[i]; devc += GACCUM(sumc)*imask8[i]; corr += GACCUM(sumcorr)*imask8[i]; ptro += str; ptrc += str; } devo = GACCUM(devo); devc = GACCUM(devc); corr = GACCUM(corr); *pdevo = (int) ((devo - ((lumo*lumo + 32) >> 6)) + 0.5); *pdevc = (int) ((devc - ((lumc*lumc + 32) >> 6)) + 0.5); *pcorr = (int) ((corr - ((lumo*lumc + 32) >> 6)) + 0.5); }; /*calculate contrast and correlation of the two blocks*/ void consim_c(uint8_t* ptro, uint8_t* ptrc, int stride, int lumo, int lumc, int* pdevo, int* pdevc, int* pcorr){ unsigned int valo, valc, devo=0, devc=0, corr=0,i,j,str; str = stride - 8; for(i=0;i< 8;i++){ for(j=0;j< 8;j++){ valo = *ptro; valc = *ptrc; devo += valo*valo; devc += valc*valc; corr += valo*valc; ptro++; ptrc++; } ptro += str; ptrc += str; } *pdevo = devo - ((lumo*lumo + 32) >> 6); *pdevc = devc - ((lumc*lumc + 32) >> 6); *pcorr = corr - ((lumo*lumc + 32) >> 6); }; /*calculate the final ssim value*/ static float calc_ssim(float meano, float meanc, float devo, float devc, float corr){ static const float c1 = (0.01f*255)*(0.01f*255); static const float c2 = (0.03f*255)*(0.03f*255); /*printf("meano: %f meanc: %f devo: %f devc: %f corr: %f\n",meano,meanc,devo,devc,corr);*/ return ((2.0f*meano*meanc + c1)*(corr/32.0f + c2))/((meano*meano + meanc*meanc + c1)*(devc/64.0f + devo/64.0f + c2)); } static void ssim_after(xvid_plg_data_t* data, ssim_data_t* ssim){ int i,j,c=0,opt; int width,height,str,ovr; unsigned char * ptr1,*ptr2; float isum=0, min=1.00,max=0.00, val; int meanc, meano; int devc, devo, corr; width = data->width - 8; height = data->height - 8; str = data->original.stride[0]; 
if(str != data->current.stride[0]) printf("WARNING: Different strides in plugin_ssim original: %d current: %d\n",str,data->current.stride[0]); ovr = str - width + (width % ssim->grid); ptr1 = (unsigned char*) data->original.plane[0]; ptr2 = (unsigned char*) data->current.plane[0]; opt = ssim->grid == 1 && ssim->param->acc != 0; /*TODO: Thread*/ for(i=0;i<height;i+=ssim->grid){ /*begin of each row*/ meano = meanc = devc = devo = corr = 0; meano = ssim->func8x8(ptr1,str); meanc = ssim->func8x8(ptr2,str); ssim->consim(ptr1,ptr2,str,meano,meanc,&devo,&devc,&corr); emms(); val = calc_ssim((float) meano,(float) meanc,(float) devo,(float) devc,(float) corr); isum += val; c++; /* for visualisation if(ssim->param->b_visualize) ssim->errmap[i*width] = (uint8_t) 127*val; */ if(val < min) min = val; if(val > max) max = val; ptr1+=ssim->grid; ptr2+=ssim->grid; /*rest of each row*/ for(j=ssim->grid;j<width;j+=ssim->grid){ if(opt){ meano += ssim->func2x8(ptr1,str); meanc += ssim->func2x8(ptr2,str); } else { meano = ssim->func8x8(ptr1,str); meanc = ssim->func8x8(ptr2,str); } ssim->consim(ptr1,ptr2,str,meano,meanc,&devo,&devc,&corr); emms(); val = calc_ssim((float) meano,(float) meanc,(float) devo,(float) devc,(float) corr); isum += val; c++; /* for visualisation if(ssim->param->b_visualize) ssim->errmap[i*width +j] = (uint8_t) 255*val; */ if(val < min) min = val; if(val > max) max = val; ptr1+=ssim->grid; ptr2+=ssim->grid; } ptr1 +=ovr; ptr2 +=ovr; } isum/=c; ssim->ssim_sum += isum; ssim->frame_cnt++; if(ssim->param->stat_path != NULL) framestat_append(ssim,data->type,data->quant,min,max,isum); /* for visualization if(ssim->param->b_visualize){ disp_gray(0,ssim->errmap,width,height,width, "Error-Map"); disp_gray(1,data->original.plane[0],data->width,data->height,data->original.stride[0],"Original"); disp_gray(2,data->current.plane[0],data->width,data->height,data->original.stride[0],"Compressed"); disp_sync(); } */ if(ssim->param->b_printstat){ printf(" SSIM: avg: %1.3f min: %1.3f max: %1.3f\n",isum,min,max); } } 
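/*
 * Illustrative cross-check, not part of the plugin: a minimal sketch of the
 * textbook SSIM formula for one pair of 8x8 luma blocks, computed in floating
 * point from plain means, variances and the covariance. ssim_after() above
 * works on scaled integer sums (func8x8/consim) instead, so its values need
 * not match this sketch exactly. The name ssim_block_ref and the use of
 * unweighted 8x8 windows are assumptions made for this example only.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference helper: textbook SSIM of two 8x8 blocks.
 * orig/comp point at the top-left pixel of each block, stride is the
 * distance in bytes between two rows (same for both blocks). */
static float ssim_block_ref(const uint8_t *orig, const uint8_t *comp, int stride)
{
	/* usual stabilising constants for 8-bit samples */
	const float c1 = (0.01f * 255) * (0.01f * 255);
	const float c2 = (0.03f * 255) * (0.03f * 255);
	float sum_o = 0, sum_c = 0, sq_o = 0, sq_c = 0, cross = 0;
	float mean_o, mean_c, var_o, var_c, cov;
	int i, j;

	assert(stride >= 8);

	/* accumulate first and second order sums over the 64 pixels */
	for (i = 0; i < 8; i++) {
		for (j = 0; j < 8; j++) {
			float o = orig[i * stride + j];
			float c = comp[i * stride + j];
			sum_o += o;
			sum_c += c;
			sq_o += o * o;
			sq_c += c * c;
			cross += o * c;
		}
	}

	mean_o = sum_o / 64.0f;
	mean_c = sum_c / 64.0f;
	var_o = sq_o / 64.0f - mean_o * mean_o;
	var_c = sq_c / 64.0f - mean_c * mean_c;
	cov = cross / 64.0f - mean_o * mean_c;

	return ((2.0f * mean_o * mean_c + c1) * (2.0f * cov + c2)) /
	       ((mean_o * mean_o + mean_c * mean_c + c1) * (var_o + var_c + c2));
}
```

/*
 * Possible use while debugging: call ssim_block_ref(ptr1, ptr2, str) with the
 * pointers handed to ssim->consim() and compare it against val for the same
 * block. Identical blocks should score close to 1.
 */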
static int ssim_create(xvid_plg_create_t* create, void** handle){ ssim_data_t* ssim; plg_ssim_param_t* param; param = (plg_ssim_param_t*) malloc(sizeof(plg_ssim_param_t)); if(param == NULL) return XVID_ERR_MEMORY; *param = *((plg_ssim_param_t*) create->param); ssim = (ssim_data_t*) malloc(sizeof(ssim_data_t)); if(ssim == NULL){ free(param); return XVID_ERR_MEMORY; } ssim->func8x8 = lum_8x8_c; ssim->func2x8 = lum_2x8_c; ssim->consim = consim_c; ssim->param = param; ssim->grid = param->acc; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) { int cpu_flags = (param->cpu_flags & XVID_CPU_FORCE) ? param->cpu_flags : check_cpu_features(); if((cpu_flags & XVID_CPU_MMX) && (param->acc > 0)){ ssim->func8x8 = lum_8x8_mmx; ssim->consim = consim_mmx; } if((cpu_flags & XVID_CPU_SSE2) && (param->acc > 0)){ ssim->consim = consim_sse2; } } #endif /*gaussian weighting (selected by acc == 0)*/ #if !defined(USE_INT_GAUSSIAN) if(ssim->grid == 0){ ssim->grid = 1; ssim->func8x8 = lum_8x8_gaussian; ssim->func2x8 = NULL; ssim->consim = consim_gaussian; } #else if(ssim->grid == 0){ ssim->grid = 1; ssim->func8x8 = lum_8x8_gaussian_int; ssim->func2x8 = NULL; ssim->consim = consim_gaussian_int; } #endif if(ssim->grid > 4) ssim->grid = 4; ssim->ssim_sum = 0.0; ssim->frame_cnt = 0; /* for visualization if(param->b_visualize){ //error map ssim->errmap = (uint8_t*) malloc(sizeof(uint8_t)*(create->width-8)*(create->height-8)); } else { ssim->errmap = NULL; }; */ /*stats*/ ssim->head=NULL; ssim->tail=NULL; *(handle) = (void*) ssim; return 0; } int xvid_plugin_ssim(void * handle, int opt, void * param1, void * param2){ ssim_data_t* ssim; switch(opt){ case(XVID_PLG_INFO): ((xvid_plg_info_t*) param1)->flags = XVID_REQORIGINAL; break; case(XVID_PLG_CREATE): /* propagate allocation failures to the caller */ return ssim_create((xvid_plg_create_t*) param1,(void**) param2); case(XVID_PLG_BEFORE): case(XVID_PLG_FRAME): break; case(XVID_PLG_AFTER): ssim_after((xvid_plg_data_t*) param1, (ssim_data_t*) handle); break; case(XVID_PLG_DESTROY): ssim = (ssim_data_t*) handle; printf("Average SSIM: %f\n",ssim->ssim_sum/ssim->frame_cnt); if(ssim->param->stat_path != 
NULL) framestat_write(ssim,ssim->param->stat_path); framestat_free(ssim->head); /*free(ssim->errmap);*/ free(ssim->param); free(ssim); break; default: break; } return 0; }; xvidcore/src/image/0000775000076500007650000000000011566427762015354 5ustar xvidbuildxvidbuildxvidcore/src/image/image.c0000664000076500007650000010035511564705453016600 0ustar xvidbuildxvidbuild/************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Image management functions - * * Copyright(C) 2001-2010 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: image.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdlib.h> #include <string.h> /* memcpy, memset */ #include <math.h> #include "../portab.h" #include "../global.h" /* XVID_CSP_XXX's */ #include "../xvid.h" /* XVID_CSP_XXX's */ #include "image.h" #include "colorspace.h" #include "interpolate8x8.h" #include "../utils/mem_align.h" #include "../motion/sad.h" #include "../utils/emms.h" #include "font.h" /* XXX: remove later */ #define SAFETY 64 #define EDGE_SIZE2 (EDGE_SIZE/2) int32_t image_create(IMAGE * image, uint32_t edged_width, uint32_t edged_height) { const uint32_t edged_width2 = edged_width / 2; const uint32_t edged_height2 = edged_height / 2; image->y = xvid_malloc(edged_width * (edged_height + 1) + SAFETY, CACHE_LINE); if (image->y == NULL) { return -1; } memset(image->y, 0, edged_width * (edged_height + 1) + SAFETY); image->u = xvid_malloc(edged_width2 * edged_height2 + SAFETY, CACHE_LINE); if (image->u == NULL) { xvid_free(image->y); image->y = NULL; return -1; } memset(image->u, 0, edged_width2 * edged_height2 + SAFETY); image->v = xvid_malloc(edged_width2 * edged_height2 + SAFETY, CACHE_LINE); if (image->v == NULL) { xvid_free(image->u); image->u = NULL; xvid_free(image->y); image->y = NULL; return -1; } memset(image->v, 0, edged_width2 * edged_height2 + SAFETY); image->y += EDGE_SIZE * edged_width + EDGE_SIZE; image->u += EDGE_SIZE2 * edged_width2 + EDGE_SIZE2; image->v += EDGE_SIZE2 * edged_width2 + EDGE_SIZE2; return 0; } void image_destroy(IMAGE * image, uint32_t edged_width, uint32_t edged_height) { const uint32_t edged_width2 = edged_width / 2; if (image->y) { xvid_free(image->y - (EDGE_SIZE * edged_width + EDGE_SIZE)); image->y = NULL; } if (image->u) { xvid_free(image->u - 
(EDGE_SIZE2 * edged_width2 + EDGE_SIZE2)); image->u = NULL; } if (image->v) { xvid_free(image->v - (EDGE_SIZE2 * edged_width2 + EDGE_SIZE2)); image->v = NULL; } } void image_swap(IMAGE * image1, IMAGE * image2) { SWAP(uint8_t*, image1->y, image2->y); SWAP(uint8_t*, image1->u, image2->u); SWAP(uint8_t*, image1->v, image2->v); } void image_copy(IMAGE * image1, IMAGE * image2, uint32_t edged_width, uint32_t height) { memcpy(image1->y, image2->y, edged_width * height); memcpy(image1->u, image2->u, edged_width * height / 4); memcpy(image1->v, image2->v, edged_width * height / 4); } /* setedges bug was in this BS versions */ #define SETEDGES_BUG_BEFORE 18 #define SETEDGES_BUG_AFTER 57 #define SETEDGES_BUG_REFIXED 63 void image_setedges(IMAGE * image, uint32_t edged_width, uint32_t edged_height, uint32_t width, uint32_t height, int bs_version) { const uint32_t edged_width2 = edged_width / 2; uint32_t width2; uint32_t i; uint8_t *dst; uint8_t *src; dst = image->y - (EDGE_SIZE + EDGE_SIZE * edged_width); src = image->y; /* According to the Standard Clause 7.6.4, padding is done starting at 16 * pixel width and height multiples. 
This was not respected in old xvids */ if ((bs_version >= SETEDGES_BUG_BEFORE && bs_version < SETEDGES_BUG_AFTER) || bs_version >= SETEDGES_BUG_REFIXED) { width = (width+15)&~15; height = (height+15)&~15; } width2 = width/2; for (i = 0; i < EDGE_SIZE; i++) { memset(dst, *src, EDGE_SIZE); memcpy(dst + EDGE_SIZE, src, width); memset(dst + edged_width - EDGE_SIZE, *(src + width - 1), EDGE_SIZE); dst += edged_width; } for (i = 0; i < height; i++) { memset(dst, *src, EDGE_SIZE); memset(dst + edged_width - EDGE_SIZE, src[width - 1], EDGE_SIZE); dst += edged_width; src += edged_width; } src -= edged_width; for (i = 0; i < EDGE_SIZE; i++) { memset(dst, *src, EDGE_SIZE); memcpy(dst + EDGE_SIZE, src, width); memset(dst + edged_width - EDGE_SIZE, *(src + width - 1), EDGE_SIZE); dst += edged_width; } /* U */ dst = image->u - (EDGE_SIZE2 + EDGE_SIZE2 * edged_width2); src = image->u; for (i = 0; i < EDGE_SIZE2; i++) { memset(dst, *src, EDGE_SIZE2); memcpy(dst + EDGE_SIZE2, src, width2); memset(dst + edged_width2 - EDGE_SIZE2, *(src + width2 - 1), EDGE_SIZE2); dst += edged_width2; } for (i = 0; i < height / 2; i++) { memset(dst, *src, EDGE_SIZE2); memset(dst + edged_width2 - EDGE_SIZE2, src[width2 - 1], EDGE_SIZE2); dst += edged_width2; src += edged_width2; } src -= edged_width2; for (i = 0; i < EDGE_SIZE2; i++) { memset(dst, *src, EDGE_SIZE2); memcpy(dst + EDGE_SIZE2, src, width2); memset(dst + edged_width2 - EDGE_SIZE2, *(src + width2 - 1), EDGE_SIZE2); dst += edged_width2; } /* V */ dst = image->v - (EDGE_SIZE2 + EDGE_SIZE2 * edged_width2); src = image->v; for (i = 0; i < EDGE_SIZE2; i++) { memset(dst, *src, EDGE_SIZE2); memcpy(dst + EDGE_SIZE2, src, width2); memset(dst + edged_width2 - EDGE_SIZE2, *(src + width2 - 1), EDGE_SIZE2); dst += edged_width2; } for (i = 0; i < height / 2; i++) { memset(dst, *src, EDGE_SIZE2); memset(dst + edged_width2 - EDGE_SIZE2, src[width2 - 1], EDGE_SIZE2); dst += edged_width2; src += edged_width2; } src -= edged_width2; for (i = 0; i < 
EDGE_SIZE2; i++) { memset(dst, *src, EDGE_SIZE2); memcpy(dst + EDGE_SIZE2, src, width2); memset(dst + edged_width2 - EDGE_SIZE2, *(src + width2 - 1), EDGE_SIZE2); dst += edged_width2; } } void image_interpolate(const uint8_t * refn, uint8_t * refh, uint8_t * refv, uint8_t * refhv, uint32_t edged_width, uint32_t edged_height, uint32_t quarterpel, uint32_t rounding) { const uint32_t offset = EDGE_SIZE2 * (edged_width + 1); /* we only interpolate half of the edge area */ const uint32_t stride_add = 7 * edged_width; uint8_t *n_ptr; uint8_t *h_ptr, *v_ptr, *hv_ptr; uint32_t x, y; n_ptr = (uint8_t*)refn; h_ptr = refh; v_ptr = refv; n_ptr -= offset; h_ptr -= offset; v_ptr -= offset; /* Note we initialize the hv pointer later, as we can optimize code a bit * doing it down to up in quarterpel and up to down in halfpel */ if(quarterpel) { for (y = 0; y < (edged_height - EDGE_SIZE); y += 8) { for (x = 0; x < (edged_width - EDGE_SIZE); x += 8) { interpolate8x8_6tap_lowpass_h(h_ptr, n_ptr, edged_width, rounding); interpolate8x8_6tap_lowpass_v(v_ptr, n_ptr, edged_width, rounding); n_ptr += 8; h_ptr += 8; v_ptr += 8; } n_ptr += EDGE_SIZE; h_ptr += EDGE_SIZE; v_ptr += EDGE_SIZE; h_ptr += stride_add; v_ptr += stride_add; n_ptr += stride_add; } h_ptr = refh + (edged_height - EDGE_SIZE - EDGE_SIZE2)*edged_width - EDGE_SIZE2; hv_ptr = refhv + (edged_height - EDGE_SIZE - EDGE_SIZE2)*edged_width - EDGE_SIZE2; for (y = 0; y < (edged_height - EDGE_SIZE); y = y + 8) { hv_ptr -= stride_add; h_ptr -= stride_add; hv_ptr -= EDGE_SIZE; h_ptr -= EDGE_SIZE; for (x = 0; x < (edged_width - EDGE_SIZE); x = x + 8) { hv_ptr -= 8; h_ptr -= 8; interpolate8x8_6tap_lowpass_v(hv_ptr, h_ptr, edged_width, rounding); } } } else { hv_ptr = refhv; hv_ptr -= offset; for (y = 0; y < (edged_height - EDGE_SIZE); y += 8) { for (x = 0; x < (edged_width - EDGE_SIZE); x += 8) { interpolate8x8_halfpel_h(h_ptr, n_ptr, edged_width, rounding); interpolate8x8_halfpel_v(v_ptr, n_ptr, edged_width, rounding); 
interpolate8x8_halfpel_hv(hv_ptr, n_ptr, edged_width, rounding); n_ptr += 8; h_ptr += 8; v_ptr += 8; hv_ptr += 8; } h_ptr += EDGE_SIZE; v_ptr += EDGE_SIZE; hv_ptr += EDGE_SIZE; n_ptr += EDGE_SIZE; h_ptr += stride_add; v_ptr += stride_add; hv_ptr += stride_add; n_ptr += stride_add; } } } /* chroma optimize filter, invented by mf: a chroma pixel is replaced by the average of its four neighbouring chroma pixels (left, top, right, bottom) when the corresponding luma pixels are pure black or white. */ void image_chroma_optimize(IMAGE * img, int width, int height, int edged_width) { int x,y; int pixels = 0; for (y = 1; y < height/2 - 1; y++) for (x = 1; x < width/2 - 1; x++) { #define IS_PURE(a) ((a)<=16||(a)>=235) #define IMG_Y(Y,X) img->y[(Y)*edged_width + (X)] #define IMG_U(Y,X) img->u[(Y)*edged_width/2 + (X)] #define IMG_V(Y,X) img->v[(Y)*edged_width/2 + (X)] if (IS_PURE(IMG_Y(y*2 ,x*2 )) && IS_PURE(IMG_Y(y*2 ,x*2+1)) && IS_PURE(IMG_Y(y*2+1,x*2 )) && IS_PURE(IMG_Y(y*2+1,x*2+1))) { IMG_U(y,x) = (IMG_U(y,x-1) + IMG_U(y-1, x) + IMG_U(y, x+1) + IMG_U(y+1, x)) / 4; IMG_V(y,x) = (IMG_V(y,x-1) + IMG_V(y-1, x) + IMG_V(y, x+1) + IMG_V(y+1, x)) / 4; pixels++; } #undef IS_PURE #undef IMG_Y #undef IMG_U #undef IMG_V } DPRINTF(XVID_DEBUG_DEBUG,"chroma_optimized_pixels = %i/%i\n", pixels, width*height/4); } /* perform safe packed colorspace conversion, by splitting the image up into an optimized area (pixel width divisible by 16), and two unoptimized/plain-c areas (pixel width divisible by 2) */ static void safe_packed_conv(uint8_t * x_ptr, int x_stride, uint8_t * y_ptr, uint8_t * u_ptr, uint8_t * v_ptr, int y_stride, int uv_stride, int width, int height, int vflip, packedFunc * func_opt, packedFunc func_c, int size, int interlacing) { int width_opt, width_c, height_opt; if (width<0 || width==1 || height==1) return; /* forget about it */ if (func_opt != func_c && x_stride < size*((width+15)/16)*16) { width_opt = width & (~15); width_c = (width - width_opt) & (~1); } else if (func_opt != func_c && !(width&1) && (size==3)) { /* MMX 
reads 4 bytes per pixel for RGB/BGR */ width_opt = width - 2; width_c = 2; } else { /* Enforce the width to be divisible by two. */ width_opt = width & (~1); width_c = 0; } /* packed conversions require height to be divisible by 2 (or even by 4 for interlaced conversion) */ if (interlacing) height_opt = height & (~3); else height_opt = height & (~1); func_opt(x_ptr, x_stride, y_ptr, u_ptr, v_ptr, y_stride, uv_stride, width_opt, height_opt, vflip); if (width_c) { func_c(x_ptr + size*width_opt, x_stride, y_ptr + width_opt, u_ptr + width_opt/2, v_ptr + width_opt/2, y_stride, uv_stride, width_c, height_opt, vflip); } } int image_input(IMAGE * image, uint32_t width, int height, uint32_t edged_width, uint8_t * src[4], int src_stride[4], int csp, int interlacing) { const int edged_width2 = edged_width/2; const int width2 = width/2; const int height2 = height/2; #if 0 const int height_signed = (csp & XVID_CSP_VFLIP) ? -height : height; #endif switch (csp & ~XVID_CSP_VFLIP) { case XVID_CSP_RGB555: safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?rgb555i_to_yv12 :rgb555_to_yv12, interlacing?rgb555i_to_yv12_c:rgb555_to_yv12_c, 2, interlacing); break; case XVID_CSP_RGB565: safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?rgb565i_to_yv12 :rgb565_to_yv12, interlacing?rgb565i_to_yv12_c:rgb565_to_yv12_c, 2, interlacing); break; case XVID_CSP_BGR: safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?bgri_to_yv12 :bgr_to_yv12, interlacing?bgri_to_yv12_c:bgr_to_yv12_c, 3, interlacing); break; case XVID_CSP_BGRA: safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?bgrai_to_yv12 :bgra_to_yv12, 
interlacing?bgrai_to_yv12_c:bgra_to_yv12_c, 4, interlacing); break; case XVID_CSP_ABGR : safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?abgri_to_yv12 :abgr_to_yv12, interlacing?abgri_to_yv12_c:abgr_to_yv12_c, 4, interlacing); break; case XVID_CSP_RGB: safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?rgbi_to_yv12 :rgb_to_yv12, interlacing?rgbi_to_yv12_c:rgb_to_yv12_c, 3, interlacing); break; case XVID_CSP_RGBA : safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?rgbai_to_yv12 :rgba_to_yv12, interlacing?rgbai_to_yv12_c:rgba_to_yv12_c, 4, interlacing); break; case XVID_CSP_ARGB: safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?argbi_to_yv12 : argb_to_yv12, interlacing?argbi_to_yv12_c: argb_to_yv12_c, 4, interlacing); break; case XVID_CSP_YUY2: safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yuyvi_to_yv12 :yuyv_to_yv12, interlacing?yuyvi_to_yv12_c:yuyv_to_yv12_c, 2, interlacing); break; case XVID_CSP_YVYU: /* u/v swapped */ safe_packed_conv( src[0], src_stride[0], image->y, image->v, image->u, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yuyvi_to_yv12 :yuyv_to_yv12, interlacing?yuyvi_to_yv12_c:yuyv_to_yv12_c, 2, interlacing); break; case XVID_CSP_UYVY: safe_packed_conv( src[0], src_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?uyvyi_to_yv12 :uyvy_to_yv12, interlacing?uyvyi_to_yv12_c:uyvy_to_yv12_c, 2, interlacing); break; case XVID_CSP_I420: /* YCbCr == YUV == internal 
colorspace for MPEG */ yv12_to_yv12(image->y, image->u, image->v, edged_width, edged_width2, src[0], src[0] + src_stride[0]*height, src[0] + src_stride[0]*height + (src_stride[0]/2)*height2, src_stride[0], src_stride[0]/2, width, height, (csp & XVID_CSP_VFLIP)); break; case XVID_CSP_YV12: /* YCrCb == YVA == U and V plane swapped */ yv12_to_yv12(image->y, image->v, image->u, edged_width, edged_width2, src[0], src[0] + src_stride[0]*height, src[0] + src_stride[0]*height + (src_stride[0]/2)*height2, src_stride[0], src_stride[0]/2, width, height, (csp & XVID_CSP_VFLIP)); break; case XVID_CSP_PLANAR: /* YCbCr with arbitrary pointers and different strides for Y and UV */ yv12_to_yv12(image->y, image->u, image->v, edged_width, edged_width2, src[0], src[1], src[2], src_stride[0], src_stride[1], /* v: dst_stride[2] not yet supported */ width, height, (csp & XVID_CSP_VFLIP)); break; case XVID_CSP_NULL: break; default : return -1; } /* pad out image when the width and/or height is not a multiple of 16 */ if (width & 15) { int i; int pad_width = 16 - (width&15); for (i = 0; i < height; i++) { memset(image->y + i*edged_width + width, *(image->y + i*edged_width + width - 1), pad_width); } for (i = 0; i < height/2; i++) { memset(image->u + i*edged_width2 + width2, *(image->u + i*edged_width2 + width2 - 1),pad_width/2); memset(image->v + i*edged_width2 + width2, *(image->v + i*edged_width2 + width2 - 1),pad_width/2); } } if (height & 15) { int pad_height = 16 - (height&15); int length = ((width+15)/16)*16; int i; for (i = 0; i < pad_height; i++) { memcpy(image->y + (height+i)*edged_width, image->y + (height-1)*edged_width,length); } for (i = 0; i < pad_height/2; i++) { memcpy(image->u + (height2+i)*edged_width2, image->u + (height2-1)*edged_width2,length/2); memcpy(image->v + (height2+i)*edged_width2, image->v + (height2-1)*edged_width2,length/2); } } /* if (interlacing) image_printf(image, edged_width, height, 5,5, "[i]"); image_dump_yuvpgm(image, edged_width, ((width+15)/16)*16, 
((height+15)/16)*16, "\\encode.pgm"); */ return 0; } int image_output(IMAGE * image, uint32_t width, int height, uint32_t edged_width, uint8_t * dst[4], int dst_stride[4], int csp, int interlacing) { const int edged_width2 = edged_width/2; int height2 = height/2; /* if (interlacing) image_printf(image, edged_width, height, 5,100, "[i]=%i,%i",width,height); image_dump_yuvpgm(image, edged_width, width, height, "\\decode.pgm"); */ switch (csp & ~XVID_CSP_VFLIP) { case XVID_CSP_RGB555: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_rgb555i :yv12_to_rgb555, interlacing?yv12_to_rgb555i_c:yv12_to_rgb555_c, 2, interlacing); return 0; case XVID_CSP_RGB565: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_rgb565i :yv12_to_rgb565, interlacing?yv12_to_rgb565i_c:yv12_to_rgb565_c, 2, interlacing); return 0; case XVID_CSP_BGR: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_bgri :yv12_to_bgr, interlacing?yv12_to_bgri_c:yv12_to_bgr_c, 3, interlacing); return 0; case XVID_CSP_BGRA: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_bgrai :yv12_to_bgra, interlacing?yv12_to_bgrai_c:yv12_to_bgra_c, 4, interlacing); return 0; case XVID_CSP_ABGR: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_abgri :yv12_to_abgr, interlacing?yv12_to_abgri_c:yv12_to_abgr_c, 4, interlacing); return 0; case XVID_CSP_RGB: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), 
interlacing?yv12_to_rgbi :yv12_to_rgb, interlacing?yv12_to_rgbi_c:yv12_to_rgb_c, 3, interlacing); return 0; case XVID_CSP_RGBA: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_rgbai :yv12_to_rgba, interlacing?yv12_to_rgbai_c:yv12_to_rgba_c, 4, interlacing); return 0; case XVID_CSP_ARGB: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_argbi :yv12_to_argb, interlacing?yv12_to_argbi_c:yv12_to_argb_c, 4, interlacing); return 0; case XVID_CSP_YUY2: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_yuyvi :yv12_to_yuyv, interlacing?yv12_to_yuyvi_c:yv12_to_yuyv_c, 2, interlacing); return 0; case XVID_CSP_YVYU: /* u,v swapped */ safe_packed_conv( dst[0], dst_stride[0], image->y, image->v, image->u, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_yuyvi :yv12_to_yuyv, interlacing?yv12_to_yuyvi_c:yv12_to_yuyv_c, 2, interlacing); return 0; case XVID_CSP_UYVY: safe_packed_conv( dst[0], dst_stride[0], image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP), interlacing?yv12_to_uyvyi :yv12_to_uyvy, interlacing?yv12_to_uyvyi_c:yv12_to_uyvy_c, 2, interlacing); return 0; case XVID_CSP_I420: /* YCbCr == YUV == internal colorspace for MPEG */ yv12_to_yv12(dst[0], dst[0] + dst_stride[0]*height, dst[0] + dst_stride[0]*height + (dst_stride[0]/2)*height2, dst_stride[0], dst_stride[0]/2, image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP)); return 0; case XVID_CSP_YV12: /* YCrCb == YVU == U and V plane swapped */ yv12_to_yv12(dst[0], dst[0] + dst_stride[0]*height, dst[0] + dst_stride[0]*height + (dst_stride[0]/2)*height2, dst_stride[0], dst_stride[0]/2, 
image->y, image->v, image->u, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP)); return 0; case XVID_CSP_PLANAR: /* YCbCr with arbitrary pointers and different strides for Y and UV */ yv12_to_yv12(dst[0], dst[1], dst[2], dst_stride[0], dst_stride[1], /* v: dst_stride[2] not yet supported */ image->y, image->u, image->v, edged_width, edged_width2, width, height, (csp & XVID_CSP_VFLIP)); return 0; case XVID_CSP_INTERNAL : dst[0] = image->y; dst[1] = image->u; dst[2] = image->v; dst_stride[0] = edged_width; dst_stride[1] = edged_width/2; dst_stride[2] = edged_width/2; return 0; case XVID_CSP_NULL: case XVID_CSP_SLICE: return 0; } return -1; } float image_psnr(IMAGE * orig_image, IMAGE * recon_image, uint16_t stride, uint16_t width, uint16_t height) { int32_t diff, x, y, quad = 0; uint8_t *orig = orig_image->y; uint8_t *recon = recon_image->y; float psnr_y; for (y = 0; y < height; y++) { for (x = 0; x < width; x++) { diff = *(orig + x) - *(recon + x); quad += diff * diff; } orig += stride; recon += stride; } psnr_y = (float) quad / (float) (width * height); if (psnr_y) { psnr_y = (float) (255 * 255) / psnr_y; psnr_y = 10 * (float) log10(psnr_y); } else psnr_y = (float) 99.99; return psnr_y; } float sse_to_PSNR(long sse, int pixels) { if (sse==0) return 99.99F; return 48.131F - 10*(float)log10((float)sse/(float)(pixels)); /* log10(255*255)=4.8131 */ } long plane_sse(uint8_t *orig, uint8_t *recon, uint16_t stride, uint16_t width, uint16_t height) { int y, bwidth, bheight; long sse = 0; bwidth = width & (~0x07); bheight = height & (~0x07); /* Compute the 8x8 integer part */ for (y = 0; yy; uint8_t *orig_u = orig_image->u; uint8_t *orig_v = orig_image->v; for (y = 0; y < mb_height; y++) { for (x = 0; x < mb_width; x++) { MACROBLOCK *pMB = &mbs[x + y * mb_width]; uint32_t var4[4]; uint32_t sum = 0, square = 0; /* y-blocks */ for (j = 0; j < 2; j++) { for (i = 0; i < 2; i++) { int lsum = blocksum8(orig_y + ((y<<4) + (j<<3))*stride + (x<<4) + (i<<3), stride, 
sums, squares); int lsquare = (squares[0] + squares[1] + squares[2] + squares[3])<<6; sum += lsum; square += lsquare; var4[0] = (squares[0]<<4) - sums[0]*sums[0]; var4[1] = (squares[1]<<4) - sums[1]*sums[1]; var4[2] = (squares[2]<<4) - sums[2]*sums[2]; var4[3] = (squares[3]<<4) - sums[3]*sums[3]; pMB->rel_var8[j*2 + i] = lsquare - lsum*lsum; if (pMB->rel_var8[j*2 + i]) pMB->rel_var8[j*2 + i] = ((var4[0] + var4[1] + var4[2] + var4[3])<<8) / pMB->rel_var8[j*2 + i]; /* 4*(Var(Di)/Var(D)) */ else pMB->rel_var8[j*2 + i] = 64; } } /* u */ { int lsum = blocksum8(orig_u + (y<<3)*(stride>>1) + (x<<3), stride, sums, squares); int lsquare = (squares[0] + squares[1] + squares[2] + squares[3])<<6; sum += lsum; square += lsquare; var4[0] = (squares[0]<<4) - sums[0]*sums[0]; var4[1] = (squares[1]<<4) - sums[1]*sums[1]; var4[2] = (squares[2]<<4) - sums[2]*sums[2]; var4[3] = (squares[3]<<4) - sums[3]*sums[3]; pMB->rel_var8[4] = lsquare - lsum*lsum; if (pMB->rel_var8[4]) pMB->rel_var8[4] = ((var4[0] + var4[1] + var4[2] + var4[3])<<8) / pMB->rel_var8[4]; /* 4*(Var(Di)/Var(D)) */ else pMB->rel_var8[4] = 64; } /* v */ { int lsum = blocksum8(orig_v + (y<<3)*(stride>>1) + (x<<3), stride, sums, squares); int lsquare = (squares[0] + squares[1] + squares[2] + squares[3])<<6; sum += lsum; square += lsquare; var4[0] = (squares[0]<<4) - sums[0]*sums[0]; var4[1] = (squares[1]<<4) - sums[1]*sums[1]; var4[2] = (squares[2]<<4) - sums[2]*sums[2]; var4[3] = (squares[3]<<4) - sums[3]*sums[3]; pMB->rel_var8[5] = lsquare - lsum*lsum; if (pMB->rel_var8[5]) pMB->rel_var8[5] = ((var4[0] + var4[1] + var4[2] + var4[3])<<8) / pMB->rel_var8[5]; /* 4*(Var(Di)/Var(D)) */ else pMB->rel_var8[5] = 64; } } } } #if 0 #include #include int image_dump_pgm(uint8_t * bmp, uint32_t width, uint32_t height, char * filename) { FILE * f; char hdr[1024]; f = fopen(filename, "wb"); if ( f == NULL) { return -1; } sprintf(hdr, "P5\n#xvid\n%i %i\n255\n", width, height); fwrite(hdr, strlen(hdr), 1, f); fwrite(bmp, width, height, 
f); fclose(f); return 0; } /* dump image+edges to yuv pgm files */ int image_dump(IMAGE * image, uint32_t edged_width, uint32_t edged_height, char * path, int number) { char filename[1024]; sprintf(filename, "%s_%i_%c.pgm", path, number, 'y'); image_dump_pgm( image->y - (EDGE_SIZE * edged_width + EDGE_SIZE), edged_width, edged_height, filename); sprintf(filename, "%s_%i_%c.pgm", path, number, 'u'); image_dump_pgm( image->u - (EDGE_SIZE2 * edged_width / 2 + EDGE_SIZE2), edged_width / 2, edged_height / 2, filename); sprintf(filename, "%s_%i_%c.pgm", path, number, 'v'); image_dump_pgm( image->v - (EDGE_SIZE2 * edged_width / 2 + EDGE_SIZE2), edged_width / 2, edged_height / 2, filename); return 0; } #endif /* dump image to yuvpgm file */ #include <stdio.h> int image_dump_yuvpgm(const IMAGE * image, const uint32_t edged_width, const uint32_t width, const uint32_t height, char *filename) { FILE *f; char hdr[1024]; uint32_t i; uint8_t *bmp1; uint8_t *bmp2; f = fopen(filename, "wb"); if (f == NULL) { return -1; } sprintf(hdr, "P5\n#xvid\n%i %i\n255\n", width, (3 * height) / 2); fwrite(hdr, strlen(hdr), 1, f); bmp1 = image->y; for (i = 0; i < height; i++) { fwrite(bmp1, width, 1, f); bmp1 += edged_width; } bmp1 = image->u; bmp2 = image->v; for (i = 0; i < height / 2; i++) { fwrite(bmp1, width / 2, 1, f); fwrite(bmp2, width / 2, 1, f); bmp1 += edged_width / 2; bmp2 += edged_width / 2; } fclose(f); return 0; } float image_mad(const IMAGE * img1, const IMAGE * img2, uint32_t stride, uint32_t width, uint32_t height) { const uint32_t stride2 = stride / 2; const uint32_t width2 = width / 2; const uint32_t height2 = height / 2; uint32_t x, y; uint32_t sum = 0; for (y = 0; y < height; y++) for (x = 0; x < width; x++) sum += abs(img1->y[x + y * stride] - img2->y[x + y * stride]); for (y = 0; y < height2; y++) for (x = 0; x < width2; x++) sum += abs(img1->u[x + y * stride2] - img2->u[x + y * stride2]); for (y = 0; y < height2; y++) for (x = 0; x < width2; x++) sum += abs(img1->v[x + y * 
stride2] - img2->v[x + y * stride2]); return (float) sum / (width * height * 3 / 2); } void output_slice(IMAGE * cur, int stride, int width, xvid_image_t* out_frm, int mbx, int mby,int mbl) { uint8_t *dY,*dU,*dV,*sY,*sU,*sV; int stride2 = stride >> 1; int w = mbl << 4, w2,i; if(w > width) w = width; w2 = w >> 1; dY = (uint8_t*)out_frm->plane[0] + (mby << 4) * out_frm->stride[0] + (mbx << 4); dU = (uint8_t*)out_frm->plane[1] + (mby << 3) * out_frm->stride[1] + (mbx << 3); dV = (uint8_t*)out_frm->plane[2] + (mby << 3) * out_frm->stride[2] + (mbx << 3); sY = cur->y + (mby << 4) * stride + (mbx << 4); sU = cur->u + (mby << 3) * stride2 + (mbx << 3); sV = cur->v + (mby << 3) * stride2 + (mbx << 3); for(i = 0 ; i < 16 ; i++) { memcpy(dY,sY,w); dY += out_frm->stride[0]; sY += stride; } for(i = 0 ; i < 8 ; i++) { memcpy(dU,sU,w2); dU += out_frm->stride[1]; sU += stride2; } for(i = 0 ; i < 8 ; i++) { memcpy(dV,sV,w2); dV += out_frm->stride[2]; sV += stride2; } } void image_clear(IMAGE * img, int width, int height, int edged_width, int y, int u, int v) { uint8_t * p; int i; p = img->y; for (i = 0; i < height; i++) { memset(p, y, width); p += edged_width; } p = img->u; for (i = 0; i < height/2; i++) { memset(p, u, width/2); p += edged_width/2; } p = img->v; for (i = 0; i < height/2; i++) { memset(p, v, width/2); p += edged_width/2; } } /****************************************************************************/ static void (*deintl_core)(uint8_t *, int width, int height, const int stride) = 0; extern void xvid_deinterlace_sse(uint8_t *, int width, int height, const int stride); #define CLIP_255(x) ( ((x)&~255) ? 
((-(x)) >> (8*sizeof((x))-1))&0xff : (x) ) static void deinterlace_c(uint8_t *pix, int width, int height, const int bps) { pix += bps; while(width-->0) { int p1 = pix[-bps]; int p2 = pix[0]; int p0 = p2; int j = (height>>1) - 1; int V; unsigned char *P = pix++; while(j-->0) { const int p3 = P[ bps]; const int p4 = P[2*bps]; V = ((p1+p3+1)>>1) + ((p2 - ((p0+p4+1)>>1)) >> 2); P[0] = CLIP_255( V ); p0 = p2; p1 = p3; p2 = p4; P += 2*bps; } V = ((p1+p1+1)>>1) + ((p2 - ((p0+p2+1)>>1)) >> 2); P[0] = CLIP_255( V ); } } #undef CLIP_255 int xvid_image_deinterlace(xvid_image_t* img, int width, int height, int bottom_first) { if (height&1) return 0; if (img->csp!=XVID_CSP_PLANAR && img->csp!=XVID_CSP_I420 && img->csp!=XVID_CSP_YV12) return 0; /* not yet supported */ if (deintl_core==0) { deintl_core = deinterlace_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) { int cpu_flags = check_cpu_features(); if (cpu_flags & XVID_CPU_MMX) deintl_core = xvid_deinterlace_sse; } #endif } if (!bottom_first) { deintl_core(img->plane[0], width, height, img->stride[0]); deintl_core(img->plane[1], width>>1, height>>1, img->stride[1]); deintl_core(img->plane[2], width>>1, height>>1, img->stride[2]); } else { deintl_core((uint8_t *)img->plane[0] + ( height -1)*img->stride[0], width, height, -img->stride[0]); deintl_core((uint8_t *)img->plane[1] + ((height>>1)-1)*img->stride[1], width>>1, height>>1, -img->stride[1]); deintl_core((uint8_t *)img->plane[2] + ((height>>1)-1)*img->stride[2], width>>1, height>>1, -img->stride[2]); } emms(); return 1; } xvidcore/src/image/colorspace.c0000664000076500007650000004415311564705453017653 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Colorspace conversion functions - * * Copyright(C) 2001-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free 
Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: colorspace.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <string.h> /* memcpy */ #include "../global.h" #include "colorspace.h" /* function pointers */ /* input */ packedFuncPtr rgb555_to_yv12; packedFuncPtr rgb565_to_yv12; packedFuncPtr rgb_to_yv12; packedFuncPtr bgr_to_yv12; packedFuncPtr bgra_to_yv12; packedFuncPtr abgr_to_yv12; packedFuncPtr rgba_to_yv12; packedFuncPtr argb_to_yv12; packedFuncPtr yuyv_to_yv12; packedFuncPtr uyvy_to_yv12; packedFuncPtr rgb555i_to_yv12; packedFuncPtr rgb565i_to_yv12; packedFuncPtr rgbi_to_yv12; packedFuncPtr bgri_to_yv12; packedFuncPtr bgrai_to_yv12; packedFuncPtr abgri_to_yv12; packedFuncPtr rgbai_to_yv12; packedFuncPtr argbi_to_yv12; packedFuncPtr yuyvi_to_yv12; packedFuncPtr uyvyi_to_yv12; /* output */ packedFuncPtr yv12_to_rgb555; packedFuncPtr yv12_to_rgb565; packedFuncPtr yv12_to_bgr; packedFuncPtr yv12_to_bgra; packedFuncPtr yv12_to_abgr; packedFuncPtr yv12_to_rgb; packedFuncPtr yv12_to_rgba; packedFuncPtr yv12_to_argb; packedFuncPtr yv12_to_yuyv; packedFuncPtr yv12_to_uyvy; packedFuncPtr yv12_to_rgb555i; packedFuncPtr yv12_to_rgb565i; packedFuncPtr yv12_to_bgri; packedFuncPtr yv12_to_bgrai; packedFuncPtr yv12_to_abgri; packedFuncPtr yv12_to_rgbi; packedFuncPtr yv12_to_rgbai; packedFuncPtr yv12_to_argbi; packedFuncPtr yv12_to_yuyvi; packedFuncPtr yv12_to_uyvyi; planarFuncPtr yv12_to_yv12; static int32_t 
RGB_Y_tab[256]; static int32_t B_U_tab[256]; static int32_t G_U_tab[256]; static int32_t G_V_tab[256]; static int32_t R_V_tab[256]; /********** generic colorspace macro **********/ #define MAKE_COLORSPACE(NAME,SIZE,PIXELS,VPIXELS,FUNC,C1,C2,C3,C4) \ void \ NAME(uint8_t * x_ptr, int x_stride, \ uint8_t * y_ptr, uint8_t * u_ptr, uint8_t * v_ptr, \ int y_stride, int uv_stride, \ int width, int height, int vflip) \ { \ int fixed_width = (width + 1) & ~1; \ int x_dif = x_stride - (SIZE)*fixed_width; \ int y_dif = y_stride - fixed_width; \ int uv_dif = uv_stride - (fixed_width / 2); \ int x, y; \ if (vflip) { \ x_ptr += (height - 1) * x_stride; \ x_dif = -(SIZE)*fixed_width - x_stride; \ x_stride = -x_stride; \ } \ for (y = 0; y < height; y+=(VPIXELS)) { \ FUNC##_ROW(SIZE,C1,C2,C3,C4); \ for (x = 0; x < fixed_width; x+=(PIXELS)) { \ FUNC(SIZE,C1,C2,C3,C4); \ x_ptr += (PIXELS)*(SIZE); \ y_ptr += (PIXELS); \ u_ptr += (PIXELS)/2; \ v_ptr += (PIXELS)/2; \ } \ x_ptr += x_dif + (VPIXELS-1)*x_stride; \ y_ptr += y_dif + (VPIXELS-1)*y_stride; \ u_ptr += uv_dif + ((VPIXELS/2)-1)*uv_stride; \ v_ptr += uv_dif + ((VPIXELS/2)-1)*uv_stride; \ } \ } /********** colorspace input (xxx_to_yv12) functions **********/ /* rgb -> yuv defs: the following constants are the "official spec" from "Video Demystified" (ISBN 1-878707-09-4). rgb<->yuv _is_ lossy, since most programs do the conversion differently. SCALEBITS/FIX taken from ffmpeg */ #define Y_R_IN 0.257 #define Y_G_IN 0.504 #define Y_B_IN 0.098 #define Y_ADD_IN 16 #define U_R_IN 0.148 #define U_G_IN 0.291 #define U_B_IN 0.439 #define U_ADD_IN 128 #define V_R_IN 0.439 #define V_G_IN 0.368 #define V_B_IN 0.071 #define V_ADD_IN 128 #define SCALEBITS_IN 13 #define FIX_IN(x) ((uint16_t) ((x) * (1L<<SCALEBITS_IN) + 0.5)) #define FIX_ROUND (1L<<(SCALEBITS_IN-1)) /* rgb16/rgb16i input */ #define MK_RGB555_B(RGB) ((RGB) << 3) & 0xf8 #define MK_RGB555_G(RGB) ((RGB) >> 2) & 0xf8 #define MK_RGB555_R(RGB) ((RGB) >> 7) & 0xf8 #define MK_RGB565_B(RGB) ((RGB) << 3) & 0xf8 #define MK_RGB565_G(RGB) ((RGB) >> 3) & 0xfc #define MK_RGB565_R(RGB) ((RGB) >> 8) & 0xf8 #define READ_RGB16_Y(ROW, UVID, C1,C2,C3,C4) \ rgb = *(uint16_t *) 
(x_ptr + ((ROW)*x_stride) + 0); \ b##UVID += b = C1##_B(rgb); \ g##UVID += g = C1##_G(rgb); \ r##UVID += r = C1##_R(rgb); \ y_ptr[(ROW)*y_stride+0] = \ (uint8_t) ((FIX_IN(Y_R_IN) * r + FIX_IN(Y_G_IN) * g + \ FIX_IN(Y_B_IN) * b + FIX_ROUND) >> SCALEBITS_IN) + Y_ADD_IN; \ rgb = *(uint16_t *) (x_ptr + ((ROW)*x_stride) + 2); \ b##UVID += b = C1##_B(rgb); \ g##UVID += g = C1##_G(rgb); \ r##UVID += r = C1##_R(rgb); \ y_ptr[(ROW)*y_stride+1] = \ (uint8_t) ((FIX_IN(Y_R_IN) * r + FIX_IN(Y_G_IN) * g + \ FIX_IN(Y_B_IN) * b + FIX_ROUND) >> SCALEBITS_IN) + Y_ADD_IN; #define READ_RGB16_UV(UV_ROW,UVID) \ u_ptr[(UV_ROW)*uv_stride] = \ (uint8_t) ((-FIX_IN(U_R_IN) * r##UVID - FIX_IN(U_G_IN) * g##UVID + \ FIX_IN(U_B_IN) * b##UVID + 4*FIX_ROUND) >> (SCALEBITS_IN + 2)) + U_ADD_IN; \ v_ptr[(UV_ROW)*uv_stride] = \ (uint8_t) ((FIX_IN(V_R_IN) * r##UVID - FIX_IN(V_G_IN) * g##UVID - \ FIX_IN(V_B_IN) * b##UVID + 4*FIX_ROUND) >> (SCALEBITS_IN + 2)) + V_ADD_IN; #define RGB16_TO_YV12_ROW(SIZE,C1,C2,C3,C4) \ /* nothing */ #define RGB16_TO_YV12(SIZE,C1,C2,C3,C4) \ uint32_t rgb, r, g, b, r0, g0, b0; \ r0 = g0 = b0 = 0; \ READ_RGB16_Y (0, 0, C1,C2,C3,C4) \ READ_RGB16_Y (1, 0, C1,C2,C3,C4) \ READ_RGB16_UV(0, 0) #define RGB16I_TO_YV12_ROW(SIZE,C1,C2,C3,C4) \ /* nothing */ #define RGB16I_TO_YV12(SIZE,C1,C2,C3,C4) \ uint32_t rgb, r, g, b, r0, g0, b0, r1, g1, b1; \ r0 = g0 = b0 = r1 = g1 = b1 = 0; \ READ_RGB16_Y (0, 0, C1,C2,C3,C4) \ READ_RGB16_Y (1, 1, C1,C2,C3,C4) \ READ_RGB16_Y (2, 0, C1,C2,C3,C4) \ READ_RGB16_Y (3, 1, C1,C2,C3,C4) \ READ_RGB16_UV(0, 0) \ READ_RGB16_UV(1, 1) /* rgb/rgbi input */ #define READ_RGB_Y(SIZE, ROW, UVID, C1,C2,C3,C4) \ r##UVID += r = x_ptr[(ROW)*x_stride+(C1)]; \ g##UVID += g = x_ptr[(ROW)*x_stride+(C2)]; \ b##UVID += b = x_ptr[(ROW)*x_stride+(C3)]; \ y_ptr[(ROW)*y_stride+0] = \ (uint8_t) ((FIX_IN(Y_R_IN) * r + FIX_IN(Y_G_IN) * g + \ FIX_IN(Y_B_IN) * b + FIX_ROUND) >> SCALEBITS_IN) + Y_ADD_IN; \ r##UVID += r = x_ptr[(ROW)*x_stride+(SIZE)+(C1)]; \ g##UVID += g = 
x_ptr[(ROW)*x_stride+(SIZE)+(C2)]; \ b##UVID += b = x_ptr[(ROW)*x_stride+(SIZE)+(C3)]; \ y_ptr[(ROW)*y_stride+1] = \ (uint8_t) ((FIX_IN(Y_R_IN) * r + FIX_IN(Y_G_IN) * g + \ FIX_IN(Y_B_IN) * b + FIX_ROUND) >> SCALEBITS_IN) + Y_ADD_IN; #define READ_RGB_UV(UV_ROW,UVID) \ u_ptr[(UV_ROW)*uv_stride] = \ (uint8_t) ((-FIX_IN(U_R_IN) * r##UVID - FIX_IN(U_G_IN) * g##UVID + \ FIX_IN(U_B_IN) * b##UVID + 4*FIX_ROUND) >> (SCALEBITS_IN + 2)) + U_ADD_IN; \ v_ptr[(UV_ROW)*uv_stride] = \ (uint8_t) ((FIX_IN(V_R_IN) * r##UVID - FIX_IN(V_G_IN) * g##UVID - \ FIX_IN(V_B_IN) * b##UVID + 4*FIX_ROUND) >> (SCALEBITS_IN + 2)) + V_ADD_IN; #define RGB_TO_YV12_ROW(SIZE,C1,C2,C3,C4) \ /* nothing */ #define RGB_TO_YV12(SIZE,C1,C2,C3,C4) \ uint32_t r, g, b, r0, g0, b0; \ r0 = g0 = b0 = 0; \ READ_RGB_Y(SIZE, 0, 0, C1,C2,C3,C4) \ READ_RGB_Y(SIZE, 1, 0, C1,C2,C3,C4) \ READ_RGB_UV( 0, 0) #define RGBI_TO_YV12_ROW(SIZE,C1,C2,C3,C4) \ /* nothing */ #define RGBI_TO_YV12(SIZE,C1,C2,C3,C4) \ uint32_t r, g, b, r0, g0, b0, r1, g1, b1; \ r0 = g0 = b0 = r1 = g1 = b1 = 0; \ READ_RGB_Y(SIZE, 0, 0, C1,C2,C3,C4) \ READ_RGB_Y(SIZE, 1, 1, C1,C2,C3,C4) \ READ_RGB_Y(SIZE, 2, 0, C1,C2,C3,C4) \ READ_RGB_Y(SIZE, 3, 1, C1,C2,C3,C4) \ READ_RGB_UV( 0, 0) \ READ_RGB_UV( 1, 1) /* yuyv/yuyvi input */ #define READ_YUYV_Y(ROW,C1,C2,C3,C4) \ y_ptr[(ROW)*y_stride+0] = x_ptr[(ROW)*x_stride+(C1)]; \ y_ptr[(ROW)*y_stride+1] = x_ptr[(ROW)*x_stride+(C3)]; #define READ_YUYV_UV(UV_ROW,ROW1,ROW2,C1,C2,C3,C4) \ u_ptr[(UV_ROW)*uv_stride] = (x_ptr[(ROW1)*x_stride+(C2)] + x_ptr[(ROW2)*x_stride+(C2)] + 1) / 2; \ v_ptr[(UV_ROW)*uv_stride] = (x_ptr[(ROW1)*x_stride+(C4)] + x_ptr[(ROW2)*x_stride+(C4)] + 1) / 2; #define YUYV_TO_YV12_ROW(SIZE,C1,C2,C3,C4) \ /* nothing */ #define YUYV_TO_YV12(SIZE,C1,C2,C3,C4) \ READ_YUYV_Y (0, C1,C2,C3,C4) \ READ_YUYV_Y (1, C1,C2,C3,C4) \ READ_YUYV_UV(0, 0,1, C1,C2,C3,C4) #define YUYVI_TO_YV12_ROW(SIZE,C1,C2,C3,C4) \ /* nothing */ #define YUYVI_TO_YV12(SIZE,C1,C2,C3,C4) \ READ_YUYV_Y (0, C1,C2,C3,C4) \ READ_YUYV_Y (1, 
C1,C2,C3,C4) \ READ_YUYV_Y (2, C1,C2,C3,C4) \ READ_YUYV_Y (3, C1,C2,C3,C4) \ READ_YUYV_UV(0, 0,2, C1,C2,C3,C4) \ READ_YUYV_UV(1, 1,3, C1,C2,C3,C4) MAKE_COLORSPACE(rgb555_to_yv12_c, 2,2,2, RGB16_TO_YV12, MK_RGB555, 0,0,0) MAKE_COLORSPACE(rgb565_to_yv12_c, 2,2,2, RGB16_TO_YV12, MK_RGB565, 0,0,0) MAKE_COLORSPACE(bgr_to_yv12_c, 3,2,2, RGB_TO_YV12, 2,1,0, 0) MAKE_COLORSPACE(bgra_to_yv12_c, 4,2,2, RGB_TO_YV12, 2,1,0, 0) MAKE_COLORSPACE(rgb_to_yv12_c, 3,2,2, RGB_TO_YV12, 0,1,2, 0) MAKE_COLORSPACE(abgr_to_yv12_c, 4,2,2, RGB_TO_YV12, 3,2,1, 0) MAKE_COLORSPACE(rgba_to_yv12_c, 4,2,2, RGB_TO_YV12, 0,1,2, 0) MAKE_COLORSPACE(argb_to_yv12_c, 4,2,2, RGB_TO_YV12, 1,2,3, 0) MAKE_COLORSPACE(yuyv_to_yv12_c, 2,2,2, YUYV_TO_YV12, 0,1,2,3) MAKE_COLORSPACE(uyvy_to_yv12_c, 2,2,2, YUYV_TO_YV12, 1,0,3,2) MAKE_COLORSPACE(rgb555i_to_yv12_c, 2,2,4, RGB16I_TO_YV12, MK_RGB555, 0,0,0) MAKE_COLORSPACE(rgb565i_to_yv12_c, 2,2,4, RGB16I_TO_YV12, MK_RGB565, 0,0,0) MAKE_COLORSPACE(bgri_to_yv12_c, 3,2,4, RGBI_TO_YV12, 2,1,0, 0) MAKE_COLORSPACE(bgrai_to_yv12_c, 4,2,4, RGBI_TO_YV12, 2,1,0, 0) MAKE_COLORSPACE(abgri_to_yv12_c, 4,2,4, RGBI_TO_YV12, 3,2,1, 0) MAKE_COLORSPACE(rgbi_to_yv12_c, 3,2,4, RGBI_TO_YV12, 0,1,2, 0) MAKE_COLORSPACE(rgbai_to_yv12_c, 4,2,4, RGBI_TO_YV12, 0,1,2, 0) MAKE_COLORSPACE(argbi_to_yv12_c, 4,2,4, RGBI_TO_YV12, 1,2,3, 0) MAKE_COLORSPACE(yuyvi_to_yv12_c, 2,2,4, YUYVI_TO_YV12, 0,1,2,3) MAKE_COLORSPACE(uyvyi_to_yv12_c, 2,2,4, YUYVI_TO_YV12, 1,0,3,2) /********** colorspace output (yv12_to_xxx) functions **********/ /* yuv -> rgb def's */ #define RGB_Y_OUT 1.164 #define B_U_OUT 2.018 #define Y_ADD_OUT 16 #define G_U_OUT 0.391 #define G_V_OUT 0.813 #define U_ADD_OUT 128 #define R_V_OUT 1.596 #define V_ADD_OUT 128 #define SCALEBITS_OUT 13 #define FIX_OUT(x) ((uint16_t) ((x) * (1L<<SCALEBITS_OUT) + 0.5)) /* rgb16/rgb16i output */ #define MK_RGB555(R,G,B) \ ((MAX(0,MIN(255, R)) << 7) & 0x7c00) | \ ((MAX(0,MIN(255, G)) << 2) & 0x03e0) | \ ((MAX(0,MIN(255, B)) >> 3) & 0x001f) #define MK_RGB565(R,G,B) \ ((MAX(0,MIN(255, R)) << 8) & 0xf800) | \ ((MAX(0,MIN(255, G)) << 3) & 0x07e0) | \ ((MAX(0,MIN(255, B)) >> 3) & 0x001f) #define WRITE_RGB16(ROW,UV_ROW,C1) \ rgb_y = 
RGB_Y_tab[ y_ptr[y_stride*(ROW) + 0] ]; \ b[ROW] = (b[ROW] & 0x7) + ((rgb_y + b_u##UV_ROW) >> SCALEBITS_OUT); \ g[ROW] = (g[ROW] & 0x7) + ((rgb_y - g_uv##UV_ROW) >> SCALEBITS_OUT); \ r[ROW] = (r[ROW] & 0x7) + ((rgb_y + r_v##UV_ROW) >> SCALEBITS_OUT); \ *(uint16_t *) (x_ptr+((ROW)*x_stride)+0) = C1(r[ROW], g[ROW], b[ROW]); \ rgb_y = RGB_Y_tab[ y_ptr[y_stride*(ROW) + 1] ]; \ b[ROW] = (b[ROW] & 0x7) + ((rgb_y + b_u##UV_ROW) >> SCALEBITS_OUT); \ g[ROW] = (g[ROW] & 0x7) + ((rgb_y - g_uv##UV_ROW) >> SCALEBITS_OUT); \ r[ROW] = (r[ROW] & 0x7) + ((rgb_y + r_v##UV_ROW) >> SCALEBITS_OUT); \ *(uint16_t *) (x_ptr+((ROW)*x_stride)+2) = C1(r[ROW], g[ROW], b[ROW]); #define YV12_TO_RGB16_ROW(SIZE,C1,C2,C3,C4) \ int r[2], g[2], b[2]; \ r[0] = r[1] = g[0] = g[1] = b[0] = b[1] = 0; #define YV12_TO_RGB16(SIZE,C1,C2,C3,C4) \ int rgb_y; \ int b_u0 = B_U_tab[ u_ptr[0] ]; \ int g_uv0 = G_U_tab[ u_ptr[0] ] + G_V_tab[ v_ptr[0] ]; \ int r_v0 = R_V_tab[ v_ptr[0] ]; \ WRITE_RGB16(0, 0, C1) \ WRITE_RGB16(1, 0, C1) #define YV12_TO_RGB16I_ROW(SIZE,C1,C2,C3,C4) \ int r[4], g[4], b[4]; \ r[0] = r[1] = r[2] = r[3] = 0; \ g[0] = g[1] = g[2] = g[3] = 0; \ b[0] = b[1] = b[2] = b[3] = 0; #define YV12_TO_RGB16I(SIZE,C1,C2,C3,C4) \ int rgb_y; \ int b_u0 = B_U_tab[ u_ptr[0] ]; \ int g_uv0 = G_U_tab[ u_ptr[0] ] + G_V_tab[ v_ptr[0] ]; \ int r_v0 = R_V_tab[ v_ptr[0] ]; \ int b_u1 = B_U_tab[ u_ptr[uv_stride] ]; \ int g_uv1 = G_U_tab[ u_ptr[uv_stride] ] + G_V_tab[ v_ptr[uv_stride] ]; \ int r_v1 = R_V_tab[ v_ptr[uv_stride] ]; \ WRITE_RGB16(0, 0, C1) \ WRITE_RGB16(1, 1, C1) \ WRITE_RGB16(2, 0, C1) \ WRITE_RGB16(3, 1, C1) \ /* rgb/rgbi output */ #define WRITE_RGB(SIZE,ROW,UV_ROW,C1,C2,C3,C4) \ rgb_y = RGB_Y_tab[ y_ptr[(ROW)*y_stride + 0] ]; \ x_ptr[(ROW)*x_stride+(C3)] = MAX(0, MIN(255, (rgb_y + b_u##UV_ROW) >> SCALEBITS_OUT)); \ x_ptr[(ROW)*x_stride+(C2)] = MAX(0, MIN(255, (rgb_y - g_uv##UV_ROW) >> SCALEBITS_OUT)); \ x_ptr[(ROW)*x_stride+(C1)] = MAX(0, MIN(255, (rgb_y + r_v##UV_ROW) >> SCALEBITS_OUT)); \ if 
((SIZE)>3) x_ptr[(ROW)*x_stride+(C4)] = 0; \ rgb_y = RGB_Y_tab[ y_ptr[(ROW)*y_stride + 1] ]; \ x_ptr[(ROW)*x_stride+(SIZE)+(C3)] = MAX(0, MIN(255, (rgb_y + b_u##UV_ROW) >> SCALEBITS_OUT)); \ x_ptr[(ROW)*x_stride+(SIZE)+(C2)] = MAX(0, MIN(255, (rgb_y - g_uv##UV_ROW) >> SCALEBITS_OUT)); \ x_ptr[(ROW)*x_stride+(SIZE)+(C1)] = MAX(0, MIN(255, (rgb_y + r_v##UV_ROW) >> SCALEBITS_OUT)); \ if ((SIZE)>3) x_ptr[(ROW)*x_stride+(SIZE)+(C4)] = 0; #define YV12_TO_RGB_ROW(SIZE,C1,C2,C3,C4) /* nothing */ #define YV12_TO_RGB(SIZE,C1,C2,C3,C4) \ int rgb_y; \ int b_u0 = B_U_tab[ u_ptr[0] ]; \ int g_uv0 = G_U_tab[ u_ptr[0] ] + G_V_tab[ v_ptr[0] ]; \ int r_v0 = R_V_tab[ v_ptr[0] ]; \ WRITE_RGB(SIZE, 0, 0, C1,C2,C3,C4) \ WRITE_RGB(SIZE, 1, 0, C1,C2,C3,C4) #define YV12_TO_RGBI_ROW(SIZE,C1,C2,C3,C4) /* nothing */ #define YV12_TO_RGBI(SIZE,C1,C2,C3,C4) \ int rgb_y; \ int b_u0 = B_U_tab[ u_ptr[0] ]; \ int g_uv0 = G_U_tab[ u_ptr[0] ] + G_V_tab[ v_ptr[0] ]; \ int r_v0 = R_V_tab[ v_ptr[0] ]; \ int b_u1 = B_U_tab[ u_ptr[uv_stride] ]; \ int g_uv1 = G_U_tab[ u_ptr[uv_stride] ] + G_V_tab[ v_ptr[uv_stride] ]; \ int r_v1 = R_V_tab[ v_ptr[uv_stride] ]; \ WRITE_RGB(SIZE, 0, 0, C1,C2,C3,C4) \ WRITE_RGB(SIZE, 1, 1, C1,C2,C3,C4) \ WRITE_RGB(SIZE, 2, 0, C1,C2,C3,C4) \ WRITE_RGB(SIZE, 3, 1, C1,C2,C3,C4) /* yuyv/yuyvi output */ #define WRITE_YUYV(ROW,UV_ROW,C1,C2,C3,C4) \ x_ptr[(ROW)*x_stride+(C1)] = y_ptr[ (ROW)*y_stride +0]; \ x_ptr[(ROW)*x_stride+(C2)] = u_ptr[(UV_ROW)*uv_stride+0]; \ x_ptr[(ROW)*x_stride+(C3)] = y_ptr[ (ROW)*y_stride +1]; \ x_ptr[(ROW)*x_stride+(C4)] = v_ptr[(UV_ROW)*uv_stride+0]; \ #define YV12_TO_YUYV_ROW(SIZE,C1,C2,C3,C4) /* nothing */ #define YV12_TO_YUYV(SIZE,C1,C2,C3,C4) \ WRITE_YUYV(0, 0, C1,C2,C3,C4) \ WRITE_YUYV(1, 0, C1,C2,C3,C4) #define YV12_TO_YUYVI_ROW(SIZE,C1,C2,C3,C4) /* nothing */ #define YV12_TO_YUYVI(SIZE,C1,C2,C3,C4) \ WRITE_YUYV(0, 0, C1,C2,C3,C4) \ WRITE_YUYV(1, 1, C1,C2,C3,C4) \ WRITE_YUYV(2, 0, C1,C2,C3,C4) \ WRITE_YUYV(3, 1, C1,C2,C3,C4) 
MAKE_COLORSPACE(yv12_to_rgb555_c, 2,2,2, YV12_TO_RGB16, MK_RGB555, 0,0,0) MAKE_COLORSPACE(yv12_to_rgb565_c, 2,2,2, YV12_TO_RGB16, MK_RGB565, 0,0,0) MAKE_COLORSPACE(yv12_to_bgr_c, 3,2,2, YV12_TO_RGB, 2,1,0,0) MAKE_COLORSPACE(yv12_to_bgra_c, 4,2,2, YV12_TO_RGB, 2,1,0,3) MAKE_COLORSPACE(yv12_to_abgr_c, 4,2,2, YV12_TO_RGB, 3,2,1,0) MAKE_COLORSPACE(yv12_to_rgb_c, 3,2,2, YV12_TO_RGB, 0,1,2,0) MAKE_COLORSPACE(yv12_to_rgba_c, 4,2,2, YV12_TO_RGB, 0,1,2,3) MAKE_COLORSPACE(yv12_to_argb_c, 4,2,2, YV12_TO_RGB, 1,2,3,0) MAKE_COLORSPACE(yv12_to_yuyv_c, 2,2,2, YV12_TO_YUYV, 0,1,2,3) MAKE_COLORSPACE(yv12_to_uyvy_c, 2,2,2, YV12_TO_YUYV, 1,0,3,2) MAKE_COLORSPACE(yv12_to_rgb555i_c, 2,2,4, YV12_TO_RGB16I, MK_RGB555, 0,0,0) MAKE_COLORSPACE(yv12_to_rgb565i_c, 2,2,4, YV12_TO_RGB16I, MK_RGB565, 0,0,0) MAKE_COLORSPACE(yv12_to_bgri_c, 3,2,4, YV12_TO_RGBI, 2,1,0, 0) MAKE_COLORSPACE(yv12_to_bgrai_c, 4,2,4, YV12_TO_RGBI, 2,1,0,3) MAKE_COLORSPACE(yv12_to_abgri_c, 4,2,4, YV12_TO_RGBI, 3,2,1,0) MAKE_COLORSPACE(yv12_to_rgbi_c, 3,2,4, YV12_TO_RGBI, 0,1,2,0) MAKE_COLORSPACE(yv12_to_rgbai_c, 4,2,4, YV12_TO_RGBI, 0,1,2,3) MAKE_COLORSPACE(yv12_to_argbi_c, 4,2,4, YV12_TO_RGBI, 1,2,3,0) MAKE_COLORSPACE(yv12_to_yuyvi_c, 2,2,4, YV12_TO_YUYVI, 0,1,2,3) MAKE_COLORSPACE(yv12_to_uyvyi_c, 2,2,4, YV12_TO_YUYVI, 1,0,3,2) /* yv12 to yv12 copy function */ void yv12_to_yv12_c(uint8_t * y_dst, uint8_t * u_dst, uint8_t * v_dst, int y_dst_stride, int uv_dst_stride, uint8_t * y_src, uint8_t * u_src, uint8_t * v_src, int y_src_stride, int uv_src_stride, int width, int height, int vflip) { int width2 = width / 2; int height2 = height / 2; int y; const int with_uv = (u_src!=0 && v_src!=0); if (vflip) { y_src += (height - 1) * y_src_stride; if (with_uv) { u_src += (height2 - 1) * uv_src_stride; v_src += (height2 - 1) * uv_src_stride; } y_src_stride = -y_src_stride; uv_src_stride = -uv_src_stride; } for (y = height; y; y--) { memcpy(y_dst, y_src, width); y_src += y_src_stride; y_dst += y_dst_stride; } if (with_uv) { for (y = 
height2; y; y--) { memcpy(u_dst, u_src, width2); memcpy(v_dst, v_src, width2); u_src += uv_src_stride; u_dst += uv_dst_stride; v_src += uv_src_stride; v_dst += uv_dst_stride; } } else { for (y = height2; y; y--) { memset(u_dst, 0x80, width2); memset(v_dst, 0x80, width2); u_dst += uv_dst_stride; v_dst += uv_dst_stride; } } } /* initialize rgb lookup tables */ void colorspace_init(void) { int32_t i; for (i = 0; i < 256; i++) { RGB_Y_tab[i] = FIX_OUT(RGB_Y_OUT) * (i - Y_ADD_OUT); B_U_tab[i] = FIX_OUT(B_U_OUT) * (i - U_ADD_OUT); G_U_tab[i] = FIX_OUT(G_U_OUT) * (i - U_ADD_OUT); G_V_tab[i] = FIX_OUT(G_V_OUT) * (i - V_ADD_OUT); R_V_tab[i] = FIX_OUT(R_V_OUT) * (i - V_ADD_OUT); } } xvidcore/src/image/postprocessing.c0000664000076500007650000003476011564705453020606 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Postprocessing functions - * * Copyright(C) 2003-2010 Michael Militzer * 2004 Marc Fauconneau * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: postprocessing.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdlib.h> #include <string.h> #include <math.h> #include "../portab.h" #include "../global.h" #include "image.h" #include "../utils/emms.h" #include "postprocessing.h" /* function pointers */ IMAGEBRIGHTNESS_PTR image_brightness; /* Some useful (and fast) macros Note that the MIN/MAX macros assume signed shift - if your compiler doesn't do signed shifts, use the default MIN/MAX macros from global.h */ #define FAST_MAX(x,y) ((x) - ((((x) - (y))>>(32 - 1)) & ((x) - (y)))) #define FAST_MIN(x,y) ((x) + ((((y) - (x))>>(32 - 1)) & ((y) - (x)))) #define FAST_ABS(x) ((((int)(x)) >> 31) ^ ((int)(x))) - (((int)(x)) >> 31) #define ABS(X) (((X)>0)?(X):-(X)) void init_postproc(XVID_POSTPROC *tbls) { init_deblock(tbls); init_noise(tbls); } void stripe_deblock_h(SMPDeblock *h) { const int stride = h->stride; const int stride2 = stride /2; int i,j; int quant; /* luma: j,i in block units */ if ((h->flags & XVID_DEBLOCKY)) { int dering = h->flags & XVID_DERINGY; for (j = 1; j < h->stop_y; j++) /* horizontal luma deblocking */ for (i = h->start_x; i < h->stop_x; i++) { quant = h->mbs[(j+0)/2*h->mb_stride + (i/2)].quant; deblock8x8_h(h->tbls, h->img->y + j*8*stride + i*8, stride, quant, dering); } } /* chroma */ if ((h->flags & XVID_DEBLOCKUV)) { int dering = h->flags & XVID_DERINGUV; for (j = 1; j < h->stop_y/2; j++) /* horizontal deblocking */ for (i = h->start_x/2; i < h->stop_x/2; i++) { quant = h->mbs[(j+0)*h->mb_stride + i].quant; deblock8x8_h(h->tbls, h->img->u + j*8*stride2 + i*8, stride2, quant, dering); deblock8x8_h(h->tbls, h->img->v + j*8*stride2 + i*8, stride2, quant, dering); } } } void stripe_deblock_v(SMPDeblock *h) { const int stride = h->stride; 
const int stride2 = stride /2; int i,j; int quant; /* luma: j,i in block units */ if ((h->flags & XVID_DEBLOCKY)) { int dering = h->flags & XVID_DERINGY; for (j = h->start_y; j < h->stop_y; j++) /* vertical deblocking */ for (i = 1; i < h->stop_x; i++) { quant = h->mbs[(j+0)/2*h->mb_stride + (i/2)].quant; deblock8x8_v(h->tbls, h->img->y + j*8*stride + i*8, stride, quant, dering); } } /* chroma */ if ((h->flags & XVID_DEBLOCKUV)) { int dering = h->flags & XVID_DERINGUV; for (j = h->start_y/2; j < h->stop_y/2; j++) /* vertical deblocking */ for (i = 1; i < h->stop_x/2; i++) { quant = h->mbs[(j+0)*h->mb_stride + i].quant; deblock8x8_v(h->tbls, h->img->u + j*8*stride2 + i*8, stride2, quant, dering); deblock8x8_v(h->tbls, h->img->v + j*8*stride2 + i*8, stride2, quant, dering); } } } void image_postproc(XVID_POSTPROC *tbls, IMAGE * img, int edged_width, const MACROBLOCK * mbs, int mb_width, int mb_height, int mb_stride, int flags, int brightness, int frame_num, int bvop, int threads) { int k; #ifndef HAVE_PTHREAD int num_threads = 1; #else int num_threads = MAX(1, MIN(threads, 4)); void *status = NULL; #endif SMPDeblock data[4]; /* horizontal deblocking, dispatch threads */ for (k = 0; k < num_threads; k++) { data[k].flags = flags; data[k].img = img; data[k].mb_stride = mb_stride; data[k].mbs = mbs; data[k].stride = edged_width; data[k].tbls = tbls; data[k].start_x = (k*mb_width / num_threads)*2; data[k].stop_x = ((k+1)*mb_width / num_threads)*2; data[k].stop_y = mb_height*2; } #ifdef HAVE_PTHREAD /* create threads */ for (k = 1; k < num_threads; k++) { pthread_create(&data[k].handle, NULL, (void*)stripe_deblock_h, (void*)&data[k]); } #endif stripe_deblock_h(&data[0]); #ifdef HAVE_PTHREAD /* wait until all threads are finished */ for (k = 1; k < num_threads; k++) { pthread_join(data[k].handle, &status); } #endif /* vertical deblocking, dispatch threads */ for (k = 0; k < num_threads; k++) { data[k].start_y = (k*mb_height / num_threads)*2; data[k].stop_y = 
((k+1)*mb_height / num_threads)*2; data[k].stop_x = mb_width*2; } #ifdef HAVE_PTHREAD /* create threads */ for (k = 1; k < num_threads; k++) { pthread_create(&data[k].handle, NULL, (void*)stripe_deblock_v, (void*)&data[k]); } #endif stripe_deblock_v(&data[0]); #ifdef HAVE_PTHREAD /* wait until all threads are finished */ for (k = 1; k < num_threads; k++) { pthread_join(data[k].handle, &status); } #endif if (!bvop) tbls->prev_quant = mbs->quant; if ((flags & XVID_FILMEFFECT)) { add_noise(tbls, img->y, img->y, edged_width, mb_width*16, mb_height*16, frame_num % 3, tbls->prev_quant); } if (brightness != 0) { image_brightness(img->y, edged_width, mb_width*16, mb_height*16, brightness); } } /******************************************************************************/ void init_deblock(XVID_POSTPROC *tbls) { int i; for(i = -255; i < 256; i++) { tbls->xvid_thresh_tbl[i + 255] = 0; if(ABS(i) < THR1) tbls->xvid_thresh_tbl[i + 255] = 1; tbls->xvid_abs_tbl[i + 255] = ABS(i); } } #define LOAD_DATA_HOR(x) \ /* Load pixel addresses and data for filtering */ \ s[0] = *(v[0] = img - 5*stride + x); \ s[1] = *(v[1] = img - 4*stride + x); \ s[2] = *(v[2] = img - 3*stride + x); \ s[3] = *(v[3] = img - 2*stride + x); \ s[4] = *(v[4] = img - 1*stride + x); \ s[5] = *(v[5] = img + 0*stride + x); \ s[6] = *(v[6] = img + 1*stride + x); \ s[7] = *(v[7] = img + 2*stride + x); \ s[8] = *(v[8] = img + 3*stride + x); \ s[9] = *(v[9] = img + 4*stride + x); #define LOAD_DATA_VER(x) \ /* Load pixel addresses and data for filtering */ \ s[0] = *(v[0] = img + x*stride - 5); \ s[1] = *(v[1] = img + x*stride - 4); \ s[2] = *(v[2] = img + x*stride - 3); \ s[3] = *(v[3] = img + x*stride - 2); \ s[4] = *(v[4] = img + x*stride - 1); \ s[5] = *(v[5] = img + x*stride + 0); \ s[6] = *(v[6] = img + x*stride + 1); \ s[7] = *(v[7] = img + x*stride + 2); \ s[8] = *(v[8] = img + x*stride + 3); \ s[9] = *(v[9] = img + x*stride + 4); #define APPLY_DERING(x) \ *v[x] = (e[x] == 0) ? ( \ (e[x-1] == 0) ? 
( \ (e[x+1] == 0) ? \ ((s[x-1]+s[x]*2+s[x+1])>>2) \ : ((s[x-1]+s[x])>>1) ) \ : ((s[x]+s[x+1])>>1) ) \ : s[x]; #define APPLY_FILTER_CORE \ /* First, decide whether to use default or DC-offset mode */ \ \ eq_cnt = 0; \ \ eq_cnt += tbls->xvid_thresh_tbl[s[0] - s[1] + 255]; \ eq_cnt += tbls->xvid_thresh_tbl[s[1] - s[2] + 255]; \ eq_cnt += tbls->xvid_thresh_tbl[s[2] - s[3] + 255]; \ eq_cnt += tbls->xvid_thresh_tbl[s[3] - s[4] + 255]; \ eq_cnt += tbls->xvid_thresh_tbl[s[4] - s[5] + 255]; \ eq_cnt += tbls->xvid_thresh_tbl[s[5] - s[6] + 255]; \ eq_cnt += tbls->xvid_thresh_tbl[s[6] - s[7] + 255]; \ eq_cnt += tbls->xvid_thresh_tbl[s[7] - s[8] + 255]; \ \ if(eq_cnt < THR2) { /* Default mode */ \ int a30, a31, a32; \ int diff, limit; \ \ if(tbls->xvid_abs_tbl[(s[4] - s[5]) + 255] < quant) { \ a30 = ((s[3]<<1) - s[4] * 5 + s[5] * 5 - (s[6]<<1)); \ a31 = ((s[1]<<1) - s[2] * 5 + s[3] * 5 - (s[4]<<1)); \ a32 = ((s[5]<<1) - s[6] * 5 + s[7] * 5 - (s[8]<<1)); \ \ diff = (5 * ((SIGN(a30) * MIN(FAST_ABS(a30), MIN(FAST_ABS(a31), FAST_ABS(a32)))) - a30) + 32) >> 6; \ limit = (s[4] - s[5]) / 2; \ \ if (limit > 0) \ diff = (diff < 0) ? 0 : ((diff > limit) ? limit : diff); \ else \ diff = (diff > 0) ? 0 : ((diff < limit) ? limit : diff); \ \ *v[4] -= diff; \ *v[5] += diff; \ } \ if (dering) { \ e[0] = (tbls->xvid_abs_tbl[(s[0] - s[1]) + 255] > quant + DERING_STRENGTH) ? 1 : 0; \ e[1] = (tbls->xvid_abs_tbl[(s[1] - s[2]) + 255] > quant + DERING_STRENGTH) ? 1 : 0; \ e[2] = (tbls->xvid_abs_tbl[(s[2] - s[3]) + 255] > quant + DERING_STRENGTH) ? 1 : 0; \ e[3] = (tbls->xvid_abs_tbl[(s[3] - s[4]) + 255] > quant + DERING_STRENGTH) ? 1 : 0; \ e[4] = (tbls->xvid_abs_tbl[(s[4] - s[5]) + 255] > quant + DERING_STRENGTH) ? 1 : 0; \ e[5] = (tbls->xvid_abs_tbl[(s[5] - s[6]) + 255] > quant + DERING_STRENGTH) ? 1 : 0; \ e[6] = (tbls->xvid_abs_tbl[(s[6] - s[7]) + 255] > quant + DERING_STRENGTH) ? 1 : 0; \ e[7] = (tbls->xvid_abs_tbl[(s[7] - s[8]) + 255] > quant + DERING_STRENGTH) ? 
1 : 0; \ e[8] = (tbls->xvid_abs_tbl[(s[8] - s[9]) + 255] > quant + DERING_STRENGTH) ? 1 : 0; \ \ e[1] |= e[0]; \ e[2] |= e[1]; \ e[3] |= e[2]; \ e[4] |= e[3]; \ e[5] |= e[4]; \ e[6] |= e[5]; \ e[7] |= e[6]; \ e[8] |= e[7]; \ e[9] = e[8]; \ \ APPLY_DERING(1) \ APPLY_DERING(2) \ APPLY_DERING(3) \ APPLY_DERING(4) \ APPLY_DERING(5) \ APPLY_DERING(6) \ APPLY_DERING(7) \ APPLY_DERING(8) \ } \ } \ else { /* DC-offset mode */ \ uint8_t p0, p9; \ int min, max; \ \ /* Now decide whether to apply smoothing filter or not */ \ max = FAST_MAX(s[1], FAST_MAX(s[2], FAST_MAX(s[3], FAST_MAX(s[4], FAST_MAX(s[5], FAST_MAX(s[6], FAST_MAX(s[7], s[8]))))))); \ min = FAST_MIN(s[1], FAST_MIN(s[2], FAST_MIN(s[3], FAST_MIN(s[4], FAST_MIN(s[5], FAST_MIN(s[6], FAST_MIN(s[7], s[8]))))))); \ \ if(((max-min)) < 2*quant) { \ \ /* Choose edge pixels */ \ p0 = (tbls->xvid_abs_tbl[(s[1] - s[0]) + 255] < quant) ? s[0] : s[1]; \ p9 = (tbls->xvid_abs_tbl[(s[8] - s[9]) + 255] < quant) ? s[9] : s[8]; \ \ *v[1] = (uint8_t) ((6*p0 + (s[1]<<2) + (s[2]<<1) + (s[3]<<1) + s[4] + s[5] + 8) >> 4); \ *v[2] = (uint8_t) (((p0<<2) + (s[1]<<1) + (s[2]<<2) + (s[3]<<1) + (s[4]<<1) + s[5] + s[6] + 8) >> 4); \ *v[3] = (uint8_t) (((p0<<1) + (s[1]<<1) + (s[2]<<1) + (s[3]<<2) + (s[4]<<1) + (s[5]<<1) + s[6] + s[7] + 8) >> 4); \ *v[4] = (uint8_t) ((p0 + s[1] + (s[2]<<1) + (s[3]<<1) + (s[4]<<2) + (s[5]<<1) + (s[6]<<1) + s[7] + s[8] + 8) >> 4); \ *v[5] = (uint8_t) ((s[1] + s[2] + (s[3]<<1) + (s[4]<<1) + (s[5]<<2) + (s[6]<<1) + (s[7]<<1) + s[8] + p9 + 8) >> 4); \ *v[6] = (uint8_t) ((s[2] + s[3] + (s[4]<<1) + (s[5]<<1) + (s[6]<<2) + (s[7]<<1) + (s[8]<<1) + (p9<<1) + 8) >> 4); \ *v[7] = (uint8_t) ((s[3] + s[4] + (s[5]<<1) + (s[6]<<1) + (s[7]<<2) + (s[8]<<1) + (p9<<2) + 8) >> 4); \ *v[8] = (uint8_t) ((s[4] + s[5] + (s[6]<<1) + (s[7]<<1) + (s[8]<<2) + 6*p9 + 8) >> 4); \ } \ } void deblock8x8_h(XVID_POSTPROC *tbls, uint8_t *img, int stride, int quant, int dering) { int eq_cnt; uint8_t *v[10]; int s[10]; int e[10]; LOAD_DATA_HOR(0) 
APPLY_FILTER_CORE LOAD_DATA_HOR(1) APPLY_FILTER_CORE LOAD_DATA_HOR(2) APPLY_FILTER_CORE LOAD_DATA_HOR(3) APPLY_FILTER_CORE LOAD_DATA_HOR(4) APPLY_FILTER_CORE LOAD_DATA_HOR(5) APPLY_FILTER_CORE LOAD_DATA_HOR(6) APPLY_FILTER_CORE LOAD_DATA_HOR(7) APPLY_FILTER_CORE } void deblock8x8_v(XVID_POSTPROC *tbls, uint8_t *img, int stride, int quant, int dering) { int eq_cnt; uint8_t *v[10]; int s[10]; int e[10]; LOAD_DATA_VER(0) APPLY_FILTER_CORE LOAD_DATA_VER(1) APPLY_FILTER_CORE LOAD_DATA_VER(2) APPLY_FILTER_CORE LOAD_DATA_VER(3) APPLY_FILTER_CORE LOAD_DATA_VER(4) APPLY_FILTER_CORE LOAD_DATA_VER(5) APPLY_FILTER_CORE LOAD_DATA_VER(6) APPLY_FILTER_CORE LOAD_DATA_VER(7) APPLY_FILTER_CORE } /****************************************************************************** * * * Noise code below taken from MPlayer: http://www.mplayerhq.hu/ * * Copyright (C) 2002 Michael Niedermayer * * * ******************************************************************************/ #define RAND_N(range) ((int) ((double)range * rand() / (RAND_MAX + 1.0))) #define STRENGTH1 12 #define STRENGTH2 8 void init_noise(XVID_POSTPROC *tbls) { int i, j; int patt[4] = { -1,0,1,0 }; emms(); srand(123457); for(i = 0, j = 0; i < MAX_NOISE; i++, j++) { double x1, x2, w, y1, y2; do { x1 = 2.0 * rand() / (float) RAND_MAX - 1.0; x2 = 2.0 * rand() / (float) RAND_MAX - 1.0; w = x1 * x1 + x2 * x2; } while (w >= 1.0); w = sqrt((-2.0 * log(w)) / w); y1 = x1 * w; y2 = x1 * w; y1 *= STRENGTH1 / sqrt(3.0); y2 *= STRENGTH2 / sqrt(3.0); y1 /= 2; y2 /= 2; y1 += patt[j%4] * STRENGTH1 * 0.35; y2 += patt[j%4] * STRENGTH2 * 0.35; if (y1 < -128) { y1=-128; } else if (y1 > 127) { y1= 127; } if (y2 < -128) { y2=-128; } else if (y2 > 127) { y2= 127; } y1 /= 3.0; y2 /= 3.0; tbls->xvid_noise1[i] = (int) y1; tbls->xvid_noise2[i] = (int) y2; if (RAND_N(6) == 0) { j--; } } for (i = 0; i < MAX_RES; i++) for (j = 0; j < 3; j++) { tbls->xvid_prev_shift[i][j] = tbls->xvid_noise1 + (rand() & (MAX_SHIFT - 1)); tbls->xvid_prev_shift[i][3 + j] = 
tbls->xvid_noise2 + (rand() & (MAX_SHIFT - 1)); } } void add_noise(XVID_POSTPROC *tbls, uint8_t *dst, uint8_t *src, int stride, int width, int height, int shiftptr, int quant) { int x, y; int shift = 0; int add = (quant < 5) ? 3 : 0; int8_t *noise = (quant < 5) ? tbls->xvid_noise2 : tbls->xvid_noise1; for(y = 0; y < height; y++) { int8_t *src2 = (int8_t *) src; shift = rand() & (MAX_SHIFT - 1); shift &= ~7; for(x = 0; x < width; x++) { const int n = tbls->xvid_prev_shift[y][0 + add][x] + tbls->xvid_prev_shift[y][1 + add][x] + tbls->xvid_prev_shift[y][2 + add][x]; dst[x] = src2[x] + ((n * src2[x]) >> 7); } tbls->xvid_prev_shift[y][shiftptr + add] = noise + shift; dst += stride; src += stride; } } void image_brightness_c(uint8_t *dst, int stride, int width, int height, int offset) { int x,y; for(y = 0; y < height; y++) { for(x = 0; x < width; x++) { int p = dst[y*stride + x]; dst[y*stride + x] = CLIP( p + offset, 0, 255); } } } xvidcore/src/image/ppc_asm/0000775000076500007650000000000011566427762016776 5ustar xvidbuildxvidbuildxvidcore/src/image/ppc_asm/colorspace_altivec.c0000664000076500007650000005360311564705453023004 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Colorspace conversion functions with altivec optimization - * * Copyright(C) 2004 Christoph Nägeli * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: colorspace_altivec.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifdef HAVE_ALTIVEC_H #include <altivec.h> #endif #include "../../portab.h" #include "../colorspace.h" #undef DEBUG #include <stdio.h> /********** generic altivec RGB to YV12 colorspace macro **********/ #define MAKE_COLORSPACE_ALTIVEC_FROM_RGB(NAME,SIZE,PIXELS,VPIXELS,FUNC,C1,C2,C3,C4) \ void \ NAME(uint8_t *x_ptr, int x_stride, \ uint8_t *y_ptr, uint8_t *u_ptr, uint8_t *v_ptr, \ int y_stride, int uv_stride, \ int width, int height, int vflip) \ { \ int fixed_width = (width + 15) & ~15; \ int x_dif = x_stride - (SIZE) * fixed_width; \ int y_dif = y_stride - fixed_width; \ int uv_dif = uv_stride - (fixed_width / 2); \ int x, y; \ unsigned prefetch_constant; \ \ register vector unsigned int shift_consts[4]; \ \ vector unsigned char y_add; \ vector unsigned char u_add; \ vector unsigned char v_add; \ \ vector unsigned short vec_fix_ins[3]; \ \ vec_st(vec_ldl(0, &g_vec_fix_ins[0]), 0, &vec_fix_ins[0]); \ vec_st(vec_ldl(0, &g_vec_fix_ins[1]), 0, &vec_fix_ins[1]); \ vec_st(vec_ldl(0, &g_vec_fix_ins[2]), 0, &vec_fix_ins[2]); \ \ shift_consts[0] = vec_add(vec_splat_u32(12), vec_splat_u32(12)); \ shift_consts[1] = vec_add(vec_splat_u32(8), vec_splat_u32(8)); \ shift_consts[2] = vec_splat_u32(8); \ shift_consts[3] = vec_splat_u32(0); \ \ prefetch_constant = build_prefetch(16, 2, (short)x_stride); \ vec_dstt(x_ptr, prefetch_constant, 0); \ vec_dstt(x_ptr + (x_stride << 1), prefetch_constant, 1); \ \ *((unsigned char*)&y_add) = Y_ADD_IN; \ *((unsigned char*)&u_add) = U_ADD_IN; \ *((unsigned char*)&v_add) = V_ADD_IN; \ \ y_add = vec_splat(y_add, 0); \ u_add = vec_splat(u_add, 0); \ v_add = vec_splat(v_add, 0); \ \ if(vflip) { \ x_ptr += (height - 1) * 
x_stride; \ x_dif = -(SIZE) * fixed_width - x_stride; \ x_stride = -x_stride; \ } \ \ for(y = 0; y < height; y += (VPIXELS)) { \ FUNC##_ROW(SIZE,C1,C2,C3,C4); \ for(x = 0; x < fixed_width; x += (PIXELS)) { \ FUNC(SIZE,C1,C2,C3,C4); \ x_ptr += (PIXELS)*(SIZE); \ y_ptr += (PIXELS); \ u_ptr += (PIXELS)/2; \ v_ptr += (PIXELS)/2; \ } \ x_ptr += x_dif + (VPIXELS-1) * x_stride; \ y_ptr += y_dif + (VPIXELS-1) * y_stride; \ u_ptr += uv_dif + ((VPIXELS/2)-1)*uv_stride; \ v_ptr += uv_dif + ((VPIXELS/2)-1)*uv_stride; \ } \ vec_dssall(); \ } /********** generic altivec YUV to YV12 colorspace macro **********/ #define MAKE_COLORSPACE_ALTIVEC_FROM_YUV(NAME,SIZE,PIXELS,VPIXELS,FUNC,C1,C2,C3,C4) \ void \ NAME(uint8_t *x_ptr, int x_stride, \ uint8_t *y_ptr, uint8_t *u_ptr, uint8_t *v_ptr, \ int y_stride, int uv_stride, \ int width, int height, int vflip) \ { \ int fixed_width = (width + 15) & ~15; \ int x_dif = x_stride - (SIZE)*fixed_width; \ int y_dif = y_stride - fixed_width; \ int uv_dif = uv_stride - (fixed_width / 2); \ int x, y; \ \ unsigned prefetch_constant; \ \ vector unsigned int p0, p1; \ vector unsigned char lum0, lum1; \ vector unsigned char u0, u1; \ vector unsigned char v0, v1; \ vector unsigned char t; \ \ prefetch_constant = build_prefetch(16, 2, (short)x_stride); \ vec_dstt(x_ptr, prefetch_constant, 0); \ vec_dstt(x_ptr + (x_stride << 1), prefetch_constant, 1); \ \ if(vflip) { \ x_ptr += (height - 1) * x_stride; \ x_dif = -(SIZE)*fixed_width - x_stride; \ x_stride = -x_stride; \ } \ \ for(y = 0; y < height; y += (VPIXELS)) { \ FUNC##_ROW(SIZE,C1,C2,C3,C4); \ for(x = 0; x < fixed_width; x += (PIXELS)) { \ FUNC(SIZE,C1,C2,C3,C4); \ x_ptr += (PIXELS)*(SIZE); \ y_ptr += (PIXELS); \ u_ptr += (PIXELS)/2; \ v_ptr += (PIXELS)/2; \ } \ x_ptr += x_dif + (VPIXELS-1) * x_stride; \ y_ptr += y_dif + (VPIXELS-1) * y_stride; \ u_ptr += uv_dif + ((VPIXELS/2)-1) * uv_stride; \ v_ptr += uv_dif + ((VPIXELS/2)-1) * uv_stride; \ } \ vec_dssall(); \ } /********** generic altivec YV12 to 
YUV colorspace macro **********/ #define MAKE_COLORSPACE_ALTIVEC_TO_YUV(NAME,SIZE,PIXELS,VPIXELS,FUNC,C1,C2,C3,C4) \ void \ NAME(uint8_t *x_ptr, int x_stride, \ uint8_t *y_ptr, uint8_t *u_ptr, uint8_t *v_ptr, \ int y_stride, int uv_stride, \ int width, int height, int vflip) \ { \ int fixed_width = (width + 15) & ~15; \ int x_dif = x_stride - (SIZE)*fixed_width; \ int y_dif = y_stride - fixed_width; \ int uv_dif = uv_stride - (fixed_width / 2); \ int x, y; \ \ vector unsigned char y_vec; \ vector unsigned char u_vec; \ vector unsigned char v_vec; \ vector unsigned char p0, p1, ptmp; \ vector unsigned char mask; \ vector unsigned char mask_stencil; \ vector unsigned char t; \ vector unsigned char m4; \ vector unsigned char vec4; \ \ unsigned prefetch_constant_y; \ unsigned prefetch_constant_uv; \ \ prefetch_constant_y = build_prefetch(16, 4, (short)y_stride); \ prefetch_constant_uv = build_prefetch(16, 2, (short)uv_stride); \ \ vec_dstt(y_ptr, prefetch_constant_y, 0); \ vec_dstt(u_ptr, prefetch_constant_uv, 1); \ vec_dstt(v_ptr, prefetch_constant_uv, 2); \ \ mask_stencil = (vector unsigned char)vec_mergeh( (vector unsigned short)vec_mergeh(vec_splat_u8(-1), vec_splat_u8(0)), vec_splat_u16(0) ); \ m4 = vec_sr(vec_lvsl(0, (unsigned char*)0), vec_splat_u8(2)); \ vec4 = vec_splat_u8(4); \ \ if(vflip) { \ x_ptr += (height - 1) * x_stride; \ x_dif = -(SIZE)*fixed_width - x_stride; \ x_stride = -x_stride; \ } \ \ for(y = 0; y < height; y += (VPIXELS)) { \ FUNC##_ROW(SIZE,C1,C2,C3,C4); \ for(x = 0; x < fixed_width; x += (PIXELS)) { \ FUNC(SIZE,C1,C2,C3,C4); \ x_ptr += (PIXELS)*(SIZE); \ y_ptr += (PIXELS); \ u_ptr += (PIXELS)/2; \ v_ptr += (PIXELS)/2; \ } \ x_ptr += x_dif + (VPIXELS-1) * x_stride; \ y_ptr += y_dif + (VPIXELS-1) * y_stride; \ u_ptr += uv_dif + ((VPIXELS/2)-1) * uv_stride; \ v_ptr += uv_dif + ((VPIXELS/2)-1) * uv_stride; \ } \ vec_dssall(); \ } /********** colorspace input (xxx_to_yv12) functions **********/ /* rgb -> yuv def's this following constants are 
"official spec" Video Demystified" (ISBN 1-878707-09-4) rgb<->yuv _is_ lossy, since most programs do the conversion differently SCALEBITS/FIX taken from ffmpeg */ #define Y_R_IN 0.257 #define Y_G_IN 0.504 #define Y_B_IN 0.098 #define Y_ADD_IN 16 #define U_R_IN 0.148 #define U_G_IN 0.291 #define U_B_IN 0.439 #define U_ADD_IN 128 #define V_R_IN 0.439 #define V_G_IN 0.368 #define V_B_IN 0.071 #define V_ADD_IN 128 #define SCALEBITS_IN 8 #define FIX_IN(x) ((uint16_t) ((x) * (1L< * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: interpolate8x8_altivec.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifdef HAVE_ALTIVEC_H #include #endif #include "../../portab.h" #undef DEBUG #include static inline unsigned build_prefetch(unsigned char block_size, unsigned char block_count, short stride) { if(block_size > 31) block_size = 0; return ((block_size << 24) | (block_count << 16) | stride); } #define NO_ROUNDING #define ROUNDING \ s1 = vec_and(vec_add(s1, s2), vec_splat_u8(1)); \ d = vec_sub(d, s1); #define INTERPLATE8X8_HALFPEL_H(round) \ s1 = vec_perm(vec_ld(0, src), vec_ld(16, src), vec_lvsl(0, src)); \ s2 = vec_perm(s1, s1, s2_mask); \ d = vec_avg(s1, s2); \ round; \ mask = vec_perm(mask_stencil, mask_stencil, vec_lvsl(0, dst)); \ d = vec_perm(d, d, 
vec_lvsl(0, dst)); \ d = vec_sel(d, vec_ld(0, dst), mask); \ vec_st(d, 0, dst); \ dst += stride; \ src += stride /* This function assumes: * dst is 8 byte aligned * src is unaligned * stride is a multiple of 8 */ void interpolate8x8_halfpel_h_altivec_c( uint8_t *dst, uint8_t *src, const uint32_t stride, const uint32_t rounding) { register vector unsigned char s1, s2; register vector unsigned char d; register vector unsigned char mask; register vector unsigned char s2_mask; register vector unsigned char mask_stencil; #ifdef DEBUG /* Dump alignment errors if DEBUG is defined */ if(((unsigned long)dst) & 0x7) fprintf(stderr, "interpolate8x8_halfpel_h_altivec_c:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "interpolate8x8_halfpel_h_altivec_c:incorrect stride, stride: %u\n", stride); #endif s2_mask = vec_lvsl(1, (unsigned char*)0); mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); if(rounding) { INTERPLATE8X8_HALFPEL_H(ROUNDING); INTERPLATE8X8_HALFPEL_H(ROUNDING); INTERPLATE8X8_HALFPEL_H(ROUNDING); INTERPLATE8X8_HALFPEL_H(ROUNDING); INTERPLATE8X8_HALFPEL_H(ROUNDING); INTERPLATE8X8_HALFPEL_H(ROUNDING); INTERPLATE8X8_HALFPEL_H(ROUNDING); INTERPLATE8X8_HALFPEL_H(ROUNDING); } else { INTERPLATE8X8_HALFPEL_H(NO_ROUNDING); INTERPLATE8X8_HALFPEL_H(NO_ROUNDING); INTERPLATE8X8_HALFPEL_H(NO_ROUNDING); INTERPLATE8X8_HALFPEL_H(NO_ROUNDING); INTERPLATE8X8_HALFPEL_H(NO_ROUNDING); INTERPLATE8X8_HALFPEL_H(NO_ROUNDING); INTERPLATE8X8_HALFPEL_H(NO_ROUNDING); INTERPLATE8X8_HALFPEL_H(NO_ROUNDING); } } #define INTERPLATE8X8_HALFPEL_V(round) \ s1 = vec_perm(vec_ld(0, src), vec_ld(16, src), vec_lvsl(0, src)); \ s2 = vec_perm(vec_ld(0, src + stride), vec_ld(16, src + stride), vec_lvsl(0, src + stride)); \ d = vec_avg(s1, s2); \ round; \ mask = vec_perm(mask_stencil, mask_stencil, vec_lvsl(0, dst)); \ d = vec_perm(d, d, vec_lvsl(0, dst)); \ d = vec_sel(d, vec_ld(0, dst), mask); \ vec_st(d, 0, dst); \ dst += stride; \ src += stride /* * This function 
assumes * dst is 8 byte aligned * src is unaligned * stride is a multiple of 8 */ void interpolate8x8_halfpel_v_altivec_c( uint8_t *dst, uint8_t *src, const uint32_t stride, const uint32_t rounding) { vector unsigned char s1, s2; vector unsigned char d; vector unsigned char mask; vector unsigned char mask_stencil; #ifdef DEBUG /* if this is on, print alignment errors */ if(((unsigned long)dst) & 0x7) fprintf(stderr, "interpolate8x8_halfpel_v_altivec_c:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "interpolate8x8_halfpel_v_altivec_c:incorrect stride, stride: %u\n", stride); #endif mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); if(rounding) { INTERPLATE8X8_HALFPEL_V(ROUNDING); INTERPLATE8X8_HALFPEL_V(ROUNDING); INTERPLATE8X8_HALFPEL_V(ROUNDING); INTERPLATE8X8_HALFPEL_V(ROUNDING); INTERPLATE8X8_HALFPEL_V(ROUNDING); INTERPLATE8X8_HALFPEL_V(ROUNDING); INTERPLATE8X8_HALFPEL_V(ROUNDING); INTERPLATE8X8_HALFPEL_V(ROUNDING); } else { INTERPLATE8X8_HALFPEL_V(NO_ROUNDING); INTERPLATE8X8_HALFPEL_V(NO_ROUNDING); INTERPLATE8X8_HALFPEL_V(NO_ROUNDING); INTERPLATE8X8_HALFPEL_V(NO_ROUNDING); INTERPLATE8X8_HALFPEL_V(NO_ROUNDING); INTERPLATE8X8_HALFPEL_V(NO_ROUNDING); INTERPLATE8X8_HALFPEL_V(NO_ROUNDING); INTERPLATE8X8_HALFPEL_V(NO_ROUNDING); } } #define INTERPOLATE8X8_HALFPEL_HV(adding) \ t = vec_perm(vec_ld(0, src), vec_ld(16, src), vec_lvsl(0, src)); \ s1 = (vector unsigned short)vec_mergeh(zerovec, t); \ t = vec_perm(vec_ld(1, src), vec_ld(17, src), vec_lvsl(1, src)); \ s2 = (vector unsigned short)vec_mergeh(zerovec, t); \ t = vec_perm(vec_ld(0, src + stride), vec_ld(16, src + stride), vec_lvsl(0, src + stride)); \ s3 = (vector unsigned short)vec_mergeh(zerovec, t); \ t = vec_perm(vec_ld(1, src + stride), vec_ld(17, src + stride), vec_lvsl(1, src + stride)); \ s4 = (vector unsigned short)vec_mergeh(zerovec, t); \ s1 = vec_add(s1,s2);\ s3 = vec_add(s3,s4);\ s1 = vec_add(s1,s3);\ s1 = vec_add(s1, adding); \ s1 = vec_sr(s1, two); \ t = 
vec_pack(s1, s1); \ mask = vec_perm(mask_stencil, mask_stencil, vec_lvsl(0, dst)); \ t = vec_sel(t, vec_ld(0, dst), mask); \ vec_st(t, 0, dst); \ dst += stride; \ src += stride void interpolate8x8_halfpel_hv_altivec_c(uint8_t *dst, uint8_t *src, const uint32_t stride, const uint32_t rounding) { vector unsigned short s1, s2, s3, s4; vector unsigned char t; vector unsigned short one, two; vector unsigned char zerovec; vector unsigned char mask; vector unsigned char mask_stencil; /* Initialisation stuff */ zerovec = vec_splat_u8(0); one = vec_splat_u16(1); two = vec_splat_u16(2); mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); if(rounding) { INTERPOLATE8X8_HALFPEL_HV(one); INTERPOLATE8X8_HALFPEL_HV(one); INTERPOLATE8X8_HALFPEL_HV(one); INTERPOLATE8X8_HALFPEL_HV(one); INTERPOLATE8X8_HALFPEL_HV(one); INTERPOLATE8X8_HALFPEL_HV(one); INTERPOLATE8X8_HALFPEL_HV(one); INTERPOLATE8X8_HALFPEL_HV(one); } else { INTERPOLATE8X8_HALFPEL_HV(two); INTERPOLATE8X8_HALFPEL_HV(two); INTERPOLATE8X8_HALFPEL_HV(two); INTERPOLATE8X8_HALFPEL_HV(two); INTERPOLATE8X8_HALFPEL_HV(two); INTERPOLATE8X8_HALFPEL_HV(two); INTERPOLATE8X8_HALFPEL_HV(two); INTERPOLATE8X8_HALFPEL_HV(two); } } /* * This function assumes: * dst is 8 byte aligned * src1 is unaligned * src2 is unaligned * stride is a multiple of 8 * rounding is smaller than max signed short + 2 */ void interpolate8x8_avg2_altivec_c( uint8_t *dst, const uint8_t *src1, const uint8_t *src2, const uint32_t stride, const uint32_t rounding, const uint32_t height) { uint32_t i; vector unsigned char t; vector unsigned char mask; vector unsigned char mask_stencil; vector unsigned char zerovec; vector signed short s1, s2; vector signed short d; vector signed short round; #ifdef DEBUG /* If this is on, print alignment errors */ if(((unsigned long)dst) & 0x7) fprintf(stderr, "interpolate8x8_avg2_altivec_c:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "interpolate8x8_avg2_altivec_c:incorrect stride, stride: 
%u\n", stride); if(rounding > (32767 + 2)) fprintf(stderr, "interpolate8x8_avg2_altivec_c:incorrect rounding, rounding: %d\n", rounding); #endif /* initialisation */ zerovec = vec_splat_u8(0); *((short*)&round) = 1 - rounding; round = vec_splat(round, 0); mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); for(i = 0; i < height; i++) { t = vec_perm(vec_ld(0, src1), vec_ld(16, src1), vec_lvsl(0, src1)); d = vec_add((vector signed short)zerovec, round); s1 = (vector signed short)vec_mergeh(zerovec, t); t = vec_perm(vec_ld(0, src2), vec_ld(16, src2), vec_lvsl(0, src2)); d = vec_add(d, s1); s2 = (vector signed short)vec_mergeh(zerovec, t); d = vec_add(d, s2); d = vec_sr(d, vec_splat_u16(1)); t = vec_pack((vector unsigned short)d, (vector unsigned short)zerovec); mask = vec_perm(mask_stencil, mask_stencil, vec_lvsl(0, dst)); t = vec_perm(t, t, vec_lvsl(0, dst)); t = vec_sel(t, vec_ld(0, dst), mask); vec_st(t, 0, dst); dst += stride; src1 += stride; src2 += stride; } } #define INTERPOLATE8X8_AVG4() \ d = r; \ \ t = vec_perm(vec_ld(0, src1), vec_ld(16, src1), vec_lvsl(0, src1)); \ s = (vector signed short)vec_mergeh(zerovec, t); \ d = vec_add(d, s); \ \ t = vec_perm(vec_ld(0, src2), vec_ld(16, src2), vec_lvsl(0, src2)); \ s = (vector signed short)vec_mergeh(zerovec, t); \ d = vec_add(d, s); \ \ t = vec_perm(vec_ld(0, src3), vec_ld(16, src3), vec_lvsl(0, src3)); \ s = (vector signed short)vec_mergeh(zerovec, t); \ d = vec_add(d, s); \ \ t = vec_perm(vec_ld(0, src4), vec_ld(16, src4), vec_lvsl(0, src4)); \ s = (vector signed short)vec_mergeh(zerovec, t); \ d = vec_add(d, s); \ \ d = vec_sr(d, shift); \ \ t = vec_pack((vector unsigned short)d, (vector unsigned short)zerovec); \ mask = vec_perm(mask_stencil, mask_stencil, vec_lvsl(0, dst)); \ t = vec_perm(t, t, vec_lvsl(0, dst)); \ t = vec_sel(t, vec_ld(0, dst), mask); \ vec_st(t, 0, dst); \ \ dst += stride; \ src1 += stride; \ src2 += stride; \ src3 += stride; \ src4 += stride /* This function assumes: * dst is 8 
byte aligned * src1, src2, src3, src4 are unaligned * stride is a multiple of 8 */ void interpolate8x8_avg4_altivec_c(uint8_t *dst, const uint8_t *src1, const uint8_t *src2, const uint8_t *src3, const uint8_t *src4, const uint32_t stride, const uint32_t rounding) { vector signed short r; register vector signed short s, d; register vector unsigned short shift; register vector unsigned char t; register vector unsigned char zerovec; register vector unsigned char mask; register vector unsigned char mask_stencil; #ifdef DEBUG /* if debug is set, print alignment errors */ if(((unsigned)dst) & 0x7) fprintf(stderr, "interpolate8x8_avg4_altivec_c:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "interpolate8x8_avg4_altivec_c:incorrect stride, stride: %u\n", stride); #endif /* Initialization */ zerovec = vec_splat_u8(0); *((short*)&r) = 2 - rounding; r = vec_splat(r, 0); shift = vec_splat_u16(2); mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); /* interpolate */ INTERPOLATE8X8_AVG4(); INTERPOLATE8X8_AVG4(); INTERPOLATE8X8_AVG4(); INTERPOLATE8X8_AVG4(); INTERPOLATE8X8_AVG4(); INTERPOLATE8X8_AVG4(); INTERPOLATE8X8_AVG4(); INTERPOLATE8X8_AVG4(); } /* * This function assumes: * dst is 8 byte aligned * src is unaligned * stride is a multiple of 8 * rounding is ignored */ void interpolate8x8_halfpel_add_altivec_c(uint8_t *dst, const uint8_t *src, const uint32_t stride, const uint32_t rounding) { interpolate8x8_avg2_altivec_c(dst, dst, src, stride, 0, 8); } #define INTERPOLATE8X8_HALFPEL_H_ADD_ROUND() \ mask_dst = vec_lvsl(0,dst); \ s1 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src)); \ d = vec_perm(vec_ld(0,dst),vec_ld(16,dst),mask_dst); \ \ s2 = vec_perm(s1,s1,rot1);\ tmp = vec_avg(s1,s2);\ s1 = vec_xor(s1,s2);\ s1 = vec_sub(tmp,vec_and(s1,one));\ \ d = vec_avg(s1,d);\ \ mask = vec_perm(mask_stencil, mask_stencil, mask_dst); \ d = vec_perm(d,d,mask_dst); \ d = vec_sel(d,vec_ld(0,dst),mask); \ vec_st(d,0,dst); \ \ dst += stride; \ src 
+= stride #define INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND() \ mask_dst = vec_lvsl(0,dst); \ s1 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src)); \ d = vec_perm(vec_ld(0,dst),vec_ld(16,dst),mask_dst); \ \ s1 = vec_avg(s1, vec_perm(s1,s1,rot1));\ d = vec_avg(s1,d);\ \ mask = vec_perm(mask_stencil,mask_stencil,mask_dst);\ d = vec_perm(d,d,mask_dst);\ d = vec_sel(d,vec_ld(0,dst),mask);\ vec_st(d,0,dst);\ \ dst += stride;\ src += stride /* * This function assumes: * dst is 8 byte aligned * src is unaligned * stride is a multiple of 8 */ void interpolate8x8_halfpel_h_add_altivec_c(uint8_t *dst, uint8_t *src, const uint32_t stride, const uint32_t rounding) { register vector unsigned char s1,s2; register vector unsigned char d; register vector unsigned char tmp; register vector unsigned char mask_dst; register vector unsigned char one; register vector unsigned char rot1; register vector unsigned char mask_stencil; register vector unsigned char mask; #ifdef DEBUG if(((unsigned)dst) & 0x7) fprintf(stderr, "interpolate8x8_halfpel_h_add_altivec_c:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "interpolate8x8_halfpel_h_add_altivec_c:incorrect stride, stride: %u\n", stride); #endif /* initialization */ mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); one = vec_splat_u8(1); rot1 = vec_lvsl(1,(unsigned char*)0); if(rounding) { INTERPOLATE8X8_HALFPEL_H_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_ROUND(); } else { INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND(); 
INTERPOLATE8X8_HALFPEL_H_ADD_NOROUND(); } } #define INTERPOLATE8X8_HALFPEL_V_ADD_ROUND()\ src += stride;\ mask_dst = vec_lvsl(0,dst);\ s2 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src));\ d = vec_perm(vec_ld(0,dst),vec_ld(16,dst),mask_dst);\ \ tmp = vec_avg(s1,s2);\ s1 = vec_xor(s1,s2);\ s1 = vec_sub(tmp,vec_and(s1,vec_splat_u8(1)));\ d = vec_avg(s1,d);\ \ mask = vec_perm(mask_stencil,mask_stencil,mask_dst);\ d = vec_perm(d,d,mask_dst);\ d = vec_sel(d,vec_ld(0,dst),mask);\ vec_st(d,0,dst);\ \ s1 = s2;\ \ dst += stride #define INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND()\ src += stride;\ mask_dst = vec_lvsl(0,dst);\ s2 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src));\ d = vec_perm(vec_ld(0,dst),vec_ld(16,dst),mask_dst);\ \ s1 = vec_avg(s1,s2);\ d = vec_avg(s1,d);\ \ mask = vec_perm(mask_stencil,mask_stencil,mask_dst);\ d = vec_perm(d,d,mask_dst);\ d = vec_sel(d,vec_ld(0,dst),mask);\ vec_st(d,0,dst);\ \ s1 = s2;\ dst += stride /* * This function assumes: * dst: 8 byte aligned * src: unaligned * stride is a multiple of 8 */ void interpolate8x8_halfpel_v_add_altivec_c(uint8_t *dst, uint8_t *src, const uint32_t stride, const uint32_t rounding) { register vector unsigned char s1,s2; register vector unsigned char tmp; register vector unsigned char d; register vector unsigned char mask; register vector unsigned char mask_dst; register vector unsigned char mask_stencil; #ifdef DEBUG if(((unsigned)dst) & 0x7) fprintf(stderr, "interpolate8x8_halfpel_v_add_altivec_c:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "interpolate8x8_halfpel_v_add_altivec_c:incorrect stride, stride: %u\n", stride); #endif /* initialization */ mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); if(rounding) { /* Interpolate vertical with rounding */ s1 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src)); INTERPOLATE8X8_HALFPEL_V_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_ROUND(); 
INTERPOLATE8X8_HALFPEL_V_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_ROUND(); } else { /* Interpolate vertical without rounding */ s1 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src)); INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_V_ADD_NOROUND(); } } #define INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND()\ src += stride;\ mask_dst = vec_lvsl(0,dst);\ c10 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src));\ d = vec_perm(vec_ld(0,dst),vec_ld(16,dst),mask_dst);\ c11 = vec_perm(c10,c10,rot1);\ \ s00 = (vector unsigned short)vec_mergeh(zero,c00);\ s01 = (vector unsigned short)vec_mergeh(zero,c01);\ s10 = (vector unsigned short)vec_mergeh(zero,c10);\ s11 = (vector unsigned short)vec_mergeh(zero,c11);\ \ s00 = vec_add(s00,s10);\ s01 = vec_add(s01,s11);\ s00 = vec_add(s00,s01);\ s00 = vec_add(s00,one);\ \ s00 = vec_sr(s00,two);\ s00 = vec_add(s00, (vector unsigned short)vec_mergeh(zero,d));\ s00 = vec_sr(s00,one);\ \ d = vec_pack(s00,s00);\ mask = vec_perm(mask_stencil,mask_stencil,mask_dst);\ d = vec_sel(d,vec_ld(0,dst),mask);\ vec_st(d,0,dst);\ \ c00 = c10;\ c01 = c11;\ dst += stride #define INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND()\ src += stride;\ mask_dst = vec_lvsl(0,dst);\ c10 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src));\ d = vec_perm(vec_ld(0,dst),vec_ld(16,dst),mask_dst);\ c11 = vec_perm(c10,c10,rot1);\ \ s00 = (vector unsigned short)vec_mergeh(zero,c00);\ s01 = (vector unsigned short)vec_mergeh(zero,c01);\ s10 = (vector unsigned short)vec_mergeh(zero,c10);\ s11 = (vector unsigned short)vec_mergeh(zero,c11);\ \ s00 = vec_add(s00,s10);\ s01 = vec_add(s01,s11);\ s00 = vec_add(s00,s01);\ 
s00 = vec_add(s00,two);\ s00 = vec_sr(s00,two);\ \ c00 = vec_pack(s00,s00);\ d = vec_avg(d,c00);\ \ mask = vec_perm(mask_stencil,mask_stencil,mask_dst);\ d = vec_perm(d,d,mask_dst);\ d = vec_sel(d,vec_ld(0,dst),mask);\ vec_st(d,0,dst);\ \ c00 = c10;\ c01 = c11;\ dst += stride /* * This function assumes: * dst: 8 byte aligned * src: unaligned * stride: multiple of 8 */ void interpolate8x8_halfpel_hv_add_altivec_c(uint8_t *dst, uint8_t *src, const uint32_t stride, const uint32_t rounding) { register vector unsigned char c00,c10,c01,c11; register vector unsigned short s00,s10,s01,s11; register vector unsigned char d; register vector unsigned char mask; register vector unsigned char mask_stencil; register vector unsigned char rot1; register vector unsigned char mask_dst; register vector unsigned char zero; register vector unsigned short one,two; #ifdef DEBUG if(((unsigned)dst) & 0x7) fprintf(stderr, "interpolate8x8_halfpel_hv_add_altivec_c:incorrect align, dst: %lx\n", (long)dst); if(stride & 0x7) fprintf(stderr, "interpolate8x8_halfpel_hv_add_altivec_c:incorrect stride, stride: %u\n", stride); #endif /* initialization */ mask_stencil = vec_pack(vec_splat_u16(0), vec_splat_u16(-1)); rot1 = vec_lvsl(1,(unsigned char*)0); zero = vec_splat_u8(0); one = vec_splat_u16(1); two = vec_splat_u16(2); if(rounding) { /* Load the first row 'manually' */ c00 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src)); c01 = vec_perm(c00,c00,rot1); INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_ROUND(); } else { /* Load the first row 'manually' */ c00 = vec_perm(vec_ld(0,src),vec_ld(16,src),vec_lvsl(0,src)); c01 = vec_perm(c00,c00,rot1); INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND(); 
INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND(); INTERPOLATE8X8_HALFPEL_HV_ADD_NOROUND(); } } xvidcore/src/image/ppc_asm/qpel_altivec.c0000664000076500007650000003210611564705453021606 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - QPel interpolation with altivec optimization - * * Copyright(C) 2004 Christoph Nägeli * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: qpel_altivec.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifdef HAVE_ALTIVEC_H #include <altivec.h> #endif #include "../../portab.h" #undef DEBUG #include <stdio.h> static const vector signed char FIR_Tab_16[17] = { (vector signed char)AVV( 14, -3, 2, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ), (vector signed char)AVV( 23, 19, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ), (vector signed char)AVV( -7, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ), (vector signed char)AVV( 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0 ), (vector signed char)AVV( -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0, 0 ), (vector signed char)AVV( 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0 ), (vector signed char)AVV( 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0 ), (vector signed char)AVV( 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0 ), (vector signed char)AVV( 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0 ), (vector signed char)AVV( 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0 ), (vector signed char)AVV( 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0 ), (vector signed char)AVV( 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0 ), (vector signed char)AVV( 0, 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1 ), (vector signed char)AVV( 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3 ), (vector signed char)AVV( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -7 ), (vector signed char)AVV( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 19, 23 ), (vector signed char)AVV( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 2, -3, 14 ) }; static const vector signed short FIR_Tab_8[9] = { (vector signed short)AVV( 14, -3, 2, -1, 0, 0, 0, 0 ), (vector signed short)AVV( 23, 19, -6, 3, -1, 
0, 0, 0 ), (vector signed short)AVV( -7, 20, 20, -6, 3, -1, 0, 0 ), (vector signed short)AVV( 3, -6, 20, 20, -6, 3, -1, 0 ), (vector signed short)AVV( -1, 3, -6, 20, 20, -6, 3, -1 ), (vector signed short)AVV( 0, -1, 3, -6, 20, 20, -6, 3 ), (vector signed short)AVV( 0, 0, -1, 3, -6, 20, 20, -7 ), (vector signed short)AVV( 0, 0, 0, -1, 3, -6, 19, 23 ), (vector signed short)AVV( 0, 0, 0, 0, -1, 2, -3, 14 ) }; /* Processing with FIR_Tab */ #define PROCESS_FIR_16(x) \ firs = FIR_Tab_16[x];\ \ tmp = vec_splat(vec_src,(x));\ sums1 = vec_mladd( (vector signed short)vec_mergeh(ox00, tmp), vec_unpackh(firs), sums1 );\ sums2 = vec_mladd( (vector signed short)vec_mergel(ox00, tmp), vec_unpackl(firs), sums2 ) #define PROCESS_FIR_8(x) \ firs = FIR_Tab_8[x];\ tmp = (vector signed short)vec_mergeh( ox00, vec_splat(vec_src,x) );\ sums = vec_mladd(firs,tmp,sums) #define NOTHING() \ /* Nothing here */ #pragma mark - /* "Postprocessing" macros */ #define AVRG_16() \ sums1 = (vector signed short)vec_mergeh(ox00,tmp);\ sums2 = (vector signed short)vec_mergel(ox00,tmp);\ sums1 = vec_add(sums1, (vector signed short)vec_mergeh(ox00,vec_src) );\ sums2 = vec_add(sums2, (vector signed short)vec_mergel(ox00,vec_src) );\ tmp = (vector unsigned char)vec_splat_u16(1);\ sums1 = vec_add(sums1, (vector signed short)tmp);\ sums2 = vec_add(sums2, (vector signed short)tmp);\ sums1 = vec_sub(sums1, vec_rnd);\ sums2 = vec_sub(sums2, vec_rnd);\ sums1 = vec_sra(sums1, (vector unsigned short)tmp);\ sums2 = vec_sra(sums2, (vector unsigned short)tmp);\ tmp = vec_packsu(sums1,sums2) #define AVRG_UP_16_H() \ vec_src = vec_perm(vec_ld(1,Src),vec_ld(17,Src),vec_lvsl(1,Src));\ AVRG_16() #define AVRG_UP_16_V() \ ((unsigned char*)&vec_src)[0] = Src[16 * BpS];\ vec_src = vec_perm(vec_src,vec_src,vec_lvsl(1,(unsigned char*)0));\ AVRG_16() #define AVRG_8() \ sums = (vector signed short)vec_mergeh(ox00, st);\ sums = vec_add(sums, (vector signed short)vec_mergeh(ox00, vec_src));\ st = (vector unsigned 
char)vec_splat_u16(1);\ sums = vec_add(sums, (vector signed short)st);\ sums = vec_sub(sums, vec_rnd);\ sums = vec_sra(sums, (vector unsigned short)st);\ st = vec_packsu(sums,sums) #define AVRG_UP_8() \ vec_src = vec_perm(vec_src, vec_src, vec_lvsl(1,(unsigned char*)0));\ AVRG_8() #pragma mark - /* Postprocessing Macros for the Pass_16 Add functions */ #define ADD_16_H()\ tmp = vec_avg(tmp, vec_perm(vec_ld(0,Dst),vec_ld(16,Dst),vec_lvsl(0,Dst))) #define AVRG_ADD_16_H()\ AVRG_16();\ ADD_16_H() #define AVRG_UP_ADD_16_H()\ AVRG_UP_16_H();\ ADD_16_H() #define ADD_16_V() \ for(j = 0; j < 16; j++)\ ((unsigned char*)&vec_src)[j] = Dst[j * BpS];\ tmp = vec_avg(tmp, vec_src) #define AVRG_ADD_16_V()\ AVRG_16();\ ADD_16_V() #define AVRG_UP_ADD_16_V()\ AVRG_UP_16_V();\ ADD_16_V() #pragma mark - /* Postprocessing Macros for the Pass_8 Add functions */ #define ADD_8_H()\ sums = (vector signed short)vec_mergeh(ox00, st);\ tmp = (vector signed short)vec_mergeh(ox00,vec_perm(vec_ld(0,Dst),vec_ld(16,Dst),vec_lvsl(0,Dst)));\ sums = vec_avg(sums,tmp);\ st = vec_packsu(sums,sums) #define AVRG_ADD_8_H()\ AVRG_8();\ ADD_8_H() #define AVRG_UP_ADD_8_H()\ AVRG_UP_8();\ ADD_8_H() #define ADD_8_V()\ for(j = 0; j < 8; j++)\ ((short*)&tmp)[j] = (short)Dst[j * BpS];\ sums = (vector signed short)vec_mergeh(ox00,st);\ sums = vec_avg(sums,tmp);\ st = vec_packsu(sums,sums) #define AVRG_ADD_8_V()\ AVRG_8();\ ADD_8_V() #define AVRG_UP_ADD_8_V()\ AVRG_UP_8();\ ADD_8_V() #pragma mark - /* Load/Store Macros */ #define LOAD_H() \ vec_src = vec_perm(vec_ld(0,Src),vec_ld(16,Src),vec_lvsl(0,Src)) #define LOAD_V_16() \ for(j = 0; j < 16; j++)\ ((unsigned char*)&vec_src)[j] = Src[j * BpS] #define LOAD_V_8() \ for(j = 0; j <= 8; j++)\ ((unsigned char*)&vec_src)[j] = Src[j * BpS] #define LOAD_UP_V_8() \ for(j = 0; j <= 9; j++)\ ((unsigned char*)&vec_src)[j] = Src[j * BpS] #define STORE_H_16() \ mask = vec_lvsr(0,Dst);\ tmp = vec_perm(tmp,tmp,mask);\ mask = vec_perm(oxFF, ox00, mask);\ tmp1 = vec_sel(tmp, 
vec_ld(0,Dst), mask);\ vec_st(tmp1, 0, Dst);\ tmp1 = vec_sel(vec_ld(16,Dst), tmp, mask);\ vec_st(tmp1, 16, Dst) #define STORE_V_16() \ for(j = 0; j < 16; j++)\ Dst[j * BpS] = ((unsigned char*)&tmp)[j] #define STORE_H_8() \ mask = vec_perm(mask_00ff, mask_00ff, vec_lvsr(0,Dst) );\ st = vec_sel(st, vec_ld(0,Dst), mask);\ vec_st(st, 0, Dst) #define STORE_V_8() \ for(j = 0; j < 8; j++)\ Dst[j * BpS] = ((unsigned char*)&st)[j] #pragma mark - /* Additional variable declaration/initialization macros */ #define VARS_H_16()\ register vector unsigned char mask;\ register vector unsigned char tmp1 #define VARS_V()\ register unsigned j #define VARS_H_8()\ register vector unsigned char mask_00ff;\ register vector unsigned char mask;\ mask_00ff = vec_pack(vec_splat_u16(0),vec_splat_u16(-1)) #pragma mark - /* Function macros */ #define MAKE_PASS_16(NAME, POSTPROC, ADDITIONAL_VARS, LOAD_SOURCE, STORE_DEST, NEXT_PIXEL, NEXT_LINE) \ void \ NAME(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t Rnd) \ {\ register vector signed short sums1,sums2;\ register vector unsigned char ox00;\ register vector unsigned char oxFF;\ register vector signed char firs;\ vector signed short vec_rnd;\ vector signed short s16Rnd;\ vector unsigned char vec_src;\ vector unsigned char tmp;\ \ register unsigned c;\ \ ADDITIONAL_VARS();\ \ ox00 = vec_splat_u8(0);\ oxFF = vec_splat_u8(-1);\ \ *((short*)&vec_rnd) = (short)Rnd;\ vec_rnd = vec_splat(vec_rnd,0);\ s16Rnd = vec_add(vec_splat_s16(8),vec_splat_s16(8));\ s16Rnd = vec_sub(s16Rnd, vec_rnd);\ \ c = ((1 << 24) | (16 << 16) | BpS);\ \ while(H-- > 0) {\ \ vec_dst(Src, c, 2);\ \ sums1 = s16Rnd;\ sums2 = s16Rnd;\ \ LOAD_SOURCE();\ \ PROCESS_FIR_16(0);\ PROCESS_FIR_16(1);\ PROCESS_FIR_16(2);\ PROCESS_FIR_16(3);\ \ PROCESS_FIR_16(4);\ PROCESS_FIR_16(5);\ PROCESS_FIR_16(6);\ PROCESS_FIR_16(7);\ \ PROCESS_FIR_16(8);\ PROCESS_FIR_16(9);\ PROCESS_FIR_16(10);\ PROCESS_FIR_16(11);\ \ PROCESS_FIR_16(12);\ PROCESS_FIR_16(13);\ PROCESS_FIR_16(14);\ 
PROCESS_FIR_16(15);\ \ firs = FIR_Tab_16[16];\ *((uint8_t*)&tmp) = Src[16*NEXT_PIXEL];\ tmp = vec_splat(tmp,0);\ \ sums1 = vec_mladd( (vector signed short)vec_mergeh(ox00,tmp),vec_unpackh(firs),sums1 );\ sums2 = vec_mladd( (vector signed short)vec_mergel(ox00,tmp),vec_unpackl(firs),sums2 );\ \ tmp = (vector unsigned char)vec_splat_u16(5);\ sums1 = vec_sra(sums1,(vector unsigned short)tmp);\ sums2 = vec_sra(sums2,(vector unsigned short)tmp);\ tmp = vec_packsu(sums1,sums2);\ \ POSTPROC();\ \ STORE_DEST();\ \ Src += NEXT_LINE;\ Dst += NEXT_LINE;\ }\ vec_dss(2);\ } #define MAKE_PASS_8(NAME,POSTPROC,ADDITIONAL_VARS, LOAD_SOURCE, STORE_DEST, INC) \ void \ NAME(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t Rnd)\ {\ register vector signed short sums;\ register vector signed short firs;\ register vector unsigned char ox00;\ vector signed short tmp;\ vector signed short vec_rnd;\ vector signed short vec_rnd16;\ vector unsigned char vec_src;\ vector unsigned char st;\ \ ADDITIONAL_VARS();\ \ ox00 = vec_splat_u8(0);\ \ *((short*)&vec_rnd) = (short)Rnd;\ vec_rnd = vec_splat(vec_rnd,0);\ vec_rnd16 = vec_sub( vec_add(vec_splat_s16(8),vec_splat_s16(8)), vec_rnd );\ \ while(H-- > 0) {\ \ sums = vec_rnd16;\ LOAD_SOURCE();\ \ PROCESS_FIR_8(0);\ PROCESS_FIR_8(1);\ PROCESS_FIR_8(2);\ PROCESS_FIR_8(3);\ \ PROCESS_FIR_8(4);\ PROCESS_FIR_8(5);\ PROCESS_FIR_8(6);\ PROCESS_FIR_8(7);\ \ PROCESS_FIR_8(8);\ \ sums = vec_sra(sums, vec_splat_u16(5));\ st = vec_packsu(sums,sums);\ \ POSTPROC();\ \ STORE_DEST();\ \ Src += INC;\ Dst += INC;\ }\ } /* Create the actual Functions ***************************************/ /* These functions assume no alignment */ MAKE_PASS_16(H_Pass_16_Altivec_C, NOTHING, VARS_H_16, LOAD_H, STORE_H_16, 1, BpS) MAKE_PASS_16(H_Pass_Avrg_16_Altivec_C, AVRG_16, VARS_H_16, LOAD_H, STORE_H_16, 1, BpS) MAKE_PASS_16(H_Pass_Avrg_Up_16_Altivec_C, AVRG_UP_16_H, VARS_H_16, LOAD_H, STORE_H_16, 1, BpS) MAKE_PASS_16(V_Pass_16_Altivec_C, NOTHING, VARS_V, LOAD_V_16, 
STORE_V_16, BpS, 1) MAKE_PASS_16(V_Pass_Avrg_16_Altivec_C, AVRG_16, VARS_V, LOAD_V_16, STORE_V_16, BpS, 1) MAKE_PASS_16(V_Pass_Avrg_Up_16_Altivec_C, AVRG_UP_16_V, VARS_V, LOAD_V_16, STORE_V_16, BpS, 1) /* These functions assume: * Dst: 8 Byte aligned * BpS: Multiple of 8 */ MAKE_PASS_8(H_Pass_8_Altivec_C, NOTHING, VARS_H_8, LOAD_H, STORE_H_8, BpS) MAKE_PASS_8(H_Pass_Avrg_8_Altivec_C, AVRG_8, VARS_H_8, LOAD_H, STORE_H_8, BpS) MAKE_PASS_8(H_Pass_Avrg_Up_8_Altivec_C, AVRG_UP_8, VARS_H_8, LOAD_H, STORE_H_8, BpS) /* These functions assume no alignment */ MAKE_PASS_8(V_Pass_8_Altivec_C, NOTHING, VARS_V, LOAD_V_8, STORE_V_8, 1) MAKE_PASS_8(V_Pass_Avrg_8_Altivec_C, AVRG_8, VARS_V, LOAD_V_8, STORE_V_8, 1) MAKE_PASS_8(V_Pass_Avrg_Up_8_Altivec_C, AVRG_UP_8, VARS_V, LOAD_UP_V_8, STORE_V_8, 1) /* These functions assume no alignment */ MAKE_PASS_16(H_Pass_16_Add_Altivec_C, ADD_16_H, VARS_H_16, LOAD_H, STORE_H_16, 1, BpS) MAKE_PASS_16(H_Pass_Avrg_16_Add_Altivec_C, AVRG_ADD_16_H, VARS_H_16, LOAD_H, STORE_H_16, 1, BpS) MAKE_PASS_16(H_Pass_Avrg_Up_16_Add_Altivec_C, AVRG_UP_ADD_16_H, VARS_H_16, LOAD_H, STORE_H_16, 1, BpS) MAKE_PASS_16(V_Pass_16_Add_Altivec_C, ADD_16_V, VARS_V, LOAD_V_16, STORE_V_16, BpS, 1) MAKE_PASS_16(V_Pass_Avrg_16_Add_Altivec_C, AVRG_ADD_16_V, VARS_V, LOAD_V_16, STORE_V_16, BpS, 1) MAKE_PASS_16(V_Pass_Avrg_Up_16_Add_Altivec_C, AVRG_UP_ADD_16_V, VARS_V, LOAD_V_16, STORE_V_16, BpS, 1) /* These functions assume: * Dst: 8 Byte aligned * BpS: Multiple of 8 */ MAKE_PASS_8(H_Pass_8_Add_Altivec_C, ADD_8_H, VARS_H_8, LOAD_H, STORE_H_8, BpS) MAKE_PASS_8(H_Pass_Avrg_8_Add_Altivec_C, AVRG_ADD_8_H, VARS_H_8, LOAD_H, STORE_H_8, BpS) MAKE_PASS_8(H_Pass_Avrg_Up_8_Add_Altivec_C, AVRG_UP_ADD_8_H, VARS_H_8, LOAD_H, STORE_H_8, BpS) /* These functions assume no alignment */ MAKE_PASS_8(V_Pass_8_Add_Altivec_C, ADD_8_V, VARS_V, LOAD_V_8, STORE_V_8, 1) MAKE_PASS_8(V_Pass_Avrg_8_Add_Altivec_C, AVRG_ADD_8_V, VARS_V, LOAD_V_8, STORE_V_8, 1) MAKE_PASS_8(V_Pass_Avrg_Up_8_Add_Altivec_C, 
AVRG_UP_ADD_8_V, VARS_V, LOAD_UP_V_8, STORE_V_8, 1) xvidcore/src/image/ia64_asm/0000775000076500007650000000000011566427762016757 5ustar xvidbuildxvidbuildxvidcore/src/image/ia64_asm/interpolate8x8_ia64.s0000664000076500007650000001547111147310721022651 0ustar xvidbuildxvidbuild// **************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 halfpel interpolation - // * // * Copyright(C) 2002 Kai Kühn, Alexander Viehl // * // * This program is free software; you can redistribute it and/or modify it // * under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. 
// * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: interpolate8x8_ia64.s,v 1.6 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * interpolate8x8_ia64.s, IA-64 halfpel interpolation // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** .file "interpolate8x8_ia64.s" .pred.safe_across_calls p1-p5,p16-p63 .text .align 16 .global interpolate8x8_halfpel_h_ia64# .proc interpolate8x8_halfpel_h_ia64# interpolate8x8_halfpel_h_ia64: LL=3 SL=1 SL2=1 OL=1 OL2=1 AVL=1 AL=1 STL=3 alloc r9=ar.pfs,4, 60,0,64 mov r20 = ar.lc mov r21 = pr dep.z r22 = r33,3,3 // rshift of src and r14 = -8,r33 // align src mov r15 = r32 // get dest mov r16 = r34 // stride // sub r17 = 0,r0 // 1-rounding ;; add r18 = 8,r14 // mux1 r17 = r17, @brcst // broadcast 1-rounding sub r24 = 64,r22 // lshift of src add r26 = 8,r22 // rshift of src+1 sub r27 = 56,r22 // lshift of src+1 mov ar.lc = 7 // loop counter mov ar.ec = LL + SL +OL + AVL + STL // sum of latencies mov pr.rot = 1 << 16 // init pr regs for sw-pipelining ;; .rotr ald1[LL+1],ald2[LL+1],shru1[SL+1],shl1[SL+1],shru2[SL+1],shl2[SL+1],or1[OL+1],or2[OL+1+AL],avg[AVL+1] .rotp aldp[LL], sh1p[SL], or1p[OL], pavg1p[AVL],stp[STL] .Lloop_interpolate: (aldp[0]) ld8 ald1[0] = [r14],r16 // load aligned src (aldp[0]) ld8 ald2[0] = [r18],r16 // and aligned src+8 (sh1p[0]) shr.u shru1[0] = ald1[LL],r22 // get src (sh1p[0]) shl shl1[0] = ald2[LL],r27 (sh1p[0]) shr.u shru2[0] = ald1[LL],r26 // get src+1 (sh1p[0]) shl shl2[0] = ald2[LL],r24 (or1p[0]) or 
or1[0] = shru1[SL],shl2[SL] // merge things (or1p[0]) or or2[0] = shru2[SL],shl1[SL] // (addp[0]) padd1.uus add1[0] = or1[OL],r17 // add 1-rounding (pavg1p[0]) pavg1 avg[0] = or1[OL],or2[OL] // parallel average (stp[0]) st8 [r15] = avg[AVL] // store results (stp[0]) add r15 = r15,r16 br.ctop.sptk.few .Lloop_interpolate ;; mov ar.lc = r20 mov pr = r21,-1 br.ret.sptk.many b0 .endp interpolate8x8_halfpel_h_ia64# .align 16 .global interpolate8x8_halfpel_v_ia64# .proc interpolate8x8_halfpel_v_ia64# interpolate8x8_halfpel_v_ia64: LL=3 SL=1 SL2=1 OL=1 OL2=1 AVL=1 AL=1 STL=3 alloc r9=ar.pfs,4, 60,0,64 mov r20 = ar.lc mov r21 = pr dep.z r22 = r33,3,3 and r14 = -8,r33 mov r15 = r32 mov r16 = r34 // sub r17 = 0,r0 ;; add r18 = 8,r14 add r19 = r14,r16 // src + stride // mux1 r17 = r17, @brcst sub r24 = 64,r22 ;; add r26 = 8,r19 // src + stride + 8 mov ar.lc = 7 mov ar.ec = LL + SL +OL + AVL + STL mov pr.rot = 1 << 16 ;; .rotr ald1[LL+1],ald2[LL+1],ald3[LL+1],ald4[LL+1],shru1[SL+1],shl1[SL+1],shru2[SL+1],shl2[SL+1],or1[OL+1],or2[OL+1+AL],avg[AVL+1] .rotp aldp[LL], sh1p[SL], or1p[OL], pavg1p[AVL],stp[STL] .Lloop_interpolate2: (aldp[0]) ld8 ald1[0] = [r14],r16 (aldp[0]) ld8 ald2[0] = [r18],r16 (aldp[0]) ld8 ald3[0] = [r19],r16 (aldp[0]) ld8 ald4[0] = [r26],r16 (sh1p[0]) shr.u shru1[0] = ald1[LL],r22 (sh1p[0]) shl shl1[0] = ald2[LL],r24 (sh1p[0]) shr.u shru2[0] = ald3[LL],r22 (sh1p[0]) shl shl2[0] = ald4[LL],r24 (or1p[0]) or or1[0] = shru1[SL],shl1[SL] (or1p[0]) or or2[0] = shru2[SL],shl2[SL] // (addp[0]) padd1.uus add1[0] = or1[OL],r17 (pavg1p[0]) pavg1 avg[0] = or1[OL],or2[OL] (stp[0]) st8 [r15] = avg[AVL] (stp[0]) add r15 = r15,r16 br.ctop.sptk.few .Lloop_interpolate2 ;; mov ar.lc = r20 mov pr = r21,-1 br.ret.sptk.many b0 .endp interpolate8x8_halfpel_v_ia64# .align 16 .global interpolate8x8_halfpel_hv_ia64# .proc interpolate8x8_halfpel_hv_ia64# interpolate8x8_halfpel_hv_ia64: LL=3 SL=1 SL2=1 OL=1 OL2=1 AVL=1 AL=1 STL=3 alloc r9=ar.pfs,4, 60,0,64 mov r20 = ar.lc mov r21 = pr 
dep.z r22 = r33,3,3 and r14 = -8,r33 mov r15 = r32 mov r16 = r34 // sub r17 = 0,r0 ;; add r18 = 8,r14 add r19 = r14,r16 // mux1 r17 = r17, @brcst add r27 = 8,r22 sub r28 = 56,r22 sub r24 = 64,r22 ;; add r26 = 8,r19 mov ar.lc = 7 mov ar.ec = LL + SL +OL + 2*AVL + STL mov pr.rot = 1 << 16 ;; .rotr ald1[LL+1],ald2[LL+1],ald3[LL+1],ald4[LL+1],shru1[SL+1],shl1[SL+1],shru2[SL+1],shl2[SL+1],shl3[SL+1],shru3[SL+1],shl4[SL+1],shru4[SL+1],or1[OL+1],or2[OL+1+AL],or3[OL+AL+1],or4[OL+AL+1],avg[AVL+1],avg1[AVL+1],avg2[AVL+1] .rotp aldp[LL], sh1p[SL], or1p[OL],pavg1p[AVL],pavg2p[AVL],stp[STL] .Lloop_interpolate3: (aldp[0]) ld8 ald1[0] = [r14],r16 (aldp[0]) ld8 ald2[0] = [r18],r16 (aldp[0]) ld8 ald3[0] = [r19],r16 (aldp[0]) ld8 ald4[0] = [r26],r16 (sh1p[0]) shr.u shru1[0] = ald1[LL],r22 (sh1p[0]) shl shl1[0] = ald2[LL],r24 (sh1p[0]) shr.u shru2[0] = ald3[LL],r22 (sh1p[0]) shl shl2[0] = ald4[LL],r24 (sh1p[0]) shr.u shru3[0] = ald1[LL],r27 (sh1p[0]) shl shl3[0] = ald2[LL],r28 (sh1p[0]) shr.u shru4[0] = ald3[LL],r27 (sh1p[0]) shl shl4[0] = ald4[LL],r28 (or1p[0]) or or1[0] = shru1[SL],shl1[SL] (or1p[0]) or or2[0] = shru2[SL],shl2[SL] (or1p[0]) or or3[0] = shru3[SL],shl3[SL] (or1p[0]) or or4[0] = shru4[SL],shl4[SL] // (addp[0]) padd1.uus add1[0] = or1[OL],r17 (pavg1p[0]) pavg1 avg[0] = or1[OL],or2[OL] (pavg1p[0]) pavg1 avg1[0] = or3[OL],or4[OL] (pavg2p[0]) pavg1 avg2[0] = avg[AVL],avg1[AVL] (stp[0]) st8 [r15] = avg2[AVL] (stp[0]) add r15 = r15,r16 br.ctop.sptk.few .Lloop_interpolate3 ;; mov ar.lc = r20 mov pr = r21,-1 br.ret.sptk.many b0 .endp interpolate8x8_halfpel_hv_ia64# xvidcore/src/image/ia64_asm/interpolate8x8_ia64_exact.s0000664000076500007650000002362611147310721024036 0ustar xvidbuildxvidbuild// **************************************************************************** // * // * XVID MPEG-4 VIDEO CODEC // * - IA64 halfpel interpolation - // * // * Copyright(C) 2002 Kai Kühn, Alexander Viehl // * // * This program is free software; you can redistribute it and/or modify it // 
* under the terms of the GNU General Public License as published by // * the Free Software Foundation; either version 2 of the License, or // * (at your option) any later version. // * // * This program is distributed in the hope that it will be useful, // * but WITHOUT ANY WARRANTY; without even the implied warranty of // * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // * GNU General Public License for more details. // * // * You should have received a copy of the GNU General Public License // * along with this program; if not, write to the Free Software // * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // * // * $Id: interpolate8x8_ia64_exact.s,v 1.2 2009-02-19 17:07:29 Isibaar Exp $ // * // ***************************************************************************/ // // **************************************************************************** // * // * interpolate8x8_ia64_exact.s, IA-64 halfpel interpolation // * // * This version was implemented during an IA-64 practical training at // * the University of Karlsruhe (http://i44w3.info.uni-karlsruhe.de/) // * // **************************************************************************** // *********************************** // interpolate8x8_ia64.s // optimized for IA-64 // Authors : Kai Kühn // Alexander Viehl // last update : 13.7.2002 // *********************************** .file "interpolate8x8_ia64.s" .pred.safe_across_calls p1-p5,p16-p63 .text .align 16 .global interpolate8x8_halfpel_h_ia64# .proc interpolate8x8_halfpel_h_ia64# interpolate8x8_halfpel_h_ia64: LL=3 SL=1 SL2=1 OL=1 OL2=1 AVL=1 AL=1 PSH=1 ML=1 STL=3 alloc r9=ar.pfs,4, 60,0,64 mov r20 = ar.lc mov r21 = pr dep.z r22 = r33,3,3 // rshift of src and r14 = -8,r33 // align src mov r15 = r32 // get dest mov r16 = r34 // stride sub r17 = 1,r35 // 1-rounding ;; add r18 = 8,r14 mux2 r17 = r17, 0x00 // broadcast 1-rounding sub r24 = 64,r22 // lshift of src add r26 = 8,r22 // rshift of src+1 sub r27 = 56,r22 
// lshift of src+1 mov ar.lc = 7 // loop counter mov ar.ec = LL + SL +OL + AVL + STL + 2*PSH + 2*AL + ML // sum of latencies mov pr.rot = 1 << 16 // init pr regs for sw-pipelining ;; .rotr ald1[LL+1],ald2[LL+1],shru1[SL+1],shl1[SL+1],shru2[SL+1],shl2[SL+1],or1[OL+1+PSH],or2[OL+1+PSH],pshb1[PSH+1],pshb2[PSH+1],pshe1[PSH+1],pshe2[PSH+1+AL],pshe3[PSH+1],pshe4[PSH+1+AL],add1[AL+1],add2[AL+1],add3[AL+1],add4[AL+1],avg1[AVL+1],avg2[AVL+1],pmix1[ML+1] .rotp aldp[LL], sh1p[SL], or1p[OL], pshb[PSH], pshe[PSH],addp[AL],add2p[AL],pavg1p[AVL],mixp[ML],stp[STL] .Lloop_interpolate: (aldp[0]) ld8 ald1[0] = [r14],r16 // load aligned src (aldp[0]) ld8 ald2[0] = [r18],r16 // and aligned src+8 (sh1p[0]) shr.u shru1[0] = ald1[LL],r22 // get src (sh1p[0]) shl shl1[0] = ald2[LL],r27 (sh1p[0]) shr.u shru2[0] = ald1[LL],r26 // get src+1 (sh1p[0]) shl shl2[0] = ald2[LL],r24 (or1p[0]) or or1[0] = shru1[SL],shl2[SL] // merge things (or1p[0]) or or2[0] = shru2[SL],shl1[SL] (pshb[0]) pshl2 pshb1[0] = or1[OL],8 (pshb[0]) pshl2 pshb2[0] = or2[OL],8 (pshe[0]) pshr2.u pshe1[0] = pshb1[PSH],8 (pshe[0]) pshr2.u pshe2[0] = pshb2[PSH],8 (pshe[0]) pshr2.u pshe3[0] = or1[PSH+OL],8 (pshe[0]) pshr2.u pshe4[0] = or2[PSH+OL],8 (addp[0]) padd2.sss add1[0] = pshe1[PSH],r17 // add 1-rounding (addp[0]) padd2.sss add2[0] = pshe3[PSH],r17 // add 1-rounding (add2p[0]) padd2.uus add3[0] = pshe2[AL+PSH],add1[AL] (add2p[0]) padd2.uus add4[0] = pshe4[AL+PSH],add2[AL] (pavg1p[0]) pshr2.u avg1[0] = add3[AL],1 // parallel average (pavg1p[0]) pshr2.u avg2[0] = add4[AL],1 // parallel average (mixp[0]) mix1.r pmix1[0] = avg2[AVL],avg1[AVL] (stp[0]) st8 [r15] = pmix1[ML] // store results (stp[0]) add r15 = r15,r16 br.ctop.sptk.few .Lloop_interpolate ;; mov ar.lc = r20 mov pr = r21,-1 br.ret.sptk.many b0 .endp interpolate8x8_halfpel_h_ia64# .align 16 .global interpolate8x8_halfpel_v_ia64# .proc interpolate8x8_halfpel_v_ia64# interpolate8x8_halfpel_v_ia64: LL=3 SL=1 SL2=1 OL=1 OL2=1 AVL=1 AL=1 PSH=1 ML=1 STL=3 alloc r9=ar.pfs,4, 
60,0,64 mov r20 = ar.lc mov r21 = pr dep.z r22 = r33,3,3 and r14 = -8,r33 mov r15 = r32 mov r16 = r34 sub r17 = 1,r35 ;; add r18 = 8,r14 add r19 = r14,r16 // src + stride mux2 r17 = r17, 0x00 sub r24 = 64,r22 ;; add r26 = 8,r19 // src + stride + 8 mov ar.lc = 7 mov ar.ec = LL + SL +OL + AVL + STL + 2*PSH + 2*AL + ML // sum of latencies mov pr.rot = 1 << 16 ;; .rotr ald1[LL+1],ald2[LL+1],ald3[LL+1],ald4[LL+1],shru1[SL+1],shl1[SL+1],shru2[SL+1],shl2[SL+1],or1[OL+1+PSH],or2[OL+1+PSH],pshb1[PSH+1],pshb2[PSH+1],pshe1[PSH+1],pshe2[PSH+1+AL],pshe3[PSH+1],pshe4[PSH+1+AL],add1[AL+1],add2[AL+1],add3[AL+1],add4[AL+1],avg1[AVL+1],avg2[AVL+1],pmix1[ML+1] .rotp aldp[LL], sh1p[SL], or1p[OL], pshb[PSH], pshe[PSH],addp[AL],add2p[AL],pavg1p[AVL],mixp[ML],stp[STL] .Lloop_interpolate2: (aldp[0]) ld8 ald1[0] = [r14],r16 (aldp[0]) ld8 ald2[0] = [r18],r16 (aldp[0]) ld8 ald3[0] = [r19],r16 (aldp[0]) ld8 ald4[0] = [r26],r16 (sh1p[0]) shr.u shru1[0] = ald1[LL],r22 (sh1p[0]) shl shl1[0] = ald2[LL],r24 (sh1p[0]) shr.u shru2[0] = ald3[LL],r22 (sh1p[0]) shl shl2[0] = ald4[LL],r24 (or1p[0]) or or1[0] = shru1[SL],shl1[SL] (or1p[0]) or or2[0] = shru2[SL],shl2[SL] (pshb[0]) pshl2 pshb1[0] = or1[OL],8 (pshb[0]) pshl2 pshb2[0] = or2[OL],8 (pshe[0]) pshr2.u pshe1[0] = pshb1[PSH],8 (pshe[0]) pshr2.u pshe2[0] = pshb2[PSH],8 (pshe[0]) pshr2.u pshe3[0] = or1[PSH+OL],8 (pshe[0]) pshr2.u pshe4[0] = or2[PSH+OL],8 (addp[0]) padd2.sss add1[0] = pshe1[PSH],r17 // add 1-rounding (addp[0]) padd2.sss add2[0] = pshe3[PSH],r17 // add 1-rounding (add2p[0]) padd2.uus add3[0] = pshe2[AL+PSH],add1[AL] (add2p[0]) padd2.uus add4[0] = pshe4[AL+PSH],add2[AL] (pavg1p[0]) pshr2.u avg1[0] = add3[AL],1 // parallel average (pavg1p[0]) pshr2.u avg2[0] = add4[AL],1 // parallel average (mixp[0]) mix1.r pmix1[0] = avg2[AVL],avg1[AVL] (stp[0]) st8 [r15] = pmix1[ML] (stp[0]) add r15 = r15,r16 br.ctop.sptk.few .Lloop_interpolate2 ;; mov ar.lc = r20 mov pr = r21,-1 br.ret.sptk.many b0 .endp interpolate8x8_halfpel_v_ia64# .align 16 
.global interpolate8x8_halfpel_hv_ia64# .proc interpolate8x8_halfpel_hv_ia64# interpolate8x8_halfpel_hv_ia64: LL=3 SL=1 SL2=1 OL=1 OL2=1 AVL=1 AL=1 PSH=1 ML=1 STL=3 alloc r9=ar.pfs,4, 92,0,96 mov r20 = ar.lc mov r21 = pr dep.z r22 = r33,3,3 and r14 = -8,r33 mov r15 = r32 mov r16 = r34 sub r17 = 2,r35 ;; add r18 = 8,r14 add r19 = r14,r16 mux2 r17 = r17, 0x00 add r27 = 8,r22 sub r28 = 56,r22 sub r24 = 64,r22 ;; add r26 = 8,r19 mov ar.lc = 7 mov ar.ec = LL + SL +OL + AVL + STL + 2*PSH + 2*AL + ML // sum of latencies mov pr.rot = 1 << 16 ;; .rotr ald1[LL+1],ald2[LL+1],ald3[LL+1],ald4[LL+1],shru1[SL+1],shl1[SL+1],shru2[SL+1],shl2[SL+1],shl3[SL+1],shru3[SL+1],shl4[SL+1],shru4[SL+1],or1[OL+1+PSH],or2[OL+1+PSH],or3[OL+1+PSH],or4[OL+1+PSH],pshb1[PSH+1],pshb2[PSH+1],pshb3[PSH+1],pshb4[PSH+1],pshe1[PSH+1],pshe2[PSH+1+AL],pshe3[PSH+1],pshe4[PSH+1+AL],pshe5[PSH+1],pshe6[PSH+1+AL],pshe7[PSH+1],pshe8[PSH+1+AL],add1[AL+1],add2[AL+1],add3[AL+1],add4[AL+1],add5[AL+1],add6[AL+1],add7[AL+1],add8[AL+1],avg1[AVL+1],avg2[AVL+1],pmix1[ML+1] .rotp aldp[LL], sh1p[SL],or1p[OL],pshb[PSH],pshe[PSH],addp[AL],add2p[AL],add3p[AL],pavg1p[AVL],mixp[ML],stp[STL] .Lloop_interpolate3: (aldp[0]) ld8 ald1[0] = [r14],r16 (aldp[0]) ld8 ald2[0] = [r18],r16 (aldp[0]) ld8 ald3[0] = [r19],r16 (aldp[0]) ld8 ald4[0] = [r26],r16 (sh1p[0]) shr.u shru1[0] = ald1[LL],r22 (sh1p[0]) shl shl1[0] = ald2[LL],r24 (sh1p[0]) shr.u shru2[0] = ald3[LL],r22 (sh1p[0]) shl shl2[0] = ald4[LL],r24 (sh1p[0]) shr.u shru3[0] = ald1[LL],r27 (sh1p[0]) shl shl3[0] = ald2[LL],r28 (sh1p[0]) shr.u shru4[0] = ald3[LL],r27 (sh1p[0]) shl shl4[0] = ald4[LL],r28 (or1p[0]) or or1[0] = shru1[SL],shl1[SL] (or1p[0]) or or2[0] = shru2[SL],shl2[SL] (or1p[0]) or or3[0] = shru3[SL],shl3[SL] (or1p[0]) or or4[0] = shru4[SL],shl4[SL] (pshb[0]) pshl2 pshb1[0] = or1[OL],8 (pshb[0]) pshl2 pshb2[0] = or2[OL],8 (pshb[0]) pshl2 pshb3[0] = or3[OL],8 (pshb[0]) pshl2 pshb4[0] = or4[OL],8 (pshe[0]) pshr2.u pshe1[0] = pshb1[PSH],8 (pshe[0]) pshr2.u pshe2[0] = 
pshb2[PSH],8 (pshe[0]) pshr2.u pshe3[0] = or1[PSH+OL],8 (pshe[0]) pshr2.u pshe4[0] = or2[PSH+OL],8 (pshe[0]) pshr2.u pshe5[0] = pshb3[PSH],8 (pshe[0]) pshr2.u pshe6[0] = pshb4[PSH],8 (pshe[0]) pshr2.u pshe7[0] = or3[PSH+OL],8 (pshe[0]) pshr2.u pshe8[0] = or4[PSH+OL],8 (addp[0]) padd2.sss add1[0] = pshe1[PSH],pshe2[PSH] // add vertical neighbours (addp[0]) padd2.sss add2[0] = pshe3[PSH],pshe4[PSH] // add vertical neighbours (addp[0]) padd2.sss add5[0] = pshe5[PSH],pshe6[PSH] // add vertical neighbours (addp[0]) padd2.sss add6[0] = pshe7[PSH],pshe8[PSH] // add vertical neighbours (add2p[0]) padd2.uus add3[0] = add1[AL],add5[AL] (add2p[0]) padd2.uus add4[0] = add2[AL],add6[AL] (add3p[0]) padd2.uus add7[0] = add3[AL],r17 // add 2-rounding (add3p[0]) padd2.uus add8[0] = add4[AL],r17 // add 2-rounding (pavg1p[0]) pshr2.u avg1[0] = add7[AL],2 // parallel average (pavg1p[0]) pshr2.u avg2[0] = add8[AL],2 // parallel average (mixp[0]) mix1.r pmix1[0] = avg2[AVL],avg1[AVL] (stp[0]) st8 [r15] = pmix1[ML] (stp[0]) add r15 = r15,r16 br.ctop.sptk.few .Lloop_interpolate3 ;; mov ar.lc = r20 mov pr = r21,-1 br.ret.sptk.many b0 .endp interpolate8x8_halfpel_hv_ia64# xvidcore/src/image/ia64_asm/README0000664000076500007650000000042607516264227017634 0ustar xvidbuildxvidbuildThe default version (interpolate8x8_ia64.s) of the interpolate functions suffers from some rounding errors, but is fast. In our opinion there is no need to be that exact here. 
Just in case, we provide another set of functions with exact rounding: interpolate8x8_ia64_exact.s xvidcore/src/image/qpel.h0000664000076500007650000004022111564705453016457 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - QPel interpolation - * * Copyright(C) 2003 Pascal Massimino * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: qpel.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _XVID_QPEL_H_ #define _XVID_QPEL_H_ #include "interpolate8x8.h" #include "../utils/mem_transfer.h" /***************************************************************************** * Signatures ****************************************************************************/ #define XVID_QP_PASS_SIGNATURE(NAME) \ void (NAME)(uint8_t *dst, const uint8_t *src, int32_t length, int32_t BpS, int32_t rounding) typedef XVID_QP_PASS_SIGNATURE(XVID_QP_PASS); /* We put everything in a single struct so it can easily be passed * to prediction functions as a whole... */ typedef struct _XVID_QP_FUNCS { /* filter for QPel 16x? 
prediction */ XVID_QP_PASS *H_Pass; XVID_QP_PASS *H_Pass_Avrg; XVID_QP_PASS *H_Pass_Avrg_Up; XVID_QP_PASS *V_Pass; XVID_QP_PASS *V_Pass_Avrg; XVID_QP_PASS *V_Pass_Avrg_Up; /* filter for QPel 8x? prediction */ XVID_QP_PASS *H_Pass_8; XVID_QP_PASS *H_Pass_Avrg_8; XVID_QP_PASS *H_Pass_Avrg_Up_8; XVID_QP_PASS *V_Pass_8; XVID_QP_PASS *V_Pass_Avrg_8; XVID_QP_PASS *V_Pass_Avrg_Up_8; } XVID_QP_FUNCS; /***************************************************************************** * fwd dcl ****************************************************************************/ extern void xvid_Init_QP(); extern XVID_QP_FUNCS xvid_QP_Funcs_C_ref; /* for P-frames */ extern XVID_QP_FUNCS xvid_QP_Add_Funcs_C_ref; /* for B-frames */ extern XVID_QP_FUNCS xvid_QP_Funcs_C; /* for P-frames */ extern XVID_QP_FUNCS xvid_QP_Add_Funcs_C; /* for B-frames */ #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern XVID_QP_FUNCS xvid_QP_Funcs_mmx; extern XVID_QP_FUNCS xvid_QP_Add_Funcs_mmx; #endif #ifdef ARCH_IS_PPC extern XVID_QP_FUNCS xvid_QP_Funcs_Altivec_C; extern XVID_QP_FUNCS xvid_QP_Add_Funcs_Altivec_C; #endif extern XVID_QP_FUNCS *xvid_QP_Funcs; /* <- main pointer for enc/dec structure */ extern XVID_QP_FUNCS *xvid_QP_Add_Funcs; /* <- main pointer for enc/dec structure */ /***************************************************************************** * macros ****************************************************************************/ /***************************************************************************** Passes to be performed case 0: copy case 2: h-pass case 1/3: h-pass + h-avrg case 8: v-pass case 10: h-pass + v-pass case 9/11: h-pass + h-avrg + v-pass case 4/12: v-pass + v-avrg case 6/14: h-pass + v-pass + v-avrg case 5/13/7/15: h-pass + h-avrg + v-pass + v-avrg ****************************************************************************/ static void __inline interpolate16x16_quarterpel(uint8_t * const cur, uint8_t * const refn, uint8_t * const refh, uint8_t * const refv, uint8_t 
* const refhv, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { const uint8_t *src; uint8_t *dst; uint8_t *tmp; int32_t quads; const XVID_QP_FUNCS *Ops; int32_t x_int, y_int; const int32_t xRef = (int)x*4 + dx; const int32_t yRef = (int)y*4 + dy; Ops = xvid_QP_Funcs; quads = (dx&3) | ((dy&3)<<2); x_int = xRef >> 2; y_int = yRef >> 2; dst = cur + y * stride + x; src = refn + y_int * (int)stride + x_int; tmp = refh; /* we need at least a 16 x stride scratch block */ switch(quads) { case 0: transfer8x8_copy(dst, src, stride); transfer8x8_copy(dst+8, src+8, stride); transfer8x8_copy(dst+8*stride, src+8*stride, stride); transfer8x8_copy(dst+8*stride+8, src+8*stride+8, stride); break; case 1: Ops->H_Pass_Avrg(dst, src, 16, stride, rounding); break; case 2: Ops->H_Pass(dst, src, 16, stride, rounding); break; case 3: Ops->H_Pass_Avrg_Up(dst, src, 16, stride, rounding); break; case 4: Ops->V_Pass_Avrg(dst, src, 16, stride, rounding); break; case 5: Ops->H_Pass_Avrg(tmp, src, 17, stride, rounding); Ops->V_Pass_Avrg(dst, tmp, 16, stride, rounding); break; case 6: Ops->H_Pass(tmp, src, 17, stride, rounding); Ops->V_Pass_Avrg(dst, tmp, 16, stride, rounding); break; case 7: Ops->H_Pass_Avrg_Up(tmp, src, 17, stride, rounding); Ops->V_Pass_Avrg(dst, tmp, 16, stride, rounding); break; case 8: Ops->V_Pass(dst, src, 16, stride, rounding); break; case 9: Ops->H_Pass_Avrg(tmp, src, 17, stride, rounding); Ops->V_Pass(dst, tmp, 16, stride, rounding); break; case 10: Ops->H_Pass(tmp, src, 17, stride, rounding); Ops->V_Pass(dst, tmp, 16, stride, rounding); break; case 11: Ops->H_Pass_Avrg_Up(tmp, src, 17, stride, rounding); Ops->V_Pass(dst, tmp, 16, stride, rounding); break; case 12: Ops->V_Pass_Avrg_Up(dst, src, 16, stride, rounding); break; case 13: Ops->H_Pass_Avrg(tmp, src, 17, stride, rounding); Ops->V_Pass_Avrg_Up(dst, tmp, 16, stride, rounding); break; case 14: Ops->H_Pass(tmp, src, 17, stride, rounding); 
		Ops->V_Pass_Avrg_Up(dst, tmp, 16, stride, rounding);
		break;
	case 15:
		Ops->H_Pass_Avrg_Up(tmp, src, 17, stride, rounding);
		Ops->V_Pass_Avrg_Up(dst, tmp, 16, stride, rounding);
		break;
	}
}

static void __inline
interpolate16x16_add_quarterpel(uint8_t * const cur,
				uint8_t * const refn,
				uint8_t * const refh,
				uint8_t * const refv,
				uint8_t * const refhv,
				const uint32_t x, const uint32_t y,
				const int32_t dx, const int dy,
				const uint32_t stride,
				const uint32_t rounding)
{
	const uint8_t *src;
	uint8_t *dst;
	uint8_t *tmp;
	int32_t quads;
	const XVID_QP_FUNCS *Ops;
	const XVID_QP_FUNCS *Ops_Copy;

	int32_t x_int, y_int;

	const int32_t xRef = (int)x*4 + dx;
	const int32_t yRef = (int)y*4 + dy;

	Ops = xvid_QP_Add_Funcs;
	Ops_Copy = xvid_QP_Funcs;
	quads = (dx&3) | ((dy&3)<<2);

	x_int = xRef >> 2;
	y_int = yRef >> 2;

	dst = cur + y * stride + x;
	src = refn + y_int * (int)stride + x_int;

	tmp = refh; /* we need at least a 16 x stride scratch block */

	switch(quads) {
	case 0:
		/* NB: there is no halfpel involved! The function's name can be
		 * misleading */
		interpolate8x8_halfpel_add(dst, src, stride, rounding);
		interpolate8x8_halfpel_add(dst+8, src+8, stride, rounding);
		interpolate8x8_halfpel_add(dst+8*stride, src+8*stride, stride, rounding);
		interpolate8x8_halfpel_add(dst+8*stride+8, src+8*stride+8, stride, rounding);
		break;
	case 1:
		Ops->H_Pass_Avrg(dst, src, 16, stride, rounding);
		break;
	case 2:
		Ops->H_Pass(dst, src, 16, stride, rounding);
		break;
	case 3:
		Ops->H_Pass_Avrg_Up(dst, src, 16, stride, rounding);
		break;
	case 4:
		Ops->V_Pass_Avrg(dst, src, 16, stride, rounding);
		break;
	case 5:
		Ops_Copy->H_Pass_Avrg(tmp, src, 17, stride, rounding);
		Ops->V_Pass_Avrg(dst, tmp, 16, stride, rounding);
		break;
	case 6:
		Ops_Copy->H_Pass(tmp, src, 17, stride, rounding);
		Ops->V_Pass_Avrg(dst, tmp, 16, stride, rounding);
		break;
	case 7:
		Ops_Copy->H_Pass_Avrg_Up(tmp, src, 17, stride, rounding);
		Ops->V_Pass_Avrg(dst, tmp, 16, stride, rounding);
		break;
	case 8:
		Ops->V_Pass(dst, src, 16, stride, rounding);
		break;
	case 9:
Ops_Copy->H_Pass_Avrg(tmp, src, 17, stride, rounding); Ops->V_Pass(dst, tmp, 16, stride, rounding); break; case 10: Ops_Copy->H_Pass(tmp, src, 17, stride, rounding); Ops->V_Pass(dst, tmp, 16, stride, rounding); break; case 11: Ops_Copy->H_Pass_Avrg_Up(tmp, src, 17, stride, rounding); Ops->V_Pass(dst, tmp, 16, stride, rounding); break; case 12: Ops->V_Pass_Avrg_Up(dst, src, 16, stride, rounding); break; case 13: Ops_Copy->H_Pass_Avrg(tmp, src, 17, stride, rounding); Ops->V_Pass_Avrg_Up(dst, tmp, 16, stride, rounding); break; case 14: Ops_Copy->H_Pass(tmp, src, 17, stride, rounding); Ops->V_Pass_Avrg_Up( dst, tmp, 16, stride, rounding); break; case 15: Ops_Copy->H_Pass_Avrg_Up(tmp, src, 17, stride, rounding); Ops->V_Pass_Avrg_Up(dst, tmp, 16, stride, rounding); break; } } static void __inline interpolate16x8_quarterpel(uint8_t * const cur, uint8_t * const refn, uint8_t * const refh, uint8_t * const refv, uint8_t * const refhv, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { const uint8_t *src; uint8_t *dst; uint8_t *tmp; int32_t quads; const XVID_QP_FUNCS *Ops; int32_t x_int, y_int; const int32_t xRef = (int)x*4 + dx; const int32_t yRef = (int)y*4 + dy; Ops = xvid_QP_Funcs; quads = (dx&3) | ((dy&3)<<2); x_int = xRef >> 2; y_int = yRef >> 2; dst = cur + y * stride + x; src = refn + y_int * (int)stride + x_int; tmp = refh; /* we need at least a 16 x stride scratch block */ switch(quads) { case 0: transfer8x8_copy( dst, src, stride); transfer8x8_copy( dst+8, src+8, stride); break; case 1: Ops->H_Pass_Avrg(dst, src, 8, stride, rounding); break; case 2: Ops->H_Pass(dst, src, 8, stride, rounding); break; case 3: Ops->H_Pass_Avrg_Up(dst, src, 8, stride, rounding); break; case 4: Ops->V_Pass_Avrg_8(dst, src, 16, stride, rounding); break; case 5: Ops->H_Pass_Avrg(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_8(dst, tmp, 16, stride, rounding); break; case 6: Ops->H_Pass(tmp, src, 9, stride, rounding); 
Ops->V_Pass_Avrg_8(dst, tmp, 16, stride, rounding); break; case 7: Ops->H_Pass_Avrg_Up(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_8(dst, tmp, 16, stride, rounding); break; case 8: Ops->V_Pass_8(dst, src, 16, stride, rounding); break; case 9: Ops->H_Pass_Avrg(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 16, stride, rounding); break; case 10: Ops->H_Pass(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 16, stride, rounding); break; case 11: Ops->H_Pass_Avrg_Up(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 16, stride, rounding); break; case 12: Ops->V_Pass_Avrg_Up_8(dst, src, 16, stride, rounding); break; case 13: Ops->H_Pass_Avrg(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8(dst, tmp, 16, stride, rounding); break; case 14: Ops->H_Pass(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8( dst, tmp, 16, stride, rounding); break; case 15: Ops->H_Pass_Avrg_Up(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8(dst, tmp, 16, stride, rounding); break; } } static void __inline interpolate8x8_quarterpel(uint8_t * const cur, uint8_t * const refn, uint8_t * const refh, uint8_t * const refv, uint8_t * const refhv, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { const uint8_t *src; uint8_t *dst; uint8_t *tmp; int32_t quads; const XVID_QP_FUNCS *Ops; int32_t x_int, y_int; const int32_t xRef = (int)x*4 + dx; const int32_t yRef = (int)y*4 + dy; Ops = xvid_QP_Funcs; quads = (dx&3) | ((dy&3)<<2); x_int = xRef >> 2; y_int = yRef >> 2; dst = cur + y * stride + x; src = refn + y_int * (int)stride + x_int; tmp = refh; /* we need at least a 16 x stride scratch block */ switch(quads) { case 0: transfer8x8_copy( dst, src, stride); break; case 1: Ops->H_Pass_Avrg_8(dst, src, 8, stride, rounding); break; case 2: Ops->H_Pass_8(dst, src, 8, stride, rounding); break; case 3: Ops->H_Pass_Avrg_Up_8(dst, src, 8, stride, rounding); break; case 4: Ops->V_Pass_Avrg_8(dst, src, 8, stride, 
rounding); break; case 5: Ops->H_Pass_Avrg_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_8(dst, tmp, 8, stride, rounding); break; case 6: Ops->H_Pass_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_8(dst, tmp, 8, stride, rounding); break; case 7: Ops->H_Pass_Avrg_Up_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_8(dst, tmp, 8, stride, rounding); break; case 8: Ops->V_Pass_8(dst, src, 8, stride, rounding); break; case 9: Ops->H_Pass_Avrg_8(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 8, stride, rounding); break; case 10: Ops->H_Pass_8(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 8, stride, rounding); break; case 11: Ops->H_Pass_Avrg_Up_8(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 8, stride, rounding); break; case 12: Ops->V_Pass_Avrg_Up_8(dst, src, 8, stride, rounding); break; case 13: Ops->H_Pass_Avrg_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8(dst, tmp, 8, stride, rounding); break; case 14: Ops->H_Pass_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8( dst, tmp, 8, stride, rounding); break; case 15: Ops->H_Pass_Avrg_Up_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8(dst, tmp, 8, stride, rounding); break; } } static void __inline interpolate8x8_add_quarterpel(uint8_t * const cur, uint8_t * const refn, uint8_t * const refh, uint8_t * const refv, uint8_t * const refhv, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { const uint8_t *src; uint8_t *dst; uint8_t *tmp; int32_t quads; const XVID_QP_FUNCS *Ops; const XVID_QP_FUNCS *Ops_Copy; int32_t x_int, y_int; const int32_t xRef = (int)x*4 + dx; const int32_t yRef = (int)y*4 + dy; Ops = xvid_QP_Add_Funcs; Ops_Copy = xvid_QP_Funcs; quads = (dx&3) | ((dy&3)<<2); x_int = xRef >> 2; y_int = yRef >> 2; dst = cur + y * stride + x; src = refn + y_int * (int)stride + x_int; tmp = refh; /* we need at least a 16 x stride scratch block */ switch(quads) { case 0: /* Misleading function name, 
there is no halfpel involved * just dst and src averaging with rounding=0 */ interpolate8x8_halfpel_add(dst, src, stride, rounding); break; case 1: Ops->H_Pass_Avrg_8(dst, src, 8, stride, rounding); break; case 2: Ops->H_Pass_8(dst, src, 8, stride, rounding); break; case 3: Ops->H_Pass_Avrg_Up_8(dst, src, 8, stride, rounding); break; case 4: Ops->V_Pass_Avrg_8(dst, src, 8, stride, rounding); break; case 5: Ops_Copy->H_Pass_Avrg_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_8(dst, tmp, 8, stride, rounding); break; case 6: Ops_Copy->H_Pass_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_8(dst, tmp, 8, stride, rounding); break; case 7: Ops_Copy->H_Pass_Avrg_Up_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_8(dst, tmp, 8, stride, rounding); break; case 8: Ops->V_Pass_8(dst, src, 8, stride, rounding); break; case 9: Ops_Copy->H_Pass_Avrg_8(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 8, stride, rounding); break; case 10: Ops_Copy->H_Pass_8(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 8, stride, rounding); break; case 11: Ops_Copy->H_Pass_Avrg_Up_8(tmp, src, 9, stride, rounding); Ops->V_Pass_8(dst, tmp, 8, stride, rounding); break; case 12: Ops->V_Pass_Avrg_Up_8(dst, src, 8, stride, rounding); break; case 13: Ops_Copy->H_Pass_Avrg_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8(dst, tmp, 8, stride, rounding); break; case 14: Ops_Copy->H_Pass_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8( dst, tmp, 8, stride, rounding); break; case 15: Ops_Copy->H_Pass_Avrg_Up_8(tmp, src, 9, stride, rounding); Ops->V_Pass_Avrg_Up_8(dst, tmp, 8, stride, rounding); break; } } #endif /* _XVID_QPEL_H_ */ xvidcore/src/image/interpolate8x8.c0000664000076500007650000010313111564705453020407 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - 8x8 block-based halfpel interpolation - * * Copyright(C) 2001-2003 Peter Ross * * This program is free software ; you can 
redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation ; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY ; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program ; if not, write to the Free Software
 *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
 *
 * $Id: interpolate8x8.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#include "../portab.h"
#include "../global.h"
#include "interpolate8x8.h"

/* function pointers */
INTERPOLATE8X8_PTR interpolate8x8_halfpel_h;
INTERPOLATE8X8_PTR interpolate8x8_halfpel_v;
INTERPOLATE8X8_PTR interpolate8x8_halfpel_hv;

INTERPOLATE8X8_PTR interpolate8x4_halfpel_h;
INTERPOLATE8X8_PTR interpolate8x4_halfpel_v;
INTERPOLATE8X8_PTR interpolate8x4_halfpel_hv;

INTERPOLATE8X8_PTR interpolate8x8_halfpel_add;
INTERPOLATE8X8_PTR interpolate8x8_halfpel_h_add;
INTERPOLATE8X8_PTR interpolate8x8_halfpel_v_add;
INTERPOLATE8X8_PTR interpolate8x8_halfpel_hv_add;

INTERPOLATE8X8_AVG2_PTR interpolate8x8_avg2;
INTERPOLATE8X8_AVG4_PTR interpolate8x8_avg4;

INTERPOLATE_LOWPASS_PTR interpolate8x8_lowpass_h;
INTERPOLATE_LOWPASS_PTR interpolate8x8_lowpass_v;

INTERPOLATE_LOWPASS_PTR interpolate16x16_lowpass_h;
INTERPOLATE_LOWPASS_PTR interpolate16x16_lowpass_v;

INTERPOLATE_LOWPASS_HV_PTR interpolate8x8_lowpass_hv;
INTERPOLATE_LOWPASS_HV_PTR interpolate16x16_lowpass_hv;

INTERPOLATE8X8_6TAP_LOWPASS_PTR interpolate8x8_6tap_lowpass_h;
INTERPOLATE8X8_6TAP_LOWPASS_PTR interpolate8x8_6tap_lowpass_v;

void
interpolate8x8_avg2_c(uint8_t * dst,
		      const uint8_t * src1,
		      const uint8_t *src2,
		      const uint32_t
stride, const uint32_t rounding, const uint32_t height) { uint32_t i; const int32_t round = 1 - rounding; for(i = 0; i < height; i++) { dst[0] = (src1[0] + src2[0] + round) >> 1; dst[1] = (src1[1] + src2[1] + round) >> 1; dst[2] = (src1[2] + src2[2] + round) >> 1; dst[3] = (src1[3] + src2[3] + round) >> 1; dst[4] = (src1[4] + src2[4] + round) >> 1; dst[5] = (src1[5] + src2[5] + round) >> 1; dst[6] = (src1[6] + src2[6] + round) >> 1; dst[7] = (src1[7] + src2[7] + round) >> 1; dst += stride; src1 += stride; src2 += stride; } } void interpolate8x8_halfpel_add_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding) { interpolate8x8_avg2_c(dst, dst, src, stride, 0, 8); } void interpolate8x8_avg4_c(uint8_t *dst, const uint8_t *src1, const uint8_t *src2, const uint8_t *src3, const uint8_t *src4, const uint32_t stride, const uint32_t rounding) { int32_t i; const int32_t round = 2 - rounding; for(i = 0; i < 8; i++) { dst[0] = (src1[0] + src2[0] + src3[0] + src4[0] + round) >> 2; dst[1] = (src1[1] + src2[1] + src3[1] + src4[1] + round) >> 2; dst[2] = (src1[2] + src2[2] + src3[2] + src4[2] + round) >> 2; dst[3] = (src1[3] + src2[3] + src3[3] + src4[3] + round) >> 2; dst[4] = (src1[4] + src2[4] + src3[4] + src4[4] + round) >> 2; dst[5] = (src1[5] + src2[5] + src3[5] + src4[5] + round) >> 2; dst[6] = (src1[6] + src2[6] + src3[6] + src4[6] + round) >> 2; dst[7] = (src1[7] + src2[7] + src3[7] + src4[7] + round) >> 2; dst += stride; src1 += stride; src2 += stride; src3 += stride; src4 += stride; } } /* dst = interpolate(src) */ void interpolate8x8_halfpel_h_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding) { uintptr_t j; if (rounding) { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j + 0] + src[j + 1] )>>1); dst[j + 1] = (uint8_t)((src[j + 1] + src[j + 2] )>>1); dst[j + 2] = (uint8_t)((src[j + 2] + src[j + 3] )>>1); dst[j + 3] = (uint8_t)((src[j + 3] + src[j + 4] )>>1); 
			dst[j + 4] = (uint8_t)((src[j + 4] + src[j + 5])>>1);
			dst[j + 5] = (uint8_t)((src[j + 5] + src[j + 6])>>1);
			dst[j + 6] = (uint8_t)((src[j + 6] + src[j + 7])>>1);
			dst[j + 7] = (uint8_t)((src[j + 7] + src[j + 8])>>1);
		}
	} else {
		for (j = 0; j < 8*stride; j+=stride) {
			dst[j + 0] = (uint8_t)((src[j + 0] + src[j + 1] + 1)>>1);
			dst[j + 1] = (uint8_t)((src[j + 1] + src[j + 2] + 1)>>1);
			dst[j + 2] = (uint8_t)((src[j + 2] + src[j + 3] + 1)>>1);
			dst[j + 3] = (uint8_t)((src[j + 3] + src[j + 4] + 1)>>1);
			dst[j + 4] = (uint8_t)((src[j + 4] + src[j + 5] + 1)>>1);
			dst[j + 5] = (uint8_t)((src[j + 5] + src[j + 6] + 1)>>1);
			dst[j + 6] = (uint8_t)((src[j + 6] + src[j + 7] + 1)>>1);
			dst[j + 7] = (uint8_t)((src[j + 7] + src[j + 8] + 1)>>1);
		}
	}
}

/* dst = interpolate(src) */

void
interpolate8x4_halfpel_h_c(uint8_t * const dst,
			   const uint8_t * const src,
			   const uint32_t stride,
			   const uint32_t rounding)
{
	uintptr_t j;

	if (rounding) {
		for (j = 0; j < 4*stride; j+=stride) {
			dst[j + 0] = (uint8_t)((src[j + 0] + src[j + 1])>>1);
			dst[j + 1] = (uint8_t)((src[j + 1] + src[j + 2])>>1);
			dst[j + 2] = (uint8_t)((src[j + 2] + src[j + 3])>>1);
			dst[j + 3] = (uint8_t)((src[j + 3] + src[j + 4])>>1);
			dst[j + 4] = (uint8_t)((src[j + 4] + src[j + 5])>>1);
			dst[j + 5] = (uint8_t)((src[j + 5] + src[j + 6])>>1);
			dst[j + 6] = (uint8_t)((src[j + 6] + src[j + 7])>>1);
			dst[j + 7] = (uint8_t)((src[j + 7] + src[j + 8])>>1);
		}
	} else {
		for (j = 0; j < 4*stride; j+=stride) {
			dst[j + 0] = (uint8_t)((src[j + 0] + src[j + 1] + 1)>>1);
			dst[j + 1] = (uint8_t)((src[j + 1] + src[j + 2] + 1)>>1);
			dst[j + 2] = (uint8_t)((src[j + 2] + src[j + 3] + 1)>>1);
			dst[j + 3] = (uint8_t)((src[j + 3] + src[j + 4] + 1)>>1);
			dst[j + 4] = (uint8_t)((src[j + 4] + src[j + 5] + 1)>>1);
			dst[j + 5] = (uint8_t)((src[j + 5] + src[j + 6] + 1)>>1);
			dst[j + 6] = (uint8_t)((src[j + 6] + src[j + 7] + 1)>>1);
			dst[j + 7] = (uint8_t)((src[j + 7] + src[j + 8] + 1)>>1);
		}
	}
}

/* dst = (dst + interpolate(src))/2 */

void
interpolate8x8_halfpel_h_add_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding) { uintptr_t j; if (rounding) { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((((src[j + 0] + src[j + 1] )>>1) + dst[j+0] + 1)>>1); dst[j + 1] = (uint8_t)((((src[j + 1] + src[j + 2] )>>1) + dst[j+1] + 1)>>1); dst[j + 2] = (uint8_t)((((src[j + 2] + src[j + 3] )>>1) + dst[j+2] + 1)>>1); dst[j + 3] = (uint8_t)((((src[j + 3] + src[j + 4] )>>1) + dst[j+3] + 1)>>1); dst[j + 4] = (uint8_t)((((src[j + 4] + src[j + 5] )>>1) + dst[j+4] + 1)>>1); dst[j + 5] = (uint8_t)((((src[j + 5] + src[j + 6] )>>1) + dst[j+5] + 1)>>1); dst[j + 6] = (uint8_t)((((src[j + 6] + src[j + 7] )>>1) + dst[j+6] + 1)>>1); dst[j + 7] = (uint8_t)((((src[j + 7] + src[j + 8] )>>1) + dst[j+7] + 1)>>1); } } else { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((((src[j + 0] + src[j + 1] + 1)>>1) + dst[j+0] + 1)>>1); dst[j + 1] = (uint8_t)((((src[j + 1] + src[j + 2] + 1)>>1) + dst[j+1] + 1)>>1); dst[j + 2] = (uint8_t)((((src[j + 2] + src[j + 3] + 1)>>1) + dst[j+2] + 1)>>1); dst[j + 3] = (uint8_t)((((src[j + 3] + src[j + 4] + 1)>>1) + dst[j+3] + 1)>>1); dst[j + 4] = (uint8_t)((((src[j + 4] + src[j + 5] + 1)>>1) + dst[j+4] + 1)>>1); dst[j + 5] = (uint8_t)((((src[j + 5] + src[j + 6] + 1)>>1) + dst[j+5] + 1)>>1); dst[j + 6] = (uint8_t)((((src[j + 6] + src[j + 7] + 1)>>1) + dst[j+6] + 1)>>1); dst[j + 7] = (uint8_t)((((src[j + 7] + src[j + 8] + 1)>>1) + dst[j+7] + 1)>>1); } } } /* dst = interpolate(src) */ void interpolate8x8_halfpel_v_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding) { uintptr_t j; if (rounding) { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j + 0] + src[j + stride + 0] )>>1); dst[j + 1] = (uint8_t)((src[j + 1] + src[j + stride + 1] )>>1); dst[j + 2] = (uint8_t)((src[j + 2] + src[j + stride + 2] )>>1); dst[j + 3] = (uint8_t)((src[j + 3] + src[j + stride + 3] )>>1); dst[j + 
4] = (uint8_t)((src[j + 4] + src[j + stride + 4] )>>1); dst[j + 5] = (uint8_t)((src[j + 5] + src[j + stride + 5] )>>1); dst[j + 6] = (uint8_t)((src[j + 6] + src[j + stride + 6] )>>1); dst[j + 7] = (uint8_t)((src[j + 7] + src[j + stride + 7] )>>1); } } else { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j + 0] + src[j + stride + 0] + 1)>>1); dst[j + 1] = (uint8_t)((src[j + 1] + src[j + stride + 1] + 1)>>1); dst[j + 2] = (uint8_t)((src[j + 2] + src[j + stride + 2] + 1)>>1); dst[j + 3] = (uint8_t)((src[j + 3] + src[j + stride + 3] + 1)>>1); dst[j + 4] = (uint8_t)((src[j + 4] + src[j + stride + 4] + 1)>>1); dst[j + 5] = (uint8_t)((src[j + 5] + src[j + stride + 5] + 1)>>1); dst[j + 6] = (uint8_t)((src[j + 6] + src[j + stride + 6] + 1)>>1); dst[j + 7] = (uint8_t)((src[j + 7] + src[j + stride + 7] + 1)>>1); } } } /* dst = interpolate(src) */ void interpolate8x4_halfpel_v_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding) { uintptr_t j; if (rounding) { for (j = 0; j < 4*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j + 0] + src[j + stride + 0] )>>1); dst[j + 1] = (uint8_t)((src[j + 1] + src[j + stride + 1] )>>1); dst[j + 2] = (uint8_t)((src[j + 2] + src[j + stride + 2] )>>1); dst[j + 3] = (uint8_t)((src[j + 3] + src[j + stride + 3] )>>1); dst[j + 4] = (uint8_t)((src[j + 4] + src[j + stride + 4] )>>1); dst[j + 5] = (uint8_t)((src[j + 5] + src[j + stride + 5] )>>1); dst[j + 6] = (uint8_t)((src[j + 6] + src[j + stride + 6] )>>1); dst[j + 7] = (uint8_t)((src[j + 7] + src[j + stride + 7] )>>1); } } else { for (j = 0; j < 4*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j + 0] + src[j + stride + 0] + 1)>>1); dst[j + 1] = (uint8_t)((src[j + 1] + src[j + stride + 1] + 1)>>1); dst[j + 2] = (uint8_t)((src[j + 2] + src[j + stride + 2] + 1)>>1); dst[j + 3] = (uint8_t)((src[j + 3] + src[j + stride + 3] + 1)>>1); dst[j + 4] = (uint8_t)((src[j + 4] + src[j + stride + 4] + 1)>>1); dst[j + 5] = (uint8_t)((src[j + 5] 
+ src[j + stride + 5] + 1)>>1); dst[j + 6] = (uint8_t)((src[j + 6] + src[j + stride + 6] + 1)>>1); dst[j + 7] = (uint8_t)((src[j + 7] + src[j + stride + 7] + 1)>>1); } } } /* dst = (dst + interpolate(src))/2 */ void interpolate8x8_halfpel_v_add_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding) { uintptr_t j; if (rounding) { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((((src[j + 0] + src[j + stride + 0] )>>1) + dst[j+0] + 1)>>1); dst[j + 1] = (uint8_t)((((src[j + 1] + src[j + stride + 1] )>>1) + dst[j+1] + 1)>>1); dst[j + 2] = (uint8_t)((((src[j + 2] + src[j + stride + 2] )>>1) + dst[j+2] + 1)>>1); dst[j + 3] = (uint8_t)((((src[j + 3] + src[j + stride + 3] )>>1) + dst[j+3] + 1)>>1); dst[j + 4] = (uint8_t)((((src[j + 4] + src[j + stride + 4] )>>1) + dst[j+4] + 1)>>1); dst[j + 5] = (uint8_t)((((src[j + 5] + src[j + stride + 5] )>>1) + dst[j+5] + 1)>>1); dst[j + 6] = (uint8_t)((((src[j + 6] + src[j + stride + 6] )>>1) + dst[j+6] + 1)>>1); dst[j + 7] = (uint8_t)((((src[j + 7] + src[j + stride + 7] )>>1) + dst[j+7] + 1)>>1); } } else { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((((src[j + 0] + src[j + stride + 0] + 1)>>1) + dst[j+0] + 1)>>1); dst[j + 1] = (uint8_t)((((src[j + 1] + src[j + stride + 1] + 1)>>1) + dst[j+1] + 1)>>1); dst[j + 2] = (uint8_t)((((src[j + 2] + src[j + stride + 2] + 1)>>1) + dst[j+2] + 1)>>1); dst[j + 3] = (uint8_t)((((src[j + 3] + src[j + stride + 3] + 1)>>1) + dst[j+3] + 1)>>1); dst[j + 4] = (uint8_t)((((src[j + 4] + src[j + stride + 4] + 1)>>1) + dst[j+4] + 1)>>1); dst[j + 5] = (uint8_t)((((src[j + 5] + src[j + stride + 5] + 1)>>1) + dst[j+5] + 1)>>1); dst[j + 6] = (uint8_t)((((src[j + 6] + src[j + stride + 6] + 1)>>1) + dst[j+6] + 1)>>1); dst[j + 7] = (uint8_t)((((src[j + 7] + src[j + stride + 7] + 1)>>1) + dst[j+7] + 1)>>1); } } } /* dst = interpolate(src) */ void interpolate8x8_halfpel_hv_c(uint8_t * const dst, const uint8_t * const src, const uint32_t 
stride, const uint32_t rounding) { uintptr_t j; if (rounding) { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j+0] + src[j+1] + src[j+stride+0] + src[j+stride+1] +1)>>2); dst[j + 1] = (uint8_t)((src[j+1] + src[j+2] + src[j+stride+1] + src[j+stride+2] +1)>>2); dst[j + 2] = (uint8_t)((src[j+2] + src[j+3] + src[j+stride+2] + src[j+stride+3] +1)>>2); dst[j + 3] = (uint8_t)((src[j+3] + src[j+4] + src[j+stride+3] + src[j+stride+4] +1)>>2); dst[j + 4] = (uint8_t)((src[j+4] + src[j+5] + src[j+stride+4] + src[j+stride+5] +1)>>2); dst[j + 5] = (uint8_t)((src[j+5] + src[j+6] + src[j+stride+5] + src[j+stride+6] +1)>>2); dst[j + 6] = (uint8_t)((src[j+6] + src[j+7] + src[j+stride+6] + src[j+stride+7] +1)>>2); dst[j + 7] = (uint8_t)((src[j+7] + src[j+8] + src[j+stride+7] + src[j+stride+8] +1)>>2); } } else { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j+0] + src[j+1] + src[j+stride+0] + src[j+stride+1] +2)>>2); dst[j + 1] = (uint8_t)((src[j+1] + src[j+2] + src[j+stride+1] + src[j+stride+2] +2)>>2); dst[j + 2] = (uint8_t)((src[j+2] + src[j+3] + src[j+stride+2] + src[j+stride+3] +2)>>2); dst[j + 3] = (uint8_t)((src[j+3] + src[j+4] + src[j+stride+3] + src[j+stride+4] +2)>>2); dst[j + 4] = (uint8_t)((src[j+4] + src[j+5] + src[j+stride+4] + src[j+stride+5] +2)>>2); dst[j + 5] = (uint8_t)((src[j+5] + src[j+6] + src[j+stride+5] + src[j+stride+6] +2)>>2); dst[j + 6] = (uint8_t)((src[j+6] + src[j+7] + src[j+stride+6] + src[j+stride+7] +2)>>2); dst[j + 7] = (uint8_t)((src[j+7] + src[j+8] + src[j+stride+7] + src[j+stride+8] +2)>>2); } } } /* dst = interpolate(src) */ void interpolate8x4_halfpel_hv_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding) { uintptr_t j; if (rounding) { for (j = 0; j < 4*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j+0] + src[j+1] + src[j+stride+0] + src[j+stride+1] +1)>>2); dst[j + 1] = (uint8_t)((src[j+1] + src[j+2] + src[j+stride+1] + src[j+stride+2] +1)>>2); dst[j + 2] 
= (uint8_t)((src[j+2] + src[j+3] + src[j+stride+2] + src[j+stride+3] +1)>>2); dst[j + 3] = (uint8_t)((src[j+3] + src[j+4] + src[j+stride+3] + src[j+stride+4] +1)>>2); dst[j + 4] = (uint8_t)((src[j+4] + src[j+5] + src[j+stride+4] + src[j+stride+5] +1)>>2); dst[j + 5] = (uint8_t)((src[j+5] + src[j+6] + src[j+stride+5] + src[j+stride+6] +1)>>2); dst[j + 6] = (uint8_t)((src[j+6] + src[j+7] + src[j+stride+6] + src[j+stride+7] +1)>>2); dst[j + 7] = (uint8_t)((src[j+7] + src[j+8] + src[j+stride+7] + src[j+stride+8] +1)>>2); } } else { for (j = 0; j < 4*stride; j+=stride) { dst[j + 0] = (uint8_t)((src[j+0] + src[j+1] + src[j+stride+0] + src[j+stride+1] +2)>>2); dst[j + 1] = (uint8_t)((src[j+1] + src[j+2] + src[j+stride+1] + src[j+stride+2] +2)>>2); dst[j + 2] = (uint8_t)((src[j+2] + src[j+3] + src[j+stride+2] + src[j+stride+3] +2)>>2); dst[j + 3] = (uint8_t)((src[j+3] + src[j+4] + src[j+stride+3] + src[j+stride+4] +2)>>2); dst[j + 4] = (uint8_t)((src[j+4] + src[j+5] + src[j+stride+4] + src[j+stride+5] +2)>>2); dst[j + 5] = (uint8_t)((src[j+5] + src[j+6] + src[j+stride+5] + src[j+stride+6] +2)>>2); dst[j + 6] = (uint8_t)((src[j+6] + src[j+7] + src[j+stride+6] + src[j+stride+7] +2)>>2); dst[j + 7] = (uint8_t)((src[j+7] + src[j+8] + src[j+stride+7] + src[j+stride+8] +2)>>2); } } } /* dst = (interpolate(src) + dst)/2 */ void interpolate8x8_halfpel_hv_add_c(uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding) { uintptr_t j; if (rounding) { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((((src[j+0] + src[j+1] + src[j+stride+0] + src[j+stride+1] +1)>>2) + dst[j+0])>>1); dst[j + 1] = (uint8_t)((((src[j+1] + src[j+2] + src[j+stride+1] + src[j+stride+2] +1)>>2) + dst[j+1])>>1); dst[j + 2] = (uint8_t)((((src[j+2] + src[j+3] + src[j+stride+2] + src[j+stride+3] +1)>>2) + dst[j+2])>>1); dst[j + 3] = (uint8_t)((((src[j+3] + src[j+4] + src[j+stride+3] + src[j+stride+4] +1)>>2) + dst[j+3])>>1); dst[j + 4] = (uint8_t)((((src[j+4] + 
src[j+5] + src[j+stride+4] + src[j+stride+5] +1)>>2) + dst[j+4])>>1); dst[j + 5] = (uint8_t)((((src[j+5] + src[j+6] + src[j+stride+5] + src[j+stride+6] +1)>>2) + dst[j+5])>>1); dst[j + 6] = (uint8_t)((((src[j+6] + src[j+7] + src[j+stride+6] + src[j+stride+7] +1)>>2) + dst[j+6])>>1); dst[j + 7] = (uint8_t)((((src[j+7] + src[j+8] + src[j+stride+7] + src[j+stride+8] +1)>>2) + dst[j+7])>>1); } } else { for (j = 0; j < 8*stride; j+=stride) { dst[j + 0] = (uint8_t)((((src[j+0] + src[j+1] + src[j+stride+0] + src[j+stride+1] +2)>>2) + dst[j+0] + 1)>>1); dst[j + 1] = (uint8_t)((((src[j+1] + src[j+2] + src[j+stride+1] + src[j+stride+2] +2)>>2) + dst[j+1] + 1)>>1); dst[j + 2] = (uint8_t)((((src[j+2] + src[j+3] + src[j+stride+2] + src[j+stride+3] +2)>>2) + dst[j+2] + 1)>>1); dst[j + 3] = (uint8_t)((((src[j+3] + src[j+4] + src[j+stride+3] + src[j+stride+4] +2)>>2) + dst[j+3] + 1)>>1); dst[j + 4] = (uint8_t)((((src[j+4] + src[j+5] + src[j+stride+4] + src[j+stride+5] +2)>>2) + dst[j+4] + 1)>>1); dst[j + 5] = (uint8_t)((((src[j+5] + src[j+6] + src[j+stride+5] + src[j+stride+6] +2)>>2) + dst[j+5] + 1)>>1); dst[j + 6] = (uint8_t)((((src[j+6] + src[j+7] + src[j+stride+6] + src[j+stride+7] +2)>>2) + dst[j+6] + 1)>>1); dst[j + 7] = (uint8_t)((((src[j+7] + src[j+8] + src[j+stride+7] + src[j+stride+8] +2)>>2) + dst[j+7] + 1)>>1); } } } /************************************************************* * QPEL STUFF STARTS HERE * *************************************************************/ void interpolate8x8_6tap_lowpass_h_c(uint8_t *dst, uint8_t *src, int32_t stride, int32_t rounding) { int32_t i; uint8_t round_add = 16 - rounding; for(i = 0; i < 8; i++) { dst[0] = CLIP((((src[-2] + src[3]) + 5 * (((src[0] + src[1])<<2) - (src[-1] + src[2])) + round_add) >> 5), 0, 255); dst[1] = CLIP((((src[-1] + src[4]) + 5 * (((src[1] + src[2])<<2) - (src[0] + src[3])) + round_add) >> 5), 0, 255); dst[2] = CLIP((((src[0] + src[5]) + 5 * (((src[2] + src[3])<<2) - (src[1] + src[4])) + round_add) >> 5), 0, 
255); dst[3] = CLIP((((src[1] + src[6]) + 5 * (((src[3] + src[4])<<2) - (src[2] + src[5])) + round_add) >> 5), 0, 255); dst[4] = CLIP((((src[2] + src[7]) + 5 * (((src[4] + src[5])<<2) - (src[3] + src[6])) + round_add) >> 5), 0, 255); dst[5] = CLIP((((src[3] + src[8]) + 5 * (((src[5] + src[6])<<2) - (src[4] + src[7])) + round_add) >> 5), 0, 255); dst[6] = CLIP((((src[4] + src[9]) + 5 * (((src[6] + src[7])<<2) - (src[5] + src[8])) + round_add) >> 5), 0, 255); dst[7] = CLIP((((src[5] + src[10]) + 5 * (((src[7] + src[8])<<2) - (src[6] + src[9])) + round_add) >> 5), 0, 255); dst += stride; src += stride; } } void interpolate16x16_lowpass_h_c(uint8_t *dst, uint8_t *src, int32_t stride, int32_t rounding) { int32_t i; uint8_t round_add = 16 - rounding; for(i = 0; i < 17; i++) { dst[0] = CLIP(((7 * ((src[0]<<1) - src[2]) + 23 * src[1] + 3 * src[3] - src[4] + round_add) >> 5), 0, 255); dst[1] = CLIP(((19 * src[1] + 20 * src[2] - src[5] + 3 * (src[4] - src[0] - (src[3]<<1)) + round_add) >> 5), 0, 255); dst[2] = CLIP(((20 * (src[2] + src[3]) + (src[0]<<1) + 3 * (src[5] - ((src[1] + src[4])<<1)) - src[6] + round_add) >> 5), 0, 255); dst[3] = CLIP(((20 * (src[3] + src[4]) + 3 * ((src[6] + src[1]) - ((src[2] + src[5])<<1)) - (src[0] + src[7]) + round_add) >> 5), 0, 255); dst[4] = CLIP(((20 * (src[4] + src[5]) - 3 * (((src[3] + src[6])<<1) - (src[2] + src[7])) - (src[1] + src[8]) + round_add) >> 5), 0, 255); dst[5] = CLIP(((20 * (src[5] + src[6]) - 3 * (((src[4] + src[7])<<1) - (src[3] + src[8])) - (src[2] + src[9]) + round_add) >> 5), 0, 255); dst[6] = CLIP(((20 * (src[6] + src[7]) - 3 * (((src[5] + src[8])<<1) - (src[4] + src[9])) - (src[3] + src[10]) + round_add) >> 5), 0, 255); dst[7] = CLIP(((20 * (src[7] + src[8]) - 3 * (((src[6] + src[9])<<1) - (src[5] + src[10])) - (src[4] + src[11]) + round_add) >> 5), 0, 255); dst[8] = CLIP(((20 * (src[8] + src[9]) - 3 * (((src[7] + src[10])<<1) - (src[6] + src[11])) - (src[5] + src[12]) + round_add) >> 5), 0, 255); dst[9] = CLIP(((20 * 
(src[9] + src[10]) - 3 * (((src[8] + src[11])<<1) - (src[7] + src[12])) - (src[6] + src[13]) + round_add) >> 5), 0, 255); dst[10] = CLIP(((20 * (src[10] + src[11]) - 3 * (((src[9] + src[12])<<1) - (src[8] + src[13])) - (src[7] + src[14]) + round_add) >> 5), 0, 255); dst[11] = CLIP(((20 * (src[11] + src[12]) - 3 * (((src[10] + src[13])<<1) - (src[9] + src[14])) - (src[8] + src[15]) + round_add) >> 5), 0, 255); dst[12] = CLIP(((20 * (src[12] + src[13]) - 3 * (((src[11] + src[14])<<1) - (src[10] + src[15])) - (src[9] + src[16]) + round_add) >> 5), 0, 255); dst[13] = CLIP(((20 * (src[13] + src[14]) + (src[16]<<1) + 3 * (src[11] - ((src[12] + src[15]) << 1)) - src[10] + round_add) >> 5), 0, 255); dst[14] = CLIP(((19 * src[15] + 20 * src[14] + 3 * (src[12] - src[16] - (src[13] << 1)) - src[11] + round_add) >> 5), 0, 255); dst[15] = CLIP(((23 * src[15] + 7 * ((src[16]<<1) - src[14]) + 3 * src[13] - src[12] + round_add) >> 5), 0, 255); dst += stride; src += stride; } } void interpolate8x8_lowpass_h_c(uint8_t *dst, uint8_t *src, int32_t stride, int32_t rounding) { int32_t i; uint8_t round_add = 16 - rounding; for(i = 0; i < 9; i++) { dst[0] = CLIP(((7 * ((src[0]<<1) - src[2]) + 23 * src[1] + 3 * src[3] - src[4] + round_add) >> 5), 0, 255); dst[1] = CLIP(((19 * src[1] + 20 * src[2] - src[5] + 3 * (src[4] - src[0] - (src[3]<<1)) + round_add) >> 5), 0, 255); dst[2] = CLIP(((20 * (src[2] + src[3]) + (src[0]<<1) + 3 * (src[5] - ((src[1] + src[4])<<1)) - src[6] + round_add) >> 5), 0, 255); dst[3] = CLIP(((20 * (src[3] + src[4]) + 3 * ((src[6] + src[1]) - ((src[2] + src[5])<<1)) - (src[0] + src[7]) + round_add) >> 5), 0, 255); dst[4] = CLIP(((20 * (src[4] + src[5]) - 3 * (((src[3] + src[6])<<1) - (src[2] + src[7])) - (src[1] + src[8]) + round_add) >> 5), 0, 255); dst[5] = CLIP(((20 * (src[5] + src[6]) + (src[8]<<1) + 3 * (src[3] - ((src[4] + src[7]) << 1)) - src[2] + round_add) >> 5), 0, 255); dst[6] = CLIP(((19 * src[7] + 20 * src[6] + 3 * (src[4] - src[8] - (src[5] << 1)) - 
src[3] + round_add) >> 5), 0, 255); dst[7] = CLIP(((23 * src[7] + 7 * ((src[8]<<1) - src[6]) + 3 * src[5] - src[4] + round_add) >> 5), 0, 255); dst += stride; src += stride; } } void interpolate8x8_6tap_lowpass_v_c(uint8_t *dst, uint8_t *src, int32_t stride, int32_t rounding) { int32_t i; uint8_t round_add = 16 - rounding; for(i = 0; i < 8; i++) { int32_t src_2 = src[-2*stride]; int32_t src_1 = src[-stride]; int32_t src0 = src[0]; int32_t src1 = src[stride]; int32_t src2 = src[2 * stride]; int32_t src3 = src[3 * stride]; int32_t src4 = src[4 * stride]; int32_t src5 = src[5 * stride]; int32_t src6 = src[6 * stride]; int32_t src7 = src[7 * stride]; int32_t src8 = src[8 * stride]; int32_t src9 = src[9 * stride]; int32_t src10 = src[10 * stride]; dst[0] = CLIP((((src_2 + src3) + 5 * (((src0 + src1)<<2) - (src_1 + src2)) + round_add) >> 5), 0, 255); dst[stride] = CLIP((((src_1 + src4) + 5 * (((src1 + src2)<<2) - (src0 + src3)) + round_add) >> 5), 0, 255); dst[2 * stride] = CLIP((((src0 + src5) + 5 * (((src2 + src3)<<2) - (src1 + src4)) + round_add) >> 5), 0, 255); dst[3 * stride] = CLIP((((src1 + src6) + 5 * (((src3 + src4)<<2) - (src2 + src5)) + round_add) >> 5), 0, 255); dst[4 * stride] = CLIP((((src2 + src7) + 5 * (((src4 + src5)<<2) - (src3 + src6)) + round_add) >> 5), 0, 255); dst[5 * stride] = CLIP((((src3 + src8) + 5 * (((src5 + src6)<<2) - (src4 + src7)) + round_add) >> 5), 0, 255); dst[6 * stride] = CLIP((((src4 + src9) + 5 * (((src6 + src7)<<2) - (src5 + src8)) + round_add) >> 5), 0, 255); dst[7 * stride] = CLIP((((src5 + src10) + 5 * (((src7 + src8)<<2) - (src6 + src9)) + round_add) >> 5), 0, 255); dst++; src++; } } void interpolate16x16_lowpass_v_c(uint8_t *dst, uint8_t *src, int32_t stride, int32_t rounding) { int32_t i; uint8_t round_add = 16 - rounding; for(i = 0; i < 17; i++) { int32_t src0 = src[0]; int32_t src1 = src[stride]; int32_t src2 = src[2 * stride]; int32_t src3 = src[3 * stride]; int32_t src4 = src[4 * stride]; int32_t src5 = src[5 * stride]; 
int32_t src6 = src[6 * stride]; int32_t src7 = src[7 * stride]; int32_t src8 = src[8 * stride]; int32_t src9 = src[9 * stride]; int32_t src10 = src[10 * stride]; int32_t src11 = src[11 * stride]; int32_t src12 = src[12 * stride]; int32_t src13 = src[13 * stride]; int32_t src14 = src[14 * stride]; int32_t src15 = src[15 * stride]; int32_t src16 = src[16 * stride]; dst[0] = CLIP(((7 * ((src0<<1) - src2) + 23 * src1 + 3 * src3 - src4 + round_add) >> 5), 0, 255); dst[stride] = CLIP(((19 * src1 + 20 * src2 - src5 + 3 * (src4 - src0 - (src3<<1)) + round_add) >> 5), 0, 255); dst[2*stride] = CLIP(((20 * (src2 + src3) + (src0<<1) + 3 * (src5 - ((src1 + src4)<<1)) - src6 + round_add) >> 5), 0, 255); dst[3*stride] = CLIP(((20 * (src3 + src4) + 3 * ((src6 + src1) - ((src2 + src5)<<1)) - (src0 + src7) + round_add) >> 5), 0, 255); dst[4*stride] = CLIP(((20 * (src4 + src5) - 3 * (((src3 + src6)<<1) - (src2 + src7)) - (src1 + src8) + round_add) >> 5), 0, 255); dst[5*stride] = CLIP(((20 * (src5 + src6) - 3 * (((src4 + src7)<<1) - (src3 + src8)) - (src2 + src9) + round_add) >> 5), 0, 255); dst[6*stride] = CLIP(((20 * (src6 + src7) - 3 * (((src5 + src8)<<1) - (src4 + src9)) - (src3 + src10) + round_add) >> 5), 0, 255); dst[7*stride] = CLIP(((20 * (src7 + src8) - 3 * (((src6 + src9)<<1) - (src5 + src10)) - (src4 + src11) + round_add) >> 5), 0, 255); dst[8*stride] = CLIP(((20 * (src8 + src9) - 3 * (((src7 + src10)<<1) - (src6 + src11)) - (src5 + src12) + round_add) >> 5), 0, 255); dst[9*stride] = CLIP(((20 * (src9 + src10) - 3 * (((src8 + src11)<<1) - (src7 + src12)) - (src6 + src13) + round_add) >> 5), 0, 255); dst[10*stride] = CLIP(((20 * (src10 + src11) - 3 * (((src9 + src12)<<1) - (src8 + src13)) - (src7 + src14) + round_add) >> 5), 0, 255); dst[11*stride] = CLIP(((20 * (src11 + src12) - 3 * (((src10 + src13)<<1) - (src9 + src14)) - (src8 + src15) + round_add) >> 5), 0, 255); dst[12*stride] = CLIP(((20 * (src12 + src13) - 3 * (((src11 + src14)<<1) - (src10 + src15)) - (src9 + 
src16) + round_add) >> 5), 0, 255); dst[13*stride] = CLIP(((20 * (src13 + src14) + (src16<<1) + 3 * (src11 - ((src12 + src15) << 1)) - src10 + round_add) >> 5), 0, 255); dst[14*stride] = CLIP(((19 * src15 + 20 * src14 + 3 * (src12 - src16 - (src13 << 1)) - src11 + round_add) >> 5), 0, 255); dst[15*stride] = CLIP(((23 * src15 + 7 * ((src16<<1) - src14) + 3 * src13 - src12 + round_add) >> 5), 0, 255); dst++; src++; } } void interpolate8x8_lowpass_v_c(uint8_t *dst, uint8_t *src, int32_t stride, int32_t rounding) { int32_t i; uint8_t round_add = 16 - rounding; for(i = 0; i < 9; i++) { int32_t src0 = src[0]; int32_t src1 = src[stride]; int32_t src2 = src[2 * stride]; int32_t src3 = src[3 * stride]; int32_t src4 = src[4 * stride]; int32_t src5 = src[5 * stride]; int32_t src6 = src[6 * stride]; int32_t src7 = src[7 * stride]; int32_t src8 = src[8 * stride]; dst[0] = CLIP(((7 * ((src0<<1) - src2) + 23 * src1 + 3 * src3 - src4 + round_add) >> 5), 0, 255); dst[stride] = CLIP(((19 * src1 + 20 * src2 - src5 + 3 * (src4 - src0 - (src3 << 1)) + round_add) >> 5), 0, 255); dst[2 * stride] = CLIP(((20 * (src2 + src3) + (src0<<1) + 3 * (src5 - ((src1 + src4) <<1 )) - src6 + round_add) >> 5), 0, 255); dst[3 * stride] = CLIP(((20 * (src3 + src4) + 3 * ((src6 + src1) - ((src2 + src5)<<1)) - (src0 + src7) + round_add) >> 5), 0, 255); dst[4 * stride] = CLIP(((20 * (src4 + src5) + 3 * ((src2 + src7) - ((src3 + src6)<<1)) - (src1 + src8) + round_add) >> 5), 0, 255); dst[5 * stride] = CLIP(((20 * (src5 + src6) + (src8<<1) + 3 * (src3 - ((src4 + src7) << 1)) - src2 + round_add) >> 5), 0, 255); dst[6 * stride] = CLIP(((19 * src7 + 20 * src6 - src3 + 3 * (src4 - src8 - (src5 << 1)) + round_add) >> 5), 0, 255); dst[7 * stride] = CLIP(((7 * ((src8<<1) - src6) + 23 * src7 + 3 * src5 - src4 + round_add) >> 5), 0, 255); dst++; src++; } } void interpolate16x16_lowpass_hv_c(uint8_t *dst1, uint8_t *dst2, uint8_t *src, int32_t stride, int32_t rounding) { int32_t i; uint8_t round_add = 16 - rounding; 
uint8_t *h_ptr = dst2; for(i = 0; i < 17; i++) { h_ptr[0] = CLIP(((7 * ((src[0]<<1) - src[2]) + 23 * src[1] + 3 * src[3] - src[4] + round_add) >> 5), 0, 255); h_ptr[1] = CLIP(((19 * src[1] + 20 * src[2] - src[5] + 3 * (src[4] - src[0] - (src[3]<<1)) + round_add) >> 5), 0, 255); h_ptr[2] = CLIP(((20 * (src[2] + src[3]) + (src[0]<<1) + 3 * (src[5] - ((src[1] + src[4])<<1)) - src[6] + round_add) >> 5), 0, 255); h_ptr[3] = CLIP(((20 * (src[3] + src[4]) + 3 * ((src[6] + src[1]) - ((src[2] + src[5])<<1)) - (src[0] + src[7]) + round_add) >> 5), 0, 255); h_ptr[4] = CLIP(((20 * (src[4] + src[5]) - 3 * (((src[3] + src[6])<<1) - (src[2] + src[7])) - (src[1] + src[8]) + round_add) >> 5), 0, 255); h_ptr[5] = CLIP(((20 * (src[5] + src[6]) - 3 * (((src[4] + src[7])<<1) - (src[3] + src[8])) - (src[2] + src[9]) + round_add) >> 5), 0, 255); h_ptr[6] = CLIP(((20 * (src[6] + src[7]) - 3 * (((src[5] + src[8])<<1) - (src[4] + src[9])) - (src[3] + src[10]) + round_add) >> 5), 0, 255); h_ptr[7] = CLIP(((20 * (src[7] + src[8]) - 3 * (((src[6] + src[9])<<1) - (src[5] + src[10])) - (src[4] + src[11]) + round_add) >> 5), 0, 255); h_ptr[8] = CLIP(((20 * (src[8] + src[9]) - 3 * (((src[7] + src[10])<<1) - (src[6] + src[11])) - (src[5] + src[12]) + round_add) >> 5), 0, 255); h_ptr[9] = CLIP(((20 * (src[9] + src[10]) - 3 * (((src[8] + src[11])<<1) - (src[7] + src[12])) - (src[6] + src[13]) + round_add) >> 5), 0, 255); h_ptr[10] = CLIP(((20 * (src[10] + src[11]) - 3 * (((src[9] + src[12])<<1) - (src[8] + src[13])) - (src[7] + src[14]) + round_add) >> 5), 0, 255); h_ptr[11] = CLIP(((20 * (src[11] + src[12]) - 3 * (((src[10] + src[13])<<1) - (src[9] + src[14])) - (src[8] + src[15]) + round_add) >> 5), 0, 255); h_ptr[12] = CLIP(((20 * (src[12] + src[13]) - 3 * (((src[11] + src[14])<<1) - (src[10] + src[15])) - (src[9] + src[16]) + round_add) >> 5), 0, 255); h_ptr[13] = CLIP(((20 * (src[13] + src[14]) + (src[16]<<1) + 3 * (src[11] - ((src[12] + src[15]) << 1)) - src[10] + round_add) >> 5), 0, 255); 
h_ptr[14] = CLIP(((19 * src[15] + 20 * src[14] + 3 * (src[12] - src[16] - (src[13] << 1)) - src[11] + round_add) >> 5), 0, 255); h_ptr[15] = CLIP(((23 * src[15] + 7 * ((src[16]<<1) - src[14]) + 3 * src[13] - src[12] + round_add) >> 5), 0, 255); h_ptr += stride; src += stride; } interpolate16x16_lowpass_v_c(dst1, dst2, stride, rounding); } void interpolate8x8_lowpass_hv_c(uint8_t *dst1, uint8_t *dst2, uint8_t *src, int32_t stride, int32_t rounding) { int32_t i; uint8_t round_add = 16 - rounding; uint8_t *h_ptr = dst2; for(i = 0; i < 9; i++) { h_ptr[0] = CLIP(((7 * ((src[0]<<1) - src[2]) + 23 * src[1] + 3 * src[3] - src[4] + round_add) >> 5), 0, 255); h_ptr[1] = CLIP(((19 * src[1] + 20 * src[2] - src[5] + 3 * (src[4] - src[0] - (src[3]<<1)) + round_add) >> 5), 0, 255); h_ptr[2] = CLIP(((20 * (src[2] + src[3]) + (src[0]<<1) + 3 * (src[5] - ((src[1] + src[4])<<1)) - src[6] + round_add) >> 5), 0, 255); h_ptr[3] = CLIP(((20 * (src[3] + src[4]) + 3 * ((src[6] + src[1]) - ((src[2] + src[5])<<1)) - (src[0] + src[7]) + round_add) >> 5), 0, 255); h_ptr[4] = CLIP(((20 * (src[4] + src[5]) - 3 * (((src[3] + src[6])<<1) - (src[2] + src[7])) - (src[1] + src[8]) + round_add) >> 5), 0, 255); h_ptr[5] = CLIP(((20 * (src[5] + src[6]) + (src[8]<<1) + 3 * (src[3] - ((src[4] + src[7]) << 1)) - src[2] + round_add) >> 5), 0, 255); h_ptr[6] = CLIP(((19 * src[7] + 20 * src[6] + 3 * (src[4] - src[8] - (src[5] << 1)) - src[3] + round_add) >> 5), 0, 255); h_ptr[7] = CLIP(((23 * src[7] + 7 * ((src[8]<<1) - src[6]) + 3 * src[5] - src[4] + round_add) >> 5), 0, 255); h_ptr += stride; src += stride; } interpolate8x8_lowpass_v_c(dst1, dst2, stride, rounding); } xvidcore/src/image/interpolate8x8.h0000664000076500007650000003162411564705453020423 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Interpolation related header - * * Copyright(C) 2001-2003 Peter Ross * * This program is free software ; you can 
redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: interpolate8x8.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _INTERPOLATE8X8_H_ #define _INTERPOLATE8X8_H_ #include "../utils/mem_transfer.h" typedef void (INTERPOLATE8X8) (uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding); typedef INTERPOLATE8X8 *INTERPOLATE8X8_PTR; typedef void (INTERPOLATE8X4) (uint8_t * const dst, const uint8_t * const src, const uint32_t stride, const uint32_t rounding); typedef INTERPOLATE8X4 *INTERPOLATE8X4_PTR; typedef void (INTERPOLATE8X8_AVG2) (uint8_t *dst, const uint8_t *src1, const uint8_t *src2, const uint32_t stride, const uint32_t rounding, const uint32_t height); typedef INTERPOLATE8X8_AVG2 *INTERPOLATE8X8_AVG2_PTR; typedef void (INTERPOLATE8X8_AVG4) (uint8_t *dst, const uint8_t *src1, const uint8_t *src2, const uint8_t *src3, const uint8_t *src4, const uint32_t stride, const uint32_t rounding); typedef INTERPOLATE8X8_AVG4 *INTERPOLATE8X8_AVG4_PTR; typedef void (INTERPOLATE_LOWPASS) (uint8_t *dst, uint8_t *src, int32_t stride, int32_t rounding); typedef INTERPOLATE_LOWPASS *INTERPOLATE_LOWPASS_PTR; typedef void (INTERPOLATE_LOWPASS_HV) (uint8_t *dst1, uint8_t *dst2, uint8_t *src, int32_t stride, int32_t rounding); typedef INTERPOLATE_LOWPASS_HV 
*INTERPOLATE_LOWPASS_HV_PTR; typedef void (INTERPOLATE8X8_6TAP_LOWPASS) (uint8_t *dst, uint8_t *src, int32_t stride, int32_t rounding); typedef INTERPOLATE8X8_6TAP_LOWPASS *INTERPOLATE8X8_6TAP_LOWPASS_PTR; /* These functions do: dst = interpolate(src) */ extern INTERPOLATE8X8_PTR interpolate8x8_halfpel_h; extern INTERPOLATE8X8_PTR interpolate8x8_halfpel_v; extern INTERPOLATE8X8_PTR interpolate8x8_halfpel_hv; extern INTERPOLATE8X4_PTR interpolate8x4_halfpel_h; extern INTERPOLATE8X4_PTR interpolate8x4_halfpel_v; extern INTERPOLATE8X4_PTR interpolate8x4_halfpel_hv; /* These functions do: dst = (dst+interpolate(src) + 1)/2 * Suitable for direct/interpolated bvop prediction block * building w/o the need for intermediate interpolated result * storing/reading * NB: the rounding applies to the interpolation, but not * the averaging step which will always use rounding=0 */ extern INTERPOLATE8X8_PTR interpolate8x8_halfpel_add; extern INTERPOLATE8X8_PTR interpolate8x8_halfpel_h_add; extern INTERPOLATE8X8_PTR interpolate8x8_halfpel_v_add; extern INTERPOLATE8X8_PTR interpolate8x8_halfpel_hv_add; extern INTERPOLATE8X8_AVG2_PTR interpolate8x8_avg2; extern INTERPOLATE8X8_AVG4_PTR interpolate8x8_avg4; extern INTERPOLATE_LOWPASS_PTR interpolate8x8_lowpass_h; extern INTERPOLATE_LOWPASS_PTR interpolate8x8_lowpass_v; extern INTERPOLATE_LOWPASS_PTR interpolate16x16_lowpass_h; extern INTERPOLATE_LOWPASS_PTR interpolate16x16_lowpass_v; extern INTERPOLATE_LOWPASS_HV_PTR interpolate8x8_lowpass_hv; extern INTERPOLATE_LOWPASS_HV_PTR interpolate16x16_lowpass_hv; extern INTERPOLATE8X8_6TAP_LOWPASS_PTR interpolate8x8_6tap_lowpass_h; extern INTERPOLATE8X8_6TAP_LOWPASS_PTR interpolate8x8_6tap_lowpass_v; INTERPOLATE8X8 interpolate8x8_halfpel_h_c; INTERPOLATE8X8 interpolate8x8_halfpel_v_c; INTERPOLATE8X8 interpolate8x8_halfpel_hv_c; INTERPOLATE8X4 interpolate8x4_halfpel_h_c; INTERPOLATE8X4 interpolate8x4_halfpel_v_c; INTERPOLATE8X4 interpolate8x4_halfpel_hv_c; INTERPOLATE8X8 
interpolate8x8_halfpel_add_c; INTERPOLATE8X8 interpolate8x8_halfpel_h_add_c; INTERPOLATE8X8 interpolate8x8_halfpel_v_add_c; INTERPOLATE8X8 interpolate8x8_halfpel_hv_add_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) INTERPOLATE8X8 interpolate8x8_halfpel_h_mmx; INTERPOLATE8X8 interpolate8x8_halfpel_v_mmx; INTERPOLATE8X8 interpolate8x8_halfpel_hv_mmx; INTERPOLATE8X4 interpolate8x4_halfpel_h_mmx; INTERPOLATE8X4 interpolate8x4_halfpel_v_mmx; INTERPOLATE8X4 interpolate8x4_halfpel_hv_mmx; INTERPOLATE8X8 interpolate8x8_halfpel_add_mmx; INTERPOLATE8X8 interpolate8x8_halfpel_h_add_mmx; INTERPOLATE8X8 interpolate8x8_halfpel_v_add_mmx; INTERPOLATE8X8 interpolate8x8_halfpel_hv_add_mmx; INTERPOLATE8X8 interpolate8x8_halfpel_h_xmm; INTERPOLATE8X8 interpolate8x8_halfpel_v_xmm; INTERPOLATE8X8 interpolate8x8_halfpel_hv_xmm; INTERPOLATE8X4 interpolate8x4_halfpel_h_xmm; INTERPOLATE8X4 interpolate8x4_halfpel_v_xmm; INTERPOLATE8X4 interpolate8x4_halfpel_hv_xmm; INTERPOLATE8X8 interpolate8x8_halfpel_add_xmm; INTERPOLATE8X8 interpolate8x8_halfpel_h_add_xmm; INTERPOLATE8X8 interpolate8x8_halfpel_v_add_xmm; INTERPOLATE8X8 interpolate8x8_halfpel_hv_add_xmm; INTERPOLATE8X8 interpolate8x8_halfpel_h_3dn; INTERPOLATE8X8 interpolate8x8_halfpel_v_3dn; INTERPOLATE8X8 interpolate8x8_halfpel_hv_3dn; INTERPOLATE8X4 interpolate8x4_halfpel_h_3dn; INTERPOLATE8X4 interpolate8x4_halfpel_v_3dn; INTERPOLATE8X4 interpolate8x4_halfpel_hv_3dn; INTERPOLATE8X8 interpolate8x8_halfpel_h_3dne; INTERPOLATE8X8 interpolate8x8_halfpel_v_3dne; INTERPOLATE8X8 interpolate8x8_halfpel_hv_3dne; INTERPOLATE8X4 interpolate8x4_halfpel_h_3dne; INTERPOLATE8X4 interpolate8x4_halfpel_v_3dne; INTERPOLATE8X4 interpolate8x4_halfpel_hv_3dne; #endif #ifdef ARCH_IS_IA64 INTERPOLATE8X8 interpolate8x8_halfpel_h_ia64; INTERPOLATE8X8 interpolate8x8_halfpel_v_ia64; INTERPOLATE8X8 interpolate8x8_halfpel_hv_ia64; #endif #ifdef ARCH_IS_PPC INTERPOLATE8X8 interpolate8x8_halfpel_h_altivec_c; INTERPOLATE8X8 
interpolate8x8_halfpel_v_altivec_c; INTERPOLATE8X8 interpolate8x8_halfpel_hv_altivec_c; INTERPOLATE8X8 interpolate8x8_halfpel_add_altivec_c; INTERPOLATE8X8 interpolate8x8_halfpel_h_add_altivec_c; INTERPOLATE8X8 interpolate8x8_halfpel_v_add_altivec_c; INTERPOLATE8X8 interpolate8x8_halfpel_hv_add_altivec_c; #endif INTERPOLATE8X8_AVG2 interpolate8x8_avg2_c; INTERPOLATE8X8_AVG4 interpolate8x8_avg4_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) INTERPOLATE8X8_AVG2 interpolate8x8_avg2_mmx; INTERPOLATE8X8_AVG4 interpolate8x8_avg4_mmx; #endif #ifdef ARCH_IS_PPC INTERPOLATE8X8_AVG2 interpolate8x8_avg2_altivec_c; INTERPOLATE8X8_AVG4 interpolate8x8_avg4_altivec_c; #endif INTERPOLATE_LOWPASS interpolate8x8_lowpass_h_c; INTERPOLATE_LOWPASS interpolate8x8_lowpass_v_c; INTERPOLATE_LOWPASS interpolate16x16_lowpass_h_c; INTERPOLATE_LOWPASS interpolate16x16_lowpass_v_c; INTERPOLATE_LOWPASS_HV interpolate8x8_lowpass_hv_c; INTERPOLATE_LOWPASS_HV interpolate16x16_lowpass_hv_c; INTERPOLATE8X8_6TAP_LOWPASS interpolate8x8_6tap_lowpass_h_c; INTERPOLATE8X8_6TAP_LOWPASS interpolate8x8_6tap_lowpass_v_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) INTERPOLATE8X8_6TAP_LOWPASS interpolate8x8_6tap_lowpass_h_mmx; INTERPOLATE8X8_6TAP_LOWPASS interpolate8x8_6tap_lowpass_v_mmx; #endif #ifdef ARCH_IS_PPC INTERPOLATE8X8_6TAP_LOWPASS interpolate8x8_6tap_lowpass_h_altivec_c; #endif static __inline void interpolate8x4_switch(uint8_t * const cur, const uint8_t * const refn, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { const uint8_t * const src = refn + (int)((y + (dy>>1)) * stride + x + (dx>>1)); uint8_t * const dst = cur + (int)(y * stride + x); switch (((dx & 1) << 1) + (dy & 1)) { /* ((dx%2)?2:0)+((dy%2)?1:0) */ case 0: transfer8x4_copy(dst, src, stride); break; case 1: interpolate8x4_halfpel_v(dst, src, stride, rounding); break; case 2: interpolate8x4_halfpel_h(dst, src, stride, rounding); break; default: 
interpolate8x4_halfpel_hv(dst, src, stride, rounding); break; } } static __inline void interpolate8x8_switch(uint8_t * const cur, const uint8_t * const refn, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { const uint8_t * const src = refn + (int)((y + (dy>>1)) * stride + x + (dx>>1)); uint8_t * const dst = cur + (int)(y * stride + x); switch (((dx & 1) << 1) + (dy & 1)) { /* ((dx%2)?2:0)+((dy%2)?1:0) */ case 0: transfer8x8_copy(dst, src, stride); break; case 1: interpolate8x8_halfpel_v(dst, src, stride, rounding); break; case 2: interpolate8x8_halfpel_h(dst, src, stride, rounding); break; default: interpolate8x8_halfpel_hv(dst, src, stride, rounding); break; } } static __inline void interpolate8x8_add_switch(uint8_t * const cur, const uint8_t * const refn, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { const uint8_t * const src = refn + (int)((y + (dy>>1)) * stride + x + (dx>>1)); uint8_t * const dst = cur + (int)(y * stride + x); switch (((dx & 1) << 1) + (dy & 1)) { /* ((dx%2)?2:0)+((dy%2)?1:0) */ case 0: interpolate8x8_halfpel_add(dst, src, stride, rounding); break; case 1: interpolate8x8_halfpel_v_add(dst, src, stride, rounding); break; case 2: interpolate8x8_halfpel_h_add(dst, src, stride, rounding); break; default: interpolate8x8_halfpel_hv_add(dst, src, stride, rounding); break; } } static __inline void interpolate16x16_switch(uint8_t * const cur, const uint8_t * const refn, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { interpolate8x8_switch(cur, refn, x, y, dx, dy, stride, rounding); interpolate8x8_switch(cur, refn, x+8, y, dx, dy, stride, rounding); interpolate8x8_switch(cur, refn, x, y+8, dx, dy, stride, rounding); interpolate8x8_switch(cur, refn, x+8, y+8, dx, dy, stride, rounding); } static __inline void interpolate16x16_add_switch(uint8_t * 
const cur, const uint8_t * const refn, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { interpolate8x8_add_switch(cur, refn, x, y, dx, dy, stride, rounding); interpolate8x8_add_switch(cur, refn, x+8, y, dx, dy, stride, rounding); interpolate8x8_add_switch(cur, refn, x, y+8, dx, dy, stride, rounding); interpolate8x8_add_switch(cur, refn, x+8, y+8, dx, dy, stride, rounding); } static __inline void interpolate32x32_switch(uint8_t * const cur, const uint8_t * const refn, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { interpolate16x16_switch(cur, refn, x, y, dx, dy, stride, rounding); interpolate16x16_switch(cur, refn, x+16, y, dx, dy, stride, rounding); interpolate16x16_switch(cur, refn, x, y+16, dx, dy, stride, rounding); interpolate16x16_switch(cur, refn, x+16, y+16, dx, dy, stride, rounding); } static __inline void interpolate32x32_add_switch(uint8_t * const cur, const uint8_t * const refn, const uint32_t x, const uint32_t y, const int32_t dx, const int dy, const uint32_t stride, const uint32_t rounding) { interpolate16x16_add_switch(cur, refn, x, y, dx, dy, stride, rounding); interpolate16x16_add_switch(cur, refn, x+16, y, dx, dy, stride, rounding); interpolate16x16_add_switch(cur, refn, x, y+16, dx, dy, stride, rounding); interpolate16x16_add_switch(cur, refn, x+16, y+16, dx, dy, stride, rounding); } static __inline uint8_t * interpolate8x8_switch2(uint8_t * const buffer, const uint8_t * const refn, const int x, const int y, const int dx, const int dy, const uint32_t stride, const uint32_t rounding) { const uint8_t * const src = refn + (int)((y + (dy>>1)) * stride + x + (dx>>1)); switch (((dx & 1) << 1) + (dy & 1)) { /* ((dx%2)?2:0)+((dy%2)?1:0) */ case 0: return (uint8_t *)src; case 1: interpolate8x8_halfpel_v(buffer, src, stride, rounding); break; case 2: interpolate8x8_halfpel_h(buffer, src, stride, rounding); break; 
default: interpolate8x8_halfpel_hv(buffer, src, stride, rounding); break; } return buffer; } #endif xvidcore/src/image/x86_asm/0000775000076500007650000000000011566427762016641 5ustar xvidbuildxvidbuildxvidcore/src/image/x86_asm/gmc_mmx.asm0000664000076500007650000001253711345416076020771 0ustar xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - GMC core functions - ; * Copyright(C) 2006 Pascal Massimino ; * ; * This file is part of Xvid, a free MPEG-4 video encoder/decoder ; * ; * Xvid is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: gmc_mmx.asm,v 1.12 2010-03-09 10:00:14 Isibaar Exp $ ; * ; *************************************************************************/ ;/************************************************************************** ; * ; * History: ; * ; * Jun 14 2006: initial version (during Germany/Poland match;) ; * ; *************************************************************************/ %include "nasm.inc" ;////////////////////////////////////////////////////////////////////// cglobal xvid_GMC_Core_Lin_8_mmx cglobal xvid_GMC_Core_Lin_8_sse2 cglobal xvid_GMC_Core_Lin_8_sse41 ;////////////////////////////////////////////////////////////////////// DATA align SECTION_ALIGN Cst16: times 8 dw 16 TEXT ;////////////////////////////////////////////////////////////////////// ;// mmx version %macro GMC_4_SSE 2 ; %1: i %2: out reg (mm5 or mm6) pcmpeqw mm0, mm0 movq mm1, [_EAX+2*(%1) ] ; u0 | u1 | u2 | u3 psrlw mm0, 12 ; mask 0x000f movq mm2, [_EAX+2*(%1)+2*16] ; v0 | v1 | v2 | v3 pand mm1, mm0 ; u0 pand mm2, mm0 ; v0 movq mm0, [Cst16] movq mm3, mm1 ; u | ... 
movq mm4, mm0 pmullw mm3, mm2 ; u.v psubw mm0, mm1 ; 16-u psubw mm4, mm2 ; 16-v pmullw mm2, mm0 ; (16-u).v pmullw mm0, mm4 ; (16-u).(16-v) pmullw mm1, mm4 ; u .(16-v) movd mm4, [TMP0+TMP1 +%1] ; src2 movd %2, [TMP0+TMP1+1+%1] ; src3 punpcklbw mm4, mm7 punpcklbw %2, mm7 pmullw mm2, mm4 pmullw mm3, %2 movd mm4, [TMP0 +%1] ; src0 movd %2, [TMP0 +1+%1] ; src1 punpcklbw mm4, mm7 punpcklbw %2, mm7 pmullw mm4, mm0 pmullw %2, mm1 paddw mm2, mm3 paddw %2, mm4 paddw %2, mm2 %endmacro align SECTION_ALIGN xvid_GMC_Core_Lin_8_mmx: mov _EAX, prm2 ; Offsets mov TMP0, prm3 ; Src0 mov TMP1, prm4 ; BpS pxor mm7, mm7 GMC_4_SSE 0, mm5 GMC_4_SSE 4, mm6 ; pshufw mm4, prm5d, 01010101b ; Rounder (bits [16..31]) movd mm4, prm5d ; Rounder (bits [16..31]) mov _EAX, prm1 ; Dst punpcklwd mm4, mm4 punpckhdq mm4, mm4 paddw mm5, mm4 paddw mm6, mm4 psrlw mm5, 8 psrlw mm6, 8 packuswb mm5, mm6 movq [_EAX], mm5 ret ENDFUNC ;////////////////////////////////////////////////////////////////////// ;// SSE2 version %macro GMC_8_SSE2 1 pcmpeqw xmm0, xmm0 movdqa xmm1, [_EAX ] ; u... psrlw xmm0, 12 ; mask = 0x000f movdqa xmm2, [_EAX+2*16] ; v... pand xmm1, xmm0 pand xmm2, xmm0 movdqa xmm0, [Cst16] movdqa xmm3, xmm1 ; u | ... 
movdqa xmm4, xmm0 pmullw xmm3, xmm2 ; u.v psubw xmm0, xmm1 ; 16-u psubw xmm4, xmm2 ; 16-v pmullw xmm2, xmm0 ; (16-u).v pmullw xmm0, xmm4 ; (16-u).(16-v) pmullw xmm1, xmm4 ; u .(16-v) %if (%1!=0) ; SSE41 pmovzxbw xmm4, [TMP0+TMP1 ] ; src2 pmovzxbw xmm5, [TMP0+TMP1+1] ; src3 %else movq xmm4, [TMP0+TMP1 ] ; src2 movq xmm5, [TMP0+TMP1+1] ; src3 punpcklbw xmm4, xmm7 punpcklbw xmm5, xmm7 %endif pmullw xmm2, xmm4 pmullw xmm3, xmm5 %if (%1!=0) ; SSE41 pmovzxbw xmm4, [TMP0 ] ; src0 pmovzxbw xmm5, [TMP0 +1] ; src1 %else movq xmm4, [TMP0 ] ; src0 movq xmm5, [TMP0 +1] ; src1 punpcklbw xmm4, xmm7 punpcklbw xmm5, xmm7 %endif pmullw xmm4, xmm0 pmullw xmm5, xmm1 paddw xmm2, xmm3 paddw xmm5, xmm4 paddw xmm5, xmm2 %endmacro align SECTION_ALIGN xvid_GMC_Core_Lin_8_sse2: PUSH_XMM6_XMM7 mov _EAX, prm2 ; Offsets mov TMP0, prm3 ; Src0 mov TMP1, prm4 ; BpS pxor xmm7, xmm7 GMC_8_SSE2 0 movd xmm4, prm5d pshuflw xmm4, xmm4, 01010101b ; Rounder (bits [16..31]) punpckldq xmm4, xmm4 mov _EAX, prm1 ; Dst paddw xmm5, xmm4 psrlw xmm5, 8 packuswb xmm5, xmm5 movq [_EAX], xmm5 POP_XMM6_XMM7 ret ENDFUNC align SECTION_ALIGN xvid_GMC_Core_Lin_8_sse41: mov _EAX, prm2 ; Offsets mov TMP0, prm3 ; Src0 mov TMP1, prm4 ; BpS GMC_8_SSE2 1 movd xmm4, prm5d pshuflw xmm4, xmm4, 01010101b ; Rounder (bits [16..31]) punpckldq xmm4, xmm4 mov _EAX, prm1 ; Dst paddw xmm5, xmm4 psrlw xmm5, 8 packuswb xmm5, xmm5 movq [_EAX], xmm5 ret ENDFUNC ;////////////////////////////////////////////////////////////////////// NON_EXEC_STACK xvidcore/src/image/x86_asm/colorspace_yuv_mmx.asm0000664000076500007650000002576511254216113023254 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - MMX and XMM YV12->YV12 conversion - ; * ; * Copyright(C) 2001-2008 Michael Militzer ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software 
Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: colorspace_yuv_mmx.asm,v 1.15 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Helper macros ;============================================================================= %macro _MOVQ 3 %if %1 == 1 movntq %2, %3 ; xmm %else movq %2, %3 ; plain mmx %endif %endmacro ;------------------------------------------------------------------------------ ; PLANE_COPY ( DST, DST_STRIDE, SRC, SRC_STRIDE, WIDTH, HEIGHT, OPT ) ; DST dst buffer ; DST_STRIDE dst stride ; SRC src buffer ; SRC_STRIDE src stride ; WIDTH width ; HEIGHT height ; OPT 0=plain mmx, 1=xmm ; ; ; Trashes: DST, SRC, WIDTH, HEIGHT, _EBX, _ECX, _EDX ;------------------------------------------------------------------------------ %macro PLANE_COPY 7 %define DST %1 %define DST_STRIDE %2 %define SRC %3 %define SRC_STRIDE %4 %define WIDTH %5 %define HEIGHT %6 %define OPT %7 mov _EBX, WIDTH shr WIDTH, 6 ; $_EAX$ = width / 64 and _EBX, 63 ; remainder = width % 64 mov _EDX, _EBX shr _EBX, 4 ; $_EBX$ = remainder / 16 and _EDX, 15 ; $_EDX$ = remainder % 16 %%loop64_start_pc: push DST push SRC mov _ECX, WIDTH ; width64 test WIDTH, WIDTH jz %%loop16_start_pc %%loop64_pc: %if OPT == 1 ; xmm prefetchnta [SRC + 64] ; non temporal prefetch prefetchnta [SRC + 96] %endif movq mm1, [SRC ] ; read 
from src movq mm2, [SRC + 8] movq mm3, [SRC + 16] movq mm4, [SRC + 24] movq mm5, [SRC + 32] movq mm6, [SRC + 40] movq mm7, [SRC + 48] movq mm0, [SRC + 56] _MOVQ OPT, [DST ], mm1 ; write to y_out _MOVQ OPT, [DST + 8], mm2 _MOVQ OPT, [DST + 16], mm3 _MOVQ OPT, [DST + 24], mm4 _MOVQ OPT, [DST + 32], mm5 _MOVQ OPT, [DST + 40], mm6 _MOVQ OPT, [DST + 48], mm7 _MOVQ OPT, [DST + 56], mm0 add SRC, 64 add DST, 64 loop %%loop64_pc %%loop16_start_pc: mov _ECX, _EBX ; width16 test _EBX, _EBX jz %%loop1_start_pc %%loop16_pc: movq mm1, [SRC] movq mm2, [SRC + 8] _MOVQ OPT, [DST], mm1 _MOVQ OPT, [DST + 8], mm2 add SRC, 16 add DST, 16 loop %%loop16_pc %%loop1_start_pc: mov _ECX, _EDX rep movsb pop SRC pop DST %ifdef ARCH_IS_X86_64 XVID_MOVSXD _ECX, SRC_STRIDE add SRC, _ECX mov ecx, DST_STRIDE add DST, _ECX %else add SRC, SRC_STRIDE add DST, DST_STRIDE %endif dec HEIGHT jg near %%loop64_start_pc %undef DST %undef DST_STRIDE %undef SRC %undef SRC_STRIDE %undef WIDTH %undef HEIGHT %undef OPT %endmacro ;------------------------------------------------------------------------------ ; PLANE_FILL ( DST, DST_STRIDE, WIDTH, HEIGHT, OPT ) ; DST dst buffer ; DST_STRIDE dst stride ; WIDTH width ; HEIGHT height ; OPT 0=plain mmx, 1=xmm ; ; Trashes: DST, WIDTH, HEIGHT, _EBX, _ECX, _EDX, _EAX ;------------------------------------------------------------------------------ %macro PLANE_FILL 5 %define DST %1 %define DST_STRIDE %2 %define WIDTH %3 %define HEIGHT %4 %define OPT %5 mov _EAX, 0x80808080 mov _EBX, WIDTH shr WIDTH, 6 ; $_ESI$ = width / 64 and _EBX, 63 ; _EBX = remainder = width % 64 movd mm0, eax mov _EDX, _EBX shr _EBX, 4 ; $_EBX$ = remainder / 16 and _EDX, 15 ; $_EDX$ = remainder % 16 punpckldq mm0, mm0 %%loop64_start_pf: push DST mov _ECX, WIDTH ; width64 test WIDTH, WIDTH jz %%loop16_start_pf %%loop64_pf: _MOVQ OPT, [DST ], mm0 ; write to y_out _MOVQ OPT, [DST + 8], mm0 _MOVQ OPT, [DST + 16], mm0 _MOVQ OPT, [DST + 24], mm0 _MOVQ OPT, [DST + 32], mm0 _MOVQ OPT, [DST + 40], mm0 _MOVQ 
OPT, [DST + 48], mm0 _MOVQ OPT, [DST + 56], mm0 add DST, 64 loop %%loop64_pf %%loop16_start_pf: mov _ECX, _EBX ; width16 test _EBX, _EBX jz %%loop1_start_pf %%loop16_pf: _MOVQ OPT, [DST ], mm0 _MOVQ OPT, [DST + 8], mm0 add DST, 16 loop %%loop16_pf %%loop1_start_pf: mov _ECX, _EDX rep stosb pop DST %ifdef ARCH_IS_X86_64 mov ecx, DST_STRIDE add DST, _ECX %else add DST, DST_STRIDE %endif dec HEIGHT jg near %%loop64_start_pf %undef DST %undef DST_STRIDE %undef WIDTH %undef HEIGHT %undef OPT %endmacro ;------------------------------------------------------------------------------ ; MAKE_YV12_TO_YV12( NAME, OPT ) ; NAME function name ; OPT 0=plain mmx, 1=xmm ; ; yv12_to_yv12_mmx(uint8_t * y_dst, uint8_t * u_dst, uint8_t * v_dst, ; int y_dst_stride, int uv_dst_stride, ; uint8_t * y_src, uint8_t * u_src, uint8_t * v_src, ; int y_src_stride, int uv_src_stride, ; int width, int height, int vflip) ;------------------------------------------------------------------------------ %macro MAKE_YV12_TO_YV12 2 %define NAME %1 %define XMM_OPT %2 ALIGN SECTION_ALIGN cglobal NAME NAME: push _EBX ; _ESP + localsize + 3*PTR_SIZE %define localsize 2*4 %ifdef ARCH_IS_X86_64 %ifndef WINDOWS %define pushsize 2*PTR_SIZE %define shadow 0 %else %define pushsize 4*PTR_SIZE %define shadow 32 + 2*PTR_SIZE %endif %define prm_vflip dword [_ESP + localsize + pushsize + shadow + 7*PTR_SIZE] %define prm_height dword [_ESP + localsize + pushsize + shadow + 6*PTR_SIZE] %define prm_width dword [_ESP + localsize + pushsize + shadow + 5*PTR_SIZE] %define prm_uv_src_stride dword [_ESP + localsize + pushsize + shadow + 4*PTR_SIZE] %define prm_y_src_stride dword [_ESP + localsize + pushsize + shadow + 3*PTR_SIZE] %define prm_v_src [_ESP + localsize + pushsize + shadow + 2*PTR_SIZE] %define prm_u_src [_ESP + localsize + pushsize + shadow + 1*PTR_SIZE] %ifdef WINDOWS push _ESI ; _ESP + localsize + 2*PTR_SIZE push _EDI ; _ESP + localsize + 1*PTR_SIZE push _EBP ; _ESP + localsize + 0*PTR_SIZE sub _ESP, localsize 
%define prm_y_src _ESI %define prm_uv_dst_stride TMP0d %define prm_y_dst_stride prm4d %define prm_v_dst prm3 %define prm_u_dst TMP1 %define prm_y_dst _EDI mov _EDI, prm1 mov TMP1, prm2 mov _ESI, [_ESP + localsize + pushsize + shadow + 0*PTR_SIZE] mov TMP0d, dword [_ESP + localsize + pushsize + shadow - 1*PTR_SIZE] %else push _EBP ; _ESP + localsize + 0*PTR_SIZE sub _ESP, localsize %define prm_y_src _ESI %define prm_uv_dst_stride prm5d %define prm_y_dst_stride TMP1d %define prm_v_dst prm6 %define prm_u_dst TMP0 %define prm_y_dst _EDI mov TMP0, prm2 mov _ESI, prm6 mov prm6, prm3 mov TMP1d, prm4d %endif %define _ip _ESP + localsize + pushsize + 0 %else %define pushsize 4*PTR_SIZE %define prm_vflip [_ESP + localsize + pushsize + 13*PTR_SIZE] %define prm_height [_ESP + localsize + pushsize + 12*PTR_SIZE] %define prm_width [_ESP + localsize + pushsize + 11*PTR_SIZE] %define prm_uv_src_stride [_ESP + localsize + pushsize + 10*PTR_SIZE] %define prm_y_src_stride [_ESP + localsize + pushsize + 9*PTR_SIZE] %define prm_v_src [_ESP + localsize + pushsize + 8*PTR_SIZE] %define prm_u_src [_ESP + localsize + pushsize + 7*PTR_SIZE] %define prm_y_src _ESI %define prm_uv_dst_stride [_ESP + localsize + pushsize + 5*PTR_SIZE] %define prm_y_dst_stride [_ESP + localsize + pushsize + 4*PTR_SIZE] %define prm_v_dst [_ESP + localsize + pushsize + 3*PTR_SIZE] %define prm_u_dst [_ESP + localsize + pushsize + 2*PTR_SIZE] %define prm_y_dst _EDI %define _ip _ESP + localsize + pushsize + 0 push _ESI ; _ESP + localsize + 2*PTR_SIZE push _EDI ; _ESP + localsize + 1*PTR_SIZE push _EBP ; _ESP + localsize + 0*PTR_SIZE sub _ESP, localsize mov _ESI, [_ESP + localsize + pushsize + 6*PTR_SIZE] mov _EDI, [_ESP + localsize + pushsize + 1*PTR_SIZE] %endif %define width2 dword [_ESP + localsize - 1*4] %define height2 dword [_ESP + localsize - 2*4] mov eax, prm_width mov ebx, prm_height shr eax, 1 ; calculate width/2, height/2 shr ebx, 1 mov width2, eax mov height2, ebx mov eax, prm_vflip test eax, eax jz near 
.go ; flipping support mov eax, prm_height mov ecx, prm_y_src_stride sub eax, 1 imul eax, ecx add _ESI, _EAX ; y_src += (height-1) * y_src_stride neg ecx mov prm_y_src_stride, ecx ; y_src_stride = -y_src_stride mov eax, height2 mov _EDX, prm_u_src mov _EBP, prm_v_src mov ecx, prm_uv_src_stride test _EDX, _EDX jz .go test _EBP, _EBP jz .go sub eax, 1 ; _EAX = height2 - 1 imul eax, ecx add _EDX, _EAX ; u_src += (height2-1) * uv_src_stride add _EBP, _EAX ; v_src += (height2-1) * uv_src_stride neg ecx mov prm_u_src, _EDX mov prm_v_src, _EBP mov prm_uv_src_stride, ecx ; uv_src_stride = -uv_src_stride .go: mov eax, prm_width mov ebp, prm_height PLANE_COPY _EDI, prm_y_dst_stride, _ESI, prm_y_src_stride, _EAX, _EBP, XMM_OPT mov _EAX, prm_u_src or _EAX, prm_v_src jz near .UVFill_0x80 mov eax, width2 mov ebp, height2 mov _ESI, prm_u_src mov _EDI, prm_u_dst PLANE_COPY _EDI, prm_uv_dst_stride, _ESI, prm_uv_src_stride, _EAX, _EBP, XMM_OPT mov eax, width2 mov ebp, height2 mov _ESI, prm_v_src mov _EDI, prm_v_dst PLANE_COPY _EDI, prm_uv_dst_stride, _ESI, prm_uv_src_stride, _EAX, _EBP, XMM_OPT .Done_UVPlane: add _ESP, localsize pop _EBP %ifndef ARCH_IS_X86_64 pop _EDI pop _ESI %else %ifdef WINDOWS pop _EDI pop _ESI %endif %endif pop _EBX ret .UVFill_0x80: mov esi, width2 mov ebp, height2 mov _EDI, prm_u_dst PLANE_FILL _EDI, prm_uv_dst_stride, _ESI, _EBP, XMM_OPT mov esi, width2 mov ebp, height2 mov _EDI, prm_v_dst PLANE_FILL _EDI, prm_uv_dst_stride, _ESI, _EBP, XMM_OPT jmp near .Done_UVPlane ENDFUNC %undef NAME %undef XMM_OPT %endmacro ;============================================================================= ; Code ;============================================================================= TEXT MAKE_YV12_TO_YV12 yv12_to_yv12_mmx, 0 MAKE_YV12_TO_YV12 yv12_to_yv12_xmm, 1 NON_EXEC_STACK xvidcore/src/image/x86_asm/reduced_mmx.asm0000664000076500007650000005373011345416076021636 0ustar 
xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - Reduced-Resolution utilities - ; * ; * Copyright(C) 2002 Pascal Massimino ; * ; * Xvid is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: reduced_mmx.asm,v 1.13 2010-03-09 10:00:14 Isibaar Exp $ ; * ; *************************************************************************/ %include "nasm.inc" ;=========================================================================== DATA align SECTION_ALIGN Up31 dw 3, 1, 3, 1 Up13 dw 1, 3, 1, 3 Up93 dw 9, 3, 9, 3 Up39 dw 3, 9, 3, 9 Cst0 dw 0, 0, 0, 0 Cst2 dw 2, 2, 2, 2 Cst3 dw 3, 3, 3, 3 Cst32 dw 32,32,32,32 Cst2000 dw 2, 0, 0, 0 Cst0002 dw 0, 0, 0, 2 Mask_ff dw 0xff,0xff,0xff,0xff ;=========================================================================== TEXT cglobal xvid_Copy_Upsampled_8x8_16To8_mmx cglobal xvid_Add_Upsampled_8x8_16To8_mmx cglobal xvid_Copy_Upsampled_8x8_16To8_xmm cglobal xvid_Add_Upsampled_8x8_16To8_xmm cglobal xvid_HFilter_31_mmx cglobal xvid_VFilter_31_x86 cglobal xvid_HFilter_31_x86 cglobal xvid_Filter_18x18_To_8x8_mmx cglobal xvid_Filter_Diff_18x18_To_8x8_mmx ;////////////////////////////////////////////////////////////////////// ;// 8x8 -> 16x16 upsampling (16b) 
;////////////////////////////////////////////////////////////////////// %macro MUL_PACK 4 ; %1/%2: regs %3/%4/%5: Up13/Up31 pmullw %1, %3 ; [Up13] pmullw mm4, %4 ; [Up31] pmullw %2, %3 ; [Up13] pmullw mm5, %4 ; [Up31] paddsw %1, [Cst2] paddsw %2, [Cst2] paddsw %1, mm4 paddsw %2, mm5 %endmacro ; MMX-way of reordering columns... %macro COL03 3 ;%1/%2: regs, %3: row -output: mm4/mm5 movq %1, [TMP1+%3*16+0*2] ; %1 = 0|1|2|3 movq %2,[TMP1+%3*16+1*2] ; %2 = 1|2|3|4 movq mm5, %1 ; mm5 = 0|1|2|3 movq mm4, %1 ; mm4 = 0|1|2|3 punpckhwd mm5,%2 ; mm5 = 2|3|3|4 punpcklwd mm4,%2 ; mm4 = 0|1|1|2 punpcklwd %1,%1 ; %1 = 0|0|1|1 punpcklwd %2, mm5 ; %2 = 1|2|2|3 punpcklwd %1, mm4 ; %1 = 0|0|0|1 %endmacro %macro COL47 3 ;%1-%2: regs, %3: row -output: mm4/mm5 movq mm5, [TMP1+%3*16+4*2] ; mm5 = 4|5|6|7 movq %1, [TMP1+%3*16+3*2] ; %1 = 3|4|5|6 movq %2, mm5 ; %2 = 4|5|6|7 movq mm4, mm5 ; mm4 = 4|5|6|7 punpckhwd %2, %2 ; %2 = 6|6|7|7 punpckhwd mm5, %2 ; mm5 = 6|7|7|7 movq %2, %1 ; %2 = 3|4|5|6 punpcklwd %1, mm4 ; %1 = 3|4|4|5 punpckhwd %2, mm4 ; %2 = 5|6|6|7 punpcklwd mm4, %2 ; mm4 = 4|5|5|6 %endmacro %macro MIX_ROWS 4 ; %1/%2:prev %3/4:cur (preserved) mm4/mm5: output ; we need to perform: (%1,%3) -> (%1 = 3*%1+%3, mm4 = 3*%3+%1), %3 preserved. 
movq mm4, [Cst3] movq mm5, [Cst3] pmullw mm4, %3 pmullw mm5, %4 paddsw mm4, %1 paddsw mm5, %2 pmullw %1, [Cst3] pmullw %2, [Cst3] paddsw %1, %3 paddsw %2, %4 %endmacro ;=========================================================================== ; ; void xvid_Copy_Upsampled_8x8_16To8_mmx(uint8_t *Dst, ; const int16_t *Src, const int BpS); ; ;=========================================================================== ; Note: we can use ">>2" instead of "/4" here, since we ; are (supposed to be) averaging positive values %macro STORE_1 2 psraw %1, 2 psraw %2, 2 packuswb %1,%2 movq [TMP0], %1 %endmacro %macro STORE_2 2 ; pack and store (%1,%2) + (mm4,mm5) psraw %1, 4 psraw %2, 4 psraw mm4, 4 psraw mm5, 4 packuswb %1,%2 packuswb mm4, mm5 movq [TMP0], %1 movq [TMP0+_EAX], mm4 lea TMP0, [TMP0+2*_EAX] %endmacro ;////////////////////////////////////////////////////////////////////// align SECTION_ALIGN xvid_Copy_Upsampled_8x8_16To8_mmx: ; 344c mov TMP0, prm1 ; Dst mov TMP1, prm2 ; Src mov _EAX, prm3 ; BpS movq mm6, [Up13] movq mm7, [Up31] COL03 mm0, mm1, 0 MUL_PACK mm0,mm1, mm6, mm7 movq mm4, mm0 movq mm5, mm1 STORE_1 mm4, mm5 add TMP0, _EAX COL03 mm2, mm3, 1 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL03 mm0, mm1, 2 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL03 mm2, mm3, 3 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL03 mm0, mm1, 4 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL03 mm2, mm3, 5 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL03 mm0, mm1, 6 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL03 mm2, mm3, 7 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 STORE_1 mm2, mm3 mov TMP0, prm1 add TMP0, 8 COL47 mm0, mm1, 0 MUL_PACK mm0,mm1, mm6, mm7 movq mm4, mm0 movq mm5, mm1 STORE_1 mm4, mm5 add TMP0, _EAX COL47 mm2, mm3, 1 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, 
mm3
  STORE_2 mm0, mm1
  COL47 mm0, mm1, 2
  MUL_PACK mm0,mm1, mm6, mm7
  MIX_ROWS mm2, mm3, mm0, mm1
  STORE_2 mm2, mm3
  COL47 mm2, mm3, 3
  MUL_PACK mm2,mm3, mm6, mm7
  MIX_ROWS mm0, mm1, mm2, mm3
  STORE_2 mm0, mm1
  COL47 mm0, mm1, 4
  MUL_PACK mm0,mm1, mm6, mm7
  MIX_ROWS mm2, mm3, mm0, mm1
  STORE_2 mm2, mm3
  COL47 mm2, mm3, 5
  MUL_PACK mm2,mm3, mm6, mm7
  MIX_ROWS mm0, mm1, mm2, mm3
  STORE_2 mm0, mm1
  COL47 mm0, mm1, 6
  MUL_PACK mm0,mm1, mm6, mm7
  MIX_ROWS mm2, mm3, mm0, mm1
  STORE_2 mm2, mm3
  COL47 mm2, mm3, 7
  MUL_PACK mm2,mm3, mm6, mm7
  MIX_ROWS mm0, mm1, mm2, mm3
  STORE_2 mm0, mm1
  STORE_1 mm2, mm3
  ret
ENDFUNC

;===========================================================================
;
; void xvid_Add_Upsampled_8x8_16To8_mmx(uint8_t *Dst,
;                                       const int16_t *Src, const int BpS);
;
;===========================================================================

; Note: the 'pcmpgtw' sequences implement the truncating "/4" and "/16"
; operators with ">>2" and ">>4", using:
;   x/4  = ( (x-(x<0))>>2 ) + (x<0)
;   x/16 = ( (x-(x<0))>>4 ) + (x<0)

%macro STORE_ADD_1 2
  ; We subtract the rounder '2' for corner pixels,
  ; since when 'x' is negative, (x*4 + 2)/4 is *not*
  ; equal to 'x'. In fact, the correct relation is:
  ;   (x*4 + 2)/4 = x + (x<0)
  ; So, better revert to (x*4)/4 = x.
psubsw %1, [Cst2000] psubsw %2, [Cst0002] pxor mm6, mm6 pxor mm7, mm7 pcmpgtw mm6, %1 pcmpgtw mm7, %2 paddsw %1, mm6 paddsw %2, mm7 psraw %1, 2 psraw %2, 2 psubsw %1, mm6 psubsw %2, mm7 ; mix with destination [TMP0] movq mm6, [TMP0] movq mm7, [TMP0] punpcklbw mm6, [Cst0] punpckhbw mm7, [Cst0] paddsw %1, mm6 paddsw %2, mm7 packuswb %1,%2 movq [TMP0], %1 %endmacro %macro STORE_ADD_2 2 pxor mm6, mm6 pxor mm7, mm7 pcmpgtw mm6, %1 pcmpgtw mm7, %2 paddsw %1, mm6 paddsw %2, mm7 psraw %1, 4 psraw %2, 4 psubsw %1, mm6 psubsw %2, mm7 pxor mm6, mm6 pxor mm7, mm7 pcmpgtw mm6, mm4 pcmpgtw mm7, mm5 paddsw mm4, mm6 paddsw mm5, mm7 psraw mm4, 4 psraw mm5, 4 psubsw mm4, mm6 psubsw mm5, mm7 ; mix with destination movq mm6, [TMP0] movq mm7, [TMP0] punpcklbw mm6, [Cst0] punpckhbw mm7, [Cst0] paddsw %1, mm6 paddsw %2, mm7 movq mm6, [TMP0+_EAX] movq mm7, [TMP0+_EAX] punpcklbw mm6, [Cst0] punpckhbw mm7, [Cst0] paddsw mm4, mm6 paddsw mm5, mm7 packuswb %1,%2 packuswb mm4, mm5 movq [TMP0], %1 movq [TMP0+_EAX], mm4 lea TMP0, [TMP0+2*_EAX] %endmacro ;////////////////////////////////////////////////////////////////////// align SECTION_ALIGN xvid_Add_Upsampled_8x8_16To8_mmx: ; 579c mov TMP0, prm1 ; Dst mov TMP1, prm2 ; Src mov _EAX, prm3 ; BpS COL03 mm0, mm1, 0 MUL_PACK mm0,mm1, [Up13], [Up31] movq mm4, mm0 movq mm5, mm1 STORE_ADD_1 mm4, mm5 add TMP0, _EAX COL03 mm2, mm3, 1 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL03 mm0, mm1, 2 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL03 mm2, mm3, 3 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL03 mm0, mm1, 4 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL03 mm2, mm3, 5 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL03 mm0, mm1, 6 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL03 mm2, mm3, 7 MUL_PACK mm2,mm3, [Up13], [Up31] 
MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 STORE_ADD_1 mm2, mm3 mov TMP0, prm1 add TMP0, 8 COL47 mm0, mm1, 0 MUL_PACK mm0,mm1, [Up13], [Up31] movq mm4, mm0 movq mm5, mm1 STORE_ADD_1 mm4, mm5 add TMP0, _EAX COL47 mm2, mm3, 1 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL47 mm0, mm1, 2 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL47 mm2, mm3, 3 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL47 mm0, mm1, 4 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL47 mm2, mm3, 5 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL47 mm0, mm1, 6 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL47 mm2, mm3, 7 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 STORE_ADD_1 mm2, mm3 ret ENDFUNC ;=========================================================================== ; ; void xvid_Copy_Upsampled_8x8_16To8_xmm(uint8_t *Dst, ; const int16_t *Src, const int BpS); ; ;=========================================================================== ; xmm version can take (little) advantage of 'pshufw' %macro COL03_SSE 3 ;%1/%2: regs, %3: row -trashes mm4/mm5 movq %2, [TMP1+%3*16+0*2] ; <- 0|1|2|3 pshufw %1, %2, (0+0*4+0*16+1*64) ; %1 = 0|0|0|1 pshufw mm4, %2, (0+1*4+1*16+2*64) ; mm4= 0|1|1|2 pshufw %2, %2, (1+2*4+2*16+3*64) ; %2 = 1|2|2|3 pshufw mm5, [TMP1+%3*16+2*2], (0+1*4+1*16+2*64) ; mm5 = 2|3|3|4 %endmacro %macro COL47_SSE 3 ;%1-%2: regs, %3: row -trashes mm4/mm5 pshufw %1, [TMP1+%3*16+2*2], (1+2*4+2*16+3*64) ; 3|4|4|5 movq mm5, [TMP1+%3*16+2*4] ; <- 4|5|6|7 pshufw mm4, mm5, (0+1*4+1*16+2*64) ; 4|5|5|6 pshufw %2, mm5, (1+2*4+2*16+3*64) ; 5|6|6|7 pshufw mm5, mm5, (2+3*4+3*16+3*64) ; 6|7|7|7 %endmacro ;////////////////////////////////////////////////////////////////////// align SECTION_ALIGN xvid_Copy_Upsampled_8x8_16To8_xmm: ; 315c 
mov TMP0, prm1 ; Dst mov TMP1, prm2 ; Src mov _EAX, prm3 ; BpS movq mm6, [Up13] movq mm7, [Up31] COL03_SSE mm0, mm1, 0 MUL_PACK mm0,mm1, mm6, mm7 movq mm4, mm0 movq mm5, mm1 STORE_1 mm4, mm5 add TMP0, _EAX COL03_SSE mm2, mm3, 1 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL03_SSE mm0, mm1, 2 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL03_SSE mm2, mm3, 3 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL03_SSE mm0, mm1, 4 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL03_SSE mm2, mm3, 5 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL03_SSE mm0, mm1, 6 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL03_SSE mm2, mm3, 7 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 STORE_1 mm2, mm3 mov TMP0, prm1 add TMP0, 8 COL47_SSE mm0, mm1, 0 MUL_PACK mm0,mm1, mm6, mm7 movq mm4, mm0 movq mm5, mm1 STORE_1 mm4, mm5 add TMP0, _EAX COL47_SSE mm2, mm3, 1 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL47_SSE mm0, mm1, 2 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL47_SSE mm2, mm3, 3 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL47_SSE mm0, mm1, 4 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL47_SSE mm2, mm3, 5 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 COL47_SSE mm0, mm1, 6 MUL_PACK mm0,mm1, mm6, mm7 MIX_ROWS mm2, mm3, mm0, mm1 STORE_2 mm2, mm3 COL47_SSE mm2, mm3, 7 MUL_PACK mm2,mm3, mm6, mm7 MIX_ROWS mm0, mm1, mm2, mm3 STORE_2 mm0, mm1 STORE_1 mm2, mm3 ret ENDFUNC ;=========================================================================== ; ; void xvid_Add_Upsampled_8x8_16To8_xmm(uint8_t *Dst, ; const int16_t *Src, const int BpS); ; ;=========================================================================== align SECTION_ALIGN 
xvid_Add_Upsampled_8x8_16To8_xmm: ; 549c mov TMP0, prm1 ; Dst mov TMP1, prm2 ; Src mov _EAX, prm3 ; BpS COL03_SSE mm0, mm1, 0 MUL_PACK mm0,mm1, [Up13], [Up31] movq mm4, mm0 movq mm5, mm1 STORE_ADD_1 mm4, mm5 add TMP0, _EAX COL03_SSE mm2, mm3, 1 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL03_SSE mm0, mm1, 2 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL03_SSE mm2, mm3, 3 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL03_SSE mm0, mm1, 4 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL03_SSE mm2, mm3, 5 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL03_SSE mm0, mm1, 6 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL03_SSE mm2, mm3, 7 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 STORE_ADD_1 mm2, mm3 mov TMP0, prm1 add TMP0, 8 COL47_SSE mm0, mm1, 0 MUL_PACK mm0,mm1, [Up13], [Up31] movq mm4, mm0 movq mm5, mm1 STORE_ADD_1 mm4, mm5 add TMP0, _EAX COL47_SSE mm2, mm3, 1 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL47_SSE mm0, mm1, 2 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL47_SSE mm2, mm3, 3 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL47_SSE mm0, mm1, 4 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL47_SSE mm2, mm3, 5 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 COL47_SSE mm0, mm1, 6 MUL_PACK mm0,mm1, [Up13], [Up31] MIX_ROWS mm2, mm3, mm0, mm1 STORE_ADD_2 mm2, mm3 COL47_SSE mm2, mm3, 7 MUL_PACK mm2,mm3, [Up13], [Up31] MIX_ROWS mm0, mm1, mm2, mm3 STORE_ADD_2 mm0, mm1 STORE_ADD_1 mm2, mm3 ret ENDFUNC ;=========================================================================== ; ; void xvid_HFilter_31_mmx(uint8_t *Src1, 
uint8_t *Src2, int Nb_Blks);
; void xvid_VFilter_31_x86(uint8_t *Src1, uint8_t *Src2, const int BpS, int Nb_Blks);
; void xvid_HFilter_31_x86(uint8_t *Src1, uint8_t *Src2, int Nb_Blks);
;
;===========================================================================

;//////////////////////////////////////////////////////////////////////
;// horizontal/vertical filtering: [x,y] -> [ (3x+y+2)>>2, (x+3y+2)>>2 ]
;//
;// We use the trick: tmp = (x+y+2) -> [x = (tmp+2x)>>2, y = (tmp+2y)>>2]
;//////////////////////////////////////////////////////////////////////

align SECTION_ALIGN
xvid_HFilter_31_mmx:
  mov TMP0, prm1 ; Src1
  mov TMP1, prm2 ; Src2
  mov _EAX, prm3 ; Nb_Blks
  lea _EAX, [_EAX*2]
  movq mm5, [Cst2]
  pxor mm7, mm7
  lea TMP0, [TMP0+_EAX*4]
  lea TMP1, [TMP1+_EAX*4]
  neg _EAX

.Loop: ;12c
  movd mm0, [TMP0+_EAX*4]
  movd mm1, [TMP1+_EAX*4]
  movq mm2, mm5
  punpcklbw mm0, mm7
  punpcklbw mm1, mm7
  paddsw mm2, mm0
  paddsw mm0, mm0
  paddsw mm2, mm1
  paddsw mm1, mm1
  paddsw mm0, mm2
  paddsw mm1, mm2
  psraw mm0, 2
  psraw mm1, 2
  packuswb mm0, mm7
  packuswb mm1, mm7
  movd [TMP0+_EAX*4], mm0
  movd [TMP1+_EAX*4], mm1
  inc _EAX
  jl .Loop
  ret
ENDFUNC

; mmx is of no use here. Better use plain ASM. Moreover,
; this is for the fun of ASM coding, because every modern compiler can
; end up with code that looks very much like this one...

align SECTION_ALIGN
xvid_VFilter_31_x86:
  mov TMP0, prm1 ; Src1
  mov TMP1, prm2 ; Src2
  mov _EAX, prm4 ; Nb_Blks
  lea _EAX, [_EAX*8]

  push _ESI
  push _EDI
  push _EBX
  push _EBP

%ifdef ARCH_IS_X86_64
  mov _EBP, prm3
%else
  mov _EBP, [_ESP+12 +16] ; BpS
%endif

.Loop: ;7c
  movzx _ESI, byte [TMP0]
  movzx _EDI, byte [TMP1]
  lea _EBX,[_ESI+_EDI+2]
  lea _ESI,[_EBX+2*_ESI]
  lea _EDI,[_EBX+2*_EDI]
  shr _ESI,2
  shr _EDI,2
  ; the filtered bytes live in _ESI/_EDI; store them through bl,
  ; since esi/edi have no 8-bit form on x86-32
  mov _EBX, _ESI
  mov [TMP0], bl
  mov _EBX, _EDI
  mov [TMP1], bl
  lea TMP0, [TMP0+_EBP]
  lea TMP1, [TMP1+_EBP]
  dec _EAX
  jg .Loop

  pop _EBP
  pop _EBX
  pop _EDI
  pop _ESI
  ret
ENDFUNC

; this one's just a little faster than gcc's code. Very little.
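; For reference, a minimal scalar sketch of the 3:1 filter described above,
; following the documented prototypes. The function names below are
; hypothetical and are not part of xvidcore; this is an illustration of the
; "tmp = x+y+2" trick, not the project's own C fallback.

```c
#include <assert.h>
#include <stdint.h>

/* Filter one pair in place: [x,y] -> [(3x+y+2)>>2, (x+3y+2)>>2]. */
static void filter_31_pair(uint8_t *a, uint8_t *b)
{
    int x = *a, y = *b;
    int tmp = x + y + 2;                  /* shared term, computed once */
    *a = (uint8_t)((tmp + 2 * x) >> 2);   /* == (3x + y + 2) >> 2 */
    *b = (uint8_t)((tmp + 2 * y) >> 2);   /* == (x + 3y + 2) >> 2 */
}

/* Horizontal variant: 8 consecutive bytes per block. */
static void hfilter_31_ref(uint8_t *src1, uint8_t *src2, int nb_blks)
{
    assert(nb_blks >= 0);
    for (int n = 0; n < nb_blks * 8; n++)
        filter_31_pair(&src1[n], &src2[n]);
}

/* Vertical variant: 8 rows per block, stepping by bps bytes per row. */
static void vfilter_31_ref(uint8_t *src1, uint8_t *src2, int bps, int nb_blks)
{
    assert(nb_blks >= 0);
    for (int n = 0; n < nb_blks * 8; n++) {
        filter_31_pair(src1, src2);
        src1 += bps;
        src2 += bps;
    }
}
```

; Because both inputs are in 0..255, the results never exceed 255, so the
; narrowing casts above do not lose information.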
align SECTION_ALIGN
xvid_HFilter_31_x86:
  mov TMP0, prm1 ; Src1
  mov TMP1, prm2 ; Src2
  mov _EAX, prm3 ; Nb_Blks

  lea _EAX,[_EAX*8]
  lea TMP0, [TMP0+_EAX]
  lea TMP1, [TMP1+_EAX] ; advance Src2 from its own base, not from Src1
  neg _EAX

  push _ESI
  push _EDI
  push _EBX

.Loop: ; 6c
  movzx _ESI, byte [TMP0+_EAX]
  movzx _EDI, byte [TMP1+_EAX]
  lea _EBX, [_ESI+_EDI+2]
  lea _ESI,[_EBX+2*_ESI]
  lea _EDI,[_EBX+2*_EDI]
  shr _ESI,2
  shr _EDI,2
  ; the filtered bytes live in _ESI/_EDI; store them through bl,
  ; since esi/edi have no 8-bit form on x86-32
  mov _EBX, _ESI
  mov [TMP0+_EAX], bl
  mov _EBX, _EDI
  mov [TMP1+_EAX], bl
  inc _EAX
  jl .Loop

  pop _EBX
  pop _EDI
  pop _ESI
  ret
ENDFUNC

;//////////////////////////////////////////////////////////////////////
;// 16b downsampling 16x16 -> 8x8
;//////////////////////////////////////////////////////////////////////

%macro HFILTER_1331 2 ;%1:src %2:dst reg. -trashes mm0/mm1/mm2
  movq mm2, [Mask_ff]
  movq %2, [%1-1] ;-10123456
  movq mm0, [%1] ; 01234567
  movq mm1, [%1+1] ; 12345678
  pand %2, mm2 ;-1|1|3|5
  pand mm0, mm2 ; 0|2|4|6
  pand mm1, mm2 ; 1|3|5|7
  pand mm2, [%1+2] ; 2|4|6|8
  paddusw mm0, mm1
  paddusw %2, mm2
  pmullw mm0, mm7
  paddusw %2, mm0
%endmacro

%macro VFILTER_1331 4 ; %1-4: regs %1-%2: trashed
  paddsw %1, [Cst32]
  paddsw %2, %3
  pmullw %2, mm7
  paddsw %1,%4
  paddsw %1, %2
  psraw %1, 6
%endmacro

;===========================================================================
;
; void xvid_Filter_18x18_To_8x8_mmx(int16_t *Dst,
;                                   const uint8_t *Src, const int BpS);
;
;===========================================================================

%macro COPY_TWO_LINES_1331 1 ; %1: dst
  HFILTER_1331 TMP1 , mm5
  HFILTER_1331 TMP1+_EAX, mm6
  lea TMP1, [TMP1+2*_EAX]
  VFILTER_1331 mm3,mm4,mm5, mm6
  movq [%1], mm3

  HFILTER_1331 TMP1 , mm3
  HFILTER_1331 TMP1+_EAX, mm4
  lea TMP1, [TMP1+2*_EAX]
  VFILTER_1331 mm5,mm6,mm3,mm4
  movq [%1+16], mm5
%endmacro

align SECTION_ALIGN
xvid_Filter_18x18_To_8x8_mmx: ; 283c (~4.4c per output pixel)
  mov TMP0, prm1 ; Dst
  mov TMP1, prm2 ; Src
  mov _EAX, prm3 ; BpS

  movq mm7, [Cst3]
  sub TMP1, _EAX

  ; mm3/mm4/mm5/mm6 is used as a 4-samples delay line.
; process columns 0-3 HFILTER_1331 TMP1 , mm3 ; pre-load mm3/mm4 HFILTER_1331 TMP1+_EAX, mm4 lea TMP1, [TMP1+2*_EAX] COPY_TWO_LINES_1331 TMP0 + 0*16 COPY_TWO_LINES_1331 TMP0 + 2*16 COPY_TWO_LINES_1331 TMP0 + 4*16 COPY_TWO_LINES_1331 TMP0 + 6*16 ; process columns 4-7 mov TMP1, prm2 sub TMP1, _EAX add TMP1, 8 HFILTER_1331 TMP1 , mm3 ; pre-load mm3/mm4 HFILTER_1331 TMP1+_EAX, mm4 lea TMP1, [TMP1+2*_EAX] COPY_TWO_LINES_1331 TMP0 + 0*16 +8 COPY_TWO_LINES_1331 TMP0 + 2*16 +8 COPY_TWO_LINES_1331 TMP0 + 4*16 +8 COPY_TWO_LINES_1331 TMP0 + 6*16 +8 ret ENDFUNC ;=========================================================================== ; ; void xvid_Filter_Diff_18x18_To_8x8_mmx(int16_t *Dst, ; const uint8_t *Src, const int BpS); ; ;=========================================================================== %macro DIFF_TWO_LINES_1331 1 ; %1: dst HFILTER_1331 TMP1 , mm5 HFILTER_1331 TMP1+_EAX, mm6 lea TMP1, [TMP1+2*_EAX] movq mm2, [%1] VFILTER_1331 mm3,mm4,mm5, mm6 psubsw mm2, mm3 movq [%1], mm2 HFILTER_1331 TMP1 , mm3 HFILTER_1331 TMP1+_EAX, mm4 lea TMP1, [TMP1+2*_EAX] movq mm2, [%1+16] VFILTER_1331 mm5,mm6,mm3,mm4 psubsw mm2, mm5 movq [%1+16], mm2 %endmacro align SECTION_ALIGN xvid_Filter_Diff_18x18_To_8x8_mmx: ; 302c mov TMP0, prm1 ; Dst mov TMP1, prm2 ; Src mov _EAX, prm3 ; BpS movq mm7, [Cst3] sub TMP1, _EAX ; mm3/mm4/mm5/mm6 is used as a 4-samples delay line. 
; process columns 0-3 HFILTER_1331 TMP1 , mm3 ; pre-load mm3/mm4 HFILTER_1331 TMP1+_EAX, mm4 lea TMP1, [TMP1+2*_EAX] DIFF_TWO_LINES_1331 TMP0 + 0*16 DIFF_TWO_LINES_1331 TMP0 + 2*16 DIFF_TWO_LINES_1331 TMP0 + 4*16 DIFF_TWO_LINES_1331 TMP0 + 6*16 ; process columns 4-7 mov TMP1, prm2 sub TMP1, _EAX add TMP1, 8 HFILTER_1331 TMP1 , mm3 ; pre-load mm3/mm4 HFILTER_1331 TMP1+_EAX, mm4 lea TMP1, [TMP1+2*_EAX] DIFF_TWO_LINES_1331 TMP0 + 0*16 +8 DIFF_TWO_LINES_1331 TMP0 + 2*16 +8 DIFF_TWO_LINES_1331 TMP0 + 4*16 +8 DIFF_TWO_LINES_1331 TMP0 + 6*16 +8 ret ENDFUNC ;////////////////////////////////////////////////////////////////////// ; pfeewwww... Never Do That On Stage Again. :) NON_EXEC_STACK xvidcore/src/image/x86_asm/postprocessing_mmx.asm0000664000076500007650000000544611345416076023306 0ustar xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - mmx post processing - ; * ; * Copyright(C) 2004 Peter Ross ; * ; * Xvid is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: postprocessing_mmx.asm,v 1.14 2010-03-09 10:00:14 Isibaar Exp $ ; * ; *************************************************************************/ %include "nasm.inc" ;=========================================================================== ; read only data ;=========================================================================== DATA mmx_0x80: times 8 db 0x80 mmx_offset: %assign i -128 %rep 256 times 8 db i %assign i i+1 %endrep ;============================================================================= ; Code ;============================================================================= TEXT cglobal image_brightness_mmx ;////////////////////////////////////////////////////////////////////// ;// image_brightness_mmx ;////////////////////////////////////////////////////////////////////// align SECTION_ALIGN image_brightness_mmx: movq mm6, [mmx_0x80] %ifdef ARCH_IS_X86_64 XVID_MOVSXD _EAX, prm5d lea TMP0, [mmx_offset] movq mm7, [TMP0 + (_EAX + 128)*8] ; being lazy %else mov eax, prm5d ; offset movq mm7, [mmx_offset + (_EAX + 128)*8] ; being lazy %endif mov TMP1, prm1 ; Dst mov TMP0, prm2 ; stride push _ESI push _EDI %ifdef ARCH_IS_X86_64 mov _ESI, prm3 mov _EDI, prm4 %else mov _ESI, [_ESP+8+12] ; width mov _EDI, [_ESP+8+16] ; height %endif .yloop: xor _EAX, _EAX .xloop: movq mm0, [TMP1 + _EAX] movq mm1, [TMP1 + _EAX + 8] ; mm0 = [dst] paddb mm0, mm6 ; unsigned -> signed domain paddb mm1, mm6 paddsb mm0, mm7 paddsb mm1, mm7 ; mm0 += offset psubb mm0, mm6 psubb mm1, mm6 ; signed -> unsigned domain movq [TMP1 + _EAX], mm0 movq [TMP1 + _EAX + 8], mm1 ; [dst] = mm0 add _EAX,16 cmp _EAX,_ESI jl .xloop add TMP1, TMP0 ; dst += stride dec _EDI jg .yloop pop _EDI pop _ESI ret ENDFUNC 
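; A scalar sketch of what image_brightness_mmx computes per pixel, assuming
; the parameter order (dst, stride, width, height, offset) suggested by the
; prm1..prm5 loads above. The name image_brightness_ref is hypothetical.
; The MMX loop handles 16 pixels per inner iteration, so it effectively
; rounds width up to a multiple of 16; this sketch does not.

```c
#include <assert.h>
#include <stdint.h>

/* Add `offset` to every pixel, clamping the result to 0..255. */
static void image_brightness_ref(uint8_t *dst, int stride,
                                 int width, int height, int offset)
{
    /* mmx_offset holds one table entry per value in -128..127 */
    assert(offset >= -128 && offset <= 127);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int s = (int)dst[x] - 128;   /* paddb 0x80: unsigned -> signed domain */
            s += offset;                 /* paddsb: signed add ...                */
            if (s > 127)  s = 127;       /* ... with saturation                   */
            if (s < -128) s = -128;
            dst[x] = (uint8_t)(s + 128); /* psubb 0x80: signed -> unsigned domain */
        }
        dst += stride;                   /* dst += stride */
    }
}
```

; Moving to the signed domain first is what lets a signed saturating add
; behave like an unsigned clamp to 0..255 for both positive and negative
; offsets.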
;////////////////////////////////////////////////////////////////////// NON_EXEC_STACK xvidcore/src/image/x86_asm/interpolate8x8_3dn.asm0000664000076500007650000002574011254216113022771 0ustar xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - 3dnow 8x8 block-based halfpel interpolation - ; * ; * Copyright(C) 2001 Peter Ross ; * 2002-2008 Michael Militzer ; * 2002 Pascal Massimino ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; ****************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read Only data ;============================================================================= DATA ALIGN SECTION_ALIGN mmx_one: times 8 db 1 ;============================================================================= ; Code ;============================================================================= TEXT cglobal interpolate8x8_halfpel_h_3dn cglobal interpolate8x8_halfpel_v_3dn cglobal interpolate8x8_halfpel_hv_3dn cglobal interpolate8x4_halfpel_h_3dn cglobal interpolate8x4_halfpel_v_3dn cglobal interpolate8x4_halfpel_hv_3dn ;----------------------------------------------------------------------------- 
; ; void interpolate8x8_halfpel_h_3dn(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- %macro COPY_H_3DN_RND0 0 movq mm0, [_EAX] pavgusb mm0, [_EAX+1] movq mm1, [_EAX+TMP1] pavgusb mm1, [_EAX+TMP1+1] lea _EAX, [_EAX+2*TMP1] movq [TMP0], mm0 movq [TMP0+TMP1], mm1 %endmacro %macro COPY_H_3DN_RND1 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] movq mm4, mm0 movq mm5, mm1 movq mm2, [_EAX+1] movq mm3, [_EAX+TMP1+1] pavgusb mm0, mm2 pxor mm2, mm4 pavgusb mm1, mm3 lea _EAX, [_EAX+2*TMP1] pxor mm3, mm5 pand mm2, mm7 pand mm3, mm7 psubb mm0, mm2 movq [TMP0], mm0 psubb mm1, mm3 movq [TMP0+TMP1], mm1 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_h_3dn: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX, _EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride jnz near .rounding1 COPY_H_3DN_RND0 lea TMP0, [TMP0+2*TMP1] COPY_H_3DN_RND0 lea TMP0, [TMP0+2*TMP1] COPY_H_3DN_RND0 lea TMP0, [TMP0+2*TMP1] COPY_H_3DN_RND0 ret .rounding1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 movq mm7, [mmx_one] COPY_H_3DN_RND1 lea TMP0, [TMP0+2*TMP1] COPY_H_3DN_RND1 lea TMP0, [TMP0+2*TMP1] COPY_H_3DN_RND1 lea TMP0, [TMP0+2*TMP1] COPY_H_3DN_RND1 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x8_halfpel_v_3dn(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- %macro COPY_V_3DN_RND0 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] pavgusb mm0, mm1 pavgusb mm1, [_EAX+2*TMP1] lea _EAX, [_EAX+2*TMP1] movq [TMP0], mm0 movq [TMP0+TMP1], mm1 %endmacro %macro COPY_V_3DN_RND1 0 movq mm0, mm2 movq mm1, [_EAX] movq mm2, [_EAX+TMP1] lea _EAX, [_EAX+2*TMP1] movq mm4, mm0 movq mm5, mm1 pavgusb mm0, mm1 pxor mm4, mm1 pavgusb mm1, mm2 pxor mm5, mm2 pand mm4, mm7 ; lsb's of (i^j)... 
  pand mm5, mm7 ; lsb's of (i^j)...
  psubb mm0, mm4 ; ...are subtracted from result of pavgusb
  movq [TMP0], mm0
  psubb mm1, mm5 ; ...are subtracted from result of pavgusb
  movq [TMP0+TMP1], mm1
%endmacro

ALIGN SECTION_ALIGN
interpolate8x8_halfpel_v_3dn:
  mov _EAX, prm4 ; rounding
  mov TMP0, prm1 ; Dst
  test _EAX,_EAX
  mov _EAX, prm2 ; Src
  mov TMP1, prm3 ; stride

  ; we process 2 lines at a time
  jnz near .rounding1

  COPY_V_3DN_RND0
  lea TMP0, [TMP0+2*TMP1]
  COPY_V_3DN_RND0
  lea TMP0, [TMP0+2*TMP1]
  COPY_V_3DN_RND0
  lea TMP0, [TMP0+2*TMP1]
  COPY_V_3DN_RND0
  ret

.rounding1:
  ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1
  movq mm7, [mmx_one]
  movq mm2, [_EAX] ; loop invariant
  add _EAX, TMP1

  COPY_V_3DN_RND1
  lea TMP0, [TMP0+2*TMP1]
  COPY_V_3DN_RND1
  lea TMP0, [TMP0+2*TMP1]
  COPY_V_3DN_RND1
  lea TMP0, [TMP0+2*TMP1]
  COPY_V_3DN_RND1
  ret
ENDFUNC

;-----------------------------------------------------------------------------
;
; void interpolate8x8_halfpel_hv_3dn(uint8_t * const dst,
;                                    const uint8_t * const src,
;                                    const uint32_t stride,
;                                    const uint32_t rounding);
;
;
;-----------------------------------------------------------------------------

; The trick is to correct the result of 'pavgusb' with some combination of the
; lsb's of the 4 input values i,j,k,l, and their intermediate 'pavgusb' (s and t).
; The boolean relations are:
;   (i+j+k+l+3)/4 = (s+t+1)/2 - (ij&kl)&st
;   (i+j+k+l+2)/4 = (s+t+1)/2 - (ij|kl)&st
;   (i+j+k+l+1)/4 = (s+t+1)/2 - (ij&kl)|st
;   (i+j+k+l+0)/4 = (s+t+1)/2 - (ij|kl)|st
; with s=(i+j+1)/2, t=(k+l+1)/2, ij = i^j, kl = k^l, st = s^t.

; Moreover, we process 2 lines at a time, for better overlapping (~15% faster).

%macro COPY_HV_3DN_RND0 0
  lea _EAX, [_EAX+TMP1]
  movq mm0, [_EAX]
  movq mm1, [_EAX+1]
  movq mm6, mm0
  pavgusb mm0, mm1 ; mm0=(j+k+1)/2. preserved for next step
  lea _EAX, [_EAX+TMP1]
  pxor mm1, mm6 ; mm1=(j^k).
preserved for next step por mm3, mm1 ; ij |= jk movq mm6, mm2 pxor mm6, mm0 ; mm6 = s^t pand mm3, mm6 ; (ij|jk) &= st pavgusb mm2, mm0 ; mm2 = (s+t+1)/2 pand mm3, mm7 ; mask lsb psubb mm2, mm3 ; apply. movq [TMP0], mm2 movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgusb mm2, mm3 ; preserved for next iteration lea TMP0, [TMP0+TMP1] pxor mm3, mm6 ; preserved for next iteration por mm1, mm3 movq mm6, mm0 pxor mm6, mm2 pand mm1, mm6 pavgusb mm0, mm2 pand mm1, mm7 psubb mm0, mm1 movq [TMP0], mm0 %endmacro %macro COPY_HV_3DN_RND1 0 lea _EAX,[_EAX+TMP1] movq mm0, [_EAX] movq mm1, [_EAX+1] movq mm6, mm0 pavgusb mm0, mm1 ; mm0=(j+k+1)/2. preserved for next step lea _EAX, [_EAX+TMP1] pxor mm1, mm6 ; mm1=(j^k). preserved for next step pand mm3, mm1 movq mm6, mm2 pxor mm6, mm0 por mm3, mm6 pavgusb mm2, mm0 pand mm3, mm7 psubb mm2, mm3 movq [TMP0], mm2 movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgusb mm2, mm3 ; preserved for next iteration lea TMP0, [TMP0+TMP1] pxor mm3, mm6 ; preserved for next iteration pand mm1, mm3 movq mm6, mm0 pxor mm6, mm2 por mm1, mm6 pavgusb mm0, mm2 pand mm1, mm7 psubb mm0, mm1 movq [TMP0], mm0 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_hv_3dn: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX, _EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride movq mm7, [mmx_one] ; loop invariants: mm2=(i+j+1)/2 and mm3= i^j movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgusb mm2, mm3 pxor mm3, mm6 ; mm2/mm3 ready jnz near .rounding1 COPY_HV_3DN_RND0 add TMP0, TMP1 COPY_HV_3DN_RND0 add TMP0, TMP1 COPY_HV_3DN_RND0 add TMP0, TMP1 COPY_HV_3DN_RND0 ret .rounding1: COPY_HV_3DN_RND1 add TMP0, TMP1 COPY_HV_3DN_RND1 add TMP0, TMP1 COPY_HV_3DN_RND1 add TMP0, TMP1 COPY_HV_3DN_RND1 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x4_halfpel_h_3dn(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; 
;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN interpolate8x4_halfpel_h_3dn: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX, _EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride jnz near .rounding1 COPY_H_3DN_RND0 lea TMP0, [TMP0+2*TMP1] COPY_H_3DN_RND0 ret .rounding1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 movq mm7, [mmx_one] COPY_H_3DN_RND1 lea TMP0, [TMP0+2*TMP1] COPY_H_3DN_RND1 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x4_halfpel_v_3dn(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN interpolate8x4_halfpel_v_3dn: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX,_EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride ; we process 2 lines at a time jnz near .rounding1 COPY_V_3DN_RND0 lea TMP0, [TMP0+2*TMP1] COPY_V_3DN_RND0 ret .rounding1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 movq mm7, [mmx_one] movq mm2, [_EAX] ; loop invariant add _EAX, TMP1 COPY_V_3DN_RND1 lea TMP0, [TMP0+2*TMP1] COPY_V_3DN_RND1 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x4_halfpel_hv_3dn(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;----------------------------------------------------------------------------- ; The trick is to correct the result of 'pavgusb' with some combination of the ; lsb's of the 4 input values i,j,k,l, and their intermediate 'pavgusb' (s and t). ; The boolean relations are: ; (i+j+k+l+3)/4 = (s+t+1)/2 - (ij&kl)&st ; (i+j+k+l+2)/4 = (s+t+1)/2 - (ij|kl)&st ; (i+j+k+l+1)/4 = (s+t+1)/2 - (ij&kl)|st ; (i+j+k+l+0)/4 = (s+t+1)/2 - (ij|kl)|st ; with s=(i+j+1)/2, t=(k+l+1)/2, ij = i^j, kl = k^l, st = s^t.
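;
; Worked check of the four relations above (an illustrative example added
; for readers, not part of the original derivation). Only the lsb of ij, kl
; and st is used, and all divisions truncate.
;
; i=0, j=1, k=0, l=0: s=1, t=0, (s+t+1)/2=1, ij=1, kl=0, st=1
;   (0+1+0+0+3)/4 = 1 = 1 - ((ij&kl)&st) = 1 - 0
;   (0+1+0+0+2)/4 = 0 = 1 - ((ij|kl)&st) = 1 - 1
;   (0+1+0+0+1)/4 = 0 = 1 - ((ij&kl)|st) = 1 - 1
;   (0+1+0+0+0)/4 = 0 = 1 - ((ij|kl)|st) = 1 - 1
;
; i=0, j=1, k=0, l=1: s=1, t=1, (s+t+1)/2=1, ij=1, kl=1, st=0
;   (0+1+0+1+3)/4 = 1 = 1 - ((ij&kl)&st) = 1 - 0
;   (0+1+0+1+2)/4 = 1 = 1 - ((ij|kl)&st) = 1 - 0
;   (0+1+0+1+1)/4 = 0 = 1 - ((ij&kl)|st) = 1 - 1
;   (0+1+0+1+0)/4 = 0 = 1 - ((ij|kl)|st) = 1 - 1
;
; As read from the macros, COPY_HV_3DN_RND0 (por, then pand) applies the
; '+2' relation and COPY_HV_3DN_RND1 (pand, then por) applies the '+1' one.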
ALIGN SECTION_ALIGN interpolate8x4_halfpel_hv_3dn: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX, _EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride movq mm7, [mmx_one] ; loop invariants: mm2=(i+j+1)/2 and mm3= i^j movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgusb mm2, mm3 pxor mm3, mm6 ; mm2/mm3 ready jnz near .rounding1 COPY_HV_3DN_RND0 add TMP0, TMP1 COPY_HV_3DN_RND0 ret .rounding1: COPY_HV_3DN_RND1 add TMP0, TMP1 COPY_HV_3DN_RND1 ret ENDFUNC NON_EXEC_STACK xvidcore/src/image/x86_asm/colorspace_yuyv_mmx.asm0000664000076500007650000002523611254216113023436 0ustar xvidbuildxvidbuild;/**************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - MMX and XMM YUYV<->YV12 conversion - ; * ; * Copyright(C) 2002 Peter Ross ; * ; * This program is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: colorspace_yuyv_mmx.asm,v 1.12 2009-09-16 17:07:58 Isibaar Exp $ ; * ; ***************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ;----------------------------------------------------------------------------- ; yuyv/uyvy mask for extracting yuv components ;----------------------------------------------------------------------------- ; y u y v y u y v ALIGN SECTION_ALIGN yuyv_mask: db 0xff, 0, 0xff, 0, 0xff, 0, 0xff, 0 mmx_one: dw 1, 1, 1, 1 ;============================================================================= ; helper macros used with colorspace_mmx.inc ;============================================================================= ;----------------------------------------------------------------------------- ; YUYV_TO_YV12( TYPE, PAVG ) ; ; TYPE 0=yuyv, 1=uyvy ; PAVG 0=mmx, pavgusb=3dnow, pavgb=xmm ; ; bytes=2, pixels = 8, vpixels=2 ;----------------------------------------------------------------------------- %macro YUYV_TO_YV12_INIT 2 movq mm7, [yuyv_mask] %endmacro %macro YUYV_TO_YV12 2 movq mm0, [x_ptr] ; x_ptr[0] movq mm1, [x_ptr + 8] ; x_ptr[8] movq mm2, [x_ptr + x_stride] ; x_ptr[x_stride + 0] movq mm3, [x_ptr + x_stride + 8] ; x_ptr[x_stride + 8] ; average uv-components ;---[ plain mmx ]---------------------------------------------------- %ifidn %2,0 ; if (%2 eq "0") movq mm4, mm0 movq mm5, mm2 %if %1 == 0 ; yuyv psrlw mm4, 8 psrlw mm5, 8 %endif pand mm4, mm7 pand mm5, mm7 paddw mm4, mm5 movq mm5, mm1 movq mm6, mm3 %if %1 == 0 ; yuyv psrlw mm5, 8 psrlw mm6, 8 %endif pand mm5, mm7 pand mm6, mm7 paddw mm5, mm6 
paddw mm4, [mmx_one] ; +1 rounding paddw mm5, [mmx_one] ; psrlw mm4, 1 psrlw mm5, 1 ;---[ 3dnow/xmm ]---------------------------------------------------- %else movq mm4, mm0 movq mm5, mm1 %2 mm4, mm2 ;pavgb/pavgusb mm4, mm2 %2 mm5, mm3 ;pavgb/pavgusb mm5, mm3 ;;movq mm6, mm0 ; 0 rounding ;;pxor mm6, mm2 ; ;;psubb mm4, mm6 ; ;;movq mm6, mm1 ; ;;pxor mm6, mm3 ; ;;psubb mm5, mm5 ; %if %1 == 0 ; yuyv psrlw mm4, 8 psrlw mm5, 8 %endif pand mm4, mm7 pand mm5, mm7 %endif ;-------------------------------------------------------------------- ; write y-component %if %1 == 1 ; uyvy psrlw mm0, 8 psrlw mm1, 8 psrlw mm2, 8 psrlw mm3, 8 %endif pand mm0, mm7 pand mm1, mm7 pand mm2, mm7 pand mm3, mm7 packuswb mm0, mm1 packuswb mm2, mm3 %ifidn %2,pavgb ; xmm movntq [y_ptr], mm0 movntq [y_ptr+y_stride], mm2 %else ; plain mmx,3dnow movq [y_ptr], mm0 movq [y_ptr+y_stride], mm2 %endif ; write uv-components packuswb mm4, mm5 movq mm5, mm4 psrlq mm4, 8 pand mm5, mm7 pand mm4, mm7 packuswb mm5,mm5 packuswb mm4,mm4 movd [u_ptr],mm5 movd [v_ptr],mm4 %endmacro ;----------------------------------------------------------------------------- ; YV12_TO_YUYV( TYPE ) ; ; bytes=2, pixels = 16, vpixels=2 ;----------------------------------------------------------------------------- %macro YV12_TO_YUYV_INIT 2 %endmacro %macro YV12_TO_YUYV 2 movq mm6, [u_ptr] ; [ |uuuu] movq mm2, [v_ptr] ; [ |vvvv] movq mm0, [y_ptr ] ; [yyyy|yyyy] ; y row 0 movq mm1, [y_ptr+y_stride] ; [yyyy|yyyy] ; y row 1 movq mm7, mm6 punpcklbw mm6, mm2 ; [vuvu|vuvu] ; uv[0..3] punpckhbw mm7, mm2 ; [vuvu|vuvu] ; uv[4..7] %if %1 == 0 ; YUYV movq mm2, mm0 movq mm3, mm1 movq mm4, [y_ptr +8] ; [yyyy|yyyy] ; y[8..15] row 0 movq mm5, [y_ptr+y_stride+8] ; [yyyy|yyyy] ; y[8..15] row 1 punpcklbw mm0, mm6 ; [vyuy|vyuy] ; y row 0 + 0 punpckhbw mm2, mm6 ; [vyuy|vyuy] ; y row 0 + 8 punpcklbw mm1, mm6 ; [vyuy|vyuy] ; y row 1 + 0 punpckhbw mm3, mm6 ; [vyuy|vyuy] ; y row 1 + 8 movq [x_ptr ], mm0 movq [x_ptr+8 ], mm2 movq [x_ptr+x_stride ], mm1 movq 
[x_ptr+x_stride+8], mm3 movq mm0, mm4 movq mm2, mm5 punpcklbw mm0, mm7 ; [vyuy|vyuy] ; y row 0 + 16 punpckhbw mm4, mm7 ; [vyuy|vyuy] ; y row 0 + 24 punpcklbw mm2, mm7 ; [vyuy|vyuy] ; y row 1 + 16 punpckhbw mm5, mm7 ; [vyuy|vyuy] ; y row 1 + 24 movq [x_ptr +16], mm0 movq [x_ptr +24], mm4 movq [x_ptr+x_stride+16], mm2 movq [x_ptr+x_stride+24], mm5 %else ; UYVY movq mm2, mm6 movq mm3, mm6 movq mm4, mm6 punpcklbw mm2, mm0 ; [yvyu|yvyu] ; y row 0 + 0 punpckhbw mm3, mm0 ; [yvyu|yvyu] ; y row 0 + 8 movq mm0, [y_ptr +8] ; [yyyy|yyyy] ; y[8..15] row 0 movq mm5, [y_ptr+y_stride+8] ; [yyyy|yyyy] ; y[8..15] row 1 punpcklbw mm4, mm1 ; [yvyu|yvyu] ; y row 1 + 0 punpckhbw mm6, mm1 ; [yvyu|yvyu] ; y row 1 + 8 movq [x_ptr ], mm2 movq [x_ptr +8], mm3 movq [x_ptr+x_stride ], mm4 movq [x_ptr+x_stride+8], mm6 movq mm2, mm7 movq mm3, mm7 movq mm6, mm7 punpcklbw mm2, mm0 ; [yvyu|yvyu] ; y row 0 + 0 punpckhbw mm3, mm0 ; [yvyu|yvyu] ; y row 0 + 8 punpcklbw mm6, mm5 ; [yvyu|yvyu] ; y row 1 + 0 punpckhbw mm7, mm5 ; [yvyu|yvyu] ; y row 1 + 8 movq [x_ptr +16], mm2 movq [x_ptr +24], mm3 movq [x_ptr+x_stride+16], mm6 movq [x_ptr+x_stride+24], mm7 %endif %endmacro ;------------------------------------------------------------------------------ ; YV12_TO_YUYVI( TYPE ) ; ; TYPE 0=yuyv, 1=uyvy ; ; bytes=2, pixels = 8, vpixels=4 ;------------------------------------------------------------------------------ %macro YV12_TO_YUYVI_INIT 2 %endmacro %macro YV12_TO_YUYVI 2 %ifdef ARCH_IS_X86_64 mov TMP1d, prm_uv_stride movd mm0, [u_ptr] ; [ |uuuu] movd mm1, [u_ptr+TMP1] ; [ |uuuu] punpcklbw mm0, [v_ptr] ; [vuvu|vuvu] ; uv row 0 punpcklbw mm1, [v_ptr+TMP1] ; [vuvu|vuvu] ; uv row 1 %else xchg width, prm_uv_stride movd mm0, [u_ptr] ; [ |uuuu] movd mm1, [u_ptr+width] ; [ |uuuu] punpcklbw mm0, [v_ptr] ; [vuvu|vuvu] ; uv row 0 punpcklbw mm1, [v_ptr+width] ; [vuvu|vuvu] ; uv row 1 xchg width, prm_uv_stride %endif %if %1 == 0 ; YUYV movq mm4, [y_ptr] ; [yyyy|yyyy] ; y row 0 movq mm6, [y_ptr+y_stride] ; [yyyy|yyyy] 
; y row 1 movq mm5, mm4 movq mm7, mm6 punpcklbw mm4, mm0 ; [yuyv|yuyv] ; y row 0 + 0 punpckhbw mm5, mm0 ; [yuyv|yuyv] ; y row 0 + 8 punpcklbw mm6, mm1 ; [yuyv|yuyv] ; y row 1 + 0 punpckhbw mm7, mm1 ; [yuyv|yuyv] ; y row 1 + 8 movq [x_ptr], mm4 movq [x_ptr+8], mm5 movq [x_ptr+x_stride], mm6 movq [x_ptr+x_stride+8], mm7 push y_ptr push x_ptr add y_ptr, y_stride add x_ptr, x_stride movq mm4, [y_ptr+y_stride] ; [yyyy|yyyy] ; y row 2 movq mm6, [y_ptr+2*y_stride] ; [yyyy|yyyy] ; y row 3 movq mm5, mm4 movq mm7, mm6 punpcklbw mm4, mm0 ; [yuyv|yuyv] ; y row 2 + 0 punpckhbw mm5, mm0 ; [yuyv|yuyv] ; y row 2 + 8 punpcklbw mm6, mm1 ; [yuyv|yuyv] ; y row 3 + 0 punpckhbw mm7, mm1 ; [yuyv|yuyv] ; y row 3 + 8 movq [x_ptr+x_stride], mm4 movq [x_ptr+x_stride+8], mm5 movq [x_ptr+2*x_stride], mm6 movq [x_ptr+2*x_stride+8], mm7 pop x_ptr pop y_ptr %else ; UYVY movq mm2, [y_ptr] ; [yyyy|yyyy] ; y row 0 movq mm3, [y_ptr+y_stride] ; [yyyy|yyyy] ; y row 1 movq mm4, mm0 movq mm5, mm0 movq mm6, mm1 movq mm7, mm1 punpcklbw mm4, mm2 ; [uyvy|uyvy] ; y row 0 + 0 punpckhbw mm5, mm2 ; [uyvy|uyvy] ; y row 0 + 8 punpcklbw mm6, mm3 ; [uyvy|uyvy] ; y row 1 + 0 punpckhbw mm7, mm3 ; [uyvy|uyvy] ; y row 1 + 8 movq [x_ptr], mm4 movq [x_ptr+8], mm5 movq [x_ptr+x_stride], mm6 movq [x_ptr+x_stride+8], mm7 push y_ptr push x_ptr add y_ptr, y_stride add x_ptr, x_stride movq mm2, [y_ptr+y_stride] ; [yyyy|yyyy] ; y row 2 movq mm3, [y_ptr+2*y_stride] ; [yyyy|yyyy] ; y row 3 movq mm4, mm0 movq mm5, mm0 movq mm6, mm1 movq mm7, mm1 punpcklbw mm4, mm2 ; [uyvy|uyvy] ; y row 2 + 0 punpckhbw mm5, mm2 ; [uyvy|uyvy] ; y row 2 + 8 punpcklbw mm6, mm3 ; [uyvy|uyvy] ; y row 3 + 0 punpckhbw mm7, mm3 ; [uyvy|uyvy] ; y row 3 + 8 movq [x_ptr+x_stride], mm4 movq [x_ptr+x_stride+8], mm5 movq [x_ptr+2*x_stride], mm6 movq [x_ptr+2*x_stride+8], mm7 pop x_ptr pop y_ptr %endif %endmacro ;============================================================================= ; Code 
;============================================================================= TEXT %include "colorspace_mmx.inc" ; input MAKE_COLORSPACE yuyv_to_yv12_mmx,0, 2,8,2, YUYV_TO_YV12, 0, 0 MAKE_COLORSPACE yuyv_to_yv12_3dn,0, 2,8,2, YUYV_TO_YV12, 0, pavgusb MAKE_COLORSPACE yuyv_to_yv12_xmm,0, 2,8,2, YUYV_TO_YV12, 0, pavgb MAKE_COLORSPACE uyvy_to_yv12_mmx,0, 2,8,2, YUYV_TO_YV12, 1, 0 MAKE_COLORSPACE uyvy_to_yv12_3dn,0, 2,8,2, YUYV_TO_YV12, 1, pavgusb MAKE_COLORSPACE uyvy_to_yv12_xmm,0, 2,8,2, YUYV_TO_YV12, 1, pavgb ; output MAKE_COLORSPACE yv12_to_yuyv_mmx,0, 2,16,2, YV12_TO_YUYV, 0, -1 MAKE_COLORSPACE yv12_to_uyvy_mmx,0, 2,16,2, YV12_TO_YUYV, 1, -1 MAKE_COLORSPACE yv12_to_yuyvi_mmx,0, 2,8,4, YV12_TO_YUYVI, 0, -1 MAKE_COLORSPACE yv12_to_uyvyi_mmx,0, 2,8,4, YV12_TO_YUYVI, 1, -1 NON_EXEC_STACK xvidcore/src/image/x86_asm/deintl_sse.asm0000664000076500007650000000577411345416076021500 0ustar xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - simple de-interlacer ; * Copyright(C) 2006 Pascal Massimino ; * ; * This file is part of Xvid, a free MPEG-4 video encoder/decoder ; * ; * Xvid is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; * ; * You should have received a copy of the GNU General Public License ; * along with this program; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; * $Id: deintl_sse.asm,v 1.7 2010-03-09 10:00:14 Isibaar Exp $ ; * ; *************************************************************************/ ;/************************************************************************** ; * ; * History: ; * ; * Oct 13 2006: initial version ; * ; *************************************************************************/ %include "nasm.inc" ;////////////////////////////////////////////////////////////////////// cglobal xvid_deinterlace_sse ;////////////////////////////////////////////////////////////////////// DATA align SECTION_ALIGN Mask_6b times 16 db 0x3f Rnd_3b: times 16 db 3 TEXT ;////////////////////////////////////////////////////////////////////// ;// sse version align SECTION_ALIGN xvid_deinterlace_sse: mov _EAX, prm1 ; Pix mov TMP0, prm3 ; Height mov TMP1, prm4 ; BpS push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm2 ; Width %else mov _EBX, [esp+4+ 8] ; Width %endif add _EBX, 7 shr TMP0, 1 shr _EBX, 3 ; Width /= 8 dec TMP0 movq mm6, [Mask_6b] .Loop_x: push _EAX movq mm1, [_EAX ] movq mm2, [_EAX+ TMP1] lea _EAX, [_EAX+ TMP1] movq mm0, mm2 push TMP0 .Loop: movq mm3, [_EAX+ TMP1] movq mm4, [_EAX+2*TMP1] movq mm5, mm2 pavgb mm0, mm4 pavgb mm1, mm3 movq mm7, mm2 psubusb mm2, mm0 psubusb mm0, mm7 paddusb mm0, [Rnd_3b] psrlw mm2, 2 psrlw mm0, 2 pand mm2, mm6 pand mm0, mm6 paddusb mm1, mm2 psubusb mm1, mm0 movq [_EAX], mm1 lea _EAX, [_EAX+2*TMP1] movq mm0, mm5 movq mm1, mm3 movq mm2, mm4 dec TMP0 jg .Loop pavgb mm0, mm2 ; p0 += p2 pavgb mm1, mm1 ; p1 += p1 movq mm7, mm2 psubusb mm2, mm0 psubusb mm0, mm7 paddusb mm0, [Rnd_3b] psrlw mm2, 2 psrlw mm0, 2 pand mm2, mm6 pand mm0, mm6 paddusb mm1, mm2 psubusb mm1, mm0 movq [_EAX], mm1 pop TMP0 pop _EAX add _EAX, 8 dec _EBX jg .Loop_x pop _EBX ret ENDFUNC 
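;
; Per-pixel summary of xvid_deinterlace_sse, as read from the code above
; (an explanatory sketch added for readers, not a specification).
; 'avg' stands for pavgb, i.e. (a + b + 1) >> 1, and the final add and
; subtract saturate to 0..255. Only every second line, starting with the
; second one, is rewritten, and the picture is filtered in place:
;
;   a      = avg(p[y-2], p[y+2])
;   out[y] = avg(p[y-1], p[y+1]) + (max(p[y] - a, 0) >> 2)
;                                - ((max(a - p[y], 0) + 3) >> 2)
;
; On the first processed line p[y-2] is replaced by p[y]. On the last one
; p[y+2] is replaced by p[y] and p[y+1] by p[y-1].
; The 'psrlw ..., 2' / 'pand ..., mm6' pair emulates a per-byte shift:
; Mask_6b (0x3f) clears the two bits shifted in from the neighbouring byte.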
;////////////////////////////////////////////////////////////////////// NON_EXEC_STACK xvidcore/src/image/x86_asm/interpolate8x8_mmx.asm0000664000076500007650000006523211254216113023106 0ustar xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - mmx 8x8 block-based halfpel interpolation - ; * ; * Copyright(C) 2001 Peter Ross ; * 2002-2008 Michael Militzer ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. ; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; ****************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ;----------------------------------------------------------------------------- ; (16 - r) rounding table ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN rounding_lowpass_mmx: times 4 dw 16 times 4 dw 15 ;----------------------------------------------------------------------------- ; (1 - r) rounding table ;----------------------------------------------------------------------------- rounding1_mmx: times 4 dw 1 times 4 dw 0 
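;
; Note on how the (N - r) rounding tables are used, as read from the code
; in this file (illustrative, assuming 'rounding' is 0 or 1): each table
; holds two rows of four words, i.e. 8 bytes per row. A function loads its
; row with 'movq mm7, [TMP0 + _EAX * 8]', where _EAX holds the 'rounding'
; argument, so rounding=0 selects the first row and rounding=1 the second.
; Example with rounding1_mmx and the pixel pair a=3, b=4:
;   rounding=0: (3 + 4 + 1) >> 1 = 4
;   rounding=1: (3 + 4 + 0) >> 1 = 3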
;----------------------------------------------------------------------------- ; (2 - r) rounding table ;----------------------------------------------------------------------------- rounding2_mmx: times 4 dw 2 times 4 dw 1 mmx_one: times 8 db 1 mmx_two: times 8 db 2 mmx_three: times 8 db 3 mmx_five: times 4 dw 5 mmx_mask: times 8 db 254 mmx_mask2: times 8 db 252 ;============================================================================= ; Code ;============================================================================= TEXT cglobal interpolate8x8_halfpel_h_mmx cglobal interpolate8x8_halfpel_v_mmx cglobal interpolate8x8_halfpel_hv_mmx cglobal interpolate8x4_halfpel_h_mmx cglobal interpolate8x4_halfpel_v_mmx cglobal interpolate8x4_halfpel_hv_mmx cglobal interpolate8x8_avg4_mmx cglobal interpolate8x8_avg2_mmx cglobal interpolate8x8_6tap_lowpass_h_mmx cglobal interpolate8x8_6tap_lowpass_v_mmx cglobal interpolate8x8_halfpel_add_mmx cglobal interpolate8x8_halfpel_h_add_mmx cglobal interpolate8x8_halfpel_v_add_mmx cglobal interpolate8x8_halfpel_hv_add_mmx %macro CALC_AVG 6 punpcklbw %3, %6 punpckhbw %4, %6 paddusw %1, %3 ; mm01 += mm23 paddusw %2, %4 paddusw %1, %5 ; mm01 += rounding paddusw %2, %5 psrlw %1, 1 ; mm01 >>= 1 psrlw %2, 1 %endmacro ;----------------------------------------------------------------------------- ; ; void interpolate8x8_halfpel_h_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- %macro COPY_H_MMX 0 movq mm0, [TMP0] movq mm2, [TMP0 + 1] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm6 ; mm01 = [src] punpckhbw mm1, mm6 ; mm23 = [src + 1] CALC_AVG mm0, mm1, mm2, mm3, mm7, mm6 packuswb mm0, mm1 movq [_EAX], mm0 ; [dst] = mm01 add TMP0, TMP1 ; src += stride add _EAX, TMP1 ; dst += stride %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_h_mmx: mov _EAX, prm4 ; rounding lea TMP0, [rounding1_mmx] movq mm7, 
[TMP0 + _EAX * 8] mov _EAX, prm1 ; dst mov TMP0, prm2 ; src mov TMP1, prm3 ; stride pxor mm6, mm6 ; zero COPY_H_MMX COPY_H_MMX COPY_H_MMX COPY_H_MMX COPY_H_MMX COPY_H_MMX COPY_H_MMX COPY_H_MMX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x8_halfpel_v_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- %macro COPY_V_MMX 0 movq mm0, [TMP0] movq mm2, [TMP0 + TMP1] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm6 ; mm01 = [src] punpckhbw mm1, mm6 ; mm23 = [src + stride] CALC_AVG mm0, mm1, mm2, mm3, mm7, mm6 packuswb mm0, mm1 movq [_EAX], mm0 ; [dst] = mm01 add TMP0, TMP1 ; src += stride add _EAX, TMP1 ; dst += stride %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_v_mmx: mov _EAX, prm4 ; rounding lea TMP0, [rounding1_mmx] movq mm7, [TMP0 + _EAX * 8] mov _EAX, prm1 ; dst mov TMP0, prm2 ; src mov TMP1, prm3 ; stride pxor mm6, mm6 ; zero COPY_V_MMX COPY_V_MMX COPY_V_MMX COPY_V_MMX COPY_V_MMX COPY_V_MMX COPY_V_MMX COPY_V_MMX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x8_halfpel_hv_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;----------------------------------------------------------------------------- %macro COPY_HV_MMX 0 ; current row movq mm0, [TMP0] movq mm2, [TMP0 + 1] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm6 ; mm01 = [src] punpcklbw mm2, mm6 ; mm23 = [src + 1] punpckhbw mm1, mm6 punpckhbw mm3, mm6 paddusw mm0, mm2 ; mm01 += mm23 paddusw mm1, mm3 ; next row movq mm4, [TMP0 + TMP1] movq mm2, [TMP0 + TMP1 + 1] movq mm5, mm4 movq mm3, mm2 punpcklbw mm4, mm6 ; mm45 = [src + stride] punpcklbw mm2, mm6 ; mm23 = [src + stride + 1] punpckhbw mm5, mm6 punpckhbw mm3, mm6 paddusw mm4, mm2 ; mm45 += mm23 paddusw mm5, mm3 ; add current + next
row paddusw mm0, mm4 ; mm01 += mm45 paddusw mm1, mm5 paddusw mm0, mm7 ; mm01 += rounding2 paddusw mm1, mm7 psrlw mm0, 2 ; mm01 >>= 2 psrlw mm1, 2 packuswb mm0, mm1 movq [_EAX], mm0 ; [dst] = mm01 add TMP0, TMP1 ; src += stride add _EAX, TMP1 ; dst += stride %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_hv_mmx: mov _EAX, prm4 ; rounding lea TMP0, [rounding2_mmx] movq mm7, [TMP0 + _EAX * 8] mov _EAX, prm1 ; dst mov TMP0, prm2 ; src pxor mm6, mm6 ; zero mov TMP1, prm3 ; stride COPY_HV_MMX COPY_HV_MMX COPY_HV_MMX COPY_HV_MMX COPY_HV_MMX COPY_HV_MMX COPY_HV_MMX COPY_HV_MMX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x4_halfpel_h_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN interpolate8x4_halfpel_h_mmx: mov _EAX, prm4 ; rounding lea TMP0, [rounding1_mmx] movq mm7, [TMP0 + _EAX * 8] mov _EAX, prm1 ; dst mov TMP0, prm2 ; src mov TMP1, prm3 ; stride pxor mm6, mm6 ; zero COPY_H_MMX COPY_H_MMX COPY_H_MMX COPY_H_MMX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x4_halfpel_v_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN interpolate8x4_halfpel_v_mmx: mov _EAX, prm4 ; rounding lea TMP0, [rounding1_mmx] movq mm7, [TMP0 + _EAX * 8] mov _EAX, prm1 ; dst mov TMP0, prm2 ; src mov TMP1, prm3 ; stride pxor mm6, mm6 ; zero COPY_V_MMX COPY_V_MMX COPY_V_MMX COPY_V_MMX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x4_halfpel_hv_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; 
;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN interpolate8x4_halfpel_hv_mmx: mov _EAX, prm4 ; rounding lea TMP0, [rounding2_mmx] movq mm7, [TMP0 + _EAX * 8] mov _EAX, prm1 ; dst mov TMP0, prm2 ; src pxor mm6, mm6 ; zero mov TMP1, prm3 ; stride COPY_HV_MMX COPY_HV_MMX COPY_HV_MMX COPY_HV_MMX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x8_avg2_mmx(uint8_t const *dst, ; const uint8_t * const src1, ; const uint8_t * const src2, ; const uint32_t stride, ; const uint32_t rounding, ; const uint32_t height); ; ;----------------------------------------------------------------------------- %macro AVG2_MMX_RND0 0 movq mm0, [_EAX] ; src1 -> mm0 movq mm1, [_EBX] ; src2 -> mm1 movq mm4, [_EAX+TMP1] movq mm5, [_EBX+TMP1] movq mm2, mm0 ; src1 -> mm2 movq mm3, mm1 ; src2 -> mm3 pand mm2, mm7 ; isolate the lsb pand mm3, mm7 ; isolate the lsb por mm2, mm3 ; ODD(src1) OR ODD(src2) -> mm2 movq mm3, mm4 movq mm6, mm5 pand mm3, mm7 pand mm6, mm7 por mm3, mm6 pand mm0, [mmx_mask] pand mm1, [mmx_mask] pand mm4, [mmx_mask] pand mm5, [mmx_mask] psrlq mm0, 1 ; src1 / 2 psrlq mm1, 1 ; src2 / 2 psrlq mm4, 1 psrlq mm5, 1 paddb mm0, mm1 ; src1/2 + src2/2 -> mm0 paddb mm0, mm2 ; correct rounding error paddb mm4, mm5 paddb mm4, mm3 lea _EAX, [_EAX+2*TMP1] lea _EBX, [_EBX+2*TMP1] movq [TMP0], mm0 ; (src1 + src2 + 1) / 2 -> dst movq [TMP0+TMP1], mm4 %endmacro %macro AVG2_MMX_RND1 0 movq mm0, [_EAX] ; src1 -> mm0 movq mm1, [_EBX] ; src2 -> mm1 movq mm4, [_EAX+TMP1] movq mm5, [_EBX+TMP1] movq mm2, mm0 ; src1 -> mm2 movq mm3, mm1 ; src2 -> mm3 pand mm2, mm7 ; isolate the lsb pand mm3, mm7 ; isolate the lsb pand mm2, mm3 ; ODD(src1) AND ODD(src2) -> mm2 movq mm3, mm4 movq mm6, mm5 pand mm3, mm7 pand mm6, mm7 pand mm3, mm6 pand mm0, [mmx_mask] pand mm1, [mmx_mask] pand mm4, [mmx_mask] pand mm5, [mmx_mask] psrlq mm0, 1 ; src1 / 2 psrlq mm1, 1 ; src2 / 2 psrlq mm4, 1 psrlq mm5, 1 paddb 
mm0, mm1 ; src1/2 + src2/2 -> mm0 paddb mm0, mm2 ; correct rounding error paddb mm4, mm5 paddb mm4, mm3 lea _EAX, [_EAX+2*TMP1] lea _EBX, [_EBX+2*TMP1] movq [TMP0], mm0 ; (src1 + src2) / 2 -> dst movq [TMP0+TMP1], mm4 %endmacro ALIGN SECTION_ALIGN interpolate8x8_avg2_mmx: mov eax, prm5d ; rounding test _EAX, _EAX jnz near .rounding1 mov eax, prm6d ; height -> _EAX sub _EAX, 8 mov TMP0, prm1 ; dst -> edi mov _EAX, prm2 ; src1 -> esi mov TMP1, prm4 ; stride -> TMP1 push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [esp + 4 + 12] ; src2 -> eax %endif movq mm7, [mmx_one] jz near .start0 AVG2_MMX_RND0 lea TMP0, [TMP0+2*TMP1] .start0: AVG2_MMX_RND0 lea TMP0, [TMP0+2*TMP1] AVG2_MMX_RND0 lea TMP0, [TMP0+2*TMP1] AVG2_MMX_RND0 lea TMP0, [TMP0+2*TMP1] AVG2_MMX_RND0 pop _EBX ret .rounding1: mov eax, prm6d ; height -> _EAX sub _EAX, 8 mov TMP0, prm1 ; dst -> edi mov _EAX, prm2 ; src1 -> esi mov TMP1, prm4 ; stride -> TMP1 push _EBX %ifdef ARCH_IS_X86_64 mov _EBX, prm3 %else mov _EBX, [esp + 4 + 12] ; src2 -> eax %endif movq mm7, [mmx_one] jz near .start1 AVG2_MMX_RND1 lea TMP0, [TMP0+2*TMP1] .start1: AVG2_MMX_RND1 lea TMP0, [TMP0+2*TMP1] AVG2_MMX_RND1 lea TMP0, [TMP0+2*TMP1] AVG2_MMX_RND1 lea TMP0, [TMP0+2*TMP1] AVG2_MMX_RND1 pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x8_avg4_mmx(uint8_t const *dst, ; const uint8_t * const src1, ; const uint8_t * const src2, ; const uint8_t * const src3, ; const uint8_t * const src4, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- %macro AVG4_MMX_RND0 0 movq mm0, [_EAX] ; src1 -> mm0 movq mm1, [_EBX] ; src2 -> mm1 movq mm2, mm0 movq mm3, mm1 pand mm2, [mmx_three] pand mm3, [mmx_three] pand mm0, [mmx_mask2] pand mm1, [mmx_mask2] psrlq mm0, 2 psrlq mm1, 2 lea _EAX, [_EAX+TMP1] lea _EBX, [_EBX+TMP1] paddb mm0, mm1 paddb mm2, mm3 movq mm4, [_ESI] ; src3 -> mm4 movq mm5,
[_EDI] ; src4 -> mm5 movq mm1, mm4 movq mm3, mm5 pand mm1, [mmx_three] pand mm3, [mmx_three] pand mm4, [mmx_mask2] pand mm5, [mmx_mask2] psrlq mm4, 2 psrlq mm5, 2 paddb mm4, mm5 paddb mm0, mm4 paddb mm1, mm3 paddb mm2, mm1 paddb mm2, [mmx_two] pand mm2, [mmx_mask2] psrlq mm2, 2 paddb mm0, mm2 lea _ESI, [_ESI+TMP1] lea _EDI, [_EDI+TMP1] movq [TMP0], mm0 ; (src1 + src2 + src3 + src4 + 2) / 4 -> dst %endmacro %macro AVG4_MMX_RND1 0 movq mm0, [_EAX] ; src1 -> mm0 movq mm1, [_EBX] ; src2 -> mm1 movq mm2, mm0 movq mm3, mm1 pand mm2, [mmx_three] pand mm3, [mmx_three] pand mm0, [mmx_mask2] pand mm1, [mmx_mask2] psrlq mm0, 2 psrlq mm1, 2 lea _EAX,[_EAX+TMP1] lea _EBX,[_EBX+TMP1] paddb mm0, mm1 paddb mm2, mm3 movq mm4, [_ESI] ; src3 -> mm4 movq mm5, [_EDI] ; src4 -> mm5 movq mm1, mm4 movq mm3, mm5 pand mm1, [mmx_three] pand mm3, [mmx_three] pand mm4, [mmx_mask2] pand mm5, [mmx_mask2] psrlq mm4, 2 psrlq mm5, 2 paddb mm4, mm5 paddb mm0, mm4 paddb mm1, mm3 paddb mm2, mm1 paddb mm2, [mmx_one] pand mm2, [mmx_mask2] psrlq mm2, 2 paddb mm0, mm2 lea _ESI,[_ESI+TMP1] lea _EDI,[_EDI+TMP1] movq [TMP0], mm0 ; (src1 + src2 + src3 + src4 + 1) / 4 -> dst %endmacro ALIGN SECTION_ALIGN interpolate8x8_avg4_mmx: mov eax, prm7d ; rounding test _EAX, _EAX mov TMP0, prm1 ; dst -> edi mov _EAX, prm5 ; src4 -> edi mov TMP1d, prm6d ; stride -> TMP1 push _EBX push _EDI push _ESI mov _EDI, _EAX %ifdef ARCH_IS_X86_64 mov _EAX, prm2 mov _EBX, prm3 mov _ESI, prm4 %else mov _EAX, [esp + 12 + 8] ; src1 -> esi mov _EBX, [esp + 12 + 12] ; src2 -> _EAX mov _ESI, [esp + 12 + 16] ; src3 -> esi %endif movq mm7, [mmx_one] jnz near .rounding1 AVG4_MMX_RND0 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND0 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND0 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND0 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND0 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND0 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND0 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND0 pop _ESI pop _EDI pop _EBX ret .rounding1: AVG4_MMX_RND1 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND1 lea TMP0,
[TMP0+TMP1] AVG4_MMX_RND1 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND1 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND1 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND1 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND1 lea TMP0, [TMP0+TMP1] AVG4_MMX_RND1 pop _ESI pop _EDI pop _EBX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x8_6tap_lowpass_h_mmx(uint8_t const *dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- %macro LOWPASS_6TAP_H_MMX 0 movq mm0, [_EAX] movq mm2, [_EAX+1] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm7 punpcklbw mm2, mm7 punpckhbw mm1, mm7 punpckhbw mm3, mm7 paddw mm0, mm2 paddw mm1, mm3 psllw mm0, 2 psllw mm1, 2 movq mm2, [_EAX-1] movq mm4, [_EAX+2] movq mm3, mm2 movq mm5, mm4 punpcklbw mm2, mm7 punpcklbw mm4, mm7 punpckhbw mm3, mm7 punpckhbw mm5, mm7 paddw mm2, mm4 paddw mm3, mm5 psubsw mm0, mm2 psubsw mm1, mm3 pmullw mm0, [mmx_five] pmullw mm1, [mmx_five] movq mm2, [_EAX-2] movq mm4, [_EAX+3] movq mm3, mm2 movq mm5, mm4 punpcklbw mm2, mm7 punpcklbw mm4, mm7 punpckhbw mm3, mm7 punpckhbw mm5, mm7 paddw mm2, mm4 paddw mm3, mm5 paddsw mm0, mm2 paddsw mm1, mm3 paddsw mm0, mm6 paddsw mm1, mm6 psraw mm0, 5 psraw mm1, 5 lea _EAX, [_EAX+TMP1] packuswb mm0, mm1 movq [TMP0], mm0 %endmacro ALIGN SECTION_ALIGN interpolate8x8_6tap_lowpass_h_mmx: mov _EAX, prm4 ; rounding lea TMP0, [rounding_lowpass_mmx] movq mm6, [TMP0 + _EAX * 8] mov TMP0, prm1 ; dst -> edi mov _EAX, prm2 ; src -> esi mov TMP1, prm3 ; stride -> edx pxor mm7, mm7 LOWPASS_6TAP_H_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_H_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_H_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_H_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_H_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_H_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_H_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_H_MMX ret ENDFUNC ;----------------------------------------------------------------------------- ; ; 
void interpolate8x8_6tap_lowpass_v_mmx(uint8_t const *dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- %macro LOWPASS_6TAP_V_MMX 0 movq mm0, [_EAX] movq mm2, [_EAX+TMP1] movq mm1, mm0 movq mm3, mm2 punpcklbw mm0, mm7 punpcklbw mm2, mm7 punpckhbw mm1, mm7 punpckhbw mm3, mm7 paddw mm0, mm2 paddw mm1, mm3 psllw mm0, 2 psllw mm1, 2 movq mm4, [_EAX+2*TMP1] sub _EAX, _EBX movq mm2, [_EAX+2*TMP1] movq mm3, mm2 movq mm5, mm4 punpcklbw mm2, mm7 punpcklbw mm4, mm7 punpckhbw mm3, mm7 punpckhbw mm5, mm7 paddw mm2, mm4 paddw mm3, mm5 psubsw mm0, mm2 psubsw mm1, mm3 pmullw mm0, [mmx_five] pmullw mm1, [mmx_five] movq mm2, [_EAX+TMP1] movq mm4, [_EAX+2*_EBX] movq mm3, mm2 movq mm5, mm4 punpcklbw mm2, mm7 punpcklbw mm4, mm7 punpckhbw mm3, mm7 punpckhbw mm5, mm7 paddw mm2, mm4 paddw mm3, mm5 paddsw mm0, mm2 paddsw mm1, mm3 paddsw mm0, mm6 paddsw mm1, mm6 psraw mm0, 5 psraw mm1, 5 lea _EAX, [_EAX+4*TMP1] packuswb mm0, mm1 movq [TMP0], mm0 %endmacro ALIGN SECTION_ALIGN interpolate8x8_6tap_lowpass_v_mmx: mov _EAX, prm4 ; rounding lea TMP0, [rounding_lowpass_mmx] movq mm6, [TMP0 + _EAX * 8] mov TMP0, prm1 ; dst -> edi mov _EAX, prm2 ; src -> esi mov TMP1, prm3 ; stride -> edx push _EBX lea _EBX, [TMP1+TMP1*2] pxor mm7, mm7 LOWPASS_6TAP_V_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_V_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_V_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_V_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_V_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_V_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_V_MMX lea TMP0, [TMP0+TMP1] LOWPASS_6TAP_V_MMX pop _EBX ret ENDFUNC ;=========================================================================== ; ; The next functions combine both the source halfpel interpolation step and the ; averaging (with rounding) step to avoid wasting memory bandwidth computing ; intermediate halfpel images and then averaging them.
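;
; Per-pixel arithmetic of the combined step, as read from the ADD_FF_MMX and
; MIX_DST macros (an illustrative summary added for readers, not a
; specification):
;   ADD_FF_MMX: dst = (src + dst + 1) >> 1
;   MIX_DST:    dst = (((a + b + rounder) >> 1) + dst + 1) >> 1
; Here a + b is the word sum left in mm0/mm1 by MIX, 'rounder' is mm7, and
; the final '+ 1' is mm5, which PROLOG loads from the first row of
; rounding1_mmx.
; Example for ADD_FF_MMX: src=10, dst=13 gives (10 + 13 + 1) >> 1 = 12.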
; ;=========================================================================== %macro PROLOG0 0 mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Src mov TMP1, prm3 ; BpS %endmacro %macro PROLOG 2 ; %1: Rounder, %2 load Dst-Rounder pxor mm6, mm6 movq mm7, [%1] ; TODO: dangerous! (eax isn't checked) %if %2 movq mm5, [rounding1_mmx] %endif PROLOG0 %endmacro ; performs: mm0 == (mm0+mm2) mm1 == (mm1+mm3) %macro MIX 0 punpcklbw mm0, mm6 punpcklbw mm2, mm6 punpckhbw mm1, mm6 punpckhbw mm3, mm6 paddusw mm0, mm2 paddusw mm1, mm3 %endmacro %macro MIX_DST 0 movq mm3, mm2 paddusw mm0, mm7 ; rounder paddusw mm1, mm7 ; rounder punpcklbw mm2, mm6 punpckhbw mm3, mm6 psrlw mm0, 1 psrlw mm1, 1 paddusw mm0, mm2 ; mix Src(mm0/mm1) with Dst(mm2/mm3) paddusw mm1, mm3 paddusw mm0, mm5 paddusw mm1, mm5 psrlw mm0, 1 psrlw mm1, 1 packuswb mm0, mm1 %endmacro %macro MIX2 0 punpcklbw mm0, mm6 punpcklbw mm2, mm6 paddusw mm0, mm2 paddusw mm0, mm7 punpckhbw mm1, mm6 punpckhbw mm3, mm6 paddusw mm1, mm7 paddusw mm1, mm3 psrlw mm0, 1 psrlw mm1, 1 packuswb mm0, mm1 %endmacro ;=========================================================================== ; ; void interpolate8x8_halfpel_add_mmx(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;=========================================================================== %macro ADD_FF_MMX 1 movq mm0, [_EAX] movq mm2, [TMP0] movq mm1, mm0 movq mm3, mm2 %if (%1!=0) lea _EAX,[_EAX+%1*TMP1] %endif MIX paddusw mm0, mm5 ; rounder paddusw mm1, mm5 ; rounder psrlw mm0, 1 psrlw mm1, 1 packuswb mm0, mm1 movq [TMP0], mm0 %if (%1!=0) lea TMP0,[TMP0+%1*TMP1] %endif %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_add_mmx: PROLOG rounding1_mmx, 1 ADD_FF_MMX 1 ADD_FF_MMX 1 ADD_FF_MMX 1 ADD_FF_MMX 1 ADD_FF_MMX 1 ADD_FF_MMX 1 ADD_FF_MMX 1 ADD_FF_MMX 0 ret ENDFUNC ;=========================================================================== ; ; void interpolate8x8_halfpel_h_add_mmx(uint8_t * const dst, ; const uint8_t * const 
;                                      src,
;                                      const uint32_t stride,
;                                      const uint32_t rounding);
;
;
;===========================================================================

%macro ADD_FH_MMX 0
  movq mm0, [_EAX]
  movq mm2, [_EAX+1]
  movq mm1, mm0
  movq mm3, mm2

  lea _EAX,[_EAX+TMP1]

  MIX

  movq mm2, [TMP0]   ; prepare mix with Dst[0]
  MIX_DST
  movq [TMP0], mm0
%endmacro

ALIGN SECTION_ALIGN
interpolate8x8_halfpel_h_add_mmx:
  PROLOG rounding1_mmx, 1

  ADD_FH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_FH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_FH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_FH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_FH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_FH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_FH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_FH_MMX
  ret
ENDFUNC

;===========================================================================
;
; void interpolate8x8_halfpel_v_add_mmx(uint8_t * const dst,
;                                      const uint8_t * const src,
;                                      const uint32_t stride,
;                                      const uint32_t rounding);
;
;
;===========================================================================

%macro ADD_HF_MMX 0
  movq mm0, [_EAX]
  movq mm2, [_EAX+TMP1]
  movq mm1, mm0
  movq mm3, mm2

  lea _EAX,[_EAX+TMP1]

  MIX

  movq mm2, [TMP0]   ; prepare mix with Dst[0]
  MIX_DST
  movq [TMP0], mm0
%endmacro

ALIGN SECTION_ALIGN
interpolate8x8_halfpel_v_add_mmx:
  PROLOG rounding1_mmx, 1

  ADD_HF_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HF_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HF_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HF_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HF_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HF_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HF_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HF_MMX
  ret
ENDFUNC

; The trick is to correct the result of 'pavgb' with some combination of the
; lsb's of the 4 input values i,j,k,l, and their intermediate 'pavgb' (s and t).
; The boolean relations are:
;   (i+j+k+l+3)/4 = (s+t+1)/2 - (ij&kl)&st
;   (i+j+k+l+2)/4 = (s+t+1)/2 - (ij|kl)&st
;   (i+j+k+l+1)/4 = (s+t+1)/2 - (ij&kl)|st
;   (i+j+k+l+0)/4 = (s+t+1)/2 - (ij|kl)|st
; with  s=(i+j+1)/2, t=(k+l+1)/2, ij = i^j, kl = k^l, st = s^t.

; Moreover, we process 2 lines at a time, for better overlapping (~15% faster).
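; The four boolean relations in the comment above can be checked with a small
; scalar program. What follows is only a sketch in C, not part of the Xvid
; sources: pavgb(), avg4_from_pavgb() and count_avg4_mismatches() are
; hypothetical helper names, and the correction term is assumed to be the
; least significant bit of the boolean expression (the comment speaks of the
; "lsb's").

```c
#include <stdint.h>

/* Scalar model of 'pavgb': byte average, rounded up. */
static uint8_t pavgb(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* (i+j+k+l+rnd)/4 for rnd in 0..3, built from three pavgb results and a
 * one-bit correction, following the relations listed in the comment. */
static uint8_t avg4_from_pavgb(uint8_t i, uint8_t j, uint8_t k, uint8_t l,
                               int rnd)
{
    uint8_t s  = pavgb(i, j);
    uint8_t t  = pavgb(k, l);
    uint8_t ij = (uint8_t)(i ^ j);
    uint8_t kl = (uint8_t)(k ^ l);
    uint8_t st = (uint8_t)(s ^ t);
    uint8_t corr;

    switch (rnd) {
    case 3:  corr = (uint8_t)((ij & kl) & st); break;
    case 2:  corr = (uint8_t)((ij | kl) & st); break;
    case 1:  corr = (uint8_t)((ij & kl) | st); break;
    default: corr = (uint8_t)((ij | kl) | st); break;
    }
    /* Only the lsb of the boolean term is subtracted. */
    return (uint8_t)(pavgb(s, t) - (corr & 1));
}

/* Compares against the direct formula on a sample grid: i and j cover all
 * byte values, k and l are sampled with odd steps so both parities occur. */
static long count_avg4_mismatches(void)
{
    long bad = 0;
    for (int i = 0; i < 256; i++)
        for (int j = 0; j < 256; j++)
            for (int k = 0; k < 256; k += 17)
                for (int l = 0; l < 256; l += 15)
                    for (int rnd = 0; rnd < 4; rnd++) {
                        int want = (i + j + k + l + rnd) >> 2;
                        int got  = avg4_from_pavgb((uint8_t)i, (uint8_t)j,
                                                   (uint8_t)k, (uint8_t)l, rnd);
                        if (got != want)
                            bad++;
                    }
    return bad;
}
```

; count_avg4_mismatches() is expected to return 0 if the relations hold on the
; sampled grid; the grid is a sample, not an exhaustive sweep of all 2^32
; input combinations.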
;===========================================================================
;
; void interpolate8x8_halfpel_hv_add_mmx(uint8_t * const dst,
;                                      const uint8_t * const src,
;                                      const uint32_t stride,
;                                      const uint32_t rounding);
;
;
;===========================================================================

%macro ADD_HH_MMX 0
  lea _EAX,[_EAX+TMP1]

    ; transfer prev line to mm0/mm1
  movq mm0, mm2
  movq mm1, mm3

    ; load new line in mm2/mm3
  movq mm2, [_EAX]
  movq mm4, [_EAX+1]
  movq mm3, mm2
  movq mm5, mm4

  punpcklbw mm2, mm6
  punpcklbw mm4, mm6
  paddusw mm2, mm4
  punpckhbw mm3, mm6
  punpckhbw mm5, mm6
  paddusw mm3, mm5

    ; mix current line (mm2/mm3) with previous (mm0,mm1);
    ; we'll preserve mm2/mm3 for next line...

  paddusw mm0, mm2
  paddusw mm1, mm3

  movq mm4, [TMP0]   ; prepare mix with Dst[0]
  movq mm5, mm4

  paddusw mm0, mm7   ; finish mixing current line
  paddusw mm1, mm7

  punpcklbw mm4, mm6
  punpckhbw mm5, mm6

  psrlw mm0, 2
  psrlw mm1, 2

  paddusw mm0, mm4   ; mix Src(mm0/mm1) with Dst(mm4/mm5)
  paddusw mm1, mm5

  paddusw mm0, [rounding1_mmx]
  paddusw mm1, [rounding1_mmx]

  psrlw mm0, 1
  psrlw mm1, 1

  packuswb mm0, mm1

  movq [TMP0], mm0
%endmacro

ALIGN SECTION_ALIGN
interpolate8x8_halfpel_hv_add_mmx:
  PROLOG rounding2_mmx, 0    ; mm5 is busy.
; Don't load dst-rounder

    ; preprocess first line
  movq mm0, [_EAX]
  movq mm2, [_EAX+1]
  movq mm1, mm0
  movq mm3, mm2

  punpcklbw mm0, mm6
  punpcklbw mm2, mm6
  punpckhbw mm1, mm6
  punpckhbw mm3, mm6
  paddusw mm2, mm0
  paddusw mm3, mm1

   ; Input: mm2/mm3 contains the value (Src[0]+Src[1]) of previous line

  ADD_HH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HH_MMX
  lea TMP0,[TMP0+TMP1]
  ADD_HH_MMX

  ret
ENDFUNC

NON_EXEC_STACK

xvidcore/src/image/x86_asm/colorspace_rgb_mmx.asm

;/*****************************************************************************
; *
; *  XVID MPEG-4 VIDEO CODEC
; *  - RGB colorspace conversions -
; *
; *  Copyright(C) 2002-2008 Michael Militzer
; *               2002-2003 Peter Ross
; *
; *  This program is free software ; you can redistribute it and/or modify
; *  it under the terms of the GNU General Public License as published by
; *  the Free Software Foundation ; either version 2 of the License, or
; *  (at your option) any later version.
; *
; *  This program is distributed in the hope that it will be useful,
; *  but WITHOUT ANY WARRANTY ; without even the implied warranty of
; *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
; *  GNU General Public License for more details.
; *
; *  You should have received a copy of the GNU General Public License
; *  along with this program ; if not, write to the Free Software
; *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
; *
; ****************************************************************************/

%include "nasm.inc"

;=============================================================================
; Some constants
;=============================================================================

;-----------------------------------------------------------------------------
; RGB->YV12 yuv constants
;-----------------------------------------------------------------------------

%define Y_R   0.257
%define Y_G   0.504
%define Y_B   0.098
%define Y_ADD 16

%define U_R   0.148
%define U_G   0.291
%define U_B   0.439
%define U_ADD 128

%define V_R   0.439
%define V_G   0.368
%define V_B   0.071
%define V_ADD 128

; Scaling used during conversion
%define SCALEBITS_OUT 6
%define SCALEBITS_IN  13

%define FIX_ROUND (1<<(SCALEBITS_IN-1))

;=============================================================================
; Read only data
;=============================================================================

DATA

ALIGN SECTION_ALIGN

;-----------------------------------------------------------------------------
; BGR->YV12 multiplication matrices
;-----------------------------------------------------------------------------
;         FIX(Y_B)  FIX(Y_G)  FIX(Y_R)  Ignored

bgr_y_mul: dw    803,     4129,     2105,      0
bgr_u_mul: dw   3596,    -2384,    -1212,      0
bgr_v_mul: dw   -582,    -3015,     3596,      0

;-----------------------------------------------------------------------------
; RGB->YV12 multiplication matrices
;-----------------------------------------------------------------------------
;         FIX(Y_R)  FIX(Y_G)  FIX(Y_B)  Ignored

rgb_y_mul: dw   2105,     4129,      803,      0
rgb_u_mul: dw  -1212,    -2384,     3596,      0
rgb_v_mul: dw   3596,    -3015,     -582,      0

;-----------------------------------------------------------------------------
; YV12->RGB data
;----------------------------------------------------------------------------- Y_SUB: dw 16, 16, 16, 16 U_SUB: dw 128, 128, 128, 128 V_SUB: dw 128, 128, 128, 128 Y_MUL: dw 74, 74, 74, 74 UG_MUL: dw 25, 25, 25, 25 VG_MUL: dw 52, 52, 52, 52 UB_MUL: dw 129, 129, 129, 129 VR_MUL: dw 102, 102, 102, 102 BRIGHT: db 128, 128, 128, 128, 128, 128, 128, 128 ;============================================================================= ; Helper macros used with the colorspace_mmx.inc file ;============================================================================= ;------------------------------------------------------------------------------ ; BGR_TO_YV12( BYTES ) ; ; BYTES 3=bgr(24bit), 4=bgra(32-bit) ; ; bytes=3/4, pixels = 2, vpixels=2 ;------------------------------------------------------------------------------ %macro BGR_TO_YV12_INIT 2 movq mm7, [bgr_y_mul] %endmacro %macro BGR_TO_YV12 2 ; y_out pxor mm4, mm4 pxor mm5, mm5 movd mm0, [x_ptr] ; x_ptr[0...] movd mm2, [x_ptr+x_stride] ; x_ptr[x_stride...] punpcklbw mm0, mm4 ; [ |b |g |r ] punpcklbw mm2, mm5 ; [ |b |g |r ] movq mm6, mm0 ; = [ |b4|g4|r4] paddw mm6, mm2 ; +[ |b4|g4|r4] pmaddwd mm0, mm7 ; *= Y_MUL pmaddwd mm2, mm7 ; *= Y_MUL movq mm4, mm0 ; [r] movq mm5, mm2 ; [r] psrlq mm4, 32 ; +[g] psrlq mm5, 32 ; +[g] paddd mm0, mm4 ; +[b] paddd mm2, mm5 ; +[b] pxor mm4, mm4 pxor mm5, mm5 movd mm1, [x_ptr+%1] ; src[%1...] movd mm3, [x_ptr+x_stride+%1] ; src[x_stride+%1...] 
punpcklbw mm1, mm4 ; [ |b |g |r ] punpcklbw mm3, mm5 ; [ |b |g |r ] paddw mm6, mm1 ; +[ |b4|g4|r4] paddw mm6, mm3 ; +[ |b4|g4|r4] pmaddwd mm1, mm7 ; *= Y_MUL pmaddwd mm3, mm7 ; *= Y_MUL movq mm4, mm1 ; [r] movq mm5, mm3 ; [r] psrlq mm4, 32 ; +[g] psrlq mm5, 32 ; +[g] paddd mm1, mm4 ; +[b] paddd mm3, mm5 ; +[b] push x_stride movd x_stride_d, mm0 add x_stride, FIX_ROUND shr x_stride, SCALEBITS_IN add x_stride, Y_ADD mov [y_ptr], dl ; y_ptr[0] movd x_stride_d, mm1 add x_stride, FIX_ROUND shr x_stride, SCALEBITS_IN add x_stride, Y_ADD mov [y_ptr + 1], dl ; y_ptr[1] movd x_stride_d, mm2 add x_stride, FIX_ROUND shr x_stride, SCALEBITS_IN add x_stride, Y_ADD mov [y_ptr + y_stride + 0], dl ; y_ptr[y_stride + 0] movd x_stride_d, mm3 add x_stride, FIX_ROUND shr x_stride, SCALEBITS_IN add x_stride, Y_ADD mov [y_ptr + y_stride + 1], dl ; y_ptr[y_stride + 1] ; u_ptr, v_ptr movq mm0, mm6 ; = [ |b4|g4|r4] pmaddwd mm6, [bgr_v_mul] ; *= V_MUL pmaddwd mm0, [bgr_u_mul] ; *= U_MUL movq mm1, mm0 movq mm2, mm6 psrlq mm1, 32 psrlq mm2, 32 paddd mm0, mm1 paddd mm2, mm6 movd x_stride_d, mm0 add x_stride, 4*FIX_ROUND shr x_stride, (SCALEBITS_IN+2) add x_stride, U_ADD mov [u_ptr], dl movd x_stride_d, mm2 add x_stride, 4*FIX_ROUND shr x_stride, (SCALEBITS_IN+2) add x_stride, V_ADD mov [v_ptr], dl pop x_stride %endmacro ;------------------------------------------------------------------------------ ; RGB_TO_YV12( BYTES ) ; ; BYTES 3=rgb(24bit), 4=rgba(32-bit) ; ; bytes=3/4, pixels = 2, vpixels=2 ;------------------------------------------------------------------------------ %macro RGB_TO_YV12_INIT 2 movq mm7, [rgb_y_mul] %endmacro %macro RGB_TO_YV12 2 ; y_out pxor mm4, mm4 pxor mm5, mm5 movd mm0, [x_ptr] ; x_ptr[0...] movd mm2, [x_ptr+x_stride] ; x_ptr[x_stride...] 
punpcklbw mm0, mm4 ; [ |b |g |r ] punpcklbw mm2, mm5 ; [ |b |g |r ] movq mm6, mm0 ; = [ |b4|g4|r4] paddw mm6, mm2 ; +[ |b4|g4|r4] pmaddwd mm0, mm7 ; *= Y_MUL pmaddwd mm2, mm7 ; *= Y_MUL movq mm4, mm0 ; [r] movq mm5, mm2 ; [r] psrlq mm4, 32 ; +[g] psrlq mm5, 32 ; +[g] paddd mm0, mm4 ; +[b] paddd mm2, mm5 ; +[b] pxor mm4, mm4 pxor mm5, mm5 movd mm1, [x_ptr+%1] ; src[%1...] movd mm3, [x_ptr+x_stride+%1] ; src[x_stride+%1...] punpcklbw mm1, mm4 ; [ |b |g |r ] punpcklbw mm3, mm5 ; [ |b |g |r ] paddw mm6, mm1 ; +[ |b4|g4|r4] paddw mm6, mm3 ; +[ |b4|g4|r4] pmaddwd mm1, mm7 ; *= Y_MUL pmaddwd mm3, mm7 ; *= Y_MUL movq mm4, mm1 ; [r] movq mm5, mm3 ; [r] psrlq mm4, 32 ; +[g] psrlq mm5, 32 ; +[g] paddd mm1, mm4 ; +[b] paddd mm3, mm5 ; +[b] push x_stride movd x_stride_d, mm0 add x_stride, FIX_ROUND shr x_stride, SCALEBITS_IN add x_stride, Y_ADD mov [y_ptr], dl ; y_ptr[0] movd x_stride_d, mm1 add x_stride, FIX_ROUND shr x_stride, SCALEBITS_IN add x_stride, Y_ADD mov [y_ptr + 1], dl ; y_ptr[1] movd x_stride_d, mm2 add x_stride, FIX_ROUND shr x_stride, SCALEBITS_IN add x_stride, Y_ADD mov [y_ptr + y_stride + 0], dl ; y_ptr[y_stride + 0] movd x_stride_d, mm3 add x_stride, FIX_ROUND shr x_stride, SCALEBITS_IN add x_stride, Y_ADD mov [y_ptr + y_stride + 1], dl ; y_ptr[y_stride + 1] ; u_ptr, v_ptr movq mm0, mm6 ; = [ |b4|g4|r4] pmaddwd mm6, [rgb_v_mul] ; *= V_MUL pmaddwd mm0, [rgb_u_mul] ; *= U_MUL movq mm1, mm0 movq mm2, mm6 psrlq mm1, 32 psrlq mm2, 32 paddd mm0, mm1 paddd mm2, mm6 movd x_stride_d, mm0 add x_stride, 4*FIX_ROUND shr x_stride, (SCALEBITS_IN+2) add x_stride, U_ADD mov [u_ptr], dl movd x_stride_d, mm2 add x_stride, 4*FIX_ROUND shr x_stride, (SCALEBITS_IN+2) add x_stride, V_ADD mov [v_ptr], dl pop x_stride %endmacro ;------------------------------------------------------------------------------ ; YV12_TO_BGR( BYTES ) ; ; BYTES 3=bgr(24-bit), 4=bgra(32-bit) ; ; bytes=3/4, pixels = 8, vpixels=2 ;------------------------------------------------------------------------------ 
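; As a reading aid for the YV12_TO_BGR macro, here is a scalar sketch in C of
; the per-pixel arithmetic it appears to perform, using the constants from the
; "YV12->RGB data" table of this file (Y_SUB, U_SUB, V_SUB, Y_MUL, UG_MUL,
; VG_MUL, UB_MUL, VR_MUL) and SCALEBITS_OUT = 6. The function names are
; hypothetical and not part of Xvid; the macro itself works on 8x2 pixels at
; once and shares one U/V sample among four pixels, which this sketch leaves
; out.

```c
#include <stdint.h>

enum { SCALEBITS_OUT = 6 };

/* Models 'psraw' followed by 'packuswb': a negative sum ends up as 0,
 * anything above 255 after the shift saturates to 255. */
static uint8_t scale_and_clamp(int sum)
{
    if (sum < 0)
        return 0;
    sum >>= SCALEBITS_OUT;
    return (uint8_t)(sum > 255 ? 255 : sum);
}

/* Hypothetical scalar reference for one pixel; out[] is ordered B, G, R. */
static void yuv_to_bgr_pixel(uint8_t y, uint8_t u, uint8_t v, uint8_t out[3])
{
    int yy = (y - 16) * 74;     /* (Y - Y_SUB) * Y_MUL */
    int uu = u - 128;           /* U - U_SUB */
    int vv = v - 128;           /* V - V_SUB */

    out[0] = scale_and_clamp(yy + uu * 129);           /* UB_MUL */
    out[1] = scale_and_clamp(yy - uu * 25 - vv * 52);  /* UG_MUL, VG_MUL */
    out[2] = scale_and_clamp(yy + vv * 102);           /* VR_MUL */
}
```

; For example, video black (Y=16, U=V=128) maps to B=G=R=0. The asm uses
; saturating 16-bit adds where this sketch uses plain int arithmetic; the
; final clamp should make the two agree, but that equivalence is an assumption
; of the sketch rather than something stated in the source.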
%macro YV12_TO_BGR_INIT 2 pxor mm7, mm7 ; clear mm7 %endmacro %macro YV12_TO_BGR 2 %define TEMP_Y1 _ESP %define TEMP_Y2 _ESP + 8 %define TEMP_G1 _ESP + 16 %define TEMP_G2 _ESP + 24 %define TEMP_B1 _ESP + 32 %define TEMP_B2 _ESP + 40 movd mm2, [u_ptr] ; u_ptr[0] movd mm3, [v_ptr] ; v_ptr[0] punpcklbw mm2, mm7 ; u3u2u1u0 -> mm2 punpcklbw mm3, mm7 ; v3v2v1v0 -> mm3 psubsw mm2, [U_SUB] ; U - 128 psubsw mm3, [V_SUB] ; V - 128 movq mm4, mm2 movq mm5, mm3 pmullw mm2, [UG_MUL] pmullw mm3, [VG_MUL] movq mm6, mm2 ; u3u2u1u0 -> mm6 punpckhwd mm2, mm2 ; u3u3u2u2 -> mm2 punpcklwd mm6, mm6 ; u1u1u0u0 -> mm6 pmullw mm4, [UB_MUL] ; B_ADD -> mm4 movq mm0, mm3 punpckhwd mm3, mm3 ; v3v3v2v2 -> mm2 punpcklwd mm0, mm0 ; v1v1v0v0 -> mm6 paddsw mm2, mm3 paddsw mm6, mm0 pmullw mm5, [VR_MUL] ; R_ADD -> mm5 movq mm0, [y_ptr] ; y7y6y5y4y3y2y1y0 -> mm0 movq mm1, mm0 punpckhbw mm1, mm7 ; y7y6y5y4 -> mm1 punpcklbw mm0, mm7 ; y3y2y1y0 -> mm0 psubsw mm0, [Y_SUB] ; Y - Y_SUB psubsw mm1, [Y_SUB] ; Y - Y_SUB pmullw mm1, [Y_MUL] pmullw mm0, [Y_MUL] movq [TEMP_Y2], mm1 ; y7y6y5y4 -> mm3 movq [TEMP_Y1], mm0 ; y3y2y1y0 -> mm7 psubsw mm1, mm2 ; g7g6g5g4 -> mm1 psubsw mm0, mm6 ; g3g2g1g0 -> mm0 psraw mm1, SCALEBITS_OUT psraw mm0, SCALEBITS_OUT packuswb mm0, mm1 ;g7g6g5g4g3g2g1g0 -> mm0 movq [TEMP_G1], mm0 movq mm0, [y_ptr+y_stride] ; y7y6y5y4y3y2y1y0 -> mm0 movq mm1, mm0 punpckhbw mm1, mm7 ; y7y6y5y4 -> mm1 punpcklbw mm0, mm7 ; y3y2y1y0 -> mm0 psubsw mm0, [Y_SUB] ; Y - Y_SUB psubsw mm1, [Y_SUB] ; Y - Y_SUB pmullw mm1, [Y_MUL] pmullw mm0, [Y_MUL] movq mm3, mm1 psubsw mm1, mm2 ; g7g6g5g4 -> mm1 movq mm2, mm0 psubsw mm0, mm6 ; g3g2g1g0 -> mm0 psraw mm1, SCALEBITS_OUT psraw mm0, SCALEBITS_OUT packuswb mm0, mm1 ; g7g6g5g4g3g2g1g0 -> mm0 movq [TEMP_G2], mm0 movq mm0, mm4 punpckhwd mm4, mm4 ; u3u3u2u2 -> mm2 punpcklwd mm0, mm0 ; u1u1u0u0 -> mm6 movq mm1, mm3 ; y7y6y5y4 -> mm1 paddsw mm3, mm4 ; b7b6b5b4 -> mm3 movq mm7, mm2 ; y3y2y1y0 -> mm7 paddsw mm2, mm0 ; b3b2b1b0 -> mm2 psraw mm3, SCALEBITS_OUT psraw mm2, 
SCALEBITS_OUT packuswb mm2, mm3 ; b7b6b5b4b3b2b1b0 -> mm2 movq [TEMP_B2], mm2 movq mm3, [TEMP_Y2] movq mm2, [TEMP_Y1] movq mm6, mm3 ; TEMP_Y2 -> mm6 paddsw mm3, mm4 ; b7b6b5b4 -> mm3 movq mm4, mm2 ; TEMP_Y1 -> mm4 paddsw mm2, mm0 ; b3b2b1b0 -> mm2 psraw mm3, SCALEBITS_OUT psraw mm2, SCALEBITS_OUT packuswb mm2, mm3 ; b7b6b5b4b3b2b1b0 -> mm2 movq [TEMP_B1], mm2 movq mm0, mm5 punpckhwd mm5, mm5 ; v3v3v2v2 -> mm5 punpcklwd mm0, mm0 ; v1v1v0v0 -> mm0 paddsw mm1, mm5 ; r7r6r5r4 -> mm1 paddsw mm7, mm0 ; r3r2r1r0 -> mm7 psraw mm1, SCALEBITS_OUT psraw mm7, SCALEBITS_OUT packuswb mm7, mm1 ; r7r6r5r4r3r2r1r0 -> mm7 (TEMP_R2) paddsw mm6, mm5 ; r7r6r5r4 -> mm6 paddsw mm4, mm0 ; r3r2r1r0 -> mm4 psraw mm6, SCALEBITS_OUT psraw mm4, SCALEBITS_OUT packuswb mm4, mm6 ; r7r6r5r4r3r2r1r0 -> mm4 (TEMP_R1) movq mm0, [TEMP_B1] movq mm1, [TEMP_G1] movq mm6, mm7 movq mm2, mm0 punpcklbw mm2, mm4 ; r3b3r2b2r1b1r0b0 -> mm2 punpckhbw mm0, mm4 ; r7b7r6b6r5b5r4b4 -> mm0 pxor mm7, mm7 movq mm3, mm1 punpcklbw mm1, mm7 ; 0g30g20g10g0 -> mm1 punpckhbw mm3, mm7 ; 0g70g60g50g4 -> mm3 movq mm4, mm2 punpcklbw mm2, mm1 ; 0r1g1b10r0g0b0 -> mm2 punpckhbw mm4, mm1 ; 0r3g3b30r2g2b2 -> mm4 movq mm5, mm0 punpcklbw mm0, mm3 ; 0r5g5b50r4g4b4 -> mm0 punpckhbw mm5, mm3 ; 0r7g7b70r6g6b6 -> mm5 %if %1 == 3 ; BGR (24-bit) movd [x_ptr], mm2 psrlq mm2, 32 movd [x_ptr + 3], mm2 movd [x_ptr + 6], mm4 psrlq mm4, 32 movd [x_ptr + 9], mm4 movd [x_ptr + 12], mm0 psrlq mm0, 32 movd [x_ptr + 15], mm0 movq mm2, mm5 psrlq mm0, 8 ; 000000r5g5 -> mm0 psllq mm2, 32 ; 0r6g6b60000 -> mm2 psrlq mm5, 32 ; 00000r7g7b7 -> mm5 psrlq mm2, 16 ; 000r6g6b600 -> mm2 por mm0, mm2 ; 000r6g6b6r5g5 -> mm0 psllq mm5, 40 ; r7g7b700000 -> mm5 por mm5, mm0 ; r7g7b7r6g6b6r5g5 -> mm5 movq [x_ptr + 16], mm5 movq mm0, [TEMP_B2] movq mm1, [TEMP_G2] movq mm2, mm0 punpcklbw mm2, mm6 ; r3b3r2b2r1b1r0b0 -> mm2 punpckhbw mm0, mm6 ; r7b7r6b6r5b5r4b4 -> mm0 movq mm3, mm1 punpcklbw mm1, mm7 ; 0g30g20g10g0 -> mm1 punpckhbw mm3, mm7 ; 0g70g60g50g4 -> mm3 movq mm4, mm2 
punpcklbw mm2, mm1 ; 0r1g1b10r0g0b0 -> mm2 punpckhbw mm4, mm1 ; 0r3g3b30r2g2b2 -> mm4 movq mm5, mm0 punpcklbw mm0, mm3 ; 0r5g5b50r4g4b4 -> mm0 punpckhbw mm5, mm3 ; 0r7g7b70r6g6b6 -> mm5 movd [x_ptr+x_stride], mm2 psrlq mm2, 32 movd [x_ptr+x_stride + 3], mm2 movd [x_ptr+x_stride + 6], mm4 psrlq mm4, 32 movd [x_ptr+x_stride + 9], mm4 movd [x_ptr+x_stride + 12], mm0 psrlq mm0, 32 movd [x_ptr+x_stride + 15], mm0 movq mm2, mm5 psrlq mm0, 8 ; 000000r5g5 -> mm0 psllq mm2, 32 ; 0r6g6b60000 -> mm2 psrlq mm5, 32 ; 00000r7g7b7 -> mm5 psrlq mm2, 16 ; 000r6g6b600 -> mm2 por mm0, mm2 ; 000r6g6b6r5g5 -> mm0 psllq mm5, 40 ; r7g7b700000 -> mm5 por mm5, mm0 ; r7g7b7r6g6b6r5g5 -> mm5 movq [x_ptr + x_stride + 16], mm5 %else ; BGRA (32-bit) movq [x_ptr], mm2 movq [x_ptr + 8], mm4 movq [x_ptr + 16], mm0 movq [x_ptr + 24], mm5 movq mm0, [TEMP_B2] movq mm1, [TEMP_G2] movq mm2, mm0 punpcklbw mm2, mm6 ; r3b3r2b2r1b1r0b0 -> mm2 punpckhbw mm0, mm6 ; r7b7r6b6r5b5r4b4 -> mm0 movq mm3, mm1 punpcklbw mm1, mm7 ; 0g30g20g10g0 -> mm1 punpckhbw mm3, mm7 ; 0g70g60g50g4 -> mm3 movq mm4, mm2 punpcklbw mm2, mm1 ; 0r1g1b10r0g0b0 -> mm2 punpckhbw mm4, mm1 ; 0r3g3b30r2g2b2 -> mm4 movq mm5, mm0 punpcklbw mm0, mm3 ; 0r5g5b50r4g4b4 -> mm0 punpckhbw mm5, mm3 ; 0r7g7b70r6g6b6 -> mm5 movq [x_ptr + x_stride], mm2 movq [x_ptr + x_stride + 8], mm4 movq [x_ptr + x_stride + 16], mm0 movq [x_ptr + x_stride + 24], mm5 %endif %undef TEMP_Y1 %undef TEMP_Y2 %undef TEMP_G1 %undef TEMP_G2 %undef TEMP_B1 %undef TEMP_B2 %endmacro ;============================================================================= ; Code ;============================================================================= TEXT %include "colorspace_mmx.inc" ; input MAKE_COLORSPACE bgr_to_yv12_mmx,0, 3,2,2, BGR_TO_YV12, 3, -1 MAKE_COLORSPACE bgra_to_yv12_mmx,0, 4,2,2, BGR_TO_YV12, 4, -1 MAKE_COLORSPACE rgb_to_yv12_mmx,0, 3,2,2, RGB_TO_YV12, 3, -1 MAKE_COLORSPACE rgba_to_yv12_mmx,0, 4,2,2, RGB_TO_YV12, 4, -1 ; output MAKE_COLORSPACE yv12_to_bgr_mmx,48, 3,8,2, 
YV12_TO_BGR, 3, -1
MAKE_COLORSPACE  yv12_to_bgra_mmx,48,  4,8,2,  YV12_TO_BGR, 4, -1

NON_EXEC_STACK

xvidcore/src/image/x86_asm/qpel_mmx.asm

;/*****************************************************************************
; *
; *  XVID MPEG-4 VIDEO CODEC
; *  - Quarter-pixel interpolation -
; *  Copyright(C) 2002 Pascal Massimino
; *
; *  This file is part of Xvid, a free MPEG-4 video encoder/decoder
; *
; *  Xvid is free software; you can redistribute it and/or modify it
; *  under the terms of the GNU General Public License as published by
; *  the Free Software Foundation; either version 2 of the License, or
; *  (at your option) any later version.
; *
; *  This program is distributed in the hope that it will be useful,
; *  but WITHOUT ANY WARRANTY; without even the implied warranty of
; *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
; *  GNU General Public License for more details.
; *
; *  You should have received a copy of the GNU General Public License
; *  along with this program; if not, write to the Free Software
; *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
; *
; * $Id: qpel_mmx.asm,v 1.13 2010-11-28 15:18:21 Isibaar Exp $
; *
; *************************************************************************/

;/**************************************************************************
; *
; *	History:
; *
; * 22.10.2002  initial coding. unoptimized 'proof of concept',
; *             just to heft the qpel filtering. - Skal -
; *
; *************************************************************************/


%define USE_TABLES      ; in order to use xvid_FIR_x_x_x_x tables
                        ; instead of xvid_Expand_mmx...
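; With USE_TABLES defined, the horizontal passes in this file add a looked-up
; row of four 16-bit words per source pixel instead of expanding the pixel and
; multiplying it by the taps. The tables are only reserved (zero-filled) in
; this file; the code that fills them is not shown here. The C sketch below
; therefore rests on an assumption: that row v of a table holds v multiplied
; by four signed taps, e.g. {23, 19, -6, 3} for xvid_FIR_23_19_6_3, which is
; what the FIR_R1 row of the non-table branch suggests. build_fir_table() is a
; hypothetical name, and int16_t is used here for readability where the file's
; own comment declares the tables as uint16_t.

```c
#include <stdint.h>

/* Hypothetical table builder: table[v][n] = v * taps[n] for every byte
 * value v, so that one lookup replaces 'expand pixel' + 'pmullw'. */
static void build_fir_table(int16_t table[256][4], const int16_t taps[4])
{
    for (int v = 0; v < 256; v++)
        for (int n = 0; n < 4; n++)
            table[v][n] = (int16_t)(v * taps[n]);
}
```

; A table of this shape costs 256*4*2 = 2048 bytes, which matches the
; "times 256*4 dw 0" reservations in the data section of this file.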
%include "nasm.inc" ;////////////////////////////////////////////////////////////////////// ;// Declarations ;// all signatures are: ;// void XXX(uint8_t *dst, const uint8_t *src, ;// int32_t length, int32_t stride, int32_t rounding) ;////////////////////////////////////////////////////////////////////// cglobal xvid_H_Pass_16_mmx cglobal xvid_H_Pass_Avrg_16_mmx cglobal xvid_H_Pass_Avrg_Up_16_mmx cglobal xvid_V_Pass_16_mmx cglobal xvid_V_Pass_Avrg_16_mmx cglobal xvid_V_Pass_Avrg_Up_16_mmx cglobal xvid_H_Pass_8_mmx cglobal xvid_H_Pass_Avrg_8_mmx cglobal xvid_H_Pass_Avrg_Up_8_mmx cglobal xvid_V_Pass_8_mmx cglobal xvid_V_Pass_Avrg_8_mmx cglobal xvid_V_Pass_Avrg_Up_8_mmx cglobal xvid_H_Pass_Add_16_mmx cglobal xvid_H_Pass_Avrg_Add_16_mmx cglobal xvid_H_Pass_Avrg_Up_Add_16_mmx cglobal xvid_V_Pass_Add_16_mmx cglobal xvid_V_Pass_Avrg_Add_16_mmx cglobal xvid_V_Pass_Avrg_Up_Add_16_mmx cglobal xvid_H_Pass_8_Add_mmx cglobal xvid_H_Pass_Avrg_8_Add_mmx cglobal xvid_H_Pass_Avrg_Up_8_Add_mmx cglobal xvid_V_Pass_8_Add_mmx cglobal xvid_V_Pass_Avrg_8_Add_mmx cglobal xvid_V_Pass_Avrg_Up_8_Add_mmx cglobal xvid_Expand_mmx cglobal xvid_FIR_1_0_0_0 cglobal xvid_FIR_3_1_0_0 cglobal xvid_FIR_6_3_1_0 cglobal xvid_FIR_14_3_2_1 cglobal xvid_FIR_20_6_3_1 cglobal xvid_FIR_20_20_6_3 cglobal xvid_FIR_23_19_6_3 cglobal xvid_FIR_7_20_20_6 cglobal xvid_FIR_6_20_20_6 cglobal xvid_FIR_6_20_20_7 cglobal xvid_FIR_3_6_20_20 cglobal xvid_FIR_3_6_19_23 cglobal xvid_FIR_1_3_6_20 cglobal xvid_FIR_1_2_3_14 cglobal xvid_FIR_0_1_3_6 cglobal xvid_FIR_0_0_1_3 cglobal xvid_FIR_0_0_0_1 SECTION .data align=SECTION_ALIGN align SECTION_ALIGN xvid_Expand_mmx: times 256*4 dw 0 ; uint16_t xvid_Expand_mmx[256][4] ENDFUNC xvid_FIR_1_0_0_0: times 256*4 dw 0 ENDFUNC xvid_FIR_3_1_0_0: times 256*4 dw 0 ENDFUNC xvid_FIR_6_3_1_0: times 256*4 dw 0 ENDFUNC xvid_FIR_14_3_2_1: times 256*4 dw 0 ENDFUNC xvid_FIR_20_6_3_1: times 256*4 dw 0 ENDFUNC xvid_FIR_20_20_6_3: times 256*4 dw 0 ENDFUNC xvid_FIR_23_19_6_3: times 256*4 dw 0 ENDFUNC 
xvid_FIR_7_20_20_6: times 256*4 dw 0 ENDFUNC xvid_FIR_6_20_20_6: times 256*4 dw 0 ENDFUNC xvid_FIR_6_20_20_7: times 256*4 dw 0 ENDFUNC xvid_FIR_3_6_20_20: times 256*4 dw 0 ENDFUNC xvid_FIR_3_6_19_23: times 256*4 dw 0 ENDFUNC xvid_FIR_1_3_6_20: times 256*4 dw 0 ENDFUNC xvid_FIR_1_2_3_14: times 256*4 dw 0 ENDFUNC xvid_FIR_0_1_3_6: times 256*4 dw 0 ENDFUNC xvid_FIR_0_0_1_3: times 256*4 dw 0 ENDFUNC xvid_FIR_0_0_0_1: times 256*4 dw 0 ENDFUNC ;////////////////////////////////////////////////////////////////////// DATA align SECTION_ALIGN Rounder1_MMX: times 4 dw 1 Rounder0_MMX: times 4 dw 0 align SECTION_ALIGN Rounder_QP_MMX: times 4 dw 16 times 4 dw 15 %ifndef USE_TABLES align SECTION_ALIGN ; H-Pass table shared by 16x? and 8x? filters FIR_R0: dw 14, -3, 2, -1 align SECTION_ALIGN FIR_R1: dw 23, 19, -6, 3, -1, 0, 0, 0 FIR_R2: dw -7, 20, 20, -6, 3, -1, 0, 0 FIR_R3: dw 3, -6, 20, 20, -6, 3, -1, 0 FIR_R4: dw -1, 3, -6, 20, 20, -6, 3, -1 FIR_R5: dw 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0 align SECTION_ALIGN FIR_R6: dw 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0 align SECTION_ALIGN FIR_R7: dw 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0 align SECTION_ALIGN FIR_R8: dw -1, 3, -6, 20, 20, -6, 3, -1 FIR_R9: dw 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0 align SECTION_ALIGN FIR_R10: dw 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0 align SECTION_ALIGN FIR_R11: dw 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0 align SECTION_ALIGN FIR_R12: dw -1, 3, -6, 20, 20, -6, 3, -1 FIR_R13: dw 0, -1, 3, -6, 20, 20, -6, 3 FIR_R14: dw 0, 0, -1, 3, -6, 20, 20, -7 FIR_R15: dw 0, 0, 0, -1, 3, -6, 19, 23 FIR_R16: dw -1, 2, -3, 14 %endif ; !USE_TABLES ; V-Pass taps align SECTION_ALIGN FIR_Cm7: times 4 dw -7 FIR_Cm6: times 4 dw -6 FIR_Cm3: times 4 dw -3 FIR_Cm1: times 4 dw -1 FIR_C2: times 4 dw 2 FIR_C3: times 4 dw 3 FIR_C14: times 4 dw 14 FIR_C19: times 4 dw 19 FIR_C20: times 4 dw 20 FIR_C23: times 4 dw 23 TEXT ;////////////////////////////////////////////////////////////////////// ;// Here we go with the Q-Pel mess. 
;// For horizontal passes, we process 4 *output* pixel in parallel ;// For vertical ones, we process 4 *input* pixel in parallel. ;////////////////////////////////////////////////////////////////////// %ifdef ARCH_IS_X86_64 %macro XVID_MOVQ 3 lea r9, [%2] movq %1, [r9 + %3] %endmacro %macro XVID_PADDW 3 lea r9, [%2] paddw %1, [r9 + %3] %endmacro %define SRC_PTR prm2 %define DST_PTR prm1 %else %macro XVID_MOVQ 3 movq %1, [%2 + %3] %endmacro %macro XVID_PADDW 3 paddw %1, [%2 + %3] %endmacro %define SRC_PTR _ESI %define DST_PTR _EDI %endif %macro PROLOG_NO_AVRG 0 mov TMP0, prm3 ; Size mov TMP1, prm4 ; BpS mov eax, prm5d ; Rnd %ifndef ARCH_IS_X86_64 push SRC_PTR push DST_PTR %endif push _EBP mov _EBP, TMP1 %ifndef ARCH_IS_X86_64 mov DST_PTR, [_ESP+16 + 0*4] ; Dst mov SRC_PTR, [_ESP+16 + 1*4] ; Src %endif and _EAX, 1 lea TMP1, [Rounder_QP_MMX] movq mm7, [TMP1+_EAX*8] ; rounder %endmacro %macro EPILOG_NO_AVRG 0 pop _EBP %ifndef ARCH_IS_X86_64 pop DST_PTR pop SRC_PTR %endif ret %endmacro %macro PROLOG_AVRG 0 mov TMP0, prm3 ; Size mov TMP1, prm4 ; BpS mov eax, prm5d ; Rnd push _EBX push _EBP %ifndef ARCH_IS_X86_64 push SRC_PTR push DST_PTR %endif mov _EBP, TMP1 %ifndef ARCH_IS_X86_64 mov DST_PTR, [_ESP+20 + 0*4] ; Dst mov SRC_PTR, [_ESP+20 + 1*4] ; Src %endif and _EAX, 1 lea TMP1, [Rounder_QP_MMX] movq mm7, [TMP1+_EAX*8] ; rounder lea TMP1, [Rounder1_MMX] lea _EBX, [TMP1+_EAX*8] ; *Rounder2 %endmacro %macro EPILOG_AVRG 0 %ifndef ARCH_IS_X86_64 pop DST_PTR pop SRC_PTR %endif pop _EBP pop _EBX ret %endmacro ;////////////////////////////////////////////////////////////////////// ;// ;// All horizontal passes ;// ;////////////////////////////////////////////////////////////////////// ; macros for USE_TABLES %macro TLOAD 2 ; %1,%2: src pixels movzx _EAX, byte [SRC_PTR+%1] movzx TMP1, byte [SRC_PTR+%2] XVID_MOVQ mm0, xvid_FIR_14_3_2_1, _EAX*8 XVID_MOVQ mm3, xvid_FIR_1_2_3_14, TMP1*8 paddw mm0, mm7 paddw mm3, mm7 %endmacro %macro TACCUM2 5 ;%1:src pixel/%2-%3:Taps tables/ 
%4-%5:dst regs movzx _EAX, byte [SRC_PTR+%1] XVID_PADDW %4, %2, _EAX*8 XVID_PADDW %5, %3, _EAX*8 %endmacro %macro TACCUM3 7 ;%1:src pixel/%2-%4:Taps tables/%5-%7:dst regs movzx _EAX, byte [SRC_PTR+%1] XVID_PADDW %5, %2, _EAX*8 XVID_PADDW %6, %3, _EAX*8 XVID_PADDW %7, %4, _EAX*8 %endmacro ;////////////////////////////////////////////////////////////////////// ; macros without USE_TABLES %macro LOAD 2 ; %1,%2: src pixels movzx _EAX, byte [SRC_PTR+%1] movzx TMP1, byte [SRC_PTR+%2] XVID_MOVQ mm0, xvid_Expand_mmx, _EAX*8 XVID_MOVQ mm3, xvid_Expand_mmx, TMP1*8 pmullw mm0, [FIR_R0 ] pmullw mm3, [FIR_R16] paddw mm0, mm7 paddw mm3, mm7 %endmacro %macro ACCUM2 4 ;src pixel/Taps/dst regs #1-#2 movzx _EAX, byte [SRC_PTR+%1] XVID_MOVQ mm4, xvid_Expand_mmx, _EAX*8 movq mm5, mm4 pmullw mm4, [%2] pmullw mm5, [%2+8] paddw %3, mm4 paddw %4, mm5 %endmacro %macro ACCUM3 5 ;src pixel/Taps/dst regs #1-#2-#3 movzx _EAX, byte [SRC_PTR+%1] XVID_MOVQ mm4, xvid_Expand_mmx, _EAX*8 movq mm5, mm4 movq mm6, mm5 pmullw mm4, [%2 ] pmullw mm5, [%2+ 8] pmullw mm6, [%2+16] paddw %3, mm4 paddw %4, mm5 paddw %5, mm6 %endmacro ;////////////////////////////////////////////////////////////////////// %macro MIX 3 ; %1:reg, %2:src, %3:rounder pxor mm6, mm6 movq mm4, [%2] movq mm1, %1 movq mm5, mm4 punpcklbw %1, mm6 punpcklbw mm4, mm6 punpckhbw mm1, mm6 punpckhbw mm5, mm6 movq mm6, [%3] ; rounder #2 paddusw %1, mm4 paddusw mm1, mm5 paddusw %1, mm6 paddusw mm1, mm6 psrlw %1, 1 psrlw mm1, 1 packuswb %1, mm1 %endmacro ;////////////////////////////////////////////////////////////////////// %macro H_PASS_16 2 ; %1:src-op (0=NONE,1=AVRG,2=AVRG-UP), %2:dst-op (NONE/AVRG) %if (%2==0) && (%1==0) PROLOG_NO_AVRG %else PROLOG_AVRG %endif .Loop: ; mm0..mm3 serves as a 4x4 delay line %ifndef USE_TABLES LOAD 0, 16 ; special case for 1rst/last pixel movq mm1, mm7 movq mm2, mm7 ACCUM2 1, FIR_R1, mm0, mm1 ACCUM2 2, FIR_R2, mm0, mm1 ACCUM2 3, FIR_R3, mm0, mm1 ACCUM2 4, FIR_R4, mm0, mm1 ACCUM3 5, FIR_R5, mm0, mm1, mm2 ACCUM3 6, 
FIR_R6, mm0, mm1, mm2 ACCUM3 7, FIR_R7, mm0, mm1, mm2 ACCUM2 8, FIR_R8, mm1, mm2 ACCUM3 9, FIR_R9, mm1, mm2, mm3 ACCUM3 10, FIR_R10,mm1, mm2, mm3 ACCUM3 11, FIR_R11,mm1, mm2, mm3 ACCUM2 12, FIR_R12, mm2, mm3 ACCUM2 13, FIR_R13, mm2, mm3 ACCUM2 14, FIR_R14, mm2, mm3 ACCUM2 15, FIR_R15, mm2, mm3 %else TLOAD 0, 16 ; special case for 1rst/last pixel movq mm1, mm7 movq mm2, mm7 TACCUM2 1, xvid_FIR_23_19_6_3, xvid_FIR_1_0_0_0 , mm0, mm1 TACCUM2 2, xvid_FIR_7_20_20_6, xvid_FIR_3_1_0_0 , mm0, mm1 TACCUM2 3, xvid_FIR_3_6_20_20, xvid_FIR_6_3_1_0 , mm0, mm1 TACCUM2 4, xvid_FIR_1_3_6_20 , xvid_FIR_20_6_3_1, mm0, mm1 TACCUM3 5, xvid_FIR_0_1_3_6 , xvid_FIR_20_20_6_3, xvid_FIR_1_0_0_0 , mm0, mm1, mm2 TACCUM3 6, xvid_FIR_0_0_1_3 , xvid_FIR_6_20_20_6, xvid_FIR_3_1_0_0 , mm0, mm1, mm2 TACCUM3 7, xvid_FIR_0_0_0_1 , xvid_FIR_3_6_20_20, xvid_FIR_6_3_1_0 , mm0, mm1, mm2 TACCUM2 8, xvid_FIR_1_3_6_20 , xvid_FIR_20_6_3_1 , mm1, mm2 TACCUM3 9, xvid_FIR_0_1_3_6 , xvid_FIR_20_20_6_3, xvid_FIR_1_0_0_0, mm1, mm2, mm3 TACCUM3 10, xvid_FIR_0_0_1_3 , xvid_FIR_6_20_20_6, xvid_FIR_3_1_0_0, mm1, mm2, mm3 TACCUM3 11, xvid_FIR_0_0_0_1 , xvid_FIR_3_6_20_20, xvid_FIR_6_3_1_0, mm1, mm2, mm3 TACCUM2 12, xvid_FIR_1_3_6_20, xvid_FIR_20_6_3_1 , mm2, mm3 TACCUM2 13, xvid_FIR_0_1_3_6 , xvid_FIR_20_20_6_3, mm2, mm3 TACCUM2 14, xvid_FIR_0_0_1_3 , xvid_FIR_6_20_20_7, mm2, mm3 TACCUM2 15, xvid_FIR_0_0_0_1 , xvid_FIR_3_6_19_23, mm2, mm3 %endif psraw mm0, 5 psraw mm1, 5 psraw mm2, 5 psraw mm3, 5 packuswb mm0, mm1 packuswb mm2, mm3 %if (%1==1) MIX mm0, SRC_PTR, _EBX %elif (%1==2) MIX mm0, SRC_PTR+1, _EBX %endif %if (%2==1) MIX mm0, DST_PTR, Rounder1_MMX %endif %if (%1==1) MIX mm2, SRC_PTR+8, _EBX %elif (%1==2) MIX mm2, SRC_PTR+9, _EBX %endif %if (%2==1) MIX mm2, DST_PTR+8, Rounder1_MMX %endif lea SRC_PTR, [SRC_PTR+_EBP] movq [DST_PTR+0], mm0 movq [DST_PTR+8], mm2 add DST_PTR, _EBP dec TMP0 jg .Loop %if (%2==0) && (%1==0) EPILOG_NO_AVRG %else EPILOG_AVRG %endif %endmacro 
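; For readers who want a scalar picture of what H_PASS_16 computes in its
; plain variant (src-op 0, dst-op 0), here is a sketch in C for a single row.
; It assumes the 8-tap kernel (-1, 3, -6, 20, 20, -6, 3, -1) read off the
; FIR_R4 row, 17 input pixels src[0..16], and mirroring at both ends of the
; row; expanding the mirrored taps by hand gives the edge rows FIR_R0
; (14, -3, 2, -1) and FIR_R16 (-1, 2, -3, 14). The names mirror_index() and
; qpel_h_pass_16_row_ref() are hypothetical and not part of Xvid.

```c
#include <stdint.h>

/* Reflect an out-of-range index back into [0, n):
 * -1 -> 0, -2 -> 1, n -> n-1, n+1 -> n-2. */
static int mirror_index(int x, int n)
{
    if (x < 0)
        return -1 - x;
    if (x >= n)
        return 2 * n - 1 - x;
    return x;
}

/* One row of the 16-wide horizontal pass, without any averaging step.
 * Reads src[0..16], writes dst[0..15]. */
static void qpel_h_pass_16_row_ref(uint8_t dst[16], const uint8_t src[17],
                                   int rounding)
{
    static const int taps[8] = { -1, 3, -6, 20, 20, -6, 3, -1 };
    /* Rounder_QP_MMX holds 16 for rounding=0 and 15 for rounding=1. */
    const int rounder = 16 - (rounding & 1);

    for (int o = 0; o < 16; o++) {
        int sum = rounder;
        for (int t = 0; t < 8; t++)
            sum += taps[t] * src[mirror_index(o - 3 + t, 17)];

        /* 'psraw 5' then 'packuswb': negatives become 0, overflow 255. */
        if (sum < 0) {
            dst[o] = 0;
        } else {
            sum >>= 5;
            dst[o] = (uint8_t)(sum > 255 ? 255 : sum);
        }
    }
}
```

; The taps sum to 32 in every output position, so a constant row is returned
; unchanged after the shift by 5. The Avrg, Avrg_Up and Add variants add one
; or two averaging steps on top of this result and are not modelled here.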
;////////////////////////////////////////////////////////////////////// %macro H_PASS_8 2 ; %1:src-op (0=NONE,1=AVRG,2=AVRG-UP), %2:dst-op (NONE/AVRG) %if (%2==0) && (%1==0) PROLOG_NO_AVRG %else PROLOG_AVRG %endif .Loop: ; mm0..mm3 serves as a 4x4 delay line %ifndef USE_TABLES LOAD 0, 8 ; special case for 1rst/last pixel ACCUM2 1, FIR_R1, mm0, mm3 ACCUM2 2, FIR_R2, mm0, mm3 ACCUM2 3, FIR_R3, mm0, mm3 ACCUM2 4, FIR_R4, mm0, mm3 ACCUM2 5, FIR_R13, mm0, mm3 ACCUM2 6, FIR_R14, mm0, mm3 ACCUM2 7, FIR_R15, mm0, mm3 %else %if 0 ; test with no unrolling TLOAD 0, 8 ; special case for 1rst/last pixel TACCUM2 1, xvid_FIR_23_19_6_3, xvid_FIR_1_0_0_0 , mm0, mm3 TACCUM2 2, xvid_FIR_7_20_20_6, xvid_FIR_3_1_0_0 , mm0, mm3 TACCUM2 3, xvid_FIR_3_6_20_20, xvid_FIR_6_3_1_0 , mm0, mm3 TACCUM2 4, xvid_FIR_1_3_6_20 , xvid_FIR_20_6_3_1 , mm0, mm3 TACCUM2 5, xvid_FIR_0_1_3_6 , xvid_FIR_20_20_6_3, mm0, mm3 TACCUM2 6, xvid_FIR_0_0_1_3 , xvid_FIR_6_20_20_7, mm0, mm3 TACCUM2 7, xvid_FIR_0_0_0_1 , xvid_FIR_3_6_19_23, mm0, mm3 %else ; test with unrolling (little faster, but not much) movzx _EAX, byte [SRC_PTR] movzx TMP1, byte [SRC_PTR+8] XVID_MOVQ mm0, xvid_FIR_14_3_2_1, _EAX*8 movzx _EAX, byte [SRC_PTR+1] XVID_MOVQ mm3, xvid_FIR_1_2_3_14, TMP1*8 paddw mm0, mm7 paddw mm3, mm7 movzx TMP1, byte [SRC_PTR+2] XVID_PADDW mm0, xvid_FIR_23_19_6_3, _EAX*8 XVID_PADDW mm3, xvid_FIR_1_0_0_0, _EAX*8 movzx _EAX, byte [SRC_PTR+3] XVID_PADDW mm0, xvid_FIR_7_20_20_6, TMP1*8 XVID_PADDW mm3, xvid_FIR_3_1_0_0, TMP1*8 movzx TMP1, byte [SRC_PTR+4] XVID_PADDW mm0, xvid_FIR_3_6_20_20, _EAX*8 XVID_PADDW mm3, xvid_FIR_6_3_1_0, _EAX*8 movzx _EAX, byte [SRC_PTR+5] XVID_PADDW mm0, xvid_FIR_1_3_6_20, TMP1*8 XVID_PADDW mm3, xvid_FIR_20_6_3_1, TMP1*8 movzx TMP1, byte [SRC_PTR+6] XVID_PADDW mm0, xvid_FIR_0_1_3_6, _EAX*8 XVID_PADDW mm3, xvid_FIR_20_20_6_3, _EAX*8 movzx _EAX, byte [SRC_PTR+7] XVID_PADDW mm0, xvid_FIR_0_0_1_3, TMP1*8 XVID_PADDW mm3, xvid_FIR_6_20_20_7, TMP1*8 XVID_PADDW mm0, xvid_FIR_0_0_0_1, _EAX*8 XVID_PADDW 
mm3, xvid_FIR_3_6_19_23, _EAX*8 %endif %endif ; !USE_TABLES psraw mm0, 5 psraw mm3, 5 packuswb mm0, mm3 %if (%1==1) MIX mm0, SRC_PTR, _EBX %elif (%1==2) MIX mm0, SRC_PTR+1, _EBX %endif %if (%2==1) MIX mm0, DST_PTR, Rounder1_MMX %endif movq [DST_PTR], mm0 add DST_PTR, _EBP add SRC_PTR, _EBP dec TMP0 jg .Loop %if (%2==0) && (%1==0) EPILOG_NO_AVRG %else EPILOG_AVRG %endif %endmacro ;////////////////////////////////////////////////////////////////////// ;// 16x? copy Functions xvid_H_Pass_16_mmx: H_PASS_16 0, 0 ENDFUNC xvid_H_Pass_Avrg_16_mmx: H_PASS_16 1, 0 ENDFUNC xvid_H_Pass_Avrg_Up_16_mmx: H_PASS_16 2, 0 ENDFUNC ;////////////////////////////////////////////////////////////////////// ;// 8x? copy Functions xvid_H_Pass_8_mmx: H_PASS_8 0, 0 ENDFUNC xvid_H_Pass_Avrg_8_mmx: H_PASS_8 1, 0 ENDFUNC xvid_H_Pass_Avrg_Up_8_mmx: H_PASS_8 2, 0 ENDFUNC ;////////////////////////////////////////////////////////////////////// ;// 16x? avrg Functions xvid_H_Pass_Add_16_mmx: H_PASS_16 0, 1 ENDFUNC xvid_H_Pass_Avrg_Add_16_mmx: H_PASS_16 1, 1 ENDFUNC xvid_H_Pass_Avrg_Up_Add_16_mmx: H_PASS_16 2, 1 ENDFUNC ;////////////////////////////////////////////////////////////////////// ;// 8x? avrg Functions xvid_H_Pass_8_Add_mmx: H_PASS_8 0, 1 ENDFUNC xvid_H_Pass_Avrg_8_Add_mmx: H_PASS_8 1, 1 ENDFUNC xvid_H_Pass_Avrg_Up_8_Add_mmx: H_PASS_8 2, 1 ENDFUNC ;////////////////////////////////////////////////////////////////////// ;// ;// All vertical passes ;// ;////////////////////////////////////////////////////////////////////// %macro V_LOAD 1 ; %1=Last? 
movd mm4, dword [TMP1] pxor mm6, mm6 %if (%1==0) add TMP1, _EBP %endif punpcklbw mm4, mm6 %endmacro %macro V_ACC1 2 ; %1:reg; 2:tap pmullw mm4, [%2] paddw %1, mm4 %endmacro %macro V_ACC2 4 ; %1-%2: regs, %3-%4: taps movq mm5, mm4 movq mm6, mm4 pmullw mm5, [%3] pmullw mm6, [%4] paddw %1, mm5 paddw %2, mm6 %endmacro %macro V_ACC2l 4 ; %1-%2: regs, %3-%4: taps movq mm5, mm4 pmullw mm5, [%3] pmullw mm4, [%4] paddw %1, mm5 paddw %2, mm4 %endmacro %macro V_ACC4 8 ; %1-%4: regs, %5-%8: taps V_ACC2 %1,%2, %5,%6 V_ACC2l %3,%4, %7,%8 %endmacro %macro V_MIX 3 ; %1:dst-reg, %2:src, %3: rounder pxor mm6, mm6 movq mm4, [%2] punpcklbw %1, mm6 punpcklbw mm4, mm6 paddusw %1, mm4 paddusw %1, [%3] psrlw %1, 1 packuswb %1, %1 %endmacro %macro V_STORE 4 ; %1-%2: mix ops, %3: reg, %4:last? psraw %3, 5 packuswb %3, %3 %if (%1==1) V_MIX %3, SRC_PTR, _EBX add SRC_PTR, _EBP %elif (%1==2) add SRC_PTR, _EBP V_MIX %3, SRC_PTR, _EBX %endif %if (%2==1) V_MIX %3, DST_PTR, Rounder1_MMX %endif movd eax, %3 mov dword [DST_PTR], eax %if (%4==0) add DST_PTR, _EBP %endif %endmacro ;////////////////////////////////////////////////////////////////////// %macro V_PASS_16 2 ; %1:src-op (0=NONE,1=AVRG,2=AVRG-UP), %2:dst-op (NONE/AVRG) %if (%2==0) && (%1==0) PROLOG_NO_AVRG %else PROLOG_AVRG %endif ; we process one stripe of 4x16 pixel each time. 
; the size (3rd argument) is meant to be a multiple of 4 ; mm0..mm3 serves as a 4x4 delay line .Loop: push DST_PTR push SRC_PTR ; SRC_PTR is preserved for src-mixing mov TMP1, SRC_PTR ; ouput rows [0..3], from input rows [0..8] movq mm0, mm7 movq mm1, mm7 movq mm2, mm7 movq mm3, mm7 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C14, FIR_Cm3, FIR_C2, FIR_Cm1 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C23, FIR_C19, FIR_Cm6, FIR_C3 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm7, FIR_C20, FIR_C20, FIR_Cm6 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C3, FIR_Cm6, FIR_C20, FIR_C20 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm1, FIR_C3, FIR_Cm6, FIR_C20 V_STORE %1, %2, mm0, 0 V_LOAD 0 V_ACC2 mm1, mm2, FIR_Cm1, FIR_C3 V_ACC1 mm3, FIR_Cm6 V_STORE %1, %2, mm1, 0 V_LOAD 0 V_ACC2l mm2, mm3, FIR_Cm1, FIR_C3 V_STORE %1, %2, mm2, 0 V_LOAD 1 V_ACC1 mm3, FIR_Cm1 V_STORE %1, %2, mm3, 0 ; ouput rows [4..7], from input rows [1..11] (!!) mov SRC_PTR, [_ESP] lea TMP1, [SRC_PTR+_EBP] lea SRC_PTR, [SRC_PTR+4*_EBP] ; for src-mixing push SRC_PTR ; this will be the new value for next round movq mm0, mm7 movq mm1, mm7 movq mm2, mm7 movq mm3, mm7 V_LOAD 0 V_ACC1 mm0, FIR_Cm1 V_LOAD 0 V_ACC2l mm0, mm1, FIR_C3, FIR_Cm1 V_LOAD 0 V_ACC2 mm0, mm1, FIR_Cm6, FIR_C3 V_ACC1 mm2, FIR_Cm1 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C20, FIR_Cm6, FIR_C3, FIR_Cm1 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C20, FIR_C20, FIR_Cm6, FIR_C3 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm6, FIR_C20, FIR_C20, FIR_Cm6 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C3, FIR_Cm6, FIR_C20, FIR_C20 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm1, FIR_C3, FIR_Cm6, FIR_C20 V_STORE %1, %2, mm0, 0 V_LOAD 0 V_ACC2 mm1, mm2, FIR_Cm1, FIR_C3 V_ACC1 mm3, FIR_Cm6 V_STORE %1, %2, mm1, 0 V_LOAD 0 V_ACC2l mm2, mm3, FIR_Cm1, FIR_C3 V_STORE %1, %2, mm2, 0 V_LOAD 1 V_ACC1 mm3, FIR_Cm1 V_STORE %1, %2, mm3, 0 ; ouput rows [8..11], from input rows [5..15] pop SRC_PTR lea TMP1, [SRC_PTR+_EBP] lea SRC_PTR, [SRC_PTR+4*_EBP] ; for src-mixing push SRC_PTR ; this will be the new value 
for next round movq mm0, mm7 movq mm1, mm7 movq mm2, mm7 movq mm3, mm7 V_LOAD 0 V_ACC1 mm0, FIR_Cm1 V_LOAD 0 V_ACC2l mm0, mm1, FIR_C3, FIR_Cm1 V_LOAD 0 V_ACC2 mm0, mm1, FIR_Cm6, FIR_C3 V_ACC1 mm2, FIR_Cm1 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C20, FIR_Cm6, FIR_C3, FIR_Cm1 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C20, FIR_C20, FIR_Cm6, FIR_C3 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm6, FIR_C20, FIR_C20, FIR_Cm6 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C3, FIR_Cm6, FIR_C20, FIR_C20 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm1, FIR_C3, FIR_Cm6, FIR_C20 V_STORE %1, %2, mm0, 0 V_LOAD 0 V_ACC2 mm1, mm2, FIR_Cm1, FIR_C3 V_ACC1 mm3, FIR_Cm6 V_STORE %1, %2, mm1, 0 V_LOAD 0 V_ACC2l mm2, mm3, FIR_Cm1, FIR_C3 V_STORE %1, %2, mm2, 0 V_LOAD 1 V_ACC1 mm3, FIR_Cm1 V_STORE %1, %2, mm3, 0 ; ouput rows [12..15], from input rows [9.16] pop SRC_PTR lea TMP1, [SRC_PTR+_EBP] %if (%1!=0) lea SRC_PTR, [SRC_PTR+4*_EBP] ; for src-mixing %endif movq mm0, mm7 movq mm1, mm7 movq mm2, mm7 movq mm3, mm7 V_LOAD 0 V_ACC1 mm3, FIR_Cm1 V_LOAD 0 V_ACC2l mm2, mm3, FIR_Cm1, FIR_C3 V_LOAD 0 V_ACC2 mm1, mm2, FIR_Cm1, FIR_C3 V_ACC1 mm3, FIR_Cm6 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm1, FIR_C3, FIR_Cm6, FIR_C20 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C3, FIR_Cm6, FIR_C20, FIR_C20 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm7, FIR_C20, FIR_C20, FIR_Cm6 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C23, FIR_C19, FIR_Cm6, FIR_C3 V_LOAD 1 V_ACC4 mm0, mm1, mm2, mm3, FIR_C14, FIR_Cm3, FIR_C2, FIR_Cm1 V_STORE %1, %2, mm3, 0 V_STORE %1, %2, mm2, 0 V_STORE %1, %2, mm1, 0 V_STORE %1, %2, mm0, 1 ; ... 
next 4 columns pop SRC_PTR pop DST_PTR add SRC_PTR, 4 add DST_PTR, 4 sub TMP0, 4 jg .Loop %if (%2==0) && (%1==0) EPILOG_NO_AVRG %else EPILOG_AVRG %endif %endmacro ;////////////////////////////////////////////////////////////////////// %macro V_PASS_8 2 ; %1:src-op (0=NONE,1=AVRG,2=AVRG-UP), %2:dst-op (NONE/AVRG) %if (%2==0) && (%1==0) PROLOG_NO_AVRG %else PROLOG_AVRG %endif ; we process one stripe of 4x8 pixel each time ; the size (3rd argument) is meant to be a multiple of 4 ; mm0..mm3 serves as a 4x4 delay line .Loop: push DST_PTR push SRC_PTR ; SRC_PTR is preserved for src-mixing mov TMP1, SRC_PTR ; ouput rows [0..3], from input rows [0..8] movq mm0, mm7 movq mm1, mm7 movq mm2, mm7 movq mm3, mm7 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C14, FIR_Cm3, FIR_C2, FIR_Cm1 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C23, FIR_C19, FIR_Cm6, FIR_C3 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm7, FIR_C20, FIR_C20, FIR_Cm6 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C3, FIR_Cm6, FIR_C20, FIR_C20 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm1, FIR_C3, FIR_Cm6, FIR_C20 V_STORE %1, %2, mm0, 0 V_LOAD 0 V_ACC2 mm1, mm2, FIR_Cm1, FIR_C3 V_ACC1 mm3, FIR_Cm6 V_STORE %1, %2, mm1, 0 V_LOAD 0 V_ACC2l mm2, mm3, FIR_Cm1, FIR_C3 V_STORE %1, %2, mm2, 0 V_LOAD 1 V_ACC1 mm3, FIR_Cm1 V_STORE %1, %2, mm3, 0 ; ouput rows [4..7], from input rows [1..9] mov SRC_PTR, [_ESP] lea TMP1, [SRC_PTR+_EBP] %if (%1!=0) lea SRC_PTR, [SRC_PTR+4*_EBP] ; for src-mixing %endif movq mm0, mm7 movq mm1, mm7 movq mm2, mm7 movq mm3, mm7 V_LOAD 0 V_ACC1 mm3, FIR_Cm1 V_LOAD 0 V_ACC2l mm2, mm3, FIR_Cm1, FIR_C3 V_LOAD 0 V_ACC2 mm1, mm2, FIR_Cm1, FIR_C3 V_ACC1 mm3, FIR_Cm6 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm1, FIR_C3, FIR_Cm6, FIR_C20 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C3, FIR_Cm6, FIR_C20, FIR_C20 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_Cm7, FIR_C20, FIR_C20, FIR_Cm6 V_LOAD 0 V_ACC4 mm0, mm1, mm2, mm3, FIR_C23, FIR_C19, FIR_Cm6, FIR_C3 V_LOAD 1 V_ACC4 mm0, mm1, mm2, mm3, FIR_C14, FIR_Cm3, FIR_C2, FIR_Cm1 V_STORE %1, %2, 
mm3, 0 V_STORE %1, %2, mm2, 0 V_STORE %1, %2, mm1, 0 V_STORE %1, %2, mm0, 1 ; ... next 4 columns pop SRC_PTR pop DST_PTR add SRC_PTR, 4 add DST_PTR, 4 sub TMP0, 4 jg .Loop %if (%2==0) && (%1==0) EPILOG_NO_AVRG %else EPILOG_AVRG %endif %endmacro ;////////////////////////////////////////////////////////////////////// ;// 16x? copy Functions xvid_V_Pass_16_mmx: V_PASS_16 0, 0 ENDFUNC xvid_V_Pass_Avrg_16_mmx: V_PASS_16 1, 0 ENDFUNC xvid_V_Pass_Avrg_Up_16_mmx: V_PASS_16 2, 0 ENDFUNC ;////////////////////////////////////////////////////////////////////// ;// 8x? copy Functions xvid_V_Pass_8_mmx: V_PASS_8 0, 0 ENDFUNC xvid_V_Pass_Avrg_8_mmx: V_PASS_8 1, 0 ENDFUNC xvid_V_Pass_Avrg_Up_8_mmx: V_PASS_8 2, 0 ENDFUNC ;////////////////////////////////////////////////////////////////////// ;// 16x? avrg Functions xvid_V_Pass_Add_16_mmx: V_PASS_16 0, 1 ENDFUNC xvid_V_Pass_Avrg_Add_16_mmx: V_PASS_16 1, 1 ENDFUNC xvid_V_Pass_Avrg_Up_Add_16_mmx: V_PASS_16 2, 1 ENDFUNC ;////////////////////////////////////////////////////////////////////// ;// 8x? avrg Functions xvid_V_Pass_8_Add_mmx: V_PASS_8 0, 1 ENDFUNC xvid_V_Pass_Avrg_8_Add_mmx: V_PASS_8 1, 1 ENDFUNC xvid_V_Pass_Avrg_Up_8_Add_mmx: V_PASS_8 2, 1 ENDFUNC ;////////////////////////////////////////////////////////////////////// %undef SRC_PTR %undef DST_PTR NON_EXEC_STACK xvidcore/src/image/x86_asm/postprocessing_sse2.asm0000664000076500007650000000662711345416076023363 0ustar xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - sse2 post processing - ; * ; * Copyright(C) 2004 Peter Ross ; * 2004 Dcoder ; * ; * Xvid is free software; you can redistribute it and/or modify it ; * under the terms of the GNU General Public License as published by ; * the Free Software Foundation; either version 2 of the License, or ; * (at your option) any later version. 
; *
; * This program is distributed in the hope that it will be useful,
; * but WITHOUT ANY WARRANTY; without even the implied warranty of
; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
; * GNU General Public License for more details.
; *
; * You should have received a copy of the GNU General Public License
; * along with this program; if not, write to the Free Software
; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
; *
; *************************************************************************/

%include "nasm.inc"

;===========================================================================
; read only data
;===========================================================================

DATA

xmm_0x80:
  times 16 db 0x80

;=============================================================================
; Code
;=============================================================================

TEXT

cglobal image_brightness_sse2

;//////////////////////////////////////////////////////////////////////
;// image_brightness_sse2
;//////////////////////////////////////////////////////////////////////

; %1: pointer to a 16-byte buffer, %2: byte register holding the offset
%macro CREATE_OFFSET_VECTOR 2
  mov [%1 + 0], %2
  mov [%1 + 1], %2
  mov [%1 + 2], %2
  mov [%1 + 3], %2
  mov [%1 + 4], %2
  mov [%1 + 5], %2
  mov [%1 + 6], %2
  mov [%1 + 7], %2
  mov [%1 + 8], %2
  mov [%1 + 9], %2
  mov [%1 + 10], %2
  mov [%1 + 11], %2
  mov [%1 + 12], %2
  mov [%1 + 13], %2
  mov [%1 + 14], %2
  mov [%1 + 15], %2
%endmacro

ALIGN SECTION_ALIGN
image_brightness_sse2:
%ifdef ARCH_IS_X86_64
  XVID_MOVSXD _EAX, prm5d
%else
  mov eax, prm5   ; brightness offset value
%endif
  mov TMP1, prm1  ; Dst
  mov TMP0, prm2  ; stride

  push _ESI
  push _EDI       ; on 32-bit the two pushes shift the stack parameters by 8 bytes
  sub _ESP, 32    ; 32 bytes of local data: 16 are used, 16 more allow aligning to 16 bytes

  movdqa xmm2, [xmm_0x80]

  ; Build a 16-byte vector with the offset replicated in every byte
  mov _ESI, _ESP  ; _ESI will become esp rounded up to a multiple of 16
  add _ESI, 15    ; _ESI = esp + 15
  and _ESI, ~15   ; _ESI = (esp + 15)&(~15)
  CREATE_OFFSET_VECTOR _ESI, al
  movdqa xmm3, [_ESI]

%ifdef ARCH_IS_X86_64
  mov _ESI, prm3
  mov _EDI, prm4
%else
  mov _ESI, [_ESP+8+32+12] ; width
  mov _EDI, [_ESP+8+32+16] ; height
%endif

.yloop:
  xor _EAX, _EAX

.xloop:
  movdqa xmm0, [TMP1 + _EAX]
  movdqa xmm1, [TMP1 + _EAX + 16] ; xmm0/xmm1 = [dst]

  paddb xmm0, xmm2                ; unsigned -> signed domain
  paddb xmm1, xmm2
  paddsb xmm0, xmm3
  paddsb xmm1, xmm3               ; xmm0/xmm1 += offset (signed saturation)
  psubb xmm0, xmm2
  psubb xmm1, xmm2                ; signed -> unsigned domain

  movdqa [TMP1 + _EAX], xmm0
  movdqa [TMP1 + _EAX + 16], xmm1 ; [dst] = xmm0/xmm1

  add _EAX,32
  cmp _EAX,_ESI
  jl .xloop

  add TMP1, TMP0                  ; dst += stride
  dec _EDI
  jg .yloop

  add _ESP, 32
  pop _EDI
  pop _ESI

  ret
ENDFUNC
;//////////////////////////////////////////////////////////////////////

NON_EXEC_STACK
xvidcore/src/image/x86_asm/interpolate8x8_3dne.asm0000664000076500007650000003211511254216113023130 0ustar xvidbuildxvidbuild
;/*****************************************************************************
; *
; * XVID MPEG-4 VIDEO CODEC
; * - 3dne pipeline optimized 8x8 block-based halfpel interpolation -
; *
; * Copyright(C) 2002 Jaan Kalda
; *
; * This program is free software ; you can redistribute it and/or modify
; * it under the terms of the GNU General Public License as published by
; * the Free Software Foundation ; either version 2 of the License, or
; * (at your option) any later version.
; *
; * This program is distributed in the hope that it will be useful,
; * but WITHOUT ANY WARRANTY ; without even the implied warranty of
; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
; * GNU General Public License for more details.
; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; ****************************************************************************/ ; these 3dne functions are compatible with iSSE, but are optimized specifically ; for K7 pipelines %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN mmx_one: times 8 db 1 ALIGN SECTION_ALIGN mm_minusone: dd -1,-1 ;============================================================================= ; Macros ;============================================================================= %macro nop4 0 DB 08Dh,074h,026h,0 %endmacro ;============================================================================= ; Macros ;============================================================================= TEXT cglobal interpolate8x8_halfpel_h_3dne cglobal interpolate8x8_halfpel_v_3dne cglobal interpolate8x8_halfpel_hv_3dne cglobal interpolate8x4_halfpel_h_3dne cglobal interpolate8x4_halfpel_v_3dne cglobal interpolate8x4_halfpel_hv_3dne ;----------------------------------------------------------------------------- ; ; void interpolate8x8_halfpel_h_3dne(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- %macro COPY_H_SSE_RND0 1 %if (%1) movq mm0, [_EAX] %else movq mm0, [_EAX+0] ; --- ; nasm >0.99.x rejects the original statement: ; movq mm0, [dword _EAX] ; as it is ambiguous. 
for this statement nasm <0.99.x would ; generate "movq mm0,[_EAX+0]" ; --- %endif pavgb mm0, [_EAX+1] movq mm1, [_EAX+TMP1] pavgb mm1, [_EAX+TMP1+1] lea _EAX, [_EAX+2*TMP1] movq [TMP0], mm0 movq [TMP0+TMP1], mm1 %endmacro %macro COPY_H_SSE_RND1 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] movq mm4, mm0 movq mm5, mm1 movq mm2, [_EAX+1] movq mm3, [_EAX+TMP1+1] pavgb mm0, mm2 pxor mm2, mm4 pavgb mm1, mm3 lea _EAX, [_EAX+2*TMP1] pxor mm3, mm5 pand mm2, mm7 pand mm3, mm7 psubb mm0, mm2 movq [TMP0], mm0 psubb mm1, mm3 movq [TMP0+TMP1], mm1 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_h_3dne: mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride dec PTR_TYPE prm4; rounding jz near .rounding1 mov TMP0, prm1 ; Dst COPY_H_SSE_RND0 0 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND0 1 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND0 1 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND0 1 ret .rounding1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 mov TMP0, prm1 ; Dst movq mm7, [mmx_one] COPY_H_SSE_RND1 lea TMP0, [TMP0+2*TMP1] COPY_H_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND1 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x8_halfpel_v_3dne(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN interpolate8x8_halfpel_v_3dne: mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride dec PTR_TYPE prm4; rounding ; we process 2 line at a time jz near .rounding1 pxor mm2,mm2 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] por mm2, [_EAX+2*TMP1] mov TMP0, prm1 ; Dst lea _EAX, [_EAX+2*TMP1] pxor mm4, mm4 pavgb mm0, mm1 pavgb mm1, mm2 movq [byte TMP0], mm0 movq [TMP0+TMP1], mm1 pxor mm6, mm6 add _EAX, TMP1 lea TMP0, [TMP0+2*TMP1] movq mm3, [byte _EAX] por mm4, [_EAX+TMP1] lea _EAX, [_EAX+2*TMP1] pavgb mm2, mm3 pavgb mm3, mm4 movq [TMP0], mm2 movq [TMP0+TMP1], mm3 lea TMP0, [byte TMP0+2*TMP1] 
movq mm5, [byte _EAX] por mm6, [_EAX+TMP1] lea _EAX, [_EAX+2*TMP1] pavgb mm4, mm5 pavgb mm5, mm6 movq [TMP0], mm4 movq [TMP0+TMP1], mm5 lea TMP0, [TMP0+2*TMP1] movq mm7, [_EAX] movq mm0, [_EAX+TMP1] pavgb mm6, mm7 pavgb mm7, mm0 movq [TMP0], mm6 movq [TMP0+TMP1], mm7 ret ALIGN SECTION_ALIGN .rounding1: pcmpeqb mm0, mm0 psubusb mm0, [_EAX] add _EAX, TMP1 mov TMP0, prm1 ; Dst push _ESI pcmpeqb mm1, mm1 pcmpeqb mm2, mm2 lea _ESI, [mm_minusone] psubusb mm1, [byte _EAX] psubusb mm2, [_EAX+TMP1] lea _EAX, [_EAX+2*TMP1] movq mm6, [_ESI] movq mm7, [_ESI] pavgb mm0, mm1 pavgb mm1, mm2 psubusb mm6, mm0 psubusb mm7, mm1 movq [TMP0], mm6 movq [TMP0+TMP1], mm7 lea TMP0, [TMP0+2*TMP1] pcmpeqb mm3, mm3 pcmpeqb mm4, mm4 psubusb mm3, [_EAX] psubusb mm4, [_EAX+TMP1] lea _EAX, [_EAX+2*TMP1] pavgb mm2, mm3 pavgb mm3, mm4 movq mm0, [_ESI] movq mm1, [_ESI] psubusb mm0, mm2 psubusb mm1, mm3 movq [TMP0], mm0 movq [TMP0+TMP1], mm1 lea TMP0,[TMP0+2*TMP1] pcmpeqb mm5, mm5 pcmpeqb mm6, mm6 psubusb mm5, [_EAX] psubusb mm6, [_EAX+TMP1] lea _EAX, [_EAX+2*TMP1] pavgb mm4, mm5 pavgb mm5, mm6 movq mm2, [_ESI] movq mm3, [_ESI] psubusb mm2, mm4 psubusb mm3, mm5 movq [TMP0], mm2 movq [TMP0+TMP1], mm3 lea TMP0, [TMP0+2*TMP1] pcmpeqb mm7, mm7 pcmpeqb mm0, mm0 psubusb mm7, [_EAX] psubusb mm0, [_EAX+TMP1] pavgb mm6, mm7 pavgb mm7, mm0 movq mm4, [_ESI] movq mm5, [_ESI] psubusb mm4, mm6 pop _ESI psubusb mm5, mm7 movq [TMP0], mm4 movq [TMP0+TMP1], mm5 ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x8_halfpel_hv_3dne(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;----------------------------------------------------------------------------- ; The trick is to correct the result of 'pavgb' with some combination of the ; lsb's of the 4 input values i,j,k,l, and their intermediate 'pavgb' (s and t). 
; The boolean relations are:
;   (i+j+k+l+3)/4 = (s+t+1)/2 - (ij&kl)&st
;   (i+j+k+l+2)/4 = (s+t+1)/2 - (ij|kl)&st
;   (i+j+k+l+1)/4 = (s+t+1)/2 - (ij&kl)|st
;   (i+j+k+l+0)/4 = (s+t+1)/2 - (ij|kl)|st
; with s=(i+j+1)/2, t=(k+l+1)/2, ij = i^j, kl = k^l, st = s^t.
; Only bit 0 of each correction term is used (it is masked with mmx_one in mm7).
; Moreover, we process 2 lines at a time, for better overlapping (~15% faster).

%macro COPY_HV_SSE_RND0 0
  movq mm0, [_EAX+TMP1]
  movq mm1, [_EAX+TMP1+1]

  movq mm6, mm0
  pavgb mm0, mm1        ; mm0 = t = (k+l+1)/2. preserved for next step
  lea _EAX, [_EAX+2*TMP1]
  pxor mm1, mm6         ; mm1 = kl = (k^l). preserved for next step

  por mm3, mm1          ; ij |= kl
  movq mm6, mm2
  pxor mm6, mm0         ; mm6 = s^t
  pand mm3, mm6         ; (ij|kl) &= st
  pavgb mm2, mm0        ; mm2 = (s+t+1)/2
  movq mm6, [_EAX]
  pand mm3, mm7         ; mask lsb
  psubb mm2, mm3        ; apply.

  movq [TMP0], mm2

  movq mm2, [_EAX]
  movq mm3, [_EAX+1]
  pavgb mm2, mm3        ; preserved for next iteration
  pxor mm3, mm6         ; preserved for next iteration

  por mm1, mm3
  movq mm6, mm0
  pxor mm6, mm2
  pand mm1, mm6
  pavgb mm0, mm2

  pand mm1, mm7
  psubb mm0, mm1

  movq [TMP0+TMP1], mm0
%endmacro

%macro COPY_HV_SSE_RND1 0
  movq mm0, [_EAX+TMP1]
  movq mm1, [_EAX+TMP1+1]

  movq mm6, mm0
  pavgb mm0, mm1        ; mm0 = t = (k+l+1)/2. preserved for next step
  lea _EAX,[_EAX+2*TMP1]
  pxor mm1, mm6         ; mm1 = kl = (k^l). preserved for next step

  pand mm3, mm1         ; ij &= kl
  movq mm6, mm2
  pxor mm6, mm0         ; mm6 = s^t
  por mm3, mm6          ; (ij&kl) |= st
  pavgb mm2, mm0        ; mm2 = (s+t+1)/2
  movq mm6, [_EAX]
  pand mm3, mm7         ; mask lsb
  psubb mm2, mm3        ; apply.

  movq [TMP0], mm2

  movq mm2, [_EAX]
  movq mm3, [_EAX+1]
  pavgb mm2, mm3        ; preserved for next iteration
  pxor mm3, mm6         ; preserved for next iteration

  pand mm1, mm3
  movq mm6, mm0
  pxor mm6, mm2
  por mm1, mm6
  pavgb mm0, mm2
  pand mm1, mm7
  psubb mm0, mm1
  movq [TMP0+TMP1], mm0
%endmacro

ALIGN SECTION_ALIGN
interpolate8x8_halfpel_hv_3dne:
  mov _EAX, prm2        ; Src
  mov TMP1, prm3        ; stride
  dec PTR_TYPE prm4     ; rounding

  ; loop invariants: mm2=(i+j+1)/2 and mm3= i^j
  movq mm2, [_EAX]
  movq mm3, [_EAX+1]
  movq mm6, mm2
  pavgb mm2, mm3
  pxor mm3, mm6         ; mm2/mm3 ready
  mov TMP0, prm1        ; Dst
  movq mm7, [mmx_one]

  jz near .rounding1
  lea _EBP,[byte _EBP]
  COPY_HV_SSE_RND0
  lea TMP0,[TMP0+2*TMP1]
  COPY_HV_SSE_RND0
  lea TMP0,[TMP0+2*TMP1]
  COPY_HV_SSE_RND0
  lea TMP0,[TMP0+2*TMP1]
  COPY_HV_SSE_RND0
  ret

ALIGN SECTION_ALIGN
.rounding1:
  COPY_HV_SSE_RND1
  lea TMP0,[TMP0+2*TMP1]
  COPY_HV_SSE_RND1
  lea TMP0,[TMP0+2*TMP1]
  COPY_HV_SSE_RND1
  lea TMP0,[TMP0+2*TMP1]
  COPY_HV_SSE_RND1
  ret
ENDFUNC

;-----------------------------------------------------------------------------
;
; void interpolate8x4_halfpel_h_3dne(uint8_t * const dst,
;                                    const uint8_t * const src,
;                                    const uint32_t stride,
;                                    const uint32_t rounding);
;
;-----------------------------------------------------------------------------

ALIGN SECTION_ALIGN
interpolate8x4_halfpel_h_3dne:

  mov _EAX, prm2        ; Src
  mov TMP1, prm3        ; stride
  dec PTR_TYPE prm4     ; rounding

  jz .rounding1
  mov TMP0, prm1        ; Dst

  COPY_H_SSE_RND0 0
  lea TMP0,[TMP0+2*TMP1]
  COPY_H_SSE_RND0 1
  ret

.rounding1:
  ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1
  mov TMP0, prm1        ; Dst
  movq mm7, [mmx_one]
  COPY_H_SSE_RND1
  lea TMP0, [TMP0+2*TMP1]
  COPY_H_SSE_RND1
  ret
ENDFUNC

;-----------------------------------------------------------------------------
;
; void interpolate8x4_halfpel_v_3dne(uint8_t * const dst,
;                                    const uint8_t * const src,
;                                    const uint32_t stride,
;                                    const uint32_t rounding);
;
;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN interpolate8x4_halfpel_v_3dne: mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride dec PTR_TYPE prm4; rounding ; we process 2 line at a time jz .rounding1 pxor mm2,mm2 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] por mm2, [_EAX+2*TMP1] ; Something like preload (pipelining) mov TMP0, prm1 ; Dst lea _EAX, [_EAX+2*TMP1] pxor mm4, mm4 pavgb mm0, mm1 pavgb mm1, mm2 movq [byte TMP0], mm0 movq [TMP0+TMP1], mm1 pxor mm6, mm6 add _EAX, TMP1 lea TMP0, [TMP0+2*TMP1] movq mm3, [byte _EAX] por mm4, [_EAX+TMP1] lea _EAX, [_EAX+2*TMP1] pavgb mm2, mm3 pavgb mm3, mm4 movq [TMP0], mm2 movq [TMP0+TMP1], mm3 ret ALIGN SECTION_ALIGN .rounding1: pcmpeqb mm0, mm0 psubusb mm0, [_EAX] ; _EAX==line0 add _EAX, TMP1 ; _EAX==line1 mov TMP0, prm1 ; Dst push _ESI pcmpeqb mm1, mm1 pcmpeqb mm2, mm2 lea _ESI, [mm_minusone] psubusb mm1, [byte _EAX] ; line1 psubusb mm2, [_EAX+TMP1] ; line2 lea _EAX, [_EAX+2*TMP1] ; _EAX==line3 movq mm6, [_ESI] movq mm7, [_ESI] pavgb mm0, mm1 pavgb mm1, mm2 psubusb mm6, mm0 psubusb mm7, mm1 movq [TMP0], mm6 ; store line0 movq [TMP0+TMP1], mm7 ; store line1 lea TMP0, [TMP0+2*TMP1] pcmpeqb mm3, mm3 pcmpeqb mm4, mm4 psubusb mm3, [_EAX] ; line3 psubusb mm4, [_EAX+TMP1] ; line4 lea _EAX, [_EAX+2*TMP1] ; _EAX==line 5 pavgb mm2, mm3 pavgb mm3, mm4 movq mm0, [_ESI] movq mm1, [_ESI] psubusb mm0, mm2 psubusb mm1, mm3 movq [TMP0], mm0 movq [TMP0+TMP1], mm1 pop _ESI ret ENDFUNC ;----------------------------------------------------------------------------- ; ; void interpolate8x4_halfpel_hv_3dne(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;----------------------------------------------------------------------------- ALIGN SECTION_ALIGN interpolate8x4_halfpel_hv_3dne: mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride dec PTR_TYPE prm4 ; rounding ; loop invariants: mm2=(i+j+1)/2 and mm3= i^j movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 
pavgb mm2, mm3 pxor mm3, mm6 ; mm2/mm3 ready mov TMP0, prm1 ; Dst movq mm7, [mmx_one] jz near .rounding1 lea _EBP,[byte _EBP] COPY_HV_SSE_RND0 lea TMP0,[TMP0+2*TMP1] COPY_HV_SSE_RND0 ret ALIGN SECTION_ALIGN .rounding1: COPY_HV_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_HV_SSE_RND1 ret ENDFUNC NON_EXEC_STACK xvidcore/src/image/x86_asm/colorspace_mmx.inc0000664000076500007650000002046311147022343022331 0ustar xvidbuildxvidbuild;/***************************************************************************** ; * ; * XVID MPEG-4 VIDEO CODEC ; * - colorspace conversions - ; * ; * Copyright(C) 2002-2003 Peter Ross ; * 2008 Michael Militzer ; * ; * This program is free software ; you can redistribute it and/or modify ; * it under the terms of the GNU General Public License as published by ; * the Free Software Foundation ; either version 2 of the License, or ; * (at your option) any later version. ; * ; * This program is distributed in the hope that it will be useful, ; * but WITHOUT ANY WARRANTY ; without even the implied warranty of ; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ; * GNU General Public License for more details. 
; *
; * You should have received a copy of the GNU General Public License
; * along with this program ; if not, write to the Free Software
; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
; *
; ****************************************************************************/

;------------------------------------------------------------------------------
;
; MAKE_COLORSPACE(NAME, STACK, BYTES, PIXELS, VPIXELS, FUNC, ARG1, ARG2)
;
; This macro provides an assembler width/height scroll loop
; NAME      function name
; STACK     additional stack bytes required by FUNC
; BYTES     bytes-per-pixel for the given colorspace
; PIXELS    pixels (columns) operated on per FUNC call
; VPIXELS   rows operated on per FUNC call
; FUNC      conversion macro name; we expect to find FUNC_INIT and FUNC macros
; ARG1      first argument passed to FUNC
; ARG2      second argument passed to FUNC
;
; throughout the FUNC the registers mean:
;------------------------------------------------------------------------------
%define y_stride   _EAX
%define u_ptr      _EBX
%define v_ptr      _ECX
%define x_stride   _EDX
%define x_stride_d edx
%define y_ptr      _ESI
%define x_ptr      _EDI
%define width      _EBP

%macro MAKE_COLORSPACE 8
%define NAME    %1
%define STACK   %2
%define BYTES   %3
%define PIXELS  %4
%define VPIXELS %5
%define FUNC    %6
%define ARG1    %7
%define ARG2    %8
  ; --- define function global/symbol
ALIGN SECTION_ALIGN
cglobal NAME
NAME:
  ; --- init stack ---

  push _EBX ; esp + localsize + 16

%ifdef ARCH_IS_X86_64

%define localsize 2*PTR_SIZE + STACK
%ifndef WINDOWS
%define pushsize 2*PTR_SIZE
%define shadow 0
%else
%define pushsize 4*PTR_SIZE
%define shadow 32 + 2*PTR_SIZE
%endif

%define prm_vflip     dword [_ESP + localsize + pushsize + shadow + 4*PTR_SIZE]
%define prm_height    dword [_ESP + localsize + pushsize + shadow + 3*PTR_SIZE]
%define prm_width     dword [_ESP + localsize + pushsize + shadow + 2*PTR_SIZE]
%define prm_uv_stride dword [_ESP + localsize + pushsize + shadow + 1*PTR_SIZE]

%ifdef WINDOWS
%define prm_y_stride  dword [_ESP + localsize + pushsize + shadow + 0*PTR_SIZE]
%define prm_v_ptr           [_ESP + localsize + pushsize + shadow - 1*PTR_SIZE]

  push _ESI ; esp + localsize + 8
  push _EDI ; esp + localsize + 4
%else
%define prm_y_stride prm6d
%define prm_v_ptr    prm5
%endif

%define prm_u_ptr    prm4
%define prm_y_ptr    prm3
%define prm_x_stride prm2d
%define prm_x_ptr    prm1
%define _ip _ESP + localsize + pushsize + 0

%define x_dif TMP0

%else

%define localsize 5*PTR_SIZE + STACK
%define pushsize 4*PTR_SIZE

%define prm_vflip     [_ESP + localsize + pushsize + 10*PTR_SIZE]
%define prm_height    [_ESP + localsize + pushsize + 9*PTR_SIZE]
%define prm_width     [_ESP + localsize + pushsize + 8*PTR_SIZE]
%define prm_uv_stride [_ESP + localsize + pushsize + 7*PTR_SIZE]
%define prm_y_stride  [_ESP + localsize + pushsize + 6*PTR_SIZE]
%define prm_v_ptr     [_ESP + localsize + pushsize + 5*PTR_SIZE]
%define prm_u_ptr     [_ESP + localsize + pushsize + 4*PTR_SIZE]
%define prm_y_ptr     [_ESP + localsize + pushsize + 3*PTR_SIZE]
%define prm_x_stride  [_ESP + localsize + pushsize + 2*PTR_SIZE]
%define prm_x_ptr     [_ESP + localsize + pushsize + 1*PTR_SIZE]
%define _ip _ESP + localsize + pushsize + 0

%define x_dif dword [_ESP + localsize - 5*4]

  push _ESI ; esp + localsize + 8
  push _EDI ; esp + localsize + 4

%endif

  push _EBP ; esp + localsize + 0

%define y_dif       dword [_ESP + localsize - 1*4]
%define uv_dif      dword [_ESP + localsize - 2*4]
%define fixed_width dword [_ESP + localsize - 3*4]
%define tmp_height  dword [_ESP + localsize - 4*4]

  sub _ESP, localsize

  ; --- init variables ---

  mov eax, prm_width
  add eax, 15
  and eax, ~15
  mov fixed_width, eax      ; fixed_width = (width + 15) & ~15

  mov ebx, prm_x_stride
%rep BYTES
  sub _EBX, _EAX
%endrep
  mov x_dif, _EBX           ; x_dif = x_stride - BYTES*fixed_width

  mov ebx, prm_y_stride
  sub ebx, eax
  mov y_dif, ebx            ; y_dif = y_stride - fixed_width

  mov ebx, prm_uv_stride
  mov TMP1, _EAX
  shr TMP1, 1
  sub _EBX, TMP1
  mov uv_dif, ebx           ; uv_dif = uv_stride - fixed_width/2

%ifdef ARCH_IS_X86_64
%ifndef WINDOWS
  mov TMP1d, prm_x_stride
  mov _ESI, prm_y_ptr
  mov _EDX, TMP1
%else
  mov _ESI, prm_y_ptr
  mov _EDI, prm_x_ptr
%endif
%else
  mov _ESI, prm_y_ptr       ; $esi$ = y_ptr
  mov _EDI, prm_x_ptr       ; $edi$ = x_ptr
  mov edx, prm_x_stride     ; $edx$ = x_stride
%endif

  mov ebp, prm_height       ; $ebp$ = height

  mov ebx, prm_vflip
  or _EBX, _EBX
  jz .dont_flip

  ; --- do flipping ---

  xor _EBX,_EBX
%rep BYTES
  sub _EBX, _EAX
%endrep
  sub _EBX, _EDX
  mov x_dif, _EBX           ; x_dif = -BYTES*fixed_width - x_stride

  lea _EAX, [_EBP-1]
%ifdef ARCH_IS_X86_64
  mov TMP1, _EDX
  mul edx
  mov _EDX, TMP1
%else
  push _EDX
  mul edx
  pop _EDX
%endif
  add _EDI, _EAX            ; $edi$ += (height-1) * x_stride

  neg _EDX                  ; x_stride = -x_stride

.dont_flip:

  ; --- begin loop ---

  mov eax, prm_y_stride     ; $eax$ = y_stride
  mov _EBX, prm_u_ptr       ; $ebx$ = u_ptr
  mov _ECX, prm_v_ptr       ; $ecx$ = v_ptr

  FUNC %+ _INIT ARG1, ARG2  ; call FUNC_INIT

.y_loop:
  mov tmp_height, ebp
  mov ebp, fixed_width

.x_loop:
  FUNC ARG1, ARG2           ; call FUNC

  add _EDI, BYTES*PIXELS    ; x_ptr += BYTES*PIXELS
  add _ESI, PIXELS          ; y_ptr += PIXELS
  add _EBX, PIXELS/2        ; u_ptr += PIXELS/2
  add _ECX, PIXELS/2        ; v_ptr += PIXELS/2

  sub _EBP, PIXELS          ; $ebp$ -= PIXELS
  jg .x_loop                ; if ($ebp$ > 0) goto .x_loop

  mov ebp, tmp_height
  add _EDI, x_dif           ; x_ptr += x_dif + (VPIXELS-1)*x_stride
%ifdef ARCH_IS_X86_64
  mov TMP1d, y_dif
  add _ESI, TMP1            ; y_ptr += y_dif + (VPIXELS-1)*y_stride
%else
  add _ESI, y_dif           ; y_ptr += y_dif + (VPIXELS-1)*y_stride
%endif

%rep VPIXELS-1
  add _EDI, _EDX
  add _ESI, _EAX
%endrep

%ifdef ARCH_IS_X86_64
  mov TMP1d, uv_dif
  add _EBX, TMP1            ; u_ptr += uv_dif + ((VPIXELS/2)-1)*uv_stride
  add _ECX, TMP1            ; v_ptr += uv_dif + ((VPIXELS/2)-1)*uv_stride
%else
  add _EBX, uv_dif          ; u_ptr += uv_dif + ((VPIXELS/2)-1)*uv_stride
  add _ECX, uv_dif          ; v_ptr += uv_dif + ((VPIXELS/2)-1)*uv_stride
%endif

%rep (VPIXELS/2)-1
%ifdef ARCH_IS_X86_64
  mov TMP1d, prm_uv_stride
  add _EBX, TMP1
  add _ECX, TMP1
%else
  add _EBX, prm_uv_stride
  add _ECX, prm_uv_stride
%endif
%endrep

  sub _EBP, VPIXELS         ; $ebp$ -= VPIXELS
  jg .y_loop                ; if ($ebp$ > 0) goto .y_loop

  ; cleanup stack & undef everything

  add _ESP, localsize

  pop _EBP
%ifndef ARCH_IS_X86_64
  pop _EDI
  pop _ESI
%else
%ifdef WINDOWS
  pop _EDI
  pop _ESI
%endif
%endif
  pop _EBX

%undef prm_vflip
%undef prm_height
%undef prm_width
%undef prm_uv_stride
%undef prm_y_stride
%undef prm_v_ptr
%undef prm_u_ptr
%undef prm_y_ptr
%undef prm_x_stride
%undef prm_x_ptr
%undef _ip
%undef x_dif
%undef y_dif
%undef uv_dif
%undef fixed_width
%undef tmp_height

  ret
ENDFUNC

%undef NAME
%undef STACK
%undef BYTES
%undef PIXELS
%undef VPIXELS
%undef FUNC
%undef ARG1
%undef ARG2
%endmacro
;------------------------------------------------------------------------------
xvidcore/src/image/x86_asm/interpolate8x8_xmm.asm0000664000076500007650000004510211254216113023100 0ustar xvidbuildxvidbuild
;/*****************************************************************************
; *
; * XVID MPEG-4 VIDEO CODEC
; * - mmx 8x8 block-based halfpel interpolation -
; *
; * Copyright(C) 2002-2008 Michael Militzer
; *              2002 Pascal Massimino
; *
; * This program is free software ; you can redistribute it and/or modify
; * it under the terms of the GNU General Public License as published by
; * the Free Software Foundation ; either version 2 of the License, or
; * (at your option) any later version.
; *
; * This program is distributed in the hope that it will be useful,
; * but WITHOUT ANY WARRANTY ; without even the implied warranty of
; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
; * GNU General Public License for more details.
; * ; * You should have received a copy of the GNU General Public License ; * along with this program ; if not, write to the Free Software ; * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA ; * ; ****************************************************************************/ %include "nasm.inc" ;============================================================================= ; Read only data ;============================================================================= DATA ALIGN SECTION_ALIGN mmx_one: times 8 db 1 TEXT cglobal interpolate8x8_halfpel_h_xmm cglobal interpolate8x8_halfpel_v_xmm cglobal interpolate8x8_halfpel_hv_xmm cglobal interpolate8x4_halfpel_h_xmm cglobal interpolate8x4_halfpel_v_xmm cglobal interpolate8x4_halfpel_hv_xmm cglobal interpolate8x8_halfpel_add_xmm cglobal interpolate8x8_halfpel_h_add_xmm cglobal interpolate8x8_halfpel_v_add_xmm cglobal interpolate8x8_halfpel_hv_add_xmm ;=========================================================================== ; ; void interpolate8x8_halfpel_h_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;=========================================================================== %macro COPY_H_SSE_RND0 0 movq mm0, [_EAX] pavgb mm0, [_EAX+1] movq mm1, [_EAX+TMP1] pavgb mm1, [_EAX+TMP1+1] lea _EAX,[_EAX+2*TMP1] movq [TMP0],mm0 movq [TMP0+TMP1],mm1 %endmacro %macro COPY_H_SSE_RND1 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] movq mm4, mm0 movq mm5, mm1 movq mm2, [_EAX+1] movq mm3, [_EAX+TMP1+1] pavgb mm0, mm2 pxor mm2, mm4 pavgb mm1, mm3 lea _EAX, [_EAX+2*TMP1] pxor mm3, mm5 pand mm2, mm7 pand mm3, mm7 psubb mm0, mm2 movq [TMP0], mm0 psubb mm1, mm3 movq [TMP0+TMP1], mm1 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_h_xmm: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX,_EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride jnz near .rounding1 COPY_H_SSE_RND0 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND0 lea TMP0,[TMP0+2*TMP1] 
COPY_H_SSE_RND0 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND0 ret .rounding1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 movq mm7, [mmx_one] COPY_H_SSE_RND1 lea TMP0, [TMP0+2*TMP1] COPY_H_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND1 ret ENDFUNC ;=========================================================================== ; ; void interpolate8x8_halfpel_v_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;=========================================================================== %macro COPY_V_SSE_RND0 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] pavgb mm0, mm1 pavgb mm1, [_EAX+2*TMP1] lea _EAX, [_EAX+2*TMP1] movq [TMP0], mm0 movq [TMP0+TMP1],mm1 %endmacro %macro COPY_V_SSE_RND1 0 movq mm0, mm2 movq mm1, [_EAX] movq mm2, [_EAX+TMP1] lea _EAX,[_EAX+2*TMP1] movq mm4, mm0 movq mm5, mm1 pavgb mm0, mm1 pxor mm4, mm1 pavgb mm1, mm2 pxor mm5, mm2 pand mm4, mm7 ; lsb's of (i^j)... pand mm5, mm7 ; lsb's of (i^j)... 
psubb mm0, mm4 ; ...are subtracted from result of pavgb movq [TMP0], mm0 psubb mm1, mm5 ; ...are subtracted from result of pavgb movq [TMP0+TMP1], mm1 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_v_xmm: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX,_EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride ; we process 2 lines at a time jnz near .rounding1 COPY_V_SSE_RND0 lea TMP0, [TMP0+2*TMP1] COPY_V_SSE_RND0 lea TMP0, [TMP0+2*TMP1] COPY_V_SSE_RND0 lea TMP0, [TMP0+2*TMP1] COPY_V_SSE_RND0 ret .rounding1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 movq mm7, [mmx_one] movq mm2, [_EAX] ; loop invariant add _EAX, TMP1 COPY_V_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_V_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_V_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_V_SSE_RND1 ret ENDFUNC ;=========================================================================== ; ; void interpolate8x8_halfpel_hv_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;=========================================================================== ; The trick is to correct the result of 'pavgb' with some combination of the ; lsb's of the 4 input values i,j,k,l, and their intermediate 'pavgb' (s and t). ; The boolean relations are: ; (i+j+k+l+3)/4 = (s+t+1)/2 - (ij&kl)&st ; (i+j+k+l+2)/4 = (s+t+1)/2 - (ij|kl)&st ; (i+j+k+l+1)/4 = (s+t+1)/2 - (ij&kl)|st ; (i+j+k+l+0)/4 = (s+t+1)/2 - (ij|kl)|st ; with s=(i+j+1)/2, t=(k+l+1)/2, ij = i^j, kl = k^l, st = s^t. ; Moreover, we process 2 lines at a time, for better overlapping (~15% faster). %macro COPY_HV_SSE_RND0 0 lea _EAX, [_EAX+TMP1] movq mm0, [_EAX] movq mm1, [_EAX+1] movq mm6, mm0 pavgb mm0, mm1 ; mm0=(j+k+1)/2. preserved for next step lea _EAX, [_EAX+TMP1] pxor mm1, mm6 ; mm1=(j^k). preserved for next step por mm3, mm1 ; ij |= jk movq mm6, mm2 pxor mm6, mm0 ; mm6 = s^t pand mm3, mm6 ; (ij|jk) &= st pavgb mm2, mm0 ; mm2 = (s+t+1)/2 pand mm3, mm7 ; mask lsb psubb mm2, mm3 ; apply.
movq [TMP0], mm2 movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgb mm2, mm3 ; preserved for next iteration lea TMP0,[TMP0+TMP1] pxor mm3, mm6 ; preserved for next iteration por mm1, mm3 movq mm6, mm0 pxor mm6, mm2 pand mm1, mm6 pavgb mm0, mm2 pand mm1, mm7 psubb mm0, mm1 movq [TMP0], mm0 %endmacro %macro COPY_HV_SSE_RND1 0 lea _EAX, [_EAX+TMP1] movq mm0, [_EAX] movq mm1, [_EAX+1] movq mm6, mm0 pavgb mm0, mm1 ; mm0=(j+k+1)/2. preserved for next step lea _EAX, [_EAX+TMP1] pxor mm1, mm6 ; mm1=(j^k). preserved for next step pand mm3, mm1 movq mm6, mm2 pxor mm6, mm0 por mm3, mm6 pavgb mm2, mm0 pand mm3, mm7 psubb mm2, mm3 movq [TMP0], mm2 movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgb mm2, mm3 ; preserved for next iteration lea TMP0,[TMP0+TMP1] pxor mm3, mm6 ; preserved for next iteration pand mm1, mm3 movq mm6, mm0 pxor mm6, mm2 por mm1, mm6 pavgb mm0, mm2 pand mm1, mm7 psubb mm0, mm1 movq [TMP0], mm0 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_hv_xmm: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX, _EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride movq mm7, [mmx_one] ; loop invariants: mm2=(i+j+1)/2 and mm3= i^j movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgb mm2, mm3 pxor mm3, mm6 ; mm2/mm3 ready jnz near .rounding1 COPY_HV_SSE_RND0 add TMP0, TMP1 COPY_HV_SSE_RND0 add TMP0, TMP1 COPY_HV_SSE_RND0 add TMP0, TMP1 COPY_HV_SSE_RND0 ret .rounding1: COPY_HV_SSE_RND1 add TMP0, TMP1 COPY_HV_SSE_RND1 add TMP0, TMP1 COPY_HV_SSE_RND1 add TMP0, TMP1 COPY_HV_SSE_RND1 ret ENDFUNC ;=========================================================================== ; ; void interpolate8x4_halfpel_h_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;=========================================================================== ALIGN SECTION_ALIGN interpolate8x4_halfpel_h_xmm: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX,_EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride jnz near 
.rounding1 COPY_H_SSE_RND0 lea TMP0,[TMP0+2*TMP1] COPY_H_SSE_RND0 ret .rounding1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 movq mm7, [mmx_one] COPY_H_SSE_RND1 lea TMP0, [TMP0+2*TMP1] COPY_H_SSE_RND1 ret ENDFUNC ;=========================================================================== ; ; void interpolate8x4_halfpel_v_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ;=========================================================================== ALIGN SECTION_ALIGN interpolate8x4_halfpel_v_xmm: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX,_EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride ; we process 2 lines at a time jnz near .rounding1 COPY_V_SSE_RND0 lea TMP0, [TMP0+2*TMP1] COPY_V_SSE_RND0 ret .rounding1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 movq mm7, [mmx_one] movq mm2, [_EAX] ; loop invariant add _EAX, TMP1 COPY_V_SSE_RND1 lea TMP0,[TMP0+2*TMP1] COPY_V_SSE_RND1 ret ENDFUNC ;=========================================================================== ; ; void interpolate8x4_halfpel_hv_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;=========================================================================== ; The trick is to correct the result of 'pavgb' with some combination of the ; lsb's of the 4 input values i,j,k,l, and their intermediate 'pavgb' (s and t). ; The boolean relations are: ; (i+j+k+l+3)/4 = (s+t+1)/2 - (ij&kl)&st ; (i+j+k+l+2)/4 = (s+t+1)/2 - (ij|kl)&st ; (i+j+k+l+1)/4 = (s+t+1)/2 - (ij&kl)|st ; (i+j+k+l+0)/4 = (s+t+1)/2 - (ij|kl)|st ; with s=(i+j+1)/2, t=(k+l+1)/2, ij = i^j, kl = k^l, st = s^t. ; Moreover, we process 2 lines at a time, for better overlapping (~15% faster).
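; The relations above can be sketched in scalar C. This is a minimal sketch and
; not part of the Xvid sources: it assumes that one byte lane of 'pavgb'
; computes (a+b+1)>>1, and the helper names pavgb, avg2_round_down, avg4 and
; check_halfpel_identities are hypothetical, introduced only for illustration.
; Only the lsb of each correction term is used, which mirrors the masking with
; mmx_one in the macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed scalar model of one byte lane of pavgb: average, rounded up. */
static uint8_t pavgb(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* (i+j)/2 = (i+j+1)/2 - ((i^j)&1): two-input average, rounded down. */
static uint8_t avg2_round_down(uint8_t i, uint8_t j)
{
    return (uint8_t)(pavgb(i, j) - ((i ^ j) & 1));
}

/* (i+j+k+l+bias)/4 for bias in 0..3, built from pavgb plus an lsb correction. */
static uint8_t avg4(uint8_t i, uint8_t j, uint8_t k, uint8_t l, int bias)
{
    const uint8_t s  = pavgb(i, j);
    const uint8_t t  = pavgb(k, l);
    const uint8_t ij = (uint8_t)(i ^ j);
    const uint8_t kl = (uint8_t)(k ^ l);
    const uint8_t st = (uint8_t)(s ^ t);
    uint8_t corr;

    switch (bias) {
    case 3:  corr = (uint8_t)((ij & kl) & st); break;
    case 2:  corr = (uint8_t)((ij | kl) & st); break; /* rounding == 0 path */
    case 1:  corr = (uint8_t)((ij & kl) | st); break; /* rounding == 1 path */
    default: corr = (uint8_t)((ij | kl) | st); break;
    }
    return (uint8_t)(pavgb(s, t) - (corr & 1));
}

/* Compares both tricks against plain integer arithmetic; returns mismatches. */
static long check_halfpel_identities(void)
{
    static const uint8_t edge[8] = {0, 1, 2, 3, 127, 128, 254, 255};
    long bad = 0;
    int i, j, a, b, bias;

    for (i = 0; i < 256; i++) {
        for (j = 0; j < 256; j++) {
            if (avg2_round_down((uint8_t)i, (uint8_t)j) != (i + j) / 2)
                bad++;
            for (a = 0; a < 8; a++) {
                for (b = 0; b < 8; b++) {
                    const int k = edge[a], l = edge[b];
                    for (bias = 0; bias < 4; bias++) {
                        if (avg4((uint8_t)i, (uint8_t)j, (uint8_t)k,
                                 (uint8_t)l, bias) != (i + j + k + l + bias) / 4)
                            bad++;
                    }
                }
            }
        }
    }
    return bad;
}
```

; In this sketch, bias 2 corresponds to the por/pand sequence of the RND0
; macros and bias 1 to the pand/por sequence of the RND1 macros. The k,l inputs
; are only sampled at a few edge values to keep the check fast.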
ALIGN SECTION_ALIGN interpolate8x4_halfpel_hv_xmm: mov _EAX, prm4 ; rounding mov TMP0, prm1 ; Dst test _EAX, _EAX mov _EAX, prm2 ; Src mov TMP1, prm3 ; stride movq mm7, [mmx_one] ; loop invariants: mm2=(i+j+1)/2 and mm3= i^j movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgb mm2, mm3 pxor mm3, mm6 ; mm2/mm3 ready jnz near .rounding1 COPY_HV_SSE_RND0 add TMP0, TMP1 COPY_HV_SSE_RND0 ret .rounding1: COPY_HV_SSE_RND1 add TMP0, TMP1 COPY_HV_SSE_RND1 ret ENDFUNC ;=========================================================================== ; ; The next functions combine both the source halfpel interpolation step and the ; averaging (with rounding) step to avoid wasting memory bandwidth computing ; intermediate halfpel images and then averaging them. ; ;=========================================================================== %macro PROLOG0 0 mov TMP0, prm1 ; Dst mov _EAX, prm2 ; Src mov TMP1, prm3 ; BpS %endmacro %macro PROLOG1 0 PROLOG0 test prm4d, 1; Rounding? %endmacro %macro EPILOG 0 ret %endmacro ;=========================================================================== ; ; void interpolate8x8_halfpel_add_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;=========================================================================== %macro ADD_FF 2 movq mm0, [_EAX+%1] movq mm1, [_EAX+%2] ;;--- ;; movq mm2, mm0 ;; movq mm3, mm1 ;;--- pavgb mm0, [TMP0+%1] pavgb mm1, [TMP0+%2] ;;-- ;; por mm2, [TMP0+%1] ;; por mm3, [TMP0+%2] ;; pand mm2, [mmx_one] ;; pand mm3, [mmx_one] ;; psubsb mm0, mm2 ;; psubsb mm1, mm3 ;;-- movq [TMP0+%1], mm0 movq [TMP0+%2], mm1 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_add_xmm: ; 23c PROLOG1 ADD_FF 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FF 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FF 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FF 0, TMP1 EPILOG ENDFUNC ;===========================================================================
; ; void interpolate8x8_halfpel_h_add_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;=========================================================================== %macro ADD_FH_RND0 2 movq mm0, [_EAX+%1] movq mm1, [_EAX+%2] pavgb mm0, [_EAX+%1+1] pavgb mm1, [_EAX+%2+1] pavgb mm0, [TMP0+%1] pavgb mm1, [TMP0+%2] movq [TMP0+%1],mm0 movq [TMP0+%2],mm1 %endmacro %macro ADD_FH_RND1 2 movq mm0, [_EAX+%1] movq mm1, [_EAX+%2] movq mm4, mm0 movq mm5, mm1 movq mm2, [_EAX+%1+1] movq mm3, [_EAX+%2+1] pavgb mm0, mm2 ; lea ?? pxor mm2, mm4 pavgb mm1, mm3 pxor mm3, mm5 pand mm2, [mmx_one] pand mm3, [mmx_one] psubb mm0, mm2 psubb mm1, mm3 pavgb mm0, [TMP0+%1] pavgb mm1, [TMP0+%2] movq [TMP0+%1],mm0 movq [TMP0+%2],mm1 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_h_add_xmm: ; 32c PROLOG1 jnz near .Loop1 ADD_FH_RND0 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FH_RND0 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FH_RND0 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FH_RND0 0, TMP1 EPILOG .Loop1: ; we use: (i+j)/2 = ( i+j+1 )/2 - (i^j)&1 ; movq mm7, [mmx_one] ADD_FH_RND1 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FH_RND1 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FH_RND1 0, TMP1 lea _EAX,[_EAX+2*TMP1] lea TMP0,[TMP0+2*TMP1] ADD_FH_RND1 0, TMP1 EPILOG ENDFUNC ;=========================================================================== ; ; void interpolate8x8_halfpel_v_add_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;=========================================================================== %macro ADD_8_HF_RND0 0 movq mm0, [_EAX] movq mm1, [_EAX+TMP1] pavgb mm0, mm1 pavgb mm1, [_EAX+2*TMP1] lea _EAX,[_EAX+2*TMP1] pavgb mm0, [TMP0] pavgb mm1, [TMP0+TMP1] movq [TMP0],mm0 movq [TMP0+TMP1],mm1 %endmacro %macro ADD_8_HF_RND1 0 movq mm1, [_EAX+TMP1] movq mm2, [_EAX+2*TMP1] lea 
_EAX,[_EAX+2*TMP1] movq mm4, mm0 movq mm5, mm1 pavgb mm0, mm1 pxor mm4, mm1 pavgb mm1, mm2 pxor mm5, mm2 pand mm4, mm7 ; lsb's of (i^j)... pand mm5, mm7 ; lsb's of (i^j)... psubb mm0, mm4 ; ...are subtracted from result of pavgb pavgb mm0, [TMP0] movq [TMP0], mm0 psubb mm1, mm5 ; ...are subtracted from result of pavgb pavgb mm1, [TMP0+TMP1] movq [TMP0+TMP1], mm1 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_v_add_xmm: PROLOG1 jnz near .Loop1 pxor mm7, mm7 ; this is a NOP ADD_8_HF_RND0 lea TMP0,[TMP0+2*TMP1] ADD_8_HF_RND0 lea TMP0,[TMP0+2*TMP1] ADD_8_HF_RND0 lea TMP0,[TMP0+2*TMP1] ADD_8_HF_RND0 EPILOG .Loop1: movq mm0, [_EAX] ; loop invariant movq mm7, [mmx_one] ADD_8_HF_RND1 movq mm0, mm2 lea TMP0,[TMP0+2*TMP1] ADD_8_HF_RND1 movq mm0, mm2 lea TMP0,[TMP0+2*TMP1] ADD_8_HF_RND1 movq mm0, mm2 lea TMP0,[TMP0+2*TMP1] ADD_8_HF_RND1 EPILOG ENDFUNC ; The trick is to correct the result of 'pavgb' with some combination of the ; lsb's of the 4 input values i,j,k,l, and their intermediate 'pavgb' (s and t). ; The boolean relations are: ; (i+j+k+l+3)/4 = (s+t+1)/2 - (ij&kl)&st ; (i+j+k+l+2)/4 = (s+t+1)/2 - (ij|kl)&st ; (i+j+k+l+1)/4 = (s+t+1)/2 - (ij&kl)|st ; (i+j+k+l+0)/4 = (s+t+1)/2 - (ij|kl)|st ; with s=(i+j+1)/2, t=(k+l+1)/2, ij = i^j, kl = k^l, st = s^t. ; Moreover, we process 2 lines at a time, for better overlapping (~15% faster). ;=========================================================================== ; ; void interpolate8x8_halfpel_hv_add_xmm(uint8_t * const dst, ; const uint8_t * const src, ; const uint32_t stride, ; const uint32_t rounding); ; ; ;=========================================================================== %macro ADD_HH_RND0 0 lea _EAX,[_EAX+TMP1] movq mm0, [_EAX] movq mm1, [_EAX+1] movq mm6, mm0 pavgb mm0, mm1 ; mm0=(j+k+1)/2. preserved for next step lea _EAX,[_EAX+TMP1] pxor mm1, mm6 ; mm1=(j^k).
preserved for next step por mm3, mm1 ; ij |= jk movq mm6, mm2 pxor mm6, mm0 ; mm6 = s^t pand mm3, mm6 ; (ij|jk) &= st pavgb mm2, mm0 ; mm2 = (s+t+1)/2 pand mm3, mm7 ; mask lsb psubb mm2, mm3 ; apply. pavgb mm2, [TMP0] movq [TMP0], mm2 movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgb mm2, mm3 ; preserved for next iteration lea TMP0,[TMP0+TMP1] pxor mm3, mm6 ; preserved for next iteration por mm1, mm3 movq mm6, mm0 pxor mm6, mm2 pand mm1, mm6 pavgb mm0, mm2 pand mm1, mm7 psubb mm0, mm1 pavgb mm0, [TMP0] movq [TMP0], mm0 %endmacro %macro ADD_HH_RND1 0 lea _EAX,[_EAX+TMP1] movq mm0, [_EAX] movq mm1, [_EAX+1] movq mm6, mm0 pavgb mm0, mm1 ; mm0=(j+k+1)/2. preserved for next step lea _EAX,[_EAX+TMP1] pxor mm1, mm6 ; mm1=(j^k). preserved for next step pand mm3, mm1 movq mm6, mm2 pxor mm6, mm0 por mm3, mm6 pavgb mm2, mm0 pand mm3, mm7 psubb mm2, mm3 pavgb mm2, [TMP0] movq [TMP0], mm2 movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgb mm2, mm3 ; preserved for next iteration lea TMP0,[TMP0+TMP1] pxor mm3, mm6 ; preserved for next iteration pand mm1, mm3 movq mm6, mm0 pxor mm6, mm2 por mm1, mm6 pavgb mm0, mm2 pand mm1, mm7 psubb mm0, mm1 pavgb mm0, [TMP0] movq [TMP0], mm0 %endmacro ALIGN SECTION_ALIGN interpolate8x8_halfpel_hv_add_xmm: PROLOG1 movq mm7, [mmx_one] ; loop invariants: mm2=(i+j+1)/2 and mm3= i^j movq mm2, [_EAX] movq mm3, [_EAX+1] movq mm6, mm2 pavgb mm2, mm3 pxor mm3, mm6 ; mm2/mm3 ready jnz near .Loop1 ADD_HH_RND0 add TMP0, TMP1 ADD_HH_RND0 add TMP0, TMP1 ADD_HH_RND0 add TMP0, TMP1 ADD_HH_RND0 EPILOG .Loop1: ADD_HH_RND1 add TMP0, TMP1 ADD_HH_RND1 add TMP0, TMP1 ADD_HH_RND1 add TMP0, TMP1 ADD_HH_RND1 EPILOG ENDFUNC NON_EXEC_STACK xvidcore/src/image/font.c0000664000076500007650000001655611564705453016475 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Font rendering to frame buffer functions - * * Copyright(C) 2002-2003 Peter Ross * * This program is free software ; 
you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: font.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <stdio.h> #include <stdarg.h> #include "image.h" #include "font.h" #define FONT_WIDTH 4 #define FONT_HEIGHT 6 static const char ascii33[33][FONT_WIDTH*FONT_HEIGHT] = { /* ! */ {0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,0,0, 0,0,1,0}, /* " */ {0,1,0,1, 0,1,0,1, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0}, /* # */ {0,1,1,0, 1,1,1,1, 0,1,1,0, 0,1,1,0, 1,1,1,1, 0,1,1,0}, /* $ */ {0,1,1,0, 1,0,1,1, 1,1,1,0, 0,1,1,1, 1,1,0,1, 0,1,1,0}, /* % */ {1,1,0,1, 1,0,0,1, 0,0,1,0, 0,1,0,0, 1,0,0,1, 1,0,1,1}, /* & */ {0,1,1,0, 1,0,0,0, 0,1,0,1, 1,0,1,0, 1,0,1,0, 0,1,0,1}, /* ' */ {0,0,1,0, 0,0,1,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0}, /* ( */ {0,0,1,0, 0,1,0,0, 0,1,0,0, 0,1,0,0, 0,1,0,0, 0,0,1,0}, /* ) */ {0,1,0,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,1,0,0}, /* * */ {0,0,0,0, 1,0,0,1, 0,1,1,0, 1,1,1,1, 0,1,1,0, 1,0,0,1}, /* + */ {0,0,0,0, 0,0,1,0, 0,0,1,0, 0,1,1,1, 0,0,1,0, 0,0,1,0}, /* , */ {0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,1,1,0, 0,0,1,0}, /* - */ {0,0,0,0, 0,0,0,0, 0,0,0,0, 1,1,1,1, 0,0,0,0, 0,0,0,0}, /* .
*/ {0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,1,1,0, 0,1,1,0}, /* / */ {0,0,0,1, 0,0,0,1, 0,0,1,0, 0,1,0,0, 1,0,0,0, 1,0,0,0}, /* 0 */ {0,1,1,0, 1,0,0,1, 1,0,1,1, 1,1,0,1, 1,0,0,1, 0,1,1,0}, /* 1 */ {0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0}, /* 2 */ {0,1,1,0, 1,0,0,1, 0,0,1,0, 0,1,0,0, 1,0,0,0, 1,1,1,1}, /* 3 */ {0,1,1,0, 1,0,0,1, 0,0,1,0, 0,0,0,1, 1,0,0,1, 0,1,1,0}, /* 4 */ {0,0,1,0, 0,1,1,0, 1,0,1,0, 1,1,1,1, 0,0,1,0, 0,0,1,0}, /* 5 */ {1,1,1,1, 1,0,0,0, 1,1,1,0, 0,0,0,1, 1,0,0,1, 0,1,1,0}, /* 6 */ {0,1,1,1, 1,0,0,0, 1,1,1,0, 1,0,0,1, 1,0,0,1, 0,1,1,0}, /* 7 */ {1,1,1,0, 0,0,0,1, 0,0,0,1, 0,0,1,0, 0,0,1,0, 0,0,1,0}, /* 8 */ {0,1,1,0, 1,0,0,1, 0,1,1,0, 1,0,0,1, 1,0,0,1, 0,1,1,0}, /* 9 */ {0,1,1,0, 1,0,0,1, 1,0,0,1, 0,1,1,1, 0,0,0,1, 1,1,1,0}, /* : */ {0,0,0,0, 0,0,0,0, 0,0,1,0, 0,0,0,0, 0,0,1,0, 0,0,0,0}, /* ; */ {0,0,0,0, 0,0,1,0, 0,0,0,0, 0,0,0,0, 0,1,1,0, 0,0,1,0}, /* < */ {0,0,0,1, 0,0,1,0, 0,1,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1}, /* = */ {0,0,0,0, 1,1,1,1, 0,0,0,0, 0,0,0,0, 1,1,1,1, 0,0,0,0}, /* > */ {0,1,0,0, 0,0,1,0, 0,0,0,1, 0,0,0,1, 0,0,1,0, 0,1,0,0}, /* ? 
*/ {0,1,1,0, 1,0,0,1, 0,0,1,0, 0,0,1,0, 0,0,0,0, 0,0,1,0}, /* @ */ {0,1,1,0, 1,0,0,1, 1,0,1,1, 1,0,1,1, 1,0,0,0, 0,1,1,0}, }; static const char ascii65[26][FONT_WIDTH*FONT_HEIGHT] = { /* A */ {0,1,1,0, 1,0,0,1, 1,0,0,1, 1,1,1,1, 1,0,0,1, 1,0,0,1}, /* B */ {1,1,1,0, 1,0,0,1, 1,1,1,0, 1,0,0,1, 1,0,0,1, 1,1,1,0}, /* C */ {0,1,1,0, 1,0,0,1, 1,0,0,0, 1,0,0,0, 1,0,0,1, 0,1,1,0}, /* D */ {1,1,0,0, 1,0,1,0, 1,0,0,1, 1,0,0,1, 1,0,1,0, 1,1,0,0}, /* E */ {1,1,1,1, 1,0,0,0, 1,1,1,0, 1,0,0,0, 1,0,0,0, 1,1,1,1}, /* F */ {1,1,1,1, 1,0,0,0, 1,1,1,0, 1,0,0,0, 1,0,0,0, 1,0,0,0}, /* G */ {0,1,1,1, 1,0,0,0, 1,0,1,1, 1,0,0,1, 1,0,0,1, 0,1,1,0}, /* H */ {1,0,0,1, 1,0,0,1, 1,1,1,1, 1,0,0,1, 1,0,0,1, 1,0,0,1}, /* I */ {0,1,1,1, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,1,1,1}, /* J */ {0,1,1,1, 0,0,1,0, 0,0,1,0, 0,0,1,0, 1,0,1,0, 0,1,0,0}, /* K */ {1,0,0,1, 1,0,0,1, 1,1,1,0, 1,0,0,1, 1,0,0,1, 1,0,0,1}, /* L */ {1,0,0,0, 1,0,0,0, 1,0,0,0, 1,0,0,0, 1,0,0,0, 1,1,1,1}, /* M */ {1,0,0,1, 1,1,1,1, 1,1,1,1, 1,0,0,1, 1,0,0,1, 1,0,0,1}, /* N */ {1,0,0,1, 1,1,0,1, 1,1,0,1, 1,0,1,1, 1,0,1,1, 1,0,0,1}, /* O */ {0,1,1,0, 1,0,0,1, 1,0,0,1, 1,0,0,1, 1,0,0,1, 0,1,1,0}, /* P */ {1,1,1,0, 1,0,0,1, 1,1,1,0, 1,0,0,0, 1,0,0,0, 1,0,0,0}, /* Q */ {0,1,1,0, 1,0,0,1, 1,0,0,1, 1,0,0,1, 1,0,1,0, 0,1,0,1}, /* R */ {1,1,1,0, 1,0,0,1, 1,1,1,0, 1,0,0,1, 1,0,0,1, 1,0,0,1}, /* S */ {0,1,1,0, 1,0,0,1, 0,1,0,0, 0,0,1,0, 1,0,0,1, 0,1,1,0}, /* T */ {0,1,1,1, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0}, /* U */ {1,0,0,1, 1,0,0,1, 1,0,0,1, 1,0,0,1, 1,0,0,1, 1,1,1,1}, /* V */ {1,0,0,1, 1,0,0,1, 1,0,0,1, 0,1,1,0, 0,1,1,0, 0,1,1,0}, /* W */ {1,0,0,1, 1,0,0,1, 1,0,0,1, 1,1,1,1, 1,1,1,1, 1,0,0,1}, /* X */ {1,0,0,1, 1,0,0,1, 0,1,1,0, 1,0,0,1, 1,0,0,1, 1,0,0,1}, /* Y */ {1,0,0,1, 1,0,0,1, 0,1,0,0, 0,0,1,0, 0,1,0,0, 1,0,0,0}, /* Z */ {1,1,1,1, 0,0,0,1, 0,0,1,0, 0,1,0,0, 1,0,0,0, 1,1,1,1}, }; static const char ascii91[6][FONT_WIDTH*FONT_HEIGHT] = { /* [ */ {0,1,1,0, 0,1,0,0, 0,1,0,0, 0,1,0,0, 0,1,0,0, 0,1,1,0}, /* '\' */ {1,0,0,0,
1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1, 0,0,0,1}, /* ] */ {0,1,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,0,1,0, 0,1,1,0}, /* ^ */ {0,1,0,1, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0}, /* _ */ {0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 1,1,1,1}, /* ` */ {0,1,0,0, 0,0,1,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0} }; #define FONT_ZOOM 4 static void draw_num(IMAGE * img, const int stride, const int height, const char * font, const int x, const int y) { int i, j; for (j = 0; j < FONT_ZOOM * FONT_HEIGHT && y+j < height; j++) for (i = 0; i < FONT_ZOOM * FONT_WIDTH && x+i < stride; i++) if (font[(j/FONT_ZOOM)*FONT_WIDTH + (i/FONT_ZOOM)]) { int offset = (y+j)*stride + (x+i); int offset2 =((y+j)/2)*(stride/2) + ((x+i)/2); img->y[offset] = 255; img->u[offset2] = 127; img->v[offset2] = 127; } } #define FONT_BUF_SZ 1024 void image_printf(IMAGE * img, int edged_width, int height, int x, int y, char *fmt, ...) { va_list args; char buf[FONT_BUF_SZ]; int i; va_start(args, fmt); vsprintf(buf, fmt, args); va_end(args); /* walk the formatted string up to its terminating NUL; comparing the index against the character value would stop early on long strings */ for (i = 0; buf[i] != '\0'; i++) { const char * font; if (buf[i] >= '!' && buf[i] <= '@') font = ascii33[buf[i]-'!']; else if (buf[i] >= 'A' && buf[i] <= 'Z') font = ascii65[buf[i]-'A']; else if (buf[i] >= '[' && buf[i] <= '`') font = ascii91[buf[i]-'[']; else if (buf[i] >= 'a' && buf[i] <= 'z') font = ascii65[buf[i]-'a']; else continue; draw_num(img, edged_width, height, font, x + i*FONT_ZOOM*(FONT_WIDTH+1), y); } } xvidcore/src/image/postprocessing.h0000664000076500007650000000540011564705453020600 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Postprocessing header - * * Copyright(C) 2003-2010 Michael Militzer * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version.
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: postprocessing.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _POSTPROCESSING_H_ #define _POSTPROCESSING_H_ #include #include "../portab.h" /* Filtering thresholds */ #define THR1 2 #define THR2 6 #define MAX_NOISE 4096 #define MAX_SHIFT 1024 #define MAX_RES (MAX_NOISE - MAX_SHIFT) #define DERING_STRENGTH 2 typedef struct { int8_t xvid_thresh_tbl[511]; uint8_t xvid_abs_tbl[511]; int8_t xvid_noise1[MAX_NOISE * sizeof(int8_t)]; int8_t xvid_noise2[MAX_NOISE * sizeof(int8_t)]; int8_t *xvid_prev_shift[MAX_RES][6]; int prev_quant; } XVID_POSTPROC; typedef struct { pthread_t handle; /* thread's handle */ XVID_POSTPROC *tbls; IMAGE * img; const MACROBLOCK * mbs; int stride; int start_x; int stop_x; int start_y; int stop_y; int mb_stride; int flags; } SMPDeblock; void image_postproc(XVID_POSTPROC *tbls, IMAGE * img, int edged_width, const MACROBLOCK * mbs, int mb_width, int mb_height, int mb_stride, int flags, int brightness, int frame_num, int bvop, int threads); void deblock8x8_h(XVID_POSTPROC *tbls, uint8_t *img, int stride, int quant, int dering); void deblock8x8_v(XVID_POSTPROC *tbls, uint8_t *img, int stride, int quant, int dering); void init_postproc(XVID_POSTPROC *tbls); void init_noise(XVID_POSTPROC *tbls); void init_deblock(XVID_POSTPROC *tbls); void add_noise(XVID_POSTPROC * tbls, uint8_t *dst, uint8_t *src, int stride, int width, int height, int shiftptr, int quant); typedef void (IMAGEBRIGHTNESS) (uint8_t * dst, int stride, int 
width, int height, int offset); typedef IMAGEBRIGHTNESS *IMAGEBRIGHTNESS_PTR; extern IMAGEBRIGHTNESS_PTR image_brightness; IMAGEBRIGHTNESS image_brightness_c; IMAGEBRIGHTNESS image_brightness_mmx; IMAGEBRIGHTNESS image_brightness_sse2; #endif xvidcore/src/image/image.h0000664000076500007650000000752611564705453016613 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Image related header - * * Copyright(C) 2001-2010 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: image.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _IMAGE_H_ #define _IMAGE_H_ #include #include "../portab.h" #include "../global.h" #include "colorspace.h" #include "../xvid.h" #define EDGE_SIZE 64 void init_image(uint32_t cpu_flags); static void __inline image_null(IMAGE * image) { image->y = image->u = image->v = NULL; } int32_t image_create(IMAGE * image, uint32_t edged_width, uint32_t edged_height); void image_destroy(IMAGE * image, uint32_t edged_width, uint32_t edged_height); void image_swap(IMAGE * image1, IMAGE * image2); void image_copy(IMAGE * image1, IMAGE * image2, uint32_t edged_width, uint32_t height); void image_setedges(IMAGE * image, uint32_t edged_width, uint32_t edged_height, uint32_t width, uint32_t height, int bs_version); void image_interpolate(const uint8_t * refn, uint8_t * refh, uint8_t * refv, uint8_t * refhv, uint32_t edged_width, uint32_t edged_height, uint32_t quarterpel, uint32_t rounding); float image_psnr(IMAGE * orig_image, IMAGE * recon_image, uint16_t stride, uint16_t width, uint16_t height); float sse_to_PSNR(long sse, int pixels); long plane_sse(uint8_t * orig, uint8_t * recon, uint16_t stride, uint16_t width, uint16_t height); void image_chroma_optimize(IMAGE * img, int width, int height, int edged_width); int image_input(IMAGE * image, uint32_t width, int height, uint32_t edged_width, uint8_t * src[4], int src_stride[4], int csp, int interlaced); int image_output(IMAGE * image, uint32_t width, int height, uint32_t edged_width, uint8_t * dst[4], int dst_stride[4], int csp, int interlaced); int image_dump_yuvpgm(const IMAGE * image, const uint32_t edged_width, const uint32_t width, const uint32_t height, char *filename); float 
image_mad(const IMAGE * img1, const IMAGE * img2, uint32_t stride, uint32_t width, uint32_t height); void output_slice(IMAGE * cur, int stride, int width, xvid_image_t* out_frm, int mbx, int mby,int mbl); void image_clear(IMAGE * img, int width, int height, int edged_width, int y, int u, int v); void image_block_variance(IMAGE * orig_image, uint16_t stride, MACROBLOCK *mbs, uint16_t mb_width, uint16_t mb_height); void image_deblock_rrv(IMAGE * img, int edged_width, const MACROBLOCK * mbs, int mb_width, int mb_height, int mb_stride, int block, int flags); /* helper function: deinterlace image. Only for YUV 4:2:0 planar format. Use bottom_first!=0 if main field is the bottom one. returns 1 if everything went ok, 0 otherwise. */ extern int xvid_image_deinterlace(xvid_image_t* img, int width, int height, int bottom_first); #endif /* _IMAGE_H_ */ xvidcore/src/image/colorspace.h0000664000076500007650000001312111564705453017647 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Colorspace related header - * * Copyright(C) 2001-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: colorspace.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _COLORSPACE_H #define _COLORSPACE_H #include "../portab.h" /* initialize tables */ void colorspace_init(void); /* colorspace conversion function (encoder) */ typedef void (packedFunc) (uint8_t * x_ptr, int x_stride, uint8_t * y_src, uint8_t * v_src, uint8_t * u_src, int y_stride, int uv_stride, int width, int height, int vflip); typedef packedFunc *packedFuncPtr; /* xxx_to_yv12 colorspace conversion functions (encoder) */ extern packedFuncPtr rgb555_to_yv12; extern packedFuncPtr rgb565_to_yv12; extern packedFuncPtr rgb_to_yv12; extern packedFuncPtr bgr_to_yv12; extern packedFuncPtr bgra_to_yv12; extern packedFuncPtr abgr_to_yv12; extern packedFuncPtr rgba_to_yv12; extern packedFuncPtr argb_to_yv12; extern packedFuncPtr yuyv_to_yv12; extern packedFuncPtr uyvy_to_yv12; extern packedFuncPtr rgb555i_to_yv12; extern packedFuncPtr rgb565i_to_yv12; extern packedFuncPtr rgbi_to_yv12; extern packedFuncPtr bgri_to_yv12; extern packedFuncPtr bgrai_to_yv12; extern packedFuncPtr abgri_to_yv12; extern packedFuncPtr rgbai_to_yv12; extern packedFuncPtr argbi_to_yv12; extern packedFuncPtr yuyvi_to_yv12; extern packedFuncPtr uyvyi_to_yv12; /* plain c */ packedFunc rgb555_to_yv12_c; packedFunc rgb565_to_yv12_c; packedFunc rgb_to_yv12_c; packedFunc bgr_to_yv12_c; packedFunc bgra_to_yv12_c; packedFunc abgr_to_yv12_c; packedFunc rgba_to_yv12_c; packedFunc argb_to_yv12_c; packedFunc yuyv_to_yv12_c; packedFunc uyvy_to_yv12_c; packedFunc rgb555i_to_yv12_c; packedFunc rgb565i_to_yv12_c; packedFunc rgbi_to_yv12_c; packedFunc bgri_to_yv12_c; packedFunc bgrai_to_yv12_c; packedFunc abgri_to_yv12_c; packedFunc rgbai_to_yv12_c; packedFunc 
argbi_to_yv12_c; packedFunc yuyvi_to_yv12_c; packedFunc uyvyi_to_yv12_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) /* mmx */ packedFunc bgr_to_yv12_mmx; packedFunc rgb_to_yv12_mmx; packedFunc bgra_to_yv12_mmx; packedFunc rgba_to_yv12_mmx; packedFunc yuyv_to_yv12_mmx; packedFunc uyvy_to_yv12_mmx; /* 3dnow */ packedFunc yuyv_to_yv12_3dn; packedFunc uyvy_to_yv12_3dn; /* xmm */ packedFunc yuyv_to_yv12_xmm; packedFunc uyvy_to_yv12_xmm; #endif #ifdef ARCH_IS_PPC packedFunc bgra_to_yv12_altivec_c; packedFunc abgr_to_yv12_altivec_c; packedFunc rgba_to_yv12_altivec_c; packedFunc argb_to_yv12_altivec_c; packedFunc yuyv_to_yv12_altivec_c; packedFunc uyvy_to_yv12_altivec_c; #endif /* yv12_to_xxx colorspace conversion functions (decoder) */ extern packedFuncPtr yv12_to_rgb555; extern packedFuncPtr yv12_to_rgb565; extern packedFuncPtr yv12_to_rgb; extern packedFuncPtr yv12_to_bgr; extern packedFuncPtr yv12_to_bgra; extern packedFuncPtr yv12_to_abgr; extern packedFuncPtr yv12_to_rgba; extern packedFuncPtr yv12_to_argb; extern packedFuncPtr yv12_to_yuyv; extern packedFuncPtr yv12_to_uyvy; extern packedFuncPtr yv12_to_rgb555i; extern packedFuncPtr yv12_to_rgb565i; extern packedFuncPtr yv12_to_rgbi; extern packedFuncPtr yv12_to_bgri; extern packedFuncPtr yv12_to_bgrai; extern packedFuncPtr yv12_to_abgri; extern packedFuncPtr yv12_to_rgbai; extern packedFuncPtr yv12_to_argbi; extern packedFuncPtr yv12_to_yuyvi; extern packedFuncPtr yv12_to_uyvyi; /* plain c */ packedFunc yv12_to_rgb555_c; packedFunc yv12_to_rgb565_c; packedFunc yv12_to_rgb_c; packedFunc yv12_to_bgr_c; packedFunc yv12_to_bgra_c; packedFunc yv12_to_abgr_c; packedFunc yv12_to_rgba_c; packedFunc yv12_to_argb_c; packedFunc yv12_to_yuyv_c; packedFunc yv12_to_uyvy_c; packedFunc yv12_to_rgb555i_c; packedFunc yv12_to_rgb565i_c; packedFunc yv12_to_rgbi_c; packedFunc yv12_to_bgri_c; packedFunc yv12_to_bgrai_c; packedFunc yv12_to_abgri_c; packedFunc yv12_to_rgbai_c; packedFunc yv12_to_argbi_c; packedFunc 
yv12_to_yuyvi_c; packedFunc yv12_to_uyvyi_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) /* mmx */ packedFunc yv12_to_bgr_mmx; packedFunc yv12_to_bgra_mmx; packedFunc yv12_to_yuyv_mmx; packedFunc yv12_to_uyvy_mmx; packedFunc yv12_to_yuyvi_mmx; packedFunc yv12_to_uyvyi_mmx; #endif #ifdef ARCH_IS_PPC packedFunc yv12_to_yuyv_altivec_c; packedFunc yv12_to_uyvy_altivec_c; #endif typedef void (planarFunc) ( uint8_t * y_dst, uint8_t * u_dst, uint8_t * v_dst, int y_dst_stride, int uv_dst_stride, uint8_t * y_src, uint8_t * u_src, uint8_t * v_src, int y_src_stride, int uv_src_stride, int width, int height, int vflip); typedef planarFunc *planarFuncPtr; extern planarFuncPtr yv12_to_yv12; planarFunc yv12_to_yv12_c; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) planarFunc yv12_to_yv12_mmx; planarFunc yv12_to_yv12_xmm; #endif #endif /* _COLORSPACE_H_ */ xvidcore/src/image/qpel.c0000664000076500007650000011761111564705453016462 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - QPel interpolation - * * Copyright(C) 2003 Pascal Massimino * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 *
 * You should have received a copy of the GNU General Public License
 * along with this program ; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: qpel.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/

#ifndef XVID_AUTO_INCLUDE

#include <stdio.h>

#include "../portab.h"
#include "qpel.h"

/* Quarterpel FIR definition
 ****************************************************************************/

static const int32_t FIR_Tab_8[9][8] = {
  { 14, -3, 2, -1, 0, 0, 0, 0 },
  { 23, 19, -6, 3, -1, 0, 0, 0 },
  { -7, 20, 20, -6, 3, -1, 0, 0 },
  { 3, -6, 20, 20, -6, 3, -1, 0 },
  { -1, 3, -6, 20, 20, -6, 3, -1 },
  { 0, -1, 3, -6, 20, 20, -6, 3 },
  { 0, 0, -1, 3, -6, 20, 20, -7 },
  { 0, 0, 0, -1, 3, -6, 19, 23 },
  { 0, 0, 0, 0, -1, 2, -3, 14 }
};

static const int32_t FIR_Tab_16[17][16] = {
  { 14, -3, 2, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
  { 23, 19, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
  { -7, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
  { 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
  { -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0, 0 },
  { 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0, 0 },
  { 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0, 0 },
  { 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0, 0 },
  { 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0, 0 },
  { 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0, 0 },
  { 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0, 0 },
  { 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1, 0 },
  { 0, 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3, -1 },
  { 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -6, 3 },
  { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 20, 20, -7 },
  { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 3, -6, 19, 23 },
  { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 2, -3, 14 }
};

/* Implementation
 ****************************************************************************/

#define XVID_AUTO_INCLUDE
/*
First, auto-include this file to generate reference code for the SIMD versions.
 * These functions are useful for educational purposes because they are
 * straightforward to understand (they use simple loops and so on), but they
 * are obviously slow. */
#define REFERENCE_CODE

/* 16x? filters */

#define SIZE 16
#define TABLE FIR_Tab_16

#define STORE(d,s) (d) = (s)
#define FUNC_H H_Pass_16_C_ref
#define FUNC_V V_Pass_16_C_ref
#define FUNC_HA H_Pass_Avrg_16_C_ref
#define FUNC_VA V_Pass_Avrg_16_C_ref
#define FUNC_HA_UP H_Pass_Avrg_Up_16_C_ref
#define FUNC_VA_UP V_Pass_Avrg_Up_16_C_ref

#include "qpel.c" /* self-include */

/* note: B-frame always uses Rnd=0... */
#define STORE(d,s) (d) = ( (s)+(d)+1 ) >> 1
#define FUNC_H H_Pass_16_Add_C_ref
#define FUNC_V V_Pass_16_Add_C_ref
#define FUNC_HA H_Pass_Avrg_16_Add_C_ref
#define FUNC_VA V_Pass_Avrg_16_Add_C_ref
#define FUNC_HA_UP H_Pass_Avrg_Up_16_Add_C_ref
#define FUNC_VA_UP V_Pass_Avrg_Up_16_Add_C_ref

#include "qpel.c" /* self-include */

#undef SIZE
#undef TABLE

/* 8x? filters */

#define SIZE 8
#define TABLE FIR_Tab_8

#define STORE(d,s) (d) = (s)
#define FUNC_H H_Pass_8_C_ref
#define FUNC_V V_Pass_8_C_ref
#define FUNC_HA H_Pass_Avrg_8_C_ref
#define FUNC_VA V_Pass_Avrg_8_C_ref
#define FUNC_HA_UP H_Pass_Avrg_Up_8_C_ref
#define FUNC_VA_UP V_Pass_Avrg_Up_8_C_ref

#include "qpel.c" /* self-include */

/* note: B-frame always uses Rnd=0... */
#define STORE(d,s) (d) = ( (s)+(d)+1 ) >> 1
#define FUNC_H H_Pass_8_Add_C_ref
#define FUNC_V V_Pass_8_Add_C_ref
#define FUNC_HA H_Pass_Avrg_8_Add_C_ref
#define FUNC_VA V_Pass_Avrg_8_Add_C_ref
#define FUNC_HA_UP H_Pass_Avrg_Up_8_Add_C_ref
#define FUNC_VA_UP V_Pass_Avrg_Up_8_Add_C_ref

#include "qpel.c" /* self-include */

#undef SIZE
#undef TABLE

/* Then we define a more optimized C version in which the loops are unrolled
 * and the FIR coefficients are hardcoded in the instructions rather than read
 * from memory. It should be faster. */

#undef REFERENCE_CODE

/* 16x?
filters */ #define SIZE 16 #define STORE(d,s) (d) = (s) #define FUNC_H H_Pass_16_C #define FUNC_V V_Pass_16_C #define FUNC_HA H_Pass_Avrg_16_C #define FUNC_VA V_Pass_Avrg_16_C #define FUNC_HA_UP H_Pass_Avrg_Up_16_C #define FUNC_VA_UP V_Pass_Avrg_Up_16_C #include "qpel.c" /* self-include ourself */ /* note: B-frame always uses Rnd=0... */ #define STORE(d,s) (d) = ( (s)+(d)+1 ) >> 1 #define FUNC_H H_Pass_16_Add_C #define FUNC_V V_Pass_16_Add_C #define FUNC_HA H_Pass_Avrg_16_Add_C #define FUNC_VA V_Pass_Avrg_16_Add_C #define FUNC_HA_UP H_Pass_Avrg_Up_16_Add_C #define FUNC_VA_UP V_Pass_Avrg_Up_16_Add_C #include "qpel.c" /* self-include ourself */ #undef SIZE #undef TABLE /* 8x? filters */ #define SIZE 8 #define TABLE FIR_Tab_8 #define STORE(d,s) (d) = (s) #define FUNC_H H_Pass_8_C #define FUNC_V V_Pass_8_C #define FUNC_HA H_Pass_Avrg_8_C #define FUNC_VA V_Pass_Avrg_8_C #define FUNC_HA_UP H_Pass_Avrg_Up_8_C #define FUNC_VA_UP V_Pass_Avrg_Up_8_C #include "qpel.c" /* self-include ourself */ /* note: B-frame always uses Rnd=0... */ #define STORE(d,s) (d) = ( (s)+(d)+1 ) >> 1 #define FUNC_H H_Pass_8_Add_C #define FUNC_V V_Pass_8_Add_C #define FUNC_HA H_Pass_Avrg_8_Add_C #define FUNC_VA V_Pass_Avrg_8_Add_C #define FUNC_HA_UP H_Pass_Avrg_Up_8_Add_C #define FUNC_VA_UP V_Pass_Avrg_Up_8_Add_C #include "qpel.c" /* self-include ourself */ #undef SIZE #undef TABLE #undef XVID_AUTO_INCLUDE /* Global scope hooks ****************************************************************************/ XVID_QP_FUNCS *xvid_QP_Funcs = NULL; XVID_QP_FUNCS *xvid_QP_Add_Funcs = NULL; /* Reference plain C impl. 
declaration ****************************************************************************/ XVID_QP_FUNCS xvid_QP_Funcs_C_ref = { H_Pass_16_C_ref, H_Pass_Avrg_16_C_ref, H_Pass_Avrg_Up_16_C_ref, V_Pass_16_C_ref, V_Pass_Avrg_16_C_ref, V_Pass_Avrg_Up_16_C_ref, H_Pass_8_C_ref, H_Pass_Avrg_8_C_ref, H_Pass_Avrg_Up_8_C_ref, V_Pass_8_C_ref, V_Pass_Avrg_8_C_ref, V_Pass_Avrg_Up_8_C_ref }; XVID_QP_FUNCS xvid_QP_Add_Funcs_C_ref = { H_Pass_16_Add_C_ref, H_Pass_Avrg_16_Add_C_ref, H_Pass_Avrg_Up_16_Add_C_ref, V_Pass_16_Add_C_ref, V_Pass_Avrg_16_Add_C_ref, V_Pass_Avrg_Up_16_Add_C_ref, H_Pass_8_Add_C_ref, H_Pass_Avrg_8_Add_C_ref, H_Pass_Avrg_Up_8_Add_C_ref, V_Pass_8_Add_C_ref, V_Pass_Avrg_8_Add_C_ref, V_Pass_Avrg_Up_8_Add_C_ref }; /* Plain C impl. declaration (faster than ref one) ****************************************************************************/ XVID_QP_FUNCS xvid_QP_Funcs_C = { H_Pass_16_C, H_Pass_Avrg_16_C, H_Pass_Avrg_Up_16_C, V_Pass_16_C, V_Pass_Avrg_16_C, V_Pass_Avrg_Up_16_C, H_Pass_8_C, H_Pass_Avrg_8_C, H_Pass_Avrg_Up_8_C, V_Pass_8_C, V_Pass_Avrg_8_C, V_Pass_Avrg_Up_8_C }; XVID_QP_FUNCS xvid_QP_Add_Funcs_C = { H_Pass_16_Add_C, H_Pass_Avrg_16_Add_C, H_Pass_Avrg_Up_16_Add_C, V_Pass_16_Add_C, V_Pass_Avrg_16_Add_C, V_Pass_Avrg_Up_16_Add_C, H_Pass_8_Add_C, H_Pass_Avrg_8_Add_C, H_Pass_Avrg_Up_8_Add_C, V_Pass_8_Add_C, V_Pass_Avrg_8_Add_C, V_Pass_Avrg_Up_8_Add_C }; /* mmx impl. declaration (see. 
qpel_mmx.asm)
 ****************************************************************************/

#if defined (ARCH_IS_IA32) || defined(ARCH_IS_X86_64)
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Avrg_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Avrg_Up_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Avrg_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Avrg_Up_16_mmx);

extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_8_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Avrg_8_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Avrg_Up_8_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_8_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Avrg_8_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Avrg_Up_8_mmx);

extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Add_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Avrg_Add_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Avrg_Up_Add_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Add_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Avrg_Add_16_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Avrg_Up_Add_16_mmx);

extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_8_Add_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Avrg_8_Add_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_H_Pass_Avrg_Up_8_Add_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_8_Add_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Avrg_8_Add_mmx);
extern XVID_QP_PASS_SIGNATURE(xvid_V_Pass_Avrg_Up_8_Add_mmx);

XVID_QP_FUNCS xvid_QP_Funcs_mmx = {
  xvid_H_Pass_16_mmx, xvid_H_Pass_Avrg_16_mmx, xvid_H_Pass_Avrg_Up_16_mmx,
  xvid_V_Pass_16_mmx, xvid_V_Pass_Avrg_16_mmx, xvid_V_Pass_Avrg_Up_16_mmx,
  xvid_H_Pass_8_mmx, xvid_H_Pass_Avrg_8_mmx, xvid_H_Pass_Avrg_Up_8_mmx,
  xvid_V_Pass_8_mmx, xvid_V_Pass_Avrg_8_mmx, xvid_V_Pass_Avrg_Up_8_mmx
};

XVID_QP_FUNCS xvid_QP_Add_Funcs_mmx = {
  xvid_H_Pass_Add_16_mmx, xvid_H_Pass_Avrg_Add_16_mmx, xvid_H_Pass_Avrg_Up_Add_16_mmx,
  xvid_V_Pass_Add_16_mmx,
xvid_V_Pass_Avrg_Add_16_mmx, xvid_V_Pass_Avrg_Up_Add_16_mmx, xvid_H_Pass_8_Add_mmx, xvid_H_Pass_Avrg_8_Add_mmx, xvid_H_Pass_Avrg_Up_8_Add_mmx, xvid_V_Pass_8_Add_mmx, xvid_V_Pass_Avrg_8_Add_mmx, xvid_V_Pass_Avrg_Up_8_Add_mmx, }; #endif /* ARCH_IS_IA32 */ /* altivec impl. declaration (see qpel_altivec.c) ****************************************************************************/ #ifdef ARCH_IS_PPC extern XVID_QP_PASS_SIGNATURE(H_Pass_16_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_Avrg_16_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_Avrg_Up_16_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_16_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_Avrg_16_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_Avrg_Up_16_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_8_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_Avrg_8_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_Avrg_Up_8_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_8_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_Avrg_8_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_Avrg_Up_8_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_16_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_Avrg_16_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_Avrg_Up_16_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_16_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_Avrg_16_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_Avrg_Up_16_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_8_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_Avrg_8_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(H_Pass_Avrg_Up_8_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_8_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_Avrg_8_Add_Altivec_C); extern XVID_QP_PASS_SIGNATURE(V_Pass_Avrg_Up_8_Add_Altivec_C); XVID_QP_FUNCS xvid_QP_Funcs_Altivec_C = { H_Pass_16_Altivec_C, H_Pass_Avrg_16_Altivec_C, H_Pass_Avrg_Up_16_Altivec_C, V_Pass_16_Altivec_C, V_Pass_Avrg_16_Altivec_C, V_Pass_Avrg_Up_16_Altivec_C, 
  H_Pass_8_Altivec_C, H_Pass_Avrg_8_Altivec_C, H_Pass_Avrg_Up_8_Altivec_C,
  V_Pass_8_Altivec_C, V_Pass_Avrg_8_Altivec_C, V_Pass_Avrg_Up_8_Altivec_C
};

XVID_QP_FUNCS xvid_QP_Add_Funcs_Altivec_C = {
  H_Pass_16_Add_Altivec_C, H_Pass_Avrg_16_Add_Altivec_C, H_Pass_Avrg_Up_16_Add_Altivec_C,
  V_Pass_16_Add_Altivec_C, V_Pass_Avrg_16_Add_Altivec_C, V_Pass_Avrg_Up_16_Add_Altivec_C,
  H_Pass_8_Add_Altivec_C, H_Pass_Avrg_8_Add_Altivec_C, H_Pass_Avrg_Up_8_Add_Altivec_C,
  V_Pass_8_Add_Altivec_C, V_Pass_Avrg_8_Add_Altivec_C, V_Pass_Avrg_Up_8_Add_Altivec_C
};

#endif /* ARCH_IS_PPC */

/* tables for ASM
 ****************************************************************************/

#if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64)
/* These symbols will be used outside this file, so tell the compiler
 * they're global. */
extern uint16_t xvid_Expand_mmx[256][4]; /* 8b -> 64b expansion table */

extern int16_t xvid_FIR_1_0_0_0[256][4];
extern int16_t xvid_FIR_3_1_0_0[256][4];
extern int16_t xvid_FIR_6_3_1_0[256][4];
extern int16_t xvid_FIR_14_3_2_1[256][4];
extern int16_t xvid_FIR_20_6_3_1[256][4];
extern int16_t xvid_FIR_20_20_6_3[256][4];
extern int16_t xvid_FIR_23_19_6_3[256][4];
extern int16_t xvid_FIR_7_20_20_6[256][4];
extern int16_t xvid_FIR_6_20_20_6[256][4];
extern int16_t xvid_FIR_6_20_20_7[256][4];
extern int16_t xvid_FIR_3_6_20_20[256][4];
extern int16_t xvid_FIR_3_6_19_23[256][4];
extern int16_t xvid_FIR_1_3_6_20[256][4];
extern int16_t xvid_FIR_1_2_3_14[256][4];
extern int16_t xvid_FIR_0_1_3_6[256][4];
extern int16_t xvid_FIR_0_0_1_3[256][4];
extern int16_t xvid_FIR_0_0_0_1[256][4];
#endif

/* Array definitions, according to the target platform */
#if !defined(ARCH_IS_X86_64) && !defined(ARCH_IS_IA32)
/* Only ia32/x86_64 will use these tables outside this file, so mark them
 * static for all other archs */
#define __SCOPE static
__SCOPE int16_t xvid_FIR_1_0_0_0[256][4];
__SCOPE int16_t xvid_FIR_3_1_0_0[256][4];
__SCOPE int16_t xvid_FIR_6_3_1_0[256][4];
__SCOPE int16_t
xvid_FIR_14_3_2_1[256][4]; __SCOPE int16_t xvid_FIR_20_6_3_1[256][4]; __SCOPE int16_t xvid_FIR_20_20_6_3[256][4]; __SCOPE int16_t xvid_FIR_23_19_6_3[256][4]; __SCOPE int16_t xvid_FIR_7_20_20_6[256][4]; __SCOPE int16_t xvid_FIR_6_20_20_6[256][4]; __SCOPE int16_t xvid_FIR_6_20_20_7[256][4]; __SCOPE int16_t xvid_FIR_3_6_20_20[256][4]; __SCOPE int16_t xvid_FIR_3_6_19_23[256][4]; __SCOPE int16_t xvid_FIR_1_3_6_20[256][4]; __SCOPE int16_t xvid_FIR_1_2_3_14[256][4]; __SCOPE int16_t xvid_FIR_0_1_3_6[256][4]; __SCOPE int16_t xvid_FIR_0_0_1_3[256][4]; __SCOPE int16_t xvid_FIR_0_0_0_1[256][4]; #endif static void Init_FIR_Table(int16_t Tab[][4], int A, int B, int C, int D) { int i; for(i=0; i<256; ++i) { Tab[i][0] = i*A; Tab[i][1] = i*B; Tab[i][2] = i*C; Tab[i][3] = i*D; } } void xvid_Init_QP(void) { #if defined (ARCH_IS_IA32) || defined (ARCH_IS_X86_64) int i; for(i=0; i<256; ++i) { xvid_Expand_mmx[i][0] = i; xvid_Expand_mmx[i][1] = i; xvid_Expand_mmx[i][2] = i; xvid_Expand_mmx[i][3] = i; } #endif /* Alternate way of filtering (cf. 
USE_TABLES flag in qpel_mmx.asm) */ Init_FIR_Table(xvid_FIR_1_0_0_0, -1, 0, 0, 0); Init_FIR_Table(xvid_FIR_3_1_0_0, 3, -1, 0, 0); Init_FIR_Table(xvid_FIR_6_3_1_0, -6, 3, -1, 0); Init_FIR_Table(xvid_FIR_14_3_2_1, 14, -3, 2, -1); Init_FIR_Table(xvid_FIR_20_6_3_1, 20, -6, 3, -1); Init_FIR_Table(xvid_FIR_20_20_6_3, 20, 20, -6, 3); Init_FIR_Table(xvid_FIR_23_19_6_3, 23, 19, -6, 3); Init_FIR_Table(xvid_FIR_7_20_20_6, -7, 20, 20, -6); Init_FIR_Table(xvid_FIR_6_20_20_6, -6, 20, 20, -6); Init_FIR_Table(xvid_FIR_6_20_20_7, -6, 20, 20, -7); Init_FIR_Table(xvid_FIR_3_6_20_20, 3, -6, 20, 20); Init_FIR_Table(xvid_FIR_3_6_19_23, 3, -6, 19, 23); Init_FIR_Table(xvid_FIR_1_3_6_20, -1, 3, -6, 20); Init_FIR_Table(xvid_FIR_1_2_3_14, -1, 2, -3, 14); Init_FIR_Table(xvid_FIR_0_1_3_6, 0, -1, 3, -6); Init_FIR_Table(xvid_FIR_0_0_1_3, 0, 0, -1, 3); Init_FIR_Table(xvid_FIR_0_0_0_1, 0, 0, 0, -1); } #endif /* !XVID_AUTO_INCLUDE */ #if defined(XVID_AUTO_INCLUDE) && defined(REFERENCE_CODE) /***************************************************************************** * "reference" filters impl. 
in plain C
 ****************************************************************************/

static
void FUNC_H(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t Rnd)
{
  while(H-->0) {
    int32_t i, k;
    int32_t Sums[SIZE] = { 0 };
    /* each of the SIZE+1 source pixels contributes to all SIZE outputs */
    for(i=0; i<=SIZE; ++i)
      for(k=0; k<SIZE; ++k)
        Sums[k] += TABLE[i][k] * Src[i];

    for(i=0; i<SIZE; ++i) {
      int32_t C = ( Sums[i] + 16-Rnd ) >> 5;
      if (C<0) C = 0; else if (C>255) C = 255;
      STORE(Dst[i], C);
    }
    Src += BpS;
    Dst += BpS;
  }
}

static
void FUNC_V(uint8_t *Dst, const uint8_t *Src, int32_t W, int32_t BpS, int32_t Rnd)
{
  while(W-->0) {
    int32_t i, k;
    int32_t Sums[SIZE] = { 0 };
    const uint8_t *S = Src++;
    uint8_t *D = Dst++;
    for(i=0; i<=SIZE; ++i) {
      for(k=0; k<SIZE; ++k)
        Sums[k] += TABLE[i][k] * S[0];
      S += BpS;
    }

    for(i=0; i<SIZE; ++i) {
      int32_t C = ( Sums[i] + 16-Rnd )>>5;
      if (C<0) C = 0; else if (C>255) C = 255;
      STORE(D[0], C);
      D += BpS;
    }
  }
}

static
void FUNC_HA(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t Rnd)
{
  while(H-->0) {
    int32_t i, k;
    int32_t Sums[SIZE] = { 0 };
    for(i=0; i<=SIZE; ++i)
      for(k=0; k<SIZE; ++k)
        Sums[k] += TABLE[i][k] * Src[i];

    for(i=0; i<SIZE; ++i) {
      int32_t C = ( Sums[i] + 16-Rnd ) >> 5;
      if (C<0) C = 0; else if (C>255) C = 255;
      /* average with the source pixel at the same position */
      C = (C+Src[i]+1-Rnd) >> 1;
      STORE(Dst[i], C);
    }
    Src += BpS;
    Dst += BpS;
  }
}

static
void FUNC_HA_UP(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t Rnd)
{
  while(H-->0) {
    int32_t i, k;
    int32_t Sums[SIZE] = { 0 };
    for(i=0; i<=SIZE; ++i)
      for(k=0; k<SIZE; ++k)
        Sums[k] += TABLE[i][k] * Src[i];

    for(i=0; i<SIZE; ++i) {
      int32_t C = ( Sums[i] + 16-Rnd ) >> 5;
      if (C<0) C = 0; else if (C>255) C = 255;
      /* average with the next source pixel to the right */
      C = (C+Src[i+1]+1-Rnd) >> 1;
      STORE(Dst[i], C);
    }
    Src += BpS;
    Dst += BpS;
  }
}

static
void FUNC_VA(uint8_t *Dst, const uint8_t *Src, int32_t W, int32_t BpS, int32_t Rnd)
{
  while(W-->0) {
    int32_t i, k;
    int32_t Sums[SIZE] = { 0 };
    const uint8_t *S = Src;
    uint8_t *D = Dst;

    for(i=0; i<=SIZE; ++i) {
      for(k=0; k<SIZE; ++k)
        Sums[k] += TABLE[i][k] * S[0];
      S += BpS;
    }

    /* rewind to the top of the column for the averaging step */
    S = Src;
    for(i=0; i<SIZE; ++i) {
      int32_t C = ( Sums[i] + 16-Rnd )>>5;
      if (C<0) C = 0; else if (C>255) C = 255;
      C = ( C+S[0]+1-Rnd ) >> 1;
      STORE(D[0], C);
      D += BpS;
      S += BpS;
    }
    Src++;
    Dst++;
  }
}

static
void FUNC_VA_UP(uint8_t *Dst, const uint8_t *Src, int32_t W, int32_t BpS, int32_t Rnd)
{
  while(W-->0) {
    int32_t i, k;
    int32_t Sums[SIZE] = { 0 };
    const uint8_t *S = Src;
    uint8_t *D = Dst;

    for(i=0; i<=SIZE; ++i) {
      for(k=0; k<SIZE; ++k)
        Sums[k] += TABLE[i][k] * S[0];
      S += BpS;
    }

    /* average with the source pixel one row below */
    S = Src + BpS;
    for(i=0; i<SIZE; ++i) {
      int32_t C = ( Sums[i] + 16-Rnd )>>5;
      if (C<0) C = 0; else if (C>255) C = 255;
      C = ( C+S[0]+1-Rnd ) >> 1;
      STORE(D[0], C);
      D += BpS;
      S += BpS;
    }
    Dst++;
    Src++;
  }
}
#undef STORE #undef FUNC_H #undef FUNC_V #undef FUNC_HA #undef FUNC_VA #undef FUNC_HA_UP #undef FUNC_VA_UP #elif defined(XVID_AUTO_INCLUDE) && !defined(REFERENCE_CODE) /***************************************************************************** * "fast" filters impl. in plain C ****************************************************************************/ #define CLIP_STORE(D,C) \ if (C<0) C = 0; else if (C>(255<<5)) C = 255; else C = C>>5; \ STORE(D, C) static void FUNC_H(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t RND) { #if (SIZE==16) while(H-->0) { int C; C = 16-RND +14*Src[0] +23*Src[1] - 7*Src[2] + 3*Src[3] - Src[4]; CLIP_STORE(Dst[ 0],C); C = 16-RND - 3*(Src[0]-Src[4]) +19*Src[1] +20*Src[2] - 6*Src[3] - Src[5]; CLIP_STORE(Dst[ 1],C); C = 16-RND + 2*Src[0] - 6*(Src[1]+Src[4]) +20*(Src[2]+Src[3]) + 3*Src[5] - Src[6]; CLIP_STORE(Dst[ 2],C); C = 16-RND - (Src[0]+Src[7 ]) + 3*(Src[ 1]+Src[ 6])-6*(Src[ 2]+Src[ 5]) + 20*(Src[ 3]+Src[ 4]); CLIP_STORE(Dst[ 3],C); C = 16-RND - (Src[1]+Src[8 ]) + 3*(Src[ 2]+Src[ 7])-6*(Src[ 3]+Src[ 6]) + 20*(Src[ 4]+Src[ 5]); CLIP_STORE(Dst[ 4],C); C = 16-RND - (Src[2]+Src[9 ]) + 3*(Src[ 3]+Src[ 8])-6*(Src[ 4]+Src[ 7]) + 20*(Src[ 5]+Src[ 6]); CLIP_STORE(Dst[ 5],C); C = 16-RND - (Src[3]+Src[10]) + 3*(Src[ 4]+Src[ 9])-6*(Src[ 5]+Src[ 8]) + 20*(Src[ 6]+Src[ 7]); CLIP_STORE(Dst[ 6],C); C = 16-RND - (Src[4]+Src[11]) + 3*(Src[ 5]+Src[10])-6*(Src[ 6]+Src[ 9]) + 20*(Src[ 7]+Src[ 8]); CLIP_STORE(Dst[ 7],C); C = 16-RND - (Src[5]+Src[12]) + 3*(Src[ 6]+Src[11])-6*(Src[ 7]+Src[10]) + 20*(Src[ 8]+Src[ 9]); CLIP_STORE(Dst[ 8],C); C = 16-RND - (Src[6]+Src[13]) + 3*(Src[ 7]+Src[12])-6*(Src[ 8]+Src[11]) + 20*(Src[ 9]+Src[10]); CLIP_STORE(Dst[ 9],C); C = 16-RND - (Src[7]+Src[14]) + 3*(Src[ 8]+Src[13])-6*(Src[ 9]+Src[12]) + 20*(Src[10]+Src[11]); CLIP_STORE(Dst[10],C); C = 16-RND - (Src[8]+Src[15]) + 3*(Src[ 9]+Src[14])-6*(Src[10]+Src[13]) + 20*(Src[11]+Src[12]); CLIP_STORE(Dst[11],C); C = 16-RND - (Src[9]+Src[16]) + 
3*(Src[10]+Src[15])-6*(Src[11]+Src[14]) + 20*(Src[12]+Src[13]); CLIP_STORE(Dst[12],C); C = 16-RND - Src[10] +3*Src[11] -6*(Src[12]+Src[15]) + 20*(Src[13]+Src[14]) +2*Src[16]; CLIP_STORE(Dst[13],C); C = 16-RND - Src[11] +3*(Src[12]-Src[16]) -6*Src[13] + 20*Src[14] + 19*Src[15]; CLIP_STORE(Dst[14],C); C = 16-RND - Src[12] +3*Src[13] -7*Src[14] + 23*Src[15] + 14*Src[16]; CLIP_STORE(Dst[15],C); Src += BpS; Dst += BpS; } #else while(H-->0) { int C; C = 16-RND +14*Src[0] +23*Src[1] - 7*Src[2] + 3*Src[3] - Src[4]; CLIP_STORE(Dst[0],C); C = 16-RND - 3*(Src[0]-Src[4]) +19*Src[1] +20*Src[2] - 6*Src[3] - Src[5]; CLIP_STORE(Dst[1],C); C = 16-RND + 2*Src[0] - 6*(Src[1]+Src[4]) +20*(Src[2]+Src[3]) + 3*Src[5] - Src[6]; CLIP_STORE(Dst[2],C); C = 16-RND - (Src[0]+Src[7]) + 3*(Src[1]+Src[6])-6*(Src[2]+Src[5]) + 20*(Src[3]+Src[4]); CLIP_STORE(Dst[3],C); C = 16-RND - (Src[1]+Src[8]) + 3*(Src[2]+Src[7])-6*(Src[3]+Src[6]) + 20*(Src[4]+Src[5]); CLIP_STORE(Dst[4],C); C = 16-RND - Src[2] +3*Src[3] -6*(Src[4]+Src[7]) + 20*(Src[5]+Src[6]) +2*Src[8]; CLIP_STORE(Dst[5],C); C = 16-RND - Src[3] +3*(Src[4]-Src[8]) -6*Src[5] + 20*Src[6] + 19*Src[7]; CLIP_STORE(Dst[6],C); C = 16-RND - Src[4] +3*Src[5] -7*Src[6] + 23*Src[7] + 14*Src[8]; CLIP_STORE(Dst[7],C); Src += BpS; Dst += BpS; } #endif } #undef CLIP_STORE #define CLIP_STORE(i,C) \ if (C<0) C = 0; else if (C>(255<<5)) C = 255; else C = C>>5; \ C = (C+Src[i]+1-RND) >> 1; \ STORE(Dst[i], C) static void FUNC_HA(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t RND) { #if (SIZE==16) while(H-->0) { int C; C = 16-RND +14*Src[0] +23*Src[1] - 7*Src[2] + 3*Src[3] - Src[4]; CLIP_STORE(0,C); C = 16-RND - 3*(Src[0]-Src[4]) +19*Src[1] +20*Src[2] - 6*Src[3] - Src[5]; CLIP_STORE( 1,C); C = 16-RND + 2*Src[0] - 6*(Src[1]+Src[4]) +20*(Src[2]+Src[3]) + 3*Src[5] - Src[6]; CLIP_STORE( 2,C); C = 16-RND - (Src[0]+Src[7 ]) + 3*(Src[ 1]+Src[ 6])-6*(Src[ 2]+Src[ 5]) + 20*(Src[ 3]+Src[ 4]); CLIP_STORE( 3,C); C = 16-RND - (Src[1]+Src[8 ]) + 3*(Src[ 2]+Src[ 
7])-6*(Src[ 3]+Src[ 6]) + 20*(Src[ 4]+Src[ 5]); CLIP_STORE( 4,C); C = 16-RND - (Src[2]+Src[9 ]) + 3*(Src[ 3]+Src[ 8])-6*(Src[ 4]+Src[ 7]) + 20*(Src[ 5]+Src[ 6]); CLIP_STORE( 5,C); C = 16-RND - (Src[3]+Src[10]) + 3*(Src[ 4]+Src[ 9])-6*(Src[ 5]+Src[ 8]) + 20*(Src[ 6]+Src[ 7]); CLIP_STORE( 6,C); C = 16-RND - (Src[4]+Src[11]) + 3*(Src[ 5]+Src[10])-6*(Src[ 6]+Src[ 9]) + 20*(Src[ 7]+Src[ 8]); CLIP_STORE( 7,C); C = 16-RND - (Src[5]+Src[12]) + 3*(Src[ 6]+Src[11])-6*(Src[ 7]+Src[10]) + 20*(Src[ 8]+Src[ 9]); CLIP_STORE( 8,C); C = 16-RND - (Src[6]+Src[13]) + 3*(Src[ 7]+Src[12])-6*(Src[ 8]+Src[11]) + 20*(Src[ 9]+Src[10]); CLIP_STORE( 9,C); C = 16-RND - (Src[7]+Src[14]) + 3*(Src[ 8]+Src[13])-6*(Src[ 9]+Src[12]) + 20*(Src[10]+Src[11]); CLIP_STORE(10,C); C = 16-RND - (Src[8]+Src[15]) + 3*(Src[ 9]+Src[14])-6*(Src[10]+Src[13]) + 20*(Src[11]+Src[12]); CLIP_STORE(11,C); C = 16-RND - (Src[9]+Src[16]) + 3*(Src[10]+Src[15])-6*(Src[11]+Src[14]) + 20*(Src[12]+Src[13]); CLIP_STORE(12,C); C = 16-RND - Src[10] +3*Src[11] -6*(Src[12]+Src[15]) + 20*(Src[13]+Src[14]) +2*Src[16]; CLIP_STORE(13,C); C = 16-RND - Src[11] +3*(Src[12]-Src[16]) -6*Src[13] + 20*Src[14] + 19*Src[15]; CLIP_STORE(14,C); C = 16-RND - Src[12] +3*Src[13] -7*Src[14] + 23*Src[15] + 14*Src[16]; CLIP_STORE(15,C); Src += BpS; Dst += BpS; } #else while(H-->0) { int C; C = 16-RND +14*Src[0] +23*Src[1] - 7*Src[2] + 3*Src[3] - Src[4]; CLIP_STORE(0,C); C = 16-RND - 3*(Src[0]-Src[4]) +19*Src[1] +20*Src[2] - 6*Src[3] - Src[5]; CLIP_STORE(1,C); C = 16-RND + 2*Src[0] - 6*(Src[1]+Src[4]) +20*(Src[2]+Src[3]) + 3*Src[5] - Src[6]; CLIP_STORE(2,C); C = 16-RND - (Src[0]+Src[7]) + 3*(Src[1]+Src[6])-6*(Src[2]+Src[5]) + 20*(Src[3]+Src[4]); CLIP_STORE(3,C); C = 16-RND - (Src[1]+Src[8]) + 3*(Src[2]+Src[7])-6*(Src[3]+Src[6]) + 20*(Src[4]+Src[5]); CLIP_STORE(4,C); C = 16-RND - Src[2] +3*Src[3] -6*(Src[4]+Src[7]) + 20*(Src[5]+Src[6]) +2*Src[8]; CLIP_STORE(5,C); C = 16-RND - Src[3] +3*(Src[4]-Src[8]) -6*Src[5] + 20*Src[6] + 19*Src[7]; CLIP_STORE(6,C); C 
= 16-RND - Src[4] +3*Src[5] -7*Src[6] + 23*Src[7] + 14*Src[8]; CLIP_STORE(7,C); Src += BpS; Dst += BpS; } #endif } #undef CLIP_STORE #define CLIP_STORE(i,C) \ if (C<0) C = 0; else if (C>(255<<5)) C = 255; else C = C>>5; \ C = (C+Src[i+1]+1-RND) >> 1; \ STORE(Dst[i], C) static void FUNC_HA_UP(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t RND) { #if (SIZE==16) while(H-->0) { int C; C = 16-RND +14*Src[0] +23*Src[1] - 7*Src[2] + 3*Src[3] - Src[4]; CLIP_STORE(0,C); C = 16-RND - 3*(Src[0]-Src[4]) +19*Src[1] +20*Src[2] - 6*Src[3] - Src[5]; CLIP_STORE( 1,C); C = 16-RND + 2*Src[0] - 6*(Src[1]+Src[4]) +20*(Src[2]+Src[3]) + 3*Src[5] - Src[6]; CLIP_STORE( 2,C); C = 16-RND - (Src[0]+Src[7 ]) + 3*(Src[ 1]+Src[ 6])-6*(Src[ 2]+Src[ 5]) + 20*(Src[ 3]+Src[ 4]); CLIP_STORE( 3,C); C = 16-RND - (Src[1]+Src[8 ]) + 3*(Src[ 2]+Src[ 7])-6*(Src[ 3]+Src[ 6]) + 20*(Src[ 4]+Src[ 5]); CLIP_STORE( 4,C); C = 16-RND - (Src[2]+Src[9 ]) + 3*(Src[ 3]+Src[ 8])-6*(Src[ 4]+Src[ 7]) + 20*(Src[ 5]+Src[ 6]); CLIP_STORE( 5,C); C = 16-RND - (Src[3]+Src[10]) + 3*(Src[ 4]+Src[ 9])-6*(Src[ 5]+Src[ 8]) + 20*(Src[ 6]+Src[ 7]); CLIP_STORE( 6,C); C = 16-RND - (Src[4]+Src[11]) + 3*(Src[ 5]+Src[10])-6*(Src[ 6]+Src[ 9]) + 20*(Src[ 7]+Src[ 8]); CLIP_STORE( 7,C); C = 16-RND - (Src[5]+Src[12]) + 3*(Src[ 6]+Src[11])-6*(Src[ 7]+Src[10]) + 20*(Src[ 8]+Src[ 9]); CLIP_STORE( 8,C); C = 16-RND - (Src[6]+Src[13]) + 3*(Src[ 7]+Src[12])-6*(Src[ 8]+Src[11]) + 20*(Src[ 9]+Src[10]); CLIP_STORE( 9,C); C = 16-RND - (Src[7]+Src[14]) + 3*(Src[ 8]+Src[13])-6*(Src[ 9]+Src[12]) + 20*(Src[10]+Src[11]); CLIP_STORE(10,C); C = 16-RND - (Src[8]+Src[15]) + 3*(Src[ 9]+Src[14])-6*(Src[10]+Src[13]) + 20*(Src[11]+Src[12]); CLIP_STORE(11,C); C = 16-RND - (Src[9]+Src[16]) + 3*(Src[10]+Src[15])-6*(Src[11]+Src[14]) + 20*(Src[12]+Src[13]); CLIP_STORE(12,C); C = 16-RND - Src[10] +3*Src[11] -6*(Src[12]+Src[15]) + 20*(Src[13]+Src[14]) +2*Src[16]; CLIP_STORE(13,C); C = 16-RND - Src[11] +3*(Src[12]-Src[16]) -6*Src[13] + 20*Src[14] + 
19*Src[15]; CLIP_STORE(14,C); C = 16-RND - Src[12] +3*Src[13] -7*Src[14] + 23*Src[15] + 14*Src[16]; CLIP_STORE(15,C); Src += BpS; Dst += BpS; } #else while(H-->0) { int C; C = 16-RND +14*Src[0] +23*Src[1] - 7*Src[2] + 3*Src[3] - Src[4]; CLIP_STORE(0,C); C = 16-RND - 3*(Src[0]-Src[4]) +19*Src[1] +20*Src[2] - 6*Src[3] - Src[5]; CLIP_STORE(1,C); C = 16-RND + 2*Src[0] - 6*(Src[1]+Src[4]) +20*(Src[2]+Src[3]) + 3*Src[5] - Src[6]; CLIP_STORE(2,C); C = 16-RND - (Src[0]+Src[7]) + 3*(Src[1]+Src[6])-6*(Src[2]+Src[5]) + 20*(Src[3]+Src[4]); CLIP_STORE(3,C); C = 16-RND - (Src[1]+Src[8]) + 3*(Src[2]+Src[7])-6*(Src[3]+Src[6]) + 20*(Src[4]+Src[5]); CLIP_STORE(4,C); C = 16-RND - Src[2] +3*Src[3] -6*(Src[4]+Src[7]) + 20*(Src[5]+Src[6]) +2*Src[8]; CLIP_STORE(5,C); C = 16-RND - Src[3] +3*(Src[4]-Src[8]) -6*Src[5] + 20*Src[6] + 19*Src[7]; CLIP_STORE(6,C); C = 16-RND - Src[4] +3*Src[5] -7*Src[6] + 23*Src[7] + 14*Src[8]; CLIP_STORE(7,C); Src += BpS; Dst += BpS; } #endif } #undef CLIP_STORE ////////////////////////////////////////////////////////// // vertical passes ////////////////////////////////////////////////////////// // Note: for vertical passes, width (W) needs only be 8 or 16. 
#define CLIP_STORE(D,C) \ if (C<0) C = 0; else if (C>(255<<5)) C = 255; else C = C>>5; \ STORE(D, C) static void FUNC_V(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t RND) { #if (SIZE==16) while(H-->0) { int C; C = 16-RND +14*Src[BpS*0] +23*Src[BpS*1] - 7*Src[BpS*2] + 3*Src[BpS*3] - Src[BpS*4]; CLIP_STORE(Dst[BpS* 0],C); C = 16-RND - 3*(Src[BpS*0]-Src[BpS*4]) +19*Src[BpS*1] +20*Src[BpS*2] - 6*Src[BpS*3] - Src[BpS*5]; CLIP_STORE(Dst[BpS* 1],C); C = 16-RND + 2*Src[BpS*0] - 6*(Src[BpS*1]+Src[BpS*4]) +20*(Src[BpS*2]+Src[BpS*3]) + 3*Src[BpS*5] - Src[BpS*6]; CLIP_STORE(Dst[BpS* 2],C); C = 16-RND - (Src[BpS*0]+Src[BpS*7 ]) + 3*(Src[BpS* 1]+Src[BpS* 6])-6*(Src[BpS* 2]+Src[BpS* 5]) + 20*(Src[BpS* 3]+Src[BpS* 4]); CLIP_STORE(Dst[BpS* 3],C); C = 16-RND - (Src[BpS*1]+Src[BpS*8 ]) + 3*(Src[BpS* 2]+Src[BpS* 7])-6*(Src[BpS* 3]+Src[BpS* 6]) + 20*(Src[BpS* 4]+Src[BpS* 5]); CLIP_STORE(Dst[BpS* 4],C); C = 16-RND - (Src[BpS*2]+Src[BpS*9 ]) + 3*(Src[BpS* 3]+Src[BpS* 8])-6*(Src[BpS* 4]+Src[BpS* 7]) + 20*(Src[BpS* 5]+Src[BpS* 6]); CLIP_STORE(Dst[BpS* 5],C); C = 16-RND - (Src[BpS*3]+Src[BpS*10]) + 3*(Src[BpS* 4]+Src[BpS* 9])-6*(Src[BpS* 5]+Src[BpS* 8]) + 20*(Src[BpS* 6]+Src[BpS* 7]); CLIP_STORE(Dst[BpS* 6],C); C = 16-RND - (Src[BpS*4]+Src[BpS*11]) + 3*(Src[BpS* 5]+Src[BpS*10])-6*(Src[BpS* 6]+Src[BpS* 9]) + 20*(Src[BpS* 7]+Src[BpS* 8]); CLIP_STORE(Dst[BpS* 7],C); C = 16-RND - (Src[BpS*5]+Src[BpS*12]) + 3*(Src[BpS* 6]+Src[BpS*11])-6*(Src[BpS* 7]+Src[BpS*10]) + 20*(Src[BpS* 8]+Src[BpS* 9]); CLIP_STORE(Dst[BpS* 8],C); C = 16-RND - (Src[BpS*6]+Src[BpS*13]) + 3*(Src[BpS* 7]+Src[BpS*12])-6*(Src[BpS* 8]+Src[BpS*11]) + 20*(Src[BpS* 9]+Src[BpS*10]); CLIP_STORE(Dst[BpS* 9],C); C = 16-RND - (Src[BpS*7]+Src[BpS*14]) + 3*(Src[BpS* 8]+Src[BpS*13])-6*(Src[BpS* 9]+Src[BpS*12]) + 20*(Src[BpS*10]+Src[BpS*11]); CLIP_STORE(Dst[BpS*10],C); C = 16-RND - (Src[BpS*8]+Src[BpS*15]) + 3*(Src[BpS* 9]+Src[BpS*14])-6*(Src[BpS*10]+Src[BpS*13]) + 20*(Src[BpS*11]+Src[BpS*12]); CLIP_STORE(Dst[BpS*11],C); 
C = 16-RND - (Src[BpS*9]+Src[BpS*16]) + 3*(Src[BpS*10]+Src[BpS*15])-6*(Src[BpS*11]+Src[BpS*14]) + 20*(Src[BpS*12]+Src[BpS*13]); CLIP_STORE(Dst[BpS*12],C); C = 16-RND - Src[BpS*10] +3*Src[BpS*11] -6*(Src[BpS*12]+Src[BpS*15]) + 20*(Src[BpS*13]+Src[BpS*14]) +2*Src[BpS*16]; CLIP_STORE(Dst[BpS*13],C); C = 16-RND - Src[BpS*11] +3*(Src[BpS*12]-Src[BpS*16]) -6*Src[BpS*13] + 20*Src[BpS*14] + 19*Src[BpS*15]; CLIP_STORE(Dst[BpS*14],C); C = 16-RND - Src[BpS*12] +3*Src[BpS*13] -7*Src[BpS*14] + 23*Src[BpS*15] + 14*Src[BpS*16]; CLIP_STORE(Dst[BpS*15],C); Src += 1; Dst += 1; } #else while(H-->0) { int C; C = 16-RND +14*Src[BpS*0] +23*Src[BpS*1] - 7*Src[BpS*2] + 3*Src[BpS*3] - Src[BpS*4]; CLIP_STORE(Dst[BpS*0],C); C = 16-RND - 3*(Src[BpS*0]-Src[BpS*4]) +19*Src[BpS*1] +20*Src[BpS*2] - 6*Src[BpS*3] - Src[BpS*5]; CLIP_STORE(Dst[BpS*1],C); C = 16-RND + 2*Src[BpS*0] - 6*(Src[BpS*1]+Src[BpS*4]) +20*(Src[BpS*2]+Src[BpS*3]) + 3*Src[BpS*5] - Src[BpS*6]; CLIP_STORE(Dst[BpS*2],C); C = 16-RND - (Src[BpS*0]+Src[BpS*7]) + 3*(Src[BpS*1]+Src[BpS*6])-6*(Src[BpS*2]+Src[BpS*5]) + 20*(Src[BpS*3]+Src[BpS*4]); CLIP_STORE(Dst[BpS*3],C); C = 16-RND - (Src[BpS*1]+Src[BpS*8]) + 3*(Src[BpS*2]+Src[BpS*7])-6*(Src[BpS*3]+Src[BpS*6]) + 20*(Src[BpS*4]+Src[BpS*5]); CLIP_STORE(Dst[BpS*4],C); C = 16-RND - Src[BpS*2] +3*Src[BpS*3] -6*(Src[BpS*4]+Src[BpS*7]) + 20*(Src[BpS*5]+Src[BpS*6]) +2*Src[BpS*8]; CLIP_STORE(Dst[BpS*5],C); C = 16-RND - Src[BpS*3] +3*(Src[BpS*4]-Src[BpS*8]) -6*Src[BpS*5] + 20*Src[BpS*6] + 19*Src[BpS*7]; CLIP_STORE(Dst[BpS*6],C); C = 16-RND - Src[BpS*4] +3*Src[BpS*5] -7*Src[BpS*6] + 23*Src[BpS*7] + 14*Src[BpS*8]; CLIP_STORE(Dst[BpS*7],C); Src += 1; Dst += 1; } #endif } #undef CLIP_STORE #define CLIP_STORE(i,C) \ if (C<0) C = 0; else if (C>(255<<5)) C = 255; else C = C>>5; \ C = (C+Src[BpS*i]+1-RND) >> 1; \ STORE(Dst[BpS*i], C) static void FUNC_VA(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t RND) { #if (SIZE==16) while(H-->0) { int C; C = 16-RND +14*Src[BpS*0] +23*Src[BpS*1] - 
7*Src[BpS*2] + 3*Src[BpS*3] - Src[BpS*4]; CLIP_STORE(0,C); C = 16-RND - 3*(Src[BpS*0]-Src[BpS*4]) +19*Src[BpS*1] +20*Src[BpS*2] - 6*Src[BpS*3] - Src[BpS*5]; CLIP_STORE( 1,C); C = 16-RND + 2*Src[BpS*0] - 6*(Src[BpS*1]+Src[BpS*4]) +20*(Src[BpS*2]+Src[BpS*3]) + 3*Src[BpS*5] - Src[BpS*6]; CLIP_STORE( 2,C); C = 16-RND - (Src[BpS*0]+Src[BpS*7 ]) + 3*(Src[BpS* 1]+Src[BpS* 6])-6*(Src[BpS* 2]+Src[BpS* 5]) + 20*(Src[BpS* 3]+Src[BpS* 4]); CLIP_STORE( 3,C); C = 16-RND - (Src[BpS*1]+Src[BpS*8 ]) + 3*(Src[BpS* 2]+Src[BpS* 7])-6*(Src[BpS* 3]+Src[BpS* 6]) + 20*(Src[BpS* 4]+Src[BpS* 5]); CLIP_STORE( 4,C); C = 16-RND - (Src[BpS*2]+Src[BpS*9 ]) + 3*(Src[BpS* 3]+Src[BpS* 8])-6*(Src[BpS* 4]+Src[BpS* 7]) + 20*(Src[BpS* 5]+Src[BpS* 6]); CLIP_STORE( 5,C); C = 16-RND - (Src[BpS*3]+Src[BpS*10]) + 3*(Src[BpS* 4]+Src[BpS* 9])-6*(Src[BpS* 5]+Src[BpS* 8]) + 20*(Src[BpS* 6]+Src[BpS* 7]); CLIP_STORE( 6,C); C = 16-RND - (Src[BpS*4]+Src[BpS*11]) + 3*(Src[BpS* 5]+Src[BpS*10])-6*(Src[BpS* 6]+Src[BpS* 9]) + 20*(Src[BpS* 7]+Src[BpS* 8]); CLIP_STORE( 7,C); C = 16-RND - (Src[BpS*5]+Src[BpS*12]) + 3*(Src[BpS* 6]+Src[BpS*11])-6*(Src[BpS* 7]+Src[BpS*10]) + 20*(Src[BpS* 8]+Src[BpS* 9]); CLIP_STORE( 8,C); C = 16-RND - (Src[BpS*6]+Src[BpS*13]) + 3*(Src[BpS* 7]+Src[BpS*12])-6*(Src[BpS* 8]+Src[BpS*11]) + 20*(Src[BpS* 9]+Src[BpS*10]); CLIP_STORE( 9,C); C = 16-RND - (Src[BpS*7]+Src[BpS*14]) + 3*(Src[BpS* 8]+Src[BpS*13])-6*(Src[BpS* 9]+Src[BpS*12]) + 20*(Src[BpS*10]+Src[BpS*11]); CLIP_STORE(10,C); C = 16-RND - (Src[BpS*8]+Src[BpS*15]) + 3*(Src[BpS* 9]+Src[BpS*14])-6*(Src[BpS*10]+Src[BpS*13]) + 20*(Src[BpS*11]+Src[BpS*12]); CLIP_STORE(11,C); C = 16-RND - (Src[BpS*9]+Src[BpS*16]) + 3*(Src[BpS*10]+Src[BpS*15])-6*(Src[BpS*11]+Src[BpS*14]) + 20*(Src[BpS*12]+Src[BpS*13]); CLIP_STORE(12,C); C = 16-RND - Src[BpS*10] +3*Src[BpS*11] -6*(Src[BpS*12]+Src[BpS*15]) + 20*(Src[BpS*13]+Src[BpS*14]) +2*Src[BpS*16]; CLIP_STORE(13,C); C = 16-RND - Src[BpS*11] +3*(Src[BpS*12]-Src[BpS*16]) -6*Src[BpS*13] + 20*Src[BpS*14] + 
19*Src[BpS*15]; CLIP_STORE(14,C); C = 16-RND - Src[BpS*12] +3*Src[BpS*13] -7*Src[BpS*14] + 23*Src[BpS*15] + 14*Src[BpS*16]; CLIP_STORE(15,C); Src += 1; Dst += 1; } #else while(H-->0) { int C; C = 16-RND +14*Src[BpS*0] +23*Src[BpS*1] - 7*Src[BpS*2] + 3*Src[BpS*3] - Src[BpS*4]; CLIP_STORE(0,C); C = 16-RND - 3*(Src[BpS*0]-Src[BpS*4]) +19*Src[BpS*1] +20*Src[BpS*2] - 6*Src[BpS*3] - Src[BpS*5]; CLIP_STORE(1,C); C = 16-RND + 2*Src[BpS*0] - 6*(Src[BpS*1]+Src[BpS*4]) +20*(Src[BpS*2]+Src[BpS*3]) + 3*Src[BpS*5] - Src[BpS*6]; CLIP_STORE(2,C); C = 16-RND - (Src[BpS*0]+Src[BpS*7]) + 3*(Src[BpS*1]+Src[BpS*6])-6*(Src[BpS*2]+Src[BpS*5]) + 20*(Src[BpS*3]+Src[BpS*4]); CLIP_STORE(3,C); C = 16-RND - (Src[BpS*1]+Src[BpS*8]) + 3*(Src[BpS*2]+Src[BpS*7])-6*(Src[BpS*3]+Src[BpS*6]) + 20*(Src[BpS*4]+Src[BpS*5]); CLIP_STORE(4,C); C = 16-RND - Src[BpS*2] +3*Src[BpS*3] -6*(Src[BpS*4]+Src[BpS*7]) + 20*(Src[BpS*5]+Src[BpS*6]) +2*Src[BpS*8]; CLIP_STORE(5,C); C = 16-RND - Src[BpS*3] +3*(Src[BpS*4]-Src[BpS*8]) -6*Src[BpS*5] + 20*Src[BpS*6] + 19*Src[BpS*7]; CLIP_STORE(6,C); C = 16-RND - Src[BpS*4] +3*Src[BpS*5] -7*Src[BpS*6] + 23*Src[BpS*7] + 14*Src[BpS*8]; CLIP_STORE(7,C); Src += 1; Dst += 1; } #endif } #undef CLIP_STORE #define CLIP_STORE(i,C) \ if (C<0) C = 0; else if (C>(255<<5)) C = 255; else C = C>>5; \ C = (C+Src[BpS*i+BpS]+1-RND) >> 1; \ STORE(Dst[BpS*i], C) static void FUNC_VA_UP(uint8_t *Dst, const uint8_t *Src, int32_t H, int32_t BpS, int32_t RND) { #if (SIZE==16) while(H-->0) { int C; C = 16-RND +14*Src[BpS*0] +23*Src[BpS*1] - 7*Src[BpS*2] + 3*Src[BpS*3] - Src[BpS*4]; CLIP_STORE(0,C); C = 16-RND - 3*(Src[BpS*0]-Src[BpS*4]) +19*Src[BpS*1] +20*Src[BpS*2] - 6*Src[BpS*3] - Src[BpS*5]; CLIP_STORE( 1,C); C = 16-RND + 2*Src[BpS*0] - 6*(Src[BpS*1]+Src[BpS*4]) +20*(Src[BpS*2]+Src[BpS*3]) + 3*Src[BpS*5] - Src[BpS*6]; CLIP_STORE( 2,C); C = 16-RND - (Src[BpS*0]+Src[BpS*7 ]) + 3*(Src[BpS* 1]+Src[BpS* 6])-6*(Src[BpS* 2]+Src[BpS* 5]) + 20*(Src[BpS* 3]+Src[BpS* 4]); CLIP_STORE( 3,C); C = 16-RND - 
(Src[BpS*1]+Src[BpS*8 ]) + 3*(Src[BpS* 2]+Src[BpS* 7])-6*(Src[BpS* 3]+Src[BpS* 6]) + 20*(Src[BpS* 4]+Src[BpS* 5]); CLIP_STORE( 4,C); C = 16-RND - (Src[BpS*2]+Src[BpS*9 ]) + 3*(Src[BpS* 3]+Src[BpS* 8])-6*(Src[BpS* 4]+Src[BpS* 7]) + 20*(Src[BpS* 5]+Src[BpS* 6]); CLIP_STORE( 5,C); C = 16-RND - (Src[BpS*3]+Src[BpS*10]) + 3*(Src[BpS* 4]+Src[BpS* 9])-6*(Src[BpS* 5]+Src[BpS* 8]) + 20*(Src[BpS* 6]+Src[BpS* 7]); CLIP_STORE( 6,C); C = 16-RND - (Src[BpS*4]+Src[BpS*11]) + 3*(Src[BpS* 5]+Src[BpS*10])-6*(Src[BpS* 6]+Src[BpS* 9]) + 20*(Src[BpS* 7]+Src[BpS* 8]); CLIP_STORE( 7,C); C = 16-RND - (Src[BpS*5]+Src[BpS*12]) + 3*(Src[BpS* 6]+Src[BpS*11])-6*(Src[BpS* 7]+Src[BpS*10]) + 20*(Src[BpS* 8]+Src[BpS* 9]); CLIP_STORE( 8,C); C = 16-RND - (Src[BpS*6]+Src[BpS*13]) + 3*(Src[BpS* 7]+Src[BpS*12])-6*(Src[BpS* 8]+Src[BpS*11]) + 20*(Src[BpS* 9]+Src[BpS*10]); CLIP_STORE( 9,C); C = 16-RND - (Src[BpS*7]+Src[BpS*14]) + 3*(Src[BpS* 8]+Src[BpS*13])-6*(Src[BpS* 9]+Src[BpS*12]) + 20*(Src[BpS*10]+Src[BpS*11]); CLIP_STORE(10,C); C = 16-RND - (Src[BpS*8]+Src[BpS*15]) + 3*(Src[BpS* 9]+Src[BpS*14])-6*(Src[BpS*10]+Src[BpS*13]) + 20*(Src[BpS*11]+Src[BpS*12]); CLIP_STORE(11,C); C = 16-RND - (Src[BpS*9]+Src[BpS*16]) + 3*(Src[BpS*10]+Src[BpS*15])-6*(Src[BpS*11]+Src[BpS*14]) + 20*(Src[BpS*12]+Src[BpS*13]); CLIP_STORE(12,C); C = 16-RND - Src[BpS*10] +3*Src[BpS*11] -6*(Src[BpS*12]+Src[BpS*15]) + 20*(Src[BpS*13]+Src[BpS*14]) +2*Src[BpS*16]; CLIP_STORE(13,C); C = 16-RND - Src[BpS*11] +3*(Src[BpS*12]-Src[BpS*16]) -6*Src[BpS*13] + 20*Src[BpS*14] + 19*Src[BpS*15]; CLIP_STORE(14,C); C = 16-RND - Src[BpS*12] +3*Src[BpS*13] -7*Src[BpS*14] + 23*Src[BpS*15] + 14*Src[BpS*16]; CLIP_STORE(15,C); Src += 1; Dst += 1; } #else while(H-->0) { int C; C = 16-RND +14*Src[BpS*0] +23*Src[BpS*1] - 7*Src[BpS*2] + 3*Src[BpS*3] - Src[BpS*4]; CLIP_STORE(0,C); C = 16-RND - 3*(Src[BpS*0]-Src[BpS*4]) +19*Src[BpS*1] +20*Src[BpS*2] - 6*Src[BpS*3] - Src[BpS*5]; CLIP_STORE(1,C); C = 16-RND + 2*Src[BpS*0] - 6*(Src[BpS*1]+Src[BpS*4]) 
+20*(Src[BpS*2]+Src[BpS*3]) + 3*Src[BpS*5] - Src[BpS*6]; CLIP_STORE(2,C); C = 16-RND - (Src[BpS*0]+Src[BpS*7]) + 3*(Src[BpS*1]+Src[BpS*6])-6*(Src[BpS*2]+Src[BpS*5]) + 20*(Src[BpS*3]+Src[BpS*4]); CLIP_STORE(3,C); C = 16-RND - (Src[BpS*1]+Src[BpS*8]) + 3*(Src[BpS*2]+Src[BpS*7])-6*(Src[BpS*3]+Src[BpS*6]) + 20*(Src[BpS*4]+Src[BpS*5]); CLIP_STORE(4,C); C = 16-RND - Src[BpS*2] +3*Src[BpS*3] -6*(Src[BpS*4]+Src[BpS*7]) + 20*(Src[BpS*5]+Src[BpS*6]) +2*Src[BpS*8]; CLIP_STORE(5,C); C = 16-RND - Src[BpS*3] +3*(Src[BpS*4]-Src[BpS*8]) -6*Src[BpS*5] + 20*Src[BpS*6] + 19*Src[BpS*7]; CLIP_STORE(6,C); C = 16-RND - Src[BpS*4] +3*Src[BpS*5] -7*Src[BpS*6] + 23*Src[BpS*7] + 14*Src[BpS*8]; CLIP_STORE(7,C); Src += 1; Dst += 1; } #endif } #undef CLIP_STORE #undef STORE #undef FUNC_H #undef FUNC_V #undef FUNC_HA #undef FUNC_VA #undef FUNC_HA_UP #undef FUNC_VA_UP #endif /* XVID_AUTO_INCLUDE && !defined(REF) */ xvidcore/src/image/reduced.h0000664000076500007650000000654411564705453017143 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Reduced VOP header - * * Copyright(C) 2002 Pascal Massimino * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: reduced.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _REDUCED_H_ #define _REDUCED_H_ #include "../portab.h" /* decoding */ typedef void (COPY_UPSAMPLED_8X8_16TO8) (uint8_t *Dst, const int16_t *Src, const int BpS); typedef void (ADD_UPSAMPLED_8X8_16TO8) (uint8_t *Dst, const int16_t *Src, const int BpS); /* deblocking: Note: Nb_Blks is the number of 8-pixel blocks to process */ typedef void HFILTER_31(uint8_t *Src1, uint8_t *Src2, int Nb_Blks); typedef void VFILTER_31(uint8_t *Src1, uint8_t *Src2, const int BpS, int Nb_Blks); /* encoding: WARNING! These read 1 pixel outside of the input 16x16 block! */ typedef void FILTER_18X18_TO_8X8(int16_t *Dst, const uint8_t *Src, const int BpS); typedef void FILTER_DIFF_18X18_TO_8X8(int16_t *Dst, const uint8_t *Src, const int BpS); extern COPY_UPSAMPLED_8X8_16TO8 * copy_upsampled_8x8_16to8; extern COPY_UPSAMPLED_8X8_16TO8 xvid_Copy_Upsampled_8x8_16To8_C; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern COPY_UPSAMPLED_8X8_16TO8 xvid_Copy_Upsampled_8x8_16To8_mmx; extern COPY_UPSAMPLED_8X8_16TO8 xvid_Copy_Upsampled_8x8_16To8_xmm; #endif extern ADD_UPSAMPLED_8X8_16TO8 * add_upsampled_8x8_16to8; extern ADD_UPSAMPLED_8X8_16TO8 xvid_Add_Upsampled_8x8_16To8_C; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern ADD_UPSAMPLED_8X8_16TO8 xvid_Add_Upsampled_8x8_16To8_mmx; extern ADD_UPSAMPLED_8X8_16TO8 xvid_Add_Upsampled_8x8_16To8_xmm; #endif extern VFILTER_31 * vfilter_31; extern VFILTER_31 xvid_VFilter_31_C; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern VFILTER_31 xvid_VFilter_31_x86; #endif extern HFILTER_31 * hfilter_31; extern HFILTER_31 xvid_HFilter_31_C; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64)
extern HFILTER_31 xvid_HFilter_31_x86; extern HFILTER_31 xvid_HFilter_31_mmx; #endif extern FILTER_18X18_TO_8X8 * filter_18x18_to_8x8; extern FILTER_18X18_TO_8X8 xvid_Filter_18x18_To_8x8_C; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern FILTER_18X18_TO_8X8 xvid_Filter_18x18_To_8x8_mmx; #endif extern FILTER_DIFF_18X18_TO_8X8 * filter_diff_18x18_to_8x8; extern FILTER_DIFF_18X18_TO_8X8 xvid_Filter_Diff_18x18_To_8x8_C; #if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) extern FILTER_DIFF_18X18_TO_8X8 xvid_Filter_Diff_18x18_To_8x8_mmx; #endif /* rrv motion vector scale-up */ #define RRV_MV_SCALEUP(a) ( (a)>0 ? 2*(a)-1 : (a)<0 ? 2*(a)+1 : (a) ) #endif /* _REDUCED_H_ */ xvidcore/src/image/font.h0000664000076500007650000000236611564705453016474 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Font header (contains the font definition) - * * Copyright(C) 2002-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details.
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: font.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _XVID_FONT_H_ #define _XVID_FONT_H_ #include "image.h" void image_printf(IMAGE * img, int edged_width, int height, int x, int y, char *fmt, ...); #endif /* _XVID_FONT_H_ */ xvidcore/src/image/reduced.c0000664000076500007650000001505311564705453017131 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * Reduced-Resolution utilities * * Copyright(C) 2002 Pascal Massimino * * Xvid is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: reduced.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include "../portab.h" #include "../global.h" #include "reduced.h" /* function pointers */ COPY_UPSAMPLED_8X8_16TO8 * copy_upsampled_8x8_16to8; ADD_UPSAMPLED_8X8_16TO8 * add_upsampled_8x8_16to8; VFILTER_31 * vfilter_31; HFILTER_31 * hfilter_31; FILTER_18X18_TO_8X8 * filter_18x18_to_8x8; FILTER_DIFF_18X18_TO_8X8 * filter_diff_18x18_to_8x8; /*---------------------------------------------------------------------------- * Upsampling (1/3/3/1) filter *--------------------------------------------------------------------------*/ #define ADD(dst,src) (dst) = CLIP((dst)+(src), 0, 255) static __inline void Filter_31(uint8_t *Dst1, uint8_t *Dst2, const int16_t *Src1, const int16_t *Src2) { /* Src[] is assumed to be >=0. So we can use ">>2" instead of "/4" */ int16_t a = (3*Src1[0]+ Src2[0]+2) >> 2; int16_t b = ( Src1[0]+3*Src2[0]+2) >> 2; Dst1[0] = CLIP(a, 0, 255); Dst2[0] = CLIP(b, 0, 255); } static __inline void Filter_9331(uint8_t *Dst1, uint8_t *Dst2, const int16_t *Src1, const int16_t *Src2) { /* Src[] is assumed to be >=0.
So we can use ">>4" instead of "/16" */ int16_t a = (9*Src1[0]+ 3*Src1[1]+ 3*Src2[0] + 1*Src2[1] + 8) >> 4; int16_t b = (3*Src1[0]+ 9*Src1[1]+ 1*Src2[0] + 3*Src2[1] + 8) >> 4; int16_t c = (3*Src1[0]+ 1*Src1[1]+ 9*Src2[0] + 3*Src2[1] + 8) >> 4; int16_t d = (1*Src1[0]+ 3*Src1[1]+ 3*Src2[0] + 9*Src2[1] + 8) >> 4; Dst1[0] = CLIP(a, 0, 255); Dst1[1] = CLIP(b, 0, 255); Dst2[0] = CLIP(c, 0, 255); Dst2[1] = CLIP(d, 0, 255); } void xvid_Copy_Upsampled_8x8_16To8_C(uint8_t *Dst, const int16_t *Src, const int BpS) { int x, y; Dst[0] = CLIP(Src[0], 0, 255); for(x=0; x<7; ++x) Filter_31(Dst+2*x+1, Dst+2*x+2, Src+x, Src+x+1); Dst[15] = CLIP(Src[7], 0, 255); Dst += BpS; for(y=0; y<7; ++y) { uint8_t *const Dst2 = Dst + BpS; Filter_31(Dst, Dst2, Src, Src+8); for(x=0; x<7; ++x) Filter_9331(Dst+2*x+1, Dst2+2*x+1, Src+x, Src+x+8); Filter_31(Dst+15, Dst2+15, Src+7, Src+7+8); Src += 8; Dst += 2*BpS; } Dst[0] = CLIP(Src[0], 0, 255); for(x=0; x<7; ++x) Filter_31(Dst+2*x+1, Dst+2*x+2, Src+x, Src+x+1); Dst[15] = CLIP(Src[7], 0, 255); } static __inline void Filter_Add_31(uint8_t *Dst1, uint8_t *Dst2, const int16_t *Src1, const int16_t *Src2) { /* Here, we must use "/4", since Src[] is in [-256, 255] */ int16_t a = (3*Src1[0]+ Src2[0] + 2) / 4; int16_t b = ( Src1[0]+3*Src2[0] + 2) / 4; ADD(Dst1[0], a); ADD(Dst2[0], b); } static __inline void Filter_Add_9331(uint8_t *Dst1, uint8_t *Dst2, const int16_t *Src1, const int16_t *Src2) { int16_t a = (9*Src1[0]+ 3*Src1[1]+ 3*Src2[0] + 1*Src2[1] + 8) / 16; int16_t b = (3*Src1[0]+ 9*Src1[1]+ 1*Src2[0] + 3*Src2[1] + 8) / 16; int16_t c = (3*Src1[0]+ 1*Src1[1]+ 9*Src2[0] + 3*Src2[1] + 8) / 16; int16_t d = (1*Src1[0]+ 3*Src1[1]+ 3*Src2[0] + 9*Src2[1] + 8) / 16; ADD(Dst1[0], a); ADD(Dst1[1], b); ADD(Dst2[0], c); ADD(Dst2[1], d); } void xvid_Add_Upsampled_8x8_16To8_C(uint8_t *Dst, const int16_t *Src, const int BpS) { int x, y; ADD(Dst[0], Src[0]); for(x=0; x<7; ++x) Filter_Add_31(Dst+2*x+1, Dst+2*x+2, Src+x, Src+x+1); ADD(Dst[15], Src[7]); Dst += BpS; for(y=0; 
y<7; ++y) { uint8_t *const Dst2 = Dst + BpS; Filter_Add_31(Dst, Dst2, Src, Src+8); for(x=0; x<7; ++x) Filter_Add_9331(Dst+2*x+1, Dst2+2*x+1, Src+x, Src+x+8); Filter_Add_31(Dst+15, Dst2+15, Src+7, Src+7+8); Src += 8; Dst += 2*BpS; } ADD(Dst[0], Src[0]); for(x=0; x<7; ++x) Filter_Add_31(Dst+2*x+1, Dst+2*x+2, Src+x, Src+x+1); ADD(Dst[15], Src[7]); } #undef ADD /*---------------------------------------------------------------------------- * horizontal and vertical deblocking *--------------------------------------------------------------------------*/ void xvid_HFilter_31_C(uint8_t *Src1, uint8_t *Src2, int Nb_Blks) { Nb_Blks *= 8; while(Nb_Blks-->0) { uint8_t a = ( 3*Src1[0] + 1*Src2[0] + 2 ) >> 2; uint8_t b = ( 1*Src1[0] + 3*Src2[0] + 2 ) >> 2; *Src1++ = a; *Src2++ = b; } } void xvid_VFilter_31_C(uint8_t *Src1, uint8_t *Src2, const int BpS, int Nb_Blks) { Nb_Blks *= 8; while(Nb_Blks-->0) { uint8_t a = ( 3*Src1[0] + 1*Src2[0] + 2 ) >> 2; uint8_t b = ( 1*Src1[0] + 3*Src2[0] + 2 ) >> 2; *Src1 = a; *Src2 = b; Src1 += BpS; Src2 += BpS; } } /*---------------------------------------------------------------------------- * 16x16 -> 8x8 (1/3/3/1) downsampling * * Warning! These read 1 pixel outside of the input 16x16 block! 
*--------------------------------------------------------------------------*/ void xvid_Filter_18x18_To_8x8_C(int16_t *Dst, const uint8_t *Src, const int BpS) { int16_t *T, Tmp[18*8]; int i, j; T = Tmp; Src -= BpS; for(j=-1; j<17; j++) { for(i=0; i<8; ++i) T[i] = Src[2*i-1] + 3*Src[2*i+0] + 3*Src[2*i+1] + Src[2*i+2]; T += 8; Src += BpS; } T = Tmp + 8; for(j=0; j<8; j++) { for(i=0; i<8; ++i) Dst[i] = ( T[-8+i] + 3*T[0+i] + 3*T[8+i] + T[16+i] + 32 ) / 64; Dst += 8; T += 16; } } void xvid_Filter_Diff_18x18_To_8x8_C(int16_t *Dst, const uint8_t *Src, const int BpS) { int16_t *T, Tmp[18*8]; int i, j; T = Tmp; Src -= BpS; for(j=-1; j<17; j++) { for(i=0; i<8; ++i) T[i] = Src[2*i-1] + 3*Src[2*i+0] + 3*Src[2*i+1] + Src[2*i+2]; T += 8; Src += BpS; } T = Tmp; for(j=0; j<8; j++) { for(i=0; i<8; ++i) Dst[i] -= ( T[i] + 3*T[8+i] + 3*T[16+i] + T[24+i] + 32 ) / 64; Dst += 8; T += 16; } } xvidcore/src/portab.h0000664000076500007650000003621211564705453015730 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Portable macros, types and inlined assembly - * * Copyright(C) 2002-2010 Michael Militzer * 2002-2003 Peter Ross * 2002-2003 Edouard Gomez * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: portab.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _PORTAB_H_ #define _PORTAB_H_ /***************************************************************************** * Common things ****************************************************************************/ /* Buffer size for msvc implementation because it outputs to DebugOutput */ #if defined(_DEBUG) extern unsigned int xvid_debug; #define DPRINTF_BUF_SZ 1024 #endif /***************************************************************************** * Types used in Xvid sources ****************************************************************************/ /*---------------------------------------------------------------------------- | For MSVC *---------------------------------------------------------------------------*/ #if defined(_MSC_VER) || defined (__WATCOMC__) # define int8_t char # define uint8_t unsigned char # define int16_t short # define uint16_t unsigned short # define int32_t int # define uint32_t unsigned int # define int64_t __int64 # define uint64_t unsigned __int64 /*---------------------------------------------------------------------------- | For all other compilers, use the standard header file | (compiler should be ISO C99 compatible, perhaps ISO C89 is enough) *---------------------------------------------------------------------------*/ #else # include #endif /***************************************************************************** * Some things that are OS dependant ****************************************************************************/ #ifdef WIN32 # include # define pthread_t HANDLE # define pthread_create(t,u,f,d) *(t)=CreateThread(NULL,0,f,d,0,NULL) # define pthread_join(t,s) { 
WaitForSingleObject(t,INFINITE); \ CloseHandle(t); } # define sched_yield() Sleep(0); static __inline int pthread_num_processors_np() { DWORD p_aff, s_aff, r = 0; GetProcessAffinityMask(GetCurrentProcess(), (PDWORD_PTR) &p_aff, (PDWORD_PTR) &s_aff); for(; p_aff != 0; p_aff>>=1) r += p_aff&1; return r; } #elif defined(__amigaos4__) # include # include # define sched_yield() IDOS->Delay(1) #elif defined(SYS_BEOS) # include # define pthread_t thread_id # define pthread_create(t,u,f,d) { *(t)=spawn_thread(f,"",10,d); \ resume_thread(*(t)); } # define pthread_join(t,s) wait_for_thread(t,(long*)s) # define sched_yield() snooze(0) /* is this correct? */ #else # include #endif /***************************************************************************** * Some things that are only architecture dependant ****************************************************************************/ #if defined(ARCH_IS_32BIT) # define CACHE_LINE 64 # define ptr_t uint32_t # define intptr_t int32_t # define _INTPTR_T_DEFINED # if defined(_MSC_VER) && _MSC_VER >= 1300 && !defined(__INTEL_COMPILER) # include # else # define uintptr_t uint32_t # endif #elif defined(ARCH_IS_64BIT) # define CACHE_LINE 64 # define ptr_t uint64_t # define intptr_t int64_t # define _INTPTR_T_DEFINED # if defined (_MSC_VER) && _MSC_VER >= 1300 && !defined(__INTEL_COMPILER) # include # else # define uintptr_t uint64_t # endif #else # error You are trying to compile Xvid without defining address bus size. 
#endif /***************************************************************************** * Things that must be sorted by compiler and then by architecture ****************************************************************************/ /***************************************************************************** * MSVC compiler specific macros, functions ****************************************************************************/ #if defined(_MSC_VER) /*---------------------------------------------------------------------------- | Common msvc stuff *---------------------------------------------------------------------------*/ # include # include /* Non ANSI mapping */ # define snprintf _snprintf # define vsnprintf _vsnprintf /* * This function must be declared/defined all the time because MSVC does * not support C99 variable arguments macros. * * Btw, if the MS compiler does its job well, it should remove the nop * DPRINTF function when not compiling in _DEBUG mode */ # ifdef _DEBUG static __inline void DPRINTF(int level, char *fmt, ...) { if (xvid_debug & level) { va_list args; char buf[DPRINTF_BUF_SZ]; va_start(args, fmt); vsprintf(buf, fmt, args); va_end(args); OutputDebugStringA(buf); fprintf(stderr, "%s", buf); } } # else static __inline void DPRINTF(int level, char *fmt, ...) 
{} # endif # if _MSC_VER <= 1200 # define DECLARE_ALIGNED_MATRIX(name,sizex,sizey,type,alignment) \ type name##_storage[(sizex)*(sizey)+(alignment)-1]; \ type * name = (type *) (((int32_t) name##_storage+(alignment - 1)) & ~((int32_t)(alignment)-1)) # else # define DECLARE_ALIGNED_MATRIX(name,sizex,sizey,type,alignment) \ __declspec(align(alignment)) type name[(sizex)*(sizey)] # endif /*---------------------------------------------------------------------------- | msvc x86 specific macros/functions *---------------------------------------------------------------------------*/ # if defined(ARCH_IS_IA32) # define BSWAP(a) __asm mov eax,a __asm bswap eax __asm mov a, eax static __inline int64_t read_counter(void) { int64_t ts; uint32_t ts1, ts2; __asm { rdtsc mov ts1, eax mov ts2, edx } ts = ((uint64_t) ts2 << 32) | ((uint64_t) ts1); return ts; } # elif defined(ARCH_IS_X86_64) # include <intrin.h> # define BSWAP(a) ((a) = _byteswap_ulong(a)) static __inline int64_t read_counter(void) { return __rdtsc(); } /*---------------------------------------------------------------------------- | msvc GENERIC (plain C only) - Probably alpha or some embedded device *---------------------------------------------------------------------------*/ # elif defined(ARCH_IS_GENERIC) # define BSWAP(a) \ ((a) = (((a) & 0xff) << 24) | (((a) & 0xff00) << 8) | \ (((a) >> 8) & 0xff00) | (((a) >> 24) & 0xff)) # include <time.h> static __inline int64_t read_counter(void) { return (int64_t)clock(); } /*---------------------------------------------------------------------------- | msvc Not given architecture - This is probably a user who tries to build | Xvid the wrong way. *---------------------------------------------------------------------------*/ # else # error You are trying to compile Xvid without defining the architecture type.
# endif /***************************************************************************** * GNU CC compiler stuff ****************************************************************************/ #elif defined(__GNUC__) || defined(__ICC) /* Compiler test */ /*---------------------------------------------------------------------------- | Common gcc stuff *---------------------------------------------------------------------------*/ /* * As gcc is (mostly) C99 compliant, we define DPRINTF only if it's really needed * and it's a macro calling fprintf directly */ # ifdef _DEBUG /* Needed for all debug fprintf calls */ # include <stdio.h> # include <stdarg.h> static __inline void DPRINTF(int level, char *format, ...) { va_list args; va_start(args, format); if(xvid_debug & level) { vfprintf(stderr, format, args); } va_end(args); } # else /* _DEBUG */ static __inline void DPRINTF(int level, char *format, ...) {} # endif /* _DEBUG */ # define DECLARE_ALIGNED_MATRIX(name,sizex,sizey,type,alignment) \ type name##_storage[(sizex)*(sizey)+(alignment)-1]; \ type * name = (type *) (((ptr_t) name##_storage+(alignment - 1)) & ~((ptr_t)(alignment)-1)) /*---------------------------------------------------------------------------- | gcc IA32 specific macros/functions *---------------------------------------------------------------------------*/ # if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) # define BSWAP(a) __asm__ ( "bswapl %0\n" : "=r" (a) : "0" (a) ); static __inline int64_t read_counter(void) { int64_t ts; uint32_t ts1, ts2; __asm__ __volatile__("rdtsc\n\t":"=a"(ts1), "=d"(ts2)); ts = ((uint64_t) ts2 << 32) | ((uint64_t) ts1); return ts; } /*---------------------------------------------------------------------------- | gcc PPC and PPC Altivec specific macros/functions *---------------------------------------------------------------------------*/ # elif defined(ARCH_IS_PPC) # if defined(HAVE_ALTIVEC_PARENTHESES_DECL) # define AVV(x...) (x) # elif defined(HAVE_ALTIVEC_BRACES_DECL) # define AVV(x...)
{x} # else # error Trying to compile PPC target without a vector declaration type. # endif # define BSWAP(a) __asm__ __volatile__ \ ( "lwbrx %0,0,%1; eieio" : "=r" (a) : "r" (&(a)), "m" (a)); static __inline unsigned long get_tbl(void) { unsigned long tbl; asm volatile ("mftb %0":"=r" (tbl)); return tbl; } static __inline unsigned long get_tbu(void) { unsigned long tbu; asm volatile ("mftbu %0":"=r" (tbu)); return tbu; } static __inline int64_t read_counter(void) { unsigned long tb, tu; /* re-read the upper half to detect a carry out of the lower half between the two reads */ do { tu = get_tbu(); tb = get_tbl(); }while (tu != get_tbu()); return (((int64_t) tu) << 32) | (int64_t) tb; } /*---------------------------------------------------------------------------- | gcc IA64 specific macros/functions *---------------------------------------------------------------------------*/ # elif defined(ARCH_IS_IA64) # define BSWAP(a) __asm__ __volatile__ \ ("mux1 %0 = %1, @rev" ";;" \ "shr.u %0 = %0, 32" : "=r" (a) : "r" (a)); static __inline int64_t read_counter(void) { unsigned long result; __asm__ __volatile__("mov %0=ar.itc" : "=r"(result) :: "memory"); return result; } /*---------------------------------------------------------------------------- | gcc GENERIC (plain C only) specific macros/functions *---------------------------------------------------------------------------*/ # elif defined(ARCH_IS_GENERIC) # define BSWAP(a) \ ((a) = (((a) & 0xff) << 24) | (((a) & 0xff00) << 8) | \ (((a) >> 8) & 0xff00) | (((a) >> 24) & 0xff)) # include <time.h> static __inline int64_t read_counter(void) { return (int64_t)clock(); } /*---------------------------------------------------------------------------- | gcc Not given architecture - This is probably a user who tries to build | Xvid the wrong way. *---------------------------------------------------------------------------*/ # else # error You are trying to compile Xvid without defining the architecture type.
# endif /***************************************************************************** * Open WATCOM C/C++ compiler ****************************************************************************/ #elif defined(__WATCOMC__) # include <stdio.h> # include <stdarg.h> # ifdef _DEBUG static __inline void DPRINTF(int level, char *fmt, ...) { if (xvid_debug & level) { va_list args; char buf[DPRINTF_BUF_SZ]; va_start(args, fmt); vsprintf(buf, fmt, args); va_end(args); fprintf(stderr, "%s", buf); } } # else /* _DEBUG */ static __inline void DPRINTF(int level, char *format, ...) {} # endif /* _DEBUG */ # define DECLARE_ALIGNED_MATRIX(name,sizex,sizey,type,alignment) \ type name##_storage[(sizex)*(sizey)+(alignment)-1]; \ type * name = (type *) (((int32_t) name##_storage+(alignment - 1)) & ~((int32_t)(alignment)-1)) /*---------------------------------------------------------------------------- | watcom ia32 specific macros/functions *---------------------------------------------------------------------------*/ # if defined(ARCH_IS_IA32) || defined(ARCH_IS_X86_64) # define BSWAP(a) __asm mov eax,a __asm bswap eax __asm mov a, eax static __inline int64_t read_counter(void) { uint64_t ts; uint32_t ts1, ts2; __asm { rdtsc mov ts1, eax mov ts2, edx } ts = ((uint64_t) ts2 << 32) | ((uint64_t) ts1); return ts; } /*---------------------------------------------------------------------------- | watcom GENERIC (plain C only) specific macros/functions. *---------------------------------------------------------------------------*/ # elif defined(ARCH_IS_GENERIC) # define BSWAP(x) \ x = ((((x) & 0xff000000) >> 24) | \ (((x) & 0x00ff0000) >> 8) | \ (((x) & 0x0000ff00) << 8) | \ (((x) & 0x000000ff) << 24)) static int64_t read_counter() { return 0; } /*---------------------------------------------------------------------------- | watcom Not given architecture - This is probably a user who tries to build | Xvid the wrong way.
*---------------------------------------------------------------------------*/ # else # error You are trying to compile Xvid without defining the architecture type. # endif /***************************************************************************** * Unknown compiler ****************************************************************************/ #else /* Compiler test */ /* * OK, we know nothing about the compiler, so we fall back to ANSI C * features that every compiler should accept. * * This is (mostly) equivalent to ARCH_IS_GENERIC. */ # ifdef _DEBUG /* Needed for all debug fprintf calls */ # include <stdio.h> # include <stdarg.h> static __inline void DPRINTF(int level, char *format, ...) { va_list args; va_start(args, format); if(xvid_debug & level) { vfprintf(stderr, format, args); } va_end(args); } # else /* _DEBUG */ static __inline void DPRINTF(int level, char *format, ...) {} # endif /* _DEBUG */ # define BSWAP(a) \ ((a) = (((a) & 0xff) << 24) | (((a) & 0xff00) << 8) | \ (((a) >> 8) & 0xff00) | (((a) >> 24) & 0xff)) # include <time.h> static __inline int64_t read_counter(void) { return (int64_t)clock(); } # define DECLARE_ALIGNED_MATRIX(name,sizex,sizey,type,alignment) \ type name[(sizex)*(sizey)] #endif /* Compiler test */ #endif /* _PORTAB_H_ */ xvidcore/src/encoder.h0000664000076500007650000001351711564705453016063 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Encoder related header - * * Copyright(C) 2002-2010 Michael Militzer * 2002-2003 Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version.
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: encoder.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _ENCODER_H_ #define _ENCODER_H_ #include "xvid.h" #include "portab.h" #include "global.h" /***************************************************************************** * Constants ****************************************************************************/ /* lambda base exponential. 1<= 0 */ /* multithreaded stuff */ int num_threads; /* number of encoder threads */ SMPData * smpData; /* data structures used to pass all thread-specific data */ int m_framenum; /* debug frame num counter; unlike iFrameNum, does not reset at ivop */ float fMvPrevSigma; int num_slices; /* number of slices to code */ } Encoder; /***************************************************************************** * Inline functions ****************************************************************************/ static __inline uint8_t get_fcode(uint16_t sr) { if (sr <= 16) return 1; else if (sr <= 32) return 2; else if (sr <= 64) return 3; else if (sr <= 128) return 4; else if (sr <= 256) return 5; else if (sr <= 512) return 6; else if (sr <= 1024) return 7; else return 0; } /***************************************************************************** * Prototypes ****************************************************************************/ void init_encoder(uint32_t cpu_flags); int enc_create(xvid_enc_create_t * create); int enc_destroy(Encoder * pEnc); int enc_encode(Encoder * pEnc, xvid_enc_frame_t * pFrame, 
			   xvid_enc_stats_t * stats);

#endif
xvidcore/CodingStyle0000664000076500007650000002543511504426344015646 0ustar xvidbuildxvidbuild
CodingStyle
===========

This is a short document describing the preferred coding style for the
Xvid core library. Coding style is very personal, and we won't _force_ our
views on anybody. But if everybody who submits patches/code to the CVS
respects this coding style, the whole source will be easier to read and
understand for all the other developers.

Chapter 1: Indentation

Tabs are 4 characters, and thus indentations are also 4 characters. We use
tabs as indentation chars; try not to use spaces, as they make the source
code bigger.

In short, 8-char indents would have made things easier to read, and would
have the added benefit of warning you when you're nesting your functions
too deep. But because some parts of the Xvid source code have to use a lot
of if/else/for statements together, we have chosen to set the standard tab
length to 4 characters.

Setting the tab length to 4 is not a license to write deep code paths. Try
to keep your code path as simple as possible. Code nested more than 3
levels deep should be rewritten if possible.

Chapter 2: Placing Braces

The other issue that always comes up in C styling is the placement of
braces. Unlike the indent size, there are few technical reasons to choose
one placement strategy over the other, but the preferred way is to place
the braces just at the end of the line:

	if (x is true) {
		we do y
	} else {
		we do z
	}

This way we waste no lines with a single brace in it, and can add more
useful comments to the code.

Function and struct braces do not obey this rule; their braces are placed
on the next line:

	int function(int x)
	{
		body of function
	}

	struct foo_t
	{
		body of the structure
	}

Note that the closing brace is empty on a line of its own, _except_ in the
cases where it is followed by a continuation of the same statement, i.e. a
"while" in a do-statement or an "else" in an if-statement, like this:

	do {
		body of do-loop
	} while (condition);

and

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

Chapter 3: Naming

C is a Spartan language, and so should your naming be. Unlike Modula-2 and
Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter. A C programmer would call that variable
"tmp", which is much easier to write, and not the least more difficult to
understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must. To call a global function "foo" is a shooting
offense.

GLOBAL variables (to be used only if you _really_ need them) need to have
descriptive names, as do global functions. If you have a function that
counts the number of active users, you should call that
"count_active_users()" or similar, you should _not_ call it "cntusr()".
Try not to use global variables, as they break reentrancy and Xvid aims to
be (in the long term) a threadable library.

Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged - the compiler knows the types anyway and can
check those, and it only confuses the programmer. No wonder Microsoft
makes buggy programs.

LOCAL variable names should be short, and to the point. If you have some
random integer loop counter, it should probably be called "i". Calling it
"loop_counter" is non-productive, if there is no chance of it being
misunderstood. Similarly, "tmp" can be just about any type of variable
that is used to hold a temporary value.
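
The naming rules above can be sketched in a few lines of C. This is only
an illustration: the user_t type and its "active" field are hypothetical
and exist nowhere in the Xvid sources; only the name count_active_users()
comes from the text.

```c
/* Hypothetical record type, used only for this illustration. */
typedef struct
{
	int active;		/* non-zero when the user is active */
} user_t;

/* Global function: the name says exactly what is being counted. */
int
count_active_users(const user_t * users, int num_users)
{
	int i;			/* plain loop counter, "i" says enough */
	int count = 0;

	for (i = 0; i < num_users; i++) {
		if (users[i].active)
			count++;
	}

	return count;
}
```

The global name is long because callers elsewhere have to understand it
without context; the locals stay short because their whole life fits on
one screen.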

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See next chapter.

Chapter 4: Functions

Functions should be short and sweet, and do just one thing. They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function. So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even understand
what the function is all about, you should adhere to the maximum limits
all the more closely. Use helper functions with descriptive names (you can
ask the compiler to in-line them if you think it's performance-critical,
and it will probably do a better job of it than you would have done).

Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can generally
easily keep track of about 7 different things, anything more and it gets
confused. You know you're brilliant, but maybe you'd like to understand
what you did 2 weeks from now.

NB : This chapter does not apply very well to some Xvid parts, but keep
this "philosophy" in mind anyway.

Chapter 5: Commenting

Comments are good, but there is also a danger of over-commenting. NEVER
try to explain HOW your code works in a comment: it's much better to write
the code so that the _working_ is obvious, and it's a waste of time to
explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
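
As a small sketch of a WHAT/WHY comment, consider the helper below. The
function name clip_to_pixel() is hypothetical and is not taken from the
Xvid sources; it only shows where the comment goes and what it says.

```c
/*
 * Clamp a reconstructed sample to the 8-bit pixel range [0, 255].
 *
 * Needed because adding a residual to a prediction can leave the valid
 * range in either direction.
 */
int
clip_to_pixel(int value)
{
	if (value < 0)
		return 0;
	if (value > 255)
		return 255;
	return value;
}
```

The comment states the purpose and the reason; the body is simple enough
that nobody needs to be told how it gets there.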
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 4 for a while. You can make small
comments to note or warn about something particularly clever (or ugly),
but try to avoid excess. Instead, put the comments at the head of the
function, telling people what it does, and possibly WHY it does it.

Chapter 6: Emacs settings

That's OK, we all do. You've probably been told by your long-time Unix
user helper that "GNU emacs" automatically formats the C sources for you,
and you've noticed that yes, it does do that, but the defaults it uses are
less than desirable (in fact, they are worse than random typing - an
infinite number of monkeys typing into GNU emacs would never make a good
program).

So, you can either get rid of GNU emacs, or change it to use saner values.
To do the latter, you can stick the following in your .emacs file:

(defun xvid-c-mode ()
  "C mode with adjusted defaults for use with the Xvid Sources."
  (interactive)
  (c-mode)
  (message "Loading xvid-c-mode")
  (c-set-style "K&R")
  (setq c-basic-offset 4)
  (setq indent-tabs-mode t
        tab-width 4)
  (turn-on-follow-mode)
  (toggle-truncate-lines)
  (setq make-backup-files nil)
  (setq column-number-mode t)
  )

This will define the "M-x xvid-c-mode" command. When hacking the library,
if you put the string -*- xvid-c -*- somewhere on the first two lines,
this mode will be automatically invoked. Also, you may want to add

(setq auto-mode-alist (cons '("/.*/xvidcore.*/.*\\.[ch]$" . xvid-c-mode)
                       auto-mode-alist))

to your .emacs file if you want to have xvid-c-mode switched on
automagically when you edit source files under */xvidcore/.

Chapter 7: indent

But even if you fail in getting emacs to do sane formatting, or you are
using another kind of editor (vim?, vc++??, notepad ??? <- you crazy :-),
not everything is lost: use "indent".

Now, again, GNU indent has the same brain dead settings that GNU emacs
has, which is why you need to give it a few command line options. However,
that's not too bad, because even the makers of GNU indent recognize the
authority of K&R (the GNU people aren't evil, they are just severely
misguided in this matter), so you just give indent the options:

-bad -bap -nbbo -nbc -br -c33 -cd33 -ncdb -ce -ci4 -cli0 -cp33 -cs -d0
-di1 -nfc1 -nfca -nhnl -i4 -ip0 -l79 -lp -npcs -nprs -psl -saf -sai -saw
-nsc -nsob -nss -ut -ts4 -bfda

Look at the man pages for more details about each option. Basically, our
style is very close to the K&R one.

"indent" has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the manual page. But
remember: "indent" is not a fix for bad programming.

Chapter 8: Reentrance

Well, at the moment, Xvid is not a reentrant library because during its
development, some mistakes have been committed. But reentrance is a long
term aim for this project, so you should not write code which is not
reentrant.

To ensure you're writing reentrant code, check that:

 - you're not using global variables (except constants because they're
   used in read only access).
 - you're not using static variables (except constants because they're
   used in read only access).
 - functions use only local variables.
 - functions do not return data locally allocated on the stack.

Chapter 9: Types

A while ago, almost all variables got an extra type, like int32_t, uint8_t
etc... to make them of identical size on all platforms (defined in
compiler specific headers or in src/portab.h). You should use those types
only when really needed. When defining a loop variable, you must use
platform natural types defined as 'int'.
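
A minimal sketch of that rule, assuming the fixed-width types come from
<stdint.h> (inside the Xvid tree they would come from src/portab.h). The
function name sum_bytes() is hypothetical and only serves the example.

```c
#include <stdint.h>

/* Sum the first `length` bytes of `src`. */
uint32_t
sum_bytes(const uint8_t * src, int length)
{
	uint32_t sum = 0;	/* fixed width: the size of the result matters */
	int i;			/* natural type: it is only a loop counter */

	for (i = 0; i < length; i++)
		sum += src[i];

	return sum;
}
```

Only the data whose size is part of the contract gets a sized type; the
counter stays a plain 'int'.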
This makes the code faster because the compiler maps the int to the
natural size of the CPU: 32 bit on 32 bit CPUs (x86, powerpc) and 64 bit
on 64 bit CPUs (ultra sparc, AMD hammer, Intel Itanium, Motorola
powerpc64).

But don't forget that the minimum platform targeted by the Xvid library is
a 32 bit CPU. So an 'int' should (never say 'is' in such a case) always be
32 bit long (or bigger).

Chapter 10: Portability

The code _must_ be portable. Don't use functions specific to a
compiler/OS/libC. Don't use compiler specific pragmas or syntax
extensions.

The code _must_ be ANSI C compliant to ease portability on exotic
platforms with only ANSI C compilers, which are much more widespread than
ISO C99 compilers.

Btw, if you have to use those deprecated/non-portable features, then use
src/portab.h to write a wrapper for each targeted system.

For the moment, the supported platforms are:

 - Win32 family (x86) with VC++
 - Cygwin env (bypassing the cygwin.dll with -mno-cygwin flag)
 - Mingw + Msys
 - GNU/Linux (x86 and ppc for optimized code, or any other arch for the
   pure C version of the library)
 - *BSD (same archs as GNU/Linux)
 - Solaris.

Last edited: $Date: 2010-12-22 16:52:52 $
xvidcore/ChangeLog-1.00000664000076500007650000065260310161274301015540 0ustar xvidbuildxvidbuild
# Ed.Gomez: This ChangeLog is generated from a personal tree maintained
# under the arch revision control tool. That's why dates may be skewed. I
# also removed all my email addresses from the output because they are not
# relevant.
######################################################################### # 1.0.3 (Bitstream Version 37) ######################################################################### 2004-12-19 11:25:10 GMT patch-63 Summary: Trellis overflow for quant<=2 Revision: xvidcore--stable--1.0--patch-63 From skal: * Don't call trellis optimization if quant <= 2 as the code overflows modified files: src/utils/mbtransquant.c 2004-11-24 21:25:35 GMT patch-62 Summary: Fixed stride in vfw frontend. Revision: xvidcore--stable--1.0--patch-62 From pete: * Fixed the way stride is computed in the VFW frontend. (Same cure as for the DShow frontend) modified files: vfw/src/codec.c 2004-11-24 21:09:45 GMT patch-61 Summary: Fixed stride in DShow decoder. Revision: xvidcore--stable--1.0--patch-61 From pete: * Fixed the way stride is computed in DShow filter modified files: dshow/src/CXvidDecoder.cpp 2004-11-24 21:05:54 GMT patch-60 Summary: Fixed DiamondSearch Revision: xvidcore--stable--1.0--patch-60 From sysKin: * Fixed DiamondSearch, wrong directions were used in two cases. modified files: src/motion/estimation_common.c 2004-10-12 20:59:17 GMT patch-59 Summary: Don't read too short streams. Revision: xvidcore--stable--1.0--patch-59 From sysKin: * Dont even try to read bitstreams shorter than 4 bytes (nb: 4 bytes == size of startcodes). modified files: src/bitstream/bitstream.c 2004-10-12 20:33:59 GMT patch-58 Summary: 64bit fixes Revision: xvidcore--stable--1.0--patch-58 From Andre Werthmann (wertmann at aei dot mpg dot de): - uint vs int cleanups for addresses. This fixes various problems for 64bit platforms. modified files: src/image/interpolate8x8.h src/image/qpel.h src/motion/estimation_bvop.c src/motion/motion_comp.c 2004-10-12 19:22:53 GMT patch-57 Summary: ME fix. Revision: xvidcore--stable--1.0--patch-57 From gruel: * Diamond search sets iDirection to 0 preventing further searches. 
modified files: src/motion/estimation_common.c ######################################################################### # 1.0.2 (Bitstream Version 36) ######################################################################### 2004-08-29 11:35:02 GMT patch-56 Summary: ChangeLog update Revision: xvidcore--stable--1.0--patch-56 ChangeLog update modified files: ChangeLog 2004-08-29 11:24:26 GMT patch-55 Summary: Merged one important forgotten bugfix from head Revision: xvidcore--stable--1.0--patch-55 Merged one important forgotten bugfix from head Patches applied: * ed.gomez@free.fr--2004-1/xvidcore--head--0.0--patch-70 Out of bounds MVs clipping * ed.gomez@free.fr--2004-1/xvidcore--head--0.0--patch-71 Decoder optimization (fixing regression) modified files: src/decoder.c new patches: ed.gomez@free.fr--2004-1/xvidcore--head--0.0--patch-70 ed.gomez@free.fr--2004-1/xvidcore--head--0.0--patch-71 2004-08-29 10:51:58 GMT patch-54 Summary: Marking 1.0.2 Revision: xvidcore--stable--1.0--patch-54 From ed.gomez: * Marking 1.0.2 modified files: ChangeLog build/generic/configure.in src/xvid.c src/xvid.h 2004-08-22 13:08:44 GMT patch-53 Summary: Thread safety problem in idct C version Revision: xvidcore--stable--1.0--patch-53 From ed.gomez: * Fixed a thread safety problem in C version of the idct function. Added some comments on some static data not marked as RO. modified files: src/bitstream/mbcoding.c src/dct/idct.c 2004-08-21 11:45:55 GMT patch-52 Summary: Stupid typo+error in fdct_xxx_skal macro generator. Revision: xvidcore--stable--1.0--patch-52 From Nicolas Boulay: * Found a typo mistake (ecx->eax) and an error in the same line But as we're lucky, the unrolled version was bugfree, and that is that one which is used. 
modified files: src/dct/x86_asm/fdct_mmx_skal.asm 2004-07-26 20:21:24 GMT patch-51 Summary: ChangeLog Update Revision: xvidcore--stable--1.0--patch-51 ChangeLog Update modified files: ChangeLog 2004-07-24 11:33:57 GMT patch-50 Summary: BVOP direct/interpolated ref block rounding fix. Revision: xvidcore--stable--1.0--patch-50 From ed.gomez: * BVOP direct/interpolated ref block rounding fix. It's been using rounding=1 for averaging stage since ever. The standard says it's rounding=0. See standard clause 7.6.9.4 for explicit code and Section 6.3.5 that says "rounding=0" in bframes as they don't set the vop_rounding_type in VOP header. Both sections match, xvid was wrong modified files: src/decoder.c 2004-07-23 20:37:09 GMT patch-49 Summary: Removed data qualifer in .rodata Revision: xvidcore--stable--1.0--patch-49 From ed.gomez: * long standing warning by yasm... data isn't a keyword for .(ro)data sections. modified files: src/bitstream/x86_asm/cbp_mmx.asm src/bitstream/x86_asm/cbp_sse2.asm src/dct/x86_asm/fdct_mmx_ffmpeg.asm src/dct/x86_asm/fdct_mmx_skal.asm src/dct/x86_asm/fdct_sse2_skal.asm src/dct/x86_asm/idct_3dne.asm src/dct/x86_asm/idct_mmx.asm src/dct/x86_asm/idct_sse2_dmitry.asm src/dct/x86_asm/simple_idct_mmx.asm src/image/x86_asm/colorspace_rgb_mmx.asm src/image/x86_asm/colorspace_yuyv_mmx.asm src/image/x86_asm/interpolate8x8_3dn.asm src/image/x86_asm/interpolate8x8_3dne.asm src/image/x86_asm/interpolate8x8_mmx.asm src/image/x86_asm/interpolate8x8_xmm.asm src/image/x86_asm/qpel_mmx.asm src/image/x86_asm/reduced_mmx.asm src/motion/x86_asm/sad_3dn.asm src/motion/x86_asm/sad_3dne.asm src/motion/x86_asm/sad_mmx.asm src/motion/x86_asm/sad_sse2.asm src/motion/x86_asm/sad_xmm.asm src/quant/x86_asm/quantize_h263_3dne.asm src/quant/x86_asm/quantize_h263_mmx.asm src/quant/x86_asm/quantize_mpeg_mmx.asm src/quant/x86_asm/quantize_mpeg_xmm.asm src/utils/x86_asm/cpuid.asm src/utils/x86_asm/interlacing_mmx.asm src/utils/x86_asm/mem_transfer_3dne.asm 
src/utils/x86_asm/mem_transfer_mmx.asm 2004-07-19 18:45:14 GMT patch-48 Summary: Complete previous xvid_decraw patch Revision: xvidcore--stable--1.0--patch-48 Complete previous xvid_decraw patch modified files: examples/xvid_decraw.c 2004-07-18 11:58:48 GMT patch-47 Summary: ISO C99'ism fix Revision: xvidcore--stable--1.0--patch-47 ISO C99'ism fix modified files: src/encoder.c 2004-07-17 11:32:42 GMT patch-46 Summary: Make sure time incr is never larger than 16bit. Revision: xvidcore--stable--1.0--patch-46 From ed.gomez: * Keep both fbase and fincr under 16bit limit. modified files: src/encoder.c 2004-07-17 10:00:42 GMT patch-45 Summary: Future version interoperability Revision: xvidcore--stable--1.0--patch-45 From ed.gomez: * Zeroing the structures is the best way not to pass wrong data when dealing with slightly new XviD (like head). If we don't zero here, then brightness is not initialized in HEAD leading to crash. modified files: examples/xvid_decraw.c 2004-07-10 17:30:40 GMT patch-44 Summary: ChangeLog update Revision: xvidcore--stable--1.0--patch-44 ChangeLog update modified files: ChangeLog 2004-07-10 17:27:06 GMT patch-43 Summary: Small mem leak in vfw. Revision: xvidcore--stable--1.0--patch-43 From sysKin: * Free zones. modified files: vfw/src/codec.c 2004-07-10 16:55:53 GMT patch-42 Summary: Fix wrong matrix reading logic. Revision: xvidcore--stable--1.0--patch-42 From sysKin: * Fix a problem reading quantization matrix. modified files: src/bitstream/bitstream.c 2004-06-26 21:26:35 GMT patch-41 Summary: low delay guessing (il)logic fix. Revision: xvidcore--stable--1.0--patch-41 From sysKin: * bframes were decoded wrong when trying to guess low_delay flag as specified in the standard when vol_control_parameters aren't available. modified files: src/decoder.c 2004-06-13 19:15:05 GMT patch-40 Summary: Small memory error in ia32 cpuid function. Revision: xvidcore--stable--1.0--patch-40 From ed.gomez: * Valgrind detected a write to suspicious stack space. 
To avoid any false reporting, added an explicit stack space allocation. modified files: src/utils/x86_asm/cpuid.asm ######################################################################### # 1.0.1 (Bitstream Version 35) ######################################################################### 2004-06-05 22:55:56 GMT patch-39 Summary: Marking 1.0.1 release Revision: xvidcore--stable--1.0--patch-39 Marking 1.0.1 release modified files: ChangeLog TODO build/generic/configure.in src/xvid.c src/xvid.h 2004-06-02 20:58:38 GMT patch-38 Summary: DC clipping bug for real Revision: xvidcore--stable--1.0--patch-38 From ed.gomez: * patch-25 was supposed to fix a DC clipping bug. However i added the additional clipping code in the wrong place. But at least, my fix didn't cause any trouble, it was just noop. This patch should really fix this very "unlikely bug" (i just want to remind the reader that this bug isn't easy to trigger, and eg: my test sequences don't trigger it at all) BS version incremented: 35 modified files: src/decoder.c src/motion/estimation_rd_based.c src/prediction/mbprediction.c src/prediction/mbprediction.h src/xvid.h 2004-05-31 21:11:49 GMT patch-37 Summary: time fixes to decoder. Revision: xvidcore--stable--1.0--patch-37 From ed.gomez: * timestamps were badly computed by teh decoder in some corner cases (1fps). This bug revealed that, timestamps were indeed wrong as expected, but that bvop blocks in direct mode (vectors interpolated) were somewhat compensated with wrong vectors in these same corner cases. modified files: src/bitstream/bitstream.c src/decoder.c src/decoder.h 2004-05-30 09:36:13 GMT patch-36 Summary: Wrong license header. Revision: xvidcore--stable--1.0--patch-36 From ed.gomez: * Pascal did agree a plain GPL migration long ago, but this file remained GPL+location restriction. modified files: src/image/reduced.c 2004-05-29 09:02:25 GMT patch-35 Summary: More missing va_end() calls. 
Revision: xvidcore--stable--1.0--patch-35 From pete: * portab.h is plenty of missing calls to va_end(). modified files: src/portab.h 2004-05-28 21:28:21 GMT patch-34 Summary: FPS=1 problem in decoder. Revision: xvidcore--stable--1.0--patch-34 From ed.gomez: * patch-24 did fix bad behavior in encoder, so at least, compliant streams were generated but the decoder was still doing the maths a wrong way. Apply same logic to decoder. Thanks to the patch-24 bug reporter for this followup. modified files: src/bitstream/bitstream.c 2004-05-27 20:04:01 GMT patch-33 Summary: Nasty typo in pvop vector lambdas. Revision: xvidcore--stable--1.0--patch-33 From sysKin: * s/+/*/ in the lambda value array for vectors in the pvop estimation module. modified files: src/motion/estimation_pvop.c 2004-05-26 13:23:38 GMT patch-32 Summary: Bits/Bytes confusion in the VFW frontend. Revision: xvidcore--stable--1.0--patch-32 From sysKin: * confusion between the kilo, in kilobits (1000) and the kilo in kilobytes (1024, should be named KiB anyway) * biSizeImage is in bytes, not bits according to the Win32 API. modified files: vfw/src/codec.c vfw/src/config.c 2004-05-26 09:28:31 GMT patch-31 Summary: Close variable argument list. Revision: xvidcore--stable--1.0--patch-31 From ed.gomez: * Close the variable argument list as specified by the ANSI C standard. Reported by Carsten on xvid-devel. modified files: src/image/font.c 2004-05-26 09:00:26 GMT patch-30 Summary: ICM compatibility for VFW Revision: xvidcore--stable--1.0--patch-30 From sysKin: * Makes the VFW frontend compatible with ICM applications (Ooo, MS Office... etc). Reported on IRC. modified files: vfw/src/config.c vfw/src/driverproc.c 2004-05-26 08:58:56 GMT patch-29 Summary: Small trellis bug Revision: xvidcore--stable--1.0--patch-29 From sysKin: * Last coeff wasn't summed. Reported by Jean Marc. modified files: src/utils/mbtransquant.c 2004-05-26 08:46:45 GMT patch-28 Summary: Small bug in bframe ME. 
Revision: xvidcore--stable--1.0--patch-28 From sysKin: * Small bug in bframe ME. modified files: src/motion/estimation_bvop.c ######################################################################### # 1.0.0 final (Bitstream Version 34) ######################################################################### 2004-05-08 22:26:06 GMT patch-27 Summary: Marking 1.0.0 final Revision: xvidcore--stable--1.0--patch-27 From ed.gomez: * Marking 1.0.0 final \o/ modified files: ChangeLog build/generic/configure.in src/xvid.h 2004-05-06 17:56:52 GMT patch-26 Summary: Small mismatch in hint<->widget in VFW Revision: xvidcore--stable--1.0--patch-26 From sysKin: * Small mismatch in hint<->widget. modified files: vfw/src/resource.rc 2004-05-02 22:40:50 GMT patch-25 Summary: DC prediction fix. Revision: xvidcore--stable--1.0--patch-25 From ed.gomez: * DC predictors weren't clipped to the [-2048, 2047] range. BS version increased to 33 Thanks to jnorish on our forums to point out the problem. modified files: src/bitstream/bitstream.c src/decoder.c src/motion/estimation_rd_based.c src/prediction/mbprediction.c src/prediction/mbprediction.h src/xvid.h 2004-05-02 10:30:29 GMT patch-24 Summary: Possible VOL header corruption. Revision: xvidcore--stable--1.0--patch-24 From ed.gomez: * The VOL header could be corrupted when passing fincr=fbase=1 which happens for fps=1 sequences. BS version bumped up to 32 Original report: http://www.xvid.org/modules.php?op=modload&name=phpBB2&file=viewtopic&t=2026&highlight= modified files: src/bitstream/bitstream.c src/xvid.h 2004-04-30 23:10:19 GMT patch-23 Summary: Some very light Unix build system changes Revision: xvidcore--stable--1.0--patch-23 To prepare testing framework merging. 
From ed.gomez: * Some typos * Copyright updates (it's 2004 since a few months ;-) * Added some checking to bootstrap.sh * Added m4 AC_PREREQ macro to configure.in modified files: build/generic/Makefile build/generic/bootstrap.sh build/generic/configure.in 2004-04-20 19:40:29 GMT patch-22 Summary: Small visual fix. Revision: xvidcore--stable--1.0--patch-22 From sysKin: * Small visual fix modified files: vfw/src/config.c 2004-04-20 19:38:24 GMT patch-21 Summary: Fix crash in decoder for non IFrame 1st frame. Revision: xvidcore--stable--1.0--patch-21 From sysKin: * Fixed the crash caused by non IFrame 1st frame. modified files: src/decoder.c 2004-04-18 16:21:50 GMT patch-20 Summary: Typo Revision: xvidcore--stable--1.0--patch-20 Typo modified files: vfw/src/resource.rc 2004-04-17 17:04:20 GMT patch-19 Summary: vfw opens audio file in shared access mode Revision: xvidcore--stable--1.0--patch-19 vfw opens audio file in shared access mode modified files: vfw/src/config.c 2004-04-15 22:39:12 GMT patch-18 Summary: Tiny xvid_decraw cleaning Revision: xvidcore--stable--1.0--patch-18 Tiny xvid_decraw cleaning modified files: examples/xvid_decraw.c 2004-04-15 19:14:31 GMT patch-17 Summary: Tiny minor fixes for msvc. Revision: xvidcore--stable--1.0--patch-17 From pete: * Missing arch endianness define in project files. * Add a textual warning about win32 console EOF misreading. * Prevent a SIGFPE when no frames were decoded. modified files: build/win32/xvid_decraw.dsp build/win32/xvid_encraw.dsp examples/xvid_decraw.c 2004-04-14 22:41:07 GMT patch-16 Summary: Fixed missing 1st frame in dshow output. Revision: xvidcore--stable--1.0--patch-16 From sysKin: * decoder flags were overwritten, this was preventing from outputing the first frame immediatly. modified files: dshow/src/CXvidDecoder.cpp 2004-04-14 22:39:17 GMT patch-15 Summary: Ressource leaking in dshow. Revision: xvidcore--stable--1.0--patch-15 From sysKin: * Same kind of ressource leaking as in vfw. Same cure. 
modified files: dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h 2004-04-14 19:45:23 GMT patch-14 Summary: Fixed small bug in trellis code. Revision: xvidcore--stable--1.0--patch-14 From ed.gomez (spotted by jean marc): * Trellis optimization was computing the sum |coeffs| wrongly because the Compute_sum function wasn't doing zigzag scanning and stopped at last non zero coeffs in linear scanning... The trivial fix could have been to pass the zigzag to the compute sum function so it could have done its job right. But... Why computing stuff that is already known in the upper layer ? I removed the compute sum function, and just pass the original sum value in trellis function parameters, just in case we have to return it again because trellis failed optimizing the block coeffs. This fix the bug and saves cycles :-) (but should not be noticeable) modified files: src/utils/mbtransquant.c src/xvid.h 2004-04-12 12:06:12 GMT patch-13 Summary: Don't do SAD and RD based searches for qp. Revision: xvidcore--stable--1.0--patch-13 From sysKin: * MakeGoodFlags function wasn't disabling SAD based search when using RD. This was causing slowdown for no gain at all. This patch should speed up encoding in qp mode. modified files: src/motion/estimation_pvop.c 2004-04-12 12:01:19 GMT patch-12 Summary: 3dnow functions proper separation. Revision: xvidcore--stable--1.0--patch-12 From pete (thx to a forum report): * Separate correctly pure 3dnow functions and 3dnow+mmxext functions. This fix "Illegal instruction" crash on old k6-2 CPUs. modified files: src/xvid.c 2004-04-12 11:57:20 GMT patch-11 Summary: Better MV clipping code. Revision: xvidcore--stable--1.0--patch-11 From sysKin: * Better MV clipping. DivX 5 generates out of range vectors and clipping them directly borks the decoding of other MVs for which out of range MVs were predictors. So it's just better to clip them for the block decoding and keep an unclipped version for predictions. 
modified files: src/decoder.c 2004-04-08 20:34:54 GMT patch-10 Summary: PGM support back in xvid_decraw. Revision: xvidcore--stable--1.0--patch-10 From ed.gomez: * pgm/pnm format added back to xvid_decraw + pgm/pnm formats are now default for yv12/i420/rgb24 pixel format. + tga is default for rgb16/32 + use option -f to choose tga, or pnm/pgm modified files: examples/xvid_decraw.c 2004-04-07 22:30:15 GMT patch-9 Summary: 3DNow Ext functions use MMXEXT opcodes. Revision: xvidcore--stable--1.0--patch-9 From Soltius (XviD Forum): * Most of 3dnow extension functions do use MMXEXT opcodes, so classify these functions as 3dnowext+mmxext. Avoids K6-2 boxes to crash with an invalid instruction error reported by the host OS. PS: original bug report http://www.xvid.org/modules.php?op=modload&name=phpBB2&file=viewtopic&t=1656 modified files: src/xvid.c 2004-04-07 22:01:54 GMT patch-8 Summary: RGB 16bit output fix. Revision: xvidcore--stable--1.0--patch-8 From ScarletteTout (XviD Forum): * Fix RGB 16bit output in C functions. From ed.gomez: * Replaced PGM output by TGA output so it's easy to implement RGB 16/24/32 and greyscale bitmaps support in a single format. (pgm could have supported RGB 24 and Greyscale only) * Added colorspace choice to xvid_decraw Use option -c csp, where csp is either rgb16, rgb24, rgb32, yv12 or i420 Defaults to i420. PS: original bug report http://www.xvid.org/modules.php?op=modload&name=phpBB2&file=viewtopic&t=1960&highlight= modified files: examples/xvid_decraw.c src/image/colorspace.c ######################################################################### # 1.0.0 RC4 (Bitstream Version 30) ######################################################################### 2004-04-04 20:21:38 GMT patch-7 Summary: DShow widget hiding. Revision: xvidcore--stable--1.0--patch-7 From Michael: * No need to keep widget visibles if they won't be in 1.0.0. 
modified files: dshow/src/xvid.ax.rc 2004-04-04 20:17:52 GMT patch-6 Summary: Compiler quirk in portab.h Revision: xvidcore--stable--1.0--patch-6 From Michael: * The VC.NET workaround was causing trouble. Inversed the test. modified files: src/portab.h 2004-04-04 14:19:10 GMT patch-5 Summary: Marking RC4 Revision: xvidcore--stable--1.0--patch-5 Marking RC4 modified files: ChangeLog build/generic/configure.in src/xvid.h 2004-04-04 14:07:00 GMT patch-4 Summary: Frame dropping disabling for bframes. Revision: xvidcore--stable--1.0--patch-4 From sysKin & Pete: * Disable frame dropping with bframes enabled. These two options do not play fine together. modified files: src/encoder.c 2004-04-04 14:05:50 GMT patch-3 Summary: Dead code removal. Revision: xvidcore--stable--1.0--patch-3 From sysKin: * FrameCodeP was always called with contanst parameters. Removed these parameters and associated dead code. modified files: src/encoder.c 2004-04-04 14:03:42 GMT patch-2 Summary: Typo in ME fast comparison. Revision: xvidcore--stable--1.0--patch-2 From sysKin: * Small typo in Fast ME code. modified files: src/motion/estimation_common.c 2004-04-02 23:58:19 GMT patch-1 Summary: VFW Resource leak fix (try #2) Revision: xvidcore--stable--1.0--patch-1 From Suiryc on IRC: * both encoder and decoder ending functions were calling the dll freeing code. This was an error as the first function called would unbind core function for the second called one. Thus xvidcore could not release buffers. modified files: vfw/src/codec.c vfw/src/driverproc.c 2004-04-02 20:33:02 GMT base-0 Summary: tag of ed.gomez@free.fr--2004-1/xvidcore--devapi4--1.0--patch-53 Revision: xvidcore--stable--1.0--base-0 (automatically generated log message) 2004-03-31 19:32:47 GMT patch-53 Summary: Ressources leaking in VFW. Revision: xvidcore--devapi4--1.0--patch-53 From sysKin: * Storing ressources in global vars is making multithreaded/instanced apps leaking lot of memory. Moved these vars to codec struct. 
Thanks to dalox for spotting and fixing the bug. modified files: vfw/src/codec.c vfw/src/codec.h vfw/src/driverproc.c 2004-03-31 19:28:51 GMT patch-52 Summary: Fix to bad NVOP+bframe interaction. Revision: xvidcore--devapi4--1.0--patch-52 From syskin: * When generating a NVOP, it interferes with bframe flushing and packing. modified files: src/encoder.c 2004-03-31 19:24:42 GMT patch-51 Summary: Fix GMC 2 warp point. Revision: xvidcore--devapi4--1.0--patch-51 From Skal: * Fix 2 warp points GMC. modified files: src/motion/gmc.c 2004-03-31 19:18:46 GMT patch-50 Summary: Added intra quant testing. Revision: xvidcore--devapi4--1.0--patch-50 From Skal: * Added intra quant testing to xvid_bench modified files: examples/xvid_bench.c 2004-03-31 19:07:55 GMT patch-49 Summary: input width/height check Revision: xvidcore--devapi4--1.0--patch-49 From Skal: * Input width and height aren't checked and this can cause a crash. modified files: src/encoder.c 2004-03-28 01:02:21 GMT patch-48 Summary: $ CVS expansion removed Revision: xvidcore--devapi4--1.0--patch-48 $ CVS expansion removed modified files: dshow/src/Configure.cpp dshow/src/config.c examples/xvid_encraw.c 2004-03-28 00:45:23 GMT patch-47 Summary: Allow bigger frames Revision: xvidcore--devapi4--1.0--patch-47 Allow bigger frames modified files: examples/xvid_encraw.c 2004-03-28 00:41:54 GMT patch-46 Summary: Fix in postproc header Revision: xvidcore--devapi4--1.0--patch-46 Fix in postproc header modified files: src/image/postprocessing.h 2004-03-28 00:33:02 GMT patch-45 Summary: Fixed xvidvfw build on real mingw+msys systems Revision: xvidcore--devapi4--1.0--patch-45 Fixed xvidvfw build on real mingw+msys systems modified files: vfw/bin/Makefile 2004-03-15 21:48:48 GMT patch-44 Summary: VFW updates. Revision: xvidcore--devapi4--1.0--patch-44 VFW updates. modified files: vfw/src/config.c vfw/src/resource.rc 2004-03-15 21:44:17 GMT patch-43 Summary: GMC bugfix. Revision: xvidcore--devapi4--1.0--patch-43 From skal: * GMC bugfixes.
modified files: src/motion/gmc.c 2004-03-15 21:41:18 GMT patch-42 Summary: Buffer overrun fix in post proc. Revision: xvidcore--devapi4--1.0--patch-42 From sysKin: * Fixed buffer overrun in postproc code. modified files: src/image/postprocessing.c src/image/postprocessing.h 2004-03-15 21:33:22 GMT patch-41 Summary: Forgotten files. Revision: xvidcore--devapi4--1.0--patch-41 Me: * Sorry i missed these file additions. Important for the packages as i build them from the tla archive. new files: dshow/src/.arch-ids/Configure.cpp.id dshow/src/.arch-ids/config.c.id dshow/src/.arch-ids/config.h.id dshow/src/.arch-ids/debug.h.id dshow/src/Configure.cpp dshow/src/config.c dshow/src/config.h dshow/src/debug.h 2004-03-03 21:01:09 GMT patch-40 Summary: MV clipping in decoder. Revision: xvidcore--devapi4--1.0--patch-40 From sysKin: * clip MVs to valid ranges. modified files: src/decoder.c src/decoder.h ######################################################################### # 1.0.0 RC3 (Bitstream Version 29) ######################################################################### 2004-02-29 13:17:10 GMT patch-39 Summary: Marking 1.0.0 RC3 Revision: xvidcore--devapi4--1.0--patch-39 * Marking RC3 modified files: ChangeLog build/generic/configure.in src/xvid.h 2004-02-29 13:08:38 GMT patch-38 Summary: Win32 project fixes for xvid_encraw and xvid_decraw. Revision: xvidcore--devapi4--1.0--patch-38 From sysKin: * xvid_dec/encraw were linking against libxvidcore.lib. Now they link against xvidcore.dll.a NB: xvid_bench is left as is though it doesn't link. This program requires access to internal functions which aren't available through the dll link lib. No good solution exists for MSVC to build both a dll+its link lib+static lib.
modified files: build/win32/xvid_decraw.dsp build/win32/xvid_encraw.dsp 2004-02-29 12:56:36 GMT patch-37 Summary: DShow updates Revision: xvidcore--devapi4--1.0--patch-37 Bunch of DShow updates modified files: dshow/src/CXvidDecoder.cpp dshow/src/resource.h dshow/src/xvid.ax.rc 2004-02-29 12:55:41 GMT patch-36 Summary: VFW updates. Revision: xvidcore--devapi4--1.0--patch-36 * Bunch of VFW frontends updates. modified files: vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/resource.rc vfw/src/status.c vfw/vfw.dsp 2004-02-29 12:49:24 GMT patch-35 Summary: Artefact workaround in bframes. Revision: xvidcore--devapi4--1.0--patch-35 From sysKin: * Workaround for some artefacts appearing in bframes. modified files: src/motion/estimation_bvop.c 2004-02-29 12:46:13 GMT patch-34 Summary: Decoder buffer overflow fix. Revision: xvidcore--devapi4--1.0--patch-34 From sysKin: * Avoids buffer overflow when reading the last align byte. Our bitstream reader does 4 byte reads because of some platform constraints (ARM), which can cause buffer overflow reads. modified files: src/decoder.c 2004-02-29 11:53:47 GMT patch-33 Summary: Compatibility decoding for old bitstreams. Revision: xvidcore--devapi4--1.0--patch-33 From syskin: * old core versions used in dev-api-3 distributed by nearly all win32 bin builders used to have an edging bug. So when this information is known, workaround the bug. modified files: src/decoder.c src/encoder.c src/image/image.c src/image/image.h ######################################################################### # 1.0.0 RC2 (Bitstream Version 28) ######################################################################### 2004-02-08 01:06:40 GMT patch-32 Summary: Marking RC2 Revision: xvidcore--devapi4--1.0--patch-32 Marking RC2 modified files: ChangeLog build/generic/configure.in 2004-02-07 13:54:24 GMT patch-31 Summary: Win32 project outputs dll lib for linking.
Revision: xvidcore--devapi4--1.0--patch-31 From pete: * output a lib to link against the dll (xvidcore.dll.a). modified files: build/win32/libxvidcore.dsp 2004-02-07 13:51:01 GMT patch-30 Summary: DShow update. Revision: xvidcore--devapi4--1.0--patch-30 From pete: * cmd line driving From sysKin(?): * bugfixes related to video flipping * bugfix for the 'crash at the end' bug modified files: TODO dshow/dshow.dsp dshow/src/CAbout.cpp dshow/src/CAbout.h dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h dshow/src/xvid.ax.def 2004-02-07 13:47:45 GMT patch-29 Summary: VFW updates Revision: xvidcore--devapi4--1.0--patch-29 Sorry feeling lazy about splitting this patch... From peter: * bitrate calculator From sysKin: * WMP9 bugfix modified files: TODO vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.h vfw/src/resource.rc 2004-02-07 13:43:26 GMT patch-28 Summary: Implicit overflow tuning for 2nd pass. Revision: xvidcore--devapi4--1.0--patch-28 From sysKin: * When doing a bigger 2nd pass, the overflow loop must be more aggressive, otherwise no bonus bits are reinjected. So we can auto tune the overflow values in that case. modified files: src/plugins/plugin_2pass2.c 2004-02-07 13:38:33 GMT patch-27 Summary: GMC+interlaced bugfix in decoder. Revision: xvidcore--devapi4--1.0--patch-27 From sysKin: * GMC+interlaced bugfix in decoder. modified files: src/decoder.c 2004-02-07 13:35:16 GMT patch-26 Summary: Reverted patch-23 Revision: xvidcore--devapi4--1.0--patch-26 From christoph: * reverted patch-23, old code was right. * Important typo for the YVYU csp (passing the y plane instead of u). modified files: src/image/image.c src/xvid.h 2004-01-31 11:20:36 GMT patch-25 Summary: DShow support for more mpeg4 fourccs. Revision: xvidcore--devapi4--1.0--patch-25 From sysKin: * Added support for the MP4V fourcc.
modified files: dshow/src/CAbout.cpp dshow/src/CAbout.h dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h dshow/src/resource.h dshow/src/xvid.ax.rc 2004-01-31 11:12:38 GMT patch-24 Summary: DivX decoder compatibility Revision: xvidcore--devapi4--1.0--patch-24 From sysKin: * DivX decoder compatibility improved for packed bitstreams. It should now detect them and play them fine. modified files: src/bitstream/bitstream.c 2004-01-31 11:10:26 GMT patch-23 Summary: YV12/I420 confusion fixed. Revision: xvidcore--devapi4--1.0--patch-23 From christoph: * I420/YV12 were swapped since ... ages. * CSP_USER renamed to CSP_PLANAR modified files: src/encoder.c src/image/image.c src/xvid.h 2004-01-31 10:53:20 GMT patch-22 Summary: Arch separation for mem transfer functions Revision: xvidcore--devapi4--1.0--patch-22 Arch separation for mem transfer functions modified files: src/utils/mem_transfer.h 2004-01-27 14:47:08 GMT patch-21 Summary: Write to registry Flip video flag in dshow Revision: xvidcore--devapi4--1.0--patch-21 From sysKin: * The flip video flag is now saved in registry. * Changed internal flags name convention (use n prefix for all now) modified files: dshow/src/CAbout.cpp dshow/src/CAbout.h dshow/src/CXvidDecoder.cpp 2004-01-27 14:42:52 GMT patch-20 Summary: Added bitrate calc to VFW Revision: xvidcore--devapi4--1.0--patch-20 From Pete: * Added bitrate calculator. * Changed up a few function calls to static type. modified files: vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/config.h vfw/src/resource.h vfw/src/resource.rc vfw/vfw.dsp 2004-01-27 14:29:49 GMT patch-19 Summary: Bugfix in decoder Revision: xvidcore--devapi4--1.0--patch-19 From sysKin: * when stats are not used, don't write to the stats pointer. 
modified files: src/decoder.c ######################################################################### # 1.0.0 RC1 (Bitstream Version 26) ######################################################################### 2004-01-25 16:01:06 GMT patch-18 Summary: Marking RC1 Revision: xvidcore--devapi4--1.0--patch-18 Marking RC1 modified files: ChangeLog build/generic/configure.in src/xvid.h 2004-01-25 15:37:57 GMT patch-17 Summary: VFW update (again) Revision: xvidcore--devapi4--1.0--patch-17 From sysKin: * Added Constant Quant encoding. modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.rc 2004-01-25 15:35:38 GMT patch-16 Summary: Missing MB quants for PP. Revision: xvidcore--devapi4--1.0--patch-16 From sysKin: * MB Quants are used by the PP code, so don't forget to update them even if the block is skipped or not coded. modified files: src/decoder.c 2004-01-23 13:25:52 GMT patch-15 Summary: VFW update (again) Revision: xvidcore--devapi4--1.0--patch-15 From sysKin: * Status window updates. * Big resource.h cleanup, it seems msvc isn't able to do it automatically. modified files: vfw/src/resource.h vfw/src/resource.rc vfw/src/status.c 2004-01-23 11:17:20 GMT patch-14 Summary: VFW gcc warnings Revision: xvidcore--devapi4--1.0--patch-14 VFW gcc warnings modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h 2004-01-22 20:54:53 GMT patch-13 Summary: DShow updates. Revision: xvidcore--devapi4--1.0--patch-13 From sysKin: * Fixed registry params type. Bool cannot be used or something weird happens when writing to registry * Defaults set to what the Reset widget sets. modified files: dshow/src/CAbout.h dshow/src/CXvidDecoder.cpp 2004-01-22 20:35:27 GMT patch-12 Summary: VFW updates. Revision: xvidcore--devapi4--1.0--patch-12 From sysKin: * GUI improvements. * Stats fixing. * Automatic config clear upon installation. * Added postprocessing options in there too.
modified files: vfw/bin/xvid.inf vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/config.h vfw/src/resource.h vfw/src/resource.rc vfw/src/status.c vfw/src/status.h 2004-01-22 20:28:54 GMT patch-11 Summary: Minor updates to text files Revision: xvidcore--devapi4--1.0--patch-11 Minor updates to text files modified files: AUTHORS TODO 2004-01-22 20:27:10 GMT patch-10 Summary: 2pass plugin changes. Revision: xvidcore--devapi4--1.0--patch-10 From sysKin: * Disabled QPel during first pass as well. * Fix a mistaken condition when enabling larger 2nd passes. From ed.gomez: * Fix the fix logic. The previous fix enclosed a condition it should not have touched. So i removed the mistaken condition, which was wrong anyway as stated in the comment, and got back the sane condition test. modified files: src/plugins/plugin_2pass1.c src/plugins/plugin_2pass2.c 2004-01-17 13:03:11 GMT patch-9 Summary: Uninitialized pointers during plugin creation. Revision: xvidcore--devapi4--1.0--patch-9 From sysKin: - plugins which do not require private data were leaving the param2 uninitialized. Just init it to NULL. This bug wasn't causing any trouble anyway... modified files: src/plugins/plugin_dump.c src/plugins/plugin_psnr.c vfw/src/codec.c 2004-01-17 01:09:01 GMT patch-8 Summary: DShow forwards AR information. Revision: xvidcore--devapi4--1.0--patch-8 From syskin(?): - forward AR information to DShow framework. - grayed some widgets. modified files: dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h dshow/src/xvid.ax.rc 2004-01-07 13:50:29 GMT patch-7 Summary: Scaled zones fix. Revision: xvidcore--devapi4--1.0--patch-7 From sysKin (ideas from Koepi iirc): * Fix scaled zones computing prescaled data and so on instead of guessing everything with a global zone weight factor.
modified files: src/plugins/plugin_2pass2.c 2004-01-06 01:06:39 GMT patch-6 Summary: Tab->Spaces in header only Revision: xvidcore--devapi4--1.0--patch-6 Tab->Spaces in header only modified files: src/xvid.h 2004-01-04 18:35:35 GMT patch-5 Summary: Typo fixed Revision: xvidcore--devapi4--1.0--patch-5 Typo fixed modified files: build/generic/configure.in 2004-01-04 13:40:51 GMT patch-4 Summary: VFW safer code. Revision: xvidcore--devapi4--1.0--patch-4 From sysKin: * Protects some parts of the code depending on a previous xvidcore opening. Avoids resource leaks. modified files: vfw/src/codec.c vfw/src/driverproc.c 2004-01-04 13:33:28 GMT patch-3 Summary: Fixes VC debug target name Revision: xvidcore--devapi4--1.0--patch-3 Fixes VC debug target name modified files: vfw/vfw.dsp 2004-01-02 23:10:56 GMT patch-2 Summary: Win32 linking policy revised. Revision: xvidcore--devapi4--1.0--patch-2 Finally Win32 linking policy is to separate all XviD components: - xvidcore.dll exports XviD API - xvidvfw.dll links against xvidcore DLL - xviddshow.dll links against xvidcore DLL From sysKin: * Changed DShow linking policy in VS project file. * Changed VFW linking policy in VS project file. * Added runtime xvidcore.dll loading in DShow and VFW. * Installs xvidcore.dll alongside xvidvfw.dll. From ed.gomez: * Changed libxvidcore.dll mingw32/cygwin target name to xvidcore.dll in the configure script. * Changed xvid.dll VFW target name to xvidvfw.dll in the generic Makefile. PS: unlike CVS, i reverted back to MS build tools in VS project files modified files: build/generic/configure.in build/win32/libxvidcore.dsp dshow/dshow.dsp dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h vfw/bin/Makefile vfw/bin/sources.inc vfw/bin/xvid.inf vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/config.h vfw/vfw.dsp 2004-01-02 22:02:07 GMT patch-1 Summary: Better seeking in dshow Revision: xvidcore--devapi4--1.0--patch-1 From Michael: * Better DShow seeking.
modified files: dshow/src/CXvidDecoder.cpp 2004-01-02 12:28:39 GMT base-0 Summary: tag of ed.gomez@free.fr--2003-1/xvidcore--devapi4--1.0--patch-162 Revision: xvidcore--devapi4--1.0--base-0 Archive cycling... ######################################################################### # 1.0.0 beta3 (Bitstream Version 25) ######################################################################### 2003-12-26 22:21:35 GMT patch-162 Summary: Marking 1.0.0 beta3 Revision: xvidcore--devapi4--1.0--patch-162 Marking beta3 modified files: ChangeLog TODO build/generic/configure.in src/xvid.h 2003-12-25 20:57:52 GMT patch-161 Summary: Thread safe PP. Revision: xvidcore--devapi4--1.0--patch-161 From Michael: * Thread safe PP, context is now stored in DECODER struct. modified files: src/decoder.c src/decoder.h src/image/postprocessing.c src/image/postprocessing.h 2003-12-25 20:49:36 GMT patch-160 Summary: Added Turbo option to VFW GUI Revision: xvidcore--devapi4--1.0--patch-160 From Michael: * Added turbo mode that enables all fast ME flags. modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.h vfw/src/resource.rc 2003-12-25 20:46:16 GMT patch-159 Summary: Cartoon mode usage written to stream. Revision: xvidcore--devapi4--1.0--patch-159 From Michael: * @encoding: write cartoon usage in the version user data. * @decoding: detect cartoon flag appended to version string and force FILM PP disabling.
modified files: src/bitstream/bitstream.c src/bitstream/bitstream.h src/decoder.c src/decoder.h src/encoder.c 2003-12-21 13:34:03 GMT patch-158 Summary: Removed unused var in VFW GUI Revision: xvidcore--devapi4--1.0--patch-158 Removed unused var in VFW GUI modified files: vfw/src/config.c 2003-12-21 13:32:52 GMT patch-157 Summary: Two pass small update Revision: xvidcore--devapi4--1.0--patch-157 from syskin: * allow second pass to be bigger than 1st one (not tested, the quant mapping formula may not be adapted for this usage, so take this change as experimental, and prefer doing second pass still smaller than 1st one) * let ivops benefit from positive overflow. from me: * set frame type in quant zones (was a buglet) modified files: src/plugins/plugin_2pass2.c 2003-12-20 22:28:07 GMT patch-156 Summary: New VFW defaults Revision: xvidcore--devapi4--1.0--patch-156 From michael: * new VFW default values modified files: vfw/src/config.c 2003-12-20 22:12:38 GMT patch-155 Summary: Added ARGB colorspace. Revision: xvidcore--devapi4--1.0--patch-155 Christoph Naegeli naegelic(at)ee{dot}ethzch asked me to add C support for ARGB colorspace. He provided the encoding part, and i extended the original patch in order to have full support for ARGB both for encoding and decoding (though it's C only, read *slow*). modified files: src/image/colorspace.c src/image/colorspace.h src/image/image.c src/xvid.c src/xvid.h 2003-12-20 21:29:37 GMT patch-154 Summary: Added 2pass1 comment about fast 1st pass. Revision: xvidcore--devapi4--1.0--patch-154 Just added a comment on fast 1st pass, so it explains why we do it that way and why some things are left aside. modified files: src/plugins/plugin_2pass1.c 2003-12-20 20:03:51 GMT patch-153 Summary: Win32 VC6 wrong libc linking. Revision: xvidcore--devapi4--1.0--patch-153 From sysKin: * Changed single thread libc linking to multithreaded version.
modified files: vfw/vfw.dsp 2003-12-20 15:28:53 GMT patch-152 Summary: VOL flags updating -- take #2 Revision: xvidcore--devapi4--1.0--patch-152 From sysKin: * VOL flags updates fix take #2 modified files: src/encoder.c 2003-12-20 15:10:30 GMT patch-151 Summary: Fast ME tunings. Revision: xvidcore--devapi4--1.0--patch-151 From michael: * fast refinement for 8x8 blocks * more reliable behavior for all fast ME decisions modified files: src/motion/estimation_bvop.c src/motion/estimation_pvop.c src/xvid.h 2003-12-20 14:59:58 GMT patch-150 Summary: VFW AR revamping -- take #2 Revision: xvidcore--devapi4--1.0--patch-150 From sysKin: * more AR revamping modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.h vfw/src/resource.rc 2003-12-20 14:57:40 GMT patch-149 Summary: Unbuffered IO for 1st pass stat files Revision: xvidcore--devapi4--1.0--patch-149 Unbuffered IO for 1st pass stat files modified files: src/plugins/plugin_2pass1.c 2003-12-18 17:44:07 GMT patch-148 Summary: Forgotten bit of patch-141 Revision: xvidcore--devapi4--1.0--patch-148 Damn i forgot to merge the 1st pass changes... so lame, i tested the original patch but not the merged one. modified files: src/plugins/plugin_2pass1.c 2003-12-18 14:45:39 GMT patch-147 Summary: More postprocessing. Revision: xvidcore--devapi4--1.0--patch-147 From michael: * added film noise effect. * moved postproc initialization to decoder initialization. * added support for this postproc filter into DShow. modified files: dshow/src/CAbout.cpp dshow/src/CAbout.h dshow/src/CXvidDecoder.cpp dshow/src/resource.h dshow/src/xvid.ax.rc src/decoder.c src/image/postprocessing.c src/image/postprocessing.h src/xvid.c src/xvid.h 2003-12-18 14:38:19 GMT patch-146 Summary: BFrames ME speed up flags. Revision: xvidcore--devapi4--1.0--patch-146 From michael: * Added 3 ME flags to skip some bvop ME steps and thus speed up ME for bvops (at the expense of quality loss).
- skip delta search - fast interpolate mode - early stop modified files: src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_common.c src/motion/estimation_pvop.c src/xvid.h 2003-12-18 14:11:01 GMT patch-145 Summary: VFW defaults changed Revision: xvidcore--devapi4--1.0--patch-145 From michael: * disabled DXN profiles. Better not say we are compatible to avoid problems with DXN. Mostly because VBV support is lacking. * Changed overflow defaults to 5/5/5 with new 2pass code. modified files: vfw/src/config.c 2003-12-17 16:45:59 GMT patch-144 Summary: Forgotten bit for AR support in VFW Revision: xvidcore--devapi4--1.0--patch-144 Koepi might have forgotten to send me this change: * added resource id to the resource header modified files: vfw/src/resource.h 2003-12-17 15:11:37 GMT patch-143 Summary: Lower starting quantizer for CBR encoding Revision: xvidcore--devapi4--1.0--patch-143 From christoph: * Lower starting quantizer for CBR encodings. modified files: src/plugins/plugin_single.c 2003-12-17 15:04:33 GMT patch-142 Summary: Small xvid_encraw updates Revision: xvidcore--devapi4--1.0--patch-142 From christoph: * set upper frame size limit to 4096 pixels * GME refinement flag set where it belongs. modified files: examples/xvid_encraw.c 2003-12-17 15:01:52 GMT patch-141 Summary: Two pass update. Revision: xvidcore--devapi4--1.0--patch-141 This patch improves the two pass code, quantizer distribution is smoother and results seem to be better. * Two pass now scales only a specific part of the frame length. This required changing the stats file format (added a header+MV length field) and the xvid_plg_data_t structure (binary compatible). * Overflow improvement and degradation set to 10% instead of 60% Asymmetric values may help... * Some cleanup work done on the encoder part of the API header. NB: plg data struct will be cleaned up before 1.0 so if you rely on it please read the header file to know which part will disappear.
modified files: src/encoder.c src/plugins/plugin_2pass1.c src/plugins/plugin_2pass2.c src/xvid.h 2003-12-17 13:53:34 GMT patch-140 Summary: VOL flags fix. Revision: xvidcore--devapi4--1.0--patch-140 From sysKin: * VOL flags updated as they should modified files: src/encoder.c 2003-12-17 11:07:15 GMT patch-139 Summary: VFW GUI Update. Revision: xvidcore--devapi4--1.0--patch-139 From Koepi: * typo in GMC description * Added AR widget (experimental) modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/resource.rc 2003-12-14 12:44:36 GMT patch-138 Summary: Fix PP crash Revision: xvidcore--devapi4--1.0--patch-138 Fix PP crash modified files: src/decoder.c 2003-12-14 12:43:21 GMT patch-137 Summary: Texture bit counting for bframes Revision: xvidcore--devapi4--1.0--patch-137 Texture bit counting for bframes modified files: src/bitstream/mbcoding.c 2003-12-12 23:58:18 GMT patch-136 Summary: SSE2 code enabled. Revision: xvidcore--devapi4--1.0--patch-136 * SSE2 code enabled by default (only sane ones, _no_ idct) * Aligned data in xvid_bench to avoid crashes with SSE2 code because of unaligned read accesses. modified files: examples/xvid_bench.c src/xvid.c 2003-12-12 22:50:33 GMT patch-135 Summary: DShow update. Revision: xvidcore--devapi4--1.0--patch-135 From michael (from nic): * Dshow updates (colorspace etc...) * Deblocking option. modified files: dshow/src/CAbout.cpp dshow/src/CAbout.h dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h dshow/src/resource.h dshow/src/xvid.ax.rc 2003-12-12 14:18:13 GMT patch-134 Summary: Added missing postproc init Revision: xvidcore--devapi4--1.0--patch-134 Added missing postproc init modified files: src/xvid.c 2003-12-12 14:16:29 GMT patch-133 Summary: YUV space clarifications and fix. Revision: xvidcore--devapi4--1.0--patch-133 From christoph: * Put some comments * fixed UV swapping in USER case. modified files: src/image/image.c 2003-12-12 14:01:52 GMT patch-132 Summary: Default compilation flags change. 
Revision: xvidcore--devapi4--1.0--patch-132 from christoph: - -fgcse was causing trouble on Suse9 gcc - -O1->-O2 modified files: build/generic/configure.in 2003-12-12 13:43:35 GMT patch-131 Summary: Pixel Aspect Ratio support improvement. Revision: xvidcore--devapi4--1.0--patch-131 * 1:1 VGA is default now (old behavior) * When passed EXT PAR type, we now (try to) sanitize the par_width/par_height value: - make it positive - using 0 (typical memset resetting) defaults to 1 - simplify the PAR (using gcd) - then range it in [1..255] (can be lossy) * Specify valid range in API header Bitstream version bumped to 24. PS: this patch supersedes sysKin's one in CVS. modified files: src/encoder.c src/xvid.h 2003-12-10 22:57:50 GMT patch-130 Summary: xvid_decraw cmdline changes. Revision: xvidcore--devapi4--1.0--patch-130 * -nframes -> -frames (why did i put a 'n' there ?) * -save changed its meaning, it now controls per frame Elementary Stream saving. * -o string is now independent from -save. So now it's possible to save both a ES file per frame + a ES file for the sequence. modified files: examples/xvid_encraw.c 2003-12-10 15:08:20 GMT patch-129 Summary: Decoder bugfixes. Revision: xvidcore--devapi4--1.0--patch-129 From syskin: * bvops MBs were going bananas from time to time because they were referencing wrong future ref MBs. * decoder now informs the client app about bvop lag, returning XVID_TYPE_NOTHING, up to the client app to display (or not) the bvop lag frame (black with error message). Fixing previous patch a bit: * added $Id$ fields * Fixed copyright modified files: dshow/src/CXvidDecoder.cpp src/decoder.c src/image/postprocessing.c src/image/postprocessing.h 2003-12-10 14:53:58 GMT patch-128 Summary: Deblocking code.
Revision: xvidcore--devapi4--1.0--patch-128 Patch from michael: * added deblocking code Merge work: * Added postprocessing.[ch] to project files * added #include "image/postprocessing.h" directive in decoder.c * new lines missing (gcc is so pedantic) NB: slice rendering + postprocessing is impossible. Slice rendering is somewhat abandoned. new files: src/image/.arch-ids/postprocessing.c.id src/image/.arch-ids/postprocessing.h.id src/image/postprocessing.c src/image/postprocessing.h modified files: build/generic/sources.inc build/win32/libxvidcore.dsp src/decoder.c src/xvid.h 2003-12-08 18:33:26 GMT patch-127 Summary: Don't read out of bounds Revision: xvidcore--devapi4--1.0--patch-127 Don't read out of bounds modified files: src/plugins/plugin_2pass2.c 2003-12-08 18:31:41 GMT patch-126 Summary: Macroblock structure cleanup Revision: xvidcore--devapi4--1.0--patch-126 Macroblock structure cleanup modified files: src/global.h 2003-12-07 15:09:41 GMT patch-125 Summary: Small fixes. Revision: xvidcore--devapi4--1.0--patch-125 From gruel: * xvid.h: Minor color space correction. From sysKin: * codec.c: Zones fix modified files: src/xvid.h vfw/src/codec.c 2003-12-07 14:57:14 GMT patch-124 Summary: HUGE file handling in twopass. Revision: xvidcore--devapi4--1.0--patch-124 Because of a lacking cast, two pass did not handle well some very large target size (bitrate mode is not affected). It should now be safe specifying target sizes up to 2^31kB which represents 2TB. Someone using XviD in studios ? 
;-) modified files: src/plugins/plugin_2pass2.c ######################################################################### # 1.0.0 beta2 (Bitstream Version 23) ######################################################################### 2003-12-05 14:43:53 GMT patch-123 Summary: Marking 1.0.0 Beta2 Revision: xvidcore--devapi4--1.0--patch-123 Marking 1.0.0 Beta2 modified files: ChangeLog TODO build/generic/configure.in 2003-12-05 14:35:22 GMT patch-122 Summary: Cap quants correctly (the best we can at least) Revision: xvidcore--devapi4--1.0--patch-122 Cap quants correctly (the best we can at least) modified files: src/plugins/plugin_single.c 2003-12-05 14:33:48 GMT patch-121 Summary: Small glitch Revision: xvidcore--devapi4--1.0--patch-121 Small glitch modified files: src/motion/vop_type_decision.c 2003-12-05 14:06:19 GMT patch-120 Summary: KFthresholding changes. Revision: xvidcore--devapi4--1.0--patch-120 As user reports proved, the logic behind the min_key_interval was 1/ misleading because the parameter is kfthreshold indeed and not a minimum keyframe interval 2/ the formula was a bit too aggressive (removing 20% of bitrate per frame until distance to next iframe was 1) I posted a RFC to try to settle a decision on what behavior this setting should have. We still have no clear answer so i prefer just fixing the misleading name right now and wait for a common position about its behavior later. Libraries are *binary* compatible, but *source code* compatibility is broken (rename rc_2pass2_t->min_key_interval to kfthreshold). This is probably the last API change. NB: fixes a type problem during scaling parameter computing which was causing insane pb_iboost_tax_ratio values. modified files: src/plugins/plugin_2pass2.c src/xvid.h vfw/src/codec.c vfw/src/config.c vfw/src/config.h 2003-12-05 00:20:28 GMT patch-119 Summary: ivop decision tuning. Revision: xvidcore--devapi4--1.0--patch-119 ivop decision tuning from sysKin.
modified files: src/motion/vop_type_decision.c 2003-12-03 18:55:29 GMT patch-118 Summary: VOSH header always written. Revision: xvidcore--devapi4--1.0--patch-118 * profile is set to sane default value in BitstreamWriteVolHeaders * VOSH is now always written (note that the ending code is never written) * doubled variable removed from bvop estimation file. bitstream version set to 23 Version 22 was used in CVS by a fix from michael for VOSH, it was just lacking the sane default value setting when profile is 0x00 from user. 0x00 is a reserved profile ID in the spec so it wasn't making much sense to write 0x00. modified files: src/bitstream/bitstream.c src/motion/estimation_bvop.c src/xvid.h 2003-12-03 15:29:30 GMT patch-117 Summary: VFW GUI fixes. Revision: xvidcore--devapi4--1.0--patch-117 min key was misleading because it was legacy code from dev-api-3. modified files: vfw/src/codec.c vfw/src/config.c vfw/src/resource.rc 2003-12-03 15:22:25 GMT patch-116 Summary: ME fixes. Revision: xvidcore--devapi4--1.0--patch-116 From syskin: - small typo in chroma sad reset - code tweaking + adv diamond search used instead of mainsearch + and some other stuff - thresholds tuned. modified files: src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_pvop.c 2003-12-01 11:17:20 GMT patch-115 Summary: Small changes and interlacing bugfix. Revision: xvidcore--devapi4--1.0--patch-115 * Interlacing bugfix, code got swapped in a very old patch (back in March) * xvid_encraw forces VOSH writing. * VFW GUI fixes. modified files: examples/Makefile examples/xvid_encraw.c src/utils/mbtransquant.c vfw/src/config.c vfw/src/resource.rc 2003-11-30 15:47:41 GMT patch-114 Summary: Thread safe MPEG4 quantization functions + xvid_bench update Revision: xvidcore--devapi4--1.0--patch-114 * Thread safe MPEG4 quantization functions.
Cleaned up version of patch provided by Michael - fixed compiling problems on gcc - added const qualifiers everywhere it was possible to help C compiler optimization. - added the mpeg_quant_matrices param to all ASM function prototype in comments (even if it's not used, that shows we do it deliberately) - forces m[intra][0][0] = 8, otherwise XviD could write invalid streams. * Added real CRC computing in xvid_bench.c modified files: TODO examples/xvid_bench.c src/bitstream/bitstream.c src/decoder.c src/decoder.h src/encoder.c src/encoder.h src/motion/estimation.h src/motion/estimation_pvop.c src/motion/estimation_rd_based.c src/quant/quant.h src/quant/quant_h263.c src/quant/quant_matrix.c src/quant/quant_matrix.h src/quant/quant_mpeg.c src/quant/x86_asm/quantize_h263_3dne.asm src/quant/x86_asm/quantize_h263_mmx.asm src/quant/x86_asm/quantize_mpeg_mmx.asm src/quant/x86_asm/quantize_mpeg_xmm.asm src/utils/mbtransquant.c src/xvid.c 2003-11-29 18:10:25 GMT patch-113 Summary: Fixed csp asm rules for real? Revision: xvidcore--devapi4--1.0--patch-113 Fixed csp asm rules for real? modified files: build/win32/libxvidcore.dsp 2003-11-29 17:58:09 GMT patch-112 Summary: TODO/ChangeLog updated Revision: xvidcore--devapi4--1.0--patch-112 TODO/ChangeLog updated modified files: ChangeLog TODO ######################################################################### # 1.0.0 beta1 (Bitstream Version 21) ######################################################################### 2003-11-29 17:21:08 GMT patch-111 Summary: First beta marking Revision: xvidcore--devapi4--1.0--patch-111 First beta marking modified files: build/generic/configure.in src/xvid.c src/xvid.h 2003-11-29 16:59:14 GMT patch-110 Summary: Catching up with CVS. Revision: xvidcore--devapi4--1.0--patch-110 Changes from sysKin: * dquant optimization. * CBR fix modified files: src/encoder.c src/plugins/plugin_single.c 2003-11-24 22:05:38 GMT patch-109 Summary: Big level handling in trellis.
    Revision:
      xvidcore--devapi4--1.0--patch-109

    Trellis was treating big levels exactly the same way as lower ones.
    In some cases, trellis was doing wild optimizations favoring a 0
    because the distortion introduced by that big coeff change was
    acceptable. But visually this could result in some nasty blocks
    with wrong chroma information or similar brutal changes in other
    planes as well.

    Skal added big levels handling where trellis just tries to minimize
    the cost varying the run value only. No level modification is done
    anymore.

    modified files:
     TODO src/utils/mbtransquant.c

2003-11-23 16:42:55 GMT patch-108

    Summary:
      Trellis for MPEG.
    Revision:
      xvidcore--devapi4--1.0--patch-108

    * Added trellis support for MPEG quantization type.
    * Changed RD fixed point precision, should help avoid overflow
      (see the constant TL_SHIFT)

    NB: we still have some problems when trellis optimizes DC for big
        DC values.

    modified files:
     src/utils/mbtransquant.c

2003-11-22 00:53:59 GMT patch-107

    Summary:
      Win32 lib project fix (bis)
    Revision:
      xvidcore--devapi4--1.0--patch-107

    * nasm >= 0.98.37 support in project file got reverted in a
      previous patch, push it back.
      Nota bene /O3 changed to /O2 for proper compilation with msvc
      compiler (everyone is not supposed to compile stuff with icc)
    * TODO update.

    modified files:
     TODO build/win32/libxvidcore.dsp

2003-11-19 21:26:34 GMT patch-106

    Summary:
      updated bench crc
    Revision:
      xvidcore--devapi4--1.0--patch-106

    updated bench crc

    modified files:
     examples/xvid_bench.c

2003-11-19 16:00:00 GMT patch-105

    Summary:
      Lumimasking fixes.
    Revision:
      xvidcore--devapi4--1.0--patch-105

    from sysKin:
    * New plugin hook entry XVID_PLG_FRAME that happens inside
      FrameCodeIPB when both type and quant are known. Added hook
      handling in all plugins.
    * Fixed lumimasking.

    from me:
    * small reverse commit in pvop estimation fixed.
    modified files:
     src/bitstream/bitstream.c src/bitstream/bitstream.h
     src/encoder.c src/motion/estimation_pvop.c
     src/plugins/plugin_2pass1.c src/plugins/plugin_2pass2.c
     src/plugins/plugin_dump.c src/plugins/plugin_lumimasking.c
     src/plugins/plugin_psnr.c src/plugins/plugin_single.c src/xvid.h

2003-11-19 15:37:16 GMT patch-104

    Summary:
      Removed indirections from SearchData structure.
    Revision:
      xvidcore--devapi4--1.0--patch-104

    Patch from sysKin:
    * removed indirections in SearchData structure. CheckCandidate
      functions don't use a const SearchData pointer anymore, but they
      should be a bit faster because there are fewer indirections.

    modified files:
     src/motion/estimation.h src/motion/estimation_bvop.c
     src/motion/estimation_common.c src/motion/estimation_gmc.c
     src/motion/estimation_pvop.c src/motion/estimation_rd_based.c
     src/motion/vop_type_decision.c

2003-11-19 15:33:55 GMT patch-103

    Summary:
      Formula error in twopass code.
    Revision:
      xvidcore--devapi4--1.0--patch-103

    The reversing bframe formula in 2pass 2 was not right. This was in
    fact a test code I used when I was maintaining the code on its own
    branch. I should not have committed it :\

    Fixed :-)

    modified files:
     src/plugins/plugin_2pass2.c

2003-11-18 21:41:08 GMT patch-102

    Summary:
      Another problem with mis/uninitialized reads.
    Revision:
      xvidcore--devapi4--1.0--patch-102

    Michael introduced a fast subpel refine that uses a (iMinSAD2,
    currentQMV2) couple of data. The problem is that he plugged this in
    CheckCandidate16_qpel that is used outside this context, thus an if
    statement was traversed with garbage data in the standard subpel
    case.

    For perfection's sake, using an iMinSAD=256*4096 value collects
    correct data even if it will not be used in the normal subpel case.
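The fix described in patch-102 can be illustrated with a minimal sketch. It is not the XviD code: the struct `SearchSketch` and both functions are hypothetical names, and only the sentinel value 256*4096 is taken from the entry. The idea shown is that both the best and the second-best slot start from a defined value, so a shared candidate check never compares against garbage, even when the second-best data ends up unused.

```c
#include <stdint.h>

/* Sentinel taken from the entry above; larger than any SAD the sketch sees. */
#define SAD_SENTINEL (256 * 4096)

/* Hypothetical stand-in for the per-search state (not the real SearchData). */
typedef struct {
    int32_t min_sad;   /* best SAD seen so far */
    int32_t min_sad2;  /* second best SAD, only consumed by a fast refine */
    int     best;      /* index of the best candidate, -1 if none */
    int     best2;     /* index of the second best candidate, -1 if none */
} SearchSketch;

/* Give every field a defined value before any candidate is checked. */
static void search_init(SearchSketch *s)
{
    s->min_sad  = SAD_SENTINEL;
    s->min_sad2 = SAD_SENTINEL;
    s->best     = -1;
    s->best2    = -1;
}

/* Shared check: safe to call whether or not the caller uses best2. */
static void check_candidate(SearchSketch *s, int idx, int32_t sad)
{
    if (sad < s->min_sad) {
        /* The previous best becomes the second best. */
        s->min_sad2 = s->min_sad;
        s->best2    = s->best;
        s->min_sad  = sad;
        s->best     = idx;
    } else if (sad < s->min_sad2) {
        s->min_sad2 = sad;
        s->best2    = idx;
    }
}
```

A caller that only needs the best candidate simply ignores `min_sad2` and `best2`; the cost is two extra stores at initialization.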
    modified files:
     src/motion/estimation_pvop.c

2003-11-16 17:29:39 GMT patch-101

    Summary:
      The MEanalysis patch assumed bvops were always used
    Revision:
      xvidcore--devapi4--1.0--patch-101

    The MEanalysis patch assumed bvops were always used

    modified files:
     src/encoder.c

2003-11-16 15:12:15 GMT patch-100

    Summary:
      MEanalysis using wrong mvs + bframe search using wrong mvs
    Revision:
      xvidcore--devapi4--1.0--patch-100

    Valgrind reported a lot of uninitialized reads. These uninitialized
    reads helped sysKin find three bugs:
    - ZeroMacroblock did not reset the cbp field. So for some skipped
      blocks, a test was done on the cbp value...
    - MEanalysis was using wrong mvs from the current bvop
      (uninitialized or just wrong in current context). That's because
      in devapi3, bframes used to share the same mvs array whereas now,
      it's one array per bvop.
    - Collocated skipped MBs for a bvop didn't reset mvs[0] and
      b_mvs[0].

    modified files:
     src/encoder.c src/motion/estimation_bvop.c src/motion/motion.h
     src/motion/motion_inlines.h src/motion/vop_type_decision.c

2003-11-15 15:21:09 GMT patch-99

    Summary:
      Small fixes
    Revision:
      xvidcore--devapi4--1.0--patch-99

    Small fixes

    modified files:
     src/encoder.c src/motion/vop_type_decision.c

2003-11-15 15:02:47 GMT patch-98

    Summary:
      DShow update + libxvidcore project file update
    Revision:
      xvidcore--devapi4--1.0--patch-98

    From peter:
    * DShow now links against libxvidcore.lib
    * Some updates to the libxvidcore project file

    modified files:
     TODO build/win32/libxvidcore.dsp dshow/dshow.dsp
     dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h

2003-11-15 01:51:28 GMT patch-97

    Summary:
      Small updates to doc files
    Revision:
      xvidcore--devapi4--1.0--patch-97

    Small updates to doc files

    modified files:
     AUTHORS CodingStyle TODO doc/INSTALL doc/README

    renamed files:
     .arch-ids/authors.txt.id
       ==> .arch-ids/AUTHORS.id
     .arch-ids/todo.txt.id
       ==> .arch-ids/TODO.id
     authors.txt
       ==> AUTHORS
     todo.txt
       ==> TODO

2003-11-14 11:23:55 GMT patch-96

    Summary:
      Updated ChangeLog
    Revision:
      xvidcore--devapi4--1.0--patch-96

    Updated ChangeLog

    modified files:
     ChangeLog

2003-11-13 23:09:34 GMT patch-95

    Summary:
      8x8 16bit Block SSE optimization.
    Revision:
      xvidcore--devapi4--1.0--patch-95

    MMXed the calculation of SSE for 8x8 16bit blocks. This helps
    VHQ=4 mode quite a lot.

    My tests show with trellis:chroma_me:
    - ~20% speed improvement for vhq=4.
    - at least 5% when using vhq=1.

    Of course this speedup vanishes if more CPU intensive features are
    used. CruNcher, who used gmc/qpel, noticed "only" a ~5% speed
    improvement.

    NB: I'm of course talking about overall speed improvement.

    Such a small patch for such a big improvement :-)

    modified files:
     src/motion/estimation_rd_based.c src/motion/sad.c
     src/motion/sad.h src/motion/x86_asm/sad_mmx.asm src/xvid.c

2003-11-13 22:34:33 GMT patch-94

    Summary:
      Various small bug fixes.
    Revision:
      xvidcore--devapi4--1.0--patch-94

    * encoder.c: GMC code fix in encoder.c. Now gmcval is initialized
      correctly when using GME.
    * xvid_decraw.c: Fix elementary stream output.
    * plugin_2pass2.c: Small parsing bug in stats reading in 2pass2.
    * decoder.c: Read resync markers in bframes.

    modified files:
     examples/xvid_decraw.c src/decoder.c src/encoder.c
     src/plugins/plugin_2pass2.c

2003-11-11 16:24:05 GMT patch-93

    Summary:
      VFW update for overflow control
    Revision:
      xvidcore--devapi4--1.0--patch-93

    From Koepi.
    * Added widget and code for overflow control strength.
    * Removed widgets for payback options and kfthresholds.
    * Activated frame stats in DebugOutputView all the time.

    From me:
    * Activated static motion detection in cartoon mode.

    modified files:
     vfw/src/codec.c vfw/src/config.c vfw/src/resource.h
     vfw/src/resource.rc

2003-11-09 20:47:47 GMT patch-92

    Summary:
      New two pass code.
    Revision:
      xvidcore--devapi4--1.0--patch-92

    New two pass code. I may say it's just a fixed version, though it
    looks more like a "take all the ideas and write it again" version.
    It performs better with all natural sequences I have and a bit
    worse with anime.
    Including it now allows me to improve the code during the beta
    releases.

    modified files:
     src/encoder.c src/plugins/plugin_2pass1.c
     src/plugins/plugin_2pass2.c src/xvid.h vfw/src/codec.c
     vfw/src/config.c vfw/src/config.h

2003-11-09 17:07:16 GMT patch-91

    Summary:
      Fixes for bframe compensation (used in psnr tests).
    Revision:
      xvidcore--devapi4--1.0--patch-91

    * transfer_8to16_sub2_(c|mmx|xmm|3dne) write back the compensated
      result to current frame pointer.
    * transfer_8to16_sub2_mmx uses proper rounding (a+b+1)/2. The +1
      operation was missing.
    * Blocks skipped in bframes must be compensated for psnr computing.

    modified files:
     src/encoder.c src/motion/estimation_bvop.c
     src/utils/mem_transfer.c
     src/utils/x86_asm/mem_transfer_3dne.asm
     src/utils/x86_asm/mem_transfer_mmx.asm

2003-11-05 16:05:44 GMT patch-90

    Summary:
      Speed improvement not wasting setedges and interpolate calls.
    Revision:
      xvidcore--devapi4--1.0--patch-90

    Patch from syskin.
    * This patch avoids calling setedges and interpolate for unneeded
      cases:
      - setedges is only called once per frame.
      - interpolate is called only when the previous rounding was
        different from the one needed.
    * Interpolation has been optimized a bit for qpel case, we do the
      hv pass down to top to use the cache more efficiently (hope so).

    modified files:
     src/encoder.c src/encoder.h src/image/image.c

2003-11-03 19:51:12 GMT patch-89

    Summary:
      SSE2 dev16 fix + xvid_bench DCT block alignments.
    Revision:
      xvidcore--devapi4--1.0--patch-89

    * Small error fixed by Skal in his dev16 code (missing pshufd).
    * Blocks used by DCT tests are now aligned with
      DECLARE_ALIGNED_MATRIX; this avoids the well-known segfaults when
      using SSE2 instructions that assume data alignment.
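The alignment requirement mentioned in patch-89 can be sketched as follows. This is a minimal illustration only: it uses the standard C11 `alignas` specifier instead of the DECLARE_ALIGNED_MATRIX macro (whose definition is not shown in this entry), and `AlignedBlock` and `is_aligned` are hypothetical names. Aligned SSE2 loads and stores require 16-byte aligned addresses, which is the property checked here.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical 8x8 block of 16-bit coefficients, forced onto a
 * 16-byte boundary so aligned SSE2 accesses would be legal on it. */
typedef struct {
    alignas(16) int16_t coeff[64];
} AlignedBlock;

/* Returns nonzero when ptr sits on a multiple of `alignment` bytes. */
static int is_aligned(const void *ptr, uintptr_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

A test harness could call `is_aligned(block.coeff, 16)` before handing a buffer to a SIMD routine, turning a hard-to-diagnose crash into an explicit check.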
    modified files:
     examples/xvid_bench.c src/motion/x86_asm/sad_sse2.asm

2003-11-03 15:42:23 GMT patch-88

    Summary:
      Align .rodata section for non coff objects
    Revision:
      xvidcore--devapi4--1.0--patch-88

    Align .rodata section for non coff objects

    modified files:
     src/bitstream/x86_asm/cbp_mmx.asm
     src/bitstream/x86_asm/cbp_sse2.asm
     src/dct/x86_asm/fdct_mmx_ffmpeg.asm
     src/dct/x86_asm/fdct_mmx_skal.asm
     src/dct/x86_asm/fdct_sse2_skal.asm
     src/dct/x86_asm/idct_3dne.asm src/dct/x86_asm/idct_mmx.asm
     src/dct/x86_asm/idct_sse2_dmitry.asm
     src/dct/x86_asm/simple_idct_mmx.asm
     src/image/x86_asm/colorspace_rgb_mmx.asm
     src/image/x86_asm/colorspace_yuyv_mmx.asm
     src/image/x86_asm/interpolate8x8_3dn.asm
     src/image/x86_asm/interpolate8x8_3dne.asm
     src/image/x86_asm/interpolate8x8_mmx.asm
     src/image/x86_asm/interpolate8x8_xmm.asm
     src/image/x86_asm/qpel_mmx.asm
     src/image/x86_asm/reduced_mmx.asm
     src/motion/x86_asm/sad_3dn.asm src/motion/x86_asm/sad_3dne.asm
     src/motion/x86_asm/sad_mmx.asm src/motion/x86_asm/sad_sse2.asm
     src/motion/x86_asm/sad_xmm.asm
     src/quant/x86_asm/quantize_h263_3dne.asm
     src/quant/x86_asm/quantize_h263_mmx.asm
     src/quant/x86_asm/quantize_mpeg_mmx.asm
     src/quant/x86_asm/quantize_mpeg_xmm.asm
     src/utils/x86_asm/cpuid.asm
     src/utils/x86_asm/interlacing_mmx.asm
     src/utils/x86_asm/mem_transfer_3dne.asm

2003-11-02 23:01:43 GMT patch-87

    Summary:
      SSE2 update
    Revision:
      xvidcore--devapi4--1.0--patch-87

    * Added Dmitry SSE2 iDCT code back.
    * Plugged Dmitry iDCT as default for SSE2
    * Fixed a bug in xvid_bench that made it test some CPU instruction
      sets w/o host CPU support. xvidcore init was simply discarding
      irrelevant cpu flags.

    new files:
     src/dct/x86_asm/.arch-ids/idct_sse2_dmitry.asm.id
     src/dct/x86_asm/idct_sse2_dmitry.asm

    modified files:
     build/generic/sources.inc build/win32/libxvidcore.dsp
     examples/xvid_bench.c src/dct/fdct.h src/dct/idct.h src/xvid.c

2003-10-31 14:53:26 GMT patch-86

    Summary:
      Better handling of old windres versions + GNU make dependency.
    Revision:
      xvidcore--devapi4--1.0--patch-86

    Old versions of GNU windres (<2.14) don't have the same short
    options. But long options remain the same so it's better to use
    long option names to have full compatibility with older versions.

    The Makefile appears to be dependent on GNU make because shell
    expansion for retrieving the path of the Makefile is wrong when
    using `` even with a single expansion assignment :=. It keeps being
    expanded when used.

    modified files:
     doc/INSTALL vfw/bin/Makefile

2003-10-29 11:31:28 GMT patch-85

    Summary:
      Added sse2 f/iDCT code from skal
    Revision:
      xvidcore--devapi4--1.0--patch-85

    * Added sse2 f/iDCT code from skal
    * Added hooking in xvid.c

    new files:
     src/dct/x86_asm/.arch-ids/fdct_sse2_skal.asm.id
     src/dct/x86_asm/fdct_sse2_skal.asm

    modified files:
     build/generic/sources.inc build/win32/libxvidcore.dsp src/xvid.c

2003-10-29 00:19:10 GMT patch-84

    Summary:
      Fix the static motion detection
    Revision:
      xvidcore--devapi4--1.0--patch-84

    Fix the static motion detection

    modified files:
     src/motion/estimation_pvop.c

2003-10-28 23:39:46 GMT patch-83

    Summary:
      Added cartoon option handling.
    Revision:
      xvidcore--devapi4--1.0--patch-83

    Added cartoon widgets + handling code.

    NB: static motion detection is disabled because of crashes on P4
        cpus.
    modified files:
     vfw/src/codec.c vfw/src/config.c vfw/src/config.h
     vfw/src/resource.h vfw/src/resource.rc

2003-10-28 17:44:09 GMT patch-82

    Summary:
      ASM cleanups;
    Revision:
      xvidcore--devapi4--1.0--patch-82

    * Applied same style to all asm files
    * Replaced current sad sse2 operators with skal's ones
    * Removed old and unused colorspace asm files

    removed files:
     src/image/x86_asm/.arch-ids/rgb_to_yv12_mmx.asm.id
     src/image/x86_asm/.arch-ids/yuv_to_yv12_mmx.asm.id
     src/image/x86_asm/.arch-ids/yuyv_to_yv12_mmx.asm.id
     src/image/x86_asm/.arch-ids/yv12_to_rgb24_mmx.asm.id
     src/image/x86_asm/.arch-ids/yv12_to_rgb32_mmx.asm.id
     src/image/x86_asm/.arch-ids/yv12_to_yuyv_mmx.asm.id
     src/image/x86_asm/rgb_to_yv12_mmx.asm
     src/image/x86_asm/yuv_to_yv12_mmx.asm
     src/image/x86_asm/yuyv_to_yv12_mmx.asm
     src/image/x86_asm/yv12_to_rgb24_mmx.asm
     src/image/x86_asm/yv12_to_rgb32_mmx.asm
     src/image/x86_asm/yv12_to_yuyv_mmx.asm

    modified files:
     build/generic/sources.inc src/bitstream/x86_asm/cbp_3dne.asm
     src/bitstream/x86_asm/cbp_mmx.asm
     src/bitstream/x86_asm/cbp_sse2.asm
     src/dct/x86_asm/fdct_mmx_ffmpeg.asm
     src/dct/x86_asm/fdct_mmx_skal.asm
     src/dct/x86_asm/idct_3dne.asm src/dct/x86_asm/idct_mmx.asm
     src/dct/x86_asm/simple_idct_mmx.asm
     src/image/x86_asm/colorspace_mmx.inc
     src/image/x86_asm/colorspace_rgb_mmx.asm
     src/image/x86_asm/colorspace_yuv_mmx.asm
     src/image/x86_asm/colorspace_yuyv_mmx.asm
     src/image/x86_asm/interpolate8x8_3dn.asm
     src/image/x86_asm/interpolate8x8_3dne.asm
     src/image/x86_asm/interpolate8x8_mmx.asm
     src/image/x86_asm/interpolate8x8_xmm.asm
     src/image/x86_asm/qpel_mmx.asm
     src/image/x86_asm/reduced_mmx.asm
     src/motion/x86_asm/sad_3dn.asm src/motion/x86_asm/sad_3dne.asm
     src/motion/x86_asm/sad_mmx.asm src/motion/x86_asm/sad_sse2.asm
     src/motion/x86_asm/sad_xmm.asm
     src/quant/x86_asm/quantize_h263_3dne.asm
     src/quant/x86_asm/quantize_h263_mmx.asm
     src/quant/x86_asm/quantize_mpeg_mmx.asm
     src/quant/x86_asm/quantize_mpeg_xmm.asm
     src/utils/x86_asm/cpuid.asm
     src/utils/x86_asm/interlacing_mmx.asm
     src/utils/x86_asm/mem_transfer_3dne.asm
     src/utils/x86_asm/mem_transfer_mmx.asm

2003-10-27 01:13:47 GMT patch-81

    Summary:
      d_mv_bits speedup from sysKin
    Revision:
      xvidcore--devapi4--1.0--patch-81

    d_mv_bits speedup from sysKin

    modified files:
     src/motion/motion_inlines.h

2003-10-27 00:55:51 GMT patch-80

    Summary:
      fDCT changes, new asm CodingStyle applied to dct dir
    Revision:
      xvidcore--devapi4--1.0--patch-80

    * Ported the ffmpeg fDCT functions (mmx and xmm).
    * Modified skal's versions a bit to allow rolling loops.
    * Activated Skal's fDCTs (unrolled versions) for mmx _and_ xmm
      (old code was ignoring xmm versions)
    * Removed the SSE2 versions (they'll be back later)
    * .data -> .rodata
    * Applied announced asm CodingStyle to the dct dir (I'll have to
      add a section with the said CodingStyle)

    modified files:
     build/generic/sources.inc build/win32/libxvidcore.dsp
     src/dct/fdct.h src/dct/idct.h
     src/dct/x86_asm/fdct_mmx_ffmpeg.asm
     src/dct/x86_asm/fdct_mmx_skal.asm
     src/dct/x86_asm/idct_3dne.asm src/dct/x86_asm/idct_mmx.asm
     src/dct/x86_asm/simple_idct_mmx.asm src/xvid.c

    renamed files:
     src/dct/x86_asm/.arch-ids/fdct_mmx.asm.id
       ==> src/dct/x86_asm/.arch-ids/fdct_mmx_ffmpeg.asm.id
     src/dct/x86_asm/.arch-ids/fdct_xmm.asm.id
       ==> src/dct/x86_asm/.arch-ids/fdct_mmx_skal.asm.id
     src/dct/x86_asm/fdct_mmx.asm
       ==> src/dct/x86_asm/fdct_mmx_ffmpeg.asm
     src/dct/x86_asm/fdct_xmm.asm
       ==> src/dct/x86_asm/fdct_mmx_skal.asm

2003-10-25 13:48:42 GMT patch-79

    Summary:
      BQuant->PQuant fix.
    Revision:
      xvidcore--devapi4--1.0--patch-79

    When using closed_gop, a BFrame before an IFrame is turned into a
    PFrame. Thus its original quant has to be computed back; a rounding
    was causing these frames to be orig_quant-1. As a consequence we
    had very big frames before the IFrame, losing many bits for nearly
    no visual benefit.
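The off-by-one described in patch-79 is the kind of error that integer truncation produces when a quantizer mapping is inverted. The sketch below is an assumption-laden illustration, not the actual encoder code: the entry does not give the formula, so a linear mapping with `ratio` and `offset` expressed in percent is assumed, and all three function names are hypothetical.

```c
/* Assumed forward mapping from a P quantizer to a B quantizer.
 * ratio and offset are percentages (e.g. 150 and 100); integer division
 * truncates the result. */
static int pquant_to_bquant(int pquant, int ratio, int offset)
{
    return (pquant * ratio + offset) / 100;
}

/* Naive inverse: truncation can land one step below the original value. */
static int bquant_to_pquant_trunc(int bquant, int ratio, int offset)
{
    return (100 * bquant - offset) / ratio;
}

/* Inverse with rounding to nearest: adds half the divisor before dividing. */
static int bquant_to_pquant_round(int bquant, int ratio, int offset)
{
    return (100 * bquant - offset + ratio / 2) / ratio;
}
```

With the assumed values ratio=150 and offset=100, a P quantizer of 3 maps to a B quantizer of 5; the truncating inverse then yields 2 (orig_quant-1), while the rounded inverse recovers 3.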
    modified files:
     src/encoder.c

2003-10-25 10:26:48 GMT patch-78

    Summary:
      Added closed gop option to xvid_encraw
    Revision:
      xvidcore--devapi4--1.0--patch-78

    Added closed gop option to xvid_encraw

    modified files:
     examples/xvid_encraw.c

2003-10-24 17:39:53 GMT patch-77

    Summary:
      RD fixes.
    Revision:
      xvidcore--devapi4--1.0--patch-77

    Inter RD optimization relied on buggy functions to predict bitsize.

    modified files:
     src/motion/estimation_pvop.c src/motion/motion_inlines.h

2003-10-22 15:48:01 GMT patch-76

    Summary:
      Small INSTALL update due to previous build patches.
    Revision:
      xvidcore--devapi4--1.0--patch-76

    Small INSTALL update due to previous build patches.

    modified files:
     doc/INSTALL

2003-10-21 21:27:46 GMT patch-75

    Summary:
      Removed unused next_block vars.
    Revision:
      xvidcore--devapi4--1.0--patch-75

    Removed unused next_block vars.

    modified files:
     src/decoder.c

2003-10-21 21:24:15 GMT patch-74

    Summary:
      VFW build changes.
    Revision:
      xvidcore--devapi4--1.0--patch-74

    The build system has been modified to look like the core lib one
    minus the configure system.

    modified files:
     vfw/bin/Makefile vfw/bin/sources.inc vfw/src/config.c
     vfw/vfw.dsp

    renamed files:
     vfw/bin/.arch-ids/Makefile.cygwin.id
       ==> vfw/bin/.arch-ids/Makefile.id
     vfw/bin/.arch-ids/Makefile.inc.id
       ==> vfw/bin/.arch-ids/sources.inc.id
     vfw/bin/Makefile.cygwin
       ==> vfw/bin/Makefile
     vfw/bin/Makefile.inc
       ==> vfw/bin/sources.inc
     vfw/src/.arch-ids/config.rc.id
       ==> vfw/src/.arch-ids/resource.rc.id
     vfw/src/config.rc
       ==> vfw/src/resource.rc

2003-10-21 17:00:09 GMT patch-73

    Summary:
      Decoder cleanups and speedup
    Revision:
      xvidcore--devapi4--1.0--patch-73

    cleanups, speedups from sysKin

    modified files:
     src/decoder.c

2003-10-21 16:22:15 GMT patch-72

    Summary:
      Build fixes for newer nasm versions.
    Revision:
      xvidcore--devapi4--1.0--patch-72

    nasm does not take care of adding trailing slashes to include
    paths.
    A patch to upstream authors has been refused because "the
    backslash() feature has been abandoned to get back to old nasm
    behavior"

    Their choice is kinda stupid as nasm is now open to user
    mistakes... :\

    So we fix that on our side.

    modified files:
     build/generic/configure.in build/win32/libxvidcore.dsp

2003-10-17 15:13:12 GMT patch-71

    Summary:
      Updated docs.
    Revision:
      xvidcore--devapi4--1.0--patch-71

    The doc of devapi4 is mostly outdated, it is much better not to
    keep it in the repository at the moment. We'll add new docs later.

    Added an INSTALL doc that explains the build/install process for
    supported platforms. It's a first try, things may be added later.

    new files:
     doc/.arch-ids/INSTALL.id doc/INSTALL

    removed files:
     doc/.arch-ids/API.dox.id doc/.arch-ids/Makefile.id
     doc/.arch-ids/foot.inc.in.id doc/.arch-ids/header.tex.in.id
     doc/.arch-ids/xvid-decoding.txt.id
     doc/.arch-ids/xvid-encoder.txt.id doc/API.dox doc/Makefile
     doc/foot.inc.in doc/header.tex.in doc/xvid-decoding.txt
     doc/xvid-encoder.txt

    modified files:
     CodingStyle README doc/README

    renamed files:
     .arch-ids/README.txt.id
       ==> .arch-ids/README.id
     .arch-ids/changelog.txt.id
       ==> .arch-ids/ChangeLog.id
     README.txt
       ==> README
     changelog.txt
       ==> ChangeLog

2003-10-15 13:53:11 GMT patch-70

    Summary:
      Better cross compilation handling.
    Revision:
      xvidcore--devapi4--1.0--patch-70

    With this patch it is now possible to cross compile xvid quite
    easily for win32 platform on a build linux host.
    Recipe for debian system:
    $ apt-get install mingw32
      (or create your own cross compiler/binutils suite and install
       mingw32 header files -- sorry I don't have a recipe for this,
       this is left as an exercise for the reader)
    $ cd ${xvidcore}
    $ cd build/generic
    $ ./bootstrap.sh
    $ ./configure --host=i586-mingw32msvc
      (all occurrences of i586-mingw32msvc may be replaced with the
       right prefix you've chosen for your cross compiler and cross
       binutils)
    $ make
    $ cd ../../vfw/bin
    $ make -f Makefile.cygwin \
        CC=i586-mingw32msvc-gcc WINDRES=i586-mingw32msvc-windres

    Enjoy your win32 xvid.dll built by free software, on a free OS, for
    a devil OS target.

    modified files:
     build/generic/configure.in vfw/bin/Makefile.cygwin

2003-10-14 15:17:28 GMT patch-69

    Summary:
      Fixed Qpel+Interpolation decoding. Cleaned up mb->mode usage.
    Revision:
      xvidcore--devapi4--1.0--patch-69

    * Fixed interpolate mode + qpel decoding.
    * MB->mb_type completely replaced by MB->mode

    modified files:
     src/decoder.c

2003-10-12 21:57:24 GMT patch-68

    Summary:
      ac/dc prediction for intra RD search.
    Revision:
      xvidcore--devapi4--1.0--patch-68

    From syskin, added real ac/dc prediction for INTRA's bitcount.

    modified files:
     src/motion/estimation_rd_based.c

2003-10-09 18:15:50 GMT patch-67

    Summary:
      Migrated asm code to new quant API.
    Revision:
      xvidcore--devapi4--1.0--patch-67

    Many changes that are mostly cosmetic in the asm files.
    * indent
    * added xor eax, eax in quant_(h263|mpeg)_intra_.* functions (just
      to make sure the returned value isn't random)
    * added xor eax, eax in dequant_(h263|mpeg)_.* functions (just to
      make sure the returned value isn't random)
    * synced cpuid.asm XVID_CPU_feature constants with the one defined
      in the C code (xvid.h)
    * enabled all cpu tests in xvid_bench.c

    modified files:
     examples/xvid_bench.c src/quant/quant_h263.c
     src/quant/quant_mpeg.c
     src/quant/x86_asm/quantize_h263_3dne.asm
     src/quant/x86_asm/quantize_h263_mmx.asm
     src/quant/x86_asm/quantize_mpeg_mmx.asm
     src/quant/x86_asm/quantize_mpeg_xmm.asm
     src/utils/x86_asm/cpuid.asm src/xvid.h

2003-10-08 21:05:47 GMT patch-66

    Summary:
      Updated xvid_bench for quant API changes
    Revision:
      xvidcore--devapi4--1.0--patch-66

    Updated xvid_bench for quant API changes

    modified files:
     examples/xvid_bench.c

2003-10-07 13:03:51 GMT patch-65

    Summary:
      Quant functions API changes (first step)
    Revision:
      xvidcore--devapi4--1.0--patch-65

    On the road to instance safe mpeg quantization, a small cleanup to
    the quant API was needed. It consists in changing the way we name
    the functions quant_{mpeg|h263}_{inter|intra}_{arch} and in a move
    to a more unified API (even intra functions return the sum of
    coefficients, it can be used as a complexity measure at a later
    time).

    This patch touches a lot of files, but all changes are trivial.

    NB: we should check the IA64 asm validity, I changed things but I
        can't test them.
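The unified API idea from patch-65 (every quantization function sharing one shape and returning the sum of the quantized coefficients) can be sketched as below. This is a simplified illustration under stated assumptions: the signature, the typedef `quant_func_sketch` and the function `sketch_quant_inter_c` are hypothetical, and the arithmetic is a plain uniform quantizer rather than the actual H.263 or MPEG rules.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical common signature: quantize 64 coefficients from `data`
 * into `coeff` and return the sum of absolute quantized levels. */
typedef uint32_t (quant_func_sketch)(int16_t *coeff,
                                     const int16_t *data,
                                     uint32_t quant);

/* Plain C variant, named after the {type}_{mode}_{arch} scheme.
 * Uses a simplified divisor of 2*quant; quant must be nonzero. */
static uint32_t sketch_quant_inter_c(int16_t *coeff,
                                     const int16_t *data,
                                     uint32_t quant)
{
    uint32_t sum = 0;
    int32_t divisor = (int32_t)(2 * quant);

    for (int i = 0; i < 64; i++) {
        int32_t level = abs(data[i]) / divisor;  /* magnitude only */
        coeff[i] = (int16_t)(data[i] < 0 ? -level : level);
        sum += (uint32_t)level;                  /* complexity measure */
    }
    return sum;
}

/* A dispatch pointer of the shared type; an init routine could point it
 * at an architecture specific variant instead. */
static quant_func_sketch *sketch_quant_inter = sketch_quant_inter_c;
```

Because every variant shares the same type, the caller can use the returned sum as a rough complexity measure without caring which implementation is plugged in.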
    new files:
     src/quant/.arch-ids/quant.h.id src/quant/quant.h

    removed files:
     src/quant/.arch-ids/quant_h263.h.id
     src/quant/.arch-ids/quant_mpeg4.h.id src/quant/quant_h263.h
     src/quant/quant_mpeg4.h

    modified files:
     build/generic/sources.inc build/win32/libxvidcore.dsp
     src/decoder.c src/encoder.c src/image/qpel.c
     src/motion/estimation_rd_based.c
     src/quant/ia64_asm/quant_h263_ia64.s src/quant/quant_h263.c
     src/quant/quant_matrix.c src/quant/quant_matrix.h
     src/quant/quant_mpeg.c
     src/quant/x86_asm/quantize_h263_3dne.asm
     src/quant/x86_asm/quantize_h263_mmx.asm
     src/quant/x86_asm/quantize_mpeg_mmx.asm
     src/quant/x86_asm/quantize_mpeg_xmm.asm
     src/utils/mbtransquant.c src/xvid.c src/xvid.h

    renamed files:
     src/quant/.arch-ids/quant_mpeg4.c.id
       ==> src/quant/.arch-ids/quant_mpeg.c.id
     src/quant/quant_mpeg4.c
       ==> src/quant/quant_mpeg.c
     src/quant/x86_asm/.arch-ids/quantize4_mmx.asm.id
       ==> src/quant/x86_asm/.arch-ids/quantize_mpeg_mmx.asm.id
     src/quant/x86_asm/.arch-ids/quantize4_xmm.asm.id
       ==> src/quant/x86_asm/.arch-ids/quantize_mpeg_xmm.asm.id
     src/quant/x86_asm/.arch-ids/quantize_3dne.asm.id
       ==> src/quant/x86_asm/.arch-ids/quantize_h263_3dne.asm.id
     src/quant/x86_asm/.arch-ids/quantize_mmx.asm.id
       ==> src/quant/x86_asm/.arch-ids/quantize_h263_mmx.asm.id
     src/quant/x86_asm/quantize4_mmx.asm
       ==> src/quant/x86_asm/quantize_mpeg_mmx.asm
     src/quant/x86_asm/quantize4_xmm.asm
       ==> src/quant/x86_asm/quantize_mpeg_xmm.asm
     src/quant/x86_asm/quantize_3dne.asm
       ==> src/quant/x86_asm/quantize_h263_3dne.asm
     src/quant/x86_asm/quantize_mmx.asm
       ==> src/quant/x86_asm/quantize_h263_mmx.asm

2003-10-05 00:15:15 GMT patch-64

    Summary:
      Updated ChangeLog
    Revision:
      xvidcore--devapi4--1.0--patch-64

    Updated ChangeLog

    modified files:
     changelog.txt

2003-10-04 16:04:30 GMT patch-63

    Summary:
      Removed legacy 2pass code from vfw
    Revision:
      xvidcore--devapi4--1.0--patch-63

    Removed legacy 2pass code from vfw

    removed files:
     vfw/src/.arch-ids/2pass.c.id vfw/src/.arch-ids/2pass.h.id
     vfw/src/2pass.c vfw/src/2pass.h

2003-10-04 00:41:38 GMT patch-62

    Summary:
      Working VFW mingw/cygwin build system.
    Revision:
      xvidcore--devapi4--1.0--patch-62

    This patch fixes the VFW building process. Now it should work out
    of the box using these steps:
    # cd ${xvidcore}
    # cd build/generic
    # ./bootstrap.sh <-- only needed for CVS checkouts.
    # ./configure
    # make
    # cd ../../vfw/bin
    # make -f Makefile.cygwin

    Then install as usual clicking on the inf file or "make install" in
    the vfw/bin dir.

    modified files:
     vfw/bin/Makefile.cygwin vfw/src/config.rc vfw/src/debug.h
     vfw/src/driverproc.c

2003-10-03 17:00:53 GMT patch-61

    Summary:
      Fixes for alternate scan and interlacing support.
    Revision:
      xvidcore--devapi4--1.0--patch-61

    Fixes from CVS (by sysKin) for:
    - added alternate scan support with VHQ
    - fixed interlacing support in s/b-frames. May fix a potential
      problem as field_pred struct field seemed not to be initialized
      anywhere. As it's not supported yet, write a hardcoded 0 bit.

    Fixes from me for the fixes from sysKin:
    - scan_table effectively used in MBCodingBVOP
    - Block_CalcBits(Intra) fixes to data->scan_table (implies
      prototype change and code modification everywhere the functions
      were used)

    I also increased BS version as it might result in different
    bitstreams. It's now at version 20.

    modified files:
     src/bitstream/mbcoding.c src/bitstream/mbcoding.h src/decoder.c
     src/encoder.c src/motion/estimation.h
     src/motion/estimation_rd_based.c src/xvid.h

2003-10-03 15:41:37 GMT patch-60

    Summary:
      Removed BIGLUT support.
    Revision:
      xvidcore--devapi4--1.0--patch-60

    Removed legacy code for BIGLUT support. It was unused and RD based
    Motion Estimation was not even compatible with this type of VLC
    coding.

    modified files:
     build/generic/configure.in src/bitstream/mbcoding.c
     src/bitstream/mbcoding.h src/prediction/mbprediction.c

2003-10-03 13:25:17 GMT patch-59

    Summary:
      Bugfix for PFrames+ Ext Search.
    Revision:
      xvidcore--devapi4--1.0--patch-59

    In Qpel mode, the code was doing a diamond search for wrong
    predictors.
    This resulted in poor performance as the diamond search was
    sitting there for some time.

    modified files:
     src/motion/estimation.h src/motion/estimation_pvop.c

2003-10-02 16:50:51 GMT patch-58

    Summary:
      Added VFW makefile for cygwin/minsys
    Revision:
      xvidcore--devapi4--1.0--patch-58

    Added VFW makefile for cygwin/minsys. I can't test it so it is
    probably not right out of the box. Waiting for feedback in order to
    fix it.

    new files:
     vfw/bin/.arch-ids/Makefile.cygwin.id
     vfw/bin/.arch-ids/Makefile.inc.id vfw/bin/Makefile.cygwin
     vfw/bin/Makefile.inc vfw/src/w32api/.arch-ids/=id
     vfw/src/w32api/.arch-ids/vfw.h.id vfw/src/w32api/vfw.h

    new directories:
     vfw/src/w32api vfw/src/w32api/.arch-ids

2003-10-02 13:35:15 GMT patch-57

    Summary:
      Cleaned up the lumimasking code.
    Revision:
      xvidcore--devapi4--1.0--patch-57

    The lumimasking code was not very plugin oriented as it has been
    ported from old XviD versions. This patch cleans up the code and
    integrates it better with plugin design.

    No changes done in the functional code.

    modified files:
     src/plugins/plugin_lumimasking.c

2003-10-01 23:07:07 GMT patch-56

    Summary:
      Cleaned up trailing space chars.
    Revision:
      xvidcore--devapi4--1.0--patch-56

    The kind of patch we would love to avoid as they make merging a
    nightmare while they're kind of useless patches.

    Applied sed 's/[ \t]*$//' to all c/h files.
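For readers less familiar with sed, the substitution in patch-56 deletes any run of spaces and tabs at the end of each line. A rough C equivalent for a single line is sketched below; the function name `strip_trailing_blanks` is hypothetical, and the line is assumed to be NUL-terminated with its newline already removed (sed also hands the expression a line without its newline).

```c
#include <stddef.h>
#include <string.h>

/* Removes trailing spaces and tabs from `line` in place and returns the
 * new length. Mirrors the effect of s/[ \t]*$// on one line. */
static size_t strip_trailing_blanks(char *line)
{
    size_t len = strlen(line);

    /* Walk back from the end while the last character is blank. */
    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
        len--;

    line[len] = '\0';
    return len;
}
```

Leading and interior whitespace is left untouched, which is why such a cleanup changes no program behavior yet still touches many lines in a diff.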
    modified files:
     src/bitstream/bitstream.c src/bitstream/bitstream.h
     src/bitstream/cbp.c src/bitstream/mbcoding.c src/dct/idct.c
     src/dct/simple_idct.c src/decoder.c src/decoder.h
     src/encoder.c src/encoder.h src/global.h
     src/image/colorspace.c src/image/colorspace.h src/image/font.c
     src/image/image.c src/image/image.h
     src/image/interpolate8x8.c src/image/interpolate8x8.h
     src/image/qpel.c src/image/qpel.h src/image/reduced.c
     src/motion/estimation_rd_based.c src/motion/gmc.c
     src/motion/gmc.h src/motion/motion.h src/motion/sad.c
     src/plugins/plugin_2pass1.c src/plugins/plugin_2pass2.c
     src/plugins/plugin_fixed.c src/plugins/plugin_psnr.c
     src/plugins/plugin_single.c src/portab.h
     src/prediction/mbprediction.c src/utils/mbtransquant.c
     src/utils/timer.c src/xvid.c src/xvid.h

2003-09-30 18:10:18 GMT patch-55

    Summary:
      Code cleanups.
    Revision:
      xvidcore--devapi4--1.0--patch-55

    It's been a while since the last ISO C89 conformance cleanup. Using
    the following switches helps a lot :-)
    -Wall -Wsign-compare -Wredundant-decls -Wunreachable-code -Wnested-externs \
    -ansi

    Result: 0 warning/0 error

    modified files:
     src/bitstream/vlc_codes.h src/global.h src/image/qpel.h
     src/motion/estimation.h src/motion/estimation_bvop.c
     src/motion/estimation_gmc.c src/motion/estimation_pvop.c
     src/motion/estimation_rd_based.c src/motion/gmc.c
     src/motion/gmc.h src/motion/motion_inlines.h
     src/motion/vop_type_decision.c

2003-09-29 00:31:32 GMT patch-54

    Summary:
      Memory leakage fixes.
    Revision:
      xvidcore--devapi4--1.0--patch-54

    The pEnc->queue was allocated but not freed when bframes == 0. And
    queue images were not freed as well.
    modified files:
     examples/Makefile src/encoder.c src/image/image.c
     src/utils/mem_align.c

2003-09-28 16:45:02 GMT patch-53

    Summary:
      Fixes the uninitialized mcsel bit in RD based ME
    Revision:
      xvidcore--devapi4--1.0--patch-53

    Fixes the uninitialized mcsel bit in RD based ME

    modified files:
     src/motion/estimation_rd_based.c

2003-09-28 01:00:06 GMT patch-52

    Summary:
      Fix the XviD constant version initialization
    Revision:
      xvidcore--devapi4--1.0--patch-52

    Fix the XviD constant version initialization

    modified files:
     src/xvid.h

2003-09-28 00:47:05 GMT patch-51

    Summary:
      Fix to the build system (the come back).
    Revision:
      xvidcore--devapi4--1.0--patch-51

    Ok this one fixes the way we build the targets. The VPATH thingy
    really works with the library targets. My understanding of the
    VPATH mechanism was wrong. Now it should be ok (I promise).

    modified files:
     build/generic/Makefile

2003-09-27 11:45:18 GMT patch-50

    Summary:
      Small fix to previous patch.
    Revision:
      xvidcore--devapi4--1.0--patch-50

    A pair of double quotes prevented "make" from sorting out the VPATH
    dependencies in "=build". This resulted in compiling all the
    sources, all the time, whether a file changed or not.

    modified files:
     build/generic/Makefile

2003-09-26 22:39:44 GMT patch-49

    Summary:
      Updated the build files for *nix.
    Revision:
      xvidcore--devapi4--1.0--patch-49

    - Fixed MacOSX build (w/o module option). The subversion was not
      right, it was just minor version though it has to be major.minor.
    - Fixed bootstrap.sh for MacOSX environment, it now looks for
      glibtoolize if libtoolize is not found.
    - The unified Makefile now builds XviD out of source tree in the
      directory =build. It's cleaner, and clashes much less w/ arch/tla
      source linting.
    - Tuned the tagging regexps so:
      + autoconf files are ignored (considered backup => not erased,
        not copied and not committed/imported)
      + =build is also considered backup.
    modified files:
     build/generic/Makefile build/generic/bootstrap.sh
     build/generic/configure.in build/generic/platform.inc.in
     build/generic/sources.inc examples/Makefile
     {arch}/=tagging-method

2003-09-24 01:38:03 GMT patch-48

    Summary:
      Bug fix to decoder (mcsel/acpred bits swapped)
    Revision:
      xvidcore--devapi4--1.0--patch-48

    As reported here:
    http://www.xvid.org/modules.php?op=modload&name=phpBB2&file=viewtopic&t=1513&highlight=

    in the spec, there is 'mcsel' before 'ac_pred_flag'. however, if
    you see the code, it's changed.

    We were doing the opposite, ac_pred before mcsel.

    modified files:
     src/decoder.c

2003-09-11 17:19:35 GMT patch-47

    Summary:
      Small fix to GMC+QuarterPel -- BS version bumped to 19
    Revision:
      xvidcore--devapi4--1.0--patch-47

    During the split up, a line has been disabled that prevented good
    quarterpel+GMC. Fixed.

    This fix + patch-43 are a good reason to bump up the bitstream
    version to 19.

    modified files:
     src/motion/gmc.h src/xvid.h

2003-09-11 17:11:28 GMT patch-46

    Summary:
      Build process fix for MacOSX+module option
    Revision:
      xvidcore--devapi4--1.0--patch-46

    This patch fixes the install rule of the MacOSX module style
    library. It adds a PRE_SHARED_LIB == SHARED_LIB for all platforms
    except MacOSX that uses it a different way, and exploits the fact
    $(SPECIFIC_LDFLAGS) is the last var of the build line to insert a
    command for post linking the right .so file.

    modified files:
     build/generic/Makefile build/generic/configure.in
     build/generic/platform.inc.in

2003-09-11 13:56:40 GMT patch-45

    Summary:
      Tree cleanup and build files updated.
    Revision:
      xvidcore--devapi4--1.0--patch-45

    * Win32 files switched to Unix format. Ok, on IRC, we sorted out
      what was b0rking the project files each time I commit them in DOS
      format. In fact the Unix CVS does upload DOS format files if I
      commit them, and then Win32 CVS users get double \r\n files. So
      the best is to use unix format in CVS. This way, the checked out
      versions are right for Win32 users.
NB: this implies a unix2dos conversion when doing a release from a Unix box. I think it's better than the current policy, but automatic tarballs/zip of the tree should take care of that. * odivx and xvid_stat examples removed. Associated project files have been removed as well. * The divx4 compatibility layer has been removed. Associated options in the configure script removed. libxvidcore.def no longer needs to be an autoconf generated file. * rawdec removed. It has never been used and will never be. * Added major api appending to the macosx module build. removed files: build/win32/.arch-ids/odivx_enc_dec.dsp.id build/win32/.arch-ids/xvid_stat.dsp.id build/win32/odivx_enc_dec.dsp build/win32/xvid_stat.dsp examples/.arch-ids/odivx_enc_dec.c.id examples/.arch-ids/xvid_stat.c.id examples/odivx_enc_dec.c examples/xvid_stat.c rawdec/.arch-ids/=id rawdec/.arch-ids/rawdec.c.id rawdec/.arch-ids/rawdec.dsp.id rawdec/rawdec.c rawdec/rawdec.dsp src/.arch-ids/divx4.c.id src/.arch-ids/divx4.h.id src/divx4.c src/divx4.h modified files: build/generic/Makefile build/generic/configure.in build/generic/libxvidcore.def build/win32/libxvidcore.dsp build/win32/xvid_decraw.dsp build/win32/xvid_encraw.dsp build/win32/xvidcore.dsw renamed files: build/generic/.arch-ids/libxvidcore.def.in.id ==> build/generic/.arch-ids/libxvidcore.def.id build/generic/libxvidcore.def.in ==> build/generic/libxvidcore.def removed directories: rawdec rawdec/.arch-ids 2003-09-11 12:59:19 GMT patch-44 Summary: Replaced malloc.h header file with stdlib.h Revision: xvidcore--devapi4--1.0--patch-44 Replaced malloc.h header file with stdlib.h modified files: src/plugins/plugin_lumimasking.c 2003-09-10 22:33:04 GMT patch-43 Summary: Fixed a problem for 'power of 2' framerates Revision: xvidcore--devapi4--1.0--patch-43 Fixed a problem for 'power of 2' framerates modified files: src/bitstream/bitstream.c 2003-09-10 21:57:12 GMT patch-42 Summary: Motion Estimation module splitting.
Revision: xvidcore--devapi4--1.0--patch-42 The motion estimation module was the biggest file of the source tree. After some previous attempts, sysKin decided to split it up again. This time he's done it right. This split up just changes the organization of ME functions inside different files. This should help a bit in keeping the motion estimation manageable by a normal human ;-) Here is the splitting logic quoted from sysKin's email to xvid-devel: - estimation.h: header file #included in all ME modules: + deftypes, + macros + constants NB: no code. - estimation_bvop.c: motion estimation for b-vops. everything in it :) - estimation_common.c: functions shared among all ME modules: + diamonds + subpel + refinement + picture + manipulation + tables + ... etc. - estimation_gmc.c: gruel's GME code - estimation_pvop.c: ME for p-vops. Also SAD-based mode decision - estimation_rd_based.c: everything R-D-based: mode decision (including _Fast) and ME. - gmc.c, gmc.h: no change. new files: src/motion/.arch-ids/estimation.h.id src/motion/.arch-ids/estimation_bvop.c.id src/motion/.arch-ids/estimation_common.c.id src/motion/.arch-ids/estimation_gmc.c.id src/motion/.arch-ids/estimation_pvop.c.id src/motion/.arch-ids/estimation_rd_based.c.id src/motion/.arch-ids/motion_inlines.h.id src/motion/.arch-ids/vop_type_decision.c.id src/motion/estimation.h src/motion/estimation_bvop.c src/motion/estimation_common.c src/motion/estimation_gmc.c src/motion/estimation_pvop.c src/motion/estimation_rd_based.c src/motion/motion_inlines.h src/motion/vop_type_decision.c removed files: src/motion/.arch-ids/motion_est.c.id src/motion/.arch-ids/motion_est.h.id src/motion/.arch-ids/smp_motion_est.c.id src/motion/.arch-ids/smp_motion_est.h.id src/motion/motion_est.c src/motion/motion_est.h src/motion/smp_motion_est.c src/motion/smp_motion_est.h modified files: build/generic/sources.inc build/win32/libxvidcore.dsp src/bitstream/mbcoding.h src/motion/gmc.c src/motion/gmc.h src/motion/motion.h
src/motion/motion_comp.c src/motion/sad.c src/motion/sad.h src/prediction/mbprediction.c src/prediction/mbprediction.h src/utils/mbfunctions.h 2003-09-10 00:40:44 GMT patch-41 Summary: Cleanups and fix to (trellis+thresholding) logic Revision: xvidcore--devapi4--1.0--patch-41 Cleanups to some functions (loop unrolling, call to functions through function array pointers)... Fix to the trellis+thresholding logic. It was comparing the return value of trellis with a threshold but the trellis function returns the last non zero coeff index... this was basically comparing apples with oranges... funny but wrong. Trellis now returns the sum of absolute coeffs, so the comparison is logical. Btw, as discussed on the devel ML, this is probably unneeded as trellis does an RD optimized coeff distribution. modified files: src/utils/mbtransquant.c 2003-09-09 13:13:58 GMT patch-40 Summary: Missing resource for dshow frontend Revision: xvidcore--devapi4--1.0--patch-40 Missing resource for dshow frontend new files: dshow/src/.arch-ids/XviD_logo.bmp.id dshow/src/XviD_logo.bmp 2003-09-08 11:02:10 GMT patch-39 Summary: Small fixes for fast mode decision Revision: xvidcore--devapi4--1.0--patch-39 Small fixes for fast mode decision modified files: src/motion/motion_est.c src/xvid.h 2003-09-05 23:45:48 GMT patch-38 Summary: New RD mode decision and subpel refinement. Revision: xvidcore--devapi4--1.0--patch-38 New stuff from michael. It deals with mode decision and subpel refinement. Integration of these new flags is not settled. Wait and see. Further testing is needed.
modified files: src/motion/motion_est.c src/motion/motion_est.h src/xvid.h 2003-08-29 13:56:30 GMT patch-37 Summary: Still more ME tuning Revision: xvidcore--devapi4--1.0--patch-37 Still more ME tuning modified files: src/motion/motion_est.c 2003-08-28 12:43:22 GMT patch-36 Summary: Removed the expanded cvs Id field Revision: xvidcore--devapi4--1.0--patch-36 Removed the expanded cvs Id field modified files: src/image/x86_asm/qpel_mmx.asm 2003-08-28 12:39:44 GMT patch-35 Summary: More motion est cleanup and bugfixes. Revision: xvidcore--devapi4--1.0--patch-35 Still more bugfixes, cleanups and improvements to the Motion Est by sysKin modified files: src/motion/motion_est.c src/motion/motion_est.h 2003-08-26 13:57:39 GMT patch-34 Summary: Added final bits of Aspect Ratio flag. Revision: xvidcore--devapi4--1.0--patch-34 Peter had the AR flag in mind since the very beginning of devapi4. He just forgot to code the final bits to effectively write it to the bitstream. This patch adds these missing final bits so XviD now reads and writes AR flags. modified files: ./src/bitstream/bitstream.c ./src/encoder.c ./src/encoder.h ./src/xvid.h 2003-08-25 16:41:09 GMT patch-33 Summary: Small motion estimation cleanup. Revision: xvidcore--devapi4--1.0--patch-33 Cleanups from sysKin. modified files: ./src/motion/motion_est.c ./src/motion/motion_est.h 2003-08-25 14:59:28 GMT patch-32 Summary: Frame padding bug. Revision: xvidcore--devapi4--1.0--patch-32 We were edging the image repeating pixels from the image directly, but the standard says we must repeat from a 16 pixel boundary. See Chapter 7.6.4 of the standard. modified files: ./src/image/image.c 2003-08-23 15:07:44 GMT patch-31 Summary: New Qpel code. Revision: xvidcore--devapi4--1.0--patch-31 Isibaar committed a new piece of QPel code that seems to be optimized for ia32(mmx) architectures. I had to clean it up a bit to make it respectful of architecture separations.
This code is disabled for non ia32 arch, a comment mentions it's only faster on ia32... I wonder if it's true, some tests have to be done on sourceforge compile farm in order to confirm that. Compared to the bare CVS commit, this patch includes: - fixes the unix build. - better architecture separation. - CodingStyle respected. new files: ./src/image/.arch-ids/qpel.c.id ./src/image/.arch-ids/qpel.h.id ./src/image/qpel.c ./src/image/qpel.h ./src/image/x86_asm/.arch-ids/qpel_mmx.asm.id ./src/image/x86_asm/qpel_mmx.asm modified files: ./build/generic/sources.inc ./build/win32/libxvidcore.dsp ./src/encoder.c ./src/motion/motion_comp.c ./src/xvid.c 2003-08-22 13:20:36 GMT patch-30 Summary: sad32v now really does what is expected (ie 32x32 SAD :-) Revision: xvidcore--devapi4--1.0--patch-30 sad32v now really does what is expected (ie 32x32 SAD :-) modified files: ./src/motion/sad.c 2003-08-18 19:00:49 GMT patch-29 Summary: 64bit fix. Revision: xvidcore--devapi4--1.0--patch-29 The interpolation code was unsafe on 64bit platforms, the offset was badly sized, resulting in segfaults. modified files: ./src/image/interpolate8x8.h 2003-08-17 14:08:48 GMT patch-28 Summary: Greyscale mode fixes. Revision: xvidcore--devapi4--1.0--patch-28 We were missing some greyscale tests in the encoder loop... notably in the PVOP function when coding an intra block, and in BVOP function when coding all types. I added the cbp trick in the cases discussed above. modified files: ./src/encoder.c 2003-08-13 11:47:33 GMT patch-27 Summary: Forgotten bit for IA64 separation Revision: xvidcore--devapi4--1.0--patch-27 Forgotten bit for IA64 separation modified files: ./src/image/interpolate8x8.h 2003-08-11 15:42:30 GMT patch-26 Summary: Some qpel changes (sync with Isibaar) Revision: xvidcore--devapi4--1.0--patch-26 Some qpel changes (sync with Isibaar) modified files: ./examples/xvid_encraw.c 2003-08-11 15:30:04 GMT patch-25 Summary: Better architecture separation.
Revision: xvidcore--devapi4--1.0--patch-25 Architecture-dependent functions were declared whatever arch you were compiling for. This patch fixes that. I also removed the simple_idct hack in decoder.c as it was simply not used. Better not to have ugly code in there. xvid_bench should now compile and run on all archs. However I did not put the cpu definitions for each arch, I just separated ARCH_IS_IA32 so even ARCH_IS_GENERIC can compile modified files: ./examples/xvid_bench.c ./src/bitstream/cbp.h ./src/dct/fdct.h ./src/dct/idct.h ./src/decoder.c ./src/image/colorspace.h ./src/image/interpolate8x8.h ./src/image/reduced.h ./src/motion/sad.h ./src/quant/quant_h263.h ./src/quant/quant_mpeg4.h ./src/utils/emms.h 2003-08-09 20:47:42 GMT patch-24 Summary: Updated changelog Revision: xvidcore--devapi4--1.0--patch-24 Updated changelog modified files: ./changelog.txt 2003-08-09 20:31:17 GMT patch-23 Summary: Workaround to a GMC bug due to a MS compiler bug. Revision: xvidcore--devapi4--1.0--patch-23 Christoph used a trick to speed up code that resulted in badly optimized code (the compiler was missing a cast) modified files: ./src/motion/gmc.c 2003-08-09 17:09:00 GMT patch-22 Summary: Fixes to xvid_decraw Revision: xvidcore--devapi4--1.0--patch-22 xvid_decraw has always been used on not so high bitrate sequences and not so big sequences either. I've been doing a lot of tests on the Matrix 2 trailer (1000x540 25fps ~5Mbits/s) and xvid_decraw was not able to handle that because of bugs in the buffer filling algorithm. This patch fixes the buffer filling and catches up with Christoph's changes in CVS (wrong help message and option parsing). modified files: ./examples/xvid_decraw.c 2003-08-08 21:31:59 GMT patch-21 Summary: Added QPel and GMC options. Revision: xvidcore--devapi4--1.0--patch-21 XviD has so many options that we forget to propose them all on the CLI, here are two more: GMC and Qpel.
modified files: ./examples/xvid_encraw.c 2003-08-07 19:26:28 GMT patch-20 Summary: SVOP handling in status window Revision: xvidcore--devapi4--1.0--patch-20 SVOP handling in status window modified files: ./vfw/src/status.c 2003-08-07 19:25:03 GMT patch-19 Summary: Warning cleanups by chl Revision: xvidcore--devapi4--1.0--patch-19 Warning cleanups by chl modified files: ./src/encoder.c ./src/encoder.h ./src/image/interpolate8x8.c ./src/motion/motion_est.c ./src/xvid.h 2003-08-06 21:13:35 GMT patch-18 Summary: Fix to GMC sprite trajectory code Revision: xvidcore--devapi4--1.0--patch-18 Fix to GMC sprite trajectory code modified files: ./src/bitstream/mbcoding.c ./src/xvid.h 2003-08-06 10:57:25 GMT patch-17 Summary: Fixes a bug in BVOP block skipping thresholding Revision: xvidcore--devapi4--1.0--patch-17 Fixes a bug in BVOP block skipping thresholding modified files: ./src/motion/motion_est.c 2003-08-03 14:57:32 GMT patch-16 Summary: Functions renaming + motion fixes. Revision: xvidcore--devapi4--1.0--patch-16 BITS flags have been renamed to RD (Rate Distortion) flags... however function names were still xxxBitsxxx. Improved frame type decision. Fix for DQUANTS plugins, their quant was never checked against the valid [1..31] range. modified files: ./src/encoder.c ./src/motion/motion_est.c ./src/motion/motion_est.h 2003-08-02 15:00:49 GMT patch-15 Summary: API cleanup. Revision: xvidcore--devapi4--1.0--patch-15 Since we started devapi3 and then devapi4, feature names did not change because it was just convenient to keep them to minimize the change impact. But most of the flags no longer even suggested what they do. So this patch cleans the API. This patch also changes the way we describe flags, it's more compact and shows better that flags are bit sets that must not overlap. This change fixes a plugin flag overlapping problem as well.
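The "bit sets that must not overlap" convention mentioned for patch-15 can be sketched as follows. This is only an illustration under stated assumptions: the EX_FLAG_* names and the flags_are_disjoint() helper are hypothetical and are not taken from xvid.h. Each flag is written as a shift of 1, so a clash between two flags is visible at a glance, and the helper checks a table of flags for overlaps.

```c
#include <assert.h>

/* Hypothetical flag names, for illustration only (not the real xvid.h
 * identifiers). Writing each flag as (1 << n) makes overlaps easy to spot. */
enum {
    EX_FLAG_HALFPEL = 1 << 0,
    EX_FLAG_QPEL    = 1 << 1,
    EX_FLAG_GMC     = 1 << 2,
    EX_FLAG_TRELLIS = 1 << 3
};

/* Returns 1 if every entry of `flags` is a single bit and no two entries
 * share a bit, 0 otherwise. Hypothetical helper, not part of XviD. */
static int flags_are_disjoint(const unsigned *flags, int count)
{
    unsigned seen = 0;
    int i;

    assert(flags != 0);
    for (i = 0; i < count; i++) {
        unsigned f = flags[i];
        if (f == 0 || (f & (f - 1)) != 0)
            return 0;   /* zero, or more than one bit set */
        if (seen & f)
            return 0;   /* overlaps a flag seen earlier */
        seen |= f;
    }
    return 1;
}
```

A plugin flag that reuses an already assigned bit, which is the kind of problem this patch mentions, would make such a check return 0.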
modified files: ./examples/xvid_encraw.c ./src/decoder.c ./src/encoder.c ./src/encoder.h ./src/motion/motion_est.c ./src/motion/motion_est.h ./src/xvid.c ./src/xvid.h ./vfw/src/codec.c ./vfw/src/config.c 2003-07-29 22:25:12 GMT patch-14 Summary: Fixed bogus memory accesses Revision: xvidcore--devapi4--1.0--patch-14 Fixed bogus memory accesses modified files: ./src/encoder.c ./src/plugins/plugin_2pass1.c 2003-07-28 12:22:33 GMT patch-13 Summary: Bitstream version increased to 16 Revision: xvidcore--devapi4--1.0--patch-13 Bitstream version increased to 16 modified files: ./src/xvid.h 2003-07-25 12:01:51 GMT patch-12 Summary: Added gmc files to the windows project file Revision: xvidcore--devapi4--1.0--patch-12 Added gmc files to the windows project file modified files: ./build/win32/libxvidcore.dsp ./vfw/src/codec.c 2003-07-25 12:00:31 GMT patch-11 Summary: Added cartoon mode from Isibaar Revision: xvidcore--devapi4--1.0--patch-11 Added cartoon mode from Isibaar modified files: ./src/motion/motion_est.c ./src/plugins/plugin_single.c ./src/utils/mbtransquant.c ./src/xvid.h 2003-07-25 10:30:41 GMT patch-10 Summary: Bitstream syntax comments. Revision: xvidcore--devapi4--1.0--patch-10 This patch does not change the bitstream but adds some comments that can help in order to understand (lack of) calls to BitstreamPadAlways. modified files: ./src/bitstream/bitstream.c ./src/encoder.c 2003-07-22 16:34:25 GMT patch-9 Summary: Fixes Bitstream errors in VOL (+ forced stuffing) Revision: xvidcore--devapi4--1.0--patch-9 After a detailed bugreport at: http://www.xvid.org/modules.php?op=modload&name=phpBB2&file=viewtopic&t=1387&highlight= I discovered that: 1/ we did not write video_signal_type, but we were padding to the next byte, that's why we had video_signal_type=0 and then only 1s until the next byte boundary.
This explains the 11 next_start_code(); 2/ video_object_type_indication = Reserved is right on my machine, please check again, but I doubt there is a bug there, we use 3 hard wired values and none of them is zero. 3/ 01 : next_start_code() *** Was wrong in 24.02.2003; is correct in dev-api-4!!! *** was a bug in fact... when we write user data, we pad to the next byte boundary (if needed) like the standard says... by chance padding was almost always done, thus the next_start_code() was respected. 4/ The extra stuffing bits were caused by a forced padding between our VOL function writer and VOP header function writer. modified files: ./src/bitstream/bitstream.c ./src/encoder.c 2003-07-16 22:57:44 GMT patch-8 Summary: Fixed quant4_intra_xmm and quant_intra_3dne bug for DC<0. Revision: xvidcore--devapi4--1.0--patch-8 These two functions suffered from the same error, which consists in emulating idiv with an inverse divisor array and an imul instruction followed by a right shift... That always decreased the result by 1 for negative DC values. A not so bad solution is simply to use a cmov instruction and choose the right value according to the DC value. As these functions are for PIII and Athlon, we are sure we can use the cmov instruction. PS: the fix is somewhere in cosmetic changes... sorry but the code was too ugly to fix it like it was. modified files: ./src/quant/x86_asm/quantize4_xmm.asm ./src/quant/x86_asm/quantize_3dne.asm 2003-07-16 12:58:21 GMT patch-7 Summary: Fixed the build system (error caused by patch-5) Revision: xvidcore--devapi4--1.0--patch-7 IA64 changes were wrong in the build system: they made all platforms try to compile a directory... That patch should fix the IA64 target build and get back to previous behavior for other architectures. modified files: ./build/generic/configure.in ./build/generic/platform.inc.in ./build/generic/sources.inc 2003-07-13 12:16:55 GMT patch-6 Summary: Updates for GME and some cleanups.
Revision: xvidcore--devapi4--1.0--patch-6 This is an all-in-one patch from syskin: * mcsel decision moved to ModeDecision() function. That makes motion loop completely aware of macroblock mode and vectors (amv in that case). A simple copy&paste was needed to make the decision R-D based, to be compatible with MODEDECISION_BITS. * many bugs fixed. Most of them very small, the only big one was that BITS was misunderstanding a flag and was thinking that mpeg quant is used when h263 quant is used and vice versa :( Also, correct cbp with inter4v mode makes mode decision better. Two speedups - for BITS (no more dequantization when sum == 0) and for ChromaME (chroma sad not computed if total sad too big before that). Some GMC compiler warnings removed. Probably more, I don't remember ;) I haven't touched P/B/I decision for once. * compiler warnings removed, mostly "const mismatch" in get_amv() <-- or whatever its name was. * two functions made 2x smaller, shorter and faster. modified files: ./src/encoder.c ./src/global.h ./src/image/interpolate8x8.h ./src/motion/gmc.c ./src/motion/gmc.h ./src/motion/motion_est.c ./src/motion/motion_est.h ./vfw/src/codec.c 2003-07-10 17:35:59 GMT patch-5 Summary: IA64 updates. Revision: xvidcore--devapi4--1.0--patch-5 Changes from Stephan Krause Small updates so ia64 is supposed to work. Further testing is needed because tests have only been done with xvid_encraw.
modified files: ./build/generic/platform.inc.in ./examples/xvid_encraw.c ./src/motion/motion_est.c ./src/xvid.c 2003-07-10 17:27:01 GMT patch-4 Summary: Removed remaining expanded $ lines from the arch repo Revision: xvidcore--devapi4--1.0--patch-4 Removed remaining expanded $ lines from the arch repo modified files: ./CodingStyle ./build/generic/bootstrap.sh ./doc/xvid-encoder.txt ./src/bitstream/ppc_asm/cbp_altivec.s ./src/bitstream/ppc_asm/cbp_ppc.s ./src/dct/x86_asm/fdct_xmm.asm ./src/image/x86_asm/colorspace_yuv_mmx.asm ./src/image/x86_asm/reduced_mmx.asm ./src/image/x86_asm/yuv_to_yv12_mmx.asm ./src/image/x86_asm/yv12_to_rgb24_mmx.asm ./src/image/x86_asm/yv12_to_rgb32_mmx.asm ./src/motion/ppc_asm/sad_altivec.c ./todo.txt 2003-07-02 23:20:39 GMT patch-3 Summary: Reset the IFrame counter when an iframe is encoded Revision: xvidcore--devapi4--1.0--patch-3 Reset the IFrame counter when an iframe is encoded modified files: ./src/encoder.c 2003-06-29 21:58:24 GMT patch-2 Summary: Added 3 warp point GMC. 
Revision: xvidcore--devapi4--1.0--patch-2 Added 3 warp point GMC (first cvs commit + bitstream warp writing fix from cvs) new files: ./src/motion/.arch-ids/gmc.c.id ./src/motion/.arch-ids/gmc.h.id ./src/motion/gmc.c ./src/motion/gmc.h modified files: ./build/generic/sources.inc ./src/bitstream/bitstream.c ./src/decoder.c ./src/decoder.h ./src/encoder.c ./src/encoder.h ./src/global.h ./src/motion/motion.h ./src/motion/motion_comp.c ./src/motion/motion_est.c ./src/motion/motion_est.h ./src/utils/mbfunctions.h ./src/xvid.h 2003-06-29 21:35:01 GMT patch-1 Summary: Updated changelog Revision: xvidcore--devapi4--1.0--patch-1 Updated changelog modified files: ./changelog.txt 2003-06-27 17:01:46 GMT base-0 Summary: tag of ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-80 Revision: xvidcore--devapi4--1.0--base-0 (automatically generated log message) new patches: ed.gomez@free.fr--main/xvidcore--devapi4--1.0--base-0 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-1 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-2 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-3 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-4 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-5 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-6 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-7 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-8 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-9 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-10 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-11 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-12 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-13 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-14 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-15 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-16 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-17 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-18 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-19 
ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-20 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-21 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-22 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-23 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-24 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-25 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-26 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-27 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-28 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-29 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-30 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-31 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-32 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-33 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-34 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-35 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-36 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-37 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-38 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-39 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-40 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-41 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-42 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-43 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-44 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-45 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-46 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-47 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-48 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-49 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-50 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-51 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-52 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-53 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-54 
ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-55 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-56 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-57 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-58 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-59 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-60 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-61 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-62 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-63 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-64 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-65 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-66 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-67 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-68 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-69 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-70 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-71 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-72 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-73 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-74 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-75 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-76 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-77 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-78 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-79 ed.gomez@free.fr--main/xvidcore--devapi4--1.0--patch-80 ed.gomez@free.fr--main/xvidcore--stable--0.9--base-0 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-1 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-2 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-3 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-4 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-5 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-6 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-7 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-8 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-9 
ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-10 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-11 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-12 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-13 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-14 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-15 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-16 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-17 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-18 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-19 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-20 ed.gomez@free.fr--main/xvidcore--stable--0.9--version-0 ed.gomez@free.fr--main/xvidcore--stable--1.0--base-0 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-1 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-2 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-3 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-4 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-5 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-6 2003-06-27 13:42:52 GMT patch-80 Summary: Still fixes and improvements to motion estimation Revision: xvidcore--devapi4--1.0--patch-80 Still fixes and improvements to motion estimation. modified files: src/motion/motion_est.c src/motion/motion_est.h 2003-06-27 13:35:20 GMT patch-79 Summary: Added compile time PNM reading Revision: xvidcore--devapi4--1.0--patch-79 Added compile time PNM reading. It can be useful to test RGB<->YV12 conversions inside XviD. modified files: examples/xvid_encraw.c 2003-06-24 12:19:01 GMT patch-78 Summary: Fixes to the RD ME Revision: xvidcore--devapi4--1.0--patch-78 Fixes to the RD ME. modified files: src/motion/motion_est.c src/motion/motion_est.h 2003-06-14 09:14:11 GMT patch-77 Summary: Zone update. Revision: xvidcore--devapi4--1.0--patch-77 Removed zone warning boxes (they are counter productive) Added zone-based force key frame option. 
modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/config.rc vfw/src/resource.h 2003-06-14 09:06:37 GMT patch-76 Summary: Fixes a bug where type was not respected in a BEFORE plugin. Revision: xvidcore--devapi4--1.0--patch-76 Fixes a bug where type was not respected in a BEFORE plugin. The plugin framework was not copying what was passed to the call_plugins function. modified files: src/encoder.c 2003-06-12 23:03:38 GMT patch-75 Summary: Fixed the old "yellow line on left" with rgb output. Revision: xvidcore--devapi4--1.0--patch-75 Fixed the old "yellow line on left" with rgb output. modified files: src/image/x86_asm/colorspace_rgb_mmx.asm 2003-06-12 23:02:10 GMT patch-74 Summary: Removed log2bin ia32 optimization. Revision: xvidcore--devapi4--1.0--patch-74 Removed log2bin ia32 optimization. modified files: src/bitstream/bitstream.c 2003-06-12 22:55:10 GMT patch-73 Summary: Fixed some small things in encoder. Revision: xvidcore--devapi4--1.0--patch-73 Definitively removed the Hint stuff. Fixed some XXX thingies and some cleanup. modified files: src/encoder.c 2003-06-12 22:51:55 GMT patch-72 Summary: Back to Walken's Idct Revision: xvidcore--devapi4--1.0--patch-72 The simple_idct idea was not so good as is. Waiting for a better solution from michael. modified files: src/xvid.c src/xvid.h 2003-06-10 22:45:57 GMT patch-71 Summary: VFW front end update (New live quant histogram window) Revision: xvidcore--devapi4--1.0--patch-71 Update to the VFW frontend. It includes a new window that shows a live quantizer histogram during the encoding session.
new files: vfw/src/.arch-ids/status.c.id vfw/src/.arch-ids/status.h.id vfw/src/status.c vfw/src/status.h modified files: vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/config.h vfw/src/config.rc vfw/src/driverproc.c vfw/src/resource.h vfw/vfw.dsp 2003-06-10 20:58:09 GMT patch-70 Summary: Fixed the win32 project file because of patch-64 Revision: xvidcore--devapi4--1.0--patch-70 Removed adapt_quant.[ch] files from the libxvidcore win32 project file. modified files: build/win32/libxvidcore.dsp 2003-06-10 20:53:31 GMT patch-69 Summary: Added direct target frame size support + cosmetic. Revision: xvidcore--devapi4--1.0--patch-69 If the target bitrate is < 0, it is now interpreted as a target size in kbytes. I also did some cosmetic work to remove all space indents ^_^. modified files: src/plugins/plugin_2pass2.c 2003-06-10 09:13:40 GMT patch-68 Summary: xvid_bench updates and corresponding Makefile changes. Revision: xvidcore--devapi4--1.0--patch-68 As mentioned on the devel mailing list, xvid_bench did not even compile anymore. This patch updates xvid_bench to the new API. xvid_bench is now compiled with other examples by the Makefile, this makes it mandatory to include ../build/generic/platform.inc to have the ARCH_IS_xxxx constants. Dunno if it has an impact on Win32 project files. modified files: examples/Makefile examples/xvid_bench.c 2003-06-10 09:05:14 GMT patch-67 Summary: Probably a small copy/paste error Revision: xvidcore--devapi4--1.0--patch-67 XVID_CSP_BGR was advertised as being a 32bit packed format -> 24bit is the right pixel size modified files: src/xvid.h 2003-06-09 19:39:47 GMT patch-66 Summary: Activated simple_idct_mmx. Revision: xvidcore--devapi4--1.0--patch-66 This patch activates simple_idct_mmx use. However it tries to make sure old streams (< version 10) are decoded using Walken's mmx version. A noticeable bitstream version change, it is now numbered 11. The number 10 is used on the cvs_head version for the same code change.
modified files: src/bitstream/bitstream.c src/dct/simple_idct.c src/dct/x86_asm/simple_idct_mmx.asm src/decoder.c src/decoder.h src/xvid.c src/xvid.h 2003-06-09 19:15:18 GMT patch-65 Summary: Remaining include of adapt_quant.h Revision: xvidcore--devapi4--1.0--patch-65 encoder.c was still including adapt_quant.h. Removed. modified files: src/encoder.c 2003-06-09 17:49:44 GMT patch-64 Summary: Moved code from adapt_quant.c to the lumimasking plugin. Revision: xvidcore--devapi4--1.0--patch-64 The lumimasking plugin was using functions from outside. As I understand what plugins are, they should not rely on code outside their module as much as it is possible to achieve. Here it was clear, the plugin could be made standalone. PS: it seems lumimasking is a no-op plugin, it's probably a bug in the plugin framework. No time to track this. removed files: src/quant/.arch-ids/adapt_quant.c.id src/quant/.arch-ids/adapt_quant.h.id src/quant/adapt_quant.c src/quant/adapt_quant.h modified files: build/generic/sources.inc src/plugins/plugin_lumimasking.c 2003-06-09 13:45:29 GMT patch-63 Summary: Legal GNU GPL Headers and copyright holders. Revision: xvidcore--devapi4--1.0--patch-63 Added Legal GNU GPL headers and copyright holders as defined in XviD 0.9.x. There are still some wrong copyrights (atm noted 'Anonymous') and I probably missed some old headers that contain the GNU GPL pattern my script searched for.
modified files: dshow/src/CAbout.cpp dshow/src/CAbout.h dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h dshow/src/IXvidDecoder.h examples/xvid_bench.c examples/xvid_decraw.c examples/xvid_encraw.c examples/xvid_stat.c rawdec/rawdec.c src/bitstream/bitstream.c src/bitstream/bitstream.h src/bitstream/cbp.c src/bitstream/cbp.h src/bitstream/mbcoding.c src/bitstream/mbcoding.h src/bitstream/vlc_codes.h src/bitstream/zigzag.h src/dct/fdct.c src/dct/fdct.h src/dct/idct.c src/dct/idct.h src/dct/simple_idct.c src/decoder.c src/decoder.h src/divx4.c src/divx4.h src/encoder.c src/encoder.h src/global.h src/image/colorspace.c src/image/colorspace.h src/image/font.c src/image/font.h src/image/image.c src/image/image.h src/image/interpolate8x8.c src/image/interpolate8x8.h src/image/reduced.c src/image/reduced.h src/motion/motion.h src/motion/motion_comp.c src/motion/motion_est.c src/motion/motion_est.h src/motion/ppc_asm/sad_altivec.c src/motion/sad.c src/motion/sad.h src/motion/smp_motion_est.c src/motion/smp_motion_est.h src/plugins/plugin_2pass1.c src/plugins/plugin_2pass2.c src/plugins/plugin_dump.c src/plugins/plugin_fixed.c src/plugins/plugin_lumimasking.c src/plugins/plugin_psnr.c src/plugins/plugin_single.c src/portab.h src/prediction/mbprediction.h src/quant/adapt_quant.c src/quant/adapt_quant.h src/quant/quant_h263.h src/quant/quant_matrix.c src/quant/quant_matrix.h src/quant/quant_mpeg4.c src/quant/quant_mpeg4.h src/utils/emms.c src/utils/emms.h src/utils/mbfunctions.h src/utils/mbtransquant.c src/utils/mem_align.c src/utils/mem_align.h src/utils/mem_transfer.c src/utils/mem_transfer.h src/utils/timer.c src/utils/timer.h src/xvid.c src/xvid.h vfw/src/2pass.h vfw/src/codec.h vfw/src/config.h vfw/src/debug.h vfw/src/resource.h vfw/src/vfwext.h 2003-06-09 01:13:50 GMT patch-62 Summary: ANSI C comments. Revision: xvidcore--devapi4--1.0--patch-62 Turned all // ISO C99 comments into ISO C89 (aka ANSI C) comment style.
Now XviD compiles fine with gcc 3.x -std=iso89 option. This should help those people who want to get XviD working on DSPs or any other exotic hardware. This type of exotic hardware is usually shipped with a very spartan ANSI C compiler. NB: Big patch that breaks all kinds of cherry-picking merges. modified files: examples/odivx_enc_dec.c examples/xvid_bench.c src/bitstream/bitstream.c src/bitstream/mbcoding.c src/bitstream/mbcoding.h src/bitstream/zigzag.h src/dct/fdct.c src/dct/idct.c src/dct/simple_idct.c src/decoder.c src/decoder.h src/encoder.c src/encoder.h src/global.h src/image/colorspace.c src/image/font.c src/image/image.c src/image/interpolate8x8.c src/image/interpolate8x8.h src/image/reduced.c src/motion/motion.h src/motion/motion_comp.c src/motion/motion_est.c src/motion/motion_est.h src/motion/sad.c src/plugins/plugin_2pass2.c src/prediction/mbprediction.c src/prediction/mbprediction.h src/quant/adapt_quant.c src/quant/adapt_quant.h src/quant/quant_h263.c src/quant/quant_h263.h src/quant/quant_mpeg4.c src/quant/quant_mpeg4.h src/utils/mbtransquant.c src/utils/mem_transfer.c src/utils/timer.c src/xvid.c vfw/src/2pass.c vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/config.h vfw/src/driverproc.c 2003-06-04 18:19:56 GMT patch-61 Summary: Removed AltCC from VFW frontend Revision: xvidcore--devapi4--1.0--patch-61 A previous patch removed AltCC from the 2pass plugin. Thus we remove the frontend panels for AltCC and corresponding code. modified files: src/xvid.h vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/config.rc 2003-05-29 14:47:28 GMT patch-60 Summary: Lot of two pass updates. Revision: xvidcore--devapi4--1.0--patch-60 * Removed Alt curve treatment * After reading VFW code, I found out that it was using the blocks stats fields to retrieve the number of MBs in a frame.
So all this min_size[] was not meant to discover a min_size for each frame according to its intra MBs but rather a hardcoded minimum for all frames as advertised in earlier cvs revisions. It would be easier if the original code were commented :-( * Some comment changes * bquant_error and pquant_error have been replaced by an array quant_error[3][32] indexed by frame type and quantizer value. * Moved some initialization stuff * I read VFW and noticed that min_length was supposed to be: min{hard coded length, min{observed lengths}} * Force frame type during the second pass. * Simplified equations. Scaling was needed because of the non linear formulas used in AltCC but now we can directly use avg_length[s->type-1] instead of "first prescaling bframes to pframes lengths then use pframe stats and at last prescaling back frame length to bframe lengths" See my new XXX: question about the overflow. modified files: src/plugins/plugin_2pass2.c 2003-05-25 10:01:55 GMT patch-59 Summary: Function reordering, fix minimum "hardcoded" frame sizes in internal_scale(). Revision: xvidcore--devapi4--1.0--patch-59 Fixed a bug where hardcoded minimum frame lengths were computed only for the first frame (IFrame) and were applied to all frames. I just moved the formulas into the frame loop. Lots of cosmetic work, function reordering, etc., so that the plugin functions come first, followed by sub-functions and helper functions. Some fixes in my previous comments. modified files: src/plugins/plugin_2pass2.c 2003-05-22 23:11:21 GMT patch-58 Summary: Added the container_frame_overhead field to the 2pass2 RC structure. Revision: xvidcore--devapi4--1.0--patch-58 In my previous patches, I disabled container format overhead compensation because xvidcore can be used for other things than AVI. However this compensation is useful, so it's back with its own structure field that specifies how many bytes the container uses for a frame (average value).
We can now do some direct ogm, matroska encodings without losing a single byte... :-) modified files: src/plugins/plugin_2pass2.c src/xvid.h vfw/src/codec.c 2003-05-22 22:22:47 GMT patch-57 Summary: Fixed an overflow bug in target filesize computation. Revision: xvidcore--devapi4--1.0--patch-57 rc->target was declared as uint64_t to avoid overflow when dealing with long movies and/or high bitrates. The problem is that its initialization used int32 data, so the initial computation itself overflowed. Quite silly, but this bug drove me crazy for 4 hours... modified files: src/plugins/plugin_2pass2.c 2003-05-22 18:53:19 GMT patch-56 Summary: Added the mrproper Makefile target. Revision: xvidcore--devapi4--1.0--patch-56 Added the mrproper Makefile target that deletes even bootstrapped files. The mrproper name comes from the linux kernel makefile, I was out of inspiration. modified files: build/generic/Makefile 2003-05-22 17:30:15 GMT patch-55 Summary: Fix a nasty bug due to a typo. Revision: xvidcore--devapi4--1.0--patch-55 We were comparing frame length with a wrong min_size[index] that was out of bounds (in internal_scale). modified files: src/plugins/plugin_2pass2.c 2003-05-22 17:24:19 GMT patch-54 Summary: Removed automatic \n in DPRINTF calls. Revision: xvidcore--devapi4--1.0--patch-54 Removed automatic \n in DPRINTF calls. modified files: src/bitstream/bitstream.c src/bitstream/mbcoding.c src/decoder.c src/encoder.c src/image/image.c src/plugins/plugin_2pass2.c src/portab.h src/prediction/mbprediction.c 2003-05-22 17:03:38 GMT patch-53 Summary: Cleaned up a bit, added comments. Revision: xvidcore--devapi4--1.0--patch-53 I cleaned up the plugin_before function.
I added some comments at the same time, so now it should be easier to understand the meaning of all these if/else thingies :-) modified files: src/plugins/plugin_2pass2.c 2003-05-18 12:12:49 GMT patch-52 Summary: Update of xvid_encraw (vop_debug, debug, max key frame) Revision: xvidcore--devapi4--1.0--patch-52 Added a -vop_debug option. This makes xvidcore print frame information directly into the encoded frame. Changed the meaning of the -debug option. It now activates the internal xvidcore debug output. Added a -max_key_interval option. modified files: examples/xvid_encraw.c 2003-05-18 12:01:31 GMT patch-51 Summary: Missing RateControl removal from Win32 visual project. Revision: xvidcore--devapi4--1.0--patch-51 RateControl removal was missing in the visual c project. modified files: build/win32/libxvidcore.dsp 2003-05-18 00:08:46 GMT patch-50 Summary: Removed legacy RateControl module. Revision: xvidcore--devapi4--1.0--patch-50 Removed all code related to the old RateControl module. removed files: src/utils/.arch-ids/ratecontrol.h.id src/utils/.arch-ids/ratecontrol.c.id src/utils/ratecontrol.h src/utils/ratecontrol.c modified files: build/generic/sources.inc src/encoder.h 2003-05-17 23:54:55 GMT patch-49 Summary: VFW Update. Revision: xvidcore--devapi4--1.0--patch-49 Added support for the debug option. The registry key debug has been changed to vop_debug. The reg key debug is now used for the codec debugging output. Some work on zones and misspellings. modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/config.rc vfw/src/resource.h 2003-05-17 23:50:38 GMT patch-48 Summary: 2pass plugin updates for zone support. Revision: xvidcore--devapi4--1.0--patch-48 A bit more work on zones support in the 2Pass2 plugin. Simple cleanup in the 2Pass1 plugin.
modified files: src/plugins/plugin_2pass1.c src/plugins/plugin_2pass2.c 2003-05-17 21:07:43 GMT patch-47 Summary: Debug is now controlled through xvid_global + INIT Revision: xvidcore--devapi4--1.0--patch-47 Debug is now controlled through a global variable. It can be set through the API using the new xvid_gbl_init_t.debug field. All DPRINTF constants have been turned into XVID_DEBUG_xxxx. They have been moved to xvid.h. modified files: src/bitstream/bitstream.c src/bitstream/mbcoding.c src/decoder.c src/encoder.c src/image/image.c src/plugins/plugin_2pass2.c src/plugins/plugin_single.c src/portab.h src/prediction/mbprediction.c src/utils/ratecontrol.c src/xvid.c src/xvid.h 2003-05-17 20:32:59 GMT patch-46 Summary: Fix for the patch-44. Revision: xvidcore--devapi4--1.0--patch-46 The fix from pete was breaking a lot of other stuff, or at least it was highlighting it. Now it works reliably. modified files: src/encoder.c 2003-05-15 17:31:04 GMT patch-45 Summary: Removed XVID_VOP_DYNAMIC_BFRAMES flag. Revision: xvidcore--devapi4--1.0--patch-45 The encoder loop bugfix removed this flag, so it's now being removed from xvid_encraw. modified files: examples/xvid_encraw.c 2003-05-15 17:24:55 GMT patch-44 Summary: Fix to the encoder loop (was not respecting dynamic decision). Revision: xvidcore--devapi4--1.0--patch-44 The long awaited fix to the encoder loop that was not respecting the dynamic decision performed by the MEAnalysis function. modified files: src/encoder.c src/xvid.h 2003-05-14 23:27:59 GMT patch-43 Summary: Added module building for MacOSX. Revision: xvidcore--devapi4--1.0--patch-43 Added the --enable-macosx_module option to the configure script. It allows module building on that platform as it differentiates between loadable modules (a la dlopen) and dynamic libs that are simply linked at compile time. This was needed for transcode.
Patch contributed by Tilmann Bitterberg modified files: build/generic/configure.in 2003-05-14 20:21:30 GMT patch-42 Summary: Merged RD ME from cvs_head. Revision: xvidcore--devapi4--1.0--patch-42 Syskin has changed a bit the ME algorithm, so now it does a kind of RD optimization of Vector search. modified files: src/motion/motion_est.c src/motion/motion_est.h 2003-05-14 18:40:40 GMT patch-41 Summary: Merged syskin ME changes. Revision: xvidcore--devapi4--1.0--patch-41 Merged last syskin ME changes. Matches motion_est.c:1.69 and motion_est.h:1.7 minus unneeded code plus some changes due to new API. modified files: src/motion/motion_est.c src/motion/motion_est.h 2003-05-14 17:28:52 GMT patch-40 Summary: Small update to xvid_encraw. Revision: xvidcore--devapi4--1.0--patch-40 I added an help message to mention the fact we can repeat the zone options. modified files: examples/xvid_encraw.c 2003-05-14 14:19:12 GMT patch-39 Summary: VFW Update (zone support, profile support) Revision: xvidcore--devapi4--1.0--patch-39 VFW Update (zone support, profile support) new files: vfw/src/.arch-ids/vfwext.h.id vfw/src/.arch-ids/debug.h.id vfw/src/vfwext.h vfw/src/debug.h modified files: vfw/src/codec.c vfw/src/config.c vfw/src/config.h vfw/src/config.rc vfw/src/driverproc.c vfw/src/driverproc.def vfw/src/resource.h vfw/vfw.dsp 2003-05-14 14:02:05 GMT patch-38 Summary: Add support for single RC and zones to xvid_encraw. Revision: xvidcore--devapi4--1.0--patch-38 Adds support for single RC and zones to xvid_encraw. modified files: examples/xvid_encraw.c 2003-05-14 13:58:56 GMT patch-37 Summary: Fixes for Win32 build of libxvidcore. Revision: xvidcore--devapi4--1.0--patch-37 A previous patch left the Win32 build process incomplete and not up to date. modified files: build/generic/libxvidcore.def.in build/win32/libxvidcore.dsp 2003-05-13 00:10:12 GMT patch-36 Summary: Small fixes. Revision: xvidcore--devapi4--1.0--patch-36 data->quant fix. Fixed some coding bugs in trellis code. 
Used __inline and not inline. modified files: src/encoder.c src/utils/mbtransquant.c 2003-05-13 00:05:03 GMT patch-35 Summary: CBR plugin is renamed Single pass. Fixed Quant plugin is disabled. Revision: xvidcore--devapi4--1.0--patch-35 With the zones feature, the CBR plugin could be used for all types of one pass RC. The best thing to do would be to include fixed quant in this new single pass plugin. Btw, a (clean) solution has not been found yet. I am obliged to disable the fixed quant plugin. This breaks xvid_encraw :-( modified files: build/generic/sources.inc src/plugins/plugin_single.c src/xvid.h renamed files: src/plugins/.arch-ids/plugin_cbr.c.id ==> src/plugins/.arch-ids/plugin_single.c.id src/plugins/plugin_cbr.c ==> src/plugins/plugin_single.c 2003-05-12 23:49:14 GMT patch-34 Summary: Removed quant limits per RC plugin, moved to global settings. Revision: xvidcore--devapi4--1.0--patch-34 The I/P/B Frames' min/max quantizers have moved from RC plugins' interface to the general encoding interface. The CBR plugin has been updated for zones and the quant limits move. modified files: src/encoder.c src/encoder.h src/plugins/plugin_2pass2.c src/plugins/plugin_cbr.c src/xvid.h 2003-05-12 23:25:54 GMT patch-33 Summary: Added encoding zones Revision: xvidcore--devapi4--1.0--patch-33 Added encoding zones in 2pass plugins. The idea behind "zones" is to define frame ranges for which we change the plugin's behavior. modified files: src/encoder.c src/encoder.h src/plugins/plugin_2pass1.c src/xvid.h 2003-05-12 23:10:17 GMT patch-32 Summary: Added the profile setting. Revision: xvidcore--devapi4--1.0--patch-32 Added the profile setting to user API. modified files: src/bitstream/bitstream.c src/bitstream/bitstream.h src/encoder.c src/encoder.h src/xvid.h 2003-05-11 23:59:01 GMT patch-31 Summary: Changed quality presets. Revision: xvidcore--devapi4--1.0--patch-31 The presets have been changed so now we should have better PSNR with higher quality presets in all cases.
I slightly changed the way we treat quality overflow or underflow: now I just clip the value to the allowed range. modified files: examples/xvid_encraw.c 2003-05-11 20:47:55 GMT patch-30 Summary: Some cleanups in the trellis code. Revision: xvidcore--devapi4--1.0--patch-30 Some cleanup work on trellis code. Should compile fine on Visual C++ now. modified files: src/utils/mbtransquant.c 2003-05-10 23:53:28 GMT patch-29 Summary: New trellis code Revision: xvidcore--devapi4--1.0--patch-29 New trellis code from skal. It should be reworked a bit so it integrates better into XviD code. modified files: src/bitstream/mbcoding.c src/bitstream/mbcoding.h src/utils/mbtransquant.c 2003-05-10 23:43:11 GMT patch-28 Summary: Intra frame decision. Revision: xvidcore--devapi4--1.0--patch-28 These are syskin's words: hopefully, intra frame at every scene change (we really need it to cut things). modified files: src/motion/motion_est.c 2003-05-05 21:50:25 GMT patch-27 Summary: Fixed double last calculation in trellis quantization. Revision: xvidcore--devapi4--1.0--patch-27 chl changelog message: Removed double calculation of "last" => +0.02dB modified files: src/bitstream/mbcoding.c 2003-05-05 21:46:29 GMT patch-26 Summary: Added config.status to the distclean target. Revision: xvidcore--devapi4--1.0--patch-26 The target distclean is supposed to clean all files so the remaining ones are those supposed to be found in a distribution tarball... config.status is not one of them, so let's add this to the distclean target. modified files: build/generic/Makefile 2003-05-05 21:39:47 GMT patch-25 Summary: configure.in tuning (API number and lib sonames). Revision: xvidcore--devapi4--1.0--patch-25 I fixed a typo which prevented SPECIFIC_CFLAGS from being properly set by the configure script.
I also bumped the API version number as API 3.0 is current cvs_head and this branch is the next major API version. While trying to build my own debian package out of xvidcore, I ran into trouble with the soname not respecting some basic rules that prevented having different library revisions running alongside (with different major APIs). This has been fixed by adding the major API number to the library SONAME. modified files: build/generic/configure.in 2003-04-27 23:22:30 GMT patch-24 Summary: Cleaned CBR plugin a bit, adds structure for a better initial quant. Revision: xvidcore--devapi4--1.0--patch-24 Just a clean up turning default values to preprocessor constants. I added a get_initial_quant function so that, in the near future, we can try to retrieve a good quantizer according to the desired target bitrate. This will be done thanks to a simple LUT where we'll have lut[quant] = average_bitrate;. This seems stupid but it'll be better than starting with a hardcoded value. modified files: src/plugins/plugin_cbr.c 2003-04-27 23:18:20 GMT patch-23 Summary: b-frames look good in still motion, after all. Revision: xvidcore--devapi4--1.0--patch-23 b-frames look good in still motion, after all. modified files: src/motion/motion_est.c 2003-04-27 23:14:39 GMT patch-22 Summary: Add initial trellis quantization to inter+h263 frames. Revision: xvidcore--devapi4--1.0--patch-22 This is the initial support of trellis quantization for inter frames + h263 quantization method. Complete support is on the way. modified files: examples/xvid_encraw.c src/bitstream/mbcoding.c src/bitstream/mbcoding.h src/utils/mbtransquant.c src/xvid.h 2003-04-27 22:59:47 GMT patch-21 Summary: Fixes 2 memory leaks. Revision: xvidcore--devapi4--1.0--patch-21 After a valgrind pass I fixed these 2 leaks. We still have to fix an MEAnalysis on uninitialized data. modified files: src/encoder.c src/utils/mem_align.c 2003-04-27 22:50:27 GMT patch-20 Summary: Adds Avg PSNR output to xvid_encraw.
Revision: xvidcore--devapi4--1.0--patch-20 Adds Avg PSNR output to xvid_encraw. modified files: examples/xvid_encraw.c 2003-04-27 22:40:45 GMT patch-19 Summary: Fixes the vfw Visual Project. Revision: xvidcore--devapi4--1.0--patch-19 A missing file has been removed from the project file. modified files: vfw/vfw.dsp 2003-04-14 20:07:47 GMT patch-18 Summary: Fixes plugin initialization in xvid_encraw. Revision: xvidcore--devapi4--1.0--patch-18 We were initializing plugins' versions before a memset... Doh... modified files: examples/xvid_encraw.c 2003-04-14 15:28:57 GMT patch-17 Summary: Fixed function prototypes <-> definitions mismatching. Revision: xvidcore--devapi4--1.0--patch-17 Fixed function prototypes <-> definitions mismatching. modified files: src/utils/mbfunctions.h src/utils/mbtransquant.c 2003-04-14 15:23:15 GMT patch-16 Summary: VFW frontend update Revision: xvidcore--devapi4--1.0--patch-16 The VFW frontend has been updated. modified files: vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/config.h vfw/src/config.rc vfw/src/driverproc.c vfw/src/resource.h vfw/vfw.dsp 2003-04-14 15:05:20 GMT patch-15 Summary: Fixed BITS decision for low quants, reworked p/b/i decision. Revision: xvidcore--devapi4--1.0--patch-15 sysKin's log message: improved vhq (does not decrease psnr anymore - at least for low quants) and tweaked p/b/i decision again. I added a fix to this CVS commit to avoid ALU Exception (division by zero). It has been committed to cvs_head as well by sysKin. modified files: src/motion/motion_est.c 2003-04-10 13:01:07 GMT patch-14 Summary: Removed all ABS() macros. Revision: xvidcore--devapi4--1.0--patch-14 All ABS macros have been replaced with their stdlib.h/math.h equivalent. This gives a 33% overall speedup for the plain C encoder, while the ia32 one seems to suffer a small speed loss.
However this speed loss is very small and it seems to depend on the CPU type, as the abs/fabs usage hurts the sad functions but helps the interpolate functions... weird, isn't it? modified files: src/bitstream/mbcoding.c src/global.h src/image/image.c src/motion/motion_comp.c src/motion/motion_est.c src/motion/sad.c src/prediction/mbprediction.c src/utils/mbtransquant.c src/xvid.c 2003-04-09 18:44:24 GMT patch-13 Summary: Added GNU profiling option to the configure script. Revision: xvidcore--devapi4--1.0--patch-13 In order to allow easy profiling using GNU tools (gprof, gcov), I added the --enable-gnuprofile to the configure.in template. This modifies the SPECIFIC_LDFLAGS and SPECIFIC_CFLAGS for library building so they include all needed options for profiling and test coverage. /!\ When compiling your own program, don't forget to use these options: -pg -fprofile-arcs -ftest-coverage When linking your program, you MUST use the -pg option too, else your binary will not use/create profiling information. modified files: build/generic/configure.in 2003-04-09 16:09:33 GMT patch-12 Summary: Build fix from release-0_9_1-fixes@cvs.xvid.org Revision: xvidcore--devapi4--1.0--patch-12 MacOSX build process was wrong on the linking stage as it was ignoring the equivalent of the linux soname thingy. modified files: build/generic/configure.in 2003-04-09 13:44:06 GMT patch-11 Summary: Syncing arch tree with xvid.org cvs. Revision: xvidcore--devapi4--1.0--patch-11 Synced with all the work done in the xvid.org cvs repository. I could not maintain a complete list of all items but here is a kind of digest. + Merged build files fixes from the release-0_9_1-fixes branch. + Synced all motion estimation changes from the cvs_head branch. + Added rate control plugins. + Added lumimasking plugin. + Synced optimizations from cvs_head in interpolate and cbp functions. + xvid_encraw improvements. + new mbtransquant set of functions. + Fixed bframe SSE calculation.
new files: src/plugins/.arch-ids/plugin_2pass1.c.id src/plugins/.arch-ids/plugin_2pass2.c.id src/plugins/.arch-ids/plugin_cbr.c.id src/plugins/.arch-ids/plugin_fixed.c.id src/plugins/.arch-ids/plugin_lumimasking.c.id vfw/src/.arch-ids/XviD_logo.bmp.id src/plugins/plugin_2pass1.c src/plugins/plugin_2pass2.c src/plugins/plugin_cbr.c src/plugins/plugin_fixed.c src/plugins/plugin_lumimasking.c vfw/src/XviD_logo.bmp modified files: build/generic/Makefile build/generic/bootstrap.sh build/generic/configure.in build/generic/libxvidcore.def.in build/generic/platform.inc.in build/generic/sources.inc build/win32/libxvidcore.dsp examples/xvid_decraw.c examples/xvid_encraw.c examples/xvid_stat.c src/bitstream/bitstream.c src/bitstream/bitstream.h src/bitstream/cbp.c src/bitstream/mbcoding.c src/bitstream/vlc_codes.h src/decoder.c src/divx4.c src/encoder.c src/encoder.h src/global.h src/image/image.c src/image/interpolate8x8.c src/motion/motion.h src/motion/motion_comp.c src/motion/motion_est.c src/motion/motion_est.h src/motion/sad.c src/plugins/plugin_dump.c src/plugins/plugin_psnr.c src/portab.h src/prediction/mbprediction.c src/utils/mbfunctions.h src/utils/mbtransquant.c src/xvid.c src/xvid.h vfw/src/2pass.c vfw/src/codec.c 2003-03-16 00:21:32 GMT patch-10 Summary: Added suxen plugin system (Synced with CVS) Revision: xvidcore--devapi4--1.0--patch-10 Sync with the CVS and thus adds the plugin framework. new files: src/plugins/.arch-ids/=id src/plugins/.arch-ids/plugin_dump.c.id src/plugins/.arch-ids/plugin_psnr.c.id src/plugins/plugin_dump.c src/plugins/plugin_psnr.c modified files: build/win32/libxvidcore.dsp examples/Makefile examples/xvid_decraw.c examples/xvid_encraw.c src/bitstream/bitstream.c src/encoder.c src/encoder.h src/portab.h src/utils/mbtransquant.c src/xvid.h new directories: src/plugins/.arch-ids src/plugins 2003-03-11 23:37:06 GMT patch-9 Summary: Changed xvid_decraw option handling for -d/-m. 
Revision: xvidcore--devapi4--1.0--patch-9 -d and -m options were boolean so option values were not needed. modified files: examples/xvid_decraw.c 2003-03-11 23:30:16 GMT patch-8 Summary: Fixed frame counting in xvid_encraw. Revision: xvidcore--devapi4--1.0--patch-8 We were branching before incrementing the frame counter when core was buffering frames. This was resulting in wrong frame counting during the buffering phase. modified files: examples/xvid_encraw.c 2003-03-11 23:07:01 GMT patch-7 Summary: Ported xvid_decraw to new API. Revision: xvidcore--devapi4--1.0--patch-7 xvid_decraw has been ported to new API. It basically works fine, however I'm not completely satisfied. If I do a step by step run then I can see that the second frame is reported as a VOL decoding though the first IFrame has been consumed. As a result, xvid_decraw does not report frame lengths correctly. Except that, xvid_decraw works well enough to activate it in the makefile. modified files: examples/Makefile examples/xvid_decraw.c 2003-03-11 20:19:44 GMT patch-6 Summary: Fix an important API comment in main header. Revision: xvidcore--devapi4--1.0--patch-6 This patch fixes a comment in xvid.h which was simply wrong and could lead to unneeded code. modified files: src/xvid.h 2003-03-11 00:36:34 GMT patch-5 Summary: PSNR is now an option. More consistent -m/-s option handling. Revision: xvidcore--devapi4--1.0--patch-5 I turned PSNR stats into an option (-s). I fixed the handling of the -m option that required only a bool. I could say in French "ma stupidité dans toute sa grandeur" (my stupidity in all its grandeur). -m presence is enough to signify "save _m_peg stream", we don't need the boolean value. modified files: examples/xvid_encraw.c 2003-03-10 00:36:15 GMT patch-4 Summary: Adds extended stats support even for bframes in xvidcore. Revision: xvidcore--devapi4--1.0--patch-4 This patch enables core extended stats support even for bframes. It modifies the way the MBTransQuantBVOP function does its work.
It used to not dequant, idct the MB because bframes are never used as reference frames. However if we want to compute stats, then we must perform these inverse transformations. modified files: src/encoder.c src/utils/mbfunctions.h src/utils/mbtransquant.c 2003-03-09 16:42:27 GMT patch-3 Summary: Adds extended stats support. Revision: xvidcore--devapi4--1.0--patch-3 This patch enables core extended stats support. It seems that xvidcore does not compute sse for BFrames. I tried to add this with an ugly hack but it did not work as expected; I suppose core does not decompress bframes as they are not used as reference frames (unlike P and I frames). If we succeed in enabling sse calculation in core for bframes, then xvid_stat will not be needed anymore. This will save a lot of trouble with frame matching in PSNR computation when bframes are enabled. modified files: examples/xvid_encraw.c 2003-03-09 00:23:52 GMT patch-2 Summary: Updated xvid_encraw for new API. Revision: xvidcore--devapi4--1.0--patch-2 This patch updates the xvid_encraw example to support the new API. As it's the first patch for API 4 support, I disabled all other examples. BUG: first frame type is Unknown, I suppose I'm missing a subtlety of the new API. modified files: examples/Makefile examples/xvid_encraw.c src/encoder.c 2003-03-06 22:08:43 GMT patch-1 Summary: Synced with dev-api-4 XviD branch. Revision: xvidcore--devapi4--1.0--patch-1 Synced with dev-api-4 XviD branch. My branching was done at a later point than CVS. This resulted in version skew, now this branch is synced with CVS.
new files: vfw/.arch-ids/=id vfw/bin/.arch-ids/=id vfw/src/.arch-ids/=id vfw/.arch-ids/vfw.dsp.id vfw/bin/.arch-ids/xvid.inf.id vfw/src/.arch-ids/2pass.c.id vfw/src/.arch-ids/2pass.h.id vfw/src/.arch-ids/codec.c.id vfw/src/.arch-ids/codec.h.id vfw/src/.arch-ids/config.c.id vfw/src/.arch-ids/config.h.id vfw/src/.arch-ids/config.rc.id vfw/src/.arch-ids/driverproc.c.id vfw/src/.arch-ids/driverproc.def.id vfw/src/.arch-ids/resource.h.id rawdec/.arch-ids/rawdec.c.id rawdec/.arch-ids/rawdec.dsp.id rawdec/.arch-ids/=id dshow/.arch-ids/=id dshow/.arch-ids/authors.txt.id dshow/.arch-ids/dshow.dsp.id dshow/src/.arch-ids/=id dshow/src/.arch-ids/CAbout.cpp.id dshow/src/.arch-ids/CAbout.h.id dshow/src/.arch-ids/CXvidDecoder.cpp.id dshow/src/.arch-ids/CXvidDecoder.h.id dshow/src/.arch-ids/IXvidDecoder.h.id dshow/src/.arch-ids/resource.h.id dshow/src/.arch-ids/xvid.ax.def.id dshow/src/.arch-ids/xvid.ax.rc.id vfw/vfw.dsp vfw/bin/xvid.inf vfw/src/2pass.c vfw/src/2pass.h vfw/src/codec.c vfw/src/codec.h vfw/src/config.c vfw/src/config.h vfw/src/config.rc vfw/src/driverproc.c vfw/src/driverproc.def vfw/src/resource.h rawdec/rawdec.c rawdec/rawdec.dsp dshow/authors.txt dshow/dshow.dsp dshow/src/CAbout.cpp dshow/src/CAbout.h dshow/src/CXvidDecoder.cpp dshow/src/CXvidDecoder.h dshow/src/IXvidDecoder.h dshow/src/resource.h dshow/src/xvid.ax.def dshow/src/xvid.ax.rc modified files: build/generic/Makefile build/generic/bootstrap.sh build/generic/configure.in build/generic/platform.inc.in build/win32/libxvidcore.dsp examples/Makefile examples/xvid_bench.c examples/xvid_decraw.c examples/xvid_encraw.c examples/xvid_stat.c src/bitstream/bitstream.c src/bitstream/bitstream.h src/bitstream/mbcoding.c src/bitstream/vlc_codes.h src/decoder.c src/decoder.h src/encoder.c src/encoder.h src/global.h src/image/colorspace.c src/image/colorspace.h src/image/image.c src/image/image.h src/image/interpolate8x8.c src/motion/motion.h src/motion/motion_comp.c src/motion/motion_est.c src/motion/motion_est.h 
src/motion/smp_motion_est.c src/motion/smp_motion_est.h src/portab.h src/prediction/mbprediction.c src/utils/mbtransquant.c src/xvid.c src/xvid.h todo.txt new directories: dshow/.arch-ids dshow/src/.arch-ids rawdec/.arch-ids vfw/.arch-ids vfw/bin/.arch-ids vfw/src/.arch-ids vfw vfw/bin vfw/src rawdec dshow dshow/src 2003-03-06 21:27:16 GMT base-0 Summary: tag of ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-6 Revision: xvidcore--devapi4--1.0--base-0 (automatically generated log message) new patches: ed.gomez@free.fr--main/xvidcore--stable--0.9--base-0 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-1 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-2 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-3 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-4 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-5 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-6 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-7 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-8 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-9 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-10 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-11 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-12 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-13 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-14 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-15 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-16 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-17 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-18 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-19 ed.gomez@free.fr--main/xvidcore--stable--0.9--patch-20 ed.gomez@free.fr--main/xvidcore--stable--0.9--version-0 ed.gomez@free.fr--main/xvidcore--stable--1.0--base-0 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-1 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-2 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-3 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-4 
ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-5 ed.gomez@free.fr--main/xvidcore--stable--1.0--patch-6

2003-02-15 18:40:33 GMT patch-3
Summary: Added Pete's latest chroma optimization code.
Revision: xvidcore--stable--1.0--patch-3

During the merge, I forgot that small piece of code.

modified files: src/encoder.c src/xvid.h

2003-02-15 14:48:12 GMT patch-2
Summary: Fixed compilation and moved back to plain GPL.
Revision: xvidcore--stable--1.0--patch-2

portab.h was missing the DPRINTF_RC flag. I merged the Watcom C portab.h part. Back to plain GPL, as in the dev-api-3 branch.

modified files: LICENSE src/portab.h

2003-02-15 14:05:17 GMT patch-1
Summary: Updated tree to dev-api-3 branch.
Revision: xvidcore--stable--1.0--patch-1

This patch basically consists of merging things with the dev-api-3 CVS branch.
 - All asm, C, h files have been copied from this CVS branch.
 - Fixed generic sources.inc to take care of changes.
 - Fixed architecture stuff in sources (ARCH_IS_...).
 - Updated examples.
 - Updated libxvidcore.dsp.
 - Fixed libxvidcore.dsp for ARCH_IS_... constants.
 - Changed .so and .a naming convention. It is now postfixed with the API version to avoid API incompatibilities with 2.1 which had had a postfix.
 - Fixed configure version 0.9.1 <-> 1.0.0 cvs-snapshot

This is the first attempt. I don't even know if it compiles well.
new files: src/bitstream/x86_asm/.arch-ids/cbp_3dne.asm.id src/dct/.arch-ids/simple_idct.c.id src/dct/x86_asm/.arch-ids/fdct_xmm.asm.id src/dct/x86_asm/.arch-ids/idct_3dne.asm.id src/dct/x86_asm/.arch-ids/simple_idct_mmx.asm.id src/image/.arch-ids/font.c.id src/image/.arch-ids/font.h.id src/image/.arch-ids/reduced.c.id src/image/.arch-ids/reduced.h.id src/image/x86_asm/.arch-ids/colorspace_mmx.inc.id src/image/x86_asm/.arch-ids/colorspace_rgb_mmx.asm.id src/image/x86_asm/.arch-ids/colorspace_yuv_mmx.asm.id src/image/x86_asm/.arch-ids/colorspace_yuyv_mmx.asm.id src/image/x86_asm/.arch-ids/interpolate8x8_3dne.asm.id src/image/x86_asm/.arch-ids/reduced_mmx.asm.id src/motion/.arch-ids/motion_est.h.id src/motion/.arch-ids/smp_motion_est.c.id src/motion/.arch-ids/smp_motion_est.h.id src/motion/x86_asm/.arch-ids/sad_3dne.asm.id src/quant/x86_asm/.arch-ids/quantize4_xmm.asm.id src/quant/x86_asm/.arch-ids/quantize_3dne.asm.id src/utils/x86_asm/.arch-ids/interlacing_mmx.asm.id src/utils/x86_asm/.arch-ids/mem_transfer_3dne.asm.id src/bitstream/x86_asm/cbp_3dne.asm src/dct/simple_idct.c src/dct/x86_asm/fdct_xmm.asm src/dct/x86_asm/idct_3dne.asm src/dct/x86_asm/simple_idct_mmx.asm src/image/font.c src/image/font.h src/image/reduced.c src/image/reduced.h src/image/x86_asm/colorspace_mmx.inc src/image/x86_asm/colorspace_rgb_mmx.asm src/image/x86_asm/colorspace_yuv_mmx.asm src/image/x86_asm/colorspace_yuyv_mmx.asm src/image/x86_asm/interpolate8x8_3dne.asm src/image/x86_asm/reduced_mmx.asm src/motion/motion_est.h src/motion/smp_motion_est.c src/motion/smp_motion_est.h src/motion/x86_asm/sad_3dne.asm src/quant/x86_asm/quantize4_xmm.asm src/quant/x86_asm/quantize_3dne.asm src/utils/x86_asm/interlacing_mmx.asm src/utils/x86_asm/mem_transfer_3dne.asm modified files: build/generic/Makefile build/generic/configure.in build/generic/sources.inc build/win32/libxvidcore.dsp examples/odivx_enc_dec.c examples/xvid_bench.c examples/xvid_decraw.c examples/xvid_encraw.c examples/xvid_stat.c 
src/bitstream/bitstream.c src/bitstream/bitstream.h src/bitstream/cbp.c src/bitstream/cbp.h src/bitstream/mbcoding.c src/bitstream/mbcoding.h src/bitstream/ppc_asm/cbp_altivec.s src/bitstream/ppc_asm/cbp_ppc.s src/bitstream/vlc_codes.h src/bitstream/x86_asm/cbp_mmx.asm src/bitstream/x86_asm/cbp_sse2.asm src/bitstream/zigzag.h src/dct/fdct.c src/dct/fdct.h src/dct/ia64_asm/fdct_ia64.s src/dct/idct.c src/dct/idct.h src/dct/ppc_asm/fdct_altivec.s src/dct/ppc_asm/idct_altivec.s src/dct/x86_asm/fdct_mmx.asm src/dct/x86_asm/idct_mmx.asm src/decoder.c src/decoder.h src/divx4.c src/divx4.h src/encoder.c src/encoder.h src/global.h src/image/colorspace.c src/image/colorspace.h src/image/image.c src/image/image.h src/image/interpolate8x8.c src/image/interpolate8x8.h src/image/x86_asm/interpolate8x8_3dn.asm src/image/x86_asm/interpolate8x8_mmx.asm src/image/x86_asm/interpolate8x8_xmm.asm src/image/x86_asm/rgb_to_yv12_mmx.asm src/image/x86_asm/yuv_to_yv12_mmx.asm src/image/x86_asm/yuyv_to_yv12_mmx.asm src/image/x86_asm/yv12_to_rgb24_mmx.asm src/image/x86_asm/yv12_to_rgb32_mmx.asm src/image/x86_asm/yv12_to_yuyv_mmx.asm src/motion/ia64_asm/sad_ia64.s src/motion/motion.h src/motion/motion_comp.c src/motion/motion_est.c src/motion/ppc_asm/sad_altivec.c src/motion/sad.c src/motion/sad.h src/motion/x86_asm/sad_3dn.asm src/motion/x86_asm/sad_mmx.asm src/motion/x86_asm/sad_sse2.asm src/motion/x86_asm/sad_xmm.asm src/portab.h src/prediction/mbprediction.c src/prediction/mbprediction.h src/quant/adapt_quant.c src/quant/adapt_quant.h src/quant/quant_h263.c src/quant/quant_h263.h src/quant/quant_matrix.c src/quant/quant_matrix.h src/quant/quant_mpeg4.c src/quant/quant_mpeg4.h src/quant/x86_asm/quantize4_mmx.asm src/quant/x86_asm/quantize_mmx.asm src/utils/emms.c src/utils/emms.h src/utils/ia64_asm/mem_transfer_ia64.s src/utils/mbfunctions.h src/utils/mbtransquant.c src/utils/mem_align.c src/utils/mem_align.h src/utils/mem_transfer.c src/utils/mem_transfer.h src/utils/ratecontrol.c 
src/utils/ratecontrol.h src/utils/timer.c src/utils/timer.h src/utils/x86_asm/cpuid.asm src/utils/x86_asm/mem_transfer_mmx.asm src/xvid.c src/xvid.h

2003-02-14 23:01:44 GMT base-0
Summary: Upcoming 1.0 version continuation
Revision: xvidcore--stable--1.0--base-0

Continuation of the xvidcore--stable--0.9 version.

new directories: {arch}/xvidcore/xvidcore--stable/xvidcore--stable--1.0 {arch}/xvidcore/xvidcore--stable/xvidcore--stable--1.0/ed.gomez@free.fr--main {arch}/xvidcore/xvidcore--stable/xvidcore--stable--1.0/ed.gomez@free.fr--main/patch-log

2003-02-11 21:03:19 GMT patch-20
Summary: Removed outdated bframe/qpel decoding.
Revision: xvidcore--stable--0.9--patch-20

Michael noticed there were still pieces of bframe and qpel decoding. He proposed to remove them or upgrade them. I felt too lazy to merge all the differences from the (far too divergent) file in dev-api-3, so they were removed.

modified files: changelog.txt src/decoder.c src/image/interpolate8x8.c src/image/interpolate8x8.h

2003-02-11 18:40:48 GMT patch-19
Summary: Fixed libxvidcore.def, revamped Makefile output, fixed ia64 build, added ranlib detection.
Revision: xvidcore--stable--0.9--patch-19

OK, this patch does a lot of things.
First, it fixes libxvidcore.def for win32 targets. This file is now generated at configure time. This way we make sure no symbols are exported without being compiled in.
Second, I revamped the Makefile so its output is nicer to look at.
Third, I removed the ia64 dct file from the SRC_IA64 variable, as it was interfering with the DCT_IA64_SOURCES variable.
Fourth, ranlib is detected at configure time and used in the Makefile through the RANLIB variable.
modified files: build/generic/Makefile build/generic/configure.in build/generic/libxvidcore.def.in build/generic/platform.inc.in build/generic/sources.inc

renamed files: build/generic/.arch-ids/libxvidcore.def.id ==> build/generic/.arch-ids/libxvidcore.def.in.id build/generic/libxvidcore.def ==> build/generic/libxvidcore.def.in

2003-02-10 23:31:01 GMT patch-18
Summary: Fixed xvid_encraw help message.
Revision: xvidcore--stable--0.9--patch-18

Fixed xvid_encraw help message.

modified files: examples/xvid_encraw.c

2003-02-10 23:06:32 GMT patch-17
Summary: Added IA64 DCT source choice according to the compiler basename.
Revision: xvidcore--stable--0.9--patch-17

The IA64 dct file must be chosen according to the compiler. I chose to look for a basename matching the *ecc* pattern; all other compilers will be treated as being the GNU C compiler. Hope this is enough.

modified files: build/generic/Makefile build/generic/configure.in build/generic/platform.inc.in

2003-02-10 13:49:25 GMT patch-16
Summary: Changed linking option on PPC platforms (-flat_namespace)
Revision: xvidcore--stable--0.9--patch-16

Guillaume sent me this fix for PPC platforms.

modified files: build/generic/configure.in

2003-02-09 23:15:18 GMT patch-15
Summary: Added the configure bootstrap script.
Revision: xvidcore--stable--0.9--patch-15

Added the configure bootstrap script.

new files: build/generic/.arch-ids/bootstrap.sh.id build/generic/bootstrap.sh

2003-02-09 23:06:51 GMT patch-14
Summary: The PPC port is now disabled because it is outdated.
Revision: xvidcore--stable--0.9--patch-14

The PPC port is now disabled because it is outdated.

modified files: build/generic/configure.in

2003-02-09 23:01:30 GMT patch-13
Summary: More "unknown compiler" friendly portab.h file.
Revision: xvidcore--stable--0.9--patch-13

Cristoph pointed out that portab.h was a problem when used with unknown compilers. This patch tries to fix that.
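
As a rough illustration of what an "unknown compiler" friendly portability header can look like, here is a minimal sketch. It is not the code from Xvid's portab.h: the SKETCH_* macros and the sketch_* functions are hypothetical names chosen for this example, and the real header may detect compilers differently. The point is only that the detection chain ends in a plain ISO C fallback instead of failing.

```c
/* Hypothetical sketch only: the SKETCH_* names are illustrative and are
 * not the macros defined by Xvid's portab.h. */

#if defined(_MSC_VER)
/* Microsoft Visual C++ */
#  define SKETCH_COMPILER "msvc"
#  define SKETCH_INLINE __inline
#elif defined(__GNUC__) || defined(__ICC)
/* GCC, Intel C, and other compilers that accept GNU extensions */
#  define SKETCH_COMPILER "gnu-like"
#  define SKETCH_INLINE __inline__
#else
/* Unknown compiler: rely on plain ISO C only, with no extensions */
#  define SKETCH_COMPILER "generic"
#  define SKETCH_INLINE
#endif

/* Small helper built on the inline keyword selected above. */
static SKETCH_INLINE int sketch_min(int a, int b)
{
    return (a < b) ? a : b;
}

/* Reports which branch of the detection chain was taken. */
static const char *sketch_compiler_name(void)
{
    return SKETCH_COMPILER;
}
```

With this layout, a compiler that matches none of the known branches still builds the code, just without compiler-specific keywords.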
modified files: src/portab.h

2003-02-09 00:49:32 GMT patch-12
Summary: DivX4 compatibility layer has been turned into an option (default: disabled).
Revision: xvidcore--stable--0.9--patch-12

The divx4 compatibility API has been turned into an option. The mplayer guys have wanted this for a long time, so here it is. As we say in French, "mieux vaut tard que jamais" (better late than never).

modified files: build/generic/Makefile build/generic/configure.in build/generic/platform.inc.in build/generic/sources.inc

2003-02-08 23:29:55 GMT patch-11
Summary: Fixed WIN32/_MSC_VER confusion and updated MSVC libxvidcore.dsp project file.
Revision: xvidcore--stable--0.9--patch-11

This patch set fixes all WIN32/_MSC_VER conditional compilation in the examples and in the Illegal Instruction detection for SSE2 support in xvid.c. The libxvidcore.dsp file has been updated with the right defines for x86 support (ARCH_IS_IA32, ARCH_IS_32BIT, ARCH_IS_LITTLE_ENDIAN). Hope Win32 is now completely ready.

modified files: build/win32/libxvidcore.dsp examples/xvid_bench.c examples/xvid_decraw.c examples/xvid_encraw.c examples/xvid_stat.c src/xvid.c

2003-02-08 14:55:19 GMT patch-10
Summary: Fixed MacOSX build.
Revision: xvidcore--stable--0.9--patch-10

Two fixes for MacOSX. The first adds a missing gcc option (-fno-common) that the linking stage needs on this platform. The second fixes the Altivec test, which was printing its result to the console.

modified files: build/generic/configure.in

2003-02-08 12:49:17 GMT patch-9
Summary: Added Altivec detection (Guillaume Morin)
Revision: xvidcore--stable--0.9--patch-9

Added Altivec detection test in configure.in.

modified files: build/generic/configure.in

2003-02-08 12:25:46 GMT patch-8
Summary: Fixed portab.h for _DEBUG target and GCC/ICC compilers.
Revision: xvidcore--stable--0.9--patch-8

During the Unix build system change, I had to turn the DPRINTF macro into a real function because the MacOSX compilers do not support variadic macros as defined in the ISO C99 standard.
During this change, I forgot to adapt the macro code and the #include needed for variadic functions.

modified files: src/portab.h

2003-02-08 11:45:00 GMT patch-7
Summary: Changed the way I add strings into variables.
Revision: xvidcore--stable--0.9--patch-7

I changed the way I add strings to variables (CFLAGS and so on). Now I use var="$var string2" instead of var=$var" string2". Fixed a typo reported by Pete for the cygwin part.

modified files: build/generic/configure.in

2003-02-07 23:16:57 GMT patch-6
Summary: Fixed the "ar" "s" option for some platforms.
Revision: xvidcore--stable--0.9--patch-6

The "s" option of the "ar" program is not standard across all platforms. I had problems on at least OpenBSD and an old Solaris version.

modified files: build/generic/Makefile

2003-02-07 22:19:37 GMT patch-5
Summary: Fixed a BSD check in the nasm output format.
Revision: xvidcore--stable--0.9--patch-5

Fixed a BSD check in the nasm output format.

modified files: build/generic/configure.in

2003-02-07 21:18:14 GMT patch-4
Summary: Fixed options and added the --disable-assembly option
Revision: xvidcore--stable--0.9--patch-4

Options have been fixed because they were not taking care of the enable_feature variable. The --disable-assembly option has been added. This is a good way to compile XviD on nearly all platforms without having to deal with the assembly code, which is useful on the PPC platform at the moment, where gcc seems to use a different kind of assembly syntax.

modified files: build/generic/configure.in

2003-02-06 21:49:16 GMT patch-3
Summary: Fixes for the new build system in sources.
Revision: xvidcore--stable--0.9--patch-3

This patch fixes source files according to the new defines used by the reworked build system.

modified files: src/bitstream/bitstream.h src/divx4.h src/portab.h src/utils/emms.h src/xvid.c

2003-02-06 21:22:55 GMT patch-2
Summary: Changed build system for Unix OSes
Revision: xvidcore--stable--0.9--patch-2

Changed the build system for Unix systems.
It is now built upon an autoconf script that automatically configures the sources. The Makefile is portable across various platforms and "make" programs.

It is at least working on these platforms for now:
 - Debian GNU/Linux - StrongARM - Alphave67 (alpha 64bit) - ia32 UltraSparcIII
 - Solaris - UltraSparcI - Sparc 32bit on old Sun stations (I don't remember the exact name)
 - FreeBSD 4.7 - ia32
 - RedHat 7.3 - ia32
 - Gentoo 1.4 - ia32
 - the Irix box according to christoph tests
 - ia64 - Unknown OS?

The unix unified makefile supports:
 - gmake
 - pmake

ToDo things to finish this new build system:
 - Manage the ecc/gcc source choice for ia64
 - Someone to test the makefile on Cygwin and/or mingw+msys
 - Update MSVC projects (replace 2 or 3 defines)
 - See why MacOSX is complaining about duplicated symbols; it seems the mach ABI does not allow namespace collisions even between C modules. And add altivec detection in configure.in

new files: build/generic/.arch-ids/configure.in.id build/generic/.arch-ids/Makefile.id build/generic/.arch-ids/platform.inc.in.id build/generic/.arch-ids/sources.inc.id build/generic/configure.in build/generic/Makefile build/generic/platform.inc.in build/generic/sources.inc

removed files: build/generic/.arch-ids/Makefile.beos.id build/generic/.arch-ids/Makefile.cygwin.id build/generic/.arch-ids/Makefile.dj.id build/generic/.arch-ids/Makefile.freebsd.id build/generic/.arch-ids/Makefile.generic.id build/generic/.arch-ids/Makefile.ia64.id build/generic/.arch-ids/Makefile.inc.id build/generic/.arch-ids/Makefile.irix64.id build/generic/.arch-ids/Makefile.linuxppc.id build/generic/.arch-ids/Makefile.linuxppc_altivec.id build/generic/.arch-ids/Makefile.linuxx86.id build/generic/.arch-ids/Makefile.sparc.id build/generic/Makefile.beos build/generic/Makefile.cygwin build/generic/Makefile.dj build/generic/Makefile.freebsd build/generic/Makefile.generic build/generic/Makefile.ia64 build/generic/Makefile.inc build/generic/Makefile.irix64
build/generic/Makefile.linuxppc build/generic/Makefile.linuxppc_altivec build/generic/Makefile.linuxx86 build/generic/Makefile.sparc 2003-02-06 21:11:17 GMT patch-1 Summary: Updated to current stable CVS_HEAD Revision: xvidcore--stable--0.9--patch-1 Updated files to current stable CVS_HEAD versions. new files: build/win32/.arch-ids/odivx_enc_dec.dsp.id build/win32/.arch-ids/xvidcore.dsw.id build/win32/.arch-ids/xvid_bench.dsp.id build/win32/odivx_enc_dec.dsp build/win32/xvidcore.dsw build/win32/xvid_bench.dsp modified files: authors.txt build/generic/Makefile.beos build/generic/Makefile.generic build/generic/Makefile.linuxx86 doc/Makefile examples/Makefile examples/odivx_enc_dec.c examples/xvid_bench.c examples/xvid_encraw.c examples/xvid_stat.c src/bitstream/bitstream.c src/bitstream/bitstream.h src/bitstream/cbp.c src/bitstream/mbcoding.c src/bitstream/vlc_codes.h src/bitstream/zigzag.h src/dct/fdct.c src/dct/idct.c src/dct/idct.h src/decoder.c src/decoder.h src/divx4.h src/encoder.c src/encoder.h src/global.h src/image/colorspace.c src/image/image.c src/image/interpolate8x8.c src/image/interpolate8x8.h src/motion/motion.h src/motion/motion_comp.c src/motion/motion_est.c src/motion/sad.c src/portab.h src/prediction/mbprediction.c src/prediction/mbprediction.h src/quant/adapt_quant.c src/quant/adapt_quant.h src/quant/quant_h263.c src/quant/quant_mpeg4.c src/utils/emms.h src/utils/mbfunctions.h src/utils/mbtransquant.c src/utils/mem_align.c src/utils/mem_transfer.c src/utils/timer.c src/utils/timer.h src/xvid.h todo.txt 2003-02-06 20:59:19 GMT base-0 Summary: Imported xvidcore 0.9.0 into arch repository Revision: xvidcore--stable--0.9--base-0 Imported xvidcore 0.9.0 into arch repository. I hope I forgot nothing. 
new files: ./.arch-ids/CodingStyle.id ./.arch-ids/LICENSE.id ./.arch-ids/README.txt.id ./.arch-ids/authors.txt.id ./.arch-ids/changelog.txt.id ./.arch-ids/todo.txt.id ./CodingStyle ./LICENSE ./README.txt ./authors.txt ./build/.arch-ids/=id ./build/generic/.arch-ids/=id ./build/generic/.arch-ids/Makefile.beos.id ./build/generic/.arch-ids/Makefile.cygwin.id ./build/generic/.arch-ids/Makefile.dj.id ./build/generic/.arch-ids/Makefile.freebsd.id ./build/generic/.arch-ids/Makefile.generic.id ./build/generic/.arch-ids/Makefile.ia64.id ./build/generic/.arch-ids/Makefile.inc.id ./build/generic/.arch-ids/Makefile.irix64.id ./build/generic/.arch-ids/Makefile.linuxppc.id ./build/generic/.arch-ids/Makefile.linuxppc_altivec.id ./build/generic/.arch-ids/Makefile.linuxx86.id ./build/generic/.arch-ids/Makefile.sparc.id ./build/generic/.arch-ids/libxvidcore.def.id ./build/generic/Makefile.beos ./build/generic/Makefile.cygwin ./build/generic/Makefile.dj ./build/generic/Makefile.freebsd ./build/generic/Makefile.generic ./build/generic/Makefile.ia64 ./build/generic/Makefile.inc ./build/generic/Makefile.irix64 ./build/generic/Makefile.linuxppc ./build/generic/Makefile.linuxppc_altivec ./build/generic/Makefile.linuxx86 ./build/generic/Makefile.sparc ./build/generic/libxvidcore.def ./build/win32/.arch-ids/=id ./build/win32/.arch-ids/libxvidcore.dsp.id ./build/win32/.arch-ids/xvid_decraw.dsp.id ./build/win32/.arch-ids/xvid_encraw.dsp.id ./build/win32/.arch-ids/xvid_stat.dsp.id ./build/win32/libxvidcore.dsp ./build/win32/xvid_decraw.dsp ./build/win32/xvid_encraw.dsp ./build/win32/xvid_stat.dsp ./changelog.txt ./doc/.arch-ids/=id ./doc/.arch-ids/API.dox.id ./doc/.arch-ids/Makefile.id ./doc/.arch-ids/README.id ./doc/.arch-ids/foot.inc.in.id ./doc/.arch-ids/header.tex.in.id ./doc/.arch-ids/xvid-decoding.txt.id ./doc/.arch-ids/xvid-encoder.txt.id ./doc/API.dox ./doc/Makefile ./doc/README ./doc/foot.inc.in ./doc/header.tex.in ./doc/xvid-decoding.txt ./doc/xvid-encoder.txt 
./examples/.arch-ids/=id ./examples/.arch-ids/Makefile.id ./examples/.arch-ids/README.id ./examples/.arch-ids/cactus.pgm.bz2.id ./examples/.arch-ids/odivx_enc_dec.c.id ./examples/.arch-ids/xvid_bench.c.id ./examples/.arch-ids/xvid_decraw.c.id ./examples/.arch-ids/xvid_encraw.c.id ./examples/.arch-ids/xvid_stat.c.id ./examples/Makefile ./examples/README ./examples/cactus.pgm.bz2 ./examples/odivx_enc_dec.c ./examples/xvid_bench.c ./examples/xvid_decraw.c ./examples/xvid_encraw.c ./examples/xvid_stat.c ./src/.arch-ids/=id ./src/.arch-ids/decoder.c.id ./src/.arch-ids/decoder.h.id ./src/.arch-ids/divx4.c.id ./src/.arch-ids/divx4.h.id ./src/.arch-ids/encoder.c.id ./src/.arch-ids/encoder.h.id ./src/.arch-ids/global.h.id ./src/.arch-ids/portab.h.id ./src/.arch-ids/xvid.c.id ./src/.arch-ids/xvid.h.id ./src/bitstream/.arch-ids/=id ./src/bitstream/.arch-ids/bitstream.c.id ./src/bitstream/.arch-ids/bitstream.h.id ./src/bitstream/.arch-ids/cbp.c.id ./src/bitstream/.arch-ids/cbp.h.id ./src/bitstream/.arch-ids/mbcoding.c.id ./src/bitstream/.arch-ids/mbcoding.h.id ./src/bitstream/.arch-ids/vlc_codes.h.id ./src/bitstream/.arch-ids/zigzag.h.id ./src/bitstream/bitstream.c ./src/bitstream/bitstream.h ./src/bitstream/cbp.c ./src/bitstream/cbp.h ./src/bitstream/mbcoding.c ./src/bitstream/mbcoding.h ./src/bitstream/ppc_asm/.arch-ids/=id ./src/bitstream/ppc_asm/.arch-ids/cbp_altivec.s.id ./src/bitstream/ppc_asm/.arch-ids/cbp_ppc.s.id ./src/bitstream/ppc_asm/cbp_altivec.s ./src/bitstream/ppc_asm/cbp_ppc.s ./src/bitstream/vlc_codes.h ./src/bitstream/x86_asm/.arch-ids/=id ./src/bitstream/x86_asm/.arch-ids/cbp_mmx.asm.id ./src/bitstream/x86_asm/.arch-ids/cbp_sse2.asm.id ./src/bitstream/x86_asm/cbp_mmx.asm ./src/bitstream/x86_asm/cbp_sse2.asm ./src/bitstream/zigzag.h ./src/dct/.arch-ids/=id ./src/dct/.arch-ids/README.IJG.id ./src/dct/.arch-ids/fdct.c.id ./src/dct/.arch-ids/fdct.h.id ./src/dct/.arch-ids/idct.c.id ./src/dct/.arch-ids/idct.h.id ./src/dct/README.IJG ./src/dct/fdct.c 
./src/dct/fdct.h ./src/dct/ia64_asm/.arch-ids/=id ./src/dct/ia64_asm/.arch-ids/fdct_ia64.s.id ./src/dct/ia64_asm/.arch-ids/genidct.py.id ./src/dct/ia64_asm/.arch-ids/idct_fini.s.id ./src/dct/ia64_asm/.arch-ids/idct_ia64_ecc.s.id ./src/dct/ia64_asm/.arch-ids/idct_ia64_gcc.s.id ./src/dct/ia64_asm/.arch-ids/idct_init.s.id ./src/dct/ia64_asm/fdct_ia64.s ./src/dct/ia64_asm/genidct.py ./src/dct/ia64_asm/idct_fini.s ./src/dct/ia64_asm/idct_ia64_ecc.s ./src/dct/ia64_asm/idct_ia64_gcc.s ./src/dct/ia64_asm/idct_init.s ./src/dct/idct.c ./src/dct/idct.h ./src/dct/ppc_asm/.arch-ids/=id ./src/dct/ppc_asm/.arch-ids/fdct_altivec.s.id ./src/dct/ppc_asm/.arch-ids/idct_altivec.s.id ./src/dct/ppc_asm/fdct_altivec.s ./src/dct/ppc_asm/idct_altivec.s ./src/dct/x86_asm/.arch-ids/=id ./src/dct/x86_asm/.arch-ids/fdct_mmx.asm.id ./src/dct/x86_asm/.arch-ids/idct_mmx.asm.id ./src/dct/x86_asm/fdct_mmx.asm ./src/dct/x86_asm/idct_mmx.asm ./src/decoder.c ./src/decoder.h ./src/divx4.c ./src/divx4.h ./src/encoder.c ./src/encoder.h ./src/global.h ./src/image/.arch-ids/=id ./src/image/.arch-ids/colorspace.c.id ./src/image/.arch-ids/colorspace.h.id ./src/image/.arch-ids/image.c.id ./src/image/.arch-ids/image.h.id ./src/image/.arch-ids/interpolate8x8.c.id ./src/image/.arch-ids/interpolate8x8.h.id ./src/image/colorspace.c ./src/image/colorspace.h ./src/image/ia64_asm/.arch-ids/=id ./src/image/ia64_asm/.arch-ids/README.id ./src/image/ia64_asm/.arch-ids/interpolate8x8_ia64.s.id ./src/image/ia64_asm/.arch-ids/interpolate8x8_ia64_exact.s.id ./src/image/ia64_asm/README ./src/image/ia64_asm/interpolate8x8_ia64.s ./src/image/ia64_asm/interpolate8x8_ia64_exact.s ./src/image/image.c ./src/image/image.h ./src/image/interpolate8x8.c ./src/image/interpolate8x8.h ./src/image/x86_asm/.arch-ids/=id ./src/image/x86_asm/.arch-ids/interpolate8x8_3dn.asm.id ./src/image/x86_asm/.arch-ids/interpolate8x8_mmx.asm.id ./src/image/x86_asm/.arch-ids/interpolate8x8_xmm.asm.id ./src/image/x86_asm/.arch-ids/rgb_to_yv12_mmx.asm.id 
./src/image/x86_asm/.arch-ids/yuv_to_yv12_mmx.asm.id ./src/image/x86_asm/.arch-ids/yuyv_to_yv12_mmx.asm.id ./src/image/x86_asm/.arch-ids/yv12_to_rgb24_mmx.asm.id ./src/image/x86_asm/.arch-ids/yv12_to_rgb32_mmx.asm.id ./src/image/x86_asm/.arch-ids/yv12_to_yuyv_mmx.asm.id ./src/image/x86_asm/interpolate8x8_3dn.asm ./src/image/x86_asm/interpolate8x8_mmx.asm ./src/image/x86_asm/interpolate8x8_xmm.asm ./src/image/x86_asm/rgb_to_yv12_mmx.asm ./src/image/x86_asm/yuv_to_yv12_mmx.asm ./src/image/x86_asm/yuyv_to_yv12_mmx.asm ./src/image/x86_asm/yv12_to_rgb24_mmx.asm ./src/image/x86_asm/yv12_to_rgb32_mmx.asm ./src/image/x86_asm/yv12_to_yuyv_mmx.asm ./src/motion/.arch-ids/=id ./src/motion/.arch-ids/motion.h.id ./src/motion/.arch-ids/motion_comp.c.id ./src/motion/.arch-ids/motion_est.c.id ./src/motion/.arch-ids/sad.c.id ./src/motion/.arch-ids/sad.h.id ./src/motion/ia64_asm/.arch-ids/=id ./src/motion/ia64_asm/.arch-ids/calc_delta_1.s.id ./src/motion/ia64_asm/.arch-ids/calc_delta_2.s.id ./src/motion/ia64_asm/.arch-ids/calc_delta_3.s.id ./src/motion/ia64_asm/.arch-ids/halfpel8_refine_ia64.s.id ./src/motion/ia64_asm/.arch-ids/sad_ia64.s.id ./src/motion/ia64_asm/calc_delta_1.s ./src/motion/ia64_asm/calc_delta_2.s ./src/motion/ia64_asm/calc_delta_3.s ./src/motion/ia64_asm/halfpel8_refine_ia64.s ./src/motion/ia64_asm/sad_ia64.s ./src/motion/motion.h ./src/motion/motion_comp.c ./src/motion/motion_est.c ./src/motion/ppc_asm/.arch-ids/=id ./src/motion/ppc_asm/.arch-ids/README.id ./src/motion/ppc_asm/.arch-ids/sad_altivec.c.id ./src/motion/ppc_asm/.arch-ids/sad_altivec.s.id ./src/motion/ppc_asm/README ./src/motion/ppc_asm/sad_altivec.c ./src/motion/ppc_asm/sad_altivec.s ./src/motion/sad.c ./src/motion/sad.h ./src/motion/x86_asm/.arch-ids/=id ./src/motion/x86_asm/.arch-ids/sad_3dn.asm.id ./src/motion/x86_asm/.arch-ids/sad_mmx.asm.id ./src/motion/x86_asm/.arch-ids/sad_sse2.asm.id ./src/motion/x86_asm/.arch-ids/sad_xmm.asm.id ./src/motion/x86_asm/sad_3dn.asm ./src/motion/x86_asm/sad_mmx.asm 
./src/motion/x86_asm/sad_sse2.asm ./src/motion/x86_asm/sad_xmm.asm ./src/portab.h ./src/prediction/.arch-ids/=id ./src/prediction/.arch-ids/mbprediction.c.id ./src/prediction/.arch-ids/mbprediction.h.id ./src/prediction/mbprediction.c ./src/prediction/mbprediction.h ./src/quant/.arch-ids/=id ./src/quant/.arch-ids/adapt_quant.c.id ./src/quant/.arch-ids/adapt_quant.h.id ./src/quant/.arch-ids/quant_h263.c.id ./src/quant/.arch-ids/quant_h263.h.id ./src/quant/.arch-ids/quant_matrix.c.id ./src/quant/.arch-ids/quant_matrix.h.id ./src/quant/.arch-ids/quant_mpeg4.c.id ./src/quant/.arch-ids/quant_mpeg4.h.id ./src/quant/adapt_quant.c ./src/quant/adapt_quant.h ./src/quant/ia64_asm/.arch-ids/=id ./src/quant/ia64_asm/.arch-ids/quant_h263_ia64.s.id ./src/quant/ia64_asm/quant_h263_ia64.s ./src/quant/quant_h263.c ./src/quant/quant_h263.h ./src/quant/quant_matrix.c ./src/quant/quant_matrix.h ./src/quant/quant_mpeg4.c ./src/quant/quant_mpeg4.h ./src/quant/x86_asm/.arch-ids/=id ./src/quant/x86_asm/.arch-ids/quantize4_mmx.asm.id ./src/quant/x86_asm/.arch-ids/quantize_mmx.asm.id ./src/quant/x86_asm/quantize4_mmx.asm ./src/quant/x86_asm/quantize_mmx.asm ./src/utils/.arch-ids/=id ./src/utils/.arch-ids/emms.c.id ./src/utils/.arch-ids/emms.h.id ./src/utils/.arch-ids/mbfunctions.h.id ./src/utils/.arch-ids/mbtransquant.c.id ./src/utils/.arch-ids/mem_align.c.id ./src/utils/.arch-ids/mem_align.h.id ./src/utils/.arch-ids/mem_transfer.c.id ./src/utils/.arch-ids/mem_transfer.h.id ./src/utils/.arch-ids/ratecontrol.c.id ./src/utils/.arch-ids/ratecontrol.h.id ./src/utils/.arch-ids/timer.c.id ./src/utils/.arch-ids/timer.h.id ./src/utils/emms.c ./src/utils/emms.h ./src/utils/ia64_asm/.arch-ids/=id ./src/utils/ia64_asm/.arch-ids/mem_transfer_ia64.s.id ./src/utils/ia64_asm/mem_transfer_ia64.s ./src/utils/mbfunctions.h ./src/utils/mbtransquant.c ./src/utils/mem_align.c ./src/utils/mem_align.h ./src/utils/mem_transfer.c ./src/utils/mem_transfer.h ./src/utils/ratecontrol.c ./src/utils/ratecontrol.h 
./src/utils/timer.c ./src/utils/timer.h ./src/utils/x86_asm/.arch-ids/=id ./src/utils/x86_asm/.arch-ids/cpuid.asm.id ./src/utils/x86_asm/.arch-ids/mem_transfer_mmx.asm.id ./src/utils/x86_asm/cpuid.asm ./src/utils/x86_asm/mem_transfer_mmx.asm ./src/xvid.c ./src/xvid.h ./todo.txt

xvidcore/README0000664000076500007650000000145011504426344014346 0ustar xvidbuildxvidbuild

1) Introduction
---------------

Xvid is a high performance and high quality MPEG-4 video de-/encoding solution. The Xvid package currently consists of three parts:

 - xvidcore: the main MPEG-4 de-/encoding library, and simple example programs
 - dshow: Windows DirectShow decoder filter which links against xvidcore to allow MPEG-4 playback on Windows-based OSes.
 - vfw: Video for Windows GUI

2) Documentation
----------------

 - xvidcore/doc/README: some general information.
 - xvidcore/doc/INSTALL: building and installing instructions.

3) Licensing:
------------

 - Xvid is licensed as a whole under the terms of the Xvid license described in the file LICENSE. This is true for all files belonging to Xvid except for those which specifically carry a different license header.

xvidcore/vfw/0000775000076500007650000000000011567132321014266 5ustar xvidbuildxvidbuild
xvidcore/vfw/LICENSE0000664000076500007650000004362710172440271015304 0ustar xvidbuildxvidbuild

GNU GENERAL PUBLIC LICENSE
Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users.
This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. 
We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. 
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. 
But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. 
For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. 
Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. 
Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. 
Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. 
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this when it starts in an interactive mode:

Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker.

<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice

This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library.
If this is what you want to do, use the GNU Library General Public License instead of this License.
xvidcore/vfw/vfw.vcproj0000664000076500007650000003650111567132315016325 0ustar xvidbuildxvidbuild
xvidcore/vfw/bin/0000775000076500007650000000000011566427761015054 5ustar xvidbuildxvidbuild
xvidcore/vfw/bin/sources.inc0000664000076500007650000000017510027665151017221 0ustar xvidbuildxvidbuild
LIBSO = xvidvfw.dll

SRC_DIR = ../src

SRC_C = \
	codec.c \
	config.c \
	driverproc.c \
	status.c

SRC_RES = \
	resource.rc
xvidcore/vfw/bin/xvid.inf0000664000076500007650000000546011113546730016513 0ustar xvidbuildxvidbuild
; Xvid MPEG-4 Video Codec install

[Version]
Signature = "$CHICAGO$"
Class = MEDIA

[SourceDisksNames]
1="Xvid MPEG-4 Video Codec Install Disk",, 0001

[SourceDisksFiles]
xvidvfw.dll=1
xvid.inf=1
xvidcore.dll=1

[Installable.Drivers]
XVID = 1:xvidvfw.dll, "vidc.XVID", "Xvid MPEG-4 Video Codec" , , ,

[DefaultInstall]
CopyFiles=MPEG4.Copy.Inf,MPEG4.Copy
Updateinis = MPEG4.Updateini
DelReg = MPEG4.DelConfig
addreg = MPEG4.AddReg,MPEG4.AddReg9x,MPEG4.DoReg
MediaType = SOFTWARE

[DefaultInstall.ntx86]
CopyFiles=MPEG4.Copy.Inf,MPEG4.Copy
DelReg = MPEG4.DelConfig
addreg = MPEG4.AddReg,MPEG4.AddRegNT,MPEG4.DoReg
MediaType = SOFTWARE

[Remove_XviD]
AddReg = MPEG4.Unregister
DelReg = MPEG4.DelReg
DelFiles = MPEG4.Copy,MPEG4.Copy.Inf
UpdateInis = MPEG4.DelIni

[MPEG4.Copy]
xvidvfw.dll
xvidcore.dll

[MPEG4.Copy.Inf]
xvid.inf

[MPEG4.UpdateIni]
system.ini, drivers32,,"vidc.XVID=xvidvfw.dll"

[MPEG4.DelIni]
system.ini, drivers32,"vidc.XVID=xvidvfw.dll",

[MPEG4.AddReg]

[MPEG4.AddReg9x]
HKLM,SYSTEM\CurrentControlSet\Control\MediaResources\icm\vidc.XVID,Description,,%XviD%
HKLM,SYSTEM\CurrentControlSet\Control\MediaResources\icm\vidc.XVID,Driver,,xvidvfw.dll
HKLM,SYSTEM\CurrentControlSet\Control\MediaResources\icm\vidc.XVID,FriendlyName,,"XVID"
HKLM,%UnInstallPath%,DisplayName,,%XviD%
HKLM,%UnInstallPath%,UninstallString,,"%10%\rundll.exe setupx.dll,InstallHinfSection Remove_XviD 132
%17%\%InfFile%"

[MPEG4.AddRegNT]
HKLM,SOFTWARE\Microsoft\Windows NT\CurrentVersion\drivers.desc,xvidvfw.dll,,%XviD%
HKLM,SOFTWARE\Microsoft\Windows NT\CurrentVersion\drivers32,vidc.XVID,,xvidvfw.dll
HKLM,%UnInstallPath%,DisplayName,,%XviD%
HKLM,%UnInstallPath%,DisplayIcon,,"%11%\xvidvfw.dll,0"
HKLM,%UnInstallPath%,Publisher,,%mfgname%
HKLM,%UnInstallPath%,HelpLink,,%Website%
HKLM,%UnInstallPath%,NoModify,%REG_DWORD%,1
HKLM,%UnInstallPath%,NoRepair,%REG_DWORD%,1
HKLM,%UnInstallPath%,UninstallString,,"%11%\rundll32.exe setupapi,InstallHinfSection Remove_XviD 132 %17%\%InfFile%"

[MPEG4.DoReg]
;HKLM,Software\Microsoft\Windows\CurrentVersion\RunOnce\Setup,"Registering Xvid Direct Show ;Decoder...",,"%11%\regsvr32.exe /s %11%\xvid.ax"

[MPEG4.DelReg]
HKLM,SYSTEM\CurrentControlSet\Control\MediaResources\icm\vidc.XVID
HKLM,SOFTWARE\Microsoft\Windows NT\CurrentVersion\drivers.desc,xvidvfw.dll,,""
HKLM,%UnInstallPath%

[MPEG4.Unregister]
;HKLM,Software\Microsoft\Windows\CurrentVersion\RunOnce\Setup,"Unregistering Xvid Direct Show ;Decoder...",,"%11%\regsvr32.exe /s /u %11%\xvid.ax"

[MPEG4.DelConfig]
HKCU,Software\GNU\XviD

[DestinationDirs]
DefaultDestDir = 11 ; LDID_SYS
MPEG4.Copy = 11
MPEG4.Copy.Inf = 17

[Strings]
XviD="Xvid MPEG-4 Video Codec"
InfFile="xvid.inf"
UnInstallPath="Software\Microsoft\Windows\CurrentVersion\Uninstall\xvid"
MediaClassName="Media Devices"
mfgname="Xvid Development Team"
Website="http://www.xvid.org/"
REG_DWORD=0x00010001
xvidcore/vfw/bin/Makefile0000664000076500007650000000530511121427717016503 0ustar xvidbuildxvidbuild
##############################################################################
#
# Makefile for XviD VFW driver
#
# Author : Milan Cutka
# Modified by : Edouard Gomez
#               Peter Ross
#
# Requires GNU Make because of shell expansion performed at a bad time with
# other make programs (even using := variable assignments)
#
# $Id: Makefile,v 1.7 2008-12-15 10:22:07 Isibaar Exp $
##############################################################################

include sources.inc

MAKEFILE_PWD:=$(shell pwd)
LOCAL_XVID_SRCTREE:=$(MAKEFILE_PWD)/../../src
LOCAL_XVID_BUILDTREE:=$(MAKEFILE_PWD)/../../build/generic/=build

RM = rm -rf
WINDRES=windres

# Constants which should not be modified
# The `mingw-runtime` package is required when building with -mno-cygwin
CFLAGS += -I$(SRC_DIR)/w32api -I$(LOCAL_XVID_SRCTREE)
CFLAGS += -mno-cygwin
CFLAGS += -D_WIN32_IE=0x0501

##############################################################################
# Optional Compiler options
##############################################################################

CFLAGS += -Wall
CFLAGS += -O2
CFLAGS += -fstrength-reduce
CFLAGS += -finline-functions
CFLAGS += -fgcse
CFLAGS += -ffast-math

##############################################################################
# Compiler flags for linking stage
##############################################################################

# LDFLAGS +=

##############################################################################
# Rules
##############################################################################

OBJECTS = $(SRC_C:.c=.obj)
OBJECTS+= $(SRC_RES:.rc=.obj)

.SUFFIXES: .obj .rc .c

BUILD_DIR = =build
VPATH = $(SRC_DIR):$(BUILD_DIR)

all: $(LIBSO)

$(BUILD_DIR):
	@echo " D: $(BUILD_DIR)"
	@mkdir -p $(BUILD_DIR)

.rc.obj:
	@echo " W: $(@D)/$(
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: status.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 *****************************************************************************/

#include <windows.h>
#include <stdio.h>
#include "resource.h"
#include "codec.h"
#include "status.h"

#include "debug.h"

#define CLR_BG 0
#define CLR_FG 1
#define CLR_QUANT_I 2
#define CLR_QUANT_P 3
#define CLR_QUANT_B 4

static void set_bic(RGBQUAD * rgb, int index, int r, int g, int b)
{
	rgb[index].rgbRed = r;
	rgb[index].rgbGreen = g;
	rgb[index].rgbBlue = b;
}

/* draw graph into buffer */
static void draw_graph(status_t *s)
{
	unsigned int i,j;

	memset(s->buffer, CLR_BG, s->width*s->stride);

	/* vertical separators between the 31 quantizer columns */
	for (j=0; j<s->height; j++)
		for (i=0; i<31; i++)
			s->buffer[ j*s->stride + i*s->width31 ] = CLR_FG;

	if (s->count[0]>0) {
		for (i=0; i<31; i++) {
			/* i-vops */
			unsigned int j_height = (s->height-s->tm.tmHeight)*s->quant[0][i]/s->max_quant_frames;
			if (j_height==0 && s->quant[0][i]>0) j_height=1;

			for(j=0; j < j_height; j++) {
				memset(s->buffer + (s->tm.tmHeight+j)*s->stride + i*s->width31 + 1,
					CLR_QUANT_I, s->width31-1);
			}

			/* p/s-vops */
			j_height += (s->height-s->tm.tmHeight)*s->quant[1][i]/s->max_quant_frames;
			if (j_height==0 && s->quant[1][i]>0) j_height=1;

			for(; j < j_height; j++) {
				memset(s->buffer + (s->tm.tmHeight+j)*s->stride + i*s->width31 + 1,
					CLR_QUANT_P, s->width31-1);
			}

			/* b-vops */
			j_height += (s->height-s->tm.tmHeight)*s->quant[2][i]/s->max_quant_frames;
			if (j_height==0 && s->quant[2][i]>0) j_height=1;

			for(; j < j_height; j++) {
				memset(s->buffer + (s->tm.tmHeight+j)*s->stride + i*s->width31 + 1,
					CLR_QUANT_B, s->width31-1);
			}
		}
	}
}

static const char * number[31] = {
	"1", "2", "3", "4", "5", "6", "7", "8", "9",
	"0","1","2","3","4","5","6","7","8","9",
	"0","1","2","3","4","5","6","7","8","9",
	"0","1"
};

static double avg_quant(int quants[31], int min, int max, char*
buf)
{
	int i, sum = 0, count = 0;

	for (i = min; i <= max && i > 0; i++) {
		sum += i*quants[i-1];
		count += quants[i-1];
	}

	if (count != 0) {
		double avg = (double)sum/(double)count;
		sprintf(buf, "%.2f", avg);
		return avg;
	} else {
		buf[0] = 0;
		return 0.0;
	}
}

/* status window proc handler */
static INT_PTR CALLBACK status_proc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
	status_t * s = (status_t*)GetWindowLongPtr(hDlg, GWLP_USERDATA);

	switch (uMsg) {
	case WM_INITDIALOG :
		SetWindowLongPtr(hDlg, GWLP_USERDATA, lParam);
		s = (status_t*)lParam;

		s->hGraph = GetDlgItem(hDlg, IDC_STATUS_GRAPH);
		s->hDc = GetDC(s->hGraph);
		{
			RECT rect;
			GetWindowRect(s->hGraph, &rect);
			s->width = rect.right - rect.left;
			s->height = rect.bottom - rect.top;
		}
		s->width31 = s->width/31;
		s->stride = (s->width/4+1)*4;

		s->buffer = malloc(s->width * s->stride);

		s->bi = malloc(sizeof(BITMAPINFOHEADER) + 256*sizeof(RGBQUAD));
		memset(s->bi, 0, sizeof(BITMAPINFOHEADER) + 256*sizeof(RGBQUAD));

		s->bi->bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
		s->bi->bmiHeader.biWidth = s->stride;
		s->bi->bmiHeader.biHeight = s->height;
		s->bi->bmiHeader.biPlanes = 1;
		s->bi->bmiHeader.biBitCount = 8;
		s->bi->bmiHeader.biCompression = BI_RGB;

		set_bic(s->bi->bmiColors, CLR_BG, 0, 0, 0);
		set_bic(s->bi->bmiColors, CLR_FG, 128, 128, 128);
		set_bic(s->bi->bmiColors, CLR_QUANT_I, 255, 0, 0);
		set_bic(s->bi->bmiColors, CLR_QUANT_P, 0, 0, 255);
		set_bic(s->bi->bmiColors, CLR_QUANT_B, 0, 192, 0);

		SelectObject(s->hDc, GetStockObject(DEFAULT_GUI_FONT));
		SetBkColor(s->hDc, *(DWORD*)&s->bi->bmiColors[CLR_BG]);
		SetTextColor(s->hDc, *(DWORD*)&s->bi->bmiColors[CLR_FG]);
		GetTextMetrics(s->hDc, &s->tm);

		draw_graph(s);
		SetTimer(hDlg, IDC_STATUS_GRAPH, 1000, NULL); /* 1 second */
		break;

	case WM_DESTROY :
		free(s->buffer);
		free(s->bi);
		KillTimer(hDlg, IDC_STATUS_GRAPH);
		s->hDlg = NULL;
		break;

	case WM_DRAWITEM :
		if (wParam==IDC_STATUS_GRAPH) {
			int i;

			/* copy buffer into dc */
			SetDIBitsToDevice(s->hDc, 0, 0, s->width, s->height, 0, 0, 0,
				s->height, s->buffer, s->bi, DIB_RGB_COLORS);

			SetTextAlign(s->hDc, GetTextAlign(s->hDc)|TA_CENTER);

			for (i=0; i<31; i++) {
				TextOut(s->hDc, i*s->width31 + s->width/62, s->height-s->tm.tmHeight, number[i], strlen(number[i]));
			}
		}
		break;

	case WM_TIMER :
		if (wParam==IDC_STATUS_GRAPH) {
			double avg_q;
			char buf[16];

			SetDlgItemInt(hDlg, IDC_STATUS_I_NUM, (unsigned int)s->count[1], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_P_NUM, (unsigned int)s->count[2], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_B_NUM, (unsigned int)s->count[3], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_NUM, (unsigned int)s->count[0], FALSE);

			SetDlgItemInt(hDlg, IDC_STATUS_IQ_MIN, s->min_quant[1], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_IQ_MAX, s->max_quant[1], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_PQ_MIN, s->min_quant[2], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_PQ_MAX, s->max_quant[2], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_BQ_MIN, s->min_quant[3], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_BQ_MAX, s->max_quant[3], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_Q_MIN, s->min_quant[0], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_Q_MAX, s->max_quant[0], FALSE);

			SetDlgItemInt(hDlg, IDC_STATUS_IL_MIN, s->min_length[1], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_IL_MAX, s->max_length[1], FALSE);
			if (s->count[1]>0)
				SetDlgItemInt(hDlg, IDC_STATUS_IL_AVG, (unsigned int)(s->tot_length[1]/s->count[1]), FALSE);
			else
				SetDlgItemInt(hDlg, IDC_STATUS_IL_AVG, 0, FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_IL_TOT, (unsigned int)(s->tot_length[1]/1024), FALSE);

			SetDlgItemInt(hDlg, IDC_STATUS_PL_MIN, s->min_length[2], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_PL_MAX, s->max_length[2], FALSE);
			if (s->count[2]>0)
				SetDlgItemInt(hDlg, IDC_STATUS_PL_AVG, (unsigned int)(s->tot_length[2]/s->count[2]), FALSE);
			else
				SetDlgItemInt(hDlg, IDC_STATUS_PL_AVG, 0, FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_PL_TOT, (unsigned int)(s->tot_length[2]/1024), FALSE);

			SetDlgItemInt(hDlg, IDC_STATUS_BL_MIN, s->min_length[3], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_BL_MAX, s->max_length[3],
				FALSE);
			if (s->count[3]>0)
				SetDlgItemInt(hDlg, IDC_STATUS_BL_AVG, (unsigned int)(s->tot_length[3]/s->count[3]), FALSE);
			else
				SetDlgItemInt(hDlg, IDC_STATUS_BL_AVG, 0, FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_BL_TOT, (unsigned int)(s->tot_length[3]/1024), FALSE);

			SetDlgItemInt(hDlg, IDC_STATUS_L_MIN, s->min_length[0], FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_L_MAX, s->max_length[0], FALSE);
			if (s->count[0]>0)
				SetDlgItemInt(hDlg, IDC_STATUS_L_AVG, (int)(s->tot_length[0]/s->count[0]), FALSE);
			else
				SetDlgItemInt(hDlg, IDC_STATUS_L_AVG, 0, FALSE);
			SetDlgItemInt(hDlg, IDC_STATUS_L_TOT, (unsigned int)(s->tot_length[0]/1024), FALSE);

			if (s->count[0]>0) {
				uint64_t kbits = 8*s->tot_length[0]/1000;
				double secs = (double)s->count[0]/s->fps;
				SetDlgItemInt(hDlg, IDC_STATUS_KBPS, (int)(kbits/secs), FALSE);
			}else{
				SetDlgItemInt(hDlg, IDC_STATUS_KBPS, 0, FALSE);
			}

			/* per-type averages, weighted by frame count to form the overall average */
			avg_q = avg_quant(s->quant[0], s->min_quant[1], s->max_quant[1], buf) * s->count[1];
			SetDlgItemText(hDlg, IDC_STATUS_IQ_AVG, buf);

			avg_q += avg_quant(s->quant[1], s->min_quant[2], s->max_quant[2], buf) * s->count[2];
			SetDlgItemText(hDlg, IDC_STATUS_PQ_AVG, buf);

			avg_q += avg_quant(s->quant[2], s->min_quant[3], s->max_quant[3], buf) * s->count[3];
			SetDlgItemText(hDlg, IDC_STATUS_BQ_AVG, buf);

			if (s->count[0] != 0) avg_q /= (double)s->count[0];
			sprintf(buf, "%.2f", avg_q);
			SetDlgItemText(hDlg, IDC_STATUS_Q_AVG, buf);

			draw_graph(s);
			InvalidateRect(s->hGraph, NULL, FALSE);
		}
		break;

	case WM_COMMAND :
		if (LOWORD(wParam)==IDCANCEL) {
			DestroyWindow(hDlg);
		}
		break;

	default :
		return FALSE;
	}

	return TRUE;
}

/* destroy status window (however, if the auto-close box is unchecked, don't destroy) */
void status_destroy(status_t *s)
{
	if (s->hDlg && IsDlgButtonChecked(s->hDlg,IDC_STATUS_DESTROY)==BST_CHECKED) {
		DestroyWindow(s->hDlg);
	}
}

/* destroy status window, always */
void status_destroy_always(status_t *s)
{
	if (s->hDlg) {
		DestroyWindow(s->hDlg);
	}
}

/* create status window */
void status_create(status_t * s, unsigned int fps_inc,
unsigned int fps_base)
{
	int i;

	s->fps = fps_base/fps_inc;

	memset(s->quant[0], 0, 31*sizeof(int));
	memset(s->quant[1], 0, 31*sizeof(int));
	memset(s->quant[2], 0, 31*sizeof(int));
	s->max_quant_frames = 0;
	for (i=0; i<4; i++) {
		s->count[i] = 0;
		s->min_quant[i] = s->max_quant[i] = 0;
		s->min_length[i] = s->max_length[i] = 0;
		s->tot_length[i] = 0;
	}

	s->hDlg = CreateDialogParam(g_hInst, MAKEINTRESOURCE(IDD_STATUS), GetDesktopWindow(), status_proc, (LPARAM)s);

	ShowWindow(s->hDlg, SW_SHOW);
}

static char type2char(int type)
{
	if (type==XVID_TYPE_IVOP) return 'I';
	if (type==XVID_TYPE_PVOP) return 'P';
	if (type==XVID_TYPE_BVOP) return 'B';
	return 'S';
}

static void status_debugoutput(status_t *s, int type, int length, int quant)
{
	if (s->hDlg && IsDlgButtonChecked(s->hDlg,IDC_SHOWINTERNALS)==BST_CHECKED) {
		LRESULT litem;
		char buf[128];

		sprintf(buf, "[%6d] ->%c q:%2d (%6d b)",
			(unsigned int)(s->count[0]), type2char(type), quant, length);

		SendDlgItemMessage (s->hDlg,IDC_DEBUGOUTPUT, LB_ADDSTRING, 0, (LPARAM)(LPSTR)buf);

		litem = SendDlgItemMessage (s->hDlg, IDC_DEBUGOUTPUT, LB_GETCOUNT, 0, 0L);

		/* keep at most 12 entries by dropping the oldest one */
		if (litem > 12)
			litem = SendDlgItemMessage (s->hDlg,IDC_DEBUGOUTPUT, LB_DELETESTRING, 0, 0L);

		SendDlgItemMessage(s->hDlg,IDC_DEBUGOUTPUT, LB_SETCURSEL, (WORD)(litem-1), 0L);
	}
}

/* feed stats info into the window */
void status_update(status_t *s, int type, int length, int quant)
{
	s->count[0]++;

	status_debugoutput(s, type, length, quant);

	/* s-vops are counted together with p-vops */
	if (type == XVID_TYPE_SVOP) type = XVID_TYPE_PVOP;
	s->count[type]++;

	if (s->min_quant[0]==0 || quant<s->min_quant[0]) s->min_quant[0] = quant;
	if (s->max_quant[0]==0 || quant>s->max_quant[0]) s->max_quant[0] = quant;
	if (s->min_quant[type]==0 || quant<s->min_quant[type]) s->min_quant[type] = quant;
	if (s->max_quant[type]==0|| quant>s->max_quant[type]) s->max_quant[type] = quant;

	s->quant[type-1][quant-1]++;
	if (s->quant[0][quant-1] + s->quant[1][quant-1] + s->quant[2][quant-1] > s->max_quant_frames)
		s->max_quant_frames = s->quant[0][quant-1] +
			s->quant[1][quant-1] + s->quant[2][quant-1];

	if (s->min_length[0]==0 || length<s->min_length[0]) s->min_length[0] = length;
	if (s->max_length[0]==0 || length>s->max_length[0]) s->max_length[0] = length;
	if (s->min_length[type]==0 || length<s->min_length[type]) s->min_length[type] = length;
	if (s->max_length[type]==0|| length>s->max_length[type]) s->max_length[type] = length;

	s->tot_length[0] += length;
	s->tot_length[type] += length;
}
xvidcore/vfw/src/config.c0000664000076500007650000026106311564705453016507 0ustar xvidbuildxvidbuild
/**************************************************************************
 *
 * XVID VFW FRONTEND
 * config
 *
 * Copyright(C) Peter Ross
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 *
 * $Id: config.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 *************************************************************************/

#include <windows.h>
#include <commctrl.h>
#include <stdio.h> /* sprintf */
#include <xvid.h> /* Xvid API */

#include "debug.h"
#include "config.h"
#include "resource.h"

#define CONSTRAINVAL(X,Y,Z) if((X)<(Y)) X=Y; if((X)>(Z)) X=Z;
#define IsDlgChecked(hwnd,idc) (IsDlgButtonChecked(hwnd,idc) == BST_CHECKED)
#define CheckDlg(hwnd,idc,value) CheckDlgButton(hwnd,idc, value?BST_CHECKED:BST_UNCHECKED)
#define EnableDlgWindow(hwnd,idc,state) EnableWindow(GetDlgItem(hwnd,idc),state)

static void zones_update(HWND hDlg, CONFIG * config);

HINSTANCE g_hInst;
HWND g_hTooltip;

static int g_use_bitrate = 1;

int pp_brightness, pp_dy, pp_duv, pp_fe, pp_dry, pp_druv; /* decoder options */

/* enumerates child windows, assigns tooltips */
BOOL CALLBACK enum_tooltips(HWND hWnd, LPARAM lParam)
{
	char help[500];

	if (LoadString(g_hInst, GetDlgCtrlID(hWnd), help, 500))
	{
		TOOLINFO ti;
		ti.cbSize = sizeof(TOOLINFO);
		ti.uFlags = TTF_SUBCLASS | TTF_IDISHWND;
		ti.hwnd = GetParent(hWnd);
		ti.uId = (LPARAM)hWnd;
		ti.lpszText = help;
		SendMessage(g_hTooltip, TTM_ADDTOOL, 0, (LPARAM)&ti);
	}

	return TRUE;
}

/* ===================================================================================== */
/* MPEG-4 PROFILES/LEVELS ============================================================== */
/* ===================================================================================== */

/* #define EXTRA_PROFILES */

/* default vbv_occupancy is (64/170)*vbv_buffer_size */

#define PROFILE_S (PROFILE_4MV)
#define PROFILE_ARTS (PROFILE_4MV|PROFILE_ADAPTQUANT)
#define PROFILE_AS (PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_MPEGQUANT|PROFILE_INTERLACE|PROFILE_QPEL|PROFILE_GMC)

const profile_t profiles[] =
{
/* name p@l w h fps obj Tvmv vmv vcv ac% vbv pkt bps vbv_peak dbf flags */
#ifndef EXTRA_PROFILES
	{ "Xvid Mobile", "Xvid Mobile", 0x00, 352, 240, 30, 1, 990, 330, 9900, 100, 128*8192, -1, 1334850, 8000000, 5,
PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_MPEGQUANT|PROFILE_QPEL|PROFILE_XVID }, { "Xvid Home", "Xvid Home", 0x00, 720, 576, 25, 1, 4860, 1620, 40500, 100, 384*8192, -1, 4854000, 8000000, 5, PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_MPEGQUANT|PROFILE_QPEL|PROFILE_INTERLACE|PROFILE_XVID }, { "Xvid HD 720", "Xvid HD 720", 0x00, 1280, 720, 30, 1,10800, 3600, 108000, 100, 768*8192, -1, 9708400, 16000000, 5, PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_MPEGQUANT|PROFILE_QPEL|PROFILE_INTERLACE|PROFILE_XVID }, { "Xvid HD 1080", "Xvid HD 1080", 0x00, 1920,1080, 30, 1,24480, 8160, 244800, 100, 2047*8192, -1,20480000, 36000000, 5, PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_MPEGQUANT|PROFILE_QPEL|PROFILE_INTERLACE|PROFILE_XVID }, #else { "Handheld", "Handheld", 0x00, 176, 144, 15, 1, 198, 99, 1485, 100, 32*8192, -1, 537600, 800000, 0, PROFILE_ADAPTQUANT|PROFILE_EXTRA }, { "Portable NTSC", "Portable NTSC", 0x00, 352, 240, 30, 1, 990, 330, 36000, 100, 384*8192, -1, 4854000, 8000000, 1, PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_EXTRA }, { "Portable PAL", "Portable PAL", 0x00, 352, 288, 25, 1, 1188, 396, 36000, 100, 384*8192, -1, 4854000, 8000000, 1, PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_EXTRA }, { "Home Theatre NTSC", "Home Theatre NTSC", 0x00, 720, 480, 30, 1, 4050, 1350, 40500, 100, 384*8192, -1, 4854000, 8000000, 1, PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_INTERLACE|PROFILE_EXTRA }, { "Home Theatre PAL", "Home Theatre PAL", 0x00, 720, 576, 25, 1, 4860, 1620, 40500, 100, 384*8192, -1, 4854000, 8000000, 1, PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_INTERLACE|PROFILE_EXTRA }, { "HDTV", "HDTV", 0x00, 1280, 720, 30, 1,10800, 3600, 108000, 100, 768*8192, -1, 9708400, 16000000, 2, PROFILE_4MV|PROFILE_ADAPTQUANT|PROFILE_BVOP|PROFILE_PACKED|PROFILE_INTERLACE|PROFILE_EXTRA }, #endif { 
"MPEG4 Simple @ L0", "MPEG4 SP @ L0", 0x08, 176, 144, 15, 1, 198, 99, 1485, 100, 10*16368, 2048, 64000, 0, -1, PROFILE_S }, /* simple@l0: max f_code=1, intra_dc_vlc_threshold=0 */ /* if ac preidition is used, adaptive quantization must not be used */ /* <=qcif must be used */ { "MPEG4 Simple @ L1", "MPEG4 SP @ L1", 0x01, 176, 144, 15, 4, 198, 99, 1485, 100, 10*16368, 2048, 64000, 0, -1, PROFILE_S|PROFILE_ADAPTQUANT }, { "MPEG4 Simple @ L2", "MPEG4 SP @ L2", 0x02, 352, 288, 15, 4, 792, 396, 5940, 100, 40*16368, 4096, 128000, 0, -1, PROFILE_S|PROFILE_ADAPTQUANT }, { "MPEG4 Simple @ L3", "MPEG4 SP @ L3", 0x03, 352, 288, 30, 4, 792, 396, 11880, 100, 40*16368, 8192, 384000, 0, -1, PROFILE_S|PROFILE_ADAPTQUANT }, /* From ISO/IEC 14496-2:2004/FPDAM 2: New Levels for Simple Profile */ { "MPEG4 Simple @ L4a","MPEG4 SP @ L4a", 0x04, 640, 480, 30, 4, 2400, 1200, 36000, 100, 80*16368, 16384, 4000000, 0, -1, PROFILE_S|PROFILE_ADAPTQUANT }, { "MPEG4 Simple @ L5", "MPEG4 SP @ L5", 0x05, 720, 576, 30, 4, 3240, 1620, 40500, 100, 112*16368, 16384, 8000000, 0, -1, PROFILE_S|PROFILE_ADAPTQUANT }, /* From ISO/IEC 14496-2:2004/FPDAM 4: Simple profile level 6 */ { "MPEG4 Simple @ L6", "MPEG4 SP @ L6", 0x06, 1280, 720, 30, 4, 7200, 3600, 108000, 100, 248*16368, 16384,12000000, 0, -1, PROFILE_S|PROFILE_ADAPTQUANT }, #if 0 /* since rrv encoding is no longer support, these profiles have little use */ { "MPEG4 ARTS @ L1", "MPEG4 ARTS @ L1", 0x91, 176, 144, 15, 4, 198, 99, 1485, 100, 10*16368, 8192, 64000, 0, -1, PROFILE_ARTS }, { "MPEG4 ARTS @ L2", "MPEG4 ARTS @ L2", 0x92, 352, 288, 15, 4, 792, 396, 5940, 100, 40*16368, 16384, 128000, 0, -1, PROFILE_ARTS }, { "MPEG4 ARTS @ L3", "MPEG4 ARTS @ L3", 0x93, 352, 288, 30, 4, 792, 396, 11880, 100, 40*16368, 16384, 384000, 0, -1, PROFILE_ARTS }, { "MPEG4 ARTS @ L4", "MPEG4 ARTS @ L4", 0x94, 352, 288, 30, 16, 792, 396, 11880, 100, 80*16368, 16384, 2000000, 0, -1, PROFILE_ARTS }, #endif { "MPEG4 Advanced Simple @ L0", "MPEG4 ASP @ L0", 0xf0, 176, 144, 
30, 1, 297, 99, 2970, 100, 10*16368, 2048, 128000, 0, -1, PROFILE_AS }, { "MPEG4 Advanced Simple @ L1", "MPEG4 ASP @ L1", 0xf1, 176, 144, 30, 4, 297, 99, 2970, 100, 10*16368, 2048, 128000, 0, -1, PROFILE_AS }, { "MPEG4 Advanced Simple @ L2", "MPEG4 ASP @ L2", 0xf2, 352, 288, 15, 4, 1188, 396, 5940, 100, 40*16368, 4096, 384000, 0, -1, PROFILE_AS }, { "MPEG4 Advanced Simple @ L3", "MPEG4 ASP @ L3", 0xf3, 352, 288, 30, 4, 1188, 396, 11880, 100, 40*16368, 4096, 768000, 0, -1, PROFILE_AS }, /* ISMA Profile 1, (ASP) @ L3b (CIF, 1.5 Mb/s) CIF(352x288), 30fps, 1.5Mbps max ??? */ { "MPEG4 Advanced Simple @ L4", "MPEG4 ASP @ L4", 0xf4, 352, 576, 30, 4, 2376, 792, 23760, 50, 80*16368, 8192, 3000000, 0, -1, PROFILE_AS }, { "MPEG4 Advanced Simple @ L5", "MPEG4 ASP @ L5", 0xf5, 720, 576, 30, 4, 4860, 1620, 48600, 25, 112*16368, 16384, 8000000, 0, -1, PROFILE_AS }, { "(unrestricted)", "(unrestricted)", 0x00, 0, 0, 0, 0, 0, 0, 0, 100, 0*16368, -1, 0, 0, -1, 0xffffffff & ~(PROFILE_EXTRA | PROFILE_PACKED | PROFILE_XVID)}, }; const quality_t quality_table[] = { /* name | m vhq mtc bf cme tbo kfi fdr | iquant pquant bquant trellis */ { "Real-time", 1, 0, 0, 0, 0, 0, 300, 0, 1, 31, 1, 31, 1, 31, 0 }, { QUALITY_GENERAL_STRING, 6, 1, 0, 0, 1, 0, 300, 0, 1, 31, 1, 31, 1, 31, 1 }, }; const int quality_table_num = sizeof(quality_table)/sizeof(quality_t); typedef struct { char * name; float value; } named_float_t; static const named_float_t video_fps_list[] = { { "15.0", 15.0F }, { "23.976 (FILM)", 23.976F }, { "25.0 (PAL)", 25.0F }, { "29.97 (NTSC)", 29.970F }, { "30.0", 30.0F }, { "50.0 (HD PAL)", 50.0F }, { "59.94 (HD NTSC)", 59.940F }, { "60.0", 60.0F }, }; typedef struct { char * name; int avi_interval; /* audio overhead intervals (milliseconds) */ float mkv_multiplier; /* mkv multiplier */ } named_int_t; #define NO_AUDIO 7 static const named_int_t audio_type_list[] = { { "MP3-CBR", 1000, 48000/1152/6 }, { "MP3-VBR", 24, 48000/1152/6 }, { "OGG", /*?*/1000, 48000*(0.7F/1024 + 0.3F/180) 
}, { "AC3", 64, 48000/1536/6 }, { "DTS", 21, /*?*/48000/1152/6 }, { "AAC", 21, 48000/1024/6 }, { "HE-AAC", 42, 48000/1024/6 }, { "(None)", 0, 0 }, }; /* ===================================================================================== */ /* REGISTRY ============================================================================ */ /* ===================================================================================== */ /* registry info structs */ CONFIG reg; static const REG_INT reg_ints[] = { {"mode", &reg.mode, RC_MODE_1PASS}, {"bitrate", &reg.bitrate, 700}, {"desired_size", &reg.desired_size, 570000}, {"use_2pass_bitrate", &reg.use_2pass_bitrate, 0}, {"desired_quant", &reg.desired_quant, DEFAULT_QUANT}, /* 100-base float */ /* profile */ {"quant_type", &reg.quant_type, 0}, {"lum_masking", &reg.lum_masking, 0}, {"interlacing", &reg.interlacing, 0}, {"tff", &reg.tff, 0}, {"qpel", &reg.qpel, 0}, {"gmc", &reg.gmc, 0}, {"use_bvop", &reg.use_bvop, 1}, {"max_bframes", &reg.max_bframes, 2}, {"bquant_ratio", &reg.bquant_ratio, 150}, /* 100-base float */ {"bquant_offset", &reg.bquant_offset, 100}, /* 100-base float */ {"packed", &reg.packed, 1}, {"num_slices", &reg.num_slices, 1}, /* aspect ratio */ {"ar_mode", &reg.ar_mode, 0}, {"aspect_ratio", &reg.display_aspect_ratio, 0}, {"par_x", &reg.par_x, 1}, {"par_y", &reg.par_y, 1}, {"ar_x", &reg.ar_x, 4}, {"ar_y", &reg.ar_y, 3}, /* zones */ {"num_zones", &reg.num_zones, 1}, /* single pass */ {"rc_reaction_delay_factor",&reg.rc_reaction_delay_factor, 16}, {"rc_averaging_period", &reg.rc_averaging_period, 100}, {"rc_buffer", &reg.rc_buffer, 100}, /* 2pass1 */ {"discard1pass", &reg.discard1pass, 1}, {"full1pass", &reg.full1pass, 0}, /* 2pass2 */ {"keyframe_boost", &reg.keyframe_boost, 10}, {"kfreduction", &reg.kfreduction, 20}, {"kfthreshold", &reg.kfthreshold, 1}, {"curve_compression_high", &reg.curve_compression_high, 0}, {"curve_compression_low", &reg.curve_compression_low, 0}, {"overflow_control_strength", &reg.overflow_control_strength, 5}, {"twopass_max_overflow_improvement", &reg.twopass_max_overflow_improvement, 5}, 
{"twopass_max_overflow_degradation", &reg.twopass_max_overflow_degradation, 5}, /* bitrate calculator */ {"container_type", &reg.container_type, 1}, {"target_size", &reg.target_size, 650 * 1024}, {"subtitle_size", &reg.subtitle_size, 0}, {"hours", &reg.hours, 1}, {"minutes", &reg.minutes, 30}, {"seconds", &reg.seconds, 0}, {"fps", &reg.fps, 2}, {"audio_mode", &reg.audio_mode, 0}, {"audio_type", &reg.audio_type, 0}, {"audio_rate", &reg.audio_rate, 128}, {"audio_size", &reg.audio_size, 0}, /* motion */ {"motion_search", &reg.quality_user.motion_search, 6}, {"vhq_mode", &reg.quality_user.vhq_mode, 1}, {"vhq_metric", &reg.quality_user.vhq_metric, 0}, {"vhq_bframe", &reg.quality_user.vhq_bframe, 0}, {"chromame", &reg.quality_user.chromame, 1}, {"turbo", &reg.quality_user.turbo, 0}, {"max_key_interval", &reg.quality_user.max_key_interval, 300}, {"frame_drop_ratio", &reg.quality_user.frame_drop_ratio, 0}, /* quant */ {"min_iquant", &reg.quality_user.min_iquant, 1}, {"max_iquant", &reg.quality_user.max_iquant, 31}, {"min_pquant", &reg.quality_user.min_pquant, 1}, {"max_pquant", &reg.quality_user.max_pquant, 31}, {"min_bquant", &reg.quality_user.min_bquant, 1}, {"max_bquant", &reg.quality_user.max_bquant, 31}, {"trellis_quant", &reg.quality_user.trellis_quant, 1}, /* debug */ {"fourcc_used", &reg.fourcc_used, 0}, {"debug", &reg.debug, 0x0}, {"vop_debug", &reg.vop_debug, 0}, {"display_status", &reg.display_status, 1}, {"cpu_flags", &reg.cpu, 0}, /* smp */ {"num_threads", &reg.num_threads, 0}, /* decoder, shared with dshow */ {"Brightness", &pp_brightness, 0}, {"Deblock_Y", &pp_dy, 0}, {"Deblock_UV", &pp_duv, 0}, {"Dering_Y", &pp_dry, 0}, {"Dering_UV", &pp_druv, 0}, {"FilmEffect", &pp_fe, 0}, }; static const REG_STR reg_strs[] = { {"profile", reg.profile_name, "Xvid Home"}, {"quality", reg.quality_name, QUALITY_GENERAL_STRING}, {"stats", reg.stats, CONFIG_2PASS_FILE}, }; zone_t stmp; static const REG_INT reg_zone[] = { {"zone%i_frame", &stmp.frame, 0}, {"zone%i_mode", &stmp.mode, RC_ZONE_WEIGHT}, {"zone%i_weight", &stmp.weight, 100}, /* 100-base float */ {"zone%i_quant", 
&stmp.quant, 500}, /* 100-base float */ {"zone%i_type", &stmp.type, XVID_TYPE_AUTO}, {"zone%i_greyscale", &stmp.greyscale, 0}, {"zone%i_chroma_opt", &stmp.chroma_opt, 0}, {"zone%i_bvop_threshold", &stmp.bvop_threshold, 0}, {"zone%i_cartoon_mode", &stmp.cartoon_mode, 0}, }; static const BYTE default_qmatrix_intra[] = { 8, 17,18,19,21,23,25,27, 17,18,19,21,23,25,27,28, 20,21,22,23,24,26,28,30, 21,22,23,24,26,28,30,32, 22,23,24,26,28,30,32,35, 23,24,26,28,30,32,35,38, 25,26,28,30,32,35,38,41, 27,28,30,32,35,38,41,45 }; static const BYTE default_qmatrix_inter[] = { 16,17,18,19,20,21,22,23, 17,18,19,20,21,22,23,24, 18,19,20,21,22,23,24,25, 19,20,21,22,23,24,26,27, 20,21,22,23,25,26,27,28, 21,22,23,24,26,27,28,30, 22,23,24,26,27,28,30,31, 23,24,25,27,28,30,31,33 }; #define REG_GET_B(X, Y, Z) size=sizeof((Z));if(RegQueryValueEx(hKey, X, 0, 0, Y, &size) != ERROR_SUCCESS) {memcpy(Y, Z, sizeof((Z)));} #define XVID_DLL_NAME "xvidcore.dll" void config_reg_get(CONFIG * config) { char tmp[32]; HKEY hKey; DWORD size; int i,j; xvid_gbl_info_t info; HINSTANCE m_hdll; memset(&info, 0, sizeof(info)); info.version = XVID_VERSION; m_hdll = LoadLibrary(XVID_DLL_NAME); if (m_hdll != NULL) { ((int (__cdecl *)(void *, int, void *, void *))GetProcAddress(m_hdll, "xvid_global")) (0, XVID_GBL_INFO, &info, NULL); FreeLibrary(m_hdll); } RegOpenKeyEx(XVID_REG_KEY, XVID_REG_PARENT "\\" XVID_REG_CHILD, 0, KEY_READ, &hKey); /* read integer values */ for (i=0 ; iqmatrix_intra, default_qmatrix_intra); REG_GET_B("qmatrix_inter", config->qmatrix_inter, default_qmatrix_inter); /* read zones */ if (config->num_zones>MAX_ZONES) { config->num_zones=MAX_ZONES; }else if (config->num_zones<=0) { config->num_zones = 1; } for (i=0; i<config->num_zones; i++) { for (j=0; jzones[i], &stmp, sizeof(zone_t)); } if (!(config->cpu&XVID_CPU_FORCE)) { config->cpu = info.cpu_flags; config->num_threads = info.num_threads; /* autodetect */ } RegCloseKey(hKey); } /* put config settings in registry */ #define REG_SET_B(X, Y) 
RegSetValueEx(hKey, X, 0, REG_BINARY, Y, sizeof((Y))) void config_reg_set(CONFIG * config) { char tmp[64]; HKEY hKey; DWORD dispo; int i,j; if (RegCreateKeyEx( XVID_REG_KEY, XVID_REG_PARENT "\\" XVID_REG_CHILD, 0, XVID_REG_CLASS, REG_OPTION_NON_VOLATILE, KEY_WRITE, 0, &hKey, &dispo) != ERROR_SUCCESS) { DPRINTF("Couldn't create XVID_REG_SUBKEY - GetLastError=%i", GetLastError()); return; } memcpy(&reg, config, sizeof(CONFIG)); /* set integer values */ for (i=0 ; iqmatrix_intra); REG_SET_B("qmatrix_inter", config->qmatrix_inter); /* set sections */ for (i=0; i<config->num_zones; i++) { memcpy(&stmp, &config->zones[i], sizeof(zone_t)); for (j=0; jqmatrix_intra[i], FALSE); SetDlgItemInt(hDlg, IDC_QINTER00 + i, config->qmatrix_inter[i], FALSE); } } static void quant_download(HWND hDlg, CONFIG* config) { int i; for (i=0; i<64; i++) { int temp; temp = config_get_uint(hDlg, i + IDC_QINTRA00, config->qmatrix_intra[i]); CONSTRAINVAL(temp, 1, 255); config->qmatrix_intra[i] = temp; temp = config_get_uint(hDlg, i + IDC_QINTER00, config->qmatrix_inter[i]); CONSTRAINVAL(temp, 1, 255); config->qmatrix_inter[i] = temp; } } static void quant_loadsave(HWND hDlg, CONFIG * config, int save) { char file[MAX_PATH]; OPENFILENAME ofn; HANDLE hFile; DWORD read=128, wrote=0; BYTE quant_data[128]; strcpy(file, "\\matrix"); memset(&ofn, 0, sizeof(OPENFILENAME)); ofn.lStructSize = sizeof(OPENFILENAME); ofn.hwndOwner = hDlg; ofn.lpstrFilter = "All files (*.*)\0*.*\0\0"; ofn.lpstrFile = file; ofn.nMaxFile = MAX_PATH; ofn.Flags = OFN_PATHMUSTEXIST; if (save) { ofn.Flags |= OFN_OVERWRITEPROMPT; if (GetSaveFileName(&ofn)) { hFile = CreateFile(file, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, 0); quant_download(hDlg, config); memcpy(quant_data, config->qmatrix_intra, 64); memcpy(quant_data+64, config->qmatrix_inter, 64); if (hFile == INVALID_HANDLE_VALUE) { DPRINTF("Couldn't save quant matrix"); }else{ if (!WriteFile(hFile, quant_data, 128, &wrote, 0)) { DPRINTF("Couldnt write quant matrix"); } } 
CloseHandle(hFile); } }else{ ofn.Flags |= OFN_FILEMUSTEXIST; if (GetOpenFileName(&ofn)) { hFile = CreateFile(file, GENERIC_READ, 0, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0); if (hFile == INVALID_HANDLE_VALUE) { DPRINTF("Couldn't load quant matrix"); } else { if (!ReadFile(hFile, quant_data, 128, &read, 0)) { DPRINTF("Couldnt read quant matrix"); }else{ memcpy(config->qmatrix_intra, quant_data, 64); memcpy(config->qmatrix_inter, quant_data+64, 64); quant_upload(hDlg, config); } } CloseHandle(hFile); } } } /* quantization matrix dialog proc */ static INT_PTR CALLBACK quantmatrix_proc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam) { CONFIG* config = (CONFIG*)GetWindowLongPtr(hDlg, GWLP_USERDATA); switch (uMsg) { case WM_INITDIALOG : SetWindowLongPtr(hDlg, GWLP_USERDATA, lParam); config = (CONFIG*)lParam; quant_upload(hDlg, config); if (g_hTooltip) { EnumChildWindows(hDlg, enum_tooltips, 0); } break; case WM_COMMAND : if (HIWORD(wParam) == BN_CLICKED) { switch(LOWORD(wParam)) { case IDOK : quant_download(hDlg, config); EndDialog(hDlg, IDOK); break; case IDCANCEL : EndDialog(hDlg, IDCANCEL); break; case IDC_SAVE : quant_loadsave(hDlg, config, 1); break; case IDC_LOAD : quant_loadsave(hDlg, config, 0); break; default : return FALSE; } break; } return FALSE; default : return FALSE; } return TRUE; } /* ===================================================================================== */ /* ADVANCED DIALOG PAGES ================================================================ */ /* ===================================================================================== */ /* initialise pages */ static void adv_init(HWND hDlg, int idd, CONFIG * config) { unsigned int i; switch(idd) { case IDD_PROFILE : for (i=0; ici_valid); break; case IDD_MOTION : SendDlgItemMessage(hDlg, IDC_MOTION, CB_ADDSTRING, 0, (LPARAM)"0 - None"); SendDlgItemMessage(hDlg, IDC_MOTION, CB_ADDSTRING, 0, (LPARAM)"1 - Very Low"); SendDlgItemMessage(hDlg, IDC_MOTION, CB_ADDSTRING, 0, (LPARAM)"2 - 
Low"); SendDlgItemMessage(hDlg, IDC_MOTION, CB_ADDSTRING, 0, (LPARAM)"3 - Medium"); SendDlgItemMessage(hDlg, IDC_MOTION, CB_ADDSTRING, 0, (LPARAM)"4 - High"); SendDlgItemMessage(hDlg, IDC_MOTION, CB_ADDSTRING, 0, (LPARAM)"5 - Very High"); SendDlgItemMessage(hDlg, IDC_MOTION, CB_ADDSTRING, 0, (LPARAM)"6 - Ultra High"); SendDlgItemMessage(hDlg, IDC_VHQ, CB_ADDSTRING, 0, (LPARAM)"0 - Off"); SendDlgItemMessage(hDlg, IDC_VHQ, CB_ADDSTRING, 0, (LPARAM)"1 - Mode Decision"); SendDlgItemMessage(hDlg, IDC_VHQ, CB_ADDSTRING, 0, (LPARAM)"2 - Limited Search"); SendDlgItemMessage(hDlg, IDC_VHQ, CB_ADDSTRING, 0, (LPARAM)"3 - Medium Search"); SendDlgItemMessage(hDlg, IDC_VHQ, CB_ADDSTRING, 0, (LPARAM)"4 - Wide Search"); SendDlgItemMessage(hDlg, IDC_VHQ_METRIC, CB_ADDSTRING, 0, (LPARAM)"0 - PSNR"); SendDlgItemMessage(hDlg, IDC_VHQ_METRIC, CB_ADDSTRING, 0, (LPARAM)"1 - PSNR-HVS-M"); break; case IDD_ENC : SendDlgItemMessage(hDlg, IDC_FOURCC, CB_ADDSTRING, 0, (LPARAM)"XVID"); SendDlgItemMessage(hDlg, IDC_FOURCC, CB_ADDSTRING, 0, (LPARAM)"DIVX"); SendDlgItemMessage(hDlg, IDC_FOURCC, CB_ADDSTRING, 0, (LPARAM)"DX50"); break; case IDD_DEC : SendDlgItemMessage(hDlg, IDC_DEC_BRIGHTNESS, TBM_SETRANGE, (WPARAM)TRUE, (LPARAM)MAKELONG(-96, 96)); SendDlgItemMessage(hDlg, IDC_DEC_BRIGHTNESS, TBM_SETTICFREQ, (WPARAM)16, (LPARAM)0); break; } } /* enable/disable controls based on encoder-mode or user selection */ static void adv_mode(HWND hDlg, int idd, CONFIG * config) { int profile; int weight_en, quant_en; int cpu_force; int custom_quant, bvops; int ar_mode, ar_par; int qpel_checked, mot_srch_prec, vhq_enabled, bvhq_enabled; switch(idd) { case IDD_PROFILE : profile = SendDlgItemMessage(hDlg, IDC_PROFILE_PROFILE, CB_GETCURSEL, 0, 0); EnableDlgWindow(hDlg, IDC_BVOP, profiles[profile].flags&PROFILE_BVOP); EnableDlgWindow(hDlg, IDC_QUANTTYPE_S, profiles[profile].flags&PROFILE_MPEGQUANT); EnableDlgWindow(hDlg, IDC_QUANTTYPE_S, profiles[profile].flags&PROFILE_MPEGQUANT); EnableDlgWindow(hDlg, 
IDC_QUANTTYPE, profiles[profile].flags&PROFILE_MPEGQUANT); custom_quant = (profiles[profile].flags&PROFILE_MPEGQUANT) && SendDlgItemMessage(hDlg, IDC_QUANTTYPE, CB_GETCURSEL, 0, 0)==QUANT_MODE_CUSTOM; EnableDlgWindow(hDlg, IDC_QUANTMATRIX, custom_quant); EnableDlgWindow(hDlg, IDC_LUMMASK, profiles[profile].flags&PROFILE_ADAPTQUANT); EnableDlgWindow(hDlg, IDC_INTERLACING, profiles[profile].flags&PROFILE_INTERLACE); EnableDlgWindow(hDlg, IDC_TFF, IsDlgChecked(hDlg, IDC_INTERLACING)); EnableDlgWindow(hDlg, IDC_QPEL, profiles[profile].flags&PROFILE_QPEL); EnableDlgWindow(hDlg, IDC_GMC, profiles[profile].flags&PROFILE_GMC); EnableDlgWindow(hDlg, IDC_SLICES, profiles[profile].flags&PROFILE_RESYNCMARKER); bvops = (profiles[profile].flags&PROFILE_BVOP) && IsDlgChecked(hDlg, IDC_BVOP); EnableDlgWindow(hDlg, IDC_MAXBFRAMES, bvops); EnableDlgWindow(hDlg, IDC_BQUANTRATIO, bvops); EnableDlgWindow(hDlg, IDC_BQUANTOFFSET, bvops); EnableDlgWindow(hDlg, IDC_MAXBFRAMES_S, bvops); EnableDlgWindow(hDlg, IDC_BQUANTRATIO_S, bvops); EnableDlgWindow(hDlg, IDC_BQUANTOFFSET_S, bvops); EnableDlgWindow(hDlg, IDC_PACKED, bvops && !(profiles[profile].flags & PROFILE_PACKED)); switch(profile) { case 0: { HICON profile_icon = LoadImage(g_hInst, MAKEINTRESOURCE(IDI_MOBILE), IMAGE_ICON, 0, 0, 0); SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)profile_icon); } break; case 1: { HICON profile_icon = LoadImage(g_hInst, MAKEINTRESOURCE(IDI_HOME), IMAGE_ICON, 0, 0, 0); SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)profile_icon); } break; case 2: { HICON profile_icon = LoadImage(g_hInst, MAKEINTRESOURCE(IDI_HD720), IMAGE_ICON, 0, 0, 0); SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)profile_icon); } break; case 3: { HICON profile_icon = LoadImage(g_hInst, MAKEINTRESOURCE(IDI_HD1080), IMAGE_ICON, 0, 0, 0); SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)profile_icon); } break; default: 
SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)NULL); break; } if (profile < 4) ShowWindow(GetDlgItem(hDlg, IDC_PROFILE_LABEL), SW_HIDE); else ShowWindow(GetDlgItem(hDlg, IDC_PROFILE_LABEL), SW_SHOW); break; case IDD_AR: ar_mode = IsDlgChecked(hDlg, IDC_PAR); EnableDlgWindow(hDlg, IDC_ASPECT_RATIO, ar_mode); ar_par = SendDlgItemMessage(hDlg, IDC_ASPECT_RATIO, CB_GETCURSEL, 0, 0); if (ar_par == 5) { /* custom par */ SetDlgItemInt(hDlg, IDC_PARY, config->par_y, FALSE); SetDlgItemInt(hDlg, IDC_PARX, config->par_x, FALSE); EnableDlgWindow(hDlg, IDC_PARX, ar_mode); EnableDlgWindow(hDlg, IDC_PARY, ar_mode); } else { SetDlgItemInt(hDlg, IDC_PARX, PARS[ar_par][0], FALSE); SetDlgItemInt(hDlg, IDC_PARY, PARS[ar_par][1], FALSE); EnableDlgWindow(hDlg, IDC_PARX, FALSE); EnableDlgWindow(hDlg, IDC_PARY, FALSE); } ar_mode = IsDlgChecked(hDlg, IDC_AR); config->ar_x = config_get_uint(hDlg, IDC_ARX, config->ar_x); config->ar_y = config_get_uint(hDlg, IDC_ARY, config->ar_y); EnableDlgWindow(hDlg, IDC_ARX, ar_mode); EnableDlgWindow(hDlg, IDC_ARY, ar_mode); break; case IDD_LEVEL : profile = SendDlgItemMessage(hDlg, IDC_LEVEL_PROFILE, CB_GETCURSEL, 0, 0); SetDlgItemInt(hDlg, IDC_LEVEL_WIDTH, profiles[profile].width, FALSE); SetDlgItemInt(hDlg, IDC_LEVEL_HEIGHT, profiles[profile].height, FALSE); SetDlgItemInt(hDlg, IDC_LEVEL_FPS, profiles[profile].fps, FALSE); SetDlgItemInt(hDlg, IDC_LEVEL_VMV, profiles[profile].max_vmv_buffer_sz, FALSE); SetDlgItemInt(hDlg, IDC_LEVEL_VCV, profiles[profile].vcv_decoder_rate, FALSE); SetDlgItemInt(hDlg, IDC_LEVEL_VBV, profiles[profile].max_vbv_size, FALSE); set_dlgitem_float1000(hDlg, IDC_LEVEL_BITRATE, profiles[profile].max_bitrate); SetDlgItemInt(hDlg, IDC_LEVEL_PEAKRATE, profiles[profile].vbv_peakrate, FALSE); { int en_dim = profiles[profile].width && profiles[profile].height; int en_vmv = profiles[profile].max_vmv_buffer_sz; int en_vcv = profiles[profile].vcv_decoder_rate; EnableDlgWindow(hDlg, IDC_LEVEL_LEVEL_G, en_dim || 
en_vmv || en_vcv); EnableDlgWindow(hDlg, IDC_LEVEL_DIM_S, en_dim); EnableDlgWindow(hDlg, IDC_LEVEL_WIDTH, en_dim); EnableDlgWindow(hDlg, IDC_LEVEL_HEIGHT,en_dim); EnableDlgWindow(hDlg, IDC_LEVEL_FPS, en_dim); EnableDlgWindow(hDlg, IDC_LEVEL_VMV_S, en_vmv); EnableDlgWindow(hDlg, IDC_LEVEL_VMV, en_vmv); EnableDlgWindow(hDlg, IDC_LEVEL_VCV_S, en_vcv); EnableDlgWindow(hDlg, IDC_LEVEL_VCV, en_vcv); } { int en_vbv = profiles[profile].max_vbv_size; int en_br = profiles[profile].max_bitrate; int en_pr = profiles[profile].vbv_peakrate; EnableDlgWindow(hDlg, IDC_LEVEL_VBV_G, en_vbv || en_br || en_pr); EnableDlgWindow(hDlg, IDC_LEVEL_VBV_S, en_vbv); EnableDlgWindow(hDlg, IDC_LEVEL_VBV, en_vbv); EnableDlgWindow(hDlg, IDC_LEVEL_BITRATE_S, en_br); EnableDlgWindow(hDlg, IDC_LEVEL_BITRATE, en_br); EnableDlgWindow(hDlg, IDC_LEVEL_PEAKRATE_S, en_pr); EnableDlgWindow(hDlg, IDC_LEVEL_PEAKRATE, en_pr); } switch(profile) { case 0: { HICON profile_icon = LoadImage(g_hInst, MAKEINTRESOURCE(IDI_MOBILE), IMAGE_ICON, 0, 0, 0); SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)profile_icon); } break; case 1: { HICON profile_icon = LoadImage(g_hInst, MAKEINTRESOURCE(IDI_HOME), IMAGE_ICON, 0, 0, 0); SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)profile_icon); } break; case 2: { HICON profile_icon = LoadImage(g_hInst, MAKEINTRESOURCE(IDI_HD720), IMAGE_ICON, 0, 0, 0); SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)profile_icon); } break; case 3: { HICON profile_icon = LoadImage(g_hInst, MAKEINTRESOURCE(IDI_HD1080), IMAGE_ICON, 0, 0, 0); SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)profile_icon); } break; default: SendDlgItemMessage(hDlg, IDC_PROFILE_LOGO, STM_SETIMAGE, IMAGE_ICON, (LPARAM)NULL); break; } if (profile < 4) ShowWindow(GetDlgItem(hDlg, IDC_PROFILE_LABEL), SW_HIDE); else ShowWindow(GetDlgItem(hDlg, IDC_PROFILE_LABEL), SW_SHOW); break; case IDD_BITRATE : { int ctype = 
SendDlgItemMessage(hDlg, IDC_BITRATE_CFORMAT, CB_GETCURSEL, 0, 0); int target_size = config_get_cbuint(hDlg, IDC_BITRATE_TSIZE, 0); int subtitle_size = config_get_uint(hDlg, IDC_BITRATE_SSIZE, 0); int fps = SendDlgItemMessage(hDlg, IDC_BITRATE_FPS, CB_GETCURSEL, 0, 0); int duration = 3600 * config_get_uint(hDlg, IDC_BITRATE_HOURS, 0) + 60 * config_get_uint(hDlg, IDC_BITRATE_MINUTES, 0) + config_get_uint(hDlg, IDC_BITRATE_SECONDS, 0); int audio_type = SendDlgItemMessage(hDlg, IDC_BITRATE_AFORMAT, CB_GETCURSEL, 0, 0); int audio_mode = IsDlgChecked(hDlg, IDC_BITRATE_AMODE_SIZE); int audio_rate = config_get_cbuint(hDlg, IDC_BITRATE_ARATE, 0); int audio_size = config_get_uint(hDlg, IDC_BITRATE_ASIZE, 0); int frames; int overhead; int vsize; if (duration == 0) break; if (fps < 0 || fps >= sizeof(video_fps_list)/sizeof(named_float_t)) { fps = 0; } if (audio_type < 0 || audio_type >= sizeof(audio_type_list)/sizeof(named_int_t)) { audio_type = 0; } EnableDlgWindow(hDlg, IDC_BITRATE_AMODE_RATE, audio_type!=NO_AUDIO); EnableDlgWindow(hDlg, IDC_BITRATE_AMODE_SIZE, audio_type!=NO_AUDIO); EnableDlgWindow(hDlg, IDC_BITRATE_ARATE, audio_type!=NO_AUDIO && !audio_mode); EnableDlgWindow(hDlg, IDC_BITRATE_ASIZE, audio_type!=NO_AUDIO && audio_mode); EnableDlgWindow(hDlg, IDC_BITRATE_ASELECT, audio_type!=NO_AUDIO && audio_mode); /* step 1: calculate number of frames */ frames = (int)(duration * video_fps_list[fps].value); /* step 2: calculate audio_size (kbytes)*/ if (audio_type!=NO_AUDIO) { if (audio_mode==0) { int new_audio_size = (int)( (1000.0 * duration * audio_rate) / (8.0*1024) ); /* this check is needed to avoid a loop */ if (new_audio_size!=audio_size) { audio_size = new_audio_size; SetDlgItemInt(hDlg, IDC_BITRATE_ASIZE, new_audio_size, TRUE); } }else{ int tmp_rate = (int)( (audio_size * 8.0 * 1024) / (1000.0 * duration) ); SetDlgItemInt(hDlg, IDC_BITRATE_ARATE, tmp_rate, TRUE); } }else{ audio_size = 0; } /* step 3: calculate container overhead */ switch(ctype) { case 0 : /* 
AVI */ case 1 : /* AVI-OpenDML */ overhead = frames; if (audio_type!=NO_AUDIO) { overhead += (duration * 1000) / audio_type_list[audio_type].avi_interval; } overhead *= (ctype==0) ? 24 : 16; overhead /= 1024; break; case 2 : /* Matroska: gknot formula */ /* common overhead */ overhead = 40 + 12 + 8+ 16*duration + 200 + 100*1/*one audio stream*/ + 11*duration; /* video overhead */ overhead += frames*8 + (int)(frames * 4 * 0.94); /* cue tables and menu seek entries (300k default) */ overhead += 300 * 1024; /* audio */ overhead += (int)(duration * audio_type_list[audio_type].mkv_multiplier); overhead /= 1024; break; case 3 : /* alexnoe formula */ overhead = (int)( (target_size - subtitle_size) * (28.0/4224.0 + (1.0/255.0)) ); break; default : /* (none) */ overhead = 0; break; } SetDlgItemInt(hDlg, IDC_BITRATE_COVERHEAD, overhead, TRUE); /* final video bitstream size */ vsize = target_size - subtitle_size - audio_size - overhead; if (vsize > 0) { SetDlgItemInt(hDlg, IDC_BITRATE_VSIZE, vsize, TRUE); /* convert from kbytes to kbits-per-second */ SetDlgItemInt(hDlg, IDC_BITRATE_VRATE, (int)(((__int64)vsize * 8 * 128) / (duration * 125)), TRUE); }else{ SetDlgItemText(hDlg, IDC_BITRATE_VSIZE, "Overflow"); SetDlgItemText(hDlg, IDC_BITRATE_VRATE, "Overflow"); } } break; case IDD_ZONE : weight_en = IsDlgChecked(hDlg, IDC_ZONE_MODE_WEIGHT); quant_en = IsDlgChecked(hDlg, IDC_ZONE_MODE_QUANT); EnableDlgWindow(hDlg, IDC_ZONE_WEIGHT, weight_en); EnableDlgWindow(hDlg, IDC_ZONE_QUANT, quant_en); EnableDlgWindow(hDlg, IDC_ZONE_SLIDER, weight_en|quant_en); if (weight_en) { SendDlgItemMessage(hDlg, IDC_ZONE_SLIDER, TBM_SETRANGE, TRUE, MAKELONG(001,200)); SendDlgItemMessage(hDlg, IDC_ZONE_SLIDER, TBM_SETPOS, TRUE, get_dlgitem_float(hDlg, IDC_ZONE_WEIGHT, 100)); SetDlgItemText(hDlg, IDC_ZONE_MIN, "0.01"); SetDlgItemText(hDlg, IDC_ZONE_MAX, "2.00"); }else if (quant_en) { SendDlgItemMessage(hDlg, IDC_ZONE_SLIDER, TBM_SETRANGE, TRUE, MAKELONG(100,3100)); SendDlgItemMessage(hDlg, 
IDC_ZONE_SLIDER, TBM_SETPOS, TRUE, get_dlgitem_float(hDlg, IDC_ZONE_QUANT, 100)); SetDlgItemText(hDlg, IDC_ZONE_MIN, "1"); SetDlgItemText(hDlg, IDC_ZONE_MAX, "31"); } bvops = (profiles[config->profile].flags&PROFILE_BVOP) && config->use_bvop; EnableDlgWindow(hDlg, IDC_ZONE_BVOPTHRESHOLD_S, bvops); EnableDlgWindow(hDlg, IDC_ZONE_BVOPTHRESHOLD, bvops); break; case IDD_COMMON : cpu_force = IsDlgChecked(hDlg, IDC_CPU_FORCE); EnableDlgWindow(hDlg, IDC_CPU_MMX, cpu_force); EnableDlgWindow(hDlg, IDC_CPU_MMXEXT, cpu_force); EnableDlgWindow(hDlg, IDC_CPU_SSE, cpu_force); EnableDlgWindow(hDlg, IDC_CPU_SSE2, cpu_force); EnableDlgWindow(hDlg, IDC_CPU_SSE3, cpu_force); EnableDlgWindow(hDlg, IDC_CPU_SSE4, cpu_force); EnableDlgWindow(hDlg, IDC_CPU_3DNOW, cpu_force); EnableDlgWindow(hDlg, IDC_CPU_3DNOWEXT, cpu_force); EnableDlgWindow(hDlg, IDC_NUMTHREADS, cpu_force); EnableDlgWindow(hDlg, IDC_NUMTHREADS_STATIC, cpu_force); break; case IDD_MOTION: { const int userdef = (config->quality==quality_table_num); if (userdef) { bvops = (profiles[config->profile].flags&PROFILE_BVOP) && config->use_bvop; qpel_checked = (profiles[config->profile].flags&PROFILE_QPEL) && config->qpel; mot_srch_prec = SendDlgItemMessage(hDlg, IDC_MOTION, CB_GETCURSEL, 0, 0); vhq_enabled = SendDlgItemMessage(hDlg, IDC_VHQ, CB_GETCURSEL, 0, 0); bvhq_enabled = IsDlgButtonChecked(hDlg, IDC_VHQ_BFRAME); EnableDlgWindow(hDlg, IDC_VHQ, mot_srch_prec); EnableDlgWindow(hDlg, IDC_VHQ_BFRAME, mot_srch_prec && bvops && vhq_enabled); EnableDlgWindow(hDlg, IDC_CHROMAME, mot_srch_prec); EnableDlgWindow(hDlg, IDC_TURBO, mot_srch_prec && (bvops || qpel_checked)); EnableDlgWindow(hDlg, IDC_VHQ_METRIC, mot_srch_prec && (vhq_enabled || bvhq_enabled)); EnableDlgWindow(hDlg, IDC_FRAMEDROP, mot_srch_prec); EnableDlgWindow(hDlg, IDC_MAXKEY, mot_srch_prec); } break; } } } /* upload config data into dialog */ static void adv_upload(HWND hDlg, int idd, CONFIG * config) { switch (idd) { case IDD_PROFILE : SendDlgItemMessage(hDlg, 
IDC_PROFILE_PROFILE, CB_SETCURSEL, config->profile, 0); SendDlgItemMessage(hDlg, IDC_QUANTTYPE, CB_SETCURSEL, config->quant_type, 0); SendDlgItemMessage(hDlg, IDC_LUMMASK, CB_SETCURSEL, config->lum_masking, 0); CheckDlg(hDlg, IDC_INTERLACING, config->interlacing); CheckDlg(hDlg, IDC_TFF, config->tff); CheckDlg(hDlg, IDC_QPEL, config->qpel); CheckDlg(hDlg, IDC_GMC, config->gmc); CheckDlg(hDlg, IDC_SLICES, (config->num_slices != 1)); CheckDlg(hDlg, IDC_BVOP, config->use_bvop); SetDlgItemInt(hDlg, IDC_MAXBFRAMES, config->max_bframes, FALSE); set_dlgitem_float(hDlg, IDC_BQUANTRATIO, config->bquant_ratio); set_dlgitem_float(hDlg, IDC_BQUANTOFFSET, config->bquant_offset); CheckDlg(hDlg, IDC_PACKED, config->packed); break; case IDD_AR: CheckRadioButton(hDlg, IDC_AR, IDC_PAR, config->ar_mode == 0 ? IDC_PAR : IDC_AR); SendDlgItemMessage(hDlg, IDC_ASPECT_RATIO, CB_SETCURSEL, (config->display_aspect_ratio), 0); SetDlgItemInt(hDlg, IDC_ARX, config->ar_x, FALSE); SetDlgItemInt(hDlg, IDC_ARY, config->ar_y, FALSE); break; case IDD_LEVEL : SendDlgItemMessage(hDlg, IDC_LEVEL_PROFILE, CB_SETCURSEL, config->profile, 0); break; case IDD_RC_CBR : SetDlgItemInt(hDlg, IDC_CBR_REACTIONDELAY, config->rc_reaction_delay_factor, FALSE); SetDlgItemInt(hDlg, IDC_CBR_AVERAGINGPERIOD, config->rc_averaging_period, FALSE); SetDlgItemInt(hDlg, IDC_CBR_BUFFER, config->rc_buffer, FALSE); break; case IDD_RC_2PASS1 : SetDlgItemText(hDlg, IDC_STATS, config->stats); CheckDlg(hDlg, IDC_DISCARD1PASS, config->discard1pass); CheckDlg(hDlg, IDC_FULL1PASS, config->full1pass); break; case IDD_RC_2PASS2 : SetDlgItemText(hDlg, IDC_STATS, config->stats); SetDlgItemInt(hDlg, IDC_KFBOOST, config->keyframe_boost, FALSE); SetDlgItemInt(hDlg, IDC_KFREDUCTION, config->kfreduction, FALSE); SetDlgItemInt(hDlg, IDC_OVERFLOW_CONTROL_STRENGTH, config->overflow_control_strength, FALSE); SetDlgItemInt(hDlg, IDC_OVERIMP, config->twopass_max_overflow_improvement, FALSE); SetDlgItemInt(hDlg, IDC_OVERDEG, 
config->twopass_max_overflow_degradation, FALSE); SetDlgItemInt(hDlg, IDC_CURVECOMPH, config->curve_compression_high, FALSE); SetDlgItemInt(hDlg, IDC_CURVECOMPL, config->curve_compression_low, FALSE); SetDlgItemInt(hDlg, IDC_MINKEY, config->kfthreshold, FALSE); break; case IDD_BITRATE : SendDlgItemMessage(hDlg, IDC_BITRATE_CFORMAT, CB_SETCURSEL, config->container_type, 0); SetDlgItemInt(hDlg, IDC_BITRATE_TSIZE, config->target_size, FALSE); SetDlgItemInt(hDlg, IDC_BITRATE_SSIZE, config->subtitle_size, FALSE); SetDlgItemInt(hDlg, IDC_BITRATE_HOURS, config->hours, FALSE); SetDlgItemInt(hDlg, IDC_BITRATE_MINUTES, config->minutes, FALSE); SetDlgItemInt(hDlg, IDC_BITRATE_SECONDS, config->seconds, FALSE); SendDlgItemMessage(hDlg, IDC_BITRATE_FPS, CB_SETCURSEL, config->fps, 0); SendDlgItemMessage(hDlg, IDC_BITRATE_AFORMAT, CB_SETCURSEL, config->audio_type, 0); CheckRadioButton(hDlg, IDC_BITRATE_AMODE_RATE, IDC_BITRATE_AMODE_SIZE, config->audio_mode == 0 ? IDC_BITRATE_AMODE_RATE : IDC_BITRATE_AMODE_SIZE); SetDlgItemInt(hDlg, IDC_BITRATE_ARATE, config->audio_rate, FALSE); SetDlgItemInt(hDlg, IDC_BITRATE_ASIZE, config->audio_size, FALSE); break; case IDD_ZONE : SetDlgItemInt(hDlg, IDC_ZONE_FRAME, config->zones[config->cur_zone].frame, FALSE); CheckDlgButton(hDlg, IDC_ZONE_MODE_WEIGHT, config->zones[config->cur_zone].mode == RC_ZONE_WEIGHT); CheckDlgButton(hDlg, IDC_ZONE_MODE_QUANT, config->zones[config->cur_zone].mode == RC_ZONE_QUANT); set_dlgitem_float(hDlg, IDC_ZONE_WEIGHT, config->zones[config->cur_zone].weight); set_dlgitem_float(hDlg, IDC_ZONE_QUANT, config->zones[config->cur_zone].quant); CheckDlgButton(hDlg, IDC_ZONE_FORCEIVOP, config->zones[config->cur_zone].type==XVID_TYPE_IVOP); CheckDlgButton(hDlg, IDC_ZONE_GREYSCALE, config->zones[config->cur_zone].greyscale); CheckDlgButton(hDlg, IDC_ZONE_CHROMAOPT, config->zones[config->cur_zone].chroma_opt); CheckDlg(hDlg, IDC_CARTOON, config->zones[config->cur_zone].cartoon_mode); SetDlgItemInt(hDlg, IDC_ZONE_BVOPTHRESHOLD, 
config->zones[config->cur_zone].bvop_threshold, TRUE); break; case IDD_MOTION : { const int userdef = (config->quality==quality_table_num); const quality_t* quality_preset = userdef ? &config->quality_user : &quality_table[config->quality]; int bvops = (profiles[config->profile].flags&PROFILE_BVOP) && config->use_bvop; int qpel_checked = (profiles[config->profile].flags&PROFILE_QPEL) && config->qpel; int bvops_qpel_motion = (bvops || qpel_checked) && quality_preset->motion_search; int vhq_or_bvhq = quality_preset->vhq_mode || quality_preset->vhq_bframe; SendDlgItemMessage(hDlg, IDC_MOTION, CB_SETCURSEL, quality_preset->motion_search, 0); SendDlgItemMessage(hDlg, IDC_VHQ, CB_SETCURSEL, quality_preset->vhq_mode, 0); SendDlgItemMessage(hDlg, IDC_VHQ_METRIC, CB_SETCURSEL, quality_preset->vhq_metric, 0); CheckDlg(hDlg, IDC_VHQ_BFRAME, quality_preset->vhq_bframe); CheckDlg(hDlg, IDC_CHROMAME, quality_preset->chromame); CheckDlg(hDlg, IDC_TURBO, quality_preset->turbo); SetDlgItemInt(hDlg, IDC_FRAMEDROP, quality_preset->frame_drop_ratio, FALSE); SetDlgItemInt(hDlg, IDC_MAXKEY, quality_preset->max_key_interval, FALSE); EnableDlgWindow(hDlg, IDC_MOTION, userdef); EnableDlgWindow(hDlg, IDC_VHQ, userdef && quality_preset->motion_search); EnableDlgWindow(hDlg, IDC_VHQ_METRIC, userdef && vhq_or_bvhq); EnableDlgWindow(hDlg, IDC_VHQ_BFRAME, userdef && bvops); EnableDlgWindow(hDlg, IDC_CHROMAME, userdef && quality_preset->motion_search); EnableDlgWindow(hDlg, IDC_TURBO, userdef && bvops_qpel_motion); EnableDlgWindow(hDlg, IDC_FRAMEDROP, userdef && quality_preset->motion_search); EnableDlgWindow(hDlg, IDC_MAXKEY, userdef && quality_preset->motion_search); break; } case IDD_QUANT : { const int userdef = (config->quality==quality_table_num); const quality_t* quality_preset = userdef ? 
&config->quality_user : &quality_table[config->quality]; SetDlgItemInt(hDlg, IDC_MINIQUANT, quality_preset->min_iquant, FALSE); SetDlgItemInt(hDlg, IDC_MAXIQUANT, quality_preset->max_iquant, FALSE); SetDlgItemInt(hDlg, IDC_MINPQUANT, quality_preset->min_pquant, FALSE); SetDlgItemInt(hDlg, IDC_MAXPQUANT, quality_preset->max_pquant, FALSE); SetDlgItemInt(hDlg, IDC_MINBQUANT, quality_preset->min_bquant, FALSE); SetDlgItemInt(hDlg, IDC_MAXBQUANT, quality_preset->max_bquant, FALSE); CheckDlg(hDlg, IDC_TRELLISQUANT, quality_preset->trellis_quant); EnableDlgWindow(hDlg, IDC_MINIQUANT, userdef); EnableDlgWindow(hDlg, IDC_MAXIQUANT, userdef); EnableDlgWindow(hDlg, IDC_MINPQUANT, userdef); EnableDlgWindow(hDlg, IDC_MAXPQUANT, userdef); EnableDlgWindow(hDlg, IDC_MINBQUANT, userdef); EnableDlgWindow(hDlg, IDC_MAXBQUANT, userdef); EnableDlgWindow(hDlg, IDC_TRELLISQUANT, userdef); break; } case IDD_COMMON : CheckDlg(hDlg, IDC_CPU_MMX, (config->cpu & XVID_CPU_MMX)); CheckDlg(hDlg, IDC_CPU_MMXEXT, (config->cpu & XVID_CPU_MMXEXT)); CheckDlg(hDlg, IDC_CPU_SSE, (config->cpu & XVID_CPU_SSE)); CheckDlg(hDlg, IDC_CPU_SSE2, (config->cpu & XVID_CPU_SSE2)); CheckDlg(hDlg, IDC_CPU_SSE3, (config->cpu & XVID_CPU_SSE3)); CheckDlg(hDlg, IDC_CPU_SSE4, (config->cpu & XVID_CPU_SSE41)); CheckDlg(hDlg, IDC_CPU_3DNOW, (config->cpu & XVID_CPU_3DNOW)); CheckDlg(hDlg, IDC_CPU_3DNOWEXT, (config->cpu & XVID_CPU_3DNOWEXT)); CheckRadioButton(hDlg, IDC_CPU_AUTO, IDC_CPU_FORCE, config->cpu & XVID_CPU_FORCE ? 
IDC_CPU_FORCE : IDC_CPU_AUTO ); set_dlgitem_hex(hDlg, IDC_DEBUG, config->debug); SetDlgItemInt(hDlg, IDC_NUMTHREADS, config->num_threads, FALSE); break; case IDD_ENC: if(profiles[config->profile].flags & PROFILE_XVID) SendDlgItemMessage(hDlg, IDC_FOURCC, CB_SETCURSEL, 0, 0); else SendDlgItemMessage(hDlg, IDC_FOURCC, CB_SETCURSEL, config->fourcc_used, 0); EnableDlgWindow(hDlg, IDC_FOURCC, (!(profiles[config->profile].flags & PROFILE_XVID))); CheckDlg(hDlg, IDC_VOPDEBUG, config->vop_debug); CheckDlg(hDlg, IDC_DISPLAY_STATUS, config->display_status); break; case IDD_DEC : SendDlgItemMessage(hDlg, IDC_DEC_BRIGHTNESS, TBM_SETPOS, (WPARAM)TRUE, (LPARAM)pp_brightness); CheckDlg(hDlg, IDC_DEC_DY, pp_dy); CheckDlg(hDlg, IDC_DEC_DUV, pp_duv); CheckDlg(hDlg, IDC_DEC_DRY, pp_dry); CheckDlg(hDlg, IDC_DEC_DRUV,pp_druv); CheckDlg(hDlg, IDC_DEC_FE, pp_fe); EnableDlgWindow(hDlg, IDC_DEC_DRY, pp_dy); EnableDlgWindow(hDlg, IDC_DEC_DRUV, pp_duv); break; } } /* download config data from dialog */ static void adv_download(HWND hDlg, int idd, CONFIG * config) { switch (idd) { case IDD_PROFILE : config->profile = SendDlgItemMessage(hDlg, IDC_PROFILE_PROFILE, CB_GETCURSEL, 0, 0); config->quant_type = SendDlgItemMessage(hDlg, IDC_QUANTTYPE, CB_GETCURSEL, 0, 0); config->lum_masking = SendDlgItemMessage(hDlg, IDC_LUMMASK, CB_GETCURSEL, 0, 0); config->interlacing = IsDlgChecked(hDlg, IDC_INTERLACING); config->tff = IsDlgChecked(hDlg, IDC_TFF); config->qpel = IsDlgChecked(hDlg, IDC_QPEL); config->gmc = IsDlgChecked(hDlg, IDC_GMC); config->num_slices = (IsDlgChecked(hDlg, IDC_SLICES) ? ((config->num_slices < 2) ? 
0 : config->num_slices) : 1); config->use_bvop = IsDlgChecked(hDlg, IDC_BVOP); config->max_bframes = config_get_uint(hDlg, IDC_MAXBFRAMES, config->max_bframes); config->bquant_ratio = get_dlgitem_float(hDlg, IDC_BQUANTRATIO, config->bquant_ratio); config->bquant_offset = get_dlgitem_float(hDlg, IDC_BQUANTOFFSET, config->bquant_offset); config->packed = IsDlgChecked(hDlg, IDC_PACKED); break; case IDD_AR: config->ar_mode = IsDlgChecked(hDlg, IDC_PAR) ? 0:1; config->ar_x = config_get_uint(hDlg, IDC_ARX, config->ar_x); config->ar_y = config_get_uint(hDlg, IDC_ARY, config->ar_y); config->display_aspect_ratio = SendDlgItemMessage(hDlg, IDC_ASPECT_RATIO, CB_GETCURSEL, 0, 0); if (config->display_aspect_ratio == 5) { config->par_x = config_get_uint(hDlg, IDC_PARX, config->par_x); config->par_y = config_get_uint(hDlg, IDC_PARY, config->par_y); } break; case IDD_LEVEL : config->profile = SendDlgItemMessage(hDlg, IDC_LEVEL_PROFILE, CB_GETCURSEL, 0, 0); break; case IDD_RC_CBR : config->rc_reaction_delay_factor = config_get_uint(hDlg, IDC_CBR_REACTIONDELAY, config->rc_reaction_delay_factor); config->rc_averaging_period = config_get_uint(hDlg, IDC_CBR_AVERAGINGPERIOD, config->rc_averaging_period); config->rc_buffer = config_get_uint(hDlg, IDC_CBR_BUFFER, config->rc_buffer); break; case IDD_RC_2PASS1 : if (GetDlgItemText(hDlg, IDC_STATS, config->stats, MAX_PATH) == 0) lstrcpy(config->stats, CONFIG_2PASS_FILE); config->discard1pass = IsDlgChecked(hDlg, IDC_DISCARD1PASS); config->full1pass = IsDlgChecked(hDlg, IDC_FULL1PASS); break; case IDD_RC_2PASS2 : if (GetDlgItemText(hDlg, IDC_STATS, config->stats, MAX_PATH) == 0) lstrcpy(config->stats, CONFIG_2PASS_FILE); config->keyframe_boost = GetDlgItemInt(hDlg, IDC_KFBOOST, NULL, FALSE); config->kfreduction = GetDlgItemInt(hDlg, IDC_KFREDUCTION, NULL, FALSE); CONSTRAINVAL(config->keyframe_boost, 0, 1000); config->overflow_control_strength = GetDlgItemInt(hDlg, IDC_OVERFLOW_CONTROL_STRENGTH, NULL, FALSE); 
config->twopass_max_overflow_improvement = config_get_uint(hDlg, IDC_OVERIMP, config->twopass_max_overflow_improvement); config->twopass_max_overflow_degradation = config_get_uint(hDlg, IDC_OVERDEG, config->twopass_max_overflow_degradation); CONSTRAINVAL(config->twopass_max_overflow_improvement, 1, 80); CONSTRAINVAL(config->twopass_max_overflow_degradation, 1, 80); CONSTRAINVAL(config->overflow_control_strength, 0, 100); config->curve_compression_high = GetDlgItemInt(hDlg, IDC_CURVECOMPH, NULL, FALSE); config->curve_compression_low = GetDlgItemInt(hDlg, IDC_CURVECOMPL, NULL, FALSE); CONSTRAINVAL(config->curve_compression_high, 0, 100); CONSTRAINVAL(config->curve_compression_low, 0, 100); config->kfthreshold = config_get_uint(hDlg, IDC_MINKEY, config->kfthreshold); break; case IDD_BITRATE : config->container_type = SendDlgItemMessage(hDlg, IDC_BITRATE_CFORMAT, CB_GETCURSEL, 0, 0); config->target_size = config_get_uint(hDlg, IDC_BITRATE_TSIZE, config->target_size); config->subtitle_size = config_get_uint(hDlg, IDC_BITRATE_SSIZE, config->subtitle_size); config->hours = config_get_uint(hDlg, IDC_BITRATE_HOURS, config->hours); config->minutes = config_get_uint(hDlg, IDC_BITRATE_MINUTES, config->minutes); config->seconds = config_get_uint(hDlg, IDC_BITRATE_SECONDS, config->seconds); config->fps = SendDlgItemMessage(hDlg, IDC_BITRATE_FPS, CB_GETCURSEL, 0, 0); config->audio_type = SendDlgItemMessage(hDlg, IDC_BITRATE_AFORMAT, CB_GETCURSEL, 0, 0); config->audio_mode = IsDlgChecked(hDlg, IDC_BITRATE_AMODE_SIZE) ? 
1 : 0 ; config->audio_rate = config_get_uint(hDlg, IDC_BITRATE_ARATE, config->audio_rate); config->audio_size = config_get_uint(hDlg, IDC_BITRATE_ASIZE, config->audio_size); /* the main window uses "AVI bitrate/filesize" not "video bitrate/filesize", so we have to compensate by frames * 24 bytes */ { int frame_compensate = 24 * (int)( (3600*config->hours + 60*config->minutes + config->seconds) * video_fps_list[config->fps].value) / 1024; int bitrate_compensate = (int)(24 * video_fps_list[config->fps].value) / 125; config->desired_size = config_get_uint(hDlg, IDC_BITRATE_VSIZE, config->desired_size) + frame_compensate; config->bitrate = config_get_uint(hDlg, IDC_BITRATE_VRATE, config->bitrate) + bitrate_compensate; } break; case IDD_ZONE : config->zones[config->cur_zone].frame = config_get_uint(hDlg, IDC_ZONE_FRAME, config->zones[config->cur_zone].frame); if (IsDlgChecked(hDlg, IDC_ZONE_MODE_WEIGHT)) { config->zones[config->cur_zone].mode = RC_ZONE_WEIGHT; }else if (IsDlgChecked(hDlg, IDC_ZONE_MODE_QUANT)) { config->zones[config->cur_zone].mode = RC_ZONE_QUANT; } config->zones[config->cur_zone].weight = get_dlgitem_float(hDlg, IDC_ZONE_WEIGHT, config->zones[config->cur_zone].weight); config->zones[config->cur_zone].quant = get_dlgitem_float(hDlg, IDC_ZONE_QUANT, config->zones[config->cur_zone].quant); config->zones[config->cur_zone].type = IsDlgButtonChecked(hDlg, IDC_ZONE_FORCEIVOP)?XVID_TYPE_IVOP:XVID_TYPE_AUTO; config->zones[config->cur_zone].greyscale = IsDlgButtonChecked(hDlg, IDC_ZONE_GREYSCALE); config->zones[config->cur_zone].chroma_opt = IsDlgButtonChecked(hDlg, IDC_ZONE_CHROMAOPT); config->zones[config->cur_zone].bvop_threshold = config_get_int(hDlg, IDC_ZONE_BVOPTHRESHOLD, config->zones[config->cur_zone].bvop_threshold); config->zones[config->cur_zone].cartoon_mode = IsDlgChecked(hDlg, IDC_CARTOON); break; case IDD_MOTION : if (config->quality==quality_table_num) { config->quality_user.motion_search = SendDlgItemMessage(hDlg, IDC_MOTION, CB_GETCURSEL, 0, 
0); config->quality_user.vhq_mode = SendDlgItemMessage(hDlg, IDC_VHQ, CB_GETCURSEL, 0, 0); config->quality_user.vhq_metric = SendDlgItemMessage(hDlg, IDC_VHQ_METRIC, CB_GETCURSEL, 0, 0); config->quality_user.vhq_bframe = IsDlgButtonChecked(hDlg, IDC_VHQ_BFRAME); config->quality_user.chromame = IsDlgChecked(hDlg, IDC_CHROMAME); config->quality_user.turbo = IsDlgChecked(hDlg, IDC_TURBO); config->quality_user.frame_drop_ratio = config_get_uint(hDlg, IDC_FRAMEDROP, config->quality_user.frame_drop_ratio); config->quality_user.max_key_interval = config_get_uint(hDlg, IDC_MAXKEY, config->quality_user.max_key_interval); } break; case IDD_QUANT : if (config->quality==quality_table_num) { config->quality_user.min_iquant = config_get_uint(hDlg, IDC_MINIQUANT, config->quality_user.min_iquant); config->quality_user.max_iquant = config_get_uint(hDlg, IDC_MAXIQUANT, config->quality_user.max_iquant); config->quality_user.min_pquant = config_get_uint(hDlg, IDC_MINPQUANT, config->quality_user.min_pquant); config->quality_user.max_pquant = config_get_uint(hDlg, IDC_MAXPQUANT, config->quality_user.max_pquant); config->quality_user.min_bquant = config_get_uint(hDlg, IDC_MINBQUANT, config->quality_user.min_bquant); config->quality_user.max_bquant = config_get_uint(hDlg, IDC_MAXBQUANT, config->quality_user.max_bquant); CONSTRAINVAL(config->quality_user.min_iquant, 1, 31); CONSTRAINVAL(config->quality_user.max_iquant, config->quality_user.min_iquant, 31); CONSTRAINVAL(config->quality_user.min_pquant, 1, 31); CONSTRAINVAL(config->quality_user.max_pquant, config->quality_user.min_pquant, 31); CONSTRAINVAL(config->quality_user.min_bquant, 1, 31); CONSTRAINVAL(config->quality_user.max_bquant, config->quality_user.min_bquant, 31); config->quality_user.trellis_quant = IsDlgChecked(hDlg, IDC_TRELLISQUANT); } break; case IDD_COMMON : config->cpu = 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_MMX) ? XVID_CPU_MMX : 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_MMXEXT) ? 
XVID_CPU_MMXEXT : 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_SSE) ? XVID_CPU_SSE : 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_SSE2) ? XVID_CPU_SSE2 : 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_SSE3) ? XVID_CPU_SSE3 : 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_SSE4) ? XVID_CPU_SSE41 : 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_3DNOW) ? XVID_CPU_3DNOW : 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_3DNOWEXT) ? XVID_CPU_3DNOWEXT : 0; config->cpu |= IsDlgChecked(hDlg, IDC_CPU_FORCE) ? XVID_CPU_FORCE : 0; config->debug = get_dlgitem_hex(hDlg, IDC_DEBUG, config->debug); config->num_threads = min(16, config_get_uint(hDlg, IDC_NUMTHREADS, config->num_threads)); break; case IDD_ENC : if(!(profiles[config->profile].flags & PROFILE_XVID)) config->fourcc_used = SendDlgItemMessage(hDlg, IDC_FOURCC, CB_GETCURSEL, 0, 0); config->vop_debug = IsDlgChecked(hDlg, IDC_VOPDEBUG); config->display_status = IsDlgChecked(hDlg, IDC_DISPLAY_STATUS); break; case IDD_DEC : pp_brightness = SendDlgItemMessage(hDlg, IDC_DEC_BRIGHTNESS, TBM_GETPOS, (WPARAM)NULL, (LPARAM)NULL); pp_dy = IsDlgChecked(hDlg, IDC_DEC_DY); pp_duv = IsDlgChecked(hDlg, IDC_DEC_DUV); pp_dry = IsDlgChecked(hDlg, IDC_DEC_DRY); pp_druv = IsDlgChecked(hDlg, IDC_DEC_DRUV); pp_fe = IsDlgChecked(hDlg, IDC_DEC_FE); break; } } /* advanced dialog proc */ static INT_PTR CALLBACK adv_proc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam) { PROPSHEETINFO *psi; psi = (PROPSHEETINFO*)GetWindowLongPtr(hDlg, GWLP_USERDATA); switch (uMsg) { case WM_INITDIALOG : psi = (PROPSHEETINFO*) ((LPPROPSHEETPAGE)lParam)->lParam; SetWindowLongPtr(hDlg, GWLP_USERDATA, (LPARAM)psi); if (g_hTooltip) EnumChildWindows(hDlg, enum_tooltips, 0); adv_init(hDlg, psi->idd, psi->config); break; case WM_COMMAND : if (HIWORD(wParam) == BN_CLICKED) { switch (LOWORD(wParam)) { case IDC_INTERLACING : case IDC_VHQ_BFRAME : case IDC_BVOP : case IDC_ZONE_MODE_WEIGHT : case IDC_ZONE_MODE_QUANT : case IDC_ZONE_BVOPTHRESHOLD_ENABLE : case IDC_CPU_AUTO : case 
IDC_CPU_FORCE : case IDC_AR : case IDC_PAR : case IDC_BITRATE_AMODE_RATE : case IDC_BITRATE_AMODE_SIZE : adv_mode(hDlg, psi->idd, psi->config); break; case IDC_BITRATE_SSELECT : case IDC_BITRATE_ASELECT : { OPENFILENAME ofn; char filename[MAX_PATH] = ""; memset(&ofn, 0, sizeof(OPENFILENAME)); ofn.lStructSize = sizeof(OPENFILENAME); ofn.hwndOwner = hDlg; if (LOWORD(wParam)==IDC_BITRATE_SSELECT) { ofn.lpstrFilter = "Subtitle files (*.sub, *.ssa, *.txt, *.dat)\0*.sub;*.ssa;*.txt;*.dat\0All files (*.*)\0*.*\0\0"; }else{ ofn.lpstrFilter = "Audio files (*.mp3, *.ac3, *.aac, *.ogg, *.wav)\0*.mp3; *.ac3; *.aac; *.ogg; *.wav\0All files (*.*)\0*.*\0\0"; } ofn.lpstrFile = filename; ofn.nMaxFile = MAX_PATH; ofn.Flags = OFN_PATHMUSTEXIST | OFN_FILEMUSTEXIST; if (GetOpenFileName(&ofn)) { HANDLE hFile; DWORD filesize; if ((hFile = CreateFile(filename, GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0)) == INVALID_HANDLE_VALUE || (filesize = GetFileSize(hFile, NULL)) == INVALID_FILE_SIZE) { MessageBox(hDlg, "Could not get file size", "Error", 0); }else{ SetDlgItemInt(hDlg, LOWORD(wParam)==IDC_BITRATE_SSELECT? 
IDC_BITRATE_SSIZE : IDC_BITRATE_ASIZE, filesize / 1024, FALSE); CloseHandle(hFile); } } } break; case IDC_QUANTMATRIX : DialogBoxParam(g_hInst, MAKEINTRESOURCE(IDD_QUANTMATRIX), hDlg, quantmatrix_proc, (LPARAM)psi->config); break; case IDC_STATS_BROWSE : { OPENFILENAME ofn; char tmp[MAX_PATH]; GetDlgItemText(hDlg, IDC_STATS, tmp, MAX_PATH); memset(&ofn, 0, sizeof(OPENFILENAME)); ofn.lStructSize = sizeof(OPENFILENAME); ofn.hwndOwner = hDlg; ofn.lpstrFilter = "bitrate curve (*.pass)\0*.pass\0All files (*.*)\0*.*\0\0"; ofn.lpstrFile = tmp; ofn.nMaxFile = MAX_PATH; ofn.Flags = OFN_PATHMUSTEXIST; if (psi->idd == IDD_RC_2PASS1) { ofn.Flags |= OFN_OVERWRITEPROMPT; }else{ ofn.Flags |= OFN_FILEMUSTEXIST; } if ((psi->idd==IDD_RC_2PASS1 && GetSaveFileName(&ofn)) || (psi->idd==IDD_RC_2PASS2 && GetOpenFileName(&ofn))) { SetDlgItemText(hDlg, IDC_STATS, tmp); } } break; case IDC_ZONE_FETCH : SetDlgItemInt(hDlg, IDC_ZONE_FRAME, psi->config->ci.ciActiveFrame, FALSE); break; case IDC_AR_DEFAULT: CheckRadioButton(hDlg, IDC_AR, IDC_PAR, IDC_PAR); SendDlgItemMessage(hDlg, IDC_ASPECT_RATIO, CB_SETCURSEL, 0, 0); adv_mode(hDlg, psi->idd, psi->config); break; case IDC_AR_4_3: SetDlgItemInt(hDlg, IDC_ARX, 4, FALSE); SetDlgItemInt(hDlg, IDC_ARY, 3, FALSE); CheckRadioButton(hDlg, IDC_AR, IDC_PAR, IDC_AR); adv_mode(hDlg, psi->idd, psi->config); break; case IDC_AR_16_9: SetDlgItemInt(hDlg, IDC_ARX, 16, FALSE); SetDlgItemInt(hDlg, IDC_ARY, 9, FALSE); CheckRadioButton(hDlg, IDC_AR, IDC_PAR, IDC_AR); adv_mode(hDlg, psi->idd, psi->config); break; case IDC_AR_235_100: SetDlgItemInt(hDlg, IDC_ARX, 235, FALSE); SetDlgItemInt(hDlg, IDC_ARY, 100, FALSE); CheckRadioButton(hDlg, IDC_AR, IDC_PAR, IDC_AR); adv_mode(hDlg, psi->idd, psi->config); break; case IDC_DEC_DY: case IDC_DEC_DUV: EnableDlgWindow(hDlg, IDC_DEC_DRY, IsDlgChecked(hDlg, IDC_DEC_DY)); EnableDlgWindow(hDlg, IDC_DEC_DRUV, IsDlgChecked(hDlg, IDC_DEC_DUV)); break; default : return TRUE; } }else if ((HIWORD(wParam) == CBN_EDITCHANGE || 
HIWORD(wParam)==CBN_SELCHANGE) && (LOWORD(wParam)==IDC_BITRATE_TSIZE || LOWORD(wParam)==IDC_BITRATE_ARATE )) { adv_mode(hDlg, psi->idd, psi->config); }else if (HIWORD(wParam) == LBN_SELCHANGE && (LOWORD(wParam) == IDC_PROFILE_PROFILE || LOWORD(wParam) == IDC_LEVEL_PROFILE || LOWORD(wParam) == IDC_QUANTTYPE || LOWORD(wParam) == IDC_ASPECT_RATIO || LOWORD(wParam) == IDC_MOTION || LOWORD(wParam) == IDC_VHQ || LOWORD(wParam) == IDC_BITRATE_CFORMAT || LOWORD(wParam) == IDC_BITRATE_AFORMAT || LOWORD(wParam) == IDC_BITRATE_FPS)) { adv_mode(hDlg, psi->idd, psi->config); }else if (HIWORD(wParam) == EN_UPDATE && (LOWORD(wParam)==IDC_ZONE_WEIGHT || LOWORD(wParam)==IDC_ZONE_QUANT)) { SendDlgItemMessage(hDlg, IDC_ZONE_SLIDER, TBM_SETPOS, TRUE, get_dlgitem_float(hDlg, LOWORD(wParam), 100)); } else if (HIWORD(wParam) == EN_UPDATE && (LOWORD(wParam)==IDC_PARX || LOWORD(wParam)==IDC_PARY)) { if (5 == SendDlgItemMessage(hDlg, IDC_ASPECT_RATIO, CB_GETCURSEL, 0, 0)) { if(LOWORD(wParam)==IDC_PARX) psi->config->par_x = config_get_uint(hDlg, LOWORD(wParam), psi->config->par_x); else psi->config->par_y = config_get_uint(hDlg, LOWORD(wParam), psi->config->par_y); } } else if (HIWORD(wParam) == EN_UPDATE && (LOWORD(wParam)==IDC_BITRATE_SSIZE || LOWORD(wParam)==IDC_BITRATE_HOURS || LOWORD(wParam)==IDC_BITRATE_MINUTES || LOWORD(wParam)==IDC_BITRATE_SECONDS || LOWORD(wParam)==IDC_BITRATE_ASIZE)) { adv_mode(hDlg, psi->idd, psi->config); } else return 0; break; case WM_HSCROLL : if((HWND)lParam == GetDlgItem(hDlg, IDC_ZONE_SLIDER)) { int idc = IsDlgChecked(hDlg, IDC_ZONE_MODE_WEIGHT) ? 
				IDC_ZONE_WEIGHT : IDC_ZONE_QUANT;
			set_dlgitem_float(hDlg, idc, SendMessage((HWND)lParam, TBM_GETPOS, 0, 0) );
			break;
		}
		return 0;

	case WM_NOTIFY :
		switch (((NMHDR *)lParam)->code) {
		case PSN_SETACTIVE :
			DPRINTF("PSN_SET");
			adv_upload(hDlg, psi->idd, psi->config);
			adv_mode(hDlg, psi->idd, psi->config);
			SetWindowLongPtr(hDlg, DWLP_MSGRESULT, FALSE);
			break;

		case PSN_KILLACTIVE :
			DPRINTF("PSN_KILL");
			adv_download(hDlg, psi->idd, psi->config);
			SetWindowLongPtr(hDlg, DWLP_MSGRESULT, FALSE);
			break;

		case PSN_APPLY :
			DPRINTF("PSN_APPLY");
			psi->config->save = TRUE;
			SetWindowLongPtr(hDlg, DWLP_MSGRESULT, FALSE);
			break;
		}
		break;

	default :
		return 0;
	}

	return 1;
}


/* load advanced options property sheet
   returns true if the user accepted the changes,
   or false if the changes were canceled. */

#ifndef PSH_NOCONTEXTHELP
#define PSH_NOCONTEXTHELP 0x02000000
#endif

static BOOL adv_dialog(HWND hParent, CONFIG * config, const int * dlgs, int size)
{
	PROPSHEETINFO psi[6];
	PROPSHEETPAGE psp[6];
	PROPSHEETHEADER psh;
	CONFIG temp;
	int i;

	config->save = FALSE;
	memcpy(&temp, config, sizeof(CONFIG));

	for (i=0; iframe);

	if (insert) {
		LVITEM lvi;

		lvi.mask = LVIF_TEXT | LVIF_IMAGE | LVIF_PARAM | LVIF_STATE;
		lvi.state = 0;
		lvi.stateMask = 0;
		lvi.iImage = 0;
		lvi.pszText = tmp;
		lvi.cchTextMax = strlen(tmp);
		lvi.iItem = i;
		lvi.iSubItem = 0;
		ListView_InsertItem(hDlg, &lvi);
	}else{
		ListView_SetItemText(hDlg, i, 0, tmp);
	}

	if (s->mode == RC_ZONE_WEIGHT) {
		sprintf(tmp,"W %.2f",(float)s->weight/100);
	}else if (s->mode == RC_ZONE_QUANT) {
		sprintf(tmp,"Q %.2f",(float)s->quant/100);
	}else {
		strcpy(tmp,"EXT");
	}
	ListView_SetItemText(hDlg, i, 1, tmp);

	tmp[0] = '\0';
	if (s->type==XVID_TYPE_IVOP)
		strcat(tmp, "K ");

	if (s->greyscale)
		strcat(tmp, "G ");

	if (s->chroma_opt)
		strcat(tmp, "O ");

	if (s->cartoon_mode)
		strcat(tmp, "C ");

	ListView_SetItemText(hDlg, i, 2, tmp);
}


static void main_mode(HWND hDlg, CONFIG * config)
{
	const int profile = SendDlgItemMessage(hDlg, IDC_PROFILE, CB_GETCURSEL, 0, 0);
	const int rc_mode =
SendDlgItemMessage(hDlg, IDC_MODE, CB_GETCURSEL, 0, 0); /* enable target rate/size control only for 1pass and 2pass modes*/ const int target_en = rc_mode==RC_MODE_1PASS || rc_mode==RC_MODE_2PASS2; const int target_en_slider = rc_mode==RC_MODE_1PASS || (rc_mode==RC_MODE_2PASS2 && config->use_2pass_bitrate); char buf[16]; int max; g_use_bitrate = config->use_2pass_bitrate; if (g_use_bitrate) { SetDlgItemText(hDlg, IDC_BITRATE_S, "Target bitrate (kbps):"); wsprintf(buf, "%i kbps", DEFAULT_MIN_KBPS); SetDlgItemText(hDlg, IDC_BITRATE_MIN, buf); max = profiles[profile].max_bitrate / 1000; if (max == 0) max = DEFAULT_MAX_KBPS; wsprintf(buf, "%i kbps", max); SetDlgItemText(hDlg, IDC_BITRATE_MAX, buf); SendDlgItemMessage(hDlg, IDC_SLIDER, TBM_SETRANGE, TRUE, MAKELONG(DEFAULT_MIN_KBPS, max)); SendDlgItemMessage(hDlg, IDC_SLIDER, TBM_SETPOS, TRUE, config_get_uint(hDlg, IDC_BITRATE, DEFAULT_MIN_KBPS) ); } else if (rc_mode==RC_MODE_2PASS2) { SetDlgItemText(hDlg, IDC_BITRATE_S, "Target size (kbytes):"); } else if (rc_mode==RC_MODE_1PASS) { SetDlgItemText(hDlg, IDC_BITRATE_S, "Target quantizer:"); SendDlgItemMessage(hDlg, IDC_SLIDER, TBM_SETRANGE, TRUE, MAKELONG(100, 3100)); SendDlgItemMessage(hDlg, IDC_SLIDER, TBM_SETPOS, TRUE, get_dlgitem_float(hDlg, IDC_BITRATE, DEFAULT_QUANT )); SetDlgItemText(hDlg, IDC_BITRATE_MIN, "1 (maximum quality)"); SetDlgItemText(hDlg, IDC_BITRATE_MAX, "(smallest file) 31"); } EnableDlgWindow(hDlg, IDC_BITRATE_S, target_en); EnableDlgWindow(hDlg, IDC_BITRATE, target_en); EnableDlgWindow(hDlg, IDC_BITRATE_ADV, target_en); EnableDlgWindow(hDlg, IDC_BITRATE_MIN, target_en_slider); EnableDlgWindow(hDlg, IDC_BITRATE_MAX, target_en_slider); EnableDlgWindow(hDlg, IDC_SLIDER, target_en_slider); } static void main_upload(HWND hDlg, CONFIG * config) { SendDlgItemMessage(hDlg, IDC_PROFILE, CB_SETCURSEL, config->profile, 0); SendDlgItemMessage(hDlg, IDC_MODE, CB_SETCURSEL, config->mode, 0); SendDlgItemMessage(hDlg, IDC_QUALITY, CB_SETCURSEL, config->quality, 0); 
g_use_bitrate = config->use_2pass_bitrate; if (g_use_bitrate) { SetDlgItemInt(hDlg, IDC_BITRATE, config->bitrate, FALSE); } else if (config->mode == RC_MODE_2PASS2) { SetDlgItemInt(hDlg, IDC_BITRATE, config->desired_size, FALSE); } else if (config->mode == RC_MODE_1PASS) { set_dlgitem_float(hDlg, IDC_BITRATE, config->desired_quant); } zones_update(hDlg, config); } /* downloads data from main dialog */ static void main_download(HWND hDlg, CONFIG * config) { config->profile = SendDlgItemMessage(hDlg, IDC_PROFILE, CB_GETCURSEL, 0, 0); config->mode = SendDlgItemMessage(hDlg, IDC_MODE, CB_GETCURSEL, 0, 0); config->quality = SendDlgItemMessage(hDlg, IDC_QUALITY, CB_GETCURSEL, 0, 0); if (g_use_bitrate) { config->bitrate = config_get_uint(hDlg, IDC_BITRATE, config->bitrate); } else if (config->mode == RC_MODE_2PASS2) { config->desired_size = config_get_uint(hDlg, IDC_BITRATE, config->desired_size); } else if (config->mode == RC_MODE_1PASS) { config->desired_quant = get_dlgitem_float(hDlg, IDC_BITRATE, config->desired_quant); } } /* main dialog proc */ static const int profile_dlgs[] = { IDD_PROFILE, IDD_LEVEL, IDD_AR }; static const int single_dlgs[] = { IDD_RC_CBR }; static const int pass1_dlgs[] = { IDD_RC_2PASS1 }; static const int pass2_dlgs[] = { IDD_RC_2PASS2 }; static const int bitrate_dlgs[] = { IDD_BITRATE }; static const int zone_dlgs[] = { IDD_ZONE }; static const int quality_dlgs[] = { IDD_MOTION, IDD_QUANT }; static const int other_dlgs[] = { IDD_ENC, IDD_DEC, IDD_COMMON }; INT_PTR CALLBACK main_proc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam) { CONFIG* config = (CONFIG*)GetWindowLongPtr(hDlg, GWLP_USERDATA); unsigned int i; switch (uMsg) { case WM_INITDIALOG : SetWindowLongPtr(hDlg, GWLP_USERDATA, lParam); config = (CONFIG*)lParam; for (i=0; i= 0x0300) SendMessage(g_hTooltip, TTM_SETMAXTIPWIDTH, 0, 400); #endif EnumChildWindows(hDlg, enum_tooltips, 0); } SetClassLongPtr(GetDlgItem(hDlg, IDC_BITRATE_S), GCLP_HCURSOR, (LONG_PTR)LoadCursor(NULL, 
			IDC_HAND));

		{
			DWORD ext_style = ListView_GetExtendedListViewStyle(GetDlgItem(hDlg,IDC_ZONES));
#if (_WIN32_IE >= 0x0300)
			ext_style |= LVS_EX_FULLROWSELECT;
#endif
#if( _WIN32_IE >= 0x0400 )
			ext_style |= LVS_EX_FLATSB ;
#endif
			ListView_SetExtendedListViewStyle(GetDlgItem(hDlg,IDC_ZONES), ext_style);
		}

		{
			typedef struct {
				char * name;
				int value;
			} char_int_t;

			static const char_int_t columns[] = {
				{"Frame #", 64},
				{"Weight/Quant", 82},
				{"Modifiers", 120}};

			LVCOLUMN lvc;
			int i;

			/* Initialize the LVCOLUMN structure. */
			lvc.mask = LVCF_FMT | LVCF_WIDTH | LVCF_TEXT | LVCF_SUBITEM;
			lvc.fmt = LVCFMT_LEFT;

			/* Add the columns. */
			for (i=0; icode == NM_DBLCLK) {
				NMLISTVIEW * nmlv = (NMLISTVIEW*) lParam;

				config->cur_zone = nmlv->iItem;

				main_download(hDlg, config);
				if (config->cur_zone >= 0 && adv_dialog(hDlg, config, zone_dlgs, sizeof(zone_dlgs)/sizeof(int))) {
					zones_update(hDlg, config);
				}
				break;
			}

			break;
		}

	case WM_COMMAND :
		if (HIWORD(wParam) == BN_CLICKED) {

			switch(LOWORD(wParam)) {
			case IDC_PROFILE_ADV :
				main_download(hDlg, config);
				adv_dialog(hDlg, config, profile_dlgs, sizeof(profile_dlgs)/sizeof(int));

				SendDlgItemMessage(hDlg, IDC_PROFILE, CB_SETCURSEL, config->profile, 0);
				main_mode(hDlg, config);
				break;

			case IDC_MODE_ADV :
				main_download(hDlg, config);
				if (config->mode == RC_MODE_1PASS) {
					adv_dialog(hDlg, config, single_dlgs, sizeof(single_dlgs)/sizeof(int));
				}else if (config->mode == RC_MODE_2PASS1) {
					adv_dialog(hDlg, config, pass1_dlgs, sizeof(pass1_dlgs)/sizeof(int));
				}else if (config->mode == RC_MODE_2PASS2) {
					adv_dialog(hDlg, config, pass2_dlgs, sizeof(pass2_dlgs)/sizeof(int));
				}
				break;

			case IDC_BITRATE_S :
				/* alternate between bitrate/desired_length metrics */
				main_download(hDlg, config);
				config->use_2pass_bitrate = !config->use_2pass_bitrate;
				main_mode(hDlg, config);
				main_upload(hDlg, config);
				break;

			case IDC_BITRATE_ADV :
				main_download(hDlg, config);
				adv_dialog(hDlg, config, bitrate_dlgs, sizeof(bitrate_dlgs)/sizeof(int));
				main_mode(hDlg, config);
				main_upload(hDlg, config);
				break;

			case IDC_OTHER :
				main_download(hDlg, config);
				adv_dialog(hDlg, config, other_dlgs, sizeof(other_dlgs)/sizeof(int));
				main_mode(hDlg, config);
				break;

			case IDC_ADD :
			{
				int i, sel, new_frame;

				if (config->num_zones >= MAX_ZONES) {
					MessageBox(hDlg, "Exceeded maximum number of zones.\nIncrease config.h:MAX_ZONES and rebuild.", "Warning", 0);
					break;
				}

				sel = ListView_GetNextItem(GetDlgItem(hDlg, IDC_ZONES), -1, LVNI_SELECTED);

				if (sel<0) {
					if (config->ci_valid && config->ci.ciActiveFrame>0) {
						/* find the last zone that starts before the active frame */
						for(sel=0; sel < config->num_zones-1 && config->zones[sel].frame < config->ci.ciActiveFrame; sel++) ;
						sel--;
						new_frame = config->ci.ciActiveFrame;
					}else{
						sel = config->num_zones-1;
						new_frame = sel<0 ? 0 : config->zones[sel].frame + 1;
					}
				}else{
					new_frame = config->zones[sel].frame + 1;
				}

				/* shift the following zones up by one to make room at sel+1 */
				for(i=config->num_zones-1; i>sel; i--) {
					config->zones[i+1] = config->zones[i];
				}
				config->num_zones++;
				config->zones[sel+1].frame = new_frame;
				config->zones[sel+1].mode = RC_ZONE_WEIGHT;
				config->zones[sel+1].weight = 100;
				config->zones[sel+1].quant = 500;
				config->zones[sel+1].type = XVID_TYPE_AUTO;
				config->zones[sel+1].greyscale = 0;
				config->zones[sel+1].chroma_opt = 0;
				config->zones[sel+1].bvop_threshold = 0;
				/* clear the flag so the new zone does not keep a stale value from the shifted slot */
				config->zones[sel+1].cartoon_mode = 0;

				ListView_SetItemState(GetDlgItem(hDlg, IDC_ZONES), sel, 0x00000000, LVIS_SELECTED);
				zones_update(hDlg, config);
				ListView_SetItemState(GetDlgItem(hDlg, IDC_ZONES), sel+1, 0xffffffff, LVIS_SELECTED);
				break;
			}

			case IDC_REMOVE :
			{
				int i, sel;

				sel = ListView_GetNextItem(GetDlgItem(hDlg, IDC_ZONES), -1, LVNI_SELECTED);

				if (sel == -1 || config->num_zones < 1) {
					/*MessageBox(hDlg, "Nothing selected", "Warning", 0);*/
					break;
				}

				/* close the gap left by the removed zone */
				for (i=sel; i < config->num_zones-1; i++)
					config->zones[i] = config->zones[i+1];

				config->num_zones--;

				zones_update(hDlg, config);
				break;
			}

			case IDC_EDIT :
				main_download(hDlg, config);
				config->cur_zone = ListView_GetNextItem(GetDlgItem(hDlg, IDC_ZONES), -1, LVNI_SELECTED);
				if (config->cur_zone != -1 && adv_dialog(hDlg, config, zone_dlgs, sizeof(zone_dlgs)/sizeof(int))) {
zones_update(hDlg, config); } break; case IDC_QUALITY_ADV : main_download(hDlg, config); if (config->quality < quality_table_num) { int result = MessageBox(hDlg, "The built-in quality presets are read-only. Would you like to copy the values\n" "of the selected preset into the \"" QUALITY_USER_STRING "\" preset for editing?", "Question", MB_YESNOCANCEL|MB_DEFBUTTON2|MB_ICONQUESTION); if (result==0 || result==IDCANCEL) break; if (result==IDYES) { memcpy(&config->quality_user, &quality_table[config->quality], sizeof(quality_t)); config->quality = quality_table_num; } } adv_dialog(hDlg, config, quality_dlgs, sizeof(quality_dlgs)/sizeof(int)); SendDlgItemMessage(hDlg, IDC_QUALITY, CB_SETCURSEL, config->quality, 0); break; case IDC_DEFAULTS : config_reg_default(config); SendDlgItemMessage(hDlg, IDC_PROFILE, CB_SETCURSEL, config->profile, 0); SendDlgItemMessage(hDlg, IDC_MODE, CB_SETCURSEL, config->mode, 0); main_mode(hDlg, config); main_upload(hDlg, config); break; case IDOK : main_download(hDlg, config); config->save = TRUE; EndDialog(hDlg, IDOK); break; case IDCANCEL : config->save = FALSE; EndDialog(hDlg, IDCANCEL); break; } } else if (HIWORD(wParam) == LBN_SELCHANGE && (LOWORD(wParam)==IDC_PROFILE || LOWORD(wParam)==IDC_MODE)) { config->mode = SendDlgItemMessage(hDlg, IDC_MODE, CB_GETCURSEL, 0, 0); config->profile = SendDlgItemMessage(hDlg, IDC_PROFILE, CB_GETCURSEL, 0, 0); if (!g_use_bitrate) { if (config->mode == RC_MODE_1PASS) set_dlgitem_float(hDlg, IDC_BITRATE, config->desired_quant); else if (config->mode == RC_MODE_2PASS2) SetDlgItemInt(hDlg, IDC_BITRATE, config->desired_size, FALSE); } main_mode(hDlg, config); main_upload(hDlg, config); }else if (HIWORD(wParam)==EN_UPDATE && LOWORD(wParam)==IDC_BITRATE) { if (g_use_bitrate) { SendDlgItemMessage(hDlg, IDC_SLIDER, TBM_SETPOS, TRUE, config_get_uint(hDlg, IDC_BITRATE, DEFAULT_MIN_KBPS) ); } else if (config->mode == RC_MODE_1PASS) { SendDlgItemMessage(hDlg, IDC_SLIDER, TBM_SETPOS, TRUE, get_dlgitem_float(hDlg, 
IDC_BITRATE, DEFAULT_QUANT) ); } main_download(hDlg, config); }else { return 0; } break; case WM_HSCROLL : if((HWND)lParam == GetDlgItem(hDlg, IDC_SLIDER)) { if (g_use_bitrate) SetDlgItemInt(hDlg, IDC_BITRATE, SendMessage((HWND)lParam, TBM_GETPOS, 0, 0), FALSE); else set_dlgitem_float(hDlg, IDC_BITRATE, SendMessage((HWND)lParam, TBM_GETPOS, 0, 0)); main_download(hDlg, config); break; } return 0; default : return 0; } return 1; } /* ===================================================================================== */ /* LICENSE DIALOG ====================================================================== */ /* ===================================================================================== */ static INT_PTR CALLBACK license_proc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam) { switch (uMsg) { case WM_INITDIALOG : { HRSRC hRSRC; HGLOBAL hGlobal = NULL; if ((hRSRC = FindResource(g_hInst, MAKEINTRESOURCE(IDR_GPL), "TEXT"))) { if ((hGlobal = LoadResource(g_hInst, hRSRC))) { LPVOID lpData; if ((lpData = LockResource(hGlobal))) { SendDlgItemMessage(hDlg, IDC_LICENSE_TEXT, WM_SETFONT, (WPARAM)GetStockObject(ANSI_FIXED_FONT), MAKELPARAM(TRUE, 0)); SetDlgItemText(hDlg, IDC_LICENSE_TEXT, lpData); SendDlgItemMessage(hDlg, IDC_LICENSE_TEXT, EM_SETSEL, (WPARAM)-1, (LPARAM)0); } } } SetWindowLongPtr(hDlg, GWLP_USERDATA, (LONG_PTR)hGlobal); } break; case WM_DESTROY : { HGLOBAL hGlobal = (HGLOBAL)GetWindowLongPtr(hDlg, GWLP_USERDATA); if (hGlobal) { FreeResource(hGlobal); } } break; case WM_COMMAND : if (HIWORD(wParam) == BN_CLICKED) { switch(LOWORD(wParam)) { case IDOK : case IDCANCEL : EndDialog(hDlg, 0); break; default : return 0; } break; } break; default : return 0; } return 1; } /* ===================================================================================== */ /* ABOUT DIALOG ======================================================================== */ /* ===================================================================================== */ INT_PTR 
CALLBACK about_proc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam) { switch (uMsg) { case WM_INITDIALOG : { xvid_gbl_info_t info; char core[100]; HFONT hFont; LOGFONT lfData; HINSTANCE m_hdll; SetDlgItemText(hDlg, IDC_BUILD, XVID_BUILD); #ifdef _WIN64 wsprintf(core, "(%s, 64-bit Edition)", XVID_SPECIAL_BUILD); #else wsprintf(core, "(%s)", XVID_SPECIAL_BUILD); #endif SetDlgItemText(hDlg, IDC_SPECIAL_BUILD, core); memset(&info, 0, sizeof(info)); info.version = XVID_VERSION; m_hdll = LoadLibrary(XVID_DLL_NAME); if (m_hdll != NULL) { ((int (__cdecl *)(void *, int, void *, void *))GetProcAddress(m_hdll, "xvid_global")) (0, XVID_GBL_INFO, &info, NULL); wsprintf(core, "xvidcore.dll version %d.%d.%d (\"%s\")", XVID_VERSION_MAJOR(info.actual_version), XVID_VERSION_MINOR(info.actual_version), XVID_VERSION_PATCH(info.actual_version), info.build); FreeLibrary(m_hdll); } else { wsprintf(core, "xvidcore.dll not found!"); } SetDlgItemText(hDlg, IDC_CORE, core); hFont = (HFONT)SendDlgItemMessage(hDlg, IDC_WEBSITE, WM_GETFONT, 0, 0L); if (GetObject(hFont, sizeof(LOGFONT), &lfData)) { lfData.lfUnderline = 1; hFont = CreateFontIndirect(&lfData); if (hFont) { SendDlgItemMessage(hDlg, IDC_WEBSITE, WM_SETFONT, (WPARAM)hFont, 1L); } } SetClassLongPtr(GetDlgItem(hDlg, IDC_WEBSITE), GCLP_HCURSOR, (LONG_PTR)LoadCursor(NULL, IDC_HAND)); SetDlgItemText(hDlg, IDC_WEBSITE, XVID_WEBSITE); } break; case WM_CTLCOLORSTATIC : if ((HWND)lParam == GetDlgItem(hDlg, IDC_WEBSITE)) { SetBkMode((HDC)wParam, TRANSPARENT) ; SetTextColor((HDC)wParam, RGB(0x00,0x00,0xc0)); return (INT_PTR)GetStockObject(NULL_BRUSH); } return 0; case WM_COMMAND : if (LOWORD(wParam) == IDC_WEBSITE && HIWORD(wParam) == STN_CLICKED) { ShellExecute(hDlg, "open", XVID_WEBSITE, NULL, NULL, SW_SHOWNORMAL); }else if (LOWORD(wParam) == IDC_LICENSE) { DialogBoxParam(g_hInst, MAKEINTRESOURCE(IDD_LICENSE), hDlg, license_proc, (LPARAM)0); } else if (LOWORD(wParam) == IDOK || LOWORD(wParam) == IDCANCEL) { EndDialog(hDlg, 
LOWORD(wParam)); } break; default : return 0; } return 1; } void sort_zones(zone_t * zones, int zone_num, int * sel) { int i, j; zone_t tmp; for (i = 0; i < zone_num; i++) { int cur = i; int min_f = zones[i].frame; for (j = i + 1; j < zone_num; j++) { if (zones[j].frame < min_f) { min_f = zones[j].frame; cur = j; } } if (cur != i) { tmp = zones[i]; zones[i] = zones[cur]; zones[cur] = tmp; if (i == *sel) *sel = cur; else if (cur == *sel) *sel = i; } } } static void zones_update(HWND hDlg, CONFIG * config) { int i, sel; sel = ListView_GetNextItem(GetDlgItem(hDlg, IDC_ZONES), -1, LVNI_SELECTED); sort_zones(config->zones, config->num_zones, &sel); ListView_DeleteAllItems(GetDlgItem(hDlg,IDC_ZONES)); for (i = 0; i < config->num_zones; i++) main_insert_zone(GetDlgItem(hDlg,IDC_ZONES), &config->zones[i], i, TRUE); if (sel == -1 && config->num_zones > 0) sel = 0; if (sel >= config->num_zones) sel = config->num_zones-1; config->cur_zone = sel; ListView_SetItemState(GetDlgItem(hDlg, IDC_ZONES), sel, 0xffffffff, LVIS_SELECTED); } xvidcore/vfw/src/status.h0000664000076500007650000000357411564705453016573 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Status window header - * * Copyright(C) Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: status.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _STATUS_H_ #define _STATUS_H_ #include /* int64_t */ #if defined(_MSC_VER) || defined (__WATCOMC__) # define int64_t __int64 # define uint64_t unsigned __int64 #else # include <stdint.h> #endif typedef struct { double fps; HWND hDlg; HWND hGraph; HDC hDc; unsigned int width; unsigned int width31; unsigned int height; unsigned int stride; unsigned char * buffer; TEXTMETRIC tm; BITMAPINFO * bi; int64_t count[4]; int min_quant[4]; int max_quant[4]; int quant[3][31]; int max_quant_frames; int min_length[4]; int max_length[4]; int64_t tot_length[4]; } status_t; void status_create(status_t * s, unsigned int fps_inc, unsigned int fps_base); void status_update(status_t *s, int type, int length, int quant); void status_destroy(status_t *s); void status_destroy_always(status_t *s); #endif /* _STATUS_H_ */ xvidcore/vfw/src/debug.h0000664000076500007650000000261411564705453016330 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - Debug header - * * Copyright(C) Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: debug.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _DEBUG_H_ #define _DEBUG_H_ #if defined(_DEBUG) #include <stdio.h> /* vsprintf */ #define DPRINTF_BUF_SZ 1024 static __inline void DPRINTF(char *fmt, ...) { va_list args; char buf[DPRINTF_BUF_SZ]; va_start(args, fmt); vsprintf(buf, fmt, args); va_end(args); OutputDebugString(buf); } #else static __inline void DPRINTF(char *fmt, ...) { } #endif #endif /* _DEBUG_H_ */ xvidcore/vfw/src/resource.rc0000664000076500007650000015061211564735747017261 0ustar xvidbuildxvidbuild// Microsoft Visual C++ generated resource script. // #include "resource.h" #define APSTUDIO_READONLY_SYMBOLS ///////////////////////////////////////////////////////////////////////////// // // Generated from the TEXTINCLUDE 2 resource. 
// #include #ifndef IDC_STATIC #define IDC_STATIC (-1) #endif ///////////////////////////////////////////////////////////////////////////// #undef APSTUDIO_READONLY_SYMBOLS ///////////////////////////////////////////////////////////////////////////// // Neutral resources #if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_NEU) #ifdef _WIN32 LANGUAGE LANG_NEUTRAL, SUBLANG_NEUTRAL #pragma code_page(1252) #endif //_WIN32 ///////////////////////////////////////////////////////////////////////////// // // Dialog // IDD_RC_2PASS1 DIALOG 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "1st Pass" FONT 8, "MS Shell Dlg" BEGIN EDITTEXT IDC_STATS,72,6,106,12,ES_AUTOHSCROLL PUSHBUTTON "...",IDC_STATS_BROWSE,182,7,22,12 CONTROL "Full quality first pass",IDC_FULL1PASS,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,15,38,83,12 CONTROL "Discard first pass",IDC_DISCARD1PASS,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,127,38,69,12 LTEXT "Stats filename:",IDC_STATIC,8,6,52,12,SS_CENTERIMAGE CTEXT "If you don't discard first pass but keep full quality disabled,\nthe resulting 1st pass stream might not be MPEG-4 compliant.",IDC_STATIC,7,114,197,35 CTEXT "Full quality first pass is only useful if you want to keep the resulting stream.\nIt doesn't improve quality of second pass and normally should be disabled.",IDC_STATIC,7,71,197,35 END IDD_MOTION DIALOGEX 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | DS_FIXEDSYS | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Motion" FONT 8, "MS Shell Dlg", 0, 0, 0x0 BEGIN COMBOBOX IDC_MOTION,112,19,76,76,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP COMBOBOX IDC_VHQ,112,37,76,76,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP CONTROL "Use chroma motion",IDC_CHROMAME,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,14,91,77,10 CONTROL "Turbo ;-)",IDC_TURBO,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,143,91,43,10 EDITTEXT IDC_FRAMEDROP,112,125,75,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_MAXKEY,112,145,76,12,ES_AUTOHSCROLL | ES_NUMBER LTEXT 
"Motion search precision:",IDC_STATIC,14,20,80,12,SS_CENTERIMAGE LTEXT "VHQ mode:",IDC_STATIC,14,39,38,8 LTEXT "Frame drop ratio:",IDC_FRAMEDROP_STATIC,14,125,68,12,SS_CENTERIMAGE LTEXT "Maximum I-frame interval:",IDC_STATIC,14,146,80,12,SS_CENTERIMAGE GROUPBOX "Motion Precision",IDC_STATIC,7,7,193,99 GROUPBOX "Other",IDC_STATIC,7,108,193,67 CONTROL "Use VHQ for bframes too",IDC_VHQ_BFRAME,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,14,78,95,10 COMBOBOX IDC_VHQ_METRIC,112,55,76,76,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP LTEXT "VHQ metric:",IDC_STATIC,14,56,38,8 END IDD_MAIN DIALOGEX 0, 0, 225, 255 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Xvid Configuration" FONT 8, "MS Shell Dlg", 0, 0, 0x1 BEGIN COMBOBOX IDC_PROFILE,88,16,91,200,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP PUSHBUTTON "more...",IDC_PROFILE_ADV,184,16,28,12 COMBOBOX IDC_MODE,88,34,92,76,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP PUSHBUTTON "more...",IDC_MODE_ADV,184,34,28,12 CTEXT "Target bitrate (kbps):",IDC_BITRATE_S,12,53,71,13,SS_NOTIFY | SS_CENTERIMAGE,WS_EX_DLGMODALFRAME EDITTEXT IDC_BITRATE,88,53,91,12,ES_AUTOHSCROLL PUSHBUTTON "calc...",IDC_BITRATE_ADV,184,52,28,12 CONTROL "Slider1",IDC_SLIDER,"msctls_trackbar32",TBS_BOTH | TBS_NOTICKS | WS_TABSTOP,12,78,204,14 CONTROL "List1",IDC_ZONES,"SysListView32",LVS_REPORT | LVS_SINGLESEL | LVS_SHOWSELALWAYS | LVS_NOSORTHEADER | WS_TABSTOP,14,110,198,68,WS_EX_STATICEDGE PUSHBUTTON "Add",IDC_ADD,16,182,36,12 PUSHBUTTON "Remove",IDC_REMOVE,56,182,36,12 PUSHBUTTON "Zone Options...",IDC_EDIT,155,182,57,12 PUSHBUTTON "Load Defaults",IDC_DEFAULTS,7,238,64,13 PUSHBUTTON "Other Options...",IDC_OTHER,81,238,64,13 DEFPUSHBUTTON "OK",IDOK,156,238,64,13 GROUPBOX "Main Settings",IDC_STATIC,7,3,212,202 LTEXT "Encoding type:",IDC_STATIC,14,34,70,12,SS_CENTERIMAGE LTEXT "Profile @ Level:",IDC_STATIC,14,16,70,12,SS_CENTERIMAGE LTEXT "X",IDC_BITRATE_MIN,16,70,79,8 RTEXT "X",IDC_BITRATE_MAX,125,70,84,8 GROUPBOX 
"Zones",IDC_STATIC,7,98,212,107 GROUPBOX "More",IDC_STATIC,7,200,212,29 COMBOBOX IDC_QUALITY,86,210,92,76,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP PUSHBUTTON "more...",IDC_QUALITY_ADV,182,210,28,12 LTEXT "Quality preset:",IDC_STATIC,12,210,70,12,SS_CENTERIMAGE END IDD_QUANT DIALOG 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Quantization" FONT 8, "MS Shell Dlg" BEGIN EDITTEXT IDC_MINIQUANT,120,18,76,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_MAXIQUANT,120,34,76,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_MINPQUANT,120,50,76,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_MAXPQUANT,120,66,76,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_MINBQUANT,120,82,76,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_MAXBQUANT,120,98,76,12,ES_AUTOHSCROLL | ES_NUMBER GROUPBOX "Quantizer restrictions",IDC_STATIC,8,6,196,112 LTEXT "Min I-frame quantizer:",IDC_STATIC,16,18,76,12,SS_CENTERIMAGE LTEXT "Max I-frame quantizer:",IDC_STATIC,16,34,76,12,SS_CENTERIMAGE LTEXT "Min P-frame quantizer:",IDC_STATIC,16,50,76,12,SS_CENTERIMAGE LTEXT "Max P-frame quantizer:",IDC_STATIC,16,66,76,12,SS_CENTERIMAGE LTEXT "Min B-frame quantizer:",IDC_STATIC,16,82,76,12,SS_CENTERIMAGE LTEXT "Max B-frame quantizer:",IDC_STATIC,16,98,76,12,SS_CENTERIMAGE CONTROL "Trellis quantization",IDC_TRELLISQUANT,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,8,128,74,10 END IDD_RC_2PASS2 DIALOG 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "2nd Pass" FONT 8, "MS Shell Dlg" BEGIN EDITTEXT IDC_STATS,72,6,112,12,ES_AUTOHSCROLL PUSHBUTTON "...",IDC_STATS_BROWSE,189,7,15,11 EDITTEXT IDC_KFBOOST,140,34,56,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_MINKEY,140,55,56,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_KFREDUCTION,140,69,56,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_OVERFLOW_CONTROL_STRENGTH,140,99,56,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_OVERIMP,140,118,56,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_OVERDEG,140,137,56,12,ES_AUTOHSCROLL | 
ES_NUMBER EDITTEXT IDC_CURVECOMPH,140,169,56,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_CURVECOMPL,140,186,56,12,ES_AUTOHSCROLL | ES_NUMBER LTEXT "Stats filename:",IDC_STATIC,8,6,52,12,SS_CENTERIMAGE GROUPBOX "Intra-frames tuning",IDC_STATIC,7,22,197,69 LTEXT "Overflow control strength (%):",IDC_STATIC,16,100,100,12,SS_CENTERIMAGE LTEXT "High bitrate scenes degradation (%):",IDC_STATIC,16,170,124,12,SS_CENTERIMAGE LTEXT "Low bitrate scenes improvement (%):",IDC_STATIC,16,186,124,12,SS_CENTERIMAGE LTEXT "I-frame boost (%):",IDC_STATIC,16,34,91,12,SS_CENTERIMAGE LTEXT "...are reduced by (%):",IDC_STATIC,16,69,100,12 LTEXT "Max overflow improvement (%):",IDC_STATIC,16,119,100,12,SS_CENTERIMAGE LTEXT "Max overflow degradation (%):",IDC_STATIC,16,138,100,12,SS_CENTERIMAGE LTEXT "I-frames closer than... (frames):",IDC_STATIC,16,56,124,11 GROUPBOX "Overflow treatment",IDC_STATIC,7,86,197,75 GROUPBOX "Curve compression",IDC_STATIC,7,156,197,51 END IDD_ENC DIALOGEX 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | DS_FIXEDSYS | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Encoder" FONT 8, "MS Shell Dlg", 0, 0, 0x0 BEGIN COMBOBOX IDC_FOURCC,127,13,76,76,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP CONTROL "Print debug info on each frame",IDC_VOPDEBUG,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,7,36,113,10 CONTROL "Display encoding status",IDC_DISPLAY_STATUS,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,7,49,91,10 LTEXT "FourCC used:",IDC_STATIC,7,15,80,8,SS_CENTERIMAGE END IDD_QUANTMATRIX DIALOG 0, 0, 288, 149 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Custom quantization matrix" FONT 8, "MS Shell Dlg" BEGIN DEFPUSHBUTTON "OK",IDOK,172,128,47,13 EDITTEXT IDC_QINTRA00,8,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | WS_DISABLED | NOT WS_BORDER EDITTEXT IDC_QINTRA01,24,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA02,40,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA03,56,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT 
WS_BORDER EDITTEXT IDC_QINTRA04,72,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA05,88,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA06,104,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA07,120,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA08,8,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA09,24,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA10,40,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA11,56,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA12,72,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA13,88,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA14,104,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA15,120,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA16,8,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA17,24,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA18,40,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA19,56,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA20,72,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA21,88,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA22,104,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA23,120,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA24,8,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA25,24,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA26,40,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA27,56,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA28,72,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA29,88,52,15,11,ES_AUTOHSCROLL | ES_NUMBER 
| NOT WS_BORDER EDITTEXT IDC_QINTRA30,104,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA31,120,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA32,8,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA33,24,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA34,40,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA35,56,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA36,72,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA37,88,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA38,104,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA39,120,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA40,8,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA41,24,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA42,40,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA43,56,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA44,72,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA45,88,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA46,104,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA47,120,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA48,8,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA49,24,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA50,40,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA51,56,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA52,72,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA53,88,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA54,104,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA55,120,88,15,11,ES_AUTOHSCROLL | 
ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA56,8,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA57,24,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA58,40,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA59,56,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA60,72,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA61,88,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA62,104,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTRA63,120,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER GROUPBOX "Intra matrix",IDC_STATIC,4,4,136,112 EDITTEXT IDC_QINTER00,152,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER01,168,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER02,184,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER03,200,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER04,216,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER05,232,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER06,248,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER07,264,16,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER08,152,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER09,168,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER10,184,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER11,200,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER12,216,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER13,232,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER14,248,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER15,264,28,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER16,152,40,15,11,ES_AUTOHSCROLL | 
ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER17,168,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER18,184,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER19,200,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER20,216,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER21,232,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER22,248,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER23,264,40,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER24,152,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER25,168,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER26,184,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER27,200,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER28,216,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER29,232,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER30,248,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER31,264,52,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER32,152,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER33,168,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER34,184,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER35,200,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER36,216,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER37,232,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER38,248,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER39,264,64,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER40,152,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER41,168,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT 
IDC_QINTER42,184,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER43,200,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER44,216,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER45,232,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER46,248,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER47,264,76,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER48,152,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER49,168,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER50,184,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER51,200,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER52,216,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER53,232,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER54,248,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER55,264,88,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER56,152,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER57,168,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER58,184,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER59,200,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER60,216,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER61,232,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER62,248,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER EDITTEXT IDC_QINTER63,264,100,15,11,ES_AUTOHSCROLL | ES_NUMBER | NOT WS_BORDER GROUPBOX "Inter matrix",IDC_STATIC,148,4,136,112 PUSHBUTTON "&Load matrix...",IDC_LOAD,68,128,47,13 PUSHBUTTON "&Save matrix...",IDC_SAVE,120,128,47,13 END IDD_ABOUT DIALOG 0, 0, 192, 165 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Xvid MPEG-4 Video 
Codec" FONT 8, "MS Shell Dlg" BEGIN CTEXT "Xvid is copyrighted software. It may be distributed\naccording to the terms of the GNU GPL license.",IDC_STATIC,12,112,168,20 CTEXT "WEBSITE",IDC_WEBSITE,60,92,72,8,SS_NOTIFY | SS_CENTERIMAGE CTEXT "BUILD",IDC_BUILD,8,28,176,8,SS_CENTERIMAGE CONTROL "IDB_LOGO",IDC_STATIC,"Static",SS_BITMAP,24,56,15,13 CTEXT "Xvid MPEG-4 Video Codec",IDC_STATIC,8,16,176,12 GROUPBOX "About",IDC_STATIC,4,4,184,132 DEFPUSHBUTTON "OK",IDOK,102,144,80,14 CTEXT "CORE",IDC_CORE,8,40,176,8,SS_CENTERIMAGE CTEXT "( SPECIAL BUILD )",IDC_SPECIAL_BUILD,5,102,181,8 PUSHBUTTON "View License...",IDC_LICENSE,10,144,80,14 END IDD_RC_CBR DIALOG 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "CBR" FONT 8, "MS Shell Dlg" BEGIN EDITTEXT IDC_CBR_REACTIONDELAY,108,12,76,12,ES_AUTOHSCROLL | ES_NUMBER LTEXT "Reaction Delay Factor:",IDC_STATIC,8,12,80,12,SS_CENTERIMAGE EDITTEXT IDC_CBR_AVERAGINGPERIOD,108,28,76,12,ES_AUTOHSCROLL | ES_NUMBER LTEXT "Averaging period:",IDC_STATIC,8,28,80,12,SS_CENTERIMAGE EDITTEXT IDC_CBR_BUFFER,108,44,76,12,ES_AUTOHSCROLL | ES_NUMBER LTEXT "Smoother:",IDC_STATIC,8,44,80,12,SS_CENTERIMAGE END IDD_PROFILE DIALOGEX 0, 0, 212, 217 STYLE DS_SETFONT | DS_MODALFRAME | DS_FIXEDSYS | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Profile" FONT 8, "MS Shell Dlg", 0, 0, 0x0 BEGIN GROUPBOX "",IDC_STATIC,8,145,198,68 COMBOBOX IDC_PROFILE_PROFILE,72,21,128,200,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP COMBOBOX IDC_QUANTTYPE,124,47,76,76,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP PUSHBUTTON "Edit Matrix...",IDC_QUANTMATRIX,124,62,76,10 CONTROL "Interlaced Encoding",IDC_INTERLACING,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,94,79,8 CONTROL "Quarter Pixel",IDC_QPEL,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,106,100,8 CONTROL "Global Motion Compensation",IDC_GMC,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,118,103,8 CONTROL "B-VOPs",IDC_BVOP,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,146,40,8 LTEXT "Max consecutive 
B-VOPs:",IDC_MAXBFRAMES_S,16,158,96,8 EDITTEXT IDC_MAXBFRAMES,124,155,76,12,ES_AUTOHSCROLL | ES_NUMBER EDITTEXT IDC_BQUANTRATIO,124,169,76,12,ES_AUTOHSCROLL EDITTEXT IDC_BQUANTOFFSET,124,183,76,12,ES_AUTOHSCROLL CONTROL "Packed bitstream",IDC_PACKED,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,200,71,10 LTEXT "Quantizer ratio:",IDC_BQUANTRATIO_S,16,172,89,8 LTEXT "Quantizer offset:",IDC_BQUANTOFFSET_S,16,186,52,8 LTEXT "Quantization type:",IDC_QUANTTYPE_S,16,50,85,8 CONTROL "Top field first",IDC_TFF,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,125,94,56,10 COMBOBOX IDC_LUMMASK,124,75,76,76,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP LTEXT "Adaptive Quantization:",IDC_LUMMASK_S,16,77,86,8 GROUPBOX "",IDC_STATIC,4,7,202,37 LTEXT "Profile",IDC_STATIC,9,6,22,8 ICON IDI_MOBILE,IDC_PROFILE_LOGO,10,15,20,20,SS_REALSIZEIMAGE,WS_EX_ACCEPTFILES LTEXT "Profile @ Level:",IDC_PROFILE_LABEL,11,23,52,8,NOT WS_VISIBLE CONTROL "Independent Slice Coding",IDC_SLICES,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,130,103,8 END IDD_ZONE DIALOG 0, 0, 212, 194 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Zone" FONT 8, "MS Shell Dlg" BEGIN EDITTEXT IDC_ZONE_FRAME,104,6,80,12,ES_AUTOHSCROLL | ES_NUMBER PUSHBUTTON "<-",IDC_ZONE_FETCH,188,6,16,12 CONTROL "Weight:",IDC_ZONE_MODE_WEIGHT,"Button",BS_AUTORADIOBUTTON | WS_GROUP,16,40,41,10 EDITTEXT IDC_ZONE_WEIGHT,104,38,80,12,ES_AUTOHSCROLL CONTROL "Quantizer:",IDC_ZONE_MODE_QUANT,"Button",BS_AUTORADIOBUTTON,16,56,48,10 EDITTEXT IDC_ZONE_QUANT,104,54,80,12,ES_AUTOHSCROLL CONTROL "Slider1",IDC_ZONE_SLIDER,"msctls_trackbar32",TBS_BOTH | TBS_NOTICKS | WS_TABSTOP,12,82,188,14 CONTROL "Begin with keyframe",IDC_ZONE_FORCEIVOP,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,124,92,8 CONTROL "Greyscale encoding",IDC_ZONE_GREYSCALE,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,138,79,10 CONTROL "Chroma optimizer enabled",IDC_ZONE_CHROMAOPT,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,16,152,97,10 LTEXT "B-VOP 
sensitivity:",IDC_ZONE_BVOPTHRESHOLD_S,16,170,76,8 EDITTEXT IDC_ZONE_BVOPTHRESHOLD,100,168,84,12,ES_AUTOHSCROLL LTEXT "Start frame #:",IDC_STATIC,8,6,52,10 GROUPBOX "Rate control",IDC_STATIC,7,22,198,90 RTEXT "X",IDC_ZONE_MAX,140,74,54,8 LTEXT "X",IDC_ZONE_MIN,18,74,54,8 GROUPBOX "Static",IDC_STATIC,7,107,198,80 CONTROL "Cartoon Mode",IDC_CARTOON,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,121,124,61,10 END IDD_LEVEL DIALOGEX 0, 0, 212, 215 STYLE DS_SETFONT | DS_MODALFRAME | DS_FIXEDSYS | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Level" FONT 8, "MS Shell Dlg", 0, 0, 0x0 BEGIN COMBOBOX IDC_LEVEL_PROFILE,72,21,128,200,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP EDITTEXT IDC_LEVEL_WIDTH,90,65,28,12,ES_AUTOHSCROLL | ES_READONLY EDITTEXT IDC_LEVEL_HEIGHT,130,65,28,12,ES_AUTOHSCROLL | ES_READONLY EDITTEXT IDC_LEVEL_FPS,170,65,28,12,ES_AUTOHSCROLL | ES_READONLY EDITTEXT IDC_LEVEL_VMV,158,82,40,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY EDITTEXT IDC_LEVEL_VCV,158,99,40,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY EDITTEXT IDC_LEVEL_VBV,158,137,40,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY EDITTEXT IDC_LEVEL_BITRATE,158,153,40,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY GROUPBOX "Level - Xvid will not force you to respect these",IDC_LEVEL_LEVEL_G,10,50,196,66 LTEXT "Suggested:",IDC_LEVEL_DIM_S,18,67,48,8 LTEXT "x",IDC_STATIC,122,67,8,8 LTEXT "Max bitrate (kbps)",IDC_LEVEL_BITRATE_S,18,155,108,8 LTEXT "Max buffer size (bits):",IDC_LEVEL_VBV_S,18,139,108,8 LTEXT "Max processing rate (mbs/sec)",IDC_LEVEL_VCV_S,18,101,108,8 LTEXT "Max frame size (macroblocks):",IDC_LEVEL_VMV_S,18,85,108,8 LTEXT "x",IDC_STATIC,162,67,8,8 GROUPBOX "Video Buffer Verifier - used in Two-Pass mode",IDC_LEVEL_VBV_G,10,122,196,66 EDITTEXT IDC_LEVEL_PEAKRATE,158,170,40,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY LTEXT "Max bits over any one second interval:",IDC_LEVEL_PEAKRATE_S,18,172,128,8 GROUPBOX "",IDC_STATIC,4,7,202,37 LTEXT "Profile",IDC_STATIC,9,6,22,8 ICON 
IDI_MOBILE,IDC_PROFILE_LOGO,10,15,20,20,SS_REALSIZEIMAGE,WS_EX_ACCEPTFILES LTEXT "Profile @ Level:",IDC_PROFILE_LABEL,11,23,52,8,NOT WS_VISIBLE CTEXT "To ensure best playback of your videos, keep an eye out for standalone video devices carrying one of the Xvid logos!",IDC_STATIC,10,194,195,18 END IDD_DEC DIALOG 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Decoder" FONT 8, "MS Shell Dlg" BEGIN CONTROL "Deblocking Y",IDC_DEC_DY,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,15,64,63,13 CONTROL "Deblocking UV",IDC_DEC_DUV,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,15,80,61,13 CONTROL "Deringing Y",IDC_DEC_DRY,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,88,64,60,13 CONTROL "Film Effect",IDC_DEC_FE,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,154,64,45,13 GROUPBOX "Brightness",IDC_STATIC,5,6,202,41 CONTROL "Slider1",IDC_DEC_BRIGHTNESS,"msctls_trackbar32",TBS_AUTOTICKS | TBS_BOTH | WS_TABSTOP,18,18,181,24 GROUPBOX "Postprocessing",IDC_STATIC,5,52,202,46 CONTROL "Deringing UV",IDC_DEC_DRUV,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,88,80,60,13 END IDD_STATUS DIALOGEX 0, 0, 325, 220 STYLE DS_SETFONT | DS_MODALFRAME | WS_MINIMIZEBOX | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Xvid Status" FONT 8, "MS Shell Dlg", 0, 0, 0x1 BEGIN CONTROL "",IDC_STATUS_GRAPH,"Static",SS_OWNERDRAW | SS_NOTIFY,4,4,187,108 LTEXT "I-VOP",IDC_STATIC,26,139,22,8 LTEXT "B-VOP",IDC_STATIC,26,167,24,8 LTEXT "P-VOP",IDC_STATIC,26,153,23,8 LTEXT "Total",IDC_STATIC,26,181,21,8 EDITTEXT IDC_STATUS_IQ_MIN,103,137,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_IQ_MAX,123,137,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_PQ_MIN,103,152,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_PQ_MAX,123,152,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | 
ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_BQ_MIN,103,166,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_BQ_MAX,123,166,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_Q_MIN,103,180,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_Q_MAX,123,180,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE CTEXT "Min",IDC_STATIC,104,126,14,8 CTEXT "Max",IDC_STATIC,123,126,16,8 EDITTEXT IDC_STATUS_IL_MIN,174,137,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_IL_MAX,206,137,28,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_PL_MIN,174,152,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_PL_MAX,206,152,28,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_BL_MIN,174,166,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_BL_MAX,206,166,28,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_L_MIN,174,180,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_L_MAX,206,180,28,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE CTEXT "Min",IDC_STATIC,177,125,24,8 CTEXT "Max",IDC_STATIC,205,125,29,8 EDITTEXT IDC_STATUS_IL_TOT,270,137,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | 
ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_PL_TOT,270,152,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_BL_TOT,270,166,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_L_TOT,270,180,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE CTEXT "Total (k)",IDC_STATIC,271,125,28,8 CONTROL "Auto-close window",IDC_STATUS_DESTROY,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,13,202,100,10 EDITTEXT IDC_STATUS_KBPS,267,199,36,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_IL_AVG,237,137,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_PL_AVG,237,152,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_BL_AVG,237,166,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_L_AVG,237,180,29,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE CTEXT "Average",IDC_STATIC,237,125,30,8 EDITTEXT IDC_STATUS_I_NUM,57,138,32,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_P_NUM,57,152,32,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_B_NUM,57,166,32,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_NUM,57,180,32,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE GROUPBOX "Frame size 
(bytes)",IDC_STATIC,170,117,133,80,BS_CENTER GROUPBOX "Quant",IDC_STATIC,99,117,65,80,BS_CENTER GROUPBOX "Frames",IDC_STATIC,53,117,40,80,BS_CENTER LTEXT "Avg bitrate (kbps):",IDC_STATIC,200,201,59,8 LISTBOX IDC_DEBUGOUTPUT,201,13,113,100,LBS_NOINTEGRALHEIGHT | WS_TABSTOP CONTROL "Show me the internals!",IDC_SHOWINTERNALS,"Button",BS_AUTOCHECKBOX | BS_NOTIFY | WS_TABSTOP,201,1,87,10 EDITTEXT IDC_STATUS_IQ_AVG,143,137,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_PQ_AVG,143,152,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_BQ_AVG,143,166,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE EDITTEXT IDC_STATUS_Q_AVG,143,180,17,12,ES_RIGHT | ES_AUTOHSCROLL | ES_READONLY | ES_NUMBER | NOT WS_BORDER | NOT WS_TABSTOP,WS_EX_STATICEDGE CTEXT "Avg",IDC_STATIC,143,126,16,8 END IDD_AR DIALOGEX 0, 0, 211, 215 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Aspect Ratio" FONT 8, "MS Shell Dlg", 0, 0, 0x1 BEGIN CONTROL "Pixel Aspect Ratio",IDC_PAR,"Button",BS_AUTORADIOBUTTON,13,7,73,10,WS_EX_TRANSPARENT COMBOBOX IDC_ASPECT_RATIO,25,33,111,55,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP EDITTEXT IDC_PARX,39,58,36,12,ES_NUMBER EDITTEXT IDC_PARY,100,58,36,12,ES_NUMBER PUSHBUTTON "Default",IDC_AR_DEFAULT,156,21,37,11 CONTROL "Display Aspect Ratio",IDC_AR,"Button",BS_AUTORADIOBUTTON,13,81,81,10,WS_EX_TRANSPARENT EDITTEXT IDC_ARX,39,110,36,12,ES_NUMBER EDITTEXT IDC_ARY,100,110,36,12,ES_NUMBER PUSHBUTTON "4:3",IDC_AR_4_3,156,90,37,11 PUSHBUTTON "16:9",IDC_AR_16_9,156,104,37,11 PUSHBUTTON "2,35:1",IDC_AR_235_100,156,118,37,11 LTEXT "Select the shape of a pixel...",IDC_STATIC,25,18,104,10 LTEXT "Select the shape of the image...",IDC_STATIC,25,93,110,11 GROUPBOX "",IDC_STATIC,7,7,196,127,BS_CENTER GROUPBOX "",IDC_STATIC,7,81,196,52,BS_CENTER 
CTEXT "X :",IDC_STATIC,25,60,10,10 LTEXT "Aspect Ratio is written to the MPEG-4 bitstream, but unfortunately is likely to be ignored if the video stream is encapsulated in a general-purpose container (like .avi, .ogm, .mkv).",IDC_STATIC,7,138,197,27 LTEXT "Therefore, be aware that an aspect ratio other than the default might be ignored by some players, especially when decoded on Windows.\n\nUse at your own risk.",IDC_STATIC,7,168,197,41 CTEXT "Y :",IDC_STATIC,86,60,10,10 CTEXT "Y :",IDC_STATIC,86,112,10,10 CTEXT "X :",IDC_STATIC,25,112,10,10 GROUPBOX "Quick Setting",IDC_STATIC,148,7,55,127,0,WS_EX_TRANSPARENT END IDD_BITRATE DIALOG 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Bitrate Calculator" FONT 8, "MS Shell Dlg" BEGIN COMBOBOX IDC_BITRATE_TSIZE,95,5,75,64,CBS_DROPDOWN | WS_VSCROLL | WS_TABSTOP EDITTEXT IDC_BITRATE_SSIZE,95,20,75,12,ES_AUTOHSCROLL PUSHBUTTON "...",IDC_BITRATE_SSELECT,178,21,20,12 COMBOBOX IDC_BITRATE_CFORMAT,95,45,75,64,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP EDITTEXT IDC_BITRATE_COVERHEAD,95,60,75,12,ES_AUTOHSCROLL | ES_READONLY EDITTEXT IDC_BITRATE_HOURS,15,100,29,11,ES_AUTOHSCROLL EDITTEXT IDC_BITRATE_MINUTES,52,100,29,11,ES_AUTOHSCROLL EDITTEXT IDC_BITRATE_SECONDS,88,100,29,11,ES_AUTOHSCROLL COMBOBOX IDC_BITRATE_FPS,124,100,71,100,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP EDITTEXT IDC_BITRATE_VSIZE,95,120,75,12,ES_AUTOHSCROLL | ES_READONLY EDITTEXT IDC_BITRATE_VRATE,95,135,75,12,ES_AUTOHSCROLL | ES_READONLY COMBOBOX IDC_BITRATE_AFORMAT,95,158,75,64,CBS_DROPDOWNLIST | WS_VSCROLL | WS_TABSTOP COMBOBOX IDC_BITRATE_ARATE,95,173,75,64,CBS_DROPDOWN | WS_VSCROLL | WS_TABSTOP EDITTEXT IDC_BITRATE_ASIZE,95,188,75,12,ES_AUTOHSCROLL PUSHBUTTON "...",IDC_BITRATE_ASELECT,178,188,20,12 CONTROL "Avg. 
bitrate (kbps):",IDC_BITRATE_AMODE_RATE,"Button",BS_AUTORADIOBUTTON,15,175,75,10 GROUPBOX "Video",IDC_STATIC,5,75,200,132 GROUPBOX "Audio",IDC_STATIC,5,148,200,59 CONTROL "Size (kbytes):",IDC_BITRATE_AMODE_SIZE,"Button",BS_AUTORADIOBUTTON,15,190,58,10 LTEXT "Target size (kbytes):",IDC_STATIC,15,6,64,12,SS_CENTERIMAGE LTEXT "Format:",IDC_STATIC,15,46,24,13,SS_CENTERIMAGE LTEXT "Format:",IDC_STATIC,15,160,24,8,SS_CENTERIMAGE LTEXT "Size (kbytes):",IDC_STATIC,15,121,43,8,SS_CENTERIMAGE GROUPBOX "Container:",IDC_STATIC,5,35,200,172 LTEXT "Overhead (kbytes):",IDC_STATIC,15,61,61,10,SS_CENTERIMAGE LTEXT "Average bitrate (kbps):",IDC_STATIC,15,136,72,8,SS_CENTERIMAGE CTEXT "hours",IDC_STATIC,15,90,30,8 CTEXT "minutes",IDC_STATIC,52,90,30,8 CTEXT "seconds",IDC_STATIC,88,90,30,8 CTEXT "frames per second",IDC_STATIC,126,90,70,8 LTEXT "Subtitles (kbytes):",IDC_STATIC,15,21,60,10,SS_CENTERIMAGE END IDD_LICENSE DIALOG 0, 0, 430, 234 STYLE DS_SETFONT | DS_MODALFRAME | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "GNU General Public License" FONT 8, "MS Sans Serif" BEGIN PUSHBUTTON "OK",IDOK,172,218,84,14 EDITTEXT IDC_LICENSE_TEXT,2,2,426,212,ES_MULTILINE | ES_AUTOHSCROLL | ES_READONLY | WS_VSCROLL END IDD_COMMON DIALOGEX 0, 0, 212, 212 STYLE DS_SETFONT | DS_MODALFRAME | DS_FIXEDSYS | WS_POPUP | WS_CAPTION | WS_SYSMENU CAPTION "Common" FONT 8, "MS Shell Dlg", 0, 0, 0x0 BEGIN CONTROL "Automatically detect optimizations",IDC_CPU_AUTO,"Button",BS_AUTORADIOBUTTON | WS_GROUP,16,20,121,10 CONTROL "Force optimizations",IDC_CPU_FORCE,"Button",BS_AUTORADIOBUTTON,16,33,76,10 CONTROL "MMX",IDC_CPU_MMX,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,28,44,33,10 CONTROL "Integer SSE",IDC_CPU_MMXEXT,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,28,56,54,10 CONTROL "SSE",IDC_CPU_SSE,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,28,68,30,10 CONTROL "SSE2",IDC_CPU_SSE2,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,28,81,34,10 CONTROL "3DNow!",IDC_CPU_3DNOW,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,28,118,42,10 CONTROL "3DNow! 
2",IDC_CPU_3DNOWEXT,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,28,131,48,10 EDITTEXT IDC_DEBUG,127,166,76,12,ES_AUTOHSCROLL GROUPBOX "Performance optimizations",-1,8,4,196,158 LTEXT "OutputDebugString debug level:",-1,7,168,104,12 CONTROL "SSE3",IDC_CPU_SSE3,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,28,93,34,10 CONTROL "SSE4",IDC_CPU_SSE4,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,28,106,34,10 EDITTEXT IDC_NUMTHREADS,127,145,69,12,ES_AUTOHSCROLL | ES_NUMBER LTEXT "Number of threads:",IDC_NUMTHREADS_STATIC,28,147,68,8 END ///////////////////////////////////////////////////////////////////////////// // // Bitmap // IDB_LOGO BITMAP "XviD_logo.bmp" ///////////////////////////////////////////////////////////////////////////// // // Icon // // Icon with lowest ID value placed first to ensure application icon // remains consistent on all systems. IDI_ICON ICON "xvid.ico" ///////////////////////////////////////////////////////////////////////////// // // TEXT // IDR_GPL TEXT "../LICENSE" #ifdef APSTUDIO_INVOKED ///////////////////////////////////////////////////////////////////////////// // // TEXTINCLUDE // 1 TEXTINCLUDE BEGIN "resource.h\0" END 2 TEXTINCLUDE BEGIN "#include \r\n" "#ifndef IDC_STATIC\r\n" "#define IDC_STATIC (-1)\r\n" "#endif\r\n" "\0" END 3 TEXTINCLUDE BEGIN "\r\n" "\0" END #endif // APSTUDIO_INVOKED ///////////////////////////////////////////////////////////////////////////// // // DESIGNINFO // #ifdef APSTUDIO_INVOKED GUIDELINES DESIGNINFO BEGIN IDD_MOTION, DIALOG BEGIN END IDD_ENC, DIALOG BEGIN END IDD_ABOUT, DIALOG BEGIN END IDD_PROFILE, DIALOG BEGIN BOTTOMMARGIN, 213 END IDD_LEVEL, DIALOG BEGIN END IDD_DEC, DIALOG BEGIN END IDD_BITRATE, DIALOG BEGIN END IDD_LICENSE, DIALOG BEGIN END IDD_COMMON, DIALOG BEGIN END END #endif // APSTUDIO_INVOKED ///////////////////////////////////////////////////////////////////////////// // // String Table // STRINGTABLE BEGIN IDC_MODE """Single Pass""encodes to your specified bitrate or quality.\n""Two-pass 1st pass"" gathers 
statistics for the 2nd pass.\n""Two-pass 2nd pass"" uses the 1st pass statistics to encode at target file size or bitrate." END STRINGTABLE BEGIN IDC_INTERLACING "Encodes frames as interlaced - only use if your source contains interlacing artifacts (i.e. fields instead of progressive frames)" IDC_OVERDEG "How much of the overflow the codec can eat into during oversized sections.\nLarger values will prevent oversized files better, but will also spoil quantizer distribution more." IDC_MAXBFRAMES "Maximum number of sequential B-frames." IDC_BQUANTRATIO "Ratio used to calculate the B-frame quantizer.\n\nB-VOP quant = (AVG(past VOP quant, future VOP quant) * quant ratio + quant offset)" IDC_OVERIMP "How much of the overflow the codec can eat into during undersized sections.\nLarger values will prevent undersized files better, but will also spoil quantizer distribution more." IDC_CBR_REACTIONDELAY "Determines how slowly it will adjust the current encoding quality based upon scene intensity - this has the strongest influence on quality" IDC_CBR_AVERAGINGPERIOD "Determines how slowly it adapts to the current adjusted quality" IDC_CBR_BUFFER "Provides said number of frames' worth of buffer between the adjusted encoding quality and lowest possible quality" IDC_PACKED "When enabled, the P-frames and B-frames are packed together in one bitstream. This permits decoding without delay.\ne.g. [I] [PB] [B] [empty] [PB] [B] [empty] [P]\n\nPacked bitstreams were introduced in DivX 5.01." END STRINGTABLE BEGIN IDC_FRAMEDROP "Frame dropping ratio. 0 = no frame dropping .. 100 = drop all frames." IDC_KFREDUCTION "Reduction of bitrate for the first consecutive I-frames. The last I-frame will get treated normally." IDC_GMC "Use Global Motion Compensation." END STRINGTABLE BEGIN IDC_PROFILE "Restrict the usage of MPEG-4 tools and limit the encoded bitrate such that bitstreams are compatible with hardware decoders." 
IDC_MOTION "Higher settings give higher-quality results, at the cost of slower encoding." IDC_QUANTTYPE "H.263 smooths the image whereas MPEG (slightly slower) sharpens.\nCustom lets you define your own matrix." IDC_FOURCC "Choose what you would like the AVI to identify itself as" IDC_MAXKEY "Maximum number of frames allowed between I-frames" IDC_LUMMASK "Turns on Lumi masking - applies more compression to dark/light areas that the eye can't notice easily" IDC_MINIQUANT "Minimum quantizer allowed for I-frames. Only functional in 2-pass second pass." IDC_MAXIQUANT "Maximum quantizer allowed for I-frames. Only functional in 2-pass second pass." IDC_MINPQUANT "Minimum quantizer allowed for P-frames." IDC_MAXPQUANT "Maximum quantizer allowed for P-frames." IDC_MINBQUANT "Minimum quantizer allowed for B-frames, BEFORE ratio/offset scaling" IDC_MAXBQUANT "Maximum quantizer allowed for B-frames, BEFORE ratio/offset scaling" IDC_QUANTMATRIX "Define your own MPEG quantization matrices. Quantization type must be set to ""Custom"" to affect encoding." IDC_KFBOOST "A value of 20 will give 20% more bits to every I-frame" IDC_MINKEY "If keyframes are close together, it might be useful to decrease the bitrate of all keyframes except the last one.\nUse this to define how close keyframes must be to be reduced." IDC_DISCARD1PASS "Check this if you would like to skip the storage of the 1st pass output. It is often very large." 
END STRINGTABLE BEGIN IDC_CURVECOMPH "The higher this value, the more bits get taken from frames larger than the average size, and redistributed to others" IDC_CURVECOMPL "The higher this value, the more bits get assigned to frames below the average frame size" IDC_STATS1 "Location for 1st pass stats file to be saved to" IDC_STATS2 "Location for 2nd pass curve stats to be loaded from" END STRINGTABLE BEGIN IDC_CPU_AUTO "Enable Xvid's internal CPU detection" IDC_CPU_FORCE "Override Xvid's internal CPU detection (not recommended)" END STRINGTABLE BEGIN IDC_LOAD "Load a pair of custom intra/inter matrices" IDC_SAVE "Save the current intra/inter matrices to a file" END STRINGTABLE BEGIN IDC_QPEL "Use Quarter PixEL resolution for encoding for a more precise motion compensation" IDC_CHROMAME "Use chroma information to detect motion (slower)." IDC_BQUANTOFFSET "B-frame quantizer offset from last P-frame quantizer; refer to B-frame quant ratio (above)" IDC_VHQ "VHQ enables an additional search process to increase quality (much slower)." END STRINGTABLE BEGIN IDC_ZONE_CHROMAOPT "Interpolates colours in bright/dark areas for achieving a nicer edge impression" IDC_ZONE_BVOPTHRESHOLD "Change the amount of b-frames in this zone. Recommended values are between -20 (almost no b-vops) and 30 (many b-vops).\nThe hardcoded maximum in profile/level will never be exceeded" IDC_LEVEL_PROFILE "Restrict the usage of MPEG-4 tools and limit the encoded bitrate such that bitstreams are compatible with hardware decoders." IDC_LEVEL_WIDTH "Suggested VOP width (pixels)" IDC_LEVEL_HEIGHT "Suggested VOP height (pixels)" END STRINGTABLE BEGIN IDC_CARTOON "Enables special motion estimation features for cartoons/anime." IDC_OVERFLOW_CONTROL_STRENGTH "0=Default from core (let Xvid decide). Else overflow payback percent per frame. Higher value will meet target file size better, but will also spoil quantizer distribution more." 
IDC_ASPECT_RATIO "Display aspect ratio is used to scale the video on playback (anamorphic encoding).\n\nThe default 1:1 means no scaling is necessary." END STRINGTABLE BEGIN IDC_PROFILE_PROFILE "Restrict the usage of MPEG-4 tools and limit the encoded bitrate such that bitstreams are compatible with hardware decoders." IDC_BVOP_THRESHOLD "Change the amount of B-frames in this zone. Recommended values are between -20 (almost no B-VOPs) and 30 (many B-VOPs).\nThe hardcoded maximum in profile/level will never be exceeded" END STRINGTABLE BEGIN IDC_LEVEL_FPS "Suggested VOP rate (frames-per-second)" IDC_LEVEL_VMV "Video Memory Verifier (VMV):\n\nThe maximum number of macroblocks permitted per VOP." IDC_LEVEL_VCV "Video Complexity Verifier (VCV):\nThe maximum number of macroblocks decoded per second." IDC_LEVEL_VBV "Video Buffer Verifier (VBV):\n\nThe maximum size of the video decoder buffer. The encoded bitstream must not overflow or underflow this buffer." IDC_LEVEL_BITRATE "Maximum instantaneous bitrate." IDC_BITRATE "The target AVI bitrate, file size, or quantizer." IDC_TRELLISQUANT "Advanced, high quality quantization mode" IDC_BITRATE_S "Toggle between quantizer, target bitrate, target file size" END STRINGTABLE BEGIN IDC_TURBO "Faster motion estimation for B-frames and quarterpel" IDC_BITRATE_TSIZE "Target file or media size" END STRINGTABLE BEGIN IDC_BITRATE_SSIZE "The file size of subtitles or other data files" IDC_BITRATE_SSELECT "Select file size from existing subtitle file" IDC_BITRATE_COVERHEAD "Calculated container format overhead (kbytes)." IDC_BITRATE_VRATE "Calculated average video bitrate (kilobits-per-second)\nNote that ""desired bitrate"" setting in main window includes AVI overhead, so will be larger than this value." IDC_BITRATE_VSIZE "Calculated video size.\nNote that ""desired filesize"" setting in main window includes AVI overhead, so will be larger than this value." 
IDC_BITRATE_ARATE "Audio bitrate (kilobits-per-second)" IDC_BITRATE_ASELECT "Select file size from existing audio file" END STRINGTABLE BEGIN IDC_CLOSEDGOV "Closes every group-of-pictures before encoding new keyframe." IDC_ZONE_WEIGHT "Change quality of this zone relative to other zones. It's not recommended to go below 0,2" IDC_ZONE_QUANT "Fix this zone's quality to desired quant" IDC_ZONE_GREYSCALE "Don't code colour information in this zone. You also have to force a keyframe, or old colour information will stay" END STRINGTABLE BEGIN IDC_ZONE_FORCEIVOP "Force a keyframe at the beginning of the zone" END ///////////////////////////////////////////////////////////////////////////// // // Icon // // Icon with lowest ID value placed first to ensure application icon // remains consistent on all systems. IDI_MOBILE ICON "mobile_40.ico" IDI_HOME ICON "home_40.ico" IDI_HD1080 ICON "hd1080_40.ico" IDI_HD720 ICON "hd720_40.ico" #endif // Neutral resources ///////////////////////////////////////////////////////////////////////////// #ifndef APSTUDIO_INVOKED ///////////////////////////////////////////////////////////////////////////// // // Generated from the TEXTINCLUDE 3 resource. // ///////////////////////////////////////////////////////////////////////////// #endif // not APSTUDIO_INVOKED xvidcore/vfw/src/codec.h0000664000076500007650000000702611564705453016321 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VIDEO CODEC * - VFW codec header - * * Copyright(C) Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. 
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: codec.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _CODEC_H_ #define _CODEC_H_ #include <vfw.h> #include "config.h" #include "status.h" #define XVID_NAME_L L"XVID" #define XVID_DESC_L L"Xvid MPEG-4 Codec" #define FOURCC_XVID mmioFOURCC('X','V','I','D') #define FOURCC_DIVX mmioFOURCC('D','I','V','X') #define FOURCC_DX50 mmioFOURCC('D','X','5','0') #define FOURCC_MP4V mmioFOURCC('M','P','4','V') #define FOURCC_xvid mmioFOURCC('x','v','i','d') #define FOURCC_divx mmioFOURCC('d','i','v','x') #define FOURCC_dx50 mmioFOURCC('d','x','5','0') #define FOURCC_mp4v mmioFOURCC('m','p','4','v') /* yuyv 4:2:2 16bit, y-u-y-v, packed */ #define FOURCC_YUYV mmioFOURCC('Y','U','Y','V') #define FOURCC_YUY2 mmioFOURCC('Y','U','Y','2') /* yvyu 4:2:2 16bit, y-v-y-u, packed */ #define FOURCC_YVYU mmioFOURCC('Y','V','Y','U') /* uyvy 4:2:2 16bit, u-y-v-y, packed */ #define FOURCC_UYVY mmioFOURCC('U','Y','V','Y') /* i420 y-u-v, planar */ #define FOURCC_I420 mmioFOURCC('I','4','2','0') #define FOURCC_IYUV mmioFOURCC('I','Y','U','V') /* yv12 y-v-u, planar */ #define FOURCC_YV12 mmioFOURCC('Y','V','1','2') typedef struct { CONFIG config; /* decoder */ void * dhandle; /* encoder */ void * ehandle; unsigned int fincr; unsigned int fbase; status_t status; /* encoder min keyframe internal */ int framenum; int keyspacing; HINSTANCE m_hdll; int (*xvid_global_func)(void *handle, int opt, void *param1, void *param2); int (*xvid_encore_func)(void *handle, int opt, void *param1, void 
*param2); int (*xvid_decore_func)(void *handle, int opt, void *param1, void *param2); xvid_plugin_func *xvid_plugin_single_func; xvid_plugin_func *xvid_plugin_2pass1_func; xvid_plugin_func *xvid_plugin_2pass2_func; xvid_plugin_func *xvid_plugin_lumimasking_func; xvid_plugin_func *xvid_plugin_psnr_func; } CODEC; LRESULT compress_query(CODEC *, BITMAPINFO *, BITMAPINFO *); LRESULT compress_get_format(CODEC *, BITMAPINFO *, BITMAPINFO *); LRESULT compress_get_size(CODEC *, BITMAPINFO *, BITMAPINFO *); LRESULT compress_frames_info(CODEC *, ICCOMPRESSFRAMES *); LRESULT compress_begin(CODEC *, BITMAPINFO *, BITMAPINFO *); LRESULT compress_end(CODEC *); LRESULT compress(CODEC *, ICCOMPRESS *); LRESULT decompress_query(CODEC *, BITMAPINFO *, BITMAPINFO *); LRESULT decompress_get_format(CODEC *, BITMAPINFO *, BITMAPINFO *); LRESULT decompress_begin(CODEC *, BITMAPINFO *, BITMAPINFO *); LRESULT decompress_end(CODEC *); LRESULT decompress(CODEC *, ICDECOMPRESS *); extern int pp_brightness, pp_dy, pp_duv, pp_fe, pp_dry, pp_druv; /* decoder options */ #endif /* _CODEC_H_ */ xvidcore/vfw/src/driverproc.c0000664000076500007650000002036211564705453017414 0ustar xvidbuildxvidbuild/***************************************************************************** * * XVID MPEG-4 VFW FRONTEND * - driverproc main - * * Copyright(C) Peter Ross * * This program is free software ; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation ; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY ; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: driverproc.c 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #include <windows.h> #include <vfw.h> #include "vfwext.h" #include "debug.h" #include "codec.h" #include "config.h" #include "status.h" #include "resource.h" static int clean_dll_bindings(CODEC* codec); INT_PTR WINAPI DllMain( HANDLE hModule, DWORD ul_reason_for_call, LPVOID lpReserved) { g_hInst = (HINSTANCE) hModule; return TRUE; } /* __declspec(dllexport) */ LRESULT WINAPI DriverProc( DWORD_PTR dwDriverId, HDRVR hDriver, UINT uMsg, LPARAM lParam1, LPARAM lParam2) { CODEC * codec = (CODEC *)dwDriverId; switch(uMsg) { /* driver primitives */ case DRV_LOAD : case DRV_FREE : return DRVCNF_OK; case DRV_OPEN : DPRINTF("DRV_OPEN"); { ICOPEN * icopen = (ICOPEN *)lParam2; if (icopen != NULL && icopen->fccType != ICTYPE_VIDEO) { return DRVCNF_CANCEL; } codec = malloc(sizeof(CODEC)); if (codec == NULL) { if (icopen != NULL) { icopen->dwError = ICERR_MEMORY; } return 0; } memset(codec, 0, sizeof(CODEC)); codec->status.hDlg = NULL; codec->config.ci_valid = 0; codec->ehandle = codec->dhandle = NULL; codec->fbase = 25; codec->fincr = 1; config_reg_get(&codec->config); #if 0 /* bad things happen if this piece of code is activated */ if (lstrcmp(XVID_BUILD, codec->config.build)) { config_reg_default(&codec->config); } #endif if (icopen != NULL) { icopen->dwError = ICERR_OK; } return (LRESULT)codec; } case DRV_CLOSE : DPRINTF("DRV_CLOSE"); /* compress_end/decompress_end don't always get called */ compress_end(codec); decompress_end(codec); clean_dll_bindings(codec); status_destroy_always(&codec->status); free(codec); return DRVCNF_OK; case DRV_DISABLE : case DRV_ENABLE : return DRVCNF_OK; case DRV_INSTALL : case DRV_REMOVE : return DRVCNF_OK; case 
DRV_QUERYCONFIGURE : case DRV_CONFIGURE : return DRVCNF_CANCEL; /* info */ case ICM_GETINFO : DPRINTF("ICM_GETINFO"); if (lParam1 && lParam2 >= sizeof(ICINFO)) { ICINFO *icinfo = (ICINFO *)lParam1; icinfo->fccType = ICTYPE_VIDEO; icinfo->fccHandler = FOURCC_XVID; icinfo->dwFlags = VIDCF_FASTTEMPORALC | VIDCF_FASTTEMPORALD | VIDCF_COMPRESSFRAMES; icinfo->dwVersion = 0; #if !defined(ICVERSION) #define ICVERSION 0x0104 #endif icinfo->dwVersionICM = ICVERSION; wcscpy(icinfo->szName, XVID_NAME_L); wcscpy(icinfo->szDescription, XVID_DESC_L); return lParam2; /* size of struct */ } return 0; /* error */ /* state control */ case ICM_ABOUT : DPRINTF("ICM_ABOUT"); DialogBoxParam(g_hInst, MAKEINTRESOURCE(IDD_ABOUT), (HWND)lParam1, about_proc, 0); return ICERR_OK; case ICM_CONFIGURE : DPRINTF("ICM_CONFIGURE"); if (lParam1 != -1) { CONFIG temp; codec->config.save = FALSE; memcpy(&temp, &codec->config, sizeof(CONFIG)); DialogBoxParam(g_hInst, MAKEINTRESOURCE(IDD_MAIN), (HWND)lParam1, main_proc, (LPARAM)&temp); if (temp.save) { memcpy(&codec->config, &temp, sizeof(CONFIG)); config_reg_set(&codec->config); } } return ICERR_OK; case ICM_GETSTATE : DPRINTF("ICM_GETSTATE"); if ((void*)lParam1 == NULL) { return sizeof(CONFIG); } memcpy((void*)lParam1, &codec->config, sizeof(CONFIG)); return ICERR_OK; case ICM_SETSTATE : DPRINTF("ICM_SETSTATE"); if ((void*)lParam1 == NULL) { DPRINTF("ICM_SETSTATE : DEFAULT"); config_reg_get(&codec->config); return 0; } memcpy(&codec->config,(void*)lParam1, sizeof(CONFIG)); return 0; /* sizeof(CONFIG); */ /* not sure the difference, private/public data? 
*/ case ICM_GET : case ICM_SET : return ICERR_OK; /* older-style config */ case ICM_GETDEFAULTQUALITY : case ICM_GETQUALITY : case ICM_SETQUALITY : case ICM_GETBUFFERSWANTED : case ICM_GETDEFAULTKEYFRAMERATE : return ICERR_UNSUPPORTED; /* compressor */ case ICM_COMPRESS_QUERY : DPRINTF("ICM_COMPRESS_QUERY"); return compress_query(codec, (BITMAPINFO *)lParam1, (BITMAPINFO *)lParam2); case ICM_COMPRESS_GET_FORMAT : DPRINTF("ICM_COMPRESS_GET_FORMAT"); return compress_get_format(codec, (BITMAPINFO *)lParam1, (BITMAPINFO *)lParam2); case ICM_COMPRESS_GET_SIZE : DPRINTF("ICM_COMPRESS_GET_SIZE"); return compress_get_size(codec, (BITMAPINFO *)lParam1, (BITMAPINFO *)lParam2); case ICM_COMPRESS_FRAMES_INFO : DPRINTF("ICM_COMPRESS_FRAMES_INFO"); return compress_frames_info(codec, (ICCOMPRESSFRAMES *)lParam1); case ICM_COMPRESS_BEGIN : DPRINTF("ICM_COMPRESS_BEGIN"); return compress_begin(codec, (BITMAPINFO *)lParam1, (BITMAPINFO *)lParam2); case ICM_COMPRESS_END : DPRINTF("ICM_COMPRESS_END"); return compress_end(codec); case ICM_COMPRESS : DPRINTF("ICM_COMPRESS"); return compress(codec, (ICCOMPRESS *)lParam1); /* decompressor */ case ICM_DECOMPRESS_QUERY : DPRINTF("ICM_DECOMPRESS_QUERY"); return decompress_query(codec, (BITMAPINFO *)lParam1, (BITMAPINFO *)lParam2); case ICM_DECOMPRESS_GET_FORMAT : DPRINTF("ICM_DECOMPRESS_GET_FORMAT"); return decompress_get_format(codec, (BITMAPINFO *)lParam1, (BITMAPINFO *)lParam2); case ICM_DECOMPRESS_BEGIN : DPRINTF("ICM_DECOMPRESS_BEGIN"); return decompress_begin(codec, (BITMAPINFO *)lParam1, (BITMAPINFO *)lParam2); case ICM_DECOMPRESS_END : DPRINTF("ICM_DECOMPRESS_END"); return decompress_end(codec); case ICM_DECOMPRESS : DPRINTF("ICM_DECOMPRESS"); return decompress(codec, (ICDECOMPRESS *)lParam1); case ICM_DECOMPRESS_GET_PALETTE : case ICM_DECOMPRESS_SET_PALETTE : case ICM_DECOMPRESSEX_QUERY: case ICM_DECOMPRESSEX_BEGIN: case ICM_DECOMPRESSEX_END: case ICM_DECOMPRESSEX: return ICERR_UNSUPPORTED; /* VFWEXT entry point */ case 
ICM_USER+0x0fff : if (lParam1 == VFWEXT_CONFIGURE_INFO) { VFWEXT_CONFIGURE_INFO_T * info = (VFWEXT_CONFIGURE_INFO_T*)lParam2; DPRINTF("%i %i %i %i %i %i", info->ciWidth, info->ciHeight, info->ciRate, info->ciScale, info->ciActiveFrame, info->ciFrameCount); codec->config.ci_valid = 1; memcpy(&codec->config.ci, (void*)lParam2, sizeof(VFWEXT_CONFIGURE_INFO_T)); return ICERR_OK; } return ICERR_UNSUPPORTED; default: if (uMsg < DRV_USER) return DefDriverProc(dwDriverId, hDriver, uMsg, lParam1, lParam2); else return ICERR_UNSUPPORTED; } } void WINAPI Configure(HWND hwnd, HINSTANCE hinst, LPTSTR lpCmdLine, int nCmdShow) { LRESULT dwDriverId; dwDriverId = (LRESULT) DriverProc(0, 0, DRV_OPEN, 0, 0); if (dwDriverId != (LRESULT)NULL) { if (lstrcmpi(lpCmdLine, "about")==0) { DriverProc(dwDriverId, 0, ICM_ABOUT, (LPARAM)GetDesktopWindow(), 0); }else{ DriverProc(dwDriverId, 0, ICM_CONFIGURE, (LPARAM)GetDesktopWindow(), 0); } DriverProc(dwDriverId, 0, DRV_CLOSE, 0, 0); } } static int clean_dll_bindings(CODEC* codec) { if(codec->m_hdll) { FreeLibrary(codec->m_hdll); codec->m_hdll = NULL; codec->xvid_global_func = NULL; codec->xvid_encore_func = NULL; codec->xvid_decore_func = NULL; codec->xvid_plugin_single_func = NULL; codec->xvid_plugin_2pass1_func = NULL; codec->xvid_plugin_2pass2_func = NULL; codec->xvid_plugin_lumimasking_func = NULL; codec->xvid_plugin_psnr_func = NULL; } return 0; } xvidcore/vfw/src/home_40.ico0000664000076500007650000002743611507207323017017 0ustar xvidbuildxvidbuild
xvidcore/vfw/src/vfwext.h0000664000076500007650000000307211564705453016564 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 *  XVID MPEG-4 VIDEO CODEC
 *  - experimental vfw-api-extensions -
 *
 *  Copyright(C) Peter Ross
 *
 *  This program is free software ; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation ; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY ; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program ; if not, write to the Free Software
 *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * $Id: vfwext.h 1985 2011-05-18 09:02:35Z Isibaar $
 *
 ****************************************************************************/
#ifndef _VFWEXT_H_
#define _VFWEXT_H_

/* VFWEXT */

#define VFWEXT_FOURCC	0xFFFFFFFF

typedef struct {
	DWORD ciSize;			/* structure size */
	LONG ciWidth;			/* frame width pixels */
	LONG ciHeight;			/* frame height pixels */
	DWORD ciRate;			/* frame rate/scale */
	DWORD ciScale;
	LONG ciActiveFrame;		/* currently selected frame# */
	LONG ciFrameCount;		/* total frames */
} VFWEXT_CONFIGURE_INFO_T;
#define VFWEXT_CONFIGURE_INFO	1

#endif /* _VFWEXT_H_ */
xvidcore/vfw/src/XviD_logo.bmp0000664000076500007650000001140610454767660017467 0ustar xvidbuildxvidbuild
xvidcore/vfw/src/resource.h0000664000076500007650000004657111564705453017103 0ustar xvidbuildxvidbuild
//{{NO_DEPENDENCIES}}
// Microsoft Visual C++ generated include file.
// Used by resource.rc // #define IDC_LICENSE 3 #define IDD_MAIN 100 #define IDD_GLOBAL 101 #define IDD_QUANT 102 #define IDD_2PASS 103 #define IDD_RC_2PASS2 104 #define IDD_CREDITS 105 #define IDD_CPU 106 #define IDD_ENC 107 #define IDD_QUANTMATRIX 108 #define IDD_ABOUT 109 #define IDD_2PASSALT 110 #define IDD_RC_2PASS2_ALT 111 #define IDD_POSTPROC 112 #define IDD_RC_CBR 113 #define IDD_PROFILE 114 #define IDD_RC_2PASS1 116 #define IDD_MOTION 117 #define IDD_TOOLS 120 #define IDD_ZONE 121 #define IDD_LEVEL 122 #define IDD_DEC 124 #define IDD_STATUS 125 #define IDD_AR 126 #define IDD_BITRATE 127 #define IDD_LICENSE 129 #define IDR_GPL 131 #define IDD_COMMON 132 #define IDI_MOBILE 135 #define IDI_HOME 136 #define IDI_HD720 137 #define IDI_HD1080 138 #define WIZ_CYDLG 140 #define PROP_MIN_CYDLG 144 #define PROP_SM_CYDLG 188 #define PROP_MIN_CXDLG 212 #define PROP_MED_CYDLG 215 #define PROP_LG_CYDLG 218 #define PROP_MED_CXDLG 227 #define PROP_LG_CXDLG 252 #define WIZ_CXDLG 276 #define IDC_MODE 1000 #define IDC_SLIDER 1001 #define IDC_QUALITY 1002 #define IDC_QUALITY_ADV 1005 #define IDC_OTHER 1006 #define IDC_DEFAULTS 1007 #define IDC_PROFILE 1008 #define IDC_MOTION 1009 #define IDC_QUANTTYPE 1010 #define IDC_FOURCC 1011 #define IDC_MAXKEY 1012 #define IDC_LUMMASK 1013 #define IDC_MINIQUANT 1014 #define IDC_MAXIQUANT 1015 #define IDC_MINPQUANT 1016 #define IDC_MAXPQUANT 1017 #define IDC_MINBQUANT 1018 #define IDC_MAXBQUANT 1019 #define IDC_QUANTMATRIX 1020 #define IDC_KFBOOST 1021 #define IDC_MINKEY 1022 #define IDC_DISCARD1PASS 1023 #define IDC_DUMMY2PASS 1024 #define IDC_CURVECOMPH 1025 #define IDC_CURVECOMPL 1026 #define IDC_PAYBACK 1027 #define IDC_PAYBACKBIAS 1028 #define IDC_PAYBACKPROP 1029 #define IDC_STATS1 1030 #define IDC_STATS2 1031 #define IDC_STATS_BROWSE 1032 #define IDC_STATS2_BROWSE 1033 #define IDC_CREDITS_START 1034 #define IDC_CREDITS_START_BEGIN 1035 #define IDC_CREDITS_START_END 1036 #define IDC_CREDITS_END 1037 #define IDC_CREDITS_END_START 1038 
#define IDC_CREDITS_END_BEGIN 1039 #define IDC_CREDITS_END_END 1040 #define IDC_CREDITS_RATE_RADIO 1041 #define IDC_CREDITS_RATE 1042 #define IDC_CREDITS_QUANT_RADIO 1043 #define IDC_CREDITS_QUANTI 1044 #define IDC_CREDITS_QUANT_STATIC 1045 #define IDC_CREDITS_QUANTP 1046 #define IDC_CREDITS_SIZE_RADIO 1047 #define IDC_CREDITS_START_SIZE 1048 #define IDC_CREDITS_END_STATIC 1049 #define IDC_CREDITS_END_SIZE 1050 #define IDC_CPU_AUTO 1051 #define IDC_CPU_FORCE 1052 #define IDC_CPU_MMX 1053 #define IDC_CPU_MMXEXT 1054 #define IDC_CPU_SSE 1055 #define IDC_CPU_SSE2 1056 #define IDC_CPU_3DNOW 1057 #define IDC_CPU_3DNOWEXT 1058 #define IDC_LOAD 1059 #define IDC_CPU_SSE3 1059 #define IDC_SAVE 1060 #define IDC_CPU_SSE4 1060 #define IDC_WEBSITE 1061 #define IDC_BUILD 1062 #define IDC_CORE 1063 #define IDC_QINTRA00 1064 #define IDC_QINTRA01 1065 #define IDC_QINTRA02 1066 #define IDC_QINTRA03 1067 #define IDC_QINTRA04 1068 #define IDC_QINTRA05 1069 #define IDC_QINTRA06 1070 #define IDC_QINTRA07 1071 #define IDC_QINTRA08 1072 #define IDC_QINTRA09 1073 #define IDC_QINTRA10 1074 #define IDC_QINTRA11 1075 #define IDC_QINTRA12 1076 #define IDC_QINTRA13 1077 #define IDC_QINTRA14 1078 #define IDC_QINTRA15 1079 #define IDC_QINTRA16 1080 #define IDC_QINTRA17 1081 #define IDC_QINTRA18 1082 #define IDC_QINTRA19 1083 #define IDC_QINTRA20 1084 #define IDC_QINTRA21 1085 #define IDC_QINTRA22 1086 #define IDC_QINTRA23 1087 #define IDC_QINTRA24 1088 #define IDC_QINTRA25 1089 #define IDC_QINTRA26 1090 #define IDC_QINTRA27 1091 #define IDC_QINTRA28 1092 #define IDC_QINTRA29 1093 #define IDC_QINTRA30 1094 #define IDC_QINTRA31 1095 #define IDC_QINTRA32 1096 #define IDC_QINTRA33 1097 #define IDC_QINTRA34 1098 #define IDC_QINTRA35 1099 #define IDC_QINTRA36 1100 #define IDC_QINTRA37 1101 #define IDC_QINTRA38 1102 #define IDC_QINTRA39 1103 #define IDC_QINTRA40 1104 #define IDC_QINTRA41 1105 #define IDC_QINTRA42 1106 #define IDC_QINTRA43 1107 #define IDC_QINTRA44 1108 #define IDC_QINTRA45 1109 #define 
IDC_QINTRA46 1110 #define IDC_QINTRA47 1111 #define IDC_QINTRA48 1112 #define IDC_QINTRA49 1113 #define IDC_QINTRA50 1114 #define IDC_QINTRA51 1115 #define IDC_QINTRA52 1116 #define IDC_QINTRA53 1117 #define IDC_QINTRA54 1118 #define IDC_QINTRA55 1119 #define IDC_QINTRA56 1120 #define IDC_QINTRA57 1121 #define IDC_QINTRA58 1122 #define IDC_QINTRA59 1123 #define IDC_QINTRA60 1124 #define IDC_QINTRA61 1125 #define IDC_QINTRA62 1126 #define IDC_QINTRA63 1127 #define IDC_QINTER00 1128 #define IDC_QINTER01 1129 #define IDC_QINTER02 1130 #define IDC_QINTER03 1131 #define IDC_QINTER04 1132 #define IDC_QINTER05 1133 #define IDC_QINTER06 1134 #define IDC_QINTER07 1135 #define IDC_QINTER08 1136 #define IDC_QINTER09 1137 #define IDC_QINTER10 1138 #define IDC_QINTER11 1139 #define IDC_QINTER12 1140 #define IDC_QINTER13 1141 #define IDC_QINTER14 1142 #define IDC_QINTER15 1143 #define IDC_QINTER16 1144 #define IDC_QINTER17 1145 #define IDC_QINTER18 1146 #define IDC_QINTER19 1147 #define IDC_QINTER20 1148 #define IDC_QINTER21 1149 #define IDC_QINTER22 1150 #define IDC_QINTER23 1151 #define IDC_QINTER24 1152 #define IDC_QINTER25 1153 #define IDC_QINTER26 1154 #define IDC_QINTER27 1155 #define IDC_QINTER28 1156 #define IDC_QINTER29 1157 #define IDC_QINTER30 1158 #define IDC_QINTER31 1159 #define IDC_QINTER32 1160 #define IDC_QINTER33 1161 #define IDC_QINTER34 1162 #define IDC_QINTER35 1163 #define IDC_QINTER36 1164 #define IDC_QINTER37 1165 #define IDC_QINTER38 1166 #define IDC_QINTER39 1167 #define IDC_QINTER40 1168 #define IDC_QINTER41 1169 #define IDC_QINTER42 1170 #define IDC_QINTER43 1171 #define IDC_QINTER44 1172 #define IDC_QINTER45 1173 #define IDC_QINTER46 1174 #define IDC_QINTER47 1175 #define IDC_QINTER48 1176 #define IDC_QINTER49 1177 #define IDC_QINTER50 1178 #define IDC_QINTER51 1179 #define IDC_QINTER52 1180 #define IDC_QINTER53 1181 #define IDC_QINTER54 1182 #define IDC_QINTER55 1183 #define IDC_QINTER56 1184 #define IDC_QINTER57 1185 #define IDC_QINTER58 1186 
#define IDC_QINTER59 1187 #define IDC_QINTER60 1188 #define IDC_QINTER61 1189 #define IDC_QINTER62 1190 #define IDC_QINTER63 1191 #define IDC_USEALT 1193 #define IDC_USEAUTO 1194 #define IDC_AUTOSTR 1195 #define IDC_USEAUTOBONUS 1196 #define IDC_BONUSBIAS 1197 #define IDC_CURVETYPE 1198 #define IDC_ALTCURVEHIGH 1199 #define IDC_ALTCURVELOW 1201 #define IDC_MINQUAL 1202 #define IDC_INTERLACING 1203 #define IDC_OVERDEG 1204 #define IDC_MAXBFRAMES 1205 #define IDC_HINTFILE 1206 #define IDC_BQUANTRATIO 1207 #define IDC_HINT_BROWSE 1208 #define IDC_HINTEDME 1209 #define IDC_OVERIMP 1210 #define IDC_MAXBITRATE 1211 #define IDC_CBR_REACTIONDELAY 1212 #define IDC_CBR_AVERAGINGPERIOD 1213 #define IDC_CBR_BUFFER 1214 #define IDC_PACKED 1215 #define IDC_BSTATIC1 1216 #define IDC_MAXBFRAMES_S 1217 #define IDC_BQUANTRATIO_S 1218 #define IDC_DX50BVOP 1219 #define IDC_DEBUG 1220 #define IDC_NUMTHREADS 1221 #define IDC_NUMTHREADS_STATIC 1223 #define IDC_FRAMEDROP 1224 #define IDC_FRAMEDROP_STATIC 1225 #define IDC_KFTRESHOLD 1226 #define IDC_KFREDUCTION 1227 #define IDC_GREYSCALE 1228 #define IDC_CREDITS_GREYSCALE 1229 #define IDC_SPECIAL_BUILD 1230 #define IDC_GMC 1231 #define IDC_QPEL 1232 #define IDC_CHROMAME 1233 #define IDC_BQUANTOFFSET 1234 #define IDC_DEBLOCK_Y 1235 #define IDC_DEBLOCK_UV 1236 #define IDC_REDUCED 1237 #define IDC_VHQ 1238 #define IDC_CHROMA_OPT 1239 #define IDC_STATS 1240 #define IDC_MODE_ADV 1241 #define IDC_PROFILE_ADV 1242 #define IDC_ADD 1243 #define IDC_PROFILE_BVOP 1244 #define IDC_REMOVE 1245 #define IDC_EDIT 1246 #define IDC_PROFILE_MPEGQUANT 1247 #define IDC_BITRATE_CALC 1248 #define IDC_PROFILE_INTERLACE 1249 #define IDC_PROFILE_QPEL 1250 #define IDC_PROFILE_GMC 1251 #define IDC_PROFILE_REDUCED 1252 #define IDC_PROFILE_WIDTH 1253 #define IDC_PROFILE_HEIGHT 1254 #define IDC_PROFILE_FPS 1255 #define IDC_PROFILE_BITRATE 1256 #define IDC_PROFILE_VBV 1257 #define IDC_PROFILE_PROFILE 1258 #define IDC_PROFILE_VCV 1259 #define IDC_PROFILE_VMV 1260 #define 
IDC_BVOP 1261 #define IDC_BVOP_THRESHOLD 1262 #define IDC_BVOP_THRESHOLD_S 1263 #define IDC_BQUANTOFFSET_S 1265 #define IDC_CLOSEDGOV 1266 #define IDC_ZONES 1267 #define IDC_ZONE_FRAME 1268 #define IDC_ZONE_VALUE 1269 #define IDC_ZONE_BITRATE 1270 #define IDC_ZONE_WEIGHT 1274 #define IDC_ZONE_SLIDER 1275 #define IDC_ZONE_FETCH 1276 #define IDC_ZONE_MODE 1277 #define IDC_ZONE_QUANT 1278 #define IDC_ZONE_GREYSCALE 1279 #define IDC_ZONE_CHROMAOPT 1280 #define IDC_ZONE_BVOPTHRESHOLD_ENABLE 1281 #define IDC_ZONE_BVOPTHRESHOLD 1282 #define IDC_ZONE_MODE_WEIGHT 1285 #define IDC_ZONE_MODE_QUANT 1287 #define IDC_QUANTTYPE_S 1290 #define IDC_ZONE_MAX 1291 #define IDC_ZONE_MIN 1292 #define IDC_LEVEL_PROFILE 1293 #define IDC_LEVEL_WIDTH 1294 #define IDC_LEVEL_HEIGHT 1295 #define IDC_LEVEL_FPS 1296 #define IDC_LEVEL_VMV 1297 #define IDC_LEVEL_VCV 1298 #define IDC_LEVEL_VBV 1299 #define IDC_LEVEL_BITRATE 1300 #define IDC_BITRATE 1302 #define IDC_LEVEL_PEAKRATE 1302 #define IDC_BITRATE_MIN 1303 #define IDC_BITRATE_MAX 1304 #define IDC_ZONE_BVOPTHRESHOLD_S 1305 #define IDC_TRELLISQUANT 1306 #define IDC_BITRATE_S 1307 #define IDC_VOPDEBUG 1308 #define IDC_STATUS_GRAPH 1311 #define IDC_STATUS_IQ_MIN 1312 #define IDC_STATUS_IQ_MAX 1313 #define IDC_STATUS_PQ_MIN 1314 #define IDC_STATUS_PQ_MAX 1315 #define IDC_STATUS_BQ_MIN 1316 #define IDC_STATUS_BQ_MAX 1317 #define IDC_STATUS_Q_MIN 1318 #define IDC_STATUS_Q_MAX 1319 #define IDC_STATUS_IL_MIN 1320 #define IDC_STATUS_IL_MAX 1321 #define IDC_STATUS_IL_TOT 1322 #define IDC_STATUS_PL_MIN 1323 #define IDC_STATUS_PL_MAX 1324 #define IDC_STATUS_PL_TOT 1325 #define IDC_STATUS_BL_MIN 1326 #define IDC_STATUS_BL_MAX 1327 #define IDC_STATUS_BL_TOT 1328 #define IDC_STATUS_L_MIN 1329 #define IDC_STATUS_L_MAX 1330 #define IDC_STATUS_L_TOT 1331 #define IDC_STATUS_DESTROY 1332 #define IDC_STATUS_KBPS 1333 #define IDC_STATUS_IL_AVG 1334 #define IDC_DISPLAY_STATUS 1335 #define IDC_STATUS_PL_AVG 1336 #define IDC_STATUS_BL_AVG 1337 #define IDC_FULL1PASS 
1338 #define IDC_DEC_DY 1339 #define IDC_STATUS_L_AVG 1340 #define IDC_ZONE_FORCEIVOP 1341 #define IDC_STATUS_IQ_AVG 1342 #define IDC_STATUS_I_NUM 1343 #define IDC_STATUS_P_NUM 1344 #define IDC_STATUS_B_NUM 1345 #define IDC_STATUS_NUM 1346 #define IDC_CARTOON 1347 #define IDC_STATUS_PQ_AVG 1348 #define IDC_OVERFLOW_CONTROL_STRENGTH 1349 #define IDC_STATUS_BQ_AVG 1350 #define IDC_ASPECT_RATIO 1351 #define IDC_STATUS_Q_AVG 1352 #define IDC_AR 1353 #define IDC_PAR 1354 #define IDC_PARX 1355 #define IDC_PARY 1356 #define IDC_ARX 1357 #define IDC_ARY 1358 #define IDC_AR_DEFAULT 1364 #define IDC_AR_4_3 1365 #define IDC_AR_16_9 1366 #define IDC_AR_235_100 1368 #define IDC_TURBO 1369 #define IDC_DEC_DUV 1370 #define IDC_DEC_DR 1371 #define IDC_DEC_DRY 1371 #define IDC_DEBUGOUTPUT 1372 #define IDC_DEC_DRUV 1372 #define IDC_SHOWINTERNALS 1373 #define IDC_DEC_FE 1374 #define IDC_BITRATE_TSIZE 1375 #define IDC_BITRATE_SSIZE 1376 #define IDC_BITRATE_SSELECT 1377 #define IDC_BITRATE_CFORMAT 1378 #define IDC_BITRATE_COVERHEAD 1379 #define IDC_BITRATE_HOURS 1380 #define IDC_BITRATE_MINUTES 1381 #define IDC_BITRATE_SECONDS 1382 #define IDC_BITRATE_FPS 1383 #define IDC_BITRATE_VRATE 1384 #define IDC_BITRATE_VSIZE 1385 #define IDC_BITRATE_AFORMAT 1386 #define IDC_BITRATE_AMODE_RATE 1387 #define IDC_BITRATE_AMODE_SIZE 1388 #define IDC_BITRATE_ARATE 1389 #define IDC_BITRATE_ASIZE 1390 #define IDC_BITRATE_ASELECT 1391 #define IDC_BITRATE_ADV 1392 #define IDC_DEC_BRIGHTNESS 1393 #define IDC_TFF 1394 #define IDC_VHQ_BFRAME 1395 #define IDC_LICENSE_TEXT 1396 #define IDC_LEVEL_PEAKRATE_S 1397 #define IDC_LEVEL_BITRATE_S 1398 #define IDC_LEVEL_VBV_S 1399 #define IDC_LEVEL_VBV_G 1400 #define IDC_LEVEL_DIM_S 1401 #define IDC_LEVEL_VMV_S 1402 #define IDC_LEVEL_VCV_S 1403 #define IDC_LEVEL_LEVEL_G 1404 #define IDC_LUMMASK_S 1405 #define IDC_PROFILE_LOGO 1406 #define IDC_PROFILE_LABEL 1407 #define IDC_VHQ_METRIC 1408 #define IDC_SLICES 1409 // Next default values for new objects // #ifdef 
APSTUDIO_INVOKED
#ifndef APSTUDIO_READONLY_SYMBOLS
#define _APS_NEXT_RESOURCE_VALUE	139
#define _APS_NEXT_COMMAND_VALUE		40001
#define _APS_NEXT_CONTROL_VALUE		1410
#define _APS_NEXT_SYMED_VALUE		101
#endif
#endif
xvidcore/vfw/src/hd1080_40.ico0000664000076500007650000002743611507207323016773 0ustar xvidbuildxvidbuild
/*****************************************************************************
 *
 *  This program is free software ; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation ; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY ; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 *  GNU General Public License for more details.
* * You should have received a copy of the GNU General Public License * along with this program ; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * $Id: config.h 1985 2011-05-18 09:02:35Z Isibaar $ * ****************************************************************************/ #ifndef _CONFIG_H_ #define _CONFIG_H_ #include #include "vfwext.h" #include extern HINSTANCE g_hInst; /* small hack */ #ifndef IDC_HAND #define IDC_HAND MAKEINTRESOURCE(32649) #endif /* one kilobit */ #define CONFIG_KBPS 1000 /* min/max bitrate when not specified by profile */ #define DEFAULT_MIN_KBPS 16 #define DEFAULT_MAX_KBPS 20480 #define DEFAULT_QUANT 400 /* registry stuff */ #define XVID_REG_KEY HKEY_CURRENT_USER #define XVID_REG_PARENT "Software\\GNU" #define XVID_REG_CHILD "XviD" #define XVID_REG_CLASS "config" #define XVID_BUILD __TIME__ ", " __DATE__ #define XVID_WEBSITE "http://www.xvid.org/" #define XVID_SPECIAL_BUILD "Vanilla CVS Build" /* constants */ #define CONFIG_2PASS_FILE ".\\video.pass" /* codec modes */ #define RC_MODE_1PASS 0 #define RC_MODE_2PASS1 1 #define RC_MODE_2PASS2 2 #define RC_MODE_NULL 3 #define RC_ZONE_WEIGHT 0 #define RC_ZONE_QUANT 1 /* vhq modes */ #define VHQ_OFF 0 #define VHQ_MODE_DECISION 1 #define VHQ_LIMITED_SEARCH 2 #define VHQ_MEDIUM_SEARCH 3 #define VHQ_WIDE_SEARCH 4 /* quantizer modes */ #define QUANT_MODE_H263 0 #define QUANT_MODE_MPEG 1 #define QUANT_MODE_CUSTOM 2 #define MAX_ZONES 64 typedef struct { int frame; int type; int mode; int weight; int quant; unsigned int greyscale; unsigned int chroma_opt; unsigned int bvop_threshold; unsigned int cartoon_mode; } zone_t; /* this structure represents a quality preset. it encapsulates options from the motion and quantizer config pages. 
*/ #define QUALITY_GENERAL_STRING "General purpose" #define QUALITY_USER_STRING "(User defined)" typedef struct { char * name; /* motion */ int motion_search; int vhq_mode; int vhq_metric; int vhq_bframe; int chromame; int turbo; int max_key_interval; int frame_drop_ratio; /* quant */ int min_iquant; int max_iquant; int min_pquant; int max_pquant; int min_bquant; int max_bquant; int trellis_quant; } quality_t; typedef struct { /********** ATTENTION **********/ int mode; /* Vidomi directly accesses these vars */ int bitrate; int desired_size; /* please try to avoid modifications here */ char stats[MAX_PATH]; /*******************************/ int use_2pass_bitrate; /* use bitrate for 2pass2 (instead of desired size) */ int desired_quant; /* for one-pass constant quant */ /* profile */ char profile_name[MAX_PATH]; int profile; /* used internally; *not* written to registry */ /* quality preset */ char quality_name[MAX_PATH]; int quality; /* used internally; *not* written to registry */ int quant_type; BYTE qmatrix_intra[64]; BYTE qmatrix_inter[64]; int lum_masking; int interlacing; int tff; int qpel; int gmc; int use_bvop; int max_bframes; int bquant_ratio; int bquant_offset; int packed; int display_aspect_ratio; /* aspect ratio */ int ar_x, ar_y; /* picture aspect ratio */ int par_x, par_y; /* custom pixel aspect ratio */ int ar_mode; /* picture/pixel AR */ /* zones */ int num_zones; zone_t zones[MAX_ZONES]; int cur_zone; /* used internally; *not* written to registry */ /* single pass */ int rc_reaction_delay_factor; int rc_averaging_period; int rc_buffer; /* 2pass1 */ int discard1pass; /* 2pass2 */ int keyframe_boost; int kfthreshold; int kfreduction; int curve_compression_high; int curve_compression_low; int overflow_control_strength; int twopass_max_overflow_improvement; int twopass_max_overflow_degradation; /* bitrate calculator */ int target_size; int subtitle_size; int container_type; int hours; int minutes; int seconds; int fps; int audio_mode; int audio_type; 
	int audio_rate;
	int audio_size;

	/* user defined quality settings */
	quality_t quality_user;

	/* debug */
	int num_threads;
	int fourcc_used;
	int vop_debug;
	int debug;
	int display_status;
	int full1pass;

	DWORD cpu;

	int num_slices;

	/* internal */
	int ci_valid;
	VFWEXT_CONFIGURE_INFO_T ci;

	BOOL save;
} CONFIG;

typedef struct PROPSHEETINFO
{
	int idd;
	CONFIG * config;
} PROPSHEETINFO;

typedef struct REG_INT
{
	char* reg_value;
	int* config_int;
	int def;
} REG_INT;

typedef struct REG_STR
{
	char* reg_value;
	char* config_str;
	char* def;
} REG_STR;

#define PROFILE_ADAPTQUANT	0x00000001
#define PROFILE_BVOP		0x00000002
#define PROFILE_MPEGQUANT	0x00000004
#define PROFILE_INTERLACE	0x00000008
#define PROFILE_QPEL		0x00000010
#define PROFILE_GMC		0x00000020
#define PROFILE_4MV		0x00000040
#define PROFILE_PACKED		0x00000080
#define PROFILE_EXTRA		0x00000100
#define PROFILE_XVID		0x00000200
#define PROFILE_RESYNCMARKER	0x00000400

static const int PARS[][2] = {
	{1, 1},
	{12, 11},
	{10, 11},
	{16, 11},
	{40, 33},
	{0, 0},
};

typedef struct
{
	char * name;
	char * short_name;
	int id;				/* mpeg-4 profile id; iso/iec 14496-2:2001 table G-1 */
	int width;
	int height;
	int fps;
	int max_objects;
	int total_vmv_buffer_sz;	/* macroblock memory; when BVOPS=false, vmv = 2*vcv; when BVOPS=true, vmv = 3*vcv */
	int max_vmv_buffer_sz;		/* max macroblocks per vop */
	int vcv_decoder_rate;		/* macroblocks decoded per second */
	int max_acpred_mbs;		/* percentage */
	int max_vbv_size;		/* max vbv size (bits) 16368 bits */
	int max_video_packet_length;	/* bits */
	int max_bitrate;		/* bits per second */
	int vbv_peakrate;		/* max bits over any one-second period; 0=don't care */
	int xvid_max_bframes;		/* xvid: max consecutive bframes */
	unsigned int flags;
} profile_t;

extern const profile_t profiles[];

extern const quality_t quality_table[];
extern const int quality_table_num;	/* number of elements in quality table */

void config_reg_get(CONFIG * config);
void config_reg_set(CONFIG * config);
void sort_zones(zone_t * zones, int zone_num, 
int * sel);

INT_PTR CALLBACK main_proc(HWND, UINT, WPARAM, LPARAM);
INT_PTR CALLBACK about_proc(HWND, UINT, WPARAM, LPARAM);

#endif /* _CONFIG_H_ */
xvidcore/vfw/src/xvid.ico0000664000076500007650000002267611475640257016552 0ustar xvidbuildxvidbuild
xvidcore/vfw/src/w32api/0000775000076500007650000000000011566427761016200 5ustar xvidbuildxvidbuild
xvidcore/vfw/src/w32api/vfw.h0000664000076500007650000001266111564705453017154 0ustar xvidbuildxvidbuild
#ifndef _INC_VFW
#define _INC_VFW

#include
#ifndef aviTWOCC
#define aviTWOCC(C0,C1) 
((WORD)(BYTE)(C0)|((WORD)(BYTE)(C1) << 8)) #endif #ifndef ICTYPE_VIDEO #define ICTYPE_VIDEO mmioFOURCC('v', 'i', 'd', 'c') #define ICTYPE_AUDIO mmioFOURCC('a', 'u', 'd', 'c') #endif #ifndef ICERR_OK #define ICERR_OK 0L #define ICERR_DONTDRAW 1L #define ICERR_NEWPALETTE 2L #define ICERR_GOTOKEYFRAME 3L #define ICERR_STOPDRAWING 4L #define ICERR_UNSUPPORTED -1L #define ICERR_BADFORMAT -2L #define ICERR_MEMORY -3L #define ICERR_INTERNAL -4L #define ICERR_BADFLAGS -5L #define ICERR_BADPARAM -6L #define ICERR_BADSIZE -7L #define ICERR_BADHANDLE -8L #define ICERR_CANTUPDATE -9L #define ICERR_ABORT -10L #define ICERR_ERROR -100L #define ICERR_BADBITDEPTH -200L #define ICERR_BADIMAGESIZE -201L #define ICERR_CUSTOM -400L #endif #ifndef ICMODE_COMPRESS #define ICMODE_COMPRESS 1 #define ICMODE_DECOMPRESS 2 #define ICMODE_FASTDECOMPRESS 3 #define ICMODE_QUERY 4 #define ICMODE_FASTCOMPRESS 5 #define ICMODE_DRAW 8 #endif #define AVIIF_LIST 0x00000001L #define AVIIF_TWOCC 0x00000002L #define AVIIF_KEYFRAME 0x00000010L #define ICCOMPRESS_KEYFRAME 0x00000001L typedef struct { DWORD dwFlags; LPBITMAPINFOHEADER lpbiOutput; LPVOID lpOutput; LPBITMAPINFOHEADER lpbiInput; LPVOID lpInput; LPDWORD lpckid; LPDWORD lpdwFlags; LONG lFrameNum; DWORD dwFrameSize; DWORD dwQuality; LPBITMAPINFOHEADER lpbiPrev; LPVOID lpPrev; } ICCOMPRESS; #define ICCOMPRESSFRAMES_PADDING 0x00000001 typedef struct { DWORD dwFlags; LPBITMAPINFOHEADER lpbiOutput; LPARAM lOutput; LPBITMAPINFOHEADER lpbiInput; LPARAM lInput; LONG lStartFrame; LONG lFrameCount; LONG lQuality; LONG lDataRate; LONG lKeyRate; DWORD dwRate; DWORD dwScale; DWORD dwOverheadPerFrame; DWORD dwReserved2; LONG (CALLBACK *GetData)(LPARAM,LONG,LPVOID,LONG); LONG (CALLBACK *PutData)(LPARAM,LONG,LPVOID,LONG); } ICCOMPRESSFRAMES; #define ICDECOMPRESS_HURRYUP 0x80000000L #define ICDECOMPRESS_UPDATE 0x40000000L #define ICDECOMPRESS_PREROLL 0x20000000L #define ICDECOMPRESS_NULLFRAME 0x10000000L #define ICDECOMPRESS_NOTKEYFRAME 0x08000000L typedef 
struct { DWORD dwFlags; LPBITMAPINFOHEADER lpbiInput; LPVOID lpInput; LPBITMAPINFOHEADER lpbiOutput; LPVOID lpOutput; DWORD ckid; } ICDECOMPRESS; typedef struct { DWORD dwSize; DWORD fccType; DWORD fccHandler; DWORD dwVersion; DWORD dwFlags; LRESULT dwError; LPVOID pV1Reserved; LPVOID pV2Reserved; DWORD dnDevNode; } ICOPEN; #define ICM_USER (DRV_USER+0x0000) #define ICM_RESERVED ICM_RESERVED_LOW #define ICM_RESERVED_LOW (DRV_USER+0x1000) #define ICM_RESERVED_HIGH (DRV_USER+0x2000) #define ICM_GETSTATE (ICM_RESERVED+0) #define ICM_SETSTATE (ICM_RESERVED+1) #define ICM_GETINFO (ICM_RESERVED+2) #define ICM_CONFIGURE (ICM_RESERVED+10) #define ICM_ABOUT (ICM_RESERVED+11) #define ICM_GETERRORTEXT (ICM_RESERVED+12) #define ICM_GETFORMATNAME (ICM_RESERVED+20) #define ICM_ENUMFORMATS (ICM_RESERVED+21) #define ICM_GETDEFAULTQUALITY (ICM_RESERVED+30) #define ICM_GETQUALITY (ICM_RESERVED+31) #define ICM_SETQUALITY (ICM_RESERVED+32) #define ICM_SET (ICM_RESERVED+40) #define ICM_GET (ICM_RESERVED+41) #define ICM_FRAMERATE mmioFOURCC('F','r','m','R') #define ICM_KEYFRAMERATE mmioFOURCC('K','e','y','R') typedef struct { DWORD dwSize; DWORD fccType; DWORD fccHandler; DWORD dwFlags; DWORD dwVersion; DWORD dwVersionICM; WCHAR szName[16]; WCHAR szDescription[128]; WCHAR szDriver[128]; } ICINFO; #define VIDCF_QUALITY 0x0001 #define VIDCF_CRUNCH 0x0002 #define VIDCF_TEMPORAL 0x0004 #define VIDCF_COMPRESSFRAMES 0x0008 #define VIDCF_DRAW 0x0010 #define VIDCF_FASTTEMPORALC 0x0020 #define VIDCF_FASTTEMPORALD 0x0080 #define VIDCF_QUALITYTIME 0x0040 #define VIDCF_FASTTEMPORAL (VIDCF_FASTTEMPORALC|VIDCF_FASTTEMPORALD) #define ICVERSION 0x0104 #define ICM_COMPRESS_GET_FORMAT (ICM_USER+4) #define ICM_COMPRESS_GET_SIZE (ICM_USER+5) #define ICM_COMPRESS_QUERY (ICM_USER+6) #define ICM_COMPRESS_BEGIN (ICM_USER+7) #define ICM_COMPRESS (ICM_USER+8) #define ICM_COMPRESS_END (ICM_USER+9) #define ICM_DECOMPRESS_GET_FORMAT (ICM_USER+10) #define ICM_DECOMPRESS_QUERY (ICM_USER+11) #define 
ICM_DECOMPRESS_BEGIN (ICM_USER+12) #define ICM_DECOMPRESS (ICM_USER+13) #define ICM_DECOMPRESS_END (ICM_USER+14) #define ICM_DECOMPRESS_SET_PALETTE (ICM_USER+29) #define ICM_DECOMPRESS_GET_PALETTE (ICM_USER+30) #define ICM_DRAW_QUERY (ICM_USER+31) #define ICM_DRAW_BEGIN (ICM_USER+15) #define ICM_DRAW_GET_PALETTE (ICM_USER+16) #define ICM_DRAW_UPDATE (ICM_USER+17) #define ICM_DRAW_START (ICM_USER+18) #define ICM_DRAW_STOP (ICM_USER+19) #define ICM_DRAW_BITS (ICM_USER+20) #define ICM_DRAW_END (ICM_USER+21) #define ICM_DRAW_GETTIME (ICM_USER+32) #define ICM_DRAW (ICM_USER+33) #define ICM_DRAW_WINDOW (ICM_USER+34) #define ICM_DRAW_SETTIME (ICM_USER+35) #define ICM_DRAW_REALIZE (ICM_USER+36) #define ICM_DRAW_FLUSH (ICM_USER+37) #define ICM_DRAW_RENDERBUFFER (ICM_USER+38) #define ICM_DRAW_START_PLAY (ICM_USER+39) #define ICM_DRAW_STOP_PLAY (ICM_USER+40) #define ICM_DRAW_SUGGESTFORMAT (ICM_USER+50) #define ICM_DRAW_CHANGEPALETTE (ICM_USER+51) #define ICM_DRAW_IDLE (ICM_USER+52) #define ICM_GETBUFFERSWANTED (ICM_USER+41) #define ICM_GETDEFAULTKEYFRAMERATE (ICM_USER+42) #define ICM_DECOMPRESSEX_BEGIN (ICM_USER+60) #define ICM_DECOMPRESSEX_QUERY (ICM_USER+61) #define ICM_DECOMPRESSEX (ICM_USER+62) #define ICM_DECOMPRESSEX_END (ICM_USER+63) #define ICM_COMPRESS_FRAMES_INFO (ICM_USER+70) #define ICM_COMPRESS_FRAMES (ICM_USER+71) #define ICM_SET_STATUS_PROC (ICM_USER+72) #endif /* _INC_VFW */ xvidcore/vfw/src/codec.c0000664000076500007650000010177111564705453016316 0ustar xvidbuildxvidbuild/************************************************************************** * * XVID VFW FRONTEND * codec * * Copyright(C) Peter Ross * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. 
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 *
 * $Id: codec.c 1985 2011-05-18 09:02:35Z Isibaar $
 *
 *************************************************************************/

#include <windows.h>
#include <vfw.h>
#include <stdio.h>
#include "vfwext.h"

#include <xvid.h>
#include "debug.h"
#include "codec.h"
#include "status.h"

static const int pmvfast_presets[7] = {
	0,
	0,
	0,
	0,
	0 | XVID_ME_HALFPELREFINE16 | 0,
	0 | XVID_ME_HALFPELREFINE16 | 0 | XVID_ME_ADVANCEDDIAMOND16,
	XVID_ME_HALFPELREFINE16 | XVID_ME_EXTSEARCH16 | XVID_ME_HALFPELREFINE8 | 0 | XVID_ME_USESQUARES16
};

/* return xvid compatible colorspace, or XVID_CSP_NULL if failure */

static int get_colorspace(BITMAPINFOHEADER * hdr)
{
	/* rgb only: negative height specifies top down image */
	int rgb_flip = (hdr->biHeight < 0 ?
0 : XVID_CSP_VFLIP); switch(hdr->biCompression) { case BI_RGB : if (hdr->biBitCount == 16) { DPRINTF("RGB16 (RGB555)"); return rgb_flip | XVID_CSP_RGB555; } if (hdr->biBitCount == 24) { DPRINTF("RGB24"); return rgb_flip | XVID_CSP_BGR; } if (hdr->biBitCount == 32) { DPRINTF("RGB32"); return rgb_flip | XVID_CSP_BGRA; } DPRINTF("unsupported BI_RGB biBitCount=%i", hdr->biBitCount); return XVID_CSP_NULL; case BI_BITFIELDS : if (hdr->biSize >= sizeof(BITMAPV4HEADER)) { BITMAPV4HEADER * hdr4 = (BITMAPV4HEADER *)hdr; if (hdr4->bV4BitCount == 16 && hdr4->bV4RedMask == 0x7c00 && hdr4->bV4GreenMask == 0x3e0 && hdr4->bV4BlueMask == 0x1f) { DPRINTF("RGB555"); return rgb_flip | XVID_CSP_RGB555; } if(hdr4->bV4BitCount == 16 && hdr4->bV4RedMask == 0xf800 && hdr4->bV4GreenMask == 0x7e0 && hdr4->bV4BlueMask == 0x1f) { DPRINTF("RGB565"); return rgb_flip | XVID_CSP_RGB565; } DPRINTF("unsupported BI_BITFIELDS mode"); return XVID_CSP_NULL; } DPRINTF("unsupported BI_BITFIELDS/BITMAPHEADER combination"); return XVID_CSP_NULL; case FOURCC_I420 : case FOURCC_IYUV : DPRINTF("IYUY"); return XVID_CSP_I420; case FOURCC_YV12 : DPRINTF("YV12"); return XVID_CSP_YV12; case FOURCC_YUYV : case FOURCC_YUY2 : DPRINTF("YUY2"); return XVID_CSP_YUY2; case FOURCC_YVYU : DPRINTF("YVYU"); return XVID_CSP_YVYU; case FOURCC_UYVY : DPRINTF("UYVY"); return XVID_CSP_UYVY; default : DPRINTF("unsupported colorspace %c%c%c%c", hdr->biCompression&0xff, (hdr->biCompression>>8)&0xff, (hdr->biCompression>>16)&0xff, (hdr->biCompression>>24)&0xff); return XVID_CSP_NULL; } } /* compressor */ /* test the output format */ LRESULT compress_query(CODEC * codec, BITMAPINFO * lpbiInput, BITMAPINFO * lpbiOutput) { BITMAPINFOHEADER * inhdr = &lpbiInput->bmiHeader; BITMAPINFOHEADER * outhdr = &lpbiOutput->bmiHeader; /* VFWEXT detection */ if (inhdr->biCompression == VFWEXT_FOURCC) { return (ICM_USER+0x0fff); } if (get_colorspace(inhdr) == XVID_CSP_NULL) { return ICERR_BADFORMAT; } if (lpbiOutput == NULL) { return ICERR_OK; } if 
(inhdr->biWidth != outhdr->biWidth || inhdr->biHeight != outhdr->biHeight || (outhdr->biCompression != FOURCC_XVID && outhdr->biCompression != FOURCC_DIVX && outhdr->biCompression != FOURCC_DX50)) { return ICERR_BADFORMAT; } return ICERR_OK; } LRESULT compress_get_format(CODEC * codec, BITMAPINFO * lpbiInput, BITMAPINFO * lpbiOutput) { BITMAPINFOHEADER * inhdr = &lpbiInput->bmiHeader; BITMAPINFOHEADER * outhdr = &lpbiOutput->bmiHeader; if (get_colorspace(inhdr) == XVID_CSP_NULL) { return ICERR_BADFORMAT; } if (lpbiOutput == NULL) { return sizeof(BITMAPINFOHEADER); } memcpy(outhdr, inhdr, sizeof(BITMAPINFOHEADER)); outhdr->biSize = sizeof(BITMAPINFOHEADER); outhdr->biSizeImage = compress_get_size(codec, lpbiInput, lpbiOutput); outhdr->biXPelsPerMeter = 0; outhdr->biYPelsPerMeter = 0; outhdr->biClrUsed = 0; outhdr->biClrImportant = 0; if ((codec->config.fourcc_used == 0) || (profiles[codec->config.profile].flags & PROFILE_XVID)) { outhdr->biCompression = FOURCC_XVID; } else if (codec->config.fourcc_used == 1) { outhdr->biCompression = FOURCC_DIVX; } else { outhdr->biCompression = FOURCC_DX50; } return ICERR_OK; } LRESULT compress_get_size(CODEC * codec, BITMAPINFO * lpbiInput, BITMAPINFO * lpbiOutput) { return 2 * lpbiOutput->bmiHeader.biWidth * lpbiOutput->bmiHeader.biHeight * 3; } LRESULT compress_frames_info(CODEC * codec, ICCOMPRESSFRAMES * icf) { #if 0 DPRINTF("%i %i", icf->lStartFrame, icf->lFrameCount); #endif codec->fincr = icf->dwScale; codec->fbase = icf->dwRate; return ICERR_OK; } static char type2char(int type) { if (type==XVID_TYPE_IVOP) return 'I'; if (type==XVID_TYPE_PVOP) return 'P'; if (type==XVID_TYPE_BVOP) return 'B'; return 'S'; } static int vfw_debug(void *handle, int opt, void *param1, void *param2) { switch (opt) { case XVID_PLG_CREATE: *((void**)param2) = NULL; case XVID_PLG_INFO: case XVID_PLG_DESTROY: case XVID_PLG_BEFORE: return 0; case XVID_PLG_AFTER: { xvid_plg_data_t *data = (xvid_plg_data_t *) param1; /* We don't use DPRINTF here 
because it's active only for _DEBUG * builds and that activates lot of other debug printfs. We only * want these all the time */ char buf[1024]; sprintf(buf, "[%6i] type=%c Q:%2i length:%6i", data->frame_num, type2char(data->type), data->quant, data->length); OutputDebugString(buf); return 0; } } return XVID_ERR_FAIL; } #define XVID_DLL_NAME "xvidcore.dll" static int init_dll(CODEC* codec) { if (codec->m_hdll != NULL) return 0; DPRINTF("init_dll"); codec->m_hdll = LoadLibrary(XVID_DLL_NAME); if (codec->m_hdll == NULL) { DPRINTF("dll load failed"); MessageBox(0, XVID_DLL_NAME " not found!","Error!", MB_ICONEXCLAMATION|MB_OK); return XVID_ERR_FAIL; } codec->xvid_global_func = (int (__cdecl *)(void *, int, void *, void *))GetProcAddress(codec->m_hdll, "xvid_global"); if (codec->xvid_global_func == NULL) { MessageBox(0, "xvid_global() not found", "Error", 0); return XVID_ERR_FAIL; } codec->xvid_encore_func = (int (__cdecl *)(void *, int, void *, void *))GetProcAddress(codec->m_hdll, "xvid_encore"); if (codec->xvid_encore_func == NULL) { MessageBox(0, "xvid_encore() not found", "Error", 0); return XVID_ERR_FAIL; } codec->xvid_decore_func = (int (__cdecl *)(void *, int, void *, void *))GetProcAddress(codec->m_hdll, "xvid_decore"); if (codec->xvid_decore_func == NULL) { MessageBox(0, "xvid_decore() not found", "Error", 0); return XVID_ERR_FAIL; } codec->xvid_plugin_single_func = (int (__cdecl *)(void *, int, void *, void *))(GetProcAddress(codec->m_hdll, "xvid_plugin_single")); codec->xvid_plugin_2pass1_func = (int (__cdecl *)(void *, int, void *, void *))(GetProcAddress(codec->m_hdll, "xvid_plugin_2pass1")); codec->xvid_plugin_2pass2_func = (int (__cdecl *)(void *, int, void *, void *))(GetProcAddress(codec->m_hdll, "xvid_plugin_2pass2")); codec->xvid_plugin_lumimasking_func = (int (__cdecl *)(void *, int, void *, void *))(GetProcAddress(codec->m_hdll, "xvid_plugin_lumimasking")); codec->xvid_plugin_psnr_func = (int (__cdecl *)(void *, int, void *, void 
*))(GetProcAddress(codec->m_hdll, "xvid_plugin_psnr"));

	return 0;
}

/* constant-quant zones for fixed quant encoding */
static void prepare_cquant_zones(CONFIG * config)
{
	int i = 0;

	if (config->num_zones == 0 || config->zones[0].frame != 0) {
		/* first zone does not start at frame 0 or doesn't exist */

		if (config->num_zones >= MAX_ZONES) config->num_zones--; /* we sacrifice last zone */

		config->zones[config->num_zones].frame = 0;
		config->zones[config->num_zones].mode = RC_ZONE_QUANT;
		config->zones[config->num_zones].weight = 100;
		config->zones[config->num_zones].quant = config->desired_quant;
		config->zones[config->num_zones].type = XVID_TYPE_AUTO;
		config->zones[config->num_zones].greyscale = 0;
		config->zones[config->num_zones].chroma_opt = 0;
		config->zones[config->num_zones].bvop_threshold = 0;
		config->num_zones++;

		sort_zones(config->zones, config->num_zones, &i);
	}

	/* step 2: let's change all weight zones into quant zones */

	for(i = 0; i < config->num_zones; i++)
		if (config->zones[i].mode == RC_ZONE_WEIGHT) {
			config->zones[i].mode = RC_ZONE_QUANT;
			config->zones[i].quant = (100*config->desired_quant) / config->zones[i].weight;
		}
}

/* full first pass zones */
static void prepare_full1pass_zones(CONFIG * config)
{
	int i = 0;

	if (config->num_zones == 0 || config->zones[0].frame != 0) {
		/* first zone does not start at frame 0 or doesn't exist */

		if (config->num_zones >= MAX_ZONES) config->num_zones--; /* we sacrifice last zone */

		config->zones[config->num_zones].frame = 0;
		config->zones[config->num_zones].mode = RC_ZONE_QUANT;
		config->zones[config->num_zones].weight = 100;
		config->zones[config->num_zones].quant = 200;
		config->zones[config->num_zones].type = XVID_TYPE_AUTO;
		config->zones[config->num_zones].greyscale = 0;
		config->zones[config->num_zones].chroma_opt = 0;
		config->zones[config->num_zones].bvop_threshold = 0;
		config->num_zones++;

		sort_zones(config->zones, config->num_zones, &i);
	}

	/* step 2: let's change all weight zones into quant zones */

	for(i = 0; i <
config->num_zones; i++) if (config->zones[i].mode == RC_ZONE_WEIGHT) { config->zones[i].mode = RC_ZONE_QUANT; config->zones[i].quant = 200; } } LRESULT compress_begin(CODEC * codec, BITMAPINFO * lpbiInput, BITMAPINFO * lpbiOutput) { xvid_gbl_init_t init; xvid_enc_create_t create; xvid_enc_plugin_t plugins[3]; xvid_plugin_single_t single; xvid_plugin_2pass1_t pass1; xvid_plugin_2pass2_t pass2; xvid_plugin_lumimasking_t masking; xvid_gbl_info_t info; int i; HANDLE hFile; const quality_t* quality_preset = (codec->config.quality==quality_table_num) ? &codec->config.quality_user : &quality_table[codec->config.quality]; CONFIG tmpCfg; /* if we want to alter config to suit our needs, it shouldn't be visible to user later */ memcpy(&tmpCfg, &codec->config, sizeof(CONFIG)); if (init_dll(codec) != 0) return ICERR_ERROR; /* destroy previously created codec */ if(codec->ehandle) { codec->xvid_encore_func(codec->ehandle, XVID_ENC_DESTROY, NULL, NULL); codec->ehandle = NULL; } memset(&init, 0, sizeof(init)); init.version = XVID_VERSION; init.cpu_flags = codec->config.cpu; init.debug = codec->config.debug; codec->xvid_global_func(0, XVID_GBL_INIT, &init, NULL); memset(&info, 0, sizeof(info)); info.version = XVID_VERSION; codec->xvid_global_func(0, XVID_GBL_INFO, &info, NULL); memset(&create, 0, sizeof(create)); create.version = XVID_VERSION; /* Encoder threads */ if (codec->config.cpu & XVID_CPU_FORCE) create.num_threads = codec->config.num_threads; else create.num_threads = info.num_threads; /* Autodetect */ /* Encoder slices */ if ((profiles[codec->config.profile].flags & PROFILE_RESYNCMARKER) && codec->config.num_slices != 1) { if (codec->config.num_slices == 0) { /* auto */ int mb_width = (lpbiInput->bmiHeader.biWidth + 15) / 16; int mb_height = (lpbiInput->bmiHeader.biHeight + 15) / 16; int slices = (int)((mb_width*mb_height) / 811); /* use multiple slices only above SD resolutions for now */ if (slices > 1) { if (create.num_threads <= 1) slices &= ~1; /* make even */ else 
if (create.num_threads <= slices) slices = (slices / create.num_threads) * create.num_threads; /* multiple of threads */ else if (create.num_threads % slices) slices = (!(create.num_threads%2)) ? (create.num_threads/2) : (create.num_threads/3); } create.num_slices = slices; } else { create.num_slices = codec->config.num_slices; /* force manual value - by registry edit */ } } /* plugins */ create.plugins = plugins; switch (codec->config.mode) { case RC_MODE_1PASS : memset(&single, 0, sizeof(single)); single.version = XVID_VERSION; single.bitrate = codec->config.bitrate * CONFIG_KBPS; single.reaction_delay_factor = codec->config.rc_reaction_delay_factor; single.averaging_period = codec->config.rc_averaging_period; single.buffer = codec->config.rc_buffer; plugins[create.num_plugins].func = codec->xvid_plugin_single_func; plugins[create.num_plugins].param = &single; create.num_plugins++; if (!codec->config.use_2pass_bitrate) /* constant-quant mode */ prepare_cquant_zones(&tmpCfg); break; case RC_MODE_2PASS1 : memset(&pass1, 0, sizeof(pass1)); pass1.version = XVID_VERSION; pass1.filename = codec->config.stats; if (codec->config.full1pass) prepare_full1pass_zones(&tmpCfg); plugins[create.num_plugins].func = codec->xvid_plugin_2pass1_func; plugins[create.num_plugins].param = &pass1; create.num_plugins++; break; case RC_MODE_2PASS2 : memset(&pass2, 0, sizeof(pass2)); pass2.version = XVID_VERSION; if (codec->config.use_2pass_bitrate) { pass2.bitrate = codec->config.bitrate * CONFIG_KBPS; } else { pass2.bitrate = -codec->config.desired_size; /* kilobytes */ } pass2.filename = codec->config.stats; hFile = CreateFile(pass2.filename, 0, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL); if (hFile == INVALID_HANDLE_VALUE) { MessageBox(0, "Statsfile not found!","Error!", MB_ICONEXCLAMATION|MB_OK); return XVID_ERR_FAIL; } else { CloseHandle(hFile); } pass2.keyframe_boost = codec->config.keyframe_boost; /* keyframe boost percentage: [0..100...]; */ 
pass2.curve_compression_high = codec->config.curve_compression_high; pass2.curve_compression_low = codec->config.curve_compression_low; pass2.overflow_control_strength = codec->config.overflow_control_strength; pass2.max_overflow_improvement = codec->config.twopass_max_overflow_improvement; pass2.max_overflow_degradation = codec->config.twopass_max_overflow_degradation; pass2.kfreduction = codec->config.kfreduction; pass2.kfthreshold = codec->config.kfthreshold; pass2.container_frame_overhead = 24; /* AVI */ /* VBV */ pass2.vbv_size = profiles[codec->config.profile].max_vbv_size; pass2.vbv_initial = (profiles[codec->config.profile].max_vbv_size*3)/4; /* 75% */ pass2.vbv_maxrate = profiles[codec->config.profile].max_bitrate; pass2.vbv_peakrate = profiles[codec->config.profile].vbv_peakrate; plugins[create.num_plugins].func = codec->xvid_plugin_2pass2_func; plugins[create.num_plugins].param = &pass2; create.num_plugins++; break; case RC_MODE_NULL : return ICERR_OK; default : break; } /* zones - copy from tmpCfg in case we automatically altered them above */ create.zones = malloc(sizeof(xvid_enc_zone_t) * tmpCfg.num_zones); create.num_zones = tmpCfg.num_zones; for (i=0; i < create.num_zones; i++) { create.zones[i].frame = tmpCfg.zones[i].frame; if (tmpCfg.zones[i].mode == RC_ZONE_QUANT) { create.zones[i].mode = XVID_ZONE_QUANT; create.zones[i].increment = tmpCfg.zones[i].quant; }else{ create.zones[i].mode = XVID_ZONE_WEIGHT; create.zones[i].increment = tmpCfg.zones[i].weight; } create.zones[i].base = 100; } /* lumimasking plugin */ if ((profiles[codec->config.profile].flags & PROFILE_ADAPTQUANT) && (codec->config.lum_masking>0)) { memset(&masking, 0, sizeof(masking)); masking.method = (codec->config.lum_masking==2); plugins[create.num_plugins].func = codec->xvid_plugin_lumimasking_func; plugins[create.num_plugins].param = &masking; create.num_plugins++; } plugins[create.num_plugins].func = vfw_debug; plugins[create.num_plugins].param = NULL; create.num_plugins++; 
create.profile = profiles[codec->config.profile].id; create.width = lpbiInput->bmiHeader.biWidth; create.height = lpbiInput->bmiHeader.biHeight; create.fincr = codec->fincr; create.fbase = codec->fbase; create.max_key_interval = quality_preset->max_key_interval; create.min_quant[0] = quality_preset->min_iquant; create.max_quant[0] = quality_preset->max_iquant; create.min_quant[1] = quality_preset->min_pquant; create.max_quant[1] = quality_preset->max_pquant; create.min_quant[2] = quality_preset->min_bquant; create.max_quant[2] = quality_preset->max_bquant; if ((profiles[codec->config.profile].flags & PROFILE_BVOP) && codec->config.use_bvop) { /* dxn: prevent bframes usage if interlacing is selected */ if (!((profiles[codec->config.profile].flags & PROFILE_EXTRA) && codec->config.interlacing)) { create.max_bframes = codec->config.max_bframes; create.bquant_ratio = codec->config.bquant_ratio; create.bquant_offset = codec->config.bquant_offset; if (codec->config.packed) create.global |= XVID_GLOBAL_PACKED; create.global |= XVID_GLOBAL_CLOSED_GOP; /* restrict max bframes */ if ((create.max_bframes > profiles[codec->config.profile].xvid_max_bframes) && (profiles[codec->config.profile].xvid_max_bframes >= 0)) create.max_bframes = profiles[codec->config.profile].xvid_max_bframes; /* DXN: enable packed bframes */ if ((profiles[codec->config.profile].flags & PROFILE_PACKED)) { create.global |= XVID_GLOBAL_PACKED; } } } /* dxn: always write divx5 userdata */ if ((profiles[codec->config.profile].flags & PROFILE_EXTRA)) create.global |= XVID_GLOBAL_DIVX5_USERDATA; if ((profiles[codec->config.profile].flags & PROFILE_EXTRA) || (profiles[codec->config.profile].flags & PROFILE_XVID)) { create.frame_drop_ratio = 0; } else { create.frame_drop_ratio = quality_preset->frame_drop_ratio; } switch(codec->xvid_encore_func(0, XVID_ENC_CREATE, &create, NULL)) { case XVID_ERR_FAIL : return ICERR_ERROR; case XVID_ERR_MEMORY : return ICERR_MEMORY; case XVID_ERR_FORMAT : return 
ICERR_BADFORMAT;
	case XVID_ERR_VERSION :
		return ICERR_UNSUPPORTED;
	}

	free(create.zones);
	codec->ehandle = create.handle;
	codec->framenum = 0;
	codec->keyspacing = 0;

	if (codec->config.display_status) {
		status_destroy_always(&codec->status);
		status_create(&codec->status, codec->fincr, codec->fbase);
	}

	return ICERR_OK;
}

LRESULT compress_end(CODEC * codec)
{
	if (codec==NULL)
		return ICERR_OK;

	if (codec->m_hdll != NULL) {
		if (codec->ehandle != NULL) {
			codec->xvid_encore_func(codec->ehandle, XVID_ENC_DESTROY, NULL, NULL);
			codec->ehandle = NULL;
		}
	}

	if (codec->config.display_status)
		status_destroy(&codec->status);

	return ICERR_OK;
}

static void apply_zone_modifiers(xvid_enc_frame_t * frame, CONFIG * config, int framenum)
{
	int i;

	/* find the last zone that starts at or before framenum */
	for (i=0; i<config->num_zones && config->zones[i].frame <= framenum; i++) ;

	if (--i < 0) return; /* there are no zones, or we're before the first zone */

	if (framenum == config->zones[i].frame)
		frame->type = config->zones[i].type;

	if (config->zones[i].greyscale) {
		frame->vop_flags |= XVID_VOP_GREYSCALE;
	}

	if (config->zones[i].chroma_opt) {
		frame->vop_flags |= XVID_VOP_CHROMAOPT;
	}

	if (config->zones[i].cartoon_mode) {
		frame->vop_flags |= XVID_VOP_CARTOON;
		frame->motion |= XVID_ME_DETECT_STATIC_MOTION;
	}

	if ((profiles[config->profile].flags & PROFILE_BVOP) && config->use_bvop) {
		frame->bframe_threshold = config->zones[i].bvop_threshold;
	}
}

#define CALC_BI_STRIDE(width,bitcount) ((((width * bitcount) + 31) & ~31) >> 3)

LRESULT compress(CODEC * codec, ICCOMPRESS * icc)
{
	BITMAPINFOHEADER * inhdr = icc->lpbiInput;
	BITMAPINFOHEADER * outhdr = icc->lpbiOutput;
	xvid_enc_frame_t frame;
	xvid_enc_stats_t stats;
	int length;
	const quality_t* quality_preset = (codec->config.quality==quality_table_num) ?
		&codec->config.quality_user : &quality_table[codec->config.quality];

	memset(&frame, 0, sizeof(frame));
	frame.version = XVID_VERSION;

	frame.type = XVID_TYPE_AUTO;

	/* vol stuff */

	if ((profiles[codec->config.profile].flags & PROFILE_MPEGQUANT) &&
		codec->config.quant_type != QUANT_MODE_H263)
	{
		frame.vol_flags |= XVID_VOL_MPEGQUANT;

		if (codec->config.quant_type == QUANT_MODE_CUSTOM) {
			frame.quant_intra_matrix = codec->config.qmatrix_intra;
			frame.quant_inter_matrix = codec->config.qmatrix_inter;
		}else{
			frame.quant_intra_matrix = NULL;
			frame.quant_inter_matrix = NULL;
		}
	}

	if ((profiles[codec->config.profile].flags & PROFILE_QPEL) && codec->config.qpel) {
		frame.vol_flags |= XVID_VOL_QUARTERPEL;
		frame.motion |= XVID_ME_QUARTERPELREFINE16 | XVID_ME_QUARTERPELREFINE8;
	}

	if ((profiles[codec->config.profile].flags & PROFILE_GMC) && codec->config.gmc) {
		frame.vol_flags |= XVID_VOL_GMC;
		frame.motion |= XVID_ME_GME_REFINE;
	}

	if ((profiles[codec->config.profile].flags & PROFILE_INTERLACE) && codec->config.interlacing)
		frame.vol_flags |= XVID_VOL_INTERLACING;

	/* dxn: force 1:1 picture aspect ratio */
	if ((profiles[codec->config.profile].flags & PROFILE_EXTRA)) {
		frame.par = XVID_PAR_11_VGA;
	} else if (codec->config.ar_mode == 0) { /* PAR */
		if (codec->config.display_aspect_ratio != 5) {
			frame.par = codec->config.display_aspect_ratio + 1;
		} else {
			frame.par = XVID_PAR_EXT;
			frame.par_width = codec->config.par_x;
			frame.par_height= codec->config.par_y;
		}
	} else { /* AR */
		/* custom pixel aspect ratio -> calculated from DAR */
		frame.par = XVID_PAR_EXT;
		frame.par_width = (100 * inhdr->biHeight) / codec->config.ar_y;
		frame.par_height= (100 * inhdr->biWidth) / codec->config.ar_x;
	}

	/* vop stuff */

	frame.vop_flags |= XVID_VOP_HALFPEL;
	frame.vop_flags |= XVID_VOP_HQACPRED;

	if (codec->config.interlacing && codec->config.tff)
		frame.vop_flags |= XVID_VOP_TOPFIELDFIRST;

	if (codec->config.vop_debug)
		frame.vop_flags |= XVID_VOP_DEBUG;

	if (quality_preset->trellis_quant) {
		frame.vop_flags |=
XVID_VOP_TRELLISQUANT; } if ((profiles[codec->config.profile].flags & PROFILE_4MV)) { if (quality_preset->motion_search > 4) frame.vop_flags |= XVID_VOP_INTER4V; } if (quality_preset->chromame) frame.motion |= XVID_ME_CHROMA_PVOP + XVID_ME_CHROMA_BVOP; if (quality_preset->turbo) frame.motion |= XVID_ME_FASTREFINE16 | XVID_ME_FASTREFINE8 | XVID_ME_SKIP_DELTASEARCH | XVID_ME_FAST_MODEINTERPOLATE | XVID_ME_BFRAME_EARLYSTOP; frame.motion |= pmvfast_presets[quality_preset->motion_search]; if (quality_preset->vhq_bframe) frame.vop_flags |= XVID_VOP_RD_BVOP; switch (quality_preset->vhq_mode) { case VHQ_MODE_DECISION : frame.vop_flags |= XVID_VOP_MODEDECISION_RD; break; case VHQ_LIMITED_SEARCH : frame.vop_flags |= XVID_VOP_MODEDECISION_RD; frame.motion |= XVID_ME_HALFPELREFINE16_RD; frame.motion |= XVID_ME_QUARTERPELREFINE16_RD; break; case VHQ_MEDIUM_SEARCH : frame.vop_flags |= XVID_VOP_MODEDECISION_RD; frame.motion |= XVID_ME_HALFPELREFINE16_RD; frame.motion |= XVID_ME_HALFPELREFINE8_RD; frame.motion |= XVID_ME_QUARTERPELREFINE16_RD; frame.motion |= XVID_ME_QUARTERPELREFINE8_RD; frame.motion |= XVID_ME_CHECKPREDICTION_RD; break; case VHQ_WIDE_SEARCH : frame.vop_flags |= XVID_VOP_MODEDECISION_RD; frame.motion |= XVID_ME_HALFPELREFINE16_RD; frame.motion |= XVID_ME_HALFPELREFINE8_RD; frame.motion |= XVID_ME_QUARTERPELREFINE16_RD; frame.motion |= XVID_ME_QUARTERPELREFINE8_RD; frame.motion |= XVID_ME_CHECKPREDICTION_RD; frame.motion |= XVID_ME_EXTSEARCH_RD; break; default : break; } if (quality_preset->vhq_metric == 1) frame.vop_flags |= XVID_VOP_RD_PSNRHVSM; frame.input.plane[0] = icc->lpInput; frame.input.stride[0] = CALC_BI_STRIDE(icc->lpbiInput->biWidth, icc->lpbiInput->biBitCount); if ((frame.input.csp = get_colorspace(inhdr)) == XVID_CSP_NULL) return ICERR_BADFORMAT; if (frame.input.csp == XVID_CSP_I420 || frame.input.csp == XVID_CSP_YV12) { frame.input.stride[0] = (4 * icc->lpbiInput->biWidth + 3) / 4; frame.input.stride[1] = frame.input.stride[2] = 
frame.input.stride[0] / 2 ; } frame.bitstream = icc->lpOutput; frame.length = icc->lpbiOutput->biSizeImage; frame.quant = 0; if (codec->config.mode == RC_MODE_NULL) { outhdr->biSizeImage = 0; *icc->lpdwFlags = AVIIF_KEYFRAME; return ICERR_OK; } // force keyframe spacing in 2-pass 1st pass if (quality_preset->motion_search == 0) frame.type = XVID_TYPE_IVOP; /* frame-based stuff */ apply_zone_modifiers(&frame, &codec->config, codec->framenum); /* call encore */ memset(&stats, 0, sizeof(stats)); stats.version = XVID_VERSION; length = codec->xvid_encore_func(codec->ehandle, XVID_ENC_ENCODE, &frame, &stats); switch (length) { case XVID_ERR_FAIL : return ICERR_ERROR; case XVID_ERR_MEMORY : return ICERR_MEMORY; case XVID_ERR_FORMAT : return ICERR_BADFORMAT; case XVID_ERR_VERSION : return ICERR_UNSUPPORTED; } if (codec->config.display_status && stats.type>0) { status_update(&codec->status, stats.type, stats.length, stats.quant); } DPRINTF("{type=%i len=%i} length=%i", stats.type, stats.length, length); if (length == 0) /* no encoder output */ { *icc->lpdwFlags = 0; ((char*)icc->lpOutput)[0] = 0x7f; /* virtual dub skip frame */ outhdr->biSizeImage = 1; }else{ if (frame.out_flags & XVID_KEYFRAME) { codec->keyspacing = 0; *icc->lpdwFlags = AVIIF_KEYFRAME; } else { *icc->lpdwFlags = 0; } outhdr->biSizeImage = length; if (codec->config.mode == RC_MODE_2PASS1 && codec->config.discard1pass) { outhdr->biSizeImage = 0; } } codec->framenum++; codec->keyspacing++; return ICERR_OK; } /* decompressor */ LRESULT decompress_query(CODEC * codec, BITMAPINFO *lpbiInput, BITMAPINFO *lpbiOutput) { BITMAPINFOHEADER * inhdr = &lpbiInput->bmiHeader; BITMAPINFOHEADER * outhdr = &lpbiOutput->bmiHeader; int in_csp = XVID_CSP_NULL, out_csp = XVID_CSP_NULL; if (lpbiInput == NULL) { return ICERR_ERROR; } if (inhdr->biCompression != FOURCC_XVID && inhdr->biCompression != FOURCC_DIVX && inhdr->biCompression != FOURCC_DX50 && inhdr->biCompression != FOURCC_MP4V && inhdr->biCompression != FOURCC_xvid && 
inhdr->biCompression != FOURCC_divx && inhdr->biCompression != FOURCC_dx50 && inhdr->biCompression != FOURCC_mp4v && (in_csp = get_colorspace(inhdr)) != XVID_CSP_YV12) { return ICERR_BADFORMAT; } if (lpbiOutput == NULL) { return ICERR_OK; } out_csp = get_colorspace(outhdr); if (inhdr->biWidth != outhdr->biWidth || inhdr->biHeight != outhdr->biHeight || out_csp == XVID_CSP_NULL || (in_csp == XVID_CSP_YV12 && in_csp != out_csp)) { return ICERR_BADFORMAT; } return ICERR_OK; } LRESULT decompress_get_format(CODEC * codec, BITMAPINFO * lpbiInput, BITMAPINFO * lpbiOutput) { BITMAPINFOHEADER * inhdr = &lpbiInput->bmiHeader; BITMAPINFOHEADER * outhdr = &lpbiOutput->bmiHeader; LRESULT result; if (lpbiOutput == NULL) { return sizeof(BITMAPINFOHEADER); } /* --- yv12 --- */ if (get_colorspace(inhdr) != XVID_CSP_NULL) { memcpy(outhdr, inhdr, sizeof(BITMAPINFOHEADER)); /* XXX: should we set outhdr->biSize ?? */ return ICERR_OK; } /* --- yv12 --- */ result = decompress_query(codec, lpbiInput, lpbiOutput); if (result != ICERR_OK) { return result; } outhdr->biSize = sizeof(BITMAPINFOHEADER); outhdr->biWidth = inhdr->biWidth; outhdr->biHeight = inhdr->biHeight; outhdr->biPlanes = 1; outhdr->biBitCount = 24; outhdr->biCompression = BI_RGB; /* sonic foundry vegas video v3 only supports BI_RGB */ outhdr->biSizeImage = outhdr->biHeight * CALC_BI_STRIDE(outhdr->biWidth, outhdr->biBitCount); outhdr->biXPelsPerMeter = 0; outhdr->biYPelsPerMeter = 0; outhdr->biClrUsed = 0; outhdr->biClrImportant = 0; return ICERR_OK; } #define REG_GET_N(X, Y, Z) \ { \ DWORD size = sizeof(int); \ if (RegQueryValueEx(hKey, X, 0, 0, (LPBYTE)&Y, &size) != ERROR_SUCCESS) { \ Y=Z; \ } \ }while(0) LRESULT decompress_begin(CODEC * codec, BITMAPINFO * lpbiInput, BITMAPINFO * lpbiOutput) { BITMAPINFOHEADER * inhdr = &lpbiInput->bmiHeader; xvid_gbl_init_t init; xvid_gbl_info_t info; xvid_dec_create_t create; HKEY hKey; if (init_dll(codec) != 0) return ICERR_ERROR; memset(&init, 0, sizeof(init)); init.version = 
XVID_VERSION;
	init.cpu_flags = codec->config.cpu;
	init.debug = codec->config.debug;
	codec->xvid_global_func(0, XVID_GBL_INIT, &init, NULL);

	memset(&info, 0, sizeof(info));
	info.version = XVID_VERSION;
	codec->xvid_global_func(0, XVID_GBL_INFO, &info, NULL);

	memset(&create, 0, sizeof(create));
	create.version = XVID_VERSION;
	create.width = lpbiInput->bmiHeader.biWidth;
	create.height = lpbiInput->bmiHeader.biHeight;
	create.fourcc = inhdr->biCompression;

	/* Decoder threads */
	if (codec->config.cpu & XVID_CPU_FORCE)
		create.num_threads = codec->config.num_threads;
	else
		create.num_threads = info.num_threads; /* Autodetect */

	switch (codec->xvid_decore_func(0, XVID_DEC_CREATE, &create, NULL))
	{
	case XVID_ERR_FAIL :
		return ICERR_ERROR;

	case XVID_ERR_MEMORY :
		return ICERR_MEMORY;

	case XVID_ERR_FORMAT :
		return ICERR_BADFORMAT;

	case XVID_ERR_VERSION :
		return ICERR_UNSUPPORTED;
	}

	codec->dhandle = create.handle;

	/* Read the postprocessing settings from the registry */
	RegOpenKeyEx(XVID_REG_KEY, XVID_REG_PARENT "\\" XVID_REG_CHILD, 0, KEY_READ, &hKey);

	REG_GET_N("Brightness", pp_brightness, 0);
	REG_GET_N("Deblock_Y", pp_dy, 0);
	REG_GET_N("Deblock_UV", pp_duv, 0);
	REG_GET_N("Dering_Y", pp_dry, 0);
	REG_GET_N("Dering_UV", pp_druv, 0);
	REG_GET_N("FilmEffect", pp_fe, 0);

	RegCloseKey(hKey);

	return ICERR_OK;
}

LRESULT decompress_end(CODEC * codec)
{
	if (codec->m_hdll != NULL)
	{
		if (codec->dhandle != NULL)
		{
			codec->xvid_decore_func(codec->dhandle, XVID_DEC_DESTROY, NULL, NULL);
			codec->dhandle = NULL;
		}
	}

	return ICERR_OK;
}

LRESULT decompress(CODEC * codec, ICDECOMPRESS * icd)
{
	xvid_dec_frame_t frame;

	/* --- yv12 --- */
	/* Input is not an MPEG-4 fourcc: only convert the colorspace */
	if (icd->lpbiInput->biCompression != FOURCC_XVID &&
		icd->lpbiInput->biCompression != FOURCC_DIVX &&
		icd->lpbiInput->biCompression != FOURCC_DX50 &&
		icd->lpbiInput->biCompression != FOURCC_MP4V &&
		icd->lpbiInput->biCompression != FOURCC_xvid &&
		icd->lpbiInput->biCompression != FOURCC_divx &&
		icd->lpbiInput->biCompression != FOURCC_dx50 &&
		icd->lpbiInput->biCompression != FOURCC_mp4v)
	{
		xvid_gbl_convert_t convert;

		DPRINTF("input=%c%c%c%c output=%c%c%c%c",
			icd->lpbiInput->biCompression&0xff,
			(icd->lpbiInput->biCompression>>8)&0xff,
			(icd->lpbiInput->biCompression>>16)&0xff,
			(icd->lpbiInput->biCompression>>24)&0xff,
			icd->lpbiOutput->biCompression&0xff,
			(icd->lpbiOutput->biCompression>>8)&0xff,
			(icd->lpbiOutput->biCompression>>16)&0xff,
			(icd->lpbiOutput->biCompression>>24)&0xff);

		memset(&convert, 0, sizeof(convert));
		convert.version = XVID_VERSION;

		convert.input.csp = get_colorspace(icd->lpbiInput);
		convert.input.plane[0] = icd->lpInput;
		convert.input.stride[0] = CALC_BI_STRIDE(icd->lpbiInput->biWidth, icd->lpbiInput->biBitCount);
		if (convert.input.csp == XVID_CSP_I420 || convert.input.csp == XVID_CSP_YV12)
			convert.input.stride[0] = (convert.input.stride[0]*2)/3;

		convert.output.csp = get_colorspace(icd->lpbiOutput);
		convert.output.plane[0] = icd->lpOutput;
		convert.output.stride[0] = CALC_BI_STRIDE(icd->lpbiOutput->biWidth, icd->lpbiOutput->biBitCount);
		if (convert.output.csp == XVID_CSP_I420 || convert.output.csp == XVID_CSP_YV12)
			convert.output.stride[0] = (convert.output.stride[0]*2)/3;

		convert.width = icd->lpbiInput->biWidth;
		convert.height = icd->lpbiInput->biHeight;
		convert.interlacing = 0;

		if (convert.input.csp == XVID_CSP_NULL ||
			convert.output.csp == XVID_CSP_NULL ||
			codec->xvid_global_func(0, XVID_GBL_CONVERT, &convert, NULL) < 0)
		{
			return ICERR_BADFORMAT;
		}

		return ICERR_OK;
	}
	/* --- yv12 --- */

	memset(&frame, 0, sizeof(frame));
	frame.version = XVID_VERSION;
	frame.general = XVID_LOWDELAY;	/* force low_delay_default mode */
	frame.bitstream = icd->lpInput;
	frame.length = icd->lpbiInput->biSizeImage;

	/*
	 * NOTE: the bitwise ~ below is non-zero for every possible dwFlags
	 * value, so this branch is always taken and a frame is always output.
	 * A logical ! would instead skip output for HURRYUP/UPDATE/PREROLL
	 * frames and make the else branch reachable.
	 */
	if (~((icd->dwFlags & ICDECOMPRESS_HURRYUP) |
		(icd->dwFlags & ICDECOMPRESS_UPDATE) |
		(icd->dwFlags & ICDECOMPRESS_PREROLL)))
	{
		if ((frame.output.csp = get_colorspace(icd->lpbiOutput)) == XVID_CSP_NULL)
		{
			return ICERR_BADFORMAT;
		}
		frame.output.plane[0] = icd->lpOutput;
		frame.output.stride[0] = CALC_BI_STRIDE(icd->lpbiOutput->biWidth, icd->lpbiOutput->biBitCount);
		if (frame.output.csp == XVID_CSP_I420 || frame.output.csp == XVID_CSP_YV12)
			frame.output.stride[0] = CALC_BI_STRIDE(icd->lpbiOutput->biWidth, 8);
	}
	else
	{
		frame.output.csp = XVID_CSP_NULL;
	}

	/* Apply the postprocessing options read in decompress_begin */
	if (pp_dy) frame.general |= XVID_DEBLOCKY;
	if (pp_duv) frame.general |= XVID_DEBLOCKUV;
	if (pp_dry) frame.general |= XVID_DERINGY;
	if (pp_druv) frame.general |= XVID_DERINGUV;
	if (pp_fe) frame.general |= XVID_FILMEFFECT;
	frame.brightness = pp_brightness;

	switch (codec->xvid_decore_func(codec->dhandle, XVID_DEC_DECODE, &frame, NULL))
	{
	case XVID_ERR_FAIL :
		return ICERR_ERROR;

	case XVID_ERR_MEMORY :
		return ICERR_MEMORY;

	case XVID_ERR_FORMAT :
		return ICERR_BADFORMAT;

	case XVID_ERR_VERSION :
		return ICERR_UNSUPPORTED;
	}

	return ICERR_OK;
}

xvidcore/vfw/src/mobile_40.ico0000664000076500007650000002743611507207323017336 0ustar xvidbuildxvidbuild
[binary icon data omitted]
xvidcore/vfw/src/hd720_40.ico0000664000076500007650000002743611507207323016713 0ustar xvidbuildxvidbuild
[binary icon data omitted]

xvidcore/vfw/vfw.dsp0000664000076500007650000001216011567132321015600 0ustar xvidbuildxvidbuild
# Microsoft Developer Studio Project File - Name="vfw" - Package Owner=<4>
# Microsoft Developer Studio Generated Build File, Format Version 6.00
# ** DO NOT EDIT **

# TARGTYPE "Win32 (x86) Dynamic-Link Library" 0x0102

CFG=vfw - Win32 Debug
!MESSAGE This is not a valid makefile. To build this project using NMAKE,
!MESSAGE use the Export Makefile command and run
!MESSAGE 
!MESSAGE NMAKE /f "vfw.mak".
!MESSAGE 
!MESSAGE You can specify a configuration when running NMAKE
!MESSAGE by defining the macro CFG on the command line. For example:
!MESSAGE 
!MESSAGE NMAKE /f "vfw.mak" CFG="vfw - Win32 Debug"
!MESSAGE 
!MESSAGE Possible choices for configuration are:
!MESSAGE 
!MESSAGE "vfw - Win32 Release" (based on "Win32 (x86) Dynamic-Link Library")
!MESSAGE "vfw - Win32 Debug" (based on "Win32 (x86) Dynamic-Link Library")
!MESSAGE 

# Begin Project
# PROP AllowPerConfigDependencies 0
# PROP Scc_ProjName ""
# PROP Scc_LocalPath ""
CPP=cl.exe
MTL=midl.exe
RSC=rc.exe

!IF "$(CFG)" == "vfw - Win32 Release"

# PROP BASE Use_MFC 0
# PROP BASE Use_Debug_Libraries 0
# PROP BASE Output_Dir "Release"
# PROP BASE Intermediate_Dir "Release"
# PROP BASE Target_Dir ""
# PROP Use_MFC 0
# PROP Use_Debug_Libraries 0
# PROP Output_Dir "Release"
# PROP Intermediate_Dir "Release"
# PROP Ignore_Export_Lib 0
# PROP Target_Dir ""
# ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c
# ADD CPP /nologo /W3 /GX /O2 /I "..\src" /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /FR /YX /FD /c
# ADD BASE MTL /nologo /D "NDEBUG" /mktyplib203 /win32
# ADD MTL /nologo /D "NDEBUG" /mktyplib203 /win32
# ADD BASE RSC /l 0xc09 /d "NDEBUG"
# ADD RSC /l 0xc09 /d "NDEBUG"
BSC32=bscmake.exe
# ADD BASE BSC32 /nologo
# ADD BSC32 /nologo
LINK32=link.exe
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /dll /machine:I386
# ADD LINK32 user32.lib comdlg32.lib comctl32.lib advapi32.lib gdi32.lib shell32.lib winmm.lib /nologo /dll /machine:I386 /out:"bin\xvidvfw.dll"
# Begin Special Build Tool
SOURCE="$(InputPath)"
PostBuild_Cmds=copy "..\build\win32\bin\xvidcore.dll" ".\bin"
# End Special Build Tool

!ELSEIF "$(CFG)" == "vfw - Win32 Debug"

# PROP BASE Use_MFC 0
# PROP BASE Use_Debug_Libraries 1
# PROP BASE Output_Dir "Debug"
# PROP BASE Intermediate_Dir "Debug"
# PROP BASE Target_Dir ""
# PROP Use_MFC 0
# PROP Use_Debug_Libraries 1
# PROP Output_Dir "Debug"
# PROP Intermediate_Dir "Debug"
# PROP Ignore_Export_Lib 0
# PROP Target_Dir ""
# ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /GZ /c
# ADD CPP /nologo /W3 /Gm /GX /ZI /Od /I "..\src" /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /FR /YX /FD /GZ /c
# ADD BASE MTL /nologo /D "_DEBUG" /mktyplib203 /win32
# ADD MTL /nologo /D "_DEBUG" /mktyplib203 /win32
# ADD BASE RSC /l 0xc09 /d "_DEBUG"
# ADD RSC /l 0xc09 /d "_DEBUG"
BSC32=bscmake.exe
# ADD BASE BSC32 /nologo
# ADD BSC32 /nologo
LINK32=link.exe
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /dll /debug /machine:I386 /pdbtype:sept
# ADD LINK32 user32.lib comdlg32.lib comctl32.lib advapi32.lib gdi32.lib shell32.lib winmm.lib /nologo /dll /incremental:no /debug /machine:I386 /out:"bin\xvidvfw.dll" /pdbtype:sept
# SUBTRACT LINK32 /pdb:none
# Begin Special Build Tool
SOURCE="$(InputPath)"
PostBuild_Cmds=copy "..\build\win32\bin\xvidcore.dll" ".\bin"
# End Special Build Tool

!ENDIF

# Begin Target

# Name "vfw - Win32 Release"
# Name "vfw - Win32 Debug"
# Begin Group "doc"

# PROP Default_Filter ""
# End Group
# Begin Group "Source Files"

# PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat"
# Begin Source File

SOURCE=.\src\codec.c
# End Source File
# Begin Source File

SOURCE=.\src\config.c
# End Source File
# Begin Source File

SOURCE=.\src\driverproc.c
# End Source File
# Begin Source File

SOURCE=.\src\status.c
# End Source File
# End Group
# Begin Group "Header Files"

# PROP Default_Filter "h;hpp;hxx;hm;inl"
# Begin Source File

SOURCE=.\src\codec.h
# End Source File
# Begin Source File

SOURCE=.\src\config.h
# End Source File
# Begin Source File

SOURCE=.\src\debug.h
# End Source File
# Begin Source File

SOURCE=.\src\resource.h
# End Source File
# Begin Source File

SOURCE=.\src\status.h
# End Source File
# Begin Source File

SOURCE=.\src\vfwext.h
# End Source File
# End Group
# Begin Group "Resource Files"

# PROP Default_Filter "ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe"
# Begin Source File

SOURCE=.\src\resource.rc
# End Source File
# Begin Source File

SOURCE=.\src\xvid.ico
# End Source File
# Begin Source File

SOURCE=.\src\XviD_logo.bmp
# End Source File
# End Group
# Begin Group "Linker Defs"

# PROP Default_Filter "def"
# Begin Source File

SOURCE=.\src\driverproc.def
# End Source File
# End Group
# End Target
# End Project

xvidcore/AUTHORS0000664000076500007650000000232111504426344014534 0ustar xvidbuildxvidbuild
AUTHORS
=======

This file lists all authors of the Xvid MPEG4 core library. If you think
your name should appear on this list, please send us an email telling
us your name; we will be pleased to add it here.

The lists are in alphabetical order.

Project initiators:
-------------------

Christoph Lampert
Michael Militzer
Peter Ross

Former 1.x maintainers:
-----------------------

Edouard Gomez (lot of "lot of things")
Radoslaw Czyz (lot of ME work)

Regular contributors:
---------------------

Pascal Massimino (quite a lot of x86 assembly)

Spontaneous contributors:
-------------------------

Benjamin Herrenschmidt (first ppc port attempt)
Christoph Kuehnel (field interlaced decoding)
Daniel Smith (rc code)
Dirk Knop (vfw)
Guillaume Morin (first ppc port attempt)
MinChen (lot of work on early CVS versions)

Architecture ports:
-------------------

Christoph Nägeli (new PPC port)

Last edited: $Date: 2010-12-22 16:52:52 $