2961
|
|
|
Henrik Gramner |
5 years ago
|
|
|
2922
|
|
|
Henrik Gramner |
6 years ago
|
|
|
2897
|
|
|
Henrik Gramner |
6 years ago
|
|
|
2868
|
|
Unify 8-bit and 10-bit CLI and libraries
Add 'i_bitdepth' to x264_param_t with the corresponding '--output-depth' CLI option to set the bit depth at runtime.
Drop the 'x264_bit_depth' global variable. Rather than hardcoding it to an incorrect value, it's preferable to induce a linking failure. If an application relies on this symbol, this will make it more obvious where the problem is.
Add Makefile rules that compile modules with different bit depths. Assembly on x86 is prefixed with the 'private_prefix' define, while all other archs modify their function prefix internally.
Templatize the main C library, x86/x86_64 assembly, ARM assembly, AARCH64 assembly, PowerPC assembly, and MIPS assembly.
The depth and cache CLI filters depend heavily on the bit depth, so they need to be duplicated for each value. This means renaming these filters and adjusting the callers to use the right version.
Unfortunately, the threaded input CLI module inherits a common.h dependency (input/frame -> common/threadpool -> common/frame -> common/common) which is extremely complicated to address in a sensible way. Instead, duplicate the module and select the appropriate one at run time.
Each bitdepth needs different checkasm compilation rules, so split the main checkasm target into two executables.
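The templating and run-time selection described in this commit message can be illustrated with a minimal, self-contained sketch. It is not x264's code: the macro, the 'sketch_' prefix and every function name below are hypothetical, and the real build compiles whole modules once per bit depth rather than expanding a macro inside one file. The sketch only shows the idea of one function body instantiated per depth under a depth-specific symbol name, with the caller choosing a variant from a run-time value in the way 'i_bitdepth' is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bit-depth "template": the same body is
 * instantiated once per depth, under a depth-specific symbol name. */
#define DEFINE_PIXEL_SUM(depth, pixel_t)                                   \
    static uint32_t sketch_##depth##_pixel_sum(const void *src, size_t n)  \
    {                                                                      \
        assert(src != NULL || n == 0);                                     \
        const pixel_t *p = src;                                            \
        uint32_t sum = 0;                                                  \
        for (size_t i = 0; i < n; i++)                                     \
            sum += p[i];                                                   \
        return sum;                                                        \
    }

DEFINE_PIXEL_SUM(8, uint8_t)   /* 8-bit samples stored in one byte   */
DEFINE_PIXEL_SUM(10, uint16_t) /* 10-bit samples stored in two bytes */

typedef uint32_t (*pixel_sum_fn)(const void *src, size_t n);

/* Run-time selection keyed on the requested bit depth (assumption: only
 * 8 and 10 are built, as in the commit title). Returns NULL otherwise. */
pixel_sum_fn sketch_select_pixel_sum(int bitdepth)
{
    switch (bitdepth) {
    case 8:  return sketch_8_pixel_sum;
    case 10: return sketch_10_pixel_sum;
    default: return NULL;
    }
}
```

A caller would look up the function once, for example sketch_select_pixel_sum(10), and then pass it buffers of the matching sample type. Returning NULL for an unsupported depth keeps the failure explicit instead of silently falling back to another depth.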
|
Vittorio Giovara |
6 years ago
|
|
|
2749
|
|
|
Henrik Gramner |
7 years ago
|
|
|
2743
|
|
|
Anton Mitrofanov |
7 years ago
|
|
|
2660
|
|
|
Henrik Gramner |
8 years ago
|
|
|
2566
|
|
|
Anton Mitrofanov |
9 years ago
|
|
|
2526
|
|
|
Anton Mitrofanov |
9 years ago
|
|
|
2384
|
|
|
Henrik Gramner |
10 years ago
|
|
|
2354
|
|
|
Henrik Gramner |
11 years ago
|
|
|
2286
|
|
OpenCL lookahead
OpenCL support is compiled in by default, but must be enabled at runtime with the --opencl command-line flag. Compiling OpenCL support requires perl. To avoid the perl requirement, use: configure --disable-opencl.
When enabled, the lookahead thread is mostly off-loaded to an OpenCL capable GPU device. Lowres intra cost prediction, lowres motion search (including subpel) and bidir cost predictions are all done on the GPU. MB-tree and final slice decisions are still done by the CPU. Presets which do not use a threaded lookahead will not use OpenCL at all (superfast, ultrafast).
Because of data dependencies, the GPU must use an iterative motion search which performs more total work than the CPU would do, so this is not work efficient or power efficient. But if there are spare GPU cycles, it can often speed up the encode. Output quality with OpenCL lookahead enabled is often very slightly worse than with the CPU lookahead (because of the same data dependencies).
x264 must compile its OpenCL kernels for your device before running them, and in order to avoid doing this every run it caches the compiled kernel binary in a file named x264_lookahead.clbin (--opencl-clbin FNAME to override). The cache file will be ignored if the device, driver, or OpenCL source are changed.
x264 will use the first GPU device which supports the cl_image features required by its kernels. Most modern discrete GPUs and all AMD integrated GPUs will work. Intel integrated GPUs (up to IvyBridge) do not support those features. Use --opencl-device N to specify a number of capable GPUs to skip during device detection.
Switchable graphics environments (e.g. AMD Enduro) are currently not supported, as some have bugs in their OpenCL drivers that cause output to be silently incorrect.
Developed by MulticoreWare with support from AMD and Telestream.
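The device selection rule in this commit message (use the first capable GPU, optionally skipping N capable ones via --opencl-device N) can be sketched as below. This is an illustration under stated assumptions, not x264's detection code: the 'sketch_device' struct and 'sketch_pick_device' function are hypothetical, and a plain array stands in for the list that real code would obtain from the OpenCL runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device record; real detection would query the OpenCL runtime. */
typedef struct {
    const char *name;
    int is_gpu;
    int has_image_support; /* stands in for the required cl_image features */
} sketch_device;

/* Return the index of the chosen device, or -1 if none qualifies.
 * 'skip' plays the role of N in --opencl-device N: the number of capable
 * GPUs to pass over before accepting one. Incapable devices never count. */
int sketch_pick_device(const sketch_device *devs, size_t count, int skip)
{
    assert(devs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!devs[i].is_gpu || !devs[i].has_image_support)
            continue;       /* not a capable GPU, ignore entirely */
        if (skip-- > 0)
            continue;       /* capable, but the caller asked to skip it */
        return (int)i;
    }
    return -1;
}
```

With a list holding a CPU, an integrated GPU without image support, and two capable discrete GPUs, a skip of 0 picks the first discrete GPU and a skip of 1 picks the second. The integrated GPU does not consume a skip, because only capable devices are counted.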
|
Steve Borho |
11 years ago
|
|
|
2284
|
|
|
Fiona Glaser |
11 years ago
|
|
|
2245
|
|
|
Loren Merritt |
11 years ago
|
|
|
2212
|
|
|
Fiona Glaser |
12 years ago
|
|
|
2197
|
|
|
Fiona Glaser |
12 years ago
|
|
|
2183
|
|
|
Fiona Glaser |
12 years ago
|
|
|
2172
|
|
|
Henrik Gramner |
12 years ago
|
|
|
2154
|
|
|
Hii |
12 years ago
|
|
|
2116
|
|
|
Kieran Kunhya |
12 years ago
|
|
|