Releases: xiph/rav1e

v0.4.0: Happy New Year

13 Jan 14:07

rav1e 0.4.0 provides solid speed improvements on both x86_64 and aarch64.

Like the 0.3.5 release, 0.4.0 supports Apple Silicon out of the box.

The overall speedup is solid across the speed levels for both 8-bit and 10-bit encoding, with a drastic improvement for aarch64 at speed 10.

Quality-wise, for 4:2:0 video, most metrics improved across all speed levels, with speed 5 getting the largest boost.
4:2:2 and 4:4:4 video saw greater improvements in quality, as they were brought to feature parity with 4:2:0. See the improvements section below for more details.

Speed level  PSNR     PSNR Cb  PSNR Cr  PSNR-HVS  SSIM     MS-SSIM  CIEDE2000  VMAF
0            -1.3542  -4.0733  -3.3946  -1.7433   -1.7734  -1.9269  -2.4361    -2.16
1            -1.0343  -3.7382  -3.4084  -1.3605   -1.2619  -1.53    -2.2265    -1.97
2            -1.0407  -3.9916  -3.6426  -1.4196   -1.6259  -1.8107  -2.3525    -2.28
3            -1.1544  -5.2352  -4.6235  -1.6259   -1.7752  -1.947   -2.672     -1.95
4            -0.548   -4.7456  -4.3344  -0.9114   -1.1915  -1.3188  -2.1232    -1.66
5            -2.3185  -4.5738  -4.4277  -2.6101   -2.9586  -2.8967  -3.2177    -3.19
6            -1.8238  -2.0511  -2.2246  -1.7811   -1.997   -1.8386  -1.9551    -2.44
7            -1.8314  -2.0694  -2.5498  -1.7675   -1.9612  -1.8752  -1.9191    -2.6
8            -1.8239  -2.2058  -2.5742  -1.795    -1.9449  -1.8676  -1.9334    -2.71
9            -1.6422  -2.0831  -2.314   -1.6198   -1.7644  -1.6923  -1.9255    -1.92
10           -0.108   -1.4077  -2.0309  -0.2935   0.1188   0.0143   -0.0963    -2.35
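
To check the "speed 5 gets the largest boost" claim against the table above, a small Rust sketch can scan one metric column for the most negative value. It assumes that a more negative number means a larger improvement, which the notes imply but do not state. `ROWS` and `largest_gain` are illustrative names and are not part of rav1e.

```rust
/// One row per speed level: (speed, deltas) in column order
/// PSNR, PSNR Cb, PSNR Cr, PSNR-HVS, SSIM, MS-SSIM, CIEDE2000, VMAF.
/// Values are copied from the table in these release notes.
const ROWS: [(u8, [f64; 8]); 11] = [
    (0, [-1.3542, -4.0733, -3.3946, -1.7433, -1.7734, -1.9269, -2.4361, -2.16]),
    (1, [-1.0343, -3.7382, -3.4084, -1.3605, -1.2619, -1.53, -2.2265, -1.97]),
    (2, [-1.0407, -3.9916, -3.6426, -1.4196, -1.6259, -1.8107, -2.3525, -2.28]),
    (3, [-1.1544, -5.2352, -4.6235, -1.6259, -1.7752, -1.947, -2.672, -1.95]),
    (4, [-0.548, -4.7456, -4.3344, -0.9114, -1.1915, -1.3188, -2.1232, -1.66]),
    (5, [-2.3185, -4.5738, -4.4277, -2.6101, -2.9586, -2.8967, -3.2177, -3.19]),
    (6, [-1.8238, -2.0511, -2.2246, -1.7811, -1.997, -1.8386, -1.9551, -2.44]),
    (7, [-1.8314, -2.0694, -2.5498, -1.7675, -1.9612, -1.8752, -1.9191, -2.6]),
    (8, [-1.8239, -2.2058, -2.5742, -1.795, -1.9449, -1.8676, -1.9334, -2.71]),
    (9, [-1.6422, -2.0831, -2.314, -1.6198, -1.7644, -1.6923, -1.9255, -1.92]),
    (10, [-0.108, -1.4077, -2.0309, -0.2935, 0.1188, 0.0143, -0.0963, -2.35]),
];

/// Returns the speed level whose value in `column` (0..=7) is the most negative.
/// Assumption: more negative means a larger improvement.
pub fn largest_gain(column: usize) -> u8 {
    let mut best = ROWS[0];
    for row in ROWS.iter() {
        if row.1[column] < best.1[column] {
            best = *row;
        }
    }
    best.0
}
```

Under that assumption, `largest_gain(0)` (PSNR) and `largest_gain(7)` (VMAF) both pick speed 5, while the two chroma PSNR columns peak at speed 3.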

Improvements

  • Enable open partitions on frame boundaries (2% improvement to coding efficiency)
  • Use av-metrics in CLI to compute PSNR, PSNR-HVS, SSIM, MS-SSIM, and CIEDE2000 (see --metrics)
  • Enable deblocking in loop filter rate-distortion optimization (0.5% to 1.5% improvement to coding efficiency)
  • Thread CDEF loop filter with tiles (1.2% reduction in encoding time with 4 tiles)
  • Redesign the rate control API
  • Add monochrome support
  • Improve 4:2:2 support (37% reduction in encoding time, 0.8% to 5% improvement to coding efficiency)
  • Add compound prediction mode variants for drl=2 and drl=3
  • Enable NEAR_NEAR1MV and NEAR_NEAR2MV compound modes
  • Support arbitrary-SAR anamorphic video
  • Enforce a frame limit of 1 in still picture mode
  • Add a quiet mode to the CLI (--quiet flag)
  • Convert all motion vector predictors to full pixel precision
  • Update non-broken motion estimation predictors (0.28% improvement to coding efficiency)
  • Substantially rework initial motion estimation (9% reduction in encoding time)
  • Optimise predictors for multipass motion estimation (0.3% to 0.4% improvement to coding efficiency)
  • Optimize chroma quantizer offsets for 4:4:4 sampling
  • Allow opaque extra data to be attached to frames and retrieved from encoded packets via the API
  • Merge new dav1d assembly code for x86 and AArch64
  • Add/improve assembly code for distortion computations
  • Derive quantizers using linear models (0.7% to 1.7% improvement to coding efficiency)
  • Prune intra frame prediction mode list dynamically (5.5% to 12.2% reduction in encoding time at speed level 5)
  • Optimize rate-distortion optimization loop (1% reduction in encoding time)
  • Reduce memory allocation count in various areas
  • Optimize tile block access (1.5% reduction in encoding time)
  • Allow frame sizes <16x16 in still picture mode
  • Add high bit depth AVX2 assembly (9% to 31% reduction in encoding time for 10-bit video)
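
As background for the anamorphic (arbitrary-SAR) item above, the sketch below shows how a sample aspect ratio is conventionally applied to a coded width to get the intended display width. This is generic video arithmetic and not rav1e's code; `display_width` and its parameters are hypothetical names.

```rust
/// Display width of a frame whose samples are not square.
/// `sar_num / sar_den` is the sample aspect ratio (width of one sample
/// relative to its height). Hypothetical helper, not a rav1e API.
pub fn display_width(coded_width: u32, sar_num: u32, sar_den: u32) -> u32 {
    assert!(sar_den != 0, "SAR denominator must be non-zero");
    // Widen to u64 so the multiplication cannot overflow, then round to nearest.
    let scaled = coded_width as u64 * sar_num as u64;
    ((scaled + sar_den as u64 / 2) / sar_den as u64) as u32
}
```

For example, a 720-sample-wide frame with a 16:15 SAR gives `display_width(720, 16, 15) == 768`.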

Bug Fixes

  • Fix rebuilding with fresh assembly output
  • Fix the chroma plane desyncs on narrow frames
  • Abort rate controlled encoding without a bitrate target in the CLI
  • Fix the -v CLI option
  • Fix a crash when using 4 tiles for 1080p 4:2:2 input
  • Fix intra edge filter desyncs with 4:2:2 and 4:4:4 input
  • Fix a symbol redefinition error for AArch64 builds using Clang
  • Fix loop restoration filter with 4:2:2 and 4:4:4 input
  • Fix incorrect quantizer index clamping
  • Fix cross-compiling for mingw-w64 on macOS
  • Avoid a buffer underflow condition in CDEF pad_into_tmp16()
  • Properly validate minimum RDO lookahead frames value
  • Respect quantizer bounds with rate control enabled
  • Restrict still picture mode to single-frame streams
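
The quantizer-bound fixes above amount to keeping whatever the rate controller proposes inside the user-supplied range. A minimal sketch of that idea follows; it assumes an 8-bit quantizer range, and `clamp_quantizer` is a hypothetical name that does not reflect rav1e's actual rate control code.

```rust
/// Clamp a proposed quantizer to the [min_q, max_q] range requested by the user.
/// Hypothetical helper; assumes quantizers fit in 0..=255.
pub fn clamp_quantizer(proposed: i32, min_q: u8, max_q: u8) -> u8 {
    // Tolerate swapped bounds instead of panicking inside `clamp`.
    let lo = min_q.min(max_q) as i32;
    let hi = min_q.max(max_q) as i32;
    proposed.clamp(lo, hi) as u8
}
```

With bounds 10 and 200, a proposal of 300 is pulled down to 200, and a proposal of -5 is raised to 10.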

Changes

  • Bump minimum version of NASM to 2.14.02
  • Update speed presets
    • Enable full SGR search for levels 0-4 instead of 0-8
    • Enable fine directional prediction for all speed levels
    • Enable reduced transform type search for levels 6-10 instead of 5-10
    • Disable transform type RDO for inter frames
  • Rename "native" CPU feature level to "Rust" (use RAV1E_CPU_TARGET=rust at runtime)
  • Remove in-library PSNR computation feature
  • Move frame-related data structures to a separate crate (v_frame)
  • Extend dump_lookahead_data feature
    • Export the frame_subtype property
    • Use the RAV1E_DATA_PATH environment variable to determine the output path
  • Refactor CDEF to make importing dav1d CDEF assembly easier and to simplify interaction between loop filters
  • Remove leftover code ported from libaom
  • Remove unused diamond motion estimation
  • Reduce build time
    • Disable LTO by default
    • Disable code generation unit restriction
    • Allow incremental builds for the release profile
    • Inline various functions
    • Remove large stack allocations
    • Split large modules into multiple submodules
  • Add an unstable channel API feature
  • Prompt before overwriting an existing output file; add -y to skip the prompt
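
The speed preset changes listed above can be read as a mapping from speed level to enabled tools. The sketch below encodes only the three range-based items from that list. The struct, its field names, and `preset_sketch` are hypothetical and do not mirror rav1e's real speed settings types.

```rust
/// Illustrative subset of per-speed tool switches (hypothetical names).
#[derive(Debug, PartialEq, Eq)]
pub struct PresetSketch {
    pub full_sgr_search: bool,
    pub fine_directional_prediction: bool,
    pub reduced_tx_type_search: bool,
}

/// Map a speed level (0 = slowest, 10 = fastest) to the switches described
/// in the 0.4.0 change list.
pub fn preset_sketch(speed: u8) -> PresetSketch {
    assert!(speed <= 10, "speed levels run from 0 to 10");
    PresetSketch {
        // Full SGR search: now levels 0-4 (previously 0-8).
        full_sgr_search: speed <= 4,
        // Fine directional prediction: enabled for all speed levels.
        fine_directional_prediction: true,
        // Reduced transform type search: now levels 6-10 (previously 5-10).
        reduced_tx_type_search: speed >= 6,
    }
}
```

In this reading, speed 5 is the one level that has neither the full SGR search nor the reduced transform type search.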

Weekly pre-release

12 Jan 22:30

p20210112

Bump libfuzzer-sys

Compatibility release

31 Dec 12:00

  • Support Apple Silicon

Weekly pre-release

05 Jan 21:32

CI: Remove a useless condition

The deploy action should run only on the `xiph` organization regardless
of the events.

Weekly pre-release

29 Dec 21:33

CI: Use ilammy/setup-nasm@v1 to install nasm

Replaces the proprietary script

Weekly pre-release

22 Dec 21:32

p20201222

Add LICENSE file to v_frame

Weekly pre-release

15 Dec 21:30

p20201215

Add LICENSE file to v_frame

Weekly pre-release

08 Dec 21:27

p20201208

CI: Run on macOS 11.0

Weekly pre-release

24 Nov 21:24

Optimise NEON assembly for get_sad for width 16

Reduces execution time for this function by between 18% and 40%.
The most commonly used block size, 16x16, is improved by 29%.
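
For readers unfamiliar with the function being optimised, SAD is the sum of absolute differences between a source block and a reference block. The scalar Rust sketch below shows the computation that such assembly accelerates. It is an illustration with a hypothetical signature, not rav1e's `get_sad` and not the NEON routine itself.

```rust
/// Scalar sum of absolute differences over a `width` x `height` block.
/// `src_stride` and `ref_stride` are the distances, in samples, between
/// the starts of consecutive rows in each buffer.
pub fn sad(
    src: &[u8],
    src_stride: usize,
    reference: &[u8],
    ref_stride: usize,
    width: usize,
    height: usize,
) -> u32 {
    let mut sum = 0u32;
    for y in 0..height {
        let s = &src[y * src_stride..y * src_stride + width];
        let r = &reference[y * ref_stride..y * ref_stride + width];
        for (a, b) in s.iter().zip(r.iter()) {
            // Widen before subtracting so the difference cannot wrap.
            sum += (*a as i32 - *b as i32).abs() as u32;
        }
    }
    sum
}
```

For a 16x16 block where every source sample is 10 and every reference sample is 7, the result is 16 * 16 * 3 = 768.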

Weekly pre-release

10 Nov 21:24
8d86097

Ensure that min and max quantizer are respected (#2578)