NTT Develops World's First H.265/HEVC Software-Encoding Engine Supporting 60P/120P Simultaneous Transmission of Ultra High-Definition Video
Nippon Telegraph and Telephone Corporation
NTT Advanced Technology Corporation
Nippon Telegraph and Telephone Corporation (hereafter referred to as NTT) has developed a software-encoding engine that complies with the international standard H.265/HEVC (High Efficiency Video Coding; hereafter abbreviated to HEVC) and enables the simultaneous transmission of the 60-frames-per-second video currently used for 4K distribution and ultra-high-definition 120-frames-per-second video.
Encoding with this technology will enable viewers to watch video at 60 frames per second on existing TVs and other devices when 120-frames-per-second video distribution starts in the future, facilitating a smooth switchover. This functionality is achieved by implementing temporal-direction hierarchical encoding, which embeds a 60-frames-per-second HEVC bitstream in a 120-frames-per-second HEVC bitstream. NTT’s original bit rate control algorithm is used to achieve high visual quality while keeping both the 60- and 120-frames-per-second HEVC bitstreams at the desired bit rates. Furthermore, by introducing an original variable-bit-rate encoding technique, we have achieved an approximately 40% reduction in the amount of data compared with constant-bit-rate encoding, while keeping image quality equivalent to the previous level. We can therefore expect a reduction in storage costs, even for current distribution services.
This technique has been acquired by NTT Advanced Technology Corporation (hereafter referred to as NTT-AT), and upgraded versions of the software codec development kit “HEVC-1000 SDK” and the file conversion software “FileConverter FC4000” are going on sale.
Note that this technique will be exhibited at “IBC 2015”, which will be held in Amsterdam, the Netherlands, from September 11 to 15.
1. Background and objectives
High-definition 4K video distribution services are currently being commercialized, and studies are under way on the distribution of 120-frames-per-second 4K/8K video, which is suited to footage in which subjects move rapidly, such as sports. To ensure that 120-frames-per-second video can be viewed even on decoders designed only for existing 60-frames-per-second video, temporal-direction hierarchical encoding is specified in STD-B32*2, a standard issued by ARIB*1. In temporal-direction hierarchical encoding, a different bit rate is set for each frame rate, which requires an accurate technique for estimating the amount of encoded data.
With the start of full-scale 4K video distribution, the volume of content has been increasing year by year. As resolution and frame rate increase as well, the amount of data becomes huge, so the video must be compressed, and improvements in the compression performance of H.265/HEVC encoders are expected.
2. Technical points
(1) Temporal-direction hierarchical encoding function (Fig. 1)
This software encoding engine is the world’s first encoder to support the temporal-direction hierarchical encoding specified by ARIB. In temporal-direction hierarchical encoding, the video is encoded so that a subset of the frames can be extracted from the 120-frames-per-second stream. During encoding, each frame is analyzed and the amount of data it will require is estimated. If this estimate is inaccurate, the amount of encoded data allocated to each frame rate is not optimal, which may result in low image quality. At NTT, we have developed a technique that improves the accuracy of the estimate by decomposing each frame into its elements and analyzing them, so that the amount of data can be allocated efficiently.
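As a rough illustration of how a 60-frames-per-second stream can be pulled out of a 120-frames-per-second one, the following Python sketch filters HEVC NAL units by the temporal ID carried in the 2-byte NAL unit header defined by H.265. It is not NTT's encoder. The two-layer assignment (temporal ID 0 for the 60P frames, 1 for the additional 120P frames) and the helper names build_nal_header, parse_nal_header, and extract_sub_bitstream are assumptions made for this example, and the layer numbering actually required by ARIB STD-B32 may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NalHeader:
    nal_unit_type: int
    nuh_layer_id: int
    temporal_id: int


def build_nal_header(nal_unit_type: int, nuh_layer_id: int, temporal_id: int) -> bytes:
    """Pack the 2-byte HEVC NAL unit header (helper for this example only)."""
    b0 = ((nal_unit_type & 0x3F) << 1) | ((nuh_layer_id >> 5) & 0x01)
    b1 = ((nuh_layer_id & 0x1F) << 3) | ((temporal_id + 1) & 0x07)
    return bytes([b0, b1])


def parse_nal_header(nal_unit: bytes) -> NalHeader:
    """Read nal_unit_type, nuh_layer_id and TemporalId from a NAL unit."""
    if len(nal_unit) < 2:
        raise ValueError("NAL unit is shorter than its 2-byte header")
    b0, b1 = nal_unit[0], nal_unit[1]
    nal_unit_type = (b0 >> 1) & 0x3F
    nuh_layer_id = ((b0 & 0x01) << 5) | (b1 >> 3)
    temporal_id_plus1 = b1 & 0x07
    if temporal_id_plus1 == 0:
        raise ValueError("nuh_temporal_id_plus1 must not be zero")
    return NalHeader(nal_unit_type, nuh_layer_id, temporal_id_plus1 - 1)


def extract_sub_bitstream(nal_units: List[bytes], max_temporal_id: int) -> List[bytes]:
    """Keep only NAL units whose TemporalId does not exceed max_temporal_id."""
    return [
        nal for nal in nal_units
        if parse_nal_header(nal).temporal_id <= max_temporal_id
    ]


# Assumed layering: even frames form the 60P base layer (TemporalId 0),
# odd frames are the extra frames needed for 120P (TemporalId 1).
# The third byte is a dummy payload that records the frame index.
frames_120p = [build_nal_header(1, 0, i % 2) + bytes([i]) for i in range(8)]
frames_60p = extract_sub_bitstream(frames_120p, max_temporal_id=0)
print(len(frames_120p), len(frames_60p))  # 8 4
```

A decoder limited to 60 frames per second would process only the filtered list, while a 120-frames-per-second decoder would consume every NAL unit. This works only if the base-layer frames never reference the higher-layer frames, which is what the hierarchical structure guarantees.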
(2) Technique of controlling the amount of encoding according to variable bit rate (Fig. 2)
In services that store and distribute video, attention has been turning not only to constant bit rate (CBR) control, in which the distribution bit rate is fixed, but also to variable bit rate (VBR) control, in which the allocation of data is varied over time and which can reduce the overall amount of data. With VBR, the amount of data is varied according to the complexity of the video, and the various conditions necessary for decoding must be taken into account when doing so. By applying the highly accurate data-amount estimation technique that NTT has developed, we have made it possible to estimate the amount of data accurately under a combination of such conditions and to allocate it efficiently. As a result, we have achieved a reduction of approximately 40% in the amount of data compared with constant bit rate control.
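To make the difference between CBR and VBR concrete, the following Python sketch uses a toy model in which each segment of video has a complexity score and the bit rate needed for a given quality is proportional to that score. It is not NTT's rate-control algorithm. The function names, the proportional model, the peak-rate cap that stands in for decoder-side constraints, and all input numbers are assumptions chosen for the example, so the saving it computes says nothing about the approximately 40% figure reported above.

```python
from typing import List


def cbr_total_bits(complexities: List[int], bits_per_unit: int, seg_seconds: int) -> int:
    """CBR: every segment is sent at the rate the most complex segment needs."""
    rate = max(complexities) * bits_per_unit
    return rate * seg_seconds * len(complexities)


def vbr_total_bits(complexities: List[int], bits_per_unit: int,
                   seg_seconds: int, peak_rate: int) -> int:
    """VBR: each segment gets a rate proportional to its complexity,
    capped at peak_rate (a simple stand-in for decoder constraints)."""
    total = 0
    for c in complexities:
        rate = min(c * bits_per_unit, peak_rate)
        total += rate * seg_seconds
    return total


# Hypothetical input: four 2-second segments with complexity scores 2, 5, 10, 3
# and 1 Mbit/s needed per complexity unit.
complexities = [2, 5, 10, 3]
cbr = cbr_total_bits(complexities, bits_per_unit=1_000_000, seg_seconds=2)
vbr = vbr_total_bits(complexities, bits_per_unit=1_000_000, seg_seconds=2,
                     peak_rate=10_000_000)
print(cbr, vbr)  # 80000000 40000000
```

In this model the saving comes entirely from the simple segments, which CBR over-provisions. The accuracy of the per-segment estimate matters because an estimate that is too low degrades quality, while one that is too high wastes the bits that VBR is meant to save.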
(3) Expansion of this technique
The software encoding engine we developed last year is sold by NTT-AT and is used by customers ranging from equipment manufacturers to post-production operators and distribution business operators, through our software development kit and through our file conversion software, which serves as an application for transcoding files to H.265/HEVC. By incorporating the newly developed techniques into these products, we enable users to get an early start on verification work for the 4K/8K 120P video planned for the future, and we also contribute to improved compression rates for existing HD and 4K video.
Attachment 1: NTT Advanced Technology
*1 Association of Radio Industries and Businesses
*2 ARIB standard: “Video Coding, Audio Coding and Multiplexing Specifications for Digital Broadcasting”