ICME 2020 Grand Challenge on Densely Sampled Light Field Reconstruction
A densely sampled light field (DSLF) is a discrete representation of the 4D approximation of the plenoptic function parameterized by two parallel planes (camera plane and image plane), where multi-perspective camera views are arranged in such a way that the disparities between adjacent views are less than one pixel. A DSLF allows any desired light ray along the parallax axis to be generated by simple local interpolation. The DSLF capture setting, in terms of physical camera locations, depends on the minimal scene depth and the camera sensor resolution. Direct DSLF capture of real-world scenes requires a very high number of densely located cameras, which is not practical. This motivates the problem of DSLF reconstruction from a given sparse set of camera images, utilizing the properties of the scene objects and the underlying plenoptic function.
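The dependence of the capture setting on scene depth and sensor resolution can be illustrated with a minimal sketch. It assumes rectified, parallel pinhole cameras, for which a point at depth z seen by two cameras a baseline b apart has a disparity of f·b/z pixels, where f is the focal length expressed in pixels. This model and the function names below are illustrative assumptions and are not part of the challenge description.

```python
import math


def max_dslf_baseline(z_min_m, focal_length_px, max_disparity_px=1.0):
    """Largest spacing (metres) between adjacent cameras that keeps the
    disparity of the nearest scene point at or below max_disparity_px.

    Assumes rectified parallel pinhole cameras: disparity = f * b / z.
    """
    if z_min_m <= 0 or focal_length_px <= 0:
        raise ValueError("depth and focal length must be positive")
    return max_disparity_px * z_min_m / focal_length_px


def cameras_needed(span_m, z_min_m, focal_length_px):
    """Number of camera positions needed to cover span_m at DSLF density."""
    baseline = max_dslf_baseline(z_min_m, focal_length_px)
    return math.ceil(span_m / baseline) + 1


# Hypothetical setting: nearest object at 1 m, focal length of 1024 pixels.
print(max_dslf_baseline(1.0, 1024.0))      # spacing in metres
print(cameras_needed(0.125, 1.0, 1024.0))  # positions along a 12.5 cm track
```

Under these assumed values the required spacing is just under one millimetre, which shows why direct DSLF capture with physical cameras is impractical.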
The goal of this challenge is two-fold:
First, to generate high-quality and meaningful DSLF datasets for further experiments.
Second, to quantify the state of the art in the area of light field reconstruction and processing in order to provide instructive results about practical LF capture settings in terms of the number of cameras and their relative locations. The results will also be helpful in applications aiming at:
- Predicting intermediate views from neighboring views in LF compression
- Generating high-quality content for super-multiview and LF displays
- Providing free-viewpoint video (FVV) functionality
- Converting LF (i.e. ray optics based) representation into holographic (i.e. wave optics based) representations for the needs of digital hologram generation
Proponents are asked to develop and implement algorithms for DSLF reconstruction from decimated-parallax imagery in three categories:
- Cat1: close camera views along parallax axis resulting in adjacent images with narrow disparity (e.g. in the range of 8 pixels)
- Cat2: moderately-distant cameras along parallax axis, resulting in adjacent images with moderate disparities (in the range of 15-16 pixels)
- Cat3: distant cameras, resulting in adjacent images with wide disparity (in the range of 30-32 pixels)
Algorithms in each category will be evaluated separately, thus resulting in three sub-challenges. Proponents are invited to submit solutions to one or more categories.
Datasets include pre-rectified multi-perspective images of 3D scenes.
For each scene, one DSLF version and three parallax-decimated versions are available. Datasets are grouped into Development Datasets and Evaluation Datasets.
- Format: raw 8-bit R, G, B color channels
- Resolution: 1280×720 pixels
- Number of densely sampled images per dataset: in the range of 100 – 200 camera views, where the disparity range between adjacent views is 1 pixel at most
Development datasets (DD)
Three datasets with varying levels of scene complexity
- ‘Lambertian DD’ – images of a contemporary real scene with well-defined structure containing objects with Lambertian reflectance only.
- ‘Synthetic DD’ – images of a photorealistic 3D scene designed and rendered in Blender. The scene will contain predominantly semi-transparent objects. This scene is meant to challenge algorithms based on disparity estimation.
- ‘Complex DD’ – images of a real scene with high complexity in terms of depth variations, occlusions and reflective objects.
The datasets are available at http://civit.fi/densely-sampled-light-field-datasets/
From the given densely-sampled (GT) camera views, one has to generate decimated versions in the three categories:
- Cat1: images of camera views decimated from the GT by a factor of 8 along the parallax direction (i.e. dropping the 7 images between each pair of retained input images of the original dataset)
- Cat2: images of camera views decimated from the GT by a factor of 16 along the parallax direction (i.e. dropping the 15 images between each pair of retained input images of the original dataset)
- Cat3: images of camera views decimated from the GT by a factor of 32 along the parallax direction (i.e. dropping the 31 images between each pair of retained input images of the original dataset)
A script for picking the views for the decimated versions is available upon request.
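The decimation described above can be sketched as follows. This is not the organizers' script: it assumes that the first GT view is always retained and that every factor-th view after it is kept, and the names `CATEGORY_FACTORS` and `decimate_views` are hypothetical.

```python
# Decimation factors per category, as listed in the challenge description.
CATEGORY_FACTORS = {"Cat1": 8, "Cat2": 16, "Cat3": 32}


def decimate_views(views, factor):
    """Keep every factor-th view, starting from the first one.

    Assumption: selection starts at index 0; trailing views that do not
    land on the grid are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return list(views[::factor])


# Hypothetical dataset of 193 views, identified by their indices.
gt_views = list(range(193))
for category, factor in CATEGORY_FACTORS.items():
    kept = decimate_views(gt_views, factor)
    print(category, len(kept), kept[:3])
```

Since adjacent GT views differ by at most 1 pixel of disparity, keeping every 8th, 16th or 32nd view yields adjacent input views with disparities of up to roughly 8, 16 or 32 pixels, matching the three categories.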
Evaluation datasets (ED)
Three datasets with varying levels of scene complexity. These scenes are similar in complexity to the scenes from the Development Datasets, but differ in the composition of objects, etc. Similarly to the DD case, they are denoted as ‘Lambertian ED’, ‘Synthetic ED’ and ‘Complex ED’. The GT versions contain 100 – 200 images of each scene in DSLF mode. The decimated versions are composed in the same way as in the DD case, generating three decimated versions per scene. In summary:
- GT, all images in each dataset
- Lambertian ED GT
- Synthetic ED GT
- Complex ED GT
- Cat1: images at coarser grid, parallax decimation by a factor of 8
- Lambertian ED_Cat1
- Synthetic ED_Cat1
- Complex ED_Cat1
- Cat2: images at coarser grid, parallax decimation by a factor of 16
- Lambertian ED_Cat2
- Synthetic ED_Cat2
- Complex ED_Cat2
- Cat3: 7 images at coarser grid, parallax decimation by a factor of 32
- Lambertian ED_Cat3
- Synthetic ED_Cat3
- Complex ED_Cat3
These versions will NOT be provided to proponents but will be used for evaluating the performance of the submitted algorithms at the organizers’ site.
Submission of algorithms and evaluation criteria
Proponents are requested to submit binaries targeting one or more of the challenge categories. Binaries should be compiled for x86-64 CPUs and should run on a plain installation of Windows 10 or Ubuntu Linux 16.04 LTS. For any more specific hardware or OS requirements, proponents should contact the organizers in due course.
Proponents are invited to submit accompanying papers describing the approach and algorithms. Deadlines for paper submission are set in such a way that the papers will appear in the ICME Proceedings.
Evaluation will be done by the organizers, who will run the submitted binaries on the Evaluation Datasets. Only the DSLF reconstruction quality will be evaluated. Speed and real-time performance are not considered in the evaluation process.
For each of the three challenge categories, reconstructed DSLF views will be compared against GT in terms of per-view PSNR, obtained as the average of the PSNRs of the R, G and B color channels of that view. The lowest per-view PSNR for each dataset will be selected as the single quality measure for the given dataset. Algorithms will be ranked by their performance across all evaluation datasets.
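The quality measure can be sketched as below. This is an illustrative reading of the description, not the organizers' metric script: each view is assumed to be a tuple of three equally sized 8-bit channels (R, G, B) given as flat sequences, the peak value is taken as 255, and a channel with zero error is assigned infinite PSNR. All function names are hypothetical.

```python
import math


def channel_psnr(ref, rec, peak=255):
    """PSNR in dB between two equally sized 8-bit channels."""
    if len(ref) != len(rec) or len(ref) == 0:
        raise ValueError("channels must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # assumption: identical channels score infinity
    return 10.0 * math.log10(peak ** 2 / mse)


def view_psnr(ref_rgb, rec_rgb):
    """Per-view PSNR: mean of the R, G and B channel PSNRs."""
    return sum(channel_psnr(r, s) for r, s in zip(ref_rgb, rec_rgb)) / 3.0


def dataset_score(gt_views, rec_views):
    """Single quality measure for a dataset: the lowest per-view PSNR."""
    if len(gt_views) != len(rec_views):
        raise ValueError("GT and reconstruction must have the same view count")
    return min(view_psnr(g, r) for g, r in zip(gt_views, rec_views))


# Tiny synthetic example with 4-pixel channels.
gt = (bytes([10, 20, 30, 40]),) * 3
off_by_one = (bytes([11, 21, 31, 41]),) * 3
off_by_two = (bytes([12, 22, 32, 42]),) * 3
print(dataset_score([gt, gt], [off_by_one, off_by_two]))
```

Taking the minimum rather than the mean over views means a single poorly reconstructed view determines the score for the whole dataset.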
January 15, 2020: Grand Challenge web site operational; release of Development Datasets and scripts for calculating the quality metric
March 07, 2020: Submission of binaries to organizers
March 13, 2020: Grand Challenge papers submission
March 21, 2020: Feedback to proponents regarding running their binaries
April 15, 2020: Grand Challenge acceptance notification
April 29, 2020: Grand Challenge camera-ready paper submission
May 08, 2020: Submission of corrected binaries
Deadlines are set so as to allow the proponents to submit both binaries and papers. The paper-related deadlines are synchronized with the conference paper submission deadlines so that the challenge papers can be included in the conference proceedings. The binaries deadlines allow for communication between organizers and proponents to solve potential issues with running the binaries at the organizers’ site. This means that if, for some technical reason, the first submitted binaries do not work properly, proponents will be given a second chance to submit corrected binaries.