ICME 2018 Grand Challenge on Densely Sampled Light Field Reconstruction
Researchers working in the area of light field imaging are challenged to develop and submit algorithms for reconstructing a densely-sampled light field from a sparse set of camera views. This challenge is part of the Grand Challenge Program of the IEEE International Conference on Multimedia and Expo (ICME) 2018, to be held on July 23-27, 2018 in San Diego, USA. The challenge is sponsored by Business Tampere and CIVIT.
A densely-sampled light field (DSLF) is a discrete representation of the 4D approximation of the plenoptic function parameterized by two parallel planes (camera plane and image plane), where multi-perspective camera views are arranged in such a way that the disparities between adjacent views are less than one pixel. A DSLF allows generating any desired light ray along the parallax axis by simple local interpolation. The DSLF capture setting, in terms of physical camera locations, depends on the minimal scene depth and the camera sensor resolution. The number of cameras required is quite high, especially for capturing wide field-of-view content.
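As an illustration of the local interpolation mentioned above, the following minimal sketch linearly blends the two nearest DSLF views to approximate a view at a fractional camera position. It assumes each view is a flat sequence of sample values and that views are indexed by camera position. The function name `interpolate_view` is hypothetical and is not part of the challenge materials. Linear blending is only one plausible reading of "simple local interpolation"; it is reasonable here because adjacent DSLF views differ by less than one pixel of disparity.

```python
def interpolate_view(views, position):
    """Approximate the view at a fractional camera position.

    views    -- list of views; each view is a flat sequence of sample values
    position -- camera position in view-index units, 0 <= position <= len(views) - 1
    """
    last = len(views) - 1
    if not 0 <= position <= last:
        raise ValueError("position lies outside the sampled camera range")
    left = int(position)            # nearest view at or before the position
    right = min(left + 1, last)     # nearest view after it (clamped at the end)
    w = position - left             # blending weight of the right-hand view
    return [(1 - w) * a + w * b for a, b in zip(views[left], views[right])]
```

For example, `interpolate_view([[0, 0], [10, 20]], 0.5)` blends the two views equally.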
DSLF is an attractive representation of scene visual content, particularly for applications which require ray interpolation and view synthesis. The list of such applications includes refocusing, novel view generation for free-viewpoint video (FVV), super-multiview and light field displays, and holographic stereography. Direct DSLF capture of real-world scenes requires a very high number of densely located cameras, which is not practical. This motivates the problem of DSLF reconstruction from a given sparse set of camera images utilizing the properties of the scene objects and the underlying plenoptic function.
Proponents are asked to develop and implement algorithms for DSLF reconstruction from decimated-parallax imagery in three categories:
- Cat1: close camera views along parallax axis resulting in adjacent images with narrow disparity (e.g. in the range of 8 pixels)
- Cat2: moderately-distant cameras along parallax axis, resulting in adjacent images with moderate disparities (in the range of 15-16 pixels)
- Cat3: distant cameras, resulting in adjacent images with wide disparity (in the range of 30-32 pixels)
Algorithms in each category will be evaluated separately, thus resulting in three sub-challenges. Proponents are invited to submit solutions to one or more categories.
Datasets will include pre-rectified horizontal-parallax multi-perspective images of 3D scenes. We opt for horizontal parallax only as it is a well-established case in many important applications and a good starting point for developing and optimizing LF reconstruction methods.
For each scene, one DSLF version and three parallax-decimated versions will be created. Datasets will be grouped into Development Datasets and Evaluation Datasets.
- Format: raw 8-bit R, G, B color channels.
- Resolution: 1280×720 pixels
- Number of densely sampled images per dataset: 193 camera views, where the disparity range between adjacent views is 1 pixel at most.
Development datasets (DD)
These will be datasets with varying levels of scene complexity, characterised as follows:
- ‘Lambertian DD’ – images of a contemporary real scene with well-defined structure containing objects with Lambertian reflectance only.
- ‘Synthetic DD’ – images of a photorealistic 3D scene designed and rendered in Blender. The scene will contain predominantly semi-transparent objects. This scene is meant to challenge algorithms based on disparity estimation.
- ‘Complex DD’ – images of a real scene with high complexity in terms of depth variations, occlusions and reflective objects.
In order to get the Development Dataset, proponents must request a download link by sending an email to firstname.lastname@example.org. They will be provided with the GT DSLF (approximate size of 300MB per dataset) and a script for picking the views for the Cat1, Cat2 and Cat3 decimated versions.
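The view-picking step can be sketched as follows. This is not the organizers' script: it assumes that adjacent DSLF views differ by about one pixel of disparity, so that keeping every 8th, 16th or 32nd view yields roughly the disparities quoted for Cat1, Cat2 and Cat3. The step sizes, the 0-based indexing and the name `pick_views` are assumptions made for illustration.

```python
NUM_VIEWS = 193  # number of densely sampled views per dataset (from the challenge text)

# Assumed decimation steps, chosen to match the quoted disparity ranges.
STEPS = {"Cat1": 8, "Cat2": 16, "Cat3": 32}


def pick_views(category, num_views=NUM_VIEWS):
    """Return the 0-based indices of the views kept for a given category."""
    step = STEPS[category]
    return list(range(0, num_views, step))
```

Under these assumptions the first and last views (indices 0 and 192) are kept in every category, because 192 is a multiple of 8, 16 and 32. Cat1 keeps 25 views, Cat2 keeps 13 and Cat3 keeps 7.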
Evaluation datasets (ED)
The scenes will be similar in complexity to the scenes from the Development Datasets, but will differ in the composition of objects, etc. As in the DD case, they are denoted ‘Lambertian ED’, ‘Synthetic ED’ and ‘Complex ED’. The GT version of each scene will be a DSLF, and the decimated versions will be composed in the same way as in the DD case, giving three decimated versions per scene.
These versions will NOT be provided to proponents but will be used for evaluating the performance of the submitted algorithms at the organizers’ site.
Real scenes will be captured with CIVIT’s Linear Positioning System, a high-precision gantry allowing for quarter-pixel motion. A high-quality low-noise camera will be used for the scene capture.
Submission of algorithms and evaluation criteria
Proponents are requested to submit binaries targeting one or more of the challenge categories. Binaries should be compiled for x86-64 CPUs and should run on a plain installation of Windows 10 or Linux Ubuntu 16.04 LTS. For any more-specific hardware or OS requirements, proponents should contact the organizers in due course. Binaries are to be submitted through a file sender platform; proponents must request the link by sending an email to email@example.com.
Proponents are invited to submit accompanying papers describing the approach and algorithms. Paper submissions are handled by the ICME conference system.
Evaluation will be done by the organizers, who will run the submitted binaries on the Evaluation Datasets. Only the DSLF reconstruction quality will be evaluated. Speed and real-time performance are not considered in the evaluation process. However, for practical reasons, the organizers will not accept programs which take an unreasonably long time to run.
For each of the three challenge categories, the 193 reconstructed views will be compared against the GT in terms of per-view PSNR, obtained as the log ratio between 255^2 and the average of the per-view MSEs of the R, G and B color channels. The lowest per-view PSNR for each dataset will be selected as the single quality measure for the given dataset. Algorithms will be ranked for their performance across all evaluation datasets.
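The quality measure described above can be sketched as follows. This is an illustration only; the scripts released by the organizers remain the reference. The sketch assumes the usual decibel convention (10 · log10 of the ratio), which the text does not state explicitly, and assumes each view is a `bytes` object holding interleaved 8-bit R, G, B samples. The function names are hypothetical.

```python
import math


def channel_mse(ref: bytes, rec: bytes) -> float:
    """Mean squared error between two equally sized 8-bit channels."""
    if len(ref) != len(rec) or not ref:
        raise ValueError("channels must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)


def view_psnr(ref_rgb: bytes, rec_rgb: bytes) -> float:
    """PSNR of one view from the average of its R, G and B channel MSEs.

    Assumption: samples are interleaved as R, G, B, R, G, B, ...
    """
    if len(ref_rgb) % 3:
        raise ValueError("expected interleaved R, G, B samples")
    mses = [channel_mse(ref_rgb[c::3], rec_rgb[c::3]) for c in range(3)]
    mean_mse = sum(mses) / 3
    if mean_mse == 0:
        return math.inf  # identical views
    # Assumed dB convention: 10 * log10(peak^2 / MSE), with peak = 255.
    return 10 * math.log10(255 ** 2 / mean_mse)


def dataset_score(ref_views, rec_views) -> float:
    """Lowest per-view PSNR over all views of one dataset."""
    if len(ref_views) != len(rec_views) or not ref_views:
        raise ValueError("view lists must be non-empty and of equal length")
    return min(view_psnr(a, b) for a, b in zip(ref_views, rec_views))
```

Because the dataset score is the minimum over all views, a single poorly reconstructed view determines the result for the whole dataset.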
December 15, 2017: Grand Challenge web site operational; release of Development Datasets and scripts for calculating the quality metric
March 12 – March 26, 2018: Submission of binaries to organizers (contact the organizers to get a link for upload)
March 26, 2018: Grand Challenge papers submission (contact the organizers if you need more time to submit your accompanying paper)
April 01, 2018: Feedback to proponents regarding running their binaries
April 23, 2018: Grand Challenge acceptance notification
May 01, 2018: Submission of corrected binaries
May 11, 2018: Grand Challenge camera-ready paper submission
The deadlines are set so as to allow proponents to submit both binaries and papers. The paper-related deadlines are synchronized with the conference paper submission deadlines so that the challenge papers can be included in the conference proceedings. The binaries deadlines leave time for communication between organizers and proponents to solve potential issues with running the binaries at the organizers’ site. This means that if, for some technical reason, the first submitted binaries do not work properly, proponents will be given a second chance to submit corrected binaries.