- Hexagonal Fourier transform for Compression of Plenoptic video
Promotor, co-promotor, advisor : gauthier.lafruit@ulb.be, - , sarah.fernandes.pinto.fachada@ulb.be, daniele.bonatto@ulb.be, eline.soetens@ulb.be
Research Unit : LABORATORY OF IMAGE SYNTHESIS AND ANALYSIS - VIRTUAL REALITY (LISA-VR)
Description
Plenoptic cameras (such as Raytrix) possess a main lens, a micro-lens array, and a CMOS sensor. This special design makes it possible to capture directional light rays and thus 3D information about the scene. These cameras, also called light field cameras, are theoretically more suitable for 3D and VR applications than conventional cameras. Due to their structure, they capture an image composed of many micro-images arranged on a hexagonal grid, creating patterns that are poorly suited to JPEG compression, even though the image itself contains redundancies that the standard algorithm leaves unexploited.
The JPEG algorithm divides the image into blocks, then applies the discrete cosine transform (DCT, a Fourier-related transform) to represent each block in the frequency domain. Only the frequencies most significant to the human eye are then encoded, yielding a low-storage representation of the image. Decompression performs the inverse operations.
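To make the block-transform principle concrete, here is a minimal sketch in Python with square 8 × 8 blocks; it is a toy stand-in, not the full JPEG pipeline (perceptual quantization tables and entropy coding are omitted), and the thesis would replace the square lattice with hexagonal blocks matched to the micro-images:

    import numpy as np
    from scipy.fft import dctn, idctn

    def compress_block(block, keep=10):
        """Transform a block to the frequency domain and zero all but the
        `keep` largest-magnitude coefficients (a crude stand-in for JPEG's
        perceptual quantization)."""
        coeffs = dctn(block, norm="ortho")  # 2D DCT of the block
        threshold = np.sort(np.abs(coeffs).ravel())[-keep]
        coeffs[np.abs(coeffs) < threshold] = 0.0
        return coeffs

    def decompress_block(coeffs):
        """The inverse DCT recovers an approximation of the original block."""
        return idctn(coeffs, norm="ortho")

    # Toy example on a random 8x8 block.
    rng = np.random.default_rng(0)
    block = rng.random((8, 8))
    approx = decompress_block(compress_block(block, keep=10))
    print("max abs error:", np.abs(block - approx).max())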
Context
The aim of this thesis is to design a compression scheme using a hexagonal lattice for images in plenoptic format, and to explore its efficiency. Using block sizes that correspond to the micro-images will simplify the encoding of the hexagonal image structure. Several datasets captured with different plenoptic cameras (differing in micro-image size, resolution, and depth of field) will be tested and compared with the MPEG explorations of lenslet video coding (LVC) activities.
Objective
At the end of the year, the student must present:
- An implementation of a hexagonal block-based adaptation of JPEG compression
- An evaluation of its efficiency compared to the classical image compression framework used in MPEG LVC activities
Prerequisite
- Good knowledge of C++ programming
- Any multimedia course (INFO-H502, INFO-H503, or similar courses)
- Compression knowledge (INFO-H516)
Contact persons
gauthier.lafruit@ulb.be, mehrdad.teratani@ulb.be, sarah.fernandes.pinto.fachada@ulb.be, daniele.bonatto@ulb.be, eline.soetens@ulb.be
References
Hexagonal image processing: L. Middleton and J. Sivaswamy, Hexagonal Image Processing: A Practical Approach, Advances in Pattern Recognition. London: Springer, 2005.
Plenoptic camera: C. Perwass and L. Wietzke, "Single lens 3D-camera with extended depth-of-field," IS&T/SPIE Electronic Imaging, Burlingame, California, USA, 2012, p. 829108. doi: 10.1117/12.909882.
- View synthesis with Gaussian splatting and Plenoptic cameras
Promotor, co-promotor, advisor : gauthier.lafruit@ulb.be, - , sarah.dury@ulb.be, daniele.bonatto@ulb.be, eva.dubar@ulb.be
Research Unit : LABORATORY OF IMAGE SYNTHESIS AND ANALYSIS - VIRTUAL REALITY (LISA-VR)
Description
View synthesis aims to generate novel viewpoints of a real-world scene from a limited set of input images.
Plenoptic cameras [1], equipped with a micro-lens array, capture both spatial and angular information in a single shot - providing depth cues without requiring multiple viewpoints. This makes them a compact and promising solution for 3D reconstruction and view synthesis.
However, the current state-of-the-art in view synthesis - 3D Gaussian Splatting [2] - has been developed for traditional cameras and requires many images to perform well. It does not natively support the plenoptic camera model.
The goal of this thesis is to adapt Gaussian Splatting to plenoptic camera data, enabling realistic view synthesis from fewer inputs. This work will bridge a gap between modern rendering techniques and emerging camera technologies, and contribute to making efficient, high-quality 3D capture more accessible.
Context
The project will take place in the LISA-VR research unit, which focuses on volumetric rendering and view synthesis. You will have access to the lab's infrastructure, including plenoptic cameras and datasets specifically captured for this kind of research.
Objective
The main objective is to implement Gaussian Splatting for Plenoptic cameras.
By the end of the project, the system should be able to optimize a scene using one or more plenoptic cameras, and optionally also integrate traditional images. Although a single plenoptic camera theoretically provides all the required data for 3D reconstruction, its limited spatial resolution may require additional views to improve accuracy.
A comparative study will be conducted to evaluate reconstruction quality between plenoptic and conventional cameras. This includes analyzing how many input images are needed to achieve high quality depending on scene characteristics such as lighting, material properties, and geometric complexity.
Methods
Standard Gaussian Splatting implementations [2] use rasterization, which assumes a simple pinhole camera model and makes it difficult to incorporate plenoptic data directly.
Recent work on ray tracing-based Gaussian rendering [3] provides a more flexible framework that supports arbitrary camera models. This ray tracing approach will be the starting point for adapting the method to handle plenoptic cameras.
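To illustrate why a per-ray interface helps, the sketch below generates rays for a plenoptic sensor under a deliberately simplified toy model: each micro-lens is treated as an ideal pinhole in front of the sensor, the main lens is ignored, and all dimensions are hypothetical. A real adaptation would refine this camera model inside the ray tracer of [3]:

    import numpy as np

    def plenoptic_rays(pixels, lens_centers, focal_gap):
        """Toy plenoptic ray generator: each micro-lens is an ideal pinhole
        at distance `focal_gap` in front of the sensor plane (z = 0).
        `pixels` and `lens_centers` are (N, 2) and (M, 2) arrays of sensor
        coordinates; a hexagonal grid only changes the `lens_centers` layout."""
        # Assign every pixel to its nearest micro-lens centre.
        d2 = ((pixels[:, None, :] - lens_centers[None, :, :]) ** 2).sum(-1)
        nearest = lens_centers[d2.argmin(axis=1)]
        # Ray origin: the pinhole position in 3D.
        origins = np.concatenate([nearest, np.full((len(pixels), 1), focal_gap)], axis=1)
        # Ray direction: from the sensor pixel through the pinhole.
        dirs = origins - np.concatenate([pixels, np.zeros((len(pixels), 1))], axis=1)
        return origins, dirs / np.linalg.norm(dirs, axis=1, keepdims=True)

    # Hypothetical two-lens layout and a few pixels, just to show the shapes.
    centers = np.array([[0.0, 0.0], [0.05, 0.0]])  # micro-lens centres (mm)
    pix = np.array([[0.01, 0.0], [0.04, 0.01]])    # pixel positions (mm)
    origins, directions = plenoptic_rays(pix, centers, focal_gap=0.1)
    print(origins, directions, sep="\n")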
Prerequisites
- Proficient in C++
- Experience with CUDA (preferred but not required, INFO-H503)
Contact
For more information, please contact Sarah Dury: sarah.dury@ulb.be
References
[1] C. Perwass and L. Wietzke, "Single lens 3D-camera with extended depth-of-field," IS&T/SPIE Electronic Imaging, 2012.
[2] B. Kerbl, G. Kopanas, T. Leimkuehler, and G. Drettakis, "3D Gaussian Splatting for Real-Time Radiance Field Rendering," ACM Transactions on Graphics, 2023.
[3] N. Moenne-Loccoz et al., "3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes," ACM Transactions on Graphics (SIGGRAPH Asia), 2024.
- Weakly Supervised Segmentation of Malignant Epithelium in Digital Breast Pathology
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, jennifer.dhont@hubruxelles.be, younes.jourani@hubruxelles.be
Research Unit : LISA - IMAGE
Description
Background
Tumor segmentation in digital pathology plays a crucial role in breast cancer diagnosis and prognosis [1], [2]. Precise delineation of malignant epithelial regions in hematoxylin and eosin (H&E)-stained or immunohistochemistry (IHC)-stained slides enables downstream analyses, such as cellularity estimation and biomarker quantification for diagnostic pathological examination, therapeutic response assessment, treatment selection, and survival prediction [3]–[8]. Deep learning-based segmentation approaches overcome the inefficiency of manual assessment, enabling high-throughput analysis of histopathological datasets. However, current approaches predominantly rely on supervised learning, which requires labor-intensive pixel-level manual annotations that are impractical at scale [9]–[11].
Weakly supervised learning has emerged as a promising alternative, leveraging coarse-grained labels to reduce annotation burdens. Yet, existing solutions are constrained by the restriction to whole-slide image (WSI)-level classification [12], reliance on partial cell-level annotations [13], and unproven generalizability across diverse breast cancer cohorts and staining protocols [14], [15]. These challenges underscore the need for a weakly supervised segmentation method that is trained using only image-level annotations while achieving pixel-level precision in malignant epithelium delineation and generalizing to heterogeneous breast cancer datasets.
Specific tasks
Conduct a literature study to become familiar with the different topics.
Perform data preprocessing, including extracting patches from whole-slide images, applying color deconvolution to separate the hematoxylin stain from H&E and IHC images using ImageJ, and applying data augmentation techniques such as flipping, rotation, and brightness/contrast adjustment to address class imbalance (see the sketch after this list).
Implement prevalent convolutional neural network (CNN) and Transformer models, as described in Table 4 and Table 5 of Ref. [16], and conduct training and inference of these models using Python, preferably with PyTorch.
Validate the segmentation results predicted by these models across various breast cancer datasets, including H&E and IHC images, by comparing them to the ground truth segmentation mask (e.g., on the MHCI and BCSS datasets) or the ground truth cellularity (e.g., on the BreastPathQ and Post-NAT-BRCA datasets).
[Optional] Develop multiple instance learning (MIL) techniques to improve segmentation performance across diverse breast cancer datasets, aiming to achieve accuracy comparable to that of supervised semantic segmentation methods.
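For the preprocessing task above, here is a minimal Python sketch of color deconvolution and augmentation; scikit-image's rgb2hed implements standard H&E stain separation and is shown as one possible alternative to the ImageJ route mentioned in the tasks, and the patch filename is hypothetical:

    import numpy as np
    from skimage import io
    from skimage.color import rgb2hed

    def hematoxylin_channel(rgb_patch):
        """Separate the stains by color deconvolution and return the
        hematoxylin channel (channel 0 of the HED representation)."""
        return rgb2hed(rgb_patch)[..., 0]

    def augment(patch, rng):
        """Simple augmentation: random flip, 90-degree rotation, and
        brightness scaling, as listed in the task description."""
        if rng.random() < 0.5:
            patch = np.fliplr(patch)
        patch = np.rot90(patch, k=int(rng.integers(0, 4)))
        return np.clip(patch * rng.uniform(0.8, 1.2), 0, 1)

    rng = np.random.default_rng(0)
    patch = io.imread("patch_0001.png") / 255.0  # hypothetical patch file
    aug = augment(patch, rng)                    # augment in RGB space
    h = hematoxylin_channel(aug)                 # hematoxylin density map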
Resources
BreastPathQ dataset: a public dataset consisting of 69 H&E-stained WSIs collected from the resection specimens of 37 post-neoadjuvant-therapy patients with invasive residual breast cancer. 2579 image patches with ROIs of 512 × 512 pixels are manually annotated with an estimated cellularity in [0, 1].
Other public datasets: https://github.com/maduc7/Histopathology-Datasets
IHC datasets from NEOCHECKRAY: 109 IHC patches stained with an MHC-I antibody, with pixel-level manual annotations.
Prerequisite
- Python
Contact persons
Dr. Ir. Jennifer Dhont (jennifer.dhont@hubruxelles.be), Head of Data Science & AI Research Unit at Hôpital Universitaire de Bruxelles (Erasme campus)
Pr O. Debeir (olivier.debeir@ulb.be)
References
[1] D. Yan, X. Ju, et al., "Tumour stroma ratio is a potential predictor for 5-year disease-free survival in breast cancer," BMC Cancer, vol. 22, no. 1, p. 1082, Oct. 2022.
[2] L. Priya C V, B. V G, V. B R, and S. Ramachandran, "Deep learning approaches for breast cancer detection in histopathology images: A review," Cancer Biomarkers, vol. 40, no. 1, pp. 1–25, May 2024.
[3] M. Peikari, S. Salama, et al., "Automatic Cellularity Assessment from Post-Treated Breast Surgical Specimens," Cytometry A, vol. 91, no. 11, pp. 1078–1087, Nov. 2017.
[4] S. Akbar, M. Peikari, et al., "Automated and Manual Quantification of Tumour Cellularity in Digital Slides for Tumour Burden Assessment," Sci Rep, vol. 9, no. 1, p. 14099, Oct. 2019.
[5] X. Catteau, E. Zindy, et al., "Comparison Between Manual and Automated Assessment of Ki-67 in Breast Carcinoma: Test of a Simple Method in Daily Practice," Technol Cancer Res Treat, vol. 22, p. 15330338231169603, Jan. 2023.
[6] E. H. Allott, S. M. Cohen, et al., "Performance of Three-Biomarker Immunohistochemistry for Intrinsic Breast Cancer Subtyping in the AMBER Consortium," Cancer Epidemiology, Biomarkers & Prevention, vol. 25, no. 3, pp. 470–478, Mar. 2016.
[7] T. Vougiouklakis, B. J. Belovarac, et al., "The diagnostic utility of EZH2 H-score and Ki-67 index in non-invasive breast apocrine lesions," Pathology - Research and Practice, vol. 216, no. 9, p. 153041, Sep. 2020.
- RAG (Retrieval-Augmented Generation) for Patents
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, Julien.Cabay@ulb.be, Thomas.Vandamme@ulb.be
Research Unit : LISA-IMAGE
Description
RAG (Retrieval-Augmented Generation) for Patents
This project consists of the design, development, and testing of a RAG system (an AI chatbot with a specific knowledge base) for a dataset of patents.
Context
Patents are an invaluable economic asset, enabling inventors to protect their inventions for a set duration of time. These assets, in the form of patent documents, represent an enormous challenge for the administrations responsible for the protection processes (i.e., Intellectual Property Offices). The documents are highly technical, composed of different modalities (text and schematics), and particularly numerous (there were more than 35 million patents in force worldwide as of 2023; source: WIPO statistics database).
Recent technological advancements in the field of Artificial Intelligence (AI), namely Large Language Models (LLMs) and the chatbots they power, carry enormous promise of automation for these complex tasks. One such technique, Retrieval-Augmented Generation (RAG), is frequently presented as a remedy for hallucination in LLMs, as well as a relatively easy way to specialize a model using a knowledge library.
Objective
In this project, you will design, develop and test such a solution on a large corpus of patents.
Methods
Different open-source LLMs can be used and benchmarked, as can different RAG techniques. The dataset can be sourced from Google Patents.
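As a minimal sketch of the retrieval half of such a pipeline, the snippet below uses TF-IDF retrieval over a toy corpus; a real system would substitute dense embeddings and pass the retrieved passages to an LLM, and the documents and query here are invented placeholders:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Toy knowledge base; in practice, chunks of patent text from Google Patents.
    docs = [
        "A micro-lens array in front of a CMOS sensor captures directional light.",
        "A hinge mechanism for foldable display devices with a flexible panel.",
        "A method for encoding video using block-based frequency transforms.",
    ]

    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(docs)

    def retrieve(query, k=2):
        """Return the k documents most similar to the query; these would be
        inserted into the LLM prompt as grounding context."""
        scores = cosine_similarity(vectorizer.transform([query]), doc_matrix).ravel()
        return [docs[i] for i in scores.argsort()[::-1][:k]]

    context = retrieve("light field camera with micro-lenses")
    print("Answer using only this context:\n" + "\n".join(context))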
Prerequisite
- Python
- Machine Learning / Deep Learning
Contact person
For more information, please contact: Thomas.Vandamme@ulb.be
- Design and Implementation of a viewer for IP (Intellectual Property) datasets
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, Julien.Cabay@ulb.be, Thomas.Vandamme@ulb.be
Research Unit : LISA-IMAGE
Description
Design and Implementation of a viewer for IP (Intellectual Property) datasets
This project consists of the design, development, and implementation of a viewer website/software for IP datasets (Trade Marks, Patents, ...). The viewer will enable users, developers, and researchers to search, label, and extract the relevant aspects of the datasets.
Context
Current dataset viewer tools, such as Label Studio (https://labelstud.io/), have demonstrated their relevance in research and development ecosystems, especially those related to data and deep learning. However, these tools are not perfect, and whole classes of datasets, such as those related to the legal field (especially text documents), are left out of such solutions.
Objective
In this project, you will develop a complete tool (ideally web-based), or an open-source plug-in for an existing viewer/labeler (such as Label Studio), capable of handling multimodal information, such as that relative to IP (e.g., images, 3D volumes, schematics, text, sound, ...).
Prerequisite
- Web Technologies
- Python
Contact person
For more information, please contact: Thomas.Vandamme@ulb.be
- Automated web scraping for dataset compilations
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, Julien.Cabay@ulb.be, Thomas.Vandamme@ulb.be
Research Unit : LISA-IMAGE
Description
Automated web scraping for dataset compilations
This project consists of the design, development, implementation, and testing of a series of automated web scrapers. The ultimate goal is to develop a set of tools enabling the acquisition and synchronisation of different datasets.
Context
Deep Learning relies on voluminous and (ideally) good-quality datasets. Those are, unfortunately, hard to gather and label.
In the field of Intellectual Property (IP, including, e.g., Patents, Trade Marks, and Designs), relevant information is curated by public IP offices (tasked with the administration of the associated rights: registration, protection, ...). These public bodies make a range of information publicly available through various search engines (for example, see https://www.euipo.europa.eu/en/search and https://ipportal.wipo.int/home). However, the offices do not allow bulk download, and curating these datasets by hand is a particularly tedious task (there are millions of registered rights, for example).
Objective
In this project, you will develop tools to enable the fast development of web scrapers, implementing various measures to work around anti-scraping protections on the websites. A new, untested use case will be chosen to illustrate the tools' capabilities.
Methods
You will create web-based applications and interface with (simple) databases and external providers if needed. Your end product will enable non-developers to choose the elements they want to retrieve automatically from a webpage, along with various other settings.
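As a minimal sketch of the core scraping loop with Selenium, under the assumption that Chrome and chromedriver are installed; the URL and CSS selector are hypothetical placeholders, and a real scraper would add site-specific selectors, rate limiting, and explicit waits:

    import time
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes a local chromedriver installation
    try:
        # Hypothetical search-results page; replace with the real target.
        driver.get("https://example.org/ip-search?page=1")
        time.sleep(2)  # naive wait; prefer WebDriverWait in practice

        # Hypothetical selector for one result row.
        rows = driver.find_elements(By.CSS_SELECTOR, ".result-row")
        records = [row.text for row in rows]
        print(f"scraped {len(records)} records")
    finally:
        driver.quit()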
Prerequisite
- Web Technologies
- Python
- (optional) Selenium or other automation software
Contact person
For more information, please contact: Thomas.Vandamme@ulb.be
- Benchmarking Compression Techniques for Multi-Layer 3D Video
Promotor, co-promotor, advisor : gauthier.lafruit@ulb.be, - , Eline Soetens
Research Unit : LISA-VR
Description
Benchmarking Compression Techniques for Multi-Layer 3D Video
The project aims to evaluate the performance of existing compression techniques when applied to multi-layer video content used in tensor displays.
Context
Tensor displays are glasses-free 3D displays composed of multiple stacked LCD panels. A user in front of the display sees 3D content with full parallax thanks to the layered structure. However, each 3D frame requires one 2D image per layer, resulting in significantly higher data volume than standard 2D video. Existing compression methods were not designed specifically for this format, and traditional metrics such as PSNR might not accurately reflect perceived quality in multi-layer 3D video.
Objective
Compare and evaluate existing 2D and 3D video compression methods when applied to multi-layer video intended for tensor displays. Define or select a set of evaluation metrics that better capture the perceived visual quality on tensor displays.
Key goals:
Evaluate the efficiency, quality, and suitability of different codecs.
Investigate how well these codecs preserve depth perception and visual fidelity in multi-layer 3D video.
Propose relevant and multi-dimensional performance metrics for such evaluations.
Methods
Different methods are to be tested, including:
- HEVC (standard 2D compression) [1]
- VVC multi-layer (layer-aware compression) [2]
- MIV (multi-view compression) [3]
Evaluation metrics will be defined during the thesis; they might include:
- Rate-distortion analysis (bitrate vs. PSNR)
- Structural and perceptual metrics: SSIM, MS-SSIM, VMAF
- Temporal consistency metrics
Experiments will use standard datasets and reference implementations; a minimal metric-computation sketch follows.
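As a starting point for the rate-distortion analysis above, a per-layer PSNR/SSIM sketch using scikit-image; the reference/decoded arrays are random placeholders, and VMAF or temporal-consistency metrics would come from separate tools:

    import numpy as np
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    def layer_metrics(reference, decoded):
        """Compute PSNR and SSIM per display layer and average them.
        `reference` and `decoded` are (layers, H, W) arrays in [0, 1]."""
        psnrs, ssims = [], []
        for ref, dec in zip(reference, decoded):
            psnrs.append(peak_signal_noise_ratio(ref, dec, data_range=1.0))
            ssims.append(structural_similarity(ref, dec, data_range=1.0))
        return float(np.mean(psnrs)), float(np.mean(ssims))

    # Placeholder data: 3 layers of 64x64 frames with mild synthetic noise.
    rng = np.random.default_rng(0)
    ref = rng.random((3, 64, 64))
    dec = np.clip(ref + rng.normal(0, 0.05, ref.shape), 0, 1)
    print("PSNR %.2f dB, SSIM %.3f" % layer_metrics(ref, dec))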
Prerequisite
- Programming in C++ and scripting languages (e.g., Python)
- Recommended but not mandatory: familiarity with video coding pipelines and concepts (e.g., INFO-H516, Visual Media Compression)
Contact person
Eline Soetens (supervisor) : eline.soetens@ulb.be
Bibliography
[1] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649–1668, Dec. 2012. https://doi.org/10.1109/TCSVT.2012.2221191.
[2] B. Bross, Y.-K. Wang, Y. Ye, S. Liu, J. Chen, G. J. Sullivan, and J.-R. Ohm, "Overview of the Versatile Video Coding (VVC) Standard and its Applications," IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3736–3764, Oct. 2021. https://doi.org/10.1109/TCSVT.2021.3101953.
[3] "Reference software – MPEG Immersive video (MIV)." https://mpeg-miv.org/index.php/reference-software/.