- Vector based technical drawing analysis using transformers
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, - , Feras.Almasri@ulb.be
Research Unit : LISA-IMAGE
Description
Project title
Vector based technical drawing analysis using transformers
Context
This master's thesis aims to explore the application of transformer models to the analysis of vector-based technical drawings, transforming an unstructured group of geometric primitives into a structured assembly of symbols and connections. Technical drawings play a crucial role in various engineering disciplines, architecture, and design fields. However, the analysis of these drawings for tasks such as object recognition, semantic segmentation, and understanding the spatial relationships between components remains a challenging problem.
This research will investigate how transformer-based architectures, known for their effectiveness in natural language processing and sequential data tasks, can be adapted to process vector-based graphics efficiently. By leveraging the self-attention mechanism inherent in transformers, the thesis will explore methods for capturing both local and global dependencies within technical drawings, facilitating more accurate analysis and interpretation.
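To make the self-attention idea above concrete, the sketch below encodes each primitive as a fixed-length feature vector and applies a single scaled dot-product attention step in pure Python. The `[is_line, is_circle, x, y, size]` encoding and the identity query/key/value projections are illustrative assumptions, not the thesis architecture; a real model would learn projections in a framework such as PyTorch.

```python
import math

def primitive_tokens(primitives):
    # Encode each geometric primitive as a fixed-length feature vector:
    # [is_line, is_circle, x, y, size]. This encoding is an illustrative
    # assumption, not a prescribed representation.
    feats = []
    for kind, x, y, size in primitives:
        feats.append([1.0 if kind == "line" else 0.0,
                      1.0 if kind == "circle" else 0.0,
                      x, y, size])
    return feats

def self_attention(tokens):
    # Single-head scaled dot-product attention with identity projections,
    # so attention weights depend directly on feature similarity.
    d = len(tokens[0])
    scale = math.sqrt(d)
    out, weights_all = [], []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in tokens]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]
        weights_all.append(w)
        out.append([sum(wi * v[j] for wi, v in zip(w, tokens)) for j in range(d)])
    return out, weights_all
```

Each output token is a similarity-weighted mixture of all primitives, which is how attention captures both local and global dependencies across a drawing.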
The study will involve several key components, including data preprocessing techniques tailored to vector graphics, the design and training of transformer-based models, and the evaluation of their performance on various tasks relevant to technical drawing analysis. Additionally, the thesis will investigate strategies for incorporating domain-specific knowledge and constraints into the model architecture to improve its effectiveness in real-world applications.
The outcomes of this research have the potential to advance the state-of-the-art in technical drawing analysis, with implications for fields such as computer-aided design (CAD), manufacturing, and architecture. By developing more sophisticated and efficient tools for analyzing vector-based drawings, this work aims to streamline design processes, enhance collaboration among professionals, and ultimately contribute to innovation in engineering and design practices.
Prerequisite
- Python
- some experience with ML frameworks such as PyTorch or TensorFlow
Contact persons
For more information please contact : olivier.debeir@ulb.be, feras.almasri@ulb.be
- Symbol recognition in 2D rasterized technical drawings using vector based examples and data augmentation
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, - , feras.almasri@ulb.be
Research Unit : LISA-IMAGE
Description
Project title
Symbol recognition in 2D rasterized technical drawings using vector based examples and data augmentation
Context
This master's thesis seeks to address the challenge of symbol recognition in 2D rasterized technical drawings by leveraging vector-based examples and data augmentation techniques. Technical drawings often contain a multitude of symbols representing various components, annotations, and connections. Efficiently and accurately recognizing these symbols is essential for tasks such as automated drawing analysis, classification, and indexing.
The research will explore the utilization of vector-based representations of symbols as examples for training convolutional neural networks (CNNs) on rasterized images. By converting vector-based symbols into rasterized images, the thesis aims to create a rich dataset for training and validation. Additionally, data augmentation techniques will be employed to enhance the robustness and generalization of the trained models, considering variations in symbol appearance, orientation, scale, and noise.
The study will involve several stages, including the collection and preprocessing of vector-based symbol examples from technical drawing datasets, the development of data augmentation strategies tailored to symbol recognition tasks, and the design and training of CNN architectures optimized for recognizing symbols in rasterized images.
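The rasterisation and augmentation steps described above can be sketched as follows, for the simplest possible primitive (a line segment). This is a toy sketch under stated assumptions: real pipelines would parse SVG/DXF symbols and apply richer transforms (scale, noise, thickness), but the key idea of augmenting in the vector domain before rasterising is the same.

```python
import math

def rasterize_segment(x0, y0, x1, y1, size=32):
    # Naive line rasterization by dense sampling (not Bresenham, but
    # sufficient for generating small training patches).
    img = [[0] * size for _ in range(size)]
    n = 4 * size
    for i in range(n + 1):
        t = i / n
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        if 0 <= x < size and 0 <= y < size:
            img[y][x] = 1
    return img

def rotated_copy(x0, y0, x1, y1, angle_deg, size=32):
    # Data augmentation in the vector domain: rotate the segment around
    # the patch centre, then rasterize. Rotating before rasterization
    # avoids the resampling artifacts of image-domain rotation.
    c = (size - 1) / 2.0
    a = math.radians(angle_deg)
    def rot(x, y):
        return (c + (x - c) * math.cos(a) - (y - c) * math.sin(a),
                c + (x - c) * math.sin(a) + (y - c) * math.cos(a))
    (rx0, ry0), (rx1, ry1) = rot(x0, y0), rot(x1, y1)
    return rasterize_segment(rx0, ry0, rx1, ry1, size)
```

Each augmented raster patch can then be fed to a CNN exactly like a scanned drawing crop.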
The outcomes of this research have implications for various domains, including CAD software, document management systems, and engineering workflows. By enhancing the automation of symbol recognition in technical drawings, this work seeks to streamline design processes, improve information retrieval, and facilitate more efficient collaboration among professionals in engineering and related fields.
Prerequisite
- some experience with ML frameworks such as PyTorch or TensorFlow
- Python
Contact person
For more information please contact : olivier.debeir@ulb.be, feras.almasri@ulb.be
- Speech Defect Detection Using Transformers for Non-Real-Time and Real-Time Applications
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, - ,
Research Unit : LISA-IMAGE
Description
Project title
Speech Defect Detection Using Transformers for Non-Real-Time and Real-Time Applications
Context
This master's thesis endeavors to address the task of speech defect detection by employing transformer models suitable for both non-real-time (non-RT) and real-time (RT) applications, comparing their efficacy with traditional methods such as those implemented in the Praat framework. Speech defects encompass a range of anomalies, such as stuttering and articulation errors, which can significantly impact communication and language development. Detecting and analyzing these defects is crucial for diagnosing speech disorders and guiding therapy interventions.
The research will investigate the effectiveness of transformer architectures, renowned for their capabilities in sequence modeling and understanding contextual dependencies, in comparison to the widely used Praat framework. Unlike Praat, which relies on handcrafted features and traditional signal processing techniques, transformers offer a more data-driven and scalable solution for processing sequential data.
The study will involve several key components, including the collection and preprocessing of speech data annotated with defect labels, the design and training of transformer-based models for defect detection, and the evaluation of their performance in both non-real-time and real-time scenarios. The Praat framework will be used as a benchmark for comparison, considering factors such as detection accuracy, computational efficiency, and adaptability to different speech defect types.
For non-real-time applications, emphasis will be placed on achieving high accuracy and robustness, while for real-time applications, low latency and efficiency will be prioritized.
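As a point of reference for the Praat-style handcrafted baseline mentioned above, the sketch below computes frame-wise short-time energy and flags abnormally long low-energy runs, one crude proxy for dysfluent pauses. The frame sizes and thresholds are illustrative assumptions, not calibrated clinical values; a transformer model would replace the hand-tuned rule with learned sequence labelling.

```python
def short_time_energy(signal, frame=160, hop=80):
    # Frame-wise energy, a classic handcrafted feature of the kind
    # Praat-style pipelines rely on (frame/hop in samples).
    energies = []
    for start in range(0, len(signal) - frame + 1, hop):
        seg = signal[start:start + frame]
        energies.append(sum(s * s for s in seg) / frame)
    return energies

def long_pauses(energies, threshold, min_frames):
    # Flag runs of consecutive low-energy frames longer than min_frames;
    # abnormally long silences are one crude proxy for dysfluency.
    runs, start = [], None
    for i, e in enumerate(energies + [threshold + 1.0]):  # sentinel closes last run
        if e < threshold:
            start = i if start is None else start
        elif start is not None:
            if i - start >= min_frames:
                runs.append((start, i))
            start = None
    return runs
```

Because the features are causal and frame-based, the same baseline also illustrates the low-latency constraint of the real-time setting.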
The outcomes of this research have the potential to impact various domains, including speech therapy, assistive technology, and automated quality assessment in speech-related applications.
Prerequisite
- some experience in signal processing
- experience with ML frameworks such as PyTorch or TensorFlow
- Python
Contact person
For more information please contact : olivier.debeir@ulb.be
- Seismogram digitisation from raster images using transformers
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, - ,
Research Unit : LISA-IMAGE
Description
Project title
Seismogram digitisation from raster images using transformers
Context
This master's thesis aims to digitise seismograms from raster images, using both classical image processing techniques and advanced machine learning models, particularly transformers. The project will compare the effectiveness of these two approaches, focusing on their ability to handle complex signals such as path crossings and interrupted traces. Classical image processing methods, such as those employed by specialised programs like DigitSeis, excel in simpler cases but often struggle with more intricate signal patterns. By contrast, machine learning-based approaches, particularly those leveraging transformer architectures, offer potential improvements in accuracy and efficiency for digitising challenging seismogram data.
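A minimal version of the classical baseline can be sketched as column-wise trace extraction: for each image column, take the centre of mass of the ink pixels. This pure-Python sketch (binary nested-list image, an assumption for illustration) works on clean traces and fails exactly where the thesis targets improvements, i.e. at crossings and gaps.

```python
def extract_trace(img):
    # Classical baseline: for each column of the scanned image, take the
    # centre of mass of ink pixels as the trace position. It breaks down
    # at path crossings (two ink blobs merge into one average) and
    # returns None where the trace is interrupted.
    trace = []
    for x in range(len(img[0])):
        ys = [y for y in range(len(img)) if img[y][x] > 0]
        trace.append(sum(ys) / len(ys) if ys else None)  # None marks a gap
    return trace
```

A learned model would instead predict the trace position (or several, at crossings) from the full 2-D context rather than one column at a time.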
The research is carried out in collaboration with the Seismology-Gravimetry service of the Royal Observatory of Belgium.
Prerequisite
- ML frameworks such as PyTorch or TensorFlow
- Python
Contact person
For more information please contact : olivier.debeir@ulb.be
- Using language models to extract clinical data from medical protocols
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, egor.zindy@ulb.be,
Research Unit : LISA-IMAGE
Description
Project title
Using language models to extract clinical data from medical protocols
Context
Artificial intelligence (AI) has made significant progress in various sectors, and healthcare is no exception. One of the promising areas of AI in healthcare is natural language processing (NLP), which has the potential to revolutionise the creation of medical databases by extracting information directly from medical protocols written by clinicians and pathologists. This would make it possible to standardise data fields and improve the efficiency and accuracy of the analysis of the medical data they contain.
Objective
- Understand and analyse the use of NLP and AI in the extraction of clinical data from medical protocols.
- Gain an understanding of the basics of NLP
- Discover some NLP libraries commonly used in the healthcare field, in particular ClinicalBERT, a pre-trained language model specifically for the medical field
- Learn how NLP can be used to create medical databases
Motivation: the use of AI and NLP to create medical databases is driven by the need to standardise data fields and to improve the efficiency of medical data analysis in clinical studies that may involve several hundred patients. Medical protocols written by clinicians and pathologists contain a wealth of clinical information, but this data is often unstructured and difficult to analyse systematically. By using NLP to extract data from these protocols, it is possible to standardise data fields and facilitate the analysis of medical data.
In addition, the use of standardised medical databases can lead to a better understanding of diseases, more accurate diagnoses and more effective treatments.
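Before reaching for a learned model such as ClinicalBERT, a useful baseline is rule-based extraction of structured fields from free-text protocols. The sketch below is exactly that baseline; the field names and regular expressions are illustrative assumptions, not an actual protocol schema, and a language model would be expected to outperform them on varied phrasing.

```python
import re

# Rule-based extraction baseline for comparison with learned models such
# as ClinicalBERT. Field names and patterns below are illustrative
# assumptions, not a real hospital schema.
PATTERNS = {
    "age": re.compile(r"\bage[d:\s]+(\d{1,3})\b", re.I),
    "diagnosis": re.compile(r"\bdiagnosis[:\s]+([A-Za-z ]+?)(?:\.|,|$)", re.I),
    "lesion_size_mm": re.compile(r"\b(\d+(?:\.\d+)?)\s*mm\b", re.I),
}

def extract_fields(protocol_text):
    # Return a flat record suitable for a standardised database row;
    # missing fields are stored as None rather than guessed.
    record = {}
    for field, pattern in PATTERNS.items():
        m = pattern.search(protocol_text)
        record[field] = m.group(1).strip() if m else None
    return record
```

The output of such a baseline also makes a convenient evaluation target: the learned model can be scored on how often it recovers the same standardised fields.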
Prerequisite
- experience in ML framework such as PyTorch or TensorFlow
- Python
Contact person
For more information please contact : olivier.debeir@ulb.be, egor.zindy@ulb.be
- Texture extraction from micro-ultrasound sequence extraction based on unsupervised and semi-supervised approaches
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, - , toshna.manohar.panjwani@ulb.be
Research Unit : LISA-IMAGE
Description
Project title
Texture extraction from micro-ultrasound sequence extraction based on unsupervised and semi-supervised approaches
Context
Transrectal ultrasound (TRUS)-based targeted biopsy coupled with multiparametric MRI (mpMRI) has been one of the most important elements of the prostate cancer diagnosis pipeline. More recently it has been replaced with micro-ultrasound (microUS) for imaging and biopsy, which provides a 300% improvement in image resolution compared to mpMRI for prostate cancer screening. MicroUS is standardised with the Prostate Risk Identification using Micro-Ultrasound (PRIMUS) system, which determines risk stratification, biopsy technique and the course of treatment suitable for the patient (Gurwin et al., 2022; Panzone et al., 2022). Focal lesions that generally go undetected with TRUS can be seen as hypoechoic areas on microUS images. These textural differences occur because cancerous tissues reflect significantly fewer sound echoes than normal tissues, making them hypoechoic. Various studies have demonstrated significant advances in automatic segmentation and detection of prostate cancer through AI models in TRUS and MR-based prostate cancer imaging, employing techniques ranging from wavelet transforms to machine learning algorithms. Supervised and semi-supervised texture segmentation of prostate ultrasound images represent two distinct approaches. Supervised methods, exemplified by DenseMP (Fan et al., 2023), the Correlation Aware Mutual Learning (CAML) framework (Gao et al., 2023), the super-pixel algorithm (Carriere et al.), wavelet-based support vector machines (W-SVMs), kernel support vector machines (KSVMs), and neural-fuzzy approaches (Hossian et al., 2019; Mohammed et al., 2023; Xu et al., 2018; Layek et al., 2019), use annotated textural and spatial features for automated region segmentation in ultrasound images.
Some notable semi-supervised approaches include Tao Lei et al.'s adversarial self-ensembling network using dynamic convolution (ASE-Net), the semi-supervised domain adaptation (SSDA) investigated by Basak and Yin, and the dual uncertainty-guided mixing consistency network for accurate 3D semi-supervised segmentation (Chenchu Xu et al.). Some of these models achieve high Dice similarity in segmenting the prostate from TRUS images without the need for image preprocessing. Multi-atlas registration and statistical texture priors have also been employed in an automatic 3D segmentation method for ultrasound-guided prostate biopsies. The aim is to apply and compare various texture-based segmentation methods to overcome the ambiguity and operator differences in determining PRIMUS scores in micro-ultrasound images, and to develop a suitable pipeline.
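At the unsupervised end of the spectrum above, the simplest texture-based segmentation is patch statistics plus clustering. The sketch below uses per-patch mean and variance as a crude stand-in for learned texture descriptors and a deterministic two-cluster k-means; patch size and features are illustrative assumptions, far simpler than the cited models, but they show the structure of the comparison pipeline.

```python
def patch_features(img, patch=4):
    # Mean and variance per non-overlapping patch: a crude texture
    # descriptor standing in for learned or wavelet features.
    feats = []
    h, w = len(img), len(img[0])
    for py in range(0, h - patch + 1, patch):
        for px in range(0, w - patch + 1, patch):
            vals = [img[py + j][px + i] for j in range(patch) for i in range(patch)]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            feats.append((mean, var))
    return feats

def two_means(feats, iters=20):
    # Minimal k-means (k=2) on 2-D features; initialisation is
    # deterministic (extremes of the variance axis) for reproducibility.
    a = min(feats, key=lambda f: f[1])
    b = max(feats, key=lambda f: f[1])
    labels = [0] * len(feats)
    for _ in range(iters):
        for i, f in enumerate(feats):
            da = (f[0] - a[0]) ** 2 + (f[1] - a[1]) ** 2
            db = (f[0] - b[0]) ** 2 + (f[1] - b[1]) ** 2
            labels[i] = 0 if da <= db else 1
        for lab in (0, 1):
            members = [f for i, f in enumerate(feats) if labels[i] == lab]
            if members:
                cx = sum(m[0] for m in members) / len(members)
                cy = sum(m[1] for m in members) / len(members)
                if lab == 0:
                    a = (cx, cy)
                else:
                    b = (cx, cy)
    return labels
```

Replacing `patch_features` with deeper descriptors while keeping the same clustering step is one natural way to compare unsupervised variants on equal footing.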
The project is done in collaboration with the radiology department of CHIREC - Hôpital Delta.
Prerequisite
- image processing,
- Python
Contact person
For more information please contact : toshna.manohar.panjwani@ulb.be, olivier.debeir@ulb.be
- Unsupervised and semi-supervised models to extract features in micro-ultrasound images.
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, - , toshna.manohar.panjwani@ulb.be
Research Unit : LISA-IMAGE
Description
Project title
Unsupervised and semi-supervised models to extract features in micro-ultrasound images.
Context
Micro-ultrasound (microUS), a high-frequency, high-resolution ultrasound technique, offers a cost-effective alternative to MRI with improved accuracy for diagnosing prostate cancer. MicroUS is standardised with the Prostate Risk Identification using Micro-Ultrasound (PRIMUS) system, which determines risk stratification, biopsy technique and the course of treatment suitable to the patient (Gurwin et al., 2022; Panzone et al., 2022). Micro-ultrasound images can be hard to decipher, particularly due to artifacts and indistinct borders, which can lead to inconsistent PRIMUS scores given by experts and improper prostate segmentation. This can be addressed by using annotations to identify regions of interest such as lesions, hyperechoic areas and other anatomical or pathological features and artifacts. By leveraging deep learning, the need for manual annotations can be reduced, as models become capable of generating accurate annotations. Advanced segmentation models like MicroSegNet improve diagnostic reliability by focusing on hard-to-segment regions, reducing discrepancies between expert and non-expert annotations and facilitating more accurate treatment planning. Additionally, self-supervised representation learning has been applied to micro-US data, showing promise in classifying cancerous versus non-cancerous tissue with high accuracy, even in the absence of labeled data (Jiang et al., 2023; Wilson et al., 2022; Gilany et al., 2022). Semi-supervised and unsupervised annotation models in micro-ultrasound (micro-US) imaging can significantly advance the detection and diagnosis of prostate cancer. The aim is to develop robust deep learning models that can confidently identify prostate cancer with high accuracy and reliability, especially in the presence of weak labels and ultrasound artifacts.
The project is done in collaboration with the radiology department of CHIREC - Hôpital Delta.
Prerequisite
- deep neural networks (pytorch or Tensorflow)
- image processing
- Python
Contact person
For more information please contact : olivier.debeir@ulb.be, toshna.manohar.panjwani@ulb.be
- Developing an AI algorithm for Deauville score detection
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, erwin.woff@hub.be,
Research Unit : LISA-IMAGE
Description
Study Project:
Developing an AI algorithm for Deauville score detection
Context and aim of the project:
The Deauville criteria play a crucial role in assessing the response to lymphoma treatment. Initially based on visual interpretation of FDG-PET/CT scans, the Deauville score, which ranges from 1 to 5, quantifies the level of FDG uptake in lesions and thereby assists clinicians in evaluating the remaining activity of the residual disease after therapy. As the criteria have evolved, the inclusion of semi-quantitative parameters such as standardized uptake value (SUV) has improved intra- and inter-observer reproducibility. Lymphoma patients undergo FDG-PET/CT scans as part of their treatment evaluation in routine clinical practice. However, manual scoring is a time-consuming process and is susceptible to intra- and inter-observer variability. For this reason, there is a strong clinical need to develop an AI algorithm that can automatically detect and measure Deauville scores from a corpus of data composed of both images and protocols (text).
Objectives:
- Develop an AI model: Train a deep learning model to recognize and quantify FDG uptake in lesions based on FDG PET/CT scans.
- Combine image and language models: Integrate image analysis with natural language processing to extract relevant information from textual protocols.
- Automate Deauville score calculation: Create an end-to-end pipeline that takes input images and associated protocols, predicts Deauville scores, and provides confidence levels.
Methodology:
Data Collection:
- Gather a diverse dataset of FDG-PET/CT scans of lymphoma patients along with corresponding Deauville scores.
- Collect textual protocols describing the imaging procedure, patient history and Deauville scores.
Preprocessing:
- Standardize image sizes and formats.
- Extract relevant information from textual protocols (e.g., patient age, scan date, lesion location, Deauville scores).
Image Analysis:
- Use convolutional neural networks (CNNs) to analyze FDG-PET/CT scans.
- Extract features related to FDG uptake pattern (lesion with the highest SUVmax).
- Train the model to predict Deauville scores.
Natural Language Processing (NLP):
- Employ NLP techniques (e.g., BERT, GPT) to process textual protocols.
- Extract relevant keywords and phrases related to Deauville scoring.
Integration:
- Combine image and language models.
- Develop a joint representation for images and protocols.
Algorithm Training and Evaluation:
- Split the dataset into training, validation, and test sets.
- Train the integrated model using labeled data.
- Evaluate performance metrics (accuracy, precision, recall) on the test set.
Deployment:
- Deploy the AI algorithm in a clinical setting.
- Validate its performance against human experts.
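The methodology above can be summarised as an end-to-end skeleton in which an image branch and a text branch are combined into one prediction with a confidence level. Every function below is a hypothetical stub standing in for a trained model, and the SUV-ratio cut-offs are illustrative placeholders, not the Deauville criteria or clinical guidance; only the wiring reflects the proposed pipeline.

```python
def image_branch(scan):
    # Stand-in for the CNN: derive a score from the lesion-to-liver
    # SUVmax ratio with ILLUSTRATIVE cut-offs (not clinical values).
    ratio = scan["lesion_suvmax"] / scan["liver_suvmax"]
    if ratio <= 0.3:
        return 1            # no residual uptake
    if ratio <= 0.7:
        return 2            # low uptake (assumed proxy for mediastinum)
    if ratio <= 1.0:
        return 3            # uptake up to liver level
    if ratio <= 2.0:
        return 4            # moderately above liver
    return 5                # markedly above liver / new lesions

def text_branch(protocol):
    # Stand-in for the NLP model: look for an explicit score (1-5)
    # mentioned in the report text.
    for token in protocol.replace(".", " ").split():
        if token.isdigit() and 1 <= int(token) <= 5:
            return int(token)
    return None

def predict(scan, protocol):
    # Fuse the two branches; agreement between image and text is used
    # as a crude confidence signal.
    img_score = image_branch(scan)
    txt_score = text_branch(protocol)
    agree = txt_score is not None and txt_score == img_score
    return {"score": img_score, "confidence": "high" if agree else "low"}
```

In the actual project, both stubs would be replaced by trained models and the agreement rule by a learned joint representation.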
Conclusion of this project:
Developing an AI algorithm for Deauville score detection involves a multidisciplinary approach, combining image analysis and NLP. This joint project by a student from the EPB and the Faculty of Medicine would help to develop interaction and synergy between the two faculties in this field of AI, which is transdisciplinary in nature.
Contacts:
olivier.debeir@ulb.be, erwin.woff@hubruxelles.be
- Machine Learning to classify volcano-seismic data.
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, corentin.caudron@ulb.be,
Research Unit : LISA-IMAGE
Description
Machine Learning to classify volcano-seismic data.
Context
Volcano-seismic data typically capture a wide variety of earthquakes and seismic noise. Yet, their detection and classification remain complicated and time-consuming. ML-based approaches appear to be promising ways to automatically denoise, detect and classify volcanic data.
Existing approaches include, but are not limited to: https://eqtransformer.readthedocs.io/en/latest/ , https://www.nature.com/articles/s41467-020-17841-x and https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023JB027167
Contact person
For more information please contact : corentin.caudron@ulb.be, olivier.debeir@ulb.be
- Gas detection and quantification using ground vibrations acquired by fiber-optic cable data
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, corentin.caudron@ulb.be,
Research Unit : LISA-IMAGE
Description
Gas detection and quantification using ground vibrations acquired by fiber-optic cable data
Context
Fiber-optic cables interrogated by distributed acoustic sensing (DAS) appear sensitive to gas bubbles. Yet, their sampling frequency (~kHz) produces large amounts of data. Denoising, detection and classification are desirable to clean, detect and possibly cluster interesting signals in an automatic way. Pre-processing of the data might be required to facilitate the application of denoising, detection and clustering methodologies. Papers: https://www.mdpi.com/1424-8220/22/3/1266 and https://www.nature.com/articles/s41598-024-53444-y
Contact person
For more information please contact : corentin.caudron@ulb.be, olivier.debeir@ulb.be
- Towards quantifying gas fluxes using sonar data.
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, corentin.caudron@ulb.be,
Research Unit : LISA-IMAGE
Description
Towards quantifying gas fluxes using sonar data.
Context
Sonar makes it possible to detect gas in the water column, potentially providing useful metrics to quantify gas fluxes. Yet, sonar is sensitive to fish as well as to the sea or lake bottom, and echosounding profiling generates large amounts of data (GB to TB). Automated approaches to detect and classify bubble sizes would facilitate future gas flux estimates. Reference: https://aslopubs.onlinelibrary.wiley.com/doi/abs/10.4319/lom.2008.6.105
Contact person
For more information please contact : corentin.caudron@ulb.be, olivier.debeir@ulb.be
- Enhanced Building windows Segmentation in Laser Cloud Acquisition using Deep Learning.
-
Promotor, co-promotor, advisor : olivier.debeir@ulb.be, Arnaud.Schenkel@ulb.be, Feras.Almasri@ulb.be
Research Unit : LISA/IMAGE
Description
Project Goal:
The goal of this project is to develop a robust and accurate methodology for detecting windows in laser cloud acquisition data of buildings.
Context:
Laser cloud acquisition is a widely adopted technique for creating detailed 3D models of buildings. However, the precision of building modelling often faces challenges due to reflections on windows. These reflections can result in the misinterpretation of laser data, introducing noise into the acquisition process and leading to the generation of false building structures. The primary challenge to address in this project is the presence of reflections on building windows.
The acquisition took place on the ULB campus using a laser cloud camera, providing 3D projections of laser points in space alongside panoramic image views of the scene. A subset of these images has been annotated with window segmentation. The project's objective is to construct a deep learning model proficient in detecting windows, facilitating the refinement of laser point projections. This methodology aims to elevate the accuracy of 3D building models by effectively minimizing the impact of window reflections.
Objective:
Image Data Preprocessing:
Implement techniques to preprocess panoramic images, with a focus on the precise characterization of windows.
Apply image data augmentation strategies to enrich the dataset and enhance the model's robustness against variations.
Machine Learning Model Implementation:
Develop and implement a machine learning model capable of learning from the annotated dataset to accurately segment building windows.
Model Validation:
Validate the developed model using a separate set of images to ensure the model's generalization in detecting windows across diverse scenarios.
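For the validation step above, segmentation quality is usually scored with overlap metrics between the predicted and annotated window masks. A minimal sketch, assuming binary masks stored as nested lists (an illustrative representation, not the project's data format):

```python
def iou_and_dice(pred, truth):
    # Standard overlap metrics for evaluating a segmentation model
    # against annotated binary masks.
    inter = fp = fn = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            if p and t:
                inter += 1          # true positive pixel
            elif p and not t:
                fp += 1             # predicted window, annotation says no
            elif t and not p:
                fn += 1             # missed window pixel
    union = inter + fp + fn
    iou = inter / union if union else 1.0
    dice = 2 * inter / (2 * inter + fp + fn) if (inter + fp + fn) else 1.0
    return iou, dice
```

Reporting both metrics per image, rather than globally, helps expose scenes where window reflections still confuse the model.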
Prerequisites:
- Python
- Image processing
- Deep and Machine Learning
- Plenoptic data compression and image quality metric in the lenslet domain
-
Promotor, co-promotor, advisor : gauthier.lafruit@ulb.be, mehrdad.teratani@ulb.be, daniele.bonatto@ulb.be
Research Unit : ULB/LISA-VR
Description
Description
Plenoptic cameras are special cameras that capture a scene from multiple points of view. They use small lenses placed behind the main lens to do so. The resulting plenoptic images must be processed to extract a specific viewpoint before being displayed. We aim to compress these images, but their high resolution makes compression challenging. Fortunately, plenoptic images contain a significant amount of redundant information. Our goal is to leverage this redundancy to encode the images in a format that improves compression and decompression efficiency. While tools for compressing, decompressing, and transforming plenoptic images into views already exist, an intelligent method to convert plenoptic images into a more compressible format has yet to be developed; this is the aim of this master's thesis. This work could have a major impact on the future of plenoptic imaging and benefit several industries.
Another challenge is evaluating the compression. Currently, the images are processed by a tool to render the final view, which introduces artifacts that prevent the use of traditional quality metrics such as PSNR. Instead, we aim to define a quality metric in the lenslet domain (the domain of the small lenses), where the original information remains intact before the rendering process distorts it.
Context
The goal of this master's thesis is to design a quality metric in the lenslet domain, with guidance from expert members of the lab. The metric should be thoroughly tested through experiments and analysis. Once developed, it will be used to explore compression patterns in the lenslet domain. If successful, the results will be published and could serve as a foundation for further advancements by compression experts.
Objective
At the end of the year, the student must:
- Develop a quality metric, accompanied by an analysis of its applications and limitations.
- Utilize this metric to create a novel algorithm for pre-compressing plenoptic images, which can then be compressed using standard tools.
- Assess the performance of the compression algorithm using the newly developed metric.
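The starting point for such a metric is PSNR computed directly on the raw lenslet image (the array of micro-images), before any view rendering introduces artifacts. The sketch below is that baseline, assuming greyscale images as nested lists; the thesis metric would go further, e.g. by weighting per micro-image, which this sketch deliberately does not attempt.

```python
import math

def psnr_lenslet(ref, test, peak=255.0):
    # PSNR evaluated in the lenslet domain: compare the reference and
    # decoded lenslet images pixel by pixel, before view rendering.
    mse = 0.0
    n = 0
    for rrow, trow in zip(ref, test):
        for r, t in zip(rrow, trow):
            mse += (r - t) ** 2
            n += 1
    mse /= n
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(peak * peak / mse)
```

Comparing this global figure against a per-micro-image variant is one concrete experiment for analysing the metric's limitations.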
Prerequisite
- Strong interest in programming and computer vision
- Strong interest in working in a team
- Not required but preferred:
  - Any multimedia course (INFOH502, INFOH503, or similar courses)
  - OpenGL
  - OpenCV or any other libraries for Image Processing
  - C++ programming
Reference
C. Perwass and L. Wietzke, Single lens 3D-camera with extended depth-of-field, IS&T/SPIE Electronic Imaging, Burlingame, California, USA, 2012, p. 829108. doi: 10.1117/12.909882.
- Differentiable view synthesis (deep learning)
-
Promotor, co-promotor, advisor : gauthier.lafruit@ulb.be, mehrdad.teratani@ulb.be, daniele.bonatto@ulb.be
Research Unit : ULB/LISA-VR
Description
Description
By using two or more cameras to capture a scene, it is possible to recreate new (previously unseen) viewpoints through various methods. These synthesized views are essential in applications such as virtual reality, enabling users to freely navigate within a scene. One such method is the RVS software, which takes multiple input views along with their associated depth maps and reprojects them to create novel viewpoints. However, RVS operates in a forward mode: it can only generate new views, and it does not allow us to easily assess how the input images influence the quality of the output.
Deep learning employs an algorithm called backpropagation, which adjusts a network's parameters by comparing its outputs to ground truth. This process relies on calculating derivatives with respect to the network's parameters. In traditional 3D rendering, software like Blender transforms 3D data (meshes, lights, camera positions) into a 2D image, but the renderer is non-differentiable, meaning it cannot propagate gradients from the 2D image back to the original 3D scene. Recent research has developed methods to make these renderers differentiable, opening up new possibilities. We seek to apply this approach to our custom RVS renderer to enable novel research in the field of view synthesis.
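The core difficulty, and one family of solutions (stochastic gradient estimation, as in the Deliot et al. reference below), can be shown on a toy one-pixel "rasterizer". Its coverage output is piecewise constant, so its analytic derivative is zero almost everywhere; averaging finite differences under random jitter recovers a useful gradient of the blurred coverage. This is an illustrative sketch of the principle, not the RVS implementation.

```python
import random

def coverage(edge, x):
    # Toy "rasterizer": a pixel at position x is covered (1) or not (0)
    # depending on an edge parameter. Piecewise constant, so the exact
    # derivative w.r.t. edge is zero almost everywhere: useless for
    # backpropagation.
    return 1.0 if x >= edge else 0.0

def smoothed_gradient(edge, x, sigma=0.1, samples=2000, seed=0):
    # Stochastic estimate in the spirit of differentiable rasterization:
    # average central finite differences under Gaussian jitter of the
    # parameter, which approximates the gradient of a blurred coverage.
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        u = rng.gauss(0.0, sigma)
        acc += (coverage(edge + u + sigma, x)
                - coverage(edge + u - sigma, x)) / (2 * sigma)
    return acc / samples
```

Near the edge the estimator returns a clearly non-zero (negative) gradient, which is exactly the signal a differentiable RVS would feed back to the input views.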
Context
Novel view synthesis is one of the lab's areas of expertise, and we have created a state-of-the-art renderer. However, to integrate this renderer into deep learning applications, it needs to be differentiable. You will work within a team of researchers on this problem.
Objective
At the end of the year, the student must:
- Implement inside RVS a differentiable rendering algorithm based on the papers below.
- Validate the work by generating experiments and graphs that prove that the implementation is correct.
- Design simple neural networks in PyTorch that use the novel version of the software as demo applications.
Prerequisite
- Strong interest in programming and computer vision
- Strong interest in deep learning
- Strong interest in working in a team
- Not required but preferred:
  - Any experience with deep learning programming
  - Any deep learning course
  - OpenGL
  - OpenCV or any other libraries for Image Processing
  - C++ programming
References
- Deliot, Thomas, Eric Heitz, and Laurent Belcour. "Transforming a Non-Differentiable Rasterizer into a Differentiable One with Stochastic Gradient Estimation". Proceedings of the ACM on Computer Graphics and Interactive Techniques 7, no. 1 (11 May 2024): 1-16. https://doi.org/10.1145/3651298.
- Laine, Samuli, Janne Hellsten, Tero Karras, Yeongho Seol, Jaakko Lehtinen, and Timo Aila. "Modular Primitives for High-Performance Differentiable Rendering". arXiv, 6 November 2020. http://arxiv.org/abs/2011.03277.
- Liu, Shichen, Weikai Chen, Tianye Li, and Hao Li. "Soft Rasterizer: A Differentiable Renderer for Image-Based 3D Reasoning". In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 7707-7716. Seoul, Korea (South): IEEE, 2019. https://doi.org/10.1109/ICCV.2019.00780.
- Bonatto, Daniele, Sarah Fachada, Segolene Rogge, Adrian Munteanu, and Gauthier Lafruit. "Real-Time Depth Video Based Rendering for 6-DoF HMD Navigation and Light Field Displays". IEEE Access, 27 October 2021, 1-20. https://doi.org/10.1109/ACCESS.2021.3123529.
- Prototyping a Lenticular Display
-
Promotor, co-promotor, advisor : gauthier.lafruit@ulb.be, mehrdad.teratani@ulb.be, sarah.fernandes.pinto.fachada@ulb.be, daniele.bonatto@ulb.be
Research Unit : ULB/LISA-VR
Description
Description
Lenticular displays are a class of 3D displays that don't require any glasses. These displays are composed of a micro-lens array panel placed in front of an LCD display. The LCD display creates an array of micro-images, which are focused in front of the screen by the micro-lens array to form a 3D image. By controlling the micro-images' disparity, a parallax effect is created, providing autostereoscopic vision for several users at the same time. Plenoptic cameras (such as Raytrix) possess a main lens, a sheet of micro-lenses, and a CMOS sensor, in an arrangement dual to that of lenticular displays. This special design offers the possibility to capture directional light rays and thus 3D information about the scene. These cameras are called light field cameras and are theoretically more suitable for 3D and VR applications than conventional cameras. Due to their structure, they capture micro-images similar to those displayed on the LCD screen of the lenticular display.
Context
The aim of this thesis is to design and assemble a lenticular display for wide field-of-view 3D viewing, and to implement the computation of the micro-images shown on the display. Recently, such prototypes have been developed, both for horizontal parallax only, using half-cylinder-shaped micro-lenses, and for full parallax, using spherical micro-lenses, whose cost has become accessible for development. Moreover, the laboratory has developed techniques to render plenoptic images using an image and its depth information (RGBD data). These techniques need to be adapted to the developed screen.
Objective
At the end of the year, the student must:
- Present a lenticular display composed of an LCD screen and a micro-lens array.
- Compute the micro-images displayed on the LCD screen, to create a 3D image giving depth perception to the user, from RGBD data.
- Assess the quality of the produced display (field of view, computation time, depth of field, resolution, ...).
Prerequisite
- Strong interest in programming and computer vision/virtual reality
- Good knowledge of optics
- Not required but preferred:
  - Any multimedia course (INFOH502, INFOH503, or similar courses)
  - OpenGL
  - OpenCV or any other libraries for Image Processing
References
- Xing Y, Lin XY, Zhang LB, Xia YP, Zhang HL et al. Integral imaging-based tabletop light field 3D display with large viewing angle. Opto-Electron Adv 6, 220178 (2023). doi: 10.29026/oea.2023.220178.
- C. Perwass and L. Wietzke, Single lens 3D-camera with extended depth-of-field, IS&T/SPIE Electronic Imaging, Burlingame, California, USA, 2012, p. 829108. doi: 10.1117/12.909882.
- Hexagonal Fourier transform for Compression of Plenoptic video
-
Promotor, co-promotor, advisor : gauthier.lafruit@ulb.be, mehrdad.teratani@ulb.be, sarah.fernandes.pinto.fachada@ulb.be, daniele.bonatto@ulb.be, eline.soetens@ulb.be
Research Unit : ULB/LISA-VR
Description
Description
Plenoptic cameras (such as Raytrix) possess a main lens, a sheet of micro-lenses, and a CMOS sensor. This special design offers the possibility to capture directional light rays and thus 3D information about the scene. These cameras are called light field cameras and are theoretically more suitable for 3D and VR applications than conventional cameras. Due to their structure, they capture an image composed of many micro-images placed on a hexagonal grid, creating patterns that are non-optimal to compress using the JPEG algorithm, even though the image itself presents redundancies that go unexploited. The JPEG algorithm divides the image into blocks, then uses the discrete cosine transform (DCT), a Fourier-related transform, to represent each block in the frequency domain. Then, only the frequencies most significant to the human eye are encoded, creating a low-storage representation of the image. To decompress the image, the inverse operation is performed.
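The square-block JPEG step described above can be sketched as follows: a 2-D DCT-II per block, then zeroing of high frequencies as a crude stand-in for quantisation. This is an illustrative reference implementation of the standard square-block case; the thesis would replace the square blocks with hexagonal micro-image blocks, for which the transform support must be adapted.

```python
import math

def dct2(block):
    # 2-D DCT-II of a square block, the per-block transform JPEG uses.
    # O(n^4) direct evaluation: fine for illustration, not for production.
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def keep_lowpass(coeffs, k):
    # Crude stand-in for quantisation: keep only the k x k lowest
    # frequencies, which carry most of the energy of natural images.
    n = len(coeffs)
    return [[coeffs[u][v] if u < k and v < k else 0.0 for v in range(n)]
            for u in range(n)]
```

Choosing block sizes that match the micro-images, as proposed above, means each block sees one coherent micro-image instead of a slice of several, which is exactly where the hexagonal adaptation should gain.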
Context
The aim of this thesis is to design a compression scheme using a hexagonal lattice for images in plenoptic format, and explore its efficiency. Using block sizes corresponding to the micro-images will simplify the encoding of the hexagonal image structure. Several datasets captured with different plenoptic cameras (in micro-image size, resolution, depth of field) will be tested and compared with the MPEG explorations of lenslet video coding activities.
Objective
At the end of the year, the student must present:
- An implementation of a hexagonal block-based adaptation of JPEG compression.
- An evaluation of its efficiency compared to the classical image compression frameworks used in MPEG LVC activities.
Prerequisite
- Good knowledge of C++ programming
- Any multimedia course (INFOH502, INFOH503, or similar courses)
- Compression knowledge (INFO-H516)
References
- L. Middleton and J. Sivaswamy, Hexagonal Image Processing: A Practical Approach, Advances in Pattern Recognition. London: Springer, 2005.
- C. Perwass and L. Wietzke, Single lens 3D-camera with extended depth-of-field, IS&T/SPIE Electronic Imaging, Burlingame, California, USA, 2012, p. 829108. doi: 10.1117/12.909882.