Mindscan - Synthetic Brain Images: Bridging the Gap in Brain Mapping with Generative Models to Advance AI Research

We demonstrate the feasibility of applying computer vision techniques to brain mapping using synthetic data exclusively. While the utilization of synthesized training data has been prevalent in various domains, the domain gap between real and synthetic data remains a significant challenge, particularly in the context of brain images. Previous efforts have attempted to address this gap through data mixing, domain adaptation, and domain-adversarial training. However, we present compelling evidence that synthesizing data with minimal domain gap enables models trained on synthetic data to generalize effectively to real-world in-the-wild brain image datasets. In this paper, we outline the methodology of combining a procedurally-generated parametric 3D brain model with an extensive library of carefully crafted assets to render training images with unparalleled realism and diversity. Our machine learning systems are trained on these synthetic brain images, enabling us to accomplish brain-related tasks such as precise landmark localization and brain region segmentation. We demonstrate that synthetic data can achieve comparable accuracy to real data while also enabling novel approaches that would otherwise be infeasible due to the challenges associated with manual labeling in brain mapping studies.

Brought to you by PaulChrisLuke

Collaborative Partnership Request Sent

/paper has been significantly updated to follow the standard proposal format required for a collaboration request with a UK university. Further changes are reflected in the methodology and aims of the project, which now focuses on providing a synthetic dataset of MRI data.

Found! MRI datasets and Python conversion libraries

I was able to secure test data using https://www.slicer.org/. One of the challenges for AI models is dealing with large files. MRI scans in particular come in specialized file formats, such as DICOM and NIfTI, that are only readable with dedicated software or libraries. To create an extensive database for AI training, we will need to convert these files into individual JPG slices.

There are a variety of Python packages that can help with this: nibabel reads neuroimaging file formats, numpy handles the image arrays, and matplotlib can write them out as images. I am not a strong Python programmer and may need to find additional help to write the code that reads our test images and converts them to JPG.

Once we can reliably convert MRI brain scan files into a structured dataset of JPGs, we can begin to build our database for testing.
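As a starting point, here is a minimal sketch of what that conversion could look like. It assumes the scans are NIfTI files (.nii or .nii.gz); DICOM files would need a different reader. The function names (to_uint8, iter_slices, convert_scan) and the output file naming are my own placeholders, not part of any library, and this is not yet tested against our actual scans.

```python
from pathlib import Path

import numpy as np


def to_uint8(slice_2d):
    """Rescale a 2D slice to the 0..255 range so it fits an 8-bit image."""
    data = np.asarray(slice_2d, dtype=np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # Flat slice (for example empty background): avoid dividing by zero.
        return np.zeros(data.shape, dtype=np.uint8)
    return ((data - lo) / (hi - lo) * 255).round().astype(np.uint8)


def iter_slices(volume, axis=2):
    """Yield (index, 2D uint8 slice) pairs along one axis of a 3D volume."""
    volume = np.asarray(volume)
    if volume.ndim != 3:
        raise ValueError(f"expected a 3D volume, got {volume.ndim} dimensions")
    for index in range(volume.shape[axis]):
        yield index, to_uint8(np.take(volume, index, axis=axis))


def convert_scan(scan_path, out_dir, axis=2):
    """Write every slice of one NIfTI scan as a grayscale JPG.

    Returns the list of files written. The slice_NNN.jpg naming is an
    assumption made for this sketch.
    """
    # Imported here so the array helpers above work without these packages.
    import nibabel as nib
    from matplotlib.image import imsave

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # get_fdata() returns the voxel data as a floating point numpy array.
    volume = nib.load(str(scan_path)).get_fdata()

    written = []
    for index, image in iter_slices(volume, axis=axis):
        target = out_dir / f"slice_{index:03d}.jpg"
        imsave(target, image, cmap="gray", vmin=0, vmax=255)
        written.append(target)
    return written
```

Calling convert_scan("scan.nii.gz", "slices") (both names are hypothetical) would write one JPG per slice along the third axis. Two choices here are worth revisiting: each slice is rescaled on its own, so brightness is not comparable between slices, and 4D scans such as fMRI time series are rejected rather than split.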

We also found additional training data on https://openneuro.org/, thanks to Catzuo on our live stream at https://twitch.tv/paulchrisluke. This should give us a sufficient training set for the ongoing procedural generation of synthetic datasets.

Literature review changes methodology

I met with Rich, a doctor I met in Amsterdam during my travels. We have always enjoyed riffing together, and I discussed changes in our approach based on Microsoft's paper, "Fake It Till You Make It":

E. Wood, T. Baltrušaitis, C. Hewitt, S. Dziadzio, M. Johnson, V. Estellers, T. J. Cashman, and J. Shotton, “Fake It Till You Make It: Face analysis in the wild using synthetic data alone,” arXiv:2109.15102 [cs.CV], 2021.

In short, Microsoft used procedural generation to create datasets based on just 2000 base images. This drastically reduces the cost required to gather MRI images, and sets an effective goalpost for this project in terms of the number of source MRI images we will need to attain.

He agreed that this approach sounds reasonable, and went over the basics of reading MRI scans.

I also updated this web app and created the /paper page, for which I will need some help on the React side later.

Overall, the approach outlined by Microsoft seems plausible. More literature review is needed, and I will continue to educate myself and find collaborators and experts in this field to grow this project.

Initial Commit

In this initial release I have built this website to track ongoing progress for this research paper. The site is built with React/Next.js and Tailwind CSS, using the Tailwind template "Commit". In the future, each project update will be displayed in this feed as the paper is updated.