This project will focus on improving the data science and machine learning infrastructure of the DevoWorm group. The work will focus on an extension of the Summer of Code projects completed in 2017 and 2019. The first two aims are to improve upon the OpenDevoCell web interface and to improve segmentation techniques overall. While the OpenDevoCell interface has been implemented as a Heroku app, we would like to develop a dashboard for interpretation as well as tighter integration with DevoZoo's collection of open-source microscopy data. The third aim is to deploy the code package as a unified Python library, which would be done in concert with the improvement of segmentation techniques.
The priority for this Summer is to improve the web interface both in terms of interactivity and functionality. Ideally, we would like to provide users with multiple options for analysis. This includes the ability to incorporate new forms of analysis as well as algorithms for new types of data. Currently, our web app is optimized for microscopy images acquired using the SPIM technique. However, we would also like to segment microscopy images acquired using a wide range of technologies. Feeding into this is the ability to segment and obtain features for the data in our DevoZoo. The ability to extract quantitative data from these movie images is key to conducting the comparative and time-series analysis. The development of a dashboard would ideally enable users to employ various machine learning and simulation techniques in one place.
These improvements are meant to increase participation in our open science initiative and make sophisticated analytical techniques more accessible to students and potential collaborators alike.
Mayukh's project is called Pre-trained Models for Developmental Neuroscience, and builds on previous work done in the group during the DevoWormML course [1]. The project is described as follows:
This project will center around building a pre-trained model for shapes and processes related to Developmental Biology and Neurobiology and extracted from image data. Our organization's Machine Learning interest group (DevoWormML) has published a blog post [1] on the advantages of and need for pre-trained models in this area. In short, biological development is marked by characteristic shapes, movements, changes in shape, and temporal processes that define important features. Pre-trained models are used in NLP and Deep Learning for the domains of sequence discovery in language processing (GPT-2) and semantic segmentation of complex images (DeepLabv3). Models specialized for biology, however, do not exist. A suitable pre-trained model would greatly reduce the need for input data without sacrificing the ability to generalize to different contexts.
Our main interest is in extracting spatiotemporal features from image data. We will focus on microscopy data such as that found in the DevoZoo or from more specialized sources [2]. For a typical pre-trained model, the network is pre-trained with non-random weights that approximate the generalized versions of the features we would like to discover. However, we are also interested in a semantic component, particularly the ability to incorporate elements such as meaning assigned to static knowledge (semantics) and multiple meanings for a single feature (polysemy). This will enable relational modeling and the mapping of segmented image data to lineage trees and taxonomies. Our model, tentatively called DevLearningv1, should be applicable to a wide range of neural network and deep learning techniques.
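To make the idea of starting from non-random weights concrete, here is a toy sketch in plain Python. It is not DevLearningv1 or any code from the project: a two-parameter linear model stands in for a network, and the data, function names, and hyperparameters are all made up for illustration. It pre-trains on a large "source" dataset, then compares a few fine-tuning steps on a small related "target" dataset against the same number of steps from a random start.

```python
import random


def mse(w, b, data):
    """Mean squared error of the line y = w * x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(w, b, data, steps, lr):
    """Run plain gradient descent on the MSE, starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def make_data(slope, intercept, n, rng):
    """Hypothetical noisy samples from a line; stands in for real features."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, slope * x + intercept + rng.gauss(0, 0.05)) for x in xs]


rng = random.Random(0)
source = make_data(2.0, 1.0, 200, rng)    # large, generic dataset
target = make_data(2.2, 0.9, 8, rng)      # small, related dataset
held_out = make_data(2.2, 0.9, 200, rng)  # evaluation data for the target task

# "Pre-training": fit the model on the large source dataset.
pre_w, pre_b = train(0.0, 0.0, source, steps=500, lr=0.1)

# Fine-tune for only five steps, once from random weights and once
# from the pre-trained weights.
rand_w, rand_b = rng.uniform(-1, 1), rng.uniform(-1, 1)
scratch_w, scratch_b = train(rand_w, rand_b, target, steps=5, lr=0.05)
warm_w, warm_b = train(pre_w, pre_b, target, steps=5, lr=0.05)

scratch_loss = mse(scratch_w, scratch_b, held_out)
warm_loss = mse(warm_w, warm_b, held_out)
print(f"random start:      held-out MSE = {scratch_loss:.4f}")
print(f"pre-trained start: held-out MSE = {warm_loss:.4f}")
```

In this toy setting the pre-trained start ends with a lower held-out error after the same small training budget, which is the sense in which a pre-trained model reduces the need for input data. How well this carries over to microscopy images depends on how closely the pre-training features match the new data.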
Thanks to INCF for sponsoring our activities once again this year. Thanks also go to Vinay Varma, who will be providing support on all things mentorship this summer. Vinay was a GSoC student last summer, and will be sharing his wisdom with this year's students. I would also like to invite all those who applied for these projects to contribute to the Organization in some other way. Often, interaction with the community now can lead to additional opportunities down the road.
As for the Orthogonal Research and Education Lab project (Contextual Neurodevelopmental Dynamics), we unfortunately did not get any slots this year. Thanks to Ankit Gupta and Jesse Parent for their excellent proposals. I would still like to continue pursuing the initiative as an open-source effort, hopefully leading to other avenues for development and funding. The same community interaction advice given for OpenWorm applies to Orthogonal Lab as well.
We are going to be developing in the Meta-Brain repository on GitHub. Be sure to check out our Saturday Morning NeuroSim meetings for more information on this project (join our mailing list!). And register for the Neuromatch Summer School if you have not already, as it will be quite relevant to what we will be doing.
A sample of Saturday Morning NeuroSim (with a recap of the ICLR conference).
UPDATE (5/5): One of our regular meeting attendees (Devansh Batra) has also received a Google Summer of Code position with the OpenCV organization. Congrats!
NOTES:
[1] Alicea, B., Gordon, R., Kohrmann, A., Parent, J., and Varma, V. (2019). Pre-trained Machine Learning Models for Developmental Biology. The Node blog, October 29.