
Identification of Neuronal Polarity by Node-Based Machine Learning

Identifying the direction of signal flows in neural networks is important for understanding the intricate information dynamics of a living brain. Using a dataset of 213 projection neurons distributed in more than 15 neuropils of a Drosophila brain, we develop a powerful machine learning algorithm: the node-based polarity identifier of neurons (NPIN). The model is trained only on information specific to nodes, the branch points on the skeleton, and includes both Soma Features (which contain spatial information from a given node to the soma) and Local Features (which contain morphological information of a given node). After including the spatial correlations between nodal polarities, NPIN provides extremely high accuracy (>96.0%) in the classification of neuronal polarity, even for complex neurons with more than two dendrite/axon clusters. Finally, we further apply NPIN to classify the neuronal polarity of neurons in other species (Blowfly and Moth), for which much less neuronal data are available. Our results demonstrate the potential of NPIN as a powerful tool to identify the neuronal polarity of insects and to map out the signal flows in the brain's neural networks if more training data become available in the future.

Keywords: Neuronal polarity, Machine learning, Drosophila, Connectome, Axon, Dendrite

Introduction

Rapid technology advances in recent years have led to the development of several connectomic projects and large-scale databases for cellular-level neural images (Chiang et al. 2011; Kuan et al. 2015; Milyaev et al. 2012; Parekh and Ascoli 2013; Peng et al. 2015; Shinomiya et al. 2011; Xu et al. 2013; Xu et al. 2020). However, how to integrate and transform the data to address scientific questions (Lo and Chiang 2016) remains a central challenge. Overall, these projects aim to provide sufficient information for the analysis of information flows in the brain.
[Author note: Chen-Zhi Su and Kuan-Ting Chou contributed equally to this work. Correspondence: Chung-Chuan Lo (cclo@mx.nthu.edu.tw) and Daw-Wei Wang (dwwang@phys.nthu.edu.tw). Affiliations: 1. Brain Research Center, National Tsing Hua University, Hsinchu 30013, Taiwan; 2. Physics Division, National Center for Theoretical Sciences, Hsinchu 30013, Taiwan; 3. Department of Physics, National Tsing Hua University, Hsinchu 30013, Taiwan; 4. Institute of Systems Neuroscience, National Tsing Hua University, Hsinchu 30013, Taiwan; 5. Center for Quantum Technology, National Tsing Hua University, Hsinchu 30013, Taiwan. Neuroinform (2021) 19:669–684.]

This goal is difficult to achieve at the current stage, as many neural images do not provide information on polarity (axons and dendrites). The axon-dendrite polarity of a neuron can be identified by experimental methods (Craig and Banker 1994; Matus et al. 1981; Wang et al. 2004). However, these methods are not practical for large-scale neural image projects or for image datasets that have already been acquired. Morphology-based polarity identification at the post-imaging stage is possible, but it is particularly challenging for insects because of their highly diverse neuronal morphology (Cuntz et al. 2008; Lee et al. 2014).

To address this issue, the method of skeleton-based polarity identification of neurons (SPIN) was developed using several classic machine-learning (ML) algorithms (Lee et al. 2014). Although SPIN reaches a decent performance in neuronal polarity identification for the fruit fly, Drosophila melanogaster, with 84%–90% accuracy, the method suffers from the cluster-sorting problem. Most projection neurons (i.e., neurons that innervate more than one neuropil) possess two or more clusters of neural processes. Each cluster can be either axon or dendrite, but not both. Using this observation, the SPIN method first identifies the clusters of processes in a neuron and then identifies the polarity of each cluster. The strategy is highly efficient, but incorrect sorting of clusters can lead to incorrect polarity classification of a large number of terminal points at once. This is a major source of errors in the SPIN method.

In the past decade, modern ML algorithms have been applied in many research fields and in daily life. The popularity of modern ML grows because of rapid developments in computational algorithms, high-speed processors, and big data available from various resources (LeCun et al. 1998; Krizhevsky et al. 2012; LeCun et al. 2015). Some widely successful algorithms, for example, deep neural networks (DNN) and extreme gradient boosting (XGB), may recognize hidden patterns more efficiently than human knowledge and experience after proper training on big data. Therefore, ML opens a new era in which precise classification and/or prediction becomes possible even without full knowledge of the given data. As a result, many applications of ML have recently appeared in biological and medical research (Asri et al. 2016; Malta et al. 2018; Mohsen et al. 2018). It is reasonable to expect that one may apply modern ML to the identification of neuronal polarity solely using optical images of the fruit fly's brain. For neurons of this insect, several tens of thousands of high-resolution optical images are already available, which is the largest dataset among all species.

In the present work, we develop a new classifier: the node-based polarity identifier of neurons (NPIN). The proposed model achieves much higher accuracy (>96%) than SPIN or the human eye for the identification of neuronal polarity in the Drosophila brain. NPIN is developed using a node-based feature extraction method. Specifically, NPIN includes both Soma Features (spatial information between the soma and a given node) and Local Features (morphological information around a given node). Two state-of-the-art supervised learning algorithms, XGB and DNN, are used as complementary classifiers, making the method applicable to complex neurons (which have more than two axon/dendrite clusters) through a competition between Soma Features and Local Features. We find that NPIN provides extremely good results for the classification of neuronal polarity, identifying important local features in addition to the known soma features. We further apply NPIN to classify the neuronal polarity of other species of insects (in this case, Blowfly and Moth), which may have insufficient data for standard ML. These achievements are all important steps toward the understanding of signal flow dynamics in neural networks, and should speed up connectomic projects for the whole brain when more data are available for training.

Method

Overview

The axon-dendrite polarity of a neuron is correlated with certain aspects of its morphology, such as the distance (or path length) from a terminal to the soma, the number of nodes involved in a domain/cluster, and the thickness of neurites (Craig and Banker 1994; Hanesch et al. 1989; Rolls 2011; Squire et al. 2008). However, so far, very few theoretical frameworks have systematically investigated the relationship between these features and neuronal polarity. These empirical conditions are loosely defined, with many exceptions for different types of neurons. Therefore, it is difficult to identify neuronal polarity with traditional rule-based computational programs. SPIN (Lee et al. 2014), which was developed using classical ML algorithms, can be improved in many aspects.

To significantly improve on the previous methods, here we develop a new polarity identifier based on morphological features that are extracted from neuronal nodes and handled by modern ML algorithms. Different from clusters, which are usually ill-defined from a computational point of view, nodes are always well-defined by the bifurcations in a neuronal skeleton. The whole process of polarity identification is therefore composed of the following four major steps in our NPIN model. It is instructive to briefly describe them (see Fig. 1) before the further explanation in the rest of this paper:

Step I. (Data Preparation and Reorganization): We invent a diagrammatic method to map the 3D skeleton structure of a given neuron onto 2D tree diagrams, called level trees and reduced trees. This effective representation makes it easy to extract representative features for ML.

Step II. (Node-Based Feature Extraction): We determine the nodal polarity using the features of each node. Specifically, we identify and extract both Soma Features and Local Features for each node.

Step III. (ML Models): In NPIN, we apply two powerful ML algorithms, XGB and DNN, together. They provide two different but complementary approaches for the classification of axons and dendrites.

Step IV. (Implementation of Spatial Correlation): The spatial correlation of nodal polarity in the nearby region is implemented by relabeling the nodal polarity suggested by the ML models. This approach can significantly enhance the accuracy of the final output.

Typical ML methods concentrate on the algorithms in Step III. Instead, we put more emphasis on the other three steps in a way specifically useful for the determination of neuronal polarity. Figure 1 shows the flowchart of the whole calculation. We will explain these strategies in the rest of this section.

Fig. 1 Flowchart of the NPIN model. NPIN includes four major steps, as described in the text. The dataset contains 213 neurons with labeled polarity as the ground truth. We randomly choose 100/25/50 neurons from the dataset for the training/validation/test sets. Every neuron in the training/validation sets is mapped to a level tree and a reduced tree. We then extract Soma Features and Local Features from these neuronal data for training. Preliminary results are obtained by the XGB and DNN algorithms after validation. We then relabel the classification by including spatial correlations of nodal polarities before comparing them with the test data with known polarities. The whole process is repeated 20 times to cover all 213 neurons in the original dataset. As a result, each neuron can be selected as a test sample and classified by a model trained on other neurons.
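Because nodes are defined purely by bifurcations of the skeleton, Step I/II can begin from any parent-pointer reconstruction (e.g., SWC-style data). The sketch below is an illustrative reading of that idea, not the authors' released code; the function name and data layout are assumptions.

```python
# Sketch: identifying nodes (branch points) and terminals in a skeleton
# given as child -> parent links, as in SWC-style reconstructions.
# Illustrative only; not the authors' implementation.

from collections import defaultdict

def classify_points(parent):
    """parent: dict mapping point id -> parent id (the soma's parent is None)."""
    children = defaultdict(list)
    for p, par in parent.items():
        if par is not None:
            children[par].append(p)
    nodes, terminals = [], []
    for p in parent:
        n = len(children[p])
        if n == 0:
            terminals.append(p)   # ending point of a branch
        elif n >= 2:
            nodes.append(p)       # bifurcation: a well-defined node
        # points with exactly one child (e.g., the soma root here) are neither
    return sorted(nodes), sorted(terminals)

# Toy skeleton: soma 0 -> 1, which branches into terminal 2 and node 3 -> {4, 5}
parent = {0: None, 1: 0, 2: 1, 3: 1, 4: 3, 5: 3}
print(classify_points(parent))  # → ([1, 3], [2, 4, 5])
```

This separation matters because NPIN extracts its training examples from the nodes rather than from the (ill-defined) clusters.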
Dataset

Our main dataset represents 213 neurons with experimental ground truth from the Drosophila brain, which are available from the FlyCircuit database (http://www.flycircuit.tw/) (Chiang et al. 2011). These 213 neurons are all projection neurons, selected from various regions across the brain to represent the diversity of neuronal morphology as much as possible (Fig. 2(a)). Local neurons with axon/dendrite coexistence in the same branch/cluster are not included in our research. These projection neurons innervate 15 neuropils: AL, AOTU, CAL, CCP, DMP, EB, FB, IDFP, LH, LOB, MED, NO, PB, VLP, and VMP. Among these 213 neurons, 107 were included in the dataset used in the development of the previous model, SPIN, and we have 106 additional neurons for the present work. As we will show later, due to the improvement of feature extraction and the ML algorithm, our model, NPIN, substantially outperforms SPIN, not only in the overall precision and recall but also in its applicability to more brain regions and more types of complex structures. In Appendix E, we list these 213 neurons with information including the brain regions innervated by the dendrites and axons of each neuron, the numbers of axon/dendrite terminals, and the precision/recall obtained by our model.

We divide the neurons in our dataset into two types: (i) simple neurons, which have two clusters of terminals (one dendrite and one axon); and (ii) complex neurons, which have more than two clusters of terminals. In Figs. 2(b1)–(b3) and (c1)–(c4), we show some typical skeleton structures of these two types of neurons. In our dataset, we have 89 simple neurons and 124 complex neurons with previously reported polarity. Among the complex neurons, most have three clusters (two dendrites and one axon, or one dendrite and two axons); only a few have more than three clusters. The reason to classify these neurons is to investigate how the distance to the soma and the number of clusters can influence the identification of neuronal polarity. Moreover, we can examine how well NPIN performs even when the polarity is difficult to identify by the human eye, as in the case of three or more terminal clusters. This is one of the most important criteria for a polarity identifier to be practically applicable for the determination of signal flow in the neuronal networks of the insect brain. There are, of course, some other types of projection or local neurons that may not be easily classified by the number of clusters or by their polarity distribution. We do not include them in the dataset of this study because of a lack of data with confirmed polarity to be used for training. Our approach, however, may still be applicable to these neurons when more data are available in the future.

Fig. 2 Drosophila melanogaster (fruit fly) neurons used in the present study. (a) All 213 neurons in our dataset, shown in their actual locations in the standard fly brain. (b1)–(b3) Skeleton structures of several simple neurons. (c1)–(c4) Skeleton structures of several complex neurons. Black dots represent somas. Black lines are the main trunks of neurons. Blue and red lines indicate axonal and dendritic clusters, respectively. Each neuron is labeled by its ID in the FlyCircuit database.

Standardized Representation: Level Trees and Reduced Trees

To improve the accuracy of our ML model, we first need to define how to "standardize" the morphological information of these neurons, which differ greatly from each other in their original 3D structures. Figures 3(a1) and (b1) show two examples, a simple neuron and a complex neuron. First, we start with the 3D skeleton structures (see Figs. 3(a2) and (b2)) extracted from the raw images, where the width information of the trunks and branches is ignored temporarily in order to make our model more generally applicable. We then map the 3D skeleton structure onto a level tree (see Figs. 3(a3) and (b3)), which keeps all information on the position of each node (including the soma, terminals, and cross points between branches) and the path lengths between them, but ignores trunk and branch information such as width or shape. To express this information in a 2D diagram, we introduce the level structure according to the generation of nodes: the soma is placed at the top level (level 0), the next two nodes are placed at the lower level (level 1), and so on for their offspring, until all the ending nodes (terminals) are properly placed. We take the convention that branches with more successive non-empty levels are placed on the left-hand side and branches with fewer successive non-empty levels on the right-hand side (Figs. 3(a3) and (b3)). We believe that most morphological features of a neuronal cluster are still extractable from such a standardized representation because the spatial positions of all nodes (including the soma and terminals) are still available. The only information missing from the level tree (compared with the 3D skeleton image) is the shapes and widths of the neuronal branches that connect neighboring nodes. As we will see below, this missing information seems not crucial for the determination of neuronal polarity in NPIN.

In addition to the level tree representation of a neuronal structure, in this work we further define a reduced tree for each neuron. The reduced tree aims to retain the major branches of the skeleton structure in order to identify axon and dendrite clusters. This information is important for the determination of the cluster curvature and aspect ratio used as nodal features within each cluster (explained below). The reduced tree of a neuron is obtained by repeatedly removing ending nodes whose branches are shorter than a characteristic length determined by the branch distribution, until the process stops automatically or only five levels are left (see Figs. 3(a4) and (b4)). The basic assumption behind this procedure is that the major branches of a neuron's skeleton structure are contained in the "inner" (closer to the soma) and "longer" branches; shorter and more peripheral branches are minor or unimportant for determining the clusters. See Appendix A for the detailed procedure of producing the reduced tree from a level tree.

Fig. 3 Encoding 3D optical images of neurons into level trees and reduced trees. First, the volume image of a neuron (a1) is converted into the skeleton (a2), and then a level tree (a3), which is a 2D plot with a standardized method to label most features of the original neuron. Red, blue, and yellow dots represent dendrites, axons, and dividing nodes (including terminals), respectively. (a4) represents the reduced tree of the same neuron. (b1)–(b4) show the same reduction for a complex neuron. Because a complex neuron has more than two clusters, there can be more than one dividing node separating axon clusters from dendrite clusters. In (c), we graphically show the rules that define the nodal polarity based on the polarity of terminals in the level tree (see the text). Upward arrows indicate that the nodal polarity at the upper level is defined by the nodal polarities of the two nodes/terminals at the lower level.

Nodal Polarity

The polarities of the neurons in our dataset were all predetermined using the presynaptic (Syt::HA) or postsynaptic (Dscam17.1::GFP) markers (C.-Y. Lin et al. 2013) or using the morphological features described in previous studies (Fischbach and Dittrich 1989; Hanesch et al. 1989; Wu et al. 2016). There are 7142 terminals identified as dendrites and 2310 as axons. Because the axon-dendrite polarity of these terminals is highly correlated with the morphological structure of their neurons, in this study we extend the definition of polarity from terminals to nodes, and we use this information to extract features in NPIN. In other words, we use a bottom-up method to assign a node to the axon/dendrite class if its offspring branches connect to purely axonal/dendritic terminals or nodes. See below for more detail.

We emphasize that using features extracted from nodes has several important advantages over using features extracted from terminals or clusters for the ML training process. First, the number of nodes is much larger than the number of clusters in each neuron; therefore, the polarity identification has significantly higher accuracy due to the larger training set. Second, nodes are well-defined in the skeleton structure (compared with clusters) and can carry more morphological features (compared with terminals). Finally, nodes can be systematically labeled in the skeleton structure or in our level tree diagram, making it easy to include their correlated features in the spatial distribution. This node-based feature extraction is crucial in NPIN, making an accurate identification of neuronal polarity possible.
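The level-tree convention described above can be sketched in a few lines: each node's level is its generation from the soma, and at every branch the subtree with more successive non-empty levels (i.e., the deeper subtree) is placed on the left. The code below is an illustrative sketch under those assumptions, not the paper's implementation.

```python
# Sketch of the level-tree convention: level = generation from the soma
# (level 0), with the deeper subtree placed on the left-hand side.
# Names and data layout are illustrative assumptions.

def subtree_depth(children, p):
    """Number of successive non-empty levels below and including p."""
    kids = children.get(p, [])
    return 1 + max((subtree_depth(children, k) for k in kids), default=0)

def level_tree(children, soma=0):
    """children: dict node -> list of child nodes.
    Returns (node, level) pairs in left-to-right traversal order."""
    order = []
    def visit(p, level):
        order.append((p, level))
        # deeper subtrees first, i.e., placed on the left-hand side
        for k in sorted(children.get(p, []), key=lambda k: -subtree_depth(children, k)):
            visit(k, level + 1)
    visit(soma, 0)
    return order

children = {0: [1], 1: [2, 3], 3: [4, 5]}
print(level_tree(children))  # → [(0, 0), (1, 1), (3, 2), (4, 3), (5, 3), (2, 2)]
```

Node 3 precedes terminal 2 in the traversal because its subtree has more non-empty levels, matching the left-first convention of Figs. 3(a3) and (b3).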
To extend the polarity definition from terminals, as provided in the dataset, to nodes on the skeleton of a neuron, we apply the following series of rules, which define the nodal polarity according to the polarity of the terminals (Fig. 3(c)): (1) If two child nodes (or terminals) are both axons (or both dendrites), their parent node (the node that directly connects to them at the upper level) is also defined as an axon (or dendrite). (2) If one of the child nodes (or terminals) is an axon and the other is a dendrite, their parent node is defined as a "dividing node." (3) If one of the child nodes is an axon (or dendrite) and the other is a dividing node, their parent node is defined as an axon (or dendrite). Finally, (4) if two child nodes are both dividing nodes, their parent node is also defined as a dividing node (however, we do not have such a case in our dataset). The definition of dividing nodes is just for the convenience and consistency of nodal polarity. These dividing nodes are very few (mostly none, or at most two, in each neuron of our dataset) and are therefore not included in our training data. Without defining dividing nodes in this way, we could not properly identify the polarity of a node connecting to both dendrite and axon nodes.

After applying these rules, we can label the polarity of all nodes of any neuron using the polarity information of their terminals. Note that this expansion of nodal polarity should not be misunderstood as introducing artifacts or unconfirmed polarity labeling, because the morphological features of terminals are directly related to the nearby nodes by definition. The introduced nodal polarity is just for the convenience of feature extraction and, in machine-learning language, for data augmentation, and it is not used in the evaluation of NPIN performance. In other words, the precision of NPIN is still calculated based on terminal polarity rather than nodal polarity, and shows (see below) a significant enhancement of prediction accuracy compared to results using terminal information only. Finally, we note that a dividing node marks the position separating axon and dendrite clusters, which should be important in nerve cell development. Since the number of dividing nodes is much smaller (one, or at most two, in each neuron) than the number of axon or dendrite nodes, we do not include them in the training and testing processes. Figures 3(a3) and (b3) show some representative level trees, where all nodes are properly labeled.

Feature Extraction for Nodal Polarity

In principle, the level tree representation defined above contains all the information of a 3D neuron and can be used for the identification of neuronal polarity. We test more than a dozen features, including (1) path length to the parent node, (2) normalized path length to the parent node, (3) path length to the soma, (4) normalized path length to the soma, (5) direct distance to the soma, (6) normalized direct distance to the soma, (7) Strahler number, (8) angle between the branches to the child nodes, (9) ratio of the path lengths to the child nodes, (10) layer number in the cluster, (11) number of terminals in the cluster, (12) eigenvalues of the moment of inertia of the cluster, (13) curvature (varicosities) of the cluster, (14) aspect ratio of the cluster, and (15) volume of the cluster, among others. We do not include arbor thickness because not all neurons have such information in their optical images.

After systematic studies and comparison of the prediction results, we eventually identified the nine most relevant features, which can be classified into two groups: Soma Features (SF) and Local Features (LF). Soma Features contain spatial information from a given node to the soma, including the path length along the neuronal branches and the direct distance in 3D space. Local Features contain information on the local morphology of a given node, including the curvature and aspect ratio of the cluster it belongs to. Hence, Local Features do not include any information about the soma, while Soma Features do not include any information about the local morphology. Let i be the index of a given node. The Soma Features of node i can be expressed as a four-component vector, SF_i = [l_si, nl_si, d_si, nd_si]: the path length to the soma, the normalized path length to the soma, the distance to the soma, and the normalized distance to the soma, respectively. The Local Features of node i can be expressed as a five-component vector, LF_i = [l_pi, nl_pi, c_i, ar_i, rl_i]: the path length to the parent node, the normalized path length to the parent node, the curvature of the cluster, the aspect ratio of the cluster, and the ratio of the path lengths to the child nodes, respectively. If a child node does not exist, its features are replaced by the number −1. We then train different ML models on various combinations of features to identify their roles in the identification of neuronal polarity. In Appendix B, we explain how to identify and calculate the Soma Features and Local Features from the level trees and reduced trees defined above.

Machine Learning Models

We train our model by supervised learning using the training data extracted from the dataset. We implemented several ML algorithms: random forest, gradient-boosted decision trees, XGB, support vector machines, and DNN. We find that, in general, XGB and DNN provide the best and complementary results on the features we selected; therefore, we use them in NPIN. In Appendix C, we explain the details of how these two algorithms are implemented in the present study.

In addition to the algorithms, an ML model also depends on the features used during the training process. To investigate the effects of different morphological features on the identification of nodal polarity, we develop three models using three types of features in NPIN: Model I (using both Soma Features and Local Features), Model II (using Soma Features only), and Model III (using Local Features only). As we will see later, we can gain insight into the relationship between morphological features and polarity by systematically comparing the polarity identification results between different models and different types of neurons.

Implementation of Spatial Correlation of Nodal Polarity

In the standard application of supervised learning for classification, one usually obtains the results directly from the output probabilities once the model is well-trained on the training data. The training aims to minimize the cross-entropy between the output results and the known answers by backpropagation.
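The bottom-up nodal-labeling rules (1)–(4) above are crisp enough to state directly in code. The following is a minimal sketch of those rules on a toy tree; the function names are illustrative, not the authors' implementation.

```python
# Sketch of the bottom-up nodal-polarity rules (1)-(4): terminals carry
# 'axon'/'dendrite' labels from the dataset; each parent is labeled from
# its children, with mixed children producing a 'dividing' node.

def parent_polarity(a, b):
    if a == b:
        return a                               # rules (1) and (4)
    if 'dividing' in (a, b):
        return a if b == 'dividing' else b     # rule (3)
    return 'dividing'                          # rule (2): one axon + one dendrite

def propagate(children, labels, p):
    """Label node p (and its whole subtree) from terminal labels, bottom-up."""
    kids = children.get(p, [])
    if not kids:                               # terminal: label given in the dataset
        return labels[p]
    polarities = [propagate(children, labels, k) for k in kids]
    lab = polarities[0]
    for q in polarities[1:]:
        lab = parent_polarity(lab, q)
    labels[p] = lab
    return lab

children = {0: [1], 1: [2, 3], 3: [4, 5]}
labels = {2: 'axon', 4: 'dendrite', 5: 'dendrite'}
propagate(children, labels, 0)
print(labels[3], labels[1])  # → dendrite dividing
```

Node 3 inherits 'dendrite' from its two dendritic terminals (rule 1), while node 1, sitting between the axonal and dendritic sides, becomes a dividing node (rule 2), exactly the kind of node excluded from the training data.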
However, this ML process does not always guarantee reasonable results, because some necessary conditions cannot be included in the input features of the training data. For the task of nodal polarity identification in the present work, for example, the polarity of a node is highly dependent on its neighboring nodes: nodes in the same cluster (and, therefore, close in space) are usually of the same type (dendrite or axon), but such a loosely defined necessary condition cannot be implemented in the loss function if the polarity of each node is identified individually. Therefore, we have to include such spatial correlations of polarity by adding other methods to the ML model.

In this work, spatial correlations between nodal polarities are included by modifying the polarity provided by XGB or DNN whenever the probability for axon or dendrite is below a certain threshold. More precisely, this modification process contains three steps: (1) We perform the ML process on the test data and obtain the polarity and its probability for each node. (2) Next, we accept the result of a given node if the probability is higher than a threshold; otherwise we reject the result by changing it to unidentified. (3) Finally, we relabel these rejected/unidentified nodes according to the polarity of their neighboring nodes. As a result, spatial correlations between nodal polarities are incorporated. More details of this polarity modification and its effects on the NPIN performance are described in Appendix D.

Results

Our dataset includes 213 neurons with verified polarities as the ground truth. In our training procedure (Fig. 1), we randomly select 100 neurons from the dataset for training, 25 for validation, and 50 for testing. This process is repeated for 20 rounds, so that each neuron is tested (by different models trained on other neurons) 4–5 times on average. We then average these probabilities for the nodal polarity and make the final comparison with the ground truth. With this method, the obtained results for the nodal polarity of each neuron are much more stable because fluctuations due to the dataset selection are reduced. In the training data and in the comparison with the ground truth, the dividing nodes are not included because their numbers are too few to be statistically relevant. In the testing neurons, they can be recovered using the predicted polarities of other nodes (see Appendix D).

In the following sections, we first present the distribution of nodal features, including both Soma Features and Local Features, obtained from all neurons in our dataset. This provides a deeper understanding of neuronal morphology and its relationship with the other results. Next, we show the results of polarity identification provided by Model I (with both Soma Features and Local Features) for our whole neuron dataset, followed by results using Model II (with Soma Features only). We then focus on the results obtained by using complex neurons as training data for comparison. As an example of application to other species, we apply NPIN to the blowfly. Finally, we summarize these calculation results and our findings.

Feature Distribution and Importance Ranking

Before presenting the polarity results of NPIN, we investigate the distribution of the different features (Soma Features and Local Features) for the different types of neurons (simple and complex). This provides a picture that helps to understand and explain the results of the present algorithm. In Fig. 4, we show the distribution of axon nodes and dendrite nodes (including terminals) of all neurons as a function of the normalized path length (relative to the largest path length to the soma). Results for simple neurons (a1) and complex neurons (a2) are shown together for comparison. As expected, most axons have a longer path length to the soma than most dendrites in simple neurons, but the distribution of dendrites is clearly wider than that of axons. The wider distribution pattern of dendrites in simple neurons directly implies that it is easier to correctly classify a node as a dendrite, while it is more difficult to capture all dendrite nodes with the same classifier. This explains why the precision is higher (or lower) than the recall for the dendrites (or axons) of simple neurons (Fig. 5(a1) and (b1)). On the other hand, in Fig. 4(a2), axon nodes have a wider distribution than dendrite nodes in complex neurons, explaining why the precision is lower (or higher) than the recall for the dendrites (or axons) of complex neurons (see Fig. 5(a2) and (b2)).

In addition to the path length to the soma, we also include the direct distance from a node to the soma as a feature (Appendix B and Fig. S2(b)). Moreover, the ratio of the direct distance to the path length reflects a global morphological feature of a given node: if the distance to the soma is close to the path length to the soma, the neuronal branches are relatively straight in real space; if this ratio is much smaller than one, the path is more curved, implying that the node is close to the soma in space but connected to it by a long, curved neuronal branch. In Figs. 4(b1) and (b2), we show the distribution of axon and dendrite nodes in the space of the normalized path length to the soma and the normalized direct distance to the soma. The distribution clearly indicates that most nodes are well separated in this 2D space. In fact, feature ranking by XGB also reveals these two features as the most important for the identification of nodal polarity.

Fig. 4 Feature distributions of axons and dendrites for all neurons in our dataset. (a1) and (a2) show the distribution of axon and dendrite nodes along the normalized path length to the soma, for simple and complex neurons, respectively. (b1) and (b2) display the nodal distribution in terms of the normalized path length and the normalized distance to the soma. (c1) and (c2) show the nodal distribution in terms of the normalized path length to the soma and the curvature of the associated cluster. Blue and red dots represent axon and dendrite nodes, respectively. Details of the curvature calculations are described in Appendix B.
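The three-step relabeling that injects spatial correlation can be sketched as follows. This is a hedged reading of the procedure: the 0.8 threshold, the neighbor lists, and the majority-vote tie-breaking are illustrative assumptions, not values from the paper (see Appendix D for the authors' actual scheme).

```python
# Sketch: keep a node's predicted polarity only if the classifier's
# probability clears a threshold; otherwise mark it unidentified and
# relabel it from confidently labeled neighboring nodes.

from collections import Counter

def relabel(pred, prob, neighbors, threshold=0.8):
    """pred: node -> 'axon'/'dendrite'; prob: node -> confidence;
    neighbors: node -> list of nearby nodes (e.g., same cluster)."""
    # Step (2): reject low-confidence predictions as unidentified (None)
    final = {n: (p if prob[n] >= threshold else None) for n, p in pred.items()}
    # Step (3): relabel rejected nodes by majority vote of accepted neighbors
    for n, p in final.items():
        if p is None:
            votes = Counter(final[m] for m in neighbors[n] if final[m] is not None)
            final[n] = votes.most_common(1)[0][0] if votes else pred[n]
    return final

pred = {1: 'axon', 2: 'dendrite', 3: 'axon'}
prob = {1: 0.95, 2: 0.55, 3: 0.90}
neighbors = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
print(relabel(pred, prob, neighbors))  # node 2 is relabeled 'axon' by its neighbors
```

Node 2's low-confidence 'dendrite' prediction is overruled by its two confidently axonal neighbors, which is exactly the cluster-consistency constraint that cannot be expressed in the per-node loss function.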
In fact, feature ranking by XGB also reveals these larity in the space of normalized length to the soma and the two features as the most important features for the identifica- cluster curvature near a given node. We suggest that the po- tion of nodal polarity. larity classification can be effectively enhanced by including Fig. 5 Performance of NPIN with Model I, where both Soma algorithm. (c) defines the confusion matrices shown in this figure. In Features and Local Features are used. (a1)–(a3) are the confusion the upper part of the table, each row indicates the actual polarity, and matrix and precision/recall table of the terminal polarity, based on the each column indicates the polarity predicted by NPIN. The lower part of XGB algorithm for simple, complex, and all neurons, respectively. the table displays the precision and recall of axonal and dendritic termi- (b1)–(b3) are the same as in (a1)–(a3) butcalculatedbythe DNN nals. Precision and recall are defined in the equations below (c) Neuroinform (2021) 19:669–684 677 curvature as one of the local features because visual inspection simple neurons is higher than that of complex neurons by reveals that typically more dendrites (compared to the axon 1.2% for XGB (compare Figs. 5(a1) and 5(a2)), while it be- nodes) can be found in the regime of larger curvatures. Such comes 0.8% if calculated by DNN (compare Figs. 5(b1) and effects look more significant in simple neurons than in com- 5(b2)). plex neurons. However, if we use curvature or other local However, such similar accuracy of polarity identification features alone, the performance of polarity classification can- for simple and complex neurons is surprising, because com- not be as good as using the path length to the soma. plex neurons have more than two clusters. 
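As a concrete illustration of how the two soma features discussed above can be computed from a skeleton, the sketch below accumulates the path length along parent links of a tree and compares it with the straight-line distance to the soma. The toy skeleton and function names are hypothetical, not part of the NPIN code.

```python
import math

# Toy skeleton: node -> (parent, (x, y, z)); parent None marks the soma (root).
# Hypothetical example data, not taken from the NPIN dataset.
skeleton = {
    "soma": (None, (0.0, 0.0, 0.0)),
    "n1":   ("soma", (3.0, 4.0, 0.0)),
    "n2":   ("n1",   (3.0, 4.0, 5.0)),
}

def path_length_to_soma(node):
    """Sum of segment lengths along parent links from node to the soma."""
    total = 0.0
    while skeleton[node][0] is not None:
        parent = skeleton[node][0]
        total += math.dist(skeleton[node][1], skeleton[parent][1])
        node = parent
    return total

def direct_distance_to_soma(node):
    """Straight-line (Euclidean) distance from node to the soma."""
    return math.dist(skeleton[node][1], skeleton["soma"][1])

# The distance-to-path ratio is ~1 for straight branches, <1 for curved ones.
for n in ("n1", "n2"):
    ratio = direct_distance_to_soma(n) / path_length_to_soma(n)
    print(n, round(path_length_to_soma(n), 2), round(ratio, 2))
```

In a real pipeline both quantities would also be normalized by the neuron's maximal path length before being fed to the classifier, matching the normalized features listed above.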
The importance of each feature can also be obtained from the feature-ranking calculation of XGB (this function is not available in DNN), or by comparing the overall accuracy after systematically removing individual features during training. Our results suggest the following top six features for the determination of nodal polarity: (1) unnormalized path length to the soma, (2) normalized path length to the soma, (3) unnormalized distance to the soma, (4) normalized distance to the soma, (5) curvature of the associated cluster, and (6) aspect ratio of the associated cluster. Other features are less important but still contribute to the overall performance of NPIN. These results also confirm that local features are secondary factors for the determination of nodal polarity.

Identification Results of Model I: Using both Soma Features and Local Features

To present the results of polarity identification by NPIN, we start from Model I, which uses both soma features and local features, on the whole dataset (both simple and complex neurons). Figures 5(a1)-(a3) show the confusion matrix of Model I based on XGB and the associated precision/recall table for the polarity of terminals in simple neurons, complex neurons, and all neurons, respectively. Figures 5(b1)-(b3) show the results of Model I based on DNN for comparison. Figure 5(c) presents the definition of the confusion matrix and explains how precision and recall are calculated for axons and dendrites. Because the final result is a binary classification of terminal polarity, the dividing nodes are included in neither the training data nor the test data.

From the results shown in Figs. 5(a3) and 5(b3), we find that NPIN, trained and applied on both simple and complex neurons, is a very powerful classifier with an overall accuracy of 96% when both Soma Features and Local Features are included. In general, the precision and recall for the polarity identification of dendritic terminals are better than those of axons by 3%-8%. One possible reason is that the total number of dendrite terminals is approximately three times the number of axon terminals, providing more training data that may increase the precision.

Comparing the confusion matrices for simple neurons (Figs. 5(a1) and 5(b1)) and complex neurons (Figs. 5(a2) and 5(b2)), we observe similar performance of XGB and DNN on both types: the accuracy for simple neurons is higher than that for complex neurons by 1.2% for XGB (compare Figs. 5(a1) and 5(a2)) and by 0.8% for DNN (compare Figs. 5(b1) and 5(b2)).

Such similar accuracy for simple and complex neurons is surprising, because complex neurons have more than two clusters, so the polarity of the middle clusters cannot be easily identified from their relative distance to the soma. There are also various kinds of complex neurons (see Fig. 2, for example), some of which have an axon cluster close to the soma. As a result, a naive comparison of the path length to the soma should not work well for a complex neuron. Hence, it is reasonable to believe that, in our NPIN, the contributions of soma features and local features differ between simple and complex neurons. To understand how this result is related to the feature selection for various types of neurons, in the following sections we examine the performance of NPIN with different feature selections.

Identification Results of Model II: Using Soma Features Only

To clarify the roles of Soma Features and Local Features in the identification of neuronal polarity, we additionally use Model II, which is trained on Soma Features only, using the same protocol as in the previous section. The results are shown in Fig. 6.

According to Fig. 6, when using Soma Features only, the overall accuracy drops to 95.5% (94.7%) for simple neurons and 93.1% (90.0%) for complex neurons with the XGB (DNN) algorithm. The performance on all neurons, shown in Fig. 6(a3) and (b3), lies between those of the simple and complex neurons, as expected.

Several important conclusions can be drawn. First, the overall accuracy of Model II is lower than that of Model I (compare Fig. 6(a3) with Fig. 5(a3) for XGB, and Fig. 6(b3) with Fig. 5(b3) for DNN). However, the difference is only 1.6% for XGB, while it is 3.5% for DNN. This means that the contribution of local features, present in Model I but not in Model II, is more significant for DNN than for XGB.

Second, comparing the results for simple and complex neurons shows that the influence of local features is much more significant for complex neurons. For XGB, the accuracy decreases by only 1% for simple neurons (compare Fig. 5(a1) and Fig. 6(a1)) but by 2.3% for complex neurons (compare Fig. 5(a2) and Fig. 6(a2)); for DNN, these values become 2.2% and 5.7%, respectively. This clearly implies that the Local Features included in Model I are more important for complex neurons than for simple neurons. The most obvious reason is that complex neurons have more than two clusters, so soma features alone cannot provide enough information for the identification of polarity. As an example, Fig. 6(c) shows two types of complex neurons whose middle clusters have different polarities; these middle clusters are difficult to classify with Soma Features only. We therefore conclude that local features are crucial for the polarity identification of the middle clusters in complex neurons, and that the DNN algorithm may be more sensitive to these differences than XGB.

Fig. 6 Performance of NPIN using Model II, where only Soma Features are included. (a1)-(a3) show the results for simple neurons, complex neurons, and all neurons, respectively, using the XGB algorithm. (b1)-(b3) are the same as (a1)-(a3) but for the DNN algorithm. (c) shows two similar complex neurons whose middle clusters have opposite polarities. The clusters labeled A/D are axons/dendrites.

Comparison of Models I, II, and III for Complex Neurons

To investigate how NPIN works on complex neurons and to examine its relationship with local features, we next focus on complex neurons only: no simple neurons are included in either the training data or the test data. Three models are compared: Model I (with both Soma Features and Local Features), Model II (with Soma Features only), and Model III (with Local Features only). Because the influence of Local Features is more significant in DNN than in XGB (see above), here we apply the DNN algorithm only, for simplicity.
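The precision and recall quoted throughout these comparisons follow the definitions in Fig. 5(c), which amount to simple ratios over a 2x2 confusion matrix. A minimal sketch with made-up counts (not the actual numbers behind Fig. 5):

```python
# Confusion matrix counts (hypothetical numbers, for illustration only):
# rows = actual polarity, columns = polarity predicted by the classifier.
tp_axon, axon_as_dend = 900, 100      # actual axons predicted as axon / dendrite
dend_as_axon, tp_dend = 60, 2940      # actual dendrites predicted as axon / dendrite

def precision_recall(tp, fp, fn):
    """precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

axon_p, axon_r = precision_recall(tp_axon, fp=dend_as_axon, fn=axon_as_dend)
dend_p, dend_r = precision_recall(tp_dend, fp=axon_as_dend, fn=dend_as_axon)
accuracy = (tp_axon + tp_dend) / (tp_axon + tp_dend + axon_as_dend + dend_as_axon)
print(f"axon P={axon_p:.3f} R={axon_r:.3f}  "
      f"dendrite P={dend_p:.3f} R={dend_r:.3f}  acc={accuracy:.3f}")
```

Note how the class imbalance in this toy example (about three dendrites per axon, as in the real dataset) already pushes the dendrite precision/recall above those of axons, mirroring the trend reported above.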
According to the results shown in Fig. 7(a1)-(a3), the classification accuracy is best for Model I, slightly lower for Model II, and drops significantly for Model III (which uses Local Features only). This indicates that, without any information on the relative distance to the soma, Local Features alone perform poorly in polarity identification, although they are not completely useless (71% accuracy, see Fig. 7(a3)). Indeed, we find that the inclusion of local features plays a complementary role in polarity identification, especially for the middle clusters of complex neurons. More precisely, comparing Fig. 7(a2) with Fig. 7(a1) shows that local features significantly reduce the number of incorrectly identified axons (upper right corner of the confusion table, from 150 to 66); hence, the number of correctly identified axons is increased.

In Fig. 7(b1)-(b3), (c1)-(c3), and (d1)-(d3), we show three representative complex neurons with three or more clusters of terminals. Figure 7(b1), (b2), and (b3) show the same neuron with polarities identified by Model I, Model II, and Model III, respectively; Fig. 7(c1)-(c3) and (d1)-(d3) show the same information for another two neurons. The results obtained with Local Features alone (Model III) are not satisfactory: some axon clusters with larger curvatures may be incorrectly classified as dendrites (see, for example, the two axon clusters in Fig. 7(c3)), and some dendrite clusters with divergent branches may be incorrectly classified as axons (see, for example, the dendrite cluster in Fig. 7(d3)).

Using Soma Features only (Model II), on the other hand, provides a much better result (accuracy of 95.8%), because the clusters closest to and farthest from the soma are identified as dendrites and axons, respectively. However, as seen in Fig. 7(b2), (c2), and (d2), the middle clusters (defined by their distance to the soma) of these complex neurons cannot be identified easily by Model II, because their relative distance to the soma is not as well defined as for the other clusters.

In summary, the accuracy of classifying the polarity of middle clusters in a complex neuron can be significantly enhanced by combining Soma Features and Local Features in Model I. More examples of complex neurons with correct polarity identification by Model I are shown in Fig. 7(e1)-(e4).

Fig. 7 Performance of NPIN with the DNN algorithm for complex neurons in three different models. (a1)-(a3) are the confusion matrix and precision/recall table of the terminal polarity for Model I (with both Soma Features and Local Features), Model II (with Soma Features only), and Model III (with Local Features only), respectively. (b1)-(b3) display the same complex neuron with polarity classification by Model I, Model II, and Model III, respectively. Filled gray circles indicate incorrectly classified terminals. (c1)-(c3) and (d1)-(d3) are the same as (b1)-(b3) but for two different complex neurons. (e1)-(e4) are four different complex neurons whose polarities are classified by Model I with 100% accuracy by the DNN algorithm.

Application to Other Species of Insects: Blowfly and Moth

In principle, our NPIN, trained on Drosophila brain neurons, can also be applied to the polarity identification of other species if the training data are replaced by neurons of that species. However, the number of publicly available neuronal data samples with identified polarity is much smaller for other species than for Drosophila, so such an application may not be practical. It is nevertheless instructive to see how our NPIN, trained on Drosophila neurons, performs when used directly on other insect species, which should have morphological features similar to those of Drosophila. Here, we take the neuron images of the blowfly and the moth from the Neuromorpho database (http://neuromorpho.org/) as an example. The database lists 19 blowfly neurons and 3 moth neurons with labeled polarity. These data were generated by different labs using reconstruction methods different from those of our Drosophila dataset.

(Footnote) The IDs of the 19 blowfly neurons are HSE-fluoro05, HSE-fluoro11, HSE-fluoro15, HSN-cobalt, HSN-fluoro04, HSN-fluoro06, HSN-fluoro08, HSS-cobalt, VS1-cobalt, VS2-fluoro01, VS2-fluoro03, VS2-fluoro10, VS3-cobalt, VS4-cobalt, VS4-fluoro02, VS4-fluoro07, VS4-fluoro09, VS5-cobalt, VS9-cobalt. The IDs of the 3 moth neurons are Nevron-komplett-08-02-28-2a, Nevron-komplett-08-03-13-2a, Nevron-komplett-08-08-28-1a-A.
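The Model I/II/III comparison above is, in effect, a feature-ablation experiment: the same classifier is retrained on different subsets of the feature columns. The sketch below illustrates this idea with a toy nearest-centroid classifier and synthetic features standing in for the actual XGB/DNN models; all names, numbers, and the separation assumption are hypothetical.

```python
import random

# Hypothetical feature groups mirroring the paper's feature split.
SOMA = ["path_len", "norm_path_len", "dist", "norm_dist"]
LOCAL = ["curvature", "aspect_ratio"]
MODELS = {"I": SOMA + LOCAL, "II": SOMA, "III": LOCAL}

random.seed(0)

def make_node(label):
    # Toy assumption: dendrites sit closer to the soma and curve more.
    base = 0.3 if label == "dendrite" else 0.7
    feats = {f: base + random.gauss(0, 0.05) for f in SOMA}
    feats.update({f: (1 - base) + random.gauss(0, 0.05) for f in LOCAL})
    return feats, label

train = [make_node(lab) for lab in ("axon", "dendrite") for _ in range(50)]

def fit_centroids(data, cols):
    """Per-class mean of the selected feature columns (nearest-centroid model)."""
    cents = {}
    for label in ("axon", "dendrite"):
        rows = [f for f, lab in data if lab == label]
        cents[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    return cents

def predict(cents, cols, feats):
    x = [feats[c] for c in cols]
    return min(cents, key=lambda lb: sum((a - b) ** 2 for a, b in zip(x, cents[lb])))

for name, cols in MODELS.items():
    cents = fit_centroids(train, cols)
    acc = sum(predict(cents, cols, f) == lab for f, lab in train) / len(train)
    print(f"Model {name} ({len(cols)} features): train accuracy {acc:.2f}")
```

In the paper the ablation is run with XGB and DNN rather than this toy classifier, but the experimental structure (fixed data, varying feature subsets) is the same.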
To save space, below we present the NPIN results for the blowfly in detail and mention the results for the moth data only briefly.

Figure 8 shows the results of polarity identification for the 19 blowfly neurons obtained by Model I, Model II, and Model III of NPIN, trained on the 213 Drosophila neurons in our dataset with the DNN model. We find that Model I, using both soma features and local features, still provides a decent level of accuracy (83.4%). The main error stems from the rather low precision and recall for axons, which have far fewer terminals than dendrites (dendrite:axon ratio = 22.8:1). Similar results are observed for Model II, as shown in Fig. 8(a2).

However, a surprising result is obtained with Model III, where only local features are included in the training on Drosophila neurons: the overall accuracy, as well as the precision and recall for both dendrites and axons, is very high (accuracy = 98.98%). This result is even better than that obtained by training on the blowfly data themselves (Fig. 8(b)). These results clearly indicate that, unlike in Drosophila, where Local Features are only secondary factors compared with Soma Features, Local Features are the primary factors for the identification of neuronal polarity in the blowfly neurons tested here. This can also be observed from the skeleton structure of the dendrite clusters in Fig. 8(c1)-(c4). Therefore, to apply NPIN (trained on Drosophila neurons) to neurons of other insects, it is necessary to provide not only Model I but also Model II and Model III, to maximize the range of applications. We emphasize, however, that the 19 blowfly neurons were all collected from the visual system, so we cannot exclude the possibility that the success of NPIN is due to their special morphology. More analysis of other types of blowfly neurons should be carried out in the future, when more neurons with known polarity become available.

In addition to the blowfly, we also collected 3 moth neurons with known polarities from the Neuromorpho database. Over their 194 dendrite terminals and 358 axon terminals, the overall accuracy of polarity identification by NPIN (trained on the 213 Drosophila neurons with the DNN model) is 98.2%, 99.0%, and 65.6% for Models I, II, and III, respectively. This reflects the fact that the polarity of these three neurons can be identified much more easily by Soma Features alone. This is a complementary example to the blowfly and shows the importance of including both Soma Features and Local Features for a general application of NPIN.

Summary of Results

We summarize the results of the present study in Fig. 9, which shows the accuracy of NPIN in all test conditions, including three models (Model I: all features; Model II: Soma Features only; Model III: Local Features only) and three types of test data (simple neurons, complex neurons, and all neurons). For simplicity, we display only the results obtained with the DNN algorithm.

As explained above, the overall accuracy alone cannot reflect the complete picture of model performance, especially when the numbers of dendrites and axons are highly imbalanced. For a reliable ML model, we suggest that the precision and recall for both axons and dendrites should exceed 50%; in other words, there should be more correctly identified terminals than incorrect ones. Results that do not meet these criteria are marked with stars "*" in Fig. 9.

Discussion

Comparison of NPIN and SPIN

A previously developed machine-learning-based method, SPIN (described in the Introduction), identified the polarity of insect neurons with an overall accuracy of 84%-90% (Lee et al. 2014). SPIN starts by identifying clusters of neuronal arbors in each neuron and then classifies the polarity of each cluster according to its geometric structure and distance to the soma. As a result, all terminals in a cluster are assigned the same polarity. This approach faces two challenges. First, clusters may not be easily identified in neurons with complex morphology, and incorrect clustering can lead to a large number of incorrectly classified terminals; for example, 14 of the 213 neurons used in the present study could not be processed by SPIN. Second, the number of available clusters may be insufficient for good training, because each neuron has only a few clusters. Due to these issues, SPIN often failed to classify some or even all terminals of a neuron when its arbors were not clustered correctly.

The proposed NPIN avoids these issues by adopting node-based rather than cluster-based classification. To compare the performance of SPIN and NPIN, we examined the results of polarity identification by SPIN on the same 213 neurons used here (Huang et al. 2019). Among these 213 neurons, only 79 are fully identified (i.e., without any "non-classified" terminals), 120 are partially predicted (i.e., some clusters cannot be identified), and 14 cannot be predicted at all. Among the 9452 terminals of these 213 neurons, 1207 terminals are left unclassified and 8245 are classified, of which 8038 are correctly identified. Therefore, the overall accuracy of SPIN is only 85.04% if all terminals in the dataset are considered, but 97.49% if only the classified terminals are considered.

We emphasize that, in the present study, we developed a completely different approach by identifying the polarity of each node, which is unambiguously defined in the skeleton structure of each neuron and whose polarity is well defined through the polarity of terminals (see Fig. 3). Such node-based feature extraction takes advantage of the fact that the number of nodes is much larger than the number of clusters in each neuron, and achieves a much higher accuracy (>96%) on the whole dataset (213 neurons and 9452 terminals) after including the spatial correlation. We therefore conclude that NPIN outperforms SPIN in polarity identification, an important step toward the reconstruction of the connectome. We expect to analyze the information flow of the brain on much finer scales in the near future, revealing more detailed functional relationships between subregions of the Drosophila brain.
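The two SPIN accuracy figures quoted above differ only in their denominators (all terminals versus classified terminals only), as this quick check shows:

```python
# Terminal counts for SPIN on the 213-neuron dataset (numbers from the text).
total_terminals = 9452
unclassified = 1207
classified = total_terminals - unclassified   # 8245
correct = 8038

acc_all = correct / total_terminals      # unclassified terminals count as errors
acc_classified = correct / classified    # unclassified terminals are ignored
print(f"over all terminals:        {acc_all:.2%}")         # 85.04%
print(f"over classified terminals: {acc_classified:.2%}")  # 97.49%
```

The gap between the two figures is exactly the cost of SPIN's non-classified terminals, which NPIN's node-based approach avoids by emitting a prediction for every node.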
Fig. 8 Performance of NPIN on blowfly brain neurons. (a1)-(a3) are the confusion matrices and precision/recall tables for Model I, Model II, and Model III, respectively; the models are trained on the 213 fruit-fly neurons in our dataset. (b) is the result for Model I trained directly on blowfly neurons. (c1)-(c4) display four example skeleton structures of the blowfly neurons used in this test.

Fig. 9 Summary of NPIN accuracies in all test conditions using the DNN algorithm. (a) shows the results for Model I (with both Soma Features and Local Features), Model II (with Soma Features only), and Model III (with Local Features only), for three types of test data: simple neurons, complex neurons, and all neurons, respectively. (b) shows the results for the same models on the blowfly neurons (trained on our Drosophila dataset). Results with precision or recall below 50% are indicated by "*" (see the text).

Neurons Not in the Dataset

As described in the flowchart of NPIN in Fig. 1, the neuronal polarity predicted by NPIN for the 213 Drosophila neurons is obtained by randomly selecting 100 neurons for training, 25 for validation, and 50 for testing. In other words, each neuron shown in Tables S1 and S2 of Appendix E is tested by models trained on other neurons in the dataset, so there is no overlap between the test data and the training data in any of the results presented above.

To show how well NPIN applies to neurons outside this dataset, we further collected 22 neurons whose connection types (mostly from AOTU to BU, and from MED to VMP) are distinct from those in the NPIN dataset. The polarities of these neurons were determined by our experimental collaborators and have not been published before (nor are they in the original dataset). Twelve of them have dendrites located in the AOTU, nine in the MED, and one complex neuron in the MB; see Table S3 of Appendix E. The predictions of NPIN (trained on all 213 neurons together) show that NPIN still provides very high accuracy. More precisely, of the 12 neurons in the AOTU, 11 are predicted with 100% accuracy and one with 75% accuracy. Of the 9 neurons with dendrites in the MED, 7 are predicted with 100% accuracy and the other two with 94.3% and 80.6% accuracy. The complex neuron with dendrites in the MB (also not in the NPIN dataset) is predicted with 100% accuracy. These results show that our NPIN should be applicable to neurons in other brain regions. Although, like all machine learning algorithms, NPIN is trained on labeled data with features (neuronal morphology) similar to those of the unlabeled data, we found that NPIN is still able to successfully classify the polarity of neurons that are morphologically distinct from the training neurons.

Nevertheless, we have to acknowledge that the number of neurons in our dataset is far smaller than the total number of neurons (approximately 135 K) in the Drosophila brain. There must be other types of neurons with polarity-specific morphological features very different from those addressed in this study. For example, the dendrites and axons of some local neurons are co-localized in the same cluster of arbors, and some projection neurons develop axonal clusters that are closer to the soma than the dendritic ones. We will include more morphologically distinct neurons in the training set once their experimentally verified polarities become available. Therefore, although more training data are necessary before our NPIN can be applied to polarity identification across the whole Drosophila brain, the present work demonstrates that high-precision identification is possible through node-based feature extraction. We believe that future versions of NPIN, trained on more types of neurons, will provide a much wider range of applications.

Neurons with Low Accuracy

To examine the performance of our NPIN, we investigated the neurons whose polarity was not identified well. As described in the Results section, we obtained this information by randomly selecting 175 neurons (100 for training, 25 for validation, and 50 for testing) out of the 213 neurons in the dataset for each training/test round and repeating the process for 20 rounds. As a result, each neuron is tested (by different models trained on other neurons) 4-5 times on average, and its polarity identification results are obtained by averaging the probabilities before relabeling. The final results calculated with the DNN model are shown in Appendix E. Of these 213 neurons, the terminal polarity of 166 neurons is identified with 100% accuracy; only 14 simple neurons and 33 complex neurons are not fully identified. Concentrating on the neurons with lower accuracy (below 85%), we are left with only 5 simple neurons and 24 complex neurons.

Examining the skeleton structures of these lower-accuracy neurons reveals the following features. The simple neurons have axon clusters and dendrite clusters at very similar distances to the soma, and the number of dendrite terminals is much larger than the number of axon terminals. The former makes it difficult to distinguish axons from dendrites, while the latter can confuse NPIN into mispredicting all terminals as dendrites (as a result, the precision and recall of axons are both small). For complex neurons, the incorrectly identified terminals usually appear in the middle clusters, as one may expect. Nevertheless, most complex neurons are correctly predicted by NPIN with very high accuracy (91 of the 124 complex neurons are identified with 100% accuracy). In our node-based feature extraction, it is challenging to correctly identify clusters with few terminals or nodes, because their local features are less representative of the local morphology. Therefore, finding a better way to define local features (less dependent on the number of terminals in the same cluster) could further enhance polarity identification in future work.

Comparison with Results of Electron-Microscopy Images

Finally, a large set of electron-microscopy images (the EM dataset) of the Drosophila brain has recently been released (C. S. Xu et al. 2020). This dataset includes identified polarities and hence could potentially be used as training data for NPIN or be compared with the NPIN predictions on the fluorescence images. However, after careful examination of that dataset, we discovered two major differences in morphological characteristics between the two datasets: (1) the neuronal skeletons in the EM dataset exhibit much more detail, e.g., a larger number of short terminal branches than found in the fluorescence images of the present study; and (2) some neurons in the EM dataset have incomplete tracing or discontinuous branches. These issues prevent us from using the EM dataset directly. For future work, we suggest that heavy preprocessing, including reconstruction of the connectome and an algorithm for matching terminals of the same neuron across the two image types, is required before NPIN can utilize the EM dataset.

Moreover, we emphasize that the current hemibrain EM database comes from one fly only, whereas neural images in light-microscopy-based databases are typically accumulated from multiple individuals. Although this database serves as crucial reference data for the fly community, it is unlikely that full-brain or hemibrain EM data from many more flies, or from flies with different genetic manipulations, will become available in the near future. By contrast, data from optical images are continuously generated by a large number of labs around the world. We therefore believe that our NPIN will have its impact and be widely used by many fly labs in the future.
Conclusion

In this study, we developed NPIN, a new ML model that identifies the polarity of projection neurons in the Drosophila brain with high precision (>96%). This result was achieved through three major contributions: node-based feature extraction, the separation of Local Features from Soma Features, and the implementation of spatial correlations between nodal polarities. In our experiments, we systematically compared the results of different models on various types of neurons. We demonstrated that, apart from Soma Features, Local Features are secondary factors in determining neuronal polarity; they can significantly improve polarity identification, especially for the middle clusters of complex neurons, which cannot be well identified using Soma Features only. Beyond Drosophila neurons, we showed that NPIN can also be applied to identify the neuronal polarity of other insects, such as the blowfly. We therefore believe that the development of NPIN and its applications is an important step toward the determination of signal flows in complex neural networks.

Information Sharing Statement

The NPIN software package contains data of sample neurons with skeletal data available from the FlyCircuit database (http://www.flycircuit.tw). We also provide two online versions of NPIN to be used or tested by other research groups: a website (https://npin-for-drosophila.herokuapp.com) and the Gitlab code (https://gitlab.com/czsu32/npin).

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s12021-021-09513-y.

Acknowledgments This work is supported by the Ministry of Science and Technology grant (MOST 107-2112-M-007-019-MY3) and by the Higher Education Sprout Project funded by the Ministry of Science and Technology and the Ministry of Education in Taiwan. We thank the National Center for Theoretical Sciences and the Brain Research Center of National Tsing Hua University for fully supporting this interdisciplinary collaboration, and the National Center for High-Performance Computing for providing the FlyCircuit database. We appreciate Prof. Che-Rung Lee for providing computational resources, Yi-Ning Juan for helping with the website, and Dr. Yu-Chi Huang and Prof. Ann-Shyn Chiang for helpful discussions about the neuronal datasets.

Funding This work is supported by the Ministry of Science and Technology grant (MOST 107-2112-M-007-019-MY3) and by the Higher Education Sprout Project funded by the Ministry of Science and Technology and the Ministry of Education in Taiwan.

Data Availability The FlyCircuit database (http://www.flycircuit.tw/) is provided by the National Center for High-Performance Computing.

Code Availability We provide an online version and the source code of NPIN (XGB version) at the following websites:
Test Page (https://npin-for-drosophila.herokuapp.com/).
Gitlab code (https://gitlab.com/czsu32/npin).

Declarations

Conflict of Interests The authors declare that they have no conflict of interests.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Asri, H., Mousannif, H., Moatassime, H. A., & Noel, T. (2016). Using machine learning algorithms for breast cancer risk prediction and diagnosis. Procedia Computer Science, 83, 1064-1069. https://doi.org/10.1016/j.procs.2016.04.224

Chiang, A.-S., Lin, C.-Y., Chuang, C.-C., Chang, H.-M., Hsieh, C.-H., Yeh, C.-W., et al. (2011). Three-dimensional reconstruction of brain-wide wiring networks in Drosophila at single-cell resolution. Current Biology, 21(1), 1-11. https://doi.org/10.1016/j.cub.2010.11.056

Craig, A. M., & Banker, G. (1994). Neuronal polarity. Annual Review of Neuroscience, 17(1), 267-310. https://doi.org/10.1146/annurev.ne.17.030194.001411

Cuntz, H., Forstner, F., Haag, J., & Borst, A. (2008). The morphological identity of insect dendrites. PLoS Computational Biology, 4(12), e1000251. https://doi.org/10.1371/journal.pcbi.1000251

Fischbach, K.-F., & Dittrich, A. P. M. (1989). The optic lobe of Drosophila melanogaster. I. A Golgi analysis of wild-type structure. Cell and Tissue Research, 258(3), 441-475. https://doi.org/10.1007/BF00218858

Hanesch, U., Fischbach, K.-F., & Heisenberg, M. (1989). Neuronal architecture of the central complex in Drosophila melanogaster. Cell and Tissue Research, 257(2), 343-366. https://doi.org/10.1007/BF00261838

Huang, Y.-C., Wang, C.-T., Su, T.-S., Kao, K.-W., Lin, Y.-J., Chuang, C.-C., et al. (2019). A single-cell level and connectome-derived computational model of the Drosophila brain. Frontiers in Neuroinformatics, 12. https://doi.org/10.3389/fninf.2018.00099

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097-1105). Curran Associates, Inc.

Kuan, L., Li, Y., Lau, C., Feng, D., Bernard, A., Sunkin, S. M., et al. (2015). Neuroinformatics of the Allen Mouse Brain Connectivity Atlas. Methods, 73, 4-17. https://doi.org/10.1016/j.ymeth.2014.12.013

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444. https://doi.org/10.1038/nature14539

LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324. https://doi.org/10.1109/5.726791

Lee, Y.-H., Lin, Y.-N., Chuang, C.-C., & Lo, C.-C. (2014). SPIN: A method of skeleton-based polarity identification for neurons. Neuroinformatics, 12(3), 487-507. https://doi.org/10.1007/s12021-014-9225-6

Lin, C.-Y., Chuang, C.-C., Hua, T.-E., Chen, C.-C., Dickson, B. J., Greenspan, R. J., & Chiang, A.-S. (2013). A comprehensive wiring diagram of the protocerebral bridge for visual information processing in the Drosophila brain. Cell Reports, 3(5), 1739-1753. https://doi.org/10.1016/j.celrep.2013.04.022

Lo, C.-C., & Chiang, A.-S. (2016). Toward whole-body connectomics. Journal of Neuroscience, 36(45), 11375-11383. https://doi.org/10.1523/JNEUROSCI.2930-16.2016

Malta, T. M., Sokolov, A., Gentles, A. J., Burzykowski, T., Poisson, L., Weinstein, J. N., et al. (2018). Machine learning identifies stemness features associated with oncogenic dedifferentiation. Cell, 173(2), 338-354.e15. https://doi.org/10.1016/j.cell.2018.03.

Matus, A., Bernhardt, R., & Hugh-Jones, T. (1981). High molecular weight microtubule-associated proteins are preferentially associated with dendritic microtubules in brain. Proceedings of the National Academy of Sciences of the United States of America, 78(5), 3010-.

Milyaev, N., Osumi-Sutherland, D., Reeve, S., Burton, N., Baldock, R. A., & Armstrong, J. D. (2012). The Virtual Fly Brain browser and query interface. Bioinformatics, 28(3), 411-415. https://doi.org/10.1093/bioinformatics/btr677

Mohsen, H., El-Dahshan, E.-S. A., El-Horbaty, E.-S. M., & Salem, A.-B. M. (2018). Classification using deep learning neural networks for brain tumors. Future Computing and Informatics Journal, 3(1), 68-71. https://doi.org/10.1016/j.fcij.2017.12.001

Parekh, R., & Ascoli, G. A. (2013). Neuronal morphology goes digital: A research hub for cellular and system neuroscience. Neuron, 77(6), 1017-1038. https://doi.org/10.1016/j.neuron.2013.03.008

Peng, H., Hawrylycz, M., Roskams, J., Hill, S., Spruston, N., Meijering, E., & Ascoli, G. A. (2015). BigNeuron: Large-scale 3D neuron reconstruction from optical microscopy images. Neuron, 87(2), 252-256. https://doi.org/10.1016/j.neuron.2015.06.036

Rolls, M. M. (2011). Neuronal polarity in Drosophila: Sorting out axons and dendrites. Developmental Neurobiology, 71(6), 419-429. https://doi.org/10.1002/dneu.20836

Shinomiya, K., Matsuda, K., Oishi, T., Otsuna, H., & Ito, K. (2011). Flybrain neuron database: A comprehensive database system of the Drosophila brain neurons. The Journal of Comparative Neurology, 519(5), 807-833. https://doi.org/10.1002/cne.22540

Squire, L. R., Berg, D., Bloom, F., du Lac, S., & Ghosh, A. (Eds.). (2008). Fundamental Neuroscience (3rd ed.). Academic Press.

Wang, J., Ma, X., Yang, J. S., Zheng, X., Zugates, C. T., Lee, C.-H. J., & Lee, T. (2004). Transmembrane/juxtamembrane domain-dependent Dscam distribution and function during mushroom body neuronal morphogenesis. Neuron, 43(5), 663-672. https://doi.org/10.1016/j.neuron.2004.06.033

Wu, M., Nern, A., Williamson, W. R., Morimoto, M. M., Reiser, M. B., Card, G. M., & Rubin, G. M. (2016). Visual projection neurons in the Drosophila lobula link feature detection to distinct behavioral programs. eLife, 5, e21022. https://doi.org/10.7554/eLife.21022

Xu, C. S., Januszewski, M., Lu, Z., Takemura, S., Hayworth, K. J., Huang, G., et al. (2020). A connectome of the adult Drosophila central brain. bioRxiv, 2020.01.21.911859. https://doi.org/10.1101/2020.01.21.911859

Xu, M., Jarrell, T. A., Wang, Y., Cook, S. J., Hall, D. H., & Emmons, S. W. (2013). Computer assisted assembly of connectomes from electron micrographs: Application to Caenorhabditis elegans. PLoS ONE, 8(1), e54050. https://doi.org/10.1371/journal.pone.0054050

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Identification of Neuronal Polarity by Node-Based Machine Learning

 


Publisher: Springer Journals
Copyright: © The Author(s) 2021
ISSN: 1539-2791
eISSN: 1559-0089
DOI: 10.1007/s12021-021-09513-y

Abstract

Identifying the direction of signal flows in neural networks is important for understanding the intricate information dynamics of a living brain. Using a dataset of 213 projection neurons distributed in more than 15 neuropils of a Drosophila brain, we develop a powerful machine learning algorithm: node-based polarity identifier of neurons (NPIN). The proposed model is trained only by information specific to nodes, the branch points on the skeleton, and includes both Soma Features (which contain spatial information from a given node to a soma) and Local Features (which contain morphological information of a given node). After including the spatial correlations between nodal polarities, our NPIN provided extremely high accuracy (>96.0%) for the classification of neuronal polarity, even for complex neurons with more than two dendrite/axon clusters. Finally, we further apply NPIN to classify the neuronal polarity of neurons in other species (Blowfly and Moth), which have much less neuronal data available. Our results demonstrate the potential of NPIN as a powerful tool to identify the neuronal polarity of insects and to map out the signal flows in the brain's neural networks if more training data become available in the future.

Keywords: Neuronal polarity; Machine learning; Drosophila; Connectome; Axon; Dendrite

Introduction

Rapid technology advances in recent years have led to the development of several connectomic projects and large-scale databases for cellular-level neural images (Chiang et al. 2011; Kuan et al. 2015; Milyaev et al. 2012; Parekh and Ascoli 2013; Peng et al. 2015; Shinomiya et al. 2011; Xu et al. 2013; Xu et al. 2020). However, how to integrate and transform the data to address scientific questions (Lo and Chiang 2016) remains a central challenge. Overall, these projects aim to provide sufficient information for the analysis of information flows in the brain.
This goal is difficult to achieve at the current stage, as many neural images do not provide information on polarity (axons and dendrites). The axon-dendrite polarity of a neuron can be identified by experimental methods (Craig and Banker 1994; Matus et al. 1981; Wang et al. 2004). However, these methods are not practical for large-scale neural image projects and for the image datasets that were already acquired. Morphology-based polarity identification at the post-imaging stage is possible, but this is particularly challenging for insects because of their highly diverse neuronal morphology (Cuntz et al. 2008; Lee et al. 2014).

To address this issue, the method of skeleton-based polarity identification of neurons (SPIN) was developed using several classic machine-learning (ML) algorithms (Lee et al. 2014). Although SPIN reaches a decent performance in neuronal polarity identification for the fruit fly, Drosophila melanogaster, with 84%–90% accuracy, the method suffers from the cluster-sorting problem. Most projection neurons (i.e., neurons that innervate more than one neuropil) possess two or more clusters of neural processes. Each cluster can be either axon or dendrite, but not both. Using this observation, the SPIN method first identifies the clusters of processes in a neuron and then identifies the polarity of each cluster. This strategy is highly efficient, but incorrect sorting of clusters can lead to incorrect polarity classification of a large number of terminal points at once. This is a major source of errors in the SPIN method.

In the past decade, modern ML algorithms have been applied in many research fields and in daily life. The popularity of modern ML grows because of rapid developments in computational algorithms, high-speed processors, and big data available from various resources (LeCun et al. 1998; Krizhevsky et al. 2012; LeCun et al. 2015). Some widely successful algorithms, for example deep neural networks (DNN) and extreme gradient boosting (XGB), may recognize hidden patterns more efficiently than human knowledge and experience, after proper training on big data. Therefore, ML opens a new era in which precise classification and/or prediction becomes possible even without full knowledge of the given data. As a result, many applications of ML have recently appeared in biological and medical research (Asri et al. 2016; Malta et al. 2018; Mohsen et al. 2018). It is reasonable to expect that one may apply modern ML to the identification of neuronal polarity solely using optical images of the fruit fly's brain. For neurons of this insect, several tens of thousands of high-resolution optical images are already available, which is the largest dataset among all species.

In the present work, we develop a new classifier: the node-based polarity identifier of neurons (NPIN). The proposed model achieves much higher accuracy (>96%) than SPIN or the human eye for the identification of neuronal polarity in the Drosophila brain. NPIN is developed using a node-based feature extraction method. Specifically, it includes both Soma Features (spatial information between a soma and a given node) and Local Features (morphological information around a given node). Two state-of-the-art supervised learning algorithms, XGB and DNN, are used as two complementary classifiers, making the method applicable to complex neurons (which have more than two axon/dendrite clusters) with a competition between Soma Features and Local Features. We find that NPIN provides extremely good results for the classification of neuronal polarity, identifying important local features compared with the known soma features. We further apply NPIN to classify the neuronal polarity of other species of insects (in this case, Blowfly and Moth), which may have insufficient data for standard ML. These achievements are all important steps toward the understanding of signal flow dynamics in neural networks, and should speed up connectomic projects for the whole brain when more training data become available.

Chen-Zhi Su and Kuan-Ting Chou contributed equally to this work.

* Chung-Chuan Lo (cclo@mx.nthu.edu.tw)
* Daw-Wei Wang (dwwang@phys.nthu.edu.tw)

1 Brain Research Center, National Tsing Hua University, Hsinchu 30013, Taiwan
2 Physics Division, National Center for Theoretical Sciences, Hsinchu 30013, Taiwan
3 Department of Physics, National Tsing Hua University, Hsinchu 30013, Taiwan
4 Institute of Systems Neuroscience, National Tsing Hua University, Hsinchu 30013, Taiwan
5 Center for Quantum Technology, National Tsing Hua University, Hsinchu 30013, Taiwan

Method

Overview

The axon-dendrite polarity of a neuron is correlated with certain aspects of its morphology, such as the distance (or path length) from a terminal to the soma, the number of nodes involved in a domain/cluster, and the thickness of neurites (Craig and Banker 1994; Hanesch et al. 1989; Rolls 2011; Squire et al. 2008). However, so far, very few theoretical frameworks have systematically investigated the relationship between these features and neuronal polarity. These empirical conditions are loosely defined, with many exceptions for different types of neurons. Therefore, it is difficult to identify neuronal polarity by traditional rule-based computational programs. SPIN (Lee et al. 2014), which was developed using classical ML algorithms, can be improved in many aspects.

In order to significantly improve on the previous methods, here we develop a new polarity identifier based on morphological features, which are extracted from neuronal nodes and handled by modern ML algorithms. Different from clusters, which are usually ill-defined from a computational point of view, nodes are always well-defined by the bifurcations in a neuronal skeleton. The whole process of polarity identification is therefore composed of the following four major steps in our NPIN model. It is instructive to briefly describe them (see Fig. 1) before the further explanation in the rest of this paper:

Step I (Data Preparation and Reorganization): We invent a diagrammatic method to map the 3D skeleton structure of a given neuron onto 2D tree diagrams, called level trees and reduced trees. This effective representation makes it easy to extract representative features for ML.

Step II (Node-Based Feature Extraction): We determine the nodal polarity using the features of each node. Specifically, we identify and extract both Soma Features and Local Features for each node.

Step III (ML Models): In NPIN, we apply two powerful ML algorithms, XGB and DNN, together. They provide two different but complementary approaches for the classification of axons and dendrites.

Step IV (Implementation of Spatial Correlation): The spatial correlation of the nodal polarity in the nearby region is implemented by relabeling the nodal polarity suggested by the ML models. This approach can significantly enhance the accuracy of the final output.

Typical ML methods concentrate on the algorithms in Step III. Instead, we put more emphasis on the other three steps, in a way specifically useful for the determination of neuronal polarity. Figure 1 shows the flowchart of the whole calculation. We will explain these strategies in the rest of this section.
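The four steps can be chained end-to-end. The sketch below runs them on a toy skeleton; the function names, the one-component toy feature, and the threshold-based stand-in classifier are our own illustration and are not taken from the released NPIN code:

```python
# Minimal end-to-end sketch of the four NPIN steps on a toy skeleton.

def step1_level_tree(parents):
    """Step I: place each node at level = generation distance from the soma (node 0)."""
    levels = {0: 0}
    for node in sorted(parents):          # `parents` maps child -> parent
        levels[node] = levels[parents[node]] + 1
    return levels

def step2_features(levels):
    """Step II: per-node features (here just the level, standing in for SF/LF)."""
    return {n: [lv] for n, lv in levels.items() if n != 0}

def step3_classify(features):
    """Step III: stand-in for the trained XGB/DNN classifiers; returns (label, probability)."""
    return {n: ("axon", 0.9) if f[0] >= 2 else ("dendrite", 0.55)
            for n, f in features.items()}

def step4_relabel(preds, parents, threshold=0.6):
    """Step IV: a low-confidence node inherits the label of its classified parent."""
    out = {}
    for n, (label, p) in preds.items():
        if p < threshold and parents.get(n) in preds:
            label = preds[parents[n]][0]
        out[n] = label
    return out

# Toy skeleton: soma 0 -> nodes 1, 2; node 2 -> terminals 3, 4.
parents = {1: 0, 2: 0, 3: 2, 4: 2}
levels = step1_level_tree(parents)
labels = step4_relabel(step3_classify(step2_features(levels)), parents)
print(labels)   # {1: 'dendrite', 2: 'dendrite', 3: 'axon', 4: 'axon'}
```

In the real model, Step II produces the nine-component Soma/Local feature vectors and Step III is a trained XGB/DNN pair; only the data flow is meant to carry over here.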
Fig. 1 Flowchart of the NPIN model. NPIN includes four major steps, as described in the text. The dataset contains 213 neurons with labeled polarity as the ground truth. We randomly choose 100/25/50 neurons from the dataset for the training/validation/test sets. Every neuron in the training/validation sets is mapped to a level tree and a reduced tree. We then extract Soma Features and Local Features from these neuronal data for training. Preliminary results are obtained by the XGB and DNN algorithms after validation. We then relabel the classification by including spatial correlations of nodal polarities before comparing them with the test data with known polarities. The whole process is repeated 20 times to cover all 213 neurons in the original dataset. As a result, each neuron could be selected as a test sample and classified by a model trained on other neurons.

Dataset

Our main dataset represents 213 neurons with experimental ground truth from the Drosophila brain, which are available from the FlyCircuit database (http://www.flycircuit.tw/) (Chiang et al. 2011). These 213 neurons are all projection neurons, selected from various regions across the brain to represent the diversity of neuronal morphology as much as possible (Fig. 2(a)). Local neurons with axon/dendrite coexistence in the same branch/cluster are not included in our research. These projection neurons innervate 15 neuropils: AL, AOTU, CAL, CCP, DMP, EB, FB, IDFP, LH, LOB, MED, NO, PB, VLP, and VMP. Among these 213 neurons, 107 neurons were included in the dataset used in the development of the previous model, SPIN, and we have 106 additional neurons for the present work. As we will show later, due to the improvement of feature extraction and the ML algorithm, our model, NPIN, substantially outperforms SPIN, not only in the overall precision and recall but also in its applicability to more brain regions as well as more types of complex structures. In Appendix E, we list these 213 neurons with information including the brain regions innervated by the dendrites and axons of each neuron, the numbers of axon/dendrite terminals, and the precision/recall obtained by our model.

We divide the neurons in our dataset into two types: (i) simple neurons, which have two clusters of terminals (one dendrite and one axon); and (ii) complex neurons, which have more than two clusters of terminals. In Figs. 2(b1)–(b3) and (c1)–(c4), we show some typical skeleton structures of these two types of neurons. In our dataset, we have 89 simple neurons and 124 complex neurons with previously reported polarity. Among complex neurons, most have three clusters (two dendrites and one axon, or one dendrite and two axons); only a few neurons have more than three clusters. The reason to classify these neurons is to investigate how the distance to the soma and the number of clusters can influence the identification of neuronal polarity. Moreover, we can examine how well NPIN performs even when the polarity is difficult to identify by the human eye, in the case of three or more terminal clusters. This is one of the most important criteria for a polarity identifier to be practically applicable to the determination of signal flow in neuronal networks of the insect brain. There are, of course, some other types of projection or local neurons, which may not be easily classified by the number of clusters or by their polarity distribution. We do not include them in the dataset of this study because of a lack of data with confirmed polarity to be used for training. Our approach developed here, however, may still be applicable to these neurons when more data are available in the future.

Fig. 2 Drosophila melanogaster (fruit fly) neurons used in the present study. (a) All 213 neurons in our dataset, shown in their actual locations in the standard fly brain. (b1)–(b3) Skeleton structures for several simple neurons. (c1)–(c4) Skeleton structures for several complex neurons. Black dots represent somas. Black lines are the main trunks of neurons. Blue or red lines indicate the axonal or dendritic clusters, respectively. Each neuron is labeled by its ID in the FlyCircuit database.

Standardized Representation: Level Trees and Reduced Trees

To improve the accuracy of our ML model, we first need to define how to "standardize" the morphological information of these neurons, which are so different from each other in their original 3D structures. Figures 3(a1) and (b1) show two examples of a simple neuron and a complex neuron. First, we start with the 3D skeleton structures (see Figs. 3(a2) and (b2)) extracted from the raw images, where the width information of the trunks or branches is ignored temporarily in order to make our model more generally applicable. In our work, we further map the 3D skeleton structure onto a level tree (see Figs. 3(a3) and (b3)), which keeps all information on the position of each node (including the soma, terminals, and cross points between branches) and the path length between them, but ignores trunk and branch information such as width or shape. To express this information in a 2D diagram, we introduce the level structure according to the generation of nodes: the soma is placed in the top level (level 0), the next two nodes are placed in the lower level (level 1), and so on for their offspring, until all the ending nodes (terminals) are properly placed. We take the convention that branches with more successive non-empty levels are placed on the left-hand side and branches with fewer successive non-empty levels on the right-hand side (Figs. 3(a3) and (b3)). We believe that most morphological features of the neuronal cluster are still extractable from such a standardized representation because the spatial positions of all nodes (including the soma and terminals) are still available. The only missing information in the level tree (compared with the 3D skeleton image of neurons) is the shapes and widths of the neuronal branches that connect neighboring nodes. As we will see below, this missing information seems not crucial for the determination of neuronal polarity in NPIN.

In addition to the level tree representation of a neuronal structure, in this work we further define a reduced tree for each neuron. The reduced tree aims to retain the major branches of the skeleton structure to identify an axon or dendrite cluster. This information is important for the determination of cluster curvature and aspect ratio for nodal features within each cluster (explained below). The reduced tree of a neuron can be obtained by repeatedly removing the ending nodes with branches shorter than a characteristic length determined by the branch distribution, until the process stops automatically or only five levels are left (see Figs. 3(a4) and (b4)). The basic assumption behind this procedure is that the major branch of a neuron skeleton structure is contained in the "inner" (closer to the soma) and "longer" branches. Shorter and outsider branches are minor or unimportant for determining the clusters. See Appendix A for the detailed procedure of producing the reduced tree from a level tree.

Fig. 3 Encoding 3D optical images of neurons into level trees and reduced trees. First, the volume image of a neuron (a1) is converted into the skeleton (a2), and then a level tree (a3), which is a 2D plot with a standardized method to label most features of the original neuron. Red, blue, and yellow dots represent dendrites, axons, and dividing nodes (including terminals), respectively. (a4) represents the reduced tree of the same neuron cell. (b1)–(b4) show the same reduction for a complex neuron. Because a complex neuron has more than two clusters, there can be more than one dividing node that separates axon clusters from dendrites. In (c), we graphically show the rules to define the nodal polarity based on the polarity of terminals in the level tree (see the text). Upward arrows indicate that the nodal polarity in the upper level is defined by the nodal polarities of the two nodes/terminals in the lower level.
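The pruning loop for the reduced tree can be sketched directly from this description. This is a toy sketch: `tree` maps each non-soma node to its parent and incoming branch length, and `char_len` is passed as a plain parameter, whereas NPIN derives the characteristic length from the branch distribution (Appendix A):

```python
# Sketch of the reduced-tree pruning: repeatedly strip terminal branches
# shorter than a characteristic length, stopping when none are left or when
# only five levels remain. The soma (node 0) is implicit.

def depth(tree):
    """Number of levels, counting the soma as level 0."""
    def level(n):
        return 0 if n not in tree else 1 + level(tree[n][0])
    return 1 + max((level(n) for n in tree), default=0)

def reduce_tree(tree, char_len):
    tree = dict(tree)
    while depth(tree) > 5:
        parents = {p for p, _ in tree.values()}
        short_ends = [n for n, (p, length) in tree.items()
                      if n not in parents and length < char_len]
        if not short_ends:
            break          # stops automatically: no short ending branch left
        for n in short_ends:
            del tree[n]
    return tree

# A six-level main trunk (0-1-2-3-4-5) plus one short side twig (node 6) on node 2:
tree = {1: (0, 10.0), 2: (1, 10.0), 3: (2, 10.0), 4: (3, 10.0), 5: (4, 10.0), 6: (2, 1.0)}
print(sorted(reduce_tree(tree, char_len=2.0)))   # [1, 2, 3, 4, 5] -- the twig is pruned
```

The long main trunk survives even though the tree stays deeper than five levels, matching the "inner and longer branches" assumption stated above.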
Nodal Polarity

The polarities of the neurons in our dataset are all predetermined using the presynaptic (Syt::HA) or postsynaptic (Dscam17.1::GFP) markers (C.-Y. Lin et al. 2013) or using the morphological features described in previous studies (Fischbach and Dittrich 1989; Hanesch et al. 1989; Wu et al. 2016). There are 7142 terminals identified as dendrites and 2310 as axons. However, because the axon-dendrite polarity of these terminals is highly correlated to the morphological structure of their neurons, in this study we extend the definition of polarity from terminals to nodes, and we use this information to extract features in NPIN. In other words, we use a bottom-up method to assign the polarity of a node to the axon/dendrite class if its offspring branches are connected to pure axon/dendrite terminals or nodes. See below for more detail.

We emphasize that using features extracted from nodes has several important advantages over using features extracted from terminals or clusters for the training process of ML. First, the number of nodes is much larger than the number of clusters in each neuron; therefore, the polarity identification has significantly higher accuracy due to the larger training data. Second, nodes are well-defined in the skeleton structure (compared with clusters) and could include more morphological features (compared with terminals). Finally, these nodes can also be systematically labeled in the skeleton structure or in our level tree diagram, making it easy to include their correlated features in the spatial distribution. This node-based feature extraction is crucial in NPIN, making an accurate identification of neuronal polarity possible.

To extend the polarity definition from terminals, as provided in the dataset, to nodes on the skeleton of a neuron, we apply the following series of rules to define the nodal polarity according to the polarity of terminals (Fig. 3(c)): (1) If two child nodes (or terminals) are both axons (or dendrites), their parent node (the node that directly connects to them in the upper level) is also defined as an axon (or dendrite). (2) If one of the child nodes (or terminals) is an axon and the other is a dendrite, their parent node is defined as a "dividing node." (3) If one of the child nodes is an axon (or dendrite) and the other is a dividing node, their parent node is defined as an axon (or dendrite). Finally, (4) if two child nodes are both dividing nodes, their parent node is also defined as a dividing node (however, we do not have such a case in our dataset). The definition of dividing nodes is just for the convenience and consistency of nodal polarity. These dividing nodes are very few (mostly none, or at most two, in each neuron of our dataset) and are therefore not included in our training data. Without defining dividing nodes in such a way, we could not properly identify the polarity of a node connecting to both dendrite and axon nodes.

After applying these rules, we can label the polarity of all nodes of any neuron using the polarity information of their terminals. Note that this expansion of nodal polarity should not be misunderstood as introducing any artifacts or unconfirmed polarity labeling, because the morphological features of terminals should be directly related to the nearby nodes by definition. The introduced nodal polarity is just for the convenience of feature extraction and for data augmentation in machine-learning language, and will not be shown in the evaluation of NPIN performance. In other words, the precision of NPIN is still calculated based on terminal polarity rather than nodal polarity, and will show (see below) a significant enhancement of prediction accuracy compared to the results using terminal information only. Finally, we note that the dividing node is defined to mark the position that separates axon and dendrite clusters, and it should be important in nerve cell development. Since the number of dividing nodes is much smaller (one, or at most two, in each neuron) than the number of axon or dendrite nodes, we do not include them in the training and testing processes. Figures 3(a3) and (b3) show some representative level trees, where all nodes are properly labeled.
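The four rules amount to a single bottom-up pass over the level tree. In the sketch below, the data layout (`children` mapping an internal node to its two children, with terminal labels taken as given) is our own illustration:

```python
# Bottom-up nodal labeling from terminal polarities, following rules (1)-(4).

def combine(a, b):
    if a == b:
        return a                                  # rules (1) and (4)
    if "dividing" in (a, b):
        return a if b == "dividing" else b        # rule (3)
    return "dividing"                             # rule (2): one axon, one dendrite

def label_nodes(children, terminal_polarity):
    polarity = dict(terminal_polarity)
    def resolve(node):
        if node not in polarity:
            left, right = children[node]
            polarity[node] = combine(resolve(left), resolve(right))
        return polarity[node]
    for node in children:
        resolve(node)
    return polarity

# Toy level tree: node 5 joins two dendrite terminals; node 6 joins an axon
# terminal with node 5 (so 6 is a dividing node); node 7 joins node 6 with an axon.
children = {5: (1, 2), 6: (3, 5), 7: (6, 4)}
terminals = {1: "dendrite", 2: "dendrite", 3: "axon", 4: "axon"}
labels = label_nodes(children, terminals)
print(labels[5], labels[6], labels[7])   # dendrite dividing axon
```

Note how node 7 becomes an axon node under rule (3) even though a dividing node sits below it, which is exactly why the dividing-node bookkeeping is needed.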
Feature Extraction for Nodal Polarity

In principle, the level tree representation defined above contains all the information of a 3D neuron and can be used for the identification of neuronal polarity. We test more than a dozen features, including (1) path length to the parent node, (2) normalized path length to the parent node, (3) path length to the soma, (4) normalized path length to the soma, (5) direct distance to the soma, (6) normalized direct distance to the soma, (7) Strahler number, (8) angle between the branches to the children nodes, (9) ratio of the path lengths to the children nodes, (10) layer number in the cluster, (11) number of terminals in the cluster, (12) eigenvalues of the moment of inertia of the cluster, (13) curvature (varicosities) of the cluster, (14) aspect ratio of the cluster, (15) volume of the cluster, etc. We do not include arbor thickness because not all neurons have such information in their optical images.

After systematic studies and comparison of the prediction results, we eventually find the nine most relevant features, which can be classified into two groups: Soma Features (SF) and Local Features (LF). Soma Features contain spatial information from a given node to the soma, including the path length along the neuronal branches and the direct distance in 3D space. Local Features contain certain information on the local morphology of a given node, including the curvature and aspect ratio of the cluster it belongs to. Hence, Local Features do not include any information about the soma, while Soma Features do not include any information about the local morphology. Let i be the index of a given node. The Soma Features of node i can be expressed as a four-component vector, SF_i = [l_si, nl_si, d_si, nd_si], which are the path length to the soma, normalized path length to the soma, distance to the soma, and normalized distance to the soma, respectively. The Local Features of node i can be expressed as a five-component vector, LF_i = [l_pi, nl_pi, c_i, ar_i, rl_i], which are the path length to the parent node, the normalized path length to the parent node, the curvature of the cluster, the aspect ratio of the cluster, and the ratio of the path lengths to the children nodes, respectively. If a child node does not exist, its features are replaced by the number −1. We then train different ML models on various combinations of features to identify their roles in the identification of neuronal polarity. In Appendix B, we explain how to identify and calculate Soma Features and Local Features from the level trees and reduced trees defined above.

Machine Learning Models

We train our model by supervised learning using the training data extracted from the dataset. We implement several ML algorithms: random forest, gradient-boosting decision tree, XGB, support vector machines, and DNN. We find that, in general, XGB and DNN provide the best and complementary results from the features we selected; therefore, we use them in our NPIN. In Appendix C, we explain the details of how to implement these two algorithms in the present study.

In addition to the algorithms, an ML model also depends on the features used during the training process. To investigate the effects of different morphological features on the identification of nodal polarity, we develop three models by using three types of features in NPIN: Model I (using both Soma Features and Local Features), Model II (using Soma Features only), and Model III (using Local Features only). As we will see later, we can gain insight into the relationship between morphological features and polarity by systematically comparing the polarity identification results between different models and different types of neurons.
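Since the three models differ only in which of the nine feature components they receive, the selection can be sketched as follows; the key names are our shorthand for the SF/LF components defined above, and the numeric values are invented for illustration:

```python
# Assembling per-node feature vectors for Models I, II, and III.

SF_KEYS = ["l_s", "nl_s", "d_s", "nd_s"]     # (normalized) path length / distance to soma
LF_KEYS = ["l_p", "nl_p", "c", "ar", "rl"]   # parent path lengths, curvature, aspect ratio, child ratio

def feature_vector(node, model):
    """Model 'I': SF + LF; 'II': SF only; 'III': LF only.
    A feature of a non-existent child is encoded as -1, as in the text."""
    keys = {"I": SF_KEYS + LF_KEYS, "II": SF_KEYS, "III": LF_KEYS}[model]
    return [node.get(k, -1) for k in keys]

node = {"l_s": 120.0, "nl_s": 0.80, "d_s": 90.0, "nd_s": 0.75,
        "l_p": 12.0, "nl_p": 0.08, "c": 1.4, "ar": 2.2}     # "rl" absent: no child
print(feature_vector(node, "II"))   # [120.0, 0.8, 90.0, 0.75]
print(feature_vector(node, "I"))    # nine components; the missing "rl" becomes -1
```

The resulting rows would then be fed to the XGB and DNN classifiers; comparing Models I-III on the same rows isolates the contribution of the soma versus local information.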
Implementation of Spatial Correlation of Nodal Polarity

In the standard application of supervised learning for classification, one usually obtains the results directly from the output probabilities when the model is well-trained on the training data. The training aims to minimize the cross-entropy between the output results and the known answers by backpropagation. However, this ML process does not guarantee reasonable results all the time without violating some necessary conditions that could not be included in the input features of the training data. For the task of nodal polarity identification in our present work, for example, the polarity of a node is highly dependent on its neighboring nodes: nodes in the same cluster (and, therefore, close in space) are usually of the same type (dendrite or axon), but such a loosely defined necessary condition cannot be implemented in the loss function if the polarity of each node is identified individually. Therefore, we have to include such a spatial correlation of polarity by adding other methods to the ML model.

In this work, spatial correlations between nodal polarities are included by modifying the polarity provided by XGB or DNN if the probability for axon or dendrite is below a certain threshold. More precisely, such a modification process contains three steps: (1) we perform the ML process for the test data and obtain the polarity and its probability for each node; (2) next, we accept the result of a given node if the probability is higher than a threshold, and otherwise reject the result by changing it to be unidentified; (3) finally, we relabel these rejected/unidentified nodes according to the polarity of their neighboring nodes. As a result, spatial correlations between nodal polarities are taken into account. More details of this polarity modification and its effects on the NPIN performance are described in Appendix D.
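The three-step modification can be sketched as follows. The neighbour rule used here (a majority vote over adjacent nodes that kept a confident label) is our own simplification; the exact relabeling procedure is given in Appendix D of the paper:

```python
# Threshold-and-relabel sketch of the spatial-correlation step.

def spatial_relabel(preds, neighbors, threshold=0.8):
    """preds: node -> (label, probability) from XGB/DNN;
    neighbors: node -> list of spatially adjacent nodes."""
    # Step (2): accept confident labels only; the rest become unidentified (None).
    labels = {n: (lab if p >= threshold else None) for n, (lab, p) in preds.items()}
    # Step (3): relabel unidentified nodes by the majority polarity of neighbours.
    for n, lab in labels.items():
        if lab is None:
            votes = [labels[m] for m in neighbors.get(n, []) if labels.get(m)]
            if votes:
                labels[n] = max(set(votes), key=votes.count)
    return labels

# Node 2 is uncertain (p = 0.55) but sits between two confident axon nodes:
preds = {1: ("axon", 0.95), 2: ("dendrite", 0.55), 3: ("axon", 0.90)}
print(spatial_relabel(preds, {2: [1, 3]}))   # {1: 'axon', 2: 'axon', 3: 'axon'}
```

This is exactly the situation the text describes: a node inside an axon cluster whose raw classifier output disagrees with its surroundings gets corrected by its neighbours.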
Results of simple neurons for the test data and obtain the polarity and its probability for (a1) and complex neurons (a2) are shown together for com- each node. (2) Next, we accept the result of a given node if the parison. As expected, most axons have a longer path length to probability is higher than a threshold, and we reject the result soma compared with most dendrites in simple neurons, but the otherwise by changing it to be unidentified. (3) Finally, we distribution of dendrite is certainly wider than the distribution relabel these rejected/unidentified nodes according to the po- of axons. A wider distribution pattern for dendrites in simple larity of its neighboring nodes. As a result, we identify spatial neurons directly implies that it is easier to correctly classify a correlations between nodal polarities. More details of such node to be a dendrite, while it is more difficult to include all polarity modification and its effects on the NPIN performance dendrite nodes by the same classifier. Hence, this explains are described in Appendix D. why the precision is higher (or lower) than the recall for den- drites (or axons) of simple neurons (Fig. 5(a1) and (b1)). On the other hand, in Fig. 4(a2), axon nodes have a wider distri- Results bution than dendrite nodes in complex neurons, explaining why the precision is lower (or higher) than the recall for den- Our dataset includes 213 neurons with verified polarities as drites (or axons) of complex neurons (see Fig. 5(a2) and (b2)). the ground truth. In our training procedures (Fig. 1), we ran- In addition to the path length to the soma, we have also domly select 100 neurons from the dataset for training, 25 for included the direct distance from a node to a soma as a feature validation, and 50 for testing. This process is repeated for 20 (Appendix B and Fig. S2(b)). 
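The two soma-feature quantities compared here—path length along the skeleton versus direct Euclidean distance to the soma—can be computed from an SWC-style parent-pointer skeleton. The following is only an illustrative sketch, not the authors' NPIN code; the data layout (`id -> (x, y, z, parent)`, soma marked by parent `-1`) is an assumption made for this example.

```python
import math

def soma_path_and_direct_distance(nodes):
    """For each node of an SWC-style skeleton, return
    (path_length_to_soma, direct_distance_to_soma).
    `nodes` maps node id -> (x, y, z, parent_id); the soma has parent_id -1.
    Illustrative sketch only (recursive memoization; very deep trees would
    need an iterative variant)."""
    cache = {}

    def path_len(nid):
        x, y, z, parent = nodes[nid]
        if parent == -1:
            return 0.0
        if nid not in cache:
            px, py, pz, _ = nodes[parent]
            step = math.dist((x, y, z), (px, py, pz))
            cache[nid] = step + path_len(parent)
        return cache[nid]

    soma = next(i for i, (_, _, _, p) in nodes.items() if p == -1)
    sx, sy, sz, _ = nodes[soma]
    return {
        nid: (path_len(nid), math.dist((x, y, z), (sx, sy, sz)))
        for nid, (x, y, z, _) in nodes.items()
    }
```

A node whose ratio of direct distance to path length is close to one lies on a nearly straight branch, while a much smaller ratio indicates a long, curved branch—the property discussed next.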
Besides, the ratio of the direct distance to the path length reflects a global morphological feature of a given node: if the distance to the soma is close to the path length to the soma, the neuron branches are nearly straight in real space, whereas the path is strongly curved if this ratio is much smaller than one, implying that the node is close to the soma in space but connected to it by a long and curved neuronal branch. In Figs. 4(b1) and (b2), we show the distribution of axon and dendrite nodes in the space of normalized path length to the soma and normalized direct distance to the soma. The distribution clearly indicates that most nodes are well separated in this 2D space. In fact, feature ranking by XGB also reveals these two features as the most important ones for the identification of nodal polarity.

Fig. 4 Feature distributions of axons and dendrites for all neurons in our dataset. (a1) and (a2) show the distribution of axon and dendrite nodes along the normalized path length to soma, for simple and complex neurons, respectively. (b1) and (b2) display the nodal distribution in terms of the normalized path length and the normalized distance to the soma. (c1) and (c2) show the nodal distribution in terms of the normalized path length to the soma and the curvature of the associated cluster. Blue and red dots represent axon and dendrite nodes, respectively. Details of curvature calculations are described in Appendix B.

Apart from the two soma features mentioned above, in Fig. 4(c1) and (c2) we also present the distribution of nodal polarity in the space of normalized length to the soma and the cluster curvature near a given node. We suggest that the polarity classification can be effectively enhanced by including curvature as one of the local features, because visual inspection reveals that typically more dendrites (compared to axon nodes) are found in the regime of larger curvatures. Such effects look more significant in simple neurons than in complex neurons. However, if we use curvature or other local features alone, the performance of polarity classification is not as good as when using the path length to the soma.

The importance of each feature can also be obtained from the feature ranking calculation of XGB (this function is, however, not available for DNN). It can also be estimated by comparing the overall accuracy after systematically removing certain features during the training. Our result suggests the top six features for the determination of nodal polarity: (1) unnormalized path length to the soma, (2) normalized path length to the soma, (3) unnormalized distance to the soma, (4) normalized distance to the soma, (5) curvature of the associated cluster, and (6) aspect ratio of the associated cluster. Other features are less important but still contribute to the overall performance of NPIN. These results also confirm that local features are secondary factors for the determination of nodal polarity.

Identification Results of Model I: Using both Soma Features and Local Features

To present the results of polarity identification by NPIN, we start from Model I, using both Soma Features and Local Features for the whole dataset (with both simple and complex neurons). Figures 5(a1)–(a3) show the confusion matrix of Model I based on XGB and the associated precision/recall table for the polarity of terminals in simple neurons, complex neurons, and all neurons, respectively. Figures 5(b1)–(b3) show the results of Model I based on DNN for comparison. Figure 5(c) presents the definition of the confusion matrix and explains how the precision and recall are calculated for axons and dendrites. Because the final result is a binary classification of terminal polarity, the dividing nodes are included neither in the training data nor in the test data.

Fig. 5 Performance of NPIN with Model I, where both Soma Features and Local Features are used. (a1)–(a3) are the confusion matrix and precision/recall table of the terminal polarity, based on the XGB algorithm for simple, complex, and all neurons, respectively. (b1)–(b3) are the same as in (a1)–(a3) but calculated by the DNN algorithm. (c) defines the confusion matrices shown in this figure. In the upper part of the table, each row indicates the actual polarity, and each column indicates the polarity predicted by NPIN. The lower part of the table displays the precision and recall of axonal and dendritic terminals. Precision and recall are defined in the equations below (c).

From the results shown in Figs. 5(a3) and 5(b3), we find that NPIN is a very powerful classifier, with an overall accuracy above 96%. This is achieved by including both Soma Features and Local Features; the model is trained and applied on both simple and complex neurons. According to our results, in general, the precision and recall for the polarity identification of dendritic terminals are better than those of axons by 3%–8%. One possible reason is that the total number of dendrite terminals is approximately three times larger than the number of axon terminals, providing more training data that may increase the precision.

Comparing the confusion matrices for simple neurons (Figs. 5(a1) and 5(b1)) and complex neurons (Figs. 5(a2) and 5(b2)), we observe similar performance of XGB and DNN on simple and complex neurons: the accuracy for simple neurons is higher than that of complex neurons by 1.2% for XGB (compare Figs. 5(a1) and 5(a2)), while the difference becomes 0.8% if calculated by DNN (compare Figs. 5(b1) and 5(b2)).

Such similar accuracy of polarity identification for simple and complex neurons is, however, surprising, because complex neurons have more than two clusters. Therefore, the polarity of middle clusters cannot be easily identified from their relative distance to the soma. There are also various kinds of complex neurons (see Fig. 2, for example), which may even have an axon cluster close to the soma. As a result, a naïve comparison of the path length to the soma should not work well for a complex neuron. Hence, it is reasonable to believe that, in our NPIN, the contribution of soma features to simple and complex neurons should differ from the contribution of local features. To understand how this result is related to the feature selection in various types of neurons, in the following section we demonstrate the performance of NPIN with different feature selections.

Identification Results of Model II: Using Soma Features Only

To clarify the roles of Soma Features and Local Features in the identification of neuronal polarity, we additionally use Model II, which is trained using Soma Features only. The model is trained with the same protocol as in the previous section. The results are shown in Fig. 6.

According to Fig. 6, when using Soma Features only, the overall accuracy drops to 95.5% (94.7%) for simple neurons and 93.1% (90.0%) for complex neurons, respectively, using the XGB (DNN) algorithm. The performance on all neurons, as shown in Fig. 6(a3) and (b3), lies between those of the simple and complex neurons, as expected.

Several important conclusions can be made. First, the overall accuracy of Model II is lower than that of Model I (compare Fig. 6(a3) with Fig. 5(a3) for XGB, and Fig. 6(b3) with Fig. 5(b3) for DNN). However, the difference is only 1.6% for XGB, while it is 3.5% for DNN. This means that the contribution of local features, which exist in Model I but not in Model II, is more significant for DNN than for XGB. Second, if we compare the results for simple and complex neurons, we can see that the influence of local features is much more significant for complex neurons than for simple neurons. For example, for XGB, the accuracy decreases by only 1% in simple neurons (compare Fig. 5(a1) and Fig. 6(a1)), while it decreases by 2.3% for complex neurons (compare Fig. 5(a2) and Fig. 6(a2)). These two values become 2.2% and 5.7%, respectively, for DNN. This clearly implies that the Local Features included in Model I are more important for complex neurons than for simple neurons. The most obvious reason is that complex neurons have more than two clusters and, therefore, the simple application of soma features cannot provide enough information for the identification of polarity.

Fig. 6 Performance of NPIN using Model II, where only Soma Features are included. (a1)–(a3) show the results for simple neurons, complex neurons, and all neurons, respectively, using the XGB algorithm. (b1)–(b3) are the same as (a1)–(a3) but for the DNN algorithm. (c) shows two similar complex neurons, where the middle clusters have opposite polarities. The cluster labeled A/D is axons/dendrites.
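The per-class precision and recall quantities cited throughout these comparisons follow the definitions in Fig. 5(c). As a generic sketch of how they are obtained from a two-class confusion matrix (the dictionary layout and counts here are hypothetical, not taken from the NPIN package):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

def axon_dendrite_metrics(confusion):
    """confusion[(actual, predicted)] holds terminal counts for the
    labels 'axon' and 'dendrite'; returns per-class (precision, recall)."""
    out = {}
    for cls, other in (("axon", "dendrite"), ("dendrite", "axon")):
        tp = confusion[(cls, cls)]      # correctly labeled as cls
        fp = confusion[(other, cls)]    # other class mislabeled as cls
        fn = confusion[(cls, other)]    # cls mislabeled as the other class
        out[cls] = precision_recall(tp, fp, fn)
    return out
```

With such per-class numbers, the asymmetry discussed above—dendrite precision/recall exceeding that of axons when dendrite terminals dominate the training data—can be read off directly.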
6(c) shows two More precisely, by comparing Fig. 7(a2) to Fig. 7(a1), we find types of complex neurons, where the middle clusters have that local features can significantly reduce the number of in- different polarities. These middle clusters are difficult to clas- correct identification for axons (upper right corner of the con- sify by Soma Features only. As a result, we conclude that local fusion table, from 150 to 66); hence, the number of correctly features are crucial for the polarity identification of the middle identified axons is increased. clusters in complex neurons and the DNN algorithm may be In Fig. 7 (b1)–(b3), (c1)–(c3), and (d1)–(d3), we show more sensitive to these differences than XGB. three representative complex neurons with three or more clus- ters of terminals. Figure 7 (b1), (b2), and (b3) show the same Comparison of Models I, II, and III for Complex neuron with polarities identified by Model I, Model II, and Neurons Model III, respectively. Figure 7 (c1)–(c3) and (d1)–(d3) show similar information but for another two neurons. The In this experiment, to investigate how NPIN works with com- results obtained by using Local Features alone (Model III) plex neurons and to examine its relationship with local fea- are not satisfactory: some axon clusters with larger curvatures tures, we focus on complex neurons only: no simple neurons may be incorrectly classified as dendrites (see, for example, are included in either training data or test data. Three models two axon clusters in Fig. 7(c3)). Moreover, some dendrite are used for comparison: Model I (with both Soma Features clusters with divergent branches may be incorrectly classified and Local Features), Model II (with Soma Features only), and as axons (see, for example, the dendrite cluster in Fig. 7(d3)). Model III (with Local Features only). 
Because the influence of Using Soma Features only (Model II), on the other hand, Local Features is more significant in DNN than in XGB (see provides a much better result (with an accuracy of 95.8%), above), here we will apply the DNN algorithm only for because clusters that are closest to or farthest from the soma simplicity. are identified as dendrites or axons, respectively. However, as According to the results shown in Fig. 7(a1)–(a3), the ac- we see in Fig. 7(c2), (d2), and (e2), the middle clusters (de- curacy of classification is the best for Model I and slightly fined from their distance to soma) of these complex neurons reduces for Model II, but it significantly drops for Model III cannot be identified easily by Model II (with Soma Features (which uses Local Features only). This result indicates that, only), because their relative distance to the soma is not well- without any information on its relative distance to the soma, defined compared to the other clusters. Local Features alone for a given node perform poorly in po- As a summary, we find that the accuracy to classify the po- larity identification but are not completely useless (with 71% larity of middle clusters in a complex neuron can be significantly enhanced after combining Soma Features and Local Features in accuracy, see Fig. 7(a3)). Indeed, we find that the inclusion of Neuroinform (2021) 19:669–684 679 Fig. 7 Performance of NPIN with DNN algorithm for complex Model II, and Model III, respectively. Filled gray circles indicate the neurons in three different models. (a1)–(a3) are the confusion matrix terminals of incorrect classification. (c1)–(c3) and (d1)–(d3) are the same and precision-recall table for the terminal polarity for Model I (with both as in (b1)–(b3) but with two different complex neurons. 
(e1)–(e4) are four Soma Features and Local Features), Model II (with Soma Features only), different complex neurons, where polarities are classified by Model I with and Model III (with Local Features only), respectively. (b1)–(b3) display 100% accuracy by DNN algorithm the same complex neuron with polarity classification using Model I, Model I. More examples of complex neurons with correct polar- our NPIN, trained by Drosophila neurons, can be directly used ityidentificationbyModel IareshowninFig. 7(e1)–(e4). for other species of insects, which should have similar mor- phological features as Drosophila. Here, we take the neuron Application to Other Species of Insects: Blowfly and images of the blowfly and the moth from the Neuromorpho Moth database (http://neuromorpho.org/) as an example. The database lists 19 blowfly neurons and 3 moth neurons with labeled polarity. These data were generated by different labs In principle, our NPIN, trained on the Drosophila brain neu- rons, can also be applied to the polarity identification of other species, if the training data is replaced by the neurons of that The IDs of 19 Blowfly neurons are HSE-fluoro05, HSE-fluoro11, HSE- species. However, the number of publicly available neuronal fluoro15, HSN-cobalt, HSN-fluoro04, HSN-fluoro06, HSN-fluoro08, HSS- cobalt, VS1-cobalt, VS2-fluoro01, VS2-fluoro03, VS2-fluoro10, VS3-cobalt, data samples of other species with identified polarity is much VS4-cobalt, VS4-fluoro02, VS4-fluoro07, VS4-fluoro09, VS5-cobalt, VS9- less than that of Drosophila. Therefore, such an application cobalt. The IDs of 3 Moth neurons are Nevron-komplett-08-02-28-2a, may not be practical. However, it is still instructive to see how Nevron-komplett-08-03-13-2a, Nevron-komplett-08-08-28-1a-A. 680 Neuroinform (2021) 19:669–684 using different reconstruction methods from that of our three models (Model I: all features, Model II: Soma Features Drosophila dataset. 
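NeuroMorpho.org distributes reconstructions in the plain-text SWC format (one line per sample point: id, type, x, y, z, radius, parent). A minimal reader sketch is shown below; it is an illustration under that assumption, not the preprocessing pipeline used for NPIN:

```python
from collections import Counter

def parse_swc(text):
    """Parse SWC lines 'id type x y z radius parent' into a dict
    id -> {type, xyz, radius, parent}. Comment lines start with '#'."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        nid, ntype, x, y, z, radius, parent = line.split()[:7]
        nodes[int(nid)] = {
            "type": int(ntype),
            "xyz": (float(x), float(y), float(z)),
            "radius": float(radius),
            "parent": int(parent),
        }
    return nodes

def branch_points(nodes):
    """Return ids of nodes with two or more children - the branch points
    ('nodes') on which a node-based classifier operates."""
    child_count = Counter(n["parent"] for n in nodes.values()
                          if n["parent"] != -1)
    return sorted(i for i, c in child_count.items() if c >= 2)
```

Skeletons parsed this way expose the same parent-pointer structure from which the Soma Features and Local Features described earlier can be derived.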
To save space, below we present the NPIN results for the blowfly in detail and mention the results for the moth data only briefly.

Figure 8 shows the results of polarity identification for the 19 blowfly neurons obtained by Model I, Model II, and Model III of NPIN, trained on the 213 Drosophila neurons in our dataset with the DNN model. We find that Model I, using both Soma Features and Local Features, still provides a decent level of accuracy (83.4%). The main error stems from the rather low precision and recall of the axons, which have far fewer terminals than the dendrites (dendrite:axon ratio = 22.8:1). Similar results are also observed for Model II, as shown in Fig. 8(a2).

However, a surprising result is obtained when using Model III, where only Local Features are included for the training on Drosophila neurons. The overall accuracy, as well as the precision and recall for both dendrites and axons, are very high (accuracy = 98.98%). This result is even better than that obtained by using the blowfly data themselves for the training process (Fig. 8(b)). The results clearly indicate that, unlike in Drosophila, where Local Features are only secondary factors compared with Soma Features, Local Features are the primary factors for the identification of neuronal polarity in the blowfly neurons tested in the present study. This can also be observed from the skeleton structure of the dendrite clusters in Fig. 8(c1)–(c4). Therefore, to apply NPIN (trained on Drosophila neurons) to neurons of other insects, it is necessary to provide not only Model I but also Model II and Model III, to maximize the range of applications. We have to emphasize, however, that the 19 blowfly neurons are all collected from the visual system, and therefore we cannot exclude the possibility that the success of NPIN is due to their special morphology. More analysis of other types of neurons in the blowfly should be carried out in the future, when more neurons with known polarity become available.

In addition to the blowfly, we also collected 3 moth neurons with known polarities from the NeuroMorpho database. Among the 194 dendrite terminals and 358 axon terminals, the overall accuracy of the polarity identification by NPIN (trained on the 213 Drosophila neurons with the DNN model) is 98.2%, 99.0%, and 65.6% for Models I, II, and III, respectively. This result reflects the fact that the polarity of these three neurons can be identified much more easily by Soma Features alone. This is a complementary example to the blowfly and shows the importance of including both Soma Features and Local Features for a general application of NPIN.

Summary of Results

We summarize the results of the present study in Fig. 9 by showing the accuracy of NPIN in all test conditions, including three models (Model I: all features, Model II: Soma Features only, Model III: Local Features only) and three types of test data (simple neurons, complex neurons, and all neurons). For simplicity, we only display the results using the DNN algorithm.

As explained above, the overall accuracy cannot reflect the complete information on model performance, especially when the numbers of dendrites and axons are highly imbalanced. To generate a reliable ML model, we suggest that the precision and recall for both axons and dendrites have to be larger than 50%, or, in other words, that there are more correctly identified terminals than incorrect ones. We put stars "*" in Fig. 9 to mark those results that do not meet these criteria.

Discussion

Comparison of NPIN and SPIN

A previously developed machine-learning-based method, SPIN (described in the Introduction), has identified the polarity of insect neurons with an overall accuracy of 84%–90% (Lee et al. 2014). SPIN starts by identifying clusters of neuronal arbors in each neuron and then classifies the polarity of each cluster according to its geometric structure and distance to the soma. As a result, terminals in a cluster are all classified as having the same polarity. However, this approach has two challenges. First, a cluster might not be easily identified for neurons with complex morphology, and incorrect clustering can lead to a large number of incorrectly classified terminals; for example, 14 of the 213 neurons used in the present study were not processable by SPIN. Second, the number of available clusters may not be sufficient to achieve good training results, because each neuron has only a few clusters. Due to these issues, SPIN often failed to classify part of or even all terminals of a neuron if its arbors were not clustered correctly. The proposed NPIN avoids these issues by adopting node-based rather than cluster-based classification.

To compare the performance of SPIN and NPIN, we examine the results of polarity identification by SPIN on the same 213 neurons used here (Huang et al. 2019). We find that, among these 213 neurons, only 79 neurons are fully identified (i.e., without any "non-classified" terminals), 120 neurons are partially predicted (i.e., some clusters cannot be identified), and 14 neurons cannot be predicted. Among the 9452 terminals of these 213 neurons, there are 1207 unclassified terminals and 8247 classified terminals. Within the SPIN-classified terminals, 8038 terminals are correctly identified. Therefore, the overall accuracy of SPIN is only 85.04% if we consider all terminals in the dataset, while it is 97.49% if we consider only the classified terminals.
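The starring rule used for Fig. 9—flag any result whose precision or recall does not exceed 50% for either class—amounts to a simple check. A sketch (function and argument names are ours, purely illustrative):

```python
def is_reliable(metrics, threshold=0.5):
    """metrics maps class name ('axon', 'dendrite') to (precision, recall).
    A result counts as reliable only if every precision and recall exceeds
    the threshold, i.e., each class has more correctly identified terminals
    than incorrect ones."""
    return all(p > threshold and r > threshold for p, r in metrics.values())

def label_with_star(name, metrics):
    """Append '*' to a result label when it fails the reliability criterion."""
    return name if is_reliable(metrics) else name + "*"
```

This is the criterion behind the starred entries discussed in the Summary of Results.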
We emphasize that, in the present study, we develop a completely different approach by identifying the polarity of each node, which can be unambiguously defined in the skeleton structure of each neuron, with a nodal polarity also well defined through the polarity of terminals (see Fig. 3). Such node-based feature extraction therefore takes advantage of the fact that the number of nodes is much larger than the number of clusters in each neuron. It achieves a much higher accuracy (>96%) for the whole dataset (213 neurons and 9452 terminals) after including the spatial correlation. Therefore, we conclude that NPIN outperforms SPIN in polarity identification, an important step toward the reconstruction of the connectome. We expect to analyze the information flow of the brain on much finer scales in the near future, revealing more detailed functional relationships between subregions of the Drosophila brain.

Fig. 8 Performance of NPIN on blowfly brain neurons. (a1)–(a3) are the confusion matrices and precision-recall tables for Model I, Model II, and Model III, respectively. The models are trained on the 213 fruit-fly neurons in our dataset. (b) is the result for Model I but trained on blowfly neurons directly. (c1)–(c4) display four example skeleton structures of the blowfly neurons used in this test.

Neurons Not in the Dataset

As described in the flowchart of NPIN in Fig. 1, the neuronal polarity predicted by NPIN for the 213 Drosophila neurons is obtained by randomly selecting 100 neurons for training, 25 for validation, and 50 for testing. In other words, each neuron shown in Tables S1 and S2 of Appendix E is tested by models trained on other neurons in the dataset, and therefore there is no overlap between the test data and the training data in all the results presented above.

However, in order to show how well NPIN can be applied to neurons outside this dataset, we selected another 22 neurons whose connection types (mostly from the AOTU to the BU, and from the MED to the VMP) are distinct from those in the NPIN dataset. The polarities of these neurons were determined by our experimental collaborators and have not been published before (they are not in the original dataset either). Twelve of them have dendrites located in the AOTU, 9 in the MED, and 1 complex neuron in the MB; see Table S3 of Appendix E. The predicted results by NPIN (trained on the 213 neurons together) show that NPIN can still provide very high accuracy. More precisely, for the 12 neurons in the AOTU, 11 of them are 100% correct and only one has 75% accuracy. For the 9 neurons with dendrites in the MED, 7 of them are 100% correct and the other two have 94.3% and 80.6% accuracy. The complex neuron with dendrites in the MB (also not in the NPIN dataset) is predicted with 100% accuracy. These results show that our NPIN should be applicable to neurons in other brain regions.

Fig. 9 Summary of NPIN accuracies in all test conditions using the DNN algorithm. (a) shows the results for Model I (with both Soma Features and Local Features), Model II (with Soma Features only), and Model III (with Local Features only), for three types of test data: simple neurons, complex neurons, and all neurons, respectively. (b) shows the results for the same models but with the blowfly neurons (trained by our Drosophila dataset). Results with precision or recall of less than 50% are indicated by "*" (see the text).
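The evaluation protocol described above—repeated 100/25/50 random splits with per-neuron probability averaging, so that test and training neurons never overlap within a round—can be sketched as follows. `train_and_predict` stands in for one full NPIN training run and is a hypothetical callback, not an API of the released package:

```python
import random
from collections import defaultdict

def repeated_split_predictions(neuron_ids, train_and_predict, rounds=20,
                               n_train=100, n_val=25, n_test=50, seed=0):
    """Repeat the random split for several rounds and average, per neuron,
    the predicted axon probabilities over the rounds in which that neuron
    fell into the test set. `train_and_predict(train, val, test)` returns
    {neuron_id: prob_axon} for the test neurons of one round."""
    rng = random.Random(seed)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(rounds):
        ids = neuron_ids[:]
        rng.shuffle(ids)
        train = ids[:n_train]
        val = ids[n_train:n_train + n_val]
        test = ids[n_train + n_val:n_train + n_val + n_test]
        for nid, p in train_and_predict(train, val, test).items():
            sums[nid] += p
            counts[nid] += 1
    # average over the rounds in which each neuron was tested
    return {nid: sums[nid] / counts[nid] for nid in sums}
```

Averaging over the rounds in which a neuron lands in a test split is what makes the reported per-neuron polarities stable against the random dataset selection.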
Although, like all machine learning algorithms, NPIN is trained on labeled data whose features (here, neuronal morphology) should resemble those of the unlabeled data, we found that NPIN is still able to successfully classify polarity for neurons that are morphologically distinct from the training neurons. Nevertheless, we have to acknowledge that the number of neurons in our dataset is far smaller than the total number of neurons (approximately 135 K) in the Drosophila brain. There must be other types of neurons with polarity-specific morphological features that are very different from what we have addressed in this study. For example, the dendrites and axons of some local neurons are co-localized in the same cluster of arbors, and some projection neurons develop axonal clusters that are closer to the soma than the dendritic ones. We will include more morphologically distinct neurons in the training set once their experimentally verified polarities are available. Therefore, although more training data are necessary before our NPIN can be applied to polarity identification across the whole Drosophila brain, the present work at least demonstrates the possibility of high-precision identification through the node-based feature extraction in NPIN. We believe that future versions of NPIN, after including more types of neurons in the training data, will provide a much wider range of applications.

Neurons with Low Accuracy

To examine the performance of our NPIN, we investigate those neurons whose polarity is not identified well. As described in the Results section, we obtain this information by randomly selecting 150 neurons (100 for training, 25 for validation, and 50 for testing) out of the 213 neurons in the dataset for each training/test process and then repeating it for 20 rounds. As a result, each neuron is tested (by different models trained on other neurons) 4–5 times on average, and its polarity identification results are obtained by averaging the probabilities before relabeling. The final results calculated by the DNN model are shown in Appendix E. Within these 213 neurons, the terminal polarity of 166 neurons is identified with 100% accuracy. Only 14 simple neurons and 33 complex neurons are not fully identified. Concentrating on the neurons with a lower accuracy (say, below 85%), we are left with only 5 simple neurons and 24 complex neurons.

When looking into the skeleton structures of these lower-accuracy neurons, we find the following features: the simple neurons have very similar distances from the axon clusters and the dendrite clusters to the soma, and their number of dendrite terminals is much larger than the number of axon terminals. The former makes it difficult to distinguish axons from dendrites, while the latter can confuse NPIN by mispredicting all terminals to be dendrites (as a result, the precision and recall of axons are both small). For complex neurons, the incorrectly identified terminals usually appear in the middle clusters, as one may expect. However, most of the complex neurons are correctly predicted by NPIN with a very high accuracy (91 of the 124 complex neurons are identified with 100% accuracy). In our node-based feature extraction, it is challenging to correctly identify clusters with fewer terminals or nodes, because their local features are less representative of their local morphology. Therefore, finding a better way to define local features (less dependent on the number of terminals in the same cluster) could enhance polarity identification in future work.

Comparison with Results of Electronic-Microscopy Images

Finally, a large set of electronic-microscopy images (the EM dataset) of the Drosophila brain has recently been released (C. S. Xu et al. 2020). This dataset includes identified polarities and hence can potentially be used as training data for NPIN or be compared with the results predicted by NPIN on the fluorescence images. However, after careful examination of that dataset, we discovered two major differences in the morphological characteristics between the two datasets: (1) the neuronal skeletons in the EM dataset exhibit much more detail, e.g., a larger number of short terminal branches than found in the fluorescent images in the present study; (2) some neurons in the EM dataset have incomplete tracing or discontinuous branches. These issues prevent us from directly using the EM dataset. For future work, we suggest that heavy preprocessing, including the reconstruction of the connectome and an algorithm for matching terminals of the same neuron between the two image types, is required before NPIN can utilize the EM dataset. Moreover, we have to emphasize that the current hemi-brain EM database is from ONE fly only, whereas neural images from light-microscopy-based databases are often accumulated from multiple individuals. Although this database serves as crucial reference data for the fly community, it is unlikely that full-brain or hemi-brain EM data from many more flies, or from flies with different genetic manipulations, will become available in the near future. By contrast, data from optical images are continuously generated by a large number of labs around the world. We therefore believe that our NPIN will have its impact and be widely used by many fly labs in the future.

Conclusion

In this study, we have developed NPIN, a completely new ML model to identify the polarity of projection neurons in a Drosophila brain with high precision (>96%). This result was achieved due to three major contributions: node-based feature extraction, separation of Local Features from Soma Features,
and implementation of spatial correlations between nodal polarities. In the experiments, we systematically compare the results of different models for various types of neurons. We demonstrate that, apart from Soma Features, Local Features are secondary factors in determining the neuronal polarity. Local Features can significantly improve the polarity identification, especially for the middle clusters of complex neurons, which cannot be well identified by using Soma Features only. Besides the Drosophila neurons, we show that NPIN can also be applied to identify the neuronal polarity of other insects, such as the blowfly. As a result, we believe that the development of NPIN and its applications is an important step toward the determination of signal flows in complex neural networks.

Information Sharing Statement

The NPIN software package contains data of sample neurons with skeletal data available from the FlyCircuit database (http://www.flycircuit.tw). We also provide two online versions of NPIN to be used or tested by other research groups at the following addresses: website (https://npin-for-drosophila.herokuapp.com) and GitLab code (https://gitlab.com/czsu32/npin).

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s12021-021-09513-y.

Acknowledgments This work is supported by the Ministry of Science and Technology grant (MOST 107-2112-M-007-019-MY3) and by the Higher Education Sprout Project funded by the Ministry of Science and Technology and the Ministry of Education in Taiwan. We thank the National Center for Theoretical Sciences and the Brain Research Center of National Tsing Hua University for providing full support of the interdisciplinary collaboration, and the National Center for High-Performance Computing for providing the FlyCircuit database. We appreciate Prof. Che-Rung Lee for providing computational resources, Yi-Ning Juan for helping with the website, and Dr. Yu-Chi Huang and Prof. An-Shi Chiang for helpful discussions about the neuronal datasets.

Funding This work is supported by the Ministry of Science and Technology grant (MOST 107-2112-M-007-019-MY3) and by the Higher Education Sprout Project funded by the Ministry of Science and Technology and the Ministry of Education in Taiwan.

Data Availability The FlyCircuit database (http://www.flycircuit.tw/) is provided by the National Center for High-Performance Computing.

Code Availability We provide an online version and the source code of NPIN (XGB version) at the following websites: Test Page (https://npin-for-drosophila.herokuapp.com/); GitLab code (https://gitlab.com/czsu32/npin).

Declarations

Conflict of Interests The authors declare that they have no conflict of interests.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Asri, H., Mousannif, H., Moatassime, H. A., & Noel, T. (2016). Using machine learning algorithms for breast cancer risk prediction and diagnosis. Procedia Computer Science, 83, 1064–1069. https://doi.org/10.1016/j.procs.2016.04.224

Chiang, A.-S., Lin, C.-Y., Chuang, C.-C., Chang, H.-M., Hsieh, C.-H., Yeh, C.-W., et al. (2011). Three-dimensional reconstruction of brain-wide wiring networks in Drosophila at single-cell resolution. Current Biology, 21(1), 1–11. https://doi.org/10.1016/j.cub.2010.11.056

Craig, A. M., & Banker, G. (1994). Neuronal polarity. Annual Review of Neuroscience, 17(1), 267–310. https://doi.org/10.1146/annurev.ne.17.030194.001411

Cuntz, H., Forstner, F., Haag, J., & Borst, A. (2008). The morphological identity of insect dendrites. PLoS Computational Biology, 4(12), e1000251. https://doi.org/10.1371/journal.pcbi.1000251

Fischbach, K.-F., & Dittrich, A. P. M. (1989). The optic lobe of Drosophila melanogaster. I. A Golgi analysis of wild-type structure. Cell and Tissue Research, 258(3), 441–475. https://doi.org/10.1007/BF00218858

Hanesch, U., Fischbach, K.-F., & Heisenberg, M. (1989). Neuronal architecture of the central complex in Drosophila melanogaster. Cell and Tissue Research, 257(2), 343–366. https://doi.org/10.1007/BF00261838

Huang, Y.-C., Wang, C.-T., Su, T.-S., Kao, K.-W., Lin, Y.-J., Chuang, C.-C., et al. (2019). A single-cell level and connectome-derived computational model of the Drosophila brain. Frontiers in Neuroinformatics, 12. https://doi.org/10.3389/fninf.2018.00099

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097–1105). Curran Associates, Inc. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf. Accessed 13 April 2020.

Kuan, L., Li, Y., Lau, C., Feng, D., Bernard, A., Sunkin, S. M., et al. (2015). Neuroinformatics of the Allen Mouse Brain Connectivity Atlas. Methods, 73, 4–17. https://doi.org/10.1016/j.ymeth.2014.12.013

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.

Parekh, R., & Ascoli, G. A. (2013). Neuronal morphology goes digital: A research hub for cellular and system neuroscience. Neuron,
https://doi.org/10.1038/nature14539. 77(6), 1017–1038. https://doi.org/10.1016/j.neuron.2013.03.008. LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based Peng, H., Hawrylycz, M., Roskams, J., Hill, S., Spruston, N., Meijering, learning applied to document recognition. Proceedings of the IEEE, E., & Ascoli, G. A. (2015). BigNeuron: Large-Scale 3D Neuron 86(11), 2278–2324. Presented at the Proceedings of the IEEE. Reconstruction from Optical Microscopy Images. Neuron, 87(2), https://doi.org/10.1109/5.726791. 252–256. https://doi.org/10.1016/j.neuron.2015.06.036. Lee, Y.-H., Lin, Y.-N., Chuang, C.-C., & Lo, C.-C. (2014). SPIN: A Rolls, M. M. (2011). Neuronal polarity in Drosophila: Sorting out axons Method of Skeleton-Based Polarity Identification for Neurons. and dendrites. Developmental Neurobiology, 71(6), 419–429. Neuroinformatics, 12(3), 487–507. https://doi.org/10.1007/s12021- https://doi.org/10.1002/dneu.20836. 014-9225-6. Shinomiya, K., Matsuda, K., Oishi, T., Otsuna, H., & Ito, K. (2011). Lin, C.-Y., Chuang, C.-C., Hua, T.-E., Chen, C.-C., Dickson, B. J., Flybrain neuron database: A comprehensive database system of Greenspan, R. J., & Chiang, A.-S. (2013). A Comprehensive the Drosophila brain neurons. The Journal of Comparative Wiring Diagram of the Protocerebral Bridge for Visual Neurology, 519(5), 807–833. https://doi.org/10.1002/cne.22540. Information Processing in the Drosophila Brain. Cell Reports, Squire, L. R., Berg, D., Bloom, F., Lac, S. du, & Ghosh, A. (Eds.). 3(5), 1739–1753. https://doi.org/10.1016/j.celrep.2013.04.022. (2008). Fundamental Neuroscience, Third Edition (3rd ed.). Lo, C.-C., & Chiang, A.-S. (2016). Toward Whole-Body Connectomics. Academic Press. Journal of Neuroscience, 36(45), 11375–11383. https://doi.org/10. Wang, J., Ma, X., Yang, J. S., Zheng, X., Zugates, C. T., Lee, C.-H. J., & 1523/JNEUROSCI.2930-16.2016. Lee, T. (2004). Transmembrane/Juxtamembrane Domain- Malta, T. M., Sokolov, A., Gentles, A. 
J., Burzykowski, T., Poisson, L., Dependent Dscam Distribution and Function during Mushroom Weinstein, J. N., et al. (2018). Machine Learning Identifies Body Neuronal Morphogenesis. Neuron, 43(5), 663–672. https:// Stemness Features Associated with Oncogenic Dedifferentiation. doi.org/10.1016/j.neuron.2004.06.033. Cell, 173(2), 338–354.e15. https://doi.org/10.1016/j.cell.2018.03. Wu, M., Nern, A., Williamson, W. R., Morimoto, M. M., Reiser, M. B., Card, G. M., & Rubin, G. M. (2016). Visual projection neurons in Matus, A., Bernhardt, R., & Hugh-Jones, T. (1981). High molecular the Drosophila lobula link feature detection to distinct behavioral weight microtubule-associated proteins are preferentially associated programs. eLife, 5, e21022. https://doi.org/10.7554/eLife.21022. with dendritic microtubules in brain. Proceedings of the National Xu, C. S., Januszewski, M., Lu, Z., Takemura, S., Hayworth, K. J., Huang, Academy of Sciences of the United States of America, 78(5), 3010– G., et al. (2020). A Connectome of the Adult Drosophila Central Brain. bioRxiv, 2020.01.21.911859. 10.1101/2020.01.21.911859. Milyaev, N., Osumi-Sutherland, D., Reeve, S., Burton, N., Baldock, R. Xu, M., Jarrell, T. A., Wang, Y., Cook, S. J., Hall, D. H., & Emmons, S. A., & Armstrong, J. D. (2012). The Virtual Fly Brain browser and W. (2013). Computer Assisted Assembly of Connectomes from query interface. Bioinformatics, 28(3), 411–415. https://doi.org/10. Electron Micrographs: Application to Caenorhabditis elegans. Plos 1093/bioinformatics/btr677. One, 8(1), e54050. https://doi.org/10.1371/journal.pone.0054050. Mohsen, H., El-Dahshan, E.-S. A., El-Horbaty, E.-S. M., & Salem, A.-B. M. (2018). Classification using deep learning neural networks for brain tumors. Future Computing and Informatics Journal, 3(1), 68– Publisher’sNote Springer Nature remains neutral with regard to jurisdic- 71. https://doi.org/10.1016/j.fcij.2017.12.001. tional claims in published maps and institutional affiliations.
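For readers who want a concrete picture of the two feature families described above, the following is a minimal sketch, not the authors' NPIN implementation: it computes illustrative "Soma Features" (spatial relation of a branch node to the soma) and "Local Features" (morphology around the node) from a toy SWC-style skeleton. The feature names and the skeleton values are hypothetical; in NPIN, vectors like these (plus spatial correlations between nodal polarities) feed a classifier, and the released version uses XGBoost.

```python
# Illustrative sketch only (hypothetical feature names and toy data),
# not the authors' NPIN code.
import math

# Minimal SWC-like skeleton: node id -> (x, y, z, parent_id); parent -1 marks the soma.
skeleton = {
    1: (0.0, 0.0, 0.0, -1),   # soma
    2: (5.0, 0.0, 0.0, 1),
    3: (9.0, 3.0, 0.0, 2),    # branch node
    4: (13.0, 6.0, 0.0, 3),   # terminal
    5: (12.0, -1.0, 0.0, 3),  # terminal
}

def euclid(a, b):
    return math.dist(a[:3], b[:3])

def path_to_soma(nid):
    """Walk parent links from a node back to the soma, returning node ids."""
    path = [nid]
    while skeleton[nid][3] != -1:
        nid = skeleton[nid][3]
        path.append(nid)
    return path

def soma_features(nid):
    """'Soma Features': spatial information from a given node to the soma."""
    soma = next(k for k, v in skeleton.items() if v[3] == -1)
    path = path_to_soma(nid)
    path_len = sum(euclid(skeleton[a], skeleton[b]) for a, b in zip(path, path[1:]))
    return {
        "euclidean_dist_to_soma": euclid(skeleton[nid], skeleton[soma]),
        "path_dist_to_soma": path_len,
        "branch_order": len(path) - 1,  # crude proxy: hops back to the soma
    }

def local_features(nid):
    """'Local Features': morphology in the node's immediate neighbourhood."""
    children = [k for k, v in skeleton.items() if v[3] == nid]
    seg_lens = [euclid(skeleton[nid], skeleton[c]) for c in children]
    return {
        "n_children": len(children),
        "mean_child_seg_len": sum(seg_lens) / len(seg_lens) if seg_lens else 0.0,
    }

# Feature vector for the branch node; a classifier would map this to axon/dendrite.
feats = {**soma_features(3), **local_features(3)}
```

For the toy skeleton above, node 3 has branch order 2, two children, a 10-unit path distance to the soma, and 5-unit mean child segment length; real skeletons from FlyCircuit would of course yield far richer vectors.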

Journal: Neuroinformatics (Springer Journals), 19:669–684 (2021)
Published: Oct 1, 2021
