
Introduction

The fusion analysis program uses several modules that contain transformation and detection algorithms. This page gives a brief overview of the algorithms applied to the data in each module. The information on this page is based on version 1.5.0 of the program.

Detrending module

This module allows for detrending of the data when there is a significant baseline shift. The method used is set in the preferences dialog. The following methods are currently supported:

Detrending is only recommended if the data has a very poorly behaved baseline: most of the detrending algorithms used here will also distort the fusion event onset and amplitude to some degree. Always try the default method (none) first: the detection module is quite robust against trends in the baseline and will detect a very large proportion of events. Detrending is applied to both channels if dual color data is available.




Detection module

When the data has been loaded and detrended, fusion events can be detected using Fusion detection from the Analysis menu. The parameters for automatic detection can also be explored interactively via Explore detection in the Analysis menu. This opens a new window that shows the parameters used for automatic detection and allows interactive adjustment.

Explore detection

Before using the automatic detection, it is necessary to verify the settings used for detection: the parameters for the baseline length cannot exceed the total number of frames in the trace. These settings can be adjusted via Edit | Preferences in the menu. Once the settings are approximately correct, the explore detection interface can be started (see Figure 1).

Fig 1: Explore detection parameters

Figure 1: Explore detection interface, showing three plots (from top to bottom): raw data and a scaled derivative; first derivative and median filtered derivative with thresholds; and median filtered derivatives for the rising and falling edge with peak detections and thresholds.

The plots are updated upon changing the values in the fields on the left. Please note that some parameters are not immediately visible in the plots. The dotted lines in the bottom plot indicate the different levels of the threshold: the spacing between consecutive dotted lines equals one clean level. This means that the second dotted line lies at two times the detection threshold, the third line at three times the detection threshold, and so on.

To explore the detection, load data and select the explore detection option from the menu. Once the window opens, a trace can be selected using the previous or next buttons on the bottom left. Once a trace is loaded, the plots show the different traces that are used to find the peaks. The main detection is shown in the bottom plot, with the derivative data used for detecting the rising phase (magenta) and the falling phase (blue). The top plot contains the raw data with the final detected peaks overlaid. Keep in mind that the peaks in this plot have already been filtered for local detection level using the clean-up settings (parameters 2-4).

The parameters on the left control the detection of events, but some of the settings only apply to the second channel. The following list shows the influence of the parameters per channel:

Note that when single color data is loaded, the fields that only influence the second channel are disabled. If the program is run on a Matlab version prior to R2014b, the peak prominence parameter is also disabled, because that parameter is not available in the findpeaks() function of those versions. The values in the explore detection interface are NOT transferred to the settings until you press the Apply button: if you close the window without applying, the modifications are lost!




Automatic detection

When the user has not supplied manual fusion times, the program will attempt to find the start and end of the fusion event(s) automatically. To do so, it employs a search based on the findpeaks() function in Matlab. For the pHluorin data, the detection algorithm goes through the following stages:

  1. calculate the first derivative of the data
  2. zero all negative values
  3. apply median filter to the gradient data
  4. establish a threshold based on the gradient baseline standard deviation
  5. use findpeaks() on the gradient data using the threshold parameters
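
The five stages above can be sketched as follows. This is an illustrative Python translation, not the program's actual Matlab code: the function name, parameter names and default values are made up, and scipy's medfilt/find_peaks stand in for Matlab's median filtering and findpeaks().

```python
import numpy as np
from scipy.signal import medfilt, find_peaks

def detect_rising_edges(trace, baseline_len=50, medfilt_size=5, threshold_factor=4.0):
    """Sketch of the five detection stages; all names and defaults are illustrative."""
    # 1. first derivative of the data
    gradient = np.diff(trace)
    # 2. zero all negative values
    gradient[gradient < 0] = 0
    # 3. median filter the gradient
    filtered = medfilt(gradient, kernel_size=medfilt_size)
    # 4. threshold from the standard deviation of the gradient baseline
    threshold = max(threshold_factor * np.std(filtered[:baseline_len]), 1e-12)
    # 5. find peaks above the threshold
    peaks, props = find_peaks(filtered, height=threshold)
    # "peak fold": how many times each peak exceeds the threshold
    peak_fold = props["peak_heights"] / threshold
    return peaks, peak_fold
```

The peak fold value returned here corresponds to the "peak fold" column in the results table; a real trace would additionally be subjected to the clean-up checks from the preferences.
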
After the initial detection of the peaks, each peak is verified against the criteria set by the user in the preferences. The event starts at the first point above the threshold level, set by the user as a factor times the standard deviation of the local baseline. The peak in the gradient is used to calculate the peak fold values shown in the results table in the main interface ("peak fold" column). This value represents how many times the peak exceeds the threshold and is a measure of the detection quality. For data with stable noise and baseline levels, a value of 1.5 is a good cutoff for discriminating good detections.

When the start point(s) of fusion have been determined, the next step is finding the end points of the fusion event(s). This detection follows the same stages, but operates on the negative gradient of the signal. After detection, the end points are checked against the start points: every start point can have only one end point, and the end point of fusion event 1 cannot lie beyond the start point of event 2. Once this check has been completed, the remaining fusion end points are corrected for the detection level threshold. Each fusion point is assigned a quality value, calculated from the signal-to-noise ratio (SNR) of the peak: the peak value divided by the standard deviation of the baseline. This value is then converted to a value between 0 and 1 using a sigmoid curve (see Figure 2). The result is used for the color coding of the triangles in the main plot and in the results table in the interface ("quality" column). A value of 0.3 (corresponding to an SNR of approximately 5) is a good cutoff for discriminating good detections.

Fig 2: quality curve

Figure 2: Sigmoid curve for mapping the peak SNR to a value between 0 and 1. Dotted lines indicate the values for events with a SNR of 3, 6 and 8 with their associated quality values (0.0312, 0.5 and 0.875, respectively).

If a second channel is present, detection of the falling edge is applied to the second channel as well. The stages are as follows (with their own preference settings):

  1. apply a first stage median filter on the data
  2. calculate the negative first derivative
  3. zero all negative values in this gradient data
  4. apply a second stage median filter on the gradient
  5. establish the threshold based on a factor of the max value
  6. use findpeaks() on the gradient data using the threshold parameters
After the initial detection, each of the fusion points is verified to check that it falls below the threshold set by the user. This threshold is calculated similarly to the threshold for the first channel.

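
As a rough sketch, the six stages for the second channel could look like this in Python (again an illustrative translation with made-up names and default values, using scipy in place of the Matlab functions):

```python
import numpy as np
from scipy.signal import medfilt, find_peaks

def detect_falling_edges_ch2(trace, filt1=5, filt2=3, threshold_factor=0.5):
    """Sketch of the channel-2 stages; names and defaults are illustrative."""
    # 1. first-stage median filter on the raw data
    smoothed = medfilt(trace, kernel_size=filt1)
    # 2. negative first derivative (falling edges become positive)
    gradient = -np.diff(smoothed)
    # 3. zero all negative values in the gradient
    gradient[gradient < 0] = 0
    # 4. second-stage median filter on the gradient
    filtered = medfilt(gradient, kernel_size=filt2)
    # 5. threshold as a factor of the maximum gradient value
    threshold = threshold_factor * filtered.max()
    # 6. find peaks above the threshold
    peaks, _ = find_peaks(filtered, height=threshold)
    return peaks
```
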
Fig 3: Channel 2 detection

Figure 3: Explore detection interface, showing three plots (from top to bottom): raw data and a scaled derivative; first derivative and median filtered derivative with thresholds; and median filtered derivatives for the falling edge with peak detections and thresholds.

Note that the fusion events in the second channel are not matched to the events in the first channel, but are detected completely independently. The detection threshold can be controlled using the F level parameter. During classification this discrepancy is accounted for when classifying events, but the number of events is based on the first channel only.

Slow event detection

The automatic detection is based on finding a large change in intensity over a short period of time (usually 1 or 2 frames). Some events in a data set, however, have a much slower onset (more than 5-10 frames). For these events there is a separate detection method that tries to find the start of slow events. This detection uses 5 different methods to detect a deviation from the baseline:

  1. Maximum of the first derivative on the smoothed data
  2. Deviation from the median by 4× IQR (interquartile range)
  3. Deviation from the mean by 6× SD
  4. Deviation from the median by 4× IQR using a sliding window
  5. Alarm rate method using a sliding window and deviation from the mean by 3× SD
Each of the methods returns either a location for an event or a NaN value if it cannot find one. The detection accepts an event if at least 3 of the methods return a location and the locations are not more than 15 frames apart. After this check, the time point of fusion is determined as the median of the five methods (excluding the NaN values). The goodness value is a sum over the detection methods, with weights of 0.05, 0.1, 0.2, 0.25 and 0.3999 for methods 1-5, respectively. This detection is applied automatically when the automatic detection fails.
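
A minimal sketch of how the five results could be combined. The function name is made up, the interpretation of the 15-frame criterion as the spread between the returned locations is an assumption, and the weights are taken from the text:

```python
import numpy as np

# per-method weights for the goodness value (methods 1-5, from the text)
WEIGHTS = [0.05, 0.1, 0.2, 0.25, 0.3999]

def combine_slow_detections(locations, max_spread=15, min_methods=3):
    """Combine the five slow-event detectors into a single decision.

    `locations` holds five frame indices, with np.nan for a method that
    found nothing. Returns (frame, goodness) or (None, 0.0).
    """
    locs = np.asarray(locations, dtype=float)
    found = ~np.isnan(locs)
    # require at least 3 methods to fire, no more than 15 frames apart
    if found.sum() < min_methods or locs[found].max() - locs[found].min() > max_spread:
        return None, 0.0
    # time point of fusion: median of the non-NaN locations
    frame = float(np.median(locs[found]))
    # goodness: sum of the weights of the methods that fired
    goodness = sum(w for w, f in zip(WEIGHTS, found) if f)
    return frame, goodness
```
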



Classification

Classification occurs on two levels: the first level is the event level, where each event is categorized according to a set of parameters. The second level is the site level, where each site that the user added will be characterized based on the synapse channel data. If no synapse data is present in the data, the default value will be used.

Event classification

During classification, the fusion events are processed for parameters that can be used to group events into categories. There are currently 3 main categories and a miscellaneous category. The classification can be adjusted using the parameters in the preferences dialog. The following categories are determined based on parameters set by the user:

  1. Transient events: event duration
  2. Persistent events: event duration, rise time
  3. Slow deacidification: rise time
  4. Unclassified/miscellaneous
The second channel can be used to subdivide the transient and persistent events into subcategories. The classification module assigns labels to each category and subcategory based on the parameters set in the preferences. The classification has three levels, separated by dots: main.sub.release. The numbers have the following meaning, depending on the level (the "—" indicates that the value is not used for the corresponding level):
value | main                 | sub     | release
------|----------------------|---------|--------
1     | transient            | fast    | yes
2     | persistent           | slow    | no
3     | slow deacidification | —       | —
4     | unknown              | —       | —
9     | —                    | unknown | —
A three-digit combination is made for each event based on the classification parameters. If a site contains no events, the classification is set to 4.9 without further specification. For example, a fast transient event with release is categorized as 1.1.1, while a slow persistent event without release is categorized as 2.2.2. When there is no red channel data available, the release value is set to 2 by default. A few examples of classification are shown in Figure 4, showing the event in the green channel using colored triangles, and the event in the red channel using an open circle.

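
A hypothetical helper translating these codes back into words; the lookup tables mirror the table above, but the function itself is not part of the program:

```python
# lookup tables derived from the classification table above
MAIN = {1: "transient", 2: "persistent", 3: "slow deacidification", 4: "unknown"}
SUB = {1: "fast", 2: "slow", 9: "unknown"}
RELEASE = {1: "yes", 2: "no"}

def describe(code):
    """Translate a classification string such as '1.1.1' into words."""
    levels = [int(p) for p in code.split(".")]
    words = [MAIN.get(levels[0], "unknown")]
    if len(levels) > 1 and levels[1] in SUB:
        words.append(SUB[levels[1]])
    if len(levels) > 2 and levels[2] in RELEASE:
        words.append("release: " + RELEASE[levels[2]])
    return ", ".join(words)
```
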
Fig 4: Examples of categories

Figure 4: Examples of the main categories: fast transient with release (1.1.1), slow transient with release (1.2.1) and fast persistent with (2.1.1) and without (2.1.2) release.

There are 5 parameters that can be set for the classification of events. To set them, go to Edit | Preferences and modify the values in the classification panel (Figure 5).

Fig 5: classification settings

Figure 5: Preferences for classification.

The first value indicates the maximum duration of a transient event in seconds. Events longer than this value become persistent or slow deacidification events. The second parameter determines the maximum duration of a fast transient event in seconds; events longer than this duration are classified as slow transient events. The third parameter separates the persistent events into slow and fast according to duration: events shorter than this parameter are fast persistent. Please note that setting this parameter lower than or equal to the first parameter will result in all persistent events being labeled as slow. The fourth parameter is the maximum rise time of the event in seconds; it determines whether an event is classified as a persistent or a slow deacidification event. This check is only performed on events that are not considered transient.

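
The four duration/rise-time parameters can be read as a small decision tree. The sketch below uses made-up default threshold values and assumes that a rise time above the fourth parameter marks a slow deacidification event:

```python
def classify_event(duration, rise_time,
                   max_transient=2.0, max_fast_transient=1.0,
                   max_fast_persistent=5.0, max_rise=1.0):
    """Sketch of the main/sub classification; all defaults are example values."""
    # parameter 1: transient vs. longer events
    if duration <= max_transient:
        # parameter 2: fast vs. slow transient
        sub = "fast" if duration <= max_fast_transient else "slow"
        return "transient", sub
    # parameter 4 (non-transient events only): persistent vs. slow deacidification
    if rise_time > max_rise:
        return "slow deacidification", None
    # parameter 3: fast vs. slow persistent
    sub = "fast" if duration <= max_fast_persistent else "slow"
    return "persistent", sub
```
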
Fig 6: measurement points

Figure 6: The detection points in the first channel (triangles) are used to determine the duration of the event. In case of a second channel detection (open circle), the time between the end point in the first channel and the detection in the second channel is used to determine the release status. The length of this trace represents 10 seconds, the duration is 1 second and the Ch2 delay is 0.5 seconds.

Finally, the fifth parameter determines whether a release event results in the loss of cargo. This parameter requires the second channel to be present in the data set. The delay of the release relative to the end point of fusion (see Figure 6) can be set to allow precise control over whether an event counts as having released its cargo. A long delay means that the decrease in the green channel can become uncoupled from the loss in the red channel, making the release less likely to be related to the event in the green channel. When this parameter is set to Inf, the entire duration of each green event is searched for a release event in the second channel. This means that the max red delay is limited only by the event duration, which can differ per event. An example is shown in Figure 7.

Fig 7: Ch2 max delay example

Figure 7: When using a Ch2 max delay value of 1 second, this would be classified as a fast persistent event without release (2.1.2). When the value of Ch2 max delay is set to Inf, this will be a fast persistent event with release (2.1.1).

In this example, the same categorization could be obtained by setting Ch2 max delay to something like 30 seconds. However, a Ch2 max delay larger than the actual event will also consider events in channel 2 that occur before the onset of the event. With the Inf value, the algorithm limits Ch2 max delay to the duration of each event.


Site classification

If localization data was present in the original data import, classification will also attempt to label each site as a synaptic or extra-synaptic site. This distinction is based on k-means clustering of the intensity data supplied for each fusion site. The clustering is performed per cell and separates the intensity data into 2 clusters.

In the preferences there is also an option to apply a manual threshold for the intensity. When using the manual option, make sure that the acquisition settings are identical for all localization data: it is only possible to set a single manual threshold value. When different acquisition settings were used, always disable the manual threshold setting in the preferences dialog prior to classification.

When the results do not match the observations, the user can apply a threshold manually in ImageJ and measure the intensities in a mask image: this sets the areas above the threshold to a value of 255 and those below the threshold to 0. Using these measurements as the synapse intensity in the localization of the imported data allows the user to set the manual threshold to any value larger than 0 to classify the synaptic events. Depending on the quality of the mask and threshold, this gives the clearest results.

If another marker is used instead of a synapse marker, labelling it as a synapse marker in the protocol sheet will still allow classification of the localization. If no localization data is present, all sites are labeled as extra-synaptic and the thresholding method has no effect.
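
The per-cell split can be illustrated with a minimal 1-D 2-means clustering (numpy only; this is a sketch of the idea, not the program's actual clustering settings):

```python
import numpy as np

def split_sites(intensities, iterations=25):
    """Split site intensities for one cell into a low and a high cluster.

    Returns a boolean array: True marks the high-intensity (synaptic) cluster.
    """
    x = np.asarray(intensities, dtype=float)
    lo, hi = x.min(), x.max()  # initial centroids at the extremes
    for _ in range(iterations):
        # assign each site to the nearest centroid
        labels = np.abs(x - lo) > np.abs(x - hi)
        # update the centroids as the cluster means
        lo, hi = x[~labels].mean(), x[labels].mean()
    return labels
```

With the manual threshold option, the equivalent split would simply be `x > threshold`.
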


SynD pool size calculations

If a SynD analysis was performed on the data files using SynD aggregate, it is possible to estimate the pool size and the release probability using the intermediate save files from SynD (*-save.mat). When the files are available, the pool size calculations can be performed by selecting Analysis | Determine pool size (SynD) from the menu. After selecting the relevant SynD files, you will be asked to select the relevant field from the extracted data. Generally, you will need to select the synapseIntensitySynMean field to get the measurements for vesicle intensities. It is assumed that the fusion events were located in the synapse channel when the analysis was performed. For information on how to perform a SynD analysis, refer to the SynD documentation. The function calculates three pool sizes:

  1. uncorrected pool: this is the raw output of SynD
  2. cell-corrected pool: this uses the mode per cell in the first 2 quantiles
  3. total-corrected pool: this uses the mode over all cells
For each of the pools, the corresponding release probability is also calculated by dividing the number of events by that pool size. The data is stored in the data structure. Note that when the data structure is modified by adding or removing events, the release probability needs to be recalculated. A script called script_extract_release_probability is provided that exports the release probability and pool sizes from a session file to a text file for statistical analysis. Export can also be done directly from the main interface, using the File | Export | Release probability option from the menu. Use the menu option if you want the standard export format, and use the script if you wish to customize or add calculations to the export routine.
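
The release probability calculation itself is just a division per pool. A hypothetical sketch, where the pool names and numbers are examples rather than actual SynD output:

```python
def release_probability(n_events, pool_sizes):
    """Divide the event count by each pool-size estimate.

    `pool_sizes` maps a pool name to a vesicle count; the names and
    numbers used here are illustrative, not actual SynD output.
    """
    return {name: n_events / size for name, size in pool_sizes.items()}
```
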


