Work Description
Title: Cell-morphodynamic phenotype classification with application to cancer metastasis using cell magnetorotation and machine-learning
(2021). Cell-morphodynamic phenotype classification with application to cancer metastasis using cell magnetorotation and machine-learning [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/513f-1h23
Files (Count: 15; Size: 108 GB)
Title | Original Upload | Last Modified | File Size | Access |
---|---|---|---|---|
ReadMe.txt | 2021-11-02 | 2021-11-02 | 6.34 KB | Open Access |
10_22_2014.tar | 2021-10-28 | 2021-10-28 | 31 GB | Open Access |
10_24_2014.tar | 2021-10-28 | 2021-10-28 | 24.4 GB | Open Access |
10_25_2014.tar | 2021-10-28 | 2021-10-28 | 23.9 GB | Open Access |
10_26_2014.tar | 2021-10-28 | 2021-10-28 | 27.7 GB | Open Access |
classifier.py | 2021-10-27 | 2021-10-27 | 8.23 KB | Open Access |
data_all_clean_hr14.csv | 2021-10-23 | 2021-10-23 | 133 MB | Open Access |
data_all_clean_pc3.csv | 2021-10-23 | 2021-10-23 | 54.7 MB | Open Access |
data_image_mcf_1.csv | 2021-10-23 | 2021-10-23 | 306 MB | Open Access |
data_image_mda_06.csv | 2021-10-23 | 2021-10-23 | 113 MB | Open Access |
data_image_mda_09.csv | 2021-10-23 | 2021-10-23 | 111 MB | Open Access |
estimators_adaboost.py | 2021-10-27 | 2021-10-27 | 7.24 KB | Open Access |
getPos6.2.py | 2021-10-23 | 2021-10-23 | 9.08 KB | Open Access |
pca_feature_selection.py | 2021-10-27 | 2021-10-27 | 3.19 KB | Open Access |
Remys_Pipeline.cp | 2019-04-01 | 2019-04-01 | 8.45 KB | Open Access |

Date: 25 October, 2021
Dataset Title: Cell-morphodynamic phenotype classification with application to cancer metastasis using cell magnetorotation and machine-learning [dataset]
Dataset Creators: R Elbez, J Folz, A McLean, H Roca, JM Labuz, KJ Pienta, S Takayama, R Kopelman
Dataset Contact: Jeff Folz folzja@umich.edu
Funding: R21 CA160157 (NIH), CA136829 (NIH), R01CA186769 (NIH), 1R01CA250499 (NIH), T32-DE007057 (NIH, Tissue Engineering and Regenerative Medicine Training Program) & T32 ED005582-05 (NIH, Microfluidics in Biomedical Sciences Training Program)
Key Points:
- We collected fluorescence images of magnetically activated cells loaded into a microfluidic device
- Images were cropped into single cell images, which were then processed using CellProfiler (v11710)
- After processing, we employed machine-learning algorithms to cluster and identify cell phenotypes
Research Overview:
We define cell morphodynamics as the cell’s time-dependent morphology; it could be called the cell’s shape-shifting ability. To measure it, we use a biomarker-free, dynamic histology method based on multiplexed Cell Magneto-Rotation and Machine Learning. Standard studies of cells immobilized on microscope slides cannot reveal their shape shifting, any more than pinned butterfly collections can reveal flight patterns. Using cell magnetorotation, with the aid of cell-embedded magnetic nanoparticles, our method allows each cell to move freely in three dimensions and rapidly follows cell deformations in all three dimensions, so as to identify and classify a cell by its dynamic morphology. Using object recognition and machine-learning algorithms, we continuously measure the real-time shape dynamics of each cell, from which we resolve the inherent broad heterogeneity of the morphological phenotypes found in a given cancer cell population. In three illustrative experiments we achieved clustering, differentiation, and identification of cells from (A) two distinct cell lines, (B) cells that have gone through the epithelial-to-mesenchymal transition, and (C) cells differing only by their motility. This microfluidic method may enable fast screening and identification of invasive cells, e.g., metastatic cancer cells, even in the absence of biomarkers, thus providing a rapid diagnostics and assessment protocol for effective personalized cancer therapy.
Instrument and/or Software specifications: CellProfiler v11710 (https://cellprofiler.org/), Python 2.7
Files contained here:
Four .tar files can be extracted into directories that contain the unprocessed cell images collected as raw data. Each of the four folders pertains to a single cell line: MCF-7, MDA-MB-231, HR-14, or PC-3. Each folder contains 2-3 subfolders named after that cell line. Within these subfolders is a list of area folders and one positions folder. Each area folder represents a single area of the microfluidic device that was imaged. For each cell line, there are typically 10-20 area folders, and each area folder contains 60+ unprocessed cell images (.tiff files). The positions folder contains the pixel locations of the cells imaged in each corresponding area, stored as a set of .txt files, with each .txt file matching a single imaged area.
To recrop the images, place the python file getPos6.2.py into the same directory as the area+positions folders, and run the relevant python command (see comments/instructions in getPos6.2.py).
Files:
10_22_2014 – Raw data for MCF-7 cell line
10_24_2014 - Raw data for MDA-MB-231 cell line
10_25_2014 - Raw data for HR14 cell line
10_26_2014 - Raw data for PC3 cell line
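As a rough illustration of working with the archive layout described above, the following Python 3 sketch extracts one .tar file and counts the .tiff images in each folder it contains. It is not part of the deposited code (which targets Python 2.7); the function names and the example paths in the comments are hypothetical, and no assumption is made about how the area and positions folders are actually named.

```python
import os
import tarfile


def extract_archive(tar_path, dest_dir):
    """Extract a .tar archive into dest_dir and return dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest_dir)
    return dest_dir


def count_tiffs_per_folder(root_dir):
    """Map each folder under root_dir to the number of .tif/.tiff files it holds."""
    counts = {}
    for folder, _subdirs, files in os.walk(root_dir):
        n_images = sum(
            1 for name in files if name.lower().endswith((".tif", ".tiff"))
        )
        if n_images:
            counts[os.path.relpath(folder, root_dir)] = n_images
    return counts


# Hypothetical usage (paths are placeholders, adjust to the local download):
#   root = extract_archive("10_22_2014.tar", "raw/10_22_2014")
#   for folder, n in sorted(count_tiffs_per_folder(root).items()):
#       print(folder, n)
```

Counting images per folder is a quick way to check that an extraction completed and that each area folder holds roughly the expected number of images.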
CellProfiler File
‘Remys_Pipeline.cp’ is the CellProfiler (v11710) pipeline used to process the cropped cell images (https://cellprofiler.org/). Once loaded into CellProfiler, it characterizes each cell from its cropped images.
Files:
‘Remys_Pipeline.cp’ – The CellProfiler pipeline used for image processing. Requires CellProfiler v11710
.csv Files:
For the curious investigator, there is no need to recrop the images. Instead, the output of the CellProfiler pipeline for each cell line is gathered in a series of .csv files, each named after the cell line it describes. All measured variables, unless otherwise specified, are reported in pixels, or ‘pixel units’.
If a row is missing its data (a row represents one cell at one time point), that row should be ignored and removed from the analysis.
For cells missing only a few parameters (the row is otherwise full, but a few columns are empty or contain NaNs), the average value of the parameter in question is inserted in place of the NaN or missing value.
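The two cleaning rules above can be sketched in a few lines of standard-library Python 3. This is an illustrative reading rather than the authors' code: it assumes a "missing" row is one with no usable value in any selected column, and that the fill value is the column mean over the rows that are kept. The column names `Area` and `Perimeter` and the helper names are hypothetical.

```python
import csv
import io
import math


def _to_float(text):
    """Return text as a float, or None if it is empty, non-numeric, or NaN."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(value) else value


def clean_rows(rows, columns):
    """Drop rows with no data at all, then mean-fill the remaining gaps."""
    parsed = [{c: _to_float(r.get(c)) for c in columns} for r in rows]
    # Assumption: a row counts as "missing" only if every selected column is empty.
    kept = [r for r in parsed if any(v is not None for v in r.values())]
    for c in columns:
        present = [r[c] for r in kept if r[c] is not None]
        if not present:
            continue  # nothing to average for this column
        mean = sum(present) / len(present)
        for r in kept:
            if r[c] is None:
                r[c] = mean
    return kept


# Tiny inline example with hypothetical column names.
sample = "Area,Perimeter\n100,40\n,\n120,nan\n140,50\n"
rows = list(csv.DictReader(io.StringIO(sample)))
cleaned = clean_rows(rows, ["Area", "Perimeter"])
```

In this example the fully empty second row is dropped, and the NaN perimeter in the third row is replaced by the mean of the remaining perimeter values.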
Files:
‘data_all_clean_hr14.csv’ – processed data for the HR14 cell line
‘data_all_clean_pc3.csv’ - processed data for the PC3 cell line
‘data_image_mda_06.csv’ - processed data for the MDA-MB-231 cell line before Boyden chamber migration
‘data_image_mda_09.csv’ - processed data for the MDA-MB-231 cell line post Boyden chamber migration
‘data_image_mcf_1.csv’ - processed data for the MCF-7 cell line
Python Files
For analysis and machine learning of the processed data, we used the Scikit Learn library (https://scikit-learn.org/stable/) for Python. Note that all Python files are written in Python 2.7.
Files:
‘getPos6.2.py’ – This Python file is used for preprocessing, specifically, cropping the images. To use it, place it in the same directory as the area and positions folders for each cell line, and run the command ‘python getPos6.2.py &.tiff 00’ from a terminal. This automatically crops the images.
‘pca_feature_selection.py’ – This program conducts Principal Component Analysis, and can be used to determine which features in the processed data contribute to the overall variance. Users will need to change the program to reflect which dataset they need to analyze.
‘estimators_adaboost.py’ – Uses the Adaboost ML algorithm to differentiate and identify the cells. Relevant for Figure 2 in the associated publication in PLOS.
‘classifier.py’ – Uses k-means clustering to cluster cells and identify subphenotypes. Relevant for figures 3 and 4 in the associated publication in PLOS.
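For orientation, the three analysis steps named above (PCA, AdaBoost classification, k-means clustering) can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, written for Python 3 and a recent scikit-learn rather than the Python 2.7 environment of the deposited scripts. It does not reproduce those scripts: the feature count, number of components, number of estimators, and number of clusters are all assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for two cell populations with 6 made-up features each.
rng = np.random.RandomState(0)
pop_a = rng.normal(loc=0.0, scale=1.0, size=(200, 6))
pop_b = rng.normal(loc=3.0, scale=1.0, size=(200, 6))
X = np.vstack([pop_a, pop_b])
y = np.array([0] * 200 + [1] * 200)

# Standardize so that no feature dominates purely because of its units.
X_scaled = StandardScaler().fit_transform(X)

# PCA: how much variance each component explains, and which features load on it.
pca = PCA(n_components=2).fit(X_scaled)
explained = pca.explained_variance_ratio_
loadings = pca.components_  # shape: (n_components, n_features)

# AdaBoost: supervised separation of the two labelled populations.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.25, random_state=0, stratify=y
)
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# k-means: unsupervised grouping of the same cells without using the labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)
cluster_labels = kmeans.labels_
```

To try this on the deposited data, the synthetic `X` and `y` would be replaced by feature columns and cell-line labels loaded from the .csv files, after the missing-value handling described earlier.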