Reproducibility#
In addition to the cellSAM library, the source repo contains code to aid in reproducing the results in the publication. This reproducibility code can be found in the paper_evaluation directory.
Additional resources, including pre-trained model weights and the evaluation dataset, are also required. All necessary components are available for download; see Model and Datasets for details.
Setup#
In a new (empty) virtual environment, install cellSAM from the parent directory.
Example: creating an environment
Users are encouraged to use whichever environment management system they are most comfortable with (uv, pixi, conda/mamba, etc.).
For those unsure, Python’s built-in environment management module is a simple, ubiquitous option. For example, to create and enter a new environment:
$ python3.XX -m venv cs-eval-env
$ source cs-eval-env/bin/activate
Where XX is the Python version you wish to use (e.g. python3.13).
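To confirm the environment is active, you can check sys.prefix from Python (the path shown is illustrative; it depends on where you created cs-eval-env):
>>> import sys
>>> sys.prefix  # should point into the new environment
'/home/user/cs-eval-env'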
You can then verify the newly created environment is empty (though pip should be available):
$ pip list
Package Version
------- -------
pip 24.3.1
For example, from the paper_evaluation directory:
$ pip install ..
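To verify the installation, try importing the package; the version lookup below is hedged with getattr, since a __version__ attribute is not guaranteed to be defined:
>>> import cellSAM  # should complete without error
>>> getattr(cellSAM, "__version__", "unknown")  # version string, if the package defines one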
Evaluation dependencies#
Once in a “clean” environment, install the requirements for the evaluation suite:
$ pip install -r requirements.txt
Note
This may downgrade some of the dependencies (e.g. torch, numpy, etc.) installed in the previous step.
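To see exactly which versions ended up installed after this step, you can check from Python:
>>> import numpy, torch
>>> numpy.__version__, torch.__version__  # the versions pinned by requirements.txt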
Evaluation models#
The pretrained model weights necessary for reproducibility are available via the get_model function:
>>> from cellSAM import get_model
>>> model = get_model()
This will automatically download and unpack the latest version of the pretrained model weights.
Model versions
You may use the version= keyword argument of get_model to select a specific model version for evaluation.
Version 1.2 is the version that was used to produce the published results in the paper. Version 1.2 is also the minimum model version designed to work with the reproducibility workflow.
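For example, to pin the model version used in the paper (a sketch; whether version= takes a string like "1.2" or some other format is an assumption, so check the get_model docstring):
>>> from cellSAM import get_model
>>> model = get_model(version="1.2")  # "1.2" as a string is an assumption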
Evaluation dataset#
Make sure you have the evaluation dataset. This can be downloaded with:
>>> from cellSAM import download_training_data
>>> download_training_data()
This will initiate the download of a compressed data archive.
The compressed data will be downloaded to $HOME/.deepcell/datasets/cellsam-data_v{X.Y}.tar.gz, where X.Y is the requested dataset version. See Model and Datasets for details.
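Once the download finishes, you can confirm the archive landed in the default location (shown here for dataset version 1.2):
>>> from pathlib import Path
>>> archive = Path.home() / ".deepcell" / "datasets" / "cellsam-data_v1.2.tar.gz"
>>> archive.exists()
True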
Once the download is complete, unpack/inflate the dataset to a desired location.
Dataset Size
The compressed data archive is 14GB in size and inflates to 84GB when uncompressed, so you may want to unpack the data to a location with sufficient free space. Similarly, the decompression is computationally intensive and may benefit from a parallel decompression tool such as unpigz.
Here’s an example incantation which will store the unpacked dataset in /data using 8 threads for decompression:
$ tar --use-compress-program="unpigz -p 8" -xf $HOME/.deepcell/datasets/cellsam-data_v1.2.tar.gz -C /data
The unpacked data will then be available at /data/cellsam_v1.2.
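If unpigz is not available, Python’s built-in tarfile module can perform the same extraction, albeit single-threaded and therefore slower:
>>> import tarfile
>>> from pathlib import Path
>>> archive = Path.home() / ".deepcell" / "datasets" / "cellsam-data_v1.2.tar.gz"
>>> with tarfile.open(archive, "r:gz") as tf:
...     tf.extractall("/data")  # same destination as the tar example above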
Running the evaluation#
Once all of the above steps are complete, the evaluation can be run via the all_runs.sh shell script.
Before running, ensure that the variables at the top of the file reflect the locations of the models/dataset on your system. If you used the defaults in all the steps above (and unpacked the dataset in its download location), this will already be the case.
$ ./all_runs.sh
The results of each run will be saved locally in a summary.csv that records the dataset, model used, and f1_mean for that run.
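After a run completes, the summary can be inspected with the standard library’s csv module; the exact column names below are an assumption based on the description above:
>>> import csv
>>> with open("summary.csv") as f:
...     rows = list(csv.DictReader(f))
>>> [(r["dataset"], r["model"], r["f1_mean"]) for r in rows]  # column names assumed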
Individual evaluations#
It is not necessary to run the entire evaluation suite; evaluation can be limited to specific datasets. See all_runs.sh for a general idea of how to do so via eval_main.py.