Reproducibility#

In addition to the cellSAM library, the source repo contains code to aid in reproducing the results in the publication. This reproducibility code can be found in the paper_evaluation directory.

Additional resources, including the pre-trained model weights and the evaluation dataset, are required for reproducibility. All necessary components are available for download; see Model and Datasets for details.

Setup#

In a new (empty) virtual environment, install cellSAM from the parent directory.

For example, from the paper_evaluation directory:

$ pip install ..
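
To confirm the installation succeeded, check that the package imports from inside the new environment (a minimal check; the reported path should point into your virtual environment):

>>> import cellSAM
>>> cellSAM.__file__  # should resolve to the site-packages of the new environment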

Evaluation dependencies#

Once in a “clean” environment, install the requirements for the evaluation suite:

$ pip install -r requirements.txt

Note

This may downgrade some of the dependencies (e.g. torch, numpy, etc.) installed in the previous step.
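
To see which versions ended up installed after this step, a quick check from the Python prompt is (a minimal sketch):

>>> import numpy, torch
>>> torch.__version__, numpy.__version__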

Evaluation models#

The pretrained model weights necessary for reproducibility are available via the get_model function:

>>> from cellSAM import get_model
>>> get_model();

This will automatically download and unpack the latest version of the pretrained model weights.
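
If you prefer to keep a reference to the returned object rather than discarding it, the call can be assigned directly (a minimal sketch, assuming get_model returns the loaded model; the exact model class depends on the installed cellSAM version):

>>> from cellSAM import get_model
>>> model = get_model()  # downloads and unpacks the weights if not already present
>>> type(model)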

Evaluation dataset#

Make sure you have the evaluation dataset. This can be downloaded with:

>>> from cellSAM import download_training_data
>>> download_training_data()

This will initiate the download of a compressed data archive. The archive will be saved to $HOME/.deepcell/datasets/cellsam-data_v{X.Y}.tar.gz, where X.Y is the requested dataset version. See Model and Datasets for details. Once the download is complete, unpack the dataset to a desired location.
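
One way to unpack the archive next to where it was downloaded is with Python's tarfile module (a minimal sketch; the glob pattern and destination are assumptions, adjust them to the dataset version and location you want):

>>> import tarfile
>>> from pathlib import Path
>>> archive = next(Path.home().glob(".deepcell/datasets/cellsam-data_v*.tar.gz"))
>>> with tarfile.open(archive, "r:gz") as tar:
...     tar.extractall(path=archive.parent)  # unpack alongside the downloaded archive
...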

Running the evaluation#

Once all of the above steps are complete, the evaluation can be run via the all_runs.sh shell script. Before running, ensure that the variables at the top of the file reflect the locations of the models/dataset on your system. If you used the defaults in all the steps above (and unpacked the dataset in its download location), this will already be the case.

$ ./all_runs.sh

The results of each run will be saved locally in a summary.csv that records the dataset, model used, and f1_mean for that run.
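
To inspect the results afterwards, the summary can be loaded with pandas (a minimal sketch, assuming the columns are named dataset, model, and f1_mean and that summary.csv is in the current working directory):

>>> import pandas as pd
>>> results = pd.read_csv("summary.csv")
>>> results.sort_values("f1_mean", ascending=False)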

Individual evaluations#

It is not necessary to run the entire evaluation suite; evaluation can be limited to specific datasets. See all_runs.sh for a general idea of how to do so via eval_main.py.