University of South Florida
DDSM Resource



DoD BCRP Clustered Microcalcification Detection Evaluation Data

Preface - This is an experimental page designed to simplify the steps needed to evaluate the performance of a clustered microcalcification detection algorithm. We are trying to determine whether sample data sets, combined with some evaluation tools, are valuable in promoting the comparison of algorithms. Here we have extracted a set of cases from the database that each have at least one malignant lesion containing clustered microcalcifications. The cases listed in the training side of the table can be used to optimize an algorithm, and the cases on the testing side of the table can be used to measure its performance.


Introduction

The Digital Database for Screening Mammography here at the University of South Florida is made up of 2620 cases of data, each of which contains four mammograms. The cases were collected from four mammography centers and were scanned on one of four digitizers. Some cases in the database represent normal screening exams in which nothing unusual was found. Others contain cancers and benign lesions. Each non-normal case was examined by one of three radiologists who provided pixel level ground truth for each abnormality.

The central goal in the development of this database was to provide a common dataset of mammograms in a digital format with associated ground truth that could be used to aid in quantitative evaluation of computer-aided-detection algorithms for detecting breast cancer.

The Data Sets

Sampling a set of cases from the DDSM database to use for evaluating a microcalcification cluster detection algorithm required making some choices. While researchers like to divide the problem of cancer detection into pieces (e.g. spiculated mass detection, detection of clustered microcalcifications, etc.), mammography screening exams are not easily divided along these lines. Lesions containing clustered microcalcifications can appear in a mammogram with other mammographic abnormalities. Therefore these cases may contain other abnormalities in addition to malignant clustered microcalcifications.

We decided to select a set of cases from the DDSM that each had at least one malignant lesion with clustered microcalcifications in it. The cases we selected were acquired from two institutions. The cases with case numbers starting with a 4 all came from one institution and were all scanned on a HOWTEK MultiRAD 850 scanner. The other cases came from a different institution and were all scanned on a HOWTEK 960 scanner. A different radiologist marked the ground truth at each institution.
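The case-number rule described above can be sketched in a few lines of Python. This is only an illustration for the cases in this evaluation set; the function name digitizer_for_case is our own and is not part of the DDSM software, and the rule should not be assumed to hold for the rest of the database.

```python
def digitizer_for_case(case_name):
    """Return the digitizer for a case in this evaluation set.

    Assumes names of the form "case" + digits, e.g. "case4103".
    The function name is hypothetical; the rule itself is the one
    stated in the text and applies only to this evaluation set.
    """
    prefix = "case"
    number = case_name[len(prefix):] if case_name.startswith(prefix) else case_name
    if not number.isdigit():
        raise ValueError("unexpected case name: %r" % case_name)
    # Case numbers starting with 4 were scanned on the MultiRAD 850;
    # all other cases in this set were scanned on the HOWTEK 960.
    if number.startswith("4"):
        return "HOWTEK MultiRAD 850"
    return "HOWTEK 960"


if __name__ == "__main__":
    for name in ("case1108", "case4103"):
        print(name, "->", digitizer_for_case(name))
```

Because a different radiologist marked the ground truth at each institution, grouping results by digitizer in this way also groups them by annotator.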

The resulting set of cases was split into a training set and a test set while attempting to balance the lesion subtlety and ACR breast density in the two datasets. The resulting list of cases, with links to the data on our FTP site, can be seen in the table below.

TRAINING (50 cases)
Use this data for training and testing your algorithm as much as you want.
cancer_06  case1108
cancer_06  case1113
cancer_07  case1116
cancer_06  case1131
cancer_06  case1133
cancer_06  case1141
cancer_06  case1148
cancer_06  case1152
cancer_06  case1153
cancer_06  case1167
cancer_06  case1185
cancer_06  case1188
cancer_06  case1201
cancer_06  case1212
cancer_07  case1213
cancer_07  case1214
cancer_07  case1219
cancer_07  case1220
cancer_07  case1223
cancer_07  case1238
cancer_07  case1245
cancer_07  case1248
cancer_11  case1252
cancer_07  case1257
cancer_08  case1283
cancer_08  case1415
cancer_08  case1470
cancer_08  case1500
cancer_08  case1508
cancer_08  case1528
cancer_08  case1535
cancer_10  case1570
cancer_10  case1585
cancer_10  case1588
cancer_10  case1595
cancer_10  case1626
cancer_11  case1632
cancer_11  case1635
cancer_11  case1637
cancer_11  case1675
cancer_11  case1693
cancer_11  case1697
cancer_12  case4103
cancer_12  case4110
cancer_12  case4115
cancer_12  case4116
cancer_12  case4134
cancer_12  case4142
cancer_13  case4161
cancer_12  case4183
TESTING (50 cases)
Once your algorithm has been fixed and its parameters have been set, test your algorithm with this data.
cancer_07  case1235
cancer_11  case1236
cancer_10  case1250
cancer_07  case1256
cancer_07  case1258
cancer_07  case1261
cancer_08  case1489
cancer_08  case1517
cancer_14  case1520
cancer_11  case1531
cancer_10  case1590
cancer_10  case1591
cancer_10  case1596
cancer_11  case1614
cancer_10  case1621
cancer_10  case1629
cancer_11  case1636
cancer_10  case1644
cancer_10  case1654
cancer_11  case1663
cancer_11  case1721
cancer_11  case1728
cancer_11  case1729
cancer_11  case1731
cancer_11  case1766
cancer_11  case1780
cancer_11  case1816
cancer_11  case1819
cancer_14  case1852
cancer_14  case1872
cancer_14  case1874
cancer_14  case1875
cancer_14  case1894
cancer_14  case1897
cancer_14  case1900
cancer_14  case1903
cancer_14  case1905
cancer_14  case1907
cancer_14  case1928
cancer_14  case1929
cancer_14  case1930
cancer_14  case1983
cancer_12  case4105
cancer_12  case4113
cancer_12  case4117
cancer_12  case4127
cancer_12  case4147
cancer_12  case4176
cancer_13  case4179
cancer_13  case4182

Each case contains four mammograms from a screening exam. The images were scanned on either a HOWTEK 960 or a HOWTEK MultiRAD 850 digitizer with a sample rate of 43.5 microns at 12 bits per pixel. The images were preprocessed to crop out much of the image that did not contain imaged breast tissue and to darken regions of the image that contained patient information or technician identifiers by setting pixels in those regions to the value zero. Each image was then compressed using a truly lossless compression algorithm. Some tools are available for decompressing the images, resampling them, mapping them to optical density, and creating masks of the ground truth regions. Click here for more information on this software.
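To give a feel for what these scanning parameters mean in practice, here is a small Python sketch that converts between pixel counts and physical lengths and gives the gray level range. It uses only the 43.5 micron sample rate and 12 bits per pixel quoted above; the function and constant names are our own illustration and are not part of the DDSM software.

```python
PIXEL_SIZE_MICRONS = 43.5   # sample rate quoted in the text
BITS_PER_PIXEL = 12         # bit depth quoted in the text

# Largest gray value representable at 12 bits per pixel.
MAX_GRAY_VALUE = 2 ** BITS_PER_PIXEL - 1


def pixels_to_mm(n_pixels, pixel_size_microns=PIXEL_SIZE_MICRONS):
    """Physical length in millimeters covered by n_pixels."""
    return n_pixels * pixel_size_microns / 1000.0


def mm_to_pixels(length_mm, pixel_size_microns=PIXEL_SIZE_MICRONS):
    """Number of pixels (possibly fractional) spanning length_mm."""
    return length_mm * 1000.0 / pixel_size_microns


def resampled_pixel_size(factor, pixel_size_microns=PIXEL_SIZE_MICRONS):
    """Pixel size in microns after downsampling by an integer factor."""
    return pixel_size_microns * factor


if __name__ == "__main__":
    print("gray levels: 0 ..", MAX_GRAY_VALUE)
    print("1000 pixels span", pixels_to_mm(1000), "mm")
    print("10 mm spans about", round(mm_to_pixels(10.0)), "pixels")
    print("after 4x downsampling:", resampled_pixel_size(4), "microns per pixel")
```

If an algorithm is run on resampled images, distances and areas measured in pixels need to be converted with the resampled pixel size rather than the original one.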

Performance Evaluation

To evaluate a CAD algorithm using these cases, one can examine the training cases and use them to optimize the algorithm's parameters. During this process, the test data should not be examined or used in any way. It must remain untouched until the algorithm is ready for testing. That means the algorithm and any required parameters must be fixed. This is very important and cannot be emphasized enough! The performance can then be illustrated with a Free-response Receiver Operating Characteristic (FROC) plot.

An FROC plot shows the fraction of cancers that were detected and how that fraction relates to the average number of false positive detections per image. This illustrates a range of possible operating points for the algorithm. An ideal algorithm would have a true positive fraction of 1.0 at 0.0 false positives per image. Obtaining that performance in practice is not generally considered a realistic goal.
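As a rough illustration of how the points on such a plot could be computed, here is a minimal Python sketch. It assumes each detection already carries a confidence score and has already been matched either to a ground truth lesion or to nothing (a false positive); the matching criterion itself is not defined on this page, and the function name froc_points and the lesion identifiers are hypothetical.

```python
def froc_points(detections, num_lesions, num_images):
    """Compute FROC operating points, one per distinct score threshold.

    detections:  list of (score, lesion_id) pairs, where lesion_id is
                 None for a false positive (hypothetical input format).
    num_lesions: total number of ground truth lesions to be found.
    num_images:  total number of images that were processed.

    Returns a list of (false positives per image, true positive fraction),
    ordered from the strictest threshold to the most permissive one.
    """
    thresholds = sorted({score for score, _ in detections}, reverse=True)
    points = []
    for threshold in thresholds:
        kept = [(s, lid) for s, lid in detections if s >= threshold]
        # A lesion counts once, however many detections hit it.
        found = {lid for _, lid in kept if lid is not None}
        false_positives = sum(1 for _, lid in kept if lid is None)
        points.append((false_positives / num_images,
                       len(found) / num_lesions))
    return points


if __name__ == "__main__":
    # Toy data: two lesions ("A" and "B") spread over four images.
    toy = [(0.9, "A"), (0.8, None), (0.7, "B"), (0.6, "A"), (0.4, None)]
    for fp_per_image, tp_fraction in froc_points(toy, num_lesions=2, num_images=4):
        print(fp_per_image, tp_fraction)
```

Each distinct score serves as one threshold, so lowering the threshold moves along the curve toward more detected lesions and more false positives per image. Only the test cases should be used to produce the reported curve, and only after the algorithm has been fixed.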

Ordering the Data

You are welcome to download the training and testing cases free of charge, but you should be warned that there is nearly 4.7 GB of data in the training set and nearly 4.6 GB of data in the test dataset. If you would like to order the data on two 8mm data cartridges, you can do so using the following order form.


Return to the main DoD BCRP Mammography Datasets Page at USF.
Please mail comments, suggestions and specific mammography questions to: ddsm@bigpine.csee.usf.edu