Mammography Software by Michael Heath
Computer Vision Laboratory
University of South Florida
February 2000

Please note there is a newer version of this software.

Preface

The purpose of this web page is to describe some software for computer-aided detection of cancer in mammograms that I have developed at the University of South Florida. I am posting this software on the World Wide Web to encourage and simplify the quantitative performance evaluation of computer-aided detection algorithms. The programs included in this software were developed using mammography data in the Digital Database for Screening Mammography (DDSM) at the University of South Florida.

Disclaimer

This code is free to use and distribute as long as it is complete and lists a reference to this web site (http://marathon.csee.usf.edu/Mammography/software/heathusf.html). It should be understood that this code SHOULD NOT BE USED IN ANY MEDICAL SYSTEM, IS FOR USE AS IS, MAY NOT BE SOLD, and COMES WITHOUT A WARRANTY OF ANY KIND. The purpose of distributing this code is to facilitate performance evaluation and comparative work in CAD for mammography. This code is the property of the University of South Florida and is copyrighted by Michael D. Heath and Dr. Kevin W. Bowyer.

Simply Using the Software

This section describes some steps you can follow to use the mass detection software. I will assume that your hardware and software are the same as mine (I am using a Sun workstation running SunOS 5.6). If you are running another version of SunOS, Linux or some other version of UNIX, you should be able to get this code working on your system without too much effort. I made this software run on my PC at home (Pentium 233, 64MB of RAM) running Caldera OpenLinux after investing about 20 minutes of time.

I am assuming that your home directory is ~yourhome and that you download the software heathusf_v1.0.0.tar.gz into a directory named ~yourhome/heathusf. Uncompress and untar the software and compile the programs by using the following commands:

cd ~yourhome/heathusf
gunzip -c heathusf_v1.0.0.tar.gz | tar -xvf -
cd code
build_heathusf

You will need to download some cases from the DDSM to have some data to process. I will assume that the data is on your system in the directory /tmp/yourhome. In that directory you will need to have cases of data in subdirectories named case#### (e.g. case1622). From within the directory /tmp/yourhome type:

~yourhome/heathusf/code/scripts/detect_and_FROC_mass

This will run the mass detection algorithm on each case and create a file named performance.txt. That file will have the true positive fraction and the average number of false positives per image as the last two columns of data in the file. This takes around 20 minutes per case of data on my system (Sun Ultra30), so be prepared to wait a while if you are running on multiple cases of data or if your computer is not as fast as mine.
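
As an illustration that is not part of the package, the last two columns of performance.txt could be pulled out for plotting with a short Python sketch like the one below. It assumes the columns are separated by whitespace and that any line whose last two fields are not numbers (a header, for example) can be skipped. The function name read_froc_points is hypothetical, and the exact layout of the file should be checked against your own output.

```python
import os


def read_froc_points(lines):
    """Return (true_positive_fraction, false_positives_per_image) pairs.

    Assumption: columns are whitespace-separated and the last two fields
    of each data line are the two values of interest. Lines that do not
    end in two numbers (headers, blank lines) are skipped.
    """
    points = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            tpf = float(fields[-2])
            fpi = float(fields[-1])
        except ValueError:
            continue
        points.append((tpf, fpi))
    return points


if __name__ == "__main__":
    # Only reads the file if it exists in the current directory.
    if os.path.exists("performance.txt"):
        with open("performance.txt") as handle:
            for tpf, fpi in read_froc_points(handle):
                print(f"TPF={tpf:.3f}  FP/image={fpi:.3f}")
```

Plotting the true positive fraction against the false positives per image for every row gives the FROC curve.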

Software Organization

This section is for those of you who want to know a little more about the software, how it is organized and what tools are there for you to use.

All of the software was written in 'C' and was developed on a Sun workstation running SunOS 5.6 using the compiler gcc. The software is organized into a common directory where a library of functions is built and into several application directories. Scripts (tcsh) are used to handle the flow of processing because this allows the automated processing of large sets of cases with one command.
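
To give a feel for what such a driver has to do, here is a minimal Python sketch of the first step: finding the case directories to process. It is not a translation of the tcsh scripts in the package. It assumes, based on the case#### naming used on this page, that a case directory is named "case" followed by exactly four digits. The name find_case_dirs is hypothetical.

```python
import os
import re

# Assumption: case directories are named "case" plus four digits (e.g. case1622).
CASE_DIR_PATTERN = re.compile(r"^case\d{4}$")


def find_case_dirs(root="."):
    """Return the sorted names of immediate subdirectories that look like cases."""
    return sorted(
        name
        for name in os.listdir(root)
        if CASE_DIR_PATTERN.match(name)
        and os.path.isdir(os.path.join(root, name))
    )


if __name__ == "__main__":
    # List the cases a driver script would loop over.
    for case in find_case_dirs():
        print(case)
```

A driver would then run the per-case processing once for each name returned.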

Below is a table describing some of the scripts I have written. The purpose of these scripts is to automate the entire mass detection and evaluation process, to allow others to duplicate the results I obtained with this algorithm, and to provide example scripts so others can learn how to write their own scripts that use the programs in this package.

Macro Level Script: Description
runAFUMcase: This script will run a mass detection program on a case from the DDSM database. A file will be created for each mammogram containing a list of potential detection sites in the image, with detections listed in decreasing order of suspicion. The script should be run from a directory where the case to process is in an immediate subdirectory named case####. In processing the case, images are decompressed, the breast boundary is segmented, a feature image is produced, the detection files are created and the decompressed images are deleted.
create_FROC_mass: This script will run the performance evaluation program (DDSMeval) on each case from the DDSM that is in a subdirectory of the current directory and will output the data you can plot to form an FROC curve. The script will use all of the detection files in each directory. These could be files produced by your own CAD algorithm or they could be files produced by the script runAFUMcase. To compare results from your CAD algorithm with mine you must run the scripts as they are currently configured and must not use the test data in any way until you are actually evaluating the performance of your algorithm. You can do this only once; otherwise it is not fair to compare the performance of your algorithm to mine, because you would have used the test data as training data.
detect_and_FROC_mass: This script automates the entire detection and performance evaluation. Just place yourself in a directory with a set of cases in its subdirectories and then run this script. It may take hours or days to run a large set of cases on your computer. This script calls both the runAFUMcase and create_FROC_mass scripts.
make_reduced_images: This script will make 8-bit images of all the DDSM images in cases that are in subdirectories of the directory where the script is run. The script is configured to output 8-bit PGM files scaled with optical densities in the range 3.0 to 0.05. The images will be created at any resolution you specify that is greater than or equal to the original resolution of the DDSM images. One could easily reconfigure the script to make images at other resolutions, in 16 bits, in TIFF format or with other rescalings of pixel values.
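
For readers new to FROC analysis, the following Python sketch shows how one point on an FROC curve is commonly computed from per-image counts of true positives (TP), false positives (FP) and false negatives (FN). It uses the usual definitions: the true positive fraction is TP / (TP + FN) pooled over all images, and the false positive rate is the total FP divided by the number of images. Whether create_FROC_mass pools its counts in exactly this way is an assumption, and the function name froc_point is hypothetical.

```python
def froc_point(per_image_counts):
    """Compute one FROC point from an iterable of (tp, fp, fn) tuples.

    Each tuple holds the counts for one image at a fixed operating point.
    Assumption: counts are pooled over all images before dividing.
    """
    counts = list(per_image_counts)
    if not counts:
        raise ValueError("no images to evaluate")
    total_tp = sum(tp for tp, _, _ in counts)
    total_fp = sum(fp for _, fp, _ in counts)
    total_fn = sum(fn for _, _, fn in counts)
    lesions = total_tp + total_fn
    # With no lesions at all, the true positive fraction is reported as 0.
    tpf = total_tp / lesions if lesions else 0.0
    fp_per_image = total_fp / len(counts)
    return tpf, fp_per_image
```

Repeating the calculation while varying the suspiciousness threshold, or the maximum number of detections kept per image, traces out the rest of the curve.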

Below is a table that provides a list of the programs that are in this package. Next to each is a short description of what each program does. Once you have downloaded the package and compiled it on your system, you can run any program with no arguments and get information on how to run that program.

Program: Description
jpeg: This program can decompress a lossless compressed mammogram in the DDSM database.
breastsegment: This program will segment the breast from an image and produce an ASCII file of the polygon outlining the breast region.
afumfeature: This program will produce a probability of suspiciousness image using a novel Average Fraction (of pixels) Under the Minimum filter.
detect: This program will produce a detection file listing a set of possible detection sites in decreasing order of suspicion. Any program that will use this file must make the decision regarding which subset of detections to use from the file. Two possible methods of selecting a subset are to take at most some maximum number of detections (starting with the first one) or to take all detections with a suspiciousness value higher than some threshold suspiciousness value.
mkimage: This program will produce a PGM or TIFF image in 8 or 16 bits at any given resolution from a decompressed LJPEG.1 image from the DDSM database. Options are provided for rescaling the image in gray values or in optical density values.
mktemplate: This program will create a binary or labeled image from a DDSM ground truth file (.OVERLAY) at any specified resolution. Many options allow the user to specify the desired lesion type and descriptive properties as well as the pathology of the lesions that are rendered in the template image. The image can be saved in either PGM or TIFF format.
drawimage: This program will create an image to visualize ground truth regions and/or detected regions as well as the breast segmentation. The selected items are overlaid on an existing 8-bit PGM image (that could have been created with mkimage). Many options allow the user to specify the desired lesion type and descriptive properties as well as the pathology of the lesions that are rendered in the template image. Other options allow the user to specify the maximum number of detections and the suspiciousness threshold to use in determining which possible detections (present in the detection file) to plot on the image. This can be used to visualize the detections and verify the performance of an algorithm.
DDSMeval: This is a performance evaluation tool that will compute the number of true positive (TP) detections, the number of false positive (FP) detections and the number of false negative (FN) detections for an image from the DDSM database given a detection file. Many options allow the user to specify the desired lesion type and descriptive properties as well as the pathology of the lesions that should be detected. Other options allow the user to specify the maximum number of detections and the suspiciousness threshold to use in determining which possible detections (present in the detection file) to declare as actual detections. By default, all BENIGN and MALIGNANT lesions that match the description are used.
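
The two ways of selecting a subset of detections described for the detect program (a maximum count and a suspiciousness threshold) can be sketched in Python as below. This is only an illustration. The (suspicion, x, y) tuple layout and the name select_detections are hypothetical and do not describe the real detection file format, which you should check in the package itself.

```python
def select_detections(detections, max_count=None, min_suspicion=None):
    """Pick the detections to treat as actual detections.

    detections: list of (suspicion, x, y) tuples, already sorted in
    decreasing order of suspicion (a hypothetical in-memory layout).
    max_count: keep at most this many, starting with the first one.
    min_suspicion: keep only detections with suspicion strictly above it.
    """
    selected = list(detections)
    if min_suspicion is not None:
        selected = [d for d in selected if d[0] > min_suspicion]
    if max_count is not None:
        selected = selected[:max_count]
    return selected


if __name__ == "__main__":
    example = [(0.9, 10, 20), (0.7, 30, 40), (0.4, 50, 60)]
    # Keep at most two detections whose suspicion exceeds 0.5.
    print(select_detections(example, max_count=2, min_suspicion=0.5))
```

Because the list is already ordered by decreasing suspicion, applying the threshold first and the count limit second always keeps the most suspicious detections.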

A paper on this algorithm titled "Mass Detection by Relative Image Intensity" has been accepted for the 5th International Workshop on Digital Mammography to be held in Toronto, Canada during June 11-14, 2000. The paper will appear in the proceedings of the conference. Anyone wishing to use this software for performance comparison or in some other regard may cite that paper as the source of this algorithm and software.


Michael Heath, Computer Vision Laboratory, University of South Florida
Please direct any questions or comments to heath@csee.usf.edu.