Risk Estimation Dataset


This risk estimation dataset includes data from 2,392,998 screening mammograms (called "index mammograms") from women in the Breast Cancer Surveillance Consortium (BCSC). Results of a study that used these data were published by Barlow et al. in the September 2006 issue of the Journal of the National Cancer Institute. The dataset used in that study was a large cross-classification of risk factors by cancer outcome.

The women had no previous diagnosis of breast cancer. All had a mammogram in the five years before the index screening mammogram, but no breast imaging in the nine months before it. Cancer registry and pathology data were linked to data on mammography and incident breast cancer (invasive or ductal carcinoma in situ) within one year after the index mammogram.

Investigators may wish to use these data to explore risk factors, modification of associations, or statistical issues such as the effect of imputing missing values or of using alternative estimation models.
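As a rough illustration of how a cross-classification of risk factors by cancer outcome can be analyzed, the sketch below assumes each row holds one combination of risk-factor categories, a cancer outcome indicator, and a count of mammograms with that combination. The row layout, factor names, and values are hypothetical and are not taken from the BCSC files; the Risk Estimation Dataset Documentation describes the actual variables.

```python
from collections import defaultdict

# Hypothetical rows: (age_group, breast_density, cancer, count).
# Names and values are illustrative only, not taken from the BCSC dataset.
rows = [
    ("50-54", "dense", 0, 900),
    ("50-54", "dense", 1, 10),
    ("50-54", "fatty", 0, 495),
    ("50-54", "fatty", 1, 5),
]


def cancer_rate_by(rows, key_index):
    """Return {level: cancers / mammograms} for the risk factor at key_index."""
    totals = defaultdict(int)
    cancers = defaultdict(int)
    for row in rows:
        level, cancer, count = row[key_index], row[2], row[3]
        totals[level] += count          # every row adds to the denominator
        if cancer == 1:
            cancers[level] += count     # only cancer rows add to the numerator
    return {level: cancers[level] / totals[level] for level in totals}


# Cancer rate per breast-density category (index 1 in the hypothetical layout).
rates = cancer_rate_by(rows, key_index=1)
for level, rate in sorted(rates.items()):
    print(f"{level}: {rate:.4f}")
```

Because each row is weighted by its count, the aggregated table gives the same rates as a file with one row per mammogram would.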

In August 2012, the BCSC added a second version of this risk estimation dataset. The second version limits observations to one per woman, whereas the first version contains multiple observations per woman. All population and other characteristics remain the same.



See the Risk Estimation Dataset Documentation for more information about the variables in the dataset.


Acknowledge the BCSC

The following must be cited when reproducing these data:

"The Breast Cancer Surveillance Consortium and its data collection and sharing activities are funded by the National Cancer Institute (P01CA154292). Downloaded xx/xx/xxxx from the Breast Cancer Surveillance Consortium Web site - http://www.bcsc-research.org/."

Please acknowledge the BCSC:

"We thank the participating women, mammography facilities, and radiologists for the data they have provided. You can learn more about the BCSC at: http://www.bcsc-research.org/."

Information about the BCSC may also be included in the methods section using language such as:

"Data for this study were obtained from the BCSC: http://www.bcsc-research.org/."


Access the Dataset

Investigators can access this dataset by submitting the information below.

Note: Links for both versions of this dataset will be given. The first version contains multiple observations per woman. The second version limits observations to one per woman.
