
High prevalence of somatic PIK3CA and TP53 pathogenic variants in the normal mammary gland tissue of sporadic breast cancer patients revealed by duplex sequencing

www.nature.com/npjbcancer
ARTICLE OPEN

Anna Kostecka (1,2,15) ✉, Tomasz Nowikiewicz (3,4,15) ✉, Paweł Olszewski (2), Magdalena Koczkowska (1,2), Monika Horbacz (2), Monika Heinzl (5), Maria Andreou (2), Renato Salazar (5), Theresa Mair (5), Piotr Madanecki (1), Magdalena Gucwa (1), Hanna Davies (6), Jarosław Skokowski (7), Patrick G. Buckley (8), Rafał Pęksa (9), Ewa Śrutek (3), Łukasz Szylberg (10,11), Johan Hartman (12,13,14), Michał Jankowski (3), Wojciech Zegarski (3), Irene Tiemann-Boege (5), Jan P. Dumanski (2,6) and Arkadiusz Piotrowski (1,2)

1. Faculty of Pharmacy, Medical University of Gdansk, Gdansk, Poland.
2. 3P Medicine Lab, Medical University of Gdansk, Gdansk, Poland.
3. Department of Surgical Oncology, Ludwik Rydygier's Collegium Medicum UMK, Bydgoszcz, Poland.
4. Department of Breast Cancer and Reconstructive Surgery, Prof. F. Lukaszczyk Oncology Center, Bydgoszcz, Poland.
5. Institute of Biophysics, Johannes Kepler University, Linz, Austria.
6. Department of Immunology, Genetics and Pathology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
7. Department of Surgical Oncology, Medical University of Gdansk, Gdansk, Poland.
8. Genuity Science Genomics Centre, Dublin, Ireland.
9. Department of Patomorphology, Medical University of Gdansk, Gdansk, Poland.
10. Department of Tumor Pathology, Prof. F. Lukaszczyk Oncology Center, Bydgoszcz, Poland.
11. Department of Perinatology, Gynaecology and Gynaecologic Oncology, Collegium Medicum in Bydgoszcz, Nicolaus Copernicus University in Torun, Bydgoszcz, Poland.
12. Department of Oncology and Pathology, Karolinska Institutet, Stockholm, Sweden.
13. Department of Pathology, Karolinska University Hospital, Stockholm, Sweden.
14. MedTech Labs, Bioclinicum, Karolinska University Hospital, Stockholm, Sweden.
15. These authors contributed equally: Anna Kostecka, Tomasz Nowikiewicz.
✉ email: anna.kostecka@gumed.edu.pl; tomasz.nowikiewicz@gmail.com; arkadiusz.piotrowski@gumed.edu.pl

Published in partnership with the Breast Cancer Research Foundation

The mammary gland undergoes hormonally stimulated cycles of proliferation, lactation, and involution. We hypothesized that these factors increase the mutational burden in glandular tissue and may explain high cancer incidence rate in the general population, and recurrent disease. Hence, we investigated the DNA sequence variants in the normal mammary gland, tumor, and peripheral blood from 52 reportedly sporadic breast cancer patients. Targeted resequencing of 542 cancer-associated genes revealed subclonal somatic pathogenic variants of: PIK3CA, TP53, AKT1, MAP3K1, CDH1, RB1, NCOR1, MED12, CBFB, TBX3, and TSHR in the normal mammary gland at considerable allelic frequencies (9 × 10⁻² – 5.2 × 10⁻¹), indicating clonal expansion. Further evaluation of the frequently damaged PIK3CA and TP53 genes by ultra-sensitive duplex sequencing demonstrated a diversified picture of multiple low-level subclonal (in 10⁻²–10⁻⁴ alleles) hotspot pathogenic variants. Our results raise a question about the oncogenic potential in non-tumorous mammary gland tissue of breast-conserving surgery patients.

npj Breast Cancer (2022) 8:76; https://doi.org/10.1038/s41523-022-00443-9

INTRODUCTION

Breast cancer accounts for 24% of cancer cases in women worldwide and is the leading cause of cancer-related deaths in women. Most breast cancer cases (85–90%) are not associated with inherited mutations of high penetrance genes, such as BRCA1 (MIM *113705) or BRCA2 (MIM *600185) [2,3]. High throughput genomics technologies have highlighted the molecular complexity of breast tumors, which has led to the molecular classification of four clinically meaningful subtypes: Luminal A, Luminal B, HER2-enriched and basal-like [4,5]. Large cohort studies of breast tumor samples identified somatic driver mutations in key breast cancer-associated genes, such as PIK3CA (MIM *171834), TP53 (MIM *191170), MAP3K1 (MIM *600982), CDH1 (MIM *192090), AKT1 (MIM *164730), CBFB (MIM *121360), TBX3 (MIM *601621), RB1 (MIM *614041) [6–8]. To date, the identification of somatic driver pathogenic variants has been inferred only from tumors, without providing information on the mutational landscape and allelic frequencies of specific variants in the tissue of cancer origin, i.e., normal tissue of the mammary gland. This is highly relevant as under physiological conditions mammary gland tissue is mitotically stimulated by hormones and undergoes cycles of intense proliferation and remodeling during puberty, pregnancy, and lactation. During life, the mammary gland is exposed to estrogen and its metabolites that damage DNA by single- and double-strand breaks, mutations, or the formation of depurinating adducts [10–12].

These stress conditions can promote the accumulation of post-zygotic, somatic genetic alterations that create the risk of malignant transformation. Indeed, several studies, including ours, have identified such changes in the uninvolved mammary gland of breast cancer patients that is defined as histologically normal glandular tissue, distant from the primary tumor site [13–15]. The most pronounced genetic alterations were identified in the normal tissue from mastectomy patients that per se did not have direct clinical implications, as this affected tissue was removed completely during surgery, but might suggest an increased mutational load in the second breast. At the same time, current clinical management of breast cancer includes breast-conserving surgery (BCS), removing the tumor and sparing normal breast tissue, as one of the recommended treatments [16,17]. The presumed presence of pathogenic genetic alterations in the seemingly normal mammary gland tissue that is not removed during BCS might create a risk of recurrence and can affect future treatment. Hence, we aimed to screen at unprecedented sensitivity for the presence of subclonal somatic pathogenic genetic alterations in breast cancer-related genes in the normal mammary gland of sporadic cancer patients (study overview in the Supplementary Fig. 1).

Our study demonstrates that structural chromosomal aberrations and clearly pathogenic point variants in crucial breast cancer driver genes are frequent in the normal mammary glandular tissue that remains after breast-conserving surgery.

RESULTS

Patterns of chromosomal aberrations

We carried out analysis of chromosomal rearrangements with SNP arrays to detect DNA copy number alterations (CNAs) as well as copy number neutral loss-of-heterozygosity events via mitotic recombination. In addition to matched samples of normal uninvolved mammary gland (UM) and primary tumor (PT), we included normal mammary gland samples from 26 age-matched women that underwent breast reduction surgery and served as the control group (Supplementary Fig. 2). Spectrum of CNAs in the studied cohort is presented in Fig. 1. Hierarchical clustering revealed two clusters with PT-only and control-only samples and four additional clusters with mixed sample distribution (Supplementary Fig. 3). We also carried out cross analysis of CNAs type, size and number between the studied sample groups. The PTs stand out in this comparison (Wilcoxon test, p = 0.0094), with slight differences between normal mammary tissue from breast cancer patients and the control cohort. Nonetheless, on a per-individual basis, total number of CNAs, the number of gains, the size of deletions, and size of CNAs in general were the discriminating features between the normal mammary tissue from breast cancer patients and the control cohort, surprisingly suggesting more heterogeneous nature of the control samples (Supplementary Fig. 4).

We identified recurrent chromosomal aberrations in UMs from sporadic breast cancer patients, such as loss of 1p, 16p11.2, and 9p21.3, and 3q25.3, 4q13.1, 8q, and 20q gains, in line with previous studies [5,18]. Presence of loss of heterozygosity (LOH) at chromosome 8p, associated with poor outcome in breast cancer, was observed in matched UMs and PTs, but also in the normal mammary gland tissue of healthy controls. We observed additional events that frequently accompany 8p LOH, in the UMs: 9p loss and 8q gain. ERBB2 gains were observed exclusively in PT samples, except for one control mammary gland sample.

Subclonal somatic pathogenic variants in breast cancer driver genes present in the normal mammary gland tissue

We applied targeted DNA sequencing to identify variants in sets of UM, BL, and PT samples of 52 individuals diagnosed with sporadic breast cancer to distinguish germline and post-zygotic mutations (Supplementary Table 1, Supplementary Table 2).

Four individuals (4/52, 7.7%) were heterozygous for a constitutional pathogenic variant of a known breast cancer-associated gene, i.e. c.5179A>T (p.Lys1727Ter) and c.181T>G (p.Cys61Gly) in the BRCA1 gene, c.509_510del (p.Arg170fs) and c.354del (p.Thr119fs) in the PALB2 and RAD50 genes, respectively (Supplementary Table 3). These results correspond to similar rates from other studies where up to 10% of reportedly sporadic cases turn out hereditary after molecular testing [5,7]. Individuals with germline pathogenic variants were excluded from further analysis, resulting in a total of 48 clearly sporadic breast cancer patients. Constitutional variants of breast cancer-associated genes are listed in the Supplementary Table 3.

The summary of somatic variants fulfilling the cut-off criteria detected in known breast cancer-associated and candidate breast cancer-associated genes is provided in Supplementary Tables 4 and 5, respectively. We identified 15 somatic pathogenic, likely pathogenic variants or variants of uncertain significance with predicted deleterious effect on the encoded protein in the normal mammary gland tissue of 19% (9/48) of patients (Fig. 2). The affected genes are tumor suppressors (TP53 [5], RB1 [20], CDH1 [21]), oncogenes (PIK3CA [22]), regulate cell death (MAP3K1 [23]), DNA repair (AKT1 [24], RAD50 [25]), translation (CBFB [26]), gene expression (MED12 [27], TSHR [28]) and chromatin remodeling (NCOR1 [6]). A detailed description of these genes in the context of breast cancer is provided in Supplementary Tables 6, 7 and Supplementary Fig. 8. All of these variants except PIK3CA c.3140A>G (p.His1047Arg) were detected in BCS patients, in samples from the tissue portion that was not qualified for surgical resection.

Heterogeneity of PIK3CA and TP53 pathogenic variants revealed in the normal mammary gland tissue

Two driver genes dominate across all subtypes of invasive breast cancer: PIK3CA and TP53. PIK3CA encodes the catalytically active p110alpha isoform that regulates cell proliferation and growth receptor signaling cascade. Activating PIK3CA point variants are the most prevalent in breast tumors and were confirmed to lead to malignant transformation [22,29]. We detected four hotspot PIK3CA somatic variants in the uninvolved mammary gland, all of them have been described in the COSMIC database and reported in breast tumors (Fig. 2, Table 2, Supplementary Fig. 5). TP53 tumor suppressor acts as a transcription factor and is frequently inactivated in human malignancies, mostly through loss-of-function TP53 variants [30–32]. We detected an Ile195Thr hotspot variant in the uninvolved mammary gland that affects the central DNA-binding domain (Fig. 2, Table 2, Supplementary Fig. 5).

To enhance the sensitivity and accuracy of rare variant detection, we employed duplex sequencing (Supplementary Fig. 7). We selected four individuals: P10, P28, P51, and P52 based on the presence of PIK3CA and TP53 hotspot variants in PT samples according to standard NGS data (Fig. 3) and screened for variants in the normal mammary gland samples with high sensitivity duplex NGS sequencing. Ultra-deep targeted duplex sequencing of PIK3CA detected low-level subclonal pathogenic variants: c.1093G>A (p.Glu365Lys), c.1358A>G (p.Glu453Gly), c.1633G>A (p.Glu545Lys), c.1634A>C (p.Glu545Ala), c.2164G>A (p.Glu722Lys), c.3140A>G (p.His1047Arg), in the uninvolved mammary gland samples of three individuals. The detected variants were located in the known PIK3CA hotspot regions, reported in breast tumors in the COSMIC database and functionally confirmed to affect PIK3CA function [7,22] (Fig. 3, Supplementary Table 8). A screen for TP53 variants not only confirmed the presence of His168Leu variant, but also revealed additional hotspot variants: c.527G>T (p.Cys176Phe), c.701A>G (p.Tyr234Cys), c.733G>A (p.Gly245Ser), c.745A>T (p.Arg249Trp), c.818G>A (p.Arg273His), c.839G>C (p.Arg280Thr). Importantly, all these pathogenic variants are located in the central DNA-binding domain indispensable for p53 tumor-suppressive function [7,32] (Fig. 3, Supplementary Table 8).

DISCUSSION

Post-zygotic variations contribute to the genetic heterogeneity of an individual, which is reflected in a mosaic pattern of genetic alterations in all cells that make up the human body. The mammary gland remains mitotically active during life and under physiological conditions is exposed to DNA-damaging estrogen metabolites. Subclonal somatic genetic changes acquired during life pose a risk of cancer development. Hence, we hypothesized that these factors can increase the mutational burden in the mammary gland. Other studies have reported the presence of genomic and transcriptomic changes in the normal mammary gland, and suggested that histological normalcy does not exclude pathological biological changes [34–36]. However, these studies have been carried out on normal mammary tissue obtained from mastectomies or cancer-adjacent samples, hence the clinical relevance of these findings was limited. In this study, we screened for somatic genetic changes in the normal mammary gland tissue of sporadic cancer patients, including tissue biopsies from the parts of the breast that normally would not have been removed during breast-conserving surgery. We identified

Fig. 1 Summary of Copy Number Alterations (CNAs) detected in the studied cohort. Chromosomal CNAs were calculated as mean Log R Ratio (LRR) for chromosome arm and normalized to mean LRR of a sample. Results are presented as a heatmap with colors indicating gains (positive LRR values; red) and deletions (negative LRR values; blue). Hierarchical clustering was performed with Ward2 algorithm and identified six clusters. Pie charts with proportion of samples within clusters are presented in the Supplementary Fig. 3.
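The arm-level summary described in the Fig. 1 caption can be illustrated with a minimal Python sketch. It assumes that "normalized to mean LRR of a sample" means subtracting the sample-wide mean LRR from each arm mean; the caption does not say whether the normalization is a subtraction or a ratio. The function name, the input layout, and the probe values are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import fmean


def arm_level_lrr(probes):
    """Return {arm: mean arm LRR minus sample-wide mean LRR}.

    probes: iterable of (chromosome_arm, lrr) pairs for ONE sample.
    Assumption: normalization is a subtraction of the sample mean.
    """
    probes = list(probes)
    # Mean LRR over every probe of the sample.
    sample_mean = fmean([lrr for _, lrr in probes])
    # Group probe-level LRR values by chromosome arm.
    per_arm = defaultdict(list)
    for arm, lrr in probes:
        per_arm[arm].append(lrr)
    # Arm mean, shifted by the sample mean.
    return {arm: fmean(vals) - sample_mean for arm, vals in per_arm.items()}


# Hypothetical probe-level values, for illustration only.
example = [
    ("1p", -0.4), ("1p", -0.2),
    ("8q", 0.3), ("8q", 0.5),
    ("2p", 0.0), ("2p", -0.2),
]
profile = arm_level_lrr(example)
```

In a heatmap such as the one the caption describes, positive values in `profile` would be drawn as gains and negative values as deletions; the clustering step is not reproduced here.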
Fig. 1 abbreviations: Ctrl, control cohort mammary gland; UM, uninvolved mammary gland; PT, tumor.

Fig. 2 Somatic variants detected in the uninvolved mammary gland (UM). Targeted sequencing revealed somatic variants of known breast cancer-associated genes (rows) present in 9–52% alleles in the UM of sporadic breast cancer patients (columns). Information on estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and biological subtype of matched primary tumor sample is included. *Variants detected in matched PT sample. CNA, Copy Number Alteration status based on SNP arrays; LOH, loss of heterozygosity. Description of detected variants, including genomic position and pathogenicity classification, is provided in Table 2.

widespread genomic structural rearrangements that affect gene dosage and somatic subclonal sequence variants of known breast cancer-associated genes that control proliferation, cell death, metastasis, and genome integrity: PIK3CA, TP53, AKT1, MAP3K1, CDH1, RB1, NCOR1, MED12, CBFB, TBX3, and TSHR (Supplementary Fig. 8). These variants were present in a considerable percentage of cells, suggesting they occurred earlier in the mammary gland development or the carrier cells gained growth advantage and underwent clonal expansion. Further, ultra-sensitive duplex sequencing revealed heterogeneous mosaic landscape of low-level subclonal pathogenic variants of main breast cancer drivers: PIK3CA and TP53 in the normal mammary gland tissue. Notably, the setup of these variants was markedly different between tumor and normal mammary tissue from the same individuals, which is suggestive of multiple, independent mutational events that occurred in the mammary gland (Fig. 4).

In parallel to sequence variants, we identified recurrent CNAs in the mammary gland of breast cancer patients, but also in the age-matched control group (Fig. 1). This facilitated detecting subtle, but noticeable differences in terms of total number and length of all detected CNAs per individual (Supplementary Fig. 4). Both groups: breast cancer and control were age-matched and therefore the mammary gland tissue was exposed to cycles of estrogen for comparable time and that can explain the accumulation of copy number alterations in both cohorts.

The most important finding from this part of our study is that the normal mammary tissue from cancer patients showed DNA copy number alterations as well as evidence of copy number neutral loss-of-heterozygosity. These genomic alterations in concert with damaging sequence variants recapitulate alternative routes of gene inactivation that are typically observed in the malignant tumors, but not in the benign tissue. In this context, our study demonstrates that normal tissue profiling provides direct information on the very origin of the disease and may improve the choice of treatment as well as may aid in further clinical management of the affected individuals [37–39]. This is in contrast to typical molecular profiling studies that rely on limited retrospective information inferred from the tumors.

The PIK3CA and TP53 genes are the leading oncogenic mutations of breast malignancies and accordingly the most common changes detected in our study were in the PIK3CA gene [5,40]. Soysal et al. screened for somatic variants in benign biopsies of patients that subsequently developed breast cancer. PIK3CA and TP53 variants were the most prevalent changes in tumor samples, but not detected in benign biopsies, possibly due to limited sensitivity of standard massively parallel sequencing for rare variant detection. To overcome this limitation, we implemented duplex sequencing technology to detect PIK3CA and TP53 variants in the normal mammary gland samples at very low frequency. In the uninvolved mammary gland tissue, we detected known hotspot pathogenic variants that might activate PIK3CA kinase or target DNA-binding domain of TP53 tumor suppressor, disabling its function.

We confirmed that these variants observed in tumor samples were already present in the normal glandular tissue as well, albeit at lower levels compared to the corresponding tumors. Strikingly, these changes were accompanied in the same samples by other PIK3CA and TP53 pathogenic variants, present in the normal tissue, but not in the corresponding tumors. This may suggest the existence of potential sites of secondary tumor formation. Notably, the majority of somatic pathogenic variants, including these PIK3CA and TP53 hotspot alterations, occurred in the normal mammary gland samples not removed during breast-conserving surgery, not from radical mastectomy patients.

At the same time PIK3CA and TP53 variant spectra in the normal glandular tissue were more similar to the ones reported in cancer-oriented database (COSMIC) than those in general population (gnomAD), suggesting that the studied UM tissues reflect the repertoire of somatic variants seen in tumor samples (Supplementary Fig. 9, Supplementary Fig. 10, Supplementary Table 9). However, given the limited number of four individuals included in duplex sequencing analysis, these conclusions should be interpreted with caution. Further studies on a larger well-characterized cohort of sporadic breast cancer patients are needed for understanding how specific variants arise and expand during life. Nevertheless, we demonstrate here that ultra-sensitive duplex

Fig. 3 Somatic PIK3CA and TP53 variants detected in the uninvolved mammary gland (UM) and primary tumor (PT) samples.
Lollipop plots represent somatic variants of (a) PIK3CA and (b) TP53 genes detected by targeted next-generation sequencing (NGS). Upper panel represents variants detected in patient uninvolved mammary gland (UM) and tumor (PT) samples. All somatic variants detected according to the standard NGS and pathogenic/likely pathogenic variants detected by duplex sequencing in UM samples are included. Lower panel is a summary of somatic variants detected in breast tumors reported in the COSMIC database (https://cancer.sanger.ac.uk/cosmic). p85, p85-binding domain; RBD, Ras-binding domain; C2, C2 domain; AD, accessory domain; CD, catalytic domain; TAD1, TAD2, transcription activation domain 1 and 2; DBD, DNA-binding domain (DNA-binding sites are marked with red lines); TD, tetramerization domain. Lollipop plots were prepared based on the images generated with the Protein paint application. *Variants detected by standard NGS in primary tumor samples and selected for duplex sequencing.

sequencing approach might be beneficial to detect very low-level frequency somatic mosaicism in different tissue samples, with its potential clinical implications in terms of molecular diagnostics and prognosis.

After surgical intervention, breast cancer patients remain under clinical surveillance with recommended yearly mammogram and physical examination every 3–4 months for the first two years after surgery. The current diagnostic approach has been focused mainly on the identification of constitutional pathogenic variants in known breast cancer-associated genes to identify early those individuals who are at higher risk of breast cancer development and/or to whom the personalized targeted therapy could be offered. However, over 80% of all breast cancer cases are not associated with inherited changes.

Our results demonstrate a complex landscape of mutational burden in the seemingly normal mammary glandular tissue and indicate an oncogenic potential of the tissue not removed during surgery. This study provides a rationale for thorough genetic and clinical surveillance of sporadic breast cancer patients that underwent breast-conserving surgery. Including molecular evaluation of the normal glandular tissue of sporadic breast cancer patients could be beneficial for personalized patient care.

Fig. 4 Oncogenic potential of the normal mammary tissue. We used duplex sequencing to screen for ultra-low frequency variants and detected PIK3CA and TP53 hotspot alterations. The sampled normal mammary gland tissue is referred to as uninvolved glandular tissue and was not removed during surgical resection of the tumor mass. Detected variants might alter the function of the main breast cancer drivers: activate PIK3CA oncogene and impair TP53 tumor suppressor DNA-binding capacity. The presence of these changes implicates an oncogenic potential of the uninvolved mammary gland tissue and emphasizes the importance of thorough monitoring of sporadic breast cancer patients that underwent breast-conserving surgery.

METHODS

Patient samples and DNA isolation

We analyzed samples from 52 patients diagnosed with reportedly sporadic breast cancer with an emphasis on breast-conserving surgery (2/3 of the patients studied) and who did not receive neoadjuvant therapy. Altogether a total of 204 uninvolved mammary gland (UM), primary tumor (PT), skin (SK), and peripheral blood (BL) samples were collected via the Oncology Centre in Bydgoszcz and the University Clinical Centre in Gdansk, with the approval of bioethics committee at Medical University of Gdansk (MUG). We have obtained written informed consent from all participants. PT, UM, SK, and BL samples from each patient were collected and stored at −80 °C until DNA isolation. The overview of sample processing workflow is presented in the Supplementary Fig. 1. The histological subtypes and tumor tissue content of each PT sample were evaluated by pathologists according to the current American Joint Committee on Cancer guidelines. Tumor samples with less than 50% of neoplastic cell content were excluded. The normal mammary gland was sampled preferably from the opposite quadrant relative to the primary tumor site, with a mandatory cut-off criterion of at least 3 cm in each case, to exclude potential contamination with residual tumor cells. These tissue samples were also evaluated by pathologists to confirm normal histology (Table 1, Supplementary Table 1). All normal mammary gland samples from patients who underwent breast-conserving surgery were derived from the portion of tissue that remained intact in the patient body after breast-conserving surgery. Solid tissues were homogenized in a lysis buffer, then Proteinase K was added and samples were incubated at 55 °C for 48 h. DNA isolation from UM, PT, and SK tissue lysates was performed by phenol–chloroform extraction as previously described. Blood DNA extraction was performed with the QIAamp DNA Blood Mini Kit according to the manufacturer's protocol (Qiagen, Germantown, MD).

Copy number alteration detection

SNP array genotyping was performed for UM and PT samples on an Illumina Infinium Global Screening Array, according to the manufacturer's recommendations (Illumina, San Diego, CA). SNP genotyping data from mammary gland tissues of 26 age-matched women that underwent breast reduction surgery were used as control samples (Supplementary Fig. 2). Genotyping data was analyzed using Nexus Copy Number software version 10.0 (BioDiscovery). Quality control of samples was performed as described previously [14,44]. Briefly, samples with Log R Ratio (LRR) sd > 0.2 were flagged as poor quality and excluded from the analysis. The analysis was performed with default settings except that significance threshold for Copy Number Alterations (CNA) calling was decreased to 5 × 10⁻¹³ (default 5 × 10⁻⁷), minimal number of probes per segment was increased to 10 (default 3), gain threshold was set to 0.49 and 0.14, which corresponds to approximately 40% and 10% change for a high gain and gain respectively (the default is 0.41 and 0.06 for a high gain and gain), the loss threshold was set to −0.16 and −0.74, which corresponds to approximately −10% and −40% change for a loss and high loss respectively (the default is −0.09 and −1.1 for a loss and high loss). Hierarchical clustering was performed using the Ward2 algorithm [45].

Statistical analysis

All statistical analyses were carried out using R version 3.6.2 and package stats. Packages pheatmap and ggpubr were used for plotting. Statistical significance of differences between two groups was tested using the Mann–Whitney U test. Differences were considered significant at a two-sided p < 0.05.

Targeted DNA resequencing

Targeted DNA sequencing panel was designed with Roche NimbleDesign online tool (Roche, https://hyperdesign.com/). The panel included exons with ±50 kbp flanking regions of 542 genes selected based on in-house database and literature research (Supplementary Table 2). Sequencing libraries were prepared for sets of UM, BL, and PT samples with the capture-based Roche SeqCap EZ system according to the manufacturer's protocol (Roche, Pleasanton, CA), followed by 150 bp paired-end sequencing performed on Illumina NextSeq550 and MiniSeq instruments (Illumina, San Diego, CA). Sequencing read alignment to the human reference genome (hg38) was performed with the Burrows–Wheeler transform aligner (http://bio-bwa.sourceforge.net/). Platypus v.0.8.1.1 (https://www.rdm.ox.ac.uk/research/lunter-group/lunter-group/) was used for variant calling. Variants with poor mapping quality (<30), variants supported by high-quality bases (≥30) in fewer than five reads, and variants outside the targeted regions were excluded from analysis. Variants were annotated with VarAFT (version 2.17-2) software.

For variant selection, only variants with sequencing depth ≥ 30 and tissue allele frequency ≥ 0.07 were included in the analysis. All truncating variants were included. For non-truncating variants, the following criteria were used: variants were filtered by their clinical significance as reported in the ClinVar database (as of June 2021); variants classified as Pathogenic, Likely Pathogenic, Conflicting interpretations of pathogenicity, risk factor, and drug response were included in the study. The remaining non-truncating variants were included based on their frequency in the general population: variants with minor allele frequency (MAF) ≤ 0.001 across all gnomAD populations ("popmax") or not noted in the database were included. For in silico splicing analysis, splice prediction algorithms, i.e. SSF, MaxEntScan, and NNSplice, embedded in Alamut Visual software (version 2.14) were used. Variants described in this study were classified according to the American College of Medical Genetics and Genomics and the Association for Molecular Pathology recommendations [49]. Based on literature [2,7,30,50,51] we selected 155 breast cancer-associated genes that were the primary focus of variant analysis (Supplementary Table 2). Somatic variants presented in Fig. 2 and Table 2 were confirmed by Sanger sequencing or High Resolution Melting analysis (Supplementary Fig. 5). Lollipop plots with variant demonstration were prepared based on images generated with the Protein paint application.

Table 1. Summarized clinicopathological features of sporadic breast cancer patient cohort.

Number of individuals         52
Collected samples             204
  UM                          52
  PT                          52
  BL                          52
  SK                          48
Age (median/range)            45/28–60
Histology
  IDC                         44
  ILC                         4
  IDC-ILC                     1
  other                       3
Receptors
  ER (positive/negative)      46/6
  PR (positive/negative)      46/6
  HER2 (positive/negative)    5/47
Subtype
  Luminal A                   22
  Luminal B                   24
  HER2-enriched               2
  Triple-negative             4

Uninvolved mammary gland tissue (UM), primary tumor (PT), skin (SK), and peripheral blood (BL) samples were collected from 52 individuals diagnosed with reportedly sporadic breast cancer. Histological evaluation of tumor samples was performed according to the current American Joint Committee on Cancer guidelines [43]. PT samples were classified as Invasive Ductal Carcinoma (IDC), Invasive Lobular Carcinoma (ILC), mixed (IDC-ILC) or other. Estrogen (ER), progesterone (PR), and ERBB2 (HER2) receptors were evaluated based on immunostaining or immunostaining and FISH (HER2). Biological subtypes were assigned based on ER/PR/HER2 and Ki67 status. Detailed clinicopathological information is provided in the Supplementary Table 1.

Duplex sequencing

UM, PT, BL, and SK samples of four individuals (P10, P28, P51, and P52) were selected for detection of variants by duplex sequencing based on the presence of PIK3CA or TP53 hotspot variants in PT, but not UM tissue, according to standard NGS. The protocols used here are based on the ones described in more detail in Salazar et al.

Random DNA shearing and size selection. DNA was ultrasonicated for 10 min at ≤10 °C using a Bandelin Sonorex Super RK 102 H Ultrasonic bath ending up with a fragment size distribution of, on average, 275 bp.

discarded. The beads were washed twice with 80% ethanol, air dried at room temperature and 23 µl of PCR grade water was added to resuspend by pipetting. After incubating at RT for 5 min, the dissolved beads were allowed to stand at RT for 5 min, placed on a magnet and the clear supernatant containing the size-selected DNA was transferred to a new tube.

End-repair, A-tailing, adapter ligation, and bead purification. Size selected genomic DNA was end-repaired and A-tailed using the NEBNext® Ultra™ II End Repair/dA-Tailing Module (New England Biolabs) according to the manufacturer's instructions followed by adapter ligation with the NEBNext® Ultra™ II Ligation Module (New England Biolabs) following the manufacturer's instructions. The adapters ligated to the A-tailed DNA were synthesized as previously described (Adapter 2). The ligation reaction was then purified using 1.2 volumes of Sera-Mag Select beads (Cytiva). A total of 96.5 µl sample was thoroughly mixed with 115.8 µl beads by pipetting and incubated at RT for 10 min. Tubes were placed on a magnet and the supernatant was discarded. The beads were washed twice with 80% ethanol. Next, the beads were dried at room temperature and 23 µl of PCR grade water was added to resuspend by pipetting. After incubating the dissolved beads at RT for 5 min they were placed on a magnet and the clear supernatant containing the purified DNA was transferred to a fresh tube.

Pre-capture amplification. Ligated fragments were amplified with KAPA HiFi HotStart ReadyMix PCR Kit (KAPA Biosystems). Reaction components, primer sequences, and cycling conditions are listed in the Supplementary Table 10. For libraries with input DNA higher than 240 ng, two parallel reactions were prepared and pooled in the end, just before purification. The first step of amplification was 6 or 12 cycles of single primer extensions followed by the addition of the primer NEBNext Universal and a standard PCR amplification of 2 cycles. PCR products were purified with 1.2 volumes Sera-Mag Select beads as described above, followed by two rounds of targeted capture steps to enrich the templates of interest.

Targeted captures and post-capture amplification. Two rounds of targeted captures followed by PCR amplification were performed as described in Salazar et al., with minor modifications on the post-capture amplification (Supplementary Table 10). The biotinylated probes used to target exonic regions of TP53 and PIK3CA are detailed in Supplementary Table 10.

Duplex sequencing data analysis

FastQ files were analyzed with Galaxy platform (available on a private server provided by the Medical University of Gdansk) and first processed by the tool Trim Galore! to trim Illumina-specific adapter sequences including the barcode and spacer sequence at the 3' end of the raw reads. Next, the reads were analyzed according to a duplex sequencing (DS) specific pipeline that includes an error correction tool. After creating the duplex consensus sequence (DCS), a trimming step of 5 nucleotides from both 5' and 3' end was included. The trimmed consensus sequences were then aligned by BWA-MEM and BamLeftAlignIndels to the human genome assembly hg38. To avoid false-positive variants that would occur within any partial adapter sequences and barcodes at the 3' end of the consensus sequence and were not removed by the first adapter trimming step, the tool clipOverlap from the package BamUtil was applied. Variant calling was then performed by the variant caller LoFreq. Finally, the variants (substitutions only) were further inspected and assigned to tiers using the Variant Analyzer [55]. Variants with DCS coverage below 500 and variants outside the probe regions were discarded from our analysis and only Tier 1 variants were kept, together with Tier 2 that were detected more than once. For more details on this analysis see Povysil et al. [55]. The full Galaxy
A workflow is publicly available: https://usegalaxy.org/u/jku-itb-lab/w/ double-size selection was performed using Sera-Mag Select beads (Cytiva) gdansk-paper---galaxy-workflow. in order to exclude fragments outside a range of 100-400 bp. The size The variant frequency was calculated by dividing the number of DCS selection was performed in 50 µl of sonicated DNA (2 µg), 20 µl 10x calling the variant by the DCS coverage at the position of the variant within CutSmart buffer (NEB), 47.6 µl PCR grade water with 0.7 volumes beads. the library it was detected. The variant frequency was calculated by the The reaction was mixed by pipetting thoroughly and incubated at room count for each alteration type (e.g. A > C) divided by the frequency of the temperature (RT) for 10 min. Tubes were then placed on a magnet for sequenced reference allele (e.g., frequency of A’s in the reference 5 min and 190 µl of supernatant was transferred to a fresh tube. Next, 2.5 sequence multiplied by the sum of the mean DCS coverage for that volumes of beads in total considering the initial bead solution was added to the solution and mixed by pipetting. The mixture was incubated at RT library). The relative count is the count for each variant type divided by the for 10 min. Tubes were placed on a magnet and supernatant was sum of all occurring variants within the tissue. Published in partnership with the Breast Cancer Research Foundation npj Breast Cancer (2022) 76 A. Kostecka et al. npj Breast Cancer (2022) 76 Published in partnership with the Breast Cancer Research Foundation Table 2. Pathogenicity classification of somatic variants detected in the uninvolved mammary gland (UM) samples. 
a b c d e f f ID Gene Genomic position cDNA change (protein change) ACMG classification rsID ClinVar UM allele frequency PT allele frequency P26 AKT1 chr14:104780214 c.49 G > A (p.Glu17Lys) Pathogenic rs121434592 – 0,11 0,36 P23 CBFB chr16:67036674 c.207dup (p.Pro70fs) Pathogenic –– 0,15 not detected P18 CDH1 chr16:68819382 c.1668_1669insT (p.Lys557Ter) Pathogenic –– 0,1 0,17 P15 MAP3K1 chr5:56881868 c.2668del (p.Asn891fs) Pathogenic –– 0,09 0,15 P23 MED12 chrX:71137882 c.5983 C > T (p.Pro1995Ser) Likely pathogenic –– 0,15 not detected P15 NCOR1 chr17:16040459 c.6715 C > A (p.Pro2239Thr) Likely pathogenic –– 0,11 not detected P12 PIK3CA chr3:179234358 c.3203dup (p.Asn1068fs) Pathogenic rs587776802 Pathogenic 0,19 no data P12 PIK3CA chr3:179204536 c.1093 G > A (p.Glu365Lys) Pathogenic rs1064793732 Pathogenic 0,33 no data P23 PIK3CA chr3:179203765 c.1035 T > A (p.Asn345Lys) Pathogenic rs121913284 Likely pathogenic 0,11 not detected P27 PIK3CA chr3:179234297 c.3140 A > G (p.His1047Arg) Pathogenic rs121913279 Pathogenic 0,11 0,11 P16 RAD50 chr5:132595759 c.2165dup (p.Glu723fs) Pathogenic rs397507178 Pathogenic 0,16 not detected P20 RB1 chr13:48345117 c.418 A > G (p.Thr140Ala) Likely pathogenic –– 0,11 not detected P12 TBX3 chr12:114679572 c.796_797dup (p.Ser266fs) Pathogenic –– 0,18 not detected P31 TP53 chr17:7674947 c.584 T > C (p.Ile195Thr) Pathogenic rs760043106 Likely pathogenic 0,52 no data P20 TSHR chr14:81068264 c.253 A > G (p.Ile85Val) VUS –– 0,13 not detected Targeted DNA sequencing identified somatic DNA variants of known breast cancer-associated genes in the uninvolved mammary gland tissue of sporadic breast cancer patients. Genomic position according to the hg38 sequence assembly. Variant annotation provided for the basic isoform of the transcript. c 49 Pathogenicity classification according to the current ACMG guidelines . rsIDs in dbSNP build 152. Variant pathogenicity classification according to the ClinVar database. 
Detailed description of somatic variants detected in UM samples is provided in the Supplementary Table 4. (f) Tissue allele frequency of the detected variants in matched UM and PT tissue specimens. no data: PT sample was not available. Confirmation of somatic variants by Sanger sequencing or high-resolution melting is provided in the Supplementary Fig. 5. VUS, Variant of Unknown Significance.

DATA AVAILABILITY
Raw microarray, NGS and duplex sequencing data are available upon request in the EGA archive, study ID EGAS00001005698.

Received: 29 September 2021; Accepted: 10 June 2022

REFERENCES
1. Heer, E. et al. Global burden and trends in premenopausal and postmenopausal breast cancer: a population-based study. Lancet Glob. Heal 8, e1027–e1037 (2020).
2. Coughlin, S. S. Epidemiology of breast cancer in women. Adv. Exp. Med. Biol. 1152, 9–29 (2019).
3. Kleibl, Z. & Kristensen, V. N. Women at high risk of breast cancer: molecular characteristics, clinical presentation and management. Breast 28, 136–144 (2016).
4. Sorlie, T. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. PNAS 98, 10869–10874 (2001).
5. Koboldt, D. C. et al. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).
6. Stephens, P. J. et al. The landscape of cancer genes and mutational processes in breast cancer. Nature 486, 400–404 (2012).
7. Pereira, B. et al. The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes. Nat. Commun. 7, 11479 (2016).
8. Nik-Zainal, S. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 534, 47–54 (2016).
9. Macias, H. & Hinck, L. Mammary gland development. Wiley Interdiscip. Rev. Dev. Biol. 1, 533–557 (2012).
10. Dall, G. V. & Britt, K. L. Estrogen effects on the mammary gland in early and late life and breast cancer risk. Front. Oncol. 7, 1–10 (2017).
11. Almeida, M., Soares, M., Fonseca-Moutinho, J., Ramalhinho, A. C. & Breitenfeld, L. Influence of estrogenic metabolic pathway genes polymorphisms on postmenopausal breast cancer risk. Pharmaceuticals 14, 1–9 (2021).
12. Yager, J. D. & Davidson, N. E. Estrogen carcinogenesis in breast cancer. N. Engl. J. Med. 354, 270–282 (2006).
13. Ronowicz, A. et al. Concurrent DNA copy-number alterations and mutations in genes related to maintenance of genome stability in uninvolved mammary glandular tissue from breast cancer patients. Hum. Mutat. 36, 1088–1099 (2015).
14. Forsberg, L. A. et al. Signatures of post-zygotic structural genetic aberrations in the cells of histologically normal breast tissue that can predispose to sporadic breast cancer. Genome Res. 25, 1521–1535 (2015).
15. Danforth, D. N. Genomic changes in normal breast tissue in women at normal risk or at high risk for breast cancer. Breast Cancer Basic Clin. Res. 10, 109–146 (2016).
16. Waks, A. G. & Winer, E. P. Breast cancer treatment: a review. JAMA - J. Am. Med. Assoc. 321, 288–300 (2019).
17. Loibl, S., Poortmans, P., Morrow, M., Denkert, C. & Curigliano, G. Breast cancer. Lancet 397, 1750–1769 (2021).
18. Parris, T. Z. et al. Clinical implications of gene dosage and gene expression patterns in diploid breast carcinoma. Clin. Cancer Res. 16, 3860–3874 (2010).
19. Cai, Y. et al. Loss of chromosome 8p governs tumor progression and drug response by altering lipid metabolism. Cancer Cell 29, 751–766 (2016).
20. Witkiewicz, A. K. & Knudsen, E. S. Retinoblastoma tumor suppressor pathway in breast cancer: prognosis, precision medicine, and therapeutic interventions. Breast Cancer Res. 16, 207 (2014).
21. Christgen, M. et al. Lobular breast cancer: clinical, molecular and morphological characteristics. Pathol. Res. Pract. 212, 583–597 (2016).
22. Martínez-Saéz, O. et al. Frequency and spectrum of PIK3CA somatic mutations in breast cancer. Breast Cancer Res. 22, 1–9 (2020).
23. Pham, T. T., Angus, S. P. & Johnson, G. L. MAP3K1: Genomic alterations in cancer and function in promoting cell survival or apoptosis. Genes Cancer 4, 419–426 (2013).
24. Plo, I. et al. AKT1 inhibits homologous recombination by inducing cytoplasmic retention of BRCA1 and RAD51. Cancer Res. 68, 9404–9412 (2008).
25. Fagan-Solis, K. D. et al. A P53-independent DNA damage response suppresses oncogenic proliferation and genome instability. Cell Rep. 30, 1385–1399.e7 (2020).
26. Malik, N. et al. The transcription factor CBFB suppresses breast cancer through orchestrating translation and transcription. Nat. Commun. 10, 1–15 (2019).
27. Chang, H. Y. et al. MED12, TERT and RARA in fibroepithelial tumours of the breast. J. Clin. Pathol. 73, 51–56 (2020).
28. Liu, Y. C., Yeh, C. T. & Lin, K. H. Molecular functions of thyroid hormone signaling in regulation of cancer progression and anti-apoptosis. Int. J. Mol. Sci. 20, 1–27 (2019).
29. Thorpe, L. M., Yuzugullu, H. & Zhao, J. J. PI3K in cancer: divergent roles of isoforms, modes of activation and therapeutic targeting. Nat. Rev. Cancer 15, 7–24 (2015).
30. Vogelstein, B. et al. Cancer genome landscapes. Science 340, 1546–1558 (2013).
31. Campbell, P. J. et al. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).
32. Baugh, E. H., Ke, H., Levine, A. J., Bonneau, R. A. & Chan, C. S. Why are there hotspot mutations in the TP53 gene in human cancers? Cell Death Differ. 25, 154–160 (2018).
33. Mustjoki, S. & Young, N. S. Somatic mutations in "benign" disease. N. Engl. J. Med. 384, 2039–2052 (2021).
34. Gadaleta, E. et al. Characterization of four subtypes in morphologically normal tissue excised proximal and distal to breast cancer. npj Breast Cancer 6, 38 (2020).
35. Aran, D. et al. Comprehensive analysis of normal adjacent to tumor transcriptomes. Nat. Commun. 8, 1–13 (2017).
36. Troester, M. A. et al. DNA defects, epigenetics, and gene expression in cancer-adjacent breast: a study from the cancer genome atlas. npj Breast Cancer 2, 16007 (2016).
37. Moore, L. et al. The mutational landscape of normal human endometrial epithelium. Nature 580, 640–646 (2020).
38. Lawson, A. R. J. et al. Extensive heterogeneity in somatic mutation and selection in the human bladder. Science 370, 75–82 (2020).
39. Abascal, F. et al. Somatic mutation landscapes at single-molecule resolution. Nature 593, 405–410 (2021).
40. Berger, A. C. et al. A comprehensive pan-cancer molecular study of gynecologic and breast cancers. Cancer Cell 33, 690–705.e9 (2018).
41. Soysal, S. D. et al. Genetic alterations in benign breast biopsies of subsequent breast cancer patients. Front. Med. 6, 1–6 (2019).
42. Gradishar, W. J. et al. NCCN clinical practice guidelines in Oncology. Breast Cancer Version 4. 2021. Natl. Compr. Cancer Netw. 16, 310–320 (2021).
43. Amin, M. B. et al. AJCC Cancer Staging Manual (Springer International Publishing, 2017).
44. Rydzanicz, M. et al. Variable degree of mosaicism for tetrasomy 18p in phenotypically discordant monozygotic twins - diagnostic implications. Mol. Genet. Genom. Med. 9, 1–9 (2021).
45. Murtagh, F. & Legendre, P. Ward's hierarchical agglomerative clustering method: which algorithms implement Ward's criterion? J. Classif. 31, 274–295 (2014).
46. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
47. Rimmer, A. et al. Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nat. Genet. 46, 912–918 (2014).
48. Desvignes, J. P. et al. VarAFT: A variant annotation and filtration system for human next generation sequencing data. Nucleic Acids Res. 46, W545–W553 (2018).
49. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
50. Polyak, K. & Metzger Filho, O. SnapShot: breast cancer. Cancer Cell 22, 562–562.e1 (2012).
51. Mahdavi, M. et al. Hereditary breast cancer; genetic penetrance and current status with BRCA. J. Cell. Physiol. 234, 5741–5750 (2019).
52. Zhou, X. et al. Exploring genomic alteration in pediatric cancer using ProteinPaint. Nat. Genet. 48, 4–6 (2015).
53. Salazar, R. et al. Discovery of an unusually high number of de novo mutations in sperm of older men using duplex sequencing. Genome Res. 32, 499–511 (2022).
54. Stoler, N. et al. Family reunion via error correction: an efficient analysis of duplex sequencing data. BMC Bioinform. 21, 96 (2020).
55. Povysil, G. et al. Increased yields of duplex sequencing data by a series of quality control tools. NAR Genom Bioinform. 3, lqab002 (2021).

ACKNOWLEDGEMENTS
This work was supported by the National Science Center, Poland grant (award no. UMO-2015/19/B/NZ2/03216) to A.P. and partially funded by the Foundation for Polish Science (FNP) under the International Research Agendas Program (grant number MAB/2018/6) to J.P.D. and A.P., co-financed by the European Union under the European Regional Development Fund.

AUTHOR CONTRIBUTIONS
Study design and conception: A.P., I.T.-B., A.K. Sample collection and preparation: T.N., M.G., H.D., J.S., E.Ś., R.P., M.J., Ł.S., W.Z., J.H. Experiments: A.K., M.H., R.S., M.A., T.M. Data analysis and interpretation: A.K., P.O., A.P., M.K., M.H., I.T.-B. Manuscript writing: A.K., A.P., M.K., I.T.-B., J.P.D. All authors have read and approved the manuscript. A.K. and T.N. contributed equally.

COMPETING INTERESTS
The authors declare no competing financial interests, but the following competing non-financial interests have been declared: J.P.D. is cofounder and shareholder in Cray Innovation AB.

ADDITIONAL INFORMATION
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41523-022-00443-9.
Correspondence and requests for materials should be addressed to Anna Kostecka, Tomasz Nowikiewicz or Arkadiusz Piotrowski.
Reprints and permission information is available at http://www.nature.com/reprints
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2022

npj Breast Cancer, Springer Journals
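As a worked check of the CNA calling thresholds quoted in the Methods, the sketch below converts each threshold to an approximate percent change in copy-number signal. It assumes the thresholds are log2 ratios, which the text does not state explicitly but which reproduces the quoted figures of roughly 40%, 10%, −10% and −40%; the function name is illustrative and not part of any analysis software.

```python
def log2_ratio_to_percent_change(log2_ratio: float) -> float:
    """Convert a log2 ratio to percent change relative to the baseline.

    Assumption: the thresholds in the Methods are log2 ratios.
    """
    return (2.0 ** log2_ratio - 1.0) * 100.0


# Thresholds as quoted in the Methods (study settings, not software defaults).
thresholds = {
    "high gain": 0.49,
    "gain": 0.14,
    "loss": -0.16,
    "high loss": -0.74,
}

percent_change = {
    name: log2_ratio_to_percent_change(value)
    for name, value in thresholds.items()
}

for name, pct in percent_change.items():
    print(f"{name}: {pct:+.1f}%")
```

Under this assumption, each converted value falls within about half a percentage point of the figure given in the text.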
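The duplex sequencing filtering and frequency rules given in the Methods (DCS coverage of at least 500, position inside the probe regions, Tier 1 variants kept, Tier 2 variants kept only when detected more than once, frequency equal to variant-calling DCS divided by DCS coverage) can be sketched as follows. This is a minimal illustration and not the authors' Galaxy workflow: the record layout, field names and example numbers are hypothetical, and "detected more than once" is represented here by a simple counter, which is an assumption about how that criterion is applied.

```python
from dataclasses import dataclass


@dataclass
class DcsCall:
    """One substitution call from duplex consensus sequences (hypothetical layout)."""
    label: str             # free-text identifier for the call
    tier: int              # tier assigned by the tier-classification step
    variant_dcs: int       # number of DCS calling the variant
    dcs_coverage: int      # DCS coverage at the variant position
    in_probe_region: bool  # whether the position lies inside the capture probes
    times_detected: int    # assumed counter for "detected more than once"


def variant_frequency(call: DcsCall) -> float:
    """Variant frequency = DCS calling the variant / DCS coverage at that position."""
    return call.variant_dcs / call.dcs_coverage


def passes_filters(call: DcsCall, min_coverage: int = 500) -> bool:
    """Apply the coverage, probe-region and tier rules described in the Methods."""
    if call.dcs_coverage < min_coverage or not call.in_probe_region:
        return False
    if call.tier == 1:
        return True
    return call.tier == 2 and call.times_detected > 1


# Illustrative numbers only; these are not data from the study.
calls = [
    DcsCall("example A", tier=1, variant_dcs=3, dcs_coverage=6000,
            in_probe_region=True, times_detected=1),
    DcsCall("example B", tier=2, variant_dcs=1, dcs_coverage=4000,
            in_probe_region=True, times_detected=1),
    DcsCall("example C", tier=2, variant_dcs=2, dcs_coverage=5000,
            in_probe_region=True, times_detected=2),
    DcsCall("example D", tier=1, variant_dcs=1, dcs_coverage=400,
            in_probe_region=True, times_detected=1),
]

kept = [call for call in calls if passes_filters(call)]
for call in kept:
    print(f"{call.label}: frequency {variant_frequency(call):.1e}")
```

With thresholds of this kind, a call supported by a handful of DCS at several thousand-fold coverage corresponds to frequencies in the 10^−4 range, the lower end of the interval reported in the abstract.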



Publisher
Springer Journals
Copyright
Copyright © The Author(s) 2022
eISSN
2374-4677
DOI
10.1038/s41523-022-00443-9

To overcome this limitation, we sequencing revealed heterogenous mosaic landscape of low- implemented duplex sequencing technology to detect PIK3CA level subclonal pathogenic variants of main breast cancer drivers: and TP53 variants in the normal mammary gland samples at very PIK3CA and TP53 in the normal mammary gland tissue. Notably, low frequency. In the uninvolved mammary gland tissue, we the setup of these variants was markedly different between tumor detected known hotspot pathogenic variants that might activate and normal mammary tissue from the same individuals which is PIK3CA kinase or target DNA-binding domain of TP53 tumor suggestive of multiple, independent mutational events that suppressor, disabling its function. occurred in the mammary gland (Fig. 4). We confirmed that these variants observed in tumor samples In parallel to sequence variants, we identified recurrent CNAs in were already present in the normal glandular tissue as well, albeit the mammary gland of breast cancer patients, but also in the age- at lower levels compared to the corresponding tumors. Strikingly matched control group (Fig. 1). This facilitated detecting subtle, these changes were accompanied in the same samples by other but noticeable differences in terms of total number and length of PIK3CA and TP53 pathogenic variants, present in the normal tissue, all detected CNAs per individual (Supplementary Fig. 4). Both but not in the corresponding tumors. This may suggest the groups: breast cancer and control were age-matched and there- existence of potential sites of secondary tumor formation. Notably, fore the mammary gland tissue was exposed to cycles of estrogen the majority of somatic pathogenic variants, including these for comparable time and that can explain the accumulation of PIK3CA and TP53 hotspot alterations, occurred in the normal copy number alterations in both cohorts. 
mammary gland samples not removed during breast-conserving The most important finding from this part of our study is that surgery, not from radical mastectomy patients. the normal mammary tissue from cancer patients showed DNA At the same time PIK3CA and TP53 variant spectra in the normal copy number alterations as well as evidence of copy number glandular tissue were more similar to the ones reported in cancer- neutral loss-of-heterozygosity. These genomic alterations in oriented database (COSMIC) than those in general population concert with damaging sequence variants recapitulate alternative (gnomAD), suggesting that the studied UM tissues reflect the routes of gene inactivation that are typically observed in the repertoire of somatic variants seen in tumor samples (Supple- malignant tumors, but not in the benign tissue. In this context, our mentary Fig. 9, Supplementary Fig. 10, Supplementary Table 9). study demonstrates that normal tissue profiling provides direct However, given the limited number of four individuals included in information on the very origin of the disease and may improve the duplex sequencing analysis, these conclusions should be inter- choice of treatment as well as may aid in further clinical preted with caution. Further studies on a larger well-characterized 37–39 management of the affected individuals . This is in contrast cohort of sporadic breast cancer patients are needed for under- to typical molecular profiling studies that rely on limited standing how specific variants arise and expand during life. retrospective information inferred from the tumors. Nevertheless, we demonstrate here that ultra-sensitive duplex npj Breast Cancer (2022) 76 Published in partnership with the Breast Cancer Research Foundation A. Kostecka et al. Fig. 3 Somatic PIK3CA and TP53 variants detected in the uninvolved mammary gland (UM) and primary tumor (PT) samples. 
Lollipop plots represent somatic variants of (a) PIK3CA and (b) TP53 genes detected by targeted next-generation sequencing (NGS). Upper panel represents variants detected in patient uninvolved mammary gland (UM) and tumor (PT) samples. All somatic variants detected according to the standard NGS and pathogenic/likely pathogenic variants detected by duplex sequencing in UM samples are included. Lower panel is a summary of somatic variants detected in breast tumors reported in the COSMIC database (https://cancer.sanger.ac.uk/cosmic). p85 p85- binding domain, RBD Ras-binding domain, C2 C2 domain, AD accessory domain, CD catalytic domain. TAD1, TAD2 transcription activation domain 1 and 2, DBD DNA-binding domain, DNA-binding sites are marked with red lines, TD tetramerization domain. Lollipop plots were prepared based on the images generated with the Protein paint application . *Variants detected by standard NGS in primary tumor samples and selected for duplex sequencing. sequencing approach might be beneficial to detect very low-level underwent breast-conserving surgery. Including molecular evalua- frequency somatic mosaicism in different tissue samples, with its tion of the normal glandular tissue of sporadic breast cancer potential clinical implications in terms of molecular diagnostics patients could be beneficial for personalized patient care. and prognosis. After surgical intervention, breast cancer patients remain under METHODS clinical surveillance with recommended yearly mammogram and Patient samples and DNA isolation physical examination every 3–4 months for the first two years after surgery . The current diagnostic approach has been focused We analyzed samples from 52 patients diagnosed with reportedly sporadic mainly on the identification of constitutional pathogenic variants breast cancer with an emphasis on breast-conserving surgery (2/3 of the patients studied) and who did not receive neoadjuvant therapy. 
Altogether in known breast cancer-associated genes to catch early these a total of 204 uninvolved mammary gland (UM), primary tumor (PT), skin individuals who are in a higher risk of breast cancer development (SK), and peripheral blood (BL) samples were collected via the Oncology and/or to whom the personalized targeted therapy could be Centre in Bydgoszcz and the University Clinical Centre in Gdansk, with the offered. However, over 80% of all breast cancer cases are not approval of bioethics committee at Medical University of Gdansk (MUG). associated with inherited changes . We have obtained written informed consent from all participants. PT, UM, Our results demonstrate a complex landscape of mutational SK, and BL samples from each patient were collected and stored in −80 °C burden in the seemingly normal mammary glandular tissue and upon DNA isolation. The overview of sample processing workflow is indicate an oncogenic potential of the tissue not removed during presented in the Supplementary Fig. 1. The histological subtypes and surgery. This study provides a rationale for thorough genetic and tumor tissue content of each PT sample were evaluated by pathologists clinical surveillance of sporadic breast cancer patients that according to the current American Joint Committee on Cancer Published in partnership with the Breast Cancer Research Foundation npj Breast Cancer (2022) 76 A. Kostecka et al. Fig. 4 Oncogenic potential of the normal mammary tissue. We used duplex sequencing to screen for ultra-low frequency variants and detected PIK3CA and TP53 hotspot alterations. The sampled normal mammary gland tissue is referred to as uninvolved glandular tissue and was not removed during surgical resection of the tumor mass. Detected variants might alter the function of the main breast cancer drivers: activate PIK3CA oncogene and impair TP53 tumor suppressor DNA-binding capacity. 
The presence of these changes implicates an oncogenic potential of the uninvolved mammary gland tissue and emphasizes the importance of thorough monitoring of sporadic breast cancer patients that underwent breast-conserving surgery.

guidelines. Tumor samples with less than 50% of neoplastic cell content were excluded. The normal mammary gland was sampled preferably from the opposite quadrant relative to the primary tumor site, with a mandatory cut-off criterion of at least 3 cm in each case, to exclude potential contamination with residual tumor cells. These tissue samples were also evaluated by pathologists to confirm normal histology (Table 1, Supplementary Table 1). All normal mammary gland samples from patients who underwent breast-conserving surgery were derived from the portion of tissue that remained intact in the patient body after breast-conserving surgery. Solid tissues were homogenized in a lysis buffer, then Proteinase K was added and samples were incubated at 55 °C for 48 h. DNA isolation from UM, PT, and SK tissue lysates was performed by phenol–chloroform extraction as previously described. Blood DNA extraction was performed with the QIAamp DNA Blood Mini Kit according to the manufacturer’s protocol (Qiagen, Germantown, MD).

Table 1. Summarized clinicopathological features of sporadic breast cancer patient cohort.

Number of individuals          52
Collected samples:             204
  UM                           52
  PT                           52
  BL                           52
  SK                           48
Age (median/range)             45/28–60
Histology
  IDC                          44
  ILC                          4
  IDC-ILC                      1
  other                        3
Receptors
  ER (positive/negative)       46/6
  PR (positive/negative)       46/6
  HER2 (positive/negative)     5/47
Subtype
  Luminal A                    22
  Luminal B                    24
  HER2-enriched                2
  Triple-negative              4

Uninvolved mammary gland tissue (UM), primary tumor (PT), skin (SK), and peripheral blood (BL) samples were collected from 52 individuals diagnosed with reportedly sporadic breast cancer. Histological evaluation of tumor samples was performed according to the current American Joint Committee on Cancer guidelines [43]. PT samples were classified as Invasive Ductal Carcinoma (IDC), Invasive Lobular Carcinoma (ILC), mixed (IDC-ILC) or other. Estrogen (ER), progesterone (PR), and ERBB2 (HER2) receptors were evaluated based on immunostaining or immunostaining and FISH (HER2). Biological subtypes were assigned based on ER/PR/HER2 and Ki67 status. Detailed clinicopathological information is provided in the Supplementary Table 1.

Copy number alteration detection

SNP array genotyping was performed for UM and PT samples on an Illumina Infinium Global Screening Array, according to the manufacturer’s recommendations (Illumina, San Diego, CA). SNP genotyping data from mammary gland tissues of 26 age-matched women that underwent breast reduction surgery were used as control samples (Supplementary Fig. 2). Genotyping data was analyzed using Nexus Copy Number software version 10.0 (BioDiscovery). Quality control of samples was performed as described previously [14,44]. Briefly, samples with Log R Ratio (LRR) sd > 0.2 were flagged as poor quality and excluded from the analysis. The analysis was performed with default settings except that the significance threshold for Copy Number Alterations (CNA) calling was decreased to 5 × 10⁻¹³ (default 5 × 10⁻⁷), the minimal number of probes per segment was increased to 10 (default 3), the gain threshold was set to 0.49 and 0.14, which corresponds to approximately 40% and 10% change for a high gain and gain respectively (the default is 0.41 and 0.06 for a high gain and gain), and the loss threshold was set to −0.16 and −0.74, which corresponds to approximately −10% and −40% change for a loss and high loss respectively (the default is −0.09 and −1.1 for a loss and high loss). Hierarchical clustering was performed using the Ward2 algorithm [45].

Statistical analysis

All statistical analyses were carried out using R version 3.6.2 and package stats. Packages pheatmap and ggpubr were used for plotting. Statistical significance of differences between two groups was tested using the Mann–Whitney U test. Differences were considered significant at a two-sided p < 0.05.

Targeted DNA resequencing

Targeted DNA sequencing panel was designed with Roche NimbleDesign online tool (Roche, https://hyperdesign.com/). The panel included exons with +/- 50 kbp flanking regions of 542 genes selected based on in-house database and literature research (Supplementary Table 2). Sequencing libraries were prepared for sets of UM, BL, and PT samples with the capture-based Roche SeqCap EZ system according to the manufacturer’s protocol (Roche, Pleasanton, CA), followed by 150 bp paired-end sequencing performed on Illumina NextSeq550 and MiniSeq instruments (Illumina, San Diego, CA). Sequencing read alignment to the human reference genome (hg38) was performed with the Burrows–Wheeler transform aligner (http://bio-bwa.sourceforge.net/). Platypus v.0.8.1.1 (https://www.rdm.ox.ac.uk/research/lunter-group/lunter-group/) was used for variant calling. Variants with poor mapping quality (<30), variants supported by high-quality bases (≥30) in fewer than five reads, and variants outside the targeted regions were excluded from analysis. Variants were annotated with VarAFT (version 2.17-2) software.

For variant selection, only variants with sequencing depth ≥ 30 and tissue allele frequency ≥ 0.07 were included in the analysis. All truncating variants were included. For non-truncating variants, the following criteria were used: variants were filtered by their clinical significance as reported in the ClinVar database (as of June 2021), and variants classified as Pathogenic, Likely Pathogenic, Conflicting interpretations of pathogenicity, risk factor, and drug response were included in the study. The remaining non-truncating variants were included based on their frequency in the general population: variants with minor allele frequency (MAF) ≤ 0.001 across all gnomAD populations (“popmax”) or not noted in the database were included. For in silico splicing analysis, splice prediction algorithms, i.e. SSF, MaxEntScan, and NNSplice, embedded in Alamut Visual software (version 2.14) were used. Variants described in this study were classified according to the American College of Medical Genetics and Genomics and the Association for Molecular Pathology recommendations [49]. Based on literature [2,7,30,50,51] we selected 155 breast cancer-associated genes that were the primary focus of variant analysis (Supplementary Table 2). Somatic variants presented in Fig. 2 and Table 2 were confirmed by Sanger sequencing or High Resolution Melting analysis (Supplementary Fig. 5). Lollipop plots with variant demonstration were prepared based on images generated with the Protein paint application.

Duplex sequencing

UM, PT, BL, and SK samples of four individuals (P10, P28, P51, and P52) were selected for detection of variants by duplex sequencing based on the presence of PIK3CA or TP53 hotspot variants in PT, but not UM tissue, according to standard NGS. The protocols used here are based on the ones described in more detail in Salazar et al.

Random DNA shearing and size selection. DNA was ultrasonicated for 10 min at ≤10 °C using a Bandelin Sonorex Super RK 102 H Ultrasonic bath ending up with a fragment size distribution of, on average, 275 bp. A double-size selection was performed using Sera-Mag Select beads (Cytiva) in order to exclude fragments outside a range of 100–400 bp. The size selection was performed in 50 µl of sonicated DNA (2 µg), 20 µl 10x CutSmart buffer (NEB), 47.6 µl PCR grade water with 0.7 volumes beads. The reaction was mixed by pipetting thoroughly and incubated at room temperature (RT) for 10 min. Tubes were then placed on a magnet for 5 min and 190 µl of supernatant was transferred to a fresh tube. Next, 2.5 volumes of beads in total considering the initial bead solution was added to the solution and mixed by pipetting. The mixture was incubated at RT for 10 min. Tubes were placed on a magnet and supernatant was discarded. The beads were washed twice with 80% ethanol, air dried at room temperature and 23 µl of PCR grade water was added to resuspend by pipetting. After incubating at RT for 5 min, the dissolved beads were allowed to stand at RT for 5 min, placed on a magnet and the clear supernatant containing the size-selected DNA was transferred to a new tube.

End-repair, A-tailing, adapter ligation, and bead purification. Size selected genomic DNA was end-repaired and A-tailed using the NEBNext® Ultra™ II End Repair/dA-Tailing Module (New England Biolabs) according to the manufacturer’s instructions followed by adapter ligation with the NEBNext® Ultra™ II Ligation Module (New England Biolabs) following the manufacturer’s instructions. The adapters ligated to the A-tailed DNA were synthesized as previously described (Adapter 2). The ligation reaction was then purified using 1.2 volumes of Sera-Mag Select beads (Cytiva). A total of 96.5 µl sample was thoroughly mixed with 115.8 µl beads by pipetting and incubated at RT for 10 min. Tubes were placed on a magnet and the supernatant was discarded. The beads were washed twice with 80% ethanol. Next, the beads were dried at room temperature and 23 µl of PCR grade water was added to resuspend by pipetting. After incubating the dissolved beads at RT for 5 min they were placed on a magnet and the clear supernatant containing the purified DNA was transferred to a fresh tube.

Pre-capture amplification. Ligated fragments were amplified with KAPA HiFi HotStart ReadyMix PCR Kit (KAPA Biosystems). Reaction components, primer sequences, and cycling conditions are listed in the Supplementary Table 10. For libraries with input DNA higher than 240 ng, two parallel reactions were prepared and pooled in the end, just before purification. The first step of amplification was 6 or 12 cycles of single primer extensions followed by the addition of the primer NEBNext Universal and a standard PCR amplification of 2 cycles. PCR products were purified with 1.2 volumes Sera-Mag Select beads as described above, followed by two rounds of targeted capture steps to enrich the templates of interest.

Targeted captures and post-capture amplification. Two rounds of targeted captures followed by PCR amplification were performed as described in Salazar et al., with minor modifications on the post-capture amplification (Supplementary Table 10). The biotinylated probes used to target exonic regions of TP53 and PIK3CA are detailed in Supplementary Table 10.

Duplex sequencing data analysis

FastQ files were analyzed with Galaxy platform (available on a private server provided by the Medical University of Gdansk) and first processed by the tool Trim Galore! to trim Illumina-specific adapter sequences including the barcode and spacer sequence at the 3' end of the raw reads. Next, the reads were analyzed according to a duplex sequencing (DS) specific pipeline that includes an error correction tool. After creating the duplex consensus sequence (DCS), a trimming step of 5 nucleotides from both 5' and 3' end was included. The trimmed consensus sequences were then aligned by BWA-MEM and BamLeftAlignIndels to the human genome assembly hg38. To avoid false-positive variants that would occur within any partial adapter sequences and barcodes at the 3' end of the consensus sequence and were not removed by the first adapter trimming step, the tool clipOverlap from the package BamUtil was applied. Variant calling was then performed by the variant caller LoFreq. Finally, the variants (substitutions only) were further inspected and assigned to tiers using the Variant Analyzer [55]. Variants with DCS coverage below 500 and variants outside the probe regions were discarded from our analysis and only Tier 1 variants were kept, together with Tier 2 that were detected more than once. For more details on this analysis see Povysil et al. [55]. The full Galaxy workflow is publicly available: https://usegalaxy.org/u/jku-itb-lab/w/gdansk-paper---galaxy-workflow.

The variant frequency was calculated by dividing the number of DCS calling the variant by the DCS coverage at the position of the variant within the library in which it was detected. For each alteration type (e.g. A > C), the variant frequency was calculated as the count for that alteration type divided by the frequency of the sequenced reference allele (e.g., frequency of A’s in the reference sequence multiplied by the sum of the mean DCS coverage for that library). The relative count is the count for each variant type divided by the sum of all occurring variants within the tissue.

Table 2. Pathogenicity classification of somatic variants detected in the uninvolved mammary gland (UM) samples.
ID | Gene | Genomic position (a) | cDNA change (protein change) (b) | ACMG classification (c) | rsID (d) | ClinVar (e) | UM allele frequency (f) | PT allele frequency (f)
P26 | AKT1 | chr14:104780214 | c.49G>A (p.Glu17Lys) | Pathogenic | rs121434592 | – | 0.11 | 0.36
P23 | CBFB | chr16:67036674 | c.207dup (p.Pro70fs) | Pathogenic | – | – | 0.15 | not detected
P18 | CDH1 | chr16:68819382 | c.1668_1669insT (p.Lys557Ter) | Pathogenic | – | – | 0.1 | 0.17
P15 | MAP3K1 | chr5:56881868 | c.2668del (p.Asn891fs) | Pathogenic | – | – | 0.09 | 0.15
P23 | MED12 | chrX:71137882 | c.5983C>T (p.Pro1995Ser) | Likely pathogenic | – | – | 0.15 | not detected
P15 | NCOR1 | chr17:16040459 | c.6715C>A (p.Pro2239Thr) | Likely pathogenic | – | – | 0.11 | not detected
P12 | PIK3CA | chr3:179234358 | c.3203dup (p.Asn1068fs) | Pathogenic | rs587776802 | Pathogenic | 0.19 | no data
P12 | PIK3CA | chr3:179204536 | c.1093G>A (p.Glu365Lys) | Pathogenic | rs1064793732 | Pathogenic | 0.33 | no data
P23 | PIK3CA | chr3:179203765 | c.1035T>A (p.Asn345Lys) | Pathogenic | rs121913284 | Likely pathogenic | 0.11 | not detected
P27 | PIK3CA | chr3:179234297 | c.3140A>G (p.His1047Arg) | Pathogenic | rs121913279 | Pathogenic | 0.11 | 0.11
P16 | RAD50 | chr5:132595759 | c.2165dup (p.Glu723fs) | Pathogenic | rs397507178 | Pathogenic | 0.16 | not detected
P20 | RB1 | chr13:48345117 | c.418A>G (p.Thr140Ala) | Likely pathogenic | – | – | 0.11 | not detected
P12 | TBX3 | chr12:114679572 | c.796_797dup (p.Ser266fs) | Pathogenic | – | – | 0.18 | not detected
P31 | TP53 | chr17:7674947 | c.584T>C (p.Ile195Thr) | Pathogenic | rs760043106 | Likely pathogenic | 0.52 | no data
P20 | TSHR | chr14:81068264 | c.253A>G (p.Ile85Val) | VUS | – | – | 0.13 | not detected

Targeted DNA sequencing identified somatic DNA variants of known breast cancer-associated genes in the uninvolved mammary gland tissue of sporadic breast cancer patients. Detailed description of somatic variants detected in UM samples is provided in the Supplementary Table 4. Confirmation of somatic variants by Sanger sequencing or high-resolution melting is provided in the Supplementary Fig. 5. VUS Variant of Unknown Significance.
(a) Genomic position according to the hg38 sequence assembly.
(b) Variant annotation provided for the basic isoform of the transcript.
(c) Pathogenicity classification according to the current ACMG guidelines [49].
(d) rsIDs in dbSNP build 152.
(e) Variant pathogenicity classification according to the ClinVar database.
(f) Tissue allele frequency of the detected variants in matched UM and PT tissue specimens. No data: PT sample was not available.

DATA AVAILABILITY

Raw microarray, NGS and duplex sequencing data are available upon request in the EGA archive, study ID EGAS00001005698.

Received: 29 September 2021; Accepted: 10 June 2022;

REFERENCES

1. Heer, E. et al. Global burden and trends in premenopausal and postmenopausal breast cancer: a population-based study. Lancet Glob. Health 8, e1027–e1037 (2020).
2. Coughlin, S. S. Epidemiology of breast cancer in women. Adv. Exp. Med. Biol. 1152, 9–29 (2019).
3. Kleibl, Z. & Kristensen, V. N. Women at high risk of breast cancer: molecular characteristics, clinical presentation and management. Breast 28, 136–144 (2016).
4. Sorlie, T. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. PNAS 98, 10869–10874 (2001).
5. Koboldt, D. C. et al. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).
6. Stephens, P. J. et al. The landscape of cancer genes and mutational processes in breast cancer. Nature 486, 400–404 (2012).
7. Pereira, B. et al. The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes. Nat. Commun. 7, 11479 (2016).
8. Nik-Zainal, S. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 534, 47–54 (2016).
9. Macias, H. & Hinck, L. Mammary gland development. Wiley Interdiscip. Rev. Dev. Biol. 1, 533–557 (2012).
10. Dall, G. V. & Britt, K. L. Estrogen effects on the mammary gland in early and late life and breast cancer risk. Front. Oncol. 7, 1–10 (2017).
11. Almeida, M., Soares, M., Fonseca-Moutinho, J., Ramalhinho, A. C. & Breitenfeld, L. Influence of estrogenic metabolic pathway genes polymorphisms on postmenopausal breast cancer risk. Pharmaceuticals 14, 1–9 (2021).
12. Yager, J. D. & Davidson, N. E. Estrogen carcinogenesis in breast cancer. N. Engl. J. Med. 354, 270–282 (2006).
13. Ronowicz, A. et al. Concurrent DNA copy-number alterations and mutations in genes related to maintenance of genome stability in uninvolved mammary glandular tissue from breast cancer patients. Hum. Mutat. 36, 1088–1099 (2015).
14. Forsberg, L. A. et al. Signatures of post-zygotic structural genetic aberrations in the cells of histologically normal breast tissue that can predispose to sporadic breast cancer. Genome Res. 25, 1521–1535 (2015).
15. Danforth, D. N. Genomic changes in normal breast tissue in women at normal risk or at high risk for breast cancer. Breast Cancer Basic Clin. Res. 10, 109–146 (2016).
16. Waks, A. G. & Winer, E. P. Breast cancer treatment: a review. JAMA - J. Am. Med. Assoc. 321, 288–300 (2019).
17. Loibl, S., Poortmans, P., Morrow, M., Denkert, C. & Curigliano, G. Breast cancer. Lancet 397, 1750–1769 (2021).
18. Parris, T. Z. et al. Clinical implications of gene dosage and gene expression patterns in diploid breast carcinoma. Clin. Cancer Res. 16, 3860–3874 (2010).
19. Cai, Y. et al. Loss of chromosome 8p governs tumor progression and drug response by altering lipid metabolism. Cancer Cell 29, 751–766 (2016).
20. Witkiewicz, A. K. & Knudsen, E. S. Retinoblastoma tumor suppressor pathway in breast cancer: prognosis, precision medicine, and therapeutic interventions. Breast Cancer Res. 16, 207 (2014).
21. Christgen, M. et al. Lobular breast cancer: clinical, molecular and morphological characteristics. Pathol. Res. Pract. 212, 583–597 (2016).
22. Martínez-Sáez, O. et al. Frequency and spectrum of PIK3CA somatic mutations in breast cancer. Breast Cancer Res. 22, 1–9 (2020).
23. Pham, T. T., Angus, S. P. & Johnson, G. L. MAP3K1: Genomic alterations in cancer and function in promoting cell survival or apoptosis. Genes Cancer 4, 419–426 (2013).
24. Plo, I. et al. AKT1 inhibits homologous recombination by inducing cytoplasmic retention of BRCA1 and RAD51. Cancer Res. 68, 9404–9412 (2008).
25. Fagan-Solis, K. D. et al. A P53-independent DNA damage response suppresses oncogenic proliferation and genome instability. Cell Rep. 30, 1385–1399.e7 (2020).
26. Malik, N. et al. The transcription factor CBFB suppresses breast cancer through orchestrating translation and transcription. Nat. Commun. 10, 1–15 (2019).
27. Chang, H. Y. et al. MED12, TERT and RARA in fibroepithelial tumours of the breast. J. Clin. Pathol. 73, 51–56 (2020).
28. Liu, Y. C., Yeh, C. T. & Lin, K. H. Molecular functions of thyroid hormone signaling in regulation of cancer progression and anti-apoptosis. Int. J. Mol. Sci. 20, 1–27 (2019).
29. Thorpe, L. M., Yuzugullu, H. & Zhao, J. J. PI3K in cancer: divergent roles of isoforms, modes of activation and therapeutic targeting. Nat. Rev. Cancer 15, 7–24 (2015).
30. Vogelstein, B. et al. Cancer genome landscapes. Science 340, 1546–1558 (2013).
31. Campbell, P. J. et al. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).
32. Baugh, E. H., Ke, H., Levine, A. J., Bonneau, R. A. & Chan, C. S. Why are there hotspot mutations in the TP53 gene in human cancers? Cell Death Differ. 25, 154–160 (2018).
33. Mustjoki, S. & Young, N. S. Somatic mutations in “benign” disease. N. Engl. J. Med. 384, 2039–2052 (2021).
34. Gadaleta, E. et al. Characterization of four subtypes in morphologically normal tissue excised proximal and distal to breast cancer. npj Breast Cancer 6, 38 (2020).
35. Aran, D. et al. Comprehensive analysis of normal adjacent to tumor transcriptomes. Nat. Commun. 8, 1–13 (2017).
36. Troester, M. A. et al. DNA defects, epigenetics, and gene expression in cancer-adjacent breast: a study from the cancer genome atlas. npj Breast Cancer 2, 16007 (2016).
37. Moore, L. et al. The mutational landscape of normal human endometrial epithelium. Nature 580, 640–646 (2020).
38. Lawson, A. R. J. et al. Extensive heterogeneity in somatic mutation and selection in the human bladder. Science 370, 75–82 (2020).
39. Abascal, F. et al. Somatic mutation landscapes at single-molecule resolution. Nature 593, 405–410 (2021).
40. Berger, A. C. et al. A comprehensive pan-cancer molecular study of gynecologic and breast cancers. Cancer Cell 33, 690–705.e9 (2018).
41. Soysal, S. D. et al. Genetic alterations in benign breast biopsies of subsequent breast cancer patients. Front. Med. 6, 1–6 (2019).
42. Gradishar, W. J. et al. NCCN clinical practice guidelines in Oncology. Breast Cancer Version 4. 2021. Natl. Compr. Cancer Netw. 16, 310–320 (2021).
43. Amin, M. B. et al. AJCC Cancer Staging Manual (Springer International Publishing, 2017).
44. Rydzanicz, M. et al. Variable degree of mosaicism for tetrasomy 18p in phenotypically discordant monozygotic twins—diagnostic implications. Mol. Genet. Genom. Med. 9, 1–9 (2021).
45. Murtagh, F. & Legendre, P. Ward’s hierarchical agglomerative clustering method: which algorithms implement Ward’s criterion? J. Classif. 31, 274–295 (2014).
46. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
47. Rimmer, A. et al. Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nat. Genet. 46, 912–918 (2014).
48. Desvignes, J. P. et al. VarAFT: A variant annotation and filtration system for human next generation sequencing data. Nucleic Acids Res. 46, W545–W553 (2018).
49. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
50. Polyak, K. & Metzger Filho, O. SnapShot: breast cancer. Cancer Cell 22, 562–562.e1 (2012).
51. Mahdavi, M. et al. Hereditary breast cancer; genetic penetrance and current status with BRCA. J. Cell. Physiol. 234, 5741–5750 (2019).
52. Zhou, X. et al. Exploring genomic alteration in pediatric cancer using ProteinPaint. Nat. Genet. 48, 4–6 (2015).
53. Salazar, R. et al. Discovery of an unusually high number of de novo mutations in sperm of older men using duplex sequencing. Genome Res. 32, 499–511 (2022).
54. Stoler, N. et al. Family reunion via error correction: an efficient analysis of duplex sequencing data. BMC Bioinform. 21, 96 (2020).
55. Povysil, G. et al. Increased yields of duplex sequencing data by a series of quality control tools. NAR Genom. Bioinform. 3, lqab002 (2021).

ACKNOWLEDGEMENTS

This work was supported by the National Science Center, Poland grant (award no. UMO-2015/19/B/NZ2/03216) to A.P. and partially funded by the Foundation for Polish Science (FNP) under the International Research Agendas Program (grant number MAB/2018/6) to J.P.D. and A.P., co-financed by the European Union under the European Regional Development Fund.

AUTHOR CONTRIBUTIONS

Study design and conception: A.P., I.T.-B., A.K. Sample collection and preparation: T.N., M.G., H.D., J.S., E.Ś., R.P., M.J., Ł.S., W.Z., J.H. Experiments: A.K., M.H., R.S., M.A., T.M. Data analysis and interpretation: A.K., P.O., A.P., M.K., M.H., I.T.-B. Manuscript writing: A.K., A.P., M.K., I.T.-B., J.P.D. All authors have read and approved the manuscript. A.K. and T.N. contributed equally.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
COMPETING INTERESTS Open Access This article is licensed under a Creative Commons The authors declare no competing financial interests, but the following competing Attribution 4.0 International License, which permits use, sharing, non-financial interests have been declared: J.P.D. is cofounder and shareholder in adaptation, distribution and reproduction in any medium or format, as long as you give Cray Innovation AB. appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless ADDITIONAL INFORMATION indicated otherwise in a credit line to the material. If material is not included in the Supplementary information The online version contains supplementary material article’s Creative Commons license and your intended use is not permitted by statutory available at https://doi.org/10.1038/s41523-022-00443-9. regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons. Correspondence and requests for materials should be addressed to Anna Kostecka, org/licenses/by/4.0/. Tomasz Nowikiewicz or Arkadiusz Piotrowski. Reprints and permission information is available at http://www.nature.com/ © The Author(s) 2022 reprints npj Breast Cancer (2022) 76 Published in partnership with the Breast Cancer Research Foundation

Journal

npj Breast Cancer, Springer Journals

Published: Jun 29, 2022
