Table 1.

Sequence composition of modified HMPR and UF libraries.

 
                                             Modified HMPR
Libraries             B73 Husk                 B73 Root                 Mo17 Root                    UF
                      No.     Mb      %        No.     Mb      %        No.     Mb      %        No.     Mb      %
454 reads         391,778   65.6           470,918  101.2         1,284,692  236.7           543,385  130.9
Total             479,565   63.6    100    771,557   97.6    100  1,937,032  225.5    100    543,350  130.9    100
Chloroplast         3,771    0.6    0.8      5,567    0.9    0.7     30,835    4.1    1.6      3,118    0.8    0.6
Mitochondrial       1,319    0.2    0.3     20,332    3.0    2.6    224,593   29.7   11.6      5,493    1.4    1.0
Non-maize           6,829    0.9    1.4    530,876   67.4   68.8    454,413   49.1   23.5     41,149    9.8    7.6
Repeats           150,786   21.7   31.4     34,378    5.2    4.5     75,225    9.3    3.9    343,072   83.8   63.1
Non-repeats       316,860   40.2   66.1    180,404   21.1   23.4  1,151,966  133.3   59.5    150,518   35.1   27.7
The number of sequences in each category expressed as a percentage of the total number of sequences.
Sequencing reads generated on the 454 GS FLX.
§454 reads from modified HMPR libraries were in silico digested with HpaII, and only sequences ≥40 bp were kept and BLAT searched against nucleotide databases. 454 reads from the UF library were not in silico digested with HpaII, and only sequences ≥40 bp were kept and BLAT searched against nucleotide databases.
Sequences that did not significantly match any of the screened plant nucleotide, organellar, or repeat databases. All of these sequences were classified as putatively non-maize with the majority of unknown or bacterial origin.
#Sequences from the maize nuclear genome that significantly matched to The Institute for Genomic Research (TIGR) Maize Repeat version 4 database, which consists of characterized, uncharacterized, and predicted repeats.
††Sequences from putatively non-repetitive regions of the maize genome with significant matches to the Maize Assembled Gene Islands Version 4.0 Contigs and Singletons (MAGIv4.0 C&S) database, sorghum or rice genome sequences, or the Dana-Farber Cancer Institute (DFCI) maize gene index.
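
The in silico digestion described in the footnotes can be illustrated with a short Python sketch. It assumes the standard HpaII recognition site CCGG with the cut after the first C (C^CGG) and applies the ≥40 bp length filter from the footnote. The function name `digest_hpaii` and the example read are hypothetical, and the tool the authors actually used is not stated in this section.

```python
import re
from typing import List

HPAII_SITE = "CCGG"  # HpaII recognition sequence; the enzyme cuts after the first C (C^CGG)
MIN_LENGTH = 40      # footnote threshold: only sequences >= 40 bp were kept


def digest_hpaii(read: str, min_length: int = MIN_LENGTH) -> List[str]:
    """Cut a read at every CCGG site and keep fragments of at least min_length bp."""
    read = read.upper()
    # One cut position per recognition site, one base into the site.
    cuts = [m.start() + 1 for m in re.finditer(HPAII_SITE, read)]
    bounds = [0] + cuts + [len(read)]
    fragments = [read[a:b] for a, b in zip(bounds, bounds[1:])]
    return [f for f in fragments if len(f) >= min_length]


# Hypothetical read with two HpaII sites: the 14 bp middle fragment is discarded.
example = "A" * 45 + "CCGG" + "T" * 10 + "CCGG" + "G" * 50
print([len(f) for f in digest_hpaii(example)])  # [46, 53]
```

Because one read can yield several fragments that pass the length filter, a digestion of this kind can produce more sequences than input reads, which may be why the Total counts exceed the 454 read counts for the modified HMPR libraries.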



Table 2.

Summary of putative SNPs and call rates at various coverage depths and quality value thresholds with and without implementation of the paralog distinguishing list (PDL).

 
                              With PDL                                             Without PDL
                  B73                    Mo17                           B73                     Mo17
CD   Q       SNPs      Kb  Rate     SNPs      Kb  Rate    FDR       SNPs      Kb  Rate     SNPs      Kb  Rate    FDR
≥1X  All   11,904  19,515  0.61  126,683  31,435  4.03  15.1%     50,936  21,675  2.35  174,476  34,756  5.02  46.8%
     ≥20   10,701  18,450  0.58  119,294  29,675  4.02  14.4%     47,343  20,495  2.31  164,904  32,719  5.04  45.8%
     ≥30    8,955  16,282  0.55  106,475  25,843  4.12  13.3%     39,910  17,897  2.23  147,335  28,553  5.16  43.2%
     ≥35    5,703  11,182  0.51   86,830  18,794  4.62  11.0%     23,149  12,057  1.92  119,465  20,813  5.74  33.4%
     ≥40    2,352   5,470  0.43   62,966  13,036  4.83   8.9%     10,378   5,830  1.78   85,547  14,451  5.92  30.1%
     ≥50    1,609   4,349  0.37   57,205  11,603  4.93   7.5%      6,832   4,679  1.46   77,688  12,884  6.03  24.2%
     ≥60      879   2,747  0.32   45,610   9,346  4.88   6.6%      3,724   2,956  1.26   61,991  10,384  5.97  21.1%
     ≥70      634   2,113  0.30   39,787   8,153  4.88   6.1%      2,651   2,266  1.17   54,279   9,062  5.99  19.5%
≥2X  All    2,072   5,054  0.41   61,584  12,543  4.91   8.4%      9,048   5,451  1.66   83,547  13,925  6.00  27.7%
     ≥20    2,057   5,017  0.41   61,527  12,531  4.91   8.4%      9,017   5,465  1.65   83,475  13,913  6.00  27.5%
     ≥30    2,031   5,078  0.40   61,300  12,485  4.91   8.1%      8,910   5,433  1.64   83,173  13,862  6.00  27.3%
     ≥40    1,953   4,883  0.40   60,573  12,337  4.91   8.1%      8,529   5,298  1.61   82,169  13,695  6.00  26.8%
     ≥50    1,609   4,349  0.37   57,205  11,603  4.93   7.5%      6,832   4,679  1.46   77,688  12,884  6.03  24.2%
     ≥60      879   2,747  0.32   45,610   9,346  4.88   6.6%      3,724   2,956  1.26   61,991  10,384  5.97  21.1%
     ≥70      634   2,113  0.30   39,787   8,153  4.88   6.1%      2,651   2,266  1.17   54,279   9,062  5.99  19.5%
≥3X  All      702   2,127  0.33   37,980   7,783  4.88   6.8%      3,127   2,282  1.37   51,769   8,657  5.98  22.9%
     ≥20      699   2,118  0.33   37,975   7,782  4.88   6.8%      3,124   2,280  1.37   51,763   8,656  5.98  22.9%
     ≥30      697   2,112  0.33   37,966   7,780  4.88   6.8%      3,114   2,273  1.37   51,751   8,654  5.98  22.9%
     ≥40      689   2,153  0.32   37,912   7,769  4.88   6.6%      3,088   2,271  1.36   51,681   8,642  5.98  22.7%
     ≥50      679   2,122  0.32   37,833   7,753  4.88   6.6%      3,047   2,257  1.35   51,572   8,624  5.98  22.6%
     ≥60      649   2,028  0.32   37,448   7,690  4.87   6.6%      2,899   2,196  1.32   51,044   8,550  5.97  22.1%
     ≥70      529   1,763  0.30   35,417   7,272  4.87   6.2%      2,299   1,900  1.21   48,339   8,097  5.97  20.3%
≥4X  All      322   1,039  0.31   24,454   5,084  4.81   6.4%      1,452   1,108  1.31   33,403   5,662  5.90  22.2%
     ≥20      319   1,029  0.31   24,454   5,084  4.81   6.4%      1,449   1,115  1.30   33,402   5,661  5.90  22.0%
     ≥30      318   1,026  0.31   24,454   5,084  4.81   6.4%      1,445   1,112  1.30   33,402   5,661  5.90  22.0%
     ≥40      317   1,057  0.30   24,451   5,083  4.81   6.2%      1,443   1,110  1.30   33,399   5,661  5.90  22.0%
     ≥50      316   1,053  0.30   24,443   5,082  4.81   6.2%      1,437   1,105  1.30   33,391   5,659  5.90  22.0%
     ≥60      313   1,043  0.30   24,430   5,079  4.81   6.2%      1,426   1,105  1.29   33,368   5,656  5.90  21.9%
     ≥70      311   1,037  0.30   24,356   5,064  4.81   6.2%      1,405   1,098  1.28   33,272   5,639  5.90  21.7%
CD, coverage depth. The number of reads with an aligned base that supported the consensus base call.
Q, quality values. Quality values were computed using the 454 base-calling software (single reads) or the CAP3 assembly program (multiple sequence alignments).
§The number of kilobases (Kb) of HpaII consensus sequence that aligned to MAGIv4.0 C&S database.
The number of SNPs called per Kb of HpaII consensus sequence (SNPs/Kb).
#The percent false discovery rate (FDR) at each coverage depth was calculated by dividing the B73 call rate by the Mo17 call rate and multiplying by 100.
††No filtering on Q values.
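
As a worked check of the Rate and FDR definitions in the footnotes, the following Python sketch recomputes the first row of the table (≥1X coverage depth, all Q values, with PDL) from its SNP and Kb counts. The function names are hypothetical. Rates are left unrounded before taking the ratio, which is an assumption about the order of rounding.

```python
def call_rate(snps: int, kb: float) -> float:
    """SNPs called per Kb of aligned HpaII consensus sequence."""
    return snps / kb


def fdr_percent(b73_rate: float, mo17_rate: float) -> float:
    """Percent FDR: the B73 call rate divided by the Mo17 call rate, times 100."""
    return b73_rate / mo17_rate * 100


# Row ">=1X, All Q values, with PDL" of Table 2.
b73 = call_rate(11_904, 19_515)
mo17 = call_rate(126_683, 31_435)
print(round(b73, 2), round(mo17, 2), round(fdr_percent(b73, mo17), 1))  # 0.61 4.03 15.1
```

The calculation treats B73 as the control: B73 reads aligned to B73-derived reference sequence should yield no true SNPs, so the B73 call rate estimates the rate of false calls.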



Table 3.

Summary of B73/Mo17 454 SNP validation.

 
              With PDL   Without PDL
Panzea SNPs        724           724
HpaII SNPs         523           720
Shared SNPs        449           586
HpaII FDR        14.1%         18.6%
HpaII Power      62.0%         80.9%
The number of identified Panzea SNPs that mapped to Mo17 HpaII consensus sequences.
The number of B73/Mo17 HpaII SNPs identified via 454 pyrosequencing that mapped to Panzea sequences. These mapped SNPs are a subset of the 126,683 putative SNPs (≥1X coverage depth; All Q values) that were called using the paralog distinguishing list (PDL).
§The number of B73/Mo17 HpaII SNPs identified via 454 pyrosequencing that mapped to Panzea sequences. These mapped SNPs are a subset of the 174,476 putative SNPs (≥1X coverage depth; All Q values) that were called without using the paralog distinguishing list (PDL).
SNPs that were identified in both the B73/Mo17 HpaII SNP and Panzea SNP datasets.
#We assumed that all SNPs called from the Panzea sequence dataset were true SNPs. The percent false discovery rate (FDR) was calculated as [1 − (449/523)] × 100 and [1 − (586/720)] × 100.
††We assumed that all SNPs in the Panzea sequence dataset were identified. Power was calculated as [(449/724)*100] and [(586/724)*100].
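
The validation FDR and power values in the table can be reproduced from the SNP counts with a few lines of Python. This is a minimal sketch with hypothetical function names. It reads the FDR formula as subtracting the shared fraction from 1 before multiplying by 100, which is the reading that matches the tabulated percentages.

```python
def validation_fdr(shared: int, called: int) -> float:
    """Percent of HpaII SNP calls not confirmed by a Panzea SNP."""
    return (1 - shared / called) * 100


def validation_power(shared: int, reference: int) -> float:
    """Percent of Panzea SNPs recovered among the HpaII SNP calls."""
    return shared / reference * 100


# Counts from Table 3: (Panzea SNPs, HpaII SNPs, shared SNPs).
for label, panzea, hpaii, shared in [("With PDL", 724, 523, 449),
                                     ("Without PDL", 724, 720, 586)]:
    print(label,
          round(validation_fdr(shared, hpaii), 1),
          round(validation_power(shared, panzea), 1))
# With PDL 14.1 62.0
# Without PDL 18.6 80.9
```

The two rows show the trade-off: applying the PDL lowers the FDR but also lowers the power.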