N. Brousseau (2), R. Brousseau (1), J. W. A. Salt (2), L. Gutz (2), and M. D. B. Tucker (2)
(1) Institut de Recherche en Biotechnologie, 6100 Avenue Royalmount, Montréal, Québec H4P 2R2, Canada.
(2) The other authors are with the Electronic Warfare Division, Communications Electronic Warfare Section, Defence Research Establishment, Ottawa, 3701 Carling Avenue, Ottawa, Ontario K1A 0Z4, Canada.
N. Brousseau, R. Brousseau, J. W. A. Salt, L. Gutz, and M. D. B. Tucker, "Analysis of DNA sequences by an optical time-integrating correlator," Appl. Opt. 31, 4802-4815 (1992)
The analysis of the molecular structure called DNA is of particular interest for the understanding of the basic processes governing life. Correlation techniques implemented on digital computers are currently used to do this analysis, but the process is so slow that the mapping and sequencing of the entire human genome requires a computational breakthrough. This paper presents a new method of performing the analysis of DNA sequences with an optical time-integrating correlator. The method is characterized by short processing times that make the analysis of the entire human genome a tractable enterprise. A processing strategy and the resultant processing times are presented. Experimental proofs of concept for the two types of analysis specified by the strategy are also included.
Representations use 255-bit maximum-length pseudorandom sequences, which are designated by their octal and polynomial representations (Ref. 35).
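A 255-bit maximum-length sequence can be produced by an 8-stage linear feedback shift register. The sketch below is illustrative only: the polynomial 435 (octal), that is x^8 + x^4 + x^3 + x^2 + 1, is a well-known primitive polynomial chosen here as an assumption and is not necessarily one of those used in the paper, and the function name `lfsr_msequence` is hypothetical.

```python
def lfsr_msequence(poly_octal: str, degree: int = 8) -> list[int]:
    """Return one full period of the sequence generated by a Fibonacci LFSR.

    poly_octal is the octal designation of the feedback polynomial,
    e.g. "435" -> binary 100011101 -> x^8 + x^4 + x^3 + x^2 + 1.
    (Assumed example polynomial; not taken from the paper.)
    """
    poly = int(poly_octal, 8)
    # Exponents below `degree` whose coefficient is 1 become the feedback taps.
    taps = [i for i in range(degree) if (poly >> i) & 1]
    state = [1] + [0] * (degree - 1)  # any non-zero seed works
    out = []
    for _ in range(2**degree - 1):
        out.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        # Shift by one position and append the new bit.
        state = state[1:] + [feedback]
    return out


seq = lfsr_msequence("435")
print(len(seq), sum(seq))  # 255 bits, 128 of them ones
```

If the polynomial is primitive, every non-zero 8-bit window appears exactly once per period, which is what gives such sequences their sharp, nearly two-valued autocorrelation.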
Table 3
Analysis Time for a 50 × 10^6 Base Database as a Function of the Number of Bases in the Query Sequence (a)

| Query-Sequence Length | Bit Repetition | Bit Rate (MHz) | Duration of Query Sequence (μs) | Analysis Time for 1 Phase (s) | No. of Time Shifts | Total Analysis Time (s) |
|---|---|---|---|---|---|---|
| 12–14 | 2 | 1 | 168–196 | 700 | 1 | 700 |
| 15–28 | 1 | 1 | 105–196 | 350 | 1 | 350 |
| 29–42 | 1 | 2 | 102–147 | 175 | 1 | 175 |
| 43–57 | 1 | 3 | 100–133 | 116 | 1 | 116 |
| 58–71 | 1 | 4 | 101–124 | 86 | 1 | 86 |
| 72–142 | 1 | 5 | 101–199 | 70 | 1 | 70 |
| 143–214 | 1 | 10 | 100–150 | 35 | 1 | 35 |
| 215–285 | 1 | 15 | 100–133 | 23 | 1 | 23 |
| 286–428 | 1 | 20 | 100–150 | 18 | 1 | 18 |
| 429–10^5 | 1 | 30 | 100–23,333 | 12 | 1–117 | 12–1404 |

(a) Query-sequence lengths (number of bases) between 12 and 10,000 are illustrated. The analysis time for query sequences longer than 857 bases is proportional to the length of the query sequence, because for a longer query sequence more time shifts are required to find the correlation peak.
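The arithmetic behind Table 3 can be sketched as follows. Two constants are assumptions inferred from the table's own numbers rather than stated in this excerpt: 7 bits per base (12 bases × 2 repetitions × 7 bits at 1 MHz gives the listed 168), and a 200-microsecond limit on the query duration covered by one time shift (857 bases at 30 MHz last just under 200 microseconds). The sketch also assumes the duration column is in microseconds. All function names are hypothetical.

```python
import math

# Assumptions inferred from the numbers in Table 3, not stated in the text:
BITS_PER_BASE = 7       # bits used to encode one base
WINDOW_US = 200.0       # longest query duration handled by a single time shift
DATABASE_BASES = 50e6   # database size given in the table caption


def query_duration_us(n_bases, bit_rate_mhz, bit_repetition=1):
    """Duration of the encoded query sequence in microseconds."""
    return n_bases * BITS_PER_BASE * bit_repetition / bit_rate_mhz


def phase_time_s(bit_rate_mhz, bit_repetition=1):
    """Time in seconds to stream the whole encoded database once."""
    return DATABASE_BASES * BITS_PER_BASE * bit_repetition / (bit_rate_mhz * 1e6)


def time_shifts(n_bases, bit_rate_mhz, bit_repetition=1):
    """Number of shifts needed so that every part of the query is covered."""
    duration = query_duration_us(n_bases, bit_rate_mhz, bit_repetition)
    return max(1, math.ceil(duration / WINDOW_US))


def total_time_s(n_bases, bit_rate_mhz, bit_repetition=1):
    """One database pass per time shift."""
    return phase_time_s(bit_rate_mhz, bit_repetition) * time_shifts(
        n_bases, bit_rate_mhz, bit_repetition
    )


for n_bases, rate, rep in [(12, 1, 2), (28, 1, 1), (857, 30, 1), (100_000, 30, 1)]:
    print(
        n_bases,
        round(query_duration_us(n_bases, rate, rep)),
        round(phase_time_s(rate, rep), 1),
        time_shifts(n_bases, rate, rep),
        round(total_time_s(n_bases, rate, rep)),
    )
```

Under these assumptions the sketch reproduces most of the listed entries, but not every rounded value. For example, a 30 MHz pass comes out at about 11.7 s, where the table lists 12 s, so the 117-shift total is about 1365 s rather than 1404 s.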
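Table 4
Correlations Produced by a Fine Analysis: Identical Segment (a)
(a) This analysis is performed with a seven-base query sequence contained in a database that is 20 bases long in which a segment is identical to the query sequence. The region where a match is found is between positions 4 and 10 of the database.

| Position | t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8 | t9 | t10 | t11 | t12 | t13 | t14 | t15 | t16 | t17 | t18 | t19 | t20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Database | T | G | A | C | N | C | A | N | N | G | T | G | A | N | N | A | G | G | G | T |

| Position | s1 | s2 | s3 | s4 | s5 | s6 | s7 |
|---|---|---|---|---|---|---|---|
| Query sequence | C | N | C | A | N | N | G |

Are the bases identical?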
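The identical-segment case of Table 4 can be illustrated with a digital sliding correlation. This is only a sketch of the matching idea, not the optical encoding used in the paper: each symbol is turned into a 0/1 indicator sequence, the indicator channels are cross-correlated, and the results are summed, so the value at each shift equals the number of identical bases. The symbol N is treated as an ordinary letter, and all function names are hypothetical.

```python
def indicator(seq, symbol):
    """0/1 sequence marking where `symbol` occurs in `seq`."""
    return [1 if s == symbol else 0 for s in seq]


def correlate(signal, template):
    """Sliding dot product of `template` against `signal` (no padding)."""
    m = len(template)
    return [
        sum(signal[i + k] * template[k] for k in range(m))
        for i in range(len(signal) - m + 1)
    ]


def match_profile(database, query):
    """Sum of per-symbol correlations: identical bases at each shift."""
    profile = [0] * (len(database) - len(query) + 1)
    for symbol in sorted(set(query)):
        channel = correlate(indicator(database, symbol), indicator(query, symbol))
        profile = [p + c for p, c in zip(profile, channel)]
    return profile


database = "TGACNCANNGTGANNAGGGT"  # t1..t20 from Table 4
query = "CNCANNG"                  # s1..s7 from Table 4

profile = match_profile(database, query)
peak = max(profile)
start = profile.index(peak) + 1    # 1-based database position
print(start, start + len(query) - 1, peak)  # 4 10 7
```

The peak value of 7 at positions 4 to 10 corresponds to the full-length match described in the table note.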
Table 5
Correlations Produced by a Fine Analysis: Similar Segment (a)
(a) This analysis is performed with a seven-base query sequence contained in a database that is 20 bases long in which a segment is similar to the query sequence. The region where a match is found is between positions 4 and 10 of the database, with discrepancies at positions 6 and 9.

| Position | t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8 | t9 | t10 | t11 | t12 | t13 | t14 | t15 | t16 | t17 | t18 | t19 | t20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Database | T | G | A | C | N | T | A | N | A | G | T | G | A | N | N | A | G | G | G | T |

| Position | s1 | s2 | s3 | s4 | s5 | s6 | s7 |
|---|---|---|---|---|---|---|---|
| Query sequence | C | N | C | A | N | N | G |

Are the bases identical?
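For the similar-segment case of Table 5, the same position-by-position comparison can also report where the best alignment disagrees with the query. The sketch below is a plain digital stand-in under stated assumptions, not the paper's optical procedure: N is compared as an ordinary letter, and the helper name `best_alignment` is hypothetical.

```python
def best_alignment(database, query):
    """Return (start, score, mismatches) for the best-scoring shift.

    start and the mismatch positions are 1-based database positions;
    score is the number of identical bases at that shift.
    """
    m = len(query)
    scores = [
        sum(d == q for d, q in zip(database[i:i + m], query))
        for i in range(len(database) - m + 1)
    ]
    offset = max(range(len(scores)), key=scores.__getitem__)
    mismatches = [
        offset + k + 1 for k in range(m) if database[offset + k] != query[k]
    ]
    return offset + 1, scores[offset], mismatches


database = "TGACNTANAGTGANNAGGGT"  # t1..t20 from Table 5
query = "CNCANNG"                  # s1..s7 from Table 5

print(best_alignment(database, query))  # (4, 5, [6, 9])
```

Five of the seven bases agree at the best shift, and the two disagreements fall at database positions 6 and 9, matching the table note.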