Evaluation of DNA Markers for Preparing a Genetic Database for Identification in Crimes and Incidents

Natural disasters such as floods and earthquakes, as well as manmade ones such as wars, accidents, and plane crashes, may take the lives of many people and leave their bodies unidentifiable; genetic techniques can be used to identify them. Genetic identification is also one way to identify a criminal from evidence left at a crime scene. Unidentified bodies can be identified by matching their genetic information against a genetic database or against the profiles of the victims' parents, children, brothers, or sisters. This technology rests on the principle that differences between individuals result from differences in the genetic information held in their DNA. Therefore, a genetic database for a target population would prevent the non-identification of missing persons whenever necessary.[1-3] About 99.9% of every person's DNA is the same; only about one nucleotide per 1000 differs between individuals. These differences allow genetic tests to distinguish individuals from each other.[3-6] Variable genomic regions that occur frequently and are scattered throughout the genome serve as genetic markers in various forms. Several types of these DNA markers can be used for identification purposes, and the most important examples are discussed below.


Introduction
Natural disasters such as floods and earthquakes, as well as manmade ones such as wars, accidents, and plane crashes, may take the lives of many people and leave their bodies unidentifiable; genetic techniques can be used to identify them. Genetic identification is also one way to identify a criminal from evidence left at a crime scene. Unidentified bodies can be identified by matching their genetic information against a genetic database or against the profiles of the victims' parents, children, brothers, or sisters. This technology rests on the principle that differences between individuals result from differences in the genetic information held in their DNA. Therefore, a genetic database for a target population would prevent the non-identification of missing persons whenever necessary.[1-3] About 99.9% of every person's DNA is the same; only about one nucleotide per 1000 differs between individuals. These differences allow genetic tests to distinguish individuals from each other.[3-6] Variable genomic regions that occur frequently and are scattered throughout the genome serve as genetic markers in various forms. Several types of these DNA markers can be used for identification purposes, and the most important examples are discussed below.

Variable Number of Tandem Repeats
Variable number of tandem repeats (VNTRs) are DNA fragments composed of repetitive units of 7-25 nucleotides joined end to end. Their core unit is repeated about 50 times, giving a fragment about 1 kb in length. The marker takes its name from the variation in the number of repetitive units. Technically, this marker is analyzed with synthetic probes, radioactive labeling, and Southern blotting. The most important advantage of using VNTRs for identification purposes is the high polymorphism of these regions, which gives rise to multiple alleles at a single locus (4-7). VNTR polymorphisms are often located near telomeres, and their repeat core consists of 6 to 100 bp. The core sequence in several of these alleles is repeated thousands of times. The variable number of repeats creates alleles ranging in size from 500 to 30 000 bp. VNTRs were the first polymorphisms used to create DNA profiles.
After some time, however, their use was limited by the need for large amounts of high-molecular-weight DNA and the difficulty of interpreting the results. For these reasons, short tandem repeats (STRs) are now used instead of VNTRs (Figure 1).[5,6]

Mitochondrial DNA
Unlike nuclear DNA, which is passed to a child in equal parts from both parents, mitochondria are inherited solely from the mother, because the mitochondria of the sperm cell are not passed on to the embryo. Thus, mitochondria are transmitted only through the egg cell. Therefore, mitochondrial DNA (mtDNA) is investigated only to study maternal inheritance. Since mtDNA is transmitted unchanged (except in the case of mutations) from mothers to children and from daughters to the next generation, one can determine the genetic relationship between a grandmother and a granddaughter even when the mother is not available.[2] Currently, mtDNA is widely used in identification science for two reasons. First, the D-loop region is highly polymorphic and very suitable for identification tests. Secondly, although mtDNA accounts for less than 1% of cellular DNA, its genes are present in a large number of copies. If there are approximately 200 mitochondria in each cell, there are more than 400 copies of each mitochondrial gene for every nuclear gene, which is present in only 1 or 2 copies. Thus, given the low levels of nuclear DNA in specimens such as hair and sperm, and in bone and tooth specimens where the nuclear DNA is often degraded, the use of mtDNA often produces good results. The mtDNA sequence is also used in archeological studies (Figure 2).[3,4]

Single Nucleotide Polymorphism
Variations that arise from a difference at a single nucleotide position are called single nucleotide polymorphisms (SNPs). They are abundant in the human genome; there is almost one SNP per 400 nucleotides.[2] SNPs have two advantages for identification purposes. First, they occur at a higher frequency in the genome (millions) than the STR loci (1 or 2 copies). Secondly, because of their smaller size (less than 100 bp), SNPs give better results than STRs when degraded DNA specimens (such as old tissue) are studied. However, their major disadvantage is that, unlike STRs, which have many alleles at each locus, most SNPs have only two alleles. Therefore, each SNP has limited power of discrimination on its own, and about 30-40 SNPs should be analyzed for identification purposes.[2,7,8] SNPs are the simplest type of DNA polymorphism: a difference in a single nucleotide of the DNA sequence.[9,10] The structure of a typical SNP is shown in Figure 3.
This figure shows two alleles that differ only in the one nucleotide marked with a star. The fourth positions in alleles G and A are guanine and adenine, respectively. In most cases, a mutation at a particular locus, which creates an SNP, produces two alleles.
SNPs are formed in cells that undergo meiosis due to the occurrence of mutations during DNA replication.[11] Certain genomic regions have more SNPs than others. For example, chromosome 1 has an average of one SNP per 1.45 kb, while chromosome 19 has one SNP per 2.18 kb.[12] SNPs normally have two alleles, e.g., a guanine allele and an adenine allele. Therefore, a single SNP is not a highly informative polymorphism and, on its own, does not appear to be suitable for forensic analysis. However, SNPs are abundant throughout the genome, and it is theoretically possible to type hundreds of them, which significantly increases the total power of discrimination. Matching the resolution of 10 STRs appears to require 50 to 80 SNPs, which are far more difficult to analyze than 10 STRs.[2,13,14]

The Significance of SNPs in Forensic Genetics
One constant challenge in forensic work is the analysis of severely degraded DNA, such as that extracted from bone and tooth specimens of individuals who have long been dead.[3][4][5] Although the use of mtDNA has advantages, its sequence variation is lower than that of STRs, and maternal inheritance makes it uninformative for some genetic relationships, such as the father-daughter relationship. In addition, mtDNA analysis is a time-consuming process.[15] One way to reduce the amplicon length is to redesign the primers of known STRs; the resulting markers are generally referred to as mini-STRs.[16] Although STRs are widely used in forensic laboratories worldwide and many commercial kits are available for this purpose, interest in SNPs continues to grow, which could be due to their lower mutation rate and the possibility of analyzing degraded DNA specimens.[17,18]
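The trade-off between biallelic SNPs and multiallelic STRs discussed above can be made concrete with a small calculation. The following Python sketch is an illustration only, under assumptions that are not taken from the source: Hardy-Weinberg proportions, independent loci, SNP allele frequencies of exactly 0.5, and a hypothetical STR locus with 8 equally frequent alleles. Real loci have uneven allele frequencies, so more SNPs are needed in practice than in this idealized case.

```python
import math
from itertools import combinations_with_replacement


def genotype_match_probability(allele_freqs):
    """Chance that two unrelated people share a genotype at one locus.

    Assumes Hardy-Weinberg proportions: a homozygote has frequency p*p,
    a heterozygote 2*p*q. The match probability is the sum of the
    squared genotype frequencies.
    """
    total = 0.0
    indices = range(len(allele_freqs))
    for i, j in combinations_with_replacement(indices, 2):
        p, q = allele_freqs[i], allele_freqs[j]
        genotype_freq = p * p if i == j else 2 * p * q
        total += genotype_freq ** 2
    return total


def loci_needed(per_locus_probability, target_probability):
    """Number of independent, identical loci needed to reach the target."""
    return math.ceil(math.log(target_probability) / math.log(per_locus_probability))


# Illustrative assumptions, not data from the article.
snp_pm = genotype_match_probability([0.5, 0.5])       # ideal biallelic SNP
str_pm = genotype_match_probability([1 / 8] * 8)      # hypothetical 8-allele STR
ten_str_pm = str_pm ** 10                             # 10 independent STR loci
snps_needed = loci_needed(snp_pm, ten_str_pm)

print(f"Match probability, one SNP: {snp_pm:.3f}")
print(f"Match probability, one STR: {str_pm:.4f}")
print(f"Match probability, 10 STRs: {ten_str_pm:.2e}")
print(f"SNPs needed to match 10 STRs: {snps_needed}")
```

Even in this best case, several dozen SNPs are required to equal 10 STRs; with less balanced allele frequencies the number rises, which is consistent with the larger figures quoted in the text.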

Short Tandem Repeats
The polymerase chain reaction (PCR)-based analysis of STRs is one of the most important methods used in identification labs. The amplification of STRs in genomic DNA, at what are now known as STR loci, is an integral part of the research and analyses performed in identification labs. STRs are similar to VNTRs in that they consist of tandemly repeated core units, but the length of these units in STRs is 2-6 bp.[2,7,9,19] These core units may be repeated up to 40 times at a locus, but the most common STRs are those in which the unit is repeated 7 to 15 times. STRs are now the most commonly analyzed genetic polymorphisms in forensic genetics. These sequences were first introduced to forensic genetics in the mid-1990s, and they are now used as the main tool in forensic laboratories.[20] There are thousands of STRs that can be used in forensic analysis. These sequences are scattered throughout the genome, on the 22 autosomes as well as the X and Y chromosomes. Their core consists of 1 to 6 nucleotides, and their total length is generally between 50 and 300 bp. The major loci used in forensic genetics are tetranucleotide repeats.[21,22] STRs fully meet the requirements of a forensic marker. A wide range of biological materials can be analyzed using STRs, and the results obtained in different laboratories are readily comparable. STRs have a very high power of discrimination, especially when several of them are typed simultaneously. They are also highly sensitive, and a small number of cells is sufficient for a successful analysis. Creating STR profiles is easy and cost-efficient. Moreover, many STRs throughout the genome are not under any selective pressure.[2,23,24] Although a large number of STRs are known, only about 20 of them are used in forensic work.[2,9,19,25] STR markers widely used in forensic genetics have core units of 4 or 5 bp.[2,26]

The repetitive core may consist of 1 to 6 nucleotides. This example illustrates the structure of two alleles of the D8S1179 locus. The alleles are named according to their number of repeats.
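The naming convention described here, in which an allele is designated by its number of repeats, can be sketched in a few lines of Python. This is a minimal illustration rather than a genotyping method: the motif TCTA is used as an example tetranucleotide unit, the flanking sequences are invented, and the function name is hypothetical. In practice, allele designations are obtained from fragment sizes measured against an allelic ladder.

```python
def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    step = len(motif)
    best = 0
    for start in range(len(sequence) - step + 1):
        run = 0
        position = start
        # Walk forward one motif length at a time while the motif repeats.
        while sequence.startswith(motif, position):
            run += 1
            position += step
        best = max(best, run)
    return best


# Hypothetical flanking sequences and an illustrative 4-bp core unit.
LEFT_FLANK = "GGATC"
RIGHT_FLANK = "CCTGA"
MOTIF = "TCTA"

allele_a = LEFT_FLANK + MOTIF * 12 + RIGHT_FLANK
allele_b = LEFT_FLANK + MOTIF * 13 + RIGHT_FLANK

name_a = longest_tandem_run(allele_a, MOTIF)
name_b = longest_tandem_run(allele_b, MOTIF)
size_difference = len(allele_b) - len(allele_a)

print(f"Allele designations: {name_a} and {name_b}")
print(f"Size difference between the alleles: {size_difference} bp")
```

The two alleles differ by exactly one core unit, so their lengths differ by 4 bp; this size difference is what electrophoresis resolves during profiling.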
One of the essential characteristics of any STR used in forensic analysis is that it can be evaluated and analyzed in a multiplex reaction with other STRs.[2,27,29] In the United Kingdom, the Forensic Science Service (FSS) designed the first system for STR typing in forensic analysis. Subsequently, this system was replaced by the second-generation multiplex (SGM). Two commercial companies, Applied Biosystems and Promega, produce a series of multiplexes that are already used in many laboratories. The AmpFlSTR® SGM Plus, produced by Applied Biosystems, has now replaced SGM in the United Kingdom. In the United States, STR technology was introduced to the forensic field by selecting 13 loci as the core loci of the Combined DNA Index System (CODIS).[19,30] Each of these markers has capabilities and limitations, and the choice among them depends on the intended application of the database. Collecting genetic information for the preparation of a genetic database includes the stages of DNA extraction, quality control, PCR, profiling, and database creation.

DNA Extraction
The first step in preparing a genetic database is to sample the target population and purify DNA from the blood specimens obtained.[4][5][6][7][8][9][10]

Quality Control
After purification and before the PCR test, specimens should undergo quantitative and qualitative control. A NanoPhotometer can be used for this purpose. This device can detect very low levels of DNA, RNA, and protein by ultraviolet spectroscopy. Another method for quantifying DNA is real-time PCR.[4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20]

Polymerase Chain Reaction
Once the quality control step has been completed, the number of copies of the target sequences (here, STRs) is increased by PCR.[22][23][24][25][26][27][28][29][30]

Profiling
Specimens are run on a genetic analyzer. This system is based on capillary electrophoresis and an argon laser.[22][23][24][25]

Database
The last step is the preparation of the genetic database from the specimen profiles. It is composed of 2 parts. The first part contains the profiles of known specimens prepared from individuals working in high-risk occupations. The second part contains the profiles of specimens from unidentified bodies that are taken from accident scenes.[27][28][29][30]
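The two-part database described above can be illustrated with a minimal Python sketch in which the profile of an unidentified specimen is compared with the stored profiles of known individuals. Everything in it is an assumption made for illustration: the record identifiers, the genotypes, the function names, and the rule that at least three shared loci must all agree. The locus names are commonly used STR loci; a degraded specimen is represented by a partial profile with one locus missing. An operational system would also have to handle mixtures, typing errors, and kinship searches.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[float, float]      # two alleles, named by repeat number
Profile = Dict[str, Genotype]       # locus name -> genotype


def normalize(profile: Profile) -> Profile:
    """Sort the alleles at each locus so (13, 12) equals (12, 13)."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in profile.items()}


def compare(reference: Profile, unknown: Profile) -> Tuple[int, int]:
    """Return (matching loci, compared loci) over the loci typed in both."""
    ref, unk = normalize(reference), normalize(unknown)
    shared = ref.keys() & unk.keys()
    matching = sum(1 for locus in shared if ref[locus] == unk[locus])
    return matching, len(shared)


def search(database: Dict[str, Profile], unknown: Profile,
           min_loci: int = 3) -> List[str]:
    """List the records that agree with the unknown at every shared locus."""
    hits = []
    for person_id, reference in database.items():
        matching, compared = compare(reference, unknown)
        if compared >= min_loci and matching == compared:
            hits.append(person_id)
    return hits


# Part 1: hypothetical reference profiles of known individuals.
known_profiles = {
    "REF-001": {"D8S1179": (12, 13), "D21S11": (29, 30),
                "TH01": (6, 9.3), "FGA": (21, 24)},
    "REF-002": {"D8S1179": (10, 13), "D21S11": (28, 30),
                "TH01": (7, 9), "FGA": (22, 22)},
}

# Part 2: hypothetical partial profile from an unidentified specimen.
unidentified = {"D8S1179": (13, 12), "TH01": (9.3, 6), "FGA": (24, 21)}

hits = search(known_profiles, unidentified)
print(hits)
```

Requiring a minimum number of compared loci guards against declaring a match on the basis of one or two loci, where coincidental agreement between unrelated people is likely.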

Conclusions
Based on the foregoing, it is recommended that genetic information be extracted from blood specimens obtained from a target population and stored as a genetic database. Then, in case of accidents, the genetic information of the unknown specimens should be extracted and matched against the genetic database. This method is currently being implemented for the entire population in some countries around the world. In Iran, there is no genetic database. Thus, DNA extracted from an unidentified specimen must be matched against specimens from the families of all missing individuals, and this is a very time-consuming process. Moreover, all family members may have died along with the unidentified individual; consequently, the specimen cannot be identified at all. Therefore, preparing a genetic database for individuals working in high-risk occupations, such as those in the armed forces, seems necessary to prevent non-identification. To this end, several types of genetic markers can be used, the most significant of which include VNTRs, STRs, SNPs, mtDNA, and Y-chromosome markers. Each of these markers can be used, depending on its limitations and capabilities and on the intended application of the database.
The advantages of a genetic database include:
1. A small amount of tissue specimen is sufficient for determining an individual's identity.
2. Using this method, bone specimens discovered long after death can be used for identification purposes, and this information will not be lost over time.
3. Identification is fast.
4. The genetic information of any person is always accessible and cannot be forged; thus, it is considered to be the best identification method in criminal cases.
Genetic identification is the best and most accurate method for identifying unidentified individuals; thus, the preparation of genetic databases for high-risk occupations seems to be essential.

Figure 1. Structure of 2 Alleles at VNTR Locus D1S7. Both of these alleles are relatively short and have 104 and 136 repeats, respectively. The alleles at this locus can have up to 2000 repeats. The repeated core unit of the alleles is 9 bp in length. The image was taken from the book "An Introduction to Forensic Genetics".[2]

Figure 2. Structure of the Mitochondrial DNA and its D-Loop Sequence.[3]

Figure 3. Single Nucleotide Polymorphism. Taken from the book "An Introduction to Forensic Genetics".[2]