“DNA sequencing is a method used for determining the order of nucleotides present in a particular DNA molecule.”

DNA sequencing methods available nowadays are quite accurate, fast and reliable. 

The Human Genome Project was completed in 2003; it took about 13 years to sequence our entire genome.

Today the scenario is different: whole-genome sequencing is now much faster and more robust, thanks to next-generation sequencing (NGS).

Using the latest technology, scientists have now sequenced many prokaryotic as well as eukaryotic genomes.

Sequencing a gene or a DNA sequence is easy, but the sequencing of the whole genome is a tedious and lengthy process. 

Determining the order of the nucleotides present in the entire haploid set of DNA is called genome sequencing or whole-genome sequencing.

In the present article, we will brief you on three of the best methods with which an entire genome can be sequenced.


But before that, let's quickly discuss some basics needed to understand the topic.

What is a genome? 

The entire human genome contains about 3.2 billion base pairs, of which only a small portion, commonly estimated at 1 to 2%, codes for proteins.

This tiny portion contains around 20,000 genes that encode specific structural or functional proteins.

A gene is a functional unit made up of a long chain of DNA nucleotides (adenine, thymine, cytosine and guanine).

DNA is present in all organisms; it is inherited from one generation to the next and thereby passes on phenotypes.

Genes are located on chromosomes. Human somatic cells carry 46 chromosomes, while gametes (sperm and egg cells) carry only 23, which is called a haploid set.

A haploid set of chromosomes, that is, a single set of chromosomes, is called a genome.

Sequencing all the DNA present on all 46 chromosomes would be a waste of time and money: because chromosomes are present in pairs, whole-genome sequencing only requires sequencing the haploid set, a single set of chromosomal DNA.


Steps in DNA sequencing: 

Note: This segment is a brief introduction to the different steps of DNA sequencing.

Sample collection


Any biological sample, such as blood, saliva, solid tissue, tumour tissue, plant tissue, a bacterial sample, amniotic fluid, chorionic villi or another body fluid, can be used for DNA sequencing.

DNA extraction


The next step in genome sequencing is to extract DNA from the collected sample.

For routine DNA sequencing, ready-to-use DNA extraction kits work well; for whole-genome sequencing, however, I advise using an extraction method with a high yield.


Quantification and purification of DNA


Quantify the DNA, and prefer a sample containing more than 200 ng of DNA with a purity (A260/A280 ratio) of around 1.80.
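The check described above can be sketched in a few lines of Python. This is only an illustration, assuming that the purity figure refers to the A260/A280 absorbance ratio and using the standard conversion of 50 ng/µL of double-stranded DNA per A260 unit. The function names and the accepted ratio window of 1.7 to 2.0 are assumptions made for this example, not values from a particular kit or instrument.

```python
# Standard conversion for double-stranded DNA: A260 of 1.0 is about 50 ng/uL.
DSDNA_NG_PER_UL_PER_A260 = 50.0

def dna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate the dsDNA concentration from the absorbance at 260 nm."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor

def purity_ratio(a260, a280):
    """A260/A280 ratio; a value near 1.8 is usually taken as 'pure' DNA."""
    return a260 / a280

def passes_qc(a260, a280, volume_ul, min_total_ng=200.0, ratio_range=(1.7, 2.0)):
    """Check the two criteria from the text: more than 200 ng and purity near 1.8.

    The ratio window (1.7 to 2.0) is an assumed tolerance for this sketch.
    """
    total_ng = dna_concentration_ng_per_ul(a260) * volume_ul
    low, high = ratio_range
    return total_ng > min_total_ng and low <= purity_ratio(a260, a280) <= high

# Hypothetical reading: A260 = 0.9, A280 = 0.5, 10 uL of sample.
print(dna_concentration_ng_per_ul(0.9))  # 45.0 ng/uL
print(passes_qc(0.9, 0.5, volume_ul=10))  # True
```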

If the DNA sample is not pure, purify it using the alcohol purification method or a ready-to-use DNA purification kit.

DNA purification is a crucial step in DNA sequencing because inhibitors present in the sample may hinder sequencing.

Store the DNA sample at 4°C for short periods, or at -20°C for longer periods. Check out this sample storage guide of ours: DNA Sample Storage: What To Do And What Not To Do.

DNA fragmentation and library preparation


Digest the DNA sample using endonucleases, then create a library from the fragments or clone the DNA chunks:

  • Clone the fragmented DNA in a BAC (for clone-by-clone sequencing)
  • Insert it into a plasmid (shotgun sequencing)
  • Or prepare a library of different DNA fragments for direct sequencing
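To make the digestion step concrete, here is a small sketch of cutting a sequence at a restriction site. It uses EcoRI as the example enzyme (recognition site GAATTC, cut after the first G). The function name and the toy sequence are made up for illustration, and only one strand is modelled.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the position inside the site where the enzyme cuts
    (1 for EcoRI, which cuts between G and AATTC on the top strand).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments

toy_dna = "AAGAATTCCCGAATTCTT"
print(digest(toy_dna))  # ['AAG', 'AATTCCCG', 'AATTCTT']
```

Joining the fragments back together gives the original sequence, which is a quick sanity check that nothing was lost during the digest.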

Genome sequencing


The machine does the rest of the work for us.

Fluorescently labelled nucleotides pair with the nucleotides of the DNA fragments, and the fluorescent signals emitted are recorded by the machine.

Each fluorescent signal is represented in the output as a “peak”.

Each peak represents one fluorescently labelled nucleotide that has bound to the DNA sequence.
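Turning peaks into letters is called base calling. A very reduced sketch of the idea is shown here: for every position, the base whose fluorescent channel gives the strongest signal is reported. The intensity values are invented for the example, and real base callers also model noise and report quality scores, which this sketch ignores.

```python
def call_bases(signal_per_cycle):
    """For every cycle, pick the base whose channel gave the strongest signal."""
    return "".join(max(channels, key=channels.get) for channels in signal_per_cycle)

# Made-up intensities: one dict per cycle, one fluorescent channel per base.
signals = [
    {"A": 0.91, "C": 0.03, "G": 0.04, "T": 0.02},
    {"A": 0.05, "C": 0.88, "G": 0.02, "T": 0.05},
    {"A": 0.02, "C": 0.04, "G": 0.90, "T": 0.04},
    {"A": 0.03, "C": 0.02, "G": 0.05, "T": 0.90},
]
print(call_bases(signals))  # ACGT
```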

After completion of the process, the genome sequencing data are sent for bioinformatic analysis.

Analysis of data

The generated contigs or overlapping fragments are aligned, and the sequences of the different DNA fragments are arranged in order.

Once the sequencing is done, the entire sequence data is compared with the reference genome available in the database. 
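As a toy version of this comparison, the sketch below lists every position at which a sample sequence differs from a reference of the same length. It assumes the two sequences are already aligned and contain no insertions or deletions, which is a strong simplification; the function name is chosen for this example only.

```python
def find_mismatches(reference, sample):
    """Return (1-based position, reference base, sample base) for each difference."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]

print(find_mismatches("ACGTACGT", "ACGTTCGT"))  # [(5, 'A', 'T')]
```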

A brief overview of all the steps is shown in the figure below.

Figure: Steps of DNA sequencing

In the next section, we will discuss the methods best suited to sequencing the entire genome.

Three of the best genome sequencing methods: 

1. Clone-by-clone genome sequencing: 

It is actually difficult to sequence a genome in a single run. 

In clone-by-clone sequencing, the genome is divided into smaller segments to facilitate efficient reading. 

For that, the genome is cleaved into chunks of 150 kb to 200 kb using restriction digestion or physical shearing.
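A quick calculation shows how many such chunks are involved. The sketch below takes the 3.2 billion base pair genome size quoted earlier and computes the minimum number of non-overlapping chunks needed to tile it once. Real BAC libraries contain overlapping clones and extra redundancy, so the true number of clones is higher; the function name is an assumption for this example.

```python
GENOME_SIZE_BP = 3_200_000_000  # haploid human genome size used in this article

def min_chunks(genome_size_bp, chunk_size_bp):
    """Smallest number of non-overlapping chunks that cover the genome once."""
    return -(-genome_size_bp // chunk_size_bp)  # integer ceiling division

for chunk_kb in (150, 200):
    n = min_chunks(GENOME_SIZE_BP, chunk_kb * 1000)
    print(f"{chunk_kb} kb chunks: at least {n:,} clones")
# 150 kb chunks: at least 21,334 clones
# 200 kb chunks: at least 16,000 clones
```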

After digesting the genome, the location of each chunk of DNA is mapped and a chromosomal map of the genome is created.

This chromosomal map will help to arrange the sequencing data at the end.

Note: the digested “DNA fragments” are so large that we call them DNA chunks instead of DNA fragments.

Each chunk of DNA is now inserted into a BAC (bacterial artificial chromosome).

Each time the bacteria divide, our chunk of DNA is copied as well, so we obtain thousands of copies of each DNA chunk.

Why are we using cloning instead of PCR?

The question may have popped into your mind, right?

The answer is very simple: PCR has a limited capacity for amplifying long DNA fragments.

A routine PCR amplifies DNA fragments of around 1200 to 1800 bp, and even long-range PCR only reaches a few tens of kb, far short of the 150 kb to 200 kb chunks needed here.

This is why BACs are used for whole-genome sequencing.

Ok, let’s come back to the topic.

After creating the BACs, each DNA chunk is further divided into smaller fragments of 500 to 1000 bp, which are inserted into a known plasmid vector.

Sequencing now starts from the known DNA sequence of the plasmid and extends until the unknown DNA fragment is sequenced.

Following this process, all the DNA fragments inserted into known plasmids from a known BAC are sequenced simultaneously.

The data from all the DNA sequences inserted into all the vectors are collected together.

The known vector sequence is removed from each read, and the unknown segments are joined using their overlapping regions.

Merging these overlaps yields the assembled sequence of each DNA chunk.
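Removing the known vector sequence can be pictured as a simple trimming step. The sketch below assumes that each read may start with a known stretch of plasmid sequence and may end with another one. The flank sequences and the function name are invented for the example and do not correspond to a real vector.

```python
def trim_vector(read, left_flank, right_flank):
    """Return only the insert, dropping known vector sequence at both ends.

    If a flank is not found, that end of the read is left untouched.
    """
    start = read.find(left_flank)
    start = 0 if start == -1 else start + len(left_flank)
    end = read.find(right_flank, start)
    end = len(read) if end == -1 else end
    return read[start:end]

# Hypothetical vector flanks around an unknown insert.
LEFT, RIGHT = "GGCCTT", "AAGGTT"
read = LEFT + "ACGTACG" + RIGHT
print(trim_vector(read, LEFT, RIGHT))  # ACGTACG
```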

In the last step, using the chromosomal map (the data collected at the beginning of the sequencing), the DNA sequences are placed back on the chromosomes according to their locations, and the sequence of the whole genome is completed.
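The role of the chromosomal map in this last step can be sketched as a simple sort. Here the map is modelled as a dictionary from a clone name to its chromosome and start position; the clone names and coordinates are hypothetical and only serve the example.

```python
from collections import defaultdict

def order_clones(physical_map):
    """Group clones by chromosome and sort them by their mapped start position.

    `physical_map` maps a clone id to a (chromosome, start position) pair.
    """
    by_chromosome = defaultdict(list)
    for clone_id, (chromosome, start) in physical_map.items():
        by_chromosome[chromosome].append((start, clone_id))
    return {
        chromosome: [clone_id for _, clone_id in sorted(clones)]
        for chromosome, clones in by_chromosome.items()
    }

# Hypothetical map built before sequencing started.
physical_map = {
    "BAC_3": ("chr1", 300_000),
    "BAC_1": ("chr1", 0),
    "BAC_2": ("chr1", 150_000),
    "BAC_9": ("chr2", 0),
}
print(order_clones(physical_map))
# {'chr1': ['BAC_1', 'BAC_2', 'BAC_3'], 'chr2': ['BAC_9']}
```

Once the clones are in order, their assembled sequences can be joined chromosome by chromosome.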

The data can then be compared with other available genome data.

The sequential representation of clone-by-clone sequencing is shown in the figure below.

Figure: Graphical representation of the clone-by-clone sequencing method

The Human Genome Project was completed using the clone-by-clone sequencing method because it is a reliable method.

Nonetheless, the method is time-consuming, tedious, lengthy and costly. 

2. Whole-genome shotgun sequencing: 

As we discussed, the clone-by-clone sequencing method has several limitations, and the reading is not always accurate.

The whole-genome shotgun sequencing method is an advanced version of the clone-by-clone sequencing method, and it is faster and more accurate.

Using the clone-by-clone genome sequencing method, one can sequence the genomes of prokaryotes and some primitive eukaryotes. However, sequencing an entire mammalian genome is a difficult process.

Due to the larger size and the structural complexity of the eukaryotic genome, the method cannot work accurately all the time.

In whole-genome shotgun sequencing, the chromosomal mapping and BAC cloning steps are bypassed, which saves time and reduces the complexity of the process.

Instead, the entire genome is broken into random fragments using endonucleases. The fragments are of random sizes: some are 2 kb or 3 kb long, while others are 20 kb or 200 kb.

The fragments are ligated into known plasmids to create libraries.

Each library of DNA fragments is sequenced independently in an automated sequencer.

Using the overlaps between the DNA pieces, the entire sequence is assembled by a computer program.
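The assembly idea can be illustrated with a tiny greedy overlap assembler. It repeatedly merges the two reads that share the longest overlap between the end of one and the start of the other. This is a teaching sketch under simple assumptions (error-free reads, one strand, no repeats); real assemblers use far more elaborate methods, and the function names are chosen for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # nothing left to merge; a gap remains between contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Three overlapping reads taken from the toy sequence ATGCGTACGTTAGC.
print(greedy_assemble(["GTACGTTA", "ATGCGTAC", "GTTAGC"]))  # ['ATGCGTACGTTAGC']
```

If two reads share no overlap, the function returns more than one contig, which mirrors the gaps that can remain in a real shotgun assembly.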

See the sequential representation of the entire process in the figure below.

Figure: Graphical representation of the shotgun sequencing method

However, some gaps or errors may remain with this sequencing method. Therefore, the method works best when we have a reference sequence to compare against.

In comparison with the clone-by-clone sequencing method, the shotgun genome sequencing method is faster and more cost-effective.

3. Next-generation DNA sequencing:

NGS (next-generation sequencing) is one of the most advanced, rapid and cost-effective methods for whole-genome sequencing.

“More than 5 different human genomes can be sequenced in a single run at a cost of 5,000$ per genome.”

It is a massively parallel sequencing approach in which millions of DNA fragments are sequenced in parallel.
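A rough calculation shows why so many fragments have to be read in parallel. Average sequencing depth is usually estimated as (number of reads × read length) / genome size. The sketch below uses the 3.2 billion base pair genome size from earlier in the article; the 150 bp read length and the 30x target depth are illustrative assumptions, not figures from this article.

```python
import math

def coverage(num_reads, read_length_bp, genome_size_bp):
    """Average depth: total sequenced bases divided by the genome size."""
    return num_reads * read_length_bp / genome_size_bp

def reads_needed(genome_size_bp, read_length_bp, target_coverage):
    """Number of reads required to reach the target average depth."""
    return math.ceil(genome_size_bp * target_coverage / read_length_bp)

GENOME_SIZE_BP = 3_200_000_000
n = reads_needed(GENOME_SIZE_BP, read_length_bp=150, target_coverage=30)
print(f"{n:,} reads")  # 640,000,000 reads
print(coverage(n, 150, GENOME_SIZE_BP))  # 30.0
```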

Through bridge amplification, the DNA fragments are amplified on a solid surface and then sequenced together in a robust machine.

Figure: Graphical representation of next-generation sequencing

Some of the steps in NGS are the same as in the other sequencing methods listed above. The DNA sample is first fragmented into smaller DNA fragments.

The fragments are then ligated to adapters of known sequence and proceed to bridge amplification.

After each round of labelled nucleotide addition, the signals are recorded and shown in the form of peaks.
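The round-by-round addition of labelled nucleotides can be mimicked in a few lines. In this simplified sketch, each cycle adds the one labelled base that pairs with the next base of the template (A with T, C with G), and the recorded bases together form the read. The template is assumed to be written in the order in which it is copied, and the function names are made up for the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def sequencing_cycles(template):
    """Return one (cycle number, template base, labelled base added) tuple per cycle."""
    return [
        (cycle, base, COMPLEMENT[base])
        for cycle, base in enumerate(template, start=1)
    ]

def read_from_cycles(cycles):
    """Join the labelled bases recorded in each cycle into the final read."""
    return "".join(added for _, _, added in cycles)

cycles = sequencing_cycles("TACG")
print(cycles[0])                 # (1, 'T', 'A')
print(read_from_cycles(cycles))  # ATGC
```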

What is the need for DNA sequencing? 

DNA sequencing can be used for various purposes, some of which are listed here:

  • For detecting mutations 
  • For detection, identification and characterisation of SNP (single nucleotide polymorphism). 
  • For identification of new mutations and variation in a genome. 
  • Characterisation and identification of various microbes and their different strains. 
  • For evolutionary studies. 
  • Species and speciation study and migration studies. 
  • Further, it is required for the metagenomic analysis, forensic analysis and medicinal studies. 


Conclusion: 

Clone-by-clone sequencing is no longer popular for whole-genome sequencing.

DNA sequencing technology can be the best option for cancer diagnosis and for predicting disease conditions at an early stage or before birth.

Still, the cost and the accuracy of the machines remain major limitations for entire genome sequencing.