What is metagenomics?

What is Metagenomics? Definition, Steps, Process and Applications

“Metagenomics is an interdisciplinary field of genetics that studies environmental samples using genetic techniques.”

We can also say that,

The study of many genomes recovered from environmental samples using state-of-the-art genetic technologies is known as metagenomics.

The term metagenomics may be new to you, right? It is not a traditional term in genetics; it emerged only recently.

Simply put, metagenomics is a study of microbes. We investigate microbes directly from their natural habitat, which lets us classify many different microbes in a single experiment and also investigate the role of their genes.

To make the point clearer, we first have to understand how important microbes are to us.

Microbes are microscopic, mostly unicellular organisms that are very closely associated with our lives. They can be useful or harmful to us. For instance, the microbes in our gut help us digest food, while other microbes ferment food and make it edible for us.

Some respiratory microbes make us sick, too.

We have been observing microbes since the microscope was invented. Cell culture techniques enable us to grow and study them in the lab.

However, traditional microbiology techniques are not powerful enough to culture diverse microbes at once. Scientists could study only one microbe at a time. Furthermore, the techniques were time-consuming and prone to contamination.

It can take up to 3 days to culture a bacterium or other microbe.

So we can say that we can’t study the entire microbial community of a single habitat using traditional microbiology methods (don’t be offended, microbiologists, it’s true!).

So is there any technique that identifies microbes rapidly and in bulk? Can we study microbial load more accurately? Metagenomic analysis solves this problem.

In the present article, we are going to talk about metagenomics, its definition, and its importance in numerous fields of science. Furthermore, we will look broadly into the entire process of how it is done. 

We will also go through the advantages, limitations and applications of metagenomic analysis. 

What is metagenomics? 

Using metagenomics, we can study an entire microbial load or microbial community and examine its interaction with the host or environment, or its effect on human, plant or animal health.

So metagenomics is both a research technique and a diagnostic approach. One aspect of it is evaluating the effect of many microbes on human health without disturbing their natural habitat.

The word ‘metagenomics’ is a combination of two words: ‘meta’, from the Greek for ‘beyond’, and ‘genomics’, meaning related to the ‘genome’. Here it means “the study of the genomes of a whole microbial community”.

The term was first defined by Jo Handelsman, Jon Clardy and co-workers in 1998.


“Metagenomics comprises the identification, characterization and study of microbes from environmental samples.”

Let us understand it with a simple example. 

Suppose we wish to study a common infection among dogs in your region. It may be caused by the pond water in your village (just suppose; it’s only an example). The dogs frequently go to the pond, drink the water and become sick.

A water sample from the pond is collected and metagenomic analysis is performed. A database of the microbial community present in the pond water is constructed and evaluated.

Now, if we wish to study how this relates to the dogs’ health, feces or gut samples from the dogs in your area are collected and processed in the same way.

The two microbial databases are then compared to find out whether the microbes present in the pond water are the cause of the infection in the dogs.
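The comparison step in this example can be sketched in a few lines of Python. This is only a rough illustration: the taxon names and read counts are invented placeholders, and `shared_taxa` and its `min_reads` cutoff are hypothetical, not part of any metagenomics package.

```python
# Hypothetical read counts per taxon; names and numbers are invented
# placeholders, not real results.
pond_water = {"Leptospira": 120, "Escherichia": 300, "Pseudomonas": 80}
dog_gut = {"Escherichia": 950, "Lactobacillus": 400, "Leptospira": 15}


def shared_taxa(sample_a, sample_b, min_reads=10):
    """Return taxa seen in both samples with at least `min_reads` reads each."""
    return sorted(
        taxon
        for taxon in sample_a.keys() & sample_b.keys()
        if sample_a[taxon] >= min_reads and sample_b[taxon] >= min_reads
    )


candidates = shared_taxa(pond_water, dog_gut)
print(candidates)
```

Finding the same microbe in both samples only flags it as a candidate; it suggests a link but does not prove that the microbe causes the infection.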

In another case, suppose some plants near the pond grow well and produce more fruit than plants farther away. We can also use metagenomic studies to find out whether any unknown microbes in the pond help the plants grow!

These are just two examples (that I know of so far!); the technique can be used in any type of analysis associated with microorganisms.

A similar analysis can also be performed on a human population to learn how microbial infections affect us.

Soil, water, seawater and other environmental samples are collected and investigated to study their role in human disease or infection.

So the technique is used to find out what microbes are doing in a particular environment. The starting sample need not be environmental; it can also contain gut microbes, intestinal microbes, dental cavity microbes, or the microbes present in urine and feces samples.


The aim of a metagenomic study is not only to identify the microorganisms present in a particular sample but also to understand the function, effect, or activity of those microbes and their genomes on their surroundings.

“Microbes manipulate the environment beyond our imagination.”

Steps and process of metagenomics: 

  • Sample collection 
  • DNA extraction 
  • Sample pre-preparation 
  • Sample analysis 
    • PCR 
    • DNA sequencing 
    • DNA microarray 
  • Bioinformatics studies 
  • Results and interpretation 
The process of metagenomic analysis.

Sample collection: 

Sample collection is a crucial step in metagenomic studies. An environmental sample is collected; the DNA recovered from it is known as environmental DNA (eDNA). Soil, water, urine, feces and gut samples can all be used.

Usually, no special sample collection system is needed.

Related article: What is an eDNA?

DNA extraction: 

Soon after sample collection, DNA extraction is performed. You can use any of the DNA extraction methods listed here:

For metagenomic analysis, I strongly recommend using a ready-to-use DNA extraction kit. Check the purity and quantity of the DNA and store it at 4 °C.
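As a rough sketch of the purity and quantity check, the snippet below uses two common spectrophotometry rules of thumb: an absorbance of 1.0 at 260 nm corresponds to about 50 ng/µL of double-stranded DNA, and an A260/A280 ratio near 1.8 is usually taken as "pure" DNA. The function names and the example readings are assumptions made for illustration; your kit or instrument may recommend different checks.

```python
def dsdna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration, assuming A260 of 1.0 ~ 50 ng/uL."""
    return a260 * 50.0 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 1.8 are commonly read as pure DNA."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


# Example absorbance readings, invented for illustration.
a260, a280 = 0.50, 0.27
conc = dsdna_concentration_ng_per_ul(a260, dilution_factor=10)
ratio = purity_ratio(a260, a280)
print(f"{conc:.0f} ng/uL, A260/A280 = {ratio:.2f}")
```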

Sample preparation: 

To prepare the extracted DNA for analysis, purify it with a DNA purification method or kit (spoiler: don’t use alcohol purification to purify a DNA sample for metagenomic analysis).

Next, to process it for DNA sequencing, we have to prepare a DNA library. First fragment the DNA (if recommended) and ligate the fragments to adapters.

If the quantity of DNA is not sufficient, amplify the fragments using PCR and then prepare the DNA library.

Various metagenomic DNA amplification kits for DNA sequencing are now available.
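To get a feel for how much a PCR amplification step can boost a low-yield sample, here is a small sketch of the textbook amplification formula, copies = starting copies × (1 + efficiency)^cycles. The function name and the example numbers are assumptions for illustration; real reactions plateau in later cycles, which this simple model ignores.

```python
def copies_after_pcr(initial_copies, cycles, efficiency=1.0):
    """Copies after `cycles` of PCR; efficiency=1.0 means perfect doubling."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


ideal = copies_after_pcr(1000, cycles=10)  # perfect doubling each cycle
realistic = copies_after_pcr(1000, cycles=10, efficiency=0.9)
print(f"ideal: {ideal:.0f} copies, at 90% efficiency: {realistic:.0f} copies")
```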

Sample analysis: 

There are two common ways to process a DNA sample: DNA amplification or DNA sequencing. A microarray is less advisable for metagenomics.

PCR:

The polymerase chain reaction (PCR) is performed only to study the 16S rRNA or 18S rRNA genes of microbes, or to analyze known genes associated with a disease or infection.

PCR is commonly used to amplify the 16S rRNA and 18S rRNA genes, not to investigate new functions.

In a PCR, sequence-specific primers are designed for the different microbes and used to amplify their target sequences.
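The idea of sequence-specific primers can be illustrated with a toy "in silico PCR". This is a minimal sketch under strong assumptions: the template and primers are invented, only exact matches are considered, and `in_silico_pcr` is a hypothetical helper rather than a function from any real primer design tool.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the amplicon bounded by the two primers, or None if no match.

    The reverse primer is written 5'->3' and binds the opposite strand,
    so we search the template for its reverse complement. Only exact
    matches are considered, unlike real primer annealing.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    rc_reverse = reverse_complement(reverse_primer)
    end = template.find(rc_reverse, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(rc_reverse)]


# Toy template and primers, invented for illustration.
template = "TTGACCAGTCGATCGGATTACAGGCTTAACGGATCCAA"
amplicon = in_silico_pcr(template, "GACCAGTC", "GGATCCGT")
print(amplicon)
```

If either primer fails to match, the function returns `None`, which mirrors a PCR that gives no product for a microbe that is absent from the sample.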

The results of PCR are analyzed by conventional gel electrophoresis. Real-time PCR quantification can also be used to measure viral load.

DNA sequencing: 

DNA sequencing is the best method for metagenomics. Here, the DNA we extracted from the environmental sample is run through a machine that determines every nucleotide present; that is, it can sequence all the DNA, genes, or genome fragments present in the sample.

Using computational programs, genes belonging to the various microbes are predicted and studied to learn how those microbes infect.
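To give a flavor of what computational gene prediction involves, here is a deliberately naive sketch that scans the forward strand for open reading frames (an ATG start codon followed in frame by a stop codon). Real gene predictors are far more sophisticated; the function name, the `min_codons` cutoff and the toy sequence are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for forward-strand ORFs: ATG ... stop."""
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                j = i + 3
                # Walk codon by codon until a stop codon or the sequence end.
                while j + 3 <= len(dna) and dna[j:j + 3] not in STOP_CODONS:
                    j += 3
                if j + 3 <= len(dna):  # an in-frame stop codon was found
                    if (j - i) // 3 >= min_codons:
                        orfs.append((i, j + 3, dna[i:j + 3]))
                    i = j + 3
                    continue
            i += 3
    return orfs


# Toy sequence, invented for illustration.
orfs = find_orfs("CCATGAAACCCGGGTAACC")
print(orfs)
```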

Shotgun sequencing and high-throughput DNA sequencing are two common methods employed for metagenomic studies.

Shotgun DNA sequencing randomly fragments the DNA and facilitates the parallel sequencing of all the genomes or genes in a sample.
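The shotgun idea can be mimicked with a toy simulation that draws random reads from a mixture of genomes. This is a simplified sketch: the two "genomes" are invented strings, every organism is sampled with equal probability regardless of abundance or genome size, and `shotgun_reads` is a hypothetical function.

```python
import random


def shotgun_reads(genomes, n_reads, read_length, seed=0):
    """Draw random fixed-length reads from a mixture of genomes.

    `genomes` maps a (hypothetical) organism name to its sequence. Each
    read records its true origin, which a real experiment would not know.
    """
    rng = random.Random(seed)
    names = [n for n, seq in genomes.items() if len(seq) >= read_length]
    if not names:
        raise ValueError("no genome is long enough for the read length")
    reads = []
    for _ in range(n_reads):
        name = rng.choice(names)
        seq = genomes[name]
        start = rng.randrange(len(seq) - read_length + 1)
        reads.append((name, seq[start:start + read_length]))
    return reads


# Two toy "genomes", invented for illustration only.
community = {
    "microbe_A": "ATGCGTACGTTAGCCGATAGCTAGGCTA",
    "microbe_B": "TTGACCGGTAACGTCAGGCATTCAGGAC",
}
reads = shotgun_reads(community, n_reads=5, read_length=10)
for origin, read in reads:
    print(origin, read)
```

In a real experiment the origin of each read is unknown, and working it out is exactly the job of the downstream bioinformatics.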

We have covered DNA sequencing in many articles, so we are not discussing its process and principle here. You can read about them here:

  1. What is DNA sequencing?- a beginner’s guide.
  2. DNA sequencing- definitions, history, steps and procedure.
  3. 3 of the best genome sequencing methods.
  4. Whole-genome exome sequencing.

Why are we doing the metagenomic analysis? 

At first glance, we might say the purpose is to identify microbes, but the reasons go further than identification alone.

Two common reasons push us toward metagenomics. The first is the limitations of conventional microbiology techniques.

If we have a soil sample and wish to study the microbes in it, conventional methods can identify a few hundred microorganisms over several months, and only if the sample remains uncontaminated.

With metagenomic analysis, we can investigate thousands of microbes rapidly and with far less risk of culture contamination.

Second, the majority of microbial activities are governed by the complex behavior of the entire microbial community. So we have to study all of its members to investigate an effect or activity properly.

Note here that metagenomics is not the same as 16S rRNA gene analysis. 16S rRNA gene analysis is performed to characterize microbes; it doesn’t enable functional genomic analysis.

So if your knowledge extends only as far as 16S rRNA, read this article completely. By the way, you can read the related article here: 16s rRNA gene sequencing.

Metagenomics, on the other hand, is performed to study the functional genomics of microbes. Here we investigate how different microbial genes behave and how they are involved in disease.

Advantages of metagenomics: 

One of the foremost advantages of this technique is its power to study many microorganisms in a single experiment, which is not possible with conventional microbiology methods.

Besides, sequencing-based metagenomic techniques are highly accurate and fast.

We can also quantify viral load.

A pictorial illustration of the process of environmental DNA analysis.

Disadvantages of metagenomics: 

The techniques are costly. Next-generation sequencing costs approximately $500 to $5,000.

Many microbes are still unknown to us, so we don’t have enough reference information to compare and study novel microbial sequences.

Related article: Microbial genetics: A rapid advancement in microbiology.

Applications of metagenomics:

Metagenomic studies are still in a preliminary phase, but the approach has the potential to penetrate different fields and solve different problems. It is used widely in ecology, environmental conservation, infectious disease diagnosis, environmental remediation, biotechnology, and agriculture.

Biotechnology studies: 

In recent times, scientists have focused more on microbial studies such as metagenomic analysis. Enzymes such as proteases, lipases and nitrilases are products of metagenomic studies.

Enzymes, antibiotics, biochemicals, bioactive compounds and pharmaceuticals can all be obtained by studying microbes.

Ecological studies: 

Microbial studies like metagenomics have great importance in ecology, conservation and invasive species studies. 

Seas, rivers, soil, air, and rainforests are habitats for many different animals and microbes.

The complex symbiotic relationships between animals, microbes, and plants help us understand the health of a habitat. For example, the feces of one animal might be a nutrient-rich food source for another species. Note that this might be possible because of microbial activity!

Metagenomic analysis provides insight into how both are important for an ecosystem.

It is also used in conservation and endangered species studies.

Healthcare and medical: 

Complex infections can be studied rapidly by metagenomic analysis.

A sample is taken from the patient and processed for DNA sequencing to find out which microorganisms may be present.

The role of different RNA viruses in human health can also be evaluated by isolating their RNA and converting it into cDNA.
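In the lab the RNA-to-cDNA conversion is done by a reverse transcriptase enzyme, but its logic is easy to show at the sequence level. The sketch below is only an illustration with an invented RNA fragment; `first_strand_cdna` is a hypothetical helper that pairs each RNA base with its DNA complement and reads the result 5'->3'.

```python
def first_strand_cdna(rna):
    """Return the first-strand cDNA (5'->3') complementary to an RNA (5'->3')."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    try:
        return "".join(pairs[base] for base in reversed(rna.upper()))
    except KeyError as err:
        raise ValueError(f"unexpected RNA base: {err}") from None


# Toy RNA fragment, invented for illustration.
rna = "AUGGCUUAA"
cdna = first_strand_cdna(rna)
print(cdna)
```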

The impact of pollutants on an ecosystem and its environment can be monitored and investigated by using metagenomic analysis to see how the microbial load behaves.

Agriculture and soil ecology 

Metagenomics is used extensively in soil and agriculture studies.

Soil is a common habitat for a great many microorganisms, and for plants too. One gram of soil contains approximately 10^9 to 10^10 microbial cells.

If we sequence all the microbes from one gram of soil, we get around 1 Gb of sequencing output, and that’s huge!

Complex relationships between plants and microorganisms are the major focus of these studies. The microbes that promote plant growth have great economic value in terms of production.

In conclusion, we can use metagenomics in different fields as our vision allows. Scientists are even exploring biofuel production through metagenomics.


The process of metagenomic sample analysis might look familiar, but it is not restricted to sequencing the 16S rRNA gene. Many computational tools are used to extract a wealth of information from the DNA sequences in a sample.

Thousands of different microbial strains can be identified, studied, and analyzed with this technique. The world of microbes is even more complex than we think!

