Full speed ahead for sequencing
DNA sequencing methods, which were first used in the 1970s, have been continuously developed since then to optimise their costs, speed, quality and throughput.
The most recent step is so-called Next Generation Sequencing (NGS), which enables scientists to decode enormous quantities of DNA fragments, even differing ones, in a single fully automated sequencing run. Rapid growth in computing capacity, the intelligent control of biochemical processes and the handling of gigantic data volumes have paved the way for this development. BLICKWINKEL spoke with Dr Paul Scholz, Research Scientist & Project Manager at BRAIN, about the potential of NGS.
BRAIN: It is claimed that Next Generation Sequencing (NGS) is set to revolutionise the life sciences, medicine and evolutionary biology at the very least. What does NGS stand for?
Dr Paul Scholz: New high-throughput methods in biological and medical research have been driving rapid progress for some time now. Methods like NGS make it possible to plan tests in a completely new way and open up innumerable opportunities for new findings. Formerly you had to be very familiar with the gene you were examining, or at least know the region of the genome or transcriptome in which it is located. Methods like NGS analyse the entire genome or transcriptome of an organism in a single working step. Detection becomes holistic and is not restricted to specific genomic regions.
BRAIN: How is this development influencing research in biotechnology?
Dr Paul Scholz: Fundamentally, high-throughput methods offer an opportunity to sequence the as yet largely unexplored variety of species on earth more quickly and efficiently, and thus to discover new nature-based proteins and enzymes for biotechnological applications. In biotechnology research, including ours at BRAIN, they put us in a position to reliably assemble complete genomes of newly discovered or generated bacterial or yeast strains, i.e. to reconstruct them using computer programs. Besides this, we can analyse mutageneses that have been systematically performed in the lab, as well as natural gene modifications and the effects of environmental influences. A truly groundbreaking possibility is to analyse the impact of individual reagents or even foods on gene expression across a whole genome. Even a change made in a single gene is usually relevant for the expression of many genes.
BRAIN: Are NGS methods the end of the line?
Dr Paul Scholz: Just as nobody could have predicted NGS in the 1970s, DNA/RNA sequencing techniques are sure to keep evolving in future. NGS still calls for enormous computing capacity, storage capacity and high-tech bioinformatics. So there are many areas where further improvements can be made.
BRAIN: Can you give us a specific example of progress that you were especially pleased about?
Dr Paul Scholz: Only a few months ago, an international team of scientists headed by Professor Jillian F. Banfield at the University of California, Berkeley presented a completely new tree of life that included all living organisms known on Earth at that time. It includes about 2.3 million animals, plants, fungi and microorganisms. To create the family tree, the team collected data from thousands of studies from recent years and analysed its own metagenome samples from different habitats by means of NGS. They also included very unconventional habitats such as the metagenome of a dolphin’s tongue. By piecing together existing and newly collected data, the scientists were able to present a completely revised tree of life.
The most spectacular finding related to this tree of life is a new branch, formed by the so-called Candidate Phyla Radiation. That is a newly discovered domain of single-celled prokaryotic organisms without a cell nucleus. The exciting thing is that this domain, which was unknown until very recently, has a biodiversity that encompasses roughly a third of all forms of life known so far. Although this only reflects our current knowledge, we now presume that microorganisms form three of the four large branches of the tree of life: the domain of the Archaea (single-celled microorganisms formerly classified as bacteria), the domain of the bacteria and the newly discovered domain of the Candidate Phyla Radiation. Animals and plants only constitute a twig on the fourth main branch, the domain of the eukaryotes (organisms with a cell nucleus).
Such studies not only show the wealth of species that remain to be discovered, but also the potential that NGS and other new biotechnological analysis methods hold for producing findings in the scientific discipline of biology.
BRAIN: Producing reads of the genome and transcriptome is one thing. But what about understanding the information in the reads: is our understanding keeping pace with growth in computing capacity and databases?
Dr Paul Scholz: Yes, the genome and the biochemical processes that take place in an organism are highly complex. And of course NGS is no magic technology that enables us to understand all processes at cellular level. It will be a long time before research attains that level. But NGS is a huge step forward. It provides an overview of the complete genome or transcriptome of cells, genes and entire individuals, very quickly and reliably.
BRAIN: You are responsible for DNA/RNA sequencing at BRAIN. What are your specific tasks?
Dr Paul Scholz: My job is to analyse complete genomes and transcriptomes. The focus may vary a great deal from project to project, and depending on the questions to be answered. One project might involve searching for mutations in a yeast strain. Another might search for genes that are upregulated in cells following contact with a specific natural substance.
Let’s take a specific example from our enzyme research. A bacterial or yeast strain that originally came from our BioArchive and that we optimised using biotechnological processes reliably produces a new enzyme. That’s good to know. But it is even better to understand how and why the DNA modification works in the production strain. To do that, we need to sequence the strain. This analysis provides insight into the region of the genome where a change in DNA took place, what exactly this change looks like and whether there have been other, possibly undesirable changes in DNA that we could reverse in order to increase the yield of our desired products.
BRAIN: Apart from research, what else is sequencing important for?
Dr Paul Scholz: Sequencing techniques are also important for launching new products on the market. Authorities ask for detailed statements on this for certain approval procedures. Often, they are also needed in order to obtain patent protection for inventions.
BRAIN: How important is Next Generation Sequencing for BRAIN?
Dr Paul Scholz: NGS is playing an increasingly important role at BRAIN. It is essential for us to characterise improved production strains for enzymes and other biomolecules in the best possible and most efficient way. By now it is standard practice to analyse the whole genomes of our strains using NGS, in part to rule out undesired effects as early as possible. We are also using NGS methods more and more for transcriptome analysis, which concerns gene regulation. That helps us to understand the effects of changing environmental conditions or mutations on the entire transcriptome.
BRAIN: In which fields of research and application at BRAIN do you see potential based on NGS?
Dr Paul Scholz: One example is the new research projects that we are planning at the moment, which concern the metagenome (all genomic information present in several thousand different microorganisms that coexist in a community). The aim is to sequence and analyse the complete genomes of these habitats. In doing so, we of course keep an eye out for new DNA sequences, e.g. of enzymes that might be of interest for biotechnological applications.
DNA/RNA sequencing
DNA/RNA sequencing involves producing a text stream (“read”) of an organism’s genome or transcriptome. The transcriptome is the set of all genes transcribed from DNA into RNA in a cell at a given point in time. The DNA molecules hold the “blueprint” for life forms. DNA consists of individual building blocks called nucleotides, each of which is made up of a nitrogenous base, a sugar and a phosphate. Two DNA strands zip together because the four bases attract each other in fixed pairs: adenine (A) pairs with thymine (T), and guanine (G) pairs with cytosine (C). The paired DNA strands form a double helix.
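The pairing rule can be illustrated with a few lines of Python. This is only a minimal sketch: the function names `complement` and `reverse_complement` and the example read are made up for illustration, and reading the partner strand back to front is a common convention assumed here, not something described in the text above.

```python
# Pairing rules from the text: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner base for every base of a DNA read."""
    return "".join(PAIRING[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in the opposite direction.

    Assumption: the two strands are read in opposite directions,
    so the complement is reversed.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    read = "ATGCGT"  # hypothetical example read
    print(complement(read))          # TACGCA
    print(reverse_complement(read))  # ACGCAT
```

Because each base has exactly one partner, a read of one strand is enough to write down the other.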
Classical DNA sequencing was presented in 1977 by the British biochemist Frederick Sanger, who received his second Nobel prize for his findings in 1980. He was the first to succeed in separating the double helix and producing a read of each strand using DNA polymerase enzymes. Next Generation Sequencing (NGS) works according to the same basic principle, but is a faster and more efficient process, as up to several billion sequencing reactions take place in parallel and run largely automatically. The next generation of even more efficient techniques is known as nanopore sequencing. Here, DNA molecules are drawn through minute pores, and the characteristic electrical signal that each building block causes as it passes through is measured.
DNA sequencing technology ushered in the scientific age of genomics, which involves the systematic analysis of entire genomes and the genes they contain. Around 24,000 genomes of the most diverse organisms have been completely decoded to date. Researchers around the globe can access this information via databases. DNA sequencing techniques are also the basis for epigenetics, the research into molecular mechanisms that influence gene activity without changing the DNA sequence.

The new tree of life with its four domains (Archaea, bacteria, Candidate Phyla Radiation and eukaryotes) makes it possible to trace the ancestry and relationships of all known forms of life back to their origins about 3.5 billion years ago. (Source: Laura A. Hug, Brett J. Baker et al., A new view of the tree of life, Nature Microbiology 1, Article number: 16048 (2016), www.nature.com/articles/nmicrobiol201648, Figure 1: A current view of the tree of life, encompassing the total diversity represented by sequenced genomes. Modification by BLICKWINKEL: deletion of colors and detailed names. This work is licensed under a Creative Commons Attribution 4.0 International License.)