Digitally supported metagenomics
During a visit to BRAIN this summer, the Hessian Minister for Digital Strategy and Development learned about the advantages of digitalization in biotechnology
What benefits does digitalization bring to people? This summer, the Hessian Minister for Digital Strategy and Development, Professor Kristina Sinemus, addressed this question. She visited a wide range of companies and institutions in Hesse − including BRAIN AG in Zwingenberg.
What a happy coincidence: the minister herself holds a degree in biology, so the speakers, biotechnologist Dr. Alexander Pelzer and bioinformatician Dr. Paul Scholz, could take a basic understanding of their subject for granted. Both had chosen their favorite topic, digitalized metagenome analysis, as an example of possible applications. They vividly explained how digitally supported analysis makes initially unmanageable amounts of data tractable and how it can be used to identify and optimize enzymes that will later be used in industrial processes.
Big Data maps microbial diversity
What is the definition of a metagenome? A soil sample taken in a natural environment contains a multitude and variety of microorganisms − and the genetic information of all these microorganisms forms the so-called metagenome in such a sample. Microbial diversity is also accompanied by a diversity of biomolecules. The scientists at BRAIN make use of this diversity, because the ultimate goal is to develop enzymes optimized for customer-specific requirements.
To get from many genes to many biomolecules, BRAIN's specialists first sequence the metagenome, i.e. they determine the nucleotide sequence of the DNA. The method they use is nanopore sequencing. The amount of data produced is enormous: up to 280 gigabase pairs (Gbp) every week. "Only the advances made in digitalization and the associated rapid increase in computing capacity enable us to handle and evaluate these data volumes," says Paul Scholz, who built up the relevant technology at BRAIN and has headed the bioinformatics department for two years.
In the next step, the bioinformatician classifies the sequence data obtained and evaluates them in a comparative analysis, all of it digitally. Only in this way is it possible to cover the large biodiversity contained in a small sample and theoretically identify and characterize up to 10^10 microorganisms from a single gram of soil. The biomolecules discovered in this way, some of them new and including proteins and enzymes, may later be used in biotechnological applications. Once the scientists have digitally identified a biomolecule with potential, they create a "digital variant library" and initially optimize the molecule using bioinformatics.
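To make the idea of a "digital variant library" more concrete, here is a minimal sketch in Python. It assumes the simplest conceivable library, namely every single amino acid substitution of a candidate sequence. The article does not describe how BRAIN actually builds its libraries, so the function name `single_point_variants` and the toy sequence are purely illustrative.

```python
# Hypothetical sketch: enumerate all single-point variants of a protein sequence.
# This is an assumed, simplified notion of a "digital variant library".

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def single_point_variants(sequence):
    """Return a dict mapping mutation labels (e.g. 'K2R') to variant sequences."""
    variants = {}
    for position, original in enumerate(sequence, start=1):
        for replacement in AMINO_ACIDS:
            if replacement == original:
                continue  # skip the unchanged residue
            label = f"{original}{position}{replacement}"
            variants[label] = sequence[:position - 1] + replacement + sequence[position:]
    return variants


library = single_point_variants("MKTAY")  # toy sequence, not a real enzyme
print(len(library))    # 5 positions x 19 substitutions = 95
print(library["K2R"])  # MRTAY
```

Even this toy example shows why digital pre-selection matters: a sequence of only five residues already yields 95 single-point variants, and the number grows much faster once several positions are changed at the same time.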
Applying the data
"However, whether what the computer depicts matches what is actually found in reality can only be examined the analog way, in laboratory tests," says Alexander Pelzer. The biotechnologist and "protein engineer" adds: "Nature holds such an enormous potential of as yet undiscovered biomolecules, and in order to tap into the vast treasure trove of data, we need both the new digital possibilities and experts in the laboratory."
The digital optimization process sometimes uses structural modeling, which predicts a 3D model from the linear amino acid sequence of an enzyme. "These 3D models give us clues as to where the properties of an enzyme can be improved," explains Alexander Pelzer. However, it is rare for the theoretical model to be checked just once in the laboratory. Instead, an enzyme is usually optimized step by step, over several cycles and under predefined conditions, for the planned industrial application.
Expression studies and protein engineering
The digital, sequence-driven discovery of an enzyme candidate is followed by expression studies in the laboratory. The DNA is synthesized and introduced into a suitable microorganism. This microorganism should reliably "translate" the DNA so that it produces the corresponding enzyme. If the properties of the enzyme do not yet meet expectations, the enzyme is readjusted in the "protein engineering" process, either by rational design, i.e. the digital prediction of a molecular change, or by directed evolution (mutation and selection). BRAIN usually applies the former method. Alexander Pelzer: "In this way, for example, the temperature optimum and the pH optimum of an enzyme can be adjusted to meet the requirements of our customers in their industrial process."
If the microorganism and the identified enzyme prove successful, the fermentation process is scaled up to produce larger quantities of the enzyme.
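The "translation" of DNA into an enzyme mentioned above can be sketched on the computer as well. The following Python example is an illustration only: it assumes the standard genetic code and a hypothetical helper named `translate`, and it ignores everything a living host cell adds to the process.

```python
# Hypothetical sketch: translate a DNA coding sequence into a protein sequence
# using the standard genetic code. Real expression in a host organism depends on
# far more than this lookup table.
from itertools import product

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for start in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCTAAATAA"))  # MAK
```

The toy input encodes methionine, alanine and lysine followed by a stop codon, which is why the output is the three-letter sequence MAK.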
Benefits for people
Returning to the Minister's visit: asked about the concrete benefits people can derive from digitalization in this context, the scientists agreed that current digital possibilities, also in combination with AI, make it possible to predict previously unknown enzyme activities so efficiently that some of the repetitive cycles between laboratory and database can be eliminated. Beyond this practical benefit for the daily work of scientists, society as a whole stands to benefit from the new digital possibilities: optimized enzymes could, for example, reduce the energy consumption of an industrial process or the amount of environmentally harmful chemicals it requires, thus making industry more sustainable.
Digitalized metagenomics at BRAIN:
Basis for natural and sustainable solutions for industrial challenges
- BRAIN's rapidly growing digitalized data resource as a basis for screening comprises more than 2000 Gbp of proprietary genetic information.
- BRAIN's metagenome database currently contains more than 100 million functionally annotated genes.
- The use of next-generation sequencing (NGS) gives BRAIN access to as yet unknown natural biodiversity.
- In silico techniques enable the rapid discovery of novel molecules via their structurally related neighboring molecules.
- The use of rational protein engineering at BRAIN enables fast and specific enzyme optimization.