News - BIOINFORMATICS BARCELONA
http://www.bioinformaticsbarcelona.eu/news
Fri, 13 Sep 2019 12:18:17 +0000
Made of Genes and Synlab present "Salud Personalizada: Más allá de la genética" (Personalised Health: Beyond Genetics)

Why doesn't the diet that worked so well for my colleague work for me? Why do I struggle to lose weight even though I exercise? Why do schedule changes affect me so much? Why can't I rest at night if I drink more than one cup of coffee? Why are lifestyle recommendations generic when I am unique?

Genetic information brings us closer to knowing what is most fundamental about ourselves. But genetics is not a constraint. It lets us know our limits, yes, but also the best way to break through them.

It is therefore essential to know not only what you are like, but also what state you are in at any given moment, in order to have a true picture of our lives as a starting point from which we can act in one way or another. This is what we understand by personalised health.

In this event, moderated by Pere Estupinya (presenter and director of "El ladrón de cerebros", TVE), we will discuss with Dr. Oscar Flores (PhD in Biomedicine, computer engineer and co-founder of Made of Genes) and Dr. David de Lorenzo (international expert in nutrigenomics and personal genomics) how, by relying on the most innovative science, we can find health impacts that allow us to educate people in personalised healthy lifestyles.


Date and time



Movistar Centre

16 Plaça de Catalunya

08002 Barcelona


Fri, 13 Sep 2019 12:18:17 +0000 http://www.bioinformaticsbarcelona.eu/news//news/145/made-of-genes-y-synlab-presentan-salud-personalizada-mas-alla-de-la-genetica http://www.bioinformaticsbarcelona.eu/news/145 0
The avocado genome informs deep angiosperm phylogeny

The avocado genome has been sequenced by an international team that includes Julio Rozas and Alejandro Sánchez-Gracia, researchers from the Faculty of Biology and the Biodiversity Research Institute of the UB (IRBio) and members of the Bioinformatics Barcelona (BIB) platform.

The new study, published in the journal Proceedings of the National Academy of Sciences (PNAS), will help improve genetic modification programmes aimed at optimizing the growth of this plant, regarded as green gold in the international market, and at strengthening its resistance to pathogens and diseases, among other aspects.

The study involved participants from about twenty institutions, under the supervision of Luis Herrera-Estrella (Center for Research and Advanced Studies of the National Polytechnic Institute, Mexico) and Víctor A. Albert, from the University at Buffalo (New York, United States).

Avocados, the green gold in worldwide agriculture

People have long eaten this tropical fruit of the genus Persea, which has been grown in South America since pre-Columbian times. Its consumption has increased worldwide, generating economic interest in the international market. The international team sequenced the genomes of two avocado plants: the Mexican variety (Persea americana var. drymifolia) and the most popular cultivar (Persea americana Mill. cv. Hass).

The genome of this tropical plant, organized in 12 chromosomes as far as is known, is 920 Mb in size, with little variation between the plants studied, according to the new study. "The most relevant element of the genomic structure of the avocado is its history of complete genome duplications", notes Julio Rozas, professor of Genetics and, together with Alejandro Sánchez-Gracia, co-director of the Research Group on Evolutionary Genomics and Bioinformatics at the Department of Genetics, Microbiology and Statistics of the UB and IRBio.

In particular, the experts compared synteny (the conserved order of genes along the chromosomes) between the avocado genomes and the species Amborella trichopoda Baill 1869. This species, a plant endemic to New Caledonia, is considered the only living representative of the most primitive lineage of flowering plants (angiosperms). This primitive angiosperm shows no signs of complete genome duplications, so it serves as the reference for studying the evolution of genome duplications in all other flowering plants.

Tandem duplications and resistance to attacks from pathogens

According to the results, "for one region of the analysed genome, there are four copies of the genomic fragments in the avocado, and one copy in Amborella. This suggests the avocado genome has undergone two complete genome duplication processes", says the researcher Alejandro Sánchez-Gracia (UB-IRBio).    
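The reasoning in this quote can be written as a short calculation: if each complete genome duplication doubles the number of copies of a region and no copies are lost afterwards, the number of duplication rounds is the base-2 logarithm of the copy ratio. The sketch below illustrates only that simplified assumption; the function name is hypothetical and this is not the method used in the study.

```python
import math


def duplication_rounds(copies_in_target: int, copies_in_reference: int = 1) -> int:
    """Infer the number of whole-genome duplication rounds from a copy ratio.

    Assumes every round exactly doubles the copy number and that no copy
    has been lost since (a deliberate simplification).
    """
    if copies_in_target <= 0 or copies_in_reference <= 0:
        raise ValueError("copy numbers must be positive")
    rounds = math.log2(copies_in_target / copies_in_reference)
    if not rounds.is_integer():
        raise ValueError("copy ratio is not a power of two; the simple model does not apply")
    return int(rounds)


# Four copies in the avocado against one in Amborella, as in the quote.
avocado_rounds = duplication_rounds(4, 1)
print(avocado_rounds)
```

In real genomes, copies are frequently lost after a duplication, so observed ratios are often not clean powers of two; that is why the function refuses such inputs instead of rounding.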

The more recent tandem duplications are involved in the avocado's adaptive metabolic responses to attack by fungal pathogens, according to the authors. "At the same time, those duplications that appeared during the complete genome duplications -and which are maintained due to natural selection- seem to be involved in basic aspects of the plant's development and physiology", notes Sánchez-Gracia.

Discovering the phylogenetic tree of angiosperm plants

Many doubts remain about the origins and evolution of the avocado, a species that belongs to the magnoliids. The new study outlines a new scenario for finding its phylogenetic position in the evolutionary tree of angiosperms, in particular with respect to some eudicot species of great agricultural and economic interest, such as coffee (Coffea), tomato (Solanum) and grapevine (Vitis), which share more genetic features with one another than with the avocado.

In this group of plants, diversification was fast, which made the phylogenetic analysis difficult. Through a comprehensive phylogenomic analysis of nineteen angiosperm species, using different molecular markers, the new study reveals that the avocado is a sister lineage to the monocots and eudicots (coffee, tomato, grapevine).

"The only genetic features all these species share are those that define all angiosperms and distinguish them from gymnosperms and non-seed plants", notes researcher Pablo Librado, former PhD student of the University of Barcelona and co-author of the study, now member of the Center for Geogenetics of the University of Copenhagen and the Natural History Museum of Denmark.

From genomic libraries to the bioinformatics software created at the UB

To sequence the Mexican variety, researchers used different genomic libraries and sequencing technologies, such as Bacterial Artificial Chromosome (BAC) libraries and the Illumina HiSeq platform, which provide high coverage of the studied genome. For Hass, they used PacBio single-molecule real-time (SMRT) sequencing with high-quality DNA.

The population analysis, based on single nucleotide polymorphisms (SNPs), provided insight into the genetic composition and history of the commercial variety Hass, which is the result of the introgression of genetic material from Guatemala (selective sweeps) into the genomic background of the Mexican avocado.

In the study, the UB-IRBio experts carried out a phylogenomic analysis that consisted of identifying single-copy orthologues and building trees from the amino acid and coding nucleotide sequences of this group of genes. A large part of their work focused on analysing the dynamics of gene gain and loss with BadiRate, bioinformatics software developed at the UB by Julio Rozas and Pablo Librado.

Improving agricultural productivity in world crops

The new study provides an essential basis for conducting association studies on the genome of the species and for finding the genes, and their genetic variants, that determine the traits most relevant to agricultural productivity and the economy.

In this context, the Research Group on Evolutionary Genomics and Bioinformatics of the UB has collaborated several times with the team of Víctor A. Albert. One notable example is the study that identified the genetic changes that enabled the adaptation to a carnivorous diet in different plants (Nature Ecology & Evolution, 2017), an evolutionary process that has been repeated independently in several species using the same molecular solutions.

The UB-IRBio team has also played a prominent part in sequencing the genomes of other organisms, such as the tick (Ixodes scapularis), the centipede (Strigamia maritima), coffee (Coffea canephora) and the body louse (Pediculus humanus humanus), among others.

Fri, 06 Sep 2019 12:30:32 +0000 http://www.bioinformaticsbarcelona.eu/news//news/143/the-avocado-genome-informs-deep-angiosperm-phylogeny http://www.bioinformaticsbarcelona.eu/news/143 0
Experts sequence the genome of an endemic spider from the Canary Islands

A research team from the Faculty of Biology and the Biodiversity Research Institute (IRBio) of the University of Barcelona has sequenced the genome of the spider Dysdera silvatica Schmidt 1981, an endemic species that lives in the laurel forests of the islands of La Gomera, La Palma and El Hierro, in the Canary Islands. The new study presents the first sequenced genome of an arthropod from the Canary Islands, an archipelago rich in endemic species distributed across the islands.

Participants in the new study, published in the journal GigaScience, are the experts Julio Rozas, Miquel Arnedo, José Francisco Sánchez-Herrero, Cristina Frías-López, Paula Escuer, Silvia Hinojosa-Alvarez and Alejandro Sánchez-Gracia (UB-IRBio) and also members of the platform Bioinformatics Barcelona (BIB).

Dysdera silvatica: a voracious predator in the Canary laurel forests

The genus Dysdera, to which the species Dysdera silvatica belongs, includes more than 250 spider species distributed mainly around the Mediterranean area. The Macaronesian archipelagos mark the western limit of the distribution of this taxon, which diversified significantly in the Canary Islands, where there are currently about 50 endemic species.

"One of these species is Dysdera silvatica, integrated in an evolutionary lineage that became one of the main predators -both in abundance and diversity- in the insular terrestrial invertebrate trophic networks", notes Professor Miquel Arnedo, from the Department of Evolutionary Biology, Ecology and Environmental Sciences of the Faculty of Biology and the Biodiversity Research Institute (IRBio) of the UB.

"The species D. silvatica is a generalist predator. Unlike other spider groups, Dysdera includes specialists in hunting and eating terrestrial isopods. All these species live in the Canary Islands, where the trophic specialization on crustaceans seems to have evolved independently several times", adds the researcher, head of the research group on Arthropod Systematics and Animal Evolution of the UB.

The first genome sequencing in the Dysderoidea superfamily

This is the first sequencing of the nuclear and mitochondrial genome of a species of the Dysderoidea superfamily, and the second known in the Synspermiata group, one of the main spider lineages. In this group, the first species with available genomic data was the brown recluse spider (Loxosceles reclusa Gertsch & Mulaik, 1940), a species distributed around the American continent and well known for its necrotic venom.

According to the conclusions, the genome of D. silvatica is large (1.7 Gb) and highly complex, with a high fraction of repetitive elements. According to Professor Julio Rozas (UB-IRBio), who co-led the study together with Alejandro Sánchez-Gracia, "Within this study, we created a 1.4 Gb genome assembly, 54 % of which is made up of repetitive elements".

"We identified and characterized a total of 36,000 protein-coding genes", notes Professor Julio Rozas (UB-IRBio), head of the research group on Evolutionary Genomics and Bioinformatics at the UB.

D. silvatica has a diploid chromosome set of six pairs of autosomes plus sex chromosomes in an XX-X0 system: females have two X chromosomes (XX), whereas males have only one (X0).

Third generation sequencing techniques to tackle a complex genome

The project to sequence the genome of Dysdera silvatica started about five years ago with next generation sequencing (NGS) on platforms such as Illumina. With this protocol, the team generated one billion short reads (100 base pairs each), which were not enough to obtain a quality assembly of the species' complex genome.
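A quick back-of-the-envelope calculation helps put the short-read dataset in context. Average sequencing depth is commonly estimated as the total number of sequenced bases divided by the genome size. The sketch below assumes that each of the one billion reads is 100 bases long and uses the 1.7 Gb genome size quoted earlier in the article; the function and variable names are hypothetical.

```python
def sequencing_depth(read_count: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Average depth = total sequenced bases / genome size."""
    return read_count * read_length_bp / genome_size_bp


reads = 1_000_000_000   # one billion short reads, as stated in the article
read_length = 100       # assumption: every read is 100 bases long
genome_size = 1.7e9     # genome size quoted in the article (1.7 Gb)

depth = sequencing_depth(reads, read_length, genome_size)
print(f"Nominal depth: about {depth:.0f}x")
```

Under these assumptions the nominal depth is high, which is consistent with the article's point that the obstacle was the complexity of the genome rather than the amount of data.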

Therefore, the research team complemented these data with PacBio and Nanopore single molecule sequencing (SMS) techniques, "more expensive but more effective methods for obtaining longer sequences, which provide a quality genome assembly through a hybrid assembly strategy that combines data obtained with different sequencing technologies", notes José Francisco Sánchez-Herrero, member of the Department of Genetics, Microbiology and Statistics and of IRBio, and first author of the article.

Ecology, evolution and reproductive behavior

From a global perspective, the study offers new insight into the genetic basis of the eco-phenotypic changes that take place during adaptive radiations. In particular, for the Dysdera species of the Canary Islands, the genome of this first species can provide valuable information on the genetic architecture underlying the phenotypic and physiological changes related to trophic specialization, as well as to adaptation to the underground environment, where some species have come to live exclusively in lava tubes.

Regarding reproductive behavior, the Dysderidae family includes species that show cryptic female choice, a reproductive strategy in which the female chooses, after copulation, which male's sperm will fertilize her ova. This choice is made through a complex system of diverticula and glands associated with the female vulva. Knowing the features of the genome of a species from this family could help determine the genetic basis of this behavior, through a comparative study of genome regions under different selective constraints between sexes and among species with different sexual strategies.

Finally, the study provides useful resources for addressing other key evolutionary questions, such as the origins and evolution of spider products of medical and commercial interest (venom, silk, etc.).

Fri, 06 Sep 2019 12:30:42 +0000 http://www.bioinformaticsbarcelona.eu/news//news/141/experts-sequence-the-genome-of-an-endemic-spider-from-the-canary-islands http://www.bioinformaticsbarcelona.eu/news/141 0
A webserver to study the protein networks perturbed in diseases

Researchers from the UPF, IMIM and UVIC have developed the website GUILDify to study the molecular environment of the proteins involved in diseases. The website is a promising tool for studying the molecular mechanisms underlying a disease, understanding the relationships between two diseases and proposing potential drugs for their treatment.

The website GUILDify has been developed at the Structural Bioinformatics Group of GRIB (IMIM-UPF), led by Prof. Baldo Oliva, with the aim of studying the protein interactions involved in a disease. On the website, the user finds an input box in which to enter any disease name and the species of the interactome (human, mouse, rat...). GUILDify then returns a list of proteins whose corresponding genes are likely to be mutated when the disease occurs. The gene mutations are obtained from DisGeNET, a database of genes associated with human diseases. GUILDify uses these proteins as seeds for an algorithm that searches the interactome for the proteins most closely connected to them. In the end, the user obtains the protein interactions that are most likely to cause the disease and a list of the biological functions affected.
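The seed-expansion step described above can be illustrated with a random walk with restart, a generic network-propagation technique. This is a stand-in chosen for brevity and not necessarily the scoring that GUILDify itself applies; the network, the protein names and the parameter values below are all invented for the example.

```python
from collections import defaultdict


def propagate_scores(edges, seeds, restart=0.5, iterations=50):
    """Random walk with restart over an undirected interaction network.

    Each node repeatedly passes its score to its neighbours in equal
    shares, while a fraction `restart` of the total score is returned
    to the seed nodes at every step.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    missing = set(seeds) - set(neighbours)
    if missing:
        raise ValueError(f"seeds not present in the network: {sorted(missing)}")

    nodes = sorted(neighbours)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            incoming = sum(scores[m] / len(neighbours[m]) for m in neighbours[n])
            updated[n] = restart * start[n] + (1.0 - restart) * incoming
        scores = updated
    return scores


# Toy interactome with hypothetical protein names; P1 plays the disease seed.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
toy_seeds = {"P1"}

scores = propagate_scores(toy_edges, toy_seeds)
ranked = sorted((n for n in scores if n not in toy_seeds), key=scores.get, reverse=True)
print(ranked)
```

Proteins that are close to the seed and well connected to it rank first, while distant ones rank last, which is the intuition behind prioritizing candidates from a set of known disease genes.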

Quim Aguirre, PhD student at the Structural Bioinformatics Group (SBI) of GRIB, behind this project, tells us all about it in an article on the El·lipse website.


Mon, 17 Jun 2019 10:23:05 +0000 http://www.bioinformaticsbarcelona.eu/news//news/139/a-webserver-to-study-the-protein-networks-perturbed-in-diseases http://www.bioinformaticsbarcelona.eu/news/139 0
Computational method increases design efficiency of protein-based drugs

Researchers from the Institute of Biotechnology and Biomedicine (IBB), in collaboration with scientists from the University of Warsaw, recently presented an important update to their AGGRESCAN 3D computational method. The method aims to make the development of new generation protein-based drugs easier and cheaper by reducing their propensity to form aggregates and keeping them stable and active for longer.

Protein aggregation is a common phenomenon found in a wide range of pathologies, from Parkinson's and Alzheimer's diseases to some cancers and type 2 diabetes. Growing molecular knowledge of this phenomenon has led to the development of different algorithms capable of identifying and predicting the regions with a greater tendency to aggregate. Among the first was AGGRESCAN, developed by the same IBB researchers, which took into account the propensity of the linear sequence, but not the 3D structure acquired by globular proteins. Four years ago, the same team put forward the idea of making predictions on these protein structures by implementing the AGGRESCAN 3D (A3D) server. This server predicted the aggregation properties of globular proteins more precisely than servers based on the linear sequence. It also provided new features, such as the possibility of easily modelling pathogenic mutations, or a dynamic mode, which allowed modelling the flexibility of small proteins to find potentially hidden regions.
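The sequence-based approach mentioned above, which scores a protein from its linear sequence alone, can be sketched as a sliding-window average over per-residue propensities. The following is a minimal illustration under stated assumptions: the propensity values, window size, threshold and function names are invented for the example and are not AGGRESCAN's actual scale or parameters.

```python
# Hypothetical per-residue aggregation propensities (illustrative values only).
TOY_PROPENSITY = {
    "I": 1.8, "V": 1.6, "L": 1.4, "F": 1.7,
    "K": -0.9, "D": -1.8, "E": -1.2, "G": -0.5, "S": -0.3,
}


def windowed_scores(sequence, propensity, window=5):
    """Average the propensity over a window centred on each residue."""
    half = window // 2
    scores = []
    for i in range(len(sequence)):
        lo = max(0, i - half)
        hi = min(len(sequence), i + half + 1)
        values = [propensity.get(res, 0.0) for res in sequence[lo:hi]]
        scores.append(sum(values) / len(values))
    return scores


def hot_spots(scores, threshold=0.0, min_length=5):
    """Return (start, end) index pairs of runs scoring above the threshold."""
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i
        elif score <= threshold and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_length:
        regions.append((start, len(scores)))
    return regions


toy_sequence = "KDEGSIVLFVIKDEGS"  # made-up peptide with a hydrophobic core
profile = windowed_scores(toy_sequence, TOY_PROPENSITY)
regions = hot_spots(profile)
print(regions)
```

On the toy peptide, the flagged region covers the hydrophobic stretch in the middle. A purely linear scan like this cannot tell whether those residues are buried or exposed in the folded protein, which is the limitation the article says the structure-based A3D server was designed to address.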

The latest update was presented as a web server freely accessible to the academic world, in addition to a desktop version compatible with Windows, MacOS and Linux. The new algorithm overcomes previous limitations and substantially improves computational performance, so that the flexibility of molecules of biomedical interest can be modelled. It also includes tools such as the automatic generation of mutations, which makes it easier to redesign proteins such as antibodies so that they are both stable and more soluble, and an improved user interface for viewing the data directly on the website.

"With this update, the A3D becomes one of the most complete aggregation predictors. The fact that a single place offers you the chance to make protein aggregation predictions, model their flexibility, study options for a smart redesign and verify how different factors can affect them, represents a giant step forward with regard to other similar servers", affirms Salvador Ventura, researcher at the IBB and the Department of Biochemistry and Molecular Biology, as well as creator of the A3D. "Among other things, all of this will allow us to improve the production of protein-based drugs, reducing the costs of development, production, storage and distribution".

Protein aggregation, a key element in biomedicine and biotechnology

Protein aggregation has gone from being an ignored area of protein chemistry to becoming a key element within the biomedicine and biotechnology fields. "Protein misfolding and subsequent aggregation is behind a growing number of human disorders and is one of the most important impediments to designing and manufacturing proteins for therapeutic applications. These therapies, which involve the use of monoclonal antibodies, growth factors and enzyme replacements, have already demonstrated high precision of molecular targeting, and therefore the need to study them in more depth becomes even more important", Salvador Ventura concludes.


Aleksander Kuriata, Valentín Iglesias, Jordi Pujols, Mateusz Kurcinski, Sebastian Kmiecik and Salvador Ventura. Aggrescan3D (A3D) 2.0: prediction and engineering of protein solubility. Nucleic Acids Research, gkz321. doi: 10.1093/nar/gkz321

Aleksander Kuriata, Valentín Iglesias, Salvador Ventura and Sebastian Kmiecik. Aggrescan3D standalone package for structure-based prediction of protein aggregation properties. Bioinformatics, 2019, btz143. doi: 10.1093/bioinformatics/btz143

To learn more about how AGGRESCAN3D works please visit the following website: http://biocomp.chem.uw.edu.pl/A3D2/

Tue, 28 May 2019 07:43:23 +0000 http://www.bioinformaticsbarcelona.eu/news//news/137/computational-method-increases-design-efficiency-of-protein-based-drugs http://www.bioinformaticsbarcelona.eu/news/137 0
First stable simulations of DNA crystals

A research team has presented the first stable simulations of DNA crystals, according to a study published in the journal Chem (part of the publisher Cell) and led by Modesto Orozco, professor at the Department of Biochemistry and Molecular Biomedicine of the Faculty of Biology of the UB, and head of a research group at the Institute for Research in Biomedicine (IRB Barcelona) and the platform Bioinformatics Barcelona (BIB).

The new study provides the most detailed description so far of the properties of DNA crystal systems at the atomic scale. This scientific milestone helps explain the importance of the chemical additives that are used experimentally to reach suitable crystallization conditions and obtain stable crystals in the laboratory.

According to Pablo D. Dans, postdoctoral researcher at IRB Barcelona, "the first to benefit from the study is the community of computational physicists and the chemists and biophysicists, who now have a clear protocol and reference to get stable simulations of DNA crystals".

According to Professor Modesto Orozco, head of the Molecular Modelling and Bioinformatics laboratory of IRB Barcelona, "in the long run, the simulation of several crystals obtained under different experimental conditions should allow us to predict the effect of a certain chemical additive, and guide crystallographers in their experiments, reducing the costs and the time to get the crystals".


Thu, 30 May 2019 09:01:19 +0000 http://www.bioinformaticsbarcelona.eu/news//news/135/first-stable-simulations-of-dna-crystals http://www.bioinformaticsbarcelona.eu/news/135 0
Researchers discover the action mechanism of an antitumor drug to treat glioblastoma

Glioblastoma is a type of brain tumor with no cure, usually associated with mutations in the epidermal growth factor receptor (EGFR). The main EGFR mutation found in this tumor, known as EGFRvIII, is treated with the antibody mAb806, a drug developed by the Ludwig Institute for Cancer Research (United States) about twenty years ago, although its action mechanism was unknown. Now, a new study published in the journal Proceedings of the National Academy of Sciences (PNAS) reveals for the first time the action mechanism of this antibody on the mutated EGFR.


The results of the study, which open new pathways for the treatment of cancer, suggest that the antibody mAb806 could be used in many tumours in which EGFR has mutated, and not only against one specific mutation, as researchers had believed until now. The study involved experts from the University of Barcelona, the Institute for Research in Biomedicine (IRB Barcelona), Stockholm University (Sweden), and the University of California (United States), among other institutions.

Moreover, the scientific team proved that, even if EGFR has not mutated, it can be treated to make it sensitive to therapy with the antibody mAb806. "These findings provide the rational basis to conduct anti-EGFR therapies combined with antibodies and kinase inhibitors, instead of testing them blindly, as has happened so far", notes Modesto Orozco, professor at the Department of Biochemistry and Molecular Biology at the Faculty of Chemistry of the UB, head of the Molecular Modelling and Bioinformatics Lab at IRB Barcelona and member of the Bioinformatics Barcelona platform (BIB).




Thu, 23 May 2019 09:19:11 +0000 http://www.bioinformaticsbarcelona.eu/news//news/133/researchers-discover-the-action-mechanism-of-an-antitumor-drug-to-treat-glioblastoma http://www.bioinformaticsbarcelona.eu/news/133 0
Ready, set, go for crossing the barrier!

Genes contain all the information needed for the functioning of cells, tissues, and organs in our body. Gene expression, that is, when and how genes are read and executed, is tightly regulated, like an assembly line on which several things happen one after another.

Researchers at the Centre for Genomic Regulation (CRG) in collaboration with the Structural Bioinformatics group of GRIB (IMIM-UPF) led by Baldo Oliva and the Department of Molecular Epigenetics, Helmholtz Center Munich, Germany, have discovered a new step in this line, which controls the expression of some genes with an important role in cancer. "We observed that breast cancer cells need a particular modification to express a set of genes required for cellular proliferation and tumour progression," explains the CRG - Beatriu de Pinós postdoctoral researcher Priyanka Sharma, first author of the paper. "This modification allows the enzyme RNA polymerase II to overcome a pausing barrier and to continue to transcribe these genes," adds Sharma.

Cancer cells are geared to proliferate quickly, so genes involved in cell division and proliferation are very active and usually highly expressed. Such precise and meticulous machinery requires many different molecules to function properly. In this case, when all the machinery to express proliferation genes is ready, it still has to wait for a particular modification before it can go. It is like a race in which runners are told to get ready, get set and go: the polymerase is ready and set, but still needs a final modification to cross the barrier to transcription and go.

"Deciphering every single step and all actors involved in this process is an important achievement in terms of fundamental science. We are now able to better understand how an intricate mechanism of gene regulation actually works and this might be a new target for clinical researchers to study novel therapies for certain types of cancer," states Miguel Beato, CRG group leader and principal investigator in this work.

The work, which has been published in Molecular Cell, describes a novel modification in the carboxyl-terminal domain of RNA polymerase II, namely the deimination of an arginine by the enzyme PADI2, which allows the polymerase to transcribe genes relevant for cancer cell growth. "Most chemotherapies are aimed at blocking the activity of enzymes, but we know that PADI2 participates in many different processes involving the nervous system, immune response and inflammation, among others. Thus, inhibiting PADI2 would have multiple side effects. Our results make it possible to target just the particular action of PADI2 on RNA polymerase needed for tumour progression without globally blocking the enzyme," explains Beato.

Reference article: Priyanka Sharma et al. "Arginine citrullination at the c-terminal domain controls RNA polymerase II transcription" Molecular Cell (2018) DOI: 10.1016/j.molcel.2018.10.01.

For further information and media requests, please contact: Laia Cendrós, press officer, Centre for Genomic Regulation (CRG) - Tel. +34 93 316 0237.

Mon, 28 Jan 2019 13:00:20 +0000 http://www.bioinformaticsbarcelona.eu/news//news/125/ready-set-go-for-crossing-the-barrier http://www.bioinformaticsbarcelona.eu/news/125 0
First year of the European project eTRANSAFE: read the keynote from Prof. Ferran Sanz, Project Coordinator

The eTRANSAFE (Enhancing TRANslational SAFEty Assessment through Integrative Knowledge Management) project aims at enhancing the efficiency of translational safety assessment during the drug discovery and development process by means of the development of an integrative data infrastructure and innovative computational methods and tools. This infrastructure will be underpinned by legacy data sharing, the use of data standards, such as the SEND format for preclinical studies, and the implementation of a modular computational architecture.

eTRANSAFE started its journey in September 2017. During the first year, we have laid the foundations for developing widely accepted guidelines on safe legacy data sharing, and we have been working on extending the preclinical database released in the previous IMI eTOX project by incorporating individual animal data in SEND format together with structural and pharmacological information. In addition, the first steps towards the development of the eTRANSAFE Knowledge Hub have been to identify useful data available in the public domain, integrate it with the data contributed by the EFPIA partners, and identify potential challenges by means of prototypes and use cases.

In the past months, an internal group that gathers representatives from all work packages has been working intensively on a preliminary prototype that allows searching for compounds, annotating those compounds with pre-clinical and clinical data, and visualising that data. This has proven to be a useful exercise that gives the project a means to better explore how data will need to be integrated and accessed from the final Knowledge Hub. This work will be developed further.

The Project is pleased with the progress achieved and will continue advancing towards its final objective of supporting and improving drug discovery and development.

The eTRANSAFE consortium is a public-private partnership of 8 academic institutions, 6 SMEs and 12 pharmaceutical companies. It is coordinated by the Fundació Institut Mar d'Investigacions Mèdiques (IMIM) and led by the pharmaceutical company Novartis. Universitat Pompeu Fabra is a partner of the Consortium.


Mon, 28 Jan 2019 13:00:03 +0000 http://www.bioinformaticsbarcelona.eu/news//news/123/first-year-of-the-european-project-etransafe-read-the-keynote-from-prof-ferran-sanz-project-coordinator http://www.bioinformaticsbarcelona.eu/news/123 0
The depths of the ocean and gut flora unravel the mystery of microbial genes

Understanding the functions of genes in bacteria that form part of the human microbiome (the collection of microbes found inside our bodies) is important because these genes might explain mechanisms of bacterial infection or cohabitation in the host, antibiotic resistance, or the many effects, positive and negative, that the microbiome has on human health.

Surprisingly, the functions of a huge number of microbial genes are still unknown. This knowledge gap can be thought of as "genomic dark matter" in microbes, and neither computational biology nor current lab techniques have been able to address this gap.

This challenge has now been tackled through an international collaboration between the Institute for Research in Biomedicine (IRB Barcelona) and two other interdisciplinary research centres, namely the IJS in Ljubljana (Slovenia) and RBI in Zagreb (Croatia). The findings have been published recently in Microbiome, the international journal of reference in microbiome research. The study was led by Fran Supek, computational biologist and leader of the Genome Data Science lab at IRB Barcelona, and first-authored by Vedrana Vidulin, a computer scientist affiliated to the centres in Slovenia and Croatia.

Intelligent prediction method

The researchers have developed a new computational method able to examine thousands of metagenomes simultaneously and identify the evolutionary signal that can predict the function of many microbial genes. This method, which analyses "big data" from human microbiomes (e.g. from the intestine or skin) and other metagenomes (e.g. from the soil or ocean) is based on a special kind of machine learning algorithm: it can create "decision trees" to predict hundreds of different functions at once, finding links between genes and at the same time predicting what they do in the microbial cell.

"This makes the algorithm very good at not getting confused by the noise in the metagenomic data, meaning that it is accurate and can confidently propose a biological role for a large number of genes with unknown functions. Intriguingly, it also proposes many additional functions for genes that already have some known role," says Supek.

The most important finding to emerge from this research is that the analysis of human microbiomes and other metagenomic data, such as those of the soil and ocean, allows researchers to assign hundreds of gene functions that have evaded current computational genomics approaches until now. "In other words, metagenomes allow scientists to see what ordinary genomes don't," explains the Croatian researcher, who was recently awarded a grant from the European Research Council (ERC).

Diversity is key

The scientists have found that different types of environments can predict different types of gene functions. For example, metagenomes from the ocean can be used to predict the genes used by bacteria for photosynthesis. But as the researchers point out, this could not have been discovered from the bacteria in the human gut. In contrast, the gut microbiome has been very useful for predicting key genes involved in the mechanisms underlying the development of disease and in the metabolism of alcohol and the biosynthesis of certain amino acids, predictions that would have been more difficult to make using microbiomes from the environment.
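The idea of a decision tree that predicts several functions at once, together with the observation that ocean samples are informative about photosynthesis genes, can be illustrated with a depth-one tree (a "stump") trained on presence/absence profiles. This is a greatly simplified sketch with invented data and hypothetical names, not the algorithm used in the study.

```python
def mean_labels(rows):
    """Average label vector of a group of genes (one value per function)."""
    n = len(rows)
    n_labels = len(rows[0][1])
    return [sum(labels[k] for _, labels in rows) / n for k in range(n_labels)]


def label_spread(rows):
    """Summed squared deviation of the labels from their mean, over all functions."""
    means = mean_labels(rows)
    return sum((labels[k] - means[k]) ** 2
               for _, labels in rows for k in range(len(means)))


def fit_stump(rows):
    """Pick the single environment whose presence/absence best separates the labels."""
    best = None
    for j in range(len(rows[0][0])):
        absent = [r for r in rows if r[0][j] == 0]
        present = [r for r in rows if r[0][j] == 1]
        if not absent or not present:
            continue  # this environment does not split the genes at all
        cost = label_spread(absent) + label_spread(present)
        if best is None or cost < best[0]:
            best = (cost, j, mean_labels(absent), mean_labels(present))
    if best is None:
        raise ValueError("no environment splits the training genes")
    _, j, leaf_absent, leaf_present = best
    return j, leaf_absent, leaf_present


def predict(stump, profile):
    """Return one score per function for a gene with the given profile."""
    j, leaf_absent, leaf_present = stump
    return leaf_present if profile[j] == 1 else leaf_absent


# Invented training genes: presence in (ocean, soil, gut) metagenomes and
# labels for (photosynthesis, alcohol metabolism).
training = [
    ([1, 1, 0], [1, 0]),
    ([1, 0, 0], [1, 0]),
    ([0, 1, 1], [0, 1]),
    ([0, 0, 1], [0, 1]),
    ([1, 0, 0], [1, 0]),
]

stump = fit_stump(training)
print(predict(stump, [1, 1, 0]))  # scores for a gene seen in ocean and soil samples
```

Because each leaf stores a whole vector of function scores, a single tree answers several questions at once. A realistic model would use many deeper trees and far more environments and functions.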

The authors conclude that, through machine learning, a large and diverse set of environments allows us to learn about many different gene functions in microbes. "Computational methods like this one are shedding light on the "dark matter" within microbial genomes: the enormous number of genes in bacteria and in archaea whose functions are a mystery," says Supek.

The thousands of computational predictions generated will need to be validated in experiments. Once validated, they may lead to the discovery of new genes that explain how bacteria shape the ecosystems around us and indeed the ecosystem within us, the human microbiome.

This study has been funded through the European FP7 'Future and Emerging Technologies' Programme and an ERC Starting Grant.

Reference article:

The evolutionary signal in metagenome phyletic profiles predicts many gene functions

Vedrana Vidulin, Tomislav Šmuc, Sašo Džeroski and Fran Supek

Microbiome (2018) 6:129 Doi: https://doi.org/10.1186/s40168-018-0506-4




Wed, 18 Jul 2018 08:26:59 +0000 http://www.bioinformaticsbarcelona.eu/news//news/121/the-depths-of-the-ocean-and-gut-flora-unravel-the-mystery-of-microbial-genes http://www.bioinformaticsbarcelona.eu/news/121 0