Sequencing the genome of an organism allows scientists to investigate its unique genetic makeup, its evolutionary links to other creatures and how it has adapted to its environment. KAUST researchers have sequenced the first genome of a coral reef fish: the blacktail butterflyfish (Chaetodon austriacus), an iconic Red Sea species regarded as an indicator of coral health.
While genome sequences already exist for well-established model species such as the zebrafish, which is commonly used in medical research, there are no genomes publicly available for natural populations of tropical reef fish. A team from KAUST, including Michael Berumen, former postdoctoral fellow Joseph DiBattista and their colleagues, sought to fill this significant gap in fish genomic data.
“The blacktail butterflyfish has one of the most restricted ranges of any butterflyfish species, and is largely concentrated in the northern and central Red Sea,” explained DiBattista. “Therefore, it is likely to have developed unique genomic adaptations to this environment.”
Identifying these genetic mechanisms may also help predict how other marine organisms could adapt to challenging sea conditions in future.
The team faced a considerable task when it came to sequencing the new genome, partly because they had no reference genomes from closely related fish to compare against. They took portions of gill filaments from a wild butterflyfish and generated a mix of short DNA fragments, also called reads.
“We then undertook a series of steps to figure out which reads connected with each other and how they overlapped as a whole,” explained Berumen. “Imagine trying to reconstruct a lengthy book from tiny segments consisting of a few hundred characters, each taken from a random part of that book. This very quickly becomes a computer science problem, since it would be impossible to do it manually. Most fish genomes consist of around a billion base pairs, or a book with a billion characters in our analogy!”
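The book analogy can be made concrete with a toy example. The sketch below is a minimal greedy overlap assembler in Python, written only to illustrate the idea of joining reads by their overlaps; it is not the pipeline the KAUST team used, and the function names, the `min_len` threshold and the 15-letter example sequence are all hypothetical. Real assemblers must also cope with sequencing errors, repeated regions and billions of reads, none of which this sketch handles.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that is also a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the largest overlap.

    Returns the list of remaining sequences (contigs). Illustrative only:
    assumes error-free reads and no repeated regions.
    """
    # Drop exact duplicates, then reads fully contained in another read.
    reads = list(dict.fromkeys(reads))
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]

    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join; leave the pieces as separate contigs
        # Join the two reads, keeping the shared region only once.
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three shuffled "reads" taken from the made-up sequence ATGGCGTACGTTAGC.
example_reads = ["TACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(example_reads))  # ['ATGGCGTACGTTAGC']
```

Each pass compares every read against every other, so the work grows rapidly with the number of reads. That scaling is one reason the problem, at the size of a billion-character genome, cannot be tackled by hand.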
Berumen sought the bioinformatics expertise of Manuel Aranda's group from the University’s Red Sea Research Center. Once the team had assembled the genome, they analyzed it to ensure it made sense; for example, by checking for the existence of genes previously identified in other organisms.
Their final high-quality genome includes 28,926 protein-coding genes. The team hopes their genome will enable studies on the co-evolution of reef fish species and comparisons of gene sequences between closely related fish across the Indo-Pacific region.
The genome may also help stem the trade in wild reef fish, because aquaculture specialists may eventually be able to use the data to produce new aquarium-tolerant species to meet the market demand for decorative fish.