Penn A&S masthead

In Silico Fertilization
Information Age Gives Birth to New Knowledge in Biology

The usual disorder of papers, books, journals, file folders, and manuscripts crowds the laboratory office of biology professor David Roos. But he also keeps stacks of computer diskettes, CDs, and zip disks among the empty coffee mugs. Rows of software manuals and user guides hold up pots of plants, and a laptop is wedged into a tight clearing amid rising piles of clutter on his desk.

“What I’d really like,” remarks the intense, pony-tailed, jeans-and-sandal-shod scientist, “is to walk into my lab in the morning and have my computer wake up and say, ‘You know, I just happened to be looking at the genes in this parasite yesterday, and did you know that one-third of all Toxoplasma proteins have an A as their second letter? Is that interesting?’”

The letter A is a symbol for the amino acid alanine, a piece of the formula that defines the composition of a protein in one of the parasites that Roos’ team of researchers studies. The prominence and recurring position of alanine in the organism’s proteins are clues to how the genes are put together and what evolution has designed them to do. “That’s something we never knew before and couldn’t find out without these powerful computational tools.” He talks fast and in a monotone, as though reading data from a printout, with occasional inflections that don’t seem to emphasize anything in particular.
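The kind of question Roos imagines his computer asking can be sketched in a few lines of Python. This is only an illustration, not the lab's software: the function name `second_residue_fraction` and the toy sequences are hypothetical, and it assumes each protein is written as a string of one-letter amino acid codes.

```python
from typing import Dict


def second_residue_fraction(proteins: Dict[str, str], residue: str = "A") -> float:
    """Return the fraction of sequences whose second letter is `residue`."""
    # Only sequences with at least two letters have a second position to check.
    scorable = [seq for seq in proteins.values() if len(seq) >= 2]
    if not scorable:
        return 0.0
    # Compare case-insensitively so "a" and "A" both count as alanine.
    hits = sum(1 for seq in scorable if seq[1].upper() == residue.upper())
    return hits / len(scorable)


# Made-up toy sequences for illustration; these are NOT real Toxoplasma proteins.
toy_proteins = {
    "protein_1": "MAKLV",
    "protein_2": "MSTRE",
    "protein_3": "MGHHK",
}

print(second_residue_fraction(toy_proteins))
```

Run over every predicted protein in a genome rather than three toy strings, the same tally would surface a pattern like the one Roos describes.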

“A generation ago there was probably no one in the biology department who knew the chair of computer science,” he points out, “unless perhaps they lived on the same block.” Not long after his remark, Fernando Pereira, head of Penn’s department of computer and information science on the other side of campus, showed up for one of their informal meetings. “We’ve been exploring ways in which computational approaches can be exploited to investigate important biological problems,” Roos explains, “and ways in which biological problems will drive advances in computer science.”

Roos, the Merriam Professor in Biology and director of the Genomics Institute, is no stranger to collaboration. He takes part in teaching and research in the schools of arts and sciences, medicine, veterinary medicine, and engineering and applied science. “It’s no longer possible to distinguish between such previously disparate disciplines as anatomy, biochemistry, botany, cell biology, genetics, microbiology, zoology, and pharmacology,” he insists. Now innovations in high-tech fields like robotics and computers have entered into the life sciences mix. “As this integrative expansion encompasses engineering technologies, it’s becoming possible to examine entire systems—all the genes in a genome, all the proteins in a cell, all the metabolic processes in a tissue,” Roos adds. That intersection of biology and high technology, wet-lab research and bioinformatics (the application to biology of computer and statistical expertise), has revolutionized how researchers look at the science of life.

Roos studies protozoan parasites. “Super parasites,” he calls them, single-celled vermin that implant themselves inside the individual cells of a host. One, Toxoplasma gondii, is a prominent source of congenital neurological defects. The parasite is sometimes present in cat feces, making the home litter box a potential source of contagion, although infection mostly occurs from contaminated garden soil or rare meat. Mothers who become infected during pregnancy can pass toxoplasmosis along to their children, in whom it can cause blindness, deafness, seizures, mental retardation, and in severe cases, death. The CDC estimates that more than 60 million Americans are infected with the parasite, but healthy immune systems are able to suppress the disease. For people whose immune systems are compromised by illnesses like AIDS, however, the opportunistic parasite is a leading cause of death.

A second protozoan Roos investigates is Plasmodium, the genus of parasites that cause malaria. The World Health Organization lists malaria, along with AIDS and tuberculosis, as one of the world’s most devastating diseases and estimates that 300 to 500 million people become infected by Plasmodium every year. More than one million of them die. Most are children. More than 90 percent of malaria deaths occur in the impoverished states of sub-Saharan Africa. A study released by WHO in April 2000 reported that Africa’s gross domestic product would be nearly one-third higher, were it not for the economic life this ancient, mosquito-borne fever drains from the continent.

Existing medicines for battling malaria, notably chloroquine and Fansidar, the frontline weapons, are losing their potency as Plasmodium builds up drug resistance in the fight against it. “There are many places in the world where these drugs are essentially useless,” Roos observes. The parasite’s counterattack has sparked an international collaboration of scientists, the Malaria Genome Project, which is sequencing the genome of Plasmodium falciparum, by far the most vicious of the four species of malaria bugs. Along with Chris Stoeckert from the medical school, Roos leads the Penn teams responsible for the database, published on the Web, of P. falciparum’s nearly 6,000 genes and 30 million letters of DNA code. The PlasmoDB website has built-in tools that help scientists from around the world parse and query the swarms of data. In July, Roos traveled to Kampala, Uganda, to lead a bioinformatics training course for scientists throughout Africa, one of many workshops he has helped organize on behalf of WHO, the Howard Hughes Medical Institute, the U.S. National Academy of Sciences, and others.

“It’s a fascinating cell-biological problem to think about how these parasites interact with a host,” he ruminates. Many scientists have focused narrowly on malaria, the disease, but Roos, who serves on several NIH and WHO advisory committees, has taken up the broader challenge of understanding the whole biology of this class of unicellular pathogens. In today’s science, a potent way of doing that is by identifying and studying a “model system”—a species that is easy to manipulate experimentally and has a lot of biology in common with other organisms.

“We work on Toxoplasma,” he explains, “in part because we’ve been able to develop these parasites as a useful model for studying Plasmodium.” Over the last decade, the Roos laboratory has pioneered some of the breakthrough technologies in genetic engineering that scientists around the globe have adopted in their research centers. Because of his work, Toxoplasma is now widely recognized as a guinea pig for exploring the cell biology, molecular genetics, and pharmacology of the cell-invading parasites that cause malaria and other maladies.

Biologists studying even single-celled creatures now generate oceans of data that rival in size and complexity the databases that drive weather forecasting or predictions of market performance. “Increasingly my lab has become involved in developing tools for exploring the vast explosion of data that emerges from genome sequencing and ‘post-genomics’ projects,” he says. “But I don’t really think of myself as primarily a technologist. . . . We’ve always developed technology with an eye towards solving biological problems.”

When he’s not crunching numbers or delving through data at a keyboard, Roos can be found at the laboratory bench, pipetting fluids or peering at organisms through a microscope. “It keeps us honest,” he notes, adding pointedly that “advances are being made at the interface of laboratory and computational work.”

One surprising discovery that came out of the Roos lab was of a previously unknown organelle, a specialized subcellular structure comparable to an organ, inside Toxoplasma, Plasmodium, and related parasites. The organelle, called the apicoplast, possesses its own genome with DNA that resembles genetic material in chloroplasts, the structures plants use for photosynthesis. “These parasites are not plants,” Roos emphasizes, “but long ago in their evolutionary history they engulfed an alga and retained the chloroplast as a non-photosynthetic organelle.”

The Roos group has been engineering genetic markers to trace which genes play a part in creating and nurturing the apicoplast, an approach that could yield important clues for how to kill the parasite without hurting its host. Sorting through the genetic alphabets in their databases, Roos and his colleagues tracked down genes inside the nucleus of the malaria bug that help form the newly found organelle. “Because this bit of genetic code is derived from plants,” he suggests, “it is sufficiently different from that of humans and provides an exciting target for new drugs.” A major hurdle in devising new remedies is to find a substance that attacks disease-causing bugs without harming the human body. Typically scientists search the biology of a pathogen for something that has no counterpart in Homo sapiens.

Plant proteins are promising candidates. The proteins identified by Roos and his team of scientists have already been used by pharmaceutical companies to work up a number of anti-malarial drugs, some of which are undergoing clinical trials.

With the new molecular and computational technologies, scientists can look at biological phenomena on a scale never possible before, decoding entire genomes and tracking the chemicals of life as they assemble proteins into organelles, cells, organs, and tissues, and direct hormones, enzymes, and other molecules that sustain the organism they built up. “Genomics is indistinguishable from biology,” Roos contends. “Biologists have always observed complex systems and used their observations to develop hypotheses to be tested experimentally. Genomics allows us to expand the list of systems that we look at. Now we can study large-scale databases in the same way that we look at cells. These databases are no more complex than a cell, but we look at them using a computational microscope.”

