How is it? How can we investigate this flora of viruses that surround us, and aid medicine? How can we turn our cumulative knowledge of virology into a simple, hand-held, single diagnostic assay? I want to turn everything we know right now about detecting viruses and the spectrum of viruses that are out there into, let's say, a small chip.
When we started thinking about this project—how we would make a single diagnostic assay to screen for all pathogens simultaneously—well, there's some problems with this idea. First of all, viruses are pretty complex, but they're also evolving very fast. This is a picornavirus. Picornaviruses—these are things that include the common cold and polio, things like this. You're looking at the outside shell of the virus, and the yellow color here are those parts of the virus that are evolving very, very fast, and the blue parts are not evolving very fast. When people think about making pan-viral detection reagents, usually it's the fast-evolving problem that's an issue, because how can we detect things if they're always changing? But evolution is a balance: where you have fast change, you also have ultra-conservation—things that almost never change.
And so we looked into this a little more carefully, and I'm going to show you data now. This is just some stuff you can do on the computer from the desktop. I took a bunch of these small picornaviruses, like the common cold, like polio and so on, and I just broke them down into small segments. And so I took this first example, which is called coxsackievirus, and just broke it into small windows. And I'm coloring these small windows blue if another virus shares an identical sequence in its genome to that virus. These sequences right up here—which don't even code for protein, by the way—are almost absolutely identical across all of these, so I could use this sequence as a marker to detect a wide spectrum of viruses, without having to make something individual. Now, over here there's great diversity: that's where things are evolving fast. Down here you can see slower evolution: less diversity.
Now, by the time we get out here to, let's say, acute bee paralysis virus—probably a bad one to have if you're a bee—this virus shares almost no similarity to coxsackievirus, but I can guarantee you that the sequences that are most conserved among these viruses on the right-hand side of the screen are in identical regions right up here. And so we can encapsulate these regions of ultra-conservation through evolution—how these viruses evolved—by just choosing DNA elements or RNA elements in these regions to represent on our chip as detection reagents.
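The windowing idea described above can be sketched in a few lines of Python. This is only a toy illustration, not the analysis used for the talk: the sequences are invented, the helper names `windows` and `conserved_windows` are hypothetical, and a window counts as conserved only if it appears verbatim in every other genome, which is a stricter rule than real conservation analysis would use.

```python
def windows(seq, size, step):
    """Yield (start, window) pairs of fixed-size windows along seq."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def conserved_windows(reference, others, size=10, step=5):
    """Return start positions of reference windows found verbatim in every other genome."""
    return [
        start
        for start, window in windows(reference, size, step)
        if all(window in other for other in others)
    ]


# Invented toy "genomes": they share one 10-base stretch and differ elsewhere.
shared = "TTAAAACAGC"
reference = shared + "ATGCCGTAAGCT"
other_1 = "GG" + shared + "CCGATTACGGAT"
other_2 = shared + "TTTGACCAGTCA"

hits = conserved_windows(reference, [other_1, other_2], size=10, step=5)
print(hits)  # start positions of windows shared by all three toy genomes
```

Windows that come back from `conserved_windows` are the ones you would color blue in the picture, and they are the natural candidates to put on a chip as broad detection reagents.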
Okay, so that's what we did, but how are we going to do that? Well, for a long time, since I was in graduate school, I've been messing around making DNA chips—that is, printing DNA on glass. And that's what you see here: These little salt spots are just DNA tacked onto glass, and so I can put thousands of these on our glass chip and use them as a detection reagent. We took our chip over to Hewlett-Packard and used their atomic force microscope on one of these spots, and this is what you see: you can actually see the strands of DNA lying flat on the glass here. So, what we're doing is just printing DNA on glass—little flat things—and these are going to be markers for pathogens. Okay, I make little robots in lab to make these chips, and I'm really big on disseminating technology. If you've got enough money to buy just a Camry, you can build one of these too, and so we put a deep how-to guide on the Web, totally free, with basically order-off-the-shelf parts. You can build a DNA array machine in your garage. Here's the section on the all-important emergency stop switch. Every important machine's got to have a big red button. But really, it's pretty robust. You can actually be making DNA chips in your garage and decoding some genetic programs pretty rapidly. It's a lot of fun.
And so what we did—and this is a really cool project—we just started by making a respiratory virus chip. I talked about that—you know, that situation where you go into the clinic and you don't get diagnosed? Well, we just put, basically, all the human respiratory viruses on one chip, and we threw in herpes virus for good measure—I mean, why not? The first thing you do as a scientist is, you're gonna make sure your stuff works. And so what we did is, we take tissue culture cells and infect them with various viruses, and we take the stuff and fluorescently label the nucleic acid, the genetic material that comes out of these tissue culture cells—mostly viral stuff—and stick it on the array to see where it sticks. Now, if the DNA sequences match, they'll stick together, and so we can look at spots. And if spots light up, we know there's a certain virus in there.
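The "if the sequences match, they'll stick together" step can be modeled very crudely in code. The sketch below assumes perfect base pairing only: a spot lights up when the reverse complement of its probe appears exactly in a labeled strand. Real hybridization tolerates mismatches and depends on temperature and salt, so treat this as a cartoon. The probe names and sequences are made up, and `lit_spots` is a hypothetical helper.

```python
# Translation table for DNA base pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that would pair perfectly with seq."""
    return seq.translate(COMPLEMENT)[::-1]


def lit_spots(probes, labeled_strands):
    """Names of probes whose reverse complement occurs in any labeled strand."""
    return sorted(
        name
        for name, probe in probes.items()
        if any(reverse_complement(probe) in strand for strand in labeled_strands)
    )


# Invented probes printed on the "chip" and one invented labeled sample strand.
probes = {"spot_1": "ATGCCGTT", "spot_2": "GGGTTTAA"}
sample = ["CCAACGGCATGG"]

print(lit_spots(probes, sample))  # names of the spots that would light up
```

The list of lit spots is the raw material for the barcodes: each virus lights up its own subset of spots.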
That's what one of these chips really looks like, and these red spots are, in fact, signals coming from the virus. And each spot represents a different family of virus or species of virus. And so, that's a hard way to look at things, so I'm just going to encode things as a little barcode, grouped by family, so you can see the results in a very intuitive way. What we did is, we took tissue culture cells and infected them with adenovirus, and you can see this little yellow barcode next to adenovirus. And, likewise, we infected them with parainfluenza-3—that's a paramyxovirus—and you see a little barcode here. And then we did respiratory syncytial virus. That's the scourge of daycare centers everywhere—it's like boogeremia, basically. You can see that this barcode is the same family, but it's distinct from parainfluenza-3, which gives you a very bad cold. And so we're getting unique signatures, a fingerprint for each virus. Polio and rhino: they're in the same family, very close to each other. Rhino's the common cold, and you all know what polio is, and you can see that these signatures are distinct. And Kaposi's sarcoma-associated herpes virus gives a nice signature down here. And so it is not any one stripe or something that tells me I have a virus of a particular type here; it's the barcode that in bulk represents the whole thing.
All right, I can see a rhinovirus—and here's the blow-up of the rhinovirus's little barcode—but what about different rhinoviruses? How do I know which rhinovirus I have? There're 102 known variants of the common cold, and there're only 102 because people got bored collecting them: there are just new ones every year. And so, here are four different rhinoviruses, and you can see, even with your eye, without any fancy computer pattern-matching recognition software algorithms, that you can distinguish each one of these barcodes from each other.
Now, this is kind of a cheap shot, because I know what the genetic sequence of all these rhinoviruses is, and I in fact designed the chip expressly to be able to tell them apart, but what about rhinoviruses that have never seen a genetic sequencer? We don't know what the sequence is; just pull them out of the field. So, here are four rhinoviruses we never knew anything about—no one's ever sequenced them—and you can also see that you get unique and distinguishable patterns. You can imagine building up some library, whether real or virtual, of fingerprints of essentially every virus. But that's, again, shooting fish in a barrel, you know, right? You have tissue culture cells. There are a ton of viruses. What about real people? You can't control real people, as you probably know. You have no idea what someone's going to cough into a cup, and it's probably really complex, right? It could have lots of bacteria, it could have more than one virus, and it certainly has host genetic material. So how do we deal with this? And how do we do the positive control here?
Well, it's pretty simple. That's me, getting a nasal lavage. And the idea is, let's experimentally inoculate people with virus. So, this is all IRB-approved, by the way; they got paid. And basically, we experimentally inoculate people with the common cold virus. Or, even better, let's just take people right out of the emergency room—undefined, community-acquired respiratory tract infections. You have no idea what walks in through the door. So, let's start off with the positive control first, where we know the person was healthy. They got a shot of virus up the nose, let's see what happens.
Day zero: nothing happening. They're healthy; they're clean—it's amazing. Actually, we thought the nasal tract might be full of viruses even when you're walking around healthy. It's pretty clean. If you're healthy, you're pretty healthy. Day two: we get a very robust rhinovirus pattern, and it's very similar to what we get in the lab doing our tissue culture experiment. So that's great, but again, cheap shot, right? We put a ton of virus up this guy's nose. So, I mean, we wanted it to work. I mean, he really had a cold. So, how about the people who walk in off the street?
So, here are two individuals represented by their anonymous ID codes. They both have rhinoviruses; we've never seen this pattern in lab. We sequenced part of their viruses; they're new rhinoviruses no one's actually even seen. Remember, the evolutionarily conserved sequences we're using on this array allow us to detect even novel or uncharacterized viruses, because we pick what is conserved throughout evolution. Here's another guy. You can play the diagnosis game yourself here. These different blocks represent the different viruses in this paramyxovirus family, so you can kind of go down the blocks and see where the signal is. Well, he doesn't have canine distemper; that's probably good. But by the time you get to block nine, you see that respiratory syncytial virus. Maybe they have kids. And then you can see, also, the family member that's related: RSVB is showing up here. So, that's great. Here's another individual, sampled on two separate days—repeat visits to the clinic. This individual has parainfluenza-1, and you can see that there's a little stripe over here for Sendai virus: that's mouse parainfluenza. The genetic relationships are very close there. That's a lot of fun.
So, we built out the chip. We made a chip that has every known virus ever discovered on it. Why not? Every plant virus, every insect virus, every marine virus, everything that we could get out of GenBank—that is, the national repository of sequences. Now we're using this chip. And what are we using it for? Well, first of all, when you have a big chip like this, you need a little bit more informatics, so we designed the system to do automatic diagnosis. And the idea is that we simply have virtual patterns, because we're never going to get samples of every virus—it would be virtually impossible. But we can get virtual patterns, and compare them to our observed result—which is a very complex mixture—and come up with some sort of score of how likely it is this is a rhinovirus or something. And this is what this looks like. If, for example, you used a cell culture that's chronically infected with papilloma, you get a little computer readout here, and our algorithm says it's probably papilloma type 18. And that is, in fact, what these particular cell cultures are chronically infected with.
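One way to picture the comparison of virtual patterns against an observed result is as a similarity score between vectors of spot intensities. The talk does not say which scoring method the system uses, so the cosine similarity below is an assumption chosen for simplicity. The virus names, patterns, and intensities are all invented, and `rank_candidates` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(observed, virtual_patterns):
    """Sort candidate viruses by how well their predicted pattern matches the observation."""
    scores = {name: cosine(observed, pattern) for name, pattern in virtual_patterns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented predicted patterns: 1 means the spot is expected to light up.
virtual_patterns = {
    "virus_A": [1, 1, 0, 0, 0, 0],
    "virus_B": [0, 0, 1, 1, 0, 0],
    "virus_C": [0, 0, 0, 1, 1, 1],
}

# Invented observed intensities from a noisy sample.
observed = [0.9, 0.8, 0.1, 0.0, 0.05, 0.0]

ranking = rank_candidates(observed, virtual_patterns)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The top entry in `ranking` plays the role of the computer readout: the candidate whose predicted barcode looks most like what actually lit up.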
So let's do something a little bit harder. We put the beeper in the clinic. When somebody shows up, and the hospital doesn't know what to do because they can't diagnose it, they call us. That's the idea, and we're setting this up in the Bay Area. And so, this case report happened three weeks ago. We have a 28-year-old healthy woman, no travel history, immunocompetent, doesn't smoke, doesn't drink. Ten-day history of fevers, night sweats, bloody sputum—she's coughing up blood—muscle pain. She went to the clinic, and they gave her antibiotics, and then sent her home. She came back after ten days of fever, right? Still has the fever, and she's hypoxic—she doesn't have much oxygen in her lungs. They did a CT scan. A normal lung is all sort of dark and black here. All this white stuff—it's not good. This sort of tree-in-bud formation indicates there's inflammation; there's likely to be infection. Okay. So, the patient was treated then with a third-generation cephalosporin antibiotic and doxycycline, and on day three, it didn't help: she had progressed to acute respiratory failure. They had to intubate her, so they put a tube down her throat and they began to mechanically ventilate her. She could no longer breathe for herself. What to do next? Don't know. Switch drugs: so they switched to an antiviral, Tamiflu. It's not clear why they thought she had the flu, but they switched to Tamiflu.
And on day six, they basically threw in the towel. You do an open lung biopsy when you've got no other options. There's an eight percent mortality rate with just doing this procedure, and so what do they learn from it? You're looking at her open lung biopsy. And I'm no pathologist, but you can't tell much from this. All you can tell is, there's a lot of swelling: bronchiolitis. It was "unrevealing." That's the pathologist's report. And so, what did they test her for? They have their own tests, of course, and so they ran over 70 different assays on her, every sort of bacterial, fungal and viral assay you can buy off the shelf: SARS, metapneumovirus, HIV, RSV—all these. Everything came back negative, over 100,000 dollars' worth of tests. I mean, they went to the max for this woman.
And basically, on hospital day eight, that's when they called us. They gave us endotracheal aspirate—you know, a little fluid from the throat, from this tube that they got down there—and they gave us this. We put it on the chip; what do we see? Well, we saw parainfluenza-4. Well, what the hell's parainfluenza-4? No one tests for parainfluenza-4. No one cares about it. In fact, it's not even really sequenced that much. There's just a little bit of it sequenced. There's almost no epidemiology or studies on it. No one would even consider it, because no one had a clue that it could cause respiratory failure. And why is that? Just lore. There's no data—no data to support whether it causes severe or mild disease. Clearly, we have a case of a healthy person that's going down.
Okay, that's one case report. I'm going to tell you one last thing in the last two minutes that's unpublished—it's going to come out tomorrow—and it's an interesting case of how you might use this chip to find something new and open a new door. Prostate cancer. I don't need to give you many statistics about prostate cancer. Most of you already know it: third leading cause of cancer deaths in the U.S. Lots of risk factors, but there is a genetic predisposition to prostate cancer. For maybe about 10 percent of prostate cancer, there are folks that are predisposed to it. And the first gene that was mapped in association studies for this, early-onset prostate cancer, was this gene called RNASEL. What is that? It's an antiviral defense enzyme. So, we're sitting around and thinking, "Why would men who have the mutation—a defect in an antiviral defense system—get prostate cancer? It doesn't make sense—unless, maybe, there's a virus?"
So, we put tumors—and now we have over 100 tumors—on our array. And we know who's got defects in RNASEL and who doesn't. And I'm showing you the signal from the chip here, and I'm showing you for the block of retroviral oligos. And what I'm telling you here from the signal is that men who have a mutation in this antiviral defense enzyme, and who have a tumor, often have—40 percent of the time—a signature which reveals a new retrovirus. Okay, that's pretty wild. What is it? So, we clone the whole virus. First of all, I'll tell you that a little automated prediction told us it was very similar to a mouse virus. But that doesn't tell us too much, so we actually clone the whole thing. And the viral genome I'm showing you right here? It's a classic gamma retrovirus, but it's totally new; no one's ever seen it before. Its closest relative is, in fact, from mice, and so we would call this a xenotropic retrovirus, because it's infecting a species other than mice. And this is a little phylogenetic tree to see how it's related to other viruses. And then we've done it for many patients now, and we can say that they're all independent infections. They all have the same virus, but they're different enough that there's reason to believe that they've been independently acquired. Is it really in the tissue? And I'll end up with this: yes. We take slices of these biopsies of tumor tissue and use material to actually locate the virus, and we find cells here with viral particles in them. These guys really do have this virus.
Does this virus cause prostate cancer? Nothing I'm saying here implies causality. I don't know. Is it a link to oncogenesis? I don't know. Is it the case that these guys are just more susceptible to viruses? Could be. And it might have nothing to do with cancer. But now it's a door. We have a strong association between the presence of this virus and a genetic mutation that's been linked to cancer. That's where we're at. So, it opens up more questions than it answers, I'm afraid, but that's what, you know, science is really good at. This was all done by folks in the lab—I cannot take credit for most of this. This is a collaboration between myself and Don. This is the guy who started the project in my lab, and this is the guy who's been doing prostate stuff. Thank you very much.