I recently read an article published in the Public Library of Science (link) about biofilm development on nematodes. Before getting into the article, some background is needed.
Bacteria are classically seen as unicellular organisms that exist independently of one another. These cells do not communicate with each other, and are really just a large group of individuals. Cells in multicellular organisms, in contrast, communicate with each other extensively through a variety of means. There are individuals, but the individuals exist for the good of the whole. (Cancer is an example of individuals acting in the interest of individuals, as opposed to acting in the interest of the whole organism.)
This model is nice and simple, but untrue. Different species and strains of bacteria show certain levels of communication. Though none of these forms of communication are as extensive as those seen in multicellular organisms, they are still significant. A fairly common type of bacterial communication is known as quorum sensing. In quorum sensing, bacterial cells are able to send a message to each other that essentially reads "we have reached a certain size".
How bacteria respond to this message depends on the particular species. For certain pathogenic bacteria, it is interpreted as an attack message. For a small group of bacteria, attacking a host would be certain death. The numbers are too small to cause significant damage to the host, minimizing the amount of gain from an attack. More importantly, the host will mount defenses in the form of an immune response, and a small group could very quickly be eradicated. For a small group, it is much more advantageous to sit and wait. The group's numbers slowly build, but the bacteria are proverbially under the radar of the host. As long as the bacteria are not actually harming the host, the host has no advantage in expending energy and attacking the bacteria. At some point, the bacterial numbers become significant, to the point where an immune response would not be able to dispatch the bacteria so quickly. It is at this point that the size signal is sent, triggering the bacteria to attack the host. Such behavior is quite advantageous, showing the power of such a seemingly simple signal.
In the paper, the authors looked at biofilm development in a certain genus of bacteria, namely Yersinia. (Yersinia includes the infamous Yersinia pestis, which causes the plague.) Biofilms are the closest bacteria get to being multicellular. Within a biofilm, bacteria live in close quarters with each other, producing a variety of compounds that benefit the group as a whole. Biofilms act as a platform for growth, and as a whole tend to be resistant to things that would otherwise kill off bacteria, including antibiotics. The creation of biofilms is no simple feat for bacteria, and it is often mediated by chemical signals that the cells send to each other.
Enter the poor nematode. This is a simple, very tiny worm, which is often used as a model organism in biology. Yersinia can actually make its home on nematodes, and is even capable of making biofilms on nematodes. The authors investigated how such biofilms were made. Given that nematodes are capable of moving around (and do), such biofilms seem to be an interesting area of study, as many biofilms tend to develop on static surfaces. Sure enough, the construction of these motile biofilms is mediated by the same quorum sensing signals as seen in other bacteria. Biofilms are loaded with the quorum sensing signal, namely N-acylhomoserine lactone (AHL). The authors genetically engineered bacteria that were incapable of making AHL, and the resulting bacteria were unable to develop substantial biofilms. In addition to biofilm production, they also found that quorum sensing signals triggered pathogenesis in general, as evidenced by the need for AHLs to make virulon proteins.
Though quorum sensing appears to be widely utilized by bacteria, there appears to be a large amount of variation on the common theme. There are a lot of different ways in which a "we number this many" signal can be used advantageously by bacteria. Life, through evolution, tends to explore many niches, and experimentally it seems that quorum sensing is no exception. The authors note how a number of other pathogens utilize quorum sensing in their own specific ways.
This leads to an interesting topic for experimental drugs. Without the quorum sensing signal, certain pathogens never actually express pathogenic behavior. If we can develop a drug that prevents this signal from ever reaching its target, be it through destroying the signal, blocking its receptor, or some other means, then the bacteria in question never mount an attack. While they are still there, they are effectively harmless. It seems that quorum sensing is specific to bacteria, so presumably such drugs would target bacteria specifically. Additionally, given that quorum sensing is a common theme for pathogens, such drugs may specifically target pathogenic bacteria, sparing "good" bacteria. This is unlike modern broad-spectrum antibiotics, which usually kill off everything. (Many of the negative side effects of antibiotics are due to good bacteria getting killed.) There seems to be a lot of good that could come of quorum sensing research, and I'm excited to see what the future holds for it in terms of medicine.
Monday, April 25, 2011
Monday, April 18, 2011
Recursive Pathogens
I recently read an article published in Nature letters (link). The topic of the article is that of a newly discovered pathogen: the virophage.
The virophage is something completely out of the ordinary, compared to usual pathogens. Virophages, like viruses, are not actually alive. They lack their own molecular machinery for reproduction, and must rely on the host's machinery for this purpose. For a typical virus, this is fairly simple conceptually. A typical virus hijacks the molecular machinery of the cell, using it to produce viral proteins and induce other behavior advantageous to the virus. The cell is forced to create new viruses with its own machinery, allowing for the creation and spreading of even more viruses.
In the respect of hijacking a host's machinery, the virophage is no different from a typical virus. What is atypical, however, is that virophages actually hijack the already hijacked cellular machinery. That is, virophages require that some other virus has already modified the molecular machinery of a cell in a certain way that the virophage can use it. The virophage alone cannot infect a cell; it requires both the cell and another virus infecting the cell.
For this matter, it may be wrong to say the virophage infects the cell. Based on the results of the paper, it seems more accurate to say that the virophage infects the other virus, which happens to reside in a cell. Infection with virophage caused many of the normal viral components produced to be nonfunctional. That is, the virophage impeded the spread of the infecting virus. The virophage actually had a beneficial effect for cells. Significantly fewer cells died when infected with virus + virophage instead of just virus (virophage + cells was no different than cells alone).
Although this is not too difficult to understand, it's a very different way of thinking. The common terms "pathogen" and "host" which used to have clear definitions become blurred. The virophage is not a cellular pathogen, but rather a viral pathogen. Given that viruses are not alive, this is a paradox: how can something nonliving be a pathogen to something else that is nonliving? This gets at the very root of what it means to be "alive", which has been hotly debated in the past by people across a wide variety of fields.
I think there are a lot of directions in which this research could go. For one, it is suggested that virophages are extremely common in oceans, and perhaps elsewhere. So far, all virophages discovered have come from common cooling towers, so they exist outside of the ocean as well. I wonder how many different kinds of virophages there are. Perhaps we could find a virophage for existing viral human pathogens, although this is probably jumping the gun.
A logical next step is to determine exactly how the virophage is hijacking the other virus. The nonfunctional viral particles produced are very strange, and it does not seem obvious how they come about.
Another question that comes to mind is selection advantage and the evolution of virophages. Consider an extremely virulent virus. This virus usually kills its host. For a virus, it is unfavorable to kill off the host, since the host is required for reproduction. Additionally, it is unfavorable to adversely affect the host significantly. Generally, very sick people partially quarantine themselves from the rest of the population, namely by bedrest. It is in the virus' best interest to spread to as many people as possible, and a very sick host cannot do that. This is partially why the cold virus is so ubiquitous - people rarely get sick to the point of avoiding others, which in turn spreads the virus. In summation, a highly virulent virus is bad both for the host it infects and the virus itself.
This is where I see a virophage coming in. Although the virophage is a viral pathogen, in this case, it is actually in the virus' best interest not to be so virulent. If the virophage prevents the host virus from being so pathogenic, then the end result is that the host virus can spread to more people. Granted, much less of it is spreading, but considering that only one virus is theoretically needed to start an infection, this reduction may be acceptable. The virophage is also beneficial to the cell, as cells simultaneously infected with virophage and virus usually do much better than cells infected with only virus.
That's my suspicion anyway. As stated before, there are a lot of paths this research can take from here, and I only scratched the surface with these ideas. Time to revise the textbooks.
Monday, April 11, 2011
Congenitally Blind "Sight"
I recently read an article in PNAS about how sounds are processed in the brains of humans who are blind from birth (link). Before this study, it was already known that the brains of people who are blind from birth process sounds in a fundamentally different way than those of sighted people. In sighted people, a part of the brain known as the visual cortex deals with image processing. Of course, in blind people, there are no images to process. Studies have shown that this does not, however, mean that this region of the brain is inactive in congenitally blind people (people blind from birth). The area has instead been shown to be active in the processing of sound, in addition to the normal parts of the brain that deal with sound. One may hypothesize that this additional activity forms the basis of improved sound perception in congenitally blind people, although this is speculation.
While it has been previously shown that the visual cortex processes sounds in the congenitally blind, this is not all that specific. Different neural pathways go through the visual cortex. Relevant to this paper are two pathways. One is a "what" pathway that is involved in object recognition. The other is a "where" pathway that is responsible for understanding the spatial relationships between objects.
This study picks up where others have left off, and attempts to determine which pathways are active for different sounds in congenitally blind people. The authors manipulated the pitches and locations of sounds, carefully recording which pathways became active via fMRI. The authors found that when pitch was varied, the "what" visual pathway became active. Given that pitch is one of the distinguishing properties of sounds, and that the "what" visual pathway more generically analyzes the properties of objects, this result makes sense. When the location of a given sound was varied, the "where" visual pathway became active. This is also not a major cognitive leap; the location a sound came from is analogous to where an object is currently located in one's field of vision.
I find it fascinating that there is sufficient plasticity in the brain so that these two similar systems can be mapped so well. For me, this study presents many more questions than answers. For example, most of the brain's neuronal connections in the visual cortex are at their adult state by age 11. This implies that if someone were to become blind after this age, then the brain could not rewire itself in such an elegant or effective way as seen in the congenitally blind. Presumably such people would not be able to achieve better hearing, or at least hearing comparable to the congenitally blind. It would also be interesting to see the opposite case - a congenitally blind human achieving sight after age 11. In this case, the brain would have to rewire itself for vision, and it may not be capable of this. Again, "may" is the operative word; actual studies would need to be conducted to verify such hypotheses. The plasticity of the brain is fascinating; it seems that the brain is somehow able to map to related systems when the primary system does not work. On the same note, this study makes it look like the brain attempts to achieve better utilization of existing inputs when some inputs are nonexistent. Fascinating stuff!
Sunday, April 3, 2011
Sources of Anxiety
I recently read an article in Nature that explores the nature of anxiety, found here. The paper notes previous studies on conditioned anxiety, that is, anxiety that isn't inborn, but rather learned. However, the paper instead explores anxiety that is hard-wired into the brains of mice, which are assumed to have anxiety pathways similar to those of humans. The authors looked at the amygdala, which is known to mediate emotional learning, or the attaching of certain emotions to memories. Given that many anxieties have a root in some traumatic, memorable experience, it makes sense that this region is explored. But again, they were not looking for evidence of anxiety due to such memories, but rather anxiety without a root memory.
In mice, there are some documented memory-less anxieties. Specifically, the authors looked at a mouse behavior that is reminiscent of agoraphobia. Mice naturally tend to avoid wide open areas, and show anxiety when in such open areas. Given that mice are small prey animals, one can see the selective advantage conferred by such a behavior. With anxiety of open areas, mice avoid such exposed positions, limiting their chances of becoming the next meal for a predator. Of course, usually one can be a meal only once, so this would not really work as a learned behavior.
The authors constructed experimental mazes which would make it apparent when mice showed anxiety or not. In addition, they conducted cell physiology work, in which it is possible to measure the activity of a single protein pump on a single neuron. Using these techniques, it is possible to very accurately quantify neuronal activity. Using the constructed apparatus, they were able to trigger the anxious agoraphobic behavior in mice. Watching videos of the mice moving in the experiments (included in the paper's supplemental material) makes it very clear that the authors are, in fact, able to control the anxiety. The mice show the anxiety within a second or two of activation of the apparatus, and they return to normal just as quickly when the procedure is stopped.
I would like to go into more detail than this, but I honestly do not understand the methods in their entirety. I do not think the methods are the important take-home point anyway. The authors were able to prove that there is such a thing as totally inborn anxiety, and they were able to map out a significant part of the neuronal circuitry involved with it. This could lead to a dramatic improvement in anti-anxiety medications. The authors point out that we currently do not understand anxiety in its entirety, and that current anti-anxiety medications do not directly target the pathways that trigger anxiety. Given that we do not know the pathways, it is no wonder that the medications cannot be that specific, eliciting broad, undesirable side effects.
For that matter, there is another subtle point of the paper: there are certain anxieties that are beyond our direct control. Directly controlling this region of the brain is like directly controlling heart rate. Yes, it is possible to both reduce and increase heart rate temporarily depending on the actions one performs, but it is not as simple as saying one wants a change in heart rate. Who knows what inborn fears are lurking in the deep recesses of our amygdala?
Type Erasure Part II: Bytecode and You
My last post ended in a mystery. It seems that type erasure can occasionally be circumvented using different return types, despite the fact that return types are not actually part of a method's signature. I asked this question on Stack Overflow, and I got an excellent response from "irreputable".
For all of these examples, I'm importing java.util.*. So here's something that doesn't compile:
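import java.util.*;

public class Test {
    // Does not compile: after erasure, both methods are method(List) returning int.
    public int method( List< Integer > list ) {
        return 0;
    }
    public int method( List< Double > list ) {
        return 1;
    }
}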
The error is that they "have the same erasure", meaning that the Integer and Double are both replaced with Object, which is how type erasure typically works. With this in mind, these methods are clearly the same.
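To see the "same erasure" for yourself, here is a minimal sketch using reflection. The class names ErasureDemo, TakesIntegers, and TakesDoubles and the helper erasedParameterType are my own inventions for illustration; the two methods have to live in separate classes precisely because they cannot share one.

```java
import java.lang.reflect.Method;
import java.util.List;

class ErasureDemo {
    // Hypothetical holder classes: each one carries one of the two clashing methods.
    static class TakesIntegers {
        public int method(List<Integer> list) { return 0; }
    }

    static class TakesDoubles {
        public int method(List<Double> list) { return 1; }
    }

    // Returns the erased (run time) type of the single parameter of "method".
    static Class<?> erasedParameterType(Class<?> owner) {
        for (Method m : owner.getDeclaredMethods()) {
            if (m.getName().equals("method")) {
                return m.getParameterTypes()[0];
            }
        }
        throw new IllegalArgumentException("no method named 'method' in " + owner);
    }

    public static void main(String[] args) {
        // Both lines report java.util.List, with no trace of Integer or Double.
        System.out.println(erasedParameterType(TakesIntegers.class));
        System.out.println(erasedParameterType(TakesDoubles.class));
    }
}
```

Both parameter types come back as plain java.util.List, which is why the two methods collide once they sit in the same class with the same return type.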
Here's something else that doesn't work (compared with the last example, the second method now returns double and takes a List< Integer >):
import java.util.*;

public class Test {
    // Does not compile: the two methods differ only in return type.
    public int method( List< Integer > list ) {
        return 0;
    }
    public double method( List< Integer > list ) {
        return 1;
    }
}
In this case, the error is actually different: "method is already defined". This is where things start to get weird. Return types are not part of method signatures, so that part should be able to be ignored. Even before type erasure, these methods still have the same type signatures, so this shouldn't work. This is why "method is already defined" is a more appropriate error than "have the same erasure".
But here's something that does work:
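import java.util.*;

public class Test {
    // Same erased parameter type (List) in both methods, but different return types.
    public int method( List< Integer > list ) {
        return 0;
    }
    public double method( List< Double > list ) {
        return 1;
    }
}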
No errors, no warnings, and some simple testing shows that it works exactly like it looks like it should. The return types aren't part of method signatures, so that can be ignored. Ignoring return type, this is the exact same method as the first example, which didn't compile for type erasure reasons. But this inexplicably works.
It turns out that method signatures in bytecode actually do include return types, and that this information can be utilized for overloading, despite the fact that the more basic tutorial-style documentation makes it seem like this doesn't happen. With this in mind, going through all the examples, the bytecode signatures are as follows (factoring in type erasure):
- Example 1: List -> int; List -> int
- Example 2: List -> int; List -> double
- Example 3: List -> int; List -> double
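For reference, the "List -> int" shorthand in the list above corresponds to what the JVM calls a method descriptor. A small sketch, using the standard java.lang.invoke.MethodType class to print the descriptors (the class name DescriptorDemo and the helper descriptor are just names I picked):

```java
import java.lang.invoke.MethodType;
import java.util.List;

public class DescriptorDemo {
    // Hypothetical helper: builds the JVM descriptor of a method that takes
    // an (erased) List and returns the given type.
    public static String descriptor( Class< ? > returnType ) {
        return MethodType.methodType( returnType, List.class )
                         .toMethodDescriptorString();
    }

    public static void main( String[] args ) {
        System.out.println( descriptor( int.class ) );    // (Ljava/util/List;)I
        System.out.println( descriptor( double.class ) ); // (Ljava/util/List;)D
    }
}
```

The trailing I and D are the return types, so the two descriptors differ even though the parameter part is the same.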
So the bytecode contains the correct methods. But how can the compiler know which method is referred to when one calls it? One could infer this based on how the return type is used, as in double myDouble = method( List< Double > ). However, we could have just as easily called it like method( List< Double > ), discarding the return value and leaving the compiler without any additional typing information. So if we can't choose the correct method based on the return type, then how can we choose the method if type erasure makes everything an Object?
The answer is that it's only an Object at run time, not compile time. Since overloading occurs at compile time, this information is available to the compiler. To my knowledge, however, there is no way to access this information at compile time through code, despite the fact that it is available somewhere.
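The "overloading occurs at compile time" part can be seen without any generics at all. The following is only an analogy, assuming a made-up class StaticDispatchDemo with two describe overloads: the overload is picked from the static type of the argument, not from whatever the object happens to be at run time.

```java
public class StaticDispatchDemo {
    // Two hypothetical overloads that differ only in parameter type.
    public static String describe( Object o ) {
        return "Object overload";
    }
    public static String describe( String s ) {
        return "String overload";
    }

    public static void main( String[] args ) {
        String asString = "hello";
        Object asObject = asString; // same object, different static type
        System.out.println( describe( asString ) ); // String overload
        System.out.println( describe( asObject ) ); // Object overload
    }
}
```

The same mechanism lets the compiler tell a List< Integer > argument from a List< Double > argument, since the static types still carry the type arguments.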
So what's the problem? Why can't I overload with different generic type parameters if the compiler can make this judgement call? The answer is you can, as long as the return types differ. Again, the problem isn't in the compiler, it's in the actual bytecode/JVM. The bytecode doesn't store generic types in the signature it uses to tell methods apart, so there is no way to differentiate methods of the same name that differ only in generic type parameters. But since the bytecode also stores the return type in that signature (unlike the typical language-neutral definition of "method signature"), the JVM can use the return type to separate the otherwise identical methods.
I've tested this with both Java and Scala, both of which show this behavior, and both of which run on the JVM. If all of this is correct, then this behavior should be true of all languages that run on the JVM, assuming they allow for method overloading and static typing. (If they don't, then this doesn't apply to you anyway.)
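Compilers may differ on whether they accept the overloads above, so here is a sketch that sidesteps them and shows the same JVM-level fact another way. When a method overrides another with a narrower (covariant) return type, javac emits a synthetic bridge method, leaving the class with two methods of the same name and parameters that differ only in return type. The names BridgeDemo, Source, StringSource, and getMethods are all hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class BridgeDemo {
    interface Source {
        Object get();
    }

    static class StringSource implements Source {
        @Override
        public String get() { // covariant return type
            return "hi";
        }
    }

    // Collects a short description of every method named get that is
    // declared in the StringSource class file.
    public static List< String > getMethods() {
        List< String > found = new ArrayList< String >();
        for ( Method m : StringSource.class.getDeclaredMethods() ) {
            if ( m.getName().equals( "get" ) ) {
                found.add( m.getReturnType().getSimpleName() + " get()"
                           + ( m.isBridge() ? " [bridge]" : "" ) );
            }
        }
        return found;
    }

    public static void main( String[] args ) {
        // Two entries (order not guaranteed):
        // "String get()" and "Object get() [bridge]"
        for ( String description : getMethods() ) {
            System.out.println( description );
        }
    }
}
```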
Long story short: return types are part of the method signature in the JVM bytecode.
Saturday, April 2, 2011
Anonymous Functions and Type Erasure in Scala
Before I get into anything, I would like to make it clear that I'm using Scala 2.7.7.final, not the newest (and significantly different) 2.8.1.final. I'm using it for my thesis, and I learned the hard way long ago that you should try to avoid changing your toolset once you're underway.
Recently, I've started to really get into the use of anonymous functions. I'm writing my own higher-order functions, and I like the simplicity of everything. They can be used as alternatives to a number of design patterns, including Template Method, and tend to have a lot less superfluous syntax associated with them.
However, today I encountered a bit of an issue. I wrote two higher-order functions in the same class whose type signatures differed only in the types used for the anonymous functions. I was greeted with a double definition error message. What?
So I delved a bit into the implementation of anonymous functions. The Scala library defines a series of traits called Function0 through Function22. These appear to refer to anonymous functions, where the number refers to the number of parameters the function takes.
First things first: let's see what happens when you have more than 22 parameters, because that sounds like it'd be entertaining. So...
scala> ( ( a: Int, b: Int, c: Int, d: Int, e: Int, f: Int, g: Int, h: Int, i: Int, j: Int, k: Int, l: Int, m: Int, n: Int, o: Int, p: Int, q: Int, r: Int, s: Int, t: Int, u: Int, v: Int, w: Int ) => "hahaha" )
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 23
at scala.tools.nsc.typechecker.Typers$Typer.decompose$2(Typers.scala:1502)
at scala.tools.nsc.typechecker.Typers$Typer.typedFunction(Typers.scala:1504)
at scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:3153)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3358)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3406)
at scala.tools.nsc.typechecker.Typers$Typer.computeType(Typers.scala:3457)
at scala.tools.nsc.typechecker.Namers$Namer.typeSig(Namers.scala:859)
at scala.tools.nsc.typechecker.Namers$Namer$$anonfun$typeCompleter$1.apply(Namers.scala:415)
at scala.tools.nsc.typechecker.Namers$Namer$$anonfun$typeCompleter$1.apply(Namers.scala:413)
at scala.tools.nsc.typechecker.Namers$$anon$1.complete(Namers.scala:982)
at scala.tools.nsc.symtab.Symbols$Symbol.info(Symbols.scala:555)
at scala.tools.nsc.symtab.Symbols$Symbol.initialize(Symbols.scala:669)
at scala.tools.nsc.typechecker.Typers$Typer.addGetterSetter(Typers.scala:1139)
at scala.tools.nsc.typechecker.Typers$Typer$$anonfun$10.apply(Typers.scala:1219)
at scala.tools.nsc.typechecker.Typers$Typer$$anonfun$10.apply(Typers.scala:1219)
at scala.List.flatMap(List.scala:1132)
at scala.tools.nsc.typechecker.Typers$Typer.typedTemplate(Typers.scala:1219)
at scala.tools.nsc.typechecker.Typers$Typer.typedModuleDef(Typers.scala:1114)
at scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:3091)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3358)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3395)
at scala.tools.nsc.typechecker.Typers$Typer.typedStat$1(Typers.scala:1598)
at scala.tools.nsc.typechecker.Typers$Typer$$anonfun$19.apply(Typers.scala:1643)
at scala.tools.nsc.typechecker.Typers$Typer$$anonfun$19.apply(Typers.scala:1643)
at scala.List$.loop$1(List.scala:300)
at scala.List$.mapConserve(List.scala:317)
at scala.tools.nsc.typechecker.Typers$Typer.typedStats(Typers.scala:1643)
at scala.tools.nsc.typechecker.Typers$Typer.typedTemplate(Typers.scala:1221)
at scala.tools.nsc.typechecker.Typers$Typer.typedModuleDef(Typers.scala:1114)
at scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:3091)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3358)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3395)
at scala.tools.nsc.typechecker.Typers$Typer.typedStat$1(Typers.scala:1598)
at scala.tools.nsc.typechecker.Typers$Typer$$anonfun$19.apply(Typers.scala:1643)
at scala.tools.nsc.typechecker.Typers$Typer$$anonfun$19.apply(Typers.scala:1643)
at scala.List$.loop$1(List.scala:300)
at scala.List$.mapConserve(List.scala:317)
at scala.tools.nsc.typechecker.Typers$Typer.typedStats(Typers.scala:1643)
at scala.tools.nsc.typechecker.Typers$Typer.typedTemplate(Typers.scala:1221)
at scala.tools.nsc.typechecker.Typers$Typer.typedModuleDef(Typers.scala:1114)
at scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:3091)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3358)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3395)
at scala.tools.nsc.typechecker.Typers$Typer.typedStat$1(Typers.scala:1598)
at scala.tools.nsc.typechecker.Typers$Typer$$anonfun$19.apply(Typers.scala:1643)
at scala.tools.nsc.typechecker.Typers$Typer$$anonfun$19.apply(Typers.scala:1643)
at scala.List$.loop$1(List.scala:300)
at scala.List$.mapConserve(List.scala:317)
at scala.tools.nsc.typechecker.Typers$Typer.typedStats(Typers.scala:1643)
at scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:3084)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3358)
at scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:3395)
at scala.tools.nsc.typechecker.Analyzer$typerFactory$$anon$2.apply(Analyzer.scala:41)
at scala.tools.nsc.Global$GlobalPhase.applyPhase(Global.scala:267)
at scala.tools.nsc.Global$GlobalPhase$$anonfun$run$1.apply(Global.scala:246)
at scala.tools.nsc.Global$GlobalPhase$$anonfun$run$1.apply(Global.scala:246)
at scala.Iterator$class.foreach(Iterator.scala:414)
at scala.collection.mutable.ListBuffer$$anon$1.foreach(ListBuffer.scala:266)
at scala.tools.nsc.Global$GlobalPhase.run(Global.scala:246)
at scala.tools.nsc.Global$Run.compileSources(Global.scala:574)
at scala.tools.nsc.Interpreter$Request.compile(Interpreter.scala:820)
at scala.tools.nsc.Interpreter.interpret(Interpreter.scala:505)
at scala.tools.nsc.Interpreter.interpret(Interpreter.scala:494)
at scala.tools.nsc.InterpreterLoop.interpretStartingWith(InterpreterLoop.scala:242)
at scala.tools.nsc.InterpreterLoop.command(InterpreterLoop.scala:230)
at scala.tools.nsc.InterpreterLoop.repl(InterpreterLoop.scala:142)
at scala.tools.nsc.InterpreterLoop.main(InterpreterLoop.scala:298)
at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:141)
at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
This crashed the REPL, sending me back to the command line. Yup, entertaining. (If you need more than 22 parameters, or any number of parameters even close to 22 for that matter, you shouldn't be allowed within 500 feet of a computer.)
Back to actual work. I constructed the simplest example that showed the same error:
class MyClass {
    def method( fun: Int => Int ) {}
    def method( fun: Double => Double ) {}
}
...which fails at the REPL with...
<console>:6: error: double definition:
method method:((Double) => Double)Unit and
method method:((Int) => Int)Unit at line 5
have same type after erasure: (Function1)Unit
def method( fun: Double => Double ) {}
^
Type erasure. My old nemesis. These traits do their magic with generic types. The syntax Int => Int is really just syntactic sugar for Function1[Int,Int], which explains the type erasure error. Ho hum.
But wait. Here's something that does work:
class MyClass {
    def method( fun: Int => Int ) = fun( 0 )
    def method( fun: Double => Double ) = fun( 1.0 )
}
...what? The only change from the nonfunctional version is that the return types have changed from Unit to Int and Double, respectively. Although the parameters are specified with generics, the return types (based on those parameters) are not. However, this still shouldn't matter, as the return type doesn't get factored into the signature. After all, one can simply not have anything grab the returned value, in which case there is no way to extract return type information.
I probed into this a little further. Calling getDeclaredMethods on MyClass's class gets two relevant methods with the following signatures:
public double MyClass.method(scala.Function1)
public int MyClass.method(scala.Function1)
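For anyone who wants to poke at this kind of reflection output from plain Java, here is a rough sketch. It assumes java.util.function.Function as a stand-in for scala.Function1, and the names ReflectDemo, erasedView, and genericView are made up. Printing a Method gives the erased view, like the lines above, while toGenericString shows the type arguments that the class file still records on the side.

```java
import java.lang.reflect.Method;
import java.util.function.Function;

public class ReflectDemo {
    // Stand-in for the Scala method: takes a function from Integer to Integer.
    public int method( Function< Integer, Integer > fun ) {
        return fun.apply( 0 );
    }

    private static Method target() {
        try {
            return ReflectDemo.class.getDeclaredMethod( "method", Function.class );
        } catch ( NoSuchMethodException e ) {
            throw new IllegalStateException( e );
        }
    }

    // Erased view: the parameter shows up as a bare Function.
    public static String erasedView() {
        return target().toString();
    }

    // Generic view: includes the declared type arguments.
    public static String genericView() {
        return target().toGenericString();
    }

    public static void main( String[] args ) {
        System.out.println( erasedView() );
        System.out.println( genericView() );
    }
}
```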
Ok, now I'm really confused. It doesn't appear that there is any additional magic that Scala's performing behind the scenes. So the next move was to get the Scala compiler out of the picture, and do this in Java with:
import scala.*;
public class Test {
    public int method( Function1< Integer, Integer > fun ) {
        return 0;
    }
    public double method( Function1< Double, Double > fun ) {
        return 1.0;
    }
}
This compiles. What? What? So the next move was to get Scala out entirely. Two classes were needed:
public class Pair< T, U > {
    public T first;
    public U second;
}
public class Test2 {
    public int method( Pair< Integer, Integer > pair ) {
        return 0;
    }
    public double method( Pair< Double, Double > pair ) {
        return 1.0;
    }
}
This compiles. However, if you change the return types to be the same, it doesn't. I'm actually at a loss at this point. I went back and read a number of basic materials from Oracle, including materials on type erasure and method overloading (note that the method overloading information includes type signature information). Based on the documentation, this shouldn't work. If anyone has any insights, that would be great. I posted the reason why in my next post. Tricky, tricky!