Category Archives: Research

Papers and Inauthenticity

Just once, I’d like to write a paper that communicated its results clearly, without all sorts of excess fluff. Literally every time I write a paper, the nonsense I need to put into it thanks to the way the system is arranged both depresses and infuriates me. Surrounding a pure idea with meaningless words is like using mud to frost a delicious cake.

But then it would be one or two pages long, would have very few references (why people think it’s good practice to acknowledge papers you’ve never even read or used, I’ll never know), and would be immediately rejected by peer reviewers. (Thanks, oh faithful Gatekeepers of Truth, for making it impossible to communicate ideas effectively.) The skill most people call technical writing seems to consist primarily of convincing people that what you’re writing is not actually filler. I think I’ve become quite good at it, because I’m getting a lot of things published by this point, but it’s not that the ideas are any different from before – I just frosted them with more mud.

Getting a Ph.D. is exhibiting mastery of this skill: writing 150+ pages of material to present the same one page of results that I’m currently devoting 10 pages to this time around. I actually did that first, but copying and pasting from my dissertation is considered “autoplagiarism” (plagiarism = taking credit for the ideas of another; therefore, autoplagiarism = taking credit for your own ideas), so I can’t even reuse the material I’ve previously written on the same topic.

The long list of publications people show off in order to cudgel their way into academic positions tends to be composed of at most 3 or 4 ideas, rehashed in different ways with minor variations on the details, but very little true originality. And it works, because no one has the time to read the majority of an applicant’s papers! The best anyone has managed to do is include an “acceptance ratio”, a measure of how selective the conferences and journals submitted to were, as if peer endorsement guaranteed that the idea within is good.

It’s all so fake, just like society in general. And when true authenticity occasionally rears its head (as it must, because civilization still exists), it’s immediately bludgeoned and violently suppressed.

I must say, an agrarian lifestyle seems more appealing with each passing day. Nature alone rewards honesty. I don’t know how much more of this I can take.

Aiming too low.

I just read about that “Stand Up to Cancer” initiative in Time magazine and realized that, while it introduces some things that people should have been doing from the beginning, other aspects are essentially more of the same in another guise.

The “dream team” concept is fine, although it again raises the question of how accurately you can select people, and of the goals as well as the abilities of the people you’re selecting. Putting these people together is phenomenal and could produce some exciting collaborations.

Going for cancers such as GBM and pancreatic is also a great idea, as these have been neglected and currently have very poor survival rates. These forms of cancer are essentially death sentences today, and treatments that can raise survival rates are long overdue.

The problem is one of metrics. Everyone wants only the best scientists to work on their projects. But how are these scientists going to be selected? Publication counts? Approval of their peers? “h-index”?

Whatever it is, it won’t be directly on the strength of their ideas. This is a pity, because the existing methods don’t work on the cancers you’re targeting (and they don’t work too well in general). Even the notion of a survival rate is absurd. Do people speak about survival rates for influenza? For the common cold? Even for the black plague these days? No – because these diseases are either innately harmless or have been rendered harmless. It is highly unusual for people to die of them. Cancer isn’t like that – it’s innately harmful, and only very few cancers have actually been rendered harmless by medicine.

That brings me to a bigger mistake – one that Stand Up to Cancer makes in the same way that existing research programs do. Research scientists don’t get funding unless they have results. SUtC scientists won’t get funding unless they have a treatment. You’re calling it something else, but the bottom line is: you want to see an immediate return on your investment.

Cancer is a big problem, like energy independence. There is no immediate return on the investment, and if you try to make one, you’ll end up with “publish or perish” in a new form – tons of simple incremental advances which do nothing to revolutionize the field.

And that is tied in with the third, and largest, problem with this endeavor: no one is speaking of a “cure”. You all want to “increase” survival rates, not to render the concept obsolete. If you can get the 5 year survival rate of pancreatic cancer up from 3% to 6%, you’ll call it a victory and tout how much progress you’re making.

True, the extra 3% will appreciate it. It’s worth it. But it falls short of the goal you need to aim for.

If anyone over the age of 8, even a world-renowned oncologist, were to speak of “curing cancer”, you would laugh at him. The entire scientific community would laugh at him. He wouldn’t find funding. His research endeavors would be doomed from the start.

And the bottom line is this: you have set the bar too low because you are collectively afraid of failure. You ridicule anyone who attempts to make an audacious advance, because it’s far easier to tout a string of minor successes.

But in the end, it’s that major advance that’s required to do away with this disease. And you’ll never find it if you’re averse to the very idea that a cure could exist.

One might be right under your nose, and you’d miss it. Imagine if Fleming had discovered penicillin and, instead of remarking on its properties, had shouted “preposterous!” and dumped it in the trash. (And that brings up another point: Fleming was another of what I call “near misses”, because, were it not for Chain, his work might never have attained publicity.)

So if you want earnest results, start making earnest attempts. Be committed. Be bold. Give it 100% and don’t accept anything short of 100% as an end goal.

Telomerase is a reverse transcriptase. It’s an opportunity to cause a buffer overflow!

Not being a biologist, I had assumed that telomerase was “hard-coded” with the telomere DNA sequence it writes at the end of a chromosome. This is actually not quite the case; the telomere template is carried in a sequence of RNA, called TERC, that the telomerase wraps around (making it a ribonucleoprotein).

I, probably like many others, had once thought that inactivation of telomerase would result in a cure for many different cancers. However, probably due to activation of other immortality pathways, this is not the case (although drugs that rely on this principle appear to be among the more successful treatment modalities in trials). This also appears to be one of those ideas that everyone is aware of but no one is acting on. I blame the way that science currently works for this: as I’ve mentioned before, how you express your values tangibly affects the impact you will have on reality, and if you prefer to publish a lot and have a stable job, then you will not have the time to embark on the sorts of long-range, high-risk research projects that actually make a difference.

Anyway, mere inactivation is unlikely to work. However, because TERC actually provides a template for what telomerase writes on the end of the cell’s chromosomes, inactivation is not necessary.

Here’s the fun part where I get to speculate wildly about the current state of the art because I can’t get the training that actually matters to actualize these sorts of ideas (you want your “committee of experts” and I’m the computer scientist. Fine, but the whole team suffers for the lack of synergy and vision):

Modification would do as well. If we could change what telomerase writes out to the ends of the cell’s chromosomes, we could write anything we want there – and it would be specific to telomerase-immortalized cells (few normal cells carry this immortality, but it is very common in cancer cells), which means a treatment based on this idea would have few to no side effects.

What could we code for? I’m really not qualified to answer this, but some choices that seem obvious to me are the tumor suppressors that the cancers are inactivating in the first place, such as p53. Reactivate the suppressors and you stop the tumors, without harming normal cells that produce telomerase but are already making tumor suppressors. Again, minimal to no side effects.

And that’s the idea! It’s another interdisciplinary fusion:

This is what, in computer science, we would call a “buffer overflow with arbitrary code execution”. The code in this case is DNA. The “program counter” is the position of the ribosome. The end of the buffer is the telomere. Telomerase writes code out to the end of this buffer. You can take advantage of software this way by executing whatever code you want; you should be able to do the same to cells.
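The analogy can be made concrete with a toy model. This is purely illustrative Python, not real biology: the sequences and the `telomerase_extend` function are invented for the sketch, though TTAGGG is the actual human telomere repeat. The point is that the writing machinery is generic, so swapping the template changes what gets written.

```python
# Toy model of the analogy: a "telomerase" that blindly appends whatever
# its RNA template encodes onto the end of a "chromosome". All names and
# sequences here are illustrative, not a real biological simulation.

def telomerase_extend(chromosome: str, template: str, repeats: int) -> str:
    """Append `repeats` copies of `template` to the chromosome end,
    the way telomerase extends a telomere from its TERC template."""
    return chromosome + template * repeats

normal = telomerase_extend("ATCGGA", "TTAGGG", 3)   # normal telomere repeats
# The "exploit": swap the template, and the same machinery writes
# whatever payload we choose onto the chromosome end instead.
payload = telomerase_extend("ATCGGA", "GATTACA", 3)

print(normal)   # ATCGGATTAGGGTTAGGGTTAGGG
print(payload)  # ATCGGAGATTACAGATTACAGATTACA
```

Modifying TERC plays the role of injecting the payload; the telomerase itself is left untouched, just as a buffer-overflow exploit leaves the vulnerable program's code unmodified.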

Algorithmic complexity curves can be shifted.

It is theoretically possible to construct an algorithm whose complexity growth curve is shifted along the x axis. In other words, if an O(n^2) algorithm happens to have a particular value of n at which it performs best, graphing its runtime might show part of the left side of the parabola as well as the right.
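A minimal sketch of what such a curve looks like, using a hypothetical operation count rather than a real algorithm: the cost below is quadratic in the distance from some preferred input size n0, so it is still O(n^2) asymptotically, but the parabola's minimum sits at n0 instead of at zero.

```python
# Hypothetical operation count for an algorithm "tuned" to a particular
# input size N0: cost grows quadratically in |n - N0|, so the usual
# O(n^2) parabola is shifted right along the x axis.

N0 = 64  # the input size the imaginary algorithm performs best at

def op_count(n: int) -> int:
    """Operations used on input of size n (toy model, not a real algorithm)."""
    return (n - N0) ** 2 + 10  # +10 models fixed overhead

costs = {n: op_count(n) for n in (1, 32, 64, 96, 128)}
print(costs)  # minimum at n = 64, with both sides of the parabola visible
```

Asymptotic analysis collapses the shift, since (n - N0)^2 is Theta(n^2); the effect only appears when you graph actual runtimes.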

This is a fairly unimportant point, but it’s one that traditional algorithmic theory doesn’t really touch upon.

Afraid of novelty?

My dissertation committee is actually freaked out because they believe that my work might be too novel. I find this funny, because my own opinion of the novelty is that it is minimal.

I’m having a hard time taking it seriously. Taking something that can be written about in 15 pages and blowing it up to 150… why?

(To graduate, that’s why)

More artificial intuition ideas…

A post I just made on Slashdot in the context of an article about improving computer “Go” opponents:

Intuition is something a successful AI (and a successful human Go player) will require, and while we can model it on a computer, most people haven’t thought of doing so. Most systems are either based on symbolic logic, statistics, or reinforcement learning, all of which rely on deductive A->B style rules. You can build an intelligent system on that sort of reasoning, but not ONLY on that sort of reasoning (besides, that’s not the way that humans normally think either).

I suspect that what we need is something more akin to “clustering” of concepts, in which retrieval of one concept invokes others that are nearby in “thought-space”. The system should then try to merge the clusters of different concepts it thinks of, resulting in the sort of fusion of ideas that characterizes intuition (in other words, the clusters are constantly growing). Since there is such a thing as statistical clustering, that may form a good foundation. Couple it with deductive logic and you should actually get a very powerful system.

I also suspect that some of the recent manifold learning techniques, particularly those involving kernel PCA, may play a part, as they replicate the concept of abstraction, another component of intuition, fairly well using statistics. Unfortunately, they tend to be computationally intense.

There are many steps that would need to be involved, none of them trivial, but no one said AI was easy:

1. Sense data.
2. Collect that data in a manageable form (categorize it using an ontology, maybe?)
3. Retrieve the x most recently accessed clusters pertaining to other properties of the concept you are reasoning about, as well as the cluster corresponding to the property being reasoned about itself (remembering everything is intractable, so the agent will primarily consider what it has been “mulling over” recently). For example, if we are trying to figure out whether a strawberry is a fruit, we would need to pull in clusters corresponding to “red things” and “seeded things” as well as the cluster corresponding to “fruits”.
4. Once a decision is made, grow the clusters. For example, if we decide that strawberries are fruits, we would look at other properties of strawberries and extend the “fruit” cluster to other things that have these properties. We might end up with the nonsymbolic equivalent of “all red objects with seeds are fruit” from doing that.
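Steps 3 and 4 can be sketched in a few lines. This is a deliberately crude toy, with invented concepts and hand-built property clusters standing in for the statistical clustering suggested above; the point is only the retrieve-decide-grow loop.

```python
# Toy sketch of steps 3-4: concepts are property sets, a property's
# "cluster" is the set of concepts carrying it, and a positive decision
# grows the target cluster. Purely illustrative; a real system would
# cluster statistically over learned features.

concepts = {
    "strawberry": {"red", "seeded"},
    "cherry":     {"red", "seeded"},
    "fire truck": {"red"},
}

# Step 3: build/retrieve the clusters for each property.
clusters = {"fruit": {"apple"}}
for name, props in concepts.items():
    for p in props:
        clusters.setdefault(p, set()).add(name)

def grow_cluster(decided: str, target: str) -> None:
    """Step 4: having decided `decided` belongs to `target`, extend the
    target cluster to every concept sharing all of `decided`'s properties
    (the nonsymbolic analogue of "all red seeded things are fruit")."""
    for name, props in concepts.items():
        if concepts[decided] <= props:  # shares every property
            clusters[target].add(name)

grow_cluster("strawberry", "fruit")
print(sorted(clusters["fruit"]))  # ['apple', 'cherry', 'strawberry']
```

The cherry gets pulled into the fruit cluster by property overlap, while the fire truck (red, but not seeded) stays out; with noisier data this overgeneralizes exactly the way intuition does.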

What I’ve described is an attempt to model what Jung calls “extroverted intuition” – intuition concerned with external concepts. Attempting to model introverted intuition – intuition concerned with internal models and ideas – is much harder, as it would require clustering the properties of the model itself, forming a “relation between relations” – a way that ideas are connected in the agent’s mental model.

But that’s for general AI, which I’m still not completely sure we’re ready for anyway. If you just want a stronger Go player, wait just a bit longer and it’ll be brute-forced.

This is apparently a tough medical question…

I asked a radiologist this and he didn’t know the answer, so up it goes on my blog. Perhaps someone can answer it? I have a few hypotheses that depend on the answer.

When microcalcifications tend to occur in association with unilateral cancer (in breast cancer, for example), why do they tend to occur bilaterally in the same region of tissue?

I’m wondering whether this might indicate an underlying genetic or environmental factor that affects both organs and predisposes to, but is not sufficient for, carcinogenesis (if it were sufficient on its own, I’d expect cancers associated with bilateral calcifications to be bilateral themselves). Another possibility is that the presence of cancer itself causes the calcifications, but the only thing I can think of that would cause them bilaterally is some sort of regional immune/inflammatory response to the cancer.

Two steps remain

I passed my first preliminary exam today. Only the second prelim and my dissertation defense lie between me and completion.

Once I’m free to stop commuting to the lab every day (just another couple of weeks…), I am going to stop fooling around and I am going to blast through the rest of my dissertation at a rate that makes my previous 10 pages/week pace look sluggish by comparison.

Prelim II by January, defense by May. That’s the goal. If I need more time, I’ll only give it until August. As long as I’m in grad school, I’m prevented from doing the research that really matters, so I need to hold fast to my three-year ultimatum.

Can tensors have fractal ranks?

This question has been brewing in my mind for quite some time, but I don’t really know enough about how fractal/Hausdorff dimension is computed to answer it. If they can, it might be possible to obtain good compression for certain types of datasets using this property.
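For background on the question, here is how box-counting dimension (a common stand-in for Hausdorff dimension) is computed, using the standard middle-thirds Cantor set as the example. The code is a textbook sketch, not anything specific to tensor rank.

```python
import math

# Box-counting sketch for the middle-thirds Cantor set: at depth k the
# set is covered by 2^k intervals of width 3^-k, so the estimated
# dimension log(N) / log(1/eps) tends to log 2 / log 3 ~= 0.6309.

def cantor_intervals(depth: int):
    """Return the covering intervals of the Cantor set at a given depth."""
    intervals = [(0.0, 1.0)]
    for _ in range(depth):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3.0
            nxt.append((a, a + third))   # keep left third
            nxt.append((b - third, b))   # keep right third
        intervals = nxt
    return intervals

def box_dimension(depth: int) -> float:
    n_boxes = len(cantor_intervals(depth))   # 2**depth covering boxes
    eps = 3.0 ** -depth                      # each of width 3**-depth
    return math.log(n_boxes) / math.log(1.0 / eps)

print(round(box_dimension(10), 4))  # 0.6309
```

Whether an analogous count of "boxes" over a tensor's structure yields a sensible non-integer rank is exactly the open part of the question.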