Category Archives: Ideas

Using Physical Properties and Forces to Cluster?

It seems plausible to create clustering algorithms based on gravity and the Coulomb force, with masses or charges corresponding to specific point weights. A “cluster” then becomes the resulting “solar system”. For example, if we represented all objects in the solar system with their masses and distances, the theoretical model would label them as one cluster (“Sol”).
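A minimal sketch of one way this might look, assuming that a point "belongs" to whichever other body pulls on it hardest (mass divided by squared distance, with the gravitational constant dropped) and that a cluster is a connected group of such links. The function name `gravity_clusters` and the linking rule are illustrative assumptions, not an established algorithm.

```python
import math


def gravity_clusters(points, masses):
    """Link each point to the body pulling on it hardest; return cluster labels."""
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        best_j, best_pull = None, 0.0
        for j in range(n):
            if j == i:
                continue
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            # Newtonian pull per unit mass of i (G dropped): m_j / d^2.
            pull = masses[j] / d2 if d2 > 0 else math.inf
            if pull > best_pull:
                best_j, best_pull = j, pull
        if best_j is not None:
            parent[find(i)] = find(best_j)  # merge the two clusters

    return [find(i) for i in range(n)]


# Two well-separated groups of unit-mass points.
points = [(0, 0), (1, 0), (0, 1), (50, 50), (51, 50), (50, 51)]
labels = gravity_clusters(points, [1.0] * len(points))
```

Points that share a label form one "solar system". Note that under this rule a sufficiently heavy body can capture distant points, which may or may not be the desired behavior.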

Another idea I’ve been toying with is to use the concept of physical momentum with gradient descent (I don’t believe this is the same thing as the existing technique called “gradient descent with momentum”), such that an “energy counter” is kept that increments when the gradient points downward (proportional to its magnitude) and decrements when it points upward. This will cause the optimization to “roll” down slopes, completely clearing small minima, which tend to be pathological. The result is wherever the optimization/rolling comes to a halt. (Never mind, this is in fact the same thing, or almost so.)
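For reference, here is a small sketch of the standard heavy-ball form of gradient descent with momentum, which is the existing technique the idea turned out to match. The velocity plays the role of the "energy counter"; the function name, the one-dimensional setting, and the default values of `lr` and `beta` are assumptions chosen for illustration.

```python
def momentum_descent(grad, x0, lr=0.1, beta=0.9, steps=500):
    """Heavy-ball gradient descent in one dimension."""
    x, velocity = x0, 0.0
    for _ in range(steps):
        # Velocity builds up while the slope keeps pushing the same way and
        # bleeds off (factor beta) or reverses when the slope turns against it.
        velocity = beta * velocity - lr * grad(x)
        x += velocity
    return x


# f(x) = x^2 has gradient 2x and a single minimum at x = 0.
x_min = momentum_descent(lambda x: 2 * x, x0=5.0)
```

Whether the iterate actually rolls past a given shallow minimum depends on `lr`, `beta`, and the shape of the surface, so that behavior is not guaranteed.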

Of course, I still think estimating the minimum of the MSE curve from what is already known of it, then moving there to check, would be much faster and possibly more accurate.
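One way to read this is as parabolic interpolation: fit a parabola through three known points on the error curve and jump to its vertex. The sketch below assumes a one-parameter model y = theta * x, for which the MSE is exactly quadratic in theta, so a single jump lands on the minimum. The helper names and the toy data are assumptions for illustration.

```python
def parabola_vertex(a, fa, b, fb, c, fc):
    """x-coordinate of the vertex of the parabola through three points."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    return b - 0.5 * num / den  # den is zero if the three points are collinear


def mse(theta, xs, ys):
    """Mean squared error of the one-parameter model y = theta * x."""
    return sum((theta * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
probes = [0.0, 1.0, 5.0]  # parameter values where the MSE is already known
errors = [mse(t, xs, ys) for t in probes]
theta_guess = parabola_vertex(probes[0], errors[0], probes[1], errors[1],
                              probes[2], errors[2])
```

For error curves that are not quadratic, the jump is only an estimate and would have to be checked and repeated.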

Finally, another idea is to deform a surface to minimize the local MSE of its k-nearest neighbors at each of several regions. I’m not sure whether this replicates the behavior of an SVM with a kernel, but it should probably operate much more quickly than the cubic-time learning of an SVM hyperplane due to the local nature of the constraints.

Panidealist Subjectivity

Over the vacation, I’ve had some thoughts which have just now bubbled to the surface:

When you call The Fountainhead or Atlas Shrugged a great book, please remember that the former was rejected by 12 different publishers before it saw the light of day.

When you use technologies developed around neural nets, think back on how much more developed the field (and consequent applications) would be if scientists did not neglect NNs for a decade because they misunderstood something Marvin Minsky said in his book.

When you speak of elliptical planetary orbits or heliocentrism, remember that your worldview would likely have had you excommunicated and/or put to death 400 years ago. In fact, you can regard all of your thoughts that conflict with anything Aristotle said – and a lot of what Aristotle said was wrong – in the same manner.

When you use technologies derived from genetics – when you speak of heritable traits – when you breed cats – when you are screened for diseases you have a high risk of getting because of family history – recall that evolution is still a contested idea that many would like to suppress.

The computer you view this on? Perhaps five of them would exist in the whole world if pundits had their way. The Internet? It would just be for universities and governments. Jesus himself was killed for expressing an unconventional idea.

So stop for a moment and think: do you take part in this collective? How many ideas have you suppressed? How can you be sure that an idea is “bad”? If you cannot, you lack the knowledge to make a rational decision.

Kant and the Nature of Philosophy

Kant is an extremely underrated philosopher. He had some absolutely great ideas. Some of them are similar to principles of my own philosophy, such as his “categorical imperative” and stance as a nondeterminist, while others are similar to principles of Panidealism (which I guess is another facet of my own philosophy), such as the absolute inability to know what the nature of free will is and thus the lack of ability to make an objective moral judgment based on this inability. If you replace morals and free will with a general concept of an “idea”, you have a central Panidealist tenet. Actually, a quick review of his main works indicates that Kant may have hit on this in his “Critique of Pure Reason” (which, in my defense, took him 10 years to write) – the resulting philosophy is called “transcendental idealism”. However, it’s only one Panidealist principle, although an important one, and so my philosophy chugs onward.

Even where his ideas differ from mine, they remain intriguing.

So far I’ve been told my developing philosophy has similarities with Plato, Aristotle, Kant, Berkeley, and Russell, and, aside from the Republic, I’ve never studied any of their philosophy. I don’t study philosophy as a subject, as I believe that it’s foolish to let your own philosophy be influenced by the thoughts of others. I simply think, and the philosophy of others just… emerges.

Even if it bears similarities with other philosophers, my philosophy remains new at least in how I combine these principles. The best way to forge ahead in philosophy is to simply ignore everything that came before and think. Maybe you’ll reinvent many things. Maybe you’ll say the same thing in different ways. But somewhere in there, there will be novelty.

Neuroplasticity and tool-using behavior

One highly likely reason for neuroplasticity is to compensate for damage. However, the adaptation of the cortex in primates to accommodate new “limbs” connected via a brain-computer interface is also very interesting, and leads me to believe that tool-using behaviors require a certain amount of such plasticity. After all, there’s no good evolutionary reason for the cortex to adapt to new limbs after a certain age… animals don’t grow new limbs. I suppose one important question that this raises is: “does the same degree of neuroplasticity occur in animals that are less prone to tool use?” Hominids, anyway, are pretty good at it, and it wouldn’t surprise me if primates are too. A handful of other animals, such as finches, ravens, and dolphins might also qualify. But that leaves a large number of animals who aren’t known to be prone to tool use. Would they exhibit the same response?

Simplifying the closed form of linear regression

Here’s one that must already exist:

Linear regression is given by the closed matrix form:

$\theta = (X^T X)^{-1} X^T Y$.

We have a rule we can apply here: $(AB)^{-1} = B^{-1} A^{-1}$
Which gives us: $\theta = X^{-1} (X^T)^{-1} X^T Y$.

But the transpose and its inverse cancel, yielding the identity matrix when multiplied.
This leaves us with:

$\theta = X^{-1} Y$.

Now, surely there must be some reason that it’s not taught this way. Is it because this requires X to be a square matrix? Can we use a pseudoinverse instead to solve this?

Update: Indeed we can. Now I understand why it’s presented that way; the pseudoinverse is given by $(X^T X)^{-1} X^T$ (when the columns of X are linearly independent)! Therefore, we can also just say the optimal parameters are given by $X^+ Y$, where $X^+$ is the pseudoinverse.
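A small worked sketch of the pseudoinverse on a non-square X, written in plain Python so nothing is hidden. It assumes X has exactly two linearly independent columns (so that X^T X is an invertible 2x2 matrix); the helper names and the toy data are illustrative assumptions.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def pseudoinverse(x):
    """(X^T X)^{-1} X^T for a tall matrix X with two independent columns."""
    xt = transpose(x)
    return matmul(inverse_2x2(matmul(xt, x)), xt)


# Three samples, two parameters: X is 3x2, so X itself has no inverse.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]  # column of ones, then the feature
Y = [[1.0], [3.0], [5.0]]                 # generated by y = 1 + 2x
theta = matmul(pseudoinverse(X), Y)       # intercept and slope
```

Because the data lie exactly on y = 1 + 2x, the recovered parameters should match the intercept and slope up to floating-point error. When X is square and invertible, the same expression reduces to the plain inverse.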

“Reinvention is talent crying out for background”, I suppose.

Learning anything at any age

I’ve held these beliefs for a while, but kept them fairly private. Since a major theme of mine during the last year and a half has been the necessity of an appropriate educational system for nurturing great thinkers, I think I’ll state them in the open:

We really must, as a society, move beyond the “no child left behind” mentality. We cannot afford to slow classes down more and more in order to cater to the slowest learners. It does a great disservice to the normal children and a far greater one to accelerated learners. The manifestations of this are pretty plain: boredom, lack of focus on work (but not a general lack of focus), preference for self-devised side projects, frustration, detachment, etc. If these abound in a classroom, chances are the work is too slow.

The worst culprit, however, is not No Child Left Behind. It’s the concept of a grade. Even at young ages, mental ability is not uniform. It is absolutely unfair to group all children of a certain age into a single grade, learning all the same materials, and then to keep them there for a fixed amount of time. Sorry, but if a student learns multiplication halfway through 1st grade, for instance, he really should not need to keep learning it for the rest of the academic year – move the work up to something more appropriate for the student’s background.

The rest of this will concern math, because it’s the area I’ve tutored most extensively. However, it can be applied to any field in a similar manner, taking appropriate safety precautions in fields such as chemistry, of course. (“You know how to use a Bunsen burner, now let’s work with some HCl!” is probably not a good idea.)

From my own experience as a student and a tutor, I bet we’ll see students learning algebra in 3rd or 4th grade in such a system. I started learning it at 8, but I had no help in the matter (as usual; I’ve always had to do everything on my own), so I bet we could teach it even earlier, at least to the mathematically gifted. Taught properly, rules from arithmetic supply the intuition – things like multiplication and division being inverse operations, associativity, and even binomial multiplication (FOIL) on simple things that the student can verify conventionally, including showing how it emerges from the distributive property, such as (5 + 1) * (4 + 3) = 5(4 + 3) + 1(4 + 3) = 5*4 + 5*3 + 1*4 + 1*3 = 42. Anyone who knows simple addition and multiplication can of course verify this simple example by adding 5 and 1, 4 and 3, and then multiplying the resulting 6 and 7, which allows the student to verify for himself that the identity does indeed work on this example. After this is done, generalize by moving onto variables.
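The same check can be written out as a few lines of code, for students who are already comfortable with a computer. The function name `foil` is an assumption for illustration.

```python
def foil(a, b, c, d):
    """Expand (a + b) * (c + d) into its four products: first, outer, inner, last."""
    return [a * c, a * d, b * c, b * d]


terms = foil(5, 1, 4, 3)             # the four products from the example
expanded = sum(terms)                # adding them up, the long way
direct = (5 + 1) * (4 + 3)           # the conventional way: 6 * 7
```

Both routes give 42, and trying other numbers makes the move to variables feel less like a leap.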

Once basic algebra is mastered (and it doesn’t need to take years), introduce the basics of calculus. If you’re feeling really adventurous, it’s probably an ideal time to introduce some abstract algebra as well – right after students finish generalizing numbers to variables, they’ll be in the right mindset to generalize algebra itself – to get rid of the whole “number” thing and just talk about systems.

Logarithms and things are all special cases in calculus and can be safely ignored until the students learn about them (their “special” properties really arise from the definition of the operators anyway). Limits are easy to teach – use a number line and terminology like “gets closer and closer”. And DO NOT tell me you can’t teach someone d/dx(x^n) = n x^(n-1) as soon as they know algebra, because I won’t buy it. If they can learn the quadratic formula, they can certainly learn how to differentiate a polynomial.
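To show how mechanical the power rule is, here is a short sketch that differentiates a polynomial stored as a list of coefficients. The representation (index k holds the coefficient of x^k) and the function name are assumptions for illustration.

```python
def differentiate(coeffs):
    """Apply d/dx(x^n) = n x^(n-1) term by term.

    coeffs[k] is the coefficient of x^k, so [1, 2, 3] means 1 + 2x + 3x^2.
    """
    derived = [k * c for k, c in enumerate(coeffs)][1:]  # drop the constant term
    return derived or [0]  # the derivative of a constant is 0


# 1 + 2x + 3x^2 differentiates to 2 + 6x.
slope = differentiate([1, 2, 3])
```

Each term's exponent comes down as a multiplier and then drops by one, which is all the rule says.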

Integrals are taught pretty well as-is (Riemann sums and the Fundamental Theorem of Calculus give good intuitive foundations for the concept), but again, far too late. It should be 7th / 8th grade sort of stuff at the latest.
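As a sketch of the intuition, the area under x^2 from 0 to 3 can be computed two ways: by adding up thin rectangles, and by evaluating an antiderivative at the endpoints. The midpoint variant of the Riemann sum and the function names are choices made here for illustration.

```python
def riemann_sum(f, a, b, n=1000):
    """Midpoint Riemann sum of f over [a, b] with n equal-width rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def antiderivative(x):
    """An antiderivative of x^2, found by running the power rule backwards."""
    return x ** 3 / 3


area_by_sum = riemann_sum(lambda x: x ** 2, 0.0, 3.0)
area_by_ftc = antiderivative(3.0) - antiderivative(0.0)
```

The two numbers agree more and more closely as `n` grows, which is the Fundamental Theorem of Calculus in miniature.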

The idea is to teach students enough that they can learn the rest, then move on and let them learn the rest (with help, if necessary). Exactly how much is “enough to learn the rest” varies from student to student and must be recognized on an individual basis.

Forcing homework upon them isn’t going to help them learn the rest, either. They should be encouraged, but not coerced. Students start out wanting to learn – in my own experience, the younger my protégés, the more enthusiastic they were about learning science, engineering, and/or math. Associating boring and repetitive work with such subjects for years eventually turns most people off them (or worse, makes them think they can’t do it when they really can). The high school and college students tended to be the least enthusiastic, because they had the choice of (a) rote study or (b) an active social life. Assignments force them to make that choice while providing very little benefit. Einstein said that “it is a miracle that curiosity survives formal education”, but, scientist that he was, he never saw the casualties.

Ultimately, it is society that suffers.

Yes, I’m aware that Montessori came up with a similar idea, but I’ll stop saying it when I see people acting on it. It doesn’t matter whether the idea already exists if people don’t do anything with it. Further derivation just serves to underscore the need for an implementation.

Why AI?

I just realized that once we get the whole genetic engineering thing down, we have no real need for AI anymore. We could create natural intelligence just as easily.

In fact, once we get working with the source code (DNA) down, biological systems can be made to behave very much like computers.

Man, the world is going to be a wacky place in the future.

Where's my "very large prize"?

Here’s another one of mine.

Digg Style Voting on Search Results

Sometimes I feel so much like the Roark to society’s Keating that it’s uncanny. Maybe they came up with this independently, but given that it was one of the things I mentioned during my interview, it’s doubtful. It’s nice to see my ideas implemented, but it’s not so nice to constantly have them ripped from me without them returning anything to their creator. It’s something I’ve had to deal with for most of my life… the only consolation is that it cannot last; I only need to succeed once for people to start noticing my ideas.

Can Data Mining be Unified?

A lot of the concepts in data mining / machine learning seem to share commonalities that suggest a unification is possible. For example, SVD is related to PCA is related to K-means clustering is related to Lloyd’s algorithm is related to Vector Quantization is related to compression is related to MDL is related to Kolmogorov complexity. K-means is also related to kNN classification is related to boosting is related to SVMs is related to regression is related to neural networks.

And so on. All of these concepts share notions. The question is whether they can be treated as the overall expression of one concept.

The answer is probably no, but they certainly could be condensed as new results arise.

Ideas – Basis, Rank, Power, and Community

I began thinking about the selective nature of certain communities in terms of my “panidealist” philosophy this morning, only to come to a shocking conclusion:

Any community that enforces a single set of common beliefs through selection or coercion reduces itself to the strength of a single free-thinking individual.

Recall that my philosophy states that reality is itself an expression of various combinations of ideas. Mathematically, it is the image of a basis of ideas represented as a matrix. I’ve been told that this aspect of my philosophy is also the philosophical view of Bertrand Russell (though I’ve never read his philosophy and don’t really read philosophy in general, preferring to keep my own worldview untainted by the philosophies of others). However, what I am about to propose extends beyond his philosophy.

We can define the rank of a matrix as the number of linearly independent columns. Because the ideas underlying reality form a basis, they are, by definition, full rank. Their expression is the image of this basis, thus it is not full-rank. In other words, redundant ideas are expressed in various facets of reality (which is fine; the idea of sentience is not independent from the idea of humanity, for example).

Now let us take a community that selects for a shared set of ideas. Such selections include “fit”, personality, interests, etc.

Because all members of this community share these ideas in common, the size (and thus rank) of the basis is reduced. The more ideas are shared, the more the community’s basis approaches the size of a single individual.

Now it gets interesting: what if we define intelligence, or “cognitive power” (to differentiate it from the psychometric concept of intelligence), as the number of ideas one is simultaneously capable of expressing or creating?

We discover that a community consisting entirely of shared values is as intelligent as a single person. A community with completely independent ideas or values (deliberately selecting for people who do not match the existing basis would be the only way I can see of approaching one; actually attaining this is impossible) is full rank, and operates optimally save for the fact that any individual idea may not have enough momentum within the community to become fully expressed (a major problem). As this community introduces more redundancy, the size of the basis does not scale with the number of members, and the rank of the community remains the same despite an increasing size. Thus the average cognitive power of the community drops despite increasing membership. Negative returns.
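The argument can be made concrete with a toy model. The sketch below assumes each member is a vector marking which ideas that member holds, takes the community's rank to be the rank of the matrix of those vectors (members are stored as rows here, which gives the same rank as storing them as columns), and takes "average cognitive power" to be rank divided by membership. All of these modelling choices and names are assumptions for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no new independent direction in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def average_power(members):
    """Community rank divided by the number of members."""
    return rank(members) / len(members)


independent = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # three members, three ideas
conformist = [[1, 1, 0]] * 3                     # three members, one shared view
diluted = independent + [[1, 1, 0]]              # a fourth, redundant member
```

In this toy model the conformist community has the rank of a single individual, and adding a redundant member to the independent community leaves its rank unchanged while lowering the average.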

This results in the satisfying conclusion (if the premises are correct, which is a philosophical matter) that any society that continuously expands its membership while selecting for particular ideas will ultimately run itself into the ground, possibly to be overcome by the thought of a single individual.

As Ayn Rand puts it at the end of Anthem, “For they have nothing to fight me with, save the brute force of their numbers. I have my mind.”