Should have guessed…

November 17, 2008 at 12:20 am | Posted in arbit, chappar, humour, nitk, sarcasm, Technical | 10 Comments

While riding on the internets, and surfing the tubes, I came across this nifty site called Gender Analyzer. Using free text classifier algorithms from a site called Uclassify, this site aims to judge whether a blog/website is written by a woman or a man. A very active research topic.

Gender Analyzer

I tried it out on some known standard cases, and here’s the goldmine.

Evil Sense

Gosh, I didn’t know that Machine Learning had become so accurate these days. Be paranoid, very.

Incidentally, Chappar, when you were on wordpress, your manliness rating was 83%. Did anything special happen during the transition phase?

A thousand apologies, plus one extra, just in case.

And to those who might think of an oh-so-brilliant “Look who’s talking!!!” line: I’m at 71%. Muha ha ha.

P.S: Incidentally again, this is the 2nd in the chappar series of posts, the first one having been written nearly 2 years ago. [hyper-link to click in case you’re bored]

Update: Google’s Hindi translation of this post is too funny.

excerpt: जबकि internets पर, घुड़सवारी और ट्यूबों सर्फिंग, मैं इस गंधा साइट भर में आया जेंडर विश्लेषक कहते हैं

lol [link]


10 Comments »


  1. My blog happens to be at 61% 🙂

    And i happened to give the same website’s URL as the input string. 85%. Not bad.

    God only knows how that algorithm works!!

  2. the training set wasn’t right, i think.

  3. @kitta- firstly, congrats on 61. The algorithm is available for download. My hunch is that it uses some variant of boosting algorithms. Neural net is a bit tedious for this. It needs to have a semantic-language processing toolset, which might be tougher than the classifier itself. @priya- the problem with machine learning usually lies with an insufficient or incorrect dataset. Here they use 2000 blogs, most of which could be very localised w.r.t. country, age group, topic etc… With more classifications, the learning data will grow, and hopefully become more accurate. Having said that, i’m quite satisfied with its accuracy ;-), a few outliers being funny exceptions.

  4. i found one paper which distinguishes between men and women writers by the types of words they use. apparently some words are ‘male’ and some are ‘female’. it uses some weighting function to determine which is more dominant.

  5. and when i put in some jane austen text in that implementation, it gave a wrong answer. ditto for agatha christie and joanne rowling.

  6. Probably it is yet to account for women’s lib… Results might improve in future versions.

  7. Wow. A sexist program.

  8. […] Issues and Personality Disorders Thanks to Logik for this one. I needed some […]

  9. it translates ‘nifty’ to गंधा?

  10. That one was totally off the mark, but translating “ride” to घुड़सवारी was pretty ‘horsey’.
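
For the curious, the word-weighting idea mentioned in comment 4 can be sketched in a few lines. This is only a toy illustration, not the method from that paper and not what Gender Analyzer or Uclassify actually run: the word list, the weights, and the function names `score` and `guess` are all made up for the example.

```python
import re

# Hypothetical weights, invented for illustration only:
# positive values lean "male", negative values lean "female".
WEIGHTS = {"the": 1.0, "of": 0.5, "with": -1.0, "she": -2.0}


def score(text, weights=WEIGHTS):
    """Sum the weights of all known words in the text (unknown words count as 0)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(weights.get(word, 0.0) for word in words)


def guess(text, weights=WEIGHTS):
    """Return whichever side dominates the total score."""
    total = score(text, weights)
    if total > 0:
        return "male"
    if total < 0:
        return "female"
    return "undecided"
```

With a table this small, the verdict hinges on a handful of common words, which is probably why Jane Austen and friends confuse such schemes.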

