The year that was… 2009.

Today is the 31st of December 2009. We are all eager to welcome the new year, 2010. I am very excited about it, as I hope it will bring me a good and interesting job.

But today I am writing this entry to review the past year, 2009. It started off with a party at Radhika Tai’s place in Portland and, academically, with my last semester in sight, leading to the completion of my degree. My last semester at the University was the most fulfilling and enriching for me from an educational perspective. I loved the Machine Learning course, and my research work turned out fine. I have since been hooked on various machine learning algorithms and their applications to real-world problems. Personally, not everything went as planned, but I will not talk too much about that here. The job hunt over the past year has been a rough one, but my work as an RA at the University has been fun. I had to take a break in the last few weeks and return to India for an important function at home.

I hope next year will bring some zing to the markets and cheer to the faces of people around the world. May the next year bring peace and harmony to mankind. May we fight less and live more! Let’s hope next year will be healthier for all of us…

May the next year be a happy and prosperous one for everyone.

3 hour hack at Search result summarization…

One of my friends, Anand Kishore, along with some of his friends at Yahoo, built a nice text summarising app, “Dygest”, using Search Monkey and some other SDKs. Their big achievement, they say, is that they were able to write a statistical text summarising algorithm. I don’t have the details of the algorithm, but what I could see from the summaries was that they were HELPFUL. So first up, kudos to you guys, good job. Next, I was intrigued by how they must have done it; that set my brain rolling, and I started reading up on what they might have done.

A few observations I made about their results:

1. Very well presented. :-) I am not good at web design and stuff, so I really admire that.

2. Their results (sentences) were just too well formed to be machine generated. So I was like, where did they come from?

3. My first hunch was that maybe they just put out the most interesting sentences. That turned out to be partly true: their sentences are in fact “picked” as is from the source text. But they show entire sentences so that the summary reads well, and they probably include more than one sentence to make it sound coherent. They might also be doing some grammar analysis before putting the sentences together.

So these things were going around in my head, and I was like, what would it take to do the simplest thing: pick a meaningful sentence from the text? Then I set out to write my own code to do that. I pulled a Python script from the net that could extract text from a URL (it is not so powerful, but it works). Then I took two web pages returned by Dygest for the search term “Stimulus” and converted them to text using this Python script. I wrote a small Perl script to clean the text and build a “paragraphs × words” matrix, and then scored each paragraph by the number of meaningful words it contributes. Here, meaningful words are the words left after stripping out the stop words (using a stop list put together by searching online). The paragraph with the maximum score is selected as the representative for the article.
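The scoring step can be sketched roughly as below. This is not the original Perl script, just a minimal Python illustration of the same idea. The tiny `STOP_WORDS` set, the function names, splitting paragraphs on blank lines, and counting each distinct meaningful word once per paragraph are all assumptions made for the example.

```python
import re

# Tiny illustrative stop list (an assumption); a real one found online
# would be much longer.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "is", "it", "that",
    "this", "for", "on", "with", "as", "was", "by", "are", "be",
}

def meaningful_words(paragraph):
    """Return the distinct lowercase words left after removing stop words."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    return {w for w in words if w not in STOP_WORDS}

def best_paragraph(text):
    """Split text on blank lines and return the paragraph that contributes
    the most meaningful words, or None if the text is empty."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    if not paragraphs:
        return None
    return max(paragraphs, key=lambda p: len(meaningful_words(p)))
```

Calling `best_paragraph(page_text)` on the extracted text of a page returns the single highest-scoring paragraph. If several paragraphs tie, the first one wins.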

I know this is a really naive way of doing things, but I wanted to validate the idea I had in mind. And it turns out it stands validated. I don’t have the results on this machine right now to put up here, but I will do that once I get home. I was also surprised that it selected some really good paragraphs as answers.

There are a few variations I would have liked to try but did not. Here they are:

1. Use a better importance measure.

2. Add more granularity to the text selected. I could easily go to sentence level and then show the top 3 sentences as answers.

3. Use second-order context similarity between the search term and the selected paragraph. This would be really interesting, but it is a lot more involved and I did not have enough time.
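I have not tried these, but the first two variations could look something like the sketch below. Sentences are scored instead of paragraphs, each meaningful word is weighted by how often it appears in the whole document, and the top 3 sentences are returned. The stop list, the naive sentence splitter, and the frequency-based weight are assumptions made for illustration, not a tested design.

```python
import re
from collections import Counter

# Tiny illustrative stop list (an assumption).
STOP_WORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "is", "it", "that",
    "this", "for", "on", "with", "as", "was", "by", "are", "be",
}

def top_sentences(text, n=3):
    """Return the n highest-scoring sentences, in their original order."""
    # Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [
        [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOP_WORDS]
        for s in sentences
    ]
    # Document-wide frequency of every meaningful word.
    freq = Counter(w for words in tokenized for w in words)
    # Score = sum of document frequencies of the sentence's distinct words.
    scores = [sum(freq[w] for w in set(words)) for words in tokenized]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(ranked)]
```

Returning the chosen sentences in document order, rather than score order, is a small design choice that should help the summary read more coherently.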

All in all, it turned out to be a fun app. I will be uploading it here soon, so keep watching this post, or email me if you are in a real hurry. Again, I don’t claim that the code is top class or the best way to go. It is one way to go, though. I also want to thank Andy (Anand Kishore), whose post on Dygest, mentioned above, inspired this act of mine.

Ice Skating it was fun…

Well, as the gentleman at the rink said, ice skating and skiing are a part of life for a Minnesotan, so I decided to take a peek at it. OK, let me admit it was not that filmy! Today all of us from the International Club went ice skating and, believe me, it was fun. There we met a gentleman, Neel (I believe that was his name), and a kind lady, Kay, who taught me my first lessons in ice skating. Today I was trying to balance myself on my legs. It was difficult, and I was in awe that I could not balance myself on my OWN LEGS!!! And then came the realization: I had those skates under them.

Today I felt probably how a baby feels when he or she is trying to stand on their own feet. It was the experience of a lifetime, and it told me that I could do it; it just needs practice. And then I saw kids not more than 3 years of age doing it far better than me. I mean, I am a bloody 6-foot-tall guy with two strong enough legs trying to stand on those skates, and those little kids were just fabulous. One of them was even figure skating and doing some cool maneuvers. It just taught me the same lesson again: IT NEEDS PRACTICE!

I did fall a few times, and once was while posing for a camera, which I kind of hate coz it just added a fall to my account. Not that anyone was counting, but just for my own pride. ;-)

Well, I must say it was a good experience, and I am sure I will try it again…