Anandpygami at infogami blog

Technology in IITs

Technology in IITs is really, really bad. Today they are supposed to announce the GATE results, so they are going to receive millions of hits. Are they prepared for this? Not at all. Let me explain my experience with these sites. GATE results will be announced on all the IIT and IISc websites.
I started looking at all the sites and found that IIT Roorkee has a link to GATE Results 2007. So I opened that and entered the number I wanted to see. It said that the number is not qualified. I was puzzled! I knew that can't happen. I tried some random numbers and got the same response. I thought it must be some bug in their program, so I wrote a program to confirm it. I queried 100 registration numbers and it gave the same response for all of them. What a pity! After some time IITK also put up the results, and the response was the same. After a lot of waiting and a lot of frustration I went back to the IIT Roorkee site. Now I see a different error message. It gives the following error:
Now it is clear what they are doing. For every request they open a text file with all the numbers and do a sequential search to find the result. Let's do some quick calculations. Suppose there are 1 million registration numbers. For each number they must store a score. So it takes 6 chars for the number + 5 chars for the score + one space and one newline: 13 chars per number in total. That is 13 MB per request, assuming each request reads the whole file into memory. If they get 1000 requests at a time, it will take 13 GB of RAM and the system starts thrashing. There is a simple alternative. Write a small

People at IITK have done some kind of replication, which is really stupid. They have put the results in 3 directories on the same machine.
How confusing. GATE results are available in the JMET 2006 and JMET 2007 directories. After a while all the servers stopped responding.
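The back-of-the-envelope estimate for the sequential-search approach can be redone in a few lines of Python. This is only a sketch: the record count, field widths and request count are the guesses made in this post, not measured values, and the dict-based lookup (`build_index`, `lookup`) is a hypothetical alternative added for illustration, not necessarily the one this post had in mind.

```python
# Figures below are the post's own guesses, not measured values.
RECORDS = 1_000_000               # registration numbers in the results file
BYTES_PER_RECORD = 6 + 5 + 1 + 1  # number + score + space + newline = 13
CONCURRENT_REQUESTS = 1000        # simultaneous requests


def file_size_bytes(records=RECORDS, bytes_per_record=BYTES_PER_RECORD):
    """Size of the flat results file, at one record per line."""
    return records * bytes_per_record


def peak_memory_bytes(concurrent=CONCURRENT_REQUESTS):
    """Memory needed if every request holds its own copy of the file."""
    return file_size_bytes() * concurrent


def build_index(lines):
    """Hypothetical alternative: parse 'number score' lines once into a dict."""
    index = {}
    for line in lines:
        number, score = line.split()
        index[number] = score
    return index


def lookup(index, number):
    """Dictionary lookup; returns None for an unknown number."""
    return index.get(number)


if __name__ == "__main__":
    print(file_size_bytes() / 10**6, "MB per request")
    print(peak_memory_bytes() / 10**9, "GB for 1000 requests")
    index = build_index(["123456 45.67", "654321 12.00"])  # made-up sample rows
    print(lookup(index, "123456"))
```

With these assumptions the file is 13 MB and 1000 simultaneous copies come to 13 GB. Building the index once and sharing it keeps a single copy of the data in memory and avoids scanning the file on every request.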
Surprising! IITM is working, and I got what I wanted.

last updated 2 years ago

itch

I started scratching my personal itch to port web.py to Scheme. I have been thinking for a while about porting web.py to Scheme and Haskell, but the motivation for this moment came from here.

last updated 2 years ago

browsing history

I started looking for tools to capture my browsing history. These are my findings.
last updated 2 years ago

tracking browsing history

I explore many websites every day. I want to keep track of the important ones with notes, and to be able to go back and read them later. delicious is good, but it is not for that purpose. There I can bookmark interesting websites. Here I want to capture just the browsing history. Sites I visit today will be displayed at the top, and sites I visit often will be displayed in bold or something like that. If you visit more pages on the same site, they will be grouped together. If you see an old page again, it will come to the top. Somehow it should also capture my whole browsing graph. Maybe this is a very good idea for a web 2.0 startup. Using this you can capture the browsing patterns of people and find out which sites are popular, which sites are hot, etc.

last updated 2 years ago

missing markdown!

It's painful to blog with WordPress. I am really missing markdown. I stopped blogging since it's very painful to insert links in WordPress. So I am back here again.

last updated 2 years ago

new blog

I started blogging on WordPress now. Check it out: http://anandology.wordpress.com.

last updated 3 years ago

fortune

Not feeling like working now. Looking at fortune.
last updated 3 years ago

gumstix

Recently I was exploring low-power computers and found a very interesting one, gumstix. It is a very small computer which you can keep in your pocket, and it runs Linux. It has a full-powered web server too! Somebody has already ported Asterisk to it. See this link or the Google cached copy. I am thinking of buying one and trying to implement it in Timbaktu. Unfortunately they do not deliver to India. I should see how to get it.

last updated 3 years ago

svn + trac on ubuntu

I was trying to set up svn + trac on Ubuntu 5.10. Installation was really smooth. I got all the required information from the following 2 links:
last updated 3 years ago

Web.py on apache

I am now able to run my web.py application on top of Apache. It wasn't as difficult as I thought. I downloaded the fastcgi module and installed it. I followed the instructions given in http://webpy.org/install, but it refused to work. I looked at the Apache error logs. They gave the following error:
Then I tried

That's all. I have web.py running on top of Apache. I am working on setting up svn and trac at my workplace. I will try to use web.py to automate some tasks there. In the process I found that lighttpd is the preferred web server for web.py. reddit runs on it! I should install it and try it on my machine.

last updated 3 years ago

The Pragmatic Programmer

Just read The Pragmatic Programmer Quick Reference Guide. It looks really interesting. I think I should get a copy of The Pragmatic Programmer.

last updated 3 years ago

twill

twill is a simple scripting language for web browsing. It uses mechanize. It seems like a better idea to generate a twill script than mechanize code for mechanizer (twiller?).

last updated 3 years ago

YouOS

I am playing with YouOS now and it's pretty cool. It has many applications, including a browser. They have tutorials on developing applications for YouOS. I should get my hands dirty!

last updated 3 years ago

Einstein's Puzzle

I solved Einstein's Puzzle (who owns the fish?) in Python. Here is the code.

last updated 3 years ago

thoughts...

I wrote a web crawler using mechanize, and it worked really well for my job. I wrote a web interface to the downloaded pages using web.py. It really saved a lot of my time by making navigation very easy. I want to make the web interface complete by adding some more things. I have been thinking of learning CSS. web.py, form.py and template.py are all there; I want to play with them and make something that looks beautiful. I found a CSS design and downloaded it. I will play with that over the weekend. Today I saw a new Y Combinator startup, wufoo. Their idea is simple and very nice: make forms online, without effort. It looks really nice, though I couldn't play with it because it wasn't working on my machine :(. But I wonder, infogami will soon be able to do many more things than that. Coming back to the web crawler: I want to build a generic web crawler interface, driven from the web, which generates the code for doing the web crawling.
The idea is quite simple. The first page asks which URL you want to go to, and you enter it. It downloads that page and replaces the links with its own, so that it knows what the user is submitting and can record it as a Python script. Startup School 2006 is coming up. I wish I could go there....

last updated 3 years ago

Hackers box

Quite some time back I read Paul Graham's article Return of the Mac. I personally use a Mac, so I liked that article. But I didn't realize what he meant. I see it now. Almost all the founders of Y Combinator startups I have seen use Macs. TODO: write more..

last updated 3 years ago

Web crawling

I have been trying for quite some days to do some web crawling using Python. I was using urllib2. It was quite painful to read the HTML source and try some heuristics to go to the next page, and none of my attempts succeeded. I guess there was some problem with cookies. After a lot of googling, today I found something interesting: mechanize. I found it from a blog. It talks about applying tidy to the HTML output to make it more robust, etc. I want to try all of these, but I am in a hurry to use mechanize right now. Hope it works.

last updated 3 years ago

Hello, world!

Hello, world!

last updated 3 years ago