Wednesday, February 20, 2008

so we want to do a beowulf?

yeah right.. consider the extremely boring business of running a simulation for a week, finding out that the simulation parameters were wrong in the first place, and then having to re-run it again and again.. (we're iterative learners, but you know that, right?). i've been through that before and don't intend to go through it again..

hence the beowulf.. now time for some terminology clarification..
  • beowulf, as you must have guessed, is not beowulf the movie. seriously.. no one can "do" a movie.. a group of linux/unix machines working together on the same code is more like it.. something like a 400 core processor.. hah! we beat core2duo big time.. anyways.. most supercomputers are something like large beowulfs..
  • next comes dear MPI, the message passing interface: this dude-ic standard, with libraries for C/C++, lets the machines we talked about communicate and run the code we talked about without over- or under-doing it.
  • SSH: the backbone.. in a setup like ours, the MPI launcher starts its processes on the other machines through secure shell access
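to make the "without over- or under-doing it" bit concrete, here's a tiny sketch in plain python (not MPI itself). it assumes each machine gets a number called its rank (0, 1, 2, ...) and uses that to pick its own slice of the work, so every item gets done exactly once. the name chunk_for_rank and the numbers 10 and 3 are made up for illustration..

```python
def chunk_for_rank(n_items, rank, size):
    """Return the range of item indices that process `rank` (0..size-1) handles."""
    base, extra = divmod(n_items, size)
    # The first `extra` ranks take one additional item each,
    # so the leftover items are spread out instead of piling onto one rank.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


if __name__ == "__main__":
    n_items, size = 10, 3  # hypothetical: 10 simulation runs, 3 machines
    for rank in range(size):
        print(rank, list(chunk_for_rank(n_items, rank, size)))
```

in a real MPI program the rank and size would come from the MPI library instead of a loop, and the partial results would be sent back to one machine as messages.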

as for what we plan to do with this monster of a cluster, we haven't a final idea.. what i'd propose is some kind of dna simulation for a start, since i'm already familiar with the software and procedures.. another option would be the mersenne thing (http://www.mersenne.org/), as suggested by vinayakzark, who shall be generously contributing to the cluster soon. more ideas are awaited..

Saturday, February 16, 2008

Sharing our sharing experience

MPI was successfully installed on 3 comps today (thanks to Krishna for lending his comp to this project). We wrote the same password into the .mpd.conf files on the 3 nodes (TMI, I know). We also changed our ssh and firewall settings. Running the client-server pairs on two computers worked. However, we ran into problems when trying to form a ring by running mpdboot over ssh to the 2 other nodes from the same comp. Right now we attribute this problem to the ssh sessions requiring passwords. After checking out the links [1] and [2], we hope to get the thing fixed tomorrow using silent (passwordless) logins.
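A quick way to test our password theory would be something like the sketch below. It assumes the OpenSSH client is installed and relies on its BatchMode option, which makes ssh fail instead of prompting for a password. The host names node1 and node2 and the function names are placeholders, not our actual setup.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that fails rather than asking for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def can_login_silently(host):
    """Return True if `host` accepts an ssh login without any prompt."""
    try:
        result = subprocess.run(
            ssh_check_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is missing or the node did not answer in time.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    for host in ["node1", "node2"]:  # hypothetical node names
        status = "ok" if can_login_silently(host) else "needs a password or is unreachable"
        print(host, status)
```

If a node fails this check, it either cannot be reached or still wants a password, which would fit what we saw with mpdboot.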
