
Wednesday, February 4, 2009

Grid Computing and Its Prospects in Nepal -- We Are Responsible for Change

Abstract
This paper examines the prospects of grid computing technology in Nepal. Using examples such as SETI@home and the Large Hadron Collider, it explains why Nepal should shift gears toward this technology and how it can do so. It does not, however, explain the organization and architecture of the grids used to solve complex computational problems. Finally, it draws on my personal experience with my minor project, "Linux Cluster: Prime Number Calculation", and the observations I made along the way.

INTRODUCTION
Looking at the present state of the Information and Communication Technology industry in Nepal, dependency on foreign agencies is extensive. Sustainability, originality and reusability are key aspects of software engineering, but with this level of dependency we could lose everything at once. A better idea would be to invest our knowledge and credibility in an independent, advanced technology that could change the way the world thinks. One such technology is grid computing.


GRID COMPUTING OVERVIEW
Grid computing is a computing model that achieves higher-throughput computing by using many networked computers to form a virtual computer architecture that can distribute process execution across a parallel infrastructure. The basic concept of grid computing is therefore to make use of underutilized resources. For example, a typical machine uses only about one-tenth of its processing power during normal operation. If we could pool all the wasted CPU cycles, that power could be used to solve complex and computationally expensive problems. The term grid, however, may mean different things to different people. To some users, a grid is any network of machines, including personal or desktop computers, within a virtual organization. To others, grids are networks that include computer clusters, clusters of clusters and so on. Grids are usually heterogeneous networks. Grid nodes, generally individual computers, consist of different hardware and run a variety of operating systems, and the networks connecting them vary in bandwidth.
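
The one-tenth figure above can be turned into a rough back-of-the-envelope estimate. The sketch below is only an illustration: the function name, the six-machine lab, and the assumption that every machine is equally powerful and equally idle are all hypothetical simplifications, not measurements.

```python
def pooled_idle_capacity(machines, utilization=0.1):
    """Return the idle capacity of a pool, in whole-machine equivalents.

    Assumes (hypothetically) identical machines that are all busy for the
    same fraction of the time, given by `utilization`.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return machines * (1.0 - utilization)


# An illustrative lab of six desktops, each only 10% busy.
idle = pooled_idle_capacity(6)
```

Under these assumptions, six such desktops leave about 5.4 machine-equivalents of unused processing power that a grid could, in principle, put to work.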

PRACTICAL REALIZATIONS: SETI@HOME
SETI@home (Search for Extra-Terrestrial Intelligence at Home) is the most popular and successful grid computing project in the world. It was released in May 1999 by the Space Sciences Laboratory at the University of California, Berkeley, with the aim of finding intelligent life beyond Earth. The objective of the search was to analyze radio frequency signals in a particular band (a 2.5 MHz band centered at 1420 MHz) for any abnormality that might be a sign of an extraterrestrial radio source. The radio signals received from the telescope were converted to binary data and transferred to the Berkeley laboratory, at a rate of about 35 GB of data each day. This data is then broken down into smaller work units that are functionally independent of each other, and each is fed into an algorithm that looks for anomalies. Processed on a simple desktop computer, a work unit requires about 2.5 trillion mathematical operations and about 10-50 hours. Physicists at the laboratory faced a big problem because of the huge amount of data generated each day. Processing and analyzing it would have required many powerful computers, and commissioning supercomputers would have meant millions of dollars of investment. This is where grid computing became effective. The principle behind the solution is to use the millions of CPU cycles that go to waste when ordinary desktop PCs and laptops are idle. The SETI@home developers therefore built a simple client/server application called BOINC, in which the client side receives work units and data sets from the Berkeley laboratory's servers, processes them and sends the results back. All that is required is an application installed on each participating node that can interact with the server.
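
The split, process and collect cycle described above can be sketched in a few lines. This is a minimal illustration and not SETI@home or BOINC code: the function names, the threshold test standing in for the real signal analysis, and the use of local threads in place of remote volunteer machines are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(samples, unit_size):
    """Cut a long recording into independent, fixed-size work units."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


def analyze_unit(unit, threshold=10.0):
    """Hypothetical stand-in for the real analysis: keep samples above a threshold."""
    return [value for value in unit if value > threshold]


def run_grid(samples, unit_size, clients=4):
    """Hand each work unit to a 'client' (a thread here) and merge the results."""
    units = split_into_work_units(samples, unit_size)
    with ThreadPoolExecutor(max_workers=clients) as pool:
        # map() returns results in the same order as the submitted units.
        results = list(pool.map(analyze_unit, units))
    # Server side: flatten the per-unit findings into a single report.
    return [hit for unit_hits in results for hit in unit_hits]


recording = [1.0, 2.0, 50.0, 3.0, 1.5, 0.5, 42.0, 2.5, 1.0]
anomalies = run_grid(recording, unit_size=4)
```

Because no work unit depends on another, the units can be processed in any order and on any machine, which is what makes this kind of problem a good fit for a grid.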


PRACTICAL REALIZATIONS: CERN
Another example is the world's largest and highest-energy particle accelerator, the Large Hadron Collider at CERN (European Organization for Nuclear Research). The project is expected to be completed in May 2008 and to give physicists a remarkable view of how particles behave. Its biggest technical problem, however, was that it would produce over 10 petabytes of data a year. Data on this scale would call for a large number of supercomputers, and hence massive technical and equipment costs. Grid computing is a natural solution. It uses a large number of smaller systems spread across a large geographical region and presents them to the world as a single, unified system. It enables computing and data resources to be pooled and used to solve complex mathematical problems. The technique is the latest development in an evolution that earlier brought forth such advances as distributed computing, the World Wide Web, and collaborative computing.


GRID COMPUTING APPLICATIONS
Grid computing connects a wide array of machines and other resources to rapidly process and solve problems beyond an organization's available capacity. Academic and government researchers have used it for several years to solve large-scale problems, and the private sector is increasingly adopting the technology to create innovative products and services, reduce time to market, and enhance business processes. Businesses, and not only scientists, can optimize computing and data resources, pool them for large-capacity workloads, share them across networks, and enable collaboration. The cancer research project from United Devices also uses grid computing to analyze millions of combinations of chemical data for cancer treatment. Many other industries now use grid computing as well: automotive and aerospace, for collaborative design and data-intensive testing; financial services, for running long, complex scenarios and arriving at more accurate decisions; life sciences, for analyzing and decoding strings of biological and chemical information; government, for enabling seamless collaboration and agility in both civil and military departments and agencies; and higher education, for enabling advanced, data- and compute-intensive research.

GRID COMPUTING AND NEPAL
In developing countries like Nepal, the government cannot afford supercomputers for research in physics or mathematics. For bodies such as the Election Commission and the National Planning Commission, whose work involves large computations, data mining and data storage, grid computing can therefore be a good solution. For example, in the National Planning Commission, where census data are collected and analyzed every ten years, data mining can be used to study patterns in education rates and their causes, and patterns in infant mortality can be analyzed in the same way. This kind of analysis requires a great deal of storage, computation and processing power, which an ordinary desktop computer cannot deliver. Grid computing can thus be used to speed trade transactions, crunch huge volumes of data, and provide a more stable IT environment. Moreover, we can use grids to pool, secure, and integrate vast stockpiles of data. Considering present circumstances, we may not be able to run high-budget, complex grid computing projects like SETI@home and the Large Hadron Collider, but we can at least apply these technologies to academic and government research. One example is the project initiated by the Department of Computer Science and Engineering, Kathmandu University, under my own coordination and with support from Assistant Professor Pursottam Kharel. The project was a first attempt at putting the department's underutilized computers to work. In the process, we built a Linux cluster from 6 off-the-shelf computers to calculate an array of prime numbers. Even though the project was not a great success because of technical problems with openMosix, it was a worthwhile attempt at developing distributed computing. Similar projects should be initiated in companies involved in the life sciences, such as genome research and pharmaceutical development, which can use parallel and grid computing to process, cleanse, cross-tabulate, and compare massive amounts of data. Faster processing means getting to market faster, and in those industries a slight edge can be the deciding factor. The technology can also be deployed in the Nepal Army and Nepal Police, since many civilian and military agencies need cross-agency collaboration, data integrity and security, and lightning-fast access to information across thousands of data repositories.
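
The prime number calculation mentioned above lends itself to this kind of division of labor. The sketch below is not the project's actual code, and it does not use openMosix: it assumes a hypothetical setup in which the range to search is split into six contiguous chunks, one per node, with local threads standing in for the six machines. All function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n):
    """Trial division: enough for a demonstration, not for very large numbers."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def split_range(start, stop, parts):
    """Divide [start, stop) into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(stop - start, parts)
    chunks, lo = [], start
    for i in range(parts):
        hi = lo + size + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def primes_in(chunk):
    """Work done by one node: find the primes in its own chunk."""
    lo, hi = chunk
    return [n for n in range(lo, hi) if is_prime(n)]


def cluster_primes(start, stop, nodes=6):
    """Give each 'node' (a thread here) one chunk and merge the results in order."""
    chunks = split_range(start, stop, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        parts = list(pool.map(primes_in, chunks))
    return [p for part in parts for p in part]


primes = cluster_primes(0, 100)
```

Threads in Python do not speed up CPU-bound work like this. They are used here only to show the structure: each chunk is independent, so on a real cluster each one could run as a separate process on a separate machine.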

CONCLUSION
This paper argues for a change in the software tradition of Nepal. We are currently dependent on offshore clients. This can be unreliable and unsustainable, so it is time for ICT professionals to find alternatives and initiate projects such as grid computing. Such projects will bring benefits in cost, security, computation and storage, and will thus provide a sustainable solution for the long term.
