Sector: simplifying distributed computing

Sector is system infrastructure software for distributed data storage, access, and analysis/processing. It automatically manages large volumes of data across servers or clusters, including those connected by wide-area high-speed networks. Sector provides simple tools and APIs to access and process the data. Data and server locations are transparent to users: the whole Sector network appears to them as a single networked supercomputer.

Sector uses a DHT-based P2P routing algorithm to store and locate the metadata of resources (data, processing operators, etc.). There is no central management server in the Sector network, and nodes can join and leave the system without affecting ongoing operations. Sector keeps multiple copies of each data file, so that when old servers are removed, new copies can be made automatically on new servers. Sector uses UDT for high-speed data transfer between servers and between servers and clients.
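To make the idea of DHT-based lookup concrete, here is a minimal sketch of a hashed identifier ring with replication. It is not Sector's implementation: the successor rule (a key belongs to the first node at or after its ID, Chord-style) is an assumption, since the text above does not name the specific DHT, and the `Ring` class, the node names, and the file name are all hypothetical.

```python
import bisect
import hashlib


def ring_id(name: str) -> int:
    """Map a node address or file name onto the identifier ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Toy identifier ring (hypothetical, not Sector code)."""

    def __init__(self, replicas: int = 2):
        self.replicas = replicas
        self._ids = []    # sorted node IDs
        self._nodes = {}  # node ID -> node address

    def join(self, address: str) -> None:
        node_id = ring_id(address)
        if node_id not in self._nodes:
            bisect.insort(self._ids, node_id)
            self._nodes[node_id] = address

    def leave(self, address: str) -> None:
        node_id = ring_id(address)
        if node_id in self._nodes:
            self._ids.remove(node_id)
            del self._nodes[node_id]

    def holders(self, key: str) -> list:
        """Return the key's successor node plus the nodes that follow it on the ring."""
        if not self._ids:
            return []
        start = bisect.bisect_left(self._ids, ring_id(key))
        count = min(self.replicas, len(self._ids))
        # The modulo wraps around the ring when the key hashes past the last node.
        return [self._nodes[self._ids[(start + i) % len(self._ids)]] for i in range(count)]


ring = Ring(replicas=2)
for address in ("node-a", "node-b", "node-c", "node-d"):
    ring.join(address)

before = ring.holders("sdss/file-0001.fits")  # hypothetical file name
ring.leave(before[0])                         # the first holder leaves the network
after = ring.holders("sdss/file-0001.fits")
```

In this sketch no node has a special role: any participant can compute `holders` locally, and when the first holder leaves, the lookup resolves to the node that already held the second copy.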

Users can write distributed applications with the Sector client API. Because Sector provides uniform data access across the system, there is no need to move data. More importantly, Sector can automatically locate and schedule processors to run user-defined data processing functions, so there is no need to write code for explicit communication, scheduling, or fault tolerance. This significantly simplifies the development of distributed applications.
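The division of labour described above can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the Sector client API: the functions `schedule` and `run`, the segment and node names, and the simulated failure set are all hypothetical. The user supplies only the processing function (here the built-in `sum`); placement and retry on another replica are handled by the framework code.

```python
from collections import defaultdict


def schedule(segment_holders: dict) -> dict:
    """Assign each segment to the least-loaded node that already stores a copy of it."""
    load = defaultdict(int)
    plan = {}
    for segment, holders in sorted(segment_holders.items()):
        # Ties are broken by node name so the plan is deterministic.
        node = min(holders, key=lambda n: (load[n], n))
        plan[segment] = node
        load[node] += 1
    return plan


def run(segment_holders: dict, data: dict, func, failed=frozenset()) -> dict:
    """Apply func to every segment, falling back to another replica if a node is down."""
    results = {}
    for segment, first_choice in schedule(segment_holders).items():
        others = [n for n in segment_holders[segment] if n != first_choice]
        for node in [first_choice] + others:
            if node in failed:
                continue  # simulated failure: try the next replica holder
            results[segment] = (node, func(data[segment]))
            break
        else:
            raise RuntimeError(f"no live replica for {segment}")
    return results


# Hypothetical layout: each segment is stored on two nodes.
holders = {"seg0": ["n1", "n2"], "seg1": ["n1", "n3"], "seg2": ["n2", "n3"]}
data = {"seg0": [1, 2, 3], "seg1": [4, 5], "seg2": [6]}

plan = schedule(holders)
results = run(holders, data, sum, failed={"n1"})
```

In a real deployment the function would run on the remote node next to the data; the simulation only shows where each segment would be processed and how a failed node is bypassed.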

The Sector software is open source under the GPL. We have just made a prototype release and are still improving it. We are also seeking research partners (from both academia and industry) whose data sets we can help manage and process. (In return, we hope to learn more from users and make Sector better.) If you are interested, please contact us by email.

Sector for SDSS

Sector is already used to distribute the Sloan Digital Sky Survey (SDSS) data, 13 TB in total. The SDSS-Sector server network runs over the 10GE Teraflow Network, and astronomers around the world use Sector to access the data sets. For more information, please visit sdss.ncdm.uic.edu.


UDT

UDT is another project that we have developed. Sector uses UDT for high-speed data transfer between Sector servers and between Sector servers and clients. UDT is an application-level data transfer protocol built on top of UDP. It is reliable and connection-oriented, and it can be used in shared network environments. For more information about UDT, please visit udt.sf.net.

SECTOR | Contact Us | ©2007 National Center for Data Mining