Example Use Cases

SDSS Data Distribution

The Sloan Digital Sky Survey (SDSS) is an ambitious project that is systematically mapping a quarter of the entire sky. So far it has produced about 13 terabytes of raw data (telescope images). This large data set needs to be delivered to astronomers around the world for further analysis in their research. However, delivering 13TB of data over the Internet has been impractical, so in the past these scientists copied the data onto portable disks and shipped them around.

Today, we work with the astronomers to store the SDSS data in a Sector network installed on 12 nodes connected by 10Gb/s international networks. Astronomers can connect to any node to download the data, and most of them can check out the catalog (about 1.34TB) within several hours. Details can be found at http://sdss.ncdm.uic.edu.
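To put the "several hours" figure in context, the idealized transfer time for a data set can be estimated from its size and the sustained throughput. The sketch below is only back-of-the-envelope arithmetic: the function name and the sample rates are assumptions chosen for illustration, decimal terabytes are assumed, and protocol overhead and disk speed are ignored.

```python
def transfer_hours(size_terabytes: float, rate_gbps: float) -> float:
    """Idealized hours needed to move size_terabytes at a sustained rate_gbps."""
    bits = size_terabytes * 1e12 * 8        # decimal terabytes converted to bits
    seconds = bits / (rate_gbps * 1e9)      # Gb/s converted to bits per second
    return seconds / 3600


if __name__ == "__main__":
    # Sizes are taken from the text; the rates are illustrative assumptions.
    for label, size_tb in [("catalog", 1.34), ("raw images", 13.0)]:
        for rate in (0.1, 1.0, 10.0):
            hours = transfer_hours(size_tb, rate)
            print(f"{label}: {size_tb} TB at {rate} Gb/s takes about {hours:.1f} h")
```

Under these assumptions the 1.34TB catalog needs roughly three hours at a sustained 1 Gb/s, which is in line with the figure quoted above.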

Similarly, Sector can help store and distribute any large data set, especially those produced by scientific experiments and instruments. Data stored in Sector is safe because Sector automatically creates a new copy whenever the number of copies of a particular file falls below a threshold (e.g., 3).
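The threshold rule can be illustrated with a small sketch. This is not Sector's implementation or API: the function names, the file-to-nodes mapping, and the "first nodes that lack the file" placement policy are all hypothetical and serve only to show the idea of topping up under-replicated files.

```python
from typing import Dict, List, Set


def files_needing_replicas(locations: Dict[str, Set[str]],
                           threshold: int = 3) -> Dict[str, int]:
    """Map each under-replicated file to the number of extra copies it needs."""
    return {name: threshold - len(nodes)
            for name, nodes in locations.items()
            if len(nodes) < threshold}


def pick_targets(holders: Set[str], all_nodes: List[str], missing: int) -> List[str]:
    """Pick up to `missing` nodes that do not hold the file yet (hypothetical policy)."""
    candidates = [node for node in all_nodes if node not in holders]
    return candidates[:missing]


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3", "n4"]
    locations = {"image-001.fit": {"n1"}, "image-002.fit": {"n1", "n2", "n3"}}
    for name, missing in files_needing_replicas(locations).items():
        print(name, "copy to", pick_targets(locations[name], nodes, missing))
```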

Angle Network Monitoring

Angle uses distributed data mining methods to detect network intrusions. In a large network, traffic data is collected at multiple geographically distributed locations. At regular intervals, the newly generated Angle data files are uploaded to the Sector network. In particular, data files from a specific location can be uploaded to the local or nearest Sector node.
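One simple way to read "local or nearest" is to prefer a node at the collection site and otherwise fall back to the node with the lowest measured round-trip time. The sketch below assumes exactly that; the function name, the node names, and the use of round-trip time as the distance measure are illustrative assumptions, not a description of how Angle or Sector actually selects a node.

```python
from typing import Dict, Optional


def choose_upload_node(rtt_ms: Dict[str, float],
                       local_node: Optional[str] = None) -> str:
    """Return the local node if reachable, else the node with the lowest round-trip time."""
    if not rtt_ms:
        raise ValueError("no reachable Sector nodes")
    if local_node is not None and local_node in rtt_ms:
        return local_node
    return min(rtt_ms, key=rtt_ms.get)


if __name__ == "__main__":
    # Hypothetical measurements in milliseconds.
    measured = {"chicago": 2.0, "amsterdam": 95.0, "tokyo": 140.0}
    print(choose_upload_node(measured))
```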

The Angle client, which analyzes the collected data, uses the Sphere API to process the distributed data sets. Sphere lets users define a unit processing function that is applied per record, per group of records, or per file. Users do not need to write any explicit code to locate data files, nor do they need to take care of load balancing or fault tolerance.
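The programming model described here, where the user supplies only a unit processing function and the framework applies it to the data, can be imitated on a single machine. The sketch below is not the Sphere API: `run_per_record`, `is_suspicious`, and the comma-separated record layout are hypothetical, and a local thread pool stands in for the distributed scheduling, load balancing, and fault tolerance that Sphere provides.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_per_record(files: Dict[str, List[str]],
                   udf: Callable[[str], object],
                   workers: int = 4) -> Dict[str, list]:
    """Apply udf to every record of every file, handling each file as one task."""
    def process_file(name: str) -> list:
        return [udf(record) for record in files[name]]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process_file, files)   # iterates over file names in order
        return dict(zip(files, results))


def is_suspicious(record: str) -> bool:
    """Hypothetical unit function: flag records whose destination port is 23 (telnet)."""
    _src, _dst, port = record.split(",")
    return int(port) == 23


if __name__ == "__main__":
    # Assumed record layout: source address, destination address, destination port.
    files = {
        "siteA.csv": ["10.0.0.1,10.0.0.2,23", "10.0.0.3,10.0.0.4,80"],
        "siteB.csv": ["10.0.1.1,10.0.1.2,443"],
    }
    print(run_per_record(files, is_suspicious))
```

The caller writes only `is_suspicious`; where the files live and how the work is divided stay inside `run_per_record`, which mirrors the division of responsibility described above.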

©2007 National Center for Data Mining