Geeks With Blogs

News kaleidoscope 1817, lit. "observer of beautiful forms," coined by its inventor, Sir David Brewster (1781-1868), from Gk. kalos "beautiful" + eidos "shape" (see -oid) + -scope, on model of telescope, etc. Figurative meaning "constantly changing pattern" is first attested 1819 in Lord Byron, whose publisher had sent him one.
Kaleidoscope Everything under the sun, ending in .Net

How it works?

Using the front-end of the service, a user can specify a size in MB for the input data set to sort.



The CreateAndSplit task generates the input data and stores them as 10 blobs in the utility storage. The URLs to these blobs are packaged as Separate work items and written to the queue.

· Separate

The Separate task reads the blobs with the random numbers created in the CreateAndSplit task and places the random numbers into buckets. The interval of the numbers that go into one bucket is chosen so that the expected amount of numbers (assuming a uniform distribution of the numbers in the original data set) is around 100 kB. Each bucket is represented as a blob container in utility storage. Whenever there are 10 blobs in one bucket (i.e., the placement in this bucket is complete because we had 10 original splits), the separate task will generate a new Sort task and write the task into the queue.

· Sort

The Sort task merges all blobs in a single bucket and sorts them using a standard sort algorithm. The result is stored as a blob in utility storage.

· Concat

The concat task merges the results of all Sort tasks into a single blob. This blob can be downloaded as a text file using this Web page. As the resulting file is presented in text format, the size of the file is likely to be larger than the specified input file.


Technorati Tags:
Posted on Sunday, March 21, 2010 4:49 PM | Back to top

Comments on this post: Distributed Sort Sample Service

No comments posted yet.
Your comment:
 (will show your gravatar)

Copyright © kaleidoscope | Powered by: