Commits

Russell Power  committed daa0dcd

Fix description.

  • Parent commits 9677f4e

Files changed (2)

 
 Use the MapReduce interface to easily handle processing of larger datasets::
   
-  input_desc = [mycloud.resource.CSV('my_input_%d.csv' % i for i in range(100)]
-  output_desc = [mycloud.resource.BlockedTable('my_output_file_%d' % i) for i in range(1)]
-  
+  from mycloud.resource import CSV  
+  input_desc = [CSV('my_input_%d.csv' % i) for i in range(100)]
+  output_desc = [CSV('my_output_file_%d.csv' % i) for i in range(1)]
+   
   def map_identity(k, v):
     yield (k, int(v[0]))
   
   for k, v in result[0].reader():
     print k, v
 
-
     name="mycloud",
     description="Work distribution for small clusters.",
     long_description='''
+mycloud
+===================
+
+Leverage small clusters of machines to increase your productivity.
+
+mycloud requires no prior setup; if you can SSH to your machines, then
+it will work out of the box.  mycloud currently exports a simple 
+mapreduce API with several common input formats; adding support for
+your own is easy as well.
+
+usage
+=====
+
+Starting your cluster::
+  
+  # list each machine and the number of cores to use
+  cluster = mycloud.Cluster([('machine1', 4),
+                             ('machine2', 4)],
+                            fs_prefix='/path/to/store/results')
+
+Invoke a function over a list of inputs::
+  
+  result = cluster.map(my_expensive_function, range(1000))
+
+Use the MapReduce interface to easily handle processing of larger datasets::
+  
+  from mycloud.resource import CSV  
+  input_desc = [CSV('my_input_%d.csv' % i) for i in range(100)]
+  output_desc = [CSV('my_output_file.csv')]
+   
+  def map_identity(k, v):
+    yield (k, int(v[0]))
+  
+  def reduce_sum(k, values):
+    yield (k, sum(values))
+  
+  mr = mycloud.mapreduce.MapReduce(cluster,
+                                   map_identity,
+                                   reduce_sum,
+                                   input_desc,
+                                   output_desc)
+  
+  result = mr.run()
+  
+  for k, v in result[0].reader():
+    print k, v
 
 ''',
     classifiers=['Development Status :: 3 - Alpha',
     author="Russell Power",
     author_email="power@cs.nyu.edu",
     license="BSD",
-    version="0.11",
+    version="0.15",
     url="http://rjpower.org/browse.cgi/mycloud",
     package_dir={ '' : 'src' },
     packages=[ 'mycloud' ],
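
The `map_identity` and `reduce_sum` functions from the usage example can be tried without a cluster. What follows is a minimal local sketch, assuming (the commit does not confirm this) that the framework groups mapped values by key before calling the reducer. `run_local` and the sample `records` are hypothetical stand-ins for illustration and are not part of mycloud; the sketch uses Python 3 `print()`, whereas the README example uses the Python 2 print statement.

```python
from collections import defaultdict


def map_identity(k, v):
    # Same mapper as in the usage example: emit the key with the first field as an int.
    yield (k, int(v[0]))


def reduce_sum(k, values):
    # Same reducer as in the usage example: sum every value seen for a key.
    yield (k, sum(values))


def run_local(mapper, reducer, records):
    """Hypothetical in-process stand-in for a MapReduce run (not a mycloud API)."""
    grouped = defaultdict(list)
    # Map phase: apply the mapper to each record and group its output by key.
    for k, v in records:
        for out_k, out_v in mapper(k, v):
            grouped[out_k].append(out_v)
    # Reduce phase: call the reducer once per key, in sorted key order.
    results = []
    for k in sorted(grouped):
        results.extend(reducer(k, grouped[k]))
    return results


# Assumed record shape: (key, row), where row is a list of string fields.
records = [("a", ["1"]), ("b", ["2"]), ("a", ["3"])]
print(run_local(map_identity, reduce_sum, records))  # [('a', 4), ('b', 2)]
```

Running the functions in-process like this checks the map and reduce logic before they are handed to `mycloud.mapreduce.MapReduce`.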