How to install mycloud

  1. Download and install ActivePython
  2. Buy and install the Business Edition license from account.activestate.com
  3. Open Command Prompt
  4. Type pypm install mycloud

mycloud contains builds that are only available via PyPM when you have a current ActivePython Business Edition subscription.

Build status by platform (table columns: Python 2.7, Python 3.2, Python 3.3):

  • Windows (32-bit): latest available 0.38; 0.45 never built. Available: 0.38, 0.26, 0.25, 0.23, 0.22, 0.19, 0.18, 0.16, 0.15
  • Windows (64-bit): latest available 0.38; 0.45 never built. Available: 0.38, 0.26, 0.25, 0.23, 0.22, 0.19, 0.18, 0.16, 0.15
  • Mac OS X (10.5+): latest available 0.38; 0.45 never built. Available: 0.38, 0.26, 0.25, 0.23, 0.22, 0.19, 0.18, 0.16, 0.15
  • Linux (32-bit): latest available 0.45. Available: 0.45, 0.43, 0.41, 0.38, 0.37, 0.26, 0.25, 0.23, 0.22, 0.19, 0.18, 0.16, 0.15
  • Linux (64-bit): latest available 0.45. Available: 0.45, 0.43, 0.41, 0.38, 0.37, 0.26, 0.25, 0.23, 0.22, 0.19, 0.18, 0.16, 0.15
 
License: BSD

MyCloud

Leverage small clusters of machines to increase your productivity.

MyCloud requires no prior setup; if you can SSH to your machines, then it will work out of the box. MyCloud currently exports a simple mapreduce API with several common input formats; adding support for your own is easy as well.

Usage

Starting your cluster:

import mycloud

cluster = mycloud.Cluster(['machine1', 'machine2'])

# or use defaults from ~/.config/mycloud
# cluster = mycloud.Cluster()

Map over a list:

result = cluster.map(compute_factors, range(1000))
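The call above leaves compute_factors undefined. As a minimal sketch, assuming it is meant to return the prime factors of an integer (the original does not say), it could look like the code below. The built-in map stands in for cluster.map so the example runs without a cluster.

```python
def compute_factors(n):
    """Return the prime factors of n in ascending order (assumed meaning)."""
    factors = []
    divisor = 2
    # Trial division: strip out each divisor as many times as it divides n.
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    # Whatever remains above 1 is itself prime.
    if n > 1:
        factors.append(n)
    return factors

# Local stand-in for cluster.map(compute_factors, range(1000)).
local_result = list(map(compute_factors, range(1000)))
```

Any function passed to cluster.map presumably has to be serializable to the worker machines, so a plain module-level function like this one is the safest shape.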

ClientFS makes accessing local files seamless!

def my_worker(filename):
  do_work(mycloud.fs.FS.open(filename, 'r'))

cluster.map(my_worker, ['client:///my/local/file'])

Use the MapReduce interface to easily handle processing of larger datasets:

from mycloud.mapreduce import MapReduce, group
from mycloud.resource import CSV
input_desc = [CSV('client:///path/to/my_input_%d.csv' % i) for i in range(100)]
output_desc = [CSV('client:///path/to/my_output_file.csv')]

def map_identity(kv_iter, output):
  for k, v in kv_iter:
    output(k, int(v[0]))

def reduce_sum(kv_iter, output):
  for k, values in group(kv_iter):
    output(k, sum(values))

mr = MapReduce(cluster, map_identity, reduce_sum, input_desc, output_desc)

result = mr.run()

for k, v in result[0].reader():
  print k, v
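
To see what the map and reduce functions compute without a cluster, the sketch below runs the same two functions locally. The group helper here is a hypothetical stand-in built on itertools.groupby, assuming mycloud.mapreduce.group yields (key, values) pairs from key-sorted input. The run_local driver and the sample rows are likewise illustrative and not part of MyCloud.

```python
from itertools import groupby
from operator import itemgetter

def group(kv_iter):
    # Hypothetical stand-in for mycloud.mapreduce.group; assumes key-sorted input.
    for key, pairs in groupby(kv_iter, key=itemgetter(0)):
        yield key, [v for _, v in pairs]

def map_identity(kv_iter, output):
    # Emit each key with the first column of its row parsed as an integer.
    for k, v in kv_iter:
        output(k, int(v[0]))

def reduce_sum(kv_iter, output):
    # Sum all values that share a key.
    for k, values in group(kv_iter):
        output(k, sum(values))

def run_local(mapper, reducer, records):
    # Illustrative single-process driver: map, sort by key, then reduce.
    mapped = []
    mapper(iter(records), lambda k, v: mapped.append((k, v)))
    mapped.sort(key=itemgetter(0))
    reduced = []
    reducer(iter(mapped), lambda k, v: reduced.append((k, v)))
    return reduced

rows = [('a', ['1']), ('b', ['5']), ('a', ['2'])]
totals = run_local(map_identity, reduce_sum, rows)
```

Sorting the mapped pairs by key before reducing plays the role of the shuffle step, which is what lets the reducer see all values for one key together.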

Performance

Keep in mind that MyCloud is written entirely in Python.

Some simple operations I've used it for (6 machines, 96 cores):

  • Sorting a billion numbers: ~5 minutes
  • Preprocessing 1.3 million images (resizing and SIFT feature extraction): ~1 hour

Input formats

MyCloud has built-in support for processing the following file types:

  • LevelDB
  • CSV
  • Text (lines)
  • Zip

Adding support for your own is simple: write a resource class describing how to get a reader and a writer (see resource.py for details).
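
As a rough illustration of the reader/writer idea, here is a sketch of a resource for tab-separated key/value lines. The class names, the constructor, and the add and close methods on the writer are assumptions made for this example. resource.py defines the interface MyCloud actually expects, so check it before adapting this.

```python
import os
import tempfile

class TabSeparatedWriter(object):
    """Hypothetical writer: one 'key<TAB>value' line per record."""
    def __init__(self, path):
        self._f = open(path, 'w')

    def add(self, k, v):
        self._f.write('%s\t%s\n' % (k, v))

    def close(self):
        self._f.close()

class TabSeparatedResource(object):
    """Hypothetical resource exposing reader() and writer()."""
    def __init__(self, path):
        self.path = path

    def reader(self):
        # Yield (key, value) string pairs, one per line.
        with open(self.path) as f:
            for line in f:
                k, v = line.rstrip('\n').split('\t', 1)
                yield k, v

    def writer(self):
        return TabSeparatedWriter(self.path)

# Round trip through a temporary file.
path = os.path.join(tempfile.mkdtemp(), 'demo.tsv')
resource = TabSeparatedResource(path)
w = resource.writer()
w.add('a', 3)
w.add('b', 5)
w.close()
pairs = list(resource.reader())
```

The reader yields (key, value) pairs to match how result[0].reader() is iterated in the MapReduce example. Values come back as strings, so a mapper would convert them as needed.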

Why?!?

Sometimes you're developing something in Python (because that's what you do), and you decide you'd like it to be parallelized. The usual options are multiprocessing (which limits you to a single machine) and Hadoop streaming (which limits you to strings and Hadoop's input formats).

Also, because I could.

Credits

MyCloud builds on the phenomenally useful cloud serialization, SSH/Paramiko, and LevelDB libraries.
