To run scripts in parallel, you must first install mpi4py.
Instructions for doing so are provided on the mpi4py website. Once that has
been accomplished, you're all done! You just need to launch your scripts with
``mpirun`` (or equivalent) and signal to yt
that you want to run them in parallel.
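As a minimal sketch, a parallel-ready script needs only one extra call. This
assumes the ``yt.enable_parallelism()`` interface (older versions of yt used a
different mechanism) and a placeholder dataset path:

.. code-block:: python

   # Minimal sketch of a parallel-ready yt script; the dataset path is a
   # placeholder and enable_parallelism() is the yt 3.x-style entry point.
   import yt

   yt.enable_parallelism()  # tell yt this script is running under MPI

   ds = yt.load("DD0010/DD0010")  # hypothetical Enzo output
   ad = ds.all_data()
   mi, ma = ad.quantities.extrema(("gas", "density"))
   if yt.is_root():
       print(mi, ma)

You would then launch it with something like ``mpirun -np 8 python my_script.py``.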
The alternative to spatial decomposition is a simple round-robin of the grids.
This process allows yt to pool data access to a given Enzo data file, which
ultimately results in faster read times and better parallelism.
The following operations use grid decomposition:
In a fashion similar to grid decomposition, computation can be parallelized
over objects. This is especially useful for
`embarrassingly parallel <http://en.wikipedia.org/wiki/Embarrassingly_parallel>`_
tasks where the items to be worked on can be split into separate chunks and
saved to a list. The list is then split up and each MPI task performs parts of
the work independently.
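As a hedged sketch of this pattern, the ``parallel_objects`` iterator and its
``storage`` keyword can be used to split a list of outputs across tasks and
gather the results (the output names and field below are placeholders):

.. code-block:: python

   # Sketch of object-based parallelism with yt.parallel_objects; the output
   # names and the field choice are illustrative placeholders.
   import yt

   yt.enable_parallelism()

   outputs = ["DD0030/DD0030", "DD0040/DD0040", "DD0050/DD0050"]
   my_storage = {}

   # The list is split among the MPI tasks; each task records its results
   # under a unique key, and the full dictionary is assembled at the end.
   for sto, fn in yt.parallel_objects(outputs, storage=my_storage):
       ds = yt.load(fn)
       sto.result_id = fn
       sto.result = ds.all_data().quantities.extrema(("gas", "density"))

   if yt.is_root():
       for fn in sorted(my_storage):
           print(fn, my_storage[fn])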
You can also request a fixed number of processors to calculate each
angular momentum vector. For example, this script will calculate each angular
momentum vector using a workgroup of four processors.
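A sketch of such a script, assuming a launch on 16 processors so that
``njobs=4`` gives each job a workgroup of four processors (the dataset path,
sphere centers, and radius are placeholders), might look like:

.. code-block:: python

   # Sketch only: the processor count, centers, and radius are assumptions.
   import yt

   yt.enable_parallelism()

   ds = yt.load("DD0010/DD0010")  # hypothetical dataset
   centers = [[0.25, 0.25, 0.25], [0.75, 0.75, 0.75],
              [0.25, 0.75, 0.25], [0.75, 0.25, 0.75]]
   my_storage = {}

   # With njobs=4 on 16 processors, the tasks are split into four workgroups,
   # and each workgroup computes its angular momentum vectors in parallel.
   for sto, c in yt.parallel_objects(centers, njobs=4, storage=my_storage):
       sp = ds.sphere(c, (100.0, "kpc"))
       sto.result_id = str(c)
       sto.result = sp.quantities.angular_momentum_vector()

   if yt.is_root():
       for key in sorted(my_storage):
           print(key, my_storage[key])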
* Projections: projections are parallelized utilizing a quad-tree approach.
Data is loaded for each processor, typically by a process that consolidates
open/close/read operations, and each grid is then iterated over and cells
are deposited into a data structure that stores values corresponding to
positions in the two-dimensional plane. This provides excellent load
balancing, and in serial is quite fast. However, as of yt 2.3, the
operation by which quadtrees are joined across processors scales poorly;
* If you are using object-based parallelism but doing CPU-intensive computations
on each object, you may find that setting ``num_procs`` equal to the
number of processors per compute node can lead to significant speedups.
By default, most MPI implementations will assign tasks to processors on a
'by-slot' basis, so this setting will tell yt to do computations on a single
object using only the processors on a single compute node. A nice application
for this type of parallelism is calculating a list of derived quantities for