Package preference
==================

As a starting point, we want our developers to follow the
`coding guidelines for astropy-affiliated packages <https://docs.astropy.org/en/latest/development/codeguide.html>`_
as closely as possible.
A few important conventions and special cases are outlined here.

Basic preference
----------------

The following packages are favored over alternatives whenever they can solve the problem at hand.
Developers should use them wherever possible.


Standard libraries
    The Python standard library has the highest priority, e.g., ``os`` and ``re``.
``numpy``, ``scipy``, ``matplotlib``
    The "big three" of Python scientific computing.
``astropy`` and its affiliated packages
    For example, ``astropy.io.fits`` is favored over ``pyfits``.


Parallel computing
------------------

The following two packages are preferred for *embarrassingly parallel* computing, i.e., independent tasks that need no inter-process communication.

- ``multiprocessing``: https://docs.python.org/3/library/multiprocessing.html
- ``joblib``: https://joblib.readthedocs.io/en/latest/

.. literalinclude:: ./example_multiprocessing.py
    :linenos:
    :language: python
    :caption: an example of using ``multiprocessing`` for parallel computing

The output is

.. code-block::

    Total time cost: 5.095193147659302 sec!

.. literalinclude:: ./example_joblib.py
    :linenos:
    :language: python
    :caption: an example of using ``joblib`` for parallel computing

The output is

.. code-block::

    [Parallel(n_jobs=5)]: Using backend LokyBackend with 5 concurrent workers.
    [Parallel(n_jobs=5)]: Done   1 tasks      | elapsed:    5.2s
    [Parallel(n_jobs=5)]: Done   2 out of   5 | elapsed:    5.2s remaining:    7.8s
    [Parallel(n_jobs=5)]: Done   3 out of   5 | elapsed:    5.2s remaining:    3.5s
    [Parallel(n_jobs=5)]: Done   5 out of   5 | elapsed:    5.2s remaining:    0.0s
    [Parallel(n_jobs=5)]: Done   5 out of   5 | elapsed:    5.2s finished
    Total time cost: 5.1958301067352295 sec!

.. tip::
    ``joblib`` is recommended for its concise syntax and informative progress output: the whole parallel loop fits in one statement.
    ``n_jobs`` can be set to ``-1`` to use all CPUs. ``backend`` can be set to ``multiprocessing``
    to use the backend built on the standard library ``multiprocessing``, or to ``loky`` (the default), which is designed to be more robust.
    Visit https://joblib.readthedocs.io/en/latest/ for more information on ``joblib``,
    such as the ``batch_size`` and ``verbose`` parameters.
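
The parameters mentioned in the tip can be combined as in the minimal sketch below.
The function names ``slow_square`` and ``run_parallel`` are hypothetical and serve only as an illustration;
``Parallel``, ``delayed``, ``n_jobs``, ``backend`` and ``verbose`` are part of ``joblib``.

.. code-block:: python

    import time

    from joblib import Parallel, delayed


    def slow_square(x, delay=0.1):
        """Hypothetical stand-in for an expensive, independent task."""
        time.sleep(delay)
        return x * x


    def run_parallel(values, n_jobs=-1, backend="loky", verbose=0):
        """Apply slow_square to every value in parallel.

        Results are returned as a list in the same order as the inputs.
        """
        return Parallel(n_jobs=n_jobs, backend=backend, verbose=verbose)(
            delayed(slow_square)(v) for v in values
        )


    if __name__ == "__main__":
        # verbose=10 prints progress messages while the tasks run
        print(run_parallel(range(5), n_jobs=2, verbose=10))

Passing ``backend="multiprocessing"`` or ``backend="threading"`` switches the execution backend without changing the rest of the call.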

For parallel computing that requires inter-process communication, or for distributed computing,
we recommend that developers consider ``mpi4py``: https://github.com/mpi4py/mpi4py.