.. include:: ../global.inc
.. _decorators.product:
.. index::
    pair: @product; Syntax

.. seealso::

    * :ref:`@product <new_manual.product>` in the **Ruffus** Manual
    * :ref:`Decorators <decorators>` for more decorators

.. |input| replace:: `input`
.. _input: `decorators.product.input`_
.. |filter| replace:: `filter`
.. _filter: `decorators.product.filter`_
.. |input2| replace:: `input2`
.. _input2: `decorators.product.input2`_
.. |filter2| replace:: `filter2`
.. _filter2: `decorators.product.filter2`_
.. |extras| replace:: `extras`
.. _extras: `decorators.product.extras`_
.. |output| replace:: `output`
.. _output: `decorators.product.output`_
.. |matching_formatter| replace:: `matching_formatter`
.. _matching_formatter: `decorators.product.matching_formatter`_

################################################################################################################################################
product
################################################################################################################################################
************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************
@product( |input|_, |filter|_, [|input2|_, |filter2|_, ...], |output|_, [|extras|_,...] )
************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************


    **Purpose:**

        Generates the Cartesian **product**, i.e. all vs all comparisons, between multiple sets of |input|_ (e.g. **A B C D**, and **X Y Z**).

        The effect is analogous to the Python `itertools.product <http://docs.python.org/2/library/itertools.html#itertools.product>`__
        function, i.e. a nested for loop.

        .. code-block:: pycon
            :emphasize-lines: 2

            >>> from itertools import product
            >>> # product('ABC', 'XYZ') --> AX AY AZ BX BY BZ CX CY CZ
            >>> [ "".join(a) for a in product('ABC', 'XYZ')]
            ['AX', 'AY', 'AZ', 'BX', 'BY', 'BZ', 'CX', 'CY', 'CZ']
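
        The "nested for loop" reading can be spelled out explicitly. The following is a minimal
        sketch using only the standard library; the names ``first_set``, ``second_set`` and ``nested``
        are illustrative and not part of **Ruffus**.

        .. code-block:: python

            from itertools import product

            first_set = ['A', 'B', 'C']
            second_set = ['X', 'Y', 'Z']

            # Two nested loops: every item of the first set meets every item of the second
            nested = []
            for a in first_set:
                for x in second_set:
                    nested.append(a + x)

            # itertools.product yields the same pairs in the same order
            assert nested == ["".join(pair) for pair in product(first_set, second_set)]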

        Only out-of-date tasks (determined by comparing input and output files) will be run.

        |output|_ file names and strings in the extra parameters
        are generated by string replacement via the :ref:`formatter()<decorators.formatter>` filter
        from the |input|_. This can be, for example, a list of file names or the
        |output|_ of upstream tasks.

        The replacement strings require an extra level of nesting to refer to
        parsed components.

            #. The first level refers to which *set* in each tuple of |input|_.
            #. The second level refers to which |input|_ file in any particular *set* of |input|_.

        This will be clear in the following example:

    **Example:**

        Calculates the **@product** of **A,B** and **P,Q** and **X,Y** files.

        If |input|_ is three *sets* of file names:

            .. code-block:: python
                :emphasize-lines: 1-8

                    set1 = [ 'a.start',                         # 0
                             'b.start']

                    set2 = [ 'p.start',                         # 1
                             'q.start']

                    set3 = [ ['x.1_start', 'x.2_start'],        # 2
                             ['y.1_start', 'y.2_start'] ]

        The first job of:

            .. code-block:: python

                @product( input  = set1, filter  = formatter(),
                          input2 = set2, filter2 = formatter(),
                          input3 = set3, filter3 = formatter(),
                          ...)

        will be:


            .. code-block:: python
                :emphasize-lines: 1-6

                # One from each set
                ['a.start']
                # versus
                ['p.start']
                # versus
                ['x.1_start', 'x.2_start']

        First level of nesting (one list of files from each set):
            .. code-block:: python
                :emphasize-lines: 1-3

                ['a.start']                 # [0]
                ['p.start']                 # [1]
                ['x.1_start', 'x.2_start']  # [2]

        Second level of nesting (one file):
            .. code-block:: python
                :emphasize-lines: 1-3

                'a.start'                   # [0][0]
                'p.start'                   # [1][0]
                'x.1_start'                 # [2][0]

        Parsed file name without suffix:
            .. code-block:: python
                :emphasize-lines: 1-3

                'a'                         # {basename[0][0]}
                'p'                         # {basename[1][0]}
                'x'                         # {basename[2][0]}
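
        The two levels of indexing can be tried out without **Ruffus**, because Python's own
        ``str.format`` accepts the same ``{name[i][j]}`` syntax. The following is only a sketch
        of the idea: ``strip_suffix`` and ``job_inputs`` are hypothetical names, and the real
        :ref:`formatter()<decorators.formatter>` may parse file names differently.

        .. code-block:: python

            import os

            # The first job: one entry per input set
            job_inputs = (['a.start'], ['p.start'], ['x.1_start', 'x.2_start'])

            def strip_suffix(file_name):
                """File name without directory or extension (assumed meaning of basename)."""
                return os.path.splitext(os.path.basename(file_name))[0]

            # basename[set index][file index], mirroring the two levels of nesting
            basename = [[strip_suffix(f) for f in file_set] for file_set in job_inputs]

            template = "{basename[0][0]}_vs_{basename[1][0]}_vs_{basename[2][0]}.product"
            output_name = template.format(basename=basename)
            print(output_name)          # a_vs_p_vs_x.product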

        Python code:


            .. code-block:: python
                :emphasize-lines: 4,19,21,24,27,29-32,34-35,37-40

                from ruffus import *
                from ruffus.combinatorics import *

                #   Three sets of initial files
                @originate([ 'a.start', 'b.start'])
                def create_initial_files_ab(output_file):
                    with open(output_file, "w") as oo: pass

                @originate([ 'p.start', 'q.start'])
                def create_initial_files_pq(output_file):
                    with open(output_file, "w") as oo: pass

                @originate([ ['x.1_start', 'x.2_start'],
                             ['y.1_start', 'y.2_start'] ])
                def create_initial_files_xy(output_files):
                    for o in output_files:
                        with open(o, "w") as oo: pass

                #   @product
                @product(   create_initial_files_ab,        # Input
                            formatter("(.start)$"),         # match input file set # 1

                            create_initial_files_pq,        # Input
                            formatter("(.start)$"),         # match input file set # 2

                            create_initial_files_xy,        # Input
                            formatter("(.start)$"),         # match input file set # 3

                            "{path[0][0]}/"                 # Output Replacement string
                            "{basename[0][0]}_vs_"          #
                            "{basename[1][0]}_vs_"          #
                            "{basename[2][0]}.product",     #

                                                            # Extra parameter: path for
                            "{path[0][0]}",                 # 1st set of files, 1st file name

                                                            # Extra parameter: basename for
                            ["{basename[0][0]}",            # 1st set of files, 1st file name
                             "{basename[1][0]}",            # 2nd
                             "{basename[2][0]}",            # 3rd
                             ])
                def product_task(input_file, output_parameter, shared_path, basenames):
                    print("# basenames      = ", " ".join(basenames))
                    print("input_parameter  = ", input_file)
                    print("output_parameter = ", output_parameter, "\n")


                #
                #       Run
                #
                #pipeline_printout(verbose=6)
                pipeline_run(verbose=0)


        This results in:

            .. code-block:: pycon
                :emphasize-lines: 2,6,10,14,18,22,26,30

                >>> pipeline_run(verbose=0)
                # basenames      =  a p x
                input_parameter  =  ('a.start', 'p.start', 'x.start')
                output_parameter =  /home/lg/temp/a_vs_p_vs_x.product

                # basenames      =  a p y
                input_parameter  =  ('a.start', 'p.start', 'y.start')
                output_parameter =  /home/lg/temp/a_vs_p_vs_y.product

                # basenames      =  a q x
                input_parameter  =  ('a.start', 'q.start', 'x.start')
                output_parameter =  /home/lg/temp/a_vs_q_vs_x.product

                # basenames      =  a q y
                input_parameter  =  ('a.start', 'q.start', 'y.start')
                output_parameter =  /home/lg/temp/a_vs_q_vs_y.product

                # basenames      =  b p x
                input_parameter  =  ('b.start', 'p.start', 'x.start')
                output_parameter =  /home/lg/temp/b_vs_p_vs_x.product

                # basenames      =  b p y
                input_parameter  =  ('b.start', 'p.start', 'y.start')
                output_parameter =  /home/lg/temp/b_vs_p_vs_y.product

                # basenames      =  b q x
                input_parameter  =  ('b.start', 'q.start', 'x.start')
                output_parameter =  /home/lg/temp/b_vs_q_vs_x.product

                # basenames      =  b q y
                input_parameter  =  ('b.start', 'q.start', 'y.start')
                output_parameter =  /home/lg/temp/b_vs_q_vs_y.product
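
        The number and order of jobs follow directly from the Cartesian product of the three sets
        (2 x 2 x 2 = 8). The following standalone sketch reproduces the output file names
        (without the directory prefix) using ``itertools.product``; ``first_basename`` is a
        hypothetical helper standing in for ``{basename[N][0]}``, not a **Ruffus** function.

        .. code-block:: python

            import os
            from itertools import product

            set1 = ['a.start', 'b.start']
            set2 = ['p.start', 'q.start']
            set3 = [['x.1_start', 'x.2_start'],
                    ['y.1_start', 'y.2_start']]

            def first_basename(file_set):
                """Basename of the first file in a set (a set may be a single file name)."""
                first = file_set if isinstance(file_set, str) else file_set[0]
                return os.path.splitext(os.path.basename(first))[0]

            # One job per combination; the last set varies fastest
            output_names = ["_vs_".join(first_basename(s) for s in job) + ".product"
                            for job in product(set1, set2, set3)]

            for name in output_names:
                print(name)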


    **Parameters:**


.. _decorators.product.input:

    * **input** = *tasks_or_file_names*
       can be a:

       #.  Task / list of tasks.
            File names are taken from the |output|_ of the specified task(s)
       #.  (Nested) list of file name strings.
            File names containing ``*[]?`` will be expanded as a |glob|_.
            E.g. ``"a.*" => "a.1", "a.2"``
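
       The wildcard expansion can be previewed with Python's standard ``glob`` module, which
       uses the same ``*[]?`` syntax. This is a sketch only: it assumes the expansion behaves
       like ``glob.glob``, and the temporary directory and file names are made up for the example.

       .. code-block:: python

           import glob
           import os
           import tempfile

           with tempfile.TemporaryDirectory() as tmp:
               # Create three empty files, two of which match "a.*"
               for name in ("a.1", "a.2", "b.1"):
                   open(os.path.join(tmp, name), "w").close()

               matches = sorted(os.path.basename(path)
                                for path in glob.glob(os.path.join(tmp, "a.*")))

           print(matches)              # ['a.1', 'a.2']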


.. _decorators.product.filter:

.. _decorators.product.matching_formatter:

    * **filter** = *formatter(...)*
       A :ref:`formatter<decorators.formatter>` indicator object, optionally containing
       a Python `regular expression (re) <http://docs.python.org/library/re.html>`_.
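
       When writing the regular expression, note that an unescaped ``.`` matches any character.
       The sketch below checks the pattern ``(.start)$`` from the example with the standard
       ``re`` module; it assumes the pattern is applied with search semantics, and the file
       name ``a.finish`` is invented for contrast.

       .. code-block:: python

           import re

           pattern = re.compile(r"(.start)$")
           file_names = ["a.start", "x.1_start", "a.finish"]

           captured = {}
           for name in file_names:
               match = pattern.search(name)
               # Record the captured group, or None when the name does not match
               captured[name] = match.group(1) if match else None
               print(name, "->", captured[name])

       Here ``x.1_start`` matches because ``.`` also accepts ``_``; use ``\.start$`` if only a
       literal ``.start`` suffix should match.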


.. _decorators.product.input2:

.. _decorators.product.filter2:

Additional **input** and **filter** as needed:

    * **input2** = *tasks_or_file_names*

    * **filter2** = *formatter(...)*


.. _decorators.product.output:

    * **output** = *output*
        Specifies the resulting output file name(s) after string substitution.


.. _decorators.product.extras:

    * **extras** = *extras*
       Any extra parameters are passed verbatim to the task function.

       If you are using named parameters, these can be passed as a list, i.e. ``extras=[...]``.

       Any extra parameters are consumed by the task function and not forwarded further down the pipeline.

