Metadata-Version: 2.4
Name: wikitrans
Version: 1.4
Summary: Wiki markup translator.
Home-page: http://www.gnu.org.ua/projects/wikitrans
Author: Sergey Poznyakoff
Author-email: gray@gnu.org
License: GPL License
Keywords: mediawiki markup translation
Platform: any
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License (GPL)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Markup
Requires-Python: >=3.5
Description-Content-Type: text/x-rst
License-File: COPYING.txt
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: platform
Dynamic: requires-python
Dynamic: summary

MediaWiki Markup Translator
===========================
This package provides a Python framework for translating MediaWiki
articles to various formats. The present version supports
conversion to plain text, HTML, and Texinfo.

A command line converter utility is included.

Classes
=======

class ``WikiMarkup``
--------------------
A base class for all translator classes. Unless you plan to extend
wikitrans, you will never need to create objects of this
class. Instead, you will use one of its derived classes.

Constructor arguments common to all derived classes:

filename = *name*
  The file *name* is opened and used for input.
file = *fd*
  An already opened file *fd* is used for input.
text = *string*
  Input is taken from *string*, line by line.

lang = *code*
  Specifies language version. Default is ``en``. This variable can be
  referred to as ``%(lang)s`` in the keyword arguments below.
html_base = *url*
  Base URL for cross-references. Default is
  ``http://%(lang)s.wikipedia.org/wiki/``.
image_base = *url*
  Base URL for images. Default is
  ``http://upload.wikimedia.org/wikipedia/commons/thumb/a/bf``
media_base = *url*
  Base URL for media files. Default is
  ``http://www.mediawiki.org/xml/export-0.3``

debug_level = *int*
  Debug verbosity level (0 - no debug info, 100 - excessively
  copious debug messages). Default is 0.

strict = *bool*
  Strict parsing mode. Throw exceptions on syntax errors. Default is False.
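
The ``%(lang)s`` reference has the form of Python's mapping-style
string formatting. The sketch below only illustrates how such a
template expands for a given language code; the ``expand`` helper is
hypothetical and is not part of wikitrans::

   # Hypothetical helper, not part of wikitrans: shows how a
   # %(lang)s placeholder expands with the % operator.
   def expand(template, lang='en'):
       return template % {'lang': lang}

   html_base = 'http://%(lang)s.wikipedia.org/wiki/'
   print(expand(html_base))         # http://en.wikipedia.org/wiki/
   print(expand(html_base, 'uk'))   # http://uk.wikipedia.org/wiki/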

class ``TextWikiMarkup``
------------------------
Translates material in Wiki markup language to plain text. Usage::

   from WikiTrans.wiki2text import TextWikiMarkup

   markup = TextWikiMarkup(filename='input.txt')
   markup.parse()
   print(str(markup))

Specific constructor arguments:

width = *N*
  Limit output width to *N* columns. Default is 78.  
show_urls = *bool*
  Whether or not to show the URLs links refer to. If *bool* is
  ``True`` (the default), a URL will be displayed in parentheses next
  to the link text. If ``False``, only the link text will be displayed. 

class ``TextWiktionaryMarkup``
------------------------------
Translates material from Wiktionary to plain text. This class is
intended to provide a Wiktionary-specific form of
``TextWikiMarkup``. Currently, it differs from
``TextWikiMarkup`` only in that the default value for ``html_base``
is ``http://%(lang)s.wiktionary.org/wiki/``.


class ``TexiWikiMarkup``
------------------------
Translate Wiki markup to Texinfo source. Usage::

   from WikiTrans.wiki2texi import TexiWikiMarkup

   markup = TexiWikiMarkup(filename='input.txt')
   markup.parse()
   print(str(markup))

Two markup-specific keywords control the sectioning model used.

sectioning_model = *model*
  Selects the Texinfo sectioning model for the output
  document. Possible values are:

  ``numbered``
     Top of document is marked with ``@top``. Headings (``=``, ``==``,
     ``===``, etc) produce ``@chapter``,
     ``@section``, ``@subsection``, etc.
  ``unnumbered``
     Unnumbered sectioning: ``@top``, ``@unnumbered``, ``@unnumberedsec``,
     ``@unnumberedsubsec``.
  ``appendix``
     Sectioning suitable for appendix entries: ``@top``, ``@appendix``,
     ``@appendixsec``, ``@appendixsubsec``, etc.
  ``heading``
     Use heading directives to reflect sectioning: ``@majorheading``,
     ``@chapheading``, ``@heading``, ``@subheading``, etc.

sectioning_start = *n*
  Shift resulting heading level by *n* positions. For example, supposing
  ``sectioning_model=numbered``, ``== A ==`` will produce ``@section
  A`` on output. If ``sectioning_start=1`` is also given, this
  directive will produce ``@subsection A`` instead.
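
The interaction between the two keywords can be sketched as a table
lookup. The following is an illustration built from the descriptions
above, not the code wikitrans uses: ``texinfo_command`` is a
hypothetical helper, and the clamping of deep levels is an
assumption. The ``heading`` model is left out because the text above
does not say which heading level its first directive corresponds to::

   # Illustrative only: command tables copied from the list above;
   # texinfo_command() is a hypothetical helper, not wikitrans API.
   SECTIONING = {
       'numbered':   ['@chapter', '@section', '@subsection'],
       'unnumbered': ['@unnumbered', '@unnumberedsec', '@unnumberedsubsec'],
       'appendix':   ['@appendix', '@appendixsec', '@appendixsubsec'],
   }

   def texinfo_command(level, model='numbered', start=0):
       """Map a wiki heading level (number of '=' signs) to a command."""
       commands = SECTIONING[model]
       # Assumption: levels past the end of the table reuse the last entry.
       index = min(level - 1 + start, len(commands) - 1)
       return commands[index]

   print(texinfo_command(2))             # @section
   print(texinfo_command(2, start=1))    # @subsection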

class ``HtmlWikiMarkup``
------------------------
Translates Wiki markup to HTML. Usage::

   from WikiTrans.wiki2html import HtmlWikiMarkup

   markup = HtmlWikiMarkup(filename='input.txt')
   markup.parse()
   print(str(markup))

Supported keywords are the same as for the ``WikiMarkup`` class.

class ``HtmlWiktionaryMarkup``
------------------------------
Translates material from Wiktionary to HTML. This class is
intended to provide a Wiktionary-specific form of
``HtmlWikiMarkup``. Currently both classes are equivalent, except that
the default value for ``html_base`` in ``HtmlWiktionaryMarkup``
is ``http://%(lang)s.wiktionary.org/wiki/``.

The ``wikitrans`` utility
=========================
This command line utility converts the supplied text to the selected
output format. The usage syntax is::

  wikitrans [OPTIONS] ARG

If ARG looks like a URL, the wiki text to be converted will be
downloaded from that URL.

Otherwise, if the ``--base-url=URL`` option is given, ARG is treated as
the name of the page to get from the MediaWiki installation at ``URL``.

Otherwise, ARG is treated as the name of the file to read wiki
material from.
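
The order in which these three rules apply can be sketched as
follows. This is a hedged illustration, not the utility's actual
code: ``classify_arg`` is a hypothetical function, and the test for
what "looks like a URL" (an ``http`` or ``https`` scheme plus a host)
is an assumption::

   from urllib.parse import urlparse

   # Hypothetical helper, not part of wikitrans.
   def classify_arg(arg, base_url=None):
       """Return 'url', 'page' or 'file' following the three rules above."""
       # Assumption: a URL has an http(s) scheme and a host part.
       parts = urlparse(arg)
       if parts.scheme in ('http', 'https') and parts.netloc:
           return 'url'
       if base_url is not None:
           return 'page'
       return 'file'

   print(classify_arg('text.wiki'))                                  # file
   print(classify_arg('door', base_url='http://en.wiktionary.org'))  # page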

Examples::

  wikitrans text.wiki

  wikitrans --base-url http://en.wiktionary.org door

  wikitrans https://en.wiktionary.org/wiki/Special:Export/door

Options are:

``--version``
  Show program's version number and exit.
``-h``, ``--help``
  Show a short usage summary and exit.
``-v``, ``--verbose``
  Verbose operation.
``-I ITYPE``, ``--input-type=ITYPE``
  Set input document type. *ITYPE* is one of: ``default`` or ``wiktionary``.
``-t OTYPE``, ``--to=OTYPE``, ``--type=OTYPE``
  Set output document type (``html`` (the default), ``texi``,
  ``text``, or ``dump``).
``-l LANG``, ``--lang=LANG``
  Set input document language.
``-o KW=VAL``, ``--option=KW=VAL``
  Pass the keyword argument ``KW=VAL`` to the parser class constructor.
``-d DEBUG``, ``--debug=DEBUG``
  Set debug level (0..100).
``-D``, ``--dump``
  Dump parse tree and exit; same as ``--type=dump``.
``-b URL``, ``--base-url=URL``
  Set base URL.

Note: when using ``--base-url`` or passing a URL as an argument (2nd and 3rd
use cases above), if the URL is in the 'wikipedia.org' or 'wiktionary.org'
domain, the options ``--input-type`` and ``--lang`` are set automatically.
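
One way such automatic detection could work is sketched below. It is
an illustration under stated assumptions, not the utility's actual
logic: ``guess_settings`` is a hypothetical function, the host is
assumed to have the form ``LANG.wikipedia.org`` or
``LANG.wiktionary.org``, and mapping Wikipedia to the ``default``
input type is an assumption::

   from urllib.parse import urlparse

   # Hypothetical helper, not part of wikitrans.
   def guess_settings(url):
       """Return (input_type, lang) guessed from url, or None."""
       host = urlparse(url).hostname or ''
       labels = host.split('.')
       # Assumption: the host consists of exactly LANG.SITE.org.
       if len(labels) == 3 and labels[2] == 'org':
           if labels[1] == 'wikipedia':
               return ('default', labels[0])
           if labels[1] == 'wiktionary':
               return ('wiktionary', labels[0])
       return None

   print(guess_settings('http://en.wiktionary.org'))   # ('wiktionary', 'en')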

