Using Markdown as a Python Library
==================================

First and foremost, Python-Markdown is intended to be a Python library module
used by various projects to convert Markdown syntax into HTML.

The Basics
----------

To use markdown as a module:

    import markdown
    html = markdown.markdown(your_text_string)

Encoded Text
------------

Note that ``markdown()`` expects **Unicode** as input (although a simple ASCII
string should work) and returns its output as Unicode. Do not pass encoded
strings to it! If your input is encoded, e.g. as UTF-8, it is your
responsibility to decode it first. For example:

    import codecs
    import markdown

    input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8")
    text = input_file.read()
    input_file.close()
    html = markdown.markdown(text, extensions)  # extensions: a list of names

If you later want to write the result to disk, you should encode it yourself:

    output_file = codecs.open("some_file.html", "w", encoding="utf-8")
    output_file.write(html)
    output_file.close()
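
The decode, convert, encode round trip described above can be wrapped in a single helper. The following is a minimal sketch, assuming Python 3. ``convert_utf8_file`` and its ``convert`` parameter are hypothetical names that are not part of Python-Markdown; any callable that maps Unicode text to Unicode text (``markdown.markdown`` would be one candidate) can be passed in.

```python
import io


def convert_utf8_file(src_path, dst_path, convert):
    """Decode src_path as UTF-8, apply convert(), and write the result as UTF-8.

    ``convert`` is any callable taking and returning Unicode text
    (hypothetical parameter, not a Python-Markdown name).
    """
    # Decoding happens here, so convert() only ever sees Unicode text.
    with io.open(src_path, mode="r", encoding="utf-8") as src:
        text = src.read()

    html = convert(text)

    # Encoding happens on the way out; the file on disk holds UTF-8 bytes.
    with io.open(dst_path, mode="w", encoding="utf-8") as dst:
        dst.write(html)
    return html
```

Using ``with`` blocks also closes both files even if the conversion raises an exception.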

More Options
------------

If you want to pass more options, you can create an instance of the ``Markdown``
class yourself and then use ``convert()`` to generate HTML:

    import markdown
    md = markdown.Markdown(
        extensions=['footnotes'],
        extension_configs={'footnotes': ('PLACE_MARKER', '~~~~~~~~')},
        safe_mode=True,
        output_format='html4'
    )
    html = md.convert(some_text)

You should also use this approach if you want to process multiple strings,
since a single instance can be reused:

    md = markdown.Markdown()
    html1 = md.convert(text1)
    html2 = md.convert(text2)

Working with Files
------------------

While the ``Markdown`` class is only intended to work with Unicode text, some
encoding and decoding is required for the command line features. The functions
and methods below are only intended to fit the common use case.

The ``Markdown`` class has the method ``convertFile``, which reads in a file and
writes out to a file or file-like object:

    md = markdown.Markdown()
    md.convertFile(input="in.txt", output="out.html", encoding="utf-8")

The markdown module also includes a shortcut function, ``markdownFromFile``,
that wraps the above method:

    markdown.markdownFromFile(input="in.txt",
                              output="out.html",
                              extensions=[],
                              encoding="utf-8",
                              safe=False)

In either case, if the ``output`` keyword is passed a file name (e.g.
``output="out.html"``), it will try to write to a file by that name. If
``output`` is passed a file-like object (e.g. ``output=StringIO.StringIO()``),
it will attempt to write out to that object. Finally, if ``output`` is
set to ``None``, it will write to ``stdout``.
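
The three cases for ``output`` can be pictured as a simple dispatch. The sketch below is an illustration of the behaviour described above, assuming Python 3, and is not the library's actual implementation. ``write_output`` is a hypothetical helper, and writing text (rather than encoded bytes) to the file-like object is an assumption of this sketch.

```python
import io
import sys


def write_output(html, output=None, encoding="utf-8"):
    """Send Unicode ``html`` to a file name, a file-like object, or stdout."""
    if output is None:
        # No destination given: fall back to standard output.
        sys.stdout.write(html)
    elif isinstance(output, str):
        # A string is treated as a file name; the text is encoded on write.
        with io.open(output, mode="w", encoding=encoding) as out_file:
            out_file.write(html)
    else:
        # Anything else is assumed to be file-like, i.e. to offer write().
        output.write(html)
```

For instance, ``write_output(html, output=io.StringIO())`` collects the result in memory, which can be convenient in tests.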

Using Extensions
----------------

One of the parameters that you can pass is a list of extensions. Extensions
must be available as Python modules, either within the ``markdown.extensions``
package or on your PYTHONPATH with names starting with ``mdx_``, followed by the
name of the extension. Thus, ``extensions=['footnotes']`` will first look for
the module ``markdown.extensions.footnotes``, then for a module named
``mdx_footnotes``. See the documentation specific to the extension you are
using for help in specifying configuration settings for that extension.
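
The lookup order described above can be sketched with the standard ``importlib`` module. This is only an illustration of the naming rule, not the loader Python-Markdown itself uses; ``candidate_module_names`` and ``find_extension_module`` are hypothetical helpers.

```python
import importlib


def candidate_module_names(ext_name):
    """Return the module names to try, in the order described above."""
    return ["markdown.extensions." + ext_name, "mdx_" + ext_name]


def find_extension_module(ext_name):
    """Import and return the first candidate that exists, else None."""
    for module_name in candidate_module_names(ext_name):
        try:
            return importlib.import_module(module_name)
        except ImportError:
            # Not importable under this name; try the next candidate.
            continue
    return None
```

A helper like this can be useful for checking whether an extension name will resolve before passing it in ``extensions``.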

Note that some extensions may need their state reset between each call to
``convert``:

    html1 = md.convert(text1)
    md.reset()
    html2 = md.convert(text2)
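
When many documents are converted with one instance, the reset can be folded into a loop. A minimal sketch follows, assuming only that ``md`` offers the ``reset()`` and ``convert()`` methods shown above; ``convert_all`` is a hypothetical helper, and it resets before every conversion rather than only between them, which is a design choice of this sketch.

```python
def convert_all(md, texts):
    """Convert each text with the same instance, resetting state each time."""
    results = []
    for text in texts:
        # Clear anything an extension kept from the previous document.
        md.reset()
        results.append(md.convert(text))
    return results
```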

Safe Mode
---------

If you are using Markdown on a web system which will transform text provided
by untrusted users, you may want to use the ``safe_mode`` option, which ensures
that the user's HTML tags are either replaced, removed or escaped. (They can
still create links using Markdown syntax.)

* To replace HTML, set ``safe_mode="replace"`` (``safe_mode=True`` still works
    for backward compatibility with older versions). The HTML will be replaced
    with the text defined in ``markdown.HTML_REMOVED_TEXT``, which defaults to
    ``[HTML_REMOVED]``. To replace the HTML with something else:

        markdown.HTML_REMOVED_TEXT = "--RAW HTML IS NOT ALLOWED--"
        md = markdown.Markdown(safe_mode="replace")

    **Note**: You could edit the value of ``HTML_REMOVED_TEXT`` directly in
    ``markdown/__init__.py``, but you would need to remember to do so every time
    you upgrade to a newer version of Markdown. Therefore, this is not
    recommended.

* To remove HTML, set ``safe_mode="remove"``. Any raw HTML will be completely
    stripped from the text with no warning to the author.

* To escape HTML, set ``safe_mode="escape"``. The HTML will be escaped and
    included in the document.
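
To make the difference between the three modes concrete, the sketch below shows what each one would do to a single raw HTML fragment. It is a stand-alone illustration using the standard ``html`` module (Python 3), not Python-Markdown's own code; ``apply_safe_mode`` is a hypothetical function, and the exact set of characters the library escapes is an assumption here.

```python
import html

HTML_REMOVED_TEXT = "[HTML_REMOVED]"  # mirrors the default named above


def apply_safe_mode(raw_html, safe_mode):
    """Return what a raw HTML fragment would become under each mode."""
    if safe_mode is True or safe_mode == "replace":
        # The fragment is swapped for a fixed placeholder text.
        return HTML_REMOVED_TEXT
    if safe_mode == "remove":
        # The fragment disappears without a trace.
        return ""
    if safe_mode == "escape":
        # Escape &, < and > so the markup is displayed rather than rendered.
        return html.escape(raw_html, quote=False)
    # Safe mode off: raw HTML passes through unchanged.
    return raw_html
```

With the input ``<b>hi</b>``, the "escape" branch yields ``&lt;b&gt;hi&lt;/b&gt;``, so the author can still see what was typed.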

Output Formats
--------------

If Markdown is outputting (X)HTML as part of a web page, you will most likely
want the output to match the (X)HTML version used by the rest of your page or
site. Currently, Markdown offers two output formats out of the box: "HTML4" and
"XHTML1" (the default). Markdown will also accept the formats "HTML" and
"XHTML", which currently map to "HTML4" and "XHTML1" respectively. However,
you should use the more explicit keys, as the general keys may change in the
future if it makes sense at that time. The keys can be either lowercase or
uppercase.
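
The key handling described above (case-insensitive keys, with general keys mapped to explicit ones) could be modelled as follows. This is a hedged sketch of the rule as stated here, not the library's code; ``normalize_output_format``, ``ALIASES`` and ``SUPPORTED`` are hypothetical names.

```python
# General keys and the explicit formats they currently stand for.
ALIASES = {"html": "html4", "xhtml": "xhtml1"}
SUPPORTED = ("html4", "xhtml1")


def normalize_output_format(key="xhtml1"):
    """Map any accepted key, in any letter case, to an explicit format name."""
    key = key.lower()
    key = ALIASES.get(key, key)
    if key not in SUPPORTED:
        raise ValueError("unknown output format: %r" % key)
    return key
```

Passing the explicit key in your own code keeps the result stable even if the meaning of a general key changes later.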

To set the output format, do:

    html = markdown.markdown(text, output_format='html4')

Or, when using the ``Markdown`` class:

    md = markdown.Markdown(output_format='html4')
    html = md.convert(text)

Note that the output format is only set once for the class and cannot be
specified each time ``convert()`` is called. If you really must change the
output format for the class, you can use the ``set_output_format`` method:

    md.set_output_format('xhtml1')