LaTeX2Python (tex2py)

tex2py converts LaTeX into a Python parse tree, using TexSoup. This allows you to navigate LaTeX files as trees, using either the default or a custom hierarchy. See md2py for a Markdown parse tree.

Note: tex2py currently supports only Python 3.

created by Alvin Wan

Installation

Install via pip.

pip install tex2py

Usage

LaTeX2Python offers only one function, tex2py, which generates a Python parse tree from LaTeX. The resulting object is a navigable "Tree of Contents" abstraction for the LaTeX file.

Take, for example, the following LaTeX file. (See pdf)

chikin.tex

\documentclass[a4paper]{article}
\begin{document}

\section{Chikin Tales}

\subsection{Chikin Fly}

Chickens don't fly. They do only the following:

\begin{itemize}
\item waddle
\item plop
\end{itemize}

\section{Chikin Scream}

\subsection{Plopping}

Plopping involves three steps:

\begin{enumerate}
\item squawk
\item plop
\item repeat, unless ordered to squat
\end{enumerate}

\subsection{I Scream}

\end{document}

Akin to a navigation bar, the TreeOfContents object allows you to expand a LaTeX file one level at a time. Running tex2py on the above file generates a tree that abstracts the structure below.

          <Document>
          /        \
  Chikin Tales   Chikin Scream
      /            /     \
 Chikin Fly  Plopping   I Scream
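
To make the nesting above concrete, here is a minimal standard-library sketch that groups headings into the same shape. It is only an illustration: tex2py itself parses with TexSoup, and the names build_tree, HIERARCHY and SAMPLE are hypothetical, not part of the tex2py API.

```python
import re

# Hypothetical hierarchy: earlier names are higher levels (illustrative only).
HIERARCHY = ("section", "subsection")

# Matches \section{...} and \subsection{...} headings.
HEADING = re.compile(r"\\(section|subsection)\{([^}]*)\}")


def build_tree(tex):
    """Return a nested dict-based tree of the headings found in `tex`."""
    root = {"title": "<Document>", "level": -1, "children": []}
    stack = [root]  # path from the root to the most recent heading
    for match in HEADING.finditer(tex):
        level = HIERARCHY.index(match.group(1))
        node = {"title": match.group(2), "level": level, "children": []}
        # Pop until the top of the stack is a strictly higher-level heading.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


# Headings only, copied from chikin.tex above.
SAMPLE = r"""
\section{Chikin Tales}
\subsection{Chikin Fly}
\section{Chikin Scream}
\subsection{Plopping}
\subsection{I Scream}
"""

tree = build_tree(SAMPLE)
for section in tree["children"]:
    print(section["title"], [sub["title"] for sub in section["children"]])
```

Each section collects the subsections that follow it until the next section begins, which is the grouping the diagram shows.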

At the global level, we can access the first section and its title.

>>> from tex2py import tex2py
>>> with open('chikin.tex') as f: data = f.read()
>>> toc = tex2py(data)
>>> toc.section
Chikin Tales
>>> str(toc.section)
'Chikin Tales'

Notice that at this level, there are no subsections.

>>> list(toc.subsections)
[]

The first section has one subsection beneath it, matching the tree above.

>>> list(toc.section.subsections)
[Chikin Fly]
>>> toc.section.subsection
Chikin Fly

The TreeOfContents class also defines a few more conveniences. Among them is support for indexing: to access the ith child of an <element>, use <element>[i] instead of <element>.branches[i].

See below for example usage.

>>> toc.section.branches[0] == toc.section[0] == toc.section.subsection
True
>>> toc.section[0]
Chikin Fly
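
As a rough sketch of how such conveniences can be built in plain Python, the hypothetical Node class below maps indexing onto branches and resolves singular and plural attribute names against child names. It is not tex2py's actual TreeOfContents implementation; the class, its fields and its plural form returning a list are all assumptions made for illustration.

```python
class Node:
    """Hypothetical stand-in for a tree node; not tex2py's actual class."""

    def __init__(self, name, title, branches=()):
        self.name = name          # e.g. "section" or "subsection"
        self.title = title
        self.branches = list(branches)

    def __getitem__(self, i):
        # node[i] is shorthand for node.branches[i]
        return self.branches[i]

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        if attr.startswith("_"):
            raise AttributeError(attr)
        if attr.endswith("s"):
            # Plural form: every child whose name matches, e.g. node.subsections
            return [b for b in self.branches if b.name == attr[:-1]]
        # Singular form: the first matching child, or None if there is none
        return next((b for b in self.branches if b.name == attr), None)

    def __repr__(self):
        return self.title


scream = Node("section", "Chikin Scream", [
    Node("subsection", "Plopping"),
    Node("subsection", "I Scream"),
])

print(scream[1])
print(scream.subsections)
print(scream.branches[0] is scream[0] is scream.subsection)
```

Routing the shorthand through `__getitem__` and `__getattr__` keeps `branches` as the single source of truth, so the three access styles cannot drift apart.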

You can now print the document tree. There is some weirdness with branches beyond titles, so the tree below shows titles only:

           ┌Chikin Tales┐
           │            └Chikin Fly
 [document]┤
           │             ┌Plopping
           └Chikin Scream┤
                         │        
                         │        
                         └I Scream
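
For comparison, a titles-only outline can also be produced with a few lines of plain Python. This is a simplified sketch, not the printer tex2py uses for the drawing above; the render function and the TREE literal are hypothetical, with titles copied from chikin.tex.

```python
def render(title, children, depth=0):
    """Return the lines of an indented outline for a (title, children) tree."""
    lines = ["  " * depth + title]
    for child_title, grandchildren in children:
        lines.extend(render(child_title, grandchildren, depth + 1))
    return lines


# Titles only, mirroring the structure of chikin.tex.
TREE = ("[document]", [
    ("Chikin Tales", [("Chikin Fly", [])]),
    ("Chikin Scream", [("Plopping", []), ("I Scream", [])]),
])

print("\n".join(render(*TREE)))
```

Indentation by depth conveys the same hierarchy as the box-drawing version, at the cost of a less compact layout.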

Additional Notes

Behind the scenes, tex2py uses TexSoup. All tex2py objects have a source attribute containing a TexSoup object.
