Difficulties with a workflow for writing a complete LaTeX book with Python

ymonad 2019-03-09 21:32

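Here is a small script I wrote. It splits a single *.ipynb file and converts it into multiple *.tex files.

Usage:

Copy the script below and save it as main.py.
Run python main.py init. It creates main.tex and style_ipython_custom.tplx.
In your Jupyter notebook, add an extra line #latex:tag_a, #latex:tag_b, ... to each cell you want to extract. Cells with the same tag are extracted into the same *.tex file.
Save it as a *.ipynb file. Fortunately, the current VSCode Python plugin supports exporting to *.ipynb; alternatively, use jupytext to convert *.py to *.ipynb.
Run python main.py path/to/your.ipynb. It creates tag_a.tex and tag_b.tex.
Edit main.tex and add \input{tag_a.tex} or \input{tag_b.tex} wherever you need it.
Run pdflatex main.tex. It produces main.pdf.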
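To try the steps above without opening an editor, a small tagged notebook can be generated by hand. The following is only a sketch for testing: the file name example.ipynb and the helpers code_cell and build_example_notebook are made up for illustration, and it assumes the minimal nbformat v4 JSON layout rather than anything the script itself requires.

```python
import json


def code_cell(source):
    # Minimal nbformat v4 code cell (hypothetical helper, for illustration only).
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def build_example_notebook():
    # Two cells share tag_a, one uses tag_b, and one is left untagged.
    return {
        "nbformat": 4,
        "nbformat_minor": 2,
        "metadata": {},
        "cells": [
            code_cell("#latex:tag_a\nx = 1 + 1\nprint(x)"),
            code_cell("#latex:tag_b\nprint('goes to tag_b.tex')"),
            code_cell("#latex:tag_a\nprint('also goes to tag_a.tex')"),
            code_cell("print('no tag, so the script skips this cell')"),
        ],
    }


if __name__ == "__main__":
    # example.ipynb is a made-up file name; pass it to main.py afterwards.
    with open("example.ipynb", "w", encoding="utf-8") as f:
        json.dump(build_example_notebook(), f, indent=1)
```

Running python main.py example.ipynb on the result should then produce tag_a.tex (from the first and third cells) and tag_b.tex, and report the fourth cell as untagged.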

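The idea behind the script:

Converting a Jupyter notebook to LaTeX with the default nbconvert.LatexExporter produces a complete LaTeX file that includes the macro definitions. Converting every cell that way could create large LaTeX files. To avoid this, the script first creates main.tex, which contains only the macro definitions, and then converts each cell to a LaTeX file without macro definitions. This is done with a custom template file derived from style_ipython.tplx.

Tagging or marking cells would probably best be done with cell metadata, but I could not find a way to set it in the VSCode Python plugin (Issue), so instead the script scans each cell's source with the regex pattern ^#latex:(.*) and removes the matching line before converting the cell to LaTeX.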
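The tag scan can be tried in isolation. This sketch reuses the pattern from the script; the helper name split_tag and the sample cell text are made up for illustration.

```python
import re

# Same pattern as in the script: with re.M, ^ and $ match at every line
# boundary, so the tag line may appear anywhere in the cell.
pattern = re.compile(r'^#latex:(.*?)$(\n?)', re.M)


def split_tag(source):
    """Return (tag, source without the tag line), or (None, source) if untagged."""
    m = pattern.search(source)
    if m is None:
        return None, source
    tag = m.group(1).strip()
    # Cut out the whole match, including the trailing newline if present.
    return tag, source[:m.start(0)] + source[m.end(0):]


cell_source = "#latex:tag_a\nimport math\nprint(math.pi)\n"
tag, cleaned = split_tag(cell_source)
print(tag)      # tag_a
print(cleaned)  # the cell source without the "#latex:tag_a" line
```

Only the first matching line in a cell is used, which mirrors the script's use of pattern.search.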

Source:

import sys
import re
import os
from collections import defaultdict
import nbformat
from nbconvert import LatexExporter, exporters

OUTPUT_FILES_DIR = './images'
CUSTOM_TEMPLATE = 'style_ipython_custom.tplx'
MAIN_TEX = 'main.tex'


def create_main():
    # creates `main.tex` which only has macro definition
    latex_exporter = LatexExporter()
    book = nbformat.v4.new_notebook()
    book.cells.append(
        nbformat.v4.new_raw_cell(r'\input{__your_input__here.tex}'))
    (body, _) = latex_exporter.from_notebook_node(book)
    with open(MAIN_TEX, 'x') as fout:
        fout.write(body)
    print("created:", MAIN_TEX)


def init():
    create_main()
    latex_exporter = LatexExporter()
    # copy `style_ipython.tplx` in `nbconvert.exporters` module to current directory,
    # and modify it so that it does not contain macro definition
    tmpl_path = os.path.join(
        os.path.dirname(exporters.__file__),
        latex_exporter.default_template_path)
    src = os.path.join(tmpl_path, 'style_ipython.tplx')
    target = CUSTOM_TEMPLATE
    with open(src) as fsrc:
        with open(target, 'w') as ftarget:
            for line in fsrc:
                # replace the line so that it does not contain macro definition
                if line == "((*- extends 'base.tplx' -*))\n":
                    line = "((*- extends 'document_contents.tplx' -*))\n"
                ftarget.write(line)
    print("created:", CUSTOM_TEMPLATE)


def group_cells(note):
    # scan each cell source for a tag matching the regexp `^#latex:(.*)`;
    # cells that share the same tag are grouped into the same list
    pattern = re.compile(r'^#latex:(.*?)$(\n?)', re.M)
    group = defaultdict(list)
    for num, cell in enumerate(note.cells):
        m = pattern.search(cell.source)
        if m:
            tag = m.group(1).strip()
            # remove the line which contains tag
            cell.source = cell.source[:m.start(0)] + cell.source[m.end(0):]
            group[tag].append(cell)
        else:
            print("tag not found in cell number {}. ignore".format(num + 1))
    return group


def doit():
    with open(sys.argv[1]) as f:
        note = nbformat.read(f, as_version=4)
    group = group_cells(note)
    latex_exporter = LatexExporter()
    # use the template which does not contain LaTex macro definition
    latex_exporter.template_file = CUSTOM_TEMPLATE
    # create the image output directory if it does not exist yet
    os.makedirs(OUTPUT_FILES_DIR, exist_ok=True)
    for (tag, g) in group.items():
        book = nbformat.v4.new_notebook()
        book.cells.extend(g)
        # unique_key will be prefix of image
        (body, resources) = latex_exporter.from_notebook_node(
            book,
            resources={
                'output_files_dir': OUTPUT_FILES_DIR,
                'unique_key': tag
            })
        ofile = tag + '.tex'
        with open(ofile, 'w') as fout:
            fout.write(body)
            print("created:", ofile)
        # the image data which is embedded as base64 in notebook
        # will be decoded and returned in `resources`, so write it to file
        for filename, data in resources.get('outputs', {}).items():
            with open(filename, 'wb') as fres:
                fres.write(data)
                print("created:", filename)


if len(sys.argv) <= 1:
    print("USAGE: this_script [init|yourfile.ipynb]")
elif sys.argv[1] == "init":
    init()
else:
    doit()

Adam B 2019-03-09 01:03:10

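This is amazing and solves my problem perfectly! I tested it and it works great. Do you think you could add some comments to the code so I can understand what you are doing? You should make a VS Code plugin that does this.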

Adam B 2019-03-09 02:02:04

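Also, I ran into a small problem: my plots cause an error. The LaTeX file generated from a tag looks for output.png, but neither the Jupyter notebook nor main.py creates it. Thanks.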

ymonad 2019-03-09 02:50:55

@AdamB I wrote up the idea behind the script and added comments. I will try to include the images later.

ymonad 2019-03-09 12:59:00

@AdamB I modified the code so that images in the Jupyter notebook are written to the ./images directory.

ymonad 2019-03-09 13:15:52

Done. It should now correctly write the images to the ./images directory, with the tag as a filename prefix.
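For readers who want to see the image step on its own, here is a rough sketch of writing a filename-to-bytes mapping to disk, which is the shape the script reads from resources['outputs']. The helper write_outputs, the fake_outputs dictionary, and the file name tag_a_0_0.png are placeholders invented for this example; the real names are chosen by nbconvert and may differ.

```python
import os


def write_outputs(outputs, base_dir="."):
    """Write a {relative_filename: bytes} mapping below base_dir."""
    written = []
    for filename, data in outputs.items():
        path = os.path.join(base_dir, filename)
        # Make sure the target directory (e.g. ./images) exists first.
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
        written.append(path)
    return written


# Hypothetical stand-in for resources['outputs']; the bytes are not a real PNG.
fake_outputs = {
    os.path.join("images", "tag_a_0_0.png"): b"placeholder image bytes",
}

if __name__ == "__main__":
    for path in write_outputs(fake_outputs):
        print("created:", path)
```

Creating the directory inside the helper means the write does not depend on ./images having been created beforehand.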
