
JavaScript, Hugo: convert an unordered list to JSON

Posted on 2021-01-05 03:42:05


I want to convert the HTML below

<nav id="TableOfContents">
  <ul>
    <li><a href="#js">JS</a>
      <ul>
        <li><a href="#h3-1">H3-1</a></li>
        <li><a href="#h3-2">H3-2</a></li>
      </ul>
    </li>
    <li><a href="#python">Python</a>
      <ul>
        <li><a href="#h3-1-1">H3-1</a></li>
        <li><a href="#h3-2-1">H3-2</a></li>
      </ul>
    </li>
  </ul>
</nav>

into the following JSON format, using JavaScript (or Hugo):

{"t": "root", "d": 0, "v": "", "c": [
    {"t": "h", "d": 1, "v": "<a href=\"#js\">JS</a>", "c": [
            {"t": "h", "d": 2, "v": "<a href=\"#h3-1\">H3-1</a>"},
            {"t": "h", "d": 2, "v": "<a href=\"#h3-2\">H3-2</a>"}
        ]
    },
    {"t": "h", "d": 1, "v": "<a href=\"#python\">Python</a>", "c": [
            {"t": "h", "d": 2, "v": "<a href=\"#h3-1-1\">H3-1</a>"},
            {"t": "h", "d": 2, "v": "<a href=\"#h3-2-1\">H3-2</a>"}
        ]
    }
]}

The above is just an example. The function you provide should be able to convert any similar structure into a JSON string like the one above (in fact, the only requirement is that nothing is hard-coded).

You can structure your code like this:

<!DOCTYPE html>
<html>
<head>
  <script>
    var input_text = `<nav id="TableOfContents">...`;
    var output = my_func(input_text);
    console.log(output); /* expected result: {"t": "root", "d": 0, "v": "", "c": [ ... */

    function my_func(text) {
      /* ... */
    }
  </script>
</head>
</html>

Long Story

What do I want to do?

I want to use a mind map (markmap, markmap-github) in Hugo and apply it to the article template (single.html) as the table of contents.

Hugo already provides the TOC through TableOfContents.

For example, given My.md:

## JS

### H3-1

### H3-2

## Python

### H3-1

### H3-2

and then {{ .TableOfContents }} will output

<nav id="TableOfContents">
  <ul>
    <li><a href="#js">JS</a>
      <ul>
        <li><a href="#h3-1">H3-1</a></li>
        <li><a href="#h3-2">H3-2</a></li>
      </ul>
    </li>
    <li><a href="#python">Python</a>
      <ul>
        <li><a href="#h3-1-1">H3-1</a></li>
        <li><a href="#h3-2-1">H3-2</a></li>
      </ul>
    </li>
  </ul>
</nav>

However, to use markmap I must provide it with a JSON string, like the one below:

{"t": "root", "d": 0, "v": "", "c": [
    {"t": "h", "d": 1, "v": "<a href=\"#js\">JS</a>", "c": [
            {"t": "h", "d": 2, "v": "<a href=\"#h3-1\">H3-1</a>"},
            {"t": "h", "d": 2, "v": "<a href=\"#h3-2\">H3-2</a>"}
        ]
    },
    {"t": "h", "d": 1, "v": "<a href=\"#python\">Python</a>", "c": [
            {"t": "h", "d": 2, "v": "<a href=\"#h3-1-1\">H3-1</a>"},
            {"t": "h", "d": 2, "v": "<a href=\"#h3-2-1\">H3-2</a>"}
        ]
    }
]}

What have I done?

I tried to do it with Hugo's functions alone but failed (logically possible, but syntactically difficult to implement).

So I put my hope in JavaScript, but until now my JS work has consisted of finding someone else's code and modifying it. This problem is brand new to me and I don't know where to start (to be frank, I am a layman in JS).

The following is my implementation in Python.

from bs4 import BeautifulSoup, Tag
from typing import Union
from pathlib import Path
import json


def main():
    data = """<nav id="TableOfContents">
      <ul>
        <li><a href="#js">JS</a>
          <ul>
            <li><a href="#h3-1">H3-1</a></li>
            <li><a href="#h3-2">H3-2</a></li>
          </ul>
        </li>
        <li><a href="#python">Python</a>
          <ul>
            <li><a href="#h3-1-1">H3-1</a></li>
            <li><a href="#h3-2-1">H3-2</a></li>
          </ul>
        </li>
      </ul>
    </nav>"""
    soup = BeautifulSoup(data, "lxml")

    sub_list = []
    cur_level = 0
    dict_tree = dict(t='root', d=cur_level, v='', c=sub_list)
    root_ul: Tag = soup.find('ul')
    toc2json(root_ul, cur_level + 1, sub_list)  # <-- core
    toc_json: str = json.dumps(dict_tree)
    output_file = Path('test.temp.html')
    create_markmap_html(toc_json, output_file)


def toc2json(tag: Tag, cur_level: int, sub_list):
    """Append one node per <li> child of `tag` to `sub_list`, recursing into nested <ul>."""
    for li in tag:
        if not isinstance(li, Tag):
            continue  # skip whitespace text nodes between the <li> elements
        a: Tag = li.find('a')
        ul: Union[Tag, None] = li.find('ul')
        if ul:
            new_sub_list = []
            toc2json(ul, cur_level + 1, new_sub_list)
            cur_obj = dict(t='h', d=cur_level, v=str(a), c=new_sub_list)
        else:
            cur_obj = dict(t='h', d=cur_level, v=str(a))
        sub_list.append(cur_obj)


def create_markmap_html(json_string: str, output_file: Path):
    import jinja2
    markmap_template = jinja2.Template("""
    <!DOCTYPE html>
    <html lang=en>
    <head>
      <script src="https://d3js.org/d3.v6.min.js"></script>
      <script src="https://cdn.jsdelivr.net/npm/markmap-view@0.2.0"></script>
      <style>
      .mindmap {
        width: 100vw;
        height: 100vh;
      }
      </style>
    </head>
    <body>
        <svg id="mindmap-test" class="mindmap"></svg>
        <script>
          // Assumption: markmap-view 0.2.0 exposes window.markmap.Markmap.create(svg, options, data).
          ((e, json_data)=>{
            const {Markmap} = e();
            window.mm = Markmap.create('svg#mindmap-test', null, json_data);
          })(
          ()=>window.markmap,
          {{ input_json }}
          );
        </script>
    </body>
    </html>
    """)

    html_string = markmap_template.render(dict(input_json=json_string))

    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(html_string)


if __name__ == '__main__':
    main()
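
The recursion in the Python version above carries over almost directly to the browser DOM. The following is a minimal sketch, not a definitive implementation: `tocToNodes` and `tocToJson` are hypothetical names, `my_func` is the name from the skeleton in the question, and the code assumes each `<li>` holds its `<a>` and optional nested `<ul>` as direct children, as in the Hugo output shown earlier. `DOMParser` is a browser API, so `my_func` itself only runs in a browser.

```javascript
// Recursively convert the <li> children of a <ul> element into markmap-style nodes.
// Only `children`, `tagName` and `outerHTML` are read, so plain objects with those
// properties can stand in for DOM elements outside a browser.
function tocToNodes(ul, depth) {
  const nodes = [];
  for (const li of Array.from(ul.children)) {
    if (li.tagName !== 'LI') continue;
    const kids = Array.from(li.children);
    const a = kids.find((el) => el.tagName === 'A');
    const subUl = kids.find((el) => el.tagName === 'UL');
    const node = { t: 'h', d: depth, v: a ? a.outerHTML : '' };
    if (subUl) node.c = tocToNodes(subUl, depth + 1); // leaf nodes get no "c" key
    nodes.push(node);
  }
  return nodes;
}

// Wrap the converted list in the root node and serialize it.
function tocToJson(rootUl) {
  const tree = { t: 'root', d: 0, v: '', c: rootUl ? tocToNodes(rootUl, 1) : [] };
  return JSON.stringify(tree);
}

// Browser-only entry point matching the skeleton in the question.
function my_func(text) {
  const doc = new DOMParser().parseFromString(text, 'text/html');
  return tocToJson(doc.querySelector('ul'));
}
```

On a Hugo page the string round trip may be unnecessary: assuming the TOC is already rendered into the page, `tocToJson(document.querySelector('#TableOfContents ul'))` would convert it in place.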

For the sake of this little effort, please help me out.

Answer, posted 2021-01-07 18:00:33

I will be happy to continue improving the code if necessary :)

class Parser {
  constructor(htmlExpr) {
    this.result = {t: 'root', d: 0, v: "", c: []};
    this.data = htmlExpr.split('\n').map(row => row.trim()) // one tag per line is assumed
    this.open = RegExp('^<(?<main>[^\/>]*)>')           // some open tag
    this.close = RegExp('^<\/(?<main>[^>]*)>')          // some close tag
    this.target = RegExp('(?<main><a[^>]*>[^<]*<\/a>)') // what we are looking for
  }

  test(str) {
    let [o, c, v] = [ // all matches
      this.open.exec(str),
      this.close.exec(str),
      this.target.exec(str)
    ];
    // extract tagNames and value
    if (o) o = o.groups.main
    if (c) c = c.groups.main
    if (v) v = v.groups.main
    return [o, c, v];
  }

  parse() {
    const parents = [];         // stack of child arrays, one per open <ul>
    let level = 0;
    let lastNode = this.result;
    let length = this.data.length;
    for (let i = 0; i < length; i++) {
      const [o, c, v] = this.test(this.data[i])
      if (o === 'ul') {
        level += 1;
        lastNode.c = [];        // the previous node gets a child list
        parents.push(lastNode.c);
      } else if (c === 'ul') {
        level -= 1;
        parents.pop();
      }
      if (v) {
        lastNode = {t: 'h', d: level, v}; // template
        parents[level - 1].push(lastNode) // insert to result
      }
    }
    return this.result;
  }
}

let htmlExpr = `<nav id="TableOfContents">
    <ul>
        <li><a href="#js">JS</a>
            <ul>
                <li><a href="#h3-1">H3-1</a></li>
                <li><a href="#h3-2">H3-2</a></li>
            </ul>
        </li>
        <li><a href="#python">Python</a>
            <ul>
                <li><a href="#h3-1-1">H3-1</a></li>
                <li><a href="#h3-2-1">H3-2</a></li>
            </ul>
        </li>
    </ul>
</nav>`;
const parser = new Parser(htmlExpr);
const test = parser.parse();
console.log(JSON.stringify(test, null, 2))
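
The line-based parser above expects every tag to sit on its own line. As an alternative, here is a sketch that scans the markup for `<ul>`, `<li>` and `<a>` tags with a single regular expression, so line breaks do not matter. `tocHtmlToTree` and `TOC_TOKEN` are hypothetical names, and the approach assumes well-formed TOC markup with no `>` inside attribute values; for arbitrary HTML a real parser is the safer choice.

```javascript
// Matches the only pieces that matter for a TOC:
// <li ...>, <ul ...>, </ul>, and whole <a ...>...</a> elements.
const TOC_TOKEN = /<li\b[^>]*>|<ul\b[^>]*>|<\/ul\s*>|<a\b[^>]*>[\s\S]*?<\/a\s*>/gi;

function tocHtmlToTree(html) {
  const root = { t: 'root', d: 0, v: '', c: [] };
  const parents = []; // nodes whose <ul> is currently open
  let last = root;    // node that a following <ul> or <a> belongs to

  for (const token of html.match(TOC_TOKEN) || []) {
    const tag = token.toLowerCase();
    if (tag.startsWith('</ul')) {
      // Leaving a list: we are back inside the node that owns it.
      if (parents.length > 0) last = parents.pop();
    } else if (tag.startsWith('<ul')) {
      // Entering a list: the most recent node becomes the parent.
      last.c = last.c || [];
      parents.push(last);
    } else if (tag.startsWith('<li')) {
      if (parents.length === 0) continue; // <li> outside any <ul>: ignore it
      const owner = parents[parents.length - 1];
      last = { t: 'h', d: parents.length, v: '' };
      owner.c.push(last);
    } else if (last !== root && last.v === '') {
      last.v = token; // first <a> element inside the current <li>
    }
  }
  return root;
}

// Example call; the markup is deliberately written on a single line.
const exampleHtml =
  '<nav id="TableOfContents"><ul><li><a href="#js">JS</a><ul>' +
  '<li><a href="#h3-1">H3-1</a></li></ul></li></ul></nav>';
console.log(JSON.stringify(tocHtmlToTree(exampleHtml), null, 2));
```

Creating a node on each `<li>` rather than on each `<a>` means a list item without a link (which can appear when heading levels are skipped) still gets a node, with an empty `v`, to carry its nested list.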