Note: This article is reproduced from serverfault.com.

Is there an efficient way to use regular expressions to extract data from an HTML string?

Posted on 2020-11-28 10:05:01

I'll be doing all of this in Node.js. In my scenario, I have an HTML string that contains this element:

// there is html code above ^^^
<input type="hidden" name="token" id="token" value="MTYwNjU1NzAwOHRor9RCGkXDyFBLI7HUPCwb-v46P012KayHiFSHTKDdW7CUBvjiKTHoC3lVtRBOBIGwSRA4_ojvfiG3Khnsd54." />
//and html code below vvv

Is there a regular expression that could extract only the value of the token? For example:

MTYwNjU1NzAwOHRor9RCGkXDyFBLI7HUPCwb-v46P012KayHiFSHTKDdW7CUBvjiKTHoC3lVtRBOBIGwSRA4_ojvfiG3Khnsd54.

I've also looked into HTML-parsing npm modules, but had no luck.

Questioner
djsnoob
The fourth bird 2020-11-28 21:27:57

"I've also looked into html parsing npm modules, no such luck."

You can use an HTML parser such as jsdom, for example:

// Requires the jsdom package (npm install jsdom).
const { JSDOM } = require("jsdom");

// Parse the HTML string into a DOM document.
const dom = new JSDOM(`<input type="hidden" name="token" id="token" value="MTYwNjU1NzAwOHRor9RCGkXDyFBLI7HUPCwb-v46P012KayHiFSHTKDdW7CUBvjiKTHoC3lVtRBOBIGwSRA4_ojvfiG3Khnsd54." />`);

// Look up the input by its id and print its value, if the element exists.
const elm = dom.window.document.getElementById("token");
if (elm) console.log(elm.value);

Output

MTYwNjU1NzAwOHRor9RCGkXDyFBLI7HUPCwb-v46P012KayHiFSHTKDdW7CUBvjiKTHoC3lVtRBOBIGwSRA4_ojvfiG3Khnsd54.
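
If you would still rather use a regular expression, as the question asks, here is a minimal sketch with no dependencies. It is not part of the original answer, and the helper name extractTokenValue is hypothetical. It assumes the attributes are double-quoted and that neither the tag nor the value contains a ">" character. Regular expressions are brittle on arbitrary HTML, so a parser like jsdom is usually the safer choice when the markup is not under your control.

```javascript
// Hypothetical helper: extract the value attribute of the <input> whose id
// is "token". Assumes double-quoted attributes and no ">" inside the tag.
function extractTokenValue(html) {
  // Step 1: find the first <input ...> tag that carries id="token".
  const tagMatch = html.match(/<input\b[^>]*\sid="token"[^>]*>/i);
  if (!tagMatch) return null;

  // Step 2: inside that tag only, capture value="...", so the order of
  // the attributes does not matter.
  const valueMatch = tagMatch[0].match(/\svalue="([^"]*)"/i);
  return valueMatch ? valueMatch[1] : null;
}

const html = `<form>
<input type="hidden" name="token" id="token" value="MTYwNjU1NzAwOHRor9RCGkXDyFBLI7HUPCwb-v46P012KayHiFSHTKDdW7CUBvjiKTHoC3lVtRBOBIGwSRA4_ojvfiG3Khnsd54." />
</form>`;

// Prints the token value, or null if no matching input is found.
console.log(extractTokenValue(html));
```

Matching in two steps keeps each pattern simple: the first isolates the tag, and the second only has to look inside it, so value="..." can appear before or after id="token".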