I am trying to download a website as a static copy, that is, without JavaScript: only HTML and CSS.
I've tried several approaches, but there are still issues with the CSS and images.
Here is a snippet of my current attempt:
const puppeteer = require('puppeteer');
const { URL } = require('url');
const fse = require('fs-extra');
const path = require('path');

(async (urlToFetch) => {
  const browser = await puppeteer.launch({
    headless: true,
    slowMo: 100
  });
  const page = await browser.newPage();

  // Block every script request so that only HTML, CSS and other assets load.
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    if (request.resourceType() === 'script') {
      request.abort();
    } else {
      request.continue();
    }
  });

  // Save each response body under ./output, mirroring the URL path.
  page.on('response', async (response) => {
    try {
      const url = new URL(response.url());
      let filePath = path.resolve(`./output${url.pathname}`);
      // Paths without an extension are treated as directories.
      if (path.extname(url.pathname).trim() === '') {
        filePath = `${filePath}/index.html`;
      }
      await fse.outputFile(filePath, await response.buffer());
      console.log(`File ${filePath} was written successfully`);
    } catch (err) {
      // response.buffer() throws for redirects and other responses without a body.
      console.warn(`Skipped ${response.url()}: ${err.message}`);
    }
  });

  await page.goto(urlToFetch, {
    waitUntil: 'networkidle2'
  });

  // Keep the browser open for 4 minutes so late responses can still be written.
  setTimeout(async () => {
    await browser.close();
  }, 60000 * 4);
})('https://stackoverflow.com/');
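One thing I suspect about the response handler above: it builds the file path from `url.pathname` alone, so assets served from other hosts (a CDN, for example) are all written into the same `./output` tree and can overwrite each other. Below is a minimal sketch of a path mapping that keeps the hostname as the first directory. `urlToFilePath` is a hypothetical helper of my own, not part of Puppeteer, and the example URLs are made up. It ignores the query string, which is an assumption that may not hold for sites that serve different content per query.

```javascript
const path = require('path');
const { URL } = require('url');

// Hypothetical helper (not a Puppeteer API): map a resource URL to a local file path.
function urlToFilePath(resourceUrl, outputDir = './output') {
  const url = new URL(resourceUrl);
  let pathname = url.pathname;

  // Treat "/" and extensionless paths as directories that contain index.html.
  if (pathname.endsWith('/') || path.extname(pathname) === '') {
    pathname = path.posix.join(pathname, 'index.html');
  }

  // Use the hostname as the first directory so assets from different hosts
  // cannot collide. The query string is deliberately ignored (assumption).
  const segments = pathname.split('/').filter(Boolean);
  return path.join(outputDir, url.hostname, ...segments);
}

console.log(urlToFilePath('https://stackoverflow.com/'));
console.log(urlToFilePath('https://cdn.example.com/css/site.css?v=1'));
```

Inside the `response` handler, `filePath` could then be computed as `path.resolve(urlToFilePath(response.url()))`. Links inside the saved HTML and CSS would still point at the original URLs, so this only addresses where files land on disk.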
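I've also tried saving the rendered HTML with page.content() (inside the same async function):
const fs = require('fs');
const content = await page.content();
fs.writeFileSync('index.html', content, { encoding: 'utf-8' });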
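The HTML returned by `page.content()` still contains the `<script>` tags, and its relative links to CSS and images no longer resolve once the file is opened from disk. Here is a rough sketch of a post-processing step, assuming a regex pass is good enough for the page in question: it removes `<script>` elements and inserts a `<base href>` so relative URLs resolve against the original site. `makeStatic` is a hypothetical helper name, and a real HTML parser would be more robust than these regexes. Inline event handlers such as `onclick` are not touched.

```javascript
const { URL } = require('url');

// Hypothetical helper: strip <script> elements from serialized HTML and add a
// <base> tag so relative CSS/image URLs resolve against the original page URL.
function makeStatic(html, pageUrl) {
  // Crude regex removal of <script>...</script>; assumes well-formed markup.
  const withoutScripts = html.replace(/<script\b[\s\S]*?<\/script\s*>/gi, '');

  // Leave the document alone if it already declares a <base>.
  if (/<base\b/i.test(withoutScripts)) {
    return withoutScripts;
  }

  // Normalizing through URL percent-encodes characters such as double quotes.
  const href = new URL(pageUrl).href;

  // A replacer function avoids "$" in the URL being read as a replacement pattern.
  return withoutScripts.replace(/<head\b[^>]*>/i, (openTag) => `${openTag}<base href="${href}">`);
}

const sample = '<html><head><title>t</title><script src="a.js"></script></head>' +
  '<body><p>hi</p><script>alert(1)</script></body></html>';
console.log(makeStatic(sample, 'https://example.com/docs/'));
```

With the `<base>` tag the saved page still fetches its CSS and images from the original server, so it is not a fully offline copy. For that, the assets would have to be downloaded and the links rewritten to local paths.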
I have also tried downloading the page through a CDPSession.
And I've tried website-scraper-puppeteer.
So what is the best approach if I want to provide a website link and have it downloaded as a static website?
from Download website locally without Javascript using puppeteer