Tuesday 29 June 2021

Access nested elements in HTMLRewriter - Cloudflare Workers

I have to access a nested element using HTMLRewriter in a Cloudflare worker.

Example

<div data-code="ABC">
   <div class="title">Title</div>
   <div class="price">9,99</div>
</div>
<div data-code="XYZ">
   <div class="title">Title</div>
</div>

I was thinking about using multiple .on() calls, but the pairing between results is lost because some .price elements are missing, so I cannot correctly merge the results from codeHandler and priceHandler.

await new HTMLRewriter().on("[data-code]", codeHandler)
                        .on(".price", priceHandler)
                        .transform(response).arrayBuffer()
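
One possible way to make the multiple .on() approach line up is to let all handlers write into one shared list instead of keeping separate result arrays. HTMLRewriter is a streaming parser, so handlers fire in document order; a text handler for .title or .price can therefore attach its text to the most recently opened [data-code] record, and a missing .price simply stays null. The following is a minimal sketch under those assumptions, not a definitive implementation: ProductCollector, fieldHandler and extractProducts are hypothetical names, and it assumes [data-code] blocks are not nested inside each other.

```javascript
// Hypothetical helper: one shared list that every handler writes into.
class ProductCollector {
  constructor() {
    this.values = []
  }

  // Handler for "[data-code]": start a new record for each matching element.
  codeHandler() {
    return {
      element: (element) => {
        this.values.push({
          code: element.getAttribute("data-code"),
          title: null,
          price: null, // stays null when the block has no .price child
        })
      },
    }
  }

  // Handler for a nested field: text can arrive in several chunks,
  // so append each chunk to the current (most recent) record.
  fieldHandler(field) {
    return {
      text: (chunk) => {
        const current = this.values[this.values.length - 1]
        if (!current) return // text seen before any [data-code] element
        current[field] = (current[field] ?? "") + chunk.text
      },
    }
  }
}

// Assumes the Workers runtime, where HTMLRewriter is a global.
async function extractProducts(response) {
  const collector = new ProductCollector()
  await new HTMLRewriter()
    .on("[data-code]", collector.codeHandler())
    .on("[data-code] .title", collector.fieldHandler("title"))
    .on("[data-code] .price", collector.fieldHandler("price"))
    .transform(response)
    .arrayBuffer() // drain the body so every handler runs
  return collector.values
}
```

Because each record is created by the [data-code] handler and only filled in by the field handlers, no merging step is needed afterwards.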

I was thinking about iterating new HTMLRewriter() multiple times but the readable stream is locked.
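
The locked stream comes from the fact that a Response body can only be consumed once. If several passes really are needed, a copy can be taken with response.clone() before the first read, and each pass then gets its own body. The sketch below only demonstrates the cloning step with plain text reads; readTwice is a hypothetical name, and in the Worker each copy would be passed to its own new HTMLRewriter().transform(...) instead.

```javascript
// Hypothetical helper: read the same body twice by cloning first.
async function readTwice(response) {
  // clone() must be called before the original body is consumed.
  const copy = response.clone()
  const first = await response.text()
  const second = await copy.text()
  return [first, second]
}
```

Cloning makes the runtime hold on to the body for the second reader, so a single pass is usually preferable when one is possible.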

Current code

Worker

class codeHandler {
    constructor() {
        this.values = []
    }

    element(element) {
        let data = {
            code: element.getAttribute("data-code"),
            title: element.querySelector(".title").innerText, // <-- pseudocode: HTMLRewriter elements have no querySelector
            price: element.querySelector(".price").innerText, // <-- HERE
        }
        this.values.push( data )
    }
}


const url = "https://www.example.com"

async function handleRequest() {
    const response = await fetch(url)

    // Use a distinct name for the instance: `const codeHandler = new codeHandler()`
    // throws a ReferenceError because the const shadows the class.
    const handler = new codeHandler()
    await new HTMLRewriter().on("[data-code]", handler).transform(response).arrayBuffer()

    console.log(handler.values)

    const json = JSON.stringify(handler.values, null, 2)

    return new Response(json, {
        headers: {
            "content-type": "application/json;charset=UTF-8"
        }
    })
}

addEventListener("fetch", event => {
  return event.respondWith(handleRequest())
})


from Access nested elements in HTMLRewriter - Cloudflare Workers
