Tuesday 26 October 2021

Parse XML and remove the width attribute from the img tag

I am reading an XML file and parsing it. I want to remove the width attribute from every img element that is part of the XML document.

How do I parse this file, find each img tag, remove its width attribute, and return the updated markup?

The following is a sample of the XML. The img tag I want to change is inside the CDATA content of the item's description element.

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
>

  <channel>
<title></title>
<atom:link href="" rel="self" type="application/rss+xml" />
<link></link>
<description>Award-Winning Impact Media - Alt Protein &#38; 
    Sustainability Breaking News</description>
<lastBuildDate>Sun, 24 Oct 2021 19:27:36 +0000</lastBuildDate>
<language>en-US</language>
<sy:updatePeriod>
hourly  </sy:updatePeriod>
<sy:updateFrequency>
1   </sy:updateFrequency>

<item>
    <title>Vegan .</title>
    <link>https://www.google.com</link>
    
    <dc:creator><![CDATA[Sally Ho]]></dc:creator>
    <pubDate>Mon, 25 Oct 2021 00:00:00 +0000</pubDate>
            <category><![CDATA[Alt Protein]]></category>
    <category><![CDATA[Seafood]]></category>
    <category><![CDATA[Vegan]]></category>
    <category><![CDATA[alternative seafood]]></category>
    <category><![CDATA[plant based tuna]]></category>
    <category><![CDATA[vegan seafood]]></category>
    <category><![CDATA[vegan tuna]]></category>
    <guid isPermaLink="false">https://www.google.com/?p=55401</guid>

    <description><![CDATA[<div style="margin-bottom:20px;">
<img width="1024" height="768" src="" class="attachment-post-thumbnail size-post-thumbnail wp-post-image" alt="" srcset="" sizes="(max-width: 1024px) 100vw, 1024px" /></div>
<p><span class="rt-reading-time" style="display: block;"><span class="rt-label rt-prefix"></span> <span class="rt-time">4</span> <span class="rt-label rt-postfix">  Mins Read</span></span></p>
<p>The post <a rel="nofollow" href="https://www.greenqueen.com.hk/vegan-tuna-brands/"> <a rel="nofollow" href="">Green</a>.</p>
    ]]></description>
    </item>
  </channel>
</rss>
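
One possible approach, sketched in Python since the question does not name a language: parse the feed with the standard library's xml.etree.ElementTree, treat the text of each description element as an HTML string, and drop the width attribute from every img tag in that string. The function names strip_img_width and remove_width_from_feed are made up for this illustration. The HTML inside the CDATA is not guaranteed to be well formed (the sample has an unclosed a tag), so the sketch edits it with regular expressions instead of a second XML parse. It assumes no attribute value contains a ">" character. Also note that ElementTree writes CDATA sections back as escaped text, which is equivalent XML but looks different, and it may rename namespace prefixes unless they are registered first.

```python
import re
import xml.etree.ElementTree as ET

# Matches a whole <img ...> tag (assumes no ">" inside attribute values).
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)

# Matches one attribute at a time, including its quoted or unquoted value,
# so text such as "max-width" inside another attribute's value is never
# mistaken for the width attribute itself.
ATTRIBUTE = re.compile(
    r"""\s+([^\s=/>]+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+))?"""
)


def strip_img_width(html):
    """Return html with the width attribute removed from every img tag."""

    def drop_width(attr):
        # Keep every attribute except the one named "width".
        return "" if attr.group(1).lower() == "width" else attr.group(0)

    return IMG_TAG.sub(
        lambda tag: ATTRIBUTE.sub(drop_width, tag.group(0)), html
    )


def remove_width_from_feed(xml_text):
    """Parse the feed, clean each description, and return the updated XML."""
    root = ET.fromstring(xml_text)
    # RSS 2.0 description elements are not namespaced, so the bare tag works.
    for description in root.iter("description"):
        if description.text:
            description.text = strip_img_width(description.text)
    return ET.tostring(root, encoding="unicode")
```

Calling remove_width_from_feed on the text of the feed returns the updated XML as a string. The XML declaration must be the very first thing in the input, with no leading whitespace, or the parser will reject it. The height attribute and the "max-width" text inside sizes are left untouched.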

