Tuesday, 9 April 2019

How to transform ordinary quotation marks to guillemets (French quotes), except inside tags

Let's say we have the following text:

<a href="link">some link</a> How to transform "ordinary quotes" to «Guillemets»

I need to transform it into

<a href="link">some link</a> How to transform «ordinary quotes» to «Guillemets»

using regex and Python.

I've tried:

import re

content = '<a href="link">some link</a> How to transform "ordinary quotes" to «Guillemets»'

# Raw strings keep \g<1> from being read as an invalid string escape.
res = re.sub(r'(?:"([^>]*)")(?!>)', r'«\g<1>»', content)

print(res)

but, as @Wiktor Stribiżew pointed out, this breaks when a tag has more than one attribute, so

<a href="link" target="_blank">some link</a> How to transform "ordinary quotes" to «Guillemets»

will be transformed to

<a href=«link" target=»_blank">some link</a> How to transform «ordinary quotes» to «Guillemets»
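One way to see why this happens is to list what the pattern actually matches. The following is only a small diagnostic sketch that reuses the pattern from the attempt above; the variable names are illustrative.

```python
import re

# Same pattern as in the attempt above, written as a raw string.
pattern = re.compile(r'(?:"([^>]*)")(?!>)')

content = '<a href="link" target="_blank">some link</a> How to transform "ordinary quotes" to «Guillemets»'

# Collect every span the pattern treats as a quoted phrase.
matches = [m.group(0) for m in pattern.finditer(content)]
for text in matches:
    print(text)
```

The first match runs from the opening quote of href to the opening quote of target. This is because [^>]* is free to cross a closing quote, and the lookahead only rejects a quote that is directly followed by >.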

Update

Please note that the text

  • can be HTML, e.g.:

<div><a href="link" target="_blank">some link</a> How to transform "ordinary quotes" to «Guillemets»</div>

  • may be plain text rather than HTML, e.g.:

How to transform "ordinary quotes" to «Guillemets»

  • may be plain text that includes some HTML tags, e.g.:

<a href="link" target="_blank">some link</a> How to transform "ordinary quotes" to «Guillemets»
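One possible direction that would cover all three cases, offered as a sketch rather than a definitive answer: match whole tags and quoted phrases with a single alternation, copy the tags through unchanged, and rewrite only the quoted phrases. The names TAG_OR_QUOTED and to_guillemets are made up for this illustration. The sketch assumes that no attribute value contains a > character; for arbitrary HTML a real parser would be safer.

```python
import re

# Hypothetical helper names; not taken from the original post.
# First alternative: a whole tag, consumed so its quotes are never examined.
# Second alternative: a quoted phrase outside any tag, captured in group 1.
TAG_OR_QUOTED = re.compile(r'<[^>]*>|"([^"<>]*)"')


def to_guillemets(text):
    """Replace "..." with «...» everywhere except inside <...> tags."""
    def repl(match):
        if match.group(1) is None:
            # The tag alternative matched: return it untouched.
            return match.group(0)
        return '«' + match.group(1) + '»'
    return TAG_OR_QUOTED.sub(repl, text)


samples = [
    '<div><a href="link" target="_blank">some link</a> How to transform "ordinary quotes" to «Guillemets»</div>',
    'How to transform "ordinary quotes" to «Guillemets»',
    '<a href="link" target="_blank">some link</a> How to transform "ordinary quotes" to «Guillemets»',
]
for sample in samples:
    print(to_guillemets(sample))
```

Because the tag alternative comes first and the regex engine scans left to right, any quote that sits inside a tag has already been consumed by the time the quoted-phrase alternative could apply.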



