Wednesday 30 November 2022

Link getting malformed while parsing email

I am fetching emails from the Gmail API in full format and decoding them to strings so that I can extract the relevant links. When I view the email on the official Gmail site and inspect the button whose redirect link I am interested in, the markup looks like this:

<p style="margin-top:0px;margin-bottom:20px;font-family:Arial"> 


<a href="http://delivery.ncp.flipkart.com/HCIPJTENG?id=88656=ehkDAAoOVwYCTFZTVwMNCQoAXQdTAFYBVAJVDgQFVgBfUAIHBQVXBAUCAANWAlxYUFRFAgAAA1EEUgQPBAcBDgpXUlJXBgQHDFEHUldSXVsBW1oDVAAHBQIJBEsFBFYODFACBAQDWAYBBQFUWkhQTUYTAxoaUABeW0cERU0cDVRJS1VcW0YKUkZEGgYMWRdxcSppf2FxK3UNWAVLQgE=&fl=URFHQAgZTl8aVlgME19ZS0ZNWlpYGxEdUV0IVF8dVW9NFVZzR0cCeVUFBAAMXFlzZRAOTX9XDFxReQZcdkcuQlE1cVwHYih5UGcMPA5iaVdtCFxeDF4xVXpYDnINBxNAUxZsaWR8A0BbX2sRIlYIch8HQGBTQiZwAUQrdgdKLUdQBll8BlRQcQFHGTIBTVpACy9zYFdtF2EGYztbVlNVZE5VRklmaRdCfgJYAQlzS09HVVtMfgQOeWgECQZbZFB1ABxJBH5bBgRsfX49IG5pDx8mUlN9TVUAd0I1D1VfLFkAEF5mfU5TSgd6GSkXWlZOHwVNBWZ5ImB2YxF2UFgGW3IQd3YFeRsDdVZtASd1SXNIF1IYW2VRbWdUEFsBdhlvWFNwd0QPGWxDYWNdPGdZc1gMSltjYhxAR0EgYGZdNXRvIwpdWAYHaVV7TApXTHNRYTJMGGNBEQZFcjYEZ0oiQUgBW3EDRwV+fEhNHDNtdHUKD0kDQAcLV0djL21WcAVmYzp1e0B8IlxHcgIIMVN9f2YpbkNTAjYDBwQqYUcFNgdeNVxcYhsAZH1FDT0QUm4BBVVLdmFbFV5KZzgBeVwAY00gckh+UAJceEAZOgxfekBLB29TBm4uVFV1C1J1CxoaSgQBcUFnKgdHCH83U18BUQsGDVlhTl1EbXIbZ25VEVhXDVBlW1MQVnBnTTE7QV4LXCFYdn91I1ZrBBlEekcODgoGQ0MLGxFgbH5+KjdVXm9TM2hMBUUABR8GBntDdAVoZhZpckdaJn11dkMiIVENfwIVSwdDBVV2fWlZcQRgGlNyB0FdfEQRYAZxegwyf1paZStrB1gGInheUzJ6a10hBgwpdUBBeg0LDA0=&ext=ZT10cnVl" style="background-color:rgb(41,121,251);font-family:Arial;color:#fff;border:0px;font-size:14px;display:inline-block;margin-top:0px;border-radius:2px;text-decoration:none;width:160px;line-height:32px;text-align:center" target="_blank" 
data-saferedirecturl="https://www.google.com/url?q=http://delivery.ncp.flipkart.com/HCIPJTENG?id%3D88656%3DehkDAAoOVwYCTFZTVwMNCQoAXQdTAFYBVAJVDgQFVgBfUAIHBQVXBAUCAANWAlxYUFRFAgAAA1EEUgQPBAcBDgpXUlJXBgQHDFEHUldSXVsBW1oDVAAHBQIJBEsFBFYODFACBAQDWAYBBQFUWkhQTUYTAxoaUABeW0cERU0cDVRJS1VcW0YKUkZEGgYMWRdxcSppf2FxK3UNWAVLQgE%3D%26fl%3DURFHQAgZTl8aVlgME19ZS0ZNWlpYGxEdUV0IVF8dVW9NFVZzR0cCeVUFBAAMXFlzZRAOTX9XDFxReQZcdkcuQlE1cVwHYih5UGcMPA5iaVdtCFxeDF4xVXpYDnINBxNAUxZsaWR8A0BbX2sRIlYIch8HQGBTQiZwAUQrdgdKLUdQBll8BlRQcQFHGTIBTVpACy9zYFdtF2EGYztbVlNVZE5VRklmaRdCfgJYAQlzS09HVVtMfgQOeWgECQZbZFB1ABxJBH5bBgRsfX49IG5pDx8mUlN9TVUAd0I1D1VfLFkAEF5mfU5TSgd6GSkXWlZOHwVNBWZ5ImB2YxF2UFgGW3IQd3YFeRsDdVZtASd1SXNIF1IYW2VRbWdUEFsBdhlvWFNwd0QPGWxDYWNdPGdZc1gMSltjYhxAR0EgYGZdNXRvIwpdWAYHaVV7TApXTHNRYTJMGGNBEQZFcjYEZ0oiQUgBW3EDRwV%2BfEhNHDNtdHUKD0kDQAcLV0djL21WcAVmYzp1e0B8IlxHcgIIMVN9f2YpbkNTAjYDBwQqYUcFNgdeNVxcYhsAZH1FDT0QUm4BBVVLdmFbFV5KZzgBeVwAY00gckh%2BUAJceEAZOgxfekBLB29TBm4uVFV1C1J1CxoaSgQBcUFnKgdHCH83U18BUQsGDVlhTl1EbXIbZ25VEVhXDVBlW1MQVnBnTTE7QV4LXCFYdn91I1ZrBBlEekcODgoGQ0MLGxFgbH5%2BKjdVXm9TM2hMBUUABR8GBntDdAVoZhZpckdaJn11dkMiIVENfwIVSwdDBVV2fWlZcQRgGlNyB0FdfEQRYAZxegwyf1paZStrB1gGInheUzJ6a10hBgwpdUBBeg0LDA0%3D%26ext%3DZT10cnVl&source=gmail&ust=1669470586360000&usg=AOvVaw2TGtpbQMv9Nx4CsQgYUmpZ">


</p>

But when I parse the email, the link is malformed. This is the link I get from the parsed email:

http://delivery.ncp.flipkart.com/HCIPJTENG?id=88656=ehkDAAoOVwYCTFdSUQZZXwZVCAJQUlIBBghXDgxQBlUBVwQDVwFXA1EAAl1RBF0OClRFAgAAA1EEUgQPBAcBDgpXUlJXBgQHDFEHUldSXVsBW1oDVAAHBQIJBEsFBFYODFACBAQDWAYBBQFUWkhQTUYTAxoaUABeW0cERU0cDVRJS1VcW0YKUkZEGgYMWRdxcSppf2FxK3UNWAVLQgE=&fl=URFHQAgZTl8aVlgME19ZS0ZNWlpYGxEdUV0IVF8dClhsAUF8fWAUB1ZRVhwlTFBDQisOYl5dOlhQew5UXVZWVHcLW2p+YQoFewJsLTVRe2gALm0AQWVdQ3FnAkMCWC1xaQB0dB9GE1BefFUyOVZBSXQmDldwbTQGamQOAUZoJwFxJFRJdm8EUWxjWzRWe012eyJ6ZHwBBn1lUDhFY2IWXVRXWUpeeSJpcXdcHS8ZfVcHAXVmUVlcSHxzKFpVcQVjDi9HX1oDGwtnXnpRBnFbSUo0SnB7XipXfEMLR1dHVhptBkleC2EFAENGbiATfWF7czN3d3cEXQcLfgpoeWgqVQpcQ0hfXExwUGoMCAFADHdhF3tHR2UwQ2FWKGNfdBVbQTQeYndyKGpQeV0cOnZ0e0BRaHtHZ1BDR2QYVQ1qWnR6FkdWWQMOcQBWTDIWA3VyBjtLY1FwEH94YxZlbmEzD08OfEVRcxR0UgF7KwFXbWZeLAteA202SGRWBl9VfQV+XVVCRnt1JEV6CV0GGmdBU3k2dwNyZRR/Y3IQTVF0EHRUEHFkWxswUUVJYAcSeVlzACZPd3BVUllGAlVNAgAoAU0VWAVibjIEXQh4URlwUGhHBwx7U1wCXVFTFXECHw==&ext=ZT10cnVl

I am using this logic to decode each body part of the email when fetching it in full format from the Gmail API:

import com.google.api.services.gmail.model.MessagePart

// Decodes a single body part to text.
// Only text/plain and text/html parts are decoded; any other part yields "".
private fun getTextFromBodyPart(bodyPart: MessagePart): String =
    when (bodyPart.mimeType) {
        // decodeData() returns the base64url-decoded bytes, or null when the part has no inline data
        "text/plain", "text/html" ->
            bodyPart.body?.decodeData()?.toString(Charsets.UTF_8) ?: ""
        else -> ""
    }

and this loop to concatenate the decoded parts:

val emailSize = email.payload.parts.size
var parsedEmail = " "

for (k in 0 until emailSize) {
    parsedEmail += getTextFromBodyPart(email.payload.parts[k])
}
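
The loop above only visits the top-level parts and appends the text/plain and text/html output to one string, so it is hard to tell which part a given link came from. As a diagnostic, here is a minimal sketch that walks nested parts and keeps the text of each MIME type separate. `Part` is a hypothetical stand-in for the Gmail client's `MessagePart` (so the sketch runs without the client library), and `collectText` and the sample strings are illustrative names, not part of my real code.

```kotlin
import java.util.Base64

// Hypothetical stand-in for MessagePart: a MIME type, an optional
// base64url-encoded body, and optional child parts.
data class Part(
    val mimeType: String,
    val data: String? = null,
    val parts: List<Part> = emptyList()
)

// Collects decoded text per MIME type, descending into multipart containers.
fun collectText(
    part: Part,
    out: MutableMap<String, StringBuilder> = mutableMapOf()
): Map<String, StringBuilder> {
    if (part.mimeType == "text/plain" || part.mimeType == "text/html") {
        // A part without inline data contributes nothing.
        val bytes = part.data?.let { Base64.getUrlDecoder().decode(it) } ?: ByteArray(0)
        out.getOrPut(part.mimeType) { StringBuilder() }
            .append(bytes.toString(Charsets.UTF_8))
    }
    part.parts.forEach { collectText(it, out) }
    return out
}

fun main() {
    fun enc(s: String) = Base64.getUrlEncoder().encodeToString(s.toByteArray())
    // Made-up message layout: multipart/mixed > multipart/alternative > plain + html.
    val message = Part(
        "multipart/mixed",
        parts = listOf(
            Part(
                "multipart/alternative",
                parts = listOf(
                    Part("text/plain", enc("plain link A")),
                    Part("text/html", enc("<a href=\"link B\">button</a>"))
                )
            )
        )
    )
    val byType = collectText(message)
    println(byType["text/plain"])
    println(byType["text/html"])
}
```

Keeping the two outputs apart makes it possible to check whether the text/plain part and the text/html part carry the same link.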

I then use a regex to extract the link from the decoded email. The regex is: ^http:\/\/delivery\..+?\.flipkart\.com\/([A-Za-z0-9\?\=&\/\\\+]+)$
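
To make the extraction step concrete, here is a minimal sketch of applying that pattern in Kotlin. It assumes the pattern is compiled with `RegexOption.MULTILINE` so that `^` and `$` anchor at line boundaries. `extractLinks` and the two sample strings are hypothetical, shortened stand-ins, not my real data.

```kotlin
// The pattern from the question, unchanged. MULTILINE is an assumption:
// without it, ^ and $ would only match at the start and end of the whole input.
val linkRegex = Regex(
    """^http:\/\/delivery\..+?\.flipkart\.com\/([A-Za-z0-9\?\=&\/\\\+]+)$""",
    RegexOption.MULTILINE
)

// Returns every link that occupies a whole line of the text.
fun extractLinks(text: String): List<String> =
    linkRegex.findAll(text).map { it.value }.toList()

fun main() {
    // Hypothetical plain-text body: the link sits alone on its own line.
    val plain = "Track your order:\n" +
        "http://delivery.ncp.flipkart.com/HCIPJTENG?id=1=abc=&ext=ZT10cnVl\n" +
        "Thanks"
    // Hypothetical HTML body: the link is inside an href attribute.
    val html = "<a href=\"http://delivery.ncp.flipkart.com/HCIPJTENG?id=2=xyz=&ext=ZT10cnVl\">Track</a>"

    println(extractLinks(plain))
    println(extractLinks(html))
}
```

Because of the `^` and `$` anchors, this pattern only matches a link that fills an entire line. In the sketch, the bare link in the plain-text sample matches, while the `href` inside the HTML sample does not.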

Also, when I analyse the decoded email, everything seems fine except the link. The link is also malformed when I click on view original for this email on the official Gmail site.

I cannot understand why the link is getting malformed.

I also tried fetching the emails in raw format using the Gmail API, but the link is again malformed after decoding.



