Monday 4 January 2021

How Can I Scrape Twitter Now That They Require JavaScript?

I have a couple of sites that monitor Twitter for specific types of statements and scrape relevant Tweets using curl in PHP. A few days ago those sites stopped scraping Twitter. I assumed Twitter had redesigned the layout of the mobile.twitter.com site and that all I would have to do was point my XPath query at a different class or something similar. Instead I found that whenever you visit Twitter without JavaScript enabled, you are shown a prompt asking you to enable JavaScript, and I have found no way around it. Before this change, Twitter served a version of the site that did not require JavaScript, so I could scrape Tweets with a simple curl request and an XPath query.
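To make the setup concrete, here is a minimal sketch of the fetch-and-extract approach described above, and of how the new JavaScript prompt might be detected. It is written in Python with only the standard library rather than in the PHP the sites actually use, and it swaps the XPath query for a small `HTMLParser` subclass. The class name `tweet-text`, the helper names `parse_page` and `fetch`, and the "no Tweets found plus a `<noscript>` element" heuristic are all assumptions made for illustration; the real markup is not shown in this post.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Hypothetical class name: the actual markup of the old mobile site is not given.
TWEET_CLASS = "tweet-text"

# Void elements have no closing tag, so they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TweetExtractor(HTMLParser):
    """Collects the text of every element whose class list contains TWEET_CLASS."""

    def __init__(self):
        super().__init__()
        self.tweets = []
        self.has_noscript = False
        self._depth = 0      # greater than 0 while inside a tweet element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self.has_noscript = True
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif TWEET_CLASS in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace into single spaces.
            self.tweets.append(" ".join("".join(self._buffer).split()))

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def parse_page(html):
    """Return (tweets, blocked). 'blocked' is a rough guess, not a reliable test."""
    parser = TweetExtractor()
    parser.feed(html)
    parser.close()
    blocked = not parser.tweets and parser.has_noscript
    return parser.tweets, blocked


def fetch(url):
    """Plain HTTP GET, comparable to a basic curl request. No JavaScript is executed."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

With this sketch, `parse_page(fetch(url))` would be expected to return a list of Tweet texts for the old JavaScript-free pages, and an empty list with `blocked` set to `True` for a response that contains only the enable-JavaScript prompt. Like curl, `urlopen` only downloads the response body; nothing in it runs the scripts that the page relies on.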

I have searched Google for ways to enable JavaScript support for a curl request but have found nothing. Is it possible to add something to a curl request so that it parses JavaScript, or do I need to find some other solution?



