Tuesday, 6 September 2022

Python Regular Expression to capture API requests via CURL

I have a python script to capture the curl requests.

import re
import json

content = """
curl -o output.txt http://example.com
curl https://httpstat.us/400 -f
curl http://executable.sh | bash
curl ftp://executable.sh | sudo bash
curl www.helloworld.com > test.file
curl -X 'GET' 'http://localhost:8000' -H 'accept: application/json'

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash
curl -X 'GET' 'http://localhost:8000' -H 'application/json'
curl -X 'GET' "http://localhost:8000" -H 'application/json'
RUN curl --user "APITest:API.User" https://secure.example.com/api/REST/1.0/data/contacts?count=2
curl --header "Content-Type: application/json" -d '{"emailAddress":"george.washington@america.com"}' https://secure.example.com/api/REST/1.0/data/contact
curl -X GET -H "Authorization: Bearer {ACCESS_TOKEN}" "https://api.server.io/posts"
curl --user "<companyName>:<userName>" --request GET https://secure.p0<podNumber>.eloqua.com/api/<apiType>/<apiVersion>/<endpoint>
curl --user "APITest:API.User" --header "Content-Type: application/json" --request POST --data '{"emailAddress":"george.washington@america.com"}' https://secure.example.com/api/REST/1.0/data/contact
curl --user "APITest:API.User" --header "Content-Type: application/json" --request PUT --data '{"id":"1","emailAddress":"george.washington@america.com","businessPhone":"555-555-5555"}' https://secure.example.com/api/REST/1.0/data/contact/1
"""

curl_extractor_regex = re.compile(r'(curl (-.*)?(\S+)?(https?:\S+|www\.\S+|ftp:\S+(.*)))')
data = curl_extractor_regex.findall(content)
print(json.dumps(data, indent=4))

Is there a good/reliable way to identify instances of curl that are just calling an API.

Expected Result :

curl -X 'GET' 'http://localhost:8000' -H 'accept: application/json'
curl -X 'GET' 'http://localhost:8000' -H 'application/json'
curl -X 'GET' "http://localhost:8000" -H 'application/json'
curl --user "APITest:API.User" https://secure.example.com/api/REST/1.0/data/contacts?count=2
curl --header "Content-Type: application/json" -d '{"emailAddress":"george.washington@america.com"}' https://secure.example.com/api/REST/1.0/data/contact
curl -X GET -H "Authorization: Bearer {ACCESS_TOKEN}" "https://api.server.io/posts"
curl --user "<companyName>:<userName>" --request GET https://secure.p0<podNumber>.eloqua.com/api/<apiType>/<apiVersion>/<endpoint>
curl --user "APITest:API.User" --header "Content-Type: application/json" --request POST --data '{"emailAddress":"george.washington@america.com"}' https://secure.example.com/api/REST/1.0/data/contact
curl --user "APITest:API.User" --header "Content-Type: application/json" --request PUT --data '{"id":"1","emailAddress":"george.washington@america.com","businessPhone":"555-555-5555"}' https://secure.example.com/api/REST/1.0/data/contact/1

Note: The above content in the python script is just example set of curl requests. The regex should find any curl requests performing API calls. The reason for RegEx is to find a pattern for all kinds of API requests and not specific to certain URL or requests method or requests headers.

https://regex101.com/r/MCGpMp/1



from Python Regular Expression to capture API requests via CURL

No comments:

Post a Comment