Wednesday, 11 October 2023

The script produces inconsistent results compared to what the site displays

I've written a Python script that uses the requests module to scrape the links to the properties a webpage lists after a few options are chosen from two dropdowns.

For the first filter, I filled out the search form, entering Kelowna General Hospital, Kelowna, BC, Canada as the search keyword in the first input box.

The second filter is located at the top right corner of the webpage. Clicking it displays several groups of filter options. I chose only house under Property type, and pool and king-size-bed under Amenities.

The script runs without errors. However, the number of results it produces is inconsistent: sometimes it returns 33, 36, or 39 links, whereas the site displays 42. I've hardcoded the value of sha256Hash in the payload. Knowing how to obtain that value is optional, but it would be great to learn how.
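On the sha256Hash: in the persisted-query convention used by Apollo-style GraphQL clients, this field is the SHA-256 hex digest of the query text the server has on record. Whether this site follows that convention is an assumption, and the query text itself is not part of the payload, so the sketch below only shows how such a digest is computed. The names `persisted_query_hash` and `example_query` are illustrative and do not come from the site.

```python
import hashlib


def persisted_query_hash(query_text: str) -> str:
    """Return the SHA-256 hex digest of a GraphQL query string."""
    return hashlib.sha256(query_text.encode("utf-8")).hexdigest()


# Hypothetical query text, used only to show the shape of the result.
example_query = "query Example { __typename }"
digest = persisted_query_hash(example_query)

# A SHA-256 hex digest is always 64 characters, the same length as the
# hardcoded sha256Hash value.
print(len(digest))
```

If the convention does apply, the digest of the real query text would have to match the hardcoded value exactly; any difference in whitespace in the query changes the hash.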

import json
import requests

base = 'https://www.airbnb.ca/rooms/{}'
main_link = 'https://www.airbnb.ca/api/v3/StaysSearch?operationName=StaysSearch&locale=en-CA&currency=CAD'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    'Origin': 'https://www.airbnb.ca',
    'Referer': 'https://www.airbnb.ca/',
    'Accept': '*/*',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'X-Airbnb-Api-Key': 'd306zoyjsyarp7ifhu67rjxn52tv0t20'
}

# Treatment flags sent with both request objects.
treatment_flags = [
    "decompose_stays_search_m2_treatment",
    "decompose_stays_search_m3_treatment",
    "decompose_stays_search_m3_5_treatment",
    "flex_destinations_june_2021_launch_web_treatment",
    "new_filter_bar_v2_fm_header",
    "flexible_dates_12_month_lead_time",
    "lazy_load_flex_search_map_compact",
    "lazy_load_flex_search_map_wide",
    "im_flexible_may_2022_treatment",
    "search_add_category_bar_ui_ranking_web",
    "feed_map_decouple_m11_treatment",
    "decompose_filters_treatment",
    "homepage_static_seo_v2_flag",
    "flexible_dates_options_extend_one_three_seven_days",
    "super_date_flexibility",
    "micro_flex_improvements",
    "micro_flex_show_by_default",
    "search_input_placeholder_phrases",
    "pets_fee_treatment",
]

# Search filters sent with both request objects.
raw_params = [
    {"filterName": "adults", "filterValues": ["1"]},
    {"filterName": "amenities", "filterValues": ["7", "1000"]},
    {"filterName": "cdnCacheSafe", "filterValues": ["false"]},
    {"filterName": "channel", "filterValues": ["EXPLORE"]},
    {"filterName": "checkin", "filterValues": ["2023-10-20"]},
    {"filterName": "checkout", "filterValues": ["2023-10-25"]},
    {"filterName": "datePickerType", "filterValues": ["calendar"]},
    {"filterName": "flexibleTripLengths", "filterValues": ["one_week"]},
    {"filterName": "itemsPerGrid", "filterValues": ["18"]},
    {"filterName": "l2PropertyTypeIds", "filterValues": ["1"]},
    {"filterName": "monthlyLength", "filterValues": ["3"]},
    {"filterName": "monthlyStartDate", "filterValues": ["2023-11-01"]},
    {"filterName": "placeId", "filterValues": ["ChIJFfE6YlKLfVMR5bnoPuVUsb4"]},
    {"filterName": "priceFilterInputType", "filterValues": ["0"]},
    {"filterName": "priceFilterNumNights", "filterValues": ["5"]},
    {"filterName": "query", "filterValues": ["Kelowna General Hospital, Kelowna, BC, Canada"]},
    {"filterName": "refinementPaths", "filterValues": ["/homes"]},
    {"filterName": "screenSize", "filterValues": ["large"]},
    {"filterName": "tabId", "filterValues": ["home_tab"]},
    {"filterName": "version", "filterValues": ["1.8.3"]},
]


def build_search_request():
    # staysSearchRequest and staysMapSearchRequestV2 carry identical fields.
    # A fresh dict is returned on every call, so adding a cursor to one
    # request does not touch the other.
    return {
        "requestedPageType": "STAYS_SEARCH",
        "metadataOnly": False,
        "searchType": "filter_change",
        "treatmentFlags": treatment_flags,
        "rawParams": raw_params,
    }


payload = {
    "operationName": "StaysSearch",
    "variables": {
        "staysSearchRequest": build_search_request(),
        "staysMapSearchRequestV2": build_search_request(),
        "feedMapDecoupleEnabled": True,
        "decomposeCleanupEnabled": False,
        "isLeanTreatment": False,
    },
    "extensions": {
        "persistedQuery": {
            "version": 1,
            "sha256Hash": "96cad171e4c04dd2c011a961e8affc17653f950ea248532a8fe3be44d4f5e47c",
        }
    },
}

unique_links = []

while True:
    res = requests.post(main_link, json=payload, headers=headers)
    # Parse the response once and reuse it for both the results and the cursor.
    section_data = res.json()['data']['presentation']['explore']['sections']['sectionIndependentData']

    # Listings are read from the map search results.
    container = section_data['staysMapSearch']['mapSearchResults']
    for item in container:
        name = item['listing']['name']
        inner_link = base.format(item['listing']['id'])
        # Print and store each link only the first time it appears.
        if inner_link not in unique_links:
            print(name, inner_link)
            unique_links.append(inner_link)

    # The cursor for the next page comes from the staysSearch section.
    try:
        next_page_cursor = section_data['staysSearch']['paginationInfo']['nextPageCursor']
    except (KeyError, TypeError):
        next_page_cursor = ""
    if not next_page_cursor:
        break
    # Only staysSearchRequest receives the cursor; staysMapSearchRequestV2 is left unchanged.
    payload['variables']['staysSearchRequest']['cursor'] = next_page_cursor
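The de-duplication step in the loop can also be written as a pure function and checked without touching the network. This is a minimal sketch: it assumes each page is a list of items shaped like the entries of mapSearchResults, and both the function name `collect_unique_links` and the sample data are made up for illustration.

```python
BASE = 'https://www.airbnb.ca/rooms/{}'


def collect_unique_links(pages):
    """Return (name, link) pairs in first-seen order, one per listing id."""
    seen = set()
    results = []
    for page in pages:
        for item in page:
            listing = item['listing']
            # Skip listings already collected from an earlier page.
            if listing['id'] in seen:
                continue
            seen.add(listing['id'])
            results.append((listing['name'], BASE.format(listing['id'])))
    return results


# Made-up sample pages; listing '2' appears on both pages.
sample_pages = [
    [{'listing': {'id': '1', 'name': 'A'}}, {'listing': {'id': '2', 'name': 'B'}}],
    [{'listing': {'id': '2', 'name': 'B'}}, {'listing': {'id': '3', 'name': 'C'}}],
]
links = collect_unique_links(sample_pages)
print(len(links))  # 3
```

Tracking ids in a set keeps the membership check cheap and makes the final count independent of how many times a listing is repeated across pages.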

How can I get the exact same results as the website?
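One way to narrow down where the counts diverge is to save the listing ids from two separate runs and compare them as sets. The sketch below assumes each run's ids have already been collected into a list; `compare_runs` and the sample ids are hypothetical.

```python
def compare_runs(run_a, run_b):
    """Summarize which listing ids differ between two runs."""
    a, b = set(run_a), set(run_b)
    return {
        'only_in_a': sorted(a - b),
        'only_in_b': sorted(b - a),
        'common': len(a & b),
    }


# Made-up ids standing in for two runs of the script.
first_run = ['101', '102', '103']
second_run = ['102', '103', '104', '105']
diff = compare_runs(first_run, second_run)
print(diff)
```

If the ids that differ change from run to run, the variation is coming from the responses themselves and not from the de-duplication logic.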


