Hemant Vishwakarma: Max out CPU with parallel API calls to local endpoint using Python

Monday, 29 November 2021

Max out CPU with parallel API calls to local endpoint using Python

Python 3.10

I've been tasked with proving I can max out the CPU of my Mac laptop (10 cores) by calling a local API endpoint which runs in a Java VM, as well as "measure and record throughput," all using Python. For parallelization, I have researched and decided to go with asyncio per this answer: https://stackoverflow.com/a/59385935/7191927

I plan to use htop to show all cores maxed out, so I think I have that part covered. Where I'm getting tripped up is what I actually need to do in the code to max out the CPU.

This is what I have so far. The code calls two local API endpoints (each of which just processes blocks of text and extracts relevant terms):

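import asyncio
import functools
from api import API, DocumentParameters, EndException


def background(f):
    """Run the blocking function f in the event loop's default thread pool."""
    @functools.wraps(f)
    def wrapped(*args, **kwargs):
        # run_in_executor() takes no keyword arguments, so bind them with partial.
        call = functools.partial(f, *args, **kwargs)
        return asyncio.get_event_loop().run_in_executor(None, call)
    return wrapped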

@background
def get_word_results(data):
    api = API(url='http://localhost:8181/rest/words')
    params = DocumentParameters()
    params["content"] = data
    try:
        result = api.words(params)
    except EndException as exception:
        # Report the failure and return None rather than an unbound name.
        print(exception)
        result = None
    return result

@background
def get_language_results(data):
    api = API(url='http://localhost:8181/rest/languages')
    params = DocumentParameters()
    # The original referenced an undefined name (language_text_data) here.
    params["content"] = data
    try:
        result = api.language(params)
    except EndException as exception:
        # Report the failure and return None rather than an unbound name.
        print(exception)
        result = None
    return result

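async def main():
    filepath = "/Users/me/stuff.txt"
    with open(filepath, 'r') as file:
        data = file.read()
    # Await both futures so the calls finish before 'Done.' is printed.
    await asyncio.gather(get_word_results(data), get_language_results(data))
    print('Done.')

if __name__ == '__main__':
    asyncio.run(main())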

This is where my Python knowledge/experience begins to wane.

So what would be the most efficient way to:

Run this code continuously and at increasing thread counts in an attempt to max out the CPU.
Measure and record throughput, as per the requirement.
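
A minimal sketch of what those two points could look like, under stated assumptions. Because the CPU-heavy work happens in the Java VM, client threads that block on HTTP may be enough to generate the load, since Python releases the GIL during blocking I/O. Here `call` is a hypothetical zero-argument function standing in for one request (for example, a wrapper around `api.words(params)`); `measure_throughput`, `ramp`, the worker counts, and the duration are illustrative names and values, not part of the code above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_throughput(call, workers, duration=5.0):
    """Call `call` in a loop on `workers` threads for about `duration` seconds.

    Returns (completed_calls, calls_per_second).
    """
    start = time.perf_counter()
    deadline = start + duration

    def worker():
        # Each thread counts how many calls it completes before the deadline.
        count = 0
        while time.perf_counter() < deadline:
            call()
            count += 1
        return count

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker) for _ in range(workers)]
        total = sum(f.result() for f in futures)
    elapsed = time.perf_counter() - start
    return total, total / elapsed


def ramp(call, worker_counts=(1, 2, 4, 8, 16), duration=5.0):
    """Measure throughput at each thread count and return the recorded rows."""
    rows = []
    for workers in worker_counts:
        total, rate = measure_throughput(call, workers, duration)
        rows.append((workers, total, rate))
        print(f"{workers:>3} threads: {total} calls, {rate:.1f} calls/s")
    return rows
```

Watching htop while `ramp` runs should show whether the cores saturate; the thread count at which calls per second stops rising is a reasonable candidate for the point where the server is maxed out.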

EDIT 1 - Bounty started. I need a solid solution for this: one that maxes out the CPU and produces some kind of output that shows this, as well as how many calls are being made to cause the max. Based on what Mr Miyagi says in the comments, it sounds like multiprocessing is what I want, either instead of or in tandem with asyncio. The winner will achieve this with the fewest lines of code.
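
For the multiprocessing route, a rough sketch using the standard library's `ProcessPoolExecutor` is shown here. This is an assumption about what such a solution might look like, not a tested answer against the real endpoint: `placeholder_call` is a hypothetical stand-in for one API request and only sleeps, and `worker` and `run_processes` are illustrative names.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def placeholder_call():
    """Hypothetical stand-in for one API request; replace with the real call."""
    time.sleep(0.001)


def worker(duration):
    """Issue calls in a loop for about `duration` seconds; return the count."""
    deadline = time.perf_counter() + duration
    count = 0
    while time.perf_counter() < deadline:
        placeholder_call()
        count += 1
    return count


def run_processes(processes, duration):
    """Run `worker` on up to `processes` worker processes at once.

    Returns (completed_calls, calls_per_second).
    """
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=processes) as pool:
        counts = list(pool.map(worker, [duration] * processes))
    elapsed = time.perf_counter() - start
    total = sum(counts)
    return total, total / elapsed
```

On macOS, worker processes are started with the spawn method by default, so `run_processes` should be called from under an `if __name__ == '__main__':` guard, for example in a loop over process counts such as 1, 2, 4, and 8 that prints the total and the rate for each.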
