The Synchronous Problem

In a server-side architecture, independent tasks often run synchronously, one after another. This can slow down the overall process of the system and make it complex and hard to manage. In these situations, asyncio can give developers working in Python peace of mind: it lets them write asynchronous code in a readable, sequential-looking style with predictable outcomes.

In our campaign creation platform, a user can see the summary of an entire project, including targeting information and associated campaigns. Creating the summary page requires us to call two main methods, which generate summary data from Facebook and AdWords respectively. Each main method relies on a few helper methods to retrieve the data it needs. The two main tasks are independent of each other, so running them with asyncio can speed up the build process.

Asyncio Overview

Asyncio is a module introduced in Python 3.4 that provides a framework for writing asynchronous code around an event loop. The main components of asyncio are futures, coroutines, and the aforementioned loop. The event loop directs the execution of tasks and shifts control between them. To shift control between tasks, the await keyword (available since Python 3.5) is placed in front of a future or coroutine. Futures represent the eventual result of an asynchronous operation, similar to JavaScript promises. Finally, coroutines are functions that are similar to generators. The main difference is that generators are used for producing data, whereas coroutines are used for consuming data. It’s important to keep in mind that coroutines are not only used within asyncio, but can be used throughout Python code. Together, these three components let asyncio provide an asynchronous framework while still running on a single thread.
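To make the three components concrete, here is a minimal sketch in which one coroutine waits on a future while another coroutine resolves it, all on the same thread. The names fetch_value, set_later, and demo are hypothetical, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio
import threading


async def fetch_value(future):
    # Awaiting a future suspends this coroutine until the future has a result.
    value = await future
    return value, threading.get_ident()


async def set_later(future, value):
    # Yield to the event loop once, then resolve the future.
    await asyncio.sleep(0)
    future.set_result(value)
    return threading.get_ident()


async def demo():
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # The event loop switches between the two coroutines at each await.
    (value, consumer_thread), producer_thread = await asyncio.gather(
        fetch_value(future), set_later(future, 42)
    )
    return value, consumer_thread == producer_thread


value, same_thread = asyncio.run(demo())
print(value, same_thread)
```

Both coroutines report the same thread identifier, which is the point: the concurrency comes from the loop switching at await, not from extra threads.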

To start an asynchronous process, tasks need to be created and then an event loop must be run. Depending on the method used (for example, run_forever or run_until_complete), the event loop will run indefinitely or until certain conditions are met. The general programming steps to achieve this can be defined as:

  1. Get / Create the event loop responsible for running tasks

  2. Define the tasks that need to run

  3. Start the event loop so that it can complete tasks

  4. Shift the control of execution between tasks by using the keyword await

  5. Close the loop
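The five steps above can be sketched as follows. This is an illustration rather than production code: the say coroutine and its delays are made up for the example, and the loop is created explicitly with asyncio.new_event_loop so that the sketch does not depend on an implicit current loop.

```python
import asyncio


async def say(label, delay):
    # Step 4: await hands control back to the loop while this task waits.
    await asyncio.sleep(delay)
    return label


# Step 1: create the event loop responsible for running tasks.
loop = asyncio.new_event_loop()
try:
    # Step 2: define the tasks that need to run.
    tasks = [
        loop.create_task(say("first", 0.02)),
        loop.create_task(say("second", 0.01)),
    ]
    # Step 3: start the loop and let it run until every task is complete.
    results = loop.run_until_complete(asyncio.gather(*tasks))
finally:
    # Step 5: close the loop.
    loop.close()

print(results)
```

asyncio.gather returns results in the order the tasks were passed in, not the order in which they finished.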

The image illustrates an event loop that is in charge of two tasks. The bars represent the work being done over time and the arrows represent shifts of control. If a task never awaits anything during its execution, there will be no improvement in speed and the event loop will simply run the tasks one after another. To help stop this from happening, an entire logical sequence should be made asynchronous by using async / await. The async keyword in front of a function definition is what makes that function a coroutine. The await keyword suspends the coroutine and gives control back to the event loop so it can do work elsewhere.
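A small sketch makes the difference visible. The worker and run_pair names are hypothetical; each worker records its steps in a shared list, and the cooperative flag decides whether it ever awaits. asyncio.run assumes Python 3.7 or later.

```python
import asyncio


async def worker(name, log, cooperative):
    for step in range(2):
        log.append(f"{name}{step}")
        if cooperative:
            # Give control back to the event loop so the other task can run.
            await asyncio.sleep(0)


async def run_pair(cooperative):
    log = []
    await asyncio.gather(
        worker("A", log, cooperative),
        worker("B", log, cooperative),
    )
    return log


sequential_log = asyncio.run(run_pair(cooperative=False))
interleaved_log = asyncio.run(run_pair(cooperative=True))
print(sequential_log)   # ['A0', 'A1', 'B0', 'B1']
print(interleaved_log)  # ['A0', 'B0', 'A1', 'B1']
```

Without an await inside the loop body, task A runs to completion before task B starts, even though both are coroutines.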

While using these keywords, there are two problems to be aware of. The first is placing await in front of something that is not awaitable, such as a call to a regular function that returns neither a future nor a coroutine. Doing this raises a TypeError. The second is forgetting to place await in front of a call to a coroutine function. Forgetting await results in a RuntimeWarning, and the coroutine never runs.
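Both pitfalls can be reproduced in a few lines. In this sketch, plain_function, compute, and the two demo coroutines are hypothetical names. The second demo eventually awaits the coroutine object so that the example finishes cleanly instead of triggering the warning.

```python
import asyncio

calls = []


def plain_function():
    return 1


async def compute():
    calls.append("ran")
    return 2


async def await_non_awaitable():
    try:
        # plain_function() returns an int, which cannot be awaited.
        await plain_function()
    except TypeError as exc:
        return type(exc).__name__


async def forget_await():
    # Without await, this only creates a coroutine object; the body has not run.
    pending = compute()
    ran_before_await = list(calls)
    # Awaiting the coroutine object is what actually executes it.
    value = await pending
    return ran_before_await, value


error_name = asyncio.run(await_non_awaitable())
ran_before_await, value = asyncio.run(forget_await())
print(error_name, ran_before_await, value)
```

If pending were dropped without ever being awaited, Python would report that the coroutine was never awaited and compute would not run at all.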

Migrating to Asyncio

Here’s a simplified, synchronous version of the two methods Pixability runs to build a project summary, before any asyncio is applied:

def main_A():
     summary_A = {"A": {}}
     summary_A["A"]["1"] = sub_1()
     summary_A["A"]["2"] = sub_2()
     summary_A["A"]["3"] = sub_3()
     return summary_A

def main_B():
     summary_B = {"B": {}}
     summary_B["B"]["1"] = sub_1()
     summary_B["B"]["2"] = sub_2()
     summary_B["B"]["3"] = sub_3()
     return summary_B

def sub_1():
     return [{"1": i} for i in range(3)]

def sub_2():
     return [{"2": i} for i in range(3)]

def sub_3():
     return [{"3": i} for i in range(3)]

summary_A = main_A()
summary_B = main_B()
print([summary_A, summary_B])

To make this code asynchronous, the most important decision is what the event loop should manage. Since the overall process is driven by main_A and main_B, it makes sense to create these two as tasks. What’s often difficult is deciding what to do with the submethods. One option is to make the submethods coroutines and to await each of their results. However, this doesn’t really give us any advantage: because the submethods never await anything themselves, the event loop cedes control to sub_1, sub_2, or sub_3 and then returns it to main_A or main_B, which is effectively the same as building the summary synchronously. These submethods can be thought of as operations that block the loop. While it’s best practice to avoid blocking methods, that isn’t always possible. Fortunately, asyncio has a solution to help with blocking tasks. The code sample below shows how to solve this problem:

import asyncio
from concurrent.futures import ThreadPoolExecutor

async def main_A():
     executor = ThreadPoolExecutor(max_workers=3)

     summary_A = {"A": {}}
     summary_A["A"]["1"] = await event_loop.run_in_executor(executor, sub_1)
     summary_A["A"]["2"] = await event_loop.run_in_executor(executor, sub_2)
     summary_A["A"]["3"] = await event_loop.run_in_executor(executor, sub_3)
     return summary_A

async def main_B():
     executor = ThreadPoolExecutor(max_workers=3)

     summary_B = {"B": {}}
     summary_B["B"]["1"] = await event_loop.run_in_executor(executor, sub_1)
     summary_B["B"]["2"] = await event_loop.run_in_executor(executor, sub_2)
     summary_B["B"]["3"] = await event_loop.run_in_executor(executor, sub_3)
     return summary_B

def sub_1():
     return [{"1": i} for i in range(3)]

def sub_2():
     return [{"2": i} for i in range(3)]

def sub_3():
     return [{"3": i} for i in range(3)]

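# Create the loop explicitly; get_event_loop() without a current loop is deprecated in newer Python versions
event_loop = asyncio.new_event_loop()
task_A = event_loop.create_task(main_A())
task_B = event_loop.create_task(main_B())
# Combine both tasks so the loop runs until each one has finished
task_A_B = asyncio.gather(task_A, task_B)
event_loop.run_until_complete(task_A_B)
event_loop.close()
print(task_A_B.result())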

In the example above, main_A and main_B are made coroutines by adding the async keyword in front of the function definitions. Within main_A and main_B, the event loop hands each blocking call to a ThreadPoolExecutor, which runs it in a worker thread. The run_in_executor method returns a future, so awaiting this call shifts control to another task while the blocking call is computed in the background. The overall execution can be described as follows:

  1. Get or create the event loop

  2. Create task A and B

  3. Gather both task A and B into a single task

  4. Run the loop until the combined task_A_B future is complete (asyncio.gather is needed to ensure that the loop runs until both tasks are completed)

  5. Close the loop

  6. Print the results
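On Python 3.7 and later, the same flow can be written without managing the loop by hand. The sketch below is one possible variation, not the code Pixability runs: build_summary and main are hypothetical names, a single executor is shared between both summaries, and the three blocking helpers are submitted together with asyncio.gather instead of being awaited one at a time.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def sub_1():
    return [{"1": i} for i in range(3)]


def sub_2():
    return [{"2": i} for i in range(3)]


def sub_3():
    return [{"3": i} for i in range(3)]


async def build_summary(name, executor):
    loop = asyncio.get_running_loop()
    # Submit all three blocking helpers at once, then wait for all of them.
    parts = await asyncio.gather(
        loop.run_in_executor(executor, sub_1),
        loop.run_in_executor(executor, sub_2),
        loop.run_in_executor(executor, sub_3),
    )
    return {name: {str(i): part for i, part in enumerate(parts, start=1)}}


async def main():
    # One shared pool; it is shut down when the with block exits.
    with ThreadPoolExecutor(max_workers=3) as executor:
        return await asyncio.gather(
            build_summary("A", executor),
            build_summary("B", executor),
        )


# asyncio.run creates the loop, runs main() to completion, and closes the loop.
summaries = asyncio.run(main())
print(summaries)
```

Gathering the helper calls lets the blocking work inside each summary overlap as well, which only helps when the helpers are independent of one another.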

Asyncio For Your Systems

Asyncio is a great addition to the Python standard library, and it can often speed up programs that spend much of their time waiting on independent, I/O-bound work. To understand the impact of asyncio, I recommend benchmarking part of a program with and without asyncio implemented. It’s important to keep in mind any technical debt added: if you find yourself building an overly complex solution for minimal speed improvement, asyncio might not be the right fit. To get more information about asyncio, check out the documentation here: https://docs.python.org/3/library/asyncio.html
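A benchmark along those lines can be as small as the sketch below. Everything in it is a stand-in: blocking_call uses time.sleep to imitate an I/O-bound request, and the call count and delay are arbitrary values chosen for illustration. Real numbers depend entirely on the workload being measured.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call():
    # Stand-in for an I/O-bound request.
    time.sleep(0.05)
    return 1


def run_sync(n):
    return [blocking_call() for _ in range(n)]


async def run_async(n):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=n) as executor:
        calls = [loop.run_in_executor(executor, blocking_call) for _ in range(n)]
        return await asyncio.gather(*calls)


def timed(func):
    # Return the function's result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


sync_result, sync_seconds = timed(lambda: run_sync(4))
async_result, async_seconds = timed(lambda: asyncio.run(run_async(4)))
print(f"sync: {sync_seconds:.3f}s, asyncio + threads: {async_seconds:.3f}s")
```

Comparing the two timings on representative inputs shows whether the added complexity buys a meaningful improvement.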

The post "Using Asyncio at Pixability" first appeared on Tech Blog.