Writing / Articles
ArticleCode

Fast Heatmap Generation with Python, Flask, and Numba

A practical introduction to Numba through a real-world problem: generating heatmap tiles in Python. From data collection to image generation, we'll optimize every step of the pipeline.

Problem statement

In this article, we'll explore the Python library Numba and see how it can significantly improve the performance of Python code using parallel execution and CUDA acceleration.

To achieve this, we'll use a problem that is both well-suited for parallel computing and easy to visualize: heatmap generation.

In any mapping system, a map is represented as a collection of small pieces called tiles. Typically, the base map is rendered first (which is outside the scope of this article), and then an additional layer — such as a heatmap — is drawn on top of it.

Algorithm

  1. Retrieve the geographic coordinates of all Perekrestok stores in Moscow.
    • Download the list of store addresses from the unofficial website.
    • Use the Google Geocoding API to convert the addresses into geographic coordinates.
  2. Build a store accessibility map.
    • For each point within a tile, calculate the distance to every store.
    • Based on these distances, generate an accessibility map.
  3. Convert the accessibility map into colors (generate an image).
  4. Visualize the result using the mapping service geojson.io.

The unofficial Perekrestok website is a third-party resource. If its HTML structure changes, the parser will stop working. Therefore, treat it as an example or template rather than a reliable production-ready solution. The screenshot below shows the website as it appeared at the time this article was written.

We will consider a store to be accessible if it is located within 1.2 km of the point being evaluated. This distance corresponds to approximately a 15-minute walk.

Data preparation

As the subject of our experiment, we'll use the Perekrestok supermarket chain in Moscow and visualize the accessibility of its stores across the city.

In production environments, tiles are typically pre-generated and then served by a tile server. However, since the focus of this article is on accelerating computations, we'll generate them on the fly. The faster the tiles are generated, the smoother the map rendering will be.

Retrieving the list of stores

Using the aiohttp library, we'll download the page and then use BeautifulSoup to extract the store addresses from it:

python
async def get_shops_addresses(session):    
    async with session.get('https://perekrestok-promo.ru/store/g-moskva') as response:
        html = await response.text()
    tree = BeautifulSoup(html, 'html.parser')
    feature = 'Магазин Перекресток по адресу '
    raw_links = tree.body.find_all('a', string=re.compile(feature))
    links = [(link.text.replace(feature, '')) for link in raw_links]
    return links

Geocoding

In the previous step, we obtained a list of store addresses. Now we'll use the Google Geocoding API to convert each address into geographic coordinates:

python
async def coordinate_from_address(session, address):
    params = [
        ('key', 'API_KEY'),
        ('address', address)
    ]
    async with session.get('https://maps.googleapis.com/maps/api/geocode/json', params=params) as response:
        text = await response.text()
        if response.status == 200:
            data = json.loads(text)
            try:
                position = data['results'][0]['geometry']['location']
                return (float(position['lat']), float(position['lng']))
            except (KeyError, IndexError):
                pass
    return None

The final function uses the asyncio library to first retrieve the list of store addresses and then geocode them to obtain their geographic coordinates:

python
async def load_points():
    chunk_size = 30
    session = aiohttp.ClientSession()
    addresses = await get_shops_addresses(session)
    tasks = [coordinate_from_address(session, address) for address in addresses]
    results = []
    while len(tasks):
        chunk = tasks[:chunk_size]
        tasks = tasks[chunk_size:]
        finished, unfinished = await asyncio.wait(chunk)
        results.extend([task.result() for task in finished if task.result() is not None])
        await asyncio.sleep(1)
    await session.close()
    return results

Note that the function sends no more than 30 requests to the geocoder concurrently in order to avoid exceeding the Google Geocoding API's limit on the number of simultaneous requests.

Tile generation

Introduction to Numba

Numba is an open-source library that accelerates numerical computations using Just-In-Time (JIT) compilation. JIT compilation improves performance by compiling Python bytecode into native machine code at runtime, allowing computationally intensive code to execute much faster.

Working with Numba typically begins by applying one of the following decorators: njit, vectorize, or guvectorize. Let's briefly look at each of them.

njit

This decorator tells Numba to compile a function using nopython=True. On the one hand, this restricts the function from using arbitrary Python objects and many external Python features. On the other hand, it allows Numba to generate highly optimized native machine code, resulting in significantly better performance.

vectorize

This decorator is used for functions that define an operation on a single array element (a scalar). For example, if you want to add two arrays, the decorated function only needs to describe how to add two numbers. Numba automatically applies the function to every element of the input arrays and can parallelize the computation.

guvectorize

This is the most flexible—and also the most advanced—approach to array processing. Unlike vectorize, the function typically operates on entire arrays rather than individual scalar values. One important detail is that the output array is passed into the function as an argument. In other words, the function receives both the input data and a preallocated array where it is expected to write the results.

Let's look at an example. The function below calculates the distance between two geographic coordinates:

python
@njit('f8(f8[:], f8[:])')
def dist(point1, point2):
    R = 6373.0

    lat1, lon1 = point1
    lat2, lon2 = point2

    delta = point2 - point1
    a = math.sin(delta[0] / 2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(delta[1] / 2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return R * c

The function above calculates the distance between two geographic coordinates in kilometers. The part we're interested in is the function signature specified in the decorator: f8(f8[:], f8[:]).

  • f8 — a float64 value.
  • f8[:] — a one-dimensional array of float64 values.

In other words, the function takes two arrays of floating-point numbers (latitude/longitude pairs) as input and returns a single floating-point value.

Let's look at another example. The function below clamps a value to a specified range: if the input is smaller than the minimum, it returns the minimum; if it is greater than the maximum, it returns the maximum.

python
@njit('f8(f8, f8, f8)')
def clamp(x, min_val, max_val):
    return min(max(min_val, x), max_val)

When working with Numba, you'll often need to write small helper functions like this, since functions compiled with Numba decorators can only call other Numba-compatible functions or functions explicitly supported by the library.

Grid generation

Modern mapping frameworks represent a map as a collection of images called tiles. Each tile is identified by two coordinates and is displayed only at a specific zoom level.

Therefore, a tile is uniquely defined by three values: (x, y, zoom).

Based on a tile's size and coordinates, we need to generate an array containing the geographic coordinates of every point within the tile. This is an ideal candidate for parallel execution.

Let's take a closer look at the function arguments specified in the decorator:

  • zeros: An empty three-dimensional float64 array used solely to define the shape of the result argument (explained below). The array is three-dimensional because each point in the tile (which already has two dimensions) stores both a latitude and a longitude. For example, result[10][10][1] represents the longitude of the point at coordinates (10, 10) within the tile.

  • x_range, y_range: One-dimensional arrays of float64 values representing the coordinate ranges along the X and Y axes of the tile. For example, if a tile has coordinates x = 3 and y = 4 and a size of 10 × 10 pixels, then x_range would be [3, 3.11, 3.22, ..., 4], while y_range would be [4, 4.11, 4.22, ..., 5].

  • zoom: A float64 value representing the zoom level of the tile being processed.

  • result: A three-dimensional float64 array where the function writes its output. Like zeros, it has the same shape, but unlike zeros, it is the actual output buffer. The zeros argument exists only to inform the JIT compiler about the dimensionality of result, whereas result itself is populated with the computed geographic coordinates.

The second argument of the guvectorize decorator is the signature, which explicitly defines the relationship between the dimensions of the input and output arguments:

(a, b, c), (a), (b), () -> (a, b, c)

You might notice that c is always equal to 2 (latitude and longitude), so it may seem that we could write something like (a), (b), () -> (a, b, 2) and eliminate the zeros argument altogether. Unfortunately, this is not possible due to one of Numba's limitations. The signature may contain only symbolic dimension variables—not numeric constants. As a result, we pass an empty array with the required shape so that Numba can infer the dimensions of the output array.

The function below computes the geographic coordinates for every point within a tile:

python
@guvectorize(['void(f8[:, :, :], f8[:], f8[:], f8, f8[:, :, :])'],
             '(a, b, c), (a), (b), () -> (a, b, c)',
             target='parallel', nopython=True)
def calc_grid(zeros, x_range, y_range, zoom, result):
    tile_count = 2.0 ** zoom
    for i, x in enumerate(x_range):
        for j, y in enumerate(y_range):
            lon = x / tile_count * 2 * math.pi - math.pi
            lat = math.atan(math.sinh(math.pi * (1 - 2 * y / tile_count)))
            result[j][i] = (lat, lon)

Pay attention to the additional arguments passed to the decorator:

target="parallel" instructs Numba to execute the computation across all available CPU cores. Replacing parallel with cuda allows the computation to run on the GPU instead, which can provide an even greater performance boost.

nopython=True is what actually enables Numba's high-performance compilation. According to the Numba documentation, setting this flag to False switches the compiler to object mode, where Python objects are supported, but the resulting performance is generally no better than that of regular Python code.

Distance calculation and accessibility map generation

At this point, we have computed the geographic coordinates for every point in the tile, prepared an array containing the geographic coordinates of all points of interest (the store locations in our case), and implemented a function for calculating the distance between two geographic coordinates.

Using the assumption that 1.2 km is the distance a person can walk in 15 minutes, we can calculate the number of accessible stores for every point in the tile. We then convert this count into an accessibility score.

We'll consider an accessibility score to be optimal if a point has four or more accessible stores nearby. This threshold was chosen empirically and essentially serves as a normalization factor.

python
@guvectorize(['void(f8[:, :, :], f8[:, :], f8[:, :])'],
             '(n, m, k), (a, b) -> (n, m)',
             target='parallel', nopython=True)
def calc_dists(positions, points, result):
    for i in range(positions.shape[0]):
        for j in range(positions.shape[1]):
            buffer = 0
            for point in points:
                cur_dist = dist(positions[i][j], point)
                # копим значение доступности магазинов
                # чем магазин ближе, тем больший вклад он внесёт, если дальше чем 1.2 км
                # считаем его недоступным, а его вклад в карту доступноти нулевым
                buffer += max(0, 1.2 - cur_dist)
            result[i][j] = min(1, buffer / 4)

Mapping accessibility values to RGB colors

The final step in generating the tiles is to convert the accessibility score into a color. Under normal circumstances, we could simply use the colormap function from the matplotlib library. However, as mentioned earlier, functions compiled by Numba cannot call arbitrary third-party libraries.

The solution is straightforward: take the implementation of the desired colormap from the matplotlib source code and reimplement it in a way that is compatible with Numba. In this article, we'll use the jet colormap.

The function below converts an accessibility score into a color. Pay close attention to the signature specified in the decorator. It declares that the input and output arrays have the same shape. At first glance, this may seem odd: an accessibility score is a single value, whereas a color consists of four components.

Before applying the colormap, we expand the accessibility map from a 256×256 array to a 256×256×4 array. This allows us to describe the output shape correctly in the decorator signature. The accessibility score itself is stored in the first component of each input element.

For example:

  1. Original accessibility map: A = 256×256A[i][j] = val
  2. Expanded accessibility map: A = 256×256×4A[i][j] = (val, 0, 0, 0)
python
@guvectorize(['void(f8[:, :, :], f8[:, :, :])'],
             '(n, m, k) -> (n, m, k)',
             target='parallel', nopython=True)
def calc_colors(dists, result):
    for i in range(dists.shape[0]):
        for j in range(dists.shape[1]):
            x = dists[i][j][0]
            r = clamp((4.0 * x - 1.5) if x < 0.7 else (-4.0 * x + 4.5), 0, 1)
            g = clamp((4.0 * x - 0.5) if x < 0.5 else (-4.0 * x + 3.5), 0, 1)
            b = clamp((4.0 * x + 0.5) if x < 0.3 else (-4.0 * x + 2.5), 0, 1)
            a = clamp(dists[i][j][0] * 2, 0, 0.95)
            result[i][j] = (r, g, b, a)
    result *= 255

Final tile generation function

Now that all the components have been implemented, all that remains is to assemble them in the correct order. The function below generates an accessibility map for a given tile as a PNG image, which can then be overlaid on the map as a separate layer.

python
def gen_tile(zoom, x, y, points):
    size = 256
    rng = np.linspace(0, 1, size)
    grid = np.zeros((size, size, 2))

    # Generate a coordinate grid (geographic coordinates for every point in the tile)
    calc_grid(grid, x + rng, y + rng, zoom, grid)

    # Convert the coordinates of the points of interest to radians (this simplifies the distance calculations)
    points = np.radians(points)

    # Compute the accessibility map
    dists = np.zeros((size, size))
    calc_dists(grid, points, dists)

    # Expand it into a three-dimensional array
    dists = np.dstack((
        dists.reshape((size, size, 1)),
        np.zeros((size, size, 3))
    ))

    # Allocate an empty array for the colors and compute the color values
    colors = np.zeros_like(dists)
    calc_colors(dists, colors)

    # Generate the final image
    image = Image.fromarray(colors.astype('uint8'))
    filename = 'files/{z}/{x}/{y}.png'.format(z=zoom, x=x, y=y)

    if not os.path.exists(os.path.dirname(filename)):
        os.makedirs(os.path.dirname(filename))

    image.save(filename)
    return filename

Results

Accessibility map of Perekrestok stores in Moscow

Accessibility map of post offices in Kyiv

Optimization

Bonus section! Let's optimize the algorithm by excluding points of interest that cannot possibly contribute to the final tile.

To do this, we check the following condition for each point of interest: the distance from the point of interest to the closest point within the tile must be greater than the threshold distance (1.2 km). A simple way to evaluate this is to subtract the distance from the tile center to its edge from the distance between the point of interest and the tile center (see the illustration below).

This simple optimization can significantly reduce the tile generation time. We first exclude all points of interest that cannot possibly affect the tile, and only then use the remaining points to compute the accessibility map.

Links

Web developmentnumbaflaskpython
Karen Grigorian
Karen Grigorian