Building an IoT Application Using an HTTP API

For years the world has been abuzz with IoT devices. These devices range from alarm clocks that show the current weather to refrigerators that list the prices of nearby groceries. Whatever the specifics, these devices rely on APIs to communicate with data sources. But how exactly do we connect the messages, data, and devices?

In this post, we’ll show you an example of how to design and model data for an IoT device. We’ll use the M5Stack—a small, modular IoT device with a display screen—and connect to the API for the New York City Metropolitan Transportation Authority (NYC MTA) to render the latest subway times for various stations.

While we’ll focus on the M5Stack, the concepts we’ll discuss will apply to designing an IoT application across a wide variety of devices.

So let’s get started!

Prerequisites

In this tutorial, we’ll focus on the bigger conceptual ideas around how to request data from an API. Some programming knowledge will be very helpful. You don’t need an M5Stack, but if you have one, you can follow along and upload the finished project to your own device.

With that in mind, you can download the VS Code IDE and the M5Stack plugin. If you’ve never booted an M5Stack before, follow their guide to set up WiFi and the necessary firmware. For this project, we’ll use Python 3, which is the main programming language that the M5Stack uses.

You’ll need to sign up for an NYC MTA developer account to get a free API key for their real-time subway data.

Lastly, you should sign up for a free Gravitee account to use the API designer, which will make it easier to visualize and understand the data flow in your API calls!

This project was inspired by this open-source project, so go ahead and star that repository if it’s been helpful.

Designing the API Interaction

Before writing a single line of code, let’s take a step back and consider what sort of information we need to complete this project:

  • Information about the relevant subway stations
  • Which trains pass through those stations
  • The latest real-time data about those trains

According to the documentation, the API is separated into static data feeds and real-time data feeds.

Static data feeds hold information about stations. With that information, we can get the actual live train data from the real-time data feeds. The static station data provided by the MTA is a CSV file with the following columns:

stop_id,stop_code,stop_name,stop_desc,stop_lat,stop_lon,zone_id,stop_url,location_type,parent_station

Since the only static information we need is the station ID, we can simply pick any station ID and use it for the real-time feeds. In this case, I’m choosing the Hoyt–Schermerhorn station for its relative complexity: two separate trains pass through it (the A and the C). Each station also has one stop ID per direction, suffixed with N for northbound or S for southbound.

A42,,Hoyt-Schermerhorn Sts,,40.688484,-73.985001,,,1,
A42N,,Hoyt-Schermerhorn Sts,,40.688484,-73.985001,,,0,A42
A42S,,Hoyt-Schermerhorn Sts,,40.688484,-73.985001,,,0,A42

From these rows, all we need is the parent stop ID (A42) to identify the trains passing through the station, both northbound (A42N) and southbound (A42S).
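
As a rough sketch of that lookup, the snippet below parses the three rows shown above with Python's csv module and groups the directional stop IDs under their parent station. The helper name find_stop_ids is hypothetical, and the rows are pasted inline here; in practice they would come from the MTA's static stops file.

```python
import csv
import io

# Header and rows copied from the static feed excerpt above
STOPS_CSV = """stop_id,stop_code,stop_name,stop_desc,stop_lat,stop_lon,zone_id,stop_url,location_type,parent_station
A42,,Hoyt-Schermerhorn Sts,,40.688484,-73.985001,,,1,
A42N,,Hoyt-Schermerhorn Sts,,40.688484,-73.985001,,,0,A42
A42S,,Hoyt-Schermerhorn Sts,,40.688484,-73.985001,,,0,A42
"""


def find_stop_ids(stops_csv, station_name):
    """Return (parent_id, [child stop IDs]) for a station name (hypothetical helper)."""
    parent_id = None
    child_ids = []
    for row in csv.DictReader(io.StringIO(stops_csv)):
        if row["stop_name"] != station_name:
            continue
        if row["location_type"] == "1":
            # location_type 1 marks the parent station record
            parent_id = row["stop_id"]
        elif row["parent_station"]:
            # Directional platforms point back to their parent
            child_ids.append(row["stop_id"])
    return parent_id, child_ids


parent, children = find_stop_ids(STOPS_CSV, "Hoyt-Schermerhorn Sts")
print(parent, children)
```

Matching on the station name is only for illustration; once you know the parent ID, you can hard-code it as we do in the rest of this post.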

The real-time feeds use Google’s GTFS Realtime format, which is based on protocol buffers (also called protobuf). While the NYC MTA doesn’t have documented examples of its specific feeds, GTFS does. From the GTFS documentation, we can identify how to get the time-to-arrival for the latest trains at a particular station in a protobuf format.

Here’s an example of a response from the GTFS endpoint, converted into JSON for easier visualization:

{
  "trip":{
     "trip_id":"120700_A..N",
     "start_time":"20:07:00",
     "start_date":"20220531",
     "route_id":"A"
  },
  "stop_time_update":[
     {
        "arrival":{
           "time":1654042672
        },
        "departure":{
           "time":1654042672
        },
        "stop_id":"H06N"
     },

     //…more stops…

     {
        "arrival":{
           "time":1654044957
        },
        "departure":{
           "time":1654044957
        },
        "stop_id":"A42N"
     }
  ]
}
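
To make the shape of this response concrete, here is a minimal sketch that walks a dictionary with the same structure as the JSON above and pulls out the arrival time for one stop. The function name arrival_for_stop is an assumption for illustration; the sample values are the ones from the response above, with the elided stops left out.

```python
from datetime import datetime, timezone

# Same shape as the JSON above, with the elided stops omitted
trip_update = {
    "trip": {"trip_id": "120700_A..N", "route_id": "A"},
    "stop_time_update": [
        {"arrival": {"time": 1654042672}, "departure": {"time": 1654042672}, "stop_id": "H06N"},
        {"arrival": {"time": 1654044957}, "departure": {"time": 1654044957}, "stop_id": "A42N"},
    ],
}


def arrival_for_stop(update, stop_id):
    """Return the arrival time (Unix seconds) for stop_id, or None if absent."""
    for stop in update.get("stop_time_update", []):
        if stop.get("stop_id") == stop_id:
            return stop.get("arrival", {}).get("time")
    return None


arrival = arrival_for_stop(trip_update, "A42N")
# The feed reports times as Unix timestamps; convert to UTC for display
print(datetime.fromtimestamp(arrival, tz=timezone.utc).isoformat())
```

The times are plain Unix timestamps, so any comparison with the current time is simple integer arithmetic.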

Because of the amount of information the NYC MTA API throws at you, it can be very helpful to use the Gravitee API Designer to model what the API returns, mapping out and visualizing the data. Here’s a snapshot of our API Designer mind map:

[Image: API Designer mind map]

The API Designer helps you identify all of the resources (endpoints) for your API, along with the data attributes associated with resources. Those attributes will include the inputs an endpoint needs and the outputs it provides.

In our map, we have a resource with the path /gtfs/. We can attach as many attributes as necessary, and we can annotate each of those attributes with data types. By looking at our map, we can draw a direct path from the endpoint to the arrival and departure times identified at the bottom right.

So, in order to represent the data we need, we’ll need to:

  • Identify the ID of the station we want train information from
  • Issue an HTTP request to the NYC MTA’s GTFS feed for the train line(s) we’re interested in
  • Iterate over the results, comparing the stop_id in the response array with our station ID
  • Act on the time information for our specific station and train

This represents a few moving parts, but it shouldn’t be anything we can’t handle!

Coding It

Before running anything on our M5Stack, let’s first make sure our code works locally. We’ll install a few Python packages to make our project easier to build.

pip3 install --upgrade gtfs-realtime-bindings
pip3 install protobuf3_to_dict
pip3 install requests

The first two packages convert protocol buffers into Python dictionaries (or hashes), which makes for an easier data model to work with. The last package makes it easier to issue HTTP requests from Python.

We’ll start our program by importing the Python packages:

from google.transit import gtfs_realtime_pb2
import requests
import time

Next, we’ll issue our HTTP request to the NYC MTA GTFS feed:

api_key = "YOUR_API_KEY"

# Requests subway status data feed from the NYC MTA API
headers = {'x-api-key': api_key}
feed = gtfs_realtime_pb2.FeedMessage()
response = requests.get(
    'https://api-endpoint.mta.info/Dataservice/mtagtfsfeeds/nyct%2Fgtfs-ace',
    headers=headers)
feed.ParseFromString(response.content)

So far, so good. The GTFS endpoint we’re using here is the one for the A/C/E trains, which we can identify by the -ace suffix on the URL. (Except, for this demo, we don’t care about the E train—sorry, E train!)

Let’s convert that GTFS protocol buffer response into a dictionary:

from protobuf_to_dict import protobuf_to_dict
subway_feed = protobuf_to_dict(feed)  # converts MTA data feed to a dictionary
realtime_data = subway_feed['entity']

At this point, I would highly recommend issuing a print(realtime_data), so we can see what the actual data structure looks like. If this were a real project, such an analysis might help you identify which keys and values in the dictionary you’d need to iterate over, but since this is a tutorial, we’ve already covered that. The function below walks the feed and collects the arrival times for our station:

def station_time_lookup(train_data, station):
    for train in train_data:
        # Not every feed entity carries a trip update, so fall back to an empty dict
        trip_update = train.get('trip_update', {})
        for stop_update in trip_update.get('stop_time_update', []):
            stop_id = stop_update.get('stop_id')
            # Look up the arrival defensively in case the key is missing
            arrival_time = stop_update.get('arrival', {}).get('time')
            if arrival_time is None:
                continue
            if stop_id == f'{station}N':
                northbound_times.append(arrival_time)
            elif stop_id == f'{station}S':
                southbound_times.append(arrival_time)

# Keep a global list to collect various train times
northbound_times = []
southbound_times = []

# Run the above function for the station ID for Hoyt-Schermerhorn
station_time_lookup(realtime_data, 'A42')

Suddenly we have a lot of code! But don’t worry—what we’re doing isn’t so complicated:

  • We iterate over the array of train information for the A/C lines.
  • For each array entry, we verify that we have values for all of the keys we need. This is defensive coding because we can’t be 100% certain that this third-party service has what we need when we need it!
  • After that, we iterate through all the stop updates and pick out the ones whose stop ID matches our parent ID (A42) plus a direction suffix, for both northbound and southbound trains.
  • Finally, we keep lists for the upcoming train arrival times in two separate global variables.
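
If the global lists feel awkward, one possible variation is to have the lookup return its results instead. The sketch below is an alternative, not the code used in the rest of this post; the name collect_arrivals and the sample feed entries are made up for illustration.

```python
def collect_arrivals(train_data, station):
    """Return sorted arrival times keyed by direction suffix ('N' or 'S')."""
    arrivals = {"N": [], "S": []}
    for entity in train_data:
        # Entities without a trip_update (alerts, for example) are skipped
        trip_update = entity.get("trip_update", {})
        for stop in trip_update.get("stop_time_update", []):
            stop_id = stop.get("stop_id", "")
            arrival_time = stop.get("arrival", {}).get("time")
            if arrival_time is None or stop_id[:-1] != station:
                continue
            direction = stop_id[-1]
            if direction in arrivals:
                arrivals[direction].append(arrival_time)
    return {direction: sorted(times) for direction, times in arrivals.items()}


# Made-up feed entries in the same shape as realtime_data
sample_feed = [
    {"trip_update": {"stop_time_update": [
        {"stop_id": "A42N", "arrival": {"time": 300}},
        {"stop_id": "A41N", "arrival": {"time": 200}},
    ]}},
    {"trip_update": {"stop_time_update": [
        {"stop_id": "A42S", "arrival": {"time": 250}},
        {"stop_id": "A42N", "arrival": {"time": 100}},
        {"stop_id": "A42S"},  # no arrival information, so it is ignored
    ]}},
    {"alert": {}},
]

print(collect_arrivals(sample_feed, "A42"))
```

Returning a dictionary keeps the function free of side effects, which makes it easier to call repeatedly, for example each time the feed is refreshed.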

Next, let’s present this information:

# Sort collected times in chronological order
northbound_times.sort()
southbound_times.sort()

# Take the earliest and second earliest arrival times from the list
nearest_northbound_arrival_time = northbound_times[0]
second_northbound_arrival_time = northbound_times[1]

nearest_southbound_arrival_time = southbound_times[0]
second_southbound_arrival_time = southbound_times[1]

### UI FOR M5STACK SHOULD GO HERE ###

def print_train_arrivals(
        direction,
        time_until_train,
        nearest_arrival_time,
        second_arrival_time):
    # If the nearest train is less than a minute away, show the one after it
    if time_until_train <= 0:
        next_arrival_time = second_arrival_time
    else:
        next_arrival_time = nearest_arrival_time
    next_arrival_time_s = time.strftime(
        "%I:%M %p",
        time.localtime(next_arrival_time))
    print(f"The next {direction} train will arrive at {next_arrival_time_s}")

# Grab the current time so that you can find out the minutes to arrival
current_time = int(time.time())
time_until_northbound_train = int(
    ((nearest_northbound_arrival_time - current_time) / 60))
time_until_southbound_train = int(
    ((nearest_southbound_arrival_time - current_time) / 60))
current_time_s = time.strftime("%I:%M %p")
print(f"It's currently {current_time_s}")

print_train_arrivals(
    "northbound",
    time_until_northbound_train,
    nearest_northbound_arrival_time,
    second_northbound_arrival_time)
print_train_arrivals(
    "southbound",
    time_until_southbound_train,
    nearest_southbound_arrival_time,
    second_southbound_arrival_time)

Most of what we’re doing above is data formatting. The key steps are as follows:

  • We sort the arrival times of the northbound and southbound trains at the station.
  • We take the first two times (the “soonest” trains arriving).
  • We compare those times with the current time to get a distance in minutes for the train’s arrival. We pass those train arrival times to print_train_arrivals.
  • If the next train is arriving in less than a minute, we’ll show the second arrival time—you’re not going to make that train, I’m afraid! Otherwise, we’ll show the nearest arrival time.
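
The selection logic in the last two bullets can be isolated into a small function, which makes it easy to check with fixed timestamps. This is a sketch under the same assumptions as the script above (times are Unix seconds and the list is sorted); the names minutes_until and next_catchable_arrival are hypothetical, and the function generalizes the script slightly by scanning the whole list rather than only the first two entries.

```python
def minutes_until(arrival_time, current_time):
    """Whole minutes between now and an arrival, truncated toward zero."""
    return int((arrival_time - current_time) / 60)


def next_catchable_arrival(sorted_times, current_time):
    """Return the first arrival that is at least a minute away, or None."""
    for arrival_time in sorted_times:
        if minutes_until(arrival_time, current_time) > 0:
            return arrival_time
    return None


now = 1_000_000
arrivals = [now + 30, now + 150, now + 600]
chosen = next_catchable_arrival(arrivals, now)
print(minutes_until(chosen, now))
```

Unlike indexing the list at [0] and [1] directly, this returns None rather than raising an IndexError when the feed lists fewer than two upcoming trains.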

If you run this script on the terminal, then you should see a message similar to the following:

It's currently 05:59 PM
The next northbound train will arrive at 06:00 PM
The next southbound train will arrive at 06:02 PM

Deploying to the M5Stack

Now that we have tested locally that our Python code can communicate with the NYC MTA API, it’s time to get this code running on our M5Stack. The easiest way to program the M5Stack is through the free UI Flow IDE, which is just a web page that communicates with your device through WiFi. You can learn more about how to configure your device for WiFi access through their documentation.

Although the M5Stack can be programmed through WYSIWYG UI elements, it can also accept (and run) Python code. The main advantage of the WYSIWYG elements is that they make it much easier to visualize the text drawn on the screen:

[Image: WYSIWYG]

In this GIF, I’ve created a label with the default string of “Text” on the sample M5Stack screen. When I switch to Python, we see that the label is an instantiation of an object called M5TextBox. As the label is dragged around, its X and Y coordinates (the first two arguments in the constructor) change in Python. This makes it easy to see how your program will be displayed. You can also change the variable used in the Python code (as well as other properties) by clicking on the label itself:

[Image: change the variable used in the Python code]

For the most part, the Python script we wrote can be used on the M5Stack with some slight modifications. We can copy the Python code from our local machine and paste it into the Python tab of the UI Flow IDE.

In our code, we find the ### UI FOR M5STACK SHOULD GO HERE ### comment and replace everything below it with the following code:

time_label = M5TextBox(146, 27, "", lcd.FONT_Default, 0xFFFFFF, rotate=0)
northbound_label = M5TextBox(146, 95, "", lcd.FONT_Default, 0xFFFFFF, rotate=0)
southbound_label = M5TextBox(146, 163, "", lcd.FONT_Default, 0xFFFFFF, rotate=0)

def print_train_arrivals(
        direction,
        label,
        time_until_train,
        nearest_arrival_time,
        second_arrival_time):
    # If the nearest train is less than a minute away, show the one after it
    if time_until_train <= 0:
        next_arrival_time = second_arrival_time
    else:
        next_arrival_time = nearest_arrival_time
    next_arrival_time_s = time.strftime(
        "%I:%M %p",
        time.localtime(next_arrival_time))
    label.setText(f"The next {direction} train will arrive at {next_arrival_time_s}")

while True:
    # Grab the current time so that you can find out the minutes to arrival
    current_time = int(time.time())
    time_until_northbound_train = int(
        ((nearest_northbound_arrival_time - current_time) / 60))
    time_until_southbound_train = int(
        ((nearest_southbound_arrival_time - current_time) / 60))
    current_time_s = time.strftime("%I:%M %p")
    time_label.setText(f"It's currently {current_time_s}")

    print_train_arrivals(
        "northbound",
        northbound_label,
        time_until_northbound_train,
        nearest_northbound_arrival_time,
        second_northbound_arrival_time)
    print_train_arrivals(
        "southbound",
        southbound_label,
        time_until_southbound_train,
        nearest_southbound_arrival_time,
        second_southbound_arrival_time)

    # Wait five seconds before refreshing the labels
    time.sleep(5)

Most of this should look familiar! There are two major modifications to get this code running on the M5Stack.

First, we created the labels which will be placeholders for our time and train data:

  • time_label
  • northbound_label
  • southbound_label

Second, we’ve put everything inside of a while loop, which will grab the current times and set the label text. The loop will sleep for five seconds and then restart the process.

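And that’s it! When we hit the Run button, we should see our labels refresh every five seconds. Note that the loop as written reuses the arrival times fetched when the script started; to show the latest route data, the feed request and the station_time_lookup call would also need to run inside the loop.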

Conclusion

That’s that! IoT devices are often used by hobbyists, but if you continue working on this project, there are several real-world considerations. One consideration is rate-limiting, making sure you’re requesting data in an efficient way from the MTA API. Another consideration is connectivity. If your device temporarily loses WiFi access, how will it reestablish a connection to fetch the information it needs?
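
As one hedged illustration of both concerns, the sketch below wraps a fetch function so that it is called at most once per interval and falls back to the last good result when a request fails. The class name CachedFeed, the 30-second interval, and the injected fetch and clock callables are all assumptions for illustration; the MTA's actual rate limits are not covered in this post.

```python
import time


class CachedFeed:
    """Call fetch() at most once per min_interval seconds; reuse stale data on errors."""

    def __init__(self, fetch, min_interval=30, clock=time.time):
        self.fetch = fetch
        self.min_interval = min_interval
        self.clock = clock
        self.last_fetch_time = None
        self.last_result = None

    def get(self):
        now = self.clock()
        is_fresh = (
            self.last_fetch_time is not None
            and now - self.last_fetch_time < self.min_interval
        )
        if is_fresh:
            return self.last_result
        try:
            self.last_result = self.fetch()
        except OSError:
            # Network errors (for example, dropped WiFi) keep the previous data
            pass
        # Record the attempt either way so failures are also rate limited
        self.last_fetch_time = now
        return self.last_result


# Demonstration with a fake clock and a fetch function that fails once
fake_now = [0]
calls = []


def fake_fetch():
    calls.append(fake_now[0])
    if len(calls) == 2:
        raise OSError("simulated WiFi drop")
    return f"feed at t={fake_now[0]}"


feed = CachedFeed(fake_fetch, min_interval=30, clock=lambda: fake_now[0])
first = feed.get()      # fetches
fake_now[0] = 10
second = feed.get()     # within the interval, served from cache
fake_now[0] = 40
third = feed.get()      # fetch fails, stale data is returned
fake_now[0] = 80
fourth = feed.get()     # fetch succeeds again
print(first, second, third, fourth, sep="\n")
```

In a real script, fetch would be a function that issues the HTTP request and parses the feed. Catching OSError is a deliberate choice here: the exceptions raised by the requests package derive from it, so one handler covers connection failures and timeouts.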

Once you start thinking about these production-grade concerns, or if you want to scale your project across multiple devices, you’ll also need to consider API management. I mentioned Gravitee Designer earlier in this article, which is great in the design phase. Gravitee has other tools for API management, such as an API gateway, monitoring, real-time analytics, and deployment.

IoT application development may seem daunting for developers who are used to writing code for traditional servers and web browsers. However, the leap to IoT devices is actually quite small. Today’s devices, with their built-in support for popular languages and frameworks, make IoT a fun and innovative way to build or integrate with APIs and applications.
