https://www.ics.uci.edu/~thornton/ics32a/ProjectGuide/Project3/


Background


We saw in the previous project that our Python programs are capable of connecting to the "outside world" around them — to other programs running on the same machine, or even to other programs running on different machines in faraway places. This is a powerful thing for a program to be able to do, because it is no longer limited to taking its input from a user or from a file stored locally; its input is now potentially anything that's accessible via the Internet, making it possible to solve a vast array of new problems and process a much broader collection of information. Once you have the ability to connect your programs to others, a whole new world opens up. Suddenly, the idea that you should be able to write a program that combines, say, Google search queries, the Internet Movie Database, and your favorite social network to find people who like movies similar to the ones you like doesn't seem so far-fetched.


But we also saw that getting programs to share information is tricky, for (at least) two reasons. Firstly, there's a software engineering problem: A protocol has to be designed that both programs can use to have their conversation. Secondly, there's a social problem: If the same person (or group of people) isn't writing both programs, it's necessary for them to agree on the protocol ahead of time, then to implement it. This second problem has a potentially catastrophic effect on our ability to make things work — how could you ever convince Google to agree to use your protocol just to communicate with you?


In practice, both of these problems are largely solved by the presence of standards, such as those defined by the World Wide Web Consortium and the Internet Engineering Task Force. Standards help by providing detailed communication protocols whose details have already been hammered out, with the intention of handling the most common set of needs that will arise in programs. This eliminates the need to design one's own protocol (where the standard protocols will suffice, which is more often than you might think) and allows programs to be combined in arbitrary ways; as long as they support the protocol, they've taken a big step toward being able to interoperate with each other. What's more, standard protocols often have standard implementations, so that you won't have to code up the details yourself as you did in the previous project. For example, Python has built-in support for a number of standard Internet protocols, including HTTP (HyperText Transfer Protocol, the protocol that your browser uses to download web pages) among others.


At first blush, HTTP doesn't seem all that important. It appears to be a protocol that will allow you to write programs that download web pages (i.e., that allow you to write programs that play the same role that web browsers do). But it turns out that HTTP is a lot more important than that, since it is the protocol that underlies a much wider variety of traffic on the Internet than you might first imagine. This is not limited only to the conversation that your browser has with a web server in order to download a web page, though that conversation most often uses HTTP (or its more secure variant, HTTPS). HTTP also underlies a growing variety of program-to-program communications using web protocols, where web sites or other software systems communicate directly with what are broadly called web services, fetching data and also making changes to it. This is why you can post tweets to Twitter using either their web site, a client application on your laptop, or a smartphone app; all of these applications use the same protocol to communicate with the Twitter service, differing only in the form of user interface they provide.


Fortunately, since HTTP support is built directly into Python, we can write programs that use these web services without having to handle low-level details of the protocol, though there are some details that you'll need to be familiar with if you want to use the provided implementation effectively. We'll be discussing some of these details in lecture soon, and these will be accompanied by a code example, which will give you some background in the tools you'll need to solve these kinds of problems in Python.


This project gives you the opportunity to explore a small part of the vast sea of possibilities presented by web APIs and web services. You'll likely find that you spend a fair amount of your time in this project understanding the web API you'll need — being able to navigate technical documentation and gradually build an understanding of another system is a vital skill in building real software — and that the amount of code you need might not be as much as you expect when you first read the project write-up. As always, work incrementally rather than trying to work on the entire project all at once; there is partial credit available for a partial solution, as long as the portions that you've finished are stable and correct. When you're done, you'll have taken a valuable step toward being able to build Python programs that interact with web services, which opens up your ability to write programs for yourself that are real and useful.


Additionally, you'll get what might be your first experience with writing classes in Python, which will broaden your ability to write clean, expressive Python programs, a topic we'll continue revisiting and refining throughout the rest of this course. Along with that, you'll learn about why it can be a powerful technique to write multiple, similar classes in a way that leaves them intentionally identical in at least one aspect of how they behave.




The problem


Perhaps particularly for people with chronic respiratory problems, but certainly for everyone, the quality of the air we breathe can have a dramatic impact on our short- and long-term health. As a kid growing up in the southern California area in the 1980s, there were days when I went to school but none of us was permitted to play outside during recess due to what, in those days, were called "smog alerts," which tuned me into the idea, from an early age, that air quality matters. The gray-brown skies of my youth are mostly a thing of the past, but, nonetheless, there are some days when you really want to avoid breathing the outside air as much as possible. The tricky part is knowing which days they are, because you can't often look out the window and see definitively what the quality of the air is; much of what makes the air problematic is invisible, more so than when I was a child.


Nowadays, the Internet provides a valuable resource to help us to monitor and manage the impact of air quality. In your work on this project, you'll write a program that can answer a question similar to the following: Where are some places where the air quality is unhealthy within 30 miles of where I am now?


To do that, though, we'll need some information that we won't have at our fingertips; it's not our ambition to build an air quality sensor and drive around in a 30-mile area looking for an unhealthy reading. But thanks to the ubiquitous Internet of today, we'll be able to obtain and use (free of charge) information that will allow us to answer a question like this without ever leaving the house. What we'll need are two things:



  • A collection of air quality sensors that can provide us with up-to-the-minute data about air quality all over the United States (and, to a lesser extent, the rest of the world). We don't need the sensors, of course, but we need their output.

  • A geocoding service that can tell us things like "Where is Bren Hall in Irvine, California?" or "What's the street address at this latitude and longitude?"


Given the ability to obtain answers to those kinds of questions and use them as input to our program, the rest of the problem is reduced to interpreting that input appropriately and performing the right calculations on it.


Because we're building a program in a problem domain that's new to us, though, we'll need to know some things about it. We don't need to become experts in air quality measurement or the intricacies of geographic algorithms and mapping, but we need to know enough about those things to be able to build what we seek to build. When we build programs, we're in the automation business, but we have to know something about what we're automating, even if we don't have to know everything.




How is air quality measured?


In the United States, the usual technique for reporting on air quality is called the Air Quality Index (AQI), so we'll use that same technique. A basic explanation of AQI is available below:



Reading through that document, you'll see that a standardized scale is used to describe risk levels — 175 is in a range considered to be unhealthy, for example — but it turns out that the same scale is used to describe the risks posed by different pollutants: ozone, carbon monoxide, particulates of various sizes, and so on. But, of course, the risk posed by those pollutants is different — how much ozone is too much doesn't necessarily correspond to how much carbon monoxide is too much — so what we really need to know are two things:



  • For the pollutant whose risk we're assessing, we first need a measurement of the concentration of that pollutant (i.e., how much of that pollutant is in the air).

  • Given that concentration, we need a formula for translating it to an AQI value that adequately communicates its risk.


The concentrations of different pollutants are measured differently. The risk posed by them is different, too, so the formula for translating concentrations to AQI is also different for each.


Of course, we aren't building a sensor to measure the concentration of pollutants in the air, so we'll need an online source for that data. But that source won't give us the AQI value, so it'll be up to us to determine it ourselves.


Determining the AQI value


We'll be considering only one pollutant, commonly referred to as PM2.5, which is shorthand for "particulates smaller than 2.5 microns" (i.e., smaller than 2.5 millionths of a meter). Sensors generally report concentrations of PM2.5 in µg/m³ (micrograms per cubic meter). So how do we convert that concentration to an AQI value? We do so by following this procedure:









































| If the concentration is between... | Then... |
|---|---|
| 0.0 ≤ µg/m³ < 12.1 | 0.0 µg/m³ is an AQI of 0; 12.0 µg/m³ is an AQI of 50. Every other value in this range is proportional to those (e.g., 6.0 is halfway between 0.0 and 12.0, so the AQI would be halfway between 0 and 50, i.e., 25). |
| 12.1 ≤ µg/m³ < 35.5 | 12.1 µg/m³ is an AQI of 51; 35.4 µg/m³ is an AQI of 100. Every other value in this range is proportional to those (e.g., 23.75 is halfway between 12.1 and 35.4, so the AQI would be halfway between 51 and 100, i.e., 75.5, which rounds up to 76). |
| 35.5 ≤ µg/m³ < 55.5 | 35.5 µg/m³ is an AQI of 101; 55.4 µg/m³ is an AQI of 150. Every other value in this range is proportional to those (e.g., 45.45 is halfway between 35.5 and 55.4, so the AQI would be halfway between 101 and 150, i.e., 125.5, which rounds up to 126). |
| 55.5 ≤ µg/m³ < 150.5 | 55.5 µg/m³ is an AQI of 151; 150.4 µg/m³ is an AQI of 200. Every other value in this range is proportional to those (e.g., 102.95 is halfway between 55.5 and 150.4, so the AQI would be halfway between 151 and 200, i.e., 175.5, which rounds up to 176). |
| 150.5 ≤ µg/m³ < 250.5 | 150.5 µg/m³ is an AQI of 201; 250.4 µg/m³ is an AQI of 300. Every other value in this range is proportional to those (e.g., 200.45 is halfway between 150.5 and 250.4, so the AQI would be halfway between 201 and 300, i.e., 250.5, which rounds up to 251). |
| 250.5 ≤ µg/m³ < 350.5 | 250.5 µg/m³ is an AQI of 301; 350.4 µg/m³ is an AQI of 400. Every other value in this range is proportional to those (e.g., 300.45 is halfway between 250.5 and 350.4, so the AQI would be halfway between 301 and 400, i.e., 350.5, which rounds up to 351). |
| 350.5 ≤ µg/m³ < 500.5 | 350.5 µg/m³ is an AQI of 401; 500.4 µg/m³ is an AQI of 500. Every other value in this range is proportional to those (e.g., 425.45 is halfway between 350.5 and 500.4, so the AQI would be halfway between 401 and 500, i.e., 450.5, which rounds up to 451). |
| above 500.5 | The AQI reading is "off the charts" (i.e., the highest meaningful reading is 500), so we'll report it as 501. |

The thing to notice is that the formula changes slightly as we move up the scale, but each row in the formula works the same way: Each uses a technique generally called linear interpolation, which basically means "Given the value at each endpoint, assume that the rest of the values are represented by a straight line in between." In that case, some relatively straightforward algebra will get us where we need to go.


Note, too, that AQI is always reported as an integer value, and that we always round to the nearest integer (i.e., if the formula above yields 24.7, the AQI would be reported as 25).
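The interpolation described above can be sketched in Python roughly as follows. This is only one possible arrangement, not the required design: the names PM25_BREAKPOINTS and pm25_to_aqi are made up for illustration, negative concentrations are rejected as an assumption, and a concentration of exactly 500.5 is treated as "off the charts" along with everything above it. Note that Python's built-in round() rounds halves to the nearest even integer, so the sketch uses math.floor(x + 0.5) to round halves up; floating-point error can still nudge a value that is mathematically an exact half to either side.

```python
import math

# (low concentration, high concentration, low AQI, high AQI), one tuple per row
# of the PM2.5 table. Concentrations are in micrograms per cubic meter.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def pm25_to_aqi(concentration: float) -> int:
    """Convert a PM2.5 concentration to an integer AQI by linear interpolation."""
    if concentration < 0:
        # Assumption: the table does not cover negative readings.
        raise ValueError('concentration cannot be negative')
    if concentration >= 500.5:
        return 501  # "off the charts"

    # Walk the rows from the highest range down and use the first row whose
    # lower bound the concentration reaches.
    for low_c, high_c, low_aqi, high_aqi in reversed(PM25_BREAKPOINTS):
        if concentration >= low_c:
            fraction = (concentration - low_c) / (high_c - low_c)
            aqi = low_aqi + fraction * (high_aqi - low_aqi)
            return math.floor(aqi + 0.5)  # round to nearest, halves go up

    # Only reachable for values such as NaN that fail every comparison.
    raise ValueError('concentration is not a usable number')
```

For instance, pm25_to_aqi(6.0) gives 25, matching the first example in the table.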




Latitudes, longitudes, and geocoding


Before you get too much farther, if you don't know how the latitude and longitude system works (don't feel bad if you don't, but you do need to understand this in order to solve this problem!), take a look at the link below:



In particular, note the limits on allowable latitudes and longitudes, as well as the difference between North and South latitude and between West and East longitude. And note, too, that latitude and longitude, generally, don't work the same way, so once you've understood one, you'll still need to be sure you've wrapped your mind around the other. There aren't a lot of details, but if you haven't thought about them in a while, or if you've never seen them before, it's worth taking a few minutes to get your understanding sorted out before continuing.


What is geocoding?


The word geocoding sounds like some kind of programming technique, but it's actually something else: It's a process for converting the descriptions of places on the Earth into their locations and back again. In other words, it allows us to answer questions such as these.



  • What is the latitude and longitude where Bren Hall in Irvine, California is located?

  • What is located at latitude 33.674381°N and longitude 117.865975°W?


The first of those questions is what we'd call forward geocoding (i.e., taking the description of a location and turning it into geographic coordinates). The second is what we'd instead call reverse geocoding (i.e., taking geographic coordinates and describing what's there).


Of course, answering questions like these requires an enormous amount of data that we don't have, so it won't be up to us to determine these answers; instead, we'll obtain them online as we need them.




Determining the distances between two locations


One of the fundamental operations your program needs is to be able to determine the distance between two locations on Earth. Before you can do that, though, we first need to agree on what is meant by "distance." The Earth is (more or less) spherical and a particular location (i.e., a latitude and longitude) specifies a point somewhere on its surface. When we consider the distance between two such locations, there are two ways to think about it:



  • A straight line traveling through the interior of the sphere, with the two locations as the endpoints of the line. We might call this the straight-line distance between the locations.

  • The shortest arc that travels along the surface of the sphere that has the two locations as the endpoints of the arc. The length of such an arc is called the great-circle distance between the two locations.


As is often the case, there's a tension between what's easier to implement and what's actually required. The straight-line distance would presumably be easier to calculate, but if our goal is to calculate distances that people might travel, it's a misleading answer, since it assumes that people travel from one location on Earth to another by boring a hole in the Earth! The great-circle distance makes a lot more sense when we consider the distances between locations on Earth, because people would tend to travel either along the Earth's surface (e.g., by walking, bicycling, or riding in a car) or roughly parallel to it (e.g., in an airplane).


So, when calculating the distance between two locations, your goal is to calculate the great-circle distance between them. Of course, we'll need a formula to do it, and, as luck would have it, there's a relatively simple formula that's plenty precise for our needs.


The equirectangular approximation


Given two points on the surface of the Earth expressed in terms of latitudes and longitudes, we can calculate the distance between them by using a formula we might call the equirectangular approximation. It's an approximation because it's based on a slightly imprecise "rounding off" of reality, in which we imagine that if you laid the entire Earth's surface out flat, latitudes would be horizontal lines equally spaced from each other and longitudes would be vertical lines equally spaced from each other. Then we imagine that flat surface "wrapped back around" a sphere. While this isn't quite accurate, the approximation is not far from reality, particularly in the context of the shorter distances that we'll be interested in here, so we'll use this simpler model, since it also leads to a simple and performant formula for calculating distances.


Given that, how do we calculate our distances? Some mild algebra and trigonometry (since we're dealing with spheres and angles) is all we need.



let dlat be the difference in the latitudes of the two points, in radians
let dlon be the difference in the longitudes of the two points, in radians
let alat be the average of the two latitudes, in radians
let R be the radius of the Earth, in miles (3958.8)
let x = dlon * cos(alat)
let d = sqrt(x² + dlat²) * R


After going through those steps, d will be a reasonably close approximation of the distance between the two points, expressed in miles.
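Translated into Python, those steps might look like the sketch below. The function name equirectangular_distance and the constant name are hypothetical, and the sketch assumes coordinates are given as signed degrees (positive for North and East, negative for South and West), which is a convention this write-up has not specified.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # the value of R given in the steps above


def equirectangular_distance(lat1: float, lon1: float,
                             lat2: float, lon2: float) -> float:
    """Approximate the distance in miles between two points given in degrees."""
    dlat = math.radians(lat2 - lat1)        # difference in latitudes, in radians
    dlon = math.radians(lon2 - lon1)        # difference in longitudes, in radians
    alat = math.radians((lat1 + lat2) / 2)  # average latitude, in radians
    x = dlon * math.cos(alat)
    return math.sqrt(x * x + dlat * dlat) * EARTH_RADIUS_MILES
```

As a sanity check, two points one degree of latitude apart on the same longitude come out roughly 69 miles apart.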




Where will we get our data?


While we'll be implementing some calculations of our own, the most meaningful input to our program will need to be obtained online, which raises the question of where we're going to get the information and how we're going to make sense out of it.


Air quality data from PurpleAir's API



PurpleAir is a company that sells Internet-aware air quality monitoring devices. Many of those devices are configured to be connected to the Internet, in which case they send their data back to PurpleAir, with some owners sharing that data publicly; it's that public data that we'll be using in this project.


PurpleAir actually provides two separate APIs containing its sensor data, one that's called the "legacy" API (i.e., it's been around longer) and another that's called the "experimental" API (i.e., it's newer, but its output is shorter and simpler). Of these, we'll depend on the experimental API.


Downloading the experimental API data for all of PurpleAir's public sensors is a simple matter of visiting the following URL:



It's not a bad idea to save a copy of this file in the same directory as your program's code. It will vary over time, but you'll need a stable copy that you can test with, so you don't have to download this huge amount of data every time you run your program as you build it, which is something that PurpleAir ultimately won't allow (see the section titled Limitations below).


Let's take a look at what some of the data looks like, as of this writing. Looking at the overall format, we can recognize it as the JSON format we saw when we learned about Web APIs. Its basic arrangement appears to be the following:



  • All in all, what we got back was one large JSON object.

  • Its first field is called version, which presumably indicates the current version of the API.

  • Its second field is called fields, whose value is a list of strings.

  • Its third field is called data, whose value is a list of lists, where each sublist appears to have the same number of elements that the fields list had. (That's not an accident. What we've got is a complete set of information from each sensor.)


So, what will we want to know about each sensor?



  • The second element in each sensor's list is the one named pm, which indicates its current reading of the concentration of PM2.5 in the air. It's being reported in µg/m³.

  • The fifth element in each sensor's list is the one named age, which specifies how many seconds it's been since the sensor last reported its value to PurpleAir. We'll ignore any sensor that hasn't reported a value in the last hour.

  • The 26th element in each sensor's list is the one named Type, which specifies whether the sensor is indoors or outdoors. When the value is 0, the sensor is outdoors; when the value is 1, the sensor is indoors. We're not interested in sensors that are indoors, since our goal is reporting on outdoor air quality.

  • The 28th element in each sensor's list is the one named Lat, which is a latitude, in degrees, where the sensor has been placed.

  • The 29th element in each sensor's list is the one named Lon, which is a longitude, in degrees, where the sensor has been placed.


Any sensor that doesn't have these elements, or that has these elements but they have values that aren't what they're expected to be (e.g., they're null instead of a number) should be ignored.
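One way to apply those rules to the parsed JSON is sketched below. It is an illustration, not the required structure: the function names are made up, and it looks the columns up by name in the fields list (rather than by fixed position), which assumes the names pm, age, Type, Lat, and Lon appear there exactly as described. The tiny sample payload is invented for the demonstration and has far fewer columns than the real data.

```python
import json


def _is_number(value) -> bool:
    # JSON numbers become int or float; bool is excluded because it is a
    # subclass of int in Python.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def usable_sensors(payload: dict) -> list:
    """Return (pm, lat, lon) tuples for recent outdoor sensors with valid data."""
    fields = payload['fields']
    # list.index raises ValueError if a name is missing, which a caller would
    # probably want to treat as misformatted data.
    columns = {name: fields.index(name) for name in ('pm', 'age', 'Type', 'Lat', 'Lon')}

    result = []
    for row in payload['data']:
        try:
            values = {name: row[index] for name, index in columns.items()}
        except (IndexError, KeyError, TypeError):
            continue  # the row is too short or is not a list at all
        if not all(_is_number(value) for value in values.values()):
            continue  # e.g., a null where a number was expected
        if values['age'] > 3600 or values['Type'] != 0:
            continue  # not reported within the last hour, or an indoor sensor
        result.append((values['pm'], values['Lat'], values['Lon']))
    return result


# A made-up payload in the same overall shape, for demonstration only.
sample = json.loads('''
{"version": "sample",
 "fields": ["ID", "pm", "age", "Type", "Lat", "Lon"],
 "data": [[1, 10.5, 60, 0, 33.6, -117.8],
          [2, 80.0, 7200, 0, 33.7, -117.9],
          [3, 20.0, 30, 1, 33.8, -117.7],
          [4, null, 30, 0, 33.9, -117.6],
          [5, 42.0, 30, 0, 34.0]]}
''')
```

Here only the first sample row survives the filtering; the others are stale, indoors, missing a reading, or too short.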


Geocoding via Nominatim's API


Nominatim is a web API that provides geocoding services using an open set of map data called OpenStreetMap. Specifically, we'll be interested in using it for two things:



  • Forward geocoding, which means that we want to take a description such as Bren Hall, Irvine, CA and find out its latitude and longitude.

  • Reverse geocoding, which means that we have a latitude and longitude, such as 33.5935341°N and 117.874846°W, and we want a description of what's there.


Nominatim's API has fairly extensive documentation that describes its use, so you'll want to take a look through that to understand the services it provides and how to access them. See if you can construct URLs that find the answers to the two examples above. Don't worry if it takes a little while, but do spend some time working on that problem before you try to reach out to Nominatim's API from your program; you can't use tools that you don't understand how to use.
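Once you've worked out the URLs by hand, building them in code is mostly a matter of encoding query parameters safely. The sketch below uses urllib.parse.urlencode from the standard library; the function names are hypothetical, and the base URL and the parameters shown (q, lat, lon, and format) reflect one reading of Nominatim's public documentation, so check them against the documentation rather than taking them from here. It also assumes West longitudes are written as negative numbers.

```python
import urllib.parse

# Assumption: the public Nominatim server; confirm against its documentation.
NOMINATIM_BASE_URL = 'https://nominatim.openstreetmap.org'


def build_search_url(description: str) -> str:
    """Build a forward geocoding URL for a free-form description."""
    query = urllib.parse.urlencode([('q', description), ('format', 'json')])
    return f'{NOMINATIM_BASE_URL}/search?{query}'


def build_reverse_url(lat: float, lon: float) -> str:
    """Build a reverse geocoding URL for a latitude and longitude in degrees."""
    query = urllib.parse.urlencode([('lat', lat), ('lon', lon), ('format', 'json')])
    return f'{NOMINATIM_BASE_URL}/reverse?{query}'
```

Letting urlencode handle the spaces and commas in a description avoids hand-built URLs that break on unusual characters.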



Testing without the APIs


One of the challenges when you work on a project such as this is that your ability to test the program (or even to run it and see its output) is at least partly dependent on the performance of the API. If the API isn't functioning properly, your program won't function properly either, but when you're building a program, it's good to be able to tell the difference between a program that isn't working because it's broken in some way and one that's working fine but dependent on something outside of it that's not working.


For that reason, your program will need a way to obtain its information from a file stored on your hard drive, instead of reaching out to the API. This will allow you to test your program with known-good data, which you'll mostly want to do, except when you're specifically working on the parts of the program where you're reaching out to the APIs. In the next sections of this write-up, you'll see how we'll make that possible.




The program


Your program will read a sequence of lines of input from the Python shell that configure its behavior, then generate and print some output consistent with that configuration. The general goal of the program is this: Given a "center" point, a range (in miles), and an AQI threshold, describe the locations within the given range of the center point having the n worst AQI values that are at least as much as the threshold. (That's a mouthful, so you'll want to read that sentence a few times; there's a lot going on there. Read further, too, and you'll see an example that will help to clarify.)


The input


The first thing your program does is read several lines of input that describe the job you want it to do. Your program should not print any prompts to the user; it should just blindly read this input, expecting that the user understands how to use the program already.



  • The first line of input will be in one of two formats:

    • CENTER NOMINATIM location, where location is any arbitrary, non-empty string describing the "center" point of our analysis. For example, if this line of input said CENTER NOMINATIM Bren Hall, Irvine, CA, the center of our analysis is Bren Hall on the campus of UC Irvine. The word NOMINATIM indicates that we'll use Nominatim's API to determine the precise location (i.e., the latitude and longitude) of our center point.

    • CENTER FILE path, where path is the path to a file on your hard drive containing the result of a previous call to Nominatim. The file needs to exist. The expectation is that the file will contain data in the same format that Nominatim would have given you, but will allow you to test your work without having to call the API every time. That's important, because Nominatim imposes limitations on how often you can call into it, and because this could allow you to make large parts of the program work without having hooked up the APIs at all.

  • The second line of input will be in the following format:

    • RANGE miles, where miles is a positive integer number of miles. For example, if this line of input said RANGE 30, then the range of our analysis is 30 miles from the center location.

  • The third line of input will be in the following format:

    • THRESHOLD AQI, where AQI is a positive integer specifying the AQI threshold, which means we're interested in finding places that have AQI values at least as high as that threshold.

  • The fourth line of input will be in the following format:

    • MAX number, where number is the maximum number of locations we want to find in our search. For example, if this line of input said MAX 5, then we're looking for up to five locations where the AQI value is at or above the AQI threshold.

  • The fifth line of input will be in one of two formats:

    • AQI PURPLEAIR, which means that we want to obtain our air quality information from PurpleAir's API.

    • AQI FILE path, where path is the path to a file on your hard drive containing the result of a previous call to PurpleAir's API with all of the sensor data in it.

  • The sixth line of input will be in one of two formats:

    • REVERSE NOMINATIM, which means that we want to use the Nominatim API to do reverse geocoding, i.e., to determine a description of where problematic air quality sensors are located.

    • REVERSE FILES path1 path2 ..., which means that we want to use files stored on our hard drive containing the results of previous calls to Nominatim's reverse geocoding API instead. Paths are separated by spaces (which means they can't contain spaces), and we expect there to be as many paths listed as the number we passed to MAX (e.g., if we said MAX 5 previously, then we'd specify five files containing reverse geocoding data).




We will not be testing invalid inputs in the Python shell, so you can feel free to handle them in any way you'd like — up to and including a program crash.
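To make the six-line format concrete, here is a rough sketch of how the lines might be split apart. The Config tuple, its field names, and the function names are all invented for illustration, and the sketch assumes well-formed input, consistent with the note above that invalid input won't be tested. A real design would more likely turn the CENTER, AQI, and REVERSE choices into objects of separate classes.

```python
from collections import namedtuple

# Hypothetical container for the parsed configuration.
Config = namedtuple(
    'Config', ['center', 'range_miles', 'threshold', 'max_count', 'aqi', 'reverse'])


def parse_config(lines: list) -> Config:
    """Parse the six input lines, in order, into a Config."""
    center_line, range_line, threshold_line, max_line, aqi_line, reverse_line = lines

    # maxsplit=2 keeps a location such as "Bren Hall, Irvine, CA" in one piece.
    _, center_kind, center_value = center_line.split(maxsplit=2)

    aqi_parts = aqi_line.split(maxsplit=2)
    aqi_path = aqi_parts[2] if len(aqi_parts) > 2 else None  # None for AQI PURPLEAIR

    reverse_parts = reverse_line.split()  # paths cannot contain spaces

    return Config(
        center=(center_kind, center_value),
        range_miles=int(range_line.split()[1]),
        threshold=int(threshold_line.split()[1]),
        max_count=int(max_line.split()[1]),
        aqi=(aqi_parts[1], aqi_path),
        reverse=(reverse_parts[1], reverse_parts[2:]),
    )


def read_config() -> Config:
    """Read the six lines from standard input without printing any prompts."""
    return parse_config([input() for _ in range(6)])
```

Keeping parse_config separate from read_config means the parsing can be tested with a plain list of strings, without typing anything into the shell.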


The output


After reading all of the input, you'd first display the latitude and longitude of the center location, with latitudes and longitudes shown in the following format.



CENTER 33.64324045/N 117.84185686276017/W
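A small helper can produce that format from numeric coordinates. The sketch below assumes coordinates are stored as signed degrees (negative for South and West), and it arbitrarily shows a value of exactly zero as N or E; neither convention is stated in this write-up, and the function names are made up.

```python
def format_coordinate(value: float, positive: str, negative: str) -> str:
    """Format one signed coordinate as its magnitude, a slash, and a direction."""
    direction = positive if value >= 0 else negative
    return f'{abs(value)}/{direction}'


def format_location(lat: float, lon: float) -> str:
    """Format a latitude and longitude pair, e.g. for the CENTER line."""
    return f'{format_coordinate(lat, "N", "S")} {format_coordinate(lon, "E", "W")}'
```

With this, the center line could be printed as print('CENTER', format_location(lat, lon)).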


Then, you'd use the information that's either stored in the specified files or downloaded from the specified APIs to find the sensors that are in the specified range of the center location, then determine which of those sensors have the highest AQI values and, for any of them that are at or above the AQI threshold, display information about the first n of them. For example, suppose the input was as follows:



CENTER NOMINATIM Bren Hall, Irvine, CA
RANGE 30
THRESHOLD 150
MAX 5
AQI PURPLEAIR
REVERSE NOMINATIM


This means we're looking for up to five locations within 30 miles of Bren Hall at UC Irvine where the AQI value is at least 150. Given a choice (i.e., if there are more than five locations with AQI values that meet the threshold), we want to show information about the five locations with the highest AQI values. For each location, you'd print three lines of output:




  • AQI AQI_value, where AQI_value is the AQI value you calculated for this location.

  • latitude longitude, which is the latitude and longitude for this location, in the same format as you printed the center location's latitude and longitude.

  • description, which is the full description of the location.
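The selection step itself (keep what meets the threshold, order from worst to best, take at most n) is short enough to sketch directly. The function name and the (aqi, lat, lon) tuple layout are assumptions made for this example, and the readings passed in are assumed to have been filtered already to those within range of the center.

```python
def worst_locations(readings: list, threshold: int, max_count: int) -> list:
    """Return up to max_count readings at or above threshold, highest AQI first.

    Each reading is assumed to be an (aqi, lat, lon) tuple.
    """
    at_or_above = [reading for reading in readings if reading[0] >= threshold]
    at_or_above.sort(key=lambda reading: reading[0], reverse=True)
    return at_or_above[:max_count]
```

Each tuple returned would then be printed as the three lines described above, with the description obtained by reverse geocoding its latitude and longitude.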


A complete example


As I was writing this, I ran a test, whose results I'm showing below. Note that the output you're seeing is wholly dependent on data from PurpleAir's sensors at the moment I ran the test, as well as the geocoding service done by Nominatim's API, so if you run the same test, you will almost certainly obtain different results, but this is a good demonstration of the output format that's required here.




CENTER NOMINATIM Bren Hall, Irvine, CA
RANGE 30
THRESHOLD 100
MAX 5
AQI PURPLEAIR
REVERSE NOMINATIM

CENTER 33.64324045/N 117.84185686276017/W
AQI 180
33.53814/N 117.5998/W
Garcilla Drive, Orange County, California, 92690, United States of America
AQI 157
33.690376/N 118.03055/W
Orange County, California, United States of America
AQI 154
33.68315/N 117.66642/W
Alton Parkway, Foothill Ranch, Lake Forest, Orange County, California, 92610, United States of America
AQI 152
33.816/N 118.23275/W
Arco, Tesoro Carson Refinery, Bangle, Carson, Los Angeles County, California, 90810, United States of America
AQI 151
33.86117/N 117.96228/W
1880, West Southgate Avenue, Fullerton, Orange County, California, 92833, United States of America


What to do in the case of API failure


In this project, we face the problem that our program may be written perfectly, yet still might fail in some circumstances. This is because we're dependent on two APIs sending us the data we need, in the format we expect, without which our program can't generate its output. Yet, the APIs are themselves software, and software fails; our communication with the APIs is done via a computer network, and computer networks fail, too. So we'll need to account for these possibilities in our design, and also have a mechanism for testing them.


First, we'll need to decide what it means for the APIs to have failed. To do that, we'll attack the problem from the opposite angle: What does success look like?



  • The HTTP status code in the response to all of our API requests was 200. Any other status code is considered a failure, regardless of the data that was sent in the response.

  • The content of the responses to all of our API requests was formatted as we expected (e.g., it was in JSON format if that's what we expected).

  • If we used a file on our hard drive in place of a call to an API, the file existed and the contents of the file were formatted as we expected (e.g., it was in JSON format if that's what we expected).


In any other case, we'll say that our program has failed, and we'll print an alternatively-formatted set of output — entirely separate from the normal one — that briefly describes the first failure you encountered.



  • The first line of output will simply be the word
    FAILED.

  • If the first failure you encountered was due to the use of an API...

    • The second line of output will contain the HTTP status code of the first API request that failed, as well as the URL that you connected to. (Note that the status code might still be 200, if the failure was due to missing or misformatted content.)

    • The third line of output will be exactly one of these three phrases:


      • NOT 200
        (if an API request returned a status code other than 200)


      • FORMAT
        (if an API request returned data that had missing or misformatted content)


      • NETWORK
        (if an API request couldn't be sent at all because, for example, there was no network connectivity)



  • If the first failure you encountered was due to a file on your hard drive that you were using in place of a call to an API...

    • The second line of output will contain the path to the file that you attempted to use.

    • The third line of output will be exactly one of these two phrases:


      • MISSING
        (if the file does not exist or couldn't be opened)


      • FORMAT
        (if the file could be opened, but its contents had missing or misformatted content)






For example, if your program makes an API request whose response contains the HTTP status code 429, your output would be this:



FAILED
429
NOT 200


Or if your program tried to use the file
D:\Examples\Python\purpleair.json
but that file didn't exist, your output would be this instead:



FAILED
D:\Examples\Python\purpleair.json
MISSING


To be clear, you'll print this alternative output (and only this alternative output) if any of the API requests or usages of files fails; otherwise, you'll follow the requirements above and print output describing the center location and any locations where air quality is problematic.
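
One way to keep these rules in one place is to represent a failure as an exception that carries the information for the second and third output lines. What follows is only a minimal sketch under stated assumptions, not a required design: the names DataFailure, check_api_response, and failure_lines are hypothetical, the response body is assumed to be JSON, the example URL is made up, and separating the status code from the URL with a single space is an assumption about the second line's layout.

```python
import json


class DataFailure(Exception):
    """Hypothetical exception describing the first failure encountered."""

    def __init__(self, source: str, reason: str):
        # source: text for the second output line (status code and URL, or a file path)
        # reason: text for the third output line, e.g. 'NOT 200', 'FORMAT', 'MISSING'
        super().__init__(f'{source}: {reason}')
        self.source = source
        self.reason = reason


def check_api_response(status: int, url: str, body: bytes):
    """Return the parsed JSON from an API response, or raise DataFailure."""
    source = f'{status} {url}'   # assumed layout: status code, a space, then the URL

    # Any status other than 200 is a failure, whatever the body contains.
    if status != 200:
        raise DataFailure(source, 'NOT 200')

    # A 200 response can still fail if its content is not valid JSON.
    try:
        return json.loads(body.decode('utf-8'))
    except (UnicodeDecodeError, json.JSONDecodeError):
        raise DataFailure(source, 'FORMAT') from None


def failure_lines(failure: DataFailure) -> list:
    """Build the three lines of the alternative output."""
    return ['FAILED', failure.source, failure.reason]


# Usage: report the first failure and nothing else.
try:
    check_api_response(429, 'https://example.com/data.json', b'')
except DataFailure as failure:
    print('\n'.join(failure_lines(failure)))
```

The same exception type could also be raised with a file path and MISSING or FORMAT when a file is used in place of an API, so the part of the program that prints the report never needs to know where the data came from.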




Design requirements and advice


As with the previous project, you'll be required to design your program using multiple Python modules (i.e., multiple
.py
files), each encapsulating a different major part of the program. We'll leave you some flexibility in determining where to draw the line between what's in one module and what's in another, but the module that you'd execute to run your program must be named precisely
project3.py.


Fetching our data with classes


There are three points in your program where you'll need to fetch data from either an API or a file:



  1. When you use forward geocoding to determine the location of the center of your analysis.

  2. When you need to obtain information from air quality sensors.

  3. When you use reverse geocoding to determine the description of where a problematic air quality sensor is.


In each of these three cases, there are two separate ways to solve the problem — one using an API and the other using a file. In each case, you'll be required to implement Python
classes, each of which contains attributes that configure it, if necessary (e.g., the path to a file that should be read), and a method that obtains the data. Classes that obtain the same data must share an interface (i.e., they must have a method with the same name, the same parameters, and the same type of return value), so that you can build objects of these types when you read your program's input, then execute them later without knowing which types of objects they actually are.


(This is one key benefit of using classes in Python; we can treat different kinds of objects with similar capabilities the same way, which saves us from having to use if statements to tell them apart. We saw an example of this in lecture, when we talked about
duck typing.)
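
As a rough illustration of the shared interface described above, here is a sketch of two classes that obtain the same kind of data in different ways. The class names, the method name fetch, and the helper count_top_level_keys are all hypothetical, and the failure handling the project requires is deliberately left out to keep the shape of the idea visible.

```python
import json
import urllib.request


class SensorDataFromFile:
    """Hypothetical class that reads sensor data from a local JSON file."""

    def __init__(self, path: str):
        self._path = path

    def fetch(self):
        # Read and parse the whole file as JSON.
        with open(self._path, encoding='utf-8') as file:
            return json.load(file)


class SensorDataFromApi:
    """Hypothetical class that downloads sensor data from a URL."""

    def __init__(self, url: str):
        self._url = url

    def fetch(self):
        # Download the response body, then parse it as JSON.
        with urllib.request.urlopen(self._url) as response:
            return json.loads(response.read().decode('utf-8'))


def count_top_level_keys(source) -> int:
    # Works with either class: only the fetch() method matters, not the type.
    return len(source.fetch())
```

Because both classes have a fetch method that takes no arguments and returns parsed JSON, the rest of the program can be handed either kind of object and call fetch without checking which one it received.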


Where should I start?


There are lots of ways to start this project, but your goal, as always, is to find stable ground as often as possible. One problem you know you'll need to solve is generating the final report, so you could begin by generating a portion of it — maybe just the SOME DETAILS OF THE OUTPUT REPORT, but formatted correctly, since this can lead to partial credit. Now you're on stable ground.


Another problem you know you'll need to solve is calculating an AQI value, given a PM2.5 concentration; you might consider continuing with that. You can test this using the Python shell or
assert-based tests before proceeding, and then you're on stable ground. Continue with the equirectangular approximation of distances between points on the Earth, then test that. Now you're on stable ground again.
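
To make the idea of assert-based tests concrete, here is a small sketch of the distance calculation with a few checks beneath it. It uses the standard form of the equirectangular approximation; the function name, the use of signed degrees (north and east positive), and the Earth radius constant are assumptions, so substitute whatever the project specifies. It does not handle pairs of points on opposite sides of the 180-degree meridian.

```python
import math

EARTH_RADIUS_MILES = 3958.8   # assumed mean radius; use the value your project specifies


def equirectangular_distance(lat1: float, lon1: float,
                             lat2: float, lon2: float) -> float:
    """Approximate distance in miles between two points given in signed degrees."""
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    mean_lat = math.radians((lat1 + lat2) / 2)

    # Shrink the east-west difference by the cosine of the average latitude,
    # then treat the two differences as the legs of a right triangle.
    x = dlon * math.cos(mean_lat)
    return math.hypot(x, dlat) * EARTH_RADIUS_MILES


# A point is zero miles from itself.
assert equirectangular_distance(33.64, -117.84, 33.64, -117.84) == 0.0

# One degree of latitude along a meridian is the radius times one degree in radians.
assert math.isclose(equirectangular_distance(0.0, 0.0, 1.0, 0.0),
                    EARTH_RADIUS_MILES * math.radians(1.0))

# The distance is the same in both directions.
assert math.isclose(equirectangular_distance(33.0, -117.0, 34.0, -118.0),
                    equirectangular_distance(34.0, -118.0, 33.0, -117.0))
```

If every assert passes silently when the module runs, you have some evidence that this piece works before anything else depends on it.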


From there, you might continue by implementing a module that obtains the air quality data from PurpleAir's API, perhaps first by implementing the class that reads that data from a file, then later implementing the class that loads it from the web instead. (You'll want the part that reads from a file pretty early on, because there are limitations on how often you can ask PurpleAir to send you all of its sensor data, so it's better not to keep asking repeatedly.)


Once you've got these implemented, you might continue with forward and reverse geocoding using Nominatim — again, first by implementing the classes that read this data from a file, then later implementing the classes that load them from Nominatim's API instead.


Now you'd have a lot of pieces in place, and you can start thinking about how to tie them together. At this point, you may feel like you don't have a
program
yet, but that's not so out of the ordinary when you work on a large project; it's often quite a while before you have something that runs an entire end-to-end process, because you first need to build and test a lot of smaller-scale tools. In that sense, this project is a pretty realistic view of what it takes to build real programs that interact with complex sets of inputs and outputs.


But, again, there are lots of sequences that could lead to a good solution, and you'll want to consider how you can achieve partial solutions that nonetheless meet the requirements partially, because partial credit is available for those. Still, if you find a way to approach this that's different than what I've suggested, but that leads you to a complete program that meets the design requirements, that's fine; we don't care what order you implement it in, ultimately, but we're happy to help you find an ordering if you're not sure what to work on next.

Answered 3 days after Nov 09, 2021

Swapnil answered on Nov 13 2021