Project Background

  • Northeastern US House Price Predictor
  • Takes user input for # of bedrooms, # of bathrooms, and # of acres
  • Uses an ML model trained on a dataset to predict house price and give it to user as an output
  • Also gives the user the option to save input preferences, which will then be loaded in by default when user opens or reloads the page

Collections

Blog Python Model code and SQLite Database.

  • From VSCode using SQLite3 Editor, show your unique collection/table in database, display rows and columns in the table of the SQLite database.

The project uses a SQLite database to store the input settings a user saves by clicking “save settings” on the frontend. The settings table holds only the most recently saved values, which are loaded back into the form whenever the site is reopened or refreshed.

  • From VSCode model, show your unique code that was created to initialize table and create test data.

The Python code below defines a function, create_settings_table(), that creates a table named “settings” in the SQLite database if it doesn’t already exist. The table has columns for id, bedrooms, bathrooms, and acre_lot. The function is called immediately after its definition, so the table structure is in place for storing settings data as soon as the application starts.

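import sqlite3

# DATABASE, the path to the SQLite file, is assumed to be defined elsewhere in the application.

# Create a table to store settings if it doesn't exist.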
def create_settings_table():
    with sqlite3.connect(DATABASE) as conn:
        cursor = conn.cursor()
        cursor.execute('''CREATE TABLE IF NOT EXISTS settings (
                            id INTEGER PRIMARY KEY,
                            bedrooms REAL,
                            bathrooms REAL,
                            acre_lot REAL
                          )''')
        conn.commit()

# Call the function to create the settings table.
create_settings_table()
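
To see the rows and columns without opening the SQLite3 Editor, the table can also be inspected from Python. This is a minimal sketch, not the project's code: it assumes an in-memory database in place of the real DATABASE file, and the test row values are made up.

```python
import sqlite3

# Assumption: an in-memory database stands in for the project's DATABASE file.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Same schema as create_settings_table() above.
cursor.execute('''CREATE TABLE IF NOT EXISTS settings (
                    id INTEGER PRIMARY KEY,
                    bedrooms REAL,
                    bathrooms REAL,
                    acre_lot REAL
                  )''')

# Hypothetical test row; the values are made up for illustration.
cursor.execute(
    "INSERT INTO settings (id, bedrooms, bathrooms, acre_lot) VALUES (1, 3, 2, 0.5)"
)
conn.commit()

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in cursor.execute("PRAGMA table_info(settings)")]
rows = cursor.execute("SELECT * FROM settings").fetchall()

print(columns)
print(rows)
```

Because the three input columns are declared REAL, whole numbers such as 3 come back from the database as floats.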

Lists and Dictionaries

Blog Python API code and use of List and Dictionaries.

  • In VSCode using Debugger, show a list as extracted from database as Python objects.
  • In VSCode use Debugger and list, show two distinct examples of dictionaries, show Keys/Values using debugger.

In the following image, the 3 user inputs (bedrooms, bathrooms, acre_lot) are stored in a dictionary named data. The keys are the names of the 3 user inputs and the values are the data the user entered. The data dictionary is then split into 3 separate variables, one per input, which are stored in a list named features. The list is run through the ML model to produce a predicted price, which is later sent to the frontend as output for the user.

In the following image, the 3 user inputs (bedrooms, bathrooms, acre_lot) are stored in a dictionary named data. The keys are the names of the 3 user inputs and the values are the data the user entered. The data dictionary is then split into 3 separate variables, one per input, which are written to the existing SQLite database and stored for future use by the user.
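
The dictionary-to-list step described above can be sketched in a few lines. The payload values here are hypothetical; only the names data and features and the three keys come from the project.

```python
# Hypothetical request payload; the values are made up for illustration.
data = {"bedrooms": 3, "bathrooms": 2, "acre_lot": 0.5}

# Keys are the input names, values are what the user entered.
for key, value in data.items():
    print(key, "->", value)

# Split the dictionary into three separate variables, as the API does.
bedrooms = float(data.get("bedrooms", 0))
bathrooms = float(data.get("bathrooms", 0))
acre_lot = float(data.get("acre_lot", 0))

# A list holding one row of features, the shape passed to the model's predict().
features = [[bedrooms, bathrooms, acre_lot]]
print(features)
```

The outer list holds rows and the inner list holds one value per feature, so a single prediction is a list with one row.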

APIs and JSON

Blog Python API code and use of Postman to request and respond with JSON.

  • In VSCode, show Python API code definition for request and response using GET, POST, UPDATE methods. Discuss algorithmic condition used to direct request to appropriate Python method based on request method.

The Python API code uses the POST method for the frontend to send the user’s inputs to the backend and receive data in return. Requests are directed by route and request method: when the frontend calls the /predict endpoint with POST, the predict_house_price() function is run.

  • In VSCode, show algorithmic conditions used to validate data on a POST condition.

The algorithmic conditions used to validate data on a POST request are in the predict_house_price() function. Inside the function, a try block extracts the data sent by the user from the request JSON using request.json. The input features (bedrooms, bathrooms, acre_lot) are then read with data.get(), which supplies a default of 0 for an absent key, and validated by conversion to floats. If any error occurs during this process, it is caught in the except block, and an error message is returned with status code 400 (Bad Request). Otherwise, a 200 success status code is returned along with a prediction output for the user.

@houseprice_api.route('/predict', methods=['POST'])
def predict_house_price():
    try:
        # Extract the data sent by the user.
        data = request.json
        
        # Get the input features from the data.
        bedrooms = float(data.get('bedrooms', 0))
        bathrooms = float(data.get('bathrooms', 0))
        acre_lot = float(data.get('acre_lot', 0))
        
        # Prepare the features for prediction.
        features = [[bedrooms, bathrooms, acre_lot]]
        
        # Use the model to predict the house price based on the provided features.
        predicted_price = rf_regressor.predict(features)[0]
        
        # Format the predicted price to two decimal places.
        formatted_price = f"{predicted_price:.2f}"
        
        # Create a response object containing the prediction.
        response = {'predicted_price': formatted_price}
        return jsonify(response), 200
    except Exception as e:
        # If something goes wrong, send back an error message with status code 400 (Bad Request).
        return jsonify({'error': str(e)}), 400
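
The effect of the float(data.get(...)) pattern can be checked in isolation, without Flask. The sketch below is an illustration under assumptions: parse_features() and try_parse() are hypothetical helpers that mirror the conversion and the try/except above, and None stands in for a request with no JSON body.

```python
def parse_features(data):
    """Hypothetical helper that mirrors the conversion in predict_house_price()."""
    return [
        float(data.get("bedrooms", 0)),
        float(data.get("bathrooms", 0)),
        float(data.get("acre_lot", 0)),
    ]


def try_parse(data):
    """Return (result, status) the way the endpoint's try/except would."""
    try:
        return parse_features(data), 200
    except Exception as e:
        return str(e), 400


# Numeric strings convert cleanly.
print(try_parse({"bedrooms": "3", "bathrooms": 2, "acre_lot": 0.5}))
# An absent key falls back to the default of 0.
print(try_parse({"bedrooms": 3, "bathrooms": 2}))
# A value that is present but empty cannot be converted and raises ValueError.
print(try_parse({"bedrooms": "", "bathrooms": 2, "acre_lot": 0.5}))
# Assumption: None stands in for a missing body; None has no .get() method.
print(try_parse(None))
```

The conversion rejects values that are not numbers, while a key that is left out entirely is treated as 0 rather than as an error.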
  • In Postman, show URL request and Body requirements for GET, POST, and UPDATE methods.
  • In Postman, show the JSON response data for 200 success conditions on GET, POST, and UPDATE methods.

200 OK success shown on the GET and POST methods below.

The POST request to predict a home value needs body data for acre_lot, bathrooms, and bedrooms, and returns the predicted value.

The POST request to save settings needs body data for acre_lot, bathrooms, and bedrooms, and returns a confirmation to the user that the settings are saved.

The GET request doesn’t need any body data; it receives the saved settings as output, including the data for acre_lot, bathrooms, and bedrooms.

  • In Postman, show the JSON response for error for 400 when missing body on a POST request.

When the body doesn’t exist on a POST request, a 400 BAD REQUEST error is returned.

Even if just one input value in the body is missing, a 400 BAD REQUEST error is returned. A 200 success status and an output are only given if all body values are present.

  • In Postman, show the JSON response for error for 404 when providing an unknown user ID to a UPDATE request.

My code cannot produce this case. The id in the SQLite database is always set to 1 automatically. The user cannot change it, so a request can never refer to an unknown user ID and result in a 404 error.

@houseprice_api.route('/settings', methods=['POST'])
def save_settings():
    try:
        # Extract the data sent by the user.
        data = request.json
        
        # Get the input features from the data.
        bedrooms = float(data.get('bedrooms', 0))
        bathrooms = float(data.get('bathrooms', 0))
        acre_lot = float(data.get('acre_lot', 0))
        
        # Connect to the database.
        with sqlite3.connect(DATABASE) as conn:
            cursor = conn.cursor()
            
            # Insert or replace the settings in the database.
            cursor.execute('''INSERT OR REPLACE INTO settings (id, bedrooms, bathrooms, acre_lot) 
                              VALUES (1, ?, ?, ?)''', (bedrooms, bathrooms, acre_lot))
            
            # Commit the transaction.
            conn.commit()
        
        # Return a success message.
        return jsonify({'message': 'Settings saved successfully'}), 200
    except Exception as e:
        # If something goes wrong, send back an error message with status code 400 (Bad Request).
        return jsonify({'error': str(e)}), 400
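
Because the id is fixed at 1, INSERT OR REPLACE overwrites the same row on every save. A minimal sketch of that behavior follows, assuming an in-memory database in place of DATABASE. The save() and load() helpers are hypothetical, and load() is only a guess at how a read path could look, since the GET handler is not shown in this post.

```python
import sqlite3

# Assumption: an in-memory database stands in for the project's DATABASE file.
conn = sqlite3.connect(":memory:")
conn.execute('''CREATE TABLE IF NOT EXISTS settings (
                  id INTEGER PRIMARY KEY,
                  bedrooms REAL,
                  bathrooms REAL,
                  acre_lot REAL
                )''')


def save(bedrooms, bathrooms, acre_lot):
    # Same statement as save_settings(): id is always 1, so the row is overwritten.
    conn.execute('''INSERT OR REPLACE INTO settings (id, bedrooms, bathrooms, acre_lot)
                    VALUES (1, ?, ?, ?)''', (bedrooms, bathrooms, acre_lot))
    conn.commit()


def load():
    # Hypothetical read path; returns None when nothing has been saved yet.
    row = conn.execute(
        "SELECT bedrooms, bathrooms, acre_lot FROM settings WHERE id = 1"
    ).fetchone()
    if row is None:
        return None
    return {"bedrooms": row[0], "bathrooms": row[1], "acre_lot": row[2]}


save(3, 2, 0.5)
save(4, 3, 1.0)  # replaces the first save rather than adding a second row

row_count = conn.execute("SELECT COUNT(*) FROM settings").fetchone()[0]
print(row_count)
print(load())
```

After two saves the table still holds a single row, which is why only the latest settings are ever loaded.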

Frontend

Blog JavaScript API fetch code and formatting code to display JSON.

  • In Chrome inspect, show response of JSON objects from fetch of GET, POST, and UPDATE methods.
    • In the inspect window, the user input data is formatted into JSON and sent to the API, which runs it through the ML model and returns the prediction. The result is then formatted and shown on the frontend as output for the user. The API/JSON is essentially the “code in the middle”, formatting and passing data between the frontend and backend.
  • In the Chrome browser, show a demo (GET) of obtaining an Array of JSON objects that are formatted into the browser’s screen.
    • The ML prediction output is shown on the left in the Chrome Browser as it is the predicted price result given to the user. It is fetched from the API.
  • In JavaScript code, describe fetch and method that obtained the Array of JSON objects.

In this JavaScript code snippet, the fetch function makes a POST request to a specific URL (http://127.0.0.1:8059/api/houseprice/predict). The request headers declare that the content type of the data being sent is JSON ("Content-Type": "application/json"). The request body contains the data from the requestData variable, converted into a JSON string using JSON.stringify(requestData).

fetch("http://127.0.0.1:8059/api/houseprice/predict", {
    method: "POST",
    headers: {
        "Content-Type": "application/json",
    },
    body: JSON.stringify(requestData),
})
.then(response => response.json())

  • In the Chrome browser, show a demo (POST or UPDATE) gathering and sending input and receiving a response that show update. Repeat this demo showing both success and failure.

The screenshot below shows the success of the fetch function, which delivers an output to the user when the POST method receives all the user inputs it needs.

The below screenshot shows the failure of the fetch function along with an error message for the user when the POST method doesn’t have all the input it needs.

  • In JavaScript code, show code that performs iteration and formatting of data into HTML.
  • In JavaScript code, show and describe code that handles success. Describe how code shows success to the user in the Chrome Browser screen.
  • In JavaScript code, show and describe code that handles failure. Describe how the code shows failure to the user in the Chrome Browser screen.

The code iterates over the user inputs (acreLot, bedrooms, and bathrooms) to ensure they fall within specified ranges. If any input is outside its range, it displays a message asking the user to enter values within the suggested range. This is done using a for loop and a conditional if statement. The data is then placed in a JavaScript object called requestData, which is serialized to JSON and sent to the server for prediction using a POST request.

After sending the prediction request to the server, the code handles the success response by first parsing the response body as JSON (response.json()). It then extracts the predicted price from the response data and formats it using parseFloat(data.predicted_price).toLocaleString(). Finally, it displays the predicted price to the user by setting the inner text of the HTML element with the id “result” to “Predicted Price: $” followed by the formatted predicted price.

To handle failure, if an error occurs during the fetch request, the code catches the error in the .catch() block. It logs the error to the console for debugging purposes (console.error("Error:", error)), and then informs the user that an error occurred by setting the inner text of the HTML element with the id “result” to “An error occurred. Please try again.”

In the Chrome Browser screen, when the user interacts with the webpage and triggers the predictPrice() function, they will see one of two outcomes:

  1. If the inputs are within the specified ranges and the prediction is successful, they will see the predicted price displayed on the screen. (success)
  2. If any input is outside the range or an error occurs during the prediction process, they will see an error message indicating that something went wrong and they should try again. (failure)

function predictPrice() {
        // Create constants for each of the user input values
        const acreLot = parseInt(document.getElementById("acre_lot").value);
        const bedrooms = parseInt(document.getElementById("bedrooms").value);
        const bathrooms = parseInt(document.getElementById("bathrooms").value);

        // Create list for the 3 inputs
        const inputs = [acreLot, bedrooms, bathrooms];

        // Create a list with nested dictionaries for each variable along with its ranges
        const inputFields = [
            { id: "acre_lot", range: { min: 1, max: 1800 } },
            { id: "bedrooms", range: { min: 1, max: 11 } },
            { id: "bathrooms", range: { min: 1, max: 15 } }
        ];

        // Iterate through for loop along with conditional if statement to ensure inputs are within range
        for (let i = 0; i < inputs.length; i++) {
            const value = inputs[i];
            const range = inputFields[i].range; // Get the range from inputFields array
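            // NaN (an empty or non-numeric field) fails both comparisons, so check for it explicitly.
            if (Number.isNaN(value) || value < range.min || value > range.max) {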
                document.getElementById("result").innerText = "Please enter values within the suggested range.";
                return;
            }
        }

        const requestData = {
            "acre_lot": acreLot,
            "bedrooms": bedrooms,
            "bathrooms": bathrooms
        };
        console.log(JSON.stringify(requestData))
        fetch("http://127.0.0.1:8059/api/houseprice/predict", {
            method: "POST",
            headers: {
                "Content-Type": "application/json",
            },
            body: JSON.stringify(requestData),
        })
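        .then(response => {
            // A 400 response does not reject the fetch promise, so check the status explicitly.
            if (!response.ok) {
                throw new Error("Request failed with status " + response.status);
            }
            return response.json();
        })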
        .then(data => {
            const predictedPrice = parseFloat(data.predicted_price).toLocaleString(); // Add parseFloat() to ensure correct conversion to number before formatting
            document.getElementById("result").innerText = "Predicted Price: $" + predictedPrice;
        })
        .catch(error => {
            console.error("Error:", error);
            document.getElementById("result").innerText = "An error occurred. Please try again.";
        });
    }
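
The range check in predictPrice() pairs each input with its limits by index. The same idea can be sketched in Python, keyed by field name instead. This is an illustration only and not part of the project's backend: RANGES and out_of_range() are hypothetical names, and the limits are copied from the inputFields array above.

```python
# Assumption: the same limits as the inputFields array in predictPrice().
RANGES = {
    "acre_lot": (1, 1800),
    "bedrooms": (1, 11),
    "bathrooms": (1, 15),
}


def out_of_range(data):
    """Return the names of inputs that are missing, non-numeric, or outside their range."""
    bad = []
    for name, (low, high) in RANGES.items():
        try:
            value = float(data[name])
        except (KeyError, TypeError, ValueError):
            bad.append(name)
            continue
        # NaN fails both comparisons, so it is reported as well.
        if not (low <= value <= high):
            bad.append(name)
    return bad


print(out_of_range({"acre_lot": 2, "bedrooms": 3, "bathrooms": 2}))
print(out_of_range({"acre_lot": 0, "bedrooms": 12, "bathrooms": 2}))
```

Keying the limits by name avoids relying on two lists staying in the same order, and returning the offending names would let a caller say which field to correct.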

Algorithm Analysis

In the ML projects, there is a great deal of algorithm analysis. Think about preparing data and predictions.

  • Show algorithms and preparation of data for analysis. This includes cleaning, encoding, and one-hot encoding.
  • Show algorithms and preparation for predictions.
  • Discuss concepts and understanding of Linear Regression algorithms.
  • Discuss concepts and understanding of Decision Tree analysis algorithms.

The ML algorithm below prepares data for analysis through cleaning. Rows with missing values in the ‘price’ column are dropped using the dropna() function, so only rows with valid price data are kept for analysis. The feature columns are then converted to numbers, with any missing or invalid entries replaced by zero.

ML algorithm below prepares for predictions through model training/testing. The line ‘X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)’ splits the data into training and testing sets using the train_test_split() function from sklearn.model_selection. In this model, 70% of the data is used to train and the remaining 30% is used to test. This prepares the data for training and evaluating the model’s performance on predictions.

The ML algorithm below also prepares for predictions through random forest regression. A Random Forest Regressor model is created and trained using the training data. Random forests are an ensemble learning method based on decision trees and are capable of handling non-linear relationships between the features and the target variable. The model is then used to make predictions on the testing data.

Linear regression is not used in this ML algorithm; a Random Forest Regressor model is employed instead. Linear regression assumes a linear relationship between the independent and dependent variables, which makes it simpler to fit and interpret but less able to capture non-linear patterns than tree-based methods.

The ML algorithm below uses decision tree analysis within the Random Forest Regressor, which is made up of multiple decision trees, each trained on a random sample of the data. A decision tree splits the feature space into regions based on the feature values so as to minimize the variability of the target variable within each region. A random forest averages the predictions of its decision trees to improve prediction accuracy.
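
Before the full training script, the splitting idea can be shown on its own. This is a simplified, pure-Python sketch of how a regression tree could choose one split on one feature by minimizing the weighted variance of the target. It is not how scikit-learn is implemented internally; variance() and best_split() are hypothetical helpers and the toy data is made up.

```python
def variance(values):
    """Population variance; 0.0 for an empty list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(xs, ys):
    """Find the threshold on one feature that minimizes the weighted variance of y."""
    best_threshold, best_score = None, float("inf")
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = (len(left) * variance(left) + len(right) * variance(right)) / len(ys)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Made-up toy data: number of bedrooms and price.
beds = [1, 2, 2, 4, 5]
prices = [100_000, 120_000, 110_000, 300_000, 320_000]

threshold, score = best_split(beds, prices)
print(threshold, score)
```

On this toy data the chosen split separates the homes with at most 2 bedrooms from the rest, because that grouping leaves the prices on each side closest together.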

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Load data from a CSV file into a table-like structure called DataFrame
price_data = pd.read_csv('realtor-data.csv')

# Drop rows with missing values in 'price' column
price_data.dropna(subset=['price'], inplace=True)

# Select the columns 'bed', 'bath', 'acre_lot' to use for predicting 'price'.
# Convert their values to numbers and replace any missing or invalid entries with zero.
X = price_data[['bed', 'bath', 'acre_lot']].apply(pd.to_numeric, errors='coerce').fillna(0)
y = price_data['price'].apply(pd.to_numeric, errors='coerce').fillna(0)

# Split the data into two parts: one part for training the model, and one part for testing its predictions.
# Here, 70% of the data is used for training and 30% for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a machine learning model based on random forests, which is a method that uses multiple decision trees.
rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)

# Teach the model to predict 'price' using the training data.
rf_regressor.fit(X_train, y_train)

# Use the trained model to predict 'price' for the testing data.
y_pred = rf_regressor.predict(X_test)

# Calculate and print the mean squared error for the model's predictions.
# The mean squared error tells us how close the model's predictions are to the actual values, where lower numbers are better.
mse = mean_squared_error(y_test, y_pred)
print('Model MSE:', mse)
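
To make the metric concrete, the mean squared error can be computed by hand on a few values. This is a small sketch with made-up prices; mean_squared_error_by_hand() is a hypothetical helper that follows the definition of the metric rather than calling scikit-learn.

```python
def mean_squared_error_by_hand(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up prices for illustration only.
actual = [200_000, 350_000, 500_000]
predicted = [210_000, 340_000, 530_000]

mse = mean_squared_error_by_hand(actual, predicted)
rmse = mse ** 0.5  # square root puts the error back in the same units as the prices

print("MSE:", mse)
print("RMSE:", rmse)
```

Because the errors are squared, the MSE is in squared dollars and is dominated by the largest misses. Taking the square root gives a figure that can be read as a typical error in dollars.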

Deployment Lecture Notes

  • assets/js/config.js
    • Deployment URL and localhost backend URL in here
  • Nginx Methods
    • Allow-Credentials, Allow-Origin, Allow-Method adapted to my own system
  • Python CORS
    • CORS policies moved from main.py to __init__.py
  • Python Auth
    • @token_required should guard specific HTTP endpoints
    • @token_required function in authmiddleware.py file
  • Python CSRF
    • In __init__.py
  • Certbot HTTPS
    • Change from HTTP to HTTPS
    • I’ve done before
  • Need unguarded and guarded requests in CPT and DS projects
  • Requests need options, fetches need URIs
  • Handle standard errors in the backend