abbasali-io/README.md

Pinned

  1. Clean Property Types in the Data Frame
    def clean_property_types(propType):
        # Define the cleaned types without the extra details
        cleanTypes = [
            'Condominium',
            'Serviced Residence',
  2. Clean the Built_Size column: remove the 'sq. ft.' label and a few other variants of it from the data, then convert the string to a numeric format for calculations at a later stage
    import re

    # Convert Built_Size into numeric value
    def convert_built_size_numeric(bsize):
        try:
            if re.search(r"sq\.*\s*ft\.*", bsize) is None:
                return None
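The preview above is cut off after the label check. A minimal sketch of how such a conversion might be completed, assuming values look like `1,335 sq. ft.`; the function name `built_size_to_float`, the extended pattern, and the sample strings are hypothetical and not taken from the gist.

```python
import re

# Matches a number (optional thousands separators and decimals) followed by
# a "sq. ft." label or a close variant such as "sqft" or "sq ft".
_SQFT_PATTERN = re.compile(r"([\d,]+(?:\.\d+)?)\s*sq\.*\s*ft\.*", re.IGNORECASE)


def built_size_to_float(bsize):
    """Return the size in square feet as a float, or None if it cannot be parsed."""
    if not isinstance(bsize, str):
        return None  # e.g. NaN or None in the column
    match = _SQFT_PATTERN.search(bsize)
    if match is None:
        return None  # no recognisable "sq. ft." label
    try:
        return float(match.group(1).replace(",", ""))
    except ValueError:
        return None  # e.g. the "number" was only commas


print(built_size_to_float("1,335 sq. ft."))
print(built_size_to_float("850 sqft"))
print(built_size_to_float("unknown"))
```

Values written as dimensions (for example a width and a length) would need separate handling; this sketch only reads the number directly in front of the label.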
  3. Create the plot showing the distribution of properties by size in Kuala Lumpur
    # average built size (sq. ft.) per location, largest first
    all_property_sqft = df.groupby('Location')['Built_Size'].mean().sort_values(ascending=False)

    bx = all_property_sqft.plot(kind='bar', title="Property Size Distribution in Kuala Lumpur", figsize=(15,10), legend=True, fontsize=10, rot=90)
    bx.set_xlabel("Locations", fontsize=10)
  4. Split the Built Up Type & Built Up Area Size into two separate columns, i.e. Built_Type & Built_Size, and show the top 5 rows of the dataframe to view the effect
    # define the function to split Size into an array of two different values
    def split_property_size(size, tp=0):
        try:
            return size.split(":")[tp].strip()
        except AttributeError:
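The preview stops at the `except` clause, so the fallback behaviour is not visible. The sketch below shows one way the helper might behave on a single value; returning `None` on failure, also catching `IndexError`, and the sample string are assumptions, not the gist's code.

```python
def split_property_size(size, tp=0):
    """Return part `tp` of a "type : size" string, or None if it cannot be split."""
    try:
        return size.split(":")[tp].strip()
    except (AttributeError, IndexError):
        # Assumption: non-string values (e.g. NaN) and strings without
        # a ":" separator map to None.
        return None


sample = "Built-up : 1,335 sq. ft."  # hypothetical value of the Size column

built_type = split_property_size(sample, 0)
built_size = split_property_size(sample, 1)
print(built_type, "|", built_size)
```

Applied to a data frame, the same helper could feed two new columns, one call with `tp=0` and one with `tp=1`.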
  5. Plot the price per sq. ft. for Kuala Lumpur
    # create price per sqft
    df['Price_sqft'] = df['Price'] / df['Built_Size']

    # most expensive area by price per sqft
    dfc = df.copy(deep=True)
  6. Imputation of missing values with knn.
    import numpy as np
    import pandas as pd
    from collections import defaultdict
    from scipy.stats import hmean
    from scipy.spatial.distance import cdist
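Only the imports of the imputation gist are visible, so its actual method is unknown. As a rough illustration of the general idea of k-nearest-neighbour imputation, here is a NumPy-only sketch; the function `knn_impute_column`, its parameters, the plain-mean aggregation, and the toy data are all hypothetical, and it assumes the feature columns themselves have no missing values.

```python
import numpy as np


def knn_impute_column(features, target, k=3):
    """Fill NaNs in `target` with the mean target of the k nearest complete rows."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    filled = target.copy()

    known = ~np.isnan(target)
    if not known.any():
        return filled  # no observed values to borrow from
    known_idx = np.flatnonzero(known)
    k = min(k, known_idx.size)

    for i in np.flatnonzero(~known):
        # Euclidean distance from row i to every row with a known target.
        dists = np.linalg.norm(features[known_idx] - features[i], axis=1)
        nearest = known_idx[np.argsort(dists)[:k]]
        filled[i] = target[nearest].mean()
    return filled


# Toy data: one feature column, one target value missing.
X = [[0.0], [1.0], [2.0], [10.0]]
y = [1.0, np.nan, 3.0, 100.0]
result = knn_impute_column(X, y, k=2)
print(result)
```

The missing entry is filled from its two closest neighbours only, so the distant fourth row does not influence it.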