Credit Scoring Analytics for Credit Card Product using Logistic Regression Algorithms

1. Introduction & Background

A. Introduction

In the contemporary financial landscape, the effective management of credit risk is imperative for the sustained growth and stability of banking institutions. Amara Bank, a prominent player in the banking sector, acknowledges the pivotal role of its Credit Card Product division in catering to the diverse financial needs of its clientele. To ensure the continued viability and profitability of its credit card services, Amara Bank's Credit Card Product division endeavors to develop a sophisticated credit scoring model using Logistic Regression.

The pervasive nature of credit card usage in the modern economy has underscored the necessity for Amara Bank to accurately assess the creditworthiness of its prospective and existing cardholders. As the banking industry undergoes rapid digital transformation, the demand for efficient and secure credit card services has become more pronounced. Consequently, Amara Bank's Credit Card Product division recognizes the urgency to leverage advanced analytics and statistical modeling techniques to refine its credit evaluation process, thereby minimizing the risks associated with extending credit facilities.

This journal seeks to outline the comprehensive business understanding behind the development of a credit scoring model tailored explicitly for Amara Bank's Credit Card Product division. By elucidating the underlying business context, objectives, constraints, stakeholders, and success criteria, this study aims to establish a holistic framework that aligns the technical intricacies of Logistic Regression modeling with the overarching business imperatives of Amara Bank. Through this comprehensive understanding, the Credit Card Product division endeavors to fortify its risk management practices, enhance customer satisfaction, and sustain a competitive edge in the dynamic credit card market landscape.

B. Business Context

The Credit Card Product division of Amara Bank operates within an increasingly dynamic and competitive financial services landscape. As digitalization continues to reshape the banking industry, the demand for convenient, secure, and seamless credit card services has witnessed an unprecedented surge. The division operates in an environment characterized by evolving consumer behaviors, changing regulatory landscapes, and heightened expectations for personalized and user-friendly financial experiences.

Amara Bank's Credit Card Product division acknowledges the significance of maintaining a balanced credit card portfolio to safeguard the institution from potential financial risks and losses. The proliferation of digital payment solutions and the growing preference for cashless transactions have amplified the importance of ensuring the responsible and judicious extension of credit to cardholders. Against the backdrop of increasing incidents of fraudulent activities and identity theft, the division is compelled to enhance its risk management practices to safeguard both the interests of the bank and the financial well-being of its customers.

Furthermore, competition within the credit card market requires Amara Bank's Credit Card Product division to differentiate its offerings through a combination of attractive rewards programs, competitive interest rates, and, most importantly, a seamless and efficient credit approval process. The development of an advanced credit scoring model using Logistic Regression is therefore a strategic imperative for the division, enabling it to use data-driven insights to make well-informed credit decisions while maintaining customer satisfaction and loyalty.

The division recognizes that in the current era of heightened digital connectivity, a robust credit scoring model is not only a tool for risk assessment but also a key enabler for fostering enduring customer relationships. By leveraging advanced analytics and machine learning techniques, the division aims to strike a delicate balance between prudent risk management and the facilitation of seamless and personalized credit experiences for its diverse customer base. This business context underscores the critical role of the credit scoring model in fortifying Amara Bank's position as a customer-centric and technologically adept financial institution within the competitive credit card market.

C. Business Objective

The development of an advanced credit scoring model using Logistic Regression within Amara Bank's Credit Card Product division is driven by a multifaceted set of business objectives, aimed at enhancing risk management practices, optimizing credit card approval processes, and fostering a culture of data-driven decision-making.

Main Objectives:

• Minimize Non-Performing Loans (NPL) and Bad Debt Ratio:

Amara Bank's Credit Card Product division is committed to minimizing the incidence of Non-Performing Loans and reducing the overall Bad Debt Ratio within its credit card portfolio. By implementing a robust credit scoring model, the division aims to accurately assess the creditworthiness of applicants, thereby mitigating the risk of default and delinquency. This objective aligns with the division's strategic focus on maintaining a healthy and sustainable credit card portfolio, ensuring long-term profitability, and preserving the trust and confidence of its stakeholders.

Metric 1: Reduction in Non-Performing Loans (NPL) by at least 15% within the first year of implementing the credit scoring model.

Metric 2: Decrease in the Bad Debt Ratio by a minimum of 10% within the initial two years post-implementation.

• Increase Approval Rates and Market Share:

The division aims to leverage the insights derived from the credit scoring model to streamline and expedite the credit approval process. By accurately identifying creditworthy applicants, the division seeks to increase its approval rates, thereby expanding its customer base and bolstering its market share within the competitive credit card market. This objective reflects the division's commitment to fostering inclusive and accessible financial services while capitalizing on growth opportunities and strengthening its position as a market leader in the credit card segment.

Metric 3: Increase in the Credit Card Approval Rate by 20% in the first six months after the deployment of the credit scoring model.

Metric 4: Expansion of the Market Share within the credit card segment by 8% within the first year of adopting the credit scoring model.

• Transition from Manual Lending to Scorecards:

To enhance operational efficiency and ensure consistency in credit decision-making, the division aims to transition from traditional manual lending practices to automated credit scoring methodologies. By adopting a data-driven approach to credit assessment through the utilization of scorecards, the division seeks to standardize and optimize its decision-making processes, thereby reducing the reliance on subjective evaluations and manual underwriting. This objective underscores the division's commitment to embracing technological advancements and fostering a culture of data-driven decision-making, ensuring greater accuracy, efficiency, and transparency in the credit approval process.

Metric 5: Reduction in the average time taken for credit assessment and approval by 30% within the initial three months of transitioning to the credit scoring model.

Metric 6: Implementation of the credit scoring model in at least 90% of credit card application evaluations within the first year, facilitating a comprehensive shift from manual lending practices to automated scorecard-based assessments.

By setting these specific target numbers for the key performance metrics, the Credit Card Product division of Amara Bank can effectively gauge the success and impact of the credit scoring model's implementation. These measurable KPIs not only provide a clear benchmark for evaluating the model's effectiveness but also serve as guiding parameters for the division to track its progress in achieving the outlined business objectives and strategic goals.
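
Tracking these KPIs comes down to simple arithmetic. The sketch below assumes each percentage is a relative change against a pre-deployment baseline (the targets above do not say whether they are relative changes or percentage points), and all figures in it are hypothetical.

```python
def relative_change(baseline: float, current: float) -> float:
    """Return the relative change from baseline to current (0.10 means +10%)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline


def meets_target(baseline: float, current: float, target: float) -> bool:
    """Check a KPI against a signed relative target.

    A negative target (e.g. -0.15) means "reduce by at least 15%";
    a positive target (e.g. 0.20) means "increase by at least 20%".
    """
    change = relative_change(baseline, current)
    return change <= target if target < 0 else change >= target


# Hypothetical figures, for illustration only.
# Metric 1: NPL ratio falls from 4.0% to 3.3% (a 17.5% relative reduction).
npl_ok = meets_target(baseline=0.040, current=0.033, target=-0.15)
# Metric 3: approval rate rises from 50% to 58% (a 16% relative increase).
approval_ok = meets_target(baseline=0.50, current=0.58, target=0.20)

print(npl_ok, approval_ok)
```

With these made-up numbers the NPL target is met and the approval-rate target is not, which shows why the baseline and the interpretation of each percentage should be agreed on before deployment.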

D. Deliverables:

• Customized Credit Scoring Model:

Development of a tailored Logistic Regression-based credit scoring model designed specifically for the Credit Card Product division of Amara Bank. Comprehensive documentation outlining the model's variables, parameters, and underlying assumptions for transparent and auditable credit evaluations.

• Performance Evaluation Framework:

Establishment of a robust performance evaluation framework to monitor the effectiveness and accuracy of the credit scoring model in predicting credit risk and minimizing defaults. Regular performance reports highlighting the model's predictive power, stability, and calibration, enabling continuous refinement and optimization of the scoring model.

• Implementation Strategy and Training Program:

Detailed implementation strategy outlining the phased deployment of the credit scoring model within the existing credit card approval process of Amara Bank's Credit Card Product division. Comprehensive training program for relevant stakeholders, including credit analysts, underwriters, and decision-makers, to ensure seamless integration and effective utilization of the new credit scoring model.

• Change Management Plan:

Structured change management plan to facilitate a smooth transition from manual lending practices to automated scorecard-based assessments within the Credit Card Product division. Communication strategy outlining the key benefits and implications of the new credit scoring model to foster a culture of data-driven decision-making and enhance organizational readiness for technological advancements.

• Compliance Documentation and Regulatory Alignment:

Compliance documentation ensuring adherence to relevant regulatory guidelines and data privacy standards in the development and deployment of the credit scoring model. Alignment with industry best practices and regulatory requirements to foster transparency, fairness, and accountability in the credit evaluation process, thereby instilling trust and confidence among stakeholders and customers.

These deliverables serve as essential components for the successful development, implementation, and integration of the credit scoring model within the Credit Card Product division of Amara Bank, fostering a culture of data-driven decision-making and reinforcing the division's commitment to prudent risk management and customer-centric financial services.
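
For readers less familiar with the technique named in the first deliverable, the standard logistic regression form is given below. The notation is chosen here for illustration and is not taken from the project's model documentation: $x_j$ are applicant characteristics, $\beta_j$ are fitted coefficients, and $p$ is the probability of a 'bad' outcome.

```latex
\ln\frac{p}{1-p} = \beta_0 + \sum_{j=1}^{k} \beta_j x_j
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{j=1}^{k} \beta_j x_j\right)\right)}
```

The left-hand form (log-odds linear in the characteristics) is what makes the model straightforward to document and audit variable by variable.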

  2. Dataset & Features

The analysis uses a dataset with the following features:

  • Age : Represents the age of the person
  • Occupation : Represents the occupation of the person
  • Annual_Income : Represents the annual income of the person
  • Monthly_Inhand_Salary : Represents the monthly base salary of a person
  • Num_Bank_Accounts : Represents the number of bank accounts a person holds
  • Num_Credit_Card : Represents the number of other credit cards held by a person
  • Interest_Rate : Represents the interest rate on credit card
  • Num_of_Loan : Represents the number of loans taken from the bank
  • Type_of_Loan : Represents the types of loan taken by a person
  • Delay_from_due_date : Represents the average number of days delayed from the payment date
  • Num_of_Delayed_Payment : Represents the average number of payments delayed by a person
  • Changed_Credit_Limit : Represents the percentage change in credit card limit
  • Num_Credit_Inquiries : Represents the number of credit card inquiries
  • Credit_Mix : Represents the classification of the mix of credits
  • Outstanding_Debt : Represents the remaining debt to be paid (in USD)
  • Credit_Utilization_Ratio : Represents the utilization ratio of the credit card
  • Credit_History_Age : Represents the age of credit history of the person
  • Payment_of_Min_Amount: Represents whether only the minimum amount was paid by the person
  • Total_EMI_per_month : Represents the monthly EMI payments (in USD)
  • Amount_invested_monthly: Represents the monthly amount invested by the customer (in USD)
  • Payment_Behaviour : Represents the payment behavior of the customer
  • Monthly_Balance : Represents the monthly balance amount of the customer (in USD)

The original dataset is available at: https://www.kaggle.com/datasets/parisrohan/credit-score-classification/data
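
As a rough illustration of how rows with these features might be read, the sketch below parses a CSV with the standard library and converts a few numeric columns. The function names, the choice of numeric columns, and the sample values are assumptions made for this example; the in-memory sample stands in for the downloaded file.

```python
import csv
import io

# Columns treated as numeric in this sketch (a subset of the feature list above).
NUMERIC_COLUMNS = ("Age", "Annual_Income", "Num_Credit_Card", "Outstanding_Debt")


def to_float(raw):
    """Convert a raw CSV cell to float, returning None when it cannot be parsed."""
    try:
        return float(raw.strip())
    except (ValueError, AttributeError):
        return None


def load_rows(handle):
    """Read applicant rows from a CSV file object and parse the numeric columns."""
    rows = []
    for record in csv.DictReader(handle):
        for column in NUMERIC_COLUMNS:
            if column in record:
                record[column] = to_float(record[column])
        rows.append(record)
    return rows


# Tiny in-memory sample (values are made up for illustration).
sample = io.StringIO(
    "Age,Occupation,Annual_Income,Num_Credit_Card,Outstanding_Debt\n"
    "23,Scientist,19114.12,4,809.98\n"
    "n/a,Teacher,34847.84,3,605.03\n"
)
rows = load_rows(sample)
print(rows[0]["Age"], rows[1]["Age"])
```

Unparseable cells become `None` rather than raising, so data quality problems stay visible for the later cleaning and exploratory steps.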

  3. Scorecard Development

The data acquisition process plays a critical role in the development of the credit scoring model for the Credit Card Product division of Amara Bank. The following are the key components involved in the data acquisition stage:

A. Data Acquisition

Application Form:

The application form serves as a primary source of customer-provided data, capturing essential information such as personal details, employment history, income, and other relevant financial information. This data provides insights into the applicant's financial standing, employment stability, and credit requirements, forming the initial dataset for the credit evaluation process.

Credit Bureau:

The Credit Bureau data encompasses information obtained from various external sources, including SLIK (Sistem Layanan Informasi Keuangan), BI Checking (Bank Indonesia Checking), and APPI Checking (Asosiasi Perusahaan Pembiayaan Indonesia Checking). These sources provide comprehensive credit information, including the applicant's credit history, outstanding debts, existing credit accounts, and repayment patterns, enabling a thorough assessment of the applicant's creditworthiness and risk profile.

Internal System - Behavioral Data:

Behavioral data obtained from internal systems includes a comprehensive record of the applicant's interactions with Amara Bank, particularly within the Credit Card Product division. This data encompasses the applicant's historical credit card usage patterns, transactional behavior, repayment history, and any instances of repeat or additional orders. The analysis of this data provides valuable insights into the applicant's credit utilization habits and repayment discipline, contributing to a holistic assessment of the applicant's credit risk profile.

Performance Data:

Performance data refers to the outcome data associated with each credit application, encompassing critical information such as the applicant's payment history, instances of default, frequency of late payments, and any history of delinquency. This data enables the division to assess the applicant's credit performance and behavior, facilitating the identification of potential high-risk applicants and the establishment of risk mitigation strategies to minimize the incidence of non-performing loans and defaults within the credit card portfolio.

By effectively acquiring and analyzing these diverse data sources, the Credit Card Product division of Amara Bank can construct a comprehensive and multidimensional dataset, laying the groundwork for the subsequent development and implementation of the credit scoring model using Logistic Regression.
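
One way to picture the combined dataset is a single flat record per applicant, assembled from the four sources described above. This is a minimal sketch: the key prefixes, field names, and sample values are hypothetical, and a real pipeline would join on whatever identifier the bank's systems share.

```python
def build_modeling_record(applicant_id, application, bureau, behavioral, performance):
    """Merge the four data sources into one flat record, prefixing keys by source."""
    sources = {
        "app": application,
        "bureau": bureau,
        "behav": behavioral,
        "perf": performance,
    }
    record = {"applicant_id": applicant_id}
    for prefix, table in sources.items():
        # A new applicant may have no bureau or internal history;
        # in that case no keys are added for that source.
        for key, value in table.get(applicant_id, {}).items():
            record[f"{prefix}_{key}"] = value
    return record


# Hypothetical data keyed by applicant ID.
application = {"A1": {"income": 5000, "occupation": "Teacher"}}
bureau = {"A1": {"outstanding_debt": 1200}}
behavioral = {}  # no prior relationship with the bank
performance = {"A1": {"worst_days_past_due": 0}}

record = build_modeling_record("A1", application, bureau, behavioral, performance)
print(record)
```

Prefixing each field with its source keeps the provenance of every variable visible, which helps when documenting the model for audit.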

B. Good/Bad Definition

Observation/Selection Statuses:

Observation Exclude:

This status involves the exclusion of certain observations from the dataset, primarily to address sample bias issues. These exclusions may arise due to data inconsistencies, data quality issues, or other sample selection biases that could potentially skew the analysis and compromise the accuracy of the credit scoring model.

Reject:

The 'Reject' status refers to cases where applicants are not selected for credit approval, as determined by the lender. These cases represent instances where the lender deems the applicant to be ineligible for credit, typically due to factors such as a high-risk profile, poor credit history, or inadequate financial standing.

Not Taken Up (NTU):

'Not Taken Up' signifies cases where applicants are selected for credit but do not proceed with the credit facility, despite being approved by the lender. This status often arises when the borrower chooses not to utilize the credit product, even after meeting the lender's criteria for approval.

Mutual Accept:

'Mutual Accept' represents cases where the selected applicants have accepted and utilized the credit facility provided by the lender, either responsibly or in a manner that indicates potential credit misuse or abuse.

Outcome/Performance Statuses:

Good:

The 'Good' status signifies desirable outcomes, such as timely payments, responsible credit utilization, and a positive credit performance, reflecting a borrower's ability to manage credit responsibly and meet their financial obligations effectively.

Bad:

The 'Bad' status refers to undesirable outcomes, including instances of default, delinquency, or any other adverse credit events that indicate a high level of credit risk and potential financial instability.

Indeterminate:

The 'Indeterminate' status represents outcomes that fall between the 'Good' and 'Bad' categories, indicating a moderate level of credit risk or uncertainty in the borrower's credit behavior. This optional category allows for the nuanced classification of cases that do not distinctly fit into either the 'Good' or 'Bad' classification.

Excludes:

The 'Excludes' category encompasses any outcomes that lie outside the intended scope of the credit scoring model, such as operational-risk events or fraudulent activities that may impact the overall risk assessment but are not directly related to the borrower's credit behavior.

In the final scorecard development, the use of the 'Good' and 'Bad' classifications enables the Credit Card Product division to establish clear benchmarks for evaluating the creditworthiness of applicants, facilitating more accurate risk assessments and informed credit decisions.
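
The statuses above can be expressed as a small labeling routine. The sketch below is illustrative only: the days-past-due cut-offs (90 and 30), the `fraud_flag` input, and all function names are assumptions, since the text does not fix a concrete Good/Bad rule. The 'Observation Exclude' status is not modeled here.

```python
from enum import Enum


class Selection(Enum):
    REJECT = "reject"
    NOT_TAKEN_UP = "not_taken_up"
    MUTUAL_ACCEPT = "mutual_accept"


class Outcome(Enum):
    GOOD = "good"
    BAD = "bad"
    INDETERMINATE = "indeterminate"
    EXCLUDE = "exclude"


# Hypothetical cut-offs on the worst days past due within the performance window.
BAD_DPD = 90
GOOD_DPD = 30


def selection_status(approved: bool, activated: bool) -> Selection:
    """Map the lender's decision and the applicant's take-up to a selection status."""
    if not approved:
        return Selection.REJECT
    return Selection.MUTUAL_ACCEPT if activated else Selection.NOT_TAKEN_UP


def outcome_status(worst_dpd: int, fraud_flag: bool = False) -> Outcome:
    """Map observed performance to an outcome status."""
    if fraud_flag:
        return Outcome.EXCLUDE  # outside the scope of credit behavior
    if worst_dpd >= BAD_DPD:
        return Outcome.BAD
    if worst_dpd < GOOD_DPD:
        return Outcome.GOOD
    return Outcome.INDETERMINATE


def modeling_target(approved, activated, worst_dpd, fraud_flag=False):
    """Return 1 for bad, 0 for good, or None when the case has no Good/Bad label."""
    if selection_status(approved, activated) is not Selection.MUTUAL_ACCEPT:
        return None  # rejects and NTU cases have no observed performance
    outcome = outcome_status(worst_dpd, fraud_flag)
    if outcome is Outcome.BAD:
        return 1
    if outcome is Outcome.GOOD:
        return 0
    return None


print(modeling_target(True, True, 120), modeling_target(True, True, 45))
```

Only accounts that were accepted, taken up, and clearly good or bad receive a target, which mirrors the use of the two classifications in the final scorecard.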

C. Performance and Sample Windows

Performance Window:

The performance window is the time frame over which each account's credit behavior is monitored in order to assign its performance outcome (for example, 'good' or 'bad'). Evaluating behavior over a defined period makes it possible to identify patterns, trends, and potential risks in the borrower's repayment and credit utilization habits.

Sample Window:

The sample window denotes the specific time frame from which known 'good' and 'bad' cases are selected for the sample dataset. This window serves as the basis for selecting historical data to train and validate the credit scoring model, facilitating the identification of key variables and patterns that contribute to the differentiation between creditworthy and high-risk applicants.

Furthermore, there are several approaches to determining the sample and performance windows, including:

Basel II:

This approach, as defined by the Basel II framework, emphasizes the use of historical data and specific time frames to assess credit risk and establish adequate capital requirements, ensuring that financial institutions maintain sufficient capital reserves to cover potential credit losses within a defined time horizon.

"Decision Horizon" Approach from IFRS 9:

The "Decision Horizon" approach, as outlined by the International Financial Reporting Standards (IFRS) 9, emphasizes the evaluation of the credit performance of accounts within a specific time frame to assess expected credit losses accurately and facilitate effective risk management practices within the banking sector.

Portfolio Maturity Using Cohort/Vintage Analysis:

This approach involves the analysis of portfolio maturity through cohort or vintage analysis, enabling the identification of specific cohorts or groups of accounts with similar origination dates to assess their credit performance and behavior over time. This analysis aids in the understanding of the credit risk dynamics associated with different borrower segments within the portfolio.

By leveraging these diverse approaches, the Credit Card Product division of Amara Bank can effectively determine the performance and sample windows, enabling the accurate assessment of credit risk, the identification of key trends, and the development of a robust credit scoring model tailored to the unique dynamics of the credit card market.
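
The cohort/vintage approach can be sketched as a cumulative bad rate per origination cohort and month on book. This is a simplified illustration under stated assumptions: every cohort is observed for at least `max_months`, the input format `(cohort, months_to_bad)` is invented for this example, and the sample accounts are made up.

```python
from collections import defaultdict


def vintage_curves(accounts, max_months):
    """Cumulative bad rate per origination cohort and months on book.

    Each account is (cohort, months_to_bad), where months_to_bad is None
    for accounts that never went bad in the observed period.
    """
    totals = defaultdict(int)
    bad_by_month = defaultdict(lambda: [0] * (max_months + 1))
    for cohort, months_to_bad in accounts:
        totals[cohort] += 1
        if months_to_bad is not None and months_to_bad <= max_months:
            bad_by_month[cohort][months_to_bad] += 1

    curves = {}
    for cohort, total in totals.items():
        running, curve = 0, []
        for month in range(max_months + 1):
            running += bad_by_month[cohort][month]
            curve.append(running / total)  # share of the cohort bad by this month
        curves[cohort] = curve
    return curves


# Hypothetical accounts: (origination cohort, months on book when first bad).
accounts = [
    ("2023-01", 3), ("2023-01", None), ("2023-01", 6), ("2023-01", None),
    ("2023-02", 2), ("2023-02", None),
]
curves = vintage_curves(accounts, max_months=6)
for cohort, curve in sorted(curves.items()):
    print(cohort, curve)
```

The month on book at which these curves flatten is a common indication of how long the performance window needs to be for the bad rate to be considered mature.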

D. Data Sampling

Sample Types:

Development Set:

The development set comprises a significant portion, typically 70 to 80 percent, of the total sample, and is used to construct and develop each scorecard within the credit scoring model. This set allows for the iterative refinement and optimization of the model, enabling the identification of key variables and patterns that contribute to the accurate differentiation between 'good' and 'bad' credit applicants.

Validation Set:

The validation set constitutes the remaining 20 to 30 percent of the sample and is utilized to independently test and validate the performance and predictive accuracy of the scorecard. This set enables the division to assess the model's generalizability and robustness, ensuring that the developed model exhibits consistent performance and accuracy across diverse datasets and real-world scenarios.

Sample Sizes:

According to Anderson, the literature on credit scoring recommends a minimum sample size of:

  • 1,500 'bads'
  • 1,500 'goods'
  • 1,000 'rejects'

These sample size requirements ensure an adequate representation of both high-risk and low-risk credit applicants, as well as rejected applicants, facilitating a comprehensive analysis of the credit risk dynamics within the Credit Card Product division's portfolio.
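
A quick check against these minimums might look like the sketch below. The label strings and the function name are assumptions for illustration, and the example counts are made up.

```python
from collections import Counter

# Minimum counts quoted in the text above.
MINIMUM_COUNTS = {"bad": 1500, "good": 1500, "reject": 1000}


def sample_size_shortfalls(labels):
    """Return {class: missing_count} for each class below its minimum."""
    counts = Counter(labels)
    return {
        name: minimum - counts[name]
        for name, minimum in MINIMUM_COUNTS.items()
        if counts[name] < minimum
    }


# Hypothetical sample: enough goods and rejects, too few bads.
labels = ["good"] * 5000 + ["bad"] * 1200 + ["reject"] * 1000
shortfalls = sample_size_shortfalls(labels)
print(shortfalls)
```

An empty result means every class meets its minimum; otherwise the sample window may need to be widened to collect more cases of the short class.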

Sampling Methods:

Simple Random Sample:

The simple random sampling method involves the random selection of observations from the population, ensuring that each element within the population has an equal chance of being selected for the sample. This method enables the division to obtain a representative sample and minimize potential sampling biases or distortions within the dataset.

Stratified Random Sample:

The stratified random sampling method involves the division of the population into distinct strata or subgroups based on specific characteristics or attributes, followed by the random selection of observations from each stratum. This method allows for the representation of diverse borrower segments within the sample, ensuring that the model adequately captures the unique credit risk profiles of different customer segments.

By leveraging these sample types, sample sizes, and sampling methods, the Credit Card Product division of Amara Bank can construct robust and reliable datasets for the development and validation of the credit scoring model, facilitating accurate risk assessments and informed credit decision-making within the dynamic credit card market landscape.
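
The development/validation split and stratified sampling described above can be combined in one routine. This is a minimal standard-library sketch, not the project's actual procedure: the 70/30 fraction is taken from the lower end of the range quoted above, while the function name, the seed, and the sample records are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(records, stratum_of, dev_fraction=0.7, seed=42):
    """Split records into development and validation sets, stratum by stratum."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_stratum = defaultdict(list)
    for record in records:
        by_stratum[stratum_of(record)].append(record)

    development, validation = [], []
    for members in by_stratum.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # random selection within each stratum
        cut = round(len(shuffled) * dev_fraction)
        development.extend(shuffled[:cut])
        validation.extend(shuffled[cut:])
    return development, validation


# Hypothetical sample: 100 'bad' and 900 'good' accounts.
records = [{"id": i, "label": "bad" if i % 10 == 0 else "good"} for i in range(1000)]
dev, val = stratified_split(records, stratum_of=lambda r: r["label"])

bad_in_dev = sum(1 for r in dev if r["label"] == "bad")
print(len(dev), len(val), bad_in_dev)
```

Stratifying on the Good/Bad label keeps the bad rate the same in both sets, which matters when bads are a small share of the portfolio; stratifying on a customer segment instead would follow the same pattern with a different `stratum_of` function.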

In the process of developing a robust credit scoring model for the Credit Card Product division of Amara Bank, preventing leakage of test set information and conducting exploratory data analysis (EDA) on the training set are critical steps to ensure the model's predictive performance on unseen data. The following elaborates on the key actions taken during the EDA phase: