Assignment 1
Please address the following:
Summary of assignment – Using Sample Mortgage Origination data, perform the following:
- Perform descriptive statistics:
Distribution of Original LTV by FICO
Distribution of CLTV by FICO
Distribution of DTI by Orig_UPB
Distribution of LTV by Orig_UPB
Distribution of FICO by Orig UPB
Distribution of Top 5 States by Orig UPB
Distribution of Top 5 Sellers by Orig UPB
Distribution of Top 5 MSA Codes by Orig UPB
Distribution of Borrower Count by Orig UPB
Distribution of Int rate by Orig UPB
Distribution of MI Pct by Orig UPB
Distribution of Property_type by Orig UPB
Distribution of loan Term by Orig UPB
Distribution of Count_units by Orig UPB
Distribution of loan_purpose by Orig UPB
Distribution of Occupancy Status by Orig UPB
Distribution of UPB Category by Orig UPB
Distribution of First time Homebuyer Flag by Orig UPB
Distribution of Prepayment penalty by Orig UPB
Unknown population as a percentage of DTI, FICO, LTV & MI
Minimum and maximum values of CLTV, Int Rt, Term, DTI, FICO, LTV & MI
- Hypothesis: There is a positive correlation between LTV and MI and a negative correlation between FICO and Int Rt. Test the hypothesis. If necessary, remove null results and/or outliers to improve the regression results. Explain the results and their implications.
3) Description of the model(s) used
4) Model output, do-file, and explanation of results
5) Conclusion
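The hypothesis test described above comes down to dropping records with unknown values and checking the sign of the correlation for each pair of variables. The following is a minimal sketch in Python, for illustration only (the assignment itself asks for a do-file). The records, the field names (`fico`, `int_rt`, `ltv`, `mi`), and the helper functions are all made up for the example and do not come from the sample data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def complete_pairs(rows, x_key, y_key):
    """Keep only rows where both fields are known (None marks an unknown value)."""
    pairs = [(r[x_key], r[y_key]) for r in rows
             if r[x_key] is not None and r[y_key] is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Hypothetical records, invented for this example only.
loans = [
    {"fico": 620, "int_rt": 7.250, "ltv": 95, "mi": 30},
    {"fico": 680, "int_rt": 6.750, "ltv": 90, "mi": 25},
    {"fico": 720, "int_rt": 6.375, "ltv": 85, "mi": 12},
    {"fico": 760, "int_rt": 6.000, "ltv": 80, "mi": None},   # unknown MI
    {"fico": None, "int_rt": 6.500, "ltv": 97, "mi": 35},    # unknown FICO
]

r_ltv_mi = pearson(*complete_pairs(loans, "ltv", "mi"))
r_fico_rt = pearson(*complete_pairs(loans, "fico", "int_rt"))
print(f"LTV vs MI:      r = {r_ltv_mi:+.3f}")
print(f"FICO vs Int Rt: r = {r_fico_rt:+.3f}")
```

A positive `r_ltv_mi` and a negative `r_fico_rt` would be consistent with the hypothesis; a regression with significance tests is still needed to explain the results.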
Information on Sample data
The information provided in this document serves as a reference tool for understanding the data included in the Single Family Loan-Level data.
The sample contains 2005-2006 loan-level credit performance data on fully amortizing 30-year fixed-rate single-family mortgages.
Sample loan-level data files are flat text files, where the content is pipe (“|”) delimited.
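As a rough illustration of reading such a file, the sketch below parses pipe-delimited text with Python's standard `csv` module. The column names (`fico`, `orig_upb`, `ltv`) and the two records are assumptions made up for the example; the real header row should be taken from the data file itself.

```python
import csv
import io

# Hypothetical extract: a header row plus two records, pipe-delimited.
# The second record has a FICO field of three spaces.
SAMPLE = "fico|orig_upb|ltv\n720|150000|80\n   |98000|95\n"


def read_pipe_delimited(text_stream):
    """Return the records of a pipe-delimited stream as a list of dicts."""
    reader = csv.DictReader(text_stream, delimiter="|")
    return list(reader)


rows = read_pipe_delimited(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

For a file on disk, the same function can be given an open file object, for example `open(path, newline="")`. Fields are returned as strings, so blank or space-filled values still need to be handled before converting to numbers.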
Data File Layout
The Origination and Monthly Performance Data File Layout section in this document provides additional information on each of the data elements contained in the loan-level dataset files. The information is structured as follows:
| Field | Description |
|---|---|
| Column Name[1] | The abbreviated name of the data element that appears as the header row in each data file. |
| Formal Name and Definition | Name and definition of the loan-level data element. |
| Valid Values/Calculations | Allowable values of the specific data field and the calculations used (if applicable). |
| Type (Data Type) | The type of data found in each column: Alpha (contains only letters); Alpha-numeric (contains letters and numbers); Numeric (contains only numbers); Date (represents a specific date; Y = Year, M = Month). Example: YYYYMM (201207) = July 2012 |
| Length | The maximum number of characters allowed in the data field. |
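The Date type described above can be turned into a regular date value once a day is chosen. The sketch below is one possible approach; the function name `parse_yyyymm` and the choice to use the first day of the month are assumptions, since the files carry only the year and month.

```python
from datetime import date


def parse_yyyymm(value):
    """Convert a YYYYMM string such as '201207' to a date on the 1st of that month."""
    text = value.strip()
    if len(text) != 6 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected six digits in YYYYMM form, got {value!r}")
    year, month = int(text[:4]), int(text[4:])
    # date() raises ValueError itself if the month is outside 1-12.
    return date(year, month, 1)


print(parse_yyyymm("201207"))  # 2012-07-01
```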
| Data Element | File Type | Valid Values | If Not Valid |
|---|---|---|---|
| Credit Score (FICO) | Origination | 300-850 | Space (3) |
| Mortgage Insurance Percentage (MI %) | Origination | 1%-55% | Space (3) |
| Original Debt-to-Income Ratio (DTI) | Origination | 0% | Space (3) |
| Original Loan-to-Value Ratio (LTV) | Origination | 6%-105% | Space (3) |
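The valid-value rules above suggest a way to measure the unknown population asked for in the assignment: treat a field of spaces as unknown and report the share of such records. The sketch below is one hedged reading. The names `VALID_RANGES`, `clean_value`, and `unknown_share` are hypothetical, DTI is left out because its valid range is incomplete here, and treating every out-of-range value as unknown is an assumption rather than a documented rule.

```python
# Ranges copied from the valid-values table; keys are hypothetical field names.
# DTI is omitted because only a lower bound of 0% appears in the table.
VALID_RANGES = {
    "fico": (300, 850),
    "mi_pct": (1, 55),
    "ltv": (6, 105),
}


def clean_value(field, raw):
    """Return the field as an int, or None if it is blank, non-numeric, or out of range."""
    text = raw.strip()
    if not text:
        return None  # invalid values are delivered as three spaces
    try:
        value = int(text)
    except ValueError:
        return None
    low, high = VALID_RANGES[field]
    # Assumption: anything outside the listed range counts as unknown.
    return value if low <= value <= high else None


def unknown_share(field, raw_values):
    """Percentage of values that are unknown for the given field."""
    cleaned = [clean_value(field, v) for v in raw_values]
    if not cleaned:
        return 0.0
    return 100.0 * sum(v is None for v in cleaned) / len(cleaned)


# Made-up FICO column: two of the four values are three spaces.
fico_raw = ["720", "   ", "681", "   "]
fico_unknown_pct = unknown_share("fico", fico_raw)
print(f"Unknown FICO: {fico_unknown_pct:.1f}%")  # Unknown FICO: 50.0%
```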
| Sample Methodology | Fields Impacted |
|---|---|
| All dates will only include month and year | First Payment Date; Maturity Date; Zero Balance Effective Date |
| Origination loan amount will be rounded to the nearest $1000 | |
| Current unpaid loan balance will be NULL (empty) for the first 6 months after loan origination | |
| Customers who deliver less than 1% of total origination UPB in a given origination quarter will be identified as “Other” | |
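Two of the rules above, rounding the origination amount and grouping small sellers, can be sketched as follows to show how they shape the figures seen in the sample. The function names, the seller names, and the amounts are invented for the example. Rounding exact halves upward is also an assumption, since the tie-breaking rule is not stated.

```python
def round_to_thousand(amount):
    """Round a whole-dollar amount to the nearest $1000 (halves round up, by assumption)."""
    return (amount + 500) // 1000 * 1000


def label_small_sellers(upb_by_seller, threshold=0.01):
    """Merge sellers below the threshold share of one quarter's total UPB into 'Other'."""
    total = sum(upb_by_seller.values())
    labeled = {}
    for seller, upb in upb_by_seller.items():
        keep = total > 0 and upb / total >= threshold
        name = seller if keep else "Other"
        labeled[name] = labeled.get(name, 0) + upb
    return labeled


# Hypothetical origination UPB for a single origination quarter.
quarter_upb = {"Bank A": 600_000, "Bank B": 395_000, "Bank C": 3_000, "Bank D": 2_000}
grouped = label_small_sellers(quarter_upb)
print(grouped)
print(round_to_thousand(123_499), round_to_thousand(123_500))
```

One consequence for the Top 5 Sellers distribution is that "Other" may rank among the largest groups even though it is not a single seller.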