I'm Brett, and in this article I'll present some more findings from my analysis of the Bondora loan dataset.
My previous analysis of this dataset looked at how to potentially reduce default rates by using the Portfolio Manager.
In this article I'll look at some of the factors you can use to choose loans through the Marketplace and Secondary Market. Note that I'm only presenting data from Estonian borrowers, and also that past performance is no indicator of future performance.
So with that in mind, what factors are useful in screening out higher risk borrowers, and which factors bear little relationship with loan default rates?
Read on to find out...
There is quite a significant difference between the loan default rates of male and female borrowers:
|Percentage default rates (Y) plotted against borrower's gender (X)|
Only loaning money to women does significantly cut down on the number of available loans though. Men comprise 53% of the market, and women 47%.
As far as age goes, the first observation is the data quality isn't great for anyone younger than 21 or older than 65. So I'd definitely avoid anyone outside of the core 21 - 60 age group.
But in this core age group, it's fairly obvious that there's a good relationship between age and default rates on loans:
|Percentage default rates (Y) plotted against age of borrower (X)|
Again, I would avoid lending to borrowers older than 60 as there is a spike in defaults at 65 for some reason.
There's also some useful information that can be found in the marital status factor:
|Percentage default rates (Y) plotted against borrower's marital status (X)|
But lowest risk of all are divorced people. I'm a little surprised at this, so it could definitely be a factor worthy of further investigation in a future article...
Use of Loan
In my previous attempt at analysing the factors you can use to select loans with the Bondora Portfolio Manager, I was somewhat disappointed.
Not so with the factors you can use in the Marketplace or Secondary Market!
Another really useful factor is Use of Loan:
|Percentage default rates (Y) plotted against borrower's declared use of loan monies (X)|
I was a little surprised by this, as I had previously avoided investing in any travel-related loans on Bondora.
I was flat wrong!
I guess that people who book vacations are feeling reasonably confident that their financial situation is stable and that they'll be able to repay their loan.
And I suppose that people who borrow for a business are the type who work hard and are able to find enough opportunities out there to repay their loans.
After producing this chart, I would definitely be more wary of loans relating to education and consolidation of existing loans.
I read on a forum that somebody likes to avoid vehicle loans. Well, these do look higher risk, but they're not the highest risk; that honour belongs to education.
Finally, I'll definitely be seeking out real estate or home improvement loans, which appear to have a lower risk profile compared to some other uses.
From the chart below, it appears that loan risk is lower the more educated the borrower is:
|Percentage default rates (Y) plotted against borrower's highest level of education (X)|
Incidentally, take the default rate for primary level education with a pinch of salt due to the comparatively low number of borrowers in this group.
To generate this data, I used the following process:
- I downloaded the loan Excel spreadsheet from Bondora.
- I imported the Excel spreadsheet into SQL Server.
- I wrote some custom SQL queries to analyse the data.
- I exported the results sets from SQL Server back into Excel in order to turn them into charts.
I have assumed that a value of 1 in the AD column indicates that a loan has defaulted. I have excluded loans that were applied for within the last 3 months or so. Finally, I've only included Estonian loans in all the queries except for the one relating to country.
If you want to have a go at analysing the data yourself, then this is the basic SQL query I used, in this case the query for the education_id factor:
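select
    case education_id
        when 1 then 'Primary'
        when 2 then 'Basic'
        when 3 then 'Vocational'
        when 4 then 'Secondary'
        when 5 then 'Higher'
    end as Education,
    -- multiply by 100.0 first so the division is not truncated if AD is an integer column
    (100.0 * SUM(AD) / COUNT(*)) as [Percentage Defaulted],
    SUM(AD) as NumberInDefault,
    COUNT(*) as NumberOfLoans
from LoanData -- placeholder name: use the table you imported the spreadsheet into
-- Estonian loans only
where country = 'EE' and creditdecision = 1
and education_id between 1 and 5
-- leave out loans applied for in the most recent 3 months or so
and LoanApplicationStartedDate < '2014-10-27'
group by education_id
order by education_id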
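A note on percentage calculations in SQL Server: when both sides of a division are integers, the result is truncated to an integer. Whether that matters here depends on the data type the AD column ended up with during the import, so treat the following as a small sketch with made-up numbers (3 defaults out of 10 loans) rather than a statement about the real data:

```sql
-- Illustrative numbers only: 3 defaults out of 10 loans.
select 3 / 10 * 100   as TruncatedPercentage; -- integer division: 3 / 10 is 0, so the result is 0
select 100.0 * 3 / 10 as DecimalPercentage;   -- decimal arithmetic: the result is 30 (shown with decimal places)
```

Multiplying by a decimal such as 100.0 before dividing avoids the truncation whatever type the column has.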
Summary and Conclusions
The basic message is that if you want more control over your future loan default rates, then you have to buy loans on the Marketplace and possibly in the Secondary Market as well.
There are definitely some good factors that can be used to lower your potential default rates.
One thing I'll point out is that in this article I've only considered the effect of one factor at a time on loan default rates.
I'd sure like to drill down to get the likely default rates of a particular group (e.g. 50-55 year old women who want travel loans). I suspect that default rates for these types of groups will be significantly lower than the market average.
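As a starting point for that kind of drill-down, here is a rough sketch that groups by several factors at once. The table name LoanData and the columns gender_id, age and use_of_loan_id are hypothetical names, so substitute whatever your import produced. Only AD, country, creditdecision and LoanApplicationStartedDate come from the education query. The minimum group size of 30 loans is an arbitrary choice, there to hide groups too small to say anything about.

```sql
-- Sketch only: LoanData, gender_id, age and use_of_loan_id are hypothetical names.
select
    gender_id,
    floor(age / 5.0) * 5 as AgeBandStart,          -- 5-year bands: 50 means ages 50 to 54
    use_of_loan_id,
    (100.0 * sum(AD) / count(*)) as PercentageDefaulted,
    sum(AD) as NumberInDefault,
    count(*) as NumberOfLoans
from LoanData
where country = 'EE' and creditdecision = 1
  and age between 21 and 60                        -- the core age group
  and LoanApplicationStartedDate < '2014-10-27'
group by gender_id, floor(age / 5.0) * 5, use_of_loan_id
having count(*) >= 30                              -- arbitrary minimum group size
order by PercentageDefaulted
```

Splitting the data three ways leaves far fewer loans in each group, so keep an eye on NumberOfLoans before reading much into any one percentage.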
Comments? Questions? Suggestions? Leave feedback below!