This vehicle insurance dataset contains 209,240 records, each holding one year's worth of information for an insured vehicle. The target variable is the dollar amount of claims experienced for that vehicle in that year.
The explanatory variables contain information about the policy and the vehicle (such as make, model, year, and other miscellaneous vehicle characteristics), along with a row identifier and a household identifier.
Because the target is a continuous variable, this is a regression task. To evaluate the results of this competition, the organizers used the normalized Gini coefficient computed on 2008 data, given only the data from 2005 to 2007.
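To make the evaluation metric concrete, here is a minimal sketch of a normalized Gini coefficient in the form commonly used for insurance claim competitions. It is an illustration only, not the organizers' official scoring code: the function names `gini` and `normalized_gini` and the tie-breaking rule (ties keep their original row order) are assumptions of this sketch.

```python
import numpy as np


def gini(actual, predicted):
    """Unnormalized Gini: how well `predicted` ranks the `actual` claim amounts.

    Assumes at least one nonzero value in `actual` (hypothetical helper,
    not part of any library).
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = len(actual)

    # Sort rows by prediction, highest first; ties keep their original order.
    order = np.lexsort((np.arange(n), -predicted))
    sorted_actual = actual[order]

    # Cumulative share of total claims captured as we move down the ranking.
    cumulative_share = np.cumsum(sorted_actual) / sorted_actual.sum()

    # Subtract the area obtained under a random ordering, then average.
    return (cumulative_share.sum() - (n + 1) / 2.0) / n


def normalized_gini(actual, predicted):
    """Gini of the predictions divided by the Gini of a perfect ranking."""
    return gini(actual, predicted) / gini(actual, actual)


# Toy example: four vehicles, two of them with claims.
claims = [0.0, 0.0, 100.0, 50.0]
good_scores = [0.1, 0.2, 0.9, 0.8]  # ranks the vehicles with claims first
bad_scores = [0.9, 0.8, 0.1, 0.2]   # ranks the vehicles with claims last

score_good = normalized_gini(claims, good_scores)
score_bad = normalized_gini(claims, bad_scores)
```

The metric depends only on the ordering of the predictions, not on their scale, so a model scores well if it ranks high-claim vehicles above low-claim ones even when the predicted dollar amounts are off.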
Although this dataset can make a large difference to an insurance business's performance, it has some problems that complicate its use. Synthesized can address these problems in a fast and intuitive way.
The data is available in the Kaggle competition "Allstate Claim Prediction Challenge".