
MANG 2043 Analytics for Marketing

MAT012 Credit Risk Scoring


About This Module
To give a comprehensive review of the objectives, methods and practical implementations of:
Credit and behavioural scoring in particular,
Data mining in general
Software used: SAS, R, Python, Excel
Lectures: Wednesday 9-12pm (W1-5)
Computer Lab Sessions: Wednesday 12-1pm (W2-5)
Lecturer: Dr Meirion Assistants (Shauna Ford &)

Learning Materials and Assessment
Recommended texts:
R. Anderson, The Credit Scoring Toolkit, OUP, 2007
A. Field, Discovering statistics using SAS, Sage, 2010
D.J. Hand, H. Mannila, P. Smyth, Principles of Data Mining, MIT Press, Cambridge, 2001
Study guide, handouts and additional materials will be available on Learning Central
Assessment: Individual Report (100% of the marks)
Submission deadline: TBA


This Lecture's Learning Contents
Introduction to data mining
Introduction to credit scoring

Data, Databases and Data Warehouses
Data can be in many forms such as facts (e.g. economic news), measurements (e.g. blood pressure figures) and statistics (e.g. car insurance claim rates) collected together from some environment.
Datasets are often represented as a data matrix
n rows representing the n objects on which measurements are taken
p columns representing the characteristics (e.g. variables, features, fields) collected on each object
Often, n and p can be very large (n around 10^6 in credit scoring and p around 10^5 in market basket analysis), so it is necessary to be able to store and retrieve data efficiently.
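
As a minimal sketch of the data matrix idea, the snippet below stores a toy dataset as a list of rows. The column names and values are hypothetical and chosen only for illustration; real credit scoring data would have far more rows and columns.

```python
# Hypothetical toy data matrix: each row is one object (a customer),
# each column is one characteristic measured on that object.
columns = ["age", "income", "num_dependents"]   # p = 3 characteristics
data = [
    [34, 28000, 1],
    [51, 41000, 3],
    [23, 19500, 0],
    [45, 36000, 2],
]                                                # n = 4 objects

n = len(data)        # number of rows (objects)
p = len(columns)     # number of columns (variables)

# Retrieve one variable (a column) across all objects.
age_column = [row[columns.index("age")] for row in data]

print(f"n = {n}, p = {p}")
print("ages:", age_column)
```

With n and p in the hundreds of thousands or more, a plain in-memory list like this stops being practical, which is why storage and retrieval matter.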


A data warehouse is a database system used to store data from a wide range of operational databases and other sources in a company.

Databases: ways of organising and storing data to allow fast access to subgroups of data.
Typically used to conduct the daily operations of a company, e.g. bank account databases, airline reservation systems, etc.
Used for answering well-defined and repetitive queries such as:
What is the customer's address? How much do they have in their accounts?
Used as decision support tools
To answer queries involving analysis of data (e.g. what are the car accident statistics for South Wales over the past 3 years, and how do they affect car insurance premiums?)
What is the change in M&S online sales last Christmas compared with the year before?
Database management system (DBMS): a software package that controls the creation, maintenance and use of data and acts as an interface between application programs and physical data files.
It allows you to query or update the contents of the database using a data manipulation language, most commonly Structured Query Language (SQL).
Database centralization and decentralization: over time, companies have gradually developed a number of databases for different regions or different functions of the organisation.
Data Warehouse
A system used to store data from a wide range of operational databases and other sources in a company
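
The well-defined, repetitive queries mentioned above (customer address, account balance) can be illustrated with a short sketch using Python's built-in sqlite3 module as the DBMS. The table name, columns, addresses and balances are all hypothetical, not taken from the module materials.

```python
import sqlite3

# In-memory database with a hypothetical "accounts" table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (customer_id INTEGER, address TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (109384, "12 High Street, Cardiff", 1520.50),
        (109384, "12 High Street, Cardiff", 300.00),
        (109385, "3 Park Road, Swansea", 87.25),
    ],
)

def customer_address(customer_id):
    """Operational query: what is the customer's address?"""
    row = conn.execute(
        "SELECT DISTINCT address FROM accounts WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None

def total_balance(customer_id):
    """Operational query: how much does the customer hold across accounts?"""
    row = conn.execute(
        "SELECT SUM(balance) FROM accounts WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]  # None when the customer has no accounts

print(customer_address(109384))
print(total_balance(109384))
```

The application code only issues SQL statements; how the rows are physically stored and indexed is left to the DBMS, which is the interface role described above.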


Data mining
Analysis of (often large) observational datasets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner.
Let the data speak
Usually applies computational techniques (i.e. data mining algorithms implemented as computer programs) to actually find interesting patterns, relationships and summaries or derive a predictive model
To a large extent computer-driven discovery of (possibly unexpected) patterns or model building
Typically involves more complex analyses at a deeper level of detail than simply aggregating data
Could involve building predictive models, i.e. which predict future events based on past observations

Data mining application procedure
Task: deciding what the data mining application is trying to do
visualization, classification, clustering, regression, etc.
Structure: determining what model or pattern we should fit to the data
linear regression model;
classification tree;
hierarchical clustering
Measurement function: how to judge the quality of the fitted model
Goodness-of-fit test;
How well does the model describe the existing data (e.g. R^2)?
How well does it forecast, i.e. perform on data not yet collected or not used to fit the model?

Optimisation method: finding the best structure/model and the best parameters in the selected model. This is usually what one thinks of as the data mining algorithm.

Data management technique: how to store, index and retrieve the data needed. For small datasets this may not be important, but for massive datasets it is essential. The location where the data are stored often needs to be accessed as part of the optimisation method, so efficient data management is critical to the application being feasible.
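
The components listed above can be tied together in a small sketch: the task is regression, the structure is a straight line, the measurement function is R^2, and the optimisation method is closed-form least squares. The data points are hypothetical, and the train/holdout split is one simple way to check performance on data not used for fitting.

```python
# Hypothetical toy data: x = age, y = total transaction value.
train = [(20, 210.0), (30, 290.0), (40, 420.0), (50, 480.0), (60, 610.0)]
holdout = [(25, 260.0), (45, 450.0), (55, 540.0)]

def fit_line(points):
    """Optimisation method: closed-form least squares for y = a + b*x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def r_squared(points, a, b):
    """Measurement function: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return 1 - ss_res / ss_tot

a, b = fit_line(train)                    # structure: linear regression model
r2_train = r_squared(train, a, b)         # how well the model describes existing data
r2_holdout = r_squared(holdout, a, b)     # how well it performs on data not used to fit
print(f"intercept={a:.2f}, slope={b:.2f}")
print(f"R^2 train={r2_train:.3f}, holdout={r2_holdout:.3f}")
```

Here the whole dataset fits in memory, so the data management component is trivial; with massive datasets the sums inside `fit_line` would have to be computed where the data are stored.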

CRISP-DM
Acronym for Cross-Industry Standard Process for Data Mining
Has become the most commonly applied or referred to framework
Distinguishes between six phases

Phase 1: Business Understanding
Determine the business objectives
e.g. better retain current customers by predicting whether they are prone to churn
Set out how to measure success
Assess the situation
Inventory of resources; what data is available?
Cost-benefit analysis of the project
Determine the data mining goals
e.g. predict which individual customers are more prone to churn given their purchases, demographic details and complaints made
Produce a project plan

Phase 2: Data Understanding
Collect initial data
Describe the data
Tables, number of records, variables in them, etc.
Explore the data
Understand distribution of variables (attributes), relationships between variables, etc.
Verify data quality
Missing values
Errors and inconsistencies
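
The "verify data quality" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, field names, gender coding, reference year (2013) and plausibility thresholds are all hypothetical.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"postcode": "CF103EU", "gender": "female", "income": 25000,
     "birth_year": 1979, "dependents": 2, "household_size": 4},
    {"postcode": "cf10 3eu", "gender": "F", "income": -1000,
     "birth_year": 1079, "dependents": 5, "household_size": 3},
    {"postcode": None, "gender": "Male", "income": None,
     "birth_year": 1990, "dependents": 0, "household_size": 1},
]

def count_missing(rows):
    """Number of missing values per field."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts[field] = counts.get(field, 0) + (value is None)
    return counts

def normalise_postcode(postcode):
    """Bring differently formatted postcodes to one form (upper case, no spaces)."""
    return None if postcode is None else postcode.replace(" ", "").upper()

GENDER_CODES = {"female": "F", "f": "F", "male": "M", "m": "M"}  # assumed coding

def normalise_gender(gender):
    """Map free-text gender entries onto one code; unknown entries become None."""
    return None if gender is None else GENDER_CODES.get(gender.strip().lower())

def find_errors(row, current_year=2013):
    """Coding-error and logic checks; thresholds are illustrative assumptions."""
    errors = []
    if row["income"] is not None and row["income"] < 0:
        errors.append("negative income")
    if row["birth_year"] is not None and not (
        current_year - 120 <= row["birth_year"] <= current_year
    ):
        errors.append("implausible year of birth")
    if None not in (row["dependents"], row["household_size"]) and (
        row["dependents"] >= row["household_size"]
    ):
        errors.append("dependents not below household size")
    return errors

print(count_missing(records))
print([normalise_postcode(r["postcode"]) for r in records])
print([find_errors(r) for r in records])
```

Flagging a record is deliberately separated from fixing it: whether a negative income is a typo or a special code is a business question, not something the check can decide.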

Data Quality
Some common tasks
Checking the number of missing values
Different formats of data:
E.g. CF103EU, cf10 3eu
E.g. female, F
Logic check:
E.g. Number of dependents < Number of people in the household?
Coding errors:
E.g. Income: -1000 GBP
E.g. Year of birth: 1079

Some Examples of Data
Variable                       Example of Data
Gender                         Female/Male
Customer ID                    109384, 109385, 109386
Date of the transaction        01/09/2013, 02/09/2013
Age                            18, 19, 20, 21, 22, 23
Age Group                      <18, 18-25, 26-30, 31-40, 41+
Total transaction amount       10, 25, 28, 46
Postcode                       SO17, NW1, CA3, GU5
Employee satisfaction level    1-Very dissatisfied, ..., 4-Neutral, ..., 7-Very satisfied
Classification

Generate the Frequency Distributions
A summary of data presented in the form of class intervals and frequencies
Qualitative data: distribution in each class
Class                     Frequency    Relative Frequency
Primary                   102          10%
Secondary                 227          23%
College/Diploma           271          27%
Undergraduate             249          25%
Postgraduate or higher    151          15%
Total                     1,000        -

Qualitative Data: Bar/Pie Charts
Which one is preferable?
[Figure: Pie Chart of Education Level]
[Figure: Bar Chart of Education Level (N=1,000), percentage]
[Figure: Bar Chart of Education Level (N=1,000), count]

Quantitative Data: Histogram
Using the frequency distribution
Label x-axis either by Class Midpoints or the Class Endpoints
[Figure: Histogram of Income; x-axis: Class Midpoint, y-axis: Relative Frequency]

Quantitative Data: Box Plot
Display the Range, Median and the Quartiles
An example for the Age variable
Median = 44
Min = 18; Max = 75
Lower Quartile = 37.5
Upper Quartile = 49

Quantitative Data: Scatter Plot
Plot every observation
Relationship between two variables
One variable plotted on x-axis and other variable plotted on y-axis
[Figure: Scatter Plot for Total Value vs Age]
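
As a rough sketch of how a frequency distribution is generated, the snippet below computes relative frequencies from the education-level counts given in this section, and groups a handful of ages into the Age Group classes listed earlier. The list of ages is hypothetical and only there to show the grouping step.

```python
from collections import Counter

# Qualitative data: class counts from the education-level frequency table.
education_counts = {
    "Primary": 102,
    "Secondary": 227,
    "College/Diploma": 271,
    "Undergraduate": 249,
    "Postgraduate or higher": 151,
}
total = sum(education_counts.values())
relative_freq = {cls: count / total for cls, count in education_counts.items()}

# Quantitative data: assign each value to a class interval, then count.
def age_group(age):
    """Map an age to one of the classes <18, 18-25, 26-30, 31-40, 41+."""
    if age < 18:
        return "<18"
    if age <= 25:
        return "18-25"
    if age <= 30:
        return "26-30"
    if age <= 40:
        return "31-40"
    return "41+"

ages = [17, 18, 22, 24, 27, 33, 38, 44, 44, 52, 75]  # illustrative values only
age_counts = Counter(age_group(a) for a in ages)

for cls, share in relative_freq.items():
    print(f"{cls:<24}{education_counts[cls]:>6}{share:>8.0%}")
print(dict(age_counts))
```

The same class counts are what a bar chart or histogram would plot, so the table and the charts are two views of one frequency distribution.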
