Principal Component Analysis (PCA)
This chapter gives you a clear understanding of Principal Component Analysis (PCA), using appliance sales in the USA as an example. PCA is a statistical technique that handles a multivariate data set by reducing its dimensionality and presenting the data as a simpler pattern. It can help you find latent information hidden in a large volume of raw data that would otherwise be overwhelming.
First, click "Multibase_2015" and "Open Form" to open the PCA dialog box. Click "Done" without making any changes.
After a few minutes, three new worksheets, "Multibase_PCA", "Multibase_PCA2" and "Multibase_Result", will be generated. The "Multibase_PCA" sheet displays the loading and score plots for PC1, PC2 and PC3, as shown below.
The score plots on PC1, PC2 and PC3 explain regional buying motives. For Pennsylvania (red), the facsimile variable in the loading plot lies in the same direction as the Pennsylvania points in the score plot, which means that facsimiles sell well in Pennsylvania. A bar graph of the raw facsimile data confirms that more facsimiles are sold in Pennsylvania than in the other states.
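The relationship between score and loading plots can be reproduced outside Multibase with a few lines of NumPy. The following is a minimal sketch, not Multibase's own implementation: it assumes the data are autoscaled (centred and divided by the standard deviation) before the decomposition, and the function name `pca_scores_loadings` and the small table are made up for illustration only.

```python
import numpy as np

def pca_scores_loadings(X, n_components=3):
    """Return (scores, loadings) of a PCA on autoscaled data (assumed scaling)."""
    X = np.asarray(X, dtype=float)
    # Autoscale: centre each column and divide by its sample standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD gives Z = U * S * Vt; the rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # one row per observation
    loadings = Vt[:n_components].T                   # one row per variable
    return scores, loadings

# Hypothetical toy table: 4 regions (rows) x 3 products (columns).
X = np.array([[10.0, 2.0, 5.0],
              [ 2.0, 9.0, 4.0],
              [ 3.0, 3.0, 6.0],
              [ 4.0, 2.0, 5.0]])
scores, loadings = pca_scores_loadings(X, n_components=2)

# The dot product of a region's score vector and a product's loading vector
# is large and positive when both point in the same direction.
alignment = scores @ loadings.T  # rows: regions, columns: products
print(np.round(alignment, 2))
```

A large positive entry in `alignment` corresponds to the situation described above, where an observation and a variable lie in the same direction in the two plots.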
Next, open the "Multibase_PCA2" sheet. It contains two graphs, as shown below.
The first graph shows R2 and Q2. R2 indicates how well the model explains the original data; as the number of components increases, R2 approaches one. Q2, called the "Predictive Value", shows how accurately the model predicts. In general, Q2 reaches a maximum as the number of components increases. Multibase determines the optimal number of components from the Q2 value.
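To see why R2 approaches one as components are added, the cumulative R2 can be computed from the singular values of the scaled data. This is a sketch under the assumption of autoscaled data; the function name `cumulative_r2` and the toy table are hypothetical. Q2 is not computed here because it requires cross-validation, and the scheme Multibase uses is not described in this chapter.

```python
import numpy as np

def cumulative_r2(X):
    """Cumulative fraction of variance explained by 1, 2, ... components."""
    X = np.asarray(X, dtype=float)
    # Autoscaling is an assumption made for this sketch.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Squared singular values are proportional to the variance per component.
    S = np.linalg.svd(Z, compute_uv=False)
    explained = S**2 / np.sum(S**2)
    return np.cumsum(explained)

# Hypothetical toy table: 4 observations x 3 variables.
X = np.array([[10.0, 2.0, 5.0],
              [ 2.0, 9.0, 4.0],
              [ 3.0, 3.0, 6.0],
              [ 4.0, 2.0, 5.0]])
r2 = cumulative_r2(X)
print(np.round(r2, 3))  # non-decreasing, last value is 1
```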
The second graph shows the "Distance from Model", called "DmodX", which indicates how well the model explains the raw data.
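The idea behind a distance-to-model measure can be illustrated by the size of each observation's residual after the PCA model is subtracted. The sketch below is a simplified stand-in, not the DmodX formula used by Multibase: it reports the root-mean-square residual per row without any degrees-of-freedom correction, and it assumes autoscaled data. The name `residual_distance` and the toy table are hypothetical.

```python
import numpy as np

def residual_distance(X, n_components):
    """Root-mean-square residual of each observation after an A-component PCA."""
    X = np.asarray(X, dtype=float)
    # Autoscaling is an assumption made for this sketch.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    A = n_components
    # Part of the data that the model reproduces with A components.
    Z_hat = (U[:, :A] * S[:A]) @ Vt[:A]
    E = Z - Z_hat  # residual matrix: what the model leaves unexplained
    return np.sqrt(np.mean(E**2, axis=1))

# Hypothetical toy table: 4 observations x 3 variables.
X = np.array([[10.0, 2.0, 5.0],
              [ 2.0, 9.0, 4.0],
              [ 3.0, 3.0, 6.0],
              [ 4.0, 2.0, 5.0]])
d1 = residual_distance(X, 1)
d2 = residual_distance(X, 2)
print(np.round(d1, 3), np.round(d2, 3))
```

An observation with a large value is poorly described by the model; adding components can only shrink each row's residual.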
PLS-DA (Discriminant Analysis using PLS)
PLS Discriminant Analysis (PLS-DA) is performed to sharpen the separation between categories. It is based on classical PLS regression, in which the response variables are replaced by a set of dummy variables describing the categories.
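The dummy-variable idea can be sketched as follows. Each category becomes one 0/1 column of the response matrix Y, and the first PLS weight vector is taken as the dominant left singular vector of the cross-product of the centred X and Y, which is what the first component of a standard PLS regression with several responses converges to. The function names, labels and numbers are hypothetical, only one component is extracted, and nothing here is claimed to match Multibase's internal algorithm.

```python
import numpy as np

def dummy_matrix(labels):
    """Encode category labels as a 0/1 matrix with one column per category."""
    classes = sorted(set(labels))
    Y = np.array([[1.0 if lab == c else 0.0 for c in classes] for lab in labels])
    return classes, Y

def first_pls_component(X, Y):
    """Weight vector w and scores t of the first PLS component."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # The dominant left singular vector of Xc^T Yc maximises the covariance
    # between the X scores and the dummy responses.
    U, S, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    w = U[:, 0]
    t = Xc @ w
    return w, t

# Hypothetical data: two categories "A" and "B", two variables.
labels = ["A", "A", "B", "B"]
X = np.array([[ 1.0, 0.2],
              [ 1.2, 0.1],
              [-1.0, 0.0],
              [-1.1, 0.3]])
classes, Y = dummy_matrix(labels)
w, t = first_pls_component(X, Y)
print(classes, Y.tolist())
print(np.round(t, 3))  # scores of "A" and "B" fall on opposite sides of zero
```

Because the component is chosen to covary with the category columns rather than to capture overall variance, the groups tend to separate more clearly than in a PCA score plot.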
To perform PLS-DA, go back to the "Appliance" sheet and click "Multibase_2015" and "Open Form". When the "Preparation" dialog box appears, select "PLS-DA" and click "Next>>".
After the "Multibase_Preparation" sheet is generated, click "Multibase_2015" and "Open Form" again to show the "PLS-DA" dialog box. Click "Done" without making any changes.
Now three new worksheets, "Multibase_PLSDA", "Multibase_PLSDA2" and "Multibase_Result", are generated, and the loading and score plots for PC1, PC2 and PC3 are displayed on the "Multibase_PLSDA" sheet, as shown below.
The separation between groups becomes much clearer than with PCA. For Massachusetts (purple), the hair drier variable lies in the same direction along PC3, which means that hair driers sell well in Massachusetts. You can confirm this by displaying a bar graph of the raw data, as shown below.