PCA error infinite or missing values in 'x'



























I have been struggling to use R to analyze financial data. I am new to programming in general, although I am very accustomed to working in Excel. I have spent a lot of time (probably too much) formatting my CSV file to minimize the hassle of working in R, but this hasn't worked.



Here is my code for the PCA. It only works when I use smaller data files with no N/As or blanks, but I need to know how to handle these in R.



returns <- read.csv("PCA Data File.csv", skip = 1, header = TRUE)
# standardize the variables (column 1 is the date, so it is excluded)
returns.pca <- prcomp(returns[2:ncol(returns)], scale = TRUE)


The result is:




Error in svd(x, nu = 0) : infinite or missing values in 'x'




This raises several questions. First, how do I resolve this error? Second, how do I explore my data to make sure missing values are properly addressed or replaced? Third, is the problem that my data is a data.frame rather than a matrix?



I am not sure how to attach the CSV file, but here are the first few rows from the file (there are 241 rows):



Date    Returns Var1    Var2    Var3    Var4    Var5    Var6    Var7    Var8    Var9    Var10   Var11   Var12   Var13   Var14   Var15   Var16   Var17   Var18   Var19   Var20   Var21   Var22   Var23   Var24   Var25   Var26   Var27   Var28   Var29   Var30   Var31   Var32   Var33   Var34   Var35   Var36   Var37   Var38   Var39   Var40   Var41   Var42   Var43   Var44   Var45   Var46   Var47   Var48   Var49   Var50   Var51   Var52   Var53   Var54   Var55   Var56   Var57   Var58   Var59   Var60   Var61
6/30/2014 0.48 18.12 9.44 107.43 19.53 1.92 11.54 0.99 3.33 98.83 0.44 2.59 3.42 105.15 308.59 80.44 1.36 0.94 102.07 1.69 331.47 53656.02 21897.39 11022.87 23144.90 15131.80 0.59 2.70 1.35 0.58 0.33 0.25 103.38 1.67 2.59 3.42 1.75 0.10 1.09 2.00 -0.11 1.24 2.08 0.22 138780.00
5/31/2014 1.52 17.63 9.44 107.18 14.36 1.96 12.48 1.01 3.49 98.60 0.37 2.55 3.39 101.79 306.79 79.96 1.37 0.93 101.84 1.68 324.69 53122.21 21159.31 10558.07 22584.93 14343.14 0.59 2.62 1.40 0.52 0.41 0.11 103.39 1.58 2.55 3.39 1.81 0.09 1.11 1.96 -0.07 1.15 2.29 0.47 3.50 1.49 138492.00 171.04 11302.80 4322654.00 55.40 -44.39 441.59 1000.70 117.44 11.60 6.50 1.50 0.50
4/30/2014 1.07 17.40 9.45 107.11 22.93 1.96 14.20 1.02 3.49 98.24 0.40 2.69 3.52 102.03 308.63 79.85 1.38 0.93 102.51 1.67 323.24 51470.08 21660.07 10399.85 22598.44 14475.33 0.61 2.67 1.53 0.53 0.47 0.06 103.47 1.69 2.69 3.52 1.82 0.09 1.49 2.08 0.02 1.16 2.04 -4.63 0.04 3.50 1.42 138268.00 171.58 11227.50 4296049.00 54.90 -47.04 425.02 204.90 117.57 11.60 27.30 6.60 1.80 1.40
3/31/2014 0.50 17.51 9.51 106.40 25.98 1.95 14.84 1.09 3.65 98.40 0.38 2.72 3.62 100.51 303.49 79.87 1.38 0.91 102.36 1.66 316.98 47046.98 20839.70 10097.38 21980.77 14694.83 0.61 2.72 1.59 0.52 0.48 0.04 103.44 1.63 2.72 3.62 1.99 0.08 1.73 2.10 0.00 1.13 2.02 0.91 3.30 1.20 137964.00 171.47 11169.00 4226971.00 53.70 -44.18 452.77 608.80 117.39 11.70 15.10 27.30 6.80 1.60 0.20
2/28/2014 1.76 17.10 9.52 106.27 25.35 1.96 15.47 1.13 3.88 98.46 0.31 2.70 3.66 100.68 294.91 80.44 1.37 0.90 102.12 1.66 315.92 47367.89 20039.38 10048.23 22188.31 14617.57 0.60 2.74 1.66 0.44 0.44 0.01 103.45 1.50 2.69 3.66 2.16 0.07 1.82 2.10 -0.05 1.04 1.87 0.91 3.10 1.08 137761.00 169.34 11133.50 4159972.00 53.20 -42.59 383.36 -48.40 116.28 11.70 27.30 6.90 1.70 1.70


































  • Paste in the output of dput(head(returns, 10)) rather than the current copy-paste.

    – Thomas
    Jul 29 '14 at 19:48











  • Have a look here stat.ethz.ch/pipermail/r-help/2008-January/150896.html

    – konvas
    Jul 30 '14 at 7:56











  • I think I have been to that page before. In any case, now I get this error: Error in prcomp.default(na.omit(returns[2:ncol(returns)]), scale = TRUE) : cannot rescale a constant/zero column to unit variance

    – user2662565
    Jul 30 '14 at 13:34











  • Found another post addressing this error:
    Error in prcomp.default(na.omit(returns[2:ncol(returns)]), scale = TRUE) : cannot rescale a constant/zero column to unit variance
    Updated with the following:
    > returns.pca <- prcomp(na.omit(returns[,apply(returns[2:ncol(returns)], 2, var, na..rm=TRUE) != 0], scale = TRUE))
    Error in FUN(newX[, i], ...) : unused argument (na..rm = TRUE)
    Then received this error:
    > returns.pca <- prcomp(na.omit(returns[,apply(returns[2:ncol(returns)], 2, var, na.rm=TRUE) != 0], scale = TRUE))
    Error in svd(x, nu = 0) : a dimension is zero

    – user2662565
    Jul 30 '14 at 15:22
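One possible reading of the last two errors, sketched on made-up data rather than the real file: if every row contains at least one NA, na.omit() returns zero rows, which would explain "a dimension is zero". In the quoted command the variance filter is also computed on columns 2:ncol(returns) but applied to all columns of returns, and scale = TRUE sits inside the call to na.omit() rather than prcomp(). The column names, values, and the 20% missing-value threshold below are all assumptions for illustration.

```r
# Toy stand-in for the asker's data (hypothetical values, not the real file).
returns <- data.frame(
  Date    = c("6/30/2014", "5/31/2014", "4/30/2014", "3/31/2014", "2/28/2014"),
  Returns = c(0.48, 1.52, 1.07, 0.50, 1.76),
  Var1    = c(18.12, 17.63, 17.40, 17.51, 17.10),
  Var2    = c(NA, 9.44, NA, 9.51, NA),   # sparse column
  Var3    = c(2.5, NA, 2.5, NA, 2.5),    # sparse in the remaining rows
  Var4    = c(3, 3, 3, 3, 3)             # constant column
)

num <- returns[, -1]   # drop the Date column once, up front

# Every row has at least one NA, so na.omit() leaves nothing for svd().
nrow(na.omit(num))

# 1. Drop columns that are mostly missing (the 0.2 threshold is arbitrary).
na_share <- colMeans(is.na(num))
num <- num[, na_share <= 0.2, drop = FALSE]

# 2. Keep complete rows, then drop columns with zero variance.
num <- na.omit(num)
col_var <- vapply(num, var, numeric(1))
num <- num[, col_var > 0, drop = FALSE]

# 3. The scaling argument belongs to prcomp(), not to na.omit().
returns.pca <- prcomp(num, scale. = TRUE)
summary(returns.pca)
```

Dropping sparse columns before dropping incomplete rows keeps more observations. Whether that trade-off suits the real data depends on which variables matter for the analysis.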


























edited Dec 22 '18 at 15:10
Ben Bolker

asked Jul 29 '14 at 19:46
user2662565



























2 Answers
































It looks like your data has missing values for some of the dates, so you will need to do some data cleanup. The code below is an example of how you might do this for the rows you provided. Only two of the dates appear to be complete, so continuing on to the PCA didn't make much sense.



I've loaded your input data from above into the variable xx.



xx <- sub("\n", " ", xx)  # replace the newline (\n) in the data with a space
xy <- unlist(strsplit(xx, split = " "))  # split the string into a character vector
start_of_new_date <- grep("[0-9]/[0-9]{2}/2014", xy)  # positions where each new date starts
diff(start_of_new_date)  # the counts between dates are not all 62, so some lines are missing values
ar <- matrix(c(c("Date", xy[1:61]), xy[168:291]), nrow = 3, byrow = TRUE)  # keep only the complete dates, March and April
df <- data.frame(Date = ar[2:3, 1], ar[2:3, 2:62], stringsAsFactors = FALSE)  # dates and data as a data frame
colnames(df) <- c("Date", ar[1, 2:62])  # use the variable names as column names
df[, 2:62] <- sapply(df[, 2:62], as.numeric)  # convert data columns from character to numeric
dfs <- scale(df[, 2:62])  # example only; scaling two-row columns is meaningless since every column scales to the same values
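The same check for ragged lines can be sketched with base R's count.fields(), which reports how many fields each line holds. The four-line sample below is a hypothetical miniature of the pasted rows, not the real data.

```r
# Hypothetical miniature of the pasted rows: the second data line is short.
txt <- c(
  "Date Returns Var1 Var2 Var3",
  "6/30/2014 0.48 18.12 9.44 107.43",
  "5/31/2014 1.52 17.63 9.44",
  "4/30/2014 1.07 17.40 9.45 107.11"
)

# Number of whitespace-separated fields on each line.
con <- textConnection(txt)
n_fields <- count.fields(con, sep = "")
close(con)
n_fields

# Lines whose field count differs from the header's.
short_lines <- which(n_fields != n_fields[1])
short_lines

# fill = TRUE pads short lines with NA so the table can still be read.
dat <- read.table(text = txt, header = TRUE, fill = TRUE)
colSums(is.na(dat))        # missing values per column
sum(complete.cases(dat))   # rows with no missing values
```

Note that fill = TRUE pads at the end of a short line. If the blank cells were originally in the middle of a row, the remaining values shift into the wrong columns, so this only locates the problem lines; it does not recover the original layout.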





























  • I am only using columns 2:ncol(returns) so that I exclude the date. Shouldn't this make it so the date is irrelevant to this?

    – user2662565
    Jul 30 '14 at 17:28











  • Sorry, I had taken the data string in your post to be the value of returns, not the file contents. You're using read.csv to try to bring this in, but there aren't any commas, so it wouldn't separate the values properly. Two thoughts: First, look at the contents of returns to see if they look correct. Second, explain a little more how you're generating this file from Excel.

    – WaltS
    Jul 30 '14 at 21:44
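A minimal sketch of checking what read.csv() actually produced, assuming the real file is comma-separated with one title line above the header (suggested by skip = 1 in the question) and that missing cells are blank or carry Excel-style markers. The CSV text and the list of missing-value markers are made up for illustration.

```r
# Hypothetical CSV text standing in for "PCA Data File.csv": a title line,
# then a header, with blanks and Excel-style markers for missing cells.
csv_text <- "Monthly data export
Date,Returns,Var1,Var2
6/30/2014,0.48,18.12,
5/31/2014,1.52,N/A,9.44
4/30/2014,1.07,17.40,#N/A
"

returns <- read.csv(
  text = csv_text,
  skip = 1,                                  # skip the title line, as in the question
  header = TRUE,
  na.strings = c("", "NA", "N/A", "#N/A"),   # treat all of these as missing
  stringsAsFactors = FALSE
)

str(returns)                      # every Var column should be num, not chr
sapply(returns[-1], is.numeric)   # FALSE flags a column that needs cleaning
colSums(is.na(returns[-1]))       # missing values per column
```

If a column shows up as character or factor in str(), it usually contains a marker that is not listed in na.strings, and prcomp() will not accept it until that is fixed.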



































Possible duplicate of Error in svd(x, nu = 0) : 0 extent dimensions



Negative infinity values can be replaced after a log transform as below.



log_features <- log(data_matrix[,1:8])
log_features[is.infinite(log_features)] <- -99999
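As a hedged variation on the snippet above: is.finite() catches NA, NaN, Inf and -Inf in one pass, and marking those cells as NA avoids feeding a large sentinel value into the PCA, where it would likely dominate the variance. The small matrix below is hypothetical.

```r
# Hypothetical matrix with a zero, which log() maps to -Inf.
data_matrix <- matrix(c(1, 10, 0, 100, 5, 50), nrow = 3,
                      dimnames = list(NULL, c("a", "b")))

log_features <- log(data_matrix)
sum(!is.finite(log_features))   # count of NA, NaN, Inf and -Inf cells

# Mark the non-finite cells as missing rather than using a sentinel value.
log_features[!is.finite(log_features)] <- NA

# Rows that are safe to pass to prcomp().
clean <- log_features[complete.cases(log_features), , drop = FALSE]
clean
```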






























    answered Jul 30 '14 at 17:08









    WaltS

    • I am only using columns 2:ncol(returns) so that I exclude the date. Shouldn't this make it so the date is irrelevant to this?

      – user2662565
      Jul 30 '14 at 17:28











    • Sorry, I had taken the data string in your post to be the value of returns, not the file contents. You're using read.csv to bring this in, but there aren't any commas, so it wouldn't separate the values properly. Two thoughts: first, look at the contents of returns to see whether they look correct. Second, explain a little more about how you're generating this file from Excel.

      – WaltS
      Jul 30 '14 at 21:44
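
A quick way to check whether `read.csv` split the file into columns is to look at the structure of the result. The sketch below uses an inline string in place of the file (the text and the names `raw_text`, `as_csv`, and `as_tab` are made up), and it assumes the file is tab-separated rather than comma-separated, which may not be true of the actual file.

```r
# Hypothetical tab-separated text standing in for the file contents
raw_text <- "Date\tReturns\tVar1\n6/30/2014\t0.48\t18.12\n5/31/2014\t1.52\t17.63"

# Read as CSV: there are no commas, so each line ends up in a single column
as_csv <- read.csv(text = raw_text, header = TRUE)
ncol(as_csv)

# Read with a tab separator: the values are split into separate columns
as_tab <- read.delim(text = raw_text, header = TRUE)
ncol(as_tab)
str(as_tab)
```

If `str(returns)` shows a single column, or numeric columns read in as character or factor, the separator is the first thing to check.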





















    0














    Possible duplicate of Error in svd(x, nu = 0) : 0 extent dimensions



    Negative infinity values can be replaced after a log transform as below.



    log_features <- log(data_matrix[,1:8])
    log_features[is.infinite(log_features)] <- -99999
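
To illustrate where the `-Inf` values come from: `log(0)` is `-Inf`, so any zero in the input produces one. The sketch below uses a made-up `data_matrix`; the `log1p` variant at the end is an alternative worth considering, not part of the answer above.

```r
# Hypothetical matrix with a zero in it
data_matrix <- matrix(c(1, 10, 100,
                        0,  5,  50,
                        2, 20, 200,
                        3,  7,  70),
                      ncol = 3, byrow = TRUE)

log_features <- log(data_matrix)   # log(0) yields -Inf
sum(is.infinite(log_features))     # number of infinite entries

# Replace infinite entries with a sentinel value, as in the answer
log_features[is.infinite(log_features)] <- -99999
pca <- prcomp(log_features)

# Alternative (an assumption, not from the answer): log1p(x) = log(1 + x)
# is finite at zero, so no replacement step is needed
alt_features <- log1p(data_matrix)
all(is.finite(alt_features))
```

A sentinel as extreme as -99999 contributes a very large variance, so the first component may mostly reflect which rows contained zeros.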





        edited May 23 '17 at 10:31









        Community

        answered Jan 10 '16 at 21:31









        Joshua Burkhart
