How to get data from adjusted quantile plot in R? -
i have data.frame 2 columns , using aq.plot
mvoutlier
package, identify potential outliers in 2-d dataset. problem i'm not happy "look" of plots generated , grab data plotting , make plot in other software.
for specific case plot generated by,
library('mvoutlier') data = read.csv(fp, colclasses=c("null",na,na)) h = aq.plot(data)
the data.frame, data
looks this:
pr tas 1 5.133207 59.24362 2 20.173075 75.81661 3 24.819054 97.31020 4 35.893467 92.11203 5 27.752425 95.70120 6 25.765618 91.14163 7 20.895360 57.30519 8 8.921513 70.31467 9 36.031261 98.24573 10 27.166213 92.79554 11 8.889431 54.48514 12 59.564447 85.69632 13 43.818336 99.36451 14 43.408963 84.23207 15 22.653269 84.89939 16 21.480331 96.18303 17 22.827370 69.97202 18 23.252464 85.08739 19 14.618731 45.30504 20 40.795519 78.56758 21 37.310456 80.30799 22 31.099105 91.31675 23 33.107472 63.07043 24 9.611930 35.62702
and generated plot looks this:
so question is, how can information plotted in subplot on top right? information mean x, y coordinates , number associated each point. great if there way x values @ 2 vertical lines drawn.
i see output h
calling aq.plot()
command gives boolean array stating points outliers (true) or not (false) there appear no access underlying components of plot.
any appreciated.
it's in code aq.plot
. here specific code plot in upper right:
plot(s$x, (1:length(dist))/length(dist), col = 3, xlab = "ordered squared robust distance", ylab = "cumulative probability", type = "n") text(s$x, (1:length(dist))/length(dist), as.character(s$ix), col = 3, cex = 0.8) t <- seq(0, max(dist), = 0.01) lines(t, pchisq(t, df = ncol(x)), col = 6) abline(v = delta, col = 5) text(x = delta, y = 0.4, paste(100 * (pchisq(delta, df = ncol(x))), "% quantile", sep = ""), col = 5, pos = 2, srt = 90, cex = 0.8) xarw <- arw(x, covr$center, covr$cov, alpha = alpha) if (xarw$cn < inf) { abline(v = xarw$cn, col = 4) text(x = xarw$cn, y = 0.4, "adjusted quantile", col = 4, pos = 4, srt = 90, cex = 0.8) }
if through code of function aq.plot
, see can x coordinates , associated observation way:
covr <- robustbase::covmcd(data, alpha = 1/2) dist <- mahalanobis(data, center = covr$center, cov = covr$cov) s <- sort(dist, index = true) s$x # 22 4 6 10 21 18 15 5 14 # 0.1152036 0.2181437 0.3148553 0.3255492 0.3752751 0.4076276 0.4661830 0.5299942 0.7093746 # 9 20 3 16 2 13 17 23 12 # 0.7564636 0.7756129 0.8838616 1.0807574 1.3059546 1.4891242 1.8606975 2.9690980 3.9152682 # 8 7 1 11 19 # 4.0283820 5.0767176 7.4233298 7.9488595 10.3217389
then y coordinates:
(1:length(dist))/length(dist) #[1] 0.04347826 0.08695652 0.13043478 0.17391304 0.21739130 0.26086957 0.30434783 0.34782609 #[9] 0.39130435 0.43478261 0.47826087 0.52173913 0.56521739 0.60869565 0.65217391 0.69565217 #[17] 0.73913043 0.78260870 0.82608696 0.86956522 0.91304348 0.95652174 1.00000000
you can rebuild plot directly using following code altered above. reading through code , following along build plot should see find each piece of information. abline
calls info on vertical lines, , you'll find values here qchisq(0.975, df = ncol(data))
, here arw(data, covr$center, covr$cov, alpha = 0.05)$cn
plot(s$x, (1:length(dist))/length(dist), col = 3, xlab = "ordered squared robust distance", ylab = "cumulative probability", type = "n") text(s$x, (1:length(dist))/length(dist), as.character(s$ix), col = 3, cex = 0.8) t <- seq(0, max(dist), = 0.01) lines(t, pchisq(t, df = ncol(data)), col = 6) abline(v = qchisq(0.975, df = ncol(data)), col = 5) text(x = qchisq(0.975, df = ncol(data)), y = 0.4, paste(100 * (pchisq(qchisq(0.975, df = ncol(data)), df = ncol(data))), "% quantile", sep = ""), col = 5, pos = 2, srt = 90, cex = 0.8) xarw <- arw(data, covr$center, covr$cov, alpha = 0.05) if (xarw$cn < inf) { abline(v = xarw$cn, col = 4) text(x = xarw$cn, y = 0.4, "adjusted quantile", col = 4, pos = 4, srt = 90, cex = 0.8) }
Comments
Post a Comment