Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Function for estimating Generalized Pareto Distribution parameters in R not working

In this paper https://www.jstor.org/stable/27867255?seq=1#metadata_info_tab_contents they provide a method of estimation for the GPD, based on the profile likelihood function and the empirical Bayesian method. The code they provide is the following:

# x is the sample data from the GPD
f <- function(x) {
  n <- length(x)
  x <- sort(x)
  lx <- function(b,x) {
    k <- -mean(log(1-b*x))
    if (b==0) {
      k-1-log(mean(x))
    } else {
      k-1+log(b/k)
    }
    }
  p <- (3:9)/10
  xp <- x[round(n*(1-p)+.5)]
  m <- 20+round(n^.5)
  xq <- x[round(n*(1-p*p)+.5)]
  k <-  log(xq/xp-1,p)
  a <- k*xp/(1-p^k)
  a[k==0] <- (-xp/log(p))[k==0]
  k <- -1
  b <- w <- L <- (n-1)/(n+1)/n[n]-(1-((1:m-.5)/m)^k)/k/median(a)/2
  for (i in 1:m) L[i] <- n*lx(b[i],x)
  for (i in 1:m) w[i] <- 1/sum(exp(L-L[i]))
  b <- sum(b*w)
  k <- -mean(log(1-b*x))
  list(sigma=k/b,k=k)
}

I simulated a vector of x values with size n = 100 from a GPD with shape parameter equals to 1 and scale parameter equals to -1. Then I get the error:

Error in if (condition) { : missing value where TRUE/FALSE needed

According to this question Error in if/while (condition) {: missing Value where TRUE/FALSE needed the error can be attributed to the NA values in the b vector, but I don’t understand why this happens.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

My x vector:

x
  [1]   0.24264350   0.71418670   0.90929131   1.31011008   1.26467953   0.66491141  40.16132902
  [8]   0.24815930   0.83783287   1.18879459   1.87167421   0.00396890   1.58125086   0.32042532
 [15]   0.01330135   0.80800732   0.67832942   0.32623323  10.87447448   0.64779973   0.16550198
 [22]   1.03595024   0.08463867   6.17820208  15.07862358   0.19188437   0.14925543   2.20031552
 [29]   1.14093185   2.92184174   0.79279984   0.14736789   0.32954571  24.41728358   5.64925376
 [36]   1.31141081   1.62897827   0.01491289   0.86641233   0.82289179   0.12105522   1.66701079
 [43]   4.65817711  53.08764134   8.87696704   0.61560327   0.77817268   0.65148331   2.33976096
 [50]   2.96629395   0.43783850   3.11005777   0.44923740   0.27073261   2.98854135   1.85962571
 [57]   0.67065561   1.09661475   1.55934896   1.65683579   0.62373160  27.04125805   0.16137916
 [64]   0.32925148   0.40487288   1.41714764   0.48798225   0.07624247   2.32993211   2.13228723
 [71]   6.74938943   2.10121108   0.56772145   2.84504482   1.67119601   1.68640938  23.65633183
 [78]   7.78195561   0.21317910   0.33639542   1.14508402   6.61466064   2.03818446   2.28166528
 [85]   0.28095778  30.93008603   0.16512528   1.21974281   0.32121843 103.90277963   0.09672460
 [92]   1.04882224   1.68444513  29.93956683   0.43961534   2.60623811   0.25039076   0.76208631
 [99] 568.46662349   1.66133511

>Solution :

I think you have some transcription errors in your function (to be fair, the code is not well formatted in the paper). I think this should be what you’re looking for:

f <- function(x) {
  n <- length(x) 
  x <- sort(x)
 lx <- function(b, x) {
   k <- -mean(log(1 - b * x))
   if (b == 0) k - 1 - log(mean(x)) else k - 1 + log(b/k) 
 }
 p <- (3:9)/10 
 xp <- x[round(n*(1 - p)+ 0.5)] 
 m <- 20 + round(n^0.5)
 xq <- x[round(n*(1 - p * p) + .5)]
 k <- log(xq / xp - 1, p)
 a <- k*xp/(1 - p^k)
 a[k == 0] <- (-xp/log(p))[k == 0]
 k <- -1
 b <- w <- L <- (n-1)/(n+1)/x[n] -(1-((1:m-.5)/m)^k)/k/median(a)/2
 for (i in 1:m) L[i] <- n*lx(b[i],x) 
 for (i in 1:m) w[i] <- 1/sum(exp(L-L[i])) 
 b <- sum(b * w) 
 k <- -mean(log(1 - b * x))
 list(sigma = k / b, k = k)
}

Which gives, on your data:

f(x)
#> $sigma
#> [1] 1.050348
#> 
#> $k
#> [1] -1.070265

data

x <- c(0.2426435, 0.7141867, 0.90929131, 1.31011008, 1.26467953, 0.66491141, 
40.16132902, 0.2481593, 0.83783287, 1.18879459, 1.87167421, 0.0039689, 
1.58125086, 0.32042532, 0.01330135, 0.80800732, 0.67832942, 0.32623323, 
10.87447448, 0.64779973, 0.16550198, 1.03595024, 0.08463867, 
6.17820208, 15.07862358, 0.19188437, 0.14925543, 2.20031552, 
1.14093185, 2.92184174, 0.79279984, 0.14736789, 0.32954571, 24.41728358, 
5.64925376, 1.31141081, 1.62897827, 0.01491289, 0.86641233, 0.82289179, 
0.12105522, 1.66701079, 4.65817711, 53.08764134, 8.87696704, 
0.61560327, 0.77817268, 0.65148331, 2.33976096, 2.96629395, 0.4378385, 
3.11005777, 0.4492374, 0.27073261, 2.98854135, 1.85962571, 0.67065561, 
1.09661475, 1.55934896, 1.65683579, 0.6237316, 27.04125805, 0.16137916, 
0.32925148, 0.40487288, 1.41714764, 0.48798225, 0.07624247, 2.32993211, 
2.13228723, 6.74938943, 2.10121108, 0.56772145, 2.84504482, 1.67119601, 
1.68640938, 23.65633183, 7.78195561, 0.2131791, 0.33639542, 1.14508402, 
6.61466064, 2.03818446, 2.28166528, 0.28095778, 30.93008603, 
0.16512528, 1.21974281, 0.32121843, 103.90277963, 0.0967246, 
1.04882224, 1.68444513, 29.93956683, 0.43961534, 2.60623811, 
0.25039076, 0.76208631, 568.46662349, 1.66133511)
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading