Function for estimating Generalized Pareto Distribution parameters in R not working

Advertisements

In this paper https://www.jstor.org/stable/27867255?seq=1#metadata_info_tab_contents they provide a method of estimation for the GPD, based on the profile likelihood function and the empirical Bayesian method. The code they provide is the following:

# x is the sample data from the GPD
f <- function(x) {
  n <- length(x)
  x <- sort(x)
  lx <- function(b,x) {
    k <- -mean(log(1-b*x))
    if (b==0) {
      k-1-log(mean(x))
    } else {
      k-1+log(b/k)
    }
    }
  p <- (3:9)/10
  xp <- x[round(n*(1-p)+.5)]
  m <- 20+round(n^.5)
  xq <- x[round(n*(1-p*p)+.5)]
  k <-  log(xq/xp-1,p)
  a <- k*xp/(1-p^k)
  a[k==0] <- (-xp/log(p))[k==0]
  k <- -1
  b <- w <- L <- (n-1)/(n+1)/n[n]-(1-((1:m-.5)/m)^k)/k/median(a)/2
  for (i in 1:m) L[i] <- n*lx(b[i],x)
  for (i in 1:m) w[i] <- 1/sum(exp(L-L[i]))
  b <- sum(b*w)
  k <- -mean(log(1-b*x))
  list(sigma=k/b,k=k)
}

I simulated a vector of x values with size n = 100 from a GPD with shape parameter equals to 1 and scale parameter equals to -1. Then I get the error:

Error in if (condition) { : missing value where TRUE/FALSE needed

According to this question Error in if/while (condition) {: missing Value where TRUE/FALSE needed the error can be attributed to the NA values in the b vector, but I don’t understand why this happens.

My x vector:

x
  [1]   0.24264350   0.71418670   0.90929131   1.31011008   1.26467953   0.66491141  40.16132902
  [8]   0.24815930   0.83783287   1.18879459   1.87167421   0.00396890   1.58125086   0.32042532
 [15]   0.01330135   0.80800732   0.67832942   0.32623323  10.87447448   0.64779973   0.16550198
 [22]   1.03595024   0.08463867   6.17820208  15.07862358   0.19188437   0.14925543   2.20031552
 [29]   1.14093185   2.92184174   0.79279984   0.14736789   0.32954571  24.41728358   5.64925376
 [36]   1.31141081   1.62897827   0.01491289   0.86641233   0.82289179   0.12105522   1.66701079
 [43]   4.65817711  53.08764134   8.87696704   0.61560327   0.77817268   0.65148331   2.33976096
 [50]   2.96629395   0.43783850   3.11005777   0.44923740   0.27073261   2.98854135   1.85962571
 [57]   0.67065561   1.09661475   1.55934896   1.65683579   0.62373160  27.04125805   0.16137916
 [64]   0.32925148   0.40487288   1.41714764   0.48798225   0.07624247   2.32993211   2.13228723
 [71]   6.74938943   2.10121108   0.56772145   2.84504482   1.67119601   1.68640938  23.65633183
 [78]   7.78195561   0.21317910   0.33639542   1.14508402   6.61466064   2.03818446   2.28166528
 [85]   0.28095778  30.93008603   0.16512528   1.21974281   0.32121843 103.90277963   0.09672460
 [92]   1.04882224   1.68444513  29.93956683   0.43961534   2.60623811   0.25039076   0.76208631
 [99] 568.46662349   1.66133511

>Solution :

I think you have some transcription errors in your function (to be fair, the code is not well formatted in the paper). I think this should be what you’re looking for:

f <- function(x) {
  n <- length(x) 
  x <- sort(x)
 lx <- function(b, x) {
   k <- -mean(log(1 - b * x))
   if (b == 0) k - 1 - log(mean(x)) else k - 1 + log(b/k) 
 }
 p <- (3:9)/10 
 xp <- x[round(n*(1 - p)+ 0.5)] 
 m <- 20 + round(n^0.5)
 xq <- x[round(n*(1 - p * p) + .5)]
 k <- log(xq / xp - 1, p)
 a <- k*xp/(1 - p^k)
 a[k == 0] <- (-xp/log(p))[k == 0]
 k <- -1
 b <- w <- L <- (n-1)/(n+1)/x[n] -(1-((1:m-.5)/m)^k)/k/median(a)/2
 for (i in 1:m) L[i] <- n*lx(b[i],x) 
 for (i in 1:m) w[i] <- 1/sum(exp(L-L[i])) 
 b <- sum(b * w) 
 k <- -mean(log(1 - b * x))
 list(sigma = k / b, k = k)
}

Which gives, on your data:

f(x)
#> $sigma
#> [1] 1.050348
#> 
#> $k
#> [1] -1.070265

data

x <- c(0.2426435, 0.7141867, 0.90929131, 1.31011008, 1.26467953, 0.66491141, 
40.16132902, 0.2481593, 0.83783287, 1.18879459, 1.87167421, 0.0039689, 
1.58125086, 0.32042532, 0.01330135, 0.80800732, 0.67832942, 0.32623323, 
10.87447448, 0.64779973, 0.16550198, 1.03595024, 0.08463867, 
6.17820208, 15.07862358, 0.19188437, 0.14925543, 2.20031552, 
1.14093185, 2.92184174, 0.79279984, 0.14736789, 0.32954571, 24.41728358, 
5.64925376, 1.31141081, 1.62897827, 0.01491289, 0.86641233, 0.82289179, 
0.12105522, 1.66701079, 4.65817711, 53.08764134, 8.87696704, 
0.61560327, 0.77817268, 0.65148331, 2.33976096, 2.96629395, 0.4378385, 
3.11005777, 0.4492374, 0.27073261, 2.98854135, 1.85962571, 0.67065561, 
1.09661475, 1.55934896, 1.65683579, 0.6237316, 27.04125805, 0.16137916, 
0.32925148, 0.40487288, 1.41714764, 0.48798225, 0.07624247, 2.32993211, 
2.13228723, 6.74938943, 2.10121108, 0.56772145, 2.84504482, 1.67119601, 
1.68640938, 23.65633183, 7.78195561, 0.2131791, 0.33639542, 1.14508402, 
6.61466064, 2.03818446, 2.28166528, 0.28095778, 30.93008603, 
0.16512528, 1.21974281, 0.32121843, 103.90277963, 0.0967246, 
1.04882224, 1.68444513, 29.93956683, 0.43961534, 2.60623811, 
0.25039076, 0.76208631, 568.46662349, 1.66133511)

Leave a ReplyCancel reply