How to create a test and train dataset in R by specifying the range in the data set instead of using set.seed() function and probability?

I am a noob at programming, so sorry if this is a silly question. My supervisor doesn’t seem to trust the set.seed() function in R, as every seed number will yield a different output (with different test and train sets). Thus she asked me to specify the range for my training and test datasets. I am conducting…
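As a rough illustration of what "specifying the range" could look like, here is a minimal sketch in base R that splits a data frame by fixed row ranges instead of random sampling. The data frame `df`, its columns, and the 80/20 cut-off are assumptions made up for this example, not details from the original question.

```r
# Hypothetical data; `df` and its columns are placeholders for illustration.
df <- data.frame(x = 1:100, y = (1:100) * 2)

# Fixed row ranges: the first 80 rows for training, the rest for testing.
# The 80/20 cut-off is an assumption; choose whatever range is required.
train_rows <- 1:80
test_rows  <- 81:nrow(df)

train <- df[train_rows, ]
test  <- df[test_rows, ]

nrow(train)
nrow(test)
```

A range-based split is fully deterministic, so it gives the same sets on every run. If the rows are sorted in some meaningful order, though, the two sets may differ systematically.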

Unable to retrieve multiple values from database

The following data exists in the database: [ { "_id": { "$oid": "628c787de53612aad30021ab" }, "ticker": "EURUSD", "dtyyyymmdd": "20030505", "time": "030000", "open": "1.12161", "high": "1.12209", "low": "1.12161", "close": "1.12209", "vol": "561", "id": 1 }, { "_id": { "$oid": "628c787de53612aad30021ac" }, "ticker": "EURUSD", "dtyyyymmdd": "20030505", "time": "030100", "open": "1.12206", "high": "1.1225", "low": "1.12206", "close": "1.1225", "vol":…

Elastic net regression model with loops is not giving me list of results based on alpha R

I would much appreciate some advice. I'm currently doing an elastic net regression in which I use a for loop to apply 10 different values of alpha from 0 to 1. The problem is that when I ask for the results of the models, it only gives me the result for alpha=1. Here is…

Displaying percentages within category for continuous/ordered variable (with ggplot)

I have two questions, the first a (hopefully) straightforward mechanical one and the second more theoretical (though still with a technical element). I am trying to do something nearly identical to this question, but I have a variable that is ordered/continuous (0 – 4), instead of a 1/0 dichotomous variable, which means that filtering…

Specific color for each boxplot in aes

I have a set of data that can be summarized as in the example below: some values and two other columns that represent some categorical values.
library(ggplot2)
set.seed(123)
sdata=data.frame( aicat=sample(letters[1:4], 50, replace=TRUE), site=sample(letters[5:6], 50, replace=TRUE), vals=sample(1:1000, 50) )
head(sdata)
  aicat site vals
1     c    f  409
2     c    e  308
3     c    e  278…

R: Randomly Replace Values with NA

I am working with the R programming language. I am trying to select 10% of the elements in my dataset (excluding elements in the first column) and replace them with NA. I tried to do this with the following code:
library(longitudinalData)
data(artificialLongData)
second_dataset = artificialLongData
second_dataset[sample(nrow(second_dataset),0.1*nrow(second_dataset ))]<- NA
This produces the following error: Error in `[<-.data.frame`(`*tmp*`,…
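One possible way to blank out 10% of the cells while leaving the first column untouched is sketched below in base R, using a logical mask with the same dimensions as the data frame. The data frame `dat` is a made-up stand-in (the original uses `artificialLongData` from the longitudinalData package), and this is only an illustration of the idea, not the answer given in the original post.

```r
set.seed(42)  # used here only so the illustration is repeatable

# Hypothetical stand-in data: an id column plus five numeric columns.
dat <- data.frame(id = 1:20, matrix(rnorm(20 * 5), nrow = 20))

# Logical mask with the same shape as `dat`; TRUE marks a cell to blank out.
mask <- matrix(FALSE, nrow = nrow(dat), ncol = ncol(dat))

# Linear positions of every cell that is not in the first column.
eligible <- which(col(mask) > 1)

# Pick 10% of the eligible cells at random and flag them.
n_na <- round(0.1 * length(eligible))
mask[sample(eligible, n_na)] <- TRUE

# Assignment through a same-shaped logical matrix sets individual cells to NA.
dat[mask] <- NA

sum(is.na(dat))
```

Indexing with a single vector, as in `second_dataset[sample(...)]`, selects whole columns of a data frame rather than individual cells, which is why a cell-level mask is used here instead.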