reaction time results appear in different two columns – Page 2

This topic has 26 replies, 2 voices, and was last updated 4 years, 8 months ago by iasmaa.

Viewing 12 posts - 16 through 27 (of 27 total)


Author

Posts
October 31, 2020 at 6:38 am #6281
iasmaa
Participant
Hi,

Target item: (in pcibex results file)
```
1596973404,MD5HASH_EDITED,PennController,17,0,trials,NULL,Controller-DashedSentence,DashedSentence,1,For,1596972359450,For example the_dodo_birds are extinct. ,No,No,No_article,C,465,false,For example the_dodo_birds are extinct. ,Any addtional parameters were appended as additional columns
```
Target item (As csv):

Filler item: (in pcibex results file)
```
1596973404,MD5HASH_EDITED,PennController,30,0,trials,NULL,Controller-DashedSentence,DashedSentence,1,But,1596972504366,458,false,But Bill sold used cars for twenty years.,Any addtional parameters were appended as additional columns
```
filler item (As csv):
- This reply was modified 4 years, 8 months ago by Jeremy. Reason: Anonymized MD5 hash
November 1, 2020 at 10:59 pm #6285

Jeremy
Keymaster

Hi,

I don’t see anything wrong with the lines you directly copied from your results file, so I suspect that the linebreaks visible in your screenshots are inserted by the software that you use to open the file. You do not need, and actually you probably should not, open the results file with a spreadsheet editor before analyzing it in R. When you’re in your project on the farm, simply click the eye icon under your results file to open its content in a new tab, and when you’re on that new tab, use your browser’s “Save as” (or “Save Page As”) option to save it on your local device. If you don’t enter a new filename, the file that gets saved should normally be simply named results, without any extension. Whether you add an extension or not to the filename, the content of the file is already comma-separated values (CSV), so there is no need to convert it or save it again after opening it in a spreadsheet editor. Just point to the unedited file you just saved with your browser from within the read.csv function in your R script, and I don’t see a reason why you should experience this problem.

If you follow this procedure and still experience a problem, feel free to send me your results file at support@pcibex.net. I won’t share its content with anyone and will delete it immediately after troubleshooting your problem, but you only have my word for it, so make sure it is OK to send me your results file before doing so.

Jeremy

November 2, 2020 at 6:39 pm #6290
Jeremy
Keymaster
Hi again,

I took a look at your results file, and it turns out I was completely wrong about the spreadsheet editor issue. Given your results file, the code I gave you indeed fails to detect columns after the 13th one, because the first CSV line (i.e., the first non-comment line) it encounters in the file has 13 columns.
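To illustrate the column-count problem, here is a minimal sketch on a made-up file (the file contents and the V1 to V5 column names are assumptions for illustration, not taken from the actual results file). It counts the fields on every line with count.fields and hands read.csv that many column names, so that no column is dropped because of what the first rows look like:

```r
# Made-up miniature results file: one comment line, six short rows, one long row
path <- tempfile()
writeLines(c("# a comment line", rep("a,b,c", 6), "a,b,c,d,e"), path)

# Count the comma-separated fields on each line and keep the largest count
n_cols <- max(count.fields(path, sep = ",", quote = ""), na.rm = TRUE)

# Giving read.csv that many column names makes it keep every column,
# whatever the first rows look like; short rows are padded with blanks
wide <- read.csv(path, comment.char = "#", header = FALSE,
                 col.names = paste0("V", seq_len(n_cols)))
dim(wide)
```

Scanning the whole file costs one extra pass over it, but it avoids guessing the column count from only the first few lines.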

Fortunately, there is a ready-made function from the tutorial that solves the problem: read.pcibex. So I inserted it into an adapted version of the code above, and it’s working now:
```
read.pcibex <- function(filepath, auto.colnames=TRUE, fun.col=function(col,cols){cols[cols==col]<-paste(col,"Ibex",sep=".");return(cols)}) {
  n.cols <- max(count.fields(filepath,sep=",",quote=NULL),na.rm=TRUE)
  if (auto.colnames){
    cols <- c()
    con <- file(filepath, "r")
    while ( TRUE ) {
      line <- readLines(con, n = 1, warn=FALSE)
      if ( length(line) == 0) {
        break
      }
      m <- regmatches(line,regexec("^# (\\d+)\\. (.+)\\.$",line))[[1]]
      if (length(m) == 3) {
        index <- as.numeric(m[2])
        value <- m[3]
        if (index < length(cols)){
          cols <- c()
        }
        if (is.function(fun.col)){
          cols <- fun.col(value,cols)
        }
        cols[index] <- value
        if (index == n.cols){
          break
        }
      }
    }
    close(con)
    return(read.csv(filepath, comment.char="#", header=FALSE, col.names=cols))
  }
  else{
    return(read.csv(filepath, comment.char="#", header=FALSE, col.names=seq(1:n.cols)))
  }
}

# First read the results file
all_results <- read.pcibex("resultsf")
# We'll work on the DashedSentence lines only
dashed_results <- subset(all_results, PennElementType=="Controller-DashedSentence")
# Let us first make columns 13 to 21 character-based instead of factors
# (lapply keeps one list element per column, whatever the number of rows)
cols.num <- seq(13,21)
dashed_results[cols.num] <- lapply(dashed_results[cols.num],as.character)
# Identify the short lines (the ones without extra1 and extra2)
short_lines <- dashed_results$Comments==""
# Now let's move the columns: on short lines, the values in columns 13-16
# belong in columns 18-21
wrong.cols <- seq(13,16)
empty.cols <- seq(18,21)
dashed_results[short_lines,empty.cols] <- dashed_results[short_lines,wrong.cols]
dashed_results[short_lines,wrong.cols] <- NA
# Done!
```
Sorry for overlooking this problem. I shouldn’t have insisted it came from an editing process when you were clearly just following my advice.

Let me know if you have any questions.

Jeremy
November 3, 2020 at 2:53 am #6293

iasmaa
Participant

Hi,

Thank you so much for your effort! I really appreciate your help.
The results look much tidier now. However, some issues remain:

– RQ is actually the rating question, which uses a scale of 4, so the RQ column should contain the numbers 1, 2, 3, 4.
I do not really know why it is treated as a yes/no question, since I have only one yes/no question, which is the CQ.
However, the RQ answers do exist on later lines, below the DashedSentence lines (this is clearly visible in the results file).
So I think that because we kept only the DashedSentence lines, we lost the numeric RQ answers. How can I get them back and include them in the RQ column?

– I also noticed that the filler items (which cover tenses, not just articles) have no answers for any of the three questions (the CQ, which is answered with Yes/No; the RQ, which uses a scale of 4; and the MC, which takes an article) and show NA instead.

– It also gives the group name C to all items, which is not right, because I have five groups for my five conditions (A, B, C, D, E).

Thank you.

November 3, 2020 at 3:32 am #6294

iasmaa
Participant

I think I know now why RQ was recorded as a yes/no question. I did that in my template, in the Correct2 column, and logged it at the end of the trial with .log( "RQcorrect" , row.Correct2 ).

I logged the scale answer too, but not at the end of the trial. So it is in my results, but when we exclude all other lines and keep only the DashedSentence lines, the scale answers of course disappear.

November 3, 2020 at 3:56 am #6295

iasmaa
Participant

Sorry, but I figured something out and I just want you to have a clear picture of it.

The answers to the three questions that appear in the columns CQ, RQ and MC are the correct answers that I wrote in the template. However, the three lines below them (which disappear once we keep only the DashedSentence lines) hold the participants’ answers, and I need both the correct answers and the participants’ answers to appear after keeping only the DashedSentence lines.

These are the three lines I’m talking about:

[Screenshot: OnPaste.20201103-074822]

Sorry for making it so complicated.

November 3, 2020 at 11:45 am #6301
Jeremy
Keymaster
Hi,

You will need to do some more transformations on your data frames. Here is what you can do:
```
# Set 'group' to non-null value for each participant
library(dplyr)
dashed_results <- dashed_results %>%
  group_by(Time.results.were.received, MD5.hash.of.participant.s.IP.address) %>%
  mutate(group=max(group)) %>%
  ungroup()

# Select fields from all_results, make them columns per item, and merge with dashed_results
library(tidyr)
fields <- c("yesnocorrect","scalecorrect","whicharticle")
correct_which_results <- subset(all_results, PennElementName %in% fields)
correct_which_results <- correct_which_results %>%
  group_by(Time.results.were.received, MD5.hash.of.participant.s.IP.address, Item.number) %>%
  select(Time.results.were.received, MD5.hash.of.participant.s.IP.address, Item.number, PennElementName, Value) %>%
  spread(PennElementName, Value)
# Keep the DashedSentence lines that have a match for participant and item
dashed_results <- merge(dashed_results, correct_which_results,
      by=c("Time.results.were.received", "MD5.hash.of.participant.s.IP.address", "Item.number"))
```
You’ll still have some NAs for whicharticle because you didn’t seem to have that Scale for your filler items.

Jeremy
- This reply was modified 4 years, 8 months ago by Jeremy. Reason: replaced tmp with dashed_results
November 4, 2020 at 5:24 am #6309

iasmaa
Participant

Many thanks for this.

- I also moved whichtense for the filler MC using the same code, and it works fine. But the fillers still have NA in the three columns CQcorrect, RQcorrect and MCcorrect, although I set those in my PCIbex script.

- I also moved choose (for nationality) and choose2 (for English level). That worked in correct_which_results, but they appear as NA in the main tmp! I do not really know why.
I also tried to add a column for the groups, which are defined by the PennElementName TextInput2, but I got this message:
Error: Each row of output must be identified by a unique combination of keys.
Keys are shared for 144 rows:
I deleted row number 10 because it has the same name (TextInput2) as row 11, but that did not solve the problem.

Sorry, I feel this is becoming more of an R question than a PCIbex one, but I would appreciate your help with it.

November 4, 2020 at 11:55 am #6311

Jeremy
Keymaster

Hi,

> I also moved whichtense for the filler MC using the same code, and it works fine. But the fillers still have NA in the three columns CQcorrect, RQcorrect and MCcorrect, although I set those in my PCIbex script.

If you did not generate your filler items using a Template but still used something like .log( "CQcorrect" , row.CQcorrect ) on your filler trial, then you got NAs because there is simply no table row for row to point to.

> I also moved choose (for nationality) and choose2 (for English level). That worked in correct_which_results, but they appear as NA in the main tmp! I do not really know why.

Your choose and choose2 Scale elements were not present in every single trial, but only as part of a non-test, non-filler initial trial. The code for correct_which_results picks the answers to yesnocorrect, scalecorrect and whicharticle for every single trial and reports them as additional columns for the DashedSentence lines corresponding to the same trial.

If you are not familiar with the dplyr package, you should read the documentation so you get a better sense of what the code above does. I also used the tidyr package because it contains the spread function that transforms pairs of columns from multiple rows into multiple columns on a single row.
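As a rough illustration of what spread does, here is a minimal sketch on made-up data (the item numbers and values are invented; only the column names PennElementName, Value and Item.number mirror the results file). Each item has one row per logged element, and spread turns those rows into one row per item with one column per element:

```r
library(tidyr)

# Made-up long-format data: one row per (item, element) pair
long <- data.frame(
  Item.number     = c(1, 1, 2, 2),
  PennElementName = c("yesnocorrect", "scalecorrect", "yesnocorrect", "scalecorrect"),
  Value           = c("Yes", "3", "No", "4"),
  stringsAsFactors = FALSE
)

# One row per item, one column per distinct PennElementName
wide <- spread(long, PennElementName, Value)
wide
```

Note that spread requires each combination of the remaining columns and the key to be unique; two rows with the same Item.number and the same PennElementName would trigger the "unique combination of keys" error.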

> I also tried to add a column for the groups, which are defined by the PennElementName TextInput2, but I got this message:
> Error: Each row of output must be identified by a unique combination of keys.
> Keys are shared for 144 rows:
> I deleted row number 10 because it has the same name (TextInput2) as row 11, but that did not solve the problem.

This is basically the same problem: TextInput2 is only defined for trial #1, so you need to adapt the code to that. Also, you have two lines for TextInput2 for every participant; you only want to keep the second one, the one for which Parameter is Final.
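One possible way to adapt the code, sketched on made-up data: keep only the TextInput2 line whose Parameter is Final, then merge on the participant alone rather than on participant and item. The participant column here is a hypothetical stand-in for the two participant-identifying columns of the real results file, and typed_group, the "First" parameter value and all values are invented for illustration.

```r
# Made-up results: two TextInput2 lines per participant in trial 1,
# plus one DashedSentence line in a later trial
all_results <- data.frame(
  participant     = c("p1", "p1", "p1", "p2", "p2", "p2"),
  Item.number     = c(1, 1, 17, 1, 1, 17),
  PennElementName = c("TextInput2", "TextInput2", "DashedSentence",
                      "TextInput2", "TextInput2", "DashedSentence"),
  Parameter       = c("First", "Final", "1", "First", "Final", "1"),
  Value           = c("A", "A", "For", "B", "B", "For"),
  stringsAsFactors = FALSE
)

# Keep a single TextInput2 line per participant: the one marked Final
groups <- subset(all_results,
                 PennElementName == "TextInput2" & Parameter == "Final",
                 select = c(participant, Value))
names(groups)[names(groups) == "Value"] <- "typed_group"

# Merge on the participant only, so the value reaches every trial
dashed <- subset(all_results, PennElementName == "DashedSentence")
dashed <- merge(dashed, groups, by = "participant")
dashed
```

Leaving Item.number out of the merge keys is the important part: the TextInput2 value exists only for the first trial, so matching on the item as well would find nothing to attach to the DashedSentence lines.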

Jeremy

November 9, 2020 at 12:24 pm #6332

iasmaa
Participant

Thank you for this.

> If you did not generate your filler items using a Template but still used something like .log( "CQcorrect" , row.CQcorrect ) on your filler trial, then you got NAs because there is simply no row for row to point to.

^ So how can I fix that?

> This is basically the same problem: TextInput2 is only defined for trial #1, so you need to adapt the code to that. Also, you have two lines for TextInput2 for every participant, you only want to keep the second one, the one for which Parameter is Final.

^ I adapted the code and got an empty data frame!

Appreciate your help.

November 9, 2020 at 12:41 pm #6333

Jeremy
Keymaster

Hi,

I’m not sure I understand your first question. As I understand it, you had a CQcorrect column for your non-filler trials to indicate which answer was the correct one for each of those trials. I don’t know the details of your design, but you probably cannot use CQcorrect from the non-filler trials to retrieve what it should be for the filler trials. Since you didn’t do it before data collection, you’ll have to associate your filler trials with their own CQcorrect manually now, after data collection: I don’t know how you could determine CQcorrect for the filler trials a priori.
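If it helps, one way to do that manual association in R is a small lookup table merged onto the trials. This is only a sketch under assumptions: the item numbers, the answers and the names toy_results, filler_key and CQcorrect_filler are all made up, and it assumes the item number identifies each filler.

```r
# Made-up trial data: item 17 is a test item, items 30 and 31 are fillers
toy_results <- data.frame(
  Item.number = c(17, 30, 31),
  Value       = c("For", "But", "The"),
  stringsAsFactors = FALSE
)

# Hand-written key giving the correct CQ answer for each filler item
filler_key <- data.frame(
  Item.number      = c(30, 31),
  CQcorrect_filler = c("Yes", "No"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps the test items too; they simply get NA in the new column
merged <- merge(toy_results, filler_key, by = "Item.number", all.x = TRUE)
merged
```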

Regarding the second point, there could be plenty of reasons why you end up with an empty data frame. This is definitely R-related, so I cannot give priority to this question. I encourage you to see if you can find additional assistance from colleagues who are familiar with R (no need to be familiar with PCIbex here, since it’s all about data-frame transformation). You can still post your R code here, but as I said I won’t have time to answer it immediately.

Jeremy

November 9, 2020 at 12:51 pm #6334

iasmaa
Participant

Thank you so much for your help.