Missing data (Failed submission)




    I recently ran an experiment with 61 participants. The Results panel shows “Preview of latest submission (out of 61)”, indicating 61 submissions, but the CSV file I downloaded contains data for only 60 participants; one is missing.

    I looked further into the results and found that the data of the 2nd participant are missing. The results.csv file indicates that the participant in question was present, because the following rows were recorded:

    # Results on Thu
    # USER AGENT: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML
    # Design number was non-random = 0
    # Columns below this comment are as follows:
    # 1. Results reception time.
    # 2. MD5 hash of participant's IP address.
    # 3. Controller name.
    # 4. Order number of item.
    # 5. Inner element number.
    # 6. Label.
    # 7. Latin Square Group.

    However, no further rows are recorded with this participant.
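To check a downloaded results file for this pattern (a comment header with no data rows after it), a minimal Python sketch could look like the following. It assumes that every submission block starts with a comment line beginning with `# Results on`, as in the excerpt above, and that every non-empty line not starting with `#` is a data row. The function names are made up for illustration and are not part of PCIbex.

```python
from typing import List, Tuple

# Assumption: each submission block opens with a comment line like this one.
HEADER_PREFIX = "# Results on"


def count_rows_per_submission(text: str) -> List[Tuple[str, int]]:
    """Return (header line, number of data rows) for each submission block."""
    submissions: List[Tuple[str, int]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(HEADER_PREFIX):
            # A new submission block begins here.
            submissions.append((line, 0))
        elif line and not line.startswith("#") and submissions:
            # Non-comment, non-empty line: count it for the current block.
            header, n = submissions[-1]
            submissions[-1] = (header, n + 1)
    return submissions


def empty_submissions(text: str) -> List[str]:
    """Headers of submissions that have no data rows at all."""
    return [h for h, n in count_rows_per_submission(text) if n == 0]
```

Calling `empty_submissions(open("results.csv", encoding="utf-8").read())` would then list the header lines of submissions whose rows never arrived. The file name is a placeholder.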

    Now I remember that I had similar cases in the past. In all these cases, the participants whose data are missing seem to have completed the tasks to the end, because they provided us with the randomly generated codes that are shown at the end of the experiment.

    My question is, how and under what conditions could this happen? Are there any precautions I can take to prevent this?

    Demonstration link:




    Unfortunately, that submission was sent to the farm’s servers precisely at the beginning of a short downtime episode, at 2:12am GMT on January 13. The submission entry was created, but the service crashed right before the data could be saved to memory.

    Unfortunately, there is not much you can do on your end. I have been monitoring the problem more closely these past couple of days, and the crashes appear to happen when there is a lot of incoming data at once, so minimizing the data in the submissions is good practice. In your case, however, we’re talking about only 30 lines per submission, which is already very little.

    My attempts at controlling or working around those crashes have not been the most successful so far. The service is supposed to temporarily save a copy of the incoming data on disk before sending it to the database, but even that copy seems to have failed to be saved in your case.

    I apologize for the inconvenience. I am still actively working on addressing the issue.



    OK, thank you for the detailed explanation regarding the situation. I really appreciate the presence of this platform and also thank you for your efforts to improve the system!

    -Ken N.



    I seem to have the same problem, but out of 10 participants, 6 were not saved. The details are the same; I have a few rows per “missing” participant recorded in my results file, like so:

    # Results on Tue
    # USER AGENT: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML
    # Design number was non-random = 8
    # Columns below this comment are as follows:
    # 1. Results reception time.
    # 2. MD5 hash of participant’s IP address.
    # 3. Controller name.
    # 4. Order number of item.
    # 5. Inner element number.
    # 6. Label.
    # 7. Latin Square Group.

    All 6 participants submitted between 15:57 GMT and 16:17:47 GMT; other participants’ data was successfully stored before and after this point. Can this be attributed to a downtime episode as well?

    Would minimising the data in submissions help prevent this, and if so, how would I do that? (I still need to collect 160 participants.)

    Thank you.

    Demo: https://farm.pcibex.net/r/UeksSE/


    Hi Merel,

    The database shows that your experiment did receive 10 submissions, with between 1918 and 1922 rows for each of them, so you should see all your submissions and the corresponding rows in the results file. Maybe you tried to generate the results file before all the incoming data had finished being processed by the server?

    Let me know if the problem persists.



    Hi Jeremy,

    You’re right, thank you! I tried downloading it several times yesterday and it kept coming out the same way, but reloading it this morning made more rows appear.


    Hi Jeremy,

    I’m having a similar issue with missing results in my data. It appears I’m missing 4 submissions from the following experiment: https://farm.pcibex.net/r/SpzIlM/. I wonder if there is any chance this data can be recovered from the PCIbex server?



    Hi Monica,

    I see 13 submissions for that experiment, the four most recent ones received on April 27, April 28, May 6 and May 12. Each of your 13 submissions has between 54 and 58 rows.

    Were you expecting another 4 submissions (for a total of 17 submissions)?



    Hi Jeremy,
    I now have that exact same problem with my experiment: for 1 out of 4 participants, who clearly did the experiment (recruited via Prolific, and they got the completion code right), I see only the rows posted above:

    # Results on Wed, 09 Aug 2023 11:30:00 GMT
    # USER AGENT: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36
    # Design number was non-random = 0
    # Columns below this comment are as follows:
    # 1. Results reception time.
    # 2. MD5 hash of participant's IP address.
    # 3. Controller name.
    # 4. Order number of item.
    # 5. Inner element number.
    # 6. Label.
    # 7. Latin Square Group.

    link to the experiment:


    It has been several hours now. I refreshed the page and logged out of and back into the PCIbex Farm, but no further rows ever appear. Normally when I run these little experiments, I can download all of the results as soon as Prolific tells me that the participants have completed the experiment.
    Is it possible that it sometimes takes longer for the results to show up? Or is it again a problem with the server?

    Thank you!!
    Best, Anna


    Hi Anna,

    Sorry for the slow reply. I only have intermittent access to the database, and I haven’t had the chance to look it up yet. I’ll take a look at it, but judging from the comment lines you see in the results file, if the submission is still not present when you download the file, chances are that an entry was added to the database but the content never reached it for some reason, which can happen when the load on the server is high. Apologies for the inconvenience.



    Hi again,

    I was finally able to check our servers, and unfortunately I can now confirm that the rows for that submission are nowhere to be found. Apologies for the inconvenience.




    The results file from my experiment seems to be affected by partial data loss. I ran the survey yesterday, August 15, and advertised it through the Clickworker platform. Around 1 pm (Berlin time), Clickworker reported that the survey had been completed by 32 people (my desired N). When I checked my results file on PCIbex, however, only 19 people were listed, and the file had last been updated around 11 am. So 13 participants are missing entirely: no rows are present in the data file after the 19th participant, and the results do not cover the last two hours of the experiment. Is there anything that can be done to retrieve the missing data? Why could this have happened?
    We had already used the same script for other experiments, and this problem had never occurred.
    Following the suggestions from the support forum, I have already refreshed the PCIbex page, logged out and back in, and tried different browsers and devices, but nothing helps.

    This is the link to the experiment if it helps:



    Hi Valentina,

    I replied to an email sent to support@pcibex.net about this. As I explained in my email, that address does not point to the PCIbex Farm, so we cannot help with data recovery. However, keep in mind that some participants might not have really completed the experiment but instead retrieved the confirmation code by other means to get credits.



    Thank you for the quick and helpful reply, Jeremy!


    Hi Jeremy,

    I collected the data just now, and when I checked the results file, some data points seemed to be missing for some participants. I checked it in R, which reports the counts below. Each participant is supposed to have 1212 data points. Could it be because they ran the experiment at around the same time? How can I minimize possible data loss when participants use the same link at around the same time?
    Below are my link and the data points collected:
    IP                                   1     2     3     4
    1e07cab9da3e98e661b2d0e80f5b99e7     0  1212     0     0
    23cb2c1b15ec3675dcfb69574b2fbfd6  1212     0     0     0
    733c9df3b988e58faae28c0e00bb4f35   843     0     0     0
    e8bc449ea6c87f719356447d4f2da4f5     0     0   837     0
    eeeb66b06c9a1bdfd1764cd830a43bd8     0     0     0   833
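
For readers who want to run a similar per-participant check outside R, here is a rough Python sketch. It assumes the results file is comma-separated, that comment lines start with `#`, and that the second column holds the MD5 hash of the participant's IP address (as the column comments quoted earlier in this thread state). The function names and the `expected` parameter are hypothetical. Note that two participants sharing an IP address would be counted as one under this assumption.

```python
import csv
from collections import Counter
from typing import Dict, Iterable


def rows_per_participant(lines: Iterable[str]) -> Counter:
    """Count data rows per participant hash (column 2 of each row)."""
    # Skip blank lines and '#' comment lines before handing rows to the CSV parser.
    data_lines = (
        ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")
    )
    counts: Counter = Counter()
    for row in csv.reader(data_lines):
        if len(row) >= 2:
            counts[row[1]] += 1
    return counts


def incomplete(counts: Counter, expected: int) -> Dict[str, int]:
    """Participants with fewer rows than expected (e.g. expected=1212)."""
    return {pid: n for pid, n in counts.items() if n < expected}
```

With a file on disk, `incomplete(rows_per_participant(open("results.csv", encoding="utf-8")), expected=1212)` would return the hashes that fall short, such as the three participants with 843, 837 and 833 rows above. The file name is a placeholder.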
