There are 7.8 million researchers(1) in the world, and it is safe to say that they all use surveys as a method of data collection at least once in their career. This already amounts to an overwhelming number of surveys, but it carries even more weight when you think of all the data collected through these surveys and all the fields they cover. And yet, many surveys, whether used for data collection, feedback, or general opinions, are poorly validated or not validated at all.
That is why we have outlined a number of steps to determine whether your survey is actually measuring what you want it to measure. We used this method in our EICAA project (Entrepreneurial and Intrapreneurial Competence Assessment Alliance)(2). EICAA is building a testing tool to measure and assess entrepreneurial competencies. We did 2 rounds of validation for this tool, using a 5-step process.
The survey was formulated using a well-researched rubric system. It was then reviewed, and subsequently iterated, several times by more than 15 experts from the project consortium.
Point to note: the likelihood of version control issues increases with the number of versions. You can prevent this by putting an effective version control strategy in place.
To validate the (different versions of the) survey, we ran 2 pre-tests with over 300 responses across both rounds. Currently, the EICAA consortium is running a pilot round, aiming for 1,000+ responses.
Point to note: planning for data is crucial. For some of our analyses, we will need at least 200 data points, so wear your planning hat well in advance!
Our survey ran online on Qualtrics, which is why data entry wasn’t an issue for us. If you plan to run an offline survey, involving more than one person in data entry is always a good idea.
To clean our data, we did the following:
Point to note: we didn’t have the problem of reverse coding. But if your survey has negatively worded questions, with answer options that run opposite to the other questions, you need to reverse-code those responses, as in the sketch below.
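As an illustration, here is a minimal sketch of reverse coding in Python with pandas; the column names, scale range, and values are all hypothetical:

```python
import pandas as pd

# Hypothetical responses on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree)
df = pd.DataFrame({
    "q1": [5, 4, 2],      # positively worded item
    "q2_neg": [1, 2, 4],  # negatively worded item, its scale runs the opposite way
})

SCALE_MIN, SCALE_MAX = 1, 5

# Reverse-code the negatively worded item: 1 becomes 5, 2 becomes 4, and so on
df["q2_neg"] = (SCALE_MIN + SCALE_MAX) - df["q2_neg"]
print(df)
```

After this transformation, a high score means the same thing on every item, so the items can be summed or averaged into a scale.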
Aiming to be on the right side of the bell curve, we performed the following analyses on the data collected in our pre-test rounds:
Point to note: take the data type into account when choosing a reliability parameter: for example, whether the data are parametric or non-parametric, nominal or not, etc.
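For instance, Cronbach's alpha is a common reliability coefficient for multi-item Likert-type scales; the sketch below is only an illustration, not necessarily the parameter we used, and the scores are made up:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of scores."""
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    n_items = items.shape[1]
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical scores: 6 respondents, 3 items measuring the same construct
scores = np.array([
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
    [3, 2, 3],
])
print(round(cronbach_alpha(scores), 2))
```

For dichotomous or non-parametric data, other coefficients (e.g. Kuder-Richardson or ordinal alternatives) are more appropriate.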
Point to note: this was a nice-to-have rather than a must-have for our analysis.
Point to note: an exploratory factor analysis is not recommended when there are more survey items / variables than points of observation / responses. We addressed this by breaking our data down into groups, one for each of the 3 main domains, and then conducting the analysis separately per group, so that each analysis involves fewer items than responses.
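Here is a minimal sketch of that per-domain approach, using scikit-learn's FactorAnalysis on made-up data; the actual item counts, domains, and tooling of the project are not shown here:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Hypothetical pre-test data: 120 respondents answering 10 items from one domain,
# so that observations comfortably outnumber variables
domain_responses = rng.integers(1, 6, size=(120, 10)).astype(float)

# Fit the exploratory factor model per domain instead of on all items at once
fa = FactorAnalysis(n_components=2)
fa.fit(domain_responses)
print(fa.components_.shape)  # (n_factors, n_items) loading matrix
```

Repeating this for each domain keeps every analysis well within the items-versus-observations rule of thumb.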
Point to note: as mentioned before, you need at least 200 responses for the results to be accurate.
All in all, the little seed we planted at the beginning of our work has grown through the pre-test stage and is now speaking for itself, and all it took were these 5 simple steps.
References