Developing a patient-reported outcome (PRO) instrument is like building a bridge between clinical science and a patient’s lived experience. The process takes time and requires both qualitative and quantitative approaches. A well-designed PRO tool ensures that the patient’s voice is heard clearly in clinical trials, routine care, and health research. Here is a detailed, step-by-step guide to how PRO instruments are developed, with examples to help you understand each stage.
Step 1: Identify the Concept of Interest
Every PRO instrument begins with a clear definition of what you want to measure. This is called the concept of interest. It could be something broad, like health-related quality of life, or something very specific, like treatment-related fatigue in breast cancer patients.
For example, imagine you are creating a PRO tool for patients with rheumatoid arthritis. You might want to measure how pain affects their ability to perform daily activities such as walking, cooking, or even sleeping. Without defining this focus, the questionnaire may become too vague and fail to provide useful data.
At this stage, you review existing literature and clinical guidelines to ensure that the concept you are focusing on is well-defined and clinically relevant.
Step 2: Engage Patients and Clinicians Early
A PRO instrument is only as good as the insights that go into it. This is why researchers start by talking to patients and clinicians. Patients provide first-hand perspectives on their symptoms and quality of life, while clinicians help ensure that the questions align with medical knowledge.
For example, for a new cancer fatigue tool, researchers might conduct one-on-one interviews with 20 patients who are undergoing chemotherapy. They would ask questions like:
- “How does fatigue affect your ability to perform daily tasks?”
- “What times of day is your fatigue most intense?”
Clinicians might then review these patient responses to suggest medically relevant items that should be included in the questionnaire.
Step 3: Generate an Item Pool
Item generation involves creating a large list of possible questions based on what patients and experts have shared. These items cover all aspects of the concept of interest.
Example Questions for a Migraine PRO Tool:
- “How many days in the past month did you experience migraine pain?”
- “How much did migraines interfere with your ability to concentrate?”
- “How anxious were you about your next migraine attack?”
At this stage, it is better to create more items than necessary. Later, these items will be tested, and those that are confusing, redundant, or irrelevant will be removed.
Step 4: Cognitive Debriefing
Once you have a draft questionnaire, the next step is cognitive debriefing. This process involves asking patients to complete the questionnaire and explain how they understood each question.
A patient might be asked, “What did you think this question was asking?” If a question like “How much did your fatigue affect your work?” confuses patients who are unemployed or retired, the item may need to be adjusted or replaced with a more inclusive version.
Cognitive debriefing ensures that patients interpret the questions consistently and that the language is accessible.
Step 5: Pilot Testing and Data Collection
After the questionnaire has been refined, it is tested on a small sample of the target population. This pilot phase helps determine whether the questions are clear, whether the response scales (e.g., 1 to 5) make sense, and whether the questionnaire length is appropriate.
A pilot test might involve 50 patients with chronic pain completing the draft PRO tool. Researchers might notice that many patients choose the same middle option on a 1 to 5 scale. This clustering suggests that the response options do not differentiate well between patients, so the scale may need to be reworded or adjusted.
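The clustering check described above can be sketched in a few lines of Python. This is only an illustration: the answers are made up, and the 50% cutoff for flagging an option is an arbitrary assumption, not an established rule.

```python
from collections import Counter

def response_distribution(responses, scale=(1, 2, 3, 4, 5)):
    """Return the share of answers that fall on each point of the scale."""
    counts = Counter(responses)
    total = len(responses)
    return {point: counts.get(point, 0) / total for point in scale}

# Hypothetical pilot answers to a single item from 10 patients.
pilot_answers = [3, 3, 3, 2, 3, 4, 3, 3, 3, 5]
shares = response_distribution(pilot_answers)

# Flag any option chosen by more than half of the respondents
# (the 0.5 cutoff is an assumption made for this example).
clustered = [point for point, share in shares.items() if share > 0.5]
print(shares)
print("Options that may be over-used:", clustered)
```

An item flagged this way is a candidate for rewording, not automatically a bad item; the pattern still has to be interpreted alongside patient feedback.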
Step 6: Psychometric Validation
Psychometric validation tests whether the PRO instrument is both reliable (it gives consistent results) and valid (it measures what it is meant to measure). This involves statistical methods such as:
- Cronbach’s alpha, which measures the internal consistency of the questionnaire, that is, how closely the items in a scale relate to one another.
- Test-retest reliability, which checks whether patients provide similar answers on two occasions when their condition hasn’t changed.
- Factor analysis, which identifies groups of related questions (called factors or domains).
In a PRO tool for depression, factor analysis might reveal two distinct domains—emotional symptoms (e.g., sadness) and physical symptoms (e.g., fatigue). This information helps refine the structure of the questionnaire.
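To make the first two statistics concrete, here is a minimal sketch using only the Python standard library. The data are hypothetical, and the Pearson correlation is used as a simple stand-in for test-retest reliability; in practice an intraclass correlation coefficient is often preferred, and a dedicated statistics package would normally be used.

```python
from math import sqrt
from statistics import mean, variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha. item_scores holds one list per respondent,
    with one value per item."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def pearson_r(x, y):
    """Pearson correlation between two lists of equal length."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical answers: 5 respondents, 3 items on a 1 to 5 scale.
answers = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4]]
alpha = cronbach_alpha(answers)

# Hypothetical total scores from the same stable patients two weeks apart.
first_visit = [sum(row) for row in answers]
second_visit = [5, 7, 9, 13, 15]
retest_r = pearson_r(first_visit, second_visit)

print(f"alpha = {alpha:.2f}, test-retest r = {retest_r:.2f}")
```

Values closer to 1 indicate higher consistency, but what counts as acceptable depends on how the instrument will be used.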
Step 7: Establish Content Validity
Content validity ensures that the questionnaire covers all relevant aspects of the concept. This is confirmed through reviews with both patients and experts.
For a diabetes-related quality of life tool, patients may highlight that dietary restrictions and social embarrassment (such as checking blood sugar in public) are major concerns. These issues must be included in the questionnaire for it to be considered complete.
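One common way to summarize such reviews is an item-level content validity index: the share of reviewers who rate an item as relevant. The sketch below assumes a 1 to 4 relevance scale where 3 or 4 counts as relevant; the item names and ratings are hypothetical, and the text above does not prescribe this particular method.

```python
def item_cvi(ratings, relevant_min=3):
    """Share of reviewers who rated the item as relevant
    (a rating of 3 or 4 on an assumed 1 to 4 scale)."""
    return sum(1 for r in ratings if r >= relevant_min) / len(ratings)

# Hypothetical relevance ratings from six reviewers per candidate item.
ratings_by_item = {
    "dietary restrictions": [4, 4, 3, 4, 3, 4],
    "checking blood sugar in public": [4, 3, 4, 4, 2, 4],
    "preferred brand of glucose meter": [1, 2, 1, 3, 2, 1],
}

cvi = {item: item_cvi(ratings) for item, ratings in ratings_by_item.items()}
for item, value in cvi.items():
    print(f"{item}: {value:.2f}")
```

Items with a low index are candidates for removal or rewording, while concerns that reviewers raise but no item covers point to gaps in the questionnaire.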
Step 8: Known Groups Validity, Sensitivity, and Responsiveness
A PRO instrument should distinguish between groups that are expected to differ. This is called known groups validity. It also needs to be sensitive to differences between treatments and responsive to changes in an individual’s condition over time.
If you have a PRO tool for osteoarthritis pain, it should show higher pain scores for patients awaiting joint replacement surgery compared to those who recently had successful surgery. Over time, as patients recover, the instrument should detect improvements in their pain and mobility.
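Both ideas can be expressed as simple effect sizes. The sketch below uses Cohen's d for the known-groups comparison and the standardized response mean for change over time; these are two common choices rather than the only ones, and all scores are hypothetical pain ratings on a 0 to 10 scale.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def standardized_response_mean(before, after):
    """Mean change divided by the standard deviation of the change."""
    changes = [a - b for b, a in zip(before, after)]
    return mean(changes) / stdev(changes)

# Known groups: hypothetical pain scores (higher means more pain).
awaiting_surgery = [7, 8, 6, 9, 7]
after_successful_surgery = [3, 2, 4, 3, 2]
d = cohens_d(awaiting_surgery, after_successful_surgery)

# Responsiveness: the same hypothetical patients before and after recovery.
pain_before = [7, 8, 6, 9, 7]
pain_after = [4, 6, 3, 5, 5]
srm = standardized_response_mean(pain_before, pain_after)

print(f"Cohen's d = {d:.2f}, SRM = {srm:.2f}")
```

A clearly positive d shows that the instrument separates the two groups in the expected direction, and a negative SRM here reflects falling pain scores as patients recover.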
Step 9: Translation and Cultural Adaptation
If the PRO tool will be used in multiple languages, it must undergo translation and cultural adaptation. This goes beyond word-for-word translation: each item must also be culturally relevant to the people answering it.
A question about “walking a mile” may not be relevant in a rural area where distances are measured differently. It might be adapted to “walking for 20 minutes” or something culturally equivalent.
Step 10: Define Scoring and Interpretation
The next step is to establish how responses are scored. Some instruments simply sum the item responses, while others use weighted scoring or convert scores into a 0 to 100 scale for easier interpretation.
The SF-36, a widely used health survey, uses scoring algorithms that translate raw answers into domain scores like physical functioning or emotional well-being.
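As a rough illustration of the simpler approach, the sketch below sums item responses and rescales the result linearly to 0 to 100. It is a generic example with hypothetical function names and data, and it is not the SF-36 scoring algorithm, which has its own item recoding and domain rules.

```python
def raw_score(responses):
    """Simple summed score across all items."""
    return sum(responses)

def to_0_100(raw, n_items, min_option=1, max_option=5):
    """Linearly rescale a summed score so the lowest possible total
    maps to 0 and the highest possible total maps to 100."""
    lowest = n_items * min_option
    highest = n_items * max_option
    return 100 * (raw - lowest) / (highest - lowest)

# Hypothetical answers to a 5-item scale with options 1 to 5.
responses = [2, 4, 3, 5, 1]
raw = raw_score(responses)
scaled = to_0_100(raw, n_items=len(responses))
print(raw, scaled)
```

Rescaling does not change how patients rank against each other; it only makes scores from scales of different lengths easier to compare.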
Researchers also define what counts as a meaningful change in score, often called the minimal clinically important difference. For instance, a 5-point drop in a fatigue score might indicate a clinically important improvement.
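Once a threshold has been chosen, applying it is straightforward. The sketch below reuses the 5-point figure from the fatigue example; the function name and the assumption that lower scores mean less fatigue are illustrative only.

```python
def classify_change(baseline, follow_up, threshold=5, lower_is_better=True):
    """Label a score change as improved, worsened, or not meaningful,
    given a predefined threshold for meaningful change."""
    change = follow_up - baseline
    if abs(change) < threshold:
        return "no meaningful change"
    improved = change < 0 if lower_is_better else change > 0
    return "improved" if improved else "worsened"

# Hypothetical fatigue scores, where lower means less fatigue.
print(classify_change(60, 54))  # drop of 6 points
print(classify_change(60, 57))  # drop of 3 points
print(classify_change(60, 66))  # rise of 6 points
```

The threshold itself has to come from validation work with patients and clinicians; the code only applies it.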
Step 11: Real-World Testing and Continuous Improvement
Once validated, the PRO instrument is implemented in clinical studies or real-world settings. Feedback from patients and researchers often leads to further refinement.
A PRO tool for migraine may initially focus only on pain intensity. After real-world use, feedback might suggest adding questions about cognitive symptoms like brain fog, leading to an updated version.
Examples of PRO Instruments in Use
There are many successful PRO instruments currently used across healthcare:
- PHQ-9 (Patient Health Questionnaire) – measures depression severity.
- EORTC QLQ-C30 – evaluates quality of life in cancer patients.
- PROMIS Fatigue Short Form – measures fatigue across different chronic conditions.
- SEAR Questionnaire – assesses sexual satisfaction and confidence in men with erectile dysfunction.
These instruments all went through similar development steps to ensure they are meaningful and scientifically valid.
Why Should You Care?
If you are studying health outcomes or planning to work in clinical research, understanding how PRO instruments are developed will help you critically evaluate the tools you use. A poorly designed questionnaire can lead to unreliable data and poor decision-making. By learning this process, you can ensure that your studies truly reflect what matters most to patients.