AlterLab Academic Skills: alterlab-survey-design
Part of the AlterLab Academic Skills suite for faculty and researchers. Comprehensive survey and instrument design assistant. Supports questionnaire construction, Likert scale design, question types (open/closed/matrix), response bias mitigation, sampling strategies (probability/non-probability), pilot testing, instrument validation (Cronbach's alpha, factor analysis), online survey tools (Qualtrics, REDCap, Google Forms), interview protocol development, focus group facilitation, mixed-mode surveys, and cultural adaptation of instruments. Triggers on: survey design, questionnaire, Likert scale, sampling strategy, pilot testing, instrument validation, Cronbach's alpha, factor analysis, interview protocol, focus group, Qualtrics, REDCap, survey bias, response rate, questionnaire construction, scale development.
```shell
# Clone the full suite
git clone https://github.com/AlterLab-IEU/AlterLab-Academic-Skills

# Or install only this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/AlterLab-IEU/AlterLab-Academic-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/research-tools/alterlab-survey-design" ~/.claude/skills/alterlab-ieu-alterlab-academic-skills-alterlab-survey-design && rm -rf "$T"
```
skills/research-tools/alterlab-survey-design/SKILL.md

Survey Design — Survey & Instrument Design Agent
A comprehensive survey and instrument design tool for faculty and researchers. Covers the full lifecycle of survey-based research: from construct definition and item writing through pilot testing, validation, deployment, and analysis of survey data.
Overview
Survey research is one of the most widely used methods across social sciences, health sciences, education, and business. Despite its apparent simplicity, designing a valid and reliable survey instrument requires systematic attention to construct definition, item wording, response format, sampling, bias mitigation, and psychometric validation.
This skill treats survey design as a scientific process, not an art. Every design decision should be justified and documented.
When to Use This Skill
This skill should be used when:
- Designing a new survey or questionnaire from scratch
- Adapting an existing instrument for a new population or context
- Writing Likert-scale items or other structured response formats
- Developing interview protocols or focus group guides
- Planning sampling strategies for survey research
- Conducting pilot tests and cognitive interviews
- Validating instruments (reliability and validity analysis)
- Selecting online survey platforms (Qualtrics, REDCap, Google Forms)
- Improving response rates and reducing bias
- Conducting cultural adaptation and translation of instruments
- Teaching research methods courses that include survey design
Does NOT Trigger
| Scenario | Use Instead |
|---|---|
| Qualitative data analysis (coding, themes) | |
| Statistical analysis beyond validation | Data science skills |
| Writing the research paper | |
| Ethics/IRB for survey research | |
Core Capabilities
1. Survey Design Process
The 10-Step Survey Design Framework:
```
Step 1:  Define research objectives and constructs
         What do you want to measure? What are your research questions?
    │
Step 2:  Review existing instruments
         Has someone already validated an instrument for your construct?
    │
Step 3:  Define the target population and sampling frame
         Who will you survey? How will you reach them?
    │
Step 4:  Choose survey mode
         Online, paper, phone, in-person, mixed-mode?
    │
Step 5:  Write items and response options
         Craft questions that are clear, unambiguous, and aligned to constructs
    │
Step 6:  Design survey structure and flow
         Organize sections, add skip logic, manage survey length
    │
Step 7:  Expert review
         Subject matter experts and methodologists evaluate the instrument
    │
Step 8:  Cognitive interviews and pilot testing
         Test with a small sample from the target population
    │
Step 9:  Psychometric validation
         Reliability analysis, factor analysis, validity assessment
    │
Step 10: Deploy, monitor, and analyze
         Launch survey, track response rates, clean and analyze data
```
2. Construct Definition and Operationalization
Before writing a single item, define what you are measuring.
Construct Mapping Template:
```markdown
## Construct Map

### Construct: [Name]

### Definition:
[Precise conceptual definition with citation]

### Dimensions/Facets:
1. [Dimension 1] — [Definition]
   - Indicators: [Observable behaviors or attitudes]
   - Example items: [Draft items]
2. [Dimension 2] — [Definition]
   - Indicators: [Observable behaviors or attitudes]
   - Example items: [Draft items]
3. [Dimension 3] — [Definition]
   - Indicators: [Observable behaviors or attitudes]
   - Example items: [Draft items]

### Related but Distinct Constructs:
- [Construct A] — How it differs from your construct
- [Construct B] — How it differs from your construct

### Nomological Network:
- Should correlate positively with: [Constructs]
- Should correlate negatively with: [Constructs]
- Should be unrelated to: [Constructs]
```
3. Item Writing
Question Types and When to Use Them
| Type | Format | Best For | Example |
|---|---|---|---|
| Closed-ended (single choice) | Radio buttons | Mutually exclusive categories | "What is your highest degree? ( ) Bachelor's ( ) Master's ( ) Doctoral" |
| Closed-ended (multiple choice) | Checkboxes | Non-mutually exclusive categories | "Which tools do you use? [ ] Qualtrics [ ] REDCap [ ] Google Forms" |
| Likert scale | Rating scale | Attitudes, perceptions, frequency | "I feel confident using statistics: Strongly Disagree 1 2 3 4 5 Strongly Agree" |
| Semantic differential | Bipolar scale | Evaluative judgments | "The training was: Useless ___:___:___:___:___ Useful" |
| Ranking | Drag-and-drop or numbered | Forced prioritization | "Rank these factors from most to least important: ___" |
| Matrix/Grid | Likert items in table | Multiple items with same response scale | [See matrix example below] |
| Open-ended | Text box | Exploratory, rich responses | "What challenges do you face in your research?" |
| Numeric | Number input | Precise quantities | "How many publications do you have? ___" |
| Visual analog scale (VAS) | Slider | Continuous measurement | "Rate your pain: No pain ————— Worst pain imaginable" |
Likert Scale Design
Number of Points:
| Points | Pros | Cons | Use When |
|---|---|---|---|
| 4-point | Forces a choice (no midpoint) | May frustrate genuinely neutral respondents | You want to avoid respondents clustering on a safe midpoint |
| 5-point | Most common; well-understood | Central tendency bias; midpoint ambiguity | Standard attitudinal measurement |
| 6-point | Forced choice with more granularity | Less familiar to respondents | You want to force direction with more options |
| 7-point | Greater discrimination; better for factor analysis | May exceed respondents' discriminative capacity | Established psychometric instruments; research contexts |
Likert Scale Labeling:
```
FULLY LABELED (recommended for clarity):
  Strongly Disagree | Disagree | Neutral | Agree | Strongly Agree

END-ANCHORED ONLY (acceptable for experienced respondents):
  Strongly Disagree | 2 | 3 | 4 | Strongly Agree

Common anchor sets:
  AGREEMENT:    Strongly Disagree → Strongly Agree
  FREQUENCY:    Never → Always
  IMPORTANCE:   Not at all Important → Extremely Important
  SATISFACTION: Very Dissatisfied → Very Satisfied
  LIKELIHOOD:   Very Unlikely → Very Likely
  QUALITY:      Very Poor → Excellent
```
Item Writing Rules
DO:
- Use simple, clear language (avoid jargon, acronyms, technical terms unless your population uses them)
- Ask about one thing per item (no double-barreled questions)
- Use specific time frames ("In the past 30 days..." not "Do you ever...")
- Match the response scale to the question stem
- Include both positively and negatively worded items (with caution — see pitfalls)
- Pilot test items with your target population
- Write 2-3x more items than you need (expect to cut during validation)
DO NOT:
- Use leading or loaded language ("Don't you agree that...")
- Use double negatives ("How much do you disagree with not implementing...")
- Assume knowledge ("Rate the effectiveness of the Delphi method" — respondent may not know it)
- Use absolutes ("always," "never," "all," "none") unless measuring frequency
- Create unnecessarily long items (aim for under 20 words per item)
- Use hypothetical scenarios when asking about actual behavior
Examples of Item Revisions:
```
POOR: "How satisfied are you with the quality and timeliness of feedback?"
      (Double-barreled: quality AND timeliness)
FIX:  Item 1: "How satisfied are you with the quality of feedback you receive?"
      Item 2: "How satisfied are you with the timeliness of feedback you receive?"

POOR: "Students should not be required to not attend classes."
      (Double negative)
FIX:  "Class attendance should be mandatory."

POOR: "Do you agree that the new policy is beneficial?"
      (Leading — assumes the policy is beneficial)
FIX:  "The new policy has been beneficial to my work."
      (Neutral stem; let the Likert scale capture agreement/disagreement)

POOR: "Rate your teaching effectiveness." (1-5)
      (Socially desirable response; no reference frame)
FIX:  "In the past semester, how often did you use student feedback to modify
      your teaching?" (Never / Rarely / Sometimes / Often / Always)
```
4. Survey Structure and Flow
Recommended Survey Organization:
```markdown
## Survey Structure Template

### Page 1: Welcome and Consent
- Study title, purpose, estimated time
- Consent checkbox (mandatory before proceeding)
- Contact information for questions

### Page 2: Screening Questions (if applicable)
- Eligibility criteria
- Route ineligible respondents to end-of-survey message

### Page 3-N: Main Content Sections
- Group by topic/construct
- Progress bar visible
- Section headers with brief context
- Start with engaging, easy questions
- Place sensitive questions in the middle (after rapport, before fatigue)
- Use skip logic to hide irrelevant questions

### Page N+1: Demographics
- Place at the END (reduces dropout from sensitive questions early)
- Include only demographics you will actually analyze
- Provide "Prefer not to answer" option for sensitive items

### Final Page: Thank You
- Thank participant
- Provide debriefing information
- Share contact info for results
- Remind of withdrawal procedure
```
Skip Logic Design:
```
Q1: Do you supervise graduate students?
    ( ) Yes → Show Q2-Q5 (supervision questions)
    ( ) No  → Skip to Q6

Q3: How many students do you currently supervise? [Number input]
    If Q3 > 5 → Show Q4 (workload management question)
    If Q3 ≤ 5 → Skip to Q5

Q10: Would you like to participate in a follow-up interview?
    ( ) Yes → Show Q11 (contact information)
    ( ) No  → Skip to end
```
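Skip logic is easier to verify before it is rebuilt inside a survey platform. The sketch below is an illustrative way to express routing rules as plain data so they can be unit-tested; the rule structure and `next_question` helper are assumptions for this example, not any platform's API.

```python
def next_question(current, answer, rules):
    """Return the next question ID for the current answer,
    or None to continue in sequential order."""
    for question, predicate, target in rules:
        if question == current and predicate(answer):
            return target
    return None

# Rules mirroring the skip-logic example above
rules = [
    ("Q1", lambda a: a == "Yes", "Q2"),   # supervisors see Q2-Q5
    ("Q1", lambda a: a == "No", "Q6"),    # non-supervisors skip ahead
    ("Q3", lambda a: a > 5, "Q4"),        # heavy supervisors get the workload item
    ("Q3", lambda a: a <= 5, "Q5"),
    ("Q10", lambda a: a == "Yes", "Q11"), # follow-up interview contact details
]

assert next_question("Q1", "No", rules) == "Q6"
assert next_question("Q3", 7, rules) == "Q4"
```

Writing the routing table this way makes it trivial to check during pilot testing that no answer combination leads to a dead end.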
5. Sampling Strategies
Probability Sampling (every member of population has a known, non-zero chance of selection):
| Method | How It Works | Pros | Cons |
|---|---|---|---|
| Simple random | Select randomly from complete list | Unbiased, generalizable | Requires complete sampling frame |
| Systematic | Select every kth element from list | Easy to implement | Periodicity risk if list has pattern |
| Stratified | Divide population into strata, then random sample within each | Ensures representation of subgroups | Requires knowledge of population characteristics |
| Cluster | Randomly select clusters (schools, hospitals), then sample within | Practical when no individual-level list exists | Higher sampling error than SRS |
| Multi-stage | Combine methods (e.g., cluster then stratified) | Flexible, practical for large populations | Complex to implement and analyze |
Non-Probability Sampling (no guarantee of representativeness):
| Method | How It Works | Pros | Cons |
|---|---|---|---|
| Convenience | Recruit whoever is available | Fast, cheap | Not generalizable; strong bias |
| Purposive | Select participants based on specific criteria | Targets relevant subgroups | Researcher bias in selection |
| Snowball | Existing participants recruit others | Access to hard-to-reach populations | Biased toward connected individuals |
| Quota | Set quotas for subgroups, then convenience sample within | Ensures diversity on key dimensions | Not truly random within quotas |
Sample Size Determination:
For descriptive surveys (estimating proportions):

```
n = (Z² × p × (1-p)) / E²

Where:
  Z = Z-score for confidence level (1.96 for 95%)
  p = Expected proportion (use 0.5 if unknown — most conservative)
  E = Margin of error (e.g., 0.05 for ±5%)

Example: 95% confidence, 5% margin of error, unknown proportion
  n = (1.96² × 0.5 × 0.5) / 0.05² = 384.16 → 385 respondents

Adjust for finite population:
  n_adj = n / (1 + (n-1)/N)     where N = population size

Adjust for expected response rate:
  n_needed = n_adj / expected_response_rate
  Example: 385 / 0.30 = 1,284 invitations needed for 30% response rate
```
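The formulas above can be collected into one small helper. This is a sketch under the stated formulas; the function name and parameters are illustrative.

```python
import math

def sample_size_proportion(confidence_z=1.96, p=0.5, margin=0.05,
                           population=None, response_rate=None):
    """Sample size for estimating a proportion.

    confidence_z: Z-score for the confidence level (1.96 for 95%)
    p: expected proportion (0.5 is most conservative)
    margin: desired margin of error (e.g., 0.05 for ±5%)
    population: if given, apply the finite population correction
    response_rate: if given, return invitations needed instead of respondents
    """
    n = (confidence_z ** 2 * p * (1 - p)) / margin ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)   # finite population correction
    n = math.ceil(n)
    if response_rate is not None:
        return math.ceil(n / response_rate)  # invitations needed
    return n

print(sample_size_proportion())                    # 385 respondents
print(sample_size_proportion(response_rate=0.30))  # 1284 invitations
```

The defaults reproduce the worked example in the text (385 respondents; 1,284 invitations at a 30% response rate).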
For comparative surveys (detecting differences between groups):
```python
# Power analysis for two-group comparison (normal approximation)
from scipy import stats
import numpy as np

def sample_size_two_groups(effect_size, alpha=0.05, power=0.80):
    """
    Calculate sample size per group for an independent-samples t-test.

    effect_size: Cohen's d (0.2 = small, 0.5 = medium, 0.8 = large)
    alpha: significance level (two-tailed)
    power: desired statistical power
    """
    z_alpha = stats.norm.ppf(1 - alpha / 2)
    z_beta = stats.norm.ppf(power)
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return int(np.ceil(n))

# Examples. Note: the normal approximation slightly undershoots exact t-test
# values; add 1-2 per group, or use statsmodels' TTestIndPower, to be safe.
print(f"Small effect (d=0.2): {sample_size_two_groups(0.2)} per group")   # 393
print(f"Medium effect (d=0.5): {sample_size_two_groups(0.5)} per group")  # 63
print(f"Large effect (d=0.8): {sample_size_two_groups(0.8)} per group")   # 25
```
6. Response Bias Mitigation
| Bias Type | Definition | Mitigation Strategies |
|---|---|---|
| Social desirability | Respondents answer in ways they believe are socially acceptable | Anonymous data collection; indirect questioning; validated social desirability scales (e.g., Marlowe-Crowne) |
| Acquiescence | Tendency to agree with statements regardless of content | Mix positively and negatively worded items; use forced-choice formats |
| Central tendency | Tendency to select middle response options | Use even-point scales (no midpoint); provide behavioral anchors |
| Extreme responding | Tendency to select extreme endpoints | Use more response options (7-point); provide clear anchor descriptions |
| Order effects | Earlier questions influence responses to later questions | Randomize item order within sections; counterbalance across respondents |
| Nonresponse bias | Systematic differences between responders and non-responders | Follow-up reminders; analyze early vs. late responders; compare demographics to population |
| Recall bias | Inaccurate recall of past events | Use shorter recall periods; provide memory aids; use event-specific prompts |
| Common method bias | Inflated correlations due to same measurement method | Use different measurement methods; temporal separation; marker variables |
7. Pilot Testing
Three-Phase Pilot Testing Protocol:
```markdown
## Phase 1: Expert Review (n = 3-5 experts)

### Content Validity
- Do items adequately cover the construct?
- Are any important facets missing?
- Are items relevant to the target population?
- Content Validity Index (CVI):
  - Rate each item: 1 = Not relevant, 2 = Somewhat relevant,
    3 = Quite relevant, 4 = Highly relevant
  - Item-CVI = proportion of experts rating 3 or 4 (threshold: ≥ 0.78)
  - Scale-CVI/Ave = mean of Item-CVIs (threshold: ≥ 0.90)

### Face Validity
- Do items appear to measure what they claim?
- Is the language clear and appropriate?
- Is the survey length reasonable?

---

## Phase 2: Cognitive Interviews (n = 5-10 from target population)

### Think-Aloud Protocol
"Please read each question out loud and tell me what you are thinking as you
decide on your answer."

### Probing Questions
- "What does [term] mean to you?"
- "How did you arrive at your answer?"
- "Was this question easy or difficult to answer? Why?"
- "Can you put this question in your own words?"
- "Is there anything confusing about this question?"
- "Would you change anything about this question?"

### Document
- Items that cause confusion or hesitation
- Items interpreted differently than intended
- Items where response options do not fit
- Suggested wording improvements
- Time to complete each section

---

## Phase 3: Quantitative Pilot (n = 30-50 from target population)

### Assess
- [ ] Completion rate and completion time
- [ ] Item-level missing data (flag items with >10% missing)
- [ ] Response distributions (flag items with >90% in one category)
- [ ] Internal consistency (Cronbach's alpha per subscale)
- [ ] Item-total correlations (flag items < 0.30)
- [ ] Inter-item correlations (flag pairs > 0.85 — redundancy)
- [ ] Open-ended feedback on survey experience
- [ ] Technical issues (display, skip logic, mobile compatibility)
```
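The Phase 3 thresholds can be checked mechanically. The following is a minimal sketch assuming pilot responses are in a pandas DataFrame (respondents × items, NaN = missing); the function name and thresholds mirror the checklist above.

```python
import pandas as pd

def flag_pilot_items(df, max_missing=0.10, max_mode_share=0.90,
                     min_item_total=0.30, max_inter_item=0.85):
    """Flag pilot items against the Phase 3 thresholds.
    Returns (per-item flags, redundant item pairs)."""
    flags = {}
    complete = df.dropna()  # complete cases for correlation-based checks
    for col in df.columns:
        issues = []
        if df[col].isna().mean() > max_missing:
            issues.append("missing >10%")
        if df[col].value_counts(normalize=True).iloc[0] > max_mode_share:
            issues.append(">90% in one category")
        rest = complete.drop(columns=col).sum(axis=1)
        if complete[col].corr(rest) < min_item_total:
            issues.append("item-total r < 0.30")
        if issues:
            flags[col] = issues
    # Inter-item redundancy: pairs correlating above the threshold
    corr = complete.corr()
    pairs = [(a, b) for a in corr.columns for b in corr.columns
             if a < b and corr.loc[a, b] > max_inter_item]
    return flags, pairs
```

Run this on the pilot data before revising items; anything it flags goes back through cognitive-interview review rather than straight into the final instrument.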
8. Instrument Validation
Reliability
Internal Consistency:
```python
import pandas as pd
import numpy as np

def cronbachs_alpha(df):
    """
    Calculate Cronbach's alpha for a set of items.
    df: DataFrame where each column is an item and each row is a respondent.
    """
    n_items = df.shape[1]
    item_variances = df.var(axis=0, ddof=1)
    total_variance = df.sum(axis=1).var(ddof=1)
    alpha = (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)
    return alpha

# Example
data = pd.DataFrame({
    'item1': [4, 3, 5, 4, 3, 5, 4, 3, 2, 4],
    'item2': [3, 3, 4, 4, 2, 5, 4, 3, 3, 4],
    'item3': [4, 4, 5, 3, 3, 4, 5, 2, 3, 5],
    'item4': [3, 2, 4, 4, 3, 5, 4, 3, 2, 3],
})
alpha = cronbachs_alpha(data)
print(f"Cronbach's alpha: {alpha:.3f}")

# Interpretation:
#   α ≥ 0.90         Excellent (but check for redundancy)
#   0.80 ≤ α < 0.90  Good
#   0.70 ≤ α < 0.80  Acceptable
#   0.60 ≤ α < 0.70  Questionable
#   α < 0.60         Poor — revise items
```
Item-Total Correlations:
```python
def item_total_correlations(df):
    """Calculate corrected item-total correlations."""
    results = {}
    for col in df.columns:
        rest = df.drop(columns=col).sum(axis=1)
        corr = df[col].corr(rest)
        results[col] = round(corr, 3)
    return results

itc = item_total_correlations(data)
for item, corr in itc.items():
    flag = " ← REVIEW" if corr < 0.30 else ""
    print(f"  {item}: r = {corr}{flag}")
```
Validity
| Type | Question | Method |
|---|---|---|
| Content validity | Do items cover the construct adequately? | Expert review, CVI calculation |
| Face validity | Do items appear to measure the construct? | Target population review |
| Construct validity | Does the instrument measure the theoretical construct? | Factor analysis (EFA/CFA) |
| Convergent validity | Does it correlate with similar measures? | Correlation with established instruments (r > 0.50) |
| Discriminant validity | Is it distinct from different constructs? | Low correlation with theoretically unrelated measures (r < 0.30) |
| Criterion validity (concurrent) | Does it correlate with a current criterion? | Correlation with gold standard measured simultaneously |
| Criterion validity (predictive) | Does it predict a future outcome? | Correlation with criterion measured later |
| Known-groups validity | Can it distinguish groups known to differ? | Compare scores between groups that should differ |
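Known-groups validity, the last row of the table, reduces to a straightforward group comparison. A sketch with hypothetical data (the expert/novice scores below are invented for illustration):

```python
# Known-groups validity: groups that theory says should differ must
# actually differ on the scale (e.g., experts vs. novices on confidence).
from scipy import stats
import numpy as np

experts = np.array([4.2, 4.5, 3.9, 4.8, 4.1, 4.4, 4.0, 4.6])  # hypothetical
novices = np.array([2.8, 3.1, 2.5, 3.4, 2.9, 3.0, 2.6, 3.2])  # hypothetical

t, p = stats.ttest_ind(experts, novices)
d = (experts.mean() - novices.mean()) / np.sqrt(
    (experts.var(ddof=1) + novices.var(ddof=1)) / 2)  # Cohen's d, pooled SD
print(f"t = {t:.2f}, p = {p:.4f}, d = {d:.2f}")
```

A significant difference in the theoretically expected direction, with a meaningful effect size, supports known-groups validity; a null result here is a red flag for the instrument.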
Exploratory Factor Analysis (EFA):
```python
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

# Check suitability for factor analysis
chi_square, p_value = calculate_bartlett_sphericity(data)
print(f"Bartlett's test: χ² = {chi_square:.2f}, p = {p_value:.4f}")
# p < 0.05 → suitable for factor analysis

kmo_all, kmo_model = calculate_kmo(data)
print(f"KMO: {kmo_model:.3f}")
# KMO > 0.60 → suitable; > 0.80 → good; > 0.90 → excellent

# Determine number of factors from eigenvalues
fa = FactorAnalyzer(rotation=None, n_factors=data.shape[1])
fa.fit(data)
eigenvalues, _ = fa.get_eigenvalues()
print("Eigenvalues:", [f"{ev:.3f}" for ev in eigenvalues])
# Retain factors with eigenvalue > 1 (Kaiser criterion)
# Also use a scree plot and parallel analysis before settling on a number

# Run EFA with the chosen number of factors
fa = FactorAnalyzer(n_factors=2, rotation='oblimin', method='ml')
fa.fit(data)

# Factor loadings
loadings = pd.DataFrame(
    fa.loadings_,
    index=data.columns,
    columns=[f'Factor {i+1}' for i in range(2)]
)
print("\nFactor Loadings:")
print(loadings.round(3))
# Items should load ≥ 0.40 on one factor and < 0.30 on others
```
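Parallel analysis, mentioned above as a factor-retention check, compares observed eigenvalues against those of random data of the same shape. A minimal NumPy sketch of one common variant (the function name and mean-eigenvalue criterion are illustrative; some implementations use the 95th percentile instead):

```python
import numpy as np

def parallel_analysis(data, n_iter=100, seed=0):
    """Mean eigenvalues of correlation matrices of random normal data
    with the same number of rows and columns as `data` (a 2-D array)."""
    rng = np.random.default_rng(seed)
    n, k = data.shape
    eigs = np.zeros(k)
    for _ in range(n_iter):
        random = rng.standard_normal((n, k))
        corr = np.corrcoef(random, rowvar=False)
        eigs += np.sort(np.linalg.eigvalsh(corr))[::-1]  # descending order
    return eigs / n_iter

# Retention rule: keep factor i while observed_eigenvalues[i] > random_means[i]
```

This guards against the Kaiser criterion's tendency to over-extract: with small samples, purely random data routinely produces eigenvalues above 1.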
9. Online Survey Platform Comparison
| Feature | Qualtrics | REDCap | Google Forms | LimeSurvey |
|---|---|---|---|---|
| Cost | Institutional license (expensive) | Free for institutions | Free | Free (open source) |
| Skip logic | Advanced | Advanced | Basic | Advanced |
| Randomization | Yes (items, blocks) | Limited | No | Yes |
| Piping | Yes | Yes | No | Yes |
| Offline data collection | Yes (app) | Yes (app) | No | Yes |
| HIPAA compliant | Yes (BAA available) | Yes (designed for it) | No | Self-hosted: yes |
| API access | Yes | Yes | Limited | Yes |
| Data export | CSV, SPSS, Excel | CSV, Excel, SPSS, SAS, R, Stata | CSV, Excel | CSV, Excel, SPSS, R |
| Multi-language | Yes | Yes | Manual | Yes |
| Panel integration | Yes (Prolific, MTurk) | No | No | Limited |
| Best for | Complex academic surveys | Clinical and health research | Simple surveys, course evaluations | Budget-conscious complex surveys |
10. Interview Protocol Development
Semi-Structured Interview Guide Template:
```markdown
## Interview Protocol

### Study: [Title]
### Interviewer: [Name]
### Participant ID: ___  Date: ___  Start Time: ___

---

### Opening (5 minutes)
- Thank participant for their time
- Review consent (confirm recording permission)
- Explain purpose: "I'm interested in understanding your experiences with [topic]"
- Explain format: "I have some questions prepared, but this is a conversation.
  There are no right or wrong answers. Please share as much or as little as
  you're comfortable with."

### Warm-Up Question (5 minutes)
1. "Can you tell me about your role and how you came to it?"
   - Probe: "How long have you been in this position?"

### Main Questions (30-40 minutes)

**Block A: [Topic/Construct 1]**
2. "Describe your experience with [topic]."
   - Probe: "Can you give me a specific example?"
   - Probe: "How did that make you feel?"
   - Probe: "What happened next?"
3. "What challenges have you encountered related to [topic]?"
   - Probe: "How did you handle that?"
   - Probe: "What support, if any, did you receive?"

**Block B: [Topic/Construct 2]**
4. "How has [topic] changed over time for you?"
   - Probe: "What prompted that change?"
   - Probe: "Looking back, what would you have done differently?"
5. "What factors have been most influential in shaping your [topic]?"
   - Probe: "Can you elaborate on [specific factor mentioned]?"

**Block C: [Topic/Construct 3]**
6. [Question]
   - Probes

### Closing (5 minutes)
7. "Is there anything else about [topic] that you think is important and that
   I haven't asked about?"
8. "Do you have any questions for me?"

### Post-Interview
- Thank participant; explain next steps and timeline
- Stop recording
- Write field notes immediately after:
  - Key impressions
  - Non-verbal observations
  - Reflections on the interview process
  - Emerging analytical ideas

End Time: ___  Total Duration: ___
```
11. Focus Group Facilitation
Focus Group Design Checklist:
```markdown
## Focus Group Planning

### Composition
- Participants per group: 6-10 (4-6 for complex topics)
- Number of groups: 3-5 per population segment (until saturation)
- Homogeneity within groups (shared experience/characteristic)
- Heterogeneity across groups (variation in perspectives)

### Roles
- Moderator: Facilitates discussion, manages dynamics
- Note-taker: Records non-verbal cues, group dynamics, key quotes
- Optional: Observer behind one-way glass or via video

### Environment
- Comfortable, neutral, private setting
- Circular or U-shaped seating (no head of table)
- Recording equipment tested before session
- Refreshments available
- Name tents (first names or pseudonyms)

### Facilitation Techniques
- Opening: Icebreaker or round-robin introduction
- Funnel approach: Broad → specific questions
- Manage dominant voices: "Let's hear from others..."
- Draw out quiet participants: "We haven't heard from everyone yet..."
- Handle conflict: "It sounds like there are different perspectives here,
  and that's valuable. Let's explore both views."
- Closing: "Of everything we've discussed, what stands out as most important?"
```
12. Cultural Adaptation of Instruments
Brislin's (1970) Back-Translation Method:
```
Original instrument (Source Language)
        │
        ▼
Forward translation by Translator A
(Source → Target language; bilingual with target as dominant)
        │
        ▼
Back-translation by Translator B
(Target → Source language; bilingual with source as dominant;
 has NOT seen the original instrument)
        │
        ▼
Compare original and back-translation
(Research team + both translators)
        │
        ├── Discrepancies? → Revise target version → Re-translate → Compare again
        │
        └── Equivalent? → Proceed to expert review
        │
        ▼
Cultural review panel
(Experts familiar with target culture)
        │
        ▼
Cognitive interviews in target population
        │
        ▼
Pilot test and psychometric validation in target population
```
ISPOR Guidelines for Cross-Cultural Adaptation:
- Preparation (permissions, concept elaboration)
- Forward translation (2 independent translators)
- Reconciliation of forward translations
- Back-translation
- Back-translation review
- Harmonization across language versions
- Cognitive debriefing with target population
- Review of cognitive debriefing results
- Proofreading
- Final report documenting all decisions
Best Practices
- **Start with constructs, not questions.** Define exactly what you are measuring before writing a single item. Each item should trace back to a specific construct or dimension.
- **Use existing validated instruments when possible.** Do not reinvent the wheel. Search the literature for instruments with established psychometric properties.
- **Pilot everything.** Every survey should go through cognitive interviews and a quantitative pilot before full deployment. There is no substitute for testing with your target population.
- **Keep it short.** Every additional item increases dropout risk. Include only items you will actually analyze. A good survey is as short as possible and as long as necessary.
- **Design for your weakest respondent.** Write at an appropriate reading level. Test on mobile devices. Consider accessibility (screen readers, color contrast). Provide translations if needed.
- **Randomize item order within sections.** This reduces order effects and helps detect careless responding.
- **Include attention checks.** Embed 1-2 instructed response items (e.g., "Please select 'Agree' for this item") to identify careless respondents.
- **Plan your analysis before collecting data.** Every question should have a purpose in your analysis plan. If you cannot say how you will analyze an item, remove it.
- **Document everything.** Keep a survey design log recording every decision: why items were added, removed, or revised; pilot test results; expert feedback.
- **Protect respondent data.** Use anonymous links when possible; store data securely; minimize collection of identifiers; comply with IRB requirements.
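Attention checks and completion-time screens from the practices above translate into a simple data-cleaning step. The column names and the 120-second floor below are placeholders; choose a threshold from your own pilot timing data.

```python
import pandas as pd

def screen_careless(df, check_col, expected, time_col, min_seconds=120):
    """Drop respondents who failed the instructed-response attention check
    or finished implausibly fast; remove the check item before analysis."""
    ok = (df[check_col] == expected) & (df[time_col] >= min_seconds)
    return df[ok].drop(columns=[check_col])
```

Report how many respondents were excluded and why; exclusion rules should be specified in the analysis plan before data collection, not improvised afterward.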
Common Pitfalls
| Pitfall | Why It Happens | How to Avoid |
|---|---|---|
| Double-barreled questions | Trying to be efficient; asking two things at once | Split into separate items; one concept per item |
| Leading questions | Researcher's hypothesis influences wording | Have a colleague blind to your hypothesis review items |
| Response options that do not match the stem | Copy-pasting from another survey | Ensure stem and response scale are grammatically and logically matched |
| Too many open-ended questions | Wanting rich data | Limit to 2-3 open-ended items; save depth for interviews |
| No pilot testing | Time pressure; overconfidence in item clarity | Always pilot — even a quick cognitive interview with 3-5 people helps |
| Ignoring mobile respondents | Designing on desktop | Test on multiple devices; avoid matrix questions on mobile (they break) |
| Low response rate | No follow-up plan; survey too long; no incentive | Pre-notify; send reminders (3-4 contacts); shorten survey; offer incentive |
| Neglecting psychometric validation | Assuming items are valid because they "look right" | Run reliability and factor analysis; report results in your paper |
| Convenience sampling reported as representative | Not understanding sampling limitations | Be honest about sampling method in limitations section |
| Cultural insensitivity | Assuming instruments transfer across cultures | Use formal adaptation procedures (back-translation, cognitive interviews) |
References
- DeVellis, R. F., & Thorpe, C. T. (2022). Scale development: Theory and applications (5th ed.). Sage.
- Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method (4th ed.). Wiley.
- Fowler, F. J. (2014). Survey research methods (5th ed.). Sage.
- Groves, R. M., Fowler, F. J., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2009). Survey methodology (2nd ed.). Wiley.
- Krosnick, J. A., & Presser, S. (2010). Question and questionnaire design. In P. V. Marsden & J. D. Wright (Eds.), Handbook of survey research (2nd ed., pp. 263-313). Emerald.
- Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research. Journal of Applied Psychology, 88(5), 879-903.
- Willis, G. B. (2005). Cognitive interviewing: A tool for improving questionnaire design. Sage.
See also:
references/survey-methodology.md for expanded methodology details.