From concept to validated instrument – we design scientifically rigorous assessments tailored to your specific organizational needs and target populations.
Adhering to best practices in psychometric test development proposed by Van Zyl & Ten Klooster (2022), we employ a rigorous, multi-phased methodology to design robust assessment measures.
This multifaceted approach ensures a balanced, comprehensive development process for each of our bespoke diagnostic models. Each phase is designed to build on its predecessors, ensuring a thorough understanding and accurate measurement of organizational health within your unique business environment. The methodology guides the development of a robust measurement model and validated diagnostic tools that provide genuine, meaningful insights.
The design process comprises six critical steps:
Conceptualization & Framework Design
Begin by defining the constructs to be measured through conceptual analysis, stakeholder engagement, and review of existing frameworks. This forms the foundation for building a behaviourally anchored measurement model.
Competency & Behaviour Mapping
Translate abstract values or constructs into measurable states, traits, behaviours, competencies or experiences using participatory action research, focus groups, and thematic analysis. Define observable indicators for each competency.
Test Specification & Item Development
Design the assessment structure: select item formats (e.g., Likert, scenario-based), define response scales, and draft items. Conduct expert reviews and cognitive interviews to refine content.
Pilot Testing
Administer the draft tool to a representative sample to evaluate factor structure, item performance, reliability, and usability. Use both classical and modern psychometric techniques (e.g., EFA, ESEM, Rasch analysis).
Validation
Perform full psychometric validation including:
Construct Validity (CFA, ESEM)
Criterion Validity (predictive, concurrent)
Measurement Invariance (across groups)
Reliability (internal consistency, inter-rater, parallel forms)
Multi-Method, Multi-Source Integration
Combine self, peer, and group data with qualitative feedback and objective business metrics to create a holistic understanding of the constructs.
Norming, Scoring & Reporting
Develop scoring algorithms, benchmarks, and reporting formats. Establish norms based on pilot and validation data. Create clear, stakeholder-ready feedback mechanisms.
Deployment & Continuous Improvement
Implement via secure web-based platforms with multilingual support. Monitor performance, update norms, and refine items based on ongoing data collection.
The purpose is to develop a bespoke measurement model centered on the unique states, traits and behaviours you would like to assess.
Here, we identify both the inputs needed to measure the factor and the links between the factor, core business outcomes, and stakeholder perceptions.
The purpose is to develop a bespoke predictive process model to identify the drivers (i.e. antecedents) of the factor you would like to measure and identify how this affects important business and people outcomes.
Understanding the drivers makes it possible to identify the specific areas that can be targeted to improve the factor, and it is also crucial for the validation phase.
Design the assessment structure and methodology: select item formats (e.g., Likert, scenario-based), define response scales, and draft items. Conduct expert reviews and cognitive interviews to refine content.
In this step, the structure and content of the assessment tool are designed, including item formats, rating scales, and data collection methods. The methodology is tailored to accurately capture both qualitative and quantitative feedback from a diverse range of stakeholders. It involves the following steps:
Define the Assessment Objectives
Clarify what the assessment aims to measure (e.g., behaviours, traits, perceptions), who the target population is, and how the results will be used for decision-making or development.
Select Item Types and Data Inputs
Decide on the most suitable question formats (e.g., Likert scales, semantic differentials, open-ended questions, scenario-based items), and identify both subjective (self/peer ratings) and objective (performance metrics) data sources.
Choose the Response Scales
Determine the appropriate response format (e.g., 5-point, 7-point) and anchor descriptions that best capture the nuance of the construct being assessed.
Determine the Number of Items
Use statistical tools like power analysis and Monte Carlo simulations to estimate how many items are required to ensure reliable and valid measurement.
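As a rough illustration of the Monte Carlo approach, the sketch below simulates single-factor data and estimates how expected Cronbach's alpha grows with the number of items; the assumed loading of 0.60, the sample size of 300, and the candidate item counts are illustrative assumptions rather than recommendations.

```python
# Minimal Monte Carlo sketch: how many items are needed to reach a target
# reliability, assuming a single factor with an average standardized loading
# of 0.60? All parameter values here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, n_items) matrix of scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def simulate_alpha(n_items: int, n_respondents: int = 300,
                   loading: float = 0.60, n_reps: int = 500) -> float:
    """Average alpha across replications for a one-factor model."""
    alphas = []
    error_sd = np.sqrt(1 - loading ** 2)          # standardized unique variance
    for _ in range(n_reps):
        theta = rng.normal(size=(n_respondents, 1))                  # latent trait
        errors = rng.normal(scale=error_sd, size=(n_respondents, n_items))
        items = loading * theta + errors                             # item responses
        alphas.append(cronbach_alpha(items))
    return float(np.mean(alphas))

for k in (4, 6, 8, 10, 12):
    print(f"{k:>2} items -> expected alpha ~ {simulate_alpha(k):.2f}")
```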
Define the Scoring Approach
Decide whether to use weighted or unweighted scoring, and whether composite or subscale scores are needed for interpretation and decision-making.
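The short example below contrasts unweighted and weighted composite scoring for two hypothetical subscales; the item-to-subscale mapping and the 0.7/0.3 weights are placeholders chosen only to show the mechanics.

```python
# Illustrative contrast between unweighted and weighted composite scoring for
# two hypothetical subscales; weights and item groupings are placeholders,
# not part of any specific instrument.
import pandas as pd

responses = pd.DataFrame({
    "item1": [4, 5, 3], "item2": [4, 4, 2],   # subscale A items
    "item3": [2, 5, 4], "item4": [3, 4, 5],   # subscale B items
})

subscales = {"A": ["item1", "item2"], "B": ["item3", "item4"]}
weights = {"A": 0.7, "B": 0.3}                # assumed relative importance

scores = pd.DataFrame({
    name: responses[items].mean(axis=1) for name, items in subscales.items()
})
scores["composite_unweighted"] = scores[["A", "B"]].mean(axis=1)
scores["composite_weighted"] = sum(weights[s] * scores[s] for s in weights)
print(scores.round(2))
```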
Select Psychometric Models
Identify the most appropriate psychometric evaluation models, such as Classical Test Theory, Item Response Theory, or Rasch modelling, depending on the data structure and test goals.
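To make the contrast with Classical Test Theory concrete, the following minimal sketch implements the Rasch model's item response function, in which the probability of endorsing an item depends only on the gap between person ability and item difficulty; the difficulty values used are arbitrary.

```python
# Minimal Rasch (one-parameter IRT) sketch: P(endorse) depends only on the
# difference between person ability (theta) and item difficulty (b).
# The difficulty values below are arbitrary illustrative choices.
import numpy as np

def rasch_probability(theta: float, difficulty: np.ndarray) -> np.ndarray:
    """P(endorse) = 1 / (1 + exp(-(theta - b))) for each item."""
    return 1.0 / (1.0 + np.exp(-(theta - difficulty)))

difficulties = np.array([-1.0, 0.0, 1.0, 2.0])    # easy -> hard items
for theta in (-1.0, 0.0, 1.0):
    probs = rasch_probability(theta, difficulties)
    print(f"theta={theta:+.1f}:", np.round(probs, 2))
```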
Integrate Qualitative Analysis Tools
For open-ended data, apply natural language processing (NLP) tools—like transformer-based models—to analyse sentiment, themes, and respondent narratives at scale.
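A hedged sketch of this kind of analysis, using the Hugging Face transformers library's default sentiment pipeline on two invented survey comments; a production deployment would typically rely on a domain-adapted or multilingual model and would feed the results into the broader thematic analysis.

```python
# Sketch of transformer-based sentiment scoring for open-ended survey comments
# using the Hugging Face `transformers` pipeline with its default English
# sentiment model; the comments are invented examples.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")    # downloads a default model on first run

comments = [
    "Leadership communicates the vision clearly and I feel supported.",
    "Workload has become unmanageable and feedback is rare.",
]
for comment, result in zip(comments, sentiment(comments)):
    print(f"{result['label']:<9} ({result['score']:.2f})  {comment}")
```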
Design the Administration Method
Choose the mode of delivery (e.g., online, mobile-friendly), ensure accessibility across user groups, and design the system for real-time data collection and processing.
The next step is to pilot the draft assessment tool with a small, representative sample of internal stakeholders to explore its factorial structure, evaluate its effectiveness, and refine the item pool. This phase also assesses the credibility and transferability of qualitative data collected through open-ended questions. Based on pilot feedback and empirical findings, the instrument will be optimized for clarity, reliability, and validity.
The process typically includes:
Administering the draft assessment electronically to a random sample of approximately 300 participants to evaluate factor structure, item relevance, and usability.
Collecting 360° feedback by having each participant rated by multiple peers and their direct supervisor, including measures of key drivers and outcomes.
Conducting exploratory factor analysis (e.g., ML-EFA) and Rasch modelling to examine item performance, remove poorly functioning items, and streamline the item set (a brief code sketch of this step follows the list).
Assessing measurement quality through indicators such as factor loadings, internal consistency (e.g., McDonald’s Omega), and reliability metrics (e.g., Cronbach’s Alpha).
Applying natural language processing to analyse qualitative responses, enhancing understanding through thematic and sentiment analysis.
Evaluating metadata (e.g., response patterns, item difficulty) and collecting user feedback to assess usability and respondent burden.
Finalizing the item pool and developing a scoring algorithm that supports reporting at the individual, team, organizational, and group levels.
Integrating data from self-ratings, peer feedback, open-ended responses, and objective indicators into a comprehensive scoring model for holistic measurement.
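For illustration, the sketch below runs the exploratory factor analysis and internal-consistency checks mentioned above on simulated pilot data, using the factor_analyzer and pingouin packages; the two-factor structure, the loading of 0.7, and the sample of 300 are assumptions made for the example only.

```python
# Pilot-phase sketch: exploratory factor analysis plus internal-consistency
# estimates on simulated pilot data. The two-factor structure and all
# data-generating values are illustrative assumptions.
import numpy as np
import pandas as pd
import pingouin as pg
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(1)
n = 300                                            # pilot sample size
factor_a, factor_b = rng.normal(size=(2, n))       # two simulated latent factors

items = {f"a_item{i}": 0.7 * factor_a + rng.normal(scale=0.7, size=n) for i in range(1, 5)}
items.update({f"b_item{i}": 0.7 * factor_b + rng.normal(scale=0.7, size=n) for i in range(1, 5)})
data = pd.DataFrame(items)

# Maximum-likelihood EFA with an oblique rotation, as described in the step above
efa = FactorAnalyzer(n_factors=2, rotation="oblimin", method="ml")
efa.fit(data)
print(pd.DataFrame(efa.loadings_, index=data.columns).round(2))

# Internal consistency per intended subscale (McDonald's Omega would normally be
# derived from the factor solution; Cronbach's alpha is shown here for brevity)
for prefix in ("a_", "b_"):
    subscale = data.filter(like=prefix)
    alpha, _ = pg.cronbach_alpha(data=subscale)
    print(f"subscale {prefix}: Cronbach's alpha = {alpha:.2f}")
```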
Conducting extensive psychometric evaluation in order to establish validity, reliability, and overall psychometric quality of the assessment tool
This step involves rigorous statistical analysis to evaluate the reliability, validity, and overall psychometric quality of the assessment tool. The goal is to confirm that the instrument accurately measures the intended constructs and can be applied consistently across diverse populations and contexts. Specifically, this phase focuses on assessing psychometric properties, reliability, measurement invariance, and both concurrent and predictive validity.
The validation process typically includes:
(a) Administering the final instrument electronically to a large, representative internal sample (e.g., N=600), using a cross-sectional, online survey-based research design.
(b) Identifying key antecedents and outcomes from the theoretical model and constructing an additional survey to assess these variables for criterion validation.
Quantitative data analysis involves:
(i) Confirmatory factor analysis (CFA) within a structural equation modelling (SEM) framework, comparing traditional CFA with exploratory SEM (ESEM); a brief code sketch follows this list. Measurement quality is assessed using indices such as CFI, TLI, RMSEA, SRMR, factor loadings, and reliability metrics like McDonald's Omega. Poorly performing items are removed.
(ii) Testing for measurement invariance across demographic and organizational subgroups (e.g., age, gender, department) by comparing increasingly restrictive models (configural, metric, scalar, strict, latent means). Model changes are assessed using ΔRMSEA, ΔSRMR, and ΔCFI.
(iii) Estimating separate structural (path) models to evaluate relationships between the assessment results and business-relevant outcomes, with path coefficients and model fit indices reported.
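A minimal sketch of the confirmatory step, assuming the semopy package and a simulated two-factor dataset; the construct names (Engagement, Exhaustion), item names, and data-generating values are illustrative only, and in practice the model syntax would mirror the validated measurement model.

```python
# Minimal CFA sketch with the semopy package on simulated data: specify a
# two-factor measurement model, fit it, and inspect global fit indices and
# standardized estimates. All names and values are illustrative assumptions.
import numpy as np
import pandas as pd
import semopy

rng = np.random.default_rng(7)
n = 600                                            # validation sample size
lat_eng, lat_exh = rng.normal(size=(2, n))         # two simulated latent variables

cols = {f"eng{i}": 0.75 * lat_eng + rng.normal(scale=0.66, size=n) for i in (1, 2, 3)}
cols.update({f"exh{i}": 0.75 * lat_exh + rng.normal(scale=0.66, size=n) for i in (1, 2, 3)})
data = pd.DataFrame(cols)

model_desc = """
Engagement =~ eng1 + eng2 + eng3
Exhaustion =~ exh1 + exh2 + exh3
"""
model = semopy.Model(model_desc)
model.fit(data)

fit = semopy.calc_stats(model)                     # DataFrame of global fit indices
print(fit[["CFI", "TLI", "RMSEA"]].round(3))
print(model.inspect(std_est=True).head(8))         # loadings and other estimates
```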
Qualitative data analysis includes:
(i) Using topic modeling (e.g., LDA) and NLP models (e.g., BERT) to extract and contextualize themes from open-ended responses (a brief sketch follows this list).
(ii) Training supervised learning classifiers to categorize qualitative responses into relevant domains.
(iii) Conducting sentiment analysis to gauge emotional tone and stakeholder sentiment using models trained on labelled text data.
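As a small illustration of the topic-modelling step, the sketch below fits scikit-learn's LatentDirichletAllocation to a handful of invented open-ended responses; real analyses would involve far larger corpora, careful preprocessing, and possibly transformer-based embeddings.

```python
# Minimal topic-modelling sketch with scikit-learn's LatentDirichletAllocation
# on a few placeholder open-ended responses; the texts and the choice of two
# topics are illustrative assumptions.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

responses = [
    "My manager gives clear feedback and supports my development",
    "Too many meetings leave little time for focused work",
    "Great team spirit, colleagues help each other willingly",
    "Workload and deadlines cause constant stress",
    "Training opportunities helped me grow in my role",
    "Unclear priorities make it hard to plan my week",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(responses)           # document-term matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(dtm)

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-5:][::-1]]
    print(f"Topic {k}: {', '.join(top)}")
```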
Finally, the tool is compiled and finalized, and a recommended methodology for future assessments is outlined, ensuring the solution is both scientifically robust and practically actionable.
The final stage of the assessment development process involves creating a comprehensive norming, scoring, and reporting system
This step consolidates data from both the assessment instrument and relevant objective indicators to produce a standardized framework for score calculation, interpretation, and communication. The aim is to deliver meaningful insights at both the individual and organizational levels, while enabling ongoing benchmarking and progress tracking.
The process typically includes:
Developing a scoring algorithm that integrates data from prior phases, applying appropriate weights to competencies, behaviors, and objective indicators based on their relevance to key performance outcomes.
Establishing clear benchmarks and thresholds to categorize results (e.g., high, moderate, or low alignment) and guide interpretation (a norming sketch follows this list).
Designing a reporting system that (1) describes the current state of measured constructs, (2) diagnoses the contributing factors to the current scores, (3) forecasts potential changes based on predictive modeling, and (4) recommends targeted interventions to improve future outcomes. Reports are tailored to various stakeholders, from individual feedback to aggregated organizational summaries.
Including visual reporting elements such as dashboards, heatmaps, and trend charts to enhance accessibility and engagement with the results.
Creating user documentation and training resources to ensure consistent, reliable use of the tool across contexts, and equipping key personnel to administer the assessment and interpret findings effectively.
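A minimal norming sketch, assuming percentile-rank norms derived from a pilot or validation sample; the norm-group parameters and the 33rd/67th-percentile cut-offs are illustrative choices, not recommended standards.

```python
# Hedged norming sketch: convert raw composite scores to percentile ranks
# against a norm sample and band them into interpretive categories.
# Norm-group parameters and cut-offs are illustrative, not prescriptive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
norm_sample = rng.normal(loc=3.6, scale=0.5, size=600)   # norm-group composites
new_scores = np.array([2.9, 3.6, 4.4])                   # scores to interpret

percentiles = [stats.percentileofscore(norm_sample, s) for s in new_scores]

def band(pct: float) -> str:
    """Map a percentile rank to an illustrative interpretation band."""
    if pct < 33:
        return "low alignment"
    if pct < 67:
        return "moderate alignment"
    return "high alignment"

for score, pct in zip(new_scores, percentiles):
    print(f"score {score:.1f} -> {pct:5.1f}th percentile -> {band(pct)}")
```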
Our validation protocols not only meet but exceed professional standards (APA, SIOP, ITC), ensuring your assessment delivers reliable, meaningful results.
Confirmatory factor analysis, convergent/discriminant validity testing, and multi-trait multi-method approaches to verify we're measuring intended constructs.
Rigorous testing against relevant outcome measures to demonstrate predictive power for your specific use cases.
Measurement invariance testing across demographic groups to ensure fairness and comparable score interpretation.
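To show what the invariance decision rule can look like in practice, the sketch below compares fit indices across increasingly constrained multi-group models using commonly cited cut-offs (change in CFI no larger than .01, change in RMSEA no larger than .015); the fit values are placeholders rather than results from any real analysis.

```python
# Illustrative decision-rule sketch for measurement invariance: compare fit
# indices of increasingly constrained multi-group models and flag steps where
# the change exceeds commonly cited cut-offs. Fit values are placeholders.
fits = {
    "configural": {"CFI": 0.962, "RMSEA": 0.045},
    "metric":     {"CFI": 0.958, "RMSEA": 0.047},
    "scalar":     {"CFI": 0.944, "RMSEA": 0.061},
}

steps = list(fits)
for prev, curr in zip(steps, steps[1:]):
    d_cfi = fits[curr]["CFI"] - fits[prev]["CFI"]
    d_rmsea = fits[curr]["RMSEA"] - fits[prev]["RMSEA"]
    holds = abs(d_cfi) <= 0.01 and d_rmsea <= 0.015
    verdict = "supported" if holds else "not supported"
    print(f"{prev} -> {curr}: dCFI={d_cfi:+.3f}, dRMSEA={d_rmsea:+.3f} ({verdict})")
```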
Benefit from our ongoing research and regular updates to keep your assessments at the cutting edge.
Trait-based, state-based, and dynamic personality measures for selection, development, and clinical applications.
Verbal, numerical, spatial, and abstract reasoning assessments with culture-fair design principles.
Multimedia SJTs measuring job-specific competencies with realistic work scenarios.
360° Feedback Instruments: Multi-rater assessments with sophisticated norming and rater agreement analytics.
Predictive algorithms that go beyond traditional scoring approaches.
Automated text analysis for open-ended responses and qualitative data coding.
Precision measurement through adaptive testing and computerized adaptive questionnaires.
Advanced structural equation modelling for CFA, EFA, ESEM, and Measurement Invariance.
Our team of PhD-level psychometricians and data scientists will guide you through every step of creating a scientifically validated instrument.