Reliability and Validity


The National Spanish Examinations (NSE) defines test reliability as the degree to which a test gives consistent results each time it is administered.

Reliability answers the following questions:

  1. Can I depend on the test to measure the same outcomes consistently?
  2. All other variables being equal, will the test produce the same results again?

Please view the reliability coefficients for the following exam years:

2022 Reliability Coefficients

2021 Reliability Coefficients

2020 Reliability Coefficients

2019 Reliability Coefficients

2018 Reliability Coefficients

2017 Reliability Coefficients

2016 Reliability Coefficients

2015 Reliability Coefficients

2014 Reliability Coefficients

2013 Reliability Coefficients

2012 Reliability Coefficients

2011 Reliability Coefficients

2010 Reliability Coefficients
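Reliability coefficients of this kind are commonly internal-consistency statistics such as the Kuder-Richardson Formula 20 (KR-20) for dichotomously scored items. The page does not state which formula NSE reports, so the following Python sketch is purely illustrative, using made-up data.

```python
# Illustrative KR-20 internal-consistency coefficient for dichotomous (0/1)
# items. The NSE's actual reliability formula is not specified here; this is
# a generic sketch on made-up data.

def kr20(scores):
    """scores: one row per examinee, each row a list of 0/1 item scores."""
    n = len(scores)            # number of examinees
    k = len(scores[0])         # number of items
    # Sum of p*q (proportion correct * proportion incorrect) over items
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n
        pq_sum += p * (1 - p)
    # Population variance of examinees' total scores
    totals = [sum(row) for row in scores]
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - pq_sum / var)

# Hypothetical responses: 5 examinees x 4 items
data = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
print(round(kr20(data), 3))  # 0.519
```

Values closer to 1.0 indicate that the items hang together consistently; values near 0 indicate inconsistent measurement.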



NSE recognizes three types of evidence that can be collected to ensure test validity:

  1. Criterion-related evidence: evidence demonstrating the systematic relationship of scores on a predictor test to a predicted criterion variable.
  2. Construct-related evidence: empirical evidence that (1) supports the posited existence of a hypothetical construct and (2) indicates that an assessment device does, in fact, measure that construct.
  3. Content-related evidence: evidence indicating that an assessment instrument suitably reflects the content domain it is supposed to represent.

The type of evidence to collect depends on the type of test.  Because criterion-related evidence of validity applies primarily to aptitude tests, the NSE does not use this type of evidence in assessing the validity of items on the National Spanish Examination.  Similarly, the NSE does not collect construct-related evidence, because no hypothetical constructs are assessed on the NSE.

Content-related evidence is the most important in assessing validity for the National Spanish Examination, because the achievement section of the exam is designed from specific learner outcomes based on content standards that specify vocabulary and grammar.  Content-related evidence of validity is ensured through (1) development care and (2) external reviews.  In addition, items in the proficiency section (Interpretive Reading and Interpretive Listening) are validated against a large number of samples, each placed into the applicable level.

Development care

The National Spanish Exam Writing Committee follows a set of test development procedures designed to ensure that the questions on each level of the exam accurately assess the learner outcomes for that level.  The principal objective is to match the content of the test with the assessment domain through specific procedures during the test development process.

  1. Team reviews standards.
  2. Team reviews benchmarks under each standard.
  3. Team reviews learner outcomes under each benchmark by level.
  4. One team member writes question(s).
  5. Other team members review and discuss question(s).
  6. Director of Test Development confirms relationship between question(s) and learner outcome.

External reviews

Content validity is usually assessed by expert judgment rather than statistical analysis.

After completion of the test draft, the National Spanish Examination Review Committee rates the content appropriateness of a given test by performing a systematic review of it.  Because this review relies heavily on human judgment, the teachers chosen for the committee are experts in standards-based curriculum, instruction, and assessment.
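The page does not describe how the Review Committee's ratings are aggregated. One standard way to summarize such expert judgments is Lawshe's content validity ratio (CVR); the sketch below uses a hypothetical panel and implies nothing about NSE's actual procedure.

```python
# Illustrative content validity ratio (Lawshe's CVR) for summarizing expert
# panel judgments of an item. Nothing here implies NSE uses this statistic;
# it is a standard, generic example.

def cvr(essential_count, panel_size):
    """CVR = (n_e - N/2) / (N/2): -1 if no panelist rates the item
    "essential", +1 if every panelist does."""
    half = panel_size / 2
    return (essential_count - half) / half

# Hypothetical item rated "essential" by 9 of 10 reviewers
print(cvr(9, 10))   # 0.8
```

Items with low or negative ratios would typically be flagged for revision or removal before the test is finalized.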

After the tests have been administered, the National Spanish Exam analyzes question detail reports (QDRs), which show how a specific sample of students performed on a specific item.  A validity panel of trained professionals who have taught across all levels uses the QDR to confirm that all test items have face and content validity.
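The contents of a question details report are not specified here, but item-level statistics of this kind typically include item difficulty (proportion correct) and a discrimination index such as the point-biserial correlation. The sketch below illustrates both on made-up scores.

```python
# Illustrative item-analysis statistics of the kind an item report might
# summarize: difficulty (proportion correct) and point-biserial
# discrimination. Generic sketch on made-up data, not NSE's actual report.

def item_difficulty(item_scores):
    """Proportion of examinees answering the item correctly (0/1 scores)."""
    return sum(item_scores) / len(item_scores)

def point_biserial(item_scores, total_scores):
    """Pearson correlation between 0/1 item scores and total test scores."""
    n = len(item_scores)
    mi = sum(item_scores) / n
    mt = sum(total_scores) / n
    cov = sum((a - mi) * (b - mt)
              for a, b in zip(item_scores, total_scores)) / n
    sd_i = (sum((a - mi) ** 2 for a in item_scores) / n) ** 0.5
    sd_t = (sum((b - mt) ** 2 for b in total_scores) / n) ** 0.5
    return cov / (sd_i * sd_t)

# Hypothetical data: one item's 0/1 scores and the same examinees' totals
item = [1, 1, 0, 1, 0]
totals = [3, 3, 1, 4, 1]
print(item_difficulty(item))                   # 0.6
print(round(point_biserial(item, totals), 3))  # 0.953
```

A high discrimination value means students with higher total scores tend to answer the item correctly, which is one signal a review panel can use when confirming item quality.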