
Methodology and Quality Report for Education and Training Statistics Publication 2025

 

Methodology and Quality Update

Latest Update on Methodology and Quality

2026/01/07

 

Statistical Presentation

Data description

The Education and Training Statistics Publication provides data on the participation in education of the population aged 5–19 years in the Kingdom of Saudi Arabia.
The publication is based on a survey that collects data on the following basic characteristics:
•    School age.
•    Parental participation.
•    Household-related information.
•    School-related information.
The data are also used to produce estimates of:
•    Participation in education and training from 5-19 years old.
•    Participation of parents in school activities.
•    Teaching and school quality.
•    Data relating to the school or university.
•    Bullying and truancy.
•    Home learning environment.
•    Free time away from school.
•    Time spent on doing homework.
•    Participating in activities outside of school.
•    School vision.

 

Classifications

The following classifications are applied in the Education and Training Statistics publication.
The National Classification for Economic Activities (ISIC4):
It is a statistical classification based on the International Standard Industrial Classification of All Economic Activities (ISIC4), used to describe the productive activities of an establishment.
Saudi Standard Classification of Occupations (ISCO_08):
A statistical classification based on the International Standard Classification of Occupations (ISCO_08) that provides a system for the classification and compilation of occupational information obtained through censuses, statistical surveys, and administrative records.
This classification is used in the Education and Training Survey for classifying employed persons by occupation.
Saudi Standard Classification of Educational Levels and Fields (SASCED-20):
A statistical classification based on the International Standard Classification of Education (ISCED 2011) for educational levels and ISCED-F 2013 for fields of education and training, issued by the United Nations Educational, Scientific, and Cultural Organization (UNESCO). It serves as the reference classification for organizing educational programmes and related qualifications according to their educational levels and fields. The classification comprehensively covers all educational programmes, levels, and modes of education, and spans all stages of education from early childhood education through to postgraduate levels.
This classification is used in the Education and Training Survey to classify individuals aged 15 years and above according to their fields of study and educational levels.
National Code of Countries and Nationalities (ISO 3166 Country codes):
A statistical classification based on the international standard ISO 3166 (Country codes), issued by the International Organization for Standardization (ISO). It assigns numeric and alphabetic codes to the world’s (248) countries.
This classification is used in the Education and Training Survey to classify individuals into Saudi and non-Saudi categories.
Metadata is collected through interviews so that outputs can be produced in accordance with all relevant classifications.
The classifications are available on the GASTAT’s website: https://www.stats.gov.sa/classifications 

 

Statistical concepts and definitions

Terms and concepts of the Education and Training Survey publication:
•    School:
It is an institution where students receive education from teachers.
•    Education:
It refers to formal education and training, known as institutionalized and structured education provided by public institutions and recognized private entities. These collectively constitute the formal education system of the state, and the educational programs within this category are recognized as such by the relevant national education authorities or their equivalents. An example of this is education provided by school systems, colleges, universities, and other official educational institutions that typically offer systematic and continuous education on a full-time basis.
•    Academic achievement:
The highest educational level an individual has successfully completed, usually certified by a recognized qualification.
•    Bullying:
It refers to repeated aggressive behavior aimed at harming or intimidating another person. This behavior can be physical, verbal, or psychological.
•    Truancy:
It refers to staying away from school without permission. It is a type of absenteeism that can be occasional or habitual. Truancy is often seen as a form of rebellion or avoidance, where students skip school to avoid an unpleasant situation they are facing.
•    Parental participation in understanding the educational pathway of children:
Refers to the participation or awareness of parents or guardians regarding school activities, attendance at meetings, and assistance with school subjects.
•    Home learning environment:
Refers to the set of material and technological resources and facilities provided by the household within the home to create a supportive and motivating environment for learning, and to facilitate the acquisition of knowledge and skills either independently or with the support of parents or guardians.
•    Time spent using digital devices:
Used to assess the amount of time an individual spends using digital devices, referring to the total time spent by the individual on any digital devices each day.

 

Data sources

The main source of data for the Education and Training Statistics Publication is the Education and Training Survey 2025.
The main variables disseminated in the Education and Training Statistics Publication are:
•    Sex.
•    Nationality.
•    Age groups.

 

Designing the data collection tool

Data were collected using a questionnaire prepared and designed by specialists at the General Authority for Statistics. International recommendations, standards, and definitions were taken into account in its design, and it was also reviewed by relevant entities to obtain their views and comments. The questions were formulated in a clear scientific manner to standardize the wording and administration of questions. 
The questionnaire includes several sections, including:
•    Household residents.
•    Identification questions.
•    Basic personal characteristics and demographics.
•    Disability.
•    Languages.
•    School enrolment.
•    School-related information.
•    School attendance.
•    Personal standards on academic effort.
•    Teaching and school quality.
•    Bullying and school avoidance.
•    Non-enrolment in educational programmes.
•    Highest educational qualification.
•    Literacy.
•    Living conditions.
•    Home learning environment.
•    Free time away from school.
•    Time spent on doing homework.
•    Out-of-school activities.
•    Ambitions and life goals.
•    Economic activities.
Method of calculating the indicators:

Indicator and calculation formula:
Completion rate = (Number of individuals aged 3–5 years above the official age of the final grade who completed that grade ÷ Total number of individuals in the same age group) × 100
Percentage of students who have experienced bullying during the past 12 months = (Number of students who reported experiencing bullying ÷ Total number of students participating in the survey) × 100
Enjoying school = (Number of students’ responses by level of agreement (Strongly disagree, Disagree, Agree, Strongly agree) ÷ Total number of students’ responses) × 100
Percentage of parents’ participation in school activities = (Number of participating parents ÷ Total number of students) × 100
Time spent on doing homework daily = (Time spent on doing homework ÷ Total number of students’ responses) × 100
Students’ attitudes toward teaching = (Number of students’ responses on teaching methods based on the relevant questions ÷ Total number of students’ responses) × 100
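Most of the indicators above share the same pattern: a count divided by a base count, multiplied by 100. A minimal Python sketch of that pattern is shown here for illustration. The function name and the counts are hypothetical and are not survey results, and the published estimates would normally be computed from weighted rather than raw counts.

```python
def percentage(numerator: float, denominator: float) -> float:
    """Return (numerator / denominator) * 100, the pattern shared by the indicators."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator * 100


# Hypothetical counts, for illustration only (not survey results).
students_surveyed = 2000
students_reporting_bullying = 150

bullying_rate = percentage(students_reporting_bullying, students_surveyed)
print(f"Bullying in the past 12 months: {bullying_rate:.1f}%")
```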


Review and Correction Rules:
Audit and control rules have been established in the form to ensure that the data collected is consistent, accurate, and logical. These rules were designed by establishing logical relationships between responses, questions, and different variables to help the field researcher detect any errors directly during data entry.
To ensure the quality of the Education and Training Survey data, four types of review and correction rules were established, as follows:
•    Automated adjustment rules:
These rules automatically calculate certain fields or adjust responses in specific fields in line with certain questionnaire items. There are (45) such rules.
•    Navigation rules between sections and fields:
Special rules were programmed to regulate automatic navigation between sections and fields based on the respondent’s answers, totaling 27 rules.
•    Error rules:
These are rules that cannot be bypassed during the data entry process. The field researcher must correct the data by referring back to the respondent to verify its accuracy. The total number of these rules exceeds 18.
•    Alert rules (warnings):
These rules are designed to verify the correctness of the data entered by the researcher. The field researcher may override them if the data accuracy is confirmed, with a total of approximately 0 rules.
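The difference between error rules (which block data entry) and alert rules (which the field researcher may override) can be sketched as follows. This is only an illustration under stated assumptions: the `Rule` class, the two example rules, and the field names `age` and `homework_hours` are hypothetical and are not taken from the actual questionnaire.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes
    blocking: bool                 # True = error rule, False = alert rule


def validate(record: dict, rules: list[Rule]) -> tuple[list[str], list[str]]:
    """Return (errors, warnings): names of failed blocking and non-blocking rules."""
    errors = [r.name for r in rules if r.blocking and not r.check(record)]
    warnings = [r.name for r in rules if not r.blocking and not r.check(record)]
    return errors, warnings


# Hypothetical rules, for illustration only.
RULES = [
    Rule("age_in_target_range", lambda r: 5 <= r["age"] <= 19, blocking=True),
    Rule("homework_hours_plausible", lambda r: r["homework_hours"] <= 8, blocking=False),
]

errors, warnings = validate({"age": 21, "homework_hours": 10}, RULES)
can_save = not errors  # warnings alone do not block saving
print(errors, warnings, can_save)
```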

 

Questionnaire test (cognitive test)

Cognitive testing was conducted on a number of questionnaire items. The interview sample consisted of a random sample of the population and dwellings distributed across the regions of the Kingdom of Saudi Arabia.
During the cognitive testing process, the following evaluation pillars were taken into consideration: the overall concept of the question, clarity of question wording, clarity of terms used in the question, appropriateness of the response options, participants’ ability to answer the questions effectively, and the extent to which participants were willing to disclose their answers. This process resulted in a report summarizing the full findings of the cognitive test.

 

Statistical population

The statistical population of the Education and Training Statistics Publication comprises male and female individuals residing in the Kingdom of Saudi Arabia aged 5–19 years, excluding households that do not include children within the 5-19 age group.

 

Sample Design

    The sample was designed using a two-stage stratified systematic cluster random sampling method.
    The Saudi Census 2022 frame was used, and the basic sampling unit is the individual aged 5–19 years.
    A confidence level of (1-α) = 0.95 was used in calculating the estimates.
    The total sample size amounted to 27,785 households, distributed across 1,747 enumeration areas.
    The sample was representative at the administrative region level, excluding households that did not include children in the 5–19 age group.
Sample type:
The sample was designed using a two-stage stratified systematic cluster random sampling method. In the first stage, a random sample of primary sampling units (enumeration areas) was selected within each stratum of the adopted sampling design. In the second stage, a systematic random sample of housing units (households) was selected within each selected primary sampling unit.
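The two stages described above can be sketched in Python as follows. This is a simplified illustration, not the Authority's procedure: it assumes simple random selection of enumeration areas in the first stage and a fixed-interval systematic selection of households in the second. The toy frame, the function names, and the seed are hypothetical.

```python
import random


def systematic_sample(units, k, rng):
    """Select k units using a random start and a fixed sampling interval."""
    if k >= len(units):
        return list(units)
    step = len(units) / k
    start = rng.random() * step
    return [units[min(int(start + i * step), len(units) - 1)] for i in range(k)]


def two_stage_sample(frame, n_areas, n_households, seed=0):
    """frame maps stratum -> {enumeration_area_id: [household_ids]}."""
    rng = random.Random(seed)
    sample = {}
    for stratum, areas in frame.items():
        # Stage 1: random sample of enumeration areas within the stratum.
        chosen = rng.sample(sorted(areas), min(n_areas, len(areas)))
        for area in chosen:
            # Stage 2: systematic sample of households within the selected area.
            sample[(stratum, area)] = systematic_sample(areas[area], n_households, rng)
    return sample


# Hypothetical frame: two strata, 60 households per enumeration area.
frame = {
    "urban": {f"U{a}": [f"U{a}-H{h}" for h in range(60)] for a in range(10)},
    "rural": {f"R{a}": [f"R{a}-H{h}" for h in range(60)] for a in range(6)},
}
sample = two_stage_sample(frame, n_areas=4, n_households=18)
print(len(sample), "enumeration areas selected")
```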
Stratification:
To increase the efficiency of the sample and its representativeness of the target population, the primary sampling units in the sample frame were classified into homogeneous strata. This approach was aimed at obtaining more accurate results compared to a simple random sample of the same size. The stratification was carried out as follows:
    Governorates were used as actual strata to meet the requirement of producing survey indicators separately for each governorate.
    The degree of urbanization (urban, rural) was used as actual strata.
Allocating the sample across strata:
In this survey, the sample size was calculated at the administrative region level. The sample of each administrative region was then allocated to the strata within it using the probability proportional to size (PPS) allocation method, as follows:

$$n_h = n \times \frac{N_h}{\sum_{h} N_h}$$

Where:
•    $n$: Represents the total sample size of households in the Kingdom or within the sub-domain (administrative region).
•    $n_h$: Represents the sample size of households allocated to stratum h.
•    $N_h$: Represents the size of stratum h (total number of households in the sampling frame).
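Allocation in proportion to stratum size can be illustrated with a short Python sketch. It assumes each stratum receives a share of the regional sample equal to its share of households in the sampling frame. The stratum names and sizes are hypothetical, and rounding may leave the allocated total slightly different from the target in less tidy cases.

```python
def allocate_proportionally(total_sample: int, stratum_sizes: dict[str, int]) -> dict[str, int]:
    """Allocate total_sample across strata in proportion to their frame sizes."""
    frame_total = sum(stratum_sizes.values())
    return {
        stratum: round(total_sample * size / frame_total)
        for stratum, size in stratum_sizes.items()
    }


# Hypothetical frame sizes (households per stratum) for one administrative region.
stratum_sizes = {"A urban": 5000, "A rural": 3000, "B urban": 2000}
allocation = allocate_proportionally(1000, stratum_sizes)
print(allocation)
```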
Procedures related to small-sized strata:

•    A minimum sample size of 72 housing units was set for each stratum.
•    A minimum of two enumeration areas was assigned to each stratum to facilitate the calculation of variance in the estimates. 

It was found that two strata in the sampling frame, Jeddah Rural and Al-Khobar Rural (2012, 5052), contained no enumeration areas after excluding enumeration areas with fewer than 50 housing units; therefore, the calculated sample size for each of these strata was adjusted to zero.
It was found that three strata, Khafji Rural, Ras Tanura Rural, and Turaif Rural (5062, 5072, 9022), each contained only one enumeration area in the sampling frame. Accordingly, the sample in each of these strata consisted of one enumeration area, which is less than the planned number; therefore, the cluster size in these strata (i.e., the number of housing units selected from the enumeration area) was increased to 72 housing units.
It was found that, in eight strata (1172, 4012, 5002, 11012, 1202, 4052, 7052, 9032), the number of enumeration areas in the sample (4) exceeded those available in the sampling frame (2 or 3). Therefore, the number of enumeration areas in the sample for these strata was reassigned to match those available in the frame (2 or 3). To ensure that this adjustment did not reduce the household sample size in these strata, the number of housing units selected from each enumeration area (cluster size) was increased from 18 to 24 when the number of enumeration areas was 3, and from 18 to 36 when the number of enumeration areas was 2. As a result, the minimum number of housing units per stratum (72 housing units) was maintained.
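The cluster-size adjustments described above follow from keeping at least 72 housing units per stratum while the number of available enumeration areas shrinks. A minimal Python sketch of that arithmetic, assuming a planned 4 enumeration areas per small stratum as stated in the text (the function name is hypothetical):

```python
import math

MIN_HOUSING_UNITS = 72  # minimum housing units per stratum, as stated in the text
PLANNED_AREAS = 4       # planned enumeration areas in the small strata discussed


def adjusted_cluster_size(available_areas: int) -> tuple[int, int]:
    """Return (enumeration areas used, housing units per area) for a small stratum."""
    areas = min(PLANNED_AREAS, available_areas)
    if areas == 0:
        return 0, 0  # stratum has no enumeration areas; sample size set to zero
    return areas, math.ceil(MIN_HOUSING_UNITS / areas)


for available in (4, 3, 2, 1, 0):
    print(available, adjusted_cluster_size(available))
```

With 4, 3, 2, and 1 available enumeration areas, this gives cluster sizes of 18, 24, 36, and 72 housing units, matching the values reported above.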
Calculation of sample size:
The sample size calculation was carried out in two stages:
•    At the national level, the sample size was calculated for the three key indicators, resulting in three different sample sizes.
•    The largest sample size among the three values obtained in the previous step was selected, and the indicator that produced the largest value was used to calculate the sample size at the level of each administrative region.
Parameters and specifications used in estimating the sample size:
The sample size was calculated using the following parameters and specifications:
•    The estimates calculated from the sample must meet a specified level of precision, expressed as a coefficient of variation (CV). The allowable CV used in calculating the sample size was less than 1% at the national level, 2% at the administrative region level, and 2.6% at the governorate level.
•    After determining the sample size at the administrative region level, the sample size in each administrative region was allocated across its governorates, and the resulting coefficient of variation was calculated at the governorate level. The median coefficient of variation was 8.2%, with values ranging between 2.4% and 11.6%.
•    The expected response rate was assumed to be 80%. However, after calculating the sample size and comparing it with the sample size from the previous survey cycle (2024) and reviewing the coefficient of variation (CV) observed in the previous cycle, the response rates used in the sample size calculation were adjusted in light of the actual response rates from the previous cycle. This adjustment was made to ensure that the resulting sample size and the resulting coefficients of variation comply with the criteria specified in item (1) above.
•     A confidence level of (1–α) = 0.95 was used in calculating the estimates.
The design effect was calculated for three indicators: 
•    Educational attainment – Intermediate level: Design effect = 2.3 (national level).
•    Educational attainment – Secondary level: Design effect = 3.43 (national level).
•    Net enrolment rate – Secondary level: Design effect = 2.2 (national level).

The symbols used in the sample size calculation are defined as follows:
•    $N_h$: Represents the population size for each study domain h (administrative region).
•    $n_h$: Represents the sample size for each study domain h (administrative region).
•    $RME_h$: Represents the allowable relative margin of error in estimating the educational attainment rate (Intermediate education) for each study domain h (administrative region).
•    $deff_h$: Represents the estimated design effect for each study domain h (administrative region).
•    $R_h$: Represents the estimated response rate for each study domain h (administrative region).
•    $p_h$: Represents the educational attainment rate (Intermediate education) in study domain h (administrative region).
•    $z$: Represents the confidence level parameter for the above-mentioned proportion for each study domain h (administrative region).
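The exact sample size formula is not reproduced in this text. As an illustration only, the sketch below uses a common textbook form for estimating a proportion with a relative margin of error: a simple-random-sample size $z^2 (1-p) / (RME^2 \, p)$, an optional finite population correction, inflation by the design effect, and division by the expected response rate. This form is an assumption and may differ from the formula actually applied. The value p = 0.5 and the 5% relative margin of error are hypothetical, while the design effect of 2.3 and the 80% response rate are the figures quoted above.

```python
import math


def required_sample_size(p, rme, deff, response_rate, z=1.96, population=None):
    """Textbook sample size for a proportion p with relative margin of error rme.

    Assumed form (not confirmed by the source): SRS size, optional finite
    population correction, then inflation by design effect and non-response.
    """
    n_srs = (z ** 2) * (1 - p) / (rme ** 2 * p)
    if population is not None:
        n_srs = n_srs / (1 + n_srs / population)  # finite population correction
    return math.ceil(n_srs * deff / response_rate)


# Hypothetical p and rme; deff and response rate taken from the text above.
n_households = required_sample_size(p=0.5, rme=0.05, deff=2.3, response_rate=0.8)
print(n_households)
```

A larger design effect or a lower expected response rate increases the required sample, which is why the response-rate assumption was revisited against the previous cycle.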

Distribution of the sample at the level of administrative regions:

Administrative region | Number of households
Riyadh | 4,666
Makkah | 3,722
Madinah | 2,286
Qassim | 2,003
Eastern Region | 2,619
Aseer | 2,376
Tabuk | 1,522
Hail | 1,397
Northern Borders | 1,006
Jazan | 2,214
Najran | 1,490
Al-Baha | 1,170
Al-Jouf | 1,314
Total | 27,785

Statistical unit (sampling unit)

The statistical units in the Education and Training Statistics Publication are the individual and out-of-school activities.

 

Data collection

Data collection from the survey:
Data for the Education and Training Statistics Publication are collected through Computer-Assisted Personal Interviews (CAPI).
The data are then stored in the Authority’s databases after undergoing daily data validation and review processes, in accordance with approved statistical methods and recognized quality standards, such as: 
•    Verification that all questionnaire fields have been fully completed.
•    Data validation and consistency checks.
•    Ensuring the absence of duplicate data.
•    Ensuring that review and correction rules are functioning as intended

 

Data collection frequency 

The data collection process for the Education and Training Survey is conducted on an annual basis.

 

Reference area

The Education and Training Statistics Publication covers 13 administrative regions in the Kingdom of Saudi Arabia.

 

Reference period (time reference)

The reference periods for the variables and datasets are as follows:
•    Data related to household members and their demographic and educational characteristics are based on the individual’s interview date.
•    Bullying data are based on the 12 months preceding the interview date.
•    Data on school avoidance (voluntary absenteeism) are based on the 12 months preceding the interview date.
•    Data on participation in training activities are based on the 12 months preceding the interview date.

 

Base period

Not applicable.

 

Measurement unit

All results are calculated as percentages, for example: Completion rate.

 

Time coverage

Data are available from the year 2017 to 2024.

 

Publication frequency

The results of the Education and Training Survey statistics are published on an annual basis in accordance with the approved statistical plan.

 

Statistical processing

Error detection

Rigorous procedures are implemented to detect errors in the data collected during the field survey and stored in the data lake. This is achieved through the automation of the data collection tool and the application of necessary controls and procedures to regulate and manage entered data, ensuring quality, accuracy, and consistency. In addition, supporting methods are used to measure quality indicators and to examine the data for anomalies using clearly defined validation rules.
These procedures include:
•    Missing value treatment:
-    Mandatory field identification.
-    Non-response monitoring.
•    Duplicates:
-    Verification of the absence of duplicate records using unique identifiers.
•    Correctable outliers:
-    Correction of illogical values using arithmetic and standard rules.
•    Outliers requiring manual review:
-    Individual review of anomalous cases by specialists.
-    Documenting the cause of the error and applying the appropriate corrective action.
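Two of the checks listed above, mandatory-field identification and duplicate detection using unique identifiers, can be sketched in Python as follows. The field names, the `record_id` identifier, and the sample records are hypothetical and only illustrate the idea.

```python
from collections import Counter


def missing_mandatory(record: dict, mandatory_fields: list[str]) -> list[str]:
    """Return the mandatory fields that are absent or empty in a record."""
    return [f for f in mandatory_fields if record.get(f) in (None, "")]


def duplicate_ids(records: list[dict], id_field: str = "record_id") -> list[str]:
    """Return identifiers that appear more than once."""
    counts = Counter(r[id_field] for r in records)
    return sorted(i for i, c in counts.items() if c > 1)


# Hypothetical records, for illustration only.
MANDATORY = ["record_id", "age", "sex"]
records = [
    {"record_id": "A1", "age": 12, "sex": "F"},
    {"record_id": "A2", "age": None, "sex": "M"},
    {"record_id": "A1", "age": 12, "sex": "F"},
]

missing = {r["record_id"]: missing_mandatory(r, MANDATORY) for r in records}
duplicates = duplicate_ids(records)
print(missing, duplicates)
```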

 

Data integration and matching from multiple sources 

Not applicable, as the Education and Training Statistics rely on a single primary data source.

 

Imputation and calibration

•    Imputation procedures:
A direct imputation approach based on logical rules was applied to handle missing values and ensure consistency among variables. In addition, manual treatment was applied to selected cases to ensure data accuracy. Multiple imputation was not used.
•    Calculation of variables and aggregates:
Composite variables were derived, and categories were reclassified based on predefined logical conditions. Indicators and aggregates (percentages and averages) were calculated according to the main classifications (sex, age, and region).
•    Weights, calibration, and non-response adjustment: 
Sampling weights are factors used in analyzing data collected from a sample rather than from the entire population. Their purpose is to correct biases arising from differences in selection probabilities among households in the sample. This helps ensure that the analysis results are more accurately representative of the population.
Main uses of sampling weights:
•    Bias correction:
Adjusting for biases resulting from unequal selection probabilities among members of the population.
•    Population representation:
Ensuring that the results derived from the sample accurately reflect the true characteristics of the population.
How to calculate sampling weights:
When the sample is drawn using SPSS, design weights are automatically calculated at the household level and appear under the name “SampleWeight_Final”.

The design weight is expressed as the inverse of the selection probability of each individual in the sample; if it is not available, it can be calculated as follows:
If the selection probability of individual $i$ from stratum $h$ is denoted by $p_{hi}$, then the weight of the selected individual in the sample is given as follows:

$$w_{hi} = \frac{1}{p_{hi}}, \qquad p_{hi} = p_1 \times p_2$$

The first-stage selection probability is $p_1 = \frac{m}{M}$
Where:
•    m: Number of enumeration areas in the sample for each stratum.
•    M: Number of enumeration areas in the sampling frame for each stratum.

The second-stage selection probability is $p_2 = \frac{n}{N}$
Where:
•    n: Number of households in the sample for each enumeration area.
•    N: Number of households in the sampling frame for each enumeration area.
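As a worked illustration of the design weight, the sketch below assumes the overall selection probability is the product of a first-stage probability m/M and a second-stage probability n/N, using the quantities defined above. That product form is an assumption about the omitted formula, and the numbers are hypothetical.

```python
def design_weight(m: int, M: int, n: int, N: int) -> float:
    """Inverse of the overall selection probability (m/M) * (n/N).

    m, M: enumeration areas in the sample / in the frame for the stratum.
    n, N: households in the sample / in the frame for the enumeration area.
    """
    if min(m, M, n, N) <= 0:
        raise ValueError("all counts must be positive")
    first_stage = m / M
    second_stage = n / N
    return 1 / (first_stage * second_stage)


# Hypothetical stratum: 4 of 40 enumeration areas, 18 of 180 households per area.
weight = design_weight(m=4, M=40, n=18, N=180)
print(round(weight, 2))
```

Under these hypothetical counts each sampled household stands for roughly 100 households in the frame.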
Weight adjustment:
Adjustment due to the exclusion of part of the population:
Weights are adjusted to compensate for non-response or missing data to ensure proper representation of the sample. This adjustment is made after data collection and identification of the response cases, and is calculated using the following formula:

$$w'_{hi} = w_{hi} \times f_{hm}$$

Whereas:
•    $w'_{hi}$: The non-response-adjusted weight for household i in stratum h.
•    $w_{hi}$: The design weight for household i in stratum h.
•    $f_{hm}$: The adjustment factor for enumeration area m in stratum h, which is calculated as follows:

$$f_{hm} = \frac{R + NR}{R}$$

Where R represents the responses and NR represents the non-responses.
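The non-response adjustment can be illustrated with a short Python sketch. It assumes the adjustment factor for an enumeration area is (R + NR) / R, so that responding households also carry the weight of non-responding ones. That form is an assumption about the omitted formula, and the counts and design weight are hypothetical.

```python
def nonresponse_factor(responses: int, nonresponses: int) -> float:
    """Adjustment factor (R + NR) / R for one enumeration area."""
    if responses <= 0:
        raise ValueError("at least one responding household is required")
    return (responses + nonresponses) / responses


# Hypothetical enumeration area: 15 responding and 3 non-responding households,
# each with a design weight of 100.
design_w = 100.0
factor = nonresponse_factor(responses=15, nonresponses=3)
adjusted_w = design_w * factor
print(factor, adjusted_w)
```

In this example the 15 respondents, at the adjusted weight, represent the same total as all 18 eligible households at the design weight.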
Weight calibration (final weights):
When survey indicators relate to individuals, the weights are calibrated (adjusted) to align with the population distribution based on known characteristics such as age, sex, nationality, and administrative region. This ensures that the sample is representative and comprehensive with respect to these characteristics. If under-representation or a shortage in the number of responding individuals is observed for any of the aforementioned characteristics, two options are available:
The first option is to merge weight calibration categories that contain no sample or have a very small sample size with other categories that have higher response rates.
The second option is to use the raking weights method instead of post-stratification.
Calculation of weights for individuals aged 5–19 years using the raking weights method:
Objective: A two-dimensional raking procedure will be implemented to calibrate the sample weights to the population estimates.
Calibration will be carried out using two levels:
•    At the Kingdom level:
By creating calibration categories consisting of a combination (code) of single-year age groups, nationality, and sex, referred to as (dim2).
•    At the administrative region level (ADMIN):
By creating calibration categories consisting of a combined code of the administrative region code, three-year age groups, nationality, and sex, referred to as (dim1).
The R script used for weight calibration consists of the following main components:

•    Reading the data file (Member_ETS) and the population projection files (dimension1) and (dimension2).
•    Trimming (capping) of design weights to eliminate outliers in the weights
•    Specification of the sampling design
•    Calibration of the trimmed weights
•    Validation of the calibrated (final) weights
•    Saving the calibrated (final) weights to an Excel file
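The calibration step listed above can be illustrated with a generic two-dimensional raking (iterative proportional fitting) routine. The Python sketch below is not the Authority's R script and omits the trimming and design-specification steps. The category codes, starting weights, and population totals are hypothetical stand-ins for the (dim1) and (dim2) calibration categories.

```python
def rake(records, targets, weight_key="weight", max_iter=100, tol=1e-8):
    """Adjust weights until weighted totals match the targets in each dimension.

    records: list of dicts holding a starting weight and one category per dimension.
    targets: {dimension: {category: population_total}}.
    """
    weights = [r[weight_key] for r in records]
    for _ in range(max_iter):
        max_change = 0.0
        for dim, totals in targets.items():
            # Current weighted total per category of this dimension.
            current = {}
            for r, w in zip(records, weights):
                current[r[dim]] = current.get(r[dim], 0.0) + w
            # Scale each weight so the category totals hit the targets.
            for i, r in enumerate(records):
                factor = totals[r[dim]] / current[r[dim]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return weights


# Hypothetical respondents and population totals, for illustration only.
records = [
    {"dim1": "R1", "dim2": "M", "weight": 10.0},
    {"dim1": "R1", "dim2": "F", "weight": 20.0},
    {"dim1": "R2", "dim2": "M", "weight": 30.0},
    {"dim1": "R2", "dim2": "F", "weight": 40.0},
]
targets = {
    "dim1": {"R1": 60.0, "R2": 40.0},
    "dim2": {"M": 55.0, "F": 45.0},
}
final_weights = rake(records, targets)
print([round(w, 3) for w in final_weights])
```

Raking only requires the marginal totals for each dimension, which is why it remains usable when some cross-classified cells have few or no respondents.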
The survey response outcomes after the data collection process are as follows:

Administrative region | Provided complete data | Non-response | Over-coverage | Total
Riyadh | 4076 | 33 | 481 | 4666
Makkah | 3107 | 116 | 338 | 3722
Madinah | 1664 | 210 | 376 | 2286
Qassim | 1539 | 0 | 172 | 2003
Eastern Region | 1549 | 465 | 417 | 2619
Aseer | 1804 | 97 | 338 | 2376
Tabuk | 1236 | 93 | 163 | 1522
Hail | 1149 | 83 | 149 | 1397
Northern Borders | 776 | 41 | 186 | 1006
Jazan | 2194 | 0 | 2 | 2214
Najran | 1105 | 0 | 380 | 1490
Al-Baha | 847 | 80 | 127 | 1170
Al-Jouf | 999 | 87 | 152 | 1314
Total | 22045 | 1305 | 3281 | 27785

Seasonal adjustments

Not applicable; only final results are published.

 

Adjustment of preliminary results 

Not applicable; only final results are published.

 

Resources used

Description | Total
Total employees (GASTAT employees and researchers) | 194
Total number of days of the data collection period (end date − start date) | 30 days
Average number of interviews conducted per day (during the data collection period) | 4

Quality dimensions

Suitability

A criterion that indicates the extent to which the product meets users’ needs.

 

User needs 

Internal users of the Education and Training Statistics Publication data within GASTAT:
•    Social statistics:
-    Population, gender and diversity.
-    Living conditions, lifestyles and justice statistics.
-    Health and education statistics.
There are a significant number of external users and beneficiaries of the education and training statistics publication data, including:
•    Government entities.
•    Regional and international organizations.
•    Research institutions.

•    Media.
•    Individuals.
Key variables most utilized by external users:

Entity | Key variables
Ministry of Education | Completion rate; Out-of-school children.
Ministry of Economy and Planning | Completion rate; Participation in training.
The Education and Training Evaluation Commission (ETEC) | Parental participation indicators.
Human Capability Development Program | Parental participation.
Ministry of Human Resource and Social Development | Out-of-school activities.
Family Affairs Council | Completion; Parental participation; Out-of-school activities.
Technical and Vocational Training Corporation | Participation in training.

Completeness 

The Education and Training Survey data are characterized by a high level of completeness, as they cover all key variables (demographic characteristics, enrolment, educational attainment, and parental participation) and include all administrative regions and target groups in the Kingdom of Saudi Arabia, in line with national and international standards. Any missing or inconsistent data were addressed using logical imputation methodologies to ensure the accuracy of the results and to minimize any impact on the quality of the analysis.

 

Accuracy and reliability 

A measure that indicates how close calculations or estimates are to the true or exact values that reflect reality.

 

Overall accuracy 

Accuracy is defined as the degree to which the estimated value of a statistical indicator is close to the true value of the target population.
Accuracy is assessed by analyzing sources of error into two main components:
•    Sampling errors:
This type of error arises from using a sample instead of a complete enumeration, and it is the only component of total error that can be measured quantitatively. It is statistically expressed using the Relative Standard Error (RSE) and confidence intervals (CI).
•    Non-sampling errors:
Non-sampling error encompasses all errors that may occur at any stage of the survey, except those arising from sampling.
They are classified into four main types:
•    Coverage errors:
These errors occur due to a mismatch between the sampling frame and the target population, resulting in over-coverage (such as housing units that have been merged into another dwelling) or under-coverage.
•    Measurement errors:
These errors arise from deviations between the recorded value and the true value of a characteristic and are attributable to factors such as interviewer performance.
•    Processing errors:
Errors that occur during the data preparation stages after data collection and before analysis, including processes such as coding, data entry, editing, and imputation.
•    Non-response errors:
These errors occur due to systematic differences between the characteristics of respondents and non-respondents. Their impact on bias is assessed through the analysis of non-response rates (such as refusals or inability to contact).
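The two sampling-error measures named above, the Relative Standard Error (RSE) and the confidence interval (CI), can be sketched in Python as follows. The estimate and standard error are hypothetical. The value z = 1.96 corresponds to the 95% confidence level used in this survey and assumes a normal approximation; the standard error itself would come from a design-based variance estimator that is not shown here.

```python
def rse_and_ci(estimate: float, standard_error: float, z: float = 1.96):
    """Return (RSE in percent, (lower, upper) confidence interval)."""
    if estimate == 0:
        raise ValueError("RSE is undefined for a zero estimate")
    rse = standard_error / estimate * 100
    interval = (estimate - z * standard_error, estimate + z * standard_error)
    return rse, interval


# Hypothetical indicator: estimate of 85.0% with a standard error of 1.7 points.
rse, (lower, upper) = rse_and_ci(estimate=85.0, standard_error=1.7)
print(f"RSE = {rse:.1f}%, 95% CI = ({lower:.2f}, {upper:.2f})")
```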

 

Timeliness and punctuality 

Timeliness is a standard that measures the time gap between the occurrence of the event and the availability of the information.
Punctuality reflects the time difference between the date on which the data are actually published and the target date on which they were scheduled to be published.

 

Timeliness 

The General Authority for Statistics is keen to apply internationally recognized standards regarding the announcement and clarification of the timing of statistical releases on its official website through the statistical calendar, as well as adherence to the announced publication time. In the event of any delay, the schedule will be updated accordingly.

 

Punctuality 

The publication takes place according to the published release dates on the statistical calendar for education and training statistics publication on the website of the General Authority for Statistics.
The data are available at the expected time, as scheduled in the statistical release calendar. If the publication is delayed, reasons shall be provided.

 

Coherence and comparability

A standard that refers to the necessity of internal and temporal consistency of statistics, their logical coherence, and their comparability and integration across different regions and sources.

 

Comparability - geographical

•    Comparability:
The data are comparable at both national and international levels, as they adhere to international statistical classifications and standards (such as the Sustainable Development Goals (SDGs) indicators).
•    Locally (across regions):
The data are comparable across all administrative regions in the Kingdom, due to the use of unified official geographic classifications and the application of consistent methodology and data collection tools nationwide, including the Saudi Standard Classification of Educational Levels and Fields.
•    Internationally:
The data are internationally comparable due to alignment with global standards and classifications (such as ISCED), and through linking the results to harmonized international indicators (such as SDG 4 under the Sustainable Development Goals (SDGs)).
•    Interpretation of differences:
Any changes in administrative boundaries or variations in coverage and data quality are documented to explain any differences in results between regions or in comparisons with other countries.

 

Comparability - over time 

The survey was launched in 2017 as a triennial survey, and the following are the main changes introduced in recent years:
•    2020:
It was not implemented due to the (Covid-19) pandemic.
•    2021- 2022:
The survey was not conducted due to the Saudi Census 2022.
•    2023:
The Education and Training Statistics publication was not published due to the harmonization work with the Ministry of Education.
•    2024:
Available on the Authority’s website.

 

Coherence - Cross domain

The data are consistent, as their coherence is verified against all other statistics containing similar indicators. These procedures contribute to ensuring integration and coherence among statistics, thereby enhancing data reliability and the quality of the analyses based on them, and ensuring that the results are free from any unjustified inconsistencies.

 

Coherence - Sub-annual and annual statistics

Not applicable, as the Education and Training Survey is published on an annual basis only.

 

Coherence - National Accounts

Not applicable, as the Education and Training Survey data are not comparable with National Accounts data.

 

Coherence - Internal

The Education and Training Statistics Publication estimates for the reference period have full internal coherence, as they are all based on the same set of microdata and are calculated using the same estimation methods:
•    Logical consistency of data:
The survey is characterized by a high level of internal consistency to ensure the logical coherence of the data. This was achieved by applying validation rules and automated skip patterns during the data collection process to ensure the absence of inconsistencies between responses.
•    Consistency across different measures:
Extensive post-collection validation checks were conducted to ensure consistency across different measures. 
•    Internal consistency issues:
Any internal consistency issues were identified and addressed through direct rule-based imputation or manual adjustment in necessary individual cases, to ensure the absence of numerical inconsistencies in the final outputs.
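A minimal sketch of how a validation rule and a skip pattern of the kind described above could be checked on a single record is given below. The field names ("age", "enrolled", "school_type") and the rules themselves are hypothetical; only the 5-19 age range comes from the publication's stated target group.

```python
def validate_record(record: dict) -> list[str]:
    """Return a list of consistency problems found in one survey record.

    Field names and rules are illustrative assumptions, not the actual
    questionnaire logic.
    """
    errors = []

    # Range rule: the publication targets the 5-19 age group.
    age = record.get("age")
    if age is None or not 5 <= age <= 19:
        errors.append("age outside the 5-19 target range")

    # Skip pattern: school questions apply only to enrolled respondents.
    enrolled = record.get("enrolled")
    school_type = record.get("school_type")
    if enrolled is False and school_type is not None:
        errors.append("school_type answered although respondent is not enrolled")
    if enrolled is True and school_type is None:
        errors.append("school_type missing for an enrolled respondent")

    return errors


# Hypothetical records for illustration only.
clean = validate_record({"age": 12, "enrolled": True, "school_type": "public"})
flawed = validate_record({"age": 25, "enrolled": False, "school_type": "public"})
```

Records that return a non-empty list would then be candidates for rule-based imputation or manual review.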

 

Accessibility and clarity

A standard that refers to the accessibility of data for users, the availability of detailed or aggregated data, and the availability of the methodology and quality report.

 

Press releases

The announcements for each publication are available on the statistical calendar as mentioned in 10.1. The press releases can be viewed on the GASTAT website at the following link:
Press release

 

Publications

GASTAT issues the Education and Training Statistics Publication on a regular basis within a pre-prepared dissemination plan, and it is published on GASTAT’s website.  GASTAT is keen to disseminate its results in a manner that serves all types of users, including publications in various formats that contain dissemination tables and charts for data and indicators, the methodology and quality report, and the survey questionnaires used, in both English and Arabic.
The Education and Training Statistics publication is available at the link:
Education and Training Statistics

 

Online database

The data are disseminated in the statistical database, available at:
 GASTAT (stats.gov.sa)

 

Microdata accessibility

Microdata are unit-level disaggregated data obtained from multiple sources such as sample statistical surveys, general population and housing censuses, and administrative systems. They provide detailed information about the characteristics of individuals, households, business entities, and geographical areas, and support the construction and development of statistical indicators and scientific research.
Different types of microdata files are available to meet diverse information needs.
•    Public use: 
It consists of sets of records containing information on individuals, households, or business entities anonymized in such a way that the respondent cannot be identified either directly, such as by name, address, contact number, identity number, etc., or indirectly (by combining different – especially rare – characteristics of respondents), such as age, occupation, education, etc.
•    Scientific use:
Microdata files are produced in accordance with defined methodologies and in response to data users’ requests for datasets with specific characteristics, supporting strategic studies, decision-making, and scientific research on individuals, households, and establishments, while ensuring the exclusion of any direct identifiers and compliance with confidentiality protection controls.
Qualified users who meet the confidentiality protection standards and procedures can access scientific-use microdata files through the "ITAHA" platform of the General Authority for Statistics, while the most sensitive data are accessed by visiting the microdata laboratory, a secure environment managed by the Authority.

 

References and standards

•    Education Indicators Guide issued by the UNESCO Institute for Statistics (UIS):
Education Indicators Guide 
•    Summary of Sustainable Development Goal 4 data based on household survey data for SDG 4 monitoring:
Sustainable Development Goal 4 (SDG 4)

 

Quality assurance

GASTAT declares that it considers the following principles: impartiality; ensuring that the statistical product is user-oriented; maintaining the quality of processes and outputs; enhancing the effectiveness of statistical operations; and reducing the burden on respondents.
Data are validated through quality-control procedures applied at various stages of the process, such as data collection, data entry, and final checks.

 

Quality assessment

The General Authority for Statistics performs all statistical activities in accordance with the national model, the Generic Statistical Business Process Model (GSBPM). Under the GSBPM, the final phase of statistical activities is the overall evaluation stage, during which the information collected in each phase or sub-process is used to prepare an evaluation report that summarizes all challenges related to the quality of each statistical process and serves as input for improvement and development actions.

 

Confidentiality

Confidentiality – Policy

According to Royal Decree No. 23 dated 07/12/1379, data must always be kept confidential and must be used by GASTAT for statistical purposes only.
Therefore, the data is protected in the data servers of GASTAT.

 

Confidentiality - Data Treatment

The data of the Education and Training Statistics publication are presented in suitable tables in order to summarize and understand them, extract their results, compare them with other data, and draw statistically meaningful conclusions about the selected study population. Referring to the data in these tables is also much easier than going back to the original questionnaire, which may include data such as the names and addresses of individuals and the names of data providers, the disclosure of which would violate the confidentiality of statistical data.
“Anonymity of data” is one of the most important procedures. To keep data confidential, GASTAT removes information on individual persons, households, or business entities in such a way that the respondent cannot be identified either directly (by name, address, contact number, identity number, etc.) or indirectly (by combining different, especially rare, characteristics of respondents, such as age, occupation, education, etc.).
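The risk of indirect identification through rare combinations of characteristics can be illustrated with a simple frequency check. The sketch below is not GASTAT's procedure; it assumes hypothetical records and a hypothetical threshold k, and flags any combination of the chosen characteristics shared by fewer than k respondents.

```python
from collections import Counter


def rare_combinations(records, quasi_identifiers, k=3):
    """Return combinations of characteristics held by fewer than k records.

    The choice of characteristics and the threshold k are assumptions
    made for illustration only.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical records for illustration only.
sample = [
    {"age": 10, "education": "primary"},
    {"age": 10, "education": "primary"},
    {"age": 19, "education": "secondary"},
]
flagged = rare_combinations(sample, ["age", "education"], k=2)
```

Flagged combinations would typically be handled by coarsening categories (for example, age bands) or suppressing the affected values before release.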

 

Dissemination policy

Statistical calendar

The Education and Training Statistics Publication has been included in the statistical calendar.
Statistical Calendar

 

User access

One of GASTAT’s objectives is to better meet its clients' needs, so it immediately provides them with the publication's results once the Education and Training Statistics publication is published.
Customer questions and inquiries about the publication and its results are also received through various communication channels, such as:
•    GASTAT official website: www.stats.gov.sa
•    GASTAT official email address:   info@stats.gov.sa
•    Official visits to GASTAT’s official head office in Riyadh or one of its branches in Saudi Arabia.
•    Official letters.
•    Statistical telephone: (199009).