
Original research
Staff–pupil SARS-CoV-2 infection pathways in schools in Wales: a population-level linked data approach
  1. Daniel A Thompson,
  2. Hoda Abbasizanjani,
  3. Richard Fry,
  4. Emily Marchant,
  5. Lucy Griffiths,
  6. Ashley Akbari,
  7. Joe Hollinghurst,
  8. Laura North,
  9. Jane Lyons,
  10. Fatemeh Torabi,
  11. Gareth Davies,
  12. Mike B Gravenor,
  13. Ronan A Lyons
  1. Swansea University Medical School, Swansea University, Swansea, UK
  1. Correspondence to Dr Richard Fry; r.j.fry{at}


Background Better understanding of the role that children and school staff play in the transmission of SARS-CoV-2 is essential to guide policy development on controlling infection while minimising disruption to children’s education and well-being.

Methods Our national e-cohort (n=464531) study used anonymised linked data for pupils, staff and associated households linked via educational settings in Wales. We estimated the odds of testing positive for SARS-CoV-2 infection for staff and pupils over the period August–December 2020, dependent on measures of recent exposure to known cases linked to their educational settings.

Results The total number of cases in a school was not associated with a subsequent increase in the odds of testing positive (staff OR per case: 0.92, 95% CI 0.85 to 1.00; pupil OR per case: 0.98, 95% CI 0.93 to 1.02). Among pupils, the number of recent cases within the same year group was significantly associated with subsequent increased odds of testing positive (OR per case: 1.12, 95% CI 1.08 to 1.15). These effects were adjusted for a range of demographic covariates, and in particular any known cases within the same household, which had the strongest association with testing positive (staff OR: 39.86, 95% CI 35.01 to 45.38; pupil OR: 9.39, 95% CI 8.94 to 9.88).

Conclusions In a national school cohort, the odds of staff testing positive for SARS-CoV-2 infection were not significantly increased in the 14-day period after case detection in the school. However, pupils were found to be at increased odds, following cases appearing within their own year group, where most of their contacts occur. Strong mitigation measures over the whole of the study period may have reduced wider spread within the school environment.

  • SARS-CoV-2
  • schools
  • disease transmission
  • public health

Data availability statement

Data are available upon reasonable request. The data used in this study are available in the Secure Anonymised Information Linkage (SAIL) Databank at Swansea University, Swansea, UK. All proposals to use SAIL data are subject to review by an independent Information Governance Review Panel (IGRP). Before any data can be accessed, approval must be given by the IGRP. The IGRP gives careful consideration to each project to ensure proper and appropriate use of SAIL data. When access has been approved, it is gained through a privacy-protecting safe haven and remote access system referred to as the SAIL Gateway. SAIL has established an application process to be followed by anyone who would like to access data via SAIL:

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


What is known about the subject?

  • Evidence of the role schools play in the transmission of SARS-CoV-2 is limited.

  • Higher positivity rates are observed in school staff compared to pupils.

  • There is a lack of evidence on transmission pathways into and within schools.

What this study adds?

  • A national level study of transmission between pupils and staff in a school environment during the SARS-CoV-2 pandemic.

  • Schools opening September–December 2020 was not associated with an increased subsequent risk of testing positive in staff.

  • Pupils were found to be at increased risk of testing positive, following cases appearing within their own year group.


The role schools play in the transmission of SARS-CoV-2 requires further robust evidence. There is ongoing debate regarding closures and related concerns about the negative impacts on, and widening inequalities in, children’s health, well-being and educational attainment, as well as on family income and the overall economy. Since the WHO declared the SARS-CoV-2 outbreak a global pandemic on 11 March 2020,1 education for children and young people has varied between online, in-person and hybrid learning, with wide variation in the measures implemented for different groups, within school settings and between countries.2

Current evidence suggests that younger children are less susceptible to infection3 and have considerably milder disease compared with adults.4 SARS-CoV-2 positivity rate within the school setting has been low3 5 and higher positivity rates are observed in school staff compared with pupils.5 In the UK, enhanced surveillance was undertaken following the reopening of schools during the summer half-term 2020, confirming that while overall risk of infection was low among pupils and staff, there was a higher risk of SARS-CoV-2 infection among staff and staff–staff transmission was most common.6

Emerging research from the UK Office for National Statistics COVID-19 Infection Survey and Schools Infection Survey7 8 reports increased transmission among school staff and school-aged children, particularly aged 12 years and above (secondary school age) towards the end of 2020, against a background of high community prevalence. However, the evidence base is still limited and does not cover the dynamics of transmission and infection from households to schools, and within the school setting.

This study contributes to this body of evidence through analyses of population-level data held within the Secure Anonymised Information Linkage (SAIL) Databank.9–11 By linking data on all staff, pupils and associated household contacts in Wales, we aimed to improve understanding of likely transmission pathways into and through educational settings. We assessed the likelihood of test positivity in pupils and staff in relation to other recent cases in linked pupils, staff or their households.

Methods

e-cohort creation

We created an e-cohort of school children (aged 4–17 years), school staff and linked household members for both children and staff (figure 1). The e-cohort was created using anonymised linked data held within the SAIL Databank at Swansea University.9–11 Data are anonymised at an individual and household level.12 13 Our primary health data cohort was the Welsh COVID-19 e-cohort,14 which consists of all people alive and known to the NHS in Wales on or after 1 January 2020. To this core, we linked the School Workforce Annual Census, which details all individuals who work in a publicly funded school15 covering 1498 out of 1502 schools in Wales; and the Pupil Level Annual School Census,16 which includes annual returns on 1480 out of 1502 schools. Finally, we linked COVID-19 antigen testing data to the cohort. These data combined pillar 1 and pillar 2 data collected by Public Health Wales (PHW).17 Pillar 1 is swab testing in PHW laboratories and NHS hospitals for those with a clinical need, and health and care workers; and pillar 2 is swab testing for the wider population, as set out in government guidance. These linkages are summarised in figure 1.

Figure 1

Health and administrative education data linkages. Four data sources are used to create our e-cohort: the Welsh COVID-19 e-cohort, SWAC, PLASC and COVID-19 antigen testing data. We linked SWAC and PLASC to the Welsh COVID-19 e-cohort. We also linked staff and pupils via educational settings using a SALF. Furthermore, we linked staff and pupils to their household members using the Welsh COVID-19 e-cohort. Missing variables of staff and pupils (in the Welsh COVID-19 e-cohort) before being confirmed eligible are reported in online supplemental table S3. PLASC, Pupil Level Annual School Census; SALF, School Anonymised Linking Field; SWAC, School Workforce Annual Census.


Our e-cohort study used pupils, staff and linked household members in Wales grouped into educational settings using a School Anonymised Linking Field. We followed participants from 1 August 2020 to 25 December 2020. Our educational setting data are recent up to the end of the academic year 2019–2020. Therefore, we removed from the statistical models pupils who finished primary school (year 6) or secondary school (year 11) in the school year 2019/2020, because it is not possible to confirm their linked educational setting over the study period. Staff members contracted to multiple schools (ie, peripatetic teachers) were also removed because it was not possible to determine the time spent within each school.
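The exclusion rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record types, field names (`school_year_2019_20`, `school_ids`) and example identifiers are hypothetical and do not reflect the actual SAIL data structures.

```python
from dataclasses import dataclass


@dataclass
class Pupil:
    pupil_id: str               # hypothetical anonymised identifier
    school_year_2019_20: int    # school year attended in 2019/2020


@dataclass
class StaffMember:
    staff_id: str               # hypothetical anonymised identifier
    school_ids: tuple           # anonymised schools the person is contracted to


# Final years of primary (year 6) and secondary (year 11) school, per the text.
TRANSITION_YEARS = {6, 11}


def eligible_pupils(pupils):
    """Drop pupils whose 2019/2020 school link cannot be confirmed for autumn 2020."""
    return [p for p in pupils if p.school_year_2019_20 not in TRANSITION_YEARS]


def eligible_staff(staff):
    """Drop staff contracted to more than one school (eg, peripatetic teachers)."""
    return [s for s in staff if len(set(s.school_ids)) == 1]


pupils = [Pupil("p1", 5), Pupil("p2", 6), Pupil("p3", 11), Pupil("p4", 9)]
staff = [StaffMember("s1", ("A",)), StaffMember("s2", ("A", "B"))]

kept_pupils = [p.pupil_id for p in eligible_pupils(pupils)]
kept_staff = [s.staff_id for s in eligible_staff(staff)]
print(kept_pupils, kept_staff)
```

In this toy input, the year 6 and year 11 pupils and the staff member linked to two schools are dropped; everyone else is retained.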

Patient and public involvement

All proposals to use anonymised data in SAIL are scrutinised by an independent Information Governance Review Panel that includes members of the public prior to the commencement of the research.

Statistical modelling

Our outcome was the probability of testing positive following a pillar 1 or pillar 2 test. When an individual had multiple test results, the outcome was assigned as follows: if any test returned positive, the individual’s outcome was positive and the date of the positive test was taken as the date-of-interest; if all tests returned negative, the individual’s outcome was negative and the date of the most recent negative test was taken as the date-of-interest. Exposure was determined by the number of school-linked positive cases in the 14-day period preceding the collection date of the outcome’s specimen (date-of-interest). Exposure measures investigated were: (1) total number of cases within the linked school, (2) total number of cases within the linked household, (3) total number of cases in any households linked to the school and (4) total number of cases within the same year group (pupils only), which represents the pupil population in which the vast majority of contacts for an individual pupil would occur.
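The outcome and exposure definitions can be illustrated with a minimal Python sketch. Two details are assumptions made here rather than taken from the text: when an individual has several positive tests the earliest is used as the date-of-interest, and the 14-day window includes its first day but excludes the date-of-interest itself. The function names and example dates are hypothetical.

```python
from datetime import date, timedelta

EXPOSURE_WINDOW_DAYS = 14


def outcome_and_date(tests):
    """tests: list of (specimen_date, is_positive) for one individual.

    Returns (outcome, date_of_interest).
    """
    positives = [d for d, is_positive in tests if is_positive]
    if positives:
        # Assumption: with several positive tests, take the earliest.
        return True, min(positives)
    # All negative: use the most recent negative test.
    return False, max(d for d, _ in tests)


def cases_in_window(case_dates, date_of_interest, window_days=EXPOSURE_WINDOW_DAYS):
    """Count linked cases in the window before the date-of-interest.

    Assumption: the window is [date_of_interest - window_days, date_of_interest).
    """
    start = date_of_interest - timedelta(days=window_days)
    return sum(1 for d in case_dates if start <= d < date_of_interest)


tests = [
    (date(2020, 10, 1), False),
    (date(2020, 10, 20), True),
    (date(2020, 11, 2), False),
]
outcome, doi = outcome_and_date(tests)

# Hypothetical specimen dates of other cases linked to the same school.
school_case_dates = [
    date(2020, 10, 5),
    date(2020, 10, 6),
    date(2020, 10, 12),
    date(2020, 10, 20),
    date(2020, 10, 25),
]
school_exposure = cases_in_window(school_case_dates, doi)
print(outcome, doi, school_exposure)
```

The same counting function could be applied to case dates from the linked household, from all households linked to the school, or from the pupil's year group to obtain the other three exposure measures. In practice the individual's own positive test would also need to be excluded from the counts.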

We used binary logistic regression to determine the ORs for a positive outcome after a SARS-CoV-2 test. We first combined both staff and pupils test results to determine general associations (model 1, M1), with a categorical variable indicating whether an individual was a staff member or a pupil at the linked school. We then stratified by staff (M2) and pupil outcomes (M3). Individuals with any missing covariate data were removed. As additional covariates, we included age, sex, rurality,18 school type and number of staff and pupils in the same school.
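As a reminder of how ORs relate to a fitted logistic regression, the sketch below converts a log-odds coefficient and its standard error into an OR with a Wald 95% CI, and evaluates the inverse-logit for one individual. It is not the authors' code: the coefficient is simply back-calculated from an OR of 1.12 and the standard error is invented for illustration.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def predicted_probability(intercept, betas, exposures):
    """Inverse-logit of the linear predictor for one individual."""
    linear = intercept + sum(b * x for b, x in zip(betas, exposures))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical values, chosen for illustration only.
beta_year_group = math.log(1.12)   # coefficient per additional year-group case
se_year_group = 0.016              # invented standard error

or_, lower, upper = odds_ratio_with_ci(beta_year_group, se_year_group)
print(f"OR {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because each exposure enters the linear predictor additively on the log-odds scale, the OR per case is the exponential of its coefficient, which is why the results are reported "per case".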


Cohort characteristics

The study was based on 464531 pupils and staff attending schools in Wales. Details of numbers, school categories, tests and percentage positive are shown in table 1.

Table 1

Cohort summary

Potential routes of transmission

Table 2 summarises the different settings in which potential exposure to the SARS-CoV-2 virus can be identified, based on a time window of 14 days preceding a positive test. The large majority of pupils and staff had a recorded exposure in either their household or school. There were recent potential exposures at school for 76% of positive staff, with 59% having school-but-not-household exposure. For pupils, 83% had recent school cases, with 44% having school-but-not-household exposure.

Table 2

Distribution of known potential exposure to infection by setting for staff and pupils (excluding staff contracted to multiple schools, and pupils aged 11 years or 18+ years)
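A breakdown of this kind amounts to classifying each positive individual by where known cases occurred in the preceding 14 days. The following Python sketch shows one way to do that; the category labels and the example counts are hypothetical and are not the study data.

```python
from collections import Counter


def exposure_setting(school_cases, household_cases):
    """Classify one positive individual by setting of known recent cases."""
    if school_cases > 0 and household_cases > 0:
        return "school and household"
    if school_cases > 0:
        return "school only"
    if household_cases > 0:
        return "household only"
    return "no known exposure"


# Hypothetical (school_cases, household_cases) counts for five positive individuals.
records = [(2, 0), (1, 1), (0, 1), (0, 0), (3, 0)]

counts = Counter(exposure_setting(s, h) for s, h in records)
any_school = counts["school only"] + counts["school and household"]
share_any_school = any_school / len(records)
share_school_only = counts["school only"] / len(records)
print(counts, share_any_school, share_school_only)
```

Here `share_any_school` corresponds to the "recent potential exposures at school" figure and `share_school_only` to the "school-but-not-household" figure quoted in the text.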

Effect of school exposure on odds of a positive test

In unadjusted analyses (online supplemental tables S1 and S2), we found significantly increased odds of testing positive across all settings, following known cases in linked schools and households. However, after adjusting for age, sex, rurality, school type, household case exposure and numbers of staff/pupils in school/household, we found that total numbers of cases in the preceding 14 days in the school was associated with lower odds of testing positive (staff OR: 0.93, 95% CI 0.89 to 0.97; pupils OR: 0.97, 95% CI 0.95 to 0.98; table 3 M1).

Table 3

Fully adjusted multivariable logistic regression results (M1 staff and pupils; M2 stratified by staff; and M3 stratified by pupils). Adjustments for age, sex, residential settlement type, number of pupils and staff within the linked school, and number of people within the linked household are included in the models (ORs of the fully adjusted covariates can be found in online supplemental table S2). ORs are calculated per individual case of known exposure.

Unsurprisingly, by far the strongest signal in the data (for both staff and pupils) was related to exposure to known cases in the individual’s own household (table 3, M1–M3). We also found a significant association with cases in any households linked to the school (table 3, M1–M3).

When stratifying by staff test results, and after adjusting for covariates (including household cases), the total number of cases occurring in a linked school setting was again associated with slightly lower odds of a positive SARS-CoV-2 outcome (staff OR: 0.92, 95% CI 0.85 to 1.00; pupil OR: 0.98, 95% CI 0.93 to 1.02). Staff members in primary and special schools had higher odds of a SARS-CoV-2 positive test compared with those in middle and secondary schools, and staff had higher odds of a positive outcome compared with the reference level of pupils (OR: 2.99, 95% CI 1.67 to 5.37, p<0.001) (online supplemental table S2).

When stratifying by pupils, and adjusting for covariates (including household cases), the total number of staff and non-year group cases in the school was not associated with increased odds of testing positive (table 3). However, in contrast, the number of cases in pupils within the same year group was significantly associated with testing positive (OR: 1.12, 95% CI 1.08 to 1.15).


Summary of main findings

Our results show that the total number of SARS-CoV-2 positive staff and pupils within a school following the reopening in Wales in September 2020 was not associated with an increased subsequent odds of testing positive in staff or pupils. By including likely household exposure and the number of cases in all households linked to the school in the models, we aimed to adjust for one of the primary routes of transmission (own household), and also a proxy measure of community prevalence, which increased considerably over the period. The lack of association at the school level sheds light on the effectiveness of reducing transmission within the school environment, and also on the policy of isolation following exposure.19 Wales adopted an aggressive policy of school year group (secondary), school class (primary) and large bubble closures following the detection of cases, even when prevalence was low. Notably, the number of pupils in schools declined dramatically during the period of highest prevalence in December. Average pupil attendance was approximately 85% until the end of November, but dropped to 70% by the 7 December and 33% by the 14 December.

Nevertheless, our results also demonstrate increased odds of a SARS-CoV-2 positive outcome in pupils dependent on the number of cases found in the same year group, where the majority of classroom interactions occur. As this represents by far the majority of contacts for all schoolchildren, the results are consistent with pupil–pupil transmission. We estimated a 12% increase in the odds of testing positive for each case in the year group in the preceding exposure window (75% increase for 5 cases). It is notable that this signal can be detected after adjustment for household exposure, some measures of community prevalence, and especially amid a background of active isolation measures.
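Because the model is additive on the log-odds scale, the per-case OR compounds multiplicatively across cases. A short Python check makes the arithmetic explicit; the helper name is hypothetical, and the input is the rounded OR of 1.12 reported in the text.

```python
def odds_multiplier(or_per_case, n_cases):
    """Multiplicative change in odds after n_cases, given a per-case OR."""
    return or_per_case ** n_cases


per_case = odds_multiplier(1.12, 1)
five_cases = odds_multiplier(1.12, 5)

# Express each multiplier as a percentage increase in odds.
print(f"1 case:  {100 * (per_case - 1):.0f}% increase in odds")
print(f"5 cases: {100 * (five_cases - 1):.0f}% increase in odds")
```

With the rounded OR, five cases give a multiplier of about 1.76. The 75% figure quoted in the text presumably comes from the unrounded estimate, which is an assumption on our part. Note also that these are increases in odds, not in the probability of testing positive.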

Unsurprisingly, SARS-CoV-2 infections within an individual’s household posed a highly significant increased odds of subsequent infection in school staff and pupils. In addition, the number of SARS-CoV-2 positive outcomes within any households linked to the school also suggests increased odds of a SARS-CoV-2 positive outcome in staff and pupils. This may reflect a direct effect of contacts occurring around the school environment, or may be a general marker of community prevalence. We noted that very few cases were recorded that did not have a link to a known case in either the home or school environment. Furthermore, a large majority of both staff and pupils were potentially exposed to school cases, while having no known household exposure.

Comparison with previous work

Public health responses, and decisions on school closures, are informed by the best available evidence. This is rapidly evolving and a number of reviews have been published recently,2 20 some of which include primary studies on transmission during the first wave, and others which look at the situation across 2020. A recent review highlighted the large heterogeneity among studies investigating the impact of school closures and reopening schools on transmission.21

There is consistent evidence that children aged below 10–14 years have lower susceptibility to SARS-CoV-2 infection than adults3 20 and that children play a limited role in overall transmission rates. However, there remain few high-quality studies that disentangle potential transmission routes between households and schools, and transmission of SARS-CoV-2 within the school setting between pupils and school staff.21 Our study contributes to this gap in the evidence base, and demonstrates that transmission risks in schools exist, but are likely to be much lower than in households as long as other mitigation measures are in place.

The balance of evidence thus far indicates low overall positivity rates in the school environment.5 A low overall risk of infection among staff and pupils within educational settings has been observed in countries that remained open for face-to-face teaching during the first wave in Spring 2020 in Australia22 and Sweden.4 These studies concluded that the attendance of children and school staff within educational settings maintaining physical distancing and hygiene measures did not contribute substantially to overall infection rates. Following national school closures and the reopening of schools in the summer term of 2020, evidence from Israel23 suggested that schools reopening had a limited effect on SARS-CoV-2 infection rate in children and adults, and national surveillance in England found low overall risk of infection among staff and pupils in educational settings, although staff–staff transmission was most common.6 Our study extends this evidence base by examining if transmission varied between and within year groups. Our results show pupil–pupil transmission within a year group may occur before cases are identified, but current measures, including rapid isolation and implementing physical distancing such as segregated year groups, may be effective in reducing the scale of this, and containing subsequent transmission within the school.

In a similar time period to the current study (August–December 2020), evidence from Canada24 examined secondary transmission of SARS-CoV-2 and reported no instances of child-to-adult transmission during in-person teaching. While findings from the current study largely reflect symptomatic testing of pupils and staff, contact tracing during this period of all children (symptomatic and asymptomatic) under 14 years exposed to a confirmed case and tested during the following 14-day isolation period showed minimal pupil–pupil and pupil–staff transmission in primary schools situated within two counties in Norway with high community incidence.25 Our finding of higher positivity rates among school staff compared with pupils is consistent with other studies5 6 22 and may reflect the higher population-based rates observed in adults.

Study strengths and limitations

Our study included the entire staff and pupil records in Wales, in publicly funded schools, and hence avoids some selection biases, other than through the privately educated sector, which is very small in Wales (75 private schools). The sample size of tests and the numbers of infections were substantial. A key strength is the fine scale of data linkage, which allowed us to link household and school events, which has not been a feature in previous reports. Adjusting for likely transmission in the home and through extended school bubbles is important in clarifying effect sizes for likely transmission in the school and community setting.

Among the weaknesses of our study design is that testing for cases has been very largely based on testing those who are symptomatic, and most staff and pupils have not been tested. Hence, potential exposure is linked only to positive test results and not necessarily all cases (particularly, non-symptomatic cases). The school links are generated from 2019 data. Some pupils will have left or moved school during the summer holidays, which could introduce biases. To mitigate against this, we excluded all children aged 11 years or 16+ years in the 2019 data, as these will have moved from primary to secondary schools or have left school. We cannot exclude that there will be some mismatches with linking children to schools they no longer attend.

Measures to reduce transmission in the school environment, although advised at a national government level, will likely have varied subtly across schools in Wales dependent on setting, numbers of staff available and personal behaviours and activities of children, staff and parents (eg, mask wearing, congregating at school opening and closing times, and duration of exposures). We are unable to capture these variations in routine data, which may explain some of the differences observed and we have also not examined new variants of SARS-CoV-2. We were unable to account for ethnicity of pupils and staff in the study due to incomplete coding of this information in the available data. In our analysis, we could test only for additive effects (log odds scale) of the case numbers that individuals were exposed to, combined with the size of the population in which the cases were identified (household or school). As more data becomes available, the interaction, or other functional relationships between the effect of exposure to a certain number of cases and the background population size (or density) could be explored in more detail. Finally, we are currently unable to account for days when pupils may not have been present in school, which may have resulted in different exposures for a small number of cases.


National school closures are a topic of ongoing debate, with the risk of transmission within the school setting balanced against concerns about the negative impacts on, and widening inequalities in, children’s health, well-being and educational attainment, and the broader economic and societal impact. Findings from this study suggest that pupil-to-pupil SARS-CoV-2 transmission is likely, but that the absolute effects on the wider school population and staff can be minimised through the implementation of current mitigation measures, although these measures have been strict. Approximately 15% of the pupil population was absent from school over most of the study period, increasing to 70% as the second wave peak approached, with early complete Christmas closure.

This study has examined plausible transmission pathways within a school environment and not the risk of staff or pupils becoming moderately or seriously ill from COVID-19. Further work is also required on specific subgroups of the school populations, for example, pupils with special educational needs and those from different ethnic minorities. As part of these future developments in the work, consideration will be given to multilevel modelling and cluster effects within school settings. As there is a paucity of evidence on the effectiveness of the vaccines in reducing transmission, it is beyond the scope of this paper to assess whether educational staff should be reprioritised for vaccination. However, as the vaccines are rolled out further, urgent work is warranted to examine the effectiveness of vaccines in reducing transmission within educational settings.


This study has shown that there are significant complexities in understanding the vectors for transmission within schools. While this study has been conducted in Wales, it is highly likely that the findings are generalisable to the UK and many parts of the world in temperate climates where classes have around 30 pupils who are largely taught indoors. We conclude that there is good evidence that the number of cases in pupils is associated with exposure to previous pupil cases within the school year group, consistent with pupil–pupil transmission linked to schools. A wide range of extensive mitigation measures in our study population has likely reduced the potential for further spread within the wider school pupil population and from pupil to staff.



This work uses data provided by patients and collected by the NHS as part of their care and support. We would also like to acknowledge all data providers who make anonymised data available for research. We wish to acknowledge the collaborative partnership that enabled acquisition and access to the de-identified data, which led to this output. The collaboration was led by the Swansea University Health Data Research UK team under the direction of the Welsh Government Technical Advisory Cell and includes the following groups and organisations: the Secure Anonymised Information Linkage (SAIL) Databank, Administrative Data Research Wales, NHS Wales Informatics Service, Public Health Wales, NHS Shared Services Partnership and the Welsh Ambulance Service Trust. All research conducted has been completed under the permission and approval of the SAIL independent Information Governance Review Panel (project number: 0911).




  • Twitter @richfry, @emily_marchant, @AshleyAkbari, @fatemetrb

  • Contributors DAT and HA led the design, analysis and drafting of the paper. All other authors contributed equally to the design, data acquisition and interpretation of the data and reviewed the manuscript contents. All authors have approved the final published version.

  • Funding This work was supported by the Medical Research Council (grant number: MR/V028367/1); Health Data Research UK (grant number: HDR-9006), which receives its funding from the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation (BHF) and the Wellcome Trust; and Administrative Data Research UK, which is funded by the Economic and Social Research Council (grant number: ES/S007393/1).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.