Review

Feasibility and diagnostic accuracy of neonatal anthropometric measurements in identifying low birthweight and preterm infants in Africa: a systematic review and meta-analysis

Abstract

Background Complications of prematurity are the leading cause of under-5 mortality globally and 80% of newborn deaths are of low birth weight (LBW) babies. Early identification of LBW and preterm infants is crucial to initiate timely interventions.

Objective To evaluate the feasibility and diagnostic accuracy of alternative neonatal anthropometric measurements in identifying LBW and preterm infants in Africa.

Methods In this systematic review and meta-analysis, we evaluated the diagnostic performance of infant foot length, mid-upper arm circumference (MUAC), head and chest circumferences against birth weight and gestational age. Pooled correlation between the index and the reference methods was estimated. Multiple anthropometric thresholds were considered in estimating the pooled sensitivity, specificity and area under receiver operating characteristic curve (AUC).

Results 21 studies from 8 African countries met the inclusion criteria. Correlation coefficients with birth weight were 0.79 (95% CI 0.70 to 0.85) for chest circumference, 0.71 (95% CI 0.62 to 0.78) for MUAC and 0.66 (95% CI 0.59 to 0.73) for foot length. Foot length measured by rigid ruler showed a higher correlation than tape measurement. Chest circumference with 28.8 cm cut-off detects LBW babies with AUC value of 0.92 (95% CI 0.71 to 0.97). Foot length identified preterm infants, with 82% sensitivity, 89% specificity and AUC of 0.91 (95% CI 0.69 to 0.98) at a 7.2 cm optimal cut-off point. MUAC had an AUC of 0.83 (95% CI 0.47 to 0.95) for preterm detection. In identifying LBW babies, foot length and MUAC have AUC values of 0.89 (95% CI 0.70 to 0.96) and 0.91 (95% CI 0.73 to 0.97) at 7.3 cm and 9.8 cm optimal cut-off points, respectively. Foot length and MUAC are relatively simple and minimise the risk of exposing infants to cold.

Conclusion Newborn foot length, MUAC, head and chest circumferences have comparable diagnostic accuracy in identifying LBW and preterm babies. Foot length and MUAC are the most feasible proxy measures for screening in low-resource settings where weighing scales are not available.

PROSPERO registration number CRD42023454497.

What is already known on this topic

  • Low birth weight (LBW) and prematurity are the leading causes of neonatal death, accounting for 80% of newborn deaths globally.

  • Early identification of LBW and preterm infants is crucial to initiate timely interventions.

  • However, in low-resource settings a considerable number of newborns are not weighed because of home delivery or the unavailability of weighing scales in primary care facilities.

  • Neonatal anthropometric indicators can be used as alternative screening tools in these settings.

  • Prior individual studies have reported inconsistent findings, whereas systematic reviews did not account for multiple thresholds or for the feasibility of the methods in the African context.

What this study adds

  • Foot length (FL) has a pooled area under the receiver operating characteristic curve (AUC) of 0.89 for identifying LBW babies at the 7.3 cm optimal cut-off point.

  • Mid-upper arm circumference (MUAC) has a pooled AUC of 0.91 for identifying LBW babies at the 9.8 cm optimal cut-off point.

  • Based on the pooled estimates from studies conducted in Africa, FL can identify preterm infants with 82% sensitivity and 89% specificity.

  • FL measured by rigid ruler showed higher correlation with birth weight than the tape measurement.

  • FL and MUAC may be relatively simple to measure and may minimise the risk of exposing infants to cold.

How this study might affect research, practice or policy

  • Newborn FL, MUAC, head and chest circumferences have comparable diagnostic accuracy in identifying LBW and preterm babies.

  • FL and MUAC are the most feasible proxies for screening for small babies at birth in low-resource settings.

  • These findings can be used to design methods to identify high-risk newborns for life-saving interventions and are currently being applied in additional field studies.

Introduction

Globally, more than 20 million low birth weight (LBW, birth weight <2500 g) and about 15 million preterm (birth at <37 weeks gestation) babies are reported annually.1–3 LBW and prematurity remain the leading causes of death in newborns and children under 5 years.3 4 More than 80% of neonatal deaths are in LBW newborns, of which two-thirds are preterm and one-third are term small-for-gestational age.5 Furthermore, LBW newborns also have a higher risk of morbidity, stunting in childhood, and long-term developmental and physical ill health, including adult-onset chronic conditions such as cardiovascular disease and diabetes.3 4

Low-income and middle-income countries (LMICs) bear an inequitable share of LBW births, as more than 95% of LBW infants are born in these countries. More than 65% of global preterm births happen in Southern Asia and sub-Saharan Africa. Neonatal mortality and morbidity rates remain alarmingly high in Africa, with LBW and preterm birth being major contributing factors.2

Early identification of LBW and preterm infants is crucial to initiate appropriate and timely health interventions to improve their outcomes. In LMICs, a considerable number of newborns are not weighed at birth.1 This includes cases where babies are delivered at home and in some primary healthcare facilities where weighing scales are either dysfunctional or unavailable.5

Several anthropometric measurements at birth have been studied in different countries as potential screening proxies for birth weight6–11 and gestational age.12–16 These include foot length (FL), head circumference (HC)/occipito-frontal circumference (OFC), mid-upper arm circumference (MUAC), chest circumference (CC), crown-heel length (CHL), hand length (HL), intermammary distance (IMD), umbilical nipple distance (UND), hand breadth (HB), foot breadth (FB), abdominal circumference, thigh circumference (ThC) and calf circumference (CaC). However, the diagnostic accuracy results were inconsistent across studies.6–28

Prior systematic reviews attempted to evaluate the diagnostic accuracy of neonatal anthropometric indicators in identifying LBW and preterm babies. However, these reviews had key limitations, including a narrow focus on a single anthropometric measurement and its diagnostic accuracy for identifying either LBW or preterm babies, failure to account for the multiple anthropometric thresholds used across studies, and limited discussion of the trade-off between diagnostic accuracy and feasibility in the African context. In addition, systematic reviews conducted more than 10 years ago may not reflect recent advancements in health systems, emerging alternative techniques or community awareness.22 23 These gaps highlight the need for updated evidence addressing methodological limitations and result discrepancies. Also, evidence should be tailored towards optimising feasible and accurate anthropometric screening approaches to identify LBW and preterm neonates in the African context.

This systematic review and meta-analysis was conducted to evaluate the feasibility and accuracy of alternative neonatal anthropometric measurements in identifying LBW and preterm babies in Africa. The results will contribute to the development and implementation of effective screening programmes, aimed at early detection of LBW and preterm infants and links to appropriate care that will assist in reducing neonatal mortality and improving the overall health outcomes of newborns in resource-limited settings.

Methods

Search strategies

The study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.29 The systematic review included a comprehensive literature search of scientific articles published in the English language from 1 January 2010 to 30 August 2023, using the electronic databases MEDLINE/PubMed, Google Scholar and EMBASE. The search was performed using Medical Subject Heading terms (where applicable), keywords, context and population combined by Boolean operators (“AND” and “OR”). Specifically, Mid-upper arm circumference (MUAC), Foot length, Head circumference, Calf circumference, Gestational age, Diagnostic accuracy, Sensitivity and Specificity, Predictive values, Feasibility, Acceptability, Reliability, Estimation, Measurement, Birth weight, Low birthweight, preterm, premature, Infant, neonate, newborn and Africa were included in the search strategy (online supplemental appendix 1). The protocol for the review was registered at the International Prospective Register of Systematic Reviews (PROSPERO, CRD42023454497).

Inclusion and exclusion criteria

In this systematic review and meta-analysis, we included all published articles in the English language, between 1 January 2010 and 30 August 2023, about the feasibility and accuracy of alternative measurements to identify LBW and preterm birth among newborns in Africa. Citations without abstract and/or full-text, anonymous reports, editorials, case reports and case series studies were excluded.

Study selection and screening

The retrieved studies were exported to EndNote V.9 to remove duplicate studies. Then, we screened the title and abstract of each study. The remaining papers were screened for their full text by two independent reviewers (AA and FWB) using the prespecified inclusion criteria. Any disagreement was discussed and resolved based on the article selection criteria for the final selection of studies to be included in the systematic review and meta-analysis (figure 1).

Figure 1

Flow diagram on selection of studies.

Data extraction

The authors developed a standardised and structured data extraction form in Excel, and the following data were extracted for eligible studies: year of publication, country, reference method, evaluated anthropometric measurement, method used, cut-off point, sample size, true positive, false positive, true negative and false negative. Two authors (AA and FBB) independently extracted all important parameters from each study using the extraction form. The extracted data were checked again by two researchers (FWB and RF), and disagreements were resolved by tracing back to the original articles.

Quality assessment of studies

The revised Quality Assessment of Diagnostic Accuracy Studies-2 was used to evaluate potential sources of bias in the included studies.30 For the two reference standards (birth weight and gestational age) and the four index tests (FL, MUAC, CC and HC) in this study, we evaluated the risk of bias and applicability concerns. Under the risk of bias dimension, each study was rated for four domains (patient selection, index test, reference standard and flow and timing), while the applicability concern was rated for the first three. In this review, we considered a domain to have a high risk of bias or applicability concern if one or more of its signalling questions was answered ‘no’ or ‘unclear’. Two authors (AA and RF) independently evaluated the included studies. When there was disagreement between the two evaluations, the final decision was made by the third author (FWB). The quality assessment of the included studies is presented in online supplemental appendix 2.

Data synthesis and analysis

Studies reported several diagnostic accuracy metrics of anthropometric measurements for identifying LBW and preterm babies. These included correlation coefficients, sensitivity, specificity, predictive values, likelihood ratios and area under the receiver operating characteristic (ROC) curve (AUC) for various anthropometric cut-offs.

In the meta-analysis, we evaluated the performance of FL, MUAC, CC and HC when the required effect sizes were reported by three or more studies. To account for the multiple thresholds used in the studies, we employed two complementary analysis approaches. First, we used the correlation coefficient as the effect size, because it is not affected by variation in cut-off points. The correlation coefficients, sample sizes and other characteristics reported by the included studies were retrieved to estimate the overall strength of association between each anthropometric measurement and birth weight or gestational age. The pooled correlation coefficient was computed using a random-effects model in the metafor package in R.31
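The pooling step can be illustrated with a minimal sketch. It assumes that correlations are combined on Fisher's z scale and that the between-study variance uses the DerSimonian and Laird estimator; the text does not state which transformation was applied, and the function name and input values below are hypothetical, so this is not the authors' metafor workflow.

```python
import math

def pool_correlations(rs, ns):
    """Random-effects pooled correlation via Fisher's z (illustrative sketch)."""
    # Fisher z-transform each correlation; its sampling variance is 1 / (n - 3).
    zs = [math.atanh(r) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]

    # Inverse-variance (fixed-effect) weights and Cochran's Q.
    w = [1.0 / v for v in vs]
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))

    # DerSimonian-Laird between-study variance, truncated at zero.
    k = len(rs)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights, pooled z and a 95% Wald interval on the z scale.
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    lo, hi = z_re - 1.96 * se, z_re + 1.96 * se

    # Back-transform to the correlation scale.
    return math.tanh(z_re), math.tanh(lo), math.tanh(hi), tau2

# Hypothetical study-level inputs, not data from the included studies.
pooled_r, ci_low, ci_high, tau2 = pool_correlations([0.46, 0.66, 0.80], [200, 350, 500])
print(round(pooled_r, 2), round(ci_low, 2), round(ci_high, 2))
```

Working on the z scale keeps the confidence limits inside the range -1 to 1 after back-transformation, which is why pooled correlations are commonly handled this way.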

Second, we meta-analysed sensitivity and specificity at each reported cut-off point using a random-effects model in the diagmeta package in R.32 For each cut-off reported by each study, we constructed 2×2 contingency tables of true positives, false positives, false negatives and true negatives for detecting LBW or preterm birth. This approach accounts for multiple cut-offs by taking the correlation between sensitivity and specificity into consideration. We then calculated the pooled sensitivity, specificity and summary ROC (SROC) curve using the Common Random Intercept and Common Slope (CICS) conditional inference approach to estimate a random-effects model. This model accounts for inequality of variance and produces less biased pooled estimates of sensitivity and specificity.32
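As a small illustration of the input to this second approach, the sketch below derives sensitivity and specificity from the 2×2 counts at each cut-off reported by one study. The function name and the counts are hypothetical, and the sketch covers only the per-study tabulation, not the multi-threshold random-effects model fitted in diagmeta.

```python
def accuracy_at_cutoffs(tables):
    """Sensitivity and specificity for each (cutoff, tp, fp, fn, tn) row of one study."""
    rows = []
    for cutoff, tp, fp, fn, tn in tables:
        # Sensitivity: share of truly LBW (or preterm) infants flagged by the index test.
        sensitivity = tp / (tp + fn)
        # Specificity: share of unaffected infants correctly left unflagged.
        specificity = tn / (tn + fp)
        rows.append({"cutoff": cutoff,
                     "sensitivity": sensitivity,
                     "specificity": specificity})
    return rows

# Hypothetical counts for one study reporting two foot length cut-offs (cm).
study_tables = [
    (7.2, 70, 8, 30, 92),
    (7.4, 85, 20, 15, 80),
]
for row in accuracy_at_cutoffs(study_tables):
    print(row)
```

Raising the cut-off flags more infants, so sensitivity rises while specificity falls; this is the dependence on threshold that a multiple-cut-off model is designed to capture.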

Heterogeneity and publication bias

The amount of heterogeneity (τ²) was estimated using the restricted maximum-likelihood estimator. In addition to the estimate of τ², the Q-test for heterogeneity and the I² statistic are reported. Based on this, we employed the DerSimonian and Laird random-effects model to estimate the pooled effect sizes. We also assessed the magnitude of heterogeneity by inspecting the dispersion of points and the closeness between the 95% prediction region and the 95% confidence region in the SROC curve. We then conducted subgroup analysis based on the publication year, study setting, landmark and techniques used for anthropometric measurement.33
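For reference, the classical definitions of these heterogeneity statistics can be written as follows. The notation is an assumption introduced for illustration: y_i is the effect size of study i, v_i its sampling variance and k the number of studies. Software may compute I² from a different but related expression, so values need not match these formulas exactly.

```latex
\begin{align}
  w_i &= \frac{1}{v_i}, \qquad
  \bar{y}_w = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i} \\
  Q &= \sum_{i=1}^{k} w_i \left( y_i - \bar{y}_w \right)^2 \\
  \hat{\tau}^2_{\mathrm{DL}} &= \max\!\left( 0,\;
     \frac{Q - (k - 1)}{\sum_i w_i - \sum_i w_i^2 \big/ \sum_i w_i} \right) \\
  I^2 &= \max\!\left( 0,\; \frac{Q - (k - 1)}{Q} \right) \times 100\%
\end{align}
```

Under homogeneity, Q follows approximately a chi-squared distribution with k − 1 degrees of freedom, which is the basis of the Q-test.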

Cook’s distances were used to examine whether studies may be outliers and/or influential. Publication bias was assessed through funnel plots and regression tests. The rank correlation and Egger tests were used to check for funnel plot asymmetry. Leave-one-out sensitivity analysis was conducted. Meta-regression analyses were performed to evaluate the effect of potential moderators like the publication year, sample size and anthropometric measurement techniques.34
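The regression test for funnel plot asymmetry can be sketched in its classical Egger form, in which the standardised effect is regressed on precision and an intercept far from zero suggests asymmetry. This is an illustrative assumption about the form of the test: the text does not specify the variant used, and the function name and inputs are hypothetical.

```python
import math

def egger_intercept(effects, ses):
    """Classical Egger regression: (effect / SE) on (1 / SE), by ordinary least squares."""
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are needed")
    y = [e / s for e, s in zip(effects, ses)]   # standardised effects
    x = [1.0 / s for s in ses]                  # precisions

    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx

    # Residual variance and the standard error of the intercept.
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
    return intercept, se_intercept

# Hypothetical Fisher z effects and standard errors, not data from the review.
b0, se_b0 = egger_intercept([0.55, 0.70, 0.82, 1.05], [0.05, 0.08, 0.12, 0.20])
print(round(b0, 3), round(se_b0, 3))
```

The ratio of the intercept to its standard error is compared with a t distribution on n − 2 degrees of freedom; with few studies the test has low power, so it is usually read alongside the funnel plot rather than on its own.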

All analyses were performed following published guidelines on systematic reviews and meta-analysis of diagnostic test accuracy studies.35 P values <0.05 were considered statistically significant.

Patient and public involvement

Not applicable.

Results

Search results

A total of 1266 studies were identified from electronic databases and manual searches. To identify these studies, we used MEDLINE/PubMed, Google Scholar and EMBASE databases and bibliographies of identified articles. 413 duplicate records were eliminated. After the authors screened titles and abstracts, 799 studies were removed and full texts of the remaining 54 articles were evaluated for eligibility. Finally, 21 studies were found relevant for this systematic review (figure 1).

Description of the included studies

This meta-analysis included 21 studies, with 13 565 participants, that evaluated different anthropometric methods for the diagnosis of LBW and preterm babies in 8 African countries between 2010 and 2023. Of the studies, 20 employed an institution-based cross-sectional design and 2 studies also included community follow-up. Only one study used a cohort study design. Based on the independent evaluation of the authors, the included studies have a low risk of bias and applicability concern (online supplemental appendix 2).

The included studies assessed different anthropometric measurements and indicators including FL, HC/OFC, MUAC, CC, CHL, HL, IMD, UND, HB, FB, abdominal circumference, ThC and CaC. However, meta-analysis was performed for FL, MUAC, CC and HC/OFC for which at least three studies were obtained (table 1).

Table 1

Description of the included studies

15 studies reported the correlation coefficient of anthropometric measurements with birth weight, whereas 5 studies reported the correlation with gestational age. Of the 18 studies that reported sensitivity and specificity of anthropometric indicators, 15 used birth weight; 5 studies used gestational age and 3 studies used both as a reference method (online supplemental tables 1–3).

Two studies evaluated the reliability of anthropometric measurements and reported that FL has minimum interobserver variability.12 20 Two studies also compared measurements collected on day 1 and day 5 and reported that FL has comparable diagnostic ability until the fifth day of life.16 24 A study conducted in Uganda reported that CC and FL measurements taken by midwives and community health workers have no significant mean difference, while length, HC, ThC, CaC and MUAC measurements differ significantly.18

Despite the comparability of diagnostic performance, the different anthropometric methods have varying advantages and limitations in terms of simplicity, accuracy, reliability, and feasibility for community-level intervention. Overall, FL and MUAC are relatively simple to measure with basic equipment, have moderate to high accuracy, low interobserver variability and minimal infant exposure or distress. CC is highly accurate due to the use of clear nipple line landmark for measurement; however, it increases the risk of infant exposure to hypothermia. HC is prone to moulding errors and subjective landmarks. Using multiple measures improves accuracy but requires more time and increases infant exposure risk (online supplemental table 4).

Correlation of anthropometric measurements with birth weight and gestational age

A total of 10 studies reported the correlation between FL and birth weight measurements. The reported correlation coefficients ranged from 0.46 to 0.80. The pooled estimate of the correlation coefficient using a random-effects model was 0.66 (95% CI 0.59 to 0.73). The model showed significant heterogeneity among the included studies (I²=94% and τ²=0.037, p<0.001) (online supplemental figure 1).

To identify the source of heterogeneity, we calculated the pooled effect size for subgroups based on the publication year, study setting, the landmark and the techniques used to measure the FL. The result showed a significantly higher correlation between birth weight and FL (p<0.01) when FL was measured with a hard ruler or calliper than with a non-stretchable tape (figure 2).

Figure 2

Subgroup analysis for pooled estimate of correlation coefficient between birth weight (BW) and foot length (FL) by measurement techniques.

The correlation between FL and birth weight measurements also differed significantly in the subgroup analysis by the foot landmarks used to measure the length. Studies that measured FL from the heel to the tip of the big toe showed a significantly higher correlation estimate than studies that did not specify the landmarks (online supplemental figure 1).

Nine studies reported correlation coefficients between MUAC and birth weight within the range of 0.47–0.87. The pooled estimate of the correlation coefficient based on the random-effects model was 0.71 (95% CI 0.62 to 0.78). High heterogeneity was also observed among the included studies (I²=97% and τ²=0.064, p<0.01) (figure 3).

Figure 3

Pooled estimate of correlation coefficient between birth weight (BW) and mid-upper arm circumference (MUAC).

Nine studies reported correlation coefficients for CC and birth weight between 0.63 and 0.92. The estimated average correlation coefficient using a random-effects model was 0.79 (95% CI 0.70 to 0.85). Similarly, there was high heterogeneity among these studies (I²=98% and τ²=0.083, p<0.01) (online supplemental figure 2).

In 12 studies, the correlation values between HC and birth weight ranged from 0.37 to 0.82. The estimated average correlation coefficient based on the random-effects model was 0.68 (95% CI 0.60 to 0.74). Significant heterogeneity was also observed among the studies (I²=96% and τ²=0.055, p<0.01) (online supplemental figure 3).

The pooled correlation coefficient between gestational age and FL was 0.66 (95% CI 0.24 to 0.87) (online supplemental table 4). The observed correlation coefficients ranged from 0.14 to 0.92 and showed high heterogeneity (I²=99% and τ²=0.383, p<0.01). Since few studies reported correlation coefficient values for gestational age and other anthropometric measurements, the pooled estimates were not calculated.

Diagnostic accuracy of anthropometric indicators in identifying LBW and preterm babies

The diagnostic accuracy of anthropometric measurements in identifying LBW or preterm babies was examined in this meta-analysis when the accuracy of the method was reported by at least three studies. Based on the pooled estimates of sensitivity, specificity and AUC, CC showed a sensitivity of 84.7% (95% CI 68.4% to 93.4%) and a specificity of 96.8% (95% CI 92.4% to 98.7%), with an excellent AUC of 0.92 (95% CI 0.71 to 0.97) for identifying LBW. In identifying preterm infants, FL had a sensitivity of 82.0% (95% CI 59.2% to 93.5%), a specificity of 89.1% (95% CI 79.8% to 94.4%) and an AUC of 0.91 (95% CI 0.69 to 0.98) (table 2).

Table 2

Pooled estimates of sensitivity, specificity, AUC and correlation coefficient values for FL, MUAC, CC and HC measurements accuracy to identify LBW or preterm babies

The optimal cut-off points for identifying LBW in the included studies were 7.3 cm for FL, 9.8 cm for MUAC, 28.8 cm for CC and 33.4 cm for HC. For discriminating preterm babies, the optimal cut-off points were 7.2 cm for FL, 8.8 cm for MUAC, 30.3 cm for CC and 33.4 cm for HC (table 2).
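How such cut-offs might be applied at the point of screening can be sketched as a simple lookup. The values are the pooled optimal cut-offs stated above; the function name, the example measurements and the rule that a value strictly below the cut-off counts as a positive screen are assumptions made for illustration.

```python
# Pooled optimal cut-offs from this review, in centimetres.
LBW_CUTOFFS = {"FL": 7.3, "MUAC": 9.8, "CC": 28.8, "HC": 33.4}
PRETERM_CUTOFFS = {"FL": 7.2, "MUAC": 8.8, "CC": 30.3, "HC": 33.4}

def screen(measurements, cutoffs):
    """Return the measures falling below their cut-off.

    ASSUMPTION: a value strictly below the cut-off is treated as screen-positive.
    """
    return sorted(name for name, value in measurements.items()
                  if name in cutoffs and value < cutoffs[name])

# Hypothetical newborn measurements in centimetres.
newborn = {"FL": 7.1, "MUAC": 10.2}
print(screen(newborn, LBW_CUTOFFS))      # → ['FL']
print(screen(newborn, PRETERM_CUTOFFS))  # → ['FL']
```

Returning the list of positive measures, rather than a single yes or no, leaves open how a programme would combine FL and MUAC, which the review does not prescribe.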

The SROC curve of CC is closest to the upper left corner, indicating that it has high diagnostic accuracy for identifying LBW babies (figure 4). The FL SROC curve for identifying preterm babies showed similar performance (figure 5). Overall, the AUC for identifying LBW babies ranged between 0.89 and 0.92, indicating that the four anthropometric measurements have high diagnostic accuracy for LBW detection (figure 4). Except for MUAC, the AUC for identifying preterm babies was also above 0.85 (figure 5).

Figure 4

Summary receiver operator characteristics curve (sROC) showing diagnostic accuracy of anthropometries against birth weight (BW). Each sROC is shown in (a) foot length (FL), (b) mid-upper arm circumference (MUAC), (c) head circumference (HC) and (d) chest circumference (CC) to identify LBW babies. AUC, area under curve; LBW, low birth weight.

Figure 5

Summary receiver operator characteristics curve (sROC) showing diagnostic accuracy of anthropometries against gestational age (GA). Each sROC is shown in (a) foot length (FL), (b) chest circumference (CC) and (c) mid-upper arm circumference (MUAC) to identify preterm babies. AUC, area under curve.

We inspected the funnel plots for symmetric distribution of the effect sizes reported by the included studies (figure 6). The regression tests indicated that the funnel plots were asymmetrical, whereas the rank correlation tests did not. Cook’s distances indicated that none of the studies was an outlier or overly influential.

Figure 6

Funnel plots showing publication bias of studies on correlation between birth weight and four anthropometric measurements: (a) sROC of foot length, (b) mid-upper arm circumference (MUAC), (c) chest circumference and (d) head circumference. sROC, summary receiver operating characteristic.

We examined the potential moderating effects of publication year and sample size on the correlation between the reference tests and the four anthropometric measurements by performing meta-regression analysis. The regression model shows that sample size has significant moderator effect on the correlation between gestational age and FL. However, neither factor showed a significant moderating effect in the meta-regression models for other anthropometric measurements (online supplemental table 5).

Discussion

Various anthropometric measurements can be used to identify LBW and preterm babies. In this systematic review and meta-analysis, we evaluated diagnostic accuracy of four common anthropometric measurements (FL, MUAC, CC and HC) reported by 21 studies conducted in eight African countries.

The pooled measurement agreement between birth weight and CC shows a strong positive correlation. The SROC from this meta-analysis also shows that CC (at the 28.8 cm optimal cut-off point) has high diagnostic accuracy for identifying LBW babies, as its curve has the largest AUC. These results are in line with those reported by a previous meta-analysis of similar studies in developing countries.22 The diagnostic accuracy observed for CC is partly due to the fact that the chest has a larger cross-section, with less chance of systematic or random errors in measurement.28 However, while CC has high accuracy with clear nipple line landmarks, its use has lower acceptance due to the hypothermia risk from undressing newborns.8 11 15 In addition, preterm neonates with hyperinflation of the chest, including those with meconium aspiration, appear to have increased lung volume and heavier weight. Therefore, using their CC could lead to misclassification of the actual birth weight.36

The pooled estimate for FL shows that it has moderate to strong positive correlation with gestational age and with birth weight. Another important finding in this meta-analysis was the higher correlation observed between FL and birth weight when the measurement was done by a hard ruler or calliper rather than non-stretchable tape. This meta-analysis also found that FL (at 7.2 cm optimal cut-off point) has high diagnostic performance with AUC of 0.91 for detecting preterm babies. Similarly, a previous systematic review reported that FL (at optimal cut-off point <7.3 and <7.9 cm) had relatively high sensitivity and specificity to classify very LBW and LBW infants.23 Another study in Bangladesh also found that FL measured with a firm ruler has better accuracy than by tape measurement.37 Based on qualitative observations of FL measurement procedures, one of the included studies reported that FL measurement using a ruler was simple to learn and explain to others.24

Different studies also reported that FL has minimum interobserver variability and comparable diagnostic ability until the fifth day of life,12 20 and that it can be measured with simple and available equipment with minimal exposure of the infant. A study conducted in Uganda reported that CC and FL measurements taken by midwives and community health workers have no significant mean difference.18

A logical conclusion that could be drawn from these findings is that FL measurement could be a candidate for community-level identification of LBW and preterm babies. However, other studies reported that FL measurement requires proper positioning to minimise the measurement bias caused by grasp reflex.12

We also observed the need to standardise landmarks for FL measurement as some of the included studies measured from the heel to the tip of the longest toe (hallux or second toe),10 12–15 17 20 while others measured from the heel to tip of the big toe6 11 20 21 24 26; and from the centre of the heel pad to the middle of the tip of the big toe.16 In addition, the correlation between FL and birth weight measurements has significant differences based on the subgroup analysis of the foot landmarks used in the studies. To this end, we suggest that measurements taken from the heel to the tip of the longest toe will provide the maximum distance and can reduce the chance of systematic or random errors as observed in the CC measurement.28

Furthermore, soft tissues of subcutaneous fat are decreased in infants small for gestational age. As FL is based on the measurement of the soft tissue and the bone size, it might give inaccurate estimation when used to measure GA in this group of newborns.20 38

There is a strong positive correlation between MUAC and birth weight measurements in this analysis. The pooled diagnostic accuracy of MUAC, measured by AUC, was also high for identifying preterm and LBW babies. A similarly strong correlation between birth weight and MUAC was observed in a previous meta-analysis.22 However, the study conducted in Uganda reported that MUAC measurements taken by midwives and community health workers differ significantly.18 Although locating the appropriate landmark for MUAC measurement might contribute to intraobserver and interobserver variability, the measurement is familiar to many community health workers due to its routine application in growth monitoring and nutritional assessments.11 The study conducted in Ethiopia reported that MUAC has comparable sensitivity and specificity to identify LBW babies until the fifth day of life; however, for clinical interventions to be most beneficial to these neonates, identification is required as soon as possible after birth.16

There was also a wide CI for the pooled diagnostic accuracy estimate of MUAC to identify preterm babies. Although gestational age is best assessed by early ultrasound examination, most of the included studies used other methods, including last menstrual period, fundal height or New Ballard Score.39 Hence, the variation in the GA estimation methods used across these studies and their level of uncertainty could be the reason for the wide confidence intervals of the pooled estimates.

The meta-analysis also shows that HC has a significant positive correlation with birth weight. We also found that it has comparable diagnostic accuracy to detect LBW babies when compared with other anthropometric surrogates. However, in Uganda, it was observed that the measurements differ significantly when performed by midwives and community health workers.18 Furthermore, HC measurement is more affected by subjective landmarks and is highly susceptible to head moulding, especially during prolonged and obstructed labour.6 7 21 25 27

Strengths and limitations

In this study, two reference tests (ie, birth weight and gestational age) were considered. The review included the four main anthropometric measurements and evaluated their diagnostic accuracy for detection of both LBW and preterm babies. In addition, multiple thresholds of the index anthropometric indicators were included in calculating the pooled estimates for the diagnostic accuracy. We also calculated pooled estimates of the correlation between the reference and index tests irrespective of the different cut-off points used in the included studies. This was used to evaluate the performance of the anthropometric measurements in predicting birth weight and gestational age without being influenced by the multiple thresholds used across the studies. However, our study has limitations. Although birth weight is affected by sex, subgroup analysis on sex differences was not performed because the studies did not report the required information. We were also unable to evaluate the diagnostic performance of HC in detecting preterm babies because only two studies reported the required information. Although efforts were made to identify possible sources of heterogeneity, there is considerable uncertainty in most of the pooled estimates in this meta-analysis. In addition, different anthropometric measurement modalities were used across studies, and only a few studies reported specific FL measurement techniques and landmarks, which limited detailed comparison across subsets.

Conclusion

Overall, FL, MUAC, HC and CC have comparable diagnostic performance in identifying LBW or preterm babies. While CC has high accuracy with clear nipple line landmarks, its use is less acceptable because of the increased risk of hypothermia. Likewise, HC measurement is more affected by subjective landmarks and is highly susceptible to head moulding, especially during prolonged and obstructed labour.

FL and MUAC measurements, however, are relatively easy to perform with minimal infant handling or exposure and have good reliability for identifying LBW or preterm babies. Considering the trade-off between simplicity and accuracy, and based on the findings of the systematic review, we recommend that FL and MUAC measurements should be the first choices for consideration when developing and testing tools for identifying LBW or preterm babies in the community and in settings where weighing scales are not available or functional. Furthermore, these screening tools are not intended to replace birth weight as it is recommended to use a well-calibrated scale for measuring birth weight whenever available. We also recommend further community-level studies on the design and implementation of appropriate FL and MUAC-based screening programmes. Also, the accuracy and utility of these tools should be assessed by conducting facility and community-based studies.