Featured image of post Creating Custom Child Growth Charts in Excel
🌏 中文

Creating Custom Child Growth Charts in Excel

I'm an Excel geek. After reading this, you might think we use different Excels.

This is about how I used Excel, data visualization, AI, statistics, and formulas to create a custom growth chart. I’ll explain everything clearly, even the basics.

Many parents use apps to track their baby’s height and weight. I only used that one feature. Installing a large app just for that felt wasteful. It was a perfect chance to use my Excel skills. It’s just data analysis, right? Excel can handle it!

System Planning

First, I needed a plan. Let’s see how growth curves work in parenting apps.

Baobaoshu app weight curve page, x-axis shows age from 8 to 14 months, y-axis shows weight kg from 4 to 44, displaying 3%/25%/50%/75%/97% reference lines with green data points near 50% line

This is a growth curve from Baobaoshu (BabyTree). The 50% line is the median. If my baby’s height (or weight) is on this line, about half of babies are taller (or heavier) and half are shorter (or lighter). The 75% and 97% lines mean the height (or weight) exceeds 75% and 97% of babies of the same age. The 25% and 3% lines work similarly. This shows my baby’s growth compared to others.

I wanted a similar tool to:

  1. Record my baby’s height and weight.
  2. Query the normal height and weight range for each month.
  3. Show how much my baby’s measurements deviate from the norm.

A chart or curve didn’t matter. The key was the third point: calculating and displaying the deviation intuitively. A diverging bar chart seemed suitable:

Diverging bar chart Birth Rate by Years, center shows years 1940-2010, left orange bars show birth rate for mothers aged 20-25, right blue bars show birth rate for mothers aged 30-35

This chart compares two data sets in the same dimension.

R language Diverging Bars example showing mtcars dataset car mileage deviation from average, green Above Average to right, red Below Average to left

For one data set, it shows direction and distance from a benchmark, often for positive and negative values.

This was perfect. I’d use the median as the benchmark, showing if my daughter’s height (or weight) was above or below it. The bar length would show the deviation. To simplify, I used symbols: a minus for below, a plus for above, with more symbols meaning greater deviation. Seeing “+++” or “—-” would signal a need to check her growth trend.

Preparing the Data

With a goal, I started working. First, the first two capabilities:

  1. Record height and weight.
  2. Query normal ranges.

Baby’s Growth Data

My baby’s data was in the Baobaoshu app (dates are omitted to protect my daughter’s birthday):

Baobaoshu app child growth record list screenshot showing 1yr3mo17d/1yr2mo16d/1yr1d/11mo15d/9mo25d records with height cm/weight kg/head circumference cm data

Baobaoshu doesn’t export data. Manual entry was an option, but there had to be a better way.

Google Play Store ScreenMaster Screenshot Markup app page, by Blossgraph, 4.5 stars 77.6K reviews, 5M+ downloads

I took screenshots of the records and used an Android app, Screen Master, to stitch them into one long image.

Then, I used Baimiao OCR (https://web.baimiaoapp.com/) to extract the text:

Baimiao OCR web page screenshot, left Origin Photo shows Baobaoshu growth record screenshot, right Results shows extracted text with date/height/weight/head circumference data mixed together

The format was messy, but AI can handle that.

ChatGPT conversation screenshot, user prompt requests converting OCR content to table format with date as index in first column and information in second column separated by spaces

ChatGPT output table screenshot with Date and Information columns, Information column contains mixed age/weight/height/weight/head circumference data

ChatGPT conversation screenshot, user prompt requests changing date column to M/D/YYYY format

ChatGPT conversation screenshot, user prompt requests removing age-related text like years/months/days from information column

ChatGPT conversation screenshot, user prompt requests splitting information column into height/weight/head circumference columns, kg data to weight column, cm data to height column, second cm data to head circumference column

ChatGPT conversation screenshot, user prompt requests moving units to column headers, keeping only numeric values in columns, and removing nutrition supplement notes from head circumference column

ChatGPT output cleaned table screenshot with Date/Height cm/Weight kg/Head Circumference cm four columns, date column blurred, numeric columns contain only numbers

Done! I just copied it to Excel. Day age, month age, and age were automatically calculated by subtracting my daughter’s birthday from the recording date.

Excel child growth record table screenshot with Date/Day Age/Month Age/Age/Height cm/Weight kg six columns, date column blurred, day age from 0 to 151, month age from 0 to 5

Normal Range Standards

Reference values are on the National Health Commission’s website. The 2022 standard, WS/T 423—2022, is the same source as Baobaoshu: http://www.nhc.gov.cn/fzs/s7848/202211/8b94606198e8457dafb3f8355135f1a3/files/e38068f0a62d4a1eb1bd451414444ec1.pdf

The data was in this format:

National Health Commission weight standard deviation table for girls under 7 years, with Age/-3SD/-2SD/-1SD/Median/+1SD/+2SD/+3SD eight columns, from 0 months to 2 years 6 months

I’ll explain this table. We’ve covered the median. The key is “SD,” or Standard Deviation. It’s a basic statistical term. First, we need to understand normal distribution. The Health Commission’s statistics use a large sample size, measuring many children. Height and weight are random and, with a large enough sample, normally distributed around the average (or median, which is very close). A normal distribution looks like this:

Normal distribution bell curve, x-axis labeled μ-3σ to μ+3σ, y-axis shows probability density, marking 68.26%/95.44%/99.72% three standard deviation interval percentages

The horizontal axis is height (or weight), and the vertical axis is the number of children. The center dashed line is the median. Most children are near the median. Fewer children are at the extremes.

Standard deviation is the distance between the dashed lines, which are equally spaced. It’s like a ruler for the normal distribution. It tells us the proportion of data within a range. For example, 68% of children are within one standard deviation above and below the median; 95% are within two.

Standard deviation is a key property of normal distribution. Proportions for 1, 2, and 3 standard deviations are always 68%, 95%, and 99.7%. Knowing the average (or median) and standard deviation lets us find any data point’s position.

I copied the table to Excel and converted all ages to months:

Excel growth reference table screenshot with Month Age/Weight -3SD to +3SD/Height -3SD to +3SD 17 columns, month age from 0 to 33 listing standard deviation values

The table shows the median and values at 1, 2, and 3 standard deviations above and below it. This helps me see where my daughter’s measurements fall and how much they deviate.

Drawing the Curve

Now, the hard part: showing how much my baby’s measurements deviate from the normal range. This requires real Excel skills.

I had two tables: my baby’s data and the reference ranges. I needed to add deviation columns, query the reference table, calculate the deviation, and show it with plus and minus signs. Minuses would be right-aligned in the left column, and pluses left-aligned in the right, creating a simplified diverging bar chart.

Excel growth record table screenshot with Date/Day Age/Month Age/Age/Height cm/Height Deviation/Weight kg/Weight Deviation eight columns, deviation columns pending fill

Matching the Reference Month

In theory, this is simple: use VLOOKUP to match the month, then nested IFs to compare and output symbols.

But the National Health Commission table has gaps:

Excel growth reference table screenshot with red boxes highlighting 24/27/30/33 month age rows, showing data every 3 months after 2 years, creating month age gaps

From 2 years old, data is provided every 3 months. This is reasonable, as growth slows. But it affects querying. At 25 months, a direct VLOOKUP finds nothing.

One workaround is to manually complete the reference table, adding missing months and using values from younger months (e.g., using 24-month values for 25 and 26 months).

But I wanted intelligent matching!

So, I added a hidden column to find the corresponding reference month for each row.

Excel growth record table screenshot with red box highlighting new Reference Month Age column, showing month age 25 matches 24, month age 30 matches 30, implementing downward matching to reference table

The formula for this column is:

=IF(ISBLANK(A2),"",INDEX('生长对照表'!A$3:A$46,COUNTIFS('生长对照表'!A$3:A$46,"<="&C2),0))

In plain English, the formula checks if the date is blank. If so, the cell is empty. Otherwise, it counts rows in the reference table with months less than or equal to the baby’s age, effectively “matching down.”

Before two years, the baby’s age matches the reference age. I tested this; a 25-month record will match the 24-month reference.

Excel table screenshot showing #REF! error, month age 25 row reference month age shows 24, red box highlights that row, demonstrating formula reference error case

Calculating Deviation

The reference month column handles mismatches, so we can calculate deviations.

The “height below average” column formula serves as an example:

=IF(ISBLANK(F2),"",IF(F2>VLOOKUP(E2,'生长对照表'!A$3:O$46,12),"",IF(F2=VLOOKUP(E2,'生长对照表'!A$3:O$46,12),"=",REPT("-",5-RANK(F2,{F2,VLOOKUP(E2,'生长对照表'!A$3:O$46,11),VLOOKUP(E2,'生长对照表'!A$3:O$46,10),VLOOKUP(E2,'生长对照表'!A$3:O$46,9)},1)))))

Okay, this formula looks insane. Let’s break it down, layer by layer, starting from the outside:

Layer 1

=IF(ISBLANK(F2),"",IF(F2>VLOOKUP(E2,'生长对照表'!A$3:O$46,12),"",IF(F2=VLOOKUP(E2,'生长对照表'!A$3:O$46,12),"=",Layer 2)))

This part first checks if the height column (F2) is empty. If so, this cell is also empty. Otherwise, it compares F2 with the corresponding median height from the reference table. If F2 is greater than the median, the cell remains blank (as this column only shows negative deviations). If F2 equals the median, it displays “=”. If F2 is less than the median, the second layer calculates the number of “-” signs to output.

Layer 2

REPT("-",Layer 3)

I initially planned to use nested IF statements to determine the number of minus signs, but that seemed a bit silly. Here’s a simpler approach: The REPT function can repeat a string a specified number of times. Now, the problem is passed to the third layer: calculating the number of minus signs to output.

Layer 3

5-RANK(F2,{F2,VLOOKUP(E2,'生长对照表'!A$3:O$46,11),VLOOKUP(E2,'生长对照表'!A$3:O$46,10),VLOOKUP(E2,'生长对照表'!A$3:O$46,9)},1)

Here’s a hidden gem in Excel: array constants. We often use ranges in formulas, which are implicitly arrays. But did you know you can create arrays manually, like in programming, using curly braces {}? For instance, {1,2,3,4} in a formula is the same as:

Excel cell auto-fill screenshot, column A shows 0 to 9 number sequence, blue fill handle at bottom-right of A5 cell, demonstrating drag-to-fill feature

Array constants are far more flexible. You can combine seemingly unrelated data. Just look at what’s inside the {}:

{F2,VLOOKUP(E2,'生长对照表'!A$3:O$46,11),VLOOKUP(E2,'生长对照表'!A$3:O$46,10),VLOOKUP(E2,'生长对照表'!A$3:O$46,9)}

Excel growth reference table screenshot with red box highlighting -3SD/-2SD/-1SD three columns of height data, values from 44.7 to 61 increasing, used for array formula reference

This array combines my baby’s height (F2) with the heights at -1, -2, and -3 standard deviations from the mean.

RANK(F2,Array,1)

Next, RANK sorts my baby’s height among those four values, ascending. Subtracting the rank from 5 gives the number of minus signs. Why 5? Think it through based on different scenarios, and it’ll become clear.

Excel growth record table screenshot with red boxes highlighting Height Deviation and Weight Deviation columns, showing –/-/=/+/++ symbols indicating deviation degree

I used a similar approach for the other three deviation columns. It works perfectly. The number of symbols indicates the standard deviation range. 95% of children fall within two standard deviations, so two symbols are fine. All good so far!

Data Visualization

For data visualization, I need to highlight key data. The plus and minus signs are basic.

I don’t need fancy graphics. To flag outliers, I just replaced the pluses and minuses with distinct symbols and added simple conditional formatting for background colors. That’s enough for me.

Excel Conditional Formatting Rules Manager screenshot showing conditional formatting rules for Height Deviation and Weight Deviation columns, 3 symbols yellow background, 4 symbols red background

Three symbols mean the measurements are outside the 95% range – I use yellow. Four symbols mean outside 99.7% – I use red. I manually adjusted a few extreme values for demonstration:

Excel growth record table final result screenshot with conditional formatting applied to Height Deviation and Weight Deviation columns, yellow highlights 3 symbols, red highlights 4 symbols, intuitively showing deviation degree

Wrap-up

Finished! Time to uninstall that parenting app. I happily clicked the “x”.

There are many growth trackers, but building my own is uniquely satisfying. I learned about arrays, REPT, and RANK on the fly – a great experience. The initial planning was the most interesting. Once started, it took just an hour.

It shows the power of combining knowledge, tools, and techniques. Improvise, adapt, overcome.

I should mention I prefer Google Sheets. Replicating this in Excel might require tweaks, but the formulas are similar.


[2024.1.18 Update] I’ve received requests for the spreadsheet. Converting to Excel had issues: Excel doesn’t support array constants as a RANK range, and you can’t reference other cells within them. Doing this in Excel is harder, likely needing many nested IFs. I recommend Feishu sheets or Google Sheets.

I’ve made boy/girl versions available.

Boy version: https://my.feishu.cn/wiki/JlMKw1NiBis8yok62BJcbCZ3n2d?from=from_copylink

Girl version: https://my.feishu.cn/wiki/RKHuwkXafiS987kLxPIc8jkxnAc?from=from_copylink